In this interview, we talk customer data platforms with Craig Kelly, Group Product Manager at Overstock.com

April 5, 2018

How are you taking everything you know about your customer and fusing it together to create a single view of their behavior? Craig and his team have been building a customer data platform (CDP) for the past eight months. Here he explains how to start your own data project, and overcome major obstacles along the way.

 

Can you tell me about your company and role?

I’m Craig Kelly, Group Product Manager at Overstock. We’re a tech-driven online retailer, founded in 1999.

 

What are your biggest problems right now?

One is the ‘identity’ problem, where you have multiple email addresses and social profiles associated with one user, and users with multiple devices. We’re trying to figure out how these all combine to create a single identity for a customer. The second problem is how to take everything you know about a user and make it available to any system. We’ve been working on our customer data platform (CDP) for about eight months, in partnership with mParticle. The CDP combines the users’ identities with their attributes. An attribute is a component of the complete picture of the user that we can take action on, whether that’s what channel they prefer to hear from us on (Facebook, email, push messaging, display, SMS, etc.), what time of day they want to hear from us, or what their favorite styles are. There are so many known or implied attributes of a user that we can group together.
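To make the identity-plus-attributes idea concrete, here is a minimal sketch in Python. It is not Overstock's or mParticle's implementation; the `IdentityGraph` class, its method names, and the identifier strings are all hypothetical. It assumes that whenever two identifiers are observed together (for example, an email address and a cookie at login), they belong to the same customer, and it merges them with a simple union-find structure.

```python
from collections import defaultdict


class IdentityGraph:
    """Toy identity graph (hypothetical): identifiers seen together
    are merged into one profile that also carries user attributes."""

    def __init__(self):
        self._parent = {}                      # identifier -> parent identifier
        self._attributes = defaultdict(dict)   # root identifier -> attributes

    def _find(self, identifier):
        # Register unseen identifiers as their own root.
        self._parent.setdefault(identifier, identifier)
        # Walk up to the root, shortening the path as we go.
        while self._parent[identifier] != identifier:
            self._parent[identifier] = self._parent[self._parent[identifier]]
            identifier = self._parent[identifier]
        return identifier

    def link(self, a, b):
        """Record that identifiers a and b belong to the same customer."""
        root_a, root_b = self._find(a), self._find(b)
        if root_a != root_b:
            self._parent[root_b] = root_a
            # Fold the absorbed profile's attributes into the surviving one.
            self._attributes[root_a].update(self._attributes.pop(root_b, {}))

    def set_attribute(self, identifier, key, value):
        """Attach an attribute to whichever profile owns this identifier."""
        self._attributes[self._find(identifier)][key] = value

    def profile(self, identifier):
        """Return every identity and attribute tied to this identifier."""
        root = self._find(identifier)
        identities = sorted(i for i in self._parent if self._find(i) == root)
        return {"identities": identities,
                "attributes": dict(self._attributes[root])}


graph = IdentityGraph()
graph.link("email:a@example.com", "cookie:123")      # web login
graph.link("device:ios-9", "email:a@example.com")    # app login
graph.set_attribute("cookie:123", "preferred_channel", "email")
print(graph.profile("device:ios-9"))
```

Looking up the profile by any one of the three identifiers returns the same merged record, which is the "single identity" the answer above describes. A production system would also need rules for conflicting attributes and for unlinking shared devices, which this sketch ignores.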

 

How is the CDP different to something like a data management platform? Are they the same thing?

I think that they are different. A CDP takes a lot of DMP functionality and applies it to first-party data; it’s a mix of a CRM and a DMP. It’s an evolution of the two that’s more heavily reliant on first-party data but then has the ability to take action in real time off of that first-party data, which is the big change and extensibility of the platform. With DMPs, you might go in and manually create an audience and push it to a channel. In the CDP world, you’re taking all of your information about a user, and you’re no longer reliant on audiences. You can still group people into audiences, but there’s the DMP functionality and the extensibility on top of it. One difference between a DMP and a CDP is that the CDP holds all of your identities and attributes for a user. When you’re looking at the DMP world, you’re usually working off some third-party ID, and you never understand the actual identity of that user in a way that you can utilize across multiple platforms. The DMP can use it across multiple platforms but you cannot. So a DMP is more reliant on aggregation and optimization of third-party data than it is on direct first-party data.

The third piece (that you get outside of the identities and attributes of a user) that CDPs provide is the event histogram. You have a view of every interaction customers have on every platform, whether it’s your apps, your website, your emails, or any of your owned and operated channels; you have a complete profile for every single user tied to that group of identities. A lot of CDPs will compare themselves to middleware or call themselves middleware, but they allow us to rapidly upgrade everything that we’re doing outside the CDP.

So as long as we have that CDP piece it takes us almost no time to go out and upgrade the systems around it whether that’s our data warehouse, our machine learning platform, our email and push providers or our linking solutions. We now have a middle layer not tied into a traditional marketing cloud. We can easily integrate with all of these different platforms because all of our events are captured in a single location that can then forward all of those events and user identities and attributes to other platforms. In this way, the CDP project is at the core of our modernization efforts.
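The "capture once, forward everywhere" pattern described above can be sketched as a small fan-out hub. This is an illustrative Python sketch only: `Event`, `EventHub`, `connect`, and `track` are hypothetical names, not part of any vendor's API, and the destinations here are plain lists standing in for a data warehouse or an email provider.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Event:
    """One customer interaction (hypothetical shape)."""
    user_id: str
    name: str
    properties: Dict[str, str] = field(default_factory=dict)


class EventHub:
    """Captures every event in one place, then forwards it to each
    connected downstream system."""

    def __init__(self) -> None:
        self._destinations: Dict[str, Callable[[Event], None]] = {}
        self.history: List[Event] = []

    def connect(self, name: str, handler: Callable[[Event], None]) -> None:
        # Adding or swapping a downstream system is one registration call.
        self._destinations[name] = handler

    def track(self, event: Event) -> None:
        self.history.append(event)               # single point of capture
        for handler in self._destinations.values():
            handler(event)                       # fan out to every system


warehouse: List[Event] = []
email_tool: List[Event] = []

hub = EventHub()
hub.connect("warehouse", warehouse.append)
hub.connect("email", email_tool.append)
hub.track(Event(user_id="u1", name="add_to_cart", properties={"sku": "123"}))
print(len(warehouse), len(email_tool))
```

Because the app and website only ever call `track`, replacing a downstream tool means changing one `connect` call rather than re-instrumenting every channel, which is the upgrade speed the answer refers to.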

 

You started this project a year ago. What was the real drive behind it?

Overstock has traditionally been incredibly efficiency-focused and so we look really closely at our customer acquisition cost compared to our lifetime value metrics. We’ve historically always been best in class at that. We asked ourselves the question, “how do we create growth, and as we scale the business/try to create rapid growth, how do we do that efficiently?” We realized that we needed to take all our efficiency drives to the next level, get out of the silos and allow our channels to talk to each other. That required rapid modernization and a centralized layer which can coordinate and orchestrate all of our customer data and audiences across all of our different platforms.

 

Clearly integrating all this customer data and unifying these customer profiles across systems is a huge undertaking. Who are the stakeholders here and how did you get buy-in from those different teams?

Our biggest stakeholder knew that we were trying to achieve the timeless marketing mantra of serving each customer the right message in the right place at the right time while also maintaining the efficiency of our systems. This stakeholder understood the need for a CDP layer to coordinate all of those activities. So, there was a deep understanding that we might not necessarily see immediate returns from this, but in order to move forward and change the mindset of channels working in silos, we had to go for it. Channels in most companies today work in silos for a reason: there were no good ways to push data and orchestrate it across all of these different channels. So, the fact that the CDP holds that promise allowed us to get a lot of buy-in early.

 

On the surface it sounds like it’s working and it’s been very helpful. But what kind of obstacles did you encounter at the very beginning?

Most of the earliest issues revolved around combining cookie-based data with device-ID data (i.e. app and web data where you have different identities and different experiences on each). Your web experience is not necessarily consistent with your app experience because you have different users and different tools available on the mobile app side, such as camera access. Creating a consistent data architecture across both was a challenge at the start. And then tying the identities together was its own major challenge. The third issue also revolves around identity: if you only have one identity piece, like a cookie, it might work for cookie-based channels but for API and push channels it doesn’t work. How do you tie them together? How do you orchestrate data when you don’t have a complete list of identities that works across all of the channels? That is and remains a challenge, and it’s something that we’re actively working on. I think there are lots of interesting ways to try to solve that problem, and it’s certainly not unique to us. But that’s going to be the future of how all of this pans out: how well we can utilize identities that are in a language that every channel speaks.

 

What’s the timeframe to get to that point?

It’s not totally up to us. It depends on what types of identities the big publishers decide to accept in order to coordinate orchestration. From our end, we see it as an evolving picture. It limits the impact that we’re able to have today. If we’re only able to orchestrate data for authenticated users, for example, that gives us some percentage win. For authenticated users who are signed in and whose email address we have, we can match almost all of them to just about every system because we have that email address and a cookie ID. So it pushes on two sides. On one side, we need to do a better job of driving user authentication, and on the other side, we need to see how we can supplement our ID graph with third-party data.
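The link between known identity types and reachable channels can be illustrated with a short Python sketch. The channel names and the identity each one requires in `CHANNEL_REQUIREMENTS` are assumptions made up for illustration, as are the function names; real publishers and providers each define what they accept.

```python
# Hypothetical mapping: which identity types a channel needs in order
# to address a user. Real requirements vary by publisher and provider.
CHANNEL_REQUIREMENTS = {
    "display": {"cookie"},
    "email": {"email"},
    "push": {"device_id"},
}


def reachable_channels(known_identity_types):
    """Channels whose required identity types are all known for a user."""
    known = set(known_identity_types)
    return sorted(
        channel
        for channel, required in CHANNEL_REQUIREMENTS.items()
        if required <= known
    )


def match_rate(users, channel):
    """Fraction of users that can be addressed on the given channel."""
    if not users:
        return 0.0
    matched = sum(1 for known in users if channel in reachable_channels(known))
    return matched / len(users)


users = [
    {"cookie"},                # anonymous web visitor
    {"cookie", "email"},       # signed-in web user
    {"email", "device_id"},    # signed-in app user
    {"cookie"},                # anonymous web visitor
]
print(reachable_channels({"cookie"}))
print(match_rate(users, "email"))
```

In this toy data set only the signed-in users can be matched to the email channel, which is why both levers mentioned above matter: more authentication adds identity types per user, and a richer ID graph fills in the ones still missing.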

 

If you were to start all over again from scratch what things, if anything, would you do differently?

I think for the most part it’s gone smoother than I ever expected. Certainly, the biggest thing that I would have done differently is I would have spent more time in the beginning thinking about the data structures and how we’re going to put data into the CDP because it’s hard to change that once you do it. I would have been a bit more rigorous around how we architect that data so that we don’t have to step back and change as the picture becomes clearer. I don’t know necessarily how we would have done that, but it certainly has caused headaches along the way.

 

Any final top tips or advice for anyone starting a CDP project like this?

Just get started! To me, the most surprising thing along the way has been how just starting this project has changed our mindset and how we approach the whole world. It’s changed how we understand the possibilities of what we can do as a marketing organization. We were limited in how we thought about the potential of each piece, and I think as you go down the CDP road it starts to change your mindset from ‘what’s possible today?’ to ‘anything is possible’.
