India's largest platform for AI & Analytics leaders, professionals & aspirants

3AI Digital Library

Enriching Customer Data Platforms with Customer Identity Graphs in a Cookieless World

October 19, 2022

Featured Article:

Author: Jayachandran Ramachandran, Senior Vice President – Artificial Intelligence Labs, Course5 Intelligence

The pandemic has substantially changed our shopping behavior. Even diehard offline buyers have moved to online channels, experiencing new ways of buying and fulfilling their needs through the digital ecosystem. Improving customer engagement and value realization by providing relevant content, recommendations, and offers has been an area of constant research for organizations. For most marketing and ad tech companies, third-party cookies have been a key instrument for content customization on online channels. With GDPR, CCPA and other country-specific regulations already in place, there is greater scrutiny of the collection and usage of customer data, and companies like Google and Apple are phasing out the usage of third-party cookies. Browsers like Safari and Firefox already block or restrict third-party cookies by default. This trend affects all digital companies that relied mostly on third-party data for their marketing campaigns and are now grappling with this new challenge. Organizations are exploring various methods to address it, and some of them plan to use the following approaches:

  1. Collect zero-party data
  2. Collect first-party data
  3. Set up or enhance customer data platforms (CDP) for the new normal

CDPs enriched with zero-party and first-party data are proposed as the go-to solution in the cookieless world. However, they do not by themselves solve the problem of resolving customer identity from the deluge of data, which is fundamental to accurate customer profiling. CDPs require an identity resolution framework to make them more impactful in building a differentiated and privacy-compliant customer experience.

Key elements of this approach are:

Collect Zero-Party Data

Zero-party data is data collected directly from customers with their consent through interviews, forms, surveys, polls, emails, chatbots, etc. The customer willingly shares such data in anticipation of receiving better services, personalized recommendations and offers from the provider. It is a win-win contract between customer and provider.

Collect First-Party Data

This is the data a company collects directly from its logged-in or anonymous users through various channels such as websites, mobile apps, social channels, etc. It helps the company understand the customer journey, touchpoints, behavior, time spent, preferences, feedback, reviews, ratings, etc.

Set up or enhance customer data platforms (CDP) for the new normal

CDPs are platforms that collect data from multiple sources, including zero-party and first-party sources, and endeavor to provide a unified, 360-degree view of the customer that can be consumed by other systems. Machine Learning (ML) models are used for creating customer profiles, carrying out customer segmentation and powering recommendation systems for personalized targeting and monetization. The quality of the data in CDPs is fundamental to providing the best experience to customers, and it is the most challenging part. Customers reciprocate well when they trust that the company understands their preferences and that the personalized recommendations provided are relevant. Since data comes from various sources, and customers use multiple devices and multiple endpoints (IP addresses), customer identity resolution becomes critical. Identity resolution is the approach of establishing customer identity and creating a single view of the customer by unifying user data across devices and channels. With advances in graph technology, it is possible to create customer identity graphs that can comprehend customers’ identity across devices, channels and endpoints, and establish interconnected relationships across the data.

Using Customer Identity Graphs for CDPs

The first step is to collect the data, cleanse it and transform it into a graph data model consisting of various entities and the relationships between them. At this stage the graph holds raw data whose entities are not yet disambiguated. Entity resolution is required to refine the graph and establish the relationships for consumption by end systems. Entity Resolution (ER) is the process of analyzing and disambiguating the data to assess whether multiple records represent the same entity, such as a person, product, organization or location. When a user accesses a website from different devices at different points in time and through different endpoints (hotspots), the user is saved as a new user in each instance. The data captured consists of digital logs such as device ID, cookie ID, user clicks, pages accessed, time spent, etc. Graph technology offers data structures that are well suited to modeling such data. Leading graph technology providers like Neo4j and TigerGraph have graph data science libraries that can be used for entity resolution and link prediction. The following are ways of performing entity resolution to build a customer identity graph that will help us create unified customer profiles.
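Before looking at those approaches, the transformation of raw logs into a graph data model can be sketched in plain Python. This is only an illustration: the record fields (`visit_id`, `device_id`, `cookie_id`, `ip`) and the `build_graph` helper are hypothetical names, and a production system would load the data into a graph database rather than an in-memory dictionary.

```python
from collections import defaultdict

# Hypothetical log records. Each visit is stored as its own "user" node,
# mirroring the situation where every session looks like a new user.
visits = [
    {"visit_id": "v1", "device_id": "d1", "cookie_id": "c1", "ip": "10.0.0.1"},
    {"visit_id": "v2", "device_id": "d1", "cookie_id": "c2", "ip": "10.0.0.2"},
    {"visit_id": "v3", "device_id": "d2", "cookie_id": "c3", "ip": "10.0.0.9"},
]

def build_graph(records):
    """Return an undirected adjacency map linking visit nodes to identifier nodes."""
    graph = defaultdict(set)
    for rec in records:
        visit_node = ("visit", rec["visit_id"])
        for field in ("device_id", "cookie_id", "ip"):
            ident_node = (field, rec[field])
            graph[visit_node].add(ident_node)
            graph[ident_node].add(visit_node)
    return graph

graph = build_graph(visits)

# Visits that share device d1 are now reachable from the same identifier node.
shared_device = graph[("device_id", "d1")]
```

Here `shared_device` contains the visit nodes `v1` and `v2`, which makes them natural candidates for entity resolution.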

Using Graph Query Language

If the graph data model is simple and the criteria for entity resolution are clear and well-defined (e.g., national ID, SSN, email ID, address), then graph query languages such as Cypher (Neo4j) or GSQL (TigerGraph) can be used to find similar entities and identify duplicates.
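The rule-based matching described above can be illustrated with a small Python stand-in for such a query. The sketch assumes email is the deterministic key; the record layout and the `normalize_email` and `find_duplicates` helpers are hypothetical, and in practice the same logic would be expressed as a query against the graph database.

```python
from collections import defaultdict

def normalize_email(email):
    """Trim whitespace and lowercase so trivially different spellings match."""
    return email.strip().lower()

def find_duplicates(records, key="email"):
    """Group record IDs by the normalized key and keep groups with more than one ID."""
    groups = defaultdict(list)
    for rec in records:
        groups[normalize_email(rec[key])].append(rec["id"])
    return {value: ids for value, ids in groups.items() if len(ids) > 1}

# Hypothetical customer records captured through different channels.
records = [
    {"id": "u1", "email": "Asha@example.com"},
    {"id": "u2", "email": " asha@example.com "},
    {"id": "u3", "email": "ravi@example.com"},
]

duplicates = find_duplicates(records)
```

With this sample data, `u1` and `u2` are grouped under the same normalized email, while `u3` remains a distinct entity.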

By Unsupervised Learning

If the data is large and complex, and entity similarity must be determined using multiple dimensions, then node embedding techniques like Node2Vec or the FastRP (Fast Random Projection) algorithm can be explored. An embedding is a numerical vector representation of a node that can be used as input to any machine learning algorithm. Once the embeddings are available, entity resolution can be accomplished through node similarity measures (e.g., cosine similarity on the embeddings, or Jaccard similarity on shared neighbors). Nodes that have similar attributes (e.g., similar browsing history) will have a smaller distance between their embeddings, whereas nodes with diverse attributes (e.g., diverse browsing history) will have a larger distance. Nodes that are close to each other are identified as the same entity.
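As a minimal sketch of the similarity step, the snippet below compares node embeddings with cosine similarity and flags close pairs as candidate matches. The embedding values are made up for illustration (in practice they would come from an algorithm such as Node2Vec or FastRP), and the 0.95 threshold and the `candidate_matches` helper are assumptions, not recommended settings.

```python
import math
from itertools import combinations

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical node embeddings, hard-coded here instead of being learned.
embeddings = {
    "u1": [0.9, 0.1, 0.0],
    "u2": [0.8, 0.2, 0.1],
    "u3": [0.0, 0.1, 0.9],
}

def candidate_matches(embeddings, threshold=0.95):
    """Return node pairs whose embeddings are at least `threshold` similar."""
    pairs = []
    for a, b in combinations(sorted(embeddings), 2):
        if cosine_similarity(embeddings[a], embeddings[b]) >= threshold:
            pairs.append((a, b))
    return pairs

matches = candidate_matches(embeddings)
```

Only `u1` and `u2` point in nearly the same direction, so they are the only pair proposed as the same entity. The threshold would need tuning against known matches in a real dataset.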

If the underlying graph is large, algorithms like Weakly Connected Components (WCC) can be applied to split it into subgraphs in which every node is connected to the others by some path. Applying node similarity algorithms within each subgraph, rather than across the whole graph, reduces the computation overhead and complexity of entity resolution.
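The partitioning idea can be sketched with a small union-find implementation that computes connected components while ignoring edge direction. This is a hand-written illustration rather than the implementation found in any graph library, and the node and edge lists are hypothetical.

```python
def weakly_connected_components(nodes, edges):
    """Group nodes into components, treating every edge as undirected."""
    parent = {n: n for n in nodes}

    def find(n):
        # Walk up to the root, shortening the path along the way.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    components = {}
    for n in nodes:
        components.setdefault(find(n), set()).add(n)
    return list(components.values())

# Hypothetical graph: u1-u2-u3 form one island, u4-u5 another.
nodes = ["u1", "u2", "u3", "u4", "u5"]
edges = [("u1", "u2"), ("u2", "u3"), ("u4", "u5")]

components = weakly_connected_components(nodes, edges)
```

Pairwise similarity then only needs to be computed inside each component (three pairs in the first, one in the second) instead of across all ten possible pairs.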

By Supervised Learning

Once entities are identified using unsupervised machine learning approaches such as those described above, a supervised graph machine learning technique like link prediction can be used. An entity linkage model is trained on the identified links and then predicts new entity links when new data flows into the system. A combination of unsupervised and supervised machine learning approaches can be applied to fine-tune entity resolution.
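To make the supervised step concrete, the following sketch trains a tiny logistic regression on pair features (shared-neighbor count and Jaccard overlap) and scores new candidate links. It is a simplified stand-in for the link prediction pipelines offered by graph platforms; the data, the feature choice, the hyperparameters and all function names are assumptions made for illustration.

```python
import math

def pair_features(neighbors, a, b):
    """Features for a candidate link: shared-neighbor count and Jaccard overlap."""
    common = neighbors[a] & neighbors[b]
    union = neighbors[a] | neighbors[b]
    jaccard = len(common) / len(union) if union else 0.0
    return [float(len(common)), jaccard]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, bias, x):
    """Probability that the pair described by features x is the same entity."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))

def train_logistic(samples, labels, lr=0.1, epochs=1000):
    """Fit logistic regression with plain stochastic gradient descent."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            error = predict(weights, bias, x) - y
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias

# Hypothetical user nodes and the identifiers (device, IP, cookie) they touch.
neighbors = {
    "u1": {"d1", "ip1", "c1"},
    "u2": {"d1", "ip1", "c2"},
    "u3": {"d2", "ip2", "c3"},
    "u4": {"d2", "ip2", "c4"},
}

# Labels as they might come from the unsupervised step: 1 = same entity.
labeled_pairs = [("u1", "u2", 1), ("u3", "u4", 1),
                 ("u1", "u3", 0), ("u2", "u4", 0), ("u1", "u4", 0)]
samples = [pair_features(neighbors, a, b) for a, b, _ in labeled_pairs]
labels = [y for _, _, y in labeled_pairs]
weights, bias = train_logistic(samples, labels)

# New data flows in: score previously unseen pairs.
neighbors["u5"] = {"d3", "ip3", "c5"}
neighbors["u6"] = {"d3", "ip3", "c6"}
p_same = predict(weights, bias, pair_features(neighbors, "u5", "u6"))
p_diff = predict(weights, bias, pair_features(neighbors, "u5", "u1"))
```

The pair `u5`/`u6` shares a device and an IP address, so it scores above 0.5, while `u5`/`u1` shares nothing and scores below 0.5. A real model would use richer features, such as node embeddings, and far more labeled pairs.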

Adapt to the new normal

A cookieless world is going to be the new normal, and organizations need to prepare for and adapt to it. High-quality data and disambiguated entities enrich a CDP. They enable many use cases for digital commerce such as user profiling, customer journey analysis, micro-segmentation, personalization, interest-based advertising, product recommendation, cross-sell, upsell, fraud detection, etc. This helps to enhance customer experience and increase stickiness and overall customer lifetime value. Customer identity graphs will bring clarity and order to a chaotic cookieless digital ecosystem and pave the way for providing hyper-personalized services to customers.

About the Author:

Jayachandran (Jay) Ramachandran is the Senior Vice President – Artificial Intelligence Labs at Course5 Intelligence. Jay has over 25 years of industry experience and is an AI thought leader, consultant, design thinker, inventor, and speaker at industry forums with extensive innovation and delivery expertise across a wide variety of industry verticals. He has been instrumental in creating innovative solutions to solve complex business problems using Machine Learning, Deep Learning, and AI technologies. He specifically brings thought leadership and expertise in combining AI, cognitive neuroscience and behavioral economics with a vision of democratizing AI, augmenting human capabilities, embedding ethical principles and building trustworthiness in AI systems, and creating a Human-AI world that benefits everyone.

Title picture source: freepik.com
