AWS Partner Network (APN) Blog

Event-driven Composable CDP Architecture Powered by Snowplow and Databricks

By Nehmé Tohmé, Data & AI Strategist – Databricks
By John Bourous, Sr. Manager Product and Partner Marketing – Snowplow
By Ghandi Nader, Sr. Partner Solutions Architect, Customer Experience – AWS
By Franck Georget, Segment Lead, Customer Experience – AWS


As customer expectations continue to evolve and data volumes grow exponentially, organizations face unprecedented challenges in managing and activating customer data effectively. For businesses striving to deliver personalized experiences and drive meaningful customer engagement, Customer Data Platforms (CDPs) have become increasingly critical. The emergence of composable CDPs marks an evolution in customer data management, offering organizations the flexibility and scalability needed to meet modern business requirements.

Snowplow, a behavioral data collection platform, combined with the Databricks Lakehouse platform, creates a scalable foundation for building a next-generation composable CDP on Amazon Web Services (AWS). This combination enables organizations to collect, process, and activate customer data at scale while maintaining complete data ownership and control.

This blog post will cover the evolution of CDPs, explain composable architecture, and show how Snowplow and Databricks on AWS build flexible, scalable, and secure customer data infrastructure. We’ll examine the architectural components and key benefits of this modern approach to customer data management.

Evolution of the Customer Data Platform

Organizations adopt CDPs to unify, process, and activate customer data for various business needs. The goal is to improve customer engagement, drive personalization, and optimize marketing strategies. This is achieved through three key applications: enhancing media targeting with unified customer insights for better ad performance and conversion rates; enabling precise audience segmentation and cross-channel orchestration for more targeted marketing campaigns; and delivering contextual experiences based on behavioral data to create more personalized customer interactions. These capabilities help organizations maximize their marketing effectiveness while providing customers with more relevant experiences.

Despite the proven value of CDPs, organizations face several obstacles in adopting them, ranging from poor technology selection and unrealistic expectations of out-of-the-box capabilities to complex integration requirements. Companies also struggle with privacy and compliance requirements, and often underestimate the resources needed for long-term maintenance and for managing exponential data growth.

These challenges highlight the critical need for a more flexible approach to CDP technology selection. Organizations require the ability to customize and combine modular CDP components that align with their specific use cases. While marketing stakeholders often drive CDP initiatives, the technical evaluation phase cannot be overlooked, given the complexities of maintenance and data growth. This is where the concept of a composable CDP becomes invaluable. By disaggregating CDP functionality into distinct layers, such as data collection, management, and activation, organizations can adopt a more modular approach. This architectural flexibility enables companies to construct their own CDP solution, selecting best-of-breed components while maintaining the agility to evolve over time. Rather than being locked into a monolithic solution, organizations can build a CDP that meets current needs and adapts to future requirements.

Intro to Composable CDP

A composable CDP’s distinctive power lies in its modular architecture. The solution separates core functionalities into independent layers, allowing organizations to make technology choices for each component. For instance, companies can select data collection tools specifically tailored to their channels (for example, web and mobile) that feed directly into their data warehouse. Data ownership is another crucial feature: organizations maintain complete control over their customer data schema and infrastructure within their own cloud data warehouse environment. By using the data warehouse as the primary source for activation, companies eliminate data movement between their warehouse and third-party solutions.
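The layered separation described above can be illustrated with a minimal sketch. Every name here (Collector, Warehouse, Activator, WebCollector, EmailActivator, run_pipeline) is hypothetical and chosen for illustration only; none of them is a Snowplow, Databricks, or AWS API. The point is that each layer sits behind a small interface, so one component can be swapped without touching the others, and activation reads from the warehouse rather than from a copy.

```python
from dataclasses import dataclass, field
from typing import Protocol


class Collector(Protocol):
    """Collection layer: anything that can produce behavioral events."""
    def collect(self) -> list[dict]: ...


class Activator(Protocol):
    """Activation layer: anything that can act on an audience."""
    def activate(self, audience: list[str]) -> int: ...


@dataclass
class Warehouse:
    """Stand-in for the cloud data warehouse that owns the customer data."""
    events: list[dict] = field(default_factory=list)

    def load(self, events: list[dict]) -> None:
        self.events.extend(events)

    def audience(self, event_name: str) -> list[str]:
        # Distinct users who performed the given event, in first-seen order.
        seen: dict[str, None] = {}
        for e in self.events:
            if e["event"] == event_name:
                seen.setdefault(e["user_id"], None)
        return list(seen)


class WebCollector:
    """Hypothetical web collector returning hard-coded sample events."""
    def collect(self) -> list[dict]:
        return [
            {"user_id": "u1", "event": "add_to_cart"},
            {"user_id": "u2", "event": "page_view"},
            {"user_id": "u1", "event": "add_to_cart"},
        ]


class EmailActivator:
    """Hypothetical activator; a real one would call an engagement platform."""
    def activate(self, audience: list[str]) -> int:
        return len(audience)


def run_pipeline(collector: Collector, warehouse: Warehouse, activator: Activator) -> int:
    # Collection feeds the warehouse; activation reads from the warehouse.
    warehouse.load(collector.collect())
    return activator.activate(warehouse.audience("add_to_cart"))
```

Replacing WebCollector with a mobile collector, or EmailActivator with another channel, would leave run_pipeline and Warehouse unchanged, which is the flexibility the composable approach aims for.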

In essence, a composable CDP is a modern, modular approach to customer data management that integrates directly with existing data warehouses. It enables organizations to select and combine specific components based on their needs, offering quick deployment, data activation, and efficient scaling. This flexible architecture eliminates data duplication while ensuring seamless integration with modern data stacks.

The Next-Gen Composable CDP Powered by Snowplow and Databricks on AWS

Snowplow, Databricks, and AWS unite to build a next-generation modular CDP. The combination offers real-time data streaming and analytics, improved privacy via a dedicated Amazon Virtual Private Cloud (Amazon VPC), and easy access to advanced AI tools from Databricks and AWS. This modern architecture represents the next evolution in customer data platforms.


Figure 1. The Next-Gen Composable CDP Powered by Snowplow and Databricks on AWS

Snowplow: Event Tracking and Data Collection

Snowplow provides a customer data infrastructure that empowers data teams to capture and manage high-fidelity behavioral data across every digital touchpoint. Whether users are browsing a website, interacting with mobile apps, or clicking through emails, Snowplow transforms each interaction into richly structured, AI-ready event data.

Real-Time Data Streaming: Snowplow integrates with AWS-native services like Amazon Kinesis and Amazon MSK to deliver real-time behavioral event streams directly into Databricks. This enables low-latency data ingestion and insights that fuel everything from personalized recommendations to operational decision-making.

Rich Contextual Tracking: With more than 130 out-of-the-box event properties and the flexibility to define custom entities, Snowplow gives you a complete, contextualized view of the customer journey. Events such as checkouts, video views, or support interactions are captured with precision, ready for downstream activation in AI/ML models or other services.
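As a rough illustration of a custom event with an attached entity, the sketch below builds two self-describing JSON objects, the convention Snowplow uses to pair a schema reference with a data payload. The vendor (com.example), the schema names (checkout, product), the field names, and the outer "tracked" wrapper are all assumptions made for this example; they are not taken from the source and do not represent a tracker's actual wire format.

```python
import json


def self_describing(vendor: str, name: str, version: str, data: dict) -> dict:
    """Pair a payload with an Iglu-style schema reference."""
    return {"schema": f"iglu:{vendor}/{name}/jsonschema/{version}", "data": data}


# Hypothetical custom event: a checkout.
checkout_event = self_describing(
    "com.example", "checkout", "1-0-0",
    {"order_id": "o-1001", "total": 129.5, "currency": "GBP"},
)

# Hypothetical custom entity giving the event extra context.
product_entity = self_describing(
    "com.example", "product", "1-0-0",
    {"sku": "TRENCH-01", "category": "outerwear"},
)

# Illustrative wrapper only: one event plus the entities that describe it.
tracked = {"event": checkout_event, "context": [product_entity]}

print(json.dumps(tracked, indent=2))
```

Because every payload names the schema it claims to follow, downstream consumers can interpret the fields without guessing at their meaning.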

Privacy & Control by Design: Deployed natively within your Amazon VPC, Snowplow ensures complete control over your data pipeline instead of black-box processing. Built-in schema validation, proactive data quality enforcement, and integration with AWS Identity and Access Management and AWS Key Management Service provide scalable, compliant infrastructure for managing sensitive customer data with confidence.
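The idea behind schema validation can be sketched in a few lines. This is a generic illustration of the technique, not Snowplow's implementation: CHECKOUT_SCHEMA, validate, and split_events are hypothetical names, and the schema is a simple field-to-type mapping rather than a full JSON Schema.

```python
# Hypothetical schema: required field name -> expected Python type.
CHECKOUT_SCHEMA = {"order_id": str, "total": float, "currency": str}


def validate(event: dict, schema: dict) -> list[str]:
    """Return a list of problems; an empty list means the event is valid."""
    errors = []
    for field_name, expected in schema.items():
        if field_name not in event:
            errors.append(f"missing field: {field_name}")
        elif not isinstance(event[field_name], expected):
            errors.append(f"wrong type for {field_name}: expected {expected.__name__}")
    return errors


def split_events(events: list[dict], schema: dict) -> tuple[list[dict], list[dict]]:
    """Separate valid events from invalid ones, keeping the reasons for failure."""
    good, bad = [], []
    for event in events:
        errors = validate(event, schema)
        if errors:
            bad.append({"event": event, "errors": errors})
        else:
            good.append(event)
    return good, bad
```

Keeping rejected events together with the reason they failed, rather than silently dropping them, is what makes data quality problems visible and fixable upstream.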

Databricks: Data Management and Governance

Databricks, powered by its Lakehouse architecture and Unity Catalog, serves as a comprehensive data intelligence platform that provides robust governance and management capabilities for Snowplow-collected data.

Data Integration: Databricks ingests data from multiple sources, including Snowplow, CRM systems, and third-party platforms, creating a unified view of customer interactions.
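A unified view amounts to combining records from different sources under a shared customer key. The plain-Python sketch below only illustrates that idea; on a lakehouse the same result would more likely come from a join across tables. The function name unify, the user_id key, and the sample fields are assumptions for this example.

```python
from collections import defaultdict


def unify(events: list[dict], crm_records: list[dict]) -> dict:
    """Merge behavioral events and CRM attributes into one profile per user."""
    profiles = defaultdict(lambda: {"events": [], "crm": {}})
    for e in events:
        profiles[e["user_id"]]["events"].append(e["event"])
    for r in crm_records:
        # Keep every CRM attribute except the join key itself.
        profiles[r["user_id"]]["crm"] = {k: v for k, v in r.items() if k != "user_id"}
    return dict(profiles)


# Hypothetical sample data from two sources.
behavioral = [
    {"user_id": "u1", "event": "page_view"},
    {"user_id": "u1", "event": "checkout"},
    {"user_id": "u2", "event": "page_view"},
]
crm = [{"user_id": "u1", "tier": "gold"}]

profiles = unify(behavioral, crm)
```

Here u1 ends up with both behavioral history and CRM attributes, while u2 has behavior only, which is typical of anonymous or not-yet-registered visitors.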

Data Governance with Unity Catalog: Unity Catalog centralizes metadata management for both structured and unstructured data, and provides fine-grained access controls, data discovery, and comprehensive data lineage tracking across all data products. It also maintains automated compliance controls and detailed audit logging to meet security requirements.

Artificial Intelligence and Machine Learning: Leverage Databricks’ ML capabilities for propensity scoring and personalization. For instance, predict which customers are most likely to churn and target them with retention campaigns.
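To make the churn example concrete, here is a minimal sketch of propensity scoring with a logistic function. It does not use any Databricks ML API. The feature names, WEIGHTS, BIAS, and the 0.5 threshold are invented for illustration; in practice the weights would be learned from historical data rather than set by hand.

```python
import math

# Hypothetical hand-set weights; a trained model would supply these.
WEIGHTS = {
    "days_since_last_visit": 0.08,
    "sessions_last_30d": -0.25,
    "support_tickets": 0.4,
}
BIAS = -1.0


def churn_propensity(features: dict) -> float:
    """Map customer features to a churn probability between 0 and 1."""
    z = BIAS + sum(WEIGHTS[k] * features.get(k, 0.0) for k in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


def retention_targets(customers: dict, threshold: float = 0.5) -> list[tuple[str, float]]:
    """Return (customer_id, score) pairs at or above the threshold, riskiest first."""
    scored = [(cid, churn_propensity(f)) for cid, f in customers.items()]
    return sorted((c for c in scored if c[1] >= threshold), key=lambda c: c[1], reverse=True)


# Hypothetical customers: "a" has gone quiet, "b" is highly active.
customers = {
    "a": {"days_since_last_visit": 40, "sessions_last_30d": 0, "support_tickets": 2},
    "b": {"days_since_last_visit": 1, "sessions_last_30d": 12, "support_tickets": 0},
}
targets = retention_targets(customers)
```

Only customer "a" clears the threshold, so a retention campaign would target that customer and leave the active one alone.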

Databricks AI/BI Genie: An AI-powered conversational interface that allows business users to interact with data through natural-language queries.

Building a composable CDP with Snowplow and Databricks on AWS unlocks the full potential of AWS’s cloud.

  • Enhance modularity and agility with over 200 services within your VPC.
  • Unlock AI use cases by leveraging Amazon Bedrock’s foundation models in conjunction with Databricks Mosaic AI.
  • Activate data directly from your data warehouse using AWS services like Amazon AppFlow, AWS End User Messaging, or the customer engagement platform of your choice.

A Real World Scenario: How Burberry Revolutionizes Retail Experience

Burberry, the global luxury fashion brand, successfully implemented a composable CDP using Snowplow’s real-time event tracking and Databricks Data Intelligence Platform on AWS. This combination has transformed their customer experience by reducing clickstream data latency by 99% and extending cookie duration by 52x (from 7 days to 12 months). The solution enables Burberry to create AI-Ready Customer 360 profiles in real-time, powering 40 personalized models for product recommendations, propensity scoring, and lifetime value calculations. Most importantly, this composable approach allows in-store client advisors to access opted-in customers’ online browsing behavior through their mobile devices, delivering a truly personalized NextGen customer experience that bridges the digital and physical shopping journey. The architecture also provides Burberry with enhanced marketing attribution capabilities and better GDPR compliance through server-side cookies, demonstrating how AWS partners can help enterprises build powerful, privacy-compliant customer data solutions.

Key Benefits of the Composable CDP Architecture

The composable CDP architecture offers three main advantages. First, its modular design allows organizations to choose optimal tools for data operations that match their specific needs, from collection to activation. Second, the solution’s AWS integration enables seamless scaling and innovation as business requirements evolve. Finally, it ensures robust data governance with full transparency and control over personal information handling, helping maintain regulatory compliance with GDPR and CCPA.

Conclusion

By combining the powerful capabilities of Snowplow, Databricks, and AWS, organizations can build a flexible, scalable, and secure customer data infrastructure that truly meets their unique needs. This modern approach provides a foundation for future innovation and growth. Organizations that embrace this approach will find themselves well-positioned to deliver exceptional customer experiences while maintaining agility, compliance, and control over their customer data infrastructure.

Contact your AWS, Snowplow, or Databricks team to learn more about the composable CDP solution.



Snowplow – AWS Partner Spotlight

Snowplow is an AWS Advanced Technology Partner and AWS Competency Partner that lets you track, contextualize, validate and model your customers’ behaviour across your entire digital estate. Your data is available in real time and is delivered to your data warehouse of choice, where it can be used to power analytics, reporting and business-critical applications. The Snowplow product runs in your own cloud environment, giving you complete ownership of your data.
