Eliminating the XDM Engineering Bottleneck for Enterprise AEP Ingestion
A Fortune 500 financial services firm replaced a brittle, code-intensive dbt workflow with configuration-driven pipelines — achieving bidirectional Snowflake ↔ AEP data flow with zero customer data leaving their cloud.
The challenge
The firm needed to bring structured customer data from Snowflake into Adobe Experience Platform (AEP) — and return AEP audience segments back into Snowflake for downstream activation. AEP requires all ingested data to conform to Experience Data Model (XDM) schemas: deeply nested JSON structures with strict field paths, data types, and relationship constraints.
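To make the nesting concrete, a single flat Snowflake row typically has to be expanded into a record shaped like the following. The field paths shown are illustrative standard XDM Individual Profile paths; the bank's actual schema and tenant namespace would differ:

```json
{
  "person": {
    "name": { "firstName": "Jane", "lastName": "Doe" }
  },
  "personalEmail": { "address": "jane.doe@example.com" },
  "_tenant": {
    "accountDetails": { "accountType": "checking", "openDate": "2021-04-01" }
  }
}
```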
The existing approach used dbt to transform Snowflake tables into XDM-formatted JSON. Engineers hand-coded SQL/Jinja templates to construct nested JSON structures field by field. Adding a single new field took days. When AEP schemas changed, dbt models had no awareness — causing silent ingestion failures that took days to diagnose.
As a financial institution, the firm was bound by strict data residency policies: customer data could not leave the bank's cloud perimeter. Snowflake sat behind a VPN, inaccessible from outside their AWS environment. Any solution requiring data to leave their cloud was a non-starter.
Manual XDM construction
Hand-coded SQL/Jinja templates for deeply nested JSON structures — days per field.
Schema drift
No sync between dbt models and AEP schemas — silent failures and days of debugging.
Data residency
Customer data cannot leave the bank's cloud perimeter — SaaS processing is a non-starter.
Wrong ingestion pattern
Streaming ingestion for bulk data competes with real-time workloads and wastes AEP capacity.
The solution
Ingest Labs' data pipeline platform separates orchestration from execution. Pipeline definitions, scheduling, and monitoring run in the Ingest Labs SaaS — no customer data stored or processed. The actual batch job that queries Snowflake, transforms data, constructs XDM files, and writes output runs as an AWS Batch job inside the bank's own cloud environment.
The bank provides Ingest Labs with credentials scoped exclusively to submitting Batch jobs — no access to Snowflake, no access to S3 data. Customer data never crosses the bank's cloud boundary.
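A minimal IAM policy for this kind of scoping might look like the sketch below. The account ID, region, and resource names are placeholders, and the real policy would be tailored to the bank's environment:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "SubmitPipelineJobsOnly",
      "Effect": "Allow",
      "Action": "batch:SubmitJob",
      "Resource": [
        "arn:aws:batch:us-east-1:123456789012:job-definition/ingest-pipeline*",
        "arn:aws:batch:us-east-1:123456789012:job-queue/ingest-queue"
      ]
    }
  ]
}
```

Because the policy grants only `batch:SubmitJob` against a named job definition and queue, the credentials confer no read access to Snowflake secrets or S3 objects.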
For destination mapping, the platform connects to AEP's Schema Registry API and renders the full XDM schema as a visual tree. The user maps source fields to XDM fields visually — no hand-coded JSON construction, no manual schema translation. A "Check for updates" button re-fetches the schema from AEP, detecting any changes since the mapping was configured.
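Drift detection of this kind can be sketched as a diff between the schema cached at mapping time and the schema just fetched from the registry. This is a minimal illustration, not Ingest Labs' implementation; it assumes schemas are compared as nested dicts flattened into dotted field paths:

```python
from typing import Dict, List


def flatten_schema(schema: dict, prefix: str = "") -> Dict[str, str]:
    """Flatten a nested schema into {dotted.field.path: type}."""
    flat: Dict[str, str] = {}
    for name, spec in schema.items():
        path = f"{prefix}.{name}" if prefix else name
        if isinstance(spec, dict):
            flat.update(flatten_schema(spec, path))
        else:
            flat[path] = spec
    return flat


def detect_drift(cached: dict, fetched: dict) -> List[str]:
    """Report fields added, removed, or retyped since the mapping was saved."""
    old, new = flatten_schema(cached), flatten_schema(fetched)
    drift = []
    for path in sorted(set(old) | set(new)):
        if path not in new:
            drift.append(f"removed: {path}")
        elif path not in old:
            drift.append(f"added: {path}")
        elif old[path] != new[path]:
            drift.append(f"retyped: {path} ({old[path]} -> {new[path]})")
    return drift
```

A non-empty result is exactly the "silent failure" case: the mapping was configured against a schema that no longer matches what AEP will validate against.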
Field-level validation catches issues during processing — before data reaches XDM construction. Inline transformations (SHA256 hashing, date formatting, type casting) are applied per field declaratively, not as custom SQL functions.
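Declarative per-field transforms amount to attaching a list of named operations to each source field instead of writing SQL. The sketch below is a simplified illustration of that idea; the transform names and registry are hypothetical, not the platform's actual configuration syntax:

```python
import hashlib
from datetime import datetime

# Hypothetical registry of inline transforms; names are illustrative.
TRANSFORMS = {
    "sha256": lambda v: hashlib.sha256(str(v).encode("utf-8")).hexdigest(),
    "to_iso_date": lambda v: datetime.strptime(v, "%m/%d/%Y").date().isoformat(),
    "to_int": int,
}


def apply_transforms(row: dict, spec: dict) -> dict:
    """Apply the transforms declared for each field; unmapped fields pass through."""
    out = {}
    for field, value in row.items():
        for name in spec.get(field, []):
            value = TRANSFORMS[name](value)
        out[field] = value
    return out


row = {"email": "jane@example.com", "open_date": "04/01/2021", "score": "7"}
spec = {"email": ["sha256"], "open_date": ["to_iso_date"], "score": ["to_int"]}
clean = apply_transforms(row, spec)
```

The point of the declarative form is that adding a hashed or reformatted field is a one-line configuration change rather than a new SQL function.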
The pipeline constructs XDM-compliant files and deposits them in the S3 location AEP monitors for batch ingestion — no streaming capacity consumed, no competition with real-time workloads.
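The construction step itself reduces to applying the visual mapping (source field → dotted XDM path) to each row and serializing the result. A minimal sketch, assuming newline-delimited JSON output (the XDM paths in the usage example are illustrative):

```python
import json


def set_path(record: dict, dotted: str, value) -> None:
    """Set a value at a dotted XDM path, creating intermediate objects as needed."""
    parts = dotted.split(".")
    node = record
    for part in parts[:-1]:
        node = node.setdefault(part, {})
    node[parts[-1]] = value


def rows_to_xdm_ndjson(rows, mapping) -> str:
    """Render rows as newline-delimited JSON, one XDM record per line."""
    lines = []
    for row in rows:
        record: dict = {}
        for source_field, xdm_path in mapping.items():
            set_path(record, xdm_path, row[source_field])
        lines.append(json.dumps(record, sort_keys=True))
    return "\n".join(lines)
```

The resulting file is what lands in the monitored S3 location for AEP's batch ingestion to pick up.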
Bidirectional closed loop
The same pipeline architecture operates in reverse — AEP audience segments flow back into Snowflake for downstream activation.
Snowflake → AEP
Customer profiles extracted via SQL, validated, transformed, mapped to XDM, and delivered as batch files to AEP for segmentation and personalization.
AEP → Snowflake
Audience segments exported from AEP to S3, parsed and mapped by an Ingest Labs pipeline, and written back to Snowflake for CRM campaigns and analytics.
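The parsing step in the reverse direction can be sketched as follows. This assumes a newline-delimited JSON profile export carrying a `segmentMembership` map and an email identity; the exact export layout depends on the AEP destination configuration, so treat the field paths as assumptions:

```python
import json


def segment_rows(export_text: str, segment_id: str):
    """Yield (profile_id, status) rows for one audience, ready to load into Snowflake."""
    for line in export_text.splitlines():
        if not line.strip():
            continue
        profile = json.loads(line)
        membership = profile.get("segmentMembership", {}).get("ups", {})
        if segment_id in membership:
            # Assumes an email identity is present; a real pipeline would
            # handle multiple identity namespaces and missing identities.
            profile_id = profile["identityMap"]["email"][0]["id"]
            yield profile_id, membership[segment_id]["status"]
```

Each yielded row maps directly onto a Snowflake table row for CRM campaigns and analytics.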
Results
| Metric | Detail |
|---|---|
| Field activation time | Reduced from days per field to minutes — visual mapping replaces hand-coded XDM |
| Schema synchronization | Live schema fetch from AEP with drift detection — eliminates silent failures |
| Debugging time | Reduced from days to minutes — row-level audit trail with payload inspection |
| Customer data in SaaS | Zero — all data processing executes inside the customer's cloud perimeter |
| Ingestion pattern | File-based batch — no streaming capacity consumed, no RPS competition |
| Bidirectional flow | Same platform, same connectors — no separate tooling for AEP-to-Snowflake |
| Pipeline versioning | Full version history — every schema change and mapping update tracked |
Security architecture
Orchestration and execution are separated by design — customer data never leaves the bank's cloud.
Ingest Labs SaaS
- Pipeline definitions & scheduling
- Schema mappings & monitoring
- Zero customer data stored
- IAM scope: submit Batch jobs only
Customer's AWS account
- Queries Snowflake (VPN access)
- Transforms data & writes to S3
- All customer data stays here
- No VPN tunneling required
A financial institution that was spending days to activate a single field into AEP now maps entire datasets in minutes — with built-in validation, schema synchronization, and row-level auditability — all while keeping every byte of customer data inside their own cloud.
Ready to eliminate the data engineering bottleneck?
Visual schema mapping, built-in validation, and zero data leaving your cloud.