NEW Data Integrity Score · Ingest Labs

Data quality,
without the DQ tax.

Know the integrity of your data at the partition, not the incident. A 0–100 Data Integrity Score, your own rules, millions of rows — without the bloated DQ platform.

Start Free Trial
9
Built-in rule types
▲ Bring your own too
30m
Partition granularity
▲ Incremental · scalable
M+ rows
Per check, per partition
▲ Runs on your warehouse
Runs on: Data Warehouses · Traditional Databases · OLAP Data Stores
Data Integrity
Last 3 days · updated 32s ago
LIVE
98.0
DIS · 0–100
10 Million
Total scanned rows
289
Failed rows
Top violations
Transaction ID — Missing on purchase
Conditional Not Null · event=purchase
142 2.1%
Add to Cart — Items missing
Completeness · ≥1 item per event
89 0.4%
Event ID — Duplicate
Unique · All values unique
58 0.0%
Why DIS

You don't need a bloated DQ platform

Legacy data quality tools charge enterprise money to run some SQL. We give you the rules, the scale, and the score — and you keep the keys to your data.

Legacy DQ Platform
Enterprise tooling, enterprise drag
Monte-Carlo-style platforms assume you'll rebuild your stack around them.
Another platform to install
Separate SaaS or on-prem stack — your data leaves your warehouse to be checked.
Opinionated, vendor-picked rules
Defaults based on what the vendor detects — adding your own rules means a ticket and a sprint.
Per-row or per-seat pricing
Cost scales with volume, not value. Million-row tables become line-item negotiations.
Reactive — incident-based alerts
You find out something is broken after it already flowed downstream.
Black-box scoring
A traffic-light dashboard without the context you need to actually act on it.
Starting at $60K+ / year
Ingest Labs · Data Integrity Score
Config-driven, partition-scoped, yours
Built to live inside the data platform you already have — not replace it.
Zero new infrastructure
Runs on the data stores you already own, whether OLTP, OLAP, or Iceberg. Trino executes checks on OLAP and Iceberg targets, and OLTP databases get direct SQL. Your data stays put.
Bring your own rules
All 9 rule types plus custom SQL. Declarative config in a transactional DB, versioned with the rest of your stack.
Predictable cost
Pay for compute you already pay for. No per-row fees, no seat counts, no volume tiers.
Proactive — partition-scoped
Checks run at the 30-minute partition, before bad rows fan out to reports, models, or ads.
A single 0–100 Data Integrity Score
Composite, weighted, and visible per table, per column, per partition. Scan once, act anywhere.
Included with the IL Platform
Core Capabilities

Everything you need to trust every row

DIS runs inside your warehouse across every partition — declaring rules, scoring results, and surfacing issues before they reach dashboards, models, or ad platforms.

01 —
Proactive, at the partition

Checks run every 30 minutes on the partition that just landed. Bad rows are caught before they flow to dashboards, ad platforms, or ML features.

30-min buckets
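
As a rough illustration of what a 30-minute bucket means, the sketch below floors a timestamp to the start of its bucket. The helper name `partition_start` and the UTC-aligned boundaries are assumptions made for this example, not a documented part of DIS.

```python
from datetime import datetime, timedelta, timezone

BUCKET = timedelta(minutes=30)  # partition granularity quoted on this page


def partition_start(ts: datetime, bucket: timedelta = BUCKET) -> datetime:
    """Floor a timezone-aware timestamp to the start of its bucket.

    Hypothetical helper: buckets are assumed to align to the Unix epoch in UTC.
    """
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    elapsed = ts - epoch
    # timedelta // timedelta yields the number of whole buckets elapsed
    return epoch + (elapsed // bucket) * bucket


landed = datetime(2024, 5, 1, 10, 47, 12, tzinfo=timezone.utc)
start = partition_start(landed)
print(start.isoformat(), "to", (start + BUCKET).isoformat())
```

A row landing at 10:47:12 UTC falls in the bucket that starts at 10:30:00, so a check scoped to that bucket would scan only the rows from 10:30 up to 11:00.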
02 —
Bring your own rules

All 9 built-in rule types (Not Null, Unique, Freshness, In Range, In List, Matches Regex, Row Count Min, Completeness, Conditional Not Null), plus custom SQL for the rules only you need.

9 types + custom SQL
03 —
Data Integrity Score

Every table, every column, every partition gets a single composite score. Scan once, triage anywhere — no more hunting through raw pass/fail counts.

0–100 · weighted
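
The page does not publish the scoring formula, so the following is only a minimal sketch of one way a weighted composite could work: a weighted average of per-check pass rates, scaled to 0–100. The `CheckResult` class, the `weight` field, and the sample numbers are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class CheckResult:
    rule: str            # e.g. "NOT_NULL"
    scanned_rows: int
    failed_rows: int
    weight: float        # hypothetical importance weight, not a documented field

    @property
    def pass_rate(self) -> float:
        # An empty partition is treated as passing in this sketch
        if self.scanned_rows == 0:
            return 1.0
        return 1.0 - self.failed_rows / self.scanned_rows


def integrity_score(results: list[CheckResult]) -> float:
    """Weighted average of pass rates, scaled to 0-100 (assumed formula)."""
    total_weight = sum(r.weight for r in results)
    if total_weight == 0:
        raise ValueError("at least one check needs a positive weight")
    weighted = sum(r.weight * r.pass_rate for r in results)
    return round(100.0 * weighted / total_weight, 1)


results = [
    CheckResult("NOT_NULL", scanned_rows=1000, failed_rows=0, weight=3.0),
    CheckResult("UNIQUE", scanned_rows=1000, failed_rows=50, weight=2.0),
    CheckResult("IN_LIST", scanned_rows=1000, failed_rows=100, weight=1.0),
]
print(integrity_score(results))
```

Under this assumed formula, a failure in a heavily weighted check pulls the score down more than the same failure in a lightly weighted one, which is one plausible reading of "weighted".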
04 —
Runs where your data lives

Trino for OLAP/Iceberg targets, direct SQL for OLTP databases. No data egress, no replication, no shadow warehouse of duplicate rows.

OLAP · OLTP
05 —
Zero new infrastructure

Rules, targets, and schedules live as config rows in your transactional DB. No sidecar agent, no separate SaaS, no new thing for your SRE team to babysit.

Config-driven
06 —
Health Map for coverage

See pass rate per target-rule combination across every dataset — the squares tell you what's green, what's amber, and what you haven't covered yet.

Target × Rule
Health Map

Data quality coverage at a glance

One grid — targets on the rows, rule types on the columns, pass rate in the cells. You see what's green, what's failing, and what you haven't covered yet, without digging through a single dashboard.

Coverage Matrix
Pass rate per target-rule combination · last 3 days
96% pass rate · 41/43 checks
Legend: 100% passed · 80–99% passed · <80% passed · Not assigned
Rule columns: Not Null · Unique · Completeness · In Range · In List · Matches Regex · Row Count Min · Freshness · Conditional Not Null
Pass rate (%) per assigned rule, by target:
Order ID: 100 · 50 · 100 · 100 · 100
Transaction ID: 100 · 100 · 100 · 100 · 100 · 100
Customer ID: 92 · 100 · 88 · 100
Cart Total: 100 · 100 · 100
Cart Items: 100 · 100 · 100
Product SKU: 100 · 100 · 100
Currency: 100 · 100 · 100
Country Code: 100 · 100 · 100
Payment Method: 100 · 100 · 100
Coupon Code: 85 · 100 · 100
Campaign ID: 100 · 100
Traffic Source: 100 · 100
38 checks passing
2 warnings
1 critical, needs attention
Coverage gaps surface instantly — add a rule in seconds
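
To make the grid concrete, here is a small sketch that turns check results into a target × rule matrix using the four legend states on this page. The function names, the row-based definition of pass rate, and the sample data are assumptions for illustration only.

```python
def cell_status(pass_rate):
    """Map a pass rate (0-100, or None) to one of the legend states."""
    if pass_rate is None:
        return "not assigned"
    if pass_rate >= 100:
        return "100% passed"
    if pass_rate >= 80:
        return "80-99% passed"
    return "<80% passed"


def build_health_map(results, targets, rules):
    """Build {target: {rule: status}} from (target, rule, passed, scanned) tuples.

    Any target-rule pair with no result is reported as a coverage gap.
    """
    rates = {}
    for target, rule, passed, scanned in results:
        rates[(target, rule)] = 100.0 * passed / scanned if scanned else None
    return {t: {r: cell_status(rates.get((t, r))) for r in rules} for t in targets}


# Hypothetical results: (target, rule, passed_rows, scanned_rows)
sample = [
    ("order_id", "NOT_NULL", 1000, 1000),
    ("order_id", "UNIQUE", 500, 1000),
    ("customer_id", "NOT_NULL", 920, 1000),
]
grid = build_health_map(sample, ["order_id", "customer_id"], ["NOT_NULL", "UNIQUE"])
for target, row in grid.items():
    print(target, row)
```

The missing `customer_id` × `UNIQUE` pair comes back as "not assigned", which is how a coverage gap would show up in a grid like this.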
Rule Library

9 rule types. Declarative. Your SQL when you need it.

Every rule is a config row in your transactional DB — versioned, reviewable, and portable. Attach rules to columns, schedule once, and let the engine decide where to run them.

NOT_NULL
Not Null

Flag the partition when any value in the column is NULL. The simplest integrity guard — at warehouse scale.

OLAP · OLTP Active
UNIQUE
Unique

Every value in the partition must be unique. Catches duplicate order IDs and transaction leaks as they land.

OLAP · OLTP Active
COMPLETENESS
Completeness

The percentage of non-null values must stay above a configurable threshold, so a gradual drop in coverage is caught early.

OLAP · OLTP Active
IN_RANGE
In Range

Numeric values must sit within a min/max bound. Catches negative totals, impossible ages, outlier prices.

OLAP · OLTP Active
IN_LIST
In List

Values must match a configured allow-list. Great for currencies, country codes, event names, or status fields.

OLAP · OLTP Active
MATCHES_REGEX
Matches Regex

Values must conform to a regex pattern. Enforce SKU formats, ISO country codes, UUID shapes, or tracking IDs.

OLAP · OLTP Active
ROW_COUNT_MIN
Row Count Min

Table or partition must contain at least N rows. Surfaces pipelines that silently stopped or underdelivered.

OLAP · OLTP Active
FRESHNESS
Freshness

The most recent row must have landed within the last N hours. Catches stalled upstream jobs and delayed ingest.

OLAP · OLTP Active
CONDITIONAL_NOT_NULL
Conditional Not Null

When one column equals a value, another column must not be NULL. For fields required only in specific contexts.

OLAP · OLTP Active
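
The rule descriptions above map naturally onto SQL that counts violating rows. The sketch below shows one possible translation for four of the rule types, run against an in-memory SQLite table standing in for a warehouse. The function `failed_rows_sql`, its parameter names, and the sample table are assumptions; the real engine targets Trino or OLTP SQL and may generate different queries.

```python
import sqlite3


def failed_rows_sql(rule_type: str, table: str, column: str, **p) -> str:
    """Return a query that counts violating rows (hypothetical translation).

    Identifiers and values are interpolated directly, so this sketch assumes
    they come from trusted config, never from user input.
    """
    if rule_type == "NOT_NULL":
        return f"SELECT COUNT(*) FROM {table} WHERE {column} IS NULL"
    if rule_type == "UNIQUE":
        # Every occurrence beyond the first of a value counts as a duplicate
        return f"SELECT COUNT({column}) - COUNT(DISTINCT {column}) FROM {table}"
    if rule_type == "IN_RANGE":
        return (f"SELECT COUNT(*) FROM {table} "
                f"WHERE {column} < {p['min']} OR {column} > {p['max']}")
    if rule_type == "CONDITIONAL_NOT_NULL":
        return (f"SELECT COUNT(*) FROM {table} "
                f"WHERE {p['when_column']} = '{p['equals']}' AND {column} IS NULL")
    raise ValueError(f"unsupported rule type in this sketch: {rule_type}")


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (event_id TEXT, event TEXT, transaction_id TEXT, cart_total REAL)"
)
conn.executemany("INSERT INTO events VALUES (?, ?, ?, ?)", [
    ("e1", "purchase", "t1", 20.0),
    ("e2", "purchase", None, 35.5),     # purchase without a transaction ID
    ("e2", "add_to_cart", None, -4.0),  # duplicate event ID, negative total
    ("e3", "page_view", None, 0.0),
])

checks = [
    ("NOT_NULL", "event_id", {}),
    ("UNIQUE", "event_id", {}),
    ("IN_RANGE", "cart_total", {"min": 0, "max": 10000}),
    ("CONDITIONAL_NOT_NULL", "transaction_id",
     {"when_column": "event", "equals": "purchase"}),
]
failed = {}
for rule_type, column, params in checks:
    sql = failed_rows_sql(rule_type, "events", column, **params)
    failed[rule_type] = conn.execute(sql).fetchone()[0]
print(failed)
```

Each query returns a single failed-row count, which keeps the result shape the same across rule types and makes it straightforward to feed into a score or a coverage grid.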
Need a rule we don't ship? Bring your own SQL.
The custom rule type runs any parameterized query against your OLAP or OLTP database, with the same scheduling, the same scoring, and the same Health Map.
CUSTOM_SQL ·  `SELECT COUNT(*) FROM ...`
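
As a hedged sketch of what a parameterized custom rule could look like, the example below stores a query and its parameters in a config dictionary and binds them at run time. The dictionary keys, the `orders` table, and the convention that the query returns a failed-row count are assumptions; SQLite is used only so the example runs anywhere.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id TEXT, currency TEXT, total REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", [
    ("o1", "USD", 10.0),
    ("o2", "USD", 0.0),
    ("o3", "EUR", 12.5),
])

# Hypothetical custom rule: orders in a given currency must have a positive total.
custom_rule = {
    "rule_type": "CUSTOM_SQL",
    "sql": "SELECT COUNT(*) FROM orders "
           "WHERE currency = :currency AND total <= :min_total",
    "params": {"currency": "USD", "min_total": 0},
}

# Named placeholders are bound by the driver rather than spliced into the SQL string
failed_rows = conn.execute(custom_rule["sql"], custom_rule["params"]).fetchone()[0]
print(failed_rows)
```

Binding parameters through the driver keeps the stored query reusable across targets with different values and avoids string concatenation.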
How It Works

From rule to score in minutes

01
Config · Declarative
Define rules

Write rules as config rows — 9 built-in types or your own custom SQL. Versioned in the repo alongside the rest of your data stack.

~1 min
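
The config schema is not shown on this page, so the following is only an illustrative guess at what a rule row might contain. The `RuleConfig` class and every field name in it are hypothetical; only the rule type names come from the Rule Library.

```python
import json
from dataclasses import asdict, dataclass, field

# Rule type identifiers as listed in the Rule Library, plus CUSTOM_SQL
RULE_TYPES = {
    "NOT_NULL", "UNIQUE", "COMPLETENESS", "IN_RANGE", "IN_LIST",
    "MATCHES_REGEX", "ROW_COUNT_MIN", "FRESHNESS", "CONDITIONAL_NOT_NULL",
    "CUSTOM_SQL",
}


@dataclass
class RuleConfig:
    """One rule as a config row (hypothetical schema)."""
    rule_id: str
    rule_type: str
    params: dict = field(default_factory=dict)
    weight: float = 1.0  # assumed input to the composite score

    def __post_init__(self):
        if self.rule_type not in RULE_TYPES:
            raise ValueError(f"unknown rule type: {self.rule_type}")


rule = RuleConfig(
    rule_id="txn_id_required_on_purchase",
    rule_type="CONDITIONAL_NOT_NULL",
    params={"when_column": "event", "equals": "purchase"},
)
print(json.dumps(asdict(rule), indent=2))
```

Keeping a rule as plain data like this is what makes it easy to review in a pull request and to store as a row.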
02
Target · Column binding
Attach to targets

Point each rule at a specific table × column × partition strategy. One rule can cover many datasets, and the mapping is fully reusable.

~10 sec
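
One way to picture "one rule can cover many datasets" is a binding table that pairs a rule with each table and column it applies to. The `Binding` class, its fields, and the sample targets are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Binding:
    """Attach one rule to one table x column x partition strategy (hypothetical)."""
    rule_id: str
    table: str
    column: str
    partition_minutes: int = 30  # granularity quoted on this page


def bind(rule_id: str, targets: list[tuple[str, str]]) -> list[Binding]:
    """Reuse a single rule definition across many (table, column) targets."""
    return [Binding(rule_id, table, column) for table, column in targets]


bindings = bind("id_not_null", [
    ("orders", "order_id"),
    ("events", "event_id"),
    ("customers", "customer_id"),
])
for b in bindings:
    print(b)
```

The rule itself is defined once, and adding coverage for a new dataset means adding one more binding rather than copying the rule.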
03
Runtime · Trino / OLTP
Execute at scale

Batch jobs run every 30 minutes on partitioned buckets, pushing checks as native SQL to the engine where your data already sits.

<60 sec
04
Audit · Iceberg results
Score & route

Results land in auditable Iceberg tables. The Data Integrity Score recomputes, the Health Map updates, and alerts route to Slack or PagerDuty.

Instant
End-to-end cadence: 30 min
Scales with your warehouse compute
Every check is auditable in Iceberg
Integrations

Connects to the stack you already run

20+ data sources · alert destinations · SSO
Trino · Query engine
Iceberg · Table format
PostgreSQL · OLTP source
Snowflake · Warehouse
BigQuery · Warehouse
Databricks · Lakehouse
Redshift · Warehouse
Slack · Alert channel
PagerDuty · Incident routing
11+ more
SQL-native | Webhooks | OAuth · SAML SSO | Config-as-code | SOC 2 Type II

Ready to trust every row?

Turn on Data Integrity Score and catch issues at the partition, not the incident — in the warehouse you already run.