One catalog for every dataset, in every cloud

Cloud Datasets brings together structured, semi-structured and unstructured data with rich lineage, ownership and quality signals.

Unified catalog

Auto-discover datasets across AWS, Azure, GCP and on-prem sources. Browse, search and request access in one place.

Lineage & quality

Column-level lineage, plus freshness, completeness and drift checks, surfaced on every dataset.

Secure sharing

Share datasets across teams, partners and regions without copying data. Revoke access in a single click.

Open by default

Standards-based metadata APIs and Iceberg-compatible tables — no lock-in.

Built for the scale of modern data

100+ source connectors
Databases, SaaS, files and event streams.

1M+ datasets governed
At our largest customers.

5 min to first dataset
From signup to a usable catalog.

Zero data copies
Federated access by default.

How it works

  1. Connect sources

    Bring in warehouses, lakes, SaaS apps and file stores with one-click connectors.

  2. Auto-classify

    PII detection, business glossary tagging and ownership inferred from usage.

  3. Govern access

    Define policies once and enforce them across SQL, BI, notebooks and AI agents.

  4. Share safely

    Cross-team, cross-cloud and cross-region sharing with full audit and revocation.

Frequently asked questions

Where is my data stored?

Your data stays in your cloud accounts. Avaloka indexes metadata and brokers access — we don't move or copy your data.

How is PII handled?

PII is automatically detected and tagged. You can apply masking policies that travel with the dataset across every consumer.

Can I share data with partners?

Yes. Secure data sharing supports external accounts with expiring access, watermarking and full audit trails.

Is it built on open standards?

All metadata is exposed via REST and Iceberg-compatible APIs, so you can integrate with dbt, Great Expectations and your own tools.

Make every dataset discoverable and trusted

See how Cloud Datasets unifies your data estate in a single, governed catalog.