Data Observability for Data Warehousing

Trust all the stages of your data warehouse

Have you ever desperately stopped a data mart refresh to avoid loading invalid data?

Monitor the Data Quality of all tables at all stages of your data warehouse. Define the data dependencies between the tables. With the data lineage and the Data Quality alerts raised on upstream tables, you can detect potential issues in downstream tables before the data load happens.

Unified Data Quality Process

Define all Data Quality rules for all stages of the data warehouse the same way, in one place.

DQO.ai Data Quality rules are easy to attach: they are plain YAML files, edited with code completion. You can connect hundreds or thousands of Data Quality rules to cover the whole data warehouse without much effort. Define the dependencies between tables and you get visibility of all Data Quality issues across the whole data lineage.

  • Easily cover many tables with Data Quality checks, without clicking
  • Track issues in related tables by following the table dependencies
  • Migrate the Data Quality rules across environments by just copying files

Healthy Data Marts

Monitor all Data Quality issues in upstream tables that could cause invalid data to be loaded into the data mart.

DQO.ai stores the unique Data Quality alerts in its Data Quality database. Add code to your data mart loading pipeline that checks whether all Data Quality issues in all tables on the data lineage path are resolved.

  • Load the data mart tables incrementally only when there are no unresolved Data Quality issues in previous stages
  • Delay data mart refresh until Data Quality issues are resolved or accepted as minor issues
  • Protect your data marts from invalid data
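
The gating step described above can be sketched in a few lines of Python. This is a minimal illustration, not the DQO.ai API: the `UPSTREAM` lineage map, the table names, the issue records, and the `status` values (`open`, `resolved`, `accepted`) are all hypothetical, and a real pipeline would read the issues from the Data Quality database instead of a list.

```python
from collections import deque

# Hypothetical lineage: each table maps to the upstream tables it is loaded from.
UPSTREAM = {
    "mart.sales_summary": ["dwh.fact_sales", "dwh.dim_customer"],
    "dwh.fact_sales": ["staging.orders"],
    "dwh.dim_customer": ["staging.customers"],
}


def upstream_tables(table, upstream):
    """Return every table on the lineage path that feeds `table`."""
    seen = set()
    queue = deque(upstream.get(table, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue
        seen.add(current)
        queue.extend(upstream.get(current, []))
    return seen


def blocking_issues(table, upstream, issues):
    """List upstream issues that are neither resolved nor accepted as minor."""
    sources = upstream_tables(table, upstream)
    return [
        issue for issue in issues
        if issue["table"] in sources and issue["status"] == "open"
    ]


def can_load(table, upstream, issues):
    """Allow the load only when no open issue exists on the lineage path."""
    return not blocking_issues(table, upstream, issues)


# Hypothetical issue records, standing in for rows from a Data Quality database.
ISSUES = [
    {"table": "staging.orders", "check": "row_count", "status": "open"},
    {"table": "staging.customers", "check": "nulls", "status": "accepted"},
]

if __name__ == "__main__":
    for target in ("mart.sales_summary", "dwh.dim_customer"):
        verdict = "load" if can_load(target, UPSTREAM, ISSUES) else "delay"
        print(target, "->", verdict)
```

In this sketch the data mart refresh is delayed because of the open issue on `staging.orders`, while `dwh.dim_customer` can still load because its only upstream issue was accepted as minor.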

Incremental data loading

Avoid a full refresh by acting proactively on Data Quality issues in previous data stages. Delay the incremental load until the data is correct.

DQO.ai can analyze the data at multiple date and time gradients. Analyze the new data that the incremental load will use, and delay the incremental refresh when Data Quality issues are identified.

  • Avoid the full refresh of a data mart that would be needed after the fact table is loaded with wrong data
  • Detect duplicate data that would cause invalid sums of aggregated columns
  • Delay or abandon an incremental refresh if that could harm the data mart integrity
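
As an illustration of the duplicate check mentioned above, the following Python sketch inspects the rows of an incremental batch before they are loaded. The `order_id` key, the sample rows, and the function names are assumptions made for the example; they are not part of DQO.ai.

```python
from collections import Counter


def duplicate_keys(rows, key_columns):
    """Return business keys that occur more than once in the batch, with counts."""
    counts = Counter(tuple(row[column] for column in key_columns) for row in rows)
    return {key: count for key, count in counts.items() if count > 1}


def should_delay_load(rows, key_columns):
    """Delay the incremental load when the batch contains duplicate keys."""
    return bool(duplicate_keys(rows, key_columns))


# Hypothetical increment: order 2 was delivered twice by the source system.
new_rows = [
    {"order_id": 1, "amount": 100.0},
    {"order_id": 2, "amount": 50.0},
    {"order_id": 2, "amount": 50.0},
]

if __name__ == "__main__":
    print("duplicates:", duplicate_keys(new_rows, ["order_id"]))
    print("delay load:", should_delay_load(new_rows, ["order_id"]))
```

Loading this batch as-is would add 200.0 to the aggregated amount instead of the correct 150.0, which is the kind of invalid sum the check is meant to prevent.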

Data Quality documentation

Document the tables in your data warehouse as a set of easy-to-understand Data Quality checks.

Data Quality checks defined in easy-to-read YAML files can be shared with Business Intelligence developers. BI developers and Data Scientists can see at a glance which columns are unique or not null, and what format each column uses.

  • Avoid questions about the data formats of columns from BI developers and Data Scientists
  • Build a knowledge base of your data warehouse
  • Verify the quality of your data warehouse by running the Data Quality checks
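
To show how a set of checks can double as documentation, here is a small Python sketch that turns per-column checks into a readable summary. The table name, column names, and check labels are hypothetical, and the dictionary only stands in for check definitions that would normally live in YAML files.

```python
def describe_table(table, column_checks):
    """Render the checks attached to each column as a plain-text summary."""
    lines = [f"Table {table}"]
    for column, checks in column_checks.items():
        summary = ", ".join(checks) if checks else "no checks"
        lines.append(f"  {column}: {summary}")
    return "\n".join(lines)


# Hypothetical checks; in practice these would be read from the check files.
CUSTOMER_CHECKS = {
    "customer_id": ["unique", "not null"],
    "email": ["not null", "format: email address"],
    "notes": [],
}

if __name__ == "__main__":
    print(describe_table("dwh.dim_customer", CUSTOMER_CHECKS))
```

A summary like this answers the common questions about uniqueness, nullability, and column formats without anyone having to query the table.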

No one can understand your data like we do!