We want to build
The first developer-friendly Data Observability platform.
The next-generation Data Observability platform should go beyond simple anomaly detection and dependency tracking.
Data Observability (detecting anomalies) should be combined with Data Quality (is your data correct?) and Data Curation (holding back data that is invalid; you can always reload healthy data later). Data Engineers should be able to implement business-specific Data Quality checks easily, so extensibility is a core requirement.
The Data Curation goal requires a Data Observability platform that can be called from data pipelines and can stop or pause processing when an error occurs. Our customers have repeatedly asked us how to stop a data loading pipeline when data quality requirements are not met.
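
To make the idea concrete, here is a minimal sketch of what a quality gate called from inside a pipeline might look like. It is only an illustration, not the platform's actual API: every name in it (CheckResult, DataQualityError, not_null, non_negative, quality_gate, load_pipeline) is hypothetical, rows are assumed to be plain dictionaries, and the "warehouse" is just a Python list.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical types for illustration; not part of any real platform API.
Row = Dict[str, object]


@dataclass
class CheckResult:
    name: str
    passed: bool
    detail: str = ""


class DataQualityError(Exception):
    """Raised to stop the pipeline when at least one check fails."""


Check = Callable[[List[Row]], CheckResult]


def not_null(column: str) -> Check:
    """Generic check: the column must not contain None."""
    def check(rows: List[Row]) -> CheckResult:
        missing = sum(1 for r in rows if r.get(column) is None)
        return CheckResult(f"not_null({column})", missing == 0,
                           f"{missing} null value(s)")
    return check


def non_negative(column: str) -> Check:
    """Example of a business-specific check written by a data engineer."""
    def check(rows: List[Row]) -> CheckResult:
        bad = sum(1 for r in rows
                  if r.get(column) is not None and r[column] < 0)
        return CheckResult(f"non_negative({column})", bad == 0,
                           f"{bad} negative value(s)")
    return check


def quality_gate(rows: List[Row], checks: List[Check]) -> List[Row]:
    """Run every check; raise DataQualityError if any of them fails."""
    failed = [res for res in (c(rows) for c in checks) if not res.passed]
    if failed:
        raise DataQualityError(
            "; ".join(f"{r.name}: {r.detail}" for r in failed))
    return rows


def load_pipeline(rows: List[Row], checks: List[Check],
                  sink: List[Row]) -> None:
    """Load rows into the sink only if the quality gate passes."""
    sink.extend(quality_gate(rows, checks))
```

In this sketch, calling load_pipeline with a batch that violates a check raises DataQualityError before anything is written to the sink, so the invalid batch is held back and can be reloaded once it has been fixed. New checks are ordinary functions, which is one simple way to keep the set of checks extensible.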