AI-powered Automated Data Quality on Data Lakes

Technical talk | English

Theatre 21: Track 5

Thursday - 16.10 to 16.50 - Technical


Do you keep hearing about data quality? Are you tired of manually checking and querying for data quality issues on your Data Lake? Without consistent governance principles, a common set of data quality rules, and an automated solution, a Data Lake can quickly turn into a data swamp. But how do we adopt or develop an automated data quality solution? In this talk, we will answer this question by exploring different data quality dimensions and metrics. We will then build data pipelines with Azure Databricks to detect and predict data quality issues on different datasets, and visualize the results in Power BI. We will use a real case study as an example.
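
To make "data quality dimensions and metrics" concrete, here is a minimal sketch of two commonly used dimensions, completeness and uniqueness, computed per column. It is not taken from the talk: the function names, the `sample` rows, and the metric definitions are illustrative assumptions, and it uses plain Python rather than the Azure Databricks pipelines mentioned in the abstract.

```python
from typing import Any, Dict, List, Optional

# A row is a plain dict; None stands for a missing value (assumption).
Row = Dict[str, Optional[Any]]


def completeness(rows: List[Row], column: str) -> float:
    """Share of rows where `column` is present and not None."""
    if not rows:
        return 1.0
    filled = sum(1 for r in rows if r.get(column) is not None)
    return filled / len(rows)


def uniqueness(rows: List[Row], column: str) -> float:
    """Share of non-null values in `column` that are distinct."""
    values = [r[column] for r in rows if r.get(column) is not None]
    if not values:
        return 1.0
    return len(set(values)) / len(values)


# Hypothetical sample data: one missing email, one duplicated id,
# and one duplicated email.
sample: List[Row] = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": None},
    {"id": 2, "email": "b@example.com"},
    {"id": 4, "email": "a@example.com"},
]

# One small quality report per column.
report = {
    col: {
        "completeness": completeness(sample, col),
        "uniqueness": uniqueness(sample, col),
    }
    for col in ("id", "email")
}

for col, metrics in report.items():
    print(col, metrics)
```

In an automated setup, scores like these would typically be computed on a schedule for each dataset and compared against agreed thresholds, so that a drop is flagged without anyone querying the data by hand.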