AI-powered Automated Data Quality on Data Lakes
Technical talk | English
Theatre 21: Track 5
Thursday - 16.10 to 16.50 - Technical
Have you heard about data quality again and again? Are you tired of manually checking and querying for data quality issues on your Data Lake? Without consistent governance principles, a common set of data quality rules, and an automated solution, a Data Lake can quickly turn into a data swamp. But how do we adopt or develop an automated data quality solution? In this talk, we will answer this question by exploring different data quality dimensions and metrics. We will then build data pipelines using Azure Databricks to detect and predict data quality issues on different datasets, and visualize the results in Power BI. We will use a real case study as an example.
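
To give a flavour of what a data quality metric looks like in practice, here is a minimal sketch in plain Python of two commonly cited dimensions, completeness and uniqueness. It is not the pipeline shown in the talk: the function names, the sample records, and the choice to treat `None` as a missing value are all assumptions made for illustration.

```python
from typing import Any, Dict, List, Optional

# A record is modelled as a dict; None stands for a missing value (assumption).
Row = Dict[str, Optional[Any]]


def completeness(rows: List[Row], column: str) -> float:
    """Share of rows whose value in `column` is present (not None)."""
    if not rows:
        return 1.0  # an empty dataset has no missing values
    present = sum(1 for row in rows if row.get(column) is not None)
    return present / len(rows)


def uniqueness(rows: List[Row], column: str) -> float:
    """Share of non-null values in `column` that are distinct."""
    values = [row[column] for row in rows if row.get(column) is not None]
    if not values:
        return 1.0  # nothing to duplicate
    return len(set(values)) / len(values)


# Hypothetical sample records, invented for illustration only.
sample: List[Row] = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": None},
    {"id": 2, "email": "b@example.com"},
    {"id": 4, "email": "a@example.com"},
]

if __name__ == "__main__":
    print("email completeness:", completeness(sample, "email"))
    print("id uniqueness:", uniqueness(sample, "id"))
```

On a real Data Lake the same ratios would typically be computed with a distributed engine rather than in-memory lists; the sketch only illustrates the definition of the metrics.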