Data Quality Assessment Solution
“Bad data costs the US $3.1 trillion per year.”
–Thomas Redman, Harvard Business Review
Understand the questions you can ask of your data, and your probability of success, before you begin a costly and time-consuming analytics project.
Analyze with confidence
Information Quality Score – indicates the amount of meaningful information available to build a stable predictor.
Information Consistency Score – indicates whether the data can be represented in a single model or needs to be segmented.
Identify meaningful, irrelevant and duplicate data
Central & Alternate Targets – identifies the best target candidates within a dataset.
Proxies to the Target – identifies columns that have redundant information relevant to the target.
Non-Informative Columns – identifies columns that have no relevant information for any analysis.
Duplicate Columns – identifies columns that contain redundant information.
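As a rough illustration of the last two checks, the sketch below flags non-informative and duplicate columns in a small in-memory table. It is not the product's method, which is not described here. It assumes the simplest possible definitions: a column is "non-informative" if it holds a single distinct value, and two columns are "duplicates" if they match row for row. The function names and the sample table are hypothetical.

```python
from collections import defaultdict


def non_informative_columns(table):
    """Return names of columns holding at most one distinct value (assumed criterion)."""
    return [name for name, values in table.items() if len(set(values)) <= 1]


def duplicate_columns(table):
    """Group columns whose values are identical row for row (assumed criterion)."""
    groups = defaultdict(list)
    for name, values in table.items():
        # Columns with the same sequence of values land in the same group.
        groups[tuple(values)].append(name)
    return [names for names in groups.values() if len(names) > 1]


# Hypothetical sample data: column name -> list of row values.
table = {
    "customer_id": [1, 2, 3, 4],
    "region": ["N", "S", "N", "E"],
    "region_copy": ["N", "S", "N", "E"],
    "country": ["US", "US", "US", "US"],
}

print(non_informative_columns(table))  # ['country']
print(duplicate_columns(table))        # [['region', 'region_copy']]
```

A real assessment would likely use softer criteria (near-constant columns, columns that are re-encodings of one another), but exact matching is enough to show the idea.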
A connected directed acyclic graph that represents the hierarchy of relationships among the columns in the dataset.