Data Quality Assessment Solution

“Bad data costs the US $3.1 trillion per year.”
Thomas Redman, Harvard Business Review

Understand the questions you can ask of your data, and your probability of success, before you begin a costly and time-consuming analytics project.

Analyze with confidence

We ran an open-source dataset through Iris, just to show off a little.

See Iris Soccer Results

Target Data

Information Quality Score – indicates the amount of meaningful information available to build a stable predictor.
Information Consistency Score – indicates whether the data can be represented in a single model or needs to be segmented.

Identify meaningful, irrelevant and duplicate data

Central & Alternate Targets – identifies the best target candidates within a dataset.
Proxies to the Target – identifies columns that have redundant information relevant to the target.
Non-Informative Columns – identifies columns that have no relevant information for any analysis.
Duplicate Columns – identifies columns that contain redundant information.
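
To make the last two categories concrete, here is a minimal sketch of the simplest possible versions of these checks. It is not how Iris works internally: it only flags columns whose values never vary (a crude stand-in for "non-informative") and columns that match another column exactly, row for row. The function names and the toy soccer-style table are hypothetical, chosen purely for illustration.

```python
from collections import defaultdict


def find_constant_columns(table):
    """Return the names of columns whose values never vary."""
    return [name for name, values in table.items() if len(set(values)) <= 1]


def find_duplicate_columns(table):
    """Group columns that hold exactly the same values, row for row."""
    groups = defaultdict(list)
    for name, values in table.items():
        groups[tuple(values)].append(name)
    return [names for names in groups.values() if len(names) > 1]


# Hypothetical toy dataset: column name -> list of row values.
table = {
    "home_goals": [2, 0, 1, 3],
    "away_goals": [1, 0, 1, 0],
    "home_goals_copy": [2, 0, 1, 3],
    "league": ["A", "A", "A", "A"],
}

constant = find_constant_columns(table)
duplicates = find_duplicate_columns(table)
print("Constant columns:", constant)
print("Duplicate groups:", duplicates)
```

Exact matching like this misses columns that carry the same information in a different encoding (for example, a numeric code and its text label), which is where a dedicated assessment earns its keep.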

Information Graph

A connected, directed acyclic graph representing the hierarchy of relations among the columns in the dataset.
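
As a rough illustration of what such a graph looks like as data, the sketch below stores hypothetical column relations as a mapping from each column to the columns it depends on, then checks the two properties named above. The column names and relations are invented for the example, "connected" is assumed to mean connected when edge direction is ignored, and none of this reflects how Iris builds its graph. It uses Python's standard `graphlib` module (Python 3.9 or later).

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical relations: each column maps to the columns it depends on.
relations = {
    "match_result": {"home_goals", "away_goals"},
    "goal_difference": {"home_goals", "away_goals"},
    "points": {"match_result"},
}


def is_acyclic(graph):
    """True if the directed graph contains no cycle."""
    try:
        list(TopologicalSorter(graph).static_order())
        return True
    except CycleError:
        return False


def is_connected(graph):
    """True if every column is reachable from every other, ignoring edge direction."""
    neighbours = {}
    for node, parents in graph.items():
        neighbours.setdefault(node, set())
        for parent in parents:
            neighbours[node].add(parent)
            neighbours.setdefault(parent, set()).add(node)
    if not neighbours:
        return True
    start = next(iter(neighbours))
    seen = {start}
    stack = [start]
    while stack:
        for nxt in neighbours[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return len(seen) == len(neighbours)


# A topological order lists every column after the columns it depends on.
order = list(TopologicalSorter(relations).static_order())
print("Acyclic:", is_acyclic(relations))
print("Connected:", is_connected(relations))
print("Hierarchy order:", order)
```

Reading the order from left to right walks the hierarchy from the most basic columns to the most derived ones.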