
IF DATA IS the new oil – and has become the world’s most valuable resource – then we are all in a bit of trouble. There are major spills everywhere, very little clean-up equipment, and many of the owners are blissfully ignorant of the true value of the resource they hold, and the importance of keeping it clean.

What’s more, data isn’t like oil in that it doesn’t run out. It is ubiquitous, continuous and can be reused and reworked in countless ways. But that’s a story for another day. Data volumes are exploding. More data has been created in the last two years than in the entire previous history of the human race.

Numerous studies estimate that bad data costs US businesses alone $600 billion annually – a mind-boggling number, although perhaps itself based on bad data, so who really knows?

The cost of errors

Whatever the overall numbers, there is no doubt that data errors are costly for institutions in many respects – remediation expenditure, compensation payments, and reputational damage just to name a few. Not only that, but the longer a data quality error goes undetected and unresolved, the greater the harm to bottom lines for institutions and members.

The exponential harm caused by the proliferation of data errors can aptly be characterised as the ‘disease effect’. If data quality issues are not detected and remedied soon after they occur, there is a tendency for the error to spread and contaminate other data, even jumping to other systems.

Data quality is not just an IT or technology issue; there are all too real financial consequences as well. And they are pervasive. The financial harm of poor data typically follows some variation of the 1-10-100 rule.

The traditional interpretation of this rule suggests that verifying the quality of a data record costs $1. Remediation by cleansing and de-duplicating a record costs $10. Working with a record that is of poor quality and inaccurate costs $100.

However, in QMV’s experience, it is closer to a 1-5-50 rule, whereby if the error is identified after:

One day, the remediation cost is minimal. While the errant data requires correction, the error can be quarantined to minimise any external visibility and further impact.

One month, there is a five-fold increase in the cost to remediate. The error may have filtered into several monthly processes (fees, premiums), some investors may have left or transferred products (super to pension), and the remedial costs begin to escalate.

One year, there is a 50-fold increase in the cost to remediate. By then, the error has probably filtered into several annual processes (member statements, taxes, Australian Prudential Regulation Authority reporting), and may now be a breach that requires compensation and additional reporting to various external stakeholders.
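The escalation the 1-5-50 rule describes can be sketched numerically. The base cost, multipliers and record counts below are illustrative assumptions only – each fund's actual figures will differ:

```python
# Illustrative sketch of the 1-5-50 rule: remediation cost multiplies
# the longer an error goes undetected. All figures are hypothetical.

# Assumed base cost (in dollars) to fix one errant record on day one
BASE_COST_PER_RECORD = 1.0

# Multipliers from the 1-5-50 rule, keyed by how long the error persists
MULTIPLIERS = {
    "one_day": 1,    # quarantined quickly, minimal external impact
    "one_month": 5,  # error has filtered into monthly processes
    "one_year": 50,  # annual statements, tax and APRA reporting affected
}

def remediation_cost(records_affected: int, detected_after: str) -> float:
    """Estimate total remediation cost for a batch of errant records."""
    return records_affected * BASE_COST_PER_RECORD * MULTIPLIERS[detected_after]

# Example: the same 1,000 errant records, detected at different times
for horizon in ("one_day", "one_month", "one_year"):
    print(f"{horizon}: ${remediation_cost(1000, horizon):,.0f}")
```

The same batch of errors costs $1,000 to fix in the first scenario and $50,000 in the last – the arithmetic is trivial, but it makes the case for early detection concrete.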

The costs aren’t only financial. The damage poor data quality can do to customer service and an institution’s reputation is profound. There is nothing more likely to break the trust Australians place in financial services institutions than stuffing up people’s hard-earned savings.

The impact of poor data quality also extends to the compliance obligations of financial institutions. Poor data can cause a specific breach of laws, fund rules or policies. Even worse, systemic issues can lead to major breaches and regulatory intervention.

In terms of financial, reputational and non-compliance costs, it is clear that time makes things worse.

Groundhog day

At QMV, we sometimes refer to the data migration process as the Groundhog Day of quality errors. Bill Murray’s Phil Connors relives the same day over and over in the seminal ’90s cinematic classic, and most migration processes follow a similar path.

The recurring mistake is the timing of the data quality effort. Quality should always be the first phase of a migration, but it is too often dealt with as a post-migration activity. The blinkered focus on project deadlines often leads to the scaling back of data quality efforts; identified errors are reclassified as less severe and get overlooked.

Then, 18 months after the migration, funds get cornered into a costly, painful major data remediation program because the problem gets way out of hand.


How much do we spend on data quality?

Well done to anyone who can confidently answer this question, because you are absolutely in the minority. For everyone else, the answer is probably: more than you think, and more than you need to.

We can look at the spending on data quality through three classifications:

Prevention: controls designed to stop errors from happening
Detection: controls to identify errors once they occur
Correction: remediation and restoration of data.

Most executives view data quality as an expense rather than an investment. This may seem like an innocuous play on words, but it is an important distinction. Not understanding or measuring the return on investment that data quality can deliver leads to a reactive approach, in which data quality spending becomes heavily geared toward correction.

The key is to break the endless cycle of manually fixing defective data by transitioning from a reactive approach to one that finds the appropriate balance between all three classifications. In this way, corrective controls become more targeted, helping to mitigate the spread of problems.

How to improve quality

What can be done to improve data quality, and reduce related risks? There are a few key initiatives that can help eliminate the disease effect:

1 Develop data quality metrics: You cannot manage what you do not measure. Metrics will provide a factual basis on which to justify, focus and monitor efforts, while acting as a leading risk indicator.

2 Determine ownership: Efforts to improve data quality will not succeed without the oversight, collaboration and accountability of all key stakeholders.

3 Invest in a system: Break the cycle of Word documents, spreadsheets and SQL scripts. An integrated system can help drive the real cost efficiencies and risk management that true data quality can deliver.

4 Measure return on investment (ROI): The key to measuring ROI is choosing the metrics that matter most to the business – those that can be measured and offer the biggest potential for improvement.
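To make the first initiative concrete, a data quality metric can start as simply as a completeness rate over mandatory fields, tracked over time as a leading risk indicator. The field names and records below are hypothetical, for illustration only:

```python
# Minimal sketch of a data quality metric: completeness rate over
# mandatory member fields. Field names and records are hypothetical.

MANDATORY_FIELDS = ["member_id", "date_of_birth", "tax_file_number"]

def completeness_rate(records: list) -> float:
    """Fraction of mandatory field values that are present and non-empty."""
    total = len(records) * len(MANDATORY_FIELDS)
    if total == 0:
        return 1.0  # vacuously complete when there is nothing to check
    filled = sum(
        1
        for record in records
        for field in MANDATORY_FIELDS
        if record.get(field) not in (None, "")
    )
    return filled / total

records = [
    {"member_id": "M001", "date_of_birth": "1980-01-01", "tax_file_number": "123"},
    {"member_id": "M002", "date_of_birth": "", "tax_file_number": None},
]
print(f"Completeness: {completeness_rate(records):.0%}")
```

A handful of such metrics, computed on a schedule rather than ad hoc, gives the factual baseline that ownership, system investment and ROI measurement all depend on.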

Addressing the points above should help an organisation find the appropriate balance between prevention, detection and correction. This will lead to more $1 verifications and fewer $100 corrections.

Of course, each data error is different and there is no one right way to measure the associated cost, but the key theme is clear: the financial and reputational costs of data errors spread like a disease. Prevention is better than cure and remains the optimum solution, and early detection is a critical tool that can lower the cost and impact of errors.
