The Cost of Poor Data Management

It is surprising that data quality is still viewed as a luxury rather than a necessity. As an unapologetic data quality advocate, I’ve written white papers and blog posts about the value of good data management, but it takes the efforts of many to change habits. In her blog post, “The Costs of Poor Data Management,” on the Data Integration Blog, Julie Hunt breaks down the impact data quality has on business.

Here’s an infographic showing the cost poor data quality can have on a business.

[Infographic: Global research – bad customer data costs you millions]

She points out that the areas of data quality deserving the greatest focus are specific to each organization. If you’ve read my post, “Avoid Data Quality Pitfalls,” you know that I’m a proponent of good data governance. Update early and often. My top four suggestions are:

  • Translation Tables
  • Stored Procedures
  • Database Views
  • Validation Lookups, Tables, and Rules

What are yours? Read Julie’s post, and send me your comments.


Pitney Bowes Spectrum: Future-Proofing MDM, by Julie Hunt

“Data is the most valuable asset of any business and is the foundation for building lifetime customer relationships.” That means the accuracy of that data is mission-critical to building strong, healthy relationships with customers. Julie Hunt’s post on Hub Designs Magazine, “Pitney Bowes Spectrum: Future-Proofing MDM,” provides keen insight into how a 93-year-old company uses master data management to innovate for the future.

 

Hub Designs Magazine

A briefing by Pitney Bowes Software for the Hub Designs MDM Think Tank


Avoid Data Quality Pitfalls

If you haven’t experienced the frustration of trying to wade through duplicate and incorrect data, you’re one of the very few. Dirty data clogs up our databases and integration projects, and creates obstacles to getting the information we need from our data. It can be like trying to paddle through a sea of junk.

The value of our data lies in accurate reporting and in business intelligence that enables good business decisions. Good data governance is critical to a successful business as well as to meeting compliance requirements.


So how do we avoid the pitfalls of poor data quality?

Perform quality assurance activities for each step of the process. Data quality results from frequent and ongoing efforts to reduce duplication and update information. If that sounds like a daunting task, remember that using the right tools can save substantial time and money, as well as create better results.
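To make that less abstract, here’s a minimal sketch of one such quality assurance check in Python, using pandas. The column names and the one-year staleness threshold are assumptions for illustration, not from any particular system:

```python
# A hypothetical QA step: count duplicate and stale records in one extract.
# The column names ("email", "last_updated") and the one-year staleness
# threshold are illustrative assumptions, not from any particular system.
import pandas as pd

def qa_report(contacts: pd.DataFrame, stale_after_days: int = 365) -> dict:
    """Return simple data quality counts for one step of a pipeline."""
    duplicates = contacts[contacts.duplicated(subset=["email"], keep=False)]
    age = pd.Timestamp.today() - pd.to_datetime(contacts["last_updated"])
    stale = contacts[age > pd.Timedelta(days=stale_after_days)]
    return {
        "total_records": len(contacts),
        "duplicate_records": len(duplicates),
        "stale_records": len(stale),
    }

if __name__ == "__main__":
    sample = pd.DataFrame({
        "email": ["a@example.com", "a@example.com", "b@example.com"],
        "last_updated": ["2020-01-15", "2024-06-01", "2024-06-01"],
    })
    print(qa_report(sample))
```

Running a check like this at each step keeps problems visible as they happen, instead of letting them pile up until reporting time.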

Take the time to set clear and consistent rules for setting up your data. If you inherited a database, you can still update its governance to improve your data quality.

How to update data governance?

Recommendation: Updating data governance almost always means adding new code segments to existing data import/scrub/validation processes. A side effect of adding those segments is a “cleanup.” When code is updated to enforce data governance, it is usually applied only to new data entering the system. What about the data that was in the system before the new governance code? We want the new governance rules to hit existing data as well as new data, so you’ll need to build the new code segments into a separate process for (hopefully) a one-time cleanup of the existing data. Applying the updated governance code in conjunction with executing the “cleanup” brings governance current, updates existing data, and keeps the dataset uniform.
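Here’s a minimal sketch of that pattern, assuming a simple record-at-a-time import process; the field names and rules below are invented for illustration. The idea is to keep the governance rules in one function, call it from the regular import, and reuse it in the separate one-time cleanup pass:

```python
# Sketch: one set of governance rules applied both to new data on import and
# to existing data in a one-time cleanup. All names here are illustrative.

def apply_governance(record: dict) -> dict:
    """Normalize a single record according to the current governance rules."""
    cleaned = dict(record)
    cleaned["email"] = cleaned.get("email", "").strip().lower()
    cleaned["country"] = cleaned.get("country", "").strip().upper() or "UNKNOWN"
    return cleaned

def import_new_records(incoming: list[dict]) -> list[dict]:
    """Regular import/scrub/validation path for new data entering the system."""
    return [apply_governance(record) for record in incoming]

def one_time_cleanup(existing: list[dict]) -> list[dict]:
    """Separate process that re-runs the same rules over data already in the system."""
    return [apply_governance(record) for record in existing]
```

Because both paths call the same function, the rules only have to change in one place, and the cleanup process can be retired once the backlog of existing data has been brought current.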

Which are the most important things to update?

  • Translation Tables
  • Stored Procedures
  • Database Views
  • Validation Lookups, Tables, and Rules
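To illustrate the first and last items, here’s a hypothetical sketch of a translation table paired with a validation lookup. In a real system these usually live in the database itself as lookup tables, stored procedures, or views; every value below is made up:

```python
# Illustrative translation table and validation lookup (all values hypothetical).

# Translation table: map the spellings that actually arrive to one canonical value.
STATE_TRANSLATION = {
    "calif.": "CA",
    "california": "CA",
    "ca": "CA",
    "n.y.": "NY",
    "new york": "NY",
}

# Validation lookup: the only values the canonical field is allowed to hold.
VALID_STATES = {"CA", "NY"}

def normalize_state(raw: str) -> str:
    """Translate a raw value to its canonical form, or fail validation loudly."""
    canonical = STATE_TRANSLATION.get(raw.strip().lower(), raw.strip().upper())
    if canonical not in VALID_STATES:
        raise ValueError(f"Unrecognized state value: {raw!r}")
    return canonical
```

Keeping the translations and the allowed values in their own tables means the rules can be updated without rewriting the processes that use them.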

GIGO – garbage in = garbage out. Rid your data of the garbage early and avoid a massive cleanup later. The C-suite will appreciate the more efficient projects and processes, too.

When Profiling Is A Good Thing

We all know the kind of profiling that is completely unacceptable and that’s not what I’m talking about here. I neither condone nor practice any kind of socially unacceptable profiling. But there IS one type of profiling that I strongly recommend: Data Profiling. Especially before you migrate your data.

If you think that sounds like a luxury you don’t have time to fit into your project’s schedule, consider this: Bloor Research conducted a study and found that data migration projects using data profiling best practices were significantly more likely (72% compared to 52%) to come in on time and on budget. That’s a big difference, and organizations realize plenty of other benefits when they use data profiling in their projects.

Data profiling enables better regulatory compliance, more efficient master data management, and better data governance. It all goes back to the old adage that “you have to measure what you want to manage.” Profiling your data is the first step in measuring how good its quality is before you migrate or integrate it, and it lets you monitor that quality throughout the life of the data. Data deteriorates at around 1.25-1.5% per month, which adds up to a lot of bad data over the course of a year or two. The lower your data quality, the lower your process and project efficiencies will be. No one wants that. Download the Bloor Research “Data Profiling – The Business Case” white paper to learn more about the results of this study.
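If you take the upper end of that range, a quick back-of-the-envelope calculation shows how the decay compounds. The 1.5% monthly figure is the one mentioned above; the rest is just arithmetic:

```python
# Rough compounding of data decay at ~1.5% per month.
monthly_decay = 0.015

for months in (12, 24):
    still_good = (1 - monthly_decay) ** months
    print(f"After {months} months, roughly {1 - still_good:.0%} of records have decayed")

# After 12 months, roughly 17% of records have decayed
# After 24 months, roughly 30% of records have decayed
```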


Three Most Likely Culprits for Data Quality Problems

Few things get better with time.  Without careful attention your data certainly won’t be one of them.

For most organizations, a love/hate relationship exists with their data.  We love that we can draw together information from various systems and use it to see a picture of how effective we’re being. We hate how difficult it is to move and maintain that information.

Recently, I’ve been working with a large organization that is making changes to its data integration infrastructure.  As part of the project, we’re reviewing how data moves into the organization and through multiple core business systems.  It’s been remarkable how many times the data is touched and the potential negative impact this whole cycle has on data quality.

Through identifying actual problem areas we’ve come across some now familiar culprits:

  1. Intake of information.  Often not all of the data gets loaded. The risk is that we don’t get everything, which reduces the quality of what we have.
  2. Cyclic miscommunication between systems.  Dependencies and the system strain associated with moving large amounts of data in and out result in periodically missing a transfer.  One process gets backed up or breaks and the delays snowball.
  3. Complexity of processes.  At some point in every process, business rules get inserted to make decisions on how and where the data belongs.  Knowledgeable IT staffers are asked to create complex processes that are very difficult to test.
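On that third culprit, one approach that helps is pulling each business rule out of the pipeline into a small, pure function that can be tested on its own. Here’s a minimal sketch, with an invented routing rule and invented field names:

```python
# Sketch: a routing decision expressed as a small, pure, testable function.
# The rule itself and the field names are invented for illustration.

def route_order(order: dict) -> str:
    """Decide which downstream queue an incoming record belongs in."""
    if order.get("country") != "US":
        return "international_queue"
    if order.get("amount", 0) >= 10_000:
        return "manual_review"
    return "standard_processing"

# Because the rule is a plain function, it can be exercised without running
# the whole integration process.
def test_route_order():
    assert route_order({"country": "US", "amount": 50}) == "standard_processing"
    assert route_order({"country": "US", "amount": 25_000}) == "manual_review"
    assert route_order({"country": "DE", "amount": 50}) == "international_queue"
```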

We see these same problem areas to varying degrees with most of our larger clients.  Data is certainly difficult to handle – that’s not a new idea.  But what is the collected result of this difficulty?

The quality of the information you use to run your business depreciates steadily over time. Given enough time and complexity, it will only decrease.

External factors can add fuel to the fire.  If some of this data is about people (and some of it surely is), then there’s a silent but significant change going on external to your organization.  People are constantly in flux – moving, changing jobs, getting married, etc. – all of these activities are bad for the information you house about them.  Even in a short amount of time you know less than you did initially.

What’s to be done? How do you earn top marks for clean data?


Ideally, the solution is to examine your data handling processes and look for problem areas.  Is your organization using the best tools to do the job?  While this is the best approach, it can be overwhelming.  At some point it just has to be done, and the longer you wait, the more difficult the mess will be to unravel.

At the other extreme, you can ignore the problem and treat the symptoms.  While this seems like a bad idea for the long haul (it is), it can be very cost-efficient and give the organization a significant lift.  Taking a look at the data where it’s being used and identifying missing or bad data is the first step.  Once you see the troubles, solutions become possible.
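Here’s a minimal sketch of that first step, assuming the report’s data lands in a pandas DataFrame; the column names and the blank-value check are placeholders for whatever “missing or bad” means in your system:

```python
# Quick symptom check: how much of the data a report relies on is missing or
# obviously bad? Column names and validity rules are placeholder assumptions.
import pandas as pd

def missing_and_bad(df: pd.DataFrame) -> pd.DataFrame:
    """Per-column counts of missing values and blank strings."""
    missing = df.isna().sum()
    blank = df.apply(lambda col: (col.astype(str).str.strip() == "").sum())
    return pd.DataFrame({"missing": missing, "blank": blank})

if __name__ == "__main__":
    customers = pd.DataFrame({
        "email": ["a@example.com", None, "   "],
        "postal_code": ["30303", "", None],
    })
    print(missing_and_bad(customers))
```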

Data is a corporate asset.  It requires maintenance and it depreciates over time.  Like everything else you do, recognizing the problem is the first step to a solution.