Three Most Likely Culprits for Data Quality Problems

Few things get better with time. Without careful attention, your data certainly won’t be one of them.

For most organizations, a love/hate relationship exists with their data.  We love that we can draw together information from various systems and use it to see a picture of how effective we’re being. We hate how difficult it is to move and maintain that information.

Recently, I’ve been working with a large organization that is making changes to its data integration infrastructure. As part of the project, we’re reviewing how data moves into the organization and through multiple core business systems. It’s been remarkable to see how many times the data is touched, and how much potential this whole cycle has to damage data quality.

In identifying actual problem areas, we’ve come across some now-familiar culprits:

  1. Intake of information. Often, not all of the data is loaded. The risk is that we don’t get everything, and the gaps reduce the quality of what we do have.
  2. Cyclic miscommunication between systems. Dependencies between systems, and the strain of moving large amounts of data in and out, mean that a transfer is periodically missed. One process gets backed up or breaks, and the delays snowball.
  3. Complexity of processes.  At some point in every process, business rules get inserted to make decisions on how and where the data belongs.  Knowledgeable IT staffers are asked to create complex processes that are very difficult to test.
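To make the first culprit concrete, here is a minimal sketch of an intake check that compares what a source sent against what actually landed. The function name `reconcile_intake`, the `IntakeReport` fields, and the customer keys are all hypothetical, chosen for illustration; a real feed would need its own notion of a record key.

```python
from dataclasses import dataclass


@dataclass
class IntakeReport:
    expected: int        # distinct records the source says it sent
    loaded: int          # of those, how many are present after the load
    missing_keys: list   # keys present in the source but absent after the load

    @property
    def complete(self) -> bool:
        # The intake is complete only when nothing the source sent is missing.
        return not self.missing_keys and self.expected == self.loaded


def reconcile_intake(source_keys, loaded_keys) -> IntakeReport:
    """Compare the keys sent by a source against the keys that were loaded."""
    source = set(source_keys)
    loaded = set(loaded_keys)
    return IntakeReport(
        expected=len(source),
        loaded=len(loaded & source),
        missing_keys=sorted(source - loaded),
    )


# Hypothetical example: four customers sent, one never arrived.
report = reconcile_intake(
    source_keys=["C001", "C002", "C003", "C004"],
    loaded_keys=["C001", "C002", "C004"],
)
print(report.complete, report.missing_keys)
```

Run after every load, a check along these lines turns "we don’t get everything" from a suspicion into a list of specific records to chase.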

We see these same problem areas, to varying degrees, with most of our larger clients. Data is certainly difficult to handle – that’s not a new idea. But what is the cumulative result of this difficulty?

The quality of the information you use to run your business depreciates steadily. Given enough time and complexity, it will decrease unless someone actively maintains it.

External factors can add fuel to the fire. If some of this data is about people (and some of it surely is), then there’s a silent but significant change going on outside your organization. People are constantly in flux: moving, changing jobs, getting married, and so on. All of these activities are bad for the information you house about them. Even after a short amount of time, you know less than you did initially.

What’s to be done? How do you earn top marks for clean data?

A+-Grade

Ideally, the solution is to examine your data-handling processes and look for problem areas. Is your organization using the best tools to do the job? While this is the best approach, it can be overwhelming. Still, at some point it just has to be done, and the longer you wait, the more difficult the mess will be to unravel.

At the other extreme, you can ignore the underlying problem and treat the symptoms. While this seems like a bad idea for the long haul (it is), it can be very cost-effective and give the organization a significant lift. The first step is to look at the data where it’s being used and identify what is missing or bad. Once you can see the trouble spots, solutions become possible.
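As a rough illustration of that first step, the sketch below profiles a handful of records for missing values and malformed email addresses. The field names, the sample records, and the deliberately loose email pattern are assumptions made for the example, not a recommendation for any particular system.

```python
import re

# A deliberately loose pattern: something@something.something, no spaces.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def profile_records(records, required_fields):
    """Count missing values per field and collect rows with a malformed email."""
    missing = {field: 0 for field in required_fields}
    bad_email_rows = []
    for index, record in enumerate(records):
        for field in required_fields:
            value = record.get(field)
            # Treat None, empty strings, and whitespace-only strings as missing.
            if value is None or str(value).strip() == "":
                missing[field] += 1
        email = record.get("email")
        # Only judge the format of emails that are actually present.
        if email and not EMAIL_PATTERN.match(email.strip()):
            bad_email_rows.append(index)
    return missing, bad_email_rows


# Hypothetical sample data.
records = [
    {"name": "Ana Ruiz", "email": "ana@example.com", "postal_code": "30301"},
    {"name": "Ben Cole", "email": "ben.cole(at)example", "postal_code": ""},
    {"name": "", "email": None, "postal_code": "98101"},
]

missing, bad_email_rows = profile_records(records, ["name", "email", "postal_code"])
print(missing)
print(bad_email_rows)
```

Even a simple tally like this, run against the data where the business actually consumes it, shows which fields are decaying fastest and where cleanup effort is likely to pay off first.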

Data is a corporate asset. It requires maintenance, and it depreciates over time. As with everything else you do, recognizing the problem is the first step to a solution.

One thought on “Three Most Likely Culprits for Data Quality Problems”

  1. Good insight, Marc. Especially like the comment about business rules getting inserted into the data processes. Wouldn’t it be great if those business needs could be identified before the integration happens?!
