The Cost of Poor Data Management

It is surprising that data quality is still viewed as a luxury rather than a necessity. As an unapologetic data quality advocate, I’ve written white papers and blog posts about the value of good data management. It takes the efforts of many to change habits. In her blog post, The Costs of Poor Data Management, on the Data Integration Blog, Julie Hunt breaks down the impact data quality has on business.

Here’s an infographic on what poor data quality can cost a business.

Infographic: Global research - Bad customer data costs you millions

She points out that the areas of data quality deserving the greatest focus are specific to each organization. If you read my post, “Avoid Data Quality Pitfalls,” you know that I’m a proponent of good data governance. Update early and often. My top four suggestions are:

  • Translation Tables
  • Stored Procedures
  • Database Views
  • Validation Lookups, Tables, and Rules

What are yours? Read Julie’s post, and send me your comments.


Pitney Bowes Spectrum: Future-Proofing MDM, by Julie Hunt

“Data is the most valuable asset of any business and is the foundation for building lifetime customer relationships.” That means the accuracy of that data is mission critical to building strong, healthy relationships with customers. Julie Hunt’s blog post on Hub Designs Magazine, “Pitney Bowes Spectrum: Future-Proofing MDM,” provides keen insight into how a 93-year-old company uses master data management to innovate for the future.

 

Hub Designs Magazine

A briefing by Pitney Bowes Software for the Hub Designs MDM Think Tank


Avoid Data Quality Pitfalls

If you haven’t experienced the frustration of trying to wade through duplicate and incorrect data, you’re one of the very few. Dirty data clogs up our databases and integration projects and creates obstacles to getting the information we need from the data. It can be like trying to paddle through a sea of junk.

The value of our data lies in accurate reporting and business intelligence that enable good business decisions. Good data governance is critical to a successful business as well as to meeting compliance requirements.


So how do we avoid the pitfalls of poor data quality?

Perform quality assurance activities for each step of the process. Data quality results from frequent and ongoing efforts to reduce duplication and update information. If that sounds like a daunting task, remember that using the right tools can save substantial time and money, as well as create better results.
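
A simple recurring check can go a long way here. Below is a minimal sketch of a duplicate check in SQL; the contact table and its customer_id and email columns are hypothetical, so adjust the names to your own schema.

    -- Minimal duplicate check (illustrative schema): group contacts by a
    -- normalized email address and report any address that appears more than once.
    SELECT LOWER(TRIM(email)) AS normalized_email,
           COUNT(*)           AS record_count,
           MIN(customer_id)   AS candidate_survivor_id
    FROM   contact
    GROUP  BY LOWER(TRIM(email))
    HAVING COUNT(*) > 1
    ORDER  BY record_count DESC;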

Take the time to set clear and consistent rules for setting up your data. If you inherited a database, then you can still update the governance to improve your data quality.

How to update data governance?

Recommendation: Updating data governance will almost always require new code segments being added to existing data import/scrub/validation processes.  A side effect of adding new code segments is a “cleanup.”  When code is updated to promote data governance, it is usually only applied to new data entering the system.  What about the data that was in the system prior to the new data governance code?  We want all the new data governance rules to hit new data as well as existing data.  You’ll need to build the new code segments into separate processes for (hopefully) a one-time cleanup of the existing data.  Applying the updated data governance code in conjunction with executing the “cleanup” will bring data governance current, update existing data, and maintain a uniform dataset.
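
To make that concrete, here is a hedged sketch of what a one-time cleanup pass might look like in SQL. The customer table and state_translation table (mapping raw source values to governed codes) are assumptions for the example, and the UPDATE ... FROM syntax shown is PostgreSQL/SQL Server style.

    -- One-time cleanup: apply the same translation rules used for incoming data
    -- to records that were already in the system.
    UPDATE customer
    SET    state_code = t.standard_code
    FROM   state_translation t
    WHERE  TRIM(UPPER(customer.state_code)) = t.raw_value
      AND  customer.state_code <> t.standard_code;

    -- Anything the rules still cannot resolve gets flagged for manual review.
    SELECT c.customer_id, c.state_code
    FROM   customer c
    LEFT JOIN state_translation t
           ON TRIM(UPPER(c.state_code)) = t.raw_value
    WHERE  t.raw_value IS NULL;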

Which are the most important things to update?

  • Translation Tables
  • Stored Procedures
  • Database Views
  • Validation Lookups, Tables, and Rules
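
As an illustration of the first and last items, a translation table plus a lookup-backed validation rule might look like this in SQL; all of the table and column names here are made up for the example.

    -- Translation table: maps raw source-system values to governed standard codes.
    CREATE TABLE country_translation (
        raw_value     VARCHAR(100) PRIMARY KEY,  -- value as it arrives from a source system
        standard_code CHAR(2)      NOT NULL      -- governed two-letter code
    );

    -- Validation lookup: the set of codes that are allowed to exist at all.
    CREATE TABLE valid_country (
        country_code CHAR(2)       PRIMARY KEY,
        country_name VARCHAR(100)  NOT NULL
    );

    -- The foreign key is the validation rule: rows with unknown codes are rejected.
    CREATE TABLE customer_address (
        address_id   INT           PRIMARY KEY,
        country_code CHAR(2)       NOT NULL REFERENCES valid_country (country_code)
    );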

GIGO – garbage in = garbage out. Rid your data of the garbage early and avoid a massive cleanup later. The C-suite will appreciate the more efficient projects and processes, too.

When Profiling Is A Good Thing

We all know the kind of profiling that is completely unacceptable and that’s not what I’m talking about here. I neither condone nor practice any kind of socially unacceptable profiling. But there IS one type of profiling that I strongly recommend: Data Profiling. Especially before you migrate your data.

If you think that sounds like a luxury you don’t have the time to fit into your project’s schedule, consider this: Bloor Research conducted a study and found that data migration projects that used data profiling best practices were significantly more likely (72% compared to 52%) to come in on time and on budget. That’s a big difference, and there are a lot more benefits organizations realize when they use data profiling in their projects.

Data Profiling enables better regulatory compliance, more efficient master data management, and better data governance. It all goes back to the old adage that “you have to measure what you want to manage.” Profiling data is the first step in measuring the quality of your data before you migrate or integrate it, and it lets you keep monitoring that quality throughout the life of the data. Data deteriorates at around 1.25-1.5% per month. That adds up to a lot of bad data over the course of a year or two. The lower your data quality is, the lower your process and project efficiencies will be. No one wants that. Download the Bloor Research “Data Profiling – The Business Case” white paper and learn more about the results of this study.

White Paper Download
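
If you want a feel for what a first profiling pass looks like, the sketch below measures completeness and duplication for a couple of key fields in SQL. The contact table and its email and phone columns are assumptions for the sake of the example.

    -- Baseline profile before a migration: row count, missing values, and
    -- duplicate email addresses (case- and whitespace-insensitive).
    SELECT COUNT(*)                                                       AS total_rows,
           SUM(CASE WHEN email IS NULL OR email = '' THEN 1 ELSE 0 END)   AS missing_email,
           SUM(CASE WHEN phone IS NULL OR phone = '' THEN 1 ELSE 0 END)   AS missing_phone,
           COUNT(email) - COUNT(DISTINCT LOWER(TRIM(email)))              AS duplicate_emails
    FROM   contact;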

Pervasive Migration Utility Demonstration – v9 to v10

Pervasive has recently developed an effective utility for migrating Data Integrator v9 projects into Pervasive Data Integrator v10. The process is quick and relatively smooth; however, challenges can arise due to the complex nature of most DI projects.  If you are thinking about transitioning from v9 to v10, please reach out to Emprise to learn how our team of Certified Pervasive Developers can help make your transition to v10 a success.

Emprise Sponsoring Pervasive IntegrationWorld 2013

Emprise Technologies is proud to be a Platinum sponsor of Pervasive IntegrationWorld 2013.  We are also sponsoring the Data Clinic. If you are going to be at IntegrationWorld, come by the Data Clinic and ask one of our Pervasive-certified consultants your questions about Data Integrator. Bring your toughest data integration questions: the Emprise team has a collective 30,000 hours of Pervasive work, so we doubt you’ll be able to stump us. But we’re happy to let you try! See you at IntegrationWorld 2013. We’ll be in the Hyatt Hill Country Ballroom A-C from 10:15 a.m. until 4:00 p.m. on Monday, April 15, and again on Tuesday, April 16, from 9:20 a.m. to 12:00 p.m.

Introduction to Pervasive Data Integrator

Overview

Pervasive Data Integrator can be a powerful tool, enabling multiple connections between a wide variety of systems and data points. The Repository Explorer is the starting point for every Data Integrator project.  To get started on your first project, you must understand how to configure your Workspaces and Repositories through Repository Explorer.

What is the Repository Explorer?

The Repository Explorer is the starting point for all Pervasive Data Integrator projects.  From this one application, developers can navigate to projects, open existing Pervasive Data Integrator elements (Maps, Processes, Structured Schemas, etc), or create new instances of those elements.

Who uses the Repository Explorer?

The Repository Explorer is used almost exclusively by developers, but it can also be used by quality assurance resources to access and review code that has already been developed.

How to Configure Repository Explorer?

Opening Repository Explorer

Once installed, Repository Explorer can be accessed like any other program on Windows.

  1. Open the Start menu
  2. Select ‘All Programs’
  3. Select Pervasive folder
  4. Select Data Integrator 9 folder
  5. Select Repository Explorer 9 program

Setting up a Workspace

The Repository Explorer organizes files using two methods.  The first method is via a Workspace.  A Workspace is a collection of one or many repositories and a single Macrodef.xml file that is specific to the Workspace.  (Note: For further information on the Macrodef.xml file, check out our two videos on Macro Definition Variables.)  At Emprise, our best practice is to create one Workspace for each project.  This allows us to specify a unique Macrodef file for each project.

To create a new Workspace, one just has to follow a few simple steps.

  1. Select File from the menu bar
  2. Select Manage Workspaces…
  3. Click the drop-down to the right of Workspaces Root Directory and navigate to the location where you would like to save your Workspace.  Hit OK.  The Workspaces Root Directory is the location where the Workspace folder will be created.  Inside this folder a set of mandatory, default files and folders will be created.
    1. Xmldb – This is the folder used for the default repository when creating a new Workspace.
    2. Fileref.xml – A list of file references used by the workspace.
    3. Macrodef.xml – The macrodef file.  For further information see our video.
    4. Repositories.xml – A list of all repositories for the Workspace.  This file is rarely changed after initial setup.
  4. Select Add
  5. To add an existing workspace, check the box for the proper workspace.
  6. If adding a new Workspace, which will often be the case, click the “Create New Workspace” button.  You will be prompted for a name.  Give your Workspace a descriptive name and then click “OK” to return to the previous screen.
  7. Click OK
  8. Find the Workspace you just added, click the name to highlight it, then click the Set Current button on the right.  This activates the Workspace, allowing you access to its repositories.

*Note: You can also right-click on the white space to the left of the screen that displays your current repository and select the ‘Manage Workspaces’ option from there.  Also, when creating a new Workspace using the “Create New Workspace” option, the Macrodef.xml file from the current Workspace will be copied into the directory of the new Workspace.  This includes the Macro names and values.  This is helpful when standard Macros are used, but it is something to pay attention to with regard to directory paths and file names.

Setting up a Repository

Repositories are directory paths pointing to where the Pervasive DI project files will be located.  A Workspace can have any number of Repositories, which are displayed in a tree view on the left side of the Repository Explorer.  Only Pervasive files located in one of the Repositories for the Workspace can be opened and edited from the Repository Explorer.

  1. If you are not currently working in the workspace within which you want to create a repository, navigate to that workspace using File -> Manage Workspaces or right click on the white space on the left that displays the file directory structure and select Manage Workspaces. Then, click the text of the Workspace you would like to use to highlight it before clicking the Set Current button on the right.
  2. Select File from the menu bar
  3. Select Manage Repositories…
  4. Click the Add… button to create a new repository. At this point you can either navigate to the folder you would like to select as the Repository or paste in the file path as copied from Windows Explorer.

Note: You can also right click on the white space to the left of the screen that displays your current repository and select the ‘Manage Repositories’ option from there.

General Notes

  1. Use a standard naming convention for Workspaces.  This makes it easy to identify which project a Workspace belongs to.
  2. Use a standard directory structure and path for the Repository folder.  This prevents issues that may arise when multiple developers work on the same code base.  Pervasive Data Integrator projects use a series of pointers within the files, and standardizing the repository paths prevents these pointers from being corrupted when moving code between developers.
  3. After creating a new Workspace it is prudent to open the Macrodef.xml file, remove unneeded Macros, and change others to match your new project.  Please see our video on the MacroDef for further information.