Secret #2 to Maximize Pervasive Data Integrator

When using an Excel document as a source, the header row is used to determine the names of the fields for the source connector. In order to ensure the field names will be consistent, one can insert a row into the beginning of each document before it is processed. For example, today our client sent us an Excel document that contained the following header in column A: “Account Number”, but yesterday, the value in column A was: “Acct Num”.

Dynamically inserting a static header row into an Excel document allows for the processing of Excel documents regardless of whether or not they contain a consistent header row from the client.

This should be done when you are asked to process an Excel document that is missing a header row or does not have a consistent header row.

The use of consistent column headers is beneficial to the:

  • Developer – Implements the code to add a header row to the Excel document prior to processing.
  • End User – Is able to review and utilize the new data loaded into the system.

Inserting a header row into an Excel document is implemented using a RIFL step within Pervasive Data Integrator Process Designer.

Before the Excel document is processed, use a RIFL step to open the document and insert a static header row at the beginning of the document that matches the column names identified in the Map’s source schema. If the file may come in with or without a header, you can add a Source Filter to your Map that processes the Excel document. The filter can validate each row to filter out any extra or unwanted header rows and rows that only contain whitespace, which will allow you to process the data in the document successfully.
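The approach above can be sketched outside of RIFL. The following Python snippet is only an illustration and does not use the Pervasive API: it treats the worksheet as a plain list of rows, and the names STATIC_HEADER, KNOWN_HEADER_VARIANTS, and normalize_rows are hypothetical, as are the columns after column A. It prepends the static header and then applies the kind of rule a Source Filter might use, skipping stray header rows and rows that contain only whitespace.

```python
# Illustrative sketch only: rows are plain lists of cell values, not a real
# Excel or Pervasive object. All names and sample values are hypothetical.

# Static header matching the Map's source schema (columns B and C are assumed).
STATIC_HEADER = ["Account Number", "Customer Name", "Balance"]

# Header spellings the client has sent in column A, compared in lower case.
KNOWN_HEADER_VARIANTS = {"account number", "acct num"}


def is_blank(row):
    """True when every cell is empty, None, or whitespace only."""
    return all(str(cell if cell is not None else "").strip() == "" for cell in row)


def is_header(row):
    """True when column A holds one of the known header spellings."""
    return bool(row) and str(row[0]).strip().lower() in KNOWN_HEADER_VARIANTS


def normalize_rows(rows):
    """Return the static header followed by the data rows only."""
    data = [row for row in rows if not is_blank(row) and not is_header(row)]
    return [STATIC_HEADER] + data


incoming = [
    ["Acct Num", "Name", "Bal"],     # inconsistent header from the client
    ["1001", "Ada Smith", "25.00"],
    ["   ", "", None],               # whitespace-only row
    ["1002", "Bo Jones", "13.50"],
]

for row in normalize_rows(incoming):
    print(row)
```

Because the header is always inserted and any incoming header is always filtered out, the same logic works whether or not the client's file arrives with a header row.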

Subscribe to the Emprise Technologies YouTube channel to access our library of video demos.

Secret #1 to Maximize Pervasive Data Integrator

In most cases, Pervasive Data Integrator users want to react to and handle errors as they occur. By default, Pervasive will exit a Process as soon as an error is encountered. Alternatively, a Process can be configured to allow errors, which will allow the Process to continue executing through completion regardless of an error occurring. What we have found is that neither option is practical with Pervasive Data Integrator.

Responsive error handling can be introduced to ensure that the business needs are met and all errors are handled appropriately as they occur. For example, if a step in a Process cannot find a file, the Process should not abort, nor should it continue execution as if the file exists. Instead, use responsive error handling to manage the missing file exception and send an email to the appropriate recipients.

With Pervasive Data Integrator, custom error handling should be used to ensure that a Process is behaving as intended and as defined by business rules. By introducing responsive error handling, you can gather and react to errors as they occur, versus digging through error logs to identify why a Process aborted.

Data Integrator Processes that contain steps which depend upon the success of a previous step are ideal candidates for implementing responsive error handling.

The use of responsive error handling is beneficial to the:

  • Developer – Implements the code for responsive error handling.
  • End User – Defines business rules and reviews results of responsive error handling.

Responsive error handling is implemented within Pervasive Data Integrator Process Designer.

During the execution of a Process, Pervasive stores metadata about the Process and its steps and session objects. The metadata can be accessed via RIFL Script in both RIFL steps and Event Handlers. One approach is to create a RIFL Script step immediately after each step that ignores an error. This ensures that errors are caught immediately as they occur and can be handled appropriately. To implement responsive error handling, the Process must not be configured to “Break on First Error”. Additionally, all steps should be set to “Ignore Error”. Then, insert RIFL steps to handle errors appropriately.
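The pattern described above (let each step ignore its error, then inspect the result immediately afterwards) can be sketched in plain Python. This is only an illustration and is not RIFL or the Pervasive API: run_step, run_process, read_source_file, the step names, and the file path are all hypothetical, and the notify callback stands in for sending an email.

```python
# Illustrative sketch only: plain Python, not RIFL or the Pervasive API.
# Step names, the file path, and the notify callback are hypothetical.
import os


def run_step(action):
    """Run one step with its error 'ignored': capture it instead of aborting."""
    try:
        action()
        return None
    except Exception as exc:
        return exc


def read_source_file(path):
    """Stand-in for a step that needs a file to exist."""
    if not os.path.exists(path):
        raise FileNotFoundError(path)


def run_process(steps, notify):
    """Run steps in order; after each one, check for an error and react."""
    completed = []
    for name, action in steps:
        error = run_step(action)
        if error is not None:
            # Plays the role of the script step placed right after each step.
            notify(f"Step '{name}' failed: {type(error).__name__}: {error}")
            break  # later steps depend on this one, so stop here
        completed.append(name)
    return completed


messages = []
steps = [
    ("find file", lambda: read_source_file("/no/such/dir/input.xlsx")),
    ("load data", lambda: None),
]
done = run_process(steps, messages.append)
print(done, messages)
```

The process neither aborts without explanation nor carries on as if the file existed: the failure is reported at the step where it happened, and dependent steps are skipped.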


David Linthicum’s Data Integration Predictions for 2013

David Linthicum recently made 3 Data Integration Predictions for 2013 in a post on the Pervasive Data Integration blog. As a CEO whose business it is to help IT organizations get the most out of their data, I concur with Linthicum’s predictions.

Whether you’re tired of hearing about “Big Data” or not, it’s here to stay. And the data will only get bigger and more complex. That means companies have to create business processes as well as IT processes that enable them to manage and integrate that data as it grows. Otherwise, far too much of the organization’s resources will be spent on trying to manage the unmanageable and not on its core business.

With government requirements for healthcare organizations to convert their data from flat files to 837 EDI files and corporations looking to increase their “Business Intelligence” (BI) for better decision-making, the Cloud will continue to drive IT teams to integrate on-premise data with the Cloud. We see many customers moving to a hybrid combination of Cloud and on-premise. As Linthicum says, “It’s a certainty that data integration will become more important as the years progress.”


Pervasive Data Integrator – Macro Definition Variables, Part 2

Continued from Part 1. Learn more about how to use macro definition variables, along with tips & tricks, when using Pervasive Data Integrator. Emprise Technologies has created a 2-part video demo for developers using DI. Don’t forget to subscribe to our Emprise Technologies YouTube channel to watch more informational videos from us.

Pervasive Data Integrator – Using Macro Definition Variables, Video Demo Part 1

Want to learn more about how to use macro definition variables in Pervasive Data Integrator? Emprise Technologies has created a 2-part video demo for developers using DI. This video is part 1 of the demo for using macro definition variables. Don’t forget to subscribe to our Emprise Technologies YouTube channel to receive more informational videos from us. We’ll post the 2nd video shortly.

Three Most Likely Culprits for Data Quality Problems

Few things get better with time. Without careful attention, your data certainly won’t be one of them.

For most organizations, a love/hate relationship exists with their data.  We love that we can draw together information from various systems and use it to see a picture of how effective we’re being. We hate how difficult it is to move and maintain that information.

Recently, I’ve been working with a large organization that is making changes to its data integration infrastructure.  As part of the project, we’re reviewing how data moves into the organization and through multiple core business systems.  It’s been remarkable how many times the data is touched and the potential negative impact this whole cycle has on data quality.

Through identifying actual problem areas we’ve come across some now familiar culprits:

  1. Intake of information.  Often all the data isn’t loaded. The risk is that we don’t get everything, and it reduces the quality of what we have.
  2. Cyclic miscommunication between systems.  Dependencies and the system strain associated with moving large amounts of data in and out result in periodically missing a transfer.  One process gets backed up or breaks and the delays snowball.
  3. Complexity of processes.  At some point in every process, business rules get inserted to make decisions on how and where the data belongs.  Knowledgeable IT staffers are asked to create complex processes that are very difficult to test.

We see these same problem areas to varying degrees with most of our larger clients.  Data is certainly difficult to handle – that’s not a new idea.  But what is the collected result of this difficulty?

The quality of the information you use to run your business depreciates steadily over time. Given time and complexity, the quality of your data will decrease.

External factors can add fuel to the fire.  If some of this data is about people (and some of it surely is), then there’s a silent but significant change going on external to your organization.  People are constantly in flux – moving, changing jobs, getting married, etc. – all of these activities are bad for the information you house about them.  Even in a short amount of time you know less than you did initially.

What’s to be done? How do you earn top marks for clean data?


Ideally, the solution is to examine your data handling processes and look for problem areas. Is your organization using the best tools to do the job? While this is the best approach, it can be overwhelming. At some point this just has to be done, and the longer you wait, the more difficult the mess will be to unravel.

At the other extreme, you can ignore the underlying problem and treat the symptoms. While this seems like a bad idea for the long haul (it is), it can be very cost efficient and give the organization a significant lift. Taking a look at the data where it’s being used and identifying missing or bad data is the first step. Once you see the problems, solutions become possible.
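As a rough illustration of that first step, a small script can count how often each field is missing in the data as it is actually used. This is a minimal Python sketch under stated assumptions: the records are plain dictionaries, and the field names and sample values are made up for the example.

```python
# Illustrative sketch only: records are plain dicts; the field names and
# sample values are made up for the example.
from collections import Counter


def count_missing(records, fields):
    """Count, per field, how many records have no usable value."""
    missing = Counter()
    for record in records:
        for field in fields:
            value = record.get(field)
            # Treat absent, None, empty, and whitespace-only values as missing.
            if value is None or str(value).strip() == "":
                missing[field] += 1
    return dict(missing)


records = [
    {"name": "Ada", "email": "ada@example.com", "phone": ""},
    {"name": "Bo", "email": None, "phone": "555-0100"},
    {"name": "Cy", "email": "  ", "phone": "555-0101"},
]

print(count_missing(records, ["name", "email", "phone"]))
```

Fields that never appear in the result had no missing values; the ones with the highest counts are a reasonable place to start looking.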

Data is a corporate asset.  It requires maintenance and it depreciates over time.  Like everything else you do, recognizing the problem is the first step to a solution.