The Big Data & Integration Summit was a success, and our presentations are now available for public viewing. http://ow.ly/q64hz
Pervasive Data Integrator can be a powerful tool, enabling multiple connections between a wide variety of systems and data points. The Repository Explorer is the starting point for every Data Integrator project. To get started on your first project, you must understand how to configure your Workspaces and Repositories through Repository Explorer.
What is the Repository Explorer?
The Repository Explorer is the starting point for all Pervasive Data Integrator projects. From this one application, developers can navigate to projects, open existing Pervasive Data Integrator elements (Maps, Processes, Structured Schemas, etc.), or create new instances of those elements.
Who uses the Repository Explorer?
The Repository Explorer is used almost exclusively by developers, but it can also be used by quality assurance resources to access and review code that has already been developed.
How to Configure Repository Explorer?
Opening Repository Explorer
Once installed, Repository Explorer can be accessed like any other program on Windows.
- Open the Start menu
- Select ‘All Programs’
- Select the Pervasive folder
- Select the Data Integrator 9 folder
- Select the Repository Explorer 9 program
Setting up a Workspace
The Repository Explorer organizes files using two methods. The first method is via a Workspace. A Workspace is a collection of one or more repositories and a single Macrodef.xml file that is specific to the Workspace. (Note: For further information on the Macrodef.xml file, check out our two videos on Macro Definition Variables.) At Emprise, our best practice is to create one Workspace for each project. This allows us to specify a unique Macrodef file for each project.
To create a new Workspace, one just has to follow a few simple steps.
- Select File from the menu bar
- Select Manage Workspaces…
- Click the drop down to the right of Workspaces Root Directory and navigate to the location you would like to save your workspace in. Hit OK. The Workspace Root Directory is the location where the Workspace folder will be created. Inside of this folder a set of mandatory, default files and folders will be created.
- Xmldb – This is the folder used for the default repository when creating a new Workspace.
- Fileref.xml – A list of file references used by the workspace.
- Macrodef.xml – The macrodef file. For further information see our video.
- Repositores.xml – A list of all repositories for the Workspace. This file is rarely changed after initial setup.
- Select Add
- To add an existing workspace, check the box for the proper workspace.
- If adding a new Workspace, which will often be the case, click the “Create New Workspace” button. You will be prompted for a name. Give your workspace a descriptive name and then click “OK” to return to the previous screen.
- Click OK
- Find the Workspace you just added, click the name to highlight it, then click the Set Current button on the right. This activates the Workspace, allowing you access to its repositories.
Note: You can also right click on the white space to the left of the screen that displays your current repository and select the ‘Manage Workspaces’ option from there. Also, when creating a new Workspace using the “Create New Workspace” option, the Macrodef.xml file from the current Workspace will be copied into the directory of the new Workspace. This includes the Macro names and values. This is helpful when standard Macros are used, but pay attention to any directory paths and file names in the copied values, since they will still point to the previous project.
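As a quick sanity check after creating a Workspace, the default contents described above can be verified with a short script. This is a minimal sketch in Python, not part of Pervasive Data Integrator; the function name is hypothetical, and the entry names are copied as printed in this article, so adjust them if your installation differs.

```python
from pathlib import Path

# Entries this article lists as created by default in a new Workspace folder.
# Spelling is copied from the article; adjust if your installation differs.
EXPECTED_ENTRIES = ["Xmldb", "Fileref.xml", "Macrodef.xml", "Repositores.xml"]


def missing_workspace_entries(workspace_dir):
    """Return the expected entries that are absent from workspace_dir."""
    root = Path(workspace_dir)
    # Compare case-insensitively, since Windows file names ignore case.
    present = {p.name.lower() for p in root.iterdir()} if root.is_dir() else set()
    return [name for name in EXPECTED_ENTRIES if name.lower() not in present]
```

An empty list means every expected entry was found; anything returned is worth checking before you start adding Repositories.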
Setting up a Repository
Repositories are directory paths pointing to where the Pervasive DI project files will be located. A Workspace can have any number of Repositories, which are displayed in a tree view on the left side of the Repository Explorer. Only Pervasive files located in one of the Repositories for the Workspace can be opened and edited from the Repository Explorer.
- If you are not currently working in the workspace within which you want to create a repository, navigate to that workspace using File -> Manage Workspaces or right click on the white space on the left that displays the file directory structure and select Manage Workspaces. Then, click the text of the Workspace you would like to use to highlight it before clicking the Set Current button on the right.
- Select File from the menu bar
- Select Manage Repositories…
- Click the Add… button to create a new repository. At this point you can either navigate to the folder you would like to select as the Repository or paste in the file path as copied from Windows Explorer.
Note: You can also right click on the white space to the left of the screen that displays your current repository and select the ‘Manage Repositories’ option from there.
- Use a standard naming convention for the Workspaces. This allows for easy identification of what project the Workspace is for.
- Use a standard data structure and directory path for the Repository folder. This prevents issues that may arise when multiple developers work on the same code base. Pervasive Data Integrator projects use a series of pointers within the files, and by standardizing the repository paths you prevent these pointers from being corrupted when moving code between developers.
- After creating a new Workspace, it is prudent to open the Macrodef.xml file, remove unneeded Macros, and change the others to match your new project. Please see our video on the Macrodef file for further information.
When using an Excel document as a source, the header row is used to determine the names of the fields for the source connector. In order to ensure the field names will be consistent, one can insert a row into the beginning of each document before it is processed. For example, today our client sent us an Excel document that contained the following header in column A: “Account Number”, but yesterday, the value in column A was: “Acct Num”.
Dynamically inserting a static header row into an Excel document allows for the processing of Excel documents regardless of whether or not they contain a consistent header row from the client.
This should be done when you are asked to process an Excel document that is missing a header row or does not have a consistent header row.
The use of consistent column headers is beneficial to the:
- Developer – Implements the code to add a header row to the Excel document prior to processing.
- End User – Is able to review and utilize the new data loaded into the system.
Inserting a header row into an Excel document is implemented using a RIFL step within Pervasive Data Integrator Process Designer.
Before the Excel document is processed, use a RIFL step to open the document and insert a static header row at the beginning of the document that matches the column names identified in the Map’s source schema. If the file may come in with or without a header, you can add a Source Filter to your Map that processes the Excel document. The filter can validate each row to filter out any extra or unwanted header rows and rows that only contain whitespace, which will allow you to process the data in the document successfully.
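The logic of that approach can be sketched outside the tool. The example below is Python rather than RIFL and works on rows already read into lists, so it illustrates the idea only; it is not the Process Designer implementation. The field names other than “Account Number” and “Acct Num”, and all function names, are hypothetical.

```python
# Static header matching the Map's source schema (hypothetical field names).
STATIC_HEADER = ["Account Number", "Customer Name", "Balance"]

# First-cell values that mark a client-supplied header row.
# The two values come from the example in this article.
KNOWN_HEADER_CELLS = {"account number", "acct num"}


def is_blank(row):
    """True when every cell is empty, None, or whitespace only."""
    return all(cell is None or str(cell).strip() == "" for cell in row)


def is_header(row):
    """True when the first cell looks like a known header label."""
    if not row or row[0] is None:
        return False
    return str(row[0]).strip().lower() in KNOWN_HEADER_CELLS


def normalize(rows):
    """Prepend the static header, then drop client headers and blank rows."""
    data = [row for row in rows if not is_blank(row) and not is_header(row)]
    return [STATIC_HEADER] + data
```

Whether the client file arrives with “Acct Num”, “Account Number”, or no header at all, the result starts with the same header row, which is the property the Map's source connector relies on.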
Subscribe to the Emprise Technologies YouTube channel to access our library of video demos.
In most cases, Pervasive Data Integrator users want to react to and handle errors as they occur. By default, Pervasive will exit a Process as soon as an error is encountered. Alternatively, a Process can be configured to allow errors, which will allow the Process to continue executing through completion regardless of an error occurring. What we have found is that neither option is practical with Pervasive Data Integrator.
Responsive error handling can be introduced to ensure that the business needs are met and all errors are handled appropriately as they occur. For example, if a step in a Process cannot find a file, the Process should not abort, nor should it continue execution as if the file exists. Instead, use responsive error handling to manage the missing file exception and send an email to the appropriate recipients.
With Pervasive Data Integrator, custom error handling should be used to ensure that a Process is behaving as intended and as defined by business rules. By introducing responsive error handling, you can gather and react to errors as they occur, versus digging through error logs to identify why a Process aborted.
Data Integrator Processes that contain steps that depend on the success of a previous step are ideal candidates for implementing responsive error handling.
The use of responsive error handling is beneficial to the:
- Developer – Implements the code for responsive error handling.
- End User – Defines business rules and reviews results of responsive error handling.
Responsive error handling is implemented within Pervasive Data Integrator Process Designer.
During the execution of a Process, Pervasive stores metadata about the Process and its steps and session objects. The metadata can be accessed via RIFL Script in both RIFL steps and Event Handlers. One approach is to create a RIFL Script step immediately after each step that ignores an error. This ensures that errors are caught immediately as they occur and can be handled appropriately. To implement responsive error handling, the Process must not be configured to “Break on First Error”. Additionally, all steps should be set to “Ignore Error”. Then, insert RIFL steps to handle errors appropriately.
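The pattern described above (ignore the error on each step, then check for it immediately afterward) can be sketched in Python. This is an analogy for the control flow, not RIFL and not the Pervasive API; every name in it is hypothetical, and the `notify` callback stands in for whatever response the business rules call for, such as sending an email.

```python
class StepResult:
    """Outcome of one step: its name and the error it raised, if any."""

    def __init__(self, name, error=None):
        self.name = name
        self.error = error


def run_step(name, action):
    """Run a step, capturing its error instead of aborting."""
    try:
        action()
        return StepResult(name)
    except Exception as exc:  # stands in for a step set to "Ignore Error"
        return StepResult(name, exc)


def run_process(steps, notify):
    """Run steps in order and check each result right after the step."""
    for name, action in steps:
        result = run_step(name, action)
        # This check plays the role of the RIFL step placed after each step.
        if result.error is not None:
            notify(f"Step '{name}' failed: {result.error}")
            return False  # stop here: later steps depend on this one
    return True
```

In this sketch a missing source file neither aborts silently nor lets later steps run against data that does not exist; the failure is reported at the step where it happened.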
Continued from Part 1. Learn more about how to use macro definition variables, along with tips and tricks for using Pervasive Data Integrator. Emprise Technologies has created a two-part video demo for developers using DI. Don’t forget to subscribe to our Emprise Technologies YouTube channel to watch more informational videos from us.