Pervasive has recently developed an effective utility for migrating Data Integrator v9 projects into Pervasive Data Integrator v10. The process is quick and relatively smooth; however, there is the potential for challenges to arise due to the complex nature of most DI projects. If you are thinking about transitioning from v9 to v10, please reach out to Emprise to learn how our team of Certified Pervasive Developers can help your transition to v10 be successful.
Pervasive Data Integrator can be a powerful tool, enabling multiple connections between a wide variety of systems and data points. The Repository Explorer is the starting point for every Data Integrator project. To get started on your first project, you must understand how to configure your Workspaces and Repositories through Repository Explorer.
What is the Repository Explorer?
The Repository Explorer is the starting point for all Pervasive Data Integrator projects. From this one application, developers can navigate to projects, open existing Pervasive Data Integrator elements (Maps, Processes, Structured Schemas, etc), or create new instances of those elements.
Who uses the Repository Explorer?
The Repository Explorer is used almost exclusively by developers, but it can also be used by quality assurance resources to access and review code that has already been developed.
How do you configure the Repository Explorer?
Opening Repository Explorer
Once installed, Repository Explorer can be accessed like any other program on Windows.
- Open the Start menu
- Select ‘All Programs’
- Select Pervasive folder
- Select Data Integrator 9 folder
- Select Repository Explorer 9 program
Setting up a Workspace
The Repository Explorer organizes files using two methods. The first is the Workspace. A Workspace is a collection of one or more Repositories plus a single Macrodef.xml file that is specific to that Workspace. (Note: For further information on the Macrodef.xml file, check out our two videos on Macro Definition Variables.) At Emprise, our best practice is to create one Workspace for each project, which allows us to specify a unique Macrodef file per project.
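For orientation, a macro definition file is conceptually just a set of name/value pairs that Maps and Processes can reference. The sketch below is only an illustration: the element names, macro names, and paths are assumptions, not Pervasive's documented schema, so treat your installation's own Macrodef.xml as the authority.

```xml
<!-- Hypothetical sketch of a macro definition file.
     Element names, macro names, and paths are illustrative assumptions,
     not the documented Pervasive Data Integrator schema. -->
<macrodefs>
  <macro name="SRC_DIR">
    <value>C:\ProjectsDI\ClientA\Data\In</value>
  </macro>
  <macro name="TGT_DIR">
    <value>C:\ProjectsDI\ClientA\Data\Out</value>
  </macro>
</macrodefs>
```

Keeping one Workspace (and therefore one Macrodef) per project means a macro like SRC_DIR can hold a different path for each project without edits leaking across projects.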
To create a new Workspace, follow these steps:
- Select File from the menu bar
- Select Manage Workspaces…
- Click the drop-down to the right of Workspaces Root Directory, navigate to the location where you would like to save your Workspace, and click OK. The Workspaces Root Directory is the location where the Workspace folder will be created. Inside this folder, a set of mandatory default files and folders will be created:
- Xmldb – This is the folder used for the default repository when creating a new Workspace.
- Fileref.xml – A list of file references used by the workspace.
- Macrodef.xml – The macrodef file. For further information see our video.
- Repositories.xml – A list of all Repositories for the Workspace. This file is rarely changed after initial setup.
- Select Add
- To add an existing Workspace, check the box for the appropriate Workspace.
- If adding a new Workspace, which will often be the case, click the “Create New Workspace” button. You will be prompted for a name. Give your Workspace a descriptive name, then click “OK” to return to the previous screen.
- Click OK
- Find the Workspace you just added, click the name to highlight it, then click the Set Current button on the right. This activates the Workspace, allowing you access to its repositories.
*Note: You can also right click on the white space to the left of the screen that displays your current repository and select the ‘Manage Workspaces’ option from there. Also, when creating a new Workspace using the “Create New Workspace” option, the Macrodef.xml file from the current Workspace, including its Macro names and values, is copied into the new Workspace's directory. This is helpful when standard Macros are used, but pay attention to directory paths and file names that may not apply to the new project.
Setting up a Repository
Repositories are directory paths pointing to where the Pervasive DI project files will be located. A Workspace can have any number of Repositories, which are displayed in a tree view on the left side of the Repository Explorer. Only Pervasive files located in one of the Repositories for the Workspace can be opened and edited from the Repository Explorer.
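Because a Repository is ultimately just a named directory path recorded in the Workspace's Repositories.xml file, that file reduces to name/path pairs. As with the macro file above, the markup below is a hypothetical illustration; the element names and paths are assumptions, not the product's documented schema.

```xml
<!-- Hypothetical illustration of Repositories.xml entries.
     Element names and paths are assumptions, not the documented schema. -->
<repositories>
  <repository name="ClientA_Dev" path="C:\ProjectsDI\ClientA\Repository" />
  <repository name="ClientA_Shared" path="C:\ProjectsDI\Shared\Repository" />
</repositories>
```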
- If you are not already working in the Workspace where you want to create the Repository, open File -> Manage Workspaces (or right click on the white space on the left that displays the file directory structure and select Manage Workspaces). Then click the name of the Workspace you would like to use to highlight it, and click the Set Current button on the right.
- Select File from the menu bar
- Select Manage Repositories…
- Click the Add… button to create a new Repository. At this point you can either navigate to the folder you would like to use as the Repository or paste in the file path copied from Windows Explorer.
Note: You can also right click on the white space to the left of the screen that displays your current repository and select the ‘Manage Repositories’ option from there.
Best Practices
- Use a standard naming convention for Workspaces. This makes it easy to identify which project each Workspace belongs to.
- Use a standard directory structure and path for the Repository folder. This prevents issues that can arise when multiple developers work on the same code base: Pervasive Data Integrator projects store a series of pointers within their files, and standardizing Repository paths keeps those pointers from being corrupted when code moves between developers.
- After creating a new Workspace, it is prudent to open the Macrodef.xml file, remove unneeded Macros, and change others to match your new project. Please see our video on the Macrodef for further information.
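The path-standardization tip above can be made concrete with a small script. This is not part of Data Integrator; it is a hypothetical helper (the C:\ProjectsDI root is an assumed team convention) that a team could run to confirm a Repository path follows the agreed layout before sharing code.

```python
# Hypothetical helper for enforcing a standard Repository root.
# The root below is an assumed team convention, not a Pervasive requirement.
from pathlib import PurePosixPath

STANDARD_ROOT = PurePosixPath("C:/ProjectsDI")

def is_standard(path: str) -> bool:
    """Return True if the Repository path sits under the agreed root."""
    # Normalize Windows backslashes so the comparison is purely textual,
    # then compare path components case-insensitively.
    parts = PurePosixPath(path.replace("\\", "/")).parts
    root = STANDARD_ROOT.parts
    return [p.lower() for p in parts[:len(root)]] == [p.lower() for p in root]

print(is_standard("C:\\ProjectsDI\\ClientA\\Repository"))  # True
print(is_standard("D:\\Dev\\ClientA\\Repository"))         # False
```

If every developer's Repository resolves under the same root, the pointers stored inside the project files keep lining up when code is moved between machines.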
Few things get better with time. Without careful attention your data certainly won’t be one of them.
For most organizations, a love/hate relationship exists with their data. We love that we can draw together information from various systems and use it to see a picture of how effective we’re being. We hate how difficult it is to move and maintain that information.
Recently, I’ve been working with a large organization that is making changes to its data integration infrastructure. As part of the project, we’re reviewing how data moves into the organization and through multiple core business systems. It’s been remarkable how many times the data is touched and the potential negative impact this whole cycle has on data quality.
Through identifying actual problem areas, we've come across some now-familiar culprits:
- Intake of information. Often, not all of the data is loaded. The risk is that we don't get everything, which reduces the quality of what we have.
- Cyclic miscommunication between systems. Dependencies and the system strain associated with moving large amounts of data in and out result in periodically missing a transfer. One process gets backed up or breaks and the delays snowball.
- Complexity of processes. At some point in every process, business rules get inserted to make decisions on how and where the data belongs. Knowledgeable IT staffers are asked to create complex processes that are very difficult to test.
We see these same problem areas, to varying degrees, with most of our larger clients. Data is certainly difficult to handle; that's not a new idea. But what is the cumulative result of this difficulty?
The quality of the information you use to run your business depreciates steadily: given enough time and complexity, your data will decay.
External factors can add fuel to the fire. If some of this data is about people (and some of it surely is), then there’s a silent but significant change going on external to your organization. People are constantly in flux – moving, changing jobs, getting married, etc. – all of these activities are bad for the information you house about them. Even in a short amount of time you know less than you did initially.
What’s to be done? How do you earn top marks for clean data?
Ideally, the solution is to examine your data handling processes and look for problem areas. Is your organization using the best tools for the job? While this is the best approach, it can be overwhelming. At some point, though, it simply has to be done, and the longer you wait, the more difficult the mess will be to unravel.
At the other extreme, you can ignore the problem and treat the symptoms. While this seems like a bad idea for the long haul (it is), it can be very cost-efficient and give the organization a significant lift. The first step is to look at the data where it's being used and identify what's missing or bad. Once you see the trouble, solutions become possible.
Data is a corporate asset. It requires maintenance and it depreciates over time. Like everything else you do, recognizing the problem is the first step to a solution.