Merging Progress OpenEdge databases

SteveS

New Member
Hi, i hope someone out there can help me please!!

I am running OpenEdge 10.1b to host a database behind another third party application. I am running 9 instances of this database and application combination, with each database holding different iterations of the same format of information. The time has come for me to have one applicaiton with one database and to do this I need to merge all the the databases I am currently running. The application provider does not think that this is possible, but i reckon that it can be done through Progress tools.

Has anyone out there tried this in the past with any levels of success? If so, any guidance as to the process and tools that need to be used would be v helpful.

Thanks
 

TomBascom

Curmudgeon
The issue is one of uniqueness of the data.

If the data is reliably unique then it is fairly simple -- just dump each of the 9 databases and load them into a single consolidated database.

Duplicate, or partially duplicate, data could mean many different things:

1) You might have the same record redundantly entered in more than one database. If the same data is entered identically in all cases then it truly is redundant and the duplicates will fail on load (so long as there is a unique key). This is not a problem.

2) You might have what superficially seems like the same data but isn't quite. For instance you might have a customer such as "Lift Line Skiing" in each database. They might even have all the same data in the visible fields. But if they are using different customer-id numbers behind the scenes you will have multiple instances of that customer.

3) A variation on #2 would be one where you not only have different customer-ids but also have different data. You may, perhaps, have "Lift Line Skiing" as a customer in 3 different cities and they are, in fact 3 different customers...

4) You might not have either problem 2 or 3 but you may have the same customer-id (or whatever) used for completely different customers.

Only you and your vendor know your data well enough to say if these problems (and others that I didn't think of this morning) are, or are not actual issues in your case.
 

SteveS

New Member
Thanks for this. It's nice to see that in theory the merge could be relatively straight forward. I think that we are likely to encounter problems 2 and / or 3 for sure on our merge.

There hopefully wont be two many iterations of the same customer or record so it might just be the case of going for a manual tidy up after merge.

Thanks fo the assistance. :)
 

tamhas

ProgressTalk.com Sponsor
There are two very different goals in such a merge. In one version, one wishes to retain the separateness of the data even though there is only one database. For this one needs to prepend a company or entity identifier on all records and modify the code accordingly. From what you have said, I am guessing that is not what you want.

The other version is to consolidate previously separate data into one database, e.g., moving from separate regional operations to a centralized operation. For that, you need to be concerned about the kinds of issues which Tom has listed. I would suggest starting out by writing some code that will connect to all databases, or to each database in relation to one "master", and come up with some comparisons. I.e., look for matching customer numbers and buffer compare the records. Then, look for matching customer names and see if they have the same number.

Note that whenever you encounter the problem of having two different customers with the same number or the same customer with two different numbers, it won't just be those records you need to fix, but any record in that database which points to that number. I would ***highly*** recommend doing the cleanup in the separate databases before you attempt a merge. With the data that is related in its own database, it will be much, much, much easier to find and fix referential integrity connections when you have to make a change.

Thus, I would pick one database as master and, for simplicity, start by comparing to one other database. As you encounter conflicts, write code to fix the conflict, i.e., bring corresponding records into alignment or renumber records as needed. Then, either add the third and do it again relative to both databases or merge the first two and repeat the operation with the third. Lather, rinse, repeat.

Note that fixing something like a customer number can be fairly messy since there may be many references to it, but fixing something like invoice numbers can be a bigger problem because, while there are fewer references, they are a number which is important outside the database and you will have to deal with conflict between the number in the database and the external document.

Potentially a very large job, depending on the complexity of the application and your business practices.
 
Top