Making legacy system data available in Access

HartJF

Although I work in a functional department (not IS), I've been instructed to develop an archive for our portion of our legacy system. (Other departments aren't interested in the historical data, for reasons I can't fathom.) That system was developed on a DEC VAX and is a cobbled-together mish-mash of relational and flat structures.

I have identified at least seventeen files (tables) that are essential to mining the historical data in the legacy system. Two files have more than 300 fields; one of those files has more than 130K records. Some files with fewer fields (on the order of 25) have well over half a million records.

The system has a facility to export comma-separated, text-delimited files, but each field has to be listed by name. The system uses fixed-length records (and fields) that can be overlaid in multiple ways to define subfields. (For example, consider the consecutive characters 12212010 -- today's date. This could be referred to as a single 8-character date field, while positions 1 and 2 could be defined as a month field and positions 1 through 4 as a month-day field, all simultaneously.) Because of the substantial overhead of the delimiters in a 340-field file (nearly 1K per record, more than 100MB for the table), I am considering overlaying the record with a single field for export.
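In Access terms, each overlay is just a different slice of the same character positions, which Mid can pull apart. A trivial sketch (the variable name is only for illustration):

Code:
Dim strRec As String
strRec = "12212010"                ' the raw 8 characters from the record

Debug.Print Mid(strRec, 1, 8)      ' the whole 8-character date field: 12212010
Debug.Print Mid(strRec, 1, 2)      ' positions 1-2, the month field: 12
Debug.Print Mid(strRec, 1, 4)      ' positions 1-4, the month-day field: 1221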

In converting the legacy data to our new system, I wrote code that translated the text-file output of the data-definition reports into a database of fields and their properties. Since Access is a recursive tool (table definitions are themselves data in tables), I am hoping to manipulate that data-definition database into linked-table specifications.

(As for future use of the data: I would link the data in a front-end/back-end configuration, even though the back-end data would be static.)

How do I achieve this? Can I push data to MSysIMEXSpecs and MSysIMEXColumns? Must I use Schema.ini? (If so, what is the best resource for learning about it?)
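For what it's worth, my mental picture of a Schema.ini entry for one of these fixed-length exports is something like this (just a sketch -- the file name, field names, and widths are invented, not the real layout):

Code:
[CONTACTS.TXT]
ColNameHeader=False
Format=FixedLength
CharacterSet=ANSI
Col1=RecordID Text Width 10
Col2=Contact1Type Text Width 2
Col3=Contact1Date Text Width 8
Col4=Contact1User Text Width 6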

Thank you for any wisdom you can share! Please let me know if I need to clarify anything about this project. I think I'm in over my head, but that doesn't diminish the expectations of my supervisors.
 
Firstly, be aware that Access can only manage a maximum of 255 fields in a table or query. In a query this number includes both the original and the updated values of the fields, so the practical limit for a table is lower.

This is a one-off import, so I wouldn't bother trying to hook into the system's import spec tables.

Personally I would process the original files as TextStream objects, using parsing information held in my own tables (probably very similar to your data-definition tables). Basically this just uses the Mid function to parse each record in a loop, using something like the start character position and field length.
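As a rough sketch of what I mean by parsing information held in your own tables (tblFieldDefs and its field names are placeholders for whatever your data-definition database actually uses):

Code:
Sub CreateFieldDefsTable()
    ' Sketch only: one possible layout for a parsing-definition table.
    CurrentDb.Execute _
        "CREATE TABLE tblFieldDefs (" & _
        "TableName TEXT(64), " & _
        "FieldName TEXT(64), " & _
        "FieldOrder LONG, " & _
        "StartPos LONG, " & _
        "FieldLen LONG)"
End Sub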

Start by setting up your destination database, using the information you got from the data-definition reports to generate the TableDefs or MakeTable queries. (You are definitely on the right track there.) Then open the tables as recordsets and write to them from the parsed TextStream.
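Generating the TableDefs could look something like this (a sketch only, assuming the tblFieldDefs layout above; everything is created as text for simplicity and can be retyped later, bearing in mind that Access text fields max out at 255 characters):

Code:
Sub BuildDestinationTable(strTable As String)
    ' Sketch only: create a destination table from the rows in tblFieldDefs.
    Dim db As DAO.Database
    Dim tdf As DAO.TableDef
    Dim rsDef As DAO.Recordset

    Set db = CurrentDb
    Set tdf = db.CreateTableDef(strTable)

    Set rsDef = db.OpenRecordset( _
        "SELECT FieldName, FieldLen FROM tblFieldDefs " & _
        "WHERE TableName = '" & strTable & "' ORDER BY FieldOrder")

    Do Until rsDef.EOF
        tdf.Fields.Append tdf.CreateField(rsDef!FieldName, dbText, rsDef!FieldLen)
        rsDef.MoveNext
    Loop

    db.TableDefs.Append tdf
    rsDef.Close
End Sub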

The loops involved in the extraction advance the parsing frame across the line and increment the field index in step as they work across the record.
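A bare-bones sketch of that loop (again assuming the placeholder tblFieldDefs; no error handling or type conversion):

Code:
Sub ImportFixedLengthFile(strFile As String, strTable As String)
    ' Sketch only: walk a fixed-length text file line by line and write
    ' each parsed field into the destination table.
    Dim fso As Object, ts As Object
    Dim db As DAO.Database
    Dim rsDef As DAO.Recordset, rsOut As DAO.Recordset
    Dim strLine As String

    Set fso = CreateObject("Scripting.FileSystemObject")
    Set ts = fso.OpenTextFile(strFile, 1)          ' 1 = ForReading

    Set db = CurrentDb
    Set rsOut = db.OpenRecordset(strTable, dbOpenDynaset)
    Set rsDef = db.OpenRecordset( _
        "SELECT FieldName, StartPos, FieldLen FROM tblFieldDefs " & _
        "WHERE TableName = '" & strTable & "' ORDER BY FieldOrder")

    Do Until ts.AtEndOfStream
        strLine = ts.ReadLine                      ' one fixed-length record
        rsOut.AddNew
        rsDef.MoveFirst
        Do Until rsDef.EOF                         ' the parsing frame steps field by field
            rsOut.Fields(rsDef!FieldName) = _
                Trim(Mid(strLine, rsDef!StartPos, rsDef!FieldLen))
            rsDef.MoveNext
        Loop
        rsOut.Update
    Loop

    ts.Close
    rsOut.Close
    rsDef.Close
End Sub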

I would focus on getting a clear definition of the data into tables, with the destination table and field information for every piece of data in the system. Experiment by applying this to a small sample of data until you get it right.

Then sit back and enjoy a quiet drink while it churns through the whole thing.
 
Have you (or your boss) asked IT (especially the Director of IT) what sort of backup/archiving is done for your production systems?

I find it hard to believe this isn't being done somewhere in the organization.

Does the IT section of your company have an official DBA?
Just curious....
 
It will all be manageable, and probably not inordinately slow - except that I expect you will need to automate the data capture programme.

Can't you use the DEC system to output a CSV or fixed-width file with headers for each table? Then you get a starting point. You will probably have to export two files for each table with over 255 fields - but you are almost certain to find they aren't properly normalized.

I don't think it will actually take too long to produce and manage these files - but you will need to develop the techniques to automate the Access imports.
 
Thanks, everyone, for your insights! I never cease to be amazed by our worldwide interconnectedness.

I have to dash away for a doctor's appointment, after which I'll have follow-ups for each of you.
 
GalaxiomAtHome:

Thank you for your thoughtful response!

I appreciate the mention of the 255-field limitation. Does that refer to the total count of fields in all tables joined within the query plus the output fields, or only to the source fields referenced by the query plus the output fields? (This could explain some strange behaviors in other projects.)

What is this TextStream object that you mention? I find only a very brief (and unsatisfying) discussion of it in Access's Help. What is a good source to learn more?

I'm wondering if I really want this to be a one-off import. No future use of this data will require all fields in all files; I'm just three-quarters of the way through the export and I already have 640MB of raw data. That's why I was wondering aloud about editing the Im-Ex Specs. If my data-definition database included forms with which I could select the files and fields required by a new application, I would only import the data I need from fixed archive media (probably CDs). Could that be a reasonable application of the TextStream object?

I am a monster Aussie Footy fan. A local public-television channel broadcasts a weekly Match of the Week from either Fox Sports or Channel Seven. During the off-season, they've replayed key matches from the home-and-away rounds. This month, they've replayed finals matches. Last week I watched that amazing drawn Grand Final; this Monday, I'll see the Grand Final Rematch. I'll have my drink and watch the match while this churns.

Thanks again for your encouraging help!
 
jdraw:

Thanks for your thought-provoking question.

This is a legacy system. My department was the last to migrate to the new system, six months ago; the rest of the organization migrated eighteen months ago. We are still paying licensing fees on the software, which management wants to curtail.

The data is being retained on back-up tapes. In fact, the server failed about a month ago, so a back-up had to be restored.

We are an educational institution established near the end of the Nineteenth Century. Our "DBA" does not know SQL Server, even though our new system is SQL-based. We are not allowed to use Reporting Services or even an ODBC connection to Access with the new system. Instead, I am using a proprietary tool to dump data to a text file for import into Access in order to develop reports and scorecards.

IS has told me to grab whatever I want from the legacy system before it disappears.
 
gemma-the-husky (and you too, Dave):

Thanks for your encouragement!

I want to avoid the overhead a .csv file creates -- text delimiters and field separators would add about 1K to each record in the two largest files, an unnecessary 150MB in total. The "data dictionary" includes dozens of virtual fields, making it behave more like a SQL view. With nearly 750 characters of virtual fields in the two largest files, those files carry an additional 100MB of bloat.

The files are most definitely de-normalized. One table has thirteen sets of the fields ContactnType, ContactnDate, and ContactnUser. The data dictionary includes more than 100 redundant overlay fields.

The challenge that I am focused on now is to develop a strategy to dump the data in the most efficient manner. That strategy must conceptually consider both export and import vehicles.

Then I will consider techniques to automate the import.
 
