Hi Folks,
This is a bit confusing, so I am going to try to explain with some examples and I hope someone can help me.
If I were loading a nice, normal, standard flat file with the information I needed, it could easily be loaded and might contain the following fields in, say, a tilde-delimited format:
Example A (column names): solar system~galaxy~nearest star's name~hemisphere~continent~coordinates~country name~country coordinates~political system~city name~city population~city birth rate~city death rate~precinct name~precinct number~population ... ... ...
Example B (data): The Universe~milky way~the sun~north~north america~52N57S .....
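For the flat case, loading really is trivial: one line is one complete record. A minimal Python sketch, assuming an abbreviated column list (the data line is the sample above; the column names are a hypothetical subset):

```python
import csv
import io

# Hypothetical subset of the column names from Example A
columns = ["solar system", "galaxy", "nearest star's name",
           "hemisphere", "continent", "coordinates"]
data = "The Universe~milky way~the sun~north~north america~52N57S\n"

# csv.reader handles the tilde delimiter directly
reader = csv.reader(io.StringIO(data), delimiter="~")
rows = [dict(zip(columns, fields)) for fields in reader]
# rows[0] is one complete, self-contained record
```

In a real load you would pass an open file handle instead of the `io.StringIO` used here for the example.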
My problem is that I am attempting to load/import a file format that is far from nice/normal/standard: the structure is broken out into groups, with each group ending in a carriage return.
Example A (columns):
PLANET (CONTROL HEADER), AKA CH, contains the fields solar system, galaxy, nearest star's name
COUNTRY (SET HEADER), AKA SH, contains the fields hemisphere, continent, coordinates
COUNTY (GROUP HEADER), AKA GH, contains the fields country name, coordinates, political system
CITY (CITY INFO), AKA CI, contains the fields city name, city population, city birth rate, city death rate
PRECINCT (PRECINCT INFO), AKA PI, contains the fields precinct name, precinct number, population
ZIPCODE (ZIPCODE INFO), AKA ZC, contains the zip code and the nearest major city with >50K population
PERSON (PERSON DETAIL), AKA PD, contains the person's fname, lname, dob, ssn, address, city, state
ACTIVITY (LOOP 1), AKA AC, contains the person's date of service, location of service, duration of service, provider of service
Example B (data): this is how the data looks if I open the file:
CH~UNIVERSE~MILKY WAY~THE SUN
SH~NORTH~NORTH AMERICA~52N57S
PD~10/11/2011~JOHN~SMITH~...
Notice that the first few characters on each line above indicate what sort of information I am dealing with, e.g. 'CH' on the first line indicates the 'PLANET' information.
As you can see from the above, I need to somehow take that format and import it into the same shape it would have if it were a normal flat-file layout, for example:
solar system, galaxy, nearest star's name, hemisphere, continent, coordinates, political system ... so that every data element from all the groupings would sit neatly side by side in cells, like a normal import into a normal table.
So essentially, I would need to determine which grouping I am looking at by examining the first few characters of each line, load that header's data into the table, and keep populating the same row with subsequent data until I reach a header that requires a new record.
For instance, if there are 1000 records under the primary PLANET grouping named 'EARTH', that group is not repeated or mentioned again in the file until a different planet's data has its turn, so I would somehow need to keep populating the PLANET grouping's fields as I drill down through the data in the other groups. It becomes even more of a mess because the data has dozens of groupings, and some of them 'loop' many times; for example, a person's activity can be any number of records, yet each one must somehow be attached to the information from the preceding headers.
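The logic just described can be sketched as a small state machine: keep a dictionary of the most recently seen values for each header type, overwrite that header's slice whenever its line arrives (outer headers stay in effect), and emit one flattened row each time the innermost record type appears. A minimal Python sketch, with hypothetical abbreviated layouts, using only the CH/SH/PD prefixes from the sample data:

```python
import csv
import io

# Columns carried by each record type (abbreviated, hypothetical
# subset of the real layouts described above)
LAYOUTS = {
    "CH": ["solar system", "galaxy", "nearest star"],
    "SH": ["hemisphere", "continent", "coordinates"],
    "PD": ["dob", "fname", "lname"],
}
# The record type whose arrival emits one flattened output row
EMIT_ON = "PD"

def flatten(lines):
    """Turn prefix-tagged hierarchical lines into flat dict rows."""
    current = {}   # most recent value seen for every column so far
    rows = []
    for fields in csv.reader(lines, delimiter="~"):
        tag, values = fields[0], fields[1:]
        if tag not in LAYOUTS:
            continue   # unknown segment: skip (or log) it
        # overwrite this header's slice; outer headers stay in effect
        current.update(zip(LAYOUTS[tag], values))
        if tag == EMIT_ON:
            rows.append(dict(current))   # snapshot the merged state
    return rows

sample = io.StringIO(
    "CH~UNIVERSE~MILKY WAY~THE SUN\n"
    "SH~NORTH~NORTH AMERICA~52N57S\n"
    "PD~10/11/2011~JOHN~SMITH\n"
    "PD~12/01/2011~JANE~DOE\n"
)
rows = flatten(sample)
# both PD rows inherit the CH and SH values still in effect
```

The real file would need one entry in `LAYOUTS` per grouping, and looping segments like AC would just use the same mechanism with `EMIT_ON` set to the loop's tag.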
I am losing hope here big time ... is there any way to perform the above, or am I pretty much out of luck? Is there some sort of logic that can help me with this extremely complex issue? What is a file structure or problem like this even properly called? The trouble is that I have no say in the format; this is how the data is presented to me, for reasons unknown (perhaps someone can at least tell me WHY anyone would use a format like this). For the first time since I started working in this field well over a decade ago, I am totally confused and lost, and I have never seen such a horrendous data format in my life ... please help/advise/suggest/inform!
Thank you very much!
Joe