Hi Folks,
This is a bit confusing, so I am going to try to explain with some examples and I hope someone can help me.
If I were loading a nice, normal, standard flat file with the information I needed, it could easily be loaded and might contain the following fields in, say, a tilde-delimited format:
Example A (column names): solar system~galaxy~nearest star's name~hemisphere~continent~coordinates~country name~country coordinates~political system~city name~city population~city birth rate~city death rate~precinct name~precinct number~population ... ... ...
Example B (data): The Universe~milky way~the sun~north~north america~52N57S .....
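For the flat case, loading really is trivial: one line is one complete record. A minimal Python sketch, assuming an abbreviated column list (the data line is the sample above; the column names are a hypothetical subset):

```python
import csv
import io

# Hypothetical subset of the column names from Example A
columns = ["solar system", "galaxy", "nearest star's name",
           "hemisphere", "continent", "coordinates"]
data = "The Universe~milky way~the sun~north~north america~52N57S\n"

# csv.reader handles the tilde delimiter directly
reader = csv.reader(io.StringIO(data), delimiter="~")
rows = [dict(zip(columns, fields)) for fields in reader]
# rows[0] is one complete, self-contained record
```

In a real load you would pass an open file handle instead of the `io.StringIO` used here for the example.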
My problem is that I am attempting to load/import a file format that is far from nice/normal/standard: the structure is broken out into groups, with each group ending in a carriage return.
Example A (columns):
PLANET (CONTROL HEADER), AKA CH, contains the fields solar system, galaxy, nearest star's name
COUNTRY (SET HEADER), AKA SH, contains the fields hemisphere, continent, coordinates
COUNTY (GROUP HEADER), AKA GH, contains the fields country name, coordinates, political system
CITY (CITY INFO), AKA CI, contains the fields city name, city population, city birth rate, city death rate
PRECINCT (PRECINCT INFO), AKA PI, contains the fields precinct name, precinct number, population
ZIPCODE (ZIPCODE INFO), AKA ZC, contains the zip code and the nearest major city with >50K population
PERSON (PERSON DETAIL), AKA PD, contains the person's fname, lname, dob, ssn, address, city, state
ACTIVITY (LOOP 1), AKA AC, contains the person's date of service, location of service, duration of service, provider of service
Example B (data): this is how the data looks if I open the file:
CH~UNIVERSE~MILKY WAY~THE SUN
SH~NORTH~NORTH AMERICA~52N57S
PD~10/11/2011~JOHN~SMITH~...
Notice that the first few characters on each line above indicate what sort of information I am dealing with, e.g. 'CH' on the first line indicates the 'PLANET' information.
As you can see from the above, I need to somehow take that format and import it into the same shape it would have if it were a normal flat-file layout, for example:
solar system, galaxy, nearest star's name, hemisphere, continent, coordinates, political system ... so that every data element from all the groupings would sit neatly side by side in cells, like a normal import into a normal table.
So essentially, I would need to determine which grouping I am looking at by examining the first few characters of each line, load that header's data into the table, and keep populating the same row with subsequent data until I reach a header that requires a new record.
For instance, if there are 1000 records under the primary PLANET grouping named 'EARTH', that group is not repeated or mentioned again in the file until a different planet's data has its turn, so I would somehow need to keep populating the PLANET grouping's fields as I drill down through the data in the other groups. It becomes even more of a mess because the data has dozens of groupings, and some of them 'loop' many times; for example, a person's activity can be any number of records, yet each one must somehow be attached to the information from the preceding headers.
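The logic just described can be sketched as a small state machine: keep a dictionary of the most recently seen values for each header type, overwrite that header's slice whenever its line arrives (outer headers stay in effect), and emit one flattened row each time the innermost record type appears. A minimal Python sketch, with hypothetical abbreviated layouts, using only the CH/SH/PD prefixes from the sample data:

```python
import csv
import io

# Columns carried by each record type (abbreviated, hypothetical
# subset of the real layouts described above)
LAYOUTS = {
    "CH": ["solar system", "galaxy", "nearest star"],
    "SH": ["hemisphere", "continent", "coordinates"],
    "PD": ["dob", "fname", "lname"],
}
# The record type whose arrival emits one flattened output row
EMIT_ON = "PD"

def flatten(lines):
    """Turn prefix-tagged hierarchical lines into flat dict rows."""
    current = {}   # most recent value seen for every column so far
    rows = []
    for fields in csv.reader(lines, delimiter="~"):
        tag, values = fields[0], fields[1:]
        if tag not in LAYOUTS:
            continue   # unknown segment: skip (or log) it
        # overwrite this header's slice; outer headers stay in effect
        current.update(zip(LAYOUTS[tag], values))
        if tag == EMIT_ON:
            rows.append(dict(current))   # snapshot the merged state
    return rows

sample = io.StringIO(
    "CH~UNIVERSE~MILKY WAY~THE SUN\n"
    "SH~NORTH~NORTH AMERICA~52N57S\n"
    "PD~10/11/2011~JOHN~SMITH\n"
    "PD~12/01/2011~JANE~DOE\n"
)
rows = flatten(sample)
# both PD rows inherit the CH and SH values still in effect
```

The real file would need one entry in `LAYOUTS` per grouping, and looping segments like AC would just use the same mechanism with `EMIT_ON` set to the loop's tag.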
I am losing hope here big time ... is there any way to perform the above, or am I pretty much out of luck? Is there some sort of logic that can help me with this extremely complex issue? What is a file structure or problem like this even properly called? The trouble is that I have no say in the format; this is how the data is presented to me, for reasons unknown (perhaps someone can at least tell me WHY anyone would use a format like this). For the first time since I started working in this field well over a decade ago, I am totally confused and lost, and I have never seen such a horrendous data format in my life ... please help/advise/suggest/inform!
Thank you very much!
Joe