Text query

knightwest

Registered User.
Local time
Today, 09:13
Joined
Apr 18, 2007
Messages
16
Hi guys,
I have a table with a rather unfriendly field. This is legacy data so it is not possible to alter the way data was input.

The file contains text in the following way.

Word1 Word2 Word3 Word4 etc...

The number of words is not constant throughout the column.

What I would like to do is split this field, so that i get each Word in a new column, ie if there are 4 words in the field, i could run a make table to give me a new table with 4 new columns, each containing one word.

I am thinking about splitting the text at the spaces. that would work very well for me.

But i do not know how to do this. Any help would be appreciated.

Thanks

Knightwest
 
You will need to look up the following functions in Access help.

Instr()
Left()
Mid()
Right()

Col
 
The only problem you will encounter is the fact that you do not know in advance how many words are to be split. So it can't be performed in a query.
This also goes against the rules of normalisation.

As a recomendation I would turn all your repeating words into rows.

if you have access 2000 or higher then check out the Split function.

here is some pseudo code
create a table to contain the words

for each row
store the PK
split the column into an array
for each element in the array
write stored PK as a FK and the word to a table on the many side of a one to many
 
Just a quick question, WHY?
monkey.gif
This reeks of non-normalization and pain with a lot of suffering. Why in the world would you want a table with a column for each word? I would hope that you have a good answer for that question as I can't see any situation where that would be desired.
 
Last edited:
sadly i am unfamiliar with normalisation.

i am trying to isolate policy numbers stored in that text field. in 90% of the time they are the first "word" in the field, but in the rest they are either at the end or sometimes in the middle.

i thought if i could split them into different fields, then i could create some sort of function to take the policy numbers from these new fields and finally have them in one place.

if you have a better idea on how to this i would be very grateful. i get the impression that my way of wanting to do this is a bit of a nightmare
 
So, a few questions:

1. How does the data get into your database in the first place.

2. Why is there multiple data in one column, from which you have to extract a policy number.

3. Is there anything you can to change that?

4. Is there a pattern to the policy number so that it can be identified in the string?

If at all possible, I would redesign the database so that it is properly designed with the correct entities and data elements. A data element should only contain one piece of datum. It should not have multiple data residing in it as that defeats the purpose of a database, which is to get meaningful information out of it.
 
1. How does the data get into your database in the first place.
It is uploaded from excel, it originates from an insurance software that bunches some of the fields into one.
2. Why is there multiple data in one column, from which you have to extract a policy number.
the insurance software bunches the data
3. Is there anything you can to change that?
no this cannot be changed at the moment, or in the near future.
4. Is there a pattern to the policy number so that it can be identified in the string?
there is indeed. the policy numbers fall into one of these categories:
"*[A-Z][A-Z]#######[A-Z][A-Z][A-Z]*"
"*[A-Z][A-Z]#######[A-Z][A-Z]*"
"*87########[A-Z]##*"
"*87# #######[A-Z]##*"
"*[A-Z][A-Z]#######[A-Z]*"
"*#######[A-Z]*"
"*5######*"

where [a-z] represents any letter and # represents any number.
 
Okay, now I'm not going to be overly helpful at this point as I don't have experience in using them, but you need to look into how to use REGULAR EXPRESSIONS. If you haven't already done so, I would do a search here, and on Google, to find as much information on how to use them as possible.

I don't envy your position at all. If you can figure out how to use them, you should be able to come up with a way to transform your data as you bring it in so that it can be more usable.
 
thanks boblarson,

your input has helped me progress a bit.

for instance, with the function

Like "5*" i have been able to run a query bringing back all the policy numbers starting with a five.

I now face a few new problems:
1.) while this query now brings back all records with the above criteria, it still contains all the crap coming behind it. i want to isolate the policy number in a new column.
2.) as the wildcard * is missing at the front of the criteria, it works fine for policy numbers that come at the start of the record cell (they're the ones i haven't got a problem with anyway), but if i include the wildcard * at the beginning, then access includes policy numbers that have a 5 in it.

Is it possible for access to select record on case, Ie could it somehow select all records starting with 2 capitals, Ie HHwhr?
 
Last edited:

Users who are viewing this thread

Back
Top Bottom