wiklendt
i recommend chocolate
- Local time
- Today, 15:24
- Joined
- Mar 10, 2008
- Messages
- 1,746
hi everyone,
i'm in the situation many will tsk tsk about. i was, in the past, assured i wouldn't have to deal with this data, but here i am, dealing with it.
i have a field with descriptive text. i want to extract some of that text for further analysis.
i can find various instances of text that we're interested in, but now my supervisor wants me to separate two types of the data, which are not really separable the way the data has been entered.
from the text field "notes", i want to determine a score of samples which have a pink isolate, a blue isolate, and a blue isolate with a pink halo.
so i'm creating a boolean for pink, blue, and blue PH for each sample tested.
i have been able to get the pink and pink halo all right by using:
and
but now i cannot determine if the field contains a blue isolate AND a blue with pink halo, or just a blue isolate....
example of the records for "notes":
as you can see, if i search just for "*blue*", then i'll return a 1 even for those samples that only have a blue PH, not just plain blue.
my idea, if it's possible, is to COUNT how many instances there are in the string of PH and of Blue. then, if blue and PH count match, then there is only blue PH isolates from that sample. if there are more of blue than PH, then the sample had both types of isolates.
i just can't come up with a way (in a query) to COUNT how many times "blue" and "PH" appear in each record...
think about a bag of marbles - to give an analogy - we have data that describes all the different coloured marbles in our bag in one text field for each bag. initially, my supervisor just wanted to know which bags contained "pink" marbles and "blue" marbles, if any. NOW he's decided that the decorative bits matter, and wants to know which bags have "pink" marbles, "blue without a swirl" marbles, and then "blue with a swirl" marbles. thing is, "blue without a swirl" was only ever recorded as "blue"....
we have thousands of records with this field, so that's why i'm trying to find a query solution rather than re-typing or re-designing (there is no more data to go in, all the data is there, the study is complete, it's just data extraction and analysis now to go).
i'm in the situation many will tsk tsk about. i was, in the past, assured i wouldn't have to deal with this data, but here i am, dealing with it.
i have a field with descriptive text. i want to extract some of that text for further analysis.
i can find various instances of text that we're interested in, but now my supervisor wants me to separate two types of the data, which are not really separable the way the data has been entered.
from the text field "notes", i want to determine a score of samples which have a pink isolate, a blue isolate, and a blue isolate with a pink halo.
so i'm creating a boolean for pink, blue, and blue PH for each sample tested.
i have been able to get the pink and pink halo all right by using:
Code:
Pink: IIf([notes] Like "*pink*",1,0)
Code:
Blue PH: IIf([notes] Like "*PH*",1,0)
example of the records for "notes":
notes
-----------------
pink / white (mixed)
tiny pink
pink, blue, (mixed)
pink, blue, white, (mixed)
Blue and white - brown agar (mixed)
Pink, blue, white - brown agar (mixed)
blue
blue / white / blue PH (mixed)
blue PH / white (mixed)
pink, blue, white, (mixed)
Blue and white - brown agar (mixed)
Pink, blue, white - brown agar (mixed)
Blue (mixed)
Pink and blue (mixed)
Blue
as you can see, if i search just for "*blue*", then i'll return a 1 even for those samples that only have a blue PH, not just plain blue.
my idea, if it's possible, is to COUNT how many instances there are in the string of PH and of Blue. then, if blue and PH count match, then there is only blue PH isolates from that sample. if there are more of blue than PH, then the sample had both types of isolates.
i just can't come up with a way (in a query) to COUNT how many times "blue" and "PH" appear in each record...
think about a bag of marbles - to give an analogy - we have data that describes all the different coloured marbles in our bag in one text field for each bag. initially, my supervisor just wanted to know which bags contained "pink" marbles and "blue" marbles, if any. NOW he's decided that the decorative bits matter, and wants to know which bags have "pink" marbles, "blue without a swirl" marbles, and then "blue with a swirl" marbles. thing is, "blue without a swirl" was only ever recorded as "blue"....
we have thousands of records with this field, so that's why i'm trying to find a query solution rather than re-typing or re-designing (there is no more data to go in, all the data is there, the study is complete, it's just data extraction and analysis now to go).