Not sure where to start?

sgladie@bgsu.edu · May 31, 2017

Hello, Access 2016
Three tables:
tblPositive (has one field PosWord)
tblNegative (has one field NegWord)
tblTargetText (has three fields: Chapter, Paragraph, TargetText)

Goal: I would like to have a query look for all the PosWord in TargetText and return those records. I have tried "*tblPositive.PosWord*" in criteria to no avail. I tried a DLookup also.

Maybe a I need a loop statement. Just not sure. The end goal:
To have it return all records in TargetText (memo field) that has one PosWord AND one NegWord.

Just having a mental block of where to begin. I am thinking I could create a query, but not I am not so sure.

Please help!

Galaxiom · May 31, 2017

It can be done using InStr() in the query join criteria.

This join can only be done in the SQL View of the query. Start by building the query in the designer with an ordinary join and then edit the join in the SQL.

sgladie@bgsu.edu · May 31, 2017

So here is the SQL:

SELECT tblTargetText.TargetText
FROM tblNegative INNER JOIN (tblPositive INNER JOIN tblTargetText ON tblPositive.PosWord = tblTargetText.TargetText) ON tblNegative.NegWord = tblTargetText.TargetText;

I joined the tblPositive table and tblNegative table to TargetText field - even though it really isn't a good join, since I need to use wild cards in the beginning and end of PosWord and NegWord.

Can you please help me with using the InStr function, I have never used that before. Really appreciate your help!

Update: I went and researched the InStr function, and I think this might not work. I do not need the location of the word, a number value of position, etc. I just need all the TargetText records that have at least one PositiveWord and one NegativeWord. I have looked into loops, DLookups (nested), and Recordsets. I still cannot seem to locate a solution. :banghead:

Galaxiom · May 31, 2017

Since you are using wildcard characters best use Like and Not Like. You can do this in the design view.

sgladie@bgsu.edu · May 31, 2017

I figured it out! I had to concatenate the wildcards! This is what was screwing me up, thinking it was harder than it was. It wouldn't work properly until I did this:

Like "*" & [tblNegative]![NegWord] & "*" And Like "*" & [tblPositive]![PosWord] & "*"

All is well now. I knew I was making it harder than it really should be, but I have never seen wildcards concatenated! Thanks for your help!

The_Doc_Man · May 31, 2017

It is not entirely clear, but I surmise you are doing a positive-word/negative-word evaluation in a much longer text sequence, some sort of word frequency analysis for specific words with positive or negative connotations.

Issues that will make this work reasonably or not so reasonably:

1. In what kind of fields (in your target) did you expect to find the "connotation" words - short text, long text, something else?

2. How many neutral words will be in the field to be searched - i.e. what percent of the target string is likely to contain the information you want?

3. Are you searching for whole words or can there be included/overlap matches? I.e. if you look for "crap" will you also match "crappy" in the same search or will you do separate searches based on two different entries in your negative word tables?

4. How many words are in the the positive and negative tables and how many paragraphs are in the target table? This governs how many matches you will have to face when looking at your results.

I would guess that there is at least some potential for more than one word of positive or negative connotation in any arbitrary paragraph. Further, you are doing a complex type of string match-up, potentially with LIKE operators. Therefore this will be a compute-intensive query and you might not get a LOT of hits quickly depending on the exclusivity of your word tables. I wonder about performance for this case.

If this information is coming from a Word document and if you are at least a little bit adventuresome, there might be another way to deal with this, called a Scripting Dictionary object. You can use it in Access if you have a reference to the Windows Scripting library. (It is NOT native to Access, but IS usable by access.)

Instead of having a table of good words and a table of bad words and trying to do a join, you might create a dictionary of ALL of your "connotation" words with an attribute (called the "Item") of "Good" or "Bad" (or any other value you wish for these items). The dictionary words are automatically indexed, so they are easy to find once the dictionary is built.

What you do is create the dictionary by adding all the "connotation" words (perhaps by loops through recordsets for your good/bad tables). Then you can probe the dictionary; first, using the dictionaryobj.Exists(keyword) method to see if your target word was one of your connotation words; then if the word IS in the dictionary, look up the good/bad marker item by using value = dictionaryobj.Item(keyword).

See this article in MSDN: https://msdn.microsoft.com/en-us/library/x4k5wbx4(v=vs.84).aspx

Several links on that page explain the available methods and give simple code examples.

It might sound daunting, but it is actually not too bad. Just as a reminder, if you go this way, you should probably remember to use the dictionaryobj.RemoveAll method when you are done and then to set the dictionary object variable to Nothing. ( SET dictionaryobj = Nothing ) - though that last might not be necessary if the object variable is declared in a subroutine rather than in the declaration area of a general module. I advise it as more of a "belt-and-suspenders" precaution.

The idea would then be that you can step through the individual words in your target document, keeping track of paragraph and chapter, and check each word ONCE using the populated dictionary object to see if any of your flagged words are there.

Why do it this way?

1. Dictionary objects are indexed so that search for a "connotation" word is fast.

2. It is fast because the dictionary stays resident in memory during your search.

3. The search is memory based. The only disk I/O is for the target text source (plus whatever you wanted to store in a reference table.

4. Doing it this way is a one-pass algorithm with only a little bit of overhead for setting up the dictionary up front and breaking it down when done.

I'm not at all saying that a query wouldn't do what you want - but the performance would be hellish for larger lists of "connotation" words and larger targets. Referencing "chapter" and "paragraph" makes me think there is a potential for the target to be a large document.

Galaxiom · May 31, 2017

Doc makes some excellent points.

I would further say that if your Target Text comes from any kind of text based document files then consider storing them using a File Table in MS SQL Server.

The contents of file tables are presented as a file system share, allowing the files to be stored and used exactly as they are when stored in the Windows file system.

The big advantage is that File Tables support full text indexing within SQL Server with contextual querying of those indexes. For example, a query can reveal if specified words appear within a specified separation in the files and much more.

The properties of the files can also be manipulated via queries.

The_Doc_Man · May 31, 2017

Since I have not been fortunate enough to use MS SQL Server, I wasn't aware of the File Tables features. (My primary experience with mixed FE/BE was ORACLE-related.) However, the ability to do word indexing in that context very likely would make this faster.

Galaxiom · May 31, 2017

The_Doc_Man said:
Since I have not been fortunate enough to use MS SQL Server, I wasn't aware of the File Tables features.

It is a relatively recent extension of FileStream, being included from MSSQL Server 2012 onwards. It is even included in the Express version.

It is a pretty slick tool but the queries just reveal which documents contain the positive search results. It needs a lot of front end work to actually display the detail of the context.

Bizarrely, a free extension has to be installed to support indexing MS Office documents even though dozens of other text based formats are supported by default.

The_Doc_Man · Jun 1, 2017

Bizarrely, a free extension has to be installed to support indexing MS Office documents even though dozens of other text based formats are supported by default

Sounds like a MicroSoft product.

Not sure where to start?

sgladie@bgsu.edu

Registered User.

Galaxiom

Super Moderator

sgladie@bgsu.edu

Registered User.

Galaxiom

Super Moderator

sgladie@bgsu.edu

Registered User.

The_Doc_Man

Immoderate Moderator

Galaxiom

Super Moderator

The_Doc_Man

Immoderate Moderator

Galaxiom

Super Moderator

The_Doc_Man

Immoderate Moderator

Similar threads

Users who are viewing this thread