Private Message
Aha! Thank you. Yes I see an envelope top rhs, as you said. Thank you, I'll get there yet. What does PM stand for? EdK
MajP -- After searching this I am not sure if this is such a great approach or all that useable for an index. Maybe it is a start, but you would definitely need to build a full application. There is a lot of software out there to do indexing; obviously it is a bigger field than I thought. Some of these apps may be freeware.
Looking at a real index, very few entries are single words. So getting single words may be a start but you would need some interface and features to refine the solutions.
View attachment 97436
You might be able to use REGEXP to get "Adobe Flash Player", but I know of no way to get "book pricing". My code would return "book" and "pricing". Also a lot of software apps have the ability to do sub items. (Artwork -- covers --- ownership of). Word has a pretty good Index creator, but that requires you to tag the items.
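On the REGEXP idea, here is a minimal sketch in Python (not the Access/VBA code discussed in this thread). It assumes a "term" is a run of two or more capitalized words; the pattern and the function name `capitalized_terms` are made up for illustration. It picks up "Adobe Flash Player" but, as noted above, has no way to find a lowercase phrase like "book pricing".

```python
import re

# Assumption: a term is two or more consecutive capitalized words.
# This pattern is illustrative only; it ignores acronyms, hyphens, etc.
CAPITALIZED_RUN = re.compile(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)+\b")

def capitalized_terms(text):
    """Return runs of two or more capitalized words, in order of appearance."""
    return CAPITALIZED_RUN.findall(text)

sample = "You need Adobe Flash Player installed. Our book pricing is explained later."
print(capitalized_terms(sample))  # "book pricing" is not found
```

A capitalized word at the start of a sentence that happens to be followed by a proper noun would also be swept into the match, so the output would still need manual review.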
If I was doing this for me, I would probably have the ability to pull all the Words first. Then the means to delete excluded words, and other words to narrow down the list. Then use the navigation controls to search for "Book" and update items. Once I got all the terms in my table I would use it to then automate Word to tag the items and create the index using the Word features.
So you can see the problem in the returns here
View attachment 97437
For example, I know the context of this document deals with RADARS. I know there are important discussions about accuracy, and I see that "accuracy" occurs 25 times and "accurate" 4 times. I know that there are topics on "Position Accuracy", "Height Accuracy", "Speed Accuracy", but the single word "accuracy" by itself would not be very useful.
So maybe I can use the found words to aid in finding the real terms, but a pure list of words is of little value.
I don't know if this got sent to MajP or not, so I am sending this. EdK
MajP --
"So maybe I can use the found words to aid in finding the real terms, but a pure list of words is of little value."- your comment.
Agreed, but let's go a step further:
I have a simple search routine in my MS Access database & book writer that connects any two "searched for"/"found" words and makes a list/report. From that list I manually check the listed occurrences in either the database or the listed book paragraphs, and I use that list for book writing and research purposes, unrelated to indexing.
But it occurs to me (and as you imply/suggest) that your Access/MS Word application could (easily?) be tweaked such that
- it finds the linkages between any two words
- it would also know the exact location of each "found occasion"
- it could therefore produce a list of "two word combinations" plus location [+ content, actually, of each combo, as mine does]
- it uses the total final word list (as automatically, then manually, created, as you describe/suggest). The key here would be to pare down to a final list that is not going to have an exponentially large number of possible combinations of words.
- Now, it would be a difficult (but not impossible?) task to then go through manually and extract a list for any index, I know that. But a smart guru (such as yourself) could think this through and create something workable.
- (I can see that it would be necessary to) have some routine in MS Access (or linked via Access to control MS Word) whereby the operator can manually combine the essential "word pairs" of his/her choice, armed with the knowledge/understanding of (a) what the guts/meaning of the text is (b) what the reader of the document is likely to want to search for.
- As an author, I would be perfectly happy to spend my time (within reason, of course) doing this "value adding" to the raw word list you can already produce, so that a nice simple usable end product is achieved. The end user can then decide how much time they want to invest in trying to get the perfect index done, manually adding to the computer-produced list(s).
- I know this is not ideal, but I think a fair amount of judgement/skill could be applied by the person selecting the index "pairs", as they (in my mind) would mostly be the author of the material anyway, and would know the content intimately. So I would look on any "word finder/word pair producer" as a helpful aid, rather than as the total solution. Artificial intelligence isn't, and perhaps never will be (hopefully), as smart as the best human brains.
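The "word pair plus location" idea in the list above can be sketched roughly as follows. This is Python rather than Access/VBA, and it is only an illustration under stated assumptions: the paragraph is the unit of location (as in EdK's setup), a "pair" means two kept words standing next to each other, and the function name `word_pairs` and the sample text are hypothetical.

```python
import re
from collections import defaultdict

def word_pairs(paragraphs, keep_words):
    """Map each adjacent pair of kept words to the paragraph numbers where it occurs."""
    keep = {w.lower() for w in keep_words}
    locations = defaultdict(list)
    for number, paragraph in enumerate(paragraphs, start=1):
        # Punctuation is dropped, so a pair can straddle a sentence boundary.
        words = re.findall(r"[A-Za-z']+", paragraph.lower())
        for first, second in zip(words, words[1:]):
            if first in keep and second in keep:
                if number not in locations[(first, second)]:
                    locations[(first, second)].append(number)
    return dict(locations)

paragraphs = [
    "Position accuracy depends on the antenna.",
    "Speed accuracy and position accuracy are reported separately.",
]
pairs = word_pairs(paragraphs, ["position", "speed", "height", "accuracy"])
for pair, where in pairs.items():
    print(" ".join(pair), where)
```

Restricting the pairs to a pared-down word list is what keeps the number of combinations manageable; with every word kept, the list would grow very quickly.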
I would like to know what search phrase you used to find the instances of "word indexing" programs on the 'Net - I tried early on and (though mainly looking for freeware) didn't find much to get that excited about.
I see you have posted something that "crosses" with what I'm about to post, ie this post (like a letter in the post in the olden days where I come from ...), which I will answer point by point after I have posted this current lot of my comments off. It's pretty simple stuff and, despite my mis-reading of the indexing application market, I reckon I'd love to see what you came up with in line with what I've written above. Totally up to you and your inclination and available time etc etc, of course.
Added soon after: I absolutely LOVE your suggestion to (via Access?) auto add chosen index tags to the Word docx, fantastic idea, especially if it can do "pairing" of selected words ----------> phrases.
Yes, I do really need to get more into MS Word, but I have great faith in Access's ability to query and manipulate (in the right hands of selected savants, of course)!!!
The silly (and/or quite valid) reasons I got sucked into Access were that
- I couldn't figure out how to control the position and behaviour of images in MS Word (!!!!!)
- I needed to combine the book writing function with the database function, which (as far as I can see) MS Word fails horribly at, but my clunky setup in MS Access controls and connects the two functions OK.
- Note: I soon enough found a fairly satisfactory routine in Access to control the selection, storage, and positioning of images into about 8 preselected positions and sizes in the book report, for any part of the book (chapter or paragraph). The basic working unit of the book is the paragraph.
- Bottom line: it works really well for me but (until its rebirth at the hands of some savvy guru) only I can use it (seems a waste, but it happens)
EdK
"I would like to know what search phrase you used to find the instances of "word indexing""
Use the term "book indexing software" instead of "word".
MajP -This one is a little pricey, but way more advanced. This is more what I was thinking. Watch the video. This appears to extract useable terms not just words.
Here is some interesting reading to actually program keyword extraction.
Real Time Text Analytics Software - Medallia
Medallia's text analytics software tool provides actionable insights via customer and employee experience sentiment data analysis from reviews & comments.
monkeylearn.com
I found a few freeware versions. Take a look at the video associated with this one; you would have to save as PDF.
This does a lot of the stuff I do and more. The key is once your words are extracted you have to be able to search the document by selecting your words. Then you need to determine if you are going to use it or not. The tagging looks very similar to how Word tags a term.
Index Generator Video
www.openviewdesign.com
Software
Please Note: If you are an author or editor needing to prepare an index to your book or other publication, you may wish to consult our Indexer Locator, which lists professional indexers, their area…
www.asindexing.org
The software in the video has the same problem I was referring to. You are stuck with individual words. So IMO there needs to be a feature to review each occurrence of the word and then Add/Edit words based on context. So in my example I would click on Accuracy and the first occurrence deals with Position Accuracy. I would add that to the found list. Then look at the next occurrence and see that it deals with Speed Accuracy. I would add that to the list. Each time a word is added to the list it would automatically search the document and add to the page numbers.
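The review-and-add step described above could look something like this. It is a rough Python sketch, not the actual tool: it assumes the document is already split into a list of page texts, and the names `occurrences`, `pages_for_term`, and `add_term` are made up for illustration.

```python
import re

def _pattern(term):
    """Whole-word, case-insensitive pattern; tolerates line breaks inside a phrase."""
    body = r"\s+".join(re.escape(word) for word in term.split())
    return re.compile(r"\b" + body + r"\b", re.IGNORECASE)

def occurrences(pages, word, width=20):
    """List (page number, surrounding text) for each hit, for manual review."""
    hits = []
    for number, text in enumerate(pages, start=1):
        for match in _pattern(word).finditer(text):
            start = max(0, match.start() - width)
            hits.append((number, text[start:match.end() + width]))
    return hits

def pages_for_term(pages, term):
    """Return the 1-based page numbers on which the term occurs."""
    return [n for n, text in enumerate(pages, start=1) if _pattern(term).search(text)]

def add_term(index, pages, term):
    """Add a reviewed term to the index and look up its page numbers."""
    index[term] = pages_for_term(pages, term)
    return index

pages = [
    "Position accuracy is limited by beam width.",
    "Speed accuracy improves with dwell time.",
    "Both position accuracy and height accuracy matter here.",
]
index = {}
for term in ("Position Accuracy", "Speed Accuracy", "Height Accuracy"):
    add_term(index, pages, term)
print(index)
```

Here `occurrences(pages, "accuracy")` plays the part of clicking on Accuracy and stepping through each hit, and `add_term` is the "add to the found list and fill in the page numbers" step.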
MajP -FYI. I only did a cursory Google search on "book Indexing Software" and found those examples. You may find other free or inexpensive ones that are better.
Out of my own interest, I will continue to work on this a little when I get time. I think I have a lot of the pieces from my other tool that could be the basis for the application. If you check back here I will PM you if I ever get anything useable.
I am currently trying to code the Rapid Automatic Keyword Extraction (RAKE) algorithm. This will greatly improve what I have.
Rapid Keyword Extraction (RAKE) Algorithm in Natural Language Processing
Rapid Automatic Keyword Extraction (RAKE) is a Domain-Independent keyword extraction algorithm in Natural Language Processing.
www.analyticsvidhya.com
This will allow me to identify Key Terms and not just single words. So in my example in thread #25 it would then find
"Amazon best sellers"
Not the meaningless single words of
Amazon
best
sellers
The algorithm in theory is very simple. However, implementing it in code efficiently on a large set of data will take significant coding. If you are saving all the words of a document you end up with very large arrays (or other data structures) and then need to search them efficiently to apply your algorithms.
This will probably get me closer to a 75% working solution. The article describes how I can find terms like
- Justice Department
but not
- Department of Justice
The "of" messes things up in the RAKE algorithm.
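To show why the "of" gets in the way, here is a minimal RAKE-style sketch in Python (not the Access/VBA version being coded in this thread). It follows the usual description of the algorithm: split the text into candidate phrases at punctuation and stop words, then score each phrase as the sum of degree/frequency over its words. The tiny stop list, the function names, and the sample sentence are assumptions for illustration only.

```python
import re
from collections import defaultdict

# Assumption: a tiny illustrative stop list; real lists have several hundred words.
STOP_WORDS = {"a", "an", "and", "are", "as", "by", "for", "in", "is",
              "of", "on", "the", "to", "was", "with"}

def candidate_phrases(text):
    """Split text into candidate phrases at punctuation and stop words."""
    phrases = []
    for fragment in re.split(r'[.,;:!?()"\n]+', text.lower()):
        current = []
        for word in re.findall(r"[a-z0-9']+", fragment):
            if word in STOP_WORDS:
                if current:
                    phrases.append(tuple(current))
                current = []
            else:
                current.append(word)
        if current:
            phrases.append(tuple(current))
    return phrases

def rake(text):
    """Return (phrase, score) pairs, best first; score = sum of degree/frequency."""
    phrases = candidate_phrases(text)
    freq = defaultdict(int)
    degree = defaultdict(int)
    for phrase in phrases:
        for word in phrase:
            freq[word] += 1
            degree[word] += len(phrase)  # co-occurrences, counting the word itself
    scores = {" ".join(p): sum(degree[w] / freq[w] for w in p) for p in set(phrases)}
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

text = ("The Justice Department is in Washington. "
        "The Department of Justice is in Washington.")
for phrase, score in rake(text):
    print(phrase, score)
```

Because "of" is a stop word, "Department of Justice" is cut into the separate candidates "department" and "justice", while "Justice Department" survives as one phrase and gets the top score.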
From what I read there is a lot of cheap software that is very limited, and may not save a lot of work/time.
I've gone right off Index Creator; it fails to join any of my test word extraction exercises into even two-word phrases.
Dedicated Indexing Software
This list of dedicated software geared toward the needs of professional indexers is for informational purposes only. It is not intended to be a comprehensive list of all tools that indexers may use in the course of their work. ASI does not endorse any product.
CINDEX™ (for Windows, and Macintosh)
Indexing Research
Tel: (585) 413-1819
Email: info@indexres.com
URL: http://www.indexres.com
Users who formerly purchased CINDEX™ and obtained support through Leverage Technologies, Inc. are invited to contact Indexing Research for ongoing support and information.
Index Manager (for Windows and Macintosh)
Klarso GMBH
Berlin, Germany
Email: info@index-manager.net
URL: http://index-manager.net
Pilar Wyman, Regional Sales Manager, USA/Canada
Email: wyman@index-manager.net
Tel: (443) 336-5497
Macrex™ (for Windows)
Wise Bytes — Macrex Support Office, North America
Tech. Support: (888) 348-4292
Sales: (888) 348-4292
Email: macrexna@gmail.com
URL: http://www.macrex.com
In North America:
20% discount for ASI members;
$200 discount for students & instructors of approved indexing courses.
SKY Index™ Professional (for Windows)
SKY Software
Tel: (540) 751-4336
Email: sales@sky-software.com
URL: http://www.sky-software.com
TExtract book indexing software
Harry Bego, Texyz
The Netherlands
Tel: +31-30-6700318
Fax: +31-30-3100271
Email: info@texyz.com
URL: http://www.texyz.com
I am definitely no authority on this. I just did a few web searches, but this is a lot bigger field than I ever thought. Doing a good index is both art and science as far as I can tell, and fully automating it is unlikely. That is why there are expensive applications and services. I do not think there is a free lunch. Either you spend a lot of time or pay for the service.
I've printed off your list and will investigate.
Ah, no free lunch. That figures .... Hey, are you saying that AI (Artificial Intelligence) is a forlorn hope/fear? I love the succinct phrases you use. Just for the record, I came across this site - https://docs.marklogic.com/guide/concepts/indexing#id_40941 Those guys haven't given up hope. Maybe somebody is paying them good money to pursue this "dream" - eg NASA or various government militaries. And maybe the professional indexers are (still) in for the long haul. I would love to have known about that profession (say) 60 years ago. Damn. And I've lost the keys to my time machine. Like you, MajP, I seem to have learned quite a bit in the past week or so re book indexing. I wouldn't have missed it for quids .... EdK