Import complex text file into Access

That is correct!
 
May I add that it wouldn't matter if some fields are present or not on different records, that can be easily handled.

The four main things I would consider are:
1. Do we have all the field names that could ever be present?
2. The maximum number of citations.
3. Whether the first word of any text can be the same as a field name, e.g.
Code:
Database                    <----Field
Database is part of ....    <----Text
4. Is there always going to be a field name followed by its corresponding text? You don't want to import data into a field it doesn't belong to.

3 can be easily handled too.
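
One way point 3 might be handled, as a minimal sketch: treat a line as a field header only when the whole line is exactly a field name, so text that merely starts with the same word is left alone. This is written in Python just to show the logic (the same test could be done in VBA); `FIELD_NAMES` and `is_field_header` are hypothetical names, and the sketch assumes each field name sits on a line of its own.

```python
# Field names taken from the list given in this thread.
FIELD_NAMES = {
    "Citation", "Database", "Author", "Title",
    "Source", "Abstract", "Language", "Publication Type",
}


def is_field_header(line: str) -> bool:
    """Return True only when the whole line (ignoring surrounding
    whitespace) is exactly one of the known field names."""
    return line.strip() in FIELD_NAMES


print(is_field_header("Database"))                  # a field header
print(is_field_header("Database is part of ...."))  # ordinary text
```

This would not help if the real data contains a line of text that consists of nothing but a field name.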

I'm going back to the background.
 
I will have the database with all these fields.
Citation
Database
Author
Title
Source
Abstract
Language
Publication Type


Why do you need the Field "Citation" in the Database Table? Surely that is only necessary for the email in that it indicates the different blocks of text. It will not be necessary in the database Table?

Please explain...
 

1. Do we have all the field names that could ever be present?
These are all fields:
Citation
Database
Author
Title
Source
Abstract
Language
Publication Type

2. The maximum number of citations.
I don't know, but probably no more than 200. I don't think it matters, though: the loop will run through to the end of the file anyway, right?

3. Whether the first word of any text can be the same as a field name.
No. It starts with the text.
 
As for why the "Citation" field is needed in the table: the users asked for it. But if it's easier without it, then let's do it without it.
 
On the maximum number of citations: from your point of view it doesn't matter, but from our view (well, at least mine for now), it makes a big difference in the approach.
 
Sergo

Please check previous posts... There are a few you haven't answered yet.
 
About the "Citation" field that the users asked for: you will need to check with the users.

They may be expecting the following in the citation field:-
Citation 1.
Citation 2.
Citation 3.
etc
 
To answer #4 (is there always going to be a field name followed by its corresponding text?):
I think so! It will always be the same structure, except for the last field, "Publication Type", which is sometimes missing.
 
And the other thing as well, does the text file contain line numbers?
 
On whether the number of citations matters: it does if you first transfer the text into a memo field, which accepts no more than about 65,000 characters.
 
What I'm talking about is the programming side of things, not about character limitations. Anyway, Uncle Gizmo is on it. I was only dropping hints. :)
 
I will follow up with the users about what they expect in the citation field. But if it's too complicated, they will understand and accept it.
 
I see, thank you! :)
 
Please find attached DB and text file, this is as far as I have got.
I have slept on it, but still no inspiration is forthcoming.

I have the feeling that I'm digging myself into a deeper and deeper hole with this approach, so guidance would be appreciated.

I am also aware that there is NO REAL data, hence there WILL most likely be other problems to solve, like: what if the section words:-

Citation
Database
Author
Title
Source
Abstract
Language
Publication Type

appear in the actual text?

The other thing is that some of the text may be over 255 characters long, so should we use a memo field?

My code currently creates an Info table which stores the line number of the key word, and the section number it is in.

Code:
ID --- SectionName --- Current Citation --- LineNumber
576 --- Citation ---------- 3 --------------- 3
577 --- Database ---------- 3 --------------- 5
578 --- Author ------------ 3 --------------- 7
579 --- Title ------------- 3 --------------- 9
580 --- Source ------------ 3 --------------- 11
581 --- Abstract ---------- 3 --------------- 13
582 --- Language ---------- 3 --------------- 20
583 --- Citation ---------- 23 -------------- 23
584 --- Database ---------- 23 -------------- 24
585 --- Author ------------ 23 -------------- 26
586 --- Title ------------- 23 -------------- 28
587 --- Source ------------ 23 -------------- 30
588 --- Abstract ---------- 23 -------------- 32
589 --- Language ---------- 23 -------------- 47
590 --- Citation ---------- 50 -------------- 50
591 --- Database ---------- 50 -------------- 51
592 --- Author ------------ 50 -------------- 53
593 --- Title ------------- 50 -------------- 55
594 --- Source ------------ 50 -------------- 57
595 --- Abstract ---------- 50 -------------- 59
596 --- Language ---------- 50 -------------- 63
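
For readers without the attachment, here is a rough sketch of how an index like the one above could be built. It is in Python only to show the logic, not the code from the attached DB. `build_info` is a hypothetical name, and the sketch assumes every keyword sits alone on its own line and that line numbers are 1-based.

```python
FIELD_NAMES = {
    "Citation", "Database", "Author", "Title",
    "Source", "Abstract", "Language", "Publication Type",
}


def build_info(lines):
    """Return rows of (section_name, current_citation, line_number).

    current_citation is the line number of the most recent "Citation"
    keyword, mirroring the "Current Citation" column in the table above.
    """
    info = []
    current_citation = None  # stays None until the first Citation line
    for line_number, line in enumerate(lines, start=1):
        name = line.strip()
        if name not in FIELD_NAMES:
            continue  # ordinary text line, not a keyword
        if name == "Citation":
            current_citation = line_number
        info.append((name, current_citation, line_number))
    return info


sample = ["", "", "Citation", "", "Database", "Medline", "Author", "Smith J"]
for row in build_info(sample):
    print(row)
```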

The approach I was taking was to use this data to control the insertion of the text into the DB.

The next problem to solve is how to get the text delimiter, i.e. the line where each section's text ends:

581 --- Abstract ---------- 3 --------------- 13
582 --- Language ---------- 3 --------------- 20

i.e. Abstract ends at line 19 (20 - 1).

And how to get the delimiter for:-
596 --- Language ---------- 50 -------------- 63

Then the lines will need to be concatenated.

So the hole is getting deeper, and I'm not confident the approach will lead to a solution.
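
A possible way out of the delimiter problem, sketched in Python purely to illustrate the logic: each section ends one line before the next keyword in the Info table, and the last section of all simply runs to the end of the file. `extract_sections` is a hypothetical name; the sketch assumes the Info rows are (section name, current citation, line number) with 1-based line numbers, and that nothing follows the last citation in the file.

```python
def extract_sections(lines, info):
    """Group section text by citation.

    lines: the file as a list of strings (index 0 is line 1).
    info:  rows of (section_name, current_citation, line_number),
           sorted by line_number.
    Returns {current_citation: {section_name: concatenated_text}}.
    """
    records = {}
    for i, (name, citation, start) in enumerate(info):
        if i + 1 < len(info):
            end = info[i + 1][2] - 1   # line before the next keyword
        else:
            end = len(lines)           # last section: run to end of file
        # Text occupies lines start+1 .. end (1-based), i.e. the slice
        # lines[start:end] in 0-based indexing. Blank lines are dropped.
        body = [ln.strip() for ln in lines[start:end] if ln.strip()]
        records.setdefault(citation, {})[name] = " ".join(body)
    return records


lines = ["Citation", "Abstract", "First line", "second line", "",
         "Language", "English"]
info = [("Citation", 1, 1), ("Abstract", 1, 2), ("Language", 1, 6)]
print(extract_sections(lines, info))
```

Joining with a space is only one choice; keeping the line breaks would work just as well if the memo field should preserve the original layout.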
 

I like your strategy! I wouldn't even have guessed to go in this direction. I just ran it with the real data and you're right, it did find the same words in the text in some locations. Thank you anyway for your hard work!
What if we make it easy? I just want to give the users at least one option; otherwise they can continue to copy and paste.
What if we make 2 fields: Citation# and Citation_Data? It would look like this.
Citation# ----- Citation_Data
1-----------------(insert whole Citation1)
2-----------------(insert whole Citation2)
3-----------------(insert whole Citation3)

What do you think?
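
The two-field idea could look roughly like the sketch below, again in Python just to show the logic. `split_citations` is a hypothetical name. The big assumption is that every citation block starts with a line that is either the bare word "Citation" or "Citation" followed by a number (such as "Citation 1."); the real file would need to be checked for that.

```python
import re

# Assumed block marker: "Citation", "Citation 1", "Citation 1." and so on.
CITATION_START = re.compile(r"Citation(\s+\d+\.?)?")


def split_citations(text):
    """Split the file text into whole citation blocks.

    Returns a list of (citation_number, citation_data) pairs, numbered
    from 1 in file order. Anything before the first marker is ignored.
    """
    blocks = []
    current = None
    for line in text.splitlines():
        if CITATION_START.fullmatch(line.strip()):
            if current is not None:
                blocks.append(current)
            current = [line]            # start a new block
        elif current is not None:
            current.append(line)
    if current is not None:
        blocks.append(current)
    return [(n, "\n".join(b).strip()) for n, b in enumerate(blocks, start=1)]


sample = "email header\nCitation 1.\nDatabase\nMedline\nCitation 2.\nAuthor\nSmith"
for number, data in split_citations(sample):
    print(number, repr(data))
```

Because only the block marker is matched, field names that turn up inside the text no longer matter, and each block could go into a memo field as it is.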
 
Alright, I looked into this and ironed out all those issues in my version. But I still need an answer to my last post.
 
Sergo, there's still this anomaly: Author and Authors. What's happening there?

Thank you for pointing this out! I did look and found that some citations have the "s" and some do not. That is another reason it will not work. I need to talk to the users, because before starting this project I asked them whether all labels (fields) in a citation are consistent and the structure is the same. They told me "Yes".
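
If the labels turn out to vary only in small ways like this, one option is to map each variant onto a single field name before matching. A minimal Python sketch of that idea follows; `ALIASES` and `canonical_field` are hypothetical names, and "Authors" is the only variant known from this thread, so any other entries would have to come from the real data.

```python
CANONICAL = {
    "Citation", "Database", "Author", "Title",
    "Source", "Abstract", "Language", "Publication Type",
}

# Known label variants mapped to the field they belong to.
# Only "Authors" has been seen so far; extend as the users report more.
ALIASES = {"Authors": "Author"}


def canonical_field(line):
    """Return the canonical field name for a header line, or None
    when the line is not a recognised field label."""
    name = line.strip()
    name = ALIASES.get(name, name)
    return name if name in CANONICAL else None


print(canonical_field("Authors"))
print(canonical_field("Author"))
print(canonical_field("Authors of this study"))
```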
 
