Compare string with records in table (query?) (1 Viewer)

Noruen

Member
Local time
Today, 13:50
Joined
Jul 7, 2020
Messages
46
Hi!

I have bloody dirty database which I'm trying to clean up. It is basically reference management database I created by export of my references from EndNote.

My issue is that AUTHORS Field (which is temporary until I do cleanup) is string of concatenated authors with “;”as separator. It is easy to use VBA to split this field into separate strings based on separator. However I realized I'm stucked with next step. In Excel I created table of all authors and imported this table as tblAuthors into my database. And now I would like to analyze my unstringed authors one by one against this table like:

string: Brown, A.D. … is there any “Brown” in tblAuthors.LastName?

it is easy to do search for Brown in tblAuthors.LastName with my string (if I would be able to extract only last name, what would be quite difficult). However I would need something like “reverse query” where I can compare this one string to every possible LastName in tblAuthors. Like: “WHERE strAuthor Like *tblAuthors.LastName*”.

maybe the solution is clear like the sunny day but I'm stucked like hell :( thanks for your help!
 

The_Doc_Man

Immoderate Moderator
Staff member
Local time
Today, 07:50
Joined
Feb 28, 2001
Messages
26,996
Since you suggest this might be a once-off, your solution of a query involving a LIKE operator is not necessarily wrong. The double-ended wild card of an asterisk at each end of the search string WILL prevent use of indexes. However, how many records are we talking? A few hundred? Maybe a few thousand? For a once-off, that isn't bad.

As to taking the last name when you have "lastname, intials or name" then look at using the INSTR function to locate the first comma and then take the LEFT function of the string to that point. Actually that point - 1 because you don't want to include the comma.

Code:
AuthString = LEFT( UnstringedAuthor, INSTR( 1, UnstringedAuthor, "," ) - 1 )

The LEFT function takes the leftmost part of the string in the size you specify, and you can specify the INSTR (which returns a numeric position) to specify the length. If you are going to diddle with strings like this, Access VBA has some pretty decent string functions. Here is INSTR.


You can also look up LEFT, RIGHT, MID, LEN, and TRIM as useful functions for string diddling.
 

Noruen

Member
Local time
Today, 13:50
Joined
Jul 7, 2020
Messages
46
Since you suggest this might be a once-off, your solution of a query involving a LIKE operator is not necessarily wrong. The double-ended wild card of an asterisk at each end of the search string WILL prevent use of indexes. However, how many records are we talking? A few hundred? Maybe a few thousand? For a once-off, that isn't bad.

As to taking the last name when you have "lastname, intials or name" then look at using the INSTR function to locate the first comma and then take the LEFT function of the string to that point. Actually that point - 1 because you don't want to include the comma.

Code:
AuthString = LEFT( UnstringedAuthor, INSTR( 1, UnstringedAuthor, "," ) - 1 )

The LEFT function takes the leftmost part of the string in the size you specify, and you can specify the INSTR (which returns a numeric position) to specify the length. If you are going to diddle with strings like this, Access VBA has some pretty decent string functions. Here is INSTR.


You can also look up LEFT, RIGHT, MID, LEN, and TRIM as useful functions for string diddling.
Thanks DocMan!

well this is jist part of my problem. The other thing is that I want to find that extracted name in table of cleaned names and eventually run append query. My idea is that it would work like:

I have source string: Brown AJ;Peach P;Lavender KL
Now I run first loop and I extract the first name:

Brown AJ

Now is the step I dont know how to perform: how I can search if, in this extracted string, there is any match of the lastName from tblAuthors? The other way it is easy, but I cannot consistently extract Last name from this string as some authors have only initials, sometimes Last and First name are switched so only consistent pattern in concatenated Authors string is the separator…

we are talking about 400 references with 20 average, so 400*20 authors, 8000 i total :)
 

MajP

You've got your good things, and you've got mine.
Local time
Today, 08:50
Joined
May 21, 2018
Messages
8,463
I think your process may be different, and not perfect. But do it in many steps. I assume this is a many to many but do not know what the main table is. References? I would assume you need a junction table

tblReferences_Authors
ReferenceID_FK
AuthorID_FK 'tbd'
ExtractedAuthorName


So now a reference can have many authors and an author can be assigned to many references.
1. I would extract the ReferenceID and the AuthorName from the concatenation.
For my first record I would get something like

1 Brown, A. D.
1 Author Two
1 Author Three

2. Step Two. Run update queries where you get matches and update the real AuthorID_FK from your new table
You will do several different updates. First find records of the form FirstName LastName
where ExtractedAuthorName = AuthorTable.FirstName & " " AuthorTable.LastName
or in the Brown, A. D. case join
Where ExtractedAuthorName = AuthorTable.LastName & ", " & left(Authortable.firstName,1) & ". " & AuthorTable.MiddleInitial
just keep linking different forms.

You probably will get 90% with 4 different forms
FirstName LastName
LastName, FirstName
LastName, FI. MI.
Then a handful you update yourself.

3. Once done you can get rid of the extracted author name field.
 

MajP

You've got your good things, and you've got mine.
Local time
Today, 08:50
Joined
May 21, 2018
Messages
8,463
If you can send these tables in a database with the pertinent fields I can do this pretty quick.
In the reference table I would just need referencID and the author field. You could export a query. Then would need your author table.
 

Noruen

Member
Local time
Today, 13:50
Joined
Jul 7, 2020
Messages
46
Since you suggest this might be a once-off, your solution of a query involving a LIKE operator is not necessarily wrong. The double-ended wild card of an asterisk at each end of the search string WILL prevent use of indexes. However, how many records are we talking? A few hundred? Maybe a few thousand? For a once-off, that isn't bad.

As to taking the last name when you have "lastname, intials or name" then look at using the INSTR function to locate the first comma and then take the LEFT function of the string to that point. Actually that point - 1 because you don't want to include the comma.

Code:
AuthString = LEFT( UnstringedAuthor, INSTR( 1, UnstringedAuthor, "," ) - 1 )

The LEFT function takes the leftmost part of the string in the size you specify, and you can specify the INSTR (which returns a numeric position) to specify the length. If you are going to diddle with strings like this, Access VBA has some pretty decent string functions. Here is INSTR.


You can also look up LEFT, RIGHT, MID, LEN, and TRIM as useful functions for string diddling.
I think your process may be different, and not perfect. But do it in many steps. I assume this is a many to many but do not know what the main table is. References? I would assume you need a junction table

tblReferences_Authors
ReferenceID_FK
AuthorID_FK 'tbd'
ExtractedAuthorName


So now a reference can have many authors and an author can be assigned to many references.
1. I would extract the ReferenceID and the AuthorName from the concatenation.
For my first record I would get something like

1 Brown, A. D.
1 Author Two
1 Author Three

2. Step Two. Run update queries where you get matches and update the real AuthorID_FK from your new table
You will do several different updates. First join by FirstName LastName
where ExtractedAuthorName = AuthorTable.FirstName & " " AuthorTable.LastName
or in the Brown, A. D. case join
Where ExtractedAuthorName = AuthorTable.LastName & ", " & left(Authortable.firstName,1) & ". " & AuthorTable.MiddleInitial
just keep linking different forms.

You probably will get 90% with 4 different forms
FirstName LastName
LastName, FirstName
LastName, FI. MI.
Then a handful you update yourself.

3. Once done you can get rid of the extracted author name field.
My mistake. I didn't made myself clear…

1) I already have table where all authors are (i did it in Excel)

so I have like

AuthorIDLastName
1Brown
2Lavender
3Peach
4Bleach
Etc

I have junction table (I want to populate):

PublicationID <> AuthorID

What I want to do by VBA is to extract from that string of concatenated Authors each one author one by one (I already have VBA function to do so) bur what I dont know how to do, when I have extracted eg. Brown AJ to search. Fundamentally, I know how to find match of extracted string in tblAuthors but I dont know how to do it the other way: search IN STRING for all possible authors from tblAuthors and if there is match I can run Append query to populate junction table…
 

Noruen

Member
Local time
Today, 13:50
Joined
Jul 7, 2020
Messages
46
I think your process may be different, and not perfect. But do it in many steps. I assume this is a many to many but do not know what the main table is. References? I would assume you need a junction table

tblReferences_Authors
ReferenceID_FK
AuthorID_FK 'tbd'
ExtractedAuthorName


So now a reference can have many authors and an author can be assigned to many references.
1. I would extract the ReferenceID and the AuthorName from the concatenation.
For my first record I would get something like

1 Brown, A. D.
1 Author Two
1 Author Three

2. Step Two. Run update queries where you get matches and update the real AuthorID_FK from your new table
You will do several different updates. First find records of the form FirstName LastName
where ExtractedAuthorName = AuthorTable.FirstName & " " AuthorTable.LastName
or in the Brown, A. D. case join
Where ExtractedAuthorName = AuthorTable.LastName & ", " & left(Authortable.firstName,1) & ". " & AuthorTable.MiddleInitial
just keep linking different forms.

You probably will get 90% with 4 different forms
FirstName LastName
LastName, FirstName
LastName, FI. MI.
Then a handful you update yourself.

3. Once done you can get rid of the extracted author name field.
Ah, sorry! I overlooked your post, sorry… it is quite interesting solution!
I will try it :) Thanks a lot!
 

MajP

You've got your good things, and you've got mine.
Local time
Today, 08:50
Joined
May 21, 2018
Messages
8,463
What I want to do by VBA is to extract from that string of concatenated Authors each one author one by one (I already have VBA function to do so) bur what I dont know how to do, when I have extracted eg. Brown AJ to search. Fundamentally, I know how to find match of extracted string in tblAuthors but I dont know how to do it the other way: search IN STRING for all possible authors from tblAuthors and if there is match I can run Append query to populate junction table
No I understand, I am suggesting to do it differently.

I understand you have an author table. I assume Last Name, First Name, ... More than just last name.

1. I am going to populate the junction table first. It will have the PublicationID and a record for each extracted name (temporary). So I should have around 2000 records.
2. Now instead of searching a concatenated string or the publication table. I am joining the author table to the junction table where I can get a partial name match on the extracted field. This will be a lot easier.

I see we posted at the same time.
Do you need help with the extract and populate of the junction table? If you send me the junction table and original table I can do it.
 

Noruen

Member
Local time
Today, 13:50
Joined
Jul 7, 2020
Messages
46
No I understand, I am suggesting to do it differently.

I understand you have an author table. I assume Last Name, First Name, ... More than just last name.

1. I am going to populate the junction table first. It will have the PublicationID and a record for each extracted name (temporary). So I should have around 2000 records.
2. Now instead of searching a concatenated string or the publication table. I am joining the author table to the junction table where I can get a partial name match on the extracted field. This will be a lot easier.

I see we posted at the same time.
Do you need help with the extract and populate of the junction table? If you send me the junction table and original table I can do it.
Yes, sorry: I believe I can do that so I will try it today night and I will let you know. Many thanks! :)
 

MajP

You've got your good things, and you've got mine.
Local time
Today, 08:50
Joined
May 21, 2018
Messages
8,463
Here is a demo to help.
I have a publication table
publications publications

PublicationIDAuthors
1​
Brown, A. D.; John Smith; Mark Twain
2​
Stephen King; Bill Shakespeare;Edgar Allen Poe
And an unpopulated junction table.
I run this code.


Code:
Public Function ExtractNames()
  Const SourceTable = "Publications"
  Const JunctionTable = "Publications_Authors"
  Const AuthorField = "Authors"
  Const PubIDField = "PublicationID"

  Dim rs As DAO.Recordset
  Dim Author As String
  Dim PubID As Long
  Dim arrAuthors() As String
  Dim i As Integer
  Dim strSql As String

  Set rs = CurrentDb.OpenRecordset(SourceTable)
  Do While Not rs.EOF
    arrAuthors = Split(rs.Fields(AuthorField), ";")
    PubID = rs.Fields(PubIDField)
    For i = 0 To UBound(arrAuthors)
      Author = Trim(arrAuthors(i))
      strSql = "Insert into " & JunctionTable & "(PublicationID_FK,ExtractedName) VALUES (" & PubID & ", '" & Author & "')"
      Debug.Print strSql
      CurrentDb.Execute strSql
     Next i
    rs.MoveNext
  Loop
End Function

I get this result
publications_Authors publications_Authors

IDPublicationID_FKAuthorID_FkExtractedName
1​
1​
0​
Brown, A. D.
2​
1​
0​
John Smith
3​
1​
0​
Mark Twain
4​
2​
0​
Stephen King
5​
2​
0​
Bill Shakespeare
6​
2​
0​
Edgar Allen Poe
The extracted name is just a place holder until I can get a match.
Now I have an author Table
TblAuthors TblAuthors

AuthorIDFirstNameMiddleNameLastName
1​
EdgarAllanPoe
2​
StephenKing
3​
BrownAlexDavid
Then start update the different formats
FirstName MiddleName LastName
Code:
UPDATE TblAuthors, publications_Authors
SET publications_Authors.AuthorID_Fk = [AuthorID]
WHERE (((publications_Authors.ExtractedName)=[TblAuthors].[FirstName] & " " & [TblAuthors].[MiddleName] & " " & [TblAuthors].[LastName]));
And I get EAP updated.
publications_Authors publications_Authors

IDPublicationID_FKAuthorID_FkExtractedName
1​
1​
0​
Brown, A. D.
2​
1​
0​
John Smith
3​
1​
0​
Mark Twain
4​
2​
0​
Stephen King
5​
2​
0​
Bill Shakespeare
6​
2​
1​
Edgar Allen Poe

Now just change the where statements to handle different name formats. Your done.
 

MajP

You've got your good things, and you've got mine.
Local time
Today, 08:50
Joined
May 21, 2018
Messages
8,463
Here is the A. D., Brown format
Code:
UPDATE TblAuthors, publications_Authors SET publications_Authors.AuthorID_Fk = [AuthorID]
WHERE (((publications_Authors.ExtractedName)=[TblAuthors].[LastName] & ", " & Left([TblAuthors].[Firstname],1) & ". " & Left([TblAuthors].[MiddleName],1) & "."));
After fixing the name in the author table.
 

Noruen

Member
Local time
Today, 13:50
Joined
Jul 7, 2020
Messages
46
Here is a demo to help.
I have a publication table
publications publications

PublicationIDAuthors
1​
Brown, A. D.; John Smith; Mark Twain
2​
Stephen King; Bill Shakespeare;Edgar Allen Poe
And an unpopulated junction table.
I run this code.


Code:
Public Function ExtractNames()
  Const SourceTable = "Publications"
  Const JunctionTable = "Publications_Authors"
  Const AuthorField = "Authors"
  Const PubIDField = "PublicationID"

  Dim rs As DAO.Recordset
  Dim Author As String
  Dim PubID As Long
  Dim arrAuthors() As String
  Dim i As Integer
  Dim strSql As String

  Set rs = CurrentDb.OpenRecordset(SourceTable)
  Do While Not rs.EOF
    arrAuthors = Split(rs.Fields(AuthorField), ";")
    PubID = rs.Fields(PubIDField)
    For i = 0 To UBound(arrAuthors)
      Author = Trim(arrAuthors(i))
      strSql = "Insert into " & JunctionTable & "(PublicationID_FK,ExtractedName) VALUES (" & PubID & ", '" & Author & "')"
      Debug.Print strSql
      CurrentDb.Execute strSql
     Next i
    rs.MoveNext
  Loop
End Function

I get this result
publications_Authors publications_Authors

IDPublicationID_FKAuthorID_FkExtractedName
1​
1​
0​
Brown, A. D.
2​
1​
0​
John Smith
3​
1​
0​
Mark Twain
4​
2​
0​
Stephen King
5​
2​
0​
Bill Shakespeare
6​
2​
0​
Edgar Allen Poe
The extracted name is just a place holder until I can get a match.
Now I have an author Table
TblAuthors TblAuthors

AuthorIDFirstNameMiddleNameLastName
1​
EdgarAllanPoe
2​
StephenKing
3​
BrownAlexDavid
Then start update the different formats
FirstName MiddleName LastName
Code:
UPDATE TblAuthors, publications_Authors
SET publications_Authors.AuthorID_Fk = [AuthorID]
WHERE (((publications_Authors.ExtractedName)=[TblAuthors].[FirstName] & " " & [TblAuthors].[MiddleName] & " " & [TblAuthors].[LastName]));
And I get EAP updated.
publications_Authors publications_Authors

IDPublicationID_FKAuthorID_FkExtractedName
1​
1​
0​
Brown, A. D.
2​
1​
0​
John Smith
3​
1​
0​
Mark Twain
4​
2​
0​
Stephen King
5​
2​
0​
Bill Shakespeare
6​
2​
1​
Edgar Allen Poe

Now just change the where statements to handle different name formats. Your done.
Amazing! Thanks a lot! Especially for that VBA part. Finally I can see how to work with arrays. You saved me many days of manual work.
 

Noruen

Member
Local time
Today, 13:50
Joined
Jul 7, 2020
Messages
46
Once again, thanks a lot @MajP ! I did some edits to your code as I already null-ed some of the TempAuthors fields manually and it returned error:

Code:
Public Function ExtractNames()
  Const SourceTable = "tblPublications"
  Const JunctionTable = "tblPublicationAuthors"
  Const AuthorField = "PubAuthorsTemp"
  Const PubIDField = "PublicationID"

  Dim rs As DAO.Recordset
  Dim Author As String
  Dim PubID As Long
  Dim arrAuthors() As String
  Dim i As Integer
  Dim strSql As String
  Dim strSQLB As String

  strSQLB = "SELECT PublicationID, PubAuthorsTemp FROM tblPublications WHERE (PubAuthorsTemp Is Not Null)"

  Set rs = CurrentDb.OpenRecordset(strSQLB)

  Do While Not rs.EOF
   
    If Not IsNull(AuthorField) Then
    arrAuthors = Split(rs.Fields(AuthorField), ";")
    PubID = rs.Fields(PubIDField)
    For i = 0 To UBound(arrAuthors)
      Author = Trim(arrAuthors(i))
      strSql = "INSERT INTO " & JunctionTable & " ( PublicationID, ExtractedName ) VALUES ( " & PubID & ", '" & Author & "')"
      Debug.Print strSql
      CurrentDb.Execute strSql
     Next i
    End If
    rs.MoveNext
  Loop
End Function

After it successfully filled my junction table I performed several queries with variations in criteria to find all unique authors (only count = 1 are unique):

SQL:
SELECT
    PA.PubAuthorID,
    PA.PublicationID,
    PA.ExtractedName,
    (SELECT First(A.NameLast) FROM tblAuthors AS A WHERE PA.ExtractedName Like "*" & A.NameLast) AS TheName,
    (SELECT Count(A.NameLast) FROM tblAuthors AS A WHERE PA.ExtractedName Like "*" & A.NameLast) AS TheTest,
    (SELECT First(A.AuthorID) FROM tblAuthors AS A WHERE PA.ExtractedName Like "*" & A.NameLast) AS TheAuthor
FROM
    tblPublicationAuthors AS PA
WHERE
    PA.ExtractedName is not null

Through Make Table queries and subsequently Update queries I have left like 400 authors I need to update manually but it is quite easy-peasy in comparison to 6173 records in total :)

So thanks again!
 

MajP

You've got your good things, and you've got mine.
Local time
Today, 08:50
Joined
May 21, 2018
Messages
8,463
I just wrote two recent threads on "automating" cleaning up data. I learned long ago that in most of these cases you cannot have a magical button that does everything. And if you go down that path of fully automating you will spend days trying to write code with an infinite amount of cases and conditions.

At the same time you can come up with a partial solution and then an interface that makes doing the updates far simpler. So maybe it is still a somewhat manual process but you can do it orders of magnitude faster.
So for the remaining 400, there may be a simple interface to aid in the updates. Here are two examples of which I speak.


So think of a simple form to aid in the 400. What about the 400, are these not being found. Can you post a few examples?
Can you describe a little how you manually update the remaining 400?
I think you can see in the first example what would be an impossible problem to fully automate, and a painful process to do completely manually, can be done in a few seconds with a partial solution and a helpful UI.
 

MajP

You've got your good things, and you've got mine.
Local time
Today, 08:50
Joined
May 21, 2018
Messages
8,463
If you upload the junction table and the author table I can build you a quick interface for the remaining 400. It would be a modification of this.

On the left you would have the junction table with the extracted string for those records that a match was not found. On the right the author list.
You select an unmatched record and it would allow you to pick how to search the authors for some partial match. You could expand or narrow down the search. Once you find the match you click on it and it will update the foreign key. My guess then it is a few seconds to match a record.
If you can get that down to about 5 seconds a record, you should be able to do it all in a half hour.
I assume you also may need to add records to the Author table from the extracted list. If so then you would want to automate that too. You would want to be able to click on an extracted name and have it give you best options how to add it as a new name and assign it to the junction table.
 

Noruen

Member
Local time
Today, 13:50
Joined
Jul 7, 2020
Messages
46
@MajP you are very convincing. I'm sending you my whole database :) There is no secret there, but please do not vomit: I'm really building things up on the go so it is in the state of mess.

tables of interest are tblPublicationAuthors and tblAuthors. In tblPublicationAuthors I have those extracted authors without assigned proper AuthorID.

Thanks for your help! :)
 

Attachments

  • ACLIDA.accdb
    3.9 MB · Views: 377

MajP

You've got your good things, and you've got mine.
Local time
Today, 08:50
Joined
May 21, 2018
Messages
8,463
I did some work and updated the Author and AuthorPub, Keyword, an KeywordPub tables. I did another extract. For now I removed the cascade deletes from the junction tables.
All the links are there because I based them on the extracted word and not the cleaned up word. You will see I built the author table by extracting all the names and saved the extracted name. I did the same on the junction.
So for example (and here is the problem) if an author was listed in the text incorrectly
Aaron, Miller;
tblAuthors tblAuthors
tblAuthors tblAuthors

AuthorIDNameFirstInitialFirstNameLastNameMiddleInitialMiddleSuffixExtractedName
57​
MillerMAaronAaron, Miller


It created a record in the junction table and the Author table. Without digging further I assume the first name is not Miller. In fact there is another Pub with
Miller, A. E.
tblAuthors tblAuthors

AuthorIDNameFirstInitialFirstNameLastNameMiddleInitialMiddleSuffixExtractedName
1141​
AAMillerEEMiller, A. E.
And I bet that is the correct name.
I am sure there are more things like that.

To fix this you are going to need some interface where you can search for potential duplicate authors. Then an easy way to pick which one to keep and reassign those publications to the correct author.
In this case you would see that Miller Aaron is wrong. You would want to be able to see like versions of that name. Then be able to pick the correct name and update any pubs assigned to 57 to 1141.
In this case you have some additional information. You can fill out NameFirst to "Aaron" in 1141

So I have not built that form yet. Doing this manually would be a pain.
Here is another example.
tblAuthors tblAuthors

AuthorIDNameFirstInitialFirstNameLastNameMiddleInitialMiddleSuffixExtractedName
1215​
RoviraRAlexAlex, Rovira

Question
1. You did not extract authors for pubs less than 150. Is there a reason for this?
2. On many author lists there are leading initials. Almost another format. I assume these might all be the same author just either a typo in the author list or something else. What is the rule on those leading initials?

tblAuthors tblAuthors
So this ended up creating three different authors.
Vermesch, J. P.
Vermesch, G.
Vermesch, A.

It is easy to see these possiblities, just will need to make an interface to allow you to pick on and reassign the child records.

ExtractedName
P. E. Vermersch, J-P
P. C. Vermersch, G
P. C. Vermersch, A.
I did not want to do much more until I know what you want to do with the first 150 publications. So every publications keyword and every publications author is mapped to a publication, what needs to be fixed is determine what is a duplicate and pick one to reassign the child records.
 

Attachments

  • ACLIDA_MajP.accdb
    6 MB · Views: 375

MajP

You've got your good things, and you've got mine.
Local time
Today, 08:50
Joined
May 21, 2018
Messages
8,463
FYI,
You may understand this, but just in case. Cleaning up an author record is not the same as reassigning duplicates.
So if you just had a record Miller, Aaron and no similar record that could mean the same thing, you might determine the last and first are in the wrong place. Since all the links are in place you can simply clean up the record and change the Last and First. Other names were in the form
Last, FiMi. Brown, AP
So it parsed the first name as "AP". I assume it is Brown, A. P. and would edit put the information in the correct columns.
I have a separate Middle Initial and First Initial field for now. These are not really needed until you finish clean up of duplicates. They help in finding similar things.
Also if not clear a "duplicate" in this context are two records that may have different field values but really are supposed to be the same author.
 

Noruen

Member
Local time
Today, 13:50
Joined
Jul 7, 2020
Messages
46
@MajP

very nice! Well, reason for ID > 150 is simoly that it was the last pub before that append query failed because of author that had ' in its name. I have had to replace rhis symbol with small dash and then run the rest of the pubs: from 151 and above. So in fact I extracted all the authors from pubs based on your code.

Regatding that names: the issue is how badly EndNote handles authors and how screwed the export was. I'm aware some authors are mismatched with weong initials. However I really do not know if there is any rule of how and why EndNote handled specific autors like it did.

So, now there is every author extracted and assigned to pub. I will check your edits :) thanks!
 

MajP

You've got your good things, and you've got mine.
Local time
Today, 08:50
Joined
May 21, 2018
Messages
8,463
I added in the last 150.
You have a lot of cleanup to do. You have have a lot of names in different formats that are probably the same name. You need to take all the partial information and make a good record. Then change all the other records to that author. I made an interface, but it still is a lot of work. Mostly becuase you have to decide what is the real name.

Clean1.jpg

The form has for subforms. In the upper left is a list of authors. You can filter and sort this any way you want since it is a datasheet. When you are on an author that authors publciations are listed below in the LL. In the UR are similar names.
Selected is Angel, Cronin and the UR shows you there is also a Cronin, Angel. My money is on Cronin, Angel.
You can then unassign Angel, Cronin from any publications and assign them to the likely Cronin, A. (If there are lots of pubs you can use copy and past of the name or the ID). You can also edit partial information. I know that Cronin, A is Cronin, Angel
Clean2.jpg

Once you unassign everything from a "bad name" it appears in the LR. You can delete that name.
Clean3.jpg
 

Attachments

  • ACLIDA_MajP3.accdb
    4.6 MB · Views: 367

Users who are viewing this thread

Top Bottom