Duplicates in 1 field (1 Viewer)

LHood

New member
Local time
Today, 20:50
Joined
Nov 10, 2021
Messages
2
Hi,

I am trying to remove duplicates from 1 field, for example in the field i have named as Address Line 1 the first record is "1 sample road, 1sampleroad, 1 Sample Road." then the next record is "2sampleroad, 2 Sample Road" then third record "3 Sample Road" which is how i would like it to look.

So its not every record , Is there a way of removing duplicates from the field?

Any help would be appreicated :)
 

MajP

You've got your good things, and you've got mine.
Local time
Today, 15:50
Joined
May 21, 2018
Messages
8,554
Some modification of this idea would work
 

The_Doc_Man

Immoderate Moderator
Staff member
Local time
Today, 14:50
Joined
Feb 28, 2001
Messages
27,223
If it is a small number of records, do it by hand. Doing this via programming will depend on the reliability of the faulty cases, and you have already show three cases that would become hard to program.

Your "1 Sample Road" case has two poorly formatted entries and one of them only differs from the "good" one by case, not spaces. In both cases, the LAST entry is apparently the good one, but it is not always 3rd. In one case it is 2nd. From a programming viewpoint, you need to be able to establish a rule before you can do anything with programming.

Also, the apparent delimiter is a comma. Will there ever be a case where a comma is a legitimate element of the address?
 

MajP

You've got your good things, and you've got mine.
Local time
Today, 15:50
Joined
May 21, 2018
Messages
8,554
Or a version of this

As stated in the threads the trick is to make a user interface that make finding possible duplicates very easy. Coming up with a single function to find all cases and fix it is not doable.
 

Pat Hartman

Super Moderator
Staff member
Local time
Today, 15:50
Joined
Feb 19, 2002
Messages
43,352
I did several consulting assignments for Readers' Digest. Their whole business was about mailing stuff. Not only did they mail out the magazine and other products they sold but they were a huge direct marketer so they were obsessed with identifying duplicates because they sure didn't want to waste their money sending multiple expensive mailings to the same person. To do this right, you need to separate the address parts into separate fields WHEN THEY ARE ENTERED, not after the fact. Of course, you can do a one time cleanup to get from where you are to where you want to be but you do not want to be in perpetual cleanup mode. It never ceases to amaze me the variations that people can come up with and new ones will always get past your code. So in the RD world an address had these parts
CompassPrefix (N, S, E, W, NE, NW, SE, SW)
HouseNumber
HouseLetter (this might contain 1/2 or possibly other fractions as well as letters)
StreetName
StreetType (Rd, St, Blvd, etc)
CompassSuffix (N, S, E, W, NE, NW, SE, SW)
Unit

There may have been more. It was a long time ago. They also insisted on zip+4 so they could pre-sort the mail. Everything they delivered to the PO had already gone though the first sort and was in trays ready for the trucks so they got a discount on their postage. for saving the local PO the sorting step.

Most applications mush the parts of street address but most applications don't care about duplicates or invalid strings. If you care, don't mush the data. Separate it into discrete parts.
 

Users who are viewing this thread

Top Bottom