get a number of unique values without separating values that belong to the same block (1 Viewer)

lala

Registered User.
Local time
Today, 10:45
Joined
Mar 20, 2002
Messages
741
This is a very tough problem to explain. Please ask any questions and i will do my best to answer them clearly.

I have the following dataset in a table called NR_PVO_120. How do i pick out a number (which can change but let's say, 6) of UNIQUE OtherIDs without excluding any OtherIDs under any fax numbers?

So, if you pick OtherID from Row7 you then also must pick OtherIDs from rows 8 and 9 because they have the same fax number. Basically, once you pick an OtherID you're then obligated to pick all OtherIDs that have the same fax number as the one you picked.

If the number requested (6 for this example) isn't possible then "the closest number possible but not exceeding" would be the rule.

For example, if you take OtherIDs from rows 1-10 you will get 6 unique OtherIDs but row 10 shares a fax with rows 11 and 12. You either need to take all 3 (but that will raise the unique count to 8, which isn't acceptable) or skip this OtherID and find one with a fax that has no other OtherIDs and that isn't on the result set already. My result of 6 UNIQUE OtherIDs will need to contain ALL OtherIDs under any fax the existing OtherIDs are connected to.

So one solution is to take rows 1-6, 26. Another is to take rows 1-4,10-14. There are more but you get the idea.

There will be many possibilities (the real dataset has tens of thousands of rows and the number of people requested will be around 10K), as long all OtherIDs connected to all faxes on the result set are part of the requested number (6 in this case) any combination would do.

A few notes.
1.
Getting as close as possible to the requested number is a requirement.

2.
Some OtherIDs will have a blank fax, they should only be included as a last resort (not enough OtherIDs for the requested number).


thank you


my table (NR_PVO_120)
Row OtherID Fax
1 11098554 2063504752
2 56200936 2080906666
3 11098554 7182160901
4 25138850 7182160901
5 56148974 7182232046
6 56530104 7182234134
7 25138850 7182234166
8 56148974 7182234166
9 11098554 7182234166
10 56597717 7182248132
11 56166294 7182248132
12 25138850 7182248132
13 56148974 7182390090
14 56226456 7182390090
15 56148974 7182395285
16 25138850 7182395285
17 56166614 7180930966
18 11098554 7180930966
19 56159509 7180930966
20 25138850 7185462234
21 56148974 7185462234
22 25138850 7185465013
23 56024315 7185465013
24 56115247 7185465281
25 25138850 7185465281
26 56148975 7185466029



A few sample outputs

one solution is taking rows 1-6 and 26.
OtherID
11098554
56200936
25138850
56148974
56530104
56148975

Another solution is taking rows 1-4 and 10-14.
OtherID
11098554
56200936
25138850
56024315
56115247
56148974

There are many more solutions.

This is for a fax campaign, we need to make sure no fax number is faxed twice, that all people connected to that fax number are contacted under one fax sent.

So the idea is to take all OtherIDs under ANY fax you end up using.
 

CJ_London

Super Moderator
Staff member
Local time
Today, 15:45
Joined
Feb 19, 2013
Messages
16,607
Your basic code for this would be something like this for a single selection

Code:
SELECT myTable_1.OtherID 
FROM myTable INNER JOIN myTable AS myTable_1 ON myTable.Fax=myTable_1.fax
WHERE myTable.OtherID=[Enter OtherID]

For a multiple selection, that depends on how you are making that selection your where clause might be this for example

Code:
....
WHERE myTable.OtherID in (11098554, 56200936, 11098554, 25138850...)
 

lala

Registered User.
Local time
Today, 10:45
Joined
Mar 20, 2002
Messages
741
interesting. i will need to write a loop that will test every combination to see how many unique OtherIDs i have and then keep tweaking up or down but i think it can be done. i will post if it works, thank you very much
 

CJ_London

Super Moderator
Staff member
Local time
Today, 15:45
Joined
Feb 19, 2013
Messages
16,607
i will need to write a loop that will test every combination to see how many unique OtherIDs i have
Seems a bit extreme. If you want to know how many duplicates you have for each fax, just use the query wizard to create a duplicates query on fax
 

lala

Registered User.
Local time
Today, 10:45
Joined
Mar 20, 2002
Messages
741
it's not about duplicates. my task is to pick out a certain number of unique OtherIDs. you gave me a query that picks all OtherIDs for all faxes, thank you for that.

so now i need to get my number (10K, let's say). so the loop will need to run your query for the first 10K providers and see how many uniques there is, it will be way over. then it will need to keep looping to get to my 10K
 

CJ_London

Super Moderator
Staff member
Local time
Today, 15:45
Joined
Feb 19, 2013
Messages
16,607
Use the TOP element of SQL -

Code:
SELECT TOP 10000 myTable_1.OtherID 
FROM myTable INNER JOIN myTable AS myTable_1 ON myTable.Fax=myTable_1.fax
 

lala

Registered User.
Local time
Today, 10:45
Joined
Mar 20, 2002
Messages
741
Use the TOP element of SQL -

Code:
SELECT TOP 10000 myTable_1.OtherID 
FROM myTable INNER JOIN myTable AS myTable_1 ON myTable.Fax=myTable_1.fax

nope, won't work, any number of OtherIDs i plug in your query will at best (but will never happen) return the same number of OtherIDs but usually much more. because any OtherID you end up picking will always bring other OtherIDs with the same fax


this question has been up on StackOverflow (not sure if i can add a link) for 4 days i think and 2 days with a bounty, no answers. i think it's more complex than that :)


ETA and i've been at it for a week. anyone i asked in person said it's not possible. but i don't believe that. there are always solutions. some are better than others but there are always ways. i have one in mind but it's pretty ugly. trying to work with yours now. will be just as ugly lol but it's ok.
 
Last edited:

CJ_London

Super Moderator
Staff member
Local time
Today, 15:45
Joined
Feb 19, 2013
Messages
16,607
please clarify what you want

is it 10000 fax numbers and you only want one otherID per fax number - in which case is there any requirement on which one?

Why do you need to start from otherID?
 

lala

Registered User.
Local time
Today, 10:45
Joined
Mar 20, 2002
Messages
741
i want 10000 unique people while making sure that every person picked also includes all people attached to any fax numbers they're attached to. almost a circular reference.

we are trying to make sure not to fax any fax number twice but the fax center can only proccess batches of 10K (sometimes a little more) and we need to be able to pick the 10K without leaving any people with the same fax number for later
 

CJ_London

Super Moderator
Staff member
Local time
Today, 15:45
Joined
Feb 19, 2013
Messages
16,607
Still not clear :confused:

Lets say you have 100,000 OtherID's spread across say 80,000 fax numbers.

You want to send 10,000 faxes to OtherID's, some of which will go to the same fax number at the same time, i.e. each fax is 'personalised' to a single OtherID

Fax 1 - Dear OtherID#1
Fax 2 - Dear OtherID#2
Fax 3 - Dear OtherID#3


not

Fax 1 - Dear OtherID#1, OtherID#2, OtherID#3

And if you run a second batch, how do you exclude those fax numbers which have already been sent? Do you have a sent flag or similar?
 

lala

Registered User.
Local time
Today, 10:45
Joined
Mar 20, 2002
Messages
741
Still not clear :confused:

Lets say you have 100,000 OtherID's spread across say 80,000 fax numbers.

You want to send 10,000 faxes to OtherID's, some of which will go to the same fax number at the same time, i.e. each fax is 'personalised' to a single OtherID

Fax 1 - Dear OtherID#1
Fax 2 - Dear OtherID#2
Fax 3 - Dear OtherID#3


not

Fax 1 - Dear OtherID#1, OtherID#2, OtherID#3

And if you run a second batch, how do you exclude those fax numbers which have already been sent? Do you have a sent flag or similar?

i'm sorry, i get this reaction from everyone i explain this to, i'm not good at explaining

here's how this goes in real life. we are faxing doctors that can practice at multiple offices. we do campaigns by state, so lets'say they say, give me 10K from CA. CA has 30K, so i pick out the 10K based on rules above.

the faxes will go like this

fax 1 Dear OtherIDs 1,2,5
fax 2 Dear OtherIDs 1,2,6,7,8 (because 2 can have more than one fax and that's OK)
fax 3 Dear OtherID 7,11,13,14,15,16,17,20

and so on

we need to make sure one fax machine doesn't get multiple faxes. one doc can get multiple faxes at multiuple offices

and yes, i do have a flag and they might or might not be excluded (depending on the criteria they give me when they ask for CA again)

i'm sure this is still not clear, i'm not sure how to explain it, sorry


ETA your query works nicely to give me all OtherIDs attached to any fax for any OtherIDs i pick. so i'm half way there with your help, i think looping is the only way. there will need to be a loop at some point to get the number as close as possible to 10k (or whatever number they use) but to not exceed it and to make sure all are included
 

maturner

New member
Local time
Today, 10:45
Joined
Dec 22, 2014
Messages
1
Just an observation, but your example is 10K out 0f 30K and you state that you have a flag that is used to possibly exclude certain records from the result set. Does that provide enough variance so that you dont have the same result set based on given criteria over and over? It seems that if you use the same criteria you would get the same result set and possibly not be utilizing the entire list for the selection, based on the limits of the count.
 

lala

Registered User.
Local time
Today, 10:45
Joined
Mar 20, 2002
Messages
741
Just an observation, but your example is 10K out 0f 30K and you state that you have a flag that is used to possibly exclude certain records from the result set. Does that provide enough variance so that you dont have the same result set based on given criteria over and over? It seems that if you use the same criteria you would get the same result set and possibly not be utilizing the entire list for the selection, based on the limits of the count.

sorry, i forgot to post, it's been solved

http://stackoverflow.com/questions/...out-separating-values-that-belong-to-the-same


but to answer
lets say there are 30K in the set, today they request 10, i give it to them and mark it as called today
in a month they request another 10. that 10 won't include the original 10 because it has been marked
now i mark the second 10 as called today
in a month they request the third, they will get the remaining 10 because the first 20 has been marked and mark the today's 10 as called today
 

Users who are viewing this thread

Top Bottom