Joining nulls (1 Viewer)

jaryszek

Registered User.
Local time
Today, 04:10
Joined
Aug 25, 2016
Messages
756
Hi Guys,

i want to ask you once again about nulls in tables.
I created 2 tables:
1592556444619.png


1592556459710.png


And now i create query to check if records in TempTable exists in Target Table:
1592556495839.png


The result seems to be ok even i have nulls here.
When i am joining much more fields, like 15 like that sometimes nulls is not joined.
1. Why this is somehow randomly working and sometimes not?
2. Is is always safe to change nulls to not used values like "-199999" fornumbers or "null" for strings to be sure that this will work?
3. What are your best practices?

thank you in advance for any answer,
Best,
Jacek
 

jaryszek

Registered User.
Local time
Today, 04:10
Joined
Aug 25, 2016
Messages
756
And adding database as an example
 

Attachments

  • Database2.accdb
    416 KB · Views: 108

plog

Banishment Pending
Local time
Today, 06:10
Joined
May 11, 2011
Messages
11,611
1. Why this is somehow randomly working and sometimes not?

That's a bad way to word/explain it. I mean, I see and understand your SQL, so to me it "works". My expectations of what it should return is different than yours. So, a better way for you to say it is it doesn't produce expected results and then to tell/show us what results you expect. Then we can help you generate the query you want.

2. Is is always safe to change nulls to not used values like "-199999" fornumbers or "null" for strings to be sure that this will work?

Again, you need to specifically define what "works" means. Example data would be best--show me the specific data in your query that isn't expected (either supposed to not be there or is supposed to be there but isn't)

The sample you uploaded the query produces 5 records. Is that correct? Are the values in those 5 records correct? If not tell me how they are unexpected. If that query is correct then your sample data is of no value because it doesn't demonstrate your issue and you should upload a database that does.
 

The_Doc_Man

Immoderate Moderator
Staff member
Local time
Today, 06:10
Joined
Feb 28, 2001
Messages
26,996
I'm not sure I totally understand your question, but I'll take a shot at what I think you may have asked.

You can't form a JOIN based on nulls because two nulls NEVER equal each other. If you have this expression in Access: IF Null = Null Then... the result of the express will always be FALSE. You can't return a record based on comparison to a null. That is why you have special operators and syntax for using nulls. Things like the NZ function, IsNull function, and SQL syntax IS NULL. It simply isn't "legal" in Access to compare nulls. There should be no randomness involved at all. However, if you have multiple selection criteria that DOESN'T involve nulls and you are using "OR" logic, you can perhaps get things that match on one criterion but not the other.

If you are seeing any apparent randomness to this, I think you are simply missing the real pattern.

As to changing NULL values to some unlikely value, I do that all the time. My favorite is to see some unknown future year to 31-Dec-9999 (the highest legal date string) and some unknown past year to 1-Jan-100 (the earliest legal date string). Then when I see null or zero, I know I'm missing data.
 

jaryszek

Registered User.
Local time
Today, 04:10
Joined
Aug 25, 2016
Messages
756
thank you Guys very much!

Ok, so The_Doc_Man to confirm to use join from my example you would also change null values to some unlikely value yes? To be sure that joins are working correctly?
 

theDBguy

I’m here to help
Staff member
Local time
Today, 04:10
Joined
Oct 29, 2018
Messages
21,357
The result seems to be ok even i have nulls here.
Hi. I'm not sure what result you were expecting; but since you used an OUTER JOIN in your query, you basically just got back everything from the OUTER table. Try adding more columns to see what you actually got and see if you can understand why. For example, make the query like this:

query.png


So, did you actually get the result you were expecting?
 
Last edited:

Micron

AWF VIP
Local time
Today, 07:10
Joined
Oct 20, 2018
Messages
3,476
I'd say something doesn't smell right if you have queries with 15 joined fields...
 

The_Doc_Man

Immoderate Moderator
Staff member
Local time
Today, 06:10
Joined
Feb 28, 2001
Messages
26,996
I've only seen a 15-way JOIN once, with the U.S. Navy's main personnel database. The 15 tables involved three main tables and 12 other tables. Persons held Billets and were members of a Unit, so that was three tables. The remaining 12 tables were translation tables like determining the person's pay-grade based on their rank or rating, determining number of allowed overtime hours in a particular unit, and other things of that type. If you have a couple of "real" JOINs and the rest are translations or other type of lookup, it MIGHT make sense, but I'm with Micron in being a bit skeptical.

On your other question, for some things NULL is an important (non-)value and where that occurs, I leave it as-is. However, for certain fields in my Ancestry database I had occasion to define a pair of default values for specific cases. Where a person's birth date was unknown but their death date WAS known, I found it useful to assert a birth date that was CLEARLY in the past and easily recognized but not null. That was my choice of 1-Jan-100, which is not zero but not null either. And for all persons in my DB who are still living, their death date is 31-Dec-9999, clearly in the future but not null. In that case, I would again suggest that a JOIN is inappropriate based on intent.

As a trivial example of what I mean there, I could easily make a JOIN between one person's birth date and another person's death date but what would such a JOIN mean? If you are joining fields and happen to make a JOIN because BOTH records have the same "impossible" value, I'm not sure that you would learn anything important and in fact might even confuse yourself. Now if you have a RANGE of impossible values and constrain your values so that each table uses a different "IMPOSSIBLE" value then you would not have JOINed records where the joining was in fact coincidental. Did I make that clear? Because sometimes my sentences even confuse English speakers.

I did what I did with dates in my Ancestry DB to avoid null comparison issues. That crops up because the data coming back from Ancestry.COM isn't always complete (and it is not their fault that it isn't!) For me to artificially impose possible matchups based on intentionally faked data seems somehow not quite right.
 

Micron

AWF VIP
Local time
Today, 07:10
Joined
Oct 20, 2018
Messages
3,476
Just to clarify the meaning of my post - I meant 15 joins between 2 tables, similar to what was being shown with 3 joins.
 

Isaac

Lifelong Learner
Local time
Today, 04:10
Joined
Mar 14, 2017
Messages
8,738
That makes sense.
As I have definitely done sql server queries in large data warehouses with 10 or 15 joins. Not between two tables though(!) and never with nulls, of course.
 

The_Doc_Man

Immoderate Moderator
Staff member
Local time
Today, 06:10
Joined
Feb 28, 2001
Messages
26,996
Now that is a different question. The most I have ever had to do myself was a six-way join between five participating tables, two JOINs between two tables and then two more JOINs between two tables, one of which was a participant in the other two-field JOIN (but the two sets of joining fields had nothing in common) and then two more translation joins to two tables. Built a status report out of that one. And that was a horrendous stretch.

My thought on having that many joins is that either there is a significant amount of overkill going on or some of those joins are meaningless. Let's do some simple math.

Let's say that for field X in table A, a given value appears 10% of the time and there are 10 different common values of equal weight. So if you JOIN table A to table B, you expect no more than 10% of A to match up with 10% of table B. To have fair representation, you need significantly more than 10 records in each table.

Now let's add field Y to the equation, and let's further say that X and Y are unrelated and independent. If Y has equal distribution of 10 different values, then your odds just changed so that you would only see 1% of the tables' records matching up. To have any population in that case, you would need a lot more than 100 records.

Skip ahead to a 14-way join. Do you want to discuss the odds on this?

The point I am ACTUALLY making is that if you get back records at all, then some of those JOINs are either inappropriate or unnecessary because they appear to not filter anything at all. They are overkill and if they don't filter anything, it is because there must be a hidden relationship behind the scenes that is obscure in some way or another.
 

Isaac

Lifelong Learner
Local time
Today, 04:10
Joined
Mar 14, 2017
Messages
8,738
Yeah I am just thinking of cases in sql server, where they are all different tables, and I am joining in medical industries customers to enrollment tables, opt out tables, address tables, claims(header), claims(detail), pharmacy benefit manager, mental health claims, other subqueries, the list goes on and on. But usually with so many joins, especially on left joins (90% of the join types I am doing in a typical corporate data warehouse), one has to be incredibly certain of the proper join predicates so as not to unexpectedly impact the cardinality of the output.
As for two field joins, well I wasn't really counting that as two joins for the purpose of this conversation, just a join with multiple criteria. But I know Access displays it in a way that draws attention to the number of columns involved in the join.
I think most of what you are saying on Odds ...Yes, that does make sense....mostly in the context of inner joins.
At this point my mind is blown and I need more coffee. : )
 

Micron

AWF VIP
Local time
Today, 07:10
Joined
Oct 20, 2018
Messages
3,476
it is because there must be a hidden relationship behind the scenes that is obscure in some way or another
My suspicion is that the tables are designed as spreadsheets. A table for apples with 15 varieties, each in their own field would likely require joining on each field when dealing with temp and non-temp data. Then there is the oddity whereby the joins are not on pk and fk fields but on multiple data fields instead, even though a unique ID field is shown in the examples, which also seems suspect - because a temp table really doesn't need a pk field, does it?

Anyway, I fear that I may have induced us to go off the rails a bit here. I'm going to let the OP provide any clarification, because I think everything we're posting is just speculation now. I'm guilty of not looking at the db because it seems it's just a reflection of the posted pictures (no pun intended). What is needed to shed light on whether or not there is an issue is at least a pic of the relationships.
 

Pat Hartman

Super Moderator
Staff member
Local time
Today, 07:10
Joined
Feb 19, 2002
Messages
42,970
Joining two tables on 15 fields doesn't make any sense. It is indicative of a design flaw. Joins should be FK to PK. They might include multiple columns but only if the PK is multiple columns.

Why do you require a join between two tables on so many fields?
 

Cronk

Registered User.
Local time
Today, 22:10
Joined
Jul 4, 2013
Messages
2,770
Hypothetical example with a table containing people. Need to avoid duplicates in adding additional records. There is no unique key available. Test firstName, surname, middleName, street address, phone, dob. Though, can't see it being as many as 15 fields to compare by oins..
 

jaryszek

Registered User.
Local time
Today, 04:10
Joined
Aug 25, 2016
Messages
756
Hi Guys,

thank you for your statemements.
As Cronk told you can have for example people table withpout PK and you have to check what exists there. Table can have more than 15 atributes so you can meet more tahn 15 joins Guys...Espacially if you have not nice data.

Let's look at this example:


You can see that here you have to join all fields to check if records exists because of lack of PK.
I have the same but i have about 18 attributes in one csv. So it is possible, why not?

Another example:

1592641277227.png


1592641286450.png



1592641293966.png


as you can see i am getting only ID2 = 1 in result query because for ID2 = 2 and ID=3 i have nulls. So it is not joining properly and to avoif it i have to use Nz function or update nulls to some value like "null" or -99999 number.

So this is the case...

Best,
Jacek
 

Attachments

  • Database9.accdb
    476 KB · Views: 103

isladogs

MVP / VIP
Local time
Today, 11:10
Joined
Jan 14, 2017
Messages
18,186
Jacek
You seem to be asking exactly the same questions about nulls and multiple joins as was the case over a year ago early in 2019.
You are even making references to some earlier threads so I'm unclear why you are repeating the same questions.
@MajP has also reminded you of his very clever record matching tool that he developed in response to one of your earlier questions.

So is there anything new you are asking? If not, please again study the previous threads and responses to them.
The answers are already out there.
 

jaryszek

Registered User.
Local time
Today, 04:10
Joined
Aug 25, 2016
Messages
756
Hi Guys,

thank you. Yes i know MajP solution because i started the topic.
Now i want to just clarify why this is happening? Why this is not showing ID2 when at least one field has null in join? Why Access is not reading this?
I just want to go deeper into access structure. The solutions is very nice!

Jacek
 

The_Doc_Man

Immoderate Moderator
Staff member
Local time
Today, 06:10
Joined
Feb 28, 2001
Messages
26,996
Why this is not showing ID2 when at least one field has null in join? Why Access is not reading this?

Access IS reading it - and seeing a null in a JOIN statement, which IMPLIES that an impossible comparison is under way.

Saying "... FROM A JOIN B on A.X = B.X ..." is equivalent in final effect to "...FROM A, B ...WHERE A.X = B.X ..." though they start differently.

The difference is that using a JOIN, you MUST have an index on whichever table is the ONE side of a ONE/MANY relationship, and Access will use that fact to make a faster combination set. If you have no unique index on the ONE side of the relationship then you have to use the WHERE clause to do what is called a relation scan, which is terribly slow - and perhaps slightly more likely to introduce bloat in the FE file.

Now, why is Access not joining on those nulls? Because usually NULL doesn't show up in an index. Therefore, if A.X is null and B.X is null, that JOIN, which will be based on an index, has no records in the index with nulls. A null is NOTHING. It is not a value, it is a record with an explicit absence of a value. You cannot compare against it either, because in that WHERE clause above, if A.X is null and B.X is null, they will NOT compare as equals without regard to indexing. That's why I showed both methods. They lead to the same result - you cannot use traditional methods to test for equality based on nulls regardless of how you approach it.

I don't know how you are on math, but it is the same problem as comparing infinities. The only thing you can compare in an infinity is whether it is of the same order of infinity as its potential comparand. Basically, that use of nulls is an illegal operation in Access and it is refusing to play your game.
 

Users who are viewing this thread

Top Bottom