Relationships and referential integrity in linked/imported data DBs

Asoul

Registered User.
Local time
Today, 01:34
Joined
Apr 11, 2012
Messages
19
I have a huge DB that I did too much work on before making sure I was designing it properly. I was backtracking, trying to normalize it and set up relationships and referential integrity, when I ran into the following issue:

The database uses tables that are periodically updated through imports from Excel. My problem is that if I create relationships on any of the tables that need to be reimported, I cannot import the data without first breaking the relationship.

What is the point of creating these relationships given this limitation, or is there another method I should be using to maintain referential integrity?

I also tried linking the source data through an ODBC link, and it is SO SLOW because the source data goes back over 10 years and a single table would max out the DB size... hence why I use Excel to import only the current data rather than the whole historical data set.
 

CraigDolphin

GrumpyOldMan in Training
Local time
Today, 00:34
Joined
Dec 21, 2005
Messages
1,582
Responding to this could be like opening a right-royal can of worms. But here are my general thoughts.

What are you doing storing massive amounts of historical data in Excel? The remedy for this involves a slapping and a large, wet fish. After that you ought to be in a place where you are open to the idea of using a database to store the entire historical record in fully normalized tables.

How many rows do you have in your massive Excel table? I have one database with over half a million records across two different tables, and the file size of the back-end database is only 75MB. Unless you're taking over Google's web indexing business, I doubt that your historical data is so vast that it will exceed Access's 1-2 GB file size limit.

Once we get past that obstacle (keep the wet fish handy, though; you never know...), you use criteria queries as the row source for bound forms to limit the amount of data being retrieved at any one time by your front-end forms.

And on that note, you ARE using a split database design right? Excellent. Just thought I'd check as that fish is looking a bit bored.

Excel is a fantastic program for analyzing summary data. It is the worst thing in the world for storing raw data. (ok, maybe I am being a smidge hyperbolic with that characterization...paper could arguably be worse unless you start talking about long-term [more than a century] compatibility issues).

Your database should be the primary place where the information is entered and stored to begin with, and you export the raw/summary data to an Excel spreadsheet from the database when you need to do some fancy analysis. Even then, if it's a regular report, you can probably get it done in Access.
 

Asoul

Craig,

I guess I need to clarify.
The source data is NOT stored in Excel.
I currently export it into Excel and then import it into Access, because the source program does not archive historical data (or, even in the cases where it does archive, it does not do so logically, so I may need data from both the historical and current data sets).
Storing the entire historical record is just too large for Access's 2GB limit, and I would still need to update the data regularly.
As I mentioned, the ODBC link option failed completely because these records go back to the 1990s.
There are five-plus sets of large files containing 250K-750K rows of data SINCE 2011, and they sometimes have more than 50 columns. So yes, the data is that extensive...

Now, hopefully the fish is still alive and can be put back in the water.

Yes the database is split.

Right now my concern is the painful process I have to go through to get the source data into Excel and then into Access, as well as maintaining a properly related and documented DB.

Is there any way to set a date criterion on an ODBC link?
Any other suggestions greatly appreciated.

A
 

ButtonMoon

Registered User.
Local time
Today, 08:34
Joined
Jun 4, 2012
Messages
304
Right now my concern is the painful process I have to go through to get the source data into Excel and then into Access, as well as maintaining a properly related and documented DB.

Is there any way to set a date criterion on an ODBC link?
Any other suggestions greatly appreciated.

Clearly it would be better not to use Jet at all. Get SQL Server (you can download the Express edition for free) or any other SQL DBMS. You can, of course, still use Access on top of the database. Also, I don't understand why you are importing via Excel; it ought to be easier to go directly into your target database.
 

Lightwave

Ad astra
Local time
Today, 08:34
Joined
Sep 27, 2004
Messages
1,521
I would have a go at importing the files direct into access.

Access is pretty good at accepting things like CSV files, and you can set up automatic imports which will work brilliantly provided the structure of the target CSVs remains constant.

This could potentially remove one step, but SQL Server Express would probably be your best long-term bet if this is a high-value, ongoing requirement. I believe the maximum database size has been raised to 10GB for SQL Server 2012 Express; I suspect older versions have a lower limit.

Additionally, you can ODBC-link tables in SQL Server to a blank Access database and use that database, in place of Management Studio, to import information into SQL Server. I have in the past found this more forgiving than using T-SQL in SQL Server Management Studio Express, but that was with 2005; it may have gotten easier.
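For what it's worth, Jet/ACE can also query a delimited text file in place, so a direct CSV import can be a single SQL statement. A rough sketch only - the folder, file, and table names here are invented, and HDR=Yes assumes the file has a header row:

```sql
-- Create a local table straight from the CSV (names are hypothetical):
SELECT *
INTO tblCurrentImportView
FROM [Text;DATABASE=C:\Imports;HDR=Yes].[CurrentData.csv];

-- Or, if the table already exists, append into it instead:
INSERT INTO tblCurrentImportView
SELECT *
FROM [Text;DATABASE=C:\Imports;HDR=Yes].[CurrentData.csv];
```

As Lightwave says, this only works reliably while the CSV structure stays constant; a schema.ini file in the same folder can pin down column types if the guesses go wrong.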
 

CraigDolphin

Thanks for clarifying your situation. Sadly, the fish was live-exported to Japan and went under the sushi auction this morning.

If your original data source is a database, and those tables with 50 columns are fully normalized, then you're hosed given Jet's limitations, IMO. I agree with ButtonMoon and Lightwave that you'd be better off switching to a SQL Server back end and using Access strictly as a front end.

Aside from that very good advice, you could consider trying something like creating an intermediary table.

Say you have a currently imported set of records from a table with 50 fields. That goes into something like tblCurrentImportView, and it contains the unique PK field for that table (let's call it RemoteKey). The rows of data in this table can be deleted, changed, whatever.

You also permanently store a table of all of the RemoteKey values that you've ever worked with. Say we call this tblImportedKeys, with its own autonumber PK field (LocalKey) and the foreign key field (RemoteKey). It only has the two fields and so shouldn't create a massive file size. You don't delete rows whose RemoteKey values might no longer be present in the tblCurrentImportView table.

There would be a one-to-many relationship between tblImportedKeys and tblCurrentImportView.

The general concept would be that when you import data you would use an append query to add any RemoteKey values that are new to your database to your tblImportedKeys. Rows with RemoteKey values that are already in your table do not need to be added since they should still be there.

Then, for other data that you create/store in your other local database tables, instead of storing the RemoteKey value as a foreign key field and enforcing RI on that, you store the relevant LocalKey value from the intermediary table. You can then enforce RI, because you can change the contents of tblCurrentImportView without deleting the contents of tblImportedKeys.

You might also consider keeping your tblCurrentImportView table in one back-end Access file and the rest of your tables in a different back-end file. That ensures the bloated, non-normalized mess of data in your imported table doesn't blow out the file size limitations in your fully normalized back end.
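A minimal sketch of the intermediary-table idea in Access SQL - the table and field names follow Craig's examples, and the data types (RemoteKey as a long integer) are assumptions:

```sql
-- Permanent key map: one row per RemoteKey ever imported.
CREATE TABLE tblImportedKeys (
    LocalKey  AUTOINCREMENT PRIMARY KEY,
    RemoteKey LONG NOT NULL CONSTRAINT uqRemoteKey UNIQUE
);

-- After each reimport, append only the RemoteKey values not seen before;
-- existing rows stay put, so local tables that reference LocalKey keep RI.
INSERT INTO tblImportedKeys (RemoteKey)
SELECT v.RemoteKey
FROM tblCurrentImportView AS v
LEFT JOIN tblImportedKeys AS k ON v.RemoteKey = k.RemoteKey
WHERE k.RemoteKey IS NULL;
```

The unmatched-rows LEFT JOIN is what makes the append idempotent: rerunning it after a fresh import adds only genuinely new keys.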
 

Asoul

Thanks ButtonMoon, Lightwave, and Craig.
I am spending today getting SQL Server Express installed and will do my best to assimilate all of this through some good old-fashioned trial and error over the next week, but I will likely end up reposting for some help. Peace
 

Asoul

So disappointing :banghead: (reminds me why I began avoiding all of Microsoft's specialty products), but SQL Server Express will not work for this situation. The data I need to access is in DB2 format, which requires a separate download for SQL Server, and that download only works with the Standard or higher editions.

ButtonMoon mentioned "or any other SQL DBMS".

Is anyone aware of any other solution that works with DB2 formatted data?

I'm not sure paying $1200 for SQL Server Standard for this purpose can be justified, and it seems I am just complicating the whole process.

It comes back to the question of: am I really losing anything, or creating a problem, by not having the relationships and referential integrity created within the DB?
 

Pat Hartman

Super Moderator
Staff member
Local time
Today, 03:34
Joined
Feb 19, 2002
Messages
43,302
You can use Access as your ETL tool. Access links to DB2. Access links to SQL Server. A query selects data from DB2 and appends it to SQL Server.

This isn't really going to solve your problem, though, since you still have the update problem if you are using RI. Craig offered a solution I have used when I needed to maintain RI but had to update data from another source.

Another alternative is to run DDL queries to drop the relationships so you can replace entire tables, and then run DDL queries to put the relationships back. The problem with this is that you could end up with errors when you put a relationship back, and you'll have to deal with those.

For projects that require conversion of existing data, I find the DDL query invaluable. It is almost impossible to run a large conversion once and be done with it; in practice, you need to run it multiple times for testing and then a final time when you are ready to go into production. So my conversion routines drop all relationships, delete all the data, reload the data, and finally reestablish the relationships, followed by a compact and repair.

The following is a pair of queries from one of my conversions. The first drops the constraint and the second adds it.

It's a pain to set this up the first time, because when you create the constraints through the relationship window GUI, Access sometimes gives the relationships funky names. Sometimes the name is tblAtblB, and if that isn't unique, I think a column name is added. So open MSysRelationships to see what you've got, and decide whether you want to keep the existing names or start clean with new names.
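The attached image isn't reproduced here, but the pair of queries Pat describes generally look like this - the constraint, table, and field names below are placeholders (check MSysRelationships for the real constraint names in your database):

```sql
-- Drop the relationship so the child table's data can be replaced:
ALTER TABLE tblOrders DROP CONSTRAINT relCustomersOrders;

-- ...delete and reload the data, then put the relationship back.
-- An error here usually means the reloaded data has orphaned rows:
ALTER TABLE tblOrders
    ADD CONSTRAINT relCustomersOrders
    FOREIGN KEY (CustomerID) REFERENCES tblCustomers (CustomerID);
```

Run as Access DDL queries, these can be saved and executed in sequence from the conversion routine, which is what makes the drop/reload/reattach cycle repeatable.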

 

Attachments

  • DDL.jpg (65.3 KB)

Asoul

Thanks Pat! I think your post helped clarify some of what Craig wrote that I was having difficulty digesting. This level of complexity in a database is beyond what I am used to doing (I am a financial guy not a programmer) but I am slowly grasping it.
I have utilized Craig's method but simplified things a little and realized I should have thought of that to begin with... essentially I am using the linked full data tables to query the Unique PK fields against the sub data set using a greater than date to determine my updated data set that needs to be added. Using a Make table I now have the data to append to the existing table so this should not blow away RI - correct? I haven't made it all the way through this process, but think since I am utilizing the linked full data sets for only the update process, I should be able to incorporate them directly into the existing back end of the DB without blowing out file size as they are only linked tables... or am I still better off keeping them separate?

Now that it is all spelled out it seems so simple - but in the heat of battle I was struggling.

I have also started working with MySQL, as I feel this will eventually need to be migrated to something larger and eventually handed off to some real programmers, but I am having migration issues with the data, so I figured I could work on the full solution after achieving this fix within Access.

Thanks again for everyone's help and continued feedback! Peace
 

Pat Hartman

Don't use a make-table query; that just causes unnecessary database bloat. Use the select query as the source of the data for the append query and save yourself a step.
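In other words, instead of a SELECT ... INTO a temporary table followed by a second append step, fold the selection into the append itself. A sketch with made-up names (lnkSourceData as the linked full data table, LastUpdated as the date field, and matching column layouts in both tables):

```sql
-- One step: append only the rows newer than anything already stored.
INSERT INTO tblLocalData
SELECT s.*
FROM lnkSourceData AS s
WHERE s.LastUpdated > (SELECT MAX(t.LastUpdated) FROM tblLocalData AS t);
```

No intermediate table is ever written, so the front end and back end don't bloat between compacts.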
 
