Replace Primary Key

DreamGenius · Nov 18, 2008

Long story short, I've inherited maintaining a database in which the Primary Key on most of the tables is Autonumber.

Unfortunately, this database is used to collate data which is collected in clones of itself off-site, so I can't import data from the clones and maintain referential integrity, because the clones will use their own Autonumber sequence. At best, there'll be duplicates. At worst, there'll be mismatched joins!

The tables affected record medical practices and doctors at those practices. A subject therefore has a DoctorID recorded against their record, which pulls the Doctor from tblDoctors and the Practice from tblPractices, each Doctor in tblDoctors having PracticeID as a foreign key.

I'm inclined to replace the Primary Key in tblPractices with the post code (spaces removed) on the basis that there should not be any full post code in the UK with two practices on it. The Primary Key in tblDoctors would then be updated to the same post code plus a unique number per doctor.

This would enable me to import data from the subset databases without risking upsetting the referential integrity but can anyone see any flaws in this approach before I go off and louse it all up completely?

Apologies for the length of post.

Rabbie · Nov 18, 2008

Try reading up on Database replication in Access Help. This may be an alternative way to go for you.

Although it is not likely it is possible that there could be more than one practice on a Postcode so you need to be able to handle that in a clean way.

gemma-the-husky · Nov 18, 2008

replication effictively uses a massively random autonumber value, iso clashes between values are extremely unlikely, and you can therefore combine data without expecting the clashes you describe.

DreamGenius · Nov 18, 2008

I'll read up on Replication, with thanks. So far, it looks good, but would have been so much easier if they'd used it before creating the clones. The bit I need to read up on most now is the potential problems of changing the main database and the clones independantly.

Thanks again

DreamGenius · Nov 18, 2008

It strikes me that Replication is a path that should have gone down long before these databases were released into the wild. Can anyone confirm or contest this please?

I'm still not convinced that it is the perfect solution, because each of the databases in the wild contain a different and distinct subset of the master data, applicable only to the location of that database copy.

Therefore, if I set up Replication and create my release copies, then remove the data that is no longer required to be in each of those copies, will all of the data will be removed from my master database on next synchronisation?

As I need at least Long Integer but preferably Replication ID types for the Autonumber keys, there's still no easy way of pairing the data contained in the master copy against the returns already received, because the keys still won't be the same.

Finally, I really only want certain columns in the tables overwritten by the data returned in the clones, whilst others should remain sacrosanct in the master copy. Does Replication offer this level of granularity?

Oh, if only life were easy! Thanks in advance for your help.

The_Doc_Man · Nov 18, 2008

Replication is a path that should have gone down long before these databases were released into the wild.

Probably true.

will all of the data will be removed from my master database on next synchronisation?

That's a good catch, which is why you have to convert the fielded copies AND the "master" copy before synchronizing. Also, look up the special properties of a "replication master" database to follow up on that question.

only want certain columns in the tables overwritten by the data returned in the clones,

Best implemented by importing the data to a staging table and then doing an intermediate and highly selective update to the master table, perhaps.

dfenton · Nov 21, 2008

DreamGenius said:
It strikes me that Replication is a path that should have gone down long before these databases were released into the wild. Can anyone confirm or contest this please?

From what you've said, it seems like it. However, if they are emailing their databases for import, or some other method of moving them, that will not work with Jet replication. Jet replicas have to be synched in place. A typical scenario would be replicas on laptops that are taken out in the field and edited, then brought back in the office, connected to the LAN and synched with a central replica.

If the machines the remote replicas are on are not brought into the office (i.e., where they can be connected directly to the LAN), then there will need to be some form of indirect replication over VPN over the Internet. This is very complex to set up, but does work (as long as you do it right and keep on top of maintaining the infrastructure).

I'm still not convinced that it is the perfect solution, because each of the databases in the wild contain a different and distinct subset of the master data, applicable only to the location of that database copy.

You should look into Partial Replication, which is designed specifically for that purpose. I have never used it myself (despite creating replicated apps since 1997), because I don't see the benefit -- my app hides other users' data, and I like the fact that all the data is backed up in every replica.

Therefore, if I set up Replication and create my release copies, then remove the data that is no longer required to be in each of those copies, will all of the data will be removed from my master database on next synchronisation?

That's not how you'd do it. Instead, you'd import all your data into a single file, replicated it, then create partial replicas filtered according to each practice (or whatever your selection criterion is).

As I need at least Long Integer but preferably Replication ID types for the Autonumber keys, there's still no easy way of pairing the data contained in the master copy against the returns already received, because the keys still won't be the same.

Never use ReplicationIDs -- they just don't work well with Access, they break the provided conflict resolver, and they really aren't needed (see http://trigeminal.com/usenet/usenet011.asp?1033). All you need is random Autonumbers.

Finally, I really only want certain columns in the tables overwritten by the data returned in the clones, whilst others should remain sacrosanct in the master copy. Does Replication offer this level of granularity?

No, it doesn't. You'd have to partition your table so that the user-edited data fields are in one table and the master fields are in another, presumably, in a 1:1 relationship.

But what you're describing sounds wrong to me. I would like to hear the justification for a table that has a mix of user-editable and master data. Sounds like a normalization error to me.

You might want to look at my Jet Replication Wiki:

http://dfenton.com/DFA/Replication/

And you'll probably want to look at the articles listed on the front page (though some of the basic ones have never been fleshed out).

DreamGenius · Nov 21, 2008

David

That's a very thorough answer, for which thank you. Replication certainly does appear to be a tin of worms that is perhaps left alone in this case. I agree that the database doesn't sound well designed but I inherited the design and I can't go making wholesale changes as it's used too much and is mission-critical.

Were I given the opportunity to start from scratch, I could implement all of the suggestions that you've made, but there will never be the budget for that.

I'll soldier on with exploring random autonumbers. Is there some way of using a seed for each of different databases or is that not likely to be necessary? I acknowledge that it's a cludge but so's the whole situation.

neileg · Nov 21, 2008

DreamGenius said:
I'm inclined to replace the Primary Key in tblPractices with the post code (spaces removed) on the basis that there should not be any full post code in the UK with two practices on it. The Primary Key in tblDoctors would then be updated to the same post code plus a unique number per doctor.

Thought I'd address this point. Near my home there are three practices that share the same building and the same postcode.

I always use an autonumber as a PK. Too often the natural key that appears to contain unique values turns up duplicates. UK National Insurances numbers are another example. Full numbers are unique but emergency NiNos can be duplicated.

fkegley · Nov 26, 2008

Is there a combination of two or more fields that could be used to join the tables? You don't have to use the PK unless you just want to. But to maintain referential integrity you must have a unique index on the "one" side of the relationship.

DreamGenius · Mar 24, 2009

For the sake of completeness, I thought I'd post how I finally solved this problem. It isn't elegant and probably isn't even best practice but it does work, and is far simpler than retrospectively applying any of the suggestions made above. That isn't to detract from the suggestors in any way; more to admit that these paths should have been explored very much earlier than when the problem was found.

In this thread there is a discussion about Autonumber fields and the revelation for me was that you can import entries into an Autonumber field, so long as the data being imported are both integers and unique. That is to say that they don't break the rule set for that data type, just the same as importing any other data into any other type.

By applying a large seed, unique to each slave database, when importing the data, I preserved the data integrity. Thus:

Slave1 - Autonumber + 100,000
Slave2 - Autonumber + 200,000
Slave3 - Autonumber + 300,000
Slave4 - Autonumber + 400,000
Slave5 - Autonumber + 500,000
Slave6 - Autonumber + 600,000

This prevented me writing over any existing data, by making Record 864 from Slave1 into Record 100864 while the same record from Slave2 is 200864. Doing it this way, I still have an Autonumber key which is still Unique but care has to be taken to ensure that the imported entry is changed everywhere it appears, or the data will be broken or worse badly linked.

I should perhaps make clear that this change is only made at the import stage. Otherwise, the table design is exactly the same in the Master and Slave versions of the database. I don't think you can force a seed on an Autonumber field and I don't think you can change an Autonumber once the record has been created but you can change the entry whilst the data is being imported provided it is still an integer and unique.

This solved my problem but it would be interesting to see if anyone more knowledgeable can point out any potential failings in the solution.

DCrake · Mar 24, 2009

I only read your first post and a comment you made jumped out at me so this is why I skipped the rest and posted this comments.

There are many practiices with the same postcode. Just think about health centres that house many practicies. The building owns the post cod enot the practice. You would be better using the Practice code as your primary key. Such as P81047

David

DreamGenius · Mar 24, 2009

David

You're entirely correct, and others pointed out the same flaw in that plan. My previous post (to this one) was the actual solution used, which was far simpler if perhaps less elegant.

Replace Primary Key

DreamGenius

Annoying Questionner

Rabbie

Super Moderator

gemma-the-husky

Super Moderator

DreamGenius

Annoying Questionner

DreamGenius

Annoying Questionner

The_Doc_Man

Immoderate Moderator

dfenton

AWF VIP

DreamGenius

Annoying Questionner

neileg

AWF VIP

fkegley

Frank Kegley

DreamGenius

Annoying Questionner

DCrake

Remembered

DreamGenius

Annoying Questionner

Similar threads

Users who are viewing this thread