Question Designing replication-like functionality

eellenoff · Nov 8, 2010

Hello, I have a database that is currently used by office-employees of a company to track the services that their technicians perform at customers' sites. It is an access database with all of the forms coded from within an Access 2007 frontend (mdb), with a separate access backend (.mdb). All unique ID's are currently sequential and some of them are as high as 100000, so I'm not sure I want to change them to random. We are trying to transition so that the technicians are able to enter records themselves using netbooks that will not have internet connectivity, and I'm struggling with what the best way to accomplish that is.

Most of the advice I've seen about using Access's built-in replication is along the lines of "don't use it". I was considering trying to create my own synchronization approach where technicians will have a full copy of the backend database on their netbooks and when a technician creates a new record it gets marked as new, and then at the end of the day, when the technicians synch their changes, all "new" records would get inserted into to the master database, and whenever a collision is detected, the records on the technician's machines would get updated with a new unique ID, and then would attempt to sync again.

This certainly seems like a somewhat painful process to me, but so far I haven't come across any straight-forward easy solution. Do you think I'm going down the wrong path here? Should I be trying to use the built-in replication functionality?

Thanks for any suggestions / advice,
-Eric

Banana · Nov 8, 2010

One possible consideration is to see if your company already have a Windows Server and therefore either Windows SharePoint Foundation or SharePoint Foundation Services (both are same beast but renamed for 2007 & 2010 version respectively). The reason for this is that Access has built-in replication functionality when using SharePoint as a data source which has improved significantly with 2010. A caveat needs to be said, however, SharePoint isn't a RDBMS and with 2007, there was no way to ensure data integrity as you could with Access or ODBC tables. 2010 finally introduces this and offers several other improvements so it may be worth a look before you go and re-invent the wheel.

eellenoff · Nov 8, 2010

We don't a Windows Server/Sharepoint, but I've been very unsuccessful in figuring out what exactly sharepoint does and how to use it. Would sharepoint allow me to take the technician's copy of the database that has (for example) a table with two new records A and B and primary keys that conflict with records C and D in the master DB and intelligently resolve those conflicts by changing the primary keys on A and B? I've looked but haven't found a lot of information on how to use sharepoint to accomplish what I'm shooting for. Also, paying the additional monthly fee would be a notable downside for our company

dfenton · Nov 8, 2010

Maintaining a sequence is one of the key problems of any multi-master replication scenario. Avoiding Jet Replication doesn't change that at all -- it just moves the problem around.

There are two basic approaches:

1. in your routine that you use to determine the next sequence number, incorporate something specifically to the source user or the source database, so that the result is unique across the entire replica set. So, if you used the user's initials as part of the sequence, or the name of the source computer, you'd end up with globally unique numbers.

2. allocate batches of sequence to individual users/computers. In that case, you'd have say 100,000 to 109,999 allocated as one set, 110,000 to 119,999 to the next, and so forth (in whatever size batches were determined to be appropriate). SQL Server replication, for instance, offers this at the engine level (i.e., each replica can be allocated its own range of sequence numbers).

3. allocate a temporary sequence number, and then assign the permanent number later. This basically requires that you use something other than your sequence number as the primary key, since otherwise you won't be able to tell which record to synch back with once the canonical sequence number has been assigned. Jet Replication does this for you, assigning a GUID that is independent of whatever data you have stored in your record, so that whatever you use for your PK (and/or your sequence) does not break the connection to the source record.

Random Autonumbers in Jet Replication work very, very well in avoiding duplicate surrogate primary keys, but, of course, they are the antithesis of a sequence. Likewise, GUIDs are guaranteed globabally unique, but are pretty opaque to users, so don't solve that problem either.

So, in other words, regardless of whether or not you use Jet Replication or roll your own synchronization, the problem is EXACTLY THE SAME. You still have to figure out how to maintain a globally unique sequence across multiple locations where records are being added.

The easiest thing to do is to eliminate the multiple locations.

If that's not possible, then you have to do something else.

I have myself never implemented anything like this because when the issue arose in an app with a sequence number, I've always moved to a solution that had only one database, instead of multi-master editing.

Banana · Nov 8, 2010

eellenoff said:
We don't a Windows Server/Sharepoint, but I've been very unsuccessful in figuring out what exactly sharepoint does and how to use it.

I already mentioned that SharePoint isn't a RDBMS, but essentially it's a platform to replace file shares and emailing attachments, maintaining a mailing lists. Instead of maintaining several shared folders that can get very deep and figuring out which is the latest version, SharePoint abstract this away by either implementing a search functionality and/or providing a page for users to obtain the latest version and update the documents. There's more but I think that's basically the most important aspect. A side note on this - it has lot to do with decentralizing the IT administration and thus enabling end users to generate and share their contents which is good thing in the sense that workers no longer depend on a IT person to deploy the solution.

Would sharepoint allow me to take the technician's copy of the database that has (for example) a table with two new records A and B and primary keys that conflict with records C and D in the master DB and intelligently resolve those conflicts by changing the primary keys on A and B?

No, because it's unnecessary. Dfenton already alluded to this fact by preferring a solution that allows you to use a single backend instead of multi-master replication - SharePoint would let you to do precisely just that, and will work with or without an internet connection (Access allows you to work offline upon SharePoint Lists). Conceptually, you can do the same thing with an ODBC linked table, which also mean you don't have to copy a database and give it to the other guy and deal with ugly issues as Dfenton explained. Unfortunately, ODBC linking assumes a persistent connection so it's a dead-end for scenario where you cannot depend on internet connection being always available.

I've looked but haven't found a lot of information on how to use sharepoint to accomplish what I'm shooting for. Also, paying the additional monthly fee would be a notable downside for our company

Well, I understand they're introducing Office365 which is targeted towards small business who otherwise would not pay for a SharePoint installation, hosted or not and you have to consider that using a built solution that's been designed, tested and maintained by a large team of developers/engineers/testers tend to be cheaper in long run than rolling out your own. No matter how great coder I am, there's no way I can reinvent wheel that's just as good that's not also too costly (in the sense that it takes too long to deliver and therefore is of no value once delivered).

Now, maybe your scenario actually can be justified - I don't know all the possible factors though I can see good cases for where we're doing insert-only replication or restricting data to only a small sub-set or doing one-way replication but you have to count on finding and fixing bugs.

I hope that helps somewhat.

gemma-the-husky · Nov 9, 2010

eellenoff said:
Hello, I have a database that is currently used by office-employees of a company to track the services that their technicians perform at customers' sites. It is an access database with all of the forms coded from within an Access 2007 frontend (mdb), with a separate access backend (.mdb). All unique ID's are currently sequential and some of them are as high as 100000, so I'm not sure I want to change them to random. We are trying to transition so that the technicians are able to enter records themselves using netbooks that will not have internet connectivity, and I'm struggling with what the best way to accomplish that is.

Most of the advice I've seen about using Access's built-in replication is along the lines of "don't use it". I was considering trying to create my own synchronization approach where technicians will have a full copy of the backend database on their netbooks and when a technician creates a new record it gets marked as new, and then at the end of the day, when the technicians synch their changes, all "new" records would get inserted into to the master database, and whenever a collision is detected, the records on the technician's machines would get updated with a new unique ID, and then would attempt to sync again.

This certainly seems like a somewhat painful process to me, but so far I haven't come across any straight-forward easy solution. Do you think I'm going down the wrong path here? Should I be trying to use the built-in replication functionality?

Thanks for any suggestions / advice,
-Eric

I think there is just is no easy solution. Databases dont lend themselves to being distributed (over a wide network), for performance reasons. Therefore there is always the issue of managing keys (and data) on databases that ultimately need to be merged.

A2010 has begun to offer web based databases -which is possibly an ideal way forward - but I think they cannot yet be as complex as an off-web dbs.

Question Designing replication-like functionality

eellenoff

Registered User.

Banana

split with a cherry atop.

eellenoff

Registered User.

dfenton

AWF VIP

Banana

split with a cherry atop.

gemma-the-husky

Super Moderator

Similar threads

Users who are viewing this thread