Reliable Methodology for Offsite Data

Thales750

Formerly Jsanders
Local time
Today, 12:19
Joined
Dec 20, 2007
Messages
3,498
This POS (Point of Sale) software will have a MySQL server located in the main office and will support local front ends in the branch operations. The local front ends will have Jet tables stored locally to provide high reliability and performance in the branch systems.

We intend to sync the systems periodically using a timer event on forms.

Several problems arise in this process.

1. What is the preferred method of pausing the sync while a user is performing sales operations? Somehow it needs to wait until X minutes have passed since the last keystroke (see the sketch after this list).

2. It needs to check whether the connection to the server is active.

3. It needs to continue to allow sales even if the connection is down.

4. Update the server first with new sales transaction data. This is a straightforward insert into the server tables and will verify that the primary key is not already used in the server tables. No problem there.

5. Searches by the branch for individual items will need to be pass-through queries to the host. If the host is not available, no data exists.

6. Update or insert into the local tables companywide contact and customer information, based on timestamps. This will be an upload insert and a subsequent download insert, where only records with newer timestamps are updated, or inserted in the case of new records.
The thought just occurred to me that MySQL may provide built-in tools for this type of sync. If y'all are familiar with that, let me know.
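
Here is a minimal sketch of what I have in mind for items 1 and 2, assuming the form's KeyPreview is on so keystrokes reach the form's key events. The names (LastKeystroke, SYNC_IDLE_MINUTES, RunSyncQueries) and the ODBC connection string are placeholders, not a finished implementation:

Option Explicit

' Refreshed by the sales form's key events (KeyPreview must be True
' so keystrokes in any control reach the form first).
Private LastKeystroke As Date
Private Const SYNC_IDLE_MINUTES As Long = 5

Private Sub Form_KeyPress(KeyAscii As Integer)
    LastKeystroke = Now()
End Sub

Private Sub Form_Timer()
    ' 1. Skip the sync if the user has typed within the idle window.
    If DateDiff("n", LastKeystroke, Now()) < SYNC_IDLE_MINUTES Then Exit Sub
    ' 2. Skip the sync if the server cannot be reached.
    If Not ServerIsAlive() Then Exit Sub
    RunSyncQueries   ' hypothetical routine that pushes/pulls the data
End Sub

Private Function ServerIsAlive() As Boolean
    ' Cheap connectivity probe: open and close an ADO connection with
    ' a short timeout. Driver name and credentials are assumptions.
    On Error GoTo Dead
    Dim cn As Object
    Set cn = CreateObject("ADODB.Connection")
    cn.ConnectionTimeout = 5
    cn.Open "Driver={MySQL ODBC 5.1 Driver};Server=mainoffice;" & _
            "Database=pos;Uid=sync;Pwd=secret;"
    cn.Close
    ServerIsAlive = True
    Exit Function
Dead:
    ServerIsAlive = False
End Function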

Some of the data will need to be live, and it will not be available during a network failure. The question there is how do you stop Access from shutting down when the WAN connection is bad?

Well I can see there is a lot to think about.

Hopefully by the time we are through with this thread we will have compiled a set of best practices for offsite data management.
 
I created a system whereby a POS created end-of-day files that were ported to the HO server from over 56 sites as zip files. These were then unzipped on the server and fed into SQL Server: 15 million transactions per year.

The POS was not mine; I only wrote the middleware that sent the data from the outlet to the HO server. We did not need to worry about syncing.
 

Well the customer's business model requires live data, and quite frankly it is one of the primary motivators for scrapping their old off-the-shelf product.

So I get to learn how to do it.
 
Having live data from the WAN may be part of their business model, but if a proper feasibility study had been undertaken it would have identified that this is not always possible. As you have mentioned, downtimes and outages will prevent it. At best they could be an hour behind, which will allow for batch processing of transactions. A lot will depend on the configuration at each outlet, i.e. number of POS devices, average transactions per hour, staffing levels, transmission times, number of outlets.

Each outlet will have a pool number and allocated sequence slots for transmissions. B cannot transmit until A has completed theirs; likewise C must wait for B to finish. A trawl of all outlets must be performed before A can (re)send the next batch.

So you will need to set up some communication between the central server and the outlet to flag the all-clear to send the packet. Lots to think about :(
 
Not sure I understand the need to "trawl". Each of these offsite systems is connected to the server through a high-speed VPN, so they all look like a local connection to the MySQL server.
 
I’m thinking of this as part of your previous thread found here: -
http://www.access-programmers.co.uk/forums/showthread.php?t=206862

Two distinct modes of operation: -
A. Part of the network which requires communication to devices other than itself.
B. Autonomous operation.

The difference between the two modes of operation is the validity of the communication path.

It may be that a system needs verification of valid data transfer from source to sink before it can commit to some action. In your previous thread you suggested a temp table. So the temp table contains the source data until such time as it receives sink permission to delete.

The permission from sink (to delete data in source) should follow the same path, but in the reverse direction that the data flow from source to sink took.
(The path may involve backup paths as would be the case with redundant networks, in which case any valid permission to commit to some action, from any path, would be acceptable.)
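
A minimal sketch of that source-side discipline, assuming a local holding table with a batch number column; tblOutbox, BatchID, NextBatchNumber, SendBatchToSink and SinkConfirmedBatch are all hypothetical names:

' Source side: rows stay in the local holding table until the sink
' confirms the batch; only then are they deleted.
Dim db As DAO.Database
Dim lngBatch As Long

Set db = CurrentDb()
lngBatch = NextBatchNumber()   ' hypothetical sequence generator

' Stamp the pending rows with the batch number before sending.
db.Execute "UPDATE tblOutbox SET BatchID = " & lngBatch & _
           " WHERE BatchID IS NULL", dbFailOnError

SendBatchToSink lngBatch       ' hypothetical transmit routine

' Delete only after the sink confirms this batch, via the same path
' (or a redundant one) that the data travelled.
If SinkConfirmedBatch(lngBatch) Then
    db.Execute "DELETE FROM tblOutbox WHERE BatchID = " & lngBatch, _
               dbFailOnError
End If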

Added to that situation is a need to limit response time from sink back to source.
From my experience, in another area (not POS or ATM), the time limit error was called a Turnaround Timeout error or TATO error and was set to 40 milliseconds for all devices connected to the networks, primary + 3 redundant. The reason for the TATO error was to stop devices on the network from becoming what was called babblers.

A babbler would be a device that, for hardware or software reasons, transmits something and continues to transmit the same thing if it did not get an immediate reply. That would lead to the network lockout of other devices on the network which were in fact behaving correctly.

The condition of the babbler may be transient; some errors do self-repair. So a log of babblers was kept, and after some number of retries was exceeded, the babbler's path was disconnected. After disconnection, manual intervention was required to reconnect the babbling device.


Up to the point of disconnection the device could be said to be operating in mode A above.
After disconnection the device could be said to be operating in mode B above.


Mode B is not a total failure of functionality but is a restricted set of the operational modes of A.
The restricted set of operational modes is dependent on what the device is supposed to do.

I’ll make a guess that a POS device should continue to function in order to keep its primary function operational. That may simply be to take money, give change, record what was sold and log the transaction…whatever. It may even allow the purchase to be charged, depending on the business rules applied.

What it would not be able to do is, for example, get a current balance of an account. That would require a calculation of current transaction amount against data which would only be available if a network path was available.

So mode B is only a partial operating set of mode A, not a total failure.
(Some companies use the word Flunk condition and the Flunk condition can drop to lower levels of operation if the device operations can be sub divided further.)

When a device moves from mode B back to mode A, the status of its transactions, while it was in Flunk condition, needs to be sent to its master. For that to happen the transaction data needs to be stored locally until such time as communications are reestablished.
(This last point is not necessarily true for some devices. Some devices in process control would have data that is not considered imperative for the master to know about during a Flunk condition. Such niceties as logging information may not be considered imperative for system operation and, depending on network traffic, could be seen as contrary to keeping network traffic to a minimum when a device comes back on line, B->A.)

But a POS system would need to save transaction data somewhere if it is to operate in mode B, and hence the need for local storage…temp tables, EEPROM… whatever.



>>Each of these offsites are connected to the server through a high speed VPN so they all look like a local connection to the MySQL server.<<

The immediate failure I see here is the single point of failure, the Achilles' heel of the system.
Just food for thought but I would not consider any system like that as acceptable.

Often people regard systems as being reliable but, in a very real sense, they need to think about system availability.
System availability, even with failures, is what counts.
System availability, despite device failures, requires redundancy.
System availability may mean reduced functionality.
Reduced functionality should not include the primary device requirement.
System availability of primary system requirement is the goal.

Primary system requirement needs to be understood for the specific condition of use.

What that means to me is that a solution needs to be constructed for a given set of conditions; no best practice, just best fit.

Chris.
 
Morning All,
Chris,
It’s not a continuation of the other post. They are two separate elements of the same job.

The plan is to create 100% functionally autonomous branch locations. They will run on either Jet, MSSQL Express, or MySQL. Most likely it will be Jet, or whatever they're calling it now.

Each of the offsites has a high-speed internet connection which provides two types of services:

1. VPN to the company server.
2. Internet connection to Credit Card Processors.

So it’s actually a lot simpler than it sounds. I will most likely have a second Access database running in the background; it will poll the server periodically, and if the connection is valid it will perform a series of queries that insert transaction data into the appropriate MySQL server tables (see the sketch below).

End of that story.
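
As a rough sketch of one polling pass, assuming the MySQL tables are linked through ODBC and the local sales table carries Synced and InFlight flags; all of the names and the column list are placeholders:

Public Sub PollAndPush()
    ' One polling pass from the background database. tblSalesServer
    ' is assumed to be an ODBC-linked MySQL table.
    If Not ServerIsAlive() Then Exit Sub   ' probe as sketched earlier

    Dim db As DAO.Database
    Set db = CurrentDb()

    ' Flag the rows first, so a sale completed while the insert runs
    ' waits for the next pass instead of being skipped.
    db.Execute "UPDATE tblSalesLocal SET InFlight = True " & _
               "WHERE Synced = False", dbFailOnError

    ' Push the flagged rows across the linked table.
    db.Execute "INSERT INTO tblSalesServer (TransID, SoldAt, Amount) " & _
               "SELECT TransID, SoldAt, Amount FROM tblSalesLocal " & _
               "WHERE InFlight = True", dbFailOnError

    ' Only now mark them as synced.
    db.Execute "UPDATE tblSalesLocal SET Synced = True, InFlight = False " & _
               "WHERE InFlight = True", dbFailOnError
End Sub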

Actually this does present a stronger case for the former thread you were referencing.
http://www.access-programmers.co.uk/...d.php?t=206862

In that thread we discussed not saving any sales order data until the entire sales order was completed. It was referring only to local data.

With that method of only inserting sales records after a completed transaction, the system can merely set the criteria for the time-stamped records to a few seconds before it runs the queries to update the server.

After it runs the queries to the server, it runs another set of queries to compare the local data with the server data. It logs the results and sends a message to an administrator if corrupt data exists.

Also it would seem prudent to time stamp and number each transfer (see the sketch below).
This doesn’t sound problematic to me.
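
Something like this is what I picture for the verification step; a minimal sketch, assuming numbered batches and an ODBC-linked server table, where tblSalesLocal, tblSalesServer, BatchID, LogTransfer and NotifyAdmin are all placeholders:

Public Sub VerifyTransfer(ByVal lngBatch As Long)
    ' Compare row counts for one numbered batch between the local
    ' table and the linked server table, log the result, and alert
    ' an administrator on a mismatch.
    Dim lngLocal As Long, lngServer As Long

    lngLocal = DCount("*", "tblSalesLocal", "BatchID = " & lngBatch)
    lngServer = DCount("*", "tblSalesServer", "BatchID = " & lngBatch)

    LogTransfer lngBatch, Now(), lngLocal, lngServer   ' hypothetical logger

    If lngLocal <> lngServer Then
        NotifyAdmin "Batch " & lngBatch & " mismatch: local=" & _
                    lngLocal & " server=" & lngServer
    End If
End Sub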

If anyone has a more elegant solution, or if you see flaws in my logic, please let me know.

I hate to spend a week on something, just to find out later there was a much simpler and, well, you know, a much rounder and perfectly suitable wheel.
 
I'm going to ramble on this, hoping that a few thoughts and some experience in a related situation might offer you insights. I don't know that I can actually tell you what to do, but maybe I can help you see what you can and cannot do.

I don't care that you have a high-speed VPN. That is a chimera. If you establish an Access-type continuous connection over it and it crashes, your database is still potentially toast. Access, as it is written, is not currently (to the best of my knowledge) capable of gracefully surviving network failures. I work with the US Dept. of Defense on a mission-critical system and WE can't guarantee that our networks are always up. And WE have our own networks that don't depend on commercial carriers! I don't see how any system on a commercial, even private, network will do any better.

Our solution is that real-time data is JUST NOT POSSIBLE if you throw in the word "reliable" AND have an impossibly fast definition of real-time. The three-cornered stool "WAN, reliable, instant real-time" does not stand up. What we CAN do is make it "near-real-time." We send transactions over the network, but the database doesn't depend on the connection. A "courier" task does, and we write it as a separate, stand-alone little thing that is not Access-dependent. It can crash and restart as needed. All it does is place transaction files in a known location or get them from a known location. It also knows to die when it gets back the network code that says "path lost to partner" or whatever your O/S says has just happened. A partner task on the other side of the network does the same thing, sending and receiving transaction files asynchronously. The databases at local and remote sites then just look for these transaction files and replay them - but now, as completed files, they can be handled using synchronous, batch-like methods.
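
Just to make the courier concrete, here is a minimal one-pass sketch, written in VBA purely for illustration (the real thing is a stand-alone program, not Access code); the folder paths are assumptions:

Sub CourierPass()
    ' One pass of a courier-style task: move completed transaction
    ' files to a known pickup folder, and simply die on any network
    ' error so a supervisor can restart it later.
    Const LOCAL_OUT As String = "C:\pos\outbox\"
    Const REMOTE_IN As String = "\\hub\dropbox\store42\"

    Dim fso As Object, f As Object
    Set fso = CreateObject("Scripting.FileSystemObject")

    On Error GoTo PathLost        ' e.g. "path lost to partner"
    For Each f In fso.GetFolder(LOCAL_OUT).Files
        fso.CopyFile f.Path, REMOTE_IN & f.Name
        fso.DeleteFile f.Path     ' delete only after a good copy
    Next f
    Exit Sub
PathLost:
    ' Die quietly; the courier is restartable by design.
End Sub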

Using a "divide and conquer" method, there are actually multiple database applications active at once. One is the user's interface looking at whatever we wish to publish. The user terminal has a background interface to the same BE file that does updating as needed based on the transactions.

The rule is that whatever we send to the central site gets posted as a transaction, but it ain't real until the central site sends back a transaction to update the local copy of the totals. Not an acknowledgment, 'cause that would be wrong in a multi-site environment. Instead, the transactions tell us the current level of what it is we just changed.

The local sites have a table that includes the time of the most recent data refresh, so if you call up data, you would see the last time of refresh from the central site. You would know just how old it is. And the central site is the authoritative data source. All local copies of anything are non-authoritative. (Which is why they need postings from the central site.) A system reboot or a network crash can be expensive, because the need to resync is expensive. However, if you don't do it this way, then you can do nothing at all when the network is down.

The key here is first, the concept of being only "near-real-time" - which requires the customer to make the decision of "how near must 'near' be in order to meet the requirements?" If the answer is that only real time is acceptable, you cannot do it. It ain't physically possible in a distributed environment if you don't accept SOME level of lag time. And that is because networks crash. Never mind why they do. Never mind how they do. Just know that they do.

With a local non-authoritative system, you can still do POS work within the limits of the data you have. And the reason that generally saves your butt is that things don't move impossibly fast in a POS network.

Let me digress. In the oil/gas pipeline industry, we were considered real time when we took 30-120 seconds to respond to pipeline changes. Why? Because the inertia of moving oil, the slow nature of the massive "block" valves, and the viscosity of product made it impossible within the laws of physics for any state change to happen in less than 2 minutes. Therefore, if we got data sooner than 2 minutes, we were EFFECTIVELY real-time.

I'm thinking that realistically, in a POS terminal, very few things change fast enough that you need a lot of massive traffic. So part of your solution would be to determine the latency of the business flow. (No, NOT the network latency.) How long does it really take for something to materially change? If you can make "near-real-time" fall inside that time limit, you have it beat. You then advertise to the customer that a transaction using this method will take x seconds but will be accurate. You further advertise that with network failures, your transactions can still go on but you will have some issues to be addressed for cases where you are at or near your inventory re-order point.
 
I believe we just described the exact same scenario.

Real time in this system will be the interval that all of the stores combined take to update the server, which is in the neighborhood of 15 minutes. Reorder data is computed at the warehouse level in increments of one week; each store receives transfers from the warehouse once a week.

There will be no catastrophic cost to any inventory data in increments of less than one week.

The purpose of the near real time updating of the server is for stores to be able to run certain queries on the inventory levels pertinent to a specific item of inventory to provide an individual customer with an alternate location, offer them an expected delivery date, or to add them to the mail order list for shipping to their location.

None of this will require real time in any increment less than an hour or so. The real and paramount requirement is to never lose the ability to perform a sales transaction during network problems. Obviously a problem with the credit card processor is untenable and will just have to be dealt with.

But that is the same condition that all retail stores contend with if they accept credit cards.

So to recap.
1. Stores must operate autonomously.
2. Using a separate database, stores transfer data to and from the main server on a timed basis and must check data validity at each transfer.
3. Some method to report and repair the inevitable data loss during transfers.
4. Queries to the server for inventory purposes must be limited to pass-through queries to the main database located on the server (see the sketch below the recap).

The bottom line is that it is imperative that two separate systems are located in the stores: one to process transactions, and the other to transfer data to and from the host.
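
For item 4, here is a minimal sketch of building a temporary pass-through query on the fly; the ODBC connection string, driver name, and the table and column names are placeholders:

' With a pass-through query, Jet sends the SQL untouched to MySQL,
' so nothing is fetched locally first.
Dim db As DAO.Database
Dim qdf As DAO.QueryDef
Dim rs As DAO.Recordset

Set db = CurrentDb()
Set qdf = db.CreateQueryDef("")   ' temporary, unsaved QueryDef
qdf.Connect = "ODBC;Driver={MySQL ODBC 5.1 Driver};" & _
              "Server=mainoffice;Database=pos;Uid=branch;Pwd=secret;"
qdf.SQL = "SELECT store_id, qty_on_hand FROM inventory " & _
          "WHERE item_no = 12345"
qdf.ReturnsRecords = True

Set rs = qdf.OpenRecordset(dbOpenSnapshot)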

Thanks everyone for helping me clarify this problem.
 
Just as a suggestion, if at all possible, I would add another level of redundancy to the system.

0.4
Multiple POS terminals. Redundancy of core functionality.

0.5
All POS terminals must be able to operate autonomously to achieve their core functionality.
That will mean that each POS terminal's core functionality needs to be defined.

That definition needs to address such things as: -
1. Paper receipt required. Yes/No
2. Credit facilities required. Yes/No
3. Automatically storing transaction history. Yes/No. Perhaps a manual balance when the server comes back on line.

Given, say, a (singular) store server disk head crash, does the store need to close its doors to business?
Or, does the store stay in business but at some reduced functionality?

Chris.
 
I agree, every cash register must run completely independently of the rest of the system. Each will have its own printer and connection to the Credit Card Servers.
 
thales - i think the point that is being made is that a strategy that expects live real time updating of data from remote sites is prone to failure.

so if you are trying to implement routines that depend on this, such as just-in-time stock replenishment - then it is liable to lead to problems.

it's your system - but what others have tried to say is that the ramifications of this should have been considered as part of your initial feasibility study - and the robustness or otherwise of your data transfer mechanism is something you need to take into account.
 
Does it seem like I'm missing something?
 
If you have got a high-speed connection from the server site to your branches using VPN, then you could consider using thin clients (with local printers) and a Terminal Server. To mitigate damage to the database caused by outages, the Session Timeout needs to be generous. It does require:

Decent upspeed on the server site
Business broadband at each site (low contention)
A Terminal Server

So far as busting Access over the WAN goes, I have seen users jump from one PC to another and simply resume the previous (same) session.

You take the local printer with each Terminal Server session; although the drivers have to be loaded on the server, the remote printers do not have to be physically set up.

All a Terminal Server is doing is screen dumps, so little traffic actually goes across the WAN. I have done this with a database with 16,000 images, serving two sites, one in NY.

Never say never!

Simon
 
