Number of Indices

MatMac

Hi

I'm trying to link a table to about 35 others with referential integrity and cascading updates/deletes enforced. I'm being told that the table has too many indices. (I understand Access' limit is 32 per table.)

Am I correct in assuming that, if RI is enforced, an index is created in the parent table in order to establish the link? (I can successfully establish all the links if I don't try to enforce RI.)

Given that I really do need all these tables and links with RI enforced, any suggestions for a way round this?

Do later versions of Access permit more indices per table? (I'm using Access 2000.)

Many Thanks - MatMac.
 
To start with, make sure that you don't have any extraneous indexes. A2K and newer ships with a default (AutoIndex on Import/Create, under Tools/Options) that creates indexes on fields whose names end with ID, NUM, and a few other strings. Most of these are completely unnecessary since they duplicate the primary key. Go to Tools/Options and clear that option, then open each table and remove any of these bogus indexes.
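
If it helps, here is a rough DAO sketch (run from a standard module; in A2K you need the Microsoft DAO 3.6 Object Library reference ticked) that lists every index in every local table so the duplicates of the primary key stand out:

Public Sub ListAllIndexes()
    ' Print every index on every local table to the Immediate window.
    ' Indexes created for relationships are flagged as "(relationship)".
    Dim db As DAO.Database
    Dim tdf As DAO.TableDef
    Dim idx As DAO.Index
    Dim fld As DAO.Field

    Set db = CurrentDb
    For Each tdf In db.TableDefs
        If Left$(tdf.Name, 4) <> "MSys" Then      ' skip the system tables
            For Each idx In tdf.Indexes
                Debug.Print tdf.Name & ": " & idx.Name & _
                    IIf(idx.Primary, " (primary)", "") & _
                    IIf(idx.Foreign, " (relationship)", "")
                For Each fld In idx.Fields
                    Debug.Print "    " & fld.Name
                Next fld
            Next idx
        End If
    Next tdf
End Sub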

Secondly, although you probably don't want to hear this, your design is suspect. Having a single table related to so many others sounds like a design problem. I recently ran into this problem with a db created by one of my clients. The db was very complex and included over 100 tables, many of which were related 1-1. The client didn't have any trouble because he didn't bother to set primary keys or enforce RI, but as soon as I started cleaning up the mess, I ran into this problem. The client didn't want me to take a month to re-design the db, so instead we converted the back end to SQL Server, which has a higher limit.

I have over 30 years' experience in designing applications. I have managed multi-million-dollar, multi-year projects covering all aspects of a business, from order entry to shop floor control to credit card authorization. NOT ONE of these applications ever exceeded 255 columns in any single table or more than about 10 individual relationships.

One scenario that I can imagine running into problems is a shared lookup table. If you use a single table to hold all the lookup values for your app (as I do), you might run into a need for more than 32 relationships with that table. In that case, I would probably create a second lookup table. RI is too important not to enforce.
 
Pat

Thanks for this. Clearly your observations are of much concern to me, but at the same time, I'm always happy to listen and learn from those with more experience than I have. (And in the case of database development, I'm pretty inexperienced!)

Would it be inappropriate for me to ask you to send me your e-mail address so that I could contact you directly on this issue? What I would like to do is send you a short precis of what my database needs to do and why I have set it up the way I have. If you then wished to offer any comments or suggestions on a more suitable model, I could then weigh up whether it might be worth restructuring. My e-mail address is below:

Thanks - Mathew.

Dr Mathew D. Mackenzie
M.Mackenzie@kent.ac.uk
 
I earn my living as a consultant. I volunteer time here because it helps so many people. Conducting private consultations for individuals would take away time that I would otherwise spend answering public questions. If you need private consulting, we can discuss rates. Otherwise, post your questions here. I will answer them as I have time, as will the other members.
 
Pat - I quite understand, and apologies for any misunderstanding. I wasn't after any more from you than you provide within this forum. I simply thought it might be easier for me to lay out the problem and mail it to you rather than type it into this form. Also, given that you have suggested a flaw in my design, I thought you might wish to elaborate.

Anyway, below is a short problem description, which also mentions my table layout and why I chose it. If you can explain why you think this is wrong, it would be most helpful.

***********

The database is used to store the results of comprehensive healthcare assessments conducted on older people. Each record constitutes a single assessment on a single person. Data are anonymised and consented for our use.

An assessment comprises over 400 questions, divided into approx. 20 sections. One section might be, e.g., “cognitive function”, and another, e.g., “disease diagnosis”. Each field constitutes a single question. Some responses are textual, although most are single-digit numeric codes. One datum is hence a single response to a single question within a single assessment.

Assessments for residential and community-dwelling people are different, discrete assessment types, with variations in both question meanings and response codes. Residential and community assessments must therefore be stored separately.

The purpose of the database is to centralise assessments conducted by a variety of services providing care to older people. Data are provided electronically, as text files, and imported into the database. A variety of service providers, each using different software themselves, means a number of different specifications for the text files received. Assessment data are not typed into the database directly.

At present, the database is a single-user system, with the administrator (me) importing received data sets and exporting groups of assessments into statistical software for analysis and reporting back to service providers.

Assessments represent a snapshot in time of a person’s healthcare status. Hence, once assessment records are added into the database, individual fields (responses) are not altered or deleted (except for automatic cleaning to remove out of range values). Subsequent assessments on the same person are expected, and are stored as separate assessments.

Aside from during the development process, it is unlikely that assessments will need to be deleted from the database, although this facility is desirable.

STRUCTURE
· An individual or organisation wishing to send us data and in return receive statistical reports on their patients, is termed a SUBSCRIBER.
· A subscriber may run one or more FACILITIES (care homes or community teams assessing older people).
· Each facility sends us assessments in one or more BATCHES.
· A batch can contain one or more ASSESSMENTS.
· The older person on whom an assessment is undertaken is termed a CLIENT. A client may have one or more assessments within a single batch, may have one or more assessments within the database as a whole, and may, throughout their history, be assessed by one or more facilities.

Each of the entities listed above is a separate table within the database, linked accordingly. An assessment, however, is stored across multiple tables. Basic assessment details common to both residential and community assessments (e.g. date & client ID) are stored in a table named ASSESSMENT. The assessment response data are then stored in separate tables, with one table for each discrete section within the assessment, because:

· 400 is too many fields for a single table.
· Within an assessment, some but not all sections are completed in each case. Storing each section in a separate table is hence a more efficient use of space.
· An assessment section represents a specific healthcare DOMAIN, e.g. cognitive function. It is helpful to review data within a single domain as a single table, rather than in combination with data from other domains.

The ASSESSMENT table must be linked to each DOMAIN table. Given approx. 20 sections in each of the community and residential assessments, this is approximately 40 links from ASSESSMENT, and therein lies my problem if I wish to enforce RI, which I do if I ever wish to delete assessments.

Is that enough information?

Regards

Mathew.
 
You have a typical survey application. There have been many posts here regarding the proper way to structure a survey db. Your basic problem is that you have created tables that look like spreadsheets, with each question in a separate column. This is what is causing the numerous 1-1 tables and the excessive number of columns. When you look at the suggested survey structures, you will notice that they all have a table that defines the questions, and then another table that joins the questions table with the survey/respondent table and holds the value of the response for a particular question from a particular respondent.
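
As a sketch only (the table and field names here are made up, and you could just as easily build these in table design view; Access runs one DDL statement per query, so each CREATE TABLE would be its own query), the heart of that structure might look like:

CREATE TABLE tblQuestion (
    QuestionID    LONG      CONSTRAINT pkQuestion PRIMARY KEY,
    QuestionCode  TEXT(30),
    SectionName   TEXT(50),
    QuestionText  TEXT(255)
);

CREATE TABLE tblResponse (
    AssessmentID  LONG NOT NULL
        CONSTRAINT fkAssessment REFERENCES tblAssessment (AssessmentID),
    QuestionID    LONG NOT NULL
        CONSTRAINT fkQuestion REFERENCES tblQuestion (QuestionID),
    ResponseValue TEXT(255),
    CONSTRAINT pkResponse PRIMARY KEY (AssessmentID, QuestionID)
);

With that in place, an answer is just a row (assessment 123, question 17, value "2"), and the ASSESSMENT table only needs a relationship to tblResponse rather than to 40 domain tables.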

Hopefully you don't have a lot of data entered. It will be a tremendous amount of work to normalize the structure once it is populated. It would require a query for EVERY column (question) that needs to become a row. As long as there is no data, you can just delete the extraneous tables/columns and add the properly normalized ones. This of course also has a big impact on forms/reports. If you have none, that's great, because creating forms/reports for the properly normalized structure will be simple and fast compared with the amount of work required to build forms/reports on the unnormalized structure.
 
Pat. Thanks very much indeed for your reply. I have also reviewed some of the archives and the responses you provide therein. The reason I didn't go down the fully normalised route to start with was mainly due to the format I receive my data in. Data come in electronically as fixed-width text files, in a flat, spreadsheet-type format. Given the fully normalised structure you suggest, I couldn't figure out how to get the data into the tables, with one row per individual question/response tuple. I don't know VBA, and so am limited to Access' built-in macro language. I would, however, like to get my DB structure correct, so if you have any suggestions to point me in the right direction I would be very grateful.

Similarly, I would not know how to export the normalised data into flat files again to go into statistical software (SPSS).

Whilst most questions have single digit numeric responses, some are dates and some are free text. How would I best structure my "questions" table(s), to avoid having to format the response code field itself as text to store all types of response?

Finally, with one row per question and 400 questions per assessment, it is very likely I could end up with tens of millions of rows in this table. Reading forum threads, this seems OK, limited only by file size.
 
It is possible to "normalize" a flat table by using queries. The problem is that it takes a query for EACH column, so if you have 100 columns, you need to run 100 queries. A code loop would be less work if you knew VBA. If you intend to analyze the data with SPSS rather than Access, can you tell us why you are bothering to import the data into Access at all? It is hard to provide sound advice in this area without knowing what you are trying to accomplish with the Access database.
 
Yes, that would be a lot of work to normalise using queries. It would also require temporary flat-file tables to store the data until they were normalised. What I wondered was how difficult it would be to import data from flat text files into fully normalised Access tables, and how this could be achieved.

The reason I am using an Access database is basically to try and store the data we receive in as efficient a manner as possible. It is quite possible that we will end up with hundreds of thousands, even millions, of records, in which case an SPSS flat file with this many records and over 400 fields would be ridiculously clumsy, even if SPSS permitted such a large file. As I mentioned before, an assessment may have only a small part completed, in which case the SPSS flat file would contain mostly nulls. At least using my partly normalised database, which has one table per assessment section, I can delete section records with all null values after import, keeping only those section records with one or more valid entries. Of course, I would like to go all the way and realise a fully normalised structure.

The data analysis we perform is on specific sets of assessment records, which I export to SPSS for this task. There are a large number of quite complex algorithms which run on our data to provide a variety of "health outcomes" scales. These have historically been written as SPSS syntax, so at present it is far easier to export and analyse rather than re-code these algorithms in Access. This, however, is a future possibility. Of course, we could simply keep lots of individual SPSS data files, one for each "set" of records. However, as with most applications, we have future objectives and desire a certain flexibility, which would be best served by holding all our data in an efficient, centralised manner.
 
Importing flat-file data into normalized tables can be done. It is just tricky.

One way is to import the data into a QUERY that does the join based on a pre-defined relationship. The query must not contain any functions and must address all of the fields that you get from these flat files. But even that can be awfully tricky when enough tables are involved.

Another way involves writing some VBA code. You can import the flat file, then write a routine that consists of a lot of little snippets that all look the same, namely: open a recordset, add a new record, populate the record, update the record, close the recordset. You could theoretically write a subroutine to do this if the secondary tables are simple enough.
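
A rough sketch of that kind of loop, assuming a flat import table called tblFlatImport and normalized tblQuestion/tblResponse tables (all names made up; your field names will differ), might look like:

Public Sub NormaliseFlatImport()
    ' Walk the flat import table and write one row to tblResponse
    ' for every question that actually has an answer.
    Dim db As DAO.Database
    Dim rsIn As DAO.Recordset
    Dim rsOut As DAO.Recordset
    Dim fld As DAO.Field

    Set db = CurrentDb
    Set rsIn = db.OpenRecordset("tblFlatImport", dbOpenSnapshot)
    Set rsOut = db.OpenRecordset("tblResponse", dbOpenDynaset)

    Do Until rsIn.EOF
        For Each fld In rsIn.Fields
            ' Skip the key column and any unanswered questions
            If fld.Name <> "AssessmentID" And Not IsNull(fld.Value) Then
                rsOut.AddNew
                rsOut!AssessmentID = rsIn!AssessmentID
                ' tblQuestion maps each flat-file column name to a QuestionID
                rsOut!QuestionID = DLookup("QuestionID", "tblQuestion", _
                    "QuestionCode = '" & fld.Name & "'")
                rsOut!ResponseValue = fld.Value
                rsOut.Update
            End If
        Next fld
        rsIn.MoveNext
    Loop

    rsIn.Close
    rsOut.Close
End Sub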

WARNING: Won't be the fastest thing you've ever seen.
 
Yes, that would be a lot of work to normalise using queries. It would also require temporary flat-file tables to store the data until they were normalised.
- it actually doesn't require ANY temporary tables. Each query selects the data from a single column and, along with the key info, appends records to the final table. Once all the queries finish, the normalized table is completely populated.

This certainly has the potential to produce a large number of records. However, you can ignore nulls. So, if your flat records are sparsely populated, you can control what the queries select so they won't produce rows for empty answers.
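
One of those append queries might look something like this (names are made up; Q17_MemoryRecall stands in for one column of the flat import table, and 17 for the matching QuestionID):

INSERT INTO tblResponse ( AssessmentID, QuestionID, ResponseValue )
SELECT f.AssessmentID, 17 AS QuestionID, f.Q17_MemoryRecall
FROM tblFlatImport AS f
WHERE f.Q17_MemoryRecall IS NOT NULL;

You run one of these per column; the WHERE clause is what keeps the empty answers out.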

To export the data as a flat file, you would export a crosstab query that flattens the data rather than exporting the table.
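
A sketch of such a crosstab (again with made-up names):

TRANSFORM First(r.ResponseValue)
SELECT r.AssessmentID
FROM tblResponse AS r
GROUP BY r.AssessmentID
PIVOT r.QuestionID;

Each QuestionID becomes a column heading again, giving you one flat row per assessment to export.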

All in all it seems like a lot of work to take flat files in, normalize them, and then export them as flat files again but do nothing with them in the Access db.
 
Pat, DocMan. Many thanks for your replies.

If I may summarise: the main use our data will be put to is analysis using SPSS. However, as this database will grow rapidly and constantly, we require an efficient form of storage, and using SPSS to store the data files would be clumsy and inefficient. The reason I chose Access was to attempt to store these data as efficiently as possible. However, as you point out, Pat, the table structure I use is not fully normalised, the reason being that our data arrive electronically as flat files and it is much easier to import into my chosen structure. Aside from my original problem (see the start of the thread) that Access does not permit sufficient indices for me to link my tables with RI enforced, the database works well despite its shortcomings.

I am keen to adopt a fully normalised structure, but only if I am able to reliably create import/export routines. (There's also something rewarding about doing things the right way!) This, despite a large number of potential records (tens of millions), would save a lot of space, as, in some cases, many fields can be Null. DocMan, I'm sorry, but I do not understand your suggestion on how to create a query to import flat-file tables into a normalised structure. If you can provide even a basic example, this would help me a great deal. Pat, reading your last message, I'm a little confused, as not using a temporary table implies to me that the TransferText command would be used to import directly into a query. I did not realise that this was possible.

Given my scenario and current uses, Pat/DocMan, if you think that, despite an effectively poor database structure, I am best advised to leave things as they are, do let me know. I note from the archives that there are some who believe that, in some cases, it is counterproductive to fully normalise.
 
The usual justifications for normalization are missing from your application.

1. Enforce referential integrity - although you are attempting to do this, there doesn't seem to be a pressing need. All you are doing is importing and exporting flat files. You won't have users updating and adding data so there is less danger of records being orphaned.
2. Efficiency of storage - depending on the number of sparse fields, your solution might actually require less space.
3. Data analysis - you have no requirement to analyse the data within Access; you are exporting to SPSS to do the analysis. Access doesn't come with the statistical functions you would probably need, although you can purchase them from third-party vendors.

TransferText command would be used to import directly into a query. I did not realise that this was possible
- Not exactly. You would use the TransferText to "link" the external file rather than import it. Then when you are done running the appropriate append queries or code, you would delete the link.
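
In code, that might look something like this (a sketch only; the spec name, file path, and query name are placeholders for whatever you use):

' Link the fixed-width text file using a saved import/export specification
DoCmd.TransferText acLinkFixed, "specResidentialV1", "tblFlatImport", _
    "C:\Data\Incoming\batch42.txt"

' Run the append queries (or the code loop) against the linked table
DoCmd.OpenQuery "qryAppendResponses"

' Remove the link when done; the text file itself is untouched
DoCmd.DeleteObject acTable, "tblFlatImport"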
 
Thanks very much. Lots for me to think about.
 
