Table Performance

thechazm

VBA, VB.net, C#, Java
Local time
Today, 17:40
Joined
Mar 7, 2011
Messages
515
Ok so I know there are plenty of tips and tweeks out there that you can do for table performance but I cannot find the ultimate solution for this specific table.

It houses the most data in the backend and is called tblShopCode. This houses all of the schools data for everyone. I'll try to cleanly put down the table field names and their types.

[Table Structure]
*[ID] - Number - Primary Key - Indexed
One index - [Personnel Number] - Number
----------- [Course Number] - Text
[Completed] - Yes/No
[SchoolBy] - Number
[Critical] - Yes/no
[Shop] - Text
[Expire Date] - Date/Time
[Priority] - Number
[Note] - Text
[Note Date] - Date/Time
[Schedualed Date] - Date/Time
[Need to Schedule] - Yes/No
[N/A] - Yes/No
[Refresh Date] - Date/Time
[Refresh Time] - Date/Time
[Record Status] - Number
[Record Load Date] - Date/Time

The problem comes into play when everyday I have to check these records against another system and apply the changes in this table if there is a change then inform the user and the individuals of the change.

The performance hit comes from when I search for the record using
[Personnel Number] and [Course Number]. Then also when it has I have to do .update to update that record.

The table itself has 187785 records. When I had only less than 80,000 it worked just fine so I am guessing its from the indexing but I don't know what more to do.

Any help on this is appreciated.

Thanks,

TheChazm
 
What is your search query?
Why not add an index on Course Number?
 
Here is my search string
rsATMSInfo.FindFirst "[Personnel Number] = " & stblAutoNumber(i) & " And [Course Number] = """ & AtmsCoursesID & """"

The reason why Course Number is not a number or indexed is because these value's are being pulled directly from our other system by reading the contents of the web site. Course Number on the website is just plain text.
 
I am not sure your table is normalised correctly see below:

[Table Structure]
*[ID] - Number - Primary Key - Indexed
One index - [Personnel Number] - Number
----------- [Course Number] - Text - Course numbers should be stored somewhere else as there could be many course numbers to one person
[Completed] - Yes/No
[SchoolBy] - Number - depending on the relevance, not sure if this could be one to many
[Critical] - Yes/no
[Shop] - Text - Would this have many values to one person?
[Expire Date] - Date/Time
[Priority] - Number - This could be stored in another table too as an option
[Note] - Text - Surely this could be one to many (many notes to one person)

[Note Date] - Date/Time - Surely this could be one to many (many notes to one person)

[Schedualed Date] - Date/Time - Assuming there could nbe many schedules another table with detail too
[Need to Schedule] - Yes/No - same as above
[N/A] - Yes/No - Could be stored in another table with an option
[Refresh Date] - Date/Time
[Refresh Time] - Date/Time
[Record Status] - Number
[Record Load Date] - Date/Time - Record load dates could be something like below:

tblRecordLoadDates
pkdteRecordLoadID
RecordLoadDateDetail - Short date


Essentially the more that the one to many items are split away from the main table the more efficient it will become.

To give you an example I just created a database for a friend and they didnt really have that many options to store but the amount of one to many options made it necessary to create 10 different tables all linked with a unique relationship.
 
I apprecaite the informative reply. I think I may need to explain this a little more in detail. The table that I listed is already a part of a one to many relationship with the personnel table.

So what I have is:
[Personnel Table] that has all the personnel id's and info tied in it.
[tblShopCode] Which is the one I am having the issue with is on the many side from Personnel Table.

Now to get to the specifics and please correct me if I am wrong as I am just trying to understand the best approach.

[Table Structure]
*[ID] - Number - Primary Key - Indexed
One index - [Personnel Number] - Number
----------- [Course Number] - Text - Course numbers should be stored somewhere else as there could be many course numbers to one person. There is many to one person but never repeated with the same one. Here would be a small example.
[Personnel Number]----[Course Number]
---------1-----------------G565------
---------1-----------------G570------
---------2-----------------G565------
---------2-----------------G570------
---------3-----------------B001------
---------4-----------------B002------
So should I still split that? Plus I have to search for this match on every entry during the update to evaluate its contents versus the other system.

[Completed] - Yes/No
[SchoolBy] - Number - depending on the relevance, not sure if this could be one to many.
This field is used to calculate if the school is past due. Still do one to many? 1 = 1 Month, 2 = 2 Months, etc...

[Critical] - Yes/no
[Shop] - Text - Would this have many values to one person?
No same value. Actually looking into removing this as it no longer really serves a purpose. I have to go through my code to make sure its not being depended on.... 19,000 lines takes a while.


[Expire Date] - Date/Time
[Priority] - Number - This could be stored in another table too as an option
For this in another table would I need the table something like this?
[ID]
[Personnel Number]
[School ID]
[Priority Number]


[Note] - Text - Surely this could be one to many (many notes to one person)
There is only allowed to be one note per school entry as this note gets replaced with the location of where the class is being held and then wiped clean after the class is completed.

[Note Date] - Date/Time - Surely this could be one to many (many notes to one person)
This would follow the statement above because all we want is one note per person per course. I hope I am making some sense?

[Schedualed Date] - Date/Time - Assuming there could nbe many schedules another table with detail too
This is very specific to that course and that course only. Should I still split?


[Need to Schedule] - Yes/No - same as above
Again very specific to that entry. Split?

[N/A] - Yes/No - Could be stored in another table with an option
Again very specific to that entry. Split?

[Refresh Date] - Date/Time
[Refresh Time] - Date/Time
[Record Status] - Number
[Record Load Date] - Date/Time

I can code all day long but when it comes to normilization I am pretty weak. Thank you so much for your help and hope to hear back with what you think.

Thanks,

TheChazm
 
I apprecaite the informative reply. I think I may need to explain this a little more in detail. The table that I listed is already a part of a one to many relationship with the personnel table.

So what I have is:
[Personnel Table] that has all the personnel id's and info tied in it.
[tblShopCode] Which is the one I am having the issue with is on the many side from Personnel Table.

Now to get to the specifics and please correct me if I am wrong as I am just trying to understand the best approach.

[Table Structure]
*[ID] - Number - Primary Key - Indexed
One index - [Personnel Number] - Number
----------- [Course Number] - Text - Course numbers should be stored somewhere else as there could be many course numbers to one person. There is many to one person but never repeated with the same one. Here would be a small example.
[Personnel Number]----[Course Number]
---------1-----------------G565------
---------1-----------------G570------
---------2-----------------G565------
---------2-----------------G570------
---------3-----------------B001------
---------4-----------------B002------
So should I still split that? Plus I have to search for this match on every entry during the update to evaluate its contents versus the other system.
This answer will probably be repeated further on. Essentially where ever there is scope for more than one option per record it should be stored in a seperate table. I will illustrate the above below:

Main Table:

pkPersonnelID - Autonumber
NameDetail
etc etc
fkCourseNumber - Number field

tblCourseID
pkCourseID - Autonumber
CourseDetail - text

a relationship shared between the main table and tblCourseID



[Completed] - Yes/No
[SchoolBy] - Number - depending on the relevance, not sure if this could be one to many.
This field is used to calculate if the school is past due. Still do one to many? 1 = 1 Month, 2 = 2 Months, etc...

This one could be stored elsewhere as follows:

tblPersonnel
pkPersonnelID - Autonumber
NameDetail
etc etc
fkCourseCompleteID - number

tblPersonnelCourseCompleted
pkCourseCompletedID - Autonumber
CourseCompletedDetail - text (can be used as a combo in a form then)

[Critical] - Yes/no
[Shop] - Text - Would this have many values to one person?
No same value. Actually looking into removing this as it no longer really serves a purpose. I have to go through my code to make sure its not being depended on.... 19,000 lines takes a while.

Would help to know the relevance of Shop.


[Expire Date] - Date/Time
[Priority] - Number - This could be stored in another table too as an option
For this in another table would I need the table something like this?
[ID]
[Personnel Number]
[School ID]
[Priority Number]

As above really. you would have a table like:

tblPriorty
pkPriorityID - Autonumber
PriorityDetail

to be used as a combo in a form. My reason for using this method as well as normalisation is that its easy to make a query on all personnel with priority "1" for instance. It gives you more flexibility.


[Note] - Text - Surely this could be one to many (many notes to one person)
There is only allowed to be one note per school entry as this note gets replaced with the location of where the class is being held and then wiped clean after the class is completed.

Even though you say it could be only one note it gives you the flexibility of adding more just incase.

[Note Date] - Date/Time - Surely this could be one to many (many notes to one person)
This would follow the statement above because all we want is one note per person per course. I hope I am making some sense?

Same as note

[Schedualed Date] - Date/Time - Assuming there could nbe many schedules another table with detail too
This is very specific to that course and that course only. Should I still split?

If this is specific to the course then it needs to feed from a course table and not the personnel table. So thats the reason to split it off. So for example you have a course table with all your courses stored, because course throughtout the years ahead could change, then you would need a table linked to the course table with dates etc.......

[Need to Schedule] - Yes/No - same as above
Again very specific to that entry. Split?

linked to the course table maybe. for example:

tblCourse
pkCourseID - autonumber
CourseDetail - text
fkNeedtoReSchedule - number

tblCourseReschedule
pkRescheduleID - autonumber
RescheduleDetail - text


[N/A] - Yes/No - Could be stored in another table with an option
Again very specific to that entry. Split?

The only thing is this prevents using a lookup in the table itself and allowing you to use a combo box in a form to do the work for you, taking the burden away from your main table.

[Refresh Date] - Date/Time
[Refresh Time] - Date/Time
[Record Status] - Number
[Record Load Date] - Date/Time

I can code all day long but when it comes to normilization I am pretty weak. Thank you so much for your help and hope to hear back with what you think.

Thanks,

TheChazm

Hope the above helps and I know it seems a lot of tables for little data, but as your data grows you will see the benefits. I too when I first started out had to learn the hard way. I thought I produced this great database until an MVP ripped it to pieces.

I always use the rule that if there are options then it goes in a table of its own, if there are many criteria to one item then it goes in a seperate table and use the relationships to bring them together.
 
Thanks again! Very informative. It will take me a while but I understand where your going. Owwww the code is going to hurt :eek: but if it makes it better/faster then I'll do it. Thank you again and I hope this post benifits others.

TheChazm
 
There is many to one person but never repeated with the same one. Here would be a small example.
[Personnel Number]----[Course Number]
---------1-----------------G565------
---------1-----------------G570------
---------2-----------------G565------
---------2-----------------G570------
---------3-----------------B001------
---------4-----------------B002------

Based on the above data, you have a many-to-many relationship between the person and the course. A person can have many courses but also a course (ie G570) can apply to many people. To properly handle this you need a junction table

You of course need your personnel table, but you will need a table to hold the basic course info.

tblCourses
-pkCourseID primary key, autonumber
-txtCourseNumber
-txtCourseName


The juction table would look like this:

tblPeopleCourses
-pkPeopleCourseID primary key, autonumber
-fkPersonnelID foreign key to tblPersonnel
-fkCourseID foreign key to tblCourses

Now I also see a series of date/time fields related to the course/person combination.

[Completed] - Yes/No
[Expire Date] - Date/Time
[Schedualed Date] - Date/Time
[Refresh Date] - Date/Time
[Refresh Time] - Date/Time
[Record Load Date] - Date/Time

These dates appear to be tracking activities: schedule, refresh load, expire. In fact, I would probably classify the completed yes/no field as an activity as well and instead of recording yes/no, just record the completion date. If my interpretation is correct (please correct me if I am wrong), then you have a one-to-many relationship--one course/person combination to many activities. Of course and activity can apply to many course/person combinations so technically you actually have another many-to-many relationship

First a table to hold all possible activities (completed, expire, scheduled etc. each as a record)

tblActivities
-pkActivityID primary key, autonumer
-txtActivity

Now relate the person/combination and activity in the juction table.

tblPersonCourseActivity (you can pick another name if you wish)
-pkPersonCourseActivityID primary key, autonumber
-fkPeopleCourseID foreign key to tblPeopleCourses
-fkActivityID foreign key to tblActivities
-dteActivity (a date/time field to hold when the activity related to the person/course combination occurred or is to occur)
 
It is hard to tell from just seeing the relationships what you are doing with the database. I do see field names that repeat in various tables which makes me suspicious that your design is not normalized, but to be able to help we need more info on what you are trying to do.

However, going back to your original issue about the database being slow

Code:
The problem comes into play when everyday I have to check these records against another system and apply the changes in this table if there is a change then inform the user and the individuals of the change.

The performance hit comes from when I search for the record using
[Personnel Number] and [Course Number]. Then also when it has I have to do .update to update that record.

How are you getting the data from this other source?
What type of source?
Where is the source located?
Are you using a linked table?
What is your process for finding and updating the specific records?
 
How are you getting the data from this other source?
I developed code to read a website that houses the data.

What type of source?
Website local intranet

Where is the source located?
On Intranet

Are you using a linked table?
Yes using linked tables in the main program but the tables get copied to the local computer for processing. This is to avoid network slowdowns.

What is your process for finding and updating the specific records?
.findfirst using the ID of the individual and the Text of the course since its reading it from a website.

Here is the exact code for the find:
Code:
rsATMSInfo.FindFirst "[Personnel Number] = " & stblAutoNumber(i) & " And [Course Number] = """ & AtmsCoursesID & """"
 
Yes using linked tables in the main program but the tables get copied to the local computer for processing. This is to avoid network slowdowns.

OK, I'm a little confused. You pull data from the intranet site via some code you developed but then you dump it into a file located on your local computer. What form is this data in -- an Excel file? You then update your local Access database with any new or changed info?

Out of curiosity, is there a database backend to the website such as SQLServer, Oracle or MySQL?

Is the data you are pulling from the website date stamped as to when it was last changed or a date when it was added? You could use this to filter so you do not have to search through as many records.

If you are using a linked table, have you created a query that first identifies those records that have changed (duplicate query wizard would show those that have not you can use this to filter) and have you used the unmatched query wizard to identify any new records that have been added?
 
I will try to answer these to the best of my ability. The total explination is a little long but here goes.

OK, I'm a little confused. You pull data from the intranet site via some code you developed but then you dump it into a file located on your local computer. What form is this data in -- an Excel file?
1. The users log into the access program by using an icon on there desktop. This loads the front end which is automaticlly updated when I make changes to the network version.

2. They then select the location of where they want to work (or in other words what data/personnel do they want to see and work with).

3. This then goes to the network and grabs the latest backend's for that location and puts the backend databases in their my documents folder.

4. Deletes all the relationship's and links then restablishes them all from 3 tables I setup to administer the database locations, tables, and relationships. (To get to this point takes only about 10 seconds)

5. Once the program is loaded then they have the option to update their specific shop/code/personnel to a server that has lost its funding but is still the core point of inserting data for schools.

6. This then goes to the websites locating the individuals they wanted to update and reads the website contents and saves it directly into the access database.

7. During this time it is checking to see what schools have changed/modified/deleted and produces two reports. One report is showing the changed schools the other shows any users not found in the website.

8. This allows for us to expand our capabilities by using systems I designed to aid in their everyday work and reports they have to do.

This program will eventually replace the ones that its syncing from with the c# version I am writing but I am having slow downs in the table that houses the schools data when it tries to find the entry and update it. It has 187,000 records or so in there and it didn't start becoming slow until I got around 80,000.

I originally thought it was my indexing that was causing the hangup's so I removed all indexes and setup on the ones I was searching on. This did however speed it up but its still pausing. And where its located on the local system it should not be doing this.

9. After all the records have been updated it then sync's only the records that was marked as modified/new/deleted to the backend verion on the network using my [Record Status] field to complete this task.

You then update your local Access database with any new or changed info?
From the website yes.

Out of curiosity, is there a database backend to the website such as SQLServer, Oracle or MySQL?
The website is fueled by Oracle

Is the data you are pulling from the website date stamped as to when it was last changed or a date when it was added? You could use this to filter so you do not have to search through as many records.
No date stamps.

If you are using a linked table, have you created a query that first identifies those records that have changed (duplicate query wizard would show those that have not you can use this to filter) and have you used the unmatched query wizard to identify any new records that have been added?
Even though its a linked table the backend tables are being utilized locally on the computer to eliminate bandwidth issue's. It does a record by record sync when the users close the forms. This is actually very fast and extremely accurate as I control my own numbering system.

It puts any changed schools into the changed schools table. This gets wiped out everytime you start the update so not to see old updates.

Thanks,

TheChazm
 
Personally and I have done this many times in the past. I would be using MySQL or SQL Server as my database for internet/intranet use. Access has a limit to the amount of people that can log in at any given time. I think its something like 20 but dont quote me.
 
Even though its a linked table the backend tables are being utilized locally on the computer to eliminate bandwidth issue's. It does a record by record sync when the users close the forms. This is actually very fast and extremely accurate as I control my own numbering system.

The above seems to part of the issue. Ideally the backend should be stored on a server and not a local computer.
 
Since the user is using the mdw file remotely and the database locally it only ever has one user at a time in it. Only when it sync's does it create a remote connection and this is only temporary but yes I would like to use a full blown server with c# but not possible just yet.
 
The main backend is stored on a server file share but is not a sql server.
 
This program will eventually replace the ones that its syncing from with the c# version I am writing but I am having slow downs in the table that houses the schools data when it tries to find the entry and update it. It has 187,000 records or so in there and it didn't start becoming slow until I got around 80,000.

From your description it sounds like the search and update function is done only when all the data is within Access (and all external links are severed).

I would probably filter the data set first based on whatever site you pulled the data from and then apply the search/update routine. Alternatively, instead of searching record by record use the duplicate query wizard to get the primary key values of those records that have not changed and use that as a nested query to pull records that have changed.

SELECT * FROM table WHERE location=x and ID in (find duplicate query text goes here)

You can then run the update for just the records returned from the query above without having to go through all 187000 or so records
 
I appreciate the comment but it already does that. What it does is uses the personnels badge number and passes it to the website locating that specific person. Then it evaluates what has changed and what has not. If there is no changes it does no updates. It does however have to search through the filtered personnel list to toggle each one.

Thanks,

TheChazm

P.S. I have done benchmarks and it is the .findfirst meathod that is hanging. If the record is located before the 80,000 records it screems by and does its job. If its larger than 80,000 it applies a 3-4 second pause for each search.
 
I'm confused again. I thought you said that all the searching and updating took place within Access:

6. This then goes to the websites locating the individuals they wanted to update and reads the website contents and saves it directly into the access database.

7. During this time it is checking to see what schools have changed/modified/deleted and produces two reports. One report is showing the changed schools the other shows any users not found in the website.


But the following statement implies that the searching and updating is not done totally within Access:

What it does is uses the personnels badge number and passes it to the website locating that specific person. Then it evaluates what has changed and what has not.

If the searching/updating is not being done totally within Access, I would have to go with chrisguk's cautions about using Access over the internet.
 

Users who are viewing this thread

Back
Top Bottom