massive data hierarchy advice

accessuser1023

Hi all,

I have a situation here where I'm going to be making a DB app that has massive data storage requirements, but conveniently it fits perfectly into a hierarchical tree structure (i.e., literally all of the relationships between the tables are one-to-many). Here is an example of the hierarchy I might have:

Code:
=> table 1
   => table 1 sub 1
   => table 1 sub 2
=> table 2
   => table 2 sub 1
=> table 3
   => table 3 sub 1
      => table 3 sub 1 subsub 1
   => table 3 sub 2
      => table 3 sub 2 subsub 2
   => table 3 sub 3
   => table 3 sub 4
=> table 4
   => table 4 sub 1
=> table 5
   => table 5 sub 1
   => table 5 sub 2
=> table 6
   => table 6 sub 1
=> table 7
   => table 7 sub 1
   => table 7 sub 2
=> table 8
   => table 8 sub 1
So here's a question for someone who might be better at handling large apps than I am:

How close am I getting to Access not being able to handle this? I am looking to create the app the way Oracle creates theirs (i.e., lots of forms!). Based on that hierarchy, I would assume the number of forms would be 50 or so. There isn't anything wrong with this, is there? I mean from a concurrent-user perspective? I will cover all the necessary issues such as write conflicts, record locks / form locks, etc. I'm just wondering whether the resource strain on Access itself is something that needs to be looked into. The user count would initially be 3, but it could get as high as 25. The other issue that might come into play is putting the datasets on SharePoint as SharePoint lists.

Any advice on this would be great guys. I'm actually looking to start a discussion regarding issues like this. It would certainly be useful for me, if not everyone else too.

thanks.
 
How big is massive? I've had tables with hundreds of thousands of records and performance is fine.

But otherwise, are you asking if Access is an appropriate tool?
 
Quote:
How big is massive? I've had tables with hundreds of thousands of records and performance is fine.

But otherwise, are you asking if Access is an appropriate tool?

I'm not really asking about the *size* of tables, but rather about all of the relationships that have to be formed between them, the forms built on top of those relationships, and then many people using those forms at once.

So yes, the question is about performance with this, and more than that, corruption issues. And also, yes I'm asking if Access is an appropriate tool for the scenario that I have outlined.

thanks.
 
Relationships between tables are not dynamic things; they don't require run-time resources. A field in one table matches a field in another, and that's essentially what a *relationship* is.

Forms are only used by one user at a time, at least in a "split database," which is what you want. Each user has his or her own dedicated front-end file, which links to the tables on the server, and the forms for that user live in that dedicated copy. In this case there is no bottleneck with respect to forms or their use.

Corruption is not a significant issue if your database is split as outlined in the previous paragraph. I've broken Access files in development, but I've never actually lost data to corruption. I don't believe that is an issue.

The tree structure you outline is not, at face value, complex. That could be represented in one table that is related to itself, with a structure like . . .
Code:
[B]tTree[/B]
TreeID (Primary Key)
ParentID (Foreign Key, link to PK in this table)
SomeDataHere
. . . so that is not going to put pressure on Access in any way.
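As an illustration of that self-referencing design, here is a minimal sketch using SQLite as a stand-in for Access (Access SQL has no recursive queries, so there you would walk the tree in VBA or with repeated self-joins; the names follow the tTree sketch above):

```python
import sqlite3

# Adjacency-list tree in a single self-referencing table, per the tTree sketch.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE tTree (
    TreeID       INTEGER PRIMARY KEY,
    ParentID     INTEGER REFERENCES tTree(TreeID),  -- NULL for top-level rows
    SomeDataHere TEXT
);
INSERT INTO tTree VALUES
    (1, NULL, 'table 1'),
    (2, 1,    'table 1 sub 1'),
    (3, 1,    'table 1 sub 2');
""")

# One query walks the whole hierarchy, however deep it nests.
rows = con.execute("""
    WITH RECURSIVE branch(TreeID, SomeDataHere, depth) AS (
        SELECT TreeID, SomeDataHere, 0 FROM tTree WHERE ParentID IS NULL
        UNION ALL
        SELECT t.TreeID, t.SomeDataHere, b.depth + 1
        FROM tTree t JOIN branch b ON t.ParentID = b.TreeID
    )
    SELECT SomeDataHere, depth FROM branch ORDER BY TreeID
""").fetchall()
for name, depth in rows:
    print("   " * depth + name)
```

The point stands regardless of engine: however many levels your hierarchy has, one table with a ParentID column can represent all of it.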

But as I understand it, the only real thing that impacts performance in a database is the size of the tables, since every join operation is essentially a combination of a search and/or a sort--in which performance degrades as the set size increases--so the main thing you need to do is index your tables properly, on the fields you intend to use as joins or constraints, and that will have the greatest impact on performance. Given the case that your tables are properly indexed, then the only performance problem will be size, and in my experience, in Access, performance is acceptable in the range of hundreds of thousands of records.
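To make the indexing point concrete, here is a hedged sketch (SQLite standing in for the Jet/ACE engine; the table and index names are made up): asking the engine for its query plan shows that a lookup on the indexed join field becomes a seek rather than a full-table scan.

```python
import sqlite3

# Hypothetical one-to-many pair; the field to index is the foreign-key side.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE Parent (ParentID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Child  (ChildID  INTEGER PRIMARY KEY,
                     ParentID INTEGER,   -- the join/constraint field
                     Detail   TEXT);
""")
con.executemany("INSERT INTO Child VALUES (?, ?, ?)",
                [(i, i % 1000, f"c{i}") for i in range(10000)])

# Index the field used in joins and WHERE constraints:
con.execute("CREATE INDEX idx_child_parent ON Child(ParentID)")

# Ask the engine how it will run a lookup on the indexed field.
plan = con.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM Child WHERE ParentID = 42"
).fetchall()
print(plan)  # the plan detail names idx_child_parent, i.e. a seek, not a scan
```

Access exposes the same idea through the Indexed property on a field or the Indexes dialog in table design; the principle of indexing the join fields is identical.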

Network speed could be an issue, but you can get Gigabit network hardware cheaply now, so moving data from the server is not going to slow you down.

Does this address your concerns?
 
Quote:
But as I understand it, the only real thing that impacts performance in a database is the size of the tables, since every join operation is essentially a combination of a search and/or a sort--in which performance degrades as the set size increases--so the main thing you need to do is index your tables properly, on the fields you intend to use as joins or constraints, and that will have the greatest impact on performance.

YES, this is what I was getting at. So if an application grows to the point where hundreds of thousands of records are sitting in the tables, would that be an indication (if you know things will go that far) that you should consider up front how many joins, or how complex the joins, in your table structure should be? It would seem to me that it would make sense that way.

So if you initially have a choice to build tables with or without certain joins required, and you know that hundreds of thousands of records will eventually end up in plenty of your tables, would it not be wise to choose fewer joins rather than more? Even if choosing 'more' would result in a better hierarchical and/or normalized setup?

This is exactly what I was getting at Lagbolt, I just didn't know what the issue was until someone mentioned it. thanks.
 
Yeah, you bet . . .

. . . but, you don't perform joins as a function of what you think your data are going to scale to at some future date. You design your data structures, your tables, as models of the reality you are dealing with, and your model will be correct for your purpose or not. Joins in this sense aren't optional features that you include if the software can handle it. Joins will be necessary where the things you are modelling have a one-to-many relationship with each other. Period.

So if I write a CD database, I have artists and bands and tracks and CDs and genres, and the way these things are related to each other will dictate my tables and how they are joined, and these will be facts. A join will exist between the CD table and the Track table, because a CD has many tracks. Period. Not an option based on the speed of my database processing, a fact of the topology of my data.
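That one-to-many CD/Track join can be sketched as a minimal schema (again SQLite as a stand-in for Access; the sample data is purely illustrative):

```python
import sqlite3

# A CD has many tracks, so the join exists as a fact of the model,
# not as an optional performance feature.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE CD (
    CDID  INTEGER PRIMARY KEY,
    Title TEXT
);
CREATE TABLE Track (
    TrackID INTEGER PRIMARY KEY,
    CDID    INTEGER NOT NULL REFERENCES CD(CDID),  -- the one-to-many join
    Name    TEXT
);
""")
con.execute("INSERT INTO CD VALUES (1, 'Kind of Blue')")
con.executemany("INSERT INTO Track VALUES (?, 1, ?)",
                [(1, 'So What'), (2, 'Freddie Freeloader')])

# The join simply reunites each track with its CD.
tracks = con.execute("""
    SELECT cd.Title, t.Name
    FROM CD cd JOIN Track t ON t.CDID = cd.CDID
    ORDER BY t.TrackID
""").fetchall()
print(tracks)
```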

Hope this helps,
 
That is wonderful advice Lagbolt. Nothing else needed! thanks so much sir.
 