Import must be read as-is into a table.

Guus2005

AWF VIP
I always thought that when I import a table into SQL Server, the file was imported as it was presented.

File:
Code:
This
is
a
file
in
which
the
sequence
is
very
important.

But I am having doubts as to the truth of that statement.
And if it is not, how can I make sure that it is?

I am using this command:
Code:
BCP SomeDatabase.dbo.tmpAnswerRaw in "c:\temp\Filename.csv" -T -c

Thanks!
 

CJ_London

Super Moderator
Staff member
Think you will find imports are always random (not just BCP) - if they happen to be in order, that is a happy coincidence. After all, data in tables is stored unordered; to get an order you need to provide it. Things I am aware of that can affect the order are the size and nature of the file (small files tend to come out in order, large files not), and perhaps whatever 'minimize bloat' functionality the importing mechanism has.

I'm not familiar with BCP but a quick google suggests you can import in batches - perhaps smaller batches will retain some semblance of order?

If you require an order, the usual way would be to include an 'orderby' column in the file to be imported (see the sketch below).

see this link
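
For illustration, a minimal sketch of that approach (the table and column names are made up; it assumes the file itself carries a numeric sequence number as its first field):
Code:
-- Hypothetical staging table; OrderBy mirrors a sequence number present in the file
CREATE TABLE dbo.tmpAnswerRaw
(
    OrderBy  int          NOT NULL,
    LineText varchar(max) NULL
);

-- After the import, read the rows back in the intended sequence
SELECT LineText
FROM dbo.tmpAnswerRaw
ORDER BY OrderBy;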
 

GPGeorge

Grover Park George
Guus2005 said:
I always thought that when I import a table into SQL Server, the file was imported as it was presented. [...] And if it is not, how can I make sure that it is?
How about a real sample of real data? Sometimes, trying to "simplify" by using made-up examples actually obscures the problem.
 

Isaac

Lifelong Learner
I would use one of the bulk insert features instead; it is a bit easier to write code that specifies the formats.

But you probably have a problem with line breaks
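
For reference, a minimal BULK INSERT sketch against the OP's file (the terminator settings are assumptions about the file's layout, and like BCP it does not by itself guarantee retrieval order):
Code:
-- Assumes dbo.tmpAnswerRaw already exists and the file is plain character data
BULK INSERT dbo.tmpAnswerRaw
FROM 'c:\temp\Filename.csv'
WITH (
    DATAFILETYPE    = 'char',
    FIELDTERMINATOR = ',',
    ROWTERMINATOR   = '\n',
    FIRSTROW        = 1
);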
 

Galaxiom

Super Moderator
Staff member
Import the whole CSV into a table as a single value. Then apply STRING_SPLIT with the ordinal argument set. (You will need SQL Server 2022 or Azure SQL.)

Then split the multiple records on the commas.
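
A sketch of that last step, assuming the whole file ended up in a single varchar(max) value (the variable name is made up; enable_ordinal needs SQL Server 2022 or Azure SQL):
Code:
-- Hypothetical variable holding the entire file as one comma-separated string
DECLARE @raw varchar(max) = 'This,is,a,file,in,which,the,sequence,is,very,important.';

-- The third argument (enable_ordinal = 1) returns the position of each piece
SELECT value, ordinal
FROM STRING_SPLIT(@raw, ',', 1)
ORDER BY ordinal;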
 

Isaac

Lifelong Learner
And wouldn't you need some follow-up querying to transpose to columns, based on the number in the ordinal (second) column?
 

SQL_Hell

SQL Server DBA
SQL Server data in a table will be stored in the order of the clustered index (usually the primary key); if there is no clustered index defined, the data will be stored in a heap (no guaranteed order).
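
A quick illustration of the two cases (hypothetical table names):
Code:
-- Heap: no clustered index, so no promise about how the rows are organised
CREATE TABLE dbo.tmpAnswerRaw_Heap (LineText varchar(max) NULL);

-- Clustered primary key on an identity column: rows are organised by the key,
-- which here reflects insertion order
CREATE TABLE dbo.tmpAnswerRaw_Clustered
(
    ID       int IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED,
    LineText varchar(max) NULL
);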
 

Pat Hartman

Super Moderator
Staff member
By definition, tables and queries are always unordered sets. Think of marbles in a bag: reach in and pull out one, then reach in and pull out a second. If you want to impose order, you must sort on a unique identifier. In the real world, the retrieval order won't be random, but it also may deviate from how the rows went in. The point is that the rules of SQL Server, et al., do NOT promise a specific retrieval order without an ORDER BY clause on a unique identifier. If the sort field(s) are not unique but only partially unique, then the order will be only partially predictable. For example, if you are retrieving the order details and the table is sorted on OrderID but nothing within that, then the items might be returned as 1, 2, 3 or 2, 3, 1 or 3, 1, 2, etc. Not predictable ahead of time, but likely repeatable unless some change happens to the table between query executions.

When reading a "flat" file into a table and appending it, the flat file will be read in physical record order and so written to the table in physical record order. If your table has an autonumber/Identity PK, then that PK will retain the order of the rows as they were added to the table.

When reading data from a table, the rows are retrieved at the convenience of the query engine and so the order may or may not be as written to the disk. This is where you would add an Order By clause on the autonumber/Identity PK to ensure the records come out in the same order they went in.
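
A minimal sketch of that pattern (the identity column is added to the staging table; with bcp you may need a format file or a view so the data file does not have to supply the ID):
Code:
-- The identity PK records the sequence in which rows were appended
CREATE TABLE dbo.tmpAnswerRaw
(
    ID       int IDENTITY(1,1) NOT NULL PRIMARY KEY,
    LineText varchar(max) NULL
);

-- Retrieve the rows in the same order they went in
SELECT LineText
FROM dbo.tmpAnswerRaw
ORDER BY ID;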
 

Isaac

Lifelong Learner
When reading a "flat" file into a table and appending it, the flat file will be read in physical record order
If you happen to be using some line-by-line method, yes.
 

Pat Hartman

Super Moderator
Staff member
Why do you think it works differently using TransferText? I do know that I've imported spreadsheets that seem to be out of order. That is because at some point the file was sorted. To me it looks to be in a specific sequence, but it gets loaded in a different sequence. It seems to get loaded in its original physical sequence, although I never created a test case to confirm the theory.

Flat file I/O is very different from that of random access files like Jet/ACE and their ilk. There is no way to reach into a flat file and pull out a specific record. You have to read the file record by record until you get where you are going. You can also read a flat file backwards, but there are no indexes and so there is no way to go directly to some specific record. You can append a record to the end of a flat file but not insert it in the middle. When I used to process flat files with COBOL, you read the flat file and updated it, inserted records, etc., as you wrote the updated file out to a different location. You could not update in place as you can with indexed files.

Think about all those old movies where they show the 9-track tape drives spinning forward and backward. That was for visual effect. You would never see that in reality. Flat files are read from front to back or sometimes from back to front, but they don't go back and forth. Random access files were stored on disk drives, which don't present the cool visual that the tape drives presented. Companies did build disk drives that had windows so you could see the action. The drives were stacks of flat platters that rotated. The drive heads were in a vertical shaft and they moved in and out as they retrieved records.

The pictures show a drive circa 1965. The drive stored ~2 MB. My internal HD stores 2 TB. The USB drive I have plugged in stores 8 GB. So, as you can imagine, storing data on disk was very expensive until the advent of the PC, when miniaturization accelerated. That HD cost was a direct cause of the Y2K problem. Nobody stored 4-digit years because the extra 2 bytes per record was just too costly.

[Attached image: 1705000497426.png]

This is the drive. You can see the mounted drive inside the machine. On top is an unmounted drive. The whole drive was about the size of a top-loading washing machine.
[Attached image: 1705000700358.png]
 