Complex Alphanumeric Sorts

One of the keys to understanding how to do this and why it works is to first realize that an Access ascending text sort

For example:

001

009

010

099

100

sorts exactly as we would expect (using "0"s to right justify the digits) but typical real world numbers stored as text data:

1

10

100

9

99

gives no satisfaction with an ascending sort.

We can easily manipulate the text sort order by

Now the above text strings are sorted as we would expect. The trick is to count the significant digits for all "numbers" in the original string and insert the character that represents the number of significant digits in front of the digits in the text string (as shown above).

It's relatively simple VBA coding to count how many digits are strung consecutively together before either the end of the string or a non-digit character is encountered. Once we know how many digits there are, inserting the text character representing that value into the string in front of the number sequence is also relatively straight foreward.

Does it work for complex alpha-numeric strings? Let's try it:

A10A1A1

A1A10A1

A1A1A1

A1A1A10

A1A1A9

A1A9A1

A9A1A1

is what an ascending text sort will return but is not intuitive, whereas:

is much more appealing if sorted ascending by the SortCode.

Notice that I've increased the sort codes (in bold, above) to 2 characters. I've seen Serial Numbers and Model Numbers of equipment items that come close to stringing 10 digits together, but have yet to see one that strings 99 digits together.

All that we're really doing is sorting numerical characters (left to right) by the length of the contiguous numerical characters and then by the characters in the number.

In other words:

Every "01" in the calculated SortCode string will be followed by 1 numerical character.

Every "02" in the calculated SortCode string will be followed by 2 numerical characters.

Every "03" in the calculated SortCode string will be followed by 3 numerical characters, etc...

Since we know that any 1 digit number has to be less than any 2 digit number, and any 2 digit number has to be less than any 3 digit number, etc..., a calculated SortCode as described will sort complex alpha-numeric strings more intuitively.

Another functionality we almost have to include in our SortCode generating routine is a Leading Zero counter/sort: We can't count the leading zeros of a string of digits as significant digits.

"09", "009", and "0009" should all be resolved to numbers having only one significant digit. By generating and concatenating the number of leading zeros each consecutive number sequence encountered in the string has (using 2 numerical characters for how many leading Zeros there are for every resolved number in the string to be sorted) to the end of our SortCode, we will also be sorting by leading zeros. Like this (leading Zero Count/Sort shown in bold):

Implementation:

The function code:
Calling the Function:

I wouid suggest:

SELECT MyTableName.tMyTableFieldName

FROM MyTableName

ORDER BY IIf(IsNull([tMyTableFieldName]),"",fSortCode([tMyTableFieldName]));

One of the keys to understanding how to do this and why it works is to first realize that an Access ascending text sort

*will*sort strings of digits perfectly well provided the digit characters are right justified.For example:

__tTxt__001

009

010

099

100

sorts exactly as we would expect (using "0"s to right justify the digits) but typical real world numbers stored as text data:

__tTxt__1

10

100

9

99

gives no satisfaction with an ascending sort.

We can easily manipulate the text sort order by

**inserting**characters (Sort Codes) into the string that has the digit characters like so:__tTxt__**1**1**1**9**2**10**2**99**3**100Now the above text strings are sorted as we would expect. The trick is to count the significant digits for all "numbers" in the original string and insert the character that represents the number of significant digits in front of the digits in the text string (as shown above).

It's relatively simple VBA coding to count how many digits are strung consecutively together before either the end of the string or a non-digit character is encountered. Once we know how many digits there are, inserting the text character representing that value into the string in front of the number sequence is also relatively straight foreward.

Does it work for complex alpha-numeric strings? Let's try it:

A10A1A1

A1A10A1

A1A1A1

A1A1A10

A1A1A9

A1A9A1

A9A1A1

is what an ascending text sort will return but is not intuitive, whereas:

Code:

```
tTxt SortCode
A1A1A1 A[b]01[/b]1A[b]01[/b]1A[b]01[/b]1
A1A1A9 A[b]01[/b]1A[b]01[/b]1A[b]01[/b]9
A1A1A10 A[b]01[/b]1A[b]01[/b]1A[b]02[/b]10
A1A9A1 A[b]01[/b]1A[b]01[/b]9A[b]01[/b]1
A1A10A1 A[b]01[/b]1A[b]02[/b]10A[b]01[/b]1
A9A1A1 A[b]01[/b]9A[b]01[/b]1A[b]01[/b]1
A10A1A1 A[b]02[/b]10A[b]01[/b]1A[b]01[/b]1
```

Notice that I've increased the sort codes (in bold, above) to 2 characters. I've seen Serial Numbers and Model Numbers of equipment items that come close to stringing 10 digits together, but have yet to see one that strings 99 digits together.

All that we're really doing is sorting numerical characters (left to right) by the length of the contiguous numerical characters and then by the characters in the number.

In other words:

Every "01" in the calculated SortCode string will be followed by 1 numerical character.

Every "02" in the calculated SortCode string will be followed by 2 numerical characters.

Every "03" in the calculated SortCode string will be followed by 3 numerical characters, etc...

Since we know that any 1 digit number has to be less than any 2 digit number, and any 2 digit number has to be less than any 3 digit number, etc..., a calculated SortCode as described will sort complex alpha-numeric strings more intuitively.

Another functionality we almost have to include in our SortCode generating routine is a Leading Zero counter/sort: We can't count the leading zeros of a string of digits as significant digits.

"09", "009", and "0009" should all be resolved to numbers having only one significant digit. By generating and concatenating the number of leading zeros each consecutive number sequence encountered in the string has (using 2 numerical characters for how many leading Zeros there are for every resolved number in the string to be sorted) to the end of our SortCode, we will also be sorting by leading zeros. Like this (leading Zero Count/Sort shown in bold):

Code:

```
tTxt SortCode
DA1A DA011A [b]00[/b]
DA1T-JP1 DA011T-JP011 [b]0000[/b]
DA01T-JP1 DA011T-JP011 [b]0100[/b]
DA001T-JP1 DA011T-JP011 [b]0200[/b]
DA2S-DR02 DA012S-DR012 [b]0001[/b]
DA02S-DR2 DA012S-DR012 [b]0100[/b]
DA02S-DR002 DA012S-DR012 [b]0102[/b]
DA002S-DR2 DA012S-DR012 [b]0200[/b]
DA010A DA0210A [b]01[/b]
DA11T-JP1 DA0211T-JP011 [b]0000[/b]
DAS-DR2 DAS-DR012 [b]00[/b]
F F
```

The function code:

Code:

```
Function fSortCode(sPassed As String) As String
Dim sLZTemp As String
Dim iLZCount As Integer
Dim sNumTemp As String
Dim iNumCount As Integer
While Len(sPassed)
'is the first character a number?
If IsNumeric(Left(sPassed, 1)) Then
While IsNumeric(Left(sPassed, 1))
If Left(sPassed, 1) = "0" And iNumCount = 0 Then
iLZCount = iLZCount + 1 'Leading Zero counter
Else
iNumCount = iNumCount + 1 'significant digit counter
sNumTemp = sNumTemp & Left(sPassed, 1) 'remember all significant digits
End If
sPassed = Mid(sPassed, 2) 'loop on next character
Wend
'Leading Zeros counted, significant digits counted...
sLZTemp = sLZTemp & Format(iLZCount, "00") 'leading zero sort code for later...
sNumTemp = Format(iNumCount, "00") & sNumTemp
fSortCode = fSortCode & sNumTemp
iNumCount = 0 'reset for next imbedded digit string...
iLZCount = 0
sNumTemp = ""
End If
'next character not a number, loop on next character in string...
fSortCode = fSortCode & Left(sPassed, 1)
sPassed = Mid(sPassed, 2)
Wend
fSortCode = fSortCode & " " & sLZTemp 'concatenate leading zero sort code
End Function
```

I wouid suggest:

SELECT MyTableName.tMyTableFieldName

FROM MyTableName

ORDER BY IIf(IsNull([tMyTableFieldName]),"",fSortCode([tMyTableFieldName]));

Last edited: