Sorting Complex Alpha-Numeric Strings (1 Viewer)

Status
Not open for further replies.

DALeffler

Registered Perpetrator
Local time
Today, 07:19
Joined
Dec 5, 2000
Messages
263
Complex Alphanumeric Sorts

One of the keys to understanding how to do this and why it works is to first realize that an Access ascending text sort will sort strings of digits perfectly well provided the digit characters are right justified.

For example:

tTxt
001
009
010
099
100

sorts exactly as we would expect (using "0"s to right justify the digits) but typical real world numbers stored as text data:

tTxt
1
10
100
9
99

gives no satisfaction with an ascending sort.

We can easily manipulate the text sort order by inserting characters (Sort Codes) into the string that has the digit characters like so:

tTxt
11
19
210
299
3100

Now the above text strings are sorted as we would expect. The trick is to count the significant digits for all "numbers" in the original string and insert the character that represents the number of significant digits in front of the digits in the text string (as shown above).

It's relatively simple VBA coding to count how many digits are strung consecutively together before either the end of the string or a non-digit character is encountered. Once we know how many digits there are, inserting the text character representing that value into the string in front of the number sequence is also relatively straight foreward.

Does it work for complex alpha-numeric strings? Let's try it:

A10A1A1
A1A10A1
A1A1A1
A1A1A10
A1A1A9
A1A9A1
A9A1A1

is what an ascending text sort will return but is not intuitive, whereas:

Code:
tTxt	SortCode
A1A1A1	A[b]01[/b]1A[b]01[/b]1A[b]01[/b]1
A1A1A9	A[b]01[/b]1A[b]01[/b]1A[b]01[/b]9
A1A1A10	A[b]01[/b]1A[b]01[/b]1A[b]02[/b]10
A1A9A1	A[b]01[/b]1A[b]01[/b]9A[b]01[/b]1
A1A10A1	A[b]01[/b]1A[b]02[/b]10A[b]01[/b]1
A9A1A1	A[b]01[/b]9A[b]01[/b]1A[b]01[/b]1
A10A1A1	A[b]02[/b]10A[b]01[/b]1A[b]01[/b]1

is much more appealing if sorted ascending by the SortCode.

Notice that I've increased the sort codes (in bold, above) to 2 characters. I've seen Serial Numbers and Model Numbers of equipment items that come close to stringing 10 digits together, but have yet to see one that strings 99 digits together.

All that we're really doing is sorting numerical characters (left to right) by the length of the contiguous numerical characters and then by the characters in the number.

In other words:

Every "01" in the calculated SortCode string will be followed by 1 numerical character.
Every "02" in the calculated SortCode string will be followed by 2 numerical characters.
Every "03" in the calculated SortCode string will be followed by 3 numerical characters, etc...

Since we know that any 1 digit number has to be less than any 2 digit number, and any 2 digit number has to be less than any 3 digit number, etc..., a calculated SortCode as described will sort complex alpha-numeric strings more intuitively.

Another functionality we almost have to include in our SortCode generating routine is a Leading Zero counter/sort: We can't count the leading zeros of a string of digits as significant digits.

"09", "009", and "0009" should all be resolved to numbers having only one significant digit. By generating and concatenating the number of leading zeros each consecutive number sequence encountered in the string has (using 2 numerical characters for how many leading Zeros there are for every resolved number in the string to be sorted) to the end of our SortCode, we will also be sorting by leading zeros. Like this (leading Zero Count/Sort shown in bold):

Code:
tTxt		SortCode
DA1A		DA011A [b]00[/b]
DA1T-JP1	DA011T-JP011 [b]0000[/b]
DA01T-JP1	DA011T-JP011 [b]0100[/b]
DA001T-JP1	DA011T-JP011 [b]0200[/b]
DA2S-DR02	DA012S-DR012 [b]0001[/b]
DA02S-DR2	DA012S-DR012 [b]0100[/b]
DA02S-DR002	DA012S-DR012 [b]0102[/b]
DA002S-DR2	DA012S-DR012 [b]0200[/b]
DA010A		DA0210A [b]01[/b]
DA11T-JP1	DA0211T-JP011 [b]0000[/b]
DAS-DR2		DAS-DR012 [b]00[/b]
F		F

Implementation:

The function code:
Code:
Function fSortCode(sPassed As String) As String
Dim sLZTemp As String
Dim iLZCount As Integer
Dim sNumTemp As String
Dim iNumCount As Integer

While Len(sPassed)
    'is the first character a number?
    If IsNumeric(Left(sPassed, 1)) Then
            While IsNumeric(Left(sPassed, 1))
                If Left(sPassed, 1) = "0" And iNumCount = 0 Then
                    iLZCount = iLZCount + 1     'Leading Zero counter
                Else
                    iNumCount = iNumCount + 1               'significant digit counter
                    sNumTemp = sNumTemp & Left(sPassed, 1)  'remember all significant digits
                End If
            sPassed = Mid(sPassed, 2)           'loop on next character
            Wend
        'Leading Zeros counted, significant digits counted...
        sLZTemp = sLZTemp & Format(iLZCount, "00")  'leading zero sort code for later...
        sNumTemp = Format(iNumCount, "00") & sNumTemp
        fSortCode = fSortCode & sNumTemp
        iNumCount = 0      'reset for next imbedded digit string...
        iLZCount = 0
        sNumTemp = ""
    End If
'next character not a number, loop on next character in string...
fSortCode = fSortCode & Left(sPassed, 1)
sPassed = Mid(sPassed, 2)
Wend
fSortCode = fSortCode & " " & sLZTemp 'concatenate leading zero sort code

End Function

Calling the Function:

I wouid suggest:

SELECT MyTableName.tMyTableFieldName
FROM MyTableName
ORDER BY IIf(IsNull([tMyTableFieldName]),"",fSortCode([tMyTableFieldName]));
 
Last edited:
Status
Not open for further replies.

Users who are viewing this thread

Top Bottom