## StringArray Sort (case independent)

Post your FreeBASIC tips and tricks here. Please don’t post your code without including an explanation.
MrSwiss
Posts: 3220
Joined: Jun 02, 2013 9:27
Location: Switzerland

### StringArray Sort (case independent)

It seems, that in many cases of sorting a StringArray, the case's inside the individual
Strings, are simply ignored, e.g. QuickSort ... (this leads to: incorrect sort result!).

Example: "NOT" comes before "No" (it should be vice versa, from a human perspective)
(From a strictly numeric point of view, the sort is correct "as-is"!)

The Question:
How can we solve the dilemma presented above?
I'd suggest a three step approach:
1) figure out the technical reasons first
2) propose a workaround that eliminates the technical barrier(s)
3) implement 2nd in code, using all coding tricks we know of

The problem, detailed (step 1):

- Alphabet consists of: 2 distinct ranges (A - Z and a - z), in ASCII-numbers.
(see: ASCII-Table in Manual)
- QuickSort sorts: low to high (only numerical values considered)
- Therefore: 'Upper Case' superseeds 'Lower Case' always
(this is the actually problematic point at issue, to be solved)

My solution approach (step 2):
- Preprocessing (make Strings all the same case (UCase() used here)
- Sorting (QuickSort, for String)
- Postprocessing (restore sorted Array to 'original' case, again)

The solution (step 3):
- my assesment of the problem (step 1), might be wrong (to start off with)
- this whould invalidate step 2/3, of course
- before posting code here, I'm interested in other, maybe better approches you'd
propose to solve the issue, as explained above in words, not in code
dodicat
Posts: 5913
Joined: Jan 10, 2006 20:30
Location: Scotland

### Re: StringArray Sort (case independent)

Let DOS take the strain?

Code: Select all

` Function StringSplit(s_in As String,chars As String,result() As String) As Long    Dim As Long ctr,ctr2,k,n,LC=len(chars)    dim As boolean tally(Len(s_in))    #macro check_instring()        n=0        while n<Lc        If chars[n]=s_in[k] Then         tally(k)=true        If (ctr2-1) Then ctr+=1        ctr2=0        exit while        end if        n+=1       wend    #endmacro       #macro split()    If tally(k) Then        If (ctr2-1) Then ctr+=1:result(ctr)=Mid(s_in,k+2-ctr2,ctr2-1)        ctr2=0    End If    #endmacro    '==================  LOOP TWICE =======================    For k  =0 To Len(s_in)-1        ctr2+=1:check_instring()    Next k    if ctr=0 then         if len(s_in) andalso instr(chars,chr(s_in)) then ctr=1':beep         end if    If ctr Then Redim result(1 To ctr): ctr=0:ctr2=0 Else  Return 0    For k  =0 To Len(s_in)-1        ctr2+=1:split()    Next k    '===================== Last one ========================    If ctr2>0 Then        Redim Preserve result(1 To ctr+1)        result(ctr+1)=Mid(s_in,k+1-ctr2,ctr2)    End If    Return Ubound(result)End Functionsub sort(s() as string,d as string="down")    var f=freefile    var cd=curdir    dim as string g,cmd,text    open "templist__.txt" for output as #f    for n as long=lbound(s) to ubound(s)       print #f,s(n)    next n    close #f    if d="down" then    cmd= "SORT/r " + cd + "\templist__.txt /o " +cd +"\result__.txt"     else     cmd= "SORT " + cd + "\templist__.txt /o " +cd +"\result__.txt"     end if    shell cmd    Open "result__.txt" For Binary Access Read As #f    If Lof(f) > 0 Then      text = String(Lof(f), 0)      Get #f, , text    End If    Close #f    redim as string s2()   StringSplit(text,chr(10),s2())    for n as long=lbound(s2) to ubound(s2)        s(n)=(s2(n))    next    kill cd+"\result__.txt"    kill cd+"\templist__.txt"end subdim as string s(1 to ...)={"Zanussi","Not","No","A","C","B","b"}sort(s(),"up")for n as long=lbound(s) to ubound(s)    print n,s(n)nextprintsleep`
MrSwiss
Posts: 3220
Joined: Jun 02, 2013 9:27
Location: Switzerland

### Re: StringArray Sort (case independent)

dodicat wrote:Let DOS take the strain?

Nice idea, but not what I've intended (make QuickSort return correctly).
There is no need for file I/O ... can be done in memory alone.
My intention is: to get those 'coders brains' up to speed, once in a while.
Lost Zergling
Posts: 240
Joined: Dec 02, 2011 22:51
Location: France

### Re: StringArray Sort (case independent)

I would use my leaky lzle to fill a list using Lcase or Ucase for Tag0 (keys) and original keys as values or Tag1 ?...
dodicat
Posts: 5913
Joined: Jan 10, 2006 20:30
Location: Scotland

### Re: StringArray Sort (case independent)

Just pop lcase into the callback function

Code: Select all

` '#undef lcase'#define lcase#include once "crt.bi"'=========  set up c sort =========#define up <,>#define down >,<#define ArrayToSort(x,start,finish) @X((start)),((finish)-(start)+1),Sizeof(X)#macro SetCSort(Datatype,FnName,b1,b2,dot)Function FnName Cdecl(n1 As Any Ptr,n2 As Any Ptr) As long    If lcase(*Cptr(Datatype Ptr,n1))dot b1 lcase(*Cptr(Datatype Ptr,n2))dot Then Return -1    If lcase(*Cptr(DataType Ptr,n1))dot b2 lcase(*Cptr(DataType Ptr,n2))dot Then Return 1    return 0End Function#endmacrosetCsort(string,callback,up,)dim as string s(1 to ...)={"Zanussi","Not","No","A","C","B","d","a"}qsort(arraytosort(s,Lbound(s),Ubound(s)),@callback)for n as long=lbound(s) to ubound(s)    print n,s(n)nextprintsleep `

Re comment lines 2 and 3 to see the raw result.

Here it is laid out.

Code: Select all

` #include "crt.bi"Function callbackU Cdecl(n1 As Any Ptr,n2 As Any Ptr) As Long    If Lcase(*Cptr(String Ptr,n1)) < Lcase(*Cptr(String Ptr,n2)) Then Return -1    If Lcase(*Cptr(String Ptr,n1)) > Lcase(*Cptr(String Ptr,n2)) Then Return 1    Return 0End FunctionFunction callbackD Cdecl(n1 As Any Ptr,n2 As Any Ptr) As Long    If Lcase(*Cptr(String Ptr,n1)) > Lcase(*Cptr(String Ptr,n2)) Then Return -1    If Lcase(*Cptr(String Ptr,n1)) < Lcase(*Cptr(String Ptr,n2)) Then Return 1    Return 0End FunctionSub sortstring(s() As String,L As Long,U As Long,direction As String="up")    If Lcase(direction)="up" Then        qsort( @s((L)),((U)-(L)+1),Sizeof(s),@callbackU)    Else        qsort( @s((L)),((U)-(L)+1),Sizeof(s),@callbackD)    End IfEnd SubDim As String s(1 To ...)={"Zanussi","Not","No","A","C","B","d","a"}sortstring(s(),Lbound(s),Ubound(s),"up")For n As Long=Lbound(s) To Ubound(s)    Print n,s(n)NextPrintSleep `
MrSwiss
Posts: 3220
Joined: Jun 02, 2013 9:27
Location: Switzerland

### Re: StringArray Sort (case independent)

Jep, dodicat got it, again!

Here comes my implementation, a bit more code, but no "crt.bi", because
Q_Sort() is implemented too. All I'm doing is:
wrap Q_Sort() within another Sub Q_SortUniStr(), which does the pre-/post-
processing, before and after Q_Sort(), with temp. string array:

Code: Select all

`' Q_Sort_String_test1.bas -- (c) 2019-07-26, MrSwiss'' compile: -s console'' translated from C source: quick(int array[], int start, int end) {}Private Sub Q_Sort OverLoad( _          ' modified C to FB (strings)          sa()  As String, _            ' sa = string-array    ByVal start As Long, _              ' lbound | below: ubound    ByVal end_  As Long _               ' added underline (keyword conflict)    )    If start < end_ Then        Dim As String   p = sa(start)        Dim As Long     l = start + 1, r = end_        While l < r            If sa(l) <= p Then                l += 1            ElseIf sa(r) >= p Then                r -= 1            Else                Swap sa(l), sa(r)            End If        Wend        If sa(l) < p Then            Swap sa(l), sa(start)            l -= 1        Else            l -= 1            Swap sa(l), sa(start)        End If        Q_Sort(sa(), start, l)        Q_Sort(sa(), r, end_)    End IfEnd SubPrivate Sub Q_SortUniStr( _             ' Q_Sort Unified String          a()   As String, _            ' array to be sorted          b()   As String  _            ' original string array (dyn./fixed)    )    Dim As Long lb_a = LBound(a), ub_a = UBound(a), _                lb_b = LBound(b), ub_b = UBound(b)  ' store array's bounds    ReDim As String t(lb_a To ub_a)     ' temp. dynamic array    ' all strings unified to: upper case only    For i As Integer = lb_a To ub_a     ' copy a() to t() upper cased        t(i) = UCase(a(i))              ' prepare for sorting    Next    ' for correct string sort, we need it with: all the same case    Q_Sort(t(), lb_a, ub_a)             ' sort it    ' restore strings original case (from b())    For i As Integer = lb_b To ub_b     ' get from original string array        Var ts = b(i)                   ' store one string for comparison        For j As Integer = lb_a To ub_a ' compare to t(j)            If Len(t(j)) = Len(ts) Then ' only if equal len() (otherwise skip)                If t(j) = UCase(ts) Then' only if strings are the same (dito)                    a(j) = ts           ' assign original string to a(j)                End If            End If        Next    Next    Erase t                             ' reset t()End Sub' Q_Sort procedures END#Define iRange(l, h)    ( Int(Rnd() * ((h + 1) - (l)) + (l)) )Randomize(Timer, 3)For i As UInteger = 0 To 999    Rnd() : Rnd() : Rnd() : Rnd() : _    Rnd() : Rnd() : Rnd() : Rnd()NextReDim As String a(Any)                  ' dynamic array (below: fixed size array)Dim As String b(...) = { "Wednesday", "Thursday", "Friday", "Null", _                         "September", "October", "November", "Copy", _                         "Saturday", "Sunday", "Monday", "Pasta", _                         "December", "January", "February", "Cut", _                         "Apple", "Pear", "Orange", "Banana", "No", _                         "Tuesday", "March", "April", "May", "Rice", _                         "Yes", "June", "July", "August", "Zero", _                         "EOF", "Error", "OR", "NOR", "XOR", "EXOR" }Dim As Long lb_a, ub_a, lb_b, ub_bReDim a(99)                             ' 0 to 99 (aka: 100 elements)lb_b = LBound(b) : ub_b = UBound(b)     ' store array bounds (of b())lb_a = LBound(a) : ub_a = UBound(a)     ' store array bounds (of a())For i As Integer = lb_a To ub_a         ' initialize array randomly    a(i) = b(iRange(lb_b, ub_b))        ' using single line macro (#define)    Print "a("; Str(i); "): "; _          Tab(10); a(i)                 ' show itNextPrint : PrintPrint "press a key to SORT array ";Sleep : Print : PrintQ_SortUniStr(a(), b())                  ' sort string arrayFor i As Integer = lb_a To ub_a    Print "a("; Str(i); "): "; _          Tab(10); a(i)                 ' show itNextPrint : PrintPrint "press a key to EXIT program ... ";Sleep`
This is of course not, what I'm really using ...
Just copy/paste from my "Q_Sort.bi" with 'overloaded' Q_Sort's for
different data-types (Long, Double, String).
fxm
Posts: 9126
Joined: Apr 22, 2009 12:46
Location: Paris suburbs, FRANCE

### Re: StringArray Sort (case independent)

Variant of 'Q_SortUniStr' ('Q_SortUniStr(a() As String)') to respect the initial case, even in case of string duplication but with different cases:

Code: Select all

`Private Sub Q_SortUniStr( _                         ' Q_Sort Unified String          a()   As String _                         ' array to be sorted    )    Dim As Long lb_a = LBound(a), ub_a = UBound(a)  ' store array's bounds    ReDim As String t(lb_a To ub_a)                 ' temp. dynamic array    ReDim As Boolean used(lb_a To ub_a)    ' all strings unified to: upper case only    For i As Integer = lb_a To ub_a                 ' copy a() to t() upper cased        t(i) = UCase(a(i))                          ' prepare for sorting    Next    ' for correct string sort, we need it with: all the same case    Q_Sort(t(), lb_a, ub_a)                         ' sort it    ' restore strings original case (from a())    For i As Integer = lb_a To ub_a        For j As Integer = lb_a To ub_a            If t(i) = Ucase(a(j)) Then                If used(j) = False Then                    Swap a(i), a(j)                    used(i) = True                    Exit For                End If            End If        Next j    Next i    Erase t                                         ' reset t()End Sub`
MrSwiss
Posts: 3220
Joined: Jun 02, 2013 9:27
Location: Switzerland

### Re: StringArray Sort (case independent)

fxm wrote:in case of string duplication but with different cases:
Sorry, not useful IMO, because:
- only for use, if 2 differing occurences (not triplicates, or larger multiples)
- I'm usually after sorting, killing all multiple occurences, anyway ...
(the pick of case can be influenced, by positioning in array b() ...)

No own, 'original' ideas?
fxm
Posts: 9126
Joined: Apr 22, 2009 12:46
Location: Paris suburbs, FRANCE

### Re: StringArray Sort (case independent)

MrSwiss wrote:- only for use, if 2 differing occurences (not triplicates, or larger multiples)

Have you really tested it?
MrSwiss
Posts: 3220
Joined: Jun 02, 2013 9:27
Location: Switzerland

### Re: StringArray Sort (case independent)

fxm wrote:Have you really tested it?
No, just NOT interrested ...
I'm only keeping unique entry's, as already stated!
(that code was omitted, to keep close to the stated case)

What I'm interrested in, is: No own, 'original' ideas?
Last edited by MrSwiss on Jul 27, 2019 17:55, edited 1 time in total.
dodicat
Posts: 5913
Joined: Jan 10, 2006 20:30
Location: Scotland

### Re: StringArray Sort (case independent)

Using the standard quicksort with the option of producing dos style string sort (case independent) or sort as the array stands.(comment out lcase)

i.e.
#define casetype 'lcase

Code: Select all

` #define datatype  string#define casetype  lcaseSub sort(array() As DataType,begin As Long=0,Finish As Long=-1)    If begin>finish Then begin=Lbound(array):Finish=Ubound(array)    Dim As Long i=begin,j=finish    Dim As DataType x=casetype(array(((I+J)\2)))    While I <= J        While casetype(array(I)) < X :I+=1:Wend 'CHANGE TO > FOR SORTING DOWN            While casetype(array(J)) > X :J-=1:Wend 'CHANGE TO < FOR SORTING DOWN                If I<=J Then Swap array(I),array(J): I+=1:J-=1            Wend            If J >begin Then  sort(array(),begin,J)            If I <Finish Then sort(array(),I,Finish)        End Sub                Dim As String b(...) = { "Wednesday", "Thursday", "Friday", "Null", _        "September", "October", "November", "Copy", _        "Saturday", "Sunday", "Monday", "Pasta", _        "December", "January", "February", "Cut", _        "Apple", "Pear", "Orange", "Banana", "No", _        "Tuesday", "March", "April", "May", "Rice", _        "Yes", "June", "July", "August", "Zero", _        "EOF", "Error", "OR", "NOR", "XOR", "EXOR","wed","thur","fri","mon","tue","Thur", _        "jan","feb","Dec","Mon","OCT"}                sort b(),Lbound(b),Ubound(b)                For n As Long=Lbound(b) To Ubound(b)            Print n,b(n)        Next        Print        Sleep`
jj2007
Posts: 1214
Joined: Oct 23, 2016 15:28
Location: Roma, Italia
Contact:

### Re: StringArray Sort (case independent)

Converting all strings to lcase or ucase strings is an overkill, and probably very slow. It would be sufficient to use a string compare function that is case-insensitive. If there is interest, I can offer a 32-bit DLL using StringsDiffer().
MrSwiss
Posts: 3220
Joined: Jun 02, 2013 9:27
Location: Switzerland

### Re: StringArray Sort (case independent)

While 64 bit only is acceptable but, 32 bit only is totally inacceptable ...
Preferable is anyway: both!
That's why we're using FB and not "lower level languages" after all! ;-)
jj2007
Posts: 1214
Joined: Oct 23, 2016 15:28
Location: Roma, Italia
Contact:

### Re: StringArray Sort (case independent)

Well, for me slow code is totally unacceptable. De gustibus non est disputandum ;-)
dodicat
Posts: 5913
Joined: Jan 10, 2006 20:30
Location: Scotland

### Re: StringArray Sort (case independent)

I did some speed tests.
The dos sort was by far the fastest (3.2 seconds for one million shortish strings)
Standard Quicksort was about eight seconds.
The C runtime was about ten seconds.