Split String Algorithm for FreeBasic

Post your FreeBASIC source, examples, tips and tricks here. Please don’t post code without including an explanation.
fxm
Moderator
Posts: 12081
Joined: Apr 22, 2009 12:46
Location: Paris suburbs, FRANCE

Re: Split String Algorithm for FreeBasic

Post by fxm »

fxm wrote: Sep 16, 2022 6:14 An nth version in pure FreeBASIC!
But compatible with zstring and string (supports null character).

Code: Select all

Sub split(Byref s1 As String, Byref s2 As String, splits(Any) As String)
    Dim As Integer n, n0 = 1
    Do
        Redim Preserve splits(Ubound(splits) + 1)
        n = Instr(n0, s1, s2)
        If n > 0 Then
            splits(Ubound(splits)) = Mid(s1, n0, n - n0)
            n0 = n + Len(s2)
        Else
            splits(Ubound(splits)) = Mid(s1, n0)
            Exit Do
        End If
    Loop
End Sub

Dim As String s1 = "reinvent the wheel " & Chr(0) & " in pure FreeBASIC :lol:"
Dim As String splits(Any)
Print s1
Print
split(s1, " ", splits())
For i As Integer = 0 To Ubound(splits)
    Print i & " '" & splits(i) & "'"
Next i

Sleep

Improved version of the previous 'split(Byref s1 As String, Byref s2 As String, splits(Any) As String)' subroutine above, to minimize execution time:
- The 'splits()' array is first resized to a maximum size depending to 'Len(s1) / Len(s2) + 1', then resized at the end to the true useful size.
- 'Mid(s1, ...)' is no longer used. Instead the descriptor of 's1' is temporary modified to match to the 'Mid(s1, ...)' string used previously.
- A non-empty string array passed as an argument is also supported, and the array is only completed (with the extracted sub-strings).

Code: Select all

Sub split(Byref s1 As String, Byref s2 As String, splits(Any) As String)
    Type SD
        Dim As Any Ptr p
        Dim As Integer l
    End Type
    Dim As SD Ptr psd1 = Cptr(SD Ptr, @s1)
    Dim As Any Ptr p1 = psd1->p
    Dim As Integer l1 = psd1->l
    
    Dim As integer i = Ubound(splits) + 1
    Redim Preserve splits(Lbound(splits) To Ubound(Splits) + Len(s1) / Len(s2) + 1)
    
    Dim As Integer n, n0 = 1
    
    Do
        n = Instr(n0, s1, s2)
        psd1->p = p1 + n0 - 1
        If n > 0 Then
            psd1->l = n - n0
            splits(i) = s1
            psd1->p = p1
            psd1->l = l1
            n0 = n + Len(s2)
        Else
            psd1->l = l1 - n0 + 1
            splits(i) = s1
            psd1->p = p1
            psd1->l = l1
            Redim Preserve splits(Lbound(splits) To i)
            Exit Do
        End If
        i += 1
    Loop
End Sub

Dim As String s1 = "reinvent the wheel " & Chr(0) & " in optimized pure FreeBASIC :lol:"
Dim As String splits(Any)
Print s1
Print
split(s1, " ", splits())
For i As Integer = 0 To Ubound(splits)
    Print i & " '" & splits(i) & "'"
Next i

Sleep
dodicat
Posts: 7976
Joined: Jan 10, 2006 20:30
Location: Scotland

Re: Split String Algorithm for FreeBasic

Post by dodicat »

s2 doesn't change, but you ask for len(s2) every time in the loop?
fxm
Moderator
Posts: 12081
Joined: Apr 22, 2009 12:46
Location: Paris suburbs, FRANCE

Re: Split String Algorithm for FreeBasic

Post by fxm »

Evaluating 'Len(s2)' just corresponds to reading an integer value in the 's2' string descriptor. That is an indirection which does not add much execution time compared to direct access to an integer variable.
However, I still made this minimal improvement:

Code: Select all

Sub split(Byref s1 As String, Byref s2 As String, splits(Any) As String)
    Type SD
        Dim As Any Ptr p
        Dim As Integer l
    End Type
    Dim As SD Ptr psd1 = Cptr(SD Ptr, @s1)
    Dim As Any Ptr p1 = psd1->p
    Dim As Integer l1 = psd1->l
    
    Dim As Integer l2 = Len(s2)
    
    Dim As integer i = Ubound(splits) + 1
    Redim Preserve splits(Lbound(splits) To Ubound(Splits) + l1 / l2 + 1)
    
    Dim As Integer n, n0 = 1
    
    Do
        n = Instr(n0, s1, s2)
        psd1->p = p1 + n0 - 1
        If n > 0 Then
            psd1->l = n - n0
            splits(i) = s1
            psd1->p = p1
            psd1->l = l1
            n0 = n + l2
        Else
            psd1->l = l1 - n0 + 1
            splits(i) = s1
            psd1->p = p1
            psd1->l = l1
            Redim Preserve splits(Lbound(splits) To i)
            Exit Do
        End If
        i += 1
    Loop
End Sub

Dim As String s1 = "reinvent the wheel " & Chr(0) & " in optimized pure FreeBASIC :lol:"
Dim As String splits(Any)
Print s1
Print
split(s1, " ", splits())
For i As Integer = 0 To Ubound(splits)
    Print i & " '" & splits(i) & "'"
Next i

Sleep
dodicat
Posts: 7976
Joined: Jan 10, 2006 20:30
Location: Scotland

Re: Split String Algorithm for FreeBasic

Post by dodicat »

I have tried many ways to optimize code over these years.
Len in every loop slows everything down for sure, descriptor reading or not, who are you kidding?
That's why I have used Cast(Integer Ptr,@b)[1] for len(b) when a string b changes every loop.
Anyway, water off a duck's back for me, these retorts.
I'll say nothing to you from now on.
fxm
Moderator
Posts: 12081
Joined: Apr 22, 2009 12:46
Location: Paris suburbs, FRANCE

Re: Split String Algorithm for FreeBasic

Post by fxm »

You are right.
The time wasted using 'Len(s)' instead of 'Cptr(Integer Ptr, @s)[1]' is not due to the 'Len()' function body (which does something similar) but to the time wasted in calling the function and returning from it.

My updated 'split()' version:

Code: Select all

Sub split(Byref s1 As String, Byref s2 As String, splits(Any) As String)
    Dim As Any Ptr p1 = Cptr(Any Ptr Ptr, @S1)[0]
    Dim As Integer l1 = Cptr(Integer Ptr, @S1)[1]
    Dim As Integer l2 = Cptr(Integer Ptr, @s2)[1]
    Dim As integer i = Ubound(splits) + 1
    Dim As Integer n, n0 = 1

    Redim Preserve splits(Lbound(splits) To i + l1 / l2)
    Do
        n = Instr(n0, s1, s2)
        Cptr(Any Ptr Ptr, @S1)[0] = p1 + n0 - 1
        If n > 0 Then
            Cptr(Integer Ptr, @S1)[1] = n - n0
            splits(i) = s1
            Cptr(Any Ptr Ptr, @S1)[0] = p1
            Cptr(Integer Ptr, @S1)[1] = l1
            n0 = n + l2
        Else
            Cptr(Integer Ptr, @S1)[1] = l1 - n0 + 1
            splits(i) = s1
            Cptr(Any Ptr Ptr, @S1)[0] = p1
            Cptr(Integer Ptr, @S1)[1] = l1
            Redim Preserve splits(Lbound(splits) To i)
            Exit Do
        End If
        i += 1
    Loop
End Sub

Dim As String s1 = "reinvent the wheel " & Chr(0) & " in optimized pure FreeBASIC :lol:"
Dim As String splits(Any)
Print s1
Print
split(s1, " ", splits())
For i As Integer = 0 To Ubound(splits)
    Print i & " '" & splits(i) & "'"
Next i

Sleep
dodicat
Posts: 7976
Joined: Jan 10, 2006 20:30
Location: Scotland

Re: Split String Algorithm for FreeBasic

Post by dodicat »

Sorry for being so ill tempered fxm.
Your method and my method both suffer from the same thing, empty elements are saved in the array.
But it might be a bit of a headache changing this, so I'll just leave mine as it is.

Code: Select all



Function tally (somestring As String,partstring As String,arr() As integer) As integer
    redim arr(1 to len(somestring)\2+1)
    Dim As integer i,j,ln,lnp,count
    ln=Len(somestring)
    lnp=Len(partstring)
    count=0
    i=-1
    Do
        i+=1
        If somestring[i] <> partstring[0] Then continue do
        If somestring[i] = partstring[0] Then
            For j=0 To lnp-1
                If somestring[j+i]<>partstring[j] Then continue do
            Next j
        End If
        count+=1
        arr(count)=i+1
        i=i+lnp-1
    Loop Until i>=ln-1
    redim preserve arr(1 to count) 
    Return count
End Function


Function splitd(somestring As String,partstring As String,a() As String) As integer
    Redim As integer x()
    Var t= tally(somestring,partstring,x()),lps=Len(partstring)
    If t=0 Or Len(somestring)=0 Or lps=0 Then Return 0
    Redim a(1 To t+1)
    a(1)=Mid(somestring,1,x(1)-1)
    For n As integer=1 To Ubound(x)-1
        a(n+1)= Mid(somestring,x(n)+lps,x(n+1)-x(n)-lps)
    Next n
    a(Ubound(a))=Mid(somestring,x(Ubound(x))+lps)
    Return t+1
End Function


Sub splitf(Byref s1 As String, Byref s2 As String, splits(Any) As String)
    Dim As Any Ptr p1 = Cptr(Any Ptr Ptr, @S1)[0]
    Dim As Integer l1 = Cptr(Integer Ptr, @S1)[1]
    Dim As Integer l2 = Cptr(Integer Ptr, @s2)[1]
    Dim As integer i = Ubound(splits) + 1
    Dim As Integer n, n0 = 1

    Redim Preserve splits(Lbound(splits) To i + l1 / l2)
    Do
        n = Instr(n0, s1, s2)
        Cptr(Any Ptr Ptr, @S1)[0] = p1 + n0 - 1
        If n > 0 Then
            Cptr(Integer Ptr, @S1)[1] = n - n0
            splits(i) = s1
            Cptr(Any Ptr Ptr, @S1)[0] = p1
            Cptr(Integer Ptr, @S1)[1] = l1
            n0 = n + l2
        Else
            Cptr(Integer Ptr, @S1)[1] = l1 - n0 + 1
            splits(i) = s1
            Cptr(Any Ptr Ptr, @S1)[0] = p1
            Cptr(Integer Ptr, @S1)[1] = l1
            Redim Preserve splits(Lbound(splits) To i)
            Exit Do
        End If
        i += 1
    Loop
End Sub


redim as string a()

dim as string instring="34x34x34x01234x56734x34x34x877534x34x"

splitf(instring,"34x",a())

for n as long=lbound(a) to ubound(a)
      print n;"    ";a(n)
next
Print

 splitd(instring,"34x",a())
print

for n as long=lbound(a) to ubound(a)
      print n;"    ";a(n)
next
print

sleep 
fxm
Moderator
Posts: 12081
Joined: Apr 22, 2009 12:46
Location: Paris suburbs, FRANCE

Re: Split String Algorithm for FreeBasic

Post by fxm »

Skipping empty elements is simple from my version:

Code: Select all

Sub split(Byref s1 As String, Byref s2 As String, splits(Any) As String, byval skipEmptyElement As Integer = 0)
    Dim As Any Ptr p1 = Cptr(Any Ptr Ptr, @S1)[0]
    Dim As Integer l1 = Cptr(Integer Ptr, @S1)[1]
    Dim As Integer l2 = Cptr(Integer Ptr, @s2)[1]
    Dim As integer i = Ubound(splits) + 1
    Dim As Integer n, n0 = 1

    Redim Preserve splits(Lbound(splits) To i + l1 / l2)
    Do
        n = Instr(n0, s1, s2)
        If n > 0 Then
            If (skipEmptyElement = 0) Orelse (n - n0) > 0 Then
                Cptr(Any Ptr Ptr, @S1)[0] = p1 + n0 - 1
                Cptr(Integer Ptr, @S1)[1] = n - n0
                splits(i) = s1
                Cptr(Any Ptr Ptr, @S1)[0] = p1
                Cptr(Integer Ptr, @S1)[1] = l1
                i += 1
            End If
            n0 = n + l2
        Else
            If (skipEmptyElement = 0) Orelse (l1 - n0 + 1) > 0 Then
                Cptr(Any Ptr Ptr, @S1)[0] = p1 + n0 - 1
                Cptr(Integer Ptr, @S1)[1] = l1 - n0 + 1
                splits(i) = s1
                Cptr(Any Ptr Ptr, @S1)[0] = p1
                Cptr(Integer Ptr, @S1)[1] = l1
            Else
                i -= 1
            End If
            Redim Preserve splits(Lbound(splits) To i)
            Exit Do
        End If
    Loop
End Sub

Dim As String s1 = "reinvent|the|wheel|" & Chr(0) & "|in|optimized|pure||FreeBASIC|||:lol:||||"
Dim As String splits(Any)
Print "'" & s1 & "'"
Print
split(s1, "|", splits())
For i As Integer = 0 To Ubound(splits)
    Print i & " '" & splits(i) & "'"
Next i
Print
Erase splits
split(s1, "|", splits(), 1)
For i As Integer = 0 To Ubound(splits)
    Print i & " '" & splits(i) & "'"
Next i

Sleep
Last edited by fxm on Sep 18, 2022 15:23, edited 2 times in total.
Post Reply