Import files

General FreeBASIC programming questions.
Gablea
Posts: 1104
Joined: Apr 06, 2010 0:05
Location: Northampton, United Kingdom
Contact:

Import files

Post by Gablea »

Hi everyone,

I have a good understanding of reading csv files and thanks to a few of you on here I am starting to understand how to read in files with tab formats

But how would I read in files that use | as field seperators

I know this is a very simple question but everything I look at online contradicted previous information.

Any advice would be welcome.
dodicat
Posts: 7976
Joined: Jan 10, 2006 20:30
Location: Scotland

Re: Import files

Post by dodicat »

Perhaps you could replace | with ,

Code: Select all


'for replacement of one string by another
Function FindAndReplace(subject As String,find As String,replacement As String) As String 
    Dim s As String=subject
    var position=Instr(s,find)
    While position>0
        s=Mid(s,1,position-1) & replacement & Mid(s,position+Len(find))
        position=Instr(position+Len(replacement),s,find)
    Wend
    FindAndReplace=s
End Function

'for replacement of one char by another
function FindAndReplaceOnechar(subject As String,char As String,replacementchar As String) As String 
 dim as string s=subject
 for n as long=0 to len(s)-1
     if s[n]=asc(char) then s[n]=asc(replacementchar)
 next
 return s
 end function
 

dim as string s="A|b|Canada|d|e|America|Portland|Freebasic|Imortis " +chr(10)
for n as long=1 to 5
    s+=s
next
print s
dim as string g=FindAndReplaceOneChar(s,"|",",")
print g
sleep
 
If it suitable, I think basiccoder2 wrote a faster replace version somewhere.
Gablea
Posts: 1104
Joined: Apr 06, 2010 0:05
Location: Northampton, United Kingdom
Contact:

Re: Import files

Post by Gablea »

Hi dodicat
Thanks for that advice. But I have to be careful as one of my customers use , in the description name of the items

Is it not possible to split lines at the | and process then like you would with the csv files?
dodicat
Posts: 7976
Joined: Jan 10, 2006 20:30
Location: Scotland

Re: Import files

Post by dodicat »

A comma used in a sentence, is grammatically a comma and a space.
The comma separated lists are comma only.
You could use this property to keep them apart.
Example:

Code: Select all

'for replacement of one string by another
Function FindAndReplace(subject As String,find As String,replacement As String) As String 
    Dim s As String=subject
    var position=Instr(s,find)
    While position>0
        s=Mid(s,1,position-1) & replacement & Mid(s,position+Len(find))
        position=Instr(position+Len(replacement),s,find)
    Wend
    FindAndReplace=s
End Function

'for replacement of one char by another
function FindAndReplaceOnechar(subject As String,char As String,replacementchar As String) As String 
 dim as string s=subject
 for n as long=0 to len(s)-1
     if s[n]=asc(char) then s[n]=asc(replacementchar)
 next
 return s
 end function
 

dim as string s="A|b|Canada|d|e|America, South|Portland|Freebasic|Imortis " +chr(10)
for n as long=1 to 2
    s+=s
next
print s
dim as string g=FindAndReplaceOneChar(s,"|",",")
print "| to ,"
print g
g=findandreplace(g,"South","North")
print "South to North"
print g
g=findandreplace(g,", ",chr(0))
print "comma + space to a chr(0)"
print g
g=findandreplaceOneChar(g,",","|")
print ", to |"
print g
g=findandreplace(g,chr(0),", ")
print "chr(0) to comma + space"
print g
sleep  
Gablea
Posts: 1104
Joined: Apr 06, 2010 0:05
Location: Northampton, United Kingdom
Contact:

Re: Import files

Post by Gablea »

So once the file has been converted I just Read it in as normal

Great thanks for the advice I will give that a try tonight
jj2007
Posts: 2326
Joined: Oct 23, 2016 15:28
Location: Roma, Italia
Contact:

Re: Import files

Post by jj2007 »

Instead of the comma, use the tab, i.e. chr(9). Both "|" and tab are non-ambiguous characters. much easier to handle.
Gablea
Posts: 1104
Joined: Apr 06, 2010 0:05
Location: Northampton, United Kingdom
Contact:

Re: Import files

Post by Gablea »

See I’m not 100% up on how to read in the files with tab as the determinator

Is there any simple examples on here that shows how to do that?
paul doe
Moderator
Posts: 1730
Joined: Jul 25, 2017 17:22
Location: Argentina

Re: Import files

Post by paul doe »

Gablea wrote:... Is there any simple examples on here that shows how to do that?
Weren't you answered how to do it in another one of your threads? How hard can it be to rewrite the code to accept any delimiter?

Anyway:

Code: Select all

'' Include needed dependencies
#include once "file.bi"

/'
  Splits a string into substrings delimited by any char
  specified in the delimiters.
'/
function splitString( _
  byref anInputString as const string, _
  byref theDelimiters as const string, _
  substrings() as string _
  ) as uinteger
  
  dim as uinteger sLen = len( anInputString )
  
  if( _
    sLen = 0 ) then
    redim substrings( 0 )
    return( 0 )
  end if
  
  dim as boolean mark ( 1 to sLen )
  dim as uinteger _
    count, _
    delimiters, _
    position = 1, _
    charPos
  
  /'
    Go through the string and mark the position of every delimiter
    we can find. This process is called 'tallying', and we use a
    boolean to tally whether the character in the input string is
    a delimiter or not.
  '/
  do
    /'
      We use the inStr statement of FreeBasic to look for the
      delimiters. See the documentation on inStr to see how
      the function works.
    '/
    charPos = _
      inStr( _
        position, _
        anInputString, _
        any theDelimiters )
    
    '' Check if we actually found a delimiter.
    if( _
      charPos > 0 ) then
      '' We found one
      mark( charPos ) = true      
      count += 1
      delimiters += 1
      
      if( _
        charPos > 1 ) then
        if( _
          mark( charPos - 1 ) = false ) then
          /'
            This is just a string that contains a single
            word, without delimiters.
          '/
          count += 1
        end if
      end if
      
      '' Advance the starting position of the search
      position = charPos + 1
    else
      exit do
    end if
  loop
  /'
    If the last token wasn't a delimiter, add one to the token
    count (to account for the last token).
  '/
  if( _
    mark( sLen ) = false ) then
    count += 1
  end if
  
  /'
    Allocate space for the spliced substrings. There's an extra
    space at the end, reserved for a null token.
  '/
  redim _
    substrings( 0 to count - delimiters )
  
  '' Reset current and next positions
  dim as uinteger current = 0
  position = 1
  
  /'
    And then splice the substrings, using the marked positions
  '/
  for _
    i as integer = 1 to sLen
    if( _
      mark( i ) = true ) then
      if( _
        i > 1 ) then
        if( _
          mark( i - 1 ) = false ) then
          /'
            Splices the tallied token. We'll use the mid() function, so
            look at the documentation to see how it works.
          '/
          substrings( current ) = _
            mid( _
              anInputString, _
              position, _
              i - position )        
          
          current += 1
        end if
      end if
      
      position = i + 1
    end if
  next
  
  /'
    If the last char wasn't a delimiter, splice the last token. This
    accounts for the case where the last token doesn't end in a
    delimiter (otherwise, it would be ignored).
  '/  
  if( _
    position <= sLen ) then
    substrings( current ) = _
      mid( _
        anInputString, _
        position, _
        sLen - position + 1 )
  end if
  
  '' Return the amount of substrings we've found
  return ( count )
end function

/'
  Loads a file whose fields are delimited by any user-specified
  chars. 
'/
function loadDelimitedFile( _
  byref aFileName as const string, _
  byref aDelimiter as const string, _
  strings() as string _
  ) as boolean
  
  if( _
    fileExists( aFileName ) ) then
    dim as long fileHandle = freeFile()
    dim as string file = _
      space( fileLen( aFileName ) )
    
    open _
      aFileName for binary access read as fileHandle
      get _
        #fileHandle, , file
    close( fileHandle )
    
    if( _
      splitString( _
        file, _
        aDelimiter, _
        strings() ) > 0 ) then
      return( true )
    end if
  end if
  
  return( false )
end function

/'
  Main code
'/
dim as string _
  path = exePath() & "/", _
  fileName = path & "test.txt", _
  values( any ), _
  delimiters = "|,;:~¬" & chr( 13, 10, 32, 9 )

if( _
  loadDelimitedFile( _
    fileName, _
    delimiters, _
    values() ) ) then
  /'
    Here I simply show the values, but this is where you load them into
    their appropriate places.
  '/
  for _
    i as integer = 0 to ubound( values )
    ? values( i )
  next
end if

sleep()
EDIT: Another little bug fix.
EDIT: I forgot to remove the space allocated to split the delimiters also. Fixed now.
Last edited by paul doe on Nov 30, 2018 4:19, edited 5 times in total.
BasicCoder2
Posts: 3906
Joined: Jan 01, 2009 7:03
Location: Australia

Re: Import files

Post by BasicCoder2 »

dodicat wrote:If it suitable, I think basiccoder2 wrote a faster replace version somewhere.
viewtopic.php?f=7&t=25334&hilit
dodicat
Posts: 7976
Joined: Jan 10, 2006 20:30
Location: Scotland

Re: Import files

Post by dodicat »

basiccoder2
Was this your completed version?

Code: Select all

function findAndReplace(txtOld as string,txtFind as string,txtReplace as string) as string
    if len(txtFind) = 0 then return txtOld
    
    dim as string txtNew
    if len(txtReplace)<len(txtFind) then
        txtNew = string(len(txtOld)," ")
    else
        txtNew = string(len(txtOld)/len(txtFind)*len(txtReplace)+1," ")
    end if
    
    dim as integer ptrOld,ptrReplace,ptrFind,ptrNew
    ptrReplace = 0
    ptrOld = 0

    while ptrOld <> len(txtOld)+1
    
        if ptrFind = len(txtFind) then  'found substring !!

            'undo copying of txtFind string in txtNew
            ptrNew = ptrNew - len(txtFind)  'erase txtFind string from new string
            ptrFind = 0
        
            'append any replacement string to txtNew
            ptrReplace = 0
            if len(txtReplace)>0 then
                for i as integer = 0 to len(txtReplace)
                    txtNew[ptrNew] = txtReplace[ptrReplace]
                    ptrNew = ptrNew + 1
                    ptrReplace = ptrReplace + 1
                next i
                ptrReplace = 0
                ptrNew = ptrNew - 1
            end if
        
        else
        
            if txtOld[ptrOld] = txtFind[ptrFind] then  'could be the sub string
                ptrFind = ptrFind + 1
            else
                ptrFind = 0  'nope not the sub string reset to start of substring
            end if

           txtNew[ptrNew] = txtOld[ptrOld]  'copy character to new string
            ptrNew = ptrNew + 1
            ptrOld = ptrOld + 1
        
        end if

    wend

    txtNew = left(txtNew,ptrNew-1)
    return txtNew
end function
  
Gablea could use it instead of mine, it is very fast.
Paul Doe
You are splitting on each character in the deliminator, perhaps gablea needs to find and replace blocks.
example using basiccoder's replacer.

Code: Select all

 

function findAndReplace(txtOld as string,txtFind as string,txtReplace as string) as string
    if len(txtFind) = 0 then return txtOld
    
    dim as string txtNew
    if len(txtReplace)<len(txtFind) then
        txtNew = string(len(txtOld)," ")
    else
        txtNew = string(len(txtOld)/len(txtFind)*len(txtReplace)+1," ")
    end if
    
    dim as integer ptrOld,ptrReplace,ptrFind,ptrNew
    ptrReplace = 0
    ptrOld = 0

    while ptrOld <> len(txtOld)+1
    
        if ptrFind = len(txtFind) then  'found substring !!

            'undo copying of txtFind string in txtNew
            ptrNew = ptrNew - len(txtFind)  'erase txtFind string from new string
            ptrFind = 0
        
            'append any replacement string to txtNew
            ptrReplace = 0
            if len(txtReplace)>0 then
                for i as integer = 0 to len(txtReplace)
                    txtNew[ptrNew] = txtReplace[ptrReplace]
                    ptrNew = ptrNew + 1
                    ptrReplace = ptrReplace + 1
                next i
                ptrReplace = 0
                ptrNew = ptrNew - 1
            end if
        
        else
        
            if txtOld[ptrOld] = txtFind[ptrFind] then  'could be the sub string
                ptrFind = ptrFind + 1
            else
                ptrFind = 0  'nope not the sub string reset to start of substring
            end if

           txtNew[ptrNew] = txtOld[ptrOld]  'copy character to new string
            ptrNew = ptrNew + 1
            ptrOld = ptrOld + 1
        
        end if

    wend

    txtNew = left(txtNew,ptrNew-1)
    return txtNew
end function



dim as string s="A|b|Canada|d|e|America, South|Portland|Freebasic|Imortis " +chr(10)
for n as long=1 to 15
    s+=s
next


print "original, length ";len(s)
print mid(s,1,500)+"  .... "

dim as string g

s=FindAndReplace(s,",",chr(0))
print ", to chr(0)"
print mid(s,1,500)+"  .... "

 s=FindAndReplace(s,"|",",")
print "| to ,"
print mid(s,1,500)+"  .... "

s=findandreplace(s,"South","Northtern part")
print "South to Northern Part"
print mid(s,1,500)+"  .... "

s=findandreplace(s,",","|")
print ", to |"
print mid(s,1,500)+"  .... "
s=findandreplace(s,chr(0),",")
print "chr(0) to comma"
print mid(s,1,500)+"  .... "

print "New string length ";len(s)
sleep  
paul doe
Moderator
Posts: 1730
Joined: Jul 25, 2017 17:22
Location: Argentina

Re: Import files

Post by paul doe »

dodicat wrote:...Paul Doe
You are splitting on each character in the deliminator, perhaps gablea needs to find and replace blocks...
Perhaps it's just me, but Gablea clearly stated he wanted to read each field delimited with a '|' from a CSV-like file:
Gablea wrote:...
But how would I read in files that use | as field seperators
...
Gablea wrote:...
Is it not possible to split lines at the | and process then like you would with the csv files?
...
And yes, it splits at either a comma, a tab or a vertical bar '|'. Thus you can use it to read a delimiter-separated value file with a single function. Of course, the validity of the file is up to the client code to implement.
Last edited by paul doe on Nov 30, 2018 3:17, edited 2 times in total.
Gablea
Posts: 1104
Joined: Apr 06, 2010 0:05
Location: Northampton, United Kingdom
Contact:

Re: Import files

Post by Gablea »

@Paul Doe

THANK YOU that was just what I was looking for and thank you for the comments explaining what they do it makes sense now

And yes I think it was show how to do that but I was struggling to understand what the code did now You have done your example that shows and
explains what each section does it totally makes sense.

I can now be ready for the future file formats (I assume I can add any other flied spliter into the delimiters variable (for example say a customers system outputs with a ¬ as the spliter)
paul doe
Moderator
Posts: 1730
Joined: Jul 25, 2017 17:22
Location: Argentina

Re: Import files

Post by paul doe »

Gablea wrote:@Paul Doe

THANK YOU that was just what I was looking for and thank you for the comments explaining what they do it makes sense now

And yes I think it was show how to do that but I was struggling to understand what the code did now You have done your example that shows and
explains what each section does it totally makes sense.

I can now be ready for the future file formats (I assume I can add any other flied spliter into the delimiters variable (for example say a customers system outputs with a ¬ as the spliter)
You're welcome. Yes, it can split a file using any delimiter you specify. Try this file:

Code: Select all

1:this,is,a,test;
2:this is a test;
3:this	is	a	test;
4:this~is~a~test;
5:this¬is¬a¬test;
If you change delimiters to this:

Code: Select all

delimiters = "|,;:~¬" & chr( 13, 10, 32, 9 )
The output should be:

Code: Select all

1
this
is
a
test
2
this
is
a
test
3
this
is
a
test
4
this
is
a
test
5
this
is
a
test

As you can see, you don't need to keep the same delimiter for all the file, you can use different ones if you want.

Note that there's a null entry at the end, which is added on purpose, so you can remove it if you don't need it. BTW, check the post before, I updated the function splitString() to not allocate space for the delimiters, since in the original function I retrieved them also.
Gablea
Posts: 1104
Joined: Apr 06, 2010 0:05
Location: Northampton, United Kingdom
Contact:

Re: Import files

Post by Gablea »

@Paul Doe

So all I need to do now is crate the function that saves the csv file into a fix length data file (think that would be the simplist part of the project)
paul doe
Moderator
Posts: 1730
Joined: Jul 25, 2017 17:22
Location: Argentina

Re: Import files

Post by paul doe »

Gablea wrote:@Paul Doe

So all I need to do now is crate the function that saves the csv file into a fix length data file (think that would be the simplist part of the project)
Yes, indeed. You can use types, directly write the data as strings, bytestreams, whatever. That's entirely up to you now.
Post Reply