Import files
Import files
Hi everyone,
I have a good understanding of reading csv files and thanks to a few of you on here I am starting to understand how to read in files with tab formats
But how would I read in files that use | as field seperators
I know this is a very simple question but everything I look at online contradicted previous information.
Any advice would be welcome.
I have a good understanding of reading csv files and thanks to a few of you on here I am starting to understand how to read in files with tab formats
But how would I read in files that use | as field seperators
I know this is a very simple question but everything I look at online contradicted previous information.
Any advice would be welcome.
Re: Import files
Perhaps you could replace | with ,
If it suitable, I think basiccoder2 wrote a faster replace version somewhere.
Code: Select all
'for replacement of one string by another
Function FindAndReplace(subject As String,find As String,replacement As String) As String
Dim s As String=subject
var position=Instr(s,find)
While position>0
s=Mid(s,1,position-1) & replacement & Mid(s,position+Len(find))
position=Instr(position+Len(replacement),s,find)
Wend
FindAndReplace=s
End Function
'for replacement of one char by another
function FindAndReplaceOnechar(subject As String,char As String,replacementchar As String) As String
dim as string s=subject
for n as long=0 to len(s)-1
if s[n]=asc(char) then s[n]=asc(replacementchar)
next
return s
end function
dim as string s="A|b|Canada|d|e|America|Portland|Freebasic|Imortis " +chr(10)
for n as long=1 to 5
s+=s
next
print s
dim as string g=FindAndReplaceOneChar(s,"|",",")
print g
sleep
Re: Import files
Hi dodicat
Thanks for that advice. But I have to be careful as one of my customers use , in the description name of the items
Is it not possible to split lines at the | and process then like you would with the csv files?
Thanks for that advice. But I have to be careful as one of my customers use , in the description name of the items
Is it not possible to split lines at the | and process then like you would with the csv files?
Re: Import files
A comma used in a sentence, is grammatically a comma and a space.
The comma separated lists are comma only.
You could use this property to keep them apart.
Example:
The comma separated lists are comma only.
You could use this property to keep them apart.
Example:
Code: Select all
'for replacement of one string by another
Function FindAndReplace(subject As String,find As String,replacement As String) As String
Dim s As String=subject
var position=Instr(s,find)
While position>0
s=Mid(s,1,position-1) & replacement & Mid(s,position+Len(find))
position=Instr(position+Len(replacement),s,find)
Wend
FindAndReplace=s
End Function
'for replacement of one char by another
function FindAndReplaceOnechar(subject As String,char As String,replacementchar As String) As String
dim as string s=subject
for n as long=0 to len(s)-1
if s[n]=asc(char) then s[n]=asc(replacementchar)
next
return s
end function
dim as string s="A|b|Canada|d|e|America, South|Portland|Freebasic|Imortis " +chr(10)
for n as long=1 to 2
s+=s
next
print s
dim as string g=FindAndReplaceOneChar(s,"|",",")
print "| to ,"
print g
g=findandreplace(g,"South","North")
print "South to North"
print g
g=findandreplace(g,", ",chr(0))
print "comma + space to a chr(0)"
print g
g=findandreplaceOneChar(g,",","|")
print ", to |"
print g
g=findandreplace(g,chr(0),", ")
print "chr(0) to comma + space"
print g
sleep
Re: Import files
So once the file has been converted I just Read it in as normal
Great thanks for the advice I will give that a try tonight
Great thanks for the advice I will give that a try tonight
Re: Import files
Instead of the comma, use the tab, i.e. chr(9). Both "|" and tab are non-ambiguous characters. much easier to handle.
Re: Import files
See I’m not 100% up on how to read in the files with tab as the determinator
Is there any simple examples on here that shows how to do that?
Is there any simple examples on here that shows how to do that?
Re: Import files
Weren't you answered how to do it in another one of your threads? How hard can it be to rewrite the code to accept any delimiter?Gablea wrote:... Is there any simple examples on here that shows how to do that?
Anyway:
Code: Select all
'' Include needed dependencies
#include once "file.bi"
/'
Splits a string into substrings delimited by any char
specified in the delimiters.
'/
function splitString( _
byref anInputString as const string, _
byref theDelimiters as const string, _
substrings() as string _
) as uinteger
dim as uinteger sLen = len( anInputString )
if( _
sLen = 0 ) then
redim substrings( 0 )
return( 0 )
end if
dim as boolean mark ( 1 to sLen )
dim as uinteger _
count, _
delimiters, _
position = 1, _
charPos
/'
Go through the string and mark the position of every delimiter
we can find. This process is called 'tallying', and we use a
boolean to tally whether the character in the input string is
a delimiter or not.
'/
do
/'
We use the inStr statement of FreeBasic to look for the
delimiters. See the documentation on inStr to see how
the function works.
'/
charPos = _
inStr( _
position, _
anInputString, _
any theDelimiters )
'' Check if we actually found a delimiter.
if( _
charPos > 0 ) then
'' We found one
mark( charPos ) = true
count += 1
delimiters += 1
if( _
charPos > 1 ) then
if( _
mark( charPos - 1 ) = false ) then
/'
This is just a string that contains a single
word, without delimiters.
'/
count += 1
end if
end if
'' Advance the starting position of the search
position = charPos + 1
else
exit do
end if
loop
/'
If the last token wasn't a delimiter, add one to the token
count (to account for the last token).
'/
if( _
mark( sLen ) = false ) then
count += 1
end if
/'
Allocate space for the spliced substrings. There's an extra
space at the end, reserved for a null token.
'/
redim _
substrings( 0 to count - delimiters )
'' Reset current and next positions
dim as uinteger current = 0
position = 1
/'
And then splice the substrings, using the marked positions
'/
for _
i as integer = 1 to sLen
if( _
mark( i ) = true ) then
if( _
i > 1 ) then
if( _
mark( i - 1 ) = false ) then
/'
Splices the tallied token. We'll use the mid() function, so
look at the documentation to see how it works.
'/
substrings( current ) = _
mid( _
anInputString, _
position, _
i - position )
current += 1
end if
end if
position = i + 1
end if
next
/'
If the last char wasn't a delimiter, splice the last token. This
accounts for the case where the last token doesn't end in a
delimiter (otherwise, it would be ignored).
'/
if( _
position <= sLen ) then
substrings( current ) = _
mid( _
anInputString, _
position, _
sLen - position + 1 )
end if
'' Return the amount of substrings we've found
return ( count )
end function
/'
Loads a file whose fields are delimited by any user-specified
chars.
'/
function loadDelimitedFile( _
byref aFileName as const string, _
byref aDelimiter as const string, _
strings() as string _
) as boolean
if( _
fileExists( aFileName ) ) then
dim as long fileHandle = freeFile()
dim as string file = _
space( fileLen( aFileName ) )
open _
aFileName for binary access read as fileHandle
get _
#fileHandle, , file
close( fileHandle )
if( _
splitString( _
file, _
aDelimiter, _
strings() ) > 0 ) then
return( true )
end if
end if
return( false )
end function
/'
Main code
'/
dim as string _
path = exePath() & "/", _
fileName = path & "test.txt", _
values( any ), _
delimiters = "|,;:~¬" & chr( 13, 10, 32, 9 )
if( _
loadDelimitedFile( _
fileName, _
delimiters, _
values() ) ) then
/'
Here I simply show the values, but this is where you load them into
their appropriate places.
'/
for _
i as integer = 0 to ubound( values )
? values( i )
next
end if
sleep()
EDIT: I forgot to remove the space allocated to split the delimiters also. Fixed now.
Last edited by paul doe on Nov 30, 2018 4:19, edited 5 times in total.
-
- Posts: 3904
- Joined: Jan 01, 2009 7:03
- Location: Australia
Re: Import files
viewtopic.php?f=7&t=25334&hilitdodicat wrote:If it suitable, I think basiccoder2 wrote a faster replace version somewhere.
Re: Import files
basiccoder2
Was this your completed version?
Gablea could use it instead of mine, it is very fast.
Paul Doe
You are splitting on each character in the deliminator, perhaps gablea needs to find and replace blocks.
example using basiccoder's replacer.
Was this your completed version?
Code: Select all
function findAndReplace(txtOld as string,txtFind as string,txtReplace as string) as string
if len(txtFind) = 0 then return txtOld
dim as string txtNew
if len(txtReplace)<len(txtFind) then
txtNew = string(len(txtOld)," ")
else
txtNew = string(len(txtOld)/len(txtFind)*len(txtReplace)+1," ")
end if
dim as integer ptrOld,ptrReplace,ptrFind,ptrNew
ptrReplace = 0
ptrOld = 0
while ptrOld <> len(txtOld)+1
if ptrFind = len(txtFind) then 'found substring !!
'undo copying of txtFind string in txtNew
ptrNew = ptrNew - len(txtFind) 'erase txtFind string from new string
ptrFind = 0
'append any replacement string to txtNew
ptrReplace = 0
if len(txtReplace)>0 then
for i as integer = 0 to len(txtReplace)
txtNew[ptrNew] = txtReplace[ptrReplace]
ptrNew = ptrNew + 1
ptrReplace = ptrReplace + 1
next i
ptrReplace = 0
ptrNew = ptrNew - 1
end if
else
if txtOld[ptrOld] = txtFind[ptrFind] then 'could be the sub string
ptrFind = ptrFind + 1
else
ptrFind = 0 'nope not the sub string reset to start of substring
end if
txtNew[ptrNew] = txtOld[ptrOld] 'copy character to new string
ptrNew = ptrNew + 1
ptrOld = ptrOld + 1
end if
wend
txtNew = left(txtNew,ptrNew-1)
return txtNew
end function
Paul Doe
You are splitting on each character in the deliminator, perhaps gablea needs to find and replace blocks.
example using basiccoder's replacer.
Code: Select all
function findAndReplace(txtOld as string,txtFind as string,txtReplace as string) as string
if len(txtFind) = 0 then return txtOld
dim as string txtNew
if len(txtReplace)<len(txtFind) then
txtNew = string(len(txtOld)," ")
else
txtNew = string(len(txtOld)/len(txtFind)*len(txtReplace)+1," ")
end if
dim as integer ptrOld,ptrReplace,ptrFind,ptrNew
ptrReplace = 0
ptrOld = 0
while ptrOld <> len(txtOld)+1
if ptrFind = len(txtFind) then 'found substring !!
'undo copying of txtFind string in txtNew
ptrNew = ptrNew - len(txtFind) 'erase txtFind string from new string
ptrFind = 0
'append any replacement string to txtNew
ptrReplace = 0
if len(txtReplace)>0 then
for i as integer = 0 to len(txtReplace)
txtNew[ptrNew] = txtReplace[ptrReplace]
ptrNew = ptrNew + 1
ptrReplace = ptrReplace + 1
next i
ptrReplace = 0
ptrNew = ptrNew - 1
end if
else
if txtOld[ptrOld] = txtFind[ptrFind] then 'could be the sub string
ptrFind = ptrFind + 1
else
ptrFind = 0 'nope not the sub string reset to start of substring
end if
txtNew[ptrNew] = txtOld[ptrOld] 'copy character to new string
ptrNew = ptrNew + 1
ptrOld = ptrOld + 1
end if
wend
txtNew = left(txtNew,ptrNew-1)
return txtNew
end function
dim as string s="A|b|Canada|d|e|America, South|Portland|Freebasic|Imortis " +chr(10)
for n as long=1 to 15
s+=s
next
print "original, length ";len(s)
print mid(s,1,500)+" .... "
dim as string g
s=FindAndReplace(s,",",chr(0))
print ", to chr(0)"
print mid(s,1,500)+" .... "
s=FindAndReplace(s,"|",",")
print "| to ,"
print mid(s,1,500)+" .... "
s=findandreplace(s,"South","Northtern part")
print "South to Northern Part"
print mid(s,1,500)+" .... "
s=findandreplace(s,",","|")
print ", to |"
print mid(s,1,500)+" .... "
s=findandreplace(s,chr(0),",")
print "chr(0) to comma"
print mid(s,1,500)+" .... "
print "New string length ";len(s)
sleep
Re: Import files
Perhaps it's just me, but Gablea clearly stated he wanted to read each field delimited with a '|' from a CSV-like file:dodicat wrote:...Paul Doe
You are splitting on each character in the deliminator, perhaps gablea needs to find and replace blocks...
Gablea wrote:...
But how would I read in files that use | as field seperators
...
And yes, it splits at either a comma, a tab or a vertical bar '|'. Thus you can use it to read a delimiter-separated value file with a single function. Of course, the validity of the file is up to the client code to implement.Gablea wrote:...
Is it not possible to split lines at the | and process then like you would with the csv files?
...
Last edited by paul doe on Nov 30, 2018 3:17, edited 2 times in total.
Re: Import files
@Paul Doe
THANK YOU that was just what I was looking for and thank you for the comments explaining what they do it makes sense now
And yes I think it was show how to do that but I was struggling to understand what the code did now You have done your example that shows and
explains what each section does it totally makes sense.
I can now be ready for the future file formats (I assume I can add any other flied spliter into the delimiters variable (for example say a customers system outputs with a ¬ as the spliter)
THANK YOU that was just what I was looking for and thank you for the comments explaining what they do it makes sense now
And yes I think it was show how to do that but I was struggling to understand what the code did now You have done your example that shows and
explains what each section does it totally makes sense.
I can now be ready for the future file formats (I assume I can add any other flied spliter into the delimiters variable (for example say a customers system outputs with a ¬ as the spliter)
Re: Import files
You're welcome. Yes, it can split a file using any delimiter you specify. Try this file:Gablea wrote:@Paul Doe
THANK YOU that was just what I was looking for and thank you for the comments explaining what they do it makes sense now
And yes I think it was show how to do that but I was struggling to understand what the code did now You have done your example that shows and
explains what each section does it totally makes sense.
I can now be ready for the future file formats (I assume I can add any other flied spliter into the delimiters variable (for example say a customers system outputs with a ¬ as the spliter)
Code: Select all
1:this,is,a,test;
2:this is a test;
3:this is a test;
4:this~is~a~test;
5:this¬is¬a¬test;
Code: Select all
delimiters = "|,;:~¬" & chr( 13, 10, 32, 9 )
Code: Select all
1
this
is
a
test
2
this
is
a
test
3
this
is
a
test
4
this
is
a
test
5
this
is
a
test
Note that there's a null entry at the end, which is added on purpose, so you can remove it if you don't need it. BTW, check the post before, I updated the function splitString() to not allocate space for the delimiters, since in the original function I retrieved them also.
Re: Import files
@Paul Doe
So all I need to do now is crate the function that saves the csv file into a fix length data file (think that would be the simplist part of the project)
So all I need to do now is crate the function that saves the csv file into a fix length data file (think that would be the simplist part of the project)
Re: Import files
Yes, indeed. You can use types, directly write the data as strings, bytestreams, whatever. That's entirely up to you now.Gablea wrote:@Paul Doe
So all I need to do now is crate the function that saves the csv file into a fix length data file (think that would be the simplist part of the project)