Change string to ordered index ?
Hi,
to save disk space, I would like to know whether there is a generic solution/formula for converting a string (e.g. string*18) to an integer value (index, i.e. position) in incoming order, 1 to N.
Say something produces a string*18.
Then this happens:
a) if the string is empty or null, do nothing
b) if it is already known, only increment its occurrence count
c) otherwise compute its ordered index and store it as a UDT of 3 members (string*18 + its incoming ordered index + occurrence count)
Thank you for any hints
Re: Change string to ordered index ?
Sorry, but I fail to see how this is supposed to save any disk/SSD space.
Please explain the basic idea behind the scenes. What do you want to "get done"?
Especially the Type (UDT) doesn't make any sense (in the given context).
(if I'm guessing correctly: "just save a string once")
Re: Change string to ordered index ?
Something like this?
Note: For large data sets 'redim preserve' can slow things down. Other list or tree types could be considered.
Code: Select all
type data_type
    dim as string text
    dim as integer count
    declare operator cast () as string
end type
operator data_type.cast () as string
    return str(count) + " x " + text
end operator
'-------------------------------------------------------------------------------
type simple_list
    private:
    dim as data_type myData(any)
    public:
    declare function update(newText as string) as integer
    declare sub printAll()
end type
function simple_list.update(newText as string) as integer
    if newText = "" then return -1
    dim as integer index = -1
    for i as integer = 0 to ubound(myData)
        if myData(i).text = newText then
            index = i
            exit for
        end if
    next
    if index >= 0 then 'found
        myData(index).count += 1
    else 'not listed, add
        dim as integer ub = ubound(myData) + 1
        redim preserve myData(ub) 'increase array size
        myData(ub).text = newText
        myData(ub).count = 1
    end if
    return 0
end function
'loop all and print
sub simple_list.printAll()
    for i as integer = 0 to ubound(myData)
        print i, myData(i)
    next
end sub
'-------------------------------------------------------------------------------
dim as simple_list list
list.update("test123")
list.update("test123")
list.update("ABC")
list.update("test123")
list.update("ABC")
list.update("EDF")
list.update("test123")
list.printAll()
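On the 'redim preserve' note: one common way to reduce its cost is to grow the array in bigger steps (for example by doubling its capacity) instead of by one element per new string. Below is a minimal sketch under that assumption. The names `entry_type`, `indexed_list`, `addText`, `items` and `used` are made up for this illustration and are not part of the code above. It also returns the 1-based incoming index asked for in the first post, and 0 for an empty string.

```freebasic
type entry_type
    as string text
    as ulong count
end type

type indexed_list
    as entry_type items(any)   'allocated capacity, may be larger than 'used'
    as long used = 0           'number of entries actually stored
    declare function addText(newText as string) as long
end type

'returns the 1-based incoming index of newText, or 0 for an empty string
function indexed_list.addText(newText as string) as long
    if len(newText) = 0 then return 0
    'linear search among the stored entries
    for i as long = 0 to used - 1
        if items(i).text = newText then
            items(i).count += 1
            return i + 1
        end if
    next
    'not listed yet: double the capacity only when the array is full
    if used > ubound(items) then
        dim as long newCap = iif(used = 0, 16, used * 2)
        redim preserve items(0 to newCap - 1)
    end if
    items(used).text = newText
    items(used).count = 1
    used += 1
    return used
end function

dim as indexed_list demoList
print demoList.addText("test123")   'first new string  -> index 1
print demoList.addText("ABC")       'second new string -> index 2
print demoList.addText("test123")   'already known     -> index 1 again
print demoList.addText("")          'empty             -> 0
```

The lookup is still a linear scan, so with around 100k distinct strings a hash table or a tree would probably be the next thing to try.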
Re: Change string to ordered index ?
Hi MrSwiss,
due to memory trouble on a 32-bit distro I must reduce the amount of data saved to disk/ramdisk.
The complete range of generated data is 1 to 17M. There I know a formula to get the index and can compute everything needed with pleasure.
I have got 3M for now, which is impossible to handle.
Moreover, it is mostly garbage: empty, null strings.
The number of valuable strings obtained is 100k. So I need to know/recode the index in a new range, 1 to 100k.
The reduced data loses 2 parameters, unimportant for now. We'll see..
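To put rough numbers on the saving, here is a small sketch. It assumes the 3-member record from the first post is stored packed as 18 text bytes plus two 32-bit values; the type name `record_type` and the two constants are only illustrative, and the real layout may differ.

```freebasic
'assumed record layout: 18 text bytes + index + occurrence count, no padding
type record_type field = 1
    as ubyte text(1 to 18)
    as ulong index
    as ulong count
end type

const TOTAL_ROWS as longint = 17000000   'complete range, 1 to 17M
const USEFUL_ROWS as longint = 100000    'valuable strings, about 100k

print "bytes per record:"; sizeof(record_type)                 '26
print "all 17M rows    :"; TOTAL_ROWS * sizeof(record_type)    '442000000
print "100k useful rows:"; USEFUL_ROWS * sizeof(record_type)   '2600000
```

Under these assumptions, storing only the valuable strings needs about 2.6 MB instead of about 442 MB.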
Re: Change string to ordered index ?
Hi badidea
very helpful code, looks closer to what I am exactly looking for, thank you very much !
Re: Change string to ordered index ?
What do you mean with 17M? 17 Megabyte? That is close to nothing.
Re: Change string to ordered index ?
Sure. But the next associated things, e.g. online finalizing sorts of a sheet of 17M rows x 50 cols and storing that to (temporary) archive files, are harder.
I would like to see it working on 64 bit, to find out what the computing speed is there and what a possible flow design looks like.
-
- Posts: 538
- Joined: Dec 02, 2011 22:51
- Location: France
Re: Change string to ordered index ?
Hello ppf. Consider this one: viewtopic.php?f=8&t=26533 (adapted for larger data sets, 5000+ entries)
An example is here: viewtopic.php?f=2&t=27568&p=261028#p261028 Some documentation is here: viewtopic.php?f=9&t=26551 You are welcome to try it. Have fun.
Re: Change string to ordered index ?
Hi ppf,
when using a UDT (Type) for writing/reading to/from a file, it's important NOT to use
the data type 'Integer' inside the UDT, because it doesn't have a fixed size.
For your specified sizes, I'd opt for a ULong (unsigned 32 bit).
This makes the data files compatible with the .exe, whether it's compiled as
32 or 64 bit (NOT so with Integer, whose size differs!).
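A quick way to see the difference is to print the sizes and compile the same source once as 32 bit and once as 64 bit. This is only a small sketch; the type names `bad_record` and `good_record` are made up for the illustration.

```freebasic
type bad_record
    as integer index   '4 bytes in a 32-bit build, 8 bytes in a 64-bit build
end type

type good_record
    as ulong index     'always 4 bytes
end type

print "SizeOf(Integer)     ="; sizeof(integer)
print "SizeOf(ULong)       ="; sizeof(ulong)
print "SizeOf(bad_record)  ="; sizeof(bad_record)
print "SizeOf(good_record) ="; sizeof(good_record)
```

Only the ULong lines print the same value in both builds, so a file written with `good_record` by one build can be read by the other.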
Re: Change string to ordered index ?
@MrSwiss : you mean for binary files ? (or non ascii)
Re: Change string to ordered index ?
Look at the UDT as a 'record', which must always work with the same number of bytes.
Aka: the UDT's size (in bytes) remains equal, independent of the compiler's bitness.
(from here onwards, you can surely work it out on your own)
Re: Change string to ordered index ?
OK, got it. The tool was originally designed to work with strings, not with UDTs as records, and Integer was meant mostly for internal counting (or pointer casting, etc., afterwards) rather than for data. Indeed, you're right: as long as you consider UDTs as 'records', Integer in types should be turned into ULong (if so, check that no -1 value is used).
PS, addendum: the tool is somewhat inconsistent in that it is supposed to be for beginners and also for more advanced users (though seasoned programmers may prefer to do without it or to adapt it). It is a beta version, so we can still expect some bugs in 64 bit, especially if we push the tool to its limits. But until the next release, it seems to meet many expectations.
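Regarding the -1 check mentioned above, a tiny sketch of why it matters. It assumes that -1 was used as a "no entry" marker in a signed field; the names `oldMarker`, `asULong` and `NO_ENTRY` are hypothetical.

```freebasic
'assumption: the old code stored -1 in a signed field to mean "no entry"
dim as long oldMarker = -1
dim as ulong asULong = culng(oldMarker)
print asULong   '4294967295: -1 wraps around to the largest ULong value

'hypothetical alternative: indexes start at 1, so 0 can mean "no entry"
const NO_ENTRY as ulong = 0
dim as ulong index = NO_ENTRY
if index = NO_ENTRY then print "no entry"
```

With 1-based indexes, 0 is a natural marker that survives the switch to an unsigned type.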
Re: Change string to ordered index ?
Well, I don't agree with your conclusions (so far). The reason is simple:
One cannot expect a tool to evolve by adding new functionality
without also accepting the downsides, such as more complexity in handling it.
There is almost always a certain amount of tradeoffs to live with.
(Btw. you can expect the very same bugs in the 32 bit version!)
Re: Change string to ordered index ?
Almost all the features added so far were thought of at the design stage, which explains their integration. Neither the handling nor the complexity of the basic instruction set has been affected. On the other hand, the more advanced features add a complexity of use that comes with the increase in functional possibilities. Exception handling is excluded from the parser; the only possible place left is the execution context, so exceptions are handled around the parser(s). Adding features increases the complexity of the context because the number of ways the instruction set can be used is multiplied. Bug fixes must sometimes be thought of globally. If you precisely identify new bugs in the tool, I'm listening, and will fix them quickly or in the next version, or take note of them, depending on the severity. Users are the judges of technical tradeoffs. This tool should be evaluated according to the use it brings and what users want it to do, and not only on the beauty of the code. Nevertheless, I understand your point of view even though I do not share it.
Re: Change string to ordered index ?
Regarding saving a UDT to a file.
If anybody can save and reload an array of UDTs with a fixed-length string field (as perhaps required in this thread), then please demonstrate.
(similar to a previous topic)
I have tried every which way, but the data read back from the file is not correct.
So I convert a UDT with a fixed-length string to a UDT with a ubyte array.
This way the data is recovered.
Code: Select all
#include "file.bi"
type stringtype 'cannot save and reload this type efficiently from disk
    as string * 18 value
    as long index
    as long occurrency
    declare operator cast() as string
end type
type arraytype 'this can be saved and reloaded from disk.
    as ubyte value(1 to 18)
    as long index
    as long occurrency
end type
operator stringtype.cast() as string 'print out the stringtype results
    print "'";value;"'"
    print index
    print occurrency
    return ""
end operator
sub convert overload(a() as arraytype,s() as stringtype) 'arraytype to stringtype
    for n as long=lbound(a) to ubound(a)
        'build the text in a temporary string, then assign it in one go,
        'so the result does not rely on how the fixed-length field is padded
        dim as string tmp = ""
        for m as long=1 to 18: tmp+= chr(a(n).value(m)):next
        s(n).value=tmp
        s(n).index=a(n).index
        s(n).occurrency=a(n).occurrency
    next
end sub
sub convert overload(s() as stringtype,a() as arraytype) 'stringtype to arraytype
    for n as long=lbound(a) to ubound(a)
        for m as long=0 to 17: a(n).value(m+1)= (s(n).value[m]):next
        a(n).index=s(n).index
        a(n).occurrency=s(n).occurrency
    next
end sub
sub loadfile(file as string,b() as arraytype)
    If FileExists(file)=0 Then Print file;" not found":Sleep:end
    var f=freefile
    Open file For Binary Access Read As #f
    If Lof(f) > 0 Then
        Get #f, , b()
    End If
    Close #f
end sub
Sub savefile(filename As String,p() As arraytype)
    Dim As Integer n
    n=Freefile
    If Open (filename For Binary Access Write As #n)=0 Then
        Put #n,,p()
        Close #n 'close only this file
    Else
        Print "Unable to save " + filename
    End If
End Sub
dim as string z="AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz"
#define range(f,l) Int(Rnd*((l+1)-(f))+(f))
dim as stringtype st(1 to 10) 'or 1 to whatever
dim as arraytype at(1 to ubound(st))
'create some instances of stringtype
for n as long=1 to ubound(st)
    with st(n)
        .value=mid(z,range(1,(52-18)),18)
        .index=n
        .occurrency=range(5,15)
    end with
next n
print "Save this data to file:"
print
for n as long=lbound(st) to ubound(st) 'show them
    print st(n)
next
print "____________________________"
print
convert(st(),at())'convert internal string data to ubyte array
savefile("text.txt",at()) 'must save arraytype
erase st,at
'all saved to drive, erase all arrays
'===================================================
'reload from drive
var lngth=filelen("text.txt")\sizeof(arraytype) 'get incoming array dimension
dim as arraytype x(1 to lngth)
dim as stringtype y(1 to lngth)
loadfile("text.txt",x()) 'must load to arraytype
convert(x(),y()) 'convert to stringtype
print "Returned data:"
print
for n as long=lbound(y) to ubound(y) 'show the reloaded array
    print y(n)
next
sleep