ForumBlobPost - Utility (Ver. 0.4-0.51)

leopardpm
Posts: 1795
Joined: Feb 28, 2009 20:58

Re: ForumBlobPost - Utility (Ver.0.4)

Post by leopardpm »

MrSwiss wrote:
leopardpm wrote:we can call our 'standard': FB-Forum ASCII (FBFA), 6-bit encoding for 8 bit data through the forum
Well, technically this isn't correct, since some of the characters used in the alphabet
are in effect 7-bit coded (induced by the shift of 35).
The alphabet: from "#" to "b", all characters in between are valid ... call it: B64SH35 ?!?
I guess I just think of how many bits of data per byte the encoding method uses: so, PER FORUM BYTE, we manage to transfer 6 bits of data. The forum bytes are actually still 8-bit bytes (of course), we just can't use them fully...

but, if you are really attached to the cryptic "B64SH35", then... why not!? lol
dodicat
Posts: 7983
Joined: Jan 10, 2006 20:30
Location: Scotland

Re: ForumBlobPost - Utility (Ver.0.4)

Post by dodicat »

The best compressor IMHO is simply to regard the input file (graphics etc.) as one large base 256 number.
It satisfies the criteria: every digit is in the range 0 to 255.
Change the whole number to base ten, then change from base 10 to base 128, with ASCII starting at 32.

This gives a compact encoding (the readable text is about 14% larger than the original).
Here is Basiccoder's spritesheet results:

Code: Select all

Length original  48963
Length readable  55958
Length returned  48963
-1

First few characters of the forum readable file

15"Y.-%&a@   -Dr(e0  !ö   >@0#   "MûPA   +èa8tBƒƒƒ
 
Unfortunately the GMP library is really required to achieve results in a few seconds.
Here is the GMP code for the above result.

Code: Select all


#Include once "gmp.bi"
#include "file.bi"
'convert a decimal string to base b
Function convertDecimaltobase(Byval Base10 As String ,b As String="256",begin As Long=32) As String
    Dim As zstring Ptr s
    Dim As __mpz_struct answer,id,divd,modd,bb,zero
    mpz_init2( @answer,0)
    mpz_init2( @id,0)
    mpz_init2( @divd,0)
    mpz_init2( @modd,0)
    mpz_init2( @bb,0)
    mpz_init2( @zero,0)
    mpz_init_set_str( @id,Base10,10)
    mpz_init_set_str( @bb,b,10)
    Dim As String acc
    Do
        mpz_div(@divd,@id,@bb)
        mpz_mod(@modd,@id,@bb)
        s= mpz_get_str(0, 10, @modd)
        acc= Chr(Vallng(*s)+begin)+acc
        Mpz_set(@id,@divd)
    Loop Until  Mpz_cmp(@divd,@zero)=0 
    Function= acc
    mpz_clear(@answer) : mpz_clear(@id) : mpz_clear(@divd)
    mpz_clear(@modd) : mpz_clear(@bb) : mpz_clear(@zero)
    Deallocate s
End Function

'convert from base b to a decimal string
Function convertbasetoDecimal(Byval Basenum As String ,b As String="256",begin As Long=32) As String
    Dim As zstring Ptr s
    Dim As String sum
    Dim As Ulong u=Valulng(b)
    sum=Str(Asc(Left(BaseNum,1))-begin)
    Dim As __mpz_struct mul,znum,zsum
    mpz_init2( @mul,0)
    mpz_init2 (@znum,0)
    mpz_init2 (@zsum,0)
    mpz_set_str(@zsum,sum,10)
    For n As Long=2 To Len(basenum)
        Var z=basenum[n-1]
        mpz_mul_ui(@mul,@zsum,u)
        mpz_set_str(@znum,Str(z-begin),10)
        mpz_add(@zsum,@mul,@znum)
    Next n
    s= mpz_get_str(0, 10, @zsum) 
    Function= *s
    mpz_clear(@mul) : mpz_clear(@znum) : mpz_clear(@zsum)
    Deallocate s
End Function

Function compress(f As String,basenum As String="128",begin As Long=32) As String export
    Var b=convertbasetoDecimal(f,"256",0)
    Return convertDecimaltobase(b,basenum,begin)
End Function

Function uncompress(f As String,basenum As String="128",begin As Long=32) As String export 
    Var g=convertbasetoDecimal(f,basenum,begin)
    Return convertDecimaltobase(g,"256",0)
End Function


Function loadfile(file As String) As String
    Var  f=Freefile
    if  FileExists(file)=0 then print file;"  not found,press a key":sleep:end
    Open file For Binary Access Read As #f
    Dim As String text
    If Lof(f) > 0 Then
        text = String(Lof(f), 0)
        Get #f, , text
    End If
    Close #f
    Return text
End Function

Sub savefile(filename As String,p As String)
    Dim As Integer n
    n=Freefile
    If Open (filename For Binary Access Write As #n)=0 Then
        Put #n,,p
        Close #n
    Else
        Print "Unable to save " + filename
    End If
end sub

dim as string nbase="128"
dim as long ascistart=32

var f=loadfile("spritesheet.png")
var c=compress(f,nbase,ascistart)              'forum readable
''savefile("something.txt",c)
print "Length original ";len(f)
print "Length readable ";len(c)
var r=uncompress(c,nbase,ascistart)             'original returned
print "Length returned ";len(r)
print f=r
print

print "First few characters of the forum readable file"
print
print left(c,50)
sleep 
I have both 32-bit and 64-bit GMP static libraries.
I can send them via MediaFire if there is interest.
Coupled with a little WinAPI thing to deal with the clipboard directly, this method would suffice, I think.
The web base 64 encoders (one was used on the forum quite recently) possibly use a bigint routine to get that compression edge, rather than a digit-by-digit method.
Not to take anything away from MrSwiss's method here, which is pretty good.
MrSwiss
Posts: 3910
Joined: Jun 02, 2013 9:27
Location: Switzerland

Re: ForumBlobPost - Utility (Ver.0.4)

Post by MrSwiss »

dodicat wrote:change the whole number to base ten then change from base 10 to base 128, ascii start at 32.
I've done that too, before switching to BASE64. The reason is simple: the only "reliable" method is to stay:
  A) above Chr(34), as a minimum (Chr(34) is used as the "delimiter" for the filename)
  B) below Chr(127), as a maximum
so that the text is readable under all current "encoding schemes" employed by the various OSes.
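
A minimal sketch of how that range rule could be checked, assuming the limits quoted above (Chr(35) up to Chr(126)). The function name isForumSafe is hypothetical and is not part of the Utility:

```freebasic
' Hypothetical check: returns -1 when every character of s lies in
' the range Chr(35)..Chr(126), otherwise 0.
Function isForumSafe(s As String) As Long
    For i As Long = 0 To Len(s) - 1
        If s[i] < 35 OrElse s[i] > 126 Then Return 0
    Next
    Return -1
End Function

If isForumSafe("#abc~") Then Print "inside the safe range"
If isForumSafe("a b") = 0 Then Print "a space (Chr(32)) is outside the encoding range"
```

Running the encoded text through a check like this before posting would catch characters that stray into the delimiter or extended-ASCII areas.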
leopardpm
Posts: 1795
Joined: Feb 28, 2009 20:58

Re: ForumBlobPost - Utility (Ver.0.4)

Post by leopardpm »

We already have a base128 version (7-bit encoding) for the forum, BUT apparently it uses characters that won't transfer correctly depending on certain user regional browser/system settings, so now we are reduced to using base 64 (6-bit encoding) to be completely compatible across regions. The base128 version was 'perfect' in that exactly 7 bits could be encoded into each forum byte, so what you are suggesting would bring no improvement, and it needed no additional libraries. Basically, every 7 bytes of 8-bit info are re-coded into 8 bytes of 7-bit info each...
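
The 7-bytes-into-8 regrouping can be sketched as follows. This is only an illustration of the bit shuffling, not the code of the actual base128 version; the names pack7to8 and unpack8to7 and the offset of 32 are assumptions:

```freebasic
' Hypothetical helper: pack 7 input bytes (56 bits) into 8 output
' characters carrying 7 bits each. "begin" is the offset added to
' every 7-bit value; 32 is only an illustrative choice.
Function pack7to8(src As String, begin As Long = 32) As String
    Dim As ULongInt acc = 0
    For i As Long = 0 To 6
        acc = (acc Shl 8) Or src[i]        ' collect 56 bits, first byte on top
    Next
    Dim As String res = String(8, 0)
    For i As Long = 7 To 0 Step -1
        res[i] = (acc And 127) + begin     ' lowest 7 bits go to the last char
        acc Shr= 7
    Next
    Return res
End Function

' Reverse step: 8 characters of 7 bits each back into 7 bytes.
Function unpack8to7(enc As String, begin As Long = 32) As String
    Dim As ULongInt acc = 0
    For i As Long = 0 To 7
        acc = (acc Shl 7) Or (enc[i] - begin)
    Next
    Dim As String res = String(7, 0)
    For i As Long = 6 To 0 Step -1
        res[i] = acc And 255
        acc Shr= 8
    Next
    Return res
End Function

Dim As String s = "FreeBAS"                ' exactly 7 bytes
If unpack8to7(pack7to8(s)) = s Then Print "round trip ok"
```

With an offset of 32 the output characters run from Chr(32) to Chr(159), which is exactly the region-dependent range that causes the transfer problems mentioned above.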
leopardpm
Posts: 1795
Joined: Feb 28, 2009 20:58

Re: ForumBlobPost - Utility (Ver.0.4)

Post by leopardpm »

NOW, to be 'most' efficient using the allowable forum bytes (chr35 thru chr127 = 93 values), we could make a 'complex' encoder that changes our raw data into base93! lol, craziness! And it would only give us roughly 1/2 bit more per forum byte than base 64 (about 9%).
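
A rough way to put numbers on this: each forum character in base b carries log2(b) bits, so a file of n bytes needs about n*8/log2(b) characters. The sketch below only does that arithmetic; estimateLength is a hypothetical name, and the file size is the 48963-byte spritesheet mentioned earlier in the thread:

```freebasic
' Approximate number of forum characters needed for n bytes in base nbase.
' Each character carries Log(nbase)/Log(2) bits (Log is the natural log).
Function estimateLength(n As Long, nbase As Long) As Double
    Dim As Double bitsPerChar = Log(nbase) / Log(2)
    Return n * 8 / bitsPerChar
End Function

Print "bits per char, base 64:  "; Log(64) / Log(2)
Print "bits per char, base 93:  "; Log(93) / Log(2)
Print "bits per char, base 128: "; Log(128) / Log(2)
Print "chars for 48963 bytes, base 64:  "; estimateLength(48963, 64)
Print "chars for 48963 bytes, base 93:  "; estimateLength(48963, 93)
Print "chars for 48963 bytes, base 128: "; estimateLength(48963, 128)
```

The base 128 estimate should land very close to the 55958 characters reported above for the GMP version, and base 93 sits between base 64 and base 128.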
dodicat
Posts: 7983
Joined: Jan 10, 2006 20:30
Location: Scotland

Re: ForumBlobPost - Utility (Ver.0.4)

Post by dodicat »

Here are base 93 results starting at ascii 35

Code: Select all

Length original  48963
Length readable  59902
Length returned  48963
-1

First few characters of the forum readable file

$yYagm`4b~&0`<c':h@_>$RI}Z,GN3{w\K4^a|)z:5?)tKstaw
 
leopardpm
Posts: 1795
Joined: Feb 28, 2009 20:58

Re: ForumBlobPost - Utility (Ver.0.4)

Post by leopardpm »

dodicat... did you estimate the results, or actually write the code? I would like to see your base93 encoding method!
dodicat
Posts: 7983
Joined: Jan 10, 2006 20:30
Location: Scotland

Re: ForumBlobPost - Utility (Ver.0.4)

Post by dodicat »

The code is above (GMP).
I just changed nbase to 93 and ascistart to 35.

I think I'll work on a GMP alternative so no libs are needed.
So that'll keep me busy for a while.
leopardpm
Posts: 1795
Joined: Feb 28, 2009 20:58

Re: ForumBlobPost - Utility (Ver.0.4)

Post by leopardpm »

As far as I can understand, 93 x 256 = 23,808 (a common multiple), and I can't find a smaller LCM, so it would take 23,808 forum bytes to fully contain a base93 representation... but I am already getting lost in the numbers...

Would the code go something like this:

Given a base256 number (spread across X bytes),

(1) divide it by 93, store the quotient as the NSD (Next Significant Digit),
(2) take the remainder of that division and store it as the LSD (Least Significant Digit, the rightmost base93 digit; the next LSD will be one digit to the left of this digit)
(3) use the previous NSD and proceed to #1 again until the NSD < 93 (that last NSD is then the leftmost digit)

It is the 'divide it by 93' that swamps me... doing a division of a number which is spread across possibly thousands of bytes...
dodicat
Posts: 7983
Joined: Jan 10, 2006 20:30
Location: Scotland

Re: ForumBlobPost - Utility (Ver.0.4)

Post by dodicat »

It is a div/mod process.
So you are correct.
number is base 256

Do
d=div(number,93)
m=mod(number,93)
g=chr(valint(m)+35)+g
number=d
loop until number="0"
answer=g


I am working on the mechanical maths of the problem.
I need a fast string div and mod.
The fact that you are simply changing base should yield an optimal compression.
The bits and bobs and bytes are computer parameters.
(They complicate the immediate problem IMO)
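
For illustration, here is a minimal library-free sketch of that div/mod loop, working directly on the base 256 bytes with schoolbook short division instead of decimal strings. The names divmodSmall, isZero and toBase are hypothetical, and this is not tuned for speed (every output digit costs one pass over the whole number):

```freebasic
' Divide the big number held in num (most significant byte first, base 256)
' by a small divisor d, in place, and return the remainder.
Function divmodSmall(num As String, d As ULong) As ULong
    Dim As ULong carry = 0
    For i As Long = 0 To Len(num) - 1
        Dim As ULong cur = carry * 256 + num[i]   ' bring down the next base-256 digit
        num[i] = cur \ d                          ' quotient digit, always < 256
        carry = cur Mod d                         ' remainder carried to the next digit
    Next
    Return carry
End Function

' Returns -1 when every byte of num is zero, i.e. the quotient has reached 0.
Function isZero(num As String) As Long
    For i As Long = 0 To Len(num) - 1
        If num[i] <> 0 Then Return 0
    Next
    Return -1
End Function

' Repeated div/mod by nbase; each remainder becomes one character shifted by begin.
Function toBase(ByVal num As String, nbase As ULong = 93, begin As Long = 35) As String
    Dim As String g
    Do
        g = Chr(divmodSmall(num, nbase) + begin) + g
    Loop Until isZero(num)
    Return g
End Function

Print toBase(Chr(1, 0))     ' 256 = 2*93 + 70, so the digits are 2 and 70: "%i"
```

As with any "file as one big number" approach, leading zero bytes of the input are not preserved by this sketch, so a real encoder would have to record the original length separately.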
srvaldez
Posts: 3379
Joined: Sep 25, 2005 21:54

Re: ForumBlobPost - Utility (Ver.0.4)

Post by srvaldez »

hello MrSwiss
I tried your program with your example encoded post and it produces the file P2140875_small.jpg but I can't open the file for viewing, I get the error
The file “P2140875_small.jpg” could not be opened.
It may be damaged or use a file format that Preview doesn't recognize.
tested on OS X, but similar failure on Windows.
leopardpm
Posts: 1795
Joined: Feb 28, 2009 20:58

Re: ForumBlobPost - Utility (Ver.0.4)

Post by leopardpm »

dodicat wrote: Do
d=div(number,93)
m=mod(number,93)
g=chr(valint(m)+35)+g
number=d
loop until number="0"
answer=g
Elegant... I like it! Now the trick is, as you say, how to divide a multi-byte (thousands of bytes) or string number in a fast way, retrieving both the int(result) and the modulus of the division... ick!
dodicat
Posts: 7983
Joined: Jan 10, 2006 20:30
Location: Scotland

Re: ForumBlobPost - Utility (Ver.0.4)

Post by dodicat »

What would be wrong with base 96, starting at ASCII 32?
96 is nice because 96=2*6*8.
But base 128 is really nice because 128=2*8*8

base 93 is awkward because 93 has no suitable single digit factors.
Base 90 would be the next one down. 90=2*5*9
example of a string div at base 96

Code: Select all



function shortdiv(byval s as string,s2 as string) as string
    dim as ubyte main,carry,d=s2[0]-48,temp
    dim as string ans=s
    for z as integer=0 to len(s)-1
        temp=(s[z]-48+carry)
        main=temp\d
        carry=(temp mod d)*10
        ans[z]=main+48
    next z
    return ltrim(ans,"0")
end function

function div96(s as string) as string
    var n1=shortdiv(s,"2")
    var n2=shortdiv(n1,"6")
    var n3=shortdiv(n2,"8")
    if n3="" then n3="0"
    return n3
end function



for z as long=1 to 100000
    var n=(int(rnd*100000000))
    if str(n\96)<>div96(str(n)) then print "error"
next
print "done"
print div96("99999998888887777776666665555544444333322211111")
sleep

 
The mod96 of N is just N-(div96)*96, which is easily enough done with a minus and mult function, which I already have.
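
As an alternative sketch (not the minus and mult functions mentioned above), a single pass of short division over the decimal string can hand back the quotient and the remainder together, provided the carry is kept in a wider integer than a ubyte. The name divmodDec is hypothetical:

```freebasic
' Divide the decimal string s by a small divisor d.
' Returns the quotient as a decimal string and the remainder through rmd.
Function divmodDec(s As String, d As ULong, ByRef rmd As ULong) As String
    Dim As String ans = s
    Dim As ULong carry = 0
    For z As Long = 0 To Len(s) - 1
        Dim As ULong cur = carry * 10 + (s[z] - 48)  ' bring down the next decimal digit
        ans[z] = (cur \ d) + 48                      ' quotient digit, always 0..9
        carry = cur Mod d
    Next
    rmd = carry
    ans = LTrim(ans, "0")
    If ans = "" Then ans = "0"
    Return ans
End Function

Dim As ULong r
Dim As String q = divmodDec("1000", 96, r)           ' 1000 = 10*96 + 40
Print q; " remainder "; r
```

Because the carry is always smaller than the divisor, each quotient digit stays in 0..9, so any divisor that fits comfortably in a ULong works without factoring it into single digits.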
MrSwiss
Posts: 3910
Joined: Jun 02, 2013 9:27
Location: Switzerland

Re: ForumBlobPost - Utility (Ver.0.4)

Post by MrSwiss »

srvaldez wrote:hello MrSwiss
I tried your program with your example encoded post and it produces the file P2140875_small.jpg but I can't open the file for viewing, I get the error

The file “P2140875_small.jpg” could not be opened.
Hi srvaldez,
have you used the latest ver. of the Utility, 0.51? (NOT 0.5 or earlier)
This is a MUST, since the encoder was modified, in between versions.
dodicat wrote:What would be wrong with base 96, starting at ascii 32.
Hi dodicat,
as I've tried to explain before: we need some valid characters (at least one, for the time being)
as delimiters, outside of the encoding range in use.

This is so that the original filename can be "inserted" before the encoded string part.
That requirement came from leopardpm and BasicCoder2.
(The idea behind it: automatic output file naming by the Utility, without any user
interaction.)

However, in the future, we might need another delimiter for other purposes ...
It seems to be good practice to leave some options (as yet undefined) for later
additions, alterations, whatever ... this has led me to start at Chr(35), leaving
32, 33, 34 for other purposes, while they're still valid (but outside the encoding range).
srvaldez
Posts: 3379
Joined: Sep 25, 2005 21:54

Re: ForumBlobPost - Utility (Ver.0.4)

Post by srvaldez »

hello MrSwiss
yes, the latest version works ok. :-)