Weird behavior in fixed length strings

General FreeBASIC programming questions.
deltarho[1859]
Posts: 4308
Joined: Jan 02, 2017 0:34
Location: UK

Re: Weird behavior in fixed length strings

Post by deltarho[1859] »

Lost Zergling wrote: I understand the size is only known at first allocation time, and then it vanishes?
Perhaps I am misreading you, but Len() and SizeOf() can be used as shown by fxm in post #8.
fxm
Moderator
Posts: 12107
Joined: Apr 22, 2009 12:46
Location: Paris suburbs, FRANCE

Re: Weird behavior in fixed length strings

Post by fxm »

- For a fixed-length string ('Dim s As String * N'), the 'LEN' (N) and the 'SIZEOF' (N+1) are known only at compile time.
- At runtime, when such a variable is used in expressions, its real length (useful for assignment, concatenation, ...) is estimated by looking for the first occurrence of the null character.
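
The two points above can be illustrated with a rough C analogue. This is only a sketch, not FreeBASIC's runtime code: the names `fixed_str`, `fixed_assign`, `fixed_declared_len` and `fixed_runtime_len` are hypothetical, and the buffer is assumed to be N+1 zero-filled bytes as described in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define N 20                        /* declared length, as in 'String * 20' */

/* Hypothetical stand-in for a fixed-length string: N characters + 1 null. */
typedef struct {
    char data[N + 1];
} fixed_str;

/* Zero-fill the buffer, then copy at most N bytes (embedded nulls allowed). */
static void fixed_assign(fixed_str *s, const char *src, size_t src_len)
{
    assert(s != NULL && src != NULL);
    memset(s->data, 0, sizeof s->data);
    memcpy(s->data, src, src_len < N ? src_len : N);
}

/* Compile-time view: the declared length, independent of the content. */
static size_t fixed_declared_len(const fixed_str *s)
{
    return sizeof s->data - 1;
}

/* Runtime view: scan for the first null character. */
static size_t fixed_runtime_len(const fixed_str *s)
{
    return strlen(s->data);
}
```

With the 10 bytes of "Alpha", a null and "Beta" assigned, the declared length stays 20 while the runtime scan reports 5, because it stops at the embedded null.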
deltarho[1859]
Posts: 4308
Joined: Jan 02, 2017 0:34
Location: UK

Re: Weird behavior in fixed length strings

Post by deltarho[1859] »

fxm wrote:- For a fix-len string ('Dim s As String * N'), the 'LEN' (N) or the 'SIZEOF' (N+1) is only known at compile time.
So it is ( :) ): “A String declared with a fixed size (numeric constant, or expression that can be evaluated at compile time) is a QB-style fixed length string, with the exception that unused characters are set to 0, regardless of what "-lang" compiler option is used. It has no descriptor and it is not resized to fit its contents.”

It follows then that with a fixed size, it does not have a 'real length'.
fxm wrote:- At runtime, when such a variable is used in expressions, its real length (useful for assignment, concatenation, ...) is estimated by looking for the first occurrence of the null character.

“first occurrence”?

You have now drifted into ZStrings.

Dim s As String * 20 = "Alpha" + Chr(0) + "Beta"

has two null characters; one that we have imposed upon it and a null terminator; with a Len of 10 and SizeOf of 11.
fxm
Moderator
Posts: 12107
Joined: Apr 22, 2009 12:46
Location: Paris suburbs, FRANCE

Re: Weird behavior in fixed length strings

Post by fxm »

Code: Select all

Dim s As String * 20 = "Alpha" + Chr(0) + "Beta"
Print s
Print "'" & s & "'"
Dim As String s1 = s
Print s1

Sleep

Code: Select all

Alpha Beta	
'Alpha'
Alpha
deltarho[1859]
Posts: 4308
Joined: Jan 02, 2017 0:34
Location: UK

Re: Weird behavior in fixed length strings

Post by deltarho[1859] »

Tch, tch. You are muddying the waters by pulling Strings into play. :)
fxm
Moderator
Posts: 12107
Joined: Apr 22, 2009 12:46
Location: Paris suburbs, FRANCE

Re: Weird behavior in fixed length strings

Post by fxm »

Only concatenation (with operator '&') or assignment (with operator '=').
deltarho[1859]
Posts: 4308
Joined: Jan 02, 2017 0:34
Location: UK

Re: Weird behavior in fixed length strings

Post by deltarho[1859] »

fxm wrote:Only concatenation (with operator '&') or assignment (with operator '=').
Ah, so only a slight muddying then. :lol:
SARG
Posts: 1765
Joined: May 27, 2005 7:15
Location: FRANCE

Re: Weird behavior in fixed length strings

Post by SARG »

More inconsistency.
I was wondering why this code:

Code: Select all

Dim s As String * 20 = "Alpha" + Chr(0) + "Beta"
Print s
Sleep
prints 'Alpha Beta'.
Logically it should print only 'Alpha', as there is a null between Alpha and Beta.

The trick is that a temporary variable-length string is created that points to the fixed-length string and takes the fixed length into account.

So the print is not stopped at the null as it should be.

An easy fix is to remove the test below and always use strlen. Maybe that would also suppress many weird VISIBLE behaviours with fixed-length strings.

Code: Select all

/* In str_tempdescf.c; str_size is the size defined at compile time. */

	/* can't use strlen() if the size is known */
	if( str_size != 0 )
		dsc->len = str_size - 1;			/* less the null-term */
	else
	{
		if( str != NULL )
			dsc->len = strlen( str );
		else
			dsc->len = 0;
	}

	dsc->size = dsc->len;
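
To compare the current behaviour with the proposed fix, the length selection above can be sketched as a self-contained C function. The `temp_desc` struct, the `make_temp_desc` name and the `always_strlen` switch are hypothetical stand-ins for illustration, not the actual rtlib declarations.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified temporary descriptor. */
typedef struct {
    const char *data;
    size_t len;
    size_t size;
} temp_desc;

/* str_size is the compile-time size (0 if unknown).
   always_strlen = 0 models the quoted code, 1 models the proposed fix. */
static temp_desc make_temp_desc(const char *str, size_t str_size,
                                int always_strlen)
{
    temp_desc dsc;

    /* Sketch assumption: a known size implies a real buffer. */
    assert(str != NULL || str_size == 0);

    dsc.data = str;
    if (str_size != 0 && !always_strlen)
        dsc.len = str_size - 1;              /* less the null terminator */
    else
        dsc.len = (str != NULL) ? strlen(str) : 0;

    dsc.size = dsc.len;
    return dsc;
}
```

For a 21-byte buffer holding "Alpha", a null and "Beta", the first policy yields a length of 20 (so everything is printed), while the second yields 5 (so output stops at the first null).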
dodicat
Posts: 7983
Joined: Jan 10, 2006 20:30
Location: Scotland

Re: Weird behavior in fixed length strings

Post by dodicat »

I seem to remember this from a few years ago.
We would like a fixed length string to retain the chr(0), because the zstring stops at any chr(0) if you wish that to happen.
A crude workaround (a crude cast to a variable-length string):

Code: Select all

#define str2(s) mid((s),1)

dim s1 as string*80=chr(0)+"Alpha"+chr(0)+"Beta"
print asc(s1)
print s1
print

dim s2 as string*80
s2=str2(chr(0)+"Alpha"+chr(0)+"Beta")
print asc(s2)
print s2
print

dim s3 as string
s3=chr(0)+"Alpha"+chr(0)+"Beta"
print asc(s3)
print s3
print

dim s4 as zstring*80
s4=str2(chr(0)+"Alpha"+chr(0)+"Beta")
print asc(s4)
print s4
print
sleep 
But it needs to be fixed at source.
Lost Zergling
Posts: 538
Joined: Dec 02, 2011 22:51
Location: France

Re: Weird behavior in fixed length strings

Post by Lost Zergling »

I think that if the goal is something really versatile between string and zstring, one solution would be to implement a 'micro string descriptor' specific to fixed-length strings. A fixed-length string would have a few bytes reserved as a data header just for the length, then the data, and then a chr(0). That way, it could really be interpreted as a string or as a zstring (via an offset), depending on context.
Such an implementation, halfway between strings and zstrings, would address the functional need for a structure that supports both kinds of optimisation when reading a string (testing for chr(0) and using a predefined length).
In such a data structure, one could imagine the chr(0) becoming optional or computed on demand.
This may require a serious redesign and would break backward compatibility for fixed-length strings.
I can hardly imagine another solution that properly gets rid of the chr(0) problem, if the objective is to go that way.
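
A minimal C sketch of such a layout, purely as an illustration of the idea: the `micro_str` type, its one-byte header and the helper names are all hypothetical, and nothing like this exists in FreeBASIC as far as this thread shows.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CAP 20                      /* declared capacity in characters */

/* Hypothetical 'micro descriptor': length header, data, trailing null. */
typedef struct {
    unsigned char len;              /* number of characters actually stored */
    char data[CAP + 1];             /* characters plus a terminating null */
} micro_str;

/* Store up to CAP bytes (embedded nulls allowed) and record the length. */
static void micro_assign(micro_str *s, const char *src, size_t n)
{
    assert(s != NULL && src != NULL);
    if (n > CAP)
        n = CAP;
    memset(s->data, 0, sizeof s->data);
    memcpy(s->data, src, n);
    s->len = (unsigned char)n;
}

/* 'String' view: trust the header, so embedded nulls are kept. */
static size_t micro_len(const micro_str *s)
{
    return s->len;
}

/* 'ZString' view: skip the header; readers stop at the first null. */
static const char *micro_cstr(const micro_str *s)
{
    return s->data;
}
```

Read through the header, "Alpha" + null + "Beta" has length 10; read as a zstring through `micro_cstr`, a scanning routine such as strlen sees only the first 5 characters.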
marcov
Posts: 3462
Joined: Jun 16, 2005 9:45
Location: Netherlands

Re: Weird behavior in fixed length strings

Post by marcov »

SARG wrote: Jul 08, 2022 8:49 So the print is not stopped at the null as it should be.
APIs often do not take zero-terminated data (Unix write(2) and WriteFile), so that is no surprise.

However, dealing with this needs extra strlen calls, and one could argue that code with embedded zeros is simply illegal (null-terminated strings should not contain nulls in the body of the string).

Especially with concatenation (often abused to create long strings such as HTML documents), the extra passes caused by strlen() might be noticeable.
SARG
Posts: 1765
Joined: May 27, 2005 7:15
Location: FRANCE

Re: Weird behavior in fixed length strings

Post by SARG »

marcov wrote: Jul 08, 2022 11:59 one could argue that code with embedded zeros is simply illegal (null-terminated strings should not contain nulls in the body of the string)
This seems to be the best solution.


From the manual :
Note: For the fixed-length string type only (QB-style fixed-length string), the 'Len()' keyword always returns the declared constant number of characters, regardless of the number of characters assigned to it by user.
(hence the formula: 'user_characters_length = IIf(InStr(s, Chr(0)) > 0, InStr(s, Chr(0)) - 1, Len(s))')

Code: Select all

dim s as string*80=chr(0)+"Alpha"+chr(0)+"Beta"
print "length=";IIf(InStr(s, Chr(0)) > 0, InStr(s, Chr(0)) - 1, Len(s));" !!!!"
print s
sleep
speedfixer
Posts: 606
Joined: Nov 28, 2012 1:27
Location: CA, USA moving to WA, USA

Re: Weird behavior in fixed length strings

Post by speedfixer »

I read, then re-read these string discussion threads over and over, again and again. Perhaps I am missing the point of these threads. Yes, once in a very great while, a bug is uncovered. But usually:

In terms of the concepts of human usage, strings are a very complex topic when attempting to implement them in binary programming. Our brain automagically does a lot of things for our language processing that we ASSUME is the correct/only way to handle strings.

FB has three types of strings. Clearly, the intent was to provide some flexibility without sacrificing too much speed, both in compilation and in use. There will never be an 'ideal' answer about how to implement strings, especially in a portable, relatively high-level language.

Perhaps the best answer to some of these never-ending discussions about how to 'improve' speed, handling or perceived anomalies is simply to make some other COMPLEX STRING type as a library add-on. Code it in C and asm with hints per platform to keep the speed up. (And then maintain it.) This could have all those little wrinkles of string handling and the big functions of more complex 'string' handling that really are convenient to have available when a lot of string processing is needed, but ONLY load it when desired.

That has been the answer for nearly every text editor/word processor ever written. They write their own or use someone else's lib. That is also why some people love this or that text management lib, and others *hate* that same lib. Choices are made: none are ideal.

But for programming - simple is better. Unless it is broken, don't fix it.
Personally, I like that a fixed len string can hold any chr() without consequences.
To paraphrase a man smarter than me: simple, but not too simple.

david
dodicat
Posts: 7983
Joined: Jan 10, 2006 20:30
Location: Scotland

Re: Weird behavior in fixed length strings

Post by dodicat »

Pascal's ansistring can incorporate chr(0)
var g:ansistring=chr(0)+'Alpha'+chr(0)+'Beta';
begin
writeln(ord(g[1]),' ',g);
end.
gives:
0 Alpha Beta
So why would marcov deny FB strings the same capability?
I think it is important that an FB string or fixed-length string has the ability to hold all 256 byte values.
marcov
Posts: 3462
Joined: Jun 16, 2005 9:45
Location: Netherlands

Re: Weird behavior in fixed length strings

Post by marcov »

dodicat wrote: Jul 08, 2022 17:18 Pascal's ansistring can incorporate chr(0)
Pascal's ansistring is not null-terminated but ends with a null, which is quite a difference. There is a real length field somewhere, and a null is just appended as a stopgap so that the string can be passed to C routines using a typecast (but C routines that scan for a null will terminate early in such a case).

As said, I think that even for C, with the advent of the -n- routines, pure null termination is somewhat relaxed. Static arrays that are filled to the brim don't need a terminating zero, because the -n- value bounds the scan, as with strnlen().
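
As a small illustration of that bounded scan, here is a sketch in C. To stay portable it uses memchr from the standard library rather than POSIX strnlen(), but it is meant to behave the same way; the `bounded_len` name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Length of the text in buf, never reading more than max bytes.
   Returns max when no null is found within the bound. */
static size_t bounded_len(const char *buf, size_t max)
{
    assert(buf != NULL);
    const char *nul = memchr(buf, '\0', max);
    return nul ? (size_t)(nul - buf) : max;
}
```

A five-byte array holding exactly 'A','l','p','h','a' with no terminator gives 5, and a partly filled array gives the position of its first null, so no read ever goes past the array bound.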