Freebasic 1.20.0 Development

Arachnophilia
Posts: 26
Joined: Sep 04, 2015 13:33

Re: Freebasic 1.20.0 Development

Post by Arachnophilia »

adeyblue wrote: Feb 27, 2024 6:34 Why?
Who is this for?

This doesn't affect me so whatever, but I'm gonna be honest:
potentially throwing your entire current user base under the bus and changing the ABI for some potential future code or mythical people who are going to port from a different language they're currently using sounds incredibly, incredibly stupid.

It better be worth it. You're flushing a lot of trust down the drain if usage is more widespread than you guess at.
P-i-s-s-ing off everybody would be a good exit strategy though. Actually, yes, I approve of that. Give 'em hell Jeff then slink off into the night.
Hey adeyblue
Why so angry?
If Jeff didn't want an exchange about the pros and cons, he wouldn't have initiated a discussion.
You're defending your point of view. That's fine too.
But your post gives me no insight into the actual topic.
In fact, the post should be deleted so as not to nip a serious discussion in the bud.
That's just my humble opinion.
coderJeff
Site Admin
Posts: 4332
Joined: Nov 04, 2005 14:23
Location: Ontario, Canada

The only thing I immediately disagree with is the idea that I would slink off. Pretty sure I consistently own up to all my failures.

Why? - I think I need to do this first
Who is this for? - Me, and hopefully others find it useful.

At the moment, the existing unfinished STRING*N feels like it is in my way.
TODO.txt from 2006 and 2012 wrote:
[ ] All functions returning STRING should actually return the FBSTRING object
- it must be coded in plain C to avoid C++ dependencies
- compiler has to allocate the descriptor as it does now following the gcc ABI
- any function in the run-time library returning strings will have to be modified (chicken-egg problem)
- allocated on stack instead of using temp descriptors,
- better with threads, as no more locking needed in the rtlib
- allows STRING results to be passed between multiple rtlibs without memory leaks (e.g. returned from DLL)
- no more STR_LOCK's
- no more checks for temp descriptors in all rtlib procs taking STRINGs

[ ] fixed-len strings compatible with QB:
- no null-term, temporaries always created when passing to functions
- probably will need their own assign and concat functions

I want to eventually get to adding the var-len WSTRING and maybe a var-len UTF8STRING. I'd like to base it on the existing var-len STRING rtlib, and I don't want to copy a bunch of code that is unfinished or doesn't work, or else it will need to be repaired in multiple places. I tried doing the implementation as a class-only header, but there are really still a lot of supporting features needed to fully realize it. So I am trying a different approach to get var-len WSTRING going.
angros47
Posts: 2326
Joined: Jun 21, 2005 19:04

adeyblue wrote: Feb 27, 2024 6:34 This doesn't affect me so whatever, but I'm gonna be honest:
Well, that's the point: finding out who would be affected... or if anyone would ever be affected.

potentially throwing your entire current user base under the bus and changing the ABI for some potential future code or mythical people who are going to port from a different language they're currently using sounds incredibly, incredibly stupid.
Who would be thrown under the bus? Most people would not even notice the change. And the few who might be affected are likely the ones who used low-level access to data structures, and who are likely skilled enough to fix their programs in a few minutes.

Ok, you said "potentially"... but every single update could potentially throw some user under the bus: adding a new command? It would break any code that uses a variable with the same name. Fixing a bug? Would break the code of people who exploited that bug to achieve unusual results.

To use your own words, you want to prevent the fixing of a language issue because you are concerned about some potential issue or some mythical people who are going to be thrown under the bus.

It better be worth it. You're flushing a lot of trust down the drain if usage is more widespread than you guess at.
P-i-s-s-ing off everybody would be a good exit strategy though. Actually, yes, I approve of that. Give 'em hell Jeff then slink off into the night.
I think it will be worth it, and I already wrote some reasons why. But if you, or anyone else, are really so "p-i-s-s-e-d" about such a change, you always have a solution: you could stay with the previous version; updating the compiler is not mandatory. There isn't even an automated update.
Or you can make a fork of it, where you add all the updates except this one that you seem not to like. Then all the users who are unhappy with the change can use your version and claim you as their savior.
shadow008
Posts: 86
Joined: Nov 26, 2013 2:43

coderJeff wrote: Feb 26, 2024 23:10
shadow008 wrote: Feb 26, 2024 20:21
angros47 wrote: Feb 26, 2024 10:26 Another question: will be possible to use null bytes inside a fixed length string?
This would not make any sense as there is no corresponding character to the null (0) byte.
Not for printing but still as a null terminator -- in a list of null terminated strings.
On windows, REG_MULTI_SZ comes to mind. For c runtime, result of strtok (but only kind of).
Fundamentally the point of a string is to convey information to a human being. The purpose of a "string type" (of any kind) is therefore to facilitate that. A list of null-terminated strings inside a byte array is not a "string"; it is a data type. It's the job of a data type to facilitate data organization, and the job of a string to present something to a person. Hence my comment that it doesn't make sense: you'd be overloading what a null terminator represents to a person. As an aside, I didn't know what REG_MULTI_SZ was, but after looking it up, it's not a string type but a "format" (akin to a data type, as far as I can tell).

Null bytes can currently be used "inside" a fixed length string already. Of course it's not going to function properly with the laundry list of things you brought up, but this can be worked around with a user written library or even a UDT wrapper. I don't believe it would be the job of the compiler to support this unless fixed length strings become "syntax sugar around a character array", which violates the "purpose" of a string.
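
To make the "user written library" idea a bit more concrete, here is a minimal sketch in C of walking a REG_MULTI_SZ-style buffer, i.e. a run of null-terminated strings closed by an extra null. The function name `multi_sz_count` is made up for illustration and is not part of any FreeBASIC or Windows API; it treats the buffer purely as a byte array of known size.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count the entries in a list of NUL-terminated
   strings that ends with an empty string (an extra NUL).
   `size` is the total number of bytes in the buffer. */
static size_t multi_sz_count(const char *buf, size_t size)
{
    size_t count = 0;
    size_t pos = 0;

    /* An entry starting with NUL is the empty string that ends the list. */
    while (pos < size && buf[pos] != '\0') {
        const char *nul = memchr(buf + pos, '\0', size - pos);
        if (nul == NULL)
            break;                       /* malformed: entry is not terminated */
        count++;
        pos = (size_t)(nul - buf) + 1;   /* step past this entry's NUL */
    }
    return count;
}
```

The point of the sketch is only that the nulls here are separators in a format, so the container has to carry its own size; nothing in it depends on how STRING*N ends up being stored.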

I still think the most egregious aspect of this discussion of fixed length strings is that the word "string" is overloaded and context dependent ("string ptr" vs "string ptr"). That's the kind of thing my company refers to as "special case-y bullsh*t" and is bigtime frowned upon.
coderJeff
Site Admin
Posts: 4332
Joined: Nov 04, 2005 14:23
Location: Ontario, Canada

shadow008 wrote: Feb 27, 2024 18:55 Fundamentally the point of a string is to convey information to a human being.
Not to a machine or algorithm? What about control characters and escape sequences?

Is not 'zero terminated sequence of bytes' also a kind of format that the producer and consumer need to agree on?

Anyway, going back to your type-erased system: I misinterpreted your meaning, since, as I pointed out, the fbc compiler currently doesn't fully preserve or track the sizes of all pointed-to things. It could, but that requires much work.

Pointers to fixed-length strings are not supported: not string, nor zstring, nor wstring. If you actually understand this, then you should agree. To help understand, try to answer this: which fbc data type *best* represents a single 8-bit character, like if you want to convey the information of a single letter to a human being? Do we actually have a correct fbc data type for this purpose, for a single character? Or is it only correct in the context of other information? If you understand this, then I think you should realize there is no "string ptr" versus "string ptr".
VANYA
Posts: 1839
Joined: Oct 24, 2010 15:16
Location: Ярославль

Will this change affect null characters in the STRING type? Example:

Code: Select all

dim as string s = "bla" & chr(0) & "bla"
? Len(s)  ' 7


If it does, it will upset me. I really hope that null characters can continue to be used in the string type. But I also understand that if you have to make sacrifices to implement a dynamic UNICODE string, then there is no problem.
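
For readers less familiar with why Len(s) is 7 here: a var-len string carries its length separately from its bytes, so an embedded null is just another byte. A small C sketch of the difference follows; the names `vs_data`, `vs_stored_len` and `vs_scanned_len` are made up for illustration and are not the rtlib's.

```c
#include <stddef.h>
#include <string.h>

/* The same seven bytes as "bla" & chr(0) & "bla"; the C literal also
   gets one implicit trailing NUL, which is not part of the string. */
static const char vs_data[] = "bla\0bla";

/* Length as a descriptor would store it: every byte counts. */
static size_t vs_stored_len(void)
{
    return sizeof vs_data - 1;
}

/* Length as a NUL-scanning routine sees it: stops at the first NUL. */
static size_t vs_scanned_len(void)
{
    return strlen(vs_data);
}
```

The stored length is 7 and the scanned length is 3, which is the gap between a length-tracked string and a zstring-style one.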

As for the as string*N type, I almost never used it. I always tried to use zstring*N. But maybe for some this change will be a blow.
coderJeff
Site Admin
Posts: 4332
Joined: Nov 04, 2005 14:23
Location: Ontario, Canada

VANYA wrote: Feb 28, 2024 3:58 Will this change affect null characters in the STRING type? Example:
No change to STRING, therefore no changes to CHR(0) in STRING

Proposed change is for STRING*N only.
VANYA
Posts: 1839
Joined: Oct 24, 2010 15:16
Location: Ярославль

coderJeff wrote: Feb 28, 2024 9:47 No change to STRING, therefore no changes to CHR(0) in STRING

Proposed change is for STRING*N only.
ok!
angros47
Posts: 2326
Joined: Jun 21, 2005 19:04

I wonder, will the update fix this bug too?
viewtopic.php?t=31753
coderJeff
Site Admin
Posts: 4332
Joined: Nov 04, 2005 14:23
Location: Ontario, Canada

angros47 wrote: Feb 28, 2024 17:23 I wonder, will the update fix this bug too?
viewtopic.php?t=31753
Changes to string*n won't fix that bug directly, but would allow the possibility to fix that bug.
Lost Zergling
Posts: 538
Joined: Dec 02, 2011 22:51
Location: France

Hello everyone, it's been a while!
I wondered about the problem of pointers to fixed length strings. The basic type string is very efficient, this is what my tests with lzle (tagmode2) showed. I had studied two other approaches: zstring* (tagmode1) and a kind of micro descriptor (tagmode0) containing only the pointer and the length (variable but limited). As the goal is to make the most of the substitution of memory addresses to avoid fragmentation, managing the optimization of strings that are too long (or too short) can become complex: can we afford truncation? should we allow resizing and risk losing the benefit of implicit substitution (hence the limitation)? should we cut the string that is too long and create other linked strings (of standardized length) managed algorithmically?
Furthermore, depending on whether this string will be a long value, a short value, or a key, the optimization choice cannot be the same.
So in the end I arrive at three possible choices, each affecting the structure and therefore the overall operation, with sub-options for resizing and lengths.
We can add to this the recycling challenge (GC & related) : recycling of micro blocks is not necessarily the best option: it is often much faster to let the memory fragment into a predefined block to the extent that we know that the duration of operation and volume will not exceed a certain limit, the memory is freed when the sub function/object stops or dereferenced. It is the object design logic which will structure the nested operating memory volumes allocated according to the memory fragmentation tolerances.
String handling is therefore also tied to the instruction set, which brings handling and complexity issues of its own.
fxm
Moderator
Posts: 12133
Joined: Apr 22, 2009 12:46
Location: Paris suburbs, FRANCE

Could we process a fix-len string compared to a var-len string in the same way as a fix-len array compared to a var-len array:
'Dim s As String * N' or 'Dim s As String'
compared to:
'Dim a(N) As data' or 'Dim a() As data'

Overview
When a fix-len string is passed to a procedure, it is always declared 'Byref|Byval As String' (as for a var-len string) and a descriptor is also passed.
The same descriptor format could be used for both, with the third field (number of allocated bytes) set to '-N' (for example) for a fix-len string of length 'N'.
So the same procedure could be called on a fix-len or a var-len string, similarly to an array (tests and checks in the procedure body are executed at run-time only, as when processing an array).
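
A rough sketch in C of how a procedure body could branch on such a descriptor at run time. Everything here is hypothetical: the struct `fx_desc`, the function `fx_assign`, and the choice to pad a fix-len string with spaces are assumptions for illustration, not the actual rtlib layout or behaviour.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical descriptor with three fields: data pointer, length,
   and a third field that holds the allocated byte count for a var-len
   string or -N for a fix-len string of length N. */
typedef struct {
    char     *data;
    ptrdiff_t len;
    ptrdiff_t size;
} fx_desc;

/* Assign n bytes of src to the described string.
   Returns 0 on success, -1 if a var-len string would need to grow
   (reallocation is left out of this sketch). */
static int fx_assign(fx_desc *d, const char *src, size_t n)
{
    if (d->size < 0) {
        /* Fix-len: never reallocate; truncate or pad with spaces (assumed). */
        size_t cap = (size_t)(-d->size);
        size_t k = n < cap ? n : cap;
        memcpy(d->data, src, k);
        memset(d->data + k, ' ', cap - k);
        d->len = (ptrdiff_t)cap;
        return 0;
    }

    /* Var-len: needs room for the bytes plus a terminating NUL. */
    if (n + 1 > (size_t)d->size)
        return -1;
    memcpy(d->data, src, n);
    d->data[n] = '\0';
    d->len = (ptrdiff_t)n;
    return 0;
}
```

The sign test on the third field is the only run-time check the callee needs, which is the parallel with how one procedure can already take either kind of array.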
Last edited by fxm on Mar 01, 2024 17:10, edited 3 times in total.
Reason: Third field of the string descriptor set to “-N” for a fix-len string of length “N”.
coderJeff
Site Admin
Posts: 4332
Joined: Nov 04, 2005 14:23
Location: Ontario, Canada

fxm wrote: Mar 01, 2024 12:33 Could we process a fix-len string compared to a var-len string, in the same way than a fix-len array compared to a var-len array:
I think it is possible, though I haven't explored that optimization. It will take some incremental steps to get there.

Currently, fbc will generate a temporary descriptor, assign a copy of the fixed-length string/zstring to the byval/byref string, and, in the case of a byref string, copy the results back after the call. Encoding var-len/fixed and null terminator/no null terminator into the descriptor seems possible, but obviously all of rtlib needs to know how to handle the descriptor.
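
The copy-in/copy-back pattern just described could be sketched in C roughly as follows. This is an illustration only, not the code fbc emits: `tmp_desc`, `byref_string_proc`, `demo_upper` and `call_with_fixed` are made-up names, and padding with spaces on the way back is an assumption.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical temporary descriptor: pointer, length, allocated size. */
typedef struct {
    char  *data;
    size_t len;
    size_t size;
} tmp_desc;

/* Stand-in for any procedure taking a BYREF string. */
typedef void (*byref_string_proc)(tmp_desc *s);

/* Example callee: upper-cases the string in place. */
static void demo_upper(tmp_desc *s)
{
    for (size_t i = 0; i < s->len; i++)
        s->data[i] = (char)toupper((unsigned char)s->data[i]);
}

/* Pass an n-byte fixed-length buffer to proc through a temporary
   descriptor, then copy the result back. Returns 0 on success,
   -1 if the temporary could not be allocated. */
static int call_with_fixed(char *fixed, size_t n, byref_string_proc proc)
{
    tmp_desc t;
    size_t k;

    t.data = malloc(n + 1);
    if (t.data == NULL)
        return -1;
    memcpy(t.data, fixed, n);          /* copy in */
    t.data[n] = '\0';
    t.len = n;
    t.size = n + 1;

    proc(&t);                          /* callee only sees the temporary */

    k = t.len < n ? t.len : n;         /* copy back, truncating to n bytes */
    memcpy(fixed, t.data, k);
    memset(fixed + k, ' ', n - k);     /* pad with spaces (assumed) */
    free(t.data);
    return 0;
}
```

For example, `call_with_fixed(buf, 4, demo_upper)` on a 4-byte buffer holding `abc ` leaves `ABC ` in place; the allocation and the two copies per call are the cost a shared descriptor format would remove.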

If rtlib handling of temporary strings is removed and handled by fbc code generation, then the descriptor could potentially encode (in the ptr, len, and size fields): a pointer to a string; the size of the container; the length of the string in the container, or whether 'strlen()' must be used or 'strlen_s()' can be used to find the length; whether the length is variable (can be reallocated) or fixed; and whether a null terminator should be expected or written. Though this requires reserving the high bit of the len and size fields and limiting the total size to signed positive values, it avoids adding another field to the string descriptor. However, this is not a new limitation, since the current rtlib implementation uses the high bit to indicate a temporary descriptor handled by the runtime.
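
One way such an encoding might look, as a C sketch. The flag assignments are assumptions picked for illustration (high bit of len meaning "length not stored, scan for NUL", high bit of size meaning "fixed container"), and the `sk_` names are hypothetical; the actual rtlib may end up choosing something different.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SK_HIGH_BIT (~(SIZE_MAX >> 1))        /* only the top bit set */
#define SK_VALUE(x) ((x) & (SIZE_MAX >> 1))   /* field with the flag stripped */

/* Hypothetical descriptor using the ptr, len and size fields. */
typedef struct {
    char  *data;
    size_t len;   /* high bit set: length is not stored, scan for NUL  */
    size_t size;  /* high bit set: fixed container, never reallocate   */
} sk_desc;

static int sk_is_fixed(const sk_desc *d)
{
    return (d->size & SK_HIGH_BIT) != 0;
}

static size_t sk_capacity(const sk_desc *d)
{
    return SK_VALUE(d->size);
}

static size_t sk_length(const sk_desc *d)
{
    if (d->len & SK_HIGH_BIT) {
        /* Bounded scan: never reads past the container. */
        const char *nul = memchr(d->data, '\0', sk_capacity(d));
        return nul ? (size_t)(nul - d->data) : sk_capacity(d);
    }
    return SK_VALUE(d->len);
}

/* Describe a STRING*N-like buffer: fixed size, length always n. */
static sk_desc sk_wrap_fixed(char *buf, size_t n)
{
    sk_desc d = { buf, n, n | SK_HIGH_BIT };
    return d;
}

/* Describe a ZSTRING*N-like buffer: fixed size, length found by scanning. */
static sk_desc sk_wrap_zstring(char *buf, size_t n)
{
    sk_desc d = { buf, SK_HIGH_BIT, n | SK_HIGH_BIT };
    return d;
}
```

Masking with `SK_VALUE` is what limits usable sizes to the signed positive range, which matches the trade-off described above: no fourth field, at the cost of one bit per field.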

Side note:
For wstrings, fbc generates a kind of wstring descriptor internally, with just enough functionality to pass copies of wstrings and get function results. The problems to solve with wstrings and strings are virtually the same, except that the element size is different, but currently each has a varying implementation.
marcov
Posts: 3462
Joined: Jun 16, 2005 9:45
Location: Netherlands

coderJeff wrote: Mar 01, 2024 15:55 Side note:
For wstrings, fbc generates a kind of wstring descriptor internally, just enough functionality to pass copies of wstring's and get function results. The problems to solve with wstrings and strings are virtually the same except the element size is different but currently each has varying implementations.
If wide strings are to be COM compatible, they must be managed through the oleaut DLL functions.
coderJeff
Site Admin
Posts: 4332
Joined: Nov 04, 2005 14:23
Location: Ontario, Canada

Thanks everyone for the input. Have some ideas what to look out for going forward.

Changes for string*n are now merged to fbc/master
coderJeff wrote: Feb 25, 2024 15:23 STRING*N occupies N bytes of memory and has no terminating null character

- STRING*N will occupy exactly N bytes and will initialize variables and fields with spaces
- On assignment STRING*N pads the string with spaces and does not automatically add a null terminating character.
- SWAP will pad values with spaces where one of the arguments is of the STRING*N data type
- STRING*N arguments are copied to a temporary string when passed to ZSTRING ptr parameters because otherwise STRING*N will be lacking the implicit NUL termination character
- Due to the padding, LEN(STRING*N) == sizeof(STRING*N), always.
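
A small C model of the rules listed above, for anyone who wants to reason about them outside the compiler. It is a sketch under stated assumptions, not the rtlib code: `fixstr_assign` and `fixstr_to_ztemp` are hypothetical names, and truncating an over-long source on assignment is an assumption.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Model of assigning to a STRING*N: copy what fits, pad the rest
   with spaces, and write no terminating NUL. */
static void fixstr_assign(char *dst, size_t n, const char *src, size_t src_len)
{
    size_t k = src_len < n ? src_len : n;   /* truncation is assumed */
    memcpy(dst, src, k);
    memset(dst + k, ' ', n - k);
}

/* Model of passing a STRING*N to a ZSTRING ptr parameter: the N bytes
   are copied to a temporary that does have a NUL terminator.
   The caller frees the result; returns NULL if allocation fails. */
static char *fixstr_to_ztemp(const char *src, size_t n)
{
    char *tmp = malloc(n + 1);
    if (tmp == NULL)
        return NULL;
    memcpy(tmp, src, n);
    tmp[n] = '\0';
    return tmp;
}
```

Because all N bytes are always filled, the length of the temporary equals N, which is the same observation as LEN(STRING*N) == sizeof(STRING*N).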