[SOLVED] Standard character Set for all OS

Linux specific questions.
exagonx
Posts: 314
Joined: Mar 20, 2009 17:03
Location: Italy

Post by exagonx »

Salutations,
I've tried various solutions including changing the character set in the terminal, but in the end it still doesn't work.

The characters I use for DOS and Windows are not displayed under Linux. Drawing the same interface with "LINE" works in a graphical environment but shows nothing when run without X, so I am trying to find a character set that is in common between DOS and Linux.
Is there anything?

Thank you for your attention.

PS: these are the character codes:

' Box-drawing character codes as laid out in DOS code page 437
' Single-line box
dim as integer STopLeft = 218, SVerLine = 179, STopRight = 191, SBottLeft = 192, SBottTLine = 193, STopTLin = 194
dim as integer SOriLine = 196, SCrossLin = 197, SBottRight = 217
dim as integer SLeftTLine = 180, SRightTLine = 195
' Double-line box
dim as integer DRightTLine = 185, DVerLine = 186, DTopRight = 187, DBottRight = 188
dim as integer DBottLeft = 200, DTopLeft = 201, DBottTLine = 202, DTopTLine = 203
dim as integer DLeftTLine = 204, DOriLine = 205, DCrosLine = 206
' Shade blocks (light, medium, dark) and full block
dim as integer CursA = 176, CursB = 177, CursC = 178, CursD = 219
' Lower half block, upper half block and full block, used for shadows
dim as integer ShadowADown = 220, ShadowAUp = 223, ShadowVert = 219
Last edited by exagonx on Feb 03, 2022 21:43, edited 1 time in total.
Munair
Posts: 1286
Joined: Oct 19, 2017 15:00
Location: Netherlands

Re: Standard character Set for all OS

Post by Munair »

On Linux you have to use unicode characters. Have a look at this table.

In the terminal you can try the characters by pressing Ctrl+Shift+U followed by the Unicode code point, e.g. U02E7 (must be hexadecimal).

Re: Standard character Set for all OS

Post by Munair »

In the table linked above, you will find the desired symbols in the hex 2500 range. I used U2554, U2550 and U2557 (╔, ═ and ╗) in the terminal:
[Image]

In order to use this in FB you might want to use a UTF-8 library, such as the one provided here:
viewtopic.php?f=7&t=26170
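As a rough illustration of the hex 2500 range mentioned above, here is a minimal sketch that turns a Unicode code point into its UTF-8 byte sequence and prints a double-line top border. The helper name Utf8Chr is hypothetical (it is not part of FB or of the linked library), and the sketch assumes the terminal is set to UTF-8.

```freebasic
' Hypothetical helper: encode one Unicode code point as a UTF-8 byte string
Function Utf8Chr(ByVal cp As ULong) As String
    If cp < &h80 Then
        Return Chr(cp)
    ElseIf cp < &h800 Then
        Return Chr(&hC0 Or (cp Shr 6), &h80 Or (cp And &h3F))
    ElseIf cp < &h10000 Then
        Return Chr(&hE0 Or (cp Shr 12), _
                   &h80 Or ((cp Shr 6) And &h3F), _
                   &h80 Or (cp And &h3F))
    Else
        Return Chr(&hF0 Or (cp Shr 18), _
                   &h80 Or ((cp Shr 12) And &h3F), _
                   &h80 Or ((cp Shr 6) And &h3F), _
                   &h80 Or (cp And &h3F))
    End If
End Function

' Build a top border from U+2554, ten times U+2550, and U+2557
Dim As String top = Utf8Chr(&h2554)
For i As Integer = 1 To 10
    top &= Utf8Chr(&h2550)
Next
top &= Utf8Chr(&h2557)
Print top
```

Building the bytes at run time keeps the result independent of how the editor saved the source file.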
jj2007
Posts: 2326
Joined: Oct 23, 2016 15:28
Location: Roma, Italia

Re: Standard character Set for all OS

Post by jj2007 »

Munair wrote: Jan 31, 2022 6:50 On Linux you have to use unicode characters. Have a look at this table.
Nice table, very useful!

Re "unicode characters": In practice, this means you need either a UTF-16 or a UTF-8 encoding. The Internet uses UTF-8.

Be prepared for a mess, and for some reading. I won't write it up for you, but googling "UTF-16 UTF-8 encoding character set installed fonts" helps. Re "to use this in FB you might want to use an UTF8 library": even in standard FB you can use this, provided your editor can write UTF-8 to the compiler:

Print String(20, str("x"))	' String(20, "x") chokes if there is a UTF-8 BOM
Print Left("Добро пожаловать", 5);	' combined output of the three rows:
Print Mid("Добро пожаловать", 6, 5);	' Добро пожаловать (Russian for "Welcome")
Print Right("Добро пожаловать", 6)
Print "That was a Russian text printed as UTF-8"

Re: Standard character Set for all OS

Post by Munair »

jj2007 wrote: Jan 31, 2022 8:29 Re "to use this in FB you might want to use an UTF8 library": even in standard FB you can use this, provided your editor can write UTF-8 to the compiler
Yes, that's right, and standard string variables will do, provided you keep in mind that characters are up to 3 bytes long (UTF8), so you won't be able to do standard (byte) string manipulation on them. Geany IDE on Unix systems natively supports UTF8, but I'm not sure about Windows (the whole world uses UTF8 these days, except Windows...).
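To make the byte-versus-character point concrete, here is a minimal sketch. Utf8Len is a hypothetical helper, not a built-in; it counts characters by skipping UTF-8 continuation bytes (those of the form 10xxxxxx) and assumes the string holds valid UTF-8.

```freebasic
' Hypothetical helper: count UTF-8 characters by ignoring continuation bytes
Function Utf8Len(ByRef s As Const String) As Integer
    Dim As Integer n = 0
    For i As Integer = 0 To Len(s) - 1
        If (s[i] And &hC0) <> &h80 Then n += 1
    Next
    Return n
End Function

' U+2554, U+2550 and U+2557 written out as explicit UTF-8 bytes
Dim As String box = Chr(&hE2, &h95, &h94) & Chr(&hE2, &h95, &h90) & Chr(&hE2, &h95, &h97)

Print Len(box)       ' 9 (bytes)
Print Utf8Len(box)   ' 3 (characters)
```

Functions such as Left, Mid and Right count bytes in an ordinary string like this one, so cutting at an arbitrary position can split a character in the middle.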

Re: Standard character Set for all OS

Post by jj2007 »

Munair wrote: Jan 31, 2022 8:35 Geany IDE on Unix systems natively supports UTF8, but I'm not sure on Windows
It works with RichMasm, but I have no idea about other editors.
TJF
Posts: 3809
Joined: Dec 06, 2009 22:27
Location: N47°, E15°

Re: Standard character Set for all OS

Post by TJF »

One UTF-8 character is up to four bytes long.

As long as you're operating in the base range (0-127), you can use standard fbc string manipulation. In order to handle any kind of character, find a complete set of string manipulation functions in GLib.
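A small sketch of how one might check that a string stays within the base range before applying byte-wise string functions. IsAscii is a hypothetical name, not an fbc or GLib function.

```freebasic
' Hypothetical helper: True if every byte of s is in the 7-bit range 0-127
Function IsAscii(ByRef s As Const String) As Boolean
    For i As Integer = 0 To Len(s) - 1
        If s[i] > 127 Then Return False
    Next
    Return True
End Function

Dim As String plain = "+----+"
Dim As String fancy = Chr(&hE2, &h95, &h94)   ' UTF-8 bytes of U+2554

Print IsAscii(plain)   ' all bytes are below 128
Print IsAscii(fancy)   ' contains bytes above 127
```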

On Windows each Scintilla-based editor should be able to handle UTF-8. Like Geany, which is available for Windows as well. There's the UnxUtils project with helpful tools like grep, so that Geany gets as powerful as on Linux.

Re: Standard character Set for all OS

Post by Munair »

TJF wrote: Jan 31, 2022 13:29 One UTF-8 character is up to four bytes long.
That's true, but four-byte UTF8 characters aren't used that often, i.e. they don't cover any modern language.

Overview of 4-byte characters: https://design215.com/toolbox/utf8-4byte-characters.php

Re: Standard character Set for all OS

Post by exagonx »

TJF wrote: Jan 31, 2022 13:29 One UTF-8 character is up to four bytes long.

As long as you're operating in the base range (0-127), you can use standard fbc string manipulation. In order to handle any kind of character, find a complete set of string manipulation functions in GLib.

On Windows each Scintilla-based editor should be able to handle UTF-8. Like Geany, which is available for Windows as well. There's the UnxUtils project with helpful tools like grep, so that Geany gets as powerful as on Linux.
Thank you, TJF.
Now I understand why console applications use no graphics characters.
marcov
Posts: 3455
Joined: Jun 16, 2005 9:45
Location: Netherlands

Re: [SOLVED] Standard character Set for all OS

Post by marcov »

Notepad has supported UTF-8 and Unix line endings for a year or two.

Re: Standard character Set for all OS

Post by TJF »

exagonx wrote: Feb 03, 2022 21:47 Now I understand why console applications use no graphics characters.
You can use each and every character from UTF-8 in your console app. But when your user sets up a different encoding in his terminal settings, both of you will end up in encoding hell.

That's why experienced console app developers concentrate on the basics: 7 bit ASCII encoding (= UTF-8 base page).
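A minimal sketch of the 7-bit approach: a box drawn with only "+", "-" and "|", which should render the same under any of the encodings discussed here. AsciiBox is a hypothetical name, and the sketch assumes a width and height of at least 2.

```freebasic
' Hypothetical helper: draw a w x h box using only 7-bit ASCII characters
Sub AsciiBox(ByVal w As Integer, ByVal h As Integer)
    Dim As String edge = "+" & String(w - 2, "-") & "+"
    Print edge
    For row As Integer = 1 To h - 2
        Print "|" & Space(w - 2) & "|"
    Next
    Print edge
End Sub

AsciiBox(20, 5)
```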

Re: Standard character Set for all OS

Post by Munair »

TJF wrote: Feb 04, 2022 15:22 That's why experienced console app developers concentrate on the basics: 7 bit ASCII encoding (= UTF-8 base page).
Which answers the original question:
exagonx wrote: Dec 27, 2021 12:04 a character set that is in common between DOS and Linux.
Is there anything?
Unfortunately, the "extended ASCII" characters 128-255 shown in the initial example do not qualify. This was already a problem in the DOS era, because "extended" ASCII also changed with the selected code page, e.g. 437 (US) versus 850 (Multilingual Latin-1), which is why DOS apps run on different machines could show strange characters in their text interface, especially window corners and joint characters.

Re: Standard character Set for all OS

Post by exagonx »

Munair wrote: Feb 04, 2022 15:44 Unfortunately, the "extended ASCII" characters 128-255 shown in the initial example do not qualify. This was already a problem in the DOS era, because "extended" ASCII also changed with the selected code page, e.g. 437 (US) versus 850 (Multilingual Latin-1), which is why DOS apps run on different machines could show strange characters in their text interface, especially window corners and joint characters.
Thank you for the confirmation.

I understood this, but I was hoping that someone had found a solution that would detect the environment and select the character codes suitable for the machine in use.

Something like:

#ifdef __FB_LINUX__

Okay, knowing this, I will do without graphics characters; with SCREEN I can simulate the boxes using DRAW LINE and other graphics functions.
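For reference, the conditional-compilation idea hinted at above could be sketched roughly as follows. __FB_LINUX__ is one of FreeBASIC's intrinsic defines; the string variable names are hypothetical. It assumes a UTF-8 terminal on Linux and a code page 437 console elsewhere, which, as noted earlier in the thread, cannot be guaranteed on the user's machine.

```freebasic
#ifdef __FB_LINUX__
    ' Assumption: UTF-8 terminal. Bytes of U+2554, U+2550, U+2557.
    Dim As String DTopLeftS  = Chr(&hE2, &h95, &h94)
    Dim As String DOriLineS  = Chr(&hE2, &h95, &h90)
    Dim As String DTopRightS = Chr(&hE2, &h95, &h97)
#else
    ' Assumption: code page 437 console. Codes taken from the first post.
    Dim As String DTopLeftS  = Chr(201)
    Dim As String DOriLineS  = Chr(205)
    Dim As String DTopRightS = Chr(187)
#endif

' Same drawing code on both platforms
Dim As String top = DTopLeftS
For i As Integer = 1 To 10
    top &= DOriLineS
Next
top &= DTopRightS
Print top
```

This only selects the characters at compile time; it does not detect which encoding the terminal actually uses at run time.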

Re: Standard character Set for all OS

Post by jj2007 »

TJF wrote: Feb 04, 2022 15:22 You can use each and every character from UTF-8 in your console app. But when your user sets up a different encoding in his terminal settings
On Windows, just use chcp 65001.