Wakka/fbdoc parsing discrepancies

Forum for discussion about the documentation project.
counting_pine
Site Admin
Posts: 6323
Joined: Jul 05, 2005 17:32
Location: Manchester, Lancs

Wakka/fbdoc parsing discrepancies

Post by counting_pine »

I'm mainly starting a thread here to document these discrepancies, either to warn wiki editors or perhaps to invite fixes for fbdoc.

I'm mainly coming across this now because I'm hoping to get a working version of the French documentation (which uses some extended characters) in wakka format.

There are some differences in table parsing: {{table columns="..." cellpadding="..." cells="..."}}
The cells= param contains the table data, using ';' to separate cells. HTML escape sequences (e.g. &amp; &eacute;) aren't allowed as-is because of the way it's parsed (';' is the cell separator), but it does allow &amp and &eacute (without the ';'). I'd guess this means wakka is checking for a hard-coded set of HTML escapes and appending the ';'.

Anyway, this is supported in the online version but not in fbdoc, which just copies them verbatim. So it looks like literal non-ASCII characters will have to be used instead, e.g. "é". Or we could switch to the meta HTML format as seen in www.freebasic.net/wiki/CptAscii/raw.

fbdoc could maybe be fixed by scanning for these codes itself and adding a ';' to them, as I suspect Wakka does.
Though another question is how the regular escapes are emitted in non-HTML formats e.g. fbhelp. I don't know what it does currently or whether it's been an issue, but it must presumably convert them to ASCII equivalents, if only in the case of www.freebasic.net/wiki/CptAscii.
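
As a rough illustration of that idea (Python rather than fbdoc's actual code, and with an entity list and helper name I've invented for the example), the normalization pass might look something like this:

    import re

    # Hypothetical helper: rewrite escapes written without the trailing ';'
    # (e.g. "&eacute") as standard HTML escapes ("&eacute;").
    # The set of recognized names here is made up for the example.
    KNOWN_ENTITIES = {"amp", "lt", "gt", "quot", "eacute", "egrave", "agrave", "ccedil"}

    def normalize_escapes(cell_text):
        # "&name" with or without a ';' becomes "&name;" when the name is known;
        # anything else is left untouched.
        def fix(m):
            name = m.group(1)
            return "&" + name + ";" if name in KNOWN_ENTITIES else m.group(0)
        return re.sub(r"&([A-Za-z]+);?", fix, cell_text)

    # In a {{table}} action the cells= value uses ';' as the cell separator,
    # so escapes have to be written without their own semicolon:
    cells = "caf&eacute;50 &amp over"       # two cells: "café" and "50 & over"
    print([normalize_escapes(c) for c in cells.split(";")])
    # -> ['caf&eacute;', '50 &amp; over']

An HTML back end could then emit the normalized escapes as-is; a text back end would still need to turn them into literal characters, as per the fbhelp question above.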

Note: I changed some docs recently to remove unfriendly characters and added a few escapes. I may have to think about reverting them now with this in mind.
Last edited by counting_pine on Dec 19, 2011 7:47, edited 2 times in total.
counting_pine
Site Admin
Posts: 6323
Joined: Jul 05, 2005 17:32
Location: Manchester, Lancs

Post by counting_pine »

(Another tidbit worth knowing is that the {{table}}s can support minor formatting using HTML tags, e.g. <b></b>. Unfortunately fbdoc doesn't support this either, desirable though it may sometimes be.
Implementing these may be overkill, particularly since it still doesn't give as much control as we might like over the formatting in them.
So tables where formatting is really desired are probably best upgraded as with www.freebasic.net/wiki/CptAscii/raw, as mentioned above.)
coderJeff
Site Admin
Posts: 4313
Joined: Nov 04, 2005 14:23
Location: Ontario, Canada

Re: Wakka/fbdoc parsing discrepancies

Post by coderJeff »

I almost understand. It would be helpful, I think, if you can give me a short example of what it is that you want to have working for both the on-line wiki and fbdoc. St_W made some contributions to {{table cells=}} parsing last October, but I don't think that's documented anywhere.
counting_pine
Site Admin
Posts: 6323
Joined: Jul 05, 2005 17:32
Location: Manchester, Lancs

Re: Wakka/fbdoc parsing discrepancies

Post by counting_pine »

I’m not sure now if there was anywhere that I wanted to be able to format within tables, or whether the info was genuinely “worth knowing”. I possibly just wanted to make sure it was documented somewhere.

Sadly, I never did get the French documentation properly wikified...
coderJeff
Site Admin
Posts: 4313
Joined: Nov 04, 2005 14:23
Location: Ontario, Canada

Re: Wakka/fbdoc parsing discrepancies

Post by coderJeff »

counting_pine wrote:
fbdoc could maybe be fixed by scanning for these codes itself and adding a ';' to them, as I suspect Wakka does.
Though another question is how the regular escapes are emitted in non-HTML formats e.g. fbhelp.
The wiki engine recognizes all of Unicode plus some HTML in tables. Raw pages are stored with UTF-8 encoding, so it is possible to use all of Unicode directly in the on-line wiki.

The current fbdoc converter recognizes &amp; &gt; &lt; &quot; and numeric references like &#999;, with or without the semi-colon. It currently expects all character codes to be 0 to 255 decimal only.

Current issues are:
- fbdoc handles all text as single-byte characters
- fbdoc should read raw wakka files with UTF-8 encoding and work with the data internally either as UTF-8 or as wstring
- the txt format will generally look OK when loaded in a UTF-8 capable viewer, unless one of the bytes in a UTF-8 character conflicts with either wakka formatting or fbhelp control codes
- for a non-UTF-8 capable viewer, we may need to specify an output encoding for txt: either UTF-8, or replace non-ASCII characters with something else (a rough sketch of the latter follows below)
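
A minimal sketch of what that last option could look like (Python for brevity, since fbdoc itself is FreeBASIC; the fallback table is invented for illustration):

    # Read a raw wakka page as UTF-8, then optionally fall back to ASCII for
    # the txt output. The replacement table is illustrative only.
    ASCII_FALLBACK = {"é": "e", "è": "e", "à": "a", "ç": "c", "«": '"', "»": '"'}

    def load_raw_page(path):
        with open(path, "r", encoding="utf-8") as f:   # raw pages are stored as UTF-8
            return f.read()

    def to_ascii_txt(text):
        # Keep ASCII characters as-is, map known accented characters,
        # and mark anything unrecognized rather than emitting raw UTF-8 bytes.
        return "".join(ch if ord(ch) < 128 else ASCII_FALLBACK.get(ch, "?")
                       for ch in text)

    print(to_ascii_txt("déjà vu"))   # -> deja vu
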
coderJeff
Site Admin
Posts: 4313
Joined: Nov 04, 2005 14:23
Location: Ontario, Canada

Re: Wakka/fbdoc parsing discrepancies

Post by coderJeff »

Update: I have added info to FBWikiFormatting to describe the escapes in the {{table}} action.

What I noticed in testing, and maybe this is what is mentioned in the opening post, is that using the escapes outside of a table's cells appears OK in the on-line wiki and in CHM/HTML fbdoc generation, but not in the text generator. The HTML escapes are just that, passed through as HTML. So for the TXT version, we need to un-escape the HTML into plain text.
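
A minimal sketch of that un-escaping step (shown in Python only to illustrate the idea; the actual change would be in fbdoc's text emitter):

    import html

    def escapes_to_txt(s):
        # html.unescape turns "&amp;", "&eacute;", "&#233;" and so on back into
        # literal characters, which is roughly what the TXT generator needs to do
        # instead of passing the escapes through verbatim.
        return html.unescape(s)

    print(escapes_to_txt("Fish &amp; chips at the caf&eacute; (&#233;)"))
    # -> Fish & chips at the café (é)
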
St_W
Posts: 1618
Joined: Feb 11, 2009 14:24
Location: Austria

Re: Wakka/fbdoc parsing discrepancies

Post by St_W »

The small fixes I made some months ago were about escape sequences in tables. The parser previously required non-standard HTML escape sequences, which differed from the standard ones by a missing trailing semicolon. After my changes, standard escape sequences like "&quot;" are allowed (the old form "&quot" is still allowed for backwards compatibility), and I changed every occurrence of the old (non-standard) escape sequences I found in the wiki to the new (standard) ones.
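
To illustrate the parsing wrinkle this addresses (my own reconstruction in Python, not the wiki's actual PHP code): once standard escapes are allowed, the ';' that terminates an escape must not be treated as a cell separator.

    import re

    ESCAPE = re.compile(r"&(?:#\d+|[A-Za-z]+);")   # standard "&name;" / "&#NNN;" forms

    def split_cells(cells_value):
        # Walk the cells= string: copy whole standard escapes (semicolon included),
        # and treat any other ';' as a cell separator.
        cells, current, i = [], "", 0
        while i < len(cells_value):
            m = ESCAPE.match(cells_value, i)
            if m:
                current += m.group(0)
                i = m.end()
            elif cells_value[i] == ";":
                cells.append(current)
                current = ""
                i += 1
            else:
                current += cells_value[i]
                i += 1
        cells.append(current)
        return cells

    print(split_cells("say &quot;hi&quot;;caf&eacute;;done"))
    # -> ['say &quot;hi&quot;', 'caf&eacute;', 'done']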

I only tested the wakka -> CHM conversion, though, so there might be issues with other conversions.