compile-time bindgen for FB
Hi,
FB bindings for C/C++ are a recurring issue, something I've been personally interested in, and spent a lot of time on. But no matter how good the .bi files are, they're always outdated...
How about making a binding generator for FB that runs during .bas file compilation, like #include_c "foo.h"? (inspired by rust-bindgen)
- no more .bi files to maintain
- bindings never outdated, no more ABI mismatch
- can & must use original .h files of the libraries, including standard C/gcc-specific headers, which is no problem e.g. with MinGW or Linux, because they include the .h files anyways
- fbc/the bindgen tool can come with a "database" of translation hints, for parts that can't be handled automatically
Technically speaking, I expect that binding generation for a single target, i.e. only one operating system + architecture and only one preprocessor path (#ifdefs), is much easier to implement than multi-target bindings. Plus it could use libclang for C/C++ parsing.
I've done some experiments with integrating libclang into fbfrog, and it looks like it would work. But really a new tool like fbbindgen should be written as part of the fbc source tree, perhaps at contrib/bindgen/, to ensure it will be maintained and can be integrated with fbc. I'm prepared to do some work on this based on my experience with fbfrog.
Would anyone else be interested in working on this?
Re: compile-time bindgen for FB
Such a generator would have to deal with C code that's hard to translate, like macros. Not sure if a database will be helpful there (perhaps to some degree). In general, looking at the various TODOs that FBFROG outputs, it is not going to be an easy task and one probably ends up hacking bi files to make a project compile.
Although I'm new to this community, I noticed that several projects were abandoned, some dating back to 2010-2012. It seems more a lack of interest/motivation to maintain the code. I experienced this first hand with specific GTK bindings. When asking for it, I was redirected to outdated software that doesn't even run anymore. So even though answers are given, they are not very helpful if one wants to compile FB code against more recent libraries.
It is my opinion that code maintenance will always be necessary no matter how much we try to automate things. But it doesn't have to be a monstrous task. Updating header files every now and then will suffice to provide reasonably up-to-date code.
That being said, FbFrog is already a very helpful tool. Unfortunately my knowledge of the C language isn't good enough, so I put my focus on other projects that could benefit FB.
Last edited by Munair on Dec 09, 2017 17:57, edited 2 times in total.
Re: compile-time bindgen for FB
A reliably working .h-to-.bi translator would be really great! Taken to its logical conclusion, this would mean that we could drop most of the .bi files from the FB installation package.
Although I've already translated some C headers (and some small programs, too) "by hand" I'm not sure if my skills are good enough to support such an ambitious project, but anyhow I'd like to give it a try. If I find the requirements are beyond my capabilities, at least I can do some testing.
I'm in there.
Re: compile-time bindgen for FB
I like the idea of automating things and making things simpler, so I also like the idea of creating a binding generator for FB. Instead of maintaining the header files, this would mean maintaining the bindgen tool and the translation hints, which should be simpler and less work.
Unfortunately I've never used Rust or its rust-bindgen tool (yet), so I don't really know how that works, but I'll definitely take a look.
With "fbfrog" you've already created a great tool that simplifies the binding creation process. To be honest I don't know any of the implementation details, but I'm wondering why a complete rewrite is needed or wanted, or in what respects the new tool should differ so fundamentally that a rewrite is required?
I'm also interested in the integration of the tool into fbc. At which level should the two communicate: do you plan to emit FB sources (like .bi files), or is your intent to integrate it at a deeper (e.g. AST) level? I've thought about some downsides too: integrating the tool into fbc would bloat the compiler and make it even more of a "make" tool (instead of a compiler only) than it already is. Depending on the level of integration it could also make finding binding-related errors more difficult. Of course it could also be implemented in a lightweight manner by simply launching bindgen as an external tool (like the resource compiler, gcc, as, ...).
I did some experiments with libclang in FB a few years ago and noted that the C API is a bit limited compared to its C++ interface, which could cause some problems. Apart from that, clang is definitely a great C/C++ parser, and using it would probably be a lot easier than writing and maintaining your own parser, like you did with fbfrog IIRC.
I'm not sure how much I'd be able to help with this, but I'm definitely interested in contributing to this project.
Re: compile-time bindgen for FB
Well, I've been experimenting with libclang in fbfrog (after finding a bug in fbfrog's custom CPP), and making it single-target only. It's close to a full rewrite: by using libclang, most of the existing C parsing code becomes obsolete and must be replaced with a libclang AST parser. And being single-target and producing only a temporary .bi file obsoletes lots of the fbfrog binding generation/beautification options and obviously the binding merging. fbfrog has simply grown in a different direction (making permanent bindings).
So I think there are a few parts from fbfrog that can be re-used (like the custom C expression parser for parsing #define bodies, which libclang apparently only exposes as tokens), but other than that probably only some of the ideas/lessons learned, and not much other code. That's why I thought it might be easier to start from scratch - or at least come up with a design goal that's not limited by fbfrog's existing architecture.
I think it would be good to have a separate fbbindgen program that is invoked by fbc to generate a temporary .bi file, instead of integrating a libclang-based parser directly into fbc.
* We'll just have to see whether it's fast enough, but it will definitely be easier to develop and test.
* It will probably require a custom AST/IR to generate FB code, but this could be inspired by fbfrog, and can (and might need to) be more flexible than fbc's symbol tables.
* It allows inserting arbitrary FB code in the form of text into the output (thinking of hand-made translations from a database).
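To illustrate the kind of work a custom expression parser for #define bodies has to do, here is a minimal sketch in Python (not fbfrog's actual code). It assumes the macro body is a simple integer constant expression, parses it with C operator precedence, and emits a fully parenthesized FB expression so that the differing FB precedence rules cannot change the meaning. The names `translate_define_body`, `BINARY`, and `Untranslatable` are hypothetical, only a handful of operators are covered, and `/` is deliberately left out because integer division is spelled differently in FB.

```python
import re

# One token: integer literal (no suffixes), identifier, or a supported operator.
TOKEN_RE = re.compile(
    r"\s*(?:(0[xX][0-9a-fA-F]+|\d+)(?![\w.])|([A-Za-z_]\w*)|(<<|>>|[-+*%|&^~()]))"
)

# C binary operators handled in this sketch: (C precedence, FB spelling).
# A higher number binds tighter.
BINARY = {
    "*": (10, "*"), "%": (10, "mod"),
    "+": (9, "+"), "-": (9, "-"),
    "<<": (8, "shl"), ">>": (8, "shr"),
    "&": (5, "and"), "^": (4, "xor"), "|": (3, "or"),
}
UNARY = {"-": "-", "~": "not "}


class Untranslatable(Exception):
    """Raised when the body is outside the subset this sketch understands."""


def tokenize(body):
    tokens, pos = [], 0
    body = body.strip()
    while pos < len(body):
        m = TOKEN_RE.match(body, pos)
        if m is None:
            raise Untranslatable(body[pos:])
        tokens.append(m.group(m.lastindex))
        pos = m.end()
    return tokens


def fb_number(text):
    if text[:2].lower() == "0x":
        return "&h" + text[2:]
    if len(text) > 1 and text[0] == "0":
        raise Untranslatable("octal literals are not handled in this sketch")
    return text


def parse_unary(tokens, pos):
    if pos >= len(tokens):
        raise Untranslatable("unexpected end of expression")
    tok = tokens[pos]
    if tok in UNARY:
        operand, pos = parse_unary(tokens, pos + 1)
        return f"({UNARY[tok]}{operand})", pos
    if tok == "(":
        inner, pos = parse_expr(tokens, pos + 1, 0)
        if pos >= len(tokens) or tokens[pos] != ")":
            raise Untranslatable("missing closing parenthesis")
        return inner, pos + 1
    if tok[0].isdigit():
        return fb_number(tok), pos + 1
    if tok[0].isalpha() or tok[0] == "_":
        return tok, pos + 1
    raise Untranslatable(tok)


def parse_expr(tokens, pos, min_prec):
    # Precedence climbing; all operators handled here are left-associative.
    lhs, pos = parse_unary(tokens, pos)
    while (pos < len(tokens) and tokens[pos] in BINARY
           and BINARY[tokens[pos]][0] >= min_prec):
        prec, fb_op = BINARY[tokens[pos]]
        rhs, pos = parse_expr(tokens, pos + 1, prec + 1)
        lhs = f"({lhs} {fb_op} {rhs})"
    return lhs, pos


def translate_define_body(body):
    """Return an FB expression string, or None if the body is not handled."""
    try:
        tokens = tokenize(body)
        expr, pos = parse_expr(tokens, 0, 0)
        if pos != len(tokens):
            raise Untranslatable("trailing tokens")
        return expr
    except Untranslatable:
        return None
```

Anything the sketch does not recognize (casts, statements, literal suffixes) returns None, which is where a hand-made translation would have to take over.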
Re: compile-time bindgen for FB
Speed is less important if you cache. Just have an output option to FB that you pass on to bindgen to store generated .bi's. On trouble, simply erase the cache dir, and it will repopulate.
Of course, cached .bi files won't be regenerated automatically when new headers are installed, and changes to the preprocessor state ahead of the include won't be detected either. But both are fairly uncommon, so if generation is slow, I'd take the speedup over fixing those rare cases.
(added later:)
Maybe a small file in the cache dir listing all three compiler versions (fb, llvm and gcc) could be used for a quick check of whether the cache is still valid. That would fix some scenarios during setup/installation.
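A rough sketch of that version-stamp check, in Python for illustration only. The stamp file name `versions.json`, the function names, and the shape of the `versions` dictionary are all assumptions; nothing here reflects what fbc or fbbindgen actually do.

```python
import json
import os
import shutil

STAMP_NAME = "versions.json"  # hypothetical file name inside the cache dir


def cache_is_valid(cache_dir, versions):
    """True if the stamp file exists and lists exactly the given versions."""
    try:
        with open(os.path.join(cache_dir, STAMP_NAME)) as f:
            return json.load(f) == versions
    except (OSError, ValueError):
        return False


def prepare_cache(cache_dir, versions):
    """Reuse the cache if the stamp matches; otherwise wipe and re-stamp it.

    Returns True when the existing cache was kept, False when it was reset.
    """
    if cache_is_valid(cache_dir, versions):
        return True
    shutil.rmtree(cache_dir, ignore_errors=True)
    os.makedirs(cache_dir, exist_ok=True)
    with open(os.path.join(cache_dir, STAMP_NAME), "w") as f:
        json.dump(versions, f)
    return False
```

This only catches toolchain upgrades. Header changes and different preprocessor state would still go unnoticed, which matches the trade-off described above.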
Re: compile-time bindgen for FB
This sounds like a pretty cool idea, although with comparatively little experience parsing C headers, it's hard for me to guess how well it would work.
One question that comes to my mind is how it will cope when it comes across things that can't be translated from C to FB?
I remember trying to port a header with some complicated #defines that had no direct FB equivalent, e.g. 'if ( (a=b()) == c) {foo(a)}'.
Re: compile-time bindgen for FB
counting_pine wrote:
This sounds like a pretty cool idea, although with comparatively little experience parsing C headers, it's hard for me to guess how well it would work.
One question that comes to my mind is how it will cope when it comes across things that can't be translated from C to FB?
I remember trying to port a header with some complicated #defines that had no direct FB equivalent, e.g. 'if ( (a=b()) == c) {foo(a)}'.
That's what I meant in my post. One will probably end up hacking bi files to get it to work. In any case, it's a challenge. Currently I use fbfrog and often try to convert individual header files so that todo's can be addressed on target.
Re: compile-time bindgen for FB
Also types imported from the header start to vary from system to system, and might not be compatible with fields or operations that the FB code does on them.
In a hand-translated header, you can keep the FB type the same as long as the binary layout remains the same, but this way you import whatever gore there is in the headers, and it also takes skill to manage that from the FB application programmer's side.
A workaround would be to only invoke the header generator if you can't find a .bi in the distribution, so problematic headers could still be hand-translated and distributed (both in counting_pine's and my scenario).
Still, I'm curious how this works out in practice. I guess it is very hard to correctly weigh the pros and cons without actually trying.
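The lookup order for that workaround could be as simple as the following Python sketch. The function name `find_binding` and the idea of passing two separate search path lists are assumptions made for illustration; how fbc would really search is an open question in this thread.

```python
import os


def find_binding(name, bi_dirs, c_dirs):
    """Prefer a distributed .bi; fall back to generating from the C header.

    Returns ("bi", path), ("generate", path), or None if nothing is found.
    """
    for d in bi_dirs:
        path = os.path.join(d, name + ".bi")
        if os.path.isfile(path):
            return ("bi", path)
    for d in c_dirs:
        path = os.path.join(d, name + ".h")
        if os.path.isfile(path):
            return ("generate", path)
    return None
```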
Re: compile-time bindgen for FB
Just to say, it occurs to me something like this would potentially be amazing on a package-managed Linux distro. Suddenly FB would become compatible with any C-based library you can install on the system.
Perhaps one approach to untranslatable sections is to mark them as such, and throw an error if a program tries to use them, but allow them to be shadowed by definitions written in FB instead.
It may mean some libraries just need some very thin headers that include the C header, and fill in a few of the definitions manually.
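As a sketch of how that shadowing rule could behave, here is a small Python model. It assumes the generator records each C symbol either with its FB translation or with None when translation failed, and that handwritten FB definitions live in a separate table. `resolve_symbol` and `UntranslatableError` are made-up names.

```python
class UntranslatableError(Exception):
    """A program used a symbol that has neither a translation nor an override."""


def resolve_symbol(name, generated, overrides):
    """Return the FB text to use for `name`.

    generated: dict of C symbol -> FB declaration text, or None if the
               generator could not translate it.
    overrides: dict of C symbol -> handwritten FB declaration text.
    """
    if name in overrides:
        # A handwritten definition always shadows the generated one.
        return overrides[name]
    if name not in generated:
        raise KeyError(name)
    decl = generated[name]
    if decl is None:
        raise UntranslatableError(
            f"'{name}' could not be translated; provide an FB definition"
        )
    return decl
```

The error is only raised on use, so a header full of untranslatable macros would not block a program that never touches them.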
Re: compile-time bindgen for FB
I like counting_pine's suggestion. I was about to suggest something similar. It seems simpler to distribute .bi files with FB which #include_c the original header and then add whatever couldn't be translated to FB, than to have to deal with a DSL for specifying the "hints". fbfrog works beautifully most of the time, but I find writing fbfrog rules and replacements an enormous pain (I don't fully understand how many of the options work due to lack of documentation, I've had to resort to patching fbfrog or using gdb to figure out why they don't work, and in the end it seems like the options I really need often don't exist anyway).
With a full-blown C compiler, what still won't be possible to translate directly? It seems like it would be:
-some macros (if parsing or translation to FB fails)
-some inline functions (if translation to FB fails)
-C extensions such as function attributes
Except for C extensions with no FB analogue which can't just be ignored as they affect the ABI (and are therefore hopeless to support), it seems that other TODOs can be replaced with declarations in a handwritten header.
Using libclang sounds like the nicest option. It's too bad a separate C parser will still be necessary for macros, but a libclang tool will still be able to handle more than fbfrog does.
The API for LLVM is notoriously fast-moving. I've heard people say it's a full-time job to keep up with it... probably a bit of an exaggeration. Does libclang also suffer from that?
But fbfrog already works so well most of the time that while this would be a very cool feature, I personally don't have much need for it - I write C or C++ code if using translated headers is a concern.
Re: compile-time bindgen for FB
Using "wrapper" .bi files to contain manually written declarations besides the #include_c/#bindgen keyword sounds like a great idea. Probably fbbindgen can just omit declarations that it can't translate. I think this would mostly help with complex macros/inline functions, but probably some hints will still be needed to consistently/deterministically resolve symbol name conflicts (with FB keywords or with each other due to case-insensitivity), and maybe other things (e.g. whether to use Extern "Windows" or "Windows-MS").
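To make the name-conflict problem concrete, here is a minimal Python sketch of one deterministic renaming scheme: process the C identifiers in sorted order and append underscores until the name no longer clashes case-insensitively with an FB keyword or an already assigned name. The keyword set is a tiny illustrative subset and `resolve_names` is a hypothetical helper, not fbbindgen code.

```python
# Tiny illustrative subset of FB keywords; a real tool would need the full list.
FB_KEYWORDS = {"dim", "type", "end", "next", "string", "integer"}


def resolve_names(c_names):
    """Map each C identifier to a name that is unique in case-insensitive FB."""
    used = set()
    result = {}
    for name in sorted(c_names):  # sorted input keeps the outcome reproducible
        candidate = name
        while candidate.lower() in FB_KEYWORDS or candidate.lower() in used:
            candidate += "_"
        used.add(candidate.lower())
        result[name] = candidate
    return result
```

A renamed procedure or variable would still need an Alias clause carrying the original C name so that linking keeps working. Which symbol gets to keep its original spelling is exactly the kind of decision a hint might have to override.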
I've pushed some code at dkl/fbc:bindgen, containing contrib/bindgen and a #bindgen keyword for fbc. It has some basic "parsing" of data types, variable and procedure declarations, and structs/unions/enums.
Basic usage:
sudo apt-get install libclang-4.0-dev
cd contrib/bindgen
echo 'FBFLAGS := -g -exx' > config.mk
make
echo "extern int i;" > a.h
./fbbindgen a.h
fbc parse-gcc-incdirs.bas
./fbbindgen $(./get-gcc-incdirs.sh) /usr/include/stdlib.h -v
What I want to work on next to continue the basic functionality:
- nested/anonymous/forward declared structs/unions/enums (I've been experimenting using libclang's USR mechanism for this, but help is welcome)
- #define parsing (I was going to import some code from fbfrog for this)
- enumconst initializers/values
And overall:
- more "tests" (I can import some from fbfrog), any ideas how to improve the "test suite" (currently it's just the expected output .bi stored in Git)?
- any ideas to improve the code?
- it would be nice to also have some help with the fbc-side of things (#bindgen and its usability). E.g. currently fbc doesn't search fbbindgen relative to itself, but requires it to be in PATH. Cross-compiling must work. How to get libclang for MinGW? Should we worry about the DOS port in this context? etc.
- we could add simple C++ support (references, default parameter values, simple classes - looking for help)
- fbbindgen should emit #asserts to verify structure sizes (especially in the context of bitfields)
- figure out how to handle built-in gcc/clang types such as __builtin_va_list (probably just emit as dummy byte arrays of the correct size?)
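For the #assert idea, the generator side could look roughly like this Python sketch. The size would normally come from the C parser; here `ctypes` stands in for that, using a made-up struct with bitfields. `emit_size_assert` and `Example` are hypothetical names, and the emitted line assumes FB's `#assert` directive with a `sizeof()` comparison.

```python
import ctypes


class Example(ctypes.Structure):
    # Stand-in for: struct Example { unsigned a : 3; unsigned b : 5; int c; };
    _fields_ = [
        ("a", ctypes.c_uint, 3),
        ("b", ctypes.c_uint, 5),
        ("c", ctypes.c_int),
    ]


def emit_size_assert(fb_type_name, c_size):
    """Return an FB #assert line checking the translated type's size."""
    return f"#assert sizeof({fb_type_name}) = {c_size}"


if __name__ == "__main__":
    # The printed size depends on the platform ABI.
    print(emit_size_assert("Example", ctypes.sizeof(Example)))
```

If the FB translation of the bitfields packs differently from the C compiler, compiling the generated .bi would then fail at the #assert instead of silently producing an ABI mismatch.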
Re: compile-time bindgen for FB
This is a very cool idea that I know has been turned down by devs in the past. I am not entirely sure why it was turned down, but a C backend was also turned down at the same time.
Times change, clearly. If there is anything I can do to help with this, let me know. I don't think there is much, but I think it would go a long way toward simplifying the use of third-party libraries in FB.
Re: compile-time bindgen for FB
Yea, I'm looking for help with the idea, someone else who works on it, because I won't have time to finish it by myself.
Re: compile-time bindgen for FB
I was looking at updating zlib.bi to version 1.2.11 as response to pull request #61 with fbfrog as-is.
Anyway, while doing this, I was thinking about fbbindgen:
1) I expect that fbbindgen still needs a special recipe, options, replacements, etc., and this would be done through the .bi wrapper header.
2) The needed .h files must be available from the build system (mingw, gcc, etc.).
3) Installing a library for use with fb would require installing .h, .a, .so, .dll, etc, to the build/run environment, outside of fb's tree.
I am very thankful for the hard work on translated headers. And I can see the motivation for the desire to process .h files directly.