Thursday, July 21, 2011

Automating the wrapping of C functions and GObject properties

Hello everyone.  This is my first post on this technical blog, and though I may not post often for lack of time, I wanted to share a discovery in case it helps others doing similar work.  I made it while translating GStreamer's C API to C++ for the gstreamermm library with a tool called gmmproc.

As the wrapping docs say, translating C functions into C++ class methods generally means writing a series of _WRAP_METHOD() statements, and a lot of time can go into doing this by hand for every function available for a particular class.  Luckily, I've discovered a method that makes the process much faster, a little easier and somewhat more reliable than writing the _WRAP_METHOD() statements manually.  (Sadly, the method can only be used with vim.  Sorry.)

The method consists of the following vim script I came up with in the last couple of days while wrapping GStreamer's GstDiscoverer API:

Update: Since the original post, the script has been modified slightly to fix minor bugs and make it more useful.

" Convert C function declarations copied and pasted from C documentation
" (maybe from a browser such as Firefox) to _WRAP_METHOD directives.  To use,
" set the 'a' mark to the beginning of the declarations and the 'b' mark to the
" end of the declarations and run this script.

" Compress multiple spaces to a single space.
:'a,'bs/\s\{2,\}/ /ge

" Convert 'Type *name' to 'Type* name'.
:'a,'bs/ \(\*\+\)\s*/\1 /ge

" Join lines starting with (... to the previous one.
:'a,'bs/\n\s*(/(/ge

" Remove space between function name and (.
:'a,'bs/ (/(/ge

" Join lines ending with a comma to the one below.
:'a,'bs/,\n/,/ge

" Convert the resulting function declarations to _WRAP_METHOD() directives.
:'a,'bs/\(^.*\) \(\w\+\)\((.*)\);/  _WRAP_METHOD(\1 \2\3, \2)/ge

" Ensure that two spaces precede the _WRAP_METHOD() directives.
:'a,'bs/^ _WRAP_METHOD/  _WRAP_METHOD/ge

" Convert gboolean to bool.
:'a,'bs/gboolean/bool/ge

" Remove the initial 'g' from gchar, gint, gfloat, and gdouble.  The \< and \>
" word boundaries keep types such as gint64 from being altered.
:'a,'bs/\<g\(char\|int\|float\|double\)\>/\1/ge

" Remove final GError** params from declarations and append an 'errthrow' to
" the _WRAP_METHOD() directives.
:'a,'bs/\(_WRAP_METHOD(.*\), GError\** \w\+), \(\w\+\))/\1), \2, errthrow)/ge

" Ask if a 'char*' or 'const char*' return type should be converted to
" Glib::ustring.
:'a,'bs/_WRAP_METHOD(\(const \)\?char\*/_WRAP_METHOD(Glib::ustring/gec

" Ask if a char* parameter (usually has a const preceding it) should be
" converted to Glib::ustring&.
:'a,'bs/char\*/Glib::ustring\&/gec

" Ask if remaining 'const char*' return types should be converted to
" std::string.
:'a,'bs/_WRAP_METHOD(const char\*/_WRAP_METHOD(std::string/gec

" Ask if remaining char* parameters should be converted to std::string&.
:'a,'bs/char\*/std::string\&/gec

" Ask if a return pointer type should be embedded in a Glib::RefPtr<>.
:'a,'bs/_WRAP_METHOD(\(\w\+\)\* /_WRAP_METHOD(Glib::RefPtr<\1> /gec

" Ask if a parameter pointer type should be embedded in a const
" Glib::RefPtr<>&.
:'a,'bs/\(const \)\?\(\w\+\)\* /const Glib::RefPtr<\2>\& /gec

" Ask if double indirect pointers (normally parameters) should be embedded in
" a Glib::RefPtr<>& (non-const).
:'a,'bs/\(\w\+\)\*\* /Glib::RefPtr<\1>\& /gec

" Ask if remaining return pointer types should be changed to non-pointers.
:'a,'bs/_WRAP_METHOD(\(\w\+\)\* /_WRAP_METHOD(\1 /gec

" Ask if remaining parameter pointer types should be changed to const
" non-pointer references.
:'a,'bs/\(const \)\?\(\w\+\)\* /const \2\& /gec

" Ask if remaining double indirect pointers (normally parameters) should be
" changed to C++ references (non-const).
:'a,'bs/\(\w\+\)\*\* /\1\& /gec

" Ask if remaining pointer parameters should be completely removed.
:'a,'bs/\w\+\* \w\+,\? *//gec
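
The two non-interactive rewriting steps can be tried on their own, without touching a buffer, through Vim's built-in substitute() function.  The lines below are only an illustrative sketch: they apply the same patterns to one declaration from the GstDiscoverer API, written the way the cleanup steps would leave it, and the variable names decl and wrapped are made up for this example.

```vim
" A declaration as the cleanup steps would leave it: single spaces, '*'
" attached to the type, and no space before '('.
let decl = 'GstDiscovererInfo* gst_discoverer_discover_uri(GstDiscoverer* discoverer, const gchar* uri, GError** err);'

" Same pattern and replacement as the _WRAP_METHOD() step.
let wrapped = substitute(decl, '\(^.*\) \(\w\+\)\((.*)\);', '  _WRAP_METHOD(\1 \2\3, \2)', '')

" Same pattern and replacement as the GError** step, which drops the final
" error parameter and appends errthrow.
let wrapped = substitute(wrapped, '\(_WRAP_METHOD(.*\), GError\** \w\+), \(\w\+\))', '\1), \2, errthrow)', '')

" Show the resulting directive in the message area.
echo wrapped
```

Sourcing these lines from a scratch file should echo a directive that no longer has the GError** parameter and that ends in ", gst_discoverer_discover_uri, errthrow)".  The gchar type is left alone here because a separate step of the script handles it.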


What I did was save the script as ~/.vim/wrap_method.vim and then add the following map command to the ~/.vimrc file, which maps the 'Alt-=' key combination to run the script (the mapping can be changed to whatever is most comfortable):

map <A-=> :source ~/.vim/wrap_method.vim<CR>

Update: I finally decided to use Alt-m as the mapping because I've also made a similar script (posted below) that converts property documentation to _WRAP_PROPERTY() directives, which I've mapped to Alt-p.
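
For reference, the pair of mappings might look like the sketch below.  The file name wrap_property.vim is an assumption made for this example (the post only names wrap_method.vim), and nnoremap is used instead of map so that the keys are bound in normal mode only and are not remapped further.

```vim
" Alt-m: convert the declarations between marks 'a and 'b to _WRAP_METHOD().
nnoremap <A-m> :source ~/.vim/wrap_method.vim<CR>

" Alt-p: convert property documentation between the marks to _WRAP_PROPERTY().
" The file name wrap_property.vim is hypothetical.
nnoremap <A-p> :source ~/.vim/wrap_property.vim<CR>
```

Alt mappings like these tend to work in gvim; in a terminal, whether they fire may depend on how the terminal sends the Alt key.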

Once the script and the mapping are in place, all that has to be done is to copy the C declarations from Firefox into the .hg file like so:



(Notice that the declarations are embedded in a comment block -- that's mostly for convenience.)

Next, set the 'a' mark at the beginning of the block and the 'b' mark at the end of the block by pressing 'ma' in normal mode (vim's command mode) on the line that opens the comment and 'mb' on the line that closes it (this can be done before pasting the declarations if the comment block is created first).
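
For those who prefer a visual selection to setting the marks by hand, a small user command could set both marks from a range and then source the script.  This is only a sketch: the command name WrapMethod is hypothetical, and it assumes the script is saved as ~/.vim/wrap_method.vim.

```vim
" Set mark 'a to the first line of the range and mark 'b to the last line,
" then run the conversion script.  Vim fills in <line1> and <line2> from the
" range given to the command.
command! -range WrapMethod <line1>mark a | <line2>mark b | source ~/.vim/wrap_method.vim
```

With this in ~/.vimrc, the block can be selected with V and converted by typing :WrapMethod (vim inserts the '<,'> range automatically).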

Finally, press the hot-key (Alt-= in this posting) and the script will replace the declarations with initial _WRAP_METHOD() statements that can be further edited like so:


As you can see, the script begins by asking whether the 'const char*' returns should be replaced by 'Glib::ustring' (it does this for every 'const char*' return).  It then asks whether the 'char*' parameters should be replaced by 'Glib::ustring&'.  If the char* returns and parameters are not converted to Glib::ustrings, the script asks whether they should be converted to std::strings instead.

Afterwards, the script will ask if pointer returns should be embedded in Glib::RefPtr<> like so:


By pressing 'y', the _WRAP_METHOD() will be changed like so and the query will continue:


Finally, the script will ask if pointer parameter types should be converted to 'const Glib::RefPtr<...>&'.

Update: Actually, the updated script first asks if pointer parameter types should be converted as described above.  If they are not converted, it then asks if they should be converted to 'const ...&' (a const reference to the type).  If the types are still not converted, the script concludes by asking if the parameters of those types should be removed from the parameter list altogether.  This is convenient when the parameter is the first parameter of a function that becomes a method of the corresponding C++ type, in which case the parameter is not needed in the C++ method.
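
The removal step can be illustrated on a single directive with Vim's substitute() function.  The directive below is a made-up intermediate state, shown only to make the effect visible, and the variable name directive is likewise just for this sketch.

```vim
" A directive whose first parameter is still the C instance pointer.
let directive = '  _WRAP_METHOD(bool gst_discoverer_discover_uri_async(GstDiscoverer* discoverer, const Glib::ustring& uri), gst_discoverer_discover_uri_async)'

" Same pattern as the last step of the script; the empty replacement drops the
" first pointer parameter together with the comma and spaces that follow it.
echo substitute(directive, '\w\+\* \w\+,\? *', '', '')
```

The echoed directive should keep only the 'const Glib::ustring& uri' parameter, which is what a method of the C++ class would take.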
 
Once the script is done, all that is needed is to edit the method names, the types in the Glib::RefPtr<>'s and the remaining parameters:


I'm pretty sure the script is not perfect, but I think it could be very handy for the work of translating C function declarations to gmmproc _WRAP_METHOD() statements.  I hope it can be useful to anyone else doing similar work.

Update: Below is another script that converts property documentation to _WRAP_PROPERTY() statements.  It works in the same way as the script above:

" Convert property descriptions of GObject types copied and pasted from C
" documentation (maybe from a browser such as Firefox) to _WRAP_PROPERTY
" directives.  To use, set the 'a' mark to the beginning of the declarations
" and the 'b' mark to the end of the declarations and run this script.

" Compress multiple spaces to a single space.
:'a,'bs/\s\{2,\}/ /ge

" Convert 'Type *' to 'Type*'
:'a,'bs/ \(\*\+\)\s*/\1 /ge

" Convert the resulting property documentation to _WRAP_PROPERTY() directives.
:'a,'bs/^ _WRAP_PROPERTY/  _WRAP_PROPERTY/ge

" Convert gboolean to bool.
:'a,'bs/gboolean/bool/ge

" Remove the initial 'g' from gchar, gint, gfloat, and gdouble.  The \< and \>
" word boundaries keep types such as gint64 from being altered.
:'a,'bs/\<g\(char\|int\|float\|double\)\>/\1/ge

" Ask if char* types should be converted to Glib::ustring.
:'a,'bs/char\*/Glib::ustring/gec

" Ask if remaining char* types should be converted to std::string.
:'a,'bs/char\*/std::string/gec

" Ask if a pointer type should be embedded in a Glib::RefPtr<>.
:'a,'bs/\(\w\+\)\*/Glib::RefPtr<\1>/gec

" Ask if remaining pointer types should be changed to non-pointers.
:'a,'bs/\(\w\+\)\*/\1/gec
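
The central step of the property script can also be tried on a single line with Vim's substitute() function.  The sample below is an assumption about what a pasted property line looks like after the spaces have been compressed (a leading space, the quoted name, the type, then a colon and the access flags); the variable name line is only for illustration.

```vim
" A property line in the style of the C documentation, after multiple spaces
" have been compressed to single ones.
let line = ' "timeout" guint64 : Read / Write / Construct'

" Same pattern and replacement as the _WRAP_PROPERTY() step of the script.
echo substitute(line, ' "\(.\+\)" \(.\+\) : .*', '  _WRAP_PROPERTY("\1", \2)', '')
```

The echoed text should be a _WRAP_PROPERTY() directive that keeps the quoted property name and the type and drops the access flags after the colon.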