Posts Tagged ‘gcc’

Getting gcc to warn you when you mess up stdargs

Wednesday, January 20th, 2010

Sometimes you write C functions that take a format string and a variable argument list, just like printf, using stdargs (<stdarg.h>).

An example of this would be something like this short debug function

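#include <stdarg.h>
#include <stdio.h>

/* Append a printf-style message to /tmp/debug. */
int ptf(const char *fmt,...)
{
  va_list varlist;
  int ret;
  FILE *fp=fopen("/tmp/debug","a");
  if(!fp) return -1;               /* could not open the debug file */
  va_start(varlist,fmt);
  ret=vfprintf(fp,fmt,varlist);
  va_end(varlist);
  fclose(fp);
  return ret;                      /* characters written, as vfprintf reports */
}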

This function isn’t rocket science; it simply appends your formatted string to a file. It is a simple time-saving utility.

However, using it can be a problem. You can do something like this

int x=1;
ptf("Error %s\n",x);

And gcc will say ‘sure, no problem’.

But running the code will almost certainly crash, because vfprintf tries to interpret the integer as a pointer to a string.

This is the kind of thing that should be picked up on by the compiler. And in fact it can be, quite easily.

In your prototype for the function, you would have something like

extern int ptf(const char *,...);

This is pretty standard, and no surprises there. However, gcc has the capability to be given a hint as to how this function should be handled. You can instead prototype the function using

extern int ptf(const char *,...) __attribute__ ((format (printf, 1, 2)));

This tells gcc that ptf takes printf-style arguments (which it knows how to check for errors). It will then check parameter 1 (the format string) against what is passed in starting at parameter 2 (the ...). If an incorrect data type is used, this will now be detected and flagged up as a warning, in exactly the same way as an incorrect type used in a printf. The check is part of -Wformat, which -Wall switches on.
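
As a minimal sketch of how the two numbers are counted, assume a hypothetical helper fmt_into (not part of the original code) that formats into a caller-supplied buffer. Its format string is the third parameter and the values start at the fourth, so the attribute becomes format (printf, 3, 4):

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: the format string is parameter 3 and the
   variable arguments start at parameter 4, hence (printf, 3, 4). */
int fmt_into(char *buf, size_t len, const char *fmt, ...)
    __attribute__ ((format (printf, 3, 4)));

int fmt_into(char *buf, size_t len, const char *fmt, ...)
{
    va_list ap;
    int n;

    assert(buf != NULL && fmt != NULL);
    va_start(ap, fmt);
    n = vsnprintf(buf, len, fmt, ap);   /* never writes more than len bytes */
    va_end(ap);
    return n;                           /* length the full message needs */
}
```

With -Wall, a call such as fmt_into(buf, sizeof buf, "Error %s\n", 1) should then draw a format warning, while the same call with %d compiles quietly.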

Variable length C macros

Saturday, May 30th, 2009

Something that came up in porting X2 and X3 is that Visual Studio on Windows handles variable length macros and, as far as we could see, GCC didn’t.

This means that in gcc, if you wanted to create a define that would do a job and report where it was called from, you would need to do something like:

void mydebug(const char *file,int line,const char *fmt,...);

#define DEBUGOUT1(x) mydebug(__FILE__,__LINE__,x)
#define DEBUGOUT2(x,y) mydebug(__FILE__,__LINE__,x,y)
#define DEBUGOUT3(x,y,z) mydebug(__FILE__,__LINE__,x,y,z)

and so on…

However, there is a solution, which we found after a fair amount of investigation. The recommended method would be

#define DEBUGOUT(x,y...) mydebug(__FILE__,__LINE__,x,y)

This will work, but only if you actually pass a second parameter to the DEBUGOUT call. Without one, it will expand

DEBUGOUT(x)

to

mydebug(__FILE__,__LINE__,x,)

which will obviously fail to compile. To fix this, simply bear in mind that ... in the macro just means everything else, so you can do

#define DEBUGOUT(x...) mydebug(__FILE__,__LINE__,x)

which then works with single and multiple values in the macro: x becomes the fmt AND the ... for mydebug.
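
The named form x... is a GNU extension. C99 also standardised an unnamed form using __VA_ARGS__, which gcc accepts as well. The sketch below shows that form; the body of mydebug here is only a stand-in that keeps the last message in a buffer (the last_msg buffer and the file name trimming are assumptions made for illustration, not the original code):

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for the real mydebug: keeps the last message in a buffer
   so that it can be inspected (hypothetical, for illustration only). */
static char last_msg[256];

static void mydebug(const char *file, int line, const char *fmt, ...)
{
    const char *base;
    va_list ap;
    int n;

    assert(file != NULL && fmt != NULL);
    base = strrchr(file, '/');          /* drop any directory part */
    base = base ? base + 1 : file;

    /* Prefix with the call site, then append the formatted message. */
    n = snprintf(last_msg, sizeof last_msg, "%s:%d: ", base, line);
    if (n < 0 || (size_t)n >= sizeof last_msg)
        return;
    va_start(ap, fmt);
    vsnprintf(last_msg + n, sizeof last_msg - (size_t)n, fmt, ap);
    va_end(ap);
}

/* C99 form: __VA_ARGS__ is everything passed in, format string included. */
#define DEBUGOUT(...) mydebug(__FILE__, __LINE__, __VA_ARGS__)
```

Both DEBUGOUT("plain") and DEBUGOUT("x=%d y=%s", 5, "ok") expand cleanly, because the format string is itself part of __VA_ARGS__ and no trailing comma is ever left behind.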

Our new way to meet the LGPL

Sunday, February 8th, 2009

Hi again, and welcome to our next technical article. This is a mix of technical and legal, but as I know many of us in the open source community are very serious about the licences we work under, I thought you would like a little background reading to lead you up to a really neat and little-known feature of the GNU linker (ld) that we have just adopted.

For years, LGP has been working with libraries such as SDL, ffmpeg, and others that are licensed under the LGPL (GNU Lesser General Public License). Without these invaluable tools from the open source community, LGP would not exist, and nor would hundreds of open source projects.

The LGPL states that an application that links against an LGPL library is not bound by the LGPL itself, but then goes on to qualify this, and make exceptions, and even itself states that the boundaries between what counts as simply linking against a library, and what counts as a derivative work, are ‘not precisely defined by law’.

The problem we have always faced is finding a way to make sure the game is portable. To do this you MUST make sure that you are using a known version of as many libraries as you possibly can. There is no point in exhaustively testing a game against SDL 1.2.12 when next week SDL 1.2.13 comes out, changes a few of our assumptions, and means the game crashes. Multiply the problem by the number of versions a library has, multiplied by the number of libraries a game links against, and you can see why this is a big problem. And so, we like to make sure we build the game, test the game, and run the game, all against exactly the same libraries as the end user will use, in as many cases as is possible.

Since the beginning of commercial Linux games, the common practice has been to create a release of each game such that there was a static and a dynamic linked version of the game in each release. The dynamic version of the game would be completely in compliance with the word and spirit of the LGPL, using the user’s own system libraries, while the static linked version of the game was released because linking the libraries directly into the game ensured we knew which libraries were being used. The statically linked executable, though, was really not very much in the spirit of the LGPL. We always got away with it because we included the exact same game in full LGPL compliance, and because the wording of the LGPL left it fairly ambiguous as to whether this was allowed. But even so, we were never happy with it. Loopholes are not something to be proud of using.

There was another method, of course: forcing the game, via LD_LIBRARY_PATH, to use libraries in a certain directory. However, that had issues of its own. To do this you either have to tell the user ‘before you start your game type this long command into the command line’ or you start the game from a shell script. Shell scripts are all well and good, but they bring problems of their own, such as (for security) making changes to the euid, resetting values from /etc/profile, and of course, assuming that the shell in use has exactly the same syntax as the shell at the time of release. Because of this, and many other issues, we decided that starting from a shell script was too much of a risk for portability, and ruled it out.

And so, we were left with the method that has been in use for the last 12 or so years. That is, until recently, when we found a nice new way to fix this problem once and for all.

Most people are probably unaware of the linker option -rpath, and most of you don’t ever need to be. This option lets you tell an application where to look for libraries. It works just like adding a new path into LD_LIBRARY_PATH. Great, but on its own it doesn’t really help: the path is fixed at link time, so we would need to restrict installation to a known directory on everyone’s machine. Obviously unacceptable for most users.

And so the problem remained until one of our devteam discovered a neat little trick that isn’t even documented in the manual for the linker. You can use a special keyword $ORIGIN to say ‘relative to the actual location of the executable’. Suddenly we found we could use -rpath $ORIGIN/lib and it worked. The game was loading the correct libraries, and so was stable and portable, but was also now completely in the spirit of the LGPL as well as the letter!

For those of you a little newer to compiling under Linux, you may not even be aware that you use the ld linker. gcc runs it for you automatically. If you call gcc from a Makefile, the syntax is a little more awkward: -Wl, passes the option through gcc to the linker, $$ is how make writes a literal $, and the backslash stops the shell from expanding $ORIGIN as a variable. As a hint, you would change an example Makefile line that started like this

gcc obj1.o obj2.o -o my_application

to be

gcc -Wl,-rpath,\$$ORIGIN/lib/ obj1.o obj2.o -o my_application
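
To make it concrete what $ORIGIN/lib ends up meaning, here is a small sketch of the path arithmetic. It is an illustration only: the dynamic loader does this expansion itself, nothing like it is needed in the game, and the function name origin_lib_dir is made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Illustration only: given the full path of an executable, build the
   directory that "$ORIGIN/lib" would refer to.
   Returns 0 on success, -1 if the path has no '/' or out is too small. */
int origin_lib_dir(const char *exe_path, char *out, size_t len)
{
    const char *slash;
    int dirlen, n;

    assert(exe_path != NULL && out != NULL);
    slash = strrchr(exe_path, '/');
    if (slash == NULL)
        return -1;
    dirlen = (int)(slash - exe_path);   /* length of the directory part */
    n = snprintf(out, len, "%.*s/lib", dirlen, exe_path);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

So a game unpacked into /opt/games/mygame (a made-up path) would look for its libraries in /opt/games/mygame/lib, and the same binary moved elsewhere would look beside its new location instead.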

So, that’s the neat little trick I wanted to share with you. Maybe it will help some of you out there to organise the way your projects run. Of course it isn’t just useful for closed source; it is useful for any project that has to use a specific library version in order to work properly!

The trouble with storing binary data structures

Thursday, January 29th, 2009

To start off our series of Programming posts, I’d like to start you off on a technical issue we bumped into yesterday. This isn’t a new issue for us, but running into it again made us think ‘Hey, this would be a great topic for our first technical article’.

Assumptions: You know some C, You know what a struct in C is.

So, as we were working yesterday on a patch for Majesty, we bumped into an issue.

We had the following data structure (this is an abbreviation, the real structure is code we aren’t really allowed to just post on a website!)

struct datastruct
{
  char ltr;
  short key;
  int value;
};

Now, we were using this to read in a blob of binary data from the game’s data files. These data blobs had been stored from Windows when the game was made and, on testing, loaded just fine on Windows.

On Linux, however, reading the data failed.

struct datastruct datastuff;

//src is a data stream that is the same on Windows and Linux
memcpy(&datastuff,src,sizeof(datastuff));

The same code on Windows and Linux produces different results! Why can this be?

The Answer

The answer lies in how the struct is stored.

Windows was being told to ‘pack’ its data structures, to save memory. So the data in the structure was held as follows

Byte     0    1    2    3    4    5    6
Data  |-ltr-||--key--||-----value--------|

When we were using Linux to read this data back in, it was not packed in the same way. By default, gcc aligns each member to its own size, so on a 32 bit Linux machine the 2 byte short starts on a 2 byte boundary and the 4 byte int on a 4 byte boundary, like so

Byte     0    1    2    3    4    5    6    7
Data  |-ltr-|     |--key--||-----value--------|

As you can see, the struct is now 8 bytes long, with an unused byte of padding after ltr. If you simply copy the 7 byte data stream over it, the ltr will be correct, the key will be made of the second byte of the stored key and the first byte of the stored value, and the value will be the rest of the stored value plus one byte from beyond the end of the record!
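
A quick way to see what layout the compiler has actually chosen is to print the size and member offsets with sizeof and offsetof. The sketch below assumes a typical Linux gcc target where short is 2 bytes and int is 4 bytes; other platforms may differ, and show_layout is just a name picked for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* The struct from the article, with no packing pragma in effect. */
struct datastruct
{
    char ltr;
    short key;
    int value;
};

/* Print where the compiler has placed each member. */
void show_layout(void)
{
    /* The first member of a struct always sits at offset 0. */
    assert(offsetof(struct datastruct, ltr) == 0);

    printf("sizeof = %zu\n", sizeof(struct datastruct));
    printf("ltr    @ %zu\n", offsetof(struct datastruct, ltr));
    printf("key    @ %zu\n", offsetof(struct datastruct, key));
    printf("value  @ %zu\n", offsetof(struct datastruct, value));
}
```

On such a target this should report a size of 8 with key at offset 2 and value at offset 4, one byte more than the 7 byte packed record.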

So, how do you fix this?

gcc uses a pragma to resolve this. Use

#pragma pack(n)

on a line of its own before the struct is defined, where n is the number of bytes you want to pack to. n must be a small power of 2 (so 1, 2, 4, 8 or 16).

When you are finished defining things that need to be packed in a certain way, restore the default using

#pragma pack()

So, if you put, at the start of the file defining the structure

#pragma pack(1)

then the data structure will look the same as in the first example, all scrunched up into 7 bytes. If you use

#pragma pack(2)

then no member is aligned to more than a 2 byte boundary. For this struct that means it will take up 8 bytes, and there will be a 1 byte gap between ltr and key, which would again cause problems.

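Using

#pragma pack(4)

gives the same 8 byte layout for this struct, because packing only ever lowers the alignment of a member, it never raises it, and no member here needs more than 4 byte alignment.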
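
The effect of the different settings can be checked at compile time. This sketch uses the push and pop forms of the pragma, which gcc also accepts, so that the previous setting is restored exactly. The struct names packed1, packed2 and packed4 are made up for the example, and it assumes a 2 byte short and a 4 byte int.

```c
#include <assert.h>
#include <stddef.h>

#pragma pack(push, 1)       /* save the current setting, then pack to 1 */
struct packed1 { char ltr; short key; int value; };
#pragma pack(pop)           /* back to whatever was in effect before */

#pragma pack(push, 2)
struct packed2 { char ltr; short key; int value; };
#pragma pack(pop)

#pragma pack(push, 4)
struct packed4 { char ltr; short key; int value; };
#pragma pack(pop)

/* Checked at compile time (C11); assumes 2 byte short and 4 byte int. */
static_assert(sizeof(struct packed1) == 7, "pack(1): no gaps");
static_assert(sizeof(struct packed2) == 8, "pack(2): one gap after ltr");
static_assert(sizeof(struct packed4) == 8, "pack(4): same as pack(2) here");
```

If one of the assumptions does not hold on your platform, the build stops with the matching message instead of silently misreading data later.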

So, how do you detect this when you find your data is corrupted?

It isn’t that hard to detect when this has happened. If your data is not the same when you read it in, and you are reading a whole struct in from a binary stream or blob, then chances are it is a packing issue. Look at the bytes in the stream, try to match them up with the bytes you see in your struct, and look for a pattern: see where bytes from the data stream are missing or shifted when you look in your struct.

If the data in the struct matches the data in the stream, but the data you read back is different from the data you saved, don’t forget that packing works both ways. If you have a struct that uses the default alignment, and you write this to a stream, it will look like this

Byte     0    1    2    3    4    5    6    7
Data  |-ltr-|     |--key--||-----value--------|

The byte in the gap (byte 1) will still be saved, but it can be ANYTHING. Do not rely on it being 0; it isn’t always the case.

So if you read this into a packed data structure, you will find that you read in the first byte correctly, you then read the key as one random byte plus the first byte of the real key, and the value will be made up of the rest of the key and only part of the real value!
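
One way to sidestep padding altogether, offered here as an alternative sketch rather than what we did in the patch, is to decode the record one field at a time instead of copying the whole struct. The function name read_datastruct is made up, and the code assumes the file holds the 7 byte packed form in little-endian byte order, as data written on an x86 Windows machine would be.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct datastruct
{
    char ltr;
    short key;
    int value;
};

/* Decode one record from the 7 byte packed form (ltr, key, value with
   no gaps). Assumes little-endian data. The result does not depend on
   how the compiler lays out struct datastruct in memory.
   Returns the number of bytes consumed from src. */
size_t read_datastruct(struct datastruct *out, const unsigned char *src)
{
    assert(out != NULL && src != NULL);
    out->ltr   = (char)src[0];
    out->key   = (short)(uint16_t)(src[1] | (src[2] << 8));
    out->value = (int)((uint32_t)src[3] | ((uint32_t)src[4] << 8) |
                       ((uint32_t)src[5] << 16) | ((uint32_t)src[6] << 24));
    return 7;
}
```

The cost is a little more code per struct, but the same source then reads the file correctly whatever packing the compiler uses.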

We hope that this little tutorial has been helpful to you, and given you a bit of an understanding of this problem. If you spot any mistakes, or see ways to improve it, please drop us a comment on the article!
