Programming | Linux Game Publishing Blog

Archive for the ‘Programming’ Category

Getting gcc to warn you when you mess up stdargs

Wednesday, January 20th, 2010

Sometimes, you may write functions in C that do things in the same way as printf, using stdargs.

An example of this would be something like this short debug function

int ptf(const char *fmt,...)
{
  va_list varlist;
  FILE *fp=fopen("/tmp/debug","a");
  va_start(varlist,fmt);
  vfprintf(fp,fmt,varlist);
  va_end(varlist);
  fclose(fp);
}

This function isn’t rocket science, it just simply appends your string into a file. It is a simple time saver utility.

However, using it can be a problem. You can do something like this

int x=1;
ptf("Error %s\n",x);

And gcc will say ’sure, no problem’.

But running the code will always crash. It tries to interpret the integer as a string.

This is the kind of thing that should be picked up on by the compiler. And in fact it can be, quite easily.

In your prototype for the function, you would have something like

extern int ptf(const char *,...);

This is pretty standard, and no surprises there. However, gcc has the capability to be given a hint as to how this function should be handled. You can instead prototype the function using

extern int ptf(const char *,...) __attribute__ ((format (printf, 1, 2)));

This tells gcc to treat the parameters 1 and 2 as the parameters to printf (which it knows how to check for errors). It will then check parameter 1 (the format string) against what is passed in starting at parameter 2 (the …). If an incorrect data type is used, this will now be detected and flagged up as a warning, in exactly the same way as an incorrect type used in a printf.

Tags: C, compiler, gcc, printf, stdargs, warnings
Posted in Programming | 2 Comments »

Some gotchyas in porting from Visual C++ to g++

Thursday, December 3rd, 2009

Todays technical article is going to go into a couple of issues that we have run across porting from Visual C++ to g++.

I will say right now that I am an application developer, I don’t know the fine details of how the inner workings of the compilers work, so some of this is merely an educated guess as to WHY the problem happens, but the effect and solution is correct.

The first issue for today is a fairly rare situation where g++, upon hitting a certain piece of code that builds fine under Visual C++, gives the helpful error message:

some/source/file.cpp:172: sorry, unimplemented: called from here

Now, I don’t know about you, but I find that error message singularly unhelpful. It took some time to run this one down the first time. It seems that g++ and VC++ perform some symbol resolution in different passes to each other. Because what this message is saying is that ‘at line 172, you have tried to use __forceinline, or something you reference has a member function that has __forceinline in it, but I do not know what the thing is, so I cannot inline it’.

EDIT: g++ does not have the __forceinline keyword. There is an equivalent which does the same job however

#define __forceinline inline __attribute__((always_inline))

As VC++ has no problem with the same calls, the only real answer must be a compiler difference that allows VC++ to get away with something that g++ doesn’t. The simple solution is to just change __forceinline to inline and let the compiler inline as it sees fit. Otherwise, you will need to hunt down the exact problem and resolve the order of events for g++

The second issue I thought I would raise today is one that actually happens quite commonly on porting from Windows to Linux. This is the case of bad memory management.

On Windows, the way memory is allocated allows for a certain sloppiness in code. While HIGHLY unrecommended, this is the sort of thing that happens in Windows code all the time. The following is a highly simplified example of the problem.

char *str;
str=(char *)malloc(6)
strcpy(str,"hello");
free(str)
switch (str[0])
{
  ...
}

Now, as you can see here, the application uses the variable str right after it has been free()’d. On Windows it seems that this kind of thing can be gotten away with. The memory manager on Windows is highly unlikely to either assign this memory address somewhere else, or to mark it as unavailable. On Linux however, you will often find that this kind of code leads immediately to a segmentation fault.

The example above is highly simplified, but illustrates what some of the more complex segmentation faults we have seen boil down to. If you see an error like this, which works on Windows and not on Linux, check your order of accessing. I know it sounds obvious, but it happens so often in commercial code that I feel it is worth stressing – free memory only when you have REALLY finished with it.

Tags: compiler, forceinline, free, g++, inline, malloc, memory, visual c++
Posted in Programming | 6 Comments »

argv and argc, and just how to get them

Monday, October 12th, 2009

argv and argc. They are vital parts of many programs. For the uninitiated, in many programming languages, they are the variables that hold the values typed into the commandline by the user. This article is about a particular problem we ran into trying to use them in a way that wasn’t really forseen. It really only applies to C and C++

Now most C/C++ programmers are probably thinking ‘what is so hard, you get them when you start main, right’…

int main(int argc,char **argv,char **envp)

That’s true, and in virtually all cases, you can store these values, either in a class, a global variable, a function, whatever your programming style uses. And from then on, you can simply access them from anywhere in the program. Its really simple, and something that most programmers know how to do in their sleep.

But what do you do before main happens? Or what about if you don’t have a main()

I know, you always have a main and nothing runs before it.

But that is very not true. Lets take two examples

Firstly, in C++

class gamedata
{
public:
  gamedata()
  {
    commandline=get_argv();
  }
  char **commandline;
}

static gamedata datastore;

int main(int argc,char **argv)
{
   set_argv(argv);
   ...
}

To start, this looks OK. Sure, Ive missed out a whole bunch of things, assumed what the user functions set_argv and get_argv do. But it looks fine. Until you look again. That static class instantiation will run its constructor before main runs. How on earth can you POSSIBLY get argv at this stage??

Lets look at an example in C now

If you are creating a shared object, you will very often need to initialise it to ensure that its state is known for the first call into the objects functions.

static char **commandline;

void __attribute__ ((constructor)) localinit()
{
  commandline=get_argv();
}

Again, you try this, and you get a big nothing. This function is called before main runs, before you ever have access to the values.

So, what is the answer?

The answer is, unfortunately, ugly. There is, in Linux, no good way of doing this. There are two perfectly good symbols inside libc which do the job perfectly, __libc_argv and __libc_argc, which are defined way before the program gets to any user-created code. Unfortunately they are declared private and you as a user are not permitted to see them. So, another way is needed.

We came up with two ways to make it work. Neither are portable, and one of them, while it works just fine, does make me go ewww. I’ll leave it to your imagination which one makes me go ewww the most.

char **get_argv()
{
  static char **preset_argv=NULL;

  if (!preset_argv)
  {
    FILE *fp;
    fp=fopen("/proc/self/cmdline","r");
    if (fp)
    {
      //Your implimentation to take the commandline as typed from this
      //file and turn it into argv. Its fairly basic
    }
  }
  return preset_argv;
}

char **get_argv()
{
  static char **preset_argv=NULL;

  if (!preset_argv)
  {
    extern char **environ;
    char ***argvp;
    void *p;
    asm ("mov %%esp, %0" : "=r" (p));
    argvp = p;
    while (*argvp != environ)
      argvp++;
    argvp--;
    preset_argv = *argvp;
  }
  return preset_argv;
}

So, a little explaination

The first example relies on /proc, the part of the filesystem that you can get all sorts of interesting information from. /proc/self/cmdline is always an exact duplicate of the command typed to execute the application, or the exact value passed in from the menu option you clicked to get it working. I haven’t bothered to create the bit of code to separate out the commandline parts onto their components. Partly because that code is fairly straightforwards, and partly because it is quite long and dull (remember it isn’t just a case of separating by spaces, you have to take into account things grouped in quotes, and other fun stuff). This is not portable beyond Linux, and people keep telling me that not all Linux distros have /proc either.

The second example requires a tiny bit of assembler knowledge. It is portable across most unix flavours, but I expect trying it outside of unix will cause you much pain. It relies on the fact that a unix standard is to push the argc and argv values right onto the top of the stack when a program starts. The example reads from the top of the stack until it finds the environ value (which is globally available at all times), and then reads back one value to get the pointer to argv. This way has the advantage that the values are correctly parsed for quotes and the like already. It makes the assumption that environ is the next thing on the stack after argv. I have seen many reports claiming this is always true, but I cannot find the location of an authoritative piece of documentation saying that it is specified true and will not change in future.

It isn’t often that you will need to do this. Most of the time, argv and argc are perfectly usable in the way you will probably have been using them for years. Even the examples above can be ‘worked around’ using initialisers called immediately after main() starts. But if one day, you come up against a problem where you need your argv where you usually don’t have access, I hope you find this post useful.

Tags: argv, assembler, C, classes, commandline, constructor, proc, shared objects
Posted in Programming | 8 Comments »

Handling misbehaving libraries in binary products

Tuesday, August 18th, 2009

It is a bit of an arcane artform, making a game that runs on all Linux versions. We do quite well at LGP but there is always room for improvement.

Recently, we came across an issue with PulseAudio (I will rant about Pulse in some other post). It became almost impossible, due to the architecture of alsa, and the combination of our build environment, and our higher level sound libraries, to make sound work – pretty much at all, in Linux games. Those of you with older games will know that for a while they all stopped having sound. Some of them we still haven’t released patches for yet (they will be forthcoming).

Now, one of the big problems was that sound libraries (such as OpenAL or SDL) go through their own motions of opening sound. We have no way of directly controlling what they do or how they do it. This is a common feature with many libraries, they will load their own dependencies in a way we cannot control.

The biggest problem is that OpenAL and SDL try to dlopen libasound, and on some machines, libasound doesn’t work with our binaries. On others, it can actually crash the whole game due to incompatibilities. This is a common issue when dealing with unknown system configurations when sending out a binary-only product into the world. You never know just what peoples systems will have installed, and just how they may confuse the situation. But you cannot just say ‘well, bad luck, it crashes on your machine’, people will be understandably upset.

We came up with a rather horrible system to make this work, and one that I hope may be of use to those of you who come here for our technical insights. This system is deliberately designed to work with libraries that have the potential to actually crash a program. As a simple example that many here will follow, I am using SDL’s audio system as the target, and using it to test whether ALSA works or crashes the program.

  static int system_asound_ok=-1;

  if (system_asound_ok==-1)
    {
      forkval=fork();
      switch (forkval)
        {
        case -1:
          //This is an error, we couldnt fork() so we just say no to
          //system openal working
          system_asound_ok=0;
        case 0:
          //This is the child, try and run the open, and then close and exit
          {
            //Your code here to remove whichever signal handlers you have set up
          }
          fail=SDL_InitSubSystem(SDL_INIT_AUDIO);
          if (!fail)
            SDL_QuitSubSystem(SDL_INIT_AUDIO);
          //MUST use _exit here not exit
          _exit(0);
          break;
        default:
          //This is the parent, wait for the child to exit somehow
          waitpid(forkval,&fail,0);

          //The child has now exited somehow, check how
          if (WIFEXITED(fail))
            {
              //The child exited ok, we can try and use the system library
              system_asound_ok=1;
            }
          else
            {
              //The child probably crashed, dont try and use the system library
              system_asound_ok=0;
            }
        }
    }

At this point you know whether the system asound (ALSA) library is safe to use.

This is a truly truly horrible system, when you actually have to allow an application to crash, you know something is wrong, but sometimes there is no option.

As a technical note, note the use of _exit() instead of exit(). This ensures that when the program ends, no atexit() functions are called. Using exit() could mean things like SDL’s cleanup being called, or who knows what else!

Signal handlers should be removed at the point in the code indicated. They interfere with the proper signaled exit of the program and will prevent WIFEXITED from detecting a crash properly. Bear in mind the signal removing only happens in the child process and will not affect signals in the parent.

So, what can you do knowing that the system works or doesn’t work?

Well, knowing that it works, you can simply run the initialisation as it stands and enjoy the working library.

If you know it fails, you can either skip this system altogether (in this case run without sound) or you can switch to another library. If you setenv an LD_LIBRARY_PATH this will influence the path that dlopen looks for libraries. LD_LIBRARY_PATH is searched before any other directories, and so will override the default location of the library. This means, in this instance, that SDL_InitSubSystem will dlopen the libasound.so in the location you have set the LD_LIBRARY_PATH to.

I hope this will relieve some headaches for some developers out there. It is always difficult making any application when you do not know what the target system will have installed on it. Of course, we can always say ’sorry, you need a specific version of this or that library’ but when you have just bought a game, that is the last thing you want to hear, and so – we have to jump through some hoops to stop that happening.

Tags: ALSA, crash, fork, libraries, openAL, PulseAudio, SDL, sound, _exit
Posted in Programming | 11 Comments »

IP MASQ shortcuts causing Grapple to fail

Saturday, July 4th, 2009

This is a technical article that goes to some depth into networking protocols. Reading this article, you will need some overall knowledge of how UDP networking works, and a basic idea about what NAT and IP MASQ is.

We’ve just spent quite a while investigating issues with a small number of games failing to make connections to perfectly good servers. Grapple, the LGP networking layer, is designed to work around the presence of firewalls, NAT, and all that good stuff, so that users can host a game from any machine on their network, whether it be exposed to the outside world or not.

So, when we found out that some people couldn’t connect to games, we were a bit concerned. After all, our code should be just fine for this. It has worked in all of our test scenarios.

Unfortunately we assumed that all networking at the kernel level worked. This was a mistake.

So, take a scenario, lets say host A (lets call it gateway) is the gateway server, running IP MASQ to NAT all servers on its local network. Host B (call it laptop) is a machine on the internal network.

Now, as we run through this article I will run you through some aspects of NAT, STUN, TURN, and other nice things about firewall and NAT handling. I wont be going through the exact protocols, but I’ll give you a bit of an idea how it works.

So, Laptop is the machine I want to use to start up a game server. I start a server and bind a UDP socket to port 2000 on all interfaces. I then use STUN to find out what kind of connection I am on. The method for STUN is shown below. stun

STUN tells me that the server is on a symmetric NAT. This means that the only things that can communicate with the server, are things that the server has talked to itself first.

This is the most restrictive type of NAT, under usual circumstances it will mean that it is effectively impossible to make connections to this laptop game server without making manual traffic routes through your NAT routing table on the gateway machine for each client. Any client connecting to the server will try and connect to the laptop, but any routing will be rejected because the laptop has not first connected to the client.

However, Grapple can handle this, as I will demonstrate.

So, on the gateway machine (it is important for later on to note that the gateway machine is also the client trying to connect to the laptop server to play the game), I start up a client. The client tries to directly connect to the server, sending a number of UDP connection requests. The server does not receive them.

This is not unexpected, as demonstrated earlier, the server is behind a symmetric NAT and the server hasn’t sent any data to the client, and so the servers routing will reject anything being sent to it from the client.

So, at this point it looks fairly hopeless, how can the server magically know the client is connecting to it.

Well, the answer lies in the STUN server. The laptop server has already talked to the STUN server and so the NAT routing tables allow the STUN server to send more data to the laptop.

So, the client gateway knows about the same STUN server, and so sends a request to the STUN server to ask the laptop server to connect to it.

The laptop receives this request, and connects to the address that the STUN server requests. The client should then, we hope, be able to communicate.

Now at this point this can go two ways.

1) Client is not behind a symmetric NAT or firewall

In this case, the client receives the data packet, and communication can start. We’re pretty much done handshaking, the client and server can talk to each other.

2) Client IS behind a symmetric NAT or firewall

The client will never receive the packet from the server. No communication is possible.

Now, in the instance in question, we discover the situation is an impossible one. The client receives the packet from the server, but then is unable to reply.

This is completely impossible, the routing table to the client has seen a packet to from the laptop to the gateway, and so should allow 2 way communication between the two machines (on the same ports). But for some reason – it wasn’t!

This was the sticking point. How could this be possible. We knew that the laptop worked this way because we had communicated with the STUN server successfully, and this relies on the fact that an outbound packet creates a tunnel back to the server. This is vitally important for all networking, or you would just send out data and never get anything back, rendering NAT pretty much useless.

The issue turned out to be the IP MASQ NAT on the gateway.

The laptop sent a packet to the external IP address of the client. This means that it should have gone out to the NAT layer, and been NAT’d to the external IP and port of the server socket communicating to the client. The client would have received a request that external IP 123.123.123.123 was trying to talk, and would respond to that address. The response would have gone through the same IP MASQ layer, and the server would get back a response from the location it expected.

HOWEVER

IP MASQ has decided in its infinite wisdom to take a short cut. It spots a packet will be routed to the gateway, and decides that, hey, they are on the same internal network. They can talk directly! And so, the packet presented to the client is not showing as coming from the outside internet, it is shown as coming from 10.2.2.13 (the internal IP address of the laptop). And so, when it gets this reply, it sends a response back. As the response never passes through the IP MASQ because it is going over the internal network, then the response to the server comes from 10.2.1.1 and as the server sent its data to 123.123.123.123 the servers own firewall (independent of the NAT) says ‘hey, I don’t know you’ and rejects the packet.

All because IP MASQ takes a short cut.

In the end, the problem is solved that if the client doesn’t receive further communication from the server, then it falls back to its final communication option, TURN. We have already demonstrated that the STUN server can talk to both sides, as both sides initiated communication with the server. This means that the STUN server can act as a relay between the client and the server, sending any and all data via this third party. However to be honest, we don’t like doing that as obviously it will increase latency of games, and also it means that any and all game traffic that has to resort to TURN has to be passed via our servers. A few games running at once, no problem, but more and more games means more and more bandwidth used, and we’d like to avoid that!

So in the end, the problem is solved, but really, if not for IP MASQ making a shortcut for efficiency, there would never have been a problem in the first place.

[Edit: I am adding the layout of the network below, to explain a little better]

network

Tags: firewall, Grapple, IP MASQ, kernel, NAT, UDP
Posted in Programming | 6 Comments »

Variable length C macros

Saturday, May 30th, 2009

Something that came up in porting X2 and X3 is that Visual Studio in Windows handles variable length macros, and as far as we could see, GCC didnt.

This means that in gcc, if you wanted to create a define that would do a job and report where it was called from, you would need to do something like:

void mydebug(const char *file,int line,const char *fmt,...)

#define DEBUGOUT1(x) mydebug(__FILE__,__LINE__,x)
#define DEBUGOUT2(x,y) mydebug(__FILE__,__LINE__,x,y)
#define DEBUGOUT3(x,y,z) mydebug(__FILE__,__LINE__,x,y,z)

and so on…

However, there is a solution which does the job, which we found after a fair amount of investigation. The recommended method would be

#define DEBUGOUT(x,y...) mydebug(__FILE__,__LINE__,x,y)

Which will work, but only if you actually add a second parameter to the DEBUGOUT call. Without, it will expand

DEBUGOUT(x)

mydebug(__FILE__,__LINE__,x,)

which will obviously fail to compile. To fix this, simply bear in mind that … in the macro just means everything else, so you can do

#define DEBUGOUT(x...) mydebug(__FILE__,__LINE__,x)

which then works with single and multiple values in macros, x becomes the fmt AND the … for the mydebug .

Tags: gcc, macro, visual studio
Posted in Programming | 5 Comments »

Being sure of your shared objects

Wednesday, April 15th, 2009

This short article is all about how to be sure you know exactly what your application is doing with shared objects.

It can be quite easy sometimes to not know which objects are being used by a program. Sure, you can ldd a binary and find out what it links to, but which order are they being used, and what about when the program dlopens something, as many do. How can you easilly find out what it is using and how it finds it?

Simply, you can run an application as such:

LD_DEBUG=libs ./yourapp

This will give you pages of output telling you everything that the application is doing with shared objects, and you will note that each time it opens a shared object, it will report this with ‘init’, leaving you able to simply grep to see which .so’s it is using.

Hopefully this helps people that are trying to work out just quite what their apps are doing sometimes!

Tags: shared objects
Posted in Programming | 1 Comment »

A little-known trick with C switches

Tuesday, March 3rd, 2009

OK, so this little programming post is nothing like rocket science, it really isn’t. It is, however, REALLY useful, and something that almost nobody knows about the C language. We came across it when we were porting a game a few years ago, and none of us had ever seen it before. We tried it, and it works!

So, lets take a C switch statement

switch (somenumber)
{
case 1:
case 2:
case 3:
  do_something();
  break;
case 4:
case 5:
case 6:
  do_something else();
  break;
}

Fairly standard, nothing too unusual here. However there is another way to do it!

switch (somenumber)
{
case 1 ... 3:
  do_something();
  break;
case 4 ... 6:
  do_something else();
  break;
}

Unfortunately, it doesn’t seem to work on anything other than C/C++, which is a shame!

Enjoy!

Tags: C, switch
Posted in Programming | 7 Comments »

How to search a data tree efficiently

Monday, February 23rd, 2009

This nice little method is one we used for PenguinPlay.

Assumptions: You know some C, and understand tree structures

If you have a whole load of data, lets say a hierarchy of categories with items in them, like in an online store, it is common to store this data in a tree. This allows you to move up and down the tree logically to access the data in the same way it is visualised.

With the logical way of doing this, there comes a problem of scalability. How do you find all of the categories that are below a certain point in the tree. Lets take a tree as an example

            +----(2)Windows
(1)Games----|               +----(5)Strategy
            +----(3)Linux---|                       +----(7)Action
            |               +----(6)First Person----|
            +----(4)Mac                             +----(8)RPG

Now to find all categories that are children of Linux, how do you do this?

Well, most of the time, the simplest way to write some code for this will find all of the categories that are children of Linux (5,6) and then for each of these find all of the children of these (7,8) and so on and so forth down the hierarchy.

This isn’t too inefficient in smaller trees, but imagine having ten levels, each with five branches at each level. You very quickly get an impossible number of items to deal with using this method.

So, how do we do this efficiently?

The solution is quite simple. It takes a moment to understand, and then it just makes you think ‘Why didn’t I think of that!’

Initially lets draw the above example again, but in a different way

+----------+
|          |
|          +------------+
|          | (2)Windows |
|          +------------+
|          |
|          +------------+
|          |            |
|          |            +-----------------+
|          |            | (5)Strategy     |
|          |            +-----------------+
|          |            |
|          |            +-----------------+
|          |            |                 |
|          |            |                 +-----------+
|          | (3)Linux   |                 | (7)Action |
| (1)Games |            | (6)First Person +-----------+
|          |            |                 |
|          |            |                 +-----------+
|          |            |                 | (8)RPG    |
|          |            |                 +-----------+
|          |            |                 |
|          |            +-----------------+
|          |            |
|          +------------+
|          |
|          +------------+
|          | (4)Mac     |
|          +------------+
|          |
+----------+

As you can see, if you look at it, the layout above shows the exact same data that is shown in the first tree, just displayed differently.

So, now onto how this helps us. Firstly, take your first category, (1)Games. We assign each category a minimum and maximum number. As we are starting, we use 1 and 2

1 +---------+
  | (1)Game |
2 +---------+

Simple enough. Now, lets add the first child, the (2)Windows. Now, as this is a child of (1)Game, then its minimum and maximum must be inside of the min and max for its parent. So, we simply change the maximum of (1)Game to 4, and set the min of (2)Windows to be 2 and the max to be 3

1  +----------+
   |          |
2  |          2------------+
   | (1)Games | (2)Windows |
3  |          3------------+
   |          |
4  +----------+

Now, lets add in category 3 and 4, in each case, we insert the categories with min and max values that are inside of the parents, and not overlapping with any other children of that parent. Like so.

1  1----------+
   |          |
2  |          2------------+
   |          | (2)Windows |
3  |          3------------+
   |          |
4  |          4------------+
   | (1)Games | (3)Linux   |
5  |          5------------+
   |          |
6  |          6------------+
   |          | (4)Mac     |
7  |          7------------+
   |          |
8  8----------+

Simple enough, but what does that get us? Not a lot yet. Lets go a bit further down the tree, and add some children to the Linux category, and see what happens.

So, using the same rule, any child we add must have a min and max inside its parents, and its parent must expand to accommodate it. Lets add (5)strategy in as a child of Linux.

1  1----------+
   |          |
2  |          2------------+
   |          | (2)Windows |
3  |          3------------+
   |          |
4  |          4------------+
   |          |            |
5  |          |            5-------------+
   | (1)Games | (3)Linux   | (5)Strategy |
6  |          |            6-------------+
   |          |            |
7  |          7------------+
   |          |
8  |          8------------+
   |          | (4)Mac     |
9  |          9------------+
   |          |
10 10---------+

Now, by adding the child to Linux, you have to change lots of mins and maxes. It looks like it would be a real pain to keep track of, but simply, you can just go through a flat list of all categories, and add 2 to each min and max that are greater than or equal to the old maximum of the parent. So in this case, the Linux category min/max were previously 4 and 5, so anything higher than a 4, add 2 to it, and then you have 2 spare numbers for the min/max for the new category. Simple enough. Now we can add in the final categories, and the tree looks like it did in the example further up, but with each category given a min and max number

 1 1----------+
   |          |
 2 |          2------------+
   |          | (2)Windows |
 3 |          3------------+
   |          |
 4 |          4------------+
   |          |            |
 5 |          |            5-----------------+
   |          |            | (5)Strategy     |
 6 |          |            6-----------------+
   |          |            |
 7 |          |            7-----------------+
   |          |            |                 |
 8 |          |            |                 8-----------+
   |          | (3)Linux   |                 | (7)Action |
 9 | (1)Games |            | (6)First Person 9-----------+
   |          |            |                 |
10 |          |            |                 10----------+
   |          |            |                 | (8)RPG    |
11 |          |            |                 11----------+
   |          |            |                 |
12 |          |            12----------------+
   |          |            |
13 |          13-----------+
   |          |
14 |          14-----------+
   |          | (4)Mac     |
15 |          15-----------+
   |          |
16 16---------+

OK, so there we have it, we have a tree, showing a hierarchy, and all of the categories have numbers for a min and a max value. All numbers are unique, and all numbers belonging to a child are inside the scope of their parents. Great, but how does this help us?

Take a look. If you want a list of all categories that are children of Linux (min=4, max=13) then simply, get a list of all categories whose min is more than 4 and whose max is less than 13

Simple. In a single pass through a flat list of categories, you obtain a full list of all children. No matter how deep the tree system goes, as long as you keep to the rules of calculating the min and max values, this will always work.

It also works the other way. If say, you want a list of all parents of the RPG category (min=10,max=11), simply look for all categories whose min is less than 10 and whose max is more than 11, and you get the entire ancestry of that category. All in one pass.

This all comes in especially useful if you combine it with SQL. Complete tree segments can be obtained from a single efficient SQL query. Queries are easy to generate that can find, using the above tree as an example, all Linux games of any category!

I hope some people found this useful, and I hope that some of you had the reaction I had when someone first introduced this to me several years ago – ‘Thats very cool!’

Tags: algorithm, C, tree
Posted in Programming | 1 Comment »

64 bit games on the horizon, but not here yet.

Wednesday, February 11th, 2009

As modern machines are moving more and more towards 64 bit, we are receiving a growing number of requests for 64 bit binaries of our games. This semi-technical article will illustrate how we do plan to create 64 bit binaries in the end, and to give you an overview as to why we haven’t done so before.

The first stumbling block to 64 bit, is the build environment. The build environment is absolutely essential, and it is what allows us to create a game that works on all Linux distros. Over many years, we have tuned our 32 bit system to find a set of libraries and build tools that are cross-distro compatible with pretty much any Linux distro out there. The discussion of this in detail will be for another day, but finding a comparable cross-distro environment for 64 bit is the work of many months of solid research, costing tens of thousands of pounds.

The second problem is the code itself. Only in the last 12 months or so have a few games started to come out for Windows with 64 and 32 bit versions. Most games, even today, are 32 bit only. This means that assumptions have been made about the sizes of pointers, and sizes of various data types, that do not hold true for 64 bit. The biggest issue there is the size of pointers. Various aspects of the game rely on a pointer being 4 bytes in size, and suddenly throwing an 8 byte pointer into the mix is guaranteed to cause problems. This is something that, depending on the way the program is written, is incredibly hard to work around.

Data types such as long int, are easier to work around. Simply replacing all long int variables to be an int32_t will solve that issue with no problem – unless of course the game uses pointer arithmetic. I have never seen a game that does not use pointer arithmetic. This means we cannot make blanket substitutions, every variable in the whole program, sometimes millions of lines, tens of thousands of variables, needs to be examined in depth.

Once we have all of these factors addressed, we will be in a position to produce a 64 bit version of a game. An estimated 3-6 months advance work making a build environment template, probably 25% extra development time increase for the project, and we end up with a speed increase that is not large enough to warrant the extra development time. Our research shows that 64 bit versions of games would run around 5% faster than 32 bit, and you would get more of a performance increase by closing down your web browser when playing, than by running 64 bit.

With the impatience our customers have for new games, we have for now made the call to bring the game to release earlier, than to have a 64 bit version. As and when we license a game that is available as source for 64 and 32 bit, we will certainly spend the time doing the work to build an appropriate build environment, so that we can take advantage of the source, but the costs and risks of doing this to an entire source base that is not designed for it are too high to justify.

Tags: 32 bit, 64 bit, build environment
Posted in General, Programming | 4 Comments »