Inline Assembler Lab

Let’s look at the use of inline assembler in a C program.

Assembly-language code can be embedded into a C (or other high-level languages) source file in order to optimize a certain part of the program by using processor-specific instructions that are faster than what the compiler would create out of functionally-equivalent C code. Moreover, inline assembly code can be used for hardware-specific tasks that would be impossible with high-level code.

An example of inline assembly code can be found in K9Copy, a DVD backup tool. In the file src/dvdread/md5.h, there is the following piece of code:

#if defined __GNUC__ && defined __i386__
static md5_uint32
rol(md5_uint32 x, int n)
{
  __asm__("roll %%cl,%0"
          :"=r" (x)
          :"0" (x),"c" (n));
  return x;
}
#else
# define rol(x,n) ( ((x) << (n)) | ((x) >> (32-(n))) )
#endif

If the program is compiled with a GNU compiler and on an i386 processor, the function rol(md5_uint32 x, int n) is defined. It contains the assembly instruction roll (rotate left long), which rotates the 32-bit integer x to the left by n places, bringing bits that fall off the end back to the beginning. If the program is compiled with a different compiler or on a different processor, the function is instead replaced with a macro function that achieves the same result through bit shifts; it would use multiple instructions whereas the version using inline assembly uses only one.

This, I think, is a good use of inline assembly: it’s there to be used if it can make the executable faster, and if it can’t be used then there is an alternative. Unless there are machines that have the __i386__ macro defined but no rol function available (which I don’t think would be the case?), there are no problems with portability. The concept is sound. However, this particular source file was written in 1999. Modern compilers would almost certainly recognize the second function (that is, the macro function using bit shifts) and replace it with a rol instruction if possible. Using inline assembly for optimization is not so (relatively) straightforward a job as one might think, or as it would have been 20 years ago. Most attempts at optimization with inline assembly would probably be surpassed by the compiler’s optimizations; at worst, inline assembly might even impede the compiler and result in an executable that performs worse than it would have had the compiler been left to its own devices. There are, I’m sure, exceptions, and in some situations it could be the best course of action to use inline assembly in an optimizing capacity, but it probably shouldn’t be the first resort.

Inline assembly code should, of course, still be used for certain platform-specific operations or for tasks that cannot be done with high-level languages–those things, at least for the moment, lie beyond the powers of the compiler.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s