
4. Porting and Compiling

4.1 Automatically defined symbols

You can find out what symbols your version of gcc defines automatically by running it with the -v switch. For example, mine does:

$ echo 'main(){printf("hello world\n");}' | gcc -E -v -
Reading specs from /usr/lib/gcc-lib/i486-box-linux/2.7.2/specs
gcc version 2.7.2
 /usr/lib/gcc-lib/i486-box-linux/2.7.2/cpp -lang-c -v -undef
-D__GNUC__=2 -D__GNUC_MINOR__=7 -D__ELF__ -Dunix -Di386 -Dlinux
-D__ELF__ -D__unix__ -D__i386__ -D__linux__ -D__unix -D__i386
-D__linux -Asystem(unix) -Asystem(posix) -Acpu(i386)
-Amachine(i386) -D__i486__ -

If you are writing code that uses Linux-specific features, it is a good idea to enclose the nonportable bits in

#ifdef __linux__
/* ... funky stuff ... */
#endif /* __linux__ */

Use __linux__ for this purpose, not linux. Although the latter is defined, it is not POSIX compliant.

4.2 Compiler invocation

The documentation for compiler switches is the gcc info page (in Emacs, use C-h i then select the `gcc' option). Your distributor may not have packaged this with your system, or you may have an old version; the best thing to do in this case is to download the gcc source archive from ftp://prep.ai.mit.edu/pub/gnu or one of its mirrors, and copy the info files out of it.

The gcc manual page (gcc.1) is, generally speaking, out of date. It will warn you of this when you try to look at it.

Compiler flags

gcc can be made to optimize its output code by adding -On to its command line, where n is an optional small integer. Meaningful values of n, and their exact effect, vary according to the exact version, but typically it ranges from 0 (no optimization) to 2 (lots) or 3 (lots and lots).

Internally, gcc translates these to a series of -f and -m options. You can see exactly which -O levels map to which options by running gcc with the -v flag and the (undocumented) -Q flag. For example, for -O2, mine says

enabled: -fdefer-pop -fcse-follow-jumps -fcse-skip-blocks
         -fexpensive-optimizations
         -fthread-jumps -fpeephole -fforce-mem -ffunction-cse -finline
         -fcaller-saves -fpcc-struct-return -frerun-cse-after-loop
         -fcommon -fgnu-linker -m80387 -mhard-float -mno-soft-float
         -mno-386 -m486 -mieee-fp -mfp-ret-in-387

Using an optimization level higher than your compiler supports (e.g. -O6) will have exactly the same effect as using the highest level that it does support. Distributing code which is set to compile this way is a poor idea though --- if further optimisations are incorporated into future versions, you (or your users) may find that they break your code.

Users of gcc 2.7.0 through 2.7.2 should note that there is a bug in -O2 on these versions. Specifically, strength reduction doesn't work. A patch is available to fix this if you feel like recompiling gcc; otherwise, make sure that you always compile with -fno-strength-reduce.

Processor-specific

There are other -m flags which aren't turned on by any variety of -O but are nevertheless useful. Chief among these are -m386 and -m486, which tell gcc to favour the 386 or 486 respectively. Code compiled with one of these will still work on the other; 486 code is bigger, but otherwise not slower on the 386.

There is currently no -mpentium or -m586. Linus suggests using -m486 -malign-loops=2 -malign-jumps=2 -malign-functions=2, to get 486 code optimisations but without the big gaps for alignment (which the pentium doesn't need). Michael Meissner (of Cygnus) says

My hunch is that -mno-strength-reduce also results in faster code on the x86 (note, I'm not talking about the strength reduction bug, which is another issue). This is because the x86 is rather register starved (and GCC's method of grouping registers into spill registers vs. other registers doesn't help either). Strength reduction typically results in using additional registers to replace multiplications with addition. I also suspect -fcaller-saves may also be a loss.
Another hunch is that -fomit-frame-pointer might or might not be a win. On the one hand, it can mean that another register is available for allocation. On the other hand, the way the x86 encodes its instruction set, means that stack relative addresses take more space instead of frame relative addresses, which means slightly less Icache available to the program. Also, -fomit-frame-pointer, means that the compiler has to constantly adjust the stack pointer after calls, while with a frame, it can let the stack accumulate for a few calls.

The final word on this subject is from Linus again:

Note that if you want to get optimal performance, don't believe me: test. There are lots of gcc compiler switches, and it may be that a particular set gives the best optimizations for you.

Internal compiler error: cc1 got fatal signal 11

Signal 11 is SIGSEGV, or `segmentation violation'. Usually it means that the program got its pointers confused and tried to write to memory it didn't own. So, it could be a gcc bug.

gcc is, however, a well-tested and reliable piece of software, for the most part. It also uses a large number of complex data structures, and an awful lot of pointers. In short, it's the pickiest RAM tester commonly available. If you can't duplicate the bug --- if it doesn't stop in the same place when you restart the compilation --- it's almost certainly a problem with your hardware (CPU, memory, motherboard or cache). Don't claim it as a bug because your computer passes the power-on checks or runs Windows OK or whatever; these `tests' are commonly and rightly held to be worthless. And don't claim it's a bug because a kernel compile always stops during `make zImage' --- of course it will! `make zImage' is probably compiling over 200 files; we're looking for a slightly smaller place than that.

If you can duplicate the bug, and (better) can produce a short program that exhibits it, you can submit it as a bug report to the FSF, or to the linux-gcc mailing list. See the gcc documentation for details of exactly what information they need.

4.3 Portability

It has been said that, these days, if something hasn't been ported to Linux then it is not worth having :-)

Seriously though, in general only minor changes are needed to the sources to get over Linux's 100% POSIX compliance. It is also worthwhile passing any changes back to the authors of the code, so that in the future only `make' need be called to produce a working executable.

BSDisms (including bsd_ioctl, daemon and <sgtty.h>)

You can compile your program with -I/usr/include/bsd and link it with -lbsd (i.e. add -I/usr/include/bsd to CFLAGS and -lbsd to the LDFLAGS line in your Makefile). There is no need to add -D__USE_BSD_SIGNAL any more if you want BSD type signal behavior, as you get this automatically when you have -I/usr/include/bsd and include <signal.h>.

`Missing' signals (SIGBUS, SIGEMT, SIGIOT, SIGTRAP, SIGSYS etc)

Linux is POSIX compliant. These are not POSIX-defined signals --- ISO/IEC 9945-1:1990 (IEEE Std 1003.1-1990), paragraph B.3.3.1.1 sez:

``The signals SIGBUS, SIGEMT, SIGIOT, SIGTRAP, and SIGSYS were omitted from POSIX.1 because their behavior is implementation dependent and could not be adequately categorized. Conforming implementations may deliver these signals, but must document the circumstances under which they are delivered and note any restrictions concerning their delivery.''

The cheap and cheesy way to fix this is to redefine these signals to SIGUNUSED. The correct way is to bracket the code that handles them with appropriate #ifdefs:

#ifdef SIGSYS
/* ... non-posix SIGSYS code here .... */
#endif

K & R Code

GCC is an ANSI compiler; much existing code is not ANSI. There's really not much that can be done about this, except to add -traditional to the compiler flags. There is a certain amount of finer-grained control over which varieties of brain damage to emulate; consult the gcc info page.

Note that -traditional has effects beyond just changing the language that gcc accepts. For example, it turns on -fwritable-strings, which moves string constants into data space (from text space, where they cannot be written to). This increases the memory footprint of the program.

Preprocessor symbols conflict with prototypes in the code

One of the most frequent problems is that some common functions are defined as macros in Linux's header files and the preprocessor will refuse to parse similar prototype definitions in the code. Common ones are atoi() and atol().

sprintf()

Something to be aware of, especially when porting from SunOS, is that sprintf(string, fmt, ...) returns a pointer to string on many unices, whereas Linux (following ANSI) returns the number of characters which were put into the string.

fcntl and friends. Where are the definitions of FD_* stuff?

In <sys/time.h>. If you are using fcntl you probably want to include <unistd.h> too, for the actual prototype.

Generally speaking, the manual page for a function lists the necessary #includes in its SYNOPSIS section.

The select() timeout. Programs start busy-waiting.

Once upon a time, the timeout parameter to select() was used read-only. Even then, manual pages warned:

select() should probably return the time remaining from the original timeout, if any, by modifying the time value in place. This may be implemented in future versions of the system. Thus, it is unwise to assume that the timeout pointer will be unmodified by the select() call.

The future has arrived! At least, it has here. On return from a select(), the timeout argument will be set to the remaining time that it would have waited had data not arrived. If no data arrives, this will be zero, and later calls using the same timeout structure will return immediately.

To fix, put the timeout value into that structure every time you call select(). Change code like

      struct timeval timeout;
      timeout.tv_sec = 1; timeout.tv_usec = 0;
      while (some_condition)
            select(n,readfds,writefds,exceptfds,&timeout); 
to, say,
      struct timeval timeout;
      while (some_condition) {
            timeout.tv_sec = 1; timeout.tv_usec = 0;
            select(n,readfds,writefds,exceptfds,&timeout);
      }

Some versions of Mosaic were at one time notable for this problem. The speed of the spinning globe animation was inversely related to the speed at which data was coming in from the network!

Interrupted system calls.

Symptom:

When a program is stopped using Ctrl-Z and then restarted - or in other situations that generate signals: Ctrl-C interruption, termination of a child process etc. - it complains about "interrupted system call" or "write: unknown error" or things like that.

Problem:

POSIX systems check for signals a bit more often than some older unices. Linux may execute signal handlers ---

For other operating systems you may have to include the system calls creat(), close(), getmsg(), putmsg(), msgrcv(), msgsnd(), recv(), send(), wait(), waitpid(), wait3(), tcdrain(), sigpause(), semop() to this list.

If a signal (that the program has installed a handler for) occurs during a system call, the handler is called. When the handler returns (to the system call) it detects that it was interrupted, and immediately returns with -1 and errno = EINTR. The program is not expecting that to happen, so bottles out.

You may choose between two fixes.

(1) For every signal handler that you install, add SA_RESTART to the sigaction flags. For example, change

  signal (sig_nr, my_signal_handler);
to
  signal (sig_nr, my_signal_handler);
  { struct sigaction sa;
    sigaction (sig_nr, (struct sigaction *)0, &sa);
#ifdef SA_RESTART
    sa.sa_flags |= SA_RESTART;
#endif
#ifdef SA_INTERRUPT
    sa.sa_flags &= ~ SA_INTERRUPT;
#endif
    sigaction (sig_nr, &sa, (struct sigaction *)0);
  }

Note that while this applies to most system calls, you must still check for EINTR yourself on read(), write(), ioctl(), select(), pause() and connect(). See below.

(2) Check for EINTR explicitly, yourself:

Here are two examples, one for read() and one for ioctl().

Original piece of code using read()

int result;
while (len > 0) { 
  result = read(fd,buffer,len);
  if (result < 0) break;
  buffer += result; len -= result;
}
becomes

int result;
while (len > 0) { 
  result = read(fd,buffer,len);
  if (result < 0) { if (errno != EINTR) break; }
  else { buffer += result; len -= result; }
}
and a piece of code using ioctl()

int result;
result = ioctl(fd,cmd,addr);
becomes
int result;
do { result = ioctl(fd,cmd,addr); }
while ((result == -1) && (errno == EINTR));

Note that in some versions of BSD Unix the default behaviour is to restart system calls. To get system calls interrupted you have to use the SV_INTERRUPT or SA_INTERRUPT flag.

Writable strings (program seg faults randomly)

GCC has an optimistic view of its users, believing that they intend string constants to be exactly that --- constant. Thus, it stores them in the text (code) area of the program, where they can be paged in and out from the program's disk image (instead of taking up swapspace), and any attempt to rewrite them will cause a segmentation fault. This is a feature!

It may cause a problem for old programs that, for example, call mktemp() with a string constant as argument. mktemp() attempts to rewrite its argument in place.

To fix, either (a) compile with -fwritable-strings, to get gcc to put constants in data space, or (b) rewrite the offending parts to allocate a non-constant string and strcpy the data into it before calling.

Why does the execl() call fail?

Because you're calling it wrong. The first argument to execl is the program that you want to run. The second and subsequent arguments become the argv array of the program you're calling. Remember: argv[0] is traditionally set even when a program is run with `no' arguments. So, you should be writing

execl("/bin/ls","ls",NULL);
not just
execl("/bin/ls", NULL);

Executing the program with no arguments at all is construed as an invitation to print out its dynamic library dependencies, at least using a.out. ELF does things differently.

(If you want this library information, there are simpler interfaces; see the section on dynamic loading, or the manual page for ldd).

