The C programming language has enjoyed remarkable success as a "portable assembly language", although of course one should not take that phrase too literally. But C programs are not automatically portable. The programmer who wishes his or her programs to be portable must take a modicum of time and trouble to make them so. As Kernighan and Ritchie point out in the introduction to The C Programming Language, "With a little care it is easy to write portable programs that can be run without change on a variety of hardware. The standard makes portability issues explicit, and prescribes a set of constants that characterize the machine on which the program is run."
Character Sets
Not all machines use EBCDIC! Some computers use an alternative character set known as ASCII, and others use Unicode or other "wide" character sets. Nor are these the only possibilities that C acknowledges. The C Standard does allow us to make a few basic assumptions about character representations, though. Firstly, we can be absolutely sure that the following characters are available to us in the source and execution character sets (although not necessarily in the order that is given here):
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
a b c d e f g h i j k l m n o p q r s t u v w x y z
0 1 2 3 4 5 6 7 8 9
! " # % & ' ( ) * + , - . / : ; < = > ? [ \ ] ^ _ { | } ~
as well as "the space character, and control characters representing horizontal tab, vertical tab, and form feed". In the execution character set, we are also guaranteed characters to represent "alert, backspace, carriage return, and new line".
The Standard guarantees that "the value of each character after 0 in the above list of decimal digits shall be one greater than the value of the previous", which gives us a useful way of converting any character into a number, provided we know that that character represents a decimal digit: subtract '0' from it. And of course there is a way to reverse the process (provided we know that the number is in the range 0 to 9): add '0' to it.
We are further assured that the values ("coding points") of all of the above characters are positive. This in itself turns out to be highly useful information! Whilst it is true that the digits are guaranteed to have coding points that are sequential and in a sane order, the same is not true, alas, for the letters of the alphabet. We should take care not to assume such an ordering if we wish our code to be portable to character sets in which the ordering does not exist.
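The digit guarantee, and the lack of any such guarantee for letters, can be sketched in a few lines. This is only an illustration: the function names digit_value, digit_char and lowercase_index are made up for this example and come from no library.

```c
#include <string.h>

/* Hypothetical helper: convert a decimal digit character to its value.
   Returns -1 if ch is not one of '0'..'9'. */
int digit_value(char ch)
{
    if (ch >= '0' && ch <= '9') {
        return ch - '0';        /* the digits are guaranteed contiguous */
    }
    return -1;
}

/* Hypothetical helper: convert a number in the range 0..9 back to its
   digit character. Returns '\0' if n is out of range. */
char digit_char(int n)
{
    if (n >= 0 && n <= 9) {
        return (char)('0' + n);
    }
    return '\0';
}

/* Letters are NOT guaranteed to be contiguous, so we look the letter up
   in a string instead of computing ch - 'a'.
   Returns 0..25, or -1 if ch is not a lowercase letter. */
int lowercase_index(char ch)
{
    static const char letters[] = "abcdefghijklmnopqrstuvwxyz";
    const char *p;

    if (ch == '\0') {
        return -1;              /* strchr would match the terminator */
    }
    p = strchr(letters, ch);
    return p ? (int)(p - letters) : -1;
}
```

The lookup in lowercase_index costs a little more than a subtraction, but it gives the same answer on an EBCDIC machine as on an ASCII one.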
Bits and Bytes
C guarantees that a byte is at least 8 bits wide, but not that it is exactly 8 bits wide. Incidentally, there really are systems with bytes that are wider than 8 bits. For example, typical modern digital signal processor (DSP) boards have 16- or even 32-bit bytes. Such boards are often used in devices such as set-top boxes. C is commonly used in such environments.
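Rather than hard-coding 8, a program can ask the implementation how wide its bytes are: the macro CHAR_BIT in <limits.h> gives the number of bits in a byte. The sketch below is illustrative only; BITS_IN, bits_in_unsigned and count_uchar_bits are names invented for this example.

```c
#include <limits.h>
#include <stddef.h>

/* Number of bits of storage occupied by a type (hypothetical macro). */
#define BITS_IN(type) (sizeof(type) * CHAR_BIT)

/* Storage bits in an unsigned int on this implementation. */
size_t bits_in_unsigned(void)
{
    return BITS_IN(unsigned int);
}

/* Count the bits in a byte the slow way, by shifting an all-ones
   unsigned char until nothing is left. unsigned char has no padding
   bits, so the result always equals CHAR_BIT. */
int count_uchar_bits(void)
{
    unsigned char c = UCHAR_MAX;
    int n = 0;

    while (c != 0) {
        n++;
        c >>= 1;
    }
    return n;
}
```

On a desktop machine count_uchar_bits will almost certainly return 8; on a DSP of the kind mentioned above it may well return 16 or 32.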
Guaranteed Minimum Data Type Sizes and Ranges
Don't assume that
Note: if
In particular, we should not assume that
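The minimum ranges that the Standard guarantees can be summarised in a comment and tested against in code. What follows is a small sketch, not a complete treatment: fits_in_any_int and fits_in_this_int are hypothetical names, and the figures in the comment are the Standard's minimum magnitudes for the limits in <limits.h>.

```c
#include <limits.h>

/* Guaranteed minimum ranges (an implementation may offer more):
     signed char     -127 .. 127
     unsigned char      0 .. 255
     short, int    -32767 .. 32767
     unsigned short,
     unsigned int       0 .. 65535
     long     -2147483647 .. 2147483647
     unsigned long      0 .. 4294967295                               */

/* Returns 1 if value fits in an int on EVERY conforming implementation,
   that is, within the guaranteed minimum range. */
int fits_in_any_int(long value)
{
    return value >= -32767L && value <= 32767L;
}

/* Returns 1 if value fits in an int on THIS implementation, which may
   be more generous than the minimum. */
int fits_in_this_int(long value)
{
    return value >= INT_MIN && value <= INT_MAX;
}
```

Code that stores 40000 in an int passes the second test on most desktop compilers and fails the first; only the first test says anything about portability.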
Implementation Namespace
The Standard reserves rather more identifiers to the implementation than most people realise. Whilst specifying the exact rules here would be possible, it would be terribly tedious (and anyway, that's what the Standard is for!). Instead, I'm going to give you some rules of thumb that should keep you out of naming trouble. These guidelines are more restrictive than the Standard requires, because they are designed to be easy to remember. Avoid using any identifier that starts with:
Life would be much simpler if all of C's reserved identifiers started with an underscore but, alas, they don't. Hence the other rules. Now, I know that it's sometimes tempting to use a leading underscore, but it is a temptation best resisted.
In fact, the easiest way to skip around the implementation's namespace is to define your own. Alas, C doesn't have a C++-style namespace feature; consequently, we can't define our own namespaces quite as elegantly as we might wish. Nevertheless, there is a crude, but effective, way to fence off a namespace: the use of a prefix.
For my CLINT library (the library With No Name), I used prefixes of exactly this kind. Whilst it's true that someone using the CLINT library might also use some other library that uses the same prefixes, it's not terribly likely; and even if it does happen, the fix is relatively obvious (a massive, automated search and replace operation, to substitute a fresh prefix). (Incidentally, CLINT is no longer available. The design sucked.)
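To make the prefix idea concrete, here is a small sketch for an imaginary library called "acme". The prefixes acme_ and ACME_ are invented for this example (they are not the ones CLINT used); the point is only that every name the library exposes starts the same way.

```c
#include <stddef.h>

/* Every macro the hypothetical library exposes starts with ACME_ ... */
#define ACME_STACK_CAPACITY 16

/* ... and every type and function starts with acme_. */
typedef struct acme_stack {
    int    items[ACME_STACK_CAPACITY];
    size_t count;
} acme_stack;

/* Make the stack empty. */
void acme_stack_init(acme_stack *s)
{
    s->count = 0;
}

/* Push a value. Returns 1 on success, 0 if the stack is full. */
int acme_stack_push(acme_stack *s, int value)
{
    if (s->count >= ACME_STACK_CAPACITY) {
        return 0;
    }
    s->items[s->count++] = value;
    return 1;
}

/* Pop a value into *value. Returns 1 on success, 0 if the stack is empty. */
int acme_stack_pop(acme_stack *s, int *value)
{
    if (s->count == 0) {
        return 0;
    }
    *value = s->items[--s->count];
    return 1;
}
```

A user of this library who already has a function called stack_push of their own gets no clash, and anyone reading the calling code can see at a glance which names belong to the library.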
Just to warn you off them, I also use
Linkers
Um, I'm not sure how you're going to take this. The original C Standard recognises the existence of linkers that require external identifiers (basically, functions and file scope objects with external linkage) to be unique in the first six characters! And even linkers that are case-insensitive. Frankly, rather than make your code almost unreadable in a bid to satisfy these Draconian limitations, I have a much better idea -- get a better linker. But if you can't do that, then you are going to be very much more restricted than most people when it comes to naming functions. (On this site, I have more or less ignored such foolishness.) The C99 standard is rather less restrictive: it requires at least 31 significant initial characters in an external identifier.
Abstracting Non-Portable Behaviour
When you have to use non-portable constructs, it can be helpful to encapsulate them into a library, so that your application code can be written completely portably. All you have to do, for each new platform, is to rewrite the library.
Let's take as a very simple example the problem of finding the length of a file. This isn't actually as simple as it sounds, because the concept of "file length" is more complex than most people give it credit for, but we'll assume for the purposes of this example that we simply mean the number that the platform itself reports as the size of the file.
We will further assume that we can identify the file via a filename, and our file-length-getting function will accept this filename as a parameter (rather than a pointer to a file stream).
Let's look at how we might get the file length in a Windows console program:
How about Linux?
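On Linux, and on POSIX systems generally, one way to do it is to ask stat() for the file's size. The following is a minimal sketch under that assumption; posix_file_length is a name made up for this example, and the cast assumes that the size fits in a long.

```c
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical helper for POSIX systems such as Linux.
   Returns the size in bytes that stat() reports for the named file,
   or -1 if the file cannot be examined. */
long posix_file_length(const char *filename)
{
    struct stat st;

    if (stat(filename, &st) != 0) {
        return -1L;             /* no such file, no permission, etc. */
    }
    return (long)st.st_size;    /* assumes the size fits in a long */
}
```

Note that neither <sys/stat.h> nor stat() is part of ISO C, which is precisely why code like this needs fencing off from the rest of the program.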
(We'll ignore the Windows API version, since it won't add anything significant to the discussion.) Now, we don't really want to pepper our code with N different versions of the above code, for N different compilers, do we? But the solution is very simple. In fact, there are at least two, and I'll describe them both here. Both have one thing in common, though, so I'll deal with that first.
The principle that is common to both is this: we define a
new function (which I'll call
In each case,
(If you're quick, you'll have spotted that I have assumed the
length of the file can be represented in a
Conditional Preprocessing
This is my least favourite option. It goes like this:
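A minimal sketch of the conditional approach might look like the following. The function name get_file_length is a placeholder chosen for this example, the _WIN32 test assumes a compiler that predefines that macro when targeting Windows (Microsoft's does), and the other branch assumes a POSIX system.

```c
#include <sys/types.h>
#include <sys/stat.h>

/* get_file_length is a hypothetical name for the wrapper function.
   Returns the length of the named file in bytes, or -1 on failure.
   Assumes the length fits in a long. */
long get_file_length(const char *filename)
{
#if defined(_WIN32)
    /* Windows compilers: the Microsoft runtime spells it _stat. */
    struct _stat st;

    if (_stat(filename, &st) != 0) {
        return -1L;
    }
#else
    /* Assumed to be a POSIX system such as Linux. */
    struct stat st;

    if (stat(filename, &st) != 0) {
        return -1L;
    }
#endif
    return (long)st.st_size;
}
```

With two platforms this is still readable. Each further platform adds another branch to every such function, which is where the trouble starts.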
Separate Source Files For Each Platform
Personally, I think that littering source code with preprocessor directives is ugly, and hard to maintain. I much prefer separating out the code that belongs to, say, the Linux platform, and putting that code into the Linux version of the source file.
Doing things this way involves a little more care in file management, but (at least in my opinion!) gives much more readable code.
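One possible layout is sketched below: a single header that every platform shares, plus one source file per platform. The file names filelen.h, filelen_posix.c and filelen_win.c, and the function name get_file_length, are all invented for this example. The two parts are shown in one listing so that it compiles as it stands; in practice each would live in its own file.

```c
/* ---- filelen.h: the one header shared by every platform ---- */
#ifndef FILELEN_H
#define FILELEN_H

/* Returns the length of the named file in bytes, or -1 on failure. */
long get_file_length(const char *filename);

#endif /* FILELEN_H */

/* ---- filelen_posix.c: compiled only on POSIX systems such as Linux.
        As a separate file it would begin with #include "filelen.h" ---- */
#include <sys/types.h>
#include <sys/stat.h>

long get_file_length(const char *filename)
{
    struct stat st;

    if (stat(filename, &st) != 0) {
        return -1L;
    }
    return (long)st.st_size;    /* assumes the size fits in a long */
}

/* ---- filelen_win.c would define the same function for Windows,
        and only one of the two files is built on any given platform ---- */
```

There is not a single #if in the platform code, and porting to a new system means adding one new file rather than editing every existing one.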
Using the Abstraction Layer
Once you have your header and your source file(s), how do you use them? Well, it's pretty simple. Compile the source file into an object file (which you will probably wish to add to a library at some point), and then link the object file (or library) to whichever programs you wish.
All you have to do in your program source is include the header and call the function.
I'm bound to think of some more stuff to do with portability -- in due course.