Chapter 15: Using the C Preprocessor

In 1972, as the C programming language was maturing, it became necessary to develop syntax that would include the text of external files into the current program being compiled. The compiler writers also saw the utility of widespread string replacement in a program file, to textually replace one string with another automatically throughout a program.

Hence, the C preprocessor was invented. Initially, the preprocessor was a program that took the text of a C program and obeyed directives in that program, including text from external files and replacing one string with another. Soon after its introduction, the preprocessor was extended to include parameterized macros and to enable conditional compilation.

The C preprocessor, therefore, is not actually part of the C programming language. However, it is integral to programming in C, and most C compilers assume the preprocessor is an external program and use it extensively. In fact, some compilers incorporate it into the compiler program itself. This means that understanding how the preprocessor works and how it is used is important for programming in C, especially for programming in C on Pebble smartwatches.

The Basics of the Preprocessor

Whether the C preprocessor (we will refer to it as the "CPP") is implemented as a separate program or as part of a C compiler itself, it represents a separate text processing phase that takes place before compilation happens. Preprocessing is really two operations: directive handling and text processing. Directives that appear in the text being processed are consumed and executed. Text that is not part of a directive is analyzed for possible text substitution and/or replacement.

The CPP uses macros, that is, text substitution directives, to change text in a C program. Consider this example:

#define two 2
#define times *

int four = two times two;

In this example, preprocessing the C code before compiling it would produce the following processed code

int four = 2 * 2;

before compiling this statement. The directives would be eliminated since they are not part of the C language itself and the text would be replaced with processed text. This processed text would then be fed to the compiler.
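
The substitution works on whole tokens rather than on raw characters. The following is a minimal sketch in plain C (the function names are made up for illustration) suggesting that the two macros are replaced in code but left alone inside a longer identifier and inside a string literal:

```c
#define two 2
#define times *

/* After preprocessing this reads: return 2 * 2; */
int four(void) {
    return two times two;
}

/* "twofold" is a different token from "two", so it is not replaced. */
int twofold(void) {
    return 7;
}

/* Text inside a string literal is never substituted. */
const char *recipe(void) {
    return "two times two";
}
```

Here four() computes 2 * 2, while the string returned by recipe() still begins with the letter "t".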

Using the CPP for Text Processing

The CPP is so effective that users can be tempted to use the CPP on text from other programming languages and even regular text documents. For example, it is convenient for Web page authors to use the CPP to maintain Web pages in HTML. This is most useful when the CPP is a program separate from the compiler, although many C compilers have options to only use the preprocessor and to not engage compilation.

This use of the CPP should be done with care. The CPP parses C code just as a compiler would so that it can effectively implement CPP directives. This means error messages and mishandling of text might occur. For example, comments are removed by the CPP rather than passed through; anything that looks like a C comment in regular text would be dropped from the output. As another example, a contraction like "don't" might be flagged as an error (with an error message) because the single quote implies a character constant in C.

If you are eager to use the CPP for purposes other than C compilation, look for options to the CPP program that turn off C language processing. Also, there are implementations of preprocessing that are geared for general text purposes; seek out these programs. The "m4" program in Linux is an example of a general text preprocessor.

Preprocessor Directives

Preprocessor directives begin with a "#" symbol, typically placed in column 1 of a C program (note that some CPP implementations allow the directive to begin in any column, as long as it is alone on a text line). There are many directives defined; here is a list of a few directives that are more applicable to Pebble software development.

  • #define and #undef These directives define and remove macros, which implement text substitution. There are many ways to use macros with the preprocessor; these are outlined in a section below.

  • #if, #ifdef, #ifndef, #else, #elif, #endif These directives implement conditional processing. This use of directives is extremely useful and is outlined in a section below as well.

  • #include This directive takes a file name and is substituted by the CPP with the contents of that file. For example, most Pebble programs use

    #include <pebble.h> 
    as the first line in their code. This line makes the preprocessor find the "pebble.h" file, read its contents, and substitute the "#include" line with the contents of the "pebble.h" file.

  • #error and #warning These directives take a message string as a parameter and allow the preprocessor to issue an error or warning message. If the message is an error message, processing stops.

  • #pragma This directive provides additional information to the compiler. A C compiler is free to define any pragmas it wants. Pragmas to the GCC compiler have the syntax

    #pragma GCC directive
    For example,
    #pragma GCC warning "message"
    issues a warning with the message given, much like the "#warning" directive. For other pragmas, see this documentation.

  • #line This directive takes either a line number or a line number and a filename as parameters. It informs the compiler as to the line number in the source code where the code following the directive is found. This helps the compiler issue more informative error messages, particularly when lots of text inclusion or macro expansions might change the line numbers from the original file.
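
As a small illustration of the "#line" directive described above, here is a sketch in plain C. The file name "generated.c" and the function names are invented for the example; the directive only changes what the compiler believes about its position in the source.

```c
/* Tell the compiler that the next line is line 100 of "generated.c"
   (a made-up file name used only for this illustration). */
#line 100 "generated.c"
int reported_line(void) { return __LINE__; }          /* this is line 100 */
const char *reported_file(void) { return __FILE__; }  /* the name given above */
```

Any error message the compiler issues for code after the directive would refer to "generated.c" and to line numbers counted from 100.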

Pebble Preprocessor Definitions

We have used definitions from the Pebble SDK throughout this book. Now we can see where these definitions come from. We use angle brackets with the "pebble.h" reference, which means this file can reside among system include files. In the current SDK, this means it's in ".pebble-sdks/SDKs/current/sdk-core/pebble/basalt/include/pebble.h" in your home directory (assuming you are using Linux and assuming you are compiling for the basalt smartwatch architecture). This specific location may (and likely will) change over the course of several SDKs; it certainly changes when you change smartwatch architectures. Using the angle-bracket notation assures that the compiler will find the file no matter where it ends up.

Defining and Using Macros

There are two types of macros: those with parameters and those without parameters. Let's look at macros without parameters first.

Macros without parameters define the substitution of a name (the first word after "#define") with the rest of the line. For example,

#define STRING_LENGTH 64

will direct the CPP to replace every occurrence of "STRING_LENGTH" with the number "64". This text replacement is called a macro expansion. It is conventional to write macros in uppercase.

The macro definition ends at the end of the "#define" line. If you need to continue the definition onto multiple lines, you can use a backslash-newline combination. The result, however, will be generated on one line.

For example, this macro definition and the declaration that uses it:

#define DIMENSIONS  { 144, \
                      168  }
int sizes[] = DIMENSIONS;

would be replaced with the code below:

int sizes[] = { 144, 168 };

Order of definition makes a difference. Consider the following example:

#define WIDTHS APWIDTH, BASWIDTH, CHWIDTH

int widths[] = { WIDTHS };

#define APWIDTH 144
#define BASWIDTH 144
#define CHWIDTH 180

int newwidths[] = { WIDTHS };

This code would look like this after preprocessing:

int widths[] = { APWIDTH, BASWIDTH, CHWIDTH };
int newwidths[] = { 144, 144, 180 };

This means that textual substitution is done with whatever definitions are available at the time of substitution. For the declaration of widths, definitions of APWIDTH, BASWIDTH, and CHWIDTH were not available, so the text was used. For the declaration of newwidths, the three identifiers were now defined as their own macros, so those macro definitions were used.
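
One way to watch this happen in a running program is to turn each expansion into a string. The sketch below is plain C and relies on the "#" stringizing operator covered later in this chapter; SHOW and SHOW_RAW are hypothetical helper names, not part of any SDK.

```c
/* Helper macros (illustrative names): SHOW expands its arguments first,
   then SHOW_RAW turns the result into a string literal. */
#define SHOW_RAW(...) #__VA_ARGS__
#define SHOW(...) SHOW_RAW(__VA_ARGS__)

#define WIDTHS APWIDTH, BASWIDTH, CHWIDTH

/* The three names are not macros yet, so they survive as plain text. */
const char *widths_before(void) { return SHOW(WIDTHS); }

#define APWIDTH 144
#define BASWIDTH 144
#define CHWIDTH 180

/* Same macro, same use, but now the inner names expand as well. */
const char *widths_after(void) { return SHOW(WIDTHS); }
```

The first string holds the three names; the second holds the three numbers, even though WIDTHS itself was defined only once.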

To change the definitions of a macro, you must first undefine the macro name, then redefine it. The "#undef" directive undefines a macro. Thus, the following definitions would work:

#define APWIDTH 144
#define BASWIDTH 144
#define CHWIDTH 180

int oldwidths[] = { WIDTHS };

#undef APWIDTH
#define APWIDTH 120
#undef CHWIDTH

int newwidths[] = { WIDTHS };

This code looks like this after preprocessing:

int oldwidths[] = { 144, 144, 180 };
int newwidths[] = { 120, 144, CHWIDTH };

Note that we did not give CHWIDTH a new definition after we undefined it, so the name of the identifier was used in the macro expansion.
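
A compact, runnable variation on the same idea is sketched below in plain C. The enum constants and function names are invented for the example; each constant captures the value of APWIDTH that was in force at that point in the file.

```c
#define APWIDTH 144
enum { FIRST_WIDTH = APWIDTH };    /* captured while APWIDTH is 144 */

#undef APWIDTH
#define APWIDTH 120
enum { SECOND_WIDTH = APWIDTH };   /* captured after the redefinition */

int first_width(void)  { return FIRST_WIDTH; }
int second_width(void) { return SECOND_WIDTH; }
```

Redefining the macro does not reach back and change FIRST_WIDTH, because that text was already substituted when the preprocessor passed it.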

Macros can be defined with parameters. Here, parameters are specified in the original string and the same names are used in the expansion string. Parentheses are used in the original string to indicate the parameter list (just like function definitions without type names).

Consider this example:

#define WRITE_TEXT(TEXT, FONT) graphics_draw_text(ctx, TEXT, FONT, bounds, \
                                                  GTextOverflowModeWordWrap, \
                                                  GTextAlignmentCenter, NULL);

Now, we can use WRITE_TEXT(code, myfont) and it will expand to

graphics_draw_text(ctx, code, myfont, bounds, GTextOverflowModeWordWrap, GTextAlignmentCenter, NULL);

Note here that the conventional uppercase of macro definitions is a big help when trying to figure out where the parameters go in the substituted text.
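
The same pattern can be tried outside the Pebble SDK. The following sketch uses only standard C; FORMAT_LABEL, label, and steps_label are hypothetical names chosen for the illustration, and snprintf stands in for the longer graphics call.

```c
#include <stdio.h>

static char label[32];

/* Hypothetical macro in the spirit of WRITE_TEXT: the caller supplies only
   the parts that vary; the buffer and the format string are fixed. */
#define FORMAT_LABEL(NAME, VALUE) \
    snprintf(label, sizeof(label), "%s: %d", (NAME), (VALUE))

/* The macro use below expands to:
   snprintf(label, sizeof(label), "%s: %d", ("steps"), (count)); */
const char *steps_label(int count) {
    FORMAT_LABEL("steps", count);
    return label;
}
```

Unlike WRITE_TEXT, this sketch leaves the trailing semicolon out of the definition, so a use of the macro reads like an ordinary statement. Wrapping each parameter in parentheses in the expansion is a common precaution when arguments may be expressions.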

Definitions that have expandable macros embedded will expand as expected:

#define OVERFLOW GTextOverflowModeWordWrap
#define ALIGNMENT GTextAlignmentCenter
#define WRITE_TEXT(TEXT, FONT, OFMETHOD, AMETHOD) \
                               graphics_draw_text(ctx, TEXT, FONT, bounds, \
                                                  OFMETHOD, \
                                                  AMETHOD, NULL);
WRITE_TEXT(code, myfont, OVERFLOW, ALIGNMENT);

This also expands to the expansion above.

Macro definitions can themselves be placed inside conditionally included code, as we will discuss in the next section.

Conditional Code Inclusion

The CPP allows for the conditional inclusion of code by using the directives "#if", "#ifdef", and "#ifndef". Each of these conditionals can work with an "#else". They all require an "#endif" directive to complete the conditional inclusion.

The "#if" directive takes an expression as a "parameter". That expression can contain constants and arithmetic operators, other macros, and references to the "defined()" operator. Let's consider an example:

#if defined(TABLE_SIZE)
#if TABLE_SIZE > 100
#undef TABLE_SIZE
#define TABLE_SIZE 100
int boundary = 200;
#else
int boundary = 100;
#endif
#else
#define TABLE_SIZE 50
int boundary = 100;
#endif

This looks a bit confusing without whitespace or indentation. We could rewrite it as follows:

#if defined(TABLE_SIZE)

     #if TABLE_SIZE > 100
         #undef TABLE_SIZE
         #define TABLE_SIZE 100
         int boundary = 200;
     #else
         int boundary = 100;
     #endif

#else
     #define TABLE_SIZE 50
     int boundary = 100;
#endif

Now this reads better. Note that directives can be indented (with most preprocessors, including the GCC CPP that the Pebble SDK uses).

In this example, there are several assumptions made. If TABLE_SIZE is defined, it is assumed to have an integer value. If defined(TABLE_SIZE) evaluates to the value "true", then TABLE_SIZE is assumed to be a macro symbol. The "defined" question is actually a "defined as a macro" question, as opposed to "defined as a C name" question.
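
The difference between "defined as a macro" and "defined as a C name" can be checked with a short sketch in plain C. The helper macros BOUNDARY_IS_MACRO and TABLE_SIZE_IS_MACRO are invented for the illustration.

```c
int boundary = 100;   /* an ordinary C variable */

/* The preprocessor knows nothing about C variables, so this test is false. */
#if defined(boundary)
#define BOUNDARY_IS_MACRO 1
#else
#define BOUNDARY_IS_MACRO 0
#endif

#define TABLE_SIZE 50

/* TABLE_SIZE was created with #define, so this test is true. */
#if defined(TABLE_SIZE)
#define TABLE_SIZE_IS_MACRO 1
#else
#define TABLE_SIZE_IS_MACRO 0
#endif

int boundary_is_macro(void)   { return BOUNDARY_IS_MACRO; }
int table_size_is_macro(void) { return TABLE_SIZE_IS_MACRO; }
```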

When an "if" statement is used, it's natural to think about an "else" part. For CPP directives, "#else" parts work as you might expect: they represent the alternative or false part to the "#if" directive. In the example above, if TABLE_SIZE is not defined, the bottom "#else" part is activated. "#else" directives can be nested, as the above example shows.

"#elif" directives combine "#else" and "#if" directives.

Checking to see if macros are defined is done so often that there are directives to check this. "#ifdef" checks for defined macros and "#ifndef" checks for undefined macros. The above check for defined(TABLE_SIZE) could be written as

#ifdef TABLE_SIZE
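
To round out the section, here is a sketch in plain C of "#ifndef" supplying a default value and "#elif" choosing among three alternatives. TABLE_KIND and the two function names are hypothetical, and the sketch assumes the build does not already define TABLE_SIZE.

```c
/* Supply a default only when the build has not already defined the name. */
#ifndef TABLE_SIZE
#define TABLE_SIZE 50
#endif

/* "#elif" chains a second test onto the false branch of the first. */
#if TABLE_SIZE > 100
#define TABLE_KIND "large"
#elif TABLE_SIZE > 10
#define TABLE_KIND "medium"
#else
#define TABLE_KIND "small"
#endif

int table_size(void)         { return TABLE_SIZE; }
const char *table_kind(void) { return TABLE_KIND; }
```

Only one of the three TABLE_KIND definitions ever reaches the compiler; the other two are discarded during preprocessing.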

Pebble-Specific Definitions

Pebble applications can take advantage of preprocessor definitions to make a single file of code that applies to several different Pebble smartwatch platforms.

There are definitions that reflect services or architectural features.

  • PBL_COLOR is defined when the smartwatch has a color display and is not defined when it does not.
  • PBL_ROUND is defined when the smartwatch has a round display and is not defined when it has a rectangular display.

These definitions are suitable for use with directives such as "#ifdef":

#ifdef PBL_COLOR
    graphics_context_set_text_color(ctx, GColorRed);
#else
    graphics_context_set_text_color(ctx, GColorBlack);
#endif

In this example, the PBL_COLOR macro is defined when the compilation of this code is for a platform that has color.

In addition, there are macro definitions that check for services or hardware properties. In general, these work in your C code to include one of several possible pieces of code, depending on the presence or absence of a service or hardware feature. For example, we could set text color for an application, but we want the code to work with a Pebble Classic and a Pebble Time. We would do this:

graphics_context_set_text_color(ctx, PBL_IF_COLOR_ELSE(GColorRed, GColorBlack));

The Pebble Time would get a substitution of GColorRed for the macro; a Pebble Classic would get GColorBlack.

The macros that can check features and services include:

  • PBL_IF_COLOR_ELSE(if_true, if_false) will include the if_true part if the platform supports color and the if_false part if not.
  • PBL_IF_BW_ELSE(if_true, if_false) is the converse of PBL_IF_COLOR_ELSE, substituting the if_true part if the smartwatch supports only a monochrome screen and the if_false part if not.
  • PBL_IF_MICROPHONE_ELSE(if_true, if_false) will use the if_true part if the hardware supports a microphone and the if_false part if not.
  • PBL_IF_RECT_ELSE(if_true, if_false) will use the if_true part if the screen is rectangular and the if_false part if not.
  • PBL_IF_ROUND_ELSE(if_true, if_false) will substitute the if_true part if the platform has a round screen, that is, if it is a Pebble Time Round, and the if_false part for everything else.
  • PBL_IF_HEALTH_ELSE(if_true, if_false) will use the if_true part if the platform supports Pebble Health; otherwise it will use the if_false part.

It's important to remember that these are not run-time definitions; they are compile-time definitions. These definitions make it easier to write one set of code, that is, one set of files, but target the same code to different platforms. The result is still a set of PBW files, meaning there are still several sets of install packages with slightly different code adapted for specific smartwatch platforms. But you can generate these different install packages from the same set of files.
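
To see why these are compile-time choices, it may help to look at how a macro of this kind could be built. The sketch below is plain C and is not taken from the SDK headers; DEMO_COLOR, DEMO_IF_COLOR_ELSE, and the color codes are hypothetical stand-ins, and the sketch assumes DEMO_COLOR is not defined by the build.

```c
/* DEMO_COLOR stands in for a platform definition such as PBL_COLOR.
   Define it (for example with -DDEMO_COLOR) to build the color version. */

#ifdef DEMO_COLOR
#define DEMO_IF_COLOR_ELSE(if_true, if_false) (if_true)
#else
#define DEMO_IF_COLOR_ELSE(if_true, if_false) (if_false)
#endif

enum { DEMO_BLACK = 0, DEMO_RED = 1 };   /* hypothetical color codes */

/* Only one of the two arguments survives preprocessing. */
int text_color(void) {
    return DEMO_IF_COLOR_ELSE(DEMO_RED, DEMO_BLACK);
}
```

Under that assumption the compiled function contains only the black alternative; the red one never reaches the compiler, which is why a separate binary is needed for each platform.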

Guidelines for Using the CPP

With the introduction of the preprocessor, we have one of the easiest ways to write really confusing code. There are aspects of CPP definitions that make extensive use of the CPP a dangerous thing to do. Here are some dangers to stay away from and guidelines to help make the CPP useful.

  1. Remember that macro definitions are text substitutions that are substituted and removed before compilation. This means that macro definitions cannot be debugged. The problem here is that the compiler sees only the end result of text substitution and there is no connection to the original macro definition. Where this applies to literals or variables, use "enum" definitions and constant declarations. Where functions are substituted for other text, use actual functions with real, debuggable definitions.

  2. Always be aware of the text substitution and where it is placed. Remember that macro substitution is a verbatim text replacement. Often expansions can have strange results. Consider this example:

    #define SQUARE(x)  ((x) * (x))
    int y = 15;
    int z = SQUARE(y++);
    

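    When this is expanded, we get something we did not intend: z = ((y++) * (y++)); which is not the square of 15. Because y is modified twice in one expression, the result is actually undefined in C; a compiler might well produce 15 * 16. Be especially careful when using replacements for expressions; use parentheses as much as you can, and avoid arguments with side effects.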

  3. Watch name conflicts. The CPP processes text without regard to where it's defined or if the resulting code even works. When a name you chose in your code conflicts with a macro defined elsewhere, that name will still be replaced, often with unexpected consequences.

  4. Naming becomes extremely important with macros. Macros are the worst place to use cryptic or short names, because of the definitions that result.

  5. Comments in macros can be dangerous. Consider this:

    #define true 1  // booleans as integer
    #define false 0 // this too!
    
    while (true) { 
        x ++;
    }
    

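    If the preprocessor does not strip "//" comments before it processes directives (some older, pre-C99 tools behave this way), the comment at the end of the true definition becomes part of the macro, and when it expands the rest of the "while" line appears inside a comment. The code will not compile correctly and the reason will be very hard to track down. Standard-conforming preprocessors, including the GCC CPP, remove comments before macros are defined, but keeping comments off "#define" lines avoids the problem everywhere.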
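
The first two guidelines suggest replacing macros with real C constructs where possible. A minimal sketch in plain C follows; the function names are invented for the illustration, and "static inline" assumes a C99 or later compiler.

```c
/* A named constant that the compiler and a debugger can both see. */
enum { STRING_LENGTH = 64 };

/* A real function evaluates its argument exactly once. */
static inline int square(int x) {
    return x * x;
}

/* Squares the old value of *y, then leaves *y one larger. */
int square_and_advance(int *y) {
    return square((*y)++);
}
```

Starting from y = 15, the function returns 225 and leaves y at 16, which is what the SQUARE(y++) macro call was presumably meant to do.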

Obfuscation

We can use the CPP to our advantage, making flexible code that is easier to read and use. We can also have some fun with it and make some terribly obfuscated code.

Consider the code to the first program we saw in Chapter 2. The program started out like this:

#include <pebble.h>

Window *window;
TextLayer *text_layer;

void init() {
  window = window_create();
  text_layer = text_layer_create(GRect(0, 0, 144, 40));
  text_layer_set_text(text_layer, "Hello, Pebble!");
  layer_add_child(window_get_root_layer(window), 
                    text_layer_get_layer(text_layer));
  window_stack_push(window, true);
}

We can obfuscate this code with some macro definitions. Consider this new starting code:

#include <pebble.h>

#define l1l window  
#define l11l Window
#define l1l1l layer
#define l111l Layer
#define ll1ll(l) text_##l
#define l11ll(l) Text##l
#define ll11l layer

l11l *l1l;
l11ll(Layer) *ll1ll(ll11l);

void init() {
  l1l = window_create();
  ll1ll(ll11l) = ll1ll(layer_create)(GRect(0, 0, 144, 40));
  ll1ll(layer_set_text)(ll1ll(ll11l), "Love ya, Pebble!");
  layer_add_child(window_get_root_layer(l1l), 
  ll1ll(layer_get_layer)(ll1ll(ll11l)));
  window_stack_push(l1l, true);
}

We will leave the code "half" obfuscated, so we can see the resemblance to the original. Note that this code compiles just fine.

This code demonstrates a few tricks we can do with macros. First, this code exploits the similarity between the letter "ell" (l) and the numeral one ("1"). When placed together in a typewriter-style font, they look very much the same. Second, using odd representations for regular keywords, names and symbols will start to make the code cryptic. Simply transforming variable names like "window" and giving multiple representations of "layer" confuses this code. Finally, note the use of the "##" sequence. To the CPP, this sequence will paste parts of a macro definition together. For example, if we used this definition in place of the definition above:

#define ll1ll(l) text_l

in hopes of using the parameter in the final text name, it would not work. ll1ll(layer) and ll1ll(layer_get_layer) would both be substituted with text_l. But using "##" as the "glue" for the definition makes two different substitutions.
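
Here is the same "glue" in a readable form, as a sketch in plain C. The function names and the TEXT_NAME macro are hypothetical and are not part of the Pebble SDK.

```c
/* Two functions that share the "text_" prefix (hypothetical names). */
int text_width(void)  { return 144; }
int text_height(void) { return 40; }

/* "##" joins the literal prefix and the argument into one identifier. */
#define TEXT_NAME(SUFFIX) text_##SUFFIX

/* TEXT_NAME(width) becomes text_width; TEXT_NAME(height) becomes text_height. */
int text_area(void) {
    return TEXT_NAME(width)() * TEXT_NAME(height)();
}
```

One subtlety: an argument that sits next to "##" is pasted exactly as written, without being macro-expanded first. That is why ll1ll(ll11l) in the obfuscated listing produces the name text_ll11l rather than text_layer; the listing still compiles because the same pasted name is produced at every use.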

Standard CPP Features

There are a few standard features that the CPP uses. We have already seen one in the last section: using the "##" sequence to glue parameters into a single name. There are a few others.

  • The names "__FILE__" and "__LINE__" will be replaced with the name of the file being evaluated and the number of the current line being worked on.

  • Definitions can have variable sets of parameters. If you include the characters "..." as the last "parameter" in a macro name, you can refer to parameters that might have been included in the macro use by the name "__VA_ARGS__" (for "variable arguments") in the substitution text. For example, if we specify a macro like this:

     #define checkvalue(COND, ...) if (!(COND)) { __VA_ARGS__; }
    

    We could then use this to check values and execute statements if values are not in line with expectations. We could use

     checkvalue(miles >= 0, APP_LOG(APP_LOG_LEVEL_INFO, "Miles out of range."));
    
  • The CPP uses a single "#" symbol to convert what follows the symbol to a string. We could define

     #define PRINT_ENUM(enumeration) printf(#enumeration " is %d", enumeration)
    

    that would print the name and value of an enum that we sent to the macro. Quotes are automatically added, and the compiler joins the adjacent string literals into one format string.

  • The GCC CPP gives the "##" sequence a second job when it appears in front of "__VA_ARGS__" (this is a GNU extension rather than standard C). Consider this definition:

     #define debug_variable(var, fmt, ...)   \
           printf( "DBG: " __FILE__ "(%d) " #var " = " fmt, __LINE__, var, __VA_ARGS__); 
    

    Now consider what happens when the "__VA_ARGS__" argument is empty, that is, when the macro is called with only two arguments. When the macro expands, the list is left with a comma on the right, resulting in a syntax error. This is where the "##" sequence is used. If we change the macro definition to the following, we don't get the syntax error:

     #define debug_variable(var, fmt, ...)   \
           printf( "DBG: " __FILE__ "(%d) " #var " = " fmt, __LINE__, var, ##__VA_ARGS__); 
    

    The "##" deletes the comma to the left of the "##__VA_ARGS__" if "__VA_ARGS__" is empty.
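
The stringizing and variable-argument features can be combined in a portable way. The sketch below uses only standard C on a desktop compiler; DESCRIBE, NOTE, the message buffer, and the enum are hypothetical names, and snprintf is used so that the result can be inspected. Folding the format string into "__VA_ARGS__" is one way to avoid the trailing-comma problem without relying on the "##" extension.

```c
#include <stdio.h>

static char message[96];

/* "#EXPR" becomes a string literal, which the compiler joins to " is %d". */
#define DESCRIBE(EXPR) \
    snprintf(message, sizeof(message), #EXPR " is %d", (EXPR))

/* The format string is part of __VA_ARGS__, so the argument list is never
   empty and no trailing comma can appear. */
#define NOTE(...) snprintf(message, sizeof(message), __VA_ARGS__)

enum Mode { MODE_IDLE, MODE_RUNNING };

/* Leaves "MODE_RUNNING is 1" in the message buffer. */
const char *describe_running(void) {
    DESCRIBE(MODE_RUNNING);
    return message;
}

/* Works with a format string alone or with extra values. */
const char *note_ready(void) {
    NOTE("ready");
    return message;
}

const char *note_miles(int miles) {
    NOTE("%d miles", miles);
    return message;
}
```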

Useful Preprocessor Uses

While there are preprocessor hazards to avoid, there are many convenient uses for the CPP. We have already discussed the basics: text substitution and conditional code inclusion. There are a few other handy uses for CPP code.

  • Mass Removal of Code: If you have a large section of code you need to remove from consideration by the compiler, you could prepend each line with the "//" sequence for comments. You could also surround the entire section with #if 0 (always false) and #endif directives.

  • Include Guards: In a project with many included files, one of the easy mistakes to make is to include the text from a file multiple times. The compiler would not like multiple definitions, so include guards can be used. These guards look like:

     #ifndef _FILE_NAME_H_
     #define _FILE_NAME_H_
     /* code */
     #endif 
    

    Here, code is included only if the name _FILE_NAME_H_ is not defined and, when the file is first included, the name gets a definition. This means that the contents of file_name.h get included only once.

  • Multiple Definitions: Sometimes it is necessary to define a name in multiple ways to test how multiple definitions work or to see what the best definition is. Using macros is a good way to do this, especially if the definition is used several times throughout the code. By using one version of the macro definition, then using the macro in the code, you can simply change the definition of the macro to try out different versions in the code.
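
The first two uses can be tried in a single file. In the sketch below, written in plain C, the text of a hypothetical header called demo_point.h is pasted in twice to imitate a double "#include"; DEMO_POINT_H, DemoPoint, and point_sum are names invented for the illustration.

```c
/* First "inclusion" of a hypothetical header, demo_point.h. */
#ifndef DEMO_POINT_H
#define DEMO_POINT_H
typedef struct { int x; int y; } DemoPoint;
#endif

/* Second "inclusion" of the same text: the guard name is already defined,
   so everything up to #endif is skipped and the typedef is not repeated. */
#ifndef DEMO_POINT_H
#define DEMO_POINT_H
typedef struct { int x; int y; } DemoPoint;
#endif

#if 0
This text is never seen by the compiler, so it need not even be valid C.
#endif

int point_sum(DemoPoint p) { return p.x + p.y; }
```

Without the guard, the second typedef would be a conflicting redefinition and the file would not compile.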

Project Exercises

Project 15.1

Back in Chapter 3, we specified a project exercise, Project 3.1, that displayed a bouncing ball around the Pebble smartwatch screen. Make the following changes to the code using macros:

  1. If the Pebble smartwatch is a Classic, start the ball in the top left corner; otherwise start the ball in the top right corner.
  2. If the Pebble smartwatch is a Classic, start the velocity of the ball off at 10; otherwise, start the velocity at 5.
  3. Find the conditional code that makes the ball GColorCobaltBlue or GColorBlack depending on whether color is available. Remove this #if and make the code work with a macro in a single function call.
  4. Define a macro called WRAPAROUND. Do not give it a value. Now change the code in this project to bounce the ball off walls (as it is currently) if WRAPAROUND is not defined, but also to wrap the ball to the other wall in only the X direction if the macro is defined.

You can find an answer to this project here.

Project 15.2

Developing code using the CloudPebble IDE is very convenient, but there are few debugging features built into the environment. Most of the debugging in CloudPebble amounts to printing values and using logging messages. Let's write a few macros that will help with debugging.

  • Start by defining a name that will control debugging. If the name is defined, debugging macros will print what we ask; if the name is not defined, the debugging macros will do nothing. A simple "#define" is all that's needed here.
  • Now define a macro called "MY_DEBUG" that uses the printf function we have seen before to print a logging message. You will need the variable parameters specification. You are writing a macro like APP_LOG, described here, without the logging level.
  • Define a "MY_ASSERT" macro that will accept a condition and will print one of two messages. If the condition is true, the first message prints and if the condition is false, the second message prints.

There are several other macros you could write. Devise one.

Finally, read through this discussion on StackOverflow about debugging macros.

An answer for this project with some demonstration code from a simple Pebble app can be found here.

Assertion Macros

Assertions are extremely useful in C code. They serve their most useful purpose in debugging, where code will report an error and stop if properties of variables and data are not met. The "MY_ASSERT" macro above is an example of an assertion.

Standard C defines a macro called "assert", in an include file called "assert.h", that takes an integer parameter. The integer is evaluated as if it represents a boolean value: the value 0 is evaluated as false and all non-zero values are evaluated as true. This means that boolean expressions can be used in an assert call. Calls like assert(x < y) or assert(ticket != NULL) are useful; when the parameter evaluates to false, there is a report and the code halts.

Pebble SDKs do not include the "assert.h" file. Because of this, using your own version of "assert", like the example above, creates a very useful tool.
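
As a rough idea of what such a home-grown assertion could look like, here is a sketch in plain C for a desktop compiler. DEMO_ASSERT, demo_assert_failed, and demo_failures are hypothetical names. Unlike the standard "assert", this version only reports and counts a failure instead of halting; on a watch, the fprintf call would presumably be replaced by a logging call such as APP_LOG.

```c
#include <stdio.h>

static int demo_failures = 0;   /* how many checks have failed so far */

/* Called only when a check fails; reports where and what. */
static void demo_assert_failed(const char *text, const char *file, int line) {
    demo_failures++;
    fprintf(stderr, "%s:%d: assertion failed: %s\n", file, line, text);
}

/* "#COND" supplies the text of the condition; __FILE__ and __LINE__
   identify the place where the macro was used. */
#define DEMO_ASSERT(COND) \
    ((COND) ? (void)0 : demo_assert_failed(#COND, __FILE__, __LINE__))

/* Checks one condition and returns the running failure count. */
int check_ordering(int x, int y) {
    DEMO_ASSERT(x < y);
    return demo_failures;
}
```

Because the macro is a single expression, it can be followed by a semicolon and used anywhere a statement is allowed, including inside an "if" without braces.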

Project 15.3

Have some fun with obfuscation. Pick some Project Exercise code and obfuscate it. How much can you obliterate and still have the code compile correctly and do the same thing it did before obfuscation? How many symbols, names, and syntax can you make into different symbols?

Here is an example of obfuscation of Project Exercise 6.4.

We have stated it before, but lots of obfuscated C can be found at the Obfuscated C Contest Web site.
