CMake customization points, how to configure your project?

04/10/2020
cmake,tutorial,C++,buildsystem

In the previous article we saw the basics of CMake and how to handle targets. If you are not familiar with CMake, please read the previous article first.

We will now dive deeper and look into more advanced features so that you can customize your project. This is an important part of CMake as every project has its own set of requirements. Those can be based on the target platform, compiler, or simply based on the user choice.

We will see how to use variables, build configurations and then generator expressions so that you can provide flexible options to your users.

Variables and options

When writing build scripts you might quickly find yourself needing variables to provide more control over the configuration of your project. Like most imperative languages, CMake provides variables, control flow and even functions.

All variables are internally handled as strings, but they can be interpreted differently depending on the command using them. The primary command for variable manipulation is set. Variables are case sensitive and can have one of the following scopes: function scope, directory scope, or the persistent cache.

The signature of the set command is the following:

set(<variable> <value>... [PARENT_SCOPE])

The PARENT_SCOPE parameter lets you set the value of a variable in the parent scope (parent function or parent directory), which can serve as an output parameter.
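
As an illustration of the output parameter idea, here is a minimal sketch. The function name bp_get_greeting and its arguments are hypothetical and only serve the example:

```cmake
# Hypothetical helper: builds a string and returns it through an output variable.
function(bp_get_greeting out_var name)
  set(greeting "Hello ${name}")
  # Without PARENT_SCOPE, the variable would only exist inside the function.
  set(${out_var} "${greeting}" PARENT_SCOPE)
endfunction()

bp_get_greeting(GREETING "CMake")
message(STATUS "${GREETING}") # Prints "-- Hello CMake"
```

Note that PARENT_SCOPE only sets the variable in the parent scope, not in the current one, which is why the sketch uses a separate local variable to build the value.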

Note that if you pass multiple values to the set command, they are concatenated into a single string, separated by ;. This means that if you pass the variable to a list-aware command, you need to escape any ; that belongs to a value, unless you want that value to be split.
Such variables can then be treated as lists and are more easily manipulated through the list command.

set(NUMBER_LIST 1 2 "3;4") # Results in NUMBER_LIST="1;2;3;4"
list(APPEND NUMBER_LIST 1) # Results in NUMBER_LIST="1;2;3;4;1"
list(REMOVE_DUPLICATES NUMBER_LIST) # Results in NUMBER_LIST="1;2;3;4"
set(NOT_A_LIST "a\;b") # Results in NOT_A_LIST="a\;b"  

CMakePrintHelpers

You might find yourself needing to debug the value of variables.
While you could use the message command, CMake provides a handy module named CMakePrintHelpers which makes this process a bit easier.

The only thing you need to do is include the module and it will expose a function named cmake_print_variables where you list the names of the variables you want to debug.
It is however only recommended for development use.

For example:

include(CMakePrintHelpers)

set(MYVAR value)
cmake_print_variables(MYVAR CMAKE_MAJOR_VERSION DOES_NOT_EXIST)

will print the following during the configuration step:

-- MYVAR="value" ; CMAKE_MAJOR_VERSION="3" ; DOES_NOT_EXIST=""

The module can also print properties of targets, sources, etc. through cmake_print_properties. See the CMakePrintHelpers module documentation for more details.
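
A short sketch of cmake_print_properties, assuming a target named myawesomelib already exists in your project (the target name and the chosen properties are only examples):

```cmake
include(CMakePrintHelpers)

# Assumption: 'myawesomelib' was created earlier with add_library().
cmake_print_properties(
  TARGETS myawesomelib
  PROPERTIES TYPE COMPILE_DEFINITIONS
)
```

Each requested property is printed during the configuration step, which is handy to check what a target really ends up with.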

Cache

The concept of cache is very important in CMake.
It is used not only to cache the results of commands, which avoids running expensive steps multiple times, but also to configure a project build.
The cache is a file named CMakeCache.txt located in your build directory, and can be reset simply by deleting it and re-running the cmake configuration step.

To set a cache variable, you need to call set with the following signature:

set(<variable> <value>... CACHE <type> <docstring> [FORCE])

A variable already present in the cache will not be overridden by subsequent calls to set(... CACHE ...) unless the FORCE parameter is used. The signature without CACHE does not touch the cache either: it creates a normal variable that hides the cache entry.

It means that a variable with directory or function scope takes priority over a cache variable. This can be surprising if you forget the CACHE signature, as the code will then act as if the cache variable did not exist.

For example:

set(MYVAR cachedValue CACHE STRING "someval")
cmake_print_variables(MYVAR)
set(MYVAR localValue) # Ignore the cache value
cmake_print_variables(MYVAR)

will print:

-- MYVAR="cachedValue"
-- MYVAR="localValue"

or (if you change the cache value of MYVAR to modifiedFromCache) you will get:

-- MYVAR="modifiedFromCache"
-- MYVAR="localValue"
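
If you need to look past a normal variable that hides a cache entry, the following sketch may help. It relies on the $CACHE{} reference syntax (available since CMake 3.13) and on unset, and reuses the MYVAR name from the example above:

```cmake
set(MYVAR cachedValue CACHE STRING "someval")
set(MYVAR localValue)            # Normal variable, hides the cache entry

message(STATUS "${MYVAR}")       # The normal variable wins
message(STATUS "$CACHE{MYVAR}")  # Reads the cache entry directly

unset(MYVAR)                     # Removes only the normal variable
message(STATUS "${MYVAR}")       # Falls back to the cache entry again
```

The cache entry itself is never modified here, so whatever the user chose in the cache is preserved.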

Nicer GUI interactions

While all variables are strings internally, you can give a type to cache variables.
This way the CMake GUI can offer more appropriate dialogs.
The allowed values for type are BOOL (ON/OFF), FILEPATH, PATH, STRING and INTERNAL. The INTERNAL variables are special and won't appear in the GUI.

[Image: CMake GUI example of variable types]

The docstring parameter should be a little summary of the variable, and will be displayed as a tooltip.
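
As a sketch, here is one cache variable of each user-facing type. All the BP_ names are hypothetical. The STRINGS cache property at the end is optional: it does not validate anything, it only gives the GUI a list of values to suggest:

```cmake
# All BP_ names below are made up for the example.
set(BP_VERBOSE_BUILD OFF CACHE BOOL "Print extra information while building")
set(BP_DATA_DIR "" CACHE PATH "Directory containing the assets")
set(BP_LICENSE_FILE "" CACHE FILEPATH "Path to the license file")
set(BP_LOG_LEVEL "info" CACHE STRING "Default log level")

# Lets the GUI show a drop-down list for BP_LOG_LEVEL.
set_property(CACHE BP_LOG_LEVEL PROPERTY STRINGS debug info warning error)
```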

The easiest way to manipulate the cache is through a GUI, but you can also edit the CMakeCache.txt file directly.
Note that it also contains the type and docstring in it. Cache variables can also be listed when invoking cmake by using the -L argument. -LH will list the variables with their docstring.

It is also possible to mark cache variables as advanced using the mark_as_advanced command. Such variables are not listed by default by the GUI or by -L. To show them, tick the "advanced" checkbox in the GUI, or use -LA on the command line.
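
For instance, a rarely needed tuning knob could be hidden this way (the variable name is hypothetical):

```cmake
# Hypothetical tuning knob that most users will never need to change.
set(BP_ALLOCATOR_PAGE_SIZE 4096 CACHE STRING "Page size used by the allocator")
mark_as_advanced(BP_ALLOCATOR_PAGE_SIZE)
```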

CMake built-in cache variables

CMake has a lot of advanced cache variables that let you control a build, such as CMAKE_CXX_FLAGS, which holds the default flags used when compiling C++.
Those are NOT meant to be overridden in your CMakeLists.txt but modified by the user.
Always prefer using the target_* commands and properties rather than modifying such variables from your CMakeLists.txt.
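
A minimal sketch of the difference, assuming a target named myawesomelib already exists. The definition BP_INTERNAL_BUILD and the requested C++ standard are only examples:

```cmake
# Avoid: applies to every target of the directory and takes the decision
# away from the user.
# set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wall")

# Prefer: attach the requirements to the target that needs them.
target_compile_features(myawesomelib PUBLIC cxx_std_17)
target_compile_definitions(myawesomelib PRIVATE BP_INTERNAL_BUILD=1)
```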

The user can also initialize or change cache variables using the cmake commandline with the -D parameter.
The syntax is -D <var>:<type>=<value> or -D <var>=<value>.

For example, if you want to use the Release configuration for single-config generators (see the section about configurations) you can use:

cmake -DCMAKE_BUILD_TYPE=Release ..

Project options

The option command

We have seen that cache variables can be really useful for project configuration, and one soon realizes that most of them are boolean switches.
That is why a shorter and more explicit command than set exists for those, which is option.

option(<variable> "<help_text>" [value])
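
A short sketch of typical usage. The option names and the examples directory are hypothetical:

```cmake
# When no value is given, the option defaults to OFF.
option(BP_BUILD_EXAMPLES "Build the example programs")
option(BP_ENABLE_LOGS "Enable log output" ON)

if(BP_BUILD_EXAMPLES)
  add_subdirectory(examples) # Assumption: this directory exists
endif()
```

Since an option is a BOOL cache variable, the user can switch it from the GUI or with -DBP_BUILD_EXAMPLES=ON on the command line.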

Dependent options

Sometimes, however, you want an option to depend on another option. For this we have a small module called CMakeDependentOption. This module provides the cmake_dependent_option macro, which lets you expose an option based on the value of another.

include(CMakeDependentOption) # Needs to be called once

cmake_dependent_option(<variable> "<help_text>" <default> <conditions> <fallback>)

The first 3 arguments are the same as for option. The difference is that if the list of conditions is not satisfied, the variable will use the fallback value and won't appear in the GUI.
Remember that such a list can be created by separating values with a semicolon.

Examples

A typical usage of cmake_dependent_option is to offer an option that disables your project's tests, based on the BUILD_TESTING option defined by the CTest module. This way, if your project is used through add_subdirectory, one can disable your tests without disabling all the tests.

You would write the following (here we chose the BP_ prefix for our variables, adjust it to your project):

cmake_dependent_option(BP_BUILD_TESTS
  # By default we want tests if CTest is enabled
  "Enable ${PROJECT_NAME} project tests targets" ON
  # But if BUILD_TESTING is not true, set BP_BUILD_TESTS to OFF
  "BUILD_TESTING" OFF
)

Another example using multiple conditions:

# Assuming TARGET_SUPPORTS_AVX variable exists
option(BP_USE_SIMD_CODE "Enable hand optimized SIMD code" TRUE)
cmake_dependent_option(
  BP_USE_AVX "Enable hand optimized AVX code" TRUE
  "BP_USE_SIMD_CODE;TARGET_SUPPORTS_AVX" OFF
)

Note that it is good practice to prefix (here with BP_) your options so that their names do not clash with other libraries if you (or your users) choose to use add_subdirectory for dependency management.

Variables usage

So far we saw how to define and set the value of variables, but we did not explain how to use them.

They can be used in control flow blocks such as if, in the following way:

if(BP_USE_AVX)
  target_compile_definitions(myawesomelib PRIVATE USE_AVX=1)
endif()

Or directly by using the variable reference syntax ${variable} which will be replaced by the variable value.
Note that when a command takes a variable name as a parameter (such as cmake_print_variables or if [1]), you do not need to dereference it. Do not mistake a variable name for its value!

project(Awesome)
set(${PROJECT_NAME}_Var ON) # Sets Awesome_Var=ON

You can even nest them, and they are then evaluated from inner to outer order (${outer_${inner_variable}_variable}).

# Prints "The variable Awesome_Var has value ON"
message(STATUS "The variable ${PROJECT_NAME}_Var has value ${${PROJECT_NAME}_Var}")
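
To illustrate why mistaking a variable name for its value matters, here is a small sketch with made-up variable names. It assumes policy CMP0054 is set to NEW, which is the case for any project that requires CMake 3.1 or later:

```cmake
set(BP_MODE "fast")
set(fast "slow") # A variable that happens to be named like the value above

# if() dereferences BP_MODE itself: this compares "fast" with "fast".
if(BP_MODE STREQUAL "fast")
  message(STATUS "fast mode")
endif()

# ${BP_MODE} is replaced first, so if() sees if(fast STREQUAL "slow").
# The unquoted 'fast' is a variable name and gets dereferenced a second time.
if(${BP_MODE} STREQUAL "slow")
  message(STATUS "surprise")
endif()
```

Under that assumption both messages are printed, the second one most likely not being what the author intended.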

Optional features should stay optional

It is often tempting to add flags or feature requirements to a target that are not mandatory for compilation.
This could be to enable extra warnings, enable extra optimizations, disable exceptions... the list is long.
None of these things are mandatory to build your project, yet they can be useful for development.

While they can make your and your colleagues' jobs easier, they can most definitely make your package maintainers' and users' lives harder.
You do not necessarily control the environment in which your library will be used or deployed.

Those are all scenarios you should be aware of, and as such your CMake project should focus on one thing:

How to build the project?

So if you add optional compilation flags or features, notably through target_compile_options, always make them optional from the point of view of the user, and use PRIVATE to avoid propagation of the properties!

option(${PROJECT_NAME}_DISABLE_EH "Turn this ON to disable exception handling" ON)
if(${PROJECT_NAME}_DISABLE_EH)
   target_compile_options(yourlib PRIVATE -fflag-to-disable-exceptions)
endif()

You should also avoid compiler-specific flags that are not guarded by compiler checks.
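
One possible way to guard such flags is sketched below, using the MSVC variable and the CheckCXXCompilerFlag module. The option name BP_ENABLE_EXTRA_WARNINGS and the chosen flags are only examples, and yourlib is assumed to be an existing target:

```cmake
include(CheckCXXCompilerFlag)

option(BP_ENABLE_EXTRA_WARNINGS "Enable extra compiler warnings" OFF)

if(BP_ENABLE_EXTRA_WARNINGS)
  if(MSVC)
    target_compile_options(yourlib PRIVATE /W4)
  else()
    # Only add the flag if the compiler in use accepts it.
    check_cxx_compiler_flag(-Wextra BP_HAS_WEXTRA)
    if(BP_HAS_WEXTRA)
      target_compile_options(yourlib PRIVATE -Wextra)
    endif()
  endif()
endif()
```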

Your users and package maintainers will thank you!

Configuration types

When compiling a project you usually have different build configurations, for example one that lets you debug and another one that removes debug assertions, logs and enables your compiler optimizations.

CMake provides four build configurations by default: Debug, Release, RelWithDebInfo and MinSizeRel.

This is important to know since based on the generator you will use when invoking CMake (Visual Studio, Makefile, Ninja...), you have two behaviors.

Single configuration generators

Some generators such as Makefiles and Ninja are single configuration generators, so you will need different build directories for each configuration you want to use. This is controlled through the variable CMAKE_BUILD_TYPE and can be set from the command line on the first call to cmake.

If omitted, CMAKE_BUILD_TYPE is empty and does not match any configuration type. Instead a "default" configuration will be used, with no configuration-specific compiler flag.
Be careful, this is not the same as the Debug configuration!

Multiple configuration generators

Unlike the single configuration generators, some generators can handle multiple configurations at once, which is useful for integration with IDEs.
The Visual Studio, Xcode and (since CMake 3.17) Ninja Multi-Config generators let you have more than one configuration in the same build directory, and thus won't be using the CMAKE_BUILD_TYPE cache variable.
Instead the CMAKE_CONFIGURATION_TYPES cache variable is used and contains the list of configurations to use for this build directory.
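
If you need to know which kind of generator is in use, one option is the GENERATOR_IS_MULTI_CONFIG global property (available since CMake 3.9). A minimal sketch, meant to be placed after the project command, with a hypothetical variable name:

```cmake
get_property(BP_IS_MULTI_CONFIG GLOBAL PROPERTY GENERATOR_IS_MULTI_CONFIG)

if(BP_IS_MULTI_CONFIG)
  message(STATUS "Configurations: ${CMAKE_CONFIGURATION_TYPES}")
else()
  # May be empty if the user did not choose a build type.
  message(STATUS "Build type: '${CMAKE_BUILD_TYPE}'")
endif()
```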

Custom configurations

With both single and multiple configuration generators, it is possible for the user to add or remove configuration types, and specify the compiler flags for each of them. This is extremely useful if you want to build for coverage, profiling, sanitizers, etc.

However I will not teach you how to do this (yet), and instead explain why you should be careful and avoid relying on build configurations in your CMakeLists.txt.

As mentioned earlier, your CMakeLists.txt should describe "How to build the project".
It means that you should be able to build the project even if someone adds or removes a configuration, changes optimization flags, etc.
Under no circumstances should it rely on, or make any assumption about, the presence or absence of a configuration.

What I mean by this is, do not force the value of variables such as CMAKE_<LANG>_FLAGS_<CONFIG>! Let the user change them from the cache.

For now, just know that if you need to change the CMAKE_CONFIGURATION_TYPES list of configurations from your CMakeLists.txt, it needs to be done before the first call to the project command (which populates the variable if not already set) and that you need to check whether a multi-config generator is used or if the variable was already set.

Do not set CMAKE_CONFIGURATION_TYPES if it is not defined, as it means a single configuration generator is being used.
In the same way, do not set CMAKE_BUILD_TYPE if a multi-config generator is being used!

This leads to code similar to this:

if(CMAKE_CONFIGURATION_TYPES AND (NOT "Coverage" IN_LIST CMAKE_CONFIGURATION_TYPES))
  list(APPEND CMAKE_CONFIGURATION_TYPES Coverage)
endif()

Whatever the case may be, leave control to the user and your package maintainer, and users will be happy!

Generator expressions

Since project regeneration only happens when you change a CMakeLists.txt file, you cannot simply write if(VARIABLE) when you want to rely on values that are only known during the generation step and not during configuration. Sometimes you would also need to write verbose code to retrieve a property of a target and use it as an input for another command.

To support such scenarios, generator expressions were introduced.
Instead of being resolved during the configuration step, they are resolved during buildsystem generation. This means that they know about the different configuration types for example.

They are not supported everywhere but most commands now accept them. Refer to the command documentation to know if it is the case.

The syntax for generator expressions is the following:

$<condition:true_string>

Or if you want a ternary operator:

$<IF:condition,true_string,false_string>

Note that in both cases, the only values allowed for condition are 0 and 1.
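
As a sketch of the ternary form (which requires CMake 3.8 or later), the following picks one of two definitions depending on the configuration. The BP_VERBOSITY macro is hypothetical and myawesomelib is assumed to be an existing target:

```cmake
target_compile_definitions(myawesomelib
  PRIVATE
    # $<CONFIG:Debug> evaluates to 1 for Debug builds and to 0 otherwise.
    $<IF:$<CONFIG:Debug>,BP_VERBOSITY=3,BP_VERBOSITY=1>
)
```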

Some useful examples

For a value based on an option, one can use the $<BOOL:string> logical operator to evaluate the string as a boolean and convert it to 0 or 1. This is necessary since generator expressions only understand those two values as the condition parameter.

Our previous example with BP_USE_AVX could then be replaced with:

# Defines USE_AVX=1 only if BP_USE_AVX is ON
target_compile_definitions(myawesomelib PRIVATE $<$<BOOL:${BP_USE_AVX}>:USE_AVX=1>)

But where generator expressions shine best is when using variable queries. Those can be used as conditions the following way:

target_sources(mytarget
  PRIVATE
    source/all-platforms.cpp
    $<$<PLATFORM_ID:Windows>:source/windows-only.cpp>
    $<$<PLATFORM_ID:Linux>:source/linux-only.cpp>
)

The above snippet adds source/all-platforms.cpp on all systems to mytarget, the source/windows-only.cpp file only if the current CMAKE_SYSTEM_NAME is Windows, and source/linux-only.cpp on Linux.

Another example would be:

target_compile_definitions(mytarget
  PRIVATE
    $<$<CONFIG:Debug>:DEBUG_MODE>
    $<$<COMPILE_LANGUAGE:CXX>:COMPILING_CXX>
    $<$<COMPILE_LANGUAGE:CUDA>:COMPILING_CUDA>
)

Here we tell CMake to add the define DEBUG_MODE for Debug builds, COMPILING_CXX when compiling C++ sources and COMPILING_CUDA when compiling CUDA sources.

Debugging

Since they are evaluated during the generation step and not the configuration step, they can be hard to debug.
As noted in the documentation, one can use add_custom_target(genexdebug COMMAND ${CMAKE_COMMAND} -E echo "$<...>") to debug generator expressions.
You then just need to build the genexdebug target to view the result.
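
A more complete sketch of this technique, together with an alternative based on file(GENERATE) that writes the evaluated expression to a file during generation. The expressions, the output file name and the mytarget target are only examples:

```cmake
# Prints the evaluated expression when the 'genexdebug' target is built.
add_custom_target(genexdebug
  COMMAND ${CMAKE_COMMAND} -E echo "$<$<CONFIG:Debug>:DEBUG_MODE>"
  VERBATIM
)

# Alternative: dump the evaluated expression to a file at generation time.
# $<CONFIG> keeps the file name unique for multi-config generators.
file(GENERATE
  OUTPUT "${CMAKE_BINARY_DIR}/genex-debug-$<CONFIG>.txt"
  CONTENT "Definitions: $<TARGET_PROPERTY:mytarget,COMPILE_DEFINITIONS>\n"
)
```

The custom target can be built with cmake --build . --target genexdebug from the build directory, while the generated file appears as soon as the project is generated.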

Whitespace in generator expressions

One of the main issues people encounter when using generator expressions is that you cannot add whitespace in the condition (first operand) part. For example, $<$<CONFIG:Debug>:DEBUG_MODE> is valid, but $< $<CONFIG:Debug> :DEBUG_MODE> is not valid.

A nice trick exists though! You can use the string(CONCAT) command to create a variable containing a generator expression defined on multiple lines and any whitespace:

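# Each piece is quoted so that it is passed as-is and concatenated without
# the surrounding whitespace and newlines.
string(CONCAT GENEXP_WITH_WHITESPACE
  "$<IF:$<BOOL:${SOME_VAR}>,"
    "SOME_VAR is true,"
    "SOME_VAR is false"
  ">"
)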

Which stores the following expression in the variable (with ${SOME_VAR} replaced by its value):

$<IF:$<BOOL:${SOME_VAR}>,SOME_VAR is true,SOME_VAR is false>

If you want to have quotes in the final expression, you will need to escape them though.

What's next

In the next post we will talk about CMake modules, packages and target installation!


Clément Grégoire

Footnotes

  1. Some variants of if still require you to dereference though, such as if(IS_DIRECTORY path-to-directory) because it takes a path as parameter, not a variable.