Thursday, 13 December 2018

Nightly build steps with Jenkins declarative pipeline

Jenkins declarative pipeline is a new(ish) pipeline syntax for Jenkins that allows you to write pipeline-as-code in a Jenkinsfile. Instead of writing Groovy code as in a traditional pipeline, a declarative pipeline uses a custom DSL. A simple example that builds a C++ application is below:

pipeline {
    agent any
    stages {
        stage('Clean Old Builds') {
            steps {
                sh 'rm -fr _builds'
            }
        }

        stage('Build') {
            steps {
                sh 'cmake -H. -B_builds/default'
                sh 'cmake --build _builds/default'
            }
        }

        stage('Test') {
            steps {
                sh 'cmake --build _builds/default --target test'
            }
        }
    }
}

The above example will have 3 stages that can run on any build machine.

  • Stage 1, Clean Old Builds, will remove any old builds.
  • Stage 2, Build, will run cmake and make.
  • Stage 3, Test, will run any unit tests.

This can be cleaner and easier to read than the older Groovy-based pipeline syntax.

Nightly Builds

In a previous post, I discussed how to run nightly builds using Jenkins Groovy pipelines. I tried to use the same isJobStartedByTimer function with a declarative pipeline, as follows:

def isTimerBuild = isJobStartedByTimer()
pipeline {
...
    triggers {
        cron(env.BRANCH_NAME == 'master' ? 'H H(0-2) * * *' : '')
    }
    stages {
        ...
        stage('Nightly') {
            when {
                expression {
                    isTimerBuild == true
                }
            }
            steps {
                sh 'cmake --build _builds/default --target cppcheck'
            }
        }
    }
}
// define isJobStartedByTimer function as before
@NonCPS
def isJobStartedByTimer() {
  ....
}

Unfortunately, this failed with an unauthorized error. I was unable to add the required values to the In Process Script Approval, as they never appeared in the approval window.

Improved Method

While looking for a solution to the above problem, I came across the following feature request in Jenkins. This was added to the pipeline workflow api plugin in v2.22. The core of this change is that it allows access to the build causes via the currentBuild variable, instead of via the access-restricted currentBuild.rawBuild.

currentBuild.getBuildCauses()
currentBuild.getBuildCauses(String superClassName)

When using these functions, you do not have to enable any In Process Script Approvals. To get this to work, I updated my isJobStartedByTimer function as follows:

@NonCPS
def isJobStartedByTimer() {
    try {
        if(currentBuild.getBuildCauses('hudson.triggers.TimerTrigger$TimerTriggerCause').size() > 0) {
            echo "build started by timer"
            return true
        }
    } catch(theError) {
        echo "Error getting build cause: ${theError}"
    }
    return false
}

Using this updated function, and the example pipeline from earlier, allows me to have build steps that only run at night.

Alternative Improved Method

In some cases I found that the above improved method caused a ClassNotFoundException saying that the TimerTrigger class did not exist. An alternative, which can be used if you receive this error, is the currentBuild.getBuildCauses() version of the function. This works as follows:

@NonCPS
def isJobStartedByTimer() {
    try {
        for ( buildCause in currentBuild.getBuildCauses() ) {
            if (buildCause != null) {
                if (buildCause.shortDescription.contains("Started by timer")) {
                    echo "build started by timer"
                    return true
                }
            }
        }
    } catch(theError) {
        echo "Error getting build cause: ${theError}"
    }
    return false
}

Thursday, 6 December 2018

Cppcheck: More Checks

In my previous posts, I discussed some errors that Cppcheck can help you find. In this post I'm going to look at some more errors that Cppcheck can prevent.

Memory Leak from realloc

When using realloc, a failure to allocate data will return a null pointer. This can lead to a common error where the original data that you tried to reallocate is leaked. For example:

void realloc_issue() {
    int* data = (int*)malloc(4 * sizeof(int));
    // do something
    data = (int*)realloc(data, 8*sizeof(int));
    // do something
    free(data);
}

In the above code, the original data variable is leaked if the call to realloc fails. Using Cppcheck you will get the following warning about the code:

[realloc.cpp:4]: (error) Common realloc mistake: 'data' nulled but not freed upon failure

To fix this error, you should use a temporary variable to take the result returned from realloc, and only assign it to data if it is valid memory:

int* new_data = (int*)realloc(data, 8*sizeof(int));
if(new_data)
    data = new_data;
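
For completeness, a corrected version of the earlier function might look something like this (a sketch; a real application would also decide how to handle the failed reallocation):

#include <cstdlib>

void realloc_fixed() {
    int* data = (int*)malloc(4 * sizeof(int));
    // do something
    int* new_data = (int*)realloc(data, 8 * sizeof(int));
    if(new_data)
        data = new_data; // reallocation succeeded
    // on failure, data still points at the original allocation and is freed below
    free(data);
}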

Invalidated Iterator

When using STL containers such as std::vector, methods such as push_back and resize can invalidate iterators. Cppcheck has a check which can help to catch this mistake. For example:

void pb_it() {
    std::vector<std::string> v;
    v.push_back("first");
 
    std::vector<std::string>::iterator first = v.begin();
    for(int i = 0; i < 100; ++i) {
        v.push_back("loop");
    }
 
    std::cout << *first << "\n";
}

This will result in the following warning:

[push_back.cpp:10]: (error) After push_back(), the iterator 'first' may be invalid.

Note: If you use auto for the iterator or a standard algorithm to fill your vector, then Cppcheck will fail to warn about the error. For example, neither of the following will produce a warning:

std::vector<std::string>::iterator first = v.begin();
std::fill_n(std::back_inserter(v), 100, "loop");
std::cout << *first << "\n";

or

auto first = v.begin();
for(int i = 0; i < 100; ++i) {
    v.push_back("loop");
}
std::cout << *first << "\n";

Invalid Operation on File

When using the C library functions to work on a FILE*, there is no type-safe way at compile time to catch code that reads from a write-only file, or writes to a read-only file. However, Cppcheck has warnings which can catch this error.

void read_on_write_only() {
    FILE* f = fopen("xxxw", "w");
    std::array<char, 100> buf;
    fread(buf.data(), 1, 100, f);
    fclose(f);
}  
 
void write_on_read_only() {
    FILE* f = fopen("xxxf", "r");
    int x = 0;
    fwrite(&x, sizeof(int), 1, f);
    fclose(f);
}  

In the above code, we use fread on a file that was opened write-only, and fwrite on a file that was opened read-only. Cppcheck will output the following warnings:

[file_ops.cpp:4]: (error) Read operation on a file that was opened only for writing.
[file_ops.cpp:11]: (error) Write operation on a file that was opened only for reading.

Note: This doesn't warn with std::fstream. For example, if you open a file with ios_base::in and try to write to it, then Cppcheck will fail to warn.
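
For example, something like the following goes undetected (a sketch using the same file name as above):

#include <fstream>

void write_on_read_only_stream() {
    // opened for reading only, but written to: Cppcheck v1.85 stays silent
    std::fstream f("xxxf", std::ios_base::in);
    f << 1;
}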

Class Checks

Cppcheck has a number of both style and warning checks that can hint at potential problems with your class design. For example, take this poorly designed class:

#include <string>
 
class Multipler
{
public:
    Multipler(int x) : x_(new int(x))
    {
    }
    ~Multipler() = default;
    Multipler& operator=(const Multipler& rhs) {
        x_ = new int(*rhs.x_);
        return *this;
    }
    int multiply(int y) const { return y * (*x_); }
 
    void name(const std::string& name) { name_ = name; }
private:
    int* x_;
    std::string name_;
    int vx_;
};

On most compilers this will compile with no warnings. However, Cppcheck will output the following warnings:

[class_design.cpp:6]: (warning) Member variable 'Multipler::vx_' is not initialized in the constructor.
[class_design.cpp:10]: (warning) Member variable 'Multipler::vx_' is not assigned a value in 'Multipler::operator='.
[class_design.cpp:10]: (warning) 'operator=' should check for assignment to self to avoid problems with dynamic memory.
[class_design.cpp:6]: (style) Class 'Multipler' does not have a copy constructor which is recommended since it has dynamic memory/resource allocation(s).
[class_design.cpp:6]: (style) Class 'Multipler' has dynamic memory/resource allocation(s). The destructor is explicitly defaulted but the default destructor does not work well. It is recommended to define the destructor.
[class_design.cpp:6]: (style) Class 'Multipler' has a constructor with 1 argument that is not explicit.
[class_design.cpp:11]: (warning) Possible leak in public function. The pointer 'x_' is not deallocated before it is allocated.

These are good checks, however, you need to have style warnings enabled for some of them. If you only use --enable=warning, Cppcheck will not warn about the fact that you fail to delete x_ in the destructor.
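
For example, to include the style checks when scanning the class above, you could run:

$ cppcheck --enable=warning,style class_design.cpp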

Conclusions about Cppcheck

Cppcheck is a good tool to help you write more robust and safe code. However, it is not a silver bullet to prevent errors. As noted in some of my examples, a small change to your code can mean that Cppcheck will fail to catch an error.

I have found that Cppcheck can be very helpful when taking over an old codebase that uses an older style of C++. After enabling Cppcheck, you can often catch errors that have existed in the code for years and that the developers have been lucky not to hit (e.g. mistakes in error handlers). If you are using a more modern C++ style, Cppcheck might not be up to date with all changes and may miss some warnings. However, I would still recommend running it in your CI system to prevent errors from reaching your production code.

Thursday, 29 November 2018

Cppcheck: Basic Checks

In my previous posts, I've given an intro to Cppcheck and also described how to integrate it with CMake.

In this post I'm going to start looking at some of the errors that Cppcheck can help you find in your code.

Memory Leaks

Every C++ developer will have at some point written code which includes a memory leak. With modern C++ we have tools such as std::unique_ptr and std::shared_ptr to help with resource allocation, however, when using raw pointers (e.g. in legacy code), using a tool such as Cppcheck can help catch errors. For example:

#include <string>
#include <iostream>
 
std::string* my_to_string(int x) {
    return new std::string(std::to_string(x));
}
 
void print_ints() {
    for(int i = 0; i < 10; ++i) {
        std::string* tmp = my_to_string(i);
        std::cout << *tmp << "\n";
    }
}

When run through a compiler, this code doesn't produce any warnings or errors, even with all warnings enabled. However, when run through Cppcheck, we can see:

$ cppcheck --enable=warning  leak.cpp
Checking leak.cpp ...
[leak.cpp:12]: (error) Memory leak: tmp

Out of Bounds

Out of bounds memory access can cause issues such as a segfault or leaking of sensitive data. Using Cppcheck can help to prevent OOB errors from reaching your final release. For example:

#include <array>
#include <iostream>
 
const int SZ = 5;
 
std::array<int, SZ> array_oob() {
    std::array<int, SZ> arr;
    for(int i = 0; i <= SZ; ++i) {
        arr[i] = i;
    }
    return arr;
}
 
void carray_oob() {
    int arr[5];
    for(int i = 0; i <= SZ; ++i) {
        arr[i] = i;
    }
}
 
int from_outside(int c) {
    std::array<int, SZ> arr;
    return arr[c];
}
 
void use_from_outside() {
    from_outside(100);
}

Will produce the following errors when run with Cppcheck:

$ cppcheck --enable=warning oob.cpp
Checking oob.cpp ...
[oob.cpp:9]: (error) Array 'arr[5]' accessed at index 5, which is out of bounds.
[oob.cpp:17]: (error) Array 'arr[5]' accessed at index 5, which is out of bounds.
[oob.cpp:23]: (error) Array 'arr[5]' accessed at index 100, which is out of bounds.

Interestingly using Cppcheck v1.85 the following code does not produce any error:

std::vector<int> vector_oob() {
    std::vector<int> arr;
    for(int i = 0; i <= SZ; ++i) {
        arr[i] = i;
    }
    return arr;
}

Use After Move

Cppcheck will detect the incorrect usage of a moved from variable in the following code:

#include <string>
#include <iostream>
 
void print(std::string&& to_print) {
    std::string moved(std::move(to_print));
    std::cout << moved << "\n";
}
 
void use_after_move() {
    std::string hello = "Hello";
    print(std::move(hello));
    hello[0] = 'h';
    std::cout << hello << "\n";
}

With the following error:

$ cppcheck --enable=warning uam.cpp
Checking uam.cpp ...
[uam.cpp:13]: (warning) Access of moved variable 'hello'.
[uam.cpp:14]: (warning) Access of moved variable 'hello'.

Thursday, 22 November 2018

Integrating Cppcheck and CMake

In my previous post, I gave an introduction to the static analysis tool Cppcheck. In this post, I'm going to show how to integrate Cppcheck into the CMake build system to generate a cppcheck-analysis target for the make / ninja build systems.

CMake

CMake is a meta build system that can be used to generate files for various build systems, including make and ninja. I would recommend having a basic knowledge of how CMake works before reading this post. For an introduction to CMake, you can read my CMake tutorial.

FindCppcheck

In CMake you can find dependencies using a find_package(ModuleName) command. This command looks for a file called FindModuleName.cmake and parses it to find the dependency and add its requirements. For Cppcheck, you can add a FindCppcheck.cmake in a folder cmake/modules off the root of your project.

You should then add the following to your root CMakeLists.txt to tell it where to search for modules:

# Add a custom CMake Modules directory
set(CMAKE_MODULE_PATH ${CMAKE_SOURCE_DIR}/cmake/modules ${CMAKE_MODULE_PATH})

The code for FindCppcheck.cmake is below:

# Locate cppcheck and add a cppcheck-analysis target
#
# This module defines
#  CPPCHECK_BIN, where to find cppcheck
#
# To help find the binary you can set CPPCHECK_ROOT_DIR to search a custom path
# Exported arguments include
#   CPPCHECK_FOUND, if false, do not try to use cppcheck --- if (CPPCHECK_FOUND)
#  
#   CPPCHECK_THREADS_ARG - Number of threads to use (default -j4)
#   CPPCHECK_PROJECT_ARG - The project to use (compile_commands.json)
#   CPPCHECK_BUILD_DIR_ARG - The build output directory (default - ${PROJECT_BINARY_DIR}/analysis/cppcheck)
#   CPPCHECK_ERROR_EXITCODE_ARG - The exit code if an error is found (default --error-exitcode=1)
#   CPPCHECK_SUPPRESSIONS - A suppressions file to use (defaults to .cppcheck_suppressions)
#   CPPCHECK_EXITCODE_SUPPRESSIONS - An exitcode suppressions file to use (defaults to .cppcheck_exitcode_suppressions)
#   CPPCHECK_CHECKS_ARGS - The checks to run (defaults to --enable=warning)
#   CPPCHECK_OTHER_ARGS - Any other arguments
#   CPPCHECK_COMMAND - The full command to run the default cppcheck configuration
#   CPPCHECK_EXCLUDES - A list of files or folders to exclude from the scan. Must be the full path
#  
# If CPPCHECK_XML_OUTPUT is set to an output file name before calling this, cppcheck will create an XML file with that name
# find the cppcheck binary
 
# if custom path check there first
if(CPPCHECK_ROOT_DIR)
    find_program(CPPCHECK_BIN
        NAMES
        cppcheck
        PATHS
        "${CPPCHECK_ROOT_DIR}"
        NO_DEFAULT_PATH)
endif()
 
find_program(CPPCHECK_BIN NAMES cppcheck)
 
if(CPPCHECK_BIN)
    execute_process(COMMAND ${CPPCHECK_BIN} --version
                  OUTPUT_VARIABLE CPPCHECK_VERSION
                  ERROR_QUIET
                  OUTPUT_STRIP_TRAILING_WHITESPACE)
 
    set(CPPCHECK_THREADS_ARG "-j4" CACHE STRING "The number of threads to use")
    set(CPPCHECK_PROJECT_ARG "--project=${PROJECT_BINARY_DIR}/compile_commands.json")
    set(CPPCHECK_BUILD_DIR_ARG "--cppcheck-build-dir=${PROJECT_BINARY_DIR}/analysis/cppcheck" CACHE STRING "The build directory to use")
    # Don't show these errors
    if(EXISTS "${CMAKE_SOURCE_DIR}/.cppcheck_suppressions")
        set(CPPCHECK_SUPPRESSIONS "--suppressions-list=${CMAKE_SOURCE_DIR}/.cppcheck_suppressions" CACHE STRING "The suppressions file to use")
    else()
        set(CPPCHECK_SUPPRESSIONS "" CACHE STRING "The suppressions file to use")
    endif()
 
    # Show these errors but don't fail the build
    # These are mainly going to be from the "warning" category that is enabled by default later
    if(EXISTS "${CMAKE_SOURCE_DIR}/.cppcheck_exitcode_suppressions")
        set(CPPCHECK_EXITCODE_SUPPRESSIONS "--exitcode-suppressions=${CMAKE_SOURCE_DIR}/.cppcheck_exitcode_suppressions" CACHE STRING "The exitcode suppressions file to use")
    else()
        set(CPPCHECK_EXITCODE_SUPPRESSIONS "" CACHE STRING "The exitcode suppressions file to use")
    endif()
 
    set(CPPCHECK_ERROR_EXITCODE_ARG "--error-exitcode=1" CACHE STRING "The exitcode to use if an error is found")
    set(CPPCHECK_CHECKS_ARGS "--enable=warning" CACHE STRING "Arguments for the checks to run")
    set(CPPCHECK_OTHER_ARGS "" CACHE STRING "Other arguments")
    set(_CPPCHECK_EXCLUDES)
 
    ## set exclude files and folders
    foreach(ex ${CPPCHECK_EXCLUDES})
        list(APPEND _CPPCHECK_EXCLUDES "-i${ex}")
    endforeach(ex)
 
    set(CPPCHECK_ALL_ARGS
        ${CPPCHECK_THREADS_ARG}
        ${CPPCHECK_PROJECT_ARG}
        ${CPPCHECK_BUILD_DIR_ARG}
        ${CPPCHECK_ERROR_EXITCODE_ARG}
        ${CPPCHECK_SUPPRESSIONS}
        ${CPPCHECK_EXITCODE_SUPPRESSIONS}
        ${CPPCHECK_CHECKS_ARGS}
        ${CPPCHECK_OTHER_ARGS}
        ${_CPPCHECK_EXCLUDES}
    )
 
    # run cppcheck command with optional xml output for CI system
    if(NOT CPPCHECK_XML_OUTPUT)
        set(CPPCHECK_COMMAND
            ${CPPCHECK_BIN}
            ${CPPCHECK_ALL_ARGS}
        )
    else()
        set(CPPCHECK_COMMAND
            ${CPPCHECK_BIN}
            ${CPPCHECK_ALL_ARGS}
            --xml
            --xml-version=2
            2> ${CPPCHECK_XML_OUTPUT})
    endif()
 
endif()
 
 
 
# handle the QUIETLY and REQUIRED arguments and set CPPCHECK_FOUND to TRUE if all listed variables are TRUE
include(FindPackageHandleStandardArgs)
FIND_PACKAGE_HANDLE_STANDARD_ARGS(
    CPPCHECK
    DEFAULT_MSG
    CPPCHECK_BIN)
 
mark_as_advanced(
    CPPCHECK_BIN 
    CPPCHECK_THREADS_ARG
    CPPCHECK_PROJECT_ARG
    CPPCHECK_BUILD_DIR_ARG
    CPPCHECK_ERROR_EXITCODE_ARG
    CPPCHECK_SUPPRESSIONS
    CPPCHECK_EXITCODE_SUPPRESSIONS
    CPPCHECK_CHECKS_ARGS
    CPPCHECK_EXCLUDES
    CPPCHECK_OTHER_ARGS)
 
# If found add a cppcheck-analysis target
if(CPPCHECK_FOUND)
    file(MAKE_DIRECTORY ${CMAKE_BINARY_DIR}/analysis/cppcheck)
    add_custom_target(cppcheck-analysis
        COMMAND ${CPPCHECK_COMMAND})
    message("cppcheck found. Use cppccheck-analysis targets to run it")
else()
    message("cppcheck not found. No cppccheck-analysis targets")
endif()

This file includes comments to describe the various sections. Below are some common settings that you may wish to override before calling it.

Checks to enable

To change the level of enabled checks you can use the CPPCHECK_CHECKS_ARGS variable. By default this is set to --enable=warning. You can override it to any of the supported check levels in Cppcheck.
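
For example, to also enable the style checks, you could set the cache variable before calling find_package (a sketch; the exact checks you enable will depend on your project):

set(CPPCHECK_CHECKS_ARGS "--enable=warning,style" CACHE STRING "Arguments for the checks to run")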

Suppression

As discussed in my previous post you can suppress some warnings in Cppcheck using a suppression file. By default the above script will look for a suppressions file in .cppcheck_suppressions. If it exists it will add this to the command when calling cppcheck. To override this file you can change the CPPCHECK_SUPPRESSIONS argument.

set(CPPCHECK_SUPPRESSIONS "--suppressions-list=${CMAKE_SOURCE_DIR}/my_suppressions")
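
For reference, the suppressions file itself is a plain-text list of Cppcheck warning ids, each optionally scoped to a file (and line). A minimal sketch with hypothetical entries:

memleak:src/legacy/parser.cpp
unusedFunction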

Exit code

When Cppcheck finds errors, you can tell it what error code to use when exiting the program. With the above script, the default exit code is 1. To override this you can use CPPCHECK_ERROR_EXITCODE_ARG. To set it to use the Cppcheck default:

set(CPPCHECK_ERROR_EXITCODE_ARG "")

XML output

Cppcheck supports outputting the results via XML. This can be helpful for CI systems to read the output and add a report. To enable XML output you can set the CPPCHECK_XML_OUTPUT variable to the file you want to use:

set(CPPCHECK_XML_OUTPUT "${PROJECT_BINARY_DIR}/analysis/cppcheck/cppcheck_analysis.xml")

Excluding files

If you have vendored third-party code in your repository, you may not want to include it in your Cppcheck analysis. To exclude a file or folder you can create a list CPPCHECK_EXCLUDES:

set(CPPCHECK_EXCLUDES
    ${CMAKE_SOURCE_DIR}/3rd_party
    ${CMAKE_BINARY_DIR}/
)

Calling FindCppcheck

To call Cppcheck you might add something similar to the below snippet to your root CMakeLists.txt:

set(CPPCHECK_XML_OUTPUT "${PROJECT_BINARY_DIR}/analysis/cppcheck/cppcheck_analysis.xml")
 
set(CPPCHECK_EXCLUDES
    ${CMAKE_SOURCE_DIR}/3rd_party
    ${CMAKE_BINARY_DIR}/
)
 
find_package(Cppcheck)

This will then add a cppcheck-analysis target to your build system. By default this target will fail your build if there are any static analysis errors.
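
For example, with a make-based generator:

$ cmake ..
$ make cppcheck-analysis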

Conclusion

In this post, I've described how you can integrate Cppcheck into your build system using CMake. This allows you to provide a simplified interface so that all developers and your CI system can easily call Cppcheck as they would any other build target.

The new cppcheck-analysis target allows new developers to quickly call Cppcheck, and also supports incremental static analysis to encourage running it on larger projects.

Thursday, 15 November 2018

Intro to CppCheck

Cppcheck is an open-source static analysis tool for C++. In this post I'm going to describe how to install and run Cppcheck over your C++ code base.

Static Analysis

Static analysis is the analysis of code without executing it. It can be used to find common programming errors and enforce coding guidelines. Some common benefits of static analysis include:

  • Prevent unexpected / undefined behaviour.
  • Find security vulnerabilities and make code more secure.
  • Make code more maintainable by enforcing coding standards.

The very first static analysis tool that programmers use is their compiler. In my previous blog posts, I've discussed how using multiple compilers can help catch errors and prevent undefined behaviour. Using multiple compilers is a great first step in detecting issues, however, dedicated static analysis tools can help to prevent more errors as they can look at more detailed and complex code paths to analyse issues.

Cppcheck

Cppcheck is a standalone static analysis tool that can perform the following checks:

  • Undefined Behaviour
    • Dead pointers
    • Integer Overflow
    • Null pointer dereferences
    • Out of bounds checking
  • Security checking including some common CVE errors
  • Coding Standards

It is integrated into many common CI systems, IDEs and programming environments, allowing it to be easily run over different code bases.

Installing

On Linux, most common package managers include a version of Cppcheck, however, this version is often older and out of date. As a result, I recommend installing the latest version, which can be found on the Cppcheck website.

For Windows, a binary installer is available.

For other platforms, it is easy to build the latest version from source. The only requirements are CMake and a C++11 compatible compiler. To build and install v1.85 (the latest at time of writing), you can run:

$ wget https://github.com/danmar/cppcheck/archive/1.85.tar.gz
$ tar xvf 1.85.tar.gz
$ cd cppcheck-1.85
$ mkdir build
$ cd build
$ cmake ..
$ make
$ sudo make install

You should then have cppcheck available in your path:

$ cppcheck --version
Cppcheck 1.85

Basic Usage

Single File

To run Cppcheck against a single cpp file you can run cppcheck filename.cpp. For a simple example, take the following file.

#include <array>
 
int main() {
  const int ex_sz = 5;
  std::array<int, ex_sz> ex;
  ex[ex_sz] = 1;
  return 1;
}

Under GCC and Clang this compiles without any warnings using -Wall. However, with Cppcheck I see:

$ cppcheck test.cpp
Checking test.cpp ...
[test.cpp:6]: (error) Array 'ex[5]' accessed at index 5, which is out of bounds.

This simple test shows us how using Cppcheck as a static analysis tool can save us from an out of bounds error.

Header Files

If you have header files, you can use the -I flag to tell cppcheck where to search for them.

cppcheck -I inc/ with_header.cpp

Recursive Checks

To recursively check all cpp files in a folder you can pass the folder name to Cppcheck.

cppcheck -I inc/ src/

Enabling Checks

As mentioned, Cppcheck has a number of different checks available. You can enable or disable checks by using the --enable flag.

--enable=<id>        Enable additional checks. The available ids are:
                      * all
                              Enable all checks. It is recommended to only
                              use --enable=all when the whole program is
                              scanned, because this enables unusedFunction.
                      * warning
                              Enable warning messages
                      * style
                              Enable all coding style checks. All messages
                              with the severities 'style', 'performance' and
                              'portability' are enabled.
                      * performance
                              Enable performance messages
                      * portability
                              Enable portability messages
                      * information
                              Enable information messages
                      * unusedFunction
                              Check for unused functions. It is recommend
                              to only enable this when the whole program is
                              scanned.
                      * missingInclude
                              Warn if there are missing includes. For
                              detailed information, use '--check-config'.
                     Several ids can be given if you separate them with
                     commas. See also --std

Usage for an existing project

As you can see from the above you have to pass details about your source code to Cppcheck in order for it to understand what files to analyse, how to find dependencies, and what compiler flags to use. The easiest way to do this is to use a compilation database.

If you are using CMake you can create a compilation database using the -DCMAKE_EXPORT_COMPILE_COMMANDS=ON flag, or by adding the following to your root CMakeLists.txt.

set(CMAKE_EXPORT_COMPILE_COMMANDS ON)

After running CMake with this enabled a file compile_commands.json will be created. This will look like:

[
  {
    "directory": "/home/user/development/project",
    "command": "/usr/bin/c++ -Iinc -std=c+11 ... -c ../foo/foo.cc",
    "file": "../foo/foo.cc"
  },
 
  ...
 
  {
    "directory": "/home/user/development/project",
    "command": "/usr/bin/c++ -Iinc -std=c+11 ... -c ../foo/bar.cc",
    "file": "../foo/bar.cc"
  }
]

Once you have the compilation database available, you can then run Cppcheck as:

cppcheck --enable=all --project=compile_commands.json

Conclusions

This post showed how to install and run Cppcheck over your project. It also showed a simple example of how using static analysis can help you to detect errors. In future posts, I hope to show more advanced features of Cppcheck and also some common errors that it can help detect.

Thursday, 8 November 2018

Warnings Series - Unused Result

In this series of posts I'm discussing how compiling with multiple compilers and warnings enabled can help to catch errors at development time. In the last post I discussed how to catch format errors and use compiler specific methods to annotate your code. In this post, I'm going to look at a standards compliant way to annotate your code to help the compiler identify potential errors.

Background

Starting with C++11, it is possible to add attribute specifiers to your code. These can give both your users and the compiler hints on how to treat your code. An example of such an attribute is the deprecated attribute to warn that some code (e.g. class, namespace, or function) should no longer be used and may be removed in the future.
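
For example, a function could be marked as deprecated as follows (note that [[deprecated]] itself arrived in C++14; the names here are illustrative):

[[deprecated("use new_api() instead")]]
void old_api();

void new_api();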

Unused Result

The unused-result warning is issued when a function that uses the [[nodiscard]] attribute specifier is called and the returned value is not used. This attribute specifier was added to the language in C++17. It can be used to annotate functions, classes and enums.

nodiscard functions

The [[nodiscard]] attribute encourages the compiler to issue a warning if the return value of a function declared nodiscard is not captured. For example, the following code will issue an unused result warning.

[[nodiscard]] int* allocate_int()
{
    return new int(1);
}
 
void test() {
    allocate_int();
}

In this code, the return value of allocate_int is a raw pointer and should always be deleted. By adding the [[nodiscard]] attribute, we are telling the compiler to warn users of the function if they ignore the value. An example of such a warning from Clang is:

warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result]
    allocate_int();
    ^

Note: This only checks that the returned value is captured and not that it is used correctly.
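
For example, the following compiles without an unused-result warning, even though the memory is still leaked (a sketch reusing allocate_int from above):

void test_captured() {
    int* value = allocate_int(); // no warning: the return value is captured
    // ... but value is never deleted, so the allocation still leaks
}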

nodiscard classes

The [[nodiscard]] attribute can also be added to classes (and enums) to warn when they are returned by value and ignored. For example:

struct [[nodiscard]] use_me { };
 
use_me do_stuff() {
    return use_me{};
}
 
void test_use_me() {
    do_stuff();
}

This will issue a warning that the use of do_stuff in test_use_me is not using the returned value from that function. This is only applicable when you return the class by value. For example, the following code will not issue a warning:

use_me USE_ME{};
 
use_me& do_more_stuff() {
    return USE_ME;
}
 
void test_use_me_ref() {
    do_more_stuff();
}

Ignoring nodiscard

If a function is marked [[nodiscard]] but for some reason you know you can safely ignore the result, you can use a void cast to ignore the value. In the case of our do_stuff function above, we could call it as below to safely ignore the warning:

static_cast<void>(do_stuff());

While C style casts are often discouraged in new code, in a case such as this one they can make for easier to read code.

(void)do_stuff();

Catching the error

Clang

With newer versions of Clang, you will receive the unused-result warning when compiling with C++11, 14 or 17 on all warning levels. With versions prior to v3.9, you may receive a warning that the nodiscard attribute is not known.

GCC

Similar to Clang, with newer versions of GCC you will receive the unused-result warning when compiling with the C++11, 14 or 17 standards on all warning levels. For versions earlier than GCC 7, you will receive the unknown-attribute warning.

MSVC

On MSVC 2017, you will receive warning C4834 when compiling with the /std:c++latest flag. For example:

<source>(12): warning C4834: discarding return value of function with 'nodiscard' attribute
<source>(25): warning C4834: discarding return value of function with 'nodiscard' attribute

Earlier versions of MSVC do not appear to display any related warnings; however, they do ignore the unknown attribute and allow the code to compile.

Comparing the errors

To view the errors from each compiler you can use godbolt to compare the output.

Thursday, 1 November 2018

Warnings Series - Format

In this series of posts I'm discussing how compiling with multiple compilers and warnings enabled can help to catch errors. In the last post I discussed how to turn return type warnings into errors. In this post, I'm going to look at another warning that I often promote to an error.

Background

As discussed in my previous posts, you can turn specific warnings into errors using the -Werror=warning-name command line flag on GCC and Clang. In some cases you can also annotate code in order to enable checking it for a warning type.

Format

The format warning lets you know when you have used the incorrect format specifier for the printf family of functions. For example:

long i = 4294967295;
printf("%d\n", i);

In the above example, using %d instead of %ld may print the value -1 instead of the expected value 4294967295. For a simple case such as a log statement, the incorrect value might not matter; however, if you are serializing a value to a file it may cause issues for your users.

In another example:

long i = 1;
printf("%s\n", i);

You are attempting to print a number using the string format specifier. This specifier expects a null terminated string, and as a result it will cause undefined behaviour which could create security and stability issues.

Catching the error

Clang

When compiling with Clang this is a warning by default when using no compiler flags. To promote the warning to an error you can use -Werror=format. Below is an example of the warning from Clang:

<source>:23:18: warning: format specifies type 'int' but the argument has type 'int64_t' (aka 'long') [-Wformat]
  printf("%d\n", i);
          ~~     ^
          %ld
<source>:24:18: warning: format specifies type 'char *' but the argument has type 'int64_t' (aka 'long') [-Wformat]
  printf("%s\n", i);
          ~~     ^
          %ld
2 warnings generated.

GCC

For GCC the warning is not enabled by default. It is included as part of the -Wall flag or can be enabled individually using -Wformat. To promote it to an error you can use -Werror=format.

Note: From v8.0 of GCC you will receive a warning with no additional compiler flags.

MSVC

With MSVC the warning for standard printf functions is enabled by default.

Attributing your own functions

It can sometimes be the case that you want to make your own function which takes a format specifier (e.g. a logging function):

void my_logger(int level, const char* format, ...)
{
    if(level >= LOGGING_LEVEL) {
        va_list args;
        va_start(args, format);
        vfprintf(stderr, format, args);
        va_end(args);
    }
}

Which you can call as:

my_logger(WARN,"%s\n", i);

In this case the normal format warning won't appear in your code to warn you of errors. If using GCC or Clang, you can add a non-standard function attribute to tell the compiler that a function takes a format argument.

void my_logger(int level, const char* format, ...) __attribute__ ((format (printf, 2, 3)));

After adding this, you will receive the following warning:

<source>:26:27: warning: format specifies type 'char *' but the argument has type 'int64_t' (aka 'long') [-Wformat]
  my_logger(WARN, "%s\n", i);
                   ~~     ^
                   %ld

Note: This is not available on MSVC, so the addition of this attribute should be hidden behind a conditional compiler flag.
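
For example, one way to do this is with a wrapper macro (the PRINTF_FORMAT name here is my own, a sketch rather than a standard solution):

#if defined(__GNUC__) || defined(__clang__)
#define PRINTF_FORMAT(fmt_idx, args_idx) __attribute__((format(printf, fmt_idx, args_idx)))
#else
#define PRINTF_FORMAT(fmt_idx, args_idx) // MSVC: expands to nothing
#endif

void my_logger(int level, const char* format, ...) PRINTF_FORMAT(2, 3);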

Comparing the errors

To view the errors from each compiler you can use godbolt to compare the output.

Thursday, 25 October 2018

Warnings Series - Return Type

In this series of posts I'm discussing how compiling with multiple compilers can help to catch errors at development time. In the last post I discussed how to detect out of range errors. In this post, I'm going to look at turning some warnings into errors. As an example, I'm using the first warning that I promote to an error on every project I work on.

Background

As discussed in my previous posts, you can enable warnings under GCC and Clang using the -W flags. This can turn on a group of warnings, for example using -Wall, or an individual warning like -Wmaybe-uninitialized. If you want to be stricter, you can promote those warnings to errors that will stop your program from compiling. This can be done by adding -Werror= as a compiler flag, e.g. -Werror=maybe-uninitialized will turn uninitialized warnings into errors.

Return Type

All compilers have a warning about a missing return statement in a function. Consider the following code:

int* allocate_int() {
    int* ret = new int(0);
}
 
int main() {
    int* x = allocate_int();
    std::cout << *x;
    delete x;
}

As you can see, the function allocate_int is missing a return statement, therefore the variable x is undefined. This can introduce strange logic errors that differ between builds and runs of your program. One of the first times I ran into this error, the symptom was that if logging was enabled, a variable was true and the program worked; if logging was disabled, an error was always returned as the variable was false.

Since then, this is always the first warning that I promote to an error on all projects I work on. For new projects, it prevents the error from being introduced. For legacy projects, it helps find and prevent subtle errors. In most legacy projects I've worked on, enabling this error has resulted in a failed compile.

Note: There is no return statement on the main function and this doesn't produce a warning or error. The reason for this is that the C++ standard specifies that main will implicitly return 0 if there is no return statement. This implicit return has resulted in papers such as p1276 calling for main to allow void.

Catching the error

Clang

When compiling with Clang, this is a warning by default with no compiler flags. To promote the warning to an error you can use -Werror=return-type.

GCC

For GCC the warning is not enabled by default in older versions of the compiler. It is included as part of the -Wall flag or can be enabled individually using -Wreturn-type. To promote it to an error you can use -Werror=return-type.

Note: From v8.0 of GCC you will receive a warning with no additional compiler flags, however, you must still use -Werror=return-type to cause a compilation error.

MSVC

With MSVC this results in a compilation error by default.

Comparing the errors

To view the errors from each compiler you can use godbolt to compare the output.

Thursday, 18 October 2018

Warnings Series - Tautological Out of Range

In this series of posts I'm discussing how compiling with alternative compilers can help to catch errors at development time. In the last post I discussed how to detect uninitialized errors. In this post, I'm going to look at some logic errors related to using incorrect types. The type of error in this post was encountered when consuming a library and ended up in production code.

Background

For this example, we have a library that returns a result as a uint8_t. The values to return come from defines in the header file.

// coin.h
#define COIN_HEAD 0
#define COIN_TAIL 1
#define COIN_ERR 0xFFFF
 
uint8_t get_coin(const std::string& coin);

Tautological Out of Range

The code to consume the above header file is as follows:

// game.cpp
int main() {
    auto c = get_coin("head");
    if(c != COIN_ERR) {
        std::cout << "valid coin " << c << std::endl;
    }
 
    return 0;
}

In the above code, we get the number representation of a coin face based on the string. If the string is valid, it is printed, otherwise we do nothing.

This code compiles with no warnings on GCC with -Wall.

Problems with the code

The problem with the above example is that if you pass an unknown string into get_coin, the returned COIN_ERR value is truncated from 0xFFFF to 0xFF, because the return type is uint8_t. When you then do the comparison c != COIN_ERR, it is always true, because c can never be equal to 0xFFFF. This is caused because COIN_ERR and the return type of get_coin are incompatible.

Catching the error

Clang

When compiling with Clang and any warning level you will get the following warning:

<source>:35:10: warning: comparison of constant 65535 with expression of type 'unsigned char' is always true [-Wtautological-constant-out-of-range-compare]
    if(c != COIN_ERR) {
       ~ ^  ~~~~~~~~
1 warning generated.

GCC

As mentioned, on GCC with the compiler flags -Wall, there are no errors in the code. Enabling other warnings with -Wextra and -pedantic fails to trigger a warning.

MSVC

The warning failed to trigger on MSVC with any warning level.

Catching early

If we go back and look at the library code that this came from, the definition of get_coin is as follows:

// coin.cpp
uint8_t get_coin(const std::string& coin)
{
    if(coin == "head") {
        return COIN_HEAD;
    } else if(coin == "tail") {
        return COIN_TAIL;
    }
 
    return COIN_ERR;
}

If you recompile the library, this code will produce a warning on GCC, Clang, and MSVC. The Clang version of this warning is:

<source>:29:12: warning: implicit conversion from 'int' to 'uint8_t' (aka 'unsigned char') changes value from 65535 to 255 [-Wconstant-conversion]
    return COIN_ERR;
    ~~~~~~ ^~~~~~~~
<source>:16:18: note: expanded from macro 'COIN_ERR'
#define COIN_ERR 0xFFFF

If the original library author had paid attention to the warnings, they could have correctly caught the incompatibility in the code and fixed it before it reached our final application. This shows one of the reasons to aim for warning free code across multiple compilers.

Comparing the errors

To view the errors from each compiler you can use godbolt to compare the output.

Thursday, 11 October 2018

Warnings Series - Sometimes Uninitialized

In this series of posts I'm discussing how compiling with alternative compilers can help to catch errors at development time. In the last post I discussed how to detect hidden overloads. In this post, I'm going to look at some recent errors I came across when getting a legacy library to compile under clang.

Background

The library is an old library, written in C++98 in the style of "C with classes" rather than idiomatic C++. When it was written, the author followed a single-return policy for functions, achieved by using a goto cleanup pattern. For a simple example see:

int func(int num)
{
    int ret = 0;
 
    if(num < 50)
        goto cleanup;
 
    ret = num;
 
cleanup:
    return ret;
}

The error handling was normally activated by the use of macros that would check a condition and, on failure, log a message, set the return code, and then goto cleanup. This programming style led to a lot of functions that have the following pattern:

  • Declare all variables at the start of the function.
  • Check inputs.
  • Function logic (with checks).
  • Cleanup.

Uninitialized Variables

All compilers have some form of uninitialized checking in place. For example GCC, Clang and MSVC will all warn about the uninitialized use of mid in the following function:

int test(int num)
{
    int mid;
    if(num < mid)
        return 100;
 
    return 0;
}

Maybe / Sometimes Uninitialized

However, with more complex use cases some compilers can miss the use of an uninitialized variable. Below is a simplified (and nonsensical) example of triggering such a use case:

// error handling macro
#define check(A, E)               \
    if ((A)) {                    \
        ret = E;                  \
        goto cleanup;             \
    }
 
int* get_mid()
{
    return new int(50);
}
 
int round_100(int num)
{
    int* mid;
    int ret = 0;
 
    check(num < 0, -1) // make sure input was valid
 
    mid = get_mid(); // get mid point
    check(!mid, -1) // check mid was allocated
 
    check(num < *mid, 0)
    ret = 100;
cleanup:
    delete mid;
    return ret;
}

As mentioned above, this uses a macro for error checking and the goto-cleanup pattern of error handling. If the number entered is less than 0, we return -1. If it is less than a mid point value (which we get as the pointer mid), we return 0. Otherwise we return 100.

Some basic unit tests for this might be created as:

void test_round()
{
    assert(round_100(0) == 0);
    assert(round_100(45) == 0);
    assert(round_100(55) == 100);
    assert(round_100(1000) == 100);
}

Using the above code on GCC (with -Wall), everything compiles without warning and all tests pass.

Problem with the code

There are a lot of problems with this code including lack of RAII, use of goto, etc. but for this post I'm only going to look at the code as it currently is without a full refactor.

On the happy path, the code above works fine and does what it is supposed to do. However, if the user enters a number less than 0, then we go to the cleanup path and attempt to delete the mid pointer. This results in us calling delete on uninitialized memory and leads to undefined behaviour.
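
As Clang's output below suggests, a minimal fix is to initialize the pointer at its declaration:

int* mid = nullptr; // delete on a null pointer is a no-op, so the cleanup path is safe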

Catching the error

Clang

If you compile the library code with Clang (again with -Wall) you get the following warning:

<source>:24:5: warning: variable 'mid' is used uninitialized whenever 'if' condition is true [-Wsometimes-uninitialized]
    check(num < 0, -1);
    ^~~~~~~~~~~~~~~~~~
<source>:14:9: note: expanded from macro 'check'
    if ((A)) {                    \
        ^~~
<source>:30:12: note: uninitialized use occurs here
    delete mid;
           ^~~
<source>:24:5: note: remove the 'if' if its condition is always false
    check(num < 0, -1);
    ^~~~~~~~~~~~~~~~~~
<source>:14:5: note: expanded from macro 'check'
    if ((A)) {                    \
    ^
<source>:21:13: note: initialize the variable 'mid' to silence this warning
    int* mid;
            ^
             = nullptr

This warning is long but tells us where the error happens, and even gives a helpful tip on how to fix it by initializing mid to nullptr.

GCC

As mentioned, on GCC with the compiler flags -Wall, there are no errors in the code. Enabling other warnings with -Wextra, -pedantic, and even -Wmaybe-uninitialized all fail to trigger the warnings.

MSVC

On MSVC compiling with the /W4 flag, the warning appears as below:

z:\tmp\example.cpp(30) : warning C4701: potentially uninitialized local variable 'mid' used
z:\tmp\example.cpp(30) : warning C4703: potentially uninitialized local pointer variable 'mid' used

While this isn't as helpful as the clang warning, it does tell us that there is a problem with the code.

Comparing the errors

To view the errors from each compiler you can use godbolt to compare the output.

Monday, 8 October 2018

Warnings Series - Hidden Overloads

In previous posts I've described how to build Clang on RHEL. This allows us to have access to an alternative compiler to help catch errors early. In this and some upcoming blog posts I'm going to go over some of the errors that using multiple compilers has helped me catch.

Hidden Overloads

Below is some example code which has multiple types of Dice. The base type is a standard 6 sided Dice, and inheritance is used to make a 20 sided Dice.

// library code
struct Dice {
    virtual ~Dice() = default;
    // Do a "random" roll of the dice
    virtual int roll() { return 6; }
    virtual int min() { return 1; }
    virtual int max() { return 6; }
};
 
struct TwentySidedDice : public Dice {
    virtual ~TwentySidedDice() = default;
    // Do multiple random rolls of the dice
    virtual int roll(int times) { return times * 3; }
    virtual int max() { return 20; }
};

As with all good code, we have some unit tests to make sure everything is working:

// unit tests
void test_dice() {
    Dice die;
    assert(die.roll() == 6);
    assert(die.min() == 1);
    assert(die.max() == 6);
}
 
void test_twenty_sided() {
    TwentySidedDice die;
    assert(die.roll(4) == 12);
    assert(die.min() == 1);
    assert(die.max() == 20);
}

The above compiles without any warnings on GCC (using -Wall), and as all unit tests pass, we feel we can ship the code as a library.

Problem with the code

If a client attempted to use the above code, they might try something like this:

// client code
int main()
{
    auto die = std::make_unique<TwentySidedDice>();
    std::cout << "min is " << die->min() << std::endl;
    std::cout << "max is " << die->max() << std::endl;
    std::cout << "roll is " << die->roll() << std::endl;
    return 0;
}

On first look this code seems reasonable and looks like it should work. However, you would see the following error:

<source>: In function 'int main()':
<source>:43:42: error: no matching function for call to 'TwentySidedDice::roll()'
     std::cout << "roll is " << die->roll() << std::endl; // no matching function error
                                          ^
<source>:16:17: note: candidate: 'virtual int TwentySidedDice::roll(int)'
     virtual int roll(int times) { return times * 3; }
                 ^~~~
<source>:16:17: note:   candidate expects 1 argument, 0 provided
Compiler returned: 1

The reason for the error is that based on the rules of C++ overload resolution the int roll(int i); function in TwentySidedDice is hiding the int roll(); function from Dice.
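
One common fix, sketched below, is to pull the hidden base-class overload back into scope with a using declaration:

struct TwentySidedDice : public Dice {
    virtual ~TwentySidedDice() = default;
    using Dice::roll; // make the zero-argument Dice::roll() visible alongside roll(int)
    virtual int roll(int times) { return times * 3; }
    virtual int max() { return 20; }
};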

Catching the error

Clang

If you compile the library code with Clang (again with -Wall) you get the following warning:

<source>:16:17: warning: 'TwentySidedDice::roll' hides overloaded virtual function [-Woverloaded-virtual]
    virtual int roll(int times) { return times * 3; }
                ^
<source>:9:17: note: hidden overloaded virtual function 'Dice::roll' declared here: different number of parameters (0 vs 1)
    virtual int roll() { return 6; }
                ^
1 warning generated.
Compiler returned: 0

This shows the error early and would let a developer know about the potential problem, so that it can be resolved before shipping to users.

GCC

As mentioned, on GCC with the compiler flags -Wall, there are no errors in the code. To enable this warning on GCC you can explicitly add the -Woverloaded-virtual flag.

MSVC

On MSVC, compiling with the /W4 flag doesn't show any warnings, as the C4264 warning is off by default. To see the error you can enable warning C4264, or you can see the following using /Wall:

<source>(16): warning C4263: 'int TwentySidedDice::roll(int)': member function does not override any base class virtual member function
<source>(18): warning C4264: 'int Dice::roll(void)': no override available for virtual member function from base 'Dice'; function is hidden
<source>(9): note: see declaration of 'Dice::roll'
<source>(7): note: see declaration of 'Dice'

Comparing the errors

To view the errors from each compiler you can use godbolt to compare the output.

Friday, 21 September 2018

Deploying to PyPi from travis-ci

When building a Python project, it's very common to want to deploy to PyPi. To save effort, you can have travis-ci automatically deploy for you.

Insecure Method

The most basic (and insecure) way to do this is to add a deploy step to your .travis.yml file, that specifies your PyPi username and password.

deploy:
  provider: pypi
  user: "Your username"
  password: "Your password"

This will always deploy the master branch of your project, but unfortunately it exposes your PyPi username and password.

Secure Method

To secure your password, you can use the travis command line client to generate an encrypted password and add it to your deploy configuration. To do this, navigate to your repository and run the following:

$ travis encrypt mypassword
Detected repository as me/myproject, is this correct? |yes| y
Please add the following to your .travis.yml file:
 
  secure: "encrypted password"
 
Pro Tip: You can add it automatically by running with --add.

Add the secure section under password in .travis.yml, or, as mentioned in the output, use the --add option to do this automatically.

deploy:
  provider: pypi
  user: "Your username"
  password:
    secure: "encrypted password"

Limit to only Tagged Releases

The above method will deploy your master branch to PyPi. However, it's likely you only want tagged releases to be uploaded. To limit this, you can use the on section to specify that you only want to release tags.

deploy:
  provider: pypi
  user: "Your username"
  password:
    secure: "encrypted password"
  on:
    tags: true

Using Test PyPi instance

The test PyPi instance is a playground for testing whether your deployment to PyPi would work. If you are testing the above deploys, you may want to first deploy to this test instance to confirm that everything works. To do this, you can use the server variable to specify that you want to use the test PyPi instance.

deploy:
  provider: pypi
  server: https://test.pypi.org/legacy/ # Remove for deployment to official PyPi repo
  user: "Your username"
  password:
    secure: "encrypted password"
  on:
    tags: true

This will send your python package to the test pypi server.

Note: Remember to change your encrypted password to the one you use for the test PyPi instance.

Note: This method can also be used for a self hosted PyPi instance.

Tuesday, 17 July 2018

Home Assistant Input Select via Google Home and IFTTT

I run home-assistant to give me central control of many of my smart home items. One of the integrations available via home-assistant is to expose your various devices to Google Assistant. Unfortunately, because of limitations with what Google will allow, this only works with a subset of integrations, including lights, switches and climate devices. As a result, if you want to use voice to control or query other devices, you need a different solution. In this post, I will explain how to make an input select device available via Google Home and IFTTT using an AppDaemon script.

Input Select

An input select allows a list of user defined values to be made available as a drop down list. In this example I will be using an input select that maps to the state of the dishes in my dishwasher. To create the input select, add the following to your configuration.yaml:

input_select:
  dishwasher:
    name: Dishwasher State
    options:
      - clean
      - dirty
    icon: mdi:cup-water

This lets me set if there are clean or dirty dishes in the dishwasher, and gives it a specific material design icon.

input select dishwasher

AppDaemon

AppDaemon allows you to write python scripts which can interact with home-assistant. The interactions can include:

  • Notifications about state changes
  • Notifications about events
  • Change State
  • Fire events
  • Call services on home-assistant

For this post you can add the following script to your appdaemon folder:

import appdaemon.plugins.hass.hassapi as hass
 
#
# App to take events from ifttt and change an input_select
#
#
# Args:
#
# input_select: Input select to trigger
# change_event: Name of the event that will trigger it
# say_event: Name of the event that will trigger it
# say_message: Message to say when an event was called
# media_player: The media player to use to post your message
#
 
 
class InputSelectChecker(hass.Hass):
 
  def initialize(self):
 
    if 'input_select' in self.args:
      self.input_select = self.args["input_select"]
    else:
      self.log("No input_select specified, doing nothing")
      return
 
    if 'change_event' in self.args:
      self.change_event = self.args["change_event"]
    else:
      self.change_event = None
      self.log("No change_event specified")
 
    if 'say_event' in self.args:
      self.say_event = self.args["say_event"]
    else:
      self.say_event = None
      self.log("No say_event specified")
 
    if self.change_event:
      self.change_handle = self.listen_event(self.change_event_cb, event = self.change_event)
      self.log("{} handler setup for {}".format(self.change_event, self.input_select))
 
    if self.say_event:
      self.say_handle = self.listen_event(self.say_event_cb, event = self.say_event)
      self.log("{} handler setup for {}".format(self.say_event, self.input_select))
 
    # use .get so these are optional when only one handler is configured
    self.message = self.args.get('say_message')
    self.media_player = self.args.get('media_player')
 
 
  def change_event_cb(self, event_name, data, kwargs):
    select_state = self.get_state(entity = self.input_select)
    self.log("change event select state = {}".format(select_state))
    self.log(self.get_state()[self.input_select]['attributes'])
    all_options = self.get_state()[self.input_select]['attributes']['options']
    self.log("{}".format(all_options))
    if 'option' in data:
      self.log("Change to {}".format(data))
      change_to = data['option'].lower()
      for o in all_options:
        if change_to == o.lower():
          self.select_option(self.input_select, o)
          return
    self.log("Invalid option {}".format(data))
 
  def say_event_cb(self, event_name, data, kwargs):
    select_state = self.get_state(entity = self.input_select)
    self.log("say select state for {} = {}".format(self.input_select, select_state))
    self.log(self.message)
    self.call_service("tts/google_say", entity_id = self.media_player,
        message = self.message.format(select_state=select_state))

This script will allow you to listen for configurable events and then either change the configured input select option, or say a TTS message over a configured media player. The message must be in the format optional start of message {select_state} optional rest of message, where {select_state} marks where the state of your input select will be inserted into the message.
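
As a plain Python illustration (this is not part of the app, just the substitution the script performs):

# How say_message is expanded before being passed to the TTS service
message = "The dishwasher has {select_state} dishes"
print(message.format(select_state="clean"))
# prints: The dishwasher has clean dishes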

Note: If you want to use a different TTS service, then you can rename it in the script or make it available as an argument.

To configure this add the following to your apps.yaml:

dishwasher_state:
  module: ifttt_input_select_handler
  class: InputSelectChecker
  input_select: input_select.dishwasher
  change_event: CHANGE_INPUT_SELECT_DISHWASHER
  say_event: SAY_INPUT_SELECT_DISHWASHER
  media_player: media_player.kitchen
  say_message: 'The dishwasher has {select_state} dishes'

If you only want to support one of "change status" or "say status", then only configure the event for that handler.

IFTTT

If this then that (IFTTT) allows you to receive an event from one service and then call another service. In this example I create two IFTTT applets. Each applet will receive an event from the Google Assistant service and then call a web-hook on home-assistant.

The first one to get the status is If you say "What's the status of the dishwasher?" Then "make a web request"

The second to change the status is If you say "Change the dishwasher to $" Then "make a web request"

Say status applet

Screenshots of creating the say status applet are below:

ifttt step 0

Step 0: Select this.

ifttt step 1

Step 1: Choose Say a simple phrase as the trigger.

ifttt step 2

Step 2: Choose your trigger phrases and what will be the response from Google.

ifttt step 3

Step 3: Choose that.

ifttt step 4

Step 4: Set up the web-hook to call your home assistant instance. This will trigger the event SAY_INPUT_SELECT_DISHWASHER, which is the event we configured when setting up our appdaemon app.

Change status applet

The change status applet is very similar to the above. The differences are:

  • In step 1, choose say a phrase with a text ingredient
  • In step 2, the trigger phrases must include a $ in the place where you want your custom input. For example Change the dishwasher to $, where $ should be clean or dirty.
  • In step 2, set the response to Setting the dishwasher to $
  • In step 4, the web-hook will be to https://[HA_URL]/api/events/CHANGE_INPUT_SELECT_DISHWASHER and the POST data will be {"option": "{{TextField}}"}.

These changes send the spoken text back to home-assistant, and if it matches one of your input_select options that option is selected.
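
You can sanity check this wiring without IFTTT by firing the same event directly at the home-assistant events API. A minimal sketch using the requests library is below; HA_URL and TOKEN are placeholders for your own instance, and older home-assistant versions may need the legacy API password header instead of a bearer token:

# Fire the CHANGE_INPUT_SELECT_DISHWASHER event directly against home-assistant,
# mimicking the IFTTT web-hook. HA_URL and TOKEN are placeholders.
import requests

HA_URL = "https://your-ha-instance:8123"
TOKEN = "your-long-lived-access-token"

response = requests.post(
    "{}/api/events/CHANGE_INPUT_SELECT_DISHWASHER".format(HA_URL),
    headers={"Authorization": "Bearer {}".format(TOKEN)},
    json={"option": "clean"},  # matched (case-insensitively) against the input_select options
)
print(response.status_code, response.text)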

Conversation

The conversations you can have with Google are below.

Change Status:

User: "Hey Google, Change the dishwasher to dirty"

Google: "Setting the dishwasher to dirty"

Check Status:

User: "Hey Google, what is the status of the dishwasher?"

Google: "Checking"

Google: "The dishwasher has dirty dishes"

Limitations

The limitations of this are:

  • You have to setup the IFTTT trigger and appdaemon app for each input select in home assistant.
  • When checking the status there is a two-stage response: Google first says "Checking", and then a couple of seconds later the status response is spoken.
  • You must manually set the speaker that you respond to.

Friday, 29 June 2018

pybind11 and python sub-modules

In my last post, I introduced pybind11 and some basic examples. In this post I want to show how to use python sub-modules with your exported bindings. In particular, I want to show how we structured our bindings in sub-modules when the C++ code was in different libraries in our main project.

Python sub-module

A python sub-module is accessible from python like:

>>> from ork import peon
>>> peon.work_work()
I'm not that kind of Orc

In this example, ork is the main module and peon is the sub-module. In pure Python these might have the following folder structure:

ork
    __init__.py
    peon
        __init__.py

C++ Layout

For our code, we had a project that has multiple C++ libraries and we wanted to expose bindings from each library as a different sub-module of ork in Python:

ork
    CMakeLists.txt
    peon
        CMakeLists.txt
        include/
        src/
    grunt
        CMakeLists.txt
        include/
        src/
    warlock
        CMakeLists.txt
        include/
        src/

So we wanted to export some functionality from ork.peon, ork.grunt, and ork.warlock to python.

Exporting the bindings

Single Shared Object

The easiest way to do this is to create a single shared object using pybind11. This will include all exported bindings for all the libraries you want to export.

A simplified example would be to add a new ork_bindings.cpp after your other libraries:

PYBIND11_MODULE(ork, mymodule) {
    mymodule.doc() = "Orks live here";
 
    py::module peon = mymodule.def_submodule("peon", "A peon is a submodule of 'ork'");
    peon.def("work_work", &Peon::work_work, "Do some work");
 
    py::module grunt = mymodule.def_submodule("grunt", "A grunt is a submodule of 'ork'");
    grunt.def("work_work", &Grunt::work_work, "Do some work");
 
    py::module warlock = mymodule.def_submodule("warlock", "A warlock is a submodule of 'ork'");
    warlock.def("work_work", &Warlock::work_work, "Do some work");
}

In your CMakeLists.txt you export the module as:

pybind11_add_module(ork ork_bindings.cpp)
target_link_libraries(ork
    PRIVATE
        peon
        grunt
        warlock
)

This works fine for a small number of libraries and exported functions. However, I didn't like this approach as it moves the export code away from the code being exported. That makes it easy to forget to add a new function to a binding and allows consistency issues to creep into the project.

Multiple Shared Objects

Our chosen approach was to build a separate bindings library for each C++ library to be exported as a sub-module, and then use a plain Python package as the main module.

To do this we added the following to our code:

ork/
    CMakeLists.txt
    peon/
        CMakeLists.txt
        include/
        src/
            bindings.cpp
    grunt
        CMakeLists.txt
        include/
        src/
            bindings.cpp
    warlock
        CMakeLists.txt
        include/
        src/
            bindings.cpp

An example bindings.cpp for peon is:

PYBIND11_MODULE(pypeon, m)
{
    // rename to a submodule
    m.attr("__name__") = "ork.pypeon";
    m.doc() = "A peon is a submodule of 'ork'";
    m.def("work_work", &Peon::work_work, "Do some work");
}

These bindings were added in each CMakeLists.txt as:

pybind11_add_module(pypeon src/bindings.cpp)
target_link_libraries(pypeon
    PRIVATE
        peon
)

When installing your library you then install as:

ork/
    __init__.py
    pypeon.[python_info].so
    pygrunt.[python_info].so
    pywarlock.[python_info].so

One of the main problems with this approach is the naming of the sub-modules. As the C++ libraries are called peon, grunt, and warlock, it is not possible to have another CMake target with the same name. Therefore, you have to have a slightly different name. In the above example, I have chosen to add py as a prefix for the sub-module names.
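
One way to soften the prefixed names is to alias the binding modules inside the package's __init__.py. This is only a sketch of the idea, assuming the installed layout above:

# ork/__init__.py
# Re-export the prefixed binding modules under the names users expect,
# so `from ork import peon` works despite the pypeon target name.
from . import pypeon as peon
from . import pygrunt as grunt
from . import pywarlock as warlock

__all__ = ["peon", "grunt", "warlock"]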

Conclusions

So far we have found our approach to work well, even with the downside of the prefixed sub-module names. It ensures the bindings code lives as close as possible to our C++ code, and we can conditionally choose which modules to export using CMake options.

Tuesday, 26 June 2018

Using C++ code from Python with pybind11

I have recently been working on a large C++ code base and needed to make certain parts of the code available in Python for use by our other teams. We looked at a number of different frameworks for this and eventually decided to go with pybind11. In this blog post I will describe the basic usage of pybind11 and how to call your exported code from python. For a full look at the code check out my example repository from here

Basic Example

The examples below can be considered part of a single library that is being made available. An example cpp file is available here

Method

Consider the following method that you want to make available to python

int add(int x, int y) {
    return x + y;
}

To make this available to python you add the following binding:

PYBIND11_MODULE(pybindings, mymodule) {
    using namespace pybind11::literals; // for _a literal to define arguments
    mymodule.doc() = "example module to export code";
    mymodule.def("add",
          &add,
          "Add 2 numbers together",
          "x"_a,
          "y"_a);
}

To explain the above a little bit we have

PYBIND11_MODULE(pybindings, mymodule) - This defines the module name pybindings, with the variable mymodule used to reference it.

mymodule.doc() - Creates a doc string for the module, which will be displayed when using the help function in python.

mymodule.def - Defines a function you want to export to python. The arguments are: the name of the function in python, the C++ function to export, the doc string for the function, and the named arguments of the function.

Once built, this function can be used as:

>>> import pybindings
>>> pybindings.add(1, 2)
3

Class

Consider the class:

class Adder {
public:
    Adder(int x) : x_(x) {};
    int add(int y) {
        return x_ + y;
    }
 
    void setAddition(int x) {
        x_ = x;
    }
    int getAddition() {
        return x_;
    }
private:
    int x_;
};

To export this, under your PYBIND11_MODULE add the following:

pybind11::class_<Adder>(mymodule, "Adder")
    .def(pybind11::init<int>())
    .def("add", &Adder::add)
    .def_property("addition", &Adder::getAddition, &Adder::setAddition);

This is similar to the method example above except we now define the class using pybind11::class_. The class type is passed as a template argument to this and functions are defined as part of the class instead of the module.

The constructor is defined by calling pybind11::init, and you can add access to class variables as properties using the def_property function.

Once built it can be run using:

>>> import pybindings
>>> a = pybindings.Adder(1)
>>> a.add(2)
3
>>> a.addition = 4
>>> a.add(2)
6

stl

pybind11 seamlessly supports using STL containers within python. To do this you include the pybind11/stl.h header to make the conversions available.

std::string join(std::vector<char> tojoin) {
    std::string ret;
    ret.reserve(tojoin.size());
    for(auto c: tojoin) {
        ret += c;
    }
    return ret;
}

To export add the function definition to your module:

mymodule.def("join",
      &join,
      "Join a list of strings",
      "tojoin"_a);

This is very similar to the add function from our first example and shows how pybind11 transparently converts STL containers.

To call this from python:

>>> import pybindings
>>> s = pybindings.join(['a', 'b', 'c'])
>>> print(s)
abc
>>> type(s)
<class 'str'>

Building the Example

Pybind11 has good support for CMake and can be easily integrated into a CMake project. To do this add the following to your CMakeLists.txt:

find_package(pybind11)
pybind11_add_module(pybindings bindings.cpp)

This will add a target to build a library pybindings.[python-information].so.

NOTE: The name of the library must match the name of the module you defined in your C++ code.

Using the binding

To use the bindings you add the location of the above shared object to your PYTHONPATH. Assuming you built your code in /data/code/build and have a python file /data/code/bindings.py, you can run:

PYTHONPATH=/data/code/build/ python3 /data/code/bindings.py

The source of bindings.py can be found here

Wednesday, 18 April 2018

Rate Limiting

At work I was recently tasked with integrating a rate limiting solution into our API. The purpose of the rate limiter was to cap the number of requests a user can make against an API over a specific period of time. This can help:

  • Prevent spam attacks.
  • Stop spikes in traffic from overloading our servers and degrading performance.
  • Help identify and stop misbehaving clients.

A user in the context of rate limiting is any entity that you wish to limit. Users can be identified by any unique identifier. Some examples include:
  • IP Address
  • User ID
  • Device Identifier
  • Customer Name (for B2B customers)
  • URL, e.g. only X POST requests to the /withdrawal URL
Or any combination of the above.

The main requirement for the solution was that our front end services run on multiple containers in a distributed manner, so the rate limiting must work across servers. This involves using an external key-value store (redis) to hold the rate limiting information. As a result, many of the details described below are directly related to how redis works; they can normally be adapted to other cache services as required.

For the remainder of this blog post I will discuss the various rate limiting algorithms I investigated, our chosen algorithm and some additional implementation details.

Rate Limiting Algorithms

Fixed Window Counters

Fixed window counters are the simplest method of rate limiting.  They work by maintaining a counter with the number of times an entity has hit the endpoint within a fixed window. If the user makes more than the allocated number of requests, then new requests are rejected. At the start of the new window the counter is reset to 0 and the user may make additional requests. A window is typically associated with a time and the counter is reset at the start of the period. Some examples include:
  • Day window: Reset at the same time every day, e.g. midnight
  • Hour window: Reset on the hour, e.g. 12:00:00, 13:00:00 ...
  • 15 minute window: Reset every 15 minutes e.g. 12:00:00, 12:15:00 ...
The key for the cache service would be {id}_{windowTimestamp}, where id could be the user id of the user and windowTimestamp would be the timestamp at the start of the window.

In the following example the user is allowed to make 3 requests in a 60 second window.

Time | Action | Counter Value | Additional
12:00:05 | user makes request | user1_1515120000: 0 → 1 | Key is user1_1515120000
12:00:15 | user makes request | user1_1515120000: 1 → 2 |
12:01:01 | user makes request | user1_1515120100: 0 → 1 | Counter is reset as the 60 second period is over. New key is user1_1515120100
12:01:10 | user makes request | user1_1515120100: 1 → 2 |
12:01:40 | user makes request | user1_1515120100: 2 → 3 |
12:01:50 | user makes request | user1_1515120100: 3 → 4 | User is rejected as they are over the limit
12:02:20 | user makes request | user1_1515120200: 0 → 1 | Allowed as the counter reset and the new key is user1_1515120200

Note: Old keys can be set to automatically expire a fixed period after they are done. This may require 2 calls to redis: an INCR followed by an EXPIRE command.
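
A minimal sketch of this using the redis-py client is below; the key format follows the {id}_{windowTimestamp} scheme above, while the function name and the two-window expiry are illustrative choices:

import time
import redis

r = redis.Redis()

def is_allowed(user_id, limit=3, window_seconds=60):
    # Key is {id}_{windowTimestamp}: one counter per user per fixed window
    window_start = int(time.time()) // window_seconds * window_seconds
    key = "{}_{}".format(user_id, window_start)
    count = r.incr(key)                    # atomic increment
    if count == 1:
        r.expire(key, window_seconds * 2)  # let old windows expire on their own
    return count <= limit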

Pros:

The advantages of this approach are:
  • Simple to implement
  • INCR is an atomic redis command.
  • Low memory requirement.

Cons:

  • It can allow a user to go above their allowed quota in a rolling 60 second period. For example, in the above limit of 3 requests per 60 seconds, if the user made 3 requests at 12:00:59 and a further 3 requests at 12:01:00, this would allow the user to make 6 requests in a 2 second period. 
  • It may also cause bursts of traffic across many clients. For example, if every client has used their quota for the previous minute they may retry until they are allowed. This could cause many users to hit the server in the first second of the new window.

Sliding Log

Sliding log rate limiting involves storing a history of requests for each user along with the associated time stamp. As new requests come in you count the number of requests for the period. Logs can be stored in a sorted set per user, where the member and score are both the time stamp of the request. Logs older than the allowed period are dropped.

To recreate the previous example where the user is allowed 3 requests per 60 second window:

Time | Action | Set Value | Additional
12:00:05 | user makes request | user1: {15151200050000} |
12:00:15 | user makes request | user1: {15151200050000, 15151200150000} |
12:01:01 | user makes request | user1: {15151200050000, 15151200150000, 15151201010000} |
12:01:10 | user makes request | user1: {15151200150000, 15151201010000, 15151201100000} | User is still under the threshold as the initial request (15151200050000) was made more than 60 seconds ago
12:01:40 | user makes request | user1: {15151201010000, 15151201100000, 15151201400000} |
12:01:50 | user makes request | user1: {15151201010000, 15151201100000, 15151201400000, 15151201500000} | User is rejected as they have made 4 requests in the last 60 seconds
12:02:20 | user makes request | user1: {15151201400000, 15151201500000, 15151202200000} | Request is allowed as the user is below the threshold
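
A sketch of this using a redis sorted set per user is below (assuming the redis-py 3 zadd signature); the pipeline wraps the commands in a MULTI so the update is atomic, as mentioned in the cons:

import time
import redis

r = redis.Redis()

def is_allowed(user_id, limit=3, window_seconds=60):
    now = time.time()
    key = "log_{}".format(user_id)
    pipe = r.pipeline()                                  # MULTI/EXEC keeps the update atomic
    pipe.zremrangebyscore(key, 0, now - window_seconds)  # drop requests outside the window
    pipe.zadd(key, {str(now): now})                      # log this request
    pipe.zcard(key)                                      # count requests in the window
    pipe.expire(key, window_seconds)
    _, _, count, _ = pipe.execute()
    return count <= limit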

Pros:

  • Allows high precision on the limits
  • Avoids bursts of data at the start of a window

Cons:

  • It requires a set item per request so has a large memory requirement.
  • Failed requests may still be added to the set. This could mean that even rate limited requests that perform no action could block a user.
  • May require multiple commands to perform the update and expire of old keys. 
    • This can be made atomic by using a redis MULTI command or lua script.

Sliding Window

In a sliding window rate limiter, each user is given a limit to be used within a time frame. This is broken up into smaller buckets that control the way the limit is consumed over a more evenly distributed window. As requests are used they are counted against the small windows, and as those small windows expire the tokens are re-added. You can add more precision by having multiple windows that work in parallel to even out the flow of traffic. The data would be stored in a hash per user to allow access to the individual buckets. The key for each bucket is the time stamp at the start of its window.

A simplified version allowing 3 requests per 60 seconds in 15 second buckets is:

Time | Action | Hash Value | Additional
12:00:05 | user makes request | user1: {1515120000: 1} |
12:00:15 | user makes request | user1: {1515120000: 1, 1515120015: 1} |
12:01:01 | user makes request | user1: {1515120015: 1, 1515120100: 1} | Bucket 1515120000 has expired so is deleted
12:01:10 | user makes request | user1: {1515120015: 1, 1515120100: 2} |
12:01:40 | user makes request | user1: {1515120100: 2, 1515120130: 1} | Bucket 1515120015 expires
12:01:50 | user makes request | user1: {1515120100: 2, 1515120130: 1, 1515120145: 1} | User is rejected as they have made 4 requests in the last 60 seconds
12:02:20 | user makes request | user1: {1515120130: 1, 1515120145: 1, 1515120215: 1} | Bucket 1515120100 expires. Request is allowed as the user is below the threshold
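
A rough sketch of the bucket bookkeeping with redis-py is below; as the cons note, the update is not atomic as written and would really want a lua script, so treat the key names and structure as illustrative:

import time
import redis

r = redis.Redis()

def is_allowed(user_id, limit=3, window_seconds=60, bucket_seconds=15):
    now = int(time.time())
    key = "buckets_{}".format(user_id)
    bucket = now // bucket_seconds * bucket_seconds
    oldest = now - window_seconds

    counts = r.hgetall(key)          # {bucket_timestamp: count}
    total = 0
    for ts, count in counts.items():
        if int(ts) <= oldest:
            r.hdel(key, ts)          # bucket has slid out of the window
        else:
            total += int(count)

    if total >= limit:
        return False
    r.hincrby(key, bucket, 1)        # record this request in the current bucket
    r.expire(key, window_seconds)
    return True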

Pros:

  • Allows varying degrees of precision depending on bucket time frames.
  • Lower memory constraints than sliding log.
  • Avoids bursts of data at the start of a new window.

Cons:

  • The update is not atomic so would require a redis lua script to control key counting / expiry.

Sliding Tail

Sliding tail rate limiting is an alternative implementation of the sliding window algorithm. It can also be thought of as fixed window rate limiting with history. In this case, we store a count for the current fixed window and for the previous window. When a request comes in we calculate usage as the count of requests in the current window plus a weighted count of the previous window. For example, if the current window is 25% of the way through its period, then we use 75% of the previous window's count plus the current window's count. This gives the current number of tokens used by the user.

The key format is the same as the fixed window example, {id}_{windowTimestamp}. A server can do an atomic INCR for the current window time stamp and a GET for the previous window. To improve efficiency, after the first request in a window the server could cache the previous window's count, as it will no longer change.

To redo the example of 3 requests per 60 seconds:

Time | Action | Counters Value | Additional
12:00:05 | user makes request | user1_1515120000: 0 → 1 | No previous window so only the current window is used
12:00:15 | user makes request | user1_1515120000: 1 → 2 |
12:01:01 | user makes request | user1_1515120000: 2, user1_1515120100: 0 → 1 | Weighted count is (2 * ((60-1)/60)) + 1 = 2.9
12:01:10 | user makes request | user1_1515120000: 2, user1_1515120100: 1 → 2 | Weighted count is (2 * ((60-10)/60)) + 2 = 3.6
12:01:40 | user makes request | user1_1515120000: 2, user1_1515120100: 2 → 3 | Weighted count is (2 * ((60-40)/60)) + 3 = 3.6
12:01:50 | user makes request | user1_1515120000: 2, user1_1515120100: 3 → 4 | Weighted count is (2 * ((60-50)/60)) + 4 = 4.3. User is above the threshold so rejected
12:02:20 | user makes request | user1_1515120100: 4, user1_1515120200: 0 → 1 | Updated to a new window so user1_1515120000 is dropped and we move to using the weighted count from user1_1515120100. Weighted count is (4 * ((60-20)/60)) + 1 = 3.6

In the above example you can see that the effect is the same as the fixed window for normal usage. However, if the user sent any more requests before 12:02:31, they would be rejected.
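
As this is the algorithm we chose, a sketch of the weighted count with redis-py is below; the keys follow the {id}_{windowTimestamp} format and the count is rounded down to match the worked example:

import time
import redis

r = redis.Redis()

def is_allowed(user_id, limit=3, window_seconds=60):
    now = int(time.time())
    window_start = now // window_seconds * window_seconds
    current_key = "{}_{}".format(user_id, window_start)
    previous_key = "{}_{}".format(user_id, window_start - window_seconds)

    current = r.incr(current_key)             # atomic INCR for this window
    if current == 1:
        r.expire(current_key, window_seconds * 2)  # keep it readable as "previous" next window
    previous = int(r.get(previous_key) or 0)  # 0 if there was no previous window

    # Weight the previous window by how much of it still overlaps a rolling
    # 60 second period, then round down (the cons note rounding is tunable)
    elapsed = now - window_start
    weighted = previous * (window_seconds - elapsed) / window_seconds + current
    return int(weighted) <= limit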

Pros:

  • Low memory usage
  • Increment is atomic, although there are 2 extra commands
    • EXPIRE of the current key
    • GET of the previous key
  • Prevents bursts of data at the start of a new window
  • Could be tuned to be more or less permissive based on weighting of the previous window
  • Could be tuned based on whether to round up or down

Cons:

  • Only an approximation of the last window as it assumes there was a constant rate of traffic.
  • New clients could send bursts of traffic at their first window
  • If using atomic increment rejected requests could still add to the count for the window

Leaky Bucket

The leaky bucket as a meter algorithm (related to token bucket) describes how an API can be limited to avoid burstiness. Each user has a count of tokens which is incremented as a request comes in. If the counter is above the threshold (the bucket size), then additional requests are discarded. To "empty" the bucket and give the user more tokens we decrement the counter by a set amount every period until it reaches zero.

An implementation could keep a hash which includes the token count and the time stamp of the last time the bucket was emptied. As each request comes in, it performs 2 operations:
  1. Decrement the token count based on the time elapsed since the last update
  2. Add a token
If the resulting count is below the bucket size, the request is allowed.

In the following example, we allow a bucket size of 3 and a refresh rate of 1 per 20 seconds:

Time | Action | Hash Value | Additional
12:00:05 | user makes request | user1: {tokens: 1, ts: 1515120000} | Add a token to the bucket
12:00:15 | user makes request | user1: {tokens: 2, ts: 1515120000} |
12:01:01 | user makes request | user1: {tokens: 1, ts: 1515120100} | The previous ts was 1515120000, which means we should decrement the token count by up to 3. Then add this token
12:01:10 | user makes request | user1: {tokens: 2, ts: 1515120100} |
12:01:40 | user makes request | user1: {tokens: 1, ts: 1515120140} | The previous ts was 1515120100, which means we should decrement the token count by up to 2. Then add this token
12:01:50 | user makes request | user1: {tokens: 2, ts: 1515120140} |
12:02:20 | user makes request | user1: {tokens: 1, ts: 1515120220} | The previous ts was 1515120140, which means we should decrement the token count by up to 2. Then add this token
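
A sketch of the bucket state handling is below; the read/modify/write is not atomic as written, so as the cons note a real implementation would wrap it in a redis lua script or lock:

import time
import redis

r = redis.Redis()

def is_allowed(user_id, bucket_size=3, leak_seconds=20):
    # NOTE: not atomic as written; production use needs a lua script or lock
    key = "bucket_{}".format(user_id)
    now = int(time.time())
    state = r.hgetall(key)
    tokens = int(state.get(b"tokens", 0))
    last = int(state.get(b"ts", now))

    leaked = (now - last) // leak_seconds  # whole tokens drained since last update
    if leaked:
        tokens = max(0, tokens - leaked)
        last += leaked * leak_seconds      # keep the fractional remainder

    allowed = tokens < bucket_size         # reject when the bucket is full
    if allowed:
        tokens += 1                        # add this request's token
    r.hset(key, "tokens", tokens)
    r.hset(key, "ts", last)
    return allowed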

Pros:

  • Memory efficient
  • Prevents bursts of traffic

Cons:

  • Requires a redis lock or lua script to update.
  • Harder to tune requirements and implement.

Implementation

For our purposes we chose the sliding tail algorithm, as it provides some protection against bursts of traffic while having low processing and memory usage. It is also easy to implement using standard redis commands that allow for atomic operations.

Additional Details

HTTP Responses

When rejecting a user who is over the limit, we chose a 429 Too Many Requests response.

In addition to the 429 rejection, all responses include HTTP headers to inform the user what their limit is and how many tokens are remaining. There appears to be no consensus on the HTTP headers to use for this information. From looking at the responses of other APIs we decided to go with the following (a sketch of attaching them to a response follows the list):
  • X-Rate-Limit-Limit - to describe the limit the user has for this request
  • X-Rate-Limit-Remaining - to describe the number of tokens left for the user.
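
As an illustration of attaching these headers, the Flask-style handler below hard-codes the values a real handler would fetch from the rate limiter; the framework and names are just examples, not our actual service:

from flask import Flask, jsonify

app = Flask(__name__)
LIMIT = 3

@app.route("/user")
def get_user():
    # A real handler would ask the rate limiter for these two values
    allowed, remaining = True, 2

    response = jsonify({"user": "user1"} if allowed else {"error": "rate limited"})
    response.status_code = 200 if allowed else 429  # 429 Too Many Requests
    response.headers["X-Rate-Limit-Limit"] = str(LIMIT)
    response.headers["X-Rate-Limit-Remaining"] = str(remaining)
    return response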

Tokens Used for each Request

As some operations are related but have different processing requirements, we decided to allow a configurable number of tokens per request type. This means that for certain endpoints we can allow the same bucket of tokens to rate limit the different requests. For example, a request to GET /user would use 1 token from the {id}:user:{timestamp} bucket, however a request to POST /user would remove 2 tokens from the same bucket.
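
A sketch of such a configuration is a simple mapping from request type to token cost; the names below are illustrative rather than our actual configuration:

# Token cost per (method, endpoint); both endpoints draw from the same user bucket
TOKEN_COSTS = {
    ("GET", "/user"): 1,
    ("POST", "/user"): 2,
}

def tokens_for(method, path):
    # Default to one token for anything not explicitly configured
    return TOKEN_COSTS.get((method, path), 1)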