In the previous chapters on tracing and interactive debugging, we have seen how to observe executions. By checking our observations against our expectations, we can find out when and how the program state is faulty. So far, we have assumed that this check would be done by humans – that is, us. However, having this check done by a computer, for instance as part of the execution, is far more rigorous and efficient. In this chapter, we introduce techniques to specify our expectations and to check them at runtime, enabling us to detect faults right as they occur.
Prerequisites
Tracers and Interactive Debuggers are very flexible tools that allow you to observe precisely what happens during a program execution. It is still you, however, who has to check program states and traces against your expectations. There is nothing wrong with that – except that checking hundreds of statements or variables can quickly become a pretty boring and tedious task.
Processing and checking large amounts of data is actually precisely what computers were invented for. Hence, we should aim to delegate such checking tasks to our computers as much as we can. This automates another essential part of debugging – maybe even the most essential part.
The standard tool for having the computer check specific conditions at runtime is called an assertion. An assertion takes the form
assert condition
and states that, at runtime, the computer should check that condition holds, i.e., evaluates to True. If the condition holds, then nothing happens:
assert True
If the condition evaluates to False, however, then the assertion fails, indicating an internal error.
with ExpectError():
    assert False
Traceback (most recent call last): File "/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_74444/2715578531.py", line 2, in <cell line: 1> assert False AssertionError (expected)
A common usage for assertions is for testing. For instance, we can test a square root function as
def test_square_root() -> None:
    assert square_root(4) == 2
    assert square_root(9) == 3
    ...
and test_square_root() will fail if square_root() returns a wrong value.
Assertions are available in most programming languages. You can even implement assertions yourself:
def my_own_assert(cond: bool) -> None:
    if not cond:
        raise AssertionError
... and get (almost) the same functionality:
with ExpectError():
    my_own_assert(2 + 2 == 5)
Traceback (most recent call last): File "/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_74444/1450148856.py", line 2, in <cell line: 1> my_own_assert(2 + 2 == 5) File "/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_74444/3374119957.py", line 3, in my_own_assert raise AssertionError AssertionError (expected)
In most languages, built-in assertions offer a bit more functionality than what can be obtained with self-defined functions. Most notably, built-in assertions frequently tell which condition failed (2 + 2 == 5) and where the assertion failed (line 2). C and C++, for instance, provide an assert() function that does all this:
# ignore
open('testassert.c', 'w').write(r'''
#include <stdio.h>
#include "assert.h"
int main(int argc, char *argv[]) {
assert(2 + 2 == 5);
printf("Foo\n");
}
''');
# ignore
from bookutils import print_content
print_content(open('testassert.c').read(), '.h')
#include <stdio.h>
#include "assert.h"

int main(int argc, char *argv[]) {
    assert(2 + 2 == 5);
    printf("Foo\n");
}
If we compile this program and execute it, the assertion (expectedly) fails:
!cc -g -o testassert testassert.c
!./testassert
Assertion failed: (2 + 2 == 5), function main, file testassert.c, line 6.
How would the C assert() function be able to report the condition and the current location? In fact, assert() is commonly implemented as a macro that, besides checking the condition, also turns it into a string for a potential error message. Additional macros such as __FILE__ and __LINE__ expand into the current file name and line number, which can then be used in the assertion error message.
A very simple definition of assert()
that provides the above diagnostics looks like this:
# ignore
open('assert.h', 'w').write(r'''
#include <stdio.h>
#include <stdlib.h>
#ifndef NDEBUG
#define assert(cond) \
if (!(cond)) { \
fprintf(stderr, "Assertion failed: %s, function %s, file %s, line %d", \
#cond, __func__, __FILE__, __LINE__); \
exit(1); \
}
#else
#define assert(cond) ((void) 0)
#endif
''');
print_content(open('assert.h').read(), '.h')
#include <stdio.h>
#include <stdlib.h>

#ifndef NDEBUG
#define assert(cond) \
    if (!(cond)) { \
        fprintf(stderr, "Assertion failed: %s, function %s, file %s, line %d", \
                #cond, __func__, __FILE__, __LINE__); \
        exit(1); \
    }
#else
#define assert(cond) ((void) 0)
#endif
(If you think that this is cryptic, you should have a look at an actual <assert.h>
header file.)
This header file reveals another important property of assertions – they can be turned off. In C and C++, defining the preprocessor variable NDEBUG
("no debug") turns off assertions, replacing them with a statement that does nothing. The NDEBUG
variable can be set during compilation:
!cc -DNDEBUG -g -o testassert testassert.c
And, as you can see, the assertion has no effect anymore:
!./testassert
Foo
In Python, assertions can also be turned off, by invoking the python
interpreter with the -O
("optimize") flag:
!python -c 'assert 2 + 2 == 5; print("Foo")'
Traceback (most recent call last): File "<string>", line 1, in <module> AssertionError
!python -O -c 'assert 2 + 2 == 5; print("Foo")'
Foo
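The effect of the -O flag can also be sketched within a single Python process. The built-in compile() function takes an optimize argument; with optimize=1, assert statements are removed from the compiled code, as they are under -O. The helper runs_to_end() and the variable reached_end are hypothetical names introduced for this illustration only.

```python
# Source with a failing assertion, followed by a marker assignment
SNIPPET = "assert 2 + 2 == 5\nreached_end = True"

def runs_to_end(source: str, optimize: int) -> bool:
    """Compile source at the given optimization level and report
    whether execution got past the assertion (hypothetical helper)."""
    code = compile(source, "<demo>", "exec", optimize=optimize)
    namespace: dict = {}
    try:
        exec(code, namespace)
    except AssertionError:
        return False
    return bool(namespace.get("reached_end", False))

checked = runs_to_end(SNIPPET, optimize=0)    # assertions active
stripped = runs_to_end(SNIPPET, optimize=1)   # assertions removed
print(checked, stripped)
```

Since assertions may be removed this way, their conditions should not have side effects that the rest of the program relies on.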
In comparison, which language wins in the amount of assertion diagnostics? Have a look at the information Python provides. Suppose we define fun() as
def fun() -> None:
    assert 2 + 2 == 5
If we invoke fun() and the assertion fails, which information do we get: the failing condition (2 + 2 == 5), the location of the assertion in the program, the list of callers, or all of the above?
Indeed, a failed assertion (like any exception in Python) provides us with lots of debugging information, even including the source code:
with ExpectError():
    fun()
Traceback (most recent call last): File "/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_74444/2842303881.py", line 2, in <cell line: 1> fun() File "/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_74444/3649916634.py", line 2, in fun assert 2 + 2 == 5 AssertionError (expected)
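Python's assert statement also accepts an optional second expression, which becomes the message of the raised AssertionError. The sketch below uses a hypothetical variant fun_with_message() (not part of the original example) to show how such a message can state the violated expectation in plain words.

```python
def fun_with_message() -> None:
    # The message expression is only evaluated if the condition is false
    assert 2 + 2 == 5, "expected 2 + 2 to equal 5"

def failure_message() -> str:
    """Invoke the function and return the message of the failed assertion."""
    try:
        fun_with_message()
    except AssertionError as exc:
        return str(exc)
    return ""

print(failure_message())
```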
Assertions show their true power when they are not used in a test, but used in a program instead, because that is when they can check not only one run, but actually all runs.
The classic example for the use of assertions is a square root program, implementing the function $\sqrt{x}$. (Let's assume for a moment that the environment does not already have one.)
We want to ensure that square_root()
is always called with correct arguments. For this purpose, we set up an assertion:
def square_root(x):  # type: ignore
    assert x >= 0
    ...  # compute square root in y
This assertion is called the precondition. A precondition is checked at the beginning of a function. It checks whether all the conditions for using the function are met.
So, if we call square_root() with a bad argument, we will get an exception. This holds for any call, ever.
with ExpectError():
    square_root(-1)
Traceback (most recent call last): File "/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_74444/3162937341.py", line 2, in <cell line: 1> square_root(-1) File "/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_74444/1081921329.py", line 2, in square_root assert x >= 0 AssertionError (expected)
For a dynamically typed language like Python, an assertion could actually also check that the argument has the correct type. For square_root()
, we could ensure that x
actually has a numeric type:
def square_root(x):  # type: ignore
    assert isinstance(x, (int, float))
    assert x >= 0
    ...  # compute square root in y
And while calls with the correct types just work...
square_root(4)
square_root(4.0)
... a call with an illegal type will raise a revealing diagnostic:
with ExpectError():
    square_root('4')
Traceback (most recent call last): File "/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_74444/2953341793.py", line 2, in <cell line: 1> square_root('4') File "/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_74444/1840548601.py", line 2, in square_root assert isinstance(x, (int, float)) AssertionError (expected)
If we did not check for the type of x, would the assertion x >= 0 still catch a bad call?
Fortunately (for us Python users), the assertion x >= 0
would already catch a number of invalid types, because (in contrast to, say, JavaScript), Python has no implicit conversion of strings or structures to integers:
with ExpectError():
    '4' >= 0  # type: ignore
Traceback (most recent call last): File "/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_74444/2976301596.py", line 2, in <cell line: 1> '4' >= 0 # type: ignore TypeError: '>=' not supported between instances of 'str' and 'int' (expected)
While a precondition ensures that the argument to a function is correct, a postcondition checks the result of this very function (assuming the precondition held in the first place). For our square_root()
function, we can check that the result $y = \sqrt{x}$ is correct by checking that $y^2 = x$ holds:
def square_root(x):  # type: ignore
    assert x >= 0
    ...  # compute square root in y
    assert y * y == x
In practice, we might encounter problems with this assertion. What might these be?
Why could the assertion fail despite square_root() being correct?
Technically speaking, there could be many things that also could cause the assertion to fail (cosmic radiation, operating system bugs, secret service bugs, anything) – but by far the most important reason is indeed rounding errors. Here's a simple example, using the Python built-in square root function:
import math
math.sqrt(2.0) * math.sqrt(2.0)
2.0000000000000004
math.sqrt(2.0) * math.sqrt(2.0) == 2.0
False
If you want to compare two floating-point values, you need to provide an epsilon value denoting the margin of error.
def square_root(x):  # type: ignore
    assert x >= 0
    ...  # compute square root in y
    epsilon = 0.000001
    assert abs(y * y - x) < epsilon
In Python, the function math.isclose(x, y)
also does the job, by default ensuring that the two values are the same within about 9 decimal digits:
math.isclose(math.sqrt(2.0) * math.sqrt(2.0), 2.0)
True
So let's use math.isclose()
for our revised postcondition:
def square_root(x):  # type: ignore
    assert x >= 0
    ...  # compute square root in y
    assert math.isclose(y * y, x)
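One detail worth knowing about math.isclose() is that its default tolerance is purely relative (rel_tol=1e-09, abs_tol=0.0), so a value very close to zero is not considered close to zero itself. The following sketch shows this; the helper is_square_root_of() and its tolerance values are assumptions chosen for illustration, not part of the chapter's code.

```python
import math

def is_square_root_of(y: float, x: float) -> bool:
    """Return True if y * y is close enough to x (hypothetical helper).
    The absolute tolerance is an assumed value that covers results near zero."""
    return math.isclose(y * y, x, rel_tol=1e-9, abs_tol=1e-12)

# The default relative tolerance handles the rounding error from above...
print(math.isclose(math.sqrt(2.0) * math.sqrt(2.0), 2.0))
# ...but with abs_tol left at 0.0, nothing except zero is "close" to zero
print(math.isclose(1e-12, 0.0))
print(math.isclose(1e-12, 0.0, abs_tol=1e-9))
```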
Let us try out this postcondition by using an actual implementation. The Newton–Raphson method is an efficient way to compute square roots:
def square_root(x):  # type: ignore
    assert x >= 0  # precondition
    approx = None
    guess = x / 2
    while approx != guess:
        approx = guess
        guess = (approx + x / approx) / 2
    assert math.isclose(approx * approx, x)
    return approx
Apparently, this implementation does the job:
square_root(4.0)
2.0
However, it is not just this call that produces the correct result – all calls will produce the correct result. (If the postcondition assertion does not fail, that is.) So, a call like
square_root(12345.0)
111.1080555135405
does not require us to manually check the result – the postcondition assertion already has done that for us, and will continue to do so forever.
Having assertions right in the code gives us an easy means to test it – if we can feed sufficiently many inputs into the code without the postcondition ever failing, we can increase our confidence. Let us try this out with our square_root()
function:
for x in range(1, 10000):
    y = square_root(x)
Note again that we do not have to check the value of y
– the square_root()
postcondition already did that for us.
Instead of enumerating input values, we could also use random (non-negative) numbers; even totally random numbers could work if we filter out those tests where the precondition already fails. If you are interested in such test generation techniques, the Fuzzing Book is a great reference for you.
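The idea of random inputs filtered by the precondition can be sketched as follows. To keep the example self-contained, it uses its own variant newton_square_root(); the special case for zero and the bounded loop are assumptions of this sketch and differ from the implementation above. The name random_test() and the input range are likewise made up for illustration.

```python
import math
import random

def newton_square_root(x: float) -> float:
    """Newton-Raphson square root with pre- and postcondition (sketch)."""
    assert x >= 0  # precondition
    if x == 0:
        return 0.0  # assumption: treat zero separately to avoid dividing by zero
    guess = x / 2
    for _ in range(200):  # assumption: bounded loop instead of looping until equality
        approx = guess
        guess = (approx + x / approx) / 2
        if guess == approx:
            break
    assert math.isclose(guess * guess, x)  # postcondition
    return guess

def random_test(trials: int, seed: int = 0) -> int:
    """Feed random numbers into newton_square_root(), skipping inputs that
    violate the precondition. Return the number of calls actually made."""
    rng = random.Random(seed)
    executed = 0
    for _ in range(trials):
        x = rng.uniform(-100.0, 100.0)
        if x < 0:
            continue  # precondition would fail; not a meaningful test
        newton_square_root(x)  # the postcondition checks the result for us
        executed += 1
    return executed

print(random_test(1000))
```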
Modern program verification tools can even prove that your program will always meet its assertions. But for all this, you need to have explicit and formal assertions in the first place.
For those interested in testing and verification, here is a quiz for you:
Is there a value for x that satisfies the precondition, but fails the postcondition?
This is indeed something a test generator or program verifier might be able to find with zero effort.
In the case of square_root()
, our postcondition is total – if it passes, then the result is correct (within the epsilon
boundaries, that is). In practice, however, it is not always easy to provide such a total check. As an example, consider our remove_html_markup()
function from the Introduction to Debugging:
def remove_html_markup(s):  # type: ignore
    tag = False
    quote = False
    out = ""
    for c in s:
        if c == '<' and not quote:
            tag = True
        elif c == '>' and not quote:
            tag = False
        elif c == '"' or c == "'" and tag:
            quote = not quote
        elif not tag:
            out = out + c
    return out
remove_html_markup("I am a text with <strong>HTML markup</strong>")
'I am a text with HTML markup'
The precondition for remove_html_markup()
is trivial – it accepts any string. (Strictly speaking, a precondition assert isinstance(s, str)
could prevent it from being called with some other collection such as a list.)
The challenge, however, is the postcondition. How do we check that remove_html_markup()
produces the correct result?
We could check it against some other implementation that removes HTML markup – but if we already do have such a "golden" implementation, why bother implementing it again?
After a change, we could also check it against some earlier version to prevent regression – that is, losing functionality that was there before. But how would we know the earlier version was correct? (And if it was, why change it?)
If we do not aim for ensuring full correctness, our postcondition can also check for partial properties. For instance, a postcondition for remove_html_markup()
may simply ensure that the result no longer contains any markup:
def remove_html_markup(s):  # type: ignore
    tag = False
    quote = False
    out = ""
    for c in s:
        if c == '<' and not quote:
            tag = True
        elif c == '>' and not quote:
            tag = False
        elif c == '"' or c == "'" and tag:
            quote = not quote
        elif not tag:
            out = out + c
    # postcondition
    assert '<' not in out and '>' not in out
    return out
Besides doing a good job at checking results, the postcondition also does a good job in documenting what remove_html_markup()
actually does.
Which of these inputs causes the assertion to fail: <foo>bar</foo>, "foo", >foo<, or "x > y"?
Indeed. Our (partial) assertion does not detect this error:
remove_html_markup('"foo"')
'foo'
But it detects this one:
with ExpectError():
    remove_html_markup('"x > y"')
Traceback (most recent call last): File "/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_74444/1913183346.py", line 2, in <cell line: 1> remove_html_markup('"x > y"') File "/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_74444/2717035104.py", line 17, in remove_html_markup assert '<' not in out and '>' not in out AssertionError (expected)
In contrast to "standard" documentation such as
square_root() expects a non-negative number x; its result is $\sqrt{x}$.
assertions have a big advantage: They are formal – and thus have unambiguous semantics. Notably, we can understand what a function does solely by reading its pre- and postconditions. Here is an example:
def some_obscure_function(x: int, y: int, z: int) -> int:
    result = int(...)  # type: ignore
    assert x == y == z or result > min(x, y, z)
    assert x == y == z or result < max(x, y, z)
    return result
What does this function do: return the minimum, the middle, or the maximum value out of x, y, and z?
Indeed, this would be a useful (and bug-revealing!) postcondition for one of our showcase functions in the chapter on statistical debugging.
The final benefit of assertions, and possibly even the most important in the context of this book, is how much assertions help with locating defects. Indeed, with proper assertions, it is almost trivial to locate the one function that is responsible for a failure.
Consider the following situation. Assume I have a function f() whose precondition is satisfied, calling a function g() whose precondition is violated and which raises an exception. Which function is faulty here: g() because it raises an exception, f() because it violates the precondition of g(), both f() and g() because they are incompatible, or none of the above?
The rule is very simple: If some function func() is called with its preconditions satisfied, and the postcondition of func() fails, then the fault in the program state must have originated at some point between these two events. Assuming that all functions called by func() are also correct (because their postconditions held), the defect can only be in the code of func().
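What pre- and postconditions imply is actually often called a contract between caller and callee: the caller promises to satisfy the precondition of the callee, whereas the callee promises to satisfy its own postcondition, delivering a correct result.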
In the above setting, f()
is the caller, and g()
is the callee; but as f()
violates the precondition of g()
, it has not kept its promises. Hence, f()
violates the contract and is at fault. f()
thus needs to be fixed.
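The situation can be sketched in a few lines. The bodies of f() and g() below are invented for illustration, as is the helper functions_on_failure(), which lists the functions on the call stack at the moment the assertion fails. The precondition fails inside g(), but the stack also names the caller f() that broke the contract.

```python
import traceback
from typing import Callable, List

def g(x: float) -> float:
    assert x >= 0, "precondition of g() violated"
    return x ** 0.5

def f(x: float) -> float:
    # Hypothetical defect: f() accepts any number, but does not
    # ensure that the argument it passes to g() is non-negative
    return g(x - 10)

def functions_on_failure(call: Callable[[], object]) -> List[str]:
    """Run call() and return the function names on the stack of a failed assertion."""
    try:
        call()
    except AssertionError as exc:
        return [frame.name for frame in traceback.extract_tb(exc.__traceback__)]
    return []

print(functions_on_failure(lambda: f(1)))
```

The last entry is where the check failed; the entry before it is the caller that has to be fixed.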
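Let us get back to debugging. In debugging, assertions serve two purposes: they help locate defects by detecting faults right as they occur, and they help rule out those parts of the code and state whose checks pass.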
This latter part is particularly interesting, as it allows us to focus our search on the lesser checked aspects of code and state.
When we say "code and state", what do we mean? Actually, assertions can not only quickly check several executions of a function, but also large amounts of data, detecting faults in data at the moment they are introduced.
Let us illustrate this by an example. Let's assume we want a Time
class that represents the time of day. Its constructor takes the current time using hours, minutes, and seconds. (Note that this is a deliberately simple example – real-world classes for representing time are way more complex.)
class Time:
    def __init__(self, hours: int = 0, minutes: int = 0, seconds: int = 0) -> None:
        self._hours = hours
        self._minutes = minutes
        self._seconds = seconds
To access the individual elements, we introduce a few getters:
class Time(Time):
    def hours(self) -> int:
        return self._hours
    def minutes(self) -> int:
        return self._minutes
    def seconds(self) -> int:
        return self._seconds
We allow printing out time, using the ISO 8601 format:
class Time(Time):
    def __repr__(self) -> str:
        return f"{self.hours():02}:{self.minutes():02}:{self.seconds():02}"
Three minutes to midnight can thus be represented as
t = Time(23, 57, 0)
t
23:57:00
Unfortunately, there's nothing in our Time
class that prevents blatant misuse. We can easily set up a time with negative numbers, for instance:
t = Time(-1, 0, 0)
t
-1:00:00
Such a thing may have some semantics (relative time, maybe?), but it's not exactly conforming to ISO format.
Even worse, we can construct a Time object with a string in place of a number:
t = Time("High noon")  # type: ignore
The string will be included verbatim the moment we try to print it:
with ExpectError():  # This fails in Python 3.9
    print(t)
High noon:00:00
For now, everything will be fine - but what happens when some other program tries to parse this time? Or processes a log file with a timestamp like this?
In fact, what we have here is a time bomb – a fault in the program state that can sleep for ages until someone steps on it. These are hard to debug, because one has to figure out when the time bomb was set – which can be thousands or millions of lines earlier in the program. In the absence of type checking, any assignment to a Time object could be the culprit – so good luck with the search.
This is again where assertions save the day. What you need is an assertion that checks whether the data is correct. For instance, we could revise our constructor such that it checks for correct arguments:
class Time(Time):
    def __init__(self, hours: int = 0, minutes: int = 0, seconds: int = 0) -> None:
        assert 0 <= hours <= 23
        assert 0 <= minutes <= 59
        assert 0 <= seconds <= 60  # Includes leap seconds (ISO8601)
        self._hours = hours
        self._minutes = minutes
        self._seconds = seconds
These conditions check whether hours, minutes, and seconds are within the right range. They are called data invariants (or invariants for short) because they hold for the given data (notably, the internal attributes) at all times.
Note the unusual syntax for range checks (chained comparisons are a Python specialty), and the fact that seconds can range from 0 to 60. That's because there are not only leap years, but also leap seconds.
With this revised constructor, we now get errors as soon as we pass an invalid parameter:
with ExpectError():
    t = Time(-23, 0, 0)
Traceback (most recent call last): File "/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_74444/283236387.py", line 2, in <cell line: 1> t = Time(-23, 0, 0) File "/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_74444/559831374.py", line 3, in __init__ assert 0 <= hours <= 23 AssertionError (expected)
Hence, any attempt to set any Time object to an illegal state will be immediately detected. In other words, the time bomb defuses itself at the moment it is being set.
This means that when we are debugging, and search for potential faults in the state that could have caused the current failure, we can now rule out Time
as a culprit, allowing us to focus on other parts of the state.
The more of the state we have checked with invariants, the less of it we have to examine when searching for a fault.
For invariants to be effective, they have to be checked at all times. If we introduce a method that changes the state, then this method will also have to ensure that the invariant is satisfied:
class Time(Time):
    def set_hours(self, hours: int) -> None:
        assert 0 <= hours <= 23
        self._hours = hours
This also implies that state changes should go through methods, not direct accesses to attributes. If some code changes the attributes of your object directly, without going through the method that could check for consistency, then it will be much harder for you to a) detect the source of the problem and b) even detect that a problem exists.
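One way to route attribute assignments through a checking method in Python is a property. The class GuardedTime below is a hypothetical, reduced variant of Time, written only to sketch the idea: every assignment to hours, including the one in the constructor, passes through the setter and its range check.

```python
class GuardedTime:
    """Hypothetical Time variant whose hours attribute is guarded by a property."""

    def __init__(self, hours: int = 0) -> None:
        self.hours = hours  # goes through the setter below

    @property
    def hours(self) -> int:
        return self._hours

    @hours.setter
    def hours(self, value: int) -> None:
        # Invariant check on every assignment
        assert 0 <= value <= 23, "hours out of range"
        self._hours = value

t = GuardedTime(12)
t.hours = 13        # passes the check
print(t.hours)
```

Code that writes to the underlying _hours attribute directly would still bypass the check, so this is a convention backed by a check rather than a hard guarantee.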
If we have several methods that can alter an object, it can be helpful to factor out invariant checking into its own method. Such a method can also be called to check for inconsistencies that might have been introduced without going through one of the methods – e.g. by direct object access, memory manipulation, or memory corruption.
By convention, methods that check invariants have the name repOK()
, since they check whether the internal representation is okay, and return True if so.
Here's a repOK()
method for Time
:
class Time(Time):
    def repOK(self) -> bool:
        assert 0 <= self.hours() <= 23
        assert 0 <= self.minutes() <= 59
        assert 0 <= self.seconds() <= 60
        return True
We can integrate this method right into our constructor and our setter:
class Time(Time):
    def __init__(self, hours: int = 0, minutes: int = 0, seconds: int = 0) -> None:
        self._hours = hours
        self._minutes = minutes
        self._seconds = seconds
        assert self.repOK()
    def set_hours(self, hours: int) -> None:
        self._hours = hours
        assert self.repOK()
with ExpectError():
    t = Time(-23, 0, 0)
Traceback (most recent call last): File "/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_74444/283236387.py", line 2, in <cell line: 1> t = Time(-23, 0, 0) File "/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_74444/346828762.py", line 6, in __init__ assert self.repOK() File "/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_74444/478898990.py", line 3, in repOK assert 0 <= self.hours() <= 23 AssertionError (expected)
Having a single method that checks everything can be beneficial, as it may explicitly check for more faulty states. For instance, it is still permissible to pass a floating-point number for hours and minutes, again breaking the Time representation:
Time(1.5) # type: ignore
1.5:00:00
(Strictly speaking, ISO 8601 does allow fractional parts for seconds and even for hours and minutes – but still wants two leading digits before the fraction separator. Plus, the comma is the "preferred" fraction separator. In short, you won't be making too many friends using times formatted like the one above.)
We can extend our repOK()
method to check for correct types, too.
class Time(Time):
    def repOK(self) -> bool:
        assert isinstance(self.hours(), int)
        assert isinstance(self.minutes(), int)
        assert isinstance(self.seconds(), int)
        assert 0 <= self.hours() <= 23
        assert 0 <= self.minutes() <= 59
        assert 0 <= self.seconds() <= 60
        return True
Time(14, 0, 0)
14:00:00
This now also catches other type errors:
with ExpectError():
    t = Time("After midnight")  # type: ignore
Traceback (most recent call last): File "/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_74444/119323666.py", line 2, in <cell line: 1> t = Time("After midnight") # type: ignore File "/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_74444/346828762.py", line 6, in __init__ assert self.repOK() File "/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_74444/1587012707.py", line 3, in repOK assert isinstance(self.hours(), int) AssertionError (expected)
Our repOK()
method can also be used in combination with pre- and postconditions. Typically, you'd like to make it part of the pre- and postcondition checks.
Assume you want to implement an advance()
method that adds a number of seconds to the current time. The preconditions and postconditions can be easily defined:
class Time(Time):
    def seconds_since_midnight(self) -> int:
        return self.hours() * 3600 + self.minutes() * 60 + self.seconds()
    def advance(self, seconds_offset: int) -> None:
        old_seconds = self.seconds_since_midnight()
        ...  # Advance the clock
        assert (self.seconds_since_midnight() ==
                (old_seconds + seconds_offset) % (24 * 60 * 60))
But you'd really like advance()
to check the state before and after its execution – again using repOK()
:
class BetterTime(Time):
    def advance(self, seconds_offset: int) -> None:
        assert self.repOK()
        old_seconds = self.seconds_since_midnight()
        ...  # Advance the clock
        assert (self.seconds_since_midnight() ==
                (old_seconds + seconds_offset) % (24 * 60 * 60))
        assert self.repOK()
The first postcondition ensures that advance()
produces the desired result; the second one ensures that the internal state is still okay.
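For illustration, here is one way the elided "advance the clock" step might be filled in. The class SketchTime is a self-contained stand-in for Time, and the divmod-based update is an assumption of this sketch, not the chapter's implementation.

```python
class SketchTime:
    """Self-contained stand-in for Time with a runnable advance() (sketch)."""

    def __init__(self, hours: int = 0, minutes: int = 0, seconds: int = 0) -> None:
        self._hours, self._minutes, self._seconds = hours, minutes, seconds
        assert self.repOK()

    def repOK(self) -> bool:
        for value in (self._hours, self._minutes, self._seconds):
            assert isinstance(value, int)
        assert 0 <= self._hours <= 23
        assert 0 <= self._minutes <= 59
        assert 0 <= self._seconds <= 60
        return True

    def seconds_since_midnight(self) -> int:
        return self._hours * 3600 + self._minutes * 60 + self._seconds

    def advance(self, seconds_offset: int) -> None:
        assert self.repOK()                      # state is okay before
        old_seconds = self.seconds_since_midnight()

        # Assumed implementation: recompute all fields from the new total
        total = (old_seconds + seconds_offset) % (24 * 60 * 60)
        self._hours, rest = divmod(total, 3600)
        self._minutes, self._seconds = divmod(rest, 60)

        assert (self.seconds_since_midnight() ==
                (old_seconds + seconds_offset) % (24 * 60 * 60))
        assert self.repOK()                      # state is okay after

t = SketchTime(23, 59, 30)
t.advance(45)
print(t._hours, t._minutes, t._seconds)
```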
Invariants are especially useful if you have a large, complex data structure which is very hard to track in a conventional debugger.
Let's assume you have a red-black search tree for storing and searching data. Red-black trees are among the most efficient data structures for representing associative arrays (also known as mappings); they are self-balancing and guarantee search, insertion, and deletion in logarithmic time. They also are among the ugliest to debug.
What is a red-black tree? Here is an example from Wikipedia:
As you can see, there are red nodes and black nodes (giving the tree its name). We can define a class RedBlackTree and implement all the necessary operations.
class RedBlackTree:
    RED = 'red'
    BLACK = 'black'
    ...
# ignore
# A few dummy methods to make the static type checker happy. Ignore.
class RedBlackNode:
    def __init__(self) -> None:
        self.parent = None
        self.color = RedBlackTree.BLACK
        pass
# ignore
# More dummy methods to make the static type checker happy. Ignore.
class RedBlackTree(RedBlackTree):
    def redNodesHaveOnlyBlackChildren(self) -> bool:
        return True
    def equalNumberOfBlackNodesOnSubtrees(self) -> bool:
        return True
    def treeIsAcyclic(self) -> bool:
        return True
    def parentsAreConsistent(self) -> bool:
        return True
    def __init__(self) -> None:
        self._root = RedBlackNode()
        self._root.parent = None
        self._root.color = self.BLACK
However, before we start coding, it would be a good idea to first reason about the invariants of a red-black tree. Indeed, a red-black tree has a number of important properties that hold at all times – for instance, that the root node be black or that the tree be balanced. When we implement a red-black tree, these invariants can be encoded into a repOK()
method:
class RedBlackTree(RedBlackTree):
    def repOK(self) -> bool:
        assert self.rootHasNoParent()
        assert self.rootIsBlack()
        assert self.redNodesHaveOnlyBlackChildren()
        assert self.equalNumberOfBlackNodesOnSubtrees()
        assert self.treeIsAcyclic()
        assert self.parentsAreConsistent()
        return True
Each of these helper methods is a checker in its own right:
class RedBlackTree(RedBlackTree):
    def rootHasNoParent(self) -> bool:
        return self._root.parent is None
    def rootIsBlack(self) -> bool:
        return self._root.color == self.BLACK
    ...
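To give an impression of what two of the elided checkers might look like, here is a sketch on a minimal, hypothetical Node class with left and right children. It is not the RedBlackTree class from above; the function names and the convention that empty subtrees count as black are assumptions of this sketch.

```python
from typing import Optional

RED, BLACK = "red", "black"

class Node:
    """Minimal hypothetical tree node, for illustration only."""
    def __init__(self, key: int, color: str,
                 left: Optional["Node"] = None,
                 right: Optional["Node"] = None) -> None:
        self.key, self.color, self.left, self.right = key, color, left, right

def red_nodes_have_only_black_children(node: Optional[Node]) -> bool:
    if node is None:
        return True
    if node.color == RED:
        for child in (node.left, node.right):
            if child is not None and child.color == RED:
                return False
    return (red_nodes_have_only_black_children(node.left) and
            red_nodes_have_only_black_children(node.right))

def black_height(node: Optional[Node]) -> Optional[int]:
    """Return the number of black nodes on every path from node down to an
    empty subtree, or None if two paths disagree."""
    if node is None:
        return 1  # empty subtrees count as black
    left = black_height(node.left)
    right = black_height(node.right)
    if left is None or right is None or left != right:
        return None
    return left + (1 if node.color == BLACK else 0)

valid = Node(2, BLACK, Node(1, RED), Node(3, RED))
print(red_nodes_have_only_black_children(valid), black_height(valid))
```

Both checkers visit every node, which is why calling them on each change costs linear time.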
With all these helpers, our repOK()
method will become very rigorous – but all this rigor is very much needed. Just for fun, check out the description of red-black trees on Wikipedia. The description of how insertion or deletion work is 4 to 5 pages long (each!), with dozens of special cases that all have to be handled properly. If you ever face the task of implementing such a data structure, be sure to (1) write a repOK()
method such as the above, and (2) call it before and after each method that alters the tree:
# ignore
from typing import Any, List
class RedBlackTree(RedBlackTree):
    def insert(self, item: Any) -> None:
        assert self.repOK()
        ...  # four pages of code
        assert self.repOK()
    def delete(self, item: Any) -> None:
        assert self.repOK()
        ...  # five pages of code
        assert self.repOK()
Such checks will make your tree run much slower – essentially, instead of logarithmic time complexity, we now have linear time complexity, as the entire tree is traversed with each change – but you will find any bugs much, much faster. Once your tree goes into production, you can deactivate repOK()
by default, using some debugging switch to turn it on again should the need ever arise:
class RedBlackTree(RedBlackTree):
    def __init__(self, checkRepOK: bool = False) -> None:
        ...
        self.checkRepOK = checkRepOK
    def repOK(self) -> bool:
        if not self.checkRepOK:
            return True
        assert self.rootHasNoParent()
        assert self.rootIsBlack()
        ...
        return True
Just don't delete it – future maintainers of your code will be forever grateful that you have documented your assumptions and given them a means to quickly check their code.
When interacting with the operating system, there are a number of rules that programs must follow, lest they get themselves (or the system) in some state where they cannot execute properly anymore.
One area in which it is particularly easy to make mistakes is memory usage. In Python, memory is maintained by the Python interpreter, and all memory accesses are checked at runtime. Accessing a non-existing element of a string, for instance, will raise an IndexError exception:
with ExpectError():
    index = 10
    "foo"[index]
Traceback (most recent call last): File "/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_74444/295675708.py", line 3, in <cell line: 1> "foo"[index] IndexError: string index out of range (expected)
The very same expression in a C program, though, will yield undefined behavior – which means that anything can happen. Let us explore a couple of C programs with undefined behavior.
We start with a simple C program which uses the same invalid index as our Python expression, above. What does this program do?
open('testoverflow.c', 'w').write(r'''
#include <stdio.h>
// Access memory out of bounds
int main(int argc, char *argv[]) {
int index = 10;
return "foo"[index]; // BOOM
}
''');
print_content(open('testoverflow.c').read())
#include <stdio.h>
// Access memory out of bounds
int main(int argc, char *argv[]) {
    int index = 10;
    return "foo"[index];  // BOOM
}
In our example, the program will read from a random chunk of memory, which may exist or not. In most cases, nothing at all will happen – which is a bad thing, because you won't realize that your program has a defect.
!cc -g -o testoverflow testoverflow.c
!./testoverflow
To see what is going on behind the scenes, let us have a look at the C memory model.
In C, memory comes as a single block of bytes at contiguous addresses. Let us assume we have a memory of only 20 bytes (duh!) and the string "foo" is stored at address 5:
mem_with_table: Memory = Memory(20)
mem_with_table[5] = 'f'
mem_with_table[6] = 'o'
mem_with_table[7] = 'o'
mem_with_table
Address | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Content | | | | | | 'f' | 'o' | 'o' | | | | | | | | | | | | |
When we try to access "foo"[10], we try to read the memory location at address 15 – which may exist (or not), and which may have arbitrary contents, based on whatever previous instructions left there. From there on, the behavior of our program is undefined.
Such buffer overflows can also come as writes into memory locations – and thus overwrite the item that happens to be at the location of interest. If the item at address 15 happens to be, say, a flag controlling administrator access, then setting it to a non-zero value can come in handy for an attacker.
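To make the overwrite scenario concrete, here is a small simulation in Python. It is only a sketch under stated assumptions: the flat memory list, the buffer layout, and the IS_ADMIN address are hypothetical and chosen for illustration; real C memory layouts depend on the compiler and platform.

```python
from typing import List

# Simulated flat memory: a 4-cell buffer directly followed by a flag
memory: List[int] = [0] * 8
BUFFER_START = 0
BUFFER_SIZE = 4
IS_ADMIN = 4  # address of the (hypothetical) administrator flag


def unchecked_write(index: int, value: int) -> None:
    """Write the way C does: no bounds check at all."""
    memory[BUFFER_START + index] = value


def checked_write(index: int, value: int) -> None:
    """Write with an assertion guarding the buffer bounds."""
    assert 0 <= index < BUFFER_SIZE, "Buffer overflow"
    memory[BUFFER_START + index] = value


unchecked_write(4, 1)  # one past the end of the buffer
print(memory[IS_ADMIN])  # the flag is now set: prints 1
```

The unchecked write silently sets the flag, whereas checked_write(4, 1) would fail with an AssertionError right at the faulty access.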
A second source of errors in C programs is the use of dynamic memory – that is, memory allocated and deallocated at run-time. In C, the function malloc() returns a contiguous block of memory of a given size; the function free() returns it back to the system.
After a block has been free()'d, it must no longer be used, as the memory might already be in use by some other function (or program!) again. Here's a piece of code that violates this assumption:
open('testuseafterfree.c', 'w').write(r'''
#include <stdlib.h>
// Access a chunk of memory after it has been given back to the system
int main(int argc, char *argv[]) {
int *array = malloc(100 * sizeof(int));
free(array);
return array[10]; // BOOM
}
''');
print_content(open('testuseafterfree.c').read())
#include <stdlib.h>
// Access a chunk of memory after it has been given back to the system
int main(int argc, char *argv[]) {
    int *array = malloc(100 * sizeof(int));
    free(array);
    return array[10];  // BOOM
}
Again, if we compile and execute this program, nothing tells us that we have just entered undefined behavior:
!cc -g -o testuseafterfree testuseafterfree.c
!./testuseafterfree
What's going on behind the scenes here?
Dynamic memory is allocated as part of our main memory. The following table shows unallocated memory in gray and allocated memory in black:
dynamic_mem: DynamicMemory = DynamicMemory(13)
dynamic_mem
Address | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Content | -1 | 0 |
The numbers already stored (-1 and 0) are part of our dynamic memory housekeeping (highlighted in blue); they stand for the next block of memory and the length of the current block, respectively.
Let us allocate a block of 3 bytes. Our (simulated) allocation mechanism places these at the first contiguous block available:
p1 = dynamic_mem.allocate(3)
p1
2
We see that a block of 3 items is allocated at address 2. The two numbers before that address (5 and 3) indicate the beginning of the next block as well as the length of the current one.
dynamic_mem
Address | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Content | 5 | 3 | | | | -1 | 0 | | | | | | |
Let us allocate some more.
p2 = dynamic_mem.allocate(4)
p2
7
We can make use of that memory:
dynamic_mem[p1] = 123
dynamic_mem[p2] = 'x'
dynamic_mem
Address | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Content | 5 | 3 | 123 | | | 11 | 4 | 'x' | | | | -1 | 0 |
When we free memory, the block is marked as free by giving it a negative length:
dynamic_mem.free(p1)
dynamic_mem
Address | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Content | 5 | -3 | 123 | | | 11 | 4 | 'x' | | | | -1 | 0 |
Note that freeing memory does not clear memory; the item at p1 is still there. And we can also still access it.
dynamic_mem[p1]
123
But if, in the meantime, some other part of the program requests more memory and uses it...
p3 = dynamic_mem.allocate(2)
dynamic_mem[p3] = 'y'
dynamic_mem
Address | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Content | 5 | 3 | 'y' | | | 11 | 4 | 'x' | | | | -1 | 0 |
... then the memory at p1 may simply be overwritten.
dynamic_mem[p1]
'y'
An even worse effect comes into play if one accidentally overwrites the dynamic memory allocation information; this can easily corrupt the entire memory management. In our case, such corrupted memory can lead to an endless loop when trying to allocate more memory:
dynamic_mem[p3 + 3] = 0
dynamic_mem
Address | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Content | 5 | 3 | 'y' | | | 0 | 4 | 'x' | | | | -1 | 0 |
When allocate() traverses the list of blocks, it will enter an endless loop between the block starting at address 0 (pointing to the next block at 5) and the block at address 5 (pointing back to 0).
with ExpectTimeout(1):
    dynamic_mem.allocate(1)
Traceback (most recent call last):
  File "/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_74444/2188947149.py", line 2, in <cell line: 1>
    dynamic_mem.allocate(1)
  File "/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_74444/2922652606.py", line 24, in allocate
    if chunk_length < 0 and abs(chunk_length) >= block_size:
  File "Timeout.ipynb", line 43, in timeout_handler
    raise TimeoutError()
TimeoutError (expected)
Real-world malloc() and free() implementations suffer from similar problems. As stated above: As soon as undefined behavior is reached, anything may happen.
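A consistency check on the housekeeping data can catch this kind of corruption before allocation loops forever. The following is a minimal sketch, assuming the simulated layout shown in the tables above (each block starts with the address of the next block, with -1 marking the last one); the function chain_ok() is hypothetical and not part of the chapter's DynamicMemory class.

```python
from typing import Any, List


def chain_ok(memory: List[Any]) -> bool:
    """Check that the 'next block' links form a strictly increasing chain."""
    address = 0
    while True:
        next_address = memory[address]
        if next_address == -1:
            return True  # reached the last block
        if not isinstance(next_address, int):
            return False  # link overwritten by non-address data
        # A valid link points forward and leaves room for a block header
        if not address < next_address < len(memory) - 1:
            return False
        address = next_address


# Memory contents as in the tables above (None marks an empty cell)
intact = [5, 3, 'y', None, None, 11, 4, 'x', None, None, None, -1, 0]
corrupted = [5, 3, 'y', None, None, 0, 4, 'x', None, None, None, -1, 0]

print(chain_ok(intact), chain_ok(corrupted))  # prints True False
```

Because every valid link must point strictly forward, the check itself always terminates, even on corrupted memory.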
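The solution to all these problems is to keep track of memory, specifically of which parts of memory have been allocated and which parts have been initialized.
To this end, we introduce two extra flags for each address:
The allocated flag tells whether an address has been allocated; the allocate() method sets it, and free() clears it again.
The initialized flag tells whether an address has been written to. It is cleared as part of allocate().
With these, we can run a number of checks:
When writing into memory or freeing memory, we can check whether the address has been allocated.
When reading from memory, we can check whether the address has been allocated and initialized.
Both of these should effectively prevent memory errors.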
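How such flags and checks could fit together is sketched below. This is not the chapter's ManagedMemory implementation: the class ShadowMemory and its methods mark_allocated() and mark_freed() are hypothetical, and the sketch leaves out the allocator itself to focus on the two flags.

```python
from typing import Any, List


class ShadowMemory:
    """Hypothetical sketch: memory cells plus two shadow flags per address."""

    def __init__(self, size: int = 10) -> None:
        self.cells: List[Any] = [None] * size
        self.allocated = [False] * size
        self.initialized = [False] * size

    def mark_allocated(self, base: int, size: int) -> None:
        for address in range(base, base + size):
            self.allocated[address] = True
            self.initialized[address] = False  # fresh memory is uninitialized

    def mark_freed(self, base: int, size: int) -> None:
        assert self.allocated[base], "Freeing memory that is already freed"
        for address in range(base, base + size):
            self.allocated[address] = False

    def write(self, address: int, item: Any) -> None:
        assert self.allocated[address], "Writing into unallocated memory"
        self.cells[address] = item
        self.initialized[address] = True

    def read(self, address: int) -> Any:
        assert self.allocated[address], "Reading from unallocated memory"
        assert self.initialized[address], "Reading from uninitialized memory"
        return self.cells[address]


shadow = ShadowMemory()
shadow.mark_allocated(2, 3)
shadow.write(2, 10)
print(shadow.read(2))  # prints 10
```

Every read and write pays for one or two list lookups; that is the price for turning silent memory errors into immediate assertion failures.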
Here comes a simple simulation of managed memory. After we create memory, all addresses are neither allocated nor initialized:
managed_mem: ManagedMemory = ManagedMemory()
managed_mem
Address | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
---|---|---|---|---|---|---|---|---|---|---|
Allocated | | | | | | | | | | |
Initialized | | | | | | | | | | |
Content | -1 | 0 | | | | | | | | |
Let us allocate some elements. We see that the three addresses of the new block (2 to 4, right after the housekeeping data) are now marked as allocated:
p = managed_mem.allocate(3)
managed_mem
Address | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
---|---|---|---|---|---|---|---|---|---|---|
Allocated | | | Y | Y | Y | | | | | |
Initialized | | | | | | | | | | |
Content | 5 | 3 | | | | -1 | 0 | | | |
After writing into memory, the respective addresses are marked as "initialized":
managed_mem[p] = 10
managed_mem[p + 1] = 20
managed_mem
Address | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
---|---|---|---|---|---|---|---|---|---|---|
Allocated | | | Y | Y | Y | | | | | |
Initialized | | | Y | Y | | | | | | |
Content | 5 | 3 | 10 | 20 | | -1 | 0 | | | |
Attempting to read uninitialized memory fails:
with ExpectError():
    x = managed_mem[p + 2]
Traceback (most recent call last):
  File "/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_74444/1363131886.py", line 2, in <cell line: 1>
    x = managed_mem[p + 2]
  File "/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_74444/2465984283.py", line 3, in __getitem__
    return self.read(address)
  File "/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_74444/2898840933.py", line 11, in read
    assert self.initialized[address], \
AssertionError: Reading from uninitialized memory (expected)
When we free the block again, it is marked as not allocated:
managed_mem.free(p)
managed_mem
Address | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
---|---|---|---|---|---|---|---|---|---|---|
Allocated | | | | | | | | | | |
Initialized | | | | | | | | | | |
Content | 5 | -3 | 10 | 20 | | -1 | 0 | | | |
And accessing any element of the free'd block will yield an error:
with ExpectError():
    managed_mem[p] = 10
Traceback (most recent call last):
  File "/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_74444/4208287712.py", line 2, in <cell line: 1>
    managed_mem[p] = 10
  File "/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_74444/2465984283.py", line 6, in __setitem__
    self.write(address, item)
  File "/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_74444/2898840933.py", line 3, in write
    assert self.allocated[address], \
AssertionError: Writing into unallocated memory (expected)
Freeing the same block twice also yields an error:
with ExpectError():
    managed_mem.free(p)
Traceback (most recent call last):
  File "/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_74444/3645412891.py", line 2, in <cell line: 1>
    managed_mem.free(p)
  File "/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_74444/193494573.py", line 10, in free
    assert self.allocated[base], \
AssertionError: Freeing memory that is already freed (expected)
With this, we now have a mechanism in place to detect such memory issues in languages such as C.
Obviously, keeping track of whether memory is allocated/initialized or not requires some extra memory – and also some extra computation time, as read and write accesses have to be checked first. During testing, however, such effort may quickly pay off, as memory bugs can be quickly discovered.
To detect memory errors, a number of tools have been developed. The first class of tools interprets the instructions of the executable code, tracking all memory accesses. For each memory access, they can check whether the memory accessed exists and has been initialized at some point.
The Valgrind tool allows interpreting executable code, thus tracking each and every memory access. You can use Valgrind to execute any program from the command-line, and it will check all memory accesses during execution.
Here's what happens if we run Valgrind on our testuseafterfree program:
print_content(open('testuseafterfree.c').read())
#include <stdlib.h>
// Access a chunk of memory after it has been given back to the system
int main(int argc, char *argv[]) {
    int *array = malloc(100 * sizeof(int));
    free(array);
    return array[10];  // BOOM
}
!valgrind ./testuseafterfree
==77== Memcheck, a memory error detector
==77== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==77== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==77== Command: ./testuseafterfree
==77==
==77== Invalid read of size 4
==77==    at 0x1086B7: main (testuseafterfree.c:8)
==77==  Address 0x522f068 is 40 bytes inside a block of size 400 free'd
==77==    at 0x4C32D3B: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==77==    by 0x1086B2: main (testuseafterfree.c:7)
==77==  Block was alloc'd at
==77==    at 0x4C31B0F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==77==    by 0x1086A2: main (testuseafterfree.c:6)
==77==
==77==
==77== HEAP SUMMARY:
==77==     in use at exit: 0 bytes in 0 blocks
==77==   total heap usage: 1 allocs, 1 frees, 400 bytes allocated
==77==
==77== All heap blocks were freed -- no leaks are possible
==77==
==77== For counts of detected and suppressed errors, rerun with: -v
==77== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
We see that Valgrind has detected the issue ("Invalid read of size 4") during execution; it also reported the current stack trace (and hence the location at which the error occurred). Note that the program continues execution even after the error occurred; should further errors occur, Valgrind will report these, too.
Being an interpreter, Valgrind slows down execution of programs dramatically. However, it requires no recompilation and thus can work on code (and libraries) whose source code is not available.
Valgrind is not perfect, though. For our testoverflow program, it fails to detect the illegal access:
print_content(open('testoverflow.c').read())
#include <stdio.h>
// Access memory out of bounds
int main(int argc, char *argv[]) {
    int index = 10;
    return "foo"[index];  // BOOM
}
!valgrind ./testoverflow
==78== Memcheck, a memory error detector
==78== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==78== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==78== Command: ./testoverflow
==78==
==78==
==78== HEAP SUMMARY:
==78==     in use at exit: 0 bytes in 0 blocks
==78==   total heap usage: 0 allocs, 0 frees, 0 bytes allocated
==78==
==78== All heap blocks were freed -- no leaks are possible
==78==
==78== For counts of detected and suppressed errors, rerun with: -v
==78== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
This is because after compilation, the information about the length of the "foo" string is no longer available – all Valgrind sees is a read access into the static data portion of the executable that may be valid or invalid. To actually detect such errors, we need to hook into the compiler.
The second class of tools to detect memory issues are address sanitizers. An address sanitizer injects memory-checking code into the program during compilation. This means that every access will be checked – but this time, the code still runs directly on the processor, so the slowdown is much smaller.
Here is an example of how to use the address sanitizer of the Clang C compiler:
!cc -fsanitize=address -o testuseafterfree testuseafterfree.c
At the very first invalid memory access, the program aborts with a diagnostic message – in our case, already in main().
!./testuseafterfree
testuseafterfree(74510,0x1fd223840) malloc: nano zone abandoned due to inability to reserve vm space.
=================================================================
==74510==ERROR: AddressSanitizer: heap-use-after-free on address 0x614000000068 at pc 0x000102487eb8 bp 0x00016d97a9f0 sp 0x00016d97a9e8
READ of size 4 at 0x614000000068 thread T0
    #0 0x102487eb4 in main+0x94 (testuseafterfree:arm64+0x100003eb4)
    #1 0x197d8c270 (<unknown module>)
0x614000000068 is located 40 bytes inside of 400-byte region [0x614000000040,0x6140000001d0)
freed by thread T0 here:
    #0 0x1028f4d40 in free+0x98 (libclang_rt.asan_osx_dynamic.dylib:arm64e+0x54d40)
    #1 0x102487e58 in main+0x38 (testuseafterfree:arm64+0x100003e58)
    #2 0x197d8c270 (<unknown module>)
previously allocated by thread T0 here:
    #0 0x1028f4c04 in malloc+0x94 (libclang_rt.asan_osx_dynamic.dylib:arm64e+0x54c04)
    #1 0x102487e4c in main+0x2c (testuseafterfree:arm64+0x100003e4c)
    #2 0x197d8c270 (<unknown module>)
SUMMARY: AddressSanitizer: heap-use-after-free (testuseafterfree:arm64+0x100003eb4) in main+0x94
Shadow bytes around the buggy address:
  0x613ffffffd80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x613ffffffe00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x613ffffffe80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x613fffffff00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x613fffffff80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x614000000000: fa fa fa fa fa fa fa fa fd fd fd fd fd[fd]fd fd
  0x614000000080: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x614000000100: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x614000000180: fd fd fd fd fd fd fd fd fd fd fa fa fa fa fa fa
  0x614000000200: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x614000000280: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable: 00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone: fa
  Freed heap region: fd
  Stack left redzone: f1
  Stack mid redzone: f2
  Stack right redzone: f3
  Stack after return: f5
  Stack use after scope: f8
  Global redzone: f9
  Global init order: f6
  Poisoned by user: f7
  Container overflow: fc
  Array cookie: ac
  Intra object redzone: bb
  ASan internal: fe
  Left alloca redzone: ca
  Right alloca redzone: cb
==74510==ABORTING
Likewise, if we apply the address sanitizer on testoverflow, we also immediately get an error:
!cc -fsanitize=address -o testoverflow testoverflow.c
!./testoverflow
testoverflow(74528,0x1fd223840) malloc: nano zone abandoned due to inability to reserve vm space.
=================================================================
==74528==ERROR: AddressSanitizer: global-buffer-overflow on address 0x000102b6feaa at pc 0x000102b6fdf0 bp 0x00016d292a10 sp 0x00016d292a08
READ of size 1 at 0x000102b6feaa thread T0
    #0 0x102b6fdec in main+0x84 (testoverflow:arm64+0x100003dec)
    #1 0x197d8c270 (<unknown module>)
0x000102b6feaa is located 6 bytes after global variable '.str' defined in 'testoverflow.c' (0x102b6fea0) of size 4
  '.str' is ascii string 'foo'
SUMMARY: AddressSanitizer: global-buffer-overflow (testoverflow:arm64+0x100003dec) in main+0x84
Shadow bytes around the buggy address:
  0x000102b6fc00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x000102b6fc80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x000102b6fd00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x000102b6fd80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x000102b6fe00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x000102b6fe80: 00 00 00 00 04[f9]f9 f9 00 00 00 00 00 00 00 00
  0x000102b6ff00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x000102b6ff80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x000102b70000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x000102b70080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x000102b70100: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable: 00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone: fa
  Freed heap region: fd
  Stack left redzone: f1
  Stack mid redzone: f2
  Stack right redzone: f3
  Stack after return: f5
  Stack use after scope: f8
  Global redzone: f9
  Global init order: f6
  Poisoned by user: f7
  Container overflow: fc
  Array cookie: ac
  Intra object redzone: bb
  ASan internal: fe
  Left alloca redzone: ca
  Right alloca redzone: cb
==74528==ABORTING
Since the address sanitizer monitors each and every read and write, as well as usage of free(), it will require some effort to create a bug that it won't catch. Also, while Valgrind runs the program ten times slower or more, the performance penalty for memory sanitization is much lower. Sanitizers can also help in finding data races, memory leaks, and many other sorts of undefined behavior. As Daniel Lemire puts it:
Really, if you are using gcc or clang and you are not using these flags, you are not being serious.
We have seen that during testing and debugging, invariants should be checked as much as possible, thus narrowing down the time it takes to detect a violation to a minimum. The easiest way to get there is to have them checked as postconditions in the constructor and any other method that sets the state of an object.
If you have means to alter the state of an object outside of these methods – for instance, by directly writing to memory, or by writing to internal attributes – then you may have to check them even more frequently. Using the tracing infrastructure, for instance, you can have the tracer invoke repOK() with each and every line executed, thereby again directly pinpointing the moment the state gets corrupted. While this will slow down execution tremendously, it is still better to have the computer do the work than you stepping backwards and forwards through an execution.
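As a rough sketch of this idea, the following code uses Python's sys.settrace() to check an invariant before every line executed (and again when a function returns). The Account class, the run_checked() helper, and the scenario() function are hypothetical names introduced for illustration; they are not part of the tracing infrastructure from the earlier chapters.

```python
import sys
from types import FrameType
from typing import Any, Callable, List, Optional


class Account:
    """Hypothetical example class with a simple invariant."""

    def __init__(self) -> None:
        self.balance = 0

    def repOK(self) -> bool:
        assert self.balance >= 0, "Negative balance"
        return True


def run_checked(function: Callable[[], None], obj: Account) -> Optional[int]:
    """Run function(); return the line number at which obj's invariant
    is first seen violated (None if it holds throughout)."""
    violations: List[int] = []

    def tracer(frame: FrameType, event: str, arg: Any) -> Any:
        # 'line' fires *before* a line is executed, so a violation seen
        # here was caused by the line executed just before
        if event in ('line', 'return') and not violations:
            try:
                obj.repOK()
            except AssertionError:
                violations.append(frame.f_lineno)
        return tracer

    old_tracer = sys.gettrace()
    sys.settrace(tracer)
    try:
        function()
    finally:
        sys.settrace(old_tracer)
    return violations[0] if violations else None


account = Account()


def scenario() -> None:
    account.balance += 10
    account.balance -= 30  # corrupts the state...
    account.balance += 50  # ... and this line hides the corruption again


line = run_checked(scenario, account)
```

A check at the end of scenario() alone would pass, since the final balance is positive again; the line-by-line check still notices the temporary violation and reports where it was first observed.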
Another question is whether assertions should remain active even in production code. Assertions take time, and this may be too much for production.
First of all, assertions are not production code – the rest of the code should not be impacted by any assertion being on or off. If you write code like
assert map.remove(location)
your assertion will have a side effect, namely removing a location from the map. If one turns assertions off, the side effect will be turned off as well. You need to change this into
locationRemoved = map.remove(location)
assert locationRemoved
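The effect can be reproduced in Python itself. The sketch below compiles the same assertion twice using the optimize parameter of the built-in compile() function: level 0 keeps assertions, level 1 removes them, just as python -O would. The locations list is a made-up stand-in for the map in the example above.

```python
SOURCE = "assert locations.pop() == 'home'"

results = {}
for optimize in (0, 1):  # 0: assertions on; 1: assertions off (like -O)
    locations = ['work', 'home']
    code = compile(SOURCE, '<demo>', 'exec', optimize=optimize)
    exec(code, {'locations': locations})
    results[optimize] = locations

print(results)  # prints {0: ['work'], 1: ['work', 'home']}
```

With assertions off, the pop() never happens, so the program state differs depending on a setting that should only affect checking.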
Consequently, you should not rely on assertions for system preconditions – that is, conditions that are necessary to keep the system running. System input (or anything that could be controlled by another party) still has to be validated by production code, not assertions. Critical conditions have to be checked by production code, not (only) assertions.
If you have code such as
assert command in {"open", "close", "exit"}
exec(command)
then having the assertion document and check your assumptions is fine. However, if you turn the assertion off in production code, it will only be a matter of time until somebody sets command to 'system("/bin/sh")' and all of a sudden takes control over your system.
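A check in production code, by contrast, stays active regardless of how assertions are configured. Here is a minimal sketch; the function run_command() and its return value are assumptions made for illustration, and a real system would dispatch to actual handlers rather than return a string.

```python
ALLOWED_COMMANDS = {"open", "close", "exit"}


def run_command(command: str) -> str:
    # Production check: stays active even if assertions are turned off
    if command not in ALLOWED_COMMANDS:
        raise ValueError(f"Invalid command: {command!r}")
    # Dispatch to known handlers instead of passing input to exec()
    return f"executing {command}"


print(run_command("open"))  # prints executing open
```

An input such as 'system("/bin/sh")' now raises a ValueError that the caller can handle, no matter whether the interpreter runs with or without assertions.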
The main reason for turning assertions off is efficiency. However, failing early is better than having bad data and not failing. Think carefully which assertions have a high impact on execution time, and turn these off first. Assertions that have little to no impact on resources can be left on.
As an example, here's a piece of code that handles traffic in a simulation. The light variable can be either RED, AMBER, or GREEN:
if light == RED:
    traffic.stop()
elif light == AMBER:
    traffic.prepare_to_stop()
elif light == GREEN:
    traffic.go()
else:
    pass  # This can't happen!
Having an assertion
assert light in [RED, AMBER, GREEN]
in your code will eat some (minor) resources. However, adding a line
assert False
in place of the This can't happen! line above will still catch errors, but require no resources at all.
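Put together, the pattern might look as follows. This is only a sketch: the Light enumeration and the action() function are hypothetical stand-ins for the simulation code above, returning strings instead of controlling traffic.

```python
from enum import Enum


class Light(Enum):
    RED = 1
    AMBER = 2
    GREEN = 3


def action(light: object) -> str:
    if light == Light.RED:
        return "stop"
    elif light == Light.AMBER:
        return "prepare to stop"
    elif light == Light.GREEN:
        return "go"
    else:
        # Reached only if light has an unexpected value
        assert False, f"Unexpected light: {light!r}"


print(action(Light.AMBER))  # prints prepare to stop
```

The assertion costs nothing on the regular paths, as it is only evaluated when none of the expected cases applies. If the check must survive python -O, one option is to raise AssertionError explicitly instead.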
If you have very critical software, it may be wise to actually pay the extra penalty for assertions (notably system assertions) rather than sacrifice reliability for performance. Keeping a memory sanitizer on even in production can have a small impact on performance, but will catch plenty of errors before some corrupted data (and even some attacks) have bad effects downstream.
By default, failing assertions are not exactly user-friendly – the diagnosis they provide is of interest to the code maintainers only. Think of how your application should handle internal errors as discovered by assertions (or the runtime system). Simply exiting (as assertions in C do) may not be the best option for critical software. Think about implementing your own assert functions with appropriate recovery methods.
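What such a custom assert function could look like depends heavily on the application. As one possible sketch, the hypothetical check() function below logs the problem for maintainers and raises a dedicated InternalError exception, which a hypothetical with_recovery() wrapper turns into a fallback value; all of these names are assumptions made for illustration.

```python
import logging
from typing import Callable, List, TypeVar

T = TypeVar("T")
logger = logging.getLogger("internal_errors")


class InternalError(Exception):
    """Hypothetical exception for violated internal assumptions."""


def check(condition: bool, message: str) -> None:
    """Like assert, but always active; logs details for maintainers."""
    if not condition:
        logger.error("Internal error: %s", message)
        raise InternalError(message)


def with_recovery(action: Callable[[], T], fallback: T) -> T:
    """Run action(); on an internal error, recover with a fallback value."""
    try:
        return action()
    except InternalError:
        return fallback


def average(values: List[float]) -> float:
    check(len(values) > 0, "average() of empty list")
    return sum(values) / len(values)


print(with_recovery(lambda: average([]), fallback=0.0))  # prints 0.0
```

Whether falling back to a default value is appropriate is a design decision: for some applications, a controlled shutdown or a retry may be the safer form of recovery.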
This chapter discusses assertions to define assumptions on function inputs and results:
def my_square_root(x):  # type: ignore
    assert x >= 0
    y = square_root(x)
    assert math.isclose(y * y, x)
    return y
Notably, assertions detect violations of these assumptions at runtime:
with ExpectError():
    y = my_square_root(-1)
Traceback (most recent call last):
  File "/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_74444/76616918.py", line 2, in <cell line: 1>
    y = my_square_root(-1)
  File "/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_74444/2617682038.py", line 2, in my_square_root
    assert x >= 0
AssertionError (expected)
System assertions help to detect invalid memory operations.
managed_mem = ManagedMemory()
managed_mem
Address | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
---|---|---|---|---|---|---|---|---|---|---|
Allocated | | | | | | | | | | |
Initialized | | | | | | | | | | |
Content | -1 | 0 | | | | | | | | |
with ExpectError():
    x = managed_mem[2]
Traceback (most recent call last):
  File "/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_74444/1296110967.py", line 2, in <cell line: 1>
    x = managed_mem[2]
  File "/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_74444/2465984283.py", line 3, in __getitem__
    return self.read(address)
  File "/var/folders/n2/xd9445p97rb3xh7m1dfx8_4h0006ts/T/ipykernel_74444/2898840933.py", line 9, in read
    assert self.allocated[address], \
AssertionError: Reading from unallocated memory (expected)
import os
import shutil
# Let's clean things up
for path in [
    'assert.h',
    'testassert',
    'testassert.c',
    'testassert.dSYM',
    'testoverflow',
    'testoverflow.c',
    'testoverflow.dSYM',
    'testuseafterfree',
    'testuseafterfree.c',
    'testuseafterfree.dSYM',
]:
    if os.path.isdir(path):
        shutil.rmtree(path)
    else:
        try:
            os.remove(path)
        except FileNotFoundError:
            pass
In the next chapters, we will learn how to
The usage of assertions goes back to the earliest days of programming. In 1947, von Neumann and Goldstine defined assertion boxes that would check the limits of specific variables. In his 1949 talk "Checking a Large Routine", Alan Turing suggested:
How can one check a large routine in the sense of making sure that it's right? In order that the man who checks may not have too difficult a task, the programmer should make a number of definite assertions which can be checked individually, and from which the correctness of the whole program easily follows.
Valgrind originated as an academic tool which has seen lots of industrial usage. A list of papers is available on the Valgrind page.
The Address Sanitizer discussed in this chapter was developed at Google; the paper by Serebryany discusses several details.
The Python shelve module provides a simple interface for permanent storage of Python objects:
import shelve
d = shelve.open('mydb')
d['123'] = 123
d['123']
123
d.close()
d = shelve.open('mydb')
d['123']
123
d.close()
Based on shelve, we can implement a class Storage that uses a context manager (a with block) to ensure the shelve database is always closed, even in the presence of exceptions:
import shelve
from types import TracebackType
from typing import Any, Optional, Type
class Storage:
    def __init__(self, dbname: str) -> None:
        self.dbname = dbname
    def __enter__(self) -> Any:
        self.db = shelve.open(self.dbname)
        return self
    def __exit__(self, exc_tp: Type, exc_value: BaseException,
                 exc_traceback: TracebackType) -> Optional[bool]:
        self.db.close()
        return None
    def __getitem__(self, key: str) -> Any:
        return self.db[key]
    def __setitem__(self, key: str, value: Any) -> None:
        self.db[key] = value
with Storage('mydb') as storage:
    print(storage['123'])
123
Extend Storage with assertions that ensure that after adding an element, it can also be retrieved with the same value.
Extend Storage with a "shadow dictionary" which holds elements in memory storage, too. Have a repOK() method that checks that memory storage and shelve storage are identical at all times.
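One possible approach to both exercises is sketched below; it is not the only solution. The class name ShadowStorage and the use of a temporary directory for the database file are assumptions made so that the example is self-contained.

```python
import os
import shelve
import tempfile
from typing import Any, Dict


class ShadowStorage:
    """Sketch: shelve-backed storage with an in-memory shadow dictionary."""

    def __init__(self, dbname: str) -> None:
        self.dbname = dbname
        self.shadow: Dict[str, Any] = {}

    def __enter__(self) -> 'ShadowStorage':
        self.db = shelve.open(self.dbname)
        self.shadow = dict(self.db)  # load what is already stored
        assert self.repOK()
        return self

    def __exit__(self, *exc_info: Any) -> None:
        self.db.close()

    def repOK(self) -> bool:
        # Invariant: memory storage and shelve storage are identical
        assert dict(self.db) == self.shadow
        return True

    def __getitem__(self, key: str) -> Any:
        return self.db[key]

    def __setitem__(self, key: str, value: Any) -> None:
        assert self.repOK()
        self.db[key] = value
        self.shadow[key] = value
        # Postcondition: the element can be retrieved with the same value
        assert self.db[key] == value
        assert self.repOK()


dbname = os.path.join(tempfile.mkdtemp(), 'mydb')
with ShadowStorage(dbname) as storage:
    storage['123'] = 123
    print(storage['123'])  # prints 123
```

Note that repOK() reads the entire database on every call, so it is a candidate for a debugging switch once the class goes into production.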