How Debuggers Work¶
Interactive debuggers are tools that allow you to selectively observe the program state during an execution. In this chapter, you will learn how such debuggers work – by building your own debugger.
from bookutils import YouTubeVideo
YouTubeVideo("4aZ0t7CWSjA")
Prerequisites
- You should have read the Chapter on Tracing Executions.
- Again, knowing a bit of Python is helpful for understanding the code examples in the book.
import bookutils.setup
import sys
from Tracer import Tracer
Synopsis¶
To use the code provided in this chapter, write
>>> from debuggingbook.Debugger import <identifier>
and then make use of the following features.
This chapter provides an interactive debugger for Python functions. The debugger is invoked as
with Debugger():
function_to_be_observed()
...
While running, you can enter debugger commands at the (debugger)
prompt. Here's an example session:
>>> with Debugger():
>>> ret = remove_html_markup('abc')
Calling remove_html_markup(s = 'abc')
(debugger) help
break -- Set a breakpoint in given line. If no line is given, list all breakpoints
continue -- Resume execution
delete -- Delete breakpoint in line given by `arg`.
Without given line, clear all breakpoints
help -- Give help on given `command`. If no command is given, give help on all
list -- Show current function. If `arg` is given, show its source code.
print -- Print an expression. If no expression is given, print all variables
quit -- Finish execution
step -- Execute up to the next line
(debugger) break 14
Breakpoints: {14}
(debugger) list
1> def remove_html_markup(s): # type: ignore
2 tag = False
3 quote = False
4 out = ""
5
6 for c in s:
7 if c == '<' and not quote:
8 tag = True
9 elif c == '>' and not quote:
10 tag = False
11 elif c == '"' or c == "'" and tag:
12 quote = not quote
13 elif not tag:
14# out = out + c
15
16 return out
(debugger) continue
# tag = False, quote = False, out = '', c = 'a'
14 out = out + c
(debugger) step
# out = 'a'
6 for c in s:
(debugger) print out
out = 'a'
(debugger) quit
The Debugger
class can be easily extended in subclasses. A new method NAME_command(self, arg)
will be invoked whenever a command named NAME
is entered, with arg
holding given command arguments (empty string if none).
Debuggers¶
Interactive Debuggers (or short debuggers) are tools that allow you to observe program executions. A debugger typically offers the following features:
- Run the program
- Define conditions under which the execution should stop and hand over control to the debugger. Conditions include
- a particular location is reached
- a particular variable takes a particular value
- a particular variable is accessed
- or some other condition of choice.
- When the program stops, you can observe the current state, including
- the current location
- variables and their values
- the current function and its callers
- When the program stops, you can step through program execution, having it stop at the next instruction again.
- Finally, you can also resume execution to the next stop.
This functionality often comes as a command-line interface, typing commands at a prompt; or as a graphical user interface, selecting commands from the screen. Debuggers can come as standalone tools, or be integrated into a programming environment of choice.
Debugger interaction typically follows a loop pattern. First, you identify the location(s) you want to inspect, and tell the debugger to stop execution once one of these breakpoints is reached. Here's a command that could instruct a command-line debugger to stop at Line 239:
(debugger) break 239
(debugger) _
Then you have the debugger resume or start execution. The debugger will stop at the given location.
(debugger) continue
Line 239: s = x
(debugger) _
When it stops at the given location, you use debugger commands to inspect the state (and check whether things are as expected).
(debugger) print s
s = 'abc'
(debugger) _
You can then step through the program, executing more lines.
(debugger) step
Line 240: c = s[0]
(debugger) print c
c = 'a'
(debugger) _
You can also define new stop conditions, investigating other locations, variables, and conditions.
Debugger Interaction¶
Let us now show how to build such a debugger. The key idea of an interactive debugger is to set up the tracing function such that it actually asks what to do next, prompting you to enter a command. For the sake of simplicity, we collect such a command interactively from a command line, using the Python input()
function.
Our debugger holds a number of variables to indicate its current status:
stepping
is True whenever the user wants to step into the next line.breakpoints
is a set of breakpoints (line numbers)interact
is True while the user stays at one position.
We also store the current tracing information in three attributes frame
, event
, and arg
. The variable local_vars
holds local variables.
from types import FrameType
class Debugger(Tracer):
"""Interactive Debugger"""
def __init__(self, *, file: TextIO = sys.stdout) -> None:
"""Create a new interactive debugger."""
self.stepping: bool = True
self.breakpoints: Set[int] = set()
self.interact: bool = True
self.frame: FrameType
self.event: Optional[str] = None
self.arg: Any = None
self.local_vars: Dict[str, Any] = {}
super().__init__(file=file)
The traceit()
method is the main entry point for our debugger. If we should stop, we go into user interaction.
class Debugger(Debugger):
def traceit(self, frame: FrameType, event: str, arg: Any) -> None:
"""Tracing function; called at every line. To be overloaded in subclasses."""
self.frame = frame
self.local_vars = frame.f_locals # Dereference exactly once
self.event = event
self.arg = arg
if self.stop_here():
self.interaction_loop()
We stop whenever we are stepping through the program or reach a breakpoint:
class Debugger(Debugger):
def stop_here(self) -> bool:
"""Return True if we should stop"""
return self.stepping or self.frame.f_lineno in self.breakpoints
Our interaction loop shows the current status, reads in commands, and executes them.
class Debugger(Debugger):
def interaction_loop(self) -> None:
"""Interact with the user"""
self.print_debugger_status(self.frame, self.event, self.arg)
self.interact = True
while self.interact:
command = input("(debugger) ")
self.execute(command)
For a moment, let us implement two commands, step
and continue
. step
steps through the program:
class Debugger(Debugger):
def step_command(self, arg: str = "") -> None:
"""Execute up to the next line"""
self.stepping = True
self.interact = False
class Debugger(Debugger):
def continue_command(self, arg: str = "") -> None:
"""Resume execution"""
self.stepping = False
self.interact = False
The execute()
method dispatches between these two.
class Debugger(Debugger):
def execute(self, command: str) -> None:
if command.startswith('s'):
self.step_command()
elif command.startswith('c'):
self.continue_command()
Our debugger is now ready to run! Let us invoke it on the buggy remove_html_markup()
variant from the Introduction to Debugging:
def remove_html_markup(s):
tag = False
quote = False
out = ""
for c in s:
if c == '<' and not quote:
tag = True
elif c == '>' and not quote:
tag = False
elif c == '"' or c == "'" and tag:
quote = not quote
elif not tag:
out = out + c
return out
We invoke the debugger just like Tracer
, using a with
clause. The code
with Debugger():
remove_html_markup('abc')
gives us a debugger prompt
(debugger) _
where we can enter one of our two commands.
Let us do two steps through the program and then resume execution:
from bookutils import input, next_inputs
with Debugger():
remove_html_markup('abc')
Try this out for yourself by running the above invocation in the interactive notebook! If you are reading the Web version, the top menu entry Resources
-> Edit as Notebook
will do the trick. Navigate to the above invocation and press Shift
+Enter
.
A Command Dispatcher¶
Our execute()
function is still a bit rudimentary. A true command-line tool should provide means to tell which commands are available (help
), automatically split arguments, and not stand in line of extensibility.
We therefore implement a better execute()
method which does all that. Our revised execute()
method inspects its class for methods that end in _command()
, and automatically registers their names as commands. Hence, with the above, we already get step
and continue
as possible commands.
Implementing execute()
Let us detail how we implement execute()
. The commands()
method returns a list of all commands (as strings) from the class.
class Debugger(Debugger):
def commands(self) -> List[str]:
"""Return a list of commands"""
cmds = [method.replace('_command', '')
for method in dir(self.__class__)
if method.endswith('_command')]
cmds.sort()
return cmds
d = Debugger()
d.commands()
The command_method()
method converts a given command (or its abbrevation) into a method to be called.
class Debugger(Debugger):
def help_command(self, command: str) -> None:
...
def command_method(self, command: str) -> Optional[Callable[[str], None]]:
"""Convert `command` into the method to be called.
If the method is not found, return `None` instead."""
if command.startswith('#'):
return None # Comment
possible_cmds = [possible_cmd for possible_cmd in self.commands()
if possible_cmd.startswith(command)]
if len(possible_cmds) != 1:
self.help_command(command)
return None
cmd = possible_cmds[0]
return getattr(self, cmd + '_command')
d = Debugger()
d.command_method("step")
d = Debugger()
d.command_method("s")
The revised execute()
method now determines this method and executes it with the given argument.
class Debugger(Debugger):
def execute(self, command: str) -> None:
"""Execute `command`"""
sep = command.find(' ')
if sep > 0:
cmd = command[:sep].strip()
arg = command[sep + 1:].strip()
else:
cmd = command.strip()
arg = ""
method = self.command_method(cmd)
if method:
method(arg)
If command_method()
cannot find the command, or finds more than one matching the prefix, it invokes the help
command providing additional assistance. help
draws extra info on each command from its documentation string.
class Debugger(Debugger):
def help_command(self, command: str = "") -> None:
"""Give help on given `command`. If no command is given, give help on all"""
if command:
possible_cmds = [possible_cmd for possible_cmd in self.commands()
if possible_cmd.startswith(command)]
if len(possible_cmds) == 0:
self.log(f"Unknown command {repr(command)}. Possible commands are:")
possible_cmds = self.commands()
elif len(possible_cmds) > 1:
self.log(f"Ambiguous command {repr(command)}. Possible expansions are:")
else:
possible_cmds = self.commands()
for cmd in possible_cmds:
method = self.command_method(cmd)
self.log(f"{cmd:10} -- {method.__doc__}")
d = Debugger()
d.execute("help")
d = Debugger()
d.execute("foo")
Printing Values¶
With execute()
, we can now easily extend our class – all it takes is for a new command NAME
is a new NAME_command()
method. Let us start by providing a print
command to print all variables. We use similar code as for the Tracer
class in the chapter on tracing.
class Debugger(Debugger):
def print_command(self, arg: str = "") -> None:
"""Print an expression. If no expression is given, print all variables"""
vars = self.local_vars
self.log("\n".join([f"{var} = {repr(value)}" for var, value in vars.items()]))
with Debugger():
remove_html_markup('abc')
Let us extend print
such that if an argument is given, it only evaluates and prints out this argument.
class Debugger(Debugger):
def print_command(self, arg: str = "") -> None:
"""Print an expression. If no expression is given, print all variables"""
vars = self.local_vars
if not arg:
self.log("\n".join([f"{var} = {repr(value)}" for var, value in vars.items()]))
else:
try:
self.log(f"{arg} = {repr(eval(arg, globals(), vars))}")
except Exception as err:
self.log(f"{err.__class__.__name__}: {err}")
with Debugger():
remove_html_markup('abc')
Note how we would abbreviate commands to speed things up. The argument to print
can be any Python expression:
with Debugger():
remove_html_markup('abc')
Our help
command also properly lists print
as a possible command:
with Debugger():
remove_html_markup('abc')
Listing Source Code¶
We implement a list
command that shows the source code of the current function.
import inspect
from bookutils import getsourcelines # like inspect.getsourcelines(), but in color
class Debugger(Debugger):
def list_command(self, arg: str = "") -> None:
"""Show current function."""
source_lines, line_number = getsourcelines(self.frame.f_code)
for line in source_lines:
self.log(f'{line_number:4} {line}', end='')
line_number += 1
with Debugger():
remove_html_markup('abc')
Setting Breakpoints¶
Stepping through the program line by line is a bit cumbersome. We therefore implement breakpoints – a set of lines that cause the program to be interrupted as soon as this line is met.
class Debugger(Debugger):
def break_command(self, arg: str = "") -> None:
"""Set a breakpoint in given line. If no line is given, list all breakpoints"""
if arg:
self.breakpoints.add(int(arg))
self.log("Breakpoints:", self.breakpoints)
Here's an example, setting a breakpoint at the end of the loop:
with Debugger():
remove_html_markup('abc')
from bookutils import quiz
Try it out yourself by executing the above code block!
Deleting Breakpoints¶
To delete breakpoints, we introduce a delete
command:
class Debugger(Debugger):
def delete_command(self, arg: str = "") -> None:
"""Delete breakpoint in line given by `arg`.
Without given line, clear all breakpoints"""
if arg:
try:
self.breakpoints.remove(int(arg))
except KeyError:
self.log(f"No such breakpoint: {arg}")
else:
self.breakpoints = set()
self.log("Breakpoints:", self.breakpoints)
with Debugger():
remove_html_markup('abc')
Listings with Benefits¶
Let us extend list
a bit such that
- it can also list a given function, and
- it shows the current line (
>
) as well as breakpoints (#
)
class Debugger(Debugger):
def list_command(self, arg: str = "") -> None:
"""Show current function. If `arg` is given, show its source code."""
try:
if arg:
obj = eval(arg)
source_lines, line_number = inspect.getsourcelines(obj)
current_line = -1
else:
source_lines, line_number = \
getsourcelines(self.frame.f_code)
current_line = self.frame.f_lineno
except Exception as err:
self.log(f"{err.__class__.__name__}: {err}")
source_lines = []
line_number = 0
for line in source_lines:
spacer = ' '
if line_number == current_line:
spacer = '>'
elif line_number in self.breakpoints:
spacer = '#'
self.log(f'{line_number:4}{spacer} {line}', end='')
line_number += 1
with Debugger():
remove_html_markup('abc')
Quitting¶
In the Python debugger interface, we can only observe, but not alter the control flow. To make sure we can always exit out of our debugging session, we introduce a quit
command that deletes all breakpoints and resumes execution until the observed function finishes.
class Debugger(Debugger):
def quit_command(self, arg: str = "") -> None:
"""Finish execution"""
self.breakpoints = set()
self.stepping = False
self.interact = False
With this, our command palette is pretty complete, and we can use our debugger to happily inspect Python executions.
with Debugger():
remove_html_markup('abc')
Lessons Learned¶
- Debugging hooks from interpreted languages allow for simple interactive debugging.
- A command-line debugging framework can be very easily extended with additional functionality.
Next Steps¶
In the next chapter, we will see how assertions check correctness at runtime.
Background¶
The command-line interface in this chapter is modeled after GDB, the GNU debugger, whose interface in turn goes back to earlier command-line debuggers such as dbx. All modern debuggers build on the functionality and concepts realized in these debuggers, be it breakpoints, stepping through programs, or inspecting program state.
The concept of time travel debugging (see the Exercises, below) has been invented (and reinvented) many times. One of the most impactful tools comes from King et al. [Samuel T. King et al, 2005], integrating a time-traveling virtual machine (TTVM) for debugging operating systems, integrated into GDB. The recent record+replay "rr" debugger also implements time travel debugging on top of the GDB command line debugger; it is applicable for general-purpose programs and available as open source.
Exercises¶
Exercise 1: Changing State¶
Some Python implementations allow altering the state by assigning values to frame.f_locals
. Implement a assign VAR=VALUE
command that allows changing the value of (local) variable VAR
to the new value VALUE
.
Note: As detailed in this blog post,
frame.f_locals
is re-populated with every access, so assign to our local alias self.local_vars
instead.
Exercise 2: More Commands¶
Extending the Debugger
class with extra features and commands is a breeze. The following commands are inspired from the GNU command-line debugger (GDB):
Named breakpoints ("break")¶
With break FUNCTION
and delete FUNCTION
, set and delete a breakpoint at FUNCTION
.
Step over functions ("next")¶
When stopped at a function call, the next
command should execute the entire call, stopping when the function returns. (In contrast, step
stops at the first line of the function called.)
Print call stack ("where")¶
Implement a where
command that shows the stack of calling functions.
Move up and down the call stack ("up" and "down")¶
After entering the up
command, explore the source and variables of the calling function rather than the current function. Use up
repeatedly to move further up the stack. down
returns to the caller.
Execute until line ("until")¶
With until LINE
, resume execution until a line greater than LINE
is reached. If LINE
is not given, resume execution until a line greater than the current is reached. This is useful to avoid stepping through multiple loop iterations.
Execute until return ("finish")¶
With finish
, resume execution until the current function returns.
Watchpoints ("watch")¶
With watch CONDITION
, stop execution as soon as CONDITION
changes its value. (Use the code from our EventTracer
class in the chapter on Tracing.) delete CONDITION
removes the watchpoint. Keep in mind that some variable names may not exist at all times.
Exercise 3: Time-Travel Debugging¶
Rather than inspecting a function at the moment it executes, you can also record the entire state (call stack, local variables, etc.) during execution, and then run an interactive session to step through the recorded execution. Your time travel debugger would be invoked as
with TimeTravelDebugger():
function_to_be_tracked()
...
The interaction then starts at the end of the with
block.
Part 1: Recording Values¶
Start with a subclass of Tracer
from the chapter on tracing (say, TimeTravelTracer
) to execute a program while recording all values. Keep in mind that recording even only local variables at each step quickly consumes large amounts of memory. As an alternative, consider recording only changes to variables, with the option to restore an entire state from a baseline and later changes.
Part 2: Command Line Interface¶
Create TimeTravelDebugger
as subclass of both TimeTravelTracer
and Debugger
to provide a command line interface as with Debugger
, including additional commands which get you back to earlier states:
back
is likestep
, except that you go one line backrestart
gets you to the beginning of the executionrewind
gets you to the beginning of the current function invocation
Part 3: Graphical User Interface¶
Create GUItimeTravelDebugger
to provide a graphical user interface that allows you to explore a recorded execution, using HTML and JavaScript.
Here's a simple example to get you started. Assume you have recorded the following line numbers and variable values:
recording: List[Tuple[int, Dict[str, Any]]] = [
(10, {'x': 25}),
(11, {'x': 25}),
(12, {'x': 26, 'a': "abc"}),
(13, {'x': 26, 'a': "abc"}),
(10, {'x': 30}),
(11, {'x': 30}),
(12, {'x': 31, 'a': "def"}),
(13, {'x': 31, 'a': "def"}),
(10, {'x': 35}),
(11, {'x': 35}),
(12, {'x': 36, 'a': "ghi"}),
(13, {'x': 36, 'a': "ghi"}),
]
Then, the following function will provide a slider that will allow you to explore these values:
from bookutils import HTML
def slider(rec: List[Tuple[int, Dict[str, Any]]]) -> str:
lines_over_time = [line for (line, var) in rec]
vars_over_time = []
for (line, vars) in rec:
vars_over_time.append(", ".join(f"{var} = {repr(value)}"
for var, value in vars.items()))
# print(lines_over_time)
# print(vars_over_time)
template = f'''
<div class="time_travel_debugger">
<input type="range" min="0" max="{len(lines_over_time) - 1}"
value="0" class="slider" id="time_slider">
Line <span id="line">{lines_over_time[0]}</span>:
<span id="vars">{vars_over_time[0]}</span>
</div>
<script>
var lines_over_time = {lines_over_time};
var vars_over_time = {vars_over_time};
var time_slider = document.getElementById("time_slider");
var line = document.getElementById("line");
var vars = document.getElementById("vars");
time_slider.oninput = function() {{
line.innerHTML = lines_over_time[this.value];
vars.innerHTML = vars_over_time[this.value];
}}
</script>
'''
# print(template)
return HTML(template)
slider(recording)
Explore the HTML and JavaScript details of how slider()
works, and then expand it to a user interface where you can
- see the current source code (together with the line being executed)
- search for specific events, such as a line being executed or a variable changing its value
Just like slider()
, your user interface should come in pure HTML and JavaScript such that it can run in a browser (or a Jupyter notebook) without interacting with a Python program.
The content of this project is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. The source code that is part of the content, as well as the source code used to format and display that content is licensed under the MIT License. Last change: 2024-10-25 17:12:48+02:00 • Cite • Imprint