Author: Adrian Tam
In any programming exercise, it is difficult to go far and deep without a handy debugger. Python’s built-in debugger, pdb, is a mature and capable one that can help us a lot if you know how to use it. In this tutorial, we are going to see what pdb can do for you as well as some of its alternatives.
In this tutorial you will learn:
- What a debugger can do
- How to control a debugger
- The limitations of Python’s pdb and its alternatives
Let’s get started.
Tutorial Overview
This tutorial is in four parts; they are:
- The concept of running a debugger
- Walk-through of using a debugger
- Debugger in Visual Studio Code
- Using GDB on a running Python program
The concept of running a debugger
The purpose of a debugger is to provide you with a slow-motion button to control the flow of a program. It also allows you to freeze the program at a certain point in time and examine its state.
The simplest operation under a debugger is to step through the code. That is, to run one line of code at a time and wait for your acknowledgment before proceeding to the next. The reason we want to run the program in a stop-and-go fashion is to allow us to check the logic and values or to verify the algorithm.
For a larger program, we may not want to step through the code from the beginning, as it may take a long time before we reach the line we are interested in. Therefore, debuggers also provide a breakpoint feature that kicks in when a specific line of code is reached. From that point onward, we can step through the code line by line.
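To make the idea of stepping more concrete, the following is a minimal sketch of the hook that Python exposes for this purpose, sys.settrace(), on which pdb is built. The helper trace_lines() and the toy function accumulate() are hypothetical names made up for this illustration. Instead of pausing at each line, the sketch only records which lines were visited:

```python
import sys

def trace_lines(func, *args):
    """Run func(*args) and record each line executed inside it."""
    executed = []
    target = func.__code__

    def tracer(frame, event, arg):
        # Only follow the frame that belongs to the function of interest
        if frame.f_code is not target:
            return None
        if event == "line":
            # Store the offset from the "def" line, so the result does not
            # depend on where this snippet sits in a file
            executed.append(frame.f_lineno - target.co_firstlineno)
        return tracer

    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(None)   # always remove the hook again
    return result, executed

def accumulate(n):
    total = 0
    for i in range(n):
        total += i
    return total

result, executed = trace_lines(accumulate, 2)
print(result)     # value returned by accumulate(2)
print(executed)   # offsets of the visited lines, in execution order
```

A real debugger does the same bookkeeping but, at each "line" event, stops and waits for your command rather than appending to a list.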
Walk-through of using a debugger
Let’s see how we can make use of a debugger with an example. The following is the Python code for showing particle swarm optimization in an animation:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.animation import FuncAnimation

def f(x,y):
    "Objective function"
    return (x-3.14)**2 + (y-2.72)**2 + np.sin(3*x+1.41) + np.sin(4*y-1.73)

# Compute and plot the function in 3D within [0,5]x[0,5]
x, y = np.array(np.meshgrid(np.linspace(0,5,100), np.linspace(0,5,100)))
z = f(x, y)

# Find the global minimum
x_min = x.ravel()[z.argmin()]
y_min = y.ravel()[z.argmin()]

# Hyper-parameter of the algorithm
c1 = c2 = 0.1
w = 0.8

# Create particles
n_particles = 20
np.random.seed(100)
X = np.random.rand(2, n_particles) * 5
V = np.random.randn(2, n_particles) * 0.1

# Initialize data
pbest = X
pbest_obj = f(X[0], X[1])
gbest = pbest[:, pbest_obj.argmin()]
gbest_obj = pbest_obj.min()

def update():
    "Function to do one iteration of particle swarm optimization"
    global V, X, pbest, pbest_obj, gbest, gbest_obj
    # Update params
    r1, r2 = np.random.rand(2)
    V = w * V + c1*r1*(pbest - X) + c2*r2*(gbest.reshape(-1,1)-X)
    X = X + V
    obj = f(X[0], X[1])
    pbest[:, (pbest_obj >= obj)] = X[:, (pbest_obj >= obj)]
    pbest_obj = np.array([pbest_obj, obj]).min(axis=0)
    gbest = pbest[:, pbest_obj.argmin()]
    gbest_obj = pbest_obj.min()

# Set up base figure: The contour map
fig, ax = plt.subplots(figsize=(8,6))
fig.set_tight_layout(True)
img = ax.imshow(z, extent=[0, 5, 0, 5], origin='lower', cmap='viridis', alpha=0.5)
fig.colorbar(img, ax=ax)
ax.plot([x_min], [y_min], marker='x', markersize=5, color="white")
contours = ax.contour(x, y, z, 10, colors='black', alpha=0.4)
ax.clabel(contours, inline=True, fontsize=8, fmt="%.0f")
pbest_plot = ax.scatter(pbest[0], pbest[1], marker='o', color='black', alpha=0.5)
p_plot = ax.scatter(X[0], X[1], marker='o', color='blue', alpha=0.5)
p_arrow = ax.quiver(X[0], X[1], V[0], V[1], color='blue', width=0.005, angles='xy', scale_units='xy', scale=1)
gbest_plot = plt.scatter([gbest[0]], [gbest[1]], marker='*', s=100, color='black', alpha=0.4)
ax.set_xlim([0,5])
ax.set_ylim([0,5])

def animate(i):
    "Steps of PSO: algorithm update and show in plot"
    title = 'Iteration {:02d}'.format(i)
    # Update params
    update()
    # Set picture
    ax.set_title(title)
    pbest_plot.set_offsets(pbest.T)
    p_plot.set_offsets(X.T)
    p_arrow.set_offsets(X.T)
    p_arrow.set_UVC(V[0], V[1])
    gbest_plot.set_offsets(gbest.reshape(1,-1))
    return ax, pbest_plot, p_plot, p_arrow, gbest_plot

anim = FuncAnimation(fig, animate, frames=list(range(1,50)), interval=500, blit=False, repeat=True)
anim.save("PSO.gif", dpi=120, writer="imagemagick")

print("PSO found best solution at f({})={}".format(gbest, gbest_obj))
print("Global optimal at f({})={}".format([x_min,y_min], f(x_min,y_min)))
The particle swarm optimization is done by executing the update() function a number of times. Each time it runs, we get closer to the optimal solution of the objective function. We use matplotlib’s FuncAnimation() function instead of a loop to run update(), so we can capture the position of the particles at each iteration.
Assuming this program is saved as pso.py, you can run it from the command line by entering:
python pso.py
The solution will be printed to the screen and the animation will be saved as PSO.gif. But if we want to run it with the Python debugger, we enter the following in the command line:
python -m pdb pso.py
The -m pdb part loads the pdb module and lets the module execute the file pso.py for you. When you run this command, you will be welcomed with the pdb prompt as follows:
> /Users/mlm/pso.py(1)<module>()
-> import numpy as np
(Pdb)
At the prompt, you can type in debugger commands. To show the list of supported commands, we can use h. To show the details of a specific command (such as list), we can use h list:
> /Users/mlm/pso.py(1)<module>()
-> import numpy as np
(Pdb) h

Documented commands (type help <topic>):
========================================
EOF    c          d        h         list      q        rv       undisplay
a      cl         debug    help      ll        quit     s        unt
alias  clear      disable  ignore    longlist  r        source   until
args   commands   display  interact  n         restart  step     up
b      condition  down     j         next      return   tbreak   w
break  cont       enable   jump      p         retval   u        whatis
bt     continue   exit     l         pp        run      unalias  where

Miscellaneous help topics:
==========================
exec  pdb

(Pdb)
At the beginning of a debugger session, we start at the first line of the program. Normally, a Python program starts with a few lines of import. We can use n to move to the next line, or s to step into a function:
> /Users/mlm/pso.py(1)<module>()
-> import numpy as np
(Pdb) n
> /Users/mlm/pso.py(2)<module>()
-> import matplotlib.pyplot as plt
(Pdb) n
> /Users/mlm/pso.py(3)<module>()
-> from matplotlib.animation import FuncAnimation
(Pdb) n
> /Users/mlm/pso.py(5)<module>()
-> def f(x,y):
(Pdb) n
> /Users/mlm/pso.py(10)<module>()
-> x, y = np.array(np.meshgrid(np.linspace(0,5,100), np.linspace(0,5,100)))
(Pdb) n
> /Users/mlm/pso.py(11)<module>()
-> z = f(x, y)
(Pdb) s
--Call--
> /Users/mlm/pso.py(5)f()
-> def f(x,y):
(Pdb) s
> /Users/mlm/pso.py(7)f()
-> return (x-3.14)**2 + (y-2.72)**2 + np.sin(3*x+1.41) + np.sin(4*y-1.73)
(Pdb) s
--Return--
> /Users/mlm/pso.py(7)f()->array([[17.25... 7.46457344]])
-> return (x-3.14)**2 + (y-2.72)**2 + np.sin(3*x+1.41) + np.sin(4*y-1.73)
(Pdb) s
> /Users/mlm/pso.py(14)<module>()
-> x_min = x.ravel()[z.argmin()]
(Pdb)
In pdb, the line of code about to be executed is printed before the prompt. Usually the n command is what we prefer, as it executes that line of code and moves the flow at the same level without drilling down deeper. When we are at a line that calls a function (such as line 11 of the above program, which runs z = f(x, y)), we can use s to step into the function. In the above example, we first step into the f() function, then take another step to execute the computation, and finally collect the return value from the function and give it back to the line that invoked the function. We see that multiple s commands are needed for a function as simple as one line, because the call into the function, the execution of its body, and the return each take one step. We can also see that in the body of the function, we call np.sin(), but the debugger’s s command does not go into it. This is because the np.sin() function is implemented in C rather than in Python, and pdb cannot step into compiled code.
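We can observe this limitation without a debugger. The sketch below uses sys.settrace(), the hook pdb is built on, to list the functions that the tracer gets to enter. To avoid third-party dependencies, math.sin() stands in for np.sin() here (both are compiled code), and the names py_sin() and entered_functions() are made up for this illustration:

```python
import math
import sys

def py_sin(x):
    "A thin Python wrapper, so there is a Python-level frame to enter"
    return math.sin(x)

def entered_functions(func, *args):
    """Return the names of the Python functions entered while running func."""
    entered = []

    def tracer(frame, event, arg):
        if event == "call":
            entered.append(frame.f_code.co_name)
        return None   # no need to trace line by line inside the frame

    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(None)
    return entered

print(entered_functions(py_sin, 0.5))    # the wrapper is visible to the tracer
print(entered_functions(math.sin, 0.5))  # nothing to enter for compiled code
```

The tracer is notified when the Python wrapper is called, but the call to the compiled function creates no Python frame, so there is nothing for s to step into.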
If the program is long, it is quite tedious to use the n command many times to get to the part we are interested in. We can use the until command with a line number to let the debugger run the program until that line is reached:
> /Users/mlm/pso.py(1)<module>()
-> import numpy as np
(Pdb) until 11
> /Users/mlm/pso.py(11)<module>()
-> z = f(x, y)
(Pdb) s
--Call--
> /Users/mlm/pso.py(5)f()
-> def f(x,y):
(Pdb) s
> /Users/mlm/pso.py(7)f()
-> return (x-3.14)**2 + (y-2.72)**2 + np.sin(3*x+1.41) + np.sin(4*y-1.73)
(Pdb) s
--Return--
> /Users/mlm/pso.py(7)f()->array([[17.25... 7.46457344]])
-> return (x-3.14)**2 + (y-2.72)**2 + np.sin(3*x+1.41) + np.sin(4*y-1.73)
(Pdb) s
> /Users/mlm/pso.py(14)<module>()
-> x_min = x.ravel()[z.argmin()]
(Pdb)
A command similar to until is return, which will execute the current function until the point where it is about to return. You can think of it as until with the line number equal to the last line of the current function. The until command is one-off, meaning it will bring you to that line only once. If you want to stop at a particular line every time it is run, you can set a breakpoint on it. For example, if we are interested in how each iteration of the optimization algorithm moves the solution, we can set a breakpoint right after the update is applied:
> /Users/mlm/pso.py(1)<module>()
-> import numpy as np
(Pdb) b 40
Breakpoint 1 at /Users/mlm/pso.py:40
(Pdb) c
> /Users/mlm/pso.py(40)update()
-> obj = f(X[0], X[1])
(Pdb) bt
  /usr/local/Cellar/python@3.9/3.9.9/Frameworks/Python.framework/Versions/3.9/lib/python3.9/bdb.py(580)run()
-> exec(cmd, globals, locals)
  <string>(1)<module>()
  /Users/mlm/pso.py(76)<module>()
-> anim.save("PSO.gif", dpi=120, writer="imagemagick")
  /usr/local/lib/python3.9/site-packages/matplotlib/animation.py(1078)save()
-> anim._init_draw() # Clear the initial frame
  /usr/local/lib/python3.9/site-packages/matplotlib/animation.py(1698)_init_draw()
-> self._draw_frame(frame_data)
  /usr/local/lib/python3.9/site-packages/matplotlib/animation.py(1720)_draw_frame()
-> self._drawn_artists = self._func(framedata, *self._args)
  /Users/mlm/pso.py(65)animate()
-> update()
> /Users/mlm/pso.py(40)update()
-> obj = f(X[0], X[1])
(Pdb) p r1
0.8054505373292797
(Pdb) p r2
0.7543489945823536
(Pdb) p X
array([[2.77550474, 1.60073607, 2.14133019, 4.11466522, 0.2445649 ,
        0.65149396, 3.24520628, 4.08804798, 0.89696478, 2.82703884,
        4.42055413, 1.03681404, 0.95318658, 0.60737118, 1.17702652,
        4.67551174, 3.95781321, 0.95077669, 4.08220292, 1.33330594],
       [2.07985611, 4.53702225, 3.81359193, 1.83427181, 0.87867832,
        1.8423856 , 0.11392109, 1.2635162 , 3.84974582, 0.27397365,
        2.86219806, 3.05406841, 0.64253831, 1.85730719, 0.26090638,
        4.28053621, 4.71648133, 0.44101305, 4.14882396, 2.74620598]])
(Pdb) n
> /Users/mlm/pso.py(41)update()
-> pbest[:, (pbest_obj >= obj)] = X[:, (pbest_obj >= obj)]
(Pdb) n
> /Users/mlm/pso.py(42)update()
-> pbest_obj = np.array([pbest_obj, obj]).min(axis=0)
(Pdb) n
> /Users/mlm/pso.py(43)update()
-> gbest = pbest[:, pbest_obj.argmin()]
(Pdb) n
> /Users/mlm/pso.py(44)update()
-> gbest_obj = pbest_obj.min()
(Pdb)
After we set a breakpoint with the b command, we can let the debugger run our program until the breakpoint is hit. The c command means to continue until a trigger is met. At any point, we can use the bt command to show the traceback and check how we got here. We can also use the p command to print variables (or an expression) to check what values they hold.
We can also place a breakpoint with a condition, so that it will stop only if the condition is met. The following imposes the condition that the first random number (r1) is greater than 0.5:
(Pdb) b 40, r1 > 0.5
Breakpoint 1 at /Users/mlm/pso.py:40
(Pdb) c
> /Users/mlm/pso.py(40)update()
-> obj = f(X[0], X[1])
(Pdb) p r1, r2
(0.8054505373292797, 0.7543489945823536)
(Pdb) c
> /Users/mlm/pso.py(40)update()
-> obj = f(X[0], X[1])
(Pdb) p r1, r2
(0.5404045753007164, 0.2967937508800147)
(Pdb)
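If you want to try a conditional breakpoint without running the full animation, the following is a small sketch that replays such a session automatically. It assumes that python -m pdb reads its commands from standard input when input is piped in. The toy script (a stand-in, not pso.py) and the helper run_pdb_session() are made up for this illustration:

```python
import os
import subprocess
import sys
import tempfile

# A tiny stand-in script with a loop to break in
SCRIPT = """\
total = 0
for i in range(5):
    total += i
print("total =", total)
"""

# Commands fed to the (Pdb) prompt, one per line
COMMANDS = """\
b 3, i == 3
c
p ("i", i, "total", total)
q
"""

def run_pdb_session(script, commands):
    "Run the script under `python -m pdb`, typing the commands for us"
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "demo.py")
        with open(path, "w") as fh:
            fh.write(script)
        proc = subprocess.run(
            [sys.executable, "-m", "pdb", path],
            input=commands, capture_output=True, text=True, timeout=60,
        )
    return proc.stdout

output = run_pdb_session(SCRIPT, COMMANDS)
print(output)
```

The breakpoint on line 3 is skipped for the first three iterations and triggers only once i equals 3, at which point the p command shows the variables before line 3 is executed.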
We can also manipulate variables while we are debugging:
(Pdb) l
 35         global V, X, pbest, pbest_obj, gbest, gbest_obj
 36         # Update params
 37         r1, r2 = np.random.rand(2)
 38         V = w * V + c1*r1*(pbest - X) + c2*r2*(gbest.reshape(-1,1)-X)
 39         X = X + V
 40 B->     obj = f(X[0], X[1])
 41         pbest[:, (pbest_obj >= obj)] = X[:, (pbest_obj >= obj)]
 42         pbest_obj = np.array([pbest_obj, obj]).min(axis=0)
 43         gbest = pbest[:, pbest_obj.argmin()]
 44         gbest_obj = pbest_obj.min()
 45
(Pdb) p V
array([[ 0.03742722,  0.20930531,  0.06273426, -0.1710678 ,  0.33629384,
         0.19506555, -0.10238065, -0.12707257,  0.28042122, -0.03250191,
        -0.14004886,  0.13224399,  0.16083673,  0.21198813,  0.17530208,
        -0.27665503, -0.15344393,  0.20079061, -0.10057509,  0.09128536],
       [-0.05034548, -0.27986224, -0.30725954,  0.11214169,  0.0934514 ,
         0.00335978,  0.20517519,  0.06308483, -0.22007053,  0.26176423,
        -0.12617228, -0.05676629,  0.18296986, -0.01669114,  0.18934933,
        -0.27623121, -0.32482898,  0.213894  , -0.34427909, -0.12058168]])
(Pdb) p r1, r2
(0.5404045753007164, 0.2967937508800147)
(Pdb) r1 = 0.2
(Pdb) p r1, r2
(0.2, 0.2967937508800147)
(Pdb) j 38
> /Users/mlm/pso.py(38)update()
-> V = w * V + c1*r1*(pbest - X) + c2*r2*(gbest.reshape(-1,1)-X)
(Pdb) n
> /Users/mlm/pso.py(39)update()
-> X = X + V
(Pdb) p V
array([[ 0.02680837,  0.16594979,  0.06350735, -0.15577623,  0.30737655,
         0.19911613, -0.08242418, -0.12513798,  0.24939995, -0.02217463,
        -0.13474876,  0.14466204,  0.16661846,  0.21194543,  0.16952298,
        -0.24462505, -0.138997  ,  0.19377154, -0.10699911,  0.10631063],
       [-0.03606147, -0.25128615, -0.26362411,  0.08163408,  0.09842085,
         0.00765688,  0.19771385,  0.06597805, -0.20564599,  0.23113388,
        -0.0956787 , -0.07044121,  0.16637064, -0.00639259,  0.18245734,
        -0.25698717, -0.30336147,  0.19354112, -0.29904698, -0.08810355]])
(Pdb)
In the above, we use the l command to list the code around the current statement (identified by the arrow ->). In the listing, we can also see that the breakpoint (marked with B) is set at line 40. Having seen the current values of V and r1, we modify r1 from 0.54 to 0.2 and run the statement on V again by using j (jump) to go back to line 38. As we can see after executing the statement with the n command, the value of V has changed.
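The same trick can be tried on a much smaller program. The sketch below replays a session on a toy script (a stand-in, not pso.py) by piping commands into python -m pdb; the script, the command list, and the helper run_pdb_session() are all made up for this illustration. As a precaution, the sketch jumps first and assigns the new value right before the n command, with the assignment prefixed by ! so that it cannot be mistaken for a debugger command:

```python
import os
import subprocess
import sys
import tempfile

# A tiny stand-in script: one function whose input we will override
SCRIPT = """\
def scale(factor):
    value = factor * 100
    return value

print("value =", scale(5))
"""

# Stop before the return, jump back one line, change the input, and rerun it
COMMANDS = """\
b 3
c
p value
j 2
!factor = 2
n
p value
c
q
"""

def run_pdb_session(script, commands):
    "Run the script under `python -m pdb`, typing the commands for us"
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "demo.py")
        with open(path, "w") as fh:
            fh.write(script)
        proc = subprocess.run(
            [sys.executable, "-m", "pdb", path],
            input=commands, capture_output=True, text=True, timeout=60,
        )
    return proc.stdout

output = run_pdb_session(SCRIPT, COMMANDS)
print(output)
```

The first p value shows the result computed from the original argument, while the line printed by the script itself reflects the value we injected from the debugger.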
If we use a breakpoint and find something unexpected, chances are that it was caused by issues at a different level of the call stack. Debuggers allow you to navigate to different levels:
(Pdb) bt
  /usr/local/Cellar/python@3.9/3.9.9/Frameworks/Python.framework/Versions/3.9/lib/python3.9/bdb.py(580)run()
-> exec(cmd, globals, locals)
  <string>(1)<module>()
  /Users/mlm/pso.py(76)<module>()
-> anim.save("PSO.gif", dpi=120, writer="imagemagick")
  /usr/local/lib/python3.9/site-packages/matplotlib/animation.py(1091)save()
-> anim._draw_next_frame(d, blit=False)
  /usr/local/lib/python3.9/site-packages/matplotlib/animation.py(1126)_draw_next_frame()
-> self._draw_frame(framedata)
  /usr/local/lib/python3.9/site-packages/matplotlib/animation.py(1720)_draw_frame()
-> self._drawn_artists = self._func(framedata, *self._args)
  /Users/mlm/pso.py(65)animate()
-> update()
> /Users/mlm/pso.py(39)update()
-> X = X + V
(Pdb) up
> /Users/mlm/pso.py(65)animate()
-> update()
(Pdb) bt
  /usr/local/Cellar/python@3.9/3.9.9/Frameworks/Python.framework/Versions/3.9/lib/python3.9/bdb.py(580)run()
-> exec(cmd, globals, locals)
  <string>(1)<module>()
  /Users/mlm/pso.py(76)<module>()
-> anim.save("PSO.gif", dpi=120, writer="imagemagick")
  /usr/local/lib/python3.9/site-packages/matplotlib/animation.py(1091)save()
-> anim._draw_next_frame(d, blit=False)
  /usr/local/lib/python3.9/site-packages/matplotlib/animation.py(1126)_draw_next_frame()
-> self._draw_frame(framedata)
  /usr/local/lib/python3.9/site-packages/matplotlib/animation.py(1720)_draw_frame()
-> self._drawn_artists = self._func(framedata, *self._args)
> /Users/mlm/pso.py(65)animate()
-> update()
  /Users/mlm/pso.py(39)update()
-> X = X + V
(Pdb) l
 60
 61     def animate(i):
 62         "Steps of PSO: algorithm update and show in plot"
 63         title = 'Iteration {:02d}'.format(i)
 64         # Update params
 65  ->     update()
 66         # Set picture
 67         ax.set_title(title)
 68         pbest_plot.set_offsets(pbest.T)
 69         p_plot.set_offsets(X.T)
 70         p_arrow.set_offsets(X.T)
(Pdb) p title
'Iteration 02'
(Pdb)
In the above, the first bt command gives the call stack when we are at the bottom frame, i.e., the deepest level of the call stack. We can see that we are about to execute the statement X = X + V. Then the up command moves our focus one level up the call stack, to the line that runs the update() function (as we see at the line preceded by >). Since our focus has changed, the list command l prints a different fragment of code and the p command can examine a variable in a different scope.
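Conceptually, each level of the call stack is a frame object that holds its own local variables, and moving up means looking at the frame of the caller. The following is a rough sketch of that idea using the inspect module. The functions caller_locals(), update_step(), and animate_step() are hypothetical stand-ins named after the example above, not part of pso.py:

```python
import inspect

def caller_locals():
    """Return the name and local variables of whoever called our caller.

    This is roughly the view that `up` followed by `p` gives in pdb.
    """
    frame = inspect.currentframe()   # frame of caller_locals() itself
    try:
        inner = frame.f_back         # frame of the function that called us
        outer = inner.f_back         # one level "up" from there
        return outer.f_code.co_name, dict(outer.f_locals)
    finally:
        del frame                    # avoid keeping a reference cycle alive

def update_step():
    velocity = 0.1                   # local to the inner function only
    return caller_locals()

def animate_step(i):
    title = "Iteration {:02d}".format(i)
    return update_step()

name, variables = animate_step(2)
print(name)                # the function one level up from update_step()
print(variables["title"])  # a variable that lives in that outer scope
```

The variable velocity does not appear in the result because it belongs to the inner frame, just as p in pdb only sees the scope that currently has the focus.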
The above covers most of the useful commands in the debugger. If we want to terminate the debugger (which also terminates the program), we can use the q command to quit, or hit Ctrl-D if your terminal supports it.
Debugger in Visual Studio Code
If you are not very comfortable running the debugger on the command line, you can rely on the debugger in your IDE. Almost every IDE provides some debugging facility. In Visual Studio Code, for example, you can launch the debugger from the “Run” menu.
The screen below shows Visual Studio Code in a debugging session. The buttons at the top center correspond to the pdb commands continue, next, step, return, restart, and quit, respectively. A breakpoint can be created by clicking on the line number, and a red dot will appear to mark it. The bonus of using an IDE is that the variables are shown immediately at each debugging step. We can also watch an expression and show the call stack. These are at the left side of the screen below.
Using GDB on a running Python program
Python’s pdb is suitable only for programs started under the debugger from scratch. If we have a program that is already running but stuck, we cannot use pdb to hook into it to check what’s going on. The Python extension for GDB, however, can do this.
To demonstrate, let’s consider a GUI application. It waits for the user’s action before the program can end. Hence it is a perfect example for seeing how we can use gdb to hook into a running process. The code below is a “hello world” program using PyQt5 that just creates an empty window and waits for the user to close it:
import sys
from PyQt5.QtWidgets import QApplication, QWidget, QMainWindow

class Frame(QMainWindow):
    def __init__(self):
        super().__init__()
        self.initUI()
    def initUI(self):
        self.setWindowTitle("Simple title")
        self.resize(800,600)

def main():
    app = QApplication(sys.argv)
    frame = Frame()
    frame.show()
    sys.exit(app.exec_())

if __name__ == '__main__':
    main()
Let’s save this program as simpleqt.py and run it with the following command on Linux under an X window environment:
python simpleqt.py &
The final & makes it run in the background. Now we can check for its process ID using the ps command:
ps a | grep python
...
   3997 pts/1    Sl     0:00 python simpleqt.py
...
The ps command shows the process ID in the first column. If you have gdb installed with the Python extension, you can run
gdb python 3997
and it will bring you to the GDB prompt:
GNU gdb (Debian 10.1-1.7) 10.1.90.20210103-git
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from python...
Reading symbols from /usr/lib/debug/.build-id/f9/02f8a561c3abdb9c8d8c859d4243bd8c3f928f.debug...
Attaching to program: /usr/local/bin/python, process 3997
[New LWP 3998]
[New LWP 3999]
[New LWP 4001]
[New LWP 4002]
[New LWP 4003]
[New LWP 4004]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
0x00007fb11b1c93ff in __GI___poll (fds=0x7fb110007220, nfds=3, timeout=-1) at ../sysdeps/unix/sysv/linux/poll.c:29
29      ../sysdeps/unix/sysv/linux/poll.c: No such file or directory.
(gdb) py-bt
Traceback (most recent call first):
  <built-in method exec_ of QApplication object at remote 0x7fb115f64c10>
  File "/mnt/data/simpleqt.py", line 16, in main
    sys.exit(app.exec_())
  File "/mnt/data/simpleqt.py", line 19, in <module>
    main()
(gdb) py-list
  11
  12    def main():
  13        app = QApplication(sys.argv)
  14        frame = Frame()
  15        frame.show()
 >16        sys.exit(app.exec_())
  17
  18    if __name__ == '__main__':
  19        main()
(gdb)
GDB is meant to be a debugger for compiled programs (usually written in C or C++). The Python extension allows you to inspect the code (written in Python) being run by the Python interpreter (which is written in C). It is less feature-rich than Python’s pdb in terms of handling Python code, but it is useful when you need to hook into a running process.
The commands supported under GDB are py-list, py-bt, py-up, py-down, and py-print. They are comparable to the pdb commands of the same names without the py- prefix.
GDB is useful if your Python code uses a library compiled from C (such as numpy) and you want to investigate how it runs. It is also useful for learning why your program is frozen by checking the call stack at run time. However, you will rarely need to use GDB to debug a machine learning project.
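If all we want is the Python call stack of a frozen process, the interactive session can be skipped. The sketch below builds a GDB command line that attaches to a process, runs py-bt, and exits, using GDB's -p, -batch, and -ex options. It assumes gdb is installed with the Python extension loaded; the helper names gdb_py_bt_command() and dump_python_stack() are made up for this illustration, and 3997 is simply the process ID from the example above:

```python
import shutil
import subprocess

def gdb_py_bt_command(pid):
    "Build a gdb command line that attaches to a process, runs py-bt, and exits"
    return [
        "gdb",
        "-p", str(pid),   # attach to the running process
        "-batch",         # exit once the commands below have run
        "-ex", "py-bt",   # Python-level traceback from the Python extension
    ]

def dump_python_stack(pid):
    "Return the output of py-bt for the process, or None if gdb is not installed"
    if shutil.which("gdb") is None:
        return None
    proc = subprocess.run(gdb_py_bt_command(pid), capture_output=True, text=True)
    return proc.stdout

# Show the command that would be run for the process in the example
print(" ".join(gdb_py_bt_command(3997)))
```

Attaching to another process usually requires sufficient privileges on the machine, so calling dump_python_stack() may need to be done as the same user or as root.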
Further Readings
The Python pdb module’s documentation is at
But pdb is not the only debugger available. Some third-party tools are listed in:
GDB with the Python extension is most mature on Linux. Please see the following for more details on its usage:
The command interface of pdb is influenced by that of GDB. Hence we can learn the general technique of debugging a program from the latter. A good primer on how to use a debugger would be
- The Art of Debugging with GDB, DDD, and Eclipse, by Norman Matloff (2008)
Summary
In this tutorial, you discovered the features of Python’s pdb.
Specifically, you learned:
- What pdb can do and how to use it
- The limitations of pdb and its alternatives
In the next post, we will see that pdb is also a Python function that can be called inside a Python program.
The post Python debugging tools appeared first on Machine Learning Mastery.