Author: Zhe Ming Chng
When working on code, whether we know it or not, we often come across the decorator design pattern. This is a programming technique to extend the functionality of classes or functions without modifying them. The decorator design pattern allows us to mix and match extensions easily. Python has a decorator syntax rooted in the decorator design pattern. Knowing how to make and use a decorator can help you write more powerful code.
In this post, you will discover the decorator pattern and Python’s function decorators.
After completing this tutorial, you will know:
- What is the decorator pattern, and why is it useful
- Python’s function decorators and how to use them
Let’s get started!
Overview
This tutorial is divided into four parts:
- What is the decorator pattern, and why is it useful?
- Function decorators in Python
- The use cases of decorators
- Some practical examples of decorators
What is the decorator pattern, and why is it useful?
The decorator pattern is a software design pattern that allows us to dynamically add functionality to classes without creating subclasses or affecting the behavior of other objects of the same class. By using the decorator pattern, we can easily generate different permutations of functionality that we might want without creating an exponentially increasing number of subclasses, which would make our code increasingly complex and bloated.
Decorators are usually implemented as sub-interfaces of the main interface that we want to implement, and each decorator stores an object of the main interface’s type. The decorator then adds functionality to selected methods by overriding them and calling the corresponding methods on the stored object.
Above is the UML class diagram for the decorator design pattern. The decorator abstract class contains an object of type OriginalInterface; this is the object whose functionality the decorator will be modifying. To instantiate our concrete DecoratorClass, we would need to pass in an instance of a concrete class that implements the OriginalInterface, and then when we make method calls to DecoratorClass.method1(), our DecoratorClass should modify the output from the object’s method1().
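The structure described above can be sketched in Python as follows. This is only an illustration under assumptions: the names OriginalInterface, DecoratorClass, and method1() come from the description, while ConcreteClass and the upper-casing behavior are hypothetical, and the abstract decorator class is folded into DecoratorClass for brevity.

```python
from abc import ABC, abstractmethod


class OriginalInterface(ABC):
    """The interface shared by the concrete class and the decorator."""

    @abstractmethod
    def method1(self) -> str:
        ...


class ConcreteClass(OriginalInterface):
    """Hypothetical concrete implementation, used only for illustration."""

    def method1(self) -> str:
        return "original output"


class DecoratorClass(OriginalInterface):
    """Stores an OriginalInterface object and modifies the output of its method1()."""

    def __init__(self, wrapped: OriginalInterface):
        self.wrapped = wrapped

    def method1(self) -> str:
        # Delegate to the stored object, then add the extra behavior
        return self.wrapped.method1().upper()


plain = ConcreteClass()
decorated = DecoratorClass(plain)
print(plain.method1())      # original output
print(decorated.method1())  # ORIGINAL OUTPUT
```

Because DecoratorClass itself implements OriginalInterface, a decorated object can be wrapped again by another decorator, which is how extensions are mixed and matched.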
With Python, however, we are able to simplify many of these design patterns because of dynamic typing and because functions and classes are first-class objects. Modifying a class or a function without changing its implementation remains the key idea of decorators, and we will explore Python’s decorator syntax in the following.
Function Decorators in Python
A function decorator is an incredibly useful feature in Python. It is built upon the idea that functions and classes are first-class objects in Python.
Let’s consider a simple example, that is, to call a function twice. Since a Python function is an object and we can pass a function as an argument to another function, this task can be done as follows:
def repeat(fn):
    fn()
    fn()

def hello_world():
    print("Hello world!")

repeat(hello_world)
Again, since a Python function is an object, we can make a function to return another function, which is to execute yet another function twice. This is done as follows:
def repeat_decorator(fn):
    def decorated_fn():
        fn()
        fn()
    # returns a function
    return decorated_fn

def hello_world():
    print("Hello world!")

hello_world_twice = repeat_decorator(hello_world)

# call the function
hello_world_twice()
The function returned by repeat_decorator() above is created each time repeat_decorator() is invoked, as it depends on the argument provided. In the above, we passed the hello_world function as an argument to the repeat_decorator() function, and it returns the decorated_fn function, which is assigned to hello_world_twice. Afterward, we can invoke hello_world_twice() since it is now a function.
The idea of the decorator pattern applies here, but we do not need to define the interface and subclasses explicitly. In fact, hello_world is a name defined as a function in the above example. There is nothing preventing us from redefining this name to something else. Hence we can also do the following:
def repeat_decorator(fn):
    def decorated_fn():
        fn()
        fn()
    # returns a function
    return decorated_fn

def hello_world():
    print("Hello world!")

hello_world = repeat_decorator(hello_world)

# call the function
hello_world()
That is, instead of assigning the newly created function to hello_world_twice, we overwrite hello_world instead. While the name hello_world is reassigned to another function, the previous function still exists but is just not exposed to us.
Indeed, the above code is functionally equivalent to the following:
# function decorator that calls the function twice
def repeat_decorator(fn):
    def decorated_fn():
        fn()
        fn()
    # returns a function
    return decorated_fn

# using the decorator on hello_world function
@repeat_decorator
def hello_world():
    print("Hello world!")

# call the function
hello_world()
In the above code, @repeat_decorator before a function definition means to pass the function into repeat_decorator() and reassign its name to the output. That is, it means hello_world = repeat_decorator(hello_world). The @ line is the decorator syntax in Python.
Note: The @ syntax is also used in Java, but there it has a different meaning: it marks an annotation, which is essentially metadata and not a decorator.
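One practical consequence of reassigning the name is that the decorated function reports the inner function's metadata. The sketch below contrasts the plain repeat_decorator with a variant that uses functools.wraps from the standard library; the names repeat_decorator_wraps and hello_again are hypothetical and exist only for this comparison.

```python
import functools


def repeat_decorator(fn):
    def decorated_fn():
        fn()
        fn()
    return decorated_fn


def repeat_decorator_wraps(fn):
    # functools.wraps copies __name__, __doc__, etc. from fn onto decorated_fn
    @functools.wraps(fn)
    def decorated_fn():
        fn()
        fn()
    return decorated_fn


@repeat_decorator
def hello_world():
    """Print a greeting."""
    print("Hello world!")


@repeat_decorator_wraps
def hello_again():
    """Print another greeting."""
    print("Hello again!")


print(hello_world.__name__)  # decorated_fn
print(hello_again.__name__)  # hello_again
print(hello_again.__doc__)   # Print another greeting.
```

functools.wraps also stores the original function in the __wrapped__ attribute, so the function hidden behind the reassigned name stays reachable when needed.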
We can also implement decorators that take in arguments, but this is a bit more complicated as we need one more layer of nesting. Let’s extend our example above so that we can define the number of times to repeat the function call:
def repeat_decorator(num_repeats=2):
    # repeat_decorator should return a function that's a decorator
    def inner_decorator(fn):
        def decorated_fn():
            for i in range(num_repeats):
                fn()
        # return the new function
        return decorated_fn
    # return the decorator that actually takes the function in as the input
    return inner_decorator

# use the decorator with num_repeats argument set as 5 to repeat the function call 5 times
@repeat_decorator(5)
def hello_world():
    print("Hello world!")

# call the function
hello_world()
The repeat_decorator() takes in an argument and returns a function which is the actual decorator for the hello_world function (i.e., invoking repeat_decorator(5) returns inner_decorator with the local variable num_repeats = 5 set). The above code will print the following:
Hello world!
Hello world!
Hello world!
Hello world!
Hello world!
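The decorated_fn in the example above takes no arguments, so it only works for functions like hello_world(). A common variation, sketched here as one possible approach, forwards *args and **kwargs to the wrapped function and returns the result of the last call. The greet() function is hypothetical and used only for illustration.

```python
def repeat_decorator(num_repeats=2):
    def inner_decorator(fn):
        def decorated_fn(*args, **kwargs):
            result = None
            for _ in range(num_repeats):
                # forward whatever arguments the caller supplied
                result = fn(*args, **kwargs)
            # hand back the return value of the last call
            return result
        return decorated_fn
    return inner_decorator


@repeat_decorator(3)
def greet(name, punctuation="!"):
    message = f"Hello {name}{punctuation}"
    print(message)
    return message


last = greet("world", punctuation="?")  # prints "Hello world?" three times
print(last)                             # Hello world?
```

Writing @repeat_decorator(3) here has the same effect as greet = repeat_decorator(3)(greet): the outer call builds the decorator, and the decorator is then applied to the function.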
Before we end this section, we should remember that decorators can also be applied to classes in addition to functions. Since a class in Python is also an object, we may redefine a class in a similar fashion.
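As a minimal sketch of a class decorator, the hypothetical add_describe() below receives a class, attaches one extra method to it, and returns the same class. Both add_describe and Point are invented for this illustration.

```python
def add_describe(cls):
    """Hypothetical class decorator that attaches a describe() method."""
    def describe(self):
        return f"{cls.__name__} with attributes {vars(self)}"
    cls.describe = describe
    # return the (modified) class so the name is rebound to it
    return cls


@add_describe
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point(1, 2)
print(p.describe())  # Point with attributes {'x': 1, 'y': 2}
```

Just as with functions, the @add_describe line is equivalent to writing Point = add_describe(Point) after the class definition.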
The Use Cases of Decorators
The decorator syntax in Python makes decorators easier to use. There are many reasons we may use a decorator. One of the most common use cases is to convert data implicitly. For example, we may define a function that assumes all operations are based on numpy arrays and then make a decorator to ensure that happens by modifying the input:
import numpy as np

# function decorator to ensure numpy input
def ensure_numpy(fn):
    def decorated_function(data):
        # converts input to numpy array
        array = np.asarray(data)
        # calls fn on input numpy array
        return fn(array)
    return decorated_function
We can further add to our decorator by modifying the output of the function, such as rounding off floating point values:
import numpy as np

# function decorator to ensure numpy input
# and round off output to 4 decimal places
def ensure_numpy(fn):
    def decorated_function(data):
        array = np.asarray(data)
        output = fn(array)
        return np.around(output, 4)
    return decorated_function
Let’s consider the example of finding the sum of an array. A numpy array has sum() built-in, as does a pandas DataFrame. But the latter sums over each column rather than over all elements. Hence a numpy array will sum to one floating point value while a DataFrame will sum to a vector of values. But with the above decorator, we can write a function that gives you consistent output in both cases:
import numpy as np
import pandas as pd

# function decorator to ensure numpy input
# and round off output to 4 decimal places
def ensure_numpy(fn):
    def decorated_function(data):
        array = np.asarray(data)
        output = fn(array)
        return np.around(output, 4)
    return decorated_function

@ensure_numpy
def numpysum(array):
    return array.sum()

x = np.random.randn(10,3)
y = pd.DataFrame(x, columns=["A", "B", "C"])

# output of numpy .sum() function
print("x.sum():", x.sum())
print()

# output of pandas .sum() function
print("y.sum():", y.sum())
print(y.sum())
print()

# calling decorated numpysum function
print("numpysum(x):", numpysum(x))
print("numpysum(y):", numpysum(y))
Running the above code gives us the output:
x.sum(): 0.3948331694737762

y.sum(): A   -1.175484
B    2.496056
C   -0.925739
dtype: float64
A   -1.175484
B    2.496056
C   -0.925739
dtype: float64

numpysum(x): 0.3948
numpysum(y): 0.3948
This is a simple example. But imagine if we define a new function that computes the standard deviation of elements in an array. We can simply use the same decorator, and then the function will also accept a pandas DataFrame. Hence all the code to polish the input is taken out of these functions and moved into the decorator. This is how we can efficiently reuse the code.
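The standard deviation case mentioned above might look like the following sketch. The function name numpystd is an assumption chosen to mirror numpysum, and the small fixed array is used only to keep the example reproducible.

```python
import numpy as np
import pandas as pd


# same decorator as before: ensure numpy input, round output to 4 decimal places
def ensure_numpy(fn):
    def decorated_function(data):
        array = np.asarray(data)
        output = fn(array)
        return np.around(output, 4)
    return decorated_function


@ensure_numpy
def numpystd(array):
    # standard deviation over all elements of the array
    return array.std()


x = np.array([[1.0, 2.0], [3.0, 4.0]])
y = pd.DataFrame(x, columns=["A", "B"])

print("numpystd(x):", numpystd(x))
print("numpystd(y):", numpystd(y))
```

Both calls return the same value because the DataFrame is converted to a numpy array before std() runs. Calling y.std() directly would instead give one value per column and, by default, use the sample standard deviation (ddof=1) rather than numpy's population default (ddof=0).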
Some Practical Examples of Decorators
Now that we have learned the decorator syntax in Python, let’s see what we can do with it!
Memoization
There are some function calls that we do repeatedly, but where the values rarely, if ever, change. This could be calls to a server where the data is relatively static or as part of a dynamic programming algorithm or computationally intensive math function. We might want to memoize these function calls, i.e., storing the value of their output on a virtual memo pad for reuse later.
A decorator is a natural way to implement memoization. We just need to remember the input and output of a function but keep the function’s behavior as-is. Below is an example:
import pickle
import hashlib

MEMO = {}  # To remember the function input and output

def memoize(fn):
    def _deco(*args, **kwargs):
        # pickle the function arguments and obtain hash as the store keys
        key = (fn.__name__, hashlib.md5(pickle.dumps((args, kwargs), 4)).hexdigest())
        # check if the key exists
        if key in MEMO:
            ret = pickle.loads(MEMO[key])
        else:
            ret = fn(*args, **kwargs)
            MEMO[key] = pickle.dumps(ret)
        return ret
    return _deco

@memoize
def fibonacci(n):
    if n in [0, 1]:
        return n
    else:
        return fibonacci(n-1) + fibonacci(n-2)

print(fibonacci(40))
print(MEMO)
In this example, we implemented memoize() to work with a global dictionary MEMO such that the name of a function together with the arguments becomes the key and the function’s return value becomes the value. When the function is called, the decorator checks whether the corresponding key exists in MEMO; if it does, the stored value is returned. Otherwise, the actual function is invoked, and its return value is added to the dictionary.
We use pickle to serialize the input and output and use hashlib to create a hash of the input because not everything can be a key to the Python dictionary (e.g., list is an unhashable type; thus, it cannot be a key). Serializing any arbitrary structure into a byte string overcomes this and guarantees that the stored return data is immutable. Furthermore, hashing the function arguments avoids storing an exceptionally long key in the dictionary (for example, when we pass in a huge numpy array to the function).
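The key construction can be examined on its own. In the sketch below, make_key() is a hypothetical helper that repeats the recipe used inside memoize(); it is pulled out only so its behavior is easy to inspect.

```python
import hashlib
import pickle


def make_key(fn_name, args, kwargs):
    # Same recipe as in memoize(): pickle the arguments, then hash the bytes
    payload = pickle.dumps((args, kwargs), 4)
    return (fn_name, hashlib.md5(payload).hexdigest())


args = ([1, 2, 3],)  # a list argument makes the tuple unhashable

try:
    hash(args)
    hashable = True
except TypeError:
    hashable = False
print("args hashable:", hashable)  # args hashable: False

# Equal arguments produce the same key, so the second call would be a cache hit
key1 = make_key("f", args, {"scale": 2.0})
key2 = make_key("f", ([1, 2, 3],), {"scale": 2.0})
print(key1 == key2)  # True

# Keyword order changes the pickled bytes, so these two calls get different keys
key_ab = make_key("f", (), {"a": 1, "b": 2})
key_ba = make_key("f", (), {"b": 2, "a": 1})
print(key_ab == key_ba)  # False
```

The last comparison shows a limitation of this simple scheme: calls that are logically identical but pass keyword arguments in a different order are cached separately. That costs extra computation and storage but does not produce wrong results.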
The above example uses fibonacci() to demonstrate the power of memoization. Calling fibonacci(n) will produce the n-th Fibonacci number. Running the above example would produce the following output, in which we can see the 40th Fibonacci number is 102334155 and how the dictionary MEMO is used to store different calls to the function.
102334155
{('fibonacci', '635f1664f168e2a15b8e43f20d45154b'): b'\x80\x04K\x01.',
 ('fibonacci', 'd238998870ae18a399d03477dad0c0a8'): b'\x80\x04K\x00.',
 ('fibonacci', 'dbed6abf8fcf4beec7fc97f3170de3cc'): b'\x80\x04K\x01.',
 ...
 ('fibonacci', 'b9954ff996a4cd0e36fffb09f982b08e'): b'\x80\x04\x95\x06\x00\x00\x00\x00\x00\x00\x00J)pT\x02.',
 ('fibonacci', '8c7aba62def8063cf5afe85f42372f0d'): b'\x80\x04\x95\x06\x00\x00\x00\x00\x00\x00\x00J\xa2\x0e\xc5\x03.',
 ('fibonacci', '6de8535f23d756de26959b4d6e1f66f6'): b'\x80\x04\x95\x06\x00\x00\x00\x00\x00\x00\x00J\xcb~\x19\x06.'}
You may try to remove the @memoize line in the code above. You will find the program takes significantly longer to run, because each function call invokes two more function calls; hence it runs in O(2^n) time instead of O(n) as in the memoized case.
Memoization is very helpful for expensive functions whose outputs don’t change frequently, for example, the following function that reads some stock market data from the Internet:
...
import pandas_datareader as pdr

@memoize
def get_stock_data(ticker):
    # pull data from stooq
    df = pdr.stooq.StooqDailyReader(symbols=ticker, start="1/1/00", end="31/12/21").read()
    return df

# testing call to function
import cProfile as profile
import pstats

for i in range(1, 3):
    print(f"Run {i}")
    run_profile = profile.Profile()
    run_profile.enable()
    get_stock_data("^DJI")
    run_profile.disable()
    pstats.Stats(run_profile).print_stats(0)
If implemented correctly, the call to get_stock_data() should be more expensive the first time and much less expensive subsequently. The output from the code snippet above gives us:
Run 1
         17492 function calls (17051 primitive calls) in 1.452 seconds

Run 2
         221 function calls (218 primitive calls) in 0.001 seconds
This is particularly useful if you are working on a Jupyter notebook. If you need to download some data, wrap it in a memoize decorator. Since developing a machine learning project means many iterations of changing your code to see if the result looks any better, a memoized download function saves you a lot of unnecessary waiting.
You may make a more powerful memoization decorator by saving the data in a database (e.g., a key-value store like GNU dbm or an in-memory database such as memcached or Redis). But if you just need the functionality as above, Python 3.2 and later ship the decorator lru_cache in the standard library module functools, so you don’t need to write your own:
import functools
import pandas_datareader as pdr

# memoize using lru_cache
@functools.lru_cache
def get_stock_data(ticker):
    # pull data from stooq
    df = pdr.stooq.StooqDailyReader(symbols=ticker, start="1/1/00", end="31/12/21").read()
    return df

# testing call to function
import cProfile as profile
import pstats

for i in range(1, 3):
    print(f"Run {i}")
    run_profile = profile.Profile()
    run_profile.enable()
    get_stock_data("^DJI")
    run_profile.disable()
    pstats.Stats(run_profile).print_stats(0)
Note: lru_cache implements LRU caching, which limits the cache to the most recent calls to the function (128 entries by default). Using @functools.lru_cache without parentheses, as above, requires Python 3.8 or later. Python 3.9 added @functools.cache as well, which is unlimited in size and does no LRU purging.
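As a quick check that the built-in decorator behaves like our own memoize(), the sketch below applies lru_cache to the same fibonacci() function. Passing maxsize=None disables the size limit and also works on Python versions before 3.9; cache_info() is part of the functools API and reports cache statistics.

```python
import functools


# maxsize=None means no LRU purging, similar to functools.cache
@functools.lru_cache(maxsize=None)
def fibonacci(n):
    if n in [0, 1]:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)


print(fibonacci(40))  # 102334155

# hits, misses, maxsize, and current size of the cache
info = fibonacci.cache_info()
print(info)
```

Each distinct n from 0 to 40 is computed once and stored, so the cache ends up holding 41 entries. Unlike the pickle-based memoize(), lru_cache uses the arguments directly as dictionary keys, so every argument must be hashable.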
Function Catalog
Another example where we might want to consider the use of function decorators is for registering functions in a catalog. It allows us to associate functions with a string and pass the strings as arguments for other functions. This is the start of making a system that will enable user-provided plug-ins. Let’s illustrate this with an example. Below is a decorator and a function activate() that we will use later. Let’s assume the following code is saved in the file activation.py:
# activation.py

ACTIVATION = {}

def register(name):
    def decorator(fn):
        # assign fn to "name" key in ACTIVATION
        ACTIVATION[name] = fn
        # return fn unmodified
        return fn
    return decorator

def activate(x, kind):
    try:
        fn = ACTIVATION[kind]
        return fn(x)
    except KeyError:
        print("Activation function %s undefined" % kind)
After defining the register decorator in the above code, we can now use it to register functions and associate strings with them. Let’s have the file funcs.py as such:
# funcs.py

from activation import register
import numpy as np

@register("relu")
def relu(x):
    return np.where(x>0, x, 0)

@register("sigmoid")
def sigm(x):
    return 1/(1+np.exp(-x))

@register("tanh")
def tanh(x):
    return np.tanh(x)
We’ve registered the “relu,” “sigmoid,” and “tanh” functions to their respective strings by building this association in the ACTIVATION dictionary.
Now, let’s see how we can use our newly registered functions.
import numpy as np
from activation import activate

# create a random matrix
x = np.random.randn(5,3)
print(x)

# try ReLU activation on the matrix
relu_x = activate(x, "relu")
print(relu_x)

# load the functions, and call ReLU activation again
import funcs
relu_x = activate(x, "relu")
print(relu_x)
which gives us the output:
[[-0.81549502 -0.81352867  1.41539545]
 [-0.28782853 -1.59323543 -0.19824959]
 [ 0.06724466 -0.26622761 -0.41893662]
 [ 0.47927331 -1.84055276 -0.23147207]
 [-0.18005588 -1.20837815 -1.34768876]]
Activation function relu undefined
None
[[0.         0.         1.41539545]
 [0.         0.         0.        ]
 [0.06724466 0.         0.        ]
 [0.47927331 0.         0.        ]
 [0.         0.         0.        ]]
Observe that before we reached the import funcs line, the ReLU activation did not exist. Hence calling the function printed the error message, and the result was None. Running that import line then loads the functions defined in funcs.py, just like a plug-in module, and the same function call gives us the result we expected.
Note that we never invoked anything in the module funcs explicitly, and we didn’t modify anything in the call to activate(). Simply importing funcs caused those new functions to register and expanded the functionality of activate(). Using this technique allows us to develop a very large system while focusing on only one small part at a time without worrying about the interoperability of other parts. Without the registration decorators and function catalog, adding a new activation function would need modification to every function that uses activation.
If you’re familiar with Keras, the above should remind you of the following syntax:
layer = keras.layers.Dense(128, activation="relu")

model.compile(loss="sparse_categorical_crossentropy",
              optimizer="adam",
              metrics=["sparse_categorical_accuracy"])
Keras defines almost all components using a decorator of a similar nature. Hence we can refer to building blocks by name. Without this mechanism, we would have to use the following syntax all the time, which puts a burden on us to remember the location of a lot of components:
layer = keras.layers.Dense(128, activation=keras.activations.relu)

model.compile(loss=keras.losses.SparseCategoricalCrossentropy(),
              optimizer=keras.optimizers.Adam(),
              metrics=[keras.metrics.SparseCategoricalAccuracy()])
Further reading
This section provides more resources on the topic if you are looking to go deeper.
Articles
- Decorator pattern
- Python Language Reference, Section 8.7, Function definitions
- PEP 318 – Decorators for Functions and Methods
Books
- Fluent Python, 2nd edition, by Luciano Ramalho
APIs
- functools module in Python standard library
Summary
In this post, you discovered the decorator design pattern and Python’s decorator syntax. You also saw some specific use cases of decorators that can help your Python program run faster or be easier to extend.
Specifically, you learned:
- The idea of a decorator pattern and the decorator syntax in Python
- How to implement a decorator in Python for use with the decorator syntax
- The use of a decorator for adapting function input and output, for memoization, and for registering functions in a catalog
The post A Gentle Introduction to Decorators in Python appeared first on Machine Learning Mastery.