Python Good Practices: Part 1 - Python constructs

Python has many powerful and useful constructs. Used well, they help us write code that is safer, more readable, and often faster and less memory-hungry. This article covers iterables and iterators, generators, comprehensions, context managers and decorators, and explains when and how to use them.

Iterables and iterators

Looping is one of the most widely used features of Python. We can loop over built-in types such as dictionaries, lists, tuples and strings, and even over custom objects.

The for loop syntax differs slightly from that of many other languages. It does not require an initial value or a conditional expression indicating when to stop. Instead, it fetches the next element from an iterable on every iteration.

An iterable is an object that can hand out an iterator, and an iterator is an object that implements the iterator protocol. In practice this comes down to two magic methods: __iter__() and __next__().

-          __iter__() – returns an iterator object. An iterator returns ‘self’ here, which makes the object both an iterable and its own iterator. However, the method can return any other iterator (or generator).

-          __next__() – returns the next value of an iterator. When there is no next value, it raises the StopIteration exception to signal that looping over the object should stop. If __iter__() returns a generator, there is no need to implement this method.

Let’s consider the following example of iterating over Fibonacci numbers:

class Fibonacci:

    def __init__(self, numbers):
        self._numbers = numbers

    def __iter__(self):
        self._x = 1
        self._y = 1
        self._counter = 0
        return self

    def __next__(self):
        current_number = self._x
        self._counter += 1
        if self._counter > self._numbers:
            raise StopIteration
        self._x, self._y = self._y, self._x + self._y
        return current_number

for fib_number in Fibonacci(10):
    print(fib_number)

In the above example the ‘for’ loop first calls the __iter__() method of the object we created to obtain an iterator. Without this method, a TypeError exception would be raised. Then, on every iteration, the __next__() method is called to get the next value. This continues until StopIteration is raised to indicate that there is no next value.
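What the ‘for’ loop does behind the scenes can be sketched by hand with the built-in iter() and next() functions. This is a simplified illustration on a plain list (the variable names are made up for the example), not the interpreter's actual implementation:

```python
numbers = [1, 1, 2, 3, 5]

# Step 1: ask the iterable for an iterator (this calls numbers.__iter__()).
iterator = iter(numbers)

collected = []
while True:
    try:
        # Step 2: ask the iterator for the next value (calls __next__()).
        value = next(iterator)
    except StopIteration:
        # Step 3: the iterator is exhausted, so the loop ends.
        break
    collected.append(value)

print(collected)
```

The loop body only ever sees one value at a time, which is what makes the protocol suitable for sources that are too large to hold in memory.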

Iterators allow us to write more readable code and to make any object usable in a loop. Moreover, they help us save memory. A built-in collection such as a dictionary, list or tuple has to be held in memory as a whole before we can loop over it, while an iterator needs only one element at a time.

To sum up, using iterators is a good practice that increases the readability of the code and saves memory. However, they require a lot of boilerplate, so if there is no need to create a whole class just to iterate over it, we should write a generator instead.

Generators

A generator is another Python construct that makes looping over specific elements even simpler than iterators do. It is a function that contains the yield keyword. That alone is enough for the Python interpreter to know that the function is a generator.

Generators differ from ordinary functions in how they work. When we invoke a function, it executes until its body ends or a return statement is reached. When we invoke a generator, on the other hand, we get a generator object in an idle state, i.e. no code has been executed yet. On every iteration the code runs until it reaches the yield keyword, returns the current value and becomes idle again. This continues until the generator body ends or a return statement is reached, at which point StopIteration is raised for the caller.
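The idle-then-resume behaviour can be observed directly by driving a generator with next(). The sketch below uses a hypothetical countdown() generator and an events list, both introduced only for this illustration:

```python
events = []  # records which parts of the generator body have run

def countdown(start):
    events.append('body started')
    while start > 0:
        yield start
        start -= 1
    events.append('body finished')

gen = countdown(2)
events_after_call = list(events)  # still empty: the call only creates the object

first = next(gen)    # runs the body up to the first yield
second = next(gen)   # resumes after the yield and runs to the next one

try:
    next(gen)        # the body ends here, so StopIteration is raised
    exhausted = False
except StopIteration:
    exhausted = True

print(events_after_call, first, second, exhausted, events)
```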

Let’s compare the conventional way of implementing the Fibonacci function and the generator one.

def fibonacci_func(numbers):
    results = []
    current_number = 0
    x, y = 1, 1
    while current_number < numbers:
        current_element = x
        x, y = y, x+y
        current_number += 1
        results.append(current_element)
    return results

for fib_number in fibonacci_func(10000):
    print(fib_number)

This function returns a list of elements that we iterate over. A similar example with a generator looks as follows:

def fibonacci_gen(numbers):
    current_number = 0
    x, y = 1, 1
    while current_number < numbers:
        current_element = x
        x, y = y, x+y
        current_number += 1
        yield current_element

for fib_number in fibonacci_gen(10000):
    print(fib_number)

Both solutions produce exactly the same output, but they differ in memory usage. Like iterators, generators do not need to load the whole collection into memory. We can check this with the getsizeof() function:

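from sys import getsizeof
# Exact values depend on the Python version and platform.
print(getsizeof(fibonacci_gen(10000)))   # small: only the generator object
print(getsizeof(fibonacci_func(10000)))  # much larger: a list holding 10000 references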
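To see how the two approaches scale, we can measure them for different lengths. The snippet below is a self-contained sketch with a hypothetical count_up() generator. Note that getsizeof() reports only the size of the container itself, not of the numbers stored in it, so the real gap is even larger.

```python
from sys import getsizeof

def count_up(limit):
    # Hypothetical generator used only for this measurement
    current = 0
    while current < limit:
        yield current
        current += 1

# A generator object has the same size no matter how many values it will yield.
small_gen_size = getsizeof(count_up(10))
large_gen_size = getsizeof(count_up(10000))

# A list has to store a reference to every element, so it grows with the length.
small_list_size = getsizeof(list(count_up(10)))
large_list_size = getsizeof(list(count_up(10000)))

print(small_gen_size, large_gen_size)
print(small_list_size, large_list_size)
```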

As we can see, generators are very similar to iterators in usage. Both are a great option for working with vast collections or resources when we want to save memory. However, generators have some advantages over iterators:

-   Reduced boilerplate

-   Generators are simpler to create and more readable

Generators can also be used in the iterator protocol. Instead of implementing both the __iter__() and __next__() methods, we can make __iter__() a generator. Combining those two constructs, we can rewrite the Fibonacci class like this:

class Fibonacci:

    def __init__(self, numbers):
        self._numbers = numbers

    def __iter__(self):
        current_number = 0
        x, y = 1, 1
        while current_number < self._numbers:
            current_element = x
            x, y = y, x+y
            current_number += 1
            yield current_element

for fib_number in Fibonacci(10):
    print(fib_number)

In conclusion, it is a good practice to use generators whenever we want to loop over a vast number of elements or resources and save memory. Moreover, we should use generators instead of iterators to reduce boilerplate if we do not need to create a whole class. Otherwise we can combine iterators and generators.

Comprehensions

A comprehension is another Python construct that allows us to create collections in a more concise way. It follows one of the Python design principles: “Flat is better than nested”. We can use comprehensions for lists, sets and dictionaries. Their syntax closely mirrors that of a generator expression, with the results collected directly into the collection we want.

Generator expressions are slightly different from generators. They are usually one-line expressions with an implicit yield statement, e.g.

gen_expr = (x for x in [1, 2, 3])

The above line creates a generator object that yields every element of the provided collection. This generator can be consumed into a list in the following way:

list_ = list(gen_expr)
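One consequence worth keeping in mind is that a generator object can be consumed only once. A minimal sketch (the variable names are chosen just for this example):

```python
gen_expr = (x * x for x in [1, 2, 3])

first_pass = list(gen_expr)   # consumes every value the generator yields
second_pass = list(gen_expr)  # the generator is already exhausted

print(first_pass)
print(second_pass)
```

If the values are needed more than once, it is safer to keep the resulting list or to create a fresh generator expression.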

Now let’s consider the situation when we want to create a list of elements multiplied by 2 from another list. The conventional way of doing this would be as follows:

numbers = [1, 2, 3, 4, 5]
multiplied = []
for number in numbers:
    multiplied.append(number*2)

Comprehensions use the same syntax as generator expressions to create collections on the fly (list, dictionary etc.). The code snippet can be rewritten with this construct:

multiplied = [number*2 for number in numbers]

As we can see it is a great way to create collections in a very concise way.

Comprehensions are not limited to lists. We can create dictionaries in the same way, and tuples by passing a generator expression to tuple(). Let’s say we have a dictionary of country populations and we want to create another dictionary containing only the countries with a population of over 1 million.

country_population = {
    "Afghanistan": 22720000,
    "Albania": 3401200,
    "Andorra": 78000,
    "Luxembourg": 435700,
    "Montserrat": 11000,
    "United Kingdom": 59623400,
    "United States": 278357000,
    "Zimbabwe": 11669000
}

over_mln_population = {country: population for country, population in country_population.items() if population > 1000000}

Or create an immutable tuple with the names of these countries, e.g.:

over_mln_population = tuple(country for country, population in country_population.items() if population > 1000000)
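Sets, mentioned at the start of this section, follow the same pattern with curly braces and a single expression. A short sketch on made-up data:

```python
words = ['apple', 'avocado', 'banana', 'blueberry', 'cherry']

# Curly braces with a single expression (no 'key: value' pair) build a set,
# so duplicate first letters are stored only once.
first_letters = {word[0] for word in words}

print(sorted(first_letters))
```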

Comprehensions may also be used in a more complex way to combine multiple collections into one. Here’s an example:

characters_per_serie = [
    {
        'serie': 'How I Met Your Mother',
        'characters': ['Barney', 'Ted', 'Marshall', 'Robin', 'Lily']
    },
    {
        'serie': 'Friends',
        'characters': ['Ross', 'Rachel', 'Phoebe', 'Monica', 'Joey', 'Chandler']
    },
    {
        'serie': 'The Big Bang Theory',
        'characters': ['Sheldon', 'Penny', 'Leonard', 'Rajesh', 'Howard', 'Amy', 'Bernadette']
    }
]

actresses = {
    'Rachel': 'Jennifer Aniston',
    'Monica': 'Courteney Cox',
    'Phoebe': 'Lisa Kudrow',
    'Penny': 'Kaley Cuoco',
    'Bernadette': 'Melissa Rauch',
    'Amy': 'Mayim Bialik',
    'Robin': 'Cobie Smulders',
    'Lily': 'Alyson Hannigan'
}

actresses_per_serie = {actresses[character]: characters_per_serie_['serie'] 
                       for characters_per_serie_ in characters_per_serie 
                       for character in characters_per_serie_['characters'] 
                       if character in actresses.keys()}

As a result we get a dictionary where the keys are actresses and the values are the series they played in. Comprehensions give us the possibility to create any collection from another in a simple, concise and readable way with a significant reduction in code.
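To read a comprehension with several ‘for’ clauses, it helps to remember that the clauses are evaluated left to right, exactly like nested loops. The sketch below uses a shortened, illustrative version of the data and compares both forms:

```python
shows = [
    {'serie': 'Friends', 'characters': ['Ross', 'Rachel']},
    {'serie': 'The Big Bang Theory', 'characters': ['Sheldon', 'Penny']},
]
actresses = {'Rachel': 'Jennifer Aniston', 'Penny': 'Kaley Cuoco'}

# Nested-loop version: the outer 'for' comes first, the filter comes last.
by_loop = {}
for show in shows:
    for character in show['characters']:
        if character in actresses:
            by_loop[actresses[character]] = show['serie']

# Comprehension version: the same clauses in the same order.
by_comprehension = {actresses[character]: show['serie']
                    for show in shows
                    for character in show['characters']
                    if character in actresses}

print(by_comprehension)
```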

In conclusion, it is a good practice to use comprehensions to create collections of items on-the-fly to avoid nested blocks, reduce code volume and increase its readability.

Context managers

A context manager is another construct that allows us to write code in a safer and more readable way. Working with resources such as files is its most common use. Let’s consider the conventional way of opening and closing a file:

file = open('filepath', 'r')
file.close()

However, consider the following code:

file = open('filepath', 'r')
raise SomeException('Something went terribly wrong')
file.close()

As we can see, the file is never closed. To overcome this, the code can be wrapped in a try … finally block. However, a context manager can be used instead:

with open('filepath', 'r'):
    raise SomeException('Something went terribly wrong')

With this construct we are assured that the file is closed before the exception propagates and the program is interrupted.
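Both approaches can be compared side by side. The following sketch creates a temporary file so that it does not depend on any real path, and catches the exception only so that the example can keep running:

```python
import os
import tempfile

# Create a throwaway file for the demonstration.
fd, path = tempfile.mkstemp()
os.close(fd)

# Manual version: the 'finally' clause runs whether or not the body raises.
manual_file = open(path, 'r')
try:
    try:
        raise RuntimeError('Something went terribly wrong')
    finally:
        manual_file.close()
except RuntimeError:
    pass

# Context manager version: the file is closed when the block is left.
try:
    with open(path, 'r') as managed_file:
        raise RuntimeError('Something went terribly wrong')
except RuntimeError:
    pass

print(manual_file.closed, managed_file.closed)
os.remove(path)
```

The second version achieves the same guarantee with less nesting, which is the main reason to prefer it.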

-   Creating Context Managers

Context managers are not limited to resources. We can create custom ones, and this can be done in various ways. Consider an object that is expected to perform some action at the beginning and at the end of a particular operation.

class CtxMngr:

    def do_something(self, raise_exception):
        print('Before exception')
        if raise_exception:
            raise Exception()
        print('After exception')

    def __enter__(self):
        print('Enter')
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        print('Exit')


with CtxMngr() as ctx_mngr:
    ctx_mngr.do_something(raise_exception=True)

When we run the program, we can see that ‘After exception’ is not printed but ‘Exit’ is. This assures us that the exit action is performed before leaving the context manager block, even when an exception is raised.
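The three arguments of __exit__() describe the exception that interrupted the block, or are all None when the block finished normally. The sketch below uses a hypothetical Recorder class to show this. Returning a false value from __exit__() lets the exception propagate as usual.

```python
class Recorder:
    """Hypothetical context manager that remembers what __exit__ received."""

    def __init__(self):
        self.seen_type = None

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        self.seen_type = exc_type
        return False  # a false value means: do not suppress the exception

failing = Recorder()
propagated = False
try:
    with failing:
        raise ValueError('boom')
except ValueError:
    propagated = True

clean = Recorder()
with clean:
    pass

print(failing.seen_type, propagated, clean.seen_type)
```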

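Another way to create context managers is to use the contextlib module, which contains the contextmanager decorator. It should decorate a generator that has exactly one yield statement. Everything before the yield acts as the __enter__ block and everything after it is the equivalent of __exit__. An exception raised inside the with block is re-raised at the yield, so the yield has to be wrapped in try … finally for the exit code to run in that case. Here is an example of contextmanager decorator usage:

import contextlib

@contextlib.contextmanager
def ctx_mngr():
    print('Enter')
    try:
        yield
    finally:
        # Runs even when the with block raises an exception
        print('Exit')

with ctx_mngr():
    raise Exception()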
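The role of try … finally around the yield is easy to miss, so here is a small comparison. Both managers below are hypothetical and record their steps in a list instead of printing:

```python
import contextlib

events = []

@contextlib.contextmanager
def without_finally():
    events.append('enter A')
    yield
    events.append('exit A')  # skipped when the with block raises

@contextlib.contextmanager
def with_finally():
    events.append('enter B')
    try:
        yield
    finally:
        events.append('exit B')  # runs in every case

for manager in (without_finally, with_finally):
    try:
        with manager():
            raise RuntimeError('boom')
    except RuntimeError:
        pass  # caught only so that the example can continue

print(events)
```

Only the second manager performs its exit step, because the exception is re-raised inside the generator at the point of the yield.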

Context managers should be used whenever we want to be sure that something is always done at the beginning and at the end of an operation. They are invaluable when working with resources, as they ensure that unexpected behavior does not result in a resource leak.

Decorators

Decorators are another Python construct. They resemble the design pattern of the same name. A decorator is a function that extends the functionality of another function. Python has some useful built-in decorators such as @staticmethod, @classmethod and @property, but we can also create custom ones.

Let’s say we want to benchmark function execution:

import time

def print_fibonacci(length):
    start = time.time()
    for number in fibonacci_gen(length):
        print(number)
    end = time.time()
    print('Execution time: {}'.format(end - start))

print_fibonacci(5000)

Now imagine that we want to benchmark more than one function. The best way would be to put the timing code in a separate function and reuse it. Python allows us to pass a function as a parameter to another one, and a function can also be returned from another. With these features we can implement the decorator design pattern as follows:

def benchmark(func):
    def inner_function(*args, **kwargs):
        start = time.time()
        func(*args, **kwargs)
        end = time.time()
        print('Execution time: {}'.format(end - start))
    return inner_function

def print_fibonacci(length):
    for number in fibonacci_gen(length):
        print(number)

decorated_function = benchmark(print_fibonacci)
decorated_function(1000)

The functionality of the print_fibonacci() function was extended by benchmark(). However, Python has a special ‘@’ symbol that simplifies this:

def benchmark(func):
    def inner_function(*args, **kwargs):
        start = time.time()
        func(*args, **kwargs)
        end = time.time()
        print('Execution time: {}'.format(end - start))
    return inner_function

@benchmark
def print_fibonacci(length):
    for number in fibonacci_gen(length):
        print(number)

print_fibonacci(1000)

Both solutions do exactly the same, but we moved the decoration to where the function is defined rather than where it is called. With one symbol we are able to reuse the code wherever we want.
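The ‘@’ symbol is shorthand for reassigning the function name to the result of calling the decorator. A minimal sketch with a hypothetical shout() decorator makes the equivalence visible. Unlike the benchmark example, this wrapper also passes the return value of the wrapped function back to the caller:

```python
def shout(func):
    def inner_function(*args, **kwargs):
        # Call the original function and transform its result.
        return func(*args, **kwargs).upper()
    return inner_function

# Manual decoration: rebind the name to the wrapped version.
def greet_manual(name):
    return 'hello ' + name

greet_manual = shout(greet_manual)

# The same thing written with the '@' syntax.
@shout
def greet_decorated(name):
    return 'hello ' + name

print(greet_manual('bob'), greet_decorated('bob'))
```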

Decorators chain functions together. Knowing this, we can decorate a function with multiple decorators. Moreover, we can pass arguments to decorators:


def benchmark(func):
    def inner_function(*args, **kwargs):
        start = time.time()
        func(*args, **kwargs)
        end = time.time()
        print('Execution time: {}'.format(end - start))
    return inner_function

def check_input(input_type):
    def decorator(func):
        def inner_function(*args, **kwargs):
            if not all(isinstance(arg, input_type)
                       for arg in args + tuple(kwargs.values())):
                print('Input should be integer')
                return
            func(*args, **kwargs)
        return inner_function
    return decorator

@benchmark
@check_input(int)
def print_fibonacci(length): 
    """ Printing fibonacci numbers """
    for number in fibonacci_gen(length):
        print(number)

print_fibonacci(length='1')
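When decorators are stacked, the one closest to the function is applied first, so the topmost decorator ends up as the outermost wrapper. The following sketch uses a hypothetical tag() decorator that records the order of the calls:

```python
calls = []

def tag(label):
    def decorator(func):
        def inner_function(*args, **kwargs):
            calls.append('enter ' + label)
            result = func(*args, **kwargs)
            calls.append('exit ' + label)
            return result
        return inner_function
    return decorator

# Equivalent to: add = tag('outer')(tag('inner')(add))
@tag('outer')
@tag('inner')
def add(a, b):
    return a + b

total = add(1, 2)
print(total, calls)
```

In the article's example this means that benchmark() also measures the time spent in the input check.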

As we can see, decorators let us move reusable code to another place and extend the functionality of any function. But one thing needs to be remembered: when we decorate functions as above, we lose information about the original function. For example, if we call:

print(print_fibonacci.__name__)
print(print_fibonacci.__doc__)

we find that the name of the function is ‘inner_function’ and that the docstring is missing. This is unexpected but correct, because the name now refers to the wrapper. To keep the information about the original function we should always use the wraps decorator from the functools module.

from functools import wraps

def benchmark(func):
    @wraps(func)
    def inner_function(*args, **kwargs):
        start = time.time()
        func(*args, **kwargs)
        end = time.time()
        print('Execution time: {}'.format(end - start))
    return inner_function

def check_input(input_type):
    def decorator(func):
        @wraps(func)
        def inner_function(*args, **kwargs):
            if not all(isinstance(arg, input_type)
                       for arg in args + tuple(kwargs.values())):
                print('Input should be integer')
                return
            func(*args, **kwargs)
        return inner_function
    return decorator

Now we keep all the original information about wrapped functions.
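The difference can be checked with two small, hypothetical decorators that do nothing except forward the call, one with wraps and one without:

```python
from functools import wraps

def plain(func):
    def inner_function(*args, **kwargs):
        return func(*args, **kwargs)
    return inner_function

def preserving(func):
    @wraps(func)  # copies __name__, __doc__ and other metadata from func
    def inner_function(*args, **kwargs):
        return func(*args, **kwargs)
    return inner_function

@plain
def first():
    """First docstring"""

@preserving
def second():
    """Second docstring"""

print(first.__name__, first.__doc__)
print(second.__name__, second.__doc__)
```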

In conclusion, it is a good practice to use decorators when we want to separate reusable code or extend the functionality of a function without modifying it.

Summary

Python has many specific constructs that can make our code faster, more readable and simpler. Their usage depends on the goal we want to achieve, but they are definitely great and powerful tools for making our code better. Python is a simple and easy language, but how the code looks always depends on the programmer.

Szymon Piwowar
