Mykola Kharechko

Static Type Checking in Python

Optional static typing has been supported in Python since version 3.5, when PEP 484 -- Type Hints was approved and implemented. This PEP allows optional type declarations for the arguments and return values of functions and methods. Later, PEP 526 -- Syntax for Variable Annotations was implemented in Python 3.6 as a logical extension of this work; it added the ability to declare the type of a variable. Type annotations for variables and functions can be used by static code analyzers and IDEs. The interpreter does not check them at runtime, so they have no practical impact on runtime performance.
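As a minimal sketch of that last point (the function name greet is made up for illustration), the annotations below are stored on the function object but never enforced by the interpreter. Only a static checker such as mypy would flag the last call.

```python
def greet(name: str, times: int = 1) -> str:
    # The annotations are recorded as metadata; nothing checks them at runtime.
    return ", ".join(["Hello " + str(name)] * times)

# Tools can read the annotations from the function object.
print(greet.__annotations__)

# A wrong argument type still runs; a static checker would report it.
print(greet(42))   # Hello 42
```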

At the moment, mypy is the most popular static type checker for Python. Static typing is useful not only in big projects but also in scripts and pieces of code that look very simple at first glance. Types are like unit tests, except that they apply to your data. In this article, we will look at several examples of code where static typing helps avoid non-trivial bugs whose root cause takes a long time to find. I'll also describe the basic syntax of type annotations in Python.

Let's take a look at the following code snippet:

from pprint import pprint as pp

def db_get(_id):
    return {'name': 'Nick'} if _id == 10 else {'name': 'Unknown'}

def db_all():
    return [{'name': 'Nick'}, {'name': 'John'}]

def process_request(object_id=None):
    if object_id:
        data = db_get(object_id)
    else:
        data = db_all()
    return match_with_ids(data)

def match_with_ids(objects):
    ids = map(id, objects)
    return list(zip(ids, objects))

 

Let's test the code on Python 2.7. Called without arguments, it runs as expected:

>>> pp(process_request())
[(4480574360, {'name': 'Nick'}), (4480573520, {'name': 'John'})]

 

Now let's call it with an object ID. This is the actual result:

>>> pp(process_request(10))
[(4479487600, 'name')]

 

However, the expected result is as follows:

>>> pp(process_request(10))
[(4480574360, {'name': 'Nick'})]

 

This means our code has a bug. Why did it happen? The reason is that match_with_ids expects a list as its argument, not a dictionary. When db_get returns a single dict, map and zip iterate over its keys instead of over a list of records.
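The behaviour behind this bug can be reproduced in isolation. The sketch below uses a made-up variable named record and only built-in functions. It shows that iterating over a dict yields its keys, which is why the result contains the string 'name' instead of the whole record.

```python
record = {'name': 'Nick'}

# Iterating over a dict yields its keys, not the dict itself.
print(list(record))                  # ['name']

# zip() therefore pairs the id of each key with the key string.
pairs = list(zip(map(id, record), record))
print(pairs[0][1])                   # name

# When the dict is an element of a list, the dict itself is paired.
wrapped = list(zip(map(id, [record]), [record]))
print(wrapped[0][1])                 # {'name': 'Nick'}
```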

Let's track down the error. For this purpose, I add type annotations to the functions:

from pprint import pprint as pp
from typing import List, Optional, Tuple

def db_get(_id):
    # type: (...) -> dict
    return {'name': 'Nick'} if _id == 10 else {'name': 'Unknown'}

def db_all():
    # type: () -> List[dict]
    return [{'name': 'Nick'}, {'name': 'John'}]

def process_request(object_id = None # type: Optional[int]
                    ):
    if object_id:
        data = db_get(object_id)
    else:
        data = db_all()
    return match_with_ids(data)

def match_with_ids(objects):
    # type: (List) -> List[Tuple[int, dict]]
    ids = map(id, objects)
    return zip(ids, objects)

 

Now run the static analyzer before running the script:

$ mypy --py2 t.py
t.py:17: error: Incompatible types in assignment (expression has type "List[Dict[Any,
t.py:18: error: Argument 1 to "match_with_ids" has incompatible type "Dict[Any, Any]"

 

As you can see, mypy reports that match_with_ids accepts List[Any], not Dict[Any, Any]. This way, we find the error at the static checking stage, before the script is ever run. The correct version of process_request wraps the single result in a list:

def process_request(object_id = None # type: Optional[int]
                    ):
    if object_id:
        data = [db_get(object_id)]
    else:
        data = db_all()
    return match_with_ids(data)

 

With the changes above mypy runs without errors:

$ mypy --py2 t.py && echo $?
0

 

The next example shows how type annotations help when developing code that must be compatible with both major versions of Python, 2 and 3. Let's run the following code on Python 2:

(python-env-2.7.13)$ python
>>> from t import process_request
>>> process_request()[0]
(4339523664, {'name': 'Nick'})

 

And on Python 3:

(python-env-3.6.1)$ python
>>> from t import process_request
>>> process_request()[0]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'zip' object is not subscriptable

 

As you can see, code that looks valid at first glance crashes on Python 3. If you run the mypy static checker before running the script, you will see the following error:

(python-env-3.6.1)$ mypy t.py
t.py:23: error: Incompatible return value type (got "Iterator[Tuple[int, Any]]"

 

In most cases, such errors can be caught only by manual QA or numerous unit tests. The reason for this error is that the zip function changed in Python 3: it now returns an iterator instead of a list. Code that works on both Python versions looks like this:

def match_with_ids(objects):
    # type: (List) -> List[Tuple[int, dict]]
    ids = map(id, objects)
    return list(zip(ids, objects))

$ mypy t.py && echo $?
0
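The difference in zip can be seen in a few lines. This is a small sketch for Python 3 with made-up sample data. The first attempt fails just like the traceback above, and wrapping the call in list() makes indexing work again.

```python
ids = [1, 2]
names = ['Nick', 'John']

pairs = zip(ids, names)
# On Python 3, zip() returns a lazy iterator, so indexing raises TypeError.
try:
    pairs[0]
except TypeError as exc:
    print(exc)

# Materializing the iterator with list() restores indexing on both versions.
pairs_list = list(zip(ids, names))
print(pairs_list[0])   # (1, 'Nick')
```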

 

Syntax for Type Declarations

Python supports two forms of type declarations for variables, function arguments, and return values. The first form uses annotation syntax, and the second lets us specify type information in specially formatted comments.

Type annotation

Type annotations are based on PEP 3107 and PEP 526, with some extensions. You can specify basic types:

def some_func(arg1: int, arg2: bool) -> None:
    initialized_var: int = 5
    not_initialized_var: float

 

Or more complex types. To support them, Python provides the typing module, which contains building blocks such as Any, Union, Tuple, Callable, TypeVar, and Generic. Here is an example:

from typing import List

Vector = List[float]

def scale(scalar: float, vector: Vector) -> Vector:
    return [scalar * num for num in vector]
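Union and Callable from the list above can be combined in a similar way. In the following sketch, the names Number, apply_all, and double are invented for illustration. The alias says that a value may be either an int or a float, and the Callable annotation describes a function taking one such value and returning another.

```python
from typing import Callable, List, Union

# A value that may be either an int or a float.
Number = Union[int, float]

def apply_all(func: Callable[[Number], Number], values: List[Number]) -> List[Number]:
    # Call func on every element and collect the results.
    return [func(value) for value in values]

def double(x: Number) -> Number:
    return x * 2

print(apply_all(double, [1, 2.5]))   # [2, 5.0]
```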

 

You can also define your own types:

from typing import NewType

UserId = NewType('UserId', int)
some_id = UserId(524313)
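A type created with NewType exists mainly for the static checker. The sketch below adds a hypothetical load_user function to show this: at runtime a UserId is an ordinary int, while mypy treats it as a distinct type.

```python
from typing import NewType

UserId = NewType('UserId', int)

def load_user(user_id: UserId) -> str:
    # Hypothetical lookup; a real one would query some storage.
    return 'user-%d' % user_id

some_id = UserId(524313)
print(load_user(some_id))      # user-524313

# At runtime the value is a plain int.
print(type(some_id) is int)    # True

# load_user(524313) would also run, but mypy would reject it,
# because a bare int is not a UserId.
```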

 

In some situations, it is useful to declare that a function can return either a value or None. For such cases, use the Optional type:

from typing import Optional

def int_or_nothing(a: bool) -> Optional[int]:
    return 1 if a else None
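A caller of such a function has to handle the None case before using the value. Here is a short sketch with an invented describe function; it assumes mypy runs with strict Optional checking, in which case removing the None check would be reported as an error.

```python
from typing import Optional

def int_or_nothing(a: bool) -> Optional[int]:
    return 1 if a else None

def describe(flag: bool) -> str:
    value = int_or_nothing(flag)
    if value is None:
        # Without this check, a checker would complain about `value + 1`.
        return 'nothing'
    return str(value + 1)

print(describe(True))    # 2
print(describe(False))   # nothing
```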

 

Keep in mind that you cannot annotate a variable in the same function scope where it is declared global or nonlocal:

def f():
    global x: int  # SyntaxError

def g():
    x: int  # Also a SyntaxError
    global x

 

The reason is that global and nonlocal don't own variables; therefore, the type annotation belongs in the scope that owns the variable.

Specify types in comments

This approach is preferable if your code must also run on Python 2. Python 2 does not support annotation syntax, so type comments are the only way to specify type hints and use static type checking there.

There are two forms to specify type hints in the function declaration. The first one is the one-line declaration:

def func(arg1, arg_int, *args, **kw):
    # type: (List[str], int, *str, **bool) -> int
    """Docstring comes after type comment."""

 

The second one is the multi-line form, with one type comment per argument:

def func(arg1,  # type: int
         arg2   # type: List[int]
         ):
    # type: (...) -> List[float]
    ...

 

Variable type annotations look as follows:

my_var = 10 # type: int
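Variable type comments are most useful where the checker cannot infer a type on its own, for example for an empty container. The following sketch uses an invented count_names function and combines a function type comment with a variable one. Since the hints are ordinary comments, the code runs unchanged on Python 3; on Python 2 the typing module would have to be installed separately.

```python
from typing import Dict, List

def count_names(names):
    # type: (List[str]) -> Dict[str, int]
    counts = {}  # type: Dict[str, int]
    for name in names:
        # Count how many times each name occurs.
        counts[name] = counts.get(name, 0) + 1
    return counts

print(count_names(['Nick', 'John', 'Nick']))
```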

 

Conclusion

As you can see, there is nothing complex about type annotations in Python code. At the same time, they give you many of the benefits found in statically typed programming languages.
