Recent Python Performance Improvements

Recent Python Performance Improvements

Like many other Free and Open Source (FOSS) projects, the Python programming language is continuously updated with new releases as security vulnerabilities are patched and bugs are fixed.   Still other updates allow Python code to run faster or use less memory.

Enumerating the Contents of a Directory Gets Much Faster

One example of a recent Python change which both reduces memory usage and speeds up code is the addition of the new scandir() function to the os module in Python 3.5.

The os.scandir() function gathers information about all the files or subfolders within a folder and returns that information as an iterable.  Iterables are very memory-efficient  because they yield objects containing data one at a time as they are needed by a program, and thus memory is consumed only for the object which the code is currently manipulating.  Python lists, however, consume an amount of memory equal to the sum of all the items in that list.  For a lot code, this is unnecessary since lists are often just iterated over one by one anyway.  If a Python program only uses one object at a time, it is a waste of precious high-speed RAM to load every object into memory all at once by creating a list.  The scandir()'s iterable approach is thus a large improvement over the pre-existing os.listdir() function, which returns all the file and subfolder names all at once in one large list.  When listing the contents of large folders containing hundreds or thousands of items (like a shared network drive drive on a large company's LAN), the memory saving advantages of os.scandir() can even make the difference between whether or not a program crashes before it finishes.

Besides the lower memory usage, scandir() also brings a dramatic speed improvement to real-world code because scandir() preserves the additional information from the underlying platform-specific system calls listdir() always discarded.  One example is whether each item is either a subdirectory or a file.  Because listdir() only passed along the actual names, Python developers had use additional functions (with their additional system calls) to retrieve this information which listdir() could have kept, but didn't.  Skipping these redundant system calls saves a lot of CPU time, especially when recursively listing a folder's contents.

This brings us to the os.walk() function, which does exactly that.  Previously, os.walk() used the os.listdir() function, but now os.walk() uses the new os.scandir() function.  This means Python developers whose existing code called os.walk() will see the their code speed up automatically after updating to version 3.5 or above.  One benchmark shows this new code show running 36 times faster on Windows 7 64-bit on an NFS share while Mac OS X saw a factor of 3.8 improvement.

Some Speed Boosts for 2.X

Python 3 isn't alone in receving tune-ups.  The latest release candidate of the 2 series (currently 2.7 release candidate 1) includes a performance boost to the Python interpreter which was implemented in the 3.X series 5 years ago.  The benchmarks posted by the core developers of Python show an average speed increase of about 10%  This optimization requires that a specific feature (called “computed gotos”) be enabled in the compiler which compiled the Python interpreter, which practically speaking, means just GCC.

The “computed goto” improvement is an example of a backported feature into the 2.X series from the 3.X.  Not all Python 3 performance improvements, however will be backported to the 2.X series.  One example of a fix that is 3.X only is the range() function: in 3.X it returns an iterable, but in 2.X, range() returns a list.  Remember from our above discussion of os.scandir() vs. os.listdir() the difference between iterables and lists when it comes to memory usage.  This improvement won't be backported for many reasons, backwards compatibility probably being the most important.  Since in Python 2.X range() returns a list, any programs which apply methods available only for lists (like sort()) to the output of range(), would stop with an error if range() returned a list instead.  Python 3 was created specifically for these type of necessary, but compatibility-breaking changes.

Like all improvements to Python, performance gains are expected when an inherently well-designed language with robust open governance is coupled with a worldwide team of developers and users.

Copyright © Python People