Make your output Prettier Using PPrint

This article was published as a part of the Data Science Blogathon.

Introduction

Your Python output can be much prettier and better presented if you put some work into it. Proper print statements also help a great deal while debugging your code. Every programming language supports output; it is one of the fundamentals. For example, what program do you write first when learning a new programming language? Yes, a “Hello World!” program, which outputs those words to the screen. That is how important output is: it is the first thing taught to students.

However, sometimes these outputs can be really messy and confusing when you are working with a huge amount of data. For example:

1) A dictionary with all the key-value pairs printed on a single line.

2) A nested list printed on a single line, with so many brackets that you can’t make head or tail of it.

3) Dictionaries with deeply nested objects, where you can’t focus on the main parts.

If you’ve read my previous blogs, you know that I usually write about lesser-known Python libraries that I find very useful and would suggest you start using too. This article is about a Python module called pprint (Data Pretty Printer). It is most useful when you are working with data loaded from JSON files, or with dictionaries.

pprint is part of Python’s standard library, so you don’t have to install anything separately. As I mentioned earlier, print statements are the most rudimentary method of debugging, especially in interactive notebooks such as Jupyter Notebooks, where the output is all you get after running a cell.

 

The pprint()

pprint stands for “Pretty Print”. It is a module in Python’s standard library that lets you customize your output through the parameters of its pprint() function and its PrettyPrinter class. The official documentation lists all of its properties and usage.

The module’s main class is PrettyPrinter, and the pprint() function accepts the same formatting options. This article covers six of these parameters. Here is a short description of each, along with its default value:

1) indent: The number of spaces of indentation added for each nesting level. This value can help when specific formatting is needed. Default value = 1

2) width: The maximum number of characters allowed on a single line. If a structure does not fit within this limit, it is wrapped onto the following lines on a best-effort basis. Default value = 80

3) depth: The number of nesting levels to show for nested data types. By default, all the data is shown; if a depth is specified, anything below that level is replaced by an ellipsis (...). Default value = None

4) stream: The output stream to write to, which can be any file-like object with a write() method, such as an open file. If it is None, sys.stdout is used. Default value = None

5) compact: A boolean argument that controls how long sequences (lists, tuples, sets, and so on) are formatted. If set to True, as many items as fit within the specified width are placed on each line. With the default (i.e. False), each item of a sequence that does not fit on one line is placed on its own line. Default value = False

6) sort_dicts: Also a boolean argument, available from Python 3.8. When True, pprint() prints the key-value pairs of a dictionary sorted by key. When set to False, the key-value pairs are displayed in their order of insertion. Default value = True

Now enough with the technical stuff, let’s jump into the programming part!

 

Basics of pprint()

First, we import the pprint module at the beginning of our notebook.

import pprint

Now you can either use the pprint() function or create your own printer object with PrettyPrinter().

pprint.pprint("Hello World!")
> 'Hello World!'
my_printer = pprint.PrettyPrinter()
my_printer.pprint("Hello Pretty Printer")
> 'Hello Pretty Printer'
print(type(my_printer))
> <class 'pprint.PrettyPrinter'>

Now let us create a sample dictionary to demonstrate some of the arguments of pprint().

sample_dict = {
    'name': 'Sion',
    'age': 21,
    'message': 'Thank you for reading this article!',
    'topic':'Python Libraries'
}

Now, if we simply print out this dictionary using print, what we get is:

{'name': 'Sion', 'age': 21, 'message': 'Thank you for reading this article!', 'topic': 'Python Libraries'}

That doesn’t look very appealing, does it? One might argue that this output format is okay, since you can clearly see which value belongs to which key. But what happens if the values are extremely long and nested, or if there are many more key-value pairs? That’s when it all goes downhill and the output becomes very difficult to read. Worry not, pprint to the rescue:

pprint.pprint(sample_dict)

First, each pair gets its own row, which makes the output far more readable. Also, if you look closely, the elements are automatically sorted by key.
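To make the row-per-pair layout and the key sorting concrete, here is a minimal sketch. It uses pprint.pformat(), which returns the formatted text as a string instead of printing it, and it assumes Python 3.8 or later for the sort_dicts argument. The variable names sorted_text and insertion_text are only illustrative.

```python
import pprint

sample_dict = {
    'name': 'Sion',
    'age': 21,
    'message': 'Thank you for reading this article!',
    'topic': 'Python Libraries',
}

# pformat() returns the same text that pprint() would print.
sorted_text = pprint.pformat(sample_dict)

# sort_dicts=False keeps the insertion order (Python 3.8+).
insertion_text = pprint.pformat(sample_dict, sort_dicts=False)

print(sorted_text)
# {'age': 21,
#  'message': 'Thank you for reading this article!',
#  'name': 'Sion',
#  'topic': 'Python Libraries'}

print(insertion_text)
# {'name': 'Sion',
#  'age': 21,
#  'message': 'Thank you for reading this article!',
#  'topic': 'Python Libraries'}
```

Keeping the insertion order can be handy when the order of the keys itself carries meaning, for example when the dictionary mirrors the layout of a config file.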

Text wrapping

I guess most people might know the basics that I showed above. Now let’s use some of the other parameters to further customize our outputs.

Another basic use is text wrapping. Suppose you are not satisfied with just printing the key-value pairs on separate lines, but also want long text to be wrapped when a line exceeds a certain length. For this, we can use the width parameter.

pprint.pprint(sample_dict, width=30)

Apart from this, we can use the indent parameter to add indentation in front of each row for better readability.

pprint.pprint(sample_dict, width=30, indent=10)
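The effect of width and indent can also be checked programmatically. The sketch below is only an illustration: it formats the same dictionary with pprint.pformat() and compares line counts instead of relying on an exact layout, since width is a best-effort limit. The names default_text, narrow_text, indented_text and round_trip are made up for this example.

```python
import ast
import pprint

sample_dict = {
    'name': 'Sion',
    'age': 21,
    'message': 'Thank you for reading this article!',
    'topic': 'Python Libraries',
}

default_text = pprint.pformat(sample_dict)                         # width=80
narrow_text = pprint.pformat(sample_dict, width=30)                # wraps long strings
indented_text = pprint.pformat(sample_dict, width=30, indent=10)   # plus 10-space indent

# The narrow version needs more lines because the long 'message'
# string is split into several adjacent string literals.
print(len(default_text.splitlines()), len(narrow_text.splitlines()))

# Adjacent string literals are joined again when the text is parsed,
# so the wrapped output still describes the same dictionary.
round_trip = ast.literal_eval(narrow_text)
print(round_trip == sample_dict)  # True

print(indented_text)
```

Because the wrapped text is still a valid Python literal, wrapping changes only how the data looks, not what it contains.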

Here is an example of using the indent, width, and compact parameters with PrettyPrinter():

import pprint
stuff = ['spam', 'eggs', 'lumberjack', 'knights', 'ni']
stuff.insert(0, stuff[:])  # put a copy of the list at the front to create nesting
pp = pprint.PrettyPrinter(indent=4)  # 4 spaces per nesting level
pp.pprint(stuff)
pp = pprint.PrettyPrinter(width=41, compact=True)  # pack items into lines of about 41 characters
pp.pprint(stuff)
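To isolate what compact does, the following sketch formats one list twice with the same width and changes only the compact flag. The list of numbers and the names one_per_line and packed are assumptions made for illustration; they are not part of the example above.

```python
import pprint

numbers = list(range(30))

# compact=False (the default): once the list no longer fits within
# `width`, every item is placed on its own line.
one_per_line = pprint.pformat(numbers, width=40)

# compact=True: as many items as fit within `width` share each line.
packed = pprint.pformat(numbers, width=40, compact=True)

print(len(one_per_line.splitlines()))  # 30
print(len(packed.splitlines()))        # far fewer lines
print(packed)
```

For long, flat sequences, compact=True usually gives a much shorter output, while the default is easier to scan when each item deserves its own line.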

 

Deep nested objects

Sometimes, while working with highly nested objects, we want to view just the outer values and are not interested in the deeper levels. For example, if we have a nested tuple like this:

sample_tuple = ('spam', ('eggs', ('lumberjack', ('knights', ('ni', ('dead',
('parrot', ('fresh fruit',))))))))

Now if we use print or pprint, the outputs will be fairly similar:

print(sample_tuple)
> ('spam', ('eggs', ('lumberjack', ('knights', ('ni', ('dead', ('parrot', ('fresh fruit',))))))))
pp.pprint(sample_tuple)

However, if the depth parameter is specified, anything deeper than that will be truncated:

pprint.pprint(sample_tuple, depth=2)
> ('spam', ('eggs', (...)))
pprint.pprint(sample_tuple, depth=1)
> ('spam', (...))
p = pprint.PrettyPrinter(depth=6)
p.pprint(sample_tuple)
> ('spam', ('eggs', ('lumberjack', ('knights', ('ni', ('dead', (...)))))))
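The same idea applies to nested dictionaries, such as data loaded from JSON. The sketch below uses a made-up JSON payload (the field names and values are purely illustrative) and shows how depth lets you inspect the top-level shape first and then go one level deeper.

```python
import json
import pprint

# Made-up JSON payload, used only to illustrate the depth parameter.
raw = """
{
  "user": {
    "name": "Sion",
    "address": {"city": "Example City", "geo": {"lat": 1.0, "lon": 2.0}}
  },
  "active": true
}
"""
record = json.loads(raw)

# depth=1 shows only the top-level keys; nested containers become {...}.
top_level = pprint.pformat(record, depth=1)

# depth=2 reveals one more level of nesting.
two_levels = pprint.pformat(record, depth=2)

print(top_level)   # {'active': True, 'user': {...}}
print(two_levels)  # {'active': True, 'user': {'address': {...}, 'name': 'Sion'}}
```

Starting with a small depth and increasing it step by step is one way to explore an unfamiliar JSON document without flooding the screen.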

pprint() vs PrettyPrinter()

The difference between the two is that the pprint() function uses the module’s default settings. You can override them per call, as we saw above, but those overrides apply only to that single call.

With PrettyPrinter(), you create an object with your own settings that override the defaults, and the object keeps those settings wherever you use it in your project.

import pprint
coordinates = [
   {
       "name": "Location 1",
       "gps": (29.008966, 111.573724)
   },
   {
       "name": "Location 2",
       "gps": (40.1632626, 44.2935926)
   },
   {
       "name": "Location 3",
       "gps": (29.476705, 121.869339)
   }
]
pprint.pprint(coordinates, depth=1)
> [{...}, {...}, {...}]
pprint.pprint(coordinates)
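> [{'gps': (29.008966, 111.573724), 'name': 'Location 1'},
>  {'gps': (40.1632626, 44.2935926), 'name': 'Location 2'},
>  {'gps': (29.476705, 121.869339), 'name': 'Location 3'}]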

As you can see, the arguments supplied applied only to that one call. In contrast, a PrettyPrinter() object stores its settings, so you can use it anywhere in your code and get the same formatting every time:

import pprint
my_printer = pprint.PrettyPrinter(depth=1)
coordinates = [
   {
       "name": "Location 1",
       "gps": (29.008966, 111.573724)
   },
   {
       "name": "Location 2",
       "gps": (40.1632626, 44.2935926)
   },
   {
       "name": "Location 3",
       "gps": (29.476705, 121.869339)
   }
]
my_printer.pprint(coordinates)
> [{...}, {...}, {...}]
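A stored printer can also carry a stream setting, which the examples so far have not used. The sketch below is an assumption-laden illustration: io.StringIO stands in for any writable text stream (such as an open log file), and the settings dictionary is hypothetical. One configured printer is then reused for two different objects.

```python
import io
import pprint

# io.StringIO stands in for any writable text stream, e.g. an open file.
buffer = io.StringIO()

# One printer, configured once: shallow output, written to `buffer`.
shallow_printer = pprint.PrettyPrinter(depth=1, stream=buffer)

coordinates = [
    {"name": "Location 1", "gps": (29.008966, 111.573724)},
    {"name": "Location 2", "gps": (40.1632626, 44.2935926)},
]
settings = {"units": {"distance": "km"}, "verbose": False}  # hypothetical data

# Both calls reuse the stored depth and stream settings.
shallow_printer.pprint(coordinates)
shallow_printer.pprint(settings)

print(buffer.getvalue())
# [{...}, {...}]
# {'units': {...}, 'verbose': False}
```

Configuring the printer once and passing it around keeps the formatting consistent without repeating the same keyword arguments in every call.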

References

1) Pprint Library

2) How to Pretty Print in Python | Jonathan Hsu

Be sure to check out the official documentation for further in-depth reading. It describes several shortcut functions that will help you customize your output further. And if you like what you’re reading and want to check out more of my content, you can do so here. I hope this article helps you become a much more awesome Data Scientist. Cheers!!
