Data visualization is an important step in data analysis and data science. Its main goal is to communicate the results of an analysis clearly and effectively through graphical means. In this sense, it is both an art and a science.
Python has some of the most powerful data visualization libraries, and Matplotlib is one of them. It is a massive library, which is why working with it can be frustrating at times. Once mastered, though, it is one of the most versatile data visualization libraries available.
The aim of this post is to introduce you to the basic concepts of Matplotlib so that you become self-sufficient in picking up its advanced features from the documentation.
Anatomy of a Matplotlib figure
The figure below shows the names of various elements of a Matplotlib figure.
I'll describe some of the important terms below:
Axes: a sub-section of the figure, the area that contains the plot. It holds most of the figure elements, such as Axis, Ticks, Line2D, Text and Polygon.
A Matplotlib figure can have multiple axes.
Title: the name given to a Matplotlib figure or an axes. If a figure has multiple axes, each axes can have its own title.
Axis: a number line that represents the scale of the graph being plotted. It is the element that carries the ticks and tick labels.
Don't confuse Axis with Axes.
Labels: the names given to various elements of a figure, such as axis labels (x-axis label, y-axis label), tick labels and graph labels.
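As a rough sketch of how these terms map onto Matplotlib objects, the snippet below builds one figure with two axes, gives the figure and each axes its own title, and labels the axes. The data values, title strings and variable names are made up for illustration, and the Agg backend line is only an assumption so that the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove this line to open a window
import matplotlib.pyplot as plt

# One figure containing two axes, side by side.
fig, (ax_left, ax_right) = plt.subplots(1, 2, figsize=(8, 3))

# A title for the whole figure, and a separate title for each axes.
fig.suptitle("One figure, two axes")
ax_left.set_title("Left axes")
ax_right.set_title("Right axes")

# Illustrative data only.
ax_left.plot([1, 2, 3], [1, 4, 9])
ax_right.plot([1, 2, 3], [9, 4, 1])

# Each axes owns two Axis objects (note the spelling): xaxis and yaxis.
for ax in (ax_left, ax_right):
    ax.set_xlabel("x value")  # label of the x-axis
    ax.set_ylabel("y value")  # label of the y-axis

print(len(fig.axes))                 # how many axes the figure holds
print(type(ax_left.xaxis).__name__)  # class of the Axis object behind the x-axis
```

Printing the class name of `ax_left.xaxis` is a quick way to see that an Axis is a separate object living inside an Axes.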
Spines, Grids and Ticks:
Spines: the lines that form the boundaries of the plot area.
Grids: the lines that divide the plot area, which make values easier to read off.
Ticks: the marks that indicate the major and minor divisions along an axis.
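The following is a small sketch, not a complete recipe, of how spines, grids and ticks can be adjusted on a single axes. The data and the chosen tick positions are arbitrary, and the Agg backend line is an assumption so that the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove this line to open a window
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([0, 1, 2, 3], [0, 1, 4, 9])  # illustrative data only

# Spines: the four border lines around the plot area.
ax.spines["top"].set_visible(False)
ax.spines["right"].set_visible(False)

# Ticks: major ticks at chosen positions, plus unlabeled minor ticks.
ax.set_xticks([0, 1, 2, 3])
ax.minorticks_on()

# Grid: lines drawn across the plot area at the major tick positions.
ax.grid(True, which="major", linestyle=":")

print(sorted(ax.spines.keys()))  # names of the available spines
print(list(ax.get_xticks()))     # positions of the major x ticks
```

Hiding the top and right spines is a common styling choice; the remaining left and bottom spines still carry the ticks.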
So far, so good...
The aim of this post is also to show you how to plot a Matplotlib figure using Python. For this we're going to use a tool that lets you work with the Python code from this tutorial in an interactive environment.
We are going to plot the famous supply and demand diagram used in economics.
The example code that generates the diagram, together with its explanation, will introduce you to the various elements of a Matplotlib figure and show you the Python commands used to create it.
Feel free to play with the code and run it again to see the results.
Note that I'm showing the plot at various points in the script so that you can see the effect of each statement.
Matplotlib Example Code:
```python
import time
import matplotlib as mpl
import matplotlib.pyplot as plt

plt.title("Supply Demand Diagram")
plt.show()
time.sleep(2)

plt.xlabel("Quantity")
plt.show()
time.sleep(2)

plt.ylabel("Price")
plt.show()
time.sleep(2)

plt.plot([1,2],[1,2], label='Supply')
plt.show()
time.sleep(2)

plt.plot([1,2],[2,1], label='Demand')
plt.show()
time.sleep(2)

plt.legend(loc='upper left')
plt.show()
time.sleep(2)

plt.vlines(1.5, 1.0, 1.5, linestyles='dashed')
plt.show()
time.sleep(2)

plt.annotate('Q*', (1.5,1.0), textcoords="offset points", xytext=(0,-10), ha='center')
plt.show()
time.sleep(2)

plt.hlines(1.5, 1.0, 1.5, linestyles='dashed')
plt.show()
time.sleep(2)

plt.annotate('P*',(1.0,1.5),textcoords="offset points", xytext=(-15,0), va='center')
plt.show()
time.sleep(2)
```
The first three lines import the necessary libraries in the interactive Python 3 environment.
matplotlib.pyplot is imported as plt so that it can be easily accessed with the plt alias in the script.
plt.show() method displays the Matplotlib figure in the output.
time.sleep() method suspends the execution of the code for the given number of seconds (two in this script).
plt.title() method adds the title "Supply Demand Diagram" to the figure.
plt.xlabel() and plt.ylabel() methods add the labels "Quantity" and "Price" to the x-axis and y-axis respectively.
plt.plot() method draws the two lines, one for Supply and one for Demand.
The downward sloping line is the Demand plot and the upward sloping line is the Supply plot.
Essentially, the plot method takes two lists that are used to plot a line graph. In this example, we provided two lists of two values each, [x1, x2] and [y1, y2], so it plots the line connecting the points (x1, y1) and (x2, y2). We then used the plt.plot() method again to plot another line with a different set of points as arguments.
We also gave each of the Supply and Demand plots a label, which is used to identify the lines in the legend.
plt.legend() method adds a legend to the figure, using the Supply and Demand plot labels.
plt.vlines() and plt.hlines() methods draw the dashed vertical and horizontal lines from the equilibrium point (the intersection of the Supply and Demand plots) to the x-axis and y-axis respectively.
plt.annotate() method is used to add the text labels for the Equilibrium Price (P*) and the Quantity (Q*). I will leave the explanations of the parameters of the annotate method for you to figure out. Feel free to change the parameters and run the code again to view the updated output.
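To make the arguments in the walk-through more concrete, here is a sketch that draws the same diagram in one pass instead of showing it after every statement. The helper name `draw_supply_demand` and the variables `q_star` and `p_star` are my own and purely illustrative; the coordinates come from the example code. The Agg backend line is an assumption so that the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove this line to open a window
import matplotlib.pyplot as plt


def draw_supply_demand():
    """Draw the supply and demand diagram and return the figure (hypothetical helper)."""
    fig = plt.figure()
    plt.title("Supply Demand Diagram")
    plt.xlabel("Quantity")
    plt.ylabel("Price")

    # plot(xs, ys) connects the point (xs[0], ys[0]) to the point (xs[1], ys[1]).
    plt.plot([1, 2], [1, 2], label="Supply")  # upward sloping
    plt.plot([1, 2], [2, 1], label="Demand")  # downward sloping
    plt.legend(loc="upper left")              # built from the labels passed to plot()

    # The two lines cross at the equilibrium point (1.5, 1.5).
    q_star, p_star = 1.5, 1.5

    # vlines(x, ymin, ymax): vertical segment at x, running from ymin to ymax.
    plt.vlines(q_star, 1.0, p_star, linestyles="dashed")
    # hlines(y, xmin, xmax): horizontal segment at y, running from xmin to xmax.
    plt.hlines(p_star, 1.0, q_star, linestyles="dashed")

    # annotate(text, xy, ...): xy is the anchor point in data coordinates, and
    # xytext is an offset from it in points because textcoords="offset points".
    plt.annotate("Q*", (q_star, 1.0), textcoords="offset points",
                 xytext=(0, -10), ha="center")
    plt.annotate("P*", (1.0, p_star), textcoords="offset points",
                 xytext=(-15, 0), va="center")
    return fig


fig = draw_supply_demand()
# Labels of the two plotted lines, in the order they were drawn.
print([line.get_label() for line in fig.axes[0].get_lines()])
```

To view the result, call `fig.savefig("supply_demand.png")` (the filename is arbitrary), or remove the backend line and call `plt.show()` once at the end.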
If you have gone through the post carefully, you should now have a fair understanding of how to import the Matplotlib library in Python, add the various elements to a Matplotlib figure and display the figure as the output.
As mentioned earlier, Matplotlib is a huge library, and only practice will help you understand the details.
Once again, I would urge you to play with the code. Add or remove statements and run the code again to see the effect of each one.
In later posts, I will try to cover some more advanced visualization with Matplotlib.