In this post, you will learn how and when to use the pandas.melt() function of the pandas library in Python.
When to use
Very often, you will come across a DataFrame in which the values are pivoted with respect to certain variables, and you want to "unpivot" it from a wide format (many separate columns) to a long format (fewer columns, many more rows).
For example, consider this data set: https://data.gov.in/resources/number-visitors-centrally-protected-ticketed-monuments-during-fy-2016-17-and-fy-2017-18
It contains the number of visitors to centrally protected ticketed monuments in India during FY 2016-17 and FY 2017-18.
But the values [number of visitors] are pivoted with respect to the variables [a combination of the type of visitor (Foreign/Domestic) and the period of visit (FY 2016-17/FY 2017-18)].
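You want to "unpivot" this DataFrame so that each row holds a single visitor count, with the visitor type and period recorded in a column of their own instead of in the column headers.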
How to use
To unpivot the DataFrame from a wide format to a long format, you use the pandas.melt() function.
Melting is a transformation process that is used to reshape your DataFrame columns into rows of data.
pandas.melt() accepts several arguments, but the two most important ones to keep in mind are:
id_vars: specifies the columns that you do not want to melt; in other words, the columns you want to keep as separate identifier columns
value_vars: specifies the columns that you do want to melt; in other words, the columns whose values you want to combine into a single column with many rows
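To see how these two arguments work together, here is a minimal sketch on a small made-up DataFrame. The column names (city, 2016, 2017) and the numbers are hypothetical and are not taken from the monuments data set.

```python
import pandas as pd

# Hypothetical wide-format data: one row per city, one column per year.
wide_df = pd.DataFrame({
    "city": ["A", "B"],
    "2016": [10, 20],
    "2017": [30, 40],
})

# Keep "city" as an identifier column and stack the two year columns into rows.
long_df = pd.melt(wide_df, id_vars=["city"], value_vars=["2016", "2017"])

print(long_df)
```

The result has four rows (two cities times two melted columns) and three columns: city, plus the default variable column (the former column name) and value column (the number that was stored under it).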
Step by Step Explanation:
Step 1: Import the csv file and print the head of the DataFrame
import pandas as pd
df = pd.read_csv("https://docs.google.com/spreadsheets/d/e/2PACX-1vT-2lDZrGbeDKc_HKDmpOqINLFUYyEEiyQlcBXcppS2UxxB1IvNXFLclJHb6PHW0LZQzlsz8ZvvBTbB/pub?gid=0&single=true&output=csv")
df.head()
Step 2: Melt the DataFrame and print the head of the reshaped DataFrame
melted_df = pd.melt(df,
                    id_vars=['Circle', 'Name of the Monument'],
                    value_vars=['Domestic - 2016-17', 'Foreign - 2016-17',
                                'Domestic - 2017-18', 'Foreign - 2017-18'])
melted_df.head()
Note the parameters provided to the pandas.melt() function:
df : the DataFrame to be melted
id_vars=['Circle', 'Name of the Monument'] : columns that you do not want to melt
value_vars=['Domestic - 2016-17', 'Foreign - 2016-17', 'Domestic - 2017-18', 'Foreign - 2017-18'] : columns that you do want to melt
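As an optional follow-up, the sketch below shows two things you may want after melting: naming the two new columns with the var_name and value_name arguments of pandas.melt(), and splitting the combined label into visitor type and period. It runs on a stand-in DataFrame that reuses the column names from this post; the rows, the numbers, and the new column names ("Visitor Type and Period", "Number of Visitors", "Visitor Type", "Period") are made up for illustration.

```python
import pandas as pd

# Stand-in for the monuments data: same column names as in the post,
# but the rows and numbers below are invented for illustration only.
df = pd.DataFrame({
    "Circle": ["Circle X", "Circle Y"],
    "Name of the Monument": ["Monument 1", "Monument 2"],
    "Domestic - 2016-17": [100, 200],
    "Foreign - 2016-17": [10, 20],
    "Domestic - 2017-18": [110, 210],
    "Foreign - 2017-18": [11, 21],
})

value_columns = ["Domestic - 2016-17", "Foreign - 2016-17",
                 "Domestic - 2017-18", "Foreign - 2017-18"]

melted_df = pd.melt(
    df,
    id_vars=["Circle", "Name of the Monument"],
    value_vars=value_columns,
    var_name="Visitor Type and Period",  # replaces the default name "variable"
    value_name="Number of Visitors",     # replaces the default name "value"
)

# Every original row produces one row per melted column.
assert len(melted_df) == len(df) * len(value_columns)

# Split labels such as "Domestic - 2016-17" on " - " into two new columns.
melted_df[["Visitor Type", "Period"]] = (
    melted_df["Visitor Type and Period"].str.split(" - ", expand=True)
)

print(melted_df.head())
```

Splitting on " - " (with the surrounding spaces) leaves the hyphen inside "2016-17" untouched, so each label yields exactly two parts.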
I hope you now have a clear understanding of when and how to use the pandas.melt() function to reshape your DataFrame.
In later posts of the Pandas Tutorial Series, I will explain other useful pandas methods that will make your data science job easier.