Introduction to structured multi-plot grids


The FacetGrid class is useful when you want to visualize the distribution of a variable or the relationship between multiple variables separately within subsets of your dataset. A FacetGrid can be drawn with up to three dimensions: row, col, and hue. The first two have obvious correspondence with the resulting array of axes. 

It can also represent levels of a third variable with the hue parameter, which plots different subsets of data in different colors. This uses color to resolve elements on a third dimension, but it only draws the subsets on top of each other and does not tailor the hue parameter to the specific visualization the way that axes-level functions that accept hue do.

This class maps a dataset into multiple axes arrayed in a grid of rows and columns that correspond to levels of variables in the dataset. The plots it produces are often called “lattice”, “trellis”, or “small-multiple” graphics. 

The basic workflow is to initialize the FacetGrid object with the dataset and the variables that are used to structure the grid. Then one or more plotting functions can be applied to each subset by calling FacetGrid.map() or FacetGrid.map_dataframe().  

Finally, the plot can be tweaked with other methods to do things like change the axis labels, use different ticks, or add a legend. See the detailed code examples below for more information. We will use the tips dataset for this example.

import seaborn as sns
tips = sns.load_dataset("tips")
tips.head()

Output:  

   total_bill   tip     sex smoker  day    time  size
0       16.99  1.01  Female     No  Sun  Dinner     2
1       10.34  1.66    Male     No  Sun  Dinner     3
2       21.01  3.50    Male     No  Sun  Dinner     3
3       23.68  3.31    Male     No  Sun  Dinner     2
4       24.59  3.61  Female     No  Sun  Dinner     4

sns.FacetGrid(tips)  

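Output: a figure with a single empty set of axes, because initializing the grid sets up the matplotlib figure and axes but does not draw anything on them.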

To draw a plot on every facet, pass a function and the name of one or more columns in the dataframe to FacetGrid.map():

g = sns.FacetGrid(tips, col="time", row="sex")
g.map(sns.scatterplot, "total_bill", "tip")

The variable specification in FacetGrid.map() requires a positional argument mapping, but if the function has a data parameter and accepts named variable assignments, you can also use FacetGrid.map_dataframe().

One difference between the two methods is that FacetGrid.map_dataframe() does not add axis labels. 

g = sns.FacetGrid(tips, col="time", row="sex")
g.map_dataframe(sns.histplot, x="total_bill")
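Whichever mapping method you use, the labels can be set explicitly afterwards. The following is a minimal sketch rather than a definitive recipe: it uses a small hypothetical stand-in for the tips data (so that it runs without downloading anything) and the FacetGrid.set_axis_labels() and FacetGrid.set_titles() methods to control the labels and facet titles by hand.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display

import pandas as pd
import seaborn as sns

# Hypothetical stand-in for the tips data, so the example needs no download.
demo = pd.DataFrame({
    "total_bill": [16.99, 10.34, 21.01, 23.68, 24.59, 25.29, 8.77, 26.88],
    "tip": [1.01, 1.66, 3.50, 3.31, 3.61, 4.71, 2.00, 3.12],
    "time": ["Dinner", "Dinner", "Lunch", "Lunch",
             "Dinner", "Lunch", "Dinner", "Lunch"],
})

g = sns.FacetGrid(demo, col="time")
g.map_dataframe(sns.scatterplot, x="total_bill", y="tip")

# Set the labels explicitly rather than relying on the mapping call.
g.set_axis_labels("Total bill", "Tip")
# Show only the level name in each facet title.
g.set_titles(col_template="{col_name}")
```

Setting the labels yourself also lets you use readable text instead of raw column names.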

The FacetGrid constructor accepts a hue parameter. Setting this will condition the data on another variable and draw the subsets in different colors. Where possible, label information is tracked so that a single legend can be drawn:

g = sns.FacetGrid(tips, col="time", hue="sex")
g.map_dataframe(sns.scatterplot, x="total_bill", y="tip")
g.add_legend()

The size and shape of the plot are specified at the level of each subplot using the height and aspect parameters. The following example changes the height and aspect ratio of each facet:

g = sns.FacetGrid(tips, col="day", height=3.5, aspect=.65)
g.map(sns.histplot, "total_bill")
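Because height and aspect apply to each facet, the overall figure size follows from the grid shape. The sketch below checks that arithmetic on a small hypothetical dataset with four days; it assumes that each facet is height inches tall and height × aspect inches wide, and that the g.figure attribute is available (it is in recent seaborn releases).

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display

import pandas as pd
import seaborn as sns

# Hypothetical data with four levels of "day", giving four columns.
demo = pd.DataFrame({
    "day": ["Thur", "Fri", "Sat", "Sun", "Thur", "Fri", "Sat", "Sun"],
    "total_bill": [27.2, 22.5, 20.3, 17.0, 18.8, 28.9, 15.0, 10.3],
})

height, aspect = 3.5, 0.65
g = sns.FacetGrid(demo, col="day", height=height, aspect=aspect)

n_rows, n_cols = g.axes.shape
# Expected size in inches: every facet is (height * aspect) wide and height tall.
expected_size = (n_cols * height * aspect, n_rows * height)
actual_size = tuple(g.figure.get_size_inches())
print(expected_size, actual_size)
```

Adding a legend outside the plot can widen the figure afterwards, so the check is done before any legend is drawn.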

Note that margin_titles, a constructor option that moves the row titles to the right margin of the grid, isn’t formally supported by the matplotlib API and may not work well in all cases. In particular, it currently can’t be used with a legend that lies outside of the plot.

The size of the figure is set by providing the height of each facet, along with the aspect ratio:

g = sns.FacetGrid(tips, col="day", height=4, aspect=.5)
g.map(sns.barplot, "sex", "total_bill", order=["Male", "Female"])

The default ordering of the facets is derived from the information in the DataFrame. If the variable used to define facets has a categorical type, then the order of the categories is used.

Otherwise, the facets will be in the order of appearance of the category levels. It is possible, however, to specify an ordering of any facet dimension with the appropriate *_order parameter:

ordered_sexes = tips.sex.value_counts().index
g = sns.FacetGrid(tips, row="sex", row_order=ordered_sexes, height=1.7, aspect=4)
g.map(sns.kdeplot, "total_bill")
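The three ordering rules described above can be compared side by side. This is a sketch on a small hypothetical dataset; it reads the resulting order from the grid's col_names attribute, which I am assuming holds the column levels in the order they are drawn.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display

import pandas as pd
import seaborn as sns

# Hypothetical data: "day" is stored as plain strings here.
demo = pd.DataFrame({
    "day": ["Sun", "Sat", "Thur", "Fri", "Sun", "Sat"],
    "total_bill": [16.99, 20.65, 27.20, 28.97, 10.34, 17.92],
})

# 1. Plain strings: facets follow the order of appearance in the data.
g_appearance = sns.FacetGrid(demo, col="day")

# 2. Categorical dtype: facets follow the declared category order.
demo_cat = demo.assign(
    day=pd.Categorical(demo["day"], categories=["Thur", "Fri", "Sat", "Sun"])
)
g_categorical = sns.FacetGrid(demo_cat, col="day")

# 3. Explicit col_order: overrides whatever the data suggests.
g_explicit = sns.FacetGrid(demo, col="day", col_order=["Fri", "Sat", "Sun", "Thur"])

for grid in (g_appearance, g_categorical, g_explicit):
    print(list(grid.col_names))
```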

If you have many levels of one variable, you can plot it along the columns but “wrap” them so that they span multiple rows. When doing this, you cannot use a row variable.

attend = sns.load_dataset("attention").query("subject <= 12")
attend.head()

   Unnamed: 0  subject attention  solutions  score
0           0        1   divided          1    2.0
1           1        2   divided          1    3.0
2           2        3   divided          1    3.0
3           3        4   divided          1    5.0
4           4        5   divided          1    4.0

g = sns.FacetGrid(attend, col="subject", col_wrap=4, height=2, ylim=(0, 10))
# In seaborn 0.12 and later, errorbar=None replaces the deprecated ci=None.
g.map(sns.pointplot, "solutions", "score", order=[1, 2, 3], color=".3", ci=None)

Using custom functions  

You’re not limited to existing matplotlib and seaborn functions when  using FacetGrid. However, to work properly, any function you use must  follow a few rules: 

1. It must plot onto the “currently active” matplotlib Axes. This will be true of functions in the matplotlib.pyplot namespace, and you can call matplotlib.pyplot.gca() to get a reference to the current Axes if you want to work directly with its methods. 

2. It must accept the data that it plots in positional arguments. Internally, FacetGrid will pass a Series of data for each of the named positional arguments passed to FacetGrid.map().

3. It must be able to accept color and label keyword arguments, and, ideally, it will do something useful with them. In most cases, it’s easiest to catch a generic dictionary of **kwargs and pass it along to the underlying plotting function.

Let’s look at a minimal example of a function you can plot with. This function will just take a single vector of data for each facet:

import matplotlib.pyplot as plt
from scipy import stats

def quantile_plot(x, **kwargs):
    # probplot returns the theoretical quantiles and the ordered sample values
    quantiles, xr = stats.probplot(x, fit=False)
    plt.scatter(xr, quantiles, **kwargs)

g = sns.FacetGrid(tips, col="sex", height=4)
g.map(quantile_plot, "total_bill")

Plotting pairwise data relationships  

PairGrid also allows you to quickly draw a grid of small subplots using the same plot type to visualize data in each. In a PairGrid, each row and column is assigned to a different variable, so the resulting plot shows each pairwise relationship in the dataset. This style of plot is sometimes called a “scatterplot matrix”, as this is the most common way to show each relationship, but PairGrid is not limited to scatterplots.

It’s important to understand the differences between a FacetGrid and a PairGrid. In the former, each facet shows the same relationship conditioned on different levels of other variables. In the latter, each plot shows a different relationship (although the upper and lower triangles will have mirrored plots). Using PairGrid can give you a very quick, very high-level summary of interesting relationships in your dataset. 

The basic usage of the class is very similar to FacetGrid. First, you initialize the grid, then you pass the plotting function to a map method and it will be called on each subplot. There is also a companion function, pairplot(), that trades off some flexibility for faster plotting.

We will use the iris dataset for this example.

iris = sns.load_dataset("iris")
iris.head()

   sepal_length  sepal_width  petal_length  petal_width species
0           5.1          3.5           1.4          0.2  setosa
1           4.9          3.0           1.4          0.2  setosa
2           4.7          3.2           1.3          0.2  setosa
3           4.6          3.1           1.5          0.2  setosa
4           5.0          3.6           1.4          0.2  setosa

Now we draw a PairGrid for the iris dataset and map a scatterplot onto every subplot:

g = sns.PairGrid(iris)  

g.map(sns.scatterplot) 


By default every numeric column in the dataset is used, but you can focus on particular relationships if you want:

g = sns.PairGrid(iris, vars=["sepal_length", "sepal_width"], hue="species")
g.map(sns.scatterplot)

The square grid with identity relationships on the diagonal is actually just a special case, and you can plot with different variables in the rows and columns:

g = sns.PairGrid(tips, y_vars=["tip"], x_vars=["total_bill", "size"], height=4)
g.map(sns.regplot, color=".3")
g.set(ylim=(-1, 11), yticks=[0, 5, 10])
