Qualitative Color Palettes: A Guide for Data Science Enthusiasts

Color is a powerful communication tool in data visualization. Qualitative color palettes play a crucial role in making data science visualizations informative and engaging. If you’re learning Python for data science, understanding how to use qualitative color palettes effectively is essential.

At H2K Infosys, our Python programming for Data science courses equip you with practical skills to master data visualization, statistical analysis, and more. In this blog, we’ll explore the significance of qualitative color palettes and how you can leverage them to excel in data science with Python.

Qualitative color palettes are well-suited to representing categorical data because most of their variation is in the hue component. The default color palette in seaborn is a qualitative palette with ten distinct hues:

sns.color_palette() 

These colors have the same ordering as the default matplotlib color palette, “tab10”, but they are a bit less intense. Compare:

sns.color_palette("tab10")

Seaborn has six variations of matplotlib’s palette, called deep, muted, pastel, bright, dark, and colorblind. These span a range of average luminance and saturation values.

Many people find the moderated hues of the default “deep” palette to be aesthetically pleasing, but they are also less distinct. As a result, they may be more difficult to discriminate in some contexts, which is something to keep in mind when making publication graphics. Comparing the palettes under simulated forms of colorblindness can help estimate how well each one performs.
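To look at the six variations side by side, a short sketch like the one below prints the hex codes for each. It assumes seaborn is installed; the `VARIANTS` list and the `palettes` dictionary are names chosen here for illustration.

```python
import seaborn as sns

# The six seaborn variations of matplotlib's default palette, as named in the text.
VARIANTS = ["deep", "muted", "pastel", "bright", "dark", "colorblind"]

# Map each variant name to its list of hex color codes.
palettes = {name: sns.color_palette(name).as_hex() for name in VARIANTS}

for name, hex_codes in palettes.items():
    print(f"{name:>10}: {' '.join(hex_codes)}")
```

Printing hex codes is only one way to compare them. In a notebook, evaluating sns.color_palette("pastel") on its own line displays the swatches directly.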

What Are Qualitative Color Palettes?

Qualitative color palettes are designed to distinguish discrete categories or groups in a dataset. Unlike sequential or diverging palettes, qualitative palettes rely on distinct colors that have no inherent order. These palettes are ideal for:

  • Grouping data by categories (e.g., product types, regions, or customer segments).
  • Visualizing nominal data.
  • Enhancing readability in plots like bar charts, scatter plots, and pie charts.

For instance, a qualitative palette might assign unique colors to categories such as “Apple,” “Banana,” and “Cherry” in a fruit sales dataset. This makes it easy for viewers to distinguish between the groups.
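The fruit example can be sketched in a few lines. The sales figures are made up for illustration, and the `category_colors` mapping is just one possible way to tie each category to a color.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import seaborn as sns

# Made-up sales figures, purely for illustration.
sales = {"Apple": 120, "Banana": 95, "Cherry": 60}

# One distinct color per category, taken from seaborn's "deep" palette.
colors = sns.color_palette("deep", n_colors=len(sales))
category_colors = dict(zip(sales, colors))

fig, ax = plt.subplots()
ax.bar(
    list(sales),
    list(sales.values()),
    color=[category_colors[name] for name in sales],
)
ax.set_ylabel("Units sold")
# Call fig.savefig("fruit_sales.png") or plt.show() to view the chart.
```

Keeping the mapping in a dictionary means the same category gets the same color in every chart of a report.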

Why Are Qualitative Color Palettes Important in Data Science?

1. Clarity and Interpretation

Qualitative palettes ensure that viewers can quickly identify and differentiate data categories. For example, a marketing analyst visualizing customer segments can easily interpret trends using distinct colors for each segment.

2. Accessibility

When creating visualizations for diverse audiences, using qualitative color palettes improves accessibility. Tools like colorblind-friendly palettes ensure inclusivity, making insights understandable to everyone.

3. Professional Appeal

Effective use of qualitative colors enhances the aesthetic quality of charts and graphs. In data science projects, visually appealing results can leave a lasting impression on stakeholders.

Using circular color systems  

When you have an arbitrary number of categories, the easiest approach to finding unique hues is to draw evenly-spaced colors in a circular color space (one where the hue changes while keeping the brightness and saturation constant). This is what most seaborn functions default to when they need to use more colors than are currently set in the default color cycle. 

The most common way to do this uses the HLS color space, which is a simple transformation of RGB values:

sns.color_palette("hls", 8)
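The idea of drawing evenly spaced hues can be sketched with Python's standard colorsys module alone. This is an illustration of the technique, not seaborn's own implementation; the function name and the lightness and saturation defaults are choices made here.

```python
import colorsys


def evenly_spaced_hues(n_colors, lightness=0.6, saturation=0.65):
    """Return n_colors RGB tuples with hues spaced evenly around the HLS wheel."""
    palette = []
    for i in range(n_colors):
        hue = i / n_colors  # fraction of a full turn around the wheel
        palette.append(colorsys.hls_to_rgb(hue, lightness, saturation))
    return palette


palette = evenly_spaced_hues(8)
for rgb in palette:
    print(tuple(round(channel, 2) for channel in rgb))
```

Only the hue changes from one color to the next, while lightness and saturation stay fixed, which is what makes the color space "circular".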

Because of the way the human visual system works, colors that have the same luminance and saturation in terms of their RGB values won’t necessarily look equally intense. To remedy this, seaborn provides an interface to the HUSL system (since renamed to HSLuv), which achieves less intensity variation as you rotate around the color wheel:

sns.color_palette("husl", 8)
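When a dataset has more groups than the ten default hues, either circular system can supply as many colors as needed. The sketch below assumes a hypothetical count of 14 categories and also shows seaborn's husl_palette() function, which exposes saturation and lightness as arguments.

```python
import seaborn as sns

n_categories = 14  # hypothetical: more groups than the ten default hues

hls_colors = sns.color_palette("hls", n_categories)
husl_colors = sns.color_palette("husl", n_categories)

# husl_palette() takes saturation (s) and lightness (l) on a 0-1 scale.
softer = sns.husl_palette(n_colors=n_categories, s=0.7, l=0.7)

print(len(hls_colors), len(husl_colors), len(softer))
```

With this many hues, neighboring colors become harder to tell apart, so it is worth checking the result on a real chart.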

Using categorical Colorbrewer palettes  

Another source of visually pleasing categorical palettes is the Color Brewer tool, which also has sequential and diverging palettes, as we’ll see below:

sns.color_palette("Set2")
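Color Brewer's qualitative palettes have a fixed number of colors, which matters when you have many categories. The sketch below inspects "Set2" and shows what happens when more colors are requested than it defines; the variable names are illustrative.

```python
import seaborn as sns

set2 = sns.color_palette("Set2")
print(len(set2))       # number of colors the palette defines
print(set2.as_hex())   # hex codes, handy for reuse outside seaborn

# Asking for more colors than a qualitative palette defines makes it repeat.
cycled = sns.color_palette("Set2", 10)
print(cycled[8] == cycled[0])
```

Repeated colors make two categories indistinguishable, so a circular palette may be a better fit once the category count exceeds the palette size.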

Sequential color palettes 

The second major class of color palettes is called “sequential”. This kind of mapping is appropriate when data range from relatively low or uninteresting values to relatively high or interesting values (or vice versa). The primary dimension of variation in a sequential palette is luminance. Some seaborn functions will default to a sequential palette when you are mapping numeric data. For historical reasons, both categorical and numeric mappings are specified with the hue parameter in functions like relplot() or displot(), even though numeric mappings use color palettes with relatively little hue variation.
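The two uses of the hue parameter can be compared on a tiny dataset. The data frame below is made up for illustration, and scatterplot() stands in for the figure-level functions named above.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Small made-up dataset for illustration.
df = pd.DataFrame({
    "x": [1, 2, 3, 4, 5, 6],
    "y": [2, 4, 3, 6, 5, 7],
    "group": ["a", "b", "c", "a", "b", "c"],   # categorical column
    "score": [0.1, 0.5, 0.9, 1.3, 1.7, 2.1],   # numeric column
})

fig, (ax_cat, ax_num) = plt.subplots(1, 2, figsize=(8, 3))

# Categorical hue: a qualitative palette gives each group its own hue.
sns.scatterplot(data=df, x="x", y="y", hue="group", palette="deep", ax=ax_cat)

# Numeric hue: a sequential palette encodes magnitude mainly through luminance.
sns.scatterplot(data=df, x="x", y="y", hue="score", palette="flare", ax=ax_num)
```

The same parameter drives both plots; only the type of the column and the kind of palette differ.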

Perceptually uniform palettes  

Because they are intended to represent numeric values, the best sequential palettes will be perceptually uniform, meaning that the relative discriminability of two colors is proportional to the difference between the corresponding data values.  

Seaborn includes four perceptually uniform sequential colormaps: “rocket”, “mako”, “flare”, and “crest”. The first two have a very wide luminance range and are well suited for applications such as heatmaps, where colors fill the space they are plotted into:

sns.color_palette("rocket", as_cmap=True)
sns.color_palette("mako", as_cmap=True)

Because the extreme values of these colormaps approach white, they are not well-suited for coloring elements such as lines or points: it will be difficult to discriminate important values against a white or gray background. The “flare” and “crest” colormaps are a better choice for such plots. They have a more restricted range of luminance variations, which they compensate for with a slightly more pronounced variation in hue. The default direction of the luminance ramp is also reversed so that smaller values have lighter colors:

sns.color_palette("flare", as_cmap=True)
sns.color_palette("crest", as_cmap=True)
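The sketch below puts that advice into practice: “rocket” fills a heatmap, while “crest” colors a handful of lines. The grid values and the line slopes are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

rng = np.random.default_rng(0)
grid = rng.random((6, 6))  # made-up values in [0, 1)

fig, (ax_heat, ax_lines) = plt.subplots(1, 2, figsize=(9, 3))

# Wide luminance range: suited to cells that fill the plotting area.
sns.heatmap(grid, cmap="rocket", ax=ax_heat)

# Narrower luminance range: every line stays visible on a light background.
line_colors = sns.color_palette("crest", n_colors=5)
x = np.linspace(0, 1, 50)
for level, color in enumerate(line_colors, start=1):
    ax_lines.plot(x, level * x, color=color, label=f"slope {level}")
ax_lines.legend()
```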

It is also possible to use the perceptually uniform colormaps provided by matplotlib, such as “magma” and “viridis”:

sns.color_palette("magma", as_cmap=True)
sns.color_palette("viridis", as_cmap=True)

Discrete vs. continuous mapping  

One thing to be aware of is that seaborn can generate discrete values from sequential colormaps and, when doing so, it will not use the most extreme values. Compare the discrete version of “rocket” against the continuous version shown above:

sns.color_palette("rocket") 

Internally, seaborn uses the discrete version for categorical data and the continuous version when in numeric mapping mode. Discrete sequential colormaps can be well-suited for visualizing categorical data with an intrinsic ordering, especially if there is some hue variation. 
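One way to see that the discrete version skips the extremes is to compare its end colors with the ends of the continuous colormap. This is a small check with illustrative variable names, not a description of seaborn's internals.

```python
import seaborn as sns

continuous = sns.color_palette("rocket", as_cmap=True)
discrete = sns.color_palette("rocket", 6)

# Colors at the very ends of the continuous colormap (alpha channel dropped).
darkest_possible = continuous(0.0)[:3]
lightest_possible = continuous(1.0)[:3]

print("continuous ends:", darkest_possible, lightest_possible)
print("discrete ends:  ", discrete[0], discrete[-1])
```

The discrete end colors sit a little inside the full range, which keeps them readable against both light and dark backgrounds.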

Sequential “cubehelix” palettes  

The perceptually uniform colormaps are difficult to programmatically generate because they are not based on the RGB color space. The cubehelix system offers an RGB-based compromise: it generates sequential palettes with a linear increase or decrease in brightness and some continuous variation in hue. While not perfectly perceptually uniform, the resulting colormaps have many good properties. Importantly, many aspects of the design process are parameterizable.

Matplotlib has the default cubehelix version built into it:

sns.color_palette("cubehelix", as_cmap=True)

The default palette returned by the seaborn cubehelix_palette() function is a bit different from the matplotlib default in that it does not rotate as far around the hue wheel or cover as wide a range of intensities. It also reverses the luminance ramp:

sns.cubehelix_palette(as_cmap=True)

Other arguments to cubehelix_palette() control how the palette looks. The two main things you’ll change are start (a value between 0 and 3) and rot, or number of rotations (an arbitrary value, but usually between -1 and 1):

sns.cubehelix_palette(start=.5, rot=-.5, as_cmap=True)

You can control both how dark and light the endpoints are and their order:

sns.cubehelix_palette(start=2, rot=0, dark=0, light=.95, reverse=True, as_cmap=True)
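The same arguments also work when you want a discrete list of colors rather than a colormap. The sketch below builds an eight-step ramp two ways: with keyword arguments, and with the "ch:" string shorthand that color_palette() accepts. The parameter values are arbitrary picks for illustration.

```python
import seaborn as sns

# Eight discrete steps, running from light to dark by default.
ramp = sns.cubehelix_palette(n_colors=8, start=0.5, rot=-0.75, dark=0.2, light=0.9)

# The same settings written as a "ch:" string, which should describe the same palette.
same_idea = sns.color_palette("ch:start=.5,rot=-.75,dark=.2,light=.9", 8)

for keyword_color, string_color in zip(ramp, same_idea):
    print(keyword_color, string_color)
```

The string form is convenient when a plotting function only accepts a palette name.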

Custom sequential palettes  

For a simpler interface to custom sequential palettes, you can use light_palette() or dark_palette(), which are both seeded with a single color and produce a palette that ramps either from light or dark desaturated values to that color:

sns.light_palette("seagreen", as_cmap=True)
sns.dark_palette("#69d", reverse=True, as_cmap=True)

These palettes can also be requested by name through color_palette(), using strings such as “dark:salmon”. Reverse the colormap by adding “_r”:

sns.color_palette("dark:salmon_r", as_cmap=True)
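A quick way to see what "seeded with a single color" means is to build short discrete ramps and look at where they end. The five-step length is an arbitrary choice for illustration.

```python
import seaborn as sns
from matplotlib.colors import to_rgb

seed = to_rgb("seagreen")

light_ramp = sns.light_palette("seagreen", n_colors=5)  # light gray -> seagreen
dark_ramp = sns.dark_palette("seagreen", n_colors=5)    # dark gray -> seagreen

# The same light ramp requested through the string shorthand.
by_name = sns.color_palette("light:seagreen", 5)

print("seed color:      ", seed)
print("light ramp ends: ", light_ramp[-1])
print("dark ramp ends:  ", dark_ramp[-1])
```

Both ramps finish at the seed color; only the starting point differs.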

Diverging color palettes  

The third class of color palettes is called “diverging”. These are used for data where both large low and high values are interesting and span a midpoint value (often 0) that should be emphasized. The rules for choosing good diverging palettes are similar to those for good sequential palettes, except now there should be two dominant hues in the colormap, one at (or near) each pole. It’s also important that the starting values are of similar brightness and saturation.

Perceptually uniform diverging palettes  

Seaborn includes two perceptually uniform diverging palettes: “vlag” and “icefire”. They both use blue and red at their poles, which many intuitively process as “cold” and “hot”:

sns.color_palette("vlag", as_cmap=True)
sns.color_palette("icefire", as_cmap=True)
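A diverging palette is most useful when its neutral middle lines up with the meaningful midpoint of the data. The sketch below uses made-up signed values and the center argument of heatmap() to pin the middle of “vlag” to zero.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import numpy as np
import seaborn as sns

rng = np.random.default_rng(1)
deviations = rng.normal(loc=0.0, scale=1.0, size=(5, 5))  # made-up signed values

# center=0 maps zero to the neutral middle of the colormap.
ax = sns.heatmap(deviations, cmap="vlag", center=0)
```

Without center, the middle of the colormap would sit at the midpoint of the data range, which is not necessarily zero.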

Custom diverging palettes  

You can also use the seaborn function diverging_palette() to create a custom colormap for diverging data. This function makes diverging palettes using the HUSL color system. You pass it two hues (in degrees) and, optionally, the lightness and saturation values for the extremes. Using HUSL means that the extreme values, and the resulting ramps to the midpoint, while not perfectly perceptually uniform, will be well-balanced:

sns.diverging_palette(220, 20, as_cmap=True)

It’s also possible to make a palette where the midpoint is dark rather than light:

sns.diverging_palette(250, 30, l=65, center="dark", as_cmap=True)
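Requesting a discrete palette with an odd number of steps makes the role of the midpoint easy to inspect. The sketch below uses seven steps, an arbitrary choice, and compares a light-centered palette with a dark-centered one.

```python
import seaborn as sns

# s and l set the saturation and lightness of the two extremes.
light_center = sns.diverging_palette(220, 20, s=75, l=50, n=7)
dark_center = sns.diverging_palette(250, 30, l=65, center="dark", n=7)

middle = len(light_center) // 2
print("light-centered middle:", light_center[middle])
print("dark-centered middle: ", dark_center[middle])
```

In the first palette the middle step is the lightest color; in the second it is the darkest.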

Other diverging palettes  

There are a few other good diverging palettes built into matplotlib, including Color Brewer palettes:

sns.color_palette("Spectral", as_cmap=True)

Seaborn tries both to use good defaults and to offer a lot of flexibility. 

This discussion is only the beginning, and there are a number of good resources for learning more about techniques for using color in visualizations.

Learn Python for Data Science at H2K Infosys

Are you ready to enhance your data visualization skills? At H2K Infosys, our Python certification courses are tailored to help you master data science with Python. Our programs focus on practical applications, real-world case studies, and hands-on projects. Here’s what you’ll gain:

  • Comprehensive Training: Learn Python programming for data science from basics to advanced topics.
  • Real-World Projects: Work on projects that simulate industry scenarios, such as customer segmentation and sales analysis.
  • Expert Guidance: Gain insights from experienced instructors who are industry leaders.
  • Flexible Learning: Choose from the best online data science programs with flexible schedules.

Best Practices for Using Qualitative Color Palettes

  1. Choose Distinct Colors: Avoid using colors that are too similar, as they may confuse the audience.
  2. Consider Accessibility: Use tools like Color Brewer to select palettes that are colorblind-friendly.
  3. Limit Categories: Limit the number of categories to 8-10 to avoid overwhelming viewers.
  4. Test Your Visualizations: Always test your visualizations on different devices to ensure clarity.
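One possible way to follow the "Limit Categories" advice is to fold infrequent categories into a single "Other" group before plotting. The helper below is a hypothetical sketch using only the standard library, with made-up labels; it is not part of seaborn.

```python
from collections import Counter


def collapse_rare_categories(labels, max_categories=8, other_label="Other"):
    """Keep the most frequent labels and replace the rest with other_label."""
    counts = Counter(labels)
    if len(counts) <= max_categories:
        return list(labels)  # already within the limit, nothing to collapse
    # Reserve one slot for the catch-all group.
    keep = {label for label, _ in counts.most_common(max_categories - 1)}
    return [label if label in keep else other_label for label in labels]


# Made-up labels for illustration.
labels = ["a"] * 5 + ["b"] * 4 + ["c"] * 3 + ["d"] * 2 + ["e", "f"]
collapsed = collapse_rare_categories(labels, max_categories=4)
print(Counter(collapsed))
```

The collapsed labels can then be colored with any qualitative palette that has at least max_categories colors.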

Key Takeaways

  • Qualitative color palettes are essential for visualizing categorical data in Python.
  • Popular libraries like Matplotlib, Seaborn, and Plotly make it easy to implement these palettes.
  • Following best practices ensures your visualizations are clear, accessible, and impactful.
  • Enroll in H2K Infosys’ Python for data science courses to master these techniques and elevate your career.

Take the Next Step!

Join H2K Infosys’ Python certification programs today to build your expertise in data science. Learn hands-on, gain real-world skills, and achieve your career goals. Enroll now!
