Data Visualization: Waffle Charts

Mari Galdina
5 min read · Dec 27, 2020
Photo by nabil boukala on Unsplash

One of the essential skills for a data scientist is translating data so that everyone can understand it. We typically create visualizations at two points in a project. The first is Exploratory Data Analysis (EDA), where we work on understanding the data. The second is the summary after EDA, where we compare model predictions and forecasts.

In this post, I will focus on visualizations for EDA. It is a large and significant step in a project. Usually, we work on it after acquiring data and feature engineering. It helps us summarize information about the dataset and prepare for modeling. But EDA is NOT about making fancy visualizations. Its purpose is to use statistical techniques and visualizations to better understand the data and to find clues about its tendencies. That means our goal is to create a figure that someone can look at for a couple of seconds and understand what we want to say.

Structured data

When we work with structured data during EDA, we commonly deal with categorical variables or numerical values.

Categorical variables have a finite set of values.

Numerical values are measured quantities and are often continuous.

Categorical variables can be further separated into two types: nominal and ordinal. Nominal data has no specific ordering, for example gender, nationality, or type of furniture. Ordinal data has a clear ordering (what a surprise), such as dryer settings (high, medium, low). To describe such data we can use a bar chart or a waffle chart. I want to look at waffle charts in detail.
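To make the nominal/ordinal distinction concrete, here is a minimal sketch in plain Python. The dryer settings list is made-up example data, and the names ORDER and rank are my own; the point is only that ordinal data needs its order stated explicitly, otherwise it sorts alphabetically.

```python
from collections import Counter

# Hypothetical example data: dryer settings recorded in arbitrary order.
settings = ["medium", "high", "low", "medium", "low", "low"]

# Ordinal data: the order is part of the meaning, so we state it explicitly.
ORDER = ["low", "medium", "high"]
rank = {level: position for position, level in enumerate(ORDER)}

# Sort by rank instead of alphabetically ("high" < "low" < "medium").
sorted_settings = sorted(settings, key=rank.__getitem__)

# Count each level and report the counts in the meaningful order.
counts = Counter(settings)
ordered_counts = [(level, counts[level]) for level in ORDER]

print(sorted_settings)
print(ordered_counts)
```

For nominal data such as nationality there is no such ORDER list to define, so any ordering of the categories in a chart is a presentation choice.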

What do we know about waffle charts?

  1. They are an alternative to pie charts and work well for displaying a single percentage. A waffle chart is typically a grid of squares that represents the whole, and we shade the squares to show how several parts relate to that whole.
  2. They are used to show progress on dashboards, where each element shows how much it contributes to the overall total.
  3. They have some cons: like most charts, they become too complicated when too many segments are involved.
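The idea of shading squares in proportion to each part can be sketched without any plotting library. The snippet below is an illustration under my own assumptions, not how pywaffle works internally: the shares are made up, the function names to_blocks and render are hypothetical, and I use largest-remainder rounding so that the squares always add up to exactly 100.

```python
import math

def to_blocks(shares, total_blocks=100):
    """Split total_blocks among categories in proportion to their shares.

    Largest-remainder rounding keeps the sum equal to total_blocks.
    """
    total = sum(shares.values())
    exact = {name: value / total * total_blocks for name, value in shares.items()}
    blocks = {name: math.floor(x) for name, x in exact.items()}
    leftover = total_blocks - sum(blocks.values())
    # Hand the remaining squares to the largest fractional parts.
    by_remainder = sorted(exact, key=lambda name: exact[name] - blocks[name], reverse=True)
    for name in by_remainder[:leftover]:
        blocks[name] += 1
    return blocks

def render(blocks, columns=10):
    """Draw the grid as text, one symbol per square, filled row by row."""
    symbols = "#o.+*"  # one character per category, purely illustrative
    cells = "".join(symbols[i] * n for i, n in enumerate(blocks.values()))
    return "\n".join(cells[i:i + columns] for i in range(0, len(cells), columns))

# Hypothetical shares, not taken from the dataset used in this post.
shares = {"children": 25.5, "adults": 65.8, "elderly": 8.7}
blocks = to_blocks(shares)
print(blocks)
print(render(blocks))
```

Each printed character stands for one square of the waffle, which is exactly what a reader counts (or estimates) when looking at the real chart.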

How to create waffle charts?

The simplest way to create a waffle chart is to use the pywaffle package. I use Jupyter Notebook, Python, and the 100 People: A World Portrait dataset.

First, I install the pywaffle package using this command:

!pip install pywaffle

Then I import Waffle from the pywaffle package:

# import necessary libraries
import matplotlib.pyplot as plt
from pywaffle import Waffle

Now, following the documentation, I can create a basic waffle chart. First, I instantiate a new figure, passing Waffle as the figure class and specifying how many rows the chart should have and the values to plot.

Waffle chart for age distribution
# simple code for a waffle chart
# df_age is assumed to be a pandas DataFrame with 'age' and 'percent' columns
fig = plt.figure(
    FigureClass=Waffle,
    rows=5,
    values=df_age.percent,
    labels=list(df_age.age),
)

This gives us a standard waffle chart, but we can change many of its parameters:

  • colors: a list of colors, one for each category. The number of colors should match the number of values.
  • columns: the number of columns in the waffle chart. Either columns or rows is required. If we pass only rows, the number of columns is calculated automatically from the absolute value of values.
Let's see what happens if we change some parameters.
# try new parameters
fig = plt.figure(
    FigureClass=Waffle,
    rows=10,
    columns=10,
    colors=['navy', 'red', 'yellow'],
    values=df_age.percent,
    labels=list(df_age.age),
)
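The relationship between rows, columns, and values described above comes down to simple arithmetic. The sketch below shows that arithmetic only; infer_columns is a hypothetical helper of mine, not a pywaffle function, and the block counts are made up.

```python
import math

def infer_columns(values, rows):
    """Columns needed to fit one block per unit of value in a grid with `rows` rows."""
    total_blocks = sum(round(v) for v in values)
    # Round up so that a partly filled last column still gets drawn.
    return math.ceil(total_blocks / rows)

values = [26, 66, 8]  # hypothetical block counts that sum to 100

print(infer_columns(values, rows=5))        # 100 blocks in 5 rows
print(infer_columns(values, rows=10))       # 100 blocks in 10 rows
print(infer_columns([30, 16, 4], rows=6))   # 50 blocks -> last column is partly empty
```

This is why a 10 x 10 grid is a natural choice for percentages: 100 blocks fill it exactly, so every square stands for one percent.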
  • characters: a character or a list of characters, one for each category.
  • font_size: the font size of the characters on the chart. It can be an absolute number or one of 'xx-small', 'x-small', 'small', 'medium', 'large', 'x-large', 'xx-large'.
Now let's change the squares to symbols.
# try more parameters
fig = plt.figure(
    FigureClass=Waffle,
    rows=10,
    columns=10,
    colors=['navy', 'red', 'yellow'],
    characters=['+', 'o', '*'],
    font_size='xx-large',  # or an absolute number such as 30
    values=df_age.percent,
    labels=list(df_age.age),
)
  • icons: replaces the squares or symbols with Font Awesome icons. If it is a string, all categories use the same icon.
  • icon_legend: uses the icons in the legend as well.

I like to use these parameters because they can make a plot unique and show the personality of the data.

# change characters to icons
fig = plt.figure(
    FigureClass=Waffle,
    rows=10,
    columns=10,
    colors=['navy', 'red', 'yellow'],
    font_size='xx-large',  # or an absolute number such as 30
    values=df_age.percent,
    labels=list(df_age.age),
    icons='child',
    icon_size=20,
    icon_legend=True,
)

But we can still make it better. Let's move the legend.

One line moves the legend away from the body of the chart:

legend = {'loc': 'upper left', 'bbox_to_anchor': (1, 1)},

Also, I like to change the icons to fit the subject when I work with time-series datasets, such as weather data:

icons=['sun', 'cloud-showers-heavy', 'snowflake']

A waffle chart can also help compare how values are distributed across two or more groups.

Compare cars

Summary

This post is a brief introduction to data visualization with waffle charts. They are an interesting option when you want to add variety to the visualizations in your projects.
