Plotly: Creating Interactive Visuals

George Ferre
May 31, 2021

One of the most important jobs of a data scientist is communicating findings, and one of the best ways to do that is by visualizing the data. Let’s say that I wanted to discuss Seaborn’s mpg dataset and talk about mean miles per gallon per year. With so many data points, it would take far too long to describe them all in words. But as they say, a picture is worth a thousand words. Instead of painstakingly describing each data point, I can convey the data quickly and intuitively with a scatter plot.

Like many data scientists, I treat Matplotlib as my bread and butter when it comes to visualizing data. Matplotlib provides a wide array of tools for creating visualizations of varying complexity and style. However, if I want an interactive visual that readers can explore in a browser, Matplotlib on its own does not offer an easy solution. This is where Plotly comes in.

What is Plotly?

Plotly is a library for creating interactive graphs, available in Python, R, MATLAB, Julia, Perl, Arduino and REST. It is made by Plotly (the name of the company as well as the library). Plotly also creates aesthetically pleasing visuals with less code than a similarly pleasing visual would take in Matplotlib. Before we get into interactivity, I think it’s important to compare the graph above with a similar graph made with Plotly. This is the code used to create the graph above:

import matplotlib.pyplot as plt

# Assumes year_df holds the mean of each column per model year, indexed by year
fig, ax = plt.subplots(figsize=(8, 5))
x = year_df.index
y = year_df['mpg']
ax.scatter(x, y)
ax.set_title('Mean Miles Per Gallon by Year')
ax.set_xlabel('Year')
ax.set_ylabel('MPG');

This is the code used to create a similar graph with Plotly:

import plotly.express as px

fig = px.scatter(df, x='model_year', y='mpg', title='Miles Per Gallon by Year')
fig.show()

The Plotly graph was made with a single line of code and, in my opinion, looks better.

Interact With Graphs

It’s nice to have an easier way to make good-looking graphs, but the star of Plotly’s show is interactivity. You may look at the graph above and think that it is pretty easy to interpret the data without having to interact with it. Sure, it would be neat to hover over a point and get an exact number for mpg, but that does not really add much to the graph. That is true, but we are dealing with a pretty basic graph. What happens when we have a bit more data on our graph, or when we want to make direct comparisons between two or more data sets? Take a look at this graph:

Source

The graph shows how much we could theoretically charge for an Airbnb unit in different neighborhoods in Berlin. Different prices per neighborhood are easy enough to show in a static graph, but you will notice that we can express much more information here. First, we can hover over each unit and see what type of offer it is (entire home, private room or shared room), how many people the unit can accommodate, and whether wifi is available, among many other pieces of information. It would be impossible to fit this much information into a static graph.

Additionally, we can highlight an important feature (such as room type) with different colors to get a better sense of how that feature affects price. We can then toggle the data points for each value of that feature on or off, so that we can zero in on the data that is most important to us at that moment. This ability is a crucial part of what interactivity offers. To create a similar effect without an interactive graph, we would most likely need thirteen graphs (one graph to show all the units in the twelve neighborhoods shown, and another twelve graphs to break down prices for the three room types in each neighborhood).

Animation of Data

Not only can Plotly update a graph in response to input from the end user, it can also create animations that help tell the story of your data. Let’s take a look at a graph that shows life expectancy based on GDP per capita of different countries:

Source

Here we can see that an animation can automatically take the end user through an interactive aspect of the data. Animation is used here to point out a pretty important fact: life expectancy appears to be increasing across countries of all GDPs. You could make that specific point with a static graph by putting the year on the x axis and life expectancy on the y axis, but I think you lose a bit of nuance there. We can infer that if all (or at least most) of the data points are trending towards higher life expectancy, life expectancy must be increasing for all GDPs, but this animation makes that point more explicit. And to reiterate earlier points, we can interpret much more than we could with a single static graph: the effects of continents on life expectancy, the rate at which GDP per capita increases life expectancy, which timespan had the most gains in life expectancy, etc.

Is Plotly Always the Best Option?

It is worth noting that Plotly is not the only interactive visualization library out there. Plotly is often compared to Bokeh, which is fair given that they provide a similar service. A few reasons why someone would choose Plotly over Bokeh are:

  • Plotly offers 3D visualization out of the box
  • Plotly has a larger user base, so it may be easier to find support, tutorials, and other help if needed
  • Plotly generally takes less code than Bokeh to make a similar visualization

That said, Bokeh users cite being able to easily connect multiple graphs, so that an interaction with one graph updates the others, and some would argue that Bokeh is more versatile than Plotly. From my research, I would recommend starting with Plotly and then looking into Bokeh.

With all of that said, if Plotly can make nicer-looking graphs more easily and throw in interactivity, is it worth always defaulting to Plotly? Not necessarily. For one thing, Matplotlib may be a bit easier to learn if you are just starting out in data science (or data visualization in general). Additionally, with so many tools being built on top of Matplotlib (such as Seaborn and the plotting methods in pandas), it is important to know. It is also worth noting that if you are not going to present your graph in a format that allows interactivity (such as a book), it may be simpler to use Matplotlib, or a Matplotlib wrapper that specializes in aesthetics like Seaborn. However, once you get your visualization fundamentals down, Plotly is a useful addition to anyone’s visualization toolkit.

This post only scratches the surface of what Plotly is capable of. For more information, I would recommend checking out Plotly’s official website, or this beginner’s tutorial on Kaggle.


My name is George Ferre. I am currently working to become a data scientist. I hope to share insight into the process as I progress through bootcamp and onward.