Plotly notes

Table of Contents generated with DocToc

Plot.ly for scientific visualization

  • developed by a Montreal-based company Plot.ly
  • open-source scientific plotting Python library for Python, R, MATLAB, Perl, Julia
  • front end uses JavaScript/HTML/CSS and D3.js visualization library
  • files are hosted on Amazon S3

Generated plots can be

  1. browsed online
  2. stored as offline html5 files
  3. displayed inside a Jupyter notebook

Installation

  • Python 3, along with the pip package manager
  • run the command
$ pip install plotly

Online documentation:

  1. Tutorials
  2. Code examples
  3. Keyword index
  4. Getting started guide
  5. Community feed with gallery
  6. Plotly for matplotlib users
  7. Using Plotly with Python offline
  8. Saving static images (PNG, PDF, etc)
  9. Creating HTML or PDF reports in Python
  10. Creating dashboards with Plotly and Python
  11. Connecting to databases
  12. Plotly and IPython / Jupyter notebook

Initial setup for online plotting

Setting up online plotting (need to do this only once):

  1. obtain a free account at https://plot.ly/accounts/login/?action=signup
  2. generate your API key
  3. run the following from Python
import plotly
plotly.tools.set_credentials_file(username='yourUserName', api_key='yourAPIkey')   # will write to ~/.plotly/.credentials

Three different plot destinations

Online plotting

  • room to store 25 free charts; beyond that will get an error
  • can free up some space by removing older plots using their Python web API or simply deleting individual plots at https://plot.ly/organize/home
import plotly.plotly as py    # online plotting
import plotly.graph_objs as go
from numpy import linspace, sin
x1 = linspace(0.01,1,100)
y1 = sin(1/x1)
trace1 = go.Scatter(x=x1, y=y1, mode='lines+markers', name='sin(1/x)')
data = [trace1]
py.plot(data,auto_open=False)   # create a unique URL for this plot and optionally open it

This should return the URL of the plot and open it in your web browser (no local plots saved).

help(py.plot) to show all plotting arguments
py.plot(data, auto_open=False)   # make the online plot, do not auto-open in the web browser
py.plot(data, auto_open=False, sharing='private')   # only 1 free private file

With default one free private file, you will quickly reach the quota. Same for secret keyword (need a paid account). However, you get 25 free public plots, and unlimited offline plotting!

Offline plotting

  1. change the first line,
  2. specify filename in last line.
import plotly.offline as py   # offline plotting
import plotly.graph_objs as go
from numpy import linspace, sin
x1 = linspace(0.01,1,100)
y1 = sin(1/x1)
trace1 = go.Scatter(x=x1, y=y1, mode='lines+markers', name='sin(1/x)')
data = [trace1]
py.plot(data, filename='lines.html',auto_open=False)

By default this will auto-open the file, but you can also use auto_open=False if you want.

Plotting inside a Jupyter notebook

(Option 1) If you have installed jupyter, run it locally on your notebook:

$ jupyter notebook

(Option 2) Log in to https://syzygy.ca with your university ID. This website is maintained/hosted by PIMS, Compute Canada, and Cybera.

  1. start a new Python 3 notebook
  2. in the first line use import plotly.offline as py
  3. specify py.init_notebook_mode(connected=True)
  4. in the last line change py.plot() to py.iplot()
import plotly.offline as py
py.init_notebook_mode(connected=True)
import plotly.graph_objs as go
from numpy import linspace, sin
x1 = linspace(0.01,1,100)
y1 = sin(1/x1)
trace1 = go.Scatter(x=x1, y=y1, mode='lines+markers', name='sin(1/x)')
data = [trace1]
py.iplot(data)   # create a unique URL and open the plot inline in a Jupyter Notebook
  • connected=True will use the online plotly.js library inside the notebook (smaller notebook file sizes)
  • connected=False will include plotly.js into the notebook for complete offline work (larger files sizes, also need to modify some notebook settings before it works)

2D plots: deeper dive

“Scatter” plots

We’ll start with the Jupyter Notebook version of the code.

Let’s print the dataset trace1: it is a plotly object which is actually a Python dictionary, with all elements clearly identified (plot type, x numpy array, y numpy array, line type, legend line name). So, go.Scatter simply creates a dictionary with the corresponding type element. This variable/dataset trace1 completely describes our plot!* Then we create a list data of such objects and pass it to the plotting routine.

In fact, we can rewrite this routine with dictionaries:

import plotly.offline as py
py.init_notebook_mode(connected=True)
from numpy import linspace, sin
x1 = linspace(0.01,1,100)
y1 = sin(1/x1)
trace1 = dict(type='scatter', x=x1, y=y1, mode='lines+markers', name='sin(1/x)')
data = [trace1]
py.iplot(data)

Exercise 1

Pass a list of two objects the plotting routine with data = [trace1,trace2]. Let the second dataset trace2 contain another mathematical function. The idea is to have multiple objects in the plot.

Notice:

  • how we can hover over each data point, and its (x,y) will be shown
  • the toolbar at the top
  • double-clicking on the plot will reset it

Exercise 2

Add a bunch of dots to the plot with dots = go.Scatter(x=[.2,.4,.6,.8], y=[2,1.5,2,1.2]). What is default scatter mode?

Exercise 2.1

Change line colour and width by adding the dictionary line=dict(color=('rgb(205,12,24)'),width=4) to dots:

Exercise 3

Use go.Scatter() to produce a real scatter plot showing a Gaussian distribution in 2D with 1,000 random points.

What are the other plot types? There are quite a few:

import plotly.graph_objs as go
dir(go)
['AngularAxis', 'Annotation', 'Annotations', 'Area', 'Bar', 'Box', 'Candlestick', 'Carpet', 'Choropleth', 'ColorBar', 'Contour', 'Contourcarpet', 'Contours', 'Data', 'ErrorX', 'ErrorY', 'ErrorZ', 'Figure', 'Font', 'Frames', 'Heatmap', 'Heatmapgl', 'Histogram', 'Histogram2d', 'Histogram2dContour', 'Histogram2dcontour', 'Layout', 'Legend', 'Line', 'Margin', 'Marker', 'Mesh3d', 'Ohlc', 'Parcoords', 'Pie', 'Pointcloud', 'RadialAxis', 'Sankey', 'Scatter', 'Scatter3d', 'Scattercarpet', 'Scattergeo', 'Scattergl', 'Scattermapbox', 'Scatterternary', 'Scene', 'Stream', 'Surface', 'Table', 'Trace', 'XAxis', 'XBins', 'YAxis', 'YBins', 'ZAxis', '__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__path__', '__spec__', 'absolute_import', 'graph_objs', 'graph_objs_tools']

Bar plots

Let’s try a Bar plot, constructing data directly in one line from the dictionary:

import plotly.offline as py
py.init_notebook_mode(connected=True)
import plotly.graph_objs as go
data = [go.Bar(x=['Vancouver', 'Calgary', 'Toronto', 'Montreal', 'Halifax'],
               y=[2463431, 1392609, 5928040, 4098927, 403131])]
py.iplot(data)

Let’s plot inner city population vs. greater metro area for each city:

import plotly.offline as py
py.init_notebook_mode(connected=True)
import plotly.graph_objs as go
cities = ['Vancouver', 'Calgary', 'Toronto', 'Montreal', 'Halifax']
proper = [631486, 1239220, 2731571, 1704694, 316701]
metro = [2463431, 1392609, 5928040, 4098927, 403131]
bar1 = go.Bar(x=cities, y=proper, name='inner city')
bar2 = go.Bar(x=cities, y=metro, name='greater area')
data = [bar1,bar2]
py.iplot(data)

Let’s now do a stacked plot, with outer city population on top of inner city population:

import plotly.offline as py
py.init_notebook_mode(connected=True)
import plotly.graph_objs as go
cities = ['Vancouver', 'Calgary', 'Toronto', 'Montreal', 'Halifax']
proper = [631486, 1239220, 2731571, 1704694, 316701]
metro = [2463431, 1392609, 5928040, 4098927, 403131]
outside = [m-p for p,m in zip(proper,metro)]   # need to subtract
bar1 = go.Bar(x=cities, y=proper, name='inner city')
bar2 = go.Bar(x=cities, y=outside, name='outer city')
data = [bar1,bar2]
layout = go.Layout(barmode='stack')         # new element!
fig = go.Figure(data=data, layout=layout)   # new element!
py.iplot(fig)   # we get a stacked bar chart

What else can we modify in the layout?

import plotly.graph_objs as go
help(go.Layout)

There are lots of attributes! Let’s set the title and the background colour:

layout = go.Layout(barmode='stack', title='Population', plot_bgcolor = 'rgb(153, 204, 255)')

Heatmaps

  • go.Area() for plotting wind rose charts
  • go.Box() for basic box plots

Let’s plot a heatmap of monthly temperatures at the South Pole:

import plotly.offline as py
py.init_notebook_mode(connected=True)
import plotly.graph_objs as go
months = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec', 'Year']
recordHigh = [-14.4,-20.6,-26.7,-27.8,-25.1,-28.8,-33.9,-32.8,-29.3,-25.1,-18.9,-12.3,-12.3]
averageHigh = [-26.0,-37.9,-49.6,-53.0,-53.6,-54.5,-55.2,-54.9,-54.4,-48.4,-36.2,-26.3,-45.8]
dailyMean = [-28.4,-40.9,-53.7,-57.8,-58.0,-58.9,-59.8,-59.7,-59.1,-51.6,-38.2,-28.0,-49.5]
averageLow = [-29.6,-43.1,-56.8,-60.9,-61.5,-62.8,-63.4,-63.2,-61.7,-54.3,-40.1,-29.1,-52.2]
recordLow = [-41.1,-58.9,-71.1,-75.0,-78.3,-82.8,-80.6,-79.3,-79.4,-72.0,-55.0,-41.1,-82.8]
trace = go.Heatmap(z=[recordHigh, averageHigh, dailyMean, averageLow, recordLow],
                   x=months,
                   y=['record high', 'aver.high', 'daily mean', 'aver.low', 'record low'])
data = [trace]
py.iplot(data)

Contour maps

Exercise 4

Pretend that our heatmap is defined over a 2D domain and plot the same temperature data as a contour map. Remove the Year data (last column) and use go.Contour to plot the 2D contour map.

Let’s change to a different colourmap:

11c11
<                    y=['record high', 'aver.high', 'daily mean', 'aver.low', 'record low'])
---
>                    y=['record high', 'aver.high', 'daily mean', 'aver.low', 'record low'],
>                    colorscale='Jet')

Downloading data

Open a terminal window inside Jupyter (New-Terminal) and run these commands:

wget http://bit.ly/paraviewzip
unzip paraviewzip
mv data/*.csv .
mv data/*.nc .

Geographical scatterplot

Go back to your Python Jupyter Notebook. Now let’s do a scatterplot on top of a geographical map:

import plotly.offline as py
py.init_notebook_mode(connected=True)
import plotly.graph_objs as go
import pandas as pd
from math import log10
df = pd.read_csv('/project/shared/astro/data/cities.csv')   # lists name,pop,lat,lon for 254 Canadian cities and towns
df['text'] = df['name'] + '<br>Population ' + \
             (df['pop']/1e6).astype(str) +' million' # add new column for mouse-over

largest, smallest = df['pop'].max(), df['pop'].min()
def normalize(x):
    return log10(x/smallest)/log10(largest/smallest)   # x scaled into [0,1]

df['logsize'] = round(df['pop'].apply(normalize)*255)   # new column
cities = go.Scattergeo(
    lon = df['lon'], lat = df['lat'], text = df['text'],
    marker = dict(
        size = df['pop']/5000,
        color = df['logsize'],
        colorscale = 'Viridis',
        showscale = True,   # show the colourbar
        line = dict(width=0.5, color='rgb(40,40,40)'),
        sizemode = 'area'))
layout = go.Layout(title = 'City populations',
                       showlegend = False,   # do not show legend for first plot
                       geo = dict(
                           scope = 'north america',
                           resolution = 50,   # base layer resolution of km/mm
                           lonaxis = dict(range=[-130,-55]), lataxis = dict(range=[44,70]), # plot range
                           showland = True, landcolor = 'rgb(217,217,217)',
                           showrivers = True, rivercolor = 'rgb(153,204,255)',
                           showlakes = True, lakecolor = 'rgb(153,204,255)',
                           subunitwidth = 1, subunitcolor = "rgb(255,255,255)",   # province border
						   countrywidth = 2, countrycolor = "rgb(255,255,255)"))  # country border
fig = go.Figure(data=[cities], layout=layout)
py.iplot(fig)

Exercise 5

Modify the code to display only 10 largest cities.

Recall how we combined several scatter plots in one figure before. You can combine several plots on top of a single map – let’s combine scattergeo + choropleth:

import plotly.offline as py
py.init_notebook_mode(connected=True)
import plotly.graph_objs as go
import pandas as pd
df = pd.read_csv('/project/shared/astro/data/cities.csv')
df['text'] = df['name'] + '<br>Population ' + \
             (df['pop']/1e6).astype(str)+' million' # add new column for mouse-over
cities = go.Scattergeo(lon = df['lon'],
                       lat = df['lat'],
                       text = df['text'],
                       marker = dict(
                           size = df['pop']/5000,
                           color = "lightblue",
                           line = dict(width=0.5, color='rgb(40,40,40)'),
                           sizemode = 'area'))
gdp = pd.read_csv('/project/shared/astro/data/gdp.csv')   # read name, gdp, code for 222 countries
c1 = [0,"rgb(5, 10, 172)"]     # define colourbar from top (0) to bottom (1)
c2, c3 = [0.35,"rgb(40, 60, 190)"], [0.5,"rgb(70, 100, 245)"]
c4, c5 = [0.6,"rgb(90, 120, 245)"], [0.7,"rgb(106, 137, 247)"]
c6 = [1,"rgb(220, 220, 220)"]
countries = go.Choropleth(locations = gdp['CODE'],
                          z = gdp['GDP (BILLIONS)'],
                          text = gdp['COUNTRY'],
                          colorscale = [c1,c2,c3,c4,c5,c6],
                          autocolorscale = False,
                          reversescale = True,
                          marker = dict(line = dict(color='rgb(180,180,180)',width = 0.5)),
                          zmin = 0,
                          colorbar = dict(tickprefix = '$',title = 'GDP<br>Billions US$'))
layout = go.Layout(hovermode = "x", showlegend = False)  # do not show legend for first plot
fig = go.Figure(data=[cities,countries], layout=layout)
py.iplot(fig)

3D plots

Topographic elevation

Let’s plot some tabulated topographic elevation data:

import plotly.offline as py
py.init_notebook_mode(connected=True)
import plotly.graph_objs as go
import pandas as pd
table = pd.read_csv('/project/shared/astro/data/mt_bruno_elevation.csv')
data = go.Surface(z=table.values)  # use 2D numpy array format
layout = go.Layout(title='Mt Bruno Elevation',
                   width=800, height=800,    # image size
                   margin=dict(l=65, r=10, b=65, t=90))   # margins around the plot
fig = go.Figure(data=[data], layout=layout)
py.iplot(fig)

Elevated 2D functions

Exercise 6

Plot a 2D function f(x,y) = (1−y) sin(πx) + y sin^2(2πx), where x,y ∈ [0,1] on a 100^2 grid.

Let’s define a different colourmap by adding colorscale='Viridis' inside go.Surface(). This is our current code:

import plotly.offline as py
py.init_notebook_mode(connected=True)
import plotly.graph_objs as go
from numpy import *
n = 100   # plot resolution
x = linspace(0,1,n)
y = linspace(0,1,n)
Y, X = meshgrid(x, y)   # meshgrid() returns two 2D arrays storing x/y respectively at each mesh point
F = (1-Y)*sin(pi*X) + Y*(sin(2*pi*X))**2   # array operation
data = go.Surface(z=F, colorscale='Viridis')
layout = go.Layout(width=1000, height=1000, scene=go.Scene(zaxis=go.layout.scene.ZAxis(range=[-1,2])));
fig = go.Figure(data=[data], layout=layout)
py.iplot(fig)

Lighting control

Let’s change the default light in the room by adding lighting=dict(ambient=0.1) inside go.Surface(). Now our plot is much darker!

  • ambient controls the light in the room (default = 0.8)
  • roughness controls amount of light scattered (default = 0.5)
  • diffuse controls the reflection angle width (default = 0.8)
  • fresnel controls light washout (default = 0.2)
  • specular induces bright spots (default = 0.05)

Let’s try lighting=dict(ambient=0.1,specular=0.3) – now we have lots of specular light!

Parametric plots

In plotly documentation you can find quite a lot of different 3D plot types. Here is something visually very different, but it still uses go.Surface(x,y,z):

import plotly.offline as py
py.init_notebook_mode(connected=True)
import plotly.graph_objs as go
from numpy import pi, sin, cos, mgrid
dphi, dtheta = pi/250, pi/250    # 0.72 degrees
[phi, theta] = mgrid[0:pi+dphi*1.5:dphi, 0:2*pi+dtheta*1.5:dtheta]
        # define two 2D grids: both phi and theta are (252,502) numpy arrays
r = sin(4*phi)**3 + cos(2*phi)**3 + sin(6*theta)**2 + cos(6*theta)**4
x = r*sin(phi)*cos(theta)   # x is also (252,502)
y = r*cos(phi)              # y is also (252,502)
z = r*sin(phi)*sin(theta)   # z is also (252,502)
surface = go.Surface(x=x, y=y, z=z, colorscale='Viridis')
layout = go.Layout(title='parametric plot')
fig = go.Figure(data=[surface], layout=layout)
py.iplot(fig)

Scatter plots

Let’s take a look at a 3D scatter plot using the country index data from http://www.prosperity.com for for 142 countries:

import plotly.offline as py
py.init_notebook_mode(connected=True)
import plotly.graph_objs as go
import pandas as pd
df = pd.read_csv('/project/shared/astro/data/legatum2015.csv')
spheres = go.Scatter3d(x=df.economy,
                       y=df.entrepreneurshipOpportunity,
                       z=df.governance,
                       text=df.country,
                       mode='markers',
                       marker=dict(
                           sizemode = 'diameter',
                           sizeref = 0.3,   # max(safetySecurity+5.5) / 32
                           size = df.safetySecurity+5.5,
                           color = df.education,
                           colorscale = 'Viridis',
                           colorbar = dict(title = 'Education'),
                           line = dict(color='rgb(140, 140, 170)')))   # sphere edge
layout = go.Layout(height=900, width=900,
                   title='Each sphere is a country sized by safetySecurity',
                   scene = dict(xaxis=dict(title='economy'),
                                yaxis=dict(title='entrepreneurshipOpportunity'),
                                zaxis=dict(title='governance')))
fig = go.Figure(data=[spheres], layout=layout)
py.iplot(fig)

Graphs

We can plot 3D graphs. Consider a Dorogovtsev-Goltsev-Mendes graph: in each subsequent generation, every edge from the previous generation yields a new node, and the new graph can be made by connecting together three previous-generation graphs.

import plotly.offline as py
py.init_notebook_mode(connected=True)
import plotly.graph_objs as go
import networkx as nx
from forceatlas import forceatlas2_layout
import sys
generation = 5
H = nx.dorogovtsev_goltsev_mendes_graph(generation)
print(H.number_of_nodes(), 'nodes and', H.number_of_edges(), 'edges')
# Force Atlas 2 graph layout from https://github.com/tpoisot/nxfa2.git
pos = forceatlas2_layout(H, iterations=100, kr=0.001, dim=3)
Xn = [pos[i][0] for i in pos]   # x-coordinates of all nodes
Yn = [pos[i][1] for i in pos]   # y-coordinates of all nodes
Zn = [pos[i][2] for i in pos]   # z-coordinates of all nodes
Xe, Ye, Ze = [], [], []
for edge in H.edges():
    Xe += [pos[edge[0]][0], pos[edge[1]][0], None]   # x-coordinates of all edge ends
    Ye += [pos[edge[0]][1], pos[edge[1]][1], None]   # y-coordinates of all edge ends
    Ze += [pos[edge[0]][2], pos[edge[1]][2], None]   # z-coordinates of all edge ends

degree = [deg[1] for deg in H.degree()]   # list of degrees of all nodes
labels = [str(i) for i in range(H.number_of_nodes())]
edges = go.Scatter3d(x=Xe, y=Ye, z=Ze,
                     mode='lines',
                     marker=dict(size=12,line=dict(color='rgba(217, 217, 217, 0.14)',width=0.5)),
                     hoverinfo='none')
nodes = go.Scatter3d(x=Xn, y=Yn, z=Zn,
                     mode='markers',
                     marker=dict(sizemode = 'area',
                                 sizeref = 0.01, size=degree,
                                 color=degree, colorscale='Viridis',
                                 line=dict(color='rgb(50,50,50)', width=0.5)),
                     text=labels, hoverinfo='text')

axis = dict(showline=False, zeroline=False, showgrid=False, showticklabels=False, title='')
layout = go.Layout(
    title = str(generation) + "-generation Dorogovtsev-Goltsev-Mendes graph",
    width=1000, height=1000,
    showlegend=False,
    scene=dict(xaxis=go.layout.scene.XAxis(axis),
               yaxis=go.layout.scene.YAxis(axis),
               zaxis=go.layout.scene.ZAxis(axis)),
    margin=go.layout.Margin(t=100))
fig = go.Figure(data=[edges,nodes], layout=layout)
py.iplot(fig)

3D functions

Let’s create an isosurface of a decoCube function at f=0.03. Isosurfaces are returned as a list of polygons, and for plotting polygons in plotly we need to use plotly.figure_factory.create_trisurf() which replaces plotly.graph_objs.Figure():

import plotly.offline as py
py.init_notebook_mode(connected=True)
from plotly import figure_factory as FF
from numpy import mgrid
from skimage import measure
X,Y,Z = mgrid[-1.2:1.2:30j, -1.2:1.2:30j, -1.2:1.2:30j] # three 30^3 grids, each side [-1.2,1.2] in 30 steps
F = ((X*X+Y*Y-0.64)**2 + (Z*Z-1)**2) * \
    ((Y*Y+Z*Z-0.64)**2 + (X*X-1)**2) * \
    ((Z*Z+X*X-0.64)**2 + (Y*Y-1)**2)
vertices, triangles, normals, values = measure.marching_cubes_lewiner(F, 0.03)  # create an isosurface
x,y,z = zip(*vertices)   # zip(*...) is opposite of zip(...): unzips a list of tuples
fig = FF.create_trisurf(x=x, y=y, z=z, plot_edges=False,
                        simplices=triangles, title="Isosurface", height=900, width=900)
py.iplot(fig)

Try switching plot_edges=False to plot_edges=True – you’ll see individual polygons!