3D plots with Plotly

plotly’ is a popular graphing package in Python, but also has an R interface. What makes plotly really cool is that it is by default interactive, which also makes it great for 3D plots. In this lesson we will make 3D plots in plotly.

Getting ready

First, let’s install the required packages for this tutorial plotly for plotting and reshape2 for making a best fit surface.

install.packages("plotly")
install.packages("reshape2")

Next we need to load up the packages.

library(plotly)
library(reshape2)

Loading up and checking the data

Then we load up and check our fake data

Data<-read.csv("Fake_data_3d_plots.csv")

str(Data)
## 'data.frame':	50 obs. of  4 variables:
##  $ X          : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ Growth_Rate: num  207.4 30.7 93 136.7 181.2 ...
##  $ Density_1  : num  25.5 30.9 37.9 24.3 29.6 ...
##  $ Density_2  : num  25.8 40.3 27.2 36.4 24.8 ...

Our for data set has the density of 2 different species, and the growth rate for species 1.

Making the 3D scatter plot

The structure of plot_ly is similar to ggplot2. We first supply the name of data frame, and then map variables from our data frame to specific “aesthetics” (e.g., x-axis, y-axis, etc.). However, for plot_ly these are just separate parameters and not all within the aes() function. For this mapping we also need to use ~ in front of each variable.

To make a 3D scatter plot we need to specify an x-axis, y-axis, and z-axis. We also need to set type="scatter3d". Like ggplot2 we can save the plot as a variable to use later on. Finally the name parameter doesn’t show up, but is used to reference specific layers when customizing the plot (which we will use later).

p<-plot_ly(Data,x=~Density_1,y=~Density_2,z=~Growth_Rate,name="Data",type="scatter3d")

p

Plotly is interactive, so you can zoom in, zoom out, change the angle, and get information on specific data points.

Add predictions

Fit a model

Say we want to add predictions, so lets try to fit a model using lm with growth rate of species 1 as a function of density of species 1, density of species 2, and their interaction.

We use data argument to specify what data frame we are using, then put in the formula based on column names. Density_1:Density_2.

We can then use summary(model_1) to get information on the model we just fit.

model_1<-lm(data=Data,Growth_Rate ~ Density_1+Density_2+Density_1:Density_2)

summary(model_1)
## 
## Call:
## lm(formula = Growth_Rate ~ Density_1 + Density_2 + Density_1:Density_2, 
##     data = Data)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -4.494 -1.696  0.242  1.490  3.764 
## 
## Coefficients:
##                      Estimate Std. Error t value Pr(>|t|)    
## (Intercept)         396.78887    9.57549  41.438   <2e-16 ***
## Density_1             0.14535    0.31174   0.466    0.643    
## Density_2             0.25639    0.32534   0.788    0.435    
## Density_1:Density_2  -0.30577    0.01062 -28.781   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.102 on 46 degrees of freedom
## Multiple R-squared:  0.9992,	Adjusted R-squared:  0.9991 
## F-statistic: 1.905e+04 on 3 and 46 DF,  p-value: < 2.2e-16

Making Predictions

Next we can use the predict function to make predictions based on our model. The first argument is the saved model that we fit above, and the newdata argument is a data frame that has variables that match the variable names used to fit the model (in this case Density_1 and Density_2). The function will generate predictions at these these values. In this case we can supply the original data frame because we want to generate predictions of the model at the original data to visually see how well the model fits our data.

predictions<-predict(model_1, newdata = Data)
predictions
##         1         2         3         4         5         6         7         8 
## 205.76162  30.26453  93.86870 138.82142 183.33419 217.30808 117.84127 104.46095 
##         9        10        11        12        13        14        15        16 
## -23.99789  67.16232 200.23285 -16.46752 180.61361 174.78481  23.73701 260.03999 
##        17        18        19        20        21        22        23        24 
## 196.53111 153.53361  83.52809 157.02114  95.07185 176.11158  69.93366  19.53620 
##        25        26        27        28        29        30        31        32 
## 135.82457 267.75781  49.23855 129.68867  44.59277 186.56241  50.26576 198.99010 
##        33        34        35        36        37        38        39        40 
## 113.24112 204.67605 254.03568  87.59822 232.06791 185.11934 174.48324 128.26608 
##        41        42        43        44        45        46        47        48 
##  84.30696 172.30643 216.79693 128.05856 180.18955 118.72169  64.66046 137.41530 
##        49        50 
## 178.34330 209.05358

Adding predictions

Now we can add a new layer to the first 3D scatter plot that includes points of our predictions. We add new layers using %>%add_trace(...)

Like before we need to specify x, y, and z axes. Since we haven’t added the predictions (our z axis) to a dataframe we will put vectors into these arguments. Because they are vectors we don’t need to add the ~ like we did before.We will use the name parameter, for the legend (so it knows what to call this layer). marker=list(color="red") is how we specify the color of the predictions.

p_w_prediction<- p %>%add_trace(x = Data$Density_1, y = Data$Density_2, z =predictions , type = "scatter3d",name="Predictions_model_1",marker=list(color="red")) 
p_w_prediction
## No scatter3d mode specifed:
##   Setting the mode to markers
##   Read more about this attribute -> https://plotly.com/r/reference/#scatter-mode
## No scatter3d mode specifed:
##   Setting the mode to markers
##   Read more about this attribute -> https://plotly.com/r/reference/#scatter-mode

Prediction surface

While it was cool to generate predictions at the same points we have data at, it’s probably more informative to generate predictions across the different combinations of densities. In a 2D scatter plot we can just use a line, but in 3D we need to make prediction surface/plane.

Make the prediction

First, we need to generate predictions across different possible combinations of the two density variables.

We will use the seq function which generates a vector of numbers in a sequence. The first argument is the minimum value, second argument is maximum value, and the third argument is the step of the sequence. To get those values, we will use the minimums and maximums of our density variables +/- 2 to give a little wider range for our surface.

SimD1 <- seq(min(Data$Density_1)-2,max(Data$Density_1)+2, 0.5)
SimD2<- seq(min(Data$Density_2)-2, max(Data$Density_2)+2, 0.5)
SimD1
##  [1] 15.74147 16.24147 16.74147 17.24147 17.74147 18.24147 18.74147 19.24147
##  [9] 19.74147 20.24147 20.74147 21.24147 21.74147 22.24147 22.74147 23.24147
## [17] 23.74147 24.24147 24.74147 25.24147 25.74147 26.24147 26.74147 27.24147
## [25] 27.74147 28.24147 28.74147 29.24147 29.74147 30.24147 30.74147 31.24147
## [33] 31.74147 32.24147 32.74147 33.24147 33.74147 34.24147 34.74147 35.24147
## [41] 35.74147 36.24147 36.74147 37.24147 37.74147 38.24147 38.74147 39.24147
## [49] 39.74147 40.24147 40.74147 41.24147 41.74147 42.24147
SimD2
##  [1] 16.9604 17.4604 17.9604 18.4604 18.9604 19.4604 19.9604 20.4604 20.9604
## [10] 21.4604 21.9604 22.4604 22.9604 23.4604 23.9604 24.4604 24.9604 25.4604
## [19] 25.9604 26.4604 26.9604 27.4604 27.9604 28.4604 28.9604 29.4604 29.9604
## [28] 30.4604 30.9604 31.4604 31.9604 32.4604 32.9604 33.4604 33.9604 34.4604
## [37] 34.9604 35.4604 35.9604 36.4604 36.9604 37.4604 37.9604 38.4604 38.9604
## [46] 39.4604 39.9604 40.4604 40.9604 41.4604 41.9604

Next, we need to get all possible combinations for these values. We can do that with the expand.grid. This creates a data frame with all possible combinations of the vectors we feed into the function. We can set the name of the columns in the result with =:

surface<- expand.grid(Density_1 = SimD1, Density_2= SimD2)	
head(surface)
##   Density_1 Density_2
## 1  15.74147   16.9604
## 2  16.24147   16.9604
## 3  16.74147   16.9604
## 4  17.24147   16.9604
## 5  17.74147   16.9604
## 6  18.24147   16.9604

Now we can generate predictions like we did before with the predict function. Rather than save the results in a seperate vector, let’s just add a new column to the surface data frame:

surface$Growth_Rate<- predict(model_1, newdata = surface)		
head(surface)
##   Density_1 Density_2 Growth_Rate
## 1  15.74147   16.9604    321.7911
## 2  16.24147   16.9604    319.2708
## 3  16.74147   16.9604    316.7505
## 4  17.24147   16.9604    314.2302
## 5  17.74147   16.9604    311.7099
## 6  18.24147   16.9604    309.1896

Unfortunately, to plot the prediction surface we need to format the data in a specific way, specifically matrix format. Where the rows are density 1, columns are density 2 and the numbers inside are the predictions. We can achieve this with the acast function from reshape2. This is how we use that function: acast(data frame to transform, first predictor variable ~ second predictor variable, value.var = depednent variable column name)

surface_reshaped<-acast(surface, Density_1 ~ Density_2, value.var = "Growth_Rate")
surface_reshaped
##                  16.9604011059768 17.4604011059768 17.9604011059768
## 15.7414680607694         321.7911         319.5126         317.2342
## 16.2414680607694         319.2708         316.9159         314.5610
## 16.7414680607694         316.7505         314.3192         311.8879
...

Adding the surface

Next, let’s add the surface to the original plot. We put will supply the matrix into the z axis, and then the vectors that we used to create the prediction surface in the other axis columns. Finally we need to specify type="surface.

p_w_surface <- p %>% add_trace(x = SimD1, y = SimD2, z =surface_reshaped , type = "surface") 
p_w_surface
## No scatter3d mode specifed:
##   Setting the mode to markers
##   Read more about this attribute -> https://plotly.com/r/reference/#scatter-mode

To make it look nicer, let’s just change the labels with the scene paramater in the layout() function.

p_w_surface %>% layout(scene = list(xaxis = list(title = 'Density 1'),
                                     yaxis = list(title = 'Density 2'),
                                     zaxis = list(title = 'Predicted Growth Rate'))) 
## No scatter3d mode specifed:
##   Setting the mode to markers
##   Read more about this attribute -> https://plotly.com/r/reference/#scatter-mode
Matthew Kustra
Matthew Kustra
Postdoctoral fellow

My research interests include sexual selection, speciation, and endosymbionts.

Related