# Load the required libraries for 3D plots, suppressing startup messages
library(scatterplot3d, quietly = TRUE, warn.conflicts = FALSE)
Three Dimensional (3D) Data
Chapter 16.
Overview of R packages for 3D Plots
This chapter demonstrates how to create 3D plots in R, focusing on the ggplot2 and scatterplot3d packages. It begins by discussing the limitations of ggplot2, which cannot produce true 3D plots but can simulate 3D effects using size, color, and transparency. The chapter then introduces the scatterplot3d package for creating genuine 3D scatter plots, highlighting its simplicity and effectiveness in visualizing data in three dimensions.
ggplot2
The ggplot2 package in R is primarily designed for creating 2D plots. It cannot produce true 3D plots, but we can use it to simulate 3D effects. [1]
scatterplot3d
The scatterplot3d package in R can be used to create 3D scatter plots. It is quite simple to use and can be a good choice for quick visualizations. [2]
rgl
The rgl package in R is a powerful tool for creating interactive 3D plots. It allows real-time interaction with the plots: users can zoom, rotate, and pan the visualization to explore the data in three dimensions. The package supports a variety of 3D plots, including scatter plots, surface plots, line plots, and bar plots. [3]
In this chapter, we focus our attention on ggplot2 and scatterplot3d packages.
Data: Suppose we run the following code to prepare the mtcars data for subsequent analysis and save it in a tibble called tb.
# Load the required libraries, suppressing annoying startup messages
library(dplyr, quietly = TRUE, warn.conflicts = FALSE)
library(tibble, quietly = TRUE, warn.conflicts = FALSE)
library(ggplot2, quietly = TRUE, warn.conflicts = FALSE) # For data visualization
library(ggpubr, quietly = TRUE, warn.conflicts = FALSE) # For data visualization
# Read the mtcars dataset into a tibble called tb
data(mtcars)
tb <- as_tibble(mtcars)
# Convert relevant columns into factor variables
tb$cyl <- as.factor(tb$cyl) # cyl = {4,6,8}, number of cylinders
tb$am <- as.factor(tb$am) # am = {0,1}, 0:automatic, 1: manual transmission
tb$vs <- as.factor(tb$vs) # vs = {0,1}, engine shape, 0: V-shaped, 1: straight
tb$gear <- as.factor(tb$gear) # gear = {3,4,5}, number of gears
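Several later examples index colors and point shapes with as.numeric(tb$cyl), so it helps to check how the factor conversion encodes the cylinder counts. The following is a minimal sketch using only base R on the built-in mtcars data; the name cyl_factor is introduced here for illustration and is not part of the chapter's code.

```r
# Illustrative check (base R only); cyl_factor is a hypothetical helper name
data(mtcars)
cyl_factor <- as.factor(mtcars$cyl)

# Factor levels are sorted, so the levels are "4", "6", "8"
levels(cyl_factor)

# as.numeric() on a factor returns the level index (1, 2, 3),
# not the original cylinder count (4, 6, 8).
# The first three cars have 6, 6, and 4 cylinders, giving 2 2 1.
head(as.numeric(cyl_factor), 3)

# Number of cars per cylinder count
table(cyl_factor)
```

To recover the original counts from the factor, convert through character first: as.numeric(as.character(cyl_factor)).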
3D Plots using ggplot2
The ggplot2 package in R cannot create true 3D plots. Here is how to simulate a 3D plot using ggplot2. [1]
Simulating 3D using ggplot2
# Load necessary library
library(ggplot2)
# Create a scatter plot with the perception of depth using size and color
ggplot(tb,
       aes(x = wt,
           y = mpg,
           color = hp,
           size = hp)) + # Define the data and aesthetics for the plot
  geom_point(alpha = 0.6) + # Add points to the plot with transparency
  scale_color_gradient(low = "blue", high = "red") + # Define color gradient
  theme_minimal() + # Set the plot theme to minimal
  labs(title = "3D Effect Scatter plot",
       x = "Weight",
       y = "Miles Per Gallon",
       color = "Horsepower",
       size = "Horsepower") # Set plot labels and title
Discussion:
The code provided simulates a 3D plot by exploiting visual cues and perceptual aspects of human vision. It uses the following strategies to simulate depth and give the perception of a three-dimensional plot on a two-dimensional surface:
- Size Encoding: The size = hp aesthetic in the aes() function maps the hp (horsepower) variable to the size of the points in the scatter plot. Points representing cars with higher horsepower appear larger, while those with lower horsepower appear smaller. This varying point size provides a depth cue, simulating the z-axis on a 2D surface.
- Color Encoding: The color = hp aesthetic assigns different colors to the points based on their horsepower values. The scale_color_gradient(low = "blue", high = "red") function further specifies that lower horsepower should be represented with blue, transitioning to red for higher horsepower. The gradient color scale serves as another depth cue, indicating that points of different colors are at different ‘depths’.
- Alpha Transparency: The alpha = 0.6 parameter in the geom_point() function adjusts the transparency of the points. This transparency allows for better visibility of overlapping points, giving a sense of depth where points are clustered.
- Axis Mapping: The x = wt and y = mpg arguments within the aes() function map the weight of the cars to the x-axis and the miles per gallon to the y-axis. These two axes represent the spatial dimensions on the plotting surface.
This plot has similarities to the classic bubble plot. The application of a color gradient gives the illusion of 3D.
By combining these elements, the plot employs size, color gradient, and transparency to introduce visual cues of depth, effectively simulating a 3D effect on a 2D plotting surface. This enables the viewer to infer relationships among wt, mpg, and hp, even though the plot itself is inherently two-dimensional. [1]
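One way to strengthen or soften the size cue described above is to set the range of point sizes explicitly. The sketch below assumes the ggplot2 package is installed and uses mtcars directly rather than tb so that it runs on its own; the object name p and the size range c(1, 8) are illustrative choices, not values from the chapter.

```r
library(ggplot2)

# Same encoding as in the chapter's example: hp drives both color and size
p <- ggplot(mtcars, aes(x = wt, y = mpg, color = hp, size = hp)) +
  geom_point(alpha = 0.6) +
  scale_color_gradient(low = "blue", high = "red") +
  scale_size(range = c(1, 8)) + # illustrative range: smallest and largest point size
  theme_minimal() +
  labs(x = "Weight", y = "Miles Per Gallon",
       color = "Horsepower", size = "Horsepower")

print(p)
```

A wider range exaggerates the difference between low and high horsepower cars, while a narrower range reduces overlap among the larger points.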
3D Plots using scatterplot3d
The scatterplot3d package can be used on the mtcars data to demonstrate visualization in three dimensions.
# Load the scatterplot3d library for 3D plotting
library(scatterplot3d)
# Create a 3D scatter plot using the tb tibble
sp3 <- scatterplot3d(x = tb$wt,
                     y = tb$mpg,
                     z = tb$hp,
                     xlab = "Weight",
                     ylab = "Miles Per Gallon",
                     zlab = "Horsepower",
                     pch = 16, color = "blue",
                     main = "3D Scatter Plot of mtcars")
Discussion:
- x = tb$wt, y = tb$mpg, and z = tb$hp map the wt, mpg, and hp columns from the tb tibble to the x, y, and z axes of the 3D scatter plot, respectively.
- The xlab, ylab, and zlab parameters label the x, y, and z axes.
- pch = 16 specifies the type of point to be used in the plot.
- color = "blue" sets the color of the points to blue.
- main sets the title of the plot.
- Overall, this code produces a 3D scatter plot showing the relationships between wt, mpg, and hp in the mtcars dataset. [2]
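Two further arguments of scatterplot3d() can make depth easier to read: type = "h" draws a vertical line from each point down to the x-y plane, and angle sets the angle between the x and y axes. The sketch below assumes the scatterplot3d package is installed and uses mtcars directly so that it runs on its own; the name sp_h and the value angle = 55 are illustrative choices.

```r
library(scatterplot3d)

# sp_h is a hypothetical name; angle = 55 is an arbitrary illustrative value
sp_h <- scatterplot3d(x = mtcars$wt,
                      y = mtcars$mpg,
                      z = mtcars$hp,
                      type = "h", # vertical lines down to the x-y plane
                      angle = 55, # angle between the x and y axes
                      pch = 16, color = "blue",
                      xlab = "Weight",
                      ylab = "Miles Per Gallon",
                      zlab = "Horsepower",
                      main = "3D Scatter Plot with Vertical Lines")

# The returned list holds helper functions for adding to the plot later
names(sp_h)
```

The vertical lines anchor each point to its (weight, mpg) position on the floor of the plot, which makes it easier to judge where a point sits in depth.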
Variations of 3D Plots using scatterplot3d
Here are some variations and extensions of the original scatterplot3d code to demonstrate the versatility of 3D visualizations:
Color by a Variable:
We can color points by a variable, for example, by the number of cylinders (cyl), which can reveal additional patterns in the data.
# Load the scatterplot3d library for 3D plotting
library(scatterplot3d)
# Create a 3D scatter plot, with points colored by cylinders
sp32 <- scatterplot3d(x = tb$wt,
                      y = tb$mpg,
                      z = tb$hp,
                      color = as.numeric(tb$cyl), # Color by cyl
                      pch = 16, # Use a specific point symbol
                      xlab = "Weight",
                      ylab = "Miles Per Gallon",
                      zlab = "Horsepower",
                      main = "3D Scatter Plot Colored by Cylinders")
- We can add a legend for the colors in the scatterplot3d plot by combining the color argument of scatterplot3d() with the legend function. [2]
library(scatterplot3d)
# Define a color palette, one color per cylinder level
color_palette <- rainbow(length(unique(tb$cyl)))
# Create a scatterplot3d and specify colors by the number of cylinders
sp33 <- scatterplot3d(x = tb$wt, y = tb$mpg, z = tb$hp,
                      color = color_palette[as.numeric(tb$cyl)],
                      pch = 16,
                      xlab = "Weight", ylab = "MPG", zlab = "Horsepower",
                      main = "3D Scatter Plot Colored by Number of Cylinders")
# Add a color legend; levels() returns the labels in the same order
# (4, 6, 8) that as.numeric() uses to index the palette
legend("right",
       legend = levels(tb$cyl),
       fill = color_palette,
       title = "Cylinders")
Discussion:
- We defined a color palette using the rainbow function based on the number of unique cylinder values in tb$cyl.
- We then used this color palette to color the points in the scatter plot.
- Finally, we used the legend function to manually add a color legend to the right of the plot. The legend function takes the unique cylinder values as the labels for the legend and the colors from the color palette as the fill colors. The title of the legend is set to “Cylinders”. [2]
Change Point Style:
We can change the point style to better distinguish between data points. [2]
# Create a scatterplot3d and specify point shapes by the number of cylinders
sp34 <- scatterplot3d(x = tb$wt,
                      y = tb$mpg,
                      z = tb$hp,
                      pch = as.numeric(tb$cyl), # point shapes
                      color = "blue", # point color to blue
                      xlab = "Weight",
                      ylab = "Miles per Gallon",
                      zlab = "Horsepower",
                      main = "3D Scatter Plot with Point Styles")
# Get unique cylinder values and corresponding point shapes
cylinders <- unique(tb$cyl)
shapes <- as.numeric(cylinders)
# Add a legend for point shapes
legend("right",
       legend = as.character(cylinders), # cylinders
       pch = shapes, # Use point shapes as legend
       title = "Cylinders") # Set legend title
Discussion:
This R code visualizes relationships among the wt, mpg, and hp columns from the tb tibble, with points having different shapes based on the number of cylinders (cyl).
Creation of 3D Scatter Plot
- sp34 <- scatterplot3d(...): This line creates a 3D scatter plot and stores the result in the sp34 variable.
- x = tb$wt, y = tb$mpg, z = tb$hp: These arguments define the variables that will be plotted on the x, y, and z axes, respectively.
- pch = as.numeric(tb$cyl): This argument specifies the point shape (pch) to vary based on the cyl column, with different cylinders represented by different shapes. Because cyl is a factor with levels 4, 6, and 8, as.numeric() returns the level indices 1, 2, and 3, and these are used as the pch codes.
- color = "blue": All points are colored blue.
- xlab, ylab, zlab: These arguments label the axes of the plot.
- main: This argument provides the main title of the plot.
Determination of Unique Cylinder Values and Shapes
- cylinders <- unique(tb$cyl): This line identifies the unique values in the cyl column and stores them in the cylinders variable.
- shapes <- as.numeric(cylinders): The unique cylinder values are converted to numeric form to determine the corresponding point shapes, which are stored in shapes.
Addition of Legend
- legend("right", ...): This function is used to add a legend to the right of the plot.
- legend = as.character(cylinders): This argument defines the labels of the legend, which are the unique cylinder values converted to character strings.
- pch = shapes: The point shapes corresponding to the unique cylinder values are used in the legend.
- title = "Cylinders": The title of the legend is set as “Cylinders”.
By executing this code, we get a 3D scatter plot with points having different shapes based on the number of cylinders, and a corresponding legend clarifying the representation of each shape. [2]
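Converting the cyl factor with as.numeric() yields the shape codes 1, 2, and 3, which are the open circle, triangle, and plus symbols. If filled symbols are preferred, the shape codes can be chosen explicitly through a named lookup. The sketch below is one way to do this, assuming the scatterplot3d package is installed; the names cyl_factor, cyl_shapes, point_shapes, and sp_shapes and the choice of codes 15, 16, and 17 are hypothetical.

```r
library(scatterplot3d)

cyl_factor <- as.factor(mtcars$cyl)

# Hypothetical choice of filled symbols: 15 = square, 16 = circle, 17 = triangle
cyl_shapes <- c(15, 16, 17)
names(cyl_shapes) <- levels(cyl_factor) # "4", "6", "8"

# Look up the shape code for each car by its cylinder value
point_shapes <- unname(cyl_shapes[as.character(cyl_factor)])

sp_shapes <- scatterplot3d(x = mtcars$wt, y = mtcars$mpg, z = mtcars$hp,
                           pch = point_shapes,
                           color = "blue",
                           xlab = "Weight",
                           ylab = "Miles per Gallon",
                           zlab = "Horsepower",
                           main = "3D Scatter Plot with Filled Point Styles")

# Legend symbols use the same lookup and the same color as the points
legend("right",
       legend = names(cyl_shapes),
       pch = cyl_shapes,
       col = "blue",
       title = "Cylinders")
```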
Highlight Specific Points:
We can highlight specific points, for example, cars with 4 cylinders, by changing their color or size. [2]
# Create a scatterplot3d with specific point highlighting
sp35 <- scatterplot3d(x = tb$wt,
                      y = tb$mpg,
                      z = tb$hp,
                      pch = 16, # Use a specific point shape
                      color = ifelse(tb$cyl == 4, "red", "blue"),
                      # Color points based on the number of cylinders
                      xlab = "Weight",
                      ylab = "Miles per Gallon",
                      zlab = "Horsepower",
                      main = "3D Scatter Plot Highlighting Points")
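A highlighted point can also be labeled. The list returned by scatterplot3d() includes an xyz.convert() function that translates 3D coordinates into positions on the 2D plotting device, which can then be passed to text(). The sketch below assumes the scatterplot3d package is installed; it uses mtcars directly because as_tibble() drops the row names that hold the car names. The names sp_lab, best, and xy, and the choice of which car to label, are illustrative.

```r
library(scatterplot3d)

# Highlight the four-cylinder cars in red
sp_lab <- scatterplot3d(x = mtcars$wt, y = mtcars$mpg, z = mtcars$hp,
                        pch = 16,
                        color = ifelse(mtcars$cyl == 4, "red", "blue"),
                        xlab = "Weight",
                        ylab = "Miles per Gallon",
                        zlab = "Horsepower",
                        main = "3D Scatter Plot with a Labeled Point")

# Pick the car with the highest mpg (an arbitrary illustrative choice)
best <- which.max(mtcars$mpg)

# Convert its 3D coordinates to 2D positions on the plotting device
xy <- sp_lab$xyz.convert(mtcars$wt[best], mtcars$mpg[best], mtcars$hp[best])

# Write the car name to the right of the point
text(xy$x, xy$y, labels = rownames(mtcars)[best], pos = 4, cex = 0.8)
```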
Add Regression Plane:
We can fit a linear regression model and add the regression plane to the scatter plot. [2]
library(scatterplot3d)
# Create a 3D scatter plot
sp36 <- scatterplot3d(x = tb$wt, y = tb$mpg, z = tb$hp,
                      pch = 16, color = "blue",
                      xlab = "Weight",
                      ylab = "Miles per Gallon",
                      zlab = "Horsepower",
                      main = "3D Scatter Plot with Regression Plane")
# Fit a linear model and add a regression plane to 3D plot
model <- lm(hp ~ wt + mpg, data = tb)
sp36$plane3d(model)
Discussion:
- This code snippet creates a 3D scatter plot and then adds a linear regression plane fitted to the plotted data points.
- Fit a Linear Model
  - model <- lm(hp ~ wt + mpg, data = tb): A linear model is fitted using the lm function with hp as the dependent variable and wt and mpg as the independent variables. The resulting model object is stored in the variable model.
- Add a Regression Plane to the Plot
  - sp36$plane3d(model): The plane3d function is accessed from the sp36 object returned by scatterplot3d() and is used to add a regression plane to the existing 3D scatter plot. The plane is based on the coefficients of the linear model stored in model.
- The resulting output will be a 3D scatter plot visualizing the relationships among weight (wt), miles per gallon (mpg), and horsepower (hp), with a regression plane fitted to these points, representing the linear relationship between the dependent and independent variables. [2], [4]
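The regression plane is the surface hp = b0 + b1 * wt + b2 * mpg, where b0, b1, and b2 are the fitted coefficients. The sketch below, which uses only base R and mtcars directly, checks this by computing the height of the plane by hand and comparing it with predict(). The example car with wt = 3 and mpg = 20 is a hypothetical input, and the names b, manual, and via_predict are illustrative.

```r
# Fit the same model as in the chapter, on mtcars so the block is self-contained
model <- lm(hp ~ wt + mpg, data = mtcars)

# Coefficients of the plane: intercept, slope for wt, slope for mpg
b <- coef(model)
b

# Height of the plane for a hypothetical car with wt = 3 and mpg = 20
manual <- unname(b["(Intercept)"] + b["wt"] * 3 + b["mpg"] * 20)

# predict() evaluates the same plane at the same point
via_predict <- unname(predict(model, newdata = data.frame(wt = 3, mpg = 20)))

# The two values agree up to floating-point tolerance
all.equal(manual, via_predict)
```

The order of the predictors matters when the model is passed to plane3d(): the first predictor (wt) corresponds to the x axis and the second (mpg) to the y axis, matching the axes used in the scatter plot.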
Summary of Chapter 16 – Three Dimensional (3D) Data
This chapter explores the creation of 3D plots in R, focusing on the ggplot2 and scatterplot3d packages.
ggplot2: The chapter begins by highlighting that ggplot2, primarily known for 2D plots, can be used to simulate 3D effects. This is demonstrated through a scatter plot example where visual cues like point size and color gradients are employed to give an illusion of depth. By mapping variables such as horsepower to these visual elements in the ‘mtcars’ dataset, the plot achieves a 3D-like effect on a 2D plane.
scatterplot3d: The scatterplot3d package, specifically designed for 3D scatter plots, is explored next. The chapter illustrates its straightforward application in creating 3D plots with the mtcars data, considering variables like weight, miles per gallon, and horsepower. It delves into various enhancements like coloring points based on categorical variables (e.g., the number of cylinders), changing point styles to distinguish data points, highlighting specific points for emphasis, and adding legends for clarity. The chapter also demonstrates how to incorporate a linear regression plane into the scatter plot, adding an analytical dimension to the visual representation.
Throughout the chapter, code examples and discussions are provided to guide the reader on how to effectively use these packages for 3D plotting in R. The focus is on practical application, offering readers the tools to transform standard 2D visualizations into more engaging and informative 3D plots. This approach not only enhances the visual appeal of the data but also allows for a more nuanced understanding of complex relationships within the dataset.
References
ggplot2
Wickham, H. (2016). ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York. Retrieved from https://ggplot2.tidyverse.org
Wickham, H., & Grolemund, G. (2016). R for Data Science: Import, Tidy, Transform, Visualize, and Model Data. O’Reilly Media.
Wickham, H. (2020). ggplot2: Create Elegant Data Visualisations Using the Grammar of Graphics (Version 3.3.2) [Computer software]. Retrieved from https://CRAN.R-project.org/package=ggplot2
Wickham, H., et al. (2020). dplyr: A Grammar of Data Manipulation (Version 1.0.2) [Computer software]. Retrieved from https://CRAN.R-project.org/package=dplyr
Wilkinson, L. (2005). The Grammar of Graphics (2nd ed.). Springer-Verlag.
Wickham, H., et al. (2020). tibble: Simple Data Frames (Version 3.0.3) [Computer software]. Retrieved from https://CRAN.R-project.org/package=tibble
Ligges, U., Maechler, M., Schnackenberg, S., & Ligges, M. U. (2018). Package ‘scatterplot3d’. Retrieved from https://cran.r-project.org/web/packages/scatterplot3d/scatterplot3d.pdf
Ligges, U., & Mächler, M. (2002). Scatterplot3d - an R package for visualizing multivariate data. Technical report, 2002, 22.
3D Visualization
Adler, D., Nenadic, O., & Zucchini, W. (2003, March). Rgl: A R-library for 3D visualization with OpenGL. In Proceedings of the 35th Symposium of the Interface: Computing Science and Statistics, Salt Lake City (Vol. 35, pp. 1-11).
Regression
Fox, J., & Weisberg, S. (2011). An R Companion to Applied Regression (2nd ed.). Sage.