7 Data Analysis and Visualization
7.1 Introduction
Visual Studio Code (VSCode), combined with R and essential extensions, offers a powerful environment for data analysis and visualization. In this chapter, we will explore how to effectively use VSCode for performing data analysis and creating visualizations with R. This includes using popular R packages like tidyverse for data wrangling and ggplot2 for visualizations, all within the convenience of VSCode.
7.2 Data Analysis
Data analysis in VSCode is streamlined through the vscode-R extension, which provides robust support for working with R scripts and interactive data exploration.
7.2.1 STEP 1. Loading Data
To load data in VSCode, you can use the R terminal integrated into the editor or write and run R scripts directly from the editor.
Loading CSV Files: Use the read.csv() or readr::read_csv() function to load CSV files. You can highlight the line of code and press Ctrl + Enter (Windows/Linux) or Cmd + Enter (Mac) to execute it in the active R terminal.

    # Create a demo data file
    dir.create("data", showWarnings = FALSE)
    readr::write_csv(iris, "data/iris.csv")

    # Load the data
    data <- readr::read_csv("data/iris.csv")
Viewing Data: Use the View() function to open data frames in the interactive viewer provided by VSCode. This allows you to sort, filter, and explore the data directly in the editor.

    View(data)
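The text above names both read.csv() and readr::read_csv() but only demonstrates the latter. The following is a minimal sketch comparing the two; it writes iris to a temporary file (the csv_path, iris_base, and iris_readr names are illustrative, not from the original) and only calls readr if it is installed.

```r
# Write iris to a temporary CSV so the example leaves no files in the project
csv_path <- tempfile(fileext = ".csv")
write.csv(iris, csv_path, row.names = FALSE)

# Base R: returns a plain data.frame
iris_base <- read.csv(csv_path)
class(iris_base)
str(iris_base)

# readr (only if installed): returns a tibble
if (requireNamespace("readr", quietly = TRUE)) {
  iris_readr <- readr::read_csv(csv_path, show_col_types = FALSE)
  class(iris_readr)
}
```

Either result can be passed to View(); the main practical difference in this sketch is the class of the returned object.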
7.2.2 STEP 2. Data Wrangling with tidyverse
The tidyverse collection of packages provides an excellent set of tools for data manipulation and transformation. In VSCode, you can leverage these tools to clean and prepare your dataset for analysis.
Filtering and Mutating Data: Use dplyr to filter and mutate data frames. You can run these commands interactively to see the output immediately in the R terminal.

    library(dplyr)

    filtered_data <- data %>%
      filter(Sepal.Length > 5) %>%
      mutate(Sepal.Ratio = Sepal.Length / Sepal.Width)
Piping Commands: The %>% (pipe) operator passes the result of one operation to the next, letting you chain multiple operations together while keeping the code readable. Because each step is a separate call, you can select part of a chain in VSCode and run it to inspect intermediate results.
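To make the readability point concrete, here is a small sketch that writes the same filter-and-mutate step in nested form and in piped form, then extends the chain with a per-species summary. It uses the built-in iris data frame directly so it runs on its own; the object names (nested, piped, ratio_by_species) are illustrative.

```r
library(dplyr)

# Nested form: has to be read from the inside out
nested <- mutate(
  filter(iris, Sepal.Length > 5),
  Sepal.Ratio = Sepal.Length / Sepal.Width
)

# Piped form: reads top to bottom, one operation per line
piped <- iris %>%
  filter(Sepal.Length > 5) %>%
  mutate(Sepal.Ratio = Sepal.Length / Sepal.Width)

# Both forms should produce the same data frame
all.equal(nested, piped)

# Extending the chain: summarise the new column per species
ratio_by_species <- piped %>%
  group_by(Species) %>%
  summarise(n = n(), mean_ratio = mean(Sepal.Ratio))
ratio_by_species
```

Selecting only the first two lines of the piped chain (without the trailing %>%) and running them is one way to check an intermediate result.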
7.3 Data Visualization with ggplot2
Visualization is a key component of data analysis, and VSCode provides multiple ways to create, view, and interact with plots.
7.3.1 STEP 1. Creating Visualizations
The ggplot2 package is the go-to tool for creating beautiful and informative visualizations in R. In VSCode, you can use ggplot2 to generate charts and plots and view them interactively.
Basic Plotting: Create a scatter plot to visualize relationships between variables.

    library(ggplot2)

    ggplot(data, aes(x = Sepal.Length, y = Sepal.Width, color = Species)) +
      geom_point()
Interactive Plot Viewing: With the httpgd package enabled, your plots will appear in the VSCode plot viewer. This allows you to zoom, export, or copy images directly from the viewer pane, making the process more efficient.

    install.packages("httpgd")
    httpgd::hgd()
    options(device = httpgd::hgd)
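If the device option is set from a startup file such as ~/.Rprofile, it may be worth guarding it so that sessions without httpgd still start cleanly. The snippet below is a sketch of that idea under the assumption that falling back to R's default graphics device is acceptable; it is not part of the original setup.

```r
# Use httpgd as the default graphics device only when the package is installed
if (requireNamespace("httpgd", quietly = TRUE)) {
  options(device = httpgd::hgd)
} else {
  message("httpgd is not installed; plots will use the default graphics device")
}
```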
7.3.2 STEP 2. Customizing Visuals
Customization is key to making your plots informative and visually appealing.
Adding Titles and Labels: Customize your plots by adding titles, axis labels, and adjusting themes.

    ggplot(data, aes(x = Sepal.Length, y = Sepal.Width, color = Species)) +
      geom_point() +
      labs(title = "Sepal Length vs Width",
           x = "Sepal Length (cm)",
           y = "Sepal Width (cm)") +
      theme_minimal()
Faceting: Use facet_wrap() or facet_grid() to create small multiples, which can help in understanding patterns across different subsets of data.

    ggplot(data, aes(x = Sepal.Length, y = Sepal.Width)) +
      geom_point() +
      facet_wrap(~ Species)
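The example above covers facet_wrap() only. As a complementary sketch, facet_grid() lays panels out by two variables, one for rows and one for columns. Since iris has a single categorical column, the code below derives a second one, Petal.Size, purely for illustration; that column and the iris_sized and p_grid names are assumptions, not part of the original data or text.

```r
library(ggplot2)

# Hypothetical grouping column: split flowers at the median petal length
iris_sized <- transform(
  iris,
  Petal.Size = ifelse(Petal.Length > median(Petal.Length),
                      "long petal", "short petal")
)

# Rows are petal-size groups, columns are species
p_grid <- ggplot(iris_sized, aes(x = Sepal.Length, y = Sepal.Width)) +
  geom_point() +
  facet_grid(Petal.Size ~ Species)

p_grid
```

Unlike facet_wrap(), facet_grid() draws a panel for every row and column combination, so combinations with no observations appear as empty panels.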
7.4 Interactive Visualization Tools
VSCode, through the vscode-R extension, supports interactive visualizations that enhance data exploration.
Plot Viewer: The plot viewer in VSCode allows you to interact with your visualizations. Using httpgd, you can view plots that update automatically as you make changes to your code.
Htmlwidgets and Shiny Apps: Htmlwidgets like plotly or interactive Shiny apps can also be rendered within VSCode, allowing you to explore data interactively without leaving the editor.

    # Example using plotly
    library(plotly)

    p <- ggplot(data, aes(x = Sepal.Length, y = Sepal.Width)) +
      geom_point()
    ggplotly(p)
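Shiny apps are mentioned above without an example. The following is a minimal sketch of one, assuming the shiny and ggplot2 packages are installed; the input ID "species", the output ID "scatter", and the app title are illustrative choices. The app object is only constructed here, so nothing is launched until you run it yourself.

```r
library(shiny)
library(ggplot2)

# UI: a species selector and a plot area
ui <- fluidPage(
  titlePanel("Iris explorer"),
  selectInput("species", "Species", choices = levels(iris$Species)),
  plotOutput("scatter")
)

# Server: redraw the scatter plot whenever the selected species changes
server <- function(input, output, session) {
  output$scatter <- renderPlot({
    ggplot(subset(iris, Species == input$species),
           aes(x = Sepal.Length, y = Sepal.Width)) +
      geom_point()
  })
}

# Build the app object without starting it
app <- shinyApp(ui, server)

# In an interactive session, launch it with:
# runApp(app)
```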
7.5 Conclusion
Data analysis and visualization are central to any data science workflow, and VSCode, paired with R, provides a powerful environment for both. By leveraging the vscode-R extension, httpgd for interactive plots, and popular R packages like tidyverse and ggplot2, you can efficiently transform data and create meaningful visualizations. The integrated terminal and plot viewers in VSCode make the entire process streamlined, enabling a seamless flow from data wrangling to visualization.