Creating Interactive Stacked Bar Graphs in R with Two Variables Using Plotly and ggplot2

Creating a Stacked Bar Graph with Two Variables in R

Understanding the Problem and Requirements

For a data analyst, creating visualizations that communicate complex data clearly is crucial. In this article, we will explore how to create a stacked bar graph using two variables from a dataset in R. We will use plotly, a popular interactive visualization package for R, and then build a ggplot2 version for comparison.

The goal is to create a graph where the y-axis represents the percentage of two variables (Available and Unavailable) and the x-axis represents dates, with each service (Serie) represented by a different color. This type of graph allows us to visualize how the availability of services changes over time for each service.

Setting Up the Environment

Before we dive into creating the stacked bar graph, let’s ensure our environment is set up correctly. We will use RStudio as our IDE and install the necessary libraries: plotly, ggplot2 (for a comparison), and dplyr (for data manipulation). Note that R package names are case-sensitive, so the package must be written as plotly, in lowercase.

# Install necessary libraries (package names are case-sensitive)
install.packages("plotly")
install.packages("ggplot2")
install.packages("dplyr")

# Load necessary libraries
library(plotly)
library(ggplot2)
library(dplyr)
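Running install.packages() on every execution re-downloads packages that are already present. As a minimal sketch, the helper below (missing_packages is a hypothetical name, not part of any package) reports which of the required packages are not yet installed, and installs them only in an interactive session:

```r
# Packages this article relies on
required <- c("plotly", "ggplot2", "dplyr")

# Hypothetical helper: returns the names of packages that are not installed
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

to_install <- missing_packages(required)

# Only install when a person is at the console; scripts just report
if (length(to_install) > 0) {
  if (interactive()) {
    install.packages(to_install)
  } else {
    message("Missing packages: ", paste(to_install, collapse = ", "))
  }
}
```

The interactive() guard is a design choice for this sketch: it keeps batch scripts from triggering downloads unexpectedly.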

Loading the Sample Dataset

For demonstration purposes, we will use a sample dataset that includes two variables (Available and Unavailable) and dates. You can replace this with your own dataset.

# Create a sample dataset
data <- data.frame(
  Date = c("2022-01-01", "2022-01-02", "2022-01-03", "2022-01-04"),
  Servicelevel = c(10, 20, 15, 30),
  Available = c(8, 12, 18, 25),
  Unavailable = c(2, 8, 7, 5)
)

# Convert Date column to Date format
data$Date <- as.Date(data$Date, format = "%Y-%m-%d")
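Before plotting, it can help to confirm that the data has the shape the graph expects. The following is a sketch of such a check in base R; check_availability_data is a hypothetical helper written for this article's sample columns, and the list of checks is an assumption about what matters for a percentage-based stacked bar graph:

```r
# Same sample dataset as above, with Date already converted
data <- data.frame(
  Date = as.Date(c("2022-01-01", "2022-01-02", "2022-01-03", "2022-01-04")),
  Servicelevel = c(10, 20, 15, 30),
  Available = c(8, 12, 18, 25),
  Unavailable = c(2, 8, 7, 5)
)

# Hypothetical helper: returns a character vector of problems (empty if none)
check_availability_data <- function(df) {
  problems <- character(0)
  if (!inherits(df$Date, "Date")) {
    problems <- c(problems, "Date is not of class Date")
  }
  if (any(is.na(df[c("Date", "Available", "Unavailable")]))) {
    problems <- c(problems, "missing values present")
  }
  if (any(df$Available < 0 | df$Unavailable < 0, na.rm = TRUE)) {
    problems <- c(problems, "negative counts present")
  }
  if (any(df$Available + df$Unavailable == 0, na.rm = TRUE)) {
    problems <- c(problems, "zero total on some date, share is undefined")
  }
  if (anyDuplicated(df$Date) > 0) {
    problems <- c(problems, "duplicate dates present")
  }
  problems
}

problems <- check_availability_data(data)
```

An empty result means none of these checks flagged the data. The zero-total check matters because a percentage share divides by Available + Unavailable.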

Plotting the Stacked Bar Graph

Now that our dataset is loaded and prepared, we can create the stacked bar graph using Plotly.

# Convert the raw counts to shares of the daily total, so that the
# percentage tick format on the y-axis is meaningful
plot_data <- data %>%
  mutate(Total = Available + Unavailable,
         AvailablePct = Available / Total,
         UnavailablePct = Unavailable / Total)

# Create the stacked bar graph: one trace per variable
barstack <- plot_ly(plot_data, x = ~Date, y = ~AvailablePct,
                    name = "Available",
                    type = "bar") %>%
  add_trace(y = ~UnavailablePct,
            name = "Unavailable") %>%
  layout(title = "Percentage of Availability over time",
         yaxis = list(title = 'Availability (%)',
                      tickformat = ".2%"),
         xaxis = list(title = 'Date'),
         barmode = 'stack')

# Display the graph
barstack
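As an alternative to computing percentages by hand, Plotly's layout has a barnorm attribute that normalizes each stack. The sketch below passes the raw counts and lets barnorm = "fraction" scale every bar to sum to 1; the variable name fig and the ".0%" tick format are choices made for this example only:

```r
library(plotly)

# Same sample dataset as above
data <- data.frame(
  Date = as.Date(c("2022-01-01", "2022-01-02", "2022-01-03", "2022-01-04")),
  Servicelevel = c(10, 20, 15, 30),
  Available = c(8, 12, 18, 25),
  Unavailable = c(2, 8, 7, 5)
)

# Raw counts go in; barnorm rescales each date's stack to fractions of 1
fig <- plot_ly(data, x = ~Date, y = ~Available,
               name = "Available", type = "bar") %>%
  add_trace(y = ~Unavailable, name = "Unavailable") %>%
  layout(title = "Percentage of Availability over time",
         barmode = "stack",
         barnorm = "fraction",
         yaxis = list(title = "Availability (%)", tickformat = ".0%"),
         xaxis = list(title = "Date"))
```

Printing fig in an interactive session displays the graph. Letting Plotly normalize keeps the data frame untouched, while precomputing the shares makes the plotted values explicit and easier to reuse elsewhere.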

Customizing the Graph

Let’s customize the graph by adding a title, axis labels, and colors.

# Customize the layout of the graph
barstack <- barstack %>%
  layout(
    title = "Percentage of Availability over time",
    yaxis = list(title = 'Availability (%)', tickformat = ".2%"),
    xaxis = list(title = 'Date'),
    barmode = 'stack',
    colorway = c("steelblue", "tomato"),  # default trace colors, applied in order
    hovermode = "x"                       # show hover labels for all bars at a date
  )
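Hover labels can be customized as well. As a sketch, the example below sets the hovertemplate trace attribute so each label shows the date and the share with one decimal place; the template string and the name hover_fmt are illustrative choices, and the shares are recomputed here only to keep the example self-contained:

```r
library(plotly)

# Same sample dataset as above, with shares of the daily total
data <- data.frame(
  Date = as.Date(c("2022-01-01", "2022-01-02", "2022-01-03", "2022-01-04")),
  Available = c(8, 12, 18, 25),
  Unavailable = c(2, 8, 7, 5)
)
data$AvailablePct <- data$Available / (data$Available + data$Unavailable)
data$UnavailablePct <- 1 - data$AvailablePct

# %{x|...} formats the date, %{y:.1%} formats the share as a percentage
hover_fmt <- "%{x|%Y-%m-%d}<br>%{y:.1%}"

# Traces added with add_trace() inherit attributes set in plot_ly(),
# so the template applies to both bars
fig_hover <- plot_ly(data, x = ~Date, y = ~AvailablePct,
                     name = "Available", type = "bar",
                     hovertemplate = hover_fmt) %>%
  add_trace(y = ~UnavailablePct, name = "Unavailable") %>%
  layout(barmode = "stack",
         yaxis = list(title = "Availability (%)", tickformat = ".0%"),
         xaxis = list(title = "Date"))
```

The trace name still appears beside each label by default, which keeps the Available and Unavailable segments distinguishable on hover.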

Plotting the Stacked Bar Graph with ggplot2

To provide an alternative, we can use ggplot2 to create a similar stacked bar graph.

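# Reshape to long format: one row per date and status, so that
# ggplot2 can stack both variables within a single layer
long_data <- bind_rows(
  data.frame(Date = data$Date, Status = "Available",   Value = data$Available),
  data.frame(Date = data$Date, Status = "Unavailable", Value = data$Unavailable)
)

# Create the stacked bar graph using ggplot2
# position = "fill" scales each bar to 100% of the daily total
barstack_ggplot <-
  ggplot(long_data, aes(x = Date, y = Value, fill = Status)) +
  geom_col(position = "fill") +
  scale_y_continuous(labels = function(x) paste0(100 * x, "%")) +
  labs(title = "Percentage of Availability over time",
       x = 'Date',
       y = 'Availability (%)',
       fill = 'Status')

# Display the graph
barstack_ggplot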
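A ggplot2 graph is static, but the plotly package provides ggplotly() to convert a ggplot object into an interactive figure. The sketch below builds a small long-format copy of the sample data (the names long_data, Status, and Count are chosen for this example) and converts the resulting plot:

```r
library(ggplot2)
library(plotly)

# Long-format copy of the sample data: one row per date and status
dates <- as.Date(c("2022-01-01", "2022-01-02", "2022-01-03", "2022-01-04"))
long_data <- data.frame(
  Date = rep(dates, times = 2),
  Status = rep(c("Available", "Unavailable"), each = 4),
  Count = c(8, 12, 18, 25, 2, 8, 7, 5)
)

# Static stacked bar graph, each bar scaled to 100% of the daily total
p <- ggplot(long_data, aes(x = Date, y = Count, fill = Status)) +
  geom_col(position = "fill") +
  labs(title = "Percentage of Availability over time",
       x = "Date", y = "Availability (share of total)")

# Convert the ggplot object into an interactive plotly figure
interactive_plot <- ggplotly(p)
```

Printing interactive_plot in an interactive session shows the graph with hover labels and zooming. This route suits cases where the ggplot2 code already exists and only interactivity needs to be added.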

Conclusion

In this article, we have explored how to create a stacked bar graph using two variables from a dataset in R. We used plotly to build an interactive version and ggplot2 as a static alternative. The resulting graph provides a clear visual representation of how the availability of services changes over time.

By following these steps and examples, you should now be able to create your own stacked bar graphs with multiple variables using Plotly or ggplot2 in R.


Last modified on 2024-04-01