A plotting package that all R users will use - ggplot2 (3.4.2)

Foreword

Data visualization is an important part of data analysis and research, and ggplot2 is a very powerful data visualization package for the R language that almost all R users will use. In this article, we briefly introduce how to install ggplot2, its basic operations, and several common plot types.

ggplot2 package installation

First, make sure that R and RStudio are installed, and then enter the following command in RStudio to install the ggplot2 package:

install.packages("ggplot2")

After the installation is complete, you can use the following code to load ggplot2:

library(ggplot2)

Once ggplot2 is loaded into the environment, you can start plotting your data.
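
If the same script may run on a machine where ggplot2 is already installed, a slightly more defensive version could look like the sketch below. It is only an illustration: it installs the package when it is missing, loads it, and prints the installed version so that you can compare it with the 3.4.2 release this article was written for. The variable name installed_version is made up for this example.

```r
# Install ggplot2 only if it is not already available on this machine
# (assumes a CRAN mirror is configured, as it is by default in RStudio)
if (!requireNamespace("ggplot2", quietly = TRUE)) {
  install.packages("ggplot2")
}

library(ggplot2)

# Show the installed version (this article was written for 3.4.2)
installed_version <- packageVersion("ggplot2")
print(installed_version)
```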

Commonly used ggplot2 drawing functions

Next, we introduce the basic operations of ggplot2. The following are several commonly used ggplot2 drawing functions:

  • Scatterplot: use geom_point() function to draw.
  • Line chart: use geom_line() function to draw.
  • Histogram: use geom_histogram() function to draw.
  • Boxplot: use geom_boxplot() function to draw.
  • Heat map: use geom_tile() function to draw.
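
Because each chart type in the list above is just a different geom_*() layer, the same base object can be reused with different layers. The following is a minimal sketch of that idea, assuming R's built-in mtcars dataset; the object names base_plot, scatter and line are made up for illustration.

```r
library(ggplot2)

# One base object holds the data and the aesthetic mapping
base_plot <- ggplot(mtcars, aes(x = wt, y = mpg))

# Adding a different geom layer to the same base gives a different chart type
scatter <- base_plot + geom_point()
line <- base_plot + geom_line()

print(scatter)
print(line)
```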

We use mtcars, one of R's built-in datasets, as an example. You can load the dataset directly in R with the following commands:

library(ggplot2)
data(mtcars)
head(mtcars)

The head() function displays the first few rows of a dataset or the first few elements of a vector; by default it shows the first 6. To display more or fewer, specify the number of rows in the call to head(). For example, head(mtcars, n = 10) displays the first 10 records.
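
A short base R sketch of these variations is shown here. The variable names are made up for illustration, and tail() is mentioned only as the counterpart of head().

```r
# Load the built-in dataset
data(mtcars)

# First 10 rows instead of the default 6
first_ten <- head(mtcars, n = 10)

# A negative n drops rows from the end: everything except the last 30 rows
all_but_last_30 <- head(mtcars, n = -30)

# tail() is the counterpart of head() and returns the last rows
last_three <- tail(mtcars, n = 3)

print(dim(first_ten))
print(rownames(all_but_last_30))
print(last_three)
```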

1. Scatterplot

Next, we draw a scatter plot of "wt" against "mpg" from the mtcars dataset, color the points by "cyl", and add labels for the x-axis and y-axis:

ggplot(data = mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point() +
  labs(x = "Weight (lb/1000)", y = "Miles per Gallon", color = "Cylinders")

Ta-da! The result appears in less than a second. If you are interested, you can change the dataset fields and the parameters of the drawing functions to customize the result to your own needs.
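
As one possible variation, the sketch below swaps the x-axis field from wt to hp and adjusts the size and transparency of the points. The choice of hp and the specific size and alpha values are arbitrary assumptions, picked only to show where such changes go.

```r
library(ggplot2)

# Same scatter plot idea, but with horsepower (hp) on the x-axis
custom_scatter <- ggplot(mtcars, aes(x = hp, y = mpg, color = factor(cyl))) +
  # size and alpha are fixed settings here, so they go outside aes()
  geom_point(size = 3, alpha = 0.7) +
  labs(x = "Horsepower", y = "Miles per Gallon", color = "Cylinders")

print(custom_scatter)
```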

2. Line chart

To draw a line chart in ggplot2, use the geom_line() function. The following is a simple example of how to draw a line chart:

# Create a graphic object, set the x-axis to mpg and the y-axis to hp
ggplot(mtcars, aes(x = mpg, y = hp)) +
  # Draw a line and set the line color to red
  geom_line(color = "red") +
  # Set the chart title and the x- and y-axis labels
  labs(title = "Example Line Chart", x = "Miles per Gallon", y = "Horsepower")

In this example, we use the ggplot() function and set the mpg field of the mtcars dataset as the x-axis and the hp field as the y-axis. We then use geom_line() to draw the line chart and set the line color to red. Finally, labs() sets the chart title and the x- and y-axis labels.
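
One detail worth knowing is that geom_line() connects the observations in order of the x-axis variable, not in the order of the rows in the dataset. The following sketch is only an illustration of a common variation: it maps color to factor(cyl) so that one line is drawn per cylinder count, and adds the points on top so the individual cars stay visible. The object name grouped_lines is made up.

```r
library(ggplot2)

# One line per cylinder group, with the underlying points drawn on top
grouped_lines <- ggplot(mtcars, aes(x = mpg, y = hp, color = factor(cyl))) +
  geom_line() +
  geom_point() +
  labs(title = "Line chart by cylinder count",
       x = "Miles per Gallon", y = "Horsepower", color = "Cylinders")

print(grouped_lines)
```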

[Note] In R, the # symbol starts a comment: everything after it on the same line is ignored when the code runs, so you can use it to explain the purpose and logic of the code to yourself or others.

3. Histogram

In ggplot2, you can use the geom_histogram() function to draw a histogram. Here is a simple example of how to draw a histogram:

# Create a graphic object, set the x-axis to the mpg field in the dataset
ggplot(mtcars, aes(x = mpg)) +
  # Draw a histogram with a bin width of 5, a black border and a light blue fill
  geom_histogram(binwidth = 5, color = "black", fill = "lightblue") +
  # Set the chart title and the x- and y-axis labels
  labs(title = "Histogram example", x = "Miles per Gallon", y = "Frequency")

In this example, we use the ggplot() function and set the mpg field of the mtcars dataset as the x-axis. We then use geom_histogram() to draw the histogram, setting the bin width to 5, the border color to black, and the fill color to light blue. Finally, labs() sets the chart title and the x- and y-axis labels.
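
The binwidth argument fixes the width of each bar. Alternatively, geom_histogram() accepts a bins argument that fixes the number of bars instead. The sketch below compares the two; the value of 10 bins is an arbitrary choice, and the object names are made up for illustration.

```r
library(ggplot2)

base_hist <- ggplot(mtcars, aes(x = mpg))

# Fix the width of each bin
hist_by_width <- base_hist +
  geom_histogram(binwidth = 5, color = "black", fill = "lightblue")

# Or fix the number of bins and let ggplot2 work out the width
hist_by_count <- base_hist +
  geom_histogram(bins = 10, color = "black", fill = "lightblue")

# layer_data() returns the computed bins, which is handy for checking the result
print(layer_data(hist_by_width)[, c("xmin", "xmax", "count")])
print(layer_data(hist_by_count)[, c("xmin", "xmax", "count")])
```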

4. Box plot

In ggplot2, we can use the geom_boxplot() function to draw box plots. Here is a simple example of how to draw a box plot:

# Create a graphic object, set the x-axis to the cyl field and the y-axis to the mpg field in the dataset
ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  # Draw a box plot, set the fill color to light blue and the line color to black
  geom_boxplot(fill = "lightblue", color = "black") +
  # Set the chart title and the x- and y-axis labels
  labs(title = "Box plot example", x = "Number of Cylinders", y = "Miles per Gallon")

In this example, we use the ggplot() function and set the cyl field of the mtcars dataset as the x-axis and the mpg field as the y-axis. We then use geom_boxplot() to draw the box plot, setting the fill color to light blue and the line color to black. Finally, labs() sets the chart title and the x- and y-axis labels.
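
Each box summarizes the distribution of mpg within one cylinder group: the middle line is the median, and the edges of the box are the first and third quartiles. To see the numbers behind the picture, a base R sketch like the following can be used. The names mpg_by_cyl and mpg_medians are made up for illustration. Note that the whiskers drawn by geom_boxplot() extend at most 1.5 times the interquartile range from the box, so they do not necessarily reach the minimum and maximum printed here.

```r
data(mtcars)

# Minimum, quartiles, median and maximum of mpg for each cylinder count
mpg_by_cyl <- tapply(mtcars$mpg, mtcars$cyl, quantile)
print(mpg_by_cyl)

# Just the medians, i.e. the middle line of each box
mpg_medians <- tapply(mtcars$mpg, mtcars$cyl, median)
print(mpg_medians)
```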

5. Heatmap

In ggplot2, you can use the geom_tile() function to draw a heat map. Here is a simple example of how to draw a heat map:

# Create a data frame
df <- data.frame(
  x = rep(1:5, 5),
  y = rep(1:5, each = 5),
  z = rnorm(25))
# Create a graphic object: the x-axis is the x column, the y-axis is the y column,
# and the fill color is the value of the z column
ggplot(df, aes(x = x, y = y, fill = z)) +
  # Draw the tiles with a white border
  geom_tile(color = "white") +
  # Set the range of the fill color, as well as the title and labels of the legend
  scale_fill_gradient(low = "blue", high = "red", name = "value", labels = scales::number) +
  # Set the chart title and the x- and y-axis labels
  labs(title = "Heat map example", x = "x axis label", y = "y axis label")

In this example, we created a data frame df containing three fields: x, y, and z. We then use ggplot() to map the x field of df to the x-axis, the y field to the y-axis, and the value of the z field to the fill color. geom_tile() draws the heat map with a white border around each tile, and scale_fill_gradient() sets the range of fill colors as well as the title and labels of the legend. Finally, labs() sets the chart title and the x- and y-axis labels.
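
The same three-column layout (x position, y position, value) also works for real data. As a hedged sketch, the example below builds a correlation heat map of mtcars: the correlation matrix is reshaped to one row per cell with as.data.frame(as.table(...)), which produces columns named Var1, Var2 and Freq. The diverging scale_fill_gradient2() scale and the object names are choices made for this illustration, not something prescribed by the example above.

```r
library(ggplot2)

# Correlation matrix of the columns in mtcars (11 x 11)
cor_matrix <- cor(mtcars)

# Reshape to long format: one row per cell, with columns Var1, Var2 and Freq
cor_long <- as.data.frame(as.table(cor_matrix))

cor_heatmap <- ggplot(cor_long, aes(x = Var1, y = Var2, fill = Freq)) +
  geom_tile(color = "white") +
  # Diverging scale centered on 0, since correlations range from -1 to 1
  scale_fill_gradient2(low = "blue", mid = "white", high = "red",
                       midpoint = 0, limits = c(-1, 1), name = "correlation") +
  labs(title = "Correlation heat map of mtcars", x = NULL, y = NULL)

print(cor_heatmap)
```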

Custom legend

Scatter plots, line charts, histograms, box plots and heat maps are the charts most commonly used to present results. For other plot types and techniques, refer to the detailed explanations on the ggplot2 official website.

When drawing a chart, we may need to display some additional information. In that case we can use the labs() function to customize the labels: it can change the title, the x-axis label and the y-axis label. Additionally, we can use the scale_color_manual() and scale_shape_manual() functions to customize colors and shapes. Here is a simple example of how to draw a chart with a customized legend:

data(iris)
head(iris)
ggplot(data = iris, aes(x = Sepal.Length, y = Sepal.Width, color = Species, shape = Species)) +
  geom_point(size = 4) +
  labs(title = "Sepal Length vs. Sepal Width by Species", x = "Sepal Length", y = "Sepal Width", color = "Species", shape = "Species") +
  scale_color_manual(values = c("#00AFBB", "#E7B800", "#FC4E07"), labels = c("Setosa", "Versicolor", "Virginica")) +
  scale_shape_manual(values = c(0, 1, 2), labels = c("Setosa", "Versicolor", "Virginica"))

In this example, we use the labs() function to set the title, the x-axis label, the y-axis label, and the titles of the color and shape legends. We use the scale_color_manual() function to customize the colors and the scale_shape_manual() function to customize the shapes.
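
scale_color_manual() and scale_shape_manual() also accept a named vector for values, in which case each value is matched to a factor level by name rather than by position. The sketch below assumes the level names of iris$Species (setosa, versicolor, virginica); the variable names species_colors, species_shapes and named_plot are made up for illustration.

```r
library(ggplot2)

# Named vectors: each name must match a level of iris$Species
species_colors <- c(setosa = "#00AFBB", versicolor = "#E7B800", virginica = "#FC4E07")
species_shapes <- c(setosa = 0, versicolor = 1, virginica = 2)

named_plot <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width,
                               color = Species, shape = Species)) +
  geom_point(size = 4) +
  scale_color_manual(values = species_colors) +
  scale_shape_manual(values = species_shapes)

print(named_plot)
```

Matching by name means the colors stay attached to the right species even if the order of the factor levels changes.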

Save image file

ggsave() is a function provided by the ggplot2 package that saves a chart drawn with ggplot2 as an image file. Below is a basic example of how to use the ggsave() function:

# Create a data frame
df <- data.frame(x = c(1, 2, 3, 4, 5), y = c(2, 4, 6, 8, 10))
# Draw a line chart
p <- ggplot(df, aes(x, y)) +
  geom_line() +
  ggtitle("Schematic Line Chart")
p
# Save the file; change the path to one on your own computer
ggsave(path = "C:/Users/Administrator/Desktop", filename = "my_plot.jpeg", width = 40, height = 40, dpi = 300, units = "cm")

Here, path is the folder in which the file will be saved, and filename is the name of the file. The width and height parameters specify the width and height of the image, and dpi specifies its resolution (the default is 300 dpi). The units argument specifies the units of width and height, which can be inches ("in"), centimeters ("cm"), millimeters ("mm"), or pixels ("px").
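
When no plot is passed, ggsave() saves the last plot that was displayed, so it can be safer to pass the plot explicitly through the plot argument. The sketch below does that and also works out the approximate pixel size of the saved image from the physical size and the dpi (1 inch = 2.54 cm). The 16 cm by 10 cm size, the temporary folder and the variable names are assumptions made for this illustration.

```r
library(ggplot2)

df <- data.frame(x = c(1, 2, 3, 4, 5), y = c(2, 4, 6, 8, 10))
p <- ggplot(df, aes(x, y)) + geom_line()

# Approximate pixel size of a raster image: size in inches multiplied by dpi
width_cm <- 16
height_cm <- 10
dpi <- 300
width_px <- round(width_cm / 2.54 * dpi)
height_px <- round(height_cm / 2.54 * dpi)
print(c(width_px, height_px))

# Save into a temporary folder, passing the plot explicitly via `plot`
out_file <- file.path(tempdir(), "my_plot.png")
ggsave(filename = out_file, plot = p,
       width = width_cm, height = height_cm, units = "cm", dpi = dpi)

print(file.exists(out_file))
```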

Epilogue

ggplot2 is one of the most popular plotting packages in the R language. It provides powerful plotting functions and is also easy to learn and use. There are many learning resources and examples online; you can learn more about ggplot2 methods and techniques through the ggplot2 official website, YouTube, and tutorials by experienced R users.

I am very grateful for your sharing!!!
MillionQuesn

A foreigner living in Taiwan, sharing the highlights of a sudden flash of inspiration.
