In this tutorial, you will learn how to rename the columns of a data frame in R. This can be done easily using the function rename() [dplyr package]. It’s also possible to use R base functions, but they require more typing.
Required packages
Load the tidyverse packages, which include dplyr:
library(tidyverse)
Demo dataset
We’ll use the R built-in iris data set, which we start by converting into a tibble data frame (tbl_df) for easier data analysis.
my_data <- as_tibble(iris)
my_data
## # A tibble: 150 x 5
## Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## <dbl> <dbl> <dbl> <dbl> <fct>
## 1 5.1 3.5 1.4 0.2 setosa
## 2 4.9 3 1.4 0.2 setosa
## 3 4.7 3.2 1.3 0.2 setosa
## 4 4.6 3.1 1.5 0.2 setosa
## 5 5 3.6 1.4 0.2 setosa
## 6 5.4 3.9 1.7 0.4 setosa
## # ... with 144 more rows
Renaming columns with dplyr::rename()
Rename the column Sepal.Length to sepal_length and Sepal.Width to sepal_width:
my_data %>%
  rename(
    sepal_length = Sepal.Length,
    sepal_width = Sepal.Width
  )
## # A tibble: 150 x 5
## sepal_length sepal_width Petal.Length Petal.Width Species
## <dbl> <dbl> <dbl> <dbl> <fct>
## 1 5.1 3.5 1.4 0.2 setosa
## 2 4.9 3 1.4 0.2 setosa
## 3 4.7 3.2 1.3 0.2 setosa
## 4 4.6 3.1 1.5 0.2 setosa
## 5 5 3.6 1.4 0.2 setosa
## 6 5.4 3.9 1.7 0.4 setosa
## # ... with 144 more rows
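Note that rename() returns a modified copy and leaves my_data unchanged unless the result is assigned. A minimal sketch of keeping the renamed version (the name my_data_renamed is only illustrative):

```r
library(dplyr)

my_data <- as_tibble(iris)

# rename() returns a new tibble; the original object is not modified
my_data_renamed <- my_data %>%
  rename(
    sepal_length = Sepal.Length,
    sepal_width = Sepal.Width
  )

names(my_data)[1:2]          # still the original names
names(my_data_renamed)[1:2]  # the new names
```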
Renaming columns with R base functions
To rename the column Sepal.Length to sepal_length, the procedure is as follows:
- Get the column names using the function names() or colnames()
- Change the column name where the name equals "Sepal.Length"
# get column names
colnames(my_data)
## [1] "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width"
## [5] "Species"
# Rename the columns named "Sepal.Length" and "Sepal.Width"
names(my_data)[names(my_data) == "Sepal.Length"] <- "sepal_length"
names(my_data)[names(my_data) == "Sepal.Width"] <- "sepal_width"
my_data
## # A tibble: 150 x 5
## sepal_length sepal_width Petal.Length Petal.Width Species
## <dbl> <dbl> <dbl> <dbl> <fct>
## 1 5.1 3.5 1.4 0.2 setosa
## 2 4.9 3 1.4 0.2 setosa
## 3 4.7 3.2 1.3 0.2 setosa
## 4 4.6 3.1 1.5 0.2 setosa
## 5 5 3.6 1.4 0.2 setosa
## 6 5.4 3.9 1.7 0.4 setosa
## # ... with 144 more rows
It’s also possible to rename by index in the names vector as follows:
names(my_data)[1] <- "sepal_length"
names(my_data)[2] <- "sepal_width"
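Several positions can also be renamed in one assignment. A small sketch, assuming the first two columns are the ones to rename; index-based renaming depends on column order, so it is worth checking names() first:

```r
my_data <- iris  # plain data frame here; a tibble behaves the same way

# Inspect the current order before renaming by position
names(my_data)

# Rename columns 1 and 2 in a single assignment
names(my_data)[1:2] <- c("sepal_length", "sepal_width")
names(my_data)
```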
Summary
In this chapter, we described how to rename data frame columns using the function rename() [in dplyr package] and R base functions such as names().
What should I do if I want to change setosa to Setosa?
It’s possible to use the function mutate() as follows:
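A minimal sketch of one way to do this with mutate(). It assumes Species is a factor (as in iris) and uses fct_recode() from the forcats package, which is an illustrative choice and not necessarily the code from the original reply:

```r
library(dplyr)
library(forcats)

my_data <- as_tibble(iris)

# fct_recode() takes new_level = "old_level" pairs and keeps the factor type
my_data2 <- my_data %>%
  mutate(Species = fct_recode(Species, Setosa = "setosa"))

levels(my_data2$Species)
```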
How do I add the letter “V” to row names in R? For example, the row names are 1023, 1024, 1025 and I want to change them to V1023, V1024, V1025.
Thank you.
This might help you. Kan has described this nicely:
https://blog.exploratory.io/selecting-columns-809bdd1ef615
df %>%
  select(-starts_with("user."), -starts_with("milestone."),
         -starts_with("pull_"), -ends_with("url")) %>%
  rename(developer = assignee.login) %>%
  select(-starts_with("assignee"), -title, -comments, -locked, -labels, -id, -body)
You can proceed as follows:
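For the row-name question, a minimal sketch using base R. It assumes a plain data frame (tibbles do not keep row names), and the data frame df and its column x are made up for illustration:

```r
# Hypothetical data frame with numeric-looking row names
df <- data.frame(
  x = c(10, 20, 30),
  row.names = c("1023", "1024", "1025")
)

# Prefix every row name with "V"
rownames(df) <- paste0("V", rownames(df))
rownames(df)
```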
What if I have quite a big data set, say 200+ columns?
The functions described here still work, even if you have a large number of columns.
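With many columns, it can be more convenient to apply one renaming function to all names than to list each column. A sketch using rename_with() (available in dplyr 1.0.0 and later); the particular rule, lower case with dots replaced by underscores, is only an example:

```r
library(dplyr)

my_data <- as_tibble(iris)

# Apply the same transformation to every column name
my_data_clean <- my_data %>%
  rename_with(~ tolower(gsub(".", "_", .x, fixed = TRUE)))

names(my_data_clean)
```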
Hi Kassambara,
You seem to be really on top of how to rename columns, and I’ve been struggling to write code that renames columns based on their names. I have many different datasets where a number of columns start with “alt” (e.g. alt1.price, alt1.pol, alt1.x, alt2.price, alt2.pol, alt2.x) and I would like to rename these columns to price_1, pol_1, x_1, price_2, pol_2, x_2.
Essentially, I would like to select columns starting with alt, add an underscore, delete the ‘alt’ and move the number to the end of the column name. Is that possible in any way?
Kind regards, Thomas
Hi Thomas,
You need to perform some string manipulations, as shown below.
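A sketch of such a string manipulation, assuming the columns follow the pattern alt<number>.<name>. The data frame df is invented for illustration, and the regular expression is one possible approach, not necessarily the original answer:

```r
library(dplyr)

# Hypothetical data with the naming scheme from the question
df <- data.frame(
  id = 1:2,
  alt1.price = c(1, 2), alt1.pol = c(3, 4), alt1.x = c(5, 6),
  alt2.price = c(7, 8), alt2.pol = c(9, 10), alt2.x = c(11, 12)
)

# Capture the number and the name, then swap them: alt1.price -> price_1
df_renamed <- df %>%
  rename_with(
    ~ sub("^alt([0-9]+)\\.(.+)$", "\\2_\\1", .x),
    .cols = starts_with("alt")
  )

names(df_renamed)
```

Columns that do not start with "alt" (here id) are left untouched.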
Kassambara, you are a hero. Thanks a million for your extremely detailed answer. I was hoping for some hints and got the full code. Much appreciated.
/T
You are goooood!
What if I have a large data set with 200+ columns?
Is there a way to avoid doing each column manually, one by one? Could you possibly create a for loop or something to do it? If so, how would that work and what would it look like?
Thanks
You can also proceed as follows:
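One option that avoids a loop is to keep the new and old names in a named lookup vector and pass it to rename() through all_of(). A minimal sketch, in which the lookup contents are just examples:

```r
library(dplyr)

my_data <- as_tibble(iris)

# Named character vector: new_name = "old_name"
lookup <- c(
  sepal_length = "Sepal.Length",
  sepal_width = "Sepal.Width"
)

# all_of() raises an error if any old name is missing from the data
my_data_renamed <- my_data %>%
  rename(all_of(lookup))

names(my_data_renamed)
```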
Is there a way where I don’t have to write c("newname1", "newname2", "newname3", … , "newname200")?
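If the new names follow a pattern, they can be generated instead of typed. A sketch assuming a numbered "newname" prefix; the 200-column data frame is a placeholder built only for the example:

```r
# Placeholder data frame with 200 columns
df <- as.data.frame(matrix(0, nrow = 2, ncol = 200))

# Build newname1, newname2, ..., newname200 programmatically
names(df) <- paste0("newname", seq_len(ncol(df)))

head(names(df), 3)
tail(names(df), 1)
```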
I have a matrix whose column names are dates, but as.Date() expects a full date (year, month and day). How can I rename the columns to the year only (%Y), as a date and not a character?
For example, rename 2001-01-01 to 2001.
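Column names in R are always character strings, so they cannot be stored as Date objects, but the year can be extracted from them. A sketch, assuming the names are in "YYYY-MM-DD" form; the matrix m is invented for illustration:

```r
# Hypothetical matrix with date-like column names
m <- matrix(
  1:4, nrow = 2,
  dimnames = list(NULL, c("2001-01-01", "2002-01-01"))
)

# Parse the names as dates, then keep only the four-digit year
colnames(m) <- format(as.Date(colnames(m)), "%Y")
colnames(m)
```

If the years are needed as numbers for computation, as.integer(colnames(m)) gives a separate numeric vector.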
Hi, I am Venkatapanchumarthi.
Awesome. This is an excellent article and a very informative post that helped me a lot. Thanks for sharing such useful information.