
R Strings


A string in R is a sequence of characters stored as a single element of a character vector. One or more characters enclosed in a pair of matching single or double quotes form a string. Strings represent textual content and can contain letters, digits, spaces, and special characters. An empty string is written as "". R prints strings with double quotes regardless of which quote style was used to create them. A double-quoted string can contain single quotes, and a single-quoted string can contain double quotes. A quote of the same kind as the enclosing quotes must be escaped with a backslash (\" or \'); otherwise R reports a syntax error.
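
As a brief illustration of the quoting rules above, the following sketch embeds each kind of quote inside a string, escaping it with a backslash where it matches the enclosing quotes, and checks that an empty string has zero characters. The variable names are chosen for illustration only.

```r
# Double quotes inside a double-quoted string must be escaped
quote_dq <- "She said \"hello\""

# Single quotes inside a single-quoted string must be escaped
quote_sq <- 'It\'s fine'

# Mixing the two quote styles needs no escaping
mixed <- "It's fine"

# An empty string contains no characters
empty <- ""

# cat() shows the raw text, without the escape backslashes
cat(quote_dq, "\n")
cat(quote_sq, "\n")

# Both spellings produce the same string
print(identical(quote_sq, mixed))  # TRUE
print(nchar(empty))                # 0
```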

Creation of String in R

R strings can be created by assigning character values to a variable. These strings can then be concatenated with functions such as paste() to form longer strings.

Example

R

# R program for string creation

# creating a string with double quotes
str1 <- "OK1"
cat("String 1 is:", str1, "\n")

# creating a string with single quotes
str2 <- 'OK2'
cat("String 2 is:", str2, "\n")

# single quotes are allowed inside a double-quoted string
str3 <- "This is 'acceptable and 'allowed' in R"
cat("String 3 is:", str3, "\n")

# double quotes are allowed inside a single-quoted string
str4 <- 'Hi, Wondering "if this "works"'
cat("String 4 is:", str4, "\n")

# an unescaped single quote inside a single-quoted string is a syntax error
str5 <- 'hi, ' this is not allowed'
cat("String 5 is:", str5, "\n")

Output

String 1 is: OK1
String 2 is: OK2
String 3 is: This is 'acceptable and 'allowed' in R
String 4 is: Hi, Wondering "if this "works"
Error: unexpected symbol in "        str5 <- 'hi, ' this"
Execution halted

Length of String

The length of a string is the number of characters it contains. It can be determined with the str_length() function from the stringr package or with R's built-in nchar() function.

Using the str_length() function 

R

# R program for finding length of string
 
# Importing package
library(stringr)
 
# Calculating length of string   
str_length("hello")

                    

Output

[1] 5

Using nchar() function 

R

# R program to find length of string
 
# Using nchar() function
nchar("hel'lo")

                    

Output

[1] 6
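
nchar() also accepts a character vector and returns one length per element. The short sketch below, with made-up sample values, shows this along with the fact that spaces count as characters.

```r
# A character vector holding several strings, including an empty one
fruits <- c("apple", "kiwi", "")

# nchar() is vectorized: one length per element
n_chars <- nchar(fruits)
print(n_chars)  # 5 4 0

# Spaces are counted like any other character
with_space <- nchar("a b")
print(with_space)  # 3
```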

Accessing portions of an R string

Individual characters and substrings can be extracted from a string by position. R provides two built-in functions for this, substr() and substring().

Both functions extract the part of a string that begins at the start index and ends at the end index. Indices start at 1, and both ends are inclusive. Used on the left-hand side of an assignment, they also replace the specified substring with new characters.

Syntax

substr(x, start, stop)
or
substring(text, first, last = 1000000L)

Using substr() function 

R

# R program to access
# characters in a string
 
# Accessing characters
# using substr() function
substr("Learn Code Tech", 1, 1)

                    

Output

[1] "L"

If the starting index is equal to the ending index, the corresponding character of the string is accessed. In this case, the first character, ‘L’ is printed. 

Using substring() function 

R

# R program to access characters in string
str <- "Learn Code"
 
# counts the characters in the string
len <- nchar(str)
 
# Accessing character using
# substring() function
print (substring(str, len, len))
 
# Accessing elements out of index
print (substring(str, len+1, len+1))

                    

Output

[1] "e"
[1] ""

The number of characters in the string is 10. The first print statement prints the last character of the string, "e", which is the character at position 10. The second print statement requests the 11th character, which doesn't exist. The code doesn't throw an error and instead prints "", an empty string.

The following R code demonstrates string slicing, in which substrings of an R string are extracted:

R

# R program to access characters in string
str <- "Learn Code"
 
# counts the number of characters of str = 10
len <- nchar(str)
print(substr(str, 1, 4))
print(substr(str, len-2, len))

                    

Output

[1] "Lear"
[1] "ode"

The first print statement prints the first four characters of the string. The second print statement prints the substring from the indexes 8 to 10, which is “ode”.
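
Because a whole string is a single element of a character vector, square-bracket indexing selects strings rather than characters. The sketch below contrasts that with two ways of reaching individual characters. It reuses the string from the examples above; the other variable names are illustrative.

```r
str <- "Learn Code"

# [ ] indexes elements of the vector, not characters of the string
print(str[1])   # "Learn Code"
print(str[10])  # NA, because the vector has only one element

# Split the string into single characters to index them individually
chars <- strsplit(str, "")[[1]]
print(chars[10])  # "e"

# substring() extracts up to the end of the string when 'last' is omitted
tail_part <- substring(str, 7)
print(tail_part)  # "Code"
```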

Case Conversion

R strings can be converted to upper or lower case with R's built-in functions: toupper() converts all characters to upper case, tolower() converts all characters to lower case, and casefold(x, upper = FALSE) converts according to the value of the upper argument. All of these functions are vectorized, so they also accept a character vector containing multiple strings. The time complexity of each operation is O(number of characters in the string).

Example

R

# R program to Convert case of a string
str <- "Hi LeArn CodiNG"
print(toupper(str))
print(tolower(str))
print(casefold(str, upper = TRUE))

                    

Output

[1] "HI LEARN CODING"
[1] "hi learn coding"
[1] "HI LEARN CODING" 

By default, the value of upper in the casefold() function is set to FALSE. If we set it to TRUE, the string is converted to upper case.
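
The case functions work element by element on a character vector. The sketch below applies toupper() to several strings at once and, as one possible approach rather than a built-in feature, capitalizes only the first letter of each word by combining toupper() with substr(). The sample words are made up.

```r
words <- c("learn", "code", "tech")

# Convert every element to upper case in one call
upper_words <- toupper(words)
print(upper_words)  # "LEARN" "CODE" "TECH"

# Capitalize only the first letter: upper-case character 1,
# then append the rest of each word unchanged
capitalized <- paste0(toupper(substr(words, 1, 1)),
                      substr(words, 2, nchar(words)))
print(capitalized)  # "Learn" "Code" "Tech"
```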

Concatenation of R Strings

Using R’s paste function, you can concatenate strings. Here is a straightforward example of code that joins two strings together:
 

R

# Create two strings
string1 <- "Hello"
string2 <- "World"
 
# Concatenate the two strings
result <- paste(string1, string2)
 
# Print the result
print(result)

                    

Output

[1] "Hello World"

In this example, we first create two strings “Hello” and “World” and store them in the variables string1 and string2, respectively. We then use the paste function to concatenate the two strings together with a space between them and store the result in the variable result. Finally, we use the print function to print the value of result to the console.

When you run this code, you should see the output Hello World, which is the concatenated string of “Hello” and “World” with a space between them.

You can also concatenate multiple strings by passing them as separate arguments to the paste function, like this:

R

# Concatenate three strings
result <- paste("Hello", "to", "the World")
 
# Print the result
print(result)

                    

Output

[1] "Hello to the World"

In this example, we concatenate three strings “Hello”, “to”, and “the World” and store the result in the variable result. The paste function combines the strings together with a space between them, so the output of this code would be Hello to the World.
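
The space that paste() inserts is only its default separator. The following sketch shows the sep and collapse arguments and the paste0() variant, which uses no separator. The values are illustrative.

```r
# Use a custom separator instead of the default space
joined_dash <- paste("Hello", "World", sep = "-")
print(joined_dash)  # "Hello-World"

# paste0() joins with no separator at all
joined_none <- paste0("Hello", "World")
print(joined_none)  # "HelloWorld"

# collapse merges the elements of one vector into a single string
items <- c("red", "green", "blue")
one_string <- paste(items, collapse = ", ")
print(one_string)  # "red, green, blue"

# paste() is vectorized: shorter arguments are recycled
labels <- paste("item", 1:3, sep = "_")
print(labels)  # "item_1" "item_2" "item_3"
```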

R String formatting

String formatting in R is done with the sprintf() function. A simple example that builds a string from variable values is shown below:

R

# Create two variables with values
x <- 42
y <- 3.14159
 
# Format a string with the two variable values
result <- sprintf("The answer is %d, and pi is %.2f.", x, y)
 
# Print the result
print(result)

                    

Output

[1] "The answer is 42, and pi is 3.14."

In this example, we format a string using the %d format specifier for the integer value x and the %.2f format specifier, which rounds to two decimal places, for the floating-point value y. The formatted string is saved in the variable result and then written to the console using the print function. When you run this code, you should see the output The answer is 42, and pi is 3.14., which is the format string with the values of x and y substituted for the format specifiers.
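
A few other commonly used sprintf() specifiers are sketched below: %s for strings, zero-padded integers, a literal percent sign written as %%, and vectorized formatting. The names and values are made up for illustration.

```r
name <- "Asha"
score <- 90L

# %s inserts a string, %d an integer
msg <- sprintf("%s scored %d points", name, score)
print(msg)  # "Asha scored 90 points"

# %05d pads an integer with leading zeros to a width of 5
padded <- sprintf("%05d", 42L)
print(padded)  # "00042"

# %% prints a literal percent sign
percent <- sprintf("%.1f%%", 12.345)
print(percent)  # "12.3%"

# sprintf() is vectorized over its arguments
ids <- sprintf("id-%02d", 1:3)
print(ids)  # "id-01" "id-02" "id-03"
```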

Updating R strings

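Characters and substrings of a string can be replaced with new values. Because the replacement is an assignment, the change is reflected in the original variable. A portion of a string can be updated by position with the replacement forms of substr() and substring():

substr(x, start, stop) <- value
substring(text, first, last = 1000000L) <- value

A substring can also be replaced by pattern with gsub(), as in the following example: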

R

# Create a string
string <- "Hello, World!"
 
# Replace "World" with "Universe"
string <- gsub("World", "Universe", string)
 
# Print the updated string
print(string)

                    

Output

[1] "Hello, Universe!"

The replacement forms of substr() and substring() are vectorized, so multiple strings can be updated at once, provided start <= end.

  • If the substring being replaced is longer than the new string, only the first part of the substring, equal in length to the new string, is replaced.
  • If the substring being replaced is shorter than the new string, only as many characters of the new string as fit in the substring are used, so the overall length of the string does not change.
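
The positional replacement rules above can be seen in a short sketch. The strings and variable names are illustrative; in every case the total length of the string stays the same.

```r
# Replacement of equal length: characters 8 to 12 ("World") become "Earth"
s <- "Hello, World!"
substr(s, 8, 12) <- "Earth"
print(s)  # "Hello, Earth!"

# New string shorter than the span: only its two characters are written
t <- "abcdef"
substr(t, 2, 5) <- "XY"
print(t)  # "aXYdef"

# New string longer than the span: the extra characters are dropped
u <- "abcdef"
substr(u, 2, 3) <- "WXYZ"
print(u)  # "aWXdef"

# Several strings updated at once
v <- c("apple", "berry")
substr(v, 1, 1) <- c("A", "B")
print(v)  # "Apple" "Berry"
```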


Last Updated : 03 May, 2023