Working with Text in R

R is a programming language for statistical computing. Many data miners and statisticians use it to develop statistical software and analyze data. R and its libraries implement a wide variety of statistical and graphical techniques, including linear and non-linear modeling, classical statistical tests, time-series analysis, classification, clustering, and machine learning algorithms.

Any value written inside single or double quotes is treated as a string in R. A string is a sequence of characters, stored as an element of a character vector and usually assigned to a variable. R displays strings with double quotes, even when you create them with single quotes.

Syntax:

Variable_name <- "String"

Example:

# R program to demonstrate
# creation of a string
a <- "hello world"
print(a)

Output:

[1] "hello world"

The following rules apply when working with strings:

  • The quotes at the beginning and end of a string must match: both double quotes or both single quotes. They cannot be mixed.
  • Double quotes can be inserted into a string that starts and ends with single quotes.
  • A single quote can be inserted into a string that starts and ends with double quotes.
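
A minimal sketch of these quoting rules in practice. The variable names are illustrative and not from the original text. The last assignment uses backslash escaping, which the rules above do not mention but which base R supports as another way to embed a quote of the same kind.

```r
# Single and double quotes create the same string
single <- 'text'
double <- "text"
print(identical(single, double))   # [1] TRUE

# A double quote inside a single-quoted string
s1 <- 'She said "hi"'

# A single quote inside a double-quoted string
s2 <- "It's fine"

# Alternative not covered by the rules above: escape the quote with a backslash
s3 <- "A \"quoted\" word"

# writeLines() shows the raw characters, without the escapes print() adds
writeLines(c(s1, s2, s3))
```

print() would show s1 with escaped quotes because it always wraps the string in double quotes; the stored characters are the same either way.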

String Manipulation

String manipulation means processing a given string to read or change its contents. R provides several functions for this, described below:

  • Concatenating strings – paste() function:

    The paste() function combines strings in R. It accepts any number of arguments.

    Syntax:

    paste(..., sep = " ", collapse = NULL)

    Parameters:
    ...: One or more arguments to combine.
    sep: The separator placed between the arguments. It is optional and defaults to a single space.
    collapse: An optional string used to join the elements of the result into a single string. With the default, NULL, the result is not collapsed.

    Example:

    # concatenate two strings
    str1 <- "hello" 
    str2 <- "how are you?" 
    print(paste(str1, str2, sep = " ", collapse = NULL))

    Output:

    [1] "hello how are you?"
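
    The difference between sep and collapse only becomes visible with vectors longer than one element. The following is a small sketch with made-up vectors (first and second are illustrative names). It also uses paste0(), a base R shorthand for paste() with sep = "".

    ```r
    first <- c("a", "b", "c")
    second <- c("x", "y", "z")

    # sep joins the arguments element by element; the result is still a vector
    pairs <- paste(first, second, sep = "-")
    print(pairs)    # [1] "a-x" "b-y" "c-z"

    # collapse then joins the elements of that vector into one string
    joined <- paste(first, second, sep = "-", collapse = ", ")
    print(joined)   # [1] "a-x, b-y, c-z"

    # paste0() concatenates without any separator
    files <- paste0("file", 1:3, ".txt")
    print(files)    # [1] "file1.txt" "file2.txt" "file3.txt"
    ```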
    
  • Formatting numbers and strings – format() function:

    The format() function formats numbers and strings in a specified style and returns the result as character strings.

    Syntax:

    format(x, digits, nsmall, scientific, width, justify = c("left", "right", "centre", "none")) 

    Parameters:

    x is the vector input.
    digits is the minimum number of significant digits to display.
    nsmall is the minimum number of digits to the right of the decimal point.
    scientific is set to TRUE to display scientific notation.
    width is the minimum width of the result; numbers are padded with blanks at the beginning, and strings are padded according to justify.
    justify aligns strings to the left, right, or centre; it does not apply to numbers.

    Example:

    # formatting numbers and strings
      
    # Number of significant digits displayed.
    # Last digit rounded off.
    result <- format(69.145656789, digits = 9)
    print(result)
      
    # Display numbers in scientific notation.
    result <- format(c(3, 132.84521), 
                     scientific = TRUE)
    print(result)
      
    # The minimum number of digits 
    # to the right of the decimal point.
    result <- format(96.47, nsmall = 5)
    print(result)
      
    # Format treats everything as a string.
    result <- format(8)
    print(result)
      
    # Numbers are padded with blank
    # in the beginning for width.
    result <- format(67.7, width = 6)
    print(result)
      
    # Left justify strings.
    result <- format("Hello", width = 8,
                              justify = "left")
    print(result)

    Output:

    [1] "69.1456568"
    [1] "3.000000e+00" "1.328452e+02"
    [1] "96.47000"
    [1] "8"
    [1] "  67.7"
    [1] "Hello   "
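
    One place where these parameters are useful together is lining up values in plain-text output. The sketch below assumes a small, made-up set of item names and prices (items and prices are illustrative, not from the original) and uses only the nsmall, width, and justify parameters described above.

    ```r
    items <- c("tea", "coffee", "juice")
    prices <- c(3.5, 12, 120.25)

    # nsmall = 2 forces at least two decimals; the elements of one
    # vector are padded to a common width
    formatted <- format(prices, nsmall = 2)
    print(formatted)   # [1] "  3.50" " 12.00" "120.25"

    # Right-justify the item names in a field of 8 characters
    labels <- format(items, width = 8, justify = "right")
    print(labels)      # [1] "     tea" "  coffee" "   juice"

    # Print one aligned line per item
    writeLines(paste(labels, formatted))
    ```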
    
  • Counting the number of characters in the string – nchar() function:

    The nchar() function counts the number of characters in a string, including spaces.

    Syntax: nchar(x)

    Parameter:
    x is the vector input here.



    Example:

    # to count the number of characters
    # in the string
    a <- nchar("hello world")
    print(a)

    Output:

    [1] 11
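
    Because x is a vector, nchar() returns one count per element. A short sketch with an illustrative vector (fruits is a made-up name), including a hypothetical filtering use:

    ```r
    fruits <- c("apple", "kiwi", "")

    # One count per element; the empty string has length 0
    counts <- nchar(fruits)
    print(counts)      # [1] 5 4 0

    # Non-character input is converted to character before counting
    year_len <- nchar(2024)
    print(year_len)    # [1] 4

    # Hypothetical use: keep only the words longer than four characters
    long_words <- fruits[nchar(fruits) > 4]
    print(long_words)  # [1] "apple"
    ```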
    
  • Changing the case of the string – toupper() & tolower() functions:

    The toupper() and tolower() functions convert a string to upper case and lower case, respectively.

    Syntax:
    toupper(x)
    and
    tolower(x)

    Parameter:

    x is the vector input

    Example:

    # Changing to Upper case.
    a <- toupper("hello world")
    print(a)
      
    # Changing to lower case.
    b <- tolower("HELLO WORLD")
    print(b)

    Output:

    [1] "HELLO WORLD"
    [1] "hello world"
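
    A common reason to change case is to compare text without regard to capitalization. The sketch below assumes a made-up user input and list of accepted answers (input and accepted are illustrative names); it normalizes the input with tolower() before comparing.

    ```r
    input <- "Yes"
    accepted <- c("yes", "y")

    # Normalize the case first, then compare
    is_yes <- tolower(input) %in% accepted
    print(is_yes)   # [1] TRUE

    # Both functions work element by element on vectors
    cities <- c("paris", "New York", "TOKYO")
    upper <- toupper(cities)
    print(upper)    # [1] "PARIS"    "NEW YORK" "TOKYO"
    ```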
    
  • Extracting parts of the string – substring() function:
    substring() function is used to extract parts of the string.

    Syntax: substring(x, first, last)

    Parameters:
    x is the character vector input.
    first is the position of the first character to be extracted.
    last is the position of the last character to be extracted.

    Example:

    # Extract characters from the 1st to the 3rd position.
    c <- substring("Programming", 1, 3)
    print(c)

    Output:

    [1] "Pro"
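
    A few further uses of substring(), sketched with the same word. The variable names are illustrative. The sketch relies on three base R behaviours: last can be omitted to extract up to the end of the string, first and last can be vectors, and substring() can appear on the left of an assignment to replace characters in place.

    ```r
    word <- "Programming"

    # Omitting last extracts everything from position 4 to the end
    tail_part <- substring(word, 4)
    print(tail_part)   # [1] "gramming"

    # Vectors of positions give several substrings at once
    windows <- substring(word, 1:3, 3:5)
    print(windows)     # [1] "Pro" "rog" "ogr"

    # Replacement form: overwrite positions 1 to 3
    lower_word <- word
    substring(lower_word, 1, 3) <- "pro"
    print(lower_word)  # [1] "programming"
    ```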
    


