Bash Scripting – Substring

In this article, we will discuss how to write a Bash script that extracts a substring from a string.

Extracting an Index-Based Substring

There are several ways to obtain a substring based on the index of characters in the string:

  • Using cut command
  • Using Bash substring
  • Using expr substr command
  • Using awk command

Method 1: Using cut command

The cut command extracts selected portions of each line of its input.

Syntax:

cut [option] range [file ...]

The -c option selects characters by position. It requires a list or range of character positions; without one, cut reports an error. The range gives the positions in the original string that make up the substring. cut uses a 1-based index system (indexing starts from 1). Note that cut reads from the named files, or from standard input when no file is given; it does not accept the string itself as an argument.

Example 1: For demonstration purposes, let’s extract the characters after the last 0 in the string ‘01010string’.

Code:

cut -c 6-11 <<< '01010string'

<<< is known as a here-string. It passes a string to a program on its standard input. We have specified the range 6-11 because 6 is the starting index and 11 is the ending index of our desired result.

Output:

string

Example 2: Now extract the characters before ‘s’ in the string ‘01010string’.

Code:

cut -c 1-5 <<< '01010string'

We have specified the range 1-5 because 1 is the starting index and 5 is the ending index of our desired result.

Output:

01010
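As a small sketch of other range forms that cut -c accepts, an open-ended range avoids having to count to the end of the string. The variable names below (STR, tail_part, head_part, picked) are only illustrative.

```shell
#!/usr/bin/env bash
# Sketch: open-ended ranges and lists with cut -c (1-based positions).
STR='01010string'   # illustrative variable name

# N-  : from position N to the end of the line
tail_part=$(cut -c 6- <<< "$STR")

# -M  : from the first position through position M
head_part=$(cut -c -5 <<< "$STR")

# N,M : a list of individual positions
picked=$(cut -c 1,6 <<< "$STR")

echo "$tail_part"   # string
echo "$head_part"   # 01010
echo "$picked"      # 0s
```

The open-ended form 6- is convenient when the length of the input is not known in advance.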

Method 2: Bash’s Substring (without using external command)

Syntax:

${VAR:start_index:length}

It uses a 0-based index system (indexing starts from 0).

Example 1: For demonstration, we will extract the substring of the string ‘My name is ROMY’ from index 11 to index 14. Indexes 11 through 14 cover 4 characters, so the length of the substring is 4.

Code:

STR="My name is ROMY"
echo "${STR:11:4}"

Output:

ROMY

Example 2: Extract the part of the string that lies before index 10. Because this method uses a 0-based index system, indexes 0 through 9 cover 10 characters, so the length of the desired string is 10.

Code:

STR="My name is ROMY"
echo "${STR:0:10}"

Output:

My name is
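Bash's substring expansion also accepts a few forms that the two examples above do not use. The following sketch shows an omitted length, a negative offset counted from the end of the string, and ${#STR} for the string length. The variable names from_11, last_4 and len are illustrative.

```shell
#!/usr/bin/env bash
# Sketch: other forms of Bash substring expansion (0-based offsets).
STR="My name is ROMY"

# Omitting the length takes everything from the offset to the end.
from_11="${STR:11}"

# A negative offset counts from the end of the string. The space after
# the colon is required so Bash does not read ":-" as the
# "use default value" expansion.
last_4="${STR: -4}"

# ${#STR} expands to the length of the string in characters.
len=${#STR}

echo "$from_11"   # ROMY
echo "$last_4"    # ROMY
echo "$len"       # 15
```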

Method 3: Using expr command 

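The expr command is used to perform:

  • Arithmetic operations such as addition, subtraction, multiplication, division and modulus.
  • Evaluation of regular expressions and string operations such as substring extraction.

It uses a 1-based index system. Note that substr is an extension provided by GNU expr and is not part of the POSIX specification, so it may be unavailable on other systems.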

Example 1: For demonstration, we will extract the substring of the string ‘My name is ROMY’ from index 12 to index 15. Indexes 12 through 15 cover 4 characters, so the length of the substring is 4.

Syntax:

expr substr <input_string> <start_index> <length>

Code:

expr substr "My name is ROMY" 12 4

Output:

ROMY

Example 2: Extract a substring from the start of the string through index 10. Because this method uses a 1-based index system, indexes 1 through 10 cover 10 characters, so the length is 10.

Code:

expr substr "My name is ROMY" 1 10

Output:

My name is
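The length passed to expr substr follows from the inclusive start and end indexes as end - start + 1. The helper below is a hypothetical illustration of that arithmetic (substr_range is not a standard command), and it assumes an expr that supports substr, such as the one in GNU coreutils.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the characters from 1-based index $2
# through index $3 (both inclusive) of the string $1.
# Assumes an expr that implements "substr" (e.g. GNU coreutils).
substr_range() {
    local str=$1 start=$2 end=$3
    local length=$(( end - start + 1 ))   # inclusive range -> length
    expr substr "$str" "$start" "$length"
}

substr_range "My name is ROMY" 12 15   # ROMY
substr_range "My name is ROMY" 1 10    # My name is
```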

Method 4: Using awk command 

awk is a scripting language used for manipulating text data. It does not require compilation and provides variables, string functions, and more. Its built-in substr() function can be used directly to get a substring.

The substr(s, i, n) function accepts three arguments.

  • s : The input string
  • i : The start index of the substring
  • n : The length of the substring.

It uses a 1-based index system.

Syntax:

awk '{print substr(s, start_index, length)}'

Here s is any awk string expression, for example $0 for the whole input line.

Example 1: Extract a substring of length 5 starting from index 12. Only 4 characters remain from index 12 onward, so substr() returns those 4 characters.

Code:

awk '{print substr($0, 12, 5)}' <<< 'My name is ROMY'

Output:

ROMY

Example 2: Extract a substring of length 10, starting from index 1.

Code:

awk '{print substr($0, 1, 10)}' <<< 'My name is ROMY'

Output:

My name is
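When the string is held in a shell variable rather than supplied on standard input, one option is to pass it to awk with -v. This is only a sketch; the awk variable names s and i are chosen for illustration. It also shows that substr() with the length omitted returns everything from the start index to the end of the string.

```shell
#!/usr/bin/env bash
# Sketch: pass a shell variable into awk with -v and call substr() on it.
STR="My name is ROMY"

# BEGIN runs without reading any input, so no here-string is needed.
# Note: awk interprets backslash escapes in -v values, so this form
# suits strings that contain no backslashes.
first_10=$(awk -v s="$STR" 'BEGIN { print substr(s, 1, 10) }')

# With the length omitted, substr() runs to the end of the string.
from_12=$(awk -v s="$STR" -v i=12 'BEGIN { print substr(s, i) }')

echo "$first_10"   # My name is
echo "$from_12"    # ROMY
```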

Extracting a Pattern-Based Substring

There are several ways to obtain a substring based on a pattern in the string:

  • Using cut command
  • Using awk command

Method 1: Using cut command

For demonstration, take the input string to be comma-separated values: “Romy,Pushkar,Kareena,Katrina”. The -d option sets the field delimiter, so (-d ,) tells cut that the input is comma-separated. The -f option tells cut which field to extract; for example, (-f 3) selects the third field in the string.

Syntax:

cut -d delimiter -f field_position <<< "comma_separated_string"

Code:

cut -d, -f 3 <<< "Romy,Pushkar,Kareena,Katrina"

This extracts the third field.

Output:

Kareena
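The -f option also takes lists and ranges of fields, in the same way that -c takes lists and ranges of characters. A brief sketch, with CSV as an illustrative variable name:

```shell
#!/usr/bin/env bash
# Sketch: selecting several fields with cut -d and -f.
CSV="Romy,Pushkar,Kareena,Katrina"   # illustrative variable name

# A comma-separated list selects individual fields.
first_and_third=$(cut -d, -f 1,3 <<< "$CSV")

# An open-ended range selects from a field through the last one.
from_second=$(cut -d, -f 2- <<< "$CSV")

echo "$first_and_third"   # Romy,Kareena
echo "$from_second"       # Pushkar,Kareena,Katrina
```

When several fields are selected, cut joins them in the output with the same delimiter it was given for the input.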

Method 2: Using awk command

Syntax:

awk -F field_separator '{print $field_position}' <<< "input_string"

Code:

To extract the third field from the string:

awk -F',' '{print $3}' <<< "Romy,Pushkar,Kareena,Katrina"

Output:

Kareena
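awk also exposes the number of fields in the built-in variable NF, so the last field can be selected without knowing its position. A minimal sketch, with illustrative variable names:

```shell
#!/usr/bin/env bash
# Sketch: field selection in awk using NF (the number of fields).
CSV="Romy,Pushkar,Kareena,Katrina"   # illustrative variable name

# $NF is the last field of the record.
last=$(awk -F',' '{ print $NF }' <<< "$CSV")

# NF itself is the field count.
count=$(awk -F',' '{ print NF }' <<< "$CSV")

echo "$last"    # Katrina
echo "$count"   # 4
```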

Different Pattern-Based Substring Case

It is not necessary that the input string is always a comma-separated value.

In this section, we will see how to obtain a substring that lies between two patterns in a string. This problem can be solved using the awk command.

  • sub(/.*start/, ""): removes everything from the beginning of the line up to and including the last occurrence of ‘start’.
  • sub(/end.*/, ""): removes everything from the first occurrence of ‘end’ to the end of the line, including ‘end’ itself.

Syntax:

awk '{ sub(/.*BEGIN:/, ""); sub(/END:.*/, ""); print }' <<< "input_string"

Code:

STR="Hello!! My name is ROMY kumari"
awk '{ sub(/.*!!/, ""); sub(/kumari.*/, ""); print }' <<< "$STR"

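Output (the spaces that surrounded the match in the original string are kept, so the result starts and ends with a space):

 My name is ROMY 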
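The same substring can also be obtained without awk. The sketch below swaps in Bash parameter expansion, which is a different technique from the awk approach above: ${var##pattern} removes the longest matching prefix and ${var%%pattern} removes the longest matching suffix. The variable names tmp and between are illustrative.

```shell
#!/usr/bin/env bash
# Sketch: substring between two patterns using only parameter expansion.
# (Run as a script; in an interactive shell "!!" can trigger history expansion.)
STR='Hello!! My name is ROMY kumari'

# ${STR##*!!} removes the longest prefix matching "*!!",
# i.e. everything up to and including the last "!!".
tmp=${STR##*!!}

# ${tmp%%kumari*} removes the longest suffix matching "kumari*",
# i.e. everything from the first "kumari" onward.
between=${tmp%%kumari*}

# Brackets make the surrounding spaces visible.
echo "[$between]"   # [ My name is ROMY ]
```

These patterns are shell globs rather than regular expressions, so this form suits fixed delimiters like the ones used here.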


Last Updated : 24 May, 2022