
Matching using regexp in GoLang


Regexp is short for regular expressions. It is easy to confuse regexp with regex, because many other languages and tools use the term regex for the same idea. In Go, however, the standard-library package is named regexp, so that is the name to remember when you import it and call its pre-built functions.

Matching – What is it?

Almost every kid has, at some point, worn the same clothes or accessories as a friend and said, “Hey buddy, we wore matching clothes” or “same pinch”. How exactly did those kids decide that one thing matches the other? It’s simple! When two things are exactly the same, we call them a match; when only a part of one thing is exactly the same as the other, we call it a partial match. Regular expressions work the same way: a pattern describes what to look for, and a text matches if it contains something that fits the pattern. (In Go’s regexp package the word submatch has a narrower meaning: the part of a match captured by a parenthesized group in the pattern.) So let’s dive deeper into the concept of matching in GoLang.

Matching using regexp in GoLang

regexp (regular expressions) is all about string/pattern matching. Every function in the package relies on text matching somewhere. Let us look at the functions that are important for matching, directly or indirectly.

Refer to Table 1.1 for the functions that compile a pattern into a Regexp object. Refer to Table 1.2 for the functions that only report whether a text matches. Refer to Table 1.3 for the other pre-built methods, which perform operations such as find and replace; these match the text first and then perform their operation on the result.

Wondering how that works? Consider this example: if your mother asks you to find something in the kitchen and gives you only a description of the object, what will you do? You’ll go to the kitchen, look for an object that matches the description, pick it up and give it to her. Your goal was only to find the object, but you had to match the description first in order to find it. Got it? Let’s now take a deeper look at these functions.
 

COMPILE

Function Description

regexp.Compile( ) Parses the pattern passed as input and, if it is valid, returns a Regexp object that can be used to match text. It returns two values: the Regexp object and an error, which is nil when the pattern compiles successfully.
regexp.CompilePOSIX( ) Does the same as Compile( ) but restricts the pattern syntax to POSIX ERE (egrep) syntax and changes the match semantics to leftmost-longest.
regexp.MustCompile( ) Does the same as Compile( ) but returns only the Regexp object, with no second err value. If the pattern is invalid, MustCompile( ) panics instead of returning an error, which makes it convenient for patterns that are fixed in the source code.
regexp.MustCompilePOSIX( ) Does the same as MustCompile( ) but with the POSIX restrictions of CompilePOSIX( ).

Table 1.1. Compilation methods
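
The difference between the compile functions in Table 1.1 can be sketched in a few lines. This is only a minimal illustration: the helper names compileOK and firstMatches and the sample pattern gee|geeks are assumptions made up for this example, not part of the regexp package.

```go
package main

import (
    "fmt"
    "regexp"
)

// compileOK (hypothetical helper) reports whether pattern is a valid
// regular expression. Compile returns an error instead of panicking.
func compileOK(pattern string) bool {
    _, err := regexp.Compile(pattern)
    return err == nil
}

// firstMatches (hypothetical helper) returns the leftmost match of pattern
// in text under the default leftmost-first rules and under the POSIX
// leftmost-longest rules. MustCompile panics on an invalid pattern.
func firstMatches(pattern, text string) (perl, posix string) {
    perl = regexp.MustCompile(pattern).FindString(text)
    posix = regexp.MustCompilePOSIX(pattern).FindString(text)
    return perl, posix
}

func main() {
    fmt.Println(compileOK("[gG]eek")) // true
    fmt.Println(compileOK("[gG"))     // false: missing closing bracket

    // The default engine prefers the first alternative that matches,
    // while the POSIX variant prefers the longest match.
    perl, posix := firstMatches("gee|geeks", "geeks")
    fmt.Println(perl, posix) // gee geeks
}
```

A common convention is to use MustCompile( ) for patterns written directly in the source code and Compile( ) for patterns that arrive at run time, such as user input, where an error is easier to handle than a panic.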

 

DIRECT MATCH FUNCTIONS

Function Description

regexp.Match( ) Takes a byte slice as input and reports whether it contains any match of the pattern. It returns the boolean value true if it does and false otherwise. Match( ) exists both as a method on a compiled Regexp object and as a package-level function that takes the pattern as its first argument and also returns an error.
regexp.MatchString( ) Works the same as Match( ). The main difference is that it takes a string as input.
regexp.MatchReader( ) Works the same as Match( ). The main difference is that it reads the text from an io.RuneReader.

Table 1.2. Direct match functions, which only report whether the text matches.
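
One detail worth seeing in code is that the match functions look for the pattern anywhere in the text, so a pattern has to be anchored with ^ and $ when the whole text must fit. The following is a small sketch; the variable names containsGeek and exactlyGeek are invented for this example.

```go
package main

import (
    "fmt"
    "regexp"
)

// containsGeek matches "geek" or "Geek" anywhere in the text.
var containsGeek = regexp.MustCompile(`[gG]eek`)

// exactlyGeek is anchored: ^ and $ force the match to cover the whole text.
var exactlyGeek = regexp.MustCompile(`^[gG]eek$`)

func main() {
    fmt.Println(containsGeek.MatchString("Hello Geeks")) // true
    fmt.Println(exactlyGeek.MatchString("Hello Geeks"))  // false
    fmt.Println(exactlyGeek.MatchString("geek"))         // true

    // The byte-slice method behaves the same way as the string method.
    fmt.Println(containsGeek.Match([]byte("Hello Geeks"))) // true

    // The package-level function compiles the pattern on every call and
    // reports an invalid pattern through its second return value.
    matched, err := regexp.MatchString(`[gG]eek`, "no match here")
    fmt.Println(matched, err) // false <nil>
}
```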

 

FIND AND REPLACE FUNCTIONS

Function Description

regexp.Find( ) Returns a byte slice holding the leftmost match of the pattern in the input byte slice, or nil if there is no match.
regexp.FindString( ) Returns a string holding the leftmost match of the pattern in the input string, or an empty string if there is no match.
regexp.FindSubmatch( ) Returns a slice of byte slices holding the leftmost match followed by the text captured by each parenthesized group (submatch) of the pattern.
regexp.FindStringSubmatch( ) Same as FindSubmatch( ), but it takes a string as input and returns a slice of strings.
regexp.FindAll( ) Returns a slice of byte slices holding all successive matches of the pattern in the input byte slice. The second argument limits the number of matches; a negative value returns all of them.
regexp.FindAllString( ) Same as FindAll( ), but it takes a string as input and returns a slice of strings.
regexp.FindAllSubmatch( ) Returns, for every successive match in the input byte slice, the result that FindSubmatch( ) would return for that match.
regexp.FindAllStringSubmatch( ) Same as FindAllSubmatch( ), but it takes a string as input and returns slices of strings.
regexp.FindIndex( ) Returns a two-element slice of integers holding the start and end positions of the leftmost match in the input byte slice.
regexp.FindStringIndex( ) Returns a two-element slice of integers holding the start and end positions of the leftmost match in the input string.
regexp.FindSubmatchIndex( ) Returns a slice of index pairs for the input byte slice: the first pair locates the leftmost match and each following pair locates one submatch.
regexp.FindStringSubmatchIndex( ) Same as FindSubmatchIndex( ), but it takes a string as input.
regexp.FindAllIndex( ) Returns a slice with one start/end index pair for every successive match in the input byte slice.
regexp.FindAllStringIndex( ) Returns a slice with one start/end index pair for every successive match in the input string.
regexp.FindAllSubmatchIndex( ) Returns, for every successive match in the input byte slice, the index pairs that FindSubmatchIndex( ) would return for that match.
regexp.FindAllStringSubmatchIndex( ) Same as FindAllSubmatchIndex( ), but it takes a string as input.
regexp.FindReaderIndex( ) Returns a two-element slice of integers holding the location of the leftmost match in the text read from an io.RuneReader.
regexp.FindReaderSubmatchIndex( ) Returns a slice of index pairs locating the leftmost match and its submatches in the text read from an io.RuneReader.
regexp.ReplaceAll( ) As the name suggests, this function replaces every match of the pattern in the input slice (arg1) with the replacement slice (arg2). It returns a copy of the slice, modified. In arg2, ‘$’ signs are interpreted the same way as in Expand, so for instance $1 stands for the text of the first submatch.
regexp.ReplaceAllFunc( ) Almost the same as ReplaceAll( ), with one difference: arg2 is not a slice but a function that is called with each match and returns the slice that replaces it. The returned slice is inserted as it is, without Expand. It returns a copy of the slice, modified.
regexp.ReplaceAllString( ) Same as ReplaceAll( ), but it takes string arguments and returns a copy of the string, modified.
regexp.ReplaceAllLiteral( ) Same as ReplaceAll( ), but the text in arg2 is inserted exactly as written, so signs such as ‘$’ have no special meaning. It returns a copy of the slice, modified.
regexp.ReplaceAllStringFunc( ) Same as ReplaceAllFunc( ), but it operates with string inputs and string output.
regexp.ReplaceAllLiteralString( ) Same as ReplaceAllLiteral( ), but it operates with string inputs and string output.

Table 1.3. Methods of a compiled Regexp object that perform other operations but first match the text in order to carry them out.
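
The Submatch and Index variants are easier to follow with a pattern that contains parenthesized groups. Here is a minimal sketch; the pattern, the variable name keyValue and the sample text are assumptions chosen only for illustration.

```go
package main

import (
    "fmt"
    "regexp"
)

// keyValue has two parenthesized groups: the key and the numeric value.
var keyValue = regexp.MustCompile(`(\w+)=(\d+)`)

func main() {
    text := "width=640 height=480"

    // Element 0 is the whole match; elements 1 and 2 are the two groups.
    fmt.Println(keyValue.FindStringSubmatch(text)) // [width=640 width 640]

    // Index pairs in the same order: whole match, group 1, group 2.
    fmt.Println(keyValue.FindStringSubmatchIndex(text)) // [0 9 0 5 6 9]

    // The second argument of the All variants limits the number of matches.
    fmt.Println(keyValue.FindAllString(text, -1)) // [width=640 height=480]
    fmt.Println(keyValue.FindAllString(text, 1))  // [width=640]

    // With no match, the slice-returning methods return nil.
    fmt.Println(keyValue.FindStringSubmatch("no pairs here") == nil) // true
}
```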

Note: We’ve been saying “a copy of ____, modified” and not “the modified ____”. This is because the replace functions never change the source slice or string. Instead, they build and return a new one that contains the replacements.
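
Both points, the unchanged source and the special meaning of ‘$’ in the replacement, can be checked with a short sketch. The pattern and the variable name word are assumptions made up for this example.

```go
package main

import (
    "fmt"
    "regexp"
)

// word captures the first letter of "Geeks" or "geeks" in group 1.
var word = regexp.MustCompile(`([gG])eeks`)

func main() {
    src := "Geeks for geeks"

    // ${1} is expanded to the text of group 1. The braces matter here:
    // $1urus would be read as a group named "1urus".
    fmt.Println(word.ReplaceAllString(src, "${1}urus")) // Gurus for gurus

    // The Literal variant inserts the replacement exactly as written.
    fmt.Println(word.ReplaceAllLiteralString(src, "${1}urus")) // ${1}urus for ${1}urus

    // The source string is left untouched by both calls.
    fmt.Println(src) // Geeks for geeks
}
```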

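As we mentioned earlier, regexp itself is a bundle of operations dealing with string matching, so directly or indirectly almost every function in the package deals with matching. There’s no need to panic at the length of the list: the names follow a consistent scheme (String for string input, Submatch for parenthesized groups, Index for positions, All for every match), so once you understand one function the rest are easy to guess.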

Examples:

Code 1: Direct implementation of Match methods

Go




package main

import (
    "fmt"
    "regexp"
)

func main() {
    fmt.Println("--------Reference strings--------")
    fmt.Println()

    name := "My name is Geeks for geeks."
    fmt.Println(name)

    profession := "I am a computer science portal for geeks."
    fmt.Println(profession)

    message := "You can find anything here, if not tell us and we'll add it for you!"
    fmt.Println(message)

    fmt.Println("\n--------Match functions--------")

    // regexp.Match takes the pattern and a byte slice. It reports whether
    // the pattern matches anywhere in the text, plus any pattern error.
    matched, err := regexp.Match("[gG]e*k.*", []byte(name))
    fmt.Println("\nregexp.Match returns ->", matched,
        "and error(if any) --->", err)

    matched, err = regexp.Match("[gG]e*k.*", []byte(profession))
    fmt.Println("\nregexp.Match returns ->", matched,
        "and error(if any) --->", err)

    // regexp.MatchString does the same for a string. The message does not
    // contain "Geek", so this call reports false.
    matched, err = regexp.MatchString("Geek.*", message)
    fmt.Println("\nregexp.MatchString returns ->", matched,
        "and error(if any) --->", err)
}


Command to run on command prompt:

:/Directory where the go file is present/> go run (file_name).go

Output:

--------Reference strings--------

My name is Geeks for geeks.
I am a computer science portal for geeks.
You can find anything here, if not tell us and we'll add it for you!

--------Match functions--------

regexp.Match returns -> true and error(if any) ---> <nil>

regexp.Match returns -> true and error(if any) ---> <nil>

regexp.MatchString returns -> false and error(if any) ---> <nil>

Code 2: Direct implementation of MatchReader( ), followed by ReplaceAllFunc( )

Go




package main

import (
    "bytes"
    "fmt"
    "regexp"
)

func main() {
    obj := regexp.MustCompile("ee")
    s := []byte("Hello GeekS, 1234")
    fmt.Println("Initial byte array -----> ", s)
    fmt.Println("Initial string ---------> ", string(s))

    // MatchReader reads its text from an io.RuneReader, so the reader must
    // actually wrap the text. bytes.NewReader(s) provides one over s.
    fmt.Println("MatchReader ------------> ", obj.MatchReader(bytes.NewReader(s)))

    ex := []byte("NEW")
    fmt.Println("ReplaceAllFunc( ) work in progress...")
    // The function is called once per match and returns its replacement.
    s = obj.ReplaceAllFunc(s, func(match []byte) []byte {
        return ex
    })
    fmt.Println("Final string -----------> ", string(s))
    fmt.Println("Final byte array -------> ", s)
}


Command to run on command prompt:

:/Directory where the go file is present/> go run (file_name).go

Output:

Initial byte array ----->  [72 101 108 108 111 32 71 101 101 107 83 44 32 49 50 51 52]
Initial string --------->  Hello GeekS, 1234
MatchReader ------------>  true
ReplaceAllFunc( ) work in progress...
Final string ----------->  Hello GNEWkS, 1234
Final byte array ------->  [72 101 108 108 111 32 71 78 69 87 107 83 44 32 49 50 51 52]

Code 3: Indirect implementation of Match methods

Go




package main

import (
    "fmt"
    "regexp"
)

func main() {
    fmt.Println("--------Reference strings--------")
    fmt.Println()

    name := "My name is Geeks for geeks."
    fmt.Println(name)

    profession := "I am a computer science portal for geeks."
    fmt.Println(profession)

    message := "You can find anything here, if not tell us and we'll add it for you!"
    fmt.Println(message)

    fmt.Println("\n--------Compiling functions--------")
    fmt.Println()

    // "g" or "G", then "ee", then at most one more character.
    musComp := regexp.MustCompile("[gG]ee.?")
    fmt.Println("Initialized the regexp object to musComp...")

    fmt.Println("\n--------Find functions--------")
    fmt.Println()

    // Leftmost match only.
    fmt.Println("mustCompile.Find -----------------------> ",
        musComp.Find([]byte(name)))

    fmt.Println("mustCompile.FindString -----------------> ",
        musComp.FindString(name))

    // The pattern has no groups, so only the whole match is returned.
    fmt.Println("mustCompile.FindSubmatch ---------------> ",
        musComp.FindSubmatch([]byte(name)))

    fmt.Println("mustCompile.FindStringSubmatch ---------> ",
        musComp.FindStringSubmatch(name))

    // All matches; -1 means no limit on the number of matches.
    fmt.Println("mustCompile.FindAll --------------------> ",
        musComp.FindAll([]byte(name), -1))

    fmt.Println("mustCompile.FindAllString --------------> ",
        musComp.FindAllString(name, -1))

    fmt.Println("mustCompile.FindAllSubmatch ------------> ",
        musComp.FindAllSubmatch([]byte(name), -1))

    fmt.Println("mustCompile.FindAllStringSubmatch ------> ",
        musComp.FindAllStringSubmatch(name, -1))

    // Start and end positions of the leftmost match.
    fmt.Println("mustCompile.FindIndex ------------------> ",
        musComp.FindIndex([]byte(name)))

    fmt.Println("mustCompile.FindStringIndex ------------> ",
        musComp.FindStringIndex(name))

    fmt.Println("mustCompile.FindSubmatchIndex ----------> ",
        musComp.FindSubmatchIndex([]byte(name)))

    fmt.Println("mustCompile.FindStringSubmatchIndex ----> ",
        musComp.FindStringSubmatchIndex(name))

    // Start and end positions of every match.
    fmt.Println("mustCompile.FindAllIndex ---------------> ",
        musComp.FindAllIndex([]byte(name), -1))

    fmt.Println("mustCompile.FindAllStringIndex ---------> ",
        musComp.FindAllStringIndex(name, -1))

    fmt.Println("mustCompile.FindAllSubmatchIndex -------> ",
        musComp.FindAllSubmatchIndex([]byte(name), -1))

    fmt.Println("mustCompile.FindAllStringSubmatchIndex -> ",
        musComp.FindAllStringSubmatchIndex(name, -1))

    fmt.Println("\n--------Replace functions--------")
    fmt.Println()

    // Each call returns a modified copy; name itself is never changed.
    fmt.Println("mustCompile.ReplaceAll -----------------> ",
        musComp.ReplaceAll([]byte(name), []byte("Bow bow!")))

    fmt.Println("mustCompile.ReplaceAllString -----------> ",
        musComp.ReplaceAllString(name, "Bow bow!"))

    fmt.Println("mustCompile.ReplaceAllLiteral ----------> ",
        musComp.ReplaceAllLiteral([]byte(name), []byte("T")))

    fmt.Println("mustCompile.ReplaceAllLiteralString ----> ",
        musComp.ReplaceAllLiteralString(name, "T"))
}


Command to run on command prompt:

:/Directory where the go file is present/> go run (file_name).go

Output:

--------Reference strings--------

My name is Geeks for geeks.
I am a computer science portal for geeks.
You can find anything here, if not tell us and we'll add it for you!

--------Compiling functions--------

Initialized the regexp object to musComp...

--------Find functions--------

mustCompile.Find ----------------------->  [71 101 101 107]
mustCompile.FindString ----------------->  Geek
mustCompile.FindSubmatch --------------->  [[71 101 101 107]]
mustCompile.FindStringSubmatch --------->  [Geek]
mustCompile.FindAll -------------------->  [[71 101 101 107] [103 101 101 107]]
mustCompile.FindAllString -------------->  [Geek geek]
mustCompile.FindAllSubmatch ------------>  [[[71 101 101 107]] [[103 101 101 107]]]
mustCompile.FindAllStringSubmatch ------>  [[Geek] [geek]]
mustCompile.FindIndex ------------------>  [11 15]
mustCompile.FindStringIndex ------------>  [11 15]
mustCompile.FindSubmatchIndex ---------->  [11 15]
mustCompile.FindStringSubmatchIndex ---->  [11 15]
mustCompile.FindAllIndex --------------->  [[11 15] [21 25]]
mustCompile.FindAllStringIndex --------->  [[11 15] [21 25]]
mustCompile.FindAllSubmatchIndex ------->  [[11 15] [21 25]]
mustCompile.FindAllStringSubmatchIndex ->  [[11 15] [21 25]]

--------Replace functions--------

mustCompile.ReplaceAll ----------------->  [77 121 32 110 97 109 101 32 105 115 32 66 111 119 32 98 111 119 33 115 32 102 111 114 32 66 111 119 32 98 111 119 33 115 46]
mustCompile.ReplaceAllString ----------->  My name is Bow bow!s for Bow bow!s.
mustCompile.ReplaceAllLiteral ---------->  [77 121 32 110 97 109 101 32 105 115 32 84 115 32 102 111 114 32 84 115 46]
mustCompile.ReplaceAllLiteralString ---->  My name is Ts for Ts.



Last Updated : 14 Oct, 2020