Extract URLs present in a given string

  • Last Updated : 08 Feb, 2021

Given a string S, the task is to find and extract all the URLs from the string. If no URL is present in the string, then print “-1”.

Examples:

Input: S = “Welcome to https://www.geeksforgeeks.org Computer Science Portal”
Output: https://www.geeksforgeeks.org
Explanation:
The given string contains the URL ‘https://www.geeksforgeeks.org’.

Input: S = “Welcome to https://write.geeksforgeeks.org portal of https://www.geeksforgeeks.org Computer Science Portal”
Output:
https://write.geeksforgeeks.org 
https://www.geeksforgeeks.org
Explanation:
The given string contains two URLs ‘https://write.geeksforgeeks.org’ and ‘https://www.geeksforgeeks.org’.

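Approach: The idea is to use a regular expression. The expression below matches a scheme (http, https, ftp or file) followed by "://", then any run of characters that may appear inside a URL, and finally one character that is not sentence punctuation, so that a trailing period or comma is not captured. The character class contains no space, since whitespace ends a URL.

regex = “\\b((?:https?|ftp|file)://[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*[-a-zA-Z0-9+&@#/%=~_|])”

Follow the steps below to solve the given problem: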

  • Create an ArrayList in Java and compile the regular expression using Pattern.compile().
  • Match the given string with the regular expression. In Java, this can be done by using Pattern.matcher().
  • For each match found by Matcher.find(), take the substring from the start index to the end index of the match and add it to the list.
  • After completing the above steps, if the list is empty, print “-1”, as there is no URL present in the string S. Otherwise, print all the strings stored in the list.
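
The two character classes in the regular expression play different roles: the first admits punctuation such as '.', ',' and ';' inside a URL, while the second forces the match to end on a non-punctuation character. A minimal sketch of that behaviour follows, assuming the character class contains no space. The class name `UrlRegexDemo`, the helper `firstUrl` and the sample strings are illustrative and not part of the original article.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UrlRegexDemo {

    // Same pattern as in the approach, written without a space in the
    // character class (assumption: the space is not intended).
    static final Pattern URL = Pattern.compile(
        "\\b((?:https?|ftp|file)://[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*"
            + "[-a-zA-Z0-9+&@#/%=~_|])",
        Pattern.CASE_INSENSITIVE);

    // Hypothetical helper: returns the first URL in text, or null if none.
    static String firstUrl(String text) {
        Matcher m = URL.matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // The sentence-ending period is left out of the match.
        System.out.println(firstUrl("See https://example.com/docs."));
        // prints: https://example.com/docs

        // The query string is kept; the trailing comma is dropped.
        System.out.println(firstUrl("Open ftp://example.com/a?b=1, then wait"));
        // prints: ftp://example.com/a?b=1

        // Without a scheme there is nothing to match.
        System.out.println(firstUrl("www.example.com"));
        // prints: null
    }
}
```

Text without one of the listed schemes is not recognised by this pattern, so a bare host name such as www.example.com is skipped.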

Below is the implementation of the above approach:

Java




// Java program for the above approach
  
import java.util.*;
import java.util.regex.*;
class GFG {
  
    // Function to extract all the URL
    // from the string
    public static void extractURL(
        String str)
    {
  
        // Creating an empty ArrayList
        List<String> list
            = new ArrayList<>();
  
        // Regular Expression to extract
        // URL from the string (the
        // character class has no space,
        // so a match stops at whitespace)
        String regex
            = "\\b((?:https?|ftp|file):"
              + "//[-a-zA-Z0-9+&@#/%?="
              + "~_|!:,.;]*[-a-zA-Z0-9+"
              + "&@#/%=~_|])";
  
        // Compile the Regular Expression
        Pattern p = Pattern.compile(
            regex,
            Pattern.CASE_INSENSITIVE);
  
        // Find the match between string
        // and the regular expression
        Matcher m = p.matcher(str);
  
        // Repeatedly find the next
        // substring of the input that
        // matches the pattern
        while (m.find()) {
  
            // Add the matched text, i.e.
            // the substring from m.start()
            // to m.end(), to the list
            list.add(m.group());
        }
  
        // If there is no URL present
        if (list.isEmpty()) {
            System.out.println("-1");
            return;
        }
  
        // Print all the URLs stored
        for (String url : list) {
            System.out.println(url);
        }
    }
  
    // Driver Code
    public static void main(String[] args)
    {
  
        // Given String str
        String str
            = "Welcome to https://www.geeksforgeeks"
              + ".org Computer Science Portal";
  
        // Function Call
        extractURL(str);
    }
}

Output:

https://www.geeksforgeeks.org
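
When the URLs are needed by other code rather than printed, the same matching can return a list. The sketch below is one possible variant, not the article's implementation. It assumes Java 9 or later for `Matcher.results()`; the class name `UrlExtractor` and the method `extractUrls` are illustrative.

```java
import java.util.List;
import java.util.regex.MatchResult;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class UrlExtractor {

    // Pattern from the approach, compiled once and reused across calls.
    private static final Pattern URL = Pattern.compile(
        "\\b((?:https?|ftp|file)://[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*"
            + "[-a-zA-Z0-9+&@#/%=~_|])",
        Pattern.CASE_INSENSITIVE);

    // Hypothetical helper: returns every URL in text, in order of appearance.
    static List<String> extractUrls(String text) {
        return URL.matcher(text)
                  .results()                // Stream<MatchResult>, Java 9+
                  .map(MatchResult::group)  // whole matched text
                  .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> urls = extractUrls(
            "Welcome to https://write.geeksforgeeks.org portal of "
                + "https://www.geeksforgeeks.org Computer Science Portal");

        // Same output convention as the article: "-1" when nothing is found.
        if (urls.isEmpty()) {
            System.out.println("-1");
        } else {
            urls.forEach(System.out::println);
        }
    }
}
```

Returning the list keeps the "-1" convention a concern of the caller, which makes the extraction step easier to test in isolation.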

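Time Complexity: O(N), where N is the length of the string S.
Auxiliary Space: O(N) in the worst case, since the matched URLs are stored in a list.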

