Web Scraping in PHP Using Simple HTML DOM Parser
  • Last Updated : 22 Nov, 2019

Web scraping is a technique for extracting large amounts of data from websites. The extracted data can be saved to a local file on your computer or to a database, or exposed through an API. Most websites display their data only through a web browser and offer no way to save a copy for later use. The only manual option is to copy and paste the required data, which is tedious and may take hours to complete. Web scraping automates this process: instead of manual work, scraping software performs the same task within seconds.

Scraping works by targeting selected DOM elements of a web page and then processing or storing the text inside them. In PHP, this can be done with the Simple HTML DOM Parser, an API that parses the whole page and looks for the required elements within the DOM. To know more about web scraping, visit this article.

The Simple HTML DOM Parser can be downloaded by clicking this link.
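Before fetching a live page, it may help to see the idea of targeting DOM elements on a small, fixed piece of markup. The following is a minimal sketch, not a definitive implementation. It assumes `simple_html_dom.php` has been downloaded into the same directory as the script, and the sample markup, the `post` class, and the variable names are hypothetical and chosen only for illustration. It uses the parser's `str_get_html()` function to parse a string, then `find()` with a CSS-style selector to collect text and attribute values.

```php
<?php
// Assumption: simple_html_dom.php sits next to this script.
include('simple_html_dom.php');

// Hypothetical sample markup, used here instead of a live web page.
$markup = '<div class="post"><h2>First title</h2><a href="/one">Read</a></div>'
        . '<div class="post"><h2>Second title</h2><a href="/two">Read</a></div>';

// Parse the string into a DOM object.
$html = str_get_html($markup);

// Collect the text of every <h2> inside a div with class "post".
$titles = [];
foreach ($html->find('div.post h2') as $heading) {
    $titles[] = $heading->plaintext;
}

// Collect the href attribute of every link.
$links = [];
foreach ($html->find('a') as $anchor) {
    $links[] = $anchor->href;
}

print_r($titles);
print_r($links);
```

The same `find()` calls work on an object returned by `file_get_html()`, which is what the examples in this article use to load a page from a URL.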

Example 1: The example below uses this API to display a Google search results page on the localhost.

  • HTML Code:




    <!DOCTYPE html>
    <html lang="en">

    <head>
        <meta charset="UTF-8">
        <meta name="viewport" content=
            "width=device-width, initial-scale=1.0">
        <meta http-equiv="X-UA-Compatible" content="ie=edge">
        <title>Document</title>
    </head>

    <body>
        <form action="GoogleSearch.php" method="POST">
            <input type="text" name="search">
            <br><br>
            <button>
                Search
            </button>
        </form>
    </body>

    </html>
  • PHP code:




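    <?php

    // Load the parser (adjust the path if simple_html_dom.php is elsewhere)
    include('simple_html_dom.php');

    // Read the submitted search term and URL-encode it
    $query = urlencode($_POST["search"] ?? '');

    // Extracting DOM
    $html = file_get_html('http://www.google.com/search?q=' . $query);

    // Displaying DOM, or a message if the page could not be loaded
    echo $html ? $html : 'Could not load the search results.';

    ?>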

    Output: The local server displays the Google results page for the submitted search term.

    Example 2: Here we access the text of the first Google search result. We first fetch the DOM element that holds the first result for the query. The code finds the div tags with class ‘kCrYT’, which hold the details of all the results. Since only the first one is needed, the loop runs only once.

    • PHP code: This code works only after a search term has been submitted through the form from Example 1.




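      <?php

      include('simple_html_dom.php');

      // URL-encode the submitted search term before adding it to the URL
      $html = file_get_html(
          'http://www.google.com/search?q=' . urlencode($_POST["search"]));

      // Print the text of the first matching element only
      foreach($html->find('div.kCrYT') as $elements) {
          echo $elements->plaintext;
          break;
      }
      ?>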
    • Output:
      GeeksforGeeks is a very fast-growing community among programmers
      and have a reach of around 10 million+ readers globally. Writing will
      surely enhance your knowledge of the subject as before writing any
       topic, you need to be very crisp and clear about it.
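The loop with `break` above reads only the first match. As an alternative, `find()` accepts an index as its second argument and then returns a single element (or `null` when nothing matches). The sketch below shows this on a fixed string rather than a live Google page. The markup and its result texts are hypothetical stand-ins, and the script assumes `simple_html_dom.php` is in the same directory.

```php
<?php
// Assumption: simple_html_dom.php sits next to this script.
include('simple_html_dom.php');

// Hypothetical stand-in for a results page with two matching elements.
$markup = '<div class="kCrYT">First result text</div>'
        . '<div class="kCrYT">Second result text</div>';

$html = str_get_html($markup);

// Index 0 asks for the first match only; null is returned if none exists.
$first = $html->find('div.kCrYT', 0);

// Fall back to an empty string when the selector matches nothing.
$text = $first ? $first->plaintext : '';

echo $text;
```

Checking for `null` before reading `plaintext` avoids an error when the page contains no element with the expected class.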


