What is Web Scraping in Node.js ?

Last Updated: 07 Feb, 2023

Web scraping means collecting data such as text, images, or video from websites. It is especially useful when a large amount of data has to be collected, because automating the process saves a great deal of time.

Puppeteer: Node.js has many modules for web scraping, and Puppeteer is one of the most popular and easiest to use. It provides methods for controlling a browser from code, which simplifies both web scraping and web automation. Install the module in the project directory with the following command:

npm install puppeteer

Approach: 

Step 1: Require Puppeteer Module

const puppeteer = require('puppeteer');

Step 2: Make an async function

async function webScraper() {
};

webScraper();
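
Puppeteer's methods return Promises, which is why the scraping logic lives in an async function. The sketch below illustrates the pattern without launching a browser: fakeBrowserCall is a hypothetical stand-in for a Promise-returning Puppeteer call, and attaching .catch to the call keeps a rejected Promise from going unhandled.

```javascript
// Hypothetical stand-in for a Promise-returning Puppeteer call
// (not part of Puppeteer; it only simulates an asynchronous result).
function fakeBrowserCall(value, delayMs) {
    return new Promise(resolve => setTimeout(() => resolve(value), delayMs));
}

async function webScraper() {
    // `await` pauses this function until the Promise settles
    const text = await fakeBrowserCall('Example heading', 10);
    return text;
}

// An async function always returns a Promise, so handle both outcomes
const result = webScraper();
result
    .then(text => console.log(text))
    .catch(err => {
        console.error('Scraping failed:', err);
        process.exitCode = 1;
    });
```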

Step 3: Inside the function, create two constants. The first, browser, holds the browser instance that Puppeteer launches. The second, page, holds a new page (tab) in that browser, which is used to open the site we want to scrape.

async function webScraper() {
    const browser = await puppeteer.launch({})
    const page = await browser.newPage()
};
webScraper();
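
If a step after puppeteer.launch throws, the browser process can be left running. One way to guard against that, sketched here under assumptions, is to wrap the page work in try/finally. withPage is a hypothetical helper name, and the launcher is passed in as a parameter (in real use it would be the puppeteer module) so the pattern can be tried without a browser installed.

```javascript
// Hypothetical helper: opens a page, runs `task` on it, and always
// closes the browser, even when `task` throws.
// `launcher` is any object with a Puppeteer-style launch() method.
async function withPage(launcher, task) {
    const browser = await launcher.launch({});
    try {
        const page = await browser.newPage();
        return await task(page);
    } finally {
        // Runs on success and on failure, so no browser is left behind
        await browser.close();
    }
}
```

With Puppeteer installed, a call might look like withPage(require('puppeteer'), async page => { /* scraping steps */ }).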

Step 4: Use the goto method to open the page we want to scrape, wait for the element whose text we want with waitForSelector, extract the text from that element with evaluate, and log it to the console. Finally, close the browser.

await page.goto('https://www.geeksforgeeks.org/explain-the-mechanism-of-event-loop-in-node-js/')
const element = await page.waitForSelector('h1')
const text = await page.evaluate(element => element.textContent, element)
console.log(text)
await browser.close()
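
The wait-then-extract step can also be written as a small reusable function. The sketch below assumes a Puppeteer page object and uses page.waitForSelector followed by page.$eval, which runs a callback against the first element matching the selector. getText is a hypothetical name, not something from the article's code.

```javascript
// Hypothetical helper: waits for `selector` to appear, then returns the
// trimmed text of the first matching element.
// `page` is assumed to be a Puppeteer Page (or any object with the same
// waitForSelector and $eval methods).
async function getText(page, selector) {
    await page.waitForSelector(selector);
    // The callback runs inside the browser page, so it must not rely on
    // variables from the Node.js scope.
    return page.$eval(selector, el => el.textContent.trim());
}
```

Inside the scraper function, a call might look like const text = await getText(page, 'h1').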

Example:

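Javascript

const puppeteer = require('puppeteer');

async function webScraper() {
    // Launch the browser and open a new page
    const browser = await puppeteer.launch({})
    const page = await browser.newPage()

    // Open the page to scrape
    await page.goto(
        'https://www.geeksforgeeks.org/explain-the-mechanism-of-event-loop-in-node-js/')

    // Wait for the heading, then read its text inside the page
    const element = await page.waitForSelector("h1")
    const text = await page.evaluate(
        element => element.textContent, element)
    console.log(text)

    // Close the browser so the Node.js process can exit
    await browser.close()
};

webScraper();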
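
Scraping often means collecting many elements rather than a single heading. As a sketch under assumptions, page.$$eval runs a callback over every element that matches a selector. getLinks is a hypothetical helper name, and the a selector is only an illustration of what could be collected.

```javascript
// Hypothetical helper: returns the text and href of every link on the page.
// `page` is assumed to be a Puppeteer Page that has already loaded a URL.
async function getLinks(page) {
    // The callback receives an array of all elements matching 'a'
    return page.$$eval('a', anchors =>
        anchors.map(a => ({ text: a.textContent.trim(), href: a.href })));
}
```

Inside the scraper function, after page.goto has resolved, a call might look like const links = await getLinks(page).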

Steps to run the application: Save the code in a file named app.js, then open the terminal in the project directory and run the following command.

node app.js

Output: The text content of the page's first h1 element is printed to the console.
