What is Web Scraping in Node.js?
Web scraping means programmatically collecting data such as text, images, or video from websites. It is especially useful when a large amount of data has to be collected, because automating the process saves a great deal of time.
Puppeteer: Node.js has many modules for web scraping, and Puppeteer is one of the most popular and easiest to use. It provides many methods that make web scraping and web automation much easier. Install the module in the project directory with the following command:
npm install puppeteer
Approach:
Step 1: Require Puppeteer Module
const puppeteer = require('puppeteer');
Step 2: Make an async function
async function webScraper() {
}
webScraper();
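The function is async because Puppeteer's methods return promises, and await can only be used inside an async function. The following is a minimal sketch of that entry-point pattern, not part of the original tutorial: fakeBrowserWork is a hypothetical stand-in for the browser calls added in the later steps, and the .catch handler is an assumption about how failures should be reported.

```javascript
// Hypothetical stand-in for slow browser work; resolves after a short delay.
function fakeBrowserWork() {
  return new Promise(resolve => setTimeout(() => resolve('page loaded'), 10));
}

async function webScraper() {
  // await pauses this function (not the whole program) until the promise settles.
  const result = await fakeBrowserWork();
  return result;
}

// An async function always returns a promise, so handlers can be attached to it.
const pending = webScraper();
pending
  .then(result => console.log(result))
  .catch(error => {
    // Without a catch, a failed scrape would surface as an unhandled rejection.
    console.error('Scraping failed:', error);
    process.exitCode = 1;
  });
```

Attaching a .catch to the call keeps a failed navigation or a missing element from ending the script with an unhandled promise rejection.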
Step 3: Inside the function, create two constants. The first, browser, holds the browser instance that Puppeteer launches. The second, page, holds a new page opened in that browser, which is used for scraping.
async function webScraper() {
    const browser = await puppeteer.launch({});
    const page = await browser.newPage();
}
webScraper();
Step 4: Using the goto method, open the website to scrape, wait for the element whose text we want, extract the text from that element, and log it to the console. Finally, close the browser. Note that current Puppeteer versions use waitForSelector; the older waitFor method has been removed.
await page.goto('https://www.geeksforgeeks.org/explain-the-mechanism-of-event-loop-in-node-js/');
const element = await page.waitForSelector('h1');
const text = await page.evaluate(element => element.textContent, element);
console.log(text);
await browser.close();
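If navigation fails or the selector never appears, the steps above would leave the browser process running. The sketch below shows one way to guard against that with try/finally. It is an illustration under assumptions rather than the tutorial's own code: scrapeText is a hypothetical helper, and its launcher parameter is any object with a Puppeteer-style launch() method, so the real puppeteer module can be passed in. It uses waitForSelector, which newer Puppeteer releases provide in place of the older waitFor.

```javascript
// Hypothetical helper: `launcher` is any object exposing a Puppeteer-style launch().
async function scrapeText(launcher, url, selector) {
  const browser = await launcher.launch();
  try {
    const page = await browser.newPage();
    await page.goto(url);
    // Wait until an element matching the selector exists in the page.
    const element = await page.waitForSelector(selector);
    // The callback runs inside the page; `element` is passed in as its argument.
    return await page.evaluate(el => el.textContent, element);
  } finally {
    // Runs on success and on error, so the browser is always closed.
    await browser.close();
  }
}

// With the real library (assumes `npm install puppeteer` has been run):
// scrapeText(require('puppeteer'),
//   'https://www.geeksforgeeks.org/explain-the-mechanism-of-event-loop-in-node-js/',
//   'h1').then(console.log);
```

Passing the launcher as a parameter is a design choice made for this sketch: it lets the function be exercised with a stub object, without starting a real browser.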
Example:
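// app.js
const puppeteer = require('puppeteer');

async function webScraper() {
    // Launch a browser and open a new page in it
    const browser = await puppeteer.launch({});
    const page = await browser.newPage();

    // Open the site, wait for the h1 element, and read its text
    await page.goto('https://www.geeksforgeeks.org/explain-the-mechanism-of-event-loop-in-node-js/');
    const element = await page.waitForSelector('h1');
    const text = await page.evaluate(element => element.textContent, element);
    console.log(text);

    await browser.close();
}

webScraper();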
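The example extracts a single element, but scraping usually means collecting many. As a hedged sketch of how the same idea might extend, the hypothetical helper collectTexts below uses Puppeteer's page.$$eval, which runs a callback in the page with every element matching a selector. The trimming and filtering of empty strings are assumptions made for this illustration, not part of the original example.

```javascript
// Hypothetical helper: `page` is a Puppeteer page (or any object with a compatible $$eval).
async function collectTexts(page, selector) {
  // $$eval passes an array of all matching elements to the callback,
  // which runs inside the page and returns serializable data.
  const texts = await page.$$eval(selector, elements =>
    elements.map(el => el.textContent.trim())
  );
  // Drop elements that contained only whitespace.
  return texts.filter(text => text.length > 0);
}

// Possible use inside webScraper(), after page.goto(...):
// const headings = await collectTexts(page, 'h2');
// console.log(headings);
```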
Steps to run the application: Save the code in a file named app.js, then open the terminal and run the following command.
node app.js
Output: The text content of the page's h1 heading is printed to the console.