How to get all HTML content from DOMParser excluding the outer body tag?

The DOM (Document Object Model) allows us to dynamically access and manipulate HTML data. All the text of an HTML string can also be extracted using DOMParser. Its parseFromString() method returns a Document object built from HTML, XML or SVG input. The elements of that document can then be accessed with the [ ] operator in JavaScript, for example through the document.all collection.

[Figure: the HTML DOM tree of objects]

Steps to get all the text from an HTML document using DOMParser:

  1. Declare an instance of DOMParser.
    Syntax:

    const parser = new DOMParser();
  2. Parse the string using the parseFromString() method. It takes two arguments: the string to be parsed and its MIME type, such as "text/html".
    Syntax:



    const parsedDocument = parser.parseFromString(
            htmlInput, "text/html");
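  3. Use the parsedDocument.all collection to access every element of the parsed page; its root <html> element is stored at index 0. Note that document.all is a legacy feature, and parsedDocument.documentElement refers to the same root element. We can also use getElementById() to get the content of a specific element.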
    Syntax:

    var allText = parsedDocument.all[0].textContent;

Finally, we use the textContent property of parsedDocument.all[0] to get the text of all HTML elements.
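
The steps above return the text of the whole document. To get the content without the outer body tag, as the title asks, one option is to read the innerHTML of the parsed document's body. The following is a minimal sketch, assuming a browser environment where DOMParser is available as a global; the helper name getBodyContent and its optional parser parameter are illustrative assumptions, not part of any API.

```javascript
// Hypothetical helper: parse an HTML string and return what is inside <body>.
// The optional `parser` argument is an assumption made here so the function
// can also be exercised with a stand-in parser outside a browser.
function getBodyContent(htmlInput, parser = new DOMParser()) {
    const parsedDocument = parser.parseFromString(htmlInput, "text/html");

    return {
        // Markup of everything inside <body>, without the <body> tag itself
        html: parsedDocument.body.innerHTML,
        // Text of the body only; a <title> is placed in <head>, so it is excluded
        text: parsedDocument.body.textContent
    };
}

// Usage in a browser (guarded so the snippet does nothing elsewhere)
if (typeof DOMParser !== "undefined") {
    const result = getBodyContent("<title>T</title><div><p>Hello</p></div>");
    console.log(result.html); // <div><p>Hello</p></div>
    console.log(result.text); // Hello
}
```

Because the HTML parser moves the title element into the head, reading from body leaves it out, whereas all[0].textContent includes it.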

Example:

<title>This is the title</title>
<div>
    <span>Geeks for geeks</span>
    <p>Content to be parsed </p>
</div>

Output:

This is the title 
Geeks for geeks
Content to be parsed

Code:

<!DOCTYPE html>
<html lang="en" dir="ltr">
  
<head>
    <title>
        Dom Parser Inner Content
    </title>
</head>
  
<body>
    <h2>
        DomParser to get 
        all HTML content
    </h2>
  
    <p>
        Click on the button below 
        to parse the HTML document
    </p>
  
    <!-- Paragraph element to 
         show the output -->
    <p id="output"> </p>
  
    <!-- Button to call the 
         parsing function -->
    <button onclick="printOutput()">
        Parse now
    </button>
  
    <script>
  
        // Input HTML string to be parsed
        var htmlInput = `
    <title> This is the title </title>
    <div>
      <span>Geeks for geeks</span>
      <p> Content to be parsed </p>
    </div>
  `;
  
        // Created instance
        const parser = new DOMParser();
  
        // Parsing
        const parsedDocument =
                    parser.parseFromString(
                    htmlInput, "text/html");
  
        // Getting text
        function printOutput() {
  
            const allText = parsedDocument
                     .all[0].textContent;
  
            // Printing on page and console;
            // textContent is used so the extracted
            // text is not re-parsed as HTML
            document.getElementById("output")
                        .textContent = allText;
  
            console.log(allText);
        }
    </script>
</body>
  
</html>

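Output:
Before the button is pressed, the output paragraph is empty. After the button is pressed, it shows the text extracted from the parsed string: the title followed by the contents of the div.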

The text content of individual elements can also be retrieved by first selecting them on the parsed document with getElementsByClassName('className') or getElementById('idName').
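
As a sketch of that idea, the helpers below look up elements in the parsed document rather than in the live page. The function names, the optional parser parameter, and the id and class values in the usage example are assumptions for illustration. getElementById returns null when nothing matches, so that case is handled explicitly.

```javascript
// Hypothetical helper: text of the element with the given id, or null if absent.
function getTextById(htmlInput, id, parser = new DOMParser()) {
    const parsedDocument = parser.parseFromString(htmlInput, "text/html");
    const element = parsedDocument.getElementById(id);
    return element === null ? null : element.textContent;
}

// Hypothetical helper: text of every element carrying the given class name.
function getTextsByClassName(htmlInput, className, parser = new DOMParser()) {
    const parsedDocument = parser.parseFromString(htmlInput, "text/html");
    // getElementsByClassName returns an HTMLCollection, so copy it into an array
    return Array.from(
        parsedDocument.getElementsByClassName(className),
        (element) => element.textContent
    );
}

// Usage in a browser (guarded so the snippet does nothing elsewhere)
if (typeof DOMParser !== "undefined") {
    const sample =
        '<p id="intro">Hi</p><span class="tag">a</span><span class="tag">b</span>';
    console.log(getTextById(sample, "intro"));
    console.log(getTextsByClassName(sample, "tag"));
}
```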

A JavaScript function that takes the document to be parsed as a string and prints the result to the console:

function parse(htmlInput) {
  
    // Creating parser instance
    const parser = new DOMParser();
  
    // Parsing the document using DOMParser
    // and storing the returned Document object
    // in a variable
    const parsedDocument = parser
        .parseFromString(htmlInput, "text/html");
  
    // Retrieve all text content from the parsed document
    const allText = parsedDocument.all[0].textContent;
  
    // Printing the output to the console
    console.log(allText);
}
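
A variant that returns the text instead of printing it may be easier to reuse. The sketch below reads the root element through documentElement, the standard property for it, rather than the legacy document.all collection. The name extractText and the optional parser parameter are assumptions made for illustration, and a browser environment with a global DOMParser is assumed.

```javascript
// Hypothetical helper: return all text of the parsed HTML string.
function extractText(htmlInput, parser = new DOMParser()) {
    const parsedDocument = parser.parseFromString(htmlInput, "text/html");

    // documentElement is the root <html> element of the parsed document;
    // trim() drops whitespace at the start and end of the collected text
    return parsedDocument.documentElement.textContent.trim();
}

// Usage in a browser (guarded so the snippet does nothing elsewhere)
if (typeof DOMParser !== "undefined") {
    console.log(extractText("<title>Title</title><div><p>Body text</p></div>"));
}
```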
