How do I get script content using Cheerio?

1 answer

1. Install Cheerio:

First, make sure Cheerio is installed in your Node.js project. If it is not, install it with npm:

bash
npm install cheerio

2. Load HTML Content:

You can read a local HTML file with Node.js's fs module, or fetch a web page with an HTTP client library such as axios (installed separately with npm install axios). The example below uses axios to retrieve HTML over HTTP:

javascript
const axios = require('axios');
const cheerio = require('cheerio');

// Fetch the raw HTML of a page as a string.
async function fetchHTML(url) {
  const { data } = await axios.get(url);
  return data;
}
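
Step 2 also mentions reading a local file with fs. A minimal sketch of that alternative follows, using only Node.js built-ins. The helper name readHTML and the temporary file cheerio-demo.html are assumptions made for illustration; they are not part of the original answer.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: read an HTML file from disk as a UTF-8 string.
function readHTML(filePath) {
  return fs.readFileSync(filePath, 'utf8');
}

// Demo only: write a small page to a temporary file, then read it back.
const demoPath = path.join(os.tmpdir(), 'cheerio-demo.html');
fs.writeFileSync(
  demoPath,
  '<html><body><script>var a = 1;</script></body></html>'
);

const localHtml = readHTML(demoPath);
console.log(localHtml);
```

The returned string can be passed to cheerio.load in the same way as the HTML fetched with axios.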

3. Use Cheerio to Extract <script> Tag Content:

After obtaining the HTML, load it with Cheerio and extract all <script> tags:

javascript
async function extractScripts(url) {
  const html = await fetchHTML(url);
  const $ = cheerio.load(html);

  // Print the inline content of every <script> tag.
  $('script').each((i, elem) => {
    console.log($(elem).html());
  });
}

In this function, $('script') selects all <script> tags, the each method iterates through them, and $(elem).html() retrieves the JavaScript code within each tag.

4. Call the Function:

Finally, invoke the extractScripts function with a URL:

javascript
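// Handle network or parsing errors instead of leaving the promise rejection unhandled.
extractScripts('https://example.com')
  .catch((err) => console.error('Failed to extract scripts:', err.message));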

Example Explanation:

Suppose we extract scripts from a simple HTML page with the following content:

html
<!DOCTYPE html>
<html>
<head>
  <title>Example Page</title>
</head>
<body>
  <h1>Welcome to Example Page</h1>
  <script>
    console.log('This is inline JavaScript code');
  </script>
  <script src="example.js"></script>
</body>
</html>

Here, the extractScripts function prints the inline code console.log('This is inline JavaScript code'); (including the whitespace around it inside the tag), followed by an empty string, because the second <script> tag only references an external file and contains no inline code.

In this way, Cheerio lets you extract and process the contents of <script> tags from a web page, which is particularly useful in web scraping.

August 10, 2024, 01:08