How does Puppeteer implement network request interception? What are the practical use cases?

February 19, 19:38

Puppeteer's request interception lets developers inspect, modify, block, and mock network requests, and its request and response events let them monitor traffic. This is useful for testing, debugging, and performance optimization.

1. Enable Request Interception

Call page.setRequestInterception(true) to enable request interception. Once it is enabled, every request is paused until a handler continues, aborts, or responds to it.

javascript
const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  // Enable request interception
  await page.setRequestInterception(true);

  // Add request listener
  page.on('request', (request) => {
    // Handle request
    request.continue();
  });

  await page.goto('https://example.com');
  await browser.close();
})();

2. Basic Request Interception Operations

Continue Request:

javascript
page.on('request', (request) => {
  request.continue(); // Continue original request
});

Abort Request:

javascript
page.on('request', (request) => {
  if (request.resourceType() === 'image') {
    request.abort(); // Block image loading
  } else {
    request.continue();
  }
});

Modify Request:

javascript
page.on('request', (request) => {
  // Modify request headers
  request.continue({
    headers: {
      ...request.headers(),
      'Authorization': 'Bearer token123'
    }
  });
});

Respond to Request:

javascript
page.on('request', (request) => {
  if (request.url().includes('/api/data')) {
    // Return mock response
    request.respond({
      status: 200,
      contentType: 'application/json',
      body: JSON.stringify({ data: 'mock data' })
    });
  } else {
    request.continue();
  }
});

3. Intercept Specific Resource Types

javascript
page.on('request', (request) => {
  const resourceType = request.resourceType();

  // Block images, fonts, stylesheets, etc.
  if (['image', 'font', 'stylesheet'].includes(resourceType)) {
    request.abort();
  } else {
    request.continue();
  }
});

Resource Types Include:

  • document - Document
  • stylesheet - CSS stylesheet
  • image - Image
  • media - Media file
  • font - Font
  • script - Script
  • texttrack - Text track
  • xhr - XMLHttpRequest
  • fetch - Fetch API
  • eventsource - Event source
  • websocket - WebSocket
  • manifest - Web App manifest
  • other - Other types
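The type check above can be pulled out into a small predicate so the blocked set is defined in one place. This is a minimal sketch: `BLOCKED_TYPES`, `shouldBlock`, and `handleByType` are hypothetical names chosen for illustration, and only `resourceType()`, `abort()`, and `continue()` are Puppeteer request methods.

```javascript
// Hypothetical helper names; the blocked set mirrors the example above.
const BLOCKED_TYPES = new Set(['image', 'font', 'stylesheet']);

// Pure predicate: easy to unit test without launching a browser.
function shouldBlock(resourceType, blocked = BLOCKED_TYPES) {
  return blocked.has(resourceType);
}

// Request handler: resolves every request exactly once.
function handleByType(request) {
  if (shouldBlock(request.resourceType())) {
    return request.abort();
  }
  return request.continue();
}

// Usage (assumes interception is already enabled on `page`):
// page.on('request', handleByType);
```

Keeping the decision in a pure function means the blocking rules can be tested with plain strings, separately from the browser.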

4. Intercept Specific URLs

javascript
page.on('request', (request) => {
  const url = request.url();

  // Block specific domain
  if (url.includes('analytics.com')) {
    request.abort();
  }
  // Mock API response (checked before the general '/api/' rule,
  // otherwise this branch would never be reached)
  else if (url.includes('/api/mock')) {
    request.respond({
      status: 200,
      body: JSON.stringify({ success: true })
    });
  }
  // Modify specific API request
  else if (url.includes('/api/')) {
    request.continue({
      headers: {
        ...request.headers(),
        'X-Custom-Header': 'value'
      }
    });
  } else {
    request.continue();
  }
});
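Because the branches are evaluated in order, a more specific pattern such as '/api/mock' has to be tested before a general one such as '/api/'. One way to make that ordering explicit is a rule table with first-match-wins semantics, matched against the parsed URL rather than the raw string. The sketch below is an illustration only: `rules`, `classify`, and `handleByUrl` are hypothetical names, and the hosts and paths are the ones from the example above.

```javascript
// Rules are checked top to bottom; the first match wins, so the most
// specific rule ('/api/mock') must come before the general one ('/api/').
const rules = [
  { action: 'abort', test: (u) => u.hostname === 'analytics.com' },
  { action: 'mock', test: (u) => u.pathname.startsWith('/api/mock') },
  { action: 'modify', test: (u) => u.pathname.startsWith('/api/') },
];

// Map a raw URL string to one of: 'abort', 'mock', 'modify', 'continue'.
function classify(rawUrl) {
  let url;
  try {
    url = new URL(rawUrl);
  } catch {
    return 'continue'; // not a parseable URL, leave it alone
  }
  const rule = rules.find((r) => r.test(url));
  return rule ? rule.action : 'continue';
}

// Request handler built on top of classify().
function handleByUrl(request) {
  switch (classify(request.url())) {
    case 'abort':
      return request.abort();
    case 'mock':
      return request.respond({
        status: 200,
        contentType: 'application/json',
        body: JSON.stringify({ success: true }),
      });
    case 'modify':
      return request.continue({
        headers: { ...request.headers(), 'X-Custom-Header': 'value' },
      });
    default:
      return request.continue();
  }
}

// Usage (assumes interception is already enabled on `page`):
// page.on('request', handleByUrl);
```

Matching on `pathname` instead of the whole URL also avoids false positives from query strings that happen to contain '/api/'.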

5. Response Interception

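Listen to the response event to inspect server responses. This event is read-only: it exposes the status, headers, and body, but it cannot change what the page receives. It also works without request interception enabled.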

javascript
page.on('response', async (response) => {
  const url = response.url();
  const status = response.status();

  console.log(`Response from ${url}: ${status}`);

  // Get response content
  if (url.includes('/api/data')) {
    const data = await response.json();
    console.log('API Response:', data);
  }
});
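Calling response.json() rejects when the body is not valid JSON, and an unhandled rejection inside an event listener is easy to miss. A defensive variant might check the status and content type first and fall back to null. This is a sketch under those assumptions; `readJsonIfPossible` is a hypothetical helper, while `ok()`, `headers()`, and `json()` are methods of Puppeteer's response object.

```javascript
// Hypothetical helper: returns the parsed body, or null when the response
// is not a successful JSON response or the body cannot be parsed.
async function readJsonIfPossible(response) {
  // Puppeteer reports header names in lower case.
  const contentType = response.headers()['content-type'] || '';
  if (!response.ok() || !contentType.includes('application/json')) {
    return null;
  }
  try {
    return await response.json();
  } catch (err) {
    return null; // body was empty, unavailable, or not valid JSON
  }
}

// Usage:
// page.on('response', async (response) => {
//   if (response.url().includes('/api/data')) {
//     console.log('API Response:', await readJsonIfPossible(response));
//   }
// });
```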

6. Practical Use Cases

Use Case 1: Block Ads and Tracking

javascript
const blockedDomains = ['ads.com', 'analytics.com', 'tracker.com'];

page.on('request', (request) => {
  const url = request.url();

  if (blockedDomains.some(domain => url.includes(domain))) {
    request.abort();
  } else {
    request.continue();
  }
});
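Substring matching on the full URL can block more than intended: 'https://uploads.com/file.png' contains 'ads.com', and so does any URL that merely mentions a blocked domain in its query string. A stricter sketch compares the parsed hostname against each blocked domain and its subdomains. `isBlockedHost` is a hypothetical helper name; the domain list is the one from the example above.

```javascript
const blockedDomains = ['ads.com', 'analytics.com', 'tracker.com'];

// True when the URL's hostname is a blocked domain or one of its subdomains.
function isBlockedHost(rawUrl, domains = blockedDomains) {
  let hostname;
  try {
    hostname = new URL(rawUrl).hostname;
  } catch {
    return false; // unparseable URL: do not block
  }
  return domains.some(
    (domain) => hostname === domain || hostname.endsWith('.' + domain)
  );
}

// Usage (assumes interception is already enabled on `page`):
// page.on('request', (request) => {
//   if (isBlockedHost(request.url())) {
//     request.abort();
//   } else {
//     request.continue();
//   }
// });
```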

Use Case 2: API Mocking (for testing)

javascript
const mockResponses = {
  '/api/users': { users: [{ id: 1, name: 'John' }] },
  '/api/posts': { posts: [{ id: 1, title: 'Hello' }] }
};

page.on('request', (request) => {
  const url = request.url();

  for (const [path, mockData] of Object.entries(mockResponses)) {
    if (url.includes(path)) {
      request.respond({
        status: 200,
        contentType: 'application/json',
        body: JSON.stringify(mockData)
      });
      return;
    }
  }

  request.continue();
});

Use Case 3: Add Authentication Headers

javascript
const authToken = 'your-auth-token';

page.on('request', (request) => {
  const headers = {
    ...request.headers(),
    'Authorization': `Bearer ${authToken}`
  };

  request.continue({ headers });
});
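The handler above attaches the token to every request, including requests to third-party hosts such as CDNs. If that is not wanted, the header can be limited to a single origin. The following is a minimal sketch, assuming the application lives at one known origin; `headersFor` and the example origin are hypothetical.

```javascript
// Hypothetical helper: copy the request headers and add the Authorization
// header only when the request targets the given origin.
function headersFor(request, { origin, token }) {
  const headers = { ...request.headers() };
  let sameOrigin = false;
  try {
    sameOrigin = new URL(request.url()).origin === origin;
  } catch {
    sameOrigin = false; // unparseable URL: treat as third party
  }
  if (sameOrigin) {
    // Lower-case key, matching how Puppeteer reports header names.
    headers['authorization'] = `Bearer ${token}`;
  }
  return headers;
}

// Usage (assumes interception is already enabled on `page`):
// const auth = { origin: 'https://app.example.com', token: 'your-auth-token' };
// page.on('request', (request) => {
//   request.continue({ headers: headersFor(request, auth) });
// });
```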

Use Case 4: Performance Optimization (block unnecessary resources)

javascript
page.on('request', (request) => {
  const resourceType = request.resourceType();

  // Block images, fonts, and media to speed up page loading
  if (['image', 'font', 'media'].includes(resourceType)) {
    request.abort();
  } else {
    request.continue();
  }
});

Use Case 5: Network Request Monitoring

javascript
const requests = [];
const responses = [];

page.on('request', (request) => {
  requests.push({
    url: request.url(),
    method: request.method(),
    resourceType: request.resourceType(),
    timestamp: Date.now()
  });
});

page.on('response', (response) => {
  responses.push({
    url: response.url(),
    status: response.status(),
    headers: response.headers(),
    timestamp: Date.now()
  });
});

// Use collected data
await page.goto('https://example.com');
console.log('Total requests:', requests.length);
console.log('Total responses:', responses.length);
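Once the arrays are filled, a short summary is usually more useful than raw counts. The sketch below groups requests by resource type and lists responses with a status of 400 or higher. `summarize` is a hypothetical helper that works on the record shapes collected above.

```javascript
// Hypothetical helper: summarize the records collected by the listeners above.
function summarize(requests, responses) {
  const byType = {};
  for (const r of requests) {
    byType[r.resourceType] = (byType[r.resourceType] || 0) + 1;
  }
  const errors = responses
    .filter((r) => r.status >= 400)
    .map((r) => ({ url: r.url, status: r.status }));
  return {
    totalRequests: requests.length,
    totalResponses: responses.length,
    byType,
    errors,
  };
}

// Usage, after page.goto() has finished:
// console.log(summarize(requests, responses));
```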

7. Error Handling

javascript
page.on('requestfailed', (request) => {
  console.log('Request failed:', request.url());
  console.log('Failure:', request.failure());
});
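request.failure() returns an object with an errorText field when the request failed, and null otherwise, so logging it directly prints a nested object. A small sketch that flattens the failure into one record per request, which can then be collected and asserted on in tests; `describeFailure` is a hypothetical helper name.

```javascript
// Hypothetical helper: turn a failed request into a flat, loggable record.
function describeFailure(request) {
  const failure = request.failure(); // { errorText } or null
  return {
    url: request.url(),
    method: request.method(),
    errorText: failure ? failure.errorText : 'unknown',
  };
}

// Usage:
// const failures = [];
// page.on('requestfailed', (request) => failures.push(describeFailure(request)));
```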

Best Practices:

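  1. Clean up interceptors promptly: Remove the request listener and call page.setRequestInterception(false) when interception is no longer needed
  2. Filter narrowly: Apply abort, modify, or mock logic only to the requests that match, and continue everything else unchanged
  3. Handle exceptions: Wrap handler logic in try/catch so that an error does not leave a request paused
  4. Performance considerations: Every intercepted request waits for the handler, so keep handler logic short and fast
  5. Test and verify: Confirm that the interception rules match the intended requests and nothing else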
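The cleanup practice in item 1 can be wrapped in a helper that enables interception, runs a task, and always restores the page afterwards, even when the task throws. This is a sketch, not a prescribed pattern: `withInterception` is a hypothetical name, while `setRequestInterception`, `on`, and `off` are Puppeteer page methods.

```javascript
// Hypothetical helper: run `task` with `handler` intercepting requests,
// then remove the handler and disable interception again.
async function withInterception(page, handler, task) {
  await page.setRequestInterception(true);
  page.on('request', handler);
  try {
    return await task();
  } finally {
    page.off('request', handler);
    await page.setRequestInterception(false);
  }
}

// Usage:
// await withInterception(
//   page,
//   (request) => request.continue(),
//   () => page.goto('https://example.com')
// );
```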

Important Notes:

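  • Request interception must be enabled before page.goto() is called, otherwise the requests of that navigation are not intercepted
  • Each request must be resolved exactly once with continue(), abort(), or respond(); an unresolved request stays paused, and resolving the same request twice raises an error
  • Enabling request interception disables the page cache, so use it only where it is needed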
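  • Not every request can be handled in every way: for example, the Puppeteer documentation notes that mocking responses for data: URLs is not supported. Navigation requests do reach the handler and can be identified with request.isNavigationRequest()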
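The rule that every request must be resolved can be enforced with a wrapper around the handler. The sketch below continues any request the inner handler left untouched and aborts it if the handler throws. `guarded` is a hypothetical name; `isInterceptResolutionHandled()`, `continue()`, and `abort()` are Puppeteer request methods.

```javascript
// Hypothetical wrapper: guarantees that a request never stays paused.
function guarded(decide) {
  return async (request) => {
    // Another listener may already have resolved this request.
    if (request.isInterceptResolutionHandled()) return;
    try {
      await decide(request);
      // Fallback: continue anything the inner handler left untouched.
      if (!request.isInterceptResolutionHandled()) {
        await request.continue();
      }
    } catch (err) {
      // Handler failed: abort rather than leave the request paused.
      if (!request.isInterceptResolutionHandled()) {
        await request.abort();
      }
    }
  };
}

// Usage (assumes interception is already enabled on `page`):
// page.on('request', guarded((request) => {
//   if (request.resourceType() === 'image') return request.abort();
// }));
```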
Tags: Puppeteer