乐闻世界logo
搜索文章和话题

前端面试题手册

Rspack 如何管理环境变量?

Rspack 的环境变量管理是前端开发中的重要功能,能够帮助开发者在不同环境下使用不同的配置。以下是 Rspack 环境变量管理的详细说明:环境变量基本概念环境变量是在构建时注入到代码中的变量,用于区分不同环境(开发、测试、生产)的配置。环境变量定义方式1. DefinePlugin使用 DefinePlugin 定义环境变量:const { DefinePlugin } = require('@rspack/core');module.exports = { plugins: [ new DefinePlugin({ 'process.env.NODE_ENV': JSON.stringify('production'), 'process.env.API_URL': JSON.stringify('https://api.example.com'), 'process.env.VERSION': JSON.stringify('1.0.0') }) ]}2. .env 文件使用 .env 文件管理环境变量:# .env.developmentNODE_ENV=developmentAPI_URL=http://localhost:3000DEBUG=true# .env.productionNODE_ENV=productionAPI_URL=https://api.example.comDEBUG=false3. 命令行参数通过命令行传递环境变量:# Unix/LinuxNODE_ENV=production API_URL=https://api.example.com npx rspack build# Windowsset NODE_ENV=production&& set API_URL=https://api.example.com&& npx rspack build# 使用 cross-env(跨平台)npx cross-env NODE_ENV=production API_URL=https://api.example.com npx rspack build环境变量加载1. dotenv-webpack-plugin使用 dotenv-webpack-plugin 加载 .env 文件:const Dotenv = require('dotenv-webpack');module.exports = { plugins: [ new Dotenv({ path: './.env.production', safe: true, systemvars: true, silent: true }) ]}2. 自定义环境变量加载const fs = require('fs');const path = require('path');function loadEnv(mode) { const envPath = path.resolve(__dirname, `.env.${mode}`); const envContent = fs.readFileSync(envPath, 'utf8'); const envVars = {}; envContent.split('\n').forEach(line => { const [key, value] = line.split('='); if (key && value) { envVars[key.trim()] = value.trim(); } }); return envVars;}const envVars = loadEnv(process.env.NODE_ENV || 'development');module.exports = { plugins: [ new DefinePlugin({ 'process.env': JSON.stringify(envVars) }) ]}环境变量使用1. 在代码中使用// 使用环境变量const apiUrl = process.env.API_URL;const isDevelopment = process.env.NODE_ENV === 'development';const version = process.env.VERSION;console.log('API URL:', apiUrl);console.log('Is Development:', isDevelopment);2. TypeScript 类型定义// env.d.tsdeclare namespace NodeJS { interface ProcessEnv { NODE_ENV: 'development' | 'production' | 'test'; API_URL: string; VERSION: string; DEBUG: string; }}环境配置1. 多环境配置const { merge } = require('webpack-merge');const commonConfig = require('./rspack.common.js');const devConfig = require('./rspack.dev.js');const prodConfig = require('./rspack.prod.js');module.exports = (env) => { const mode = env.mode || 'development'; if (mode === 'production') { return merge(commonConfig, prodConfig); } return merge(commonConfig, devConfig);};2. 环境特定配置// rspack.dev.jsmodule.exports = { mode: 'development', devtool: 'eval-cheap-module-source-map', devServer: { hot: true, port: 3000 }}// rspack.prod.jsmodule.exports = { mode: 'production', devtool: 'source-map', optimization: { minimize: true }}环境变量最佳实践1. 敏感信息处理// 不要在代码中硬编码敏感信息// ❌ 错误const API_KEY = 'your-secret-key';// ✅ 正确const API_KEY = process.env.API_KEY;// 使用 .env.local 存储敏感信息(不提交到版本控制)// .env.localAPI_KEY=your-secret-key2. 环境变量验证// 验证必需的环境变量const requiredEnvVars = ['API_URL', 'API_KEY'];requiredEnvVars.forEach(envVar => { if (!process.env[envVar]) { throw new Error(`Missing required environment variable: ${envVar}`); }});3. 默认值设置// 设置默认值const apiUrl = process.env.API_URL || 'http://localhost:3000';const timeout = parseInt(process.env.TIMEOUT || '5000', 10);const debug = process.env.DEBUG === 'true';环境变量与构建1. 条件构建module.exports = (env) => { const isProduction = env.mode === 'production'; return { mode: isProduction ? 'production' : 'development', plugins: [ new DefinePlugin({ 'process.env.NODE_ENV': JSON.stringify(env.mode), 'process.env.IS_PRODUCTION': JSON.stringify(isProduction) }) ] };};2. 环境特定插件const isProduction = process.env.NODE_ENV === 'production';const plugins = [ new DefinePlugin({ 'process.env.NODE_ENV': JSON.stringify(process.env.NODE_ENV) })];if (isProduction) { plugins.push( new TerserPlugin(), new CompressionPlugin() );}module.exports = { plugins};.env 文件管理1. 文件命名约定.env # 所有环境的默认值.env.local # 本地覆盖(不提交).env.development # 开发环境.env.test # 测试环境.env.production # 生产环境2. .gitignore 配置# 忽略所有 .env.local 文件.env.local.env.*.local# 可以提交其他环境配置.env.development.env.test.env.production3. 环境变量优先级命令行参数.env.local.env.[mode].local.env.[mode].env环境变量与 CI/CD1. GitHub Actionsname: Buildon: [push]jobs: build: runs-on: ubuntu-latest steps: - uses: actions/checkout@v3 - name: Set up Node.js uses: actions/setup-node@v3 with: node-version: '18' - name: Install dependencies run: npm install - name: Build env: NODE_ENV: production API_URL: ${{ secrets.API_URL }} run: npm run build2. Docker# DockerfileFROM node:18-alpineWORKDIR /appCOPY package*.json ./RUN npm installCOPY . .ARG NODE_ENV=productionARG API_URLENV NODE_ENV=$NODE_ENVENV API_URL=$API_URLRUN npm run build最佳实践安全性:不要在代码中硬编码敏感信息使用 .env.local 存储本地配置在 CI/CD 中使用 secrets可维护性:为不同环境创建独立的配置文件使用清晰的变量命名提供默认值类型安全:为 TypeScript 项目提供类型定义验证环境变量的有效性处理缺失的环境变量文档化:记录所有环境变量说明每个变量的用途提供示例配置Rspack 的环境变量管理为开发者提供了灵活的配置方式,通过合理使用环境变量,可以轻松管理不同环境的配置,提升开发效率和代码质量。
阅读 0·2月21日 15:34

Puppeteer 如何与测试框架集成?有哪些 E2E 测试和 CI/CD 集成的最佳实践?

Puppeteer 可以与各种测试框架集成,实现端到端测试、单元测试和集成测试。以下是常见的集成方式和最佳实践。1. 与 Jest 集成安装依赖:npm install --save-dev puppeteer jest jest-puppeteer @types/puppeteer配置 Jest:// jest.config.jsmodule.exports = { preset: 'jest-puppeteer', testMatch: ['**/*.test.js'], setupFilesAfterEnv: ['./jest.setup.js']};设置文件:// jest.setup.jsbeforeEach(async () => { await page.goto('http://localhost:3000');});编写测试:describe('Puppeteer with Jest', () => { test('page title', async () => { await expect(page.title()).resolves.toMatch('My App'); }); test('user can login', async () => { await page.type('#username', 'testuser'); await page.type('#password', 'password'); await page.click('#login-button'); await expect(page).toMatch('Welcome'); });});2. 与 Mocha 集成安装依赖:npm install --save-dev puppeteer mocha chai配置 Mocha:// mocha.config.jsmodule.exports = { timeout: 10000, require: ['mocha-setup.js']};设置文件:// mocha-setup.jsconst puppeteer = require('puppeteer');const { expect } = require('chai');let browser;let page;before(async () => { browser = await puppeteer.launch(); page = await browser.newPage();});after(async () => { await browser.close();});beforeEach(async () => { await page.goto('http://localhost:3000');});global.page = page;global.expect = expect;编写测试:describe('Puppeteer with Mocha', () => { it('should display correct title', async () => { const title = await page.title(); expect(title).to.equal('My App'); }); it('should allow user to login', async () => { await page.type('#username', 'testuser'); await page.type('#password', 'password'); await page.click('#login-button'); const welcomeText = await page.$eval('.welcome', el => el.textContent); expect(welcomeText).to.include('Welcome'); });});3. 与 Playwright 集成安装依赖:npm install --save-dev @playwright/test配置 Playwright:// playwright.config.jsmodule.exports = { testDir: './tests', use: { headless: true, screenshot: 'only-on-failure' }};编写测试:const { test, expect } = require('@playwright/test');test.describe('Puppeteer migration', () => { test('basic navigation', async ({ page }) => { await page.goto('http://localhost:3000'); await expect(page).toHaveTitle('My App'); }); test('form submission', async ({ page }) => { await page.goto('http://localhost:3000'); await page.fill('#username', 'testuser'); await page.fill('#password', 'password'); await page.click('#login-button'); await expect(page.locator('.welcome')).toBeVisible(); });});4. 与 Cypress 集成安装 Cypress:npm install --save-dev cypress配置 Cypress:// cypress.config.jsconst { defineConfig } = require('cypress');module.exports = defineConfig({ e2e: { baseUrl: 'http://localhost:3000', setupNodeEvents(on, config) { on('task', { puppeteer({ url, action }) { return require('./puppeteer-task')(url, action); } }); } }});Puppeteer 任务:// puppeteer-task.jsconst puppeteer = require('puppeteer');module.exports = async (url, action) => { const browser = await puppeteer.launch(); const page = await browser.newPage(); await page.goto(url); let result; if (action === 'screenshot') { result = await page.screenshot({ encoding: 'base64' }); } else if (action === 'pdf') { result = await page.pdf({ encoding: 'base64' }); } await browser.close(); return result;};Cypress 测试:// cypress/e2e/puppeteer.spec.jsdescribe('Puppeteer integration', () => { it('should take screenshot with Puppeteer', () => { cy.task('puppeteer', { url: 'http://localhost:3000', action: 'screenshot' }).then(screenshot => { cy.log('Screenshot taken'); }); });});5. 端到端测试最佳实践测试结构:describe('User Flow', () => { before(async () => { // 设置测试环境 await setupTestDatabase(); }); after(async () => { // 清理测试环境 await cleanupTestDatabase(); }); beforeEach(async () => { // 每个测试前的准备 await page.goto('http://localhost:3000'); }); test('complete user registration flow', async () => { // 测试步骤 await page.click('#register-button'); await page.type('#username', 'newuser'); await page.type('#email', 'newuser@example.com'); await page.type('#password', 'password123'); await page.click('#submit-button'); // 验证结果 await expect(page).toMatch('Registration successful'); });});测试数据管理:// test-data.jsmodule.exports = { validUser: { username: 'testuser', email: 'test@example.com', password: 'password123' }, invalidUser: { username: '', email: 'invalid-email', password: '123' }};// 使用测试数据const { validUser } = require('./test-data');test('register with valid data', async () => { await page.type('#username', validUser.username); await page.type('#email', validUser.email); await page.type('#password', validUser.password); await page.click('#submit-button'); await expect(page).toMatch('Success');});6. 视觉回归测试使用 Percy:npm install --save-dev @percy/puppeteerconst percy = require('@percy/puppeteer');describe('Visual regression tests', () => { beforeAll(async () => { await percy.start(); }); afterAll(async () => { await percy.stop(); }); test('homepage visual', async () => { await page.goto('http://localhost:3000'); await percy.snapshot(page, 'Homepage'); });});使用 BackstopJS:// backstop.config.jsmodule.exports = { scenarios: [ { label: 'Homepage', url: 'http://localhost:3000', selectors: ['#header', '#main', '#footer'] } ], paths: { bitmaps_reference: 'backstop_data/bitmaps_reference', bitmaps_test: 'backstop_data/bitmaps_test', html_report: 'backstop_data/html_report' }};7. 性能测试使用 Lighthouse:npm install --save-dev lighthouse puppeteerconst lighthouse = require('lighthouse');const puppeteer = require('puppeteer');describe('Performance tests', () => { test('page performance score', async () => { const browser = await puppeteer.launch(); const page = await browser.newPage(); const runnerResult = await lighthouse('http://localhost:3000', { port: new URL(browser.wsEndpoint()).port, output: 'json' }); const score = runnerResult.lhr.categories.performance.score * 100; expect(score).toBeGreaterThan(80); await browser.close(); });});8. 测试报告使用 Allure:npm install --save-dev allure-commandlineconst allure = require('allure-js-commons');describe('Allure reporting', () => { test('with Allure steps', async () => { const epic = new allure.Epic('User Management'); const feature = new allure.Feature('Registration'); const story = new allure.Story('User can register'); epic.addFeature(feature); feature.addStory(story); await page.click('#register-button'); await page.type('#username', 'testuser'); await page.click('#submit-button'); story.addStep('Click register button'); story.addStep('Enter username'); story.addStep('Submit form'); });});9. CI/CD 集成GitHub Actions 配置:name: Puppeteer Testson: [push, pull_request]jobs: test: runs-on: ubuntu-latest steps: - uses: actions/checkout@v2 - name: Setup Node.js uses: actions/setup-node@v2 with: node-version: '16' - name: Install dependencies run: npm ci - name: Install Chrome run: | sudo apt-get update sudo apt-get install -y chromium-browser - name: Run tests run: npm test env: CI: trueDocker 配置:FROM node:16-alpine# 安装 ChromeRUN apk add --no-cache chromium# 安装依赖COPY package*.json ./RUN npm ci# 复制测试文件COPY . .# 运行测试CMD ["npm", "test"]10. 最佳实践总结1. 测试隔离:// 每个测试使用独立的上下文beforeEach(async () => { const context = await browser.createIncognitoBrowserContext(); page = await context.newPage();});afterEach(async () => { await context.close();});2. 等待策略:// 使用明确的等待await page.waitForSelector('.element', { visible: true });// 避免硬编码延迟// await page.waitForTimeout(5000); // 不推荐3. 错误处理:test('with error handling', async () => { try { await page.click('#button'); } catch (error) { // 保存失败截图 await page.screenshot({ path: 'failure.png' }); throw error; }});4. 测试数据清理:afterEach(async () => { // 清理测试数据 await cleanupTestData();});5. 并行测试:// Jest 配置module.exports = { maxWorkers: 4, // 并行运行测试 preset: 'jest-puppeteer'};
阅读 0·2月19日 19:55

如何在 Jest 中测试 API 调用和网络请求?如何 Mock fetch 和 Axios?

在 Jest 中测试 API 调用和网络请求需要使用 Mock 来隔离外部依赖:1. Mock fetch API:global.fetch = jest.fn(() => Promise.resolve({ json: () => Promise.resolve({ data: 'test' }) }));test('fetches data from API', async () => { const data = await fetchData(); expect(data).toEqual({ data: 'test' }); expect(fetch).toHaveBeenCalledTimes(1);});2. Mock Axios:import axios from 'axios';jest.mock('axios');test('fetches user data', async () => { const mockData = { name: 'John', age: 30 }; axios.get.mockResolvedValue({ data: mockData }); const result = await getUser(1); expect(result).toEqual(mockData); expect(axios.get).toHaveBeenCalledWith('/users/1');});3. Mock 模块导出:// api.jsexport const fetchData = () => fetch('/api/data');// api.test.jsimport { fetchData } from './api';import fetch from 'node-fetch';jest.mock('node-fetch');test('fetches data', async () => { fetch.mockResolvedValue({ json: () => ({ data: 'test' }) }); const result = await fetchData(); expect(result).toEqual({ data: 'test' });});4. 测试错误处理:test('handles API errors', async () => { axios.get.mockRejectedValue(new Error('Network Error')); await expect(getUser(1)).rejects.toThrow('Network Error');});5. 测试不同的响应状态:test('handles 404 error', async () => { axios.get.mockRejectedValue({ response: { status: 404, data: { message: 'Not found' } } }); await expect(getUser(1)).rejects.toMatchObject({ response: { status: 404 } });});6. 使用 MSW(Mock Service Worker):import { rest } from 'msw';import { setupServer } from 'msw/node';const server = setupServer( rest.get('/api/users', (req, res, ctx) => { return res(ctx.json({ name: 'John' })); }));beforeAll(() => server.listen());afterEach(() => server.resetHandlers());afterAll(() => server.close());test('fetches user with MSW', async () => { const user = await fetchUser(); expect(user).toEqual({ name: 'John' });});最佳实践:使用 Mock 隔离外部 API 依赖测试成功和失败场景验证请求参数和响应处理清理 Mock 以避免测试污染考虑使用 MSW 进行更真实的 API 模拟
阅读 0·2月19日 19:55

如何在 Jest 中测试 React 组件?常用的测试工具和查询方法有哪些?

在 Jest 中测试 React 组件需要结合测试工具和渲染方法:常用测试工具:@testing-library/react:官方推荐的 React 测试库react-test-renderer:用于快照测试enzyme:传统的 React 组件测试工具(较少使用)基本测试示例:import { render, screen, fireEvent } from '@testing-library/react';import Button from './Button';test('renders button with text', () => { render(<Button>Click me</Button>); expect(screen.getByText('Click me')).toBeInTheDocument();});test('calls onClick when clicked', () => { const handleClick = jest.fn(); render(<Button onClick={handleClick}>Click me</Button>); fireEvent.click(screen.getByText('Click me')); expect(handleClick).toHaveBeenCalledTimes(1);});常用查询方法:getByText():通过文本查找元素getByRole():通过角色查找元素getByTestId():通过 data-testid 属性查找queryByText():查找元素,不存在时返回 nullfindByText():异步查找元素测试异步组件:test('loads and displays data', async () => { render(<UserList />); expect(screen.getByText('Loading...')).toBeInTheDocument(); await waitFor(() => { expect(screen.getByText('John')).toBeInTheDocument(); });});最佳实践:测试用户行为,而不是实现细节使用 @testing-library/react 而不是 enzyme使用 data-testid 作为最后的选择避免测试内部状态,测试可见输出保持测试简单和可读
阅读 0·2月19日 19:53

Puppeteer 如何处理动态网页和单页应用(SPA)?有哪些处理异步加载和路由变化的技巧?

Puppeteer 在处理动态网页和单页应用(SPA)时具有独特的优势,可以执行 JavaScript、等待异步加载、处理路由变化等。1. 处理动态内容加载等待元素出现:const puppeteer = require('puppeteer');async function scrapeDynamicContent() { const browser = await puppeteer.launch(); const page = await browser.newPage(); await page.goto('https://example.com'); // 等待动态加载的元素 await page.waitForSelector('.dynamic-content', { visible: true }); const content = await page.$eval('.dynamic-content', el => el.textContent); console.log(content); await browser.close();}scrapeDynamicContent();等待特定条件:await page.waitForFunction(() => { return document.querySelectorAll('.item').length > 0;});等待网络请求完成:await page.goto('https://example.com', { waitUntil: 'networkidle2'});2. 处理无限滚动基本无限滚动:async function scrapeInfiniteScroll() { const browser = await puppeteer.launch(); const page = await browser.newPage(); await page.goto('https://example.com/infinite-scroll'); const items = []; let previousHeight = 0; while (true) { // 滚动到底部 await page.evaluate(() => { window.scrollBy(0, window.innerHeight); }); // 等待新内容加载 await page.waitForTimeout(1000); // 检查是否有新内容 const currentHeight = await page.evaluate(() => document.body.scrollHeight); if (currentHeight === previousHeight) { break; // 没有新内容了 } previousHeight = currentHeight; // 收集数据 const newItems = await page.$$eval('.item', elements => { return elements.map(el => el.textContent); }); items.push(...newItems); } await browser.close(); return items;}优化的无限滚动:async function scrapeInfiniteScrollOptimized() { const browser = await puppeteer.launch(); const page = await browser.newPage(); await page.goto('https://example.com/infinite-scroll'); const items = []; let noNewItemsCount = 0; while (noNewItemsCount < 3) { // 连续 3 次没有新内容就停止 const itemCountBefore = items.length; // 滚动到底部 await page.evaluate(() => { window.scrollTo(0, document.body.scrollHeight); }); // 等待加载指示器消失 try { await page.waitForSelector('.loading', { hidden: true, timeout: 3000 }); } catch (error) { // 加载指示器可能不存在 } // 收集新数据 const newItems = await page.$$eval('.item', elements => { return elements.map(el => el.textContent); }); if (newItems.length === itemCountBefore) { noNewItemsCount++; } else { noNewItemsCount = 0; items.push(...newItems); } } await browser.close(); return items;}3. 处理 SPA 路由监听路由变化:async function handleSPARoutes() { const browser = await puppeteer.launch(); const page = await browser.newPage(); await page.goto('https://example.com'); // 监听路由变化 page.on('framenavigated', async (frame) => { console.log('Navigated to:', frame.url()); // 等待页面内容加载 await frame.waitForSelector('.content'); const title = await frame.$eval('.content', el => el.textContent); console.log('Page title:', title); }); // 点击导航链接 await page.click('#about-link'); await page.waitForTimeout(1000); await page.click('#contact-link'); await page.waitForTimeout(1000); await browser.close();}等待特定路由:async function waitForRoute(page, path) { return new Promise((resolve) => { const checkRoute = async () => { const currentPath = await page.evaluate(() => window.location.pathname); if (currentPath === path) { resolve(); } else { setTimeout(checkRoute, 100); } }; checkRoute(); });}// 使用await page.click('#about-link');await waitForRoute(page, '/about');4. 处理 AJAX 请求等待特定 API 响应:async function waitForAPIResponse(page, urlPattern) { return new Promise((resolve) => { page.on('response', (response) => { if (response.url().includes(urlPattern)) { resolve(response); } }); });}// 使用const apiResponse = await Promise.all([ waitForAPIResponse(page, '/api/data'), page.click('#load-data-button')]);const data = await apiResponse.json();console.log(data);拦截和修改 API 请求:await page.setRequestInterception(true);page.on('request', (request) => { if (request.url().includes('/api/data')) { // 修改请求 request.continue({ headers: { ...request.headers(), 'Authorization': 'Bearer token' } }); } else { request.continue(); }});5. 处理 WebSocket监听 WebSocket 消息:const client = await page.target().createCDPSession();await client.send('Network.enable');client.on('Network.webSocketFrameReceived', (params) => { console.log('WebSocket message:', params.response.payloadData);});client.on('Network.webSocketFrameSent', (params) => { console.log('WebSocket sent:', params.response.payloadData);});6. 处理客户端渲染等待客户端渲染完成:async function waitForClientRendering(page) { // 方法 1:等待特定元素 await page.waitForSelector('.rendered-content'); // 方法 2:等待渲染标志 await page.waitForFunction(() => { return window.__RENDER_COMPLETE__ === true; }); // 方法 3:等待网络空闲 await page.waitForFunction(() => { return performance.getEntriesByType('resource').length > 0; });}处理 React/Vue 应用:async function scrapeReactApp() { const browser = await puppeteer.launch(); const page = await browser.newPage(); await page.goto('https://example.com/react-app'); // 等待 React 应用挂载 await page.waitForSelector('#root'); // 等待数据加载完成 await page.waitForFunction(() => { return window.__INITIAL_STATE__?.loaded === true; }); // 与 React 应用交互 await page.click('#load-more-button'); await page.waitForSelector('.new-items'); const items = await page.$$eval('.item', elements => { return elements.map(el => el.textContent); }); await browser.close(); return items;}7. 实际应用场景场景 1:抓取社交媒体动态内容async function scrapeSocialMediaPosts(username) { const browser = await puppeteer.launch(); const page = await browser.newPage(); await page.goto(`https://social-media.com/${username}`); const posts = []; // 滚动加载更多帖子 while (posts.length < 50) { // 滚动到底部 await page.evaluate(() => { window.scrollBy(0, window.innerHeight); }); // 等待新帖子加载 await page.waitForTimeout(2000); // 收集帖子数据 const newPosts = await page.$$eval('.post', elements => { return elements.map(post => ({ id: post.dataset.id, content: post.querySelector('.content')?.textContent, likes: post.querySelector('.likes')?.textContent, timestamp: post.querySelector('.timestamp')?.textContent })); }); // 只添加新帖子 const newPostIds = new Set(posts.map(p => p.id)); const uniqueNewPosts = newPosts.filter(p => !newPostIds.has(p.id)); posts.push(...uniqueNewPosts); } await browser.close(); return posts;}场景 2:抓取电商网站商品列表async function scrapeEcommerceProducts(categoryUrl) { const browser = await puppeteer.launch(); const page = await browser.newPage(); await page.goto(categoryUrl); const products = []; while (true) { // 等待商品加载 await page.waitForSelector('.product-card'); // 收集当前页商品 const pageProducts = await page.$$eval('.product-card', cards => { return cards.map(card => ({ id: card.dataset.id, title: card.querySelector('.title')?.textContent, price: card.querySelector('.price')?.textContent, rating: card.querySelector('.rating')?.textContent })); }); products.push(...pageProducts); // 检查是否有下一页 const nextButton = await page.$('.next-page:not(.disabled)'); if (!nextButton) { break; } // 点击下一页 await nextButton.click(); await page.waitForTimeout(1000); } await browser.close(); return products;}场景 3:抓取实时数据更新async function scrapeRealTimeData(url) { const browser = await puppeteer.launch(); const page = await browser.newPage(); await page.goto(url); const dataUpdates = []; // 监听 DOM 变化 await page.evaluate(() => { const observer = new MutationObserver((mutations) => { mutations.forEach((mutation) => { if (mutation.type === 'childList') { window.__DATA_UPDATES__ = window.__DATA_UPDATES__ || []; window.__DATA_UPDATES__.push({ timestamp: Date.now(), addedNodes: mutation.addedNodes.length }); } }); }); observer.observe(document.body, { childList: true, subtree: true }); }); // 等待一段时间收集数据 await page.waitForTimeout(30000); // 获取收集的数据 const updates = await page.evaluate(() => { return window.__DATA_UPDATES__ || []; }); await browser.close(); return updates;}8. 最佳实践1. 使用适当的等待策略:// 优先使用 waitForSelectorawait page.waitForSelector('.element');// 复杂条件使用 waitForFunctionawait page.waitForFunction(() => { return document.querySelectorAll('.item').length > 10;});// 网络请求使用 waitForResponseawait page.waitForResponse(response => response.url().includes('/api/data'));2. 避免硬编码等待时间:// 不好的做法await page.waitForTimeout(5000);// 好的做法await page.waitForSelector('.loaded-content');3. 处理加载失败:try { await page.waitForSelector('.content', { timeout: 10000 });} catch (error) { console.log('Content failed to load, using fallback'); // 使用备用策略}4. 优化性能:// 禁用不必要的资源await page.setRequestInterception(true);page.on('request', (request) => { if (['image', 'font', 'media'].includes(request.resourceType())) { request.abort(); } else { request.continue(); }});5. 处理反爬虫:// 设置真实的用户代理await page.setUserAgent('Mozilla/5.0 ...');// 添加随机延迟const randomDelay = () => Math.random() * 2000 + 1000;await page.waitForTimeout(randomDelay());// 模拟人类行为await page.evaluate(() => { window.scrollBy(0, Math.random() * 500);});
阅读 0·2月19日 19:49

Puppeteer 在实际项目中有哪些应用场景?请举例说明网页爬虫、自动化测试等具体实现。

Puppeteer 在实际项目中有广泛的应用场景,从网页爬虫到自动化测试,从数据采集到性能监控。以下是一些典型的实际应用案例。1. 网页爬虫和数据采集案例 1:电商商品价格监控const puppeteer = require('puppeteer');async function monitorProductPrices(productUrls) { const browser = await puppeteer.launch({ headless: 'new', args: ['--no-sandbox', '--disable-setuid-sandbox'] }); const results = []; for (const url of productUrls) { const page = await browser.newPage(); // 设置用户代理,避免被识别为爬虫 await page.setUserAgent('Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36'); await page.goto(url, { waitUntil: 'networkidle2' }); // 等待价格元素加载 await page.waitForSelector('.price', { timeout: 5000 }); const productData = await page.evaluate(() => { return { title: document.querySelector('.product-title')?.textContent, price: document.querySelector('.price')?.textContent, availability: document.querySelector('.availability')?.textContent, rating: document.querySelector('.rating')?.textContent }; }); results.push({ url, ...productData, timestamp: new Date().toISOString() }); await page.close(); } await browser.close(); return results;}// 使用示例const products = [ 'https://example.com/product/1', 'https://example.com/product/2'];monitorProductPrices(products).then(data => { console.log(JSON.stringify(data, null, 2));});案例 2:社交媒体数据抓取async function scrapeSocialMedia(username) { const browser = await puppeteer.launch({ headless: 'new' }); const page = await browser.newPage(); // 模拟登录 await page.goto('https://social-media.com/login'); await page.type('#username', 'your_username'); await page.type('#password', 'your_password'); await page.click('#login-button'); await page.waitForNavigation(); // 访问用户页面 await page.goto(`https://social-media.com/${username}`); // 滚动加载更多内容 while (true) { await page.evaluate(() => { window.scrollBy(0, window.innerHeight); }); try { await page.waitForSelector('.new-post', { timeout: 2000 }); } catch { break; } } // 抓取帖子数据 const posts = await page.evaluate(() => { return Array.from(document.querySelectorAll('.post')).map(post => ({ content: post.querySelector('.content')?.textContent, likes: post.querySelector('.likes')?.textContent, comments: post.querySelector('.comments')?.textContent, date: post.querySelector('.date')?.textContent })); }); await browser.close(); return posts;}2. 自动化测试案例 3:E2E 测试const { expect } = require('expect-puppeteer');async function runE2ETest() { const browser = await puppeteer.launch({ headless: 'new', slowMo: 50 // 减慢操作速度,便于观察 }); const page = await browser.newPage(); try { // 测试用户注册流程 await page.goto('https://example.com/register'); // 填写注册表单 await page.type('#username', 'testuser'); await page.type('#email', 'test@example.com'); await page.type('#password', 'password123'); await page.type('#confirm-password', 'password123'); // 提交表单 await Promise.all([ page.waitForNavigation(), page.click('#register-button') ]); // 验证注册成功 await expect(page).toMatch('Welcome, testuser!'); // 测试登录流程 await page.click('#logout-button'); await page.waitForNavigation(); await page.type('#login-email', 'test@example.com'); await page.type('#login-password', 'password123'); await page.click('#login-button'); await page.waitForNavigation(); // 验证登录成功 await expect(page).toMatch('Welcome back!'); console.log('E2E test passed!'); } catch (error) { console.error('E2E test failed:', error); // 保存失败截图 await page.screenshot({ path: 'test-failure.png' }); } finally { await browser.close(); }}runE2ETest();案例 4:视觉回归测试const fs = require('fs');const pixelmatch = require('pixelmatch');const { PNG } = require('pngjs');async function visualRegressionTest(url, baselinePath) { const browser = await puppeteer.launch(); const page = await browser.newPage(); await page.goto(url, { waitUntil: 'networkidle2' }); // 截取当前页面 const screenshot = await page.screenshot(); await browser.close(); // 如果没有基线图片,保存当前截图作为基线 if (!fs.existsSync(baselinePath)) { fs.writeFileSync(baselinePath, screenshot); console.log('Baseline image created'); return true; } // 读取基线图片 const baseline = PNG.sync.read(fs.readFileSync(baselinePath)); const current = PNG.sync.read(screenshot); // 比较图片差异 const diff = new PNG({ width: baseline.width, height: baseline.height }); const numDiffPixels = pixelmatch( baseline.data, current.data, diff.data, baseline.width, baseline.height, { threshold: 0.1 } ); // 保存差异图片 fs.writeFileSync('diff.png', PNG.sync.write(diff)); const totalPixels = baseline.width * baseline.height; const diffPercentage = (numDiffPixels / totalPixels) * 100; console.log(`Difference: ${diffPercentage.toFixed(2)}%`); // 如果差异超过阈值,测试失败 if (diffPercentage > 0.5) { console.log('Visual regression detected!'); return false; } console.log('Visual regression test passed!'); return true;}visualRegressionTest('https://example.com', 'baseline.png');3. PDF 生成和文档处理案例 5:动态报表生成async function generateReport(data, outputPath) { const browser = await puppeteer.launch(); const page = await browser.newPage(); // 生成 HTML 报表 const html = ` <!DOCTYPE html> <html> <head> <style> body { font-family: Arial, sans-serif; padding: 40px; } h1 { color: #333; } table { width: 100%; border-collapse: collapse; margin-top: 20px; } th, td { border: 1px solid #ddd; padding: 12px; text-align: left; } th { background-color: #f2f2f2; } .summary { margin-top: 30px; padding: 20px; background-color: #f9f9f9; } </style> </head> <body> <h1>销售报表</h1> <p>生成时间: ${new Date().toLocaleString()}</p> <table> <thead> <tr> <th>产品</th> <th>数量</th> <th>单价</th> <th>总价</th> </tr> </thead> <tbody> ${data.map(item => ` <tr> <td>${item.product}</td> <td>${item.quantity}</td> <td>$${item.price.toFixed(2)}</td> <td>$${(item.quantity * item.price).toFixed(2)}</td> </tr> `).join('')} </tbody> </table> <div class="summary"> <h2>总计: $${data.reduce((sum, item) => sum + item.quantity * item.price, 0).toFixed(2)}</h2> </div> </body> </html> `; await page.setContent(html); // 生成 PDF await page.pdf({ path: outputPath, format: 'A4', printBackground: true, margin: { top: '20px', right: '20px', bottom: '20px', left: '20px' } }); await browser.close(); console.log(`Report generated: ${outputPath}`);}// 使用示例const salesData = [ { product: '产品 A', quantity: 10, price: 99.99 }, { product: '产品 B', quantity: 5, price: 149.99 }, { product: '产品 C', quantity: 8, price: 79.99 }];generateReport(salesData, 'sales-report.pdf');案例 6:发票批量生成async function generateInvoices(invoices) { const browser = await puppeteer.launch(); const page = await browser.newPage(); for (const invoice of invoices) { const html = ` <!DOCTYPE html> <html> <head> <style> body { font-family: Arial, sans-serif; padding: 40px; } .header { text-align: center; margin-bottom: 40px; } .invoice-info { margin-bottom: 30px; } table { width: 100%; border-collapse: collapse; } th, td { border: 1px solid #ddd; padding: 10px; text-align: left; } th { background-color: #f2f2f2; } .total { text-align: right; font-weight: bold; margin-top: 20px; } </style> </head> <body> <div class="header"> <h1>发票</h1> <p>发票号: ${invoice.number}</p> </div> <div class="invoice-info"> <p>日期: ${invoice.date}</p> <p>客户: ${invoice.customer}</p> </div> <table> <thead> <tr> <th>项目</th> <th>数量</th> <th>单价</th> <th>总价</th> </tr> </thead> <tbody> ${invoice.items.map(item => ` <tr> <td>${item.name}</td> <td>${item.quantity}</td> <td>$${item.price}</td> <td>$${item.quantity * item.price}</td> </tr> `).join('')} </tbody> </table> <div class="total"> 总计: $${invoice.total} </div> </body> </html> `; await page.setContent(html); await page.pdf({ path: `invoices/invoice-${invoice.number}.pdf`, format: 'A4', printBackground: true }); console.log(`Generated invoice: ${invoice.number}`); } await browser.close();}// 使用示例const invoices = [ { number: 'INV-001', date: '2024-01-15', customer: '客户 A', items: [ { name: '服务 A', quantity: 1, price: 500 }, { name: '服务 B', quantity: 2, price: 300 } ], total: 1100 }];generateInvoices(invoices);4. 性能监控和分析案例 7:页面性能分析async function analyzePagePerformance(url) { const browser = await puppeteer.launch(); const page = await browser.newPage(); // 启用性能监控 const client = await page.target().createCDPSession(); await client.send('Performance.enable'); await client.send('Network.enable'); // 记录开始时间 const startTime = Date.now(); await page.goto(url, { waitUntil: 'networkidle2' }); const loadTime = Date.now() - startTime; // 获取性能指标 const metrics = await client.send('Performance.getMetrics'); // 获取关键性能指标 const performanceData = { loadTime, domContentLoaded: await page.evaluate(() => performance.timing.domContentLoadedEventEnd - performance.timing.navigationStart ), firstPaint: await page.evaluate(() => performance.getEntriesByType('paint')[0]?.startTime ), firstContentfulPaint: await page.evaluate(() => performance.getEntriesByType('paint')[1]?.startTime ), resources: metrics.metrics }; // 生成性能报告 console.log('Performance Report:'); console.log(`Load Time: ${performanceData.loadTime}ms`); console.log(`DOM Content Loaded: ${performanceData.domContentLoaded}ms`); console.log(`First Paint: ${performanceData.firstPaint}ms`); console.log(`First Contentful Paint: ${performanceData.firstContentfulPaint}ms`); await browser.close(); return performanceData;}analyzePagePerformance('https://example.com');5. SEO 工具案例 8:SEO 检查工具async function seoAudit(url) { const browser = await puppeteer.launch(); const page = await browser.newPage(); await page.goto(url, { waitUntil: 'networkidle2' }); const seoData = await page.evaluate(() => { const issues = []; const warnings = []; // 检查标题 const title = document.querySelector('title'); if (!title) { issues.push('Missing title tag'); } else if (title.textContent.length > 60) { warnings.push('Title too long (> 60 characters)'); } // 检查描述 const description = document.querySelector('meta[name="description"]'); if (!description) { issues.push('Missing meta description'); } else if (description.content.length > 160) { warnings.push('Meta description too long (> 160 characters)'); } // 检查 H1 标签 const h1Tags = document.querySelectorAll('h1'); if (h1Tags.length === 0) { issues.push('Missing H1 tag'); } else if (h1Tags.length > 1) { warnings.push('Multiple H1 tags found'); } // 检查图片 alt 属性 const images = document.querySelectorAll('img'); let missingAlt = 0; images.forEach(img => { if (!img.alt) missingAlt++; }); if (missingAlt > 0) { warnings.push(`${missingAlt} images missing alt attributes`); } // 检查链接 const links = document.querySelectorAll('a[href]'); let brokenLinks = 0; links.forEach(link => { if (link.getAttribute('href').startsWith('#')) brokenLinks++; }); return { title: title?.textContent, description: description?.content, h1Count: h1Tags.length, imageCount: images.length, linkCount: links.length, issues, warnings }; }); console.log('SEO Audit Results:'); console.log(JSON.stringify(seoData, null, 2)); await browser.close(); return seoData;}seoAudit('https://example.com');6. 最佳实践总结1. 错误处理:try { // 操作代码} catch (error) { console.error('Error:', error); // 保存错误截图 await page.screenshot({ path: 'error.png' });} finally { await browser.close();}2. 资源管理:// 及时清理资源await page.close();await browser.close();3. 性能优化:// 禁用不必要的资源await page.setRequestInterception(true);page.on('request', (request) => { if (['image', 'font'].includes(request.resourceType())) { request.abort(); } else { request.continue(); }});4. 反爬虫策略:// 设置真实的用户代理await page.setUserAgent('Mozilla/5.0 ...');// 添加延迟await new Promise(resolve => setTimeout(resolve, 1000));// 使用代理const browser = await puppeteer.launch({ args: ['--proxy-server=http://proxy.example.com:8080']});
阅读 0·2月19日 19:48

Puppeteer 如何实现页面交互和表单操作?有哪些常用的 API 和最佳实践?

Puppeteer 提供了丰富的页面交互和表单操作功能,可以模拟用户的真实操作行为,这对于自动化测试和网页爬虫非常重要。1. 基本页面操作导航到页面:// 基本导航await page.goto('https://example.com');// 等待网络空闲await page.goto('https://example.com', { waitUntil: 'networkidle2' });// 设置超时时间await page.goto('https://example.com', { timeout: 30000 });// 等待特定条件await page.goto('https://example.com', { waitUntil: ['load', 'domcontentloaded'] });刷新页面:await page.reload();await page.reload({ waitUntil: 'networkidle2' });前进和后退:await page.goBack();await page.goForward();2. 元素选择Puppeteer 支持多种选择器方式。使用 $ 选择单个元素:// 通过 CSS 选择器const element = await page.$('#my-id');const element = await page.$('.my-class');const element = await page.$('div > p');// 通过 XPathconst element = await page.$x('//div[@class="my-class"]');使用 $$ 选择多个元素:// 选择所有匹配的元素const elements = await page.$$('.item');console.log(elements.length); // 元素数量// 遍历元素for (const element of elements) { const text = await element.evaluate(el => el.textContent); console.log(text);}使用 $$eval 批量获取数据:// 获取所有元素的文本const texts = await page.$$eval('.item', elements => { return elements.map(el => el.textContent);});// 获取所有元素的属性const hrefs = await page.$$eval('a', elements => { return elements.map(el => el.href);});3. 点击操作基本点击:await page.click('#button');await page.click('.submit-btn');带选项的点击:await page.click('#button', { button: 'left', // 'left', 'right', 'middle' clickCount: 1, // 点击次数 delay: 100, // 点击延迟(毫秒) offset: { // 点击位置偏移 x: 10, y: 10 }});双击:await page.click('#button', { clickCount: 2 });右键点击:await page.click('#button', { button: 'right' });等待元素可点击:await page.waitForSelector('#button', { visible: true });await page.click('#button');4. 文本输入基本输入:await page.type('#input', 'Hello World');带选项的输入:await page.type('#input', 'Hello World', { delay: 100, // 每个字符的延迟(毫秒) clear: true // 输入前清空输入框});模拟真实打字速度:await page.type('#input', 'Hello World', { delay: 50 });清空输入框:await page.click('#input');await page.keyboard.down('Control');await page.keyboard.press('A');await page.keyboard.up('Control');await page.keyboard.press('Backspace');5. 键盘操作基本按键:await page.keyboard.press('Enter');await page.keyboard.press('Tab');await page.keyboard.press('Escape');await page.keyboard.press('Backspace');组合键:// Ctrl+Cawait page.keyboard.down('Control');await page.keyboard.press('C');await page.keyboard.up('Control');// Ctrl+A (全选)await page.keyboard.down('Control');await page.keyboard.press('A');await page.keyboard.up('Control');// Ctrl+V (粘贴)await page.keyboard.down('Control');await page.keyboard.press('V');await page.keyboard.up('Control');特殊键:await page.keyboard.press('ArrowUp');await page.keyboard.press('ArrowDown');await page.keyboard.press('ArrowLeft');await page.keyboard.press('ArrowRight');await page.keyboard.press('PageUp');await page.keyboard.press('PageDown');await page.keyboard.press('Home');await page.keyboard.press('End');6. 鼠标操作移动鼠标:await page.mouse.move(100, 100);await page.mouse.move(100, 100, { steps: 10 }); // 平滑移动点击鼠标:await page.mouse.click(100, 100);await page.mouse.click(100, 100, { button: 'left', clickCount: 1});按下和释放鼠标:await page.mouse.down();await page.mouse.up();// 拖拽操作await page.mouse.down({ x: 100, y: 100 });await page.mouse.move(200, 200, { steps: 10 });await page.mouse.up();7. 表单操作填写表单:// 文本输入await page.type('#name', 'John Doe');await page.type('#email', 'john@example.com');// 选择下拉框await page.selectOption('#country', 'CN');await page.selectOption('#country', ['CN', 'US']); // 多选// 复选框await page.click('#checkbox');const isChecked = await page.$eval('#checkbox', el => el.checked);// 单选框await page.click('#radio-male');// 文件上传await page.setInputFiles('#file-upload', '/path/to/file.pdf');await page.setInputFiles('#file-upload', ['/file1.pdf', '/file2.pdf']);提交表单:// 点击提交按钮await page.click('#submit-button');// 使用表单提交await page.evaluate(() => { document.querySelector('form').submit();});8. 滚动操作滚动到页面底部:await page.evaluate(() => { window.scrollTo(0, document.body.scrollHeight);});滚动到特定元素:await page.evaluate(() => { document.querySelector('#target').scrollIntoView();});平滑滚动:await page.evaluate(() => { window.scrollTo({ top: 1000, behavior: 'smooth' });});滚动指定距离:await page.evaluate(() => { window.scrollBy(0, 500);});9. 等待元素等待元素出现:await page.waitForSelector('.result');await page.waitForSelector('.result', { visible: true });await page.waitForSelector('.result', { hidden: true });等待 XPath:await page.waitForXPath('//div[@class="result"]');等待函数:await page.waitForFunction(() => { return document.querySelectorAll('.item').length > 5;});等待导航:await Promise.all([ page.waitForNavigation(), page.click('#link')]);10. 获取元素信息获取文本内容:const text = await page.$eval('.title', el => el.textContent);获取属性:const href = await page.$eval('a', el => el.href);const id = await page.$eval('div', el => el.id);获取多个元素信息:const texts = await page.$$eval('.item', elements => { return elements.map(el => el.textContent);});检查元素是否存在:const exists = await page.$('.element') !== null;检查元素是否可见:const isVisible = await page.$eval('.element', el => { return el.offsetParent !== null;});11. 实际应用场景场景 1:登录表单填写async function login(url, username, password) { const browser = await puppeteer.launch(); const page = await browser.newPage(); await page.goto(url); // 填写登录表单 await page.type('#username', username); await page.type('#password', password); // 点击登录按钮 await Promise.all([ page.waitForNavigation(), page.click('#login-button') ]); // 验证登录成功 const isLoggedIn = await page.$('.user-profile') !== null; await browser.close(); return isLoggedIn;}login('https://example.com/login', 'user@example.com', 'password');场景 2:搜索功能测试async function testSearch(url, query) { const browser = await puppeteer.launch(); const page = await browser.newPage(); await page.goto(url); // 输入搜索关键词 await page.type('#search-input', query); // 提交搜索 await Promise.all([ page.waitForNavigation(), page.keyboard.press('Enter') ]); // 等待搜索结果 await page.waitForSelector('.search-result'); // 获取结果数量 const resultCount = await page.$$eval('.search-result', results => { return results.length; }); console.log(`Found ${resultCount} results for "${query}"`); await browser.close(); return resultCount;}testSearch('https://example.com', 'puppeteer');场景 3:分页数据抓取async function scrapePaginatedData(url) { const browser = await puppeteer.launch(); const page = await browser.newPage(); const allData = []; let hasNextPage = true; let pageNum = 1; while (hasNextPage) { await page.goto(`${url}?page=${pageNum}`); await page.waitForSelector('.item'); // 抓取当前页数据 const pageData = await page.$$eval('.item', items => { return items.map(item => ({ title: item.querySelector('.title').textContent, price: item.querySelector('.price').textContent })); }); allData.push(...pageData); console.log(`Scraped page ${pageNum}: ${pageData.length} items`); // 检查是否有下一页 hasNextPage = await page.$('.next-page:not(.disabled)') !== null; pageNum++; } await browser.close(); return allData;}scrapePaginatedData('https://example.com/products');场景 4:动态内容加载async function scrapeDynamicContent(url) { const browser = await puppeteer.launch(); const page = await browser.newPage(); await page.goto(url); // 等待初始内容加载 await page.waitForSelector('.content'); // 滚动加载更多内容 while (true) { // 滚动到底部 await page.evaluate(() => { window.scrollTo(0, document.body.scrollHeight); }); // 等待新内容加载 try { await page.waitForSelector('.new-content', { timeout: 3000 }); } catch (error) { break; // 没有新内容了 } } // 获取所有内容 const allContent = await page.$$eval('.content-item', items => { return items.map(item => item.textContent); }); await browser.close(); return allContent;}scrapeDynamicContent('https://example.com/infinite-scroll');12. 最佳实践1. 使用等待机制:// 好的做法await page.waitForSelector('.button', { visible: true });await page.click('.button');// 不好的做法await page.click('.button'); // 可能失败2. 处理动态内容:// 等待网络空闲await page.goto(url, { waitUntil: 'networkidle2' });// 等待特定元素await page.waitForSelector('.loaded-content');3. 错误处理:try { await page.click('#button');} catch (error) { console.error('Click failed:', error); // 重试逻辑}4. 性能优化:// 禁用不必要的资源await page.setRequestInterception(true);page.on('request', (request) => { if (['image', 'font', 'media'].includes(request.resourceType())) { request.abort(); } else { request.continue(); }});5. 清理资源:try { // 操作代码} finally { await browser.close();}
阅读 0·2月19日 19:48

Puppeteer 和 Selenium 有什么区别?在什么场景下应该选择 Puppeteer 而不是 Selenium?

Puppeteer 和 Selenium 都是流行的浏览器自动化工具,但它们在设计理念、实现方式和使用场景上有显著差异。1. 架构差异Puppeteer:基于 Chrome DevTools Protocol (CDP)直接与浏览器通信,无需中间层专为 Chrome/Chromium 设计使用 WebSocket 与浏览器建立连接Selenium:基于 WebDriver 协议通过 WebDriver 服务器与浏览器通信支持多种浏览器(Chrome、Firefox、Safari、Edge 等)需要安装浏览器驱动程序2. 性能对比Puppeteer:// 启动速度快const browser = await puppeteer.launch();const page = await browser.newPage();await page.goto('https://example.com'); // 快速加载Selenium:// 启动较慢const driver = await new Builder() .forBrowser('chrome') .build();await driver.get('https://example.com'); // 加载较慢性能指标对比:| 指标 | Puppeteer | Selenium ||------|-----------|----------|| 启动时间 | 快(1-2秒) | 慢(3-5秒) || 执行速度 | 快 | 中等 || 内存占用 | 较低 | 较高 || 网络请求 | 直接通信 | 通过驱动 |3. API 设计Puppeteer API:// 简洁直观的 APIawait page.click('#button');await page.type('#input', 'text');await page.waitForSelector('.result');const text = await page.$eval('.title', el => el.textContent);Selenium API:// 相对复杂的 APIawait driver.findElement(By.id('button')).click();await driver.findElement(By.id('input')).sendKeys('text');await driver.wait(until.elementLocated(By.css('.result')));const text = await driver.findElement(By.css('.title')).getText();4. 浏览器支持Puppeteer:Chrome/Chromium(主要支持)Firefox(实验性支持,通过 puppeteer-firefox)其他浏览器支持有限Selenium:ChromeFirefoxSafariEdgeOperaInternet Explorer支持几乎所有主流浏览器5. 功能特性对比Puppeteer 特有功能:// 1. 网络拦截await page.setRequestInterception(true);page.on('request', request => { if (request.resourceType() === 'image') { request.abort(); } else { request.continue(); }});// 2. 性能追踪const client = await page.target().createCDPSession();await client.send('Performance.enable');const metrics = await client.send('Performance.getMetrics');// 3. 文件下载const [download] = await Promise.all([ page.waitForEvent('download'), page.click('#download-button')]);await download.saveAs('/path/to/save');// 4. 设备模拟const devices = puppeteer.devices;const iPhone = devices['iPhone 12'];await page.emulate(iPhone);// 5. 地理位置模拟await page.setGeolocation({ latitude: 35.6895, longitude: 139.6917 });Selenium 特有功能:// 1. 多浏览器支持const driver = await new Builder() .forBrowser('firefox') .build();// 2. 分布式测试(Selenium Grid)// 可以在多台机器上并行运行测试// 3. 移动设备测试(Appium)// 支持原生移动应用测试// 4. 高级等待机制await driver.wait( until.titleIs('Expected Title'), 5000, 'Title did not match');// 5. Actions API(复杂交互)await driver.actions() .move({ origin: element }) .press() .move({ origin: targetElement }) .release() .perform();6. 使用场景Puppeteer 适用场景:网页爬虫和数据抓取生成截图和 PDF性能测试和监控CI/CD 自动化测试SPA(单页应用)测试需要网络拦截的场景Selenium 适用场景:跨浏览器兼容性测试大型企业级测试框架分布式测试环境需要支持多种浏览器的项目移动应用测试(配合 Appium)传统 Web 应用测试7. 学习曲线Puppeteer:API 简洁直观文档清晰易懂学习曲线较平缓适合初学者Selenium:API 相对复杂需要理解 WebDriver 概念学习曲线较陡峭需要更多配置8. 社区和生态系统Puppeteer:Google 官方维护活跃的 GitHub 社区丰富的插件生态持续更新和改进Selenium:开源社区维护成熟的生态系统大量第三方工具和集成广泛的企业应用9. 实际代码对比任务:登录并获取用户信息Puppeteer 实现:const puppeteer = require('puppeteer');(async () => { const browser = await puppeteer.launch(); const page = await browser.newPage(); await page.goto('https://example.com/login'); // 填写表单 await page.type('#username', 'user@example.com'); await page.type('#password', 'password123'); // 提交表单并等待导航 await Promise.all([ page.waitForNavigation(), page.click('#login-button') ]); // 获取用户信息 const userInfo = await page.evaluate(() => { return { name: document.querySelector('.user-name').textContent, email: document.querySelector('.user-email').textContent }; }); console.log(userInfo); await browser.close();})();Selenium 实现:const { Builder, By, until } = require('selenium-webdriver');(async () => { const driver = await new Builder() .forBrowser('chrome') .build(); await driver.get('https://example.com/login'); // 填写表单 await driver.findElement(By.id('username')).sendKeys('user@example.com'); await driver.findElement(By.id('password')).sendKeys('password123'); // 提交表单并等待导航 await Promise.all([ driver.wait(until.titleContains('Dashboard'), 5000), driver.findElement(By.id('login-button')).click() ]); // 获取用户信息 const userInfo = { name: await driver.findElement(By.css('.user-name')).getText(), email: await driver.findElement(By.css('.user-email')).getText() }; console.log(userInfo); await driver.quit();})();10. 选择建议选择 Puppeteer 如果:主要使用 Chrome/Chromium需要高性能和快速执行需要网络拦截或性能分析项目规模较小或中等团队熟悉 Node.js需要生成截图或 PDF选择 Selenium 如果:需要支持多种浏览器需要跨浏览器兼容性测试项目规模较大或企业级需要分布式测试环境需要测试移动应用团队已有 Selenium 经验11. 混合使用策略在某些项目中,可以结合两者的优势:// 使用 Puppeteer 进行快速开发和测试const puppeteer = require('puppeteer');// 使用 Selenium 进行跨浏览器验证const { Builder, By } = require('selenium-webdriver');async function testWithPuppeteer() { // 快速测试主要功能}async function testWithSelenium() { // 跨浏览器兼容性测试}总结:Puppeteer 和 Selenium 各有优势,选择哪个工具取决于项目需求、团队技能和测试场景。Puppeteer 更适合现代 Web 应用和快速开发,而 Selenium 更适合需要跨浏览器支持的企业级测试框架。
阅读 0·2月19日 19:47

Puppeteer 如何使用 Chrome DevTools Protocol (CDP) 进行高级调试和性能分析?

Puppeteer 提供了丰富的 Chrome DevTools Protocol (CDP) 功能,允许开发者访问浏览器底层的调试和性能分析能力。1. CDP 基础创建 CDP 会话:const client = await page.target().createCDPSession();启用 CDP 域:await client.send('Performance.enable');await client.send('Network.enable');await client.send('Runtime.enable');发送 CDP 命令:const result = await client.send('Performance.getMetrics');console.log(result);监听 CDP 事件:client.on('Network.requestWillBeSent', (params) => { console.log('Request:', params.request.url);});2. 性能监控启用性能监控:const client = await page.target().createCDPSession();await client.send('Performance.enable');获取性能指标:const metrics = await client.send('Performance.getMetrics');console.log('Performance Metrics:', metrics.metrics);关键性能指标:const metrics = await client.send('Performance.getMetrics');const metricMap = {};metrics.metrics.forEach(m => metricMap[m.name] = m.value);console.log({ Timestamp: metricMap.Timestamp, Documents: metricMap.Documents, Frames: metricMap.Frames, JSEventListeners: metricMap.JSEventListeners, Nodes: metricMap.Nodes, LayoutCount: metricMap.LayoutCount, RecalcStyleCount: metricMap.RecalcStyleCount, LayoutDuration: metricMap.LayoutDuration, RecalcStyleDuration: metricMap.RecalcStyleDuration, ScriptDuration: metricMap.ScriptDuration, TaskDuration: metricMap.TaskDuration});性能追踪:// 开始追踪await client.send('Performance.enable');await client.send('Tracing.start', { traceConfig: { includedCategories: ['devtools.timeline', 'blink.user_timing'] }});// 执行操作await page.goto('https://example.com');// 停止追踪const traceData = await client.send('Tracing.stop');3. 网络监控启用网络监控:const client = await page.target().createCDPSession();await client.send('Network.enable');监控网络请求:client.on('Network.requestWillBeSent', (params) => { console.log('Request:', { url: params.request.url, method: params.request.method, type: params.type });});监控网络响应:client.on('Network.responseReceived', (params) => { console.log('Response:', { url: params.response.url, status: params.response.status, mimeType: params.response.mimeType });});获取请求体:client.on('Network.requestWillBeSent', async (params) => { if (params.request.postData) { console.log('Request body:', params.request.postData); }});获取响应体:client.on('Network.responseReceived', async (params) => { const responseBody = await client.send('Network.getResponseBody', { requestId: params.requestId }); console.log('Response body:', responseBody.body);});4. 运行时调试启用运行时监控:const client = await page.target().createCDPSession();await client.send('Runtime.enable');执行 JavaScript:const result = await client.send('Runtime.evaluate', { expression: 'document.title'});console.log('Result:', result.result.value);获取控制台日志:client.on('Runtime.consoleAPICalled', (params) => { console.log('Console:', params.type, params.args);});监听异常:client.on('Runtime.exceptionThrown', (params) => { console.error('Exception:', params.exceptionDetails);});5. DOM 监控启用 DOM 监控:const client = await page.target().createCDPSession();await client.send('DOM.enable');获取文档根节点:const root = await client.send('DOM.getDocument');console.log('Root node:', root.root);查询节点:const result = await client.send('DOM.querySelector', { nodeId: root.root.nodeId, selector: '.my-element'});console.log('Node:', result.nodeId);获取节点属性:const attributes = await client.send('DOM.getAttributes', { nodeId: result.nodeId});console.log('Attributes:', attributes.attributes);6. Page 监控启用 Page 监控:const client = await page.target().createCDPSession();await client.send('Page.enable');监听页面加载:client.on('Page.loadEventFired', () => { console.log('Page loaded');});监听导航:client.on('Page.frameNavigated', (params) => { console.log('Navigated to:', params.frame.url);});获取页面资源树:const resourceTree = await client.send('Page.getResourceTree');console.log('Resource tree:', resourceTree);7. 实际应用场景场景 1:性能分析工具async function analyzePerformance(url) { const browser = await puppeteer.launch(); const page = await browser.newPage(); const client = await page.target().createCDPSession(); // 启用性能监控 await client.send('Performance.enable'); await client.send('Network.enable'); const startTime = Date.now(); await page.goto(url, { waitUntil: 'networkidle2' }); const loadTime = Date.now() - startTime; // 获取性能指标 const metrics = await client.send('Performance.getMetrics'); const metricMap = {}; metrics.metrics.forEach(m => metricMap[m.name] = m.value); // 收集网络数据 const networkData = []; client.on('Network.requestWillBeSent', (params) => { networkData.push({ url: params.request.url, method: params.request.method, timestamp: params.timestamp }); }); const report = { url, loadTime, metrics: { layoutDuration: metricMap.LayoutDuration, recalcStyleDuration: metricMap.RecalcStyleDuration, scriptDuration: metricMap.ScriptDuration, taskDuration: metricMap.TaskDuration }, networkRequests: networkData.length }; await browser.close(); return report;}analyzePerformance('https://example.com').then(console.log);场景 2:网络请求分析async function analyzeNetworkRequests(url) { const browser = await puppeteer.launch(); const page = await browser.newPage(); const client = await page.target().createCDPSession(); await client.send('Network.enable'); const requests = []; client.on('Network.requestWillBeSent', (params) => { requests.push({ requestId: params.requestId, url: params.request.url, method: params.request.method, type: params.type, timestamp: params.timestamp }); }); client.on('Network.responseReceived', (params) => { const request = requests.find(r => r.requestId === params.requestId); if (request) { request.status = params.response.status; request.mimeType = params.response.mimeType; request.size = params.response.encodedDataLength; } }); await page.goto(url, { waitUntil: 'networkidle2' }); // 分析请求 const analysis = { totalRequests: requests.length, byType: {}, byStatus: {}, totalSize: 0 }; requests.forEach(req => { // 按类型统计 if (!analysis.byType[req.type]) { analysis.byType[req.type] = { count: 0, size: 0 }; } analysis.byType[req.type].count++; analysis.byType[req.type].size += req.size || 0; // 按状态码统计 if (!analysis.byStatus[req.status]) { analysis.byStatus[req.status] = 0; } analysis.byStatus[req.status]++; analysis.totalSize += req.size || 0; }); await browser.close(); return analysis;}analyzeNetworkRequests('https://example.com').then(console.log);场景 3:内存分析async function analyzeMemory(url) { const browser = await puppeteer.launch(); const page = await browser.newPage(); const client = await page.target().createCDPSession(); await client.send('Runtime.enable'); await client.send('HeapProfiler.enable'); await page.goto(url, { waitUntil: 'networkidle2' }); // 获取堆快照 const heapSnapshot = await client.send('HeapProfiler.takeHeapSnapshot', { reportProgress: false }); // 获取内存使用情况 const memoryMetrics = await client.send('Runtime.getHeapUsage'); const report = { totalSize: memoryMetrics.totalSize, usedSize: memoryMetrics.usedSize, heapSnapshot: heapSnapshot }; await browser.close(); return report;}analyzeMemory('https://example.com').then(console.log);场景 4:JavaScript 执行分析async function analyzeJavaScript(url) { const browser = await puppeteer.launch(); const page = await browser.newPage(); const client = await page.target().createCDPSession(); await client.send('Runtime.enable'); await client.send('Debugger.enable'); const consoleLogs = []; const exceptions = []; client.on('Runtime.consoleAPICalled', (params) => { consoleLogs.push({ type: params.type, args: params.args.map(arg => arg.value) }); }); client.on('Runtime.exceptionThrown', (params) => { exceptions.push({ message: params.exceptionDetails.exception?.description, stackTrace: params.exceptionDetails.stackTrace }); }); await page.goto(url, { waitUntil: 'networkidle2' }); const report = { consoleLogs, exceptions, hasErrors: exceptions.length > 0 }; await browser.close(); return report;}analyzeJavaScript('https://example.com').then(console.log);8. CDP 高级功能覆盖代码:const client = await page.target().createCDPSession();await client.send('DOM.enable');await client.send('CSS.enable');// 启用代码覆盖await client.send('Profiler.enable');await client.send('Profiler.startPreciseCoverage', { callCount: true, detailed: true});// 执行操作await page.goto('https://example.com');// 获取覆盖数据const coverage = await client.send('Profiler.takePreciseCoverage');console.log('Coverage:', coverage.result);监控长任务:const client = await page.target().createCDPSession();await client.send('Performance.enable');client.on('Performance.metrics', (params) => { params.metrics.forEach(metric => { if (metric.name === 'TaskDuration' && metric.value > 50) { console.warn('Long task detected:', metric.value, 'ms'); } });});监控布局抖动:const client = await page.target().createCDPSession();await client.send('Performance.enable');const layoutShifts = [];client.on('Performance.metrics', (params) => { params.metrics.forEach(metric => { if (metric.name === 'LayoutShift') { layoutShifts.push(metric.value); } });});// 计算累积布局偏移const cls = layoutShifts.reduce((sum, shift) => sum + shift, 0);console.log('Cumulative Layout Shift:', cls);9. 最佳实践1. 及时禁用 CDP 域:try { await client.send('Performance.enable'); // 操作} finally { await client.send('Performance.disable');}2. 批量获取数据:// 一次性获取多个指标const [metrics, networkData] = await Promise.all([ client.send('Performance.getMetrics'), client.send('Network.getResponseBody', { requestId: 'xxx' })]);3. 使用事件过滤:client.on('Network.requestWillBeSent', (params) => { // 只处理特定请求 if (params.request.url.includes('/api/')) { console.log('API Request:', params.request.url); }});4. 错误处理:try { await client.send('Performance.getMetrics');} catch (error) { console.error('CDP error:', error); // 降级处理}
阅读 0·2月19日 19:40

Puppeteer 中有哪些等待机制?如何正确使用它们来处理异步操作?

Puppeteer 提供了多种等待机制来处理异步操作和页面加载,确保在执行操作前页面状态已就绪。1. page.waitForNavigation()等待页面导航完成,适用于点击链接、提交表单等会触发页面跳转的操作。await Promise.all([ page.waitForNavigation(), page.click('#submit-button')]);参数选项:waitUntil: 'load' | 'domcontentloaded' | 'networkidle0' | 'networkidle2'timeout: 超时时间(毫秒)2. page.waitForSelector(selector)等待指定选择器出现在页面中。await page.waitForSelector('.result-item', { visible: true });参数选项:visible: 等待元素可见hidden: 等待元素隐藏timeout: 超时时间3. page.waitForXPath(xpath)等待 XPath 选择器匹配的元素。await page.waitForXPath('//div[@class="content"]');4. page.waitForFunction(pageFunction, …args)等待自定义函数返回真值,最灵活的等待方式。await page.waitForFunction( () => document.querySelectorAll('.item').length > 5);// 带参数await page.waitForFunction( (count) => document.querySelectorAll('.item').length >= count, {}, 10);5. page.waitForTimeout(milliseconds)等待指定时间(已废弃,建议使用 setTimeout)。// 旧方法(已废弃)await page.waitForTimeout(1000);// 新方法await new Promise(resolve => setTimeout(resolve, 1000));6. page.waitForResponse(urlOrPredicate)等待特定的网络响应。// 等待特定 URL 的响应await page.waitForResponse('https://api.example.com/data');// 使用谓词函数await page.waitForResponse(response => response.url().includes('/api/') && response.status() === 200);7. page.waitForRequest(urlOrPredicate)等待特定的网络请求。await page.waitForRequest(request => request.url().includes('/api/data'));8. page.waitForFrame(frame)等待指定的 iframe 加载完成。const frame = await page.waitForFrame('iframe-name');最佳实践:1. 选择合适的等待方法:导航操作 → waitForNavigation元素操作 → waitForSelector复杂条件 → waitForFunctionAPI 调用 → waitForResponse2. 设置合理的超时时间:await page.waitForSelector('.element', { timeout: 5000 // 5 秒超时});3. 使用 Promise.all 并行等待:await Promise.all([ page.waitForNavigation(), page.click('#link'), page.waitForSelector('.loaded')]);4. 处理超时异常:try { await page.waitForSelector('.element', { timeout: 3000 });} catch (error) { console.log('Element not found within timeout');}5. 优化等待策略:// 等待网络空闲(推荐)await page.waitForNavigation({ waitUntil: 'networkidle2' });// 等待特定元素可见await page.waitForSelector('.element', { visible: true });常见问题解决:问题 1:元素存在但不可见// 解决方案:等待元素可见await page.waitForSelector('.element', { visible: true });问题 2:动态加载内容// 解决方案:使用 waitForFunction 检查内容await page.waitForFunction(() => document.querySelectorAll('.item').length > 0);问题 3:SPA 路由变化// 解决方案:等待 URL 变化await page.waitForFunction(() => window.location.pathname === '/new-page');
阅读 0·2月19日 19:40