Mongoose 聚合框架如何使用，有哪些常用操作？ - 面试题

Mongoose 聚合框架（Aggregation Framework）是 MongoDB 强大的数据处理工具，允许对文档进行复杂的数据转换和计算操作。Mongoose 提供了与 MongoDB 聚合管道完全兼容的接口。

基本聚合操作

使用 aggregate() 方法

javascript
const results = await User.aggregate([
  { $match: { age: { $gte: 18 } } },
  { $group: { _id: '$city', count: { $sum: 1 } } }
]);

聚合管道阶段

1. $match - 过滤文档

javascript
// 过滤年龄大于等于 18 的用户
const results = await User.aggregate([
  { $match: { age: { $gte: 18 } } }
]);

// 多条件过滤
const results = await User.aggregate([
  { $match: { 
    age: { $gte: 18 },
    status: 'active'
  }}
]);

2. $group - 分组统计

javascript
// 按城市分组统计用户数量
const results = await User.aggregate([
  { $group: { 
    _id: '$city',
    count: { $sum: 1 },
    avgAge: { $avg: '$age' }
  }}
]);

// 多字段分组
const results = await Order.aggregate([
  { $group: {
    _id: { city: '$city', status: '$status' },
    totalAmount: { $sum: '$amount' },
    count: { $sum: 1 }
  }}
]);

3. $project - 投影字段

javascript
// 选择特定字段
const results = await User.aggregate([
  { $project: {
    name: 1,
    email: 1,
    fullName: { $concat: ['$firstName', ' ', '$lastName'] }
  }}
]);

// 排除字段
const results = await User.aggregate([
  { $project: {
    password: 0,
    __v: 0
  }}
]);

4. $sort - 排序

javascript
// 按年龄升序排序
const results = await User.aggregate([
  { $sort: { age: 1 } }
]);

// 多字段排序
const results = await User.aggregate([
  { $sort: { city: 1, age: -1 } }
]);

5. $limit 和 $skip - 分页

javascript
// 分页查询
const page = 2;
const pageSize = 10;
const results = await User.aggregate([
  { $skip: (page - 1) * pageSize },
  { $limit: pageSize }
]);

6. $lookup - 关联查询

javascript
// 关联订单数据
const results = await User.aggregate([
  { $match: { _id: userId } },
  { $lookup: {
    from: 'orders',
    localField: '_id',
    foreignField: 'userId',
    as: 'orders'
  }}
]);

// 多个关联
const results = await User.aggregate([
  { $lookup: {
    from: 'orders',
    localField: '_id',
    foreignField: 'userId',
    as: 'orders'
  }},
  { $lookup: {
    from: 'reviews',
    localField: '_id',
    foreignField: 'userId',
    as: 'reviews'
  }}
]);

7. $unwind - 展开数组

javascript
// 展开标签数组
const results = await User.aggregate([
  { $unwind: '$tags' },
  { $group: {
    _id: '$tags',
    count: { $sum: 1 }
  }}
]);

// 保留空数组
const results = await User.aggregate([
  { $unwind: { path: '$tags', preserveNullAndEmptyArrays: true } }
]);

高级聚合操作

数组操作

javascript
// $push - 添加到数组
const results = await User.aggregate([
  { $group: {
    _id: '$city',
    users: { $push: '$name' }
  }}
]);

// $addToSet - 添加到集合（去重）
const results = await User.aggregate([
  { $group: {
    _id: '$city',
    uniqueTags: { $addToSet: '$tags' }
  }}
]);

// $first 和 $last - 获取第一个和最后一个
const results = await User.aggregate([
  { $sort: { createdAt: 1 } },
  { $group: {
    _id: '$city',
    firstUser: { $first: '$name' },
    lastUser: { $last: '$name' }
  }}
]);

条件操作

javascript
// $cond - 条件表达式
const results = await User.aggregate([
  { $project: {
    name: 1,
    ageGroup: {
      $cond: {
        if: { $gte: ['$age', 18] },
        then: 'adult',
        else: 'minor'
      }
    }
  }}
]);

// $ifNull - 空值处理
const results = await User.aggregate([
  { $project: {
    name: 1,
    displayName: { $ifNull: ['$displayName', '$name'] }
  }}
]);

日期操作

javascript
// 按日期分组
const results = await Order.aggregate([
  { $group: {
    _id: {
      year: { $year: '$createdAt' },
      month: { $month: '$createdAt' },
      day: { $dayOfMonth: '$createdAt' }
    },
    total: { $sum: '$amount' },
    count: { $sum: 1 }
  }}
]);

// 日期范围查询
const results = await User.aggregate([
  { $match: {
    createdAt: {
      $gte: new Date('2024-01-01'),
      $lt: new Date('2025-01-01')
    }
  }}
]);

性能优化

使用索引

javascript
// 在 $match 阶段使用索引
const results = await User.aggregate([
  { $match: { email: 'john@example.com' } }, // 使用索引
  { $group: { _id: '$city', count: { $sum: 1 } } }
]);

优化管道顺序

javascript
// 优化前
const results = await User.aggregate([
  { $project: { name: 1, age: 1, city: 1 } },
  { $match: { age: { $gte: 18 } } }
]);

// 优化后 - 先过滤再投影
const results = await User.aggregate([
  { $match: { age: { $gte: 18 } } },
  { $project: { name: 1, age: 1, city: 1 } }
]);

使用 $facet 并行处理

javascript
// 并行执行多个聚合管道
const results = await User.aggregate([
  { $facet: {
    total: [{ $count: 'count' }],
    adults: [
      { $match: { age: { $gte: 18 } } },
      { $count: 'count' }
    ],
    byCity: [
      { $group: { _id: '$city', count: { $sum: 1 } } }
    ]
  }}
]);

实际应用场景

销售统计

javascript
const salesStats = await Order.aggregate([
  { $match: {
    createdAt: {
      $gte: new Date('2024-01-01'),
      $lt: new Date('2025-01-01')
    }
  }},
  { $group: {
    _id: {
      year: { $year: '$createdAt' },
      month: { $month: '$createdAt' }
    },
    totalRevenue: { $sum: '$amount' },
    orderCount: { $sum: 1 },
    avgOrderValue: { $avg: '$amount' }
  }},
  { $sort: { '_id.year': 1, '_id.month': 1 } }
]);

用户活跃度分析

javascript
const userActivity = await User.aggregate([
  { $lookup: {
    from: 'activities',
    localField: '_id',
    foreignField: 'userId',
    as: 'activities'
  }},
  { $project: {
    name: 1,
    email: 1,
    activityCount: { $size: '$activities' },
    lastActivity: { $max: '$activities.createdAt' }
  }},
  { $sort: { activityCount: -1 } }
]);

最佳实践

尽早使用 $match：减少处理的数据量
合理使用索引：在 $match 阶段利用索引
限制结果数量：使用 $limit 避免处理过多数据
避免过深嵌套：保持管道简洁
使用 $facet：并行处理多个聚合
监控性能：使用 explain() 分析聚合性能
分批处理大数据：对于大数据集使用分批处理