Python Interview Questions - Tag

What are the features and use cases of functional programming in Python?

# Functional Programming in Python Explained

## Basic Concepts of Functional Programming

Functional programming is a paradigm that emphasizes pure functions and avoids mutable state and side effects. Python is not a purely functional language, but it ships with a rich set of functional tools.

### Pure Functions

A pure function always returns the same output for the same input and has no side effects.

```python
# Pure function
def add(a, b):
    return a + b

print(add(2, 3))  # 5
print(add(2, 3))  # 5 - same input, same output

# Impure function
counter = 0

def increment():
    global counter
    counter += 1
    return counter

print(increment())  # 1
print(increment())  # 2 - same input, different output (side effect)
```

### Immutable Data

Functional programming favors operations that build new values instead of modifying existing ones.

```python
# Non-mutating operation
original_list = [1, 2, 3]
new_list = original_list + [4, 5]  # builds a new list, the original is untouched
print(original_list)  # [1, 2, 3]
print(new_list)       # [1, 2, 3, 4, 5]

# Mutating operation (discouraged in functional style)
original_list.append(4)  # modifies the original list in place
print(original_list)  # [1, 2, 3, 4]
```

## Higher-Order Functions

A higher-order function takes a function as an argument or returns a function.

### map

`map` applies a function to every element of an iterable. In Python 3 it returns a lazy iterator, which is why the examples wrap it in `list`.

```python
# Basic usage
numbers = [1, 2, 3, 4, 5]
squared = list(map(lambda x: x ** 2, numbers))
print(squared)  # [1, 4, 9, 16, 25]

# With a named function
def square(x):
    return x ** 2

squared = list(map(square, numbers))
print(squared)  # [1, 4, 9, 16, 25]

# Multiple iterables
numbers1 = [1, 2, 3]
numbers2 = [4, 5, 6]
summed = list(map(lambda x, y: x + y, numbers1, numbers2))
print(summed)  # [5, 7, 9]
```

### filter

`filter` keeps the elements of an iterable for which a predicate is true.

```python
# Basic usage
numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
even_numbers = list(filter(lambda x: x % 2 == 0, numbers))
print(even_numbers)  # [2, 4, 6, 8, 10]

# With a named function
def is_even(x):
    return x % 2 == 0

even_numbers = list(filter(is_even, numbers))
print(even_numbers)  # [2, 4, 6, 8, 10]

# Filtering strings
words = ["apple", "banana", "cherry", "date"]
long_words = list(filter(lambda x: len(x) > 5, words))
print(long_words)  # ['banana', 'cherry']
```

### reduce

`reduce` folds the elements of an iterable into a single value by applying a two-argument function cumulatively.

```python
from functools import reduce

# Basic usage
numbers = [1, 2, 3, 4, 5]
sum_result = reduce(lambda x, y: x + y, numbers)
print(sum_result)  # 15

# Product
product = reduce(lambda x, y: x * y, numbers)
print(product)  # 120

# With an initial value
sum_with_initial = reduce(lambda x, y: x + y, numbers, 10)
print(sum_with_initial)  # 25

# Finding the maximum
max_value = reduce(lambda x, y: x if x > y else y, numbers)
print(max_value)  # 5
```

### sorted

`sorted` returns a new sorted list built from an iterable.

```python
# Basic sorting
numbers = [3, 1, 4, 1, 5, 9, 2, 6]
sorted_numbers = sorted(numbers)
print(sorted_numbers)  # [1, 1, 2, 3, 4, 5, 6, 9]

# Descending order
sorted_desc = sorted(numbers, reverse=True)
print(sorted_desc)  # [9, 6, 5, 4, 3, 2, 1, 1]

# Sorting by key
students = [
    {"name": "Alice", "age": 25},
    {"name": "Bob", "age": 20},
    {"name": "Charlie", "age": 30}
]
sorted_by_age = sorted(students, key=lambda x: x["age"])
print(sorted_by_age)
# [{'name': 'Bob', 'age': 20}, {'name': 'Alice', 'age': 25}, {'name': 'Charlie', 'age': 30}]
```

## Lambda Expressions

A lambda expression is an anonymous function, suited to short function definitions.

### Basic Syntax

```python
# Lambda expression
add = lambda x, y: x + y
print(add(3, 5))  # 8

# Equivalent to
def add(x, y):
    return x + y
```

### Practical Uses

```python
# Combined with higher-order functions
numbers = [1, 2, 3, 4, 5]
squared = list(map(lambda x: x ** 2, numbers))
print(squared)  # [1, 4, 9, 16, 25]

# Sorting
students = [("Alice", 25), ("Bob", 20), ("Charlie", 30)]
sorted_students = sorted(students, key=lambda x: x[1])
print(sorted_students)  # [('Bob', 20), ('Alice', 25), ('Charlie', 30)]

# Conditional expressions
get_grade = lambda score: "A" if score >= 90 else "B" if score >= 80 else "C"
print(get_grade(95))  # A
print(get_grade(85))  # B
print(get_grade(75))  # C
```

### Limitations of Lambda

```python
# A lambda may contain only an expression, not statements
# Wrong:
# bad_lambda = lambda x: if x > 0: return x  # SyntaxError

# Right:
good_lambda = lambda x: x if x > 0 else 0
print(good_lambda(5))   # 5
print(good_lambda(-5))  # 0
```

## Decorators

A decorator is an application of higher-order functions that modifies or extends the behavior of a function.

### Basic Decorator

```python
def my_decorator(func):
    def wrapper():
        print("Before the function runs")
        func()
        print("After the function runs")
    return wrapper

@my_decorator
def say_hello():
    print("Hello!")

say_hello()
# Output:
# Before the function runs
# Hello!
# After the function runs
```

### Decorator with Arguments

```python
def repeat(times):
    def decorator(func):
        def wrapper(*args, **kwargs):
            for _ in range(times):
                result = func(*args, **kwargs)
            return result  # result of the last call
        return wrapper
    return decorator

@repeat(3)
def greet(name):
    print(f"Hello, {name}!")

greet("Alice")
# Output:
# Hello, Alice!
# Hello, Alice!
# Hello, Alice!
```

### Preserving Function Metadata

```python
from functools import wraps

def logging_decorator(func):
    @wraps(func)  # copies __name__, __doc__, etc. onto the wrapper
    def wrapper(*args, **kwargs):
        print(f"Calling function: {func.__name__}")
        return func(*args, **kwargs)
    return wrapper

@logging_decorator
def calculate(x, y):
    """Return the sum of two numbers"""
    return x + y

print(calculate.__name__)  # calculate
print(calculate.__doc__)   # Return the sum of two numbers
```

## Partial Functions

A partial function fixes some arguments of a function and produces a new function.

```python
from functools import partial

# Basic usage
def power(base, exponent):
    return base ** exponent

square = partial(power, exponent=2)
cube = partial(power, exponent=3)
print(square(5))  # 25
print(cube(5))    # 125

# Practical use
def greet(name, greeting, punctuation):
    return f"{greeting}, {name}{punctuation}"

hello = partial(greet, greeting="Hello", punctuation="!")
goodbye = partial(greet, greeting="Goodbye", punctuation=".")
print(hello("Alice"))  # Hello, Alice!
print(goodbye("Bob"))  # Goodbye, Bob.
```

## List Comprehensions and Generator Expressions

### List Comprehensions

```python
# Basic usage
numbers = [1, 2, 3, 4, 5]
squared = [x ** 2 for x in numbers]
print(squared)  # [1, 4, 9, 16, 25]

# With a condition
even_squared = [x ** 2 for x in numbers if x % 2 == 0]
print(even_squared)  # [4, 16]

# Nested
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
flattened = [item for row in matrix for item in row]
print(flattened)  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```

### Generator Expressions

```python
# Basic usage
numbers = (x ** 2 for x in range(10))
print(list(numbers))  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

# Memory efficiency
# List comprehension - materializes every element
large_list = [x ** 2 for x in range(1000000)]

# Generator expression - produces elements on demand, tiny footprint
large_gen = (x ** 2 for x in range(1000000))
```

## Practical Use Cases

### 1. Data Processing Pipeline

```python
from functools import reduce

# Data pipeline
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Keep the even numbers
even = filter(lambda x: x % 2 == 0, data)

# Square them
squared = map(lambda x: x ** 2, even)

# Sum them
result = reduce(lambda x, y: x + y, squared)
print(result)  # 220
```

### 2. Function Composition

```python
def compose(*functions):
    """Compose functions; the rightmost one is applied first."""
    def inner(arg):
        result = arg
        for func in reversed(functions):
            result = func(result)
        return result
    return inner

# Define the functions
def add_one(x):
    return x + 1

def multiply_two(x):
    return x * 2

def square(x):
    return x ** 2

# Compose them
pipeline = compose(square, multiply_two, add_one)
print(pipeline(3))  # ((3 + 1) * 2) ** 2 = 64
```

### 3. Currying

```python
def curry(func):
    """Curry a function that takes only positional arguments."""
    def curried(*args):
        if len(args) >= func.__code__.co_argcount:
            return func(*args)
        return lambda *more_args: curried(*(args + more_args))
    return curried

@curry
def add(a, b, c):
    return a + b + c

add_1 = add(1)
add_1_2 = add_1(2)
result = add_1_2(3)
print(result)  # 6

# Chained calls also work
result = add(1)(2)(3)
print(result)  # 6
```

### 4. Memoization

```python
from functools import lru_cache

# Using the lru_cache decorator
@lru_cache(maxsize=128)
def fibonacci(n):
    if n < 2:
        return n
    return fibonacci(n-1) + fibonacci(n-2)

print(fibonacci(100))  # computed quickly

# Hand-written memoization
def memoize(func):
    cache = {}
    def wrapper(*args):
        if args not in cache:
            cache[args] = func(*args)
        return cache[args]
    return wrapper

@memoize
def fibonacci_manual(n):
    if n < 2:
        return n
    return fibonacci_manual(n-1) + fibonacci_manual(n-2)

print(fibonacci_manual(100))  # computed quickly
```

## Advantages of Functional Programming

### 1. Predictability

```python
# A pure function behaves predictably
def calculate_discount(price, discount_rate):
    return price * (1 - discount_rate)

print(calculate_discount(100, 0.2))  # 80.0
print(calculate_discount(100, 0.2))  # 80.0 - always the same
```

### 2. Testability

```python
# Pure functions are easy to test
def add(a, b):
    return a + b

# Tests
assert add(2, 3) == 5
assert add(-1, 1) == 0
assert add(0, 0) == 0
```

### 3. Parallelism

```python
# Pure functions share no state, so they are safe to run concurrently
from concurrent.futures import ThreadPoolExecutor

def process_item(item):
    return item ** 2

items = list(range(1000))

# Note: in CPython the GIL keeps threads from speeding up CPU-bound work;
# ProcessPoolExecutor exposes the same map() API for that case.
with ThreadPoolExecutor(max_workers=4) as executor:
    results = list(executor.map(process_item, items))
```

### 4. Conciseness

```python
# Functional style is more compact
numbers = [1, 2, 3, 4, 5]

# Imperative style
squared = []
for num in numbers:
    squared.append(num ** 2)

# Functional style
squared = list(map(lambda x: x ** 2, numbers))
```

## Best Practices

### 1. Prefer Pure Functions

```python
# Good - pure function
def calculate_total(price, tax_rate):
    return price * (1 + tax_rate)

# Bad - has a side effect
total = 0

def add_to_total(amount):
    global total
    total += amount
```

### 2. Avoid Overusing Lambda

```python
# Bad - a complicated lambda
complex_lambda = lambda x: x ** 2 if x > 0 else (x * 2 if x < 0 else 0)

# Good - a named function
def process_number(x):
    if x > 0:
        return x ** 2
    elif x < 0:
        return x * 2
    else:
        return 0
```

### 3. Use List Comprehensions Sensibly

```python
# Simple case - use a list comprehension
squared = [x ** 2 for x in range(10)]

# Complex case - use a generator function or a loop
def complex_process(data):
    for item in data:
        # more involved processing logic
        processed = item * 2
        if processed > 10:
            yield processed
```

### 4. Use Built-in Functions

```python
# Good - use built-ins
numbers = [1, 2, 3, 4, 5]
total = sum(numbers)
maximum = max(numbers)
minimum = min(numbers)

# Bad - reimplementing by hand
total = 0
for num in numbers:
    total += num
```

## Summary

Core concepts of functional programming in Python:

1. **Pure functions**: the same input always yields the same output, with no side effects
2. **Immutable data**: avoid modifying existing data; create new data instead
3. **Higher-order functions**: functions that accept or return functions (map, filter, reduce)
4. **Lambda expressions**: anonymous functions for simple operations
5. **Decorators**: modify or extend function behavior
6. **Partial functions**: fix some arguments to create a new function
7. **List comprehensions**: build lists concisely
8. **Generator expressions**: lazy evaluation that saves memory

Advantages of functional programming:

- Shorter code that is often easier to read
- Easier testing and debugging
- Safer concurrent execution
- Fewer side effects and less state to manage

Used with judgment, these techniques lead to cleaner and more maintainable Python code.
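The pure-function and immutable-data ideas summarized above can be combined in one small sketch. It assumes a hypothetical `Account` record and `deposit` function (neither appears in the article) and uses a frozen dataclass so that every update returns a new object instead of modifying the old one.

```python
from dataclasses import FrozenInstanceError, dataclass, replace
from functools import reduce


@dataclass(frozen=True)
class Account:
    """Hypothetical immutable record used only for illustration."""
    owner: str
    balance: int


def deposit(account: Account, amount: int) -> Account:
    """Pure function: returns a new Account and leaves the input untouched."""
    return replace(account, balance=account.balance + amount)


start = Account("alice", 100)

# Fold a list of deposits over the starting account.
final = reduce(deposit, [50, 25], start)

# Attempting to mutate a frozen dataclass raises FrozenInstanceError.
try:
    start.balance = 0
except FrozenInstanceError:
    mutation_blocked = True
else:
    mutation_blocked = False

print(start.balance, final.balance, mutation_blocked)
```

Because `deposit` never touches its argument, `start` can be reused or shared freely after the fold.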

Python

Server-side · Feb 21, 17:10

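How does Python's memory management work?

# Python Memory Management Explained

## Overview

Python manages memory automatically, mainly through two mechanisms: reference counting and garbage collection. Developers do not allocate or free memory by hand, which greatly improves productivity. The details below describe CPython, the reference implementation.

## Reference Counting

### How It Works

Every Python object carries a reference counter that records how many references point to it. When the count drops to 0, the object is reclaimed immediately.

### Reference Counting Example

```python
import sys

a = [1, 2, 3]  # reference count = 1
print(sys.getrefcount(a))  # 2 (getrefcount itself holds a temporary reference)

b = a  # reference count = 2
print(sys.getrefcount(a))  # 3

c = b  # reference count = 3
print(sys.getrefcount(a))  # 4

del b  # reference count = 2
print(sys.getrefcount(a))  # 3

del c  # reference count = 1
print(sys.getrefcount(a))  # 2

del a  # reference count = 0, the object is reclaimed
```

### When the Count Changes

```python
# 1. Assignment
x = [1, 2, 3]
y = x  # count goes up

# 2. Function calls
def func(obj):
    pass

func(x)  # count goes up while the argument is bound inside the call

# 3. Storing in a container
lst = [x, y]  # count goes up for each stored reference

# 4. Deletion
del x    # count goes down
del y    # count goes down
del lst  # count goes down
```

### Pros and Cons of Reference Counting

**Pros:**

- Immediate reclamation: an object is freed as soon as nothing references it
- Simple and efficient: no mark-and-sweep pass is needed for the common case
- Predictable: the moment of reclamation is well defined

**Cons:**

- Cannot handle reference cycles
- Maintaining the counters adds overhead
- Needs locking in multithreaded environments

## Reference Cycles

### What Is a Reference Cycle

When two or more objects reference each other and form a loop, their counts never drop to 0 even when no outside reference remains, so reference counting alone cannot free them.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.next = None

# Create a reference cycle
node1 = Node(1)
node2 = Node(2)
node1.next = node2
node2.next = node1  # closes the cycle

# Deleting the outside references does not free the objects by itself
del node1
del node2
# Each object still has a reference count of 1 (they reference each other)
```

### Resolving Reference Cycles

Python's garbage collector exists specifically to handle reference cycles.

## Garbage Collection

### Generational Collection

Python's garbage collector uses a generational strategy and sorts objects into three generations:

1. **Generation 0**: newly created objects
2. **Generation 1**: objects that survived one collection
3. **Generation 2**: objects that survived several collections

### Collection Thresholds

```python
import gc

# Inspect the thresholds
print(gc.get_threshold())  # e.g. (700, 10, 10); defaults vary by CPython version

# Meaning:
# - 700: generation 0 is collected once allocations minus deallocations
#        of tracked objects exceed 700
# - 10: generation 1 is collected after 10 collections of generation 0
# - 10: generation 2 is collected after 10 collections of generation 1

# Set the thresholds
gc.set_threshold(1000, 15, 15)
```

### Triggering Collection Manually

```python
import gc

# Run a collection now
gc.collect()

# Disable automatic collection
gc.disable()

# Enable automatic collection
gc.enable()

# Check whether it is enabled
print(gc.isenabled())
```

### How the Collector Works

```python
import gc

class MyClass:
    def __del__(self):
        print(f"{self} was collected")

# Create a reference cycle
obj1 = MyClass()
obj2 = MyClass()
obj1.ref = obj2
obj2.ref = obj1

# Drop the outside references
del obj1
del obj2

# Run a collection manually
collected = gc.collect()
print(f"Collected {collected} objects")
```

## Memory Pools

### Small-Object Allocator (Pymalloc)

Python serves small allocations (512 bytes or less) from a dedicated memory pool, which makes allocation faster.

```python
import sys

# Small allocations come from the pool
small_list = [1, 2, 3]
print(f"Small object size: {sys.getsizeof(small_list)} bytes")

# Large allocations go to the system allocator
large_list = list(range(10000))
print(f"Large object size: {sys.getsizeof(large_list)} bytes")
```

### Benefits of the Memory Pool

- Less memory fragmentation
- Faster allocation
- Fewer system calls

## Memory Optimization Tips

### 1. Use Generators Instead of Lists

```python
# Bad - builds the whole list
def get_squares_list(n):
    return [i ** 2 for i in range(n)]

# Good - yields values one at a time
def get_squares_generator(n):
    for i in range(n):
        yield i ** 2
```

### 2. Use __slots__ to Reduce Memory Use

```python
import sys

class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

class PersonWithSlots:
    __slots__ = ['name', 'age']

    def __init__(self, name, age):
        self.name = name
        self.age = age

# Compare memory use
p1 = Person("Alice", 25)
p2 = PersonWithSlots("Alice", 25)

# sys.getsizeof does not count the per-instance __dict__, so add it explicitly
regular_size = sys.getsizeof(p1) + sys.getsizeof(p1.__dict__)
print(f"Regular object size: {regular_size} bytes")
print(f"__slots__ object size: {sys.getsizeof(p2)} bytes")
```

### 3. Use Weak References

```python
import weakref

class Cache:
    def __init__(self):
        self.cache = weakref.WeakValueDictionary()

    def get(self, key):
        return self.cache.get(key)

    def set(self, key, value):
        self.cache[key] = value

# The cache holds only weak references, so it does not keep objects alive
# (MyClass is the class from the garbage collector example above)
cache = Cache()
obj = MyClass()
cache.set("key", obj)
del obj  # the object can now be reclaimed
```

### 4. Release Large Objects Promptly

```python
# Processing a large file
# (process_data is assumed to be defined elsewhere)
def process_large_file(filename):
    with open(filename, 'r') as f:
        data = f.read()  # reads the whole file
        result = process_data(data)
        del data  # drop the reference as soon as possible
    return result
```

### 5. Choose Appropriate Data Structures

```python
# Tuple instead of list (immutable data)
coordinates = (1, 2, 3)  # smaller than the equivalent list

# Set instead of list (fast membership tests)
unique_items = set(items)  # faster lookups

# One dict instead of several loose lists
data = {'names': names, 'ages': ages}  # better organization
```

## Memory Analysis Tools

### 1. The sys Module

```python
import sys

# Object size
obj = [1, 2, 3, 4, 5]
print(f"Object size: {sys.getsizeof(obj)} bytes")

# Reference count
print(f"Reference count: {sys.getrefcount(obj)}")
```

### 2. The gc Module

```python
import gc

# All objects tracked by the collector
all_objects = gc.get_objects()
print(f"Tracked objects: {len(all_objects)}")

# Objects found unreachable but not freed
garbage = gc.garbage
print(f"Garbage objects: {len(garbage)}")

# Collection statistics
print(gc.get_stats())
```

### 3. The tracemalloc Module

```python
import tracemalloc

# Start tracing allocations
tracemalloc.start()

# Run some code
data = [i for i in range(100000)]

# Take a snapshot
snapshot = tracemalloc.take_snapshot()

# Show allocation statistics
top_stats = snapshot.statistics('lineno')
for stat in top_stats[:10]:
    print(stat)

# Stop tracing
tracemalloc.stop()
```

### 4. memory_profiler

```python
# Install: pip install memory-profiler
from memory_profiler import profile

@profile
def memory_intensive_function():
    data = [i for i in range(1000000)]
    return sum(data)

if __name__ == '__main__':
    memory_intensive_function()
```

## Common Memory Problems and Fixes

### 1. Memory Leaks

```python
# Problematic code
class Observer:
    def __init__(self, subject):
        self.subject = subject
        subject.observers.append(self)  # subject and observer now reference each other

# Fix 1: hold the subject through a weak reference
import weakref

class Observer:
    def __init__(self, subject):
        self.subject = weakref.ref(subject)
        subject.observers.append(self)

# Fix 2: provide a cleanup method
class Observer:
    def __init__(self, subject):
        self.subject = subject
        subject.observers.append(self)

    def cleanup(self):
        if self in self.subject.observers:
            self.subject.observers.remove(self)
```

### 2. Large Objects Using Too Much Memory

```python
# Problematic code
def load_all_data():
    return [process_item(item) for item in large_dataset]

# Fix: use a generator
def load_data_generator():
    for item in large_dataset:
        yield process_item(item)
```

### 3. Unbounded Cache Growth

```python
# Problematic code
cache = {}

def get_data(key):
    if key not in cache:
        cache[key] = expensive_operation(key)
    return cache[key]

# Fix: use a bounded LRU cache
from functools import lru_cache

@lru_cache(maxsize=128)
def get_data(key):
    return expensive_operation(key)
```

## Best Practices

### 1. Avoid Unnecessary Object Creation

```python
# Bad
def process_items(items):
    results = []
    for item in items:
        temp = item * 2
        results.append(temp)
    return results

# Good
def process_items(items):
    return [item * 2 for item in items]
```

### 2. Use Context Managers

```python
# Good - resources are released automatically
with open('large_file.txt', 'r') as f:
    data = f.read()
    # process the data
# The file is closed on leaving the block; data is freed once nothing references it
```

### 3. Drop References You No Longer Need

```python
def process_data():
    large_data = load_large_dataset()
    result = analyze(large_data)
    del large_data  # release the large object promptly
    return result
```

### 4. Use Appropriate Data Types

```python
# array instead of list (numeric data)
import array
arr = array.array('i', [1, 2, 3, 4, 5])  # more compact

# bytes instead of str (binary data)
data = b'binary data'  # typically a little smaller than the equivalent str
```

## Summary

Python's memory management consists of:

1. **Reference counting**: reclaims objects as soon as they are no longer used
2. **Garbage collection**: handles reference cycles
3. **Memory pools**: speed up small-object allocation
4. **Generational collection**: keeps garbage collection cheap

### Key Optimization Points

- Use generators instead of lists
- Use `__slots__` to shrink per-object memory
- Use weak references to avoid keeping objects alive unintentionally
- Release large objects promptly
- Choose suitable data structures
- Bound the size of any cache

Understanding these mechanisms helps you write more efficient and more stable programs and avoid memory leaks and performance problems.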
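The interplay between reference counting and the cycle collector described above can be observed directly. The following is a minimal sketch under CPython; the `Node` class, the `make_cycle` helper, and the `probe` weak reference are illustrative names, not part of the article.

```python
import gc
import weakref


class Node:
    """Hypothetical class used only to build a reference cycle."""

    def __init__(self, name):
        self.name = name
        self.partner = None


def make_cycle():
    """Create two nodes that reference each other; return a weak probe to one."""
    a, b = Node("a"), Node("b")
    a.partner, b.partner = b, a
    return weakref.ref(a)  # a weak reference does not keep the node alive


was_enabled = gc.isenabled()
gc.disable()  # keep the automatic collector out of the way for the demo
try:
    probe = make_cycle()
    # The local names are gone, but the cycle keeps both counts above zero.
    alive_before = probe() is not None
    unreachable = gc.collect()  # the cycle collector finds and frees the pair
    alive_after = probe() is not None
finally:
    if was_enabled:
        gc.enable()

print(alive_before, alive_after, unreachable >= 2)
```

The weak reference acts as a probe: it resolves to the object while the cycle is alive and to `None` once the collector has reclaimed it.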


What is a closure in Python, and how is it used?

# Closures in Python Explained

## Basic Concepts

A closure is a function object that can still access variables from the scope in which it was defined, even when it is executed outside that scope.

### Basic Structure of a Closure

```python
def outer_function(x):
    """Outer function"""
    def inner_function(y):
        """Inner function"""
        return x + y
    return inner_function

# Create a closure
closure = outer_function(10)

# Call the closure
print(closure(5))   # 15
print(closure(20))  # 30
```

### The Three Conditions for a Closure

1. There must be a nested (inner) function
2. The inner function must reference a variable of the outer function
3. The outer function must return the inner function

```python
def make_multiplier(factor):
    """Create a multiplier closure"""
    def multiply(number):
        return number * factor
    return multiply

# Create different multipliers
double = make_multiplier(2)
triple = make_multiplier(3)

print(double(5))  # 10
print(triple(5))  # 15
```

## How Closures Work

### Variable Scope

```python
def outer():
    x = 10
    def inner():
        # The inner function can read the outer function's variable
        print(f"Inner function reads x: {x}")
        return x
    return inner

closure = outer()
print(closure())  # prints "Inner function reads x: 10", then 10
```

### Variable Lifetime

```python
def counter():
    """Counter closure"""
    count = 0
    def increment():
        nonlocal count
        count += 1
        return count
    return increment

# Create a counter
my_counter = counter()
print(my_counter())  # 1
print(my_counter())  # 2
print(my_counter())  # 3

# A second counter has its own independent state
another_counter = counter()
print(another_counter())  # 1
```

### The __closure__ Attribute

```python
def outer(x):
    def inner(y):
        return x + y
    return inner

closure = outer(10)

# Inspect the captured variables
print(closure.__closure__)  # (<cell at 0x...: int object at 0x...>,)
print(closure.__closure__[0].cell_contents)  # 10
```

## Practical Uses of Closures

### 1. Data Hiding and Encapsulation

```python
def make_account(initial_balance):
    """Create a bank account"""
    balance = initial_balance

    def deposit(amount):
        nonlocal balance
        balance += amount
        return balance

    def withdraw(amount):
        nonlocal balance
        if amount <= balance:
            balance -= amount
            return balance
        else:
            raise ValueError("Insufficient funds")

    def get_balance():
        return balance

    # Return several functions
    return {
        'deposit': deposit,
        'withdraw': withdraw,
        'get_balance': get_balance
    }

# Create an account
account = make_account(100)

# Use the account
print(account['deposit'](50))    # 150
print(account['withdraw'](30))   # 120
print(account['get_balance']())  # 120

# The balance variable is hidden and cannot be accessed directly
# print(balance)  # NameError: name 'balance' is not defined
```

### 2. Function Factories

```python
def make_power_function(power):
    """Create a power function"""
    def power_function(base):
        return base ** power
    return power_function

# Create different power functions
square = make_power_function(2)
cube = make_power_function(3)
fourth_power = make_power_function(4)

print(square(3))        # 9
print(cube(3))          # 27
print(fourth_power(3))  # 81
```

### 3. Deferred Computation

```python
def lazy_sum(*args):
    """Deferred summation"""
    def compute_sum():  # named so that it does not shadow the built-in sum
        total = 0
        for num in args:
            total += num
        return total
    return compute_sum

# Create the deferred sum
f = lazy_sum(1, 2, 3, 4, 5)

# The work happens only on the call
print(f())  # 15
```

### 4. Caching and Memoization

```python
def memoize(func):
    """Memoization decorator"""
    cache = {}
    def memoized(*args):
        if args not in cache:
            cache[args] = func(*args)
        return cache[args]
    return memoized

@memoize
def fibonacci(n):
    """Fibonacci sequence"""
    if n < 2:
        return n
    return fibonacci(n-1) + fibonacci(n-2)

print(fibonacci(10))  # 55
print(fibonacci(20))  # 6765
```

### 5. Callbacks

```python
def make_callback(callback):
    """Wrap a callback with before/after messages"""
    def execute(*args, **kwargs):
        print("Before the callback...")
        result = callback(*args, **kwargs)
        print("After the callback...")
        return result
    return execute

def my_function(x, y):
    return x + y

# Create the wrapped function
callback_function = make_callback(my_function)
print(callback_function(3, 5))
# prints "Before the callback...", "After the callback...", then 8
```

### 6. Keeping State

```python
def make_state_machine():
    """Create a state machine"""
    state = 'idle'

    def transition(action):
        nonlocal state
        print(f"Current state: {state}, action: {action}")
        if state == 'idle':
            if action == 'start':
                state = 'running'
        elif state == 'running':
            if action == 'pause':
                state = 'paused'
            elif action == 'stop':
                state = 'idle'
        elif state == 'paused':
            if action == 'resume':
                state = 'running'
            elif action == 'stop':
                state = 'idle'
        print(f"New state: {state}")
        return state

    return transition

# Create the state machine
state_machine = make_state_machine()
state_machine('start')   # idle -> running
state_machine('pause')   # running -> paused
state_machine('resume')  # paused -> running
state_machine('stop')    # running -> idle
```

## Closures and Decorators

### Implementing a Decorator with a Closure

```python
def my_decorator(func):
    """A simple decorator"""
    def wrapper():
        print("Decorator: before the call")
        result = func()
        print("Decorator: after the call")
        return result
    return wrapper

@my_decorator
def say_hello():
    print("Hello!")

say_hello()
# Output:
# Decorator: before the call
# Hello!
# Decorator: after the call
```

### Decorator with Arguments

```python
def repeat(times):
    """Decorator that calls the function repeatedly"""
    def decorator(func):
        def wrapper(*args, **kwargs):
            results = []
            for _ in range(times):
                result = func(*args, **kwargs)
                results.append(result)
            return results
        return wrapper
    return decorator

@repeat(3)
def greet(name):
    return f"Hello, {name}!"

print(greet("Alice"))
# Output: ['Hello, Alice!', 'Hello, Alice!', 'Hello, Alice!']
```

## Caveats

### 1. The Loop Variable Trap

```python
# Wrong: every lambda looks up i when called, after the loop has finished
def create_multipliers():
    return [lambda x: x * i for i in range(5)]

multipliers = create_multipliers()
print([m(2) for m in multipliers])  # [8, 8, 8, 8, 8] - wrong!

# Right: bind the current value of i through a default argument
def create_multipliers_correct():
    return [lambda x, i=i: x * i for i in range(5)]

multipliers_correct = create_multipliers_correct()
print([m(2) for m in multipliers_correct])  # [0, 2, 4, 6, 8] - correct
```

### 2. Modifying Outer Variables

```python
def outer():
    count = 0
    def increment():
        nonlocal count  # required in order to rebind count
        count += 1
        return count
    return increment

counter = outer()
print(counter())  # 1
print(counter())  # 2
```

### 3. Risk of Holding Memory

```python
def large_closure():
    """Create a closure that captures a large object"""
    large_data = list(range(1000000))
    def process():
        return sum(large_data[:100])
    return process

# The closure keeps large_data alive,
# even though it uses only a small part of it
closure = large_closure()

# Once the closure is no longer needed, drop the reference
del closure
```

## Closures vs Classes

### Closure Implementation

```python
def make_counter():
    """Counter implemented with a closure"""
    count = 0

    def increment():
        nonlocal count
        count += 1
        return count

    def get_count():
        return count

    return {
        'increment': increment,
        'get_count': get_count
    }

counter = make_counter()
print(counter['increment']())  # 1
print(counter['increment']())  # 2
print(counter['get_count']())  # 2
```

### Class Implementation

```python
class Counter:
    """Counter implemented with a class"""
    def __init__(self):
        self.count = 0

    def increment(self):
        self.count += 1
        return self.count

    def get_count(self):
        return self.count

counter = Counter()
print(counter.increment())  # 1
print(counter.increment())  # 2
print(counter.get_count())  # 2
```

### When to Use a Closure vs a Class

```python
# Cases for a closure:
# 1. Simple state
def make_accumulator():
    total = 0
    def add(value):
        nonlocal total
        total += value
        return total
    return add

# 2. Function factories
def make_power(power):
    def power_function(base):
        return base ** power
    return power_function

# Cases for a class:
# 1. Complex state management
class BankAccount:
    def __init__(self, initial_balance):
        self.balance = initial_balance
        self.transactions = []

    def deposit(self, amount):
        self.balance += amount
        self.transactions.append(('deposit', amount))

    def withdraw(self, amount):
        if amount <= self.balance:
            self.balance -= amount
            self.transactions.append(('withdraw', amount))

    def get_balance(self):
        return self.balance

    def get_transactions(self):
        return self.transactions

# 2. Several methods and attributes
class Calculator:
    def __init__(self):
        self.history = []

    def add(self, a, b):
        result = a + b
        self.history.append(f"{a} + {b} = {result}")
        return result

    def subtract(self, a, b):
        result = a - b
        self.history.append(f"{a} - {b} = {result}")
        return result

    def get_history(self):
        return self.history
```

## Advanced Uses of Closures

### 1. Partial Application

```python
def partial(func, *args, **kwargs):
    """Partial application (a simplified version of functools.partial)"""
    def wrapper(*more_args, **more_kwargs):
        all_args = args + more_args
        all_kwargs = {**kwargs, **more_kwargs}
        return func(*all_args, **all_kwargs)
    return wrapper

def power(base, exponent):
    return base ** exponent

square = partial(power, exponent=2)
cube = partial(power, exponent=3)

print(square(5))  # 25
print(cube(5))    # 125
```

### 2. Function Composition

```python
def compose(*functions):
    """Function composition; the rightmost function is applied first"""
    def wrapper(arg):
        result = arg
        for func in reversed(functions):
            result = func(result)
        return result
    return wrapper

def add_one(x):
    return x + 1

def multiply_two(x):
    return x * 2

def square(x):
    return x ** 2

# Compose the functions
combined = compose(square, multiply_two, add_one)
print(combined(3))  # ((3 + 1) * 2) ** 2 = 64
```

### 3. Validators

```python
def make_validator(validator_func, error_message):
    """Create a validator"""
    def validate(value):
        if not validator_func(value):
            raise ValueError(error_message)
        return value
    return validate

# Create validators
is_positive = make_validator(
    lambda x: x > 0,
    "Value must be positive"
)
is_email = make_validator(
    lambda x: '@' in x and '.' in x,
    "Invalid email address"
)

# Use the validators
print(is_positive(10))  # 10
# is_positive(-5)  # ValueError: Value must be positive
print(is_email("user@example.com"))  # user@example.com
# is_email("invalid")  # ValueError: Invalid email address
```

### 4. Rate Limiter

```python
import time

def rate_limiter(max_calls, time_window):
    """Create a rate limiter"""
    calls = []

    def limiter(func):
        def wrapper(*args, **kwargs):
            current_time = time.time()
            # Drop call records that fall outside the time window
            calls[:] = [call_time for call_time in calls
                        if current_time - call_time < time_window]
            # Check the limit
            if len(calls) >= max_calls:
                raise Exception(
                    f"Rate limit exceeded: {max_calls} calls per {time_window} s"
                )
            # Record the call
            calls.append(current_time)
            # Run the function
            return func(*args, **kwargs)
        return wrapper
    return limiter

@rate_limiter(max_calls=3, time_window=1)
def api_call():
    print("API call succeeded")
    return "success"

# Test the limiter
api_call()  # succeeds
api_call()  # succeeds
api_call()  # succeeds
# api_call()  # Exception: Rate limit exceeded: 3 calls per 1 s
```

## Summary

Core concepts of Python closures:

1. **Definition**: a closure is a function object that can access variables from its defining scope
2. **Three conditions**: a nested function, a reference to an outer variable, and returning the inner function
3. **Mechanism**: references to outer variables are kept in the `__closure__` attribute

Practical uses of closures:

- Data hiding and encapsulation
- Function factories
- Deferred computation
- Caching and memoization
- Callbacks
- Keeping state

Caveats:

- The loop variable trap
- Use `nonlocal` to rebind outer variables
- Captured objects stay alive as long as the closure does

Closures vs classes:

- Closures: suited to simple state and function factories
- Classes: suited to complex state management and multiple methods

Advanced uses of closures:

- Partial application
- Function composition
- Validators
- Rate limiters

Closures let a function keep state, hide data, and produce flexible, reusable code, so they are worth understanding well when writing Python.
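The point that each closure keeps its own references in `__closure__` can be checked with a short sketch. The `make_counter` factory mirrors the counters used in the article; the variable names `first` and `second` are illustrative.

```python
def make_counter():
    """Return a function that counts its own calls."""
    count = 0

    def increment():
        nonlocal count
        count += 1
        return count

    return increment


first = make_counter()
second = make_counter()

first()
first()
second()

# Each call to make_counter created a separate cell for `count`.
first_cell = first.__closure__[0].cell_contents
second_cell = second.__closure__[0].cell_contents

# co_freevars lists the names the inner function captured.
free_vars = first.__code__.co_freevars

print(first_cell, second_cell, free_vars)
```

The two counters share the same code object but not the same cell, which is why advancing one never affects the other.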


What is the difference between list comprehensions and generator expressions in Python?

# Python Generator Expressions and List Comprehensions Explained

## List Comprehensions

### Basic Syntax

A list comprehension is a concise way to build a list that combines a loop with an optional condition.

```python
# Basic list comprehension
numbers = [1, 2, 3, 4, 5]

# Traditional approach
squares = []
for num in numbers:
    squares.append(num ** 2)

# List comprehension
squares = [num ** 2 for num in numbers]
print(squares)  # [1, 4, 9, 16, 25]
```

### List Comprehensions with Conditions

```python
# List comprehensions with a filter
numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Even numbers
evens = [num for num in numbers if num % 2 == 0]
print(evens)  # [2, 4, 6, 8, 10]

# Odd numbers greater than 5
odd_gt_5 = [num for num in numbers if num % 2 == 1 and num > 5]
print(odd_gt_5)  # [7, 9]

# Using an if-else expression
result = ["even" if num % 2 == 0 else "odd" for num in numbers[:5]]
print(result)  # ['odd', 'even', 'odd', 'even', 'odd']
```

### Nested List Comprehensions

```python
# Nested list comprehensions
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# Flatten a 2-D list
flattened = [item for row in matrix for item in row]
print(flattened)  # [1, 2, 3, 4, 5, 6, 7, 8, 9]

# Transpose a matrix
transposed = [[row[i] for row in matrix] for i in range(3)]
print(transposed)  # [[1, 4, 7], [2, 5, 8], [3, 6, 9]]

# Build a multiplication table
multiplication_table = [[i * j for j in range(1, 6)] for i in range(1, 6)]
for row in multiplication_table:
    print(row)
# [1, 2, 3, 4, 5]
# [2, 4, 6, 8, 10]
# [3, 6, 9, 12, 15]
# [4, 8, 12, 16, 20]
# [5, 10, 15, 20, 25]
```

### Practical Uses of List Comprehensions

```python
# 1. Data transformation
names = ["alice", "bob", "charlie"]
capitalized = [name.capitalize() for name in names]
print(capitalized)  # ['Alice', 'Bob', 'Charlie']

# 2. Data filtering
data = [1, -2, 3, -4, 5, -6, 7, -8, 9, -10]
positive = [x for x in data if x > 0]
print(positive)  # [1, 3, 5, 7, 9]

# 3. Converting dict keys and values
user_dict = {"name": "Alice", "age": 25, "city": "New York"}
keys = [key for key in user_dict.keys()]
values = [value for value in user_dict.values()]
items = [f"{key}: {value}" for key, value in user_dict.items()]
print(keys)    # ['name', 'age', 'city']
print(values)  # ['Alice', 25, 'New York']
print(items)   # ['name: Alice', 'age: 25', 'city: New York']

# 4. File processing
# Suppose a file contains several lines of text
lines = ["hello world", "python is great", "list comprehension"]
words = [word for line in lines for word in line.split()]
print(words)  # ['hello', 'world', 'python', 'is', 'great', 'list', 'comprehension']
```

## Generator Expressions

### Basic Syntax

A generator expression looks like a list comprehension but uses parentheses instead of square brackets. It returns a generator object, not a list.

```python
# Generator expression
numbers = [1, 2, 3, 4, 5]

# List comprehension
squares_list = [num ** 2 for num in numbers]
print(squares_list)        # [1, 4, 9, 16, 25]
print(type(squares_list))  # <class 'list'>

# Generator expression
squares_gen = (num ** 2 for num in numbers)
print(squares_gen)        # <generator object <genexpr> at 0x...>
print(type(squares_gen))  # <class 'generator'>

# Consume the generator
print(list(squares_gen))  # [1, 4, 9, 16, 25]
```

### Lazy Evaluation in Generators

```python
# Generators evaluate lazily
def count():
    print("Generator started")
    for i in range(5):
        print(f"Yielding {i}")
        yield i

# Create the generator
gen = count()
print("Generator created")

# Fetch values one by one
print(f"Got value: {next(gen)}")
print(f"Got value: {next(gen)}")
print(f"Got value: {next(gen)}")

# Output:
# Generator created
# Generator started
# Yielding 0
# Got value: 0
# Yielding 1
# Got value: 1
# Yielding 2
# Got value: 2
```

### Practical Uses of Generator Expressions

```python
# 1. Processing large files
# Suppose there is a large file to read line by line
def read_large_file(filename):
    with open(filename, 'r') as f:
        for line in f:
            yield line.strip()

# Process it with a generator expression
# lines = (line for line in read_large_file('large_file.txt'))
# long_lines = [line for line in lines if len(line) > 100]

# 2. Infinite sequences
import itertools

# An infinite generator of even numbers
evens = (i for i in itertools.count(0, 2))
print(next(evens))  # 0
print(next(evens))  # 2
print(next(evens))  # 4

# 3. Chained conditions
numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Generator expression with two filters
result = (
    num ** 2
    for num in numbers
    if num % 2 == 0
    if num > 4
)
print(list(result))  # [36, 64, 100]

# 4. Memory-efficient data processing
# For large inputs, a generator saves memory
large_data = range(1000000)

# List comprehension - uses a lot of memory
# squares_list = [x ** 2 for x in large_data]

# Generator expression - memory efficient
squares_gen = (x ** 2 for x in large_data)

# Values are computed only when requested
print(next(squares_gen))  # 0
print(next(squares_gen))  # 1
```

## List Comprehensions vs Generator Expressions

### Memory Use

```python
import sys

# List comprehension - creates every element immediately
list_comp = [x ** 2 for x in range(1000000)]
print(f"List comprehension size: {sys.getsizeof(list_comp)} bytes")

# Generator expression - lazy, creates nothing up front
gen_expr = (x ** 2 for x in range(1000000))
print(f"Generator expression size: {sys.getsizeof(gen_expr)} bytes")

# Typical result on 64-bit CPython (exact numbers vary by version):
# list: roughly 8 MB for the list object alone
# generator: around 100 to 200 bytes
```

### Performance

```python
import time

# Time creating the list
start = time.perf_counter()
list_comp = [x ** 2 for x in range(1000000)]
list_time = time.perf_counter() - start

# Time creating the generator (no elements are computed yet)
start = time.perf_counter()
gen_expr = (x ** 2 for x in range(1000000))
gen_time = time.perf_counter() - start

print(f"List comprehension creation time: {list_time:.4f} s")
print(f"Generator expression creation time: {gen_time:.4f} s")

# But when every element has to be visited
start = time.perf_counter()
for _ in list_comp:
    pass
list_iterate_time = time.perf_counter() - start

start = time.perf_counter()
for _ in gen_expr:
    pass
gen_iterate_time = time.perf_counter() - start

print(f"List iteration time: {list_iterate_time:.4f} s")
print(f"Generator iteration time: {gen_iterate_time:.4f} s")
```

### Use Cases

```python
# Cases that suit a list comprehension
# 1. The result is accessed several times
numbers = [1, 2, 3, 4, 5]
squares = [x ** 2 for x in numbers]
print(squares[0])  # 1
print(squares[2])  # 9
print(squares[4])  # 25

# 2. Index access is needed
for i, value in enumerate(squares):
    print(f"Index {i}: {value}")

# 3. Slicing is needed
print(squares[1:4])  # [4, 9, 16]

# Cases that suit a generator expression
# 1. Large data sets
large_numbers = range(10000000)
squares_gen = (x ** 2 for x in large_numbers)

# 2. A single pass is enough
total = sum(x ** 2 for x in range(1000000))
print(f"Total: {total}")

# 3. Chained operations
result = (
    x ** 2
    for x in range(100)
    if x % 2 == 0
)
result = (x + 1 for x in result)
result = (x * 2 for x in result)
print(list(result)[:5])  # [2, 10, 34, 74, 130]
```

## Advanced Uses

### 1. Building a Pipeline with Generator Expressions

```python
# Data processing pipeline
def pipeline(data, *functions):
    """Create a data processing pipeline"""
    result = data
    for func in functions:
        result = func(result)
    return result

# Define the stages
def filter_even(numbers):
    return (num for num in numbers if num % 2 == 0)

def square(numbers):
    return (num ** 2 for num in numbers)

def add_one(numbers):
    return (num + 1 for num in numbers)

# Use the pipeline
numbers = range(10)
result = pipeline(numbers, filter_even, square, add_one)
print(list(result))  # [1, 5, 17, 37, 65]
```

### 2. Processing Files with Generator Expressions

```python
# Suppose there is a log file
# log.txt:
# 2024-01-01 10:00:00 INFO User logged in
# 2024-01-01 10:01:00 ERROR Connection failed
# 2024-01-01 10:02:00 INFO User logged out
# 2024-01-01 10:03:00 ERROR Timeout occurred

# Process the log file with generator expressions
def process_log_file(filename):
    """Process a log file and extract the error messages"""
    with open(filename, 'r') as f:
        # Generator expression: keep only the error lines
        errors = (
            line.strip()
            for line in f
            if 'ERROR' in line
        )
        # Further processing
        error_messages = (
            line.split('ERROR ')[1]
            for line in errors
        )
        # Materialize while the file is still open
        return list(error_messages)

# errors = process_log_file('log.txt')
# print(errors)  # ['Connection failed', 'Timeout occurred']
```

### 3. Building Other Structures with Comprehensions

```python
# Build a dict
keys = ['name', 'age', 'city']
values = ['Alice', 25, 'New York']
person_dict = {keys[i]: values[i] for i in range(len(keys))}
print(person_dict)  # {'name': 'Alice', 'age': 25, 'city': 'New York'}

# Build a lookup dict from a list of records
users = [
    {'name': 'Alice', 'age': 25},
    {'name': 'Bob', 'age': 30},
    {'name': 'Charlie', 'age': 35}
]
user_index = {user['name']: user['age'] for user in users}
print(user_index)  # {'Alice': 25, 'Bob': 30, 'Charlie': 35}

# Build a set
numbers = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4]
unique_numbers = {num for num in numbers}
print(unique_numbers)  # {1, 2, 3, 4}

# Build a list of tuples
coordinates = [(x, y) for x in range(3) for y in range(3)]
print(coordinates)
# [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2), (2, 0), (2, 1), (2, 2)]
```

### 4. Infinite Sequences with Generators

```python
# Fibonacci generator
def fibonacci():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

# First 10 Fibonacci numbers
fib = fibonacci()
first_10 = [next(fib) for _ in range(10)]
print(first_10)  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]

# Prime generator
def primes():
    """Yield prime numbers"""
    num = 2
    while True:
        if all(num % i != 0 for i in range(2, int(num ** 0.5) + 1)):
            yield num
        num += 1

# First 10 primes
prime_gen = primes()
first_10_primes = [next(prime_gen) for _ in range(10)]
print(first_10_primes)  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

## Best Practices

### 1. Readability First

```python
# Good - clear and readable
numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
evens = [num for num in numbers if num % 2 == 0]

# Bad - too dense
result = [x for x in [y ** 2 for y in range(10)] if x > 50 and x < 100]

# Better - split into steps
squares = [y ** 2 for y in range(10)]
result = [x for x in squares if 50 < x < 100]
```

### 2. Avoid Side Effects

```python
# Bad - side effect inside a comprehension
items = []
result = [items.append(x) for x in range(10)]  # wrong: result is a list of None

# Good - use a loop
items = []
for x in range(10):
    items.append(x)

# Or build the list directly
items = [x for x in range(10)]
```

### 3. Choose the Right Data Structure

```python
# Accessed several times - use a list
numbers = [x ** 2 for x in range(100)]
print(numbers[0])
print(numbers[50])
print(numbers[99])

# Single pass - use a generator
total = sum(x ** 2 for x in range(1000000))

# Unique values - use a set
unique = {x % 10 for x in range(100)}

# Key-value pairs - use a dict
mapping = {x: x ** 2 for x in range(10)}
```

### 4. Consider Performance

```python
# For large data sets, use a generator expression
large_data = range(10000000)

# Good - generator
result = sum(x ** 2 for x in large_data)

# Bad - list (uses a lot of memory)
# result = sum([x ** 2 for x in large_data])

# For small data sets, a list comprehension may be faster
small_data = range(100)
result = sum([x ** 2 for x in small_data])
```

## Summary

Core concepts of list comprehensions and generator expressions in Python:

### List Comprehensions

1. **Basic syntax**: `[expression for item in iterable if condition]`
2. **Characteristics**: builds the list immediately; supports indexing and slicing
3. **Use when**: the result is accessed several times, indexing is needed, or the data is small

### Generator Expressions

1. **Basic syntax**: `(expression for item in iterable if condition)`
2. **Characteristics**: lazy evaluation, memory efficient, can be iterated only once
3. **Use when**: the data is large, a single pass is enough, or operations are chained

### Main Differences

1. **Memory use**: generator expressions use far less memory
2. **Access**: lists support indexing, generators do not
3. **Reuse**: a list can be traversed many times, a generator only once
4. **Creation time**: a generator is created faster, but iterating it may take longer

### Best Practices

1. Put readability first
2. Avoid side effects inside comprehensions
3. Pick the data structure that matches the need
4. Consider performance and memory use

Both forms make Python code more concise; choosing between them is mostly a question of whether the results must be kept in memory and revisited.
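The "iterate only once" difference listed above is easy to miss in practice, so here is a small sketch of it. It uses `itertools.islice` to take a few items lazily; the variable names are illustrative.

```python
from itertools import islice

# A generator expression: nothing is computed yet.
squares_gen = (n * n for n in range(10))

# islice pulls only the first three values from the generator.
first_three = list(islice(squares_gen, 3))

# The generator resumes where it stopped; earlier values are gone.
rest = list(squares_gen)

# A second pass finds the generator exhausted.
again = list(squares_gen)

# A list, by contrast, can be traversed and indexed repeatedly.
squares_list = [n * n for n in range(10)]
second_pass_equal = list(squares_list) == list(squares_list)

print(first_three)  # [0, 1, 4]
print(rest)         # [9, 16, 25, 36, 49, 64, 81]
print(again)        # []
```

If the values are needed more than once, materialize them with `list(...)` first, or rebuild the generator expression for each pass.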
