Performance Analysis Tools
1. timeit Module
The timeit module is used to measure the execution time of small code snippets.
```python
import timeit

# Measure execution time of a snippet given as a string
code = """
sum(range(1000))
"""
execution_time = timeit.timeit(code, number=1000)
print(f"Execution time: {execution_time:.4f} seconds")

# timeit.timeit also accepts a callable directly (it is not a decorator)
def test_function():
    return sum(range(1000))

print(f"Callable time: {timeit.timeit(test_function, number=1000):.4f} seconds")
```
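For noisier environments, timeit.repeat runs the measurement loop several times so the minimum can be taken; a minimal sketch:

```python
import timeit

# Repeat the timing loop 5 times and keep the best run,
# which is less sensitive to background load than a single measurement
timings = timeit.repeat("sum(range(1000))", number=1000, repeat=5)
print(f"Best of 5: {min(timings):.4f} seconds")
```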
2. cProfile Module
The cProfile module profiles a whole program, reporting per-function call counts and cumulative times, which helps locate performance bottlenecks.
```python
import cProfile

def slow_function():
    total = 0
    for i in range(1000000):
        total += i
    return total

def fast_function():
    return sum(range(1000000))

def main():
    slow_function()
    fast_function()

# Performance profiling (results printed to stdout)
cProfile.run('main()')

# Write the raw statistics to a file for later analysis
cProfile.run('main()', filename='profile_stats')
```
3. memory_profiler
memory_profiler reports the memory usage of a function, line by line, via its @profile decorator.
```python
# Install: pip install memory-profiler
from memory_profiler import profile

@profile
def memory_intensive_function():
    data = [i for i in range(1000000)]
    return sum(data)

if __name__ == '__main__':
    memory_intensive_function()
```
4. line_profiler
line_profiler is used to analyze function performance line by line.
```python
# Install: pip install line_profiler
from line_profiler import LineProfiler

def complex_function():
    result = []
    for i in range(1000):
        result.append(i * 2)
    return sum(result)

# Create performance profiler and wrap the function
lp = LineProfiler()
lp_wrapper = lp(complex_function)
lp_wrapper()

# Display per-line results
lp.print_stats()
```
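line_profiler can also be driven from the command line through kernprof; a minimal sketch of that workflow (the script name is illustrative, and the profile decorator is injected by kernprof at runtime):

```python
# script.py -- run with: kernprof -l -v script.py
# kernprof makes `profile` available as a builtin while the script runs
@profile
def complex_function():
    result = []
    for i in range(1000):
        result.append(i * 2)
    return sum(result)

if __name__ == '__main__':
    complex_function()
```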
Algorithm Optimization
1. Choose Appropriate Algorithms
```python
# Bad practice - O(n²) complexity
def find_duplicates_slow(arr):
    duplicates = []
    for i in range(len(arr)):
        for j in range(i + 1, len(arr)):
            if arr[i] == arr[j] and arr[i] not in duplicates:
                duplicates.append(arr[i])
    return duplicates

# Good practice - O(n) complexity
def find_duplicates_fast(arr):
    seen = set()
    duplicates = set()
    for item in arr:
        if item in seen:
            duplicates.add(item)
        else:
            seen.add(item)
    return list(duplicates)
```
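A quick timeit comparison of the two functions above makes the gap concrete (the input and repeat counts are illustrative; exact timings depend on hardware):

```python
import timeit

# Every value appears twice, so both versions do real work
data = list(range(1000)) * 2

print("O(n^2):", timeit.timeit(lambda: find_duplicates_slow(data), number=10))
print("O(n):  ", timeit.timeit(lambda: find_duplicates_fast(data), number=10))
```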
2. Use Built-in Functions
```python
# Bad practice - Manual implementation
def manual_sum(arr):
    total = 0
    for item in arr:
        total += item
    return total

# Good practice - Use built-in functions
def builtin_sum(arr):
    return sum(arr)

# Performance comparison
import timeit
print(timeit.timeit(lambda: manual_sum(range(10000)), number=100))
print(timeit.timeit(lambda: builtin_sum(range(10000)), number=100))
```
3. Avoid Unnecessary Computations
```python
# Bad practice - iterates over every ordered pair, so each distance is
# computed twice (plus the zero self-distances)
def calculate_distances(points):
    distances = []
    for i in range(len(points)):
        for j in range(len(points)):
            dx = points[j][0] - points[i][0]
            dy = points[j][1] - points[i][1]
            distances.append((dx ** 2 + dy ** 2) ** 0.5)
    return distances

# Good practice - each unordered pair is computed only once
def calculate_distances_optimized(points):
    distances = []
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            dx = points[j][0] - points[i][0]
            dy = points[j][1] - points[i][1]
            distances.append((dx ** 2 + dy ** 2) ** 0.5)
    return distances
```
Data Structure Optimization
1. Use Appropriate Data Structures
```python
# List lookup - O(n)
def find_in_list(lst, target):
    return target in lst

# Set lookup - O(1)
def find_in_set(s, target):
    return target in s

# Performance comparison
import timeit

lst = list(range(10000))
s = set(range(10000))

print("List lookup:", timeit.timeit(lambda: find_in_list(lst, 5000), number=1000))
print("Set lookup:", timeit.timeit(lambda: find_in_set(s, 5000), number=1000))
```
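When the data must stay in a sorted sequence rather than a set, the standard-library bisect module gives O(log n) membership checks; a small sketch (the function name is illustrative):

```python
import bisect

def find_in_sorted_list(sorted_lst, target):
    # bisect_left returns the insertion point; check whether the
    # element at that position is actually the target
    i = bisect.bisect_left(sorted_lst, target)
    return i < len(sorted_lst) and sorted_lst[i] == target

sorted_lst = list(range(10000))
print(find_in_sorted_list(sorted_lst, 5000))  # True
```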
2. Use Generators Instead of Lists
```python
# Bad practice - Use lists
def get_squares_list(n):
    return [i ** 2 for i in range(n)]

# Good practice - Use generators
def get_squares_generator(n):
    for i in range(n):
        yield i ** 2

# Memory usage comparison
import sys

list_obj = get_squares_list(1000000)
gen_obj = get_squares_generator(1000000)

print(f"List memory: {sys.getsizeof(list_obj)} bytes")
print(f"Generator memory: {sys.getsizeof(gen_obj)} bytes")
```
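As a follow-up to the block above, note that a generator produces values lazily and can be consumed only once:

```python
# Values are produced one at a time while sum() iterates,
# so peak memory stays roughly constant regardless of n
total = sum(get_squares_generator(1000000))
print(total)

# A second pass over the same generator object yields nothing
gen = get_squares_generator(3)
print(list(gen))  # [0, 1, 4]
print(list(gen))  # [] - already exhausted
```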
3. Use __slots__ to Reduce Memory
```python
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

class PersonWithSlots:
    __slots__ = ['name', 'age']

    def __init__(self, name, age):
        self.name = name
        self.age = age

# Memory comparison
import sys

p1 = Person("Alice", 25)
p2 = PersonWithSlots("Alice", 25)

# Note: sys.getsizeof(p1) excludes the per-instance __dict__,
# which is where most of the savings from __slots__ come from
print(f"Regular object: {sys.getsizeof(p1)} bytes")
print(f"Instance __dict__: {sys.getsizeof(p1.__dict__)} bytes")
print(f"With __slots__: {sys.getsizeof(p2)} bytes")
```
I/O Optimization
1. Batch Process I/O
```python
# Bad practice - Write line by line
def write_lines_slow(filename, lines):
    with open(filename, 'w') as f:
        for line in lines:
            f.write(line + '\n')

# Good practice - Batch write
def write_lines_fast(filename, lines):
    with open(filename, 'w') as f:
        f.write('\n'.join(lines))
```
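One caveat: the join-based version drops the trailing newline that the line-by-line version wrote. file.writelines with a generator expression keeps the output identical while letting the buffered file object batch the actual writes; a small sketch:

```python
def write_lines_writelines(filename, lines):
    with open(filename, 'w') as f:
        # writelines does not add newlines itself, so append them here;
        # the generator avoids building one huge intermediate string
        f.writelines(line + '\n' for line in lines)
```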
2. Use Buffering
```python
# Bad practice - No buffering (only allowed in binary mode;
# buffering=0 in text mode raises ValueError)
def read_without_buffer(filename):
    with open(filename, 'rb', buffering=0) as f:
        return f.read()

# Good practice - Use buffering
def read_with_buffer(filename):
    with open(filename, 'r', buffering=8192) as f:
        return f.read()
```
3. Asynchronous I/O
```python
# Install: pip install aiohttp
import asyncio
import aiohttp

async def fetch_url(session, url):
    async with session.get(url) as response:
        return await response.text()

async def fetch_all_urls(urls):
    # Share one session (and its connection pool) across all requests
    async with aiohttp.ClientSession() as session:
        tasks = [fetch_url(session, url) for url in urls]
        return await asyncio.gather(*tasks)

urls = [
    "https://www.example.com",
    "https://www.google.com",
    "https://www.github.com",
]

# Fetch all URLs concurrently
results = asyncio.run(fetch_all_urls(urls))
```
Concurrency Optimization
1. Multiprocessing for CPU-Intensive Tasks
```python
import multiprocessing

def process_data(data_chunk):
    return sum(x ** 2 for x in data_chunk)

def parallel_processing(data, num_processes=4):
    chunk_size = len(data) // num_processes
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    with multiprocessing.Pool(processes=num_processes) as pool:
        results = pool.map(process_data, chunks)
    return sum(results)

if __name__ == '__main__':
    # The __main__ guard is required on platforms that spawn workers
    # (Windows, macOS) so child processes can re-import this module safely
    data = list(range(1000000))
    result = parallel_processing(data)
    print(result)
```
2. Multithreading for I/O-Intensive Tasks
```python
# Install: pip install requests
import threading
import requests

def download_url(url):
    response = requests.get(url)
    return len(response.content)

def parallel_download(urls):
    threads = []
    results = []

    def worker(url):
        result = download_url(url)
        results.append(result)

    for url in urls:
        thread = threading.Thread(target=worker, args=(url,))
        threads.append(thread)
        thread.start()

    for thread in threads:
        thread.join()

    return results

urls = [
    "https://www.example.com",
    "https://www.google.com",
    "https://www.github.com",
]
results = parallel_download(urls)
```
3. Use concurrent.futures
```python
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor

def process_item(item):
    return item ** 2

def with_thread_pool(items):
    with ThreadPoolExecutor(max_workers=4) as executor:
        results = list(executor.map(process_item, items))
    return results

def with_process_pool(items):
    with ProcessPoolExecutor(max_workers=4) as executor:
        results = list(executor.map(process_item, items))
    return results

if __name__ == '__main__':
    # ProcessPoolExecutor also needs the __main__ guard on spawn platforms
    items = list(range(1000))
    thread_results = with_thread_pool(items)
    process_results = with_process_pool(items)
```
Caching Optimization
1. Use functools.lru_cache
```python
from functools import lru_cache

@lru_cache(maxsize=128)
def fibonacci(n):
    if n < 2:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)

# Fast calculation thanks to memoized intermediate results
print(fibonacci(100))
```
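The decorated function also exposes cache statistics, which is useful for confirming the cache is actually being hit (the exact numbers below depend on what has been called so far):

```python
# Inspect hits, misses, and current size of the cache
print(fibonacci.cache_info())
# e.g. CacheInfo(hits=98, misses=101, maxsize=128, currsize=101)

# Clear the cache if the cached values should be recomputed
fibonacci.cache_clear()
```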
2. Custom Caching
```python
class Cache:
    def __init__(self, max_size=128):
        self.cache = {}
        self.max_size = max_size

    def get(self, key):
        return self.cache.get(key)

    def set(self, key, value):
        if len(self.cache) >= self.max_size:
            # Evict the oldest entry (dicts preserve insertion order)
            self.cache.pop(next(iter(self.cache)))
        self.cache[key] = value

cache = Cache()

def expensive_computation(x):
    cached_result = cache.get(x)
    if cached_result is not None:
        return cached_result
    result = sum(i ** 2 for i in range(x))
    cache.set(x, result)
    return result
```
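The class above evicts entries in insertion (FIFO) order. If true least-recently-used eviction is needed without functools.lru_cache, an OrderedDict-based sketch (the class name is illustrative):

```python
from collections import OrderedDict

class LRUCache:
    def __init__(self, max_size=128):
        self.cache = OrderedDict()
        self.max_size = max_size

    def get(self, key):
        if key not in self.cache:
            return None
        # Mark as most recently used
        self.cache.move_to_end(key)
        return self.cache[key]

    def set(self, key, value):
        if key in self.cache:
            self.cache.move_to_end(key)
        self.cache[key] = value
        if len(self.cache) > self.max_size:
            # Drop the least recently used entry
            self.cache.popitem(last=False)
```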
3. Use Redis Cache
```python
# Install: pip install redis (requires a running Redis server)
import redis
import pickle

# Connect to Redis
r = redis.Redis(host='localhost', port=6379, db=0)

def cache_result(key, value, ttl=3600):
    """Cache a result with a time-to-live in seconds."""
    r.setex(key, ttl, pickle.dumps(value))

def get_cached_result(key):
    """Return the cached value, or None on a cache miss."""
    result = r.get(key)
    if result:
        return pickle.loads(result)
    return None

def expensive_operation(data):
    cache_key = f"result:{hash(str(data))}"

    # Try to get from cache
    cached = get_cached_result(cache_key)
    if cached is not None:
        return cached

    # Execute computation (complex_computation is a placeholder for your own logic)
    result = complex_computation(data)

    # Cache result
    cache_result(cache_key, result)
    return result
```
String Optimization
1. Use join Instead of +
```python
# Bad practice - Use +
def build_string_slow(parts):
    result = ""
    for part in parts:
        result += part
    return result

# Good practice - Use join
def build_string_fast(parts):
    return ''.join(parts)

# Performance comparison
import timeit

parts = ["part"] * 1000
print(timeit.timeit(lambda: build_string_slow(parts), number=100))
print(timeit.timeit(lambda: build_string_fast(parts), number=100))
```
2. Use String Formatting
```python
import timeit

# Bad practice - String concatenation
def format_message_slow(name, age):
    return "Name: " + name + ", Age: " + str(age)

# Good practice - Use f-string
def format_message_fast(name, age):
    return f"Name: {name}, Age: {age}"

# Performance comparison
print(timeit.timeit(lambda: format_message_slow("Alice", 25), number=10000))
print(timeit.timeit(lambda: format_message_fast("Alice", 25), number=10000))
```
3. Use String Methods
```python
import timeit

# Bad practice - Manual character-by-character processing
def process_string_slow(s):
    result = ""
    for char in s:
        if char.isupper():
            result += char.lower()
        else:
            result += char
    return result

# Good practice - Use built-in methods
def process_string_fast(s):
    return s.lower()

# Performance comparison
print(timeit.timeit(lambda: process_string_slow("HELLO"), number=10000))
print(timeit.timeit(lambda: process_string_fast("HELLO"), number=10000))
```
Database Optimization
1. Use Connection Pool
```python
from sqlalchemy import create_engine, text
from sqlalchemy.pool import QueuePool

# Create an engine backed by a connection pool
engine = create_engine(
    'postgresql://user:password@localhost/dbname',
    poolclass=QueuePool,
    pool_size=10,
    max_overflow=5
)

def execute_query(query):
    with engine.connect() as connection:
        # text() wraps a raw SQL string for execution
        result = connection.execute(text(query))
        return result.fetchall()
```
2. Batch Insert
```python
# Bad practice - Insert one row at a time (one round trip per row)
def insert_slow(cursor, items):
    for item in items:
        cursor.execute("INSERT INTO table_name VALUES (%s)", (item,))

# Good practice - Batch insert with executemany
# (%s is the psycopg2/MySQL placeholder style; sqlite3 uses ?)
def insert_fast(cursor, items):
    cursor.executemany("INSERT INTO table_name VALUES (%s)",
                       [(item,) for item in items])
```
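For a self-contained illustration, the standard-library sqlite3 module shows the same pattern (the table and column names here are made up for the example):

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute("CREATE TABLE items (value INTEGER)")

items = list(range(1000))

# One executemany call instead of 1000 separate execute() calls
conn.executemany("INSERT INTO items VALUES (?)", [(item,) for item in items])
conn.commit()

print(conn.execute("SELECT COUNT(*) FROM items").fetchone())  # (1000,)
conn.close()
```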
3. Use Indexes
```sql
-- Create an index
CREATE INDEX idx_name ON users(name);

-- Query that can use the index
SELECT * FROM users WHERE name = 'Alice';

-- Bad practice - wrapping the indexed column in a function forces a full
-- table scan (unless a matching functional index exists)
SELECT * FROM users WHERE LOWER(name) = 'alice';

-- Good practice - compare against the indexed column directly
SELECT * FROM users WHERE name = 'Alice';
```
Best Practices
1. Pre-allocate Memory
```python
# Bad practice - Dynamic growth with repeated append calls
def build_list_slow():
    result = []
    for i in range(10000):
        result.append(i)
    return result

# Good practice - Build the list in one pass with a comprehension
def build_list_fast():
    return [i for i in range(10000)]
```
2. Avoid Global Variables
```python
# Bad practice - Use global variables
counter = 0

def increment_global():
    global counter
    counter += 1

# Good practice - Use local variables
def increment_local(counter):
    return counter + 1
```
3. Use Appropriate Data Types
```python
# Bad practice - Use lists for numeric data
numbers = [1, 2, 3, 4, 5]

# Good practice - Use arrays
import array
numbers = array.array('i', [1, 2, 3, 4, 5])

# Bad practice - Use strings for binary data
data = "binary data"

# Good practice - Use bytes
data = b"binary data"
```
4. Lazy Loading
```python
# Bad practice - Load all data immediately
def load_all_data(large_dataset):
    data = []
    for item in large_dataset:
        processed = process_item(item)  # process_item: your per-item transformation
        data.append(processed)
    return data

# Good practice - Lazy loading with a generator
def load_data_lazy(large_dataset):
    for item in large_dataset:
        yield process_item(item)
```
Performance Monitoring
1. Use logging to Record Performance
```python
import logging
import time
from functools import wraps

logging.basicConfig(level=logging.INFO)

def logged_function(func):
    @wraps(func)  # preserve the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        start_time = time.perf_counter()  # monotonic clock suited to timing
        result = func(*args, **kwargs)
        end_time = time.perf_counter()
        logging.info(f"{func.__name__} execution time: {end_time - start_time:.4f} seconds")
        return result
    return wrapper

@logged_function
def expensive_function():
    time.sleep(1)
    return "Done"

expensive_function()
```
2. Use Performance Counters
```python
import time
from collections import defaultdict
from functools import wraps

class PerformanceMonitor:
    def __init__(self):
        self.counters = defaultdict(list)

    def record(self, name, duration):
        self.counters[name].append(duration)

    def get_stats(self, name):
        durations = self.counters[name]
        return {
            'count': len(durations),
            'total': sum(durations),
            'average': sum(durations) / len(durations),
            'min': min(durations),
            'max': max(durations),
        }

monitor = PerformanceMonitor()

def monitored_function(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        start_time = time.perf_counter()
        result = func(*args, **kwargs)
        end_time = time.perf_counter()
        monitor.record(func.__name__, end_time - start_time)
        return result
    return wrapper
```
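A short usage example for the monitor defined above (the function and workload are illustrative):

```python
@monitored_function
def squares(n):
    return [i ** 2 for i in range(n)]

# Call the monitored function a few times to collect samples
for _ in range(5):
    squares(100000)

# Aggregated timings recorded by the decorator
print(monitor.get_stats('squares'))
```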
Summary
Key points of Python performance optimization:
- Performance Analysis Tools: timeit, cProfile, memory_profiler, line_profiler
- Algorithm Optimization: Choose appropriate algorithms, use built-in functions, avoid unnecessary computations
- Data Structure Optimization: Use appropriate data structures, use generators, use __slots__
- I/O Optimization: Batch processing, use buffering, asynchronous I/O
- Concurrency Optimization: Multiprocessing, multithreading, concurrent.futures
- Caching Optimization: lru_cache, custom caching, Redis caching
- String Optimization: Use join, string formatting, string methods
- Database Optimization: Connection pooling, batch insertion, use indexes
- Best Practices: Pre-allocate memory, avoid global variables, use appropriate data types, lazy loading
- Performance Monitoring: logging, performance counters
Performance optimization principles:
- Measure first, then optimize
- Optimize bottlenecks, not all code
- Balance readability and performance
- Use built-in functions and libraries
- Consider using C extensions or Cython
Mastering these performance optimization techniques enables you to write faster, more efficient Python programs.