How secure is YAML? What are common YAML security risks and prevention measures?

February 21, 14:20

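YAML itself is only a data format, but parsing it is not automatically safe: the risk depends on the parser and how it is configured, and it matters most when the input is untrusted. Understanding the main risks and the practices that mitigate them is crucial for protecting applications.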

YAML Security Risks

1. Code Injection Risks

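Some YAML parsers can construct arbitrary objects and call functions named in the document. In PyYAML, unsafe_load() (and load() with an unsafe Loader) does this, so loading attacker-controlled input can execute arbitrary code.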

python
# ❌ Dangerous: Using unsafe_load may lead to code execution
# Do not run this: the payload calls os.system with a destructive command
import yaml
data = yaml.unsafe_load("""
!!python/object/apply:os.system
args: ['rm -rf /']
""")
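To see the difference without running anything destructive, the sketch below feeds a harmless payload to `yaml.safe_load`. The payload (a call to `os.getcwd` instead of a shell command) and the helper name `is_rejected_by_safe_load` are illustrative assumptions, not part of the original example. `safe_load` should refuse the Python-specific tag rather than call the function.

```python
import yaml

# Harmless stand-in for a malicious payload: it asks the loader to call os.getcwd().
PAYLOAD = "!!python/object/apply:os.getcwd []"


def is_rejected_by_safe_load(document: str) -> bool:
    """Return True if safe_load refuses the document with a constructor error."""
    try:
        yaml.safe_load(document)
    except yaml.constructor.ConstructorError:
        return True
    return False


print(is_rejected_by_safe_load(PAYLOAD))       # document with a python-specific tag
print(is_rejected_by_safe_load("name: demo"))  # plain data
```

Catching the constructor error, rather than every exception, keeps real syntax errors visible to the caller.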

2. Type Confusion Attacks

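YAML infers a scalar's type from how it is written, so the same value can arrive as a string, a number, or a boolean. Attackers may exploit these inference rules to bypass checks that assume a particular type.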

yaml
# Unexpected type conversion
password: "123456"  # String
password: 123456    # Number, may cause validation failure

3. Resource Exhaustion Attacks

Maliciously crafted YAML files may cause parsers to consume excessive resources.

yaml
# Deep nesting may cause stack overflow
a:
  b:
    c:
      d:
        e:
          f:
            g:
              h:
                i: value
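Nesting is one vector for resource exhaustion. Anchors and aliases are another, since a small document can reference the same large node many times. The following is a minimal sketch, assuming PyYAML, that counts alias tokens with `yaml.scan` before any objects are constructed. The function names and the limit of 50 are hypothetical choices for illustration.

```python
import yaml


def count_aliases(document: str) -> int:
    """Count alias tokens (*name) by scanning, without constructing any objects."""
    return sum(
        isinstance(token, yaml.AliasToken)
        for token in yaml.scan(document, Loader=yaml.SafeLoader)
    )


def load_with_alias_limit(document: str, max_aliases: int = 50):
    """Parse with safe_load only if the alias count stays within the limit."""
    if count_aliases(document) > max_aliases:
        raise ValueError("too many YAML aliases")
    return yaml.safe_load(document)


SAMPLE = """
base: &b [1, 2, 3]
copies: [*b, *b, *b]
"""

print(count_aliases(SAMPLE))
print(load_with_alias_limit(SAMPLE))
```

The document is scanned twice here (once to count, once to load), which is acceptable for small configuration files but worth reconsidering for large inputs.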

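4. Symbolic Link Risks

YAML parsers do not resolve file paths themselves, but the code that opens a YAML file by path follows symbolic links by default. If an attacker can place a link where a YAML file is expected, this can lead to information disclosure.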
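One way to reduce the symbolic-link exposure described above is to refuse links before reading the file. The sketch below uses only the standard library. The helper name `read_regular_file` and the file names are hypothetical, it checks only the final path component, and a check followed by an open is still subject to a race, so treat it as a starting point rather than a complete defense.

```python
import os
import tempfile
from pathlib import Path


def read_regular_file(path: str) -> str:
    """Read a text file, refusing symbolic links and non-regular files."""
    p = Path(path)
    if p.is_symlink():
        raise ValueError(f"refusing to follow symbolic link: {path}")
    if not p.is_file():
        raise ValueError(f"not a regular file: {path}")
    return p.read_text(encoding="utf-8")


# Demonstration in a throwaway directory (file names are arbitrary).
with tempfile.TemporaryDirectory() as tmp:
    real = Path(tmp) / "config.yaml"
    real.write_text("name: demo\n", encoding="utf-8")
    link = Path(tmp) / "link.yaml"
    os.symlink(real, link)

    real_text = read_regular_file(str(real))
    try:
        read_regular_file(str(link))
        link_rejected = False
    except ValueError:
        link_rejected = True

print(real_text.strip(), link_rejected)
```

The returned text can then be passed to `yaml.safe_load` as usual.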

Secure YAML Parsing Methods

Python

Using safe_load()

python
import yaml
# ✅ Secure: Use safe_load
with open('config.yaml', 'r', encoding='utf-8') as f:
    data = yaml.safe_load(f)
# ❌ Dangerous: Avoid using unsafe_load on anything you do not fully trust
with open('config.yaml', 'r', encoding='utf-8') as f:
    data = yaml.unsafe_load(f)

Using SafeLoader

python
import yaml
# Explicitly specify SafeLoader (equivalent to safe_load)
with open('config.yaml', 'r', encoding='utf-8') as f:
    data = yaml.load(f, Loader=yaml.SafeLoader)

JavaScript

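Using safeLoad() (js-yaml 3.x) or load() (js-yaml 4.x)

javascript
const fs = require('fs');
const yaml = require('js-yaml');
const text = fs.readFileSync('config.yaml', 'utf8');
// ✅ js-yaml 4.x: load() is safe by default, and safeLoad() was removed
const data = yaml.load(text);
// ✅ js-yaml 3.x: use safeLoad() instead
// const data = yaml.safeLoad(text);
// ❌ js-yaml 3.x: load() accepts JavaScript-specific tags such as !!js/function; avoid it on untrusted input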

Java

Using SnakeYAML

java
import java.util.Map;
import org.yaml.snakeyaml.LoaderOptions;
import org.yaml.snakeyaml.Yaml;
import org.yaml.snakeyaml.constructor.SafeConstructor;
// ✅ Secure: Use SafeConstructor, which only builds standard types (maps, lists, strings, numbers)
Yaml safeYaml = new Yaml(new SafeConstructor(new LoaderOptions()));
Map<String, Object> data = safeYaml.load(inputStream);
// ❌ Dangerous in SnakeYAML 1.x: the default constructor can instantiate arbitrary classes named by tags
Yaml unsafeYaml = new Yaml();
Object untrusted = unsafeYaml.load(inputStream);

Go

Using gopkg.in/yaml.v3

go
import "gopkg.in/yaml.v3"
// Go's yaml library is safe by default
var data map[string]interface{}
err := yaml.Unmarshal([]byte(yamlContent), &data)

YAML Security Best Practices

1. Always Use Secure Parsers

python
# Python
import yaml
data = yaml.safe_load(yaml_string)
# JavaScript (js-yaml 3.x; in 4.x use yaml.load, which is safe by default)
const yaml = require('js-yaml');
const data = yaml.safeLoad(yamlString);
# Java
Yaml yaml = new Yaml(new SafeConstructor(new LoaderOptions()));
Map<String, Object> data = yaml.load(inputStream);

2. Validate and Sanitize Input

python
import yaml
from cerberus import Validator
# Define validation schema
schema = {
    'name': {'type': 'string', 'required': True},
    'age': {'type': 'integer', 'min': 0, 'max': 120},
    'email': {'type': 'string', 'regex': '^[^@]+@[^@]+$'},
}
# Load and validate
data = yaml.safe_load(yaml_string)
validator = Validator()
if not validator.validate(data, schema):
    raise ValueError("Invalid YAML data")

3. Limit File Size

python
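import yaml
MAX_YAML_SIZE = 10 * 1024 * 1024  # 10 MB
def load_yaml_safely(file_path):
    with open(file_path, 'rb') as f:
        # Read one byte past the limit so an oversized file is detected
        # without loading all of it into memory
        content = f.read(MAX_YAML_SIZE + 1)
    if len(content) > MAX_YAML_SIZE:
        raise ValueError("YAML file too large")
    return yaml.safe_load(content)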

4. Limit Nesting Depth

python
import yaml
class DepthLimitingLoader(yaml.SafeLoader):
    max_depth = 10
    def __init__(self, stream):
        super().__init__(stream)
        self.depth = 0
    def compose_node(self, parent, index):
        # compose_node recurses once per nesting level (mappings and sequences),
        # so the limit is enforced while the node tree is being built
        if self.depth >= self.max_depth:
            raise ValueError("YAML nesting too deep")
        self.depth += 1
        try:
            return super().compose_node(parent, index)
        finally:
            self.depth -= 1
data = yaml.load(yaml_string, Loader=DepthLimitingLoader)

5. Use YAML Schema Validation

python
import yaml
from jsonschema import validate
# Define JSON Schema
schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "number"},
        "active": {"type": "boolean"},
    },
    "required": ["name"],
}
# Load and validate
data = yaml.safe_load(yaml_string)
validate(instance=data, schema=schema)

Security Considerations for Specific Scenarios

1. Configuration Files

yaml
# ✅ Secure: Explicit types and values
database:
  host: db.example.com
  port: 5432
  ssl: true
  timeout: 30
# ❌ Dangerous: Using special tags
database: !!python/object:database.Connection
  host: db.example.com
  port: 5432

2. User Input

python
import yaml
# ✅ Secure: Validate user-provided YAML
def process_user_yaml(user_yaml):
    try:
        data = yaml.safe_load(user_yaml)
    except yaml.YAMLError as e:
        raise ValueError("Invalid YAML format") from e
    # Validate data structure
    if not isinstance(data, dict):
        raise ValueError("Invalid YAML structure")
    # Sanitize and validate fields (sanitize_data is an application-defined helper)
    return sanitize_data(data)
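The `sanitize_data` helper called above is not defined in the snippet. One possible shape for it, assuming the application only accepts a small allowlist of fields, is sketched below. The field names and types in `ALLOWED_FIELDS` are hypothetical.

```python
# Hypothetical allowlist: field name -> expected Python type.
ALLOWED_FIELDS = {"name": str, "age": int, "active": bool}


def sanitize_data(data: dict) -> dict:
    """Keep only allowlisted fields and reject values of the wrong type."""
    clean = {}
    for key, expected_type in ALLOWED_FIELDS.items():
        if key not in data:
            continue
        value = data[key]
        # bool is a subclass of int, so rule it out explicitly for integer fields.
        if expected_type is int and isinstance(value, bool):
            raise ValueError(f"field {key!r} must be int, got bool")
        if not isinstance(value, expected_type):
            raise ValueError(
                f"field {key!r} must be {expected_type.__name__}, "
                f"got {type(value).__name__}"
            )
        clean[key] = value
    return clean


print(sanitize_data({"name": "John", "age": 30, "admin": True}))
```

Dropping unknown keys silently is a design choice; an application that wants to surface unexpected fields could raise on them instead.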

3. Serialized Data

python
# ✅ Secure: Use safe_dump
import yaml
data = {
    'name': 'John',
    'age': 30,
    'active': True,
}
yaml_output = yaml.safe_dump(data)
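Part of what makes `safe_dump` the safer choice is that it refuses objects it cannot express with standard YAML tags, instead of emitting Python-specific tags that a later unsafe load would act on. A minimal sketch of that behavior follows; the `Connection` class and the `can_safe_dump` helper are hypothetical.

```python
import yaml


class Connection:
    """Hypothetical application object that should not be serialized as-is."""

    def __init__(self, host):
        self.host = host


def can_safe_dump(value) -> bool:
    """Return True if safe_dump can represent the value with standard tags."""
    try:
        yaml.safe_dump(value)
    except yaml.representer.RepresenterError:
        return False
    return True


print(can_safe_dump({"host": "db.example.com", "port": 5432}))  # plain data
print(can_safe_dump(Connection("db.example.com")))              # custom object
```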

Common Security Vulnerabilities and Fixes

1. Deserialization Vulnerabilities

python
# ❌ Vulnerability: Using unsafe_load
data = yaml.unsafe_load(user_input)
# ✅ Fix: Use safe_load
data = yaml.safe_load(user_input)

2. Type Confusion

yaml
# ❌ Issue: yes interpreted as boolean (YAML 1.1 parsers such as PyYAML)
enabled: yes
# ✅ Fix: Use quotes or explicit values
enabled: "yes"
# or
enabled: true
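The effect is easy to check directly. The sketch below, assuming PyYAML (which follows YAML 1.1 resolution rules), loads a few quoted and unquoted values and prints the Python type each one becomes. The keys are made up for illustration.

```python
import yaml

DOC = """
enabled: yes
enabled_quoted: "yes"
country: NO
version: 1.10
password: 123456
password_quoted: "123456"
"""

data = yaml.safe_load(DOC)
for key, value in data.items():
    print(f"{key}: {value!r} ({type(value).__name__})")
```

With PyYAML, the unquoted `yes` and `NO` come back as booleans, `1.10` becomes the float `1.1`, and `123456` becomes an integer, while the quoted forms stay strings. Quoting any value that must remain a string avoids the surprise.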

3. Path Traversal

yaml
# ❌ Dangerous: May contain path traversal
config_file: ../../../etc/passwd
# ✅ Secure: Validate path
config_file: /etc/app/config.yaml
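Validating the path has to happen in the application after the YAML is loaded, since the parser treats the value as an ordinary string. A minimal sketch, assuming all configuration files must live under one base directory (the function name `resolve_inside` is hypothetical, and `/etc/app` is taken from the example above):

```python
from pathlib import Path


def resolve_inside(base_dir: str, candidate: str) -> Path:
    """Resolve candidate against base_dir and reject anything that escapes it."""
    base = Path(base_dir).resolve()
    target = (base / candidate).resolve()
    try:
        target.relative_to(base)
    except ValueError:
        raise ValueError(f"path escapes {base}: {candidate}") from None
    return target


print(resolve_inside("/etc/app", "config.yaml"))
try:
    resolve_inside("/etc/app", "../../../etc/passwd")
except ValueError as exc:
    print(exc)
```

Resolving before comparing means `..` segments and symbolic links are both taken into account, and an absolute candidate path is rejected as well because it replaces the base when joined.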

Security Tools and Libraries

1. YAML Linter

bash
# yamllint checks syntax and style (for example duplicate keys and ambiguous truthy values), not security as such
yamllint -d "{extends: default, rules: {line-length: disable, document-start: disable}}" config.yaml

2. Bandit (Python Security Check)

bash
# Check for security issues in code
bandit -r my_project/

3. Snyk (Dependency Security Check)

bash
# Check for security vulnerabilities in dependencies
snyk test

Compliance Considerations

1. OWASP Top 10 (2021)

  • A03:2021 Injection: Prevent YAML injection attacks
  • A08:2021 Software and Data Integrity Failures: Verify YAML file integrity and avoid unsafe deserialization
  • A09:2021 Security Logging and Monitoring Failures: Log YAML parsing activities
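For the integrity item in the list above, one simple approach is to compare a file's SHA-256 digest with a trusted value before parsing it. The sketch below uses only the standard library. The function names are hypothetical, and how the trusted digest is distributed (for example, out of band or in a signed manifest) is left as an assumption.

```python
import hashlib
import hmac


def sha256_hex(content: bytes) -> str:
    """Return the hex-encoded SHA-256 digest of the content."""
    return hashlib.sha256(content).hexdigest()


def verify_digest(content: bytes, expected_hex: str) -> bool:
    """Compare the content's digest with the expected one in constant time."""
    return hmac.compare_digest(sha256_hex(content), expected_hex.lower())


config_bytes = b"database:\n  host: db.example.com\n  port: 5432\n"
trusted_digest = sha256_hex(config_bytes)  # in practice, obtained from a trusted source

print(verify_digest(config_bytes, trusted_digest))                       # unchanged file
print(verify_digest(config_bytes + b"  debug: true\n", trusted_digest))  # modified file
```

A plain digest only detects changes if the digest itself cannot be tampered with; where that is a concern, a keyed HMAC or a signature is the usual next step.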

2. Secure Coding Standards

Follow secure coding standards, such as:

  • OWASP Secure Coding Practices
  • SEI CERT Coding Standards
  • CWE (Common Weakness Enumeration)

Monitoring and Logging

python
import yaml
import logging
logger = logging.getLogger(__name__)
def load_yaml_with_logging(file_path):
    try:
        logger.info(f"Loading YAML file: {file_path}")
        with open(file_path, 'r') as f:
            data = yaml.safe_load(f)
        logger.info(f"Successfully loaded YAML file: {file_path}")
        return data
    except yaml.YAMLError as e:
        logger.error(f"YAML parsing error in {file_path}: {e}")
        raise
    except Exception as e:
        logger.error(f"Unexpected error loading {file_path}: {e}")
        raise

Summary

YAML security has to be addressed at several levels:

  1. Use secure parsing methods
  2. Validate and sanitize input
  3. Limit resource usage
  4. Use schema validation
  5. Monitor and log activities
  6. Run regular security audits

By following these best practices, you can significantly reduce YAML-related security risks.

Tags: YAML