YAML security matters most when an application parses untrusted input. Understanding the risks and the matching best practices is essential for protecting applications.
YAML Security Risks
1. Code Injection Risks
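Some YAML parsers can instantiate arbitrary objects or call arbitrary functions while loading a document. In PyYAML, this happens with yaml.unsafe_load() or with yaml.load() and an unsafe Loader.

```python
# ❌ Dangerous: unsafe_load can execute code embedded in the document
import yaml

data = yaml.unsafe_load("""
!!python/object/apply:os.system
args: ['rm -rf /']
""")
```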
2. Type Confusion Attacks
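YAML infers a type for every unquoted scalar, so a value the application expects to be a string can arrive as a number or a boolean. Attackers may exploit these inference rules to bypass checks that assume a particular type.

```yaml
# Unexpected type conversion
password: "123456"  # String
password: 123456    # Number, may cause validation failure
```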
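The inference can be observed directly. The following is a minimal sketch, assuming PyYAML (which resolves plain scalars with YAML 1.1 rules); the keys in the sample document are illustrative only.

```python
import datetime
import yaml

# Illustrative document: only the first value is quoted.
doc = """
quoted: "123456"
bare: 123456
flag: yes
day: 2024-01-01
"""

data = yaml.safe_load(doc)

# Print each value together with the Python type PyYAML inferred for it.
for key, value in data.items():
    print(f"{key}: {value!r} ({type(value).__name__})")
```

Only the quoted value stays a string; the others become an int, a bool, and a date, which is why validation should check types explicitly rather than assume them.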
3. Resource Exhaustion Attacks
Maliciously crafted YAML files may cause parsers to consume excessive resources.
```yaml
# Deep nesting may cause stack overflow
a:
  b:
    c:
      d:
        e:
          f:
            g:
              h:
                i: value
```
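Nesting is not the only way a small document can become expensive. Anchors and aliases can also multiply the amount of data a consumer has to walk (the pattern often called a "billion laughs" attack). The sketch below assumes PyYAML; count_leaves is a hypothetical helper written for this illustration.

```python
import yaml

# Each level references the previous one twice, so the leaf count doubles per level.
doc = """
a: &a [x, x]
b: &b [*a, *a]
c: &c [*b, *b]
d: [*c, *c]
"""

def count_leaves(node):
    """Count the scalar leaves reachable from a loaded YAML value."""
    if isinstance(node, list):
        return sum(count_leaves(item) for item in node)
    if isinstance(node, dict):
        return sum(count_leaves(value) for value in node.values())
    return 1

data = yaml.safe_load(doc)
print(count_leaves(data["d"]))  # 16
```

Four short lines already yield 16 leaves under d; a few dozen levels would produce more leaves than any traversal or re-serialization could handle.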
4. Symbolic Link Attacks
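YAML parsers do not resolve file paths themselves, but the code that opens a YAML file usually follows symbolic links. If an attacker can place a link in a directory the application reads from, the application may load and disclose a file it was never meant to touch.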
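One way to reduce this risk is to refuse to open a path that is itself a symbolic link. The following is a minimal sketch, assuming a POSIX system where os.O_NOFOLLOW is available; read_config_no_symlink is a hypothetical helper name, and the flag only protects the final path component, not parent directories.

```python
import os

def read_config_no_symlink(path):
    """Read a text file, failing if `path` itself is a symbolic link."""
    # O_NOFOLLOW makes the open call fail with OSError when the last
    # path component is a symlink (POSIX only; assumption of this sketch).
    fd = os.open(path, os.O_RDONLY | os.O_NOFOLLOW)
    with os.fdopen(fd, "r", encoding="utf-8") as f:
        return f.read()
```

The returned text can then be passed to a safe loader such as yaml.safe_load().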
Secure YAML Parsing Methods
Python
Using safe_load()
```python
import yaml

# ✅ Secure: Use safe_load
with open('config.yaml', 'r') as f:
    data = yaml.safe_load(f)

# ❌ Dangerous: Avoid using unsafe_load
data = yaml.unsafe_load(open('config.yaml'))
```
Using SafeLoader
```python
import yaml

# Explicitly specify SafeLoader (equivalent to yaml.safe_load)
with open('config.yaml', 'r') as f:
    data = yaml.load(f, Loader=yaml.SafeLoader)
```
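To see what the safe loader does when it meets a Python-specific tag, the sketch below feeds it a harmless payload. It assumes PyYAML; is_rejected is a hypothetical helper, and os.getcwd stands in for a dangerous call so that nothing destructive could run even with an unsafe loader.

```python
import yaml

# Harmless stand-in for a malicious payload.
payload = "!!python/object/apply:os.getcwd []"

def is_rejected(document):
    """Return True if the safe loader refuses to construct the document."""
    try:
        yaml.safe_load(document)
    except yaml.YAMLError:
        # SafeLoader has no constructor registered for python/* tags.
        return True
    return False

print(is_rejected(payload))  # True
```

The loader raises an error instead of calling the function, so the application can treat the document as invalid input.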
JavaScript
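Using js-yaml
```javascript
const fs = require('fs');
const yaml = require('js-yaml');

// ✅ js-yaml 4.x: load() is safe by default (safeLoad() was removed)
const data = yaml.load(fs.readFileSync('config.yaml', 'utf8'));

// ✅ js-yaml 3.x: use safeLoad() instead
// const data = yaml.safeLoad(fs.readFileSync('config.yaml', 'utf8'));

// ❌ Dangerous on js-yaml 3.x: load() uses the full schema,
// which supports JavaScript-specific tags such as !!js/function
```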
Java
Using SnakeYAML
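```java
import java.util.Map;

import org.yaml.snakeyaml.LoaderOptions;
import org.yaml.snakeyaml.Yaml;
import org.yaml.snakeyaml.constructor.SafeConstructor;

// ✅ Secure: SafeConstructor only builds standard types (maps, lists, strings, numbers).
// SnakeYAML 2.x requires the LoaderOptions argument; 1.x also accepts new SafeConstructor().
Yaml safeYaml = new Yaml(new SafeConstructor(new LoaderOptions()));
Map<String, Object> data = safeYaml.load(inputStream);

// ❌ Dangerous on SnakeYAML 1.x: the default constructor can instantiate arbitrary classes
Yaml unsafeYaml = new Yaml();
Map<String, Object> unsafeData = unsafeYaml.load(inputStream);
```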
Go
Using gopkg.in/yaml.v3
```go
import "gopkg.in/yaml.v3"

// Go's yaml library is safe by default: it only decodes into the Go types you provide
var data map[string]interface{}
err := yaml.Unmarshal([]byte(yamlContent), &data)
```
YAML Security Best Practices
1. Always Use Secure Parsers
```python
# Python
import yaml

data = yaml.safe_load(yaml_string)
```

```javascript
// JavaScript (js-yaml 4.x; on 3.x use yaml.safeLoad)
const yaml = require('js-yaml');
const data = yaml.load(yamlString);
```

```java
// Java (SnakeYAML)
Yaml yaml = new Yaml(new SafeConstructor(new LoaderOptions()));
Map<String, Object> data = yaml.load(inputStream);
```
2. Validate and Sanitize Input
```python
import yaml
from cerberus import Validator

# Define validation schema
schema = {
    'name': {'type': 'string', 'required': True},
    'age': {'type': 'integer', 'min': 0, 'max': 120},
    'email': {'type': 'string', 'regex': '^[^@]+@[^@]+$'}
}

# Load and validate
data = yaml.safe_load(yaml_string)
validator = Validator()
if not validator.validate(data, schema):
    raise ValueError("Invalid YAML data")
```
3. Limit File Size
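```python
import yaml

MAX_YAML_SIZE = 10 * 1024 * 1024  # 10 MB

def load_yaml_safely(file_path):
    # Read at most one byte past the limit, so an oversized file
    # is never loaded into memory in full
    with open(file_path, 'rb') as f:
        content = f.read(MAX_YAML_SIZE + 1)
    if len(content) > MAX_YAML_SIZE:
        raise ValueError("YAML file too large")
    return yaml.safe_load(content)
```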
4. Limit Nesting Depth
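```python
import yaml

class DepthLimitingLoader(yaml.SafeLoader):
    max_depth = 10

    def __init__(self, stream):
        super().__init__(stream)
        self.depth = 0

    # Count depth while the node tree is composed. The constructor stage
    # builds nested mappings lazily, so a counter in construct_mapping
    # would not reflect the real nesting depth.
    def compose_node(self, parent, index):
        self.depth += 1
        try:
            # Scalars count as one level, like mappings and sequences
            if self.depth > self.max_depth:
                raise ValueError("YAML nesting too deep")
            return super().compose_node(parent, index)
        finally:
            self.depth -= 1

data = yaml.load(yaml_string, Loader=DepthLimitingLoader)
```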
5. Use YAML Schema Validation
```python
import yaml
from jsonschema import validate

# Define JSON Schema
schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "number"},
        "active": {"type": "boolean"}
    },
    "required": ["name"]
}

# Load and validate (raises jsonschema.ValidationError on failure)
data = yaml.safe_load(yaml_string)
validate(instance=data, schema=schema)
```
Security Considerations for Specific Scenarios
1. Configuration Files
```yaml
# ✅ Secure: Explicit types and values
database:
  host: db.example.com
  port: 5432
  ssl: true
  timeout: 30

# ❌ Dangerous: Using special tags
database: !!python/object:database.Connection
  host: db.example.com
  port: 5432
```
2. User Input
```python
import yaml

# ✅ Secure: Validate user-provided YAML
def process_user_yaml(user_yaml):
    try:
        data = yaml.safe_load(user_yaml)
    except yaml.YAMLError as e:
        raise ValueError("Invalid YAML format") from e

    # Validate data structure
    if not isinstance(data, dict):
        raise ValueError("Invalid YAML structure")

    # Sanitize and validate fields
    # (sanitize_data is application-specific and not shown here)
    return sanitize_data(data)
```
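The text does not define sanitize_data. One possible shape for it, as a hypothetical sketch, is an allowlist of field names and expected types; ALLOWED_FIELDS and its contents are assumptions made for this illustration.

```python
# Hypothetical allowlist: field name -> expected Python type.
ALLOWED_FIELDS = {"name": str, "age": int, "active": bool}

def sanitize_data(data):
    """Keep only allowlisted fields and reject values of the wrong type."""
    clean = {}
    for key, expected_type in ALLOWED_FIELDS.items():
        if key not in data:
            continue
        value = data[key]
        # bool is a subclass of int, so reject True/False where an int is expected.
        if isinstance(value, bool) and expected_type is not bool:
            raise ValueError(f"Field {key!r} has unexpected type bool")
        if not isinstance(value, expected_type):
            raise ValueError(
                f"Field {key!r} has unexpected type {type(value).__name__}"
            )
        clean[key] = value
    return clean
```

Unknown keys are dropped silently here; an application that prefers to fail loudly could raise on them instead.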
3. Serialized Data
```python
# ✅ Secure: Use safe_dump
import yaml

data = {
    'name': 'John',
    'age': 30,
    'active': True
}
yaml_output = yaml.safe_dump(data)
```
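The benefit of safe_dump is that it emits only standard YAML types, so its output never contains Python-specific tags that a later reader would need an unsafe loader for. A minimal sketch, assuming PyYAML; Connection and can_safe_dump are hypothetical names used for illustration.

```python
import yaml

class Connection:
    """Hypothetical application class with no YAML representation."""
    def __init__(self, host):
        self.host = host

def can_safe_dump(value):
    """Return True if safe_dump can serialize the value."""
    try:
        yaml.safe_dump(value)
        return True
    except yaml.YAMLError:
        # safe_dump raises a RepresenterError for arbitrary objects.
        return False

print(can_safe_dump({"name": "John", "age": 30}))  # True
print(can_safe_dump(Connection("db.example.com")))  # False
```

Converting application objects to plain dicts before dumping keeps the output loadable with safe_load().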
Common Security Vulnerabilities and Fixes
1. Deserialization Vulnerabilities
```python
# ❌ Vulnerability: Using unsafe_load
data = yaml.unsafe_load(user_input)

# ✅ Fix: Use safe_load
data = yaml.safe_load(user_input)
```
2. Type Confusion
```yaml
# ❌ Issue: yes interpreted as boolean (YAML 1.1 parsers)
enabled: yes

# ✅ Fix: Use quotes or explicit values
enabled: "yes"
# or
enabled: true
```
3. Path Traversal
```yaml
# ❌ Dangerous: May contain path traversal
config_file: ../../../etc/passwd

# ✅ Secure: A path inside the expected directory
# (the application must still validate it before use)
config_file: /etc/app/config.yaml
```
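The validation itself has to happen in application code, because YAML cannot restrict what a string contains. A minimal sketch using only the standard library; resolve_config_path is a hypothetical helper, and /etc/app is taken from the example above as the assumed allowed directory.

```python
import os

ALLOWED_DIR = "/etc/app"  # assumed base directory, taken from the example above

def resolve_config_path(candidate, base_dir=ALLOWED_DIR):
    """Resolve `candidate` and reject it if it escapes `base_dir`."""
    base = os.path.realpath(base_dir)
    # realpath collapses ".." segments and resolves symlinks; an absolute
    # candidate replaces the base in os.path.join and is caught below.
    target = os.path.realpath(os.path.join(base, candidate))
    if os.path.commonpath([base, target]) != base:
        raise ValueError(f"Path escapes {base_dir}: {candidate}")
    return target
```

A value such as ../../../etc/passwd resolves outside the base directory and is rejected, while config.yaml resolves to a file under it.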
Security Tools and Libraries
1. YAML Linter
```bash
# yamllint reports syntax and style problems (for example duplicate keys);
# it is not a vulnerability scanner
yamllint -d "{extends: default, rules: {line-length: disable, document-start: disable}}" config.yaml
```
2. Bandit (Python Security Check)
```bash
# Check for security issues in code (Bandit flags unsafe yaml.load calls)
bandit -r my_project/
```
3. Snyk (Dependency Security Check)
```bash
# Check for security vulnerabilities in dependencies
snyk test
```
Compliance Considerations
1. OWASP Top 10 (2021)
- A03: Injection: Prevent YAML injection attacks
- A08: Software and Data Integrity Failures: Verify YAML file integrity and avoid insecure deserialization
- A09: Security Logging and Monitoring Failures: Log YAML parsing activities
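Verifying file integrity can be as simple as comparing a digest before parsing. The sketch below uses only the standard library; verify_sha256 is a hypothetical helper, and it assumes the expected digest is distributed through a trusted channel separate from the YAML file.

```python
import hashlib
import hmac

def verify_sha256(content: bytes, expected_hex: str) -> bool:
    """Return True if the SHA-256 digest of `content` matches `expected_hex`."""
    actual = hashlib.sha256(content).hexdigest()
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(actual, expected_hex.lower())

# Illustrative usage with in-memory content.
trusted_content = b"name: app\n"
trusted_digest = hashlib.sha256(trusted_content).hexdigest()

print(verify_sha256(trusted_content, trusted_digest))      # True
print(verify_sha256(b"name: tampered\n", trusted_digest))  # False
```

Only content that passes the check should be handed to the YAML loader.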
2. Secure Coding Standards
Follow secure coding standards, such as:
- OWASP Secure Coding Practices
- SEI CERT Coding Standards
- CWE (Common Weakness Enumeration)
Monitoring and Logging
```python
import yaml
import logging

logger = logging.getLogger(__name__)

def load_yaml_with_logging(file_path):
    try:
        logger.info(f"Loading YAML file: {file_path}")
        with open(file_path, 'r') as f:
            data = yaml.safe_load(f)
        logger.info(f"Successfully loaded YAML file: {file_path}")
        return data
    except yaml.YAMLError as e:
        logger.error(f"YAML parsing error in {file_path}: {e}")
        raise
    except Exception as e:
        logger.error(f"Unexpected error loading {file_path}: {e}")
        raise
```
Summary
YAML security needs attention at several levels:
- Use secure parsing methods
- Validate and sanitize input
- Limit resource usage
- Use schema validation
- Monitor and log activities
- Run regular security audits
By following these best practices, you can significantly reduce YAML-related security risks.