XML injection is a security vulnerability where attackers manipulate an application's XML parsing behavior by injecting malicious content into XML input. This type of attack can lead to information disclosure, denial of service, privilege escalation, and other security issues.
Types of XML Injection
1. XML External Entity (XXE) Attack
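XXE is one of the most common XML injection attacks. The attacker declares an external entity in the document's DTD. When the parser resolves that entity, it substitutes the content of a local file or the response to a URL fetch into the document, which lets the attacker read sensitive files on the server or launch SSRF attacks.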
Attack Example:
```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE data [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<data>
  <username>&xxe;</username>
</data>
```
Impact:
- Reading sensitive files on the server
- Launching SSRF (Server-Side Request Forgery) attacks
- Remote code execution (in some cases)
- Denial of service attacks
2. XML Injection Attack
When user input is concatenated into an XML document without escaping, attackers can inject XML tags and change the structure of the document.
Attack Example:
```xml
<!-- Normal input -->
<user>
  <name>John</name>
</user>

<!-- Malicious input -->
<user>
  <name>John</name>
  <role>admin</role>
</user>
```
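The following is a minimal sketch of how such a document can arise and one way to avoid it. The class `UserXmlBuilder` and its method names are hypothetical and not part of the original example. `buildUnsafe` pastes the input into the markup, while `buildSafe` stores it as a DOM text node so that the serializer escapes it.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper, for illustration only.
final class UserXmlBuilder {

    // Vulnerable on purpose: the input becomes part of the markup.
    static String buildUnsafe(String name) {
        return "<user><name>" + name + "</name></user>";
    }

    // Safer: the input is stored as text, so markup characters are escaped on output.
    static String buildSafe(String name) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .newDocument();
            Element user = doc.createElement("user");
            Element nameEl = doc.createElement("name");
            nameEl.setTextContent(name);
            user.appendChild(nameEl);
            doc.appendChild(user);

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not build XML", e);
        }
    }
}
```

With the input `John</name><role>admin</role><name>x`, `buildUnsafe` yields a document containing a real `<role>` element, whereas `buildSafe` keeps the whole string as escaped text inside `<name>`.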
3. XPath Injection
XPath injection is similar to SQL injection: when an XPath query is built from user input, attackers can manipulate the query to obtain unauthorized data.
Attack Example:
```xml
<!-- Normal query -->
//user[username='john' and password='secret']

<!-- Malicious query -->
//user[username='john' or '1'='1']
```
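To make the mechanism concrete, here is a small, deliberately vulnerable sketch. The class `XPathConcatDemo`, its sample user data, and its method names are assumptions made for illustration. It builds the query by string concatenation and counts the matching `user` nodes. Because `and` binds more tightly than `or` in XPath, an injected `' or '1'='1` changes which branch of the predicate decides the result.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class; the user data below is made up.
final class XPathConcatDemo {
    static final String USERS =
          "<users>"
        + "<user><username>john</username><password>secret</password></user>"
        + "<user><username>jane</username><password>hunter2</password></user>"
        + "</users>";

    // Vulnerable on purpose: user input becomes part of the query text.
    static String buildQuery(String username, String password) {
        return "//user[username='" + username + "' and password='" + password + "']";
    }

    // Returns how many user nodes the concatenated query selects.
    static int countMatches(String username, String password) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            Document doc = dbf.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(USERS)));
            NodeList hits = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(buildQuery(username, password), doc, XPathConstants.NODESET);
            return hits.getLength();
        } catch (Exception e) {
            throw new IllegalStateException("XPath demo failed", e);
        }
    }
}
```

Calling `countMatches("john", "wrong")` selects nothing, but `countMatches("john' or '1'='1", "wrong")` selects john's record without the password, and `countMatches("nobody", "' or '1'='1")` selects every user.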
Protection Measures
1. Disable External Entities
Java Example:
```java
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();

// Disable external entities
dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
dbf.setFeature("http://xml.org/sax/features/external-general-entities", false);
dbf.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
dbf.setXIncludeAware(false);
dbf.setExpandEntityReferences(false);

DocumentBuilder db = dbf.newDocumentBuilder();
```
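As a quick check of what this configuration does, the sketch below wraps it in a hypothetical `HardenedParser` class (the class, method, and constant names are assumptions) and reports whether a document is accepted. It relies on the JDK's built-in parser, which recognizes the Apache feature URIs used above.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical wrapper around the hardened factory settings.
final class HardenedParser {
    static final String XXE_PAYLOAD =
          "<?xml version='1.0' encoding='UTF-8'?>"
        + "<!DOCTYPE data [<!ENTITY xxe SYSTEM 'file:///etc/passwd'>]>"
        + "<data><username>&xxe;</username></data>";

    static DocumentBuilderFactory newFactory() throws ParserConfigurationException {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        dbf.setFeature("http://xml.org/sax/features/external-general-entities", false);
        dbf.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
        dbf.setXIncludeAware(false);
        dbf.setExpandEntityReferences(false);
        return dbf;
    }

    // True if the document parses; false if the parser rejects it.
    static boolean accepts(String xml) {
        try {
            newFactory().newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // malformed input or a forbidden DOCTYPE
        } catch (Exception e) {
            throw new IllegalStateException("parser setup or I/O failed", e);
        }
    }
}
```

With `disallow-doctype-decl` enabled, the parser fails as soon as it meets the `<!DOCTYPE` declaration, so the entity is never resolved. The JDK's default error handler may also print the fatal error to standard error.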
Python Example:
```python
from lxml import etree

# Disable external entities
parser = etree.XMLParser(resolve_entities=False, load_dtd=False)
tree = etree.parse("data.xml", parser)
```
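PHP Example (libxml_disable_entity_loader() is deprecated as of PHP 8.0, where external entity loading is already disabled by default, so call it only on older versions, and never pass LIBXML_NOENT, which turns entity substitution on):
```php
if (PHP_VERSION_ID < 80000) {
    libxml_disable_entity_loader(true); // deprecated since PHP 8.0
}
$xml = simplexml_load_string($xmlString);
```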
2. Input Validation and Filtering
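```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;

// Validate XML input
public boolean isValidXML(String xml) {
    try {
        // Check for malicious content (a cheap pre-filter; the parser feature
        // below is what actually enforces the ban on DOCTYPE declarations)
        if (xml.contains("<!DOCTYPE") || xml.contains("<!ENTITY")) {
            return false;
        }

        // Validate XML format
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        DocumentBuilder db = dbf.newDocumentBuilder();
        db.parse(new InputSource(new StringReader(xml)));
        return true;
    } catch (Exception e) {
        return false;
    }
}
```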
3. Use Whitelist Validation
```java
// Whitelist validation
private static final Set<String> ALLOWED_ELEMENTS = new HashSet<>(Arrays.asList(
    "name", "email", "age"
));

public boolean isValidElement(String elementName) {
    return ALLOWED_ELEMENTS.contains(elementName);
}
```
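A per-name check like this only helps if it is applied to every element in the document. One possible way to do that is sketched below: a hypothetical `ElementAllowlist` class parses the input with DOCTYPE declarations disallowed and walks the tree recursively. The root name `user` is added to the allowed set as an assumption, since the original list names only child elements.

```java
import java.io.StringReader;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper; "user" is an assumed root element name.
final class ElementAllowlist {
    private static final Set<String> ALLOWED =
            new HashSet<>(Arrays.asList("user", "name", "email", "age"));

    // True only if the input is well-formed and every element name is allowed.
    static boolean onlyAllowedElements(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            Document doc = dbf.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return check(doc.getDocumentElement());
        } catch (SAXException e) {
            return false; // malformed XML or a forbidden DOCTYPE
        } catch (Exception e) {
            throw new IllegalStateException("parser setup or I/O failed", e);
        }
    }

    // Depth-first walk over the element tree.
    private static boolean check(Element element) {
        if (!ALLOWED.contains(element.getTagName())) {
            return false;
        }
        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child instanceof Element && !check((Element) child)) {
                return false;
            }
        }
        return true;
    }
}
```

This sketch checks element names only. Attributes and the allowed nesting order would need their own rules, which is where schema validation becomes the better fit.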
4. Use Secure XML Parsers
Recommended Secure Configuration:
```java
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();

// Secure configuration
dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
dbf.setFeature("http://xml.org/sax/features/external-general-entities", false);
dbf.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
dbf.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
dbf.setXIncludeAware(false);
dbf.setExpandEntityReferences(false);

// Use Schema validation
// (schema is a javax.xml.validation.Schema, built as in "6. Use XML Schema Validation")
dbf.setNamespaceAware(true);
dbf.setSchema(schema);
```
5. XPath Injection Protection
```java
// Use parameterized queries
public User getUser(String username, String password) {
    try {
        XPathFactory xPathFactory = XPathFactory.newInstance();
        XPath xpath = xPathFactory.newXPath();

        // Set parameters. SimpleVariableResolver is an application-defined
        // XPathVariableResolver (it is not part of the JDK). Register it
        // before compile(): the resolver in effect at compile time is the
        // one the expression uses.
        SimpleVariableResolver resolver = new SimpleVariableResolver();
        resolver.addVariable(new QName("username"), username);
        resolver.addVariable(new QName("password"), password);
        xpath.setXPathVariableResolver(resolver);

        // Use variables instead of string concatenation
        XPathExpression expr = xpath.compile(
            "//user[username=$username and password=$password]"
        );

        // doc is the already parsed org.w3c.dom.Document
        Node node = (Node) expr.evaluate(doc, XPathConstants.NODE);
        return parseUser(node);
    } catch (Exception e) {
        throw new RuntimeException("XPath query failed", e);
    }
}
```
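The JDK does not ship a ready-made variable resolver, so the example above depends on a class the application has to supply. A minimal sketch of one possible implementation follows. `MapVariableResolver`, `UserLookup`, and the sample user data are assumptions made for illustration. The resolver is registered before the expression is compiled, because the resolver in effect at compile time is the one the expression uses.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.namespace.QName;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import javax.xml.xpath.XPathVariableResolver;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical map-backed resolver for XPath variables such as $username.
final class MapVariableResolver implements XPathVariableResolver {
    private final Map<QName, Object> values = new HashMap<>();

    void addVariable(QName name, Object value) {
        values.put(name, value);
    }

    @Override
    public Object resolveVariable(QName name) {
        return values.get(name);
    }
}

// Hypothetical lookup over made-up sample data.
final class UserLookup {
    static final String USERS =
          "<users>"
        + "<user><username>john</username><password>secret</password></user>"
        + "<user><username>jane</username><password>hunter2</password></user>"
        + "</users>";

    static boolean credentialsMatch(String username, String password) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            Document doc = dbf.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(USERS)));

            MapVariableResolver resolver = new MapVariableResolver();
            resolver.addVariable(new QName("username"), username);
            resolver.addVariable(new QName("password"), password);

            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setXPathVariableResolver(resolver); // before compile()

            // The input is passed as data and never becomes part of the query text.
            Node hit = (Node) xpath
                    .compile("//user[username=$username and password=$password]")
                    .evaluate(doc, XPathConstants.NODE);
            return hit != null;
        } catch (Exception e) {
            throw new IllegalStateException("XPath query failed", e);
        }
    }
}
```

Because the values are bound as variables, an input such as `john' or '1'='1` is compared as a literal string and matches no user.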
6. Use XML Schema Validation
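```java
import java.io.File;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Use Schema to validate input
SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
Schema schema = factory.newSchema(new File("schema.xsd"));

DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setNamespaceAware(true); // required for schema validation
dbf.setSchema(schema);
dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);

DocumentBuilder db = dbf.newDocumentBuilder();
// The default handler only logs validation errors; throw to reject the document
db.setErrorHandler(new DefaultHandler() {
    @Override
    public void error(SAXParseException e) throws SAXException {
        throw e;
    }
});
Document doc = db.parse(new File("input.xml"));
```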
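For a self-contained illustration, the sketch below validates a string against a small inline schema using `javax.xml.validation.Validator` rather than a `DocumentBuilder`. The class `UserSchemaCheck` and the schema itself, a `user` element containing a single `name`, are assumptions chosen to mirror the earlier injection example. External DTD and schema access are switched off through the standard JAXP `ACCESS_EXTERNAL_*` properties.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

// Hypothetical validator with a made-up inline schema.
final class UserSchemaCheck {
    static final String XSD =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='user'><xs:complexType><xs:sequence>"
        + "<xs:element name='name' type='xs:string'/>"
        + "</xs:sequence></xs:complexType></xs:element>"
        + "</xs:schema>";

    private static Validator newValidator() throws SAXException {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        // An empty string means no protocol is allowed for external access.
        factory.setProperty(XMLConstants.ACCESS_EXTERNAL_DTD, "");
        factory.setProperty(XMLConstants.ACCESS_EXTERNAL_SCHEMA, "");
        Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
        Validator validator = schema.newValidator();
        validator.setProperty(XMLConstants.ACCESS_EXTERNAL_DTD, "");
        validator.setProperty(XMLConstants.ACCESS_EXTERNAL_SCHEMA, "");
        return validator;
    }

    // True if the document conforms to the schema; false otherwise.
    static boolean isValid(String xml) {
        Validator validator;
        try {
            validator = newValidator();
        } catch (SAXException e) {
            throw new IllegalStateException("schema setup failed", e);
        }
        try {
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // malformed or not valid against the schema
        } catch (IOException e) {
            throw new IllegalStateException("could not read input", e);
        }
    }
}
```

With this schema, the injected `<role>admin</role>` element from the earlier example makes the document invalid, because the schema permits only `name` inside `user`.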
Best Practices
- Disable DTD and external entities: This is the most effective method to prevent XXE attacks
- Use whitelist validation: Only allow predefined elements and attributes
- Input filtering: Reject input that contains DOCTYPE or ENTITY declarations instead of trying to sanitize it
- Use secure parsers: Configure parsers to disable dangerous features
- Principle of least privilege: Limit permissions for XML processing
- Regular updates: Keep XML parsing libraries up to date
- Secure coding: Build XML documents and XPath queries through APIs that escape or parameterize user input, not through string concatenation
- Security testing: Regularly perform security testing and code reviews
Detection Tools
- OWASP ZAP: Web application security scanner
- Burp Suite: Web application security testing tool
- XMLSec: Library for XML Signature and XML Encryption (a building block for protecting XML data, not a scanner)
- SonarQube: Code quality and security analysis tool
XML injection is a serious security issue that must be considered during the design and implementation phases. By disabling external entities, validating input, and using secure parser configurations, you can effectively prevent XML injection attacks.