CDATA (Character Data) sections in XML are a mechanism for including text that the parser treats as literal character data rather than as markup. They are useful when you need to include special characters (such as <, >, and &) or code snippets in an XML document.
Basic Syntax of CDATA
CDATA sections start with <![CDATA[ and end with ]]>:
```xml
<description>
  <![CDATA[
    You can include any characters here, including < > & and other special characters
    These characters will not be parsed by the XML parser
  ]]>
</description>
```
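To see what a parser actually hands back for a CDATA section, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The element name and sample text are illustrative only.

```python
import xml.etree.ElementTree as ET

# Illustrative input: a CDATA section holding characters that would
# otherwise need escaping.
xml_source = "<description><![CDATA[Use < > & freely here]]></description>"

root = ET.fromstring(xml_source)

# The parser returns the raw characters; the <![CDATA[ and ]]> markers
# are delimiters and are not part of the element's text.
print(root.text)
```

The application sees plain text either way; CDATA only changes how the text is written in the file.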
Use Cases for CDATA
1. Including Code Snippets
```xml
<code>
  <![CDATA[
    function hello() {
      if (x < 10) {
        return "Hello";
      }
    }
  ]]>
</code>
```
2. Including Mathematical Formulas
```xml
<formula>
  <![CDATA[
    E = mc²
    x < y && y > z
  ]]>
</formula>
```
3. Including HTML or XML Fragments
```xml
<content>
  <![CDATA[
    <div class="header">
      <p>Welcome to <strong>XML</strong></p>
    </div>
  ]]>
</content>
```
4. Including Special Character Data
```xml
<data>
  <![CDATA[
    Special characters: < > & " '
    Comparison: 5 < 10, 20 > 15
  ]]>
</data>
```
Limitations and Considerations of CDATA
- Cannot be nested: CDATA sections cannot be nested

  ```xml
  <!-- Error: CDATA cannot be nested -->
  <data>
    <![CDATA[
      Outer CDATA
      <![CDATA[Inner CDATA]]>
    ]]>
  </data>
  ```

- Cannot contain end markers: CDATA sections cannot contain the `]]>` string

  ```xml
  <!-- Error: Contains end marker -->
  <data>
    <![CDATA[
      This contains ]]> which is not allowed
    ]]>
  </data>
  ```

- Case sensitive: CDATA markers must be uppercase

  ```xml
  <!-- Error: CDATA must be uppercase -->
  <data>
    <![cdata[This is wrong]]>
  </data>
  ```

- Whitespace preserved: All whitespace characters within CDATA sections are preserved

  ```xml
  <data>
    <![CDATA[
  Line 1
  Line 2
      Indented line
    ]]>
  </data>
  ```
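A common workaround for the end-marker limitation is to split each `]]>` across two adjacent CDATA sections, so that no single section contains the full sequence. The sketch below assumes a hypothetical helper named `to_cdata` and checks the result by parsing it back with `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET


def to_cdata(text: str) -> str:
    """Wrap text in CDATA, splitting any ']]>' across two adjacent sections."""
    # "]]>" becomes "]]" + end of section + start of new section + ">"
    safe = text.replace("]]>", "]]]]><![CDATA[>")
    return f"<![CDATA[{safe}]]>"


# Illustrative payload that happens to contain the end marker.
payload = "array[index[0]]> 5 is a valid comparison"
xml_source = f"<data>{to_cdata(payload)}</data>"

root = ET.fromstring(xml_source)
# The parser joins the adjacent sections back into one string.
print(root.text == payload)
```

The reader sees the original text again because adjacent CDATA sections are delivered as one continuous run of character data.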
Comparison of CDATA and Entity References
| Feature | CDATA | Entity References |
|---|---|---|
| Syntax | `<![CDATA[content]]>` | `&lt;`, `&gt;`, `&amp;`, etc. |
| Readability | High, displays original content directly | Low, requires conversion |
| Applicability | Large blocks of text | Individual characters |
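| Performance | Marginally less parsing work on markup-heavy text; the difference is usually negligible | Marginally more work, since each reference must be resolved; the difference is usually negligible |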
| Flexibility | Low, cannot be used partially | High, precise control possible |
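The two approaches differ only in how the document is written, not in what the application receives. As a small sketch (element name and expression chosen for illustration), the same text encoded both ways parses to identical strings:

```python
import xml.etree.ElementTree as ET

# The same expression, written once with CDATA and once with entity references.
with_cdata = "<expr><![CDATA[x < y && y > z]]></expr>"
with_entities = "<expr>x &lt; y &amp;&amp; y &gt; z</expr>"

text_a = ET.fromstring(with_cdata).text
text_b = ET.fromstring(with_entities).text

# Both forms yield the same character data after parsing.
print(text_a == text_b)
```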
When to Use CDATA
Situations suitable for CDATA:
- Text containing many special characters
- Code snippets that need to preserve original formatting
- Content containing other markup languages (HTML, JavaScript, etc.)
- Need to avoid frequent character escaping
Situations not suitable for CDATA:
- Only a few special characters
- Need to process content partially
- Content may contain the `]]>` string
- Need compatibility with other XML processing tools
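The guidelines above can be turned into a rough heuristic. The sketch below is only one possible reading: the function name `encode_text` and the threshold of three special characters are assumptions made for illustration, not an established rule.

```python
from xml.sax.saxutils import escape


def encode_text(text: str, threshold: int = 3) -> str:
    """Encode text for element content as CDATA or with entity references.

    The threshold is an arbitrary, illustrative cut-off for "many"
    special characters.
    """
    special = sum(text.count(ch) for ch in "<&")
    if special >= threshold and "]]>" not in text:
        return f"<![CDATA[{text}]]>"
    # Few special characters, or the text contains "]]>": escape instead.
    return escape(text)


print(encode_text("5 < 10"))
print(encode_text("if (a < b && c < d) {}"))
```

The first call has a single special character and is escaped; the second has four and is wrapped in CDATA.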
Practical Application Examples of CDATA
1. Web Service Configuration
```xml
<configuration>
  <script>
    <![CDATA[
      $(document).ready(function() {
        $("#button").click(function() {
          if (count < 10) {
            alert("Click count: " + count);
          }
        });
      });
    ]]>
  </script>
</configuration>
```
2. Database Query Storage
```xml
<queries>
  <query id="getUser">
    <![CDATA[
      SELECT * FROM users
      WHERE age > 18 AND status = 'active'
      ORDER BY name ASC
    ]]>
  </query>
</queries>
```
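An application might read such a stored query roughly as follows. This is a sketch under assumptions: the helper name `load_query` is hypothetical, and collapsing whitespace is a design choice made here because CDATA preserves the indentation and line breaks verbatim.

```python
import xml.etree.ElementTree as ET

xml_source = """<queries>
  <query id="getUser"><![CDATA[
    SELECT * FROM users
    WHERE age > 18 AND status = 'active'
    ORDER BY name ASC
  ]]></query>
</queries>"""


def load_query(source: str, query_id: str) -> str:
    """Return the SQL text of the <query> element with the given id."""
    root = ET.fromstring(source)
    node = root.find(f"./query[@id='{query_id}']")
    if node is None:
        raise KeyError(query_id)
    # Collapse the indentation and line breaks that CDATA preserved.
    return " ".join(node.text.split())


sql = load_query(xml_source, "getUser")
print(sql)
```

Note that the `>` in `age > 18` needs no escaping inside the CDATA section and arrives unchanged.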
3. Template Content
```xml
<template>
  <![CDATA[
    <html>
      <head><title>${title}</title></head>
      <body>
        <h1>Welcome, ${username}!</h1>
        <p>Your balance is: $${balance}</p>
      </body>
    </html>
  ]]>
</template>
```
CDATA Processing in Different Languages
Java DOM Parsing
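```java
// Assumes `document` is an existing org.w3c.dom.Document
// (Element and CDATASection are also from org.w3c.dom).
Element element = document.createElement("description");
CDATASection cdata = document.createCDATASection("Text with <special> characters");
element.appendChild(cdata);
```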
Python ElementTree
```python
import xml.etree.ElementTree as ET

element = ET.Element("description")
element.text = "Text with <special> characters"
# ElementTree has no CDATA node type; it escapes special characters
# automatically when the element is serialized.
```
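To compare the two behaviors side by side, the sketch below serializes the same text with ElementTree, which escapes it, and with the standard library's `xml.dom.minidom`, whose `createCDATASection` mirrors the Java DOM call. Variable names are illustrative.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom

text = "Text with <special> characters"

# ElementTree: special characters are escaped as entity references.
element = ET.Element("description")
element.text = text
escaped = ET.tostring(element, encoding="unicode")

# minidom: the DOM API can emit an explicit CDATA section.
doc = minidom.Document()
node = doc.createElement("description")
node.appendChild(doc.createCDATASection(text))
with_cdata = node.toxml()

print(escaped)
print(with_cdata)
```

Both outputs parse back to the same text; the choice only affects how the serialized document looks.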
CDATA sections are an important tool in XML for handling special characters and raw text content. Proper use can improve the readability and maintainability of XML documents.