Grok is one of the most powerful and commonly used filters in Logstash, used to parse unstructured text data into structured data formats.
Grok Basic Concepts
Grok is based on regular expressions and parses text into fields through predefined patterns. Grok syntax format:
    %{PATTERN:field_name}
Where:
- PATTERN: Predefined pattern name
- field_name: Name of the event field that stores the matched text
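As a small illustration, the sketch below applies this syntax to a sample line such as 55.3.244.1 GET /index.html 15824 0.043 (the sample line and the field names are assumptions chosen for the example). It also uses the optional third component, %{PATTERN:field_name:type}, which converts the captured value to int or float instead of leaving it as a string.

```conf
filter {
  grok {
    # IP -> client, WORD -> method, URIPATHPARAM -> request,
    # NUMBER -> bytes (stored as integer) and duration (stored as float)
    match => { "message" => "%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes:int} %{NUMBER:duration:float}" }
  }
}
```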
Common Grok Patterns
Basic Patterns
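- %{NUMBER:num}: Match numbers (integers and decimals)
- %{WORD:word}: Match a single word
- %{DATA:data}: Match any characters, non-greedy (as few as possible)
- %{GREEDYDATA:msg}: Match any characters, greedy (the rest of the line)
- %{IP:ip}: Match IP addresses
- %{DATE:date}: Match dates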
Log Patterns
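- %{COMBINEDAPACHELOG}: Apache combined log format
- %{COMMONAPACHELOG}: Apache common log format
- %{NGINXACCESS}: Nginx access log format (usually a user-defined pattern rather than part of the core pattern set; see Custom Grok Patterns)
- %{SYSLOGBASE}: Syslog line prefix (timestamp, host, program, and PID)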
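Composite patterns like these can be combined with basic patterns in one expression. A minimal sketch, assuming standard syslog lines and a hypothetical field name syslog_message for the free-text part:

```conf
filter {
  grok {
    # SYSLOGBASE consumes the "timestamp host program[pid]:" prefix;
    # the remainder of the line goes into a separate field.
    match => { "message" => "%{SYSLOGBASE} %{GREEDYDATA:syslog_message}" }
  }
}
```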
Practical Application Examples
1. Apache Access Log Parsing
    filter {
      grok {
        match => { "message" => "%{COMBINEDAPACHELOG}" }
      }
    }
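After parsing, the following fields are generated (these are the legacy field names; with ECS compatibility enabled, the same values are stored under ECS-style names such as [source][address]):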
- clientip
- ident
- auth
- timestamp
- verb
- request
- httpversion
- response
- bytes
- referrer
- agent
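Grok captures are strings unless a type is requested, which matters if numeric fields are later used in range queries or aggregations. One possible follow-up, sketched here under the assumption that the legacy field names listed above are in use, is a mutate filter that converts the numeric fields:

```conf
filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
  # Assumes the legacy field names "response" and "bytes";
  # ECS-compatible patterns name these fields differently.
  mutate {
    convert => {
      "response" => "integer"
      "bytes"    => "integer"
    }
  }
}
```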
2. Custom Log Format
Assuming log format:
    2024-02-21 10:30:45 [INFO] User john.doe logged in from 192.168.1.100
Configuration:
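    filter {
      grok {
        match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} \[%{LOGLEVEL:level}\] %{GREEDYDATA:message}" }
        # message already exists on the event; overwrite replaces it
        # instead of appending a second value to the field
        overwrite => ["message"]
      }
    }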
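The same log line can be broken down further, and the captured timestamp can be used as the event time. The following is a sketch rather than a definitive configuration: the field names user and src_ip are assumptions, and the date filter interprets the timestamp in the Logstash host's time zone unless a timezone option is set.

```conf
filter {
  grok {
    # Extract the user name and source address from the message text
    match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} \[%{LOGLEVEL:level}\] User %{USERNAME:user} logged in from %{IP:src_ip}" }
  }
  date {
    # Parse the captured string and use it as the event's @timestamp
    match => ["timestamp", "yyyy-MM-dd HH:mm:ss"]
  }
}
```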
3. Complex Log Parsing
    filter {
      grok {
        match => { "message" => "%{IP:client_ip} - %{USER:user} \[%{HTTPDATE:timestamp}\] \"%{WORD:method} %{URIPATHPARAM:request} HTTP/%{NUMBER:httpversion}\" %{NUMBER:response_code} %{NUMBER:bytes} \"%{DATA:referrer}\" \"%{DATA:agent}\"" }
      }
    }
Custom Grok Patterns
Point grok at a directory of custom pattern files with the patterns_dir option:

    filter {
      grok {
        patterns_dir => ["/path/to/patterns"]
        match => { "message" => "%{CUSTOM_PATTERN:custom_field}" }
      }
    }
Define the pattern in a file inside the patterns_dir directory, one name and regular expression per line:

    CUSTOM_PATTERN [0-9]{3}-[A-Z]{2}
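For a one-off pattern, a separate file may be more than is needed. A minimal sketch of two alternatives, assuming a reasonably recent grok filter plugin: the pattern_definitions option defines the pattern inline, and an Oniguruma named capture, (?<field_name>regex), embeds the regular expression directly in the match expression. Only one of the two blocks would be used in practice.

```conf
# Option A: define the pattern inline instead of in a patterns file
filter {
  grok {
    pattern_definitions => { "CUSTOM_PATTERN" => "[0-9]{3}-[A-Z]{2}" }
    match => { "message" => "%{CUSTOM_PATTERN:custom_field}" }
  }
}

# Option B: use a named capture group without defining a pattern at all
filter {
  grok {
    match => { "message" => "(?<custom_field>[0-9]{3}-[A-Z]{2})" }
  }
}
```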
Multi-pattern Matching
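Grok accepts an array of patterns for a field and tries them in order, stopping at the first one that matches (break_on_match defaults to true). Put the most specific pattern first, since a less specific one may also match the same line:

    filter {
      grok {
        match => {
          "message" => [
            "%{COMBINEDAPACHELOG}",
            "%{COMMONAPACHELOG}",
            # NGINXACCESS is usually a user-defined pattern; it needs a
            # definition (for example via patterns_dir) before it can be used
            "%{NGINXACCESS}"
          ]
        }
      }
    }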
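When the patterns describe independent pieces of a line rather than alternative formats, stopping at the first match is not what is wanted. The sketch below sets break_on_match to false so that every pattern is tried. The key=value fragments and the field names user and src_ip are hypothetical.

```conf
filter {
  grok {
    # Hypothetical fragments that may each appear somewhere in the line
    match => {
      "message" => [
        "user=%{USERNAME:user}",
        "ip=%{IP:src_ip}"
      ]
    }
    # Try every pattern instead of finishing after the first match
    break_on_match => false
  }
}
```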
Grok Debugging Tools
1. Grok Debugger
Use online Grok Debugger tools to test and debug patterns:
- Grok Debugger in Kibana Dev Tools
- Elastic official online debugger
2. Add Tags for Debugging
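    filter {
      grok {
        match => { "message" => "%{PATTERN:field}" }
        # add_tag is applied only when the match succeeds
        add_tag => ["grok_parsed"]
        # tag_on_failure is applied when no pattern matches
        # (this value is also the default)
        tag_on_failure => ["_grokparsefailure"]
      }
    }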
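Once failures are tagged, a conditional can route them somewhere they are easy to inspect. This is one possible arrangement, not a required one; the file path is a hypothetical example.

```conf
output {
  if "_grokparsefailure" in [tags] {
    # Hypothetical path; keeps unparsed events for later inspection
    file { path => "/var/log/logstash/grok_failures.log" }
  } else {
    # Print successfully parsed events in a readable form
    stdout { codec => rubydebug }
  }
}
```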
Performance Optimization
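- Use Precompiled Patterns: Grok compiles each expression once when the pipeline starts, so the per-event cost is matching, not compilation
- Avoid Greedy Matching: Prefer specific patterns over GREEDYDATA and DATA, and anchor expressions with ^ so that non-matching lines fail quickly
- Reduce Pattern Count: List only the patterns you need in match, so a failing event is not tested against patterns that can never apply
- Use Conditional Statements: Apply specific grok patterns to specific data types, so each event is tested only against patterns that can match it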
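The last two points can be sketched together. The example assumes that inputs label events with a type field whose values are apache and syslog; both the field and its values are assumptions about the pipeline, not something grok provides.

```conf
filter {
  # [type] and its values are assumed to be set by the inputs
  if [type] == "apache" {
    grok {
      # Anchoring with ^ lets a non-matching line fail at the first character
      match => { "message" => "^%{COMBINEDAPACHELOG}" }
    }
  } else if [type] == "syslog" {
    grok {
      match => { "message" => "^%{SYSLOGBASE} %{GREEDYDATA:syslog_message}" }
    }
  }
}
```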
Best Practices
- Start Simple: Test simple patterns first, gradually increase complexity
- Use Named Capture Groups: Improve code readability
- Handle Parse Failures: Use _grokparsefailure tags to handle parsing failures
- Document Custom Patterns: Add comments explaining custom patterns
- Version Control: Include custom pattern files in version control