Logstash provides various filter plugins for parsing, transforming, and enriching data. Here are commonly used filters and their usage.
1. Grok Filter
Grok parses unstructured text into structured fields by matching it against named patterns built on regular expressions.
Basic Usage
    filter {
      grok {
        match => { "message" => "%{COMBINEDAPACHELOG}" }
      }
    }
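As a sketch of how named captures work, the example below matches a hypothetical log line such as 55.3.244.1 GET /index.html 15824 0.043. Each %{SYNTAX:SEMANTIC} pair names a pattern and the field that receives the match. The field names client, method, request, bytes, and duration are illustrative assumptions, and the optional :int and :float suffixes ask grok to convert the captured value.

```conf
filter {
  grok {
    # Field names are illustrative assumptions, not taken from the text above.
    match => {
      "message" => "%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes:int} %{NUMBER:duration:float}"
    }
  }
}
```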
Multi-pattern Matching
    filter {
      grok {
        # Tried in order, first match wins. NGINXACCESS must exist in your pattern set.
        match => {
          "message" => [
            "%{COMBINEDAPACHELOG}",
            "%{COMMONAPACHELOG}",
            "%{NGINXACCESS}"
          ]
        }
      }
    }
Custom Patterns
    filter {
      grok {
        patterns_dir => ["/path/to/patterns"]
        match => { "message" => "%{CUSTOM_PATTERN:custom_field}" }
      }
    }
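For a one-off pattern, a separate patterns directory may be more than is needed. A minimal sketch using grok's pattern_definitions option follows. The regular expression assigned to CUSTOM_PATTERN is a made-up example (three uppercase letters, a hyphen, then digits), since the text above does not say what the pattern matches.

```conf
filter {
  grok {
    # Hypothetical definition: matches IDs such as ABC-1234.
    pattern_definitions => { "CUSTOM_PATTERN" => "[A-Z]{3}-[0-9]+" }
    match => { "message" => "%{CUSTOM_PATTERN:custom_field}" }
  }
}
```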
2. Mutate Filter
The Mutate filter performs general transformations on fields: renaming, converting, removing, replacing, adding, and merging.
Rename Fields
    filter {
      mutate {
        rename => { "old_name" => "new_name" }
      }
    }
Convert Field Types
    filter {
      mutate {
        convert => {
          "status" => "integer"
          "price" => "float"
          "enabled" => "boolean"
        }
      }
    }
Remove Fields
    filter {
      mutate {
        remove_field => ["temp_field", "debug_info"]
      }
    }
Replace Field Values
    filter {
      mutate {
        replace => { "message" => "new message" }
      }
    }
Add Fields
    filter {
      mutate {
        add_field => {
          "environment" => "production"
          "processed_at" => "%{@timestamp}"
        }
      }
    }
Merge Fields
    filter {
      mutate {
        # Merges the value of field2 into field1.
        merge => { "field1" => "field2" }
      }
    }
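One detail worth illustrating: within a single mutate block, operations run in a fixed internal order rather than the order they are written, and add_field is applied only after the block's other operations have succeeded. When one operation depends on another, separate mutate blocks make the sequence explicit. The sketch below assumes a hypothetical field retry_count that is added and then converted.

```conf
filter {
  # First block: create the field (retry_count is a hypothetical name).
  mutate {
    add_field => { "retry_count" => "0" }
  }
  # Second block: the field now exists, so it can be converted.
  mutate {
    convert => { "retry_count" => "integer" }
  }
}
```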
3. Date Filter
The Date filter parses a date from a field and, by default, stores the result in the event's @timestamp field.
Basic Usage
    filter {
      date {
        match => ["timestamp", "dd/MMM/yyyy:HH:mm:ss Z"]
      }
    }
Multiple Date Formats
    filter {
      date {
        # First element is the source field; the rest are formats tried in order.
        match => [
          "timestamp",
          "dd/MMM/yyyy:HH:mm:ss Z",
          "yyyy-MM-dd HH:mm:ss",
          "ISO8601"
        ]
      }
    }
Custom Target Field
    filter {
      date {
        match => ["log_time", "yyyy-MM-dd HH:mm:ss"]
        target => "parsed_time"
      }
    }
Timezone Setting
    filter {
      date {
        match => ["timestamp", "yyyy-MM-dd HH:mm:ss"]
        timezone => "Asia/Shanghai"
      }
    }
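Once the date has been parsed, the original string field is often redundant. The sketch below assumes the source field is named timestamp, as in the examples above, and uses the common remove_field option, which is applied only when parsing succeeds. On failure the filter tags the event with _dateparsefailure and the field is kept for inspection.

```conf
filter {
  date {
    match => ["timestamp", "yyyy-MM-dd HH:mm:ss"]
    # Removed only if the date was parsed successfully.
    remove_field => ["timestamp"]
  }
}
```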
4. GeoIP Filter
The GeoIP filter adds geographic location information based on IP address.
Basic Usage
    filter {
      geoip {
        source => "client_ip"
      }
    }
Specify Target Field
    filter {
      geoip {
        source => "client_ip"
        target => "geoip"
      }
    }
Specify Database Path
    filter {
      geoip {
        source => "client_ip"
        database => "/path/to/GeoLite2-City.mmdb"
      }
    }
Specify Fields
    filter {
      geoip {
        source => "client_ip"
        fields => ["city_name", "country_name", "location"]
      }
    }
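Events without an IP address gain nothing from a lookup. As a sketch, the conditional below runs geoip only when client_ip is present. The field name follows the examples above.

```conf
filter {
  # Skip the lookup when the event has no client_ip field.
  if [client_ip] {
    geoip {
      source => "client_ip"
      target => "geoip"
    }
  }
}
```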
5. Useragent Filter
The Useragent filter parses User-Agent strings into fields such as browser name, version, and operating system.
Basic Usage
    filter {
      useragent {
        source => "agent"
      }
    }
Specify Target Field
    filter {
      useragent {
        source => "agent"
        target => "ua"
      }
    }
6. CSV Filter
The CSV filter parses CSV format data.
Basic Usage
    filter {
      csv {
        separator => ","
        columns => ["name", "age", "city"]
      }
    }
Auto-detect Column Names
    filter {
      csv {
        separator => ","
        # Take column names from the header row.
        autodetect_column_names => true
      }
    }
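Values parsed from CSV are strings by default. A hedged sketch of type conversion with the csv filter's convert option, reusing the column names from the basic example:

```conf
filter {
  csv {
    separator => ","
    columns => ["name", "age", "city"]
    # Store age as an integer instead of a string.
    convert => { "age" => "integer" }
  }
}
```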
7. JSON Filter
The JSON filter parses JSON strings.
Basic Usage
    filter {
      json {
        source => "message"
      }
    }
Specify Target Field
    filter {
      json {
        source => "message"
        target => "parsed_json"
      }
    }
Remove Original Field
    filter {
      json {
        source => "message"
        # Applied only when parsing succeeds.
        remove_field => ["message"]
      }
    }
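When a stream mixes JSON and plain-text lines, the sketch below shows one way to avoid failure tags and warnings for the non-JSON lines. It assumes the json filter's skip_on_invalid_json option is available in your plugin version.

```conf
filter {
  json {
    source => "message"
    target => "parsed_json"
    # Leave non-JSON events untouched instead of tagging them as failures.
    skip_on_invalid_json => true
  }
}
```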
8. Ruby Filter
The Ruby filter allows using Ruby code for complex data processing.
Basic Usage
    filter {
      ruby {
        code => 'event.set("computed_field", event.get("field1") + event.get("field2"))'
      }
    }
Complex Logic
    filter {
      ruby {
        code => '
          if event.get("status").to_i >= 400
            event.tag("error")
          else
            event.tag("success")
          end
        '
      }
    }
Array Operations
    filter {
      ruby {
        code => '
          items = event.get("items")
          if items.is_a?(Array)
            event.set("item_count", items.length)
            event.set("total_price", items.sum { |i| i["price"] })
          end
        '
      }
    }
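The basic example raises an exception when either field is missing, in which case the ruby filter tags the event with _rubyexception. A minimal sketch of a guarded version follows. The tag name _compute_skipped is a hypothetical choice.

```conf
filter {
  ruby {
    code => '
      a = event.get("field1")
      b = event.get("field2")
      # Add only when both values are numbers; otherwise mark the event.
      if a.is_a?(Numeric) && b.is_a?(Numeric)
        event.set("computed_field", a + b)
      else
        event.tag("_compute_skipped")  # hypothetical tag name
      end
    '
  }
}
```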
9. Drop Filter
The Drop filter discards events.
Conditional Drop
    filter {
      if [log_level] == "DEBUG" {
        drop { }
      }
    }
Percentage Drop
    filter {
      ruby {
        # Cancels roughly 10% of events.
        code => 'event.cancel if rand < 0.1'
      }
    }
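The drop filter itself has a percentage option, which may be simpler than Ruby for sampling. A sketch that drops roughly 10% of events:

```conf
filter {
  drop {
    # Drop about 10 out of every 100 events; the rest pass through.
    percentage => 10
  }
}
```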
10. Aggregate Filter
The Aggregate filter combines information from multiple events that share a task ID into a single event.
Basic Usage
    filter {
      aggregate {
        task_id => "%{user_id}"
        code => '
          map["count"] ||= 0
          map["count"] += 1
        '
        # Emit the aggregated map as a new event when the task times out.
        push_map_as_event_on_timeout => true
        timeout => 60
      }
    }
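The aggregate filter needs all events for a task to pass through the same worker, so it is normally run with a single pipeline worker (-w 1). As a sketch of making the emitted event easier to identify, the variant below copies the task ID into the new event and tags it. The tag name _aggregated is a hypothetical choice.

```conf
filter {
  aggregate {
    task_id => "%{user_id}"
    code => '
      map["count"] ||= 0
      map["count"] += 1
    '
    push_map_as_event_on_timeout => true
    # Store the task ID on the generated event as user_id.
    timeout_task_id_field => "user_id"
    # Hypothetical tag to tell aggregated events from the originals.
    timeout_tags => ["_aggregated"]
    timeout => 60
  }
}
```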
Filter Combination
Multiple filters can be combined:
    filter {
      # Parse log format
      grok {
        match => { "message" => "%{COMBINEDAPACHELOG}" }
      }
      # Convert field types
      mutate {
        convert => { "response" => "integer" }
      }
      # Parse timestamp
      date {
        match => ["timestamp", "dd/MMM/yyyy:HH:mm:ss Z"]
      }
      # Add geographic location information
      geoip {
        source => "clientip"
      }
      # Parse User-Agent
      useragent {
        source => "agent"
      }
    }
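If grok fails, the later filters have no fields to work on. As a sketch, the same pipeline can guard the downstream filters with the _grokparsefailure tag that grok adds on failure. The field names are those used in the combined example above.

```conf
filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
  # Only run the remaining filters on events that grok parsed successfully.
  if "_grokparsefailure" not in [tags] {
    mutate { convert => { "response" => "integer" } }
    date { match => ["timestamp", "dd/MMM/yyyy:HH:mm:ss Z"] }
    geoip { source => "clientip" }
    useragent { source => "agent" }
  }
}
```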
Best Practices
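- Filter Order: Filters run in the order they appear, so parse first (grok, json, csv), then convert types, then enrich (date, geoip, useragent).
- Conditional Statements: Wrap filters in conditionals so they run only on the events that need them.
- Performance Optimization: Prefer a built-in filter over Ruby code where one exists, and keep any Ruby code short.
- Error Handling: Check for failure tags such as _grokparsefailure, _dateparsefailure, and _jsonparsefailure, and route or repair those events.
- Testing and Verification: Test patterns with a tool such as Grok Debugger before deploying them.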
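For the testing step, a throwaway pipeline that reads from standard input and prints parsed events can complement Grok Debugger. A minimal sketch using the stdin input and the stdout output with the rubydebug codec follows. The file name test.conf in the usage note is hypothetical.

```conf
input {
  stdin { }
}
filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
}
output {
  # Print every event with all of its fields for inspection.
  stdout { codec => rubydebug }
}
```

Run it with bin/logstash -f test.conf, paste a sample log line, and inspect the printed fields and tags.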