
What are the commonly used filters in Logstash, and how do you use Grok and Mutate filters?

Feb 21, 15:52

Logstash provides various filter plugins for parsing, transforming, and enriching data. Here are commonly used filters and their usage.

1. Grok Filter

Grok is the most powerful filter for parsing unstructured data into structured data.

Basic Usage

conf
filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
}
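
Instead of a bundled composite pattern, grok can also pull out individual fields inline. IP, WORD, and URIPATHPARAM below are core patterns shipped with grok; the field names after each colon are illustrative:

conf
filter {
  grok {
    # "55.3.244.1 GET /index.html" -> client_ip, method, request
    match => { "message" => "%{IP:client_ip} %{WORD:method} %{URIPATHPARAM:request}" }
  }
}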

Multi-pattern Matching

conf
filter {
  grok {
    match => {
      "message" => [
        "%{COMBINEDAPACHELOG}",
        "%{COMMONAPACHELOG}",
        "%{NGINXACCESS}"
      ]
    }
  }
}

Custom Patterns

conf
filter {
  grok {
    patterns_dir => ["/path/to/patterns"]
    match => { "message" => "%{CUSTOM_PATTERN:custom_field}" }
  }
}
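
The patterns_dir option points at a directory of pattern files, with one NAME-plus-regex definition per line. A sketch of what such a file could contain (CUSTOM_PATTERN and its sub-fields are made up for illustration):

conf
# One pattern definition per line: a name, then a regex
# that may itself reference other grok patterns
CUSTOM_PATTERN \[%{WORD:module}\] %{NUMBER:code:int} %{GREEDYDATA:detail}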

2. Mutate Filter

The Mutate filter performs various operations on fields.

Rename Fields

conf
filter {
  mutate {
    rename => { "old_name" => "new_name" }
  }
}

Convert Field Types

conf
filter {
  mutate {
    convert => {
      "status"  => "integer"
      "price"   => "float"
      "enabled" => "boolean"
    }
  }
}

Remove Fields

conf
filter {
  mutate {
    remove_field => ["temp_field", "debug_info"]
  }
}

Replace Field Values

conf
filter {
  mutate {
    replace => { "message" => "new message" }
  }
}

Add Fields

conf
filter {
  mutate {
    add_field => {
      "environment"  => "production"
      "processed_at" => "%{@timestamp}"
    }
  }
}

Merge Fields

conf
filter {
  mutate {
    merge => { "field1" => "field2" }
  }
}
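
mutate also handles text transformations. gsub, lowercase, and split are all built-in mutate options; the field names below are illustrative:

conf
filter {
  mutate {
    # Replace every tab in "message" with a space
    gsub => ["message", "\t", " "]
    # Lowercase the value of "method"
    lowercase => ["method"]
    # Split "tags_csv" into an array on commas
    split => { "tags_csv" => "," }
  }
}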

3. Date Filter

The Date filter parses timestamps and converts them to Logstash's @timestamp field.

Basic Usage

conf
filter {
  date {
    match => ["timestamp", "dd/MMM/yyyy:HH:mm:ss Z"]
  }
}

Multiple Date Formats

conf
filter {
  date {
    match => [
      "timestamp",
      "dd/MMM/yyyy:HH:mm:ss Z",
      "yyyy-MM-dd HH:mm:ss",
      "ISO8601"
    ]
  }
}

Custom Target Field

conf
filter {
  date {
    match  => ["log_time", "yyyy-MM-dd HH:mm:ss"]
    target => "parsed_time"
  }
}

Timezone Setting

conf
filter {
  date {
    match    => ["timestamp", "yyyy-MM-dd HH:mm:ss"]
    timezone => "Asia/Shanghai"
  }
}

4. GeoIP Filter

The GeoIP filter adds geographic location information based on IP address.

Basic Usage

conf
filter {
  geoip {
    source => "client_ip"
  }
}

Specify Target Field

conf
filter {
  geoip {
    source => "client_ip"
    target => "geoip"
  }
}

Specify Database Path

conf
filter {
  geoip {
    source   => "client_ip"
    database => "/path/to/GeoLite2-City.mmdb"
  }
}

Specify Fields

conf
filter {
  geoip {
    source => "client_ip"
    fields => ["city_name", "country_name", "location"]
  }
}

5. Useragent Filter

The Useragent filter parses User-Agent strings.

Basic Usage

conf
filter {
  useragent {
    source => "agent"
  }
}

Specify Target Field

conf
filter {
  useragent {
    source => "agent"
    target => "ua"
  }
}

6. CSV Filter

The CSV filter parses CSV format data.

Basic Usage

conf
filter {
  csv {
    separator => ","
    columns   => ["name", "age", "city"]
  }
}

Auto-detect Column Names

conf
filter {
  csv {
    separator => ","
    autodetect_column_names => true
  }
}

7. JSON Filter

The JSON filter parses JSON strings.

Basic Usage

conf
filter {
  json {
    source => "message"
  }
}

Specify Target Field

conf
filter {
  json {
    source => "message"
    target => "parsed_json"
  }
}

Remove Original Field After Parsing

conf
filter {
  json {
    source       => "message"
    remove_field => ["message"]
  }
}
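
If the source field does not contain valid JSON, the filter adds a _jsonparsefailure tag to the event, which can be checked in a conditional:

conf
filter {
  json {
    source => "message"
  }
  if "_jsonparsefailure" in [tags] {
    # Mark (or route) events that could not be parsed
    mutate { add_field => { "parse_status" => "failed" } }
  }
}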

8. Ruby Filter

The Ruby filter allows using Ruby code for complex data processing.

Basic Usage

conf
filter {
  ruby {
    code => 'event.set("computed_field", event.get("field1") + event.get("field2"))'
  }
}

Complex Logic

conf
filter {
  ruby {
    code => '
      if event.get("status").to_i >= 400
        event.tag("error")
      else
        event.tag("success")
      end
    '
  }
}

Array Operations

conf
filter {
  ruby {
    code => '
      items = event.get("items")
      if items.is_a?(Array)
        event.set("item_count", items.length)
        event.set("total_price", items.sum { |i| i["price"] })
      end
    '
  }
}

9. Drop Filter

The Drop filter discards events.

Conditional Drop

conf
filter {
  if [log_level] == "DEBUG" {
    drop { }
  }
}

Percentage Drop

conf
filter {
  ruby {
    # Randomly discard roughly 10% of events
    code => 'event.cancel if rand < 0.1'
  }
}
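
The drop filter itself also has a percentage option for sampling, which avoids the Ruby round-trip; the config below should drop roughly 90% of DEBUG events, keeping about a 10% sample:

conf
filter {
  if [log_level] == "DEBUG" {
    drop { percentage => 90 }
  }
}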

10. Aggregate Filter

The Aggregate filter aggregates multiple events.

Basic Usage

conf
filter {
  aggregate {
    task_id => "%{user_id}"
    code => '
      map["count"] ||= 0
      map["count"] += 1
    '
    push_map_as_event => true
    timeout => 60
  }
}
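
A common aggregate pattern is correlating the start and end events of one task. The sketch below assumes each event carries a task_id field and a logtype field distinguishing "start" from "end" (both names are illustrative):

conf
filter {
  if [logtype] == "start" {
    aggregate {
      task_id    => "%{task_id}"
      # Remember when the task started (as epoch seconds)
      code       => "map['started_at'] = event.get('@timestamp').to_f"
      map_action => "create"
    }
  }
  if [logtype] == "end" {
    aggregate {
      task_id     => "%{task_id}"
      # Compute the task duration on the closing event
      code        => "event.set('duration_s', event.get('@timestamp').to_f - map['started_at'])"
      map_action  => "update"
      end_of_task => true
      timeout     => 120
    }
  }
}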

Filter Combination

Multiple filters can be combined:

conf
filter {
  # Parse log format
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }

  # Convert field types
  mutate {
    convert => { "response" => "integer" }
  }

  # Parse timestamp
  date {
    match => ["timestamp", "dd/MMM/yyyy:HH:mm:ss Z"]
  }

  # Add geographic location information
  geoip {
    source => "clientip"
  }

  # Parse User-Agent
  useragent {
    source => "agent"
  }
}

Best Practices

  1. Filter Order: Arrange filters in dependency order — parse fields (grok, json) before filters that consume them (date, geoip, useragent)
  2. Conditional Statements: Wrap expensive filters in conditionals so events that don't need them skip the work
  3. Performance Optimization: Prefer built-in filters over complex Ruby code, which runs per event
  4. Error Handling: Check for failure tags such as _grokparsefailure instead of letting unparsed events pass through silently
  5. Testing and Verification: Test patterns with tools like the Grok Debugger before deploying
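
As a sketch of points 2 and 4 above, grok parse failures can be detected and handled explicitly (_grokparsefailure is the tag grok adds by default when no pattern matches):

conf
filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
  if "_grokparsefailure" in [tags] {
    # Drop events no pattern could parse (or route them elsewhere)
    drop { }
  }
}
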
Tags: Logstash