Filebeat – Logstash – Multiple Config Files – Duplicate data

I am new to Logstash and Filebeat. I am trying to set up multiple config files for my Logstash instance, using Filebeat to send data to Logstash. Even though I have filters in both Logstash config files, I am getting duplicate data.

Logstash config file – 1:

input {
  beats {
    port => 5045
  }
}

filter {
  if [fields][env] == "prod" {
    grok {
      match => { "message" => "%{LOGLEVEL:loglevel}] %{GREEDYDATA:message}$" }
      overwrite => [ "message" ]
    }
  }
}

output {
  stdout {
    codec => rubydebug
  }

  elasticsearch {
    hosts => ["https://172.17.0.2:9200"]
    index => "logstash-myapp-%{+YYYY.MM.dd}"
    user => "elastic"
    password => "password"
    ssl => true
    cacert => "/usr/share/logstash/certs/http_ca.crt"
  }
}

Logstash config file – 2:


input {
  beats {
    port => 5044
  }
}

filter {
  if [fields][env] == "dev" {
    grok {
      match => { "message" => "%{LOGLEVEL:loglevel}] %{GREEDYDATA:message}$" }
      overwrite => [ "message" ]
    }
  }
}

output {
  stdout {
    codec => rubydebug
  }

  elasticsearch {
    hosts => ["https://172.17.0.2:9200"]
    index => "logstash-myapp-%{+YYYY.MM.dd}"
    user => "elastic"
    password => "password"
    ssl => true
    cacert => "/usr/share/logstash/certs/http_ca.crt"
  }
}

Logfile Content:

[INFO] First Line
[INFO] Second Line
[INFO] Third Line

Filebeat config:

filebeat.inputs:
- type: filestream
  enabled: true
  paths:
    - /root/data/logs/*.log
  fields:
    app: test
    env: dev

output.logstash:
  # The Logstash hosts
  hosts: ["172.17.0.4:5044"]

I know that even with multiple config files, Logstash processes every event against all the filters from all the config files. That is why we have put a "fields.env" conditional in each of the config files.
Since "fields.env" is "dev", I am expecting 3 lines to be sent to Elasticsearch, but 6 lines arrive, i.e. the data is duplicated.
Please help.

Solution:

The problem is that your two configuration files get merged into a single pipeline: not only the filters, but also the inputs and outputs.

So every event that enters the pipeline through either input goes through all filters (subject to their conditionals) and through all outputs. Conditionals are allowed in output sections too, but yours have none, so every event reaches both outputs.

So the first log line, [INFO] First Line, coming in on port 5044, only goes through the filter guarded by [fields][env] == "dev", but then passes through both of the two identical outputs, which is why it ends up twice in your Elasticsearch index.

The easy solution is to remove the output section from one of the configuration files, so that every event goes through a single output.
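Alternatively, since Logstash accepts conditionals in output sections as well, each file's output could be guarded by the same env check its filter already uses. A sketch for the dev file (the connection settings are copied from the question):

```
output {
  # Only events tagged env=dev by Filebeat reach this output
  if [fields][env] == "dev" {
    elasticsearch {
      hosts => ["https://172.17.0.2:9200"]
      index => "logstash-myapp-%{+YYYY.MM.dd}"
      user => "elastic"
      password => "password"
      ssl => true
      cacert => "/usr/share/logstash/certs/http_ca.crt"
    }
  }
}
```

With a matching `if [fields][env] == "prod"` guard in the other file, each event is written exactly once, even though both outputs still see it.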

The better solution is to define separate pipelines, one per environment, so the two configurations are never merged in the first place.
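A minimal sketch of what that could look like in `pipelines.yml` (the pipeline IDs and config paths are illustrative, not from the original post):

```
# config/pipelines.yml
# Each pipeline loads only its own config file, so its input,
# filter, and output are isolated from the other pipeline.
- pipeline.id: prod
  path.config: "/usr/share/logstash/pipeline/prod.conf"   # beats input on 5045
- pipeline.id: dev
  path.config: "/usr/share/logstash/pipeline/dev.conf"    # beats input on 5044
```

With this layout an event arriving on port 5044 is processed only by the dev pipeline's filter and output, so no duplication can occur.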
