Filebeat JSON Logs: How to Normalize Message Field?

Learn how to normalize JSON logs in Filebeat using decode_json_fields to parse nested objects and improve log management.
Exploding nested JSON logs transforming into clean structured data with Filebeat decode_json_fields processor in ELK stack
  • ⚒️ The decode_json_fields processor transforms nested JSON in Filebeat logs into usable top-level fields.
  • ⚠️ JSON strings in the message field prevent Elasticsearch from indexing fields correctly and hurt query performance.
  • 🧰 Combining decode_json_fields with other processors like drop_fields improves log cleanliness and usability.
  • 🔍 Normalized logs significantly enhance Kibana searchability, filtering, and visualizations.
  • 🚫 Improper use of decode_json_fields can silently break parsing and lead to indexing conflicts or data loss.

Logs in the ELK stack need to be structured and easy to query. But nested JSON in log message fields, especially when stored as raw strings, is hard to parse, analyze, and visualize. This guide shows how to normalize Filebeat logs using the decode_json_fields processor, which speeds up Elasticsearch and makes Kibana easier to use.


What Does It Mean to Normalize Filebeat Logs?

Log normalization in Filebeat converts nested or raw log data, often JSON, into simple top-level fields. These flat fields are much easier to query in Elasticsearch and visualize in Kibana dashboards.
Instead of sending a whole JSON object as a single field, normalization extracts each key-value pair into an individual, searchable field.

🔍 What You Gain from Log Normalization

  • ⚡ Faster querying and indexing.
  • 🔎 Quicker filters and aggregations in Kibana.
  • 🧱 Consistent log structure everywhere.
  • 📊 Data that is easier to explore and visualize in Kibana.
  • 📉 Less indexing overhead and fewer mapping problems.

🛑 Before Normalization

{
  "message": "{\"status\":\"error\",\"error\":\"database timeout\"}",
  "host": {
    "name": "app-server"
  }
}

All of the useful data is trapped inside the "message" string, so Elasticsearch cannot index it as separate fields.


✅ After Normalization

{
  "status": "error",
  "error": "database timeout",
  "host": {
    "name": "app-server"
  }
}

The fields are now at the top level and searchable in Elasticsearch. For example, you can filter by status:error in Kibana or use the fields in aggregations.


Filebeat and JSON: Raw Logs vs. Parsed Fields

Filebeat can handle JSON-formatted logs directly. If your application logs print one JSON object per line, Filebeat can parse those fields itself using its built-in input options.

📄 Example JSON Input Configuration

filebeat.inputs:
  - type: log
    paths:
      - /var/log/app.log
    json.keys_under_root: true
    json.add_error_key: true

  • json.keys_under_root: true tells Filebeat to parse the JSON object and place its fields at the event's root level.
  • json.add_error_key: true adds an error key when parsing fails, which makes problems easier to spot.
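Assuming an application writes one JSON object per line, such as {"level":"info","msg":"service started"}, the input above would produce an event sketched roughly like this (the log line, field values, and host name are illustrative, and Filebeat's exact metadata fields vary by version):

```yaml
# Hypothetical input line in /var/log/app.log:
#   {"level":"info","msg":"service started"}
#
# Sketch of the resulting event with json.keys_under_root: true
# (Filebeat metadata fields abbreviated):
level: "info"
msg: "service started"
host:
  name: "app-server"
```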

🧱 The Problem with Nested Message Fields

This setup works well for JSON that is already at the top level. But it fails when a log field, like message or log, has another JSON string inside it.

📉 Signs of nested JSON problems:

  • Raw JSON appears as a plain string, for example, "message": "{\"user\":\"admin\"}".
  • Data is stuck in strings instead of being read as separate fields.
  • Filters in Kibana don't work because data inside strings cannot be searched.
  • Index size grows and performance drops because of poor mappings.

Nested JSON fields must be decoded on their own to work. This is where Filebeat processors, like decode_json_fields, come in.


Use decode_json_fields to Normalize Filebeat Logs

The decode_json_fields processor in Filebeat targets a JSON object that is stored as a string inside another field, most often the message field. It parses the embedded JSON and expands its data into top-level fields or a specific subpath.

⚙️ Basic Configuration Example

processors:
  - decode_json_fields:
      fields: ["message"]
      process_array: false
      max_depth: 1
      target: ""
      overwrite_keys: true

🔍 Explanation of Key Configuration Options

  • fields: The fields to decode (e.g., ["message"]).
  • process_array: If true, the processor also decodes arrays (default is false).
  • max_depth: Limits how deep decoding recurses, which prevents runaway nesting.
  • target: Where to put the decoded fields ("" places them at the root level).
  • overwrite_keys: Lets decoded keys replace existing fields (usually set to true).

With this setup, Filebeat will parse any valid JSON string in the message field and expose the values inside as individual fields when shipping data to Elasticsearch.
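If you prefer to keep decoded data grouped rather than merged into the root, you can point target at a subpath instead. A minimal sketch (the "parsed" name is just an example):

```yaml
processors:
  - decode_json_fields:
      fields: ["message"]
      target: "parsed"   # decoded keys land under parsed.* instead of the root
      overwrite_keys: true
      max_depth: 1
# A message of "{\"status\":\"error\"}" would then surface as parsed.status: "error".
```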


Nested JSON in the Message Field: A Real Use Case

Here's an example of logs your application might send:

{
  "timestamp": "2024-04-01T12:00:00Z",
  "level": "info",
  "message": "{\"action\":\"login\",\"user\":{\"id\":\"123\",\"role\":\"admin\"}}"
}

In this case, message contains a second layer of JSON. Without the decode_json_fields processor, that embedded information stays as raw text and is useless for searching.

🔎 Problems if You Don't Normalize

  • 🔒 Fields like user.id and user.role can't be searched out of the box.
  • 📉 Logs stay messy, which makes analysis and monitoring harder.
  • 💥 Mapping conflicts can appear when similar logs differ slightly in structure.
  • 📊 Kibana dashboards may run slowly or miss fields.
  • ⚠️ Index size can balloon because the same raw text is stored repeatedly.

The decode_json_fields processor fixes this by extracting the action and user.* fields and placing them into the Elasticsearch event correctly.
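With the basic configuration shown earlier (plus a drop_fields step to remove the original message), the event above would come out roughly like this sketch (exact metadata fields vary by setup):

```yaml
# Sketch of the normalized event after decode_json_fields runs
timestamp: "2024-04-01T12:00:00Z"
level: "info"
action: "login"
user:
  id: "123"
  role: "admin"
```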


Structuring Filebeat Inputs and Processors Together

To normalize logs, you often need to combine decode_json_fields with other Filebeat processors. This cleans up both the structure and the content of the data.

🛠️ Example Filebeat Input + Processor Configuration

filebeat.inputs:
  - type: log
    paths:
      - /var/log/app.json
    json.keys_under_root: true
    json.add_error_key: true

processors:
  - decode_json_fields:
      fields: ["message"]
      target: ""
      overwrite_keys: true
  - drop_fields:
      fields: ["message"]

🔍 Best Practices

  • Set target: "" to expose decoded fields at the root level, which makes querying easiest.
  • Use overwrite_keys: true so duplicate or stale keys don't cause conflicts.
  • Use drop_fields to remove the original message once you no longer need it.
  • Consider the rename processor to keep field names consistent across applications.
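As a sketch of that last point, the rename processor can standardize field names after decoding; the "usr" and "user.name" field names below are hypothetical:

```yaml
processors:
  - decode_json_fields:
      fields: ["message"]
      target: ""
      overwrite_keys: true
  - rename:
      fields:
        - from: "usr"        # hypothetical field emitted by one service
          to: "user.name"    # the name the rest of your apps use
      ignore_missing: true   # don't complain when an event lacks the field
      fail_on_error: false   # keep the event even if the rename fails
```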

Avoid These Common Mistakes

If you misconfigure decode_json_fields, it can cause silent parsing failures, data loss, and developer frustration. Watch out for these common mistakes:

⚠️ Things to Avoid

  • ❌ Don't leave overwrite_keys unset: decoded fields may be dropped when keys already exist.
  • ❌ Don't expect arrays to decode while process_array: false: they will stay as raw strings.
  • ❌ Don't decode JSON that's already parsed: this creates mapping conflicts and duplicate data.
  • ❌ Don't skip testing your logs: run filebeat -e -d "*" to catch problems early.
  • ❌ Don't use ambiguous field names, like message.message.

Debugging log pipelines becomes much harder when the root cause is an intermittent decoding error or an obscure mapping conflict in Elasticsearch.


Why Normalized Logs Improve Elasticsearch and Kibana

Elasticsearch works best with flat, well-organized fields. Nested JSON, unless you plan it into a nested mapping (which often adds its own problems), makes performance worse and limits how you can search.

⚡ Benefits of Normalized Data

  • 🔍 Logs are immediately searchable with DSL or CLI filters.
  • 📉 Index mappings stay stable and compact.
  • 📊 Kibana dashboards detect field types automatically for better visualizations.
  • 📦 Less duplicated data than large message blobs.
  • 🔐 Fewer surprises with mapping and indexing.

In the end, normalized JSON makes search, aggregation, alerting, and forensics simpler and more effective.


Security and Performance Considerations

Decoding JSON is powerful, but there are important caveats, especially in busy production systems.

🔐 Key Advice

  • 🚫 Don't decode logs you don't trust. Malformed or malicious JSON can slow parsing or cause crashes.
  • ⚙️ Set max_depth to prevent deep, recursive decoding from overloading the system.
  • 📊 Watch Filebeat logs for parsing errors or dropped events using filebeat -e -d publish.
  • 💡 After normalization, monitor how many distinct fields your logs produce to avoid mapping explosion.
  • 📁 Where possible, have service teams agree on log schemas; consistent structure keeps JSON simple.
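Putting several of these recommendations together, a more defensive processor configuration might look like this sketch (the specific limits are examples, not prescriptions):

```yaml
processors:
  - decode_json_fields:
      fields: ["message"]
      max_depth: 2          # cap recursion so hostile or malformed payloads can't nest forever
      process_array: false  # skip arrays unless you know you need them
      target: ""
      overwrite_keys: true
      add_error_key: true   # tag events whose message fails to decode, instead of failing silently
```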

Making Your Log Pipelines More Robust

As your logging system evolves, your approach to decoding and structuring data should evolve with it.

🧩 Advanced Approaches Beyond Filebeat

  • 🧪 Use Logstash for complex transformations or multi-layer JSON parsing with scripts.
  • 📦 Set up Elasticsearch data streams to manage templates and field mappings consistently across servers.
  • 🛠️ Create custom index templates to lock in the field types, formats, and mappings you expect.
  • ✅ Consider an upstream fix: have applications emit clean, flat JSON logs deliberately (for example, with structured logging libraries).
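As a sketch of the index-template idea, a minimal template that pins the types of a few decoded fields might look like the following (this is the request body for something like PUT _index_template/app-logs; the index pattern and field names are examples only):

```yaml
# Hypothetical Elasticsearch index template body (shown in YAML form)
index_patterns: ["app-logs-*"]
template:
  mappings:
    properties:
      status: { type: "keyword" }     # keep exact-match filtering cheap
      user:
        properties:
          id: { type: "keyword" }
          role: { type: "keyword" }
```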

Check your pipeline regularly. Log decoding errors usually surface much later, during analysis, so early testing pays off.


Dealing with Complex or Custom JSON Logs

Sometimes logs are not simple { "field": "value" } JSON. Some cases need special handling.

🌀 How to Handle Complex JSON

  • 🔁 Apply decode_json_fields in stages, or again in Logstash, for JSON that's double-wrapped.
  • 📜 Use Logstash scripting (such as the ruby filter plugin) for unusual patterns or transformations.
  • 🧹 Flatten deeply nested JSON in your applications before shipping logs.
  • 👩‍💻 Filter and drop fields to strip unneeded or noisy parts from very chatty logs.
  • 🔐 Always validate logs before they enter your system; a bad structure early on compounds into bigger problems later.
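For the double-wrapped case, one approach is to run decode_json_fields twice: the first pass exposes an inner field that itself still contains a JSON string, and a second pass decodes that. A sketch (the "payload" field name is hypothetical):

```yaml
processors:
  - decode_json_fields:     # first pass: message -> top-level fields, including payload
      fields: ["message"]
      target: ""
      overwrite_keys: true
  - decode_json_fields:     # second pass: decode the still-stringified payload field
      fields: ["payload"]
      target: ""
      overwrite_keys: true
```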

If your logs contain encrypted tokens, arrays of objects, or inconsistent field types, test changes in a staging environment before rolling them out to production.


Final Wrap-Up and Quick Checklist

Normalizing Filebeat logs with the decode_json_fields processor bridges raw log data and clear, actionable monitoring. You turn embedded JSON strings into clean, fast, structured documents, which makes filtering easier, speeds up visualization in Kibana, and keeps your log system scalable.

✅ Quick Checklist

  • decode_json_fields is enabled and targets message correctly.
  • Decoded fields appear at the root or sub-path you want.
  • overwrite_keys: true prevents data from being duplicated or dropped.
  • Redundant message or raw fields are removed.
  • Elasticsearch mappings stay clean and stable.
  • You checked for problems with filebeat -e -d "*" before going live.

Need help setting up your pipeline or handling complex logs? Devsolus offers detailed ELK stack guidance to get the most out of your data.


Sources & Citations

Elastic. (2023). Filebeat Reference [8.13]. https://www.elastic.co/guide/en/beats/filebeat/8.13/decode-json-fields.html

Elastic. (2023). Indexing Structured Logs in Elasticsearch. https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping.html

Brown, K. (2022). Best Practices in ELK Stack Parsing. DevOps Weekly News.
