- ⚒️ The decode_json_fields processor transforms nested JSON in Filebeat logs into usable top-level fields.
- ⚠️ JSON strings in the message field prevent Elasticsearch from indexing fields correctly and hurt query performance.
- 🧰 Combining decode_json_fields with other processors like drop_fields improves log cleanliness and usability.
- 🔍 Normalized logs significantly enhance Kibana searchability, filtering, and visualizations.
- 🚫 Improper use of decode_json_fields can silently break parsing and lead to indexing conflicts or data loss.
Logs in the ELK stack need to be structured and easy to query. But nested JSON in log message fields, especially when stored as raw strings, is hard to parse, analyze, and visualize. This guide shows how to normalize Filebeat logs using the decode_json_fields processor, which helps Elasticsearch index more efficiently and makes Kibana easier to use.
What Does It Mean to Get Filebeat Logs in Order?
Log normalization in Filebeat changes nested or raw log data, often JSON, into simple top-level fields. These flat fields make logs much easier to query in Elasticsearch and see in Kibana dashboards.
Instead of sending a whole JSON object as one field, normalization pulls out each key-value pair. This makes each pair an individual field you can search.
🔍 What You Gain from Log Normalization
- ⚡ Querying and indexing works better.
- 🔎 Filters and aggregations in Kibana are quicker.
- 🧱 Log structure is the same everywhere.
- 📊 Data is easier to see and find in Kibana.
- 📉 Less indexing work and fewer mapping problems.
🛑 Before Normalization
```json
{
  "message": "{\"status\":\"error\",\"error\":\"database timeout\"}",
  "host": {
    "name": "app-server"
  }
}
```
All the useful data is locked inside the "message" string, so Elasticsearch cannot index it as separate fields.
✅ After Normalization
```json
{
  "status": "error",
  "error": "database timeout",
  "host": {
    "name": "app-server"
  }
}
```
Fields are now at the top level and directly searchable in Elasticsearch. For example, you can filter by status:error in Kibana or run aggregations on the field.
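Once status is a top-level field, a one-line KQL query in Kibana's search bar can isolate these events (the field names here match the example above; adjust them to your own index pattern):

```
status: "error" and host.name: "app-server"
```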
Filebeat and JSON: Raw Logs vs. Parsed Fields
Filebeat works with JSON-formatted logs directly. If your application logs print one JSON object per line, Filebeat can read those fields on its own using its built-in input setup.
📄 Example JSON Input Configuration
```yaml
filebeat.inputs:
  - type: log
    paths:
      - /var/log/app.log
    json.keys_under_root: true
    json.add_error_key: true
```
- `json.keys_under_root: true` tells Filebeat to read the JSON object and put its fields at the event's root level.
- `json.add_error_key: true` adds an error key if parsing fails. This helps find problems.
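To see what the error key looks like in practice, a malformed line such as `{"user": "admin"` (missing its closing brace) would typically produce an event along these lines; the exact error message text can vary by Filebeat version:

```json
{
  "message": "{\"user\": \"admin\"",
  "error": {
    "message": "Error decoding JSON: unexpected end of JSON input",
    "type": "json"
  }
}
```

Alerting on the presence of `error.type: json` is an easy way to catch malformed log lines early.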
🧱 The Problem with Nested Message Fields
This setup works well for JSON that is already at the top level. But it fails when a log field, like message or log, has another JSON string inside it.
📉 Signs of nested JSON problems:
- Raw JSON appears as a plain string, for example, `"message": "{\"user\":\"admin\"}"`.
- Data is stuck in strings instead of being read as separate fields.
- Filters in Kibana don't work because data inside strings cannot be searched.
- Index size grows, and performance drops because mappings are not good.
Nested JSON fields must be decoded on their own to work. This is where Filebeat processors, like decode_json_fields, come in.
Use decode_json_fields to Get Filebeat Logs in Order
The decode_json_fields processor in Filebeat takes a JSON object that is part of a string inside another field. This is most often the message field. It reads the hidden JSON and puts its data into top-level fields or a specific subpath.
⚙️ Basic Configuration Example
```yaml
processors:
  - decode_json_fields:
      fields: ["message"]
      process_array: false
      max_depth: 1
      target: ""
      overwrite_keys: true
```
🔍 Explanation of Key Configuration Options
| Option | Description |
|---|---|
| `fields` | Lists the fields to decode (e.g., `["message"]`) |
| `process_array` | If `true`, the processor also decodes arrays (default is `false`) |
| `max_depth` | Limits how many levels deep decoding goes, preventing runaway nesting |
| `target` | Where to put the decoded fields (`""` puts them at the root level) |
| `overwrite_keys` | Lets decoded keys replace existing fields (usually set this to `true`) |
With this setup, Filebeat will decode any valid JSON string in the message field and promote its values to their own fields before sending the event to Elasticsearch.
Nested JSON in the Message Field: A Real Use Case
Here's an example of logs your application might send:
```json
{
  "timestamp": "2024-04-01T12:00:00Z",
  "level": "info",
  "message": "{\"action\":\"login\",\"user\":{\"id\":\"123\",\"role\":\"admin\"}}"
}
```
In this case, message contains a second layer of JSON data. Without using the decode_json_fields processor, the hidden information will stay as raw text. This makes it useless for searching.
🔎 Problems if You Don't Normalize
- 🔒 You can't search fields like `user.id` and `role` right away.
- 📉 Logs stay messy, which makes analysis and monitoring harder.
- 💥 You might get mapping problems if logs that are alike are a bit different.
- 📊 Kibana dashboards might run slow or not show all fields.
- ⚠️ Index size can grow too much because the same raw text is stored over and over.
The decode_json_fields processor fixes this. It pulls out the action and user.* fields and puts them into the Elasticsearch event in the right way.
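With the basic configuration shown earlier (`target: ""`, `overwrite_keys: true`), the login event above would come out roughly like this (the original message field also remains unless you remove it with drop_fields):

```json
{
  "timestamp": "2024-04-01T12:00:00Z",
  "level": "info",
  "action": "login",
  "user": {
    "id": "123",
    "role": "admin"
  }
}
```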
Structuring Filebeat Inputs and Processors Together
To get logs in order, you often need to use decode_json_fields with other Filebeat processors. This cleans up both the data's structure and its content.
🛠️ Example Filebeat Input + Processor Configuration
```yaml
filebeat.inputs:
  - type: log
    paths:
      - /var/log/app.json
    json.keys_under_root: true
    json.add_error_key: true

processors:
  - decode_json_fields:
      fields: ["message"]
      target: ""
      overwrite_keys: true
  - drop_fields:
      fields: ["message"]
```
🔍 Good Ways to Do This
- Set `target: ""` to put decoded fields at the root level. This makes querying easiest.
- Use `overwrite_keys: true` to stop duplicate or conflicting keys from causing problems.
- Use `drop_fields` to remove the original message once you no longer need it.
- Consider the `rename` processor to make field names consistent across all your apps.
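As a sketch of that last point, a `rename` processor can align differently named fields across services. The `from` names below are hypothetical app-specific fields, used only for illustration:

```yaml
processors:
  - rename:
      fields:
        - from: "usr"        # hypothetical legacy field name
          to: "user.name"
        - from: "lvl"        # hypothetical legacy field name
          to: "log.level"
      ignore_missing: true   # don't fail on events that lack these fields
      fail_on_error: false
```

Setting `ignore_missing: true` lets you roll the same processor config out to services that have already migrated to the new names.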
Avoid These Common Mistakes
If you set up decode_json_fields wrong, it can cause parsing errors you don't see, data loss, and make developers frustrated. Watch out for these common mistakes:
⚠️ Things to Avoid
- ❌ Don't leave `overwrite_keys` off: decoded fields may not replace existing ones correctly.
- ❌ Don't try to decode arrays while `process_array: false`: you'll get empty records.
- ❌ Don't decode JSON that has already been parsed: this creates mapping conflicts and duplicate data.
- ❌ Don't skip testing your logs: use `filebeat -e -d "*"` to find problems early.
- ❌ Don't use ambiguous field names, like `message.message`.
Debugging log pipelines gets much harder when the issue is intermittent decoding errors or subtle mapping conflicts in Elasticsearch.
Why Normalized Logs Improve Elasticsearch and Kibana
Elasticsearch works best with flat, well-organized fields. Nested JSON, unless you plan it into a nested mapping (which often adds its own problems), makes performance worse and limits how you can search.
⚡ Good Points of Normalized Data
- 🔍 You can search logs right away with DSL or CLI filters.
- 📉 Index mappings stay steady and small.
- 📊 Kibana dashboards find field types on their own for better views.
- 📦 Less repeated data compared to big message blocks.
- 🔐 Fewer unexpected problems with mapping and indexing.
In the end, normalized JSON makes your search, aggregation, alerting, and forensics work better and simpler.
Security and Performance Considerations
Decoding JSON is very helpful. But there are important things to watch for, mainly in busy production systems.
🔐 Key Advice
- 🚫 Do not decode logs you don't trust. Bad or harmful JSON can slow down parsing or cause crashes.
- ⚙️ Set `max_depth` to stop deep, recursive decoding that can overload systems.
- 📊 Watch Filebeat logs for parsing errors or dropped events using `filebeat -e -d publish`.
- 💡 After normalization, check how many fields your logs produce. This helps avoid too many fields (a "mapping explosion").
- 📁 If you can, have service teams agree on log structures. This keeps the JSON simpler.
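One way to hedge against untrusted input is to decode only lines that at least look like JSON. Filebeat processors accept `when` conditions, so a sketch like the following (the regexp is a rough heuristic, not a validator) limits what reaches the decoder:

```yaml
processors:
  - decode_json_fields:
      when:
        regexp:
          message: '^\{'     # only attempt lines that start like a JSON object
      fields: ["message"]
      max_depth: 2           # cap recursion on deeply nested payloads
      target: ""
```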
Making Your Log Pipelines More Robust
As your log system changes, your way of decoding and structuring data should change too.
🧩 More Advanced Ways (Not Just Filebeat)
- 🧪 Use Logstash for hard changes or JSON parsing in layers with scripts.
- 📦 Set up Elasticsearch data streams. These help you handle templates and map fields well across servers.
- 🛠️ Make custom index templates. This locks in the field types, formats, and mappings you expect.
- ✅ Think about having an early processing step where apps send out clean, flat JSON logs on purpose (for example, with structured logging tools).
It is important to check your pipeline often. Log decoding errors usually show up much later during analysis, so testing early helps a lot.
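As a sketch of the index-template idea, the Elasticsearch index template API can pin expected field types up front. The template name, index pattern, and field names below are assumptions for illustration only:

```json
PUT _index_template/app-logs
{
  "index_patterns": ["filebeat-app-*"],
  "template": {
    "mappings": {
      "properties": {
        "status":    { "type": "keyword" },
        "timestamp": { "type": "date" },
        "user": {
          "properties": {
            "id":   { "type": "keyword" },
            "role": { "type": "keyword" }
          }
        }
      }
    }
  }
}
```

Locking in types this way prevents the first stray log line from deciding a field's mapping for the whole index.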
Dealing with Complex or Custom JSON Logs
Sometimes logs are not simple { "field": "value" } JSON. Some cases need special ways to handle them.
🌀 How to Handle Complex JSON
- 🔁 Apply `decode_json_fields` in layers, or decode again in Logstash, for JSON that's double-wrapped.
- 📜 Use Logstash's scripting (like the `ruby` filter plugin) for unusual patterns or transformations.
- 🧹 Flatten deeply nested JSON in your applications before sending logs.
- 👩‍💻 Filter and drop fields to remove unneeded or noisy parts of very chatty logs.
- 🔐 Always validate logs before they enter your pipeline. A bad structure early on causes bigger problems later.
If your logs have encrypted tokens, object arrays, or different field types, try out changes in a test area first before putting them into live systems.
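For double-wrapped JSON, one approach is to run the processor twice: decode the outer string into a target, then decode the inner string it exposes. The `outer.payload` field name here is hypothetical, standing in for whichever inner field your logs actually use:

```yaml
processors:
  - decode_json_fields:          # first pass: decode the outer JSON string
      fields: ["message"]
      target: "outer"
  - decode_json_fields:          # second pass: decode the inner string the first pass exposed
      fields: ["outer.payload"]
      target: ""
      overwrite_keys: true
```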
Final Wrap-Up and Quick Checklist
Normalizing Filebeat logs with the decode_json_fields processor connects raw log data to clear, useful monitoring. You change hidden JSON strings into clean, fast, structured documents. This makes filtering easier, lets you see data faster in Kibana, and helps you keep a log system that can grow.
✅ Quick Checklist
- `decode_json_fields` is enabled and points at `message` correctly.
- Decoded fields show up at the root or the sub-path you want.
- `overwrite_keys: true` prevents data from being duplicated or dropped.
- The original `message` and other raw fields are removed.
- Mappings in Elasticsearch stay clean and stable.
- You checked for problems with `filebeat -e -d "*"` before going live.
Need help setting up your pipeline or handling complex logs? Devsolus gives detailed ELK stack advice to make your data work as well as it can.
Sources & Citations
Elastic. (2023). Filebeat Reference [8.13]. https://www.elastic.co/guide/en/beats/filebeat/8.13/decode-json-fields.html
Elastic. (2023). Indexing Structured Logs in Elasticsearch. https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping.html
Brown, K. (2022). Best Practices in ELK Stack Parsing. DevOps Weekly News.