Suppose I have some JSON formatted something like this (I’ve removed the bits I’m not interested in; I can filter those out):
{
"name": "db1",
"size": 40000,
"items": 500,
"mutations": 2,
"tombstones": 4,
"views_count": 8,
"fts_count": 0,
"index_count": 0,
"analytics_count": 0
}
{
"name": "db2",
"size": 11,
"items": 900,
"mutations": 3,
"tombstones": 0,
"views_count": 0,
"fts_count": 0,
"index_count": 0,
"analytics_count": 0
}
{
"name": "db1",
"size": 10,
"items": 6,
"mutations": 5,
"tombstones": 0,
"views_count": 0,
"fts_count": 0,
"index_count": 1,
"analytics_count": 0
}
How can I get a summary that adds up all the fields of the entries for each name? Here db1 appears twice, so I’d like something like this:
{
"name": "db1",
"size": 40010,
"items": 506,
"mutations": 7,
"tombstones": 4,
"views_count": 8,
"fts_count": 0,
"index_count": 1,
"analytics_count": 0
}
{
"name": "db2",
"size": 11,
"items": 900,
"mutations": 3,
"tombstones": 0,
"views_count": 0,
"fts_count": 0,
"index_count": 0,
"analytics_count": 0
}
This is a subset of the complete JSON, which has around 18000 lines (but there are only 11 names).
>Solution :
Use --slurp to read the input into a single array, then use group_by to form groups (arrays) by any criterion (.name in your case). If you want just one entry per name, take one (e.g. the first) of each group; note that this keeps a single entry per name and does not add up the fields. Finally, apply [] to unwrap the surrounding array.
jq --slurp 'group_by(.name) | map(first)[]'
{
"name": "db1",
"size": 40000,
"items": 500,
"mutations": 2,
"tombstones": 4,
"views_count": 8,
"fts_count": 0,
"index_count": 0,
"analytics_count": 0
}
{
"name": "db2",
"size": 11,
"items": 900,
"mutations": 3,
"tombstones": 0,
"views_count": 0,
"fts_count": 0,
"index_count": 0,
"analytics_count": 0
}
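If the goal is the per-name totals shown in the question rather than one representative entry, one possible approach is to fold each group into its first element. The following is only a minimal sketch, assuming every field other than name is numeric; the shell variables input and result and the jq variables $e and $kv are illustrative names, not part of the original answer.

```shell
# Sample input copied from the question (three concatenated JSON objects).
input='{"name":"db1","size":40000,"items":500,"mutations":2,"tombstones":4,"views_count":8,"fts_count":0,"index_count":0,"analytics_count":0}
{"name":"db2","size":11,"items":900,"mutations":3,"tombstones":0,"views_count":0,"fts_count":0,"index_count":0,"analytics_count":0}
{"name":"db1","size":10,"items":6,"mutations":5,"tombstones":0,"views_count":0,"fts_count":0,"index_count":1,"analytics_count":0}'

result=$(printf '%s\n' "$input" | jq --slurp --compact-output '
  # One array per distinct name.
  group_by(.name)[]
  # Start from the first entry of the group and fold the rest into it.
  | reduce .[1:][] as $e (.[0];
      # Add every field except "name" (assumed numeric) to the running total.
      reduce ($e | del(.name) | to_entries[]) as $kv (.;
        .[$kv.key] += $kv.value))
')

printf '%s\n' "$result"
```

With the sample data, the db1 entry should end up with size 40010, items 506 and mutations 7, while db2 is passed through unchanged because its group has only one element.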
Note: With a little more effort, you could also --stream the input and reduce it (instead of using --slurp), but 18000 lines isn’t much for today’s memory sizes, so --slurp seemed the more reasonable approach.
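For reference, a reduce-based variant that avoids building the whole array might look like the sketch below. It deliberately swaps in --null-input with inputs instead of --stream, since the input is already a sequence of separate objects; it keeps one running total per name in an accumulator object. As before, it assumes all fields other than name are numeric, and input, result, $e and $kv are illustrative names.

```shell
# Two db1 entries and one db2 entry, trimmed to a few fields for brevity.
input='{"name":"db1","size":40000,"items":500,"mutations":2}
{"name":"db2","size":11,"items":900,"mutations":3}
{"name":"db1","size":10,"items":6,"mutations":5}'

result=$(printf '%s\n' "$input" | jq --null-input --compact-output '
  # Read the objects one at a time; the accumulator maps name -> running total.
  reduce inputs as $e ({};
    .[$e.name] |= (
      if . == null then $e          # first entry seen for this name
      else
        # Add every field except "name" (assumed numeric) to the total.
        reduce ($e | del(.name) | to_entries[]) as $kv (.;
          .[$kv.key] += $kv.value)
      end))
  # Emit the per-name totals as separate objects.
  | .[]
')

printf '%s\n' "$result"
```

Only the per-name totals are held in memory here (11 objects for the full file described in the question), at the cost of a slightly longer filter.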