I’m struggling to write a jq query that generates a CSV like the one below from the following JSON example:
[
{
"regisseur": "Steven Spielberg",
"movies": [
"War of the Worlds",
"Jurassic Park"
]
},
{
"regisseur": "George Lucas",
"movies": [
"Howard The Duck",
"Hook"
]
}
]
What I’d like to get is a CSV representation of it that looks like this:
Steven Spielberg;War of the Worlds;
Steven Spielberg;Jurassic Park;
George Lucas;Howard The Duck;
George Lucas;Hook;
I especially want to achieve this with a single jq call for performance reasons, and I’d like to avoid iterating over each regisseur (at least within bash; jq may iterate internally, as long as it is fast).
(The example here is small; in reality I have to process many more records, hence the emphasis on performance.)
I could do it by getting a list of all regisseurs, iterating over it externally, then querying the list of movies for each regisseur and building the CSV.
However, this is not fast enough, and I do not know how to script this within jq itself.
Solution:
jq -r '.[] | .regisseur as $r | .movies[] | [$r, .] | @csv'
This outputs proper CSV:
"Steven Spielberg","War of the Worlds"
"Steven Spielberg","Jurassic Park"
"George Lucas","Howard The Duck"
"George Lucas","Hook"
For semicolon-separated output, you could pipe that into another tool such as Miller or csvkit, or, if you are willing to assume the input data contains no semicolons, use join instead:
jq -r '.[] | .regisseur as $r | .movies[] | [$r, .] | join(";")'
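Two jq-only variations on the semicolon filter may be worth considering; the following is a rough sketch rather than a definitive recipe. The function names plain_ssv and quoted_ssv are hypothetical helpers introduced only for illustration, and the sample data is inlined from the question. The first variant reproduces the layout asked for, trailing semicolon included, and still assumes no semicolons in the data. The second applies @csv to each field separately and joins the results with ";", so values containing semicolons or double quotes should stay unambiguous, at the cost of quoting every field.

```shell
# Sample input from the question, inlined so the sketch is self-contained.
json='[
  {"regisseur": "Steven Spielberg", "movies": ["War of the Worlds", "Jurassic Park"]},
  {"regisseur": "George Lucas", "movies": ["Howard The Duck", "Hook"]}
]'

# Hypothetical helper, variant 1: the exact layout asked for, with the
# trailing semicolon. String interpolation inserts the raw values, so
# nothing is quoted or escaped.
plain_ssv() {
  jq -r '.[] | .regisseur as $r | .movies[] | "\($r);\(.);"'
}

# Hypothetical helper, variant 2: @csv quotes each field on its own
# (doubling embedded double quotes), then the fields are joined with ";".
quoted_ssv() {
  jq -r '.[] | .regisseur as $r | .movies[] | [$r, .] | map([.] | @csv) | join(";")'
}

printf '%s' "$json" | plain_ssv
printf '%s' "$json" | quoted_ssv
```

Both variants are still a single jq invocation, so the loop over regisseurs and movies happens inside jq rather than in bash.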