If you have this collection of objects:
{ "a": 10, "b": 20, "c": 30 }
{ "a": 11, "b": 20, "c": 31 }
{ "a": 10, "b": 20, "c": 31 }
-
There is a way to get distinct values, for example, for field "a":
[10, 11] -
There is also a way to get distinct values of any tuple, for example, for pairs of ("b", "c"):
[ {"b": 20, "c": 30}, {"b": 20, "c": 31} ]
Is there a way to query distinct values for each field individually in a single query?
For example, I can simply use query 1 above 3 times for "a", "b", "c":
[10, 11]
[20]
[30, 31]
But I guess it might be less efficient and there should be a better option.
Bonus: How to do it if the list of fields is not known upfront?
Ideally, the single query should return all keys and their distinct values:
{
"a": [10, 11],
"b": [20],
"c": [30, 31]
}
>Solution :
Assuming you don’t know the full list of the fields beforehand, you need to use $objectToArray to convert the $$ROOT document into an array of k-v tuples. Then group by the field name and $addToSet the values.
db.collection.aggregate([
{
"$project": {
_id: 0,
arr: {
"$objectToArray": "$$ROOT"
}
}
},
{
"$unwind": "$arr"
},
{
$match: {
"arr.k": {
$ne: "_id"
}
}
},
{
$group: {
_id: "$arr.k",
values: {
"$addToSet": "$arr.v"
}
}
}
])