Calculate max value for list of subdocuments based on reference map

I have a document structure like this:

[
    {
        "country": "UK",
        "shops": [
            {"city": "London", "fruits": ["banana", "apple"]},
            {"city": "Birmingham", "fruits": ["banana", "pineapple"]},
        ],
    },
    {
        "country": "DE",
        "shops": [
            {"city": "Munich", "fruits": ["banana", "strawberry"]},
            {"city": "Berlin", "fruits": ["kiwi", "pineapple"]},
        ],
    },
]

In my Python script, I have a dict that maps each category value to the fruits belonging to it:

categories = {
    1: ["apple"],
    2: ["banana", "kiwi"],
    3: ["pineapple", "strawberry"]
}

DESIRED OUTPUT:

Using the MongoDB aggregation framework, I would like to add a max_category field to each document. Its value should be the highest category, according to the mapping above, of any fruit found in the document's shops sub-documents.

[
    {
        "country": "UK",
        "shops": [
            {"city": "London", "fruits": ["banana", "apple"]},
            {"city": "Birmingham", "fruits": ["banana", "pineapple"]},
        ],
        "max_category": 3
    },
    {
        "country": "DE",
        "shops": [
            {"city": "Munich", "fruits": ["banana", "apple"]},
            {"city": "Berlin", "fruits": ["kiwi", "apple"]},
        ],
        "max_category": 2
    },
]

Thanks for your help!

Solution:

Your categories dict cannot be used in MongoDB as-is, because MongoDB requires object keys to be strings and your keys are integers. You can wrangle the dict into the form below for easier processing:

[
  {
    category: 1,
    fruits: [
      "apple"
    ]
  },
  {
    category: 2,
    fruits: [
      "banana",
      "kiwi"
    ]
  },
  {
    category: 3,
    fruits: [
      "pineapple",
      "strawberry"
    ]
  }
]
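The wrangling step can be done on the Python side before the pipeline is built. Below is a minimal sketch, assuming the categories dict from the question; the helper name wrangle_categories is made up for illustration.

```python
# Category mapping from the question: category value -> fruits in that category.
categories = {
    1: ["apple"],
    2: ["banana", "kiwi"],
    3: ["pineapple", "strawberry"],
}


def wrangle_categories(mapping):
    """Turn {category: [fruits]} into a list of {"category", "fruits"} dicts.

    Hypothetical helper: the integer category moves from the key position
    into a value, so every key in the result is a string.
    """
    return [
        {"category": category, "fruits": list(fruits)}
        for category, fruits in sorted(mapping.items())
    ]


category_docs = wrangle_categories(categories)
print(category_docs)
```

Sorting by category is not required for correctness, since the pipeline compares each category against the running maximum; it only keeps the generated array stable and easy to read.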

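In the aggregation pipeline, $unwind the shops array, then use $reduce to iterate over the wrangled category array: the accumulator is updated whenever a category is larger than the current value and shares at least one fruit with the shop. Finally, $group by _id to rebuild each document and take the $max over its shops.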

db.collection.aggregate([
  {
    "$unwind": "$shops"
  },
  {
    "$set": {
      "max_category": {
        "$reduce": {
          "input": [
            {
              category: 1,
              fruits: [
                "apple"
              ]
            },
            {
              category: 2,
              fruits: [
                "banana",
                "kiwi"
              ]
            },
            {
              category: 3,
              fruits: [
                "pineapple",
                "strawberry"
              ]
            }
          ],
          "initialValue": null,
          "in": {
            "$cond": {
              "if": {
                $and: [
                  {
                    $gt: [
                      "$$this.category",
                      "$$value"
                    ]
                  },
                  {
                    $gt: [
                      {
                        $size: {
                          "$setIntersection": [
                            "$$this.fruits",
                            "$shops.fruits"
                          ]
                        }
                      },
                      0
                    ]
                  }
                ]
              },
              "then": "$$this.category",
              "else": "$$value"
            }
          }
        }
      }
    }
  },
  {
    "$group": {
      "_id": "$_id",
      "country": {
        $first: "$country"
      },
      "max_category": {
        $max: "$max_category"
      },
      "shops": {
        "$push": "$shops"
      }
    }
  }
])
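Since the question mentions a Python script, the same pipeline can be generated from the dict instead of being typed out by hand. This is only a sketch: build_pipeline is a hypothetical helper, and the commented-out call assumes a pymongo Collection object named collection that is not defined here.

```python
categories = {
    1: ["apple"],
    2: ["banana", "kiwi"],
    3: ["pineapple", "strawberry"],
}


def build_pipeline(mapping):
    """Build the $unwind / $set / $group pipeline from a {category: [fruits]} dict."""
    # Wrangle the dict into the array form that $reduce iterates over.
    category_docs = [
        {"category": category, "fruits": list(fruits)}
        for category, fruits in sorted(mapping.items())
    ]
    # True when this category is larger than the accumulator so far.
    is_larger = {"$gt": ["$$this.category", "$$value"]}
    # True when the current shop sells at least one fruit of this category.
    matches_shop = {
        "$gt": [
            {"$size": {"$setIntersection": ["$$this.fruits", "$shops.fruits"]}},
            0,
        ]
    }
    return [
        {"$unwind": "$shops"},
        # The $set stage needs MongoDB 4.2 or later.
        {"$set": {"max_category": {"$reduce": {
            "input": category_docs,
            "initialValue": None,
            "in": {"$cond": {
                "if": {"$and": [is_larger, matches_shop]},
                "then": "$$this.category",
                "else": "$$value",
            }},
        }}}},
        # Rebuild one document per _id and keep the largest per-shop value.
        {"$group": {
            "_id": "$_id",
            "country": {"$first": "$country"},
            "max_category": {"$max": "$max_category"},
            "shops": {"$push": "$shops"},
        }},
    ]


pipeline = build_pipeline(categories)
# Assumed usage with pymongo (collection is hypothetical here):
# results = list(collection.aggregate(pipeline))
```

A shop whose fruits match no category keeps the initial null value, and $max ignores nulls when grouping, so such a shop does not affect the result for its document.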

