MongoDB remove duplicates from a nested array

I have a users collection with the following structure:

[
  {
    name: "xxx",
    labels: [
      {
        category: "Language",
        values: ["English", "Spanish"],
      },
      {
        category: "Hobby",
        values: ["Read", "Cook", "Read"],
      },
    ]
  },
  {
    name: "yyy",
    labels: [
      {
        category: "Language",
        values: ["English", "English"],
      },
      {
        category: "Hobby",
        values: ["Read", "Play", "Play"],
      },
    ]
  },
]

I want to remove all duplicates from each values array, so that the result is:

[
  {
    name: "xxx",
    labels: [
      {
        category: "Language",
        values: ["English", "Spanish"],
      },
      {
        category: "Hobby",
        values: ["Read", "Cook"],
      },
    ]
  },
  {
    name: "yyy",
    labels: [
      {
        category: "Language",
        values: ["English"],
      },
      {
        category: "Hobby",
        values: ["Read", "Play"],
      },
    ]
  },
]

I tried to use $setUnion and $setIntersection, but I couldn't work out the right way to use them with a nested array.

Solution:

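On MongoDB 4.2+, you can do this in a single update that uses an aggregation pipeline. $map walks the labels array, $mergeObjects keeps each label's other fields, and $setUnion returns values with the duplicates removed. Note that $setUnion does not guarantee the order of the elements in the resulting array: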

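// "users" is the collection named in the question.
db.users.updateMany(
  {}, // empty filter: match every document
  [
    {
      $set: {
        labels: {
          // Rebuild the labels array one element at a time.
          $map: {
            input: "$labels",
            in: {
              // Keep the label's other fields and overwrite only `values`.
              $mergeObjects: [
                "$$this",
                {
                  // Treating the array as a set drops duplicate entries.
                  values: { $setUnion: "$$this.values" }
                }
              ]
            }
          }
        }
      }
    }
  ]
)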

On older MongoDB versions, you'll have to read each document into application memory, remove the duplicates in code, and then update each document separately.
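
The in-code approach for older versions could look roughly like the sketch below. It is only an illustration, not part of the original answer: the helper names dedupeLabels and buildDedupeOps are hypothetical, and it assumes every document has an _id and that each label carries a values array. Using a Set keeps the first occurrence of each value, so the original order is preserved.

```javascript
// Hypothetical helper: remove duplicate entries from every label's `values`
// array. A Set keeps insertion order, so the first occurrence of each value wins.
// Assumes each label has a `values` array.
function dedupeLabels(labels) {
  return labels.map((label) => ({
    ...label,
    values: [...new Set(label.values)],
  }));
}

// Hypothetical helper: build one updateOne operation per document whose
// labels actually change. `docs` is assumed to be an array of documents
// that were already read from the collection.
function buildDedupeOps(docs) {
  const ops = [];
  for (const doc of docs) {
    const original = doc.labels || [];
    const cleaned = dedupeLabels(original);
    // A label changed if deduplication made its `values` array shorter.
    const changed = cleaned.some(
      (label, i) => label.values.length !== original[i].values.length
    );
    if (changed) {
      ops.push({
        updateOne: {
          filter: { _id: doc._id },
          update: { $set: { labels: cleaned } },
        },
      });
    }
  }
  return ops;
}

// Possible usage in the mongo shell (assumes the `users` collection from the
// question; bulkWrite rejects an empty batch, hence the length check):
//   const ops = buildDedupeOps(db.users.find().toArray());
//   if (ops.length > 0) db.users.bulkWrite(ops);
```

Building the operations first and sending them in one bulkWrite call still updates each document individually, but avoids one round trip per document. For a large collection, reading the documents in batches rather than with a single toArray() call would keep memory use bounded.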
