Tuesday, March 11, 2025
How to Find Duplicates in MongoDB

You can use the following syntax to find values that occur more than once in a field of a MongoDB collection:

db.collection.aggregate([
    {"$group" : { "_id": "$field1", "count": { "$sum": 1 } } },
    {"$match": {"_id" :{ "$ne" : null } , "count" : {"$gt": 1} } }, 
    {"$project": {"name" : "$_id", "_id" : 0} }
])

Here’s what this syntax does:

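  • Group all documents that have the same value in field1 and count the documents in each group
  • Match the groups whose value is not null and that contain more than one document
  • Project the duplicated value of each remaining group as name, dropping _id and count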

This particular query finds duplicate values in field1. To check a different field, replace $field1 with that field's name, keeping the $ prefix.
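Because only the field name changes from one query to the next, the pipeline can be generated by a small helper. The following is a minimal sketch in plain JavaScript, which mongosh can also run. The name duplicatesPipeline is hypothetical and is not part of MongoDB, and the sketch assumes a top-level field name.

```javascript
// Hypothetical helper (not a MongoDB API): builds the three-stage pipeline
// shown above for any top-level field name.
function duplicatesPipeline(field) {
  return [
    // Group documents by the value of the chosen field and count each group.
    { $group: { _id: "$" + field, count: { $sum: 1 } } },
    // Keep groups whose key is not null and that hold more than one document.
    { $match: { _id: { $ne: null }, count: { $gt: 1 } } },
    // Return the duplicated value as "name" and drop _id and count.
    { $project: { name: "$_id", _id: 0 } }
  ];
}

// Usage in mongosh:
// db.collection.aggregate(duplicatesPipeline("field1"))
```

Passing the field name as an argument avoids copying the pipeline by hand each time a different field needs checking.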

The following example shows how to use this syntax with a collection named teams that contains the following documents:

db.teams.insertOne({team: "Mavs", position: "Guard", points: 31})
db.teams.insertOne({team: "Mavs", position: "Guard", points: 22})
db.teams.insertOne({team: "Rockets", position: "Center", points: 19})
db.teams.insertOne({team: "Rockets", position: "Forward", points: 26})
db.teams.insertOne({team: "Cavs", position: "Guard", points: 33})

Example: Find Documents with Duplicate Values

We can use the following code to find all of the duplicate values in the ‘team’ field:

db.teams.aggregate([
    {"$group" : { "_id": "$team", "count": { "$sum": 1 } } },
    {"$match": {"_id" :{ "$ne" : null } , "count" : {"$gt": 1} } }, 
    {"$project": {"name" : "$_id", "_id" : 0} }
])

This query returns the following results (the order of the groups is not guaranteed):

{ name: 'Rockets' }
{ name: 'Mavs' }

This tells us that the values ‘Rockets’ and ‘Mavs’ occur multiple times in the ‘team’ field.
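To see why only these two values are returned, the three stages can be imitated in plain JavaScript on the same five documents. This is only an illustrative sketch: findDuplicateValues is a hypothetical helper, not a MongoDB function, and it runs in memory rather than against a database.

```javascript
// The same documents that were inserted into the teams collection.
const teams = [
  { team: "Mavs", position: "Guard", points: 31 },
  { team: "Mavs", position: "Guard", points: 22 },
  { team: "Rockets", position: "Center", points: 19 },
  { team: "Rockets", position: "Forward", points: 26 },
  { team: "Cavs", position: "Guard", points: 33 }
];

// Hypothetical in-memory imitation of $group -> $match -> $project.
function findDuplicateValues(docs, field) {
  const counts = new Map();
  for (const doc of docs) {
    // $group: count documents per value; a missing field is grouped under null.
    const key = doc[field] ?? null;
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  const result = [];
  for (const [key, count] of counts) {
    // $match: skip the null group and any value that occurs only once.
    if (key !== null && count > 1) {
      // $project: expose the duplicated value as "name".
      result.push({ name: key });
    }
  }
  return result;
}

console.log(findDuplicateValues(teams, "team"));
```

The sketch reports ‘Mavs’ and ‘Rockets’ and leaves out ‘Cavs’, which appears in only one document. It lists values in first-seen order, whereas $group in MongoDB makes no promise about the order of its output.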

To search for duplicate values in the ‘position’ field instead, change $team to $position:

db.teams.aggregate([
    {"$group" : { "_id": "$position", "count": { "$sum": 1 } } },
    {"$match": {"_id" :{ "$ne" : null } , "count" : {"$gt": 1} } }, 
    {"$project": {"name" : "$_id", "_id" : 0} }
])

This query returns the following results:

{ name: 'Guard' }

This tells us that ‘Guard’ occurs multiple times in the ‘position’ field.
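The queries above report only the duplicated values themselves. A possible variation, sketched here as an extension under stated assumptions rather than as part of the syntax above, keeps the count and collects the _id of every document in each group with $push, then sorts the output explicitly. The helper name duplicatesWithDetails and the output field ids are hypothetical.

```javascript
// Hypothetical helper: like the pipeline above, but also reports how many
// documents share each value and which documents they are.
function duplicatesWithDetails(field) {
  return [
    {
      $group: {
        _id: "$" + field,
        count: { $sum: 1 },
        // Collect the _id of every document that falls into the group.
        ids: { $push: "$_id" }
      }
    },
    // Keep non-null values that occur in more than one document.
    { $match: { _id: { $ne: null }, count: { $gt: 1 } } },
    // Keep the value (as "name"), the count, and the collected ids.
    { $project: { name: "$_id", count: 1, ids: 1, _id: 0 } },
    // $group does not guarantee output order, so sort explicitly.
    { $sort: { count: -1, name: 1 } }
  ];
}

// Usage in mongosh with the teams collection:
// db.teams.aggregate(duplicatesWithDetails("position"))
```

Having the ids at hand makes it possible to inspect or clean up the individual documents behind each duplicated value.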

Additional Resources

The following tutorials explain how to perform other common operations in MongoDB:

MongoDB: How to Add a New Field in a Collection
MongoDB: How to Group By and Count
MongoDB: How to Group By Multiple Fields
