MongoDB — Unions 🇬🇧

Ido Montekyo
idomongo
Published in
2 min readJan 23, 2022

--

Image from Wikipedia

In set theory, the union (denoted by ∪) of a collection of sets is the set of all elements in the collection.

In SQL the UNION clause combines the results of two SQL queries into a single table of all matching rows. The two queries must result in the same number of columns and compatible data types in order to unite. Any duplicate records are automatically removed unless UNION ALL is used.

Can we do a UNION in MongoDB?

We sure can! as of Version 4.4, we have $unionWith.

Let’s take a closer look using this friends dataset:

{
“Year_of_prod”: 1995,
“Season”: 2,
“Episode Number”: 9,
“Episode_Title”: “The One with Phoebe’s Dad”,
“Duration”: 22,
“Summary”: “Phoebe tracks down her father, but isn’t sure whether she should meet him. Ross asks Rachel to make a list of pros and cons about him.”,
“Director”: “Kevin Bright”,
“Stars”: 8,
“Votes”: 3444
}
  • This episodes collection has a document per episode in Friends
  • Each document has an Episode_Title field which is usually something like: “The one where…” or “The one with…”

The One Where VS. The One With

Now, Let’s query how many episodes there are of each type.

The following aggregation pipeline counts how many episodes had the title “The One Where” using $regex and $group:

[{$match: {
Episode_Title: {
$regex: 'The One Where'
}
}}, {$group: {
_id: 'where',
count: {
$sum: 1
}
}}]

The result:

_id: “where”
count: 54

Now, we can use $unionWith

New in version 4.4.

Performs a union of two collections; i.e. $unionWith combines pipeline results from two collections into a single result set. The stage outputs the combined result set (including duplicates) to the next stage.

I’ll duplicate the previous pipeline and change the strings to “The One with” accordingly:

[{$match: {
Episode_Title: {
$regex: 'The One with'
}
}}, {$group: {
_id: 'with',
count: {
$sum: 1
}
}}]

Finally, we’ll paste into the $unionWith step in the pipeline statement:

[{$match: {
Episode_Title: {
$regex: 'The One Where'
}
}}, {$group: {
_id: 'where',
count: {
$sum: 1
}
}}, {$unionWith: {
coll: 'episodes',
pipeline: [
{
$match: {
Episode_Title: {
$regex: 'The One with'
}
}
},
{
$group: {
_id: 'with',
count: {
$sum: 1
}
}
}
]
}
}]

The result is a union of both queries:

_id: “where”
count: 54
_id: “with”
count: 163

🍿

--

--

Ido Montekyo
idomongo

System Analysis. System Design. Architecture. Databases. Project Management. Speaker. People Motivator.