MongoDB Aggregation Pipeline Challenge 1 Solution
This is an awesome challenge that my boss Justin sent me and thought it would be a fun one for you all.
Disclaimer: there are MANY ways to solve this challenge this is just Justin’s solution to this challenge
TLDR: If all you want to see is the aggregate solution it is in the franchiseeCustomerSatScoreBreakdown function
The Problem
Big Ice Cream Co is the largest ice cream franchisor in the US. Each of their 100+ franchisees has between 1 and 25 locations. Each location tracks its customer satisfaction score which can be below average, average, or above average. The CEO of Big Ice Cream Co wants to see a report that shows the customer satisfaction scores for each location grouped by franchisee.
Example:
| franchiseeId | franchiseeName | belowAverage | average | aboveAverage | totalLocations |
|---------------------|---------------:|--------------|---------|--------------|----------------||
search | True | 1 | 10 | 14 | 25 ||
institutionCodes | False | 10 | 3 | 0 | 13 ||
aggregatorPartnerId | True | 0 | 3 | 0 | 3 |
This data is stored in the franchising MongoDB, there are two collections:
1. franchisee
2. locations
Franchisee schema
{ companyName: { type: String }, signedDate: { type: Date }, owner: { type: String } }
Locations schema
{ franchiseeId: { type: ObjectId }, address: { state: { type: String }, city: { type: String }, zip: { type: String }, streetAddress: { type: String }, secondaryAddress: { type: String } }, customerSatisfactionScore: { type: String, enum : ['below average','average', 'above average'] } }
The Solution
This solution is in JavaScript but there are many languages you can do this in.
First, let’s set up the basic file of everything we are going to need. getSatScoreCount and franchiseeCustomerSatScoreBreakdown are going to be helper functions and the customerSatBreakdown is going to be the main function that will set up the database and return the data. If you would like me to do a post on how to set up a JavaScript file like this let me know in the comments I would happy to do so!
This will set up the mongo client, get the data from the helper functions and create a CSV of the data we needed.
const fs = require('fs') const { promisify } = require('util') const { Parser } = require('json2csv') const dotenv = require('dotenv-extended') const { ObjectId, MongoClient } = require('mongodb') const writeFileAsync = promisify(fs.writeFile) const getSatScoreCount = (result, scoreName) => {} const franchiseeCustomerSatScoreBreakdown = async ({ franchisingDb }) => {} const customerSatBreakdown = async function () { try { // connect to db const db_url = process.env.DB_URL const dbClient = await MongoClient.connect(db_url, { useUnifiedTopology: true }) const franchisingDb = dbClient.db('franchising') const data = await franchiseeCustomerSatScoreBreakdown({ franchisingDb }) const fields = Object.keys(data[0]).map(key => { return { label: key, value: key } }) const json2csvParser = new Parser({ fields }); const csvData = json2csvParser.parse(data); // write to JSON file await writeFileAsync('franchising-customer-sat-scores.csv', csvData, 'utf8') process.exit(0) } catch (error) { console.error(error.message) process.exit(1) } } dotenv.load({ silent: false, path: '.env', errorOnMissing: false, errorOnExtra: false }) customerSatBreakdown()
Now let’s put together the bread and butter, the aggregate. We are going to do an aggregate on the collection of the location of the database.
First, we are going to need to lookup. A lookup is a left join in mongo. If you don’t know what that is it takes the data from the left table and what is the same from the right table. Here is an image that may help understand better as well.
We are going to be doing this join on the frachisee table to get the franchisee with their location.
const franchiseeCustomerSatScoreBreakdown = async ({ franchisingDb }) => { const locationsCollection = franchisingDb.collection('locations') const results = await locationsCollection.aggregate([ { $lookup: { from: "franchisee", localField: "franchiseeId", foreignField: "_id", as: "franchiseeInfo" } } ])
We then want to “unwind” the franchiseeInfo. This will destructure the franchisee info object. If we don’t have this when you get the data the franchiseeInfo will look something like this: franchiseeInfo: [ [Object] ]. When we add unwind it will look something like this:
franchiseeInfo: { _id: 6015a1f19ab3e80b64d50508, companyName: 'Waelchi Inc', signedDate: 2020-12-29T19:26:18.097Z, owner: 'Catherine Kozey' }
our aggregate will now look like this:
const franchiseeCustomerSatScoreBreakdown = async ({ franchisingDb }) => { const locationsCollection = franchisingDb.collection('locations') const results = await locationsCollection.aggregate([ { $lookup: { from: "franchisee", localField: "franchiseeId", foreignField: "_id", as: "franchiseeInfo" } }, { $unwind: "$franchiseeInfo" } ])
We are going to have 2 groups. The first group is going to be grouping all the locations with their customer satisfaction scores. This grouping will get the name of the franchisee company name, customer satisfaction score, franchisee id, and the count of the locations with that satisfaction score.
{ $group:{ _id:{ franchiseeId:'$franchiseeId', satScore: '$customerSatisfactionScore', franchiseeName: '$franchiseeInfo.companyName' }, locationsCount: { $sum: 1 } }},
The second group is going to group each of the scores with the franchisee along with a total number of locations. This will make it from multiple records broken up for each franchisee to all of the sums of the scores for the franchisees and the total count all in the same record.
{ $group: { _id: { franchiseeId: "$_id.franchiseeId", franchiseeName: '$_id.franchiseeName' }, scores: { $push: { satScore: "$_id.satScore", count: "$locationsCount" }, }, count: { $sum: "$locationsCount" } }}
Then we are going to throw that in an array. If you don’t do this your return type will just be an aggregate cursor (cursor to the document) and it won’t make any sense or be the data you want. By adding a toArray() you are pushing all that data into an array that you can actually use.
const franchiseeCustomerSatScoreBreakdown = async ({ franchisingDb }) => { const locationsCollection = franchisingDb.collection('locations') const results = await locationsCollection.aggregate([ { $lookup: { from: "franchisee", localField: "franchiseeId", foreignField: "_id", as: "franchiseeInfo" } }, { $unwind: "$franchiseeInfo" }, { $group:{ _id:{ franchiseeId:'$franchiseeId', satScore: '$customerSatisfactionScore', franchiseeName: '$franchiseeInfo.companyName' }, locationsCount: { $sum: 1 } }}, { $group: { _id: { franchiseeId: "$_id.franchiseeId", franchiseeName: '$_id.franchiseeName' }, scores: { $push: { satScore: "$_id.satScore", count: "$locationsCount" }, }, count: { $sum: "$locationsCount" } }} ]).toArray()
The rest of this is going to be how we are making the data able to be returned in aa csv with JavaScript.
Before we are going to return the data we want to get the scores from the scores group destructured. For this we are going to map over data get structure it how we want for the csv. For this we are going to use a helper function to pull the score out for each type of score.
const getSatScoreCount = (result, scoreName) => { const targetIndex = result.scores.findIndex(s => s.satScore === scoreName) return targetIndex !== -1 ? result.scores[targetIndex].count : 0 }
this is what our map is going to looks like as well.
return results.map(result => { return { franchiseeId: result._id.franchiseeId, franchiseeName: result._id.franchiseeName, belowAverage: getSatScoreCount(result, 'below average'), average: getSatScoreCount(result, 'average'), aboveAverage: getSatScoreCount(result, 'above average'), totalLocations: result.count } })
and thats it! this will be the full file.
const fs = require('fs') const { promisify } = require('util') const { Parser } = require('json2csv') const dotenv = require('dotenv-extended') const { ObjectId, MongoClient } = require('mongodb') const writeFileAsync = promisify(fs.writeFile) const getSatScoreCount = (result, scoreName) => { const targetIndex = result.scores.findIndex(s => s.satScore === scoreName) return targetIndex !== -1 ? result.scores[targetIndex].count : 0 } const franchiseeCustomerSatScoreBreakdown = async ({ franchisingDb }) => { const locationsCollection = franchisingDb.collection('locations') const results = await locationsCollection.aggregate([ { $lookup: { from: "franchisee", localField: "franchiseeId", foreignField: "_id", as: "franchiseeInfo" } }, { $unwind: "$franchiseeInfo" }, { $group:{ _id:{ franchiseeId:'$franchiseeId', satScore: '$customerSatisfactionScore', franchiseeName: '$franchiseeInfo.companyName' }, locationsCount: { $sum: 1 } }}, { $group: { _id: { franchiseeId: "$_id.franchiseeId", franchiseeName: '$_id.franchiseeName' }, scores: { $push: { satScore: "$_id.satScore", count: "$locationsCount" }, }, count: { $sum: "$locationsCount" } }} ]).toArray() console.log(results) return results.map(result => { return { franchiseeId: result._id.franchiseeId, franchiseeName: result._id.franchiseeName, belowAverage: getSatScoreCount(result, 'below average'), average: getSatScoreCount(result, 'average'), aboveAverage: getSatScoreCount(result, 'above average'), totalLocations: result.count } }) } const customerSatBreakdown = async function () { try { // connect to db const db_url = process.env.DB_URL const dbClient = await MongoClient.connect(db_url, { useUnifiedTopology: true }) const franchisingDb = dbClient.db('franchising') const data = await franchiseeCustomerSatScoreBreakdown({ franchisingDb }) const fields = Object.keys(data[0]).map(key => { return { label: key, value: key } }) const json2csvParser = new Parser({ fields }); const csvData = json2csvParser.parse(data); // write to JSON file await writeFileAsync('franchising-customer-sat-scores.csv', csvData, 'utf8') process.exit(0) } catch (error) { console.error(error.message) process.exit(1) } } dotenv.load({ silent: false, path: '.env', errorOnMissing: false, errorOnExtra: false }) customerSatBreakdown()
I hope you all enjoyed this one! If there was something that you would like a more in depth article about or something you need more explanation on please let me know in the comments below! If you would like to get an email when new articles are released sign up for the newsletter below!