I recently took a course in Microsoft PowerQuery, a visual user interface to data wrangling. It’s nowhere as powerful as Python’s pandas for example, but it does provide a convenient way for analysts without coding skills to clean small to medium-sized datasets. One aspect I found to be really intuitive is how joins are performed.
Audience: This blog post is written for people who want to know how to perform joins in PowerQuery.
In the following example, I join two ficticious data sets of a fictitious restaurant. The first data set gives the orders per table. The second data set contains the menu, i.e. the prices of everything that can be ordered by the tables. They can easily be imported using the Get Data > From File > From Workbook menu item.
- Don’t load the first data set, rather choose to Transform it.
2. Once the data set is opened in the Query Editor, go to Close & Load > Close & Load to.
3. You should only create the connection, as we aren’t interested in this data set. We want a joined version of our two data sets.
Next, retake step 1 again for the other data set. Once it is active in the editor, go to Merge Queries. You can choose to work in this connection or create a new connection by selecting Merge Queries as New.
First, you can select the columns within each data set to join on. And then there is the Join Kind menu. This is where I truly appreciate Microsoft going for completeness. It’s a menu where you kan choose the kind of you need. And it has a really straightforward explanation of what this kind of join actually does.
I basically started my career with this visual representation. In my opinion, Microsoft did a really great job explaining the different kinds of joins in one line in a single menu.
Once you finish the join by pressing Ok, the second table gets merged with the first. You can expand the columns in the second table, by clicking the icon in the top right corner of the column.
Finally, let’s make this useful and calculate the value per table.
Which gives us this nice aggregated table.