Top 5 SQL JOIN FAQs for Data Science Interview

Suraj Gurav 25 Jul, 2022 • 5 min read

This article was published as a part of the Data Science Blogathon.

Introduction

SQL stands for Structured Query Language

As its name suggests, SQL follows a structured pattern to query or fetch data from a relational database. In SQL, JOINs are powerful tools that make it easier to get the data stored in different tables.

In the real world, companies usually store the data on different small tables. So getting the required data for analysis essentially needs to go through the process of connecting different tables. Hence, interviewers always want to know how efficiently you can use SQL JOINs. And that’s why in data science job interviews, JOINs are always favorite topics.

SQL JOIN
Source: https://www.pexels.com/photo/black-magnifying-glass-221164/

I listed here the 5 most frequently asked interview questions on SQL JOINs and how you can answer them. I gathered them through my interview experience at different companies.

Let’s start…

What are SQL JOIN and its use?

This is the most straightforward interview question.

The JOIN command in SQL is used to combine data from two or more tables based on the common columns between these tables.

When the data you need resides in different tables, SQL JOIN allows you to combine these tables in one or more common columns. And then you can select the records, which may or may not have columns from both the tables.

What are the different types of SQL JOINs?

Based on how you would like to combine two or more tables and select records, there are 4 fundamental types of JOINs in SQL.

  • INNER JOIN: It combines two tables so that the common columns on which the join operation is performed have matching values in both the tables. Therefore it is used to retrieve records that are common in both tables. Instead of writing INNER JOIN, you can write only JOIN. By default, it will be taken as INNER JOIN.
  • LEFT JOIN: This is used to retrieve all records from the left table and the matching records from the right table. However, when there is no matching row in the right table, this type of JOIN returns NULL for certain rows in the left table. This join is also called a LEFT OUTER JOIN.
  • RIGHT JOIN: It is used to retrieve all records from the right table and the matching records from the left table. Similar to the previous join type, for certain records in the right table, when there is no matching row in the left table, this type of JOIN returns NULL. This join is also called a RIGHT OUTER JOIN.
  • FULL JOIN: As its name suggests, it is used to retrieve all the records from both the tables. So the result set of this join will contain all the rows from both the tables. When there is no matching row in the left or right table, a NULL value is returned at that position. This join is also called a FULL OUTER JOIN

Here instead of stating only the types of JOINs, you should try to explain them in 1–2 sentences. This certainly gives the interviewer an idea that you know more about the joins.

What is CROSS JOIN in SQL?

This returns all the possible combinations of two tables. So each row of the first table combines with each row of the second table.

Ultimately, the total number of records returned by CROSS JOIN is essentially the multiplication of the number of records in both the tables, if no WHERE a clause is used.

The result generated from this JOIN is called the Cartesian Product of the two tables.

Unlike other types of joins, you do not need to mention the ON clause in CROSS JOIN. It can be as simple as shown below.

SELECT t1.column1, t1.column2, t1.column3
FROM Table1 AS t1
CROSS JOIN Table2 AS t2

What is SELF JOIN in SQL, and when do you use it?

This is used to combine a table with itself. It might sound confusing if you are a beginner. However, it is beneficial for comparing rows within the same table.

Remember, there is no SELF JOIN keyword, so you just use JOIN where both tables involved in the join are the same table. As both the table names are the same, it is essential to use the table alias in the case of SELF JOIN. ✅

Write a SQL query that finds out employees who earn more than their managers
— One of the most frequently asked interview question on
SELF JOIN

let’s take this as an example and create a Dummy_Employees dataset as below.

SQL JOIN

Dummy employees dataset | Image by Author

And try to get find out which employees handle more orders than their manager using this query,

SELECT t1.EmployeeName, t1.TotalOrders
FROM Dummy_Employees AS t1
JOIN Dummy_Employees AS t2
ON t1.ManagerID = t2.EmployeeID
WHERE t1.TotalOrders > t2.TotalOrders
result set

As expected, it returned employees — Abdul and Maria — who handled more orders than their manager — Pablo.

I got this question in almost 80% of the interviews I faced. So, it is the classic use-case of SELF JOIN.

You can read more about SELF JOIN and its use in one of my most-read articles –>

And last but not least,

How do you JOIN more than two tables in SQL?

Combining multiple tables is the same as using JOIN between two tables.

This type of join requires consecutive JOIN operations → first, you join the first and the second table and get an intermediate result set; then, you join another table to this intermediate table.

Let’s see how this works using the example below.

Suppose you want to retrieve orderID, sales in USD, and quantity which are present in three different tables as shown below.

dummy tables | SQL JOIN
Dummy tables | Image by Author

And orderID is the common column between them. So, the JOIN operation will be performed on this column as shown below.

SELECT orders.orderID,
       sales.salesUSD,
       products.quantity
FROM orders
INNER JOIN sales
        ON orders.orderID = sales.orderID
INNER JOIN products 
        ON orders.orderID = products.orderID

This gives only a single row output, as you used INNER JOIN , and only orderID = A1234 is present in all the tables.

Join multiple tables | Image by Author

Likewise, you can use other types of JOINs as well!

Conclusion

I hope you found this article useful to know and understand interview questions about SQL JOINs. Knowing different types of JOINs and their uses will certainly assist you in acing the job interview.

Ultimately, you learned in this article –

  • What does SQL JOIN mean, and what are its types
  • How does SELF JOIN be different from other types of JOINs
  • How CROSS JOIN works in SQL
  • How to JOIN 2 or more tables using different types of JOINs
  • How to answer the SQL JOIN questions in the interview

Along with basic types of JOINs, you now have a more conceptual understanding of CROSS JOINs and SELF JOINs.

If you found this piece of article useful, feel free to join me on Medium & Sign up for my E-mail list

Thank you for reading!

The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.

Suraj Gurav 25 Jul 2022

Frequently Asked Questions

Lorem ipsum dolor sit amet, consectetur adipiscing elit,

Responses From Readers

Clear

Related Courses

image.name
0 Hrs 36 Lessons
4.97

Top Data Science Projects for Analysts and Data Scientists

Free

Recommended for you
  • [tta_listen_btn class="listen"]