FORUM Product Analytics Inner join vs Left join counts the same?

Inner join vs Left join counts the same?

Hey everyone,

I’m working on lesson 97, Left Join 101, and I tried counting rows for a query with inner join vs left join and got the same results. My understanding is that inner joins would return results with only matches, and left join will include rows with null values.

I would expect left joins, keeping all things else equal, would return greater results. Am I missing something?

SELECT COUNT(user_id)
FROM books b
LEFT JOIN books_users u
  ON u.book_id = b.id

-- returns  1667
SELECT COUNT(user_id)
FROM books b
INNER JOIN books_users u
  ON u.book_id = b.id

-- returns  1667

REPLIES

Hi @coachpacman 👋

That’s a great question, a cornerstone of Data Analysis with SQL. In my experience, once you get the intuition in the INNER vs LEFT JOIN topic everything else becomes trivial.

Let’s try to arrive at the deepest level of understanding step by step.

Step 1: left and right tables

You always join one table to another. I always imagine two rectangles put together: ⬛ ⬜.

You have your left table (⬛) and you join the right table to it (⬜). In your case ☝, your left table is the books table and your right table is the books_users table.

Step 2: join condition

Once we’ve decided on which tables to join, we need to define the join condition (ON ...).

The join condition is a boolean expression that returns either TRUE or FALSE. Here’re some examples:

The join condition is defined for all join types, it’ll be evaluated for all pairs of records. ⚠

Step 3: INNER JOIN behavior

Take another look at the illustration from the INNER JOIN lesson. The idea is that INNER JOIN joins two records (from the left and right tables) and keeps them in the result set only ⚠ when the join condition is true.

If the join condition is false, then neither left nor right records won’t make it to the result set.

That’s why INNER JOIN is always pictured as the intersection between 2 circles.

Step 4: LEFT JOIN behavior

Now let’s look at the LEFT JOIN lesson. What your intuition tells you? 💡

The LEFT JOIN keeps all records from the left table:

  • if the join condition is true, the right record will be joined to the left record and added to the result set
  • If the join condition is false, the right record won’t be joined BUT a blank record will be joined to the left record (all columns are NULL-s)

That’s why on the diagram the left circle is always full.

Step 5: picking the right left table

Haha no pan intended 🙈

When it comes to the INNER JOIN, it doesn’t matter which table is left or right – the result set will be the same in any case. Intersection between 2 circles won’t change when we switch them, right?

It matters a lot which table is the left table when using LEFT JOIN. Remember, all records ⚠ from the left table will be in the result set. That’s why we need to choose the left table based on what we’re calculating or researching.

Step 6: your example

First of all, a little reminder: the COUNT() function counts only non-NULL values. Try it out:

SELECT COUNT(1)

vs

SELECT COUNT(NULL)

That’s why these 2 queries will return 2 different results:

SELECT COUNT(user_id)
FROM books b
LEFT JOIN books_users u
  ON u.book_id = b.id
SELECT COUNT(*)
FROM books b
LEFT JOIN books_users u
  ON u.book_id = b.id

🔍 Because of the LEFT JOIN logic, we’ll have some NULL records from the books_users table (books that haven’t been read by any users).

I believe that was the source of your confusion, because if we’d use COUNT(*) the results would be different:

SELECT COUNT(*)
FROM books b
LEFT JOIN books_users u
  ON u.book_id = b.id

-- 1739

vs

SELECT COUNT(*)
FROM books b
INNER JOIN books_users u
  ON u.book_id = b.id

-- 1667

Step 7: homework

Now that we’re so deep into JOINs, I’d like you to try and change the left and right tables and play with the results:

SELECT COUNT(*)
FROM books_users u
INNER JOIN books b
  ON u.book_id = b.id

Also try to simply eyeball the result set and see if your intuition picks it up, that helps me a lot to detect patterns:

SELECT *
FROM books_users u
INNER JOIN books b
  ON u.book_id = b.id

I hope that was helpful, have fun! 👋

wow! what an awesome explanation. very thorough. good call on the nuance with the count function and the selector I chose along with the detail on inner vs left join

also, reassuring to hear that the intuition on the INNER vs LEFT JOIN functions is cornerstone for data analysis. that helps clarify such a big subject like SQL

WRITE A REPLY