FORUM SQL Habit Question 199

Question 199

Hi:) Why does the following syntax not work and is there a way to write it without CTEs?

SELECT
  COUNT(DISTINCT(user_id)),
  custom_parameters ->> 'ab_test_name' AS ab_test_name,
  custom_parameters ->> 'ab_test_variation' AS ab_test_variation,
  created_at AS categorized_at
FROM mobile_analytics.events
WHERE
  custom_parameters ->> 'ab_test_name' IS NOT NULL
  AND action = 'signup'
  AND  ab_test_name = 'longer_onboarding_201803'
GROUP BY ab_test_variation

REPLIES

Hi @mrv 👋

Great question, it’s definitely possible to write this query without CTEs. If you run your query you’ll see an error:

ERROR: column "ab_test_name" does not exist LINE 10: AND ab_test_name = 'longer_onboarding_201803' ^

This is because you can’t reference a column (column "ab_test_name" does not exist) you just defined inside the WHERE clause. Let’s fix it:

SELECT
  custom_parameters ->> 'ab_test_name' AS ab_test_name,
  custom_parameters ->> 'ab_test_variation' AS ab_test_variation,
  COUNT(DISTINCT(user_id))
FROM mobile_analytics.events
WHERE
  custom_parameters ->> 'ab_test_name' IS NOT NULL
  AND custom_parameters ->> 'ab_test_name' = 'longer_onboarding_201803'
  AND action = 'signup'
GROUP BY 1, 2

🔍 I’ve removed categorized_at column, I guess it made there by accident. We want to know how many users we had per AB-test variation, no need to pull in this datetime column.

🔍 Note how I referenced columns inside GROUP BY by their numbers – 1, 2. It’s simply to avoid duplication and if you want to group by another column (say, country) you can just change it in the SELECT.

As you can see, this query seems very bulky, CTEs allow you to save some duplication and make it more readable. 💡

Ping me if you have more questions 🚀

WRITE A REPLY