From Exercise 66:
Let’s practice our newly learned skill of grouping and counting. Which email domain has the most amount of users at Bindle?
I am writing a code other way, by forming function. I am getting error. Could anyone help me to identify where I have mistaken?
WITH domain AS (
SELECT
SPLIT_PART (email, '@', 2) AS email_domain,
FROM users
)
SELECT
email_domain,
COUNT(*)
FROM domain
GROUP BY 1
ORDER BY 2 DESC
This might be a bit silly, I am new to SQL so pardon my simple queries. I was practicing calculating purchase rate in lesson 26 and I decided to create my own queries. The question my own query should answer is “What is the % of users who are trial users and signed up in the first 2 weeks of Bindle existance”.
So I wrote a query that calculates this (Query 1), and I wanted to check this with another query (Query 2) to see if the result is correct. I can’t figure out why I am missing 1% of the users when I run the 2nd query.
Calcualate % of the users who are trial users and signed up in the first 2 weeks of Bindle existance. This query produces a result of 36%:
SELECT
100 * COUNT(CASE WHEN status = 'trial' THEN id END) /
COUNT(*) AS trial_users_first_2_weeks_Jan_2018
FROM users
WHERE
signup_date BETWEEN '2018-01-01' AND '2018-01-14'
Below query should return 64% as the query above returned 36% but it returns 63% and I don’t know why.
SELECT
100 * COUNT(CASE WHEN status = 'customer' OR status = 'free' THEN id END) /
COUNT(*) AS non_trial_users_first_2_weeks_Jan_2018
FROM users
WHERE
signup_date BETWEEN '2018-01-01' AND '2018-01-14'
Below query shows all unique values for status column in the user table for the same time frame as previous queries. I used it to figure out how to structure second query.
SELECT DISTINCT status
FROM users
WHERE
signup_date BETWEEN '2018-01-01' AND '2018-01-14'
Thank you.
Hello, I have a questions and it is…
I have created_date
and resolved_date
as columns in tickets
table. In other table I have sat_calendar with all days since 2020-06-01 and I identified Sunday, Saturdays and Holidays in description
column.
But now I need to calculate difference between created_date
and resolve_date
in working_days.
I did new table with row_number to working_day but I couldn’t do put ‘the previuos’ number if the created or resolved date is some holiday.
Could someone help me please?
Hey everyone!
In lesson 107, there is this helpful nuance that if we adjust the where clause to included refunded customers in an ARPU calculation (as opposed to the AND clause in a LEFT JOIN) that you would get ARPPU.
I dbl checked that and found a small discrepancy. I just wanted to make sure my thinking here is right.
I updated the select lines to show tables and it looks like this claim would be true exccept there are more than 2 values for status. The second query shows a slight difference from the first becacuse there are “trial” and “free” users, I think.
Am I getting this right?
SELECT
-- SUM(amount) / COUNT(DISTINCT(u.id)) AS ARPPU
*
FROM users u
LEFT JOIN purchases p
ON u.id = p.user_id
AND refunded = FALSE
WHERE
status = 'customer'
-- arppu 58.5
SELECT
-- SUM(amount) / COUNT(DISTINCT(u.id)) AS ARPPU
*
FROM users u
LEFT JOIN purchases p
ON u.id = p.user_id
WHERE
refunded = FALSE
-- arppu 56.8
Hey everyone,
I’m working on lesson 97, Left Join 101, and I tried counting rows for a query with inner join vs left join and got the same results. My understanding is that inner joins would return results with only matches, and left join will include rows with null values.
I would expect left joins, keeping all things else equal, would return greater results. Am I missing something?
SELECT COUNT(user_id)
FROM books b
LEFT JOIN books_users u
ON u.book_id = b.id
-- returns 1667
SELECT COUNT(user_id)
FROM books b
INNER JOIN books_users u
ON u.book_id = b.id
-- returns 1667
I am exploring an alternative solution to the one shown for Exercise 204.
My question is, why can we not filter out refunded values within the WHERE
clause?
WITH ab_test_categorization AS (
SELECT
user_id,
custom_parameters ->> 'ab_test_name' AS ab_test_name,
custom_parameters ->> 'ab_test_variation' AS ab_test_variation,
created_at AS categorized_at
FROM mobile_analytics.events
WHERE
custom_parameters ->> 'ab_test_name' IS NOT NULL
AND action = 'signup'
), ab_test_stats AS (
SELECT
ab_test_variation AS variation,
COUNT(DISTINCT(c.user_id)) / SUM(p.amount) AS arpu
FROM ab_test_categorization c
LEFT JOIN purchases p
ON c.user_id = p.user_id
WHERE
ab_test_name = 'longer_onboarding_201803'
AND p.refunded = FALSE
GROUP BY 1
)
SELECT
variation,
arpu
FROM ab_test_stats
Are there any performance differences between INNER JOIN and LEFT JOIN? For instance, this following subquery from lesson 207 could have been written with an INNER JOIN.
WITH user_activity AS (
SELECT
u.user_id,
u.created_at::date AS signup_date,
e.created_at::date AS activity_date,
COUNT(*) AS events_counts
FROM mobile_analytics.events u
LEFT JOIN mobile_analytics.events e
ON e.user_id = u.user_id
WHERE
u.action = 'signup'
GROUP BY 1, 2, 3
ORDER BY signup_date ASC, user_id ASC
)
SELECT *
FROM user_activity
WITH spends AS (
SELECT
utm_campaign,
SUM(amount) AS total_spend
FROM marketing_spends
GROUP BY 1
), total_customers AS (
SELECT
utm_campaign,
COUNT(*) AS customers_count
FROM users
WHERE
utm_campaign IS NOT NULL
AND status = 'customer'
GROUP BY 1
)
SELECT
s.utm_campaign,
total_spend / customers_count AS CAC
FROM spends s
LEFT JOIN total_customers u
ON s.utm_campaign = u.utm_campaign
ORDER BY 2 ASC NULLS LAST
I think the utm_campaign IS NOT NULL
filter used here is redundant because we’re doing a LEFT JOIN
from the spends CTE and the marketing_spends table has no NULLs under its utm_campaign column. It would, however, make a difference if we did the join the other way round.
I’m in lesson 110. CPA Cost per acquisition
In this lesson appears the following query:
WITH spends AS (
SELECT
utm_source,
SUM(amount) AS total_spend
FROM marketing_spends
GROUP BY 1
), total_users AS (
SELECT
utm_source,
COUNT(*) AS users_count
FROM users
WHERE
utm_source IS NOT NULL
GROUP BY 1
)
SELECT
s.utm_source,
total_spend / users_count AS CPA
FROM spends s
INNER JOIN total_users u
ON s.utm_source = u.utm_source
However, I tried to do the exercise by my own using the following query:
SELECT ms.utm_source, SUM(amount), COUNT(DISTINCT(u.id))
FROM marketing_spends ms
LEFT JOIN users u
ON u.utm_source = ms.utm_source
GROUP BY 1
But I get different results for the amount spent, I don’t understand why it is, the users count is correct.