Hi Anatoli!
My query did take me to the right answer but the CTR values werenāt the same as your query.
My query was like this:
SELECT
device_type,
(100.0 * count(distinct s.pageview_id) / count(distinct p.visitor_id)) ctr
FROM web_analytics.pageviews p
LEFT JOIN (select pageview_id from web_analytics.events where category = 'Signup Button' and action = 'Click') s
ON p.pageview_id = s.pageview_id
where p.url like '%books%'
group by 1
order by 2 desc
Could you please explain what Iām missing? Thank you!
When I run Query 1 query, I expected the query will return me only rows with refunded = False
. But it also returns rows with refunded = Null
.
SELECT *
FROM users u
LEFT JOIN purchases p
ON u.id = p.user_id
AND refunded = FALSE
But when I run query 2, it returns me rows with refunded = false
(no null rows):
SELECT u.id, status, refunded
FROM users u
LEFT JOIN purchases p
ON u.id = p.user_id
WHERE
refunded = FALSE
Can someone please explain me this?
This might be a bit silly, I am new to SQL so pardon my simple queries. I was practicing calculating purchase rate in lesson 26 and I decided to create my own queries. The question my own query should answer is āWhat is the % of users who are trial users and signed up in the first 2 weeks of Bindle existanceā.
So I wrote a query that calculates this (Query 1), and I wanted to check this with another query (Query 2) to see if the result is correct. I canāt figure out why I am missing 1% of the users when I run the 2nd query.
Calcualate % of the users who are trial users and signed up in the first 2 weeks of Bindle existance. This query produces a result of 36%:
SELECT
100 * COUNT(CASE WHEN status = 'trial' THEN id END) /
COUNT(*) AS trial_users_first_2_weeks_Jan_2018
FROM users
WHERE
signup_date BETWEEN '2018-01-01' AND '2018-01-14'
Below query should return 64% as the query above returned 36% but it returns 63% and I donāt know why.
SELECT
100 * COUNT(CASE WHEN status = 'customer' OR status = 'free' THEN id END) /
COUNT(*) AS non_trial_users_first_2_weeks_Jan_2018
FROM users
WHERE
signup_date BETWEEN '2018-01-01' AND '2018-01-14'
Below query shows all unique values for status column in the user table for the same time frame as previous queries. I used it to figure out how to structure second query.
SELECT DISTINCT status
FROM users
WHERE
signup_date BETWEEN '2018-01-01' AND '2018-01-14'
Thank you.
WITH spends AS (
SELECT
utm_campaign,
SUM(amount) AS total_spend
FROM marketing_spends
GROUP BY 1
), total_customers AS (
SELECT
utm_campaign,
COUNT(*) AS customers_count
FROM users
WHERE
utm_campaign IS NOT NULL
AND status = 'customer'
GROUP BY 1
)
SELECT
s.utm_campaign,
total_spend / customers_count AS CAC
FROM spends s
LEFT JOIN total_customers u
ON s.utm_campaign = u.utm_campaign
ORDER BY 2 ASC NULLS LAST
I think the utm_campaign IS NOT NULL
filter used here is redundant because weāre doing a LEFT JOIN
from the spends CTE and the marketing_spends table has no NULLs under its utm_campaign column. It would, however, make a difference if we did the join the other way round.
Hello,
With reference to the CTE - total_users
in following query for this chapter, my query is that why isnāt distinct
used with COUNT(*) AS users_count
WITH spends AS (
SELECT
utm_source,
SUM(amount) AS total_spend
FROM marketing_spends
GROUP BY 1
), total_users AS (
SELECT
utm_source,
COUNT(*) AS users_count
FROM users
WHERE
utm_source IS NOT NULL
GROUP BY 1
)
SELECT
s.utm_source,
total_spend / users_count AS CPA
FROM spends s
INNER JOIN total_users u
ON s.utm_source = u.utm_source
In 105. a great and important point is made in the difference of limiting a query in the join conditions vs. limiting a query in the where clause.
I understand the scenario of why limiting the query in the join condition makes sense (filtering the purchases table to only consider all purchases that were not refunded) and I see how using the where clause ultimately returns wrong output (filtering in the where clause will ignore the ānegative spaceā of all the NULLs in purchase columns for users who did not make a purchase).
I tried challenging my understanding a bit and came up with the ābā CTE below (āaā CTE is the correct query as described in the exercise). My question is then: how am I making a logical error in trying to utilise the where clause in the ābā CTE below? As far as I can see, the difference in counts between āaā and ābā is 104 rows (purchases that were refunded?).
WITH a AS (
SELECT
SUM(amount) / COUNT(DISTINCT(u.id)) AS ARPU
FROM users u
LEFT JOIN purchases p
ON u.id = p.user_id
AND refunded = FALSE
),
b AS (
SELECT
SUM(amount) / COUNT(DISTINCT(u.id)) AS ARPU
FROM users as u
LEFT JOIN purchases AS p
ON u.id = p.user_id
WHERE refunded IS NULL OR refunded = FALSE
)
SELECT * FROM b
-- SELECT * FROM a
/*
The 'a' CTE above is correct. I can see that 'b' CTE is not correct,
but I'm unable to wrap my mind around why, for the 'b' version, the
'amount' is greater in relation to the distinct user count.
*/
Hi:) Why does the following syntax not work and is there a way to write it without CTEs?
SELECT
COUNT(DISTINCT(user_id)),
custom_parameters ->> 'ab_test_name' AS ab_test_name,
custom_parameters ->> 'ab_test_variation' AS ab_test_variation,
created_at AS categorized_at
FROM mobile_analytics.events
WHERE
custom_parameters ->> 'ab_test_name' IS NOT NULL
AND action = 'signup'
AND ab_test_name = 'longer_onboarding_201803'
GROUP BY ab_test_variation
Hello, Iām struggling to understand the formula used in the exercise to calculate the churn rate:
SELECT
100 - 100 * COUNT(DISTINCT(b.visitor_id)) / COUNT(DISTINCT(h.visitor_id))::float AS churn_rate
FROM web_analytics.pageviews h
I donāt get the 100-ā¦.why cannot we leave it as:
100 * COUNT(DISTINCT(b.visitor_id)) / COUNT(DISTINCT(h.visitor_id))::float
Many thanks!