Hey @isik
Very interesting, your query looks good at first. I’d only switch the subquery approach to CTE for readability:
WITH signup_clicks AS (
SELECT pageview_id
FROM web_analytics.events
WHERE
category = 'Signup Button'
AND action = 'Click'
)
SELECT
device_type,
100.0 * COUNT(DISTINCT s.pageview_id) / COUNT(DISTINCT p.visitor_id) AS CTR
FROM web_analytics.pageviews p
LEFT JOIN signup_clicks s
ON p.pageview_id = s.pageview_id
WHERE
p.url LIKE '%books%'
GROUP BY 1
ORDER BY 2 DESC
Indeed it returns slightly different numbers from the solution and there’s a big reason for this and it’s in the filter p.url LIKE '%books%'
.
Our goal is to filter all pageviews from book pages. Here’s a sample URL: https://www.bindle.com/books/the-sacred-sorcerer
. Notice, that the word “books” is in the website folder structure.
The filter from your matches all URL-s that contain the word “books”, for example this one: https://www.bindle.com/landing-page?utm_campaign=new_books_2024
. This URL also contains the word “books”, but it’s a completely different page.
The solution is to change our filter to p.url LIKE '%/books/%'
to ensure that we’re matching pages using folder structure.
Note that even this might not be enough for complex websites, because we can have pages like https://www.bindle.com/books/the-sacred-sorcerer/recommendations
– it’s not a book page, but it also sits within the /books/
folder.
I hope that helps, be extra careful when working with web analytics data and don’t hesitate to shoot more questions.