FORUM Everything about SQL Most popular email domain

Most popular email domain

I’m trying the following query:

WITH domains AS (
  SELECT
    SPLIT_PART(email, '@', 2) AS email_domain,
    COUNT(*) AS recount
  FROM users
  GROUP BY email_domain
)

SELECT 
  DISTINCT(SPLIT_PART(email_domain, '.', 1)) AS maindomain, 
  recount
FROM domains
GROUP BY maindomain, recount
ORDER BY recount DESC

But I want to group the domain before the dot as one, for example I want all the “boogle’s” to count as one domain and sum the total of users with that domain (idependently of their country), but it keeps counting them as if they were different ids or domains.

REPLIES

@ecasanova95 that’s a great question! You’re almost there, the trick is to apply the SPLIT_PART() function twice.

First, to get the domain name (sqlhabit) with the top-level domain (.com):

SELECT SPLIT_PART('hello@sqlhabit.com', '@', 2) AS root_domain

-- sqlhabit.com

Second, to extract just the domain name:

SELECT SPLIT_PART(SPLIT_PART('hello@sqlhabit.com', '@', 2), '.', 1) AS domain_name

-- sqlhabit

Now we can aggregate by domain name:

SELECT 
  SPLIT_PART(SPLIT_PART(email, '@', 2), '.', 1) AS domain_name,
  COUNT(*) AS users_count
FROM users
GROUP BY 1
ORDER BY 2 DESC

We can even spice it up and report user counts on both domain name and top-level domain:

SELECT 
  SPLIT_PART(SPLIT_PART(email, '@', 2), '.', 1) AS domain_name,
  SPLIT_PART(SPLIT_PART(email, '@', 2), '.', 2) AS top_level_domain,
  COUNT(*) AS users_count
FROM users
GROUP BY 1, 2
ORDER BY 3 DESC

Ping me if you have more questions 🚀

WRITE A REPLY