Practice

  • 📊 Learn Data Analysis with SQL with real life examples from Product Management, Marketing and more
  • 📱 Multiple datasets (web/mobile app, E-commerce, Finance)
  • 💡 Detailed explanations
Difficutly
Dataset

1. Total revenue calculation

Bindle dataset

This is an entry-level SQL exercise, but it’s also a must-have Data Analysis tool for anyone working with financial data: Product Manager or Analyst calculating Unit Economics or Marketer working on ROI estimation. 💰

2. Percentage of purchases with discount codes

E-commerce dataset

Segmentation is a main tool to get the signal from the noise in data. This exercise is a starting point in using segmentation: purchase with or without discount. The same way you can look at purchases from the USA or India, users with or without mobile app, etc.

3. Calculating gross revenue

Finance dataset

This exercise will give you a taste of working with financial data and thinking in terms of revenues and expenses (credit and debit). 💵

4. Measuring feature adoption

E-commerce dataset

This exercise will teach you the must-have SQL technique for any Product Manager or Data Analyst: measuring feature adoption aka “how many users interacted with a specific feature”.

5. Distribution of users by country

E-commerce dataset

Probably the most used segmentation in Product Management or Marketing analysis: by country. Practice it with this entry-level exercise. 🌎

6. Distribution of users email domains

Bindle dataset

This exercise is very common in B2B marketing, where we need to contact people from a specific company or organization (think of *.edu domain names).

7. Total number of transactions

Finance dataset

Finance 101 exercise: count the total number of credit or debit (read revenue or expense) transactions.

8. Percentage of refunded purchases

Bindle dataset

Absolutely important skill for anyone working with payments data. For any business it’s always too early to count 💵 by looking only at the number of purchases. We always should account for refunded purchases.

9. Counting subcategories in an online store

E-commerce dataset

Think of Amazon: they have a few of main categories, but every time you land on a product page you see these massive menus with subcategories (Books > Business > Product Management > Title). In this exercise your job will be to count all of them.

10. Percentage of users who leave reviews

E-commerce dataset

No need to mention how important reviews are for any sort of business. If your Product team focuses on making users leave reviews – you’ll definitely need a metric to track progress. In this exercise you’ll calculate it. 🚀

11. Average number of items per cart

E-commerce dataset

E-Commerce and Data Analysis classic exercise (which is quite often used in job interviews) – calculating the average number of items per cart. It’s not an easy one but if you’re working in E-commerce (or your product has add-ons, for example) it should be your bread & butter. 📊

12. Categories' items distribution

E-commerce dataset

Another exercise focused on calculating distribution. This time the distribution of items in categories of an online store.

13. The most profitable month

Finance dataset

In this exercise you’ll research what is the most profitable month of a company based on it’s financial data.

This is a very important Marketing insight. Any Marketing is seasonal. Which means there’re months when a company is very conservative about investing marketing money and there’re really profitable months when a company is really aggressive (for example, during the New Year resolutions rush in January).

14. The biggest B2B customer

Finance dataset

It’s a typical financial research for B2B products: which customer is the biggest (aka “brings the most revenue”)? In the beginning of any B2B startup those customers are precious – losing one might mean the end of a startup.

15. Count users with identical names

E-commerce dataset

In these exercise you’ll deal with duplicate entries. Other reincarnations of this exercise are counting duplicated E-commerce items, cities with identical names, etc.

16. Demographics data report

Bindle dataset

In this exercise you’ll build a basic demographics report – distribution of users by country and their average age.

17. Revenue by country

Bindle dataset

In this exercise you’ll apply segmentation technique to the revenue data. Revenue distribution by country is a typical Marketing dashboard in any international product company.

18. A month with the lowest number of purchases

Finance dataset

In this exercise we’ll explore business seasonality and find which month of the year has the lower number of purchases. 📆

19. Distribution of countries by purchase rate

Bindle dataset

Let’s level up our segmentation game and this time apply in to purchase rate. Looking at the TOP countries by purchase rate can define next steps for our marketing efforts.

20. Naive sentiment analysis

E-commerce dataset

The goal of sentiment analysis is to determine the mood of customer’s feedback/customer support ticket – are they happy, neutral or angry and we need to help them asap.

Companies use sentiment analysis to prioritize their customer support queues or user feedback (to deal with negative ones first).

21. The biggest category

E-commerce dataset

This is another exercise for working with nested data (category has multiple categories which have multiple categories, etc). You job will be to find the biggest category of the E-commerce store’s catalog.

22. Top grossing month of the year

E-commerce dataset

In this exercise you’ll analyze purchases from an E-commerce store and find a month of the year with the highest revenue. It sounds straightforward but there’re all these carts, items. 💥 This is a very popular interview exercise. 💼

23. Vendor with the highest refund rate

E-commerce dataset

Another serious exercise from the life of an E-commerce store. Since you’re losing money for every refund (yep, transaction fee of a payment provider) it’s super important to work with vendors who produce high quality goods, otherwise you’ll be losing money on refunds. In this exercise you’ll find which vendors have the highest refund rate.

24. Average price per cart

E-commerce dataset

E-commerce classics: average price per cart. It’s one of the main metrics for E-commerce business; it’s a must have technique for anyone working with data. In this exercise you’ll master it! 💪

25. E-commerce power users

E-commerce dataset

This exercise is an example of a power users analysis. You can guess who power users of a product are – the most active users, your product is part of their life and they use most of its features. For an E-commerce store power users are the ones with most purchases per month.

26. Top grossing month of the year. Part 2

Finance dataset

In this exercise we’ll revisit the topic of revenue seasonality. This time from the finance perspective: you need to find out which month of the year has the highest gross revenue.

27. First sale date for E-commerce vendor

Finance dataset

An E-commerce store Product Manager is trying to answer the question: do we make money with newly added vendors or with the old ones? In this exercise you’ll help her out. 😉

28. Product ratings distribution

E-commerce dataset

Product Management classics – ratings distribution. You can see this chart on every AppStore or Amazon page – how many 5-s, 4-s, etc. In this exercise you’ll calculate such distribution yourself. ⭐

29. ARPU per country

Bindle dataset

Here comes Unit Economics. Whether you’re a Product Manager, Marketer or a Data Analyst you should be able to calculate Unit Economics metrics when someone wakes you up at 3 AM. This exercise will help you master it before you went to bed. 😆

30. Business spending seasonality

Finance dataset

Seasonality is a feature not only for your business revenue flow but also for expenses. In this exercise you’ll analyze when the business has the highest expenses (and help finance people to plan the next year). 📊

31. Time between purchases

E-commerce dataset

This is a serious product analytics exercise. Knowing (or predicting) the time between users purchases is critical for E-commerce store demand forecasting.

32. Items per cart year-over-year dynamics

E-commerce dataset

This exercise will blow you mind. 💥 🙈 You’ve probably heard it many times “Company X grew Y% last year”. In this exercises you’ll create a typical report for business growth, this time focused on the number of items per cart in an E-commerce store.

33. Most popular items in E-commerce store

E-commerce dataset

In this exercise your job is to find which items are the most popular per country (!!). 🌏 This will help with demand forecasting or marketers with planning big sale campaigns.

34. Top rated products

E-commerce dataset

E-commerce classics again: you need to find products with highest ratings. It’s a must have dashboard for any product manager (to understand your audience and business) or marketer (to plan new campaigns or sales).

35. Revenue distribution by country

Bindle dataset

Revenue distribution again, this time will take it one step further – you’ll calculate the percentage of revenue by country as well.

36. Vendor with the highest revenue

Finance dataset

Your job is to find the revenue distribution by vendor of an E-commerce store.

37. Ultimate AB-test dashboard: cohort sizes

Bindle dataset

This exercise opens the door to the world of AB-testing. It’ll show you how to apply funnel analysis to analyze AB-tests and not just one (manual work 👎) but all of them (automation 🚀).

38. Extracting dates out of text

Bindle dataset

Marketing classics: extracting information from a custom campaign naming convention. For example, a campaign name like 20180208_us-ca-gb-au_mf_bindle-launch implies that is was launched on 8th of February. In this exercise you’ll write a query to extract launch dates of all campaigns. 📆

39. User base month-to-month growth

Bindle dataset

Analytics 101: month-to-month growth. This exercise could be applied to number of user, purchases number, revenue, page views, etc. 📊

40. Calculating CPS (Cost per Signup)

Bindle dataset

CPS is an important marketing metric that allows us to compare campaigns between each other; it allows us to timely adjust our marketing spends and avoid burning through marketing budgets. 💵 📱

41. Calculating purchase rate

Bindle dataset

I guess this is the first question asked after a new product gets launched. This is an entry-level exercise for any Analyst, Marketer or PM who wants to become fluent with data. 🚪

42. Ultimate AB-test dashboard: purchase rate

Bindle dataset

This is the next step in mastering the world of AB-testing – adding customers count and purchase rate to the AB-test funnel analysis. 📊

43. Ultimate AB-test dashboard: activation rate

Bindle dataset

When we analyze an AB-test we compare variation funnels. Ideally, a new variation impacts the whole funnel (more signups, more active users, more purchases, etc). Let’s grow our AB-test dashboard by one more column – activation rate. 💥

44. Ultimate AB-test dashboard: purchase rate uplift

Bindle dataset

In this exercise we’ll take our AB-test analysis game up a bit – we’ll calculate an uplift of purchase rate compared to control variation. For all AB-tests at once, how cool is that? 🙀

45. Design research: users who refunded

Bindle dataset

It’s unreal to build a great product solely based on data – we have to talk to our customers and, most importantly, to churned customers. Their frustrations will inspire important product adjustments or new features that will help retain your future customers. ❤

46. Number of days to make the first purchase

Bindle dataset

In this exercise you’ll apply you whole Data Analysis repertoire to calculate a number of days it takes for users to make the first purchase. This is an important metric for financial planning and even more for marketing.

47. Number of days to make the first purchase with percentage distribution

Bindle dataset

This is the second part of the exercise to calculate the number of days it takes users to make the first purchase. This time we’ll add percentage distribution to the final report. 📊

48. Number of days to make the first purchase with ranges

Bindle dataset

This is the third part of the exercise to calculate the number of days it takes users to make the first purchase. In this variation we’ll use ranges instead of number of days for a final report.

49. Cumulative total users per day

Bindle dataset

In this exercise, we’ll look into a cumulative users count report, also known as running total or rolling sum. 👥

50. Purchase rate for different cohorts

Bindle dataset

In this exercise you’ll build a comparison (not to mix with statistical correlation) report that shows purchase rate for two different user cohorts – regular users and invited (referred) users. 👥

51. Count signups in the last 30 days

Live dataset

In this exercise we’ll start with the simplest real-time business metric – the number of signups.

52. Signups in the last 30 days with an uplift

Live dataset

In this exercise, we’ll start step up the challenge and report both number of signups for the past month and the percentage uplift compared to the previous month. 📈

53. Trial opt-in rate last 30 days

Live dataset

In this exercise, we’ll look further into real-time business metrics and calculate trial opt-in rate for the past 30 days daily. Such report is a must-have for any business monitoring dashboard. 📊

54. Time of day distribution for signups

Live dataset

This exercise is a gateway into time-of-day analysis and understanding usage patterns of your product. So far all the queries we have been running produced data relative to our timezone (UTC). This time we’ll look at the world from a user perspective. 👀

55. Trial retention rate last 30 days

Live dataset

In this exercise we’ll continue exploring subscription model metrics and calculate trial retention rate (percentage of trials which converted to subscription purchases) for daily cohorts. 📊

56. The best 3 point shooter of all time

NBA dataset

Let’s not pretend that we’re doing sport analytics here – we’re simply having fun (and practicing Data Analysis with 🏀). 😄

This is a great exercise to practice your SQL skills in a brand new context. Can you calculate who hit the most 3 pointers in the NBA history? 🏆

57. Games won by a single point

NBA dataset

I love to watch buzzer beaters that won games, a last second free throw or a clutch shot that leads a team to victory. 🏀 It’s a great reminder to rest in the end, not in the middle.

In this exercise, let’s find out how many games in NBA were won by a single point difference.

58. The biggest point difference in history

NBA dataset

There’re many types of NBA games: a typical game where the final score has 10-15 point difference. An intense game where the game is won by single points. And then there’s these berserker games where one team outscores the other by 20-30 points.

Let’s find out the biggest point difference in the history of NBA. 🏀

59. The day Steph Curry scored his 3000th three pointer

NBA dataset

In 2021 the history was made: Steph Curry was the first ever NBA player to score 3000 three point shots. That’s crazy. 🔥 🏀

In case you want to rewatch that moment, let’s find a game when it happened.

60. The best free throw shooter

NBA dataset

Have you ever tried to make two free throws in a row? Three? Ten? My personal record is 21 and I practiced A LOT to do this. Doing it in a game is a whole another level of difficulty though… 🏀

Let’s see which NBA player has the highest free throw percentage in the history of NBA.

61. Most triple-doubles in NBA history

NBA dataset

There’re 5 primary player game statistics: points, rebounds, blocks, steals and assists. A triple-double is a game when a player had at least 3 of those statistics in double-digits, like 30 points, 10 rebounds and 10 assists.

Sounds HARD 💪, guess what’s the maximum number of triple-doubles games a player in their entire career?

62. Single point triple-doubles

NBA dataset

A triple-double game is a game when a player had 3 double digits statistics: points, rebounds, assists, blocks or steals. Almost always these statistics are points, rebounds and assists, but there’re exceptions.

Let’s find out how many times in the recent NBA history players had triple-double games with single digit points. 🏀

63. Perfect 3 point games

NBA dataset

I’m not sure if there’s a term for this already, but let’s call a game a “perfect 3 point game” when every team player scored at least one three-pointer. 🏀

How many times teams had perfect 3 point games in the whole history of NBA?

64. Most 3 pointers in a regular season game

NBA dataset

Who do you think hit the most 3 pointers in a game? How many? Let’s figure that out. 🏀

65. Most threes in a season

NBA dataset

We have players regularly shooting 5-10 threes a night, but for how long can they sustain those streaks?

Let’s figure out the record holders in the most threes made during a regular season. 🏀

66. The longest game time without a single stat

NBA dataset

That’s a funny record 😅 – how many minutes a player could be in a game without recording a single stat? (points, rebounds, assists, steals or blocks).

Make a guess and let’s figure out the answer. ⏱ 🏀

67. Most points in a game

NBA dataset

Let’s practice our SQL shots and find out who set the records for the most points scored during a regular season game. 🏀

68. 3 point line

NBA dataset

The first NBA game took place Nov 1, 1949. Guys wore very short shorts (no pan intended) and high Converses. Most importantly, there were no 3 point line. It was added much later. Can you figure out when? 🏀

69. True shooting percentage

NBA dataset

What if a player hit 5/5 from three and 1/10 from the free throw line? Is it a good or bad shooting? To measure the efficiency of all shots, NBA came up with a stat called True Shooting Percentage (TSP) and it takes into account all types of shots.

Who do you think are TSP leaders in the modern NBA era? 🏀

70. The longest winning streak

NBA dataset

An NBA analogue of this exercise would be a 7th playoff game. In other words, it’s a tough one! 🔥

Your challenge is to find the longest winning streak for a team during a regular season. In the history of NBA! 🏀

71. How many teams overcame 3–1 deficit in playoffs

NBA dataset

Do you remember when the Warriors blew up 3–1 lead and lost NBA finals to the Cavaliers in 2016? That was a true 🏀 💣, super interesting match up.

How many times do you think it ever happened in the NBA history?

72. Order teams by the number of rings

NBA dataset

Let’s reconstruct the NBA classic leaderboard: teams and their rings. 💍 🏀

73. Playing at home vs playing on the road

NBA dataset

In this challenge, let’s check if teams always play better at home than on the road. 🏀 ✈

74. Three-peats

NBA dataset

First, you win an NBA championship, then you repeat and then you three-peat! 🏀

How many times did NBA teams won back-to-back-to-back championships? 💍 💍 💍

75. Most free throw attempts in a regular season

NBA dataset

It’s a challenge with a twist. Maybe the Free Throw Attempts (FTA) stat doesn’t sounds epic, but the result definitely is.

Try it out and see which players lead this category. 🏀

76. Best scoring duo

NBA dataset

It’s a big NBA paradox: basketball is a team sport, but everyone talks about individuals. Let’s break the pattern and find out the best NBA duos. 🏀 👬

77. One team players

NBA dataset

This is a very elite club – to play an entire career for one franchise. Let’s see who are the players on that list. 🏀

78. Top rated movies

Movies dataset

Your job is to write a query that lists TOP-10 movies of all time based on user ratings. 🎥

79. Highest Grossing Movies

Movies dataset

Find the TOP-5 highest-grossing movies, displaying their titles and total revenue. 📊

80. Movies starring specific actor

Movies dataset

Find all movies that feature a specific actor. 📸

81. Most Prolific Directors

Movies dataset

Find TOP-3 directors by the number of movies. 🎥

82. TOP-5 movie genres

Movies dataset

Your task is to find the biggest genres by the number of movies.

84. Most prolific actors

Movies dataset

Find TOP-5 actors who have appeared in the most movies.

85. Movies with the longest runtime

Movies dataset

Retrieve the titles and runtimes of the TOP-7 longest movies.

86. Actor collaborations

Movies dataset

Find actor pairs who have starred in more than 3 movies together. 👥

87. Director's biggest hits

Movies dataset

Find TOP-3 highest-grossing movies for a director. 💰

88. Highest rated directors

Movies dataset

Find TOP-3 directors who have the highest average movie ratings. ⭐

89. Average movie duration by genre

Movies dataset

Calculate the average duration of movies for each genre. ⏱

90. Movie franchises

Movies dataset

Find movies with sequels and the time gap between them.

91. Most versatile actors

Movies dataset

Find the TOP-3 actors who have worked with the most different directors.

92. Average movie rating by year

Movies dataset

Calculate the average movie rating for each year. 📆

93. Most recent movie by director

Movies dataset

Find the most recent movie for the TOP tier directors.

95. Genre distribution of ratings

Movies dataset

Calculate the distribution of movie ratings for each genre and display them in ranges. ⭐

96. Directors' movie revenue share

Movies dataset

List movies by top directors and their revenue share.

97. Movies with above average revenue

Movies dataset

List all top-grossing movies and their directors.

98. Time gaps between movies

Movies dataset

Find how often Christopher Nolan releases new movies.

99. Actors with consecutive high ratings

Movies dataset

Find actors who have appeared in at least two high-rated consecutive movies. ⭐

100. The longest movie franchise

Movies dataset

Find all movie franchises with at least 3 movies.

101. Referral rate

Live dataset

Calculate the impact of a mobile app’s referral system. 📱

102. Revenue prediction based on iPhone model

Live dataset

Calculate the ARPU (Average Revenue Per User) per iPhone model. 📱

Have an interview coming up? Practice with Mock Interviews 💼

A Mock Interviews consists of 2 exercises you need to solve under 45 minutes, just as if it was a real tech interview.