SQL MDN Docs DISTINCT

DISTINCT keyword in SQL

The DISTINCT keyword in SQL is a powerful tool used to eliminate duplicate rows from a result set.

The DISTINCT command is particularly useful when dealing with large datasets where there may be numerous instances of the same data. These duplicates could be specific values in columns (like user countries) or entire duplicated records (when using LEFT JOIN, for example).

Understanding how to use DISTINCT effectively can help you maintain the integrity of your data analysis and ensure accurate results.

Syntax

When you use the DISTINCT keyword in a SELECT statement, SQL engine filters out all duplicate rows based on the columns you’ve specified and returns only unique rows.

Here’s an example with one column:

SELECT DISTINCT country
FROM users

This query retrieves a list of unique countries from the users table.

💡 Note that you can also use the DISTINCT command with brackets like so:

SELECT DISTINCT(country)
FROM users

DISTINCT on multiple columns

You can also use DISTINCT with multiple columns to get unique combinations of values across the specified columns.

SELECT DISTINCT country, age
FROM users

This query returns unique combinations of country and age from the users table.

🔍 If there are two users from the same country but with different ages, both rows will appear in the result set.

Using DISTINCT in Data Analysis

DISTINCT can be particularly useful in data analysis in many scenarios:

  • Identifying Unique Values. When you need to know the different variations of a particular attribute in your data.
  • Data Cleaning. Helps in identifying and removing duplicates, which is crucial for accurate analysis (think business revenue reporting).
  • Aggregate Functions: Combine DISTINCT with functions like COUNT to understand the size of a particular subset of data.

For instance, to count the number of unique countries in the users table:

SELECT COUNT(DISTINCT country)
FROM users

Database compatibility

MySQL
PostgreSQL
SQLite
Redshift
Big Query
Snowflake
DISTINCT
DISTINCT
MySQL
PostgreSQL
SQLite
Redshift
Big Query
Snowflake

Find a problem on this page?

Want to get more involved? SQL MDN Docs is an open-source project, check out contribution guidelines.
This page was last modified on March 24, 2024.