Chapter 9: Text analysis with SQL

There’s a lot of information stored in text form. Even a simple email can reveal us so much: user name (who’s that person), email domain (what company they work for), top-level domain (what country or kind of organization is that). Imagine how much insights we can find in other text columns in our data warehouse.

Text manipulation in SQL

SQL is a very powerful programming language and its power stretches far beyond numeric operations.

We can manipulate text in tons of ways:

  • split text into multiple values
  • combine multiple text values into one
  • convert text to other data types (like dates or numbers)
  • filter records by simple inclusion or very complex patterns (regular expressions)
  • update existing text values using pattern matching

Pattern matching

A big part of text manipulation are Regular Expressions. It’s a separate language (yep, a programming language inside a programming language) designed to specify complex text patterns.

A Regular Expression is literally a string that is interpreted by SQL engine to find a match in a text value. For example, here’s a regular expressoin that matches a website URL like **sqlhabit.com**: '[\w-]+\.[\w]+'.

Mastering text analysis boils down to learning a handful of SQL operators and text functions. In this chapter you’ll learn all of them, let’s go!

Anatoli Makarevich, author of SQL Habit About SQL Habit

Hi, it’s Anatoli, the author of SQL Habit. 👋

SQL Habit is a course (or, as some of the students say, “business simulator”). It’s based on a story of a fictional startup called Bindle. You’ll play a role of their Data Analyst 📊 and solve real-life challenges from Business, Marketing, and Product Management.

SQL Habit course is made of 13 chapters (you’re looking at one atm) that contain 273 bite-sized lessons and exercises. All of them have a real-life setting and detailed explanations. You can immediately apply everything you’ve learned at work. 🚀

The 2nd part of the course is called Practice. It’s made of standalone exercises based on multiple datasets – E-commerce, Finance and Meditation app a-la Headspace or Calm. Practice exercises are harder than in the main course. They’ll get you ready for any challenge at work or an interview. 💪

Reviews

4.97 61 reviews
Maria, CRO @ Babbel
Chandricka, Junior UA Manger @ Phiture
Artur, Marketing Analyst @ Babbel
“well worth the money”

Fluent in SQL in a month

Master Data Analysis with SQL with real life examples from Product Management, Marketing, Finance and more.