Understanding Aggregated Functions in SQL: A Beginner's Journey
When you're new to SQL, one of the trickiest things to master is working with aggregated functions such as COUNT()
, SUM()
, and AVG()
. These functions are essential for summarizing data, but incorrect usage can lead to confusing results.
In this article, we'll walk through a beginner's experience when trying to count how many stores had a specific movie in stock. Initially, there was confusion regarding the use of COUNT()
and the GROUP BY
clause, which led to an incorrect query. However, after reflecting on the logic and fixing the query, things became much clearer.
Creating the movies
and inventory
Tables
To provide some context, let’s first define the two tables involved: movies
and inventory
. These tables store information about movies and the number of copies (inventory) available at different stores.
Here’s how the tables could be structured:
-- Creating the `movies` table
CREATE TABLE movies (
movie_id INT PRIMARY KEY,
title VARCHAR(255) NOT NULL
);
-- Creating the `inventory` table
CREATE TABLE inventory (
inventory_id INT PRIMARY KEY,
movie_id INT,
store_id INT,
FOREIGN KEY (movie_id) REFERENCES movies(movie_id)
);
Sample Data Insertion
Now, let's insert some example data into the tables:
-- Inserting data into the `movies` table
INSERT INTO movies (movie_id, title) VALUES
(1, 'Inception'),
(2, 'The Matrix'),
(3, 'Interstellar');
-- Inserting data into the `inventory` table
INSERT INTO inventory (inventory_id, movie_id, store_id) VALUES
(1, 1, 1), -- Inception in Store 1
(2, 1, 1), -- Another copy of Inception in Store 1
(3, 1, 2), -- Inception in Store 2
(4, 2, 1), -- The Matrix in Store 1
(5, 3, 3); -- Interstellar in Store 3
Data Overview
- The
movies
table contains the movies with a uniquemovie_id
andtitle
. - The
inventory
table tracks individual copies of movies, stored bystore_id
. Each row ininventory
represents one copy of a movie in a particular store.
For example:
- Inception has two copies in Store 1 and one copy in Store 2.
- The Matrix has one copy in Store 1.
- Interstellar has one copy in Store 3.
Problem
Count how many copies of a movie each store has in stock.
Solution
Since each inventory_id
represents one copy of the movie, counting the inventory_id
will be easy, here is the query.
SELECT movies.movie_id, title, store_id, COUNT(*) AS stock
FROM movies
LEFT OUTER JOIN inventory
ON inventory.movie_id = movies.movie_id
GROUP BY movies.movie_id, title, store_id
ORDER BY movies.movie_id, store_id;
Remember:
- LEFT OUTER JOIN: This joins the
movies
andinventory
tables onmovie_id
. Even if a movie has no inventory in a store, theLEFT OUTER JOIN
ensures it will still appear in the results. - GROUP BY: The grouping is done by
movie_id
,title
, andstore_id
. This ensures that the query counts how many copies of each movie are available at each store. - COUNT(*): The
COUNT(*)
function counts the number of rows in the group, which corresponds to the number of copies of the movie (represented byinventory_id
) for each store. - ORDER BY: The query results are ordered by
movie_id
andstore_id
for better readability, ensuring each movie's inventory is listed by store.
Result
If we run the corrected query on our sample data, we would get the following result:
movie_id | title | store_id | stock |
---|---|---|---|
1 | Inception | 1 | 2 |
1 | Inception | 2 | 1 |
2 | The Matrix | 1 | 1 |
3 | Interstellar | 3 | 1 |
This output shows that:
- Inception has 2 copies in Store 1 and 1 copy in Store 2.
- The Matrix has 1 copy in Store 1.
- Interstellar has 1 copy in Store 3.
The purpose of having an inventory
table is to represent each physical copy or instance of a movie in different stores. This table allows you to track each unique copy and its specific details.
Finally
Don’t worry if you make mistakes along the way—SQL is all about trial and error, and every query teaches you something new. Keep practicing, and you’ll develop a solid understanding of how to effectively summarize and work with your data.