Understanding Aggregated Functions in SQL: A Beginner's Journey
When you're new to SQL, one of the trickiest things to master is working with aggregated functions such as COUNT(), SUM(), and AVG(). These functions are essential for summarizing data, but incorrect usage can lead to confusing results.
In this article, we'll walk through a beginner's experience when trying to count how many stores had a specific movie in stock. Initially, there was confusion regarding the use of COUNT() and the GROUP BY clause, which led to an incorrect query. However, after reflecting on the logic and fixing the query, things became much clearer.
Creating the movies and inventory Tables
To provide some context, let’s first define the two tables involved: movies and inventory. These tables store information about movies and the number of copies (inventory) available at different stores.
Here’s how the tables could be structured:
-- Creating the `movies` table
CREATE TABLE movies (
movie_id INT PRIMARY KEY,
title VARCHAR(255) NOT NULL
);
-- Creating the `inventory` table
CREATE TABLE inventory (
inventory_id INT PRIMARY KEY,
movie_id INT,
store_id INT,
FOREIGN KEY (movie_id) REFERENCES movies(movie_id)
);
Sample Data Insertion
Now, let's insert some example data into the tables:
-- Inserting data into the `movies` table
INSERT INTO movies (movie_id, title) VALUES
(1, 'Inception'),
(2, 'The Matrix'),
(3, 'Interstellar');
-- Inserting data into the `inventory` table
INSERT INTO inventory (inventory_id, movie_id, store_id) VALUES
(1, 1, 1), -- Inception in Store 1
(2, 1, 1), -- Another copy of Inception in Store 1
(3, 1, 2), -- Inception in Store 2
(4, 2, 1), -- The Matrix in Store 1
(5, 3, 3); -- Interstellar in Store 3
Data Overview
- The
moviestable contains the movies with a uniquemovie_idandtitle. - The
inventorytable tracks individual copies of movies, stored bystore_id. Each row ininventoryrepresents one copy of a movie in a particular store.
For example:
- Inception has two copies in Store 1 and one copy in Store 2.
- The Matrix has one copy in Store 1.
- Interstellar has one copy in Store 3.
Problem
Count how many copies of a movie each store has in stock.
Solution
Since each inventory_id represents one copy of the movie, counting the inventory_id will be easy, here is the query.
SELECT movies.movie_id, title, store_id, COUNT(*) AS stock
FROM movies
LEFT OUTER JOIN inventory
ON inventory.movie_id = movies.movie_id
GROUP BY movies.movie_id, title, store_id
ORDER BY movies.movie_id, store_id;
Remember:
- LEFT OUTER JOIN: This joins the
moviesandinventorytables onmovie_id. Even if a movie has no inventory in a store, theLEFT OUTER JOINensures it will still appear in the results. - GROUP BY: The grouping is done by
movie_id,title, andstore_id. This ensures that the query counts how many copies of each movie are available at each store. - COUNT(*): The
COUNT(*)function counts the number of rows in the group, which corresponds to the number of copies of the movie (represented byinventory_id) for each store. - ORDER BY: The query results are ordered by
movie_idandstore_idfor better readability, ensuring each movie's inventory is listed by store.
Result
If we run the corrected query on our sample data, we would get the following result:
| movie_id | title | store_id | stock |
|---|---|---|---|
| 1 | Inception | 1 | 2 |
| 1 | Inception | 2 | 1 |
| 2 | The Matrix | 1 | 1 |
| 3 | Interstellar | 3 | 1 |
This output shows that:
- Inception has 2 copies in Store 1 and 1 copy in Store 2.
- The Matrix has 1 copy in Store 1.
- Interstellar has 1 copy in Store 3.
The purpose of having an inventory table is to represent each physical copy or instance of a movie in different stores. This table allows you to track each unique copy and its specific details.
Finally
Don’t worry if you make mistakes along the way—SQL is all about trial and error, and every query teaches you something new. Keep practicing, and you’ll develop a solid understanding of how to effectively summarize and work with your data.
Comments ()