Understanding Aggregated Functions in SQL: A Beginner's Journey

Understanding Aggregated Functions in SQL: A Beginner's Journey
Photo by Anson Antony / Unsplash

When you're new to SQL, one of the trickiest things to master is working with aggregated functions such as COUNT(), SUM(), and AVG(). These functions are essential for summarizing data, but incorrect usage can lead to confusing results.

In this article, we'll walk through a beginner's experience when trying to count how many stores had a specific movie in stock. Initially, there was confusion regarding the use of COUNT() and the GROUP BY clause, which led to an incorrect query. However, after reflecting on the logic and fixing the query, things became much clearer.

Creating the movies and inventory Tables

To provide some context, let’s first define the two tables involved: movies and inventory. These tables store information about movies and the number of copies (inventory) available at different stores.

Here’s how the tables could be structured:

-- Creating the `movies` table
CREATE TABLE movies (
    movie_id INT PRIMARY KEY,
    title VARCHAR(255) NOT NULL
);

-- Creating the `inventory` table
CREATE TABLE inventory (
    inventory_id INT PRIMARY KEY,
    movie_id INT,
    store_id INT,
    FOREIGN KEY (movie_id) REFERENCES movies(movie_id)
);

Sample Data Insertion

Now, let's insert some example data into the tables:

-- Inserting data into the `movies` table
INSERT INTO movies (movie_id, title) VALUES
(1, 'Inception'),
(2, 'The Matrix'),
(3, 'Interstellar');

-- Inserting data into the `inventory` table
INSERT INTO inventory (inventory_id, movie_id, store_id) VALUES
(1, 1, 1), -- Inception in Store 1
(2, 1, 1), -- Another copy of Inception in Store 1
(3, 1, 2), -- Inception in Store 2
(4, 2, 1), -- The Matrix in Store 1
(5, 3, 3); -- Interstellar in Store 3

Data Overview

  • The movies table contains the movies with a unique movie_id and title.
  • The inventory table tracks individual copies of movies, stored by store_id. Each row in inventory represents one copy of a movie in a particular store.

For example:

  • Inception has two copies in Store 1 and one copy in Store 2.
  • The Matrix has one copy in Store 1.
  • Interstellar has one copy in Store 3.

Problem

Count how many copies of a movie each store has in stock.

Solution

Since each inventory_id represents one copy of the movie, counting the inventory_id will be easy, here is the query.

SELECT movies.movie_id, title, store_id, COUNT(*) AS stock
FROM movies
LEFT OUTER JOIN inventory
ON inventory.movie_id = movies.movie_id
GROUP BY movies.movie_id, title, store_id
ORDER BY movies.movie_id, store_id;

Remember:

  1. LEFT OUTER JOIN: This joins the movies and inventory tables on movie_id. Even if a movie has no inventory in a store, the LEFT OUTER JOIN ensures it will still appear in the results.
  2. GROUP BY: The grouping is done by movie_id, title, and store_id. This ensures that the query counts how many copies of each movie are available at each store.
  3. COUNT(*): The COUNT(*) function counts the number of rows in the group, which corresponds to the number of copies of the movie (represented by inventory_id) for each store.
  4. ORDER BY: The query results are ordered by movie_id and store_id for better readability, ensuring each movie's inventory is listed by store.

Result

If we run the corrected query on our sample data, we would get the following result:

movie_idtitlestore_idstock
1Inception12
1Inception21
2The Matrix11
3Interstellar31

This output shows that:

  • Inception has 2 copies in Store 1 and 1 copy in Store 2.
  • The Matrix has 1 copy in Store 1.
  • Interstellar has 1 copy in Store 3.

The purpose of having an inventory table is to represent each physical copy or instance of a movie in different stores. This table allows you to track each unique copy and its specific details.

Finally

Don’t worry if you make mistakes along the way—SQL is all about trial and error, and every query teaches you something new. Keep practicing, and you’ll develop a solid understanding of how to effectively summarize and work with your data.

Support Us