SQL Adventures : Tackling the 8 Week SQL Challenge — Week 2 Part I

Ash
9 min readSep 6, 2024

--

Welcome to SQL Adventures!

Inspired by Danny Ma’s 8 Week SQL Challenge, let’s dive into the challenge for Week 2— Pizza Runner. This series is a gateway to my efforts in mastering SQL, one challenge at a time. Through this series of articles, I share my solutions and insights gained from each challenge. The first part of this challenge, Week 1 — Danny’s Diner, is documented here.

Data

The data is about Danny’s Diner, a restaurant specializing in sushi, curry, and and pizza. The data is stored in the pizza_runner database schema, which contains several interconnected tables that capture details about customers, orders, runners, and the pizzas themselves. Here is a brief overview of the tables:

  • runners : Contains details about the runners, including their unique runner IDs and registration dates.
  • customer_orders : Captures pizza orders from customers, including information on the type of pizza ordered, any customizations (exclusions or extras), and the order time.
  • runner_orders: Tracks order fulfillment by runners, including pickup times, delivery distances, durations, and cancellations.
  • pizza_names : A simple reference table that lists the available pizza types, such as Meat Lovers and Vegetarian.
  • pizza_recipes : Details the standard set of toppings for each pizza type.
  • pizza_toppings : Provides a mapping of topping IDs to their respective names, helping to identify ingredients used in the pizzas.

I’ll add the ER diagram here for reference, but the data within each table can be seen in the challenge website.

Entity Relationship Diagram

Challenge

This challenge is packed with over 25 questions, divided into four key areas of focus. To maintain my sanity while writing this (and hopefully yours while reading), I’ve decided to break it down into parts. Part I will cover the first area of Pizza Metrics and following articles will cover the remaining areas along with the bonus questions.

Pizza Metrics

  1. How many pizzas were ordered?
  2. How many unique customer orders were made?
  3. How many successful orders were delivered by each runner?
  4. How many of each type of pizza was delivered?
  5. How many Vegetarian and Meatlovers were ordered by each customer?
  6. What was the maximum number of pizzas delivered in a single order?
  7. For each customer, how many delivered pizzas had at least 1 change and how many had no changes?
  8. How many pizzas were delivered that had both exclusions and extras?
  9. What was the total volume of pizzas ordered for each hour of the day?
  10. What was the volume of orders for each day of the week?

Solutions

Before diving into the questions, it can be seen that there are some null values within certain tables. To fix this, an update command can be used.

In the table customer_orders, the columns exclusions and extras have null values which are updated to NULL as follows:

UPDATE pizza_runner.customer_orders
SET exclusions = CASE
WHEN exclusions = 'null' OR exclusions = ' ' THEN NULL
ELSE exclusions
END,
extras = CASE
WHEN extras = 'null' OR extras = 'NaN' OR extras = ' ' THEN NULL
ELSE extras
END
RETURNING *;

In the table runner_orders, the columns pickup_time, distance, cancellation and duration have ‘null’ or ‘NaN’ values and the columns distance and duration have irregularities within their format and data types. These can be fixed as follows:

UPDATE pizza_runner.runner_orders
SET pickup_time = CASE
WHEN pickup_time = 'null' THEN NULL
ELSE pickup_time
END,
distance = CASE
WHEN distance = 'null' THEN NULL
ELSE regexp_replace(distance, '[^0-9]', '', 'g')::NUMERIC(10,2)
END,
duration = CASE
WHEN duration = 'null' THEN NULL
ELSE regexp_replace(duration, '[^0-9]', '', 'g')::INTEGER
END,
cancellation = CASE
WHEN cancellation = 'null' OR cancellation = '' THEN NULL
ELSE cancellation
END
RETURNING *;

Here, the table runner_orders is updated to clean and format its data. The query converts occurrences of the string 'null' to actual NULL values in the pickup_time, distance, duration, and cancellation columns. For the distance column, non-numeric characters are removed, and the values are cast to a NUMERIC type with two decimal places. Similarly, the duration column is stripped of non-numeric characters and cast to an INTEGER. The cancellation column is set to NULL for empty strings or the string 'null'.

Let’s move on to solving the questions now.

1. How many pizzas were ordered?

SELECT
COUNT(pizza_id) AS total_pizza_count
FROM
pizza_runner.customer_orders;
Number of pizzas ordered

This is straightforward — counting the total number of pizzas within the customer_orders table.

2. How many unique customer orders were made?

SELECT
COUNT(DISTINCT order_id) AS unique_order_count
FROM
pizza_runner.customer_orders;
Number of unique orders

This is again counting the number of orders within the customer_orders table and not the number of pizzas.

3. How many successful orders were delivered by each runner?

Since I used DBFiddle for completing this task, I had to rerun the update command before every run. To avoid this repetition, I wrote a CTE that does the same and used it to solve this question.

WITH cte_ro AS (
SELECT
order_id,
runner_id,
CASE
WHEN pickup_time = 'null' THEN NULL
ELSE pickup_time
END AS pickup_time,
CASE
WHEN distance = 'null' THEN NULL
ELSE REGEXP_REPLACE(distance, '[^0-9]', '', 'g')::NUMERIC(10,2)
END AS distance,
CASE
WHEN duration = 'null' THEN NULL
ELSE REGEXP_REPLACE(duration, '[^0-9]', '', 'g')::INTEGER
END AS duration,
CASE
WHEN cancellation = 'null' OR cancellation = '' THEN NULL
ELSE cancellation
END AS cancellation
FROM
pizza_runner.runner_orders
)
SELECT
DISTINCT runner_id,
COUNT(*) AS successful_orders
FROM
cte_ro
WHERE
cancellation IS NULL
GROUP BY
runner_id
ORDER BY
runner_id;

If UPDATE command works fine, then this can be the used directly:

SELECT
runner_id,
COUNT(*) AS successful_orders
FROM
pizza_runner.runner_orders
WHERE
cancellation IS NULL
GROUP BY
runner_id
ORDER BY
runner_id;
Successful orders per runner

The number of successful orders per runner is as shown. Runner with runner_id of 1 has the most number of successful orders at 4.

4. How many of each type of pizza was delivered?

To just get the number of pizzas of each type that were ordered:

SELECT DISTINCT 
co.pizza_id,
pn.pizza_name,
COUNT(co.order_id) AS number_of_pizzas
FROM
pizza_runner.customer_orders co
LEFT JOIN
pizza_runner.pizza_names pn ON co.pizza_id = pn.pizza_id
GROUP BY
co.pizza_id,
pn.pizza_name;
Number of pizzas per type from customer_orders table

Taking into account only the orders that were delivered and not cancelled, the number of pizzas of each type can be calculated using the same CTE as used in Q3.

WITH cte_ro AS (
SELECT
order_id,
runner_id,
CASE
WHEN pickup_time = 'null' OR pickup_time = '' THEN NULL
ELSE pickup_time
END AS pickup_time,
CASE
WHEN distance = 'null' OR distance = '' THEN NULL
ELSE REGEXP_REPLACE(distance, '[^0-9]', '', 'g')::NUMERIC(10,2)
END AS distance,
CASE
WHEN duration = 'null' THEN NULL
ELSE REGEXP_REPLACE(duration, '[^0-9]', '', 'g')::INTEGER
END AS duration,
CASE
WHEN cancellation = 'null' OR cancellation = '' THEN NULL
ELSE cancellation
END AS cancellation
FROM
pizza_runner.runner_orders
)
SELECT
DISTINCT co.pizza_id,
pn.pizza_name,
COUNT(co.order_id) AS number_of_pizzas
FROM
pizza_runner.customer_orders co
LEFT JOIN
pizza_runner.pizza_names pn ON co.pizza_id = pn.pizza_id
LEFT JOIN
cte_ro c ON co.order_id = c.order_id
WHERE
c.cancellation IS NULL
GROUP BY
co.pizza_id,
pn.pizza_name;
Number of pizzas per type delivered

The number of pizzas delivered with respect to the pizza type is as shown. Pizza with pizza_id of 1, that is the Meatlovers, is the most delivered pizza type at 9 pizzas.

5. How many Vegetarian and Meatlovers were ordered by each customer?

Again, since I used DBFiddle for completing this task, I wrote a CTE that does the same update to customer_orders table.

WITH cte_co AS (
SELECT
order_id,
customer_id,
pizza_id,
CASE
WHEN exclusions = 'null' OR exclusions = '' THEN NULL
ELSE exclusions
END AS exclusions,
CASE
WHEN extras = 'null' OR extras = '' THEN NULL
ELSE extras
END AS extras,
order_time
FROM
pizza_runner.customer_orders
)

SELECT
customer_id,
pn.pizza_name,
COUNT(*) AS pizza_count
FROM
cte_co co
LEFT JOIN
pizza_runner.pizza_names pn ON co.pizza_id = pn.pizza_id
GROUP BY
customer_id,
pn.pizza_name
ORDER BY
customer_id;
Types of pizzas ordered by each customer

The number of pizzas delivered with respect to the pizza type and customer is as shown. Every customer except customers with customer_id of 104 and 105 have ordered both the types of pizzas. While 104 has only ordered Meatlovers pizza thrice, 105 has only ordered Vegetarian once.

6. What was the maximum number of pizzas delivered in a single order?

Using the CTEs as defined before, the maximum number of pizzas ordered in a single order:

SELECT
order_id,
COUNT(pizza_id) AS number_of_pizzas
FROM
cte_co
GROUP BY
order_id
ORDER BY
number_of_pizzas DESC
LIMIT 1;
Maximum number of pizzas ordered in a single order

To ensure that only the orders that were delivered and not cancelled are considered, the code can be modified as follows which again gives the same result:

SELECT
co.order_id,
COUNT(co.pizza_id) AS number_of_pizzas
FROM
cte_co co
LEFT JOIN
cte_ro ro ON co.order_id = ro.order_id
WHERE
ro.cancellation IS NULL
GROUP BY
co.order_id
ORDER BY
number_of_pizzas DESC
LIMIT 1;
Maximum number of pizzas delivered in a single order

Using both the conditions, the maximum number of pizzas delivered in a single order is 3 which is in the order_id of 4.

7. For each customer, how many delivered pizzas had at least 1 change and how many had no changes?

Considering either exclusions or extras within the order as a change:

SELECT
co.customer_id,
COUNT(co.pizza_id) AS pizza_count,
SUM(
CASE
WHEN co.exclusions IS NOT NULL OR co.extras IS NOT NULL THEN 1
ELSE 0
END
) AS at_least_1_change,
SUM(
CASE
WHEN co.exclusions IS NULL AND co.extras IS NULL THEN 1
ELSE 0
END
) AS no_changes
FROM
cte_co co
LEFT JOIN cte_ro ro ON co.order_id = ro.order_id
WHERE
ro.cancellation IS NULL
GROUP BY
co.customer_id
ORDER BY
co.customer_id;
Customer orders delivered with at least one change and without any changes

This shows that customer with customer_id of 101 had 2 pizzas delivered, both with no changes, as well as customer with customer_id of 102 had 3 pizzas delivered, with no changes. Customer with customer_id of 103 had 3 pizzas delivered, all with at least 1 change, as well as customer with customer_id of 105 had only 1 pizza delivered and that too with at least 1 change.

8. How many pizzas were delivered that had both exclusions and extras?

SELECT 
COUNT(co.pizza_id) AS pizza_count_with_extras_exclusions
FROM cte_co co
LEFT JOIN cte_ro ro
ON co.order_id = ro.order_id
WHERE ro.cancellation IS NULL
AND co.exclusions IS NOT NULL
AND co.extras IS NOT NULL;
Number of pizzas with both extras and exclusions

Only one pizza was delivered that had both exclusions and extras.

9. What was the total volume of pizzas ordered for each hour of the day?

SELECT
DATE_PART('hour', order_time) AS hour_of_day,
COUNT(order_id) AS pizza_volume
FROM cte_co
GROUP BY DATE_PART('hour', order_time)
ORDER BY DATE_PART('hour', order_time);
Total volume of pizza ordered for each hour of the day

The total volume of pizzas ordered each day with respect to the hour of the day is as shown. The majority of orders were placed during the hours of 13 (1 pm), 18 (6 pm), 21 (9 pm) and 23 (11 pm).

10. What was the volume of orders for each day of the week?

SELECT
TO_CHAR(order_time, 'FMDay') AS day_of_week,
COUNT(order_id) AS pizza_volume
FROM cte_co
GROUP BY TO_CHAR(order_time, 'FMDay');
Volume of orders for each day of the week

The total volume of pizzas ordered each day of the week is as shown. The majority of orders were placed on Saturday and Wednesday (5) followed by Thursday (3) and then Friday (1).

That was Part I. Onto the next one! See you in Part II!

Toodles!

--

--

Ash
Ash

Written by Ash

Data Analyst | AWS Certified Cloud Practitioner

No responses yet