Chapter 4: Aggregation and Grouping for Data Analysis
4.1 COUNT, SUM, AVG, MIN, and MAX Functions
SQL provides several built-in functions for aggregating data:
- COUNT() - Counts the number of rows
- SUM() - Adds up numeric values
- AVG() - Computes the average value
- MIN() - Finds the smallest value
- MAX() - Finds the largest value
Example: Sales Table
Sale ID | Customer Name | Amount | Order Date |
---|---|---|---|
1 | Alice | 250 | 2024-02-01 |
2 | Bob | 450 | 2024-02-02 |
3 | Alice | 300 | 2024-02-03 |
4 | Charlie | 500 | 2024-02-04 |
5 | Alice | 150 | 2024-02-05 |
Using Aggregate Functions
SELECT COUNT(*) AS total_sales FROM sales; SELECT SUM(amount) AS total_revenue FROM sales; SELECT AVG(amount) AS average_sale FROM sales; SELECT MIN(amount) AS min_sale FROM sales; SELECT MAX(amount) AS max_sale FROM sales;
4.2 GROUP BY and HAVING Clauses
The GROUP BY clause is used to group data for aggregation, and the HAVING clause filters aggregated results.
SELECT customer_name, SUM(amount) AS total_spent FROM sales GROUP BY customer_name;
SELECT customer_name, SUM(amount) AS total_spent FROM sales GROUP BY customer_name HAVING SUM(amount) > 500;
4.3 Advanced Aggregations with ROLLUP and CUBE
Using ROLLUP
SELECT customer_name, SUM(amount) AS total_spent FROM sales GROUP BY ROLLUP(customer_name);
Using CUBE
SELECT customer_name, order_date, SUM(amount) AS total_spent FROM sales GROUP BY CUBE(customer_name, order_date);
4.4 Interactive Exercise: Calculating Descriptive Statistics with SQL
Exercise 1: Total Sales Per Day
SELECT order_date, SUM(amount) AS daily_sales FROM sales GROUP BY order_date;
Exercise 2: Finding Customers with High Spending
SELECT customer_name, COUNT(sale_id) AS total_purchases, SUM(amount) AS total_spent FROM sales GROUP BY customer_name HAVING SUM(amount) > 500;
Exercise 3: Analyzing Sales Trends
SELECT customer_name, order_date, SUM(amount) AS sales_amount FROM sales GROUP BY CUBE(customer_name, order_date);
Conclusion
In this chapter, we explored:
- Aggregation functions: COUNT, SUM, AVG, MIN, MAX
- Grouping data with GROUP BY and filtering with HAVING
- Advanced aggregations with ROLLUP and CUBE
- Interactive exercises for practicing SQL queries
These SQL techniques are essential for data analysis and reporting.