Use of Aggregate Functions and GroupBy in SQL

Sonali Dash 26 May, 2022 • 6 min read

This article was published as a part of the Data Science Blogathon.

Introduction on Aggregate functions in SQL

Before going to our main concept, we must first learn- What are function and type?

Function in SQL is a set of SQL statements that performs a specific task. SQL provides a variety of functions which are grouped into two categories:

1. Single row function

2. Multiple row function.

Single row functions operate on a single row and one result per row. This function can accept one or more arguments and returns one value for each row. And can be used with SELECT, WHERE and ORDER BY. Whereas the multiple row function is used with multiple rows and gives aggregate value.

This multiple row function is also known as the Aggregate function and Non-Scalar function. In SQL normally we work on single row values but when it comes to working on multiple rows at that time we use an aggregate function. Aggregate functions are used to perform specific types of operations such as counting the number of records in a table, searching maximum and minimum values from a table, finding out the sum and average values from a table, etc. An aggregate function ignores NULL values in a table when performing the calculation except for the COUNT() function.

The aggregate function is used in the SELECT statement.

 Let’s discuss the types of aggregate functions-

Types Of Aggregate function

 SQL provide several types of aggregate functions they are SUM(),AVG(),MAX(),MIN(),COUNT(),FIRST() and LAST(). SUM(), AVG(), MAX(), MIN() these functions are applied on those attributes which has integer type value for example salary of employees, age of employees, marks of students etc.

SUM()- This function is used to find out the sum of a specified column in a set of rows.

Syntax: SELECT SUM(column_name) FROM table_name;
Example: SELECT SUM(Salary) FROM Employee;

 

Types Of Aggregate function

AVG()- This function is used to calculate the average of a specified column in a set of rows.

Syntax: SELECT AVG(column_name) FROM table_name;
Example: SELECT AVG(Salary) FROM Employee;

 

Types Of Aggregate function Image 2

MAX()– This function retrieves the maximum value of a specified column.

Syntax: SELECT MAX(column_name) FROM table_name;
Example: SELECT E_NAME,MAX(SALARY) FROM Employee;

 

Types Of Aggregate function Image 3

MIN()- This function retrieves the minimum value of a specified column.

Syntax: SELECT MIN(column_name) FROM table_name;
Example: SELECT MIN(Salary) FROM Employee;

 

Types Of Aggregate function Image 4

COUNT()– This function returns the number of rows in a database table.

Syntax: SELECT COUNT(*) FROM table_name;
       SELECT COUNT(DISTINCT column_name) FROM table_name;
Example: SELECT COUNT(*) FROM EMPLOYEE ;
        SELECT COUNT(DEPARTMENT) FROM EMPLOYEE WHERE SALARY>30000;

 

Types Of Aggregate function Image 5
Types Of Aggregate function Image 6

When we use COUNT(*), it counts all rows and gives the output as the total number of rows present in the employee table. And in the next example, we use COUNT(DEPARTMENT) with where clause by putting the condition SALARY > 30000, then it counts those DEPARTMENT salaries where employees get more than 30000.

FIRST()-This function returns the first value of a specified column.

Syntax: SELECT FIRST(column_name) FROM table_name;
Example: SELECT FIRST(Name) FROM EMPLOYEE ;

LAST()-This function returns the last value of a specified column.

Syntax: SELECT LAST(column_name) FROM table_name;
Example: SELECT  LAST(Name) FROM EMPLOYEE ;

We can use aggregate functions in a particular column. If there is any query that we have to calculate or count a group of values from a specified table to resolve this kind of issue in SQL we use the GROUP BY clause with aggregate functions. For that reason, the Aggregate function is also known as the GROUP function. Let’s discuss how to implement GROUP BY with aggregate function in SQL.

GROUP BY

The GROUP BY clause is a SQL command used to group rows with the same values. The GROUP BY clause is used in the SELECT statement. Also, we use the WHERE statement with GROUP BY.WHERE clause is used before GROUP BY. To represent a group of values in a particular manner ORDER BY clause is also used with GROUP BY and  GROUP BY placed before the ORDER BY.

Syntax: 
SELECT column_name FROM table_name 
WHERE [condition] 
GROUP BY column_name
ORDER BY column_name;
Example:
SELECT DEPARTMENT, SUM(SALARY) FROM EMPLOYEE 
GROUP BY DEPARTMENT
ORDER BY DEPARTMENT;
Types Of Aggregate function Image 7

Like the SUM() function we can also use other aggregate functions(AVG(),MAX(),MIN(),COUNT(*)) with GROUP BY.

Example:
SELECT DEPARTMENT, AVG(SALARY) FROM EMPLOYEE GROUP BY DEPARTMENT
ORDER BY DEPARTMENT;

Types Of Aggregate function Image 8
Example:

SELECT DEPARTMENT, MIN(SALARY) FROM EMPLOYEE

GROUP BY DEPARTMENT

ORDER BY DEPARTMENT;

Types Of Aggregate function Image 9

After using this query it displays the minimum salary of each department.

Example:
SELECT DEPARTMENT, MAX(SALARY) FROM EMPLOYEE
GROUP BY DEPARTMENT
ORDER BY DEPARTMENT;
Types Of Aggregate function Image 10

After using this query it displays the maximum salary of each department.

Example:

SELECT DEPARTMENT, COUNT(E_NAME) FROM EMPLOYEE
GROUP BY DEPARTMENT
ORDER BY DEPARTMENT;
Types Of Aggregate function Image 11

This query counts the number of employees working in each department.

We can also use the GROUP BY without applying the Aggregate function. In this case GROUP BY works like a DISTINCT clause which never shows any duplicate value in the result set.

Example:
SELECT E_NAME, Salary FROM EMPLOYEE 
GROUP BY DEPARTMENT;

 

Types Of Aggregate function Image 12

In the above example, we use the GROUP BY clause without an aggregate function. In the EMPLOYEE table, there are a total of 8 rows but it returns only 4-row values i.e one value from each department (Tech, HR, IT, Sales) because the rest of the row values belongs to those departments. It displays the first value of each department and avoids the rest values. Here GROUP BY works like the DISTINCT clause which avoids duplicate rows from the result set.

HAVING

 Sometimes we do not want to see the whole output produced by the GROUP BY. To filter the group of data based on a specified list of conditions the HAVING clause is often used with the GROUP BY. HAVING clause is placed after GROUP BY. Both WHERE and HAVING clauses can be used in the same query. As the WHERE clause can not be used with aggregate functions like SUM(), AVG(), MIN(), etc… for that reason HAVING clause is used with an aggregate function to filter the group of a result set.

Syntax:
SELECT column 1,column 2
FROM table
WHERE [conditions]
GROUP BY column 1
HAVING [condition]
ORDER BY column 1;
Example:
SELECT DEPARTMENT, AVG(SALARY)FROM EMPLOYEE
GROUP BY DEPARTMENT 
HAVING AVG(SALARY)>28000;
Types Of Aggregate function Image 12

In the above example, the query returns those departments’ average salary whose average salary is more than 28000.

Types Of Aggregate function Image 13

In the above example, we use the WHERE clause with aggregate functions instead of the HAVING clause and it returns the syntax error message.

Conclusion on Aggregate functions

The aggregate function is one of the most powerful concepts of SQL. This article has seen all the important concepts of the aggregate function with proper examples of how they work with queries. The key takeaways from the article are,

  • We discuss the function and its types.
  • Learned about types of aggregate functions with proper examples.
  • Also learned how GROUP BY  is used with aggregate function and its importance. And  How the HAVING clause work with GROUP BY and aggregate functions.

The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.

Sonali Dash 26 May 2022

Frequently Asked Questions

Lorem ipsum dolor sit amet, consectetur adipiscing elit,

Responses From Readers

Clear