Most Important SQL Queries for Beginners
This article was published as a part of the Data Science Blogathon.
SQL stands for “Structured Query Language” and is a programming language that is used to query and manipulate relational databases. The number of data points is increasing every minute, but raw data offers no insight. All this data is stored in databases, and professionals use SQL to extract it for further analysis.
Figure 1: Overview of SQL
A relational database stores data in tables made up of rows and columns, and SQL is the language used to work with it. SQL lets you retrieve specific information from a database so that it can be analysed. You need it to extract data from a company’s database even if the analysis itself is performed on another platform such as Python or R.
This guide discusses how to query databases using Structured Query Language (SQL). We will go through concepts such as:
CRUD (create, read, update, delete) operations
DML and DDL commands
Primary key, foreign key, unique key
Inner join, natural join, left join, right join, cross join, self join
The code associated with each of these features is also included.
How does SQL Work?
SQL is implemented by many database management systems, each with its own dialect; one of the most popular is MySQL. MySQL is an open-source database management system that serves as a back-end data management solution for web applications. SQL is used by companies such as Facebook, Instagram, and WhatsApp for back-end data storage and processing. When an SQL query reaches the database server, it is compiled in three phases before it runs: parsing, binding, and optimization.
Parsing – Checks the syntax of the query.
Binding – Checks the semantics of the query, for example that the tables and columns it references exist.
Optimization – Generates the query execution plan.
During optimization, the query optimizer weighs alternative ways of executing the query and picks the most efficient execution plan it can find in a reasonable amount of time.
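Most database systems let you look at the plan the optimizer chose. The sketch below assumes MySQL and reuses the college table from the commands section of this guide; putting EXPLAIN in front of a query asks the server to describe its execution plan instead of returning the query's rows.

```sql
-- Assumes MySQL. The table mirrors the college table used later in this guide.
CREATE TABLE IF NOT EXISTS college (
    id        INT,
    firstname VARCHAR(30),
    lastname  VARCHAR(30),
    major     VARCHAR(20)
);

-- EXPLAIN reports the execution plan chosen for the query that follows it.
EXPLAIN
SELECT firstname, lastname
FROM college
WHERE id = 1;
```

The columns and layout of the EXPLAIN output differ between database systems and versions, so no sample output is shown here.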
What is SQL used for?
SQL can perform various actions on databases such as:
1. Query a database
2. Retrieve data from database
3. Add records to a database
4. Edit records in a database
5. Remove records from a database
6. Add tables to an existing database
Adding, retrieving, editing, and removing records are also referred to as the “CRUD” (create, read, update, delete) operations.
Figure 2: “CRUD” Operations
Working with basic SQL commands:
The SQL commands covered in this guide can be broadly divided into two types:
1. DDL: Data Definition Language (DDL) is used to define structures such as schemas, databases, tables, and constraints. ‘CREATE’ and ‘ALTER’ statements are examples of DDL.
2. DML: Data Manipulation Language (DML) is used to manipulate the data stored in those structures. ‘INSERT’, ‘UPDATE’ and ‘DELETE’ statements are examples of DML.
DDL commands:
1. CREATE: Creates the database or its objects (such as tables, indexes, functions, views).
CREATE TABLE college (id INT, firstname VARCHAR(30), lastname VARCHAR(30), major VARCHAR(20));
2. DROP: The DROP command is used to remove objects from the database.
DROP TABLE college;
3. ALTER: Modifies the structure of an existing database object, for example by adding a column to a table.
ALTER TABLE college ADD city VARCHAR(30);
4. TRUNCATE: Removes all records from a table and releases the space allocated for them. The table itself remains in the database.
TRUNCATE table college;
DML commands:
1. INSERT: Inserts data into a table.
INSERT INTO college (id, firstname, lastname, major) VALUES (1, 'Adam', 'Cole', 'Computer');
2. UPDATE: It is used to update existing data in a table.
UPDATE college set major = 'IT' where id=1;
3. DELETE: Deletes records from a database table. Without a WHERE clause, as in the statement below, every record in the table is removed.
DELETE FROM college;
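The commands above cover create, update, and delete, but not the “read” part of CRUD, which is done with SELECT. The following is a minimal sketch of all four operations on the college table. The second student and the sample values are made up for illustration, and the first statement drops any existing college table, so run it only in a scratch database.

```sql
-- Sketch only: sample rows are made up. Starts from a clean table.
DROP TABLE IF EXISTS college;

CREATE TABLE college (
    id        INT,
    firstname VARCHAR(30),
    lastname  VARCHAR(30),
    major     VARCHAR(20)
);

-- Create: add two records.
INSERT INTO college (id, firstname, lastname, major)
VALUES (1, 'Adam', 'Cole', 'Computer'),
       (2, 'Jane', 'Doe', 'Physics');

-- Read: fetch selected columns for the rows that match a condition.
SELECT id, firstname, major
FROM college
WHERE major = 'Computer';

-- Update: change one record, identified by its id.
UPDATE college SET major = 'IT' WHERE id = 1;

-- Delete: remove only the record with id = 2.
-- Without the WHERE clause, every row in the table would be deleted.
DELETE FROM college WHERE id = 2;
```

After these statements run, the table holds a single row: Adam Cole with the major 'IT'.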
What are Primary Key, Foreign Key and Unique Key?
Primary key: The primary key constraint identifies each record in a table uniquely. Primary keys cannot contain NULL values, and must contain unique values. There can be only one primary key in a table, and this primary key may consist of one or more columns.
The ‘ID’ column of the “college” table will be the primary key, as it contains unique values and can be used to uniquely identify each record.
Foreign Key: A foreign key is a field (or set of fields) in one table that refers to the primary key of another table. The table containing the foreign key is called the child table, and the table whose primary key it references is called the parent table.
Table 2: GYM
The ID column is present in both the “college” and “gym” tables: it is the primary key in the “college” table and a foreign key in the “gym” table. This makes “college” the parent table and “gym” the child table.
Unique Key: A unique constraint ensures that all values in a column are unique. Both the unique and primary key constraints ensure uniqueness for a column or set of columns. The difference between them is that a unique key can contain NULL values, while a primary key cannot.
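The three constraints can be declared when the tables are created. The sketch below assumes MySQL with its default InnoDB storage engine, which enforces foreign keys. The email column and the membership column are hypothetical additions for illustration and are not part of the tables described above. The first two statements drop any existing gym and college tables.

```sql
-- Hypothetical definitions: the email and membership columns are
-- assumptions made for illustration only.
DROP TABLE IF EXISTS gym;       -- drop the child table first
DROP TABLE IF EXISTS college;

CREATE TABLE college (
    id        INT PRIMARY KEY,        -- unique, and NULL is not allowed
    firstname VARCHAR(30),
    lastname  VARCHAR(30),
    major     VARCHAR(20),
    email     VARCHAR(60) UNIQUE      -- unique, but NULL is allowed
);

CREATE TABLE gym (
    id         INT,
    membership VARCHAR(20),
    -- Each gym.id must match an existing college.id.
    FOREIGN KEY (id) REFERENCES college (id)
);

INSERT INTO college (id, firstname, lastname, major, email)
VALUES (1, 'Adam', 'Cole', 'Computer', 'adam@example.com');

-- Accepted: a college row with id = 1 exists.
INSERT INTO gym (id, membership) VALUES (1, 'monthly');

-- This insert would be rejected, because no college row has id = 99:
-- INSERT INTO gym (id, membership) VALUES (99, 'monthly');
```

Declaring the foreign key in the child table is what lets the database refuse gym records that point at a student who does not exist.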
How do we Join Two Tables?
With a JOIN clause, you combine rows from multiple tables based on a common column between them. As described in the primary key and foreign key example, the “ID” column is common to both tables, so we can use JOIN to relate the tables and select the records that match in both.
Types of joins in SQL:
1. Inner Join: Returns only the rows that have matching values in both tables, according to the join condition.
Select Student_ID, StudentName, TeacherName, TeacherEmail FROM students INNER JOIN Teachers ON students.TeacherID = Teachers.TeacherID;
2. Natural Join: A type of inner join that joins two or more tables based on the same column names and data types present in both tables.
SELECT * from students NATURAL JOIN Teachers;
3. Left Join: LEFT JOIN retrieves all records from the left table (table1) and the matching rows from the right table (table2). Where a left-table row has no match in the right table, the right table’s columns are returned as NULL.
Select product_detail.ProductID, ProductName, CustomerName, City, Amount FROM product_detail LEFT JOIN customer_detail ON product_detail.ProductID = customer_detail.ProductID;
4. Right Join: RIGHT JOIN retrieves all records from the right table (table2) and the matching rows from the left table (table1). Where a right-table row has no match in the left table, the left table’s columns are returned as NULL.
Select product_detail.ProductID, ProductName, CustomerName, City, Amount FROM product_detail RIGHT JOIN customer_detail ON product_detail.ProductID = customer_detail.ProductID;
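5. Cross Join: This is also known as the CARTESIAN JOIN, which returns the Cartesian product of two or more joined tables. CROSS JOIN produces a table that merges each row from the first table with each row from the second table. The query below uses the implicit form, listing both tables in FROM without a join condition; it is equivalent to writing product_detail CROSS JOIN customer_detail.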
Select product_detail.ProductID, ProductName, CustomerName, City, Amount FROM product_detail, customer_detail;
6. Self Join: A JOIN in which a table is joined to itself as if it were two tables. The SQL statement gives a temporary name (alias) to at least one copy of the table so that the two copies can be told apart.
Select TB.ProductID, TB.ProductName FROM product_detail TB, product_detail TB2 WHERE TB.AMOUNT < TB2.AMOUNT;
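To see how the join types differ, it helps to run them against a few rows. The sketch below assumes MySQL and reuses the students and Teachers table names from the inner join query above. The column types and all sample rows are made up for illustration, and the first two statements drop any existing tables with those names.

```sql
-- Made-up sample data; table and column names follow the INNER JOIN query above.
DROP TABLE IF EXISTS students;
DROP TABLE IF EXISTS Teachers;

CREATE TABLE Teachers (
    TeacherID    INT PRIMARY KEY,
    TeacherName  VARCHAR(30),
    TeacherEmail VARCHAR(60)
);

CREATE TABLE students (
    Student_ID  INT PRIMARY KEY,
    StudentName VARCHAR(30),
    TeacherID   INT               -- NULL means no teacher assigned yet
);

INSERT INTO Teachers (TeacherID, TeacherName, TeacherEmail)
VALUES (10, 'Ms. Rao', 'rao@example.com'),
       (20, 'Mr. Lee', 'lee@example.com');

INSERT INTO students (Student_ID, StudentName, TeacherID)
VALUES (1, 'Asha', 10),
       (2, 'Ben', 10),
       (3, 'Cara', NULL);

-- INNER JOIN: 2 rows (Asha and Ben, both with Ms. Rao).
SELECT StudentName, TeacherName
FROM students
INNER JOIN Teachers ON students.TeacherID = Teachers.TeacherID;

-- LEFT JOIN: 3 rows. Cara is kept, with TeacherName returned as NULL.
SELECT StudentName, TeacherName
FROM students
LEFT JOIN Teachers ON students.TeacherID = Teachers.TeacherID;

-- RIGHT JOIN: 3 rows. Mr. Lee is kept, with StudentName returned as NULL.
SELECT StudentName, TeacherName
FROM students
RIGHT JOIN Teachers ON students.TeacherID = Teachers.TeacherID;

-- CROSS JOIN: 3 x 2 = 6 rows, every student paired with every teacher.
SELECT StudentName, TeacherName
FROM students
CROSS JOIN Teachers;
```

The only difference between the first three queries is the join keyword, so comparing their results shows which unmatched rows each join type keeps.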
SQL is a language that provides access to, manipulation of, and communication with a database. It can be used to create a database, retrieve data from it, and manipulate that data by inserting, updating, and deleting records. It is a user-friendly, domain-specific language.
In this guide we briefly discussed some SQL queries and commands that can be used to perform CRUD (create, read, update, delete) operations on a database. SQL is widely used in Business Intelligence tools to manipulate and test data, and Data Science tools such as Spark and Impala depend heavily on it. It also offers fast query processing, which means that large amounts of data can be retrieved quickly and efficiently.
The advantages of SQL make it a very popular and highly demanded language. It is a reliable and efficient language for communicating with databases.
1. Figure 1: https://www.smartninja.de/blog/was-ist-sql-und-wo-kann-es-verwendet-werden-8
2. Figure 2: https://medium.com/geekculture/crud-operations-explained-2a44096e9c88
The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.