Hands-on Beginner’s Guide to SQL
This article was published as a part of the Data Science Blogathon.
From the very beginning of my SQL journey, I have been fascinated by this database language. The ability of SQL to be able to update, modify, delete and store data with very few lines of code is the most engaging part of it.
In this guide, we will go through SQL and its applications comprehensively.
Structured Query Language
The Structured Query Language or SQL as we call it was initially developed by Donald D. Chamberlin and Raymond F. Boyce at IBM after gaining an insight into the relational model from Edgar F. Codd in the early 1970s. Originally called SEQUEL (Structured English Query Language), in 1987 it became an ISO standard.
SQL is a domain-specific programming language that allows managing, retrieving, storing, and manipulating the data inside a relational database management system or RDBMS. It is broadly used and highly recommended by popular RDBMS like MySQL, SQL Server, and Oracle.
Different versions of SQL have been developed over the years, each built to handle a specific set of instructions, some scaled-down versions are available for free while others are constructed to higher levels for support and scalability. It has recently become available for Linux users otherwise for over 20 years SQL worked exclusively on Windows.
SQL plays a vital role in data science and analytics nowadays. Some reasons why are explained below:
- It allows users to insert, update, modify and delete the data from the relational database without any hassle, which in turn is very beneficial to data professionals.
- It is useful to describe the structured data and also helps in creating the view, stored procedures, and functions.
- Not just data, users can create, modify and drop the database and the tables inside it.
- SQL permits the set of constraints on table columns, views, and stored procedures.
- Its security mechanism allows access to data without granting permissions to access the underlying base tables.
When any SQL command is executed for the relational database management system, the SQL engine determines how to interpret the task and the system decides the optimized way to carry out the user’s request, and there are various components involved in the process. The components are:
Classic Query Engine
SQL Query Engine, etc.
All the non-SQL queries are handled by a classic query engine but logical files are not handled by a SQL query engine.
A simple diagram of SQL Architecture is as follows:
Setting Up Work Environment for Practicing SQL
To perform SQL queries on the data in any database, it is required to install a free, open-source database management system in your system. Some most popular and widely used DBMS are MySQL, Oracle, MongoDB, SQL Server, PostgreSQL, etc.
Some links to install the same are given below for the reference:
- MySQL: https://dev.mysql.com/downloads/mysql/
- Oracle: https://www.oracle.com/tools/downloads/sqldev-downloads.html
- SQL Server Express: https://www.microsoft.com/en-in/download/details.aspx?id=30438
SQL statements are easy to write, understand, and straightforward like plain English just with syntax. The most highly used SQL commands along with their functionality are mentioned below:
- CREATE command
To create a new database, table, table view, and other objects of the database, this command is used.
- UPDATE command
To change or update the stored data of the database, the UPDATE command is used.
- DELETE command
This command is used when the stored data needs to be erased or deleted from the database. At a time single or multiple rows can be removed from the table.
- SELECT command
For accessing the rows of one or multiple tables of the database this command is used along with the WHERE clause.
- INSERT command
The data is added or inserted in the table of the database using this command.
To write any query in SQL, the predefined syntax needs to be followed. The syntax of SQL as defined by the ISO and ANSI standards consists of a unique set of directions and guidelines, which is not case-sensitive. Some important rules to keep in mind are:
keywords of SQL can be written in both uppercase and lowercase, uppercase
keywords are preferred due to increased readability.
- A SQL statement can be placed in single or multiple
text lines depending on the user as SQL syntax or statements depend on
- Mostly the database can be operated using queries of
- Relational algebra and tuple relational calculus are
the basis of SQL syntax.
The SQL statements instruct the database on what information the user wants and what operation the user would like to perform on the data. Every statement starts with SQL keywords and ends with a semicolon (;), which separates multiple statements in the same cell. Here is an example of a valid SQL statement
SELECT name, id, dept FROM employee WHERE dept = "HR";
For better understanding and readability, the same statement can be written as:
SELECT name, id, dept FROM dept WHERE dept = "HR";
Before we go any further it is important to know that by default SQL is case-Insensitive, but this might not be the case depending upon the operating system. While Windows platforms are case-insensitive, Linux platforms aren’t.
Now that we have a basic understanding of SQL, let’s look at the syntax and example statements of all the basic commands in SQL:
1. SELECT Statement
This statement interprets the data from the SQL database and reflects the output to the database user.
Syntax of SELECT Statement:
SELECT column_name1, column_name2, column_name3, ......., column_nameN FROM table_name WHERE condition;
Example of SELECT Statement:
SELECT stud_name, stud_id, grade FROM student_details WHERE grade = "A";
The above command will simply select stud_name, stud_id, and grade column from the table student_details given the condition is satisfied i.e., only the tuples with grade A are selected.
2. UPDATE Statement
This SQL statement modifies the stored data in the database according to the user.
Syntax of UPDATE Statement:
SET column_name1 = new_value1, column_name2 = new_value2, ...., column_nameN = new_valueN WHERE condition;
Example of UPDATE Statement:
UPDATE student_detail SET grade ="A" WHERE stud_id = 12;
This example modifies the grade of the student whose stud_id is 12 in the table.
3. DELETE Statement
This statement deletes the stored data.
Syntax of DELETE Statement:
DELETE FROM table_name WHERE condition;
Example of DELETE Statement:
DELETE FROM student_details WHERE name = "Joe";
This command deletes the record of the student whose name is Joe in the table.
4. CREATE TABLE Statement
This statement is used to create a new table in the SQL database.
Syntax of CREATE TABLE Statement:
CREATE TABLE table_name( column_name1 data_type [column1 constraint(s)], column_name1 data_type [column1 constraint(s)], ..... ....., column_name1 data_type [column1 constraint(s)], PRIMARY KEY(one or more col) );
Example of CREATE TABLE Statement:
CREATE TABLE student_details( Name VARCHAR(30), Stud_id int(4), Grade VARCHAR(2), PRIMARY KEY(Stud_id) );
This example creates the table student_details with three columns as specified along with their datatype. The Stud_id column in the table acts as a primary key, which means that the Stud_id column cannot contain duplicate values and null values.
5. INSERT INTO Statement
This SQL statement is used to insert the records into an existing or created table of the database. It can easily insert one or more than one records in a single query statement.
Syntax of INSERT INTO Statement:
INSERT INTO table_name (column_name1, column_name2, ...., column_nameN) VALUES (value1, value2, ...., valueN), (value1, value2, ...., valueN), .....;
Example of INSERT INTO Statement:
INSERT INTO student_details (name, stud_id, grade) VALUES ("Joe", 13, "B"), ("John", 21, "A"), ("Beck", 32, "B");
This example inserts the record of three students in student_details.
6. ALTER TABLE Statement
To add, delete and modify the columns of the table in the SQL database, this statement is used.
Syntax of ALTER TABLE Statement:
ALTER TABLE table_name ADD column_name datatype[(size)];
The above statement will add the described column in the table.
ALTER TABLE table_name MODIFY column_name datatype[(size)];
The alter statement renames the old column name to the new one.
ALTER TABLE table_name DROP COLUMN column_name;
The above alter statement deletes the specified column.
Example of ALTER TABLE Statement:
ALTER TABLE student_details ADD class_number int(2);
This example adds the new field class_number in the student_details table.
7. DROP TABLE Statement
This statement is used to delete or remove the table and permissions, views, and everything associated with the table.
Syntax of DELETE TABLE Statement:
DROP TABLE table_name1, table_name2, ...., table_nameN;
Example of DROP TABLE Statement:
DROP TABLE student;
This example drops the table student from the SQL database, which removes complete information from the table.
8. CREATE DATABASE Statement
To create a new database in the database management system, this statement is used.
Syntax of CREATE DATABASE Statement:
CREATE DATABASE database_name;
Example of CREATE DATABASE Statement:
CREATE DATABASE Class;
By executing the above example, a class database is created in the system.
9. USE Statement
This SQL statement selects the existing database. Before working on the data in any database, it needs to be selected.
Syntax of USE Statement:
Example of USE Statement:
The above example selects the class database.
10. DROP DATABASE Statement
The existing database is deleted with all the tables and views from the system through this statement.
Syntax of DROP DATABASE Statement:
DROP DATABASE database_name;
Example of DROP DATABASE Statement:
DROP DATABASE Class;
The above example deletes the class database from the system.
11. DESCRIBE Statement
This statement provides the details of the specified table or the view.
Syntax of DESCRIBE Statement:
DESCRIBE table_name | view_name;
Example of DESCRIBE Statement:
This example provides the details about the structure and other details of the student_details table.
12. DISTINCT Clause
This statement shows the distinct values of specified columns of the table. This clause is used with the SELECT keyword.
Syntax of DISTINCT Clause:
SELECT DISTINCT column_name1, column_name2, .... FROM table_name;
Example of DISTINCT Clause:
SELECT DISTINCT grade FROM student_details;
The above example will show distinct grade values from the student_details table.
13. TRUNCATE TABLE Statement
This statement deletes all the stored data from the selected table of the database.
Syntax of TRUNCATE TABLE Statement:
TRUNCATE TABLE table_name;
Example of TRUNCATE TABLE Statement:
TRUNCATE TABLE student_details;
The above example wipes away all the data stored in the student_details table.
14. COMMIT Statement
This statement makes the changes made by the user in the transaction permanent.
Syntax of COMMIT Statement:
Example of COMMIT Statement:
DELETE FROM student_details WHERE fee = 5000 COMMIT;
This example deletes the records of children whose fee is 5000 and then it makes the changes permanently.
15. ROLLBACK Statement
This statement undoes all the transactions and operations which were unsaved in the SQL database.
Syntax of ROLLBACK Statement:
Example of ROLLBACK Statement:
DELETE FROM student_details WHERE grade = "N" ROLLBACK;
This statement deletes the records of those students where the grade is N and then it undoes the changes in the database.
16. CREATE INDEX Statement
This statement creates a new index in the selected table of the database.
Syntax of CREATE INDEX Statement:
CREATE INDEX index_name ON table_name (column_name1, column_name2, ...., column_nameN);
Example of CREATE INDEX Statement:
CREATE INDEX idx_name ON student_details (name);
This statement creates an index idx_name on the name column of the student_details table.
17. DROP INDEX Statement
This statement is to delete the existing index of the SQL database table.
Syntax of DROP INDEX Statement:
DROP INDEX index_name;
Example of DROP INDEX Statement:
DROP INDEX idx_name;
The above example deletes the index idx_name from the database.
18. ORDER BY Clause
This clause is used to sort the records on the basis of the specified columns in the table. The ASC or DESC keywords are used with the clause to sort the data ascendingly or descendingly respectively.
Syntax of ORDER BY Clause:
SELECT column_name1, …, column_nameN FROM table_name
ORDER BY coulmn_name (ASC | DESC);
Let’s have a quick glance at what SQL keys are:
There are times when data in a table runs in thousands of rows with duplicate rows and redundant data. That’s where keys come in handy, to deal with duplicate data and rows, to manage and only store unique data, and most importantly to link multiple tables in a database. Different keys in SQL are:
- Primary Key
A column is said to be the primary key if it identifies each row in the table uniquely and cannot have null values. To create a primary key, PRIMARY KEY constraint while creating or modifying the table. When multiple rows satisfy the condition of the primary key, it is known as a composite primary key.
- Foreign Key
This key is used to establish a link between two tables. It uses the primary key of one table to point to the primary key of another table.
- Unique Key
A column is said to be the unique key if it identifies each row in the table uniquely. It is the same as primary but the only difference is that it allows null values and it cannot have duplicate data.
- Alternate Key
An alternate key is just a secondary key i.e. it is a candidate that is not selected as the primary key.
Advantages of SQL
SQL is a promising part of data science and analytics. It is perfect for communication between the database and data professionals, and users. Some major benefits of SQL are:
Speed is a priority user never overlook. In SQL a large amount of data from the database is processed coherently and quickly. Query Processing is done in less time.
- Programming is not needed
No extra hours or energy is spent on writing a large number of coding lines for query processing or database management. By following SQL syntactical rules, the database can be easily accessed and managed in a user-friendly manner.
SQL can be easily used on laptops, tablets, desktop computers, and smartphones as well. It can adjust with other applications according to the requirements.
SQL is a user-friendly language, easy to learn, interpret and understand. It can solve complex queries in mere seconds and it’s no challenge to access the database.
SQL follows well-established standards of ISO and ANSI, which offers a uniform platform across the globe.
- Multiple Data View
Different database users are allowed multiple views of the database structure.
Disadvantages of SQL
The disadvantages of SQL are listed below:
- Limited Database control
The data professionals and users of SQL are not allowed complete control of the database since business rules are hidden.
- Complex Interface
The interface of this query language is complicated, which makes it difficult for users to use and manage it.
Some more advanced operations of SQL versions are costly hence making it difficult for some programmers to use it.
We have covered the basics of Structured Query Language along with basic commands and statements. We now know what is it and have a grasp of how it works. Therefore we can conclude that it is one of the best ways to deal with large data efficiently while taking less time.
Image: SQL_Architecture: https://www.tutorialspoint.com/sql/images/sql-architecture.jpg
The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.