Interacting with Remote Databases – PostgreSQL and DBAPIs
This article was published as a part of the Data Science Blogathon.
When creating data pipelines, Software Engineers and Data Engineers frequently work with databases using Database Management Systems like PostgreSQL.
This article explores the fundamentals of interacting with a database and using Database APIs (DBAPI) to connect with a database from another language or web server. It covers the fundamentals of DBAPIs and how they enable Python-based database interactions.
A server is a centralized piece of software that interacts with users across a network (like the Internet) to provide services. A client is a software that may request a server for information, similar to the web browser on one’s computer. The browser (the client) contacts the server when accessing a web page, and the server responds by returning the page’s data.
Relational Database Clients
Any application that makes requests to a database is a database client. The database client may, in certain situations, be a web server. The web server fulfills requests from the browser as a server, but when it requests data from a database, it functions as a client to that database, and the database is the server (because it is fulfilling the request).
The database system PostgreSQL (Postgres), which can communicate with several clients, serves as the server in this scenario. The client might take a variety of forms, such as:
- Command Line programs
- Web apps with graphical user interfaces are downloaded to a computer, which makes calls to the Postgres server and anticipates answers to satisfy user requests.
The same network-wide communication protocol (TCP/IP) allows multiple clients to communicate with the same database server simultaneously in various ways. A database server can serve another web server that is serving additional clients. The database server then treats the web server as a client in such a scenario.
PostgreSQL Database System
Postgres has to be installed before utilizing it. MacOS may already have Postgres because it comes pre-installed in the computer. Here are some instructions for downloading and installing it, just in case:
The PGAdmin client and the psql client will both be installed. A good feature of Postgres is that it includes certain utility programs, such as createuser and createdb, that one may use as they learn more about. A user named postgres is created by default, and this user has complete superadmin access to the entire PostgreSQL instance running on the operating system.
What is Postgres (features):
- Many people believe that Postgres, an open-source, general-purpose, and object-relational database management system, is currently the most cutting-edge database system.
- It is a relational database system that has been enhanced with object-oriented capabilities and is cross-platform.
- Support for arrays (several values in a single column) and inheritance are two examples of object-relational support (child-parent relationships between tables).
- The SQL standard is fully supported.
- Transaction-based: Atomic transactions are used to perform database operations.
- Has multi-version concurrency management, preventing pointless locking when several writes are being made to the database simultaneously (avoiding waiting times for access to the database).
- Multiple databases may be read from and written to simultaneously using Postgres.
- It offers excellent performance and a variety of indexing options for improving query speed.
Postgres CLI Tools
A few fundamental operations and commands are available on the Postgres command line (psql client) for general reference.
Open the terminal, then sign in as a specific user to the psql client.
|MacOS||$ sudo -u -i|
|Linux||$ sudo -u psql|
The default installed user is called postgres. Later, the username and password can be changed.
$ psql -U Password for user postgres:
Creating a new database.
The installed default database is called postgres. However, a new database can be created using the below command (SQL statement). The new database can be opened using the “c” psql command.
postgres=# postgres=# CREATE DATABASE new_database; CREATE DATABASE postgres=# c new_database You are now connected to database "new_database" as user "postgres". new_database=#
Creating a sample table (table1) and populating it with sample data.
postgres=# postgres=# CREATE DATABASE new_database; CREATE DATABASE postgres=# c new_database You are now connected to database "new_database" as user "postgres". new_database=# CREATE TABLE table1 ( new_database(# id SERIAL PRIMARY KEY, new_database(# description VARCHAR(30) NOT NULL ); CREATE TABLE new_database=# INSERT INTO table1 (description) VALUES ('New table'),('called table 1'); INSERT 0 2 new_database=# SELECT * FROM table1; id | description ----+---------------- 1 | New table 2 | called table 1 (2 rows)
As can be seen, the SQL command (SELECT * FROM table1;) queries the database and displays all the content of table1. Exit the psql client using the quit command “q”.
DBAPIs and Psycopg2
When utilizing a given application (client) and a particular programming language, one may occasionally need to interface with the database (server), query its contents, and use its results. A DBAPI would be useful, for instance, if a data engineer wanted to create a data pipeline using Python.
Database Adapters is another name for DBAPIs in use today. They offer a common interface that allows a programming language like Python to communicate with a relational database server. A database adapter is a basic library to create SQL queries that connect to databases.
Every server framework or language has a distinct DBAPI for interacting with a database system. For instance, Python (Django or Flask) and Postgres utilize the psycopg2 DBAPI, whereas the NodeJS framework and Postgres database system use the node-postgres database adapter. In the case of psycopg2, the subject of the next section, the database adapter turns the output of a SQL query (such as SELECT * FROM table_name;) into a list of tuples in Python.
Installation of psycopg2 and using it to establish a connection to our Postgres server and interact with it in Python. The following are the steps:
- Make sure Python 3 (version 3.5 and above) is installed. Check with the command:
$ python --version Python 3.10.4
pip install psycopg2
Use the latest version of pip installed in your environment (i.e., pip3)
Basic CRUD Operations
Create a local directory and create a sample python file (sample.py). Open the file in a code editor (vs. code). Use the python code below to perform a basic SQL operation.
- Use the database created in the previous section (new_database). Create another one if possible.
- The code psycopg2.connect(“dbname=your_database_name”) connects to the existing database.
- The connection object offers the cursor() method. It is an interface that allows you to queue work and start database transactions.
- The cur.execute(‘SQL Statement’) executes the transaction on the database. Here, a table (newtable) is created and records inserted into it using SQL statements.
- The transactions must then be manually committed to the database using the connection.commit() method.
- The connection is then closed using the connection.close() and cur.close().
import psycopg2 # Establish a connection with to your existing database and start a session connection = psycopg2.connect('dbname=new_database user= host=localhost password=') # Create a session for database transactions cur = connection.cursor() # Create a new table in the database cur.execute(''' CREATE TABLE newtable ( id INTEGER PRIMARY KEY, status BOOLEAN NOT NULL DEFAULT False ); ''') # Insert records into the new table cur.execute(''' INSERT INTO newtable (id,status) VALUES (1,True); ''') # Commit the transactions in the session connection.commit() # Close the connection to the session connection.close() cur.close()
Save the sample.py script and then run it in a terminal: all python (or python3) and point to the saved python file using the command below.
$ python sample.py
NB: You might sometimes get an operational error:
$ python sample.py Traceback (most recent call last): File "D:Dev LearningDev ArticlesBlogathonsample.py", line 6, in connection = psycopg2.connect('dbname=new_database ') File "C:UsersonyanAppDataLocalProgramsPythonPython310libsite-packagespsycopg2__init__.py", line 122, in connect conn = _connect(dsn, connection_factory=connection_factory, **kwasync) psycopg2.OperationalError: connection to server at "localhost" (::1), port 5432 failed: fe_sendauth: no password supplied
Try adding a password and specifying the user.
psycopg2.connect("dbname=database user= host=localhost password=password")
Next, you check your database for any changes made by your python script. Reopen your psql client using the same steps as before, and view your tables using the “dt” command.
new_database=# new_database=# dt List of relations Schema | Name | Type | Owner --------+----------+-------+---------- public | newtable | table | postgres public | table1 | table | postgres (2 rows)
In conclusion, this article discussed common methods for interacting and connecting to Postgres databases. The main lessons learned were, in brief, as follows:
- Connections are the channels via which client software can communicate with a database server. As defined by the Client-server Model, connections are necessary to transmit requests and receive responses across a communication protocol (TCP/IP).
- In a database, interactions occur through units of work (Transactions) completed throughout sessions.
- Both database adapters like psycopg2 and postgres database clients like psql may be used to implement the database interactions.
As they work with enormous amounts of data, database systems like PostgreSQL, SQLite, MS SQL Server, etc., are essential to a Data Engineer’s work.
The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.