Data Engineering for Beginners – Difference Between OLTP and OLAP
- OLTP and OLAP are 2 data processing capabilities
- Understand the difference between OLTP and OLAP
You acquire new information every day. But it’s only after analyzing it that you are able to add it to your knowledge and make more informed decisions. The same is the story with every organization around the world. Organizations collect huge volumes of data and then process it to make sense out of it and take the apt business decision. Unlike humans, an organization has two types of data processing capabilities: OLTP and OLAP.
Contrary to their name, there is a stark difference between the two. While one manages real-time processes, the other helps in analyzing huge volumes of data for improving the capabilities of the organization. Let’s understand this difference between the two in a bit more detail in this article.
Table of Contents
- Whats is OLTP?
- Example of OLTP
- Characteristics of OLTP
- What is OLAP?
- Example of OLAP
- Characteristics of OLAP
- OLTP vs OLAP
What is OLTP?
OLTP, or Online Transactional Processing, systems handle a large number of transactions happening in real-time. But, what are the transactions?
Well, Transactions are processes that occur in their entirety and in isolation from one another. They either insert, update, or delete data in a database. On successful execution, the changes made by a transaction to a database persist in the database even in the event of a system failure.
The transactional data is stored in Relational Databases that ensure ACID properties for transactions. This data is written and queried at a very high pace to prevent any delay in processing.
OLTP governs transactions because they are the critical processes that we encounter in our everyday life. Online transactions, e-commerce orderings, online hotel bookings, atm transactions, etc. are all managed by OLTP processes.
Examples of OLTP
Imagine you log onto an e-commerce website to book the last pair of your favorite headphones which are currently on sale. Consider the following:
- Multiple people might be trying to book the headphones but none are aware of the processes of the others. (Isolation)
- The order will be considered successful only when the entire steps along with the payment are completed by any user. (Atomicity)
- Once the order is successfully completed by a user, it will be updated in the website database. The headphones will then become unavailable on the website. (Consistency)
- Now, even if the e-commerce website goes down due to a deluge of user traffic, the user still owns the headphone they bought successfully. (Durability)
OLTP ensures that such transactions are carried out without any inconsistencies in the database with the help of the ACID (Atomicity, Consistency, Isolation, Durability) properties (that we just discussed).
Characteristics of OLTP
- It handles real-time transactions.
- These systems modify data in the database.
- They handle transactions that are governed by the ACID properties.
- These systems store data in Relational Databases.
- The implementation of OLTP transactions is usually very fast, in the order of milliseconds.
What is OLAP?
Organizations have data generated from transactions stored in various OLTP databases. But this data is of no use unless analyze it to derive valuable insights for the organization. However, querying this data directly from OLTP databases is not efficient because of the sheer amount of data and the complexity of the queries that need to be written. Therefore, we store this data in a different database called OLAP databases.
OLAP, or Online Analytical Processing, databases store data in an aggregated form from multiple OLTP databases. This data is then stored within a data warehouse. But, instead of a transaction-level view, it gives a multidimensional view of the data. This means that if the organization wants to view the aggregated sales data, they can view them according to multiple categories and sub-categories- location (region, country, state), time (year, month, day), the customer (gender, age), etc. This enables organizations to perform advanced analytics on their data, thereby giving a deeper understanding of their products.
It goes without saying that the key to the success of OLAP databases is the multidimensional view. But how is it stored? Let’s take an example to understand that.
Example of OLAP
Taking our previous example of the e-commerce website, imagine that the company now wants to analyze sales for the year that went by. But simply looking at the sales per month is too simplistic. Looking at sales per month categorized by region would be a better approach.
This table is known as an OLAP cube with two dimensions (quarter and region) and the aggregated data stored here is known as a measure.
The data stored in data warehouses is in a similar fashion with as many dimensions as required for analysis.
Here, Time, Region, and Category are dimensions. The shaded cell of the cube shows data for the month of February, in the North region for the Clothes category.
Characteristics of OLAP
- It deals with historical data.
- These systems do not make changes to the data.
- It stores data in data warehouses in multidimensional form.
- It is used for data analysis purposes.
- The data is never modified.
OLTP vs OLAP
(Online Transactional Processing)
(Online Analytical Processing)
|Functionality||Manages transactions that modify data in databases.||Used for analytical and reporting purposes.|
|Source||Real-time transactions of organizations.||Data is consolidated from various OLTP databases.|
|Storage format||Tabular form in Relational Databases.||Multidimensional form in OLAP cubes.|
|Operation||Read and write||Read-only|
|Response time||Fast processing since queries are simple.||Slower than OLTP|
|Users||Executives, Data scientists||Programmers, Database professionals|
To summarize, in this article, we discussed, the difference between OLTP and OLAP. Sure it is a rudimentary topic but an essential one to build your foundation for a data engineering career.
If you looking for more such articles in the data engineering domain, I highly recommend the following articles to kickstart your journey:
I hope this article was able to give clear explanation to the topic. Let me know if you have any queries in the comments below.