Deepti J — July 8, 2021
Advanced Data Engineering Project Python

This article was published as a part of the Data Science Blogathon

Introduction

Snowflake is a cloud data platform solution with unique features offered. In this tutorial, you will see an overview of Snowflake, its architecture, how to create a free trial account, and how to create a database, schema, and tables inside Snowflake.

 

Snowflake

Image source: Link

Table of Contents

  1. Introduction
  2. The Architecture of Snowflake
  3. Creation of Free Trial Account
  4. Beginner’s Hands-on
  5. Summary

1.Introduction

Snowflake is a completely managed cloud data platform which means there is no need to install any hardware or software that is required to work with it.

If you are from a SQL Developer background, then most of the time, your job includes how to write a faster query and also how to do performance tuning. W.r.t Snowflake even the performance tuning of the query is also taken care of by Snowflake itself. Maintainance or up-gradation is also managed by Snowflake itself. Snowflake cannot be run in an on-premise environment.

The workloads that are supported by Snowflake are listed below

  1. Data Engineering
  2. Data Lake
  3. Datawarehouse
  4. DataScience
  5. DataApplications
  6. DataExchange
I believe by now you have got an overview of what snowflake is. Now let’s dive quickly into Architecture.

2. The Architecture of Snowflake

The internal architecture of Snowflake comprises three layers as shown below.

  1. Database Storage Layer
  2. Query Processing Layer
  3. Cloud Services Layer
The Architecture of Snowflake

Image credit: Link

2a.Database Storage Layer

This layer utilizes the existing cloud storage capabilities of AWS, GCP, or Azure which is internal to Snowflake. For example, if it is AWS, then S3 is used. As an end-user of the product, this is not revealed and it is completely managed by Snowflake itself.

For example, if we create a table within snowflake, then we can see the table by querying the table from the user interface but we can’t access the table from the cloud storage provider.

2b. Query Processing Layer

This layer consists of several virtual warehouses which are used to process the queries. Each virtual warehouse is not dependent on the other. These represent EC2 compute instances if you are from an AWS background. As an end-user, we can only select the size of the warehouse and also the minimum and a maximum number of clusters that come within each virtual warehouse unlike setting up an EC2 instance in AWS.

2c.Cloud Services Layer

Cloud services layer is a coordinator service that comprises of below

  1. Authentication and Access Control
  2. Infrastructure Manager
  3. Optimizer
  4. Metadata Manager
  5. Security

3. Creation of a Free Trial Account

Snowflake provides a trial account for 400 credits free for one month to explore the product.  Let’s see how we can quickly connect to Snowflake.

Login to URL and provide all the necessary details like your first name, last name, email, company name and click on continue and shown below

 

Creation of a Free Trial Account snowflake

                                                                                                          Image source: Author

Then the below screen is popped up where you can see the editions that snowflake provides. Currently, it provides Standard, Enterprise, Business Critical. You are free to select anyone.

You need to provide the cloud provider name as shown below. This is for the storage provider that we discussed above in the Architecture layer. Here I have provided AWS, Once you select then you should also provide the region, Here I have provided Asia Pacific Mumbai as shown below. 

The reason for allowing the end-user to select the cloud provider and region is if in case the organization is already hosted on the cloud then we can select the same cloud for snowflake as well. We can also select the different clouds but we need to bear the data transfer cost. Typically we need to pay for the storage and compute capacity that we have used.

location

Image source: Author

Once you are done, you will receive an email for the registered email id to activate the account. Click to activate and set up your user name and password for the same. You will also receive a link to log in to your snowflake account like below.

https://AZ90921.ap-south-1.aws.snowflakecomputing.com/console/login.

This is the unique login link that we need to use every time to log in to the snowflake. Here AZ90921 is the unique account id generated for you. The rest of the URL contains cloud provider details like AWS,south-1 since we have selected the Mumbai region while logging in. Once you log in you will get the options to know quickly about the snowflake as shown below. Check the screenshot below on what you can do on the web-based user interface.

Welcome to snowflake

Image source: Author

4. Beginner’s Hands-on 

Let’s quickly see the hands-on now.

Open the worksheet and paste the commands below. Execute the commands using the RUN option.

use role accountadmin

#Creating a warehouse 
create warehouse if not exists test_warehouse warehouse_size ='SMALL' auto_suspend=300 initially_suspended=true;
USE WAREHOUSE TEST_WAREHOUSE
# Create a test database 
create database testdb;
#Check if the database is created successfully
show databases like 'test%';
USE DATABASE testdb 
# Create a test schema 
create schema testschema;
#Check if the schema is created successfully
show schemas;
USE SCHEMA testschema
#Creating a sample employee table with two columns empname,empid
create or replace table test_emptable (empid number,empname varchar);
#Inserting  sample data into an employee table
insert into test_emptable values(1,'Micheal');
insert into test_emptable values(2,'Nick');
insert into test_emptable values(3,'George');
#Check if the table is created successfully
show tables like 'test_emptable';
#Display the content of the table
select * from test_emptable

4a.Explanation of the code

In the above code, we have used the role of account admin which is a high-level role available in the snowflake that has access to all objects. 

Then I have created a warehouse called “test_warehouse”. Using this warehouse we created a database called “testdb”.In this DB a sample schema “testschema” is created. Then a sample employee table is created and I inserted data in it.

4b.Observation

Have you noticed that I have not given any commit command as we give explicitly in OracleDB? This is because the session has default parameters turned on. Here the parameter auto-commit is turned on. So that is the reason there is no need to give auto-commit explicitly again.

4c.Output 

snowflake author

Image source: Author

5.Summary

In this blog post, we have seen what Snowflake is, its architecture, creating a free trial account, and hands-on. Using the free trial version you can also explore most of the prominent features that snowflake provides.

Here is my Linkedin profile in case if you want to connect with me.

For any queries, you can reach me on mycontact.

Thank you for reading!

I hope this has given you an idea to get started with the Snowflake data platform. Please take the time to comment below and let me know your feedback.

The media shown in this article are not owned by Analytics Vidhya and are used at the Author’s discretion.

 

About the Author

Our Top Authors

  • Analytics Vidhya
  • Guest Blog
  • Tavish Srivastava
  • Aishwarya Singh
  • Aniruddha Bhandari
  • Abhishek Sharma
  • Aarshay Jain

Download Analytics Vidhya App for the Latest blog/Article

Leave a Reply Your email address will not be published. Required fields are marked *