Python Coding Interview Questions on DataFrame and Zip()

Saumyab271 31 Aug, 2022 • 5 min read

This article was published as a part of the Data Science Blogathon.

Introduction

Python is an interpreted programming language that is widely used to build Machine Learning and Deep Learning models. Data science aspirants who want to work in the field of artificial intelligence need a good understanding of Python.

So, if you are a fresher looking for a data science role, you should be well prepared to answer Python-based interview questions. This article covers coding questions on two topics that are frequently asked in interviews: the zip() function and the DataFrame.

So let’s get started!

Python Coding Questions

Question 1: Given two lists, generate a list of pairs (one element from each list).

Suppose two lists are:

list1 = [1, 2, 3, 4]

list2 = [1, 4, 9, 16]

The below code can be used to create pairs using the zip() function.

zipped_list = zip(list1, list2)
print(zipped_list)
print(list(zipped_list))

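The zip() function takes elements from each list one by one and pairs them up, i.e., ‘1’ from list1 is paired with ‘1’ from list2, and so on. In Python 3, zip() returns a lazy iterator, so print(zipped_list) shows only a zip object; wrapping it in list() produces the actual pairs.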

Output:

<zip object at 0x...>   (the memory address varies from run to run)
[(1, 1), (2, 4), (3, 9), (4, 16)]

Question 2: Given two lists,

list1 = ["Ram", "Shyam", "Mohan"]

list2 = [24, 56, 34]

print each index together with the corresponding elements of both lists. The output should be:

0 Ram 24
1 Shyam 56
2 Mohan 34

According to the given input and output, the task is to print the two lists row by row, in a layout that resembles a dataframe built from list1 and list2.

The above output can be achieved using the below code:

list1 = ["Ram", "Shyam", "Mohan"]
list2 = [24, 56, 34]
# Use separate loop variable names so list1 and list2 are not overwritten.
for index, (name, age) in enumerate(zip(list1, list2)):
    print(index, name, age)

The zip() function makes pairs, taking elements index-wise from each list, and enumerate() adds a running index that starts at 0. Thus, the above code prints the index followed by the paired elements from each list.

Output:

0 Ram 24
1 Shyam 56
2 Mohan 34
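
If an actual DataFrame is wanted rather than printed rows, the same zip() pairing can feed pandas directly. This is a minimal sketch, assuming pandas is installed; the column names Name and Age are illustrative and are not part of the question.

```python
import pandas as pd

list1 = ["Ram", "Shyam", "Mohan"]
list2 = [24, 56, 34]

# zip() pairs the lists element-wise; each pair becomes one row.
# "Name" and "Age" are assumed column names chosen for illustration.
df = pd.DataFrame(list(zip(list1, list2)), columns=["Name", "Age"])
print(df)
```

The default integer index (0, 1, 2) then plays the role of the counter that enumerate() supplied in the loop version.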

Question 3: Write a code snippet to design a Dictionary from two lists.

To design a dictionary, we need to take one element each from two lists and combine them iteratively as shown below:

list1 = [1, 2, 3, 4]
list2 = [1, 4, 9, 16]
# key comes from list1, value from list2; distinct names avoid shadowing the lists.
dict1 = {key: value for key, value in zip(list1, list2)}
print(dict1)

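The above code works at the element level: zip() pairs the first element of each list, and the comprehension stores the element from list1 as the key and the element from list2 as the value. It then moves to the second pair, and so on, until the lists are exhausted. The finished dictionary is printed once at the end.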

Output:

{1: 1, 2: 4, 3: 9, 4: 16}
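
When the keys and values need no transformation, the dict() constructor can consume the pairs from zip() directly. A short sketch of that equivalent form:

```python
list1 = [1, 2, 3, 4]
list2 = [1, 4, 9, 16]

# dict() accepts an iterable of (key, value) pairs, which is what zip() yields.
dict1 = dict(zip(list1, list2))
print(dict1)  # {1: 1, 2: 4, 3: 9, 4: 16}
```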

Question 4: Given a list of pairs, say, [(1, 1), (2, 4), (3, 9), (4, 16)], write a code snippet to split it into two sequences.

To achieve this, we need to unpack the pairs with the * operator and pass them to zip(), as shown below:

list_pair = [(1, 1), (2, 4), (3, 9), (4, 16)]
num, square = zip(*list_pair)
print(num)
print(square)

Output:

(1, 2, 3, 4) 
(1, 4, 9, 16)
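
Note that zip(*list_pair) yields tuples. If lists are required instead, each group can be converted, as in this small sketch:

```python
list_pair = [(1, 1), (2, 4), (3, 9), (4, 16)]

# The * operator unpacks the pairs into separate arguments, so zip()
# regroups all first items together and all second items together.
num, square = (list(group) for group in zip(*list_pair))
print(num)     # [1, 2, 3, 4]
print(square)  # [1, 4, 9, 16]
```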

Question 5: Explain zip() function and its functionality.

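zip() is a built-in Python function that combines two or more iterables (lists, tuples, etc.) element by element. It returns an iterator of tuples and stops as soon as the shortest input is exhausted. For instance,

data1 = ["Ram", "Shyam"]
data2 = [24, 56]
print(list(zip(data1, data2)))

Output:

[('Ram', 24), ('Shyam', 56)]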
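
A few behaviours of zip() tend to come up as follow-up questions: it is lazy, it can be consumed only once, it stops at the shortest input, and it accepts more than two iterables. The sketch below illustrates each point; data3 is a hypothetical third list added only for illustration.

```python
data1 = ["Ram", "Shyam", "Mohan"]
data2 = [24, 56]                   # deliberately shorter than data1
data3 = ["Delhi", "Pune", "Goa"]   # hypothetical third list for illustration

# zip() returns a lazy iterator; wrap it in list() to see the pairs.
pairs = zip(data1, data2)
first_pass = list(pairs)
second_pass = list(pairs)  # the iterator is already exhausted

print(first_pass)   # [('Ram', 24), ('Shyam', 56)]  (stops at the shortest input)
print(second_pass)  # []

# zip() accepts any number of iterables, not just two.
triples = list(zip(data1, data2, data3))
print(triples)      # [('Ram', 24, 'Delhi'), ('Shyam', 56, 'Pune')]
```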

Question 6: What is DataFrame?

A DataFrame is a two-dimensional, labeled data structure from the pandas library, consisting of rows, columns, and cells.

For instance:

S.No.   Name    Age
1       Ram     26
2       Shyam   28
3       Neha    36

Here, the first row shows that Ram is 26 years old, and so on.

Question 7: Discuss different ways of creating DataFrame.

There are different ways of creating DataFrame:

1. By using a single list

import pandas as pd
data = ['Mumbai', 'Delhi', 'Pune']
df = pd.DataFrame(data)
print(df)

Output:

        0 
0  Mumbai 
1   Delhi 
2    Pune

2. By using a list of lists

import pandas as pd
data = [['Mumbai', 6500], ['Delhi', 7000], ['Pune', 4000]]
df = pd.DataFrame(data, columns=['City', 'Distance'])
print(df)

Output:

     City  Distance
0  Mumbai      6500
1   Delhi      7000
2    Pune      4000

3. Create an empty DataFrame

df1 = pd.DataFrame(columns=['A', 'B', 'C'])

This creates an empty dataframe with three columns A, B, and C.

Question 8: Write a code snippet to create a dataframe from the dictionary.

The dataframe can be created using the below code:

import pandas as pd
dict1 = {'Roll no': [1001, 1002, 1003], 'Name': ['Geeta', 'Sita', 'Anjali']}
df = pd.DataFrame(dict1)
print(df)

The dictionary consists of key-value pairs. Each key becomes a column name, and the list stored under that key supplies the cell values of that column.

Output:

   Roll no    Name
0     1001   Geeta
1     1002    Sita
2     1003  Anjali

Question 9: How will you rename a column in DataFrame?

We can rename a column in a dataframe by using rename() function as shown below:

import pandas as pd
dict1 = {'Roll no': [1001, 1002, 1003], 'Name': ['Geeta', 'Sita', 'Anjali']}
df = pd.DataFrame(dict1)
print(df)
df.rename(columns={'Roll no': 'S.no.'}, inplace=True)
print(df)

Output:

   Roll no    Name
0     1001   Geeta
1     1002    Sita
2     1003  Anjali
   S.no.    Name
0   1001   Geeta
1   1002    Sita
2   1003  Anjali

We have renamed “Roll no” to “S.no.”

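Changing all column names

All column names can be replaced at once by assigning a list of new names to df.columns. The list must contain exactly one name per column.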

df.columns = ['Roll no', 'First Name']
print(df)

Output:

   Roll no  First Name
0     1001       Geeta
1     1002        Sita
2     1003      Anjali

Question 10: What is the difference between loc and iloc?

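loc: It is label-based, i.e., row and column labels have to be provided to access an element of a dataframe.

iloc: It is integer location-based, i.e., the row position and column position (both counted from 0) have to be provided to access an element of a dataframe.

For example, the given dataframe is:

     Roll no   Name    Course
1    101       Ram     Grad
2    102       Gorav   Postgrad
3    103       Sita    Grad

df.loc[1, 'Name'] will give you the output "Ram", because 1 is the index label of the first row (assuming the index labels 1, 2, 3 are integers; with string labels it would be df.loc['1', 'Name']).

df.iloc[0, 1] will give you the output "Ram", because the first row is at position 0 and the Name column is at position 1.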
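
The difference can be checked with a small runnable sketch. It rebuilds the table above and assumes that the index labels 1, 2, 3 are integers, which the original table does not state explicitly.

```python
import pandas as pd

# Rebuild the example table; integer index labels 1, 2, 3 are an assumption.
df = pd.DataFrame(
    {
        "Roll no": [101, 102, 103],
        "Name": ["Ram", "Gorav", "Sita"],
        "Course": ["Grad", "Postgrad", "Grad"],
    },
    index=[1, 2, 3],
)

by_label = df.loc[1, "Name"]   # row label 1, column label "Name"
by_position = df.iloc[0, 1]    # first row, second column

print(by_label)     # Ram
print(by_position)  # Ram

# Slices differ too: loc includes the end label, iloc excludes the end position.
print(len(df.loc[1:2]))   # 2
print(len(df.iloc[0:1]))  # 1
```

The slicing difference at the end is a common follow-up question: a loc slice is inclusive of its end label, while an iloc slice follows normal Python slicing.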

Question 11: How will you delete a row or column from the DataFrame?

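A row or column of a DataFrame can be deleted by using the drop() function. If the name of the dataframe is df, a row can be deleted by passing its index label:

df.drop(['row_name'])

Similarly, a column can be deleted by also specifying axis=1 (or, equivalently, by using the columns keyword); without it, drop() looks for the name among the row labels and raises a KeyError:

df.drop(['column_name'], axis=1)

Note that drop() returns a new dataframe and leaves df unchanged unless inplace=True is passed.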
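
A minimal sketch of both deletions on a small dataframe follows. The row labels a, b, c and the data are hypothetical, chosen only for illustration; the index and columns keywords are used because they make the intended axis explicit.

```python
import pandas as pd

df = pd.DataFrame(
    {"Roll no": [101, 102, 103], "Name": ["Ram", "Gorav", "Sita"]},
    index=["a", "b", "c"],  # hypothetical row labels for illustration
)

# Drop a row by its index label (equivalent to df.drop(["b"]), axis=0).
without_row = df.drop(index=["b"])

# Drop a column by name (equivalent to df.drop(["Name"], axis=1)).
without_column = df.drop(columns=["Name"])

print(without_row)
print(without_column)

# drop() returns a new DataFrame, so df itself still has 3 rows and 2 columns.
print(df.shape)  # (3, 2)
```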

Question 12: How do we sort a DataFrame?

The dataframe can be sorted using the command:

df.sort_values(by=['Name'])

The above command sorts the dataframe by the Name column in ascending order. To sort in descending order, pass ascending=False:

df.sort_values(by=['Name'], ascending=False)
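
The following sketch runs both sort directions on hypothetical data. It also shows that sort_values() returns a new dataframe whose rows keep their original index labels.

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({"Name": ["Sita", "Gorav", "Ram"], "Age": [36, 28, 26]})

ascending = df.sort_values(by=["Name"])
descending = df.sort_values(by=["Name"], ascending=False)

print(ascending["Name"].tolist())   # ['Gorav', 'Ram', 'Sita']
print(descending["Name"].tolist())  # ['Sita', 'Ram', 'Gorav']

# The sorted rows keep their original index labels; reset_index(drop=True)
# renumbers them from 0 if that is preferred.
print(ascending.index.tolist())                         # [1, 2, 0]
print(ascending.reset_index(drop=True).index.tolist())  # [0, 1, 2]
```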

Question 13: Write a code snippet to add a new column in DataFrame.

A new column can be added using the below code:

import pandas as pd
data = {'Roll no': [101, 102, 103],
        'Name': ['Gorav', 'Riddhi', 'Shyama']}
df = pd.DataFrame(data)
age = [24, 25, 26]
df['age'] = age
print(df)

To add a new column, one needs the column name and its values; a list of values must have the same length as the dataframe. The column is then added by assigning the values to df[‘column name’].

Output:

   Roll no    Name  age 
0      101   Gorav   24 
1      102  Riddhi   25 
2      103  Shyama   26
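
Besides direct assignment, pandas offers assign() and insert() for adding columns. The sketch below is illustrative only; the Course column and its values are hypothetical and do not come from the example above.

```python
import pandas as pd

df = pd.DataFrame({"Roll no": [101, 102, 103],
                   "Name": ["Gorav", "Riddhi", "Shyama"]})

# assign() returns a new DataFrame and leaves df unchanged.
with_age = df.assign(age=[24, 25, 26])

# insert() modifies df in place and puts the column at a given position.
# "Course" and its values are hypothetical, for illustration only.
df.insert(1, "Course", ["Grad", "Postgrad", "Grad"])

print(with_age.columns.tolist())  # ['Roll no', 'Name', 'age']
print(df.columns.tolist())        # ['Roll no', 'Course', 'Name']
```

assign() is convenient in method chains because it does not mutate the original, while insert() is useful when the position of the new column matters.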

Helpful Tips

  • DataFrame and zip() are two topics that come up frequently in data science interviews.
  • One should be aware of the basics of dataframes before going for advanced topics.
  • Creating, updating, and manipulating dataframes are important topics. Therefore, any data science aspirant should have a good grasp of them.
  • Practice the basic questions on zip() before going for tough questions.

Conclusion

This article covered coding questions on the zip() function and the dataframe.

The key takeaways are:

  • zip() combines two or more iterables element-wise and is frequently used to build dictionaries.
  • A DataFrame is a two-dimensional data structure consisting of rows, columns, and cells.
  • The difference between loc and iloc is a must-know: loc accesses an element by row and column labels, whereas iloc accesses it by row and column positions.

If you get a good grasp of these questions, you can also answer other Python coding questions that rely on either of these two features.

The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion. 
