Data structures and algorithms are essential knowledge for every machine learning practitioner. They enable programmers to write efficient code, which is particularly valuable when working with large datasets. Aspiring candidates should have a solid understanding of these fundamentals, as data structure and algorithm questions are frequently posed in data science interviews. To help you prepare, here’s a curated list of 15 commonly asked data structure and coding questions.
Test your skills by attempting these questions and assessing your proficiency!
This article was published as a part of the Data Science Blogathon.
(a) It is a non-linear data structure
(b) In a tree data structure, a node can have any number of child nodes
(c) There is one and only one possible path between every pair of vertices in a tree
(d) Any connected graph having n vertices and n edges is considered as a tree
Answer: [ a, b, c ]
Explanation: A graph is a tree if and only if it is minimally connected, which means a connected graph with n vertices is a tree exactly when it has (n-1) edges. A connected graph with n vertices and n edges contains a cycle, so (d) is false.
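The rule above (connected, with exactly n-1 edges) can be checked in a few lines. The following is a minimal sketch, not a reference implementation: the function name `is_tree`, the 0-based vertex labels, and the choice to treat an empty graph as "not a tree" are all assumptions made for illustration.

```python
from collections import defaultdict, deque


def is_tree(n, edges):
    """Return True if the undirected graph on vertices 0..n-1 is a tree."""
    # A tree on n vertices has exactly n - 1 edges.
    # Assumption: an empty graph (n == 0) is not counted as a tree here.
    if n == 0 or len(edges) != n - 1:
        return False

    adjacency = defaultdict(list)
    for u, v in edges:
        adjacency[u].append(v)
        adjacency[v].append(u)

    # Breadth-first search from vertex 0 to test connectivity.
    seen = {0}
    queue = deque([0])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return len(seen) == n


print(is_tree(4, [(0, 1), (1, 2), (2, 3)]))          # True: connected, 3 edges
print(is_tree(4, [(0, 1), (1, 2), (2, 3), (3, 0)]))  # False: 4 edges, has a cycle
```

With the edge count fixed at n-1, connectivity alone is enough to rule out cycles, which is why no separate cycle check is needed.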
(a) The Inorder traversal of the given tree is B D A G E C H F I
(b) The Preorder traversal of the given tree is A B D C E G F H I
(c) The Postorder traversal of the given tree is D B G E H I F C A
(d) The breadth-first traversal of the given tree is A B C D E F G H I
Answer: [ a, b, c, d ]
Explanation:
(a) In a binary tree, each node must have 2 children
(b) In a binary tree, nodes are always arranged in a specific order
(c) It is a special type of tree data structure
(d) Number of nodes having zero children in any binary tree depends only on the number of nodes with 2 children
Answer: [ c, d ]
Explanation: In a binary tree, each node can have at most 2 children, and the nodes need not follow any particular order. The number of nodes with zero children (leaf nodes) in a binary tree = the number of nodes with 2 children + 1.
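To see the leaf-count relationship on a concrete case, here is a small sketch that counts both kinds of nodes in one pass. The `Node` class, the function name, and the example tree are hypothetical and chosen only for illustration.

```python
class Node:
    """Minimal binary tree node (illustrative, not from the article)."""

    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def count_leaves_and_full(node):
    """Return (leaf_count, two_child_count) for the subtree rooted at node."""
    if node is None:
        return 0, 0
    left_leaves, left_full = count_leaves_and_full(node.left)
    right_leaves, right_full = count_leaves_and_full(node.right)
    is_leaf = node.left is None and node.right is None
    is_full = node.left is not None and node.right is not None
    # Booleans add as 0 or 1, so the sums stay integers.
    return (left_leaves + right_leaves + is_leaf,
            left_full + right_full + is_full)


# Hypothetical tree: A and C have two children, B has one, D/E/F are leaves.
tree = Node("A", Node("B", Node("D")), Node("C", Node("E"), Node("F")))
leaves, full = count_leaves_and_full(tree)
print(leaves, full)  # 3 2
```

Here the tree has 2 nodes with two children and 3 leaves, matching leaves = full + 1. Node B, which has a single child, does not affect the relationship.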
(b) Nodes are arranged in a specific order
(c) Only smaller values in its right subtree
(d) Only larger values in its left subtree
Answer: [ a, b ]
Explanation: In a binary search tree (BST), each node has only smaller values in its left subtree and only larger values in its right subtree.
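The ordering property can be illustrated with a short insertion routine. This is a minimal sketch under stated assumptions: the names `BSTNode`, `insert`, and `inorder` are hypothetical, the keys are arbitrary sample values, and duplicate keys are simply ignored.

```python
class BSTNode:
    """Minimal BST node (illustrative)."""

    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    """Insert key into the BST rooted at root and return the root."""
    if root is None:
        return BSTNode(key)
    if key < root.key:
        root.left = insert(root.left, key)    # smaller keys go left
    elif key > root.key:
        root.right = insert(root.right, key)  # larger keys go right
    # Assumption: duplicate keys are ignored.
    return root


def inorder(root):
    """Return keys in left-root-right order."""
    if root is None:
        return []
    return inorder(root.left) + [root.key] + inorder(root.right)


root = None
for key in [50, 30, 70, 20, 40, 60, 80]:
    root = insert(root, key)

print(inorder(root))  # [20, 30, 40, 50, 60, 70, 80]
```

Because every left subtree holds smaller keys and every right subtree holds larger ones, an inorder traversal of a BST visits the keys in sorted order.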
(b) AVL trees are also called self-balancing binary search trees
(c) In AVL trees, the height of the left subtree and right subtree of every node differs by at least one
(d) In AVL trees, the balancing factor of each node is either 0 or 1 or -1
Answer: [ a, b, d ]
Explanation: In AVL trees, the height of the left subtree and the right subtree of every node differs by at most one.
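The balance condition can be tested directly by computing subtree heights. The sketch below checks only the height condition (balance factor of -1, 0, or 1 at every node), not the BST ordering. The `Node` class and `height_and_balance` function are hypothetical names, and the height of an empty subtree is taken as -1 by assumption.

```python
class Node:
    """Minimal binary tree node (illustrative)."""

    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def height_and_balance(node):
    """Return (height, is_balanced) for the subtree rooted at node."""
    if node is None:
        return -1, True  # assumption: an empty subtree has height -1
    left_height, left_ok = height_and_balance(node.left)
    right_height, right_ok = height_and_balance(node.right)
    balance_factor = left_height - right_height
    balanced = left_ok and right_ok and balance_factor in (-1, 0, 1)
    return 1 + max(left_height, right_height), balanced


balanced_tree = Node(2, Node(1), Node(3))
skewed_tree = Node(1, None, Node(2, None, Node(3)))

print(height_and_balance(balanced_tree)[1])  # True
print(height_and_balance(skewed_tree)[1])    # False: root has balance factor -2
```

A real AVL tree restores this condition with rotations after each insertion or deletion; the sketch only detects a violation.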
(b) It follows the Last-In-First-Out(LIFO) principle
(c) Stack is a non-linear Data Structure
(d) The INSERT operation on the stack is often known as PUSH
Answer: [ a, b, d ]
Explanation: Stack is a linear Data Structure.
(b) 3
(c) 4
(d) 5
Answer: [ b ]
Explanation: The Binary Search Tree formed is shown below:
(b) 17
(c) 25
(d) 7
Answer: [ b ]
Explanation: Total Number of leaf nodes in a Binary Tree = Total Number of nodes having 2 children + 1
(b) Array can store the elements of different data type
(c) Array is a Linear Data Structure
(d) Accessing array elements takes constant time
Answer: [ a, c, d ]
Explanation: An array stores elements of a single data type, so option (b) is false.
(b) The connecting link between any two nodes in a tree is called an edge
(c) Nodes that belong to the same parent are called siblings
(d) Degree of a Tree is the total number of children of any node of a tree
Answer: [ b, c ]
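Hint: Self-explanatory (basics of tree terminology). Note that the degree of a tree is the maximum number of children of any node in the tree, not the total.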
push(5)
push(8)
pop
push(2)
push(5)
pop
pop
pop
push(1)
pop
(b) 8 2 5 5 1
(c) 8 1 2 5 5
(d) 8 5 2 5 1
Answer: [ d ]
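Explanation: A stack follows the Last-In-First-Out (LIFO) principle, so each pop returns the most recently pushed element that is still on the stack. The popped values, in order, are 8, 5, 2, 5, 1.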
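The operation sequence from the question can be replayed with a Python list used as a stack. This is a small sketch: the helper name `run_stack_ops` and the encoding of operations (an integer means push, the string "pop" means pop) are assumptions made for illustration.

```python
def run_stack_ops(ops):
    """Replay push/pop operations and return the popped values in order."""
    stack = []
    popped = []
    for op in ops:
        if op == "pop":
            popped.append(stack.pop())  # removes the most recent element
        else:
            stack.append(op)            # any other value is pushed
    return popped


# push(5), push(8), pop, push(2), push(5), pop, pop, pop, push(1), pop
ops = [5, 8, "pop", 2, 5, "pop", "pop", "pop", 1, "pop"]
print(run_stack_ops(ops))  # [8, 5, 2, 5, 1]
```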
(b) Number of nodes in the left subtree of the root = 5
(c) Number of nodes in the right subtree of the root = 2
(d) Node with label 20 has only 1 child
Answer: [ a, b, c ]
Explanation: The tree formed after inserting all the elements is shown below:
(b) f2, f1, f3, f4
(c) f1, f2, f3, f4
(d) f3, f2, f4, f1
Answer: [ d ]
Explanation: Comparison of various time complexities:
$O(1) < O(\log\log n) < O(\log n) < O(n^{1/2}) < O(n) < O(n\log n) < O(n^2) < O(n^3) < O(n^k) < O(2^n) < O(n^n)$
(b) 6
(c) 7
(d) 8
Answer: [ c ]
Hint: Use the recurrence N(h) = N(h-1) + N(h-2) + 1 with base cases N(0) = 1 and N(1) = 2, and compute N(3): N(2) = 2 + 1 + 1 = 4, so N(3) = 4 + 2 + 1 = 7.
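The recurrence in the hint is easy to evaluate iteratively. The sketch below uses a hypothetical function name, `min_avl_nodes`, and assumes the same base cases as the hint, N(0) = 1 and N(1) = 2.

```python
def min_avl_nodes(height):
    """Minimum number of nodes in an AVL tree of the given height.

    Uses N(h) = N(h-1) + N(h-2) + 1 with N(0) = 1 and N(1) = 2.
    """
    if height == 0:
        return 1
    previous, current = 1, 2  # N(0), N(1)
    for _ in range(2, height + 1):
        previous, current = current, current + previous + 1
    return current


print([min_avl_nodes(h) for h in range(5)])  # [1, 2, 4, 7, 12]
```

The value at index 3 is 7, which matches the answer.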
(b) Maximum number of nodes in a binary tree of height H = $2^{H+1} - 1$
(c) Maximum number of nodes at any level ‘L’ in a binary tree = $2^L$
(d) Maximum number of nodes at any level ‘L’ in a binary tree = $2^{L-1}$
Answer: [ a, b, c ]
Hint: Self-explanatory (take a small tree as an example and verify the options).
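Following the hint, the formulas can be checked numerically for small trees. This sketch assumes the root sits at level 0 and that height is measured in edges (a single node has height 0), which is the convention under which options (b) and (c) hold. The function names are hypothetical.

```python
def max_nodes_at_level(level):
    """Each level can hold twice as many nodes as the one above it."""
    return 2 ** level  # assumption: the root is at level 0


def max_nodes_for_height(height):
    """Sum the full levels 0..height of a perfect binary tree."""
    return sum(max_nodes_at_level(level) for level in range(height + 1))


for height in range(4):
    # The level-by-level sum agrees with the closed form 2^(H+1) - 1.
    print(height, max_nodes_for_height(height), 2 ** (height + 1) - 1)
```

For heights 0 to 3 both columns give 1, 3, 7, and 15.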
Data Science is your guide to the world of numbers and information. It’s like the superhero of data, using math, statistics, and computer science to make sense of the vast amounts of information we have today. Think of Data scientists as the detectives of the digital world. They dive into data, clean it up, and uncover hidden treasures that help businesses, researchers, and even your favorite apps make smarter decisions. So, when you hear about Data Science, think of it as the magic that turns data into valuable insights.
Data analytics and Data science are like two cousins in the data world. While both deal with data, Analytics examines past data to understand what happened. It’s like a detective looking at evidence to solve a crime. On the other hand, Data Science takes it a step further. It’s not just interested in what happened; it wants to predict the future. It’s more like a fortune-teller, using data to anticipate what might happen next.
Think of supervised learning as teaching a child with the answer key. You show the model examples of input and the correct output, and it learns to predict the output for new inputs. On the other hand, unsupervised learning is like giving a child a box of puzzles without a picture on the box. It figures out the patterns and groups the pieces independently, without any guidance.
What are some of the techniques used for sampling? What is the main advantage of sampling?
Sampling is like taking a bite from a large dish to understand its taste. There are various techniques like random, stratified, and cluster sampling. The main advantage is that it saves time and resources. You can get a good sense of the whole dish (or population) without eating the entire thing.
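Two of the techniques named above, simple random sampling and stratified sampling, can be sketched with Python's standard `random` module. The function names, the toy population of (group, id) pairs, and the fixed seed are all assumptions made for illustration, not a prescribed method.

```python
import random
from collections import defaultdict


def simple_random_sample(population, k, seed=0):
    """Draw k items uniformly at random, without replacement."""
    rng = random.Random(seed)
    return rng.sample(population, k)


def stratified_sample(population, key, fraction, seed=0):
    """Sample the same fraction from each stratum defined by key(item)."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for item in population:
        strata[key(item)].append(item)
    sample = []
    for members in strata.values():
        # Take at least one item so that small strata are still represented.
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical population: 80 items in group "A" and 20 in group "B".
population = [("A", i) for i in range(80)] + [("B", i) for i in range(20)]

random_sample = simple_random_sample(population, 10)
strat_sample = stratified_sample(population, key=lambda item: item[0], fraction=0.1)

print(len(random_sample))                                # 10
print(sum(1 for group, _ in strat_sample if group == "A"),
      sum(1 for group, _ in strat_sample if group == "B"))  # 8 2
```

The stratified sample keeps the 80/20 group proportions exactly, whereas a simple random sample only matches them on average.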
A Confusion Matrix is like a scorecard for a classification model’s performance. It tabulates predicted classes against actual classes, so you can see how many examples were classified correctly and how many were mistaken for another class. It’s called ‘confusion’ because it shows exactly where the model confuses one class with another.
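For a binary classifier, the scorecard has four cells: true positives (TP), false positives (FP), false negatives (FN), and true negatives (TN). The sketch below counts them from two label lists. The function name and the sample labels are made up for illustration, with 1 taken as the positive class by assumption.

```python
def confusion_matrix(y_true, y_pred):
    """Count TP, FP, FN, TN for binary labels (1 = positive, 0 = negative)."""
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for actual, predicted in zip(y_true, y_pred):
        if actual == 1 and predicted == 1:
            counts["TP"] += 1
        elif actual == 0 and predicted == 1:
            counts["FP"] += 1
        elif actual == 1 and predicted == 0:
            counts["FN"] += 1
        else:
            counts["TN"] += 1
    return counts


# Hypothetical labels for eight examples.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

matrix = confusion_matrix(y_true, y_pred)
accuracy = (matrix["TP"] + matrix["TN"]) / len(y_true)
print(matrix)    # {'TP': 3, 'FP': 1, 'FN': 1, 'TN': 3}
print(accuracy)  # 0.75
```

The off-diagonal cells (FP and FN) are the "mix-ups"; metrics such as precision and recall are computed from the same four counts.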
Logistic regression is like fitting an S-shaped curve to predict yes/no or 1/0 outcomes. It passes a weighted sum of one or more features through the logistic (sigmoid) function to get a probability between 0 and 1 for the binary outcome. Thresholding that probability (commonly at 0.5) amounts to drawing a line that best separates the two classes.
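The prediction step of logistic regression can be sketched in a few lines: a weighted sum of the features is passed through the sigmoid function to give a probability. The weights and bias below are hypothetical values chosen for illustration, not fitted from data, and the 0.5 threshold is a common default rather than a requirement.

```python
import math


def sigmoid(z):
    """Map any real number to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(features, weights, bias):
    """Probability of the positive class for one example."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


# Hypothetical parameters (not learned from any dataset).
weights, bias = [1.5, -0.5], -1.0

probability = predict_proba([2.0, 1.0], weights, bias)  # z = 1.5
label = 1 if probability >= 0.5 else 0                  # assumed threshold
print(round(probability, 3), label)  # 0.818 1
```

Training would adjust `weights` and `bias` to fit labelled examples; only the prediction step is shown here.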
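The p-value is like the evidence presented to a judge in a courtroom. The defendant is the null hypothesis, which is presumed true until the data say otherwise. The p-value is the probability of seeing data at least as extreme as yours if the null hypothesis were true. If the p-value is low (commonly below 0.05), the evidence against the null hypothesis is strong and you reject it. If it’s high, you don’t have enough evidence to reject it.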
Imagine you have a box of fruits and want to sort them into apples and oranges (Classification). But if you want to predict the weight of the fruits (Regression), you’d be better off using a scale instead of your eyes. Classification is for sorting, and regression is for measuring.
Overfitting is like wearing a suit that’s too tight; it fits perfectly but leaves no room for movement. It happens when a model learns the training data too well but can’t adapt to new data. Conversely, underfitting is like wearing a suit two sizes too big; it’s comfortable but looks sloppy. It occurs when a model is too simple and can’t capture the complexity of the data. The key is to find the sweet spot in between, just like wearing a suit that fits just right.
In the ever-evolving landscape of data science, the importance of robust coding skills cannot be overstated. This article has provided a comprehensive array of coding questions and answers designed to empower data scientists in the year 2023. By embracing these challenges, you’ve fortified your problem-solving abilities and expanded your knowledge in this dynamic field. As data science continues to shape our world, your proficiency in coding is a key asset. So, keep practicing, keep learning, and keep pushing the boundaries of what’s possible. With these skills, you’re well-equipped to excel in the exciting data science journey that lies ahead.