Tavish Srivastava — May 14, 2014

4 Tricky R Interview Questions
The analytics industry in India is currently dominated by SAS. But it would be too optimistic to expect this to remain the case in the years to come. R, on the other hand, is open source and can be used in any environment. SAS grows through the efforts of the people employed by SAS, whereas R grows through the efforts of anyone who works on the language, since anyone can contribute to it. Hence, I feel that every analyst should develop expertise in both languages.

There are some key differences between coding in R and coding in SAS. These differences make some R interview questions tricky, and handling them becomes overwhelming for some candidates. I strongly feel the need for a common thread that collects the tricky R questions asked in interviews. This article is a kick-start for such a thread. We have published a similar series of articles on SAS (Part 1 and Part 2). Please note that the content of this article is based on information I gathered from various R sources.


 Question 1 : Rotational multiplication 

You have two vectors defined as follows:

> a <- c(2,3,4) 
> b <- c(1,2)

What is the value of the vector d, which is defined as follows :

> d <- a*b

Answer : 2 , 6 , 4

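R performs vectorized operations. ‘a’ and ‘b’ are two vectors of different lengths. R multiplies the first element of a by the first element of b, then the second element of a by the second element of b, and so on. In this case, however, R hits the end of vector “b” after the second multiplication. When that happens, R starts again from the first element of the shorter vector (this is known as recycling) and continues until every element of the longer vector has been used, so the third product is 4 * 1 = 4. The result of the vectorized operation has the same length as the longer vector. Because the length of a (3) is not a multiple of the length of b (2), R also prints a warning that the longer object length is not a multiple of the shorter object length; the result is still computed.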
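
The recycling described above can be made explicit with a small sketch. The names d_explicit and b_recycled are not part of the original question; they are introduced here only for illustration, and suppressWarnings() is used just to keep the console quiet.

```r
# Same vectors as in the question.
a <- c(2, 3, 4)
b <- c(1, 2)

# R warns here because length(a) is not a multiple of length(b);
# suppressWarnings() only hides the message, the result is unchanged.
d <- suppressWarnings(a * b)

# Recycling made explicit (illustrative names): repeat b until it is
# as long as a, then multiply element by element.
b_recycled <- rep(b, length.out = length(a))
d_explicit <- a * b_recycled

print(b_recycled)   # [1] 1 2 1
print(d)            # [1] 2 6 4
print(d_explicit)   # [1] 2 6 4
```

Both routes give the same vector, which is one way to convince yourself that the third element comes from 4 * 1 and not from a missing value.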

 Question 2 : Scoping Rules 

You need to understand the following code and answer a question based on this understanding.

> y <- 3
> f <- function(x) {
+   y <- 2
+   y ^ 2 + g(x)
+ }
> g <- function(x) {
+   x * y
+ }

What is the value of f(6)?

Answer : 22

If you answered anything other than 22, you probably need to brush up on lexical scoping in R. The function f(x) returns the value y^2 + g(x). Inside f, y has been defined as 2, so y^2 is 4, and g(x) is called from inside this function with x passed as 6. Now comes the catch: what is the value of the free variable y inside g? Under dynamic scoping, the value would be looked up in the environment from which the function was called. Lexical scoping instead looks up the value in the environment where the function was defined. The function g(x) is defined in the global environment here, so y is taken to be 3. Therefore g(6) returns 6 * 3 = 18, and f(6) finally returns 4 + 18 = 22.
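
To see the difference between the two lookup rules side by side, here is a minimal sketch. The functions g_dynamic and f_dynamic are hypothetical additions, not part of the original question; they imitate dynamic scoping by reading y from the caller's frame, purely for contrast.

```r
y <- 3

f <- function(x) {
  y <- 2           # local to f; g cannot see this value
  y^2 + g(x)
}

g <- function(x) {
  x * y            # free variable: looked up where g was defined
}

# Hypothetical variant that imitates dynamic scoping by reading y
# from the frame of whichever function called it.
g_dynamic <- function(x) {
  caller <- parent.frame()
  x * get("y", envir = caller)
}

f_dynamic <- function(x) {
  y <- 2
  y^2 + g_dynamic(x)
}

lexical_result <- f(6)          # 4 + 6 * 3
dynamic_result <- f_dynamic(6)  # 4 + 6 * 2

print(lexical_result)   # [1] 22
print(dynamic_result)   # [1] 16
```

The value 16 is what a candidate gets when they wrongly assume that g picks up the y defined inside f.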

 Question 3 : Summarizing at each factor 

You have been assigned to check two race tracks. To complete this task, you are expected to find the mean of the total time taken by cars to cross each track. In the following data, “b” is the vector of total times taken by different cars and “a” is the vector of tracks on which those times were recorded. The first element of vector “b” corresponds to the first element of vector “a” (and so on).

> a <- c(1,1,1,1,2,2,2,2,2)
> b <- c(10,12,15,12,NA,30,42,38,40)

How do you find the mean time for each track using the split function?

Answer : Code is as follows 

> s <- split(b,a)
> lapply(s,mean)
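
As a rough illustration of what the two calls above do, the sketch below inspects the intermediate list. The variable name means is an assumption added for this example.

```r
a <- c(1, 1, 1, 1, 2, 2, 2, 2, 2)
b <- c(10, 12, 15, 12, NA, 30, 42, 38, 40)

# split() groups the values of b by the track labels in a and
# returns a named list with one numeric vector per track.
s <- split(b, a)
str(s)

# lapply() applies mean() to each list element and returns a list.
# The NA in track 2 makes that track's mean NA.
means <- lapply(s, mean)
print(means)
```

The list names ("1" and "2") come from the distinct values of a, which is why the output is labelled by track.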

 Question 4 : Treating missing values 

Following is the output of the last section :

$`1`
[1] 12.25

$`2`
[1] NA

How do you modify the code to handle the missing value in the second track's records?

Answer : The modified code is as follows :

> lapply(s,mean,na.rm=TRUE)
$`1`
[1] 12.25

$`2`
[1] 37.5
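
The same per-track means can also be obtained in more compact forms. The sketch below is one possible alternative rather than the required answer; the variable names are illustrative, and the tapply call mirrors the suggestion readers made in the comments.

```r
a <- c(1, 1, 1, 1, 2, 2, 2, 2, 2)
b <- c(10, 12, 15, 12, NA, 30, 42, 38, 40)

# sapply() simplifies the list returned by lapply() to a named vector.
# Extra arguments such as na.rm = TRUE are passed on to mean().
track_means <- sapply(split(b, a), mean, na.rm = TRUE)

# tapply() does the split and the apply in a single call.
track_means_tapply <- tapply(b, a, mean, na.rm = TRUE)

print(track_means)
print(track_means_tapply)
```

Both calls drop the NA before averaging, so track 2 is averaged over its four recorded times only.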

End Notes:

Coders are lazy, and the R language is built for coders. Code in R is much more compact than code in SAS, but that compactness makes it harder to remember all the syntax. You will probably need a lot of practice to get the hang of it (especially if you have been using SAS extensively). In one of our coming articles, we will compare coding in SAS and R. Have you faced any other R problems in an analytics interview? Are you facing any specific problem with R code? Do you think this article provides a solution to any problem you face? Do you think there are more optimized ways to solve the problems discussed? Do let us know your thoughts in the comments below.

 

If you like what you just read & want to continue your analytics learning, subscribe to our emailsfollow us on twitter or like our facebook page.

About the Author

Tavish Srivastava

Tavish Srivastava, co-founder and Chief Strategy Officer of Analytics Vidhya, is an IIT Madras graduate and a passionate data-science professional with 8+ years of diverse experience in markets including the US, India and Singapore, domains including Digital Acquisitions, Customer Servicing and Customer Management, and industry including Retail Banking, Credit Cards and Insurance. He is fascinated by the idea of artificial intelligence inspired by human intelligence and enjoys every discussion, theory or even movie related to this idea.


23 thoughts on "4 Tricky R interview questions"

Ashish Jain says: May 15, 2014 at 5:23 am
Hi Tavish, It's good you started this thread. My view on above R topics is to create separate section on R interviews categorized as: 1) R basic interview question - To test how strongly a person knows R. 2) R intermediate level questions- To test multiple ways(alternatives) of achieving same results on R and which is more efficient way of analysis. Advance level 3) How to handle or process millions of rows(Big Data) in R. what are the challenges and how to cope up with it. 4) How to develop R based predictive applications using Shiny package or integrate R predictive model results with other application 5) R statistical models interpretation -interview questions. like clustering in R, record linkage in R, multinominal Regression in R,Random forest and SVM etc. Let me know if we can collaborate to learn and write more stuffs, share thoughts and approach to analytics solution in R and SAS.
Subhajit says: May 15, 2014 at 7:01 am
This discussion is going to be really helpful for people who want to be an analyst...Thank you for the valuable inputs.
Abhinav says: May 20, 2014 at 6:42 pm
All, do let me know if there is any analytics assessment tests available in the market which can help the organization in taking hiring decision. thanks,
Kunal Jain says: May 22, 2014 at 3:34 am
Abhinav, Nice idea! I am not aware of any standard tests available yet. However you can get a few tests on tools, if you search. Regards, Kunal
neha says: June 12, 2014 at 11:33 pm
hi Tavish, I am a fresher in SAS programming , I am in USA , i just took some training is SAS programming being a one with no IT background....i need more help in using SAS software , do you provide any training or are you willing to provide , i can definitely pay you for that, it will be a big help . Thanks Neha
Deepak says: June 14, 2014 at 8:12 am
Will you tell me that, which model we use to rate banks for corporates and retail clients in R.
Kunal Jain says: June 15, 2014 at 7:19 pm
Neha, Thanks for your query. Currently, we do not offer any training on SAS. If you can tell what kind of time and resources can you spend in learning SAS, we might be able to help you with a few recommendations. Regards, Kunal
neha says: June 16, 2014 at 7:04 pm
thanks for replying Kunal, Since i am in USA i can spend 1 to 2 hours in the evening thats morning for india and we can talk about charges , you can email me at [email protected]
Kunal Jain says: June 23, 2014 at 11:31 pm
Neha, As mentioned, we do not run a training on SAS ourselves. However, I'll put you in touch with a few good courses. Regards, Kunal
rashmi pareek says: July 07, 2014 at 5:27 pm
int a=0.7; if(a>0.7) { printf("hello"); } else { printf("bye"); } and same do with 0.8 my question is what will be the output ??
Mahendra says: July 12, 2014 at 10:57 am
great start, good for a new R user what i feel you need to have a cool understanding of stats in order to use R in an efficient manner ,for ex running a deep learning algorithm like neural awareness of decay factor value with other parameters is mandatory before one can apply else go for spss / e miner ,all the basic preprocessing of data can be handled by linux power tools (fast) will try to share one of the uncommon model for insurance when nothing worked ie Random/logit/Gradient boost hope i m inline :)
Aakash says: July 18, 2014 at 9:06 pm
Well, there is an informs data scientist certification. You can try that.
rafi khan says: July 24, 2014 at 7:07 am
Kunal you are doing a good job . it really helpful for me to enhance my skills in program language. Thanks REGARDS RAFI KHAN
Kunal Jain says: July 24, 2014 at 7:08 pm
Thanks Rafi
sanjeeb says: July 29, 2014 at 11:54 am
Hey All, Thanks a lot for these questions.. I am new to R programming and I have a POC to use R,PMML and I have to use data from Hadoop. Can you guys help me out.
Anoop says: September 11, 2014 at 10:55 am
To the best of my knowledge:- output would be "bye".
naidu says: November 19, 2014 at 6:29 am
hi ashish i,m looking forward to learn r programming i would like to learn it can u pls help me out
Varun says: April 29, 2015 at 12:51 pm
Some more questions in R(have encountered them in interviews): 1. How to do different joins(left, right, inner) using R 2. Do column order, number and names matter if we want to apply cbind and rbind in r
Vikas says: May 31, 2015 at 2:05 pm
Kunal/Guys, Please post more R interview questions of all level. I/We need it desperately, as I have to attend more R interviews. Please do the needful.
kalyan says: May 31, 2015 at 3:33 pm
why cant we use tapply for question 3. why should we use split and lapply?
Chirayu says: September 25, 2015 at 5:52 am
in 3rd quest you can use tapply(b,a,mean,na.rm=T)
priya says: October 12, 2015 at 7:22 am
hi , i am new to R. i have no knowledge on statistics. will this hinder my progress??
Mona says: November 01, 2015 at 5:46 pm
Hi Kunal, Can you please a few good SAS / Analytics training institutes in Delhi / NCR. I want to make a mid-carrer shift into analytics domain. I just want liitle help regarding the correct software / course to opt for the same. Thanks Mona
