«

»

Apr 13 2014

Tricky Base SAS interview questions : Part-II

tough questions

SAS is the largest market-share holder for advanced analytics.

If you are going to work in analytics industry, it is impossible to escape from the language SAS. Different softwares of SAS are used in the industry for data handling. Most common of them used in data handling are SAS Enterprise guide and Base SAS. Both the tools have a similar format and usage. The only difference being SAS Enterprise guide is the graphical user interface of Base SAS. SAS Enterprise guide is much easier to use and is modular in nature. Because of the modularity, SAS  Enterprise guide is being widely used in the industry. With time Base SAS is losing its importance in the industry, and SAS Enterprise guide is filling in for Base SAS. However, having used both, I now appreciate and prefer using Base SAS to handle data of billions of customers and trillions of transactions.

One of our previous articles (http://www.analyticsvidhya.com/blog/2013/11/4-sas-tricky-analytics-interview/) covers four tricky questions asked in SAS interviews. In this article, I will cover some tricky scenarios in which using base SAS will become far easier than using the SAS Enterprise guide. These questions are tougher and lengthier than those covered in the first part of this article series. These questions are asked widely in companies who have a broad base of analytics and deal with big data (Millions of customers, Billions of transactions, Trillions of dollar value transactions).

Base SAS vs. SAS Enterprise guide

Let’s first look into the pros of using both Base SAS and SAS Enterprise guide. This will make us appreciate those tricky scenarios, even more.

SAS Enterprise guide

1. EG is more common in firms with a smaller team of analytics. This is because programs on EG are far more understandable by a person who is new to the firm. Given the high attrition rate in analytics industry, it becomes very essential for such firm to hedge their risk by using the SAS Enterprise guide instead of Base SAS.

2. You develop a certain traditional routine much faster on EG than on Base SAS.

3. It is much easier to comprehend the flow using EG.

 Base SAS

1. It is much easier to code on base SAS in case the logic of code is very complex.

2. Base SAS is much faster as compare to EG.

3. It is much easier to modify a code on Base SAS than on EG.

4. Using Macros makes coding in Base SAS much easier than SAS EG

Background to case 1

You work in a retail industry. You have recently started a loyalty program for your customers. A study conducted on retail bank, says that the customers with a total purchase of $1,000 in 3rd month (T+ 2th months) are the customers who will finally purchase more than $30,000. You want to focus your loyalty campaign on these customers.

You have 2 data-sets. First has the entire list of customer IDs with their date of first purchase. Second data has the customer ID with their monthly purchases for each year-month. First purchase can possibly be a non-financial transaction which might not be a part of table 2.

You need to identify customers who make more than $1000 purchase in the 3rd month from the first purchase.

Table 1 :

tab1

Table 2 :

tab2

Solution to Case 1

This question is a classic case when Base SAS clearly beats SAS EG. In this section you will see a simple solution for this case study.

* Creating a macro for each month

%macro fetch_data (next_mon = , third_mon = );

data create_list;

set table_1;

if first_pur < next_mon;

run;

 

proc sort data =  create_list out=list; by customer_id; run;

 

proc sort data =  table_2 out=purchase; by customer_id; run;

 

data fetch_purchase;

merge list(in=a) purchase(in=b);

if yearmonth = third_mon;

by customer_id;

if a;

run;

 

proc datasets;

append base=final_dataset data =  fetch_purchase foce;

run;

%mend fetch_data;

 

%fetch_data (next_mon = ’01Feb2012′d  , third_mon = 201204);

%fetch_data (next_mon = ’01Feb2013′d  , third_mon = 201304);

%fetch_data (next_mon = ’01Mar2012′d  , third_mon = 201205);

%fetch_data (next_mon = ’01Mar2013′d  , third_mon = 201305);

 

*Identifying the customers with purchase above $1000 in 3rd month

data shortlisted;

set   final_dataset;

if sales ge 1000;

run;

Background to case 2

You work for a banking industry. You want to analyze the transaction dataset and want to find the median transaction amount for each customer. This is the amount over which we will want to pay to the customer for stretching. More the dollar value of transactions, the cheaper is the total cost of transactions.You need to make a list of all customer with their floored median transaction amount (if there are 5 transactions, we want the 2nd lowest transaction and not 3rd and if transactions are only 1 then remove the customer from the list).

The only dataset you have is unique on transaction ID. It also has the customer ID and amount of the transaction.

 Table 1

tab3

Solution to Case 2

The solution to this problem is tiresome on SAS EG because there is no median function on SQL routines after grouping data. SQL routines are the foundation of data handling in SAS EG.But this becomes quite easy on Base SAS. Let’s see how this can be done easily on Base SAS.

proc sql;

create table work.summarize as

select count(*) as trans_nos, customer_id

from work.table1

group by customer_id;

quit;

 

proc sort data = tables1; by customer_id;run;

 

proc sort data = summarize; by customer_id;run;

 

data add_total_trans;

merge table1 (in=a) summarize (in=b);

median_no = floor(trans_nos/2);

by customer_id;

drop trans_nos;

run;

 

proc sort data = add_total_trans; by customer_id amount;run;

 

data final_list;

set add_total_trans;

by customer_id amount;

if first.customer_id then n =1;

if n = median_no;

n + 1;

run;

The solution in base SAS for this question is not only effective but also time efficient.

End Notes

Both Base SAS and EG have their own pros and cons. The best recommended strategy is to use both. If you want to make a traditional query, use SAS EG to generate automated code. Now copy this code to make it macronized and generalized using Base SAS. The macro adds a new dimension to the codes which helps you generalize the code and avoid hard entered data.

Have you faced any other SAS problem in analytics interview? Are you facing any specific problem with SAS codes?  Do you think this provides a solution to any problem you face? Do you think there are other methods to solve the problems discussed in a more optimized way? Do let us know your thoughts in the comments below.

If you like what you just read & want to continue your analytics learning, subscribe to our emailsfollow us on twitter or like our facebook page.

9 comments

1 ping

Skip to comment form

  1. Şarkı Sözü

    Thank you very nice blog liked

  2. Swati

    This is great stuff. Thanks Tavish for putting this together. Waiting for few more articles from the same series.

  3. Wasim

    Very Impressive!!!

  4. Venkatesh. KR.

    Very informative and crystal clear.

  5. sowmya

    hi really nice..
    but please post more examples i.e real time scenarios regarding credit cards domain.

  6. Bhavani

    Thank you very much for sharing your experiences and your article also very understandable.

  7. Ravi

    Thank You so much….! Really nice…!!!!

  8. Ali

    Hi Tavish:

    I am a senior Statistician in my late 40′s. For last over 15 years I have been using SAS for time series forecasting purposes only i.e. Time Series Forecasting system using SAS/ETS.
    Now for my new job scenario I need to use SAS Base for data manipulations. Could you please tell me the most simplest and quickest way to put my hands on and learn SAS Base quickly. Plus for SAS Base certification what would be the most effective fastest approach.
    Your guidance will very helpful and will be much appreciated.
    thanks,

    1. Tavish Srivastava

      Ali,

      Given that you have such deep experience into SAS, it should not be a challenge to learn base SAS. The tricky part is that learning material on SAS in a structured format is rarely available for free. Alternatively, if you are open to spend money and take up some paid trainings, SAS institute and Jigsaw academy offer such courses.
      For Base SAS certification there won’t be any additional preperation required. If you are comfortable with base SAS fundamentals, certification should be cakewalk.
      Hope this helps.

      Tavish

  1. Some tricky SAS interview questions

    […] Tricky Base SAS interview questions : Part-II […]

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>