SAS Base (59)

Given the contents of the raw data file TYPECOLOR.DAT:

----+----10---+----20---+----30
daisyyellow

The following SAS program is submitted:

data FLOWERS;
  infile 'TYPECOLOR.DAT' truncover;
  length
    Type $ 5
    Color $ 11;
  input
    Type $
    Color $;
run;

What are the values of the variables Type and Color?

A. Type=daisy, Color=yellow
B. Type=daisy, Color=w
C. Type=daisy, Color=daisyyellow
D. Type=daisy, Color=

Answer: D

Explanation: The code specifies neither the column position of each variable nor a delimiter, so SAS uses list input: it reads from left to right and treats a blank as the default delimiter. The record contains no blank, so SAS reads the whole string 'daisyyellow' as the first value and assigns it to Type. Because Type has a length of 5, it can hold at most 5 characters, and the value is truncated to 'daisy'. SAS then tries to read a value for Color, but nothing is left in the record. With TRUNCOVER in effect, SAS sets Color to missing rather than moving on to another record.

One proper way to read the file is to tell SAS the column position of each variable:

data FLOWERS;
  infile 'TYPECOLOR.DAT';
  length
    Type $ 5
    Color $ 11;
  input
    Type $ 1-5
    Color $ 6-11;
run;

Obs Type Color
1 daisy yellow

If you are not sure of the length of the second variable, the TRUNCOVER option is handy. It lets SAS read variable-length records when some records are shorter than the INPUT statement expects.

The following program produces the identical result, except that the value of Color may occupy anywhere from 0 to 15 columns of the record (columns 6-20). Because Color is declared with a length of 11, at most 11 characters are stored.

data FLOWERS;
  infile 'TYPECOLOR.DAT' truncover;
  length
    Type $ 5
    Color $ 11;
  input
    Type $ 1-5
    Color $ 6-20;
run;
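
The behaviour in the original question can be reproduced without an external file. The sketch below is an illustration only: it assumes the record is supplied in-stream with DATALINES instead of being read from TYPECOLOR.DAT, and the PROC PRINT step is added just to display the result.

```sas
/* Sketch: list input with no blank in the record.              */
/* Assumption: in-stream data replaces the file TYPECOLOR.DAT.  */
data FLOWERS;
  infile datalines truncover;
  length
    Type $ 5
    Color $ 11;
  /* List input: no column positions, blank is the delimiter */
  input
    Type $
    Color $;
  datalines;
daisyyellow
;

/* Type holds the first 5 characters; Color is missing */
proc print data=FLOWERS;
run;
```

Replacing the INPUT statement with the column-input version (Type $ 1-5 Color $ 6-20) in the same step should give Type=daisy and Color=yellow.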

SAS Base (58)

The following program is submitted:

proc format;
  value salfmt. 0 -< 50000 = 'Less than 50K' 50000 - high = '50K or Greater';

options fmterr nodate pageno=1;
title 'Employee Report';

proc print data=work.employees noobs;
  var fullname salary hiredate;
  format salary salfmt. hiredate date9.;
  label fullname='Name of Employee' salary='Annual Salary' hiredate='Date of Hire';
run;

Why does the program fail?

A. The PAGENO option is invalid in the OPTIONS statement.
B. The RUN statement is missing after the FORMAT procedure.
C. The format name contains a period in the VALUE statement.
D. The LABEL option is missing from the PROC PRINT statement.

Answer: C

Explanation: In the VALUE statement of PROC FORMAT, a format name cannot end in a number and must not be followed by a period. The name 'salfmt.' in the VALUE statement is therefore invalid. When you later refer to the custom format, for example in a FORMAT statement, the trailing period is required.
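
A corrected version of the program might look like the sketch below. The WORK.EMPLOYEES rows are hypothetical sample data, not part of the original question, and the LABEL option on PROC PRINT is an addition so that the labels are actually displayed (its absence is not what makes the original program fail).

```sas
/* Define the format: no period after the name in the VALUE statement */
proc format;
  value salfmt
    0 -< 50000   = 'Less than 50K'
    50000 - high = '50K or Greater';
run;

/* Hypothetical sample data for illustration only */
data work.employees;
  input fullname :$20. salary hiredate :date9.;
  datalines;
Smith 42000 15MAR2005
Jones 61000 01JUL1999
;

options fmterr nodate pageno=1;
title 'Employee Report';

/* Refer to the format WITH a trailing period */
proc print data=work.employees noobs label;
  var fullname salary hiredate;
  format salary salfmt. hiredate date9.;
  label fullname='Name of Employee'
        salary='Annual Salary'
        hiredate='Date of Hire';
run;
```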

SAS Base (57)

Given the SAS data set WORK.ONE:

N BeginDate
1 09JAN2010
2 12JAN2010

The following SAS program is submitted:

data WORK.TWO;
set WORK.ONE;
Day=<_insert_code_>;
format BeginDate date9.;
run;

The data set WORK.TWO is created, where Day would be 1 for Sunday, 2 for Monday, 3 for Tuesday, … :

N BeginDate Day
1 09JAN2010 7
2 12JAN2010 3

Which expression successfully completes the program and creates the variable Day?

A. day(BeginDate)
B. weekday(BeginDate)
C. dayofweek(BeginDate)
D. getday(BeginDate,today())

Answer: B

Explanation: The WEEKDAY function takes a SAS date value and returns an integer that corresponds to the day of the week, where 1=Sunday, 2=Monday, …, 7=Saturday. The DAY function returns the day of the month. SAS has no function called DAYOFWEEK or GETDAY.
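
The difference between WEEKDAY and DAY can be checked with a short sketch. The INPUT step that builds WORK.ONE and the extra variable DayOfMonth are additions for illustration; they are not part of the original question.

```sas
/* Rebuild WORK.ONE from the values shown in the question */
data WORK.ONE;
  input N BeginDate :date9.;
  datalines;
1 09JAN2010
2 12JAN2010
;

data WORK.TWO;
  set WORK.ONE;
  Day=weekday(BeginDate);      /* 1=Sunday ... 7=Saturday */
  DayOfMonth=day(BeginDate);   /* day of the month, for comparison */
  format BeginDate date9.;
run;

proc print data=WORK.TWO noobs;
run;
```

09JAN2010 was a Saturday and 12JAN2010 a Tuesday, so Day is 7 and 3, while DayOfMonth is 9 and 12.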

SAS Base (56)

The following output is created by the FREQUENCY procedure:

Table of Region by Product

Cell contents, top to bottom: Frequency, Percent, Row Pct, Col Pct

Region    Corn     Cotton   Oranges  Total
East      2        1        1        4
          22.22    11.11    11.11    44.44
          50.00    25.00    25.00
          50.00    33.33    50.00

South     2        2        1        5
          22.22    22.22    11.11    55.56
          40.00    40.00    20.00
          50.00    66.67    50.00

Total     4        3        2        9
          44.44    33.33    22.22    100.00

Which TABLES statement was used to complete the following program that produced the output?

proc freq data=sales;
<_insert_code_>
run;

A. tables region product;
B. tables region,product;
C. tables region/product;
D. tables region*product;

Answer: D

Explanation: A comma is not valid in the TABLES statement, and a slash is reserved for introducing options (for example, tables region / nocum;), so it cannot be used to join two variables. An asterisk requests a crosstabulation: placing it between two variables makes SAS create a two-way table. In option A, the space between the two variables makes SAS create two one-way frequency tables:

Region   Frequency  Percent  Cumulative Frequency  Cumulative Percent
East     4          44.44    4                     44.44
South    5          55.56    9                     100.00

Product  Frequency  Percent  Cumulative Frequency  Cumulative Percent
Corn     4          44.44    4                     44.44
Cotton   3          33.33    7                     77.78
Oranges  2          22.22    9                     100.00
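
Both forms of the TABLES statement can be compared side by side. The sketch below assumes a hypothetical SALES data set stored as one row per cell with a Count variable (used through the WEIGHT statement); the counts are chosen to match the frequencies in the output above, and the real data set may well be laid out differently.

```sas
/* Hypothetical cell counts matching the frequencies in the question */
data sales;
  input Region $ Product $ Count;
  datalines;
East Corn 2
East Cotton 1
East Oranges 1
South Corn 2
South Cotton 2
South Oranges 1
;

proc freq data=sales;
  weight Count;            /* each row stands for Count observations */
  tables region*product;   /* option D: one two-way crosstabulation */
  tables region product;   /* option A: two one-way tables */
run;
```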

SAS Base (55)

The following SAS program is submitted:

data WORK.DATE_INFO;
X="01Jan1960"D ;
run;

Variable X contains what value?

A. the numeric value 0
B. the character value “01Jan1960”
C. the date value 01011960
D. the code contains a syntax error and does not execute.

Answer: A

Explanation: The letter D after a quoted date in DDMMMYY or DDMMMYYYY form makes it a date literal, which SAS converts to a SAS date value. SAS date values count days from 01JAN1960, so this date is stored as 0. Check SAS Base (16) for details.

This question has another version in which a space is placed between the closing quotation mark and the letter D. That causes a syntax error, and the answer would be D.
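
A quick way to see the stored value is to write it to the log. This is a minimal sketch; the second variable Y is an addition for comparison and is not part of the question.

```sas
data _null_;
  X='01Jan1960'd;   /* date literal: day 0 of the SAS calendar */
  Y='02Jan1960'd;   /* one day later */
  put X= Y=;        /* unformatted numeric values */
  put X= date9.;    /* the same value displayed as a date */
run;
```

The log should show X=0 Y=1, followed by X=01JAN1960.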

SAS Base (54)

Consider the following data step:

data WORK.TEST;
set SASHELP.CLASS(obs=5);
retain City 'Beverly Hills';
State='California';
run;

The computed variables City and State have their values assigned using two different methods, a RETAIN statement and an Assignment statement. Which statement regarding this program is true?

A. The RETAIN statement is fine, but the value of City will be truncated to 8 bytes as the LENGTH statement has been omitted.
B. Both the RETAIN and assignment statement are being used to initialize new variables and are equally efficient. Method used is a matter of programmer preference.
C. The assignment statement is fine, but the value of City will be truncated to 8 bytes as the LENGTH statement has been omitted.
D. City’s value will be assigned one time, State’s value 5 times.

Answer: D

Explanation: Both the RETAIN statement and an assignment (=) statement can give a variable a value, but they differ in two ways.

1. The RETAIN statement sets the initial value at compile time, so the value is assigned once and only once. An assignment statement assigns the value again in every iteration.
2. SAS automatically sets variables that are assigned values by an assignment statement to missing before each iteration of the DATA step. A variable named in a RETAIN statement, by contrast, keeps its value from one iteration to the next.

Because this program reads the first 5 observations of the CLASS data set (5 iterations), the value of State is assigned 5 times, and the variable is reset to missing each time SAS begins a new iteration. City is initialized to 'Beverly Hills' before the DATA step executes and is never reset to missing.
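
The reset-to-missing behaviour is easier to see with numeric variables. In the sketch below, Counter and Plain are hypothetical variables introduced for illustration; they are not part of the original question.

```sas
data WORK.DEMO;
  set SASHELP.CLASS(obs=5);
  retain Counter 0;          /* initialized once, value carried forward */
  Counter = Counter + 1;     /* 1, 2, 3, 4, 5 */
  Plain = sum(Plain, 1);     /* Plain is reset to missing every iteration, */
                             /* so sum(., 1) is 1 in every observation     */
run;

proc print data=WORK.DEMO;
  var Name Counter Plain;
run;
```

Counter increases from 1 to 5 because it is retained, while Plain stays at 1 because it is not.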

SAS Base (53)

The following SAS program is submitted:

data WORK.TOTAL_SALARY;
retain Total;
set WORK.SALARY;
by Department;
if First.Department
then Total=0;
Total=sum(Total, Wagerate);
if Last.Total;
run;

What is the initial value of the variable Total?

A. 0
B. Missing
C. The value of the first observation's Wagerate
D. Cannot be determined from the information given

Answer: B

Explanation: The second line of code, retain Total, initializes the variable Total. Because it does not specify an initial value, Total starts out as missing. See also SAS Base (32) for the details of the RETAIN statement.
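
The initial value can be confirmed by writing Total to the log before anything is assigned to it. The sketch below uses hypothetical WORK.SALARY rows, and it subsets on Last.Department rather than the Last.Total shown in the question; both choices are assumptions made only to get a small runnable example.

```sas
/* Hypothetical input, already sorted by Department */
data WORK.SALARY;
  input Department $ Wagerate;
  datalines;
A 10
A 20
B 5
;

data WORK.TOTAL_SALARY;
  retain Total;                       /* no initial value given */
  set WORK.SALARY;
  by Department;
  if _N_ = 1 then put 'Before any assignment: ' Total=;
  if First.Department then Total=0;
  Total=sum(Total, Wagerate);
  if Last.Department;                 /* assumption: subset on the BY variable */
run;
```

The PUT statement should write Total=. to the log, showing that Total is missing until the first assignment.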

SAS Base (52)

Given the SAS data set WORK.EMP_NAME:

Name EmpID
Jill 1864
Jack 2121
Joan 4698
John 5463

Given the SAS data set WORK.EMP_DEPT:

EmpID Department
2121 Accounti
3567 Finance
4698 Marketin
5463 Accounti

The following program is submitted:

data WORK.ALL;
merge WORK.EMP_NAME(in=Emp_N)
WORK.EMP_DEPT(in=Emp_D);
by Empid;
if (Emp_N and not Emp_D) or (Emp_D and not Emp_N);
run;

How many observations are in data set WORK.ALL after submitting the program?

A. 1
B. 2
C. 3
D. 5

Answer: B

Explanation: The new data set WORK.ALL contains only the observations that exist in EMP_NAME but not in EMP_DEPT, or the other way around. WORK.ALL is:

Obs  Name  EmpID  Department
1    Jill  1864
2          3567   Finance

For the details of the MERGE statement and the IN= option, please check SAS Base (28).
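
The result can be reproduced with the sketch below. The INPUT steps that build the two data sets are additions for illustration, and the Department values are typed exactly as they are printed in the question.

```sas
data WORK.EMP_NAME;
  input Name $ EmpID;
  datalines;
Jill 1864
Jack 2121
Joan 4698
John 5463
;

data WORK.EMP_DEPT;
  input EmpID Department $;
  datalines;
2121 Accounti
3567 Finance
4698 Marketin
5463 Accounti
;

/* Keep only the observations that come from exactly one data set */
data WORK.ALL;
  merge WORK.EMP_NAME(in=Emp_N)
        WORK.EMP_DEPT(in=Emp_D);
  by EmpID;
  if (Emp_N and not Emp_D) or (Emp_D and not Emp_N);
run;

proc print data=WORK.ALL;
run;
```

Because Emp_N and Emp_D are each 0 or 1, the subsetting IF could also be written as if Emp_N ne Emp_D;.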

SAS Base (51)

The following program is submitted:

proc contents data=_all_;
run;

Which statement best describes the output from the submitted program?

A. The output contains only a list of the SAS data sets that are contained in the WORK library.
B. The output displays only the contents of the SAS data sets that are contained in the WORK library.
C. The output displays only the variables in the SAS data sets that are contained in the WORK library.
D. The output contains a list of the SAS data sets that are contained in the WORK library and displays the contents of those data sets.

Answer: D

Explanation: The CONTENTS procedure shows the contents of a SAS data set and prints the directory of the SAS library. Because the libref is omitted, SAS uses the default library, WORK, and _ALL_ stands for all data sets in that library. SAS lists the data sets in the WORK library and then shows the contents of each one.
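
For comparison, the sketch below writes the libref explicitly and then adds the NODS option, which can be used with _ALL_ to print only the library directory. It is an illustration of the two outputs, not part of the original question.

```sas
/* Directory of WORK plus the contents of every data set in it */
proc contents data=WORK._all_;
run;

/* Directory only: NODS suppresses the per-data-set contents */
proc contents data=WORK._all_ nods;
run;
```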

SAS Base (50)

Given the SAS data set WORK.ONE:

Id Char1
111 A
158 B
329 C
644 D

and the SAS data set WORK.TWO:

Id Char2
111 E
538 F
644 G

The following program is submitted:

data WORK.BOTH;
set WORK.ONE WORK.TWO;
by Id;
run;

What is the first observation in SAS data set WORK.BOTH?

A.

Id   Char1  Char2
111  A

B.

Id   Char1  Char2
111         E

C.

Id   Char1  Char2
111  A      E

D.

Id   Char1  Char2
644  D      G
Answer: A

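Explanation: The SET statement combines multiple data sets but does not merge their observations. Here the new data set BOTH contains all the variables from ONE and TWO, and the BY statement interleaves the observations in ascending order of Id. The first observation in ONE therefore becomes the first observation in BOTH; because that observation has no Char2 variable, Char2 is shown as missing. The complete BOTH data set is:

Obs  Id   Char1  Char2
1    111  A
2    111         E
3    158  B
4    329  C
5    538         F
6    644  D
7    644         G

To merge observations, use the MERGE statement instead; see SAS Base (28) for details.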
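
The contrast between interleaving with SET and match-merging with MERGE can be sketched as follows. The INPUT steps and the data set name WORK.MERGED are additions for illustration; both inputs are already in ascending order of Id, as the BY statement requires.

```sas
data WORK.ONE;
  input Id Char1 $;
  datalines;
111 A
158 B
329 C
644 D
;

data WORK.TWO;
  input Id Char2 $;
  datalines;
111 E
538 F
644 G
;

/* Interleave: 7 observations, one per input row, ordered by Id */
data WORK.BOTH;
  set WORK.ONE WORK.TWO;
  by Id;
run;

/* Match-merge: 5 observations, one per distinct Id */
data WORK.MERGED;
  merge WORK.ONE WORK.TWO;
  by Id;
run;
```

In WORK.MERGED, the observations for Id 111 and 644 carry both Char1 and Char2, which is what option C describes.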