SAS Base (29)

The following SAS program is submitted:

data WORK.INFO;
infile 'DATAFILE.TXT';
input @1 Company $20. @25 State $2. @;
if State=' ' then input @30 Year;
else input @30 City Year;
input NumEmployees;
run;

How many raw data records are read during each iteration of the DATA step?

A. 1
B. 2
C. 3
D. 4

Answer: B

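Explanation: The question asks how many raw data records make up one observation. Three INPUT statements execute in each iteration of the DATA step (the first one, one of the two in the IF/ELSE, and the last one), so by default three records would be needed per observation. However, the first INPUT statement ends with a trailing @, which holds the current record so that the second INPUT statement reads from the same record instead of moving to the next one. Each observation therefore needs 2 records. See SAS Base (2) for a detailed explanation of @.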
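The held-record behavior can be tried with a small, self-contained sketch. The data set name, the Type variable, and the in-stream records below are made up for illustration and are not part of the original question; only the pattern of one trailing @ followed by two more INPUT statements is the same.

```sas
data work.hold_demo;
   infile datalines;
   input Type $ @;                  /* trailing @ holds this record for the next INPUT */
   if Type = 'S' then input Year;   /* continues reading the held record */
   else input City $ Year;          /* continues reading the held record */
   input NumEmployees;              /* nothing is held any more, so the next record is read */
   datalines;
S 2001
150
L Boston 2003
275
;
run;

proc print data=work.hold_demo;
run;
```

Four records go in and two observations come out, so each iteration of the DATA step consumes two records.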

SAS Base (28)

Given the SAS data set WORK.P2000:

Location Pop2000
Alaska 626931
Delaware 783595
Vermont 608826
Wyoming 493782

and the SAS data set WORK.P2008:

State Pop2008
Alaska 686293
Delaware 873092
Wyoming 532668

The following output is desired:

Obs State Pop2000 Pop2008 Difference
1 Alaska 626931 686293 59362
2 Delaware 783595 873092 89497
3 Wyoming 493782 532668 38886

Which SAS program correctly combines the data?

A.
data compare;
merge WORK.P2000(in=_a Location=State)
WORK.P2008(in=_b);
by State;
if _a and _b;
Difference=Pop2008-Pop2000;
run;

B.
data compare;
merge WORK.P2000(rename=(Location=State))
WORK.P2008;
by State;
if _a and _b;
Difference=Pop2008-Pop2000;
run;

C.
data compare;
merge WORK.P2000(in=_a rename=(Location=State))
WORK.P2008(in=_b);
by State;
if _a and _b;
Difference=Pop2008-Pop2000;
run;

D.
data compare;
merge WORK.P2000(in=_a) (rename=(Location=State))
WORK.P2008(in=_b);
by State;
if _a and _b;
Difference=Pop2008-Pop2000;
run;

Answer: C

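Explanation: MERGE combines observations from several data sets into a single observation (comparable to a JOIN in SQL). The IN= option creates a Boolean variable whose name is given on the right of the equal sign (_a and _b in this question). Its value is 1 when the merged observation includes data from that data set and 0 otherwise. For example, the first observation of the new data set has State equal to Alaska. Both WORK.P2000 and WORK.P2008 contain an Alaska observation, so _a and _b are both 1 in the first iteration of the DATA step. The following program shows the values of _a and _b in each iteration (because _a and _b are not written to the new data set, their values must be assigned to new variables, here a and b):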
data compare;
merge P2000(in=_a rename=(Location=State))
P2008(in=_b);
by State;
a = _a;
b = _b;
Difference=Pop2008-Pop2000;
run;

Obs State Pop2000 Pop2008 a b Difference
1 Alaska 626931 686293 1 1 59362
2 Delaware 783595 873092 1 1 89497
3 Vermont 608826 . 1 0 .
4 Wyoming 493782 532668 1 1 38886

The RENAME= option renames variables. Its syntax is: RENAME=(old-name=new-name).

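The MERGE statement in this question has the form MERGE dataset1(option1 option2) dataset2(option3); all options for a data set go inside one pair of parentheses directly after its name, separated by spaces.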

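Finally, the subsetting IF statement outputs only the observations for which _a and _b are both 1, that is, the observations that WORK.P2000 and WORK.P2008 have in common.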
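For readers who want to run option C end to end, here is a sketch that first builds the two input data sets from the values shown in the question and then performs the merge. Reading the data with DATALINES is an assumption made for the example; the question does not say how the data sets were created.

```sas
/* Build WORK.P2000 from the values listed in the question */
data work.p2000;
   input Location $ Pop2000;
   datalines;
Alaska 626931
Delaware 783595
Vermont 608826
Wyoming 493782
;
run;

/* Build WORK.P2008 from the values listed in the question */
data work.p2008;
   input State $ Pop2008;
   datalines;
Alaska 686293
Delaware 873092
Wyoming 532668
;
run;

/* Option C: rename Location to State, keep only matching states */
data work.compare;
   merge work.p2000(in=_a rename=(Location=State))
         work.p2008(in=_b);
   by State;
   if _a and _b;                     /* subsetting IF: drops Vermont */
   Difference = Pop2008 - Pop2000;
run;

proc print data=work.compare;
run;
```

Both inputs are already in ascending order of state name, which is what the BY statement requires; unsorted inputs would need a PROC SORT first.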

SAS Base (27)

Given the SAS data set WORK.TEMPS:

Day Month Temp
1 May 75
15 May 70
15 June 80
3 June 76
2 July 85
14 July 89

The following program is submitted:

proc sort data=WORK.TEMPS;
by descending Month Day;
run;

proc print data=WORK.TEMPS;
run;

Which output is correct?

A.

Obs Day Month Temp
1 2 July 85
2 14 July 89
3 3 June 76
4 15 June 80
5 1 May 75
6 15 May 70

B.

Obs Day Month Temp
1 1 May 75
2 2 July 85
3 3 June 76
4 14 July 89
5 15 May 70
6 15 June 80

C.

Obs Day Month Temp
1 1 May 75
2 15 May 70
3 3 June 76
4 15 June 80
5 2 July 85
6 14 July 89

D.

Obs Day Month Temp
1 15 May 70
2 1 May 75
3 15 June 80
4 3 June 76
5 14 July 89
6 2 July 85

Answer: C

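Explanation: DESCENDING applies only to the variable that immediately follows it, so the output is sorted by Month in descending order and, within the same Month, by Day in ascending order. Month is a character variable, so it is sorted in reverse alphabetical order. Note that SAS does not know that May is the fifth month and June the sixth; it only knows that May begins with M and June begins with J. June comes before July in descending order because the third letter n is greater than l. The sort order of character values differs slightly between operating systems that use different encodings.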

On systems that use ASCII, such as Windows, Linux, and macOS, characters sort from smallest to largest in this order:
blank ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? @
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _
a b c d e f g h i j k l m n o p q r s t u v w x y z { } ~

On systems that use EBCDIC, such as z/OS, characters sort from smallest to largest in this order:
blank . < ( + | & ! $ * ) ; ¬ - / , % _ > ? : # @ ' = "
a b c d e f g h i j k l m n o p q r ~ s t u v w x y z
{ A B C D E F G H I } J K L M N O P Q R S T U V W X Y Z
0 1 2 3 4 5 6 7 8 9
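The DESCENDING rule can be checked with a short sketch that rebuilds WORK.TEMPS from the values in the question. The DATALINES step and the output data set names TEMPS_SORTED and TEMPS_BOTH are assumptions added for the example, chosen so that the original data set is left unchanged.

```sas
/* Rebuild WORK.TEMPS from the values listed in the question */
data work.temps;
   input Day Month $ Temp;
   datalines;
1 May 75
15 May 70
15 June 80
3 June 76
2 July 85
14 July 89
;
run;

/* DESCENDING affects Month only; Day stays ascending within each Month */
proc sort data=work.temps out=work.temps_sorted;
   by descending Month Day;
run;

/* To reverse Day as well, DESCENDING has to be repeated before it */
proc sort data=work.temps out=work.temps_both;
   by descending Month descending Day;
run;

proc print data=work.temps_sorted;
run;
```

The first sort should reproduce the listing in option C.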

SAS Base (26)

Which step sorts the observations of a permanent SAS data set by two variables and stores the sorted observations in a temporary SAS data set?

A.
proc sort out=EMPLOYEES data=EMPSORT;
by Lname and Fname;
run;

B.
proc sort data=SASUSER.EMPLOYEES out=EMPSORT;
by Lname Fname;
run;

C.
proc sort out=SASUSER.EMPLOYEES data=WORK.EMPSORT;
by Lname Fname;
run;

D.
proc sort data=SASUSER.EMPLOYEES out=SASUSER.EMPSORT;
by Lname and Fname;
run;

Answer: B

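Explanation: DATA= names the data set to be sorted and OUT= names the data set that receives the sorted observations. If OUT= is omitted, SAS replaces the original data set with the sorted data. The BY statement lists the sort variables directly, separated by spaces. Data sets in the WORK library are temporary, that is, they are removed when the SAS session ends. They are written as WORK.dataset, and WORK can be omitted, for example WORK.EMPSORT or EMPSORT. Data sets in other libraries are permanent and are written as libref.dataset, for example SASUSER.EMPLOYEES.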
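A minimal sketch of the same pattern, assuming the SASHELP.CLASS sample data set is available (SASUSER.EMPLOYEES from the question may not exist in every installation). The output name CLASS_SORTED is made up for the example.

```sas
/* Permanent input (SASHELP library), temporary output (WORK library) */
proc sort data=sashelp.class out=class_sorted;   /* same as out=work.class_sorted */
   by Sex Name;                                  /* two sort variables, separated by a space */
run;

proc print data=work.class_sorted;
run;
```

Because OUT= is given, SASHELP.CLASS itself is left untouched.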

SAS Base (25)

Given the following code:

proc print data=SASHELP.CLASS(firstobs=5 obs=15);
where Sex='M';
run;

How many observations will be displayed?

A. 11
B. 15
C. 10 or fewer
D. 11 or fewer

Answer: D

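Explanation: FIRSTOBS=5 and OBS=15 select observations 5 through 15, so at most 15 - 5 + 1 = 11 observations can be displayed. The displayed observations must also satisfy Sex='M', so the count can be lower, hence "11 or fewer". In current SAS releases the WHERE condition is applied first and FIRSTOBS= and OBS= then count within the matching observations; either way the upper bound is 11. For FIRSTOBS= and OBS=, see SAS Base (5).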
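The upper bound can be observed by running the step with and without the WHERE statement. This is only a sketch for experimentation; it uses the same SASHELP.CLASS data set as the question.

```sas
/* Without WHERE: observations 5 through 15, that is 11 observations */
proc print data=sashelp.class(firstobs=5 obs=15);
run;

/* With WHERE: never more than 11 observations, usually fewer */
proc print data=sashelp.class(firstobs=5 obs=15);
   where Sex='M';
run;
```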

SAS Base (24)

Given the following raw data records:

----|----10---|----20
Susan*12/29/1970*10
Michael**6

The following output is desired:

Obs employee bdate years
1 Susan 4015 10
2 Michael . 6

Which SAS program correctly reads in the raw data?
A.
data employees;
infile 'file specification' dlm='*';
input employee $ bdate : mmddyy10. years;
run;

B.
data employees;
infile 'file specification' dsd='*';
input employee $ bdate mmddyy10. years;
run;

C.
data employees;
infile 'file specification' dlm dsd;
input employee $ bdate mmddyy10. years;
run;

D.
data employees;
infile 'file specification' dlm='*' dsd;
input employee $ bdate : mmddyy10. years;
run;

Answer: D

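Explanation: The question tests using DLM= and DSD together. DLM='*' sets the asterisk as the delimiter, and DSD treats two consecutive delimiters as a missing value, which is needed for Michael's missing bdate. The colon modifier lets the MMDDYY10. informat read up to the next delimiter instead of a fixed 10 columns. See SAS Base (2) for details.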
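Option D can be tested without an external file by reading the two records in-stream. Using INFILE DATALINES in place of the 'file specification' is an assumption made so that the sketch is self-contained.

```sas
data work.employees;
   infile datalines dlm='*' dsd;                /* DSD: '**' means a missing value */
   input employee $ bdate : mmddyy10. years;    /* colon: read bdate up to the next delimiter */
   datalines;
Susan*12/29/1970*10
Michael**6
;
run;

proc print data=work.employees;
run;
```

Removing DSD from the INFILE statement is a quick way to see why it is needed: the two asterisks in Michael's record are then treated as a single delimiter.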

SAS Base (23)

Which is a valid LIBNAME statement?
A. libname "_SAS_data_library_location_";
B. sasdata libname "_SAS_data_library_location_";
C. libname sasdata "_SAS_data_library_location_";
D. libname sasdata sas "_SAS_data_library_location_";

Answer: C

Explanation: The correct syntax is LIBNAME libref 'file path';
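As a sketch of how such a statement is typically used, the example below assigns a libref, lists what the library contains, and releases the libref again. The directory '/path/to/sas/library' is a placeholder and must be replaced with a directory that exists on your system.

```sas
/* '/path/to/sas/library' is a placeholder, not a real location */
libname sasdata '/path/to/sas/library';

/* List the members of the newly assigned library */
proc contents data=sasdata._all_ nods;
run;

/* Release the libref when it is no longer needed */
libname sasdata clear;
```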

SAS Base (22)

Which step displays a listing of all the data sets in the WORK library?
A. proc contents lib=WORK run;
B. proc contents lib=WORK.all;run;
C. proc contents data=WORK._all_; run;
D. proc contents data=WORK _ALL_; run;

Answer: C

Explanation: In the PROC CONTENTS statement, libref._ALL_ displays information about all data sets in the SAS library libref. Option D is missing the period. If the libref is WORK, it can be omitted, that is, data=_all_.
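A short sketch of the variants mentioned above. The NODS option is an addition that is not part of the question; it limits the output to the directory listing and omits the per-data-set details.

```sas
/* Directory listing plus details for every data set in WORK */
proc contents data=work._all_;
run;

/* Directory listing only */
proc contents data=work._all_ nods;
run;

/* WORK is the default library, so the libref can be left out */
proc contents data=_all_ nods;
run;
```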

SAS Base (21)

Given the SAS data set WORK.PRODUCTS:

ProdId Price ProductType Sales Returns
K12S 95.50 OUTDOOR 15 2
B132S 2.99 CLOTHING 300 10
R18KY2 51.99 EQUIPMENT 25 5
3KL8BY 6.39 OUTDOOR 125 15
DY65DW 5.60 OUTDOOR 45 5
DGTY23 34.55 EQUIPMENT 67 2

The following SAS program is submitted:

data WORK.OUTDOOR WORK.CLOTH WORK.EQUIP;
set WORK.PRODUCTS;
if Sales GT 30;
if ProductType EQ 'OUTDOOR' then output WORK.OUTDOOR;
else if ProductType EQ 'CLOTHING' then output WORK.CLOTH;
else if ProductType EQ 'EQUIPMENT' then output WORK.EQUIP;
run;

How many observations does the WORK.OUTDOOR data set contain?
A. 1
B. 2
C. 3
D. 6

Answer: B

Explanation: Observations in WORK.OUTDOOR must satisfy two conditions: Sales is greater than 30 and ProductType is OUTDOOR. Only the fourth and fifth observations satisfy both.
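The count can be confirmed with a sketch that rebuilds WORK.PRODUCTS from the values in the question and then runs the submitted program. The DATALINES step and the LENGTH statement are assumptions added for the example; the LENGTH statement is there because EQUIPMENT has nine characters and would be truncated at the default length of 8.

```sas
/* Rebuild WORK.PRODUCTS from the values listed in the question */
data work.products;
   length ProdId $ 6 ProductType $ 9;
   input ProdId $ Price ProductType $ Sales Returns;
   datalines;
K12S 95.50 OUTDOOR 15 2
B132S 2.99 CLOTHING 300 10
R18KY2 51.99 EQUIPMENT 25 5
3KL8BY 6.39 OUTDOOR 125 15
DY65DW 5.60 OUTDOOR 45 5
DGTY23 34.55 EQUIPMENT 67 2
;
run;

data work.outdoor work.cloth work.equip;
   set work.products;
   if Sales GT 30;                       /* subsetting IF: the rest of the step is skipped otherwise */
   if ProductType EQ 'OUTDOOR' then output work.outdoor;
   else if ProductType EQ 'CLOTHING' then output work.cloth;
   else if ProductType EQ 'EQUIPMENT' then output work.equip;
run;

proc print data=work.outdoor;
run;
```

The listing should show the two OUTDOOR products with Sales of 125 and 45.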

SAS Base (20)

The data set WORK.REALESTATE has the variable LocalFee with a format of 9. and the variable CountryFee with a format of 7.

The following SAS program is submitted:

data WORK.FEE_STRUCTURE;
format LocalFee CountryFee percent7.2;
set WORK.REALESTATE;
LocalFee=LocalFee/100;
CountryFee=CountryFee/100;
run;

What are the formats of the variables LOCALFEE and COUNTRYFEE in the output dataset?
A. LocalFee has format of 9. and CountryFee has a format of 7.
B. LocalFee has format of 9. and CountryFee has a format of percent7.2
C. Both LocalFee and CountryFee have a format of percent7.2
D. The data step fails execution; there is no format for LocalFee.

Answer: C

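Explanation: A single FORMAT statement can change the format of several variables at once, and a format assigned in the DATA step replaces the one stored in the input data set. Note also that the PERCENT format multiplies the value by 100 for display: an original value of 10 would be displayed as 1000% under percent7.2.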
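The resulting formats can be inspected with PROC CONTENTS. In the sketch below, the input data set is created with made-up fee values (10 and 25), since the question does not show the contents of WORK.REALESTATE; only the formats 9. and 7. come from the question.

```sas
/* Hypothetical input: the values 10 and 25 are invented for the example */
data work.realestate;
   input LocalFee CountryFee;
   format LocalFee 9. CountryFee 7.;
   datalines;
10 25
;
run;

data work.fee_structure;
   format LocalFee CountryFee percent7.2;   /* one statement, two variables */
   set work.realestate;
   LocalFee = LocalFee / 100;
   CountryFee = CountryFee / 100;
run;

/* The Format column shows which format each variable ended up with */
proc contents data=work.fee_structure;
run;
```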