SAS Base (48)

The following SAS program is submitted:

data WORK.TEST;
drop City;
infile datalines;
input Name $ 1-14 / Address $ 1-14 / City $ 1-12 ;
if City='New York ' then input @1 State $2.;
else input;
datalines;
Joe Conley
123 Main St.
Janesville
WI
Jane Ngyuen
555 Alpha Ave.
New York
NY
Jennifer Jason
666 Mt. Diablo
Eureka
CA
;
run;

What will the data set WORK.TEST contain?

A.

Obs  Name            Address         State
1    Joe Conley      123 Main St.
2    Jane Ngyuen     555 Alpha Ave.  NY
3    Jennifer Jason  666 Mt. Diablo

B.

Obs  Name            Address         City        State
1    Joe Conley      123 Main St.    Janesville
2    Jane Ngyuen     555 Alpha Ave.  New York    NY
3    Jennifer Jason  666 Mt. Diablo  Eureka

C.

Obs  Name            Address         State
1    Jane Ngyuen     555 Alpha Ave.  NY

D. No observations, there is a syntax error in the data step.

Answer: A

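Explanation: In the INPUT statement, $ marks a character variable, and 1-14 reads columns 1 through 14 of the current record into that variable. The / line-pointer control moves the pointer to column 1 of the next record, so the variables that follow are read from that record. By default each INPUT statement releases its record when it finishes, so the next INPUT statement starts on a new record. An INPUT statement with no variables (a null INPUT) therefore reads one record and discards it. The effect resembles /, but the mechanism is different; see "Without Arguments" under the INPUT statement in the official documentation. The program reads Name, Address and City from three consecutive records, then reads State from the fourth record only when City is New York. Otherwise the null INPUT skips that record and State stays missing. Finally, the DROP statement tells SAS not to write City to the data set, so the answer is A.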
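The difference between / and a null INPUT can be seen in a small sketch. The data set name WORK.DEMO, the variable names and the data lines are made up for illustration and are not part of the exam question.

```sas
/* Hypothetical example: names and data are assumptions for illustration. */
data work.demo;
  infile datalines;
  /* "/" moves the pointer to column 1 of the next record */
  input Name $ 1-10 / City $ 1-10;
  /* null INPUT: reads the next record and assigns nothing */
  input;
  datalines;
Ann
Boston
skip me
Bob
Denver
skip me too
;
run;

proc print data=work.demo;
run;
```

Each iteration of the DATA step consumes three records, so WORK.DEMO should contain two observations (Ann/Boston and Bob/Denver), and the "skip" records never reach a variable.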

SAS Base (47)

The following output is created by the FREQUENCY procedure:

Frequency
Percent
Row Pct
Col Pct

Table of region by product

region  product
              corn    cotton   oranges     Total
EAST             2         1         1         4
             22.22     11.11     11.11     44.44
             50.00     25.00     25.00
             50.00     33.33     50.00
SOUTH            2         2         1         5
             22.22     22.22     11.11     55.56
             40.00     40.00     20.00
             50.00     66.67     50.00
Total            4         3         2         9
             44.44     33.33     22.22    100.00

Which TABLES option(s) would be used to eliminate the row and column counts and just see the frequencies and percents?

A. norowcount nocolcount
B. freq percent
C. norow nocol
D. nocounts

Answer: C

Explanation: PROC FREQ produces frequency tables. The question asks to suppress the Row Percent and Column Percent statistics (what the question calls "row and column counts"). This only requires adding the NOROW and NOCOL options to the TABLES statement:
proc freq data = dataset;
tables region * product / nocol norow;
run;
The output becomes:

Frequency
Percent

Table of region by product

region  product
              corn    cotton   oranges     Total
EAST             2         1         1         4
             22.22     11.11     11.11     44.44
SOUTH            2         2         1         5
             22.22     22.22     11.11     55.56
Total            4         3         2         9
             44.44     33.33     22.22    100.00
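For readers who want to reproduce the table, here is a minimal sketch. The data set WORK.CROPS and the variable count are assumptions: the question does not show the input data, so the cell counts are entered directly and applied with a WEIGHT statement.

```sas
/* Hypothetical input data: one row per cell, with the cell count. */
data work.crops;
  input region $ product $ count;
  datalines;
EAST corn 2
EAST cotton 1
EAST oranges 1
SOUTH corn 2
SOUTH cotton 2
SOUTH oranges 1
;
run;

proc freq data=work.crops;
  weight count;                              /* treat count as the cell frequency */
  tables region*product / norow nocol;       /* drop Row Pct and Col Pct */
run;
```

Removing NOROW and NOCOL from the TABLES statement should bring back the four-statistic layout shown in the question.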

SAS Base (46)

Given the SAS data set WORK.ONE:

Obs Revenue2008 Revenue2009 Revenue2010
1 1.2 1.6 2

The following SAS program is submitted:

data WORK.TWO;
set WORK.ONE;
Total=mean(of Rev:);
run;

What value will SAS assign to Total?

A. 3
B. 1.6
C. 4.8
D. The program fails to execute due to errors.

Answer: B

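Explanation: This question uses a variable list. Rev: selects every variable whose name begins with Rev, and a variable list passed to a function must be preceded by OF. mean(of Rev:) is therefore equivalent to mean(Revenue2008, Revenue2009, Revenue2010), and the MEAN function returns the average of the three values: (1.2 + 1.6 + 2) / 3 = 1.6.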
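The following sketch rebuilds WORK.ONE from the values in the question and compares two kinds of variable list. Total2 and the numbered range list are additions for illustration only.

```sas
/* Recreate WORK.ONE with the values given in the question. */
data work.one;
  Revenue2008 = 1.2;
  Revenue2009 = 1.6;
  Revenue2010 = 2;
run;

data work.two;
  set work.one;
  Total  = mean(of Rev:);                      /* name prefix list */
  Total2 = mean(of Revenue2008-Revenue2010);   /* numbered range list (illustrative) */
  put Total= Total2=;                          /* writes both values to the log */
run;
```

Both lists resolve to the same three variables here, so both assignments should give 1.6.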

SAS Base (45)

The following SAS program is submitted:

ods csvall file='c:test.csv';
proc print data=WORK.ONE;
var Name Score Grade;
by IdNumber;
run;
ods csvall close;

What is produced as output?

A. A file named test.csv that can only be opened in Excel.
B. A text file named test.csv that can be opened in Excel or in any text editor.
C. A text file named test.csv that can only be opened in a text editor.
D. A file named test.csv that can only be opened by SAS.

Answer: B

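Explanation: ODS (Output Delivery System) exports data and analysis results from SAS. To open an ODS destination:
ODS destination FILE = 'path and file name with extension';
To close it:
ODS destination CLOSE;
Here destination sets the type of file that is written, for example CSVALL (used in this question), HTML or PDF. The program writes a CSV file, which is plain text and can be opened in Excel or in any text editor.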
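A minimal sketch of the open/close pattern, using the SASHELP.CLASS sample data set instead of WORK.ONE (whose contents the question does not show). The file name test.csv is a placeholder; where it is written depends on the session's current directory.

```sas
/* Hypothetical output path: adjust to a writable location on your system. */
ods csvall file='test.csv';

proc print data=sashelp.class;
  var Name Age Height;
run;

ods csvall close;   /* the file is complete only after the destination is closed */
```

The resulting file is comma-separated text, so it can be inspected in any text editor as well as in a spreadsheet program.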

SAS Base (44)

The following SAS program is submitted:

data ONE TWO SASUSER.TWO;
set SASUSER.ONE;
run;

Assuming that SASUSER.ONE exists, how many temporary and permanent SAS data sets are created?

A. 2 temporary and 1 permanent SAS data sets are created
B. 3 temporary and 2 permanent SAS data sets are created
C. 2 temporary and 2 permanent SAS data sets are created
D. there is an error and no new data sets are created

Answer: A

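Explanation: A single DATA statement can create several data sets at once. A data set whose libref is WORK, or whose libref is omitted, is temporary and is stored in the WORK library. A data set with any other libref, such as SASUSER here, is a permanent data set in that library. The program therefore creates two temporary data sets (WORK.ONE and WORK.TWO) and one permanent data set (SASUSER.TWO). For more on creating multiple data sets in one DATA statement, see SAS Base (35).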
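As a small sketch of one DATA statement creating several data sets, the step below splits the SASHELP.CLASS sample data in two. The names TEENS and PRETEENS are hypothetical, and both data sets are temporary: one names WORK explicitly and the other omits the libref.

```sas
/* Hypothetical data set names, for illustration only. */
data work.teens preteens;        /* explicit WORK libref and omitted libref */
  set sashelp.class;
  if Age >= 13 then output work.teens;
  else output preteens;          /* same as work.preteens */
run;
```

Without the OUTPUT statements every observation would be written to both data sets, which is what happens in the exam program.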

SAS Base (43)

Given the SAS data set WORK.ORDERS:

order_id customer shipped
9341 Josh Martin 02FEB2009
9874 Rachel Lords 14MAR2009
10233 Takashi Sato 07JUL2009

The variable order_id is numeric; customer is character; and shipped is numeric, contains a SAS date value, and is shown with the DATE9. format.

A programmer would like to create a new variable, ship_note, that shows a character value with the order_id, shipped date, and customer name.

For example, given the first observation ship_note would have the value “Order 9341 shipped on 02FEB2009 to Josh Martin”.

Which of the following statements will correctly create the value and assign it to ship_note?

A. ship_note=catx(' ','Order',order_id,'shipped on',input(shipped,date9.),'to',customer);
B. ship_note=catx(' ','Order',order_id,'shipped on',char(shipped,date9.),'to',customer);
C. ship_note=catx(' ','Order',order_id,'shipped on',tranwrd(shipped,date9.),'to',customer);
D. ship_note=catx(' ','Order',order_id,'shipped on',put(shipped,date9.),'to',customer);

Answer: D

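Explanation: The CHAR function returns the single character at a given position in a string. For example, char('SAS', 3) returns 'S', while char('SAS', 4) returns a blank because the string has no fourth character. It expects a position rather than a format, so option B does not work.

The TRANWRD function replaces a substring within a string. For example, tranwrd('SAS&R', 'R', 'Python') returns 'SAS&Python'.

The PUT function writes a value using a specified format and always returns a character value:
new_variable = PUT(source_value, format);
format controls how source_value is written into new_variable. In this question it renders a SAS date (the number of days since 1 January 1960) with the DATE9. format.

The INPUT function is the reverse of PUT: it reads a value using a specified informat:
new_variable = INPUT(source_value, informat);
informat tells SAS how source_value is laid out.

The following code may help in understanding PUT and INPUT:
data d;
today1 = '18Nov2014'd;
new_today1 = put(today1, ddmmyy10.);
today2 = '18/11/2014';
new_today2 = input(today2, ddmmyy10.);
run;

Obs  today1  new_today1  today2      new_today2
1    20045   18/11/2014  18/11/2014  20045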
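Option D can be checked with a short sketch. WORK.ORDERS is rebuilt here with assignment statements, and the output data set WORK.NOTES and the length of 60 for ship_note are assumptions made for illustration.

```sas
/* Recreate WORK.ORDERS from the values in the question. */
data work.orders;
  length customer $ 12;
  format shipped date9.;
  order_id = 9341;  customer = 'Josh Martin';  shipped = '02FEB2009'd; output;
  order_id = 9874;  customer = 'Rachel Lords'; shipped = '14MAR2009'd; output;
  order_id = 10233; customer = 'Takashi Sato'; shipped = '07JUL2009'd; output;
run;

data work.notes;                 /* hypothetical output data set name */
  set work.orders;
  length ship_note $ 60;         /* assumed length, large enough for the text */
  /* PUT turns the numeric date into the character string 02FEB2009 */
  ship_note = catx(' ', 'Order', order_id, 'shipped on',
                   put(shipped, date9.), 'to', customer);
  put ship_note=;
run;
```

For the first observation the log should show ship_note=Order 9341 shipped on 02FEB2009 to Josh Martin, matching the value requested in the question.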

SAS Base (42)

The following SAS program is submitted:

data WORK.ONE;
Text='Australia, US, Denmark';
Pos=find(Text,'US','i',5);
run;

What value will SAS assign to Pos?

A. 0
B. 1
C. 2
D. 12

Answer: D

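Explanation: The FIND function returns the starting position of a substring within a string. Here it searches 'Australia, US, Denmark' for 'US' starting at character 5, and the 'i' modifier makes the search case-insensitive. Ignoring case, the string contains 'US' twice: the 'us' in 'Australia' at position 2 and the 'US' at position 12. Because the search starts at character 5, the first match is at position 12, counting from 'A' and including the comma and the space.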
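The effect of the modifier and the start position can be compared side by side. The variable names p1 to p3 are made up for this sketch.

```sas
data _null_;
  Text = 'Australia, US, Denmark';
  p1 = find(Text, 'US');           /* case-sensitive, from position 1 */
  p2 = find(Text, 'US', 'i');      /* ignore case, from position 1 */
  p3 = find(Text, 'US', 'i', 5);   /* ignore case, from position 5 */
  put p1= p2= p3=;                 /* log: p1=12 p2=2 p3=12 */
run;
```

Only p2 picks up the lowercase 'us' inside 'Australia', because it both ignores case and starts searching before position 2.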

SAS Base (41)

Given the raw data record in the file phone.txt:

----|----10---|----20---|----30---|
Stevens James SALES 304-923-3721 14

The following SAS program is submitted:

data WORK.PHONES;
infile 'phone.txt';
input EmpLName $ EmpFName $ Dept $ Phone $ Extension;
<_insert_code_>
run;

Which SAS statement completes the program and results in a value of “James Stevens” for the variable FullName?

A. FullName=CATX(' ',EmpFName,EmpLName);
B. FullName=CAT(' ',EmpFName,EmpLName);
C. FullName=EmpFName||EmpLName;
D. FullName=EmpFName + EmpLName;

Answer: A

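Explanation: The CAT function concatenates its arguments and keeps their leading and trailing blanks. For example:
data _null_;
x = ' SAS ';
y = ' R Python.';
z = cat(x, y);
put z $char.;
run;
The blanks before and after 'SAS' and the blank before 'R' are all kept. The log shows the following (the ruler is not part of the output):
----|----10---|
 SAS  R Python.

The CATX function inserts a chosen delimiter between the strings it joins and removes their leading and trailing blanks. For example:
data _null_;
x = ' SAS ';
y = ' R Python.';
z = catx('*', x, y);
put z $char.;
run;
The output is:
----|----10---|
SAS*R Python.

|| is the concatenation operator and behaves almost exactly like CAT: CAT(a, b) and a||b return the same result. The only difference is the default length of the variable that receives the concatenated string. Consider this program:
data d;
x = ' SAS ';
y = ' R Python.';
z1 = cat(x, y);
z2 = x||y;
put z1 $char.;
put z2 $char.;
run;
If the lengths of z1 and z2 have not been defined beforehand, z1 gets a length of 200, while the length of z2 is the length of x plus the length of y. PROC CONTENTS reports:

Alphabetic List of Variables and Attributes
#  Variable  Type  Len
1  x         Char    5
2  y         Char   10
3  z1        Char  200
4  z2        Char   15

Option D attempts to add two character variables with +. Names cannot be added: SAS tries to convert the character values to numbers, the conversion fails, and the result is a missing numeric value.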
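To try option A without the external file, the record can be supplied in-stream. Replacing the INFILE statement with DATALINES and setting FullName to length 20 are assumptions made for this sketch.

```sas
data work.phones;
  /* List input: character variables get the default length of 8,
     so Phone is truncated. That does not affect FullName. */
  input EmpLName $ EmpFName $ Dept $ Phone $ Extension;
  length FullName $ 20;                       /* assumed length */
  FullName = catx(' ', EmpFName, EmpLName);   /* trims blanks, inserts one space */
  put FullName=;
  datalines;
Stevens James SALES 304-923-3721 14
;
run;
```

The log should show FullName=James Stevens. Swapping in CAT or || keeps the trailing blanks of EmpFName, so those versions do not produce exactly one space between the names.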

SAS Base (40)

The following SAS program is submitted:

data WORK.PRODUCTS;
Prod=1;
do while(Prod LE 6);
Prod + 1;
end;
run;

What is the value of the variable Prod in the output data set?

A. 6
B. 7
C. 8
D. . (missing numeric)

Answer: B

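Explanation: DO WHILE is a loop statement: the statements between DO WHILE and END are executed repeatedly until the expression in parentheses becomes false, and the expression is checked at the top of each iteration. Here the last iteration starts with Prod equal to 6 and runs Prod + 1, so Prod is 7 when the program ends.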
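Adding PUT statements to the loop makes the iterations visible in the log. This is the program from the question with tracing added and DATA _NULL_ used so that no data set is written.

```sas
data _null_;
  Prod = 1;
  do while (Prod le 6);        /* condition is tested before each pass */
    Prod + 1;                  /* sum statement: adds 1 to Prod */
    put 'inside loop: ' Prod=;
  end;
  put 'after loop: ' Prod=;    /* Prod=7 */
run;
```

The loop body runs six times (Prod goes from 2 to 7), and the test fails only once Prod has already reached 7.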

SAS Base (39)

The following SAS program is submitted:

data WORK.AUTHORS;
array Favorites{3} $ 8 ('Shakespeare','Hemingway','McCaffrey');
run;

What is the value of the second variable in the dataset WORK.AUTHORS?
A. Hemingway
B. Hemingwa
C. ' ' (a missing value)
D. The program contains errors. No variables are created.

Answer: B

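Explanation: In the ARRAY statement, Favorites is the array name, {3} declares three elements, $ makes the elements character, and 8 sets the length of each element to 8 bytes. ('Shakespeare','Hemingway','McCaffrey') supplies the initial values. Because no variable names are listed, SAS creates the variables Favorites1, Favorites2 and Favorites3. The second element, Favorites{2}, is initialized with 'Hemingway', but the length of 8 truncates it to Hemingwa.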
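The truncation is easy to see by declaring a second array with a longer element length. The array LongNames and its length of 11 are assumptions added for comparison.

```sas
data work.authors;
  /* Length 8: initial values longer than 8 characters are truncated */
  array Favorites{3} $ 8  ('Shakespeare','Hemingway','McCaffrey');
  /* Hypothetical second array with room for the longest name */
  array LongNames{3} $ 11 ('Shakespeare','Hemingway','McCaffrey');
  put Favorites2= LongNames2=;
run;
```

The log should show Favorites2=Hemingwa and LongNames2=Hemingway, so the fix in practice is to choose an element length that fits the longest initial value.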