SAS Base (8)

The SAS data set named WORK.SALARY contains 10 observations for each department, and is currently ordered by Department. The following SAS program is submitted:

data WORK.TOTAL;
set WORK.SALARY(keep=Department MonthlyWageRate);
by Department;
if First.Department=1 then Payroll=0;
Payroll+(MonthlyWageRate*12);
if Last.Department=1;
run;

Which statement is true?
A. The by statement in the DATA step causes a syntax error.
B. The statement Payroll+(MonthlyWageRate*12); in the data step causes a syntax error.
C. The values of the variable Payroll represent the monthly total for each department in the WORK.SALARY data set.
D. The values of the variable Payroll represent a monthly total for all values of WAGERATE in the WORK.SALARY data set.

Answer: C

Note: This is similar to SAS Base (1).
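On option B: the sum statement is valid syntax. As a rough sketch, the step can be rewritten with an explicit RETAIN statement and the SUM function, which is how the sum statement behaves. The output name WORK.TOTAL2 is made up for this illustration, and WORK.SALARY is assumed to have the structure described in the question.

```sas
/* Sketch: same logic as the question, with the sum statement spelled out. */
/* WORK.TOTAL2 is a hypothetical name chosen for this illustration.        */
data WORK.TOTAL2;
  set WORK.SALARY(keep=Department MonthlyWageRate);
  by Department;
  retain Payroll 0;                           /* keep Payroll across observations */
  if First.Department then Payroll = 0;       /* reset at the start of each group */
  Payroll = sum(Payroll, MonthlyWageRate*12); /* same effect as Payroll+(MonthlyWageRate*12); */
  if Last.Department;                         /* output one observation per department */
run;
```

Either form accumulates MonthlyWageRate*12 within each department and writes one observation per department.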

SAS Base (1)

The following SAS program is submitted:

data WORK.TOTAL;
set WORK.SALARY;
by Department Gender;
if First.<_insert_code_> then Payroll=0;
Payroll+Wagerate;
if Last.<_insert_code_>;
run;

The SAS data set WORK.SALARY is currently ordered by Gender within Department.

Which inserted code will accumulate subtotals for each Gender within Department?
A. Gender
B. Department
C. Gender Department
D. Department Gender

Answer: A

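Note: SAS uses FIRST.variable and LAST.variable to detect the start and end of a BY group. When SAS reads the first observation in a BY group, FIRST.variable is set to 1; otherwise it is 0. Likewise, LAST.variable is set to 1 when SAS reads the last observation in the BY group, and to 0 otherwise. An example makes this concrete.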

First, create a mock data set:

data d;
input department $ gender $ wagerate ;
datalines;
D1 F 10
D1 F 12
D1 M 9
D2 F 8
D2 M 5
D3 F 7
D3 F 15
D3 F 3
D4 F 20
;
run;

Next, sort the data:

proc sort data = d out = salary;
by department gender;
run;

The data after sorting:

Obs  department  gender  wagerate
1    D1          F       10
2    D1          F       12
3    D1          M       9
4    D2          F       8
5    D2          M       5
6    D3          F       7
7    D3          F       15
8    D3          F       3
9    D4          F       20
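To see the flags themselves, the sketch below copies the automatic FIRST. and LAST. variables into ordinary variables, since the automatic ones are not written to the output data set. The data set name WORK.FLAGS and the variables first_dept, last_dept, first_gender and last_gender are made up for this illustration.

```sas
/* Sketch: expose the temporary FIRST./LAST. flags for WORK.SALARY */
data work.flags;
  set work.salary;
  by department gender;
  /* FIRST./LAST. variables are dropped automatically, so copy them */
  first_dept   = first.department;
  last_dept    = last.department;
  first_gender = first.gender;
  last_gender  = last.gender;
run;

proc print data=work.flags;
run;
```

For observation 3 (D1 M 9), for example, first_gender and last_gender are both 1 because it is the only M record in D1, while first_dept is 0 and last_dept is 1.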

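First, options C and D each supply two variables, but FIRST.variable1 variable2 is not valid SAS syntax. SAS can parse FIRST.variable1, but the bare variable2 that follows, with no logical operator (AND, OR) in front of it, causes an error when the code is compiled. With C and D ruled out, A and B differ in which BY group they sum over. The question states that the data are sorted by Department first and then by Gender within each Department, so FIRST.department sums Wagerate for everyone in each Department, regardless of gender. The output is: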

Obs  department  gender  wagerate  payroll
1    D1          M       9         31
2    D2          M       5         13
3    D3          F       3         25
4    D4          F       20        20
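The table above can be reproduced with a step along these lines, run against the sorted WORK.SALARY from the example. The output name WORK.TOTAL_DEPT is made up for this illustration.

```sas
/* Sketch: option B, accumulating per Department only */
data work.total_dept;
  set work.salary;
  by department gender;
  if first.department then payroll = 0;  /* reset once per department */
  payroll + wagerate;                    /* sum statement: running total */
  if last.department;                    /* keep the last record of each department */
run;

proc print data=work.total_dept;
run;
```

Because only the last observation of each department is kept, the gender and wagerate columns show that last record's values, not a summary.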

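By contrast, because Gender is the secondary BY variable, FIRST.gender is set to 1 whenever the value of Gender or of Department changes. FIRST.gender therefore sums Wagerate for each gender group within each Department, rather than by gender alone. Gender meets the requirement in the question: accumulate subtotals for each Gender within Department.

The final output data: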

Obs  department  gender  wagerate  payroll
1    D1          F       12        22
2    D1          M       9         9
3    D2          F       8         8
4    D2          M       5         5
5    D3          F       3         25
6    D4          F       20        20
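As a cross-check on these subtotals, the same sums can be computed without BY-group flags. The sketch below uses PROC SQL on the example's WORK.SALARY. It is an alternative technique, not part of the exam question.

```sas
/* Sketch: cross-check the Gender-within-Department subtotals with PROC SQL */
proc sql;
  select department,
         gender,
         sum(wagerate) as payroll
  from work.salary
  group by department, gender
  order by department, gender;
quit;
```

The query returns one row per Department and Gender combination, so its payroll column should match the table above.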