Given the following raw data records in TEXTFILE.TXT:
----|----10---|----20---|----30
John,FEB,13,25,14,27,Final
John,MAR,26,17,29,11,23,Current
Tina,FEB,15,18,12,13,Final
Tina,MAR,29,14,19,27,20,Current
The following output is desired:
Obs | Name | Month | Status | Week1 | Week2 | Week3 | Week4 | Week5 |
1 | John | FEB | Final | $13 | $25 | $14 | $27 | . |
2 | John | MAR | Current | $26 | $17 | $29 | $11 | $23 |
3 | Tina | FEB | Final | $15 | $18 | $12 | $13 | . |
4 | Tina | MAR | Current | $29 | $14 | $19 | $27 | $20 |
Which SAS program correctly produces the desired output?
A.
data WORK.NUMBERS;
length Name $ 4 Month $ 3 Status $ 7;
infile 'TEXTFILE.TXT' dsd;
input Name $ Month $;
if Month='FEB' then input Week1 Week2 Week3 Week4 Status $;
else if Month='MAR' then input Week1 Week2 Week3 Week4 Week5 Status $;
format Week1-Week5 dollar6.;
run;
proc print data=WORK.NUMBERS;
run;
B.
data WORK.NUMBERS;
length Name $ 4 Month $ 3 Status $ 7;
infile 'TEXTFILE.TXT' dlm=',' missover;
input Name $ Month $;
if Month='FEB' then input Week1 Week2 Week3 Week4 Status $;
else if Month='MAR' then input Week1 Week2 Week3 Week4 Week5 Status $;
format Week1-Week5 dollar6.;
run;
proc print data=WORK.NUMBERS;
run;
C.
data WORK.NUMBERS;
length Name $ 4 Month $ 3 Status $ 7;
infile 'TEXTFILE.TXT' dlm=',';
input Name $ Month $ @;
if Month='FEB' then input Week1 Week2 Week3 Week4 Status $;
else if Month='MAR' then input Week1 Week2 Week3 Week4 Week5 Status $;
format Week1-Week5 dollar6.;
run;
proc print data=WORK.NUMBERS;
run;
D.
data WORK.NUMBERS;
length Name $ 4 Month $ 3 Status $ 7;
infile 'TEXTFILE.TXT' dsd @;
input Name $ Month $;
if Month='FEB' then input Week1 Week2 Week3 Week4 Status $;
else if Month='MAR' then input Week1 Week2 Week3 Week4 Week5 Status $;
format Week1-Week5 dollar6.;
run;
proc print data=WORK.NUMBERS;
run;
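Answer: C. Options A and B have no trailing @ on the first INPUT statement, so the second INPUT statement reads from the next record; in option D the @ is placed on the INFILE statement, where it is not valid.
Notes: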
DSD: sets the default delimiter to a comma and treats two consecutive delimiters as a missing value. For example, the data "a,b,,d" is read as: 'a', 'b', missing value, 'd'.
DLM: an alias for DELIMITER=, used to replace the default delimiter (a blank). For example, DLM='*' changes the delimiter from a blank to '*'.
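The difference between DSD and DLM=',' shows up when two delimiters are adjacent. The following is a minimal sketch using in-stream data; the data set names dsd_demo and dlm_demo are made up for illustration, and only three variables are read so that both steps stay on one line.

```sas
/* DSD: the empty field between the two commas becomes a missing value */
data dsd_demo;
   infile datalines dsd;
   input v1 $ v2 $ v3 $;
   datalines;
a,b,,d
;
run;
/* v1=a  v2=b  v3=(missing) */

/* DLM=',' alone: consecutive commas are treated as a single delimiter */
data dlm_demo;
   infile datalines dlm=',';
   input v1 $ v2 $ v3 $;
   datalines;
a,b,,d
;
run;
/* v1=a  v2=b  v3=d */
```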
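MISSOVER: if a record contains fewer values than the INPUT statement has variables, MISSOVER prevents SAS from going to the next line to look for data and sets the remaining variables to missing. For example, suppose a record holds only three values, "a,b,c", but the INPUT statement names four variables (variable1-variable4). Without MISSOVER, SAS goes to a new line to find a value for variable4. With MISSOVER, SAS does not go to the next line and sets variable4 to missing.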
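The MISSOVER behavior described above can be checked with a short record followed by a full one. This is a sketch with in-stream data; the data set names flow_demo and miss_demo are hypothetical.

```sas
/* Default behavior (FLOWOVER): v4 is taken from the next line */
data flow_demo;
   infile datalines dlm=',';
   input v1 $ v2 $ v3 $ v4 $;
   datalines;
a,b,c
e,f,g,h
;
run;
/* one observation: v1=a v2=b v3=c v4=e */

/* MISSOVER: v4 is set to missing and the next line is read normally */
data miss_demo;
   infile datalines dlm=',' missover;
   input v1 $ v2 $ v3 $ v4 $;
   datalines;
a,b,c
e,f,g,h
;
run;
/* two observations: (a, b, c, missing) and (e, f, g, h) */
```

In the default case the log also reports that SAS went to a new line when the INPUT statement reached past the end of a line.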
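@: by default, every INPUT statement reads a new record. A trailing @ holds the current record so that the next INPUT statement in the same iteration of the DATA step continues reading from it. For example: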
data d1;
input v1 $ v2 $ @;
input v3 $ v4 $;
datalines;
a b c d
e f g h
;
run;
With the @, the output is:
Obs | v1 | v2 | v3 | v4 |
1 | a | b | c | d |
2 | e | f | g | h |
Without the @, the output is:
Obs | v1 | v2 | v3 | v4 |
1 | a | b | e | f |
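Similar to @ is @@. The difference is that a single trailing @ holds the record only within the current iteration of the DATA step, and the record is released when that iteration ends, whereas @@ holds the record across iterations. What counts as one iteration? In the example above, four variables are declared, so reading v1, v2, v3, and v4 once is one iteration. Here is an example that uses @@: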
data d2;
input v1 $ v2 $ @@;
datalines;
a b c d
;
run;
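In this example, reading v1 and v2 once is one iteration of the DATA step; @@ prevents SAS from moving to a new line when the next iteration begins.
With @@, the output is: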
Obs | v1 | v2 |
1 | a | b |
2 | c | d |
Without @, or with only a single @, the output is:
Obs | v1 | v2 |
1 | a | b |
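To try the technique from the question without an external file, the same four records can be read in-stream. The sketch below reuses the statements of option C with straight quotes and substitutes INFILE DATALINES for 'TEXTFILE.TXT'; that substitution is an assumption made for illustration and is not part of the original question.

```sas
data WORK.NUMBERS;
   length Name $ 4 Month $ 3 Status $ 7;
   infile datalines dlm=',';
   /* trailing @ holds the record so the next INPUT continues on it */
   input Name $ Month $ @;
   if Month='FEB' then input Week1 Week2 Week3 Week4 Status $;
   else if Month='MAR' then input Week1 Week2 Week3 Week4 Week5 Status $;
   format Week1-Week5 dollar6.;
   datalines;
John,FEB,13,25,14,27,Final
John,MAR,26,17,29,11,23,Current
Tina,FEB,15,18,12,13,Final
Tina,MAR,29,14,19,27,20,Current
;
run;
proc print data=WORK.NUMBERS;
run;
```

Because the LENGTH statement names Name, Month, and Status first, those variables come before Week1-Week5 in the printed output, which matches the column order shown in the question.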