SAS Base (31)

Given the following raw data records in DATAFILE.TXT:

----|----10---|----20---|----30
Kim,Basketball,Golf,Tennis
Bill,Football
Tracy,Soccer,Track

The following program is submitted:

data WORK.SPORTS_INFO;
length Fname Sport1-Sport3 $ 10;
infile 'DATAFILE.TXT' dlm=',';
input Fname $ Sport1 $ Sport2 $ Sport3 $;
run;

proc print data=WORK.SPORTS_INFO;
run;

Which output is correct based on the submitted program?

A.

Obs Fname Sport1 Sport2 Sport3
1 Kim Basketball Golf Tennis
2 Bill Football
3 Tracy Soccer Track

B.

Obs Fname Sport1 Sport2 Sport3
1 Kim Basketball Golf Tennis
2 Bill Football  Football  Football
3 Tracy Soccer Track  Track

C.

Obs Fname Sport1 Sport2 Sport3
1 Kim Basketball Golf Tennis
2 Bill Football Tracy Soccer

D.

Obs Fname Sport1 Sport2 Sport3
1 Kim Basketball Golf Tennis
2 Bill Football
Answer: C

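Note: When a record contains fewer values than there are variables in the INPUT statement, SAS by default moves to the next record and keeps reading. Once every variable has a value, SAS moves on to the next record and starts a new iteration of the DATA step (that is, a new observation), whether or not unread values remain in the current record. In this question, the second record has only 2 values, so during the second iteration SAS goes to the third record and assigns Tracy to Sport2 and Soccer to Sport3. With all 4 variables assigned, SAS ignores the remaining value, Track, in the third record. If DATAFILE.TXT had a fourth record, SAS would begin reading the third observation from it. To get the output shown in A, add MISSOVER at the end of the INFILE statement. For details on DLM and MISSOVER, see SAS Base (2).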
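
The MISSOVER fix mentioned above can be sketched as follows. This is a minimal example, assuming the same three records are supplied in-stream with DATALINES instead of being read from DATAFILE.TXT, so that it runs on its own. The data set name WORK.SPORTS_MISSOVER is hypothetical.

```sas
data WORK.SPORTS_MISSOVER;   /* hypothetical data set name */
   length Fname Sport1-Sport3 $ 10;
   /* MISSOVER: when a record runs out of values, set the remaining
      variables to missing instead of reading from the next record */
   infile datalines dlm=',' missover;
   input Fname $ Sport1 $ Sport2 $ Sport3 $;
   datalines;
Kim,Basketball,Golf,Tennis
Bill,Football
Tracy,Soccer,Track
;
run;

proc print data=WORK.SPORTS_MISSOVER;
run;
```

With MISSOVER, each record should yield exactly one observation, with Sport2 and Sport3 left blank where the record is short, as in option A. Without it, the log for the original program normally contains the note "SAS went to a new line when INPUT statement reached past the end of a line", which is a useful hint that records were consumed this way.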

6 thoughts to “SAS Base (31)”

  1. Why doesn't SAS read “Track” from the raw data?
    I ran the code and got C, but I still don't understand why SAS skips “Track”.

    1. During the second iteration, SAS reads the second line of the raw data file but finds only 2 values. By default, SAS goes to the next line and assigns values to the remaining variables, so Sport2 gets Tracy and Sport3 gets Soccer. Since all variables have now been assigned, SAS ignores any values left in the third line. SAS then tries to start the third iteration, but there is no fourth line in the data file, so the DATA step ends.

      If you add MISSOVER at the end of the INFILE statement, you will see different results.

  2. Just to add to the comment above:
    since the MISSOVER option is not specified, SAS keeps reading until all 4 variables in the INPUT statement have values. If you put only one value in the 4th line, it will not produce an observation, because SAS cannot find values for the remaining variables and the third iteration does not complete. But if you put 4 values in the 4th line, they will be read.

    1. Try this program
      data WORK.SPORTS_INFO;
      length Fname Sport1-Sport3 $ 10;
      infile datalines ;
      input Fname $ Sport1 $ Sport2 $ Sport3 $;
      datalines;
      Kim Basketball Golf Tennis
      Bill Football
      Tracy Soccer Track
      dummy data here xx

      ;
      run;

      proc print data=WORK.SPORTS_INFO;
      run;
