Wednesday, November 16, 2011

Did you find the difference? _N_ study notes 1

Look at the differences between data set a1, a2 and a3:

data a00;
      input a b;
      cards;
      2 2
      3 3
      5 5
      ;
run;

data a01;
      x=1;
run;

data a1;
      if _n_=1 then do;
            set a01;
      end;
      set a00;
run;

data a2;
      if _n_=1 then do;
            x=1;
      end;
      set a00;
run;

data a3;
      retain x;
      if _n_=1 then do;
            x=1;
      end;
      set a00;
run;
 Things to remember:

1)  a1 v.s. a2: in " if then do end", we use set a01, the value of x will be retained in each iteration of data steps; if you put 'set' in data statement, the value will be retained automatically;

2) a2 v.s. a3: if the difference of use retain or not. use retain, the value of x is retained in each data step;

3) a1 v.s. a3: it looks there is no difference. however, the internal is different. if you remove set a00 both and you will find the new a1 has two obs while new a3 only has one obs.

step-by-step study of:
data a1;
      if _n_=1 then do;
            set a01;
      end;
      set a00;
run;


1) first step of data a1: _n_=1, read in set a01 and retain the value of x=1; goes down and read the first value of set a00; combined together we get  (x a b)=(1 2 2)
2) second step of a1: _n_=2, without reading set a01 but just the second row of a00, since x=1 is retained, so we get (x a b)
               (1 2 2)
               (2 3 3) 
3) in the same way for _n_=3;
4) forth step of data a1, _n_=4, but since a00 has only three records, jump out of data step; at last we got three rows as the result shows
5) if there is no set a00, then how many obs will we get? (the answer is 2)

No comments:

Post a Comment