Wednesday, May 30, 2012

IS MISSING / IS NULL to select missing rows

In SAS, to select rows with missing values, you can use VAR=. (for a numeric variable) or VAR='' or VAR=' ' (for a character variable).

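A more convenient way is VAR IS MISSING or VAR IS NULL, which works for both numeric and character variables. Note that these operators are valid only in WHERE expressions (including PROC SQL), not in IF statements.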
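A minimal sketch of both forms. The data set have and its variables name and x are hypothetical, made up only for illustration. Since IS MISSING and IS NULL belong to WHERE expressions, the second step uses the MISSING function for the IF-statement case.

```sas
/* Hypothetical sample data: name is character, x is numeric */
data have;
  input name $ x;
  datalines;
Ann 1
. 2
Bob .
;
run;

/* WHERE expression: the same operator works for numeric and character */
proc print data=have;
  where x is missing or name is null;
run;

/* IF statement: use the MISSING function instead */
data miss;
  set have;
  if missing(x) or missing(name);
run;
```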

Monday, May 28, 2012

Using Multiple SET Statements

When you use multiple SET statements,
  • processing stops when SAS encounters the end-of-file (EOF) marker on either data set (even if there is more data in the other data set)
  • the variables in the program data vector (PDV) are not reinitialized when a second SET statement is executed.
This is useful when you want to combine summary information with the detail rows of a data set.

Suppose sasuser.summary has one variable named sum and a single observation with the value 100, and sasuser.monthsum has two variables, month and sale:
month sale
1      28
2      32
3      40
We want to combine them to calculate each month's share of the total.

data a; 
  if _N_=1 then set sasuser.summary;
  set sasuser.monthsum;
  pct=sale/sum;
run;

The automatic variable _N_ counts how many times the DATA step has begun to execute. The DATA step above uses _N_ so that Sasuser.Summary is read only on the first iteration; SAS therefore never reaches the EOF marker for Sasuser.Summary, which would otherwise stop the step after one observation. Because variables read with SET are not reinitialized on each iteration, the value of sum is retained in the PDV for every observation that is read from Sasuser.Monthsum.
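For anyone without the sasuser data sets, here is a self-contained sketch of the same technique. The WORK data sets summary and monthsum are stand-ins built from the values listed above, and the percent format is an added convenience, not part of the original step.

```sas
/* Stand-in for sasuser.summary: one variable, one observation */
data summary;
  sum = 100;
run;

/* Stand-in for sasuser.monthsum */
data monthsum;
  input month sale;
  datalines;
1 28
2 32
3 40
;
run;

data a;
  if _n_ = 1 then set summary;  /* read once; sum stays in the PDV */
  set monthsum;                 /* drives the loop; its EOF ends the step */
  pct = sale / sum;
  format pct percent8.1;
run;

proc print data=a noobs;
run;
```

With these values pct is 0.28, 0.32 and 0.40 for months 1 to 3.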

Friday, May 18, 2012

zz: Loading and converting data between SAS and R

Part 1: Loading SAS data into R
-- Create a transport file in SAS
LIBNAME SAS_R xport 'C:\sea.xpt';
DATA SAS_R.sea;
SET custdet1;
RUN;
-- Read it into R
library(foreign)
library(Hmisc)
sea<-sasxport.get("c:/sea.xpt")
head(sea)

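Part 2: Using R data in SAS
library(foreign)
write.foreign(sea,"c:/sea.txt","c:/sea.sas",package="SAS")
Two files are generated on drive C:
a SAS program (sea.sas)
a TXT data file (sea.txt)
Then simply run the generated program in SAS.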
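On the SAS side, running the generated program could look like the sketch below. It assumes the two files were written to the paths used in the write.foreign call above; _last_ is used so that no assumption is made about the data set name chosen by the generated code.

```sas
/* Run the SAS program that write.foreign generated; it reads c:/sea.txt */
%include 'c:/sea.sas';

/* Inspect the first rows of the most recently created data set */
proc print data=_last_(obs=5);
run;
```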

zz: How to use MATLAB from R

install.packages("R.matlab")
library(R.matlab)
path <- system.file("mat-files", package="R.matlab")
mat <- readMat(file.path(path, "structLooped.mat"))
s <- mat$s
fields <- dimnames(s)[[1]]
cat("Field names: ", paste(fields, collapse=", "), "\n", sep="");

print(s)

Thursday, May 17, 2012

Check the difference between PROC SUMMARY and PROC MEANS (not finished)



***  If VAR is missing, SAS drops that record when calculating the mean. To include such records, impute the missing value with 0 ***;

data test;
  input a $ 1-3 b 4-5 ;
  cards;
  a 1
  a
  a 3
  b
  b 2
  b 1
  c 1
  c
    1
    4
  ;
run;

proc print data=test noobs;
  title "print of original data";
run;

proc summary data=test nway missing;
  var b;
  class a;
  output out=sum1(drop=_type_ _freq_) mean=;
run;

proc print data=sum1;
  title "Summary with MISSING option";
run;

proc summary data=test nway;
  var b;
  class a;
  output out=sum2(drop=_type_ _freq_) mean=;
run;

proc print data=sum2;
  title "Summary without MISSING option";
run;

proc means data=test nway missing;
  var b;
  class a;
  output out=means1(drop=_type_ _freq_) mean=;
run;

proc print data=means1;
  title "Means with MISSING option";
run;

proc means data=test nway;
  var b;
  class a;
  output out=means2(drop=_type_ _freq_) mean=;
run;

proc print data=means2;
  title "Means without MISSING option";
run;





Summary with MISSING option

Obs    a     b

 1          2.5
 2     a    2.0
 3     b    1.5
 4     c    1.0


Summary without MISSING option

Obs    a     b

 1     a    2.0
 2     b    1.5
 3     c    1.0


Means with MISSING option

Obs    a     b

 1          2.5
 2     a    2.0
 3     b    1.5
 4     c    1.0


Means without MISSING option

Obs    a     b

 1     a    2.0
 2     b    1.5
 3     c    1.0
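
The listings show that the MISSING option only decides whether missing values of the CLASS variable a form their own group; missing values of the analysis variable b are skipped either way. A sketch of the imputation idea from the note at the top of this entry, assuming the test data set above (test0 and sum0 are hypothetical names):

```sas
/* Replace missing b with 0 so those records count toward the mean */
data test0;
  set test;
  if missing(b) then b = 0;
run;

proc summary data=test0 nway missing;
  class a;
  var b;
  output out=sum0(drop=_type_ _freq_) mean=;
run;

proc print data=sum0;
  title "Summary after imputing 0 for missing b";
run;
```

With the zeros included, the group means become 1.33 for a, 1.0 for b and 0.5 for c, while the blank group stays at 2.5.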

Tuesday, May 8, 2012

zz: proc means and proc mixed for paired t test



/*********************************************************************
FILENAME: MIXVMEAN.SAS
SUBJECT HEADING: STAT
INITIALS:  KBW
DATE:  7/26/96
PROGRAM:  SAS
VERSION:  6.11
PLATFORM:  WINDOWS 3.11 TS040
TITLE:  USING PROC MIXED FOR A PAIRED T-TEST COMPARED TO PROC MEANS

DESCRIPTION:  THIS PROGRAM DOES A PAIRED T-TEST, FOR THE SAME DATA,
              FIRST USING PROC MEANS, AND THEN USING PROC MIXED.
              NOTE THAT THE DATA STRUCTURE FOR THESE TWO METHODS
              IS DIFFERENT.  THE DATA STEPS AT THE BEGINNING SET
              UP THE TWO DIFFERENT DATA STRUCTURES.
              NOTE THAT THE T-TEST, THE DEGREES OF FREEDOM AND
              THE P-VALUES ARE THE SAME FOR BOTH METHODS.
**********************************************************************/


data test;
  input y1 y2;
  diff=y1-y2;
  Pair+1;
  datalines;
  13 15
  12 14
  17 17.2
  14 18
  11 12
  5  4.1
  7 9.3
     ;
run;

proc print data=test;
  title 'printout of original data set';
run;

proc means n mean t prt;
  var diff;
  title 'paired t-test using proc means';
run;

data test2(keep=y pair group);
  set test;
     y=y1;
     group=1;
     output;
     y=y2;
     group=0;
     output;
run;

proc print data=test2;
  title 'rearranged data for proc mixed';
run;

proc mixed;
    class pair group;
    model y=group;
    random pair;
    lsmeans group / pdiff;
    title 'paired t-test using proc mixed';
run; 
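
As a cross-check that is not part of the reposted program, the same paired comparison can also be requested from PROC TTEST with a PAIRED statement. This sketch assumes the test data set created above, which still holds y1 and y2 side by side.

```sas
/* Paired t-test on y1 - y2, using the wide-format data set */
proc ttest data=test;
  paired y1*y2;
  title 'paired t-test using proc ttest';
run;
```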

from SAS: Determining the Number of Variables and Observations in a Data Set

%macro obsnvars(ds);
   %global dset nvars nobs;
   %let dset=&ds;
   %let dsid = %sysfunc(open(&dset));
   %if &dsid %then
      %do;
         %let nobs =%sysfunc(attrn(&dsid,NOBS));
         %let nvars=%sysfunc(attrn(&dsid,NVARS));
         %let rc = %sysfunc(close(&dsid));
         %put &dset has &nvars  variable(s) and &nobs observation(s).;
      %end;
   %else
      %put Open for data set &dset failed - %sysfunc(sysmsg());
%mend obsnvars;

%obsnvars(sasuser.houses)
 
 
 
 
******************************************************************************;
 
 
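%macro nobs(Dsn= /* Data set name */);
  /* Generates DATA step statements, so call it inside a DATA step */
  if exist("&Dsn") then do;
    Dsid = open("&Dsn","i");
    Nobs = attrn(Dsid,"Nlobs");  /* logical obs: excludes deleted rows */
    rc = close(Dsid);            /* close only what was actually opened */
  end;
  else Nobs=.;
%mend nobs;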
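
Because %nobs expands to DATA step statements rather than macro code, it has to be called inside a DATA step. A small usage sketch, reusing sasuser.houses from the first macro as the example data set:

```sas
data _null_;
  %nobs(Dsn=sasuser.houses)
  /* Nobs is missing (.) if the data set does not exist */
  put "Observations: " Nobs;
run;
```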