Thursday, August 23, 2012

using prxchange in SAS to replace non-alpha and non-numeric characters to blanks


For example, we want to replace the special characters like "# $ ! & * ) ^" in the following words into blanks.


san francisco, @California
Oregon, && U^S
Google, *Mountain View


Function prxchange can be used:


data a;
input old & $100.;
cards;
san francisco, @California
Oregon, && U^S
Google, *Mountain View
;
run;


data a2;
set a;
new=
prxchange('s/[^a-zA-Z0-9]/ /i', -1, old);
run;

proc print data=a2;
run;


Thursday, August 9, 2012

Number of obs for multi-join in proc sql

It's necessary to remember that proc sql first do the Carditian product and then select obs based on the conditions.

For example, the following code: a.a 1 has 3 replicates, 2 has 2 replicates, b.a is unique, c.a 2 has 3 replicates and 3 has 3 replicates, so totally 1 will have 3*1*1=3 replicates, 2 has 2*1*3=6 replicates, 3 will have 1*1*3=3 replicates. if no other conditions, then totally there will be 3+6+3=12 observations.


data a;
input a b;
cards;
1 2
1 3
1 5
2 3
2 7
3 5
;
run;

data b;
input a c;
cards;
1 2
2 3
3 5
;
run;

data c;
input a d;
cards;
1 1
2 1
2 2
2 3
3 1
3 2
3 3
;
run;


proc sql;
create table test as
select a.*, b.c, c.d
from a,b,c
where a.a=b.a and b.a=c.a;
quit;

proc print data=test;
run;


The output is:


Obs a b c d

1 1 2 2 1
2 1 3 2 1
3 1 5 2 1
4 2 3 3 1
5 2 3 3 3
6 2 3 3 2
7 2 7 3 1
8 2 7 3 3
9 2 7 3 2
10 3 5 5 1
11 3 5 5 3
12 3 5 5 2