Thursday, March 10, 2011

DLM and DSD options in infile statement and & usage

The DLM= option enables you to list read data with delimiters rather than the default space. i.e., if your data are separated by ";", you can use DLM=';' to read them correctly.

The DSD (Delimiter-Sensitive Data) in infile statement does three things for you. 1: it ignores delimiters in data values enclosed in quotation marks; 2: it ignores quotation marks as part of your data; 3: it treats two consecutive delimiters in a row as missing value.

eg1: we want to read this data set named testdata.txt in c:\sasdata.

10034 "apples, grapes kiwi" 123456
92626 "oranges" 97654
25414 "pears apple" 987654

data test;
      infile 'c:\sasdata\testdata.txt' dlm=' ' dsd;
      length name $22;
      input zip name $ pounds;
run;


*********************************************************************

& usage in SAS

& stops reading character variables until it encounters two or more spaces. So a single space to differentiate the character variable and the adjacent variable will be ignored and the two variables will be treated as one variable.

eg2:
data web;
input age site & $41. hits;
datalines;
12  http://www.site1.org/default.htm   123456
30  http://www.site2.com/index.htm   97654
54  http://www.site3.edu/department/index.htm    987654
;

data two;
input name & 20. age;
cards;
james band   25
barak obama   26
jintao hu   25
;

3 comments:

  1. Great. Explained in a simple way.Thanks.

    ReplyDelete
  2. hi,

    i am new to sas programming, i was trying to use multiple delimiter options, for the following cards
    001*Alfred, M,24,10000,"a-nagar,Hyderabad"
    002*Alice, F,23,10000,"b-nagar,Hyderabad"
    003*Barbara,F,23,20000,"a-nagar,Hyderabad"

    code:
    data SPECIAL;
    infile datalines dlm='* ,';
    input ID NAME$ SEX$ AGE SALARY ADDRESS:$20.;
    datalines;
    001*Alfred, M,24,10000,"a-nagar,Hyderabad"
    002*Alice, F,23,10000,"b-nagar,Hyderabad"
    003*Barbara,F,23,20000,"a-nagar,Hyderabad"
    ;
    run;

    output:
    Obs ID NAME SEX AGE SALARY ADDRESS
    1 1 Alfred M 24 10000 "a-nagar
    2 2 Alice F 23 10000 "b-nagar
    3 3 Barbara F 23 20000 "a-nagar

    the output is not complete or as i expected and the last column is truncated.

    can you help to suggest the correct usage of delimiters to have the complete data in output.

    ReplyDelete