8.6.10

Read multiple external files in one DATA step by using the FILEVAR= option

Prepared by Jose Abraham

External files are usually read into SAS one at a time, with a separate DATA step for each file. However, multiple external files that share the same structure can be read in a single DATA step by using the FILEVAR= and END= options in the INFILE statement. The following example illustrates how to read multiple external files when their locations are stored in another external file.

Suppose we have demographic information on subjects from three different centers, stored in three external files. All three external files have the same structure, as given below.




The data values are aligned in columns and there are no missing values. The layout follows.


A fourth external file holds the locations of these three files. Suppose the data files are stored in the 'demog' folder on the E: drive, and the external file that lists their locations (dmgfiles) is stored in the same folder.
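For readers who want to reproduce the example, the inputs can be recreated roughly as follows. This is only a sketch: the data values and file paths are taken from the output data set listed at the end of this post, but the column spacing inside each file and the name 'dmgfiles.txt' (including its extension) are assumptions, and the step presumes that E:\demog exists and is writable.

```sas
/* Sketch: recreate the three data files and the list file.        */
/* Column spacing and the .txt extension of the list file are      */
/* assumptions; values and paths come from the output table.       */
data _null_;
   file 'E:\demog\dmg01.txt';
   put '001 001_01 29 Male   Caucasian'
     / '001 001_02 28 Female Caucasian'
     / '001 001_03 25 Male   Caucasian';

   file 'E:\demog\dmg02.txt';
   put '002 002_01 27 Male   Asian'
     / '002 002_02 28 Male   Asian'
     / '002 002_03 25 Female Asian';

   file 'E:\demog\dmg03.txt';
   put '003 003_01 27 Female Asian'
     / '003 003_02 28 Male   Asian'
     / '003 003_03 25 Female Asian';

   /* The list file: one full path per line */
   file 'E:\demog\dmgfiles.txt';
   put 'E:\demog\dmg01.txt'
     / 'E:\demog\dmg02.txt'
     / 'E:\demog\dmg03.txt';
run;
```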


The following DATA step reads all three external files in one pass, using the file names listed in the external file 'dmgfiles'. SAS reads this list to determine which external files to open.
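A minimal sketch of such a DATA step, consistent with the numbered notes in this post, might look like the code below. The variable names Source, Ctrn, Subjid, Age, Sex and Race come from the output data set; the variable lengths, the use of simple list input for the data records, the END= variable name 'done', and the list file name 'dmgfiles.txt' are assumptions made for illustration.

```sas
data demogdat;
   /* Lengths are assumptions; declaring them here also fixes the */
   /* variable order in the output data set.                      */
   length Source $60 Ctrn $3 Subjid $6 Age 8 Sex $6 Race $10;

   /* 1. Open the file that lists the data files (extension assumed) */
   infile 'E:\demog\dmgfiles.txt';

   /* 2. Read one file name with modified list input (width 60) */
   input dmgfiles : $60.;

   /* 3. 'dummy' is only a placeholder; FILEVAR= supplies the real  */
   /*    file, and END= sets 'done' to 1 at the last record.        */
   infile dummy filevar=dmgfiles end=done;

   /* 4. Read every record of the currently opened file */
   do while (not done);
      input Ctrn $ Subjid $ Age Sex $ Race $;
      Source = dmgfiles;   /* 5. keep the source file name */
      output;              /* 7. write one observation per record */
   end;
run;
```

The FILEVAR= and END= variables are not written to the output data set, which is why the file name is copied into Source before the OUTPUT statement.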



How the DATA step works:

1. The first INFILE statement specifies the external file that contains the list of file names the DATA step should read.

2. The first INPUT statement reads the name of an external file with modified list input. The specified width (60) is long enough to hold the full file name.

3. The second INFILE statement specifies the text 'dummy', which acts as a placeholder for the file specification that the INFILE statement always requires. The actual input file is taken from the value of the variable named in the FILEVAR= option.

a. The FILEVAR= option is set to 'dmgfiles', the variable that contains the name of the external file that the current iteration of the DATA step should read.

b. The END= option names a variable that SAS sets to 1 when it reads the last data line in the currently opened external file. The END= variable is initialized to 0 and keeps that value until SAS detects that the current input data line is the last one in the external file. SAS then sets the variable to 1.

c. When the FILEVAR= option is included in the INFILE statement, SAS resets the END= variable to 0 whenever the value of the FILEVAR= variable changes. (If SAS did not reset the END= variable to 0 each time it opened a new external file, the DATA step would stop after reading the first external file.)

4. The DO WHILE loop is controlled by the value of the END= variable. The loop stops after SAS reads the last data line in the currently opened external file.

5. The name of the file from which the records are read (the source file name) is assigned to the variable 'Source'.

6. The DATA step iterates four times: once for each of the data files (dmg01, dmg02, dmg03) and a fourth time in which it detects that there are no more lines in the external file that contains the file names.

7. By default, SAS writes an observation to the data set only once, at the end of each iteration of the DATA step. Because the DO WHILE loop reads all the records of a file within a single iteration, an explicit OUTPUT statement is placed inside the loop so that every record read from the external file is written to the data set.

8. The resulting output data set 'demogdat' is as follows:

Source              Ctrn  Subjid  Age  Sex     Race
E:\demog\dmg01.txt  001   001_01  29   Male    Caucasian
E:\demog\dmg01.txt  001   001_02  28   Female  Caucasian
E:\demog\dmg01.txt  001   001_03  25   Male    Caucasian
E:\demog\dmg02.txt  002   002_01  27   Male    Asian
E:\demog\dmg02.txt  002   002_02  28   Male    Asian
E:\demog\dmg02.txt  002   002_03  25   Female  Asian
E:\demog\dmg03.txt  003   003_01  27   Female  Asian
E:\demog\dmg03.txt  003   003_02  28   Male    Asian
E:\demog\dmg03.txt  003   003_03  25   Female  Asian


