10.10.08

Quality Control (QC): steps towards effective programs and outputs
Prepared by Mohanan K K

The term SAS QC refers to maintaining both the quality of data and the quality of programming. Quality of data covers the accuracy, completeness, consistency, timeliness, uniqueness and validity of the data. Quality of programming means that the SAS program produces correct and meaningful outputs while also meeting programming standards such as indentation of statements, optimization of code, use of DROP and KEEP statements, and logs free of errors and warnings, to name a few.

The QC personnel involved in the SAS QC process are SAS programmers who are independent of the programming being carried out for the study. At Kreara, the SAS QC process involves the following steps:

a) Checking the quality of SAS code and outputs developed.
b) Entering the review comments into the issue tracker.
c) Tracking the resolution of the review comments.


The QC personnel review the code, log and output and raise issues in an issue tracker, a web application that is accessible to the team members and helps track the resolution of issues. The issues raised by QC are made available to the SAS programmer, who corrects the code as per the comments and flags each issue as resolved. The QC personnel close an issue once the correction is satisfactory.

At Kreara, intensive QC of code and outputs is done in four stages, as described below:

1. Output QC
2. Log QC
3. Code QC
4. Parallel Programming

1. Output QC

The Output QC further includes:

a. Sample or complete check against actual data
b. Check against the template

During the Output QC, the QC personnel check whether the outputs produced match the template or requirements. In addition, a sample check is performed against the database; the sample size varies depending on the study. In the case of tables, a complete check is done on the outputs produced.
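
A reproducible random sample for the sample check could be drawn as sketched below. This is only an illustration: the data set name listing_data, the sample size of 20 and the seed are assumptions rather than values from Kreara's process, and PROC SURVEYSELECT requires SAS/STAT.

```sas
/* Draw a simple random sample of records to check by hand against the
   database. listing_data, SAMPSIZE=20 and the seed are assumed values. */
PROC SURVEYSELECT DATA=listing_data
                  OUT=qc_sample
                  METHOD=SRS      /* simple random sampling without replacement */
                  SAMPSIZE=20     /* study-specific; set as the study requires */
                  SEED=20081010;  /* fixed seed so the sample can be reproduced */
RUN;
```

Fixing the seed means the same records are selected again if the check has to be repeated after a correction.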

2. Log QC

A SAS program always generates a log. As part of the Log QC, the QC personnel look for errors, warnings and critical notes in the log. Critical notes include, for example, “Missing values were generated as a result of performing an operation on missing values”. The Log QC is aimed at making the log free of errors, warnings and critical notes.
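
Part of this check can be automated by reading a saved log back into a data set, as in the minimal sketch below. The log path study_program.log is hypothetical, and the list of critical notes is an illustrative assumption that would need to be extended to match the study's standards. A manual read of the log is still advisable.

```sas
/* Hypothetical path to the saved log of the program under review */
FILENAME qclog "study_program.log";

DATA log_issues;
    LENGTH issue $ 7 logline $ 256;
    INFILE qclog TRUNCOVER LRECL=32767;
    INPUT logline $CHAR256.;
    line_no = _N_;                      /* line number within the log file */

    /* =: compares only the leading characters, so "ERROR:" and
       "ERROR 22-322:" are both caught */
    IF logline =: "ERROR" THEN issue = "ERROR";
    ELSE IF logline =: "WARNING" THEN issue = "WARNING";
    /* Illustrative list of critical notes (an assumption, not exhaustive) */
    ELSE IF logline =: "NOTE:" AND
        (INDEX(logline, "Missing values were generated") > 0 OR
         INDEX(logline, "is uninitialized") > 0 OR
         INDEX(logline, "have been converted to") > 0)
        THEN issue = "NOTE";

    IF issue NE " " THEN OUTPUT;        /* keep only the flagged lines */
RUN;

PROC PRINT DATA=log_issues NOOBS;
    VAR line_no issue logline;
RUN;
```

An empty log_issues data set suggests that none of the listed patterns occurred. It does not show that the log is clean in every respect.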

3. Code QC

The Code QC involves a step-by-step review of the code. The following aspects of the code need to be reviewed:

a. Adherence to the requirements
b. Logic
c. Syntax
d. Optimization
e. Presentation
f. Completeness of datasets

A peer review is carried out, during which the SAS programmer explains the code to the QC personnel. Any discrepancies are discussed and then corrected by the SAS programmer.

4. Parallel Programming

Parallel programming is included in the QC process at Kreara. It helps with both the Output QC and the optimization of code. During parallel programming, the QC personnel independently generate the outputs as per the requirements, without emphasis on the template or on SAS programming standards and practices. The outputs generated by the parallel program are compared with the actual outputs to find discrepancies. The parallel program may also be compared with the actual program in order to improve the quality of programming.

The following example illustrates parallel programming in the QC process.

Suppose the SAS programmer calculates a confidence interval for the mean using the TINV function in a DATA step, as shown below. The parallel programmer (QC personnel) can obtain the same limits directly from PROC MEANS, and the two outputs are then compared.

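/* Summary statistics per id; old must be sorted by id for BY processing */
PROC UNIVARIATE DATA=old;
    BY id;
    /* Two analysis variables are assumed, one per output suffix;
       the names anal1 and anal2 are illustrative */
    VAR anal1 anal2;
    OUTPUT OUT=new N=n1 n2
                   MEAN=mean1 mean2
                   STD=std1 std2;
RUN;

DATA new1;
    SET new;
    /* Character copies for reporting. Assigning PUT() results back to the
       numeric n1 and n2 would cause a character-to-numeric conversion note. */
    nc1 = PUT(n1, 4.0);
    nc2 = PUT(n2, 4.0);
    mn1 = PUT(mean1, 6.1);
    mn2 = PUT(mean2, 6.1);
    sd1 = PUT(std1, 7.2);
    sd2 = PUT(std2, 7.2);

    /* 95% CI: mean -/+ t(0.975, n-1) * std / sqrt(n). At least two
       observations are needed for std and for n-1 > 0. */
    IF n1 > 1 THEN DO;
        cll1 = mean1 - TINV(0.975, n1-1) * std1 / SQRT(n1);
        clh1 = mean1 + TINV(0.975, n1-1) * std1 / SQRT(n1);
        ci1 = "(" || PUT(cll1, 6.2) || "," || PUT(clh1, 6.2) || ")";
    END;
    IF n2 > 1 THEN DO;
        cll2 = mean2 - TINV(0.975, n2-1) * std2 / SQRT(n2);
        clh2 = mean2 + TINV(0.975, n2-1) * std2 / SQRT(n2);
        ci2 = "(" || PUT(cll2, 6.2) || "," || PUT(clh2, 6.2) || ")";
    END;
RUN;

The same limits can be obtained more simply with PROC MEANS, as follows:

/* Runs on the raw data (old), not on the summary data set new.
   LCLM and UCLM give the two-sided 95% limits by default (ALPHA=0.05). */
PROC MEANS DATA=old;
    BY id;
    VAR anal1 anal2;
    OUTPUT OUT=sum1
        N=n1 n2
        MEAN=mean1 mean2
        STD=std1 std2
        LCLM=ll1 ll2
        UCLM=ul1 ul2;
RUN;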

The parallel program shows that the calculation can be done in fewer steps. In addition, the output it produces is compared with the actual output as part of the Output QC.
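
The comparison of the two outputs can itself be programmed with PROC COMPARE. The sketch below assumes that the programmer's limits are in new1 (cll1, clh1, cll2, clh2) and the parallel limits in sum1 (ll1, ul1, ll2, ul2), with both data sets in id order. The tolerance of 1E-8 is an assumed value, not a Kreara standard.

```sas
/* Compare the DATA step limits with the PROC MEANS limits, matching
   rows by id. VAR names the base variables and WITH names their
   counterparts in the comparison data set, in the same order. */
PROC COMPARE BASE=new1 COMPARE=sum1
             METHOD=ABSOLUTE CRITERION=1E-8   /* assumed tolerance */
             LISTALL;
    ID id;
    VAR  cll1 clh1 cll2 clh2;
    WITH ll1  ul1  ll2  ul2;
RUN;

/* SYSINFO holds the PROC COMPARE return code; read it right after the step */
%PUT PROC COMPARE return code: &SYSINFO;
```

A return code of 0 means PROC COMPARE found no differences. Any other value should be investigated in the comparison report before the issue is closed.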

The Quality Control process is carried out until good quality outputs and code are obtained.
