Our mission is to become the Number 1 CRO in the country providing supplementary services for clinical trials and ultimately help our clients to make affordable and quality health care to the masses a reality.
21.6.10
13.3.09
The following example illustrates how a shift table presents changes in laboratory values from visit to visit. We consider the laboratory parameter chloride in the laboratory group bio-chemistry, with a normal range of 98 to 107. Each chloride value is categorized as low, normal or high. Following the program below, values less than or equal to 98 are flagged as Low, values above 98 and up to 107 are flagged as Normal, and values above 107 are flagged as High. There are three visits in this example: screening, visit 1 and visit 2. Screening is treated as the baseline visit, so we consider two cases
1. Changes occurred in lab values from screening to visit 1
2. Changes occurred in lab values from screening to visit 2
*Purpose: Illustrate how a shift table is created and presented;
*********************************************************;
*Give formats for Visits;
proc format;
value vsyn 0='Screening'
1='Visit-1'
2='Visit-2';
*Give formats for categories;
value flg -1='Low'
0='Normal'
1='High';
run;
data lab;
length labgrp $20 labparam $20;
format visit vsyn. flag flg.;
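*The & modifier allows single blanks inside labgrp and labparam, so each of these two values must be followed by two or more blanks in the data lines;
input SUBNO VISIT labgrp&$ labparam&$ val low high;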
if val <=low then flag=-1; else if val>low and val<=high then flag=0; else if val>high then flag=1;
cards;
1 1 Bio chemistry  Chloride (MMOL/L)  101 98 107
2 0 Bio chemistry  Chloride (MMOL/L)  105 98 107
2 1 Bio chemistry  Chloride (MMOL/L)  97 98 107
3 0 Bio chemistry  Chloride (MMOL/L)  101 98 107
3 1 Bio chemistry  Chloride (MMOL/L)  100 98 107
4 0 Bio chemistry  Chloride (MMOL/L)  104 98 107
4 1 Bio chemistry  Chloride (MMOL/L)  106 98 107
5 0 Bio chemistry  Chloride (MMOL/L)  101 98 107
5 1 Bio chemistry  Chloride (MMOL/L)  108 98 107
6 0 Bio chemistry  Chloride (MMOL/L)  103 98 107
6 1 Bio chemistry  Chloride (MMOL/L)  103 98 107
7 0 Bio chemistry  Chloride (MMOL/L)  105 98 107
7 1 Bio chemistry  Chloride (MMOL/L)  104 98 107
8 0 Bio chemistry  Chloride (MMOL/L)  97 98 107
8 1 Bio chemistry  Chloride (MMOL/L)  108 98 107
9 0 Bio chemistry  Chloride (MMOL/L)  100 98 107
9 1 Bio chemistry  Chloride (MMOL/L)  100 98 107
10 0 Bio chemistry  Chloride (MMOL/L)  103 98 107
10 1 Bio chemistry  Chloride (MMOL/L)  102 98 107
11 0 Bio chemistry  Chloride (MMOL/L)  100 98 107
11 1 Bio chemistry  Chloride (MMOL/L)  102 98 107
12 0 Bio chemistry  Chloride (MMOL/L)  109 98 107
12 1 Bio chemistry  Chloride (MMOL/L)  104 98 107
13 0 Bio chemistry  Chloride (MMOL/L)  99 98 107
13 1 Bio chemistry  Chloride (MMOL/L)  101 98 107
14 0 Bio chemistry  Chloride (MMOL/L)  101 98 107
14 1 Bio chemistry  Chloride (MMOL/L)  100 98 107
15 0 Bio chemistry  Chloride (MMOL/L)  110 98 107
15 1 Bio chemistry  Chloride (MMOL/L)  96 98 107
1 2 Bio chemistry  Chloride (MMOL/L)  108 98 107
2 2 Bio chemistry  Chloride (MMOL/L)  99 98 107
3 2 Bio chemistry  Chloride (MMOL/L)  100 98 107
4 2 Bio chemistry  Chloride (MMOL/L)  106 98 107
5 2 Bio chemistry  Chloride (MMOL/L)  96 98 107
6 2 Bio chemistry  Chloride (MMOL/L)  103 98 107
7 2 Bio chemistry  Chloride (MMOL/L)  108 98 107
8 2 Bio chemistry  Chloride (MMOL/L)  103 98 107
9 2 Bio chemistry  Chloride (MMOL/L)  100 98 107
10 2 Bio chemistry  Chloride (MMOL/L)  96 98 107
11 2 Bio chemistry  Chloride (MMOL/L)  102 98 107
12 2 Bio chemistry  Chloride (MMOL/L)  104 98 107
13 2 Bio chemistry  Chloride (MMOL/L)  101 98 107
14 2 Bio chemistry  Chloride (MMOL/L)  100 98 107
15 2 Bio chemistry  Chloride (MMOL/L)  99 98 107
;
data scr (where=(visit=0)rename=(flag=flg0))
vis1(where=(visit=1)rename=(flag=flg1))
vis2(where=(visit=2)rename=(flag=flg2));
set lab;
run;
*Merge the datasets by subject so that the flags for the three visits lie adjacent to each other. Each dataset is already in SUBNO order, so no sort is needed;
data comon;
merge scr vis1 vis2;
by subno;
run;
*Find number of subjects for all combinations of laboratory flags
from baseline (screening) to visit 1;
proc means data=comon completetypes noprint missing;
class labgrp labparam flg0 flg1 /preloadfmt ;
output out=shif01 N=num;
run;
*Transposing data into presentable format;
proc transpose data= shif01(where=(flg1 ne . and flg0 ne .)) out=trn1;
by flg0 notsorted;
id flg1;
var num;
copy labgrp labparam;
run;
*Find number of subjects for all combinations of laboratory flags from baseline (screening) to visit 2;
proc means data=comon completetypes noprint missing;
class labgrp labparam flg0 flg2 /preloadfmt ;
output out=shif02 N=num;
run;
*Transposing data into presentable format;
proc transpose data= shif02(where=(flg2 ne . and flg0 ne .)) out=trn2;
by flg0 notsorted;
id flg2;
var num;
copy labgrp labparam;
run;
*Combine two transposed datasets to create report;
data final(where=(_NAME_ ne '' and labgrp ne '' and labparam ne ''));
merge trn1
trn2(rename=(low=low2 normal=normal2 high=high2));
run;
*Determining number of subjects under each flag at baseline;
data base;
set shif01;
keep flg0 num;
where labgrp eq '' and labparam eq '' and flg1 eq . and flg0 ne .;
rename num=basenum;
run;
data final1;
merge final base;
by flg0;
run;
*rtf output of shift table (the file name below is illustrative);
ods rtf file='shift_table.rtf';
proc report data= final1 nowd style(header)=[background=white font_size=8pt] split='*';
column labgrp labparam flg0 ('Screening vs. Visit 1' Low Normal High ) ('Screening vs. Visit 2' Low2 Normal2 High2 ) basenum;
define labgrp/order 'Lab Group'left;
define labparam/order 'Lab Parameter' left;
define flg0/display 'Baseline *Status'left;
define Low/display 'Low' center;
define Normal/display 'Normal' center;
define High/display 'High'center;
define Low2/display 'Low'center;
define Normal2/display 'Normal'center;
define High2/display 'High'center;
define basenum/display 'Baseline*Count' center;
run;
ods rtf close;

In the table, the column named ‘Baseline Count’ gives the number of low, normal and high flags at the baseline visit (screening), i.e. there is 1 subject with a low flag, 11 with a normal flag and 3 with a high flag.
The ‘Screening vs. Visit 1’ section is a 3x3 contingency table. The first cell (first row, first column) gives the number of subjects whose flag was low at both screening and visit 1. The second cell (first row, second column) gives the number of subjects with a low flag at screening and a normal flag at visit 1, and so on.
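As a cross-check, the same 3x3 counts can be read off a plain cross-tabulation. The following is a minimal sketch, assuming the merged dataset comon from the program above with the flag variables flg0, flg1 and flg2; it is not part of the original program.

```sas
* Cross-tabulate the baseline flag against each post-baseline flag;
proc freq data=comon;
   tables flg0*flg1 flg0*flg2 / norow nocol nopercent;
run;
```

The cell frequencies should agree with the corresponding cells of the shift table. Unlike the PROC MEANS approach with completetypes and preloadfmt, flag levels that never occur in the data are omitted rather than shown with a zero count.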
6.3.09
Prepared by Meena R S
The bootstrap method rests on the following assumptions
1. The sample should be a valid representative of the population.
2. The bootstrap resamples with replacement from the sample. The resamples are assumed to be independent and identically distributed (i.i.d.). In other words, each resample is assumed to come from the same distribution as the population, and each is drawn independently of the others.
The bootstrap works by computing the desired statistic for each resample of the data set. Resampling is done with replacement, and each resample has the same size as the original sample. The collection of these statistics is used as an estimate of the sampling distribution.
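The resampling scheme described above can also be sketched with PROC SURVEYSELECT instead of a hand-written macro. This is only an illustration under assumptions: the input dataset is taken to be named petals (as in Example 1), and the output name boot_ss and the seed value are hypothetical.

```sas
* Draw 10 resamples with replacement, each the same size as the input data;
* method=urs is unrestricted random sampling, i.e. sampling with replacement;
* outhits writes one record per selection, so repeated picks appear as repeated rows;
proc surveyselect data=petals out=boot_ss seed=12345
                  method=urs samprate=1 outhits reps=10;
run;
```

The output dataset carries a variable named Replicate that identifies the resample, so the statistic of interest can then be computed with a BY Replicate statement.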
Example 1: The following example records the lengths of 3 different petals from each of 10 trees of the same type. The program estimates the mean and standard deviation together with their uncertainty.
data petals;
input petal1 petal2 petal3;
cards;
1.21 1.31 1.53
2.13 2.21 3.17
1.59 1.70 1.56
1.45 1.23 1.21
1.41 1.96 1.24
1.04 1.8 1.58
1.03 1.05 2.1
1.4 1.25 1.26
1.56 1.26 1.34
1.82 1.24 1.56
;
run;
* This macro is used to generate 10 bootstrap samples of the above data;
%macro bootsamp(data,boot, b);
data &boot;
do isample=1 to &b;
do i = 1 to nobs;
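pt = ceil(ranuni(0) * nobs) ; *ceil gives an observation number from 1 to nobs with equal probability, whereas round could return 0;
set &data nobs = nobs point=pt; *the point= option reads the observation with number pt directly;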
output;
end;
end;
stop;
run;
%mend;
%bootsamp(petals, boot, 10); *Generating 10 bootstrap samples;
*Calculating the parameters mean and standard deviation for each subsample and appending them to obtain the final sample dataset;
%macro sample(j=, n=);
%do i=&j %to &n;
data petals1;
set boot;
where isample=&i;
run;
proc means data=petals1 ;
var petal1 petal2 petal3 ;
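*Only one name is given per statistic, so only the first VAR variable (petal1) is written to the output data set;
output out = petals_&i mean = mean std = std n =n;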
run;
proc append base=final data=petals_&i force;
run;
%end;
%mend;
%sample (j=1, n=10);
proc means data=final mean std ;
var mean std;
output out = means_ mean = mean std = std n =n;
run;
Output is shown below

Prepared by Jose Abraham
Multiple ampersands can be used to allow the value of a macro variable to become another macro variable reference. The macro variable reference will be rescanned until the macro variable is resolved.
The following demonstrates how macro variables with multiple ampersands are resolved. There are 4 macro variables
Macro variable :Value
A : CATCH
B : STUMP
C : RUN
HIT : A
Resolving a macro variable:
&VARNAME references a macro variable. The rule is that the scanner reads from left to right.
1. With one ampersand, i.e. ‘&HIT’, the macro variable HIT resolves to ‘A’.
2. With two ampersands, i.e. ‘&&HIT’, the two ampersands resolve to one and the scanner continues.
On the first scan - ‘&&’ resolves to ‘&’ and ‘HIT’ is held as a token.
On the second scan – ‘&HIT’ resolves to ‘A’.
3. With three ampersands, i.e. ‘&&&HIT’
On the first scan - ‘&&’ resolves to ‘&’ and the remaining ‘&HIT’ resolves to ‘A’, which results in ‘&A’.
On the second scan – ‘&A’ resolves to ‘CATCH’.
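The three cases can be checked directly in the log. The sketch below simply defines the four macro variables from the table above and writes each reference with %PUT; the labels One, Two and Three are only for readability.

```sas
%let A=CATCH;
%let B=STUMP;
%let C=RUN;
%let HIT=A;

/* one ampersand: direct reference */
%put One: &HIT;
/* two ampersands: rescanned once, same result as one ampersand */
%put Two: &&HIT;
/* three ampersands: indirect reference through the value of HIT */
%put Three: &&&HIT;
```

The log shows A for the first two statements and CATCH for the third.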
When writing macros, we sometimes want to generate a SAS statement dynamically within a macro %DO loop, for example to run a print procedure inside a macro and refer to a set of macro variables within the VAR statement.
proc print;
var
%do i = 1 %to &max;
&&var&i
%end;;
run;
Consider a simple program containing this %DO loop in a macro
data one;
input A $ B C D E;
datalines;
a 12 16 18 20
;
run;
%let var1=A;
%let var2=B;
%let var3=C;
%let var4=D;
%let var5=E;
%let max=4;
%let indt=one;
%macro prnt;
proc print data=&indt.;
var
%do i = 1 %to &max.;
&&var&i
%end;;
run;
%mend;
%prnt;
In this program two consecutive semicolons follow the %END statement, which is unusual in an ordinary SAS program. The first semicolon closes the %END statement and the second closes the VAR statement. Running this macro generates the following SAS statements
proc print data=one;
var a b c d;
run;
and it produces the result
5.3.09
Prepared by Sreeja E V
As per quality standards when presenting descriptive statistics for parameters in clinical trial reporting, the data should be aligned with respect to the decimal point. This dynamic decimal alignment and numeric precision should be maintained between varying parameters in the same dataset.
The following example contains 4 parameters A, B, C and D and their values. We have to present the descriptive statistics namely n, mean, standard deviation, minimum, median and maximum. The mean, standard deviation and median will be presented to one more decimal place than the observed value while minimum and maximum will be presented to the same number of decimal places as observed value. The value for n will be presented as integer.
As a first step the descriptive statistics need to be computed for the parameters, and then the number of decimal places needed for the descriptive statistics has to be determined for each parameter. For each parameter, the observed value with the greatest number of decimal places is found, and this gives the maximum number of decimal places used to present the data.
Once the maximum number of decimal places per parameter is obtained, one simply needs to pass this information into a character variable containing a representation of the appropriate numeric format as described.
Further, the maximum integer length over all values is determined, and white space is inserted using the repeat function for values whose integer length is less than the maximum integer length.
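Because the alignment step depends on how the repeat function counts, a small check may help. This is a minimal sketch with a hypothetical variable named padded; it only shows that repeat(" ", n) returns n+1 blanks, which is why the program subtracts 1 before calling it.

```sas
data _null_;
   length padded $10;
   /* repeat(" ", 1) returns two blanks: the original blank plus one repetition */
   padded = repeat(" ", 1) || "7.5";
   /* the $char. format keeps the leading blanks when writing to the log */
   put padded $char10. "|";
run;
```

The value written to the log starts with two blanks before 7.5.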
data lab;
input parameter $ value;
datalines;
A 12.3654
A 13.1
B 456.1
B 456
C 41.236
C 41.04
D 1.76
D 1.241
;
run;
proc means data=lab noprint;
by parameter;
var value;
output out=desc_data N=N Mean=Mean Std=std Min=Min Median=Median Max=Max;
run;
*Determining the number of decimal points;
*For that the values of all the parameters have been converted to character values so that the digits after the decimal places can be extracted to the variable de_part and its length can be stored in the variable dec_no. If the values of a particular parameter are whole numbers then dec_no will be assigned to zero;
data deci_point;
set lab;
value_n=put(value,best.);
de_part=scan(value_n,2,'.');
if de_part ne ' ' then dec_no=length(de_part);
else dec_no=0;
run;
*Determining maximum number of decimal points for each parameter; *Here the maximum of the variable dec_no for each parameter is determined and stored in the variable decimal;
proc sql noprint;
create table decimal as select
distinct parameter,
max(dec_no) as decimal
from deci_point
group by parameter;
select * from decimal;
quit;
*Creating the formats;
*The variables zerornd, onernd, zerofmt and onefmt are determined for each parameter for rounding and formatting purpose;
proc sql noprint;
create table decimal_1 as select
distinct
parameter,
decimal,
10**(-decimal -0) format best. as zerornd,
10**(-decimal -1) format best. as onernd,
"8." || put(decimal +0,1.) as zerofmt,
"8." || put(decimal +1,1.) as onefmt
from decimal
;
select * from decimal_1;
quit;
*Applying decimal formats;
data desc_stats(keep=parameter fn fmean fmedian fstd fmin fmax );
merge desc_data decimal_1;
by parameter;
fn=compress(put(n,3.));
if mean ne . then fmean=compress(putn(round(mean,onernd),onefmt));
if median ne . then fmedian=compress(putn(round(median,onernd),onefmt));
if std ne . then fstd=compress(putn(round(std,onernd), onefmt));
if min ne . then fmin=compress(putn(round(min,zerornd),zerofmt));
if max ne . then fmax=compress(putn(round(max,zerornd), zerofmt));
run;
proc sort data=desc_stats;
by parameter;
run;
proc transpose data=desc_stats out=stat;
var fn fmean fstd fmin fmedian fmax;
by parameter;
run;
*To align decimal points;
*Length of integer part of each value is determined and stored in the variable lenint. Further maximum integer length is obtained by determining maximum over lenint and maxint where initial value of the variable maxint is set to zero. While attaining end of the file the value of maxint is assigned to the macro variable max;
data outdata;
set stat(rename=(col1=value)) end=eof;
retain maxint 0;
lenint=length(compress(scan(value,1,'.')));
maxint = max(maxint, lenint);
if eof then call symput("max", put(maxint, best.));
run;
*diffint is one less than the number of leading blanks needed (max minus lenint), because the repeat function returns the original string followed by diffint repetitions. For observations with diffint>=0 (integer part shorter than the longest one) the blanks are concatenated in front of the value after it has been left aligned and its trailing blanks removed with the trim function. For observations with diffint<0 (integer part as long as the longest one) no white space is inserted;
data aligned(drop=maxint lenint diffint value);
retain parameter _name_ value value_aligned;
length value_aligned $15;
set outdata;
if parameter ne '' and value ne '' then do;
diffint = &max - lenint - 1;
if diffint >= 0 then do;
value_aligned = repeat(" ", diffint) || trim(left(value));
end;
else do;
value_aligned = trim(left(value));
end;
end;
run;
proc format ;
value $stat
"fn"="n"
"fmean"="Mean"
"fstd"="SD"
"fmin"="Minimum"
"fmedian"="Median"
"fmax"="Maximum"
;
run;
proc print data=aligned;
format _name_ $stat.;
run;
The output is obtained as

10.10.08
Prepared by Jose Abraham
Survival analysis (also called time to event analysis) is concerned with studying the time between entry to a study and a subsequent event. These methods are most often applied to the study of deaths. In fact, they were originally designed for that purpose, which explains the name survival analysis. Survival analysis is an important medical concern and is extremely useful for studying events like onset of disease and recurrence of disease.
The point of survival analysis is to follow subjects over time and observe at which point in time they experience the event of interest. The data which is obtained from survival studies may contain censored observations. Censoring comes in many forms and occurs for many different reasons.
For example, consider a cancer study in which subjects who responded to treatment were followed up for a specific period for recurrence of cancer (the event of interest). If a subject experiences recurrence at a time t that is not known exactly, and all we know is that the event occurred after a specific time T (i.e. t>T), then the last time at which the subject was observed is recorded and the survival time for that subject is considered right censored. If the recurrence occurred before a specific time and the exact time is unknown, the survival time for that subject is considered left censored. The times obtained from subjects who have no recurrence by the end of the study, and from those lost to follow-up before the end of the study period, are therefore censored.
In this study, the basic structure of the data is that for each case there is one variable which contains either the time at which recurrence happened or, for censored cases, the last time at which the case was observed, both measured from the chosen origin. Another variable denotes the censoring status of each case (uncensored=1 and censored=0). The data also contain values of other variables such as markers, tissues etc. A small part of data in this form is given below
data molecules;
input marker surv censor stage histo;
datalines;
0 75 1 2 1
1 115 0 3 2
1 96 1 1 1
0 110 0 2 3
0 178 0 3 2
1 149 1 2 3
1 163 1 4 4
0 211 1 1 2
1 167 1 2 1
0 195 0 2 1
1 140 1 3 4
0 202 0 4 4
0 153 0 2 2
1 147 0 1 3
0 132 0 4 1
0 178 1 3 2
;
run;
Analysis of censored data can be easily performed in SAS with the help of procedures such as PROC LIFETEST and PROC PHREG. The purpose of the analysis is to model the underlying distribution of the survival time variable and to assess the dependence of the survival time variable on the independent variables.
The Kaplan-Meier curve is plotted with disease-free survival time on the horizontal axis and survival probability on the vertical axis. The curve is useful for estimating the proportion of patients surviving beyond a specific time. We can also compare the survival experience of two groups by comparing their curves. This comparison of survival estimates can be done with the strata statement in PROC LIFETEST. Whether the Kaplan-Meier curves differ significantly can be tested with the log-rank test. If the p-value of the log-rank test is large (>0.05), there is no statistically significant evidence of a difference in survival. The following piece of SAS code compares the survival curves of cases in which the marker is present (marker=1) with those in which it is not present (marker=0).
proc lifetest data= molecules method=km plots=(s,lls) outsurv=option;
time surv*censor (0);
strata marker;
run;
The strata statement provides the log-rank and Wilcoxon test statistics. The outsurv= option in the proc lifetest statement creates a SAS data set that holds the KM survival estimates. Plots=(s, lls) produces log-log survival curves as well as survival curves. The log-log survival curves will be parallel or nearly parallel if the proportional hazards assumption is met.
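For reference, the Kaplan-Meier (product-limit) estimate that method=km computes can be written as follows. The notation is introduced here for illustration: $t_i$ are the distinct observed event times, $d_i$ is the number of events at $t_i$, and $n_i$ is the number of subjects still at risk just before $t_i$.

```latex
\hat{S}(t) \;=\; \prod_{i:\, t_i \le t} \left( 1 - \frac{d_i}{n_i} \right)
```

Censored subjects contribute to $n_i$ up to their last observed time but never to $d_i$, which is how the estimate uses the partial information they carry.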
Kaplan-Meier Curves
The hazard ratio is a reasonable measure of the effect of different factors on event occurrence. The Cox regression model can be used to model time-to-event data, and this can be done in SAS using PROC PHREG. The following piece of code can be used to model the data
proc phreg data =molecules;
model surv*censor(0) =marker stage histo /rl ties=breslow selection=b;
baseline out=out1 survival=s logsurv=ls loglogs=lls;
run;
The backward selection procedure (with the option selection=b) in Cox’s regression removes the non-significant variables from the regression model and it includes only significant variables in the final model. The option ties= breslow is used to handle the ties. Proc phreg produces the regression coefficients and their standard errors for the variables which were included in the final model along with the p-values obtained from the Wald’s chi-square test. Hazard ratios and their 95% confidence intervals for those variables are also included in the output.

Hazard ratios are interpreted in much the same way as odds ratios: a hazard ratio of 1 for an explanatory variable means that it has no effect on the hazard, a hazard ratio less than 1 means that the variable is associated with a decreased hazard, and a hazard ratio greater than 1 means that it is associated with an increased hazard.
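The link between the regression coefficients and the hazard ratios reported by PROC PHREG can be summarised as below. This is the standard form of the Cox proportional hazards model; the symbols $h_0(t)$ for the baseline hazard and $\beta_j$ for the coefficient of covariate $x_j$ are notation introduced here.

```latex
h(t \mid x) \;=\; h_0(t)\, \exp\!\left( \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p \right),
\qquad
\mathrm{HR}_j \;=\; \exp(\beta_j)
```

A coefficient of zero therefore corresponds to a hazard ratio of 1, a negative coefficient to a hazard ratio below 1, and a positive coefficient to a hazard ratio above 1.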
Prepared by Mohanan K K
The term SAS QC refers to the maintenance of the quality of data and quality of programming. The quality of data refers to the Accuracy, Completeness, Consistency, Timeliness, Uniqueness and Validity of data. The quality of programming on the other hand means that the SAS program should produce correct and meaningful outputs and at the same time meet all standards like indentation of statements, optimization of code, use of drop and keep statements and error/warning free logs to name a few.
The QC personnel involved in the SAS QC process are SAS programmers who are independent of the actual programming that is being carried out for the study. At Kreara, the SAS QC process involves the following steps
a) Checking the quality of SAS code and outputs developed.
b) Entering the review comments into the issue tracker.
c) Tracking the resolution of the review comments.
The QC personnel review the code, log and output and raise issues in an issue tracker which is a web application accessible to the team members and which helps in tracking the resolution of issues. The issues raised by the QC are made available to the SAS Programmer who in turn corrects the code as per comments and flags the issue as resolved. The issues are closed by the QC personnel when the correction is satisfactory.
At Kreara intensive QC of codes and outputs is done in four stages as described below
1. Output QC
2. Log QC
3. Code QC
4. Parallel Programming
1. Output QC
The Output QC further includes
a. Sample or complete check against actual data
b. Check against the template
During the output QC, the QC personnel checks whether the outputs produced are as per template or requirements. In addition a sample check is performed against the database. The sample size for sample check varies depending on the study. In case of tables a complete check is done on the outputs produced.
2. Log QC
A SAS program always generates a log file. As part of the Log QC, the QC personnel look for errors, warnings and critical notes in the log. Critical notes include, for example, “Missing values were generated as a result of performing an operation on missing values”. The Log QC is aimed at making the log free of errors, warnings and critical notes.
3. Code QC
The code QC involves step by step review of the code. The following aspects of the code need to be reviewed
a. Adherence to the requirements
b. Logic
c. Syntax
d. Optimization
e. Presentation
f. Completeness of datasets
A peer review is carried out during which the SAS Programmer is required to explain the code to the QC personnel. Any discrepancies are discussed and corrected by the SAS Programmer.
4. Parallel Programming
Parallel programming is included in the QC process at Kreara. Parallel programming helps in both the output QC and optimization of code. During the parallel programming the QC personnel generates the outputs as per requirement without emphasis on the template and SAS programming standards/practices. The outputs generated by parallel program are compared to actual output for discrepancies. The parallel program may also be compared with actual program in order to improve the Quality of programming.
The following example explains the parallel programming in QC process.
Suppose a SAS programmer calculates the confidence interval using the ‘tinv’ function in a data step, as shown below. A parallel programmer or QC personnel can then use the simpler proc means procedure, and the two outputs are compared.
PROC UNIVARIATE DATA=old;
BY id ;
VAR anal;
OUTPUT out=new n = n1 n2
mean = mean1 mean2
std = std1 std2
;
RUN;
DATA new1;
SET new;
n1 = PUT(n1,4.0);
n2 = PUT(n2,4.0);
mn1 = PUT(mean1,6.1);
mn2 = PUT(mean2,6.1);
sd1 = PUT(std1,7.2);
sd2 = PUT(std2,7.2);
IF n1 NE 0 THEN DO;
clh1 = mean1 + tinv(0.975,n1-1) * std1 / sqrt(n1);
cll1 = mean1 - tinv(0.975,n1-1) * std1 / sqrt(n1);
ci1 = "(" || PUT(clh1,6.2) || "," || PUT(cll1,6.2) || ")";
END;
IF n2 NE 0 THEN DO;
clh2 = mean2 + tinv(0.975,n2-1) * std2 / sqrt(n2);
cll2 = mean2 - tinv(0.975,n2-1) * std2 / sqrt(n2);
ci2 = "(" || PUT(clh2,6.2) || "," || PUT(cll2,6.2) || ")";
END;
RUN;
The same output can be obtained more simply using the proc means procedure on the original data as follows
proc means data=old ;
BY id;
VAR anal;
OUTPUT out=sum1
n = n1 n2
mean = mean1 mean2
std = std1 std2
lclm = ll1 ll2
uclm = ul1 ul2;
RUN;
The parallel program shows how the number of steps can be reduced. In addition, the output produced by the parallel program is compared with the actual output as part of the output QC.
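The comparison itself can be automated with PROC COMPARE. The sketch below assumes that the datasets new1 and sum1 exist as produced by the two programs above and that their observations are in the same order; the tolerance value is a hypothetical choice.

```sas
* Compare the confidence limits from the data step with those from proc means;
* method=absolute with criterion= treats differences below the tolerance as equal;
proc compare base=new1 compare=sum1 method=absolute criterion=1e-8;
   var  cll1 clh1;   * limits computed with tinv in the data step;
   with ll1  ul1;    * limits returned by lclm= and uclm= in proc means;
run;
```

Any value that differs by more than the tolerance is listed in the PROC COMPARE report, which gives the QC personnel a concrete discrepancy to raise in the issue tracker.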
The Quality Control process is carried out until good quality outputs and code are obtained.
Prepared by Sreeja E V
In clinical trials, it is often of interest to investigate the relationship between increasing dosage and the effect of the drug under study. Usually the dose levels tested are ordinal, and the effect of the drug is measured as a binary outcome. In such cases, the Cochran-Armitage trend test is most frequently used to test for trend.
Here, the null hypothesis (H0) is that there is no linear trend in the effect of the drug under study across increasing levels of dosage. The alternative hypothesis (H1) is that there is a linear trend in the effect of the drug under study across increasing levels of dosage.
Consider an example. The data set effect contains hypothetical data from a case-control study. The study investigates whether the variable cascon is related to genotype status. Each subject has one of three genotype statuses, 1, 2 or 3, where 1 represents abnormal, 2 partially abnormal and 3 normal status. The variable cascon takes the values 1='Case' and 2='Control'. The number of subjects in each group is given by the variable Count.
data effect;
input status cascon Count @@;
datalines;
1 1 15 1 2 26
2 1 19 2 2 10
3 1 20 3 2 3
;
run;
proc freq data=effect;
tables status*cascon / trend measures cl;
weight Count;
title 'Clinical Trial for case control study';
run;
The output will appear as follows
Cochran-Armitage Trend Test
*************************
One-sided Pr > Z <.0001
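For orientation, the statistic behind this output can be sketched as follows. The notation is introduced here and follows the usual presentation of the test for an R x 2 table: $n_{i1}$ is the count in row $i$, column 1, $n_{i\cdot}$ is the total of row $i$, $R_i$ is the score assigned to row $i$, $\bar{R}$ is the mean row score weighted by the row totals, and $p_{\cdot 1}$ is the overall proportion in column 1.

```latex
T \;=\; \frac{\sum_{i=1}^{R} n_{i1}\,\bigl(R_i - \bar{R}\bigr)}
             {\sqrt{\,p_{\cdot 1}\,\bigl(1 - p_{\cdot 1}\bigr)\, s^{2}\,}},
\qquad
s^{2} \;=\; \sum_{i=1}^{R} n_{i\cdot}\,\bigl(R_i - \bar{R}\bigr)^{2}
```

Under the null hypothesis of no trend, $T$ is approximately standard normal, which is why the output reports a Z statistic with one-sided and two-sided p-values.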
There are situations where, prior to running a SAS program (editor), one needs to run a list of files (SAS or non-SAS). Our objective is to call them all in a single statement.
Suppose we need to run a set of SAS files prior to a SAS program. The usual approach is to execute them one by one, but a set of files or members held in one storage location can be accessed by a single statement as follows.
Using the filename statement we assign the fileref storage to an aggregate storage location.
filename storage "An-Aggregate-storage-location";
Several files or members from this storage location can be accessed by listing them in parentheses after the fileref in a single %INCLUDE statement
%inc storage (Monthly, Quarterly);
Non-SAS files can also be accessed by putting quotation marks around the complete filename listed inside the parentheses.
%inc storage ("file-1.txt","file-2.dat","file-3.cat");
Auto call SAS macros and Formats
When the programs rely on user-defined macros stored as separate files, the sasautos option can be used instead, so that each macro is found automatically when it is called. Usually the formats and macros will be in separate folders. In those situations they can be called as follows. Note that fmtsearch expects librefs, while sasautos expects filerefs or directory paths.
libname project "project-path";
libname formlib "format-path";
filename mymacros "macro-path";
options fmtsearch=(formlib project) sasautos=(mymacros) mautosource;
With this setting, SAS searches for formats in the following order
1. Work.formats
2. Library.formats (if the libref library is assigned)
3. formlib.formats
4. project.formats
The sasautos option makes the macros stored in the location referenced by mymacros available for autocall.
The auto call facility is usually used when all user-defined macros are stored in a standard location and they are not compiled until they are actually needed.
But when the formats and macros are in the same folder, the following statement can be used, provided storage is assigned both as a libref (for fmtsearch) and as a fileref (for sasautos) to that folder.
options fmtsearch=(storage project) sasautos=(storage) mautosource;
Here, even though the work library is not specified, the fmtsearch option searches for formats in the work library by default, as mentioned earlier.
The formats can also be made available without the fmtsearch option if we know the name of the SAS file that contains all the formats for the particular study. We can then access the formats using a single %include statement.
%inc storage (formats);
Here the SAS file (editor) named formats contains all the formats for the study.
Prepared by Rajeev V
While performing a chi-square test using proc freq in SAS, one may sometimes encounter a warning such as: WARNING: 25% of the cells have expected counts less than 5. Chi-Square may not be a valid test.
This warning appears in the output window when the expected count in any cell is less than 5 when performing the Pearson chi-square test. For example, the expected count for row 2, column 1 in the table below is calculated as (16 x 235)/766 = 4.9086. When this warning appears, disregard the p-value from the Pearson chi-square test and use the p-value from Fisher’s exact test instead. This is explained below using an example:
Data _cat_;
Input grp $ Cat1 $;
Cards;
Case ABN
Control NRM
………………….
………………….
;
run;
ods output Chisq=chitb_1(where =(statistic in ("Chi-Square")))
FishersExact=fishexctb_1(where=(label1="Two-sided Pr<=P"));

The p-value obtained from the chi-square test above may not be a valid measure of association because the expected count in cell (2, 1) is less than 5. SAS automatically performs Fisher's exact test for 2x2 tables when the chisq option is specified, but for tables larger than 2x2 exact tests are requested by using the exact option in the tables statement or an exact fisher statement after the tables statement. SAS generates every possible table that is compatible with the given marginal totals and calculates the exact probability p of each table using Fisher’s formula (1934). Summing the probabilities of the tables that are at least as extreme as the observed one gives a p-value that is used in the usual way as a test of the null hypothesis. In our example we use the two-sided p-value (1.00) from Fisher’s exact test.
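The ods output statement shown above needs a PROC FREQ step after it to produce the two tables. The step below is a minimal sketch of what such a call could look like, assuming the dataset _cat_ with the variables grp and cat1 from the data step above; it is not taken from the original program.

```sas
proc freq data=_cat_;
   * chisq requests the Pearson chi-square test and expected prints the expected counts;
   tables grp*cat1 / chisq expected;
   * exact fisher requests Fisher's exact test, which is also needed for tables larger than 2x2;
   exact fisher;
run;
```

With the expected option the expected count of each cell is printed in the cross-tabulation, so cells below 5 can be seen directly alongside the warning.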
The adjusted odds ratio and the corresponding 95% confidence interval are obtained by performing logistic regression analysis. This technique is implemented in the SAS® System in PROC LOGISTIC.
Logistic regression analysis provides adjusted odds ratios if adjustors are included as additional predictors; otherwise it provides unadjusted odds ratios.
The general syntax of PROC LOGISTIC is:
PROC LOGISTIC DATA=dsn ;
MODEL depvar = indepvar(s)/options;
RUN;
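The distinction between adjusted and unadjusted odds ratios follows from the form of the model. As a sketch in notation introduced here, with $p$ the probability of the modelled response level, $x_1$ the predictor of interest and $x_2,\dots,x_k$ the adjustors:

```latex
\log\!\left( \frac{p}{1-p} \right) \;=\; \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k,
\qquad
\mathrm{OR}_1 \;=\; \exp(\beta_1)
```

When the adjustors are in the model, $\exp(\beta_1)$ is the odds ratio for $x_1$ with the other predictors held fixed (adjusted). When $x_1$ is the only predictor, the same expression gives the unadjusted odds ratio.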
Example:
Suppose we are interested in conducting a case-control study to evaluate the relationship between case/control status and genotype status for different genes. The genotype statuses are ‘abnormal’ and ‘normal’, where normal is taken as the referent group. For this purpose we generate a dataset as follows
data genestat;
do i=1 to 50;
gene=round(1 + (3-1)*uniform(10));
age =round(1+(3-1)*uniform(15));
ethnic=round(1+(3-1)*uniform(14));
status=round(1+(2-1)*uniform(16));
cascon=round(1+(2-1)*uniform(17));
output;
end ;
drop i;
run;
/*formats */
proc format;
value gene 1= 'Gene1' 2= 'Gene2' 3= 'Gene3';
value cascon 1='Case' 2='Control';
value age 1='<18' 2='18-35' 3='>35';
value ethnic 1='Asian' 2='Caucasian' 3='Other';
value status 1='Abnormal' 2='Normal';
run;
proc sort data=genestat ;
by gene;
format gene gene. cascon cascon. status status. age age. ethnic ethnic.;
run;
Let's consider a model with the variables status, age and ethnic as predictors.
/*proc logistic for calculating adjusted odds ratio*/
ods trace on;
ods output CLoddsWald=gene_cancer(where=(Effect="status Abnormal vs Normal"));
proc logistic data=genestat;
class status/param=ref ref=last;/* reference parameter ref=last
i.e. ref='Normal'*/
model cascon=status age ethnic / clodds=both;/* clodds =gives WALD confidence Interval for odds ratio*/
by gene ;
run;
ods output close;
ods trace off;
The MODEL statement names the response variable and the explanatory effects, including covariates, main effects, interactions, and nested effects. The CLASS statement names the classification variables to be used in the analysis and permits specification of a reference level. By default (REF=LAST), the last ordered level of a variable placed in the CLASS statement is treated as the reference category. The BY statement is used to obtain separate analyses on observations in groups defined by the BY variables.
The output table is obtained as

This shows that, for Gene1 and Gene2, the abnormal genotype status is less likely in cases than in controls. For Gene3, the adjusted odds ratio shows that the odds of the abnormal genotype are higher in the case group than in the control group.
Let's now consider the model where status is the only predictor.
/*proc logistic for calculating unadjusted odds ratio*/
ods trace on;
ods output CLoddsWald=gene_cancer(where=(Effect="status Abnormal vs Normal"));
proc logistic data=genestat;
class status/param=ref ref=last;/* reference parameter ref=last
i.e. ref='Normal'*/
model cascon=status / clodds=both;/* clodds =gives WALD confidence interval
for odds ratio*/
by gene ;
run;
ods output close;
ods trace off;
The output table is obtained as

This shows that the odds of the abnormal genotype are higher in the case group than in the control group for gene 3, while they are the same for genes 1 and 2.
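As a side note on the CLASS statement, the reference level can also be named explicitly instead of relying on the level order. The following is a minimal sketch, assuming a hypothetical simulated dataset demo with character variables; it is not the genestat data and its results carry no meaning. The EVENT= response option makes the modelled outcome explicit as well.

```sas
/* Hypothetical dataset DEMO: simulated values, for illustration only */
data demo;
  length status $8 cascon $7;
  call streaminit(2024);
  do i = 1 to 200;
    if rand('uniform') < 0.5 then status = 'Abnormal'; else status = 'Normal';
    if rand('uniform') < 0.5 then cascon = 'Case'; else cascon = 'Control';
    output;
  end;
  drop i;
run;

proc logistic data=demo;
  class status(ref='Normal') / param=ref;             /* reference level named explicitly */
  model cascon(event='Case') = status / clodds=wald;  /* model the probability of Case    */
run;
```

Naming the level in REF= keeps the comparison "Abnormal vs Normal" stable even if the formats or the level ordering change later.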

Scrum is a process skeleton that includes a set of practices and predefined roles. The main roles in Scrum are the ScrumMaster, who maintains the processes and works much like a project manager; the Product Owner, who represents the stakeholders; and the Team, which includes the developers.
During each sprint, a 15-30 day period (length decided by the team), the team creates an increment of potentially shippable (usable) software. The set of features that go into each sprint comes from the product backlog, which is a prioritized set of high-level requirements for the work to be done. Which backlog items go into the sprint is determined during the sprint planning meeting. During this meeting the Product Owner informs the team of the items in the product backlog that should be completed. The team then determines how much of this it can commit to completing during the next sprint. During the sprint, no one is able to change the sprint backlog, which means that the requirements are frozen for the sprint.
There are several implementations of systems for managing the Scrum process which range from yellow stickers and white-boards to software packages. One of Scrum's biggest advantages is that it is very easy to learn and requires little effort to start using.
6.10.08
Documentation is an integral part of clinical trial studies and maintenance of standards while preparing various documents is of utmost importance. The documents related to clinical trial study may be anything from SOPs to documents related to project management, data management, statistics or SAS.
At Kreara, all or some of such documents are prepared as per requirement of the study. The emphasis is not only to make these documents as informative as possible but also to convey the information in a concise and effective manner. Further, effort is taken to maintain the quality of the information contained and the way of presentation.
An SOP for General Documentation Guidelines is maintained at Kreara and all personnel in the organization are trained on the same. This standard operating procedure describes the various guidelines to be followed during preparation and amendment of SOPs in general. It also presents guidelines for preparation of project related documents like the naming conventions to be followed.
In addition, individual SOPs are maintained for every document. To maintain standards, templates with instructions on the contents, layout and formatting are kept in a central repository. The personnel responsible for writing the documents are required to follow the format in the templates while preparing the documents. The QC personnel check the document for any non-compliance with the templates, in addition to the relevance of the contents. Further, the QA manager is responsible for ensuring that the process is followed correctly.
The personnel at Kreara are trained in the SOPs related to document writing. A great deal of exposure in the related field is provided to them so that they are capable of preparing informative and effective documents.
3.10.08
Prepared by : Sujith K G.
The SUBSTR() function is commonly used to select strings that start with specific characters, or to extract part of a string. The SAS colon modifier “=:” can perform the same prefix comparison. Both methods compare values based on the prefix of a text string, and both are illustrated in the following example.
Here we have a dataset adverse which contains the patient id and the name of the adverse event.
data adverse;
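input id $ ae :$20.; /* id kept as character to preserve leading zeros; ae long enough for names over 8 characters */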
cards;
001 asthma
002 chesttightness
003 dizziness
004 cold
005 headache
006 dysphonia
007 commoncold
008 nausea
009 cough
;
run;
We want to flag the adverse events starting with “co” (cough, cold and commoncold) as Yes and all others as No.
This can be done with the SUBSTR() function as shown below:
data event;
set adverse;
if lowcase(substr(ae,1,2))='co' then res="Yes";
else res="No";
run;
The same result can be obtained with the colon modifier “=:”, using the following condition:
data event;
set adverse;
if lowcase(ae) =: "co" then res="Yes";
else res="No";
run;
As the examples show, SUBSTR() requires the start position and length of the prefix to be specified, whereas the colon modifier needs neither: it compares only as many characters as the shorter operand contains.
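The colon modifier can also be combined with the IN operator to test several prefixes at once. The sketch below is a hedged illustration: the dataset adverse_demo, the dataset event_multi and the second prefix “na” are assumptions made for the example and are not part of the original task.

```sas
/* Hypothetical subset of the adverse event data, for illustration only */
data adverse_demo;
  input id $ ae :$20.;
  cards;
001 asthma
004 cold
007 commoncold
008 nausea
009 cough
;
run;

data event_multi;
  set adverse_demo;
  length res $3;
  /* IN: applies the prefix comparison to every value in the list */
  if lowcase(ae) in: ('co', 'na') then res = 'Yes';
  else res = 'No';
run;
```

With SUBSTR() the same check would need the prefix length to match every listed value, so the IN: form is mainly a convenience when the prefixes differ in length.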