1

Here is how my data are set up:

where date# is the date the lab result# was collected. I have tried to check if the dates are in asending order but cannot be certain (the real data can have up to 10 dates with 10 corresponding lab results). Where . are missing data. Where DATE# are dates mm/dd/yyyy, RESULT# are numeric, and ID is a string.

ID DATE1 RESULT1 DATE2 RESULT2 DATE3 RESULT3 DATE4 RESULT4

1   1/1/2000   200   3/1/2000   250   8/1/2000   100   9/1/2000   400  

2   .    .      .    .   .    .   .   .

3   2/1/2000   225   9/1/2000   100   .   .   .   .

4   12/1/2000   300   .   .   .   .   .   .

QUESTION 1: How do I identify/tag the earliest and oldest lab result per ID?

These here is the output I need:

ID FIRST LAST

1   200   400

2   .     .

3   225   100

4   300   300

QUESTION2: any way to make the LAST output for ID 4 set to missing since there is only one result for this record?

flag

2 Answers

3

Hello,

Here is a solution for both questions with the use of arrays.

Regards

DATA result(KEEP=id min max);  
SET dataset;  
ARRAY dates date1-date10;  
ARRAY results result1-result10;  

id_min=1;  
id_max=1;  

DO i=2 TO 10;    
IF dates{i}>dates{id_max} THEN id_max=i;    
    IF dates{i} < dates{id_min} THEN id_min=i;      
END;

first=results{id_min};
last=results{id_max};

/* Check if at least two valid values */  
nb_Not_Missing=N(OF date1-date10);  
IF nb_Not_Missing<=1 THEN last=.;  

RUN;
link|flag
I had to make a few adjustments and wanted to post what I used: DATA CD4; *(KEEP=recnum cd4_first cd4_last); SET temp3; ARRAY dates date_cd1-date_cd10; ARRAY results cdct1-cdct10; id_min=1; id_max=1; DO i=2 TO 10; IF dates(i) ^=. THEN IF dates{i} >= dates{id_max} THEN id_max=i; ELSE IF dates{i} <= dates{id_min} THEN id_min=i; END; cd4_first=results{id_min}; cd4_last=results{id_max}; * Check if at least two valid values; nb_Not_Missing=N(OF date_cd1-date_cd10); IF nb_Not_Missing<=1 THEN cd4_last=.; cd4_diff=(cd4_last-cd4_first); RUN; – KwonC Jan 10 at 19:25
1

Wanted to post the code I used after a few adjustments:

DATA CD4; *(KEEP=recnum cd4_first cd4_last);  
SET temp3;  
ARRAY dates date_cd1-date_cd10;  
ARRAY results cdct1-cdct10;  

id_min=1;  
id_max=1;  

DO i=2 TO 10; 
IF dates(i) ^=. THEN
    IF dates{i} >= dates{id_max} THEN id_max=i;    
    ELSE IF dates{i} <= dates{id_min} THEN id_min=i;      
END;

cd4_first=results{id_min};  
cd4_last=results{id_max};

* Check if at least two valid values;  
nb_Not_Missing=N(OF date_cd1-date_cd10);  
IF nb_Not_Missing<=1 THEN cd4_last=.;  

cd4_diff=(cd4_last-cd4_first);

RUN;
link|flag

Your Answer

Not the answer you're looking for? Browse other questions tagged or ask your own question.