Hello everyone,

I am trying to replace missing values with the corresponding median, bearing in mind that this median must vary across a class variable (CountryCode).

So I have something like:

CountryCode AGE
US 12
US 10
US .
FR 11
FR .
NL 10

etc...

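In the above example I am trying to replace the missing values in AGE with the median AGE within each country (US, FR, NL), so the missing US value would become 11 (the median of 12 and 10) and the missing FR value would become 11.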

I have tried a couple of solutions that replace with the median across the whole sample, but nothing that actually takes the class variable (CountryCode) into account.

Thanking you in advance,

2 Answers

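PROC STDIZE provides an easy solution. With METHOD=MEDIAN and the REPONLY option it only replaces missing values with the median (the non-missing values are left as they are), and the BY statement makes it compute that median separately for each CountryCode.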

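data country;
   /* @@ holds the input line so several (countrycode, age) pairs are read from it */
   input countrycode :$2. age @@;
   cards;
US 12 US 10 US . FR 11 FR . NL 10
;
run;

/* REPONLY: replace missing values only, do not standardize the rest */
proc stdize data=country method=median reponly out=poke;
   by notsorted countrycode;   /* rows are grouped by country but not in sort order */
   var age;
run;

proc print data=poke;
run;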

Hello,
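You can solve your problem with PROC MEANS followed by a DATA step that uses CALL EXECUTE: PROC MEANS computes the median per country, and the DATA step then generates a second DATA step with one IF statement per country.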
Kind regards,
Toloc

DATA country;
LENGTH CountryCode $ 3 age 8;
INPUT CountryCode $ age;
CARDS;
US 12
US 10
US .
FR 11
FR .
NL 10
;
RUN;

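/* Output has one row per country (_type_=1) plus one overall row (_type_=0) */
PROC MEANS DATA=country;
CLASS CountryCode;
VAR age;
OUTPUT OUT=countryMed MEDIAN=med;
RUN;

/* Generate and run a DATA step containing one IF statement per country */
DATA _NULL_;
SET countryMed(WHERE=(_type_=1)) END=last;
IF _N_=1 THEN
  DO;
    CALL EXECUTE("DATA country_New;");
    CALL EXECUTE("SET country;");
  END;
/* PUT with BEST32. avoids the automatic numeric-to-character conversion note */
CALL EXECUTE("IF age=. AND countrycode='"||COMPRESS(countrycode)||"' THEN age="||STRIP(PUT(med, BEST32.))||";");
IF last THEN CALL EXECUTE("RUN;");
RUN;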
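
For comparison, the same result can probably also be reached without generating code, by merging the per-country medians back onto the data. This is only a sketch under assumptions: it expects the country data set created above, and the names countryMed2, country_sorted and country_merged are made up for illustration.

```sas
/* NWAY keeps only the per-country rows, so no _type_ filter is needed */
proc means data=country noprint nway;
   class countrycode;
   var age;
   output out=countryMed2(keep=countrycode med) median=med;
run;

/* MERGE with BY needs both inputs in countrycode order */
proc sort data=country out=country_sorted;
   by countrycode;
run;

data country_merged;
   merge country_sorted countryMed2;
   by countrycode;
   /* fill only the missing ages with that country's median */
   if missing(age) then age = med;
   drop med;
run;
```

Unlike the CALL EXECUTE version, the amount of generated code does not grow with the number of countries, but the rows come back sorted by CountryCode rather than in their original order.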
Thanks Toloc - It worked like a dream! – Olivier Dec 22 at 17:59
