Warm tip: This article is reproduced from stackoverflow.com, please click
sas

SAS-identifying three preceding years

发布于 2020-04-10 16:17:03

I use following code to identify and output three preceding years. To give more details, the sample includes multiple Person ID and each Person ID have observations in multiple years. The final sample will only keep the Person ID who at least have three preceding years (e.g. 2001 2002 2003).

data have3 ;
  set have2;
  by personid;
  set have2 ( firstobs = 2 keep = year rename = (year = _year2) )

      have2 (      obs = 1 drop = _all_                        );

  _year2 = ifn(  last.personid, (.), _year2 ); /*output the value of next year*/

  set have2 ( firstobs = 3 keep = year rename = (year = _year3) )

      have2 (      obs = 2 drop = _all_                        );

  _year3 = ifn(  last.personid, (.), _year3 );  /*output the value of the year after the next year*/

  _prev1 = ifn( first.personid, (.), lag(year) ); /*output the value of previous year*/

  _prev2 = ifn( first.personid, (.), lag2(year) );/*output the value of the year before the previous year*/


  if (year-2 eq _prev1-1 eq _prev2) or

     (year+2 eq _year2+1 eq _year3) or

     (year eq _year2-1 eq _prev1+1) then output;

run;

This code is fine in most situations. However, my sample has some tricky situations. The following figure show one of the situation. Person ID 488 only have two observations (Year 1994 and 1995). Unfortunately, the first year of the next Person ID 489 is 1996. Hence, the _year3 of ID488 Year1994 is 1996 that make year+2 eq _year2+1 eq _year3 become true. As the result, ID488 Year1994 also output to the final sample. How could I improve the code to avoid this situation? Thanks!

enter image description here

Questioner
Neal801
Viewed
43
Tom 2020-02-02 11:24

It is probably going to be easier to just keep the id variable also in your look ahead SET statements.

set have2(firstobs=2 keep=personid year rename=(personid=personid2 year=_year2))...
set have2(firstobs=3 keep=personid year rename=(personid=personid3 year=_year3))...

Then you can make sure that the next and next-next records are actually for the same PERSONID.

if personid ne personid2 then _year2=.;
if personid ne personid3 then _year3=.;

Same thing for your look back variables.

_prev1=lag(year);
_prev2=lag2(year);
if personid ne lag(personid) then _prev1=.;
if personid ne lag2(personid) then _prev2=.;