Note: This article is reproduced from serverfault.com.
sas

create data step variables using a dynamic macro-variable

Posted on 2020-12-03 01:15:17

I want to store an instance of a data step variable in a macro-variable using call symput, then use that macro-variable in the same data step to populate a new field, assigning it a new value every 36 records.

I tried the following code:

data a;
set a;
if MOB = 1 then do;
   MOB1_accounts = accounts;
   call symput('MOB1_acct', MOB1_accounts);
end;
else if MOB > 1 then MOB1_accounts = &MOB1_acct.;
run;

I have a series of repeating MOBs (1-36). I want to create a field called MOB1_accounts, set it equal to the number of accounts for that cohort where MOB = 1, and keep that value when MOB = 2, 3, 4, etc. I basically want to "drag down" the MOB 1 value every 36 records.

For some reason this macro-variable is returning "1" instead of the correct number of accounts. I think it might be a char/numeric issue, but I'm unsure. I've tried every possible permutation of single quotes, double quotes, symget, etc... no luck.

Thanks for the help!

Questioner: cbart
Richard 2020-12-04 02:06:00

You are misusing the macro system.

The ampersand (&) introducer in source code tells SAS to resolve the symbol that follows and place its value into the code submission stream. This happens before the step is compiled, so the resolved &MOB1_acct. cannot be changed by the running DATA step. In other words, a running step cannot change its own source code: the resolved macro variable is the same for all implicit iterations of the step because its value became part of the step's source code.

You can use the CALL SYMPUT routine and the SYMGET function to move strings out of and into a running DATA step. But that is still the wrong approach for your problem.
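To make the timing difference concrete, here is a minimal sketch. It is not based on your data: the macro variable demo_value and the DATA step variable names are made up for illustration, and it assumes the code is submitted in open code rather than inside a macro. It compares a reference resolved with & against a run-time lookup with SYMGETN:

```sas
%let demo_value = 0;   /* hypothetical macro variable, set before the step */

data _null_;
  do i = 1 to 3;
    /* updates the macro symbol table while the step is running */
    call symputx('demo_value', i);

    /* &demo_value. was replaced by its value before the step was compiled */
    from_source = &demo_value.;

    /* SYMGETN looks the value up at run time and returns it as a number */
    from_symget = symgetn('demo_value');

    put i= from_source= from_symget=;
  end;
run;
```

In the log, from_source should stay at 0 on every pass of the loop, while from_symget follows i (1, 2, 3).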

The most straightforward technique combines

  • a retained variable
  • a mod(_n_, 36) computation to detect the first row of each block of 36. (_n_ is a proxy for row number in a simple step with a single SET.)

Example:

data a;
  set a;

  retain mob1_accounts;

  * every 36 rows change the value, otherwise the value is retained;
  if mod(_n_,36) = 1 then mob1_accounts = accounts;
run;

You didn't show any data, so the actual program statements you need might be slightly different.

Contrasting SYMPUT/SYMGET with RETAIN

As stated, SYMPUT/SYMGET is a possible way to retain values, by storing them off in the macro symbol table. There is a penalty, though. The SYM* routines require a function call, plus whatever black-box work goes on to store and retrieve a symbol value, and possibly additional conversions between character and numeric.
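On the conversion point, a small sketch may help (the value 1234 and the macro variable names are hypothetical). When CALL SYMPUT is handed a numeric value, SAS converts it to character implicitly, which pads the result with leading blanks and writes a conversion note to the log. CALL SYMPUTX, or an explicit CATS(), avoids the padding:

```sas
data _null_;
  accounts = 1234;                                /* made-up value for illustration */

  /* implicit numeric-to-character conversion: result is right-aligned, with leading blanks */
  call symput ('padded_value', accounts);

  /* SYMPUTX converts the number and strips leading and trailing blanks */
  call symputx('trimmed_value', accounts);

  /* explicit conversion with CATS() before storing the string */
  call symput ('cats_value', cats(accounts));
run;

/* brackets make any leading blanks visible in the log */
%put padded=[&padded_value] trimmed=[&trimmed_value] cats=[&cats_value];
```

Only the first pair of brackets should show leading blanks in front of the number.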

Example:

The test reads 1,000,000 rows. The two steps being compared are DATA _null_ steps, so the cost of writing an output data set does not distort the comparison.

data have;
  do rownum = 1 to 1e6;
    mob + 1;
    accounts = sum(accounts, rand('integer', 1,50) - 10);
    if mob > 36 then mob = 1;
    output;
  end;
run;

data _null_;
  set have;

  if mob = 1 then call symput ('mob1_accounts', cats(accounts));

  mob1_accounts = symgetn('mob1_accounts');
run;

data _null_;
  set have;
  retain mob1_accounts;

  if mob = 1 then mob1_accounts = accounts;
run;

The log on my system:

142  data _null_;
143    set have;
144
145    if mob = 1 then call symput ('mob1_accounts', cats(accounts));
146
147    mob1_accounts = symgetn('mob1_accounts');
148  run;

NOTE: There were 1000000 observations read from the data set WORK.HAVE.
NOTE: DATA statement used (Total process time):
      real time           0.34 seconds
      cpu time            0.34 seconds


149
150  data _null_;
151    set have;
152    retain mob1_accounts;
153
154    if mob = 1 then mob1_accounts = accounts;
155  run;

NOTE: There were 1000000 observations read from the data set WORK.HAVE.
NOTE: DATA statement used (Total process time):
      real time           0.04 seconds
      cpu time            0.03 seconds

Or, summarized (times in seconds):

way            real   cpu
-------------  ----  ----
SYMPUT/SYMGET  0.34  0.34
RETAIN         0.04  0.03