I have the following results that I'd like to put into Stata and run some analysis:
Is there a way I can generate a dataset from these results in Stata with the appropriate number of observations, so that I can run tabodds or similar?
Perhaps this is what you are looking for.
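clear
* one observation per cell of the 3x2 table
set obs 6
* region takes the values A, A, B, B, C, C
gen region = word("`c(ALPHA)'", ceil(_n / 2))
* within each region: 0 = case, 1 = control
bysort region : gen control = _n - 1
label define casecontrol 1 "Control" 0 "Case"
label values control casecontrol
* cell counts, listed in the sorted order of the observations
local expandlist 708 1392 946 2086 328 996
gen exp = real(word("`expandlist'", _n))
* replace each observation with exp copies of itself
expand exp
drop exp
tab region control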
Result:
. tab region control

           |        control
    region |      Case    Control |     Total
-----------+----------------------+----------
         A |       708      1,392 |     2,100
         B |       946      2,086 |     3,032
         C |       328        996 |     1,324
-----------+----------------------+----------
     Total |     1,982      4,474 |     6,456
Thanks so much @Wouter, this helps a lot and I was able to run my chi-squared trend test. However, I'm struggling to apply this to other situations - how would I adapt it to be for a 2x2 table (a binary variable say, rather than 3 regions)? I tried changing bysort region to be something like _n / 2 or _n - 2 but that's clearly not right.
The most important thing is to set the right number of initial observations, one for each combination of variables. So a 2x2 table implies

set obs 4

Just changing that would already work, provided the local expandlist then holds the four cell counts. If you want a 0/1 indicator variable instead of regions A and B, you could change

gen region = word("`c(ALPHA)'", ceil(_n / 2))

to something like

gen region = mod(_n, 2)
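Putting those pieces together, a 2x2 version might look like the sketch below. The variable name exposed and the four counts are made up for illustration and are not from the question. Note that mod(_n, 2) produces 1, 0, 1, 0, but bysort then sorts the data, so the counts in expandlist are read in the order exposed 0 case, exposed 0 control, exposed 1 case, exposed 1 control.

```stata
clear
* one observation per cell of the 2x2 table
set obs 4
* 0/1 indicator (hypothetical name): 1, 0, 1, 0 before sorting
gen exposed = mod(_n, 2)
* bysort sorts by exposed, then numbers the rows 0 and 1 within each group
bysort exposed : gen control = _n - 1
label define casecontrol 1 "Control" 0 "Case"
label values control casecontrol
* hypothetical cell counts, in the sorted order of the observations
local expandlist 10 20 30 40
gen n = real(word("`expandlist'", _n))
* replace each observation with n copies of itself
expand n
drop n
tab exposed control
```

With your own data, only the numbers in expandlist should need to change.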