Multiple choices in a choice data set

Posted on 2020-12-08 22:41:34

The original data identify each consumer (consumerid) and record which cars they purchased.

clear
input byte consumerid str8 car byte purchase
    6   American    1
    6   Japanese    0
    6   European    0
    7   American    0
    7   Japanese    0
    7   European    1
    7   Korean      1
end

Since these are purchase data, the data set needs to be expanded so that it shows the full choice set of cars for every purchase a consumer made. The final data set should look like this (screenshot taken from the Stata manual, www.stata.com/manuals/cm.pdf, p. 97, "Example 4: Multiple choices per case"):

[Image: screenshot of the expanded choice data from the Stata manual]

I have written some code (shown below) that almost gets me where I need to be, but I have trouble generating a single purchase = 1 per consumerid-carnumber combination: because of the expansion, the purchase values are duplicated.

egen sumpurchase = total(purchase), by(consumerid)
expand sumpurchase
bysort consumerid car (purchase): gen carnumber = _n


    
Asked by Olga

Answered by Wouter on 2020-12-09 16:40:52

You could use reshape to get every consumerid/car combination for each car bought. This example assumes that the row order of the original dataset determines which purchase is carnumber 1, carnumber 2, and so on.

* Example generated by -dataex-. To install: ssc install dataex
clear
input byte consumerid str8 car byte purchase
6 "American" 1
6 "Japanese" 0
6 "European" 0
7 "American" 0
7 "Japanese" 0
7 "European" 1
7 "Korean"   1
end

// Generate carnumber: number each purchase within consumer (1, 2, ...);
// cars that were not purchased get carnumber 0
bys consumerid: gen carnumber = cond(purchase != 0, sum(purchase), 0)

// To wide
reshape wide purchase, i(consumerid car) j(carnumber)

// Keep purchased cars only (drop the column for carnumber 0)
drop purchase0

// Back to long
reshape long

// Drop if no cars purchased for consumerid/carnumber
bysort consumerid carnumber (purchase) : drop if missing(purchase[1])

// Replace missing with 0 for non-purchased cars
mvencode purchase, mv(0)

// Sort and see results
sort consumerid carnumber car
list, sepby(consumerid carnumber) abbr(14)

Results:

. list, sepby(consumerid carnumber) abbr(14)

     +----------------------------------------------+
     | consumerid        car   carnumber   purchase |
     |----------------------------------------------|
  1. |          6   American           1          1 |
  2. |          6   European           1          0 |
  3. |          6   Japanese           1          0 |
     |----------------------------------------------|
  4. |          7   American           1          0 |
  5. |          7   European           1          1 |
  6. |          7   Japanese           1          0 |
  7. |          7     Korean           1          0 |
     |----------------------------------------------|
  8. |          7   American           2          0 |
  9. |          7   European           2          0 |
 10. |          7   Japanese           2          0 |
 11. |          7     Korean           2          1 |
     +----------------------------------------------+
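
For comparison, here is a minimal sketch that stays closer to the expand approach in the question. It is an alternative, not part of the answer above. It assumes purchase is coded 0/1, and the variables row, buyorder, npurchase and nchosen are hypothetical helpers introduced only for this illustration. The row order is stored explicitly in row, because Stata's sort does not promise to keep tied observations in their original order unless told to.

```stata
clear
input byte consumerid str8 car byte purchase
6 "American" 1
6 "Japanese" 0
6 "European" 0
7 "American" 0
7 "Japanese" 0
7 "European" 1
7 "Korean"   1
end

// Record the original row order explicitly (row is a hypothetical helper)
gen long row = _n

// Rank each purchase within consumer: 1, 2, ...; 0 for cars not bought
bysort consumerid (row): gen byte buyorder = cond(purchase != 0, sum(purchase), 0)

// Count purchases per consumer and drop consumers who bought nothing
egen byte npurchase = total(purchase), by(consumerid)
drop if npurchase == 0

// One copy of the full choice set per purchase
expand npurchase
bysort consumerid row: gen byte carnumber = _n

// Within copy k, only the k-th purchase is flagged as chosen
replace purchase = (buyorder == carnumber)

// Check: exactly one chosen car per consumerid/carnumber
egen byte nchosen = total(purchase), by(consumerid carnumber)
assert nchosen == 1

drop row buyorder npurchase nchosen
sort consumerid carnumber car
list, sepby(consumerid carnumber) abbr(14)
```

On the example data this should produce the same 11 rows as the reshape version. The difference is that the duplication is resolved by comparing each row's purchase rank with the copy number, rather than by going wide and back to long.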