Fitting Conditional Logit using a Mata-based d0 evaluator with maximum likelihood -ml- in Stata
This post is heavily based on the excellent book Maximum Likelihood Estimation with Stata (Gould, Pitblado, & Sribney, 2006) and by no means tries to be a replacement for such an amazing reference. However, in chapter 6, devoted to the d-family of -ml- evaluators, something that is, in my humble opinion, very important was missing: a discussion of Discrete Choice Models as a family of models especially suitable for these evaluators, since their likelihood function does not meet the linear-form restrictions.
Here I provide a way to both simulate such data using Mata and then fit a conventional Conditional Logit model with a Mata-based d0 evaluator using -ml-.
Additionally, if you follow closely, you will discover a couple of cool features of Mata: for example, how to pass objects from Stata to Mata and vice versa, and how to declare transmorphic objects in Mata.
Data Generation
We are going to generate the following model:
\[U_{in}=V_{in} + \varepsilon_{in} = \sum_{m=1}^{M}x_{im}\beta_{m} + \varepsilon_{in}\] The probability of individual \(n\) selecting alternative \(i\) is given by (Section 3.10, Train, 2009): \[P_{in}=\dfrac{e^{V_{in}}}{\sum_{j=1}^{J}e^{V_{jn}}}\]
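Before generating any data, it helps to see the choice probability formula at work. The following Mata sketch (with made-up deterministic utilities for \(J=3\) alternatives) computes the logit probabilities and confirms they sum to one:

```
mata:
V = (1, 0.5, -1)            // made-up deterministic utilities V_jn
P = exp(V) :/ sum(exp(V))   // logit probabilities P_in
P                           // each entry lies in (0,1)
sum(P)                      // probabilities sum to one
end
```

Note how the exponential keeps every probability strictly positive, and the common denominator makes them sum to one regardless of the scale of the utilities.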
clear all
set seed 157
set obs 2000
gen id = _n
local n_choices =5
expand `n_choices'
bys id : gen alternative = _n
gen x1 = runiform(-2,2)
gen x2 = runiform(-2,2)
matrix betas = (0.5 ,2)
global individuals = "id"
mata:
// Calls from Stata the matrix "betas"
betas =st_matrix("betas")
// Generates a view of all attributes (M) x*
st_view(X = ., ., "x*")
// Generates a view individuals id
st_view(panvar = ., ., st_global("individuals"))
// Extremely useful utilities for panel data
paninfo = panelsetup(panvar, 1)
npanels = panelstats(paninfo)[1]
// Looping over individuals (n)
for(n=1; n <= npanels; ++n) {
// Extract only the submatrix of individual n
x_n = panelsubmatrix(X, n, paninfo)
// Linear utility
util_n =betas :* x_n
util_sum =rowsum(util_n)
U_exp = exp(util_sum)
// Probability of each alternative
p_i = U_exp :/ colsum(U_exp)
// Probability balance of alternatives
cum_p_i =runningsum(p_i)
rand_draws = J(rows(x_n),1,runiform(1,1))
pbb_balance = rand_draws:<cum_p_i
cum_pbb_balance = runningsum(pbb_balance)
choice_n = (cum_pbb_balance:== J(rows(x_n),1,1))
// Storing each individual choice.
if (n==1) Y =choice_n
else Y = Y \ choice_n
}
// Creates a new Stata variable called "choice"
idx = st_addvar("float", "choice")
// save the content of Y on "choice" Stata variable
st_store(., idx, Y)
end
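The block above simulates each choice with a standard inverse-CDF trick: comparing one uniform draw against the running sum of the probabilities flags every alternative at or beyond the chosen one, and taking the running sum of those flags and marking where it equals one isolates exactly the chosen row. A minimal Mata sketch with a made-up probability vector:

```
mata:
p = (0.2, 0.5, 0.3)'               // made-up choice probabilities (sum to 1)
cum = runningsum(p)                // (0.2, 0.7, 1)'
u = 0.65                           // one uniform draw, replicated down the rows
flags = J(3, 1, u) :< cum          // (0, 1, 1)'
choice = runningsum(flags) :== 1   // (0, 1, 0)': alternative 2 is chosen
end
```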
Fitting Clogit using a Mata evaluator & -ml-
Here the trick is to define a Mata evaluator and call it from -ml-. For further details, see Gould, Pitblado, & Sribney (2006).
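Concretely, with \(y_{jn}\) equal to 1 for the chosen alternative and 0 otherwise, the evaluator computes the sample log likelihood

\[\ln L(\beta)=\sum_{n=1}^{N}\ln\dfrac{\sum_{j=1}^{J}y_{jn}e^{V_{jn}}}{\sum_{j=1}^{J}e^{V_{jn}}}\]

which is exactly what the loop below accumulates in lnfj, one individual at a time, before moptimize_util_sum() adds it up over individuals.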
mata:
void myclogit_eval_d0(transmorphic scalar M, real scalar todo,
real rowvector b, real scalar lnf,
real rowvector g, real matrix H)
{
/*-----------------------------*/
/*----variables declaration----*/
/*-----------------------------*/
real matrix panvar
real matrix paninfo
real scalar npanels
real scalar n
real matrix Y
real matrix X
real matrix x_n
real matrix y_n
real colvector lnfj, U, U_exp
real scalar p_i
// variables creation
Y = moptimize_util_depvar(M, 1)
X= moptimize_init_eq_indepvars(M,1)
st_view(panvar = ., ., st_global("individuals"))
paninfo = panelsetup(panvar, 1)
npanels = panelstats(paninfo)[1]
lnfj = J(npanels, 1, 0)
for(n=1; n <= npanels; ++n) {
x_n = panelsubmatrix(X, n, paninfo)
y_n = panelsubmatrix(Y, n, paninfo)
// Linear utility
U = rowsum(b:* x_n)
// exp() Utility
U_exp = exp(U)
// Probability
p_i = colsum((U_exp :* y_n)) / colsum(U_exp)
// log probability
lnfj[n] = ln(p_i)
}
// Sum over individuals (n) to obtain loglikeli
lnf = moptimize_util_sum(M, lnfj)
}
end
ml model d0 myclogit_eval_d0() (myclogit: choice = x*, nocons)
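Before maximizing, it is usually worth running -ml-'s built-in diagnostics: ml check exercises the evaluator at arbitrary parameter values and reports whether it returns a valid log likelihood, and ml search looks for better starting values.

```
ml check     // verify the evaluator can be evaluated at arbitrary b
ml search    // look for better initial values before ml maximize
```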
Results: MyClogit vs old clogit (still kicking!)
MyClogit
Now that we have declared the ml model using myclogit_eval_d0(), we can fit it by simply typing ml maximize.
. ml maximize
initial: log likelihood = -3218.8758
alternative: log likelihood = -3208.8925
rescale: log likelihood = -2783.5704
Iteration 0: log likelihood = -2783.5704
Iteration 1: log likelihood = -1690.7307
Iteration 2: log likelihood = -1583.515
Iteration 3: log likelihood = -1582.7268
Iteration 4: log likelihood = -1582.7261
Iteration 5: log likelihood = -1582.7261
Number of obs = 10,000
Wald chi2(2) = 1250.08
Log likelihood = -1582.7261 Prob > chi2 = 0.0000
------------------------------------------------------------------------------
choice | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
x1 | .5137069 .0319027 16.10 0.000 .4511788 .576235
x2 | 1.980155 .0567223 34.91 0.000 1.868981 2.091328
------------------------------------------------------------------------------
-clogit- (old but still kicking!)
Here is the same model fitted using -clogit-; as you can see, the results are equivalent.
clogit choice x*, group(id)
Iteration 0: log likelihood = -1626.3245
Iteration 1: log likelihood = -1582.7827
Iteration 2: log likelihood = -1582.7261
Iteration 3: log likelihood = -1582.7261
Conditional (fixed-effects) logistic regression
Number of obs = 10,000
LR chi2(2) = 3272.30
Prob > chi2 = 0.0000
Log likelihood = -1582.7261 Pseudo R2 = 0.5083
------------------------------------------------------------------------------
choice | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
x1 | .5137069 .0319027 16.10 0.000 .4511788 .576235
x2 | 1.980154 .0567223 34.91 0.000 1.868981 2.091328
------------------------------------------------------------------------------
References
- Gould, W., Pitblado, J., & Sribney, W. (2006). Maximum Likelihood Estimation with Stata. Stata Press.
- Train, K. E. (2009). Discrete Choice Methods with Simulation. Cambridge University Press.