Are my labels there? Searching among label variables values on Stata
Stata ENGThe problem
This brief post answer the question my orignal google search How to check if a value is contained into a label variable on stata?
Imagine we have a problem where we need to check if a string is contained into the labels of another variable. Let’s start with a minimal example using sysuse auto and the variable foreing
which is the only one with labels already in place.
clear all
sysuse auto, clear
tab foreign
tab foreign ,nola
Car type | Freq. Percent Cum.
------------+-----------------------------------
Domestic | 52 70.27 70.27
Foreign | 22 29.73 100.00
------------+-----------------------------------
Total | 74 100.00
The solution
The simple answer to the question is contained in the following lines. For example, if we search for an existing label like Domestic. We can see that 52 observations present such a label.
sum foreign if foreign =="Domestic":`: value label foreign'
Variable | Obs Mean Std. Dev. Min Max
-------------+---------------------------------------------------------
foreign | 52 0 0 0 0
However, if we seach for a label that is not present we obtain the following.
sum foreign if foreign =="Borges":`: value label foreign'
(value label dereference "Borges":origin not found)
Variable | Obs Mean Std. Dev. Min Max
-------------+---------------------------------------------------------
foreign | 0
Up to here the question is already answered, it was just a combination of summarize and a clever way to search among labels.
The explanation
Well, let’s start breaking down the syntax used above in small pieces. Take a look on the following lines. Fist, label list
will list all the existent labels, where the only existent one is called origin
. Therefore, when we repeate the summarize operation replacing `: value label foreign'
by origin
we, indeed, obtain the same results, meaning that learn that the the expression among quotes is no more than a way to call the name of the label attached to the variable foreign
. To know more about such expressions that calculates results on the fly (N.Cox 2008)
label list
origin:
0 Domestic
1 Foreign
. sum foreign if foreign =="Domestic":origin
Variable | Obs Mean Std. Dev. Min Max
-------------+---------------------------------------------------------
foreign | 52 0 0 0 0
The fancy solution
Now let’s write a program
that can check the existence of a string among the labels of a variable for us. We will use a combination of summarize
, capture
and confirm
.
The following program (check_labels
) will check if the string that we type on the option , label()
is present or not in the labels of a variable.
syntax varlist(max=1) [if] , label(string)
is declaring that the program will just allows one variable and one mandatory option called label()
that is going to be read as a string by the program.
After, we use quietly sum
calling the options from syntax
in the same fashion as the first example, and adding the option ,nomean
to speed up the process because it won’t compute the variance (probably a better name could be novariance
tho).
Finally, we use confirm
to check if the value was or nor present into the label. We will get a 7
when it’s not present and a 0
when the label is actually present storage in _rc
. See help confirm. After we can storage into a scalar called present
a value of 1 when the the string was present and 0 if it wasn’t.
capture drop program check_labels
program define check_labels ,rclass
version 12.0
syntax varlist(max=1) [if] , label(string)
quietly sum `varlist' if `varlist' == "`label'": `: value label `varlist'' ,nomean
local mean_altern = r(mean)
capture confirm number `mean_altern'
if _rc ==7 {
di in red " `label' isn't present on labels of var `varlist' "
return scalar present = 0
}
else if _rc ==0 {
di in red " `label' is present on labels of var `varlist' "
return scalar present = 1
}
end
To conclude, using can invoke the program check_labels
as follows.
check_labels foreign , label(Domestic)
Domestic is present on labels of var foreign
ret list
scalars:
r(present) = 1
check_labels foreign , label(Borges)
Borges isn't present on labels of var foreign
ret list
scalars:
r(present) = 0
Resourses that inspired me to post this.