Consideration sets, intentions and the inclusion of "don't know" in a two-stage model for voter choice
We present a statistical model for voter choice that incorporates a consideration set stage and a final vote intention stage. The first stage uses a multivariate probit (MVP) model to describe the probability that each candidate or party is considered. The second stage is a multinomial probit (MNP) model for the actual choice. In both stages, the explanatory variables are the voter's choice at the previous election and sociodemographic respondent characteristics. Importantly, our model explicitly accounts for the three types of "missing data" encountered in polling. First, we include a no-vote option in the final vote intention stage. Second, the "do not know" (DNK) response is either assumed to arise from too little difference in utility between the two most preferred options in the consideration set, or treated as a missing observation. Third, the "do not want to say" (DNWTS) response is modeled as a missing observation on the most preferred alternative in the consideration set. Thus, we consider the mechanism that generates the missing data to be nonignorable and build a model based on utility maximization to describe the vote intentions of these respondents. We illustrate the merits of the model using a sample of about 5000 individuals from the Netherlands for whom we know how they voted last time (if at all), which parties they would consider for the upcoming election, and what their vote intention is. A unique feature of the data set is that it records actual individual voting behavior, measured on the day of the election. We find that including the consideration set stage enables more precise inference on the competitive structure in the political domain and yields better out-of-sample forecasts.
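The response mechanism described above can be illustrated with a minimal sketch. It is not the estimation procedure of the model: the latent values are taken as given rather than drawn from MVP and MNP error distributions, and the names `consideration_set`, `stated_intention`, `dnk_threshold` and the label `NO_VOTE` are hypothetical. The sketch also assumes that the no-vote option is always available and is ranked alongside the considered parties, and that a DNK answer is reported when the utility gap between the two best options falls below a fixed threshold.

```python
from typing import Dict, List

NO_VOTE = "no vote"  # hypothetical label for the no-vote option


def consideration_set(latent_consideration: Dict[str, float]) -> List[str]:
    """Stage 1 (probit-style rule): a party is considered when its
    latent consideration value is positive."""
    return [party for party, z in latent_consideration.items() if z > 0.0]


def stated_intention(
    latent_consideration: Dict[str, float],
    choice_utility: Dict[str, float],
    dnk_threshold: float,
) -> str:
    """Stage 2: pick the highest-utility option among the considered
    parties and the no-vote option. Report "DNK" when the two best
    options are closer in utility than dnk_threshold (an assumed,
    fixed threshold)."""
    options = consideration_set(latent_consideration) + [NO_VOTE]
    ranked = sorted(options, key=lambda o: choice_utility[o], reverse=True)
    if len(ranked) >= 2:
        gap = choice_utility[ranked[0]] - choice_utility[ranked[1]]
        if gap < dnk_threshold:
            return "DNK"
    return ranked[0]


# Illustrative latent values for one respondent (made up for this sketch).
z = {"A": 0.8, "B": 0.3, "C": -0.5}                  # C is not considered
u = {"A": 1.2, "B": 1.1, "C": 2.0, NO_VOTE: -1.0}    # C has the highest utility

print(stated_intention(z, u, dnk_threshold=0.05))  # A
print(stated_intention(z, u, dnk_threshold=0.25))  # DNK
```

Party C has the highest choice utility in this example but is never chosen because it fails the consideration stage, which is the kind of competitive structure a single-stage choice model cannot represent.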