| Title: | Working with Choice Data |
|---|---|
| Description: | Offers a set of objects tailored to simplify working with choice data. It enables the computation of choice probabilities and the likelihood of various types of choice models based on given data. |
| Authors: | Lennart Oelschläger [aut, cre] (ORCID: <https://orcid.org/0000-0001-5421-9313>) |
| Maintainer: | Lennart Oelschläger <[email protected]> |
| License: | GPL (>= 3) |
| Version: | 0.1.0 |
| Built: | 2026-05-07 16:25:26 UTC |
| Source: | https://github.com/loelschlaeger/choicedata |
The choice_alternatives object defines the set of choice alternatives.
    choice_alternatives(
      J = 2,
      alternatives = LETTERS[1:J],
      base = NULL,
      ordered = FALSE
    )

    ## S3 method for class 'choice_alternatives'
    print(x, ...)
J |
[ |
alternatives |
[ |
base |
[ If |
ordered |
[ When Otherwise, they are sorted alphabetically. |
x |
[ |
... |
Currently not used. |
An object of class choice_alternatives, i.e. a character vector of the
choice alternatives with attributes:
J: The number of choice alternatives.
base: The name of the base alternative.
ordered: Do the alternatives encode an inherent ordering?
The full set of coefficients for covariates that are constant across
alternatives (including alternative-specific constants) is not identified.
To achieve identifiability, the coefficient of alternative base
is fixed to zero. The other coefficients then have to be interpreted with
respect to base. The base alternative is marked with a * when
printing a choice_alternatives object.
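The identification argument can be checked numerically. The base R sketch below is not part of the package. It assumes a plain multinomial logit formula for simplicity and uses hypothetical ASC values to show that shifting all alternative-specific constants by the same amount leaves the choice probabilities unchanged, which is why one coefficient can be fixed to zero.

```r
# Softmax choice probabilities from a vector of utilities (logit assumption).
softmax <- function(v) {
  e <- exp(v - max(v))  # subtract the maximum for numerical stability
  e / sum(e)
}

# Hypothetical ASCs for three alternatives; "gas" plays the role of base.
asc <- c(gas = 0.4, electricity = 1.1, oil = -0.3)

p_original <- softmax(asc)
p_shifted <- softmax(asc + 5)               # same constant added to every ASC
p_normalized <- softmax(asc - asc["gas"])   # base coefficient fixed to zero

# All three parameterizations imply the same probabilities.
print(rbind(p_original, p_shifted, p_normalized))
```

After normalization, the remaining values are differences relative to the base alternative, which matches the interpretation described above.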
    choice_alternatives(
      J = 3,
      alternatives = c("gas", "electricity", "oil"),
      base = "gas"
    )
The choice_covariates object defines the choice model covariates.
generate_choice_covariates() samples covariates.
covariate_names() gives the covariate names for given choice_effects.
design_matrices() builds design matrices.
    choice_covariates(
      data_frame,
      format = "wide",
      column_decider = "deciderID",
      column_occasion = NULL,
      column_alternative = NULL,
      column_ac_covariates = NULL,
      column_as_covariates = NULL,
      delimiter = "_",
      cross_section = is.null(column_occasion)
    )

    generate_choice_covariates(
      choice_effects = NULL,
      choice_identifiers = generate_choice_identifiers(N = 100),
      labels = covariate_names(choice_effects),
      n = nrow(choice_identifiers),
      marginals = list(),
      correlation = diag(length(labels)),
      verbose = FALSE,
      delimiter = "_"
    )

    covariate_names(choice_effects)

    design_matrices(
      x,
      choice_effects,
      choice_identifiers = extract_choice_identifiers(x)
    )
data_frame |
[ |
format |
[ |
column_decider |
[ |
column_occasion |
[ |
column_alternative |
[ |
column_ac_covariates |
[ |
column_as_covariates |
[ |
delimiter |
[ |
cross_section |
[ |
choice_effects |
[ |
choice_identifiers |
[ |
labels |
[ |
n |
[ |
marginals |
[ Each list entry must be named according to a regressor label, and the following distributions are currently supported:
|
correlation |
[ |
verbose |
[ |
x |
A |
A tibble.
A covariate design matrix contains the choice covariates of a decider at a
choice occasion. It is of dimension J x P, where J is
the number of choice alternatives and P the number of effects. See
compute_P to compute the number P.
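To make the J x P shape concrete, the following base R sketch builds one such matrix by hand for a formula of the form choice ~ price | income | comfort with J = 3. It does not call design_matrices(). The covariate values, the column names, the column order, and the choice of "A" as base alternative are all assumptions made for illustration, and the package may arrange the effects differently.

```r
alternatives <- c("A", "B", "C")
J <- length(alternatives)

# Hypothetical covariate values for one decider at one choice occasion.
price <- c(A = 2.0, B = 3.5, C = 1.0)   # alternative-specific covariate
income <- 40                             # constant across alternatives
comfort <- c(A = 1, B = 0, C = 2)        # alternative-specific covariate

# Indicator columns for the non-base alternatives B and C (base is A).
non_base <- diag(J)[, -1, drop = FALSE]

X <- cbind(
  price,               # one coefficient shared by all alternatives
  income * non_base,   # one income coefficient per non-base alternative
  non_base,            # ASCs for the non-base alternatives
  diag(comfort)        # one comfort coefficient per alternative
)
dimnames(X) <- list(
  alternatives,
  c("price", "income_B", "income_C", "ASC_B", "ASC_C",
    "comfort_A", "comfort_B", "comfort_C")
)

print(X)
print(dim(X))  # J rows and, under these assumptions, P = 8 columns
```

Multiplying such a matrix by a coefficient vector of length P gives the J systematic utilities for that decider and occasion.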
    choice_effects <- choice_effects(
      choice_formula = choice_formula(
        formula = choice ~ price | income | comfort,
        error_term = "probit",
        random_effects = c(
          "price" = "cn",
          "income" = "cn"
        )
      ),
      choice_alternatives = choice_alternatives(J = 3)
    )
    ids <- generate_choice_identifiers(N = 3, Tp = 2)
    choice_covariates <- generate_choice_covariates(
      choice_effects = choice_effects,
      choice_identifiers = ids
    )
The choice_data object defines the choice data. It combines
choice_responses and choice_covariates.
    choice_data(
      data_frame,
      format = "wide",
      column_choice = "choice",
      column_decider = "deciderID",
      column_occasion = NULL,
      column_alternative = NULL,
      column_ac_covariates = NULL,
      column_as_covariates = NULL,
      delimiter = "_",
      cross_section = is.null(column_occasion),
      choice_type = c("discrete", "ordered", "ranked")
    )

    generate_choice_data(
      choice_effects,
      choice_identifiers = generate_choice_identifiers(N = 100),
      choice_covariates = NULL,
      choice_parameters = NULL,
      choice_preferences = NULL,
      column_choice = "choice",
      choice_type = c("auto", "discrete", "ordered", "ranked")
    )

    long_to_wide(
      data_frame,
      column_ac_covariates = NULL,
      column_as_covariates = NULL,
      column_choice = "choice",
      column_alternative = "alternative",
      column_decider = "deciderID",
      column_occasion = NULL,
      alternatives = unique(data_frame[[column_alternative]]),
      delimiter = "_",
      choice_type = c("discrete", "ordered", "ranked")
    )

    wide_to_long(
      data_frame,
      column_choice = "choice",
      column_alternative = "alternative",
      alternatives = NULL,
      delimiter = "_",
      choice_type = c("discrete", "ordered", "ranked")
    )
data_frame |
[ |
format |
[ |
column_choice |
[ |
column_decider |
[ |
column_occasion |
[ |
column_alternative |
[ |
column_ac_covariates |
[ |
column_as_covariates |
[ |
delimiter |
[ |
cross_section |
[ |
choice_type |
[ |
choice_effects |
[ |
choice_identifiers |
[ |
choice_covariates |
[ |
choice_parameters |
[ |
choice_preferences |
[ |
alternatives |
[ |
choice_data() acts as the main entry point for observed data. It accepts
either long or wide layouts and performs validation before
returning a tidy tibble with consistent identifiers. Columns that refer to
the same alternative are aligned using delimiter so that downstream helpers
can detect them automatically. When used with ranked or ordered choices the
function checks that rankings are complete and warns about inconsistencies.
Internally the helper converts long inputs to wide format. This guarantees that subsequent steps (such as computing probabilities) receive the same structure regardless of the original layout and keeps the workflow concise.
generate_choice_data() simulates choice data.
wide_to_long() and long_to_wide() transform between wide and long format.
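The difference between the two layouts can be illustrated without the package. The base R sketch below pivots a small, made-up long data set (one row per decider and alternative) into a wide one (one row per decider). The data values and the naming scheme "covariate" + delimiter + "alternative" are assumptions for illustration, and the code is not the implementation used by long_to_wide().

```r
# Hypothetical long-format data: one row per decider and alternative.
long <- data.frame(
  deciderID = rep(1:2, each = 3),
  alternative = rep(c("A", "B", "C"), times = 2),
  price = c(1.0, 2.0, 3.0, 1.5, 2.5, 3.5),
  choice = c(1, 0, 0, 0, 0, 1)   # indicator of the chosen alternative
)

delimiter <- "_"
alternatives <- unique(long$alternative)

# Start with one row per decider and add one price column per alternative.
wide <- data.frame(deciderID = unique(long$deciderID))
for (alt in alternatives) {
  rows <- long[long$alternative == alt, ]
  idx <- match(wide$deciderID, rows$deciderID)
  wide[[paste("price", alt, sep = delimiter)]] <- rows$price[idx]
}

# Collapse the choice indicator into the name of the chosen alternative.
chosen <- long[long$choice == 1, ]
wide$choice <- chosen$alternative[match(wide$deciderID, chosen$deciderID)]

print(wide)
```

In the wide layout, the shared delimiter is what allows columns belonging to the same covariate to be recognized and matched to alternatives.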
The generated choice_data object inherits a choice_type attribute for
the requested simulation mode. Ordered alternatives (ordered = TRUE)
yield ordered responses, unordered alternatives default to discrete
multinomial outcomes, and ranked simulations return complete rankings for
every observation.
A tibble that inherits from choice_data.
choice_responses(), choice_covariates(), and choice_identifiers() for
the helper objects that feed into choice_data().
    ### simulate data from a multinomial probit model
    choice_effects <- choice_effects(
      choice_formula = choice_formula(
        formula = choice ~ A | B,
        error_term = "probit",
        random_effects = c("A" = "cn")
      ),
      choice_alternatives = choice_alternatives(J = 3)
    )
    generate_choice_data(choice_effects)

    ### transform between long/wide format
    long_to_wide(
      data_frame = travel_mode_choice,
      column_alternative = "mode",
      column_decider = "individual"
    )
    wide_to_long(
      data_frame = train_choice
    )
This function constructs an object of class choice_effects, which
defines the effects of a choice model.
    choice_effects(
      choice_formula,
      choice_alternatives,
      choice_data = NULL,
      delimiter = "_"
    )

    ## S3 method for class 'choice_effects'
    print(x, ...)
choice_formula |
[ |
choice_alternatives |
[ |
choice_data |
[ Required to resolve data-dependent elements in |
delimiter |
[ |
x |
[ |
... |
Currently not used. |
A choice_effects object, which is a data.frame, where each row
is a model effect, and columns are
"effect_name", the name for the effect which is composed of
covariate and alternative name,
"generic_name", the generic effect name "beta_<effect number>",
"covariate", the (transformed) covariate name connected to the effect,
"alternative", the alternative name connected to the effect (only
if the effect is alternative-specific),
"as_covariate", indicator whether the covariate is alternative-specific,
"as_effect", indicator whether the effect is alternative-specific,
"mixing", a factor with levels in the order
"cn" (correlated normal distribution),
indicating the type of random effect.
For identification, the choice effects are ordered according to the following rules:
Non-random effects come before random effects.
Random effects are ordered according to the levels of the factor mixing.
Otherwise, the order is determined by occurrence in formula.
The object stores the arguments choice_formula, choice_alternatives, and
delimiter as attributes.
    choice_effects(
      choice_formula = choice_formula(
        formula = choice ~ price | income | I(comfort == 1),
        error_term = "probit",
        random_effects = c(
          "price" = "cn",
          "income" = "cn"
        )
      ),
      choice_alternatives = choice_alternatives(J = 3)
    )
The choice_formula object defines the choice model equation.
    choice_formula(formula, error_term = "probit", random_effects = character())

    ## S3 method for class 'choice_formula'
    print(x, ...)
formula |
[ |
error_term |
[
|
random_effects |
[ Current options for
To have random effects for the ASCs, use |
x |
[ |
... |
Currently not used. |
An object of class choice_formula, which is a list of the elements:
formula: The model formula.
error_term: The name of the model's error term specification.
choice: The name of the response variable.
covariate_types: The (up to) three different types of covariates.
ASC: Does the model have ASCs?
random_effects: The names of covariates with random effects.
The structure of formula is choice ~ A | B | C, i.e., a standard
formula object but with three parts on the right-hand
side, separated by |, where
choice is the name of the discrete response variable,
A are names of alternative-specific covariates with
a coefficient that is constant across alternatives,
B are names of covariates that are constant across
alternatives,
and C are names of alternative-specific covariates
with alternative-specific coefficients.
The following rules apply:
By default, intercepts (referred to as alternative-specific
constants, ASCs) are added to the model. They can be removed by adding
+ 0 in the second part, e.g., choice ~ A | B + 0 | C. To not include
any covariates of the second type but to estimate ASCs, add 1 in the
second part, e.g., choice ~ A | 1 | C. The expression
choice ~ A | 0 | C is interpreted as no covariates of the second type and
no ASCs.
To exclude covariates of a given type, add 0 in the respective
part, e.g., choice ~ 0 | B | C.
Some parts of the formula can be omitted when there is no ambiguity.
For example, choice ~ A is equivalent to choice ~ A | 1 | 0.
Multiple covariates in one part are separated by a + sign, e.g.,
choice ~ A1 + A2.
Arithmetic transformations of covariates in all three parts of the
right-hand side are possible via the function I(), e.g.,
choice ~ I(A1^2 + A2 * 2). In this case, a random effect can be defined
for the transformed covariate, e.g.,
random_effects = c("I(A1^2 + A2 * 2)" = "cn").
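The rules above can be mimicked with a few lines of base R. The helper below, split_formula_parts(), is a hypothetical name and not a function of the package. It is a minimal sketch that splits the right-hand side at | and pads omitted parts so that choice ~ A is read as choice ~ A | 1 | 0. It assumes that | does not occur inside a covariate expression.

```r
# Split a three-part choice formula into its parts A, B, and C.
# Hypothetical helper for illustration; not the package's parser.
split_formula_parts <- function(formula) {
  rhs <- paste(deparse(formula[[3]]), collapse = " ")
  parts <- trimws(strsplit(rhs, "|", fixed = TRUE)[[1]])
  # Omitted parts: second part defaults to "1" (ASCs only), third to "0".
  defaults <- c("0", "1", "0")
  if (length(parts) < 3) {
    parts <- c(parts, defaults[(length(parts) + 1):3])
  }
  names(parts) <- c("A", "B", "C")
  parts
}

print(split_formula_parts(choice ~ A))
print(split_formula_parts(choice ~ A1 + A2 | B + 0 | C))
print(split_formula_parts(choice ~ I(A1^2 + A2 * 2) | 1 | 0))
```

The transformed covariate is returned as the string I(A1^2 + A2 * 2), which is the label a random effect for that covariate would refer to.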
    choice_formula(
      formula = choice ~ I(A^2 + 1) | B | I(log(C)),
      error_term = "probit",
      random_effects = c("I(A^2+1)" = "cn", "B" = "cn")
    )
The choice_identifiers object defines identifiers for the deciders and
choice occasions.
generate_choice_identifiers() generates identifiers.
extract_choice_identifiers() extracts choice identifiers.
    choice_identifiers(
      data_frame,
      format = "wide",
      column_decider = "deciderID",
      column_occasion = "occasionID",
      cross_section = FALSE
    )

    generate_choice_identifiers(
      N = length(Tp),
      Tp = 1,
      column_decider = "deciderID",
      column_occasion = "occasionID"
    )

    extract_choice_identifiers(
      x,
      format = attr(x, "format"),
      column_decider = attr(x, "column_decider"),
      column_occasion = attr(x, "column_occasion"),
      cross_section = attr(x, "cross_section")
    )
data_frame |
[ |
format |
[ In the long case, unique combinations of |
column_decider |
[ |
column_occasion |
[ |
cross_section |
[ |
N |
[ |
Tp |
[ Can also be of length |
x |
An object of class
|
An object of class choice_identifiers, which is a tibble with columns:
column_decider contains the decider identifiers,
column_occasion contains the choice occasion identifiers (only if
column_occasion is not NULL and cross_section = FALSE).
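The structure of such an identifier table can be sketched in base R. The function make_identifiers() below is a hypothetical stand-in, not the package's generator. It returns a plain data.frame rather than a tibble and assumes that a scalar Tp is recycled across all N deciders.

```r
# Build decider and occasion identifiers for N deciders, where decider n
# faces Tp[n] choice occasions. Hypothetical helper for illustration.
make_identifiers <- function(N = length(Tp), Tp = 1) {
  if (length(Tp) == 1) Tp <- rep(Tp, N)   # same number of occasions for all
  data.frame(
    deciderID = rep(seq_len(N), times = Tp),
    occasionID = sequence(Tp)             # 1, ..., Tp[n] within each decider
  )
}

print(make_identifiers(N = 2, Tp = 2))    # balanced panel
print(make_identifiers(Tp = c(1, 3)))     # unbalanced panel, N inferred
```

Each row corresponds to one decider at one choice occasion, so the number of rows equals the sum of the entries of Tp.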
    ### panel case
    generate_choice_identifiers(N = 2, Tp = 2)

    ### cross-sectional case
    generate_choice_identifiers(N = 5, column_occasion = NULL)

    ### read choice identifiers
    choice_identifiers(
      data_frame = travel_mode_choice,
      format = "long",
      column_decider = "individual",
      column_occasion = NULL,
      cross_section = TRUE
    )
These functions prepare and evaluate the likelihood contribution of observed choices for a given choice model.
choice_likelihood() pre-computes the design matrices and choice indices
implied by choice_data and choice_effects. The returned object stores
these quantities so that repeated likelihood evaluations during maximum
likelihood estimation avoid redundant work.
compute_choice_likelihood() evaluates the (log-)likelihood for given
choice_parameters. It can either take the original choice objects or a
pre-computed choice_likelihood object.
    choice_likelihood(
      choice_data,
      choice_effects,
      choice_identifiers = extract_choice_identifiers(choice_data),
      input_checks = TRUE,
      lower_bound = 1e-10,
      ...
    )

    compute_choice_likelihood(
      choice_parameters,
      choice_data,
      choice_effects,
      logarithm = TRUE,
      negative = FALSE,
      lower_bound = 1e-10,
      input_checks = TRUE,
      ...
    )
choice_data |
[ |
choice_effects |
[ |
choice_identifiers |
[ |
input_checks |
[ |
lower_bound |
[ |
... |
Additional arguments passed to |
choice_parameters |
[ |
logarithm |
[ |
negative |
[ |
choice_likelihood() returns an object of class choice_likelihood, which
is a list containing the design matrices, the choice indices, and the
identifiers. compute_choice_likelihood() returns a single numeric value
with the (negative) log-likelihood or likelihood, depending on logarithm
and negative.
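How logarithm and negative interact can be shown with a small base R sketch. The function combine_probabilities() is hypothetical and not part of the package. It assumes that the probabilities of the observed choices are already available and that lower_bound acts as a floor on each probability before taking logarithms, which is an assumption about the argument's role.

```r
# Combine per-observation choice probabilities into a single value.
# Hypothetical helper; the clamping via lower_bound is an assumption.
combine_probabilities <- function(probs, logarithm = TRUE, negative = FALSE,
                                  lower_bound = 1e-10) {
  probs <- pmax(probs, lower_bound)   # avoid log(0) for tiny probabilities
  value <- if (logarithm) sum(log(probs)) else prod(probs)
  if (negative) -value else value
}

probs <- c(0.5, 0.25, 0.8)            # made-up probabilities of observed choices

combine_probabilities(probs)                      # log-likelihood
combine_probabilities(probs, negative = TRUE)     # negative log-likelihood
combine_probabilities(probs, logarithm = FALSE)   # likelihood
```

The negative log-likelihood is the form typically passed to a minimizer such as stats::optim().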
    data(train_choice)
    choice_effects <- choice_effects(
      choice_formula = choice_formula(
        formula = choice ~ price | time,
        error_term = "probit"
      ),
      choice_alternatives = choice_alternatives(
        J = 2,
        alternatives = c("A", "B")
      )
    )
    choice_data <- choice_data(
      data_frame = train_choice,
      format = "wide",
      column_choice = "choice",
      column_decider = "deciderID",
      column_occasion = "occasionID"
    )
    likelihood <- choice_likelihood(
      choice_data = choice_data,
      choice_effects = choice_effects
    )
    choice_parameters <- generate_choice_parameters(choice_effects)
    compute_choice_likelihood(
      choice_parameters = choice_parameters,
      choice_data = likelihood,
      choice_effects = choice_effects,
      logarithm = TRUE
    )
These functions construct, validate, and transform an object of class
choice_parameters, which defines the parameters of a choice model.
choice_parameters() constructs a choice_parameters object.
generate_choice_parameters() samples parameters at random, see details.
validate_choice_parameters() validates a choice_parameters object.
switch_parameter_space() transforms a choice_parameters object between
the interpretation and optimization space, see details.
    choice_parameters(beta = NULL, Omega = NULL, Sigma = NULL, gamma = NULL)

    generate_choice_parameters(
      choice_effects,
      fixed_parameters = choice_parameters()
    )

    validate_choice_parameters(
      choice_parameters,
      choice_effects,
      allow_missing = FALSE
    )

    switch_parameter_space(choice_parameters, choice_effects)
beta |
[ |
Omega |
[ Can be |
Sigma |
[ |
gamma |
[ |
choice_effects |
[ |
fixed_parameters |
[ |
choice_parameters |
[ |
allow_missing |
[ |
An object of class choice_parameters, which is a list with the elements:
beta: The coefficient vector (if any).
Omega: The covariance matrix of random effects (if any).
Sigma: The error term covariance matrix (or variance in ordered models).
gamma: Threshold parameters for ordered models (if any).
Unspecified choice model parameters (if required) are drawn independently from the following distributions:
beta: Drawn from a multivariate normal distribution with zero mean and a diagonal covariance matrix with value 10 on the diagonal.
Omega: Drawn from an Inverse-Wishart distribution with degrees
of freedom equal to P_r + 2 and scale matrix equal to the identity.
Sigma: The first row and column are fixed to 0 for level
normalization. The second diagonal element is fixed to 1 for scale
normalization. The lower right block is drawn from an Inverse-Wishart
distribution with degrees of freedom equal to J + 1 and scale matrix
equal to the identity.
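These sampling rules can be imitated in base R. The sketch below is not the code of generate_choice_parameters(). The dimensions are made up, rinvwishart() is a hypothetical helper built on stats::rWishart(), and the rescaling step that makes the second diagonal element of Sigma equal to 1 is an assumption about how the scale normalization could be enforced.

```r
set.seed(1)
J <- 3      # number of alternatives
P <- 4      # total number of effects (hypothetical)
P_r <- 2    # number of random effects (hypothetical)

# Inverse-Wishart draw via the Wishart sampler in 'stats':
# if W ~ Wishart(df, solve(scale)), then solve(W) ~ Inverse-Wishart(df, scale).
rinvwishart <- function(df, scale) {
  solve(stats::rWishart(1, df = df, Sigma = solve(scale))[, , 1])
}

beta <- rnorm(P, mean = 0, sd = sqrt(10))   # independent N(0, 10) entries
Omega <- rinvwishart(df = P_r + 2, scale = diag(P_r))

# Sigma: first row and column zero, lower right block from Inverse-Wishart.
block <- rinvwishart(df = J + 1, scale = diag(J - 1))
block <- block / block[1, 1]   # assumption: rescale so the (2, 2) entry is 1
Sigma <- matrix(0, J, J)
Sigma[-1, -1] <- block

print(beta)
print(Omega)
print(Sigma)
```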
The switch_parameter_space() function transforms a choice_parameters
object between the interpretation and optimization space.
The interpretation space is a list of (not necessarily identified)
parameters that can be interpreted.
The optimization space is a numeric vector of identified parameters that
can be optimized:
beta is not transformed
the first row and column of Sigma are fixed to 0 for level
normalization and the second diagonal element is fixed to 1 for scale
normalization
the covariance matrices (Omega and Sigma) are transformed to their
vectorized Cholesky factor (diagonal fixed to be positive for uniqueness)
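The Cholesky step can be illustrated with base R. The sketch below is not the package's switch_parameter_space(). It uses a made-up covariance matrix and a hypothetical helper to_cov() to show the round trip between a covariance matrix and the vector of lower-triangular Cholesky elements.

```r
# A hypothetical 2 x 2 covariance matrix of random effects.
Omega <- matrix(c(2.0, 0.6,
                  0.6, 1.5), nrow = 2, byrow = TRUE)

# Lower-triangular Cholesky factor L with Omega = L %*% t(L).
# chol() returns the upper factor with positive diagonal, hence the transpose.
L <- t(chol(Omega))

# Vectorize the lower triangle (column by column, diagonal included).
theta <- L[lower.tri(L, diag = TRUE)]

# Map the vector back to a covariance matrix.
to_cov <- function(theta, dim) {
  L <- matrix(0, dim, dim)
  L[lower.tri(L, diag = TRUE)] <- theta
  L %*% t(L)
}

print(theta)
print(to_cov(theta, dim = 2))   # recovers Omega
```

Working with the Cholesky elements guarantees that every candidate vector maps back to a symmetric, positive semi-definite matrix, which is convenient for unconstrained numerical optimization.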
    ### generate choice parameters at random
    J <- 3
    choice_effects <- choice_effects(
      choice_formula = choice_formula(
        formula = choice ~ A | B,
        error_term = "probit",
        random_effects = c("A" = "cn")
      ),
      choice_alternatives = choice_alternatives(J = J)
    )
    choice_parameters <- generate_choice_parameters(
      choice_effects = choice_effects,
      fixed_parameters = choice_parameters(
        Sigma = diag(c(0, rep(1, J - 1))) # scale and level normalization
      )
    )
The choice_preferences object defines the deciders' preferences in the
choice model.
choice_preferences() constructs a choice_preferences object.
generate_choice_preferences() samples choice preferences at random.
    choice_preferences(data_frame, column_decider = colnames(data_frame)[1])

    generate_choice_preferences(
      choice_effects,
      choice_parameters = NULL,
      choice_identifiers = generate_choice_identifiers(N = 100)
    )
data_frame |
[ |
column_decider |
[ |
choice_effects |
[ |
choice_parameters |
[ |
choice_identifiers |
[ |
An object of class choice_preferences, which is a data.frame with the
deciders' preferences. The column names are the names of the effects in the
choice model. The first column contains the decider identifiers.
    ### generate choice preferences from choice parameters and effects
    choice_effects <- choice_effects(
      choice_formula = choice_formula(
        formula = choice ~ price | income | comfort,
        error_term = "probit",
        random_effects = c(
          "price" = "cn",
          "income" = "cn"
        )
      ),
      choice_alternatives = choice_alternatives(J = 3)
    )
    choice_parameters <- generate_choice_parameters(
      choice_effects = choice_effects
    )
    ids <- generate_choice_identifiers(N = 4)
    (choice_preferences <- generate_choice_preferences(
      choice_parameters = choice_parameters,
      choice_effects = choice_effects,
      choice_identifiers = ids
    ))

    ### inspect decider-specific preference vectors
    head(choice_preferences)
The choice_probabilities object defines the choice probabilities.
compute_choice_probabilities() calculates the choice probabilities based
on the choice parameters and the choice data.
    choice_probabilities(
      data_frame,
      choice_only = TRUE,
      column_decider = "deciderID",
      column_occasion = NULL,
      cross_section = FALSE,
      column_probabilities = if (choice_only) "choice_probability"
    )

    compute_choice_probabilities(
      choice_parameters,
      choice_data,
      choice_effects,
      choice_only = FALSE,
      input_checks = TRUE,
      ...
    )
data_frame |
[ |
choice_only |
[ |
column_decider |
[ |
column_occasion |
[ |
cross_section |
[ |
column_probabilities |
[ If |
choice_parameters |
[ |
choice_data |
[ |
choice_effects |
[ |
input_checks |
[ |
... |
Passed to the underlying probability computation routine. |
A choice_probabilities S3 object (a data frame) that stores additional
metadata in attributes such as column_probabilities, choice_only, and the
identifier columns. These attributes are used by downstream helpers to
reconstruct the original structure.
    data(train_choice)
    choice_effects <- choice_effects(
      choice_formula = choice_formula(
        formula = choice ~ price | time,
        error_term = "probit"
      ),
      choice_alternatives = choice_alternatives(
        J = 2,
        alternatives = c("A", "B")
      )
    )
    choice_parameters <- generate_choice_parameters(choice_effects)
    choice_data <- choice_data(
      data_frame = train_choice,
      format = "wide",
      column_choice = "choice",
      column_decider = "deciderID",
      column_occasion = "occasionID"
    )
    compute_choice_probabilities(
      choice_parameters = choice_parameters,
      choice_data = choice_data,
      choice_effects = choice_effects,
      choice_only = TRUE
    )
The choice_responses object defines the observed discrete responses.
Additional response columns (for example ranked choice indicators) are
preserved so they can be merged with covariates downstream.
generate_choice_responses() simulates choices.
    choice_responses(
      data_frame,
      column_choice = "choice",
      column_decider = "deciderID",
      column_occasion = NULL,
      cross_section = FALSE
    )

    generate_choice_responses(
      choice_effects,
      choice_covariates = generate_choice_covariates(choice_effects = choice_effects),
      choice_parameters = generate_choice_parameters(choice_effects = choice_effects),
      choice_identifiers = extract_choice_identifiers(choice_covariates),
      choice_preferences = generate_choice_preferences(
        choice_parameters = choice_parameters,
        choice_effects = choice_effects,
        choice_identifiers = choice_identifiers
      ),
      column_choice = "choice",
      choice_type = c("auto", "discrete", "ordered", "ranked")
    )
data_frame |
[ |
column_choice |
[ |
column_decider |
[ |
column_occasion |
[ |
cross_section |
[ |
choice_effects |
[ |
choice_covariates |
[ |
choice_parameters |
[ |
choice_identifiers |
[ |
choice_preferences |
[ |
choice_type |
[ |
A data.frame.
    choice_effects <- choice_effects(
      choice_formula = choice_formula(
        formula = choice ~ price | time,
        error_term = "probit"
      ),
      choice_alternatives = choice_alternatives(J = 5)
    )
    generate_choice_responses(choice_effects = choice_effects)
These helper functions compute logit choice probabilities for unordered and
ordered outcomes. Panel inputs reuse the observation-level logit formulae,
which remain valid because the logit error term is independent across
occasions. Latent class models are supported via weighted averages of
class-specific probabilities. When Omega is supplied, the coefficients are
assumed to follow a multivariate normal distribution and the resulting
probabilities are evaluated by averaging over simulation draws.
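The three ingredients named above (the logit formula, class-weighted averaging, and averaging over simulation draws) can be sketched in base R. The functions mnl_probs(), lc_probs(), and mixed_probs() are hypothetical names and do not reproduce choiceprob_logit(). The design matrix and parameters are made up, and the mixed version assumes for simplicity that all coefficients are random with covariance Omega.

```r
# Multinomial logit probabilities for one observation (X is J x P).
mnl_probs <- function(X, beta) {
  v <- as.vector(X %*% beta)
  e <- exp(v - max(v))          # stabilized softmax
  e / sum(e)
}

# Latent classes: weighted average of class-specific probabilities.
lc_probs <- function(X, beta_list, weights) {
  probs <- sapply(beta_list, function(b) mnl_probs(X, b))   # J x classes
  as.vector(probs %*% weights)
}

# Normal mixing: average logit probabilities over simulated coefficients.
mixed_probs <- function(X, beta, Omega, n_draws = 200) {
  L <- t(chol(Omega))
  draws <- replicate(n_draws, {
    b <- beta + as.vector(L %*% rnorm(length(beta)))   # b ~ N(beta, Omega)
    mnl_probs(X, b)
  })
  rowMeans(draws)
}

set.seed(1)
X <- matrix(c(1.0, 0.5,
              2.0, 0.0,
              0.5, 1.5), nrow = 3, byrow = TRUE)   # hypothetical covariates
beta <- c(-1, 0.5)

p_mnl <- mnl_probs(X, beta)
p_lc <- lc_probs(X, list(beta, c(0.2, -0.4)), weights = c(0.7, 0.3))
p_mixed <- mixed_probs(X, beta, Omega = diag(c(0.5, 0.2)))

print(rbind(p_mnl, p_lc, p_mixed))
```

Each row of the printed matrix is a probability vector over the three alternatives and sums to one.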
    choiceprob_logit(
      X,
      y = NULL,
      Tp = NULL,
      beta,
      Omega = NULL,
      gamma = NULL,
      weights = NULL,
      input_checks = TRUE,
      ordered = !is.null(gamma),
      ranked = !ordered && !is.null(y) && length(y) > 0 && length(y[[1]]) > 1,
      panel = !is.null(Tp) && any(Tp > 1),
      lc = !is.null(weights),
      draws = NULL,
      n_draws = 200
    )
X |
[ In the ordered case ( |
y |
[ In the ranked case ( In the non-panel case ( |
Tp |
[ Can be |
beta |
[ In the latent class case ( |
Omega |
[ Can be In the latent class case ( |
gamma |
[ The event |
weights |
[ |
input_checks |
[ |
ordered, ranked, panel, lc
|
[ |
draws |
[ |
n_draws |
[ |
A numeric vector with the choice probabilities for the observed choices when
y is supplied. If y is NULL, a matrix with one row per observation and
one column per alternative is returned.
These helper functions calculate probit choice probabilities for various scenarios:
in the regular (choiceprob_mnp_*), ordered (*_ordered), and ranked (ranked = TRUE) case,
in the normally mixed (choiceprob_mmnp_*) and latent class (*_lc) case,
for panel data (*_panel),
based on the full likelihood (cml = "no"), the full pairwise composite marginal likelihood (cml = "fp"), and the adjacent pairwise composite marginal likelihood (cml = "ap"),
for the observed choices or for all alternatives (if y is NULL).
The function choiceprob_probit() is the general API which calls the
specialized functions and can perform input checks.
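To illustrate the difference between the two pairwise composite marginal likelihood variants, the sketch below enumerates which pairs of choice occasions would enter for one decider. It assumes the usual convention that "full pairwise" uses all pairs of occasions and "adjacent pairwise" only consecutive ones. The helper `cml_pairs()` is hypothetical and not part of the package.

```r
# Pairs of choice occasions for one decider with Tp occasions
# (hypothetical helper, assuming the usual pairwise CML conventions).
cml_pairs <- function(Tp, cml = c("fp", "ap")) {
  cml <- match.arg(cml)
  if (Tp < 2) {
    return(matrix(integer(0), ncol = 2))
  }
  if (cml == "fp") {
    t(combn(Tp, 2))                               # all pairs (s, t), s < t
  } else {
    cbind(seq_len(Tp - 1), seq_len(Tp - 1) + 1L)  # only pairs (t, t + 1)
  }
}

fp <- cml_pairs(4, "fp")  # Tp * (Tp - 1) / 2 pairs
ap <- cml_pairs(4, "ap")  # Tp - 1 pairs
```

With four choice occasions, the full pairwise variant uses six pairs and the adjacent pairwise variant three, which is why the latter scales better with long panels.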
choiceprob_probit(
  X,
  y = NULL,
  Tp = NULL,
  cml = "no",
  beta,
  Omega = NULL,
  Sigma,
  gamma = NULL,
  weights = NULL,
  re_position = utils::tail(seq_along(beta), nrow(Omega)),
  gcdf = pmvnorm_cdf_default,
  lower_bound = 0,
  input_checks = TRUE,
  ordered = !is.null(gamma),
  ranked = if (!ordered && !is.null(y) && isTRUE(length(y) > 0)) {
    length(y[[1]]) > 1
  } else {
    FALSE
  },
  mixed = !is.null(Omega),
  panel = mixed & !is.null(Tp) & any(Tp > 1),
  lc = !is.null(weights)
)

choiceprob_mnp(
  X,
  y,
  beta,
  Sigma,
  gcdf = pmvnorm_cdf_default,
  lower_bound = 0,
  ranked = FALSE
)

choiceprob_mnp_ordered(X, y, beta, Sigma, gamma, lower_bound = 0)

choiceprob_mmnp(
  X,
  y,
  beta,
  Omega,
  Sigma,
  re_position = utils::tail(seq_along(beta), nrow(Omega)),
  gcdf = pmvnorm_cdf_default,
  lower_bound = 0,
  ranked = FALSE
)

choiceprob_mmnp_ordered(
  X,
  y,
  beta,
  Omega,
  Sigma,
  gamma,
  re_position = utils::tail(seq_along(beta), nrow(Omega)),
  lower_bound = 0
)

choiceprob_mmnp_lc(
  X,
  y,
  beta,
  Omega,
  Sigma,
  weights,
  re_position = utils::tail(seq_along(beta[[1]]), nrow(Omega[[1]])),
  gcdf = pmvnorm_cdf_default,
  lower_bound = 0,
  ranked = FALSE
)

choiceprob_mmnp_ordered_lc(
  X,
  y,
  beta,
  Omega,
  Sigma,
  gamma,
  weights,
  re_position = utils::tail(seq_along(beta[[1]]), nrow(Omega[[1]])),
  lower_bound = 0
)

choiceprob_mmnp_panel(
  X,
  y,
  Tp,
  cml,
  beta,
  Omega,
  Sigma,
  re_position = utils::tail(seq_along(beta), nrow(Omega)),
  gcdf = pmvnorm_cdf_default,
  lower_bound = 0,
  ranked = FALSE
)

choiceprob_mmnp_ordered_panel(
  X,
  y,
  Tp,
  cml,
  beta,
  Omega,
  Sigma,
  gamma,
  re_position = utils::tail(seq_along(beta), nrow(Omega)),
  gcdf = pmvnorm_cdf_default,
  lower_bound = 0
)

choiceprob_mmnp_panel_lc(
  X,
  y,
  Tp,
  cml,
  beta,
  Omega,
  Sigma,
  weights,
  re_position = utils::tail(seq_along(beta), nrow(Omega)),
  gcdf = pmvnorm_cdf_default,
  lower_bound = 0,
  ranked = FALSE
)

choiceprob_mmnp_ordered_panel_lc(
  X,
  y,
  Tp,
  cml,
  beta,
  Omega,
  Sigma,
  gamma,
  weights,
  re_position = utils::tail(seq_along(beta), nrow(Omega)),
  gcdf = pmvnorm_cdf_default,
  lower_bound = 0
)
X |
[ In the ordered case ( |
y |
[ In the ranked case ( In the non-panel case ( |
Tp |
[ Can be |
cml |
[ |
beta |
[ In the latent class case ( |
Omega |
[ Can be In the latent class case ( |
Sigma |
[ In the ordered case ( |
gamma |
[ The event |
weights |
[ |
re_position |
[ By default, the last |
gcdf |
[ In the no-panel ( |
lower_bound |
[ |
input_checks |
[ |
ordered, ranked, mixed, panel, lc |
[ |
A numeric vector of length N, the probabilities for the observed
choices y.
In the panel case (panel = TRUE), a numeric vector of length length(Tp),
the probabilities of the observed choice sequences.
If y is NULL in the non-panel case (panel = FALSE), a matrix of
dimension N times J, the probabilities for all alternatives.
In the ranked case (ranked = TRUE), only first place choice probabilities
are computed, which is equivalent to computing choice probabilities in the
regular (maximum utility) model.
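As a rough illustration of the maximum utility model, the sketch below approximates probit choice probabilities for a single observation by simulation: draw correlated normal errors, add them to the systematic utilities, and count how often each alternative has the highest utility. This is not how the package evaluates probabilities (its usage indicates a Gaussian CDF function `gcdf`). The helper `mnp_probs_sim()` is hypothetical, and the sketch assumes an undifferenced `J` x `J` error covariance `Sigma`.

```r
# Simulated multinomial probit probabilities for one observation
# (hypothetical helper, assuming an undifferenced J x J covariance Sigma).
# X: J x P covariate matrix, beta: length-P coefficient vector.
mnp_probs_sim <- function(X, beta, Sigma, n_sim = 1e5) {
  J <- nrow(X)
  v <- as.vector(X %*% beta)                       # systematic utilities
  L <- t(chol(Sigma))                              # Sigma = L %*% t(L)
  eps <- L %*% matrix(rnorm(J * n_sim), nrow = J)  # J x n_sim error draws
  U <- v + eps                                     # utilities per draw
  choice <- max.col(t(U), ties.method = "first")   # alternative with max utility
  tabulate(choice, nbins = J) / n_sim
}

set.seed(1)
X <- matrix(c(1, 0), ncol = 1)  # J = 2 alternatives, P = 1 covariate
beta <- 0.5
Sigma <- diag(2)

p_sim <- mnp_probs_sim(X, beta, Sigma)

# For J = 2, the utility difference is normal, so a closed form is available.
v <- as.vector(X %*% beta)
p_exact <- pnorm((v[1] - v[2]) / sqrt(Sigma[1, 1] + Sigma[2, 2] - 2 * Sigma[1, 2]))
```

In the two-alternative case, the simulated probability of the first alternative should agree with `p_exact` up to simulation noise.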
These helper functions count the number of model effects:
compute_P() returns the total number P of model effects.
compute_P_d() returns the number P_d of non-random effects.
compute_P_r() returns the number P_r of random effects.
compute_P(choice_effects)
compute_P_d(choice_effects)
compute_P_r(choice_effects)
choice_effects |
[ |
An integer, the number of model effects.
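The sketch below shows how the three counts relate. It assumes that every effect is either random or non-random, so that `P = P_d + P_r`, and uses made-up values for `beta` and `Omega`. The expression for `re_position` is the default from the `choiceprob_probit()` usage, which places the random effects at the end of the coefficient vector.

```r
# Hypothetical coefficient vector with P = 5 effects.
beta <- c(0.8, -1.2, 0.3, 1.5, -0.4)

# Hypothetical covariance matrix of P_r = 2 random effects.
Omega <- diag(c(1.0, 0.5))

P <- length(beta)   # total number of effects
P_r <- nrow(Omega)  # number of random effects
P_d <- P - P_r      # number of non-random effects (assumption: P = P_d + P_r)

# Default positions of the random effects: the last P_r entries of beta.
re_position <- utils::tail(seq_along(beta), nrow(Omega))
```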
Data set of 2929 stated choices by 235 Dutch individuals deciding between
two hypothetical train trip options "A" and "B" based on the
price, the travel time, the number of rail-to-rail transfers (changes), and
the level of comfort.
The data were obtained in 1987 by Hague Consulting Group for the National Dutch Railways. Prices were recorded in Dutch guilders and are converted to Euro in this data set at an exchange rate of 2.20371 guilders = 1 Euro.
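The currency conversion amounts to a single division, sketched below. The helper `guilder_to_euro()` is hypothetical and the input prices are made up for illustration.

```r
# Exchange rate stated above: 2.20371 guilders = 1 Euro.
guilder_per_euro <- 2.20371

# Hypothetical helper converting guilder prices to Euro.
guilder_to_euro <- function(price_guilder) {
  price_guilder / guilder_per_euro
}

prices_euro <- guilder_to_euro(c(22.0371, 50))
```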
train_choice
A tibble with 2929 rows and 11 columns:
[integer] The identifier for the decider.
[integer] The identifier for the choice occasion.
[character] The chosen alternative, either "A" or "B".
[numeric] The price for alternative "A" in Euro.
[numeric] The travel time for alternative "A" in hours.
[integer] The number of changes for alternative "A".
[factor] The comfort level for alternative "A", where 0 is the best comfort and 2 the worst.
[numeric] The price for alternative "B" in Euro.
[numeric] The travel time for alternative "B" in hours.
[integer] The number of changes for alternative "B".
[factor] The comfort level for alternative "B", where 0 is the best comfort and 2 the worst.
Ben-Akiva M, Bolduc D, Bradley M (1993). “Estimation of travel choice models with randomly distributed values of time.” Transportation Research Record, 1413.
Data set of revealed choices by 210 travelers between Sydney and Melbourne who reported their choice among the four travel modes plane, train, bus, and car. The data were collected as part of a 1987 intercity mode choice study.
travel_mode_choice
A tibble with 840 rows and 8 columns:
[integer] The identifier for the decider.
[character] The travel mode.
[integer] Whether the mode was chosen.
[integer] The terminal waiting time, 0 for car.
[integer] The travel cost in dollars.
[integer] The travel time in minutes.
[integer] The household income in thousand dollars.
[integer] The traveling group size.
Ben-Akiva M, Bolduc D, Bradley M (1993). “Estimation of travel choice models with randomly distributed values of time.” Transportation Research Record, 1413.