bolasso 0.4.0
CRAN release: 2025-10-14
- Allows the user to extract the bootstrap indices with bootstrap_samples(). Fixes #12

      library(bolasso)
      model <- bolasso(mpg ~ hp + wt, data = mtcars, n.boot = 10)
      bootstrap_samples(model)

- Fixes #13
bolasso 0.3.0
CRAN release: 2024-12-08
New Features
- Fast Estimation Mode:
  - bolasso() gains a fast argument, which speeds up computation by running a single cross-validated regression on the entire dataset to determine the optimal regularization parameter (lambda). This bypasses cross-validation within each bootstrap replicate and substantially reduces computation time, particularly for large datasets or a high number of bootstrap replicates.

        # Fast mode reduces computation time by using a single cross-validated lambda
        model_fast <- bolasso(
          diabetes ~ .,
          data = train,
          n.boot = 1000,
          progress = FALSE,
          family = "binomial",
          fast = TRUE
        )
- Enhanced Variable Selection Methods:
  - selected_vars() is now a shorthand for selected_variables().
  - selected_variables() / selected_vars() supports two variable selection algorithms via the method argument: the Variable Inclusion Probability (VIP) method and the Quantile (QNT) method. The VIP method selects variables that appear in a high percentage of bootstrap models, while the QNT method selects variables based on bootstrap confidence intervals. Set method = "vip" or method = "qnt", respectively.

        # Select variables using the VIP method with a 95% threshold
        selected_vars_vip <- selected_variables(model, threshold = 0.95, method = "vip")

        # Select variables using the QNT method
        selected_vars_qnt <- selected_variables(model, threshold = 0.95, method = "qnt")
- Tidy Method for Bolasso Objects:
- Variable Selection Visualization:
  - plot_selection_thresholds() provides a visual representation of the selection thresholds for each variable. This visualization helps users understand the stability and robustness of variable selection across different thresholds and methods.

        # Visualize selection thresholds for variables
        plot_selection_thresholds(model, select = "lambda.min")
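The difference between the two selection methods described above can be illustrated with a small base R sketch. This is not bolasso's implementation: the coefficient matrix below is made up by hand, and the reading of QNT as "the central bootstrap interval excludes zero" is an assumption based only on the description in this changelog.

```r
# Hypothetical bootstrap coefficients: rows are replicates, columns are variables.
# These values are constructed by hand purely for illustration.
coefs <- cbind(
  strong     = seq(1.5, 2.5, length.out = 100),              # always nonzero, always positive
  sometimes  = c(rep(0, 30), seq(0.1, 1, length.out = 70)),  # dropped in 30% of replicates
  mixed_sign = seq(-1, 1, length.out = 100)                  # always nonzero, sign flips
)
threshold <- 0.95

# VIP-style rule: keep variables with a nonzero coefficient in at least
# `threshold` of the bootstrap replicates.
vip <- colMeans(coefs != 0)
selected_vip <- names(vip)[vip >= threshold]

# QNT-style rule (assumed reading): keep variables whose central
# `threshold` quantile interval does not contain zero.
alpha <- 1 - threshold
bounds <- apply(coefs, 2, quantile, probs = c(alpha / 2, 1 - alpha / 2))
selected_qnt <- colnames(coefs)[bounds[1, ] > 0 | bounds[2, ] < 0]

print(selected_vip)
print(selected_qnt)
```

In this toy data the mixed_sign column passes the inclusion-frequency rule but not the interval rule, which is the kind of case where the two methods can disagree.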
Improvements
- Plotting Coefficient Distributions:
  - plot_selected_variables() visualizes the coefficient distributions for only the selected variables. This function provides a focused view of the most relevant variables in the model.

        # Plot coefficient distributions for selected variables
        plot_selected_variables(
          model,
          threshold = 0.95,
          method = "vip",
          select = "lambda.min"
        )

  - plot() visualizes the coefficient distributions for all model covariates.

        # Plot coefficient distributions for all model covariates
        plot(model, select = "lambda.min")
bolasso 0.2.0
CRAN release: 2022-05-09
- Added a NEWS.md file to track changes to the package.
- bolasso() argument form has been renamed to formula to reflect common naming conventions in R statistical modeling packages.
- predict() and coef() methods are now implemented using future.apply::future_lapply, allowing predictions and coefficients to be computed in parallel. This may result in slightly worse performance (due to memory overhead) when the model or prediction data is small, but will be significantly faster when, for example, generating predictions on a very large dataset.
- Solved an issue with the bolasso() argument formula. The user-supplied value of formula is handled via deparse(), which has a default width.cutoff value of 60. This caused long formulas to be split into multi-element character vectors. width.cutoff is now set to its maximum value of 500L, which keeps formulas of practical length in a single string.
- predict() now forces evaluation of the formula argument in the bolasso() call. This resolves an issue where, if a user passes a formula via a variable, predict() would pass the variable name to the underlying prediction function instead of the actual formula.
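Both formula issues above can be reproduced in base R. The sketch below is only an illustration of the underlying behavior, not bolasso's code: fit_stub() is a hypothetical stand-in for a modeling function that stores its call, and the variable names are invented.

```r
# A formula long enough to exceed deparse()'s default width.cutoff of 60
long_formula <- reformulate(paste0("predictor_", 1:12), response = "outcome")

# With the default cutoff the formula is split into several strings;
# with width.cutoff = 500L it comes back as a single string.
default_chunks <- deparse(long_formula)
single_chunk <- deparse(long_formula, width.cutoff = 500L)
print(length(default_chunks))
print(length(single_chunk))

# match.call() records each argument as it was written, not its value.
# fit_stub() is a hypothetical stand-in for a function that stores its call.
fit_stub <- function(formula) list(call = match.call())

user_formula <- outcome ~ predictor_1
fit <- fit_stub(user_formula)

# The stored argument is the symbol `user_formula`, not a formula object ...
print(is.name(fit$call$formula))
# ... and evaluating it recovers the formula the user intended.
print(inherits(eval(fit$call$formula), "formula"))
```

Code that later reuses the stored call needs the evaluated formula rather than the symbol, which is why forcing evaluation matters when the formula arrives via a variable.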
