# Lecture Notes: ISYE 6501 Midterm 2
# Week 5 Notes: Variable Selection

What do we do with a lot of factors in our models? Variable selection helps us choose the best factors, and it works for any factor-based model (regression or classification).

Why do we not want a lot of factors in our models?

- Overfitting: when the number of factors is close to (or larger than) the number of data points, the model will overfit.
  - Overfitting means the model captures the random effects in the data instead of the real effects; too many factors has the same result. With few data points, we model too much of the random noise.
  - Overfitting causes bad estimates: with too many factors, the model is influenced too much by the random effects in that particular data set, and with few data points it can even fit completely unrelated variables!
- Simplicity: simpler models are easier to interpret.
  - Collecting data can be expensive; with fewer factors, less data is required to put the model into production.
  - Fewer factors means less chance of including a factor that is meaningless.
  - Easier to explain to others: we want to know "why?", which is hard to do with too many factors, and we need to clearly communicate what the model is doing.

Fewer factors is very beneficial!
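The overfitting point above can be seen in a small simulation. This is a hypothetical sketch (not from the lecture): we fit ordinary regression with 25 factors on only 30 training points, where just 2 factors actually matter, and compare in-sample vs. out-of-sample R^2.

```python
# Hypothetical demo: when the number of factors is close to the number of
# data points, regression fits the random effects of the training data.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)
n_train, n_test, n_factors = 30, 200, 25  # factor count close to n_train

X_train = rng.normal(size=(n_train, n_factors))
X_test = rng.normal(size=(n_test, n_factors))

# Only the first 2 factors are real; the other 23 are pure noise.
true_coef = np.zeros(n_factors)
true_coef[:2] = [3.0, -2.0]
y_train = X_train @ true_coef + rng.normal(size=n_train)
y_test = X_test @ true_coef + rng.normal(size=n_test)

model = LinearRegression().fit(X_train, y_train)
print("train R^2:", r2_score(y_train, model.predict(X_train)))  # near 1
print("test  R^2:", r2_score(y_test, model.predict(X_test)))    # noticeably lower
```

The near-perfect training fit comes from the model absorbing noise through the 23 meaningless factors, which is exactly why it generalizes worse to the test data.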
Building simpler models with fewer factors helps avoid:

- overfitting
- difficulty of interpretation

# Week 5 Notes: Variable Selection Models

Models can automate the variable selection process, and these approaches can be applied to all types of models. Two broad types:

Step-by-step methods:

- Forward selection: start with a model that has no factors.
  - Step by step, add variables and keep a variable only if the model improves; we can limit the model by a number of thresholds.
  - After the model is built up, we can go back and remove any variables that no longer look important once the full model is fit.
  - Judge factors by p-value (e.g. 0.15 for exploration, 0.05 for a final model).
- Backward elimination: start with a model with all factors.
  - Step by step, remove variables that are "bad" based on p-value.
  - Continue until all remaining variables are "good", or until we reach a target number of factors.
  - Judge factors by p-value (0.15 for exploration, 0.05 for a final model).
- Stepwise regression: a combination of forward selection and backward elimination.
  - Start with all or no variables; at each step, add or remove a factor based on some p-value criterion.
  - The model may drop older factors based on what new ones we add.
- We can also use other metrics (AIC, BIC, R^2) to judge "good" variables in any step-by-step method.
- Step-by-step methods are greedy algorithms: each step does whatever looks best right now, without taking future options into account. These are the more "classical" methods.

Newer methods are based on optimization models that look at all possible options at the same time:

- LASSO: add a constraint to the standard regression equation that bounds the coefficients from getting large.
  - Constraint: the sum of the absolute values of the coefficients, sumof(|ai|) <= t.
  - The regression has a budget t to spend on coefficients, so factors that are not important get dragged down to 0.
  - Constraining the coefficients means we need to scale the data beforehand!
  - How do we pick t? It controls the trade-off between the number of variables and the quality of the model:
- Try LASSO with different values of t and choose the one with the best performance.

Elastic Net: a combination of LASSO and ridge regression.

- Constrains a weighted combination of the sum of absolute coefficient values and the sum of squared coefficients.
- Still need to scale the data.
- Ridge regression constrains sumof(ai^2) <= t; without the absolute-value term we have ridge regression.
- These are global approaches to variable selection.

What is the key difference between stepwise and LASSO regression?

- LASSO has a regularization term and requires the data to be scaled beforehand.
- In regression contexts, LASSO needs scaled data: otherwise the size constraint will pick the wrong variables, because the magnitudes of the factors distort the coefficient estimates!

# Week 5 Notes: Choosing a Variable Selection Method

- Greedy variable selection: stepwise methods
- Global optimization: LASSO, ridge, Elastic Net

How do we choose between these methods?

- Stepwise methods: good for exploration and quick analysis; stepwise is the most common. They can give a set of variables that is fit to random effects, so they might not generalize as well to new data.
- Global optimization (LASSO, Elastic Net): slower, but better for prediction.

Regularized regression, LASSO:

- Minimize sumof((yi - (a0 + a1*x1i + a2*x2i + ... + aj*xji))^2) subject to sumof(|ai|) <= t.
- Some coefficients are forced to zero to simplify the model.
- LASSO will take some variables to zero, but they may not b…
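The "try different values of t" idea above can be sketched with scikit-learn. One caveat: scikit-learn solves the equivalent penalized form, min (1/2n)*sum((y - Xb)^2) + alpha*sum(|b_i|), where a larger penalty alpha corresponds to a smaller budget t. The data and variable names here are hypothetical; `LassoCV` sweeps a grid of alphas and picks the one with the best cross-validated performance, matching the note about choosing t by performance.

```python
# Hypothetical sketch: LASSO with scaling and a cross-validated sweep over
# penalty strengths (equivalent to trying different budgets t).
import numpy as np
from sklearn.linear_model import LassoCV
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
n, p = 100, 10
# Columns on wildly different scales, to show why scaling matters.
X = rng.normal(size=(n, p)) * rng.uniform(0.5, 20.0, size=p)
beta = np.zeros(p)
beta[:3] = [2.0, -1.5, 1.0]          # only the first 3 factors are real
y = X @ beta + rng.normal(size=n)

X_scaled = StandardScaler().fit_transform(X)   # scale the data beforehand!
lasso = LassoCV(cv=5).fit(X_scaled, y)         # tries a grid of alphas

print("chosen alpha:", lasso.alpha_)
print("nonzero coefficients:", np.flatnonzero(lasso.coef_))
```

Unimportant factors get dragged toward exactly zero, but (as the note warns) the surviving set is not guaranteed to be precisely the true one, which is why performance on held-out data drives the choice of penalty.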
Uploaded on May 19, 2022