ISYE6501 HOMEWORK 10

Question 14.1

The breast cancer data set breast-cancer-wisconsin.data.txt from http://archive.ics.uci.edu/ml/machine-learning-databases/breast-cancer-wisconsin/ (description a... t http://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+%28Original%29 ) has missing values.

1. Use the mean/mode imputation method to impute values for the missing data.
2. Use regression to impute values for the missing data.
3. Use regression with perturbation to impute values for the missing data.
4. (Optional) Compare the results and quality of classification models (e.g., SVM, KNN) built using (1) the data sets from questions 1, 2, 3; (2) the data that remains after data points with missing values are removed; and (3) the data set when a binary variable is introduced to indicate missing values.

Question 15.1

Describe a situation or problem from your job, everyday life, current events, etc., for which optimization would be appropriate. What data would you need?

I worked at a bank where our fraud agents needed to review direct deposits that we had identified as suspicious. For that purpose, I built a logistic regression model to identify the deposits that are more likely to be suspicious. The challenge is that we can only use the model score to prioritize the queue. Deposits with higher amounts might also deserve a higher priority. Moreover, deposits are processed in batches at different times of the day, so the quantity of deposits varies with the time of day, the day of the week, and the week of the month, and agents have limited time to review the suspicious deposits. Because agents are a limited resource, optimization would be appropriate for estimating the number of agents needed at any given time of the day.

The data that I would need is:

* The list of deposits at any given time/day, with the amount and the fraud score of each.
* The team budget, the maximum number of agents on weekdays vs. weekends, and the average time spent reviewing each deposit.
* The minimum penetration rate (reviewed deposits over total deposits)
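
To make the staffing idea concrete, here is a minimal sketch for a single batch of deposits. It is only an illustration under stated assumptions, not the model we actually used: every name (`Deposit`, `agents_needed`, `review_queue`) is hypothetical, every review is assumed to take the same number of minutes, and "expected loss" is assumed to be fraud score times amount. The sketch finds the smallest number of agents that satisfies the minimum penetration rate, then fills their capacity with the deposits that have the highest expected loss.

```python
import math
from dataclasses import dataclass


@dataclass
class Deposit:
    deposit_id: str
    amount: float       # deposit amount in dollars
    fraud_score: float  # model score in [0, 1]


def agents_needed(n_deposits, min_penetration, minutes_per_review,
                  shift_minutes, max_agents):
    """Smallest agent count whose review capacity meets the penetration floor."""
    required_reviews = math.ceil(min_penetration * n_deposits)
    reviews_per_agent = shift_minutes // minutes_per_review
    if reviews_per_agent <= 0:
        raise ValueError("an agent cannot finish one review in the time window")
    agents = math.ceil(required_reviews / reviews_per_agent)
    if agents > max_agents:
        raise ValueError("penetration floor is infeasible with max_agents")
    return agents


def review_queue(deposits, agents, minutes_per_review, shift_minutes):
    """Pick the deposits to review, ranked by assumed expected loss.

    With equal review times, taking the top items by score * amount
    maximizes the total expected loss covered by the available capacity.
    """
    capacity = agents * (shift_minutes // minutes_per_review)
    ranked = sorted(deposits, key=lambda d: d.fraud_score * d.amount,
                    reverse=True)
    return ranked[:capacity]


# Made-up batch for illustration only.
deposits = [
    Deposit("d1", 1000.0, 0.90),
    Deposit("d2", 5000.0, 0.20),
    Deposit("d3", 200.0, 0.95),
    Deposit("d4", 3000.0, 0.50),
    Deposit("d5", 100.0, 0.10),
    Deposit("d6", 800.0, 0.60),
]

n_agents = agents_needed(len(deposits), min_penetration=0.5,
                         minutes_per_review=10, shift_minutes=20,
                         max_agents=3)
queue = review_queue(deposits, n_agents, minutes_per_review=10,
                     shift_minutes=20)
print(n_agents, [d.deposit_id for d in queue])
```

A fuller formulation would likely be an integer program over all time periods, with the team budget as a constraint and the number of agents per period as the decision variables. The single-batch version above only shows how the data items in the list feed into the constraints.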

