
CS 412 Introduction To Data Mining - University of Illinois-2020 Midterm Exam Q&A


Midterm Exam

1. Fill in your information:

Full Name:
NetID:

Question 1 (20 points): Get to Know Your Data

A. [5] The least and greatest numbers in a list of 7 integers are 2 and 20, respectively. The median and mode of the list are 6 and 3, respectively. Find out which of the following options can be the mean of the list.

(a) 4
(b) 7
(c) 6.85
(d) 6.71

Note that there may be more than one correct option.

B. [5] Consider the following data on the age distribution of a population of 3594 people.

age       frequency
1-5       200
6-15      450
16-20     300
21-50     1500
51-80     700
81-110    444

(a) Compute the approximate median age for the given population.
(b) What special case will result in the maximum error in the approximate median you calculated above? Assume that the frequency for each interval remains unchanged.

C. [5] Consider the following data on two persons, Jack and Jill. Height and Weight are given in ft and lbs, respectively.

Name    Age    Height    Weight
Jack    24     7         210
Jill    14     4         140

Age, Height and Weight are ordinal attributes with the following interval states.

(a) Age: [3-18], [19-40], [41-80]
(b) Height: [3-5], [6-9]
(c) Weight: [70-110], [110-150], [150-190], [190-230]

Compute the Manhattan distance between the two persons. Show the steps of your calculation.

D. [5] The following table contains the medical records of two patients (John and Jane) on six lab tests. Calculate the dissimilarity between John and Jane based on these records (P and N mean positive and negative test results, respectively). Write down the steps.

        test-1  test-2  test-3  test-4  test-5  test-6
John    P       N       P       N       N       N
Jane    P       N       P       N       P       N

Solution.

A. The sorted list must be 2, 3, 3, 6, x, y, 20. For 3 to remain the only mode, x and y have to be at least 7 and 8, respectively, so the sum is at least 2 + 3 + 3 + 6 + 7 + 8 + 20 = 49 and the mean is at least 49 / 7 = 7. (b) is the correct answer: 7 can be the mean of the list.

B. (a) The median position is 3594 / 2 = 1797, which falls in the interval 21-50; the intervals below it hold 200 + 450 + 300 = 950 people.
Approximate median = 21 + ((1797 - 950) / 1500) * (50 - 21) = 37.38
(b) When all 1500 people in the median interval have age 21. The true median is then 21, and the error is 37.38 - 21 = 16.38.

C. Replacing each ordinal value by its rank gives Jack: (2, 2, 4) and Jill: (1, 1, 2). The ordinal attributes have different numbers of states, so the ranks need to be normalized. Mapping rank r of an attribute with M states to (r - 1) / (M - 1) gives:
Jack: (1/2, 1, 1)
Jill: (0, 0, 1/3)
Manhattan distance: 1/2 + 1 + 2/3 = 13/6

D. These are asymmetric binary attributes, so we need to draw the contingency table and calculate q, r, and s. Here q = 2 (both positive), r = 0 (John positive, Jane negative) and s = 1 (John negative, Jane positive).
q / (q + r + s) = 2 / (2 + 0 + 1) = 0.67
Note that this ratio is the Jaccard similarity. The corresponding asymmetric binary dissimilarity is (r + s) / (q + r + s) = 1 / 3 = 0.33.

Question 2 (15 points): Data Preprocessing

A. [3] Consider the following data for the attribute price: 8, 9, 15, 16, 21, 21, 24, 26, 27, 30, 30, 34. Use smoothing by bin means to smooth this data using equi-depth bins and a bin depth of 4.

B. [12] Consider the following data for two attributes A and B:

A    21    25    42    43    57    59
B    65    79    75    99    87    81

(a) [5] Normalize attribute 'A' based on z-score normalization.
(b) [5] Calculate the correlation coefficient. Are these two attributes positively correlated or negatively correlated?
(c) [2] Compute the covariance of the two attributes.

Solution.

A. Bin means:
Bin 1: (8 + 9 + 15 + 16) / 4 = 12
Bin 2: (21 + 21 + 24 + 26) / 4 = 23
Bin 3: (27 + 30 + 30 + 34) / 4 = 30.25, rounded to 30

Smoothed data:
Bin 1 = 12, 12, 12, 12
Bin 2 = 23, 23, 23, 23
Bin 3 = 30, 30, 30, 30

B. (a) Mean µ = (1/n) Σ A_i = 247 / 6 = 41.167
Variance s² = (1 / (n - 1)) (Σ A_i² - (1/n) (Σ A_i)²) = (1 / (6 - 1)) (11409 - 247² / 6) = 248.167
Standard deviation s = √248.167 = 15.753
z-score = A_i - µ ...
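The Question 1 solutions for parts B, C and D can be checked with a short Python sketch. The function names below are illustrative and not part of the exam. The interval width follows the solution's own convention (upper bound minus lower bound, so 50 - 21 for the median interval), and part D reports both the similarity ratio q / (q + r + s) and the dissimilarity (r + s) / (q + r + s) so the two can be compared.

```python
from fractions import Fraction


def grouped_median(intervals, freqs):
    """Approximate the median of grouped data by linear interpolation.

    Assumption: the width of an interval (low, high) is high - low,
    matching the convention used in the worked solution.
    """
    half = Fraction(sum(freqs), 2)
    below = 0  # cumulative frequency of the intervals already passed
    for (low, high), f in zip(intervals, freqs):
        if below + f >= half:
            return low + (half - below) / f * (high - low)
        below += f
    raise ValueError("no data")


def ordinal_manhattan(ranks_a, ranks_b, n_states):
    """Manhattan distance after mapping rank r of M states to (r-1)/(M-1)."""
    return sum(
        abs(Fraction(ra - 1, m - 1) - Fraction(rb - 1, m - 1))
        for ra, rb, m in zip(ranks_a, ranks_b, n_states)
    )


def asymmetric_binary(a, b, positive="P"):
    """Return (similarity, dissimilarity); negative matches are ignored."""
    pairs = list(zip(a, b))
    q = sum(x == positive and y == positive for x, y in pairs)
    r = sum(x == positive and y != positive for x, y in pairs)
    s = sum(x != positive and y == positive for x, y in pairs)
    return Fraction(q, q + r + s), Fraction(r + s, q + r + s)


# Part B: age distribution of 3594 people.
intervals = [(1, 5), (6, 15), (16, 20), (21, 50), (51, 80), (81, 110)]
freqs = [200, 450, 300, 1500, 700, 444]
median = grouped_median(intervals, freqs)

# Part C: ranks for (Age, Height, Weight) with 3, 2 and 4 states.
distance = ordinal_manhattan((2, 2, 4), (1, 1, 2), (3, 2, 4))

# Part D: John and Jane's six test results.
similarity, dissimilarity = asymmetric_binary("PNPNNN", "PNPNPN")

print(round(float(median), 2), distance, similarity, dissimilarity)
```

Exact fractions are used so that 13/6, 2/3 and 1/3 come out without rounding noise.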
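The Question 2 computations (smoothing by bin means and the z-score setup for attribute A) can be sketched in Python as follows. The helper names are hypothetical. The variance uses the n - 1 denominator from the solution, and since the preview cuts off the z-score expression, the code assumes the standard definition (A_i - µ) / s.

```python
import math


def smooth_by_bin_means(sorted_values, depth):
    """Equi-depth binning: replace every value by the mean of its bin."""
    smoothed = []
    for start in range(0, len(sorted_values), depth):
        bin_values = sorted_values[start:start + depth]
        mean = sum(bin_values) / len(bin_values)
        smoothed.extend([mean] * len(bin_values))
    return smoothed


def z_scores(values):
    """Return (mean, sample std, z-scores) using the n - 1 variance."""
    n = len(values)
    mean = sum(values) / n
    variance = (sum(v * v for v in values) - sum(values) ** 2 / n) / (n - 1)
    std = math.sqrt(variance)
    # Assumption: standard z-score definition, (value - mean) / std.
    return mean, std, [(v - mean) / std for v in values]


price = [8, 9, 15, 16, 21, 21, 24, 26, 27, 30, 30, 34]
smoothed = smooth_by_bin_means(price, depth=4)

a = [21, 25, 42, 43, 57, 59]
mean_a, std_a, z_a = z_scores(a)

print(smoothed)
print(round(mean_a, 3), round(std_a, 3), [round(z, 3) for z in z_a])
```

The unrounded mean of the third bin is 30.25, which is why the sketch prints 30.25 where the worked solution shows 30.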

Last updated: 1 year ago


Document information




About the document


Uploaded on: Apr 02, 2023

Number of pages: 8


Seller


PAPERS UNLIMITED™


