UIUC-CS412 An Introduction to Data Mining, University of Illinois (Fall 2020) Final Exam. 100 marks, brief answers

UIUC-CS412 "An Introduction to Data Mining" (Fall 2020) Final Exam

1 Minkowski Distance [10 points]

Given three data points in 2-D space: x1 = (1, 0)', x2 = (-1, 0)' and x3 = (a, b)', where a and b are two unknown numbers. Let d1 be the distance between x1 and x3, and d2 be the distance between x2 and x3.

(a) [6 pts] What are the L2, L1 and L∞ distances between x1 and x2, respectively?
(b) [1.5 pts] If we use L2 distance, under which condition does d1 = d2?
(c) [2.5 pts] If we use L1 distance, under which condition does d1 = d2?

Half point for each part.

2 Basic Statistics and Normalization [10 points]

Table 1 provides the final exam scores of 9 randomly sampled students in an online course.

Table 1: Final Exam Scores of 9 Students.

(a) [3 pts] What is the median score?
(b) [3 pts] [True or False] If one student's score improves, the sample mean will definitely increase as well.
(c) [2 pts] [True or False] If the scores of six students improve, the median will definitely increase as well.
(d) [2 pts] Suppose the scores of k (1 ≤ k ≤ 9) students improve and the remaining (9 - k) students' scores remain the same. What is the minimal k such that the median will definitely increase?

3 Data Warehouse [10 points]

(a) [4 pts] Suppose we build a data warehouse with three dimensions: location, supplier, and time. If we do not consider the concept hierarchy, how many cuboids are there in total?
(b) [6 pts] Suppose the location dimension has three different values (Urbana, Chicago and New York City), the supplier dimension has two different values (Dairy Land and Land O'Lakes), and the time dimension has twelve different values, ranging from January to December. How many base cells are there in total (3 pts)? How many aggregated cells are there in total (3 pts)?

4 Pattern Evaluations [10 points]

Given two itemsets A and B and the following contingency table (Table 2):

          A        ¬A       Σrow
B         a        b        a + b
¬B        c        d        c + d
Σcol      a + c    b + d    a + b + c + d

Table 2: Contingency Table of Problem 4

(a) [4 pts] If we use lift as the interestingness measure, under which condition will we conclude that A and B are positively correlated?
Solution: ad > bc
(b) [3 pts] [True or False] Suppose we conclude that A and B are positively correlated based on lift. If we now increase d while keeping a, b, c unchanged, A and B will still be positively correlated based on lift.
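The lift condition in Problem 4 can be checked numerically. The following is a minimal sketch, not part of the exam: the function name `lift` and the sample counts are assumptions chosen for illustration, and the layout of a, b, c, d follows Table 2. With n = a + b + c + d, lift = P(A and B) / (P(A) P(B)) = a n / ((a + c)(a + b)), and this exceeds 1 exactly when ad > bc.

```python
from fractions import Fraction


def lift(a, b, c, d):
    """Lift of A and B from the Table 2 counts.

    a = count(A and B),      b = count(not A and B),
    c = count(A and not B),  d = count(not A and not B).
    Assumes a + b > 0 and a + c > 0 so the ratio is defined.
    """
    n = a + b + c + d
    # lift = P(A and B) / (P(A) * P(B)) = a * n / ((a + c) * (a + b))
    return Fraction(a * n, (a + c) * (a + b))


# Illustrative counts (made up, not taken from the exam) with ad > bc.
a, b, c, d = 4, 2, 3, 5
print(a * d > b * c)               # True
print(lift(a, b, c, d) > 1)        # True

# Increasing d only makes ad larger, so ad > bc still holds.
print(lift(a, b, c, d + 100) > 1)  # True
```

Exact fractions are used instead of floats so that the comparison with 1 is not affected by rounding.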

About the document

Uploaded on: Apr 02, 2023
Number of pages: 13
