CS 412 Introduction to Data Mining, University of Illinois (Spring 2016): Final Exam, Version 1
UIUC-CS412 "Introduction to Data Mining" (Spring 2016)
Final Exam, Version 1
Thursday, May 12, 2016
180 minutes, 150 points

Name:                NetID:
1 [30]    2 [30]    3 [47]    4 [43]    Total

1. [30] Preprocessing Data, Data Cube

(a) [4’] Present the value range for each of the following measures.
    i. [2’] Jaccard coefficient
       ANSWER: [0, 1]
    ii. [2’] Covariance
       ANSWER: (−∞, +∞)

(b) [6’] Give three example distance measures for each of the following two kinds.
    i. [3’] The distance between two objects
    ii. [3’] The distance between two clusters

(c) [12’] Consider 5 data points in a 2-D space: (-3,3), (-1,1), (0,0), (1,-1), and (3,-3). (For all sub-questions below, correct answers without explanations receive full points, and incorrect answers with explanations may receive partial credit.)
    i. [2’] Calculate the covariance matrix.
    ii. [2’] Calculate the correlation coefficient for the two dimensions.
    iii. [3’] Calculate the first and the second principal components (two vectors), and indicate which is the first principal component. (Note: drawing is not enough, and no calculation is needed.)
    iv. [3’] What are the coordinates of the 5 data points, projected to the 1-D space corresponding to the first principal component?
    v. [2’] Is the projection of data points to the 1-D space in the previous sub-question a lossless or lossy compression? Briefly explain.

(d) [8’] Suppose the base cuboid of a data cube contains only two cells, (a1, a2, a3, ..., a10) and (b1, b2, b3, ..., b10), where ai = bi if i is an odd number; otherwise ai ≠ bi.
    i. [3’] How many nonempty aggregated (i.e., non-base) cells are there in this data cube?
       ANSWER: 2014.
    ii. [3’] How many nonempty, closed aggregated cells are there in this data cube?
       ANSWER: 3. They are (a1, a2, a3, ..., a10) : 1; (b1, b2, b3, ..., b10) : 1; (a1, *, a3, *, ..., a9, *) : 2.
    iii. [2’] If we set minimum support = 2, with the measure being count, how many nonempty aggregated cells are there in the corresponding iceberg cube?
       ANSWER: 32.
       (a1, *, a3, *, ..., a9, *)

2. [30] Frequent Pattern and Association Mining

(a) [8’] The price of each item in a store is nonnegative. For each of the following cases, identify the type of constraint it represents and briefly discuss how to mine such association rules efficiently with frequent pattern mining algorithms.
    i. [4’] Containing at least one Nintendo game.
    ii. [4’] Containing one free item and other items the sum of whose prices is at least $200.

(b) [10’] Suppose a sequence database D contains three sequences as follows. Note that (bc) means that items b and c are purchased at the same time (i.e., in the same transaction). Let the minimum support be 3.

    customer id    shopping sequence
    1              (bc)(de)f
    2              bcdef
    3              (bc)dbegf

    Use Generalized Sequential Patterns (GSP) to mine frequent sequential patterns from this database. You need to list all the steps and the results for mining frequent sequential patterns.
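For question 1(c), the quantities asked for can be worked out numerically. The sketch below is one way to do it in plain Python; it assumes the sample covariance (dividing by n - 1), which the exam does not specify, and the helper name `covariance` is illustrative only. Dividing by n instead scales the covariance matrix but leaves the correlation and the principal directions unchanged.

```python
import math

points = [(-3, 3), (-1, 1), (0, 0), (1, -1), (3, -3)]
xs = [p[0] for p in points]
ys = [p[1] for p in points]

def covariance(u, v, ddof=1):
    """Covariance of two equal-length lists; ddof=1 is the sample version (an assumption)."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - ddof)

sxx, syy, sxy = covariance(xs, xs), covariance(ys, ys), covariance(xs, ys)
cov_matrix = [[sxx, sxy], [sxy, syy]]
corr = sxy / math.sqrt(sxx * syy)

# Eigenvalues of a symmetric 2x2 matrix in closed form.
half_trace = (sxx + syy) / 2
radius = math.hypot((sxx - syy) / 2, sxy)
lam1, lam2 = half_trace + radius, half_trace - radius

# Eigenvector for the larger eigenvalue (valid here because sxy != 0).
v1 = (sxy, lam1 - sxx)
norm = math.hypot(*v1)
pc1 = (v1[0] / norm, v1[1] / norm)   # first principal component
pc2 = (-pc1[1], pc1[0])              # second component, orthogonal to the first

# 1-D coordinates of each point along the first principal component.
projected = [x * pc1[0] + y * pc1[1] for x, y in points]

print(cov_matrix, corr, (lam1, lam2), pc1, pc2, projected)
```

Under these assumptions the covariance matrix is [[5, -5], [-5, 5]] and the correlation is -1. The first principal component lies along (1, -1)/√2 (the sign is arbitrary), and the projected coordinates are ±3√2, ±√2, and 0. Because the second eigenvalue is 0, all five points lie on that line and the 1-D projection discards no information.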
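The counts given as answers in question 1(d) can be checked by brute force, since two base cells over 10 dimensions generate only a few thousand cells. The following is a minimal sketch; the value labels and the helper names `generalizations` and `is_closed` are made up for illustration.

```python
from itertools import product

# Two base cells that agree on odd-numbered dimensions and differ on even ones.
a = tuple(f"a{i}" for i in range(1, 11))
b = tuple(a[i] if (i + 1) % 2 == 1 else f"b{i + 1}" for i in range(10))

def generalizations(cell):
    """Yield the cell itself and every cell obtained by starring a subset of dimensions."""
    for mask in product([False, True], repeat=len(cell)):
        yield tuple("*" if star else v for v, star in zip(cell, mask))

# count measure: how many base cells each cell covers
counts = {}
for base in (a, b):
    for cell in generalizations(base):
        counts[cell] = counts.get(cell, 0) + 1

aggregated = {c: n for c, n in counts.items() if "*" in c}
iceberg = {c for c, n in aggregated.items() if n >= 2}

def is_closed(cell):
    """A cell is closed if no more specific cell has the same count.
    Checking one-step specializations is enough because count never
    increases as a cell becomes more specific."""
    for i, v in enumerate(cell):
        if v == "*":
            for val in {a[i], b[i]}:
                child = cell[:i] + (val,) + cell[i + 1:]
                if counts.get(child, 0) == counts[cell]:
                    return False
    return True

closed = [c for c in counts if is_closed(c)]

print(len(aggregated), len(closed), len(iceberg))
```

The sketch reports 2014 aggregated cells, 3 closed cells (the two base cells plus the cell that stars every even dimension), and 32 iceberg cells, matching the answers above. The same numbers follow by hand from 2 × (2^10 − 1) − 2^5 = 2014 and 2^5 = 32.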
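For question 2(b), a small level-wise miner can be used to check a hand-worked GSP answer. The sketch below is not the full GSP procedure: it keeps GSP's level-by-level support counting but swaps in a simpler candidate generation step (extending each frequent pattern by one item) in place of GSP's join-and-prune. All function names are illustrative.

```python
from itertools import chain

# Each sequence is a list of elements; an element is the set of items
# bought in one transaction.
DB = [
    [{"b", "c"}, {"d", "e"}, {"f"}],                    # customer 1: (bc)(de)f
    [{"b"}, {"c"}, {"d"}, {"e"}, {"f"}],                # customer 2: bcdef
    [{"b", "c"}, {"d"}, {"b"}, {"e"}, {"g"}, {"f"}],    # customer 3: (bc)dbegf
]
MIN_SUP = 3

def contains(sequence, pattern):
    """True if each pattern element is a subset of a distinct, later sequence element."""
    pos = 0
    for element in pattern:
        while pos < len(sequence) and not element <= sequence[pos]:
            pos += 1
        if pos == len(sequence):
            return False
        pos += 1
    return True

def support(pattern, db):
    return sum(contains(seq, pattern) for seq in db)

def extensions(pattern, items):
    """Grow a pattern by one item: as a new element, or inside the last element."""
    for item in items:
        yield pattern + (frozenset([item]),)
        last = pattern[-1]
        if item > max(last):  # only add larger items, so each pattern is built once
            yield pattern[:-1] + (last | {item},)

def mine(db=DB, min_sup=MIN_SUP):
    all_items = sorted(set(chain.from_iterable(chain.from_iterable(db))))
    level = [(frozenset([i]),) for i in all_items]
    level = [p for p in level if support(p, db) >= min_sup]
    items = [min(p[0]) for p in level]  # frequent single items
    frequent = []
    while level:
        frequent.extend(level)
        candidates = [c for p in level for c in extensions(p, items)]
        level = [c for c in candidates if support(c, db) >= min_sup]
    return frequent

def show(pattern):
    """Render a pattern in the exam's notation, e.g. (bc)d."""
    return "".join(
        "".join(sorted(e)) if len(e) == 1 else "(" + "".join(sorted(e)) + ")"
        for e in pattern
    )

patterns = [show(p) for p in mine()]
print(patterns)
```

With minimum support 3, every frequent pattern must occur in all three sequences. On this database the sketch finds 17 patterns: the single items b, c, d, e, f; the length-2 patterns bd, be, bf, cd, ce, cf, df, ef; and the length-3 patterns bdf, bef, cdf, cef. Patterns such as (bc) and de are dropped because each is supported by only two customers.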