## DATA WAREHOUSING AND DATA MINING Assignments

**KOTTAM TUALSI REDDY MEMORIAL COLLEGE OF ENGINEERING, KONDAIR**

**Unit I**

- Explain Data mining as a step in the process of knowledge discovery.
- Draw and explain the architecture of typical data mining systems.
- Differentiate OLTP and OLAP.
- What is data mining and data warehousing? Give their applications.
- Briefly discuss the functionalities of data mining.
- Briefly discuss the multidimensional data model.
- Explain multidimensional schemas.
- Explain the architecture of data mining systems.
- Briefly discuss data warehouse architecture.
- Explain the classification of data mining systems.

**Unit II**

- Briefly discuss the forms of data preprocessing with a neat diagram.
- Explain concept hierarchy generation for categorical data.
- Explain various data reduction techniques.
- Explain concept hierarchy generation for numerical attributes.
- Explain data integration and transformation techniques.
- Briefly explain discretization and concept hierarchy generation for numerical and categorical data.
- Briefly explain the need for preprocessing data.
- Explain various data cleaning techniques.

**Unit III**

- List and describe data mining primitives for specifying a Data Mining Task.
- Briefly discuss task-relevant data specifications.
- Explain the syntax for Task-relevant data specifications.
- Describe why concept hierarchies are useful in data mining.
- Briefly explain the Data Mining Query Language with suitable examples.
- Explain how to design graphical user interfaces based on the Data Mining Query Language.

**Unit IV**

- What is concept description? Explain attribute relevance analysis for data characterization.
- What are the differences between concept description in large databases and OLAP?
- Differentiate between predictive and descriptive data mining.
- State and explain the algorithm for attribute-oriented induction.
- Explain mining class comparisons using an example.
- Explain various formats for presenting derived generalized relations.
- Explain the mining of descriptive statistical measures in large databases.

**Unit V**

- Discuss mining frequent itemsets without candidate generation.
- What is association rule mining? Discuss multilevel association rule mining from transactional databases in detail.
- Write and explain the FP-growth algorithm.
- What is an iceberg query? Explain with an example.
- Discuss ARCS (Association Rule Clustering System).
- Explain mining multidimensional association rules from relational databases and warehouses.
- What is correlation analysis? Explain constraint-based association mining.

**Unit VI**

- How scalable is decision tree induction? Explain.
- Describe the working procedure of a simple Bayesian classifier.
- Write and explain the backpropagation algorithm.
- Discuss nearest neighbor classifiers and case-based reasoning.
- Can any ideas from association rule mining be applied to classification? Explain.
- Explain prediction and Bayesian belief networks.
- How does tree pruning work? What are some enhancements to basic decision tree induction?
- What is classification? Explain classification by decision tree induction.

**Unit VII**

- What is cluster analysis? What are the various types of data in Cluster Analysis? Explain.
- Given two objects represented by the tuples (22, 1, 42, 10) and (20, 0, 36, 8):
  - Compute the Euclidean distance between the two objects.
- Explain the categorization of major clustering methods.
- What is a distance-based outlier? What are efficient algorithms for mining distance-based outliers? How are outliers determined in this method?
- Given the following measurements for the variable age:

18, 22, 25, 42, 28, 43, 33, 35, 56, 28

Standardize the variable by the following:

i. Compute the mean absolute deviation of age.

ii. Compute the z-score for the first four measurements.
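The two numeric exercises above (the tuple distance and the age standardization) can be checked with a short script. This is a minimal sketch rather than a model answer. It assumes the z-score divides by the mean absolute deviation, as the ordering of parts i and ii suggests, and not by the standard deviation. The function names are illustrative only.

```python
import math

# Values copied from the exercise text.
x = (22, 1, 42, 10)
y = (20, 0, 36, 8)
ages = [18, 22, 25, 42, 28, 43, 33, 35, 56, 28]


def euclidean(a, b):
    """Square root of the summed squared coordinate differences."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def mean_absolute_deviation(values):
    """Average absolute distance of each value from the mean."""
    mean = sum(values) / len(values)
    return sum(abs(v - mean) for v in values) / len(values)


def z_scores(values):
    """Assumption: divide by the mean absolute deviation, not the standard deviation."""
    mean = sum(values) / len(values)
    s = mean_absolute_deviation(values)
    return [(v - mean) / s for v in values]


distance = euclidean(x, y)
mad = mean_absolute_deviation(ages)
first_four = z_scores(ages)[:4]

print(f"Euclidean distance: {distance:.3f}")
print(f"Mean absolute deviation of age: {mad:.2f}")
print("z-scores of the first four ages:", [round(z, 2) for z in first_four])
```

Under these assumptions the distance is the square root of 45 (about 6.708), the mean age is 33, the mean absolute deviation is 8.8, and the first four z-scores are roughly -1.70, -1.25, -0.91, and 1.02. If the course expects the standard deviation as the divisor instead, only `z_scores` needs to change.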

- Describe Model based clustering methods.
- Suppose that the data mining task is to cluster the following eight points (with (x, y) representing location) into three clusters:

A1(2, 10), A2(2, 5), A3(8, 4), B1(5, 8), B2(7, 5), B3(6, 4), C1(1, 2), C2(4, 9).

The distance function is Euclidean distance. Suppose initially we assign A1, B1, and C1 as the center of each cluster, respectively. Use the k-means algorithm to show only:

1. the three cluster centers after the first round execution

2. the final three clusters
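The k-means exercise above can be traced with a few lines of code. The following is a hedged sketch of the standard assign-then-recompute loop, using the eight points and the three starting centers from the question. The helper names (`assign`, `recompute`, `k_means`) are illustrative, and the sketch assumes no cluster ever becomes empty, which holds for this data. It needs Python 3.8 or later for `math.dist`.

```python
import math

# Points and starting centers are taken from the exercise text.
points = {
    "A1": (2, 10), "A2": (2, 5), "A3": (8, 4),
    "B1": (5, 8), "B2": (7, 5), "B3": (6, 4),
    "C1": (1, 2), "C2": (4, 9),
}
initial_centers = [points["A1"], points["B1"], points["C1"]]


def assign(centers):
    """Group point names by the index of their nearest center."""
    clusters = [[] for _ in centers]
    for name, p in points.items():
        nearest = min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
        clusters[nearest].append(name)
    return clusters


def recompute(clusters):
    """Return the mean (x, y) of each cluster (assumes no cluster is empty)."""
    centers = []
    for members in clusters:
        xs = [points[n][0] for n in members]
        ys = [points[n][1] for n in members]
        centers.append((sum(xs) / len(xs), sum(ys) / len(ys)))
    return centers


def k_means(centers):
    """Iterate until assignments stop changing.

    Returns the list of centers after each round and the final clusters.
    """
    history = []
    previous = None
    while True:
        clusters = assign(centers)
        if clusters == previous:
            return history, clusters
        centers = recompute(clusters)
        history.append(centers)
        previous = clusters


history, final_clusters = k_means(initial_centers)
print("Centers after round 1:", history[0])
print("Final clusters:", final_clusters)
```

With these inputs the first round yields the centers (2, 10), (6, 6), and (1.5, 3.5), and the loop settles on the clusters {A1, B1, C2}, {A3, B2, B3}, and {A2, C1}. A written answer should still show the distance comparisons for each round.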

- Explain the DBSCAN algorithm with a suitable example.
- How does CLIQUE work?
- Explain partitioning methods.
- Explain density-based methods.
- Explain grid-based methods.
- Explain model-based clustering methods.

**Unit VIII**

- Explain the construction of a spatial data cube with a suitable example.
- What methods are there for information retrieval? Explain.
- Describe web usage mining.
- Explain the construction and mining of object cubes.
- What is multimedia database? Explain mining multimedia databases.
- What is time series database? What is a sequence database? Explain mining time series and sequence data.
- Define spatial database, multimedia database, time series database, sequential database and text database.
- Explain periodicity analysis and latent semantic indexing.
- Explain mining associations in multimedia data.
- Briefly discuss multidimensional analysis and descriptive mining of complex data objects.
- Briefly discuss mining spatial databases.
- Briefly discuss multimedia databases.
- Briefly discuss mining time-series and sequence data.
- Briefly describe text databases.
- Briefly discuss mining the World Wide Web.
