r5410503 Data Warehousing and Data Mining

    Code No: R5410503 Set No. 1

    IV B.Tech I Semester (R05) Supplementary Examinations, May/June 2009

    DATA WAREHOUSING AND DATA MINING

    (Computer Science & Engineering)

    Time: 3 hours Max Marks: 80

    Answer any FIVE Questions

    All Questions carry equal marks

    1. (a) What is a Data Warehouse? Discuss in detail.

    (b) Describe with the help of a figure the typical process flow within a Data Warehouse. [8+8]

    2. (a) Explain design of summary tables.

    (b) Explain load manager architecture. [8+8]

    3. (a) Describe the distinct capabilities of a parallel technology of a data warehouse system.

    (b) Explain in brief the following items. [8+8]

    i. degree of parallelism

    ii. parallel index build

    4. (a) Describe the role of security restrictions once the data warehouse has gone live.

    (b) What are the audit requirements for imposing security restrictions at the beginning of a data warehouse? [8+8]

    5. (a) Discuss, with a neat sketch, the data flow through a data warehouse with reference to tuning the data load.

    (b) What are fixed queries? [12+4]

    6. (a) Describe the class histogram, count matrix and AVC sets. Are they similar in some respect?

    [6+2]

    (b) Compare ID3 and C4.5 DECISION TREE construction algorithms. [8]

    7. (a) What is text clustering? Discuss the principles underlying text clustering.

    [2+6]

    (b) Discuss the relationship between text mining and information retrieval and information extraction.

    [8]

    8. What is the event prediction problem? Explain the PLANMINE and TIMEWEAVER algorithms. Compare the PLANMINE and TIMEWEAVER algorithms. [4+6+6]

    Code No: R5410503 Set No. 2

    IV B.Tech I Semester (R05) Supplementary Examinations, May/June 2009

    DATA WAREHOUSING AND DATA MINING

    (Computer Science & Engineering)

    Time: 3 hours Max Marks: 80

    Answer any FIVE Questions

    All Questions carry equal marks

    1. (a) How do you clean and transform the data?

    (b) Explain how to transform the data into effective structures.

    (c) Describe the Backup and Archive process. [6+4+6]

    2. Explain the following techniques of storing time data:

    (a) Physical time

    (b) An offset from the inherent start of the table

    (c) Date range. [4+6+6]

    3. "Design and management of a data warehouse on an MPP system is considerably more difficult than on an SMP or cluster system." Do you support the above statement or not? Justify your stand. [16]

    4. (a) Explain the need for and role of security with respect to the performance of a data warehouse.

    (b) Describe the impact of security on the design of the data warehouse. [8+8]

    5. (a) Is daily processing different from overnight processing for Load estimation process?

    (b) What are the system administration requirements of database siting? [10+6]

    6. What is a DECISION TREE? With an example, explain the CART and ID3 algorithms. Compare the CART and ID3 algorithms. [3+9+4]

    7. (a) What is text clustering? Discuss the principles underlying text clustering.

    [2+6]

    (b) Discuss the relationship between text mining and information retrieval and information extraction.

    [8]

    8. (a) What is the Constrained Sequence Mining Problem? In which situations is constrained sequence mining used? [8]

    (b) Discuss the SPIRIT algorithm. In what way is it different from WUM?

    [5+3]

    Code No: R5410503 Set No. 3

    IV B.Tech I Semester (R05) Supplementary Examinations, May/June 2009

    DATA WAREHOUSING AND DATA MINING

    (Computer Science & Engineering)

    Time: 3 hours Max Marks: 80

    Answer any FIVE Questions

    All Questions carry equal marks

    1. (a) Explain the ADHOC query and Automation in Data Warehouse delivery process.

    (b) Explain the idea: "Can we do without an enterprise data warehouse?" [8+8]

    2. (a) Explain hardware partitioning.

    (b) Explain the significance of the key in partitioning.

    (c) How do you size the partition? [8+4+4]

    3. (a) Discuss the issues involved in the design of server environments in a data warehouse system.

    (b) Describe the design issues involved in the selection of user-front end hardware of a data Warehouse

    system. [10+6]

    4. (a) Describe the role and importance of backup strategy of a data warehouse.

    (b) Explain the role of hardware to implement backup strategy of a data warehouse. [8+8]

    5. Explain various query tuning methods in Data warehouse. [16]

    6. (a) What is a Decision Tree? What are the advantages and disadvantages of DECISION TREE classifications? [3+5]

    (b) For the given data set, create a Decision Tree and explain the knowledge obtained from it. [4+4]

    OUTLOOK    TEMP(F)   HUMIDITY(%)   WINDY   CLASS
    sunny      79        90            True    play
    sunny      56        70            False   play
    sunny      79        75            True    no play
    sunny      60        90            True    no play
    overcast   88        88            False   no play
    overcast   63        75            True    play
    overcast   88        95            False   play
    Rain       78        60            False   play
    Rain       66        70            False   no play
    Rain       68        60            True    play

    7. (a) What are the different types of web mining? How is web usage mining different from web structure

    mining and web content mining? [3+5]

    (b) What is concept hierarchy? How is it related to web mining? [3+5]

    8. (a) What is a spatial trend? Explain the spatial trend detection algorithm.

    [3+5]

    (b) What is spatial clustering? Write about spatial characterization. [3+5]

    Code No: R5410503 Set No. 4

    IV B.Tech I Semester (R05) Supplementary Examinations, May/June 2009

    DATA WAREHOUSING AND DATA MINING

    (Computer Science & Engineering)

    Time: 3 hours Max Marks: 80

    Answer any FIVE Questions

    All Questions carry equal marks

    1. (a) Explain the ADHOC query and Automation in Data Warehouse delivery process.

    (b) Explain the idea: "Can we do without an enterprise data warehouse?" [8+8]

    2. (a) Explain difference between designing a Data Warehouse and an OLTP system.

    (b) Explain fact table identification process. [8+8]

    3. What are the different architectural options available for designing server hardware for a data warehouse system? [16]

    4. (a) Why is it important to get all the security and audit requirements clearly documented?

    (b) "Data movement is an expensive process." Justify. [8+8]

    5. Estimate the Disk space required for a data warehouse. [16]

    6. (a) Explain the three basic levels of testing. [8]

    (b) Explain the GUILLOTINE CUT phenomenon. What is the advantage of this method compared with others? [4+4]

    7. (a) Which frequent itemset mining method is suitable for text mining, and why? Explain.

    (b) Discuss the relationship between text mining and information retrieval and information extraction.

    [8+8]

    8. (a) What is the Constrained Sequence Mining Problem? In which situations is constrained sequence mining used?

    (b) Explain the Episode Discovery process. [8+8]