College of Education School of Continuing and Distance Education 2014/2015 – 2016/2017 Lecturer: Dr. Ebenezer Ankrah, Dept. of Information Studies Contact Information: [email protected] godsonug.wordpress.com/blog


College of Education

School of Continuing and Distance Education 2014/2015 – 2016/2017

Lecturer: Dr. Ebenezer Ankrah, Dept. of Information Studies Contact Information: [email protected]

godsonug.wordpress.com/blog


Session Overview

• Relational databases have been in use for over two decades. A large portion of the applications of relational databases have been in the commercial world, supporting such tasks as transaction processing for banks and stock exchanges, sales and reservations for a variety of businesses, and inventory and payroll for almost all companies.

• New applications of DBMSs have become increasingly important. This session explains some of these new applications of database management systems.

DR. EBENEZER ANKRAH Slide 2


Session Overview

• At the end of the session, the student will:

– Identify and explain some of the new applications of DBMS

– Understand new trends as well as applications in Database Management Systems

– Be exposed to Enterprise Cloud Database Application

– Be able to understand some core issues in Multimedia databases


Session Outline

The key topics to be covered in the session are as follows:

• Cloud Database Application

• Data Mining

• Data Warehousing


Reading List

• Silberschatz, A., Korth, H. F., & Sudarshan, S. (2010). Database System Concepts. Boston, Massachusetts. WCB: McGraw-Hill. (Chapter 21)

• http://ecomputernotes.com/fundamental/introduction-to-computer/cloud-computing


CLOUD DATABASE APPLICATION Topic One


Cloud Database Application

• Cloud computing is a model that allows on-demand network access to a shared pool of configurable computing resources (e.g. networks, servers, storage, applications, and services).

• Put simply, cloud computing is the use of remote servers (usually accessible via the Internet) to process or store information. Access is usually through a web browser. Saving files on a server via the Internet is one example. The software itself can also be installed on the remote computer.


Cloud Database Application

• Cloud computing is an alternative to managing your applications yourself; the applications run on a shared, multi-tenant platform that the provider supports. When using an application running in the cloud, you simply connect to it, customize it, and use it.

• Today, millions of people use a variety of applications in the cloud, such as CRM, HR, accounting, and other business applications. Cloud-based applications can be operational in a few days, which is generally not possible with traditional enterprise software.


Cloud Database Application

• Cloud applications are cheaper because you do not have to invest in hardware and software, spend money on configuring and maintaining complex layers of technology, or finance facilities to run them. They are also more scalable, more secure, and more reliable than most applications.

• In addition, upgrades are handled by the provider, so your applications automatically benefit from all available security and performance improvements, as well as new features.


Cloud Database Application

• The advantage of cloud computing is twofold. It is a form of file backup. It also allows the same document to be worked on from several workstations of various types.

• Cloud computing simplifies usage by removing the constraints of traditional computing tools (installing and updating software, storage, data portability, and so on). Cloud computing also provides more elasticity and agility because it allows faster access to IT resources (servers, storage, or bandwidth) through a simple web portal, without investing in additional hardware.


Cloud Database Application

• Three service models:

– Cloud Software as a Service (SaaS): the user can use the service provider's applications over the network.

– Cloud Platform as a Service (PaaS): the consumer can deploy its own applications on the cloud infrastructure.

– Cloud Infrastructure as a Service (IaaS): the client can rent storage, processing power, network, and other computing resources.


Sample Question

• Individual Assignment:

– State any three (3) benefits of cloud computing in Database Management Systems

• Forum Question:

– Distinguish between Data Mining And Data Warehousing


DATA MINING Topic Two


Data Mining

• The process of extracting valid, previously unknown, comprehensible, and actionable information from large databases and using it to make crucial business decisions is known as Data Mining.

• Data mining is concerned with the analysis of data and the use of software techniques for finding hidden and unexpected patterns and relationships in sets of data. The focus of data mining is to find the information that is hidden and unexpected.


Data Mining

• Data mining can provide huge paybacks for companies that have made a significant investment in data warehousing. Although data mining is still a relatively new technology, it is already used in a number of industries. The accompanying table lists examples of applications of data mining in retail/marketing, banking, insurance, and medicine.


Data Mining

• Examples of data mining applications


Data Mining

• Data Mining Techniques

• There are four main operations associated with data mining techniques:

– Predictive modelling

– Database segmentation

– Link analysis

– Deviation detection.


Data Mining

• Predictive Modelling

• Predictive modelling is similar to the human learning experience in using observations to form a model of the important characteristics of some task. The model corresponds to the 'real world'. It is developed using a supervised learning approach, which has two phases: training and testing.

• The training phase builds the model from a large sample of historical data called a training set, while the testing phase tries out the model on new, previously unseen data to determine its accuracy and physical performance characteristics.


Data Mining

• Predictive modelling is commonly used in customer retention management, credit approval, cross-selling, and direct marketing. There are two techniques associated with predictive modelling. These are:

– Classification - Classification assigns each record to one class from a finite set of possible class values.

– Value prediction - Value prediction estimates a continuous numeric value for a record. It uses the traditional statistical techniques of linear regression and nonlinear regression.
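
Value prediction by linear regression can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `fit_line` and the customer figures are made up, and only the simplest case (one input, ordinary least squares) is shown.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical records: years as a customer and yearly spend
years = [1, 2, 3, 4, 5]
spend = [120, 140, 160, 180, 200]

slope, intercept = fit_line(years, spend)

# Predict the spend of a customer in year 6 from the fitted line
predicted = slope * 6 + intercept
print(slope, intercept, predicted)
```

Classification answers "which class?", while value prediction answers "how much?"; the fitted line returns a number rather than a label.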


Data Mining

• Database Segmentation

• A segment is a group of similar records that share a number of properties. The aim of database segmentation is to partition a database into an unknown number of segments, or clusters.

• This approach uses unsupervised learning to discover homogeneous sub-populations in a database to improve the accuracy of the profiles. Applications of database segmentation include customer profiling, direct marketing, and cross-selling.
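
One common unsupervised technique for this kind of partitioning is k-means clustering, sketched below. The choice of k-means is an assumption for illustration, not something the lecture prescribes, and it simplifies the problem: k-means needs the number of segments `k` to be chosen in advance. The function name and the customer points are hypothetical.

```python
def k_means(points, k, iterations=10):
    """Partition points into k clusters by alternating assignment and update."""
    centroids = list(points[:k])  # naive initialisation: the first k points
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])),
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical customers: (visits per month, items per visit)
customers = [(1, 1), (9, 9), (1, 2), (2, 1), (8, 9), (9, 8)]
centroids, clusters = k_means(customers, k=2)
for cluster in clusters:
    print(sorted(cluster))
```

No labels are supplied; the two groups emerge only from how close the records are to each other, which is what makes the approach unsupervised.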


Data Mining

• Link Analysis

• Link analysis aims to establish links, called associations, between individual records, or sets of records, in a database. There are three specializations of link analysis. These are:

– Associations discovery

– Sequential pattern discovery

– Similar time sequence discovery.
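
Associations discovery, the first specialization above, can be sketched by counting how often items occur together. The support and confidence measures used here are the usual ones for association rules; the function name `pair_rules`, the threshold, and the shopping baskets are hypothetical.

```python
from collections import Counter
from itertools import combinations

def pair_rules(transactions, min_support=0.5):
    """Return (if_item, then_item, support, confidence) for frequent item pairs."""
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in transactions:
        items = sorted(set(basket))
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # share of all baskets containing both items
        if support >= min_support:
            # confidence: share of baskets with the first item that also hold the second
            rules.append((a, b, support, count / item_counts[a]))
            rules.append((b, a, support, count / item_counts[b]))
    return rules

# Hypothetical shopping baskets
baskets = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["milk", "tea"],
]
rules = pair_rules(baskets)
for if_item, then_item, support, confidence in rules:
    print(f"{if_item} -> {then_item}: support={support:.2f}, confidence={confidence:.2f}")
```

In this made-up data every basket containing butter also contains bread, so the rule "butter -> bread" has a confidence of 1.0 even though it appears in only half of the baskets.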


Data Mining

• Deviation Detection

• Deviation detection is a relatively new technique in terms of commercially available data mining tools. However, deviation detection is often a source of true discovery because it identifies outliers, which express deviation from some previously known expectation and norm. This operation can be performed using statistics and visualization techniques.

• Applications of deviation detection include fraud detection in the use of credit cards and insurance claims, quality control, and defects tracing.
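
A simple statistical form of deviation detection flags values that lie far from the mean. The sketch below is one possible approach, not the lecture's method: the function name, the two-standard-deviation threshold, and the card amounts are all assumptions.

```python
from statistics import mean, stdev

def find_outliers(values, threshold=2.0):
    """Return values more than `threshold` standard deviations from the mean."""
    mu = mean(values)
    sigma = stdev(values)
    return [v for v in values if abs(v - mu) > threshold * sigma]

# Hypothetical credit card amounts for one customer
amounts = [52, 48, 50, 51, 49, 50, 47, 53, 50, 400]
suspicious = find_outliers(amounts)
print(suspicious)
```

The norm here is the customer's usual spending; the single very large amount deviates from it and would be passed on for a fraud check. On small samples a lower threshold is needed, because one extreme value also inflates the standard deviation.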


DATA WAREHOUSING Topic Three


Data Warehousing

• A data warehouse is defined as "A subject-oriented, integrated, time-variant, and non-volatile collection of data in support of management's decision-making process."

• In this definition the data is:

– Subject-oriented as the warehouse is organized around the major subjects of the enterprise (such as customers, products, and sales) rather than major application areas (such as customer invoicing, stock control, and product sales). The data warehouse is designed to store decision-support data rather than application-oriented data.


Data Warehousing

– Integrated because of the coming together of source data from different enterprise-wide application systems. The source data is often inconsistent, using, for example, different formats. The integrated data must be made consistent to present a unified view of the data to the users.

– Time-variant because data in the warehouse is only accurate and valid at some point in time or over some time interval.


Data Warehousing

– Non-volatile as the data is not updated in real time but is refreshed on a regular basis from different data sources. New data is always added as a supplement to the database, rather than a replacement. The database continually absorbs this new data, incrementally integrating it with the previous data.
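
The four properties in the definition can be seen together in a small sketch using Python's built-in `sqlite3` module. Everything specific here is an assumption made for illustration: the table `sales_fact`, the helper names, the two date formats, and the sample rows.

```python
import sqlite3
from datetime import datetime

def normalise_date(text):
    """Integrated: bring differently formatted source dates into one format."""
    for fmt in ("%Y-%m-%d", "%d/%m/%Y"):
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognised date format: {text!r}")

warehouse = sqlite3.connect(":memory:")
# Subject-oriented: the table is organised around the subject "sales"
warehouse.execute(
    "CREATE TABLE sales_fact (customer TEXT, sale_date TEXT, amount REAL, loaded_on TEXT)"
)

def refresh(rows, loaded_on):
    """Non-volatile: a periodic refresh only appends rows; nothing is overwritten."""
    warehouse.executemany(
        "INSERT INTO sales_fact VALUES (?, ?, ?, ?)",
        [(customer, normalise_date(day), amount, loaded_on) for customer, day, amount in rows],
    )
    warehouse.commit()

# Two hypothetical source systems that write dates differently
refresh([("Ama", "2015-01-31", 120.0)], loaded_on="2015-02-01")
refresh([("Ama", "28/02/2015", 150.0)], loaded_on="2015-03-01")

# Time-variant: every row is tied to a point in time, so history can be queried
history = warehouse.execute(
    "SELECT sale_date, amount FROM sales_fact WHERE customer = ? ORDER BY sale_date",
    ("Ama",),
).fetchall()
print(history)
```

After the second refresh the January row is still present alongside the February row, which is the difference from an operational database that would typically keep only the current value.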


Data Warehousing

• Benefits of Data Warehousing

• The successful implementation of a data warehouse can bring major benefits to an organization, including:

• Potential high returns on investment

– Implementation of data warehousing by an organization requires a huge investment. However, a study by the International Data Corporation (IDC) reported that average three-year returns on investment (ROI) in data warehousing reached 401%.


Data Warehousing

• Competitive advantage

– The competitive advantage is gained by allowing decision-makers access to data that can reveal previously unavailable, unknown, and untapped information on, for example, customers, trends, and demands.

• Increased productivity of corporate decision-makers

– Data warehousing improves the productivity of corporate decision-makers by creating an integrated database of consistent, subject-oriented, historical data.


Data Warehousing

• More cost-effective decision-making

– Data warehousing helps to reduce the overall cost of the product by reducing the number of channels.

• Better enterprise intelligence.

– It helps to provide better enterprise intelligence and enhanced customer service.
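
To read the 401% return on investment quoted among the benefits above, it helps to write ROI out. The formula below is the common textbook definition and is an assumption here; the slides do not say exactly how the IDC study calculated its figure. The symbol C stands for a hypothetical total cost.

```latex
\[
\text{ROI} = \frac{\text{total benefit} - \text{total cost}}{\text{total cost}} \times 100\%
\]
% Under this definition, a cost of C with a total benefit of 5.01C gives:
\[
\text{ROI} = \frac{5.01\,C - C}{C} \times 100\% = 401\%
\]
```

Read this way, an ROI of 401% means that over three years the warehouse returned its cost plus a further 4.01 times that cost.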


Data Warehousing

• Problems of Data Warehousing

• The problems associated with developing and managing a data warehouse are as follows:

• Underestimation of resources for data loading

– Sometimes we underestimate the time required to extract, clean, and load the data into the warehouse. This may take a significant proportion of the total development time, although tools exist to reduce the time and effort spent on this process.

• Hidden problems with source systems

– Sometimes hidden problems associated with the source systems feeding the data warehouse are identified only after years of being undetected.


Data Warehousing

• Required data not captured

– In some cases, data that is very important for the data warehouse's purpose is not captured by the source systems.

• Increased end-user demands

– After some end-user queries have been satisfied, requests for support from staff may increase rather than decrease. This is caused by users' increasing awareness of the capabilities and value of the data warehouse. Another reason for increasing demands is that once a data warehouse is online, the number of users and queries often increases, together with requests for answers to more and more complex queries.


Data Warehousing

• Data homogenization

– Data warehousing emphasizes the similarity of data formats across different data sources. This homogenization can result in the loss of some important value of the data.

• High demand for resources

– The data warehouse requires large amounts of storage for its data.


Data Warehousing

• Data ownership

– Data warehousing may change the attitude of end-users to the ownership of data. Sensitive data owned by one department has to be loaded into the data warehouse for decision-making purposes, but that department may be reluctant to share it with others.

• High maintenance

– Data warehouses are high-maintenance systems. Any reorganization of the business processes and the source systems may affect the data warehouse, resulting in high maintenance costs.


Data Warehousing

• Long-duration projects

– The building of a warehouse can take up to three years, which is why some organizations are reluctant to invest in a data warehouse. Sometimes only the historical data of a particular department is captured, resulting in a data mart.

– Data marts support only the requirements of a particular department and limit the functionality to that department or area.


Data Warehousing

• Complexity of integration

– The most important area for the management of a data warehouse is its integration capabilities. An organization must spend a significant amount of time determining how well the various data warehousing tools can be integrated into the overall solution that is needed.

– This can be a very difficult task, as there are a number of tools for every operation of the data warehouse.


References

• Helman, P. (2000). The Science of Database Management. IRWIN. Boston, Massachusetts. R. R. Donnelly and Sons Company.

• Hoffer, J. A., Prescott, M. B., & Topi, H. (2009). Modern Database Management. Pearson Prentice Hall.

• Silberschatz, A., Korth, H. F., & Sudarshan, S. (2010). Database System Concepts. Boston, Massachusetts. WCB: McGraw-Hill.
