UFCE8V-20-3 Information Systems Development 3 (SHAPE HK) Lecture n Database Theory & Practice (1) : Data Modelling

What is Data?

o A representation of facts, concepts, or instructions in a formalized manner suitable for communication, interpretation, or processing by humans or by automated means. (sam.dgs.ca.gov/TOC/4800/4819.2.htm)

o Factual information, especially information organized for analysis or used to reason or make decisions. (www.florite.com/support/terminology.htm)

o The raw material of information. Refers mostly to the information entered into, and stored within, a computer or file. (www.angelfire.com/bc/nursinginformatics/glossary.html)

o Information stored on the computer system, used by applications to accomplish tasks. (www.krollontrack.com/legalresources/glossary.asp)

Data, Information & Knowledge

The WKIDN Hierarchy

noise – unstructured, unrelated, non-symbolic, unrecognised interference; e.g. the ‘string’: ?£^&**8…---┐€↨/

data – symbolic or non-symbolic unstructured ‘facts’ about one or more domains; e.g. the ‘strings’: James 111081

information – data + meaning (+ context); e.g. my son's name and date of birth (in an application form), or my friend's name and ID (within an IRC service)

knowledge – awareness of a domain or of the procedures used to attain goals; e.g. knowing when and where each of the above two uses is appropriate.

wisdom – intuitive and heuristic understanding of the limits of knowledge and of how and when to apply it; knowing when to reject information or question the validity of data; may be counter-intuitive and abductive.
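
The step from data to information can be sketched in a few lines of Python. The same string from the slide is read in two contexts. Both function names are hypothetical, and the ddmmyy reading of the digits is an assumption; the slide does not say which date format is meant.

```python
from datetime import datetime

raw = "James 111081"  # the bare data from the slide: no meaning attached yet


def as_application_form(text: str) -> dict:
    """Context 1 (assumed): a name followed by a date of birth written as ddmmyy."""
    name, digits = text.split()
    return {"name": name, "date_of_birth": datetime.strptime(digits, "%d%m%y").date()}


def as_chat_account(text: str) -> dict:
    """Context 2 (assumed): a name followed by a numeric user id."""
    name, digits = text.split()
    return {"name": name, "user_id": int(digits)}


form_record = as_application_form(raw)
account_record = as_chat_account(raw)
print(form_record)
print(account_record)
```

The data is identical in both calls; only the context, and therefore the information, differs.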

What is a Database?

o Any organized collection of information; it may be paper or electronic. (www.library.arizona.edu/rio/glossary.htm)

o A standardized collection of information in computerized format, searchable by various parameters; in libraries often refers to electronic catalogs and indexes. (library.wexler.hunter.cuny.edu/lyannott/thesis_guide/libraryterms.html)

o A database is a collection of information stored in a computer in a systematic way, such that a computer program can consult it to answer questions. The software used to manage and query a database is known as a database management system (DBMS). The properties of database systems are studied in information science. (en.wikipedia.org/wiki/Database)

What is data modelling?

o Data modelling is concerned with the design of the data content and structure of the database.

o Data modelling gives us a formal model of an organisation which is achieved through the consolidation of the user requirements specification.

o The information gathered from fact-finding (analysis) is appraised and the basic data and data relationships are established.

o The result of the data analysis is a representation of the user's view of the data. It is independent of any DBMS software or hardware considerations.

o The model documents the structure of and interrelationships between the data. It is presented as a combination of simple diagrams and written definitions.

Why is the data model important?

o Leverage

- A small change to a data model may have a major impact on the system as a whole. In commercial information systems, the programs are far more complex and take much longer to specify and construct than the database, but their content and structure are heavily influenced by the database design. Their structure will therefore need to reflect the way the data is organized: in other words, the data model.

- A well designed data model can make programming simpler and cheaper. Even a small change to the model may lead to significant savings in total programming cost.

o Conciseness

- The data model is formal and concise, and the time required to review a data model is considerably less than that needed to review functional specifications, which could amount to hundreds of pages.

o Data Quality

- Data is a valuable organizational asset. The data model plays a key role in ensuring good data quality by establishing a common understanding of what data is to be held and how to interpret it.

What makes a good data model? (1)

o Completeness

- Does the model support all the necessary data? Do we need to record something that is currently omitted?

o Non-redundancy

- Does the model specify a database in which the same fact could be recorded more than once? Recording the same data more than once (duplication) increases the amount of space needed to store the database, requires extra processes (and processing) to keep the various copies in step, and leads to consistency problems if the copies get out of step.

o Enforcement of Business Rules

- How accurately does the model reflect and enforce the rules that apply to the business's data? If rules correctly reflect the business requirement and are correctly enforced, the resulting database will be a powerful tool in enforcing correct practice, and in maintaining data quality.

o Data Reusability

- Will the data stored in the database be reusable for purposes beyond those anticipated in the process model? This requirement is often expressed in terms of its solution: as far as possible, data should be organised independently of any specific application.
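
Two of the criteria above, non-redundancy and enforcement of business rules, can be illustrated with a minimal sketch using Python's built-in sqlite3 module. The tables, columns, and the rule "an order quantity must be positive" are hypothetical examples, not taken from the lecture.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves foreign-key checks off by default

conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    city        TEXT NOT NULL          -- recorded once, not repeated on every order
);
CREATE TABLE customer_order (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    quantity    INTEGER NOT NULL CHECK (quantity > 0)   -- hypothetical business rule
);
""")

conn.execute("INSERT INTO customer VALUES (1, 'Acme Ltd', 'Bristol')")
conn.executemany("INSERT INTO customer_order VALUES (?, ?, ?)", [(10, 1, 5), (11, 1, 2)])

# Non-redundancy: the city is a single fact, so one UPDATE corrects it for every order.
conn.execute("UPDATE customer SET city = 'Hong Kong' WHERE customer_id = 1")
cities = conn.execute("""
    SELECT DISTINCT c.city
    FROM customer_order AS o JOIN customer AS c USING (customer_id)
""").fetchall()


def is_rejected(sql: str, params: tuple) -> bool:
    """Return True if the database refuses the statement on integrity grounds."""
    try:
        conn.execute(sql, params)
    except sqlite3.IntegrityError:
        return True
    return False


zero_quantity_rejected = is_rejected("INSERT INTO customer_order VALUES (?, ?, ?)", (12, 1, 0))
unknown_customer_rejected = is_rejected("INSERT INTO customer_order VALUES (?, ?, ?)", (13, 99, 1))
print(cities, zero_quantity_rejected, unknown_customer_rejected)
```

Because the rules live in the model, every program that writes to these tables is held to them without repeating the checks in application code.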

What makes a good data model? (2)

o Stability and Flexibility

- How well will the model cope with possible changes to the business requirements? Can any new data required to support such changes be accommodated in existing tables? Alternatively, will simple extensions suffice? Or will major structural changes be required, with corresponding impact on the rest of the system?

- A data model is stable in the face of a change to requirements if it requires little or no modification. Models are more or less stable, depending on the level of change required.

- A data model is flexible if it can be readily extended to accommodate likely new requirements with only minimal impact on the existing structure.

o Elegance

- Does the data model provide a reasonably neat and simple classification of the data? Elegant models are typically simple, consistent, and easily described and summarized.

- The difference in development cost between systems based on simple, elegant data models and those based on highly complex ones can be considerable. There is a risk that a simple model ends up complex and brittle as a result of incremental business changes over a long period without any rethinking of processes and supporting data.
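
One way to picture a "simple extension" is the sketch below, again with Python's sqlite3 module. The product table and the new currency requirement are hypothetical. It shows an existing query returning the same result before and after the model is extended with a new column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product (product_id INTEGER PRIMARY KEY, name TEXT NOT NULL, price REAL NOT NULL)"
)
conn.execute("INSERT INTO product (name, price) VALUES ('Widget', 9.5)")

# A query written against the original model; it names its columns explicitly.
EXISTING_QUERY = "SELECT name, price FROM product ORDER BY product_id"
before = conn.execute(EXISTING_QUERY).fetchall()

# New requirement (hypothetical): record the currency of each price.
conn.execute("ALTER TABLE product ADD COLUMN currency TEXT NOT NULL DEFAULT 'HKD'")

after = conn.execute(EXISTING_QUERY).fetchall()
currencies = conn.execute("SELECT name, currency FROM product").fetchall()
print(before, after, currencies)
```

A change that forced the product table to be split or re-keyed would, by contrast, ripple into every program that reads it.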

What makes a good data model? (3)

o Communication

- How effective is the model in supporting communication among the various stakeholders in the design of a system? Do the tables and columns represent business concepts that the users and business specialists are familiar with and can easily verify? Will programmers interpret the model correctly?

o Integration

- How will the proposed database fit with the organization's existing and future databases? Even when individual databases are well designed, it is common for the same data to appear in more than one database and for problems to arise in drawing together data from multiple databases. Are the coding schemes and definitions consistent? How easy is it to keep the different versions in step, or to assemble a complete picture?

What makes a good data model? (4)

o Conflicting Objectives

- The above aims will often conflict with one another. An elegant but radical solution may be difficult to communicate. An elegant model may exclude requirements that do not fit. A model that accurately enforces a large number of business rules will be unstable if some of those rules change. A model that is easy to understand because it reflects the perspectives of the immediate system users may not support reusability or integrate well with other databases.

- The goal is to develop a model that provides the best balance among these possibly conflicting objectives. As in other design disciplines, achieving this is a process of proposal and evaluation, rather than a step-by-step progression to the ideal solution. We may not realize that a better solution or trade-off is possible until we see it.

(Some) consequences of a bad model/design

slow data access

specific data is hard to find

data is contradictory

fields hold meaningless data

records are hard to update

transactions go wrong

transactions corrupt data

data gets lost and cannot be recovered

database can’t scale

database design is complex and confusing

users have to download much more data than needed

security is flawed leading to damage or theft

queries cannot be optimized

Database Design Stages & Deliverables (1)

o Conceptual, Logical and Physical Data Models

- The conceptual data model is a (relatively) technology-independent specification of the data to be held in the database. It is the focus of communication between the data modeler and business stakeholders, and it is usually presented as a diagram with supporting documentation.

- The logical data model is a translation of the conceptual model into structures that can be implemented using a database management system (DBMS) – usually relational.

- The physical data model incorporates any performance considerations and is presented in terms of tables, columns, indexes etc. It will include a specification of physical storage and access mechanisms.
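
As a rough illustration of the three levels, the sketch below expresses one small, hypothetical example (students enrolling on modules) at each level. Using Python dataclasses for the conceptual level and SQLite for the DBMS are assumptions made for the sake of a runnable example; in practice the conceptual model is usually a diagram.

```python
import sqlite3
from dataclasses import dataclass


# Conceptual level: business concepts and how they relate, with no DBMS detail.
@dataclass
class Student:
    name: str


@dataclass
class Enrolment:
    student: Student
    module_code: str


# Logical level: the same concepts translated into relational tables and keys.
LOGICAL_DDL = """
CREATE TABLE student (
    student_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL
);
CREATE TABLE enrolment (
    student_id  INTEGER NOT NULL REFERENCES student(student_id),
    module_code TEXT NOT NULL,
    PRIMARY KEY (student_id, module_code)
);
"""

# Physical level: an access structure added for performance; the tables are unchanged.
PHYSICAL_DDL = "CREATE INDEX idx_enrolment_module ON enrolment (module_code);"

conn = sqlite3.connect(":memory:")
conn.executescript(LOGICAL_DDL)
conn.executescript(PHYSICAL_DDL)

tables = sorted(
    row[0] for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
indexes = [row[1] for row in conn.execute("PRAGMA index_list('enrolment')")]
print(tables, indexes)
```

Note that the index changes how quickly enrolments can be found by module, not what the data means; that separation is the point of keeping the physical model distinct.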

Database Design Stages & Deliverables (2)

o The Three-Schema Architecture and Terminology

- The internal schema describes how the data will be physically stored and accessed, using the facilities provided by a particular DBMS. It represents the foundations, electrical wiring, and hidden plumbing of the database.

- The conceptual schema describes the organization of the data into tables and columns.

- The external schemas specify views that enable different users of the data to see it in different ways. It is usual to provide one external schema that covers the entire conceptual schema and then to provide a number of external schemas that meet specific user requirements.
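
In a relational DBMS, external schemas are commonly realised as views over the tables of the conceptual schema. The sketch below assumes SQLite and a hypothetical employee table with two hypothetical user groups; the internal schema (physical storage) is left to the DBMS and is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Conceptual schema: the organization of the data into tables and columns.
CREATE TABLE employee (
    employee_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    department  TEXT NOT NULL,
    salary      REAL NOT NULL
);
INSERT INTO employee VALUES (1, 'Chan', 'Sales', 30000), (2, 'Lee', 'IT', 42000);

-- External schema for a staff directory: salary is not visible.
CREATE VIEW staff_directory AS
    SELECT name, department FROM employee;

-- External schema for payroll: only what payroll needs.
CREATE VIEW payroll AS
    SELECT employee_id, salary FROM employee;
""")

directory_columns = [col[0] for col in conn.execute("SELECT * FROM staff_directory").description]
payroll_rows = conn.execute("SELECT * FROM payroll ORDER BY employee_id").fetchall()
print(directory_columns, payroll_rows)
```

Each group of users sees the same underlying data in its own way, and neither view needs to change if the storage details underneath do.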

Where do data models fit in?

o Data-Driven Approaches

Data-driven approaches, for example Information Engineering (IE), appeared in the late 1970s and have since evolved into parallel or “blended” approaches. The emphasis was on developing the data model before the detailed process model.

o Parallel (Blended) Approaches

These provide simultaneous modelling of the data and the process models, supported by CASE products.

o Object-Oriented Approaches

These use conventional (relational) data models, as OO databases are not commonly used. They make use of UML.

o Prototyping Approaches

Rapid Application Development (RAD) approaches have in many cases replaced the traditional waterfall approaches to systems development. They use a data-driven approach to data modelling.

o Agile Methods

A backlash against “heavy” methodologies: values software over documentation, shared understanding, pair programming, etc. The data model is developed early in the development process.

Who should be involved in data modelling?

o The system users, owners, and/or sponsors will need to verify that the model meets their requirements.

o Business specialists (sometimes called subject matter experts or SMEs) may be called upon to verify the accuracy and stability of business rules incorporated in the model.

o The data modeler has overall responsibility for developing the model and ensuring that other stakeholders are fully aware of its implications for them.

o Process modelers and program designers will need to specify programs to run against the database. They will want to verify that the data model supports all the required processes.

o The physical database designer (often an additional role given to the database administrator) will need to assess whether the physical data model needs to differ substantially from the logical data model to achieve adequate performance, and, if so, propose and negotiate such changes.

o The systems integration manager (or other person with that responsibility, possibly an enterprise architect, data administrator, information systems planner, or chief information officer) will be interested in how the new database will fit into the bigger picture. Are there overlaps with other databases? Does the coding of data follow organizational or external standards?

Essential Deliverables

o Whatever formal or informal data modeling methodology you are using, the data modeling process should deliver:

– A broad summary of requirements covering scope, objectives, and future business directions.

– Inputs to the model: interview summaries, reverse-engineered models, process models, etc.

– A conceptual data model

– Entity class definitions, attribute lists, and attribute definitions for every entity class in the model

– Documentation of constraints and business rules that cannot be put in a diagram

– A logical data model

– Design notes covering decisions made in translating the conceptual model to a logical model

– Cross-reference to the process model

– Higher level and local versions of the model to facilitate presentation (as necessary).

– A physical data model

Gathering Requirements

o Good design begins with an understanding of the big picture. That understanding will take the form of (a) written structured deliverables and (b) knowledge that may never be formally recorded, but that will inform data modellers' decisions.

o This early phase in a project provides an excellent opportunity to build relationships with the business stakeholders and the other systems developers.

o Sources of information:

– The business case for the project(s) influencing the data modelling task

– Interviews and workshops

– Riding the Trucks

– Existing Systems

– Organisational Processes

The Business Case

o In reviewing a business case, you should take particular note of the following matters:

– The broad justification for the application, beneficiaries, and those disadvantaged (if applicable).

– The business concepts, rules, and terminology.

– The critical success factors for the system.

– The scope of the system.

– System size and timeframes.

– Performance-related information.

– Management information requirements for the system to meet.

– The expected lifetime and likely changes.

– Interfaces to other systems (internal and external).

Interviews and Workshops

o Essential for requirements gathering.

o Rule: invite

– (a) the people whom we believe collectively understand the requirements of the system and

– (b) anyone likely to say, after the task is complete, “why wasn’t I asked?”

o Be wary of being directed to the “user representative”: the single person delegated to answer all of your questions about the business (while the real users get on with their work).

Should You Model in Interviews and Workshops?

o Use anything but data models.

o Data models are not a comfortable language for most business people (they prefer activities).

o You need to be able to accept all terms offered by stakeholders, be they entity classes, attributes, relationships, classification schemes, categories or even instances of any of these.

Interviews with Senior Managers

o CEOs and other senior managers are usually best placed to paint a picture of future directions (the future context for the database(s)).

o Keep time demands limited and explain in concise terms the importance of the manager’s contribution to the success of the system.

o Ensure that you are familiar with their area of business and focus on future directions.

o Be aware of what the project as a whole will deliver for the interviewee.

Interviews with Subject Matter Experts

o These are the people we speak to in order to understand the data requirements in depth. Encourage them to talk about the processes and the data they use and to look critically at how well their needs are met.

o A goal- and process-based approach is often the best structure for the interview.

– “What is the purpose of what you do?”

– This leads to an examination of how the goals are achieved and what data is (ideally) required to support them.

Facilitated Workshops

o These are great for identifying and verifying requirements by brainstorming, and ensuring that a wide range of stakeholders have an opportunity to contribute.

o Some guidelines:

– Use an experienced facilitator.

– If your expertise is in data modeling, avoid facilitating the workshop yourself.

– Give the facilitator time to prepare an approach and discuss it with you.

– Appoint an informed note-taker.

– Avoid “modeling as you go.”

– Do not try to solve everything in the workshop.

Riding the Trucks

o Spend time with the hands-on users of the existing system as they go about their day-to-day work. When you do, look for:

– Variations in practices and interpretation of business rules at different locations.

– Variations in understanding of the meaning of data.

– Terminology used by the real users of the system.

– Availability and correct use of data (on several occasions we have heard, “No-one ever looks at this field, so we just make it up.”)

– Misuse or undocumented use of data fields (“Everyone knows that an ‘F’ at the beginning of the comment field signifies a difficult customer.”)

Existing Systems and Reverse Engineering

o Existing database designs provide a set of entity classes, relationships, and attributes that we can use to ask the question, “How does our new model support this?”

o If you are very fortunate, the underlying data model will be properly documented. Otherwise, you should produce at least an E-R diagram, short definitions, and attribute lists by “reverse engineering”:

– Represent existing files, segments, record types, tables, or equivalents as entity classes.

– Normalize.

– Identify relationships supported by “hard links.”

– Identify relationships supported by foreign keys.

– List the attributes for each entity class and define each entity class and attribute.
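
When the existing system is a relational database, part of this reverse engineering can be automated by reading the catalogue. The sketch below is a minimal example assuming SQLite; the customer and invoice tables stand in for an undocumented existing database and are hypothetical, as is the function name reverse_engineer. It covers only the steps for tables, attributes, and foreign keys; normalization and writing definitions remain manual work.

```python
import sqlite3

# A stand-in for an undocumented existing database (table names are hypothetical).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE invoice (
    invoice_id  INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    total       REAL NOT NULL
);
""")


def reverse_engineer(db: sqlite3.Connection) -> dict:
    """List each table as a candidate entity class with its attributes and foreign keys."""
    model = {}
    tables = [row[0] for row in db.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    for table in tables:
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        attributes = [row[1] for row in db.execute(f"PRAGMA table_info('{table}')")]
        # PRAGMA foreign_key_list rows: (id, seq, table, from, to, on_update, on_delete, match)
        relationships = [(row[3], row[2], row[4])
                         for row in db.execute(f"PRAGMA foreign_key_list('{table}')")]
        model[table] = {"attributes": attributes, "relationships": relationships}
    return model


model = reverse_engineer(conn)
for entity, details in model.items():
    print(entity, details)
```

The output is a starting point for an E-R diagram and attribute list, not a finished model: relationships that exist only by convention in the data will not appear in the catalogue.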

Process Models

o The data required by individual processes may be documented explicitly (e.g., as data stores) or implicitly within the process description (e.g., “Amend product price on invoice.”).

o You must verify the data model against the process model when it is available and allow time for enhancement of the data model.

o A one- or two-level data flow diagram or interaction diagram is a valuable aid in communicating the impact of different data models on the system as a whole.
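
Verifying the data model against the process model amounts to checking that every data item a process uses is held somewhere in the model. A minimal sketch of that check follows; the entity classes, attributes, and the second process are hypothetical, and a real check would be driven by the project's own documentation rather than hand-written dictionaries.

```python
# Hypothetical data model: entity class -> attributes it holds.
data_model = {
    "Product": {"product_id", "name", "price"},
    "Invoice": {"invoice_id", "customer_id", "issued_on"},
    "InvoiceLine": {"invoice_id", "product_id", "quantity"},
}

# Hypothetical process model: process -> (entity class, attribute) pairs it reads or writes.
process_model = {
    "Amend product price on invoice": {("InvoiceLine", "product_id"), ("InvoiceLine", "unit_price")},
    "Raise invoice": {("Invoice", "customer_id"), ("Invoice", "issued_on")},
}


def unsupported_items(data_model: dict, process_model: dict) -> dict:
    """Return, per process, the data items the data model does not yet hold."""
    gaps = {}
    for process, items in process_model.items():
        missing = {(entity, attribute) for entity, attribute in items
                   if attribute not in data_model.get(entity, set())}
        if missing:
            gaps[process] = missing
    return gaps


gaps = unsupported_items(data_model, process_model)
print(gaps)
```

Here the check reports that the invoice line has nowhere to hold an amended price, which is the kind of gap that calls for enhancement of the data model.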

Bibliography / Readings / Home based activities

Bibliography

- Data Modeling Essentials (3rd ed.), GC Simsion & GC Witt, Morgan Kaufmann, 2005

- Beginning Database Design Solutions, R Stephens, Wrox, 2009

- Data Modeling - A Beginner's Guide, A Oppel, McGraw-Hill Osborne, 2010