CS317 File and Database Systems


November 15, 2017 Sam Siewert

CS317 File and Database Systems

Lecture 13 – OODBMS and UML Concepts

http://dilbert.com/strips/comic/1995-10-11/

Reminders

PLEASE FILL OUT COURSE EVALUATIONS ON CANVAS [5 points bonus on Assignment #6]

Assignment #6, DBMS Project of Your Interest – POSTED
– Work with your Chosen Team
– Self-Directed – Autonomy, Mastery, Purpose and Life-long Learning and Requires Some Research on Your Part

Assignment #5 Grading In Progress

Exam #2 – Monday and Wednesday, Week 15

Assignment #6 Assessed with Final Grading


Interdisciplinary Nature of DBMS


DBMS
File Systems
Operating Systems
Programming Languages (SQL, OOP)
Security
Networking (Clusters, DR, Client/Server)
Storage (SAN, NAS, DAS)
Big Data Analytics?
CS332 – “R”
CS332 – C++ & Java
Final Lecture – Week 14
SE300/310 – OOA/OOD/OOP
MySQL Connectors: C/C++, Java, …

C&B REF - CHAPTER 27
OODBMS and UML Concepts


Next Generation Database Systems

First Generation DBMS: Network and Hierarchical
– Required complex programs for even simple queries.
– Minimal data independence.
– No widely accepted theoretical foundation.

Second Generation DBMS: Relational DBMS
– Helped overcome these problems.

Third Generation DBMS: OODBMS and ORDBMS. [NoSQL]

Pearson Education © 2014 5

History of Data Models


Object-Oriented Data Model

There is no single agreed object data model. One definition:

Object-Oriented Data Model (OODM)
– Data model that captures semantics of objects supported in object-oriented programming.

Object-Oriented Database (OODB)
– Persistent and sharable collection of objects defined by an OODM.

Object-Oriented DBMS (OODBMS)
– Manager of an OODB.

OMG [Object Management Group], CORBA


Object-Oriented Data Model

Zdonik and Maier present a threshold model that an OODBMS must, at a minimum, satisfy:
– It must provide database functionality.
– It must support object identity.
– It must provide encapsulation.
– It must support objects with complex state.
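A minimal sketch of three of the threshold properties (object identity, encapsulation, complex state) in plain Python. The `Part` class and its methods are hypothetical names chosen for illustration; no database functionality is shown.

```python
class Part:
    """Hypothetical object with encapsulated, complex (nested) state."""

    def __init__(self, name, subparts=None):
        self._name = name                      # private by convention
        self._subparts = list(subparts or [])  # complex state: nested objects

    def add_subpart(self, part):
        """State changes only through the public interface."""
        self._subparts.append(part)

    def subpart_count(self):
        return len(self._subparts)

    def __eq__(self, other):
        # Value equality: same name and same subparts.
        return (isinstance(other, Part)
                and self._name == other._name
                and self._subparts == other._subparts)


a = Part("bolt")
b = Part("bolt")
same_value = (a == b)         # equal state
same_identity = (a is b)      # but two distinct objects
a.add_subpart(Part("thread"))
still_equal = (a == b)        # state of a changed; its identity did not
```

Identity (`is`) is independent of state: `a` remains the same object after its state changes, which is the property an OID preserves in an OODBMS.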

Distributed Object Stores – E.g. Ceph and Amplidata, OODBMS, Object-Relational Mapping Systems


Name space

DBMS

Object-Oriented Data Model

Khoshafian and Abnous define OODBMS as:
– OO = ADTs + Inheritance + Object identity
– OODBMS = OO + Database capabilities.

Parsaye et al. give:
1. High-level query language with query optimization.
2. Support for persistence, atomic transactions: concurrency and recovery control.
3. Support for complex object storage, indexes, and access methods.
– OODBMS = OO system + (1), (2), and (3).


Commercial OODBMSs

GemStone/S from Gemstone Systems Inc.,

Objectivity/DB from Objectivity Inc.,

ObjectStore from Progress Software Corp.,

Versant Object Database, db4o, and FastObjects from Versant Corp.

Amplidata? - Object Store

Open Source Object Stores and NoSQL ( R DBMS)

– Ceph, Mongo DB NoSQL


Recall Object Definition – From Week 4

Instance of a Class [Hierarchy from Abstract that Can’t Be Instantiated to Concrete Sub-classes]
Classes Define Public and Private Data and Methods to Operate on that Data
Sub-classes Inherit from Parent Classes and Refine Data Abstraction and Methods
OOPS – Java, C++, …, back to Smalltalk and Lisp CLOS
Encapsulation and Abstraction are the Goal
Implementation Hiding and Interface Definition
Each OOP Has Variations on Support [E.g. Multiple Inheritance, Abstract Classes and Methods, Polymorphism [Parametric, Ad-hoc, Operator Overloading]]
E.g. Oracle’s Java Object Tutorial
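A short sketch of the object definition above, written in Python rather than the Java or C++ used in the course. `Shape`, `Circle`, and `Square` are hypothetical classes: an abstract parent that cannot be instantiated, and concrete sub-classes that refine it.

```python
from abc import ABC, abstractmethod
import math


class Shape(ABC):
    """Abstract parent class: cannot be instantiated directly."""

    def __init__(self, label):
        self._label = label          # private data by convention

    @abstractmethod
    def area(self):
        """Each concrete sub-class must supply its own implementation."""

    def describe(self):
        # Inherited method that relies on the sub-class's area().
        return f"{self._label}: {self.area():.2f}"


class Circle(Shape):
    def __init__(self, radius):
        super().__init__("circle")
        self._radius = radius

    def area(self):
        return math.pi * self._radius ** 2


class Square(Shape):
    def __init__(self, side):
        super().__init__("square")
        self._side = side

    def area(self):
        return self._side ** 2


shapes = [Circle(1.0), Square(2.0)]
descriptions = [s.describe() for s in shapes]   # polymorphic dispatch
```

Calling `Shape("x")` directly raises `TypeError`, since the abstract method has no implementation.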


UML Example

OOP – CS225, OOA/OOD – SE310
UML – CASE OO Analysis and Design with Modelio
CASE Tools (Like MySQL Workbench) for C++, Java


UML Class Compared to EER Schema

EER – Entities, Attributes, Relationships (Schema)
– Specifically Missing Operations on Entities
– Does Include Inheritance and Aggregation

UML Class Diagram (Specifies C++, Java, … OOP Class Hierarchy for Abstract and Concrete Classes)
– Encapsulates Operations (that can be inherited) along with Attributes in Class Hierarchy
– Includes Aggregation, Relationships and Hierarchy (like EER)
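To make the contrast concrete, here is a hedged Python sketch. `EmployeeEntity` stands in for an EER entity (attributes only), while `Employee` and `Manager` stand in for UML classes that add inheritable operations and an aggregation. All names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class EmployeeEntity:
    """EER-style view: attributes only, no operations."""
    emp_id: int
    name: str
    salary: float


@dataclass
class Employee:
    """UML-style class: same attributes plus an operation."""
    emp_id: int
    name: str
    salary: float

    def give_raise(self, percent):
        self.salary *= 1 + percent / 100


@dataclass
class Manager(Employee):
    """Inherits attributes and operations from Employee."""
    reports: list = field(default_factory=list)   # aggregation of Employees

    def payroll(self):
        return self.salary + sum(e.salary for e in self.reports)


alice = Employee(1, "Alice", 1000.0)
boss = Manager(2, "Bob", 2000.0, reports=[alice])
boss.give_raise(50)          # operation inherited from Employee
total = boss.payroll()       # 3000.0 + 1000.0
```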


Origins of the Object-Oriented Data Model


Persistent Programming Languages (PPLs)

Language that provides users with ability to (transparently) preserve data across successive executions of a program, and even allows such data to be used by many different programs.

• In contrast, database programming language (e.g. SQL) differs by its incorporation of features beyond persistence, such as transaction management, concurrency control, and recovery.
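As a rough illustration of data surviving successive executions, the sketch below uses Python's standard `shelve` module. This is only an approximation of a PPL: persistence here is explicit rather than transparent, and `run_program` and the store path are hypothetical.

```python
import os
import shelve
import tempfile


def run_program(store_path):
    """Simulates one execution: bump a counter that outlives the run."""
    with shelve.open(store_path) as store:
        store["runs"] = store.get("runs", 0) + 1
        return store["runs"]


store_path = os.path.join(tempfile.mkdtemp(), "demo_store")
first = run_program(store_path)    # counter created
second = run_program(store_path)   # counter read back from the persistent store
```

Note what is missing compared with a database programming language: no transactions, concurrency control, or recovery.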


Persistent Programming Languages (PPLs)

PPLs eliminate impedance mismatch [interface between application and DBMS] by extending programming language with database capabilities.
– In PPL, language’s type system provides data model, containing rich structuring mechanisms.

In some PPLs procedures are ‘first class’ objects and are treated like any other object in language.
– Procedures are assignable, may be result of expressions, other procedures or blocks, and may be elements of constructor types.
– Procedures can be used to implement ADTs.
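The idea of first-class procedures implementing an ADT can be sketched in Python (not a PPL, but it treats functions as ordinary objects). `make_stack` is a hypothetical constructor that returns a stack whose operations are closures over a hidden list.

```python
def make_stack():
    """Returns a stack ADT as a dictionary of procedures (closures)."""
    items = []                       # hidden representation

    def push(x):
        items.append(x)

    def pop():
        return items.pop()

    def size():
        return len(items)

    # Procedures are stored as elements of a constructed value.
    return {"push": push, "pop": pop, "size": size}


stack = make_stack()
op = stack["push"]                   # a procedure assigned to a variable
op(1)
op(2)
top = stack["pop"]()
remaining = stack["size"]()
```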


Persistent Programming Languages (PPLs)

PPL also maintains same data representation in memory as in persistent store.
– Overcomes difficulty and overhead of mapping between the two representations.

Addition of (transparent) persistence into a PPL is important enhancement to IDE, and integration of two paradigms provides more functionality and semantics.


Alternative Strategies for Developing an OODBMS

Extend existing OO programming language.
– GemStone extended Smalltalk.

Provide extensible OODBMS library.
– Approach taken by Ontos, Versant, and ObjectStore.

Embed OODB language constructs in a conventional host language.
– Approach taken by O2, which has extensions for C.
– O2 Merged into Informix, which was Acquired by IBM
– http://www-01.ibm.com/software/data/informix/


Single-Level v. Two-Level Storage Model

With a traditional DBMS, programmer has to:
1. Decide when to read and update objects.
2. Write code to translate between application’s object model and the data model of the DBMS.
3. Perform additional type-checking when object is read back from database, to guarantee object will conform to its original type.
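The three steps above can be sketched with Python's standard `sqlite3` module. The `Part` class, the `part` table, and the `save`/`load` helpers are hypothetical; the point is only how much hand-written glue the two-level model requires.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Part:
    part_id: int
    name: str
    weight: float


def save(conn, part):
    # Step 2: translate object -> row by hand.
    conn.execute(
        "INSERT OR REPLACE INTO part (part_id, name, weight) VALUES (?, ?, ?)",
        (part.part_id, part.name, part.weight),
    )


def load(conn, part_id):
    # Step 1: the programmer decides when to read.
    row = conn.execute(
        "SELECT part_id, name, weight FROM part WHERE part_id = ?", (part_id,)
    ).fetchone()
    if row is None:
        return None
    # Step 3: re-check types before rebuilding the object.
    pid, name, weight = row
    if not (isinstance(pid, int) and isinstance(name, str)
            and isinstance(weight, float)):
        raise TypeError("row does not match Part")
    return Part(pid, name, weight)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE part (part_id INTEGER PRIMARY KEY, name TEXT, weight REAL)"
)
save(conn, Part(1, "bolt", 0.5))
restored = load(conn, 1)
```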


Single-Level v. Two-Level Storage Model

Difficulties occur because conventional DBMSs have two-level storage model: storage model in memory, and database storage model on disk.

In contrast, OODBMS gives illusion of single-level storage model, with similar representation in both memory and in database stored on disk.
– Requires clever management of representation of objects in memory and on disk (called “pointer swizzling”).


Two-Level Storage Model for RDBMS


Single-Level Storage Model for OODBMS


No Swizzling

Easiest implementation is not to do any swizzling.
Objects faulted into memory, and handle passed to application containing object’s OID.
OID is used every time the object is accessed.
System must maintain some type of lookup table - Resident Object Table (ROT) - so that object’s virtual memory pointer can be located and then used to access object.
Inefficient if same objects are accessed repeatedly.
Acceptable if objects only accessed once.
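A minimal sketch of the no-swizzling approach, assuming a dictionary stands in for disk and Python references stand in for virtual memory pointers. `ObjectStore` and `ResidentObjectTable` are hypothetical names. Every dereference pays for a table lookup keyed by OID.

```python
class ObjectStore:
    """Hypothetical 'disk': maps OID -> stored state."""

    def __init__(self, data):
        self._data = data
        self.reads = 0

    def fetch(self, oid):
        self.reads += 1
        return dict(self._data[oid])   # copy stands in for a disk read


class ResidentObjectTable:
    def __init__(self, store):
        self._store = store
        self._resident = {}            # OID -> in-memory object

    def deref(self, oid):
        """Every access goes through the table, keyed by OID."""
        obj = self._resident.get(oid)
        if obj is None:                # object fault
            obj = self._store.fetch(oid)
            self._resident[oid] = obj
        return obj


store = ObjectStore({"oid-1": {"name": "bolt", "next": "oid-2"},
                     "oid-2": {"name": "nut", "next": None}})
rot = ResidentObjectTable(store)
first = rot.deref("oid-1")
second = rot.deref(first["next"])      # the reference is still an OID
again = rot.deref("oid-1")             # table lookup, no second disk read
```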


Resident Object Table (ROT)


Object Referencing

Need to distinguish between resident and non-resident objects.
Most techniques are variations of edge marking or node marking.
Edge marking marks every object pointer with a tag bit:
– if bit set, reference is to memory pointer;
– else, still pointing to OID and needs to be swizzled when object it refers to is faulted into memory.
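Edge marking might be sketched as follows. The `Ref` class carries the tag bit, and `DISK`, `resident`, `fault_in`, and `follow` are hypothetical names. This version swizzles lazily on first dereference, which is one possible policy and an assumption rather than something the slide prescribes.

```python
class Ref:
    """An object reference with a tag bit (edge marking)."""

    def __init__(self, oid):
        self.swizzled = False   # tag bit: False means target is still an OID
        self.target = oid


class Obj:
    def __init__(self, name, next_oid=None):
        self.name = name
        self.next = Ref(next_oid) if next_oid is not None else None


DISK = {1: ("bolt", 2), 2: ("nut", None)}   # OID -> (name, next OID)
resident = {}                               # OID -> in-memory Obj


def fault_in(oid):
    """Bring an object into memory if it is not already resident."""
    if oid not in resident:
        name, next_oid = DISK[oid]
        resident[oid] = Obj(name, next_oid)
    return resident[oid]


def follow(ref):
    """Dereference; swizzle the edge the first time it is used."""
    if not ref.swizzled:
        ref.target = fault_in(ref.target)   # OID replaced by memory reference
        ref.swizzled = True
    return ref.target


head = fault_in(1)
before = head.next.swizzled
nxt = follow(head.next)
after = head.next.swizzled
```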


Object Referencing

Node marking requires that all object references are immediately converted to virtual memory pointers when object is faulted into memory.

First approach is software-based technique but second can be implemented using software or hardware-based techniques.


Pointer Swizzling Techniques

The action of converting object identifiers (OIDs) to main memory pointers.

• Aim is to optimize access to objects.
• Should be able to locate any referenced objects on secondary storage using their OIDs.
• Once objects have been read into cache, want to record that objects are now in memory to prevent them from being retrieved again.
• Could hold lookup table that maps OIDs to memory pointers (e.g. using hashing).
• Pointer swizzling attempts to provide a more efficient strategy by storing memory pointers in the place of referenced OIDs, and vice versa when the object is written back to disk.
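A small sketch of swizzling and unswizzling, assuming a dictionary of `(name, next OID)` records stands in for disk and Python object references stand in for memory pointers. `Node`, `swizzle`, and `unswizzle` are hypothetical, and the whole store is loaded eagerly for simplicity.

```python
class Node:
    def __init__(self, oid, name, next_ref):
        self.oid = oid
        self.name = name
        self.next = next_ref     # either an OID (int) or a Node


def swizzle(disk):
    """Load every stored record and replace OIDs with direct references."""
    cache = {oid: Node(oid, name, nxt) for oid, (name, nxt) in disk.items()}
    for node in cache.values():
        if node.next is not None:
            node.next = cache[node.next]      # OID -> memory pointer
    return cache


def unswizzle(cache):
    """Write back: replace direct references with OIDs again."""
    return {
        oid: (node.name, node.next.oid if node.next is not None else None)
        for oid, node in cache.items()
    }


disk = {1: ("bolt", 2), 2: ("nut", 3), 3: ("washer", None)}
cache = swizzle(disk)
third = cache[1].next.next.name   # plain pointer chasing, no OID lookups
round_trip = unswizzle(cache)
```

After swizzling, navigation needs no lookup table. The cost is moved to load and write-back time.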


Interesting to Ponder…

How Will Persistent Main Memory Impact?
– E.g. Phase Change Memory, PCM
– Memristor
– Nand Programming Time is Too Slow for Persistent Memory Solution

How Will SSD and Nand Flash I/O Cards Impact?
– Fusion IO, Micron, Virident [Purchased by WD], Intel, etc. – Persistent PCI Express [NVM Express] Block Device
– SSD – SAS/SATA Device with Nand Internal Storage – Emulates HDD with Faster Random Access


Accessing an Object with a RDBMS


Accessing an Object with an OODBMS


Persistent Schemes

Consider three persistent schemes:
– Checkpointing.
– Serialization.
– Explicit Paging.

Note, persistence can also be applied to (object) code and to the program execution state.
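Of the three schemes, serialization is the easiest to show briefly. The sketch below assumes JSON as the serialized form and a hypothetical `Account` class; other formats would serve equally well.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class Account:
    owner: str
    balance: int
    history: list = field(default_factory=list)

    def deposit(self, amount):
        self.balance += amount
        self.history.append(["deposit", amount])


acct = Account("alice", 100)
acct.deposit(50)

blob = json.dumps(asdict(acct))            # object state -> text (could go to a file)
restored = Account(**json.loads(blob))     # text -> a new, equivalent object
```

The restored object has equal state but a new identity, so shared references between objects are not preserved without extra bookkeeping.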

Relate Back to Transactions


State Transition Diagram for Transaction


Written to Storage (Cached w/o ATA Flush, SCSI FUA)

http://en.wikipedia.org/wiki/Disk_buffer

Completed Transaction Buffered for Write-back

Partially Completed Transaction Cleared from Buffer – No Write-back

If We NEVER used Cache And ALWAYS used FUA, this helps, But WAY TOO SLOW

Client-Server Architecture


Architecture - Storing and Executing Methods

Two approaches:
– Store methods in external files.
– Store methods in database.

Benefits of latter approach:
– Eliminates redundant code.
– Simplifies modifications.
– Methods are more secure.
– Methods can be shared concurrently.
– Improved integrity.

Obviously, more difficult to implement.


Benchmarking - Wisconsin benchmark

Original benchmark had 3 relations: one called Onektup with 1000 tuples, and two others called Tenktup1/Tenktup2 with 10000 tuples each.

Generally useful although does not cater for highly skewed attribute distributions and join queries used are relatively simplistic.

Consortium of manufacturers formed Transaction Processing Performance Council (TPC) in 1988 to create series of transaction-based test suites to measure database/TP environments.


TPC Benchmarks

TPC-A and TPC-B for OLTP (now obsolete).
TPC-C replaced TPC-A/B and based on order entry application.
TPC-H for ad hoc, decision support environments.
TPC-R for business reporting within decision support environments.
TPC-W, a transactional Web benchmark for eCommerce.


Object Operations Version 1 (OO1) Benchmark

Intended as generic measure of OODBMS performance. Designed to reproduce operations common in advanced engineering applications, such as finding all parts connected to a random part, all parts connected to one of those parts, and so on, to a depth of seven levels.

About 1990, benchmark was run on GemStone, Ontos, ObjectStore, Objectivity/DB, and Versant, and INGRES and Sybase. Results showed an average 30-fold performance improvement for OODBMSs over RDBMSs.
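The traversal operation described above (parts connected to a part, then parts connected to those, to a depth of seven) can be sketched as a breadth-first walk. This is an illustration of the access pattern only, not the benchmark itself. `build_parts`, `traverse`, the graph size, and the fan-out are assumptions, and this version visits each part at most once.

```python
import random
from collections import deque


def build_parts(n_parts, fanout, seed=0):
    """Each part connects to `fanout` randomly chosen parts."""
    rng = random.Random(seed)
    return {p: [rng.randrange(n_parts) for _ in range(fanout)]
            for p in range(n_parts)}


def traverse(parts, start, max_depth=7):
    """Breadth-first walk returning every part reachable within max_depth hops."""
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        part, depth = frontier.popleft()
        if depth == max_depth:
            continue                      # do not expand beyond the depth limit
        for nxt in parts[part]:
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, depth + 1))
    return seen


parts = build_parts(n_parts=1000, fanout=3)
reached = traverse(parts, start=0)
```

In an OODBMS each hop is a pointer dereference on cached objects, whereas a relational implementation would typically need a join or a separate query per level. That difference is the kind of workload the benchmark was designed to expose.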


OO7 Benchmark

More comprehensive set of tests and a more complex database based on parts hierarchy.
Designed for detailed comparisons of OODBMS products.
Simulates CAD/CAM environment and tests system performance in area of object-to-object navigation over cached data, disk-resident data, and both sparse and dense traversals.
Also tests indexed and nonindexed updates of objects, repeated updates, and the creation and deletion of objects.


Advantages of OODBMSs

Enriched Modeling Capabilities.
Extensibility.
Removal of Impedance Mismatch.
More Expressive Query Language.
Support for Schema Evolution.
Support for Long Duration Transactions.
Applicability to Advanced Database Applications.
Improved Performance.


Disadvantages of OODBMSs

Lack of Universal Data Model.
Lack of Experience.
Lack of Standards.
Query Optimization compromises Encapsulation.
Object Level Locking may impact Performance.
Complexity.
Lack of Support for Views.
Lack of Support for Security.


Comparison of ORDBMS and OODBMS – Data Modeling


Comparison of ORDBMS and OODBMS – Data Access


Comparison of ORDBMS and OODBMS – Data Sharing


Summary

Ignore FDM [Functional Data Model] and Paging Details in Chapter 27

OODBMS is a Work in Progress, but Promising

NoSQL, Object Stores, Innovation in File Systems

RDBMS Remains Ideal for Transaction Workloads

OODBMS, Object Stores, NoSQL – Growing Alternative for Newer Workloads – Object Oriented Data (BLOBs) and Big Data Analytics


UML BASICS (SE310)
UML Introduction


UML

Represents unification and evolution of several OOAD methods, particularly:
– Booch method,
– Object Modeling Technique (OMT),
– Object-Oriented Software Engineering (OOSE).

Adopted as a standard by OMG and accepted by software community as primary notation for modeling objects and components.


UML

Defined as “a standard language for specifying, constructing, visualizing, and documenting the artifacts of a software system”.
The UML does not prescribe any particular methodology, but instead is flexible and customizable to fit any approach and can be used in conjunction with a wide range of software lifecycles and development processes.


UML – Design Goals

Provide ready-to-use, expressive visual modeling language so users can develop and exchange meaningful models.
Provide extensibility and specialization mechanisms to extend core concepts.
Be independent of particular programming languages and development processes.
Provide a formal basis for understanding the modeling language.
Encourage growth of object-oriented tools market.
Support higher-level development concepts such as collaborations, frameworks, patterns, and components.
Integrate best practices.


UML - Diagrams

Structural:
– class diagrams
– object diagrams
– component diagrams
– deployment diagrams.

Behavioral:
– use case diagrams
– sequence diagrams
– collaboration diagrams
– statechart diagrams
– activity diagrams.


UML – Object Diagrams

Model instances of classes and used to describe system at a particular point in time.

Can be used to validate class diagram with “real world” data and record test cases.


UML – Component Diagrams

Describe organization and dependencies among physical software components, such as source code, run-time (binary) code, and executables.


UML – Deployment Diagrams

Depict configuration of run-time system, showing hardware nodes, components that run on these nodes, and connections between nodes.


UML – Use Case Diagrams

Model functionality provided by system (use cases), users who interact with system (actors), and association between users and the functionality.
Used in requirements collection and analysis phase to represent high-level requirements of system.
More specifically, specifies a sequence of actions, including variants, that system can perform and that yields an observable result of value to a particular actor.


UML – Use Case Diagrams


UML – Use Case Diagrams


UML – Sequence Diagrams

Model interactions between objects over time, capturing behavior of an individual use case.
Show the objects and the messages that are passed between these objects in the use case.


UML – Sequence Diagrams


UML – Collaboration Diagrams

Show interactions between objects as a series of sequenced messages.
Cross between an object diagram and a sequence diagram.
Unlike sequence diagram, which has column/row format, collaboration diagram uses free-form arrangement, which makes it easier to see all interactions involving a particular object.


UML – Collaboration Diagrams


UML – Statechart Diagrams

Show how objects can change in response to external events.
Usually model transitions of a specific object.
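A statechart for a single object can be written down as a transition table. The sketch below uses a hypothetical `Order` object with made-up states and events, purely to show how external events drive state changes and how disallowed transitions are rejected.

```python
# (current state, event) -> next state; anything not listed is not allowed.
TRANSITIONS = {
    ("pending", "pay"): "paid",
    ("pending", "cancel"): "cancelled",
    ("paid", "ship"): "shipped",
    ("paid", "cancel"): "cancelled",
}


class Order:
    def __init__(self):
        self.state = "pending"       # initial state

    def handle(self, event):
        """Apply an external event, or reject it if no transition exists."""
        key = (self.state, event)
        if key not in TRANSITIONS:
            raise ValueError(
                f"event {event!r} not allowed in state {self.state!r}"
            )
        self.state = TRANSITIONS[key]
        return self.state


order = Order()
order.handle("pay")
order.handle("ship")
final_state = order.state
```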


UML – Activity Diagrams

Model flow of control from one activity to another.
Typically represent invocation of an operation, a step in a business process, or an entire business process.
Consist of activity states and transitions between them.


UML – Activity Diagrams


UML – Usage in Database Design Methodology

Produce use case diagrams from requirements specification or while producing requirements specification to depict main functions required of system. Can be augmented with use case descriptions.
Produce first cut class diagram (ER model).
Produce a sequence diagram for each use case or group of related use cases.
May be useful to add a control class to class diagram to represent interface between the actors and the system.


UML – Usage in Database Design Methodology

Update class diagram to show required methods in each class.
Create state diagram for each class to show how class changes state in response to messages. Messages are identified from sequence diagrams.
Revise earlier diagrams based on new knowledge gained during this process.


Next Steps for UML and OOA/OOD/OOP

Take SE310 – Analysis and Design of Software Systems

Test out Modelio on PRClab or Download for your PC

Start using UML Analysis and Design Methods [in addition to ER/EER and SA/SD]

Take C++ Programming

Learn Java Programming

Take CS332 – Organization of Programming Languages


Summary

Along with Connectors for Applications, This Completes CS317 Material

Final Quiz on Wed, Dec 2nd

Final Oral Exam on Dec 5th, 8AM

Material We Ran Out of Time to Cover– Distributed DBMS and Scaling– Comprehensive Final Review (Final Oral Exam Instead)
