Persistence


Persistence Usage

• Scalability
  – disk cheaper than memory
• Fault recovery
  – last known state maintained through recovery
• Parallel processing
  – multiple processors working on shared data source
• Queryable storage
  – locate objects for access
• Checkpointing
  – save current state, potentially as a blob
• Pass by Value
  – pass objects from one process to another so that method invocations on the passed object result in a local method call in the receiving process


Persistence Types

• Basic
  – No Persistence
  – Simple Persistence
  – Object Serialization
• Object/RDBMS
  – Using Blobs
  – Horizontal Partitioning
  – Vertical Partitioning
  – Unification
• OODBMS


No Persistence: Client Must Manage State

Account
  id_ : String
  balance_ : double
  ownerId_ : String

Participants: Client, Account, DataOutputStream

1: get id_()
2: writeBytes()
3: get balance_()
4: writeDouble()
5: get ownerId_()
6: writeBytes()

• Used when legacy classes were not designed with persistence built in


No Persistence: Implementation Does Not Address Persistence of State

    public class Account {
        public String id_;
        public double balance_;
        public String ownerId_;

        public Account(String id, double balance, String ownerId) {
            id_ = id;
            balance_ = balance;
            ownerId_ = ownerId;
        }

        public void print(java.util.PrintStream out) { ... }
    }


Client Code Must Implement Persistence

    void save() throws IOException {
        System.out.println("Client saving accounts");
        DataOutputStream ostream = new DataOutputStream(
            new FileOutputStream(stateFile_));
        // record count first, so restore() knows how many accounts follow
        ostream.writeInt(accounts_.length);
        for (int i = 0; i < accounts_.length; i++) {
            // each string is stored as a length prefix followed by its bytes
            ostream.writeInt(accounts_[i].id_.length());
            ostream.writeBytes(accounts_[i].id_);
            ostream.writeDouble(accounts_[i].balance_);
            ostream.writeInt(accounts_[i].ownerId_.length());
            ostream.writeBytes(accounts_[i].ownerId_);
        }
        ostream.close();
    }


Client Code Must Implement Persistence

    void restore() throws IOException {
        DataInputStream istream = new DataInputStream(
            new FileInputStream(stateFile_));
        // fields must be read in exactly the order save() wrote them
        accounts_ = new Account[istream.readInt()];
        for (int i = 0; i < accounts_.length; i++) {
            int len = istream.readInt();
            byte buffer[] = new byte[len];
            istream.readFully(buffer);
            String id = new String(buffer);
            double balance = istream.readDouble();
            len = istream.readInt();
            buffer = new byte[len];
            istream.readFully(buffer);
            String ownerId = new String(buffer);
            accounts_[i] = new Account(id, balance, ownerId);
        }
        istream.close();
    }
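One detail of the hand-written format above is worth a small experiment: DataOutputStream.writeBytes keeps only the low-order byte of each character, so the length-prefix scheme is only lossless for 8-bit text. The sketch below compares it with writeUTF/readUTF. The class and method names are hypothetical, and ISO-8859-1 is assumed on the read side to keep the result independent of the platform charset (the slides use the default charset).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class StringFieldDemo {
    /** The slide's scheme: writeInt(length) + writeBytes, read back into a byte buffer. */
    static String roundTripBytes(String value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(bytes)) {
                out.writeInt(value.length());
                out.writeBytes(value); // keeps only the low byte of each char
            }
            try (DataInputStream in = new DataInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                byte[] buffer = new byte[in.readInt()];
                in.readFully(buffer);
                // assumption: decode as ISO-8859-1 so the demo is deterministic
                return new String(buffer, StandardCharsets.ISO_8859_1);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Alternative: writeUTF stores its own length prefix and handles non-ASCII text. */
    static String roundTripUtf(String value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(bytes)) {
                out.writeUTF(value);
            }
            try (DataInputStream in = new DataInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readUTF();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String euro = "\u20AC100"; // euro sign followed by "100"
        System.out.println(roundTripBytes("acct-1").equals("acct-1")); // true
        System.out.println(roundTripBytes(euro).equals(euro));         // false
        System.out.println(roundTripUtf(euro).equals(euro));           // true
    }
}
```

For plain ASCII account ids the two schemes behave the same; writeUTF is limited to strings whose encoded form fits in 65535 bytes.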


Simple Persistence

• Class takes over responsibility for state persistence
• Slight improvement over no persistence

Account
  id_ : String
  balance_ : double
  ownerId_ : String
  readExternal(in : DataInputStream)
  writeExternal(out : DataOutputStream)

Participants: Client, Account, DataOutputStream

1: writeExternal(DataOutputStream)
2: writeBytes()
3: writeDouble()
4: writeBytes()


Simple Persistence: Class Saves Its Own State

    import java.io.PrintStream;
    import java.io.IOException;
    import java.io.DataInputStream;
    import java.io.DataOutputStream;

    public class Account {
        private String id_;
        private double balance_;
        private String ownerId_;

        public Account() { }

        public Account(String id, double balance, String ownerId) {
            id_ = id;
            balance_ = balance;
            ownerId_ = ownerId;
        }
        //...


Simple Persistence: Class Saves Its Own State (cont.)

        public void writeExternal(DataOutputStream ostream)
                throws IOException {
            ostream.writeInt(id_.length());
            ostream.writeBytes(id_);
            ostream.writeDouble(balance_);
            ostream.writeInt(ownerId_.length());
            ostream.writeBytes(ownerId_);
        }

        public void readExternal(DataInputStream istream)
                throws IOException {
            int len = istream.readInt();
            byte buffer[] = new byte[len];
            istream.readFully(buffer);
            id_ = new String(buffer);
            balance_ = istream.readDouble();
            len = istream.readInt();
            buffer = new byte[len];
            istream.readFully(buffer);
            ownerId_ = new String(buffer);
        }
    }


Simple Persistence: Client Simplified

• Client shielded from details of save/restore

    void save() throws IOException {
        System.out.println("Client saving accounts");
        DataOutputStream ostream = new DataOutputStream(
            new FileOutputStream(stateFile_));
        ostream.writeInt(accounts_.length);
        for (int i = 0; i < accounts_.length; i++) {
            accounts_[i].writeExternal(ostream);
        }
        ostream.close();
    }


Simple Persistence: Client Must Know Object's Class

• Client still must know the class of the object saved/restored

    void restore() throws IOException {
        System.out.println("Client restoring accounts");
        DataInputStream istream = new DataInputStream(
            new FileInputStream(stateFile_));
        accounts_ = new Account[istream.readInt()];
        for (int i = 0; i < accounts_.length; i++) {
            accounts_[i] = new Account();
            accounts_[i].readExternal(istream);
        }
        istream.close();
    }


Simple Persistence Problem

• Objects may have multiple references to them

• An object may be saved multiple times, once for each reference

• Multiple clones might be instantiated, one for each persisted copy


Simple Persistence Problem

In memory (one shared Secretary):
  Manager (name = M)      secretary -> Secretary (name = Money Penny)
  Manager (name = James)  secretary -> the same Secretary

Written out (one Secretary copy per reference):
  Manager (name = M)      secretary -> Secretary (name = Money Penny)
  Manager (name = James)  secretary -> Secretary (name = Money Penny), a separate copy

When written out, we get multiple copies of aliased objects because object references are not resolved.
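The duplication can be reproduced with a few lines of hand-written persistence. The following is a minimal sketch, assuming hypothetical Manager and Secretary classes that follow the writeExternal/readExternal style of the earlier Account example (writeUTF is used here only to keep the string handling short).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class AliasingDemo {
    static class Secretary {
        final String name;
        Secretary(String name) { this.name = name; }

        void writeExternal(DataOutputStream out) throws IOException {
            out.writeUTF(name);
        }
        static Secretary readExternal(DataInputStream in) throws IOException {
            return new Secretary(in.readUTF());
        }
    }

    static class Manager {
        final String name;
        final Secretary secretary;
        Manager(String name, Secretary secretary) {
            this.name = name;
            this.secretary = secretary;
        }

        void writeExternal(DataOutputStream out) throws IOException {
            out.writeUTF(name);
            secretary.writeExternal(out); // state is written once per reference
        }
        static Manager readExternal(DataInputStream in) throws IOException {
            String name = in.readUTF();
            return new Manager(name, Secretary.readExternal(in)); // always a new object
        }
    }

    /** Saves the managers to a byte array and restores them from it. */
    static Manager[] roundTrip(Manager[] managers) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(bytes)) {
                out.writeInt(managers.length);
                for (Manager m : managers) m.writeExternal(out);
            }
            try (DataInputStream in = new DataInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                Manager[] restored = new Manager[in.readInt()];
                for (int i = 0; i < restored.length; i++) {
                    restored[i] = Manager.readExternal(in);
                }
                return restored;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Secretary shared = new Secretary("Money Penny");
        Manager[] before = { new Manager("M", shared), new Manager("James", shared) };
        Manager[] after = roundTrip(before);
        System.out.println(before[0].secretary == before[1].secretary); // true
        System.out.println(after[0].secretary == after[1].secretary);   // false
    }
}
```

After the round trip each Manager holds its own Secretary, so a change made through one Manager is no longer visible through the other.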


Serialization

• Clients do not need to know the type of the object saved
  – an abstract interface (java.io.Serializable) is defined to tag the object
• Clients do not need to know the class of the object restored
  – any abstract super-class or interface of the object is suitable
• Serialization takes care of object references
• Classes take responsibility for their persisted elements
  – delegate to the language
  – tag elements for no persistence
  – specialize the save process for alternate storage mechanisms


Classes Add a Tagging Interface

    import java.io.PrintStream;
    import java.io.Serializable;

    public class Account implements Serializable {
        private String id_;
        private double balance_;
        private Owner owner_;

        public Account(String id, double balance, Owner owner) {
            id_ = id;
            balance_ = balance;
            owner_ = owner;
        }
    }


Associated Classes Add the Tagging Interface

    import java.io.PrintStream;
    import java.io.Serializable;

    public class Owner implements Serializable {
        private String name_;
        private String taxId_;

        public Owner(String name, String taxId) {
            name_ = name;
            taxId_ = taxId;
        }

        public void print(PrintStream out) {
            out.println(this + "- name=" + name_ + ", taxid=" + taxId_);
        }
    }


Clients Have Simple Save and Restore Mechanism

    void save() throws IOException {
        System.out.println("Client saving accounts");
        ObjectOutputStream ostream = new ObjectOutputStream(
            new FileOutputStream(stateFile));
        ostream.writeObject(accounts);
        ostream.close();
    }

    void restore() throws IOException, ClassNotFoundException {
        System.out.println("Client restoring accounts");
        ObjectInputStream istream = new ObjectInputStream(
            new FileInputStream(stateFile));
        accounts = (Account[]) istream.readObject();
        istream.close();
    }


References to Common Objects are Resolved

Accounts 1 and 3 share one Owner instance before saving (@173d14) and still share one after restoring (@173f4a).

    creating Client with new accounts
    elements=3
     id=1 balance=100.0
      streaming.serialize.Owner@173d14- name=bob, taxid=111-11-1111
     id=2 balance=200.0
      streaming.serialize.Owner@173d0f- name=larry, taxid=222-22-2222
     id=3 balance=300.0
      streaming.serialize.Owner@173d14- name=bob, taxid=111-11-1111
    Client saving accounts

    Client adopting statefile
    Client restoring accounts
    elements=3
     id=1 balance=100.0
      streaming.serialize.Owner@173f4a- name=bob, taxid=111-11-1111
     id=2 balance=200.0
      streaming.serialize.Owner@173f3f- name=larry, taxid=222-22-2222
     id=3 balance=300.0
      streaming.serialize.Owner@173f4a- name=bob, taxid=111-11-1111
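The same behaviour can be checked without a state file. This is a small sketch, assuming simplified Account and Owner classes (field names without the trailing underscore are hypothetical) and in-memory byte streams in place of FileOutputStream/FileInputStream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class SharedReferenceDemo {
    static class Owner implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        Owner(String name) { this.name = name; }
    }

    static class Account implements Serializable {
        private static final long serialVersionUID = 1L;
        final String id;
        final Owner owner;
        Account(String id, Owner owner) {
            this.id = id;
            this.owner = owner;
        }
    }

    /** Serializes the whole array in one writeObject call and reads it back. */
    static Account[] roundTrip(Account[] accounts) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(accounts);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Account[]) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Owner bob = new Owner("bob");
        Owner larry = new Owner("larry");
        Account[] before = {
            new Account("1", bob), new Account("2", larry), new Account("3", bob)
        };
        Account[] after = roundTrip(before);
        System.out.println(after[0].owner == after[2].owner);  // true: still shared
        System.out.println(after[0].owner == before[0].owner); // false: a new instance
    }
}
```

Sharing is preserved only among objects written to the same ObjectOutputStream; writing each account to its own stream would bring the duplication back.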


Protecting Attributes from Serialization

    import java.util.Date;

    public class Account implements java.io.Serializable {
        private String id_;
        private double balance_;
        private Owner owner_;
        private transient Date dummy_; //only initialized on creation

        public Account(String id, double balance, Owner owner) {
            id_ = id;
            balance_ = balance;
            owner_ = owner;
            dummy_ = new Date();
        }
        //...
    }


Protecting Attributes from Serialization

    creating Client with new accounts
    elements=3
     id=1 balance=100.0
      streaming.serialize.Owner@173d14- name=bob, taxid=111-11-1111
      transient dummy=Sun Aug 01 22:23:35 EDT 1999
     id=2 balance=200.0
      streaming.serialize.Owner@173d0f- name=larry, taxid=222-22-2222
      transient dummy=Sun Aug 01 22:23:35 EDT 1999
     id=3 balance=300.0
      streaming.serialize.Owner@173d14- name=bob, taxid=111-11-1111
      transient dummy=Sun Aug 01 22:23:35 EDT 1999
    Client saving accounts

    Client adopting statefile
    Client restoring accounts
    elements=3
     id=1 balance=100.0
      streaming.serialize.Owner@174af0- name=bob, taxid=111-11-1111
      transient dummy=null
     id=2 balance=200.0
      streaming.serialize.Owner@174ae5- name=larry, taxid=222-22-2222
      transient dummy=null
     id=3 balance=300.0
      streaming.serialize.Owner@174af0- name=bob, taxid=111-11-1111
      transient dummy=null


Providing Manual Serialization Overrides

    public class Account implements java.io.Serializable {
        private String id_;
        private double balance_;
        private Owner owner_;
        private transient Date dummy_; //only initialized on ctor

        private void writeObject(ObjectOutputStream out)
                throws IOException {
            System.out.println("do something to override writeObject");
            out.writeObject(id_);
            out.writeDouble(balance_);
            out.writeObject(owner_);
            out.writeObject(dummy_); //write transient var out anyway
        }

        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            System.out.println("do something to override readObject");
            // read the fields back in the same order writeObject wrote them
            id_ = (String) in.readObject();
            balance_ = in.readDouble();
            owner_ = (Owner) in.readObject();
            dummy_ = (Date) in.readObject(); //read transient var in anyway
        }
        //...
    }


Providing Manual Serialization Overrides

    creating Client with new accounts
    elements=3
     id=1 balance=100.0
      streaming.serialize.Owner@173d14- name=bob, taxid=111-11-1111
      transient dummy=Sun Aug 01 22:39:57 EDT 1999
     id=2 balance=200.0
      streaming.serialize.Owner@173d0f- name=larry, taxid=222-22-2222
      transient dummy=Sun Aug 01 22:39:57 EDT 1999
     id=3 balance=300.0
      streaming.serialize.Owner@173d14- name=bob, taxid=111-11-1111
      transient dummy=Sun Aug 01 22:39:57 EDT 1999
    Client saving accounts
    do something to override writeObject
    do something to override writeObject
    do something to override writeObject

    Client adopting statefile
    Client restoring accounts
    do something to override readObject
    do something to override readObject
    do something to override readObject
    elements=3
     id=1 balance=100.0
      streaming.serialize.Owner@174ad8- name=bob, taxid=111-11-1111
      transient dummy=Sun Aug 01 22:39:57 EDT 1999
     id=2 balance=200.0
      streaming.serialize.Owner@174b19- name=larry, taxid=222-22-2222
      transient dummy=Sun Aug 01 22:39:57 EDT 1999
     id=3 balance=300.0
      transient dummy=Sun Aug 01 22:39:57 EDT 1999
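A common variation on the override shown earlier, not taken from the slides, is to let defaultWriteObject/defaultReadObject handle the ordinary fields and write only the extra data by hand. The sketch below assumes a cut-down Account with hypothetical field names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Date;

class ManualOverrideDemo {
    static class Account implements Serializable {
        private static final long serialVersionUID = 1L;
        private String id;
        private double balance;
        private transient Date created; // skipped by default serialization

        Account(String id, double balance) {
            this.id = id;
            this.balance = balance;
            this.created = new Date();
        }

        private void writeObject(ObjectOutputStream out) throws IOException {
            out.defaultWriteObject();  // writes id and balance
            out.writeObject(created);  // then the transient field, by hand
        }

        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            in.defaultReadObject();            // restores id and balance
            created = (Date) in.readObject();  // same order as writeObject
        }

        String getId() { return id; }
        double getBalance() { return balance; }
        Date getCreated() { return created; }
    }

    /** Serializes one account to a byte array and reads it back. */
    static Account roundTrip(Account account) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(account);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Account) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Account original = new Account("1", 100.0);
        Account restored = roundTrip(original);
        System.out.println(restored.getCreated().equals(original.getCreated())); // true
    }
}
```

With this shape, adding a new non-transient field does not require touching writeObject or readObject.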


Object/RDBMS

• How do we map the following Class Model to an RDBMS?

Account
  id_ : String
  balance_ : double

InterestBearingAccount (sub-class of Account)
  rate_ : double
  termDays_ : int
  minimumBalance_ : double

CheckingAccount (sub-class of Account)
  checkFee_ : double

Owner
  name_ : String
  taxId_ : String

Association: Account (*) to Owner (1), role owner_


Storing the Objects as Blobs

    void save() throws SQLException, Exception {
        PreparedStatement pstatement = null;
        try {
            pstatement = connection_.prepareStatement(
                "insert into accounts(id, data) values (?, ?)");
            for (int i = 0; i < accounts_.length; i++) {
                pstatement.setString(1, accounts_[i].getId());
                // serialize the account to a temporary file, then stream
                // the file contents into the blob column
                File file = File.createTempFile("tmp", "dat");
                try {
                    ObjectOutputStream ostream = new ObjectOutputStream(
                        new FileOutputStream(file));
                    ostream.writeObject(accounts_[i]);
                    ostream.close();
                    FileInputStream istream = new FileInputStream(file);
                    pstatement.setBinaryStream(2, istream, (int) file.length());
                    //pstatement.setObject(2,accounts_[i]);
                    pstatement.execute();
                    pstatement.clearParameters();
                    istream.close();
                } finally {
                    file.delete();
                }
            }
        } finally {
            if (pstatement != null) pstatement.close();
        }
    }


Restoring Objects from Blobs

    void restore() throws SQLException, Exception {
        Statement statement = null;
        ResultSet rs = null;
        try {
            statement = connection_.createStatement();
            rs = statement.executeQuery("select id, data from accounts");
            Vector accounts = new Vector();
            while (rs.next()) {
                String accountNo = rs.getString(1);
                ObjectInputStream istream =
                    new ObjectInputStream(rs.getBinaryStream(2));
                Account account = (Account) istream.readObject();
                //Account account = (Account) rs.getObject(2);
                accounts.add(account);
            }
            // copy the results into the array once all rows are read
            accounts_ = new Account[accounts.size()];
            accounts.toArray(accounts_);
        } finally {
            if (rs != null) rs.close();
            if (statement != null) statement.close();
        }
    }


Using Blobs

• Pros
  – Good encapsulation of object properties
• Cons
  – Example still allows for accidental object duplication
  – Slows database performance
    • can segment object into multiple tables and make use of lazy instantiation
  – Serialization brittle in the face of software changes/extended time
    • better use as a cache
    • possible use of XML or other stable marshalling forms
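The blob payload itself can be produced in memory rather than through a temporary file. This is a sketch of that step only, with hypothetical helper names toBlob and fromBlob; with JDBC the resulting array could be bound with PreparedStatement.setBytes and read back with ResultSet.getBytes, which is an assumption about how it would be wired in, not something the slides show.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class BlobBytesDemo {
    static class Account implements Serializable {
        // a fixed serialVersionUID makes incompatible class changes explicit
        private static final long serialVersionUID = 1L;
        final String id;
        final double balance;
        Account(String id, double balance) {
            this.id = id;
            this.balance = balance;
        }
    }

    /** Serializes an object into the bytes that would be stored in the blob column. */
    static byte[] toBlob(Serializable object) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(object);
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Rebuilds the object from bytes read out of the blob column. */
    static Object fromBlob(byte[] blob) {
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(blob))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] blob = toBlob(new Account("1", 100.0));
        Account restored = (Account) fromBlob(blob);
        System.out.println(restored.id + " " + restored.balance); // 1 100.0
    }
}
```

Because each account is serialized separately, two accounts that share an Owner would each carry their own copy, which is the accidental duplication listed under Cons.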


Horizontal Partitioning

• Each concrete class is mapped to a table

OwnerTable (name, taxId)
InterestBearingAccountTable (id, balance, ownerId, rate, termDays)
CheckingAccountTable (id, balance, ownerId, checkFee)


Vertical Partitioning

• Each class is mapped to a table

AccountTable (id, balance, ownerId)
OwnerTable (name, taxId)
InterestBearingAccountTable (id, rate, termDays)
CheckingAccountTable (id, checkFee)


Unification

• Each sub-class is mapped to the same table

AccountTable (id, acctType, balance, ownerId, rate, termDays, checkFee)
OwnerTable (name, taxId)


RDBMS Mapping

• Horizontal Partitioning
  – entire object within one table
  – only one table required to activate object
  – no unnecessary fields in the table
  – must search over multiple tables for common properties
• Vertical Partitioning
  – object spread across different tables
  – must join several tables to activate object
  – no unnecessary fields in each table
  – only need to search over parent tables for common properties
• Unification
  – entire object within one table
  – only one table required to activate object
  – unnecessary fields in the table
  – all sub-types will be located in a search of the common table
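The unification trade-off (one table, some unused columns, a type column to tell sub-types apart) can be sketched without a database by treating a row as a map from column name to value. The acctType values "INTEREST" and "CHECKING" and the helper names toRow and fromRow are assumptions made for this illustration.

```java
import java.util.HashMap;
import java.util.Map;

class UnificationMappingDemo {
    static class Account {
        String id;
        double balance;
        String ownerId;
    }
    static class InterestBearingAccount extends Account {
        double rate;
        int termDays;
    }
    static class CheckingAccount extends Account {
        double checkFee;
    }

    /** Flattens an account into one row of the shared AccountTable. */
    static Map<String, Object> toRow(Account a) {
        Map<String, Object> row = new HashMap<>();
        row.put("id", a.id);
        row.put("balance", a.balance);
        row.put("ownerId", a.ownerId);
        // columns that do not apply to a sub-type stay null
        row.put("rate", null);
        row.put("termDays", null);
        row.put("checkFee", null);
        if (a instanceof InterestBearingAccount) {
            InterestBearingAccount iba = (InterestBearingAccount) a;
            row.put("acctType", "INTEREST");
            row.put("rate", iba.rate);
            row.put("termDays", iba.termDays);
        } else if (a instanceof CheckingAccount) {
            row.put("acctType", "CHECKING");
            row.put("checkFee", ((CheckingAccount) a).checkFee);
        } else {
            throw new IllegalArgumentException("unmapped class " + a.getClass());
        }
        return row;
    }

    /** Activates the right sub-class from a single row, using acctType. */
    static Account fromRow(Map<String, Object> row) {
        Account a;
        String type = (String) row.get("acctType");
        switch (type) {
            case "INTEREST": {
                InterestBearingAccount iba = new InterestBearingAccount();
                iba.rate = (Double) row.get("rate");
                iba.termDays = (Integer) row.get("termDays");
                a = iba;
                break;
            }
            case "CHECKING": {
                CheckingAccount ca = new CheckingAccount();
                ca.checkFee = (Double) row.get("checkFee");
                a = ca;
                break;
            }
            default:
                throw new IllegalArgumentException("unknown acctType " + type);
        }
        a.id = (String) row.get("id");
        a.balance = (Double) row.get("balance");
        a.ownerId = (String) row.get("ownerId");
        return a;
    }

    public static void main(String[] args) {
        CheckingAccount c = new CheckingAccount();
        c.id = "1"; c.balance = 100.0; c.ownerId = "111-11-1111"; c.checkFee = 1.5;
        Map<String, Object> row = toRow(c);
        System.out.println(row.get("rate"));                            // null
        System.out.println(fromRow(row) instanceof CheckingAccount);    // true
    }
}
```

Under horizontal or vertical partitioning the same fromRow step would need to know which table, or which join, the row came from instead of reading a type column.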


Inserting Data Access Objects

Application Object    Data Access Object    Value Object
Account               AccountDAO            AccountValue
Owner                 OwnerDAO              OwnerValue


Roles

• Application Objects
  – Encapsulate the business rules
  – Obtain connections
  – Demarcate transactions
  – Not Serializable
• Value Objects
  – Simply carry values
  – Serializable
• Data Access Objects
  – Encapsulate interaction with information source (database)
  – Designed to work with different Application Objects (e.g., no threads)
  – Not Serializable


Value Object

    package ejava.persistence.dao;

    import java.io.Serializable;

    public class OwnerValue implements Serializable {
        String name_;
        String taxId_;

        public OwnerValue(String name, String taxId) {
            name_ = name;
            taxId_ = taxId;
        }

        // copy constructor
        public OwnerValue(OwnerValue rhs) {
            this(rhs.name_, rhs.taxId_);
        }

        public String getName() { return name_; }
        public void setName(String name) { name_ = name; }
        public String getTaxId() { return taxId_; }
        public void setTaxId(String taxId) { taxId_ = taxId; }

        public String toString() {
            return "name=" + name_ + ", taxId=" + taxId_;
        }
    }


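Data Access Object

    package ejava.persistence.dao;

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;

    public class OwnerDAO {
        OwnerValue values_;

        public void insert(Connection connection, OwnerValue values)
                throws SQLException {
            PreparedStatement statement = null;
            try {
                // bind the values as parameters rather than concatenating
                // unquoted strings into the SQL text
                statement = connection.prepareStatement(
                    "insert into owner (name, taxid) values (?, ?)");
                statement.setString(1, values.getName());
                statement.setString(2, values.getTaxId());
                int rows = statement.executeUpdate();
                if (rows != 1) ...
            } finally {
                if (statement != null) statement.close();
            }
        }
    }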


Application Object

    package ejava.persistence.dao;

    import java.sql.Connection;
    import java.sql.SQLException;

    /** This class represents the business logic for Owners. */
    public class Owner {
        private OwnerValue values_;
        private static OwnerDAO ownerDAO_ = new OwnerDAO();

        public Owner() { }

        // hand out a copy so callers cannot modify internal state
        public OwnerValue getValues() { return new OwnerValue(values_); }
        public void setValues(OwnerValue values) { values_ = values; }

        private Connection getConnection() {…}
        private void closeConnection(Connection connection) {…}


Application Object (cont.)

        public void create(OwnerValue values) throws SQLException {
            values_ = values;
            Connection connection = null;
            try {
                connection = getConnection();
                ownerDAO_.insert(connection, values_);
            } finally {
                closeConnection(connection);
            }
        }


Application Object (cont.)

        public void remove() throws SQLException {
            Connection connection = null;
            try {
                connection = getConnection();
                ownerDAO_.remove(connection, values_);
                values_ = null;
            } finally {
                closeConnection(connection);
            }
        }
    }
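The division of roles can be tried out without a database. The sketch below is an illustration under stated assumptions, not the slides' design: it introduces a hypothetical OwnerStore interface and an in-memory implementation in place of the Connection-based OwnerDAO, so that the application object's create/remove logic can run on its own.

```java
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

class DaoRolesDemo {
    /** Value object: only carries values, and is Serializable. */
    static class OwnerValue implements Serializable {
        private static final long serialVersionUID = 1L;
        private final String name;
        private final String taxId;
        OwnerValue(String name, String taxId) {
            this.name = name;
            this.taxId = taxId;
        }
        String getName() { return name; }
        String getTaxId() { return taxId; }
    }

    /** Hypothetical DAO contract; the slides' OwnerDAO takes a Connection instead. */
    interface OwnerStore {
        void insert(OwnerValue values);
        void remove(OwnerValue values);
    }

    /** In-memory stand-in for the database-backed DAO, keyed by taxId. */
    static class InMemoryOwnerStore implements OwnerStore {
        final Map<String, OwnerValue> rows = new HashMap<>();

        public void insert(OwnerValue values) {
            if (rows.putIfAbsent(values.getTaxId(), values) != null) {
                throw new IllegalStateException("duplicate taxId " + values.getTaxId());
            }
        }
        public void remove(OwnerValue values) {
            rows.remove(values.getTaxId());
        }
    }

    /** Application object: holds the business logic and delegates storage. */
    static class Owner {
        private final OwnerStore store;
        private OwnerValue values;

        Owner(OwnerStore store) { this.store = store; }

        void create(OwnerValue values) {
            store.insert(values);
            this.values = values;
        }
        void remove() {
            if (values == null) return; // nothing was created
            store.remove(values);
            values = null;
        }
        OwnerValue getValues() { return values; }
    }

    public static void main(String[] args) {
        InMemoryOwnerStore store = new InMemoryOwnerStore();
        Owner owner = new Owner(store);
        owner.create(new OwnerValue("bob", "111-11-1111"));
        System.out.println(store.rows.size()); // 1
        owner.remove();
        System.out.println(store.rows.size()); // 0
    }
}
```

Because the application object only sees the store's interface, the same create and remove logic would work unchanged against a JDBC-backed implementation.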