Upload
queens-university-blackberry
View
64
Download
0
Embed Size (px)
Citation preview
Improving the Quality of Large-Scale Database-Centric Software Systems by
Analyzing Database Access Code
1
Tse-Hsun(Peter) Chen
Supervised by Ahmed E. Hassan
Inefficient data access
3
for each userId{…executeQuery(“select … where
u.id = userId”);}
Code
SQL select … from user where u.id = 1 select … from user where u.id = 2…
Common performance problems in database-access code
4
DBMSCode
• Inefficient data access• Unneeded data access
Unneeded data access
5
DBMSCode
Requireuser data in
the code
Actual request sent
Team TableUser Tablejoin
Common performance problems in database-access code
6
DBMSCode
• Inefficient data access• Unneeded data access• Overly-strict isolation level
Overly-strict isolation level
7
DBMSCode
Reading onlyuser name
set read-write transactionselect … from user
Cannot find problems by only looking at the DBMS side
8
DBMS
select … from user where u.id = 1 select … from user where u.id = 2…for each userId{
…executeQuery(“
select … where u.id = userId”);}
Database accesses are abstracted
9
DBMSCode
AbstractionLayers
Problems become more complex and frequent after adding abstraction layers
Accessing data using ORM incorrectly
10
@EntityClass User{
…}
for each userId{User u = findUserById(userId);u.getName();
}
A database entity class
Objects
SQL
select … from user where u.id = 1 select … from user where u.id = 2…
Research statement
There are common problems in how developers write database-access code, and these problems are hard to diagnose by only looking at the DBMS. In addition, due to the use of different database abstraction layers, these problems become even harder to find.
By finding problems in the database access code and configuring the abstraction layers correctly, we can significantly improve the performance of database-centric systems.
11
12
Detecting inefficientdata access code
Overview of the thesis
Detecting unneededdata access
Finding overly-strict isolation level
Finished work Future workUnder
submission
Detecting unneededdata access
13
Detecting inefficientdata access code
Overview of the thesis
Finding overly-strict isolation level
Finished work Future workUnder
submission
Inefficient data access detection framework
Inefficient data access detection and ranking framework
Ranked according to performance impact
Ranked inefficient
data access code
Source Code
detection
ranking
14
Inefficient data access detection framework
Inefficient data access detection and ranking framework
Ranked according to performance impact
Ranked inefficient
data access code
Source Code
detection
ranking
15
Focus on one type of inefficient data access: one-by-one processing
Detecting one-by-one processingusing static analysis
16
First find all the functions that read/write from/to DBMS
Class User{getUserById()…getUserAddress()…
}
Identify the positions of all loopsfor each userId{
foo(userId)}
Check if the loop contains any database-accessing function
foo (userId){getUserById(userId)
}
Inefficient data access detection framework
Inefficient data access detection and ranking framework
Ranked according to performance impact
Ranked inefficient
data access code
Source Code
detection
ranking
17
Inefficient data access detection framework
Inefficient data access detection and ranking framework
Ranked according to performance impact
Ranked inefficient
data access code
Source Code
detection
ranking
18
Assessing inefficient data access impact by fixing the problem
Execution
Response timefor u in users{
update u}
Code with inefficient data access
users.updateAll()
Code withoutinefficient data access
19
Execute test suite 30 times
Response time after the fix
Avg. % improvement
Execution
Execute test suite 30 times
Inefficient data access causes large performance impacts
0%
20%
40%
60%
80%
100%
One-by-one processing20
% im
pro
vem
en
t in
res
po
nse
tim
e
Detecting unneededdata access
21
Detecting inefficientdata access code
Overview of the thesis
Finding overly-strict isolation level
Finished work Future workUnder
submission
22
Detecting inefficientdata access code
Overview of the thesis
Finding overly-strict isolation level
Finished work Future workUnder
submission
Detecting unneededdata access
Intercepting system executions using byte code instrumentation
ORMDBMS
23
Intercept called functions in the code
Intercept ORM generated SQLs
Needed data
Requested data
diff
Mapping data access to called functions
24
@Entity@Table(name = “user”)public class User{
…
@Column(name=“name”)String userName;
@OneToManyList<Team> teams;
public String getUserName(){return userName;
}
public void addTeam(Team t){teams.add(t);
}
User.java
We apply static analysis to find which database column a function is reading/modifying
24
Identifying unneeded data access from intercepted data
Only user name is needed in the application logic
Needed user name in code
Requested id, name, address, phone number
25
26
Detecting inefficientdata access code
Overview of the thesis
Finding overly-strict isolation level
Finished work Future workUnder
submission
Detecting unneededdata access
Detecting unneededdata access
27
Detecting inefficientdata access code
Overview of the thesis
Finding overly-strict isolation level
Finished work Future workUnder
submission
A real life example of transaction abstraction problem
28
All functions in foo will be executed in transactions
@Transactionalpublic class foo{
int getFooVal(){return foo.val;
}…
}
This function does not need to be executed in transactions
Where to put the transaction may affect system behavior and performance
Plans for finding transaction problems related abstraction
29
Bug Reports
Categories of transaction-related bugs
We plan to empirically study bug reports to find root causes and implement a
detection framework