CREST
Genetic Improvement
Justyna Petke
Centre for Research on Evolution, Search and Testing, University College London
Thank you

Mark Harman, Yue Jia, Alexandru Marginean

What does the word "Computer" mean?

Oxford Dictionary: "a person who makes calculations, especially with a calculating machine."

Wikipedia: "The term 'computer', in use from the mid 17th century, meant 'one who computes': a person performing mathematical calculations."

in the beginning ...

The First Computer?

Different people have different opinions.

Who are the programmers?

It's always been people: lots of people.

why people?

Human computers seem quaint today. Will human programmers seem quaint tomorrow?
programming is changing

Requirements

Functional requirements: the functionality of the program.

Non-functional requirements: memory, execution time, battery, size, bandwidth.

Software Design Process

Multiplicity: multiple devices, conflicting objectives, multiple platforms.

Which requirements must be human-coded, and which are essential to humans? Humans have to define the functional requirements; we can optimise the non-functional ones.
Pareto Front

Each circle is a program found by a machine; different non-functional properties have different Pareto program fronts (plotted against failed test cases).

Why can't functional properties be optimisation objectives?

Optimisation

2.5 times faster, but failed 1 test case? Double the battery life, but failed 2 test cases?
Conformant GI vs. Heretical GI

Conformant GI: functional correctness is king. This conforms to traditional views.

Heretical GI: correctness means having sufficient resources for computation. This conforms to a compelling new orthodoxy: there's nothing correct about a flat battery.
can it work?

Software Uniqueness

At 500,000,000 LoC, one has to write approximately 6 statements before one is writing unique code.

M. Gabel and Z. Su. A Study of the Uniqueness of Source Code (FSE 2010): "The space of candidate programs is far smaller than we might suppose."

Software Robustness

After one-line changes, up to 89% of programs that compile run without error.

W. B. Langdon and J. Petke. Software is Not Fragile (CS-DC 2015): "Software engineering artefacts are more robust than is often assumed."
Genetic Improvement for Software Specialisation

Question

Can we improve the efficiency of an already highly-optimised piece of software using genetic programming?

Contributions

Introduction of multi-donor software transplantation

Use of genetic improvement as a means to specialise software
Genetic Improvement
Program Representation
Changes at the level of lines of source code
Each individual is composed of a list of changes
Specialised grammar used to preserve syntax
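To make the representation concrete, here is a minimal sketch in Python (the data layout and names are illustrative, not taken from the actual GI tooling): an individual is an ordered list of line-level edits, applied to a copy of the program's source lines.

# Minimal sketch of the patch-based representation (illustrative only).
# A program is a list of source lines; an individual is an ordered list of
# edits, each a tuple (op, target, source), where op is 'delete', 'copy'
# or 'replace', target indexes the program, and source indexes the code bank.

def apply_patch(lines, patch, code_bank):
    """Apply an individual's list of edits to a copy of the source lines."""
    result = list(lines)
    for op, target, source in patch:
        if op == 'delete':
            result[target] = ''                  # blank the line, keeping indices stable
        elif op == 'replace':
            result[target] = code_bank[source]   # overwrite with a code-bank line
        elif op == 'copy':
            result.insert(target, code_bank[source])  # insert a code-bank line
    return result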
Code Transplants
GP has access to both:
• the host program to be evolved
• the donor program(s)
The code bank contains all lines of source code that GP has access to.
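A sketch of how such a code bank might be assembled, assuming plain-text sources (the function and the file paths are hypothetical):

# Illustrative sketch: the code bank pools every source line that GP may
# copy or transplant, drawn from the host and all donor programs.
def build_code_bank(host_path, donor_paths):
    bank = []
    for path in [host_path] + list(donor_paths):
        with open(path) as f:
            # Keep non-empty lines only; blank lines carry no code to reuse.
            bank.extend(line.rstrip() for line in f if line.strip())
    return bank

# Hypothetical usage with a MiniSAT host and two donors:
# bank = build_code_bank('minisat2/Solver.C',
#                        ['best09/Solver.C', 'bestCIT/Solver.C'])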
Mutation
Addition of one of the following operations:
delete
copy
replace
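A hedged sketch of one mutation step over the illustrative tuple representation above (uniform random choices; this is not the tool's actual code):

import random

# Illustrative mutation: extend an individual (a list of (op, target, source)
# edits) with one new randomly chosen operation.
def mutate(patch, num_lines, bank_size, rng=random):
    op = rng.choice(['delete', 'copy', 'replace'])
    target = rng.randrange(num_lines)                         # program line to modify
    source = rng.randrange(bank_size) if op != 'delete' else None
    return patch + [(op, target, source)]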
Crossover
Concatenation of two individuals, by appending their two lists of mutations.
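Because an individual is just a list of edits, crossover is strikingly simple; an illustrative sketch:

# Illustrative crossover: the offspring concatenates its parents' edit lists,
# so it carries every change made by either parent.
def crossover(parent_a, parent_b):
    return parent_a + parent_b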
Fitness
Based on solution quality and on efficiency in terms of lines of source code executed, which avoids environmental (timing) bias.
Fitness
Test cases are sorted into groups
One test case is sampled uniformly from each group
Avoids overfitting
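An illustrative sketch of the sampling step, assuming the tests have already been partitioned into groups:

import random

# Illustrative: draw one test uniformly from each group per fitness
# evaluation, so no single test dominates and the patch cannot overfit
# to a fixed subset of the suite.
def sample_tests(test_groups, rng=random):
    return [rng.choice(group) for group in test_groups]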
Selection

Fixed number of generations and a fixed population size; the initial population contains single-mutation individuals. In each generation the top half of the population is selected, based on a threshold fitness value, and mutation and crossover are applied.
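A minimal generation loop in this style (an illustrative sketch: it simplifies the threshold selection to keeping the top half, and takes mutate and crossover as callables, e.g. closures over the earlier sketches):

import random

# Illustrative generation loop: rank by fitness, keep the top half, and
# refill the population with mutants and crossover offspring.
# Assumes a population of at least two individuals.
def next_generation(population, fitness, mutate, crossover, rng=random):
    ranked = sorted(population, key=fitness, reverse=True)
    survivors = ranked[:len(ranked) // 2]
    offspring = []
    while len(survivors) + len(offspring) < len(population):
        if rng.random() < 0.5:
            offspring.append(mutate(rng.choice(survivors)))
        else:
            offspring.append(crossover(rng.choice(survivors),
                                       rng.choice(survivors)))
    return survivors + offspring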
Filtering
Mutations in best individuals are often independent
A greedy approach is used to combine the best individuals.
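A sketch of such a greedy combination, assuming a fitness function over whole patches (names are illustrative):

# Illustrative greedy filter: starting from an empty patch, keep each edit
# from the best individuals only if adding it does not lower fitness.
def greedy_combine(best_individuals, patch_fitness):
    combined = []
    for individual in best_individuals:
        for edit in individual:
            candidate = combined + [edit]
            if patch_fitness(candidate) >= patch_fitness(combined):
                combined = candidate
    return combined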
Question
Can we improve the efficiency of an already highly-optimised piece of software using genetic programming?
Motivation for choosing a SAT solver
Boolean satisfiability (SAT) example:
x1 ∨ x2 ∨ ¬x4
¬x2 ∨ ¬x3
• xi : a Boolean variable
• xi, ¬xi : a literal
• ¬x2 ∨ ¬x3 : a clause
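To make the notation concrete, a small sketch that checks an assignment against this formula, using DIMACS-style signed integers for literals (a positive integer i stands for xi, a negative one for ¬xi):

# The example formula (x1 ∨ x2 ∨ ¬x4) ∧ (¬x2 ∨ ¬x3), one set per clause.
formula = [{1, 2, -4}, {-2, -3}]

def satisfies(assignment, clauses):
    """assignment maps variable index -> bool; every clause needs a true literal."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )

# x1=True, x2=False, x3=True, x4=False satisfies both clauses.
print(satisfies({1: True, 2: False, 3: True, 4: False}, formula))  # True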
Motivation for choosing a SAT solver
Bounded Model Checking
Planning
Software Verification
Automatic Test Pattern Generation
Combinational Equivalence Checking
Combinatorial Interaction Testing
and many other applications …
Motivation for choosing a SAT solver
The MiniSAT-hack track in SAT solver competitions is a good source of software transplants.
Question
Can we evolve a version of the MiniSAT solver that is faster than any of the human-improved versions of the solver?
Experiments: Setup

Solver used: MiniSAT2-070721

Test cases used: general benchmarks, for which GI achieved a ∼2.5% improvement (SSBSE'13)
Question
Can we evolve a version of the MiniSAT solver that is faster than any of the human-improved versions of the solver for a particular problem class?
Experiments: Setup

Solver used: MiniSAT2-070721

Test cases used: from the Combinatorial Interaction Testing field
Combinatorial Interaction Testing
Use of SAT solvers is limited due to poor scalability: SAT benchmarks contain millions of clauses, and it takes hours to days to generate a CIT test suite using SAT.
Experiments: Setup
Host program:
MiniSAT2-070721 (478 lines in main algorithm)
Donor programs:
MiniSAT-best09 (winner of ’09 MiniSAT-hack competition)
MiniSAT-bestCIT (best for CIT from ’09 competition)
- total of 104 new lines
Results (Lines and Seconds are relative to the original MiniSAT)

Solver                   Donor   Lines   Seconds
MiniSAT (original)       —       1.00    1.00
MiniSAT-best09           —       1.46    1.76
MiniSAT-bestCIT          —       0.72    0.87
MiniSAT-best09+bestCIT   —       1.26    1.63
Question
How much runtime improvement can we achieve?
Results

MiniSAT-gp (donor: best09): Lines 0.93, Seconds 0.95, relative to the original.
Results
Donor: best09
13 delete, 9 replace, 1 copy
Among changes:
3 assertions removed
1 deletion of a variable used for statistics
Results
Mainly, if and for statements were switched off, and iteration counts in for loops were decreased.
Results

MiniSAT-gp (donor: bestCIT): Lines 0.72, Seconds 0.87, relative to the original.
Results
Donor: bestCIT
1 delete, 1 replace
Among changes:
1 assertion deletion
1 replace operation triggers 95% of donor code
Results

MiniSAT-gp (donor: best09+bestCIT): Lines 0.94, Seconds 0.96, relative to the original.
Results
Donor: best09+bestCIT
50 delete, 20 replace, 5 copy
Among changes:
5 assertions removed
4 semantically equivalent replacements
3 operations used for statistics removed
∼ half of the mutations remove dead code
Results (Lines and Seconds are relative to the original MiniSAT)

Solver                   Donor            Lines   Seconds
MiniSAT (original)       —                1.00    1.00
MiniSAT-best09           —                1.46    1.76
MiniSAT-bestCIT          —                0.72    0.87
MiniSAT-best09+bestCIT   —                1.26    1.63
MiniSAT-gp               best09           0.93    0.95
MiniSAT-gp               bestCIT          0.72    0.87
MiniSAT-gp               best09+bestCIT   0.94    0.96
MiniSAT-gp-combined      best09+bestCIT   0.54    0.83
Results
Combining results:
37 delete, 15 replace, 4 copy
56 out of 100 mutations used
Among changes:
8 assertions removed
95% of the bestCIT donor code executed
Conclusions
Introduced multi-donor software transplantation
Used genetic improvement as a means to specialise software

Achieved a 17% runtime improvement on MiniSAT for the Combinatorial Interaction Testing domain, by combining the best individuals
[Diagram: GP evolves MiniSat program variants; fitness combines a non-functional property measure with test results, guided by sensitivity analysis.]

Justyna Petke, Mark Harman, William B. Langdon and Westley Weimer. Using Genetic Improvement & Code Transplants to Specialise a C++ Program to a Problem Class (EuroGP'14)

Multi-donor transplant, specialised for CIT: 17% faster. Inter-version transplantation across MiniSat v1, v2, …, vn. GECCO Humies silver medal.
Bowtie2

[Diagram: GP evolves Bowtie2 program variants; fitness combines a non-functional property measure with test results, guided by sensitivity analysis.]

W. B. Langdon and M. Harman. Optimising Existing Software with Genetic Programming (TEC 2015)

70 times faster, with 30+ interventions; after hill-climbing clean-up, 7 remained, giving a slight semantic improvement.
CUDA

[Diagram: GP evolves CUDA program variants; fitness combines a non-functional property measure with test results, guided by sensitivity analysis.]

W. B. Langdon and M. Harman. Genetically Improved CUDA C++ Software (EuroGP 2014)

7 times faster, updated for new hardware: automated updating.
Memory/speed trade-offs

[Diagram: GP tunes deep parameters of the system malloc, producing an optimised malloc; fitness combines non-functional property measurements with test results, guided by sensitivity analysis.]

Fan Wu, Westley Weimer, Mark Harman, Yue Jia and Jens Krinke. Deep Parameter Optimisation. Conference on Genetic and Evolutionary Computation (GECCO 2015)

Improve execution time by 12%, or achieve a 21% memory consumption reduction.
Reducing energy consumption

[Diagram: GP evolves MiniSat for three workloads (Ensemble, AProVE, CIT), producing an improved MiniSat for each; fitness combines non-functional property measurements with test results, guided by sensitivity analysis.]

Bobby R. Bruce, Justyna Petke and Mark Harman. Reducing Energy Consumption Using Genetic Improvement. Conference on Genetic and Evolutionary Computation (GECCO 2015)

Energy consumption can be reduced by as much as 25%.
Grow and graft new functionality

[Diagram: GP grows a new feature, guided by tests and human knowledge, then grafts it into the host system.]

Mark Harman, Yue Jia and Bill Langdon. Babel Pidgin: SBSE Can Grow and Graft Entirely New Functionality into a Real World System. Symposium on Search-Based Software Engineering (SSBSE 2014), Challenge Track. Challenge Track Award.
Real-world cross-system transplantation

[Diagram: a feature is extracted from a donor and transplanted into a host, yielding host'.]

Earl T. Barr, Mark Harman, Yue Jia, Alexandru Marginean and Justyna Petke. Automated Software Transplantation (ISSTA 2015)

Successfully autotransplanted new functionality, passing all regression tests, for 12 out of 15 real-world systems.
Automated Software Transplantation. E. T. Barr, M. Harman, Y. Jia, A. Marginean & J. Petke.
ACM Distinguished Paper Award at ISSTA 2015.

[Press coverage; 2647 shares.]

Why autotransplantation? Suppose a video player cannot handle H.264: start from scratch, or check open-source repositories? About 100 players are available, so why not reuse code that already handles H.264?
Human Organ Transplantation
Automated Software Transplantation
[Diagram: the organ, reached from its entry point through the vein V, is extracted from the donor and implanted into the host, guided by the organ test suite.]

Manual work required: the organ entry, the organ's test suite, and the implantation point.
µTrans

Given the donor, the host, an organ entry, an organ test suite, and an implantation point, µTrans proceeds in three stages:

Stage 1: Static Analysis
Stage 2: Genetic Programming
Stage 3: Organ Implantation

The output is the host beneficiary.
Stage 1 — Static Analysis

[Diagram: from the donor, static analysis identifies the organ and its vein from the organ entry (OE); from the host, it builds a dependency graph around the implantation point and a matching table.]

Matching table examples: the statement "x = 10;" requires the declaration "int x;"; the donor variable "int X" may map to any of the host variables A, B or C of type int.
Stage 2 — GP

[Diagram: an individual pairs a variable matching with a statement selection. The matching table maps each donor variable ID to a set of candidate host variable IDs, e.g. V1D → {V1H, V2H} and V2D → {V3H, V4H, V5H}. An individual fixes one mapping per donor variable (M1: V1D → V1H; M2: V2D → V4H) and selects a subset of the organ's statements, e.g. {S1, S7, S73} out of S1 S2 … Sn.]
Algorithm 1: Generate the initial population P; the function choose returns an element from a set uniformly at random.
Input: V, the organ vein; SD, the donor symbol table; OM, the host type-to-variable map; Sp, the population size; v, the statements in an individual; m, the mappings in an individual.

1: P := ∅
2: for i := 1 to Sp do
3:   m, v := ∅, ∅
4:   for all sd ∈ SD do
5:     sh := choose(OM[sd])
6:     m := m ∪ {sd → sh}
7:   v := { choose(V) }
8:   P := P ∪ {(m, v)}
9: return P
… variables used in the donor at LO. For each individual, Alg. 1 first uniformly selects a type-compatible binding from the host's variables in scope at the implantation point to each of the organ's parameters. We then uniformly select one statement from the organ, including its vein, and add it to the individual. The GP system records which statements have been selected and favours statements that have not yet been selected.
Search Operators: During GP, µTrans applies crossover with a probability of 0.5. We define two custom crossover operators: fixed-two-points and uniform crossover. The fixed-two-points crossover is the standard fixed-point operator applied separately to the organ's map from host variables to organ parameters and to the statement vector, restricted to each vector's centre point. The uniform crossover operator produces only one offspring, whose host-to-organ map is the crossover of its parents' maps and whose vein and organ statements are the union of its parents'. The use of union here is novel: initially we used conventional fixed-point crossover on the organ and vein statement vectors, but convergence was too slow; adopting union sped convergence, as desired. (Fig. 4 in the paper shows an example of applying the crossover operators.)

After crossover, one of the two mutation operators is applied with a probability of 0.5. The first operator uniformly replaces a binding in the organ's map from host variables to its parameters with a type-compatible alternative. In our running example, say an individual currently maps the host variable curs to its N_init parameter. Since curs is not a valid array length in idct, the individual fails the organ test suite. Say the remap operator chooses to remap N_init. Since its type is int, the remap operator selects a new variable from among the int variables in scope at the insertion point, which include hit_eof, curs, tos, length, and so on. The correct mapping is N_init to length; if the remap operator selects it, the resulting individual will be more fit.
The second operator mutates the statements of the organ. First, it uniformly picks t, an offset into the organ's statement list. When adding or replacing, it also uniformly selects an index into the over-organ's statement array. To add, it inserts the selected statement at t in the organ's statement list; to replace, it overwrites the statement at t with the selected statement. In essence, the over-organ defines a large set of addition and replacement operations, one for each unique statement, weighted by the frequency of that statement's appearance in the over-organ. (Fig. 5 in the paper shows an example of applying µTrans's mutation operators.)
At each generation, we select the top 10% most fit individuals (i.e. elitism) and insert them into the new generation. We use tournament selection to select 60% of the population for reproduction. Parents must be compilable; if the proportion of possible parents is less than 60% of the population, Alg. 1 generates new individuals. At the end of evolution, an organ that passes all the tests is selected uniformly at random and inserted into the host at the implantation point Hl.
Fitness Function: Let IC be the set of individuals that can be compiled, let T be the set of unit tests used in GP, and let TXi and TPi be the sets of non-crashed tests and passed tests for individual i, respectively. Our fitness function is:
fitness(i) = (1/3) · (1 + |TXi|/|T| + |TPi|/|T|)   if i ∈ IC
fitness(i) = 0                                     if i ∉ IC      (1)
For the autotransplantation goal, a viable candidate must, at a minimum, compile. At the other extreme, a successful candidate passes all of T_D^O, the developer-provided test suite that defines the functionality we seek to transplant. These poles form a continuum: in between fall those individuals that execute tests to termination, even if they fail them. Our fitness function therefore contains three equally-weighted components. The first checks whether the individual compiles; the second rewards an individual for executing test cases to termination without crashing; and the last rewards an individual for passing tests in T_D^O.
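A direct transcription of Eq. 1 into Python (a sketch; in µScalpel the three components come from actually compiling the individual and running the organ test suite):

# Sketch of Eq. 1: equal credit for compiling, for running tests to
# termination without crashing (TX), and for passing tests (TP).
def fitness(compiles, num_tests, num_noncrashing, num_passing):
    if not compiles:                 # i not in IC: score 0
        return 0.0
    return (1.0 / 3.0) * (1.0
                          + num_noncrashing / num_tests
                          + num_passing / num_tests)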
Implementation: Implemented in TXL and C, µScalpel realizes µTrans and comprises 28k SLoC, of which 16k is TXL [17] and 12k is C code. µScalpel inherits the limitations of TXL, such as its stack limit, which precludes parsing large programs, and its default C grammar's inability to properly handle preprocessor directives. As an optimisation, we inline all the function calls in the organ; inlining eases slice reduction, eliminating unneeded parameters, returns and aliasing. For the construction of the call graphs we use GNU cflow, and inherit its limitations related to function pointers.
5. EMPIRICAL STUDY

This section explains the subjects, test suites, and research questions we address in our empirical evaluation of automated code transplantation as realized in our tool, µScalpel.

Subjects: We transplant code from five donors into three hosts. We used the following criteria to choose these programs. First, they had to be written in C, because µScalpel currently operates only on C programs. Second, they had to be popular real-world programs people use. Third, they had to be diverse. Fourth, the host is the system we seek to augment, so it had to be large and complex, to present a significant transplantation challenge; fifth, the organ we transplant could come from anywhere, so donors had to reflect a wide range of sizes. To meet these constraints, we perused GitHub, SourceForge, and GNU Savannah in August 2014, restricting our attention to popular C projects in different application domains.

Presented in Tab. 1, our donors include the audio streaming client IDCT, the simple archive utility MYTAR, GNU Cflow (which extracts call graphs from C source code), Webserver (which handles HTTP requests; https://github.com/Hepia/webserver), the command line encryption utility TuxCrypt, and the H.264 codec x264. Our hosts include Pidgin, GNU Cflow (which we use as both a donor and a host), SOX, a cross-platform command line audio processing utility, and VLC, a media player. We use x264 and VLC in our case study in Sec. 7; we use the rest in Sec. 6. These programs are diverse: their application domains span chat, static analysis, sound processing, audio streaming, archiving, encryption, and a web server. The donors vary in size from 0.4–63k SLoC and the hosts are large.
Weak proxies: Does it compile? Does it execute test cases without crashing?

Strong proxy: Does it produce the correct output?
Research Questions

Do we break the initial functionality? (regression tests)
Have we really added new functionality? (acceptance tests)
What about the computational effort? Is autotransplantation useful?
Empirical Study
15 transplantations · 300 runs · 5 donors · 3 hosts
Case Study:
H.264 Encoding Transplantation
Validation

The host beneficiary is validated against: regression tests, augmented regression tests, donor acceptance tests, acceptance tests, and manual validation.
Subjects
Minimal size: 0.4 KLOC · Maximal size: 422 KLOC · Average donor: 16 KLOC · Average host: 213 KLOC

Subject     Type   Size (KLOC)
Idct        Donor  2.3
Mytar       Donor  0.4
Cflow       Donor  25
Webserver   Donor  1.7
TuxCrypt    Donor  2.7
Pidgin      Host   363
Cflow       Host   25
SoX         Host   43

Case study:
x264        Donor  63
VLC         Host   422
Experimental Methodology and Setup
[Diagram: µSCALPEL takes the donor (with its organ entry OE and organ test suite), the host, and an implantation point, and produces the host beneficiary with the organ implanted at that point.]

Environment: 64-bit Ubuntu 14.10, 16 GB RAM, 8 threads.
Empirical Study: validation results (runs out of 20 per donor/host pair)

Donor   Host    All Passed  Regression  Regression++  Acceptance
Idct    Pidgin  16          20          17            16
Mytar   Pidgin  16          20          18            20
Web     Pidgin  0           20          0             18
Cflow   Pidgin  15          20          15            16
Tux     Pidgin  15          20          17            16
Idct    Cflow   16          17          16            16
Mytar   Cflow   17          17          17            20
Web     Cflow   0           0           0             17
Cflow   Cflow   20          20          20            20
Tux     Cflow   14          15          14            16
Idct    SoX     15          18          17            16
Mytar   SoX     17          17          17            20
Web     SoX     0           0           0             17
Cflow   SoX     14          16          15            14
Tux     SoX     13          13          13            14
TOTAL           188/300     233/300     196/300       256/300
In 12 out of 15 experiments we successfully autotransplanted new functionality.
Empirical Study: execution time (minutes)

Donor   Host    Average  Std. Dev.  Total
Idct    Pidgin  5        7          97
Mytar   Pidgin  3        1          65
Web     Pidgin  8        5          160
Cflow   Pidgin  58       16         1151
Tux     Pidgin  29       10         574
Idct    Cflow   3        5          59
Mytar   Cflow   3        1          53
Web     Cflow   5        2          102
Cflow   Cflow   44       9          872
Tux     Cflow   31       11         623
Idct    SoX     12       17         233
Mytar   SoX     3        1          60
Web     SoX     7        3          132
Cflow   SoX     89       53         74
Tux     SoX     34       13         94
Summary: on average 334 minutes per donor/host pair, and roughly 72 hours in total.
Case Study: Transplant Time & Test Suites

Transplant   Time (hours)   Regression   Regression++   Acceptance
H.264        26             100%         100%           100%
Donor organ: H.264 encoding from x264; host: VLC.
Within 26 hours, µScalpel performed a task that took developers an average of 20 days of elapsed time.
µScalpel is available at: http://crest.cs.ucl.ac.uk/autotransplantation/MuScalpel.html
GI Applications
Bug fixing
GenProg: http://dijkstra.cs.virginia.edu/genprog/

Kali, SPR, ClearView (from MIT): http://people.csail.mit.edu/fanl/

and others …

Claire Le Goues, Stephanie Forrest, Westley Weimer: Current challenges in automatic software repair. Software Quality Journal 21(3): 421-443 (2013)
GI Applications
Bug fixing
Improving energy consumption
Porting old code to new hardware
Grafting new functionality into an existing system
Specialising software for a particular problem class
Other
What if fitness is expensive to compute?

GI4GI: Improving Genetic Improvement Fitness Functions. Mark Harman & Justyna Petke (Genetic Improvement Workshop 2015)
GI4GI: Energy Optimisation Example
Many factors affect energy consumption, including:
screen behaviour
memory access
device communications
CPU utilisation
GI4GI: Energy Optimisation Example
A hardware-dependent linear energy model for GI: Schulte et al., Post-Compiler Software Optimization for Reducing Energy (ASPLOS'14).
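The general shape of such a model is a weighted sum of profiled event counts; here is a sketch with made-up coefficients and counter names (none of these values come from Schulte et al.):

# Illustrative linear energy model: estimated energy is a weighted sum of
# profiled event counts, with hardware-dependent coefficients (joules/event).
COEFFICIENTS = {                 # hypothetical values for one device
    'cpu_cycles':    1.2e-9,
    'memory_access': 4.7e-9,
    'io_bytes':      2.0e-8,
}

def estimated_energy(counts):
    """counts: event name -> observed count for one program run."""
    return sum(COEFFICIENTS[event] * n for event, n in counts.items())

# A cheap model like this can stand in for hardware energy measurement as a
# GI fitness function, at the price of being hardware-dependent.
print(estimated_energy({'cpu_cycles': 1e9, 'memory_access': 1e8, 'io_bytes': 1e6}))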
GI4GI: Energy Optimisation Example
Idea:
Use GI to evolve a fitness function f for energy consumption.
Use f to improve energy consumption of software.
GI Growth
Growing area: the 1st International Genetic Improvement Workshop at GECCO 2015, Madrid, Spain. www.geneticimprovementofsoftware.com
Special Issue on GI · Special Session on GI at WCCI 2016 (http://www.wcci2016.org/)
Conclusions

Functional requirements: humans have to define these.
Non-functional requirements: we can optimise these.
Summary
Search-Based Optimisation + Software Engineering = SBSE
Genetic Improvement
Combinatorial Interaction Testing
Research Opportunities
Contact me if you want to visit CREST:
j.petke at ucl.ac.uk
Centre for Research on Evolution, Search and Testing, University College London
[Map: CREST is within a 20-minute walk of central London landmarks, including the National Gallery, Nelson's Column, Eros, the Royal Courts of Justice, St Paul's, Tate Modern, the Globe Theatre, Covent Garden Market, Westminster Abbey, the Houses of Parliament, the London Eye, the British Museum, Madame Tussaud's, the Sherlock Holmes Museum, Marble Arch, and the Natural History Museum.]
COWs: CREST Open Workshops

Roughly one per month; discussion-based; recorded and archived at http://crest.cs.ucl.ac.uk/cow/
COW statistics (last updated 4 November 2015): 1512 total registrations, 667 unique attendees, 244 unique institutions, 43 countries, 421 talks.
45th COW: Genetic Improvement, 25-26 January 2016. http://crest.cs.ucl.ac.uk/cow/
DAASE: Dynamic Adaptive Search-Based Software Engineering

An EPSRC grant spanning Stirling, Birmingham, York and UCL.
Summary
Search-Based Optimisation + Software Engineering = SBSE
Genetic Improvement
Combinatorial Interaction Testing
COWs · Visitor Scheme · Open positions
Pictures used with thanks from these sources
Pickering's Harem: [Public domain], via Wikimedia Commons
IBM 026 Card Punch: By Ben Franske (Own work) [GFDL (http://www.gnu.org/copyleft/fdl.html) or CC-BY-SA-3.0-2.5-2.0-1.0 (http://creativecommons.org/licenses/by-sa/3.0)], via Wikimedia Commons
BBC_Micro: [Public domain], via Wikimedia Commons
Programmer: Bundesarchiv, B 145 Bild-F031434-0006 / Gathmann, Jens / CC-BY-SA [CC-BY-SA-3.0-de (http://creativecommons.org/licenses/by-sa/3.0/de/deed.en)], via Wikimedia Commons
IBM PC: By Boffy B (Own work) [GFDL (http://www.gnu.org/copyleft/fdl.html) or CC-BY-SA-3.0-2.5-2.0-1.0 (http://creativecommons.org/licenses/by-sa/3.0)], via Wikimedia Commons
IMac: By Matthieu Riegler, Wikimedia Commons [CC-BY-3.0 (http://creativecommons.org/licenses/by/3.0)], via Wikimedia Commons
Ada Lovelace: By Alfred Edward Chalon [Public domain], via Wikimedia Commons
Stonehenge: By Yuanyuan Zhang [All rights reserved] via Flickr
Bath Abbey: By Yuanyuan Zhang [All rights reserved] via Flickr