Component Mining
Mahdi Cheraghchi-Bashi-Astaneh ([email protected])

Outline
- What is a component?
- Software reuse
- What is component retrieval?
- Pros and cons of reuse
- How to retrieve?
- Evaluation

What is a component?
- A part of the whole.
- "A piece of software small enough to create and maintain, big enough to deploy and support, and with standard interfaces for interoperability" - Jed Harris, President, CI Labs.
- Self-contained binary pieces of software, but not complete applications.
- Can be combined with other components to produce complete applications, regardless of the languages the components are implemented in or the platforms they run on.
- Object-oriented methods are often used for component development and reuse.

Some Examples in Practice
- Borland Delphi
- Borland C++ Builder
- Borland Kylix
- OLE / COM / ActiveX
- JavaBeans
- CORBA

Software Reuse
- Software reuse is the process of creating software systems from existing software rather than building software systems from scratch. [Krueger, 1992]
- Levels of software reuse: source code, algorithms, architectures, domain models, design, program transformations, documentation, … every possible aspect of a software system.

What is Component Retrieval?
- The mere existence of a component library does not automatically entail its reuse.
- "Component Mining" is the deliberate, organized and automated process of extracting reusable components from an existing rich software base.
- Re-users need support in identifying components that suit their needs. This task is the topic of software component retrieval.
- The goal is to develop reusable, adaptable software components rather than large, monolithic applications.

Types of Reuse
- Black-Box Reuse: a client may reuse the retrieved components "as is."
- Component-adaptive Grey-Box Reuse: a client may reuse the retrieved components without meeting any additional conditions, but only after interface-level modifications of the components.
- White-Box Reuse: arbitrary additions and modifications are required.
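
As a rough illustration of the first two categories, the sketch below reuses a component first as is (black-box) and then through an interface-level adapter (grey-box). `LegacySorter`, `SorterAdapter` and the `order` keyword are hypothetical names invented for this example, and wrapping is only one possible reading of "interface-level modification".

```python
class LegacySorter:
    """Hypothetical reusable component; its internals are left untouched."""

    def sort_items(self, items, descending=False):
        return sorted(items, reverse=descending)


# Black-box reuse: the client calls the component exactly as delivered.
black_box_result = LegacySorter().sort_items([3, 1, 2])


class SorterAdapter:
    """Grey-box reuse: only the interface is adapted, not the internals.

    The (hypothetical) client expects a callable with an `order` keyword.
    """

    def __init__(self, component):
        self._component = component

    def __call__(self, items, order="asc"):
        # Translate the client's keyword into the component's own parameter.
        return self._component.sort_items(items, descending=(order == "desc"))


grey_box_result = SorterAdapter(LegacySorter())([3, 1, 2], order="desc")
```

White-box reuse would correspond to editing the body of `sort_items` itself, which this sketch deliberately avoids.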

Pros and Cons of Reuse
Advantages:
1. Reduces time and cost spent on programming.
2. Increases programmers' productivity.
3. Increases program quality and reliability.
4. Enables expertise sharing.
Problems:
1. It is hard to find things, especially at large scale.
2. Typically, components are not (easily) modifiable.
3. It is hard to manage a large pool of components.
4. Reuse is only worthwhile if it is easier to locate and modify a reusable component than to write it from scratch.

How to Retrieve?
- Component retrieval is in fact a form of information retrieval. Despite this, "dedicated" component retrieval algorithms are being developed, since software is more than ordinary text.
- Component retrieval is a complex and heuristic process.
- It typically needs a well-structured repository of components.
- Methods of retrieval:
  1. Algorithms based on the meta-data accompanying software components.
  2. Algorithms based on the structure of the components.
- Exact retrieval versus approximate retrieval

Retrieval by Meta-Data
- By meta-data we mean the documentation accompanying the component.
- This method relies on the existence and quality of the documentation and needs some preprocessing.
- How to find?
  1. Using full-text search on documents and program files: no cost, but inaccurate.
  2. By classification of the components, either automatically or manually (depending on the cost and accuracy we need).
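
A minimal sketch of the full-text option, assuming each component ships with a short documentation string. The `DOCS` catalogue, the component names and the scoring rule (count of query-word occurrences) are all assumptions made for illustration, not a method taken from the slides.

```python
import re
from collections import Counter

# Hypothetical component documentation; a real repository would load this from files.
DOCS = {
    "QuickSorter": "Sorts a list of items in place using quicksort.",
    "CsvReader": "Reads rows from a CSV file and returns a list of records.",
    "HttpClient": "Sends HTTP requests and returns the response body.",
}


def tokenize(text):
    """Lower-case the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def full_text_search(query, docs):
    """Rank components by how often the query words occur in their documentation."""
    query_words = set(tokenize(query))
    scores = {}
    for name, text in docs.items():
        counts = Counter(tokenize(text))
        score = sum(counts[word] for word in query_words)
        if score > 0:
            scores[name] = score
    # Highest score first; ties are broken alphabetically.
    return sorted(scores, key=lambda name: (-scores[name], name))


hits = full_text_search("sort a list", DOCS)
```

Here `hits` ranks `CsvReader` above `QuickSorter`, because "sort" does not match "sorts" and the filler word "a" is counted. That is the kind of inaccuracy the slide warns about; stemming and stop-word removal are the sort of preprocessing that would reduce it.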

Retrieval by Structure
- Depends on the availability of the structure in some form (source code, interface, etc.).
- Depends on the availability of computer language processors.
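
One way to picture structure-based retrieval is matching on function signatures. The sketch below uses Python's standard `inspect` module as the "language processor". The three library functions and the two matching rules (exact, and a relaxed variant that ignores parameter order) are assumptions chosen for illustration.

```python
import inspect


# Hypothetical component library: plain functions with type annotations.
def join_words(words: list, sep: str) -> str:
    return sep.join(words)


def repeat(text: str, times: int) -> str:
    return text * times


def count_words(text: str) -> int:
    return len(text.split())


LIBRARY = [join_words, repeat, count_words]


def signature_of(func):
    """Extract (parameter types, return type) from the function's annotations."""
    sig = inspect.signature(func)
    params = tuple(p.annotation for p in sig.parameters.values())
    return params, sig.return_annotation


def retrieve_exact(param_types, return_type, library):
    """Exact retrieval: parameter types must match in the same order."""
    wanted = (tuple(param_types), return_type)
    return [f.__name__ for f in library if signature_of(f) == wanted]


def retrieve_relaxed(param_types, return_type, library):
    """Approximate retrieval: parameter order is ignored."""
    wanted = (sorted(t.__name__ for t in param_types), return_type)
    matches = []
    for f in library:
        params, ret = signature_of(f)
        if (sorted(t.__name__ for t in params), ret) == wanted:
            matches.append(f.__name__)
    return matches


exact = retrieve_exact([str, list], str, LIBRARY)
relaxed = retrieve_relaxed([str, list], str, LIBRARY)
```

The query `(str, list) -> str` has no exact match, while the relaxed rule finds `join_words`, whose parameters are the same types in a different order.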

Some Other Methods
- Formal component specification
  1. Domain theories: algebraic model, signatures, etc.
  2. Interface specifications
  3. Interface matching (automated theorem proving, etc.)
- Semantic classification
- Feature-based methods (What possible features can a component have?)
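
To make the feature-based idea concrete, the sketch below describes each component by a set of features and ranks components by Jaccard similarity to the requested feature set. The feature vocabulary, the catalogue and the choice of Jaccard similarity are all assumptions for this example; the slides do not prescribe a particular measure.

```python
# Hypothetical feature descriptions; the feature vocabulary is invented.
FEATURES = {
    "QuickSorter": {"sorting", "in-place", "comparison-based"},
    "MergeSorter": {"sorting", "stable", "comparison-based"},
    "CsvReader": {"file-io", "parsing"},
}


def jaccard(a, b):
    """Size of the intersection divided by size of the union of two sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_by_features(wanted, catalog):
    """Return component names with non-zero similarity, best match first."""
    scored = [(jaccard(wanted, feats), name) for name, feats in catalog.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for score, name in scored if score > 0]


ranking = rank_by_features({"sorting", "stable"}, FEATURES)
```

`MergeSorter` shares both requested features and ranks first, `QuickSorter` shares one and follows, and `CsvReader` is dropped.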

Some Other Methods
- Deduction-Based Component Retrieval
  - It is the only method that retrieves only proven matches.
  - Suitable for the development of high-reliability or safety-critical applications, e.g. spacecraft control systems.

Searching and Browsing
- Searching: Software developers formulate a query, and the repository system returns components that match the query.
  - Problem: Formulating an effective query is a challenging task.
- Browsing: Developers determine the relevance of the components currently being displayed in terms of their development task, and traverse the associated links.
  - It is an incremental task, and is usually preferred.
  - Problem: Software developers may be puzzled.
- Context-Aware Browsing: Infers developers' tasks by monitoring their interactions with the environment.
  - Similar to browsing, but results in a significantly smaller browsing space.
  - Uses learning methods to refine itself.
  - Problem: It is difficult to "understand" the content.

The Reuse Environment
- A component database.
- A library management system providing access to the database.
- A software component retrieval system (e.g. an ORB) that enables client applications to retrieve components from the library server.
- CBSE tools that support the integration of reused components into a new design.

Evaluation Measures
- Recall = ratio of the number of relevant components retrieved to the total number of relevant components in the repository
- Precision = ratio of the number of relevant components retrieved to the total number of components retrieved
- Response time
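
The two ratios can be computed directly from the set of retrieved components and the set of relevant ones. The sketch below is a straightforward reading of the definitions above; the component names in the usage example are made up.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = |retrieved AND relevant| / |retrieved|
    recall    = |retrieved AND relevant| / |relevant|
    An empty denominator yields 0.0 by convention in this sketch.
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query result: 4 components retrieved, 3 relevant in the repository.
precision, recall = precision_recall(
    retrieved=["QuickSorter", "CsvReader", "HeapSorter", "HttpClient"],
    relevant=["QuickSorter", "HeapSorter", "MergeSorter"],
)
```

Two of the four retrieved components are relevant, so precision is 0.5. Two of the three relevant components were found, so recall is 2/3.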

Summary and Conclusion
- Software reuse is a crucial concern in today's world of complex software products.
- The component-based development model plays an important role in software reuse.
- The component-based model is useful only when a satisfactory means of retrieval is available.
- No definite answer has yet been developed for describing components in unambiguous, classifiable terms.
- Component retrieval is a difficult problem, and more work is needed to find an efficient solution.

References
- D. Spinellis, K. Raptis, Component Mining: a process and its pattern language, Information and Software Technology 42 (2000) pp 609-617
- Hafedh Mili et al, An experiment in software component retrieval, Information and Software Technology 45 (2003) pp 633-649
- K. McArthur et al, An evaluation of the impact of component-based architectures on software reusability, Information and Software Technology 44 (2002) pp 351-359
- P.A.V. Hall, Architecture-driven component reuse, Information and Software Technology 41 (1999) pp 963-968
- I. Crnkovic, M. Larsson, Challenges of component-based development, The Journal of Systems and Software 61 (2002) pp 201-212
- Y. Ye, G. Fischer, Context-Aware Browsing of Large Component Repositories, IEEE 16th International Conference on Automated Software Engineering, 2001
- A. M. Zaremski, J. M. Wing, Signature Matching, A Key to Reuse
- B. Fischer, Deduction-Based Software Component Retrieval (Thesis)