
Expert Systems With Applications, Vol. 6, pp. 87-95, 1993. 0957-4174/93 $5.00 + .00. Printed in the USA. © 1993 Pergamon Press Ltd.

Case-Based Learning for Knowledge-Based Optimization Modeling System: UNIK-CASE

JAE KYU LEE AND MIN YONG KIM

Korea Advanced Institute of Science and Technology, Cheongryang, Seoul, Korea

Abstract--This article explores how previously built optimization models can be used as a medium for automatic learning. To explain the learning process, we describe the target representation of the common modeling knowledge base in UNIK-OPT and the representations of specific optimization model cases. Both are represented in frames. The common modeling knowledge is represented by the potential linkability between attributes and indices, attributes and blocks of terms, and blocks of terms and constraints. There are also no fixed indices on blocks of terms and constraint sets. Therefore, the learning process includes the addition of new frames, generalization of linkability, generalization of an attribute's role (constant or variable), and context identification in terms of period, time units, usage, perspectives, and so forth. To realize the learning process, UNIK-CASE is under development as a front end of the modeling system UNIK-OPT.

1. INTRODUCTION

THIS ARTICLE EXPLORES how previously built optimization models can provide knowledge to the knowledge base, which is shared in aiding the formulation processes of multiple optimization models. For this purpose, a case-based learner, UNIK-CASE (Lee & Kim, 1990), is developed as a front-end learning medium for the knowledge-based optimization model formulator UNIK-OPT (Lee & Kim, 1992). Previously built models may be used for two purposes: case-based reasoning and case-based learning. In case-based reasoning, the previous models are used directly to build a similar model, as depicted in Figure 1(a). This approach to model formulation has been suggested by several researchers. Vellore, Sen, and Vinze (1990) proposed a case-based planning approach; Liang (1989) and Liang and Konsynski (1990) discussed issues related to modeling by analogy; and Binbasioglu (1990) investigated the process-based analogy approach.

On the other hand, in the case-based learning framework the cases are not used directly. Instead, the relevant information from the cases is integrated into the knowledge base via the addition of new knowledge, generalization, and inconsistency resolution, as depicted in Figure 1(b). This article pursues the case-based learning framework. In practice, the two approaches may be adopted together, because they complement each other.

Requests for reprints should be sent to Jae Kyu Lee, Department of Management Science, Korea Advanced Institute of Science and Technology, 207-43, Cheongryangri-dong, Dongdaemun-gu, Seoul 130-012, Korea.


For literature on the concepts and techniques of case-based reasoning, refer to Kolodner, Simpson, and Sycara (1985), Kolodner (1991), and Riesbeck and Schank (1989). For a review of the applications of case-based reasoning in various domains, see Slade (1991). An interesting application of case-based learning to computer programming can also be found in Williams (1988).

Because the target knowledge base in UNIK-OPT is tailored to optimization modeling support, we need to contrast the case-based learning process of UNIK-CASE with the formulation reasoning of UNIK-OPT. This issue is described in Section 2. Section 3 shows the representation of specific linear programming models, while Section 4 illustrates the representation of a common modeling knowledge base. Section 5 presents three learning frameworks with examples: addition of new knowledge, generalization, and context identification. Section 6 describes the development of a prototype and its application to a refinery.

2. CONTRAST OF CASE-BASED LEARNING AND FORMULATION REASONING PROCESS

2.1. Relationships

To understand the role of UNIK-CASE, we need to understand the role of UNIK-OPT and their mutual relationships. This relationship is depicted in Figure 2.

Note that case-based learning is not the only channel of knowledge acquisition. Domain knowledge and modeling structure knowledge that are not included in the cases may also be learned via a conventional knowledge acquisition system. This kind of knowledge acquisition system is particularly necessary when we do not have a sufficient number of cases. The output of UNIK-CASE is the input knowledge to UNIK-OPT. Because the output from UNIK-OPT has the same syntactic structure as the input of UNIK-CASE, the specific linear programming (LP) model cases generated by other companies in the same industry may be used for learning. More reliable cases may be generated from previously built models in other modeling packages by transforming the syntax to that of UNIK-CASE. To help explain UNIK-CASE, we will describe UNIK-OPT first. At this stage of research, we limit the scope of optimization to linear programming.

FIGURE 1. Case-based reasoning vs. case-based learning: (a) case-based reasoning; (b) case-based learning.

2.2. UNIK-OPT

To describe the rationale of the knowledge representation that is adopted in UNIK-OPT, the architecture and formulation reasoning process of UNIK-OPT should be reviewed. The key purposes of UNIK-OPT are as follows:

1. Specify the LP model formulations as a subset of the knowledge base and a model builder's problem definition.

2. Achieve model independence from the knowledge base and data base. By achieving the model independence, multiple LP models can maintain their consistency independently of changes made in the knowledge base.

3. Make the tool UNIK-OPT domain independent. Let the stored knowledge determine the application domain.

To fulfill these objectives, UNIK-OPT uses three views (optionally four views) of model representation as depicted in Figure 3: semantic view, notational view (modeling language view optionally), and tabular view.

The semantic view represents the LP model in frames and includes the semantic information about variables, coefficients, indices, terms, constraints, and specific formulated models. The semantic view of the model can be syntactically transformed into the notational view, the view usually used in Operations Research (OR) textbooks. The notational view can be further classified into aggregate and individual equational forms. In a sense, the notational view is a kind of canonical representation of most modeling languages such as GAMS (Brooke, Kendrick, & Meeraus, 1988), PLATFORM (Palmer, 1984), and SML (Geoffrion, 1988). The last view is the tabular view, which is an input format for a specific solver such as MPSX or MINOS. Because transforming the modeling language to the tabular view has been the topic of many studies on modeling languages, it will not be discussed in this article.

To formulate the semantic view of a specific model, a model builder uses the common modeling knowledge base as depicted in Figure 4. The common modeling knowledge base comprises the modeling structure knowledge and domain knowledge. The modeling structure knowledge is the component that needs to be learned from a variety of cases. An example knowledge representation is shown in Section 4.

A specific LP model formulated using the modeling knowledge base consists of the relevant modeling knowledge (retrieved from the common modeling knowledge base) and user's problem definition, which a model builder has interactively provided during the formulation process. An example knowledge representation is shown in Section 3.

3. REPRESENTATION OF MODEL CASES

3.1. Two Example Notational LP Models

To illustrate the representation of specific LP model cases, let us see two examples of notational LP models: (a) a monthly production and inventory model and (b) a long-term material supply contract model.

FIGURE 2. Contrast of case-based learning and formulation reasoning process.

FIGURE 3. Four modeling views (semantic view, modeling language view, notational view in aggregate notation and individual equation forms, and tabular view).

FIGURE 4. Architecture of UNIK-OPT.

3.1.1. Notations. $i$: product; $j$: facility; $t$: month;

$X_{ijt}$: production amount of product $i$, produced by facility $j$ during month $t$;
$I_{it}$: inventory of product $i$ at the end of month $t$;
$c_{ij}$: unit production cost of product $i$ at facility $j$;
$a_{ij}$: unit processing time of product $i$ at facility $j$;
$b_{jt}$: capacity of facility $j$ for month $t$;
$s_{it}$: sales estimate of product $i$ for month $t$.

3.1.2. Monthly Product Mix Model for 1992.

$$\min \sum_{i \in I} \sum_{j \in J} \sum_{t \in T} c_{ij} X_{ijt} \tag{1}$$

$$\text{s.t.} \quad \sum_{i \in I} a_{ij} X_{ijt} \le b_{jt}, \quad j \in J,\ t \in T \tag{2}$$

$$\sum_{j \in J} X_{ijt} + I_{i,t-1} - I_{it} = s_{it}, \quad i \in I,\ t \in T \tag{3}$$

$$X_{ijt},\ I_{it} \ge 0 \quad \text{for all } i, j, t. \tag{4}$$

In the above model, the objective is to minimize the total production costs, and the constraints are the production capacity and the monthly balance in production, inventory, and sales.

3.1.3. Annual Product Mix Model for 1992.

$$\min \sum_{i \in I} \sum_{j \in J} c_{ij} X_{ij,92} \tag{5}$$

$$\text{s.t.} \quad \sum_{i \in I} a_{ij} X_{ij,92} \le b_{j,92}, \quad j \in J \tag{6}$$

$$\sum_{j \in J} X_{ij,92} \le s_{i,92}, \quad i \in I \tag{7}$$

$$X_{ij,92} \ge 0 \quad \text{for all } i, j. \tag{8}$$


The symbols used in models (5)-(8) are the same as those in the monthly model, except that the month t is replaced by the year 1992. In this model, we do not need to consider the monthly inventory stated in (3). Instead, the inventory left over at the end of the year is considered in (7).
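As a small numeric illustration of how the monthly balance (3) differs from the annual limit (7), the following Python sketch applies (3) month by month for a single product and then checks (7) on the yearly totals. The production and sales figures, the function name `inventory_path`, and the zero opening inventory are assumptions made for this example only; they are not data from the refinery application.

```python
def inventory_path(production, sales, opening_inventory=0):
    """Apply balance (3) for one product: I_t = sum_j X_jt + I_{t-1} - s_t."""
    inventory = opening_inventory
    path = []
    for produced_by_facility, sold in zip(production, sales):
        inventory = sum(produced_by_facility) + inventory - sold
        path.append(inventory)
    return path


# Illustrative data: three months (rows), two facilities (columns).
production = [[30, 20], [40, 20], [10, 20]]
sales = [40, 50, 50]

path = inventory_path(production, sales)

# Monthly view: nonnegativity (4) must hold for every end-of-month inventory.
monthly_feasible = all(level >= 0 for level in path)

# Annual view: only the yearly totals are compared, as in (7).
annual_feasible = sum(map(sum, production)) <= sum(sales)
```

With these figures the end-of-month inventories are 10, 20, and 0, so the plan satisfies both views; a plan that sells before it produces could pass the annual check while violating the monthly one.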

3.2. Semantic Representation of Specific LP Model

The semantic view of the specific LP model case is structured as in Figure 5. The model includes the following information:

1. Semantics of indices that describe, for example, products, materials, facilities, and time units.
2. Linkages of indices with attributes.
3. Classification of attributes on whether they are decision variables or coefficients.
4. Indexed blocks of terms (BOTs). BOT denotes a set of terms that share the same summation sign.
5. Indices associated with constraints.
6. BOTs associated with objective function.

The case model has a one-to-one mapping with the mathematical notation of the LP model. Note that there are indices on constraints, blocks of terms, coefficients, and decision variables.

FIGURE 5. Structure of specific LP model.

An illustrative semantic model of (1)-(4) can be represented in frames as follows. First, the overall structure of model is shown in (9).

    {{MONTHLY_PRODUCTION_INVENTORY_MODEL
        is_a: LP_MODEL
        objective: (MIN PRODUCTION_COST)
        constraints: PRODUCTION_CAPACITY
                     PRODUCTION_SALES_BALANCE}}                (9)

The PRODUCTION_CAPACITY constraint in (9) can be further represented by an operator and a set of BOTs with the index slot named for_each in (10). Note that there is also the "context" slot. This kind of metaknowledge is necessary to resolve the inconsistencies between multiple constraints that are associated with the same set of BOTs.

    {{PRODUCTION_CAPACITY_INDEXED
        is_a: INDEXED_CONSTRAINT
        operator: LE
        LHS: (+ PROCESSING_TIME_BOT)
        RHS: (+ FACILITY_CAPACITY_BOT)
        for_each: FACILITY TIME_UNIT
        context: (TIME_UNIT MONTH)}}                           (10)

The PROCESSING_TIME_BOT in (10) can be further represented by a pair of coefficient and decision with the index slot named sum_index in (11).

    {{PROCESSING_TIME_BOT
        is_a: INDEXED_BOT
        coefficient: UNIT_PROCESSING_TIME
        decision: PRODUCTION_AMOUNT
        sum_index: PRODUCT}}                                   (11)

The attribute (decision or coefficient) in (11) can be linked with indices as in (12).

    {{PRODUCTION_AMOUNT_WITH_INDEX_ijt
        is_a: INDEXED_ATTRIBUTE
        attribute: PRODUCTION_AMOUNT
        index: PRODUCT FACILITY MONTH
        symbol: X(i, j, t)
        data_availability: no}}                                (12)

According to the value of the data_availability slot, the attribute is treated as either a variable or a coefficient. In this case, the indexed attribute PRODUCTION_AMOUNT_WITH_INDEX_ijt is treated as a variable because the data are not available.


Finally, the index in (12) can be represented as (13).

    {{PRODUCT
        is_a: INDEX
        symbol: i
        linked_attributes: PRODUCTION_AMOUNT
                           UNIT_PROCESSING_TIME}}              (13)
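The one-to-one mapping between the semantic frames and the notational view can be sketched in Python as follows. This is only an illustration under stated assumptions: the frames are held as plain dictionaries rather than UNIK frames, the FACILITY_CAPACITY_BOT frame and the symbol tables are not listed in the article and are filled in from the notation of Section 3.1.1, and the function names `render_bot` and `render_constraint` are hypothetical.

```python
# Frames (10) and (11) as plain dicts; slot names follow the article.
frames = {
    "PRODUCTION_CAPACITY_INDEXED": {
        "is_a": "INDEXED_CONSTRAINT",
        "operator": "LE",
        "LHS": ["PROCESSING_TIME_BOT"],
        "RHS": ["FACILITY_CAPACITY_BOT"],
        "for_each": ["FACILITY", "TIME_UNIT"],
        "context": ("TIME_UNIT", "MONTH"),
    },
    "PROCESSING_TIME_BOT": {
        "is_a": "INDEXED_BOT",
        "coefficient": "UNIT_PROCESSING_TIME",
        "decision": "PRODUCTION_AMOUNT",
        "sum_index": ["PRODUCT"],
    },
    # Assumed frame: the article does not list FACILITY_CAPACITY_BOT.
    "FACILITY_CAPACITY_BOT": {
        "is_a": "INDEXED_BOT",
        "coefficient": "FACILITY_CAPACITY",
        "decision": None,
        "sum_index": [],
    },
}

# Assumed symbol tables, taken from the notation of Section 3.1.1.
attribute_symbols = {
    "PRODUCTION_AMOUNT": "X(i,j,t)",
    "UNIT_PROCESSING_TIME": "a(i,j)",
    "FACILITY_CAPACITY": "b(j,t)",
}
index_symbols = {"PRODUCT": "i", "FACILITY": "j", "TIME_UNIT": "t"}
operators = {"LE": "<=", "GE": ">=", "EQ": "="}


def render_bot(name):
    """Render one block of terms, prefixing a sum for each sum_index."""
    bot = frames[name]
    term = attribute_symbols[bot["coefficient"]]
    if bot["decision"]:
        term += " * " + attribute_symbols[bot["decision"]]
    for index in bot["sum_index"]:
        term = f"sum_{index_symbols[index]} {term}"
    return term


def render_constraint(name):
    """Render an indexed constraint in aggregate notational form."""
    constraint = frames[name]
    lhs = " + ".join(render_bot(b) for b in constraint["LHS"])
    rhs = " + ".join(render_bot(b) for b in constraint["RHS"])
    scope = ", ".join(index_symbols[i] for i in constraint["for_each"])
    return f"{lhs} {operators[constraint['operator']]} {rhs} for each {scope}"


notation = render_constraint("PRODUCTION_CAPACITY_INDEXED")
```

Under these assumptions `notation` is the aggregate form of constraint (2), which shows that the frames carry everything the notational view needs.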

4. REPRESENTATION OF COMMON MODELING KNOWLEDGE BASE

As mentioned in Section 2, the common modeling knowledge base consists of the domain knowledge and modeling structure knowledge. The term "common" means that multiple users and models can share the knowledge base in a bounded environment such as a plant, a company, or an industry. For instance, the domain knowledge is concerned with products, raw materials, facilities, and time units, which are mostly used as indices of LP models. This information may be provided directly by knowledge engineers rather than by machine learning from the cases.

The structure of the modeling knowledge is shown in Figure 6. At the top level, there are potential objectives and index-free constraints. The potential objective is composed of index-free blocks of terms, while the index-free constraint is composed of blocks of terms and an operator (≤, ≥, or =). The index-free BOT can be represented by a pair of coefficient and decision. The constraint network effectively represents the relationships between the constraints and index-free BOTs. The decision, coefficient, and index-free constraints have potentially linkable indices. Figure 7 shows an illustrative constraint network generated from the two models described in Section 3.

5. CASE-BASED LEARNING FRAMEWORK

The case-based learning and formulation reasoning processes fill the gap between a specific LP model and a common modeling knowledge base. Roughly speaking, formulation reasoning is a specialization process, while case-based learning is a generalization process. Thus, the two processes run in opposite directions. In the case-based learning process, there are three key operations: addition of new knowledge, generalization of cases, and context identification between inconsistent constraints.

FIGURE 6. Structure of modeling knowledge base.

FIGURE 7. Constraint network.

5.1. Addition of New Knowledge

To learn new knowledge from a model case, one has to identify the frames that exist in the case but not in the knowledge base. These new frames can then simply be added to the knowledge base. Typically, the new frames provide knowledge about indices, attributes, BOTs, and constraints.

5.1.1. Generation of new indices. From the indexed attribute in (13), we can generate a new index frame PRODUCT if it had not existed already.

    {{PRODUCT
        is_a: INDEX
        symbol: i
        linkable_attributes: PRODUCTION_AMOUNT
                             UNIT_PROCESSING_TIME}}            (14)

Note that the slot linked_attributes has been relaxed to linkable_attributes, which is a kind of generalization. If the index frame already exists, only the names of new attributes need to be added to the linkable_attributes slot.
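The add-or-merge behavior described above can be sketched in Python. This is a minimal illustration, assuming frames are plain dictionaries keyed by frame name; the function name `learn_index` and the second case (which links PRODUCT to SALES_VOLUME) are hypothetical and serve only to show the merge step.

```python
def learn_index(knowledge_base, case_index):
    """Add an index frame from a case, or merge it into an existing one."""
    learned = knowledge_base.setdefault(
        case_index["name"],
        {"is_a": "INDEX", "symbol": case_index["symbol"], "linkable_attributes": []},
    )
    for attribute in case_index["linked_attributes"]:
        # "linked" in the case is relaxed to "linkable" in the knowledge base.
        if attribute not in learned["linkable_attributes"]:
            learned["linkable_attributes"].append(attribute)
    return learned


knowledge_base = {}

# Frame (13) from the monthly model case creates frame (14).
learn_index(knowledge_base, {
    "name": "PRODUCT",
    "symbol": "i",
    "linked_attributes": ["PRODUCTION_AMOUNT", "UNIT_PROCESSING_TIME"],
})

# A second, hypothetical case only contributes the attribute that is new.
learn_index(knowledge_base, {
    "name": "PRODUCT",
    "symbol": "i",
    "linked_attributes": ["PRODUCTION_AMOUNT", "SALES_VOLUME"],
})
```

After both calls the PRODUCT frame lists three linkable attributes, with PRODUCTION_AMOUNT recorded only once.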

5.1.2. Generation of new attributes. We can also generate new attributes from the indexed attribute in (12).

    {{PRODUCTION_AMOUNT
        is_a: ATTRIBUTE
        symbol: X
        linkable_index: PRODUCT FACILITY MONTH}}               (15)

Note that the linked index is also relaxed to linkable_index. The role of the attribute, described in the data_availability slot, is also removed. This means that the data availability will be determined during the formulation process.
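The step from the indexed attribute (12) to the attribute frame (15) can be sketched as a small transformation. The dictionary representation and the function name `generalize_attribute` are assumptions made for illustration; only the slot names come from the frames above.

```python
def generalize_attribute(indexed_attribute):
    """Derive an index-free attribute frame such as (15) from a frame such as (12)."""
    return {
        "name": indexed_attribute["attribute"],
        "is_a": "ATTRIBUTE",
        # "X(i, j, t)" becomes "X": the index list is dropped from the symbol.
        "symbol": indexed_attribute["symbol"].split("(")[0].strip(),
        # Linked indices are relaxed to linkable indices.
        "linkable_index": list(indexed_attribute["index"]),
        # data_availability is deliberately not copied; the role is decided
        # later, during formulation.
    }


frame_12 = {
    "name": "PRODUCTION_AMOUNT_WITH_INDEX_ijt",
    "is_a": "INDEXED_ATTRIBUTE",
    "attribute": "PRODUCTION_AMOUNT",
    "index": ["PRODUCT", "FACILITY", "MONTH"],
    "symbol": "X(i, j, t)",
    "data_availability": "no",
}

frame_15 = generalize_attribute(frame_12)
```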

5.1.3. Generation of new index-free BOTs. We can generate a new index-free BOT simply by removing the sum_index slot from the indexed BOT. For instance, the frame (16) can be generated from (11).

    {{PROCESSING_TIME_BOT
        is_a: BOT
        coefficient: UNIT_PROCESSING_TIME
        decision: PRODUCTION_AMOUNT}}                          (16)


5.1.4. Generation of new index-free constraints. A new index-free constraint can also be generated by changing the index slot for_each to linkable_index. For instance, the index-free constraint (17) can be generated from the indexed constraint in (10).

    {{PRODUCTION_CAPACITY
        is_a: CONSTRAINT
        operator: LE
        LHS: (+ PROCESSING_TIME_BOT)
        RHS: (+ FACILITY_CAPACITY_BOT)
        linkable_index: FACILITY TIME_UNIT
        context: (TIME_UNIT MONTH)}}                           (17)
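The two index-removal steps of Sections 5.1.3 and 5.1.4 can be sketched together in Python. The sketch assumes frames are plain dictionaries; the function names `to_index_free_bot` and `to_index_free_constraint` are hypothetical, and the input frames are (11) and (10) from Section 3.2.

```python
def to_index_free_bot(indexed_bot):
    """Drop the sum_index slot, as in the step from (11) to (16)."""
    bot = {slot: value for slot, value in indexed_bot.items() if slot != "sum_index"}
    bot["is_a"] = "BOT"
    return bot


def to_index_free_constraint(indexed_constraint):
    """Rename for_each to linkable_index, as in the step from (10) to (17)."""
    constraint = dict(indexed_constraint)  # leave the case frame untouched
    constraint["is_a"] = "CONSTRAINT"
    constraint["linkable_index"] = constraint.pop("for_each")
    return constraint


frame_11 = {
    "is_a": "INDEXED_BOT",
    "coefficient": "UNIT_PROCESSING_TIME",
    "decision": "PRODUCTION_AMOUNT",
    "sum_index": ["PRODUCT"],
}
frame_10 = {
    "is_a": "INDEXED_CONSTRAINT",
    "operator": "LE",
    "LHS": ["PROCESSING_TIME_BOT"],
    "RHS": ["FACILITY_CAPACITY_BOT"],
    "for_each": ["FACILITY", "TIME_UNIT"],
    "context": ("TIME_UNIT", "MONTH"),
}

frame_16 = to_index_free_bot(frame_11)
frame_17 = to_index_free_constraint(frame_10)
```

Both functions return new dictionaries, so the original case frames remain available for later comparison with other cases.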

5.2. Generalization

The generalization process runs in the opposite direction to the specialization process that takes place during formulation. There are two typical types of generalization.

5.2.1. Relaxation of linkage with indices. The "linked" relationships of indices with attributes, BOTs, and constraints in a specific model should be relaxed to "linkable" relationships in the common modeling knowledge base. For instance, the relationships are relaxed to linkable ones in (14), (15), and (17).

5.2.2. Relaxation of the attribute's role. In a specific model, the role of an attribute, whether it is a constant or a variable, is fixed. However, the role of an attribute should be determined depending upon the purpose of a specific model and dynamic data availability. Therefore, all coefficients and variables in the model cases should be generalized as attributes in the common modeling knowledge base.

The data availability of these attributes should be identified dynamically by consulting the data dictionary. However, the ultimate decision on the role of each attribute should be made by the model builder during the formulation process, for three reasons: some coefficients may be treated as variables, which implies a nonlinear programming model; some attributes may have a flexible role (i.e., the attribute may be a variable in one model and a constant in other models); and both the coefficient and the decision in a BOT may have constant values.
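One way to picture this two-stage role decision is the Python sketch below. It assumes the data dictionary can be reduced to a set of attribute names for which data exist, and it introduces the hypothetical helpers `suggest_role`, `resolve_role`, and `classify_bot`; none of these names come from UNIK-OPT.

```python
def suggest_role(attribute, data_dictionary):
    """Suggest a role from data availability: available data means constant."""
    return "constant" if attribute in data_dictionary else "variable"


def resolve_role(attribute, data_dictionary, builder_choice=None):
    """The model builder's explicit choice overrides the suggestion."""
    return builder_choice or suggest_role(attribute, data_dictionary)


def classify_bot(coefficient_role, decision_role):
    """Classify a BOT by the roles of its two attributes."""
    roles = (coefficient_role, decision_role)
    if roles == ("variable", "variable"):
        return "nonlinear term"  # a product of two variables
    if roles == ("constant", "constant"):
        return "constant term"
    return "linear term"


# Assumed data dictionary: data exist only for the unit processing time.
data_dictionary = {"UNIT_PROCESSING_TIME"}

default_term = classify_bot(
    resolve_role("UNIT_PROCESSING_TIME", data_dictionary),
    resolve_role("PRODUCTION_AMOUNT", data_dictionary),
)

# The builder decides to treat the processing time as a variable as well.
overridden_term = classify_bot(
    resolve_role("UNIT_PROCESSING_TIME", data_dictionary, builder_choice="variable"),
    resolve_role("PRODUCTION_AMOUNT", data_dictionary),
)
```

The override turns the same BOT from a linear term into a nonlinear one, which is why the final decision is left to the model builder rather than to the data dictionary alone.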

5.3. Context Identification

If more than one constraint is associated with the same BOTs, ambiguity arises during the formulation process. For example, two constraints are associated with the BOTs PRODUCTION_AMOUNT_BOT and SALES_VOLUME_BOT in Figure 7. From the two models in (1)-(4) and (5)-(8), constraints (2) and (6) can be merged into an index-free PRODUCTION_CAPACITY constraint.

However, constraints (3) and (7) are inconsistent. In this case, either the context in the cases may be transferred to the knowledge base or the knowledge engineer may identify the context of the inconsistent constraints. UNIK-CASE can then learn the context information to resolve the inconsistencies. The stored contexts will be displayed to the model builder during the formulation process so that the constraint related to a specific modeling context can be selected.
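The selection step can be sketched in Python as follows. The sketch assumes each index-free constraint frame lists its BOTs in a single `bots` slot; the frame name ANNUAL_SALES_LIMIT, the BOT name INVENTORY_BOT, and the (TIME_UNIT YEAR) context are hypothetical stand-ins for the knowledge learned from constraints (3) and (7).

```python
def candidates_for(constraints, bots):
    """Return every constraint that involves all of the given BOTs."""
    wanted = set(bots)
    return [c for c in constraints if wanted <= set(c["bots"])]


def select_by_context(candidates, context):
    """Keep only the candidates whose stored context matches."""
    return [c for c in candidates if c["context"] == context]


constraints = [
    {   # learned from the monthly balance (3)
        "name": "PRODUCTION_SALES_BALANCE",
        "operator": "EQ",
        "bots": ["PRODUCTION_AMOUNT_BOT", "INVENTORY_BOT", "SALES_VOLUME_BOT"],
        "context": ("TIME_UNIT", "MONTH"),
    },
    {   # learned from the annual limit (7)
        "name": "ANNUAL_SALES_LIMIT",
        "operator": "LE",
        "bots": ["PRODUCTION_AMOUNT_BOT", "SALES_VOLUME_BOT"],
        "context": ("TIME_UNIT", "YEAR"),
    },
]

candidates = candidates_for(constraints, ["PRODUCTION_AMOUNT_BOT", "SALES_VOLUME_BOT"])
is_ambiguous = len(candidates) > 1

# The model builder is building a monthly model and picks that context.
chosen = select_by_context(candidates, ("TIME_UNIT", "MONTH"))
```

Here both constraints are candidates for the same pair of BOTs, and the stored context is what lets the model builder pick the monthly balance rather than the annual limit.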

5.4. Role of Knowledge Engineers

As we have seen in the previous sections, the case-based learning framework does not completely eliminate the role of knowledge engineers. In particular, domain knowledge such as products, facilities, and materials, which are mainly used as indices of LP models, may be provided directly by knowledge engineers rather than by machine learning from the model cases. However, because knowledge engineers may not have complete knowledge about the modeling structure, UNIK-CASE can contribute to learning more of the modeling structure knowledge.

6. DEVELOPMENT OF PROTOTYPE UNIK-CASE AND ITS APPLICATION TO REFINERY

UNIK-OPT and UNIK-CASE are applied to the UNIK-R (Refinery) project, where the optimization models are used with the knowledge-based systems for integrated planning and control of the refinery plant (Lee, Oh, Suh, Kim, & Song, 1990). In this project, UNIK-OPT generates optimization models to solve problems such as daily product blending and monthly crude selection. Even for a small daily blending problem, which typically has fewer than 100 constraints, the model varies often to reflect the selection of processing units. Moreover, the monthly crude selection model usually has 500-1,500 constraints. Due to the size and complexity of the models, it was necessary to develop a knowledge-based optimization model formulation system. The prototype UNIK-OPT is developed using the frame-based tool UNIK and Common Lisp on a Sun 3/280 workstation. However, the preparation of the modeling structure knowledge was not an easy task. Therefore, the prototype case-based learning system UNIK-CASE was designed and is under development to aid the knowledge acquisition process. We expect that UNIK-CASE can gradually enhance the modeling knowledge base.

7. CONCLUSION

Case-based learning from specific LP model cases was attempted to aid the knowledge acquisition process for the knowledge-based formulation system UNIK-OPT. By using the case-based learner UNIK-CASE, we could automatically add new knowledge, generalize model cases, and identify contexts for ambiguous constraints. UNIK-CASE can be effectively used as a knowledge acquisition tool to help refine the modeling knowledge.

REFERENCES

Binbasioglu, M. (1990). Process based reconstructive approach to model building. In Proceedings of the 1990 ISDSS Conference, (pp. 383-403), Austin, TX: University of Texas at Austin.

Brooke, A., Kendrick, D., & Meeraus, A. (1988). GAMS: A user's guide. Redwood City, CA: The Scientific Press.

Geoffrion, A.M. (1988). SML: A model definition language for structured modeling (Working Paper No. 360). Los Angeles, CA: University of California.

Kolodner, J.L. (1991). Improving human decision making through case-based decision aiding. AI Magazine, 12(2), 52-68.

Kolodner, J.L., Simpson, R.L., & Sycara, K. (1985). A process model of case-based reasoning in problem solving. In Proceedings of the Ninth International Joint Conference on Artificial Intelligence (pp. 284-290), Los Angeles, CA: Morgan Kaufmann.

Lee, J.K., & Kim, M.Y. (1990). Case-based learning for knowledge-based optimization modeling system: UNIK-CASE. Paper presented at the 1990 ISDSS Conference, Austin, TX.

Lee, J.K., & Kim, M.Y. (in press). Knowledge-assisted optimization model formulation: UNIK-OPT. Decision Support Systems.

Lee, J.K., Oh, S.B., Suh, M.S., Kim, M.Y., & Song, Y.U. (1990). Knowledge network for planning and control of refinery industry: UNIK-R project experience. In Proceedings of the Fourth International Conference on Expert Systems in Production and Operations Management (pp. 16-32), Columbia, SC: University of South Carolina.

Liang, T.P. (1989). Modeling by analogy: A case-based approach to model construction (Working Paper No. 89-1524). Urbana-Champaign, IL: University of Illinois at Urbana-Champaign.

Liang, T.P., & Konsynski, B.R. (1990). Modeling by analogy: Use of analogical reasoning in model management systems. In Proceedings of the 1990 ISDSS Conference (pp. 405-421), Austin, TX: University of Texas at Austin.

Palmer, K. (1984). A model management framework for mathematical programming. New York: Wiley.

Riesbeck, C.K., & Schank, R.C. (1989). Inside case-based reasoning. Hillsdale, NJ: Lawrence Erlbaum.

Slade, S. (1991). Case-based reasoning: A research paradigm. AI Magazine, 12(1), 42-55.

Vellore, R.C., Sen, A., & Vinze, A.S. (1990). A case-based planning approach to model formulation. In Proceedings of the 1990 ISDSS Conference (pp. 353-382) Austin, TX: University of Texas at Austin.

Williams, R.S. (1988). Learning to program by examining and modifying cases. In Proceedings of the Fifth International Conference on Machine Learning (pp. 318-324), Ann Arbor, MI: Morgan Kaufmann.