
COVERAGE, DIVERSITY, AND COHERENCE OPTIMIZATION FOR MULTI-DOCUMENT SUMMARIZATION

Khoirul Umam1, Fidi Wincoko Putro2, Gulpi Qorik Oktagalu Pratamasunu3

1,2,3 Postgraduate Program of Informatics, Department of Informatics, Faculty of Information Technology, Sepuluh Nopember Institute of Technology

E-mail: [email protected], [email protected], [email protected]

Abstract

Good summarization of multiple documents on similar topics can help users obtain useful information quickly. A good summary must have extensive coverage, minimum redundancy (high diversity), and smooth connections among its sentences (high coherence). Therefore, multi-document summarization that considers the coverage, diversity, and coherence of the summary is needed. In this paper we propose a novel multi-document summarization method that optimizes the coverage, diversity, and coherence of the summary's sentences simultaneously. The method uses the self-adaptive differential evolution (SaDE) algorithm to solve the optimization problem. A sentence ordering algorithm based on the topical closeness approach is performed within the SaDE iterations to improve the coherence among summary sentences. Experiments have been performed on the Text Analysis Conference (TAC) 2008 data sets. The experimental results show that the proposed method generates summaries with average coherence values 29-41,2 times higher and ROUGE scores 46,97-64,71% higher than a method that only considers coverage and diversity.

Keywords: multi-document summarization, optimization, self-adaptive differential evolution, sentences ordering, topical closeness

Abstrak

Good summarization of documents with a uniform topic can help readers obtain information quickly. A good summary is one with broad coverage and with high diversity and high coherence among its sentences. Therefore, a multi-document summarization method that considers the coverage, diversity, and coherence of the summarization result is needed. This paper develops a new multi-document summarization method that optimizes the coverage, diversity, and coherence of the sentences in the summarization result. The method used is self-adaptive differential evolution (SaDE) with an added sentence ordering phase. SaDE is used to search for the summary solution with the highest coverage, diversity, and coherence through a sentence ordering stage based on the topical closeness approach. Experiments were conducted on 15 topics of the Text Analysis Conference (TAC) 2008 dataset. The experimental results show that the proposed method can generate summaries with average coherence 29-41,2 times higher and ROUGE scores 46,97-64,71% larger than a method that only considers the coverage and diversity of the summarization result.

Kata Kunci: optimization, sentence ordering, multi-document summarization, self-adaptive differential evolution, topical closeness

1. Introduction

The content of a single document can be long, presenting several pieces of information on a specified topic. Current technological developments make it easier than before for people to find related documents on a similar topic, and those documents can have long contents too. This means a massive quantity of data or information on a similar topic can be obtained.

The quantity of data available on the Internet today has reached such a huge volume that it has become humanly unfeasible to efficiently extract useful information from it [1]. Thus, automatic methods are needed in order to obtain useful information from documents efficiently.

Document summarization is one method for processing information automatically. It creates a compressed version of the documents that provides useful information, covering all relevant information in the original documents. Document summarization can be classified based on the number of documents processed simultaneously into single-document and multi-document summarization. Single-document summarization processes only one document into a summary, whereas multi-document summarization processes more than one document with a similar topic into a summary.

Various kinds of algorithms have been proposed for the multi-document summarization problem, including ontology-based, clustering, and heuristic approaches. An example of an ontology-based document summarization method is proposed in [2]. It performs multi-document summarization by utilizing the Yago ontology to capture the intent and context of sentences in documents, and it can choose the exact meaning of sentences containing ambiguous words based on Yago ontology scores.

Multi-document summarization methods based on the clustering approach have also been proposed, for example the method in [3], which generates a summary from a set of sentences that have been clustered based on inter-sentence similarity. The multi-document summarization method proposed in [1] also contains a clustering stage.

Multi-document summarization methods based on the heuristic approach utilize an optimization algorithm in order to select the summary's sentences properly. One such method is Optimization of Coverage and Diversity for Summarization using Self-adaptive Differential Evolution (OCDsum-SaDE), proposed in [4]. In that method, an optimal summary is searched for by considering the coverage and diversity of the summary's sentences.

Multi-document summarization cannot be separated from the sentence ordering process, which is needed to obtain a composition of summary sentences that allows users to absorb the information easily. Several summary sentence ordering methods have been proposed in [5-7], considering a variety of approaches, i.e. chronological, probabilistic, topical closeness, precedence, succession, semantic, and text entailment approaches. The ordering process is generally carried out after the document summarization process completes; thus, the results of sentence ordering depend on the summary.

A good summary is expected to meet three factors [4]: 1) extensive coverage, 2) high diversity or minimum redundancy, and 3) high coherence among summary sentences. A summary with extensive coverage has captured all the information in the original documents. Summary sentences with high diversity or minimum redundancy present the information without repetition. Meanwhile, smooth connectivity between summary sentences helps users understand and absorb the information in the summary easily.

The process of obtaining the best summary can be considered an optimization problem [8]. Therefore, the process of generating a summary with high levels of coverage, diversity, and coherence among its sentences can also be considered an optimization problem, and a multi-document summarization method that optimizes those factors simultaneously needs to be studied in order to generate a good summary.

In this paper we propose a novel method for multi-document summarization that considers the coverage, diversity, and coherence of the summary. The method is inspired by the self-adaptive differential evolution (SaDE) algorithm from [4] and the sentence ordering algorithm using the topical closeness approach in [6]. The SaDE algorithm is used to solve the coverage, diversity, and coherence optimization problem, while the topical closeness approach integrated into the SaDE iterations helps find a summary solution with optimal coherence. Thus, the method can generate summaries with extensive coverage, minimum redundancy, and high coherence among the summary's sentences.

2. Summary's Quality Factors

In this section we describe the three summary quality factors (i.e. coverage, diversity, and coherence) that are optimized in our proposed method.

2.1. Coverage

Let $N$ denote the number of sentences in the documents to be summarized, $M$ the number of distinct terms in the documents, $sen_n$ the nth sentence of the documents, with normalized form $\overline{sen}_n$, $term_m$ the mth distinct term, $tf_{nm}$ the number of occurrences of $term_m$ in $\overline{sen}_n$, $isf_m$ the inverse sentence frequency of $term_m$, and $N_m$ the number of sentences containing $term_m$. The weight $w_{nm}$ of $term_m$ in $\overline{sen}_n$ can be calculated using the term frequency-inverse sentence frequency (TF-ISF) scheme in Eq. (1) and Eq. (2):

$w_{nm} = tf_{nm} \times isf_m$, (1)

$isf_m = \log \frac{N}{N_m}$. (2)

Each normalized sentence $\overline{sen}_n$ is represented as a vector


which has $M$ components such that $\overline{sen}_n = [w_{n1}, \ldots, w_{nM}]$. The similarity between two sentences can be calculated using the cosine measure in Eq. (3):

$sim(\overline{sen}_i, \overline{sen}_j) = \dfrac{\sum_{m=1}^{M} w_{im} w_{jm}}{\sqrt{\sum_{m=1}^{M} w_{im}^2} \cdot \sqrt{\sum_{m=1}^{M} w_{jm}^2}}$. (3)
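To make the weighting scheme concrete, the following minimal sketch builds the TF-ISF matrix of Eqs. (1)-(2) and the cosine similarity of Eq. (3). The paper's experiments were run in Matlab; Python/NumPy is used here only for illustration, and the function names are ours.

```python
import numpy as np

def tf_isf_matrix(sentences, terms):
    """Eqs. (1)-(2): build the N x M TF-ISF weight matrix.

    `sentences` holds the normalized sentences as lists of terms;
    `terms` is the list of M distinct terms.
    """
    N, M = len(sentences), len(terms)
    tf = np.zeros((N, M))
    for n, sen in enumerate(sentences):
        for m, term in enumerate(terms):
            tf[n, m] = sen.count(term)        # tf_{nm}
    n_m = np.count_nonzero(tf, axis=0)        # N_m: sentences containing term_m
    isf = np.log(N / np.maximum(n_m, 1))      # isf_m, guarded against N_m = 0
    return tf * isf                           # w_{nm} = tf_{nm} * isf_m

def cosine_sim(a, b):
    """Eq. (3): cosine similarity between two sentence vectors."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b) / denom if denom > 0 else 0.0
```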

The summary's coverage value reflects the coverage of the summary's contents with respect to the contents of the original documents. It can be calculated by considering the similarity between the main content of the original documents and the main content of a candidate summary [4]. Radev et al. [9] describe that the main content of a document set is reflected by its centroid, i.e. the means of its term weights.

The centroid of the original documents and the centroid of a candidate summary are each represented as a vector with $M$ components. Let $S_p(t)$ denote the set of sentences in the pth candidate summary on the tth generation and $N_p(t)$ the number of sentences in $S_p(t)$. Each component $o_m$ of the original documents' centroid $O$ and each component $o_{m,p}(t)$ of the pth candidate summary's centroid $O_p(t)$ can be calculated using Eq. (4) and Eq. (5), respectively:

$o_m = \frac{1}{N} \sum_{n=1}^{N} w_{nm}$, (4)

$o_{m,p}(t) = \frac{1}{N_p(t)} \sum_{\overline{sen}_n \in S_p(t)} w_{nm}$. (5)

Alguliev et al. [4] also describe that the similarity between the main content of the original documents and the main content of a summary indicates the importance of the summary with respect to the original documents. Moreover, the similarity between the main content of the original documents and each summary sentence indicates the importance of that sentence: the greater the similarity, the more important the sentence. Therefore a greater coverage value reflects a better summary. The formulation of the summary's coverage value $f_{cover}$ is shown in Eq. (6), where $U_p^b(t)$ denotes the binary form of the solution vector for the pth candidate summary on the tth generation and $u_{n,p}^b(t)$ denotes its nth component (the process that generates this vector is described in the next section):

$f_{cover}(U_p^b(t)) = sim(O, O_p(t)) \cdot \sum_{n=1}^{N} sim(O, \overline{sen}_n)\, u_{n,p}^b(t)$. (6)

2.2. Diversity

The summary's diversity value reflects the diversity of the summary's sentences. It can be considered by calculating the similarity between the summary's sentences: if the summary has a high total sentence-similarity value, it has low diversity; otherwise, if it has a low total sentence-similarity value, it has high diversity between its sentences [4].

A summary with low diversity between its sentences tends to be a poor summary, because its sentences tend to discuss redundant information. Therefore, in order to get a good summary, a combination of summary sentences with high diversity has to be found. In other words, a combination of summary sentences with a low total sentence-similarity value has to be found, because it presents the information with minimum redundancy.

In this paper the summary's diversity value is defined as the total similarity among its sentences. The diversity value is therefore inversely related to the diversity of the sentences: the lower the diversity value, the more diverse the sentences and the better the summary.

The formulation of the summary's diversity value $f_{diver}$ is shown in Eq. (7). Eq. (7) only sums the similarity between summary sentences and ignores sentences that are not in the summary [4]:

$f_{diver}(U_p^b(t)) = \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} sim(\overline{sen}_i, \overline{sen}_j)\, u_{i,p}^b(t)\, u_{j,p}^b(t)$. (7)
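Continuing the sketch from Section 2.1 (it reuses cosine_sim and the weight matrix W from there), the coverage value of Eq. (6) and the diversity value of Eq. (7) could be computed as follows, with u_bin a binary selection vector for one candidate summary and S the precomputed N x N sentence similarity matrix:

```python
import numpy as np

def centroid(W):
    """Eq. (4): centroid of the original documents (mean term weights)."""
    return W.mean(axis=0)

def summary_centroid(W, u_bin):
    """Eq. (5): centroid of the sentences selected by u_bin."""
    return W[u_bin.astype(bool)].mean(axis=0)   # assumes >= 1 sentence chosen

def f_cover(W, u_bin):
    """Eq. (6): coverage value of a candidate summary."""
    O, O_p = centroid(W), summary_centroid(W, u_bin)
    return cosine_sim(O, O_p) * sum(
        cosine_sim(O, W[n]) for n in np.flatnonzero(u_bin))

def f_diver(S, u_bin):
    """Eq. (7): total pairwise similarity of the selected sentences
    (a lower value means a more diverse summary)."""
    idx = np.flatnonzero(u_bin)
    return sum(S[i, j] for a, i in enumerate(idx) for j in idx[a + 1:])
```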

2.3. Coherence

The summary's coherence value reflects the degree of coherence of the summary's sentences. It corresponds to the smoothness of the connections between summary sentences and thus to the readability of the summary. A summary with a higher coherence degree is expected to make it easier for readers to understand the information it presents.

Generally, a summary helps readers understand the information if its sentences are ordered such that two adjacent sentences discuss similar content or topics. This is the same principle as the topical closeness approach presented in [6]. The closeness between sentences' topics can be measured using the similarity value between the sentences: the greater the similarity between adjacent sentences, the more similar their contents or topics.

Based on this description we can conclude that a good summary is a summary with a high coherence degree between its adjacent sentences. But a good summary also has to present the information of its original documents in a simple form, i.e. with a small number of sentences. Therefore, the summary's coherence value $f_{cohere}$ in this paper is formulated as the total similarity of adjacent summary sentences divided by the number of summary sentences, as shown in Eq. (8), where $U_p^o(t)$ denotes the ordered form of the solution vector for the pth candidate summary on the tth generation and $u_{i,p}^o(t)$ denotes its ith component (the process that generates this vector is described in the next section):

$f_{cohere}(U_p^o(t)) = \dfrac{\sum_{i=1}^{N_p(t)-1} sim(\overline{sen}_{s(i)}, \overline{sen}_{s(i+1)})}{N_p(t)}, \quad s(i) = u_{i,p}^o(t)$. (8)
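The coherence value then only needs the similarity matrix S and the ordered list of selected sentence indices; a minimal sketch, continuing the earlier snippets:

```python
def f_cohere(S, order):
    """Eq. (8): summed similarity of adjacent summary sentences divided
    by the number of summary sentences. `order` is the ordered list of
    selected sentence indices (the vector U_p^o(t))."""
    if len(order) < 2:
        return 0.0
    total = sum(S[order[i], order[i + 1]] for i in range(len(order) - 1))
    return total / len(order)
```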

In order to improve the coherence among summary sentences, a sentence ordering process is performed. In this paper we propose two types of sentence ordering algorithm, described in Algorithm 1 and Algorithm 2. Both are inspired by the topical closeness approach presented in [6]. The first type (Type A) maximizes the similarity between adjacent sentences, whereas the second type (Type B) emphasizes that the two sentences with the most similar topics should be at the beginning of the summary's paragraph.


Algorithm 1. Sentence Ordering Type A

1. From the sentences in the candidate summary, choose the two sentences ($sen_a$ and $sen_b$) with the highest similarity $sim(sen_a, sen_b)$ and make them the initial ordering result: $ord = [sen_a, sen_b]$.
2. Set the status of $sen_a$ and $sen_b$ to head and tail, respectively.
3. From the sentences not yet in the ordering result, choose the sentence $sen_c$ with the highest similarity when paired with the head or the tail.
4. Do one of the following:
   a. If $sim(head, sen_c) \ge sim(tail, sen_c)$, put $sen_c$ in front of the head and set the status of $sen_c$ to head: $ord = [sen_c, sen_a, sen_b]$.
   b. If $sim(head, sen_c) < sim(tail, sen_c)$, put $sen_c$ behind the tail and set the status of $sen_c$ to tail: $ord = [sen_a, sen_b, sen_c]$.
5. Repeat steps 3-4 until all sentences are in the ordering result.

Algorithm 2. Sentence Ordering Type B

1. From the sentences in the candidate summary, choose the two sentences ($sen_a$ and $sen_b$) with the highest similarity $sim(sen_a, sen_b)$ and make them the initial ordering result: $ord = [sen_a, sen_b]$.
2. Choose the next sentence $sen_c$ with the highest similarity when paired with one of the sentences already in the ordering result ($sen_a$ or $sen_b$).
3. Do one of the following:
   a. If $sim(sen_a, sen_c) \ge sim(sen_b, sen_c)$, put $sen_c$ beside $sen_a$ and set the statuses of $sen_b$ and $sen_c$ to head and tail, respectively: $ord = [sen_b, sen_a, sen_c]$.
   b. If $sim(sen_a, sen_c) < sim(sen_b, sen_c)$, put $sen_c$ beside $sen_b$ and set the statuses of $sen_a$ and $sen_c$ to head and tail, respectively: $ord = [sen_a, sen_b, sen_c]$.
4. From the sentences not yet in the ordering result, choose the sentence $sen_c$ with the highest similarity when paired with the tail.
5. Put $sen_c$ behind the tail and set its status to tail.
6. Repeat steps 4-5 until all sentences are in the ordering result.
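The two algorithms could be transcribed as below (a sketch with our own names; it assumes at least two sentences are selected and omits the similarity-threshold variant used later in the experiments):

```python
def order_type_a(S, idx):
    """Algorithm 1 (Type A): grow the ordering at whichever end (head or
    tail) offers the higher similarity to the chosen sentence."""
    idx = list(idx)
    i, j = max(((i, j) for a, i in enumerate(idx) for j in idx[a + 1:]),
               key=lambda p: S[p[0], p[1]])            # most similar pair
    ord_ = [i, j]
    rest = [s for s in idx if s not in (i, j)]
    while rest:
        c = max(rest, key=lambda s: max(S[ord_[0], s], S[ord_[-1], s]))
        if S[ord_[0], c] >= S[ord_[-1], c]:
            ord_.insert(0, c)                          # c becomes the head
        else:
            ord_.append(c)                             # c becomes the tail
        rest.remove(c)
    return ord_

def order_type_b(S, idx):
    """Algorithm 2 (Type B): keep the most similar pair at the start of
    the paragraph and append the remaining sentences after the tail."""
    idx = list(idx)
    i, j = max(((i, j) for a, i in enumerate(idx) for j in idx[a + 1:]),
               key=lambda p: S[p[0], p[1]])            # most similar pair
    rest = [s for s in idx if s not in (i, j)]
    if not rest:
        return [i, j]
    c = max(rest, key=lambda s: max(S[i, s], S[j, s]))
    ord_ = [j, i, c] if S[i, c] >= S[j, c] else [i, j, c]   # step 3
    rest.remove(c)
    while rest:
        c = max(rest, key=lambda s: S[ord_[-1], s])    # steps 4-5
        ord_.append(c)
        rest.remove(c)
    return ord_
```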


3. Optimization of Summary's Quality Factors

The coverage, diversity, and coherence optimization process in our proposed method consists of a preprocessing phase and a main process phase. The main process implements the self-adaptive differential evolution (SaDE) algorithm inspired by [4], with the addition of a sentence ordering phase. For convenience, we denote our proposed method as the CoDiCo method, which stands for the three factors to be optimized, i.e. coverage, diversity, and coherence. Fig. 1 depicts the flowchart of the CoDiCo method.

3.1. Preprocessing Phase

The preprocessing phase prepares the data used in the main process. It consists of 1) sentence extraction, 2) sentence normalization, 3) distinct term extraction, 4) term weight matrix preparation, and 5) sentence similarity matrix preparation.

Sentence extraction takes each sentence from the documents with the same topic in the dataset. The process produces $N$ sentences. Each extracted sentence $sen_n$ is represented as a single entry in the sentence list $D$ such that $D = [sen_1, \ldots, sen_N]$.

After the extraction process, each sentence $sen_n$ is normalized into $\overline{sen}_n$ using stop-word removal, punctuation removal, and stemming. We use the 571 stop-words from the Journal of Machine Learning Research stop-word list (http://jmlr.org/papers/volume5/lewis04a/a11-smart-stop-list/english.stop) for the stop-word removal process, and the Porter Stemmer algorithm (http://tartarus.org/martin/PorterStemmer/) for the stemming process.
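A normalization step along these lines could look as follows; the Porter stemmer is taken from NLTK here, and the stop list would be a local copy of the 571-word file referenced above (both substitutions are ours, not the paper's setup):

```python
import string
from nltk.stem.porter import PorterStemmer

def normalize(sentence, stop_words):
    """Normalize one raw sentence: punctuation removal, stop-word
    removal, and stemming, mirroring the paper's three steps."""
    stemmer = PorterStemmer()
    table = str.maketrans("", "", string.punctuation)
    tokens = sentence.lower().translate(table).split()
    return [stemmer.stem(t) for t in tokens if t not in stop_words]

# e.g. stop_words = set(open("english.stop").read().split())
```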

In the next step we extract the distinct terms from each $\overline{sen}_n$. This process produces $M$ distinct terms. Each extracted term $term_m$ is stored in the term list $T$ such that $T = [term_1, \ldots, term_M]$.

Based on the $N$ normalized sentences and $M$ distinct terms, we generate a term weight matrix $W$ of dimensions $N \times M$. Each component of $W$ stores the weight $w_{nm}$ of $term_m$ in normalized sentence $\overline{sen}_n$, calculated using the TF-ISF scheme in Eqs. (1) and (2).

The term weights are then used to calculate the sentence similarities. The similarity between $\overline{sen}_i$ and $\overline{sen}_j$ for $i, j = [1, \ldots, N]$ is calculated using the cosine measure in Eq. (3). This process produces a sentence similarity matrix of dimensions $N \times N$.

3.2. Main Process Phase

The main steps of the main process phase of the CoDiCo method, as shown in Fig. 1, consist of initialization, binarization, ordering, evaluation, mutation, crossover, selection, stopping criterion, and output. The binarization and ordering steps are each performed twice: once for target vectors (i.e. the solution vectors generated by the initialization and selection steps) and once for trial vectors (i.e. the solution vectors generated by the crossover step). Each step is briefly described in the following subsections.

[Fig. 1. CoDiCo method flowchart: a preprocessing phase (sentences and terms extraction) feeds the main process loop of initialization, binarization and ordering of target vectors, solution evaluation, mutation, crossover, binarization and ordering of trial vectors, selection, and a stopping criterion that leads to the output.]

3.2.1. Initialization

Initialization provides a set of solutions $U$ used to search for the optimal summarization solution. Let $P$ and $t$ denote the number of generated solutions and the current generation, respectively, such that $U(t) = [U_1(t), \ldots, U_P(t)]$ for $t = 0$. Each solution in $U$ is referred to as a target vector. Each target vector $U_p(t)$ for $p = [1, \ldots, P]$ is represented as a vector with $N$ components such that $U_p(t) = [u_{1,p}(t), \ldots, u_{N,p}(t)]$, where $u_{n,p}(t)$ denotes the nth component of the pth target vector.

In this step each target vector component $u_{n,p}(0)$ is randomly initialized with a real value between a specified lower bound $u_{min}$ and upper bound $u_{max}$, as shown in Eq. (9):

$u_{n,p}(0) = u_{min} + (u_{max} - u_{min}) \cdot rand_{n,p}$, (9)

where $rand_{n,p}$ denotes a uniform random value between 0 and 1 for the nth component of the pth target vector [4].
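A sketch of the initialization step; the uniform random numbers rand_{n,p} are returned as well because the binarization step below reuses them:

```python
import numpy as np

rng = np.random.default_rng()

def initialize(P, N, u_min=-5.0, u_max=5.0):
    """Eq. (9): P target vectors with N real-valued components drawn
    uniformly between the bounds. Returns the vectors and the rand_{n,p}
    values, which the binarization step reuses."""
    rand = rng.random((P, N))
    U = u_min + (u_max - u_min) * rand
    return U, rand
```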

3.2.2. Binarization

Binarization encodes the real value of each $u_{n,p}(t)$ into a binary value. The binary values indicate which sentences from $D$ are used in the pth candidate summary $S_p(t)$: if $u_{n,p}^b(t) = 1$, then $sen_n$ is selected as a sentence of $S_p(t)$; otherwise, if $u_{n,p}^b(t) = 0$, then $sen_n$ is not a sentence of $S_p(t)$.

Alguliev et al. [4] describe that the encoding of the real value $u_{n,p}(t)$ into the binary value $u_{n,p}^b(t)$ can be performed by comparing $rand_{n,p}$ with the sigmoid value of $u_{n,p}(t)$, as shown in Eq. (10) and Eq. (11):

$u_{n,p}^b(t) = \begin{cases} 1, & \text{if } rand_{n,p} < sigm(u_{n,p}(t)) \\ 0, & \text{otherwise} \end{cases}$, (10)

$sigm(A) = \dfrac{1}{1 + e^{-A}}$. (11)

The $rand_{n,p}$ in this step has the same value as the one used in the initialization step.
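In code, Eqs. (10)-(11) reduce to one vectorized comparison against the stored random numbers:

```python
import numpy as np

def binarize(U, rand):
    """Eqs. (10)-(11): a component becomes 1 when its stored uniform
    random number falls below the sigmoid of the real value."""
    sigm = 1.0 / (1.0 + np.exp(-U))
    return (rand < sigm).astype(int)
```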

3.2.3. Ordering

In this step, the $N_p(t)$ sentences of each $S_p(t)$ derived from the solution $U_p(t)$ are ordered using one of the sentence ordering algorithms described in Subsection 2.3. The ordered form of $U_p(t)$ is stored in the ordering-solution vector $U_p^o(t)$, which has $N_p(t)$ components $u_{i,p}^o(t)$ such that $U_p^o(t) = [u_{1,p}^o(t), \ldots, u_{N_p(t),p}^o(t)]$. Each component $u_{i,p}^o(t)$ stores the index of a sentence included in $S_p(t)$ (i.e. $u_{i,p}^o(t) \in [1, \ldots, N]$).

3.2.4. Solutions Evaluation

The evaluation step calculates a fitness value for each summarization solution. Evaluations are performed on each $U_p(t)$ that has been encoded into its binary form $U_p^b(t)$ and ordered form $U_p^o(t)$. Following the purpose of this paper, the fitness value $fit(U_p(t))$ is calculated by considering the three summary quality factors, i.e. its coverage, diversity, and coherence values, as formulated in Eq. (12).

The best and worst solutions on the current generation can be determined using the fitness values. The best solution on the current generation (local best) $U^{lbest}(t)$ is the target vector with the highest fitness value, whereas the worst solution on the current generation (local worst) $U^{lworst}(t)$ is the target vector with the lowest fitness value. In this step we can also update the global best $U^{gbest}(t)$, i.e. the best solution found up to the current generation, using the rule formulated in Eq. (13).

$fit(U_p(t)) = \dfrac{f_{cover}(U_p^b(t))}{f_{diver}(U_p^b(t))} \cdot f_{cohere}(U_p^o(t))$. (12)

$U^{gbest}(t) = \begin{cases} U^{lbest}(t), & \text{if } fit(U^{lbest}(t)) > fit(U^{gbest}(t-1)) \\ U^{gbest}(t-1), & \text{otherwise} \end{cases}$. (13)
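With the quality functions from Section 2, the fitness of Eq. (12) could be evaluated as below; this sketch reuses f_cover, f_diver, and f_cohere from the earlier snippets, and the zero-diversity guard for one-sentence summaries is our own addition:

```python
def fitness(W, S, u_bin, order):
    """Eq. (12): reward coverage and coherence, penalize the diversity
    value (which is lower for more diverse summaries)."""
    div = f_diver(S, u_bin)
    if div == 0.0:                   # degenerate summary, treat as unfit
        return 0.0
    return f_cover(W, u_bin) / div * f_cohere(S, order)
```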


3.2.5. Mutation

Mutation generates a set of mutant vectors $V$ from the set of target vectors $U$. The mutation of $U_p(t)$ involves the $U^{lbest}(t)$ vector, the $U^{gbest}(t)$ vector, a randomly selected vector $U_{p1}(t)$ where $p1 = [1, \ldots, P]$ and $p1 \ne p$, and a mutation factor $F(t)$ for the current generation. The pth mutant vector on the current generation $V_p(t)$ is generated using Eq. (14), whereas the $F(t)$ value is calculated using Eq. (16):

$V_p(t) = U_{p1}(t) + (1 - F(t)) \cdot (U^{lbest}(t) - U_{p1}(t)) + F(t) \cdot (U^{gbest}(t) - U_{p1}(t))$, (14)

$F(t) = e^{-t/t_{max}}$, (16)

where $t_{max}$ denotes the maximum generation specified in the initialization step [4].

One or more components $v_{n,p}(t)$ of $V_p(t)$ may violate the boundary constraints, i.e. their values may be less than $u_{min}$ or greater than $u_{max}$. Each violating $v_{n,p}(t)$ has to be reflected back into the feasible range using the rule formulated in Eq. (15):

$v_{n,p}(t) = \begin{cases} 2u_{min} - v_{n,p}(t), & \text{if } v_{n,p}(t) < u_{min} \\ 2u_{max} - v_{n,p}(t), & \text{if } v_{n,p}(t) > u_{max} \end{cases}$. (15)
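A sketch of the mutation step under the form of Eq. (14) given above; the vector names and the rng argument are our own:

```python
import numpy as np

def mutate(U, p, lbest, gbest, t, t_max, rng, u_min=-5.0, u_max=5.0):
    """Eqs. (14)-(16): pull a random peer toward the local and global
    best, then reflect out-of-bound components back into range."""
    F = np.exp(-t / t_max)                                       # Eq. (16)
    p1 = rng.choice([i for i in range(len(U)) if i != p])        # p1 != p
    V = U[p1] + (1 - F) * (lbest - U[p1]) + F * (gbest - U[p1])  # Eq. (14)
    V = np.where(V < u_min, 2 * u_min - V, V)                    # Eq. (15)
    V = np.where(V > u_max, 2 * u_max - V, V)
    return V
```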

3.2.6. Crossover

Crossover generates the set of trial vectors $Z$. Each trial vector $Z_p(t)$ has $N$ components $z_{n,p}(t)$ whose values are derived from $u_{n,p}(t)$ or $v_{n,p}(t)$ [4]. The purpose of this operation is to increase the diversity of the solution vectors in order to expand the search space for the optimal solution.

Alguliev et al. [4] describe that to generate the $Z_p(t)$ vector, the relative distance $RD_p(t)$ between the $U_p(t)$ vector and the $U^{lbest}(t)$ vector has to be calculated first. The $RD_p(t)$ is then used to calculate the crossover rate $CR_p(t)$. Eqs. (17)-(19) show the formulations of $RD_p(t)$ and $CR_p(t)$:

$RD_p(t) = \dfrac{fit(U^{lbest}(t)) - fit(U_p(t))}{fit(U^{lbest}(t)) - fit(U^{lworst}(t))}$, (17)

$CR_p(t) = \tanh(RD_p(t))$, (18)

$\tanh(A) = \dfrac{e^{2A} - 1}{e^{2A} + 1}$. (19)

The rule that determines the value of the trial vector component $z_{n,p}(t)$ is formulated in Eq. (20), where $k$ is a randomly selected integer with $k = [1, \ldots, N]$. It ensures that at least one component of the trial vector is obtained from the mutant vector, so that the solutions on the next generation differ from the solutions on the current generation [4]:

$z_{n,p}(t) = \begin{cases} v_{n,p}(t), & \text{if } rand_{n,p} \le CR_p(t) \text{ or } n = k \\ u_{n,p}(t), & \text{otherwise} \end{cases}$. (20)
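The adaptive crossover of Eqs. (17)-(20) could then be sketched as follows; the fitness values are passed in, and the division guard is our own:

```python
import numpy as np

def crossover(U_p, V_p, fit_p, fit_lbest, fit_lworst, rng):
    """Eqs. (17)-(20): vectors far from the local best get a larger
    crossover rate and therefore take more components from the mutant."""
    span = fit_lbest - fit_lworst
    RD = (fit_lbest - fit_p) / span if span > 0 else 0.0     # Eq. (17)
    CR = np.tanh(RD)                                         # Eqs. (18)-(19)
    N = len(U_p)
    take = rng.random(N) <= CR
    take[rng.integers(N)] = True    # k: one mutant component always survives
    return np.where(take, V_p, U_p)                          # Eq. (20)
```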

3.2.7. Selection

Selection generates the set of target vectors for the next generation, $U(t+1)$. Each vector is the better of the corresponding $U_p(t)$ and $Z_p(t)$ vectors in terms of fitness value [4]; in other words, only the best solution of each pair survives this operation. This ensures that the search for the optimal solution always moves toward the best solution until the last iteration. The rule that chooses the pth target vector of the next generation $U_p(t+1)$ is formulated in Eq. (21):

$U_p(t+1) = \begin{cases} Z_p(t), & \text{if } fit(Z_p(t)) \ge fit(U_p(t)) \\ U_p(t), & \text{otherwise} \end{cases}$. (21)
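Selection is a one-to-one greedy comparison, e.g.:

```python
def select(U_p, Z_p, fit_u, fit_z):
    """Eq. (21): the better of the target and trial vectors survives
    into generation t+1 (ties go to the trial vector)."""
    return (Z_p, fit_z) if fit_z >= fit_u else (U_p, fit_u)
```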

3.2.8. Stopping Criterion

In this step it is determined whether the iteration of the optimal solution search stops or continues. The stopping criterion in this paper is a specified number of generations: if the iteration has reached the maximum generation $t_{max}$, the iteration stops; otherwise, it continues.



3.2.9. Output

This step is the final step of the main process of the CoDiCo method. At this point the global best summarization solution on the last generation, $U^{gbest}(t_{max})$, has been acquired. Its binary form denotes the indices of the sentences in $D$ selected as summary sentences, whereas its ordered form stores the order of the summary sentence indices. The sentence set indicated by the ordered form is returned as the summary.

4. Experiments and Results

In this paper we use the Text Analysis Conference (TAC) 2008 datasets from the National Institute of Standards and Technology (NIST) to test our CoDiCo method. We choose 15 topics for the testing; each topic has 10 documents to be summarized. Fig. 2 shows an example document from TAC 2008.

The experiments are performed using Matlab R2013a, run on the Microsoft Windows platform. We test both proposed sentence ordering algorithms, Type A and Type B, using the CoDiCo method, and refer to the resulting variants as CoDiCo-A and CoDiCo-B, respectively. In order to compare the summarization results of the CoDiCo method with a multi-document summarization method that considers the coverage and diversity factors only, we use the OCDsum-SaDE method from [4].

We also test both proposed sentence ordering algorithms with a threshold $T$ on the sentence similarity value in order to evaluate the impact of the similarity between summary sentences on the optimal summary solution. We use three threshold values, i.e. 0,7, 0,8, and 0,9. In this scenario, the sentence ordering process excludes every pair of sentences whose similarity value is greater than or equal to $T$; an excluded pair will not be chosen as adjacent sentences in the summary solutions.

Both our proposed method (CoDiCo) and the compared method (OCDsum-SaDE) use four parameters specified in the initialization step, i.e. population size ($P$), maximum generation ($t_{max}$), lower bound ($u_{min}$), and upper bound ($u_{max}$), which are set to 20, 500, -5, and 5, respectively. These parameter values were assigned based on heuristic choices. After the multi-document summarizations were processed using the tested methods, we obtained one summary per selected topic, i.e. 15 summaries for each method. Fig. 3 depicts an example summary.

The testing results are evaluated using the Recall-Oriented Understudy for Gisting Evaluation (ROUGE) method [10]. Four ROUGE variants are used to evaluate our experiments, i.e. ROUGE-1, ROUGE-2, ROUGE-L, and ROUGE-SU. ROUGE-1 and ROUGE-2 are variants of ROUGE-N, which considers the n-gram recall between a candidate summary and a reference summary, with n assigned to 1 and 2. ROUGE-L considers the longest common subsequence (LCS) between the candidate summary and the reference summary. ROUGE-SU uses skip-bigrams together with unigrams of the candidate and reference summaries as the counted units [4,10].

The ROUGE scores for the CoDiCo-A, CoDiCo-B, and OCDsum-SaDE methods are presented in Table 1, which compares the ROUGE scores of all tested methods.

We also evaluate our proposed method by considering the average coherence value of the summaries generated by each tested method. The comparison of average coherence values is shown in Table 2.

<DOC id="XIN_ENG_20050403.0060" type="story" > <HEADLINE> 1st lead: Syria to withdraw all forces from Lebanon by April 30 </HEADLINE> <DATELINE> DAMASCUS, April 3 (Xinhua) </DATELINE> <TEXT> <P> UN envoy Terje Roed-Larsen said Sunday that the Syrian government has pledged to remove all Syrian troops and intelligence agents from Lebanon by April 30. </P> … </TEXT> </DOC>

Fig. 2. An Example Document From TAC 2008


5. Discussion

A series of experiments has been conducted to evaluate our proposed methods (CoDiCo-A and CoDiCo-B) against the compared method (OCDsum-SaDE). Based on the evaluation results in Table 1, CoDiCo-A with $T$ = 0,7 has higher ROUGE-1, ROUGE-L, and ROUGE-SU scores than the other methods, whereas on ROUGE-2 CoDiCo-B with $T$ = 0,8 has the highest score. Table 1 also shows that the lowest average ROUGE score among the CoDiCo variants is 0,3370, obtained by CoDiCo-A with $T$ = 0,9, and the highest is 0,3777, obtained by CoDiCo-A with $T$ = 0,7, whereas the average ROUGE score of the OCDsum-SaDE method only reaches 0,2293. This means all CoDiCo variants perform better than the compared method.

Comparing the average ROUGE scores of the CoDiCo variants with that of OCDsum-SaDE, the CoDiCo methods achieve average ROUGE scores 46,97-64,71% higher than the compared method. This shows that a multi-document summarization method that considers coverage, diversity, and coherence simultaneously can produce better summaries than a summarization method that considers coverage and diversity only.

Based on the evaluation of average summary coherence values shown in Table 2, CoDiCo-B without a threshold reaches the highest average coherence value. The lowest average coherence value among the CoDiCo variants is 0,145, obtained by CoDiCo-B with $T$ = 0,7; nevertheless this value is still higher than the average coherence value of OCDsum-SaDE. Compared with the OCDsum-SaDE method, CoDiCo-B without a threshold produces summaries with an average coherence value about 41,2 times higher, whereas CoDiCo-B with $T$ = 0,7 still reaches an average coherence value about 29 times higher.

A 25-year-old man was killed when the tree fell on the car in which he was traveling, CBS-4 television reported, quoting local authorities. Hurricane Katrina caused a first fatality in southeastern Florida Thursday when it sent a tree crashing on the roof of a car in Fort Lauderdale, local television reported. Swiss Reinsurance Co., the world's second largest reinsurance company on Monday doubled to 40 billion US dollars its initial estimate of the global insured losses caused by Hurricane Katrina in the United States. Hurricane Katrina could leave the world's insurers facing a bill of up to 50 billion dollars (40 billion euros), British company Brit Insurance forecast on Tuesday. Rescue crews worked frantically Tuesday to save hundreds of people trapped by floodwaters from Hurricane Katrina, which devastated the US Gulf Coast states of Louisiana, Mississippi and Alabama and reportedly left dozens of people dead. "The US had been generous in responding to natural and humanitarian disasters all over the world, include in Africa ... justice requires us to aid the people in Louisiana, Mississippi and Alabama, who have lost their homes and loved ones," President Yoweri Museveni was quoted by The New Vision as saying. Those returns will allow us to give more information about the markets overall exposure. "Despite the severity of Katrina, the Lloyds market is well-equipped to manage the financial impact of a catastrophe on this scale.

Fig. 3. An Example of Summarization Result

TABLE 1. ROUGE SCORES COMPARISON OF EACH TESTED METHOD

Methods                        ROUGE-1  ROUGE-2  ROUGE-L  ROUGE-SU  Average
OCDsum-SaDE                    0,3701   0,0794   0,3424   0,1254    0,2293
CoDiCo-A (without threshold)   0,5144   0,1395   0,4820   0,2256    0,3404
CoDiCo-B (without threshold)   0,5279   0,1524   0,4933   0,2351    0,3522
CoDiCo-A (T = 0,9)             0,5118   0,1443   0,4749   0,2172    0,3370
CoDiCo-B (T = 0,9)             0,5163   0,1525   0,4785   0,2304    0,3444
CoDiCo-A (T = 0,8)             0,5322   0,1473   0,4963   0,2310    0,3517
CoDiCo-B (T = 0,8)             0,5464   0,1747   0,5127   0,2640    0,3744
CoDiCo-A (T = 0,7)             0,5620   0,1611   0,5205   0,2674    0,3777
CoDiCo-B (T = 0,7)             0,5296   0,1448   0,4923   0,2434    0,3525

TABLE 2. AVERAGE COHERENCE VALUE COMPARISON OF EACH TESTED METHOD

Methods                        Average coherence value
OCDsum-SaDE                    0,005
CoDiCo-A (without threshold)   0,193
CoDiCo-B (without threshold)   0,206
CoDiCo-A (T = 0,9)             0,151
CoDiCo-B (T = 0,9)             0,167
CoDiCo-A (T = 0,8)             0,148
CoDiCo-B (T = 0,8)             0,160
CoDiCo-A (T = 0,7)             0,140
CoDiCo-B (T = 0,7)             0,145


This shows that the CoDiCo method, which involves an ordering step inside the optimization process, produces summaries with better coherence, i.e. smoother connectivity among sentences, than a method that does not consider the ordering of the summary's sentences.

Comparing the two proposed sentence ordering algorithms at the same threshold value using their ROUGE scores shows that CoDiCo-B is generally better than CoDiCo-A. As shown in Table 1, CoDiCo-B has a higher average ROUGE score than CoDiCo-A without a threshold, with $T$ = 0,9, and with $T$ = 0,8; CoDiCo-A only has a higher average ROUGE score than CoDiCo-B with $T$ = 0,7.

Comparing the average coherence values of both sentence ordering algorithms in Table 2, CoDiCo-B has a higher average coherence value than CoDiCo-A for every threshold setting. These comparisons indicate that the ordering strategy that places the sentences with the highest similarity at the beginning of the paragraph (Type B) is better than the strategy that maximizes the similarity between two adjacent sentences anywhere in the summarization result (Type A).

6. Conclusion

This paper has proposed and described a new multi-document summarization method that considers the coverage, diversity, and coherence factors of the summary's sentences simultaneously. The proposed method improves the connectivity among summary sentences, and thereby the readability of the summary, by adding a sentence ordering phase inside the summary optimization process. Thus the sentence ordering process no longer depends on a previously completed summary.

The experimental results show that a multi-document summarization method that considers the coverage, diversity, and coherence factors simultaneously provides better summaries than methods that only consider the coverage and diversity of the summary. In addition, considering the coherence factor yields summaries with better readability compared with a method that does not consider this factor.

This research can be developed further, for example by considering other approaches in the sentence ordering algorithm in order to improve the readability of the summary.

7. References

[1] Ferreira, R., Cabral, L.S., Freitas, F., Lins, R.D., Silva, G.F., Simske, S.J., and Favaro, L., "A Multi-Document Summarization System Based on Statistics and Linguistic Treatment", Expert Systems with Applications, vol. 41, pp. 5780-5787, 2014.

[2] Baralis, E., Cagliero, L., Jabeen, S., Fiori, A., and Shah, S., "Multi-Document Summarization Based on The Yago Ontology", Expert Systems with Applications, vol. 40, pp. 6976-6984, 2013.

[3] Aliguliyev, R.M., "A New Sentence Similarity Measure and Sentence Based Extractive Technique for Automatic Text Summarization", Expert Systems with Applications, vol. 36, pp. 7764-7772, 2009.

[4] Alguliev, R.M., Aliguliyev, R.M., and Isazade, N.R., "Multiple Documents Summarization Based on Evolutionary Optimization Algorithm", Expert Systems with Applications, vol. 40, pp. 1675-1689, 2013.

[5] Barzilay, R., Elhadad, N., and McKeown, K., "Inferring Strategies for Sentence Ordering in Multidocument News Summarization", Journal of Artificial Intelligence Research, vol. 17, pp. 35-55, 2002.

[6] Bollegala, D., Okazaki, N., and Ishizuka, M., "A Preference Learning Approach to Sentence Ordering for Multi-Document Summarization", Information Sciences, vol. 217, pp. 78-95, 2012.

[7] Sukumar, P., and Gayathri, K.S., "Semantic Based Sentence Ordering Approach for Multi-Document Summarization", International Journal of Recent Technology and Engineering (IJRTE), vol. 3, no. 2, pp. 71-76, 2014.

[8] Huang, L., He, Y., Wei, F., and Li, W., "Modeling Document Summarization as Multi-objective Optimization", in Third International Symposium on Intelligent Information Technology and Security Informatics (IITSI), Jinggangshan, China, 2010.

[9] Radev, D.R., Jing, H., Stys, M., and Tam, D., "Centroid-based Summarization of Multiple Documents", Information Processing and Management, vol. 40, pp. 919-938, 2004.

[10] Lin, C.-Y., "ROUGE: A Package for Automatic Evaluation of Summaries", in Proceedings of the Workshop on Text Summarization Branches Out, Barcelona, Spain, pp. 74-81, 2004.