A Semantic-Based Approach for Preserving Operational Experience of Nuclear Installations

A.N. Kosilov, V.M. Kupriyanov, I.A. Koupriyanova, N.V. Maksimov

NRNU MEPhI, Moscow, Russian Federation
RPA “TYPHOON”, Obninsk, Russian Federation

Third International Conference on Nuclear Knowledge Management: Challenges and Approaches, RKM-2016


The paper discusses experience in developing the tools needed for the first phase of automating the description of explicit technological knowledge: the thematic categorization of text documents on the nuclear facility life cycle.

In the study, the existing IAEA INIS thematic index was used because it is currently the best supported by such tools for Russian texts, thanks to the Russian part of the multilingual INIS thesaurus; the Russian national index GRNTI lacks similar tools.

It is argued that a simple approach to the analysis, based on the frequency of occurrence of technical terms in the analyzed text, does not yield consistent results for thematic categorization of text.
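The frequency-based approach mentioned above can be illustrated with a minimal sketch. The descriptor lists, category names, and sample sentences are hypothetical (the study worked with Russian descriptors chosen by indexing experts); the point is only to show how two texts on the same topic can land in different categories when nothing but word matches are counted.

```python
import re
from collections import Counter

# Hypothetical descriptor lists; the real lists were compiled by indexing experts.
DESCRIPTORS = {
    "reactor_core": {"fuel", "assembly", "core", "neutron"},
    "coolant_system": {"coolant", "pump", "circuit", "pipeline"},
}


def tokenize(text):
    """Split text into lowercase alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def frequency_scores(text, descriptors=DESCRIPTORS):
    """Count how many tokens of the text match each category's descriptors."""
    counts = Counter(tokenize(text))
    return {
        category: sum(counts[word] for word in words)
        for category, words in descriptors.items()
    }


def categorize(text, descriptors=DESCRIPTORS):
    """Return the category with the most matches, or None if nothing matched."""
    scores = frequency_scores(text, descriptors)
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


# Two texts about the same maintenance job, worded differently.
text_a = "Main circulation pump maintenance: pump seals replaced."
text_b = "Main circulation unit maintenance near the core: core-side seals replaced."

result_a = categorize(text_a)  # matches "pump" twice
result_b = categorize(text_b)  # matches only "core", twice
```

Under these assumptions the first text is assigned to the coolant system and the second to the reactor core, although both describe the same work.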

Content

INIS IAEA THESAURI

Russian Language part of INIS Thesauri

Russian national index GRNTI

Operating documentation


Proposed approach

• It is proposed to increase the stability of the categorization procedure by using a simple ontological model. The model establishes complex links between the occurrence in the text of words from a manually selected, pre-marked-up list of terms (descriptors) and the specifically described semantic features of the index headings that group knowledge by topic.

Experience has shown that, to build such a model, indexing experts must manually mark up every descriptor according to the categories it belongs to.


What is to be preserved?

In the classical approach, three types of knowledge are usually considered:

explicit - information and knowledge presented on a tangible medium,

implicit - information or knowledge that is not contained in a material form but can be made explicit, and

tacit - information or knowledge that is extremely difficult to express in material form.

We argue that it is important not to reduce this number to two types (implicit and tacit), because doing so could lead to the loss of a very important field of activity: the creation of procedures to describe and identify subject knowledge, in particular the identification and preservation of the operating experience of nuclear facilities.

This paper is devoted specifically to the technological aspects of describing and handling knowledge so that it is preserved effectively.

The mechanism used to identify each storage unit in the knowledge preservation system determines the basic principles of the knowledge organization system.


Adaptation of the IAEA KOS FR system to the SVBR-100 small fast reactor project (Russia)

Explicit nuclear knowledge is contained in documents, drawings, calculations, designs, databases, procedures and manuals. The term "explicit knowledge" refers to declared knowledge (i.e. knowledge that has been extracted from its bearers). Subject heading assignment is the procedure of linking each stored object with one of the predefined classes. The classes are defined by the unity of some concepts (in the case of thematic headings) or by a coincidence of structural attributes.

As a rule, although the index contains predefined sets, the attributes required to uniquely identify the subject heading of a new element are not defined strictly enough.

In practice, indexing experts assign headings "manually" for each newly described storage element. However, there is now an urgent need to develop an information system for newly constructed nuclear power plants, because a huge number of different documents (more than 100,000 objects) must be described in the information repository.

Automating the thematic heading of NPP documents will help to solve one of the most pressing problems of information support of the NPP life cycle: the creation of a long-term knowledge organization system. Solving this problem is a basic prerequisite for improving plant safety over the more than 40 years of its operation.


Adaptation of the IAEA KOS FR system to the SVBR-100 small fast reactor project (Russia)

• Since 2012, experts of the Russian National INIS Center, in collaboration with the NRNU MEPhI English Department, have been creating an automated system for categorizing technical texts.

• The first prototype was tested within an information support system for the makeup of the experimental liquid-metal fast breeder reactor. The prototype was created by specialists of the System Analysis Department of NRNU MEPhI.

• The prototype system implemented the structural taxonomy used by the design organization and was based on the IAEA recommendations resulting from the development of the fast reactor knowledge organization system KOS FR.

• The upper-level thematic categories for concept identification were taken from the INIS headings. A screenshot of a taxonomy fragment is shown on the slide.

[Figure: fragment of the description of the physical taxonomy of the installation]

[Figure: fragment of the description of the functional hierarchy of processes]


Features of the model implementation

• In the first step of developing the procedure, about 100 Russian keywords were compiled for each column heading of the real system. The keywords were selected from typical texts on nuclear energy and were treated as the descriptors of the columns. Indexing was then carried out by a special computer program that compared the character sequence of each descriptor with the words of the indexed text. The number of matches between descriptors and keywords of the text was used as the indicator for the column. However, this simple algorithm gave very low stability for thematically similar texts, so it was rejected.

• In the second stage, it was decided to create a simple ontological model that builds more reliable connections between the entities of the subject headings and the keywords in the text under investigation. The model was based on a model of data links that provides links from keywords to headings and was developed according to the recommendations of the standard ISO 15926-2. The set of definitions of the INIS thesaurus hierarchy was taken as the basis of the model, and some new types of references were included as additional elements.

• Finally, the expert team implemented the ontological model with the following properties.

1. The model allows connections between physical objects and thematic concepts to be identified.

2. The model supports not only links such as "narrower term" and "broader term", but also links such as "something is a part of something", "something contains something", "something is like something" and some others, for both physical objects and thematic concepts.
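The two properties listed above can be pictured as a small typed-link structure. The sketch below is only an illustration under stated assumptions: the class names, the example nodes, and the rule that every link automatically gets an inverse are hypothetical and are not taken from the prototype or from ISO 15926-2.

```python
from dataclasses import dataclass
from enum import Enum


class NodeKind(Enum):
    PHYSICAL_OBJECT = "physical object"
    THEMATIC_CONCEPT = "thematic concept"


class LinkType(Enum):
    NARROWER_TERM = "is a narrower term of"
    BROADER_TERM = "is a broader term of"
    PART_OF = "is a part of"
    CONTAINS = "contains"
    SIMILAR_TO = "is like"


# Assumption: each link type has an inverse that is stored automatically.
INVERSE = {
    LinkType.NARROWER_TERM: LinkType.BROADER_TERM,
    LinkType.BROADER_TERM: LinkType.NARROWER_TERM,
    LinkType.PART_OF: LinkType.CONTAINS,
    LinkType.CONTAINS: LinkType.PART_OF,
    LinkType.SIMILAR_TO: LinkType.SIMILAR_TO,
}


@dataclass(frozen=True)
class Link:
    source: str
    link_type: LinkType
    target: str


class Ontology:
    """Nodes marked as physical objects or thematic concepts, joined by typed links."""

    def __init__(self):
        self.kinds = {}
        self.links = set()

    def add_node(self, name, kind):
        self.kinds[name] = kind

    def add_link(self, source, link_type, target):
        for name in (source, target):
            if name not in self.kinds:
                raise KeyError(f"unknown node: {name}")
        self.links.add(Link(source, link_type, target))
        self.links.add(Link(target, INVERSE[link_type], source))

    def targets(self, source, link_type):
        """Return all nodes reachable from `source` by one link of the given type."""
        return {
            link.target
            for link in self.links
            if link.source == source and link.link_type == link_type
        }


# Hypothetical example nodes.
onto = Ontology()
onto.add_node("main circulation pump", NodeKind.PHYSICAL_OBJECT)
onto.add_node("primary circuit", NodeKind.PHYSICAL_OBJECT)
onto.add_node("reactor cooling", NodeKind.THEMATIC_CONCEPT)
onto.add_node("heat removal", NodeKind.THEMATIC_CONCEPT)
onto.add_link("main circulation pump", LinkType.PART_OF, "primary circuit")
onto.add_link("reactor cooling", LinkType.NARROWER_TERM, "heat removal")
```

Keeping the link type explicit is what lets a query tell "the pump is a part of the primary circuit" apart from "reactor cooling is a narrower term of heat removal".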


Simplified ontological model of categorization

The ontological data model provides a more stable and reliable categorization of text than a simple count of keywords in the indexed text. It is especially important to separate the reference links of physical objects from those of thematic concepts. This makes it possible to distinguish between the descriptions "to be a part of something" and "to be a narrower concept of something". Obviously, this requires special additional markup of the properties of descriptors and of column attributes.

Set of ontological links

[Figure: set of ontological links connecting Category 1, Category 2, Category ..., Category N]
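How typed links can stabilize categorization may be sketched as follows. Everything here is an assumption made for illustration: the descriptor-to-category markup, the link list, the English sample sentences, and the scoring rule (one point per token that reaches a category through "part of" or "narrower than" links). It is not the prototype's algorithm.

```python
import re
from collections import Counter, defaultdict

# Hypothetical manual markup: descriptor -> category it belongs to.
CATEGORY_OF = {
    "circuit": "coolant_system",
    "coolant": "coolant_system",
    "core": "reactor_core",
}

# Hypothetical typed links: (source, link type, target).
LINKS = [
    ("pump", "part_of", "circuit"),
    ("seal", "part_of", "pump"),
    ("fuel", "part_of", "core"),
    ("sodium", "narrower_than", "coolant"),
]

PARENTS = defaultdict(set)
for source, _link_type, target in LINKS:
    PARENTS[source].add(target)


def resolve_categories(term, use_links=True):
    """Collect categories of the term itself and, optionally, of everything it links up to."""
    seen, stack, found = set(), [term], set()
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        if current in CATEGORY_OF:
            found.add(CATEGORY_OF[current])
        if use_links:
            stack.extend(PARENTS.get(current, ()))
    return found


def categorize(text, use_links=True):
    """Score categories by the number of tokens that resolve to them."""
    scores = Counter()
    for token in re.findall(r"[a-z]+", text.lower()):
        for category in resolve_categories(token, use_links):
            scores[category] += 1
    return scores.most_common(1)[0][0] if scores else None


text_1 = "The pump seal was replaced."
text_2 = "Sodium leak at the seal near the core."

with_links = (categorize(text_1), categorize(text_2))
without_links = (categorize(text_1, use_links=False), categorize(text_2, use_links=False))
```

With the links followed, both sample texts resolve to the coolant system; with direct descriptor matches only, the first text matches nothing and the second is pulled toward the reactor core by a single word.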


Results

Creating automated categorization systems and operational meta-descriptions of texts requires specialists with a unique competence: knowledge workers with new competencies that include the concepts described above.

In particular, they should know:
• Nuclear facility subject areas;
• Basic linguistic concepts;
• The principles of semantic analysis;
• The structure and content of the different heading lists (INIS, and GRNTI for the Russian language);
• The terminology of the English language in the nuclear field;
• The basic concepts and principles of modern ontological descriptions;
• The basic principles of computer science (including database design).


Conclusions and recommendations

• The following conclusions and recommendations can be drawn from a test study performed on a set of Russian NPP operation documents.

• Careful selection of the optimal attributes for each column is a very important step in creating a categorization system. This optimization can only be achieved by highly professional indexing experts (e.g. experts from the INIS National Center specialized in the "NPP operation" knowledge domain).

• It is important to optimize the working list of descriptors for each knowledge subdomain. Testing shows that an excessive number of descriptors for a subject gives less reliable categorization results than an "optimal set". A list of 50-70 descriptors per column was acceptable for the Russian language. Moreover, descriptors must be "common terms" of conventional text. Thus, the optimal list of attributes and descriptors can only be selected and assigned manually by highly professional indexing experts.