
Modeling the dynamics of the retentivity process of journals among the most authoritative scientific serials

A. N. Libkind (a), V. A. Markusova (b), I. A. Libkind (c), M. Jansz (d), and K. N. Ivanov (e)

(a) PhD, Head of department of VINITI RAS, Moscow, Russia
(b) Dr. Sci., Head of department of VINITI RAS, Moscow, Russia
(c) Programming supervisor of Financial University under the Government of the Russian Federation, Moscow, Russia
(d) Program Director at Technology Foundation STW, Utrecht, Netherlands
(e) Programming supervisor of Financial University under the Government of the Russian Federation, Moscow, Russia

e-mail: [email protected]; [email protected]; [email protected]; [email protected]; [email protected]

Received January 11, 2013

ISSN 0005-1055, Automatic Documentation and Mathematical Linguistics, 2013, Vol. 47, No. 3, pp. 69–92. © Allerton Press, Inc., 2013. Original Russian Text © A.N. Libkind, V.A. Markusova, I.A. Libkind, M. Jansz, K.N. Ivanov, 2013, published in Nauchno-Tekhnicheskaya Informatsiya, Seriya 2, 2013, No. 3, pp. 9–34.

The article was translated by the authors.

Abstract: The article examines the process by which journals are retained over time among the most authoritative sources of scientific papers. Formal concepts of retentivity, retentivity orders, and retentivity directions are introduced. The postulates formulated in the article link the probability of retention to the time interval between the two compared journal lists, to the ratio of the sizes of these lists, and to some qualitative characteristics of the journals. A mathematical model of the retentivity process is built on the basis of these postulates. The model is compared with data from the Journal Citation Reports Science Edition for a 16-year period. The results of this comparison show a high degree of conformity of the model to the real process of retentivity and reveal important features of this process.

Keywords: lists of journals; impact factor; journal output; Journal Citation Reports; journals retention process; mathematical model

DOI: 10.3103/S0005105513030011

INTRODUCTION

It is well known that the creation of the Science Citation Index (SCI) by Eugene Garfield at the Institute for Scientific Information (ISI, now part of Thomson Reuters) was a powerful impulse for the development of scientometrics. The accumulation of enormous files of bibliographic information at ISI and the development of increasingly powerful computer facilities made possible the creation of a new information product based on the interrelations of scientific journals: the Journal Citation Reports (JCR). The JCR, first published in 1975 and containing statistics on some 3000 scientific journals, is published annually. A special version of the JCR for the social sciences (JCR Social Science Edition) has been published since 1978. Both versions of the JCR have been accessible online since 2010 as components of the large, multidimensional information system Web of Knowledge (WoK). Today, the JCR contains more than ten quantitative characteristics for each journal. For the purposes of the present research, the most important characteristics are the number of articles and the impact factor of each journal. The concept of the impact factor was introduced by Dr. Eugene Garfield in cooperation with Dr. Irving Sher in 1955 [1]. Discussing the SCI and the JCR, E. Garfield stressed that the main purpose of these resources is to provide information to researchers; he notes that, in addition, the sets of bibliographic records are an invaluable source of scientometric information [2]. Scientometrics researchers soon appreciated the opportunities offered by the JCR and increasingly used JCR data in their research. The number of studies and papers based on journal statistics from the JCR is so large that even simply listing them is not an easy task. In these circumstances, the reliability and correctness of the use of these statistics become a highly topical and important problem [3–11]. These questions are even more important because JCR data are also used for making science and technology policy decisions.

In this paper we try to estimate the stability over time of some important bibliometric indicators for sets of journals which, in the opinion of the world scientific community, belong to the most authoritative sources of scientific papers in their fields of science. In other words, the aim of the article is to describe how the statistical indicators of journal sets change with time, and to build a mathematical model of this process. For this purpose we form the sets of journals in such a way that each set at a given moment in time represents practically all the fields and branches of science (social sciences and humanities excluded). After selecting these sets of world (national and international) journals, we consider them without regard to their thematic spectrum.

BASIC CONCEPT AND DEFINITIONS

The central concept of this paper is the concept of “retentivity of journals as the most authoritative sources of scientific papers”. This concept is close to the concept of “retentivity of regular sources for a given specific field”, which was introduced earlier by one of the authors of this article in co-authorship with Michael V. Arapov [12, 13].

On the one hand, the concept of “retentivity of journals as the most authoritative sources of scientific papers” is a simplification (reduction) of the concept of “retentivity of regular sources for a given specific field”: unlike the concept in [12, 13], the concept presented in this article abstracts away from the subject of a journal. On the other hand, the concept of retentivity suggested here may also be regarded as an extension of the concept in [12, 13]. According to [12, 13], the probability that a journal is retained as a source for a given subject after a certain period of time depends on its productivity (the number of articles on the subject) at some initial time and on the length of the time interval. The notion of retentivity introduced in this article likewise links the retention or non-retention of a journal to its productivity and to the time interval, but in addition takes into account such important characteristics as the impact factor (as an indicator of the scientific level of a journal) and the expected response (as an indicator of its degree of influence on the research carried out in the world of science; see below). With the help of this concept we will try to formalize the assumption that the majority of journals which, in the opinion of the world scientific community at some given moment, are characterized as having a high scientific level will remain for quite a long period of time in the subset of authoritative sources of scientific papers.

We regard the inclusion of a journal in the list of journals of the Thomson Reuters information resource “Journal Citation Reports Science Edition” (JCR SE) as the basis for recognizing the journal as an authoritative source.

Let us consider a pair of lists $L_t$ and $L_{t+\tau}$ separated by a time interval of $\tau$ years. We will say that journal $x_i$ from list $L_t$ ($x_i \in L_t$) is retentive in list $L_{t+\tau}$ if $x_i \in L_t$ implies that also $x_i \in L_{t+\tau}$, i.e., if ($x_i \in L_t \Rightarrow x_i \in L_{t+\tau}$) is true. By analogy, we will say that journal $y_j$ from list $L_{t+\tau}$ ($y_j \in L_{t+\tau}$) is retentive in list $L_t$ if ($y_j \in L_{t+\tau} \Rightarrow y_j \in L_t$) is true. It is clear that under this condition the equality

$$n_{L_t} = n_{L_{t+\tau}}$$

always holds, where $n_{L_t}$ is the number of journals from list $L_t$ that are retentive in list $L_{t+\tau}$, and $n_{L_{t+\tau}}$ is the number of journals from list $L_{t+\tau}$ that were also present in list $L_t$.
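As a minimal sketch of this definition, assume each list is held as a set of journal identifiers (the identifiers and years below are invented for illustration and are not JCR data). The retentive journals in either direction are then the intersection of the two sets, which is why the two counts coincide:

```python
# Hypothetical journal lists keyed by year; identifiers are invented.
lists = {
    1995: {"J-A", "J-B", "J-C", "J-D"},
    2000: {"J-B", "J-C", "J-D", "J-E", "J-F"},
}


def retentive(source, target):
    """Return the journals of `source` that are also present in `target`."""
    return source & target


forward = retentive(lists[1995], lists[2000])   # journals of L_t retentive in L_{t+tau}
backward = retentive(lists[2000], lists[1995])  # journals of L_{t+tau} retentive in L_t

n_forward = len(forward)
n_backward = len(backward)
print(sorted(forward), n_forward, n_backward)
```

Both directions select the same journals; the directions differ only in which list supplies the denominator when a ratio is formed.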

Let us characterize each of the compared lists $L_t$ and $L_{t+\tau}$ by the following four quantitative characteristics:

- the number of journals in the list $L_i$ of the given year $i$;
- the sum of the numbers of articles published in the journals from the list $L_i$ in the given year $i$;
- the sum of the 2-year impact factor values of the journals from the list $L_i$ in the given year $i$;
- the sum of the expected response values to the articles published in the journals from the list $L_i$ in the given year $i$. The expected response (ER) of a journal is the number of articles it published in the given year multiplied by its impact factor in the same year.

Let us briefly elaborate what we mean by the above-mentioned quantitative characteristics.

The number of journals in the list of a given year is the number of scientific periodicals and proceedings included in the “Master List” of the JCR SE in that year.

The number of articles is the number of research and review articles (“citable items” in JCR terminology) that a given journal published in a given year according to the JCR SE. Our analysis of SCI-Expanded data shows that in the period 1995–2010 citable items constitute about 75% of the total number of publications in this resource.

The sum of the 2-year impact factor values of the journals. Today, JCR SE gives two impact factor values for a journal: a 2-year impact factor and a 5-year impact factor. For an n-year impact factor, JCR SE takes into account all citations in a given year to the papers published in the journal in the previous n years. Our research covers a period starting in 1995. The values of the 2-year impact factor were regularly included in the JCR long before 1995, whereas JCR SE has given the values of the 5-year impact factor only since 2007. Hence our choice of the 2-year impact factor as one of the quantitative characteristics for this study.

The sum of the expected response values. We take the expected response (ER) to the articles of a journal to be the product of the number of articles published in a given year and the impact factor of the journal in the same year. The sum of the expected response values is then the sum of the ER values for all journals included in the “Master List” of JCR SE for a given year.
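The four characteristics can be illustrated with a short sketch. The record layout (journal, number of articles, 2-year impact factor), the function name, and the numbers below are assumptions made for this example, not values from JCR SE:

```python
# Hypothetical records for one year's list:
# (journal identifier, number of articles, 2-year impact factor).
records = [
    ("J-A", 120, 1.5),
    ("J-B", 300, 0.5),
    ("J-C", 45, 4.0),
]


def list_characteristics(rows):
    """Return the four totals used to characterize the list of a given year."""
    n_journals = len(rows)
    total_articles = sum(articles for _, articles, _ in rows)
    total_if = sum(impact for _, _, impact in rows)
    # Expected response of a journal = articles * impact factor (same year).
    total_er = sum(articles * impact for _, articles, impact in rows)
    return n_journals, total_articles, total_if, total_er


N, U, W, S_ER = list_characteristics(records)
print(N, U, W, S_ER)
```

In this toy list the small journal J-C contributes as much expected response (45 × 4.0 = 180) as the much larger J-A (120 × 1.5 = 180), which shows how ER combines output with impact factor.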

Let us introduce the concept of retentivity of journals by a given quantitative characteristic.

Retentivity of journals by a given quantitative characteristic is the ratio between the value of that characteristic for the subset of journals from list $L_i$ that are retentive in list $L_j$ and the full value of the same characteristic for the whole set of journals of list $L_i$.

It is important to emphasize that the value corresponding to the subset of journals from the list of year $t_i$ that are retentive in the list of some other year $t_j$ is always taken in relation to the total value of the quantitative characteristic in the same year $t_i$.

First case. Since $\tau$ is a non-negative integer, $t + \tau \ge t$ is always true ($t + \tau = t$ describes the degenerate case in which we consider the retention of a list in itself). Accordingly, list $L_t$ dates from an earlier year than list $L_{t+\tau}$. Thus, when estimating the retentivity of journals by a given parameter from list $L_t$ in list $L_{t+\tau}$, the researcher “moves” in the direction of the “flow of time”.

Second case. When looking at the retentivity of journals by a given parameter from list $L_{t+\tau}$ in list $L_t$, the researcher moves in the reverse direction, i.e., into the past.

In the first case we will speak of direct retentivity, in the second of reverse retentivity. At first glance, reverse retentivity may seem of purely academic interest. However, this study shows that with the help of this concept we can address some questions that would otherwise remain unanswered.

Retentivity of the first order (retentivity by number of journals). We define the direct retentivity of the first order $q_{1\_L}^{t,\,t+\tau}$ of journals from list $L_t$ in list $L_{t+\tau}$ by the following relationship:

$$q_{1\_L}^{t,\,t+\tau} = \frac{n_{L_t}}{N_{L_t}}, \tag{1}$$

where $n_{L_t}$ is the number of journals from list $L_t$ that are present in list $L_{t+\tau}$, and $N_{L_t}$ is the total number of journals in list $L_t$ (if we view list $L_t$ as a set, then $N_{L_t}$ is the cardinality of this set: $N_{L_t} = |L_t|$). Generally speaking, $q_{1\_L}^{t,\,t+\tau}$ characterizes the probability of retention of an average journal from list $L_t$ in list $L_{t+\tau}$.

We define the reverse retentivity of the first order $q_{1\_L}^{t+\tau,\,t}$ of journals from list $L_{t+\tau}$ in list $L_t$ by the following relationship:

$$q_{1\_L}^{t+\tau,\,t} = \frac{n_{L_{t+\tau}}}{N_{L_{t+\tau}}}, \tag{2}$$

where $n_{L_{t+\tau}}$ is the number of journals from list $L_{t+\tau}$ that are present in list $L_t$, and $N_{L_{t+\tau}}$ is the total number of journals in list $L_{t+\tau}$ (if we view list $L_{t+\tau}$ as a set, then $N_{L_{t+\tau}}$ is the cardinality of this set: $N_{L_{t+\tau}} = |L_{t+\tau}|$).
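Formulas (1) and (2) can be sketched in a few lines, again treating the lists as sets. The function name and the journal identifiers are hypothetical:

```python
def first_order_retentivity(source, target):
    """Share of the journals in `source` that are also present in `target`."""
    if not source:
        raise ValueError("source list is empty")
    return len(source & target) / len(source)


L_t = {"J-A", "J-B", "J-C", "J-D"}             # hypothetical list of year t
L_t_tau = {"J-B", "J-C", "J-D", "J-E", "J-F"}  # hypothetical list of year t + tau

q1_direct = first_order_retentivity(L_t, L_t_tau)   # formula (1): 3 / 4
q1_reverse = first_order_retentivity(L_t_tau, L_t)  # formula (2): 3 / 5
print(q1_direct, q1_reverse)
```

The numerator is the same in both directions (three shared journals); the two values differ only because the later list in this toy example is longer.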

Retentivity of the second order (retentivity by the sum of the numbers of articles). If $U_t$ is the number of articles published by all the journals from list $L_t$ in year $t$, and $u_t$ is the number of articles published in the same year $t$ by those journals from list $L_t$ which in year $t+\tau$ (i.e., $\tau$ years later) are also present in list $L_{t+\tau}$, then the direct retentivity of journals of the second order $q_{2\_L}^{t,\,t+\tau}$ is defined by the following relationship:

$$q_{2\_L}^{t,\,t+\tau} = \frac{u_t}{U_t}. \tag{3}$$

With $U_{t+\tau}$ being the number of articles published by all the journals from list $L_{t+\tau}$ in year $t+\tau$, and $u_{t+\tau}$ the number of articles published in the same year $t+\tau$ only by those journals from list $L_{t+\tau}$ which in year $t$ (i.e., $\tau$ years earlier) are present in list $L_t$, the reverse retentivity of journals of the second order $q_{2\_L}^{t+\tau,\,t}$ is defined by the following relationship:

$$q_{2\_L}^{t+\tau,\,t} = \frac{u_{t+\tau}}{U_{t+\tau}}. \tag{4}$$

Here, as before, $n_{L_{t+\tau}}$ is the number of journals from list $L_{t+\tau}$ that are present in list $L_t$, and $N_{L_{t+\tau}}$ is the number of journals in list $L_{t+\tau}$ (viewing the list as a set, $N_{L_{t+\tau}}$ is its cardinality: $N_{L_{t+\tau}} = |L_{t+\tau}|$).

The relationship in formula (3) characterizes the contribution of those journals from list $L_t$ which are present (are retentive) in list $L_{t+\tau}$ to the total number of articles published in year $t$ by all the journals from list $L_t$. Similarly, the relationship in formula (4) characterizes the contribution of those journals from list $L_{t+\tau}$ which are present in list $L_t$ for year $t$ to the total number of articles published by all the journals from list $L_{t+\tau}$ in year $t+\tau$.

Retentivity of the third order (retentivity by the sum of the impact factor values). If $W_t$ is the total sum of the 2-year impact factor values of all the journals from list $L_t$ for year $t$, and $w_t$ is the sum of the 2-year impact factor values in the same year $t$ of those journals from list $L_t$ which are present (are retentive) in list $L_{t+\tau}$ for year $t+\tau$ (i.e., $\tau$ years later), then the direct retentivity of journals of the third order $q_{3\_L}^{t,\,t+\tau}$ is defined by the following relationship:

$$q_{3\_L}^{t,\,t+\tau} = \frac{w_t}{W_t}. \tag{5}$$

With $W_{t+\tau}$ being the total sum of the 2-year impact factor values of all the journals from list $L_{t+\tau}$ for year $t+\tau$, and $w_{t+\tau}$ the sum of the 2-year impact factor values in the same year $t+\tau$ of those journals from list $L_{t+\tau}$ which are present (are retentive) in list $L_t$ for year $t$ (i.e., $\tau$ years earlier), the reverse retentivity of journals of the third order $q_{3\_L}^{t+\tau,\,t}$ is defined by the following relationship:

$$q_{3\_L}^{t+\tau,\,t} = \frac{w_{t+\tau}}{W_{t+\tau}}. \tag{6}$$

The parameter $q_{3\_L}^{t,\,t+\tau}$ characterizes the relative contribution of the 2-year impact factor values of those journals from list $L_t$ which are present (are retentive) in list $L_{t+\tau}$ to the total sum of the 2-year impact factor values of the journals in list $L_t$. Similarly, the parameter $q_{3\_L}^{t+\tau,\,t}$ corresponds to the relative contribution of the 2-year impact factor values of those journals in list $L_{t+\tau}$ which are also present in list $L_t$ to the total sum of the 2-year impact factor values of the journals in list $L_{t+\tau}$.
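The second- and third-order definitions share one pattern: the value carried by the retained journals, divided by the total value of the source list in its own year. A minimal sketch of that pattern follows. The dictionary layout, the function name, and the numbers are assumptions made for illustration only:

```python
def retentivity_by_value(source_values, target_journals):
    """Ratio of the value carried by retained journals to the source-list total.

    source_values: dict mapping journal -> value in the source list's own year
    target_journals: set of journals present in the other list of the pair
    """
    total = sum(source_values.values())
    if total == 0:
        raise ValueError("total value of the source list is zero")
    retained = sum(v for j, v in source_values.items() if j in target_journals)
    return retained / total


# Hypothetical data for year t and membership of the list for year t + tau.
articles_t = {"J-A": 100, "J-B": 300, "J-C": 50, "J-D": 50}
impact_t = {"J-A": 1.0, "J-B": 2.0, "J-C": 0.5, "J-D": 0.5}
members_t_tau = {"J-B", "J-C", "J-D", "J-E"}

q2_direct = retentivity_by_value(articles_t, members_t_tau)  # u_t / U_t = 400 / 500
q3_direct = retentivity_by_value(impact_t, members_t_tau)    # w_t / W_t = 3.0 / 4.0
print(q2_direct, q3_direct)
```

The reverse values follow from the same function by passing the values of year t + τ as `source_values` and the membership of list L_t as `target_journals`.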

Retentivity of the fourth order (retentivity by the sum of the expected response values). We will consider the expected response (ER) to the articles of a journal as a measure of the influence of that journal on global research, and define the ER as the product of the number of articles published in journal $j$ in a given year $t$ and the 2-year impact factor of this journal $j$ in the same year $t$:

$$ER_j^t = f_j^t \cdot IF_j^t, \tag{7}$$

where:

$f_j^t$ is the number of articles published by journal $j$ in year $t$;

$IF_j^t$ is the 2-year impact factor of journal $j$ in year $t$;

$ER_j^t$ is the expected response to the articles published in journal $j$ in year $t$.

The sum of the expected responses to all articles published in all the journals in list $L_t$ in year $t$ is:

$$S\_ER_t^{t} = \sum_{j=1}^{N_{L_t}} ER_j^t, \tag{8}$$

where $N_{L_t}$ is the total number of journals in list $L_t$.

The sum of the expected response values $S\_ER_t^{t,\,t+\tau}$ in year $t$ for only those journals from list $L_t$ which are also present in list $L_{t+\tau}$ for the year $t+\tau$ is given by:

$$S\_ER_t^{t,\,t+\tau} = \sum_{j=1}^{n_{L_t}} ER_j^t, \tag{9}$$

where $n_{L_t}$ is the number of journals from list $L_t$ which are also present in list $L_{t+\tau}$ for the year $t+\tau$.

Then we can write the direct retentivity of journals of the fourth order as:

$$q_{4\_L}^{t,\,t+\tau} = \frac{S\_ER_t^{t,\,t+\tau}}{S\_ER_t^{t}}, \tag{10}$$

and the reverse retentivity of journals of the fourth order as:

$$q_{4\_L}^{t+\tau,\,t} = \frac{S\_ER_{t+\tau}^{t+\tau,\,t}}{S\_ER_{t+\tau}^{t+\tau}}, \tag{11}$$

where $S\_ER_{t+\tau}^{t+\tau}$ is the sum of the expected response values to all articles published in all journals from list $L_{t+\tau}$ for year $t+\tau$, and $S\_ER_{t+\tau}^{t+\tau,\,t}$ is the sum of the expected response values in year $t+\tau$ for only those journals from list $L_{t+\tau}$ which are also present in list $L_t$ for year $t$.

SOURCE DATA AND FORMING PAIRS OF LISTS

As the data source we used the annual editions of JCR SE published during the 16-year period 1995–2010 (with the exception of 2001, because the edition of JCR SE for 2001 was unavailable to us). In these editions of JCR SE we were interested in the following data:

- the journals in each year (we used either the title² or the ISSN of a journal as its identifier);
- the number of articles published by each journal in each year;
- the values of the 2-year impact factor of each journal in each year.

² We did not track the cases in which a journal changed its title, became part of another journal, was divided into several independent publications, or ceased to exist. An assessment of the frequency of such cases shows that the share of such journals is usually less than 0.5–1% of the total number of journals in the list of the corresponding year.

In total, 15 lists of journals were extracted with the corresponding data for each of the four quantitative characteristics, and pairs of lists to be compared were formed. The number of pairs for each of the quantitative characteristics is equal to the number of permutations:

$$A_n^m = \frac{n!}{(n-m)!}.$$

In our case $n = 15$ (the number of lists) and $m = 2$ (the lists are compared in pairs). Thus, the number of pairs of lists we obtain for each of the four quantitative characteristics (counting direct and reverse retentivity together) is:

$$A_{15}^{2} = \frac{15!}{(15-2)!} = 210.$$

The total number of comparisons is 840 (210 × 4).

Let us specify how each pair of lists and the corresponding numerical values were formed, depending on the order of retentivity. This clarification is necessary for the following reasons. In some cases, JCR SE gives the number of articles and/or the impact factor for a particular journal in one list of a given pair but not in the other. Sometimes the JCR states that the corresponding values are zero. For the number of articles, this is probably because, when the next issue of JCR SE was being prepared, the Thomson Reuters specialists for one reason or another had no information about the number of articles published by the journal.

The situation is clearer for the impact factor. Collecting the statistics needed to calculate the impact factor (the number of citations to the papers of the journal) requires at least 2 years of observation. If a journal is included in JCR SE for the first time, such statistics do not exist. As a result, the impact factor of the journal is either absent from JCR SE or given as zero. Our analysis showed that the share of journals for which the number of articles is absent or equal to 0 is in the range 0.5–3%. The corresponding share for the impact factor is in the range 1–3.5%. In some cases of missing data, we were able to restore the number of articles for a given journal by using the issue of JCR SE for the year following the one in which the number of articles was not indicated or was given as zero. This procedure was possible because the number of articles published in a journal in the previous year ($t - 1$) is used in JCR SE (in year $t$) for calculating the impact factor of the journal in year $t$. Unfortunately, the number of articles could not be restored for all such journals. The procedure is quite time-consuming, since in each of the 15 lists the information for one hundred or more journals needed to be restored. In addition, a journal could be missing entirely from the list of journals in the JCR issue of year $t + 1$.

Pairs of lists were created as follows:

- For retentivity by number of journals (retentivity of the first order): list $L_t$ includes without exception all journals which are present in JCR SE for the year $t$; similarly, list $L_{t+\tau}$ includes all journals which are present in JCR SE for the year $t+\tau$. That is, no journal was excluded from any of the 15 lists.
- For retentivity by the sum of the numbers of articles (retentivity of the second order): a journal $j$ was excluded from both lists $L_t$ and $L_{t+\tau}$ of the pair if for one of them the number of articles in this journal $j$ was missing or equal to zero.
- For retentivity by the sum of the impact factor values (retentivity of the third order): a journal $j$ was excluded from both lists $L_t$ and $L_{t+\tau}$ of the pair if for one of them the impact factor value of journal $j$ was missing or equal to zero.
- For retentivity by the sum of the expected response values (retentivity of the fourth order): a journal $j$ was excluded from both lists $L_t$ and $L_{t+\tau}$ of the pair if for one of them the number of articles and/or the impact factor value of journal $j$ was missing or equal to zero.
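The pair-forming rules above can be sketched as follows. The helper name, the dictionary layout, and the use of `None` for a missing value are assumptions of this example. The sketch also assumes that a journal absent from one list is left untouched, since absence is exactly what retentivity measures:

```python
from itertools import permutations

# The 15 annual lists: 1995-2010 without 2001.
years = [y for y in range(1995, 2011) if y != 2001]
# Ordered pairs cover direct and reverse retentivity together.
ordered_pairs = list(permutations(years, 2))


def usable_journals(values_a, values_b):
    """Drop from both lists every journal whose value is missing (None)
    or zero in either list of the pair."""
    def bad(values):
        return {j for j, v in values.items() if v is None or v == 0}

    excluded = bad(values_a) | bad(values_b)
    keep_a = {j: v for j, v in values_a.items() if j not in excluded}
    keep_b = {j: v for j, v in values_b.items() if j not in excluded}
    return keep_a, keep_b


# Hypothetical article counts for one pair of lists.
articles_t = {"J-A": 100, "J-B": 0, "J-C": 40}
articles_t_tau = {"J-A": None, "J-B": 120, "J-C": 45, "J-D": 10}
kept_t, kept_t_tau = usable_journals(articles_t, articles_t_tau)
print(len(ordered_pairs), len(ordered_pairs) * 4, kept_t, kept_t_tau)
```

Here J-A and J-B are removed from both lists because each has a missing or zero count in one of them, while J-D stays in the later list even though it does not occur in the earlier one.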

BUILDING THE MODEL AND COMPUTATION OF ITS PARAMETERS

Let us formulate the following postulates:

Postulate 1. The probability of retentivity of journals from list $L_i$ in list $L_j$, which are separated by the time interval $\tau$, depends mainly on the value of $\tau$ and on the ratio between the numbers of journals in the compared lists, $N_i$ and $N_j$ respectively.

Postulate 2. Retentivity of journals from list $L_i$ in list $L_j$, which are separated by the time interval $\tau$, is inversely dependent on $\tau$.

It is obvious that, all other conditions being equal, the probability of retentivity of a journal from a “shorter” list (a list that includes a smaller number of journals) in a “longer” list (a list that includes a larger number of journals) is higher. Conversely, this probability is lower in the opposite case. This leads us to formulate a third postulate:

Postulate 3. If all other conditions are equal (i.e., with a fixed value of $\tau$) for the different pairs of lists to be compared, the probability of retentivity of journals from list $L_i$ in list $L_j$ grows when $N_i / N_j < 1$ and falls when $N_i / N_j > 1$.

NB: Because in reality we observe a regular growth over time in the number of journals included in the lists, the condition $N_i / N_j < 1$ will generally correspond to direct retentivity, and the condition $N_i / N_j > 1$ to reverse retentivity.

And finally:

Postulate 4. Postulates 1–3 are true not only for the retentivity of lists of journals, but also for retentivity with regard to the other characteristics: the total number of articles, the sum of the impact factor values, and the sum of the expected response values.

Let us briefly discuss the suggested postulates. Because postulate 1 only gives general presumptions, which would be verified if postulates 2 and 3 are found to be true, it makes sense to discuss its validity only after discussing postulates 2 and 3. In our opinion, postulate 2 can hardly be doubted: common sense tells us that the larger the time interval between the compared lists, the lower the probability of retention of journals from one list in the other. Postulate 3 also looks quite natural to us. Indeed, the probability of retentivity of some part of a shorter list in a longer list (a list with a larger number of journals) must be higher than the probability of retentivity of some part of a longer list in a shorter list.

There can be doubt about postulate 4, which says that all three previous postulates are also true for the other numerical characteristics of retentivity. We will examine the validity of these postulates against the empirical retentivity data.

It should be noted that the postulates, and the mathematical model created on their basis, are only an approximate description of the process of retentivity. Indeed, these postulates do not consider irregularities in the growth of the number of journals and of the papers published in them. For example, they do not take into account some external socio-economic factors, in particular: changes in the world economy, which could affect the intensity of research and, therefore, the number and productivity of scientific journals; possible changes in the general rules by which the lists of journals in JCR SE are compiled; and sharp changes of emphasis and trends in the direction of basic research (the influence of possible scientific revolutions). The degree to which the model and the postulates describe the real process of retentivity can indicate the level of influence of these factors.

We now turn to the creation of a mathematical model for the process of retentivity. The different orders of retentivity will be denoted as $k_h$, where $h$ takes the values 1, 2, 3 and 4.

It is obvious that retentivity is an estimator of the probability that a given quantitative characteristic is retained. From postulates 1–3 it follows that the retentivity for a given quantitative characteristic basically depends on the following two variables:

- The time interval $\tau_{k_h} = |t_i - t_j|$ between a pair of lists for a given $k_h$, where $t_i$ is the time (year) corresponding to list $i$, and $t_j$ is the time corresponding to list $j$.

- The ratio between the values of the characteristic in the two lists $L_i$ and $L_j$ (for a given order of retentivity $k_h$).

Based on the above, we can write the two-parameter regression equation:

$$y = a x_1 + b x_2 + c. \qquad (12)$$

Let us interpret the variables and the coefficients of the regressors in this equation in line with our goal (the interpretation of the intercept term $c$ in equation (12) will be discussed after completion of the model). The regressand $y$ in equation (12) will be interpreted as the remaining share (retentivity) $p_{k_h}$ of the value of a quantitative characteristic $R_{k_h}$ corresponding to a given retentivity order $k_h$.

According to postulate 2, an increase of the time interval leads to a drop in retentivity. Hence, the regressor $x_1$ can be interpreted as a function of the time interval $\tau$ between the lists that are compared. Since there is no certainty that this is a linear function, we can equate $x_1$ to $\tau^{\gamma}$ ($x_1 = \tau^{\gamma}$). The coefficient $a$ of $x_1$ (i.e. the coefficient $a$ of $\tau^{\gamma}$) in the case $\gamma = 1$ can be interpreted as the value of the retentivity decrease during 1 year, i.e. when the time interval between the lists is 1 year ($\tau = 1$).

Based on postulate 3, the regressor $x_2$ in equation (12) will be considered as a variable that reflects the influence of the ratio of the values $R_{t,k_h}$ and $R_{t+\tau,k_h}$ on the retentivity changes for lists $L_t$ and $L_{t+\tau}$. Coefficient $b$ of $x_2$ is then the constant that describes the strength of this influence. The values of quantitative characteristics change over time: if we move from the past towards the present, these values tend to increase and, vice versa, they decrease when we move backwards in time. Generally speaking, there is no reason to suppose that the rate of change in time of the values is constant, i.e. we may assume that the rate of change in the values is time-dependent and that this dependence is not necessarily linear. For this reason we can consider the variable $x_2$ as an analogue of acceleration, which characterizes the rate of change of retentivity in time. Based on this assumption, we denote $x_2$ as the "acceleration" $\nu_{R_{k_h}}$. Obviously, $\nu_{R_{k_h}}$ will have a positive or negative value depending on the direction in time. In the case of direct retentivity $\nu_{R_{k_h}}$ will have a positive value, in the case of reverse retentivity a negative value. We will denote the acceleration for direct retentivity by $\nu_{R_{k_h},\,direct}$ and for reverse retentivity by $\nu_{R_{k_h},\,reverse}$.

Based on the above considerations, we can write for direct retentivity

$$\nu_{R_{k_h},\,direct} = \frac{R_{t+\tau,k_h} - R_{t,k_h}}{R_{t,k_h}\,\tau^{\beta} + \lambda}. \qquad (13)$$

And for the case of reverse retentivity

$$\nu_{R_{k_h},\,reverse} = \frac{R_{t,k_h} - R_{t+\tau,k_h}}{R_{t+\tau,k_h}\,\tau^{\beta} + \lambda}. \qquad (14)$$

Here $R_{t,k_h}$ and $R_{t+\tau,k_h}$ are the values of the given characteristic for the pair of lists $L_t$ and $L_{t+\tau}$ for a given order of retentivity $k_h$. The exponent $\beta$ of $\tau$ in formulas (13) and (14) is introduced on the basis of the assumption made above that the changes of the numerical values $R_{t,k_h}$ and $R_{t+\tau,k_h}$ are time-dependent and that this dependence is not necessarily linear. We introduced the constant $\lambda = 1$ only for the stability of the model in the "degenerate case", i.e. in the case $\tau = 0$. The value $\lambda = 1$ was chosen because, in normal circumstances, it should have practically no effect on the values of $\nu_{R_{k_h}}$ (the values of $R_{t,k_h}$ and $R_{t+\tau,k_h}$ are 3–6 orders of magnitude larger than the value of $\lambda$).

Now we can write the final form of the equation for direct retentivity as follows:

$$p_{direct,k_h} = a\,\tau^{\gamma} + b\,\frac{R_{t+\tau,k_h} - R_{t,k_h}}{R_{t,k_h}\,\tau^{\beta} + \lambda} + c. \qquad (15)$$

And for reverse retentivity:

$$p_{reverse,k_h} = a\,\tau^{\gamma} + b\,\frac{R_{t,k_h} - R_{t+\tau,k_h}}{R_{t+\tau,k_h}\,\tau^{\beta} + \lambda} + c. \qquad (16)$$


We will regard these equations as the mathematical model of the retentivity process.

NB. In equations (15) and (16) the free term c would be equal to 1 only in the case τ = 0, that is, when a list of journals is retained in itself (the degenerate case): in this case the probability of retentivity is equal to 1. In all other cases c = 1 is not quite correct. Indeed, as mentioned above, the proposed postulates do not account for the influence of several socio-economic factors, so neither does this model, which is based on those postulates. It can be assumed that the value c − 1 assesses the degree of influence of these unaccounted factors on the retentivity.
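The general form of the model can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function and argument names (`acceleration`, `model_retentivity`, `r_from`, `r_to`) are hypothetical, and the numeric arguments in the usage lines are made-up values rather than fitted parameters. For direct retentivity `r_from` is the value for the earlier list and `r_to` the value for the later list; for reverse retentivity the roles are swapped.

```python
def acceleration(r_from, r_to, tau, beta, lam=1.0):
    """Second regressor of equations (13)/(14).

    r_from: characteristic of the list whose items are traced
    r_to:   characteristic of the list in which they are looked up
    lam:    stabilising constant (lambda = 1 in the text)
    """
    return (r_to - r_from) / (r_from * tau ** beta + lam)


def model_retentivity(r_from, r_to, tau, a, b, c, gamma, beta, lam=1.0):
    """Equations (15)/(16): p = a * tau**gamma + b * nu + c."""
    nu = acceleration(r_from, r_to, tau, beta, lam)
    return a * tau ** gamma + b * nu + c


# Hypothetical illustration values (not taken from the fitted model).
# Degenerate case tau = 0: both regressors vanish, so p equals c.
p_same_list = model_retentivity(5000, 5000, 0, a=-0.03, b=0.4, c=1.0,
                                gamma=0.8, beta=0.8)

# The sign of the acceleration follows the direction in time.
nu_direct = acceleration(4000, 5000, 2, beta=0.8)   # growing list: positive
nu_reverse = acceleration(5000, 4000, 2, beta=0.8)  # shrinking list: negative
```

With λ = 1 the denominator stays non-zero at τ = 0, which is the only reason the text gives for introducing the constant.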

COMPARISON OF THE MODEL WITH THE OBSERVED RETENTIVITY PROCESS

We compared the mathematical model of the retentivity process with the empirical data in an iterative process, using a specially developed computer program (in T-SQL). Assuming a chi-square value of 0.03 (χ² = 0.03), we obtained, for each of the four orders of direct and reverse retentivity, the coefficients and exponents for the first two terms of equations (15) and (16), as well as the values of the free terms.
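The text does not describe the fitting program beyond calling it iterative. As a rough illustration of one way such parameters could be estimated, the sketch below fits a, b and c of equation (12) by ordinary least squares for fixed exponents. This is an assumption made for illustration, not the authors' procedure or their chi-square criterion; the names `solve3` and `fit_abc` are hypothetical. An outer loop over candidate values of γ and β, keeping the pair with the smallest residual, would make the procedure iterative.

```python
def solve3(matrix, rhs):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    n = 3
    for col in range(n):
        # Bring the row with the largest pivot candidate into position.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    solution = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(aug[r][k] * solution[k] for k in range(r + 1, n))
        solution[r] = (aug[r][n] - known) / aug[r][r]
    return solution


def fit_abc(x1, x2, y):
    """Least-squares estimate of (a, b, c) in y = a*x1 + b*x2 + c."""
    rows = [(u, v, 1.0) for u, v in zip(x1, x2)]
    # Normal equations: (X^T X) * params = X^T y.
    normal = [[sum(r[i] * r[j] for r in rows) for j in range(3)]
              for i in range(3)]
    target = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(3)]
    return solve3(normal, target)
```

Here `x1` would hold the values τ^γ and `x2` the accelerations of equations (13) or (14) for all compared pairs of lists, with `y` the observed retentivities.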

Substituting the parameters thus obtained into equations (15) and (16), we can write equations (17)–(24), which together constitute the mathematical model of the retentivity process.

Direct retentivity of the first order (direct retentivity by number of journals):

$$p_{direct,k_1} = -0.034\,\tau^{0.8} + 0.354\,\frac{R_{t+\tau,k_1} - R_{t,k_1}}{R_{t,k_1}\,\tau^{0.8} + 1} + 0.993. \qquad (17)$$

Reverse retentivity of the first order (reverse retentivity by number of journals):

$$p_{reverse,k_1} = -0.138\,\tau^{0.5} + 1.411\,\frac{R_{t,k_1} - R_{t+\tau,k_1}}{R_{t+\tau,k_1}\,\tau^{0.5} + 1} + 1.149. \qquad (18)$$

Direct retentivity of the second order (direct retentivity by the sum of the number of articles):

$$p_{direct,k_2} = -0.027\,\tau^{0.8} + 0.798\,\frac{R_{t+\tau,k_2} - R_{t,k_2}}{R_{t,k_2}\,\tau^{0.8} + 1} + 0.972. \qquad (19)$$

Reverse retentivity of the second order (reverse retentivity by the sum of the number of articles):

$$p_{reverse,k_2} = -0.046\,\tau^{0.8} - 0.099\,\frac{R_{t,k_2} - R_{t+\tau,k_2}}{R_{t+\tau,k_2}\,\tau^{0.8} + 1} + 1.015. \qquad (20)$$

Direct retentivity of the third order (direct retentivity by the sum of the impact-factor values):

$$p_{direct,k_3} = -0.029\,\tau^{0.8} + 0.432\,\frac{R_{t+\tau,k_3} - R_{t,k_3}}{R_{t,k_3}\,\tau^{0.8} + 1} + 0.977. \qquad (21)$$

Reverse retentivity of the third order (reverse retentivity by the sum of the impact-factor values):

$$p_{reverse,k_3} = -0.056\,\tau^{0.8} - 0.321\,\frac{R_{t,k_3} - R_{t+\tau,k_3}}{R_{t+\tau,k_3}\,\tau^{0.8} + 1} + 0.991. \qquad (22)$$

Direct retentivity of the fourth order (direct retentivity by the sum of the expected response values):

$$p_{direct,k_4} = -0.016\,\tau^{0.8} + 0.279\,\frac{R_{t+\tau,k_4} - R_{t,k_4}}{R_{t,k_4}\,\tau^{0.8} + 1} + 0.987. \qquad (23)$$

Reverse retentivity of the fourth order (reverse retentivity by the sum of the expected response values):

$$p_{reverse,k_4} = -0.039\,\tau^{0.8} - 0.178\,\frac{R_{t,k_4} - R_{t+\tau,k_4}}{R_{t+\tau,k_4}\,\tau^{0.8} + 1} + 1.019. \qquad (24)$$

EVALUATION OF THE MODEL AND POSTULATES IN VIEW OF THEIR COMPLIANCE WITH EMPIRICAL DATA

The observed and model values of retentivity are presented in Tables 1–4. The comparison shows a high degree of agreement between the observed and model values.

Let us consider the degree of compliance of the model and the postulates with the empirical data in general. Table 1 contains data about the retentivity by number of journals, Table 2 about the retentivity by the sum of the number of articles, Table 3 about the retentivity by the sum of the impact-factor values, and Table 4 about the retentivity by the sum of the expected response values. Each of these tables is a matrix that contains three types of data. In the next paragraph we describe Table 1 in detail; the structure of the other tables and the types of data they contain are identical.

Each of the rows in the table corresponds to a particular year and is divided into three sub-rows. Each of the columns in the table also corresponds to a year. Each intersection of a row with a column corresponds to a particular pair of lists. For example, if a cell is in the row for the year 2000 and in the column for the year 2007, the data relate to the comparison of the lists for 2000 and 2007. Each of these intersections is subdivided into three cells. The lower cell is filled only on the diagonal and gives the number of journals for the

Table 1. Retentivity by number of journals (retentivity of the first order)*

Year  Cell      1995   1996   1997   1998   1999   2000   2002   2003   2004   2005   2006   2007   2008   2009   2010
1995  observed  1.000  0.965  0.935  0.918  0.896  0.871  0.827  0.807  0.795  0.783  0.765  0.753  0.744  0.736  0.736
      model     1.000  0.971  0.949  0.938  0.914  0.893  0.852  0.832  0.814  0.796  0.779  0.764  0.748  0.738  0.727
      diagonal  4623
1996  observed  0.933  1.000  0.966  0.945  0.921  0.894  0.849  0.828  0.815  0.802  0.784  0.772  0.765  0.756  0.755
      model     0.964  1.000  0.973  0.964  0.925  0.912  0.870  0.850  0.830  0.813  0.795  0.780  0.764  0.753  0.742
      diagonal  4779
1997  observed  0.869  0.928  1.000  0.973  0.946  0.921  0.875  0.853  0.838  0.825  0.807  0.796  0.788  0.779  0.778
      model     0.883  0.956  1.000  0.995  0.958  0.933  0.888  0.867  0.847  0.829  0.811  0.795  0.779  0.768  0.757
      diagonal  4963
1998  observed  0.774  0.824  0.863  1.000  0.966  0.936  0.885  0.860  0.844  0.829  0.810  0.798  0.787  0.778  0.777
      model     0.782  0.826  0.881  1.000  0.964  0.942  0.899  0.878  0.858  0.840  0.822  0.807  0.790  0.779  0.768
      diagonal  5464
1999  observed  0.744  0.791  0.846  0.952  1.000  0.963  0.910  0.885  0.867  0.853  0.832  0.819  0.808  0.799  0.798
      model     0.753  0.795  0.848  0.990  1.000  0.968  0.920  0.898  0.877  0.859  0.840  0.824  0.808  0.797  0.785
      diagonal  5550
2000  observed  0.705  0.749  0.802  0.899  0.939  1.000  0.940  0.914  0.894  0.878  0.856  0.842  0.830  0.821  0.818
      model     0.720  0.758  0.805  0.914  0.976  1.000  0.941  0.917  0.896  0.877  0.858  0.841  0.825  0.814  0.802
      diagonal  5684
2002  observed  0.647  0.688  0.737  0.822  0.857  0.909  1.000  0.967  0.944  0.926  0.904  0.888  0.876  0.865  0.862
      model     0.668  0.701  0.741  0.822  0.863  0.920  1.000  0.961  0.937  0.916  0.896  0.879  0.861  0.850  0.838
      diagonal  5876
2003  observed  0.628  0.667  0.714  0.794  0.829  0.878  0.962  1.000  0.974  0.955  0.933  0.917  0.903  0.892  0.888
      model     0.648  0.680  0.717  0.792  0.829  0.878  0.998  1.000  0.963  0.940  0.917  0.900  0.881  0.871  0.859
      diagonal  5907
2004  observed  0.612  0.649  0.695  0.771  0.805  0.851  0.929  0.964  1.000  0.978  0.953  0.936  0.922  0.910  0.906
      model     0.627  0.657  0.693  0.761  0.795  0.839  0.938  0.997  1.000  0.966  0.940  0.822  0.903  0.892  0.880
      diagonal  5968
2005  observed  0.591  0.627  0.671  0.743  0.776  0.819  0.894  0.926  0.959  1.000  0.972  0.955  0.941  0.928  0.924
      model     0.603  0.632  0.665  0.726  0.759  0.798  0.881  0.924  0.983  1.000  0.963  0.945  0.924  0.914  0.902
      diagonal  6088
2006  observed  0.571  0.605  0.648  0.717  0.779  0.788  0.861  0.893  0.923  0.960  1.000  0.980  0.965  0.951  0.946
      model     0.583  0.610  0.642  0.701  0.729  0.765  0.840  0.876  0.922  0.993  1.000  0.974  0.949  0.939  0.926
      diagonal  6166
2007  observed  0.538  0.572  0.613  0.677  0.706  0.744  0.812  0.843  0.870  0.904  0.941  1.000  0.982  0.967  0.962
      model     0.555  0.580  0.610  0.664  0.689  0.722  0.786  0.816  0.852  0.901  0.954  1.000  0.970  0.963  0.949
      diagonal  6427
2008  observed  0.517  0.549  0.589  0.648  0.676  0.712  0.777  0.805  0.831  0.866  0.899  0.953  1.000  0.982  0.975
      model     0.532  0.556  0.584  0.634  0.658  0.687  0.746  0.772  0.803  0.844  0.885  0.970  1.000  0.998  0.978
      diagonal  6620
2009  observed  0.460  0.489  0.525  0.578  0.602  0.635  0.692  0.717  0.739  0.769  0.798  0.846  0.885  1.000  0.984
      model     0.491  0.513  0.538  0.581  0.602  0.628  0.677  0.698  0.722  0.752  0.779  0.829  0.871  1.000  0.994
      diagonal  7347
2010  observed  0.419  0.445  0.477  0.525  0.547  0.576  0.627  0.649  0.670  0.697  0.723  0.766  0.799  0.895  1.000
      model     0.457  0.477  0.500  0.538  0.557  0.580  0.623  0.641  0.661  0.685  0.706  0.744  0.774  0.884  1.000
      diagonal  8073

* For each intersection the top cell (observed) contains the value of the observed retentivity, the middle cell (model) the model value for the comparison of the relevant years' lists. The bottom cell (diagonal) is filled only on the diagonal and gives the number of journals in the JCR SE release for that year.
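As a worked check, the first-order model values can be recomputed from the journal counts on the diagonal of Table 1. The sketch below is an illustration, not the authors' program: it assumes the coefficients of equations (17) and (18) as printed (rounded to three decimals), the function names `p_direct_1` and `p_reverse_1` are hypothetical, and only three of the journal counts are copied in.

```python
# Journal counts from the diagonal of Table 1 (three years only).
JOURNALS = {1995: 4623, 1996: 4779, 2010: 8073}


def p_direct_1(r_t, r_t_tau, tau):
    """Equation (17): direct retentivity by number of journals."""
    nu = (r_t_tau - r_t) / (r_t * tau ** 0.8 + 1)
    return -0.034 * tau ** 0.8 + 0.354 * nu + 0.993


def p_reverse_1(r_t, r_t_tau, tau):
    """Equation (18): reverse retentivity by number of journals."""
    nu = (r_t - r_t_tau) / (r_t_tau * tau ** 0.5 + 1)
    return -0.138 * tau ** 0.5 + 1.411 * nu + 1.149


# Direct: share of the 1995 list retained in a later list.
direct_1_year = p_direct_1(JOURNALS[1995], JOURNALS[1996], 1)     # Table 1: 0.971
direct_15_years = p_direct_1(JOURNALS[1995], JOURNALS[2010], 15)  # Table 1: 0.727

# Reverse: share of a later list retained in the 1995 list.
reverse_1_year = p_reverse_1(JOURNALS[1995], JOURNALS[1996], 1)     # Table 1: 0.964
reverse_15_years = p_reverse_1(JOURNALS[1995], JOURNALS[2010], 15)  # Table 1: 0.457
```

Because the published coefficients are rounded, the recomputed values should be expected to match the model cells of Table 1 only to within a few thousandths.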

Table 2. Retentivity by the sum of number of articles (retentivity of the second order)*

Year  Cell      1995   1996   1997   1998   1999   2000   2002   2003   2004   2005   2006   2007   2008   2009   2010
1995  observed  1.000  0.981  0.960  0.934  0.914  0.893  0.868  0.854  0.848  0.843  0.822  0.820  0.817  0.811  0.808
      model     1.000  0.977  0.947  0.934  0.918  0.903  0.874  0.865  0.860  0.852  0.835  0.830  0.819  0.813  0.808
      diagonal  607049
1996  observed  0.964  1.000  0.978  0.945  0.924  0.902  0.879  0.864  0.857  0.852  0.831  0.826  0.823  0.816  0.814
      model     0.973  1.000  0.951  0.944  0.928  0.913  0.885  0.875  0.871  0.863  0.846  0.840  0.829  0.823  0.818
      diagonal  631137
1997  observed  0.930  0.963  1.000  0.967  0.948  0.925  0.901  0.886  0.880  0.874  0.854  0.851  0.848  0.841  0.839
      model     0.938  0.970  1.000  0.971  0.951  0.933  0.902  0.892  0.888  0.880  0.862  0.857  0.845  0.839  0.833
      diagonal  635386
1998  observed  0.888  0.917  0.951  1.000  0.979  0.954  0.928  0.911  0.905  0.899  0.876  0.874  0.867  0.862  0.859
      model     0.908  0.938  0.972  1.000  0.964  0.946  0.914  0.905  0.901  0.893  0.874  0.869  0.857  0.851  0.845
      diagonal  656014
1999  observed  0.860  0.889  0.921  0.967  1.000  0.976  0.949  0.931  0.923  0.917  0.894  0.890  0.885  0.879  0.876
      model     0.880  0.908  0.939  0.972  1.000  0.963  0.929  0.920  0.917  0.908  0.889  0.884  0.871  0.865  0.859
      diagonal  671466
2000  observed  0.830  0.856  0.890  0.932  0.965  1.000  0.967  0.951  0.941  0.934  0.908  0.905  0.900  0.893  0.891
      model     0.853  0.879  0.908  0.938  0.971  1.000  0.945  0.936  0.935  0.926  0.904  0.899  0.886  0.880  0.874
      diagonal  686146
2002  observed  0.784  0.811  0.844  0.882  0.914  0.943  1.000  0.981  0.970  0.963  0.933  0.930  0.925  0.917  0.914
      model     0.802  0.826  0.853  0.879  0.908  0.938  1.000  0.979  0.980  0.967  0.939  0.934  0.920  0.912  0.906
      diagonal  716304
2003  observed  0.767  0.794  0.827  0.865  0.894  0.924  0.975  1.000  0.985  0.978  0.948  0.946  0.940  0.932  0.928
      model     0.778  0.802  0.827  0.853  0.880  0.908  0.973  1.000  0.997  0.986  0.953  0.948  0.932  0.925  0.919
      diagonal  747060
2004  observed  0.737  0.762  0.794  0.830  0.880  0.910  0.959  0.959  1.000  0.990  0.936  0.933  0.927  0.918  0.916
      model     0.754  0.778  0.803  0.828  0.854  0.881  0.942  0.976  1.000  0.989  0.952  0.925  0.937  0.930  0.924
      diagonal  802988
2005  observed  0.710  0.736  0.768  0.803  0.859  0.888  0.937  0.937  0.978  1.000  0.939  0.936  0.930  0.920  0.918
      model     0.731  0.755  0.779  0.803  0.828  0.855  0.911  0.942  0.975  1.000  0.948  0.960  0.945  0.940  0.935
      diagonal  847143
2006  observed  0.717  0.744  0.777  0.812  0.838  0.867  0.918  0.943  0.958  0.977  1.000  0.994  0.986  0.976  0.974
      model     0.708  0.731  0.755  0.778  0.803  0.828  0.882  0.910  0.939  0.970  1.000  0.996  0.976  0.968  0.961
      diagonal  856476
2007  observed  0.685  0.712  0.743  0.777  0.8905 0.833  0.884  0.911  0.925  0.946  0.967  1.000  0.991  0.981  0.978
      model     0.686  0.709  0.732  0.755  0.779  0.804  0.855  0.882  0.910  0.940  0.976  1.000  0.974  0.973  0.968
      diagonal  914264
2008  observed  0.668  0.695  0.725  0.759  0.784  0.813  0.865  0.890  0.905  0.925  0.947  0.979  1.000  0.989  0.982
      model     0.664  0.686  0.709  0.732  0.755  0.779  0.829  0.855  0.882  0.909  0.941  0.973  1.000  0.998  0.990
      diagonal  945679
2009  observed  0.621  0.649  0.678  0.713  0.738  0.764  0.812  0.836  0.851  0.873  0.894  0.927  0.945  1.000  0.989
      model     0.643  0.664  0.687  0.709  0.732  0.756  0.804  0.830  0.855  0.882  0.911  0.941  0.975  1.000  0.998
      diagonal  1009590
2010  observed  0.582  0.608  0.636  0.668  0.692  0.718  0.765  0.787  0.802  0.823  0.844  0.880  0.897  0.951  1.000
      model     0.621  0.643  0.665  0.687  0.710  0.733  0.780  0.805  0.830  0.855  0.884  0.911  0.943  0.976  1.000
      diagonal  1080209

* For each intersection the top cell (observed) contains the value of the observed retentivity, the middle cell (model) the model value for the comparison of the relevant years' lists. The bottom cell (diagonal) is filled only on the diagonal and gives the number of articles in the JCR SE release for that year.

Table 3. Retentivity by the sum of the impact-factor values (retentivity of the third order)*

Year  Cell      1995   1996   1997   1998   1999   2000   2002   2003   2004   2005   2006   2007   2008   2009   2010
1995  observed  1.000  0.976  0.934  0.927  0.909  0.899  0.878  0.861  0.853  0.843  0.820  0.819  0.817  0.774  0.815
      model     1.000  0.976  0.942  0.936  0.927  0.911  0.872  0.871  0.858  0.847  0.834  0.823  0.823  0.806  0.807
      diagonal  5873.2
1996  observed  0.947  1.000  0.966  0.955  0.940  0.927  0.902  0.885  0.876  0.864  0.841  0.840  0.838  0.799  0.838
      model     0.955  1.000  0.948  0.949  0.941  0.924  0.895  0.883  0.870  0.858  0.845  0.834  0.834  0.817  0.817
      diagonal  6251.8
1997  observed  0.893  0.946  1.000  0.980  0.959  0.946  0.918  0.903  0.892  0.880  0.857  0.854  0.853  0.817  0.855
      model     0.905  0.936  1.000  0.988  0.974  0.952  0.918  0.905  0.891  0.879  0.865  0.853  0.853  0.836  0.836
      diagonal  6255.8
1998  observed  0.845  0.893  0.942  1.000  0.971  0.956  0.923  0.907  0.898  0.884  0.857  0.855  0.855  0.814  0.855
      model     0.876  0.910  0.963  1.000  0.987  0.963  0.928  0.915  0.900  0.888  0.874  0.862  0.861  0.843  0.843
      diagonal  6840.3
1999  observed  0.814  0.863  0.907  0.962  1.000  0.979  0.944  0.927  0.918  0.904  0.877  0.876  0.875  0.834  0.874
      model     0.845  0.878  0.924  0.962  1.000  0.970  0.938  0.925  0.910  0.897  0.883  0.871  0.870  0.851  0.851
      diagonal  7462.8
2000  observed  0.779  0.825  0.871  0.921  0.957  1.000  0.958  0.939  0.928  0.914  0.888  0.886  0.886  0.846  0.883
      model     0.811  0.844  0.884  0.918  0.951  1.000  0.955  0.942  0.926  0.912  0.898  0.885  0.884  0.865  0.864
      diagonal  7851.1
2002  observed  0.718  0.759  0.796  0.841  0.871  0.916  1.000  0.977  0.965  0.953  0.925  0.924  0.924  0.882  0.919
      model     0.749  0.779  0.814  0.845  0.877  0.913  1.000  0.980  0.959  0.944  0.928  0.913  0.913  0.892  0.891
      diagonal  8755.3
2003  observed  0.685  0.725  0.762  0.806  0.836  0.877  0.967  1.000  0.985  0.973  0.946  0.944  0.944  0.898  0.937
      model     0.720  0.749  0.783  0.813  0.844  0.879  0.957  1.000  0.972  0.958  0.941  0.926  0.927  0.904  0.903
      diagonal  9401.3
2004  observed  0.668  0.705  0.742  0.782  0.811  0.850  0.935  0.970  1.000  0.986  0.956  0.954  0.953  0.908  0.947
      model     0.690  0.719  0.752  0.781  0.811  0.844  0.916  0.952  1.000  0.977  0.957  0.942  0.944  0.919  0.918
      diagonal  9938.2
2005  observed  0.643  0.680  0.715  0.753  0.781  0.819  0.902  0.936  0.968  1.000  0.968  0.967  0.966  0.919  0.957
      model     0.662  0.691  0.722  0.751  0.780  0.812  0.880  0.915  0.956  1.000  0.972  0.957  0.961  0.934  0.933
      diagonal  10617.9
2006  observed  0.622  0.656  0.692  0.728  0.755  0.791  0.869  0.903  0.931  0.962  1.000  0.997  0.978  0.944  0.985
      model     0.634  0.662  0.692  0.721  0.749  0.780  0.845  0.878  0.915  0.952  1.000  0.976  0.984  0.952  0.951
      diagonal  11422.0
2007  observed  0.602  0.635  0.669  0.703  0.727  0.761  0.836  0.867  0.895  0.924  0.962  1.000  0.958  0.944  0.985
      model     0.607  0.634  0.664  0.691  0.720  0.750  0.813  0.844  0.879  0.914  0.955  1.000  0.997  0.970  0.970
      diagonal  12057.8
2008  observed  0.605  0.640  0.676  0.712  0.738  0.772  0.849  0.883  0.911  0.942  0.963  0.980  1.000  0.944  0.984
      model     0.581  0.608  0.634  0.665  0.693  0.723  0.783  0.817  0.852  0.888  0.929  0.979  1.000  0.954  0.968
      diagonal  13453.9
2009  observed  0.537  0.569  0.603  0.634  0.658  0.689  0.759  0.789  0.815  0.842  0.874  0.913  0.895  1.000  0.989
      model     0.553  0.580  0.609  0.636  0.663  0.692  0.752  0.783  0.815  0.848  0.884  0.922  0.940  1.000  0.999
      diagonal  14703.0
2010  observed  0.515  0.542  0.572  0.601  0.624  0.651  0.716  0.744  0.767  0.792  0.824  0.861  0.844  0.899  1.000
      model     0.528  0.555  0.583  0.610  0.637  0.666  0.725  0.755  0.787  0.820  0.855  0.892  0.919  0.978  1.000
      diagonal  16216.0

* For each intersection the top cell (observed) contains the value of the observed retentivity, the middle cell (model) the model value for the comparison of the relevant years' lists. The bottom cell (diagonal) is filled only on the diagonal and gives the sum of the impact-factor values of the journals in the JCR SE release for that year.

Table 4. Retentivity by the sum of the expected response values (retentivity of the fourth order)*

Year  Cell      1995   1996   1997   1998   1999   2000   2002   2003   2004   2005   2006   2007   2008   2009   2010
1995  observed  1.000  0.991  0.977  0.971  0.958  0.950  0.937  0.927  0.923  0.918  0.909  0.909  0.907  0.852  0.902
      model     1.000  0.993  0.969  0.962  0.957  0.948  0.932  0.927  0.922  0.917  0.910  0.904  0.904  0.892  0.893
      diagonal  1175704.6
1996  observed  0.977  1.000  0.986  0.979  0.966  0.957  0.944  0.935  0.930  0.926  0.918  0.916  0.915  0.860  0.910
      model     0.993  1.000  0.968  0.965  0.962  0.953  0.937  0.932  0.928  0.922  0.915  0.909  0.909  0.897  0.897
      diagonal  1269083.2
1997  observed  0.957  0.983  1.000  0.992  0.979  0.969  0.957  0.947  0.943  0.938  0.930  0.929  0.927  0.874  0.923
      model     0.958  0.978  1.000  0.986  0.981  0.969  0.951  0.945  0.940  0.935  0.927  0.921  0.921  0.908  0.908
      diagonal  1254494.1
1998  observed  0.938  0.963  0.981  1.000  0.987  0.976  0.964  0.953  0.949  0.944  0.936  0.935  0.933  0.978  0.929
      model     0.934  0.956  0.989  1.000  0.994  0.978  0.959  0.953  0.948  0.942  0.935  0.928  0.928  0.915  0.915
      diagonal  1322313.0
1999  observed  0.914  0.940  0.958  0.980  1.000  0.989  0.975  0.964  0.959  0.954  0.945  0.944  0.942  0.889  0.938
      model     0.912  0.934  0.964  0.994  1.000  0.981  0.964  0.959  0.954  0.948  0.940  0.933  0.933  0.919  0.919
      diagonal  1430310.8
2000  observed  0.892  0.918  0.936  0.958  0.978  1.000  0.984  0.974  0.969  0.964  0.955  0.954  0.952  0.899  0.948
      model     0.889  0.910  0.937  0.963  0.987  1.000  0.975  0.969  0.965  0.958  0.950  0.942  0.942  0.828  0.928
      diagonal  1483081.0
2002  observed  0.951  0.877  0.895  0.916  0.937  0.962  1.000  0.988  0.983  0.979  0.968  0.967  0.965  0.910  0.960
      model     0.946  0.866  0.890  0.913  0.935  0.961  1.000  0.993  0.987  0.979  0.968  0.960  0.960  0.944  0.944
      diagonal  1632429.6
2003  observed  0.826  0.854  0.873  0.895  0.916  0.943  0.983  1.000  0.993  0.988  0.978  0.977  0.975  0.919  0.969
      model     0.826  0.846  0.869  0.891  0.913  0.937  0.993  1.000  0.996  0.987  0.975  0.966  0.967  0.950  0.950
      diagonal  1760771.8
2004  observed  0.803  0.831  0.851  0.874  0.900  0.926  0.968  0.981  1.000  0.994  0.979  0.978  0.975  0.919  0.968
      model     0.807  0.826  0.849  0.870  0.891  0.915  0.967  0.995  1.000  0.992  0.980  0.971  0.972  0.955  0.955
      diagonal  1922874.5
2005  observed  0.785  0.814  0.833  0.855  0.882  0.910  0.953  0.968  0.986  1.000  0.983  0.981  0.979  0.920  0.972
      model     0.787  0.807  0.826  0.849  0.870  0.893  0.941  0.967  0.993  1.000  0.986  0.978  0.980  0.961  0.961
      diagonal  2071938.5
2006  observed  0.765  0.793  0.813  0.838  0.858  0.887  0.934  0.954  0.969  0.983  1.000  0.998  0.989  0.938  0.987
      model     0.768  0.787  0.808  0.828  0.849  0.871  0.917  0.940  0.964  0.989  1.000  0.988  0.993  0.970  0.971
      diagonal  2192911.0
2007  observed  0.742  0.771  0.791  0.812  0.834  0.864  0.912  0.933  0.949  0.964  0.983  1.000  0.982  0.939  0.987
      model     0.749  0.768  0.788  0.808  0.828  0.849  0.893  0.916  0.938  0.962  0.990  1.000  0.998  0.978  0.981
      diagonal  2322726.7
2008  observed  0.732  0.764  0.783  0.905  0.826  0.855  0.907  0.929  0.947  0.963  0.979  0.988  1.000  0.942  0.989
      model     0.731  0.750  0.770  0.789  0.809  0.830  0.873  0.895  0.918  0.942  0.970  0.999  1.000  0.965  0.978
      diagonal  2556148.1
2009  observed  0.685  0.717  0.737  0.760  0.780  0.809  0.856  0.877  0.894  0.911  0.934  0.955  0.951  1.000  0.994
      model     0.712  0.730  0.749  0.769  0.788  0.808  0.850  0.871  0.892  0.914  0.938  0.963  0.977  1.000  0.998
      diagonal  2737072.5
2010  observed  0.658  0.689  0.707  0.727  0.747  0.777  0.824  0.845  0.862  0.877  0.902  0.925  0.922  0.928  1.000
      model     0.694  0.712  0.731  0.750  0.769  0.789  0.830  0.851  0.872  0.894  0.917  0.942  0.963  0.997  1.000
      diagonal  2976525.8

* For each intersection the top cell (observed) contains the value of the observed retentivity, the middle cell (model) the model value for the comparison of the relevant years' lists. The bottom cell (diagonal) is filled only on the diagonal and gives the sum of the expected response values of the journals in the JCR SE release for that year.

related year in JCR SE (the column and row on the diagonal, of course, refer to the same year). The observed retentivity is given in the upper cell, and the model retentivity in the middle cell. Data above the diagonal refer to direct retentivity, those below the diagonal to reverse retentivity.

Tables 1–4 contain detailed information about the retentivity. In a more compact form, the degree of agreement of the model with the empirical data can be described (using the data of Tables 1–4) by the following relationship:

$$\Delta_{relat} = \frac{|p_{theor} - p_{exp}|}{p_{exp}} \cdot 100, \qquad (25)$$

where:

$\Delta_{relat}$ is the relative deviation of the model retentivity from the observed retentivity;

$p_{theor}$ is the model value of the retentivity;

$p_{exp}$ is the observed value of the retentivity;

$|p_{theor} - p_{exp}|$ is the absolute value of the difference between the model and observed retentivity.

Table 5 shows the fraction of cases (from the total number of comparisons) for which the values of $\Delta_{relat}$ are within the limits shown in columns 3–7 of the table.
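The relative deviation of equation (25) and the binning used for columns 3–7 of Table 5 can be illustrated as follows. This is a sketch with hypothetical function names (`relative_deviation`, `bin_shares`); the four sample pairs are the (model, observed) values for the 1995 list compared with the 1996, 1997, 1998 and 2010 lists in Table 1, so the resulting shares describe only these four comparisons, not a full row of Table 5.

```python
def relative_deviation(p_theor, p_exp):
    """Equation (25): relative deviation of model from observation, in %."""
    return abs(p_theor - p_exp) / p_exp * 100.0


def bin_shares(pairs):
    """Share (%) of comparisons per bin: <=1, (1, 2], (2, 3], (3, 4], >4."""
    counts = [0, 0, 0, 0, 0]
    for p_theor, p_exp in pairs:
        deviation = relative_deviation(p_theor, p_exp)
        if deviation <= 1:
            counts[0] += 1
        elif deviation <= 2:
            counts[1] += 1
        elif deviation <= 3:
            counts[2] += 1
        elif deviation <= 4:
            counts[3] += 1
        else:
            counts[4] += 1
    return [100.0 * count / len(pairs) for count in counts]


# (model, observed) pairs from the 1995 row of Table 1.
sample = [(0.971, 0.965), (0.949, 0.935), (0.938, 0.918), (0.727, 0.736)]
shares = bin_shares(sample)
```

Applying `bin_shares` to all 105 direct (or reverse) comparisons of one table would give one row of Table 5.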

Table 5. Level of agreement of the model with empirical data*

1                                                                                      2     3     4     5     6     7
Direct retentivity of the first order (by number of journals)                         105  32.4  45.7  19.0   2.9   0.0
Reverse retentivity of the first order (by number of journals)                        105  23.8  26.7  28.6  15.2   5.7
Direct retentivity of the second order (by the sum of number of articles)             105  56.2  39.0   4.8   0.0   0.0
Reverse retentivity of the second order (by the sum of number of articles)            105  28.6  21.9  22.9  16.2  10.5
Direct retentivity of the third order (by the sum of the impact-factor values)        105  60.0  22.9  11.4   3.8   1.9
Reverse retentivity of the third order (by the sum of the impact-factor values)       105  36.2  22.9  17.1   9.5  14.3
Direct retentivity of the fourth order (by the sum of the expected response values)   105  67.6  18.1   2.9   6.7   4.8
Reverse retentivity of the fourth order (by the sum of the expected response values)  105  57.1  26.7   6.7   6.7   2.9

* 1: order and direction of retentivity; 2: number of comparisons of observed and model retentivity; 3: fraction of comparisons for which Δrelat ≤ 1 (%); 4: fraction of comparisons for which 1 < Δrelat ≤ 2 (%); 5: fraction of comparisons for which 2 < Δrelat ≤ 3 (%); 6: fraction of comparisons for which 3 < Δrelat ≤ 4 (%); 7: fraction of comparisons for which Δrelat > 4 (%).

Fig. 1. Comparison of model and empirical data for the 1st order of retentivity (by number of journals). Horizontal axis: time interval, years (1–15); vertical axis: retentivity (0.2–1.0). Legend: real retentivity, direct and reverse, by number of journals (data points); theoretical retentivity, direct and reverse, by number of journals (function graphs).


Table 5 allows us to draw some conclusions. Firstly, the model is in close enough agreement with the empirical data: the proportion of cases where the calculated (theoretical) values of retentivity differ from the observed values by no more than 1% is in the range 23.8–67.6%. The proportion of cases where the deviation is less than 2% is in the range 50.5–85.7% (the sum of the corresponding values of columns 3 and 4), and the proportion of cases where the deviation is less than 3% is 80.1–88.6% (the sum of the corresponding values of columns 3, 4 and 5). The maximum deviation found is 9.6%; there is only one such large deviation among the 840 comparisons. Secondly, in all cases of direct retentivity the model is in closer agreement with the observed values than in the cases of reverse retentivity (see also Figs. 1–4). Thirdly, the agreement of the model with the empirical data increases with increasing retentivity order.

We also find the highest values of retentivity for the fourth order of retentivity, and the lowest for the first order (see also Figs. 1–4).

[Figure: retentivity (vertical axis, 0.2–1.0) plotted against the time interval in years (horizontal axis, 1–15). Data points: real direct and real reverse retentivity by the sum of the number of articles. Curves: theoretical direct and theoretical reverse retentivity by the sum of the number of articles.]

Fig. 2. Comparison of model and empirical data for the 2nd order of retentivity (by the sum of the number of articles).

[Figure: retentivity (vertical axis, 0.2–1.0) plotted against the time interval in years (horizontal axis, 1–15). Data points: real direct and real reverse retentivity by the sum of the impact factor values. Curves: theoretical direct and theoretical reverse retentivity by the sum of the impact factor values.]

Fig. 3. Comparison of model and empirical data for the 3rd order of retentivity (by the sum of the impact factor values).


The level of agreement between the model and the empirical data is shown graphically in Figs. 1–4, which correspond to retentivity orders 1–4. The curves in these graphs represent the model retentivity (upper curve: direct retentivity; lower curve: reverse retentivity), and the data points represent the observed retentivity values.

In interpreting these graphs one should keep the following in mind. If observed retentivity values correspond to different periods of physical time but refer to the same time interval, they usually differ only slightly from each other and often coincide up to the third decimal place. In these cases, the symbols for these values on the graph overlap and merge into a single data point, and may be perceived as a single value. This happens most often in the case of direct retentivity of the 2nd, 3rd and 4th order. It is especially true for small time intervals, for which, naturally, more retentivity data are available. For example, for an interval of 1 year we have 14 different values, for an interval of 2 years 13 values, etc. At the same time, sporadic outliers are clearly visible and seem to have a greater “weight”; most of the empirical data nevertheless lie close to the curves, indicating good agreement of the model.

[Figure: retentivity (vertical axis, 0.2–1.0) plotted against the time interval in years (horizontal axis, 1–15). Data points: real direct and real reverse retentivity by the sum of the expected response values. Curves: theoretical direct and theoretical reverse retentivity by the sum of the expected response values.]

Fig. 4. Comparison of model and empirical data for the 4th order of retentivity (by the sum of the expected response values).

[Figs. 5–7: decrease/increase of retentivity, as a share of 1 (vertical axis), plotted against the time interval in years (horizontal axis, 1–15). Curves in each figure: influence on retentivity of the 1st term of the equation, of the 2nd term of the equation, and of the free term of the equation, c – 1.]

Fig. 5. Direct retentivity of the 1st order. Influence on retentivity of (1) the time interval τ between two compared lists (1st term of the equation); (2) the changes over time of the ratio of the journal numbers in these lists (2nd term of the equation); (3) the factors which the postulates do not take into account directly (the difference between the free term c and 1).

Fig. 6. Reverse retentivity of the 1st order. Influence on retentivity of (1) the time interval τ between two compared lists (1st term of the equation); (2) the changes over time of the ratio of the journal numbers in these lists (2nd term of the equation); (3) the factors which the postulates do not take into account directly (the difference between the free term c and 1).

Fig. 7. Direct retentivity of the 2nd order. Influence on retentivity of (1) the time interval τ between two compared lists (1st term of the equation); (2) the changes over time of the ratio of the sum of the number of articles in these lists (2nd term of the equation); (3) the factors which the postulates do not take into account directly (the difference between the free term c and 1).

FINAL VERIFICATION OF THE POSTULATES

While the tables and figures presented above show good agreement between the model and the observed retentivities, unfortunately, those data do not allow us to estimate the specific impact of each term in the corresponding equation of the model. Without such an assessment, we cannot check the validity of the four postulates which form the basis of the model. To solve this key problem, we will consider the mathematical structure of each term of the relevant equations, and use the graphs in Figs. 5–12. Each of these graphs reflects both the impact on retentivity of the interval between the compared lists (1st term of the equation) and the impact of the change in the ratio between the quantitative characteristic values assigned to a given pair of lists (2nd term of the equation). In addition, each graph gives an indication of the influence of the socio-economic factors which the postulates do not take into account. The graphs show this influence as a curve of the change over time of c – 1, where c is the value of the free term of the equation and 1 is the probability of retention in the degenerate case, i.e. the retention of a list in itself.

[Figs. 8–11: decrease/increase of retentivity, as a share of 1 (vertical axis), plotted against the time interval in years (horizontal axis, 1–15). Curves in each figure: influence on retentivity of the 1st term of the equation, of the 2nd term of the equation, and of the free term of the equation, c – 1.]

Fig. 8. Reverse retentivity of the 2nd order. Influence on retentivity of (1) the time interval τ between two compared lists (1st term of the equation); (2) the changes over time of the ratio of the sum of the number of articles in these lists (2nd term of the equation); (3) the factors which the postulates do not take into account directly (the difference between the free term c and 1).

Fig. 9. Direct retentivity of the 3rd order. Influence on retentivity of (1) the time interval τ between two compared lists (1st term of the equation); (2) the changes over time of the ratio of the sum of the impact-factor values in these lists (2nd term of the equation); (3) the factors which the postulates do not take into account directly (the difference between the free term c and 1).

Fig. 10. Reverse retentivity of the 3rd order. Influence on retentivity of (1) the time interval τ between two compared lists (1st term of the equation); (2) the changes over time of the ratio of the sum of the impact-factor values in these lists (2nd term of the equation); (3) the factors which the postulates do not take into account directly (the difference between the free term c and 1).

Fig. 11. Direct retentivity of the 4th order. Influence on retentivity of (1) the time interval τ between two compared lists (1st term of the equation); (2) the changes over time of the ratio of the sum of the expected response values in these lists (2nd term of the equation); (3) the factors which the postulates do not take into account directly (the difference between the free term c and 1).

For the numerical evaluation of the impact of each term of the equations, we will use the following ratio:

$$ r = \frac{\delta m_{i,k,D}}{p_{k,D}} \cdot 100, \qquad (26) $$

where:

$\delta m_{i,k,D}$ is the numerical value of the contribution of a given term $m_{i,k,D}$ to retentivity/non-retentivity; i corresponds to the 1st, 2nd or 3rd term; k is the order of retentivity; D is the direction of retentivity;

$p_{k,D}$ is the numerical value of the retentivity of a given order k for direction D;

r is the fraction of the contribution of a given term of the equation to the retentivity/non-retentivity (%).

For convenience we will take the time interval to be 15 years, i.e. the maximum time difference between the lists considered in the present study.

The influence of the 1st term of the equation. Looking at the first term on the right side of the model equations, we find that in all eight cases (four cases of direct and four cases of reverse retentivity) an increase of the time interval between lists is connected with a drop in retentivity. Indeed, in equations (17)–(24) the value of the coefficient a related to τ is always negative.
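As a minimal sketch of how ratio (26) is applied, the following Python fragment expresses the contribution of each term as a percentage of the retentivity value. The function name and all numbers are hypothetical and serve only to show the arithmetic; they are not taken from the paper's tables.

```python
def contribution_share(delta_m: float, p: float) -> float:
    """Ratio (26): contribution of one term as a percentage of retentivity p."""
    if p == 0:
        raise ValueError("retentivity p must be non-zero")
    return 100.0 * delta_m / p


# Hypothetical contributions of the three terms at one time interval
# (illustrative values only, not taken from the paper).
p = 0.50  # assumed retentivity value
contributions = {
    "1st term (time interval)": -0.60,
    "2nd term (ratio of list characteristics)": 0.05,
    "free term, c - 1": 0.05,
}

for name, delta in contributions.items():
    print(f"{name}: r = {contribution_share(delta, p):+.1f}%")
```

With these made-up inputs the first term yields a share below –100%, which is the situation in which the loss attributed to one term exceeds the retentivity itself and is offset by the other terms.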

The influence of the time interval on retentivity decreases as the order of retentivity increases. For example, in the case of first-order reverse retentivity a = –0.034, but for the fourth order a = –0.016.

The retentivity decreases non-linearly with the time interval (the exponent in all cases is less than 1), so each additional year reduces the retentivity by a smaller absolute amount than the previous one. The dependence of retentivity on the time interval therefore weakens, even if only slowly. The non-linear nature of this dependence is more characteristic of reverse retentivity (see Figs. 6, 8, 10 and 12). It is most clearly illustrated by Fig. 6, in which the lower curve, corresponding to the reverse retentivity by number of journals, is markedly curved.

The influence of an increasing time interval τ is stronger for reverse retentivity than for the corresponding cases of direct retentivity, as the graphs in Figs. 5–12 illustrate well. This is indicated by the ratio of the values of coefficient a for reverse and direct retentivity, which is always greater than 1; for the 2nd, 3rd and 4th orders of retentivity it ranges between 1.7 and 2.4. We do not consider here the value of this ratio for the 1st order, which is even greater (4.05): in this case the values of the exponent γ for reverse and direct retentivity are not equal (0.5 and 0.8 respectively), which makes such a comparison incorrect.
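The flattening described above follows from the exponent being less than 1, and a short numerical sketch may make this concrete. The power-law form a·τ^γ used below is an assumption of the sketch, because equations (17)–(24) are not reproduced in this section, and the values of a and γ are illustrative rather than the fitted coefficients.

```python
def first_term(tau: float, a: float, gamma: float) -> float:
    """Assumed power-law form of the 1st term: a * tau**gamma."""
    return a * tau ** gamma


# Illustrative values chosen for this sketch (NOT the fitted coefficients):
# a negative coefficient and an exponent below 1.
a, gamma = -0.1, 0.5

values = [first_term(tau, a, gamma) for tau in range(0, 16)]
# Additional loss attributed to the 1st term by each further year.
steps = [abs(values[i + 1] - values[i]) for i in range(15)]

for tau, step in enumerate(steps, start=1):
    print(f"tau = {tau:2d}: additional loss {step:.4f}")
```

Under this assumption every further year adds a smaller loss than the year before, which is the sense in which the dependence on the time interval fades.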

[Figure: decrease/increase of retentivity, as a share of 1 (vertical axis), plotted against the time interval in years (horizontal axis, 1–15). Curves: influence on retentivity of the 1st term of the equation, of the 2nd term of the equation, and of the free term of the equation, c – 1.]

Fig. 12. Reverse retentivity of the 4th order. Influence on retentivity of (1) the time interval τ between two compared lists (1st term of the equation); (2) the changes over time of the ratio of the sum of the expected response values in these lists (2nd term of the equation); (3) the factors which the postulates do not take into account directly (the difference between the free term c and 1).


To illustrate the impact of the 1st term of the equations, we consider the ratio r (26) for the maximum value of τ (i.e. a time difference of 15 years, comparing 1995 and 2010). For direct retentivity we find r to be –40.9%, –29.3%, –32.0% and –15.8% for the 1st, 2nd, 3rd and 4th order of retentivity, respectively. For reverse retentivity of the 2nd, 3rd and 4th order we find a stronger influence, with r values of –63.8%, –91.7% and –48.5%, respectively. For reverse retentivity of the 1st order (by number of journals), the losses of retentivity described by the first term of the equation exceed the total retentivity (r = –116.3%); they are compensated only by the influence of the remaining two terms of the equation.

Looking at the influence of the 1st term of the model equations, we can conclude that postulate 2 is fully confirmed.

The influence of the 2nd term of the equation. The fractional expressions in R and τ that form part of the 2nd term of equations (15) and (16) can be considered as an analogue of “acceleration” (either positive or negative), which characterizes the process of change of the quantitative characteristics of the lists over time. This is indicated by the value of the exponent β of τ in the denominator: β is always less than 1 (0.5 ≤ β ≤ 0.8).

When looking at the second term on the right side of equations (15) and (16) we must keep in mind that the expression which corresponds to the case of direct retentivity has, as a rule, a positive value, whereas the expression which corresponds to the case of reverse retentivity always has a negative value. At the same time, as follows from equations (17)–(24), this is not true for the values of the coefficient b. Hence, unlike in the case of the 1st term, we cannot estimate the character and the degree of influence of the 2nd term on retentivity directly from its coefficient b. Instead, we use Figs. 5–12 and the value of the ratio r in equation (26) to assess the influence of the 2nd term. Based on these data we come to the following conclusions.

1. In all four cases of direct retentivity, the 2nd term makes a positive contribution to the retentivity. This influence increases from the 1st to the 3rd order of retentivity and is smaller again for the 4th order. For the retentivity by number of journals (1st order of direct retentivity, Fig. 5) and τ = 15, r = +4.2%. For the direct retentivity by the sum of the number of articles (2nd order of direct retentivity, Fig. 7) it rises to r = +8.9%. For the retentivity by the sum of the impact-factor values (3rd order of direct retentivity, Fig. 9) r = +10.9%, and for the direct retentivity by the sum of the expected response values (4th order of direct retentivity, Fig. 11) r = +5.4%.³

2. The influence of the 2nd term on reverse retentivity is not so unequivocal. For reverse retentivity by number of journals (1st order of reverse retentivity) the agreement with postulate 3 is fair: the contribution of the 2nd term of the equation to the retentivity in this case is negative. Looking at Fig. 6, we find a rather significant contribution of the 2nd term: r = 34.1%.

3. For reverse retentivity of the 2nd, 3rd and 4th orders it becomes clear that postulate 3 does not describe the real processes of retentivity change.

Firstly, according to postulate 3 the 2nd term of the equation should make a negative contribution to the reverse retentivity. In reality, in all three of these cases we find a positive, though very small, contribution of the 2nd term: from Figs. 8, 10 and 12 we find the ratios for the 2nd, 3rd and 4th order retentivity to be +0.8%, +4.4% and +1.8%.

Secondly, the contribution of the 2nd term to the reverse retentivity depends only to a very small degree on the value of τ: the curves corresponding to the contribution of the 2nd term in all three of these figures are practically parallel to the abscissa. The explanation for this paradox probably lies in the presence in each annual JCR SE list of a core of the most authoritative journals, characterized by high productivity (number of articles) and a high academic level (high impact factor) and, consequently, a great influence on world science (high expected response).

There is indeed such a core of 3259 journals (40.4% of the JCR SE list of journals in 2010), each of which is present in every annual JCR SE list we considered. Moreover, in 2010 those 40.4% provided 57.2%, 51.0% and 66.3% of the total number of articles, the sum of impact factor values and the sum of the expected response values, respectively. The ratio of the corresponding characteristics of this set of journals confirms the strong influence of this core: the average number of articles per core journal is 1.4 times the average for the entire list in 2010 (192.01 and 137.3 respectively), the average impact factor is 1.25 times higher (2.540 and 2.032), and the average expected response is 1.62 times higher (610.6 and 376.4).
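The ratios quoted for the core journals can be cross-checked in two ways: directly from the pairs of averages, and from the shares, since (core average)/(list average) equals (core share of the total)/(core share of the journals). The short Python check below uses only the figures quoted above; the variable names are ours.

```python
# Figures quoted in the text for 2010: (core-journal average, whole-list average).
averages = {
    "articles per journal": (192.01, 137.3),
    "impact factor": (2.540, 2.032),
    "expected response": (610.6, 376.4),
}
# Core share of each 2010 total (%), and core share of the journal list (%).
core_share_of_total = {
    "articles per journal": 57.2,
    "impact factor": 51.0,
    "expected response": 66.3,
}
CORE_SHARE_OF_JOURNALS = 40.4

ratios = {}
for name, (core_avg, list_avg) in averages.items():
    from_averages = core_avg / list_avg
    # The same ratio estimated from the rounded percentage shares.
    from_shares = core_share_of_total[name] / CORE_SHARE_OF_JOURNALS
    ratios[name] = (from_averages, from_shares)
    print(f"{name}: {from_averages:.2f} from averages, {from_shares:.2f} from shares")
```

The two estimates agree to within about 0.02 for all three characteristics, the small differences being attributable to rounding of the published percentages.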

Influence of factors that are not accounted for in the postulates. We need to consider the influence on retentivity of some factors which are not addressed directly by the postulates, but which nevertheless are indirectly mentioned in postulate 1: there, the expression “basically depends” indicates that postulate 1 allows for such additional factors. Above we already suggested that such factors could be of a socio-economic nature, in particular the state of the global economy, trends in basic research, and changes in the general rules for forming the JCR SE journal lists.

³ It is noticeable that in all four of these graphs the curves corresponding to the 2nd term of the equations sometimes look like jagged lines. This is because these curves are calculated from the actual values of the numerical characteristics, which sometimes do not vary smoothly (see the values on the diagonals of Tables 1–4).

The model does not allow us to estimate separately the degree of influence of each of the socio-economic factors listed above. It does, however, allow us to estimate their total influence in each of the eight cases of retentivity considered.

As a visual assessment of the influence of these factors, we use the curves corresponding to c – 1 in Figs. 5–12. Let us recall that c is the constant term of the corresponding equation, and 1 is the probability of retention of a list in itself. To illustrate and evaluate this effect, we use the value of the ratio (26).

The influence of socio-economic factors on the direct retentivity. In the graphs (Figs. 5, 7, 9 and 11), the curves corresponding to the difference c – 1 always lie below 0. Hence, it can be argued that these factors always have a negative impact on the direct retentivity. It should be noted that this impact is very small. In the case of the direct retentivity by number of journals, the share of the impact of these factors in the total direct retentivity for τ = 15 is less than 1% (r = +0.9%). In the case of direct retentivity by the total number of articles it is somewhat higher (r = +3.5%); for the direct retentivity by the sum of the impact factor values it is r = +2.8%, and for direct retentivity by the sum of the expected response values r = +1.4%.

The influence of socio-economic factors on the reverse retentivity. From the graphs it is immediately evident that the influence of both the difference c – 1 and the 2nd term on the reverse retentivity is not as clear as it is for the direct retentivity. For the 1st order retentivity that influence is positive and significant, at 32.4% for τ = 15 (see Fig. 6). For the other cases of reverse retentivity, the influence of socio-economic factors is practically negligible (see Figs. 8, 10 and 12): positive with r = +2.4% for the 2nd order and r = +2.7% for the 4th order, and negative at –1.7% for the 3rd order.

With regard to the postulates formulated at the beginning of this article, we can conclude that postulate 2, according to which the probability of retention for a journal as an authoritative source of papers (1st order retentivity) is inversely dependent on the time interval, is true for both direct and reverse retentivity.

Postulate 3 states that, for a fixed time interval, the probability of retention for a journal as an authoritative source of papers (1st order retentivity) increases when the number of journals in the starting list is smaller than the number in the list to which it is compared, and, conversely, that this probability decreases when the number of journals in the starting list is larger. In both these cases, the suppositions were confirmed.

As postulates 2 and 3 are elaborations of postulate 1, this postulate is also not in contradiction with the real process of retentivity of journals as the most authoritative sources of scientific papers.

As to postulate 4, according to which postulates 2 and 3 are also valid for the 2nd, 3rd and 4th orders of retentivity, we can only partly confirm its validity. The generalization of postulate 2 to the retentivity of the 2nd, 3rd and 4th orders is found to be valid, both in the case of direct and of reverse retentivity. However, this is not true for postulate 3 in the case of reverse retentivity. Indeed, the ratio between the quantitative characteristics of the two compared lists has practically no influence on the reverse retentivity for the given characteristic.

CONCLUSIONS

1. The article analyzes the lists of journals from the JCR SE for the period of 16 years between 1995 and 2010. The results of this analysis indicate that the statistical characteristics of these sets of scientific journals, which are the starting point for the construction of most scientometric indicators, are sufficiently stable. This in turn suggests that it is reasonable to use these indicators to assess trends in science.

2. We constructed a mathematical model of the process of change over time of the retentivity of journals as the most authoritative sources of scientific papers. The model is based on postulates that connect the probability of a journal’s retention with the time interval between the two lists of journals that are compared, and with the ratio of the sizes of these lists: (1) the probability of retention for journals falls with increasing time interval; (2) the probability of retention of a journal from a shorter list in a longer list is higher than the probability of retention of a journal from a longer list in a shorter list.

3. We introduced the concept of retentivity of journals as the most authoritative sources of scientific papers, as well as the concepts of retentivity order and retentivity direction. There are four retentivity orders, each of which has its own quantitative characteristic. We define the 1st order as retentivity by the number of journals in the given list; the 2nd order as retentivity by the sum of the number of articles published by the journals from a given list; the 3rd order as retentivity by the sum of the impact-factor values of the journals from a given list; the 4th order as retentivity by the sum of the expected response values for the articles which have been published in the journals from a given list.


4. The concepts of direct and reverse retentivity were introduced. When journals are retained in a later list, we call this direct retentivity. We speak of reverse retentivity when journals from a list are also present in an earlier list.

5. Comparing the model with empirical data shows that it gives quite a good description of the process of change of retentivity with time. The validity of the postulates which were used to build the model was also checked.

We found that, indeed, the retentivity falls as the time interval between two lists increases, both in the case of direct and of reverse retentivity.

The statement about the inverse dependence of direct retentivity on the ratio of the sizes of the compared journal lists is true. Moreover, the effect of this dependence is significant. The same holds for all orders of direct retentivity.

For reverse retentivity it is true only for the 1st order; in all other cases of reverse retentivity we find only minimal impact. The reason for this seemingly paradoxical finding lies in the fact that a substantial set of journals, characterized by high productivity and/or a high academic level (high impact factor value) and, consequently, of great influence on the world of science (high expected response value), form a stable core in the JCR SE lists; the probability of their inclusion depends very little on any external factors.

6. An analysis of the mathematical structure of the model and its direct comparison with empirical data shows that the negative correlation of the retentivity with the time interval is nonlinear, i.e. the absolute value of the increment of retentivity change decreases with time and, therefore, the dependence of the retentivity on the time interval fades, albeit slowly.

7. The changes in the size of the lists and in the other numerical characteristics are nonlinear as well (this also follows from the structure of the model and from the comparison between the model and empirical data).

8. The effect of an increasing time interval between lists is always stronger in cases of reverse retentivity than in the corresponding cases of direct retentivity.

9. We were able to draw a number of additional conclusions from the model and its comparison with empirical data. Thus, for reverse retentivity by the number of journals, we found a considerable influence of factors not covered by the postulates, which we assume to be of a socio-economic nature (let us call them external). In contrast, the influence of these factors on the 2nd, 3rd and 4th orders of reverse retentivity is minimal. The influence of external factors is insignificant for all orders of direct retentivity. This fact is indirect evidence that the core of the world’s scientific journals, which was formed before the period we studied, is quite stable and very little influenced by external factors. Indeed, 70.5% of the full list in 1995 (3259 out of 4623 journals) has been included in every subsequent list, despite a very significant increase in the size of subsequent JCR SE journal lists. The recognition of these core journals is confirmed by the fact that they have significantly higher values of productivity (number of articles), scientific level (impact factor values) and degree of influence on the world of science (expected response values) than the average journal in the lists.
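The notions of retentivity order and direction summarized in conclusions 3 and 4 can be illustrated with a small Python sketch. It assumes that retentivity is the share of the starting list's total (number of journals, articles, impact factor or expected response) that belongs to journals also present in the compared list; the formal definition is given earlier in the article and is not reproduced here, so this normalization, the function name and the journal data are assumptions made for illustration.

```python
def retentivity(start: dict, other: dict) -> float:
    """Share of the starting list's total weight carried by journals
    that also appear in the other list (assumed normalization).

    Each list maps a journal title to its weight: 1 for the 1st order
    (count of journals), or the number of articles, the impact factor or
    the expected response for the 2nd, 3rd and 4th orders.
    """
    total = sum(start.values())
    retained = sum(w for journal, w in start.items() if journal in other)
    return retained / total


# Hypothetical journal lists with article counts as weights (2nd order).
earlier = {"A": 120, "B": 80, "C": 50}
later = {"A": 150, "B": 90, "D": 60, "E": 40}

direct = retentivity(earlier, later)    # earlier list retained in the later one
reverse = retentivity(later, earlier)   # later list already present in the earlier one

# 1st order: every journal gets weight 1.
direct_1st = retentivity(dict.fromkeys(earlier, 1), later)
reverse_1st = retentivity(dict.fromkeys(later, 1), earlier)

print(direct, reverse, direct_1st, reverse_1st)
```

In this toy example the shorter, earlier list is retained in the longer, later list to a greater degree than the reverse, which is the pattern stated in the second postulate of conclusion 2.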

ACKNOWLEDGMENTS

The study was supported by the Russian Foundation for Humanities (project 12-03-00070).

REFERENCES

1. Garfield, E., Citation indexes for science: A new dimension in documentation through association of ideas, Science, 1955, vol. 122, pp. 108–111.

2. Garfield, E., A century of citation indexing. Keynote address, Proc. 13th Int. Conf. on Webometrics, Informetrics and Scientometrics and COLLNET Meeting, Istanbul, 2011, pp. 20–23.

3. Johnstone, M.J., Journal impact factors: implications for the nursing profession, Int. Nursing Rev., 2007, vol. 54, pp. 35–40.

4. Leydesdorff, L. and Wagner, C., Macro-Level Indicators of the Relations between Research Funding and Research Output. http://www.leydesdorff.net/roadmap/roadmap.pdf

5. Braun, T., Glanzel, W., and Schubert, A., A Hirsch-type index for journals, Scientometrics, 2006, vol. 69, pp. 169–173.

6. Glanzel, W., On the h-index - a mathematical approach to a new measure of publication activity and citation impact, Scientometrics, 2006, vol. 67, pp. 121–129.

7. Halfman, W. and Leydesdorff, L., Is inequality among universities increasing? Gini coefficients and the elusive rise of elite universities. http://www.loetleydesdorff.net

8. Bergstrom, C.T., Eigen factor as the value and prestige of scholarly journals, College Res. Libr. News, 2007, vol. 68. http://www.ala.org/ala/acr1pubs/crlnews/backissues2007/may07/egenfactor.cfm

9. Michels, C. and Schmoch, U., The growth of science and database coverage, Scientometrics, 2012, vol. 93, pp. 831–846.

10. Markusova, V.A., Quality of scientific journals and base criterions for including to information system Web of Science of Thomson Reuters Co., Acta Naturae, 2012, vol. 4, pp. 6–13.

11. Gilyarevsky, R.S., Mulchenko, Z.M., Terekhin, A.T., and Cherny, A.B., Experience with the study of the Science Citation Index, in Prikladnaya dokumentalistika (Applied Documentation), Moscow: Nauka, 1968, pp. 32–53.

12. Arapov, M.V. and Libkind, A.N., Change of productivity in regular sources of information, Nauchn.-Tekhn. Inform., Ser. 2, 1976, no. 10, pp. 3–14.

13. Libkind, A.N., One approach to study communication in science, Scientometrics, 1985, vol. 8, nos. 3–4, pp. 217–231.