www.jcimjournal.com/jim

September 2013, Vol.11, No.5 352 Journal of Integrative Medicine

● Methodology

An improved association-mining research for exploring Chinese herbal property theory: based on data of the Shennong’s Classic of Materia Medica

Rui Jin1,2, Zhi-jian Lin2, Chun-miao Xue2, Bing Zhang2

1. Department of Pharmacy, Beijing Shijitan Hospital, Capital Medical University, Beijing 100038, China
2. School of Chinese Materia Medica, Beijing University of Chinese Medicine, Beijing 100029, China

ABSTRACT: Knowledge Discovery in Databases is gaining attention and raising new hopes for traditional Chinese medicine (TCM) researchers. It is a useful tool in understanding and deciphering TCM theories. Aiming for a better understanding of Chinese herbal property theory (CHPT), this paper applied an improved association rule learning approach to analyze semistructured text in the book entitled Shennong’s Classic of Materia Medica. The text was first annotated and transformed into well-structured multidimensional data. Subsequently, an Apriori algorithm was employed to produce association rules after a sensitivity analysis of its parameters. From the 120 confirmed rules that described the intrinsic relationships between herbal property (qi, flavor and their combinations) and herbal efficacy, two novel fundamental principles underlying CHPT were acquired and further elucidated: (1) the many-to-one mapping of herbal efficacy to herbal property; (2) the nonrandom overlap between the related efficacy of qi and flavor. This work provides new knowledge about CHPT, which should be helpful for its modern research.

KEYWORDS: traditional Chinese medicine; Chinese herbal property theory; association rule learning; knowledge discovery; data mining

DOI: 10.3736/jintegrmed2013051
Jin R, Lin ZJ, Xue CM, Zhang B. An improved association-mining research for exploring Chinese herbal property: based on data of the Shennong’s Classic of Materia Medica. J Integr Med. 2013; 11(5): 352-365.
Received December 28, 2012; accepted April 15, 2013.
Open-access article copyright © 2013 Rui Jin et al.
Correspondence: Prof. Bing Zhang; Tel: +86-10-84738606; E-mail: [email protected]

1 Introduction

Knowledge discovery in databases (KDD), a relatively young and interdisciplinary field of computer science, has been adopted by many researchers in traditional Chinese medicine (TCM) in recent decades[1-3]. It is a useful tool for extracting underlying patterns from TCM datasets and translating this traditional medical system into scientific language. Increasingly, basic therapeutic principles from TCM have been explored and explained using KDD techniques. These advances feature in the modernization of the oriental medical system[4]. For example, Zhang et al[5] proposed a latent tree approach to study ZHENG differentiation with application to the kidney deficiency dataset. Ehrman et al[6] employed a random forest approach to analyze the efficacy classification of 8 411 phytochemical compounds from 240 Chinese herbs. Further, two novel network-based methods (distance-based mutual information model and network target-based identification of multicomponent synergy) were proposed to uncover the combination rules of TCM formulae by the team of Shao Li[7,8], who tried to understand TCM principles from the perspective of bioinformatics and networks. In fact, the KDD approach has already been considered to be a good tool for mining the therapeutic principles under the mysterious veil of TCM medical theories[1,9].

As the bridge between TCM diagnosis and treatment, Chinese herbal property theory (CHPT) is among the most important but unclear parts of TCM. CHPT incorporates philosophies and terminologies from Chinese meteorology and sociology into TCM concepts such as “cold” and “hot”. This terminology reflects a special kind of observation on the therapeutic effects or side effects of medicinal herbs based on human sensations, within the cultural history of Chinese medicinal practice. According to the records in the earliest extant classic of TCM pharmacology, Shennong’s Classic of Materia Medica (SCMM), CHPT provided the initial core concepts of herbs having specific qi and flavor properties[10,11]. Reflecting the thermal response to herbal treatment, five different types of herbal qi were defined: cold, cool, neutral, warm and hot. Similarly, five herbal flavors, relating to taste sensation, were defined as pungent, sweet, sour, bitter and salty. Concentrating on herbal qi and flavor, CHPT has developed into an expansive theoretical framework that has guided the identification, preparation and clinical use of herbal medicines. It is also most focused on refining the efficacy of herbal treatment (named herbal efficacy), which indicates the capacity of a medicinal herb for therapeutic or toxic effects. While CHPT has been used and explored by TCM practitioners for thousands of years, it is an evolving philosophy that has relied on the accumulation of practical and cultural experience throughout its history.

Along with TCM modernization, recent interest has focused on CHPT and its scientific explanation. Since 2005, two large research programs specializing in CHPT have been approved by the Chinese government and supported by the National Basic Research Program (China 973 Program)[12,13]. Some research has focused on finding the therapeutic effects and biomedical markers that match a particular herbal property. For example, researchers working with a wide range of experimental models have reported differences between hot and cold qi in response to small-scale herbal interventions (e.g., behavioral response in mice[14], shifts in cultured neural cell function[15], the growth-thermogram curve of Escherichia coli[16], and the characteristics of the neuroendocrine-immune network[17]). However, defining a global relationship between herbal property and curative effects remains an open problem that has been noticed by only a few researchers. Among them, Xiao et al[18] and Zhou et al[19] reported the relationship between herbal property (qi, flavor and channel tropism) and clinical function based on a data-mining method and a characterization-modeling method, respectively. They collected many verb-object phrases in TCM as theoretical “function terms” but performed little analysis on the resulting relations. Therefore, the fundamental principles of CHPT buried in these datasets are still undescribed and require further attention.

In our previous work, we collected information from SCMM and mined some rules of association, following standard procedures[20]. After incorporating more data into the SCMM dataset, we have been able to expand and improve our data-mining experiment by conducting a thorough parameter analysis and translating the mining results into two fundamental principles. The improvements are as follows:

(1) Data structure analysis: the semistructured characteristics of text in SCMM were analyzed. It is helpful for understanding CHPT.

(2) Further data integration: the synonyms of TCM terminologies were further integrated into a unified statement.

(3) Experimental parameter analysis: two familiar parameters of association rules, support and confidence, were analyzed before their thresholds were determined. Another parameter, lift, was also introduced for evaluating the resulting rules.

(4) Validation of the results: the resulting strong rules were compared to the TCM theory for validation.

(5) Analysis of mining results: the results were translated into fundamental rules underlying CHPT, which should be of benefit for its understanding.

The rest of the paper is organized as follows. Section 2 proposes the concept of semistructured data in SCMM, describes the process of structured data extraction and multidimensional table construction to allow the data to be clearly arranged and easily manipulated. Section 3 presents the association rule mining experiment with method introductions, parameter analysis, results and validations. Section 4 contains an elucidation of fundamental principles learned from the results, and Section 5 provides a discussion of these fundamental principles.

2 Data extraction from semistructured text

2.1 Analysis of data structures

As a great classic of TCM pharmacology, SCMM collects 365 types of Chinese herbal medicines (including medicinal plants, animals and minerals). The text of the book describes medicinal names, origins, properties (qi and flavor) and efficacy. More than 170 kinds of diseases which belong to internal medicine, surgery, gynecology and pediatrics were also discussed in SCMM[21]. Most of the recorded herbs are still commonly used, such as Mahuang (Ephedrae herba), Rougui (Cinnamomi cortex), Chaihu (Bupleuri radix), Huangqin (Scutellariae radix), Huanglian (Coptidis rhizoma), Qinghao (Artemisiae annuae herba), Dahuang (Rhei radix et rhizoma), Fuzi (Aconiti lateralis radix praeparata) and Renshen (Ginseng radix et rhizoma). Their efficacy has been proved by long-term clinical practice and, in some cases, modern scientific research[22].

In terms of data organization, the text in SCMM can be defined as semistructured data because it contains mixed sentences and semantic markers[2,23]. On one hand, herbal names can be used for dividing the whole text into 365 herb records; some semantic elements, including the Chinese words “Wei” (followed by herbal flavor), “Zhu” (followed by herbal efficacy), “Yi-ming” (followed by alias) and “Sheng” (followed by place of origin), can be used to identify distinct aspects of the content in each herbal record. That is to say, each herbal record in SCMM follows a format that can be divided into six sequential parts: herbal name, herbal flavor, herbal qi, herbal efficacy, alias and place of origin. On the other hand, the efficacy information of herbs is practice-oriented data that is text-heavy, full of TCM synonyms and lacking a definite structure. It needs to be annotated with specialized data cleaning and integration. Figure 1 presents an analysis of the data structure of ginseng in SCMM, showing all six content parts. In this paper, the first four parts were selected to construct a table for data management and further mining work. In the model shown in Table 1, each row corresponds to a single herbal medicine and each column represents a separate piece of herbal information. The table describes the main framework of the text information in SCMM, which contains a unique identifier (Herb ID), herbal name, herbal qi, herbal flavor and herbal efficacy.
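The marker-based splitting described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record string is a hypothetical romanized stand-in (it is not quoted from SCMM), and the function names `split_record` and `to_row` are invented for this sketch rather than taken from the authors' workflow.

```python
import re

# Hypothetical romanized stand-in for one herb record; the wording is
# illustrative only and is not quoted from the book.
RECORD = ("Renshen Wei sweet, cool. Zhu tonifying the middle qi; improving vision. "
          "Yi-ming Renxian. Sheng mountain valleys.")

# Semantic markers named in the text: flavor, efficacy, alias, place of origin.
MARKERS = ("Wei", "Zhu", "Yi-ming", "Sheng")


def split_record(text):
    """Split one record into named parts, using the markers as delimiters."""
    pattern = r"\b(" + "|".join(re.escape(m) for m in MARKERS) + r")\b"
    pieces = re.split(pattern, text)  # [name, marker, body, marker, body, ...]
    parts = {"name": pieces[0].strip()}
    for marker, body in zip(pieces[1::2], pieces[2::2]):
        parts[marker] = body.strip(" .")
    return parts


def to_row(herb_id, parts):
    """Build a Table 1 style row from the first four parts of a record."""
    flavor, qi = [s.strip() for s in parts["Wei"].split(",")]
    efficacy = [s.strip() for s in parts["Zhu"].split(";")]
    return {"herb_id": herb_id, "name": parts["name"],
            "qi": qi, "flavor": flavor, "efficacy": efficacy}


row = to_row(43, split_record(RECORD))
print(row)
```

In practice the efficacy phrases would still need the synonym integration step before they can serve as attributes; the sketch stops at the raw split.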

Table 1 Data model of the book entitled Shennong’s Classic of Materia Medica

Herb ID | Herbal name | Herbal qi | Herbal flavor | Herbal efficacy
43 | Renshen (Ginseng radix et rhizoma) | Cool | Sweet | Tonifying the middle qi, nourishing essence-spirit, settling soul and spirit, tranquilizing, removing pathogenic qi, improving vision, enhancing the wisdom, and promoting longevity

Figure 1 The semistructured text of ginseng in the book entitled Shennong’s Classic of Materia Medica
This semistructured record was divided by the underlined semantic elements into six parts: ① herbal name, ② herbal flavor, ③ herbal qi, ④ herbal efficacy, ⑤ alias and ⑥ place of origin.

2.2 Construction of multidimensional table

Further, by splitting these characteristics into limited field categories, the last three fields of Table 1 can be considered as three separate sets (dimensions) of descriptive attributes belonging to herbal medicines. A particular combination of candidate attributes in the three dimensions can be used to locate an herb. One attribute of the qi dimension, one attribute of the flavor dimension and several attributes of the efficacy dimension would serve as markers of a single herbal medicine. To transform the unstructured text describing efficacy information into structured data, we completed a specialized data preprocessing step for settling all candidate attributes and ensuring accuracy and consistency in the explanations of Chinese vocabulary. Ancient and present-day reference books including the Treatise on the Pathogenesis and Manifestations of All Diseases[24], Internal Medicine of TCM[25], Surgery of TCM[26], Obstetrics and Gynecology of TCM[27] and two proofreading and annotation books for SCMM[28,29] were employed to identify the synonyms of efficacy terminologies, which were integrated into a unified statement (Table 2). After that, the information of the resulting table was as follows:

(1) Records: all 365 herbal medicines were contained with 6 of them missing herbal qi and/or flavor.

(2) Qi dimension: 5 attributes were identified, including cold, cool, neutral, warm and hot.

(3) Flavor dimension: 5 attributes were identified, including pungent, sweet, sour, bitter and salty.

(4) Efficacy dimension: 182 attributes were identified, including tonifying the middle qi, clearing away heat, improving vision, relieving cough with dyspnea, curing aggregation-accumulation, treating sore and ulcer, etc.

Once the attributes in the three dimensions were defined, a three-dimensional table (Table 3) was constructed in a Microsoft Excel file. Medicinal herbs were located in the table using Boolean values (0, 1). A value of 0 means that the medicine did not have the corresponding attribute, and 1 means that it did. Using ginseng in Figure 1 as an example, the value of the cell identified by the row of ginseng and the column (attribute) of cool was 1, while the other values in the herbal qi dimension were 0, because the herbal qi of ginseng was cool.

3 Association rule mining experiment

3.1 Data-mining methods

Focusing on the relationships between herbal property and efficacy, interdimensional association rules instead of intradimensional ones were mined through the entire database in this work. That is to say, this method extracted sets of items in the efficacy dimension that often occurred in the herbal medicines containing a particular qi or flavor attribute. A formal statement of the problem is as follows.

Table 2 Data integration solutions

Unified name | Synonyms in Chinese
Promoting longevity | 轻身 /a 延年 / 不老 / 神仙
Nourishing essence-spirit | (安)b 养(精)神 / 安心
Settling soul and spirit | (定/强)安魂魄
Tranquilizing | 止(定)惊悸(气)
Harmonizing five Zang | 安(定/和)五脏
Preventing from pathogenic qi | 辟(邪)恶气(不祥)
Nourishing brain marrow | (强/填)补髓(脑)
Enhancing the wisdom | 益(增)智(慧)
Relieving heat vexation and fullness | 烦热 / 烦满 / 大烦
Strengthening muscles and bones | 坚骨齿(强骨节) / 坚(强)筋骨
Tonifying the middle qi | 补中 / 补五脏 / 补内 / 益中气
Curing convulsive disease | 痉 / 瘈疭 / 项背强急
Curing wind stroke | 偏枯 / 中风 / 卒中
Clearing away heat | 除热 / 主身(大/暴)热 / 热气
Tonifying qi | 益气(力) / 益精气 / 益(脾/肾)气
Treating fright palpitation | 惊痫 / 惊(悸)
Curing malaria | 温疟 / 痎疟 / 疟
Curing epilepsy | 癫痫 / 痫
Curing impediment disease | (寒/风/湿/周/血)痹 / 痹气 / 四肢拘挛 / 机关缓急 / 屈伸不利 / 胫重酸痛 / 骨节痛 / 膝痛
Curing consumptive disease due to overexertion | 虚劳 / 劳极 / 羸瘦 / 五劳七伤
Curing flaccidity disease | 痿躄 / 四肢偏痿 / 四肢重弱
Activating joint | 通(利)百(关)节
Unblocking the blood meridian | 通(保)血脉 / 逐血 / 通血气
Relieving headache and dizziness | 风头脑动 / 头眩痛 / 风入脑户
Improving complexion | 面生光 / 润泽 / 和颜色 / 媚好
Curing throat impediment | 喉痹 / 咽喉肿痛
Relieving cough with dyspnea | 欬(咳)逆上气 / (胸胁)逆气
Eliminating flooding and spotting | 下血 / 漏下 / 崩中
Removing water retention | (除/逐/下)消(腹中)水(气) / 肢体浮肿
Resolving hard mass in stomach and intestine | 荡胃中积聚 / 涤积聚饮食 / 逐六腑积聚
Curing infertility | 绝子 / 无子 / 不孕 / 令人有子
Curing constipation | 胃胀闭 / 厌谷胃闭(痹)
Promoting digestion | 消(化/利)食(水谷)
Curing diarrhea | 泄澼 / 肠(泄)澼 / 泄痢
Curing dysentery | 下痢脓血 / 下痢赤白
Curing vaginal discharge | 漏下赤白 / 白沃 / 带下赤白
Curing strangury disease | 淋 / 气癃闭 / 溺不下 / 小便余沥 / 膀胱热
Curing pudendal sore | 阴蚀 / 阴疮 / 阴中肿痛
Expelling and killing worms | 杀三虫 / 去长虫 / 去白虫
Treating unhealed sore | 恶疮 / 久败疮
Treating sore and ulcer | 痈 / 痈肿 / 疽 / 疡 / 伤热火烂 / 赤熛 / 浸淫疮
Treating polyp | 恶肉 / 死肌 / 息肉
Treating blood amassment | 瘀(留/止)血 / 恶血
Curing aggregation-accumulation | 癥瘕积聚 / 留固结癖 / 坚瘕 / 癥坚(痞) / 血瘕(积)
Curing goiter and tumor | 瘿瘤 / 瘰疬 / 颈下核 / 鼠瘘
Curing scabies | 疥瘙(癣) / 痂疥
Removing toxicity | 解(诸)毒 / 鉤吻 / 鸩羽 / 蛇螫 / 蜂 / 猘狗 / 菜 / 肉 / 虫毒
Treating strange diseases caused by ghost | 蛊毒 / 精魅邪鬼 / 魇寐寤

a The symbol “/” separates the synonyms. b The words in brackets appear at times.

Let Q={q1, q2, ..., q5} be the set of items (attributes) in the qi dimension, F={f1, f2, ..., f5} be the set of items in the flavor dimension, and E={e1, e2, ..., e182} be the set of items in the efficacy dimension. Let T={t1, t2, ..., t365} be the set of transactions (herbal medicines), where each transaction ti contains a nonempty subset of items chosen from Q, F and E. A transaction tj is said to contain an itemset X of items in Q if X⊆tj. Thus, an interdimensional association rule is an implication of the form X→Y, where X⊆Q, Y⊆E (or X⊆F, Y⊆E, etc.) and X∩Y=∅. The left-hand side of the rule is called the antecedent and the right-hand side is called the consequent. In this work, only itemsets that contained one item were considered in each of the three dimensions, and the antecedent and consequent of a rule came from different dimensions, so as to simplify the interpretation of the relations between each herbal property and efficacy. Table 4 shows the defined formats of rules in this association analysis.
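To make the formal statement more concrete, the following Python sketch represents one herb as a transaction of (dimension, attribute) items and enumerates the candidate rules it can contribute under the six forms of Table 4. It is an illustration only: the tagging scheme with "Q", "F" and "E" and the function names are assumptions of this sketch, and only two of ginseng's efficacy attributes from Table 1 are used to keep the output short.

```python
from itertools import product


def to_transaction(qi, flavor, efficacy):
    """Represent one herb as a set of (dimension, attribute) items."""
    items = {("Q", qi), ("F", flavor)}
    items.update(("E", e) for e in efficacy)
    return frozenset(items)


def candidate_rules(transaction):
    """Enumerate (antecedent, consequent) pairs in the six forms of Table 4."""
    q = [i for i in transaction if i[0] == "Q"]
    f = [i for i in transaction if i[0] == "F"]
    e = [i for i in transaction if i[0] == "E"]
    rules = []
    for qi, ei in product(q, e):
        rules.append((frozenset([qi]), frozenset([ei])))      # form 1: Q -> E
        rules.append((frozenset([ei]), frozenset([qi])))      # form 4: E -> Q
    for fi, ei in product(f, e):
        rules.append((frozenset([fi]), frozenset([ei])))      # form 2: F -> E
        rules.append((frozenset([ei]), frozenset([fi])))      # form 5: E -> F
    for qi, fi, ei in product(q, f, e):
        rules.append((frozenset([qi, fi]), frozenset([ei])))  # form 3: Q and F -> E
        rules.append((frozenset([ei]), frozenset([qi, fi])))  # form 6: E -> Q and F
    return rules


# Ginseng as in Table 1, restricted to two efficacy attributes for brevity.
ginseng = to_transaction("cool", "sweet",
                         ["tonifying the middle qi", "improving vision"])
rules = candidate_rules(ginseng)
print(len(rules))
```

Because every antecedent and consequent is drawn from a different dimension, the condition X∩Y=∅ holds by construction.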


Given the association rule X→Y, its quality is often measured by two parameters, support and confidence[30]. The measure “support” gives the proportion of transactions in the dataset that contain both X and Y (Formula 1), and the measure “confidence” gives the proportion of transactions containing Y among those that contain X (Formula 2). Further, a measure named lift was also employed in this work to evaluate the correlation between the antecedent and consequent of a rule. It is defined as the confidence of a rule divided by the support of the consequent (Formula 3). Their symbols and calculations are as follows:

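Support: $s(X \rightarrow Y) = \frac{\sigma(X \cup Y)}{N}$ (1)

Confidence: $c(X \rightarrow Y) = \frac{\sigma(X \cup Y)}{\sigma(X)}$ (2)

Lift: $l(X \rightarrow Y) = \frac{c(X \rightarrow Y)}{s(Y)} = \frac{N\,\sigma(X \cup Y)}{\sigma(X)\,\sigma(Y)}$ (3)

where the symbol σ( ) denotes the number of transactions which contain the particular itemset, N is the total number of transactions, and s(Y)=σ(Y)/N is the support of the consequent.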
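The three measures defined above can be computed directly from a list of transactions. The sketch below is a minimal illustration, not the authors' implementation: the four toy transactions are hypothetical and far smaller than the SCMM dataset, and the function names are chosen for this example.

```python
def sigma(itemset, transactions):
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for t in transactions if itemset <= t)


def support(x, y, transactions):
    """Proportion of transactions containing both X and Y."""
    return sigma(x | y, transactions) / len(transactions)


def confidence(x, y, transactions):
    """Proportion of transactions containing Y among those containing X."""
    return sigma(x | y, transactions) / sigma(x, transactions)


def lift(x, y, transactions):
    """Confidence of the rule divided by the support of the consequent."""
    support_y = sigma(y, transactions) / len(transactions)
    return confidence(x, y, transactions) / support_y


# Hypothetical toy data: four herbs, each a set of attributes.
toy = [
    {"cold", "clearing away heat"},
    {"cold", "clearing away heat"},
    {"cold"},
    {"hot", "promoting sweating"},
]
x, y = {"clearing away heat"}, {"cold"}
print(support(x, y, toy), confidence(x, y, toy), lift(x, y, toy))
```

On this toy data the rule "clearing away heat" → "cold" holds in every transaction that contains the antecedent, so its confidence is 1, and its lift exceeds 1 because only three of the four herbs are cold.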

Support and confidence are the most common measures related to a rule. Their thresholds are used to control the number and quality of the generated rules. However, some associations among uncorrelated elements can be generated using this “support-confidence” framework. In this case, lift is added to further assess the quality of a rule. A rule with a lift greater than one predicts its consequent better than random chance. The Apriori algorithm that we used in this paper for determining association rules was proposed by Agrawal and Srikant in 1994, and has been widely employed in biomedical and TCM research[20, 31-33].

3.2 Parameter analysis

Suitable support and confidence thresholds allow researchers to identify strong, well-supported rules among the many weak and less predictive rules that may emerge from this kind of analysis. For example, a rule with a high support value reflects frequent co-occurrence of its items, which should involve efficacy that is common in TCM clinics. A high confidence value means the association described by a rule is predictive of a pattern. However, excessively high parameter thresholds would lead to overlooking interesting associations. To optimize the parameter thresholds, we performed association rule mining on a representative data sample that comprised the cold attribute in the qi dimension and all attributes in the efficacy dimension, while varying the threshold levels of support and confidence. The sensitivities of these parameters were then analyzed.

First we discuss the sensitivity of the support parameter. Figure 2(a) shows the number of 1-itemsets (cardinality k=1) in the efficacy dimension when the minimum support count (minisup) ranged from 1 to 30. As can be seen from the upper solid line, the support values of almost half of the attributes in the efficacy dimension were less than 5, and only 30 attributes had support greater than 30. When candidate 2-itemsets were produced from one of the frequent items {E(ei)|σ(ei)≥minisup} in the efficacy dimension and the most frequent item {Q(cold)} in the qi dimension, their number was smaller and decreased rapidly to 2 at the minisup of 30, as represented by the lower solid line (Figure 2(a)). These itemsets occurred so infrequently that a high minisup threshold would remove many expected interesting associations. Meanwhile, some items with very low support levels, such as the large number of attributes that appeared only once in the efficacy dimension, may occur simply by chance. Thus, minisup was set at 3, resulting in 67 frequent 2-itemsets.

Next, confidence and lift were calculated for 134 association patterns, in the forms of {Q(cold)→E(ei)} and {E(ei)→Q(cold)}, developed from the 67 frequent 2-itemsets (Figure 2(b)). This graph shows the strong positive relationship between the confidence and lift values of these association patterns. The confidence of the {E(ei)→Q(cold)} patterns presented a linear increase from less than 20% to 100%. Among these patterns, 16 were strong rules with confidence >45% and lift >1.5. In contrast, most of the {Q(cold)→E(ei)} associations were scattered irregularly in the area below 20% confidence. Therefore, we set the minimum confidence percentage (miniconf) at 45% to produce association rules efficiently, which also gave the rules a high lift value. Once the minisup and miniconf were defined, the method generated all association rules that satisfied the named forms.

Figure 2 The sensitivity of parameters
(a) presented the number of itemsets along with the increasing support threshold and (b) presented the confidence distribution of frequent patterns with the minisup of 3.

Table 3 Three-dimensional table of data in SCMM with Boolean values

Herb ID | Herbal name | Herbal qi: Cold | Cool | Neutral | Warm | Hot | Herbal flavor: Pungent | Sour | Sweet | Salt | Bitter | Function: Promoting longevity | Removing pathogenic qi | Tonifying qi | Curing impediment disease | Clearing heat | Improving vision
1 | Yuquan | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0
2 | Dansha | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 1
…… | …… | …… | …… | …… | …… | …… | …… | …… | …… | …… | …… | …… | …… | …… | …… | …… | ……
41 | Juhua | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 1 | 0 | 0
42 | Gancao | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 0
43 | Renshen | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 1
44 | Shihu | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0
…… | …… | …… | …… | …… | …… | …… | …… | …… | …… | …… | …… | …… | …… | …… | …… | …… | ……
364 | Maxianhao | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0
365 | Fubi | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0

All attributes in qi and flavor dimension and 11 attributes in efficacy dimension were chosen for display. SCMM: Shennong’s Classic of Materia Medica.

Table 4 The specified forms of association analysis

No. | Form | Implication
1 | {Q(q)→E(e)} | The occurrence probability of efficacy e when the herbal property is given
2 | {F(f)→E(e)} | The occurrence probability of efficacy e when the herbal property is given
3 | {Q(q)∧F(f)→E(e)} | The occurrence probability of efficacy e when the herbal property is given
4 | {E(e)→Q(q)} | The probable herbal property that is inferred from the efficacy e
5 | {E(e)→F(f)} | The probable herbal property that is inferred from the efficacy e
6 | {E(e)→Q(q)∧F(f)} | The probable herbal property that is inferred from the efficacy e

3.3 Mining results

In this work, the association rule algorithm was implemented in three cases corresponding to the six forms (Table 4). Association rules were extracted with the minisup of 3, the miniconf of 45% and a minimum lift of 1. At these thresholds, 120 rules involving approximately 80 kinds of efficacy were obtained. All of these association rules were for itemsets containing just two items. These rules were unevenly distributed over the parameter space and most of them included particular attributes such as hot-qi, neutral-qi, cold-qi, pungent-flavor, sweet-flavor and bitter-flavor. Further, the number of rules in which efficacy attributes appeared in the antecedent and the above qi or flavor attributes in the consequent was far greater than the number of rules with the positions interchanged (Figure 3). Table 5 lists all the resulting association rules. All rules were numbered and described with support, confidence and lift values.
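The thresholds reported for the mining step (minisup of 3, miniconf of 45% and a minimum lift of 1) amount to a simple filter over candidate rules, sketched below in Python. This is an illustration under assumptions: the `Rule` type and `strong_rules` function are invented for the sketch, treating each threshold as inclusive is an assumption, the first two rules take their values from Table 5 (rules No. 1 and No. 35), and the third rule is hypothetical and added only to show a rejection.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    antecedent: str
    consequent: str
    confidence: float
    support_count: int
    lift: float


def strong_rules(rules, minisup=3, miniconf=0.45, minlift=1.0):
    """Keep rules that meet all three thresholds (treated as inclusive here)."""
    return [r for r in rules
            if r.support_count >= minisup
            and r.confidence >= miniconf
            and r.lift >= minlift]


candidates = [
    Rule("promoting sweating", "hot", 0.750, 3, 3.380),   # Table 5, rule No. 1
    Rule("clearing away heat", "cold", 0.500, 32, 1.843),  # Table 5, rule No. 35
    Rule("hypothetical efficacy", "cold", 0.300, 2, 1.100),  # invented example
]
kept = strong_rules(candidates)
print([r.antecedent for r in kept])
```

Raising `minisup` or `miniconf` in the call shows the trade-off discussed in the parameter analysis: stricter thresholds shrink the rule set and can drop infrequent but interesting associations.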

3.4 Validation of results

For a given rule X→Y, the higher the confidence, the more likely it is for Y to be present in the transactions that contain X. Hence for the resulting rules in the forms of {E(e)→Q(q)}, {E(e)→F(f)} and {E(e)→Q(q)∧F(f)}, the higher the confidence, the more likely it was for the property q, f or their combination to be present in the herbs that had the efficacy e. In these strong association rules, the efficacy e was probably a reason for judging the corresponding property. For example, the results showed that cold qi can be inferred from the efficacy of resolving hard mass in stomach and intestine (see rule No. 29). This suggests that a medicinal herb that can resolve hard mass in the stomach and intestine of the human body is more likely to have the property of cold qi. Generally, TCM theory states that cold qi has the actions of descending and purging, and can be used in treating accumulations in the stomach and intestine by purgation. Therefore, there was a good match between this association rule and TCM theory. In the present approach, all the rules resulting from our analysis were checked for conformity with TCM theory and clinical experience (Table 6). As we can see, more than half of the rules were successfully classified into proper groups of the actions of matching herbal properties in TCM theory.

4 Principles learned from the results

The above resulting rules elaborated the relationships between herbal property and efficacy, which not only suggested the efficacy meanings corresponding to a particular herbal property, but also helped us to explore the global structure of CHPT. When we revisited these rules from a mathematical point of view, some general principles that were shared by all herbal property attributes emerged. They were underlying principles with a seemingly simple essence but rich manifestations, which are elucidated as follows.

4.1 The many-to-one mapping of herbal efficacy to herbal property

As we mentioned in part 3.4, some herbal efficacy attributes are strongly linked to qi and flavor of an herb, whose definition and judgment could rely on these attributes. For example, “resolving hard mass in stomach and intestine” is frequently associated with herbs having cold qi (rule No. 29), and its treatment may call for Dahuang. Clearing away heat is also frequently associated with herbs having the property of cold qi (see rule No. 35), and its treatment may call for Zhizi (Gardeniae fructus). Noticeably, one herbal property would be inferred by more than one kind of efficacy, because many efficacy attributes can be the antecedent of these strong rules with only one herbal property in the consequent. This phenomenon was observed for six herbal properties including cold qi, neutral qi, hot qi, pungent flavor, sweet flavor, and bitter flavor. It was also noted there were no two herbal properties in one dimension sharing the same efficacy attribute. For example, none of the antecedents of the rules with cold qi in the consequent appeared in the rules with neutral qi or hot qi in the consequent. This was also observed for all the above six herbal properties. We defined these phenomena as an abstract principle, namely a many-to-one mapping of herbal efficacy to herbal property. Figure 4 illustrates a five-to-one mapping of herbal efficacy to cold qi with some common medicinal herbs. As we can see, these herbs had distinct ways of “expressing” their cold qi. For example, the cold qi of Zhizi and Cheqianzi (Plantaginis semen) should develop from the efficacy of clearing away heat and regulating the waterways respectively, while the cold qi of Dahuang should develop from both regulating the waterways and resolving hard mass in stomach and intestine. The actions represented by the same “cold qi” were different in herbs with cold property.
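The many-to-one property described above can be checked mechanically on a rule list: each efficacy antecedent should point to at most one property within a dimension, while one property may collect several efficacy attributes. The Python sketch below is an illustration only. The function names are invented here, the first three rules are taken from the text and Table 5 (rules No. 29, No. 35 and No. 1), and the conflicting rule at the end is hypothetical, added to show what a violation would look like.

```python
from collections import defaultdict


def efficacies_by_property(rules):
    """Group efficacy antecedents by the (dimension, property) they point to."""
    grouped = defaultdict(set)
    for efficacy, dimension, prop in rules:
        grouped[(dimension, prop)].add(efficacy)
    return dict(grouped)


def mapping_violations(rules):
    """Return efficacy attributes mapped to more than one property in a dimension."""
    targets = defaultdict(set)
    for efficacy, dimension, prop in rules:
        targets[(efficacy, dimension)].add(prop)
    return {key: props for key, props in targets.items() if len(props) > 1}


rules = [
    ("resolving hard mass in stomach and intestine", "qi", "cold"),  # rule No. 29
    ("clearing away heat", "qi", "cold"),                            # rule No. 35
    ("promoting sweating", "qi", "hot"),                             # rule No. 1
]
print(efficacies_by_property(rules)[("qi", "cold")])
print(mapping_violations(rules))

# Hypothetical conflicting rule, not from the paper.
conflicting = rules + [("clearing away heat", "qi", "hot")]
print(mapping_violations(conflicting))
```

An empty result from `mapping_violations` corresponds to the observation that no two properties in one dimension share the same efficacy antecedent.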

Furthermore, considering both the diversity and the identity of medicinal herbs, we believe that the many-to-one mapping described in this paper is one of the most important frameworks underlying CHPT; it is also mentioned briefly in the work of Yao et al[34]. This principle demonstrates the wisdom of TCM in grasping the medicinal value of original herbs. It provides a logical approach to generalizing and unifying complex herbal therapeutic effects, that is, classifying similar herbal effects into a group labeled with an herbal property.

4.2 The nonrandom overlaps between the meanings of qi and flavor attributes

Figure 3 The proportion of association rules of specified forms. The proportion of rules with efficacy in the antecedent was 85.83%, while that of rules with efficacy in the consequent was 14.17%.

Table 5 The resulting association rules

| Rule | Antecedent | Consequent | Conf. | Supp. | Lift |
|---|---|---|---|---|---|
| 1 | Promoting sweating | Hot | 0.750 | 3 | 3.380 |
| 2 | Curing flaccidity disease | Hot | 0.667 | 4 | 3.004 |
| 3 | Warming the middle qi | Hot | 0.636 | 7 | 2.868 |
| 4 | Relieving headache and dizziness | Hot | 0.611 | 11 | 2.754 |
| 5 | Curing throat impediment | Hot | 0.600 | 6 | 2.704 |
| 6 | Promoting digestion | Hot | 0.600 | 3 | 2.704 |
| 7 | Activating joint | Hot | 0.500 | 4 | 2.253 |
| 8 | Regulating facial complexion | Hot | 0.471 | 8 | 2.121 |
| 9 | Relieving cough with dyspnea | Hot | 0.457 | 21 | 2.057 |
| 10 | Relieving lumbago | Neutral | 0.889 | 8 | 2.439 |
| 11 | Fei Jian | Neutral | 0.778 | 7 | 2.135 |
| 12 | Preventing abortion | Neutral | 0.750 | 3 | 2.058 |
| 13 | Settling soul and spirit | Neutral | 0.667 | 6 | 1.830 |
| 14 | Relieving darkish complexion | Neutral | 0.600 | 3 | 1.647 |
| 15 | Relieving difficulty of evacuating | Neutral | 0.600 | 3 | 1.647 |
| 16 | Showing tolerance of hungry | Neutral | 0.593 | 16 | 1.626 |
| 17 | Curing impotence | Neutral | 0.563 | 9 | 1.544 |
| 18 | Curing convulsive disease | Neutral | 0.556 | 10 | 1.525 |
| 19 | Curing hemorrhoid | Neutral | 0.526 | 10 | 1.444 |
| 20 | Strengthening will | Neutral | 0.526 | 10 | 1.444 |
| 21 | Tonifying the middle qi | Neutral | 0.500 | 23 | 1.372 |
| 22 | Harmonizing five Zang | Neutral | 0.500 | 8 | 1.372 |
| 23 | Unblocking the blood meridian | Neutral | 0.500 | 4 | 1.372 |
| 24 | Nourishing sperm | Neutral | 0.500 | 4 | 1.372 |
| 25 | Stopping bleeding | Neutral | 0.500 | 3 | 1.372 |
| 26 | Nourishing essence-spirit | Neutral | 0.462 | 6 | 1.267 |
| 27 | Promoting longevity | Neutral | 0.457 | 64 | 1.255 |
| 28 | Relieving reddened complexion | Cold | 1.000 | 4 | 3.687 |
| 29 | Resolving hard mass in stomach and intestine | Cold | 0.778 | 7 | 2.868 |
| 30 | Relieving diabetes | Cold | 0.778 | 7 | 2.868 |
| 31 | Leading to early abortion | Cold | 0.667 | 4 | 2.458 |
| 32 | Curing jaundice | Cold | 0.636 | 7 | 2.346 |
| 33 | Removing Fu Shi | Cold | 0.600 | 3 | 2.212 |
| 34 | Relieving deafness | Cold | 0.600 | 3 | 2.212 |
| 35 | Clearing away heat | Cold | 0.500 | 32 | 1.843 |
| 36 | Curing wind edema and distention | Cold | 0.500 | 7 | 1.843 |
| 37 | Curing diarrhea | Cold | 0.500 | 7 | 1.843 |
| 38 | Curing strangury disease | Cold | 0.483 | 14 | 1.780 |
| 39 | Relieving heat vexation and fullness | Cold | 0.467 | 7 | 1.721 |
| 40 | Curing fractures and sinew injury | Cold | 0.455 | 5 | 1.676 |
| 41 | Relieving blindness due to corneal opacity | Cold | 0.455 | 5 | 1.676 |
| 42 | Curing tympanites | Cold | 0.455 | 5 | 1.676 |
| 43 | Regulating the waterways | Cold | 0.452 | 19 | 1.668 |
| 44 | Neutral | Promoting longevity | 0.481 | 64 | 1.255 |
| 45 | Cool | Curing uterine obstruction | 0.600 | 3 | 1.502 |
| 46 | Killing animals such as fish and mice | Pungent | 1.000 | 3 | 3.724 |
| 47 | Promoting sweating | Pungent | 0.750 | 3 | 2.793 |
| 48 | Preventing from pathogenic qi | Pungent | 0.667 | 6 | 2.483 |
| 49 | Warming the middle qi | Pungent | 0.636 | 7 | 2.370 |
| 50 | Improving bright spirit | Pungent | 0.625 | 10 | 2.328 |
| 51 | Promoting digestion | Pungent | 0.600 | 3 | 2.235 |
| 52 | Directing qi downward | Pungent | 0.571 | 8 | 2.128 |
| 53 | Relieving itching | Pungent | 0.571 | 4 | 2.128 |
| 54 | Treating polyp | Pungent | 0.556 | 15 | 2.069 |
| 55 | Relieving headache and dizziness | Pungent | 0.556 | 10 | 2.069 |
| 56 | Curing Shan Jia | Pungent | 0.556 | 5 | 2.069 |
| 57 | Curing malaria | Pungent | 0.524 | 11 | 1.951 |
| 58 | Relieving cough with dyspnea | Pungent | 0.500 | 23 | 1.862 |
| 59 | Treating strange diseases caused by ghost | Pungent | 0.500 | 13 | 1.862 |
| 60 | Curing diarrhea | Pungent | 0.500 | 7 | 1.862 |
| 61 | Curing throat impediment | Pungent | 0.500 | 5 | 1.862 |
| 62 | Relieving borborygmus | Pungent | 0.500 | 3 | 1.862 |
| 63 | Curing scabies | Pungent | 0.471 | 8 | 1.753 |
| 64 | Relieving wind stroke | Pungent | 0.458 | 11 | 1.707 |
| 65 | Curing dysentery | Sweet | 1.000 | 4 | 4.620 |
| 66 | Showing tolerance of hungry | Sweet | 0.815 | 22 | 3.765 |
| 67 | Stopping bleeding | Sweet | 0.667 | 4 | 3.080 |
| 68 | Curing consumptive disease due to overexertion | Sweet | 0.600 | 22 | 2.772 |
| 69 | Nourishing muscles | Sweet | 0.600 | 6 | 2.772 |
| 70 | Relieving difficulty of evacuating | Sweet | 0.600 | 3 | 2.772 |
| 71 | Improving hearing | Sweet | 0.556 | 12 | 2.567 |
| 72 | Settling soul and spirit | Sweet | 0.556 | 5 | 2.567 |
| 73 | Fei Jian | Sweet | 0.556 | 5 | 2.567 |
| 74 | Nourishing essence-spirit | Sweet | 0.538 | 7 | 2.488 |
| 75 | Treating menstrual flooding and spotting | Sweet | 0.533 | 8 | 2.464 |
| 76 | Tonifying the middle qi | Sweet | 0.500 | 23 | 2.310 |
| 77 | Nourishing the deficiency | Sweet | 0.500 | 7 | 2.310 |
| 78 | Relieving many kinds of diseases | Sweet | 0.500 | 5 | 2.310 |
| 79 | Nourishing brain marrow | Sweet | 0.500 | 4 | 2.310 |
| 80 | Nourishing qi | Sweet | 0.482 | 40 | 2.227 |
| 81 | Regulating facial complexion | Sweet | 0.471 | 8 | 2.227 |
| 82 | Replacing old things with new things | Bitter | 1.000 | 3 | 2.765 |
| 83 | Nourishing wisdom | Bitter | 0.833 | 5 | 2.304 |
| 84 | Curing jaundice | Bitter | 0.818 | 9 | 2.262 |
| 85 | Curing uterine obstruction | Bitter | 0.800 | 4 | 2.212 |
| 86 | Resolving hard mass in stomach and intestine | Bitter | 0.778 | 7 | 2.151 |
| 87 | Relieving reddened complexion | Bitter | 0.750 | 3 | 2.074 |
| 88 | Curing tympanites | Bitter | 0.727 | 8 | 2.011 |
| 89 | Showing good memory | Bitter | 0.667 | 4 | 1.843 |
| 90 | Checking sweating | Bitter | 0.667 | 4 | 1.843 |
| 91 | Curing wind edema and distention | Bitter | 0.643 | 9 | 1.778 |
| 92 | Clearing away heat | Bitter | 0.609 | 39 | 1.685 |
| 93 | Removing water retention | Bitter | 0.563 | 9 | 1.555 |
| 94 | Relieving diabetes | Bitter | 0.556 | 5 | 1.536 |
| 95 | Curing hemorrhoid | Bitter | 0.526 | 10 | 1.455 |
| 96 | Treating sore and ulcer | Bitter | 0.500 | 33 | 1.383 |
| 97 | Relieving pain | Bitter | 0.500 | 8 | 1.383 |
| 98 | Curing flaccidity disease | Bitter | 0.500 | 3 | 1.383 |
| 99 | Unblocking the obstruction of qi | Bitter | 0.452 | 14 | 1.249 |
| 100 | Sour | Promoting longevity | 0.467 | 7 | 1.217 |
| 101 | Sweet | Promoting longevity | 0.785 | 62 | 2.046 |
| 102 | Sweet | Tonifying qi | 0.506 | 40 | 2.227 |
| 103 | Pungent-cold | Clearing away heat | 0.474 | 9 | 2.701 |
| 104 | Warming the middle qi | Pungent-hot | 0.545 | 6 | 4.424 |
| 105 | Sour-neutral | Promoting longevity | 0.571 | 4 | 1.490 |
| 106 | Sweet-hot | Promoting longevity | 0.643 | 9 | 1.676 |
| 107 | Sweet-hot | Tonifying qi | 0.500 | 7 | 2.199 |
| 108 | Sweet-warm | Promoting longevity | 0.600 | 3 | 1.564 |
| 109 | Sweet-warm | Tonifying qi | 0.600 | 3 | 2.639 |
| 110 | Sweet-neutral | Promoting longevity | 0.850 | 34 | 2.216 |
| 111 | Sweet-neutral | Tonifying qi | 0.525 | 21 | 2.309 |
| 112 | Showing tolerance of hungry | Sweet-neutral | 0.519 | 14 | 4.731 |
| 113 | Sweet-cool | Promoting longevity | 1.000 | 4 | 2.607 |
| 114 | Sweet-cold | Promoting longevity | 0.750 | 12 | 1.955 |
| 115 | Sweet-cold | Curing cold and heat | 0.500 | 8 | 1.984 |
| 116 | Bitter-cool | Clearing away heat | 0.471 | 8 | 1.227 |
| 117 | Resolving hard mass in stomach and intestine | Bitter-cold | 0.667 | 6 | 4.867 |
| 118 | Relieving diabetes | Bitter-cold | 0.556 | 5 | 4.056 |
| 119 | Curing jaundice | Bitter-cold | 0.545 | 6 | 3.982 |
| 120 | Curing tympanites | Bitter-cold | 0.455 | 5 | 3.318 |

Antecedent: left-hand side of a rule; Consequent: right-hand side of a rule; Conf.: confidence degree; Supp.: support count. Rules No. 1 to No. 43 have an efficacy attribute in the antecedent that is associated with an attribute in the qi dimension, and rules No. 44 and No. 45 have the efficacy attribute in the consequent. Rules No. 46 to No. 102 list the associations between efficacy and herbal flavor in the same way. The remaining 18 rules, between efficacy and the combinations of qi and flavor, are rules No. 103 to No. 120.

Distinct from formal logic, TCM has developed an approach under the influence of Chinese philosophy, which has led to the informal logic statements of CHPT[35]. From the perspective of TCM history, the development of herbal qi and flavor may not have been independent, as both are related to yin-yang theory and Wu-xing theory[11]. Thus, it was possible for the same efficacy attribute to be strongly related to both an attribute in the qi dimension and an attribute in the flavor dimension. However, the efficacy attributes related to one herbal qi tended to recur in associations with one particular flavor, a pattern that would be uncommon if the overlap were random. For example, the antecedents of the pungent flavor-related rules ({E(e)→F(pungent)}; see rules No. 47, No. 49, No. 51, No. 55, No. 58 and No. 61) were similar to the antecedents of the hot qi-related rules ({E(e)→Q(hot)}; see rules No. 1, No. 3, No. 4, No. 5, No. 6 and No. 9). These antecedents are commonly seen in TCM and had high support degrees; they were promoting sweating, warming the middle qi, curing throat impediment, promoting digestion, relieving headache and dizziness, and relieving cough with dyspnea. These 6 efficacy attributes strongly point to herbs with hot qi and pungent flavor. In the present work, we identify this association as a “hot-pungent bond”. Following the same approach, we also identified a “neutral-sweet bond” and a “cold-bitter bond” from the resulting rules (Table 7).

Of the 25 possible combinations of qi and flavor, only 3 strong “bonds” were identified with the present approach. These relationships appear to reflect nonrandom connections between the two herbal properties of qi and flavor. According to this finding, medicinal herbs, which can be viewed as combinations of qi and flavor, fall naturally into two theoretical groups. The first group is composed of herbs that follow a strong “bond” (hot-pungent, neutral-sweet or cold-bitter), where the efficacy attributes related to qi and to flavor are similar. We refer to them as “isotropic” herbs. The second group is composed of herbs whose qi and flavor are not in a strong “bond”, such as hot-sweet, neutral-bitter and cold-pungent herbs, where the related efficacy attributes differ. We refer to this group as “anisotropic” herbs. As shown in Figure 5, “isotropic” herbs, “anisotropic” herbs and their relationships can be depicted with a network method. We believe that an “isotropic” herb might have a synergistic and amplified effect, while an “anisotropic” herb might have divergent and diverse effects. The latter deserve more attention in clinical applications because of the diversity of their individual therapeutic effects, especially regarding their compatibility prospects in a TCM formula.
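The grouping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the efficacy sets are a small hand-copied subset of the antecedents in Table 5, the function names `find_bonds` and `classify` are hypothetical, and the cutoff `STRONG_MIN_SHARED` is an arbitrary value chosen for this toy subset (the paper does not state a numeric threshold separating strong from weak bonds).

```python
from itertools import product

# Efficacy attributes that appear as antecedents of rules pointing to a qi
# attribute or to a flavor attribute (toy subset of Table 5, assumption).
QI_RULES = {
    "hot": {"promoting sweating", "warming the middle qi",
            "promoting digestion", "regulating facial complexion"},
    "cold": {"clearing away heat", "curing jaundice", "curing diarrhea"},
}
FLAVOR_RULES = {
    "pungent": {"promoting sweating", "warming the middle qi",
                "promoting digestion", "curing diarrhea"},
    "bitter": {"clearing away heat", "curing jaundice"},
    "sweet": {"regulating facial complexion"},
}

STRONG_MIN_SHARED = 2  # hypothetical cutoff for this toy subset only


def find_bonds(qi_rules, flavor_rules):
    """Map each (qi, flavor) pair to the efficacy attributes they share."""
    bonds = {}
    for qi, flavor in product(qi_rules, flavor_rules):
        shared = qi_rules[qi] & flavor_rules[flavor]
        if shared:
            bonds[(qi, flavor)] = shared
    return bonds


def classify(qi, flavor, bonds, min_shared=STRONG_MIN_SHARED):
    """Label a qi-flavor combination by the strength of its bond."""
    shared = bonds.get((qi, flavor), set())
    return "isotropic" if len(shared) >= min_shared else "anisotropic"


bonds = find_bonds(QI_RULES, FLAVOR_RULES)
for qi, flavor in [("hot", "pungent"), ("cold", "bitter"),
                   ("hot", "sweet"), ("cold", "pungent")]:
    print(qi, flavor, classify(qi, flavor, bonds))
```

In this toy subset the hot-pungent and cold-bitter pairs share several efficacy attributes and are labeled isotropic, while the hot-sweet and cold-pungent pairs share only one and are labeled anisotropic.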

5 Discussion

5.1 Association rule learning method

Association rule learning aims to extract interesting common patterns or causal links among sets of items in a transaction database, and is among the most frequently used KDD methods. It has been successfully used to identify interesting correlations in complex datasets in fields spanning market basket analysis, web mining, bioinformatics, and now KDD in TCM. Discovering frequent combination rules of medicinal herbs from TCM formula data, and analyzing associations between ZHENG and the symptoms or biochemical indicators of patients from medical databases, have been the most common types of association rule learning studies in TCM[36-38]. In this work, we applied this method to CHPT research and identified several nonrandom underlying associations.
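To make the scoring of such rules concrete, the following Python sketch computes the support count, confidence and lift of a single rule from raw counts, using the standard definitions of these measures. The function name `rule_metrics` is hypothetical, and the counts are illustrative assumptions (365 herbs in total, 81 of them with hot qi, 4 with the efficacy “promoting sweating”, 3 with both) chosen to be consistent with rule No. 1 in Table 5; they are not figures stated in the paper.

```python
def rule_metrics(n_total, n_antecedent, n_consequent, n_both):
    """Return (support count, confidence, lift) for a rule A -> B.

    n_total      -- number of transactions (herbs) in the dataset
    n_antecedent -- transactions containing A
    n_consequent -- transactions containing B
    n_both       -- transactions containing both A and B
    """
    support = n_both
    confidence = n_both / n_antecedent               # estimate of P(B | A)
    lift = confidence / (n_consequent / n_total)     # P(B | A) / P(B)
    return support, confidence, lift


# Illustrative counts (assumptions, see the text above).
support, confidence, lift = rule_metrics(
    n_total=365, n_antecedent=4, n_consequent=81, n_both=3
)
print(support, round(confidence, 3), round(lift, 3))
```

A lift above 1 indicates that the consequent occurs more often together with the antecedent than would be expected if the two were independent, which is why lift is useful as a cross-check on rules selected by support and confidence alone.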

In the association rule-mining method, the final rules are identified under the control of several parameters, chiefly support and confidence. Proper parameterization of the method is believed to ensure that the resulting rules are strongly predictive of associations in the dataset. For this reason, in the present study we implemented a global, complete and detailed parameter analysis before the major mining work. The parameter called “lift” was also added in the present study to cross-check the quality of the rules. Incorporating these two new processes improved the current findings over the results of our previous work.

Table 6 The comparison between the resulting rules and traditional Chinese medicine theory

| Herbal property | Actions in traditional Chinese medicine theory | Numbers of the related rules |
|---|---|---|
| Hot qi | Expelling cold; restoring yang | 1 and 3 |
| | Warming and activating qi and blood | 2, 4, 6, 7, and 9 |
| Neutral qi | Nourishing | 11, 18, 20, 21, 22, 24, 26, and 27 |
| | Harmonizing; mitigating | 10, 12, 13, 14, 15, 16, 17, and 25 |
| Cold qi | Clearing away heat; purging fire | 30, 32, 35, 39, and 41 |
| | Treating disease related to water retention | 36, 42, and 43 |
| | Descending | 29 and 31 |
| Pungent flavor | Dispersing; dispelling wind pathogens | 47, 53, 54, 55, 60, 63, and 64 |
| | Promoting the circulation of qi and blood | 51, 52, 58, and 62 |
| | Treating strange diseases considered to be related to ghost in TCM | 48, 50, 56, and 59 |
| Sweet flavor | Nourishing | 68, 69, 73, 74, 76, 77, 79, and 80 |
| | Harmonizing; mitigating | 66, 67, 70, 71, 72, 75, 78, and 81 |
| Bitter flavor | Drying; resolving water and dampness pathogen | 85, 88, 90, 91, and 93 |
| | Purging heat | 84, 87, 92, 94, and 96 |
| | Descending; resolving food stagnation | 82, 86, and 100 |

Figure 4 The many-to-one mapping of herbal efficacy to cold qi. The cold qi of different herbs may be inferred from different kinds of efficacy. The presented medicinal herbs are Dahuang (Rhei radix et rhizoma), Zhizi (Gardeniae fructus), Cheqianzi (Plantaginis semen), Huangbai (Phellodendri chinensis cortex) and Zhimu (Anemarrhenae rhizoma).

Table 7 The bonds between herbal qi and flavor

| The bonds | Efficacy attributes |
|---|---|
| Hot-pungent bond (strong) | Promoting sweating; Warming the middle qi; Relieving headache and dizziness; Curing throat impediment; Promoting digestion; Relieving cough with dyspnea |
| Neutral-sweet bond (strong) | Fei Jian; Settling soul and spirit; Relieving difficulty of evacuating; Showing tolerance of hungry; Tonifying the middle qi; Stopping bleeding; Nourishing essence-spirit |
| Cold-bitter bond (strong) | Relieving reddened complexion; Resolving hard mass in stomach and intestine; Relieving diabetes; Curing jaundice; Clearing away heat; Curing wind edema and distention; Curing tympanites |
| Hot-sweet bond (weak) | Regulating facial complexion |
| Hot-bitter bond (weak) | Curing flaccidity disease |
| Neutral-bitter bond (weak) | Curing hemorrhoid |
| Cold-pungent bond (weak) | Curing diarrhea |

5.2 The SCMM

SCMM, dating from the Han Dynasty (202 B.C. to 220 A.D.), is considered the earliest extant classic TCM pharmacology text. By recording reliably effective medicinal herbs, proposing herbal classification methods, describing herbal property and compatibility theory, and discussing herbal efficacy in relation to the place of origin and the preparation[39], the SCMM established the fundamentals of the contemporary Chinese materia medica[11]. Many herbs initially recorded in SCMM are still in use and have reliable clinical results. Further, the recording style and the organization of herbal property theory in SCMM established the model for TCM pharmacology books written over the following 2 000 years, including Tang Materia Medica, Classified Emergency Materia Medica, and Compendium of Materia Medica[40]. Therefore, SCMM should be an appropriate material for studying CHPT. Even though the mining results might differ somewhat depending on the reference text used for building the base dataset, the final rules in the present study did uncover several nonrandom associative rules that reflect the inherent structure and complexity at the core of CHPT.

5.3 About future research

As TCM has been modernized over the last century, a number of researchers have been drawn to investigating scientific explanations of CHPT. They usually seek various types of chemical or biochemical indicators that underlie herbal properties. However, the results of these research efforts do not replace an understanding of the fundamental rules underlying CHPT, which should be one crucial aspect of CHPT modernization. The two principles we learned from the association rules in this paper begin to fill this gap. First, the many-to-one mapping of herbal efficacy to herbal properties (qi and flavor) reveals that, although properties are shared among herbs, they are not predictive of clinical herb use. Thus, herbal properties describe a classification of action, not a specific outcome.

Figure 5 The example of “isotropic” herbs, “anisotropic” herbs and their relationships. The strongly related efficacy attributes (in rectangles) are marked next to the corresponding herbal property (in ovals). The herbs (in circles) are connected with their efficacy by straight lines. The “isotropic” herbs are represented by the red, green and blue circles. They are pungent-hot herbs Wuzhuyu (WZY, Euodiae fructus), Fuzi (FZ, Aconiti lateralis radix praeparata) and Xixin (XX, Asari radix et rhizoma); sweet-neutral herbs Puhuang (PH, Typhae pollen) and E-jiao (EJ, Asini corii colla); and bitter-cold herbs Dahuang (DaH, Rhei radix et rhizoma) and Zhizi (ZZ, Gardeniae fructus). The “anisotropic” herbs are represented by the purple circles. They are sweet-hot herb Shizhongru (SZR, Stalactitum), pungent-cold herb Ningshuishi (NSS, Calcitum seu Gypsum rubrum), bitter-hot herbs Mahuang (MH, Ephedrae herba) and Baizhu (BZ, Atractylodis macrocephalae rhizoma), sweet-cold herbs Maogen (MG, Imperatae rhizoma) and Dihuang (DiH, Rehmanniae radix), and bitter-neutral herb Chaihu (CH, Bupleuri radix).


This difference emphasizes the individuality of herbs, and implies that simply pursuing the chemical constituents or biochemical indicators that confer these herbal properties may be an unreasonable approach. In other words, the use of Western disease-specific targets is not a reasonable approach to evaluating TCM[41]. Second, the nonrandom overlap between the efficacy meanings of qi and flavor attributes demonstrated close relations between hot qi and pungent flavor, neutral qi and sweet flavor, and cold qi and bitter flavor. These patterns not only reflect the inherent relationships between qi and flavor but also give new insights into explaining their roles. These findings also provide researchers with a novel perspective for understanding the wisdom by which TCM organizes the complexity of herbal medicine with simple ideas.

6 Conclusions

Here we have presented an improved method that extracts meaningful patterns from the data contained within the ancient Chinese book of Materia Medica to understand CHPT. The approach was based on association discovery technologies, with a specialized annotation of ancient Chinese vocabularies and a proper parameter analysis. We identified 120 association rules of six defined formats, covering the relations between herbal efficacy and qi, flavor and their combinations. Unlike other computational approaches to CHPT, our work aimed to explore the global fundamental principles embedded in CHPT from these association rules. This process identified the many-to-one mapping of herbal efficacy to herbal property and the nonrandom overlaps between the meanings of qi and flavor attributes. Understanding these principles will support the modernization of TCM philosophy, especially in the context of integrative medicine and rational clinical herb use. In this sense, our findings, and similar attempts to link TCM theory and clinical practice, should be of use to both practitioners and researchers.

7 Funding and acknowledgements

We are grateful for the support of the National Basic Research Program of China (973 Program) (No. 2007CB512605) and the Scientific Research Innovation Team of Beijing University of Chinese Medicine (No. 2011-CXTD-14).

8 Conflict of interests

The authors declare that they have no conflict of interests.

REFERENCES

1 Feng Y, Wu ZH, Zhou XZ, Zhou ZM, Fan WY. Knowledge discovery in traditional Chinese medicine: state of the art and perspectives. Artif Intell Med. 2006; 38(3): 219-236.

2 Zhou XZ, Peng YH, Liu BY. Text mining for traditional Chinese medical knowledge discovery: a survey. J Biomed Inform. 2010; 43(4): 650-660.

3 Lukman S, He YL, Hui SC. Computational methods for traditional Chinese medicine: a survey. Comput Methods Programs Biomed. 2007; 88(3): 283-294.

4 Yao MC, Yuan YM, Ai L, Qiao YJ. Data mining and its application in the modernization of traditional Chinese medicine and traditional Chinese pharmacy. Beijing Zhong Yi Yao Da Xue Xue Bao. 2002; 25(5): 20-23. Chinese with abstract in English.

5 Zhang NL, Yuan SH, Chen T, Wang Y. Latent tree models and diagnosis in traditional Chinese medicine. Artif Intell Med. 2012; 42(3): 229-245.

6 Ehrman TM, Barlow DJ, Hylands PJ. Phytochemical informatics of traditional Chinese medicine and therapeutic relevance. J Chem Inf Model. 2007; 47(6): 2316-2334.

7 Li S, Zhang B, Jiang D, Wei Y, Zhang N. Herb network construction and co-module analysis for uncovering the combination rule of traditional Chinese herbal formulae. BMC Bioinformatics. 2010; 11(Suppl 11): S6.

8 Li S, Zhang B, Zhang NB. Network target for screening synergistic drug combinations with application to traditional Chinese medicine. BMC Bioinformatics. 2011; 5(Suppl 1): S10.

9 Ren TG. TCM informatics. Beijing: Science Press. 2003. Chinese.

10 Yan ZH. Chinese materia medica. Beijing: People’s Medical Publishing House. 2006. Chinese.

11 Zheng JS. Related historical process of Chinese medicine development. Guilin: Guangxi Normal University Press. 2007. Chinese.

12 High Technology Research and Development Center in the Ministry of Science and Technology of China. Six-five 973 Projects approved by the Ministry of Science and Technology of China in 2006. Zhongguo Ji Chu Yi Xue. 2006; 5: 15-17. Chinese.

13 High Technology Research and Development Center in the Ministry of Science and Technology of China. List of 973 Projects approved by the Ministry of Science and Technology of China in 2007. Zhongguo Ji Chu Yi Xue. 2007; 5: 24-26. Chinese.

14 Zhao YL, Wang JB, Xiao XH, Zhao HP, Zhou CP, Zhang XR, Ren YS, Jia L. Study on the cold and hot properties of medicinal herbs by thermotropism in mice behavior. J Ethnopharmacol. 2011; 133(3): 980-985.

15 Liu YQ, Cheng MC, Wang LX, Zhao N, Xiao HB, Wang ZT. Functional analysis of cultured neural cells for evaluating cold/cool- and hot/warm-natured Chinese herbs. Am J Chin Med. 2008; 36(4): 771-781.

16 Cheng DH, Wang J, Zeng N, Xia HL, Fu Y, Yan D, Zhao YL, Xiao XH. Study on drug property differences of Shexiang (moschus) and Bingpian (borneolum synthcticum) based on analysis of biothermodynamics. J Tradit Chin Med. 2011; 31(1): 21-26.

17 Li S, Zhang ZQ, Wu LJ, Zhang XG, Li YD, Wang YY. Understanding ZHENG in traditional Chinese medicine in the context of neuro-endocrine-immune network. IET Syst Biol. 2007; 1(1): 51-60.

18 Xiao B, Wang Y, Qiao YJ. Study on the relationship between Chinese herbal nature and functions. Zhongguo Zhong Yi Yao Xin Xi. 2011; 18(1): 31-33. Chinese with abstract in English.


19 Zhou FS, Lai XP, Chen XSJ, Chen JN, Chen GL. Methodology for modeling on Chinese herbal property theory. World Sci Tech. 2009; 11(2): 229-233.

20 Jin R, Lin Q, Zhang B, Liu X, Liu SM, Zhao Q, Liu XL. A study of association rules in three-dimensional property-taste-effect data of Chinese herbal medicines based on Apriori algorithm. J Chin Integr Med. 2011; 9(7): 794-803. Chinese with abstract in English.

21 Cai JF. A historical overview of traditional Chinese medicine and ancient Chinese medical ethics. Ethik Med. 1998; 10(Suppl 1): S84-S91.

22 Lei ZQ, Zhang TM. Chinese clinical pharmacy. Beijing: People’s Medical Publishing House. 1998. Chinese.

23 Cao CG, Wang HT, Sui YF. Knowledge modeling and acquisition of traditional Chinese herbal drugs and formulae from text. Artif Intell Med. 2004; 32(1): 3-13.

24 Chao YF. Treatise on the pathogenesis and manifestations of all diseases. Beijing: Huaxia Publishing House. 2008. Chinese.

25 Wang YY, Lu ZL. Internal medicine of traditional Chinese medicine. Beijing: People’s Medical Publishing House. 1999. Chinese.

26 Tan XH, Lu DM. Surgery of traditional Chinese medicine. Beijing: People’s Medical Publishing House. 1999. Chinese.

27 Liu MR, Tan WX. Obstetrics and gynecology of traditional Chinese medicine. Beijing: People’s Medical Publishing House. 2001. Chinese.

28 Shang ZJ. Annotations of Shennong’s Classic of Materia Medica. Beijing: Academy Press. 2008. Chinese.

29 Yang PJ. Annotated Shennong’s Classic of Materia Medica. Beijing: Academy Press. 2007. Chinese.

30 Han JW, Kamber M. Data mining: concepts and techniques. San Francisco: Morgan Kaufmann Publishers. 2006.

31 Carmona-Saez P, Chagoven M, Rodriguez A, Trelles O, Carazo JM, Pascual-Montano A. Integrated analysis of gene expression by association rules discovery. BMC Bioinformatics. 2006; 7: 54.

32 Creighton C, Hanash S. Mining gene expression databases for association rules. Bioinformatics. 2003; 19(1): 79-86.

33 Tong YY, Zhao YK, Yu J, Hu YM, Pan YL. A study on the application of association rule mining on traditional Chinese medicine. Zhongguo Zhong Yi Yao Xin Xi. 2009; 16(7): 95-96. Chinese with abstract in English.

34 Yao MC, Qiao YJ, Yuan YM, Ai L. The classification of Chinese herbal efficacy based on artificial neural network. Zhongguo Zhong Yao Za Zhi. 2003; 28(7): 689-691. Chinese.

35 Wang L, Chang CK, Wang BL. Logical perspective view of the Chinese herbal property theory. Yi Xue Yu Zhe Xue Ren Wen She Hui Yi Xue Ban. 2009; 30(12): 56-57, 73. Chinese with abstract in English.

36 Gong J, Dong JL, Tang JF, Zhang L, Liang MX. A new method to find the most suitable syndrome healed by commonly paired herbs based on association rule. Proc IEEE Int Conf Intell Comput Intell Syst. 2010; 2: 53-56.

37 Li J, Wang SR, Xu B, Gao J. The discussion on the relations between TCM syndrome of IgA nephropathy and pathological indicators based on association rules. Beijing Zhong Yi Yao. 2011; 30(9): 653-655. Chinese.

38 Lu W, Wang YF, Wu ZH. System for mining association rules from traditional Chinese medicine database. Zhejiang Da Xue Xue Bao Gong Xue Ban. 2001; 35(4): 370-373. Chinese with abstract in English.

39 Zhang DB, Sun LJ, Wang D. The academic contribution of Shennong’s Classic of Materia Medica. Zhonghua Zhong Yi Yao Xue Kan. 2010; 28(6): 1130-1134. Chinese.

40 Zhang DB, Sun LJ, Wang D. The completion and history of Shennong’s Classic of Materia Medica. Zhonghua Zhong Yi Yao Xue Kan. 2010; 28(5): 924-927. Chinese.

41 Jiang WY. Therapeutic wisdom in traditional Chinese medicine: a perspective from modern science. Trends Pharmacol Sci. 2005; 26(11): 558-563.


Editors-in-Chief: Wei-kang Zhao & Lixing Lao. ISSN 2095-4964. Published by Science Press, China.