Classification of Wisconsin Breast Cancer Diagnostic and Prognostic Dataset Using Polynomial Neural Network


  • Classification of Wisconsin Breast Cancer Diagnostic and Prognostic Dataset using Polynomial Neural Network

    A Dissertation Work Submitted in Partial fulfillment for the award of

    Post Graduate Degree of Master of Technology

    In Computer Science & Engineering

    Submitted to

    Rajiv Gandhi Proudyogiki Vishwavidhyalaya,

    Bhopal (M.P.)

    Submitted By: Shweta Saxena

    0126CS10MT17

    Under the Guidance of Dr. Kavita Burse

    Director, OCT, Bhopal.

    Department of Computer Science & Engineering

    ORIENTAL COLLEGE OF TECHNOLOGY,

    BHOPAL (Formerly known as Thakral College of Technology, Bhopal)

    Approved by AICTE New Delhi & Govt. of M.P. Affiliated to Rajiv Gandhi Proudyogiki Vishwavidhyalaya, Bhopal (M.P.)

    Session 2012-13

  • II

    ORIENTAL COLLEGE OF TECHNOLOGY, BHOPAL (Formerly known as Thakral College of Technology, Bhopal)

    Approved by AICTE New Delhi & Govt. of M.P. and Affiliated to Rajiv Gandhi Proudyogiki Vishwavidhyalaya Bhopal (M.P.)

    DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING

    CERTIFICATE

    THIS IS TO CERTIFY THAT THE DISSERTATION ENTITLED "Classification of Wisconsin Breast Cancer Diagnostic and Prognostic Dataset using Polynomial Neural Network", BEING SUBMITTED BY Shweta Saxena IN PARTIAL FULFILLMENT OF THE REQUIREMENT FOR THE AWARD OF M.TECH DEGREE IN COMPUTER SCIENCE & ENGINEERING TO ORIENTAL COLLEGE OF TECHNOLOGY, BHOPAL (M.P.), IS A RECORD OF BONAFIDE WORK DONE BY HER UNDER MY GUIDANCE.

    Dr. Kavita Burse
    Director, OCT, Bhopal
    (Guide)

    Prof. Roopali Soni
    Head of Department, CSE, OCT, Bhopal

  • III

    ORIENTAL COLLEGE OF TECHNOLOGY, BHOPAL (Formerly known as Thakral College of Technology, Bhopal)

    Approved by AICTE New Delhi & Govt. of M.P. and Affiliated to Rajiv Gandhi Proudyogiki Vishwavidhyalaya Bhopal (M.P.)

    DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING

    APPROVAL CERTIFICATE

    This dissertation work entitled "Classification of Wisconsin Breast Cancer Diagnostic and Prognostic Dataset using Polynomial Neural Network", submitted by Shweta Saxena, is approved for the award of the degree of Master of Technology in Computer Science & Engineering.

    INTERNAL EXAMINER
    Date:

    EXTERNAL EXAMINER
    Date:

  • IV

    CANDIDATE DECLARATION

    I hereby declare that the dissertation work presented in the report entitled "Classification of Wisconsin Breast Cancer Diagnostic and Prognostic Dataset using Polynomial Neural Network", submitted in partial fulfillment of the requirements for the award of the degree of Master of Technology in Computer Science & Engineering of Oriental College of Technology, is an authentic record of my own work.

    I have not submitted this report, in whole or in part, for the award of any other degree or diploma.

    Date:

    Shweta Saxena
    (0126CS10MT17)

    This is to certify that the above statement made by the candidate is correct to the best of my knowledge.

    Dr. Kavita Burse
    Director, OCT, Bhopal
    (Guide)

  • V

    ACKNOWLEDGEMENT

    I would like to express my deep sense of respect and gratitude towards my advisor and guide, Dr. Kavita Burse, Director, Oriental College of Technology, who gave me the opportunity to work under her. She has been a constant source of inspiration throughout my work. She displayed unique tolerance and understanding at every step of this work and encouraged me incessantly. Her invaluable knowledge and innovative ideas helped me take the work to its final stage. I consider it my good fortune to have worked under such a wonderful person.

    I express my respect to Prof. Roopali Soni, Head, Computer Science & Engineering Department, Oriental College of Technology, for her constant encouragement and invaluable advice in every aspect of my academic life. I am also thankful to all faculty members of the Computer Science & Engineering Department for their support and guidance.

    I am especially thankful to my father, Mr. Damodar Saxena, my mother, Mrs. Nirmala Saxena, and my loving sisters, Shikha and Shraddha, for their love, sacrifice and support on every path of my life. I extend a special word of thanks to my husband, Mr. Ashish Saxena, for his moral support and help in achieving my aim.

    Last but not least, I am extremely thankful to all who have directly or indirectly helped me complete this work.

    Shweta Saxena

    (0126CS10MT17)

  • VI

    ORGANIZATION OF DISSERTATION

    The report "Classification of Wisconsin Breast Cancer Diagnostic and Prognostic Dataset using Polynomial Neural Network" is divided into 6 chapters as follows:

    Chapter 1 Introduction

    Chapter 1 first describes the motivation for this research work. It then describes breast cancer, its symptoms and its types in detail. The chapter also describes the diagnosis and prognosis process of the disease.

    Chapter 2 Literature Review

    Different neural network techniques for the diagnosis and prognosis of breast cancer are described in this chapter, along with the related work concerned with these techniques. The chapter closes by comparing the accuracies of the different techniques.

    Chapter 3 Artificial Neural Network and Principal Component Analysis

    This chapter describes the artificial neural network in detail, along with its advantages and medical applications. It describes in detail the higher order or polynomial neural network and the back propagation algorithm, which are used in this research for classification. The chapter then gives detailed information about the data preprocessing technique named Principal Component Analysis and its advantages.

    Chapter 4 MATLAB

    The technology used to implement the proposed work is MATLAB. The chapter gives a brief introduction to MATLAB along with its advantages, and a detailed description of the Neural Network Toolbox available in MATLAB for the design of neural networks. The chapter also explains the neural network design process using the Neural Network Toolbox.

    Chapter 5 Simulation and Results

    Chapter 5 describes the datasets used in this research and presents the results of the implementation.

    Chapter 6 Conclusion and Future Scope

    Chapter 6 concludes the dissertation and provides possible directions for relevant future work.

  • VIII

    ABSTRACT

    Breast cancer is the most common form of cancer and a major cause of death in women. Normally, the cells of the breast divide in a regulated manner. If cells keep on dividing when new cells are not needed, a mass of tissue forms. This mass is called a tumor. A tumor can be cancerous or non-cancerous. The goal of diagnosis is to distinguish between cancerous and non-cancerous cells. Once a patient is diagnosed with breast cancer, the prognosis gives the anticipated long-term behavior of the ailment. Detection, classification, scoring and grading of histopathological images is the standard clinical practice for the diagnosis and prognosis of breast cancer. In a large hospital, a pathologist typically handles a number of cancer detection cases per day, which makes this a very difficult and time-consuming task. Owing to their wide range of applicability and their ability to learn complex and non-linear relationships, including from noisy or less precise information, Artificial Neural Networks (ANNs) are very well suited to solving problems in biomedical engineering. ANNs can be applied to medicine in four basic fields: modeling, bioelectric signal processing, diagnosing and prognostics. There are several systems available for the diagnosis and selection of therapeutic strategies in breast cancer.

    In this research we propose a neural-network-based clinical support system to provide medical data analysis for the diagnosis and prognosis of breast cancer. To obtain diagnostic results, the system classifies the breast cancer diagnostic data provided as input to the neural network into two sets: benign (non-cancerous) and malignant (cancerous). To obtain prognostic results, the system classifies the prognostic data given as input to the neural network into two classes: recurrent and non-recurrent. A result in the recurrent class indicates that the cancer recurred after some time. A polynomial neural network (PNN) structure is used along with the back propagation algorithm for classification of breast cancer data. The Wisconsin Breast Cancer (WBC) datasets from the UCI Machine Learning Repository are used as input datasets to the PNN. A data pre-processing technique named Principal Component Analysis (PCA) is used as a feature reduction transformation method to improve the accuracy of the PNN. In our results the mean square error (MSE) is substantially reduced for PCA-preprocessed data as compared to normalized data. Hence we get more accurate diagnosis and prognosis results.

    Keywords: breast cancer, polynomial neural network, principal component analysis, Wisconsin breast cancer dataset.

  • IX

    CONTENTS

    List of Figures
    List of Tables

    Chapter 1 Introduction
    1.1 Research Motivation
    1.2 Introduction
    1.3 Symptoms of breast cancer
    1.4 Types of breast cancer
    1.5 Breast cancer diagnosis
    1.6 Breast cancer prognosis

    Chapter 2 Literature Review
    2.1 Introduction
    2.2 Neural network techniques for diagnosis and prognosis of breast cancer
    2.3 Comparison of neural network techniques for breast cancer diagnosis and prognosis

    Chapter 3 Artificial Neural Network and Principal Component Analysis
    3.1 Overview of ANN
    3.2 Basics of ANN
    3.3 Feed Forward Neural Network with Back propagation
    3.4 Higher order or polynomial neural network
    3.5 Advantages of ANN
    3.6 Medical Applications
    3.7 Overview of data Preprocessing
        3.7.1 Feature selection
        3.7.2 Feature extraction
    3.8 Principal Component Analysis
        3.8.1 Dimension reduction
        3.8.2 Lower dimensionality basis
        3.8.3 Selection of principal components
        3.8.4 Selecting best lower dimensional space
        3.8.5 Linear transformation implied
    3.9 Advantages of PCA

    Chapter 4 MATLAB
    4.1 Introduction
    4.2 Advantages of MATLAB
    4.3 Limitations of MATLAB
    4.4 Neural Network Toolbox
    4.5 Neural Network Design using Neural Network Toolbox
        4.5.1 Collecting the data
            4.5.1.1 Pre-processing and post-processing the data
            4.5.1.2 Representing Unknown or Don't Care Targets
            4.5.1.3 Dividing the Data
        4.5.2 Creating and configuring the network
        4.5.3 Initializing weights and biases
        4.5.4 Training the network
        4.5.5 Validation of network
        4.5.6 Use the network

    Chapter 5 Simulation and Results
    5.1 Introduction
    5.2 Description of dataset
    5.3 Results and discussions
        5.3.1 Diagnosis Results
        5.3.2 Prognosis Results

    Chapter 6 Conclusion and Future Scope
    6.1 Conclusion
    6.2 Future work

    List of Publications
    References

    LIST OF FIGURES

    Fig. 1.1 Breast Cancer
    Fig. 1.2 FNA images of benign and malignant breast mass
    Fig. 2.1 An MLP structure
    Fig. 2.2 Probabilistic neural network for cancer diagnosis
    Fig. 3.1 A single neuron
    Fig. 3.2 Feed Forward NN model for breast cancer diagnosis
    Fig. 3.3 Node structure of PNN
    Fig. 3.4 Polynomial Neural Network
    Fig. 3.5 Data pre-processing using PCA
    Fig. 4.1 Pre-processing and post-processing
    Fig. 5.1 Flow chart of ANN process
    Fig. 5.2 Comparison of the convergence performance for WPBC dataset (50 iterations)
    Fig. 5.3 (a) Testing error for normalization and PCA data for WPBC dataset over 100 data
    Fig. 5.3 (b) Testing error for normalization and PCA data for WPBC dataset over 198 data

    LIST OF TABLES

    Table 2.1 Accuracy comparison for test data classification
    Table 4.1 Pre-processing and post-processing functions
    Table 5.1 A brief description of breast cancer datasets
    Table 5.2 Attribute information for WBC dataset
    Table 5.3 Attribute information for WDBC dataset
    Table 5.4 Attribute information for WPBC dataset
    Table 5.5 Training performance for WBC dataset
    Table 5.6 Testing performance for WBC dataset
    Table 5.7 Training performance for WDBC dataset
    Table 5.8 Testing performance for WDBC dataset
    Table 5.9 Training performance for WPBC dataset
    Table 5.10 Testing performance for WPBC dataset

  • XIV

    Chapter 1

    Introduction

    1.1 Research Motivation

    According to the World Health Organization (WHO), breast cancer is currently the top cancer in women worldwide and the second-highest cause of death among women. Diagnosis and prognosis of breast cancer at a very early stage are difficult because of various interconnected factors, many of which are still unknown. Until an effective preventive measure becomes widely available, early detection followed by effective treatment is the only recourse for reducing breast cancer mortality. Most breast cancers are detected by the patient as a lump in the breast. The majority of breast lumps are benign (non-cancerous), so it is the physician's responsibility to diagnose breast cancer. The goal of diagnosis is to distinguish between malignant (cancerous) and benign breast lumps. Once a patient is diagnosed with breast cancer, the malignant lump must be excised. During this procedure, or during a different post-operative procedure, physicians must determine the prognosis of the disease. Prognosis gives the anticipated long-term behavior of the ailment.

    A major class of problems in medical science involves the diagnosis and prognosis of breast cancer based upon various tests performed upon the patient. When several tests are involved, the ultimate diagnosis and prognosis may be difficult to obtain, even for a medical expert. When test results are analysed by a human operator, calculation errors may also occur and can result in faulty treatment for patients. This has given rise, over the past few decades, to computerized diagnostic and prognostic tools intended to aid the physician in making sense of the welter of data. A prime target for such computerized tools is the domain of cancer diagnosis and prognosis. Neural networks are computer-based tools inspired by the vertebrate nervous system that have been increasingly used in the past decade to model biomedical domains.

    The motivation for this research is to create a neural-network-based tool that doctors can use to classify the results obtained from various tests performed upon the patient. The neural-network-based clinical support system proposed in this research provides medical data analysis for diagnosis and prognosis in a shorter time and remains unaffected by human errors caused by inexperience or fatigue. Use of ANN increases the accuracy of most of the methods and reduces the need for a human expert. The back propagation algorithm has been used to train the neural network, in view of the significant characteristics of neural networks and their advantages for implementing classification problems. PCA is used as a feature reduction transformation method to improve the accuracy of the ANN. The advantages of feature reduction include the identification of a reduced set of features, among a large set of features, that are used for outcome prediction. Although the proposed neural network model is implemented on the standard Wisconsin datasets obtained from the UCI Machine Learning Repository, it can also be applied to similar datasets.

    1.2 Introduction

    Breast cancer is the major cause of death by cancer in the female population [1]. Most breast cancer cases occur in women aged 40 and above, but certain women with high-risk characteristics may develop breast cancer at a younger age [2]. Breast cancer occurs in humans and other mammals. While the overwhelming majority of human cases occur in women, male breast cancer can also occur [3]. Cancer is a disease in which cells become abnormal and form more cells in an uncontrolled way. With breast cancer, the cancer begins in the tissues that make up the breasts. The breast consists of lobes, lobules, and bulbs that are connected by ducts. The breast also contains blood and lymph vessels. These lymph vessels lead to structures that are called lymph nodes. Clusters of lymph nodes are found under the arm, above the collarbone, in the chest, and in other parts of the body. Together, the lymph vessels and lymph nodes make up the lymphatic system, which circulates a fluid called lymph throughout the body. Lymph contains cells that help fight infection and disease.

    Normally, the cells of the breast divide in a regulated manner. If cells keep dividing when new cells are not needed, a mass of tissue forms. This mass is called a tumor, as shown in Fig. 1.1 [4]. A tumor can be benign or malignant. A benign tumor is not cancer and will not spread to other parts of the body. A malignant tumor is cancer. Cancer cells divide and damage tissue around them. When breast cancer spreads outside the breast, cancer cells are most often found under the arm in the lymph nodes. In many cases, if the cancer has reached the lymph nodes, cancer cells may have also spread to other parts of the body via the lymphatic system or through the bloodstream. This can be life-threatening [5].

    Fig. 1.1 Breast Cancer

    In addition to being the most frequently diagnosed cancer among women in the

    United States, breast cancer accounts for up to 20 percent of the total costs of cancer

    overall. Women covered by Medicaid have unique challenges when it comes to this

    disease. For example, Medicaid recipients are more likely to be diagnosed at an

    advanced stage. They also have much lower screening rates compared to the general

    population. A new study found a high prevalence of breast cancer in Medicaid

    patients as well as significantly higher health care use and costs [6].

    1.3 Symptoms of Breast Cancer

    The first noticeable symptom of breast cancer is typically a lump that feels different

    from the rest of the breast tissue. More than 80% of breast cancer cases are

    discovered when the woman feels a lump. Lumps found in lymph nodes located in

    the armpits can also indicate breast cancer [7]. Indications other than a lump may

    include thickening different from the other breast tissue, one breast becoming larger

    or lower, a nipple changing position or shape or becoming inverted, skin puckering

    or dimpling, a rash on or around a nipple, discharge from the nipple(s), constant pain in part of the breast or armpit, and swelling beneath the armpit or around the collarbone [8]. Inflammatory breast cancer is a particular type of breast cancer which can pose a substantial diagnostic challenge. Symptoms may resemble a breast inflammation and

    may include itching, pain, swelling, nipple inversion, warmth and redness throughout

    the breast, as well as an orange-peel texture to the skin [7]. Another reported

    symptom complex of breast cancer is Paget's disease of the breast. This syndrome

    presents as eczematoid skin changes such as redness and mild flaking of the nipple

    skin. As Paget's advances, symptoms may include tingling, itching, increased

    sensitivity, burning, and pain. There may also be discharge from the nipple.

    Approximately half of women diagnosed with Paget's also have a lump in the breast

    [9].

    1.4 Types of Breast Cancer

    Breast cancer can develop in different ways and may affect different parts of the

    breast. The location of cancer will affect the progression of cancer and the treatment.

    Breast cancer is divided mainly into the pre-invasive (in situ) form and the invasive (infiltrating) form. The pre-invasive form is restricted to the breast itself and has not yet invaded any of the lymphatics or blood vessels that surround the breast tissue. Therefore, it does not spread to lymph nodes or other organs in the body [5]. The pre-invasive forms of breast cancer are:

    a) Ductal carcinoma in situ (DCIS):

    This is the most common pre-invasive breast cancer. It is seen more commonly now because this form is generally visible on a mammogram, where it is identified by unusual calcium deposits or puckering of the breast tissue (called stellate appearance). If left untreated, DCIS can progress to invasive breast cancer.

    b) Lobular carcinoma in situ (LCIS):

    Unlike DCIS, LCIS is not really cancer at all. Most physicians consider the finding of LCIS to be incidental, and it is thought to be a marker for breast cancer risk. That is, women with LCIS seem to have a 7-10 times increased risk of developing some form of breast cancer (usually invasive lobular carcinoma) over the next 20 years. LCIS does not warrant treatment by surgery or radiation therapy. Close follow-up is most commonly indicated, and LCIS is not easily seen on a mammogram. Recent data suggest that this condition may be a precursor to invasive lobular cancer. There may be some forms of LCIS (i.e., the pleomorphic subtype) that require more aggressive local therapy and closer follow-up.

    The invasive forms of breast cancer are:

    a) Ductal carcinoma:

    This is the most common form of breast cancer and accounts for 70% of breast

    cancer cases. This cancer begins in the milk ducts and grows into surrounding

    tissues.

    b) Lobular carcinoma:

    This originates in the milk-producing lobules of the breast. It can spread to the fatty

    tissue and other parts of the body. About 1 in 10 breast cancers are of this type [10].

    c) Medullary, mucinous, and tubular carcinomas:

    These are three relatively slower-growing types of breast cancer.

    d) Inflammatory carcinoma:

    This is the fastest growing and most difficult type of breast cancer to treat. This

    cancer invades the lymphatic vessels of the skin and can be very extensive. It is very

    likely to spread to the local lymph nodes.

    e) Paget's disease:

    Paget's disease is cancer of the areola and nipple. It is very rare (about 1% of all

    breast cancers). In general, women who develop this type of cancer have a history of

    nipple crusting, scaling, itching, or inflammation.

    1.5 Breast Cancer Diagnosis

    Most breast cancers are detected by the patient as a lump in the breast. The majority of breast lumps are benign (non-cancerous), so it is the physician's responsibility to diagnose breast cancer. The goal of diagnosis is to distinguish between malignant (cancerous) and benign breast lumps. The three methods currently used for breast cancer diagnosis are mammography, fine needle aspirate (FNA) and surgical biopsy [11]. Mammography has a reported sensitivity (probability of correctly identifying a malignant lump) which varies between 68% and 79% [12]. Taking a fine needle aspirate (i.e. extracting fluid from a breast lump using a small-gauge needle) and visually inspecting the fluid under a microscope has a reported sensitivity varying from 65% to 98% [13]. Fig. 1.2 shows FNA images of benign and malignant breast masses.

    Fig 1.2 FNA Images of benign and malignant breast mass

    The more invasive and costly surgical biopsy has close to 100% sensitivity and remains the only test that can confirm malignancy. In summary, mammography lacks sensitivity, FNA sensitivity varies widely, and surgical biopsy, although accurate, is invasive, time consuming and costly [11]. The goal of the diagnostic aspect of our research is to develop a neural network system that diagnoses breast cancer with the help of the Wisconsin Breast Cancer database, which is obtained from FNAs.
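    The sensitivity figures quoted in this section are the fraction of truly malignant cases that a test flags as malignant. A minimal Python sketch of that calculation follows. The function name `sensitivity`, the label strings and the six example cases are illustrative assumptions and are not data from this study.

```python
def sensitivity(true_labels, predicted_labels, positive="malignant"):
    """Fraction of truly positive cases that the test also calls positive."""
    pairs = list(zip(true_labels, predicted_labels))
    true_pos = sum(1 for t, p in pairs if t == positive and p == positive)
    false_neg = sum(1 for t, p in pairs if t == positive and p != positive)
    if true_pos + false_neg == 0:
        raise ValueError("no positive cases in true_labels")
    return true_pos / (true_pos + false_neg)


# Made-up labels for illustration only (not taken from any dataset).
truth = ["malignant", "malignant", "benign", "malignant", "benign", "malignant"]
calls = ["malignant", "benign", "benign", "malignant", "malignant", "malignant"]

# Three of the four truly malignant cases are detected.
print(sensitivity(truth, calls))
```

    Note that a benign case called malignant (a false positive) does not affect sensitivity; it would lower specificity instead.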

    1.6 Breast Cancer Prognosis

    Once a patient is diagnosed with breast cancer, the malignant lump must be excised. During this procedure, or during a different post-operative procedure, physicians must determine the prognosis of the disease [14]. This is simply the long-term outlook for the disease for patients whose cancer has been surgically removed [11]. Prognosis is important because the type and intensity of the medications are based on it. Currently, the most reliable method of determining the prognosis is axillary clearance (the dissection of axillary lymph nodes) [Choong]. Unfortunately, for patients with unaffected lymph nodes, the result is unnecessary numbness, pain, weakness, swelling, and stiffness [15]. Prognosis poses a more difficult problem than diagnosis since the data is censored. That is, there are only a few cases where we have an observed recurrence of the disease [14]. A patient is classified as a recur case if the disease is observed at some time after tumor excision. A patient for whom cancer has not recurred, and may never recur, has an unknown or censored [16] time to recur (TTR). We do not observe recurrence in most patients, and for these there is no real point at which we can consider the patient a non-recurrent case. The data is therefore considered censored, since we do not know the time of recurrence. For such patients, all we know is the time of their last check-up. We call this the disease-free survival time (DFS) [14]. The prognostic aspect of the proposed research is to develop a neural network system that classifies the Wisconsin Breast Cancer Prognostic database into two classes: recur and non-recur patients.
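    The distinction between an observed time to recur and a censored disease-free survival time can be made concrete with a small sketch. The `FollowUp` record and the `label_case` helper below are hypothetical names introduced only for illustration, and the example values are made up; this is not the preprocessing used in the dissertation.

```python
from dataclasses import dataclass


@dataclass
class FollowUp:
    recurred: bool   # True if recurrence was observed after tumor excision
    months: float    # TTR if recurred, otherwise DFS at the last check-up


def label_case(record):
    """Return (class label, kind of time recorded, whether TTR is censored)."""
    if record.recurred:
        # Recurrence was observed, so the time to recur is known.
        return ("recur", "TTR", False)
    # No recurrence so far: only the disease-free time is known,
    # and the true time to recur (if any) is censored.
    return ("non-recur", "DFS", True)


# Made-up follow-up records for illustration only.
records = [FollowUp(True, 14.0), FollowUp(False, 62.0), FollowUp(False, 9.0)]
labels = [label_case(r) for r in records]
for record, label in zip(records, labels):
    print(record.months, label)
```

    A non-recur label after a short follow-up (such as the last record) carries less information than one after a long follow-up, which is one reason the prognostic task is harder than the diagnostic one.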

  • XXI

    Chapter 2

    Literature Review

    2.1 Introduction

    Neural network techniques have been successfully applied to the diagnosis and prognosis of breast cancer. This chapter reviews the existing and popular neural network techniques for the diagnosis and prognosis of breast cancer, and compares them at the end. The Wisconsin breast cancer data set is used to study the classification accuracy of the neural networks. Two research papers that were helpful in shaping this survey are:

    "An Analysis of the methods employed for breast cancer diagnosis" by M. M. Beg and M. Jain.

    "Breast cancer diagnosis using statistical neural networks" by T. Kiyan and T. Yildirim.

    A brief description of these two papers follows.

    "An Analysis of the methods employed for breast cancer diagnosis", Authors: M. M. Beg and M. Jain [17]

    Abstract:

    Breast cancer research over the last decade has been tremendous. The ground

    breaking innovations and novel methods help in the early detection, in setting the

    stages of the therapy and in assessing the response of the patient to the treatment. The prediction of the recurrent cancer is also crucial for the survival of the patient. This

    paper studies various techniques used for the diagnosis of breast cancer. Different

    methods are explored for their merits and de-merits for the diagnosis of breast lesion.

    Some of the methods are yet unproven but the studies look very encouraging. It was

    found that the recent use of the combination of Artificial Neural Networks in most of

    the instances gives accurate results for the diagnosis of breast cancer and their use

    can also be extended to other diseases.

    Comments:

    This paper reviews the existing and popular methods that apply soft computing techniques to the diagnosis of breast cancer. The paper demonstrates the better performance of multiple neural networks over monolithic neural networks for the diagnosis of breast cancer. It can be concluded from this study that neural-network-based clinical support systems provide medical experts with a second opinion, thus removing the need for biopsy and excision and reducing unnecessary expenditure. Use of ANN increases the accuracy of most of the methods and reduces the need for a human expert. The ANN, Support Vector Machine, Genetic Algorithm (GA), and K-nearest neighbor may be used for classification problems. The GA is better suited to feature selection. The fuzzy co-occurrence matrix and fuzzy entropy methods can also be used for feature extraction. Almost all intelligent computational learning algorithms use supervised learning. Supervised ANNs outperform unsupervised networks, but in the case of a patient with no previous medical records the unsupervised ANN is the only solution.

    "Breast cancer diagnosis using statistical neural networks", Authors: T. Kiyan and T. Yildirim [18]

    Abstract:

    Breast cancer is the second largest cause of cancer deaths among women. The

    performance of the statistical neural network structures, radial basis network (RBF),

    general regression neural network (GRNN) and probabilistic neural network (PNN)

    are examined on the Wisconsin breast cancer data (WBCD) in this paper. This is a

    well-used database in machine learning, neural network and signal processing.

  • XXIII

    Statistical neural networks are used to increase the accuracy and objectivity of breast

    cancer diagnosis.

    Comments:

    This paper shows how statistical neural networks are used in actual clinical

    diagnosis of breast cancer. The simulations were realized by using MATLAB 6.0

    Neural Network Toolbox. Four different neural network structures, multi layer

    perceptron (MLP), RBF, PNN and GRNN were applied to WBCD database to show

    the performance of statistical neural networks on breast cancer data. According to the

    results RBF and PNN are the best classifiers in training set whereas GRNN gives the

    best classification accuracy when the test set is considered. According to overall

    results, it is seen that the most suitable neural network model for classifying WBCD

    data is GRNN.

    2.2 Neural network techniques for diagnosis and prognosis of breast cancer

    Various neural network techniques for the diagnosis and prognosis of breast cancer are described below.

    Multilayer Perceptron (MLP):

    MLP has been widely used for the aim of cancer prediction and prognosis [19]. MLP

    is a class of feed forward neural networks which is trained in a supervised manner to

    become capable of outcome prediction for new data [20]. The structure of MLP is

    shown in fig 2.1. An MLP consists of a set of interconnected artificial neurons

    connected only in a forward manner to form layers. One input, one or more hidden

    and one output layer are the layers forming an MLP [21]. An artificial neuron is the basic
    processing element of a neural network. It receives signals from other neurons,
    multiplies each signal by the corresponding connection strength (its weight), sums
    the weighted signals, passes the sum through an activation function, and feeds the
    output to other neurons [22].

    Fig. 2.1 MLP structure
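
    The neuron computation described above (weighted sum followed by an activation function) and its chaining into feed-forward layers can be illustrated with a short sketch. The sigmoid activation, the layer sizes and all weight values below are assumptions chosen for illustration; they are not taken from [21] or [22].

```python
import math

def sigmoid(a):
    """Logistic activation function."""
    return 1.0 / (1.0 + math.exp(-a))

def neuron_output(inputs, weights, bias):
    """Weighted sum of the incoming signals passed through the activation."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(weighted_sum)

def mlp_forward(inputs, layers):
    """Feed a signal forward through a list of layers.

    Each layer is a list of (weights, bias) pairs, one pair per neuron.
    """
    signal = inputs
    for layer in layers:
        signal = [neuron_output(signal, w, b) for (w, b) in layer]
    return signal

# Hypothetical network: 2 inputs, one hidden layer of 2 neurons, 1 output neuron.
hidden_layer = [([0.5, -0.4], 0.1), ([0.3, 0.8], -0.2)]
output_layer = [([1.0, -1.0], 0.0)]
prediction = mlp_forward([1.0, 0.0], [hidden_layer, output_layer])
```

    Because connections only run forward, the output of each layer is simply the input of the next one.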

    The simplest form of trainable neural network, first developed by Rosenblatt (1958),
    is composed of two layers of nodes, namely an input and an output layer. A mapping between

    the input and output data could be established by assigning weights to the input

    numerical data during training. More complicated MLPs which are commonly used

    consist of some hidden layers in addition to the input and output layers. These hidden

    layers enable the MLP to extract higher order statistics from a set of given data and

    hence, capture the complex relationship between input-output data. Therefore, MLPs

    commonly consist of an input layer, whose number of nodes is defined by the
    size of the input vector; one or more hidden layers, which can have a variable number of
    nodes depending on the application; and an output layer, which has one or more nodes

    depending on the number of output classes. Connections between these layers are

    defined by weights which are assigned in a supervised learning process so that the

    neural network would respond correctly to new data. This can be done via a training

    algorithm, in which a cost function is computed by comparing the network's output

    and the desired output and is then minimized with respect to the network parameters

    [21]. Neural network classification process consists of two steps- training and testing.

    The classification accuracy depends on training [23]. The training requires a series of input and associated output

    vectors. During the training, the network is repeatedly presented with the training

    data and the weights and thresholds in the network are adjusted from time to time till

    the desired input output mapping occurs [22]. Training is done on known examples

    and testing is done on unknown samples. The training procedure itself consists of
    two processes: feed-forwarding the input data, followed by back
    propagation of the error, in which the weights are adjusted to minimize the error on each training epoch
    [24]. The following research papers present the effectiveness of MLP for diagnosis and
    prognosis of breast cancer:

    An expert system for detection of breast cancer based on association rules

    and neural network, Author: M. Karabatak and M. C. Ince [93]

    This paper presents an automatic diagnosis system for detecting breast cancer based

    on association rules (AR) and neural network (NN). In this study, AR is used for

    reducing the dimension of breast cancer database and NN is used for intelligent

    classification. The proposed AR + NN system performance is compared with NN

    model. The dimension of input feature space is reduced from nine to four by using

    AR. In test stage, 3-fold cross validation method was applied to the Wisconsin breast

    cancer database to evaluate the proposed system performances. The correct

    classification rate of proposed system is 95.6%. This research demonstrated that the

    AR can be used for reducing the dimension of feature space and proposed AR + NN

    model can be used to obtain fast automatic diagnostic systems for other diseases.

    Cross Validation Evaluation for Breast Cancer Prediction Using Multilayer

    Perceptron Neural Networks, Author: Shirin A. Mojarad, Satnam S. Dlay, Wai L.

    Woo and Gajanan V. Sherbet [25]

    Abstract:

    The aim of this study is to investigate the effectiveness of a Multilayer Perceptron

    (MLP) for predicting breast cancer progression using a set of four biomarkers of

    breast tumors. The biomarkers include DNA ploidy, cell cycle distribution

    (G0G1/G2M), steroid receptors (ER/PR) and S-Phase Fraction (SPF). A further

    objective of the study is to explore the predictive potential of these markers in

    defining the state of nodal involvement in breast cancer. Two methods of outcome

    evaluation viz. stratified and simple k-fold Cross Validation (CV) are studied in order

    to assess their accuracy and reliability for neural network validation. Criteria such as

    output accuracy, sensitivity and specificity are used for selecting the best validation

    technique besides evaluating the network outcome for different combinations of

    markers.

    Comments:

    The presence of metastasis in the regional lymph nodes is the most important factor

    in predicting prognosis in breast cancer. Many biomarkers have been identified that

    appear to relate to the aggressive behaviour of cancer. However, the nonlinear

    relation of these markers to nodal status and also the existence of complex interaction

    between markers have prohibited an accurate prognosis. The results show that

    stratified 2-fold CV is more accurate and reliable compared to simple k-fold CV as it

    obtains a higher accuracy and specificity and also provides a more stable network

    validation in terms of sensitivity. Best prediction results are obtained by using an

    individual marker-SPF which obtains an accuracy of 65%. The authors suggest that

    MLP-based analysis provides an accurate and reliable platform for breast cancer

    prediction given that an appropriate design and validation method is employed.
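
    The difference between simple and stratified k-fold cross validation discussed above can be sketched as follows. This is an illustrative implementation only, not the procedure used in [25]; the function names and the ten example labels are hypothetical.

```python
import random
from collections import defaultdict

def simple_k_fold(n_samples, k, seed=0):
    """Shuffle the sample indices and deal them into k folds, ignoring labels."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def stratified_k_fold(labels, k, seed=0):
    """Deal the indices of each class separately, so every fold keeps
    roughly the same class proportions as the whole data set."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    folds = [[] for _ in range(k)]
    for class_indices in by_class.values():
        rng.shuffle(class_indices)
        for position, index in enumerate(class_indices):
            folds[position % k].append(index)
    return folds

# Hypothetical labels: 6 node-negative (0) and 4 node-positive (1) cases.
labels = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
folds = stratified_k_fold(labels, k=2)
positives_per_fold = [sum(labels[i] for i in fold) for fold in folds]
```

    With stratification each of the two folds receives two positive and three negative cases, whereas a simple split may leave one fold with very few positives, which makes the sensitivity estimate unstable.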

    WBCD breast cancer database classification applying artificial
    metaplasticity neural network, Author: A. Marcano-Cedeño, J. Quintanilla-Domínguez
    and D. Andina [26]

    Abstract:

    The correct diagnosis of breast cancer is one of the major problems in the medical

    field. From the literature it has been found that different pattern recognition

    techniques can help them to improve in this domain. These techniques can help

    doctors form a second opinion and make a better diagnosis. In this paper we present a

    novel improvement in neural network training for pattern classification. The

    proposed training algorithm is inspired by the biological metaplasticity property of

    neurons and Shannon's information theory. During the training phase the Artificial

    metaplasticity Multilayer Perceptron (AMMLP) algorithm gives priority to updating

    the weights for the less frequent activations over the more frequent ones. In this way

    metaplasticity is modeled artificially. AMMLP achieves a more efficient training,

    while maintaining MLP performance. To test the proposed algorithm we used the

    Wisconsin Breast Cancer Database (WBCD). AMMLP performance is tested using

    classification accuracy, sensitivity and specificity analysis, and confusion matrix.

    The obtained AMMLP classification accuracy of 99.26%, a very promising result

    compared to the Backpropagation Algorithm (BPA) and recent classification

    techniques applied to the same database.

    Comments:

    In this study, an artificial neural network for breast cancer classification based on
    the biological metaplasticity property is presented. The proposed AMMLP
    algorithm is compared with the classic MLP trained with backpropagation on the
    Wisconsin Breast Cancer Database. Averaged over 100 networks, the AMMLP classifier
    obtains 97.89% specificity, 100% sensitivity and a total classification accuracy of 99.26%.
    The ROC curves also show the superiority of AMMLP over the classic MLP with
    backpropagation: the AUC is 0.989 for AMMLP and 0.928 for BP in this particular case.
    From these results, the authors conclude that AMMLP obtains very promising results
    in classifying possible breast cancer. They believe that the proposed system can be
    very helpful to physicians as a second opinion for their final decision, and that by
    using such an efficient tool physicians can make very accurate decisions. AMMLP
    proved to be equal or superior to the state-of-the-art algorithms applied to the
    Wisconsin Breast Cancer Database, and it can be an interesting
    alternative.
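
    Accuracy, sensitivity and specificity, the measures used to evaluate AMMLP, are all derived from the confusion matrix. The sketch below assumes binary labels with 1 = malignant and 0 = benign and assumes that both classes occur in the data; the example labels are made up and are not results from [26].

```python
def confusion_counts(actual, predicted, positive=1):
    """Return (true positives, false positives, true negatives, false negatives)."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive:
            if a == positive:
                tp += 1
            else:
                fp += 1
        else:
            if a == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn

def classification_metrics(actual, predicted, positive=1):
    """Accuracy, sensitivity and specificity (both classes must be present)."""
    tp, fp, tn, fn = confusion_counts(actual, predicted, positive)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),   # fraction of malignant cases detected
        "specificity": tn / (tn + fp),   # fraction of benign cases recognized
    }

# Hypothetical example: 3 malignant and 5 benign cases, one false alarm.
actual = [1, 1, 1, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 1]
metrics = classification_metrics(actual, predicted)
```

    In this example every malignant case is detected (sensitivity 1.0), while one benign case out of five is misclassified (specificity 0.8).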

    Classification of breast cancer by comparing back propagation training

    algorithms Author: F. Paulin and A. Santhakumaran [27]

    Abstract:

    Breast cancer diagnosis has been approached by various machine learning techniques

    for many years. This paper presents a study on classification of Breast cancer using

    Feed Forward Artificial Neural Networks. Back propagation algorithm is used to

    train this network. The performance of the network is evaluated using Wisconsin

    breast cancer data set for various training algorithms. The highest accuracy of

    99.28% is achieved when using the Levenberg-Marquardt algorithm.

    Comments:

    The Back-propagation algorithm and supervised training method are used in this

    project. The aim of training is to adjust the weights until the error measured between

    the desired output and the actual output is reduced. The training stops when this

    reaches a sufficiently low value. The data are analyzed using the Neural Network Toolbox
    available in MATLAB. In this research a feed forward neural
    network is constructed and the back propagation algorithm is used to train the
    network. The proposed algorithm is tested on a real-life problem, the Wisconsin
    Breast Cancer Diagnosis problem. Six training algorithms are compared in this paper;
    among them, the Levenberg-Marquardt method gave the best result,
    99.28%. Preprocessing using min-max normalization is used in this diagnosis.

    Further work is needed to increase the accuracy of classification of breast cancer

    diagnosis.
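
    Two ideas from the comments above, min-max normalization and training that stops once the error is sufficiently low, can be sketched together. Note that this sketch uses plain gradient descent on a single sigmoid neuron, not the Levenberg-Marquardt algorithm or the MATLAB toolbox used in [27]; the data, learning rate and error goal are hypothetical.

```python
import math

def min_max_normalize(rows):
    """Rescale every feature (column) to the range [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]

def train_until_error_is_low(samples, targets, rate=0.5, goal=0.01, max_epochs=10000):
    """Train one sigmoid neuron by gradient descent and stop as soon as
    the mean squared error of an epoch falls below `goal`."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    error = float("inf")
    for _ in range(max_epochs):
        error = 0.0
        for x, t in zip(samples, targets):
            y = 1.0 / (1.0 + math.exp(-(sum(w * xi for w, xi in zip(weights, x)) + bias)))
            delta = (t - y) * y * (1.0 - y)          # error times sigmoid slope
            weights = [w + rate * delta * xi for w, xi in zip(weights, x)]
            bias += rate * delta
            error += (t - y) ** 2
        error /= len(samples)
        if error < goal:
            break
    return weights, bias, error

# Hypothetical raw data with two features on very different scales.
raw = [[1, 200], [2, 180], [9, 40], [10, 20]]
targets = [0, 0, 1, 1]
normalized = min_max_normalize(raw)
weights, bias, final_error = train_until_error_is_low(normalized, targets)
```

    Normalization keeps features with large numeric ranges from dominating the weighted sum, which generally makes gradient-based training better conditioned.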

    Radial Basis Function Neural Network (RBFNN):

    RBFNN is trained to perform a mapping from an m-dimensional input space to an n-

    dimensional output space. An RBFNN consists of the m-dimensional input x being

    passed directly to a hidden layer. Suppose there are c neurons in the hidden layer.

    Each of the c neurons in the hidden layer applies an activation function, which is a

    function of the Euclidean distance between the input and an m-dimensional prototype

    vector. Each hidden neuron contains its own prototype vector as a parameter. The

    output of each hidden neuron is then weighted and passed to the output layer. The

    outputs of the network consist of sums of the weighted hidden layer neurons [28].

    The transformation from the input space to the hidden-unit space is nonlinear, whereas
    the transformation from the hidden-unit space to the output space is linear [29].

    The performance of an RBFNN depends on the number and location (in the input

    space) of the centers, the shape of the RBFNN functions at the hidden neurons, and

    the method used for determining the network weights. Some researchers have trained

    RBFNN networks by selecting the centers randomly from the training data [30].
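
    The mapping described above (a nonlinear hidden layer driven by the Euclidean distance to each prototype, followed by a linear output layer) can be sketched as below. The Gaussian activation, the single shared spread and all numeric values are assumptions made for illustration.

```python
import math

def euclidean_distance(a, b):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def rbf_forward(x, prototypes, spread, output_weights):
    """Forward pass of an RBF network.

    x              -- m-dimensional input vector
    prototypes     -- c prototype (center) vectors, one per hidden neuron
    spread         -- width of the Gaussian, shared by all hidden neurons
    output_weights -- n rows of c weights, one row per output neuron
    """
    # Nonlinear step: Gaussian of the distance to each prototype.
    hidden = [math.exp(-(euclidean_distance(x, p) / spread) ** 2) for p in prototypes]
    # Linear step: weighted sum of the hidden activations.
    return [sum(w * h for w, h in zip(row, hidden)) for row in output_weights]

# Hypothetical network with m = 2 inputs, c = 2 hidden neurons, n = 2 outputs.
prototypes = [[0.0, 0.0], [1.0, 1.0]]
output_weights = [[1.0, 0.0], [0.0, 1.0]]
outputs = rbf_forward([0.0, 0.0], prototypes, 1.0, output_weights)
```

    An input that coincides with a prototype drives that hidden neuron to its maximum activation of 1, and the activation decays as the input moves away from the prototype.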

    The following research paper describes the application of RBFNN in breast cancer
    prediction:

    Breast Cancer Detection using Recursive Least Square and Modified Radial

    Basis Functional Neural Network, Author: M. R. Senapati, P. K. Routray, P. K.
    Dash [31]

    Abstract:

    A new approach for classification has been presented in this paper. The proposed

    technique, Modified Radial Basis Functional Neural Network (MRBFNN) consists of

    assigning weights between the input layer and the hidden layer of Radial Basis

    functional Neural Network (RBFNN). The centers of MRBFNN are initialized using

    Particle swarm Optimization (PSO) and variance and centers are updated using back

    propagation and both the sets of weights are updated using Recursive Least Square

    (RLS). Our simulation result is carried out on Wisconsin Breast Cancer (WBC) data

    set. The results are compared with RBFNN, where the variance and centers are

    updated using back propagation and weights are updated using Recursive Least

    Square (RLS) and Kalman Filter. It is found the proposed method provides more

    accurate result and better classification.

    Comments:

    The Modified Radial Basis Functional Neural Network is the same as the RBFNN,
    except that weights are assigned between the neurons in the input layer and the
    neurons in the hidden layer. An efficient pattern recognition and rule extraction

    technique using Recursive Least square approximation and Modified Radial Basis

    Functional Neural Networks (MRBFNN) is presented in this paper. The weights

    between input layer and the hidden layer as well as hidden layer and output layer of

    the RBFNN classifier can be trained using the linear recursive least square (RLS)

    algorithm. The RLS has a much faster rate of convergence compared to gradient

    search and least mean square (LMS) algorithms.
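
    For the linear output layer, the recursive least square update mentioned above can be sketched in its textbook form. This is a generic RLS recursion, not the specific MRBFNN training procedure of [31]; the forgetting factor, the initialization constant `delta` and the example data are assumptions.

```python
def rls_fit(samples, targets, forgetting=1.0, delta=1e6):
    """Estimate linear weights with the recursive least squares recursion.

    samples -- list of input vectors (e.g. hidden layer outputs plus a bias term)
    targets -- desired output for each input vector
    """
    n = len(samples[0])
    weights = [0.0] * n
    # P tracks the inverse of the input correlation matrix; a large initial
    # value expresses low confidence in the initial weights.
    P = [[delta if i == j else 0.0 for j in range(n)] for i in range(n)]
    for x, d in zip(samples, targets):
        Px = [sum(P[i][j] * x[j] for j in range(n)) for i in range(n)]
        denom = forgetting + sum(x[i] * Px[i] for i in range(n))
        gain = [v / denom for v in Px]
        error = d - sum(w * xi for w, xi in zip(weights, x))
        weights = [w + g * error for w, g in zip(weights, gain)]
        # P is symmetric, so x^T P equals (P x)^T.
        P = [[(P[i][j] - gain[i] * Px[j]) / forgetting for j in range(n)]
             for i in range(n)]
    return weights

# Hypothetical hidden layer outputs (last component is a constant bias input)
# and targets generated by a known linear rule.
hidden_outputs = [[a / 4.0, b / 4.0, 1.0] for a in range(5) for b in range(5)]
targets = [2.0 * h[0] - 3.0 * h[1] + 0.5 for h in hidden_outputs]
weights = rls_fit(hidden_outputs, targets)
```

    Each sample is processed once and the weights are corrected immediately, which is why RLS typically needs far fewer passes over the data than gradient search.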

    Probabilistic Neural Networks (PNN):

    PNN is a kind of RBFNN suitable for classification problems. It has three layers. The

    network contains an input layer, which has as many elements as there are separable

    parameters needed to describe the objects to be classified. It has a pattern layer,

    which organizes the training set such that an individual processing element represents

    each input vector. And finally, the network contains an output layer, called the

    summation layer, which has as many processing elements as there are classes to be

    recognized [32]. For the detection of breast cancer, the output layer should have 2 neurons
    (one for the benign class and another for the malignant class). Each element in this layer
    combines the processing elements within the pattern layer that relate to the same
    class and prepares that category for output [32].

    Fig. 2.2 Probabilistic neural network for breast cancer diagnosis

    The PNN used in [33] has a multilayer structure consisting of a single RBF hidden layer
    of locally tuned units which are fully interconnected to an output layer (competitive
    layer) of two units, as shown in Fig. 2.2. In this system, the real valued input vector is the
    feature vector, and the two outputs are the indices of the two classes. All hidden units
    simultaneously receive the eight-dimensional real valued input vector. The input
    vector to the network is passed to the hidden layer nodes via unit connection weights.
    The hidden layer consists of a set of radial basis functions. Associated with the jth
    hidden unit is a parameter vector $\vec{C}_j$, called a center. The hidden layer node
    calculates the Euclidean distance between the center and the network input vector
    and then passes the result to the radial basis function. All the radial basis functions
    are of Gaussian type. The equations used in the neural network model are as
    follows:

    $$X_j = \lVert \vec{f} - \vec{C}_j \rVert \cdot b^{ih} \tag{2.1}$$

    $$\varphi(X) = \exp(-X^2) \tag{2.2}$$

    $$b^{ih} = \frac{0.833}{s} \tag{2.3}$$

    $$S_i = \sum_{j=1}^{h} W_{ji}^{ho} \, \varphi(X_j) \tag{2.4}$$

    $$Y_i = \begin{cases} 1, & \text{if } S_i = \max\{S_1, S_2\} \\ 0, & \text{otherwise} \end{cases} \tag{2.5}$$

    where i = 1, 2; j = 1, 2, ..., h; $Y_i$ is the ith output (classification index); $\vec{f}$ is the
    eight-dimensional real valued input vector; $W_{ji}^{ho}$ is the weight between the jth
    hidden node and the ith output node; $\vec{C}_j$ is the center vector of the jth hidden
    node; s is a real constant known as the spread factor; $b^{ih}$ is the biasing term of the radial
    basis layer; and $\varphi(\cdot)$ is the nonlinear RBF (Gaussian). PNN provides a general

    solution to pattern classification problems by following an approach developed in

    statistics, called Bayesian classifiers [34][35]. PNN combines the Bayes decision

    strategy with the Parzen non-parametric estimator of the probability density functions

    of different classes [36]. The following research paper presents the application of PNN in
    breast cancer diagnosis and prognosis:

    The Wisconsin Breast Cancer Problem: Diagnosis and DFS time prognosis

    using probabilistic and generalised regression neural classifiers Author: Ioannis

    Anagnostopoulos, Christos Anagnostopoulos, Angelos Rouskas, George

    Kormentzas and Dimitrios Vergados [37].

    Abstract:

    This paper deals with the breast cancer diagnosis and prognosis problem employing

    two proposed neural network architectures over the Wisconsin Diagnostic and

    Prognostic Breast Cancer (WDBC/WPBC) datasets. A probabilistic approach is

    dedicated to solve the diagnosis problem, detecting malignancy among instances

    derived from the Fine Needle Aspirate (FNA) test, while the second architecture

    estimates the time interval that possibly contains the right end-point of the patient's

    Disease-Free Survival (DFS) time. The accuracy of the neural classifiers reaches

    nearly 98% for the diagnosis and 92% for the prognosis problem. Furthermore, the

    prognostic recurrence predictions were further evaluated using survival analysis

    through the Kaplan-Meier approximation method and compared with other

    techniques from the literature.

    Comments:

    In this paper PNN is used to solve the diagnosis problem because this kind of
    network presents high generalization ability and does not require a large amount of
    training data. PNN is used to detect malignancy among instances derived from the

    Fine Needle Aspirate (FNA) test. The accuracy of the neural classifiers reaches

    nearly 98%.
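
    The PNN decision rule described in this subsection (a Gaussian unit per stored pattern, a summation per class and a winner-take-all output) can be sketched as follows. The sketch follows the form of Eqs. 2.1 to 2.5 under the assumption that each hidden unit is connected with weight 1 to the output node of its own class and weight 0 to the other; the centers, labels and spread are hypothetical and are not taken from [33] or [37].

```python
import math

def pnn_classify(f, centers, center_labels, spread, classes=("benign", "malignant")):
    """Classify the input vector f with a two-class probabilistic neural network."""
    b = 0.833 / spread                                  # biasing term, Eq. 2.3
    sums = {c: 0.0 for c in classes}
    for center, label in zip(centers, center_labels):
        distance = math.sqrt(sum((fi - ci) ** 2 for fi, ci in zip(f, center)))
        x_j = distance * b                              # scaled distance, Eq. 2.1
        sums[label] += math.exp(-x_j ** 2)              # Gaussian unit, Eqs. 2.2 and 2.4
    winner = max(classes, key=lambda c: sums[c])        # competitive output, Eq. 2.5
    return winner, sums

# Hypothetical two-dimensional training patterns stored as centers.
centers = [[1.0, 1.0], [2.0, 1.0], [8.0, 9.0], [9.0, 8.0]]
center_labels = ["benign", "benign", "malignant", "malignant"]

first_class, first_sums = pnn_classify([1.5, 1.0], centers, center_labels, spread=2.0)
second_class, second_sums = pnn_classify([8.5, 8.5], centers, center_labels, spread=2.0)
```

    No iterative training is involved: the training patterns themselves serve as the centers, so the only free parameter in this sketch is the spread.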

    Generalized Regression Neural Networks (GRNN):

    GRNN is a variant of RBFNN that is often used for function approximation [38].
    GRNN consists of four layers: the first layer is responsible for the reception of
    information; the input neurons present the data to the second layer (pattern neurons);
    the outputs of the pattern neurons are forwarded to the third layer (summation
    neurons); and the outputs of the summation neurons are sent to the fourth layer (output neuron) [39].
    If f(x, z) is the joint probability density function of the vector random variable x and the scalar
    random variable z, then the GRNN calculates the conditional mean E(z|x) of the
    output. The joint probability density function f(x, z) is required to compute the
    above conditional mean. GRNN approximates the probability density function from
    the training vectors using Parzen windows estimation [40]. GRNNs do not require

    iterative training; the hidden-to-output weights are just the target values t_k, so the
    output y(x) is simply a weighted average of the target values t_k of the training cases x_k
    close to the given input case x. It can be viewed as a normalized RBF network in

    which there is a hidden unit centered at every training case. These RBF units are

    called kernels and are usually probability density functions such as the Gaussians.

    The only weights that need to be learned are the widths h of the RBF units. These
    widths (often a single width is used) are called smoothing parameters or bandwidths
    and are usually chosen by cross validation [38]. The following research paper gives
    breast cancer diagnosis and prognosis results obtained with GRNN:

    The Wisconsin Breast Cancer Problem: Diagnosis and DFS time prognosis

    using probabilistic and generalised regression neural classifiers, Author: Ioannis
    Anagnostopoulos, Christos Anagnostopoulos, Angelos Rouskas, George
    Kormentzas and Dimitrios Vergados [37].

    Abstract:

    This paper deals with the breast cancer diagnosis and prognosis problem

    employing two proposed neural network architectures over the Wisconsin Diagnostic

    and Prognostic Breast Cancer (WDBC/WPBC) datasets. A probabilistic approach is

    dedicated to solve the diagnosis problem, detecting malignancy among instances

    derived from the Fine Needle Aspirate (FNA) test, while the second architecture

    estimates the time interval that possibly contains the right end-point of the patient's

    Disease-Free Survival (DFS) time. The accuracy of the neural classifiers reaches

    nearly 98% for the diagnosis and 92% for the prognosis problem. Furthermore, the

    prognostic recurrence predictions were further evaluated using survival analysis

    through the Kaplan-Meier approximation method and compared with other

    techniques from the literature.

    Comments:

    Generalised Regression Neural Network architecture (GRNNs) is used for

    breast cancer prognosis in this paper. These neural networks have the special ability

    to deal with sparse and non-stationary data where non-linear relationships exist

    among inputs and outputs. In the problem addressed, the network calculates a time

    interval that corresponds to a possible right end-point of the patient's disease-free
    survival time. Thus, if f(x, z) is the joint probability density function of the vector random
    variable x and the scalar random variable z, then the GRNN calculates the conditional
    mean E(z|x) of the output. The joint probability density function f(x, z) is
    required to compute the above conditional mean. GRNN approximates the pdf from

    the training vectors using Parzen windows estimation, which is a non-parametric

    technique approximating a function by constructing it out of many simple parametric

    probability density functions. Parzen windows are considered as Gaussian functions

    with a constant diagonal covariance matrix. The accuracy of the neural classifiers

    reaches 92% for the prognosis problem.
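
    The GRNN estimate of the conditional mean can be written as a kernel-weighted average of the training targets. The sketch below assumes Gaussian Parzen windows with a single shared bandwidth; the one-dimensional training data are hypothetical and do not come from the WPBC data set.

```python
import math

def grnn_predict(x, training_inputs, training_targets, bandwidth):
    """Estimate E(z|x) as a Gaussian-weighted average of the training targets."""
    kernel_weights = []
    for x_k in training_inputs:
        squared_distance = sum((xi - xk) ** 2 for xi, xk in zip(x, x_k))
        kernel_weights.append(math.exp(-squared_distance / (2.0 * bandwidth ** 2)))
    total = sum(kernel_weights)
    if total == 0.0:
        # All kernels underflowed: the input is too far from every training case.
        raise ValueError("input lies too far from the training data for this bandwidth")
    return sum(w * t for w, t in zip(kernel_weights, training_targets)) / total

# Hypothetical training cases: one input feature and a scalar target.
training_inputs = [[0.0], [1.0], [2.0]]
training_targets = [10.0, 20.0, 30.0]

middle = grnn_predict([1.0], training_inputs, training_targets, bandwidth=0.5)
edge = grnn_predict([0.0], training_inputs, training_targets, bandwidth=0.5)
```

    A small bandwidth makes the prediction follow the nearest training case closely, while a large bandwidth smooths the output towards the mean of all targets, which is why the bandwidth is usually chosen by cross validation.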

    Fuzzy-Neuro System:

    Fuzzy-Neuro system uses a learning procedure to find a set of fuzzy membership

    functions which can be expressed in the form of if-then rules[41]-[43]. A fuzzy

    inference system uses fuzzy logic, rather than Boolean logic, to reason about data

    [44]. Its basic structure includes four main components- a fuzzifier, which translates

    crisp (real-valued) inputs into fuzzy values; an inference engine that applies a fuzzy

    reasoning mechanism to obtain a fuzzy output; a defuzzifier, which translates this

    latter output into a crisp value; and a knowledge base, which contains both an

    ensemble of fuzzy rules, known as the rule base, and an ensemble of membership

    functions, known as the database. The decision-making process is performed by the

    inference engine using the rules contained in the rule base [45]. The fuzzy logic
    procedure can be summarized in the following steps: determination of the input and

    output variables that describe the observed phenomenon together with the selection

    of their variation interval, defining a set of linguistic values together with their

    associated membership functions that map/cover the numerical range of the fuzzy

    variable, and defining a set of fuzzy inference rules between input and output fuzzy

    variables [46]. The following research papers use a fuzzy logic approach for breast cancer
    diagnosis:

    A fuzzy-genetic approach to breast cancer diagnosis, Author: Carlos
    Andrés Peña-Reyes, Moshe Sipper [47].

    Abstract:

    The automatic diagnosis of breast cancer is an important, real-world medical

    problem. In this paper we focus on the Wisconsin breast cancer diagnosis (WBCD)

    problem, combining two methodologies, fuzzy systems and evolutionary
    algorithms, so as to automatically produce diagnostic systems. We find that our

    fuzzy-genetic approach produces systems exhibiting two prime characteristics: first,

    they attain high classification performance (the best shown to date), with the

    possibility of attributing a confidence measure to the output diagnosis; second, the

    resulting systems involve a few simple rules, and are therefore (human-)

    interpretable.

    Comments:

    A good computerized diagnostic tool should possess two characteristics, which are

    often in conflict. First, the tool must attain the highest possible performance, i.e.

    diagnose the presented cases correctly as being either benign or malignant.

    Moreover, it would be highly desirable to be in possession of a so-called degree of

    confidence: the system not only provides a binary diagnosis (benign or malignant),

    but also outputs a numeric value that represents the degree to which the system is

    confident about its response. Second, it would be highly beneficial for such a

    diagnostic system to be human-friendly, exhibiting so-called interpretability. This

    means that the physician is not faced with a black box that simply spouts answers

    (albeit correct) with no explanation; rather, we would like for the system to provide

    some insight as to how it derives its outputs. In this paper we combine two

    methodologies, fuzzy systems and evolutionary algorithms, so as to automatically

    produce systems for breast cancer diagnosis. The major advantage of fuzzy systems

    is that they favour interpretability; however, finding good fuzzy systems can be quite

    an arduous task. This is where evolutionary algorithms step in, enabling the

    automatic production of fuzzy systems, based on a database of training cases.
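
    A very small fuzzy inference system with the components described in this subsection (fuzzifier, rule base, inference engine and defuzzifier) can be sketched as below, including a numeric degree of confidence alongside the binary diagnosis. The membership functions, the two rules and the thresholds are hypothetical; they are hand-written for illustration and are not the evolved systems of [47].

```python
def low(value, start=3.0, end=7.0):
    """Membership in 'low': 1 up to `start`, falling linearly to 0 at `end`."""
    if value <= start:
        return 1.0
    if value >= end:
        return 0.0
    return (end - value) / (end - start)

def high(value, start=3.0, end=7.0):
    """Membership in 'high', the complement of 'low'."""
    return 1.0 - low(value, start, end)

def diagnose(features):
    """Return (diagnosis, confidence) for a list of feature values."""
    # Rule 1: IF any feature is high THEN malignant (fuzzy OR = max).
    malignant_strength = max(high(v) for v in features)
    # Rule 2: IF all features are low THEN benign (fuzzy AND = min).
    benign_strength = min(low(v) for v in features)
    # Defuzzifier: weighted average of the rule outputs (benign = 0, malignant = 1).
    crisp = malignant_strength / (benign_strength + malignant_strength)
    diagnosis = "malignant" if crisp >= 0.5 else "benign"
    confidence = abs(crisp - 0.5) * 2.0
    return diagnosis, confidence

clear_benign = diagnose([1.0, 2.0, 1.0])
clear_malignant = diagnose([8.0, 2.0, 1.0])
borderline = diagnose([4.0, 2.0, 1.0])
```

    The two rules can be read directly by a physician, which illustrates the interpretability argument: the system's answer can be traced back to which feature fired which rule.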

    Cancer Diagnosis Using Modified Fuzzy Network, Author: Essam Al-

    Daoud [48]

    Abstract:

    In this study, a modified fuzzy c-means (MFCM) radial basis function (RBF)

    network is proposed. The main purposes of the suggested model are to diagnose the

    cancer diseases by using fuzzy rules with relatively small number of linguistic labels,

    reduce the similarity of the membership functions and preserve the meaning of the

    linguistic labels. The modified model is implemented and compared with adaptive

    neuro-fuzzy inference system (ANFIS). Both models are applied to the "Wisconsin

    Breast Cancer" data set. Three rules are needed to obtain the classification rate 97%

    by using the modified model (3 out of 114 is classified wrongly). On the contrary,

    more rules are needed to get the same accuracy by using ANFIS. Moreover, the

    results indicate that the new model is more accurate than the state-of-art prediction

    methods. The suggested neuro-fuzzy inference system can be re-applied to many

    applications such as data approximation, human behavior representation, forecasting

    urban water demand and identifying DNA splice sites.

    Comments:

    ANFIS works with different activation functions and uses un-weighted connections
    in each layer. ANFIS consists of five layers and can be adapted by a supervised
    learning algorithm. In this paper ANFIS and the modified fuzzy RBF (MFRBF) are
    applied to the Wisconsin Breast Cancer data set. The main purposes of the suggested
    model are to diagnose cancer by using fuzzy rules with a relatively small
    number of linguistic labels, reduce the similarity of the membership functions and
    preserve the meaning of the linguistic labels. The standard fuzzy c-means (FCM) has various
    well-known problems: the number of clusters must be specified in
    advance, the output membership functions have high similarity, and FCM is
    an unsupervised method that cannot preserve the meaning of the linguistic labels. In
    contrast, the grid partition method solves some of these problems, but it produces a
    very high number of output clusters. The basic idea of the suggested MFCM
    algorithm is to combine the advantages of the two methods: if more than
    one cluster center exists in one partition, they are merged and the
    membership values are calculated again; if there is no cluster center in a partition, it is deleted
    and the other clusters are redefined. The experimental results show that MFRBF can be
    used to get high accuracy with fewer and unambiguous rules. The classification rate
    is 97% using only three rules, whereas more rules are needed to get the
    same accuracy using ANFIS. Moreover, the projected feature partition in ANFIS
    is ambiguous and cannot preserve the meaning of the linguistic labels.

    Genetic Algorithm (GA):

    The standard GA proceeds as follows: an initial population of individuals is

    generated at random or heuristically. In every evolutionary step, known as a generation,

    the individuals in the current population are decoded and evaluated according to

    some predefined quality criterion. To form a new population (the next generation),

    individuals are selected according to their fitness. Many selection procedures are

    currently in use, one of the simplest being fitness-proportionate selection, where

    individuals are selected with a probability proportional to their relative fitness. This

    ensures that the expected number of times an individual is chosen is approximately

    proportional to its relative performance in the population. Thus, high-fitness or good

    individuals stand a better chance of reproducing, while low-fitness ones are more

    likely to disappear [45]. Genetic algorithms can be used to determine the

    interconnecting weights of the ANN. During training of the network, BP requires
    approximately two ANN evaluations (i.e., one forward propagation and one
    backward error propagation) for each iteration, while the GA requires only one ANN
    evaluation (i.e., forward propagation) for each generation and each chromosome. In
    comparison to the conventional BP training algorithm, the GA has been shown to provide
    some benefit in evolving the interconnecting weights of the ANN. In [49], although
    the GA-trained ANN did not outperform the BP-trained ANN at all numbers of ANN
    evaluations in the test set, the GA-trained ANN was found to converge faster than the
    BP-trained ANN in the training set.
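
    Fitness-proportionate selection and one evolutionary step over candidate weight vectors can be sketched as follows. The one-point crossover, the Gaussian mutation, the population size and the stand-in fitness function are assumptions chosen for illustration; in an actual GA-trained ANN the fitness would come from one forward evaluation of the network per chromosome, and this is not the scheme used in [49].

```python
import random

def roulette_select(population, fitnesses, rng):
    """Pick one individual with probability proportional to its fitness
    (fitness values are assumed to be non-negative)."""
    pick = rng.uniform(0.0, sum(fitnesses))
    running = 0.0
    for individual, fitness in zip(population, fitnesses):
        running += fitness
        if pick < running:
            return individual
    return population[-1]

def next_generation(population, fitness_fn, rng, mutation_scale=0.1):
    """Create a new population by selection, one-point crossover and mutation."""
    fitnesses = [fitness_fn(individual) for individual in population]
    children = []
    for _ in population:
        mother = roulette_select(population, fitnesses, rng)
        father = roulette_select(population, fitnesses, rng)
        cut = rng.randrange(1, len(mother))
        child = mother[:cut] + father[cut:]
        children.append([gene + rng.gauss(0.0, mutation_scale) for gene in child])
    return children

# Stand-in fitness: closeness to a hypothetical set of ideal weights.
ideal_weights = [0.5, -0.3, 0.8]

def fitness(chromosome):
    squared_error = sum((g - t) ** 2 for g, t in zip(chromosome, ideal_weights))
    return 1.0 / (1.0 + squared_error)

rng = random.Random(42)
population = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(20)]
for _ in range(30):
    population = next_generation(population, fitness, rng)
best = max(population, key=fitness)
```

    Only the fitness of each chromosome is needed, so no error gradient has to be propagated backwards through the network.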

    Computer-aided diagnosis of breast cancer using artificial neural networks:

    Comparison of Back propagation and Genetic Algorithms Author: Yuan-Hsiang

    Chang, Bin Zheng, Xiao-Hui Wang, and Walter F. Good [49].

    Abstract:

    The authors investigated computer-aided diagnosis (CAD) schemes to determine the

    probability for the presence of breast cancer using artificial neural networks (ANN)
    that were trained by a Backpropagation (BP) algorithm or by a Genetic Algorithm
    (GA). A clinical database of 418 previously verified patient cases was employed and
    randomly partitioned into two independent sets for CAD training and testing. During
    training, the BP and the GA were independently applied to optimize, or to evolve, the
    interconnecting weights of the ANN. Both the BP-trained and the GA-trained CAD

    performances were then compared using receiver-operating characteristics (ROC)

    analysis. In the training set, the BP-trained and the GA-trained CAD schemes yielded

    the areas under ROC curves of 0.91 and 0.93, respectively. In the testing set, both the

    BP-trained and the GA-trained ANN, yielded the areas under ROC curves of

    approximately 0.83. These results demonstrated that the GA performed slightly

    better, although not significantly, than BP for the training of the CAD schemes.
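
    As an illustration of the ROC comparison reported in the abstract, the area under the ROC curve can be computed from classifier scores with the pairwise (Mann-Whitney) formulation. This is a minimal sketch; the function name and the labels and scores in the usage line are hypothetical and are not the data of [49].

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positive and negative scores.

    labels: 1 for a positive (cancer) case, 0 for a negative case.
    scores: classifier outputs, where a higher score means "more likely positive".
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Hypothetical scores for four cases.
print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
```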

    Comments:

    In this paper it is found that although the GA-trained ANN did not outperform the

    BP-trained ANN at all numbers of ANN evaluations on the test set, the GA-trained ANN

    was found to converge faster than the BP-trained ANN on the training set.

    2.3 Comparison of neural network techniques for breast cancer diagnosis and

    prognosis

    NN techniques for breast cancer diagnosis are compared on the WBC data. It is

    concluded that MLP, RBFNN, PNN, GRNN, GA, fuzzy-neuro systems, SANE,

    IGANIFS, the Xcyct system, ANFIS and SIANN may be used for the classification problem.

    Almost all intelligent computational learning algorithms use supervised learning. The

    accuracy of the different methods is compared in Table 2.1.

    Table 2.1 Accuracy comparison for test data classification

    Type of Network                                                          Accuracy     Reference
    Radial Basis Function Neural Network (RBFNN)                             96.18%       [18]
    Probabilistic Neural Network (PNN)                                       97.0%        [18]
    Multilayer Perceptron (MLP)                                              95.74%       [18]
    Generalized Regression Neural Network (GRNN)                             98.8%        [18]
    Symbiotic Adaptive Neuro-Evolution (SANE)                                98.7%        [50]
    Information Gain and Adaptive Neuro-Fuzzy Inference System (IGANIFS)     98.24%       [51]
    Xcyct system using leave one out method                                  90 to 91%    [52]
    Adaptive Neuro-Fuzzy Inference System (ANFIS)                            59.90%       [53]
    Fuzzy                                                                    96.71%       [54]
    Shunting Inhibitory Artificial Neural Networks (SIANN)                   100%         [55]

    Chapter 4

    Matlab

    4.1 Introduction

    MATLAB is a powerful computing system for handling the calculations involved in

    scientific and engineering problems. The name MATLAB stands for MATrix

    LABoratory, because the system was designed to make matrix computations

    particularly easy [87]. MATLAB program and script files always have filenames ending

    with ".m". Script files contain a sequence of ordinary MATLAB commands that are

    executed in order once the script is called within MATLAB. In MATLAB, almost

    every data object is assumed to be an array. A good source of information about

    MATLAB, its developer The MathWorks, Inc., and the company's other products

    is the web page www.mathworks.com [88]. There are two essential requirements

    for successful MATLAB programming [87]:

    a) We need to learn the exact rules for writing MATLAB statements.

    b) We need to develop a logical plan of attack for solving particular problems.

    The MATLAB program implements the MATLAB programming language, and

    provides a very extensive library of predefined functions to make technical

    programming tasks easier and more efficient.

    4.2 Advantages of MATLAB [89]

    MATLAB has many advantages compared to conventional computer languages for

    technical problem solving. Among them are:

    1. Ease of use:

    Because MATLAB is an interpreted language like Basic, it is very easy to use. Programs may

    be easily written and modified with the built-in integrated development environment

    and debugged with the MATLAB debugger. Because the language is so easy to use,

    it is ideal for the rapid prototyping of new programs. Many program development

    tools are provided to make the program easy to use. They include an integrated

    editor/debugger, on-line documentation and manuals, a workspace browser, and

    extensive demos.

    2. Platform Independence:

    MATLAB programs written on any platform will run on all of the other platforms,

    and data files written on any platform may be read transparently on any other

    platform. As a result, programs written in MATLAB can migrate to new platforms

    when the needs of the user change.

    3. Predefined functions:

    MATLAB has an extensive library of predefined functions that provide tested

    and pre-packaged solutions to many basic technical tasks. There are many

    special-purpose toolboxes available to solve complex problems in specific areas.

    Toolboxes are libraries of MATLAB functions used to customize MATLAB for solving

    a particular class of problem. They are the result of work by some of the world's top

    researchers in specialized fields, and are equivalent to pre-packaged off-the-shelf

    software for a particular class of problem. They are collections of special files

    called M-files that extend the functionality of the base program. Such files are called

    M-files because they must have the filename extension ".m". This extension is

    required in order for these files to be interpreted by MATLAB. Each toolbox is

    purchased separately. If an evaluation license is requested, the MathWorks sales

    department requires detailed information about the project for which MATLAB is to

    be evaluated. Overall, the process of acquiring a license is expensive in terms of

    money and time. If granted (which it often is), the evaluation license is valid for two

    to four weeks. The available toolboxes include:

    a. Control Systems

    b. Signal Processing

    c. Communications

    d. System Identification

    e. Robust Control

    f. Simulink

    g. Image Processing

    h. Neural Networks

    i. Fuzzy Logic

    j. Analysis

    k. Optimization

    l. Spline

    m. Symbolic

    n. User interface utility

    4. Device-independent plotting:

    MATLAB has many integral plotting and imaging commands. The plots and images

    can be displayed on any graphical output device supported by the computer on which

    MATLAB is running. This capability makes MATLAB an outstanding tool for

    visualizing technical data.

    5. Graphical User Interface:

    MATLAB includes tools that allow a programmer to interactively construct a

    graphical user interface (GUI) for his/her own program. With this capability, the

    programmer can design sophisticated data-analysis programs that can be operated by

    relatively inexperienced users.

    6. MATLAB Compiler:

    MATLAB code is interpreted rather than compiled. A separate compiler is available.

    This compiler can compile a MATLAB program into a true executable that runs

    faster than the interpreted code. It is a good way to convert a prototype MATLAB

    program into an executable suitable for sale and distribution to users.

    MATLAB is an efficient tool for developing applications based on neural networks.

    Therefore it is used in the proposed work for breast cancer diagnosis and prognosis

    using a polynomial neural network.

    4.3 Limitations of MATLAB [89]

    The following are some limitations of MATLAB:

    1. It is an interpreted language and therefore can execute more slowly than

    compiled languages.

    This problem can be mitigated by properly structuring the MATLAB program and by

    the use of MATLAB compiler to compile the final MATLAB program before

    distribution and general use.

    2. A full copy of MATLAB is 5-10 times more expensive than a conventional

    C or FORTRAN compiler. There is also an inexpensive student edition of

    MATLAB, which is a great tool for students. The student edition of MATLAB is

    essentially identical to the full edition.

    4.4 Neural Network Toolbox [90]

    The Neural Network Toolbox is equivalent to pre-packaged off-the-shelf software for

    the neural network class of problems. The Neural Network Toolbox software uses the

    network object to store all of the information that defines a neural network. There are

    four different levels at which the Neural Network Toolbox software can be used:

    1. The first level is represented by the GUIs that are described in Getting

    Started with Neural Network Toolbox. These provide a quick way to access the

    power of the toolbox for many problems of function fitting, pattern recognition,

    clustering and time series analysis.

    2. The second level of toolbox use is through basic command-line operations.

    The command-line functions use simple argument lists with intelligent default

    settings for function parameters. (You can override all of the default settings, for

    increased functionality.) This topic, and the ones that follow, concentrate on

    command-line operations. The GUIs described in Getting Started can automatically

    generate MATLAB code files with the command-line implementation of the GUI

    operations. This provides a nice introduction to the use of the command-line

    functionality.

    3. A third level of toolbox use is customization of the toolbox. This advanced

    capability allows you to create your own custom neural networks, while still having

    access to the full functionality of the toolbox.

    4. The fourth level of toolbox usage is the ability to modify any of the M-files

    contained in the toolbox. Every computational component is written in MATLAB

    code and is fully accessible.

    4.5 Neural Network Design using Neural Network Toolbox [90]

    The multilayer feed forward neural network is the workhorse of the Neural Network

    Toolbox software. It can be used for both function fitting and pattern recognition

    problems. With the addition of a tapped delay line, it can also be used for prediction

    problems. The workflow for the neural network design process has seven primary

    steps:

    1. Collecting the data

    2. Creating the network

    3. Configuring the network

    4. Initializing the weights and biases

    5. Training the network

    6. Validating the network

    7. Using the network

    The first step might happen outside the framework of Neural Network Toolbox

    software, but this step is critical to the success of the design process.

    4.5.1 Collecting the data

    We need to collect and prepare sample data that cover the range of inputs for which

    the network will be used. After the data have been collected, there are two steps that

    need to be performed before the data are used to train the network: the data need to

    be pre-processed, and they need to be divided into subsets.

    4.5.1.1 Pre-processing and post-processing the data

    The most common pre-processing routines are provided automatically when we

    create a network, and they become part of the network object, so that whenever the

    network is used, the data coming into the network is pre-processed in the same way.

    It is easiest to think of the neural network as having a pre-processing block that

    appears between the input and the first layer of the network and a post-processing

    block that appears between the last layer of the network and the output, as shown in

    Fig. 4.1.

    [Figure: Input -> pre-processing -> neural network -> post-processing -> Output]

    Fig 4.1 Pre-processing and post-processing

    Most of the network creation functions in the toolbox, including the multilayer

    network creation functions, such as feedforwardnet, automatically assign processing

    functions to network inputs and outputs. These functions transform the input and

    target values you provide into values that are better suited for network training. Some

    common pre-processing and post-processing functions are shown in Table 4.1.

    Table 4.1 Pre-processing and post-processing functions

    Function        Algorithm
    mapminmax       Normalize inputs/targets to fall in the range [-1, 1]
    processpca      Extract principal components from the input vector
    fixunknowns     Process unknown inputs

    Generally, the normalization step is applied to both the input vectors and the target

    vectors in the data set. In this way, the network output always falls into a normalized

    range. The network output can then be reverse transformed back into the units of the

    original target data when the network is put to use in the field.
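
    The normalization and its reverse transformation can be sketched as follows. This is a plain Python illustration of min-max scaling to [-1, 1], not the toolbox's own code; the function names, the settings dictionary and the handling of constant inputs are assumptions made for the example.

```python
def fit_minmax(values, lo=-1.0, hi=1.0):
    """Record the settings that map [min(values), max(values)] onto [lo, hi]."""
    return {"xmin": min(values), "xmax": max(values), "lo": lo, "hi": hi}


def apply_minmax(values, s):
    """Normalize values with previously fitted settings."""
    span = s["xmax"] - s["xmin"]
    if span == 0:
        # Assumption for this sketch: a constant input is mapped to the lower bound.
        return [s["lo"] for _ in values]
    return [(s["hi"] - s["lo"]) * (v - s["xmin"]) / span + s["lo"] for v in values]


def reverse_minmax(values, s):
    """Map normalized values back into the units of the original data."""
    span = s["xmax"] - s["xmin"]
    return [(v - s["lo"]) * span / (s["hi"] - s["lo"]) + s["xmin"] for v in values]


settings = fit_minmax([1, 5.5, 10])          # e.g. an attribute with domain 1-10
normalized = apply_minmax([1, 5.5, 10], settings)
restored = reverse_minmax(normalized, settings)
print(normalized, restored)
```

    Keeping the fitted settings separate from the data mirrors the idea in the text that the same transformation is stored with the network and reapplied to every later input.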

    4.5.1.2 Representing Unknown or Don't Care Targets

    Unknown or "don't care" targets can be represented with NaN values. All the

    performance functions of the toolbox will ignore those targets for purposes of

    calculating performance and derivatives of performance.
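
    The effect of NaN targets on a performance measure can be illustrated with a mean squared error that skips unknown targets. This is a minimal Python sketch of the idea, assuming a hypothetical function name; it is not the toolbox implementation.

```python
import math


def mse_ignoring_nan(targets, outputs):
    """Mean squared error over the targets that are known (not NaN)."""
    pairs = [(t, y) for t, y in zip(targets, outputs) if not math.isnan(t)]
    if not pairs:
        raise ValueError("all targets are unknown")
    return sum((t - y) ** 2 for t, y in pairs) / len(pairs)


# The second target is a "don't care" value and does not affect the error.
print(mse_ignoring_nan([1.0, float("nan"), 0.0], [0.5, 0.9, 0.5]))
```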

    4.5.1.3 Dividing the Data

    When training multilayer networks, the general practice is to first divide

    the data into three subsets: training, validation and testing. The function dividerand

    is the default function; it divides the data randomly into these three subsets.
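
    A random three-way division of sample indices can be sketched in Python as below. The 70/15/15 ratios are an assumption for this example (they are believed to match the toolbox defaults but are not stated in the text), and the function name and seed parameter are hypothetical.

```python
import random


def divide_random(n_samples, train_ratio=0.70, val_ratio=0.15, seed=None):
    """Randomly split sample indices into training, validation and test subsets."""
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    n_train = round(train_ratio * n_samples)
    n_val = round(val_ratio * n_samples)
    train = sorted(indices[:n_train])
    val = sorted(indices[n_train:n_train + n_val])
    test = sorted(indices[n_train + n_val:])   # whatever remains goes to the test set
    return train, val, test


# 699 is the number of instances in the WBC dataset.
train_idx, val_idx, test_idx = divide_random(699, seed=0)
print(len(train_idx), len(val_idx), len(test_idx))
```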

    4.5.2 Creating and configuring the network

    Basic components of a neural network are created and stored in the network object.

    As an example, the dataset file contains a predefined set of input and target vectors.

    We load the dataset using the load command. Loading the dataset file creates two

    variables: the input matrix and the target matrix. The function

    feedforwardnet creates a multilayer feedforward network. The resulting network can

    then be configured with the configure command.

    4.5.3 Initializing weights and biases

    The configure command automatically initializes the weights, but we might want to

    reinitialize them. We do this with the init command. This function takes a network

    object as input and returns a network object with all weights and biases initialized.

    4.5.4 Train the network

    Once the network weights and biases are initialized, the network is ready for training.

    The multilayer feed forward network can be trained for function approximation

    (nonlinear regression) or pattern recognition. The training process requires a set of

    examples of proper network behaviour: network inputs p and target outputs t. The

    process of training a neural network involves tuning the values of the weights and

    biases of the network to optimize network performance, as defined by the network

    performance function net.performFcn. The default performance function for feed

    forward networks is mean square error (mse). For training multilayer feedforward

    networks, any standard numerical optimization algorithm like gradient descent can be

    used to optimize the performance function. Gradient descent algorithm updates the

    network weights and biases in the direction in which the performance function

    decreases most rapidly, that is, along the negative of the gradient. The training function

    traingd implements the gradient descent algorithm. The gradient is calculated using a technique

    called the back propagation algorithm, which involves performing computations

    backward through the network. Properly trained multilayer networks tend to give

    reasonable answers when presented with inputs that they have never seen. This

    property is called generalization. The default generalization feature for the multilayer

    feed forward network is early stopping. Data are automatically divided into training,

    validation and test sets. The error on the validation set is monitored during training,

    and the training is stopped when the validation error increases for

    net.trainParam.max_fail consecutive iterations.
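
    Gradient descent on the mean square error, combined with early stopping on a validation set, can be sketched for the simplest possible case of a single linear neuron y = w*x + b. This Python example is an illustration only and is not the traingd implementation; the learning rate, the epoch limit and the default max_fail of 6 are assumptions for the sketch.

```python
def mse(w, b, xs, ts):
    """Mean squared error of the linear neuron y = w*x + b on (xs, ts)."""
    return sum((w * x + b - t) ** 2 for x, t in zip(xs, ts)) / len(xs)


def train_gd_early_stopping(train, val, lr=0.05, epochs=1000, max_fail=6):
    """Batch gradient descent on MSE with validation-based early stopping.

    Returns the weight, bias and validation error of the best epoch.
    """
    (xs, ts), (xv, tv) = train, val
    w, b = 0.0, 0.0
    best = (mse(w, b, xv, tv), w, b)
    fails = 0
    n = len(xs)
    for _ in range(epochs):
        # Gradient of the training MSE with respect to w and b.
        grad_w = sum(2 * (w * x + b - t) * x for x, t in zip(xs, ts)) / n
        grad_b = sum(2 * (w * x + b - t) for x, t in zip(xs, ts)) / n
        # Step in the direction of the negative gradient.
        w -= lr * grad_w
        b -= lr * grad_b
        val_err = mse(w, b, xv, tv)
        if val_err < best[0]:
            best = (val_err, w, b)
            fails = 0
        else:
            fails += 1
            if fails >= max_fail:   # validation error failed to improve max_fail times in a row
                break
    return best[1], best[2], best[0]


# Hypothetical data generated from t = 2x + 1.
w, b, err = train_gd_early_stopping(([0, 1, 2, 3], [1, 3, 5, 7]),
                                    ([0.5, 1.5, 2.5], [2, 4, 6]))
print(w, b, err)
```

    Returning the parameters of the best validation epoch, rather than the last one, is the design choice that makes early stopping act as a generalization safeguard.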

    4.5.5 Validation of network

    When the training is complete, we check the network performance and determine if

    any changes need to be made to the training process, the network architecture or the

    data sets. The first thing to do is to check the training record, tr, which was the

    second argument returned from the training function. For example, tr.trainInd,

    tr.valInd and tr.testInd contain the indices of the data points that were used in the

    training, validation and test sets, respectively. If we want to retrain the network using

    the same division of data, we can set net.divideFcn to 'divideind',

    net.divideParam.trainInd to tr.trainInd, net.divideParam.valInd to tr.valInd, and

    net.divideParam.testInd to tr.testInd. We can use the training record to plot the

    performance progress by using the plotperf command. The next step in validating

    the network is to create a regression plot, which shows the relationship between the

    outputs of the network and the targets. If the training were perfect, the network

    outputs and the targets would be exactly equal, but the relationship is rarely perfect in

    practice. If the network is not sufficiently accurate, we can try reinitializing the

    network and training again. Each time we initialize a feed forward network, the

    network parameters are different and might produce different solutions.
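
    The relationship summarized by a regression plot can be quantified by fitting a line of outputs against targets and computing the correlation coefficient. The following Python sketch is an illustration under assumed names and made-up values; it does not reproduce the toolbox's plotting functions.

```python
import math


def regression_fit(targets, outputs):
    """Least-squares line outputs = slope * targets + intercept, plus correlation R."""
    n = len(targets)
    mean_t = sum(targets) / n
    mean_y = sum(outputs) / n
    s_tt = sum((t - mean_t) ** 2 for t in targets)
    s_yy = sum((y - mean_y) ** 2 for y in outputs)
    s_ty = sum((t - mean_t) * (y - mean_y) for t, y in zip(targets, outputs))
    slope = s_ty / s_tt
    intercept = mean_y - slope * mean_t
    r = s_ty / math.sqrt(s_tt * s_yy)
    return slope, intercept, r


# Hypothetical targets and network outputs; a perfect fit gives slope 1, intercept 0, R = 1.
print(regression_fit([0, 1, 2, 3], [0.1, 0.9, 2.2, 2.8]))
```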

    4.5.6 Use the network

    After the network is trained and validated, the network object can be used to

    calculate the network response to any input.

    Chapter 5

    Simulation and Results

    5.1 Introduction

    For simulation, three different datasets, namely the Wisconsin Breast Cancer original

    (WBC) dataset, the Wisconsin Diagnosis Breast Cancer (WDBC) dataset and the Wisconsin

    Prognosis Breast Cancer (WPBC) dataset, are downloaded from the UCI Machine

    Learning Repository website [91] and saved as text files. A brief description of the

    Wisconsin datasets is given in Table 5.1. A detailed description of the datasets is provided in

    the next section.

    Table 5.1 A brief description of Breast Cancer datasets

    Dataset name                                  No. of attributes    No. of instances    No. of classes
    Wisconsin Breast Cancer (WBC)                 11                   699                 2
    Wisconsin Diagnosis Breast Cancer (WDBC)      32                   569                 2
    Wisconsin Prognosis Breast Cancer (WPBC)      34                   198                 2

    After downloading, we have three separate files, one for each dataset. These files

    are then imported into Excel spreadsheets and the values are saved with the

    corresponding attributes as column headers. The ID of the patient cases does not

    contribute to the classifier performance. Hence it is removed, and the outcome

    attribute defines the target or dependent variable. We preprocessed the data using

    principal component analysis, described in chapter 3 [34]. After pre-processing, the

    WBC data is applied to the PNN described in chapter 3 [29-31], which classifies the data

    into two sets. The overall classification involves training and testing, as shown in Fig.

    5.1. Implementation is done with the help of MATLAB 7.0 using the Neural Network Toolbox

    described in chapter 4 [40-41].

    Fig. 5.1 Flow chart of ANN process
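
    Removing the ID column and separating the outcome attribute from the input attributes can be sketched as follows. This Python example assumes a row layout of ID first and class last, with the WBC class coding of 2 for benign and 4 for malignant; the function name and the two sample rows are hypothetical and are not taken from the actual files.

```python
def split_features_target(rows):
    """Split rows of the form [id, attr_1, ..., attr_k, class] into inputs and targets.

    The ID is dropped; class 4 (malignant) becomes 1 and class 2 (benign) becomes 0.
    """
    features, targets = [], []
    for row in rows:
        *attrs, label = row[1:]              # row[0] is the patient ID and is discarded
        if label not in (2, 4):
            raise ValueError("unexpected class label: %r" % (label,))
        features.append([float(a) for a in attrs])
        targets.append(1 if label == 4 else 0)
    return features, targets


# Two made-up rows with nine attributes each.
rows = [
    [1001, 5, 1, 1, 1, 2, 1, 3, 1, 1, 2],
    [1002, 8, 10, 10, 8, 7, 10, 9, 7, 1, 4],
]
X, y = split_features_target(rows)
print(len(X[0]), y)
```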

    5.2 Description of dataset

    A detailed description of the three datasets used in the proposed research is as follows

    [83]:

    Wisconsin Breast Cancer (WBC) Dataset:

    This database has 699 instances and 10 attributes including the class attribute.

    Attributes 1 through 9 are used to represent instances. Each instance has one of two

    possible classes: benign or malignant. According to the class distribution, 458 or

    65.5% of the instances are benign and 241 or 34.5% of the instances are malignant. Table 5.2

    provides the attribute information.
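
    The class percentages quoted above follow directly from the instance counts, as this short Python check shows. The function name is an assumption made for the example; the counts are those given in the text.

```python
def class_distribution(counts):
    """Percentage share of each class, rounded to one decimal place."""
    total = sum(counts.values())
    return {name: round(100.0 * c / total, 1) for name, c in counts.items()}


wbc = class_distribution({"benign": 458, "malignant": 241})
print(wbc)
```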

    Table 5.2 Attribute information of WBC dataset

    S.no     Attribute                       Domain
    1        Clump thickness                 1-10
    2        Uniformity of cell size         1-10
    3        Uniformity of cell shape        1-10
    4        Marginal adhesion               1-10
    5        Single epithelial cell size     1-10
    6        Bare nuclei                     1-10
    7        Bland chromatin                 1-10
    8        Normal nucleoli                 1-10
    9        Mitoses                         1-10
             Class                           2 for benign, 4 for malignant

    Clump thickness: benign cells tend to be grouped in monolayers, while

    cancerous cells are often grouped in multilayers. Uniformity of cell

    size/shape: cancer cells tend to vary in size and shape, which is why these

    parameters are valuable in determining whether the cells are cancerous or not.

    Marginal adhesion: normal cells tend to stick together, whereas cancer cells

    tend to lose this ability, so loss of adhesion is a sign of malignancy. Single

    epithelial cell size: the size is related to the uniformity mentioned above. Epithelial

    cells that are significantly enlarged may be malignant. Bare nuclei is a

    term used for nuclei that are not surrounded by cytoplasm (the rest of the cell). Those

    are typically seen in benign tumors. Bland chromatin describes a uniform

    texture of the nucleus seen in benign cells. In cancer cells the chromatin tends to be

    coarser. Normal nucleoli: nucleoli are small structures seen in the nucleus. In normal cells

    the nucleolus is usually very small, if visible at all. In cancer cells the nucleoli become

    more prominent, and sometimes there are more of them. Finally, mitosis is nuclear

    division plus cytokinesis, and produces two identical daughter cells. It

    is the process in which the cell divides and replicates. Pathologists can determine the

    grade of a cancer by counting the number of mitoses.

    Wisconsin Diagnosis Breast Cancer (WDBC) Dataset:

    This database has 569 instances and 32 attributes including the class attribute.

    Attribute 2 is the class attribute. The other attributes are used to represent instances. Each

    instance has one of two possible classes: benign or malignant. According to the class

    distribution, 357 instances are benign and 212 instances are malignant. Table 5.3

    provides the attribute information of WDBC dataset.

    Table 5.3 Attribute information of WDBC dataset

    Attribute name Significance Attribute ID

    ID Unique ID of patient 1

    Outcome Diagnosis (B - Benign / M - Malignant) 2

    Radius 1,2,3 Mean of distances from centre to points on the perimeter 3, 13, 23

    Texture 1, 2,3 Standard deviation of gray scale values 4, 14, 24

    Perimeter 1,2,3 Perimeter of the cell nucleus 5, 15, 25

    Area 1,2,3 Area of the cell nucleus 6, 16, 26

    Smoothness 1,2,3 Local variation in radius lengths 7, 17,27

    Compactness 1,2,3 Perimeter^2 / area -