  • i

    TIMBER DEFECT DETECTION BASED ON SYSTEMATIC FEATURE

    ANALYSIS AND ONE CLASS CLASSIFIER

    UMMI RABA’AH BINTI HASHIM

    A thesis submitted in fulfilment of the

    requirements for the award of the degree of

    Doctor of Philosophy (Computer Science)

    Faculty of Computing

    Universiti Teknologi Malaysia

    DECEMBER 2015

  • iii

    DEDICATION

    To my beloved husband, children, parents and brothers.

  • iv

    ACKNOWLEDGEMENT

    In the name of Allah, most gracious, most merciful. Praise be to Allah, for

    guiding me on the right path and blessing me with the best in this life. It takes the effort

    and support of many to bring this research study to completion. I am indebted to the

    dozens of people who guided and supported me throughout this study. I would like to

    express my gratitude to the following special individuals:

    1. My supervisor and co-supervisor, Assoc. Prof. Dr. Siti Zaiton binti Mohd

    Hashim and Assoc. Prof. Dr. Azah Kamilah Muda, for their wonderful

    guidance and continuous encouragement during the progression of my study.

    2. Academicians of UTM, for their valuable teaching, comments, ideas and

    motivation for this research.

    3. Industry experts from Hasro Malaysia, Teras Puncak and Elegant Success

    (Malaysian wood products manufacturers) for their co-operation, invaluable

    consultation and kind support.

    4. Universiti Teknikal Malaysia Melaka (UTeM) and Ministry of Education

    Malaysia for their generous financial support.

    5. My husband and children, for their patience and love.

    6. My parents and brothers, for their blessing and care.

  • v

    ABSTRACT

    Substantial research effort has been devoted to the automation of timber defect

    detection to improve the quality of timber products, optimise raw material resources,

    increase productivity and reduce errors related to human labour. This study extends

    the work on automated inspection of timber boards to Malaysian timber species,

    in the hope that the outcome will benefit the local wood product industries. This study

    aims to propose a timber surface defect detection approach that is robust in

    detecting various defects on multiple timber species using significant texture

    features, validated using data from local timber species. In the experiments, defective

    samples from Malaysian hardwood were collected and labelled under the supervision of

    industry experts. Additionally, this work gives new insight into the characterisation

    of timber defect images by using statistical texture features from the orientation-independent

    Grey Level Dependence Matrix (GLDM) with appropriate parameter analysis. A

    Systematic Feature Analysis (SFA), which includes exploratory and confirmatory

    multivariate analysis, was performed to investigate the discriminative power of the

    proposed feature set. The SFA produces a feature set of timber surface defects

    capable of providing significant discrimination between defect and clear wood

    classes. Finally, a new concept in the domain of timber defect detection, based on

    outlier detection, was introduced to overcome the problem of imbalanced

    data. This study proposes a robust Mahalanobis one class classifier (MC) with the Fast

    Minimum Covariance Determinant estimator (MC-FMCD) for species-independent

    timber defect detection. The experimental results show that the proposed approach

    achieves superior performance over the classical Mahalanobis Distance (MD) and is

    robust in detecting many types of defects across timber species.

  • vi

    ABSTRAK

    Various research efforts have been carried out in automated timber defect

    detection to improve the quality of timber products, optimise raw material

    resources and increase productivity. Research in this field has been

    extended to Malaysian timber species in the hope that the outcome will

    benefit the local wood product industry. This study aims to

    propose a timber surface defect detection approach that is robust in detecting

    various defects on multiple timber species using significant texture

    features, validated using data from local timber species. Defect

    samples from Malaysian hardwood species were collected and labelled under the

    supervision of industry experts for use in this study. In addition, this study

    gives new understanding of the feature representation of timber defect images by

    using statistical texture from the orientation-independent Grey Level Dependence Matrix (GLDM)

    together with appropriate parameter analysis. A

    Systematic Feature Analysis (SFA) comprising exploratory and confirmatory

    multivariate analysis was conducted to examine the discriminative power of the proposed

    feature set. The SFA produced a feature representation capable of

    significantly distinguishing between timber defect classes and clear wood.

    Finally, a new concept in the domain of timber defect detection,

    based on anomaly detection, was introduced to address the problem of

    imbalanced data. This study proposes a robust Mahalanobis one class classifier (MC)

    with the Fast Minimum Covariance Determinant estimator (MC-FMCD)

    for timber defect detection regardless of timber species. The experimental results

    show that the proposed approach achieved better

    performance than the classical Mahalanobis Distance (MD) and is able to

    detect many types of defects across multiple timber species.

  • vii

    TABLE OF CONTENTS

    CHAPTER TITLE PAGE

    DECLARATION ii

    DEDICATION iii

    ACKNOWLEDGEMENT iv

    ABSTRACT v

    ABSTRAK vi

    TABLE OF CONTENTS vii

    LIST OF TABLES xii

    LIST OF FIGURES xiv

    LIST OF ABBREVIATIONS xvii

    LIST OF APPENDICES xx

    TERMS AND DEFINITIONS xxi

    1 INTRODUCTION 1

    1.1 Overview 1

    1.2 Research Background 2

    1.3 Problem Statement and Research Aim 13

    1.4 Research Objective 14

    1.5 Research Scope 14

    1.6 Significance of the Study 16

    1.7 Research Methodology 17

    1.8 Research Contribution 19

    1.9 Thesis Structure 19

  • viii

    2 LITERATURE REVIEW 21

    2.1 Introduction 21

    2.2 Overview of Timber Process 26

    2.3 Malaysian Timber Species 28

    2.4 Timber Defects 31

    2.5 Automated Vision Inspection (AVI) of Timber 33

    2.5.1 Problem Background 33

    2.5.2 AVI in Wood Industry 34

    2.5.3 Sensors Used for AVI in Wood Industry 39

    2.5.4 General Timber Defect Detection Approach 43

    2.5.5 Feature Extraction on Defect Images 46

    2.5.6 Defect Classification 50

    2.5.7 Discussion 53

    2.6 Statistical Texture Feature Based on Grey Level

    Dependence Matrix (GLDM) 55

    2.6.1 Problem Background 55

    2.6.2 Orientation Independent GLDM 58

    2.6.3 Statistical Features of GLDM 63

    2.7 One Class Classification for Imbalanced Data 71

    2.7.1 Introduction and Problem Background 71

    2.7.2 Distance-based One Class Classifier (OCC) 73

    2.7.3 Fast Minimum Covariance Determinant as Robust

    Estimator 77

    2.8 Summary 81

    3 RESEARCH METHODOLOGY 82

    3.1 Introduction 82

    3.2 Problem Situation and Solution Concept 82

    3.3 Research Design 87

    3.3.1 Research Framework 87

    3.3.2 Operational Framework 88

  • ix

    3.3.2.1 Phase 1: Construction of timber defect

    image dataset of Malaysian hardwood 89

    3.3.2.2 Phase 2: Identification of significant texture

    feature set representing timber defect 90

    3.3.2.3 Phase 3: Development of robust OCC with

    FMCD estimator for timber defect detection 91

    3.3.3 Overall Research Plan 92

    3.4 Evaluation Measurement 95

    3.4.1 Multivariate Analysis of Variance (Manova) to

    Evaluate Feature Quality 95

    3.4.2 Precision, Recall and F Measure to Measure

    Detection Performance 100

    3.4.3 Over Detection and Under Detection Errors to

    Assess Segmentation Quality 102

    3.5 Summary 103

    4 CONSTRUCTION OF TIMBER SURFACE DEFECT

    IMAGE DATASET 104

    4.1 Introduction 104

    4.2 Timber Samples Collection 106

    4.3 Image Acquisition Setup 106

    4.4 Image Labelling and Processing 110

    4.5 Findings 113

    4.6 Summary 116

    5 SIGNIFICANT FEATURE SET OF TIMBER SURFACE

    DEFECTS BASED ON STATISTICAL TEXTURE AND

    SYSTEMATIC FEATURE ANALYSIS 117

    5.1 Introduction 117

    5.2 Overview of Approach 118

    5.3 Feature Extraction 121

  • x

    5.3.1 Extracting Statistical Features from GLDM 121

    5.3.2 Exploring Displacement and Quantization Parameter

    of GLDM 127

    5.4 Evaluation of Feature Quality 133

    5.4.1 Exploratory Feature Analysis 133

    5.4.1.1 Univariate Feature Range Analysis 134

    5.4.1.2 Bivariate Matrix of Scatter Plot 136

    5.4.1.3 Multivariate Intra-Class and Inter-Class

    Distance between Clear Wood and Defects 137

    5.4.2 Confirmatory Feature Analysis 139

    5.4.2.1 Removing Linearly Dependent Features 141

    5.4.2.2 Measuring Significant Difference between

    Defect Classes using Manova Statistics 143

    5.4.2.3 Identifying Significant Features using Post-

    hoc Manova (Discriminant Analysis) 145

    5.5 Performance Validation 149

    5.5.1 Measuring Classification Performance across

    Feature Sets and Classifiers 150

    5.5.2 Measuring Classification Performance of Individual

    Classes 153

    5.5.3 Measuring Classification Accuracy across Timber

    Species 156

    5.6 Discussion 158

    5.7 Summary 159

    6 ROBUST MAHALANOBIAN CLASSIFIER WITH FMCD

    ESTIMATOR (MC-FMCD) FOR TIMBER DEFECT

    DETECTION 160

    6.1 Introduction 160

    6.2 Overview of Approach 161

    6.3 Experimental Setting for Simulated Datasets 163

  • xi

    6.4 Experimental Results for Simulated Datasets 165

    6.4.1 Detection Performance across Various Defect Ratios 166

    6.4.2 Detection Performance by Defect Type 170

    6.4.3 Detection Performance between Classic MD and

    Robust MC-FMCD 174

    6.4.4 Summary of Detection Performance across Timber

    Species 178

    6.5 Expert Validation on Test Images 180

    6.6 Discussion 185

    6.7 Summary 186

    7 CONCLUSION AND FUTURE RESEARCH 188

    7.1 Summary of Research Finding 188

    7.2 Research Contribution 191

    7.3 Future Work Recommendation 193

    7.4 Concluding Remark 195

    REFERENCES 196

    Appendices A - N 213 - 297

  • xii

    LIST OF TABLES

    TABLE NO. TITLE PAGE

    2.1 List of Malaysian timber classification based on density (MTIB, 2000) 29

    2.2 Natural durability classification based on years (MTIB, 2000) 29

    2.3 Characteristics of four types of timber species (MTIB, 2000) 30

    2.4 List of common timber defect 32

    2.5 Related works on automated inspection of wood products 36

    2.6 Related studies on inspection of external wood defects 40

    2.7 Images of directional matrices and rotation invariant matrix 61

    3.1 Problem leading to solution 86

    3.2 Overall research plan 92

    3.3 Confusion matrix 102

    4.1 List of data collection setting of past studies on timber surface defect detection 109

    4.2 List of classes with example of sub-images collected 114

    4.3 Number of samples collection across species 116

    5.1 Example of sub-image and the corresponding dependence matrix 123

    5.2 List of statistical texture features extracted 124

    5.3 Example of extracted features (one sample per class, species=Meranti, d=1, q=32) 125

    5.4 Texture characteristics of clear wood and defect 126

  • xiii

    5.5 Distances between test samples and independent clear wood samples 142

    5.6 List of feature correlation with r>0.99 142

    5.7 List of features removed after correlation test 143

    5.8 Box's test of equality of covariance matrices 144

    5.9 Manova test 144

    5.10 Pillai’s Trace value across multiple quantization levels and displacements 145

    5.11 Eigenvalues and canonical correlations 146

    5.12 Raw and standardized discriminant function coefficients (Root 1) 147

    5.13 Correlation between features and canonical variable 148

    5.14 List of remaining features after discriminant analysis 148

    5.15 List of feature sets used for performance comparison 150

    5.16 Confusion matrices for D7, D5 and D4 154

    5.17 Samples mistakenly classified as clear wood (undetected defect) 155

    5.18 Confusion matrices for Merbau, KSK and Rubberwood 157

    6.1 Experimental Meranti dataset for various defect ratios 163

    6.2 Detection performance by defect ratio 167

    6.3 Detection performance by defect types 170

    6.4 Detection performance on test images: Rubberwood 181

    6.5 Detection performance on test images: KSK 182

    6.6 Detection performance on test images: Meranti 183

    6.7 Detection performance on test images: Merbau 184

  • xiv

    LIST OF FIGURES

    FIGURE NO. TITLE PAGE

    1.1 Motivation of the study 12

    1.2 Overview of research phases 18

    2.1 Taxonomy of literature review 23

    2.2 Timber process 26

    2.3 Log cutting pattern (Cavette, 2006; Tom & Jeff, 2010) 27

    2.4 The components of an AVI system in wood industry 35

    2.5 Reference pixel, X with its 8 neighbouring pixels (Haralick et al., 1973) 59

    2.6 Distribution of non-zero matrix element on the left, and contour plot showing joint probability density function of the spatial dependence matrix on the right 62

    2.7 Research solutions to the problem of classification of imbalanced data (Sun et al., 2009) 73

    3.1 Solution concept for timber defect detection 85

    3.2 Research framework 88

    3.3 Operational research framework 89

    4.1 Image acquisition setup 108

    4.2 The process of dataset construction 111

    4.3 Sample of acquired images 111

    4.4 Subdivision of original image into sub-images 113

    4.5 Distribution of defect samples across species 115

  • xv

    5.1 Proposed approach in determining significant feature set 120

    5.2 Procedures for extracting statistical texture features based on GLDM 122

    5.3 Pictorial representation of the orientation independent GLDM 128

    5.4 Normalized feature means against displacement and quantization 131

    5.5 Energy feature range analysis 134

    5.6 Entropy feature range analysis 135

    5.7 Contrast feature range analysis 135

    5.8 Scatter plot matrix showing pairwise comparison of features 136

    5.9 Intra-class distance between clear wood samples and inter-class distance between clear wood and defect samples 138

    5.10 Procedures for confirmatory feature analysis 140

    5.11 Classification accuracy of three proposed feature sets (D6, D7 and D8) 151

    5.12 Classification accuracy between the proposed feature set (D7) and feature sets from previous studies 152

    5.13 F scores for each class across datasets D4, D5 and D7 154

    5.14 Classification accuracy across timber species 156

    6.1 Flow of experiments for timber defect detection 161

    6.2 Proposed MC-FMCD for robust timber defect detection 162

    6.3 F score across defect ratio: (a) Meranti, (b) Rubberwood, (c) KSK, (d) Merbau 168

    6.4 OD Error and UD Error across defect ratio: (a) Meranti, (b) Rubberwood, (c) KSK, (d) Merbau 169

    6.5 F score by defect type: (a) Meranti, (b) Rubberwood, (c) KSK, (d) Merbau 172

    6.6 OD Error and UD Error by defect type: (a) Meranti, (b) Rubberwood, (c) KSK, (d) Merbau 173

    6.7 Detection performance for MC-FMCD and classic MD: Meranti dataset 174

  • xvi

    6.8 Detection performance for MC-FMCD and classic MD: Rubberwood dataset 175

    6.9 Detection performance for MC-FMCD and classic MD: KSK dataset 176

    6.10 Detection performance for MC-FMCD and classic MD: Merbau dataset 177

    6.11 Average detection performance by timber species 178

    6.12 Average detection performance by defect type across timber species (a) F score comparison between timber species by defect type (b) Average F score by defect type 179

    6.13 Average detection performance between MC-FMCD and classic MD 180

    6.14 Average detection performance validated by an expert 185

  • xvii

    LIST OF ABBREVIATIONS

    ANN - Artificial Neural Network

    AUTOC - Autocorrelation

    AVI - Automated Vision Inspection

    BR - Brown Stain

    BS - Blue Stain

    CAR - Causal Auto Regressive Model

    CCD - Charge-Coupled Device

    CL - Clear Wood

    CONT - Contrast

    COR - Correlation

    CPROM - Cluster Prominence

    CSHAD - Cluster Shade

    CT - Computed Tomography

    DENT - Difference Entropy

    DISS - Dissimilarity

    DVAR - Difference Variance

    EN - Energy

    ENT - Entropy

    EPQ - Equal Probability Quantization

    FMCD - Fast Minimum Covariance Determinant

    FMMIS - Fuzzy Min-Max Neural Network for Image Segmentation

    FN - False Negative

    FP - False Positive

    GA - Genetic Algorithm

    GLDM - Grey Level Dependence Matrix

  • xviii

    GPR - Ground Penetrating Radar

    HL - Hole

    HOMO - Homogeneity

    IDMN - Inverse Difference Moment Normalized

    IDN - Inverse Difference Normalized

    IMC1 - Information Measures of Correlation 1

    IMC2 - Information Measures of Correlation 2

    KN - Knot

    KNN - K-nearest Neighbour

    KSK - Kembang Semangkuk

    LBP - Local Binary Pattern

    MANOVA - Multivariate Analysis of Variance

    MAXPR - Maximum Probability

    MCD - Minimum Covariance Determinant

    MC-FMCD - Mahalanobian Classifier based on Robust FMCD

    MD - Mahalanobis Distance

    MGR - Malaysian Grading Rule

    MIDA - Malaysian Investment Development Authority

    MLP - Multi-layer Perceptron

    MSE - Mean Square Error

    MTIB - Malaysian Timber Industry Board

    MVE - Minimum Volume Ellipsoid

    MVV - Minimum Vector Variance

    NATIP - National Timber Industry Policy

    OCC - One Class Classifier

    OD - Over Detection

    PC - Pocket

    RBFN - Radial Basis Function Network

    RGB - Red Green Blue

    RT - Rot

    SAVG - Sum Average

    SDM - Spatial Dependence Matrix

    SENT - Sum Entropy

  • xix

    SOM - Self-organizing Map

    SOSVH - Sum of Squares: Variance

    SP - Split

    SSCP - Sum of Squares Cross Product

    SVAR - Sum Variance

    TN - True Negative

    TP - True Positive

    UD - Under Detection

    WN - Wane

  • xx

    LIST OF APPENDICES

    APPENDIX TITLE PAGE

    A Related studies on inspection of internal wood defects; Related studies on multi sensors approach to timber defect detection 213

    B Example of orientation independent GLDM and normalized GLDM 216

    C Plots of feature value against displacement and quantization parameter 219

    D Univariate feature range analysis 236

    E Matrix of scatter plots comparing feature distribution between classes 247

    F Pairwise correlation between features and its corresponding significance, p value 249

    G SPSS Manova output 252

    H Experimental dataset for various defect ratios 260

    I Expert validation sheet 267

    J UTM letter of permission for data collection 280

    K Biography of industry experts 284

    L Letter of dataset certification 287

    M Photo album 291

    N List of Publication 297

  • xxi

    TERMS AND DEFINITIONS

    TERM DEFINITION

    Wood - A hard fibrous material that makes up most of the substance of a tree.

    Log - A part of the trunk that has been cut off from a felled tree.

    Timber - Wood boards sawn from logs.

    Primary wood industry - Businesses that process logs or other tree sections directly into timber, veneer, plywood, wood chips or other primary wood products.

    Sawmill - A factory where logs are sawn into timber.

    Secondary wood industry - Businesses that process primary wood products such as timber into secondary wood products such as furniture, doors and parquet flooring.

    Rough mill - The first production area or stage in a secondary wood product industry, where timber is moulded and cut into rough-sized components or parts. At this stage, undesirable characteristics or defects are removed.

    Defect - Flaws or anomalies found on timber that affect its properties and limit its possible use.

    Natural defect - Biological defects that occur during the growth of the tree from which the timber originates.

    Mechanical defect - Defects caused by the handling or processing of timber, such as during drying, sawing and moulding.

    Internal defect - Defects found inside the timber structure.

    External defect - Defects found on the surface of timber.