

  • Conference Proceeding of

    5th International Conference on Science, Technology & Management (ICSTM-2019)

    Institution of Engineers, India, Sector 19A, Chandigarh, India

    on 24th February 2019, ISBN: 978-93-87433-47-2

    160 | Amani Jaddu, Ramakrishna S

    A TWO-PHASE QUICK RESPONSE CODE TECHNIQUE FOR

    DOCUMENT AND MESSAGE DISTRIBUTION

    Amani Jaddu¹, Ramakrishna S²

    ¹ PG Student, Department of Computer Science, Sri Venkateshwara University, Tirupati

    ² Professor, Department of Computer Science, Sri Venkateshwara University, Tirupati

    Abstract

    The Quick Response (QR) code was designed for information storage and high-speed reading applications. In this paper, we present a new rich QR code that has two storage levels and can be used for document authentication. This new rich QR code, named two-level QR code, has public and private storage levels. The public level is the same as the standard QR code storage level; therefore, it is readable by any classical QR code application. The private level is constructed by replacing the black components with specific textured patterns. It consists of information encoded using a q-ary code with an error correction capacity. This allows us not only to increase the storage capacity of the QR code, but also to distinguish the original document from a copy. This authentication is due to the sensitivity of the patterns used to the print-and-scan (P&S) process. The pattern recognition method that we use to read the second-level information can be used both in a private message sharing scenario and in an authentication scenario. It is based on maximizing the correlation values between P&S degraded patterns and reference patterns. The storage capacity can be significantly improved by increasing the code alphabet q or by increasing the textured pattern size. The experimental results show a perfect restoration of the private information. They also highlight the possibility of using this new rich QR code for document authentication.

    Keywords: Correlation, QR Code, Document Authentication, Pattern Recognition, Print-and-Scan Process.

    I. INTRODUCTION

    Today, graphical codes such as the EAN-13 barcode, the Quick Response (QR) code, Data Matrix, and PDF417 are frequently used in our daily lives. These codes have a huge number of applications, including information storage (advertising, museum art description), redirection to web sites, track and trace (for transportation tickets or brands), and identification. The popularity of these codes is mainly due to the following features: they are robust to the copying process, they are easy to read by any device and any user, they have a high encoding capacity enhanced by error correction facilities, they have a small size, and they are robust to geometrical distortions.

    However, these undeniable advantages also have their drawbacks:

    1) Information encoded in a QR code is always accessible to everyone, even if it is ciphered and is therefore legible only to authorized users (the difference between “see” and “understand”).

    2) It is impossible to distinguish an originally printed QR code from its copy, because QR codes are insensitive to the Print-and-Scan (P&S) process.

    In this paper, we propose to overcome these shortcomings by enriching the standard QR code encoding capacity. This enrichment is obtained by replacing its black components with specific textured patterns. Besides the gain in storage capacity, these patterns can be designed to be sensitive to distortions due to the P&S process. These patterns do not disrupt the standard reading process, because any QR code reader always perceives them as black components. Therefore, even when the private information is degraded or lost in the copy, the public information remains accessible for reading.

    The proposed two-level QR (2LQR) code consists of a first level, accessible to any standard QR code reader, which keeps the strong characteristics of the QR code, and a second level that improves the capacities and characteristics of the initial QR code. The information in the second level is encoded using a q-ary (q ≥ 2) code with error correction capacities. This information is invisible to a standard QR code reader because the reader perceives the textured patterns as black components. Therefore, the second level can be used for private message sharing. Additionally, thanks to the sensitivity of the textured patterns to P&S distortions, the second level can be used to distinguish the original 2LQR code from its copies.

    This paper is organized as follows. We start with an introduction of QR code features and existing rich graphical codes in Section II. The distortion added during the P&S process is also discussed there. The proposed two-level QR (2LQR) code and the proposed recognition method are presented in Section III. In Section IV, the experimental results show the efficiency of the proposed recognition methods and analyze the capacities of the proposed 2LQR code. Finally, Section V presents conclusions and perspectives.

    The QR code was invented for the Japanese automotive industry by the Denso Wave corporation in 1994. The most important characteristics of this code are its small printout size and high-speed reading process. The QR code was certified by the International Organization for Standardization (ISO), and its full specification can be found in the ISO/IEC 18004 standard. A QR code encodes the information in binary form. Each information bit is represented by a black or a white component.

    The Reed-Solomon error correction code [15] is used for data encoding. Therefore, one of 4 error correction levels has to be chosen during QR code generation. The lowest level can restore nearly 7% of damaged information, and the highest level can restore nearly 30%. Today, 40 QR code versions are available with different storage capacities. The smallest QR code version (version V1) has a 21 × 21 component size. It can store 152 bits of raw data at the lowest correction level. The biggest QR code version (version V40) has a 177 × 177 component size. At its lowest correction level it can store a maximum of 7089 numeric characters, which corresponds to 23,648 bits of raw data.

    The QR code has a specific structure for geometrical correction and high-speed decoding. Three position tags are used for QR code detection and orientation correction. One or more alignment patterns are used for code deformation adjustment. The component coordinates are set by timing patterns. Furthermore, the format information areas contain the error correction level and the mask pattern. The code version and error correction bits are stored in the version information areas.

    The QR code generation algorithm consists of information encoding using the Reed-Solomon error correction code, division of the information into codewords, application of a mask pattern, and placement of the codewords and function patterns into the QR code. The QR code recognition algorithm includes the scanning process, image binarization, geometrical correction, and the decoding algorithm.

    II RELATED WORK

    Image Embedding in QR Code:

    The QR (Quick Response) code is a two-dimensional barcode developed by the Japanese company Denso Wave in 1994, and it was approved as an ISO International Standard and a Chinese National Standard in 2000. The QR code has been widely used due to its good features such as large data capacity, high-speed scanning, and small printout size. The growing number of smartphones is the reason behind the popularity of the QR code: smartphones can decode QR codes and access online resources, and they offer high storage capacity and high decoding speed. QR codes are used in various applications, such as accessing websites, initiating phone calls, playing videos, opening text documents, and storing data.

    An important problem of QR codes is their noisy appearance. The wish to improve the appearance of the QR code and to reduce its random black-and-white texture has generated great interest in algorithms capable of embedding QR codes into images without losing decoding robustness. There have been many efforts to improve the appearance of such embeddings. The main challenge of any embedding method is that the embedded result should be decodable by standard applications. The embedding introduces changes in the luminance of the code, distorting the binarization thresholds and thus increasing the probability of detection error. The second challenge is using the entire area of the code in which the image or logo is to be embedded. This cannot be done by simply replacing information components with the desired image. A good embedding method should decrease the number of corrupted components and use the largest possible area. The proposed method is based on the selection of a set of pixels using a genetic algorithm. The concentration of pixels and the corresponding luminance are optimized to minimize a visual distortion metric, subject to a constraint on the probability of error.

    A QR code consists of black and white square blocks called the components of the QR code. Each component is assigned a single bit value, and information is encoded into the QR components. A dark component is a binary one and a light component is a binary zero. A codeword contains 8 bits of information. There are 40 versions of the QR code. A QR code of version V has (17 + 4V) × (17 + 4V) components. Therefore, version 1 has 21 × 21 components, whereas version 40 has 177 × 177 components. Fig. 1 shows the structure of a QR code. The finder pattern consists of three identical square shapes located at three corners of the QR code. It is the most important pattern, because it enables the detection of the QR code. Alignment patterns are also essential to locate, rotate, and align the QR code. The finder pattern, timing pattern, and alignment pattern are collectively known as the function pattern region of the QR code. Alignment patterns appear from version 2 onwards; version 1 does not have any alignment pattern. The encoding region, shown in green, consists of data and error correction codewords. Data codewords are of two types: information codewords, which store the actual information, and padding codewords. The encoding region stores the data, parity components, and decoding information in the form of codewords. A codeword consists of a block of 8 components. The quiet zone is the guard region of the QR code. The QR code uses RS (Reed-Solomon) codes for error correction. A QR code contains multiple RS codes, where one RS code is sufficient to store the message. The remaining RS codes are usually used to store non-meaningful messages [2]. There are 4 error correction levels, L, M, Q, and H, which can recover 7%, 15%, 25%, and 30% of errors in the codewords, respectively.
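The size formula and the recovery rates quoted above can be checked with a few lines of Python. This is only a small sketch: the function names `side_length` and `component_count` and the table name `EC_RECOVERY` are illustrative choices, not part of any QR library.

```python
# Approximate share of damaged codewords each error correction level can
# recover, as quoted in the text (L, M, Q, H).
EC_RECOVERY = {"L": 0.07, "M": 0.15, "Q": 0.25, "H": 0.30}


def side_length(version: int) -> int:
    """Number of components along one side of a QR code of the given version."""
    if not 1 <= version <= 40:
        raise ValueError("QR code versions range from 1 to 40")
    return 17 + 4 * version


def component_count(version: int) -> int:
    """Total number of components, function patterns included."""
    side = side_length(version)
    return side * side


if __name__ == "__main__":
    for version in (1, 2, 40):
        side = side_length(version)
        print(f"version {version}: {side} x {side} components")
```

The component count includes the function patterns, so it is an upper bound on the encoding region and not the data capacity itself.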

    There have been many efforts to improve the appearance of the QR code. The basic strategy of such work is to find the best group of QR components to substitute with the image or logo. The method presented in [3] proposes three areas in which the QR components can be replaced by the image or logo: the data codewords, the padding codewords, and the error correcting codewords. The pad characters are changed depending on the error correction level of the QR code. The size of the image to embed is identified, and the image is then implanted in the identified region of the QR code. The size of the embedded image is increased and the readability of the resulting QR code is tested, in order to find the largest image that can be embedded outside the finder pattern. The author concludes that a larger image can be embedded if the number of characters in the QR code is decreased. The second approach [12] is based on modifying the luminance of the pixels. The luminance of the central pixels of each component is modified, since this is the area usually sampled by the decoder. This approach uses the entire area of the code for embedding, except the finder and alignment patterns. The approach in [10] blends a color image C and a QR code Q based on the luminance of the color image and the binary value of the QR code. The blending of C and Q to produce an output B is accomplished by replacing pixels of Q with those of C. The author assumes that the pixels of Q are normalized so that white pixels have a luminance of 1 and black pixels have a luminance of 0. This algorithm ensures that the blended output image preserves the bright part of the color image where the pixel value of the QR code equals 1, and the dark part of the color image where the pixel value is 0. Cox proposed a complicated algorithm [19] to embed a binary image into a QR code during the data encoding stage of code generation. He carefully investigated the internal structure of the QR code and the logic behind data encoding, and designed an algorithm to encode image content as redundant numeric strings appended to the original data. However, this technique works only for URL-type data strings, and the quality of the embedded image is limited by the length of the encoded URL.

    Robust Message Hiding for QR Code:

    The Quick Response code (QR code) has been widely used in daily life in recent years because of its high data encoding capacity, damage resistance, fast decoding, and other good characteristics. Since it is popular, people can use it to transmit secret information without inspection. The development of steganography in QR codes raises many problems. The two main challenges are how to keep the original content of the QR code and how to embed secret information into it. Hiding secret information with a bit-based technique is fragile against modification attacks: if an attacker changes any of the hidden bits, it is impossible to recover the secret information. In this paper, we propose a scheme based on Reed-Solomon codes and List Decoding to overcome this problem. We also evaluate our solution by analyzing its complexity and security and by experiment.

    A traditional barcode is a 1-dimensional (1D) barcode, which contains data in one direction only. The Quick Response (QR) code is a type of 2-dimensional (2D) barcode developed in 1994 by the Denso Wave Corporation. The QR code got its name because it was developed in order to improve the reading speed of 2D barcodes. It contains data in both the vertical and horizontal dimensions. For this reason, the QR code holds a considerably greater volume of information. It can convey various kinds of content such as text, web links, numbers, and multimedia data. The decoding speed of the QR code can be 20 times faster than that of other 2D symbols [6]. In recent years, the QR code has become popular in business via QR readers and mobile devices. Since the QR code is so popular, some secret information could be transferred via it. The authors of [2], [3], [4] analyzed the properties of each QR code before embedding a message into it. If they want to embed a secret message into a QR code, they first encode the message. After that, they exploit the structure of the QR code they want to use. This approach takes time, carries risks, and does not allow the secret message to be obtained directly from the QR code. Lin et al. [1] were the first to propose a novel scheme to solve this problem: the idea is to hide secret messages in a QR code by using its error correction capability. First of all, they encode the secret message sm using a shared key K and obtain E_K(sm). After that, they embed each bit of E_K(sm) into the QR code. The first drawback of their scheme is that if any bit of E_K(sm) is damaged, it is impossible to recover sm from the QR code. The second drawback is that if an attacker does not change any bit of E_K(sm) but adds some extra error values to the QR code, the secret message cannot be recovered. To the best of our knowledge, all previous techniques used a bit embedding scheme to embed secret messages into a QR code. Such a scheme is vulnerable to the modification attack, in which an attacker changes any bit of the secret message. We propose using Reed-Solomon codes and List Decoding to overcome this kind of attack.

    In coding theory, List Decoding, a research field that aims to correct as many errors as possible in noisy channels, has developed rapidly in recent years. Peter Elias [14] and M. J. Wozencraft [15] described List Decoding in order to correct errors over a noisy channel. Nowadays, it can be found in many applications. For example, it can be used for traitor tracing [18], [17].

    Our contribution: Our main contribution is to propose algorithms that hide a secret message in a QR code. The secret message is invisible to attackers and secure against modification or damage attacks. We analyze these algorithms with respect to complexity and security, and evaluate them by experiments.

    Outline of the paper: The rest of this paper is organized as follows. Section II presents the preliminaries. Section III describes our proposed solution. The next section describes the security, effectiveness, and testing of our solution. Section V presents experimental results. The last section summarizes the key points and mentions future work.

    EXISTING SYSTEM

    Nancy Victor proposed a technique for data compression that enhances the data capacity of QR codes by compressing the data before the QR code is created.

    B. Sklar described the Reed-Solomon error correction code, which is used for data encoding; one of 4 error correction levels has to be selected during QR code generation.

    R. Villán, S. Voloshynovskiy, O. Koval, F. Deguillaume, and T. Pun proposed the combination of strong text hashing and text data hiding technologies as an effective solution for authentication and tamper-proofing of text documents.

    T. V. Bui, N. K. Vu, T. T. P. Nguyen, I. Echizen, and T. D. Nguyen proposed a scheme based on Reed-Solomon codes and list decoding. It hides secret information in such a way that the message can still be recovered when an attacker changes some of the hidden bits, which a plain bit technique cannot achieve.

    A. E. Dirik and B. Haas discussed a copy detection pattern tool that detects copies of original documents and focuses solely on counterfeit prevention.

    M. Querini, A. Grillo, A. Lentini, and G. F. Italiano proposed a high capacity colored two-dimensional code (HCC2D) with the intention of increasing barcode data density. It supports input data of different types and sizes, and the code dimension is adapted to the actual input size.

    Disadvantages:

    1. The storage size is high, so storage capacity is not saved.

    2. The private information cannot be restored perfectly.

    3. During compression, important information is removed.

    III PROPOSED SYSTEM

    In this system, we propose a two-level QR code. The two levels, a public level and a private level, are both used for storage. The public level is the same as the standard QR code storage level; therefore, it is readable by any classical QR code application. The private level is constructed by replacing the black components with specific textured patterns. It consists of information encoded using a q-ary code with an error correction capacity. This allows us not only to increase the storage capacity of the QR code, but also to distinguish the original document from a copy. This authentication is due to the sensitivity of the patterns used to the print-and-scan (P&S) process. The pattern recognition method that we use to read the second-level information can be used both in a private message sharing scenario and in an authentication scenario. It is based on maximizing the correlation values between P&S degraded patterns and reference patterns. The storage capacity can be significantly improved by increasing the code alphabet q or by increasing the textured pattern size.
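The recognition step described above, choosing the reference pattern with the highest correlation to the degraded pattern, can be sketched as follows. This is a minimal illustration under stated assumptions: the Pearson correlation is used as the correlation measure, patterns are flattened lists of gray levels, and the 3×3 patterns and the names `pearson`, `recognize`, `REFERENCES`, and `DEGRADED` are hypothetical and not taken from the paper.

```python
from math import sqrt
from typing import Sequence, Tuple


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation between two equally sized, flattened patterns."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a flat pattern carries no texture to correlate with
    return cov / sqrt(var_a * var_b)


def recognize(degraded: Sequence[float],
              references: Sequence[Sequence[float]]) -> Tuple[int, float]:
    """Return the index of the best matching reference and its correlation."""
    scores = [pearson(degraded, ref) for ref in references]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


# Hypothetical 3x3 reference patterns for q = 3 (1 = black pixel, 0 = white).
REFERENCES = [
    [1, 0, 0, 0, 1, 0, 0, 0, 1],  # diagonal
    [0, 0, 1, 0, 1, 0, 1, 0, 0],  # anti-diagonal
    [0, 1, 0, 1, 1, 1, 0, 1, 0],  # cross
]

# Hypothetical gray-level version of the cross pattern after print-and-scan.
DEGRADED = [0.2, 0.7, 0.1, 0.8, 0.9, 0.6, 0.3, 0.8, 0.2]

if __name__ == "__main__":
    index, score = recognize(DEGRADED, REFERENCES)
    print(f"recognized pattern {index} with correlation {score:.3f}")
```

In an authentication scenario the best score itself could additionally be compared against a threshold, since a copied code passes through the P&S process more than once and is expected to correlate less well; such a threshold would have to be determined experimentally.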

    Advantages:

    1. The proposed technique offers a significant enhancement of the data capacity.

    2. The private information is restored perfectly.

    3. Compression is lossless, so no information is lost.

    IV ARCHITECTURE & SYSTEM COMPONENTS

    The system architecture is given below, together with the following components.

    Fig: System Architecture

    Public Message Storage:

    In this component, the public message is stored in the standard QR code using the classical generation method. The standard QR code generation algorithm includes the following steps.

    1. First, the most suitable mode (numeric, alphanumeric, byte, or Kanji) is selected by analyzing the message content. The message is encoded using the shortest possible string of bits. This string of bits is split into 8-bit data codewords. Then, the error correction level is chosen and the error correction codewords are generated using the Reed-Solomon code.

    2. After that, the data and error correction codewords are arranged in the correct order. To make sure that the generated QR code can be read correctly, the best mask pattern for the encoded data is applied.

    3. After this manipulation, the codewords are placed in a matrix in a zigzag pattern, starting from the bottom-right corner. The final step is to add the function patterns (position tags, alignment, timing, format, and version patterns) to the QR code.
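The first step above, mode selection and the split into 8-bit codewords, can be sketched in Python. This is a simplified illustration and not a full QR encoder: Kanji mode is left out, the terminator and pad codewords of the standard are replaced by plain zero padding, and the names `select_mode` and `split_into_codewords` are hypothetical.

```python
from typing import List

NUMERIC_CHARS = set("0123456789")
# Character set of the QR alphanumeric mode: digits, upper-case letters,
# space and the symbols $ % * + - . / :
ALPHANUMERIC_CHARS = set("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:")


def select_mode(message: str) -> str:
    """Pick the most compact mode that can represent every character."""
    if not message:
        raise ValueError("message must not be empty")
    if all(ch in NUMERIC_CHARS for ch in message):
        return "numeric"
    if all(ch in ALPHANUMERIC_CHARS for ch in message):
        return "alphanumeric"
    return "byte"  # Kanji mode is omitted in this sketch


def split_into_codewords(bits: str) -> List[str]:
    """Split a bit string into 8-bit codewords, zero-padding the last one."""
    padded = bits + "0" * (-len(bits) % 8)
    return [padded[i:i + 8] for i in range(0, len(padded), 8)]


if __name__ == "__main__":
    for text in ("0123456789", "HELLO WORLD", "Hello, world"):
        print(text, "->", select_mode(text))
    print(split_into_codewords("1010101011"))
```

Reed-Solomon codeword generation, masking, and placement are deliberately not shown; in practice they are handled by an existing QR code library.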

    Private Message Encoding:

    1. In this component, the private raw-bit string is encoded using an error correction code (ECC) so that message errors can be corrected after the P&S operation. We use block codes, and more precisely cyclic codes (or polynomial-generated codes) such as the Golay code or the Reed-Solomon code, for message encoding.

    2. The public level is identical to the standard QR code storage level and is read by any classical QR code application, whereas the private level is made by replacing the black components with specific textured patterns.
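To show what "polynomial-generated" means in practice, the sketch below encodes a message with a small cyclic code. It uses the binary Hamming(7,4) code with generator polynomial x^3 + x + 1, which is an illustrative substitute chosen for brevity and not one of the codes named in the paper (Golay, Reed-Solomon); those follow the same principle with larger parameters or a q-ary alphabet. Polynomials over GF(2) are stored as Python integers, one bit per coefficient, and all function names are hypothetical.

```python
N, K = 7, 4          # codeword length and message length in bits
GENERATOR = 0b1011   # g(x) = x^3 + x + 1, which divides x^7 + 1 over GF(2)


def poly_mod(dividend: int, divisor: int) -> int:
    """Remainder of polynomial division over GF(2)."""
    divisor_len = divisor.bit_length()
    while dividend.bit_length() >= divisor_len:
        shift = dividend.bit_length() - divisor_len
        dividend ^= divisor << shift
    return dividend


def encode(message: int) -> int:
    """Systematic encoding: message bits followed by N - K parity bits."""
    shifted = message << (N - K)
    return shifted | poly_mod(shifted, GENERATOR)


def is_codeword(word: int) -> bool:
    """A word is a codeword exactly when g(x) divides it."""
    return poly_mod(word, GENERATOR) == 0


def cyclic_shift(word: int) -> int:
    """Rotate the N-bit word left by one position."""
    return ((word << 1) & ((1 << N) - 1)) | (word >> (N - 1))


if __name__ == "__main__":
    codeword = encode(0b1101)
    print(f"{codeword:07b}", is_codeword(codeword))
    print(is_codeword(cyclic_shift(codeword)))  # shifts stay in the code
    print(is_codeword(codeword ^ 0b0000100))    # a single bit error is detected
```

A nonzero remainder signals that the word was corrupted; a full decoder would map the remainder (the syndrome) to the error position, which is omitted here.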

    Black Component Replacement:

    1. In the 2LQR code, black and white components are represented using zeros and ones. Each cell has a size of 24×24 pixels.

    2. The matrix is checked for zeros, and all of the zeros are replaced with the code. The textured pattern that replaces the black components depends on the number of zeros available.

    For example, if there are 5 zeros, then the 5 corresponding squares are drawn during encoding. During decoding, the same 5 squares are decoded as 5.
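The replacement step can be sketched on a small component matrix. The sketch follows this section's convention that 0 marks a black component and 1 a white one, and it works with pattern indices instead of 24×24 pixel textures. The raster scanning order, the use of pattern 0 as filler, and the names `embed`, `extract`, and `public_view` are assumptions made for illustration.

```python
from typing import List, Sequence

WHITE = -1  # marker for white components, which are left untouched


def embed(modules: Sequence[Sequence[int]], symbols: Sequence[int],
          q: int) -> List[List[int]]:
    """Replace each black component (0) by the index of a textured pattern."""
    black_count = sum(row.count(0) for row in modules)
    if len(symbols) > black_count:
        raise ValueError("more symbols than black components")
    if any(not 0 <= s < q for s in symbols):
        raise ValueError("symbols must lie in the range 0..q-1")
    remaining = iter(symbols)
    result = []
    for row in modules:
        # Black components beyond the message length receive pattern 0.
        result.append([next(remaining, 0) if m == 0 else WHITE for m in row])
    return result


def extract(two_level: Sequence[Sequence[int]]) -> List[int]:
    """Read the pattern indices back in the same raster order."""
    return [cell for row in two_level for cell in row if cell != WHITE]


def public_view(two_level: Sequence[Sequence[int]]) -> List[List[int]]:
    """What a standard reader sees: every textured pattern counts as black."""
    return [[1 if cell == WHITE else 0 for cell in row] for row in two_level]


if __name__ == "__main__":
    modules = [[0, 1, 0],
               [1, 0, 1],
               [0, 0, 1]]           # five black components
    private = [2, 0, 1, 2, 1]       # five q-ary symbols with q = 3
    code = embed(modules, private, q=3)
    print(code)
    print(extract(code))
    print(public_view(code) == modules)
```

With five zeros in the matrix, five patterns are placed and the same five symbols are read back, while the public view of the matrix is unchanged.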

    V CONCLUSION

    In this paper a new rich code called

    two level QR (2LQR) code is proposed. This

    2LQR code has two levels: a public level

    and a private level. The public level can be

    read by any QR code reading application,

    while the private level needs a specific

    application with specific input information.

    This 2LQR code can be used for private

    message sharing or for authentication

    scenarios. The private level is created by

    replacing black modules with specific

    textured patterns. These textured patterns are

    considered as black modules by standard QR

    code reader. Thus, the private level is

    invisible to standard QR code readers. In

    addition, the private level does not affect in

    anyway the reading process of the public

    level. The proposed 2LQR code increases

    the storage capacity of the classical QR code

    due to its supplementary reading level.

    Experiment results show that the storage

    capacity is improved by up to 28%

    (transition from message size equal to 272

    bits to a message length of 380 bits). The

    storage capacity of the2LQR code can be

    improved by increasing the number of

    textured patterns used or by decreasing the

    textured pattern size. All experiments show

    that even with a pattern size of 6×6 pixels

    and with an alphabet dimension q = 8, it is

    possible to obtain good pattern recognition

    results, and therefore a successful private

    message extraction. However, we are facing

    a trade-off between the pattern size, the

    alphabet dimensions and the quantity of

    stored information during the 2LQR code

    generation. One important feature of the

    textured patterns used is their sensitivity to

    the P&S process. To take advantage of this

    sensitivity, we use a pattern recognition

    method based on maximization of

    correlation values among the P&S degraded

    versions and characterization patterns. We

    have tried three different types of

    characterization patterns: mean patterns,

    median patterns (for the private message

    sharing scenario) and original patterns (for

    the document authentication scenario). The

    mean and median characterization patterns

    give almost the same results of pattern

    detection. Therefore, either of them can be

    used in the private message sharing

    scenario. The best pattern recognition results

    were obtained when the original patterns

    were used as characterization patterns. The

    original patterns can also be used for the

    private message sharing scenario, but in this

    case the blind method for pattern detection

    cannot be performed. The suggested

    textured patterns can be distinguished only

    after one P&S process. Therefore, we can

    use the detection method with original

    patterns in order to ensure good document

    authentication results.
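
    The recognition step described above (choosing the reference that
    maximizes the correlation with a P&S degraded module) can be sketched
    as follows. This is a minimal illustration, not the authors'
    implementation: it assumes Pearson correlation as the correlation
    measure, uses toy 2x2 patterns flattened to four pixels, and the
    function names pearson, characterization_pattern and recognize are
    hypothetical.

```python
from statistics import mean, median


def pearson(a, b):
    """Pearson correlation between two equally sized flat pixel lists."""
    ma, mb = mean(a), mean(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = (sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b)) ** 0.5
    return num / den if den else 0.0


def characterization_pattern(samples, reducer=mean):
    """Pixel-wise mean (or median) over several P&S degraded samples."""
    return [reducer(pixels) for pixels in zip(*samples)]


def recognize(module, references):
    """Index of the reference pattern with maximal correlation."""
    scores = [pearson(module, ref) for ref in references]
    return max(range(len(scores)), key=scores.__getitem__)


# Toy alphabet of q = 3 original patterns (assumed values, 0 = black).
original_patterns = [
    [0, 0, 255, 255],
    [0, 255, 0, 255],
    [0, 255, 255, 0],
]

# Three simulated P&S degraded scans of pattern 1 (made-up grey levels).
scans_of_pattern_1 = [
    [40, 200, 60, 210],
    [30, 190, 80, 220],
    [50, 210, 70, 200],
]

# Mean and median characterization patterns (private message sharing).
mean_pattern_1 = characterization_pattern(scans_of_pattern_1, mean)
median_pattern_1 = characterization_pattern(scans_of_pattern_1, median)

# Detection against the original patterns (authentication scenario).
recognized = recognize([45, 205, 65, 215], original_patterns)
```

    In this toy setting the mean and median patterns coincide, which is
    in line with the observation that either can be used for private
    message sharing.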

    FUTURE WORK:

    In our future work, we will address

    five different paths. The first path will

    concern the improvements of the pattern

    recognition method. The second will cover

    the textured pattern analysis to automate its

    combination process. The third will deal

    with message recovering and authentication

    attacks, such as cropping and code

    reconstruction. The fourth path will concern

    the study of the second level recovery

    problems in the 2LQR code images captured

    by a camera. In the last path, the storage

    capacity of 2LQR code will be increased by

    also replacing the white modules with

    textured patterns that have a lower density

    of black pixels.

    REFERENCES

    [1] Information Technology Automatic

    Identification and Data Capture

    Techniques EAN/UPC Bar Code

    Symbology Specification, ISO/IEC

    Standard 15420:2009, 2009.

    [2] Information Technology Automatic

    Identification and Data Capture

    Techniques—Data Matrix Bar Code

    Symbology Specification, ISO/IEC

    Standard 16022:2006, 2006.

    [3] Information Technology Automatic

    Identification and Data Capture

    Techniques—Bar Code Symbology—

    QR Code, ISO/IEC Standard

    18004:2000, 2000.

    [4] Z. Baharav and R. Kakarala, “Visually

    significant QR codes: Image blending

    and statistical analysis,” in Proc. IEEE

    Int. Conf. Multimedia Expo (ICME),

    Jul. 2013, pp. 1–6.

    [5] C. Baras and F. Cayre, “2D bar-codes

    for authentication: A security

    approach,” in Proc. 20th Eur. Signal

    Process. Conf. (EUSIPCO), Aug.

    2012, pp. 1760–1766.

    [6] T. V. Bui, N. K. Vu, T. T. P. Nguyen,

    I. Echizen, and T. D. Nguyen, “Robust

    message hiding for QR code,” in Proc.

    IEEE 10th Int. Conf. Intell. Inf. Hiding

    Multimedia Signal Process. (IIH-

    MSP), Aug. 2014, pp. 520–523.

    [7] A. T. P. Ho, B. A. M. Hoang, W.

    Sawaya, and P. Bas, “Document

    authentication using graphical codes:

    Reliable performance analysis and

    channel optimization,” EURASIP J.

    Inf. Secur., vol. 2014, no. 1, p. 9, 2014.

    [8] T. Langlotz and O. Bimber,

    “Unsynchronized 4D barcodes,” in

    Proc. 3rd Int. Symp., ISVC 2007,

    Lake Tahoe, NV, USA, Nov. 26–28,

    2007, pp. 363–374.

    [9] C.-Y. Lin and S.-F. Chang, “Distortion

    modeling and invariant extraction for

    digital image print-and-scan process,”

    in Proc. Int. Symp. Multimedia Inf.

    Process., 1999, pp. 1–

    [10] P.-Y. Lin, Y.-H. Chen, E. J.-L. Lu, and

    P.-J. Chen, “Secret hiding mechanism

    using QR barcode,” in Proc. IEEE Int.

    Conf. Signal-Image Technol. Internet-

    Based Syst. (SITIS), Dec. 2013, pp.

    22–25.

    [11] J. Picard, “Digital authentication with

    copy-detection patterns,” Proc. SPIE,

    vol. 5310, pp. 176–183, Jun. 2004.

    [12] M. Querini, A. Grillo, A. Lentini, and

    G. F. Italiano, “2D color barcodes for

    mobile phones,” Int. J. Comput. Sci.

    Appl., vol. 8, no. 1, pp. 136–155,

    2011.

    [13] M. Querini and G. F. Italiano, “Facial

    biometrics for 2D barcodes,” in Proc.

    IEEE Fed. Conf. Comput. Sci. Inf.

    Syst. (FedCSIS), Sep. 2012, pp. 755–

    762.

    [14] J. Rouillard, “Contextual QR codes,”

    in Proc. IEEE 3rd Int. Multi-Conf.

    Comput. Global Inf. Technol.

    (ICCGI), Jul./Aug. 2008, pp. 50–55.

    [15] B. Sklar, Digital Communications,

    vol. 2. Englewood Cliffs, NJ, USA:

    Prentice-Hall, 2001.

    [16] K. Solanki, U. Madhow, B. S.

    Manjunath, S. Chandrasekaran, and I.

    El-Khalil, “‘Print and scan’ resilient

    data hiding in images,” IEEE Trans.

    Inf. Forensics Security, vol. 1, no. 4,

    pp. 464–478, Dec. 2006.

    [17] M. Sun, J. Si, and S. Zhang, “Research

    on embedding and extracting methods

    for digital watermarks applied to QR

    code images,” New Zealand J.

    Agricult. Res., vol. 50, no. 5, pp. 861–

    867, 2007.

    [18] I. Tkachenko, W. Puech, O. Strauss, J.-

    M. Gaudin, C. Destruel, and C.

    Guichard, “Fighting against forged

    documents by using textured image,”

    in Proc. 22nd Eur. Signal Process.

    Conf. (EUSIPCO), Sep. 2014, pp.

    790–794.

    [19] R. Ulichney, Digital Halftoning.

    Cambridge, MA, USA: MIT Press,

    1987.

    [20] R. Villán, S. Voloshynovskiy, O.

    Koval, F. Deguillaume, and T. Pun,

    “Tamper-proofing of electronic and

    printed text documents via robust

    hashing and data-hiding,” in Proc.

    SPIE, vol. 6505, p. 65051T, Feb.

    2007.

    Amani Jaddu is pursuing a

    Master of Science (M.Sc) in Computer

    Science at Sri Venkateswara

    University, Tirupati, A.P. She

    received her Bachelor of Science

    degree in 2017 from Rayalaseema

    University, Kurnool. Her research interests

    are Cloud Computing, Data Warehousing,

    and Big Data.

    An Artificial Intelligence Technique for Explorated Searchable

    Query Processing

    Aruna Kumari Yanduri 1, Ramakrishna S 2

    1 PG Student, Department of Computer Science, Sri Venkateshwara University Tirupati

    2 Professor, Department of Computer Science, Sri Venkateshwara University Tirupati

    Abstract

    Exploratory search is an increasingly important activity for Web searchers. However,

    current search systems cannot provide sufficient support for exploratory search. We

    therefore made an in-depth analysis of exploratory search processes and found

    that search goal shifts occur frequently in exploratory search. Based on this observation,

    we designed a new query recommendation method to support exploratory search. First,

    according to the behavioral characteristics of searchers during search goal shifts, all

    queries submitted during search goal shift processes are extracted from search engine logs

    using machine learning. We then use these queries to build a search goal shift

    graph. Finally, a random walk algorithm is used to obtain query recommendations from

    the search goal shift graph. We also demonstrate the effectiveness of the method for exploratory

    search through comparative experiments with other methods.

    Keywords: Exploratory Search, Query Recommendation, Artificial Intelligence.

    I. INTRODUCTION

    Exploratory search is an increasingly

    important yet challenging activity for Web

    searchers. In exploratory search, the

    searcher is unfamiliar with the problem

    domain, unsure about the ways to achieve

    the goal, or lacks a well-defined goal. To

    support exploratory search, the search

    system is required not only to provide

    accurate search results, but also to help

    searchers explore related and novel aspects.

    Therefore, an exploratory search system needs

    an effective query recommendation method

    to solve this problem.

    However, the current query

    recommendation methods mainly focus on

    optimizing the user’s current query, which falls far

    short of satisfying the user’s information

    needs over the whole search session. To

    support exploratory search, we observed and

    analyzed the search logs of exploratory

    search process performed by different users,

    and we found that there are a lot of search

    goal shift phenomena in exploratory search.

    Consider the following example: a Chinese

    university student attends a birthday party

    organized by a French student, and he wants

    to choose a suitable birthday gift, which is a

    typical exploratory search task. The

    Chinese student has only some very vague

    goals, such as:

    Object: a gift not a normal thing

    Applicable occasions: birthday party

    Basic features: French favorite items

    Budget: 200 RMB or so

    Based on these conditions, the

    student used the keywords “French people

    like flowers” for the query and explored

    “flowers”, the most popular gift. After

    clicking many links in the search results, he

    felt that flowers were not distinctive enough

    as a birthday gift. The search results also mentioned

    that French people are very fond of drinking

    wine, so he changed his mind and felt that

    “wine” might be more appropriate as a gift for

    the birthday party. He then used “French

    wine” as a keyword and took “red wine”

    as a new search goal to explore. From the

    search results for “French wine brand”

    and “French wine prices”, the student

    found that French wine prices were

    far beyond his budget, so “red wine”

    was not a suitable search goal either. He

    then thought that “arts and crafts” might

    be more appropriate, and used

    “handicrafts” and “Chinese arts and crafts” as

    keywords to query the new search goal

    “arts and crafts”. He eventually

    found a promising gift, which ended the search

    task.

    From the example, it is clear that the

    user’s search goal shifts from “flowers”

    to “red wine” and then from “red wine” to

    “arts and crafts”. These search goal shifts

    precisely reflect the user’s exploratory

    behaviors and needs. Therefore, we

    designed a new recommendation method

    based on the “search goal shift” to support

    exploratory search. First, according to the

    user’s behavioral characteristics in the

    search goal shift process, we extracted all

    queries during search goal shift processes

    from search logs; then we used the queries

    to construct a search goal shift graph;

    finally, we recommended other goals related

    to the current goals using the search goal

    shift graph.

    In addition, we have designed a

    query recommendation test method, by

    which we can compare our recommendation

    method with other methods. The

    experimental results showed that the

    recommendation method we designed can

    significantly shorten the search.

    II RELATED WORK

    Query Recommendation

    Most query recommendation

    techniques use similarity measures

    between queries based on query terms, clicked

    documents, or sequences of queries in

    sessions. Baeza-Yates et al. [2] extracted

    query-clicked URL/doc bipartite graphs

    from search logs to find query

    recommendations. Craswell and Szummer

    [3] also used the query-click graph to find

    related documents and queries. Mei et al. [5]

    presented a “Hitting Time” algorithm to find

    related queries using the query-click graph.

    Cao et al. [4] tried to understand the user’s

    context, which includes several kinds of

    information such as age, gender,

    username, IP, tools, etc., as well as previous

    queries in a query session, in order to suggest

    new queries. Boldi et al. [6] proposed a

    query-flow graph which represents the latent

    querying behavior contained in a query log.

    Exploratory Search

    In the past 30 years, many scholars

    have made in-depth studies of the search

    process in exploratory search behavior. In

    1989, Bates proposed the berrypicking

    model [7], in which the user’s search direction and

    the desired result constantly change

    as the search process progresses. In 1991,

    Kuhlthau proposed that the information

    retrieval process includes six stages:

    starting, selection, exploration, formulation,

    collection and ending [8]. In 1995, Byström and Järvelin used

    logs and questionnaires to

    analyze the relationships among

    task complexity, type of information,

    information channels and resources [9]. In

    2006, Marchionini proposed the concept of exploratory

    search [10].

    Recently, exploratory search

    research has focused on the characteristics of the

    exploratory search process and the different

    types of support needed to help people perform

    exploratory searches [1]. Some work allows

    users to take notes and record the results

    [11], or provides a query preview control

    so that they can view the distribution of

    newly retrieved and previously retrieved documents

    before running the query [12]. Some

    research efforts focus on traditional search

    techniques such as query suggestions,

    aspects and information classification. For

    example, Hassan Awadallah et al. [13]

    constructed a method for automatically

    identifying and recommending tasks that

    allows searchers to explore and complete

    complex search tasks; Sun et al. [14]

    proposed a topic-oriented exploratory

    search method; Ksikes [15]

    designed an exploratory faceted search

    system; and Zhang et al. [16] grouped the

    relationships between entities into a

    generated hierarchical clustering that

    guides users to explore and discover.

    Other attempts have been made to design

    and study visual search interfaces and

    interactive user modeling to support

    exploratory search tasks. For example,

    Bron et al. [17] proposed a subjunctive

    exploratory search interface to support

    media studies research; Bespinyowong et al. [18]

    designed ExRank, an exploratory

    ranking interface; and Peltonen et al. [19] used

    negative relevance feedback with an intent radar

    interface to help users conduct exploratory

    search.

    All these previous methods focus on

    refining user requirements, showing several

    facets helping users refine their requirement

    and find their desired information. But they

    cannot satisfy user needs such as finding

    novel search goals when users are

    losing interest in the current search goal.

    EXISTING SYSTEM

    The current query recommendation

    methods mainly focus on optimizing the user’s

    current query, which falls far short of

    satisfying the user’s information needs over the

    whole search session. To support

    exploratory search, we observed and

    analyzed the search logs of exploratory

    search process performed by different users,

    and we found that there are a lot of search

    goal shift phenomena in exploratory search.

    Most query recommendation

    techniques use similarity measures

    between queries based on query terms, clicked

    documents, or sequences of queries in

    sessions. Existing systems extract

    query-clicked URL/doc bipartite

    graphs from search logs to find query

    recommendations.

    Disadvantages:

    All these previous methods focus on

    refining user requirements, showing several

    facets helping users refine their requirement

    and find their desired information. But they

    cannot satisfy user needs such as finding

    novel search goals when users are

    losing interest in the current search goal.

    III PROPOSED SYSTEM

    This paper observes that search goal

    shifts precisely reflect the user’s exploratory

    behaviors and needs. Therefore, we

    designed a new recommendation method based on

    the “search goal shift” to support

    exploratory search. First, according to the

    user’s behavioral characteristics in the

    search goal shift process, we extracted all

    queries during search goal shift processes

    from search logs; then we used the queries

    to construct a search goal shift graph;

    finally, we recommended other goals related

    to the current goals using the search goal

    shift graph. Based on the basic framework of

    the search goal shift graph, the exploratory

    query recommendation method mainly

    consists of two parts, offline and online.

    Our final goal is to provide query

    recommendations for users. The process of

    “identifying search goal shift” identifies

    all search goal shift query pairs in the

    search engine logs and uses them to

    compose the candidate set.

    IV SYSTEM ARCHITECTURE

    Based on the basic framework of the

    search goal shift graph, the exploratory query

    recommendation method mainly consists of

    two parts, offline and online.

    Fig: System Architecture

    Offline Part

    The offline part mainly includes two

    major steps: search goal shift

    identification and search goal shift graph

    building. In the offline part, we manually

    annotate the search goal shifts in some users’

    search sessions, then use machine learning

    to replace the inefficient manual identification

    process with efficient automatic identification. Finally,

    we use all queries submitted during search

    goal shifts to construct a search goal shift

    graph.
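
    The graph-building step can be sketched as follows. This is only an
    illustration under stated assumptions: the input is a list of
    (source query, target query) pairs already identified as search goal
    shifts, and edge weights are taken to be normalized transition
    frequencies, which the paper does not specify. The function name
    build_shift_graph and the sample pairs are hypothetical.

```python
from collections import defaultdict


def build_shift_graph(shift_pairs):
    """Build a weighted directed graph from (source, target) query pairs.

    Each edge weight is the fraction of shifts leaving `source` that
    arrive at `target` (an assumed weighting scheme).
    """
    counts = defaultdict(lambda: defaultdict(int))
    for src, dst in shift_pairs:
        counts[src][dst] += 1

    graph = {}
    for src, targets in counts.items():
        total = sum(targets.values())
        graph[src] = {dst: n / total for dst, n in targets.items()}
    return graph


# Made-up search goal shift pairs in the spirit of the gift example.
pairs = [
    ("flowers", "red wine"),
    ("flowers", "red wine"),
    ("flowers", "arts and crafts"),
    ("red wine", "arts and crafts"),
]
shift_graph = build_shift_graph(pairs)
```

    Normalizing the outgoing weights of each query to sum to one lets the
    graph be read directly as a transition table for a random walk.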

    Online Part

    The online part also contains two steps:

    judgment of the user’s search behavior and top-k

    recommendation. In the online part, we use the

    identification model which is trained from

    the offline part to judge whether users’

    search behaviors belong to “search goal

    shift”, then we use a random walk algorithm

    to find the top-k most relevant search

    queries from the search goal shift graph as a

    result of recommendation.
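
    The top-k step can be sketched with a random walk with restart. The
    paper only states that a random walk algorithm is used, so the
    restart variant, the restart probability of 0.15, the iteration
    count and the function name random_walk_top_k are all assumptions
    made for this illustration, as is the toy graph.

```python
def random_walk_top_k(graph, start, k=3, restart=0.15, iterations=50):
    """Rank queries by the visit probability of a random walk that
    restarts at `start`, and return the k best ones (excluding `start`)."""
    nodes = set(graph) | {d for targets in graph.values() for d in targets}
    nodes.add(start)

    prob = {n: 0.0 for n in nodes}
    prob[start] = 1.0
    for _ in range(iterations):
        nxt = {n: 0.0 for n in nodes}
        for node, p in prob.items():
            targets = graph.get(node)
            if not targets:
                # Query with no outgoing shift: send all mass back to start.
                nxt[start] += p
                continue
            nxt[start] += restart * p
            for dst, weight in targets.items():
                nxt[dst] += (1 - restart) * p * weight
        prob = nxt

    candidates = [n for n in nodes if n != start]
    candidates.sort(key=lambda n: (-prob[n], n))
    return candidates[:k]


# Toy search goal shift graph; outgoing weights of each query sum to 1.
toy_graph = {
    "flowers": {"red wine": 0.75, "perfume": 0.25},
    "red wine": {"arts and crafts": 1.0},
    "perfume": {"flowers": 1.0},
}
recommended = random_walk_top_k(toy_graph, "flowers", k=2)
```

    Restarting at the user's current query keeps the ranking focused on
    goals that are reachable from that query, not on globally popular ones.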

    V CONCLUSION

    In this paper, we studied the

    search goal shift, which is one of the

    important behavioral characteristics of exploratory

    search, and designed a new query

    recommendation method based on the search

    goal shift to support exploratory search.

    The method uses machine learning to extract all

    queries during search goal shift processes

    from search engine logs to build the

    search goal shift graph, and uses a

    random walk algorithm to obtain

    query recommendations from the search goal

    shift graph. We also

    demonstrated the effectiveness of the recommendation

    method through comparative experiments with

    other methods.

    Future Enhancement:

    No system can meet every user

    requirement, and user requirements keep

    changing as the system is used. Some

    future enhancements that can be made to this

    system are:

    As technology evolves, the system can be

    upgraded and adapted to the desired

    environment.

    Based on future security issues, security

    can be improved using emerging

    technologies like single sign-on.

    VI REFERENCES

    [1] White R W, Roth R A. “Exploratory search: Beyond the query response

    paradigm,” Synth Lect Inf Concept Retr

    Serv, vol. 1, no. 1, pp. 1-98, 2009.

    [2] Baeza-Yates R, Hurtado C, Mendoza M. “Query recommendation using query

    logs in search engines,” in Proc.

    International Conference on Extending

    Database Technology, pp. 588-596,

    2004.

    [3] Craswell N, Szummer M. “Random walks on the click graph,” in Proc. The

    30th annual international ACM SIGIR

    conference on Research and

    development in information retrieval, pp.

    239-246, 2007.

    [4] Cao H, Jiang D, Pei J, et al. “Context-aware query suggestion by mining click-

    through and session data,” in Proc. The

    14th ACM SIGKDD international

    conference on Knowledge discovery and

    data mining, pp.875-883, 2008.

    [5] Mei Q, Zhou D, Church K. “Query suggestion using hitting time,” in Proc.

    The 17th ACM conference on

    Information and knowledge

    management, pp. 469-478, 2008.

    [6] Boldi P, Bonchi F, Castillo C, et al. “Query suggestions using query-flow

    graphs,” in Proc. The 2009 workshop on

    Web Search Click Data, pp. 56-63, 2009.

    [7] Bates M J. “The design of browsing and berry picking techniques for the online

    search interface,” Online review, vol. 13,

    no. 5, pp. 407-424, 1989.

    [8] Kuhlthau C C. “Inside the search process: Information seeking from the

    user's perspective.” Journal of the

    American society for information

    science, vol. 42, no. 5, pp. 361-424,

    1991.

    [9] Byström K, Järvelin K. “Task complexity affects information seeking

    and use.” Information processing &

    management, vol. 31, no. 2, pp. 191-213,

    1995.

    [10] Marchionini G. “Exploratory search: From finding to understanding.”

    Communications of the ACM, vol. 49,

    no. 4, pp. 41-46, 2006.

    [11] Donato D, Bonchi F, Chi T, et al. “Do you want to take notes? identifying

    research missions in Yahoo! search

    pad,” in Proc. The 19th ACM

    international conference on World Wide

    Web, pp. 321-330, 2010.

    [12] Qvarfordt P, Golovchinsky G, Dunnigan T, et al. “Looking ahead:

    query preview in exploratory search,” in

    Proc. The 36th international ACM

    SIGIR conference on Research and

    development in information retrieval,

    pp. 243-252, 2013.

    [13] Hassan Awadallah A, White R W, Pantel P, et al. “Supporting complex

    search tasks,” in Proc. The 23rd ACM

    International Conference on Conference

    on Information and Knowledge

    Management, pp. 829-838, 2014.

    [14] Sun H C, Jiang C J, Ding Z J, et al. “Topic-Oriented Exploratory Search

    Based on an Indexing Network.” IEEE

    Transactions on Systems, Man, and

    Cybernetics: Systems, vol. 46, no.2, pp.

    234-247, 2016.

    [15] Ksikes, A. “Towards exploratory faceted search systems.” Doctoral

    dissertation, University of Cambridge,

    2014

    [16] Zhang Y, Cheng G, Qu Y. “Towards exploratory relationship search: A

    clustering-based approach,” in Proc.

    Joint International Semantic

    Technology Conference. Springer

    International Publishing, pp. 277-293,

    2013.

    [17] Bron M, Van Gorp J, Nack F, et al. “A subjunctive exploratory search interface

    to support media studies researchers,” in

    Proc. ACM SIGIR conference on

    Research and development in

    information retrieval, pp. 425-434, 2012.

    [18] Bespinyowong R, Chen W, Jagadish H V, et al. “ExRank: an exploratory

    ranking interface.” Proceedings of the

    VLDB Endowment, vol. 9, no. 13,

    pp.1529-1532, 2016.

    [19] Peltonen J, Strahl J, Floréen P. “Negative Relevance Feedback for

    Exploratory Search with Visual

    Interactive Intent Modeling,” in Proc.

    The 22nd ACM International

    Conference on Intelligent User

    Interfaces, pp.149-159, 2017.

    Aruna Kumari Yanduri is

    pursuing a Master of Science (M.Sc)

    in Computer Science at Sri

    Venkateswara University,

    Tirupati, A.P. She received her

    Bachelor of Science degree in 2017 from

    Rayalaseema University, Kurnool. Her

    research interests are Artificial Intelligence,

    Machine Learning, and Quantum

    Computing.

    A Deep Learning Enhanced Technique for Classification of Blood

    Cell Images

    Hari Varma B 1, Anjan Babu G 2

    1 PG Student, Department of Computer Science, Sri Venkateshwara University Tirupati

    2 Professor, Department of Computer Science, Sri Venkateshwara University Tirupati

    Abstract

    The problem of identifying and counting blood cells within the blood smear is of both theoretical

    and practical interest. The differential counting of blood cells provides invaluable information to

    pathologists for the diagnosis and treatment of many diseases. In this paper, we propose an efficient

    hierarchical blood cell image identification and classification method based on multi-class

    support vector machine. In this automated process, segmentation and classification of blood cells

    are the most important stages. We segment the stained blood cells in digital microscopic images

    and extract the geometric features for each segment to identify and classify the different types of

    blood cells. The experimental results are compared with the manual results obtained by the

    pathologist, and demonstrate the effectiveness of the proposed method.

    Keywords: Artificial Intelligence, Convolutional Neural Network, Machine Learning.

    I. INTRODUCTION

    It is well known that blood cells mainly

    include red blood cells, white blood cells

    and platelets. In blood, the leukocyte plays an

    important role in human immune

    function, so it is also called the immune cell.

    Usually, hematologists use

    granularity and shape information in leukocytes

    to divide white blood cells into granular cells

    (neutrophil, eosinophil and basophil) and

    non-granular cells (monocyte and lymphocyte).

    The proportion of these five types

    of cells in the blood differs between diseased and

    healthy blood. Doctors often use these

    basic data as criteria for

    determining the type and severity of a

    disease. Therefore, the study of

    white blood cell classification has great

    significance and value for medical

    diagnosis. Because of the importance of

    blood cell classification in diagnosis,

    researchers have proposed many

    algorithms to classify blood cells. In 2003, Sinha

    and Ramakrishnan [1] classified cells

    using SVM with a recognition rate

    of 94.1%. In 2006, Yampri et al. [2] used

    100 images to perform the same experiments. They

    implemented automatic thresholding and an

    adaptive contour to segment cells, and used

    the minimum error method to classify them;

    the recognition rate was 96% [2].

    Yampri et al. [2] used the KNN algorithm.

    However, the KNN algorithm does

    not handle imbalanced samples well. If

    the sample size of one class is large while

    the sample sizes of the other classes are small,

    several problems arise. For example, when

    a new sample is input to the

    diagnostic system, the class

    with the large sample size may dominate

    the K nearest neighbors of this sample.

    In addition, the algorithm is

    computationally expensive because

    the distance from each sample to be classified to

    every known sample must be

    computed in order to find its K nearest

    neighbors.
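
    Both drawbacks can be seen in a minimal KNN sketch. It is an
    illustration only, not the classifier used in the cited work: the
    two-dimensional feature vectors, the class labels attached to them
    and the function name knn_classify are made up for the example.

```python
from collections import Counter


def knn_classify(sample, training, k=3):
    """Classify `sample` by majority vote among its k nearest neighbors.

    `training` is a list of (feature_vector, label) pairs.
    """
    # Every known sample is measured and sorted, which is the
    # computational cost mentioned in the text.
    by_distance = sorted(
        training,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(sample, item[0])),
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Imbalanced toy training set: two samples of one class, five of another.
training_set = [
    ((1.0, 1.0), "lymphocyte"),
    ((1.2, 0.9), "lymphocyte"),
    ((2.0, 2.0), "neutrophil"),
    ((2.2, 2.1), "neutrophil"),
    ((2.1, 2.3), "neutrophil"),
    ((3.0, 3.0), "neutrophil"),
    ((3.2, 2.9), "neutrophil"),
]
query = (1.5, 1.5)

label_k3 = knn_classify(query, training_set, k=3)
label_k5 = knn_classify(query, training_set, k=5)
```

    With k = 3 the two closest small-class samples win the vote, but with
    k = 5 the larger class supplies three of the five neighbors and takes
    over the decision, even though the query lies closer to the small class.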

    II RELATED WORK

    Previously related blood cell

    classification algorithms mainly include the

    KNN algorithm, Bayesian classifier, SVM

    classifier, etc. We briefly review and discuss

    them in this section. The core idea of the KNN

    algorithm is that if most of the k nearest

    samples in a feature space belong to a certain

    category, then the sample also belongs to that

    category and shares the characteristics of the

    samples in it. This method determines

    the class in which the sample is to be

    classified based on the category of the nearest

    samples in determining the classification

    decision. The KNN method is only relevant

    to a very small number of neighboring

    samples in the category decision. Based on

    this theory, Young (1972) experimented with

    199 cell images. He first used histogram

    thresholds to segment white blood cells and

    classified them using a distance classifier.

    The recognition rate was 92.46% [24]. Bikhet

    et al. [25] used entropy based and iterative

    thresholding methods to divide cells and

    classify them with a distance classifier, with

    a recognition rate of 90.14%. Bayesian

    classification is based on statistical

    classification and uses its knowledge of

    probability statistics to classify data. In many

    classifications, naive Bayes algorithm can

    be compared with decision tree and neural

    network algorithm. Sinha and Ramakrishnan

    [1] used Bayesian classifiers to classify cells

    and the recognition rate was 82.3%. The era-

    Umpon and Dhompongsa (2007) used a

    Bayesian classifier to classify the bone

    marrow images of the Ellis Fisher Cancer

    Center at the center of Missouri (only one

    cell per picture), and the recognition rate was

    77% [26], [27]. Ghosh et al. [28] used a

    watershed algorithm to segment 150 cell

    images and classify them using a Bayesian

    classifier, and the recognition rate was

    83.2%. The classification idea of SVM is

    essentially similar to the linear regression LR

  • Conference Proceeding of

    5th International Conference on Science, Technology & Management (ICSTM-2019)

    Institution of Engineers, India, Sector 19A, Chandigarh, India

    on 24th February 2019, ISBN: 978-93-87433-47-2

    182 | Hari Varma B, Anjan Babu G

    classification method. It obtains a set of weight coefficients such
    that the samples can be classified after a linear representation.
    SVM first trains a separating hyperplane, which then serves as the
    decision boundary of the classification. The classical SVM algorithm
    is suitable only for two-class classification problems. After
    improvement, SVM can also be applied to multi-class problems. In the
    actual application of white blood cell classification, it is
    generally necessary to solve a multi-class problem. For example, the
    five-class leukocyte problem we studied can be solved by combining
    multiple binary SVMs. Rezatofighi and Soltanian-Zadeh [29] used
    Gram-Schmidt orthogonalization and the Snake algorithm to segment
    400 blood smears and classified them using SVM. Their recognition
    rate was 90% [29]. Recently, convolutional neural networks have been
    widely applied in various image classification fields. In
    particular, convolutional neural networks (ConvNets) [11] achieved
    unprecedented results in the 2012 ImageNet large-scale visual
    recognition challenge, which involved classifying natural images in
    the ImageNet dataset into 1000 fine-grained categories [3]. They
    have also significantly improved the performance of various medical
    imaging applications [30], [31], such as classification of lung
    diseases and lymph nodes in CT images [32], [33], segmentation
    (pixel classification) of brain tissues in MRI [34], vessel
    segmentation based on fundus images [37], and detection of cervical
    intraepithelial neoplasia (CIN, particularly CIN2+) at patient level
    based on Cervigram images or multimodal data [36]. In addition,
    ConvNets have shown superior performance in cell image
    classification, for example on pleural cancer [38] and human
    epithelial cell images [39]. Although these methods can be used to
    build good classification engines, they still have some drawbacks.
    Traditional machine learning methods (such as SVM) require features
    to be extracted manually, and the choice of features depends mainly
    on the designer's prior knowledge. Such feature extraction makes it
    difficult to use all the information contained in the image and
    increases the designer's workload. Deep learning algorithms
    effectively solve this problem, because they can automatically learn
    effective features of the image. Deep learning algorithms such as
    the deep residual network also perform well in image classification
    tasks. However, these neural network classification algorithms
    cannot fully utilize some features of the image that have a
    long-term dependency relationship with the image labels, and thus
    they cannot classify cell images the way a person with memory would.
    For this purpose, we introduce a recurrent neural network and fuse
    it with a convolutional neural network to perform the task of blood
    cell image classification.

    III PROPOSED SYSTEM

    Convolutional Neural Networks

    Both the 2-dimensional and 3-dimensional

    structures of an organ being studied are

    crucial in order to identify what is normal

    versus abnormal. By maintaining these local

    spatial relationships, CNNs are well-suited to

    perform image recognition tasks. CNNs have

    been put to work in many ways, including

    image classification, localization, detection,

    segmentation and registration. CNNs are the most popular machine
    learning algorithm in image recognition and visual learning tasks,
    due to their unique characteristic of preserving local image
    relations while performing dimensionality reduction. This captures
    important feature relationships in an image (such as how pixels on
    an edge join to form a line) and reduces the number of parameters
    the algorithm has to compute, increasing computational efficiency.
    CNNs are able to take as input and process both 2-dimensional images
    and, with minor modifications, 3-dimensional images. This is a
    useful advantage in designing a system for hospital use, as some
    modalities like X-rays are 2-dimensional while others like CT or MRI
    scans are 3-dimensional volumes.
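
    The two properties named above, local relations and a small
    parameter count, can be illustrated with a minimal sketch. It
    assumes a single-channel image, no padding and stride 1; the helper
    names conv2d_valid, dense_params and conv_params are hypothetical,
    and the layer sizes are arbitrary.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the response map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[y + i][x + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for x in range(out_w)
        ]
        for y in range(out_h)
    ]


def dense_params(height, width, units):
    """Weights plus biases of a fully connected layer on a flattened image."""
    return height * width * units + units


def conv_params(kernel_h, kernel_w, filters):
    """Weights plus biases of a convolutional layer on a single-channel image."""
    return kernel_h * kernel_w * filters + filters


# A dark-to-bright vertical edge and a small kernel that responds to it.
image = [[0, 0, 1, 1] for _ in range(4)]
kernel = [[-1, 1], [-1, 1]]
response = conv2d_valid(image, kernel)   # non-zero only in the column containing the edge

print(response)
print(dense_params(64, 64, 16), conv_params(3, 3, 16))
```

    The same four kernel weights are reused at every image position,
    which is why the convolutional layer needs far fewer parameters than
    a fully connected layer on the same input.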

    CNNs and Recurrent Neural Networks

    (RNNs) are examples of supervised machine

    learning algorithms, which require significant

    amounts of training data. Unsupervised

    learning algorithms have also been studied

    for use in medical image analysis. These

    include Autoencoders, Restricted Boltzmann

    Machines (RBMs), Deep Belief Networks

    (DBNs), and Generative Adversarial

    Networks (GANs).

    Fig. 3: The flow chart of automatic recognition of blood cells

    IV METHODOLOGY

    Colour Split Channel

    The blood smear may be stained by

    different colour dyes. To avoid being

    influenced by dye colour, all blood smear

    images were first transformed into gray level.

    A typical peripheral blood smear image

    consists of four components, which are the

    background, erythrocytes, leukocytes, and

    thrombocytes. Leukocytes appear rather

    darker than the background, and erythrocytes appear at an
    intermediate intensity level. For segmenting the desired object from
    the background, the green component of the RGB input image is found
    to give the best contrast between the background and the blood cell
    components, as shown in Fig. 4. As a result, the green channel is
    used to segment the blood cells in our proposed method.
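
    A channel comparison of this kind can be sketched as follows. The
    pixel values are invented for illustration and are not measurements
    from the paper; the contrast measure (difference of mean
    intensities) and the function names are likewise assumptions.

```python
def split_channels(rgb_image):
    """Split an image stored as rows of (R, G, B) tuples into three gray-level planes."""
    red = [[px[0] for px in row] for row in rgb_image]
    green = [[px[1] for px in row] for row in rgb_image]
    blue = [[px[2] for px in row] for row in rgb_image]
    return red, green, blue


def contrast(plane, cell_mask):
    """Absolute difference between mean background and mean cell intensity."""
    cell, back = [], []
    for row, mask_row in zip(plane, cell_mask):
        for value, is_cell in zip(row, mask_row):
            (cell if is_cell else back).append(value)
    return abs(sum(back) / len(back) - sum(cell) / len(cell))


# Hypothetical 2x2 smear: three background pixels and one stained cell pixel.
background, stained = (230, 220, 225), (150, 60, 160)
rgb_image = [[background, stained], [background, background]]
cell_mask = [[False, True], [False, False]]

red, green, blue = split_channels(rgb_image)
for name, plane in (("R", red), ("G", green), ("B", blue)):
    print(name, contrast(plane, cell_mask))
```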

    Image Segmentation

    Image segmentation basically consists of partitioning an image into
    a set of disjoint and homogeneous regions that are supposed to
    correspond to image objects meaningful to a certain application.
    Here, the segmentation process uses thresholding, morphology, and
    watershed to enclose every element in the blood slide in a distinct
    area.

    Binary

    In order to segment the desired object from the background, we need
    to generate a binary image that separates foreground and background
    pixels. To produce a representative binary image, Otsu’s adaptive
    threshold algorithm [7] is applied to the green channel to classify
    all the pixels into two classes. Otsu’s method exhaustively searches
    for the threshold Tc that minimizes the within-class variance,
    defined as a weighted sum of the variances of the two classes:

    $$\sigma_w^2(t) = p_1(t)\,\sigma_1^2(t) + p_2(t)\,\sigma_2^2(t), \qquad T_c = \arg\min_t \sigma_w^2(t),$$

    where the weight $p_i(t)$ is the probability of a pixel falling in
    the i-th class when the classes are separated by a threshold t, and
    $\sigma_i^2(t)$ is the variance of the pixels’ gray-level
    intensities in the i-th class. Fig. 5 shows the output binary image
    corresponding to the image shown in Fig. 4(c).

    Figure 5 The binary blood cell image

    generated by Otsu’s threshold algorithm.
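
    The exhaustive search can be written as a short sketch. It assumes
    8-bit gray levels and the convention that class 1 holds pixels below
    t and class 2 holds the rest; the function name otsu_threshold and
    the sample intensities are hypothetical.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t that minimizes p1*var1 + p2*var2 (exhaustive search)."""
    hist = [0] * levels
    for value in pixels:
        hist[value] += 1
    total = len(pixels)
    best_t, best_score = 0, float("inf")
    for t in range(1, levels):
        n1 = sum(hist[:t])          # pixels with value < t
        n2 = total - n1             # pixels with value >= t
        if n1 == 0 or n2 == 0:
            continue                # one class is empty, no valid split
        mean1 = sum(i * hist[i] for i in range(t)) / n1
        mean2 = sum(i * hist[i] for i in range(t, levels)) / n2
        var1 = sum(hist[i] * (i - mean1) ** 2 for i in range(t)) / n1
        var2 = sum(hist[i] * (i - mean2) ** 2 for i in range(t, levels)) / n2
        score = (n1 / total) * var1 + (n2 / total) * var2
        if score < best_score:
            best_t, best_score = t, score
    return best_t


# Hypothetical green-channel values: dark cell pixels and bright background pixels.
pixels = [50, 52, 48, 51, 49, 200, 198, 202, 201, 199]
t = otsu_threshold(pixels)
cell_mask = [1 if value < t else 0 for value in pixels]  # cells are darker than background
print(t, cell_mask)
```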

    Mathematical Morphology

    Mathematical morphology operations [8] are nonlinear,
    translation-invariant transformations. The basic morphological
    operations involving an image S and a structuring element E are
    erosion and dilation:

    $$S \ominus E = \bigcap_{e \in E} (S - e), \qquad S \oplus E = \bigcup_{s \in S} (E + s),$$

    where $\cap$ and $\cup$ denote set intersection and union,
    respectively, and E + s denotes the translation of the set E by a
    point s. The opening and closing, derived from erosion and dilation,
    are defined by

    $$S \circ E = (S \ominus E) \oplus E, \qquad S \bullet E = (S \oplus E) \ominus E.$$

    Mathematical morphology operations are used to fill holes in blood
    cells and to remove unwanted blood cells and background.

    Watershed

    The objective of watershed segmentation is to find the watershed
    lines, which follow the highest gray levels of the image surface.
    The simplest way to explain the watershed segmentation approach is
    by a flooding analogy. Imagine that a hole is pierced in each
    minimum of the surface, and that water floods the catchment basins
    from the holes. If the water of different catchment basins is likely
    to merge due to further flooding, a dam is built to prevent the
    merging. The flooding eventually reaches a stage when only the tops
    of the dams (the watershed lines) are visible above the water. In
    order to separate overlapping cells, the watershed is applied to the
    distance transform of the binary regions having a larger area.
    Fig. 6 shows the watershed result for the blood cell image.
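
    The idea of flooding the distance transform can be sketched in a
    simplified form. This is not the implementation used in this work:
    it assumes a city-block distance, hand-placed markers (one per
    cell), and a priority flood that assigns every pixel to the first
    basin reaching it instead of drawing explicit dam lines. All names
    are hypothetical.

```python
import heapq
from collections import deque

NEIGHBORS = ((-1, 0), (1, 0), (0, -1), (0, 1))


def distance_transform(mask):
    """City-block distance from every foreground pixel to the nearest background pixel."""
    h, w = len(mask), len(mask[0])
    dist = [[None if mask[y][x] else 0 for x in range(w)] for y in range(h)]
    queue = deque((y, x) for y in range(h) for x in range(w) if not mask[y][x])
    while queue:  # breadth-first search outward from the background
        y, x = queue.popleft()
        for dy, dx in NEIGHBORS:
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w and dist[ny][nx] is None:
                dist[ny][nx] = dist[y][x] + 1
                queue.append((ny, nx))
    return dist


def split_touching_cells(mask, markers):
    """Flood the inverted distance map from one marker per cell; returns a label image."""
    h, w = len(mask), len(mask[0])
    dist = distance_transform(mask)
    labels = [[0] * w for _ in range(h)]
    heap = []
    for label, (y, x) in enumerate(markers, start=1):
        labels[y][x] = label
        heapq.heappush(heap, (-dist[y][x], y, x))
    while heap:  # pixels deep inside a cell are flooded before the thin neck
        _, y, x = heapq.heappop(heap)
        for dy, dx in NEIGHBORS:
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and labels[ny][nx] == 0:
                labels[ny][nx] = labels[y][x]
                heapq.heappush(heap, (-dist[ny][nx], ny, nx))
    return labels


def in_disc(cy, cx, r, y, x):
    return (y - cy) ** 2 + (x - cx) ** 2 <= r * r


# Two overlapping synthetic "cells" of radius 4 on a 13 x 20 grid.
mask = [[in_disc(6, 6, 4, y, x) or in_disc(6, 13, 4, y, x) for x in range(20)]
        for y in range(13)]
labels = split_touching_cells(mask, markers=[(6, 6), (6, 13)])
for row in labels:
    print("".join(str(v) if v else "." for v in row))
```

    Because the neck between the two discs has a small distance value,
    it is flooded last, so each disc is filled from its own marker
    before the two regions meet.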

    Feature Extraction

    MATERIALS AND METHODS

    Image Data Collection: The blood specimens were obtained from
    different patients with sickle cell anaemia or sickle cell disease
    and from normal volunteers. Each blood cell image contains a number
    of normal and abnormal cells.

    Blood Cell Segmentation: Image segmentation is used to detect the
    entire blood cells (Dougherty, 1994; Wroblewska et al., 2003). In a
    segmented image, the picture elements are no longer pixels but
    connected sets of pixels, all belonging to the same region. An
    object can be easily detected in an image if it has sufficient
    contrast with the background. We use edge detection and basic
    morphology tools to detect a cell.

    The individual cells are close to each other, and the borders
    between them are not well defined. Morphological operations aim at
    extracting relevant structures of the image by probing the image
    with another set of known shape, called the structuring element,
    which is chosen using prior knowledge about the geometry of the
    relevant and irrelevant image structures.
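
    Probing an image with a structuring element can be sketched with
    plain sets of pixel coordinates. The sketch assumes binary images on
    an unbounded grid and a 3 x 3 square structuring element; the shapes
    are synthetic and the function names are hypothetical.

```python
def dilate(S, E):
    """Dilation: union of the structuring element E translated to every point of S."""
    return {(sy + ey, sx + ex) for sy, sx in S for ey, ex in E}


def erode(S, E):
    """Erosion: points p for which E translated to p lies entirely inside S."""
    ey0, ex0 = next(iter(E))
    # Any valid p must map this one element of E into S, which bounds the search.
    candidates = {(sy - ey0, sx - ex0) for sy, sx in S}
    return {
        (py, px)
        for py, px in candidates
        if all((py + ey, px + ex) in S for ey, ex in E)
    }


def opening(S, E):
    """Erosion followed by dilation: removes structures smaller than E."""
    return dilate(erode(S, E), E)


def closing(S, E):
    """Dilation followed by erosion: fills holes smaller than E."""
    return erode(dilate(S, E), E)


E = {(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)}   # 3 x 3 square
cell = {(y, x) for y in range(5) for x in range(5)}        # 5 x 5 synthetic cell
noisy = cell | {(10, 10)}                                  # cell plus one stray pixel
holed = cell - {(2, 2)}                                    # cell with a one-pixel hole

print(opening(noisy, E) == cell)   # stray pixel removed
print(closing(holed, E) == cell)   # hole filled
```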

    The best-known morphological operations include erosion, dilation,
    opening and closing. The morphological approach to image
    segmentation combines region growing and edge detection techniques
    (Serra, 1984; Ponsen et al., 2009). The applied procedure of image
    segmentation and cell separation consists of the following:

    • Transformation of the original image into gray scale
    • Detection of the entire cell using an edge detection technique
    • Application of dilation and erosion operations to smooth the
      object and to eliminate distortions

    Feature Extraction: Twenty-seven features were extracted from each
    cell image (Table 1). These included 4 geometrical features, 16
    statistical features and 7 moment-invariant features (Osowski et
    al., 2004; Santinelli et al., 2002).

    Geometrical Features Description: We use the following geometrical
    features to study the characteristics of the cells:

    • Area A: the number of pixels in the interior of the cell
    • Perimeter P: the total distance between consecutive points of the
      border
    • Compactness C: given by the formula $C = P^2 / A$
    • Form factor F: given by the formula $F = 4\pi A / P^2$
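
    The four geometrical features can be computed as in the following
    sketch. It assumes the border is available as an ordered, closed
    list of pixel coordinates and the area as a pixel count; the
    function names and the square test shape are hypothetical.

```python
import math


def perimeter(border):
    """Sum of Euclidean distances between consecutive border points (closed contour)."""
    n = len(border)
    return sum(math.dist(border[i], border[(i + 1) % n]) for i in range(n))


def geometric_features(area, border):
    """Area A, perimeter P, compactness P^2 / A and form factor 4*pi*A / P^2."""
    p = perimeter(border)
    return {
        "area": area,
        "perimeter": p,
        "compactness": p ** 2 / area,
        "form_factor": 4 * math.pi * area / p ** 2,
    }


# Ordered border of a 10 x 10 pixel square, walked clockwise from the top-left corner.
border = (
    [(0, x) for x in range(10)]
    + [(y, 9) for y in range(1, 10)]
    + [(9, x) for x in range(8, -1, -1)]
    + [(y, 0) for y in range(8, 0, -1)]
)
features = geometric_features(area=100, border=border)
print(features)
```

    Compactness and form factor carry the same information, since their
    product is always $4\pi$; the form factor is the one normalized so
    that a continuous circle gives 1.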

    SVM Classification

    Support vector machine (SVM) [10] refers to a set of related
    supervised learning methods that analyze data and recognize
    patterns, used for classification and regression analysis. The main
    advantage of the SVM used as a classifier is its very good
    generalization ability and powerful learning procedure, which leads
    to the global minimum of the defined error function. Given instances
    $x_i$, i = 1, …, l, with labels $y_i \in \{1, -1\}$, the main task
    in training SVMs is to solve the following quadratic optimization
    problem [11]:

    $$\min_{\alpha} \; f(\alpha) = \tfrac{1}{2}\,\alpha^T Q \alpha - e^T \alpha \quad \text{subject to} \quad 0 \le \alpha_i \le C,\; i = 1, \ldots, l, \quad y^T \alpha = 0,$$

    where e is the vector of all ones, C is the upper bound of all
    variables, Q is an l by l symmetric matrix with
    $Q_{ij} = y_i y_j K(x_i, x_j)$, and $K(x_i, x_j)$ is the kernel
    function. The best-known kernel functions are the Gaussian radial
    basis, polynomial, spline, and sigmoidal functions. The final
    learning problem of the SVM is transformed to the solution of the
    so-called dual problem defined with respect to the Lagrange
    multipliers [12], which yields the decision function

    $$s(x) = \sum_{i=1}^{l} \alpha_i y_i K(x_i, x) + b,$$

    where b is the bias, and the vector x is assigned to one class when
    s(x) is positive and to the alternative class when s(x) is negative.
    The hyperparameter of the kernel function and the regularization
    constant C were adjusted by repeating the learning experiments over
    a set of predefined values and choosing the values for which the
    classification error on the validation data set was the smallest.
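
    The quantities of this formulation can be made concrete with a small
    sketch. It assumes a Gaussian radial basis kernel with an arbitrary
    gamma, and the multipliers alpha and the bias b are set by hand
    rather than obtained from a solver, so it only illustrates how Q,
    the objective and the sign of s(x) are evaluated. All names are
    hypothetical.

```python
import math


def rbf_kernel(a, b, gamma=0.5):
    """Gaussian radial basis kernel K(a, b) = exp(-gamma * ||a - b||^2)."""
    return math.exp(-gamma * sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def q_matrix(xs, ys, kernel):
    """Q_ij = y_i * y_j * K(x_i, x_j)."""
    return [[yi * yj * kernel(xi, xj) for xj, yj in zip(xs, ys)]
            for xi, yi in zip(xs, ys)]


def dual_objective(alpha, Q):
    """f(alpha) = 1/2 * alpha^T Q alpha - e^T alpha."""
    n = len(alpha)
    quad = sum(alpha[i] * Q[i][j] * alpha[j] for i in range(n) for j in range(n))
    return 0.5 * quad - sum(alpha)


def decision(x, xs, ys, alpha, b, kernel):
    """s(x) = sum_i alpha_i * y_i * K(x_i, x) + b."""
    return sum(a * y * kernel(xi, x) for a, y, xi in zip(alpha, ys, xs)) + b


# Two hypothetical training points, one per class.
xs = [(0.0, 0.0), (2.0, 0.0)]
ys = [1, -1]
alpha = [1.0, 1.0]   # hand-picked; satisfies y^T alpha = 0 and 0 <= alpha_i <= C for C >= 1
b = 0.0

Q = q_matrix(xs, ys, rbf_kernel)
print(dual_objective(alpha, Q))
print(decision((0.2, 0.0), xs, ys, alpha, b, rbf_kernel) > 0)   # assigned to class +1
print(decision((1.8, 0.0), xs, ys, alpha, b, rbf_kernel) > 0)   # assigned to class -1
```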

    The one-against-one method [13] is applied to deal with the problem
    of multiple classes, and majority voting over the resulting
    classifiers is used to find the final classification result. During
    the training phase, the multi-class SVM models are learned from the
    training data. In the testing phase, the learned models are employed
    to generate multiple sets of predictions for each test sample. The
    class receiving the largest number of predictions is the final
    decision.
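
    The voting step can be sketched as follows. The pairwise decision
    functions here are simple thresholds on one feature that stand in
    for trained binary SVMs; the class names, thresholds and the
    function name one_vs_one_predict are hypothetical.

```python
from collections import Counter
from itertools import combinations


def one_vs_one_predict(x, classes, pairwise):
    """Majority vote over all pairwise classifiers.

    pairwise[(a, b)] returns a positive value for class a, otherwise class b wins.
    """
    votes = Counter()
    for a, b in combinations(classes, 2):
        winner = a if pairwise[(a, b)](x) > 0 else b
        votes[winner] += 1
    return votes.most_common(1)[0][0]


classes = ["A", "B", "C"]
# Stand-ins for trained binary SVMs: classes centred near 1, 5 and 9 on one feature.
pairwise = {
    ("A", "B"): lambda x: 3.0 - x,
    ("A", "C"): lambda x: 5.0 - x,
    ("B", "C"): lambda x: 7.0 - x,
}

for x in (0.0, 4.5, 10.0):
    print(x, one_vs_one_predict(x, classes, pairwise))
```

    For k classes this scheme needs k(k - 1)/2 binary classifiers, three
    in the example above.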

    From the earlier literature, we found that it is hard to accurately
    distinguish blood cells into seven classes using single-stage SVM
    classification. Thus, we propose a hierarchical SVM classification
    to improve the recognition ratio. Fig. 8 illustrates our proposed
    hierarchical strategy. For fast and efficient classification, five
    features (area, histogram, circularity, cytoplasm ratio, and color
    of cytoplasm) are extracted for the subsequent SVM training. At the
    first level, the feature “area” separates blood cells into two
    groups: thrombocytes on one side, and erythrocytes and leukocytes on
    the other. Next, the feature “histogram” is used to distinguish
    erythrocytes from leukocytes. For leukocytes, the feature
    “circularity” is used to distinguish granulocytes from
    agranulocytes, because agranulocytes belong to the mononuclear cell
    group. We then use the feature “color of cytoplasm” to divide
    granulocytes into neutrophils, eosinophils, and basophils. Finally,
    monocytes and lymphocytes are recognized by the feature
    “cytoplasm ratio.”
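
    The control flow of this hierarchy can be sketched as below. Each
    node would be a trained SVM in the proposed method; here the nodes
    are stand-in functions with invented thresholds and feature
    encodings, so only the order of the decisions reflects the text.

```python
def classify_cell(features, nodes):
    """Walk the hierarchy; each node classifier looks at a single feature."""
    if nodes["area"](features["area"]) == "thrombocyte":
        return "thrombocyte"
    if nodes["histogram"](features["histogram"]) == "erythrocyte":
        return "erythrocyte"
    if nodes["circularity"](features["circularity"]) == "granulocyte":
        return nodes["cytoplasm_color"](features["cytoplasm_color"])
    return nodes["cytoplasm_ratio"](features["cytoplasm_ratio"])


# Stand-in node classifiers; all thresholds and encodings are made up for illustration.
nodes = {
    "area": lambda a: "thrombocyte" if a < 100 else "other",
    "histogram": lambda h: "erythrocyte" if h > 0.5 else "leukocyte",
    "circularity": lambda c: "agranulocyte" if c > 0.8 else "granulocyte",
    "cytoplasm_color": lambda c: ("neutrophil", "eosinophil", "basophil")[c],
    "cytoplasm_ratio": lambda r: "lymphocyte" if r < 0.3 else "monocyte",
}

cell = {"area": 900, "histogram": 0.2, "circularity": 0.9,
        "cytoplasm_color": 0, "cytoplasm_ratio": 0.6}
print(classify_cell(cell, nodes))
```

    A sample leaves the hierarchy as soon as a node assigns it a final
    class, so most cells are classified after evaluating only one or two
    features.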

    V CONCLUSION

    This study demonstrated an efficient hierarchical blood cell
    classification method using geometric features from the nucleus and
    the cytoplasm and a multi-class SVM classification scheme.
    Classification using

    the proposed hierarchical strategy

    outperformed classification using only the

    single-stage SVM because the cytoplasm of

    some leukocytes presents a very weak

    difference against the background and

    touches neighboring cells. In addition,

    experimental results showed that using the

    hierarchical multi-class SVM classification

    with hierarchical features could indeed

    improve the classification performance

    compared to the sing