Accepted Manuscript Title: An Effective and Economical Architecture for Semantic-based Heterogeneous Multimedia Big Data Retrieval Author: Kehua Guo Wei Pan Mingming Lu Xiaoke Zhou Jianhua Ma PII: S0164-1212(14)00204-0 DOI: http://dx.doi.org/doi:10.1016/j.jss.2014.09.016 Reference: JSS 9382 To appear in: Received date: 5-12-2013 Revised date: 30-8-2014 Accepted date: 8-9-2014 Please cite this article as: Guo, K., Pan, W., Lu, M., Zhou, X., Ma, J.,An Effective and Economical Architecture for Semantic-based Heterogeneous Multimedia Big Data Retrieval, The Journal of Systems and Software (2014), http://dx.doi.org/10.1016/j.jss.2014.09.016 This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

An effective and economical architecture for semantic-based heterogeneous multimedia big data retrieval

Highlights

  • The precision rate outperforms some other approaches in the case of user feedback.
  • When the database grows, the time cost is significantly lower than that of other approaches.
  • Semantic information is stored in the database; large multimedia data is not processed directly.
  • The storage and I/O costs are reduced.
  • Low-end computers together with open-source frameworks are adopted.
  • The investment possesses good economic efficiency.

An Effective and Economical Architecture for Semantic-based

Heterogeneous Multimedia Big Data Retrieval

Kehua Guo^a, Wei Pan^b, Mingming Lu^a, Xiaoke Zhou^a, Jianhua Ma^c

^a School of Information Science & Engineering, Central South University, Changsha, China

^b School of Software, Central South University, Changsha, China

^c Faculty of Computer and Information Sciences, Hosei University, Tokyo, Japan

Corresponding author: Kehua Guo, [email protected]

Abstract

Data variety is one of the most critical features of multimedia big data. Multimedia documents in different data formats and storage structures often express similar semantic information. How to manage and retrieve multimedia documents in a way that reflects users' intent in heterogeneous big data environments has therefore become an important issue. In this paper, we present an effective and economical architecture named SHMR (Semantic-based Heterogeneous Multimedia Retrieval), which stores and retrieves semantic information from heterogeneous multimedia data at low cost. Firstly, the particularity of heterogeneous multimedia retrieval in big data environments is addressed. Secondly, an approach to extract and represent semantic information for heterogeneous multimedia documents is proposed. Thirdly, a NoSQL-based approach to semantic storage is provided, in which multimedia can be processed in parallel on distributed nodes. Finally, a MapReduce-based retrieval algorithm is presented, and a scheme supported by user feedback is designed to achieve high retrieval precision and a good user experience. The experimental results indicate that the retrieval performance and economic efficiency of SHMR are suitable for multimedia information retrieval in heterogeneous big data environments.

Keywords: heterogeneous multimedia, semantic-based retrieval, information retrieval, variety, big data

1 Introduction

Nowadays, huge volumes of multimedia such as images, audio, videos and text documents are generated and consumed daily. With the development of Internet technology and multimedia provision, the amount of rich multimedia content has exploded. Currently, multimedia makes up 60% of Internet traffic, 70% of mobile phone traffic, and 70% of all available unstructured data (Smith, 2011). Multimedia has become a form of big data that gives users valuable information such as event occurrence, network computing, purchase recommendation and workflow control (Chen and Yang, 2011; Wang and Jiang et al., 2014; Wang and Liu et al., 2014). Therefore, multimedia content retrieval in big data environments is spurring a tremendous amount of research (Liu et al., 2013).

Multimedia big data retrieval has its own particularities. In this paradigm, multimedia computing has switched to a distributed pattern to store and process massive multimedia content. Although this alleviates the maintenance and computing burden of the client, multimedia big data storage and processing face great challenges. In big data environments, a large number of commodity computers with massive computation power and storage capacity generate multimedia content. Since many services and applications provide, edit, process, and retrieve rich multimedia content, multimedia documents with different data formats and storage structures often express similar semantic information. Incompatible data formats, non-aligned data structures and inconsistent data semantics have been important problems in multimedia big data research. Therefore, the most fundamental challenge for multimedia big data storage and retrieval is heterogeneity, which can be highlighted as follows: (1) Content heterogeneity. Multimedia content generated by applications may be varied and unstructured. For example, different types of multimedia services generate images, videos, audio, graphics or text documents. Even among videos, the content may be generated by transportation cameras, video conferencing or user uploads. (2) Service requirement heterogeneity. Information retrieval may exist in different services, such as photo sharing, information rendering and semantic retrieval. Different users require different quality of service (QoS) and quality of experience (QoE). In this case, service providers should guarantee the service performance for millions of users and simultaneously meet different requirements as far as possible. (3) Terminal device heterogeneity. In big data environments, numerous types of terminals, such as personal computers (PCs), laptops, pads and mobile phones, can be used to access massive multimedia. Moreover, even for a single type of terminal, various profiles exist. For instance, mobile phones run different operating systems (OS), such as Windows CE, Android and Apple's iOS. Thus, the retrieval architecture should provide ubiquitous services for various clients. These features have become new challenges in the area of multimedia big data storage and retrieval.

In the multimedia retrieval process, users' intent is another critical issue. In some traditional approaches, retrieval is usually restricted to a single type of multimedia content, such as images. This constraint reduces the QoS of multimedia retrieval because the returned results may fail to reflect users' search intent due to the lack of type diversity.

Therefore, addressing type heterogeneity, storage distribution and users' intent becomes a significant issue for good retrieval performance and economic efficiency (Smith, 2012). In this paper, a semantic-based approach to represent users' intent is adopted, and a novel storage and retrieval architecture named SHMR (Semantic-based Heterogeneous Multimedia Retrieval) is proposed to support heterogeneous multimedia big data retrieval. The characteristics of SHMR are as follows: (1) heterogeneous multimedia retrieval, because any type of multimedia document can be uploaded and retrieved; (2) convenience, since it offers a familiar retrieval interface similar to traditional commercial search engines; (3) reduced I/O cost, as we store the ontology-represented semantic information in the database and provide links to the real multimedia documents rather than directly processing large multimedia data; (4) economic efficiency, as low-end computers together with open-source frameworks are adopted to store the NoSQL database and process the retrieval, respectively.

The remainder of this paper is structured as follows. Section 2 reviews related works and briefly introduces the overall concept of SHMR. Section 3 provides a detailed description of our proposed architecture, including semantic extraction and representation, semantic storage, the multimedia retrieval algorithm and user feedback. Section 4 provides the performance evaluation and experimental results. Section 5 summarizes our contribution and points to future work.

2 Related Works

In the past decades, multimedia retrieval was mainly founded on text-based approaches, which rely solely on the text surrounding multimedia in certain host files. Although keywords can be used to retrieve various types of multimedia documents, this method does not intrinsically support heterogeneous retrieval. The retrieval cannot achieve excellent performance because of noise (Zhao and Grosky, 2002; Yang et al., 2012).

Although users find it more convenient to retrieve multimedia content through text keywords, content-based retrieval has been widely used in commercial search engines, such as Google and Bing Image Search. However, it is extremely difficult to execute heterogeneous retrieval based on multimedia content (Smeulders et al., 2000; Zhou et al., 2012). For instance, given a video document and an audio document of the same artist, content-based approaches cannot identify the artist or extract other similar features from the binary data of the two documents because their data formats differ. Thus, in many cases, content-based approaches may ignore users' retrieval intent.

To support retrieval that reflects users' intent, some contributions have focused on introducing relevance feedback (RF) into content-based retrieval. Comprehensive surveys of RF in image retrieval systems were presented in (Datta et al., 2008). Representative contributions include an active learning algorithm for conducting effective relevance feedback (He, 2010), Support Vector Machine (SVM) based feedback analysis (Wang et al., 2011), local geometrical graph based feedback learning (Chen et al., 2011) and the Biased Discriminant Analysis (BDA) approach (Zhang et al., 2012). However, the main drawback of RF is that it increases user involvement. Query users are expected to provide only limited feedback, and excessive feedback increases their burden (Datta et al., 2008).

To support heterogeneous multimedia retrieval and reflect users' intent, a feasible approach is to use social semantic information and automatic semantic analysis (Wong and Leung, 2008; Gijsenij and Gevers, 2010). Related models have been widely used. At present, text semantic information is generally extracted using topic models such as PLSA (Probabilistic Latent Semantic Analysis) (Hofmann, 2001) and LDA (Latent Dirichlet Allocation) (Blei et al., 2003). In addition, the BoW (Bag-of-Words) model (Wu et al., 2010) has become a typical model for expressing visual words. For semantic information representation, ontology is the most widely used method (Maedche and Staab, 2001). Some works have improved on traditional ontology. For example, Yang et al. (2008) proposed a hierarchical ontology-based knowledge representation model, and Wang et al. (2008) used the Semantic Web Rule Language to define the semantic ontology in the Web environment.

In recent years, some contributions using the above approaches to support heterogeneous multimedia retrieval have been proposed. For text-image retrieval, Rasiwasia et al. (2010) modeled the correlations between text and image modalities and learned them with canonical correlation analysis. In 2014, this approach was revised to achieve better performance (Costa et al., 2014). Zhai et al. (2013) proposed a heterogeneous media similarity measure with nearest neighbors that considers both intra-media and inter-media correlations. Liu et al. (2014) reported an accumulated reconstruction error vector that combines the original feature descriptions into a shared semantic space. However, the above approaches only support heterogeneous retrieval between image and text documents.

To support retrieval across various types, Lu et al. (2012) proposed the IBCR (Indexing-based Cross-Media Retrieval) approach and designed an MK-tree index based on heterogeneous data distribution to manage the media objects within the semantic space and improve the performance of heterogeneous multimedia retrieval. Yang et al. (2012) constructed a semi-semantic graph by jointly analyzing the heterogeneous multimedia data. However, these approaches ignore the semantic information provided by social users and only focus on automatic learning and the relevance feedback of query users. In these systems, semantic features and multimedia documents are stored in the servers' databases; when the data scale increases, processing large multimedia data costs considerable computation resources.

Retrieval performance and economic efficiency are very important factors in evaluating multimedia big data retrieval systems. In big data environments, information retrieval encounters particular problems such as data complexity, uncertainty and emergence (Liu et al., 2013). Traditional RDBMS (Relational Database Management System) technology cannot satisfy the requirements of heterogeneous information retrieval due to the data variety and the high investment (Smith, 2012). At present, NoSQL technology is useful for storing information that can be represented in map format. Apache HBase is a typical database realizing the NoSQL idea, which offers a simpler design, horizontal scaling and finer control over availability. The features of HBase are outlined in the original works on the Google File System (Ghemawat et al., 2003) and BigTable (Chang et al., 2008). In HBase, tables serve as the input and output for MapReduce (Dean and Ghemawat, 2008) jobs running in Hadoop (Apache Hadoop, 2013), and may be accessed through typical APIs such as Java (Apache Hbase, 2013).

In this paper, SHMR is presented as an effective and economical architecture that uses inexpensive investment to store and retrieve semantic information from heterogeneous multimedia data. In this architecture, large multimedia data is not processed directly; HBase stores only the ontology-represented semantic information, which can be processed in parallel on distributed nodes with a MapReduce-based retrieval algorithm. The experimental results show that SHMR can effectively identify heterogeneous multimedia.

3 Methodology

3.1 Overview

This section presents how SHMR combines semantic information and multimedia documents to perform big data retrieval. Generally speaking, big data processing tools (e.g. Hadoop) are open-source and freely available. On the one hand, Hadoop provides a programming model for distributed computing. Thus, a traditional retrieval algorithm can be revised to follow the distributed paradigm as long as it meets the MapReduce programming specification, and multimedia big data retrieval can be performed without increasing the users' cost. On the other hand, the semantic information of multimedia documents can be easily obtained and saved because of the existence of various computation models (Hofmann, 2001; Blei et al., 2003; Wu et al., 2010). Hence, in SHMR, such valuable assets are applied to facilitate intent-reflecting heterogeneous multimedia big data retrieval.

SHMR adopts a four-step architecture, as shown in Fig. 1. The architecture consists of the following steps: multimedia semantic input (Fig. 1(a)), ontology semantic representation (Fig. 1(b)), NoSQL-based semantic storage (Fig. 1(c)) and MapReduce-based heterogeneous multimedia retrieval (Fig. 1(d)). In consideration of economic efficiency, we select Apache Hadoop as the implementation tool.

[Figure 1, four panels: (a) Multimedia Semantic Input; (b) Ontology Semantic Representation; (c) NoSQL-based Semantic Storage; (d) MapReduce-based Heterogeneous Multimedia Retrieval, with sub-steps (d1) to (d4).]

Fig. 1 Overview of SHMR

In the first step, multimedia content is obtained from various sources such as Web crawling, sensor collection and user generation. The multimedia types may include images, videos, audio or text documents in various formats. The semantic information is initialized in two ways: (1) social annotating, which means extracting semantic information from annotations provided by social users; (2) automatic learning, which denotes analyzing semantic information from multimedia features using topic models. After the semantic extraction, the semantic fields together with the multimedia location are represented by ontology in the second step. A weight adjustment scheme is employed to adjust the weight of every semantic field.

In the third step, ontology files are saved into HBase and linked to the real multimedia data through the location information. For better adaptation to the NoSQL-based big data processing tool, we use a map<key, value> structure conversion process to normalize the correspondence between multimedia location and semantic fields. Next, the index and storage blocks are generated, according to which the ontology is saved into the NoSQL-based distributed semantic database managed by HBase.

In the fourth step, users can upload annotated multimedia documents in arbitrary formats to execute heterogeneous multimedia retrieval (Fig. 1 (d1)). The semantic information of the uploaded file(s) is extracted by social annotating and automatic learning (Fig. 1 (d2)). Then the engine executes ontology generation and map structure conversion to adapt to MapReduce-based retrieval. After the retrieval, the engine returns the results as thumbnails with the file locations (Fig. 1 (d3)). Finally, social users are asked to give additional annotations to the multimedia documents they selected (Fig. 1 (d4)) to make the annotations more abundant and accurate.

3.2 Semantic Fields Extraction

For multimedia documents, semantic fields are extracted by social annotating and automatic learning. Fig. 2 shows the two approaches to semantic field extraction for a typical Flickr document.

[Figure 2, diagram labels: Text Comments; User Tags; LDA (Blei et al., 2003); Extract As Initial Social Annotations; Semantic Fields; Retrieval User; Add Tags (Guo et al., 2014).]

Fig. 2 Semantic Fields Extraction from a Flickr Document

Social annotating is divided into two categories: (1) In the initialization phase, the user tags of multimedia documents are extracted as the social annotations (note: for simulation, the dataset is crawled from typical websites such as Flickr, Wikipedia and YouTube, since the user tags on these websites can be easily analyzed); (2) While SHMR is in use, social users can manually annotate multimedia documents using the software interfaces proposed in our previous research (Guo et al., 2014). All the semantic fields are described by text, which is converted into bytes for storage.

In automatic learning, the LDA model (Blei et al., 2003) is employed to analyze the semantic information. This model performs the extraction based on the multimedia text comments, which are supplied by the content provider and embedded in the host document. The LDA model is used to extract the topic text as the semantic fields, using the APIs of STMT (Stanford Topic Modeling Toolbox, 2014). In addition, this step considers the relationships between multimedia documents of different types, which are expressed through hyperlinks. In this case, the related hyperlinks are added to the semantic fields.

3.3 Multimedia Semantic Input

Define $M$ as a multimedia document and $C$ as the set of all the input multimedia documents, satisfying $C = \{M_1, M_2, \ldots, M_N\}$ (where $N$ is the number of multimedia documents). Any $M_i \in C$ is saved in the file system. The location information of $M_i$ is represented as text linked to the real file.

Although any $M_i \in C$ has numerous semantic fields, not all fields accurately represent users' understanding of $M_i$. Therefore, any $s_i \in S_{mi}$ is assigned a weight. Hence, any $M_i \in C$ has a final semantic matrix $S_{mi}$ as follows:

$$S_{mi} = \begin{bmatrix} s_1 & s_2 & \cdots & s_n \\ w_1 & w_2 & \cdots & w_n \end{bmatrix}^{T} \qquad (1)$$

where $s_i$ is the $i$-th semantic field, $n$ is the number of semantic fields of $M_i$, and $w_i$ is the corresponding weight. Therefore, all the semantic matrices for the multimedia documents can be defined as $S = \{S_{m1}, S_{m2}, \ldots, S_{mN}\}$. Each weight $w_i$ of any $M_i \in C$ is assigned an initial value of $1/n$.

It is evident that $w_i$ for a semantic field should not stay constant during retrieval. Semantic fields that are used more frequently during the retrieval process describe users' intent better, and they should be assigned a greater weight. An adjustment scheme is designed to adjust the weight of every semantic field whenever a returned document $M$ is retrieved. The algorithm is detailed as follows:

Algorithm 1. Weight Adjustment Scheme

 1  1. Input: Define matrix Smi for every input multimedia Mi, define semantic matrix S.
 2  2. Initialize: (1) Obtain the semantic fields and store them to Smi.
 3                 (2) Assign wi in Smi as 1/n.
 4                 (3) Combine all the Smi to generate semantic matrix S.
 5  3. Adjustment During Retrieval: Set step=1
 6     For each returned document M
 7        For j=1 to n
 8           If M is retrieved by sj Then
 9              set kj=1
10           Else set kj=0
11           End If
12           wj = wj + kj/n
13        End For
14     End For
15  4. Output: Restore all the adjusted semantic matrices to generate new matrix S.

Lines 8 to 12 show that Algorithm 1 assigns greater weights to more frequently used semantic fields. In a later algorithm (Algorithm 3), the fields with smaller weights are eliminated to make the semantic information more accurate.
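The weight adjustment scheme can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class `SemanticMatrix`, the function `adjust_weights`, the example location string and the reading of "retrieved by s_j" as "s_j appears among the query's fields" are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SemanticMatrix:
    """Semantic fields of one multimedia document with their weights (Eq. 1)."""
    location: str                                   # text link to the real file
    weights: dict = field(default_factory=dict)     # semantic field -> weight

    @classmethod
    def from_fields(cls, location, fields):
        # Every field starts with the initial weight 1/n.
        n = len(fields)
        return cls(location, {s: 1.0 / n for s in fields})


def adjust_weights(matrix, query_fields):
    """Apply w_j = w_j + k_j / n for one returned document.

    Assumption: k_j is 1 when field s_j is among the query's fields
    (the document was retrieved through s_j) and 0 otherwise.
    """
    n = len(matrix.weights)
    for s in matrix.weights:
        k = 1 if s in query_fields else 0
        matrix.weights[s] += k / n
    return matrix


# Hypothetical document with four semantic fields, each starting at 1/4.
doc = SemanticMatrix.from_fields("hdfs://example/img_001.jpg",
                                 ["beach", "sunset", "holiday", "sea"])
adjust_weights(doc, {"beach", "sea"})
```

After one retrieval through "beach" and "sea", those two fields carry weight 0.5 while the others stay at 0.25, so repeated use steadily separates frequently used fields from rarely used ones.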

The initial weight assignment has to check all the multimedia documents in the database, which is computationally expensive. To solve this problem, this process is executed only once, when the search engine is initialized, and it runs in a background thread. For the weight adjustment scheme during the retrieval process, the computational complexity can be measured as a function of the returned list $R$. The scheme has to check the semantic fields of every returned document. Hence, its computational complexity is $O(n|R|)$ (where $|\cdot|$ represents the cardinality of a set).

3.4 NoSQL-based Semantic Storage

NoSQL databases have been widely used in industry, including in big data and real-time Web applications. In SHMR, this technology is used to store the semantic fields and multimedia locations, which are represented in a highly optimized map<key, value> format. The data can be stored and retrieved using models with less constrained consistency than traditional relational databases such as Oracle and Microsoft SQL Server. In SHMR, Apache HBase is adopted to simplify the storage.

In SHMR, ontology nodes at the first level represent the most obvious features. Semantic fields at the second and deeper levels are provided based on the previous levels. All the information is extracted through the original semantic input and user feedback. SHMR adopts the composite pattern, in which objects are composed into a tree structure to represent the part-whole hierarchy (Guo et al., 2014). This pattern treats simple and complex elements uniformly. The client uses the same method to deal with complex elements as with simple elements, so that the internal structure of the complex elements is independent of the client program (Guo and Zhang, 2013).
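As a rough illustration of the composite pattern described above, the following Python sketch models an ontology tree. The class names `OntologyComponent`, `OntologyComposite` and `OntologyLeaf` and the attribute `children` follow the labels in Fig. 3; the methods `add` and `fields` and the example field names are hypothetical and chosen only for this sketch.

```python
from abc import ABC, abstractmethod


class OntologyComponent(ABC):
    """Common interface shared by simple and complex ontology elements."""

    def __init__(self, name):
        self.name = name

    @abstractmethod
    def fields(self):
        """Return all semantic field names at or below this node."""


class OntologyLeaf(OntologyComponent):
    """A simple element: a single semantic field without children."""

    def fields(self):
        return [self.name]


class OntologyComposite(OntologyComponent):
    """A complex element: a semantic field that owns lower-level fields."""

    def __init__(self, name):
        super().__init__(name)
        self.children = []

    def add(self, child):
        self.children.append(child)
        return self          # allow chained construction

    def fields(self):
        # Depth-first traversal: this node first, then every child subtree.
        result = [self.name]
        for child in self.children:
            result.extend(child.fields())
        return result


# Hypothetical two-level ontology for one document.
artist = OntologyComposite("artist")
artist.add(OntologyLeaf("singer"))
artist.add(OntologyComposite("album").add(OntologyLeaf("live")))
```

Client code calls `fields()` on any node without knowing whether it is a leaf or a composite, which is the independence from internal structure that the paragraph above refers to.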

To facilitate the subsequent data processing in MapReduce, the data structure must be converted into map<key, value> pairs because the map function takes a key-value pair as input. HBase stores the files as blocks on data nodes; the size of every block is a fixed value (e.g. 64 MB), and the corresponding multimedia semantic ontology files are recorded in each block. The file format is shown in Fig. 3.

[Figure 3, diagram labels: Block File (Block Header followed by Records); Record (Record Size, Key Bytes (Multimedia Location), Value Bytes (Multimedia Ontology)); Data Nodes with Partition 0, Partition 1, ..., Partition n; Original Multimedia File; OntologyComponent, OntologyComposite and OntologyLeaf with a "children" relation.]

Fig. 3 Map Structure of Block and Record

To reduce the network I/O load between data nodes during job dispatching, multimedia data are not stored in the block files, as Fig. 3 shows. For each record, the key is the location of the original multimedia document, and the value is the ontology content, represented as a byte array.
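A record of this kind can be sketched in Python as follows. The paper only names the three parts of a record (record size, key bytes, value bytes); the concrete byte layout below, with two big-endian 32-bit integers for the record size and the key length, the newline-separated encoding of the ontology fields, and the helper names `encode_record` and `decode_record` are assumptions for illustration, not the format used by SHMR or HBase.

```python
import struct

# Assumed header layout: record size and key length as unsigned 32-bit integers.
HEADER = struct.Struct(">II")


def encode_record(location, ontology_fields):
    """Pack a multimedia location (key) and its semantic fields (value)."""
    key = location.encode("utf-8")
    value = "\n".join(ontology_fields).encode("utf-8")
    return HEADER.pack(len(key) + len(value), len(key)) + key + value


def decode_record(blob):
    """Recover (location, semantic fields) from an encoded record."""
    size, key_len = HEADER.unpack_from(blob, 0)
    body = blob[HEADER.size:HEADER.size + size]
    key = body[:key_len].decode("utf-8")
    value = body[key_len:].decode("utf-8")
    fields = value.split("\n") if value else []
    return key, fields


# The media file itself is never embedded; only its location is stored.
blob = encode_record("hdfs://example/video_7.mp4", ["concert", "artist"])
```

Because the value holds only the ontology text, a record stays a few dozen bytes long regardless of how large the referenced video is, which is the source of the I/O saving described above.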

3.5 MapReduce-based Heterogeneous Multimedia Retrieval

The MapReduce-based algorithm consists of two steps: (1) the mapper function, which processes a key-value pair to generate a set of intermediate key-value pairs; (2) the reducer function, which processes the intermediate values associated with the same intermediate key. Every query is assigned a QueryId and a QueryOntology, and the returned results form a ReturnedList. The MapReduce-based retrieval process is shown in Fig. 4.

[Figure 4, diagram labels: User; Queries (q1 ... qn); HBase Storage; mapper (m1, m2, ..., mn); Intermediate Result 1, 2, ..., n; reducer (r1, r2, ..., rn); ReturnedList; all inside the Hadoop Environment.]

Fig. 4 MapReduce-based Retrieval Process

In the retrieval, for any $M_i, M_j \in C$, the similarity function is computed as:

$$\mathrm{similarity}(M_i, M_j) = \frac{1}{N(i)\,N(j)}\,\bigl|O(M_i) \cap O(M_j)\bigr| \qquad (2)$$

where $N(i)$ and $N(j)$ are the numbers of rows of $S_{mi}$ and $S_{mj}$, and $O(M_i)$ and $O(M_j)$ are the collections of all the semantic fields of multimedia $M_i$ and $M_j$, respectively.
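The similarity function can be sketched in Python under one stated assumption: formula (2) is read here as the number of semantic fields shared by the two documents, divided by the product of their field counts N(i) and N(j). The function name and the example fields are illustrative only.

```python
def similarity(fields_i, fields_j):
    """Shared semantic fields, normalised by the product of the field counts.

    Assumed reading of formula (2): |O(Mi) & O(Mj)| / (N(i) * N(j)).
    """
    o_i, o_j = set(fields_i), set(fields_j)
    if not o_i or not o_j:
        return 0.0           # a document without fields matches nothing
    return len(o_i & o_j) / (len(o_i) * len(o_j))


# An image and an audio file that share two of their semantic fields.
score = similarity(["beach", "sea", "sunset"], ["sea", "beach"])
```

The comparison uses only semantic fields, so the two documents may be of entirely different media types. Here two shared fields over 3 x 2 field slots give a score of one third.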

In the queries and returned lists, the information is represented as byte arrays. The mapper function takes pairs of record key (multimedia location) and record value (multimedia ontology). For each pair, the retrieval engine evaluates every query against the record using the similarity function defined in formula (2). The MapReduce tool runs the mapper functions in parallel on each machine. When the mapper functions finish, the MapReduce tool groups the intermediate output by QueryId. For each QueryId, the reducer function, running locally on each machine, takes the results whose similarity is above the average value and outputs them into the ReturnedList. The pseudocode in Algorithm 2 outlines the retrieval implementation.

Algorithm 2. MapReduce-based Retrieval Algorithm

1. Input: (1) Records (containing RecordSize, RecordKey, RecordValue).
          (2) Queries (containing QueryIds, QueryOntologies).
2. Initialize: Configure the Hadoop running environment.
3. Retrieval:
   mapper (RecordKey, RecordValue)
      For each (QueryId, QueryOntology) in Queries
         sim = similarity(QueryOntology, RecordValue)
         If (sim > 0) Then
            output (QueryId, (RecordKey, RecordValue, sim))
         End If
      End For
   End Function
   reducer (QueryId, Triples(RecordKey, RecordValue, sim))
      avg = the average value for all similarity values of QueryId
      For each (RecordKey, RecordValue, sim) in Triples
         If (sim > avg) Then
            insert (QueryId, (RecordKey, RecordValue)) into ReturnedList
         End If
      End For
   End Function
4. Output:
   For each (QueryId, (RecordKey, RecordValue)) in ReturnedList
      output (QueryId, (RecordKey, RecordValue))
   End For

In this algorithm, the user query is split into <QueryId, QueryOntology> pairs. In the mapper process, the QueryOntology of every pair is compared with the RecordValue, and all matching records are cached. In the reducer process, only the records whose similarity exceeds the average are selected into the returned list. In addition, the returned list is sorted by similarity using the insertion sort algorithm, which guarantees that records with greater similarity appear earlier in the result list.
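The mapper and reducer of Algorithm 2 can be imitated in a single process, as in the minimal sketch below. It is not a Hadoop job: a dictionary stands in for the shuffle step, Python's built-in sort stands in for the insertion sort, and the similarity function is an assumed reading of formula (2) based on shared fields. The record keys and ontologies are hypothetical.

```python
from collections import defaultdict


def similarity(query_fields, record_fields):
    # Assumed reading of formula (2): shared fields over the product of field counts.
    q, r = set(query_fields), set(record_fields)
    if not q or not r:
        return 0.0
    return len(q & r) / (len(q) * len(r))


def mapper(record_key, record_value, queries):
    """Emit (query_id, (record_key, record_value, sim)) for every matching query."""
    for query_id, query_ontology in queries.items():
        sim = similarity(query_ontology, record_value)
        if sim > 0:
            yield query_id, (record_key, record_value, sim)


def reducer(query_id, pairs):
    """Keep records whose similarity exceeds the per-query average, best first."""
    avg = sum(sim for _, _, sim in pairs) / len(pairs)
    kept = [p for p in pairs if p[2] > avg]
    kept.sort(key=lambda p: p[2], reverse=True)  # built-in sort, not insertion sort
    return [(query_id, (key, value)) for key, value, _ in kept]


def run(records, queries):
    grouped = defaultdict(list)  # simulates grouping of mapper output by QueryId
    for key, value in records.items():
        for query_id, triple in mapper(key, value, queries):
            grouped[query_id].append(triple)
    returned_list = []
    for query_id, pairs in grouped.items():
        returned_list.extend(reducer(query_id, pairs))
    return returned_list


# Hypothetical records: location -> set of semantic fields.
records = {
    "hdfs://img/001.jpg": {"cathedral", "paris", "gothic"},
    "hdfs://vid/007.mp4": {"cathedral", "concert"},
    "hdfs://txt/042.txt": {"football", "league"},
}
queries = {"q1": {"cathedral", "paris"}}
returned_list = run(records, queries)
```

In this toy run, the text record shares no field with the query and is dropped by the mapper, and the video record falls below the average similarity and is dropped by the reducer, so only the image record is returned.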

3.6 Semantic Field Refinement and User Feedback

For any $M_i \in C_M$, the semantic matrix $S_{mi}$ stems from different users' understanding or from automatic learning. Hence, $|S_{mi}|$ will increase continuously while SHMR is in use. Wrong or rarely used semantic fields inevitably exist in $S_{mi}$, and they waste retrieval resources and storage space. To solve this problem, a semantic field refinement scheme is designed to retain the more frequently used annotations and eliminate the rarely used ones. The scheme is detailed as follows:

Algorithm 3. Semantic Field Refinement Scheme

 1  1. Input: (1) Load semantic matrix S.
 2            (2) Define a threshold value in the range (0, 1).
 3  2. Refinement: Set step=1
 4     For i=1 to N
 5       Load $S_{mi}$ and compute
 6         $t_{mi} = \frac{1}{n}\sum_{j=1}^{n} w_j$
 7       For j=1 to n
 8         If $w_j < t_{mi}$ Then
 9           remove the jth row from $S_{mi}$
10         End If
11         Rebuild $S_{mi}$
12       End For
13     End For
14  3. Output: New semantic matrix S.

Lines 5 to 10 show that Algorithm 3 eliminates the fields whose weights are less than the average value $t_{mi}$, which makes the semantic information increasingly accurate. After the semantic field refinement, SHMR updates the ontology information in HBase according to the location information of the multimedia documents.
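The refinement loop can be sketched as follows, assuming that each semantic matrix is held as a mapping from field name to weight and that $t_{mi}$ is the plain mean of the weights of one document. The role of the threshold value from the input step is not modelled here, and the function name and sample weights are hypothetical.

```python
def refine(semantic_matrix):
    """Drop, per document, every semantic field whose weight is below the mean weight."""
    refined = {}
    for doc, fields in semantic_matrix.items():
        if not fields:
            refined[doc] = {}
            continue
        t_mi = sum(fields.values()) / len(fields)  # mean weight of this document's fields
        # Algorithm 3 removes rows with w_j < t_mi, so rows with w_j >= t_mi are kept.
        refined[doc] = {f: w for f, w in fields.items() if w >= t_mi}
    return refined


# Hypothetical weights: the mean is (9 + 6 + 1 + 0) / 4 = 4.
S = {"img_001": {"cathedral": 9.0, "paris": 6.0, "blurry": 1.0, "cat": 0.0}}
S_new = refine(S)
```

Building a new mapping instead of deleting rows during iteration avoids rebuilding the matrix inside the inner loop.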

The computational complexity of Algorithm 3 can be measured as a function of the semantic matrix S. The semantic field refinement scheme checks the semantic fields of every multimedia document. For every document, computing $t_{mi}$ and eliminating the fields with lower weights take $O(n)$ time. Therefore, the total running time of Algorithm 3 is $O(n|S|)$. This cost is high and requires considerable computing resources, so the algorithm is executed only at long intervals, such as every 24 hours.

SHMR supports user feedback: for a particular returned document, a social user can add semantic fields to enrich the semantic information. The initial weight of these fields is set to $t_{mi}$. During the retrieval process, the semantic fields become increasingly abundant and accurate, and useless fields are gradually removed. Therefore, SHMR is a dynamic architecture suited to long-term application.
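A small sketch of the feedback step is given below, assuming again that the fields of one document are held as a mapping from field name to weight and that $t_{mi}$ is the mean weight. The function name and the sample fields are hypothetical.

```python
def add_feedback_field(fields, new_field):
    """Insert a user-supplied field with the current mean weight as its initial weight."""
    updated = dict(fields)
    if new_field not in updated:
        t_mi = sum(updated.values()) / len(updated) if updated else 0.0
        updated[new_field] = t_mi
    return updated


fields = {"cathedral": 9.0, "paris": 6.0}           # mean weight is 7.5
fields_after = add_feedback_field(fields, "notre-dame")
mean_after = sum(fields_after.values()) / len(fields_after)
```

Under this assumption, inserting a field at the mean weight leaves the mean unchanged, so a newly added field is not removed by the next refinement pass unless other weights change in the meantime.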

4 Experimental Evaluations

4.1 Dataset and Experiment Tools

Many general datasets have been proposed for experimental evaluation (Khan et al., 2009; Lu et al., 2012; Yang et al., 2012; Costa et al., 2014). However, some of these datasets support experiments on particular multimedia types only (e.g. image and text files). Heterogeneous multimedia retrieval requires a wide variety of files such as images, videos, audios and text documents, so these datasets are not appropriate for our experiments. We therefore construct a multimedia database containing various multimedia types. This database stores 50,000 multimedia documents: 20,000 images, 10,000 videos, 10,000 audios and 10,000 text documents. The documents are gathered from Flickr, Wikipedia and Youtube webpages. We use the same categorization approach as Costa et al. (2014) and divide all the documents into 10 categories corresponding to the 10 most populated categories of Wikipedia featured articles: Art & architecture, Biology, Geography & places, History, Literature & theatre, Media, Music, Royalty & nobility, Sport & recreation, and Warfare (Costa et al., 2014). In every category, five sub-categories are defined. Hence, every recognizable category contains about 1,000 multimedia documents.

The semantic information is collected in two ways. On the one hand, users can provide text tags for the shared multimedia files in Flickr, Wikipedia and Youtube webpages; we directly analyze the page structure and crawl the tags as the initial social annotations for simulation. On the other hand, the LDA model (Blei et al., 2003) and the STMT tool (Stanford Topic Modeling Toolbox, 2014) are employed to analyze the text descriptions and extract the topic words. For each crawled webpage, the multimedia document, file location and semantic information are used to establish the dataset according to the approach proposed in Section 3. For simplicity, the dimensional range of multimedia documents is restricted. Table 1 lists the initial annotation quantity and dimensional range gathered from the two approaches.

Table 1. Initial Annotation Quantity and Dimensional Range

Multimedia Type       Image        Video       Audio      Text
Social Annotating     63,587       35,478      22,174     11,258
Automatic Learning    98,869       47,586      36,352     24,693
Dimensional Range     300KB-1MB    3MB-10MB    1MB-4MB    10KB-30KB

The SHMR architecture is implemented on 10 computers, which simulate a parallel and distributed system. Fig. 5 shows the running architecture in the experiments.

[Figure: a user uploads documents to the web server (Tomcat 6.0); node 01 acts as master and slave, nodes 02 to 10 act as slaves, and all nodes run Ubuntu Linux, Hadoop 2.0 and OpenSSH.]

Fig. 5. Running Architecture of Experiments

In this architecture, every node is a low-end PC (2.0 GHz CPU, 2 GB RAM) running Ubuntu Linux and Hadoop 2.0 together with the supporting tools (e.g. Java SDK 6.0, OpenSSH, etc.). The nodes are numbered from 01 to 10. Node 01 is taken as the master node, and all ten nodes, including node 01, are organized as slave nodes. Therefore, in total, this parallel system has 10 machines, 10 processors, 20 GB of memory, 10 disks and 10 slave data nodes. All the experiments are conducted in this simulated environment.

In this paper, some additional software tools are developed to verify the effectiveness of SHMR. These tools include: (1) an annotation interface, which allows users to annotate the multimedia documents (Guo et al., 2014); (2) a retrieval interface, which is a convenient operating interface similar to those of traditional commercial search engines. Users can upload multimedia documents in the interface and submit the information to the server. This interface is developed using HTML5 and can run on typical terminals; (3) a search engine, which is deployed in the web server (Tomcat 6.0). In the experiment, the threshold value of Algorithm 3 is set to 0.8 and the background process is executed every 24 hours.

4.2 Performance Evaluation Model

In this section, a performance evaluation model is designed to measure the performance of SHMR. The model is based on three criteria: precision rate, time cost and storage cost.

(1) Precision rate. Precision rate is one of the most frequently used measurements for evaluating retrieval performance. For better comparison, we slightly modify the traditional definition and compute the precision rate over the top returned list $R_t$. Let $R_l$ denote the set of all relevant multimedia documents. The precision rate is the proportion of relevant documents among those retrieved in $R_t$, so the precision rate $p$ is defined as follows:

$$p = \frac{|R_t \cap R_l|}{|R_t|} \tag{3}$$

(2) Time cost. Time cost includes two factors. The first factor is the time cost of data processing. In SHMR, several background processes are time consuming. The background process time is defined as follows:

$$t_b = t_{pre} + t_{ref} \tag{4}$$

where $t_{pre}$ is the preprocessing time (converting the semantic information and multimedia location to the map structure) and satisfies:

$$t_{pre} = \sum_{i=1}^{N} t_{pre}^{i} \tag{5}$$

and $t_{ref}$ represents the semantic field refinement time (eliminating the redundant or erroneous semantic information and adding the new semantic information from the feedback).

The second factor is the retrieval time. Define $t_r$ as the time cost for retrieval. $t_r$ includes the extraction time (extracting semantic information from HBase) and the matching time (matching the semantic similarity between the sample document and the stored files).

(3) Storage cost. Because HBase stores the map information, the storage cost has to be taken into consideration. The increase rate for storage $p_s$ is defined as follows:

$$p_s = s_{ont} / s_{org} \tag{6}$$

where $s_{ont}$ and $s_{org}$ are the total sizes of the ontology files and the multimedia documents, respectively:

$$s_{ont} = \sum_{i=1}^{N} s_{ont}^{i}, \qquad s_{org} = \sum_{i=1}^{N} s_{org}^{i} \tag{7}$$
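The three criteria of this section can be sketched as small helper functions, as below. The sketch assumes that formula (3) is the share of the top returned list that is relevant, that formula (4) adds preprocessing and refinement time, and that formulas (5) and (7) are sums over per-document values. All function names and sample values are hypothetical.

```python
def precision_rate(returned_top, relevant):
    """Formula (3): share of the top returned list R_t that is relevant."""
    r_t = set(returned_top)
    return len(r_t & set(relevant)) / len(r_t) if r_t else 0.0


def background_time(pre_times, t_ref):
    """Formulas (4) and (5): summed preprocessing time plus refinement time."""
    return sum(pre_times) + t_ref


def storage_increase_rate(ont_sizes, org_sizes):
    """Formulas (6) and (7): total ontology size over total original size."""
    return sum(ont_sizes) / sum(org_sizes)


# Hypothetical inputs: two of the four returned documents are relevant.
p = precision_rate(["d3", "d7", "d1", "d9"], {"d1", "d3", "d4"})
t_b = background_time([0.5, 1.5, 1.0], t_ref=2.0)
p_s = storage_increase_rate([2, 3], [100, 150])
```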

4.3 Precision Rate Evaluation

In the first experiment, we verify the effectiveness of our algorithm. We upload the sample multimedia document shown in Fig. 6(a) and search for multimedia documents similar to it. For simulation, the sample file has been annotated by some other users. Fig. 6(b) illustrates part of the returned documents. This retrieval costs 4218 ms and returns 625 images, 136 videos, 17 audios and 295 text documents. The time cost includes extracting semantic information from the documents and matching the semantic fields in HBase.

[Figure: (a) sample document; (b) some returned documents from SHMR.]

Fig. 6. An Example of Multimedia Retrieval

In order to demonstrate the heterogeneous retrieval performance, in the second experiment we record the precision rates obtained when one sample file is submitted to search the four document types (e.g. using an image to search images, videos, audios and text documents). For every document type, we perform 10 different retrievals using 10 sample documents randomly chosen from the multimedia dataset and compute the average precision rates. The average precision rates are illustrated in Fig. 7.

[Figure: bar chart of precision rate (%), axis from 66 to 84, grouped by image, video, audio and text, with legend entries image, video, audio and text.]

Fig. 7. Average Precision Rates of Heterogeneous Retrieval

Fig. 7 indicates that even in the retrieval process between different multimedia types, the

precision rates are not reduced. This is because SHMR completely abandons the physical

feature extraction, and executes the retrieval process based only on semantic fields.

The third experiment compares the precision of our algorithm with that of three typical heterogeneous multimedia retrieval approaches: IBCR (Lu et al., 2012), LRGA (Yang et al., 2012) and SCM (Costa et al., 2014). Images, videos, audios and text documents are used as sample files to execute retrieval. For every sample type, we perform 10 different retrievals and compute the average precision rates. Since not all approaches support every document type, we calculate the precision rate only for the supported ones. The average precision rates are shown in Fig. 8.

[Figure: precision rate (%), axis from 50 to 100, versus $|R_t|$ from 100 to 500, for IBCR, LRGA, SCM and SHMR.]

Fig. 8. Precision Rates Comparison

It can be seen from Fig. 8 that SHMR achieves good retrieval precision rates. However, this experiment does not show an obvious advantage over the current approaches. For instance, the precision rate of SHMR hardly exceeds that of SCM.

In order to demonstrate the effectiveness of user feedback, in the fourth experiment we record the precision rates obtained after asking social users to give feedback annotations to the returned documents. We define the feedback quantity as the number of feedback annotations. For every feedback quantity (0, 10, 20, 30, 40, 50, 60, 70, 80, 90 and 100), we perform 10 different retrievals and compute the average precision rates. The precision rates for each feedback quantity are illustrated in Fig. 9.

[Figure: precision rate (%), axis from 50 to 100, versus feedback quantity from 0 to 100, for IBCR, LRGA, SCM and SHMR.]

Fig. 9. Precision Rates After User Feedback

Fig. 9 indicates that the precision increases after user feedback. As the feedback quantity increases, the gap between SHMR and the other approaches grows considerably. The comparison shows that SHMR outperforms some other approaches when user feedback is available.

4.4 Time Cost Evaluation

In order to carry out the retrieval process, SHMR has to perform several background processes whose time cost is $t_b$, which includes $t_{pre}$ and $t_{ref}$. Table 2 shows the time cost of the background processes.

Table 2. Time Cost of Background Processes (s)

Multimedia Type    $t_{pre}$    $t_{ref}$    $t_b$
Image              102          34           136
Video              57           21           78
Audio              56           19           75
Text               83           34           117

It can be seen from Table 2 that $t_{pre}$ and $t_{ref}$ cost many seconds ($t_b$ is about 136 seconds for the image type, 78 seconds for the video type, 75 seconds for the audio type and 117 seconds for the text type). However, the background processes are not always executed. Preprocessing is executed once for initialization, and semantic field refinement is executed every 24 hours in a background thread.

Next, we measure the retrieval time $t_r$. We record the time cost of 16 retrieval processes. For every document type (image, video, audio and text), we perform 4 different retrievals (the samples are numbered from 01 to 04) and record $t_r$ for each. The detailed time cost of the 16 retrievals is listed in Table 3.

Table 3. Time Cost of 16 Retrievals (ms)

Sample Type    01      02      03      04
Image          3928    4350    3814    3020
Video          4215    4624    5742    5235
Audio          4521    4012    3871    3214
Text           3785    3541    3020    3147

Table 3 shows that every retrieval completes in roughly 3 to 6 seconds. The semantic information extraction costs only a very short period of time, because we only need to extract the semantic segment directly from the sample document.

To compare the performance as the data scale increases, we perform the following experiment to illustrate the time cost comparison between our algorithm and some other approaches. In this experiment, we use datasets of different scales to execute the retrieval. The document quantities are selected as 5,000, 10,000, 15,000, 20,000, 25,000, 30,000, 35,000, 40,000, 45,000 and 50,000. For every approach, we perform 10 different retrievals and compute the average time cost. The average time costs are shown in Fig. 10.

[Figure: time cost (ms), axis from 2000 to 8000, versus document quantity (x100), axis from 100 to 500, for IBCR, LRGA, SCM and SHMR.]

Fig. 10. Time Cost Comparison When Data Scale Increases

Fig. 10 indicates that although SHMR has no obvious advantage at small data scales, its time cost is significantly lower than that of the other approaches as the database grows. Therefore, SHMR is well suited to heterogeneous multimedia retrieval in distributed big data environments.

4.5 Storage Cost Evaluation

In this section, storage cost is taken into consideration because HBase has to store the semantic files represented as ontologies. Table 4 shows the storage space cost in our architecture.

Table 4. Storage Space Cost (MB)

Multimedia Type    $s_{ont}$    $s_{org}$    $p_s$ (%)
Image              319          15023        2.12
Video              182          73007        0.25
Audio              163          23575        0.69
Text               43.2         215          20.09

We can see from Table 4 that the semantic information files occupy 20.09% of the original file size for the text type. This is because the semantic information in text files is abundant. However, the semantic files add very little storage for the image, video and audio types ($p_s$ is about 2.12% for image, 0.25% for video and 0.69% for audio).
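The reported increase rates can be reproduced from the sizes in Table 4 with the short check below, which applies formula (6) as a percentage to each row. The overall rate across all four types is a derived figure and is not reported in the paper. The variable names are hypothetical.

```python
# (s_ont, s_org) in MB, copied from Table 4.
table4 = {
    "Image": (319, 15023),
    "Video": (182, 73007),
    "Audio": (163, 23575),
    "Text": (43.2, 215),
}

# Formula (6) expressed as a percentage and rounded to two decimals.
rates = {name: round(100 * ont / org, 2) for name, (ont, org) in table4.items()}

# Derived figure: ontology storage relative to the whole collection.
total_ont = sum(ont for ont, _ in table4.values())
total_org = sum(org for _, org in table4.values())
overall = round(100 * total_ont / total_org, 2)
```

The per-type values agree with the last column of Table 4. Because videos dominate the collection size, the overall rate stays below 1% even though the text type alone reaches about 20%.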

5 Conclusions

In this paper, a novel architecture named SHMR, which supports semantic multimedia retrieval in heterogeneous big data environments, has been proposed. We described semantic field extraction, data storage, semantic-based multimedia retrieval and a performance evaluation model. The architecture consists of four independent steps. The algorithms proposed in this paper solve two critical problems in semantic-based heterogeneous multimedia retrieval. First, noise in the semantic information does not reflect users' intent, so it must be eliminated to guarantee better retrieval precision. Second, given multimedia documents with semantic information, the information must be converted to a map structure and retrieved from the database.

The framework is economical. On a Hadoop-based platform, users only need to purchase some cheap computers to perform the data storage and retrieval process, which reduces the hardware investment. Open-source tools such as Ubuntu Linux, Java SDK and Hadoop can be freely downloaded from the corresponding websites, which saves software investment. In addition, Apache Hadoop provides simplified programming models for reliable, scalable, distributed computing. It allows distributed processing of large data sets across clusters of computers using simple programming models (Leverich and Kozyrakis, 2010), which reduces the learning cost.

We conducted several experiments on the proposed framework. The comparisons demonstrated that the proposed framework performs well, especially as the data scale increases, where its time cost is lower than that of the compared approaches. After user feedback, its precision exceeds that of the existing approaches. In addition, storage and I/O cost in the architecture can be significantly reduced by using the proposed scheme.

However, the experimental dataset in this paper is acquired from specific websites such as Flickr, Wikipedia and Youtube, and the semantic provision by social users is still a simulation. In future work, we plan to explore several improvements of SHMR, including performing the experiments in a real Internet environment and increasing the retrieval speed.

Acknowledgments

We thank Professor Li Kuang for help with the research on semantic-based heterogeneous multimedia big data retrieval. This work is supported by Hunan Science and Technology Plan (2012RS4054), Natural Science Foundation of China (61202341), China Scholarship (201308430049) and the Major Science & Technology Research Program for Strategic Emerging Industry of Hunan (2012GK4054). The authors declare that they have no conflict of interest.

References

Apache Hadoop, 2013. Available at http://www.uefi.org/home/ (accessed on 1 November

2013).

Apache Hbase, 2013. Available at http://en.wikipedia.org/wiki/HBase (accessed on 1

November 2013).

Blei, D.M., Ng, A.Y., Jordan, M.I., 2003. Latent Dirichlet allocation. Journal of Machine

Learning Research 3, 993-1022.

Chang, F., Dean, J., Ghemawat S., Hsieh, W. C., Wallach, D. A., Burrows, M., Chandra, T.,

Fikes, A., Gruber, R. E., 2008. Bigtable: A distributed storage system for structured data.

ACM Transactions on Computer Systems 26, 1-26.

Chen, J., Yang, Y., 2011. Temporal dependency-based checkpoint selection for dynamic

verification of temporal constraints in scientific workflow systems. ACM Transactions on

Software Engineering and Methodology 20, 9.

Chen, R., Cao, Y.F., Sun, H., 2011. Active sample-selecting and manifold learning-based

relevance feedback method for synthetic aperture radar image retrieval. IET Radar, Sonar

& Navigation 5, 118-127.

Costa Pereira, J., Coviello, E., Doyle, G., Rasiwasia, N., Lanckriet, G., Levy, R., Vasconcelos,

N., 2014. On the role of correlation and abstraction in cross-modal multimedia retrieval.

IEEE Transactions on Pattern Analysis and Machine Intelligence 36, 521-535.

Datta, R., Joshi, D., Li, J., Wang, J. Z., 2008. Image retrieval: ideas, influences, and trends of

the new age. ACM Computing Surveys 40, 1-60.

Dean, J., Ghemawat, S., 2008. MapReduce: simplified data processing on large clusters.

Communications of the ACM 51, 107-113.

Ghemawat, S., Gobioff, H., Leung, S.T., 2003. The Google file system. In: Proceedings of the

19th ACM Symposium on Operating Systems Principles, New York, USA, pp. 29-43.

Gijsenij, A., Gevers, T., 2010. Color constancy using natural image statistics and scene

semantics. IEEE Transactions on Pattern Analysis and Machine Intelligence 33, 687-698.

Guo, K., Ma, J., Duan, G., 2014. DHSR: a novel semantic retrieval approach for ubiquitous

multimedia. Wireless Personal Communications 76, 779-793.

Guo, K., Zhang, S., 2013. A semantic medical multimedia retrieval approach using ontology

information hiding. Computational and mathematical methods in medicine 2013, 407917.

He, X., 2010. Laplacian regularized D-Optimal design for active learning and its application

to image retrieval. IEEE Transactions on Image Processing 19, 254-263.

Hofmann, T., 2001. Unsupervised learning by probabilistic latent semantic analysis. Machine

Learning 42, 177-196.

Khan, I., Saffari, A., Bischof, H., 2009. Tvgraz: Multi-modal learning of object categories by

combining textual and visual features. In: Proceedings of 33rd Workshop Austrian

Association for Pattern Recognition, Austria, pp. 213-224.

Leverich, J., Kozyrakis, C., 2010. On the energy (in) efficiency of hadoop clusters. ACM

SIGOPS Operating Systems Review 44, 61-65.

Liu, C., Chen, J., Yang, L., Zhang, X., Yang, C., Ranjan, R., Ramamohanarao, K., 2013.

Authorized public auditing of dynamic big data storage on cloud with efficient verifiable

fine-grained updates. IEEE Transactions on Parallel and Distributed Systems 99, online.

Liu, K., Wei, S., Zhao, Y., Zhu, Z., Wei, Y., Xu, C., 2014. Accumulated reconstruction error

vector (AREV): a semantic representation for cross-media retrieval. Multimedia Tools

and Applications, online.

Lu, B., Wang, G. R., Yuan, Y., 2012. A novel approach towards large scale cross-media

retrieval. Journal of Computer Science and Technology 27, 1140-1149.

Maedche, A., Staab, S., 2001. Ontology learning for the semantic web. IEEE Intelligent Systems 16, 72-79.

Rasiwasia, N., Costa Pereira, J., Coviello, E., Doyle, G., Lanckriet, G. R., Levy, R.,

Vasconcelos, N., 2010. A new approach to cross-modal multimedia retrieval.

In: Proceedings of the ACM international conference on Multimedia, Firenze, Italy, pp.

251-260.

Smeulders, A. W., Worring, M., Santini, S., Gupta, A., Jain, R., 2000. Content-based image retrieval at the end of the early years. IEEE Transactions on Pattern Analysis and Machine Intelligence 22, 1349-1380.

Smith, J. R., 2011. History made everyday. IEEE MultiMedia 18, 2-3.

Smith, J. R., 2012. Minding the gap. IEEE MultiMedia 19, 2-3.

Stanford Topic Modeling Toolbox, 2014. Available at http://nlp.stanford.edu/software/tmt/tmt-0.4/

(accessed on 1 April 2014).

Wang, J., Liu, Z., Zhang, S., Zhang, X., 2014. Defending collaborative false data injection

attacks in wireless sensor networks. Information Sciences 254, 39-53.

Wang, X., Lv, T., Wang, S., Wang, Z., 2008. An ontology and swrl based 3d model retrieval

system. Lecture Notes in Computer Science 4993, 335-344.

Wang, X.Y., Chen, J.W., Yang, H.Y., 2011. A new integrated SVM classifiers for relevance

feedback content-based image retrieval using EM parameter estimation. Applied Soft

Computing 11, 2787-2804.

Wong, R.C.F., Leung, C.H.C., 2008. Automatic semantic annotation of real-world Web

images. IEEE Transactions on Pattern Analysis and Machine Intelligence 30, 1933-1944.

Wu, L., Hoi, S. C., Yu, N., 2010. Semantics-preserving bag-of-words models and

applications. IEEE Transactions on Image Processing 19, 1908-1920.

Wang, G., Jiang, W., Wu, J., Xiong, Z., 2014. Fine-grained feature-based social influence

evaluation in online social networks. IEEE Transactions on Parallel and Distributed

Systems 25, 2286-2296.

Yang, D., Dong, M., Miao, R., 2008. Development of a product configuration system with an

ontology-based approach. Computer-Aided Design 40, 863-878.

Yang, Y., Nie, F., Xu, D., Luo, J., Zhuang, Y., Pan, Y., 2012. A multimedia retrieval

architecture based on semi-supervised ranking and relevance feedback. IEEE

Transactions on Pattern Analysis and Machine Intelligence 34, 723-742.

Zhai, X., Peng, Y., Xiao, J., 2013. Cross-media retrieval by intra-media and inter-media

correlation mining. Multimedia Systems 19, 395-406.

Zhang, L., Wang, L., Lin, W., 2012. Generalized Biased Discriminant Analysis for content-

based image retrieval. IEEE Transactions on Systems, Man, and Cybernetics, Part B:

Cybernetics 42, 282-290.

Zhao, R., Grosky, W. I., 2002. Narrowing the semantic gap-improved text-based Web

document retrieval using visual features. IEEE Transactions on Multimedia 4, 189-200.

Zhou, G. T., Ting, K. M., Liu, F. T., Yin, Y., 2012. Relevance feature mapping for content-

based multimedia information retrieval. Pattern Recognition 45, 1707-1720.