13
http://www.iaeme.com/IJMET/index.asp 775 [email protected] International Journal of Mechanical Engineering and Technology (IJMET) Volume 8, Issue 12, December 2017, pp. 775787, Article ID: IJMET_08_12_084 Available online at http://www.iaeme.com/IJMET/issues.asp?JType=IJMET&VType=8&IType=12 ISSN Print: 0976-6340 and ISSN Online: 0976-6359 © IAEME Publication Scopus Indexed A FRAMEWORK USING BIG DATA ANALYSIS ON HUMAN ACTIVITY PATTERNS FOR HEALTH PREDICTION Dr. Anjali Mathur Associate Professor, Koneru Lakshmaiah Education Foundation, Andhra Pradesh, India V Vaishnavi Koneru Lakshmaiah Education Foundation, Andhra Pradesh, India K Jigeesha Koneru Lakshmaiah Education Foundation, Andhra Pradesh, India K S V A G Sudheer Koneru Lakshmaiah Education Foundation, Andhra Pradesh, India ABSTRACT In this research work, big data collected from smart devices have been used to retrieve the human activity patterns to improve smart home resident’s health status, as there is a lot of financial investment in the digital transformation as an effort to provide healthier ecosystems for people. In these transformation millions of smart devices are equipped around, which gives a massive amount of refined and sorted data that can be used to analyze the health patterns. In this research, the work mainly focuses on analyzing the big data extracted from human activities for frequent pattern mining, cluster analysis, prediction to measure and analyze the energy consumption changes accordingly by occupants. This paper represents the need of analyzing energy consumption patterns based on the appliance level, which is completely related to human activities. Key words: Smart Devices, Human Activity Patterns, Smart home, Digital Transformation, Cluster Analysis, Bayesian network. Cite this Article: Dr. Anjali Mathur, V Vaishnavi, K Jigeesha, K S V A G Sudheer, A Framework Using Big Data Analysis on Human Activity Patterns for Health Prediction, International Journal of Mechanical Engineering and Technology 8(12), 2017, pp. 775787. http://www.iaeme.com/IJMET/issues.asp?JType=IJMET&VType=8&IType=12

A FRAMEWORK USING BIG DATA ANALYSIS ON HUMAN …€¦ · Cite this Article: Dr. Anjali Mathur, V Vaishnavi, K Jigeesha, K S V A G Sudheer, A Framework Using Big Data Analysis on Human

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: A FRAMEWORK USING BIG DATA ANALYSIS ON HUMAN …€¦ · Cite this Article: Dr. Anjali Mathur, V Vaishnavi, K Jigeesha, K S V A G Sudheer, A Framework Using Big Data Analysis on Human

http://www.iaeme.com/IJMET/index.asp 775 [email protected]

International Journal of Mechanical Engineering and Technology (IJMET)

Volume 8, Issue 12, December 2017, pp. 775–787, Article ID: IJMET_08_12_084

Available online at http://www.iaeme.com/IJMET/issues.asp?JType=IJMET&VType=8&IType=12

ISSN Print: 0976-6340 and ISSN Online: 0976-6359

© IAEME Publication Scopus Indexed

A FRAMEWORK USING BIG DATA ANALYSIS

ON HUMAN ACTIVITY PATTERNS FOR

HEALTH PREDICTION

Dr. Anjali Mathur

Associate Professor, Koneru Lakshmaiah Education Foundation,

Andhra Pradesh, India

V Vaishnavi

Koneru Lakshmaiah Education Foundation, Andhra Pradesh, India

K Jigeesha

Koneru Lakshmaiah Education Foundation, Andhra Pradesh, India

K S V A G Sudheer

Koneru Lakshmaiah Education Foundation, Andhra Pradesh, India

ABSTRACT

In this research work, big data collected from smart devices have been used to

retrieve the human activity patterns to improve smart home resident’s health status, as

there is a lot of financial investment in the digital transformation as an effort to

provide healthier ecosystems for people. In these transformation millions of smart

devices are equipped around, which gives a massive amount of refined and sorted

data that can be used to analyze the health patterns. In this research, the work mainly

focuses on analyzing the big data extracted from human activities for frequent pattern

mining, cluster analysis, prediction to measure and analyze the energy consumption

changes accordingly by occupants. This paper represents the need of analyzing energy

consumption patterns based on the appliance level, which is completely related to

human activities.

Key words: Smart Devices, Human Activity Patterns, Smart home, Digital

Transformation, Cluster Analysis, Bayesian network.

Cite this Article: Dr. Anjali Mathur, V Vaishnavi, K Jigeesha, K S V A G Sudheer,

A Framework Using Big Data Analysis on Human Activity Patterns for Health

Prediction, International Journal of Mechanical Engineering and Technology 8(12),

2017, pp. 775–787.

http://www.iaeme.com/IJMET/issues.asp?JType=IJMET&VType=8&IType=12

Page 2: A FRAMEWORK USING BIG DATA ANALYSIS ON HUMAN …€¦ · Cite this Article: Dr. Anjali Mathur, V Vaishnavi, K Jigeesha, K S V A G Sudheer, A Framework Using Big Data Analysis on Human

Dr. Anjali Mathur, V Vaishnavi, K Jigeesha, K S V A G Sudheer

http://www.iaeme.com/IJMET/index.asp 776 [email protected]

1. INTRODUCTION

The appeal for the health care resources is being widely overwhelmed by the digital

transformation. According to study, by the year 2050 digital transformation plays an integral

role. By advancement of these machines, a huge portion of data will be generated from smart

devices [1]. For example, auditing the changes of appliance usage can be used for indirect

determination of the person‟s well being based on the historical data. Everyday routines and

activities reflects their regular habits, on observing their regular habits people‟s difficulties in

taking care of themselves, likely not washing his clothes, not using woven and help us to

recognize any anomalous activities which might be an indication of ill health[4]. The

relationship between appliance usage and routine activities is used by health care applications

to detect potential health problems. This reduces the burden on health care systems and

provides the monitoring service that automatically identifies normal and abnormal behaviors

for independent living patients or those with self-limiting conditions [7]. In a way, the large

volumes of data generated by smart devices are analyzed to support health care services.

A method for the use of energy data, collected from smart devices installed at homes, and

get information based on the routine-activities of inhabitants. The model observes and

analyzes meter-readings of smart home equipment to identify the regular activities.

Intellectual study of that data can help to find the changes in behavior or in health of the

occupants [3]. Power consumption and the period of utilization are closely related to the

resident‟s activities performed at household. For instance, if the „„Oven‟‟ is ON, the operation

of this appliance is most likely dealing with activity „„Preparing Food‟‟. The time (e.g.

morning or evening) of this operation may also indicate the type of the meal such as breakfast

or dinner. Furthermore, people often perform more than one activity at the same time such as

„„Preparing their own Food‟‟ and „„Listening to Music‟‟ or „„Watching programs over

television‟‟, which means multiple appliances are operated together [4]. We analyze, energy

consumption patterns at the appliance level to predict their operations by detecting multiple

appliance usages and. However, it is very challenging since it is not so easy to detect usage

dependencies among various appliances when their operation occurs at the same time [6].

Furthermore, tracing out an accurate human activity patterns and their prediction is greatly

affected by the probabilistic relationships between appliances usage and their time slots that

have dynamic time intervals.

To handle the above-mentioned issues, the paper proposes frequent mining and prediction

model to measure and analyze energy usage changes observed in the household behavior. The

data from smart meters are recursively observed in the quantum/data portion of 24 hours, and

the results are preserved across subsequent mining activities.

The research work is based on human activity pattern mining model based on appliance

usage variations in homes. Frequent pattern tree for mining the entire set of a frequent pattern

by pattern fragment growth for pattern recognition is observed based on the comparison

between k-means clustering algorithm and DBSCAN clustering algorithm to identify the

appliance-to-appliance cooperative from incremental mining of energy consumption data

[12]. This is not only used to determine activity routines, but also, used for detecting sudden

changes of human activities when utilized by health care application, that is required for

attention of a health provider [13].

2. REQUIREMENT ANALYSIS AND LITERATURE SURVEY

In 2015 A.Yassine, A.A.N. Shirehjini, and S. Shirmohammadi proposed a work on smart

meters big data Game theoretic model for fair data sharing through a technique for sharing

Page 3: A FRAMEWORK USING BIG DATA ANALYSIS ON HUMAN …€¦ · Cite this Article: Dr. Anjali Mathur, V Vaishnavi, K Jigeesha, K S V A G Sudheer, A Framework Using Big Data Analysis on Human

A Framework Using Big Data Analysis on Human Activity Patterns for Health Prediction

http://www.iaeme.com/IJMET/index.asp 777 [email protected]

power consumption data in deregulated smart grids. The activities in daily living as a mean

data categorization to help the data aggregator and the consumers to identify privacy risk

values. This paper used the concept of differential privacy as an anonymity mean to minimize

the leakage of information.

In 2015 K. Jack and K. William gave their work on domestic appliance-level electricity

demand and whole-house demand from UK homes. This paper adopted a dataset possessed

from disparate homes. Smart homes contain very large number of meter readings by the users

of the smart homes equipped by smart devices. The content of the meter reading varies from

home to home based on the usage level of the equipment by the residents. This paper presents

a approach to assemble data from smart homes based on the usage of the appliances installed.

In 2016 M. S. Hossain proposed a work through paper for showing a patient's state

recognition system for healthcare using speech and facial expression. This paper explains a

model to address a overall framework on health care. It mainly deals with the concept of

identifying a patient state for providing good recognition accuracy to provide low cost

modeling. This paper mainly depends on two types of inputs considerably audio and video

which are captured in a multi-sensory environment which showed an average detection

efficiency over 98 percent.

In 2016 M. UlAlam, N. Roy, M. Petruska, and A. Zemp proposed their work on smart-

energy group difference based on behavioral anomaly detection. This paper proposes a data

analytic access that classifies energy usage abnormalities according to the behavioral

deformity of the inhabitant. Research work mainly relies on detecting everyday appliances

usage from smart meter and smart plug data tracks regular activity at days and nights, then

learning the unique time segment group of each appliance energy consumption. Main theme

in the paper is provide basic non-intrusive health monitoring technologies which can be

deployed at a huge scale without any extra sensors to be equipped in any home with multi-

inhabitants.

In 2014 J. Clement, J. Ploennigs, and K. Kabitzsch presented their work in paper for

detecting daily living activities with smart meters. This paper explains the methods that are

used to analyze smart meter data to monitor human behavior. A model Semi-Markov (SMM)

is used to track and notice mode to analyze and find exclusive structures which define habits

of household. The second approach is based on a form that allows the disclosure of ADL‟s

and focuses on temporal search of ADL‟s. these methods rely on smart meter data regarding

which home appliance was turned on.

In 2012 K Basu , V Debusschere , and S. Bacha presented their work in a paper for

Appliance Usage Prediction Using a Time Series Based Classification Approach. This paper

proposed a model which tries to validate a method using time-series based multi-label

classifier which considers correlation between different appliances. The objective of this work

is to implement a model which can forecast the appliance usage in household that helps the

system to consolidate energy production and consumption to decide which appliance will be

used at each hour. The proposed model of this paper uses an iterative learning approach and

tries all the possible information based on consumption data, time of the event.

In 2013 K. Basu, L. Hawarah, N. Arghira, H. Joumaa, and S. Ploix proposed work

through the prediction system for home appliance usage. Power supervision in home

environment and workplaces involve appliance usage prediction as the future user desires are

not manageable. The appliance usage make the prediction of appliance usage from energy

consumption data a deadly task. A neutral model for prediction at the appliance level is still

Page 4: A FRAMEWORK USING BIG DATA ANALYSIS ON HUMAN …€¦ · Cite this Article: Dr. Anjali Mathur, V Vaishnavi, K Jigeesha, K S V A G Sudheer, A Framework Using Big Data Analysis on Human

Dr. Anjali Mathur, V Vaishnavi, K Jigeesha, K S V A G Sudheer

http://www.iaeme.com/IJMET/index.asp 778 [email protected]

inadequate. This paper proposes a model to enhance algorithms with proficient knowledge

and recommends a universal model using a knowledge driven approach to identify whether a

particular appliance will start during a assumed hour or not.

In 2015 C. Chelmis, J. Kolte, and V. K. Prasanna proposed a work in big data analytics

for demand response: Clustering over space and time. This paper explains the motivational

need for alternate representations of electricity consumption data, arguing that approach based

on time-series representations are unable to mine implicit temporal patterns over a collection

of huge consumption data from a diverse clientele base. This paper shows the usage behavior

patterns identified at (i) different times-of-day, (ii) days-of-the-week, or (iii) at coarser

granularities (i.e., by semester or yearly) for a clientele and similarities are mined between

clientele‟s with phenomenally different characteristics by exactly clustering time-series data.

In 2015 K. Gajowniczek and T. Za_bkowski proposed a work in data mining techniques

for detecting household characteristics based on smart meter data. The goal of the paper is to

regulate the structure of household appliances usage patterns, hence providing more intellect

in smart metering systems by in view of the usage and the spell of their usage. Many

unsupervised machine learning techniques are used to observe usage patterns observed at

different households. The work carries the solutions suitable for smart metering systems that

might contribute to sophisticated energy consciousness; support precise usage predicting and

provide the input for demand response systems in households with periodical energy saving

suggestions for users. This paper delivered some results which show that defining household

features from smart meter data is accurate and allows extracting general trends in data.

In 2004 J. Han, J. Pei, Y. Yin, and R. Mao proposed a model in paper Mining frequent

patterns without candidate generation: A frequent-pattern tree approach. The importance of

frequent pattern mining in the field of data mining is to find associations, periodic patterns etc

among different datasets that are in a database. The previous studies acquire Apriori

algorithms like candidate set generation and test approach. As, Apriori is expensive,

particularly when there are numerous number of patterns, here proposed a method frequent-

pattern tree (FP-tree) structure, which is an extended prefix-tree structure for filling closely

related, substantial data about frequent patterns, and develop a capable FP-tree based mining

scheme. The consequences illustrate that the FP-growth method is capable and accessible for

mining mutually long and short frequent patterns, and is quicker than Apriori algorithm.

In 2011 J. Han, J. Pei, and M. Kamber presented work in Data mining: Concepts and

techniques, in Cluster Analysis: Basic Concepts and Methods. Cluster is a mixture of related

data objects inside a cluster and dissimilar data objects are outside the clusters. The procedure

of merging objects into modules of similar data objects is called “Clustering”. Scalability,

ability to deal with different data sets and their attributes, identifying the constraints within

input data, detection of cluster subjective shape and interpret-capability. The well-known

clustering techniques are grid based, partitioning based, density based and hierarchical

methods.

In 2014 D. Heckerman proposed work in Bayesian Network for Data Mining. This paper

explains Bayesian network used to learn relationships, also helps to gain better knowledge

about the problem and to predict the significances. This gives better representation for uniting

earlier knowledge and data. This paper, proposes the construction of Bayesian Field networks

using different kind of methods with the acquired knowledge and existing data.

In 2016 S. Singh, A. Yassine, and S. Shirmohammadi proposed a work in Incremental

mining of frequent power consumption patterns from smart meters big data. This paper

Page 5: A FRAMEWORK USING BIG DATA ANALYSIS ON HUMAN …€¦ · Cite this Article: Dr. Anjali Mathur, V Vaishnavi, K Jigeesha, K S V A G Sudheer, A Framework Using Big Data Analysis on Human

A Framework Using Big Data Analysis on Human Activity Patterns for Health Prediction

http://www.iaeme.com/IJMET/index.asp 779 [email protected]

explains, the energy usage performance replicates appliance associations and their usage. The

consumption of data from a smart meter is a constant process. After a certain period, the inter-

associations among the appliances may alter with time and new patterns may establish. This

also illustrates the power consumption of a home that can be estimated by the data consisting

of appliance usage tuples for 24hours in a progressive manner.

In 2011 J. Han, J. Pei, and M. Kamber, work in Data mining: Concepts and techniques, in

Cluster Analysis: Basic Concepts and Methods, 3rd ed. Bayesian network delivers a graphical

network of underlying relations from which learning can be achieved. These trained Bayesian

belief networks are used for the classification. These Classifications can be done based on

frequent patterns. These frequent patterns reflect the associations among attribute-value pairs.

In, Associative classification the classification is based on association rules generated from

frequent patterns whereas, semi-supervised classification is useful for enormous dimensions

of unsupervised data.

3. METHODOLOGY

3.1. Block Diagram of proposed work

In a implementation, the household data is collected from the equipped smart devices to

measure the activity patterns of the living beings. After collecting the meter readings, the data

is pumped for pattern mining using activity pattern algorithms.

A sequence of steps includes to carry out the work for detecting the human activity

patterns. This work helps to identify a model that analyses the human activity patterns of

smart homes residents for the health prediction. To implement the model, we collect the smart

home data and apply pattern mining algorithms and apply clustering algorithms. The obtained

result set is used to generate a trained Bayesian network, for classifying human activity

patterns to predict abnormalities in the behavior of the smart home residents.

First it starts by applying frequent pattern mining to discover appliance-to-appliance

relations, that is understanding which appliances are functioning together. Then, this model

uses cluster analysis by comparing k means and DBscan clustering algorithms for deducing

appliance-to-time associations. With the help of above two processes, the model is capable

enough to deduce the pattern of appliance usage which is going to be used as input to the

Bayesian network for activities prediction. The output produced by the system is useful for

specific health care applications depending on the anticipated use.

Figure 1 Block Diagram

Page 6: A FRAMEWORK USING BIG DATA ANALYSIS ON HUMAN …€¦ · Cite this Article: Dr. Anjali Mathur, V Vaishnavi, K Jigeesha, K S V A G Sudheer, A Framework Using Big Data Analysis on Human

Dr. Anjali Mathur, V Vaishnavi, K Jigeesha, K S V A G Sudheer

http://www.iaeme.com/IJMET/index.asp 780 [email protected]

3.2. Steps of Proposed Work

Smart homes are equipped with smart devices.

When the households use the smart devices, huge volumes of data are generated.

Smart devices usage is like observing the routine activities of the households to recognize

anomalous activities.

The smart meter generated huge volumes of data is collected.

Then the data is classified based on the similarity of the data by using some algorithms.

The data is clustered based on the clustering algorithms.

After the cluster analysis the source data is refined into supervised learning classification.

Then the FP growth mining is applied to mine the source data.

Then the obtained data is collected and set for the prediction to predict the human activity

patterns based on the appliance usage.

Bayesian network combine the frequent patterns and appliance-to-time associations to gain

knowledge about the usage of various appliances at a time.

Based on the Bayesian networks we build a activity prediction model which in terms helps us

to predict the health conditions of the households.

3.3. Data Collection

The dataset used in this study is a collection of smart meters data from five houses in the

United Kingdom (UK) in the initial stage of the cleaning process we remove noises from the

data and prepare it for mining. We developed a artificial dataset for initial evaluation of the

model, containing over 1.2million records in table I

Table I Ready-to-mine source data

3.4. Identifying Frequent Patterns in source data

As mentioned earlier, the aim is to discover human activity patterns from smart meters data.

For example, activities such as „„Watching TV, Cooking, Using Computer, Preparing Food

and Cleaning Dishes or Clothes‟‟ are usually regular routines. Our aim is to detect the patterns

of these activities so that a health care application, that monitors sudden changes in patient‟s

behavior (e.g. patients with cognitive defects), can send timely alert periodically to health

care providers. The energy detection of appliances (TV, Oven and Treadmill) is related to

human activities such as leisure/relaxation time, food preparation, and exercising. A

simplified example which describes possible relationships between appliance usage and

activities. Acquiring human activity patterns is not only observing the individual appliance

operation, but also the appliance-to-appliance associations that is the patterns of activities that

are combined such as washing clothes while exercising or watching TV. The underlying

Page 7: A FRAMEWORK USING BIG DATA ANALYSIS ON HUMAN …€¦ · Cite this Article: Dr. Anjali Mathur, V Vaishnavi, K Jigeesha, K S V A G Sudheer, A Framework Using Big Data Analysis on Human

A Framework Using Big Data Analysis on Human Activity Patterns for Health Prediction

http://www.iaeme.com/IJMET/index.asp 781 [email protected]

concept of the model is based on [10] which propose pattern growth or FP-growth approach

and Apriori Algorithm for frequent pattern mining.

3.5. Cluster Analysis

Recognizing appliance-to-time associations has a major role in health applications that keeps

a track on residents activity patterns all the time. In this section, to acknowledge about the

appliance usage time clustering analysis technique is used. Appliance-to-time associations are

underlying knowledge in the smart meter time series data which include sufficiently close

time-stamps, when relevant appliance has been recorded as active or operational. Using this

data, we can group a class or cluster of appliances that are in operation simultaneously or

overlapping. The size of the cluster that describes such associations is defined as the count of

members in the cluster as well as its relative strength. Clustering analysis is the process of

creating classes (unsupervised classification) or groups/segments (automatic segmentation) or

partitions where members must possess similarity with one another, but should be dissimilar

from the members of the other clusters. The distinct advantage of the clustering analysis is the

non-supervised nature of the process.

3.5.1. K-Means Clustering Algorithm

A partitioning method applied to survey data and consider observations of the data as objects

based on locations and distance between various input data points. Partitioning the objects

into commonly limited clusters (K) is done by it in such a manner that objects within each

cluster persist as nearby as likely to each other.

Each cluster is considered by its Centroid i.e., its centre point. The partings used in

clustering in most of the phases don‟t really signify the spatial distances. In wide-ranging, the

only resolution for this issue of conclusion global minimum is comprehensive choice of

starting points. In a dataset, the appropriate number of clusters K and a set of k initial points,

the K-Means clustering algorithm finds the predicted number of distinct clusters and their

centroids. A centroid is the position where coordinates are acquired by means of computing

each coordinates of the points, models assigned to the clusters.

Figure 2 K-Means Clustering Points

3.5.2. DBscan Clustering Algorithm

The main impression of the DBSCAN algorithm is that, for each point of a cluster, the

neighborhood of a given radius must contain a minimum number of points, that is, the density

in the neighborhood must extinguish some predefined threshold.

This procedure needs input parameters:

Page 8: A FRAMEWORK USING BIG DATA ANALYSIS ON HUMAN …€¦ · Cite this Article: Dr. Anjali Mathur, V Vaishnavi, K Jigeesha, K S V A G Sudheer, A Framework Using Big Data Analysis on Human

Dr. Anjali Mathur, V Vaishnavi, K Jigeesha, K S V A G Sudheer

http://www.iaeme.com/IJMET/index.asp 782 [email protected]

K, the proximate neighbor list size.

Eps, the radius that determine the neighborhood region of a point (Eps-neighborhood).

MinPts, the minimum number of points that must exist in the Eps-neighborhood.

The Clustering procedure is based on the category of the points in the dataset as core

points, border points and noise points and on the use of density link among points (directly

density reachable, density-reachable, density-connected [Ester1996]) to form the clusters.

Figure 3 DBscan Clustering points

3.6. Bayesian Networks for Activity Prediction

In this section, we merge the frequent patterns and appliance-to-time associations to absorb

about the use of multiple appliances and build the activity prediction model. The procedure

utilizes Bayesian network which is a directed acyclic graph, where nodes illustrate random

variables and edges indicate probabilistic dependencies. An example of Bayesian network,

allege 6 random variables, is shown in Figure2. One of the main features of a Bayesian

network is that it includes the concept of causality. For example, the link/arc between A to C

in figure indicates that node A causes node C, which means that the directed graph in a

Bayesian network is acyclic.

Figure 4 Bayesian Network

In addition to the structure, a Bayesian network model provides a closely-packed way of

representing the link probability distribution.

In other words, each node or variable is independent of its non-descendants and

accompanied by its local conditional probability distributions in the form of a node

probability table, an important benefit of the Bayesian network is the capability of modify

missing data, learn associations, and make use of prior facts and observations while avoiding

overfitting of data.

Page 9: A FRAMEWORK USING BIG DATA ANALYSIS ON HUMAN …€¦ · Cite this Article: Dr. Anjali Mathur, V Vaishnavi, K Jigeesha, K S V A G Sudheer, A Framework Using Big Data Analysis on Human

A Framework Using Big Data Analysis on Human Activity Patterns for Health Prediction

http://www.iaeme.com/IJMET/index.asp 783 [email protected]

4. RESULTS AND ANALYSIS

In this section, K-Means clustering algorithm is compared with DBscan clustering. In

addition, the usage of Bayesian networks for human activity pattern recognition also

introduced.

Table II shows the comparison between K-Means and DBscan clustering algorithms.

Parameters K-Means DBscan

Approach Partitional

Based

Density based

Characterization Centroid based Dense Region

based

Limitations When clusters

are of different

Size, Densities,

Non-globular

shapes.

When the data

contains

outliers.

Do not work

efficiently when

there are more

number of

clusters with

different

densities.

Advantage More quicker

compared to

DBscan.

No cluster size

is is demanded

to form clusters.

Table II. Comparison between K-Means and DBscan

Smart home data can be employed with either K-Means clustering algorithm or DBscan

clustering algorithm. The selection of algorithm is mostly depending on dataset. If number of

clusters is predefined and if the dataset is scalable K-Means can be applied. If no prior

information about number of clusters then DBscan is feasible.

Here we used MOA tool to plot both K-Means and DBscan algorithms for a dataset. The

results of both the clustering algorithms can be compared and based on our requirement we

can use the dataset as a input for Bayesian networks.

4.1. K-Means Clustering Plot

Figure 5 K-Means Cluster in MOA Tool

Page 10: A FRAMEWORK USING BIG DATA ANALYSIS ON HUMAN …€¦ · Cite this Article: Dr. Anjali Mathur, V Vaishnavi, K Jigeesha, K S V A G Sudheer, A Framework Using Big Data Analysis on Human

Dr. Anjali Mathur, V Vaishnavi, K Jigeesha, K S V A G Sudheer

http://www.iaeme.com/IJMET/index.asp 784 [email protected]

4.2. DBscan Clustering Plot

Figure 6 DBscan Clustering Plots

4.3. Bayesian Networks for Activity Prediction

Figure 7 Augmental association of appliances-to-time of day

Figure 8 Bayesian Network Appliance-Time association of the day

Page 11: A FRAMEWORK USING BIG DATA ANALYSIS ON HUMAN …€¦ · Cite this Article: Dr. Anjali Mathur, V Vaishnavi, K Jigeesha, K S V A G Sudheer, A Framework Using Big Data Analysis on Human

A Framework Using Big Data Analysis on Human Activity Patterns for Health Prediction

http://www.iaeme.com/IJMET/index.asp 785 [email protected]

A Bayesian network is explained by the probabilistic distribution presented in equation (1).

P(x1, x2, …,xn) = π p(xi|parents(xi))

Our probabilistic prediction model is raised based on integrating probabilities for

appliances-to-time associations in terms of days, weeks, months, seasons and appliance-to-

appliance level associations. The posterior probability for the contemplate work model is

suggested in equation (2).

P(.) = p(Hour) x p (Time of day) x p (Week) x p (Week) x p (Month) x p(Season) (2)

Figure 9 Appliance-Time association of the week

5. DISCUSSION

Most essential step in learning household activities is by mining associations of appliance

usage. Figure 8 and 9 show the appliance-to-time association visible for time of the day and

week for respective house. We can clearly observe that between 2:30 and 5:00 Television,

living area Lights are used combinedly in the house with highest frequency during weekends.

Based on these truths we can observe the varying effect of days and weeks on the usage of

appliances.

Figure 10 Probability Distribution of appliances for time of the day

Page 12: A FRAMEWORK USING BIG DATA ANALYSIS ON HUMAN …€¦ · Cite this Article: Dr. Anjali Mathur, V Vaishnavi, K Jigeesha, K S V A G Sudheer, A Framework Using Big Data Analysis on Human

Dr. Anjali Mathur, V Vaishnavi, K Jigeesha, K S V A G Sudheer

http://www.iaeme.com/IJMET/index.asp 786 [email protected]

Based on the probability distribution of appliances for time of the day in smart homes and

probability distribution of appliances for time of week. Figure 10 and 11 show the probability

distribution of appliance-to-time associations.

Figure 11 Probability Distribution of appliances for a week

The result shown in the figures represent the associations for 3 homes and it depends on

processing 25% of the dataset. It is easily observed from appliance interrelation, behavior of

inhabitant is observed, like to relax while preparing food.

6. CONCLUSION AND FUTURE WORK

In this paper, we presented a model for observing the human activity patterns from smart

meter data and presented a model by comparing the with two most structured clustering

techniques using smart meter data. Most of the human activities can be observed from

appliance-to-appliance and appliance-to-time associations. We presented a model by frequent

pattern mining based on the clustering of the dataset and prediction model is presented based

on the Bayesian network. In our current work, based on experiments, we found that a

complete day period was optimal for data mining.

For future work, we are planning to reduce the model and introduce distributed learning of

data mining from several houses in a instantaneous manner. This will help the households to

be conscious and take actions by alerting the conditions of the households to care providers.

Furtherly, we can build a ontology model to automate the potential activities of the appliance

which will increase the precision of perceiving human activities.

REFERENCES

[1] Abdulsalam Yassine, Shailendra Singh, and Atif Alamri,” Mining Human Activity

Patterns from smart home big data for health care applications”, IEEE Access, Vol. 5,

2017.

[2] A.A.N. Shirehjini, S. Shirmohammadi and A. Yassine, “Smart meters big data: Game

theoretic model for fair data sharing in deregulated smart grids”, IEEE Access, vol. 3,

2015.

[3] K. William and K. Jack, “The UK-DALE dataset, domestic appliance-level electricity

demand and whole-house demand from UK homes,'' Sci. Data, Sep. 2015.

[4] M. S. Hossain, “A patient's state recognition system for health care using speech and

facial expression,” J. Med. Syst., vol. 40, no. 12, Dec. 2016.

Page 13: A FRAMEWORK USING BIG DATA ANALYSIS ON HUMAN …€¦ · Cite this Article: Dr. Anjali Mathur, V Vaishnavi, K Jigeesha, K S V A G Sudheer, A Framework Using Big Data Analysis on Human

A Framework Using Big Data Analysis on Human Activity Patterns for Health Prediction

http://www.iaeme.com/IJMET/index.asp 787 [email protected]

[5] M. UlAlam, N. Roy, M. Petruska, and A. Zemp, “Smart-energy group anomaly based

behavioral abnormality detection,” in Proc. IEEE Wireless Health (WH), Oct. 2016.

[6] K. Kabitzsch, J. Clement and J. Ploennigs “Detecting activities of daily living with smart

maters”, in Advance technology and Societal Change. Heidelberg, Germany: Springer,

2014.

[7] K. Basu, V. Debusschere, and S. Bacha, “Appliance usage prediction using a time series

based classification approach,” in Proc. IEEE 38th Annu. Conf. Ind. Electron. Soc.

(IECON), May 2012.

[8] L. Hawarah, S. Ploix, N. Arghira, H. Joumaa, and K. Basu, “A prediction system for home

appliance usage”, Energy Buildings, vol.67, Sep. 2013.

[9] V.K. Prasanna, C. Chelmis, and J. Kolte, “Big data analytics for demand response:

Clustering over space and time”, in Proc. IEEE Int. Conf. Big Data (Big Data), Apr. 2015.

[10] T. Za_bkowski and K. Gajowniczek, “Data mining techniques for detecting House hold

characteristics based on smart meter data,” Energies, vol. 8, no. 7, 2015.

[11] Y. Yin, R. Rao, J. Pei, and J. Han, “Mining frequent patterns without Candidate

generation: A frequent-pattern tree approach”, Data Mining Knowl. Discovery, vol. 8, no.

1, 2004.

[12] J. Han, J. Pei, and M. Kamber, “Data mining: Concepts and techniques,”in Cluster

Analysis: Basic Concepts and Methods, 3rd ed. San Mateo, CA, USA: Morgan Kaufmann,

2011.

[13] D. Heckerman, “Bayesian networks for data mining,” Data Mining Knowl. Discovery,

vol. 1, no. 1, 1997.

[14] S. Shrimohammadi, A. Yassine and S. Singh, “Incremental mining of frequent power

consumption patterns from smart maters big data”, in Proc. IEEE Electrical Power Energy

Conf. (EPEC), Oct. 2016..

[15] J. Han, J. Pei, and M. Kamber, “Data mining: Concepts and techniques,” in Classification:

Advanced Methods, San Francisco, CA, USA: Morgan Kaufmann, 2011.

[16] Mrs. Bhawana Mathur, Dr. Manju Kaushik: “Comparative study of k-means and

Hierarchical Clustering Techniques”, International journal of Software & Hardware

Research in Engg, 2014.

[17] Dr. M Nagalakshmi, Dr. I Surya Prabha, K Anil, Big Data Map Reducing Technique

Based Apriori in Distributed Mining. International Journal of Advanced Research in

Engineering and Technology, 8(5), 2017, pp 19 – 28.

[18] Parag C. Shukla and Dr. Kishor Atkotiya, Big Data Analytics: What It Is and What It

Isn‟t, Characteristics, Classification, Challenges and Importance. International Journal of

Computer Engineering & Technology, 8(6), 2017, pp. 60–66.

[19] K. Prema and Dr. A.V. Sriharsha, Differential Privacy in Big Data Analytics for Haptic

Applications. International Journal of Computer Engineering & Technology, 8(3), 2017,

pp. 11–19.

[20] Naga Raju Hari Manikyam and Dr. S. Mohan Kumar, Methods and Techniques To Deal

with Big Data Analytics and Challenges In Cloud Computing Environment. International

Journal of Civil Engineering and Technology, 8(4), 2017, pp. 669-678.