Mr. Apichon Witayangkurn

apichon@iis.u-tokyo.ac.jp

Department of Civil Engineering

The University of Tokyo

[Title slide graphic; labels: Hive/Hadoop, Messaging Service, Sensor Network]

Contents

1. Introduction
2. What & Why Sensor Network
3. Enterprise Sensor Network
4. Conclusion and Future work

Introduction

Background

Sensor technology is now widespread and available at low cost, e.g. weather, CO2, and radiation sensors.

It is widely used in many fields of research and application, such as environment monitoring, pollution monitoring, disaster monitoring, agricultural field monitoring, and traffic monitoring.

Most applications are developed around one specific sensor and purpose, so they are difficult to reuse for another purpose or with a different sensor.

Sharing sensor information among systems is difficult due to a lack of standardization.

OGC provides Sensor Web Enablement (SWE), but it does not focus on the concrete details of application development.

What & Why Sensor Network

Sensor Network: a group of heterogeneous sensor systems connected through a communication infrastructure to exchange information between sensor stations or sensor nodes. All sensor nodes can link or synchronize data with each other or with a main station, so that the whole acts as a network. It is driven by progress in three technologies: sensors, field platforms, and the Internet.

[Figure: Sensor Network; labels: Sensors, Platform, Internet]

What is needed for Sensor Network

Sensor Service GRID (SSG) – Sensor Middleware

Issues in Sensor Network

How to handle a large sensor network (nodes and data)
How to scale to a larger size with minimal effort
Insufficient processor, I/O, and storage resources for large-scale operation
Heterogeneous and vendor-specific sensors are difficult to connect to a sensor network
It must be able to operate over any network, even an unstable one
Real-time and near-real-time operation
It must provide a channel or interface for 3rd party applications to connect and use data in the sensor service
Standardized interfaces for compatibility with other software
Rapid installation and ease of use
GIS-enabled visualization
Low cost??

Enterprise Sensor Network

Key Features

Large-scale support with cloud
Massive data and real-time data processing
Flexible data communication
Easy integration, installation and ease of use
High-frequency and multi-dimension support
Open standard and integration support
Spatial data support

The Goal: “Design and develop a prototype sensor network system that supports various sensors and any network topology, and that can easily scale from small to large size with minimal effort and human operation.”

System Overview

Sensor Stations (SOSes)

An SOS is a sensor station installed and deployed at a field site.
It handles data fed from sensors and sends data to the cloud service.
It can be a fixed station or a mobile station with mission support.
It is a combination of an SOS Service and a Web Server.
It supports both push and pull data feeding.
It is divided into 3 types based on its features:

Rich-node: full functionality with web UI and 2-way control
Dump-node: data feeder only (storage, processing cost)
Virtual-node: shared resources, no HW, more than one node

Sensor Station Design

Messaging Service as communication medium

Enables 2-way control between stations and cloud services
Supports multiple connectors
Supports various types of message storage
Load balancing and cluster support

(Source: ActiveMQ, Apache)
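The 2-way control between a station and the cloud service can be pictured with a toy in-memory broker. This is only a sketch: the class `InMemoryBroker`, the destination names `sensor.data` and `command.sos-1`, and the message fields are all hypothetical, and it does not use or imitate the ActiveMQ API.

```python
import queue


class InMemoryBroker:
    """Toy stand-in for a message broker: one FIFO queue per destination."""

    def __init__(self):
        self._queues = {}

    def send(self, destination, message):
        # Create the destination queue on first use, then enqueue the message.
        self._queues.setdefault(destination, queue.Queue()).put(message)

    def receive(self, destination):
        # Return the oldest pending message, or None if nothing is waiting.
        pending = self._queues.get(destination)
        if pending is None or pending.empty():
            return None
        return pending.get_nowait()


broker = InMemoryBroker()

# Station -> cloud: a sensor reading travels on a data destination (hypothetical name).
broker.send("sensor.data", {"station": "sos-1", "co2_ppm": 412.0})

# Cloud -> station: a control command travels on a per-station destination.
broker.send("command.sos-1", {"action": "set_interval", "seconds": 60})

reading = broker.receive("sensor.data")      # consumed by the cloud service
command = broker.receive("command.sos-1")    # consumed by the station
```

Because both directions go through the broker, neither side needs to be reachable directly by the other, which is one plausible reason a messaging layer suits unstable field networks.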

Network of Brokers

Brokers can be linked together to form a network or cluster of brokers. A network of brokers can use various network topologies, such as hub-and-spoke, daisy chain, or mesh.
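To make the three topologies concrete, the sketch below models each one as an undirected graph of broker names and counts how many broker-to-broker hops a message needs. The function names and broker names are illustrative assumptions, not part of any broker product.

```python
from collections import deque


def hub_and_spoke(hub, spokes):
    """Every spoke broker links only to the central hub."""
    links = {hub: set(spokes)}
    for spoke in spokes:
        links[spoke] = {hub}
    return links


def daisy_chain(brokers):
    """Each broker links to its neighbours in a line."""
    links = {b: set() for b in brokers}
    for left, right in zip(brokers, brokers[1:]):
        links[left].add(right)
        links[right].add(left)
    return links


def mesh(brokers):
    """Every broker links to every other broker."""
    return {b: set(brokers) - {b} for b in brokers}


def hops(links, source, target):
    """Breadth-first search: fewest hops from source to target, or None."""
    seen = {source}
    frontier = deque([(source, 0)])
    while frontier:
        node, distance = frontier.popleft()
        if node == target:
            return distance
        for neighbour in links[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                frontier.append((neighbour, distance + 1))
    return None


names = ["b1", "b2", "b3", "b4"]
spoke_hops = hops(hub_and_spoke("hub", names), "b1", "b4")
chain_hops = hops(daisy_chain(names), "b1", "b4")
mesh_hops = hops(mesh(names), "b1", "b4")
```

The trade-off this illustrates: a mesh minimizes hops at the cost of many links, a daisy chain minimizes links at the cost of hops, and hub-and-spoke sits in between but depends on the hub.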

Sensor Cloud Service

It is a sensor data middleware that provides users with a platform to receive data from remote field sensor networks, including data interfaces and virtualization.

It is typically characterized by the following features:
High performance
Scalability
Reliability
Open architecture

[Figure: Sensor Cloud Service components; labels: Spatial Database, Arbitrary Processing Services, Spatial Query, Cloud Service (Hadoop/Hive), Web Services, Web Interface, Sensor Virtualization, Synchronization Services, Proprietary API, Open Standard API, Command Services, 3rd App Connectors]

Key Technology

What is Hadoop

An open source framework, free!
Distributed applications for large data
Parallel processing
Runs on commodity machines
Scalable
Widely used

In 2011, Facebook claimed to have the largest Hadoop cluster in the world, with 30 PB of storage and nearly 10,000 nodes.

Hive is a data warehousing package on Hadoop. It provides a SQL-like language called HiveQL via Web GUI and JDBC.

Projects under the Hadoop umbrella

Common: a set of components and interfaces for distributed filesystems and general I/O (serialization, Java RPC, persistent data structures).
MapReduce: a distributed data processing model and execution environment that runs on large clusters of commodity machines.
HDFS: a distributed filesystem that runs on large clusters of commodity machines.
Hive: a distributed data warehouse. Hive manages data stored in HDFS and provides a query language based on SQL (which the runtime engine translates to MapReduce jobs) for querying the data.
Sqoop: a tool for efficiently moving data between relational databases and HDFS.
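The MapReduce processing model listed above can be sketched in a few lines of single-process Python. This is not the Hadoop API: the function names `map_reading`, `reduce_mean`, and `run_job` and the sample readings are assumptions made for illustration, and a real cluster would run the map and reduce steps in parallel on many machines.

```python
from collections import defaultdict


def map_reading(record):
    """Map step: emit a (key, value) pair for each input record."""
    sensor_id, value = record
    yield sensor_id, value


def reduce_mean(sensor_id, values):
    """Reduce step: combine all values that share one key."""
    return sensor_id, sum(values) / len(values)


def run_job(records):
    # Shuffle step: group the mapped values by key before reducing.
    groups = defaultdict(list)
    for record in records:
        for key, value in map_reading(record):
            groups[key].append(value)
    return dict(reduce_mean(key, values) for key, values in groups.items())


# Hypothetical (sensor id, temperature) readings.
readings = [("t1", 20.0), ("t2", 30.0), ("t1", 22.0), ("t2", 34.0)]
means = run_job(readings)
```

A HiveQL query that groups by sensor and averages a column expresses the same computation declaratively; Hive generates the map and reduce steps on the user's behalf.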

Hadoop main components

NameNode is the bookkeeper of HDFS; it keeps track of how your files are broken down into file blocks, which nodes store those blocks, and the overall health of the distributed filesystem.

DataNodes are the workhorses of the filesystem. They store and retrieve blocks when they are told to (by clients or the NameNode), and they report back to the NameNode periodically with lists of the blocks that they are storing.

Secondary NameNode (SNN) is an assistant daemon that monitors the state of the cluster's HDFS. It takes snapshots of the NameNode's metadata to help minimize downtime and loss of data.

JobTracker is the liaison between your application and Hadoop. Once you submit your code to your cluster, the JobTracker determines the execution plan by determining which files to process, assigns nodes to different tasks, and monitors all tasks as they’re running.

TaskTrackers are responsible for executing the individual tasks that the JobTracker assigns, and they manage the execution of those tasks on each slave node.
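The division of labour between NameNode and DataNodes can be illustrated with a toy model. The sketch below is heavily simplified and every name in it is hypothetical: it has no replication, places blocks round-robin, and lets the `NameNode` object move bytes itself, whereas in HDFS clients exchange block data with DataNodes directly and the NameNode holds only metadata.

```python
class DataNode:
    """Stores and retrieves blocks, and reports which blocks it holds."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}

    def store(self, block_id, data):
        self.blocks[block_id] = data

    def retrieve(self, block_id):
        return self.blocks[block_id]

    def block_report(self):
        return sorted(self.blocks)


class NameNode:
    """Bookkeeper: remembers how each file is split and where blocks live."""

    def __init__(self, datanodes, block_size):
        self.datanodes = datanodes
        self.block_size = block_size
        self.file_blocks = {}  # filename -> [(block_id, datanode name), ...]

    def write(self, filename, data):
        placements = []
        for offset in range(0, len(data), self.block_size):
            index = offset // self.block_size
            block_id = f"{filename}#{index}"
            # Simplification: round-robin placement, single copy per block.
            node = self.datanodes[index % len(self.datanodes)]
            node.store(block_id, data[offset:offset + self.block_size])
            placements.append((block_id, node.name))
        self.file_blocks[filename] = placements

    def read(self, filename):
        by_name = {node.name: node for node in self.datanodes}
        return b"".join(
            by_name[node_name].retrieve(block_id)
            for block_id, node_name in self.file_blocks[filename]
        )


nodes = [DataNode("dn1"), DataNode("dn2")]
namenode = NameNode(nodes, block_size=4)
namenode.write("log", b"abcdefghij")
restored = namenode.read("log")
```

Adding capacity in this model means appending another `DataNode` to the list, which mirrors the idea that the cluster grows by adding machines rather than by enlarging one.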

[Figure: Hadoop main components (Source: Lam, 2011); labels: 1 PC, add more nodes, Keep Metadata & Distribute Job, Store & Process Data]

Hive

Hive is a data warehousing package built on top of Hadoop.

Its target users remain data analysts who are comfortable with SQL and who need to do ad hoc queries, summarization, and data analysis on Hadoop-scale data.

You interact with Hive by issuing queries in a SQL-like language called HiveQL via Web GUI and JDBC.

HiveQL

(Source: White., 2011)

How Hadoop benefits a Sensor Network

Scalability: commodity hardware scales easily in many cases. Twenty Hadoop nodes may cost only as much as a single redundant database slave pair.
Operational concerns: removing as many single-point-of-failure cases as possible is crucial to the smooth operation of a world-class service.
Data processing speed: many system-wide calculations were simply not possible to perform with a monolithic system.

Spatial Processing & Custom Function

Spatial query: find point in polygon
Specific custom functions: interpolation, forecasting, model

Hive with Spatial and Custom Function

Use JTS (Java Topology Suite)
Pure Java library for spatial functions
It can be easily attached to a map/reduce task because Hadoop is a Java-native platform
Good performance and open source

Use User-Defined Functions (custom development)
UDF (User-Defined Function)
UDAF (User-Defined Aggregate Function)
UDTF (User-Defined Table-Generating Function)
Create a spatial function such as “within” using JTS and register it as a UDF
It can then run on Hive, which automatically translates it into map/reduce jobs.

Use Join Method and Lateral View

Spatial Data Processing & Custom Function

Example of Spatial Custom Function

JTS (Java Topology Suite), used through a UDF (User-Defined Function)
Identify the location of a GPS point (Lat, Lng) by searching in shape polygons

Input point: 139.702777 35.694152
Prefecture: Tokyo
City: Shinjuku-ku
Grid: Code:533944151
Throughput: 300,000++ points/sec
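The point-in-polygon search behind such a lookup can be sketched with the classic ray-casting test. This is a pure-Python stand-in chosen for illustration, not JTS and not a Hive UDF; the function names are assumptions, and the two rectangles are made-up regions, not real administrative boundaries.

```python
def point_in_polygon(x, y, polygon):
    """Ray casting: cast a ray from (x, y) toward +x and count edge crossings.

    An odd number of crossings means the point is inside the polygon.
    """
    inside = False
    count = len(polygon)
    for i in range(count):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % count]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            # x coordinate where this edge meets that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def locate(lng, lat, regions):
    """Return the name of the first region containing the point, or None."""
    for name, polygon in regions.items():
        if point_in_polygon(lng, lat, polygon):
            return name
    return None


# Made-up rectangles as (lng, lat) vertices; NOT real boundary data.
regions = {
    "region_a": [(139.0, 35.0), (140.0, 35.0), (140.0, 36.0), (139.0, 36.0)],
    "region_b": [(140.0, 35.0), (141.0, 35.0), (141.0, 36.0), (140.0, 36.0)],
}
found = locate(139.702777, 35.694152, regions)
missing = locate(142.0, 35.5, regions)
```

Scanning every polygon for every point is the simplest approach; with many polygons a spatial index would normally be used to narrow the candidates first.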

Performance Comparisons of Spatial Data Processing Techniques for a Large Scale Mobile Phone Dataset

App vs. RDBMS vs. Hadoop

[Figure: processing-time comparison chart; labels: 21 Hours, 1 min !!!]

Remark: 1 day of data = 20 million records

Sensor Network with Cloud

Hive and PostgreSQL (Programming view point)

[Figure: Hive and PostgreSQL from a programming viewpoint; labels: PostgreSQL, SQL, Hadoop, Hive (Metastore), Hibernate, Spring, MapReduce, Specific data processing, Java, Servlet, RMI]

Conclusion

We designed the Enterprise Sensor Network to address current issues in the development of sensor networks, such as:

handling a large number of sensor nodes and a large volume of sensor data
real-time data processing
flexible data communication
easy integration and installation

We proposed a Messaging Service and the Hadoop distributed platform as the main technologies to overcome those issues.

On the sensor station side, we designed the system as services. The web server and the SOS service are separated and communicate with each other via RMI.

The SOS service is a combination of several services that each handle a specific operation: SOS interface, Command Service, Scheduler Service, Data Synchronization Service, and Data Feeder Service.

The Data Feeder Service was designed so that a custom feeder can be developed for a vendor-specific sensor and plugged into the services.

The combination of Sensor Station, Messaging Service, and Sensor Cloud Service enables the sensor network system to achieve real-time operation, scalability, and robustness.
