46
Josh Joyner Gregory Christakos Real-Time & Big Data GIS: Best Practices

Real-Time & Big Data GIS: Best Practices · -In order to use the spatiotemporal big data store it requires an Enterprise type connection •Federating the local ArcGIS Server (and

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Josh Joyner

Gregory Christakos

Real-Time & Big Data GIS: Best Practices

analytics storage

visualization

ArcGIS GeoEvent ServerReal-Time GIS for ArcGIS Enterprise

ingestion

actuation

Apps

DesktopAPIs

Agenda

Architecture Recommendations

Deployment Patterns

Service Design Considerations

Understanding Your Data

Things I Wish I Knew

Resources & Wrap Up

1

2

3

4

5

6

Architecture

Recommendations1

• Operating environment: m5.2xlarge (AWS) / DS4 v2 (Azure)

- virtual machines – beware! resources need to be shared in an effective way, like EC2 or Azure.

- dedicated bare metal machines or public cloud instances are much more deterministic.

• Network

- speed – the faster the better. 1 GB/s

• Memory

- size – 8GB has been required since 10.3. 32GB, default JVM max heap size is 4 GB

- type – minimum of DDR4 is recommended.

- clock speed (MHz) and transfer rate (Mbps) – the faster the better.

• Processors

- # of cores – the more the better. 8 vCPU

- speed (GHz) – the faster the better.

• Disk

- 700MB required for installation 10GB recommended minimum (10.6+)

- amount of disk space needed will vary based on quantity of deployed input connectors

- each input can utilize up to a maximum of 600 MB of disk space before clean up

ArcGIS GeoEvent ServerWhat are the primary factors I should consider?

What are the primary factors I should consider?

• Operating environment: m5.2xlarge (AWS) / DS4 v2 (Azure)

- virtual machines – beware! resources need to be shared in an effective way, like EC2 or Azure.

- dedicated bare metal machines or public cloud instances are much more deterministic.

• Disk

- speed – the faster the better 1 GB/s EBS, note: local SSD is much better

• Network

- speed – the faster the better. 1 GB/s

• Memory

- size – 16GB minimum. 32GB

- type – DDR4 is recommended.

- clock speed (MHz) and transfer rate (Mbps) – the faster the better.

• Processors

- # of cores – the more the better. 8 vCPU

- speed (GHz) – the faster the better.

spatiotemporal big data store

Deployment Patterns2

ArcGIS GeoEvent Server

Deployment Patterns

functional servers & spatiotemporal big data store

SHOULD BE on ISOLATED machines

ArcGIS GeoEvent Server

Deployment Patterns

functional servers & spatiotemporal big data store

SHOULD BE on ISOLATED machines

multi-machine considerations

• Site vs Silo Approaches

- Site: Multiple installations of GeoEvent Server configured to use a shared ArcGIS Server site

- Silo: Independent GeoEvent Server instances, each in their own ArcGIS Server site

Deployment Patterns

multi-machine considerations

• Site vs Silo Approaches

- Site: Multiple installations of GeoEvent Server configured to use a shared ArcGIS Server site

- Silo: Independent GeoEvent Server instances, each in their own ArcGIS Server site

• The Site based approach is NOT recommended- Introduces significant effort to synchronize all participating machines

- Throughput improvements but not true HA or DR

Deployment Patterns

• Silo based approach may required additional third-party infrastructure to support

message distribution (e.g. Kafka or Load Balancers)

Is there a better option that provides improved

scalability and disaster recover?

Deployment Patterns

ArcGIS Analytics for IoThttps://www.esri.com/en-us/arcgis/products/arcgis-analytics-for-iot/overview

multi-machine considerations

Deployment Patternsunderstanding the options

• What is the relationship between GeoEvent Server and Analytics for IoT?

• Both Analytics for IoT and GeoEvent Server provide similar capabilities, right?

• Is Analytics for IoT a replacement for GeoEvent Server?

• When / Why should I consider one product over another

• When will Analytics for IoT be available to customers?

updating Gateway deployment location

• GeoEvent Gateway is deployed to

“C:/ProgramData/Esri/GeoEvent-Gateway/” (Windows OS)

“/home/arcgis/.esri/GeoEvent-Gateway/config.machinename” (Linux OS)

- This location can be changed by editing:

- (ArcGIS Server Installation Path)\GeoEvent\gateway\etc\kafka.properties

- (ArcGIS Server Installation Path)\GeoEvent\gateway\etc\zookeeper.properties

GeoEvent Server

other deployment considerations

GeoEvent Server

• GeoEvent Server and GeoEvent Gateway services should be left running

- Limit system restarts to scheduled patch times

- Inputs and Outputs leveraging Hosted Feature Layers should be stopped prior to any required

restarts

- Be aware of server and service dependencies

• Patches and Upgrades typically require the reset/clearing of browser cache to complete

installation

Service Design

Considerations3

Service Design Considerationshow many event processing paths do you see?

Service Design Considerationshow many event processing paths do you see?

Service Design Considerationshow many event processing paths do you see?

Service Design Considerationshow many event processing paths do you see?

Service Design Considerationswhich would you choose?

Service A Service B

Service Design Considerationswhich would you choose?

When possible, pre-filter the

input data before ingesting.Each “branch” in a service contains the same event data.

In this example the three branches triple the number of

records the underlaying message bus needs to handle.

Service Design Considerationsnot all components are created equally

A

B

C

Which of these services has

the potential to least impact

performance?

Service Design Considerationsnot all components are created equally

A

B

CThe first service, though it

has more nodes, operates

against a cache of

enrichment records and in-

memory geofences allowing

for better performance.

Service Design Considerationsnot all components are created equally

A

B

CThe second service

modifies the incoming

event geometry which can

be “expensive”.

These types of requests are

typically very quick but can

be impacted by geometry

complexity.

Service Design Considerationsnot all components are created equally

A

B

CThe third service utilizes

Network Analyst to return a

“drive time” polygon which

can significantly impact

throughput.

• Configure Filters and Field Mapper Processors as early as possible in a service

- This reduces the volume / data size of the events being processed

- Potentially simplifies service configuration “down stream”

• Avoid Managed GeoEvent Definitions when possible

- These are “system owned” definitions whose lifecycle is entirely controlled by the processors

- Editing or Deleting a processor will remove these definitions

- If necessary copy generated definition and edit processor to look for it

• Utilize single data pipelines whenever possible

- Branching out or back together of pipelines can have multiplicative impacts

- Better to duplicate workflows across multiple services than a single complex workflow

Service Design Considerationsother recommendations

4 Understanding Your Data

Understanding Your Data

• Building a GeoEvent Service is a methodical process

- Input, Output, GeoEvent Service

- Data Stores, GeoEvent Definitions and more

• Different output connectors are needed for troubleshooting and service design

- Write to a JSON file

- Add features to a pseudo feature service

- Push data over TCP to the GeoEvent Logger

• It can be tricky to understand real-time data in route

Common Themes

Understanding Data – Service Designer

• 10.8 – New usability and functionality in the GeoEvent Service Designer

• Added ability to:

- Create and add input and output connectors

- Copy existing input and output connectors

- Edit input and output connectors

- Start or stop input and output connectors*

- Modify GeoEvent Definitions, GeoFences, Data Stores, Spatiotemporal Big Data Stores

• Best Practice: Use the new features to quickly and easily build or edit GeoEvent

Services

New at 10.8.0

Service Designer

Understanding Data – GeoEvent Sampler

• 10.8 – New embedded utility in the GeoEvent Service Designer

• Provides the ability to:

- Sample a subset of processed events on any route in real-time

- View events as prettified JSON or delimited text

- Visualize geometry of sampled events

- Compare sampled events (and geometry) on two routes simultaneously

• Use the GeoEvent Sampler to:

- Verify attributes

- Verify schema

- Verify geometry

• Best Practice: Use GeoEvent Sampler to quickly and easily understand real-time

data

New at 10.8.0

GeoEvent Sampler

Things I Wish I Knew5

Things I Wish I Knewfederating GeoEvent Server

“I have to federate the GeoEvent Server… don’t I?”

- In order to use the spatiotemporal big data store it requires an Enterprise type connection

• Federating the local ArcGIS Server (and in turn GeoEvent Server) provides…

- Single Sign-on Experience

- Automatically converting Default datastore type

- …but even if you don’t federate you can still make a Datastore Connection to your ArcGIS Enterprise

Things I Wish I Knewfederating GeoEvent Server

“I have to federate the GeoEvent Server… don’t I?”

- Some deployments (especially in highly secured environments) function better without federating

- It also allows more restrictive access to the GeoEvent Manager and Services

- Only those Admin level users on the local ArcGIS Server can log into GeoEvent Manager

Things I Wish I Knewversion parity requirements

“Do I have to upgrade or wait for newer versions of the Gallery add-ons”

• Most items on the GeoEvent Gallery are compatible with GeoEvent Server 10.4 or later

- New releases / updates are scheduled only when a component is not compatible

• It is recommended to remove and redeploy custom components during upgrading

Things I Wish I Knewgeofence synchronization

“What’s the maximum number of GeoFences I can store in GeoEvent Server”

• No hard cap on number of GeoFences, but higher quantity/complexity impacts load times

• Sync rule refresh interval is the time between successful requests/responses

- A 1 second refresh on a query that takes 6 seconds to complete means new data every 7 seconds not 1 second

• Use Stream Layers when possible to push new geometry directly into the GeoEvent Server cache

Look for a 10.8 / 10.7.1 Patch coming soon to address some known

issues with GeoFence Sync

6 Resources & Wrap-Up

ResourcesSelf-Paced Training and Resources

• ArcGIS GeoEvent Server resources

- http://enterprise.arcgis.com/en/geoevent

- Updated Documentation

- Installation Guides

- System Requirements

- Tutorials

• Blogs and discussions on the forum

- http://links.esri.com/geoevent-forum

• Video recordings of technical workshops

- http://www.esri.com/videos

Real-Time & Big Data Technical Workshops

• Tuesday

– 1:45 - 2:45 ArcGIS Analytics for IoT: An Introduction

– 3:00 - 4:00 ArcGIS GeoEvent Server: An Introduction

• Wednesday

– 11:00 - 12:00 ArcGIS Analytics for IoT: An Introduction 2nd offering

– 1:30 - 2:30 ArcGIS GeoEvent Server: An Introduction 2nd offering

– 2:45 - 3:45 Real-Time and Big Data GIS: Best Practices Only Offering

– 4:00 - 5:00 ArcGIS GeoEvent Server: Applying Real-Time Analytics Only Offering

– 5:15 - 6:15 ArcGIS GeoEvent Server: Visualizing Real-Time Data Only Offering

Print Your Certificate of Attendance

Print Stations Located in 150 Concourse Lobby

Tuesday12:30 pm – 6:30 pm

Expo

Hall B

5:15 pm – 6:30 pm

Expo Social

Hall B

Wednesday10:45 am – 5:15 pm

Expo

Hall B

6:30 pm – 9:30 pm

Networking Reception

Smithsonian National Museum

of Natural History

Download the Esri

Events app and find your event

Select the session

you attended

Scroll down to

“Survey”

Log in to access the

survey

Complete the survey

and select “Submit”

Please Share Your Feedback in the App

Josh Joyner

ArcGIS GeoEvent ServerProduct Manager

[email protected]

Gregory Christakos

ArcGIS GeoEvent ServerProduct Engineer

[email protected]

Comments?

Questions?