
© 2015 by The Enterprise Strategy Group, Inc. All Rights Reserved.

 

Big Data and Analytic Challenges Remain in the Era of IoT

Over the last year there has been a major uptick in interest in and maturity of IoT. IoT is the network of physical objects or “things” embedded with electronics, software, and sensors that enable those objects to exchange data with other connected devices. The primary entry point for IoT is the need to extend existing data analytics engagements to drive greater value from digital assets that combine device data with other enterprise data. Although IoT is taking shape as an extension of big data and analytics initiatives, non-analytics-oriented solutions also fall under the IoT umbrella: they automatically make decisions and complete transactions based on a set of business rules.

Despite the business interest and technological advancements in IoT, the exponential growth and increased variety of data continue to be key challenges holding organizations back from making the IoT leap and transforming into data-driven businesses. How long will it take until insight is achieved? What new tools will be required to efficiently reason over and analyze these massive data sets? How much storage is needed? Who needs to be involved? These are just some of the questions respondents to a recent ESG survey have asked since deploying a data analytics solution. The challenge they identify most often is data integration. With multiple data sources that must be transformed, cleansed, wrangled, and joined, it is no surprise that 38% of respondents believe that data integration is complex.¹

Figure 1. Top Six Requirements for Driving New BI/analytics Solution Evaluations

Which of the following requirements are most responsible for driving the evaluation of new BI/analytics solutions? (Percent of respondents, N=168, three responses accepted)

• Organization is moving towards more real-time analytics: 27%
• New applications are generating new data types that need different analytical tools: 27%
• Current BI/analytics solution(s) do not meet requirements/needs: 26%
• Need to combine analytics for structured and unstructured data: 24%
• Cost reduction of existing platforms: 24%
• Organization is moving toward more predictive and discovery analytics: 24%

Source: Enterprise Strategy Group, 2015.

1 Source: ESG Research Report, Enterprise Big Data, Business Intelligence, and Analytics Trends, January 2015.

ESG Lab Review

An Internet of Things (IoT) Platform: Pivotal Cloud Foundry and Pivotal Big Data Suite on VCE Vblock Systems with EMC Isilon

Date: August 2015
Author: Mike Leone, ESG Lab Analyst

Abstract: This ESG Lab Review documents hands-on testing of the ability of the VCE Vblock System and VCE Technology Extension for EMC Isilon storage to drive value in an Internet of Things (IoT) real-time analytics environment with Pivotal Cloud Foundry and Pivotal Big Data Suite. Testing focused on the functionality, deployment, and simplicity of running Pivotal software on a converged VCE solution, which is designed to provide a scalable, flexible, end-to-end platform for IoT and analytic solutions.


With 45% of large midmarket and enterprise-class organizations planning to deploy a new big data and BI/analytics solution in 2015, and 39% selecting Hadoop-based solutions as one of their planned infrastructure approaches, ESG asked respondents what requirements were most responsible for driving the evaluation of a new BI/analytics solution. As shown in Figure 1, the two most-cited responses, identified by 27% of respondents each, were that the organization is moving towards a more real-time analytics approach, and that new applications are generating new data types that need different analytical tools.² It should be noted that 99% of the respondents already have a BI/analytics solution in place. In other words, adaptability and agility were overlooked by many early adopters of big data technology. This has led organizations to rethink their data, application, and infrastructure strategies, not only to solve big data analytics challenges but also to better prepare themselves for IoT success.

Challenges Moving Forward in an IoT World

From seismic sensors and mobile phones to refrigerators and cars, the amount of data being collected on a minute-by-minute basis is massive in both the consumer and industrial worlds. This is often said to be IoT, but data collection is just one piece of IoT. The data must also be analyzed to make better decisions based on real-time and historical data. Often, this analysis can be done without human interaction, leveraging machine learning to make recommendations based on that data. This combination of data collection, automated data wrangling, and data analysis is IoT.

With new devices and new interfaces generating such a large variety of data, data wrangling becomes a major concern, which falls in line with the previously mentioned challenges that many organizations experience with traditional BI/analytics solutions. Because of this, organizations have been forced to hand-code one-off solutions to reduce the time it takes to wrangle all of the data. This is often done without an underlying platform, which not only consumes a significant amount of expert personnel's and data scientists' time, but also introduces the potential for bugs or issues in the customized software due to human error.

From a software standpoint, numerous open source technologies exist that can work quite well to address the requirements of IoT when effectively used together. The problem is that this approach can be costly. Separate experts in each technology are often required not only to make a solution work, but also to manage and support it. With open source technologies often comes the idea of leveraging commodity hardware to house these various pieces of software. Though organizations will save on CapEx with that approach, the idea of relying on commodity hardware with minimal support to house their crucial data can serve as an instant deterrent. The need for a platform-based approach has never been more apparent.

 

VCE Vblock System with EMC Isilon Storage, Pivotal Cloud Foundry, and Pivotal Big Data Suite

The VCE Vblock System is a well-known and extensively deployed integrated computing platform (ICP) that combines best-of-breed technologies from industry-leading vendors. VCE offers a range of Vblock Systems that businesses can choose from to suit their workload needs. With Cisco compute and networking, EMC storage and data protection, and VMware virtualization and management, Vblock Systems are designed to make it simpler and quicker for organizations to deploy a complete IT platform that has been pre-integrated, pre-validated, and pre-tested, with release certification matrices (RCM) serving as a basis for VCE seamless support.

VCE Technology Extension for EMC Isilon Storage

Building on the existing success of its integrated computing platform, VCE released the VCE Technology Extension for EMC Isilon storage to extend the business value of VCE. With analytics being a component of the dominant use cases for VCE, organizations can now deploy Hadoop distributions and tools from the Hadoop ecosystem, along with dependent applications and databases, to provide end-to-end analytics on Vblock Systems with EMC Isilon. The benefits include:

2 Ibid.


• Flexible scale-out capacity and performance from Vblock Systems with EMC Isilon to avoid traditional scale-up limitations in big data analytics environments. This scale-out approach provides improved data protection, data access, resiliency, high availability, manageability, and cost savings.

• Augmenting the structured data analytics storage environment with native HDFS support from EMC Isilon, where unstructured data can be analyzed in place, eliminating the need for ingest or staging. This augments the already highly prevalent structured data technology built into existing Vblock Systems with EMC Symmetrix VMAX, EMC VNX, and EMC XtremIO storage systems while deploying other analytics such as SAS or Splunk. VCE also provides VCE Technology Extension for Cisco Compute as a Direct Attach Storage (DAS) alternative, especially for entry-level Hadoop deployments or Massively Parallel Processing (MPP) databases.

• Predictable high availability and reliability for extremely large data sets. As data sets reach petabyte scale, the pre-engineered Vblock Systems with scale-out Isilon storage offer high levels of component redundancy, helping to eliminate single points of failure.

• Data protection: Vblock Systems take advantage of technologies from EMC, including EMC Avamar, EMC Data Domain, EMC RecoverPoint, and EMC VPLEX. EMC Isilon takes the integrated data protection of the complete solution a step further by offering additional capabilities such as data-at-rest encryption for files, snapshotting, remote replication, and NDMP backups.

• Networking: Hadoop workloads can be network-heavy, and the ability to choose and architect the leading networking technologies is key to success, in contrast with servers and appliances. Vblock Systems can employ Cisco Nexus 9000 Series Application Centric Infrastructure switches to provide application-driven automation, open software support, and hardware-based multi-tenancy.

With the recent expansion of its portfolio, VCE also offers VxBlock Systems designed to present additional choices to customers. Customers can choose network virtualization solutions from VMware NSX or Cisco ACI. Vblock Systems and VxBlock Systems can also be combined with VCE Technology Extensions and VCE Vscale Blocks. Under the VCE Vscale Architecture, organizations can choose the right building blocks based on workload, performance and scaling requirements, and budgetary demands.

 

Pivotal

Organizations can leverage the VCE Vblock System with EMC Isilon; VCE Vision Intelligent Operations; and Pivotal data, application, and analytics solutions to help streamline software application provisioning, deployment, management, and analytics. Deploying Pivotal solutions on Vblock Systems provides organizations with greater agility by standardizing on a proven, converged hardware platform to help accelerate time to value. By converging all investments in applications, analytics, and data into a joint VCE and Pivotal solution, organizations gain flexibility, scalability, and efficiency, while maximizing their return on investment. Additional benefits include:

• Achieving elasticity, scalability, and reliability with Pivotal Cloud Foundry on Vblock Systems utilizing best-in-class servers, networks, and storage.

• Ensuring mission-critical readiness of Hadoop with Pivotal HD, HAWQ, and EMC Isilon to achieve multi-tenancy, Hadoop Distributed File System (HDFS) integration, privacy, and reliability.

• Driving operational efficiency and performance improvements with the Pivotal Greenplum Database and its MPP architecture, which can communicate with external databases and Hadoop.

• Improving business outcomes by leveraging Pivotal GemFire to analyze large quantities of data at scale in the cloud.

• Deploying VCE Data Protection, including EMC Avamar, EMC Data Domain, and EMC VPLEX.

• Mixing and matching future deployments and upgrading compute, network, and storage resources as business demands evolve and improved technology becomes available.


Figure 2. VCE Vblock System with VCE Technology Extension for EMC Isilon and Pivotal

 

Pivotal Cloud Foundry

Pivotal Cloud Foundry is a cloud native application platform that delivers an easy-to-consume and ready-to-use cloud computing environment with application services. The platform is hosted on an existing virtualized public or private infrastructure and streamlines the provisioning, deployment, and management of cloud-based applications. Capabilities include:

• Operational Visibility and Control – Manage and monitor the entire platform from a single pane of glass.

• Security – Isolate applications with containers, apply security groups to restrict connections, apply role-based access control, and audit platform activities.

• Application Portability – Standardize application deployments with containers, enabling application migrations between compatible infrastructures provisioned in VMware vSphere, OpenStack, and Amazon Web Services.

• Automation – Simplify management tasks with centrally managed services, while leveraging deployment templates and tools to easily configure and manage virtualized resources.

• Services – Offer customers a self-service catalog of databases, analytics, and middleware technology, including Pivotal Big Data Suite Services.

Pivotal Big Data Suite

The Pivotal Big Data Suite is a software stack focused on advanced data analytics and in-memory processing for data-driven organizations. The product suite is a collection of Pivotal’s data technologies that enable organizations to build a solid infrastructure for storage and data processing on Hadoop, leverage advanced analytics for deeper insights, and develop and deploy scalable, data-centric applications with distributed, in-memory data stores.

As shown in Figure 3, eight components work together to form the Pivotal Big Data Suite, five of which are also available as services on Pivotal Cloud Foundry.

• Pivotal HD – A Hadoop distribution based on the core Open Data Platform (ODP) optimized for batch processing workloads.

• Pivotal Greenplum Database – An analytical, massively parallel processing database.

• Pivotal HAWQ – A massively parallel processing, ANSI-compliant SQL on Hadoop query engine.

• Pivotal GemFire – A high-performing, distributed, in-memory NoSQL database.

• Spring XD – An open source, distributed framework for data ingestion, batch processing, and data flows.

• Spark – An open source, cluster computing framework for quickly processing many types of data, supported via Pivotal HD.

• Redis – A scalable, open source key value storage and data structure server.

• RabbitMQ – A scalable, open source message queue for applications.

Figure 3. Pivotal Big Data Suite

 

 

ESG Lab Tested

ESG Lab leveraged a VCE Vblock System 340 and VCE Technology Extension for EMC Isilon storage to test and validate the functionality and workflows of Pivotal Cloud Foundry and Pivotal Big Data Suite as they relate to deployments in IoT environments.

Simplicity and Operational Efficiency with Pivotal Cloud Foundry

Pivotal has taken the application lifecycle from months to hours and minutes by building on initial cloud native application platform benefits like developer agility, and focusing on offering additional operational benefits. Every application that is deployed on Pivotal Cloud Foundry can take advantage of these benefits as they relate to individual application requirements. When applications are deployed, they are dynamically routed to a fault-tolerant array of software load balancers to allow for instant scaling and zero downtime. In fact, applications are deployed into dual availability zones to allow for 50% hardware capacity loss without downtime. Application and user events are fed into a single log drain where tools like Splunk can be leveraged to mine the data for information and troubleshooting purposes. Role-based access and policy management are integrated directly into the main dashboard, while application performance management is currently in beta with a release date in the near future.
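The availability-zone claim can be illustrated with a little arithmetic. The sketch below is not Pivotal's scheduler; it assumes a hypothetical round-robin placement policy across two zones named `az1` and `az2`, and shows that with an even instance count, losing one zone still leaves half of the instances running.

```python
from collections import Counter
from itertools import cycle


def place_instances(instance_count, zones=("az1", "az2")):
    """Assign instance indexes to zones round-robin (an assumed policy)."""
    zone_cycle = cycle(zones)
    return {index: next(zone_cycle) for index in range(instance_count)}


def surviving_instances(placement, failed_zone):
    """Return the instance indexes that are not in the failed zone."""
    return [index for index, zone in placement.items() if zone != failed_zone]


placement = place_instances(4)
print(Counter(placement.values()))            # two instances per zone
print(surviving_instances(placement, "az1"))  # instances still serving traffic
```

With an odd instance count the split is uneven, so the zone holding the extra instance is the worse one to lose.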

ESG Lab leveraged a simulated environment that consisted of a VCE Vblock System as the underlying hardware to test the functionality and simplicity of Pivotal Cloud Foundry when deploying and scaling a new application. Testing began by logging into the main developer console. An organization called AlliancesLab had previously been created, which consisted of four “spaces” or business units. The console summarized each space and highlighted the applications, services, and team members for each particular space. Figure 4 shows the console interface.

Figure 4. Pivotal Cloud Foundry Main Console

 

The production space was selected, which had no applications or services deployed or running. Using the Cloud Foundry command line utility, ESG Lab navigated to a sample application with recently completed source code. After checking the endpoints, of which there can be multiple, and logging in, the sample application was pushed to the production space. The push command kicks off a group of events, including creating the route, binding the route to the application, and then downloading all of the components of the application. For example, if the application is a Java application, the JDK is downloaded. All of these components are compiled together using a Cloud Foundry buildpack. Buildpacks enable enterprises to automatically package and run multiple runtimes and frameworks (Go, Ruby, Java, Spring, etc.) for a particular application. The result is packaged into what Pivotal calls a “droplet,” which gets uploaded to Pivotal’s platform layer, where the application starts and gets ported to the droplet application engine.
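The command-line steps described above can be sketched as a short sequence of `cf` CLI calls. This is an illustration rather than a record of what ESG Lab typed: the API URL and the application name `orders-map` are hypothetical, and only the AlliancesLab organization and the production space come from the review. The helper only builds and prints the commands; it does not run them.

```python
import shlex


def push_commands(api_url, org, space, app_name):
    """Build the cf CLI calls for targeting a space and pushing an app."""
    return [
        ["cf", "api", api_url],                     # point the CLI at the API endpoint
        ["cf", "login"],                            # prompts interactively for credentials
        ["cf", "target", "-o", org, "-s", space],   # select the org and space
        ["cf", "push", app_name],                   # run from the application's directory
    ]


# Hypothetical endpoint and app name, used only for illustration.
for command in push_commands(
    "https://api.example.com", "AlliancesLab", "production", "orders-map"
):
    print(shlex.join(command))
```

To execute the steps for real, each list could be passed to `subprocess.run(command, check=True)` on a machine with the `cf` CLI installed.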

Next, ESG Lab transitioned back to the management console, where the newly deployed application could be viewed along with the configuration details and the automatically generated routing information. By clicking on the route, the application was launched (shown in Figure 5). The application simulated a retailer who was interested in tracking her orders across the United States and having the data actively streamed in real time to the displayed map.


Figure 5. Viewing a Deployed Application

 

For the application to function properly, a service had to be added and bound to the application. Navigating back to the main console, ESG Lab clicked the Marketplace, which displayed the available services for the organization. RabbitMQ for Pivotal CF was selected, which then displayed the available service plans. This allows companies that want to be service providers to build multiple plans and apply various charging capabilities. The only plan available was selected, and after supplying an instance name for the service, selecting the space, and selecting the application to which the service should be bound, the service was added to the application. Figure 6 displays the process of adding a new service to an already deployed application. After restarting the application, the application was launched and the data stream was started. The map began changing colors based on the number of orders occurring, and ESG Lab could click on different states to gain a more granular view of the number of orders at a particular time.
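ESG Lab added the service through the Marketplace in the console, but the same steps also have `cf` CLI equivalents. The sketch below lists them under stated assumptions: the service name `p-rabbitmq`, the plan `standard`, the instance name `orders-queue`, and the application name `orders-map` are all placeholders, since the review does not give the actual values.

```python
def bind_service_commands(app_name, service, plan, instance_name):
    """Build the cf CLI calls that create a service instance and bind it to an app."""
    return [
        ["cf", "marketplace"],                                   # list services and plans
        ["cf", "create-service", service, plan, instance_name],  # provision an instance
        ["cf", "bind-service", app_name, instance_name],         # attach it to the app
        ["cf", "restart", app_name],                             # app picks up the binding
    ]


# Placeholder names; the review does not state the real service, plan, or app names.
for command in bind_service_commands("orders-map", "p-rabbitmq", "standard", "orders-queue"):
    print(" ".join(command))
```

The restart step mirrors the review, where the application was restarted after the service was bound.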

Figure 6. Adding a New Service

 

Next, ESG Lab wanted to scale the application. This was done through the command line by issuing a “scale” command and specifying the application and the number of instances desired. With that single command, two additional instances were added to the deployed application. This can also be done via the management interface, which additionally provides the ability to increase the memory and disk limits of each instance.
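As a rough illustration of the scale step, the helper below assembles a `cf scale` call. The application name `orders-map` is hypothetical, and the starting count of one instance is an assumption (the review only says two instances were added). The `-i`, `-m`, and `-k` flags set the instance count, memory limit, and disk limit respectively.

```python
def scale_command(app_name, instances=None, memory=None, disk=None):
    """Build a `cf scale` call; settings left as None are not changed."""
    command = ["cf", "scale", app_name]
    if instances is not None:
        if instances < 1:
            raise ValueError("instances must be at least 1")
        command += ["-i", str(instances)]
    if memory is not None:
        command += ["-m", memory]   # memory limit per instance, e.g. "1G"
    if disk is not None:
        command += ["-k", disk]     # disk limit per instance, e.g. "2G"
    return command


# Assuming one instance was running, asking for three adds two more.
print(" ".join(scale_command("orders-map", instances=3)))
```

Changing the memory or disk limit restarts the application, whereas changing only the instance count does not.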

The final phase of testing focused on a failure scenario. Pivotal Cloud Foundry has built-in application monitoring components to meet high availability and recovery requirements. In particular, the health monitor interacts with the cloud controller to verify the configuration and deployment expectations of an application. If the cloud controller reports something back that doesn’t meet those expectations, instances of the application will be automatically restarted to rectify the inconsistencies. ESG Lab simulated a failure in the application caused by a software bug. After the failure occurred, it was reflected in the management interface (shown in Figure 7). Within seconds, the application went from being in a down state to being up and running again.
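The recovery behavior described here follows a desired-state versus actual-state pattern. The sketch below shows that general pattern only; it is not Cloud Foundry's health monitor code, and `start_instance` is a hypothetical callback standing in for whatever launches an instance.

```python
def reconcile(desired_instances, running_indexes):
    """Return the instance indexes that must be started to match the desired count."""
    expected = set(range(desired_instances))
    return sorted(expected - set(running_indexes))


def recover(desired_instances, running_indexes, start_instance):
    """Start every missing instance and return the updated list of running indexes."""
    running = set(running_indexes)
    for index in reconcile(desired_instances, running):
        start_instance(index)   # hypothetical callback that launches one instance
        running.add(index)
    return sorted(running)


started = []
# Three instances are expected, but instance 1 has crashed.
print(recover(3, [0, 2], started.append))
print(started)
```

In a real platform this comparison runs continuously, which is why a crashed instance reappears within seconds without operator action.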

Figure 7. Simulating a Failure

 

Why This Matters

The way the technology industry is moving, business agility is essential in order to not only adapt to market conditions, but also to remain competitive. The business is pressuring IT to deliver dependable services that meet its requirements faster than ever. This is especially true as software continues to be viewed as the primary way to gain competitive advantage. The problem is that business expectations far exceed what can be delivered by IT. Developers are focused on improving their own experience with an agile, iterative process to better consume open source technology and new data services, while the application operators are focused on being able to continuously deliver application uptime, scalability, and consistency. An approach that better aligns both developer goals and operator goals is crucial to better meeting the lofty expectations of the line of business.

ESG Lab validated that combining industry-proven VCE technology with Pivotal Cloud Foundry delivers an enterprise-ready platform-as-a-service that enables organizations to develop, deploy, and maintain an agile business that can easily adjust in real time to meet its needs. A sample application was quickly deployed in Pivotal Cloud Foundry with a simple push command. Services were added to the application from the available marketplace with just a few mouse clicks to enable the streaming of real-time events. The application was scaled up via a single command from the command line, and could also be scaled directly from the management console. ESG Lab was particularly impressed with the high-availability capabilities. A failure was simulated that caused the application to go down. The Pivotal software quickly detected the failure, and within seconds the application was back up and running in a fully functional state.

 

 

 

 


Delivering Internet of Things Solutions with Pivotal Data Solutions and VCE

The Pivotal Big Data Suite offers software to ease organizations’ transition from a reactive data analytics model to a more proactive, self-improving, machine learning approach. With the Pivotal Big Data Suite, when data is ingested from multiple data sources, as is common in many IoT scenarios, a data stream pipeline is leveraged to automatically wrangle the data and move it to ideal locations in the overall analytics solution. This includes streaming data in, parsing it, making sure it is in the right format, and then understanding where to send it. Historical data is placed in the data lake with HDFS, which is where VCE Vblock Systems with EMC Isilon fits in, while real-time data leverages the speed and low latency of an in-memory datastore. Data from both the data lake and the in-memory system is then leveraged by the machine learning portion of the overall solution to continuously learn, improve, adapt, and better identify trends.
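The parse, validate, and route steps of such a pipeline can be sketched in a few lines. Everything specific here is an assumption made for illustration: the JSON message format, the three required fields, numeric timestamps in seconds, and the rule that a record counts as real-time if it is at most 60 seconds old. The two lists stand in for the in-memory store and the HDFS data lake.

```python
import json

REQUIRED_FIELDS = ("vehicle_id", "timestamp", "speed")   # assumed schema


def parse(raw_message):
    """Parse one JSON message; return None if it is malformed or incomplete."""
    try:
        record = json.loads(raw_message)
    except json.JSONDecodeError:
        return None
    if not isinstance(record, dict):
        return None
    if any(field not in record for field in REQUIRED_FIELDS):
        return None
    return record


def route(raw_messages, now, realtime_window_seconds=60):
    """Split valid records into real-time and historical destinations by age."""
    realtime, historical, rejected = [], [], 0
    for raw in raw_messages:
        record = parse(raw)
        if record is None:
            rejected += 1
        elif now - record["timestamp"] <= realtime_window_seconds:
            realtime.append(record)      # would go to the in-memory store
        else:
            historical.append(record)    # would go to the HDFS data lake
    return realtime, historical, rejected


messages = [
    '{"vehicle_id": "car-1", "timestamp": 990, "speed": 62}',
    '{"vehicle_id": "car-2", "timestamp": 100, "speed": 40}',
    "not json",
]
realtime, historical, rejected = route(messages, now=1000)
print(len(realtime), len(historical), rejected)
```

A production pipeline would typically write fresh records to both destinations; the either-or split is kept here only to make the routing decision visible.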

ESG Lab simulated an IoT scenario commonly referred to as the connected car: a device connected to a car that tracks speed, location, miles per gallon, engine temperature, etc. The connected car architecture, shown in Figure 8, combines both Pivotal Big Data Suite and Pivotal Cloud Foundry with the VCE Vblock System and EMC Isilon into a fine-tuned IoT architecture based on a data streaming reference architecture. Details of the exact test bed can be found in the Appendix.

First, the car data is transmitted to the stream processing engine. Spring XD orchestrates and automates all the steps of the data stream pipeline: it uses dozens of built-in connectors to ingest and sink data; processes that data via filtering, splitting, and transforming with tools like Spark; and analyzes the data with other tools like Python or R to help determine where the data should reside, whether in a data lake or in an in-memory database. For data that gets sunk to the data lake, the VCE Vblock System with EMC Isilon is used as the Hadoop-based solution running Pivotal HD. Using Pivotal HAWQ, data that resides in the data lake can be queried with SQL for SQL-based advanced analytics, while the Pivotal Greenplum Database serves as the analytical, massively parallel processing database. For data that can be analyzed in real time, Pivotal GemFire serves as the highly distributed NoSQL database, providing scalable, low-latency, real-time data access, storage, and event processing. Pivotal Cloud Foundry is leveraged as the platform running the application, and the previously mentioned services collect all the needed data, which is eventually pushed to mobile devices and end-users using the Pivotal Cloud Foundry mobile services.
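The pipeline stages described above (stream in, parse, check the format, decide where to send) can be illustrated with a minimal Python sketch. It does not use Spring XD, HDFS, or GemFire APIs: plain Python containers stand in for the data lake and the in-memory store, and the message schema, the field names, and the routing rule (archive every valid reading, keep only the newest reading per vehicle for real-time access) are assumptions made for illustration, not details taken from the tested solution.

```python
import json
from dataclasses import dataclass, field

# Hypothetical field names: the review lists speed, location, and miles per
# gallon as tracked values but does not give a message schema.
REQUIRED_FIELDS = ("vehicle_id", "timestamp", "speed_mph", "mpg", "lat", "lon")


@dataclass
class Sinks:
    """Plain-Python stand-ins for the pipeline's destinations."""
    data_lake: list = field(default_factory=list)  # stand-in for HDFS on Isilon
    realtime: dict = field(default_factory=dict)   # stand-in for the in-memory store
    rejected: list = field(default_factory=list)   # malformed messages


def parse(raw):
    """Return a dict for a well-formed reading, or None if it is unusable."""
    try:
        reading = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(reading, dict):
        return None
    if any(name not in reading for name in REQUIRED_FIELDS):
        return None
    return reading


def route(raw, sinks):
    """Parse one raw message and send it to the appropriate sink(s)."""
    reading = parse(raw)
    if reading is None:
        sinks.rejected.append(raw)
        return
    # Assumed rule: every valid reading is archived for historical analytics.
    sinks.data_lake.append(reading)
    # Assumed rule: only the newest reading per vehicle is kept for
    # low-latency lookups, so late-arriving messages do not overwrite it.
    current = sinks.realtime.get(reading["vehicle_id"])
    if current is None or reading["timestamp"] >= current["timestamp"]:
        sinks.realtime[reading["vehicle_id"]] = reading
```

Calling `route` once per incoming message fills the three containers; in a real deployment the two appends would be replaced by writes to the actual sinks.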

Figure 8. IoT Architecture for a Connected Car

 


Testing used a simulated application that tracked a moving vehicle and used historical and real-time data to predict its destination. Speed, miles per gallon, and the final destination were displayed in real time as the location of the vehicle was tracked on a map. As shown in Figure 9, a background script was used to generate the simulated car data. Once the data stream was started via the Spring XD command line, ESG Lab logged in to various management interfaces to view real-time performance metrics of the application.

Figure 9. Launching the Data Stream and Managing Performance

 

Finally, ESG Lab viewed the application, which displayed the moving car on a map, along with car metrics, as the car traveled from one destination to another. The map also showed the potential destinations of the driver. As the car navigated in a certain direction (real-time data), the destination was correctly predicted based on historical data of frequently visited locations.
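The review does not describe the prediction model used by the demo application. As a rough illustration of how real-time direction and historical visit frequency could be combined, the following Python sketch scores each candidate destination by its share of past visits multiplied by how closely the car's current heading points toward it. The scoring rule, the function names, and the sample data are all assumptions made for illustration.

```python
import math


def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial great-circle bearing from point 1 to point 2, in degrees [0, 360)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    y = math.sin(dlon) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return math.degrees(math.atan2(y, x)) % 360.0


def predict_destination(position, heading_deg, visit_counts, places):
    """Return the candidate with the highest frequency-times-alignment score.

    position:     (lat, lon) of the car right now (real-time data)
    heading_deg:  current direction of travel, degrees clockwise from north
    visit_counts: {place name: number of past visits} (historical data)
    places:       {place name: (lat, lon)}
    """
    total = sum(visit_counts.values())
    if total == 0:
        return None  # no history to learn from
    lat, lon = position
    best_name, best_score = None, -1.0
    for name, (plat, plon) in places.items():
        prior = visit_counts.get(name, 0) / total
        diff = abs(bearing_deg(lat, lon, plat, plon) - heading_deg) % 360.0
        diff = min(diff, 360.0 - diff)  # smallest angle between the two, 0..180
        # 1.0 when driving straight at the place, 0.0 when driving away from it.
        alignment = (1.0 + math.cos(math.radians(diff))) / 2.0
        score = prior * alignment
        if score > best_score:
            best_name, best_score = name, score
    return best_name
```

With two hypothetical places, one due north of the car and one due south, a northbound heading selects the northern place even if the southern one has been visited more often, while a heading that favors neither falls back on visit frequency.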

Figure 10. Connected Car Application in Real Time

 

 

Why This Matters

Organizations must adapt to a data-driven model that can enable a greater competitive advantage by speeding up time to results. With IoT and its millions of devices for various use cases, the current analytics model is in dire need of a platform that combines agile application development and deployment with real-time data analytics and machine learning. Pivotal and VCE have joined forces to offer such a platform: one where organizations can develop, deploy, and manage cloud-based applications that autonomously collect, wrangle, and analyze data to improve the overall efficiency and profitability of the business. And by leveraging the industry-proven VCE Vblock Systems converged infrastructure, organizations gain peace of mind that the underlying hardware can be quickly and easily deployed, deliver high levels of performance and availability, and future-proof hardware investments with lower TCO and higher ROI.


The Bigger Truth

As line of business expectations continue to exceed what can be delivered by IT, administrators are looking for ways to help transform the traditional infrastructure into something more agile and autonomous. This is particularly important as it relates to Internet of Things solutions, which require a dynamic environment to address real-time and historic data coming from multiple data sources, whether it be machine log data, mobile phones, vehicles on the road, or even traditional systems of record. Building out such a solution from scratch is doable, but comes at a significant cost, not just from a capital standpoint, but also from an operational standpoint. The requirements for expert personnel, collaboration between business units, management complexity, and support are just a few of the potential pitfalls that can quickly lead to roadblocks or significant delays, not just in initial deployments, but also in ongoing profitability and business changes. Why not select a preconfigured, pretested, industry-proven solution that can easily adjust to business needs in seconds rather than months?

By leveraging a converged platform in VCE Vblock Systems and combining it with EMC Isilon and VMware vSphere, organizations get a fully integrated platform that meets and grows with their needs. Vblock Systems and Isilon deliver a scale-out infrastructure that achieves high levels of reliability, availability, security, and performance. By adding two pieces of software, Pivotal Cloud Foundry and Pivotal Big Data Suite, enterprises become more agile and resilient in developing and deploying modern applications, while adjusting their mindset to be more data-centric. Real-time and historic data analytics with Hadoop and in-memory computing can be used together to gain an always-desired competitive advantage on a rock-solid hardware platform.

ESG Lab has previously validated many VCE solutions with positive results, and this joint solution is no different. The pretested, pre-engineered VCE Vblock System with EMC Isilon greatly simplifies the deployment and management of the hardware resources required for a complete big data analytics solution, specifically one to handle the demands of IoT use cases. ESG Lab used Pivotal Cloud Foundry to deploy a prebuilt application, add data analytics services, and easily scale the application as demands required. A software failure was simulated and the ability to automatically recover in seconds was particularly impressive. When factoring in the Pivotal Big Data Suite and using the IoT use case of a connected car, ESG Lab witnessed the convergence of the VCE hardware and Pivotal software to deliver data analytics in real time based on historical data and machine learning. As a car traveled at varying speeds, an application hosted in Pivotal Cloud Foundry could correctly predict the destination of the vehicle and the time at which it would arrive as the vehicle was viewed in real time on a map.

Together, VCE and Pivotal will enable a more agile business driven by data to accelerate time to results and time to value by transforming your traditional, siloed IT infrastructure into a converged hardware, application, and analytics platform. Take the next steps to modernize your organization, reduce future risk, and improve operational efficiency by evaluating VCE Vblock Systems with EMC Isilon, Pivotal Cloud Foundry, and Pivotal Big Data Suite.


Appendix  

Table 1. Test Bed Environment

 

Element / Configuration

VCE Vblock System 340

VCE Management
• Advanced Management Platform (AMP-2)

Fabric Interconnect
• Cisco UCS 6248UP Fabric Interconnect

Servers
• 16 x Cisco UCS B200 M3

Server Details
• 2 x Intel Xeon E5-2680 v2 (2.8 GHz)
• 256GB memory (16 x 16GB)
• Host local storage: None (diskless; SAN boot)
• 2 x Cisco UCS VIC-1240
• Cisco UCS 5108 Blade Server Chassis

Storage – EMC VNX 7600
• Connectivity: Fibre Channel
• Three RAID5 storage pools configured from a total of 66 400GB SSD drives (Micron P410M400)
  o Pool 1 (4+1): 16 x 50GB LUNs for ESXi SAN boot
  o Pool 2 (8+1): 8 x 500GB LUNs for HAWQ Cluster and Pivotal Cloud Foundry
  o Pool 3 (4+1): 17 x 600GB LUNs for Greenplum
• 8Gb Fibre Channel networking – 16 lanes from switch to VNX

Networking
• Switches: 48 ports Cisco
• A pair of N5K-C5548UP switches capable of 10 Gigabit Ethernet, Fibre Channel, and FCoE
  o Disjoint Layer 2: VM-FEX

EMC Isilon (offered as VCE Technology Extension for EMC Isilon storage)
• HDFS configuration:
  o 128 MB Hadoop block size
• Nodes: 8
• EMC Isilon X410-4U-Dual-256GB-2x1GE-2x10GE SFP+-99TB-2458GB SSD
• OneFS 7.2.0.1
• Dual 10 Gig Ethernet connectivity per Isilon node
• Additional pair of N5K-C5548UP

Pivotal Greenplum
• Pivotal Software Releases
  o Greenplum Database 4.3.5.0
  o Greenplum Web Command Center 1.3.0.0
• VM
  o Master/Web Command Center node: 1
  o Segment node: 8


Hypervisor
• vSphere 5.5 – EMC PowerPath/VE enabled
• vCenter 5.5
• Storage: HDFS for Isilon
  o Isilon HDFS protocol
  o VNX Fibre Channel for shuffle
• vSphere VM-FEX Distributed Switch with multiple VLANs
• Resource Pools: 3
  o Pool 1: Virtualized Greenplum with nine servers
  o Pool 2: HAWQ and Pivotal CF with five servers
  o Pool 3: IoT demo and admin with two servers

Virtual Machines
• Deploy via
  o VM version 10
  o VMware Tools installed
  o HDFS using Isilon HDFS protocol
  o VNX for VM guest OS and shuffle
• Greenplum
  o 128GB memory and 16 vCPUs
• HAWQ
  o 128GB memory and 8 vCPUs

Linux
• RHEL 6.5 x86_64
  o /etc/security/limits.conf
    ▪ soft nofile 2900000
    ▪ hard nofile 2900000
    ▪ soft nproc 131072
    ▪ hard nproc 131072
  o SELINUX disabled (/etc/selinux/config)
  o transparent_hugepage=never
  o Java version "1.7.0_76"
  o HDFS based on Isilon HDFS protocol
  o Shuffle partition formatted with EXT4

Pivotal HAWQ Cluster
• Pivotal Software Releases
  o PHD 2.1.0
  o Pivotal Command Center 2.3.0
  o Pivotal HAWQ 1.2.1.0
• VM
  o Hadoop Manager/Command Center: 1
  o Master node: 1
  o Segment Node: 3
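As a quick sanity check on the VNX storage figures in Table 1, the short Python calculation below totals the LUN capacity provisioned from each pool and compares it with the raw SSD capacity. It uses only the counts and sizes listed in the table; usable capacity after RAID5 parity is not computed because the table does not say how the 66 drives are divided among the three pools.

```python
# LUN count and LUN size in GB per storage pool, as listed in Table 1.
pools = {
    "Pool 1 (ESXi SAN boot)": (16, 50),
    "Pool 2 (HAWQ Cluster and Pivotal Cloud Foundry)": (8, 500),
    "Pool 3 (Greenplum)": (17, 600),
}

# Provisioned capacity per pool is simply count times size.
provisioned_gb = {name: count * size for name, (count, size) in pools.items()}
total_provisioned_gb = sum(provisioned_gb.values())

# 66 SSD drives at 400GB each, before any RAID5 parity overhead.
raw_ssd_gb = 66 * 400

for name, gb in provisioned_gb.items():
    print(f"{name}: {gb} GB provisioned")
print(f"Total: {total_provisioned_gb} GB provisioned of {raw_ssd_gb} GB raw")
```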

 

The goal of ESG Lab reports is to educate IT professionals about data center technology products for companies of all types and sizes. ESG Lab reports are not meant to replace the evaluation process that should be conducted before making purchasing decisions, but rather to provide insight into these emerging technologies. Our objective is to go over some of the more valuable feature/functions of products, show how they can be used to solve real customer problems and identify any areas needing improvement. ESG Lab’s expert third-party perspective is based on our own hands-on testing as well as on interviews with customers who use these products in production environments. This ESG Lab report was sponsored by VCE.

All trademark names are property of their respective companies. Information contained in this publication has been obtained by sources The Enterprise Strategy Group (ESG) considers to be reliable but is not warranted by ESG. This publication may contain opinions of ESG, which are subject to change from time to time. This publication is copyrighted by The Enterprise Strategy Group, Inc. Any reproduction or redistribution of this publication, in whole or in part, whether in hard-copy format, electronically, or otherwise to persons not authorized to receive it, without the express consent of The Enterprise Strategy Group, Inc., is in violation of U.S. copyright law and will be subject to an action for civil damages and, if applicable, criminal prosecution. Should you have any questions, please contact ESG Client Relations at 508.482.0188.