© 2015 by The Enterprise Strategy Group, Inc. All Rights Reserved.
Big Data and Analytic Challenges Remain in the Era of IoT

Over the last year, there has been a major uptick in both interest in and the maturity of IoT. IoT is the network of physical objects, or "things," embedded with electronics, software, and sensors that enable those objects to exchange data with other connected devices. The primary entry point for IoT is the need to extend existing data analytics engagements to drive greater value from digital assets that combine device data with other enterprise data. It should be noted that though IoT is taking shape as an extension of big data and analytics initiatives, non-analytics-oriented solutions also fall under the IoT umbrella by automatically making decisions and completing transactions based on a set of business rules.
Despite the business interest and technological advancements in IoT, the exponential growth and increasing variety of data remain key challenges holding organizations back from making the IoT leap and transforming into a data-driven business. How long will it take until insight is achieved? What new tools will be required to efficiently reason over and analyze these massive data sets? How much storage is needed? Who needs to be involved? These are just some of the questions respondents to a recent ESG survey have asked since deploying a data analytics solution. The most often-cited challenge they face is data integration: with multiple data sources that must be transformed, cleansed, wrangled, and joined, it is no surprise that 38% of respondents believe that data integration is complex.1
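The transform/cleanse/join burden that respondents cite can be illustrated with a minimal sketch: two feeds with mismatched schemas are cleansed and joined on a shared device key before analysis. All field names and values here are invented for illustration.

```python
# Hypothetical device feed: inconsistent key formatting, string-typed values.
sensor_readings = [
    {"DeviceID": "dev-001 ", "temp_f": "98.6"},   # note trailing space
    {"DeviceID": "dev-002", "temp_f": "101.2"},
]
# Hypothetical enterprise asset registry with its own key convention.
asset_registry = [
    {"device_id": "DEV-001", "site": "Plant A"},
    {"device_id": "DEV-002", "site": "Plant B"},
]

def normalize_key(raw):
    """Cleanse a device identifier so both sources agree on the join key."""
    return raw.strip().lower()

registry_by_id = {normalize_key(r["device_id"]): r for r in asset_registry}

def integrate(readings):
    """Join wrangled sensor data with enterprise asset data."""
    joined = []
    for row in readings:
        key = normalize_key(row["DeviceID"])
        site = registry_by_id.get(key, {}).get("site", "unknown")
        joined.append({"device": key, "temp_f": float(row["temp_f"]), "site": site})
    return joined

rows = integrate(sensor_readings)
```

Even this toy case needs key normalization, type coercion, and a fallback for unmatched records — the kind of work that multiplies across real data sources.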
Figure 1. Top Six Requirements for Driving New BI/Analytics Solution Evaluations
Source: Enterprise Strategy Group, 2015.
1 Source: ESG Research Report, Enterprise Big Data, Business Intelligence, and Analytics Trends, January 2015.
Which of the following requirements are most responsible for driving the evaluation of new BI/analytics solutions? (Percent of respondents, N=168, three responses accepted)

• Organization is moving towards more real-time analytics – 27%
• New applications are generating new data types that need different analytical tools – 27%
• Current BI/analytics solution(s) do not meet requirements/needs – 26%
• Need to combine analytics for structured and unstructured data – 24%
• Cost reduction of existing platforms – 24%
• Organization is moving toward more predictive and discovery analytics – 24%
ESG Lab Review
An Internet of Things (IoT) Platform: Pivotal Cloud Foundry and Pivotal Big Data Suite on VCE Vblock Systems with EMC Isilon
Date: August 2015 Author: Mike Leone, ESG Lab Analyst
Abstract: This ESG Lab Review documents hands-on testing of the ability of the VCE Vblock System and the VCE Technology Extension for EMC Isilon storage to drive value in an Internet of Things (IoT) real-time analytics environment with Pivotal Cloud Foundry and Pivotal Big Data Suite. Testing focused on the functionality, deployment, and simplicity of running Pivotal software on a converged VCE solution, which is designed to provide a scalable, flexible, end-to-end platform for IoT and analytics solutions.
With 45% of large midmarket and enterprise-class organizations planning to deploy a new big data and BI/analytics solution in 2015, and 39% selecting Hadoop-based solutions as one of their planned infrastructure approaches, ESG asked respondents what requirements were most responsible for driving the evaluation of a new BI/analytics solution. As shown in Figure 1, the two most-cited responses, identified by 27% of respondents each, were that the organization is moving towards more real-time analytics, and that new applications are generating new data types that need different analytical tools.2 It should be noted that 99% of the respondents already had a BI/analytics solution in place. In other words, adaptability and agility were overlooked by many early adopters of big data technology. This has led organizations to rethink their data, application, and infrastructure strategies to not only solve big data analytics challenges, but also better prepare themselves for IoT success.
Challenges Moving Forward in an IoT World

From seismic sensors and mobile phones to refrigerators and cars, the amount of data being collected on a minute-by-minute basis is massive in both the consumer and industrial worlds. This is often said to be IoT, but data collection is just one piece of it. There is also the necessity to analyze the data to make better decisions based on real-time and historical information. Oftentimes, this analysis can be done without human interaction, leveraging machine learning to make recommendations based on the data. This combination of data collection, automated data wrangling, and data analysis is IoT.
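The collect-wrangle-analyze loop described above can be sketched in miniature: raw records arrive, malformed ones are discarded during wrangling, and a rules-based decision is made with no human interaction. The thresholds and field names here are hypothetical.

```python
import json

# Simulated incoming device records, including one malformed payload.
raw_events = ['{"sensor": "engine", "temp_c": "105"}',
              '{"sensor": "engine", "temp_c": "88"}',
              'not-json']

def wrangle(events):
    """Parse and validate incoming records, discarding malformed ones."""
    clean = []
    for e in events:
        try:
            rec = json.loads(e)
            rec["temp_c"] = float(rec["temp_c"])
            clean.append(rec)
        except (ValueError, KeyError):
            continue  # automated wrangling drops unusable data
    return clean

def analyze(records, limit_c=100.0):
    """Rule-based decision: flag readings that exceed the threshold."""
    return ["inspect" if r["temp_c"] > limit_c else "ok" for r in records]

decisions = analyze(wrangle(raw_events))
```

A production pipeline would replace the fixed threshold with a learned model, but the shape of the loop is the same.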
With new devices and new interfaces generating such a large variety of data, data wrangling becomes a major concern, which falls in line with the previously mentioned challenges that many organizations experience with traditional BI/analytics solutions. Because of this, organizations have been forced to hand-code one-off solutions to address the time it takes to wrangle all of the data. This is often done without an underlying platform, which not only consumes a significant amount of expert personnel and data scientists' time, but also introduces the potential for bugs or issues in the customized software due to human error.
From a software standpoint, numerous open source technologies exist that can address the requirements of IoT quite well when used together effectively. The problem lies in the fact that this approach can be costly. Independent experts in each technology are often required not only to make a solution work, but also to manage and support it. With open source technologies often comes the idea of leveraging commodity hardware to house the various pieces of software. Though organizations will obviously save on CapEx with that approach, relying on commodity hardware with minimal support to house crucial data can serve as an instant deterrent. The need for a platform-based approach has never been more apparent.
VCE Vblock System with EMC Isilon Storage, Pivotal Cloud Foundry, and Pivotal Big Data Suite

The VCE Vblock System is a well-known and extensively deployed integrated computing platform (ICP) that combines best-of-breed technologies from industry-leading vendors. VCE offers a range of Vblock Systems that businesses can choose from to suit their workload needs. With Cisco compute and networking, EMC storage and data protection, and VMware virtualization and management, Vblock Systems are designed to make it simpler and quicker for organizations to deploy a complete IT platform that has been pre-integrated, pre-validated, and pretested, with release certification matrices (RCM) serving as the basis for VCE seamless support.
VCE Technology Extension for EMC Isilon Storage

Building on the success of VCE's integrated computing platform, VCE released the VCE Technology Extension for EMC Isilon storage to extend the business value of VCE. With analytics being a component of the dominant use cases for VCE, organizations can now deploy Hadoop distributions and tools from the Hadoop ecosystem, along with dependent applications and databases, to provide end-to-end analytics on Vblock Systems with EMC Isilon. Benefits include:
2 Ibid.
• Flexible scale-out capacity and performance from Vblock Systems with EMC Isilon to avoid traditional scale-up limitations in big data analytics environments. This scale-out approach provides improved data protection, data access, resiliency, high availability, manageability, and cost savings.
• Augmenting the structured data analytics storage environment with native HDFS support from EMC Isilon, where unstructured data can be analyzed in place, eliminating the need for ingest or staging. This complements the highly prevalent structured data technology built into existing Vblock Systems with EMC Symmetrix VMAX, EMC VNX, and EMC XtremIO storage systems, and supports deploying other analytics such as SAS or Splunk. VCE also provides the VCE Technology Extension for Cisco Compute as a direct-attached storage (DAS) alternative, especially for entry-level Hadoop deployments or massively parallel processing (MPP) databases.
• Predictable high availability and reliability for extremely large data sets. As data sets reach petabyte scale, the pre-engineered Vblock Systems with scale-out Isilon storage offer high levels of component redundancy, helping to eliminate single points of failure.
• Data protection with technologies from EMC, including EMC Avamar, EMC Data Domain, EMC RecoverPoint, and EMC VPLEX. EMC Isilon takes the integrated data protection of the complete solution a step further by offering additional capabilities such as data-at-rest encryption for files, snapshots, remote replication, and NDMP backups.
• Networking built for Hadoop: Hadoop workloads can be network-heavy, and the ability to choose and architect leading networking technologies, in contrast with fixed-configuration servers and appliances, is key to success. Vblock Systems can employ Cisco Nexus 9000 Series Application Centric Infrastructure switches to provide application-driven automation, open software support, and hardware-based multi-tenancy.
With the recent expansion of its portfolio, VCE also offers VxBlock Systems designed to present additional choices to customers. Customers can choose network virtualization solutions from VMware NSX or Cisco ACI. Vblock Systems and VxBlock Systems can also be combined with VCE Technology Extensions and VCE Vscale Blocks. Under the VCE Vscale Architecture, organizations can choose the right building blocks based on workload, performance and scaling requirements, and budgetary demands.
Pivotal

Organizations can leverage the VCE Vblock System with EMC Isilon; VCE Vision Intelligent Operations; and Pivotal data, application, and analytics solutions to help streamline software application provisioning, deployment, management, and analytics. Deploying Pivotal solutions on Vblock Systems provides organizations with greater agility by standardizing on a proven, converged hardware platform to help accelerate time to value. By converging all investments in applications, analytics, and data into a joint VCE and Pivotal solution, organizations gain flexibility, scalability, and efficiency, while maximizing their return on investment. Additional benefits include:
• Achieving elasticity, scalability, and reliability with Pivotal Cloud Foundry on Vblock Systems utilizing best-in-class servers, networks, and storage.
• Ensuring mission-critical readiness of Hadoop with Pivotal HD, HAWQ, and EMC Isilon to achieve multi-tenancy, Hadoop Distributed File System (HDFS) integration, privacy, and reliability.
• Driving operational efficiency and performance improvements with the Pivotal Greenplum Database and its MPP in communication with external databases and Hadoop.
• Improving business outcomes by leveraging Pivotal GemFire to analyze large quantities of data at scale in the cloud.
• Deploying VCE Data Protection, including EMC Avamar, EMC Data Domain, and EMC VPLEX.
• Mixing and matching future deployments and upgrading compute, network, and storage resources as business demands evolve and improved technology becomes available.
Figure 2. VCE Vblock System with VCE Technology Extension for EMC Isilon and Pivotal
Pivotal Cloud Foundry
Pivotal Cloud Foundry is a cloud-native application platform that delivers an easy-to-consume, ready-to-use cloud computing environment with application services. An existing, virtualized public or private infrastructure is used to host the platform, streamlining the provisioning, deployment, and management of cloud-based applications. Capabilities include:
• Operational Visibility and Control – Manage and monitor the entire platform from a single pane of glass.
• Security – Isolate applications with containers, apply security groups to restrict connections, apply role-based access control, and audit platform activities.
• Application Portability – Standardize application deployments with containers, enabling application migrations between compatible infrastructures provisioned in VMware vSphere, OpenStack, and Amazon Web Services.
• Automation – Simplify management tasks with centrally managed services, while leveraging deployment templates and tools to easily configure and manage virtualized resources.
• Services – Offer customers a self-‐service catalog of databases, analytics, and middleware technology, including Pivotal Big Data Suite Services.
Pivotal Big Data Suite
The Pivotal Big Data Suite is a software stack focused on advanced data analytics and in-memory processing for data-driven organizations. The product suite is a collection of Pivotal's data technologies that enable organizations to build a solid infrastructure for storage and data processing on Hadoop, leverage advanced analytics for deeper insights, and develop and deploy scalable, data-centric applications with distributed, in-memory data stores.
As shown in Figure 3, eight components work together to form the Pivotal Big Data Suite, five of which are also available as services on Pivotal Cloud Foundry.
• Pivotal HD – A Hadoop distribution based on the core Open Data Platform (ODP) optimized for batch processing workloads.
• Pivotal Greenplum Database – An analytical, massively parallel processing database.
• Pivotal HAWQ – A massively parallel processing, ANSI-compliant SQL-on-Hadoop query engine.
• Pivotal GemFire – A high-performing, distributed, in-memory NoSQL database.
• Spring XD – An open source, distributed framework for data ingestion, batch processing, and data flows.
• Spark – An open source cluster computing framework for quickly processing many types of data, supported via Pivotal HD.
• Redis – A scalable, open source key-value store and data structure server.
• RabbitMQ – A scalable, open source message queue for applications.
Figure 3. Pivotal Big Data Suite
ESG Lab Tested

ESG Lab leveraged a VCE Vblock System 340 and the VCE Technology Extension for EMC Isilon storage to test and validate the functionality and workflows of Pivotal Cloud Foundry and Pivotal Big Data Suite as they relate to deployments in IoT environments.
Simplicity and Operational Efficiency with Pivotal Cloud Foundry
Pivotal has taken the application lifecycle from months to hours and minutes by building on the initial benefits of a cloud-native application platform, like developer agility, and focusing on additional operational benefits. Every application deployed on Pivotal Cloud Foundry can take advantage of these benefits as they relate to individual application requirements. When applications are deployed, they are dynamically routed to a fault-tolerant array of software load balancers to allow for instant scaling and zero downtime. In fact, applications are deployed into dual availability zones to tolerate a 50% hardware capacity loss without downtime. Application and user events are fed into a single log drain, where tools like Splunk can be leveraged to mine the data for information and troubleshooting purposes. Role-based access and policy management are integrated directly into the main dashboard, while application performance management is currently in beta with a release date in the near future.
ESG Lab leveraged a simulated environment that consisted of a VCE Vblock System as the underlying hardware to test the functionality and simplicity of Pivotal Cloud Foundry when deploying and scaling a new application. Testing began by logging into the main developer console. An organization called AlliancesLab had previously been created, which
consisted of four “spaces” or business units. The console summarized each space and highlighted the applications, services, and team members for each particular space. Figure 4 shows the console interface.
Figure 4. Pivotal Cloud Foundry Main Console
The production space was selected, which had no applications or services deployed or running. Using the Cloud Foundry command line utility, ESG Lab navigated to a sample application with recently completed source code. After checking the endpoints, of which there can be multiple, and logging in, the sample application was pushed to the production space. The push command kicks off a series of events: creating the route, binding the route to the application, and then downloading all of the components of the application. For example, if the application is a Java application, the JDK is downloaded. These components are compiled together using a Cloud Foundry buildpack. Buildpacks enable enterprises to automatically package and run multiple runtimes and frameworks (Go, Ruby, Java, Spring, etc.) for a particular application. The buildpack output is packaged into what Pivotal calls a "droplet," which is uploaded to Pivotal's platform layer, where the application starts and is ported to the droplet application engine.
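The push workflow above is typically driven by an application manifest checked in alongside the source code. A minimal, hypothetical manifest for a Java application might look like the following; the application name, memory size, and artifact path are illustrative, not taken from the tested environment:

```yaml
# Hypothetical manifest.yml consumed by `cf push`; all names are illustrative.
applications:
- name: orders-dashboard
  memory: 512M
  instances: 1
  buildpack: java_buildpack        # selects the runtime/framework packaging
  path: target/orders-dashboard.jar
```

With a manifest in place, a bare `cf push` repeats the same deployment without retyping options.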
Next, ESG Lab transitioned back to the management console, where the newly deployed application could be viewed along with the configuration details and the automatically generated routing information. By clicking on the route, the application was launched (shown in Figure 5). The application simulated a retailer who was interested in tracking her orders across the United States and having the data actively streamed in real time to the displayed map.
Figure 5. Viewing a Deployed Application
For the application to function properly, a service had to be added and bound to it. Navigating back to the main console, ESG Lab clicked the Marketplace, which displayed the services available to the organization. RabbitMQ for Pivotal CF was selected, which then displayed the available service plans. This allows companies that want to be service providers to build multiple plans and apply various charging models. The only available plan was selected; after supplying an instance name, selecting the space, and selecting the application to which to bind the service, the service was added to the application. Figure 6 displays the process of adding a new service to an already deployed application. After restarting the application, the application was launched and the data stream was started. The map began changing colors based on the number of orders occurring, and ESG Lab could click on different states to gain a more granular view of the number of orders at a particular time.
Figure 6. Adding a New Service
Next, ESG Lab wanted to scale the application. This was done through the command line by issuing a "scale" command and specifying the application and the number of instances desired. With a single command, two additional instances were added to the deployed application. Scaling can also be done via the management interface, which additionally provides the ability to increase the memory and disk limits of each instance.
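Assuming the cf command-line syntax of this era, the push, bind, and scale steps walked through above take roughly the following shape. The application and service-instance names are invented for illustration, and exact flags may differ by CLI version:

```shell
$ cf push orders-app                               # deploy; a route is created and bound
$ cf create-service p-rabbitmq standard orders-mq  # service instance from a marketplace plan
$ cf bind-service orders-app orders-mq             # bind the service to the application
$ cf restart orders-app                            # restart so the app picks up the binding
$ cf scale orders-app -i 3                         # scale out to three instances
```

The `-m` and `-k` flags on `cf scale` adjust per-instance memory and disk, mirroring the limits exposed in the management interface.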
The final phase of testing focused on a failure scenario. Pivotal Cloud Foundry has built-‐in application monitoring components to meet high availability and recovery requirements. In particular, the health monitor interacts with the cloud controller to verify the configuration and deployment expectations of an application. If the cloud controller reports something back that doesn’t meet those expectations, instances of the application will be automatically
restarted to rectify the inconsistencies. ESG Lab simulated a failure in the application caused by a software bug. After the failure occurred, it was reflected in the management interface (shown in Figure 7). Within seconds, the application went from a down state to up and running again.
Figure 7. Simulating a Failure
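The health-monitor behavior described above is a form of desired-state reconciliation: compare the instance count the cloud controller expects with what is actually running, and restart the difference. The following is a toy model of that loop, not Pivotal's implementation; the function and action names are invented.

```python
def reconcile(expected_instances, running_ids):
    """Return the restart actions needed to bring an app back to its expected state.

    expected_instances: instance count the controller expects.
    running_ids: identifiers of instances currently reported as running.
    """
    actions = []
    missing = expected_instances - len(running_ids)
    for _ in range(max(0, missing)):
        actions.append("restart-instance")  # one restart per missing instance
    return actions

# One of three instances crashed; the monitor schedules exactly one restart.
actions = reconcile(3, ["inst-0", "inst-2"])
```

Because the check is declarative (desired versus actual), the same loop handles a crash, a bad deploy, or an operator killing an instance by hand.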
Why This Matters

Given the direction the technology industry is moving, business agility is essential not only to adapt to market conditions, but also to remain competitive. The business is pressuring IT to deliver dependable services that meet its requirements faster than ever. This is especially true as software continues to be viewed as the primary way to gain competitive advantage. The problem is that business expectations far exceed what IT can deliver. Developers are focused on improving their own experience with an agile, iterative process to better consume open source technology and new data services, while application operators are focused on continuously delivering application uptime, scalability, and consistency. An approach that aligns developer goals and operator goals is crucial to meeting the lofty expectations of the line of business.
ESG Lab validated that combining industry-proven VCE technology with the power of Pivotal Cloud Foundry delivers an enterprise-ready platform as a service that enables organizations to develop, deploy, and maintain an agile business that can adjust in real time to meet business needs. A sample application was quickly deployed in Pivotal Cloud Foundry with a simple push command. Services were added to the application from the available marketplace with just a few mouse clicks to enable the streaming of real-time events. The application was scaled up via a single command from the command line, and could also have been scaled directly from the management console. ESG Lab was particularly impressed with the high-availability capabilities: a simulated failure caused the application to go down, the Pivotal software quickly detected the failure, and within seconds the application was back up and running in a fully functional state.
Delivering Internet of Things Solutions with Pivotal Data Solutions and VCE
The Pivotal Big Data Suite offers software to ease organizations' transition from a reactive data analytics model to a more proactive, self-improving, machine learning approach. With the Pivotal Big Data Suite, when data is ingested from multiple data sources, as is common in many IoT scenarios, a data stream pipeline is leveraged to automatically wrangle the data and move it to ideal locations in the overall analytics solution. This includes streaming data in, parsing it, making sure it is in the right format, and then understanding where to send it. Historical data is placed in the data lake with HDFS, which is where VCE Vblock Systems with EMC Isilon fit in, while real-time data leverages the speed and low latency of an in-memory datastore. Data from both the data lake and the in-memory system are then leveraged by the machine learning portion of the overall solution to continuously learn, improve, adapt, and better identify trends.
ESG Lab simulated an IoT scenario commonly referred to as the connected car: a device connected to a car tracks speed, location, miles per gallon, engine temperature, and more. The connected car architecture, shown in Figure 8, combines Pivotal Big Data Suite and Pivotal Cloud Foundry with the VCE Vblock System and EMC Isilon into a fine-tuned IoT architecture based on a data streaming reference architecture. Details of the exact test bed can be found in the Appendix.
First, the car data is transmitted to the stream processing engine. Spring XD is used to orchestrate and automate all the steps of the data stream pipeline by leveraging dozens of built-in connectors to ingest and sink data; process that data via filtering, splitting, and transformation with tools like Spark; and analyze the data with tools like Python or R to help determine where the data should reside, whether in a data lake or in an in-memory database. For data sunk to the data lake, the VCE Vblock System with EMC Isilon serves as the Hadoop-based solution running Pivotal HD. Using Pivotal HAWQ, data residing in the data lake can be queried with SQL for SQL-based advanced analytics, while the Pivotal Greenplum Database serves as the analytical, massively parallel processing database. For data that can be analyzed in real time, Pivotal GemFire serves as the highly distributed NoSQL database, providing scalable, low-latency, real-time data access, storage, and event processing. Pivotal Cloud Foundry is leveraged as the platform running the application, and the previously mentioned services collect all the needed data, which is eventually pushed to mobile devices and end-users using Pivotal Cloud Foundry mobile services.
Figure 8. IoT Architecture for a Connected Car
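In Spring XD terms, the orchestration described above is expressed as named streams built from source, processor, and sink modules in the XD shell. A simplified, hypothetical pair of streams for the car feed might look like this; `http`, `hdfs`, and `gemfire-json-server` are standard Spring XD modules, while the stream names and region name are illustrative:

```
xd:> stream create --name car-ingest --definition "http | hdfs" --deploy
xd:> stream create --name car-hot --definition "tap:stream:car-ingest > gemfire-json-server --regionName=carPositions" --deploy
```

The first stream sinks every posted record to the data lake on Isilon; the tap stream copies the same feed into a GemFire region for low-latency, real-time access without disturbing the primary pipeline.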
A simulated application that tracked a moving vehicle and used historical and real-‐time data to predict a destination was used for testing. Speed, miles per gallon, and the final destination were displayed in real time as the location of the vehicle was tracked on a map. As shown in Figure 9, a background script was used to generate the simulated car data. Once the data stream started via the Spring XD command line, ESG Lab logged in to various management interfaces to view real-‐time performance metrics of the application.
Figure 9. Launching the Data Stream and Managing Performance
Finally, the application was viewed, which displayed the moving car on a map and car metrics as the car traveled from one destination to another. Also shown on the map were the potential destinations of the driver. As the car navigated in a certain direction (real-‐time data), the destination was correctly predicted based on historical data of frequently visited locations.
Figure 10. Connected Car Application in Real Time
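The prediction behavior described above — historical visit frequencies narrowed by real-time heading — can be sketched in a few lines. The locations, visit counts, and scoring rule here are all invented; the production system would use a trained model rather than a frequency lookup.

```python
# Historical data: how often the driver has visited each destination.
history = {"office": 120, "gym": 45, "airport": 3}

# Real-time data: destinations consistent with the car's current heading.
compatible_with_heading = {"office", "airport"}

def predict(history, candidates):
    """Pick the historically most-visited destination consistent with real-time data."""
    viable = {dest: visits for dest, visits in history.items() if dest in candidates}
    return max(viable, key=viable.get) if viable else None

destination = predict(history, compatible_with_heading)
```

As the car turns, the candidate set shrinks and the prediction updates — the same interplay of historical and real-time data the connected-car application demonstrated.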
Why This Matters

Organizations must adapt to a data-driven model that can enable a greater competitive advantage by speeding up time to results. With IoT and its millions of devices across various use cases, the current analytics model is in dire need of a platform that combines agile application development and deployment with real-time data analytics and machine learning. Pivotal and VCE have joined forces to offer such a platform: one where organizations can develop, deploy, and manage cloud-based applications that autonomously collect, wrangle, and analyze data to improve the overall efficiency and profitability of the business. And by leveraging the industry-proven VCE Vblock Systems converged infrastructure, organizations gain peace of mind that the underlying hardware can be deployed quickly and easily, delivers high levels of performance and availability, and future-proofs hardware investments with lower TCO and higher ROI.
The Bigger Truth

As line-of-business expectations continue to exceed what IT can deliver, administrators are looking for ways to transform the traditional infrastructure into something more agile and autonomous. This is particularly important for Internet of Things solutions, which require a dynamic environment to address real-time and historical data coming from multiple data sources, whether machine log data, mobile phones, vehicles on the road, or even traditional systems of record. Building out such a solution from scratch is doable, but comes at a significant cost, not just from a capital standpoint, but from an operational one as well. The requirements for expert personnel, collaboration between business units, management complexity, and support are just a few of the potential pitfalls that can quickly lead to roadblocks or significant delays, not just in initial deployments, but in ongoing profitability and business change as well. Why not select a preconfigured, pretested, industry-proven solution that can adjust to business needs in seconds rather than months?
By leveraging a converged platform in VCE Vblock Systems and combining it with EMC Isilon and VMware vSphere, organizations get a fully integrated platform that meets and grows with their needs. Vblock Systems and Isilon deliver a scale-out infrastructure that achieves high levels of reliability, availability, security, and performance. By adding two additional pieces of software in Pivotal Cloud Foundry and Pivotal Big Data Suite, enterprises become more agile and resilient in developing and deploying modern applications, while adjusting their mindset to be more data-centric. Real-time and historical data analytics with Hadoop and in-memory computing can be used together to gain an always-desired competitive advantage on a rock-solid hardware platform.
ESG Lab has previously validated many VCE solutions with positive results, and this joint solution is no different. The pretested, pre-engineered VCE Vblock System with EMC Isilon greatly simplifies the deployment and management of the hardware resources required for a complete big data analytics solution, specifically one built to handle the demands of IoT use cases. ESG Lab used Pivotal Cloud Foundry to deploy a prebuilt application, add data analytics services, and easily scale the application as demands required. A software failure was simulated, and the ability to automatically recover in seconds was particularly impressive. Factoring in the Pivotal Big Data Suite and the IoT use case of a connected car, ESG Lab witnessed the convergence of VCE hardware and Pivotal software to deliver data analytics in real time based on historical data and machine learning. As a car traveled at varying speeds, an application hosted in Pivotal Cloud Foundry correctly predicted the destination of the vehicle, and the time at which it would arrive, as the vehicle was viewed in real time on a map.
Together, VCE and Pivotal enable a more agile, data-driven business that accelerates time to results and time to value by transforming a traditional, siloed IT infrastructure into a converged hardware, application, and analytics platform. Take the next steps to modernize your organization, reduce future risk, and improve operational efficiency by evaluating VCE Vblock Systems with EMC Isilon, Pivotal Cloud Foundry, and Pivotal Big Data Suite.
ESG Lab Review: IoT Platform: Pivotal Cloud Foundry and Pivotal Big Data Suite on VCE Vblock Systems with EMC Isilon
Appendix
Table 1. Test Bed Environment

VCE Vblock System 340

VCE Management
• Advanced Management Platform (AMP-2)

Fabric Interconnect
• Cisco UCS 6248UP Fabric Interconnect

Servers
• 16 x Cisco UCS B200 M3

Server Details
• 2 x Intel Xeon E5-2680 v2 (2.8 GHz)
• 256 GB memory (16 x 16 GB)
• Host local storage: none (diskless; SAN boot)
• 2 x Cisco UCS VIC-1240
• Cisco UCS 5108 Blade Server Chassis

Storage – EMC VNX 7600
• Connectivity: Fibre Channel
• Three RAID 5 storage pools configured from a total of 66 x 400 GB SSD drives (Micron P410M)
  o Pool 1 (4+1): 16 x 50 GB LUNs for ESXi SAN boot
  o Pool 2 (8+1): 8 x 500 GB LUNs for HAWQ cluster and Pivotal Cloud Foundry
  o Pool 3 (4+1): 17 x 600 GB LUNs for Greenplum
• 8 Gb Fibre Channel – 16 lanes from switch to VNX

Networking
• Switches: a pair of 48-port Cisco N5K-C5548UP capable of 10 Gigabit Ethernet, Fibre Channel, and FCoE
  o Disjoint Layer 2: VM-FEX

EMC Isilon (offered as VCE Technology Extension for EMC Isilon storage)
• Nodes: 8
• EMC Isilon X410-4U-Dual-256GB-2x1GE-2x10GE SFP+-99TB-2458GB SSD
• OneFS 7.2.0.1
• HDFS configuration: 128 MB Hadoop block size
• Dual 10 Gigabit Ethernet connectivity per Isilon node
• Additional pair of N5K-C5548UP

Pivotal Greenplum
• Pivotal software releases
  o Greenplum Database 4.3.5.0
  o Greenplum Web Command Center 1.3.0.0
• VMs
  o Master/Web Command Center node: 1
  o Segment nodes: 8

Hypervisor
• vSphere 5.5 with EMC PowerPath/VE enabled
• vCenter 5.5
• Storage
  o Isilon HDFS protocol for HDFS
  o VNX Fibre Channel for shuffle
• vSphere VM-FEX distributed switch with multiple VLANs
• Resource pools: 3
  o Pool 1: virtualized Greenplum with nine servers
  o Pool 2: HAWQ and Pivotal CF with five servers
  o Pool 3: IoT demo and admin with two servers

Virtual Machines
• Deployed via
  o VM hardware version 10
  o VMware Tools installed
  o HDFS using the Isilon HDFS protocol
  o VNX for VM guest OS and shuffle
• Greenplum: 128 GB memory and 16 vCPUs
• HAWQ: 128 GB memory and 8 vCPUs

Linux
• RHEL 6.5 x86_64
• /etc/security/limits.conf
  o soft nofile 2900000
  o hard nofile 2900000
  o soft nproc 131072
  o hard nproc 131072
• SELinux disabled (/etc/selinux/config)
• transparent_hugepage=never
• Java version 1.7.0_76
• HDFS based on the Isilon HDFS protocol
• Shuffle partition formatted with ext4

Pivotal HAWQ Cluster
• Pivotal software releases
  o PHD 2.1.0
  o Pivotal Command Center 2.3.0
  o Pivotal HAWQ 1.2.1.0
• VMs
  o Hadoop Manager/Command Center node: 1
  o Master node: 1
  o Segment nodes: 3
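The Linux tuning listed in Table 1 can be applied with a few standard commands. This is a hedged sketch, not the lab's actual provisioning script: the limits.conf domain field ("*", i.e., all users) is an assumption, since the table does not show which user or group the limits were applied to.

```shell
# Raise file-descriptor and process limits (domain "*" is an assumption;
# the table does not specify the user these limits were set for):
cat >> /etc/security/limits.conf <<'EOF'
* soft nofile 2900000
* hard nofile 2900000
* soft nproc 131072
* hard nproc 131072
EOF

# Disable SELinux (takes effect after reboot):
sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config

# Disable transparent huge pages for the running system:
echo never > /sys/kernel/mm/transparent_hugepage/enabled
```

Disabling transparent huge pages and raising the nofile/nproc limits are common prerequisites for Hadoop and Greenplum deployments, which is consistent with the configuration shown in the table.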
The goal of ESG Lab reports is to educate IT professionals about data center technology products for companies of all types and sizes. ESG Lab reports are not meant to replace the evaluation process that should be conducted before making purchasing decisions, but rather to provide insight into these emerging technologies. Our objective is to go over some of the more valuable features and functions of products, show how they can be used to solve real customer problems, and identify any areas needing improvement. ESG Lab's expert third-party perspective is based on our own hands-on testing as well as on interviews with customers who use these products in production environments. This ESG Lab report was sponsored by VCE.
All trademark names are property of their respective companies. Information contained in this publication has been obtained from sources The Enterprise Strategy Group (ESG) considers to be reliable but is not warranted by ESG. This publication may contain opinions of ESG, which are subject to change from time to time. This publication is copyrighted by The Enterprise Strategy Group, Inc. Any reproduction or redistribution of this publication, in whole or in part, whether in hard-copy format, electronically, or otherwise to persons not authorized to receive it, without the express consent of The Enterprise Strategy Group, Inc., is in violation of U.S. copyright law and will be subject to an action for civil damages and, if applicable, criminal prosecution. Should you have any questions, please contact ESG Client Relations at 508.482.0188.