Paul Morse | Hadoop, Hadoop Desktop Cluster, CriKit, Copyright 2012 – All Rights
Reserved | September 30, 2012
Hadoop Desktop Cluster: TIME FOR HADOOP ON YOUR DESK!
Hadoop Desktop Cluster - CriKit
COPYRIGHT 2012 – PAUL MORSE – ALL RIGHTS RESERVED PAGE 1
Hadoop is Hot
The number of organizations that want to investigate and use Hadoop and other “Big
Data” solutions is growing rapidly. It has been claimed that 50% of the world’s data will be
held in Hadoop by 2015. That seems like an aggressive prediction, but there is no doubt
that the entire “Big Data” sector is expanding quickly, with no end in sight in the near
term. One of the major inhibitors to broad adoption is the high cost of the hardware needed
to test and evaluate these new technologies. What is needed is a low-cost computing
environment in which to test them and determine whether they are viable
solutions for the organization. There are, of course, many options from public cloud
providers and SaaS providers that have built competent solutions around Hadoop and
other Big Data technologies, but many organizations are reluctant to put their private data
in public environments simply to try out these new solutions. Further, it seems all the
major hardware vendors are standing by to offer solutions costing hundreds of thousands of
dollars, coupled with services offerings that rival or exceed the hardware costs, just to
take a look at the technology.
Many organizations on a tight budget need a compact, low-wattage,
multi-node sandbox in which to test basic functionality and decide whether a larger-scale
environment is warranted for pursuing Hadoop further. Test small, go large if it works for you.
Enter CriKit
One option for testing big data solutions is CriKit, a Desktop Private Cloud. Originally
created for testing and running Private Cloud software from a variety of companies, it is
well suited to budget-conscious organizations that want to test the functionality of Hadoop
or Cassandra, along with add-on products such as DataMeer, Tableau and
many others.
Why CriKit?
CriKit is a desktop cluster solution designed to combine
compact size, low wattage and high compute power, with future reusability in mind. Because
the system is broken into discrete compute nodes rather than housed in a proprietary case, the
MicroServers can be reassigned as desktop devices when they have outlived their
usefulness as servers. This extends the useful life of the MicroServers by 5 to 7 years. A dual
hard drive CriKit MicroServer costs roughly $1,500.00 USD. If it has a multi-role useful life
of 10 years, that equates to approximately 41 cents a day per node, or less than $2.00 a day
for the 4 compute nodes in the base CriKit system. This cost compares very favorably
with larger, more costly on-premise solutions and Public Cloud offerings from a variety
of cloud vendors. Further, CriKit can be used in a hybrid cloud configuration in which
Hadoop experts tune their Hadoop environment for efficiency locally, then
burst to Public Clouds for large scale processing. This helps ensure that organizations
minimize their Public Cloud Hadoop computing spend, where simple mistakes can be
very costly.
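The per-day figures quoted above can be checked with a short calculation. The sketch below is illustrative only: the node price, useful life and node count come from the text, while the 365-day year, the straight-line spread of cost and the function name are assumptions. It ignores power, the switch, the KVM and the workstation.

```python
# Figures from the text; DAYS_PER_YEAR and straight-line costing are assumptions.
NODE_COST_USD = 1500.00      # dual hard drive CriKit MicroServer
USEFUL_LIFE_YEARS = 10       # multi-role useful life (server, then desktop)
NODES_IN_BASE_SYSTEM = 4     # compute nodes in the base CriKit system
DAYS_PER_YEAR = 365


def cost_per_day(node_cost, years, days_per_year=DAYS_PER_YEAR):
    """Spread the purchase price of one node evenly over its useful life."""
    return node_cost / (years * days_per_year)


per_node = cost_per_day(NODE_COST_USD, USEFUL_LIFE_YEARS)
per_cluster = per_node * NODES_IN_BASE_SYSTEM

print(f"Per node:   ${per_node:.2f} per day")
print(f"Four nodes: ${per_cluster:.2f} per day")
```

Under these assumptions the result is about $0.41 per node per day and about $1.64 per day for the four compute nodes, consistent with the "less than $2.00 a day" figure.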
What is in a CriKit?
CriKit was designed to be very simple and to provide the compute power, networking and
management necessary to build private clouds or data clusters. CriKit MicroServers can
easily be added to increase the processing capability of the cluster, and the incremental
cost of each added MicroServer is small compared with larger, proprietary offerings. A
desktop CriKit environment for a minimal 4-node Hadoop cluster contains:
Computing Nodes - CriKit contains 4 energy-efficient compute nodes, each with an
Intel Server Motherboard, a 64-bit Intel Xeon Server CPU, 16 GB of RAM, dual 1 Gb
Ethernet Network Interface Controllers (NICs) and varying sizes and types of SATA III,
2.5 inch drives. Each CriKit node can hold up to 2 SATA III, 2.5 inch spinning, Hybrid or
Solid State Drives. Testing has shown that SSDs provide the best read performance by a
wide margin.
1 Gb Ethernet Switch - CriKit comes standard with an 8-port, unmanaged, 1 Gb Ethernet
switch. Managed switches, and switches with more ports for larger CriKit
implementations, are also available.
Keyboard, Video and Mouse Switch - A high-quality, 8-node DVI/USB switch is
included with CriKit. This switch can be daisy-chained with additional switches to
accommodate up to 511 CriKit compute nodes and one Management Workstation.
Management/Development Workstation - CriKit comes with a high-powered
workstation for managing the cloud environment and supporting developer productivity.
Purchasers can select the components of the workstation (such as CPU, disk and memory)
or decide not to buy the workstation and use their own desktop or laptop
machine as a cluster management station.
Architecture
A 4-node CriKit cluster provides the minimum hardware necessary to run a Hadoop
cluster. For testing and evaluation, 1 of the 4 nodes can host all the Hadoop-related
management functions and the 3 remaining nodes can be the compute or slave nodes.
With 4 cores and 8 threads in each CPU, plus 16 GB of RAM and SSDs of up to 1.2 TB in each
node, there is enough compute horsepower, memory and high-speed storage to
adequately test Hadoop with moderate amounts of data.
Further, if you want to use CriKit as a 4-node private cloud platform and run Hadoop in
virtual machines to test Hadoop scalability across many virtual machine nodes, that is a
viable configuration option as well.
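To give a rough sense of what the layout in this section provides (1 management node and 3 worker nodes), the following sketch totals the worker resources. The per-node thread, RAM and SSD figures come from the text. The `Node` class, the choice of the maximum 1.2 TB SSD in every node, and an HDFS replication factor of 3 (the Hadoop default for `dfs.replication`) are assumptions made for illustration, not measured results.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical description of one CriKit compute node."""
    role: str
    threads: int = 8       # 4 cores, 8 threads per CPU (from the text)
    ram_gb: int = 16       # 16 GB of RAM per node (from the text)
    ssd_tb: float = 1.2    # assumes the largest SSD mentioned in the text


# 1 node for Hadoop management functions, 3 nodes for compute and storage.
cluster = [Node("management")] + [Node("worker") for _ in range(3)]
workers = [n for n in cluster if n.role == "worker"]

# Assumption: HDFS keeps 3 copies of each block (the default dfs.replication).
HDFS_REPLICATION = 3

total_threads = sum(n.threads for n in workers)
total_ram = sum(n.ram_gb for n in workers)
raw_tb = sum(n.ssd_tb for n in workers)
usable_tb = raw_tb / HDFS_REPLICATION

print(f"Worker threads:  {total_threads}")
print(f"Worker RAM:      {total_ram} GB")
print(f"Raw storage:     {raw_tb:.1f} TB")
print(f"Usable at 3x:    {usable_tb:.1f} TB")
```

Under these assumptions the three workers offer 24 threads, 48 GB of RAM and roughly 1.2 TB of replicated storage, which fits the "moderate amounts of data" described above. Lowering the replication factor for a test cluster would raise the usable figure at the cost of redundancy.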
Summary
A growing number of organizations want to test and evaluate Hadoop and
other “Big Data” solutions and add-on products. CriKit provides a low-cost, low-wattage,
compact and quiet desktop computing platform that is well suited to organizations on a tight
budget.
http://www.crikit.info
http://www.cloudademia.com
http://www.usmicro.com