Hadoop Hive Tutorial | Hive Fundamentals | Hive Architecture

Transcript

Be a Next SuperStar

Big Data Analytics using Hive

Slide # 2015 BlueCamphor Technologies (P) Ltd. www.skillspeed.com

Scope of PPT: BIG Data Analytics via Hive

  • Introduction to Big Data and Hadoop
  • Understanding Hive and its Concepts
  • Hive Architecture, Hive Meta Store and Hive Use-Cases
  • BIG Data Analytics via Hive
  • BIG Data & Hadoop Job Trends

Webinar Session by Skillspeed

Get Started with BIG Data & Hadoop

Big Data and its Challenges

Big data is the term for a collection of data sets so large and complex that they become difficult to process using on-hand database management tools or traditional data processing applications.

Systems and enterprises generate huge amounts of data, from terabytes to even petabytes of information.

It is very difficult to manage data at this scale.

Who Generates Big Data?

Have you ever wondered how Google, Facebook or LinkedIn manage to store and utilize such huge volumes of data?

Today, managing unstructured and voluminous data is a big problem.

Hadoop can be utilized for processing and analyzing large data sets.

Before that, let's understand what Hadoop is.

Hadoop and its Characteristics

Apache Hadoop is a framework that allows the distributed processing of large data sets across clusters of commodity computers using a simple programming model.

It is an open-source data management technology with scale-out storage and distributed processing.

Hadoop Characteristics

  • Flexible
  • Reliable
  • Economical
  • Scalable

Hadoop Ecosystem

  • Flume: unstructured or semi-structured data
  • Sqoop: import or export of structured data
  • Apache Oozie: workflow
  • HDFS (Hadoop Distributed File System)
  • Pig Latin: data analysis
  • Hive: DW system
  • MapReduce framework
  • HBase
  • Other YARN frameworks (MPI, Giraph)
  • YARN: cluster resource management

Hive Origination

Hive originated as an internal project at Facebook.

Later it was adopted in Apache as an open source project

Facebook deals with a massive amount of data (petabyte scale) and needs to run more than 75k ad-hoc queries on it.

Since the data is collected from multiple servers and is diverse in nature, no RDBMS was a suitable solution.

MapReduce could have been a natural choice, but it had its own limitations.

What is Hive?

  • It is a query engine wrapper built on top of MapReduce
  • It is treated as the data warehousing tool of the Hadoop ecosystem
  • It is used for data analysis
  • It is primarily targeted at users with a SQL background
  • It provides HiveQL, which is very similar to SQL
  • It is used for managing and querying structured data
  • Hadoop complexity is hidden from end users
  • Java and Hadoop API knowledge is optional for core users
  • It was developed by Facebook and contributed to the community
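To make the "query engine wrapper on top of MapReduce" idea concrete, here is a minimal Python sketch of the kind of map, shuffle and reduce steps that a simple grouped count could be broken into. It is an illustration only, not Hive's actual execution plan. The table name `employees`, its columns and the sample rows are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample rows standing in for a Hive table `employees`
# with columns (name, dept). The HiveQL being imitated is roughly:
#   SELECT dept, COUNT(*) FROM employees GROUP BY dept;
EMPLOYEES = [
    ("asha", "sales"),
    ("ravi", "engineering"),
    ("meena", "sales"),
    ("john", "engineering"),
    ("li", "support"),
]


def map_phase(rows):
    """Emit a (dept, 1) pair for every input row."""
    for _name, dept in rows:
        yield dept, 1


def shuffle_phase(pairs):
    """Group emitted values by key, as happens between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the values for each key, giving the row count per dept."""
    return {key: sum(values) for key, values in grouped.items()}


def count_by_dept(rows):
    """Chain the three phases for the grouped count."""
    return reduce_phase(shuffle_phase(map_phase(rows)))


if __name__ == "__main__":
    print(count_by_dept(EMPLOYEES))
```

The point of the sketch is the contrast: the user writes one SQL-like statement, while the map and reduce functions are what would otherwise have to be written by hand.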

Hive Use Cases

  • Ad-hoc analysis of underlying data
  • Hypothesis testing of the underlying data
  • Big Data testing of huge data sets
  • Analysis of the processed data

Hive Components

  • Driver
  • Shell
  • Metastore
  • Compiler
  • Execution Engine

Hive Architecture

  • Clients: JDBC/ODBC; Browse, Query, DDL
  • Metastore: Thrift API
  • Hive QL: Parser, Planner, Optimizer, Execution
  • User-defined MapReduce scripts
  • File formats: TextFile, SequenceFile, RCFile
  • UDF/UDAF: Substr, Sum, Average
  • SerDe: CSV, Thrift, Regex
  • MapReduce
  • HDFS

Hive Meta Store

  • Embedded Metastore: Driver and Metastore in the Hive Service JVM, with Derby
  • Local Metastore: Driver and Metastore in the Hive Service JVM, with MySQL
  • Remote Metastore: Driver in the Hive Service JVM, Metastore in a separate Metastore Server JVM, with MySQL
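The three metastore modes differ mainly in configuration. The Python sketch below returns the hive-site.xml style properties that would typically distinguish them. The property names are standard Hive settings as best recalled here and should be checked against the documentation for the installed Hive version. The host names and the MySQL database name `metastore` are hypothetical placeholders.

```python
# Assumption: property names follow common Hive configuration; verify them
# against the Hive documentation for your version before relying on them.
# Host names and the MySQL database name are hypothetical placeholders.

def metastore_properties(mode, db_host="db.example.com",
                         metastore_host="metastore.example.com"):
    """Return hive-site.xml style properties for one metastore mode."""
    if mode == "embedded":
        # Derby runs inside the same JVM as the Hive service.
        return {
            "javax.jdo.option.ConnectionURL":
                "jdbc:derby:;databaseName=metastore_db;create=true",
            "javax.jdo.option.ConnectionDriverName":
                "org.apache.derby.jdbc.EmbeddedDriver",
        }
    if mode == "local":
        # Metastore code still runs in the Hive service JVM, but the
        # database is an external MySQL server.
        return {
            "javax.jdo.option.ConnectionURL":
                f"jdbc:mysql://{db_host}/metastore",
            "javax.jdo.option.ConnectionDriverName":
                "com.mysql.jdbc.Driver",
        }
    if mode == "remote":
        # Hive clients talk to a separate metastore server over Thrift.
        return {
            "hive.metastore.uris": f"thrift://{metastore_host}:9083",
        }
    raise ValueError(f"unknown metastore mode: {mode!r}")


if __name__ == "__main__":
    for mode in ("embedded", "local", "remote"):
        print(mode)
        for name, value in metastore_properties(mode).items():
            print(f"  {name} = {value}")
```

In the remote case the database settings live on the metastore server rather than on each Hive client, which is why only the Thrift URI appears in the client-side properties.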

Job Trends: Hadoop

Why SkillSpeed?

Course Curriculum from Industry Experts

Instructor Led Live Virtual Sessions

Lifetime access to Course Content via LMS

100% Placement Assistance

24x7 Support


SkillSpeed offers virtual instructor-led courses designed to bridge the time-to-competency gap experienced by technology companies. The USP of SkillSpeed is the subject matter expert (SME). SMEs are industry experts with a good understanding of, and hands-on industry experience with, the technology.

These industry experts design, develop, and deliver the courses.

SkillSpeed provides you:

  • Course Curriculum from Industry Experts
  • Instructor Led Live Virtual Sessions
  • Real life industry case studies
  • Live Virtual Interactions
  • Interaction with industry experts
  • Lifetime access to all course content via the LMS
  • 24x7 support
  • 100% placement assistance

Course Topics

  • Module 1: Introduction to Big Data and Hadoop
  • Module 2: HDFS Internals, Hadoop Configurations and Data Loading
  • Module 3: Introduction to Map Reduce
  • Module 4: Advanced Map Reduce Concepts
  • Module 5: Introduction to Pig
  • Module 6: Advanced Pig and Introduction to Hive
  • Module 7: Advanced Hive Concepts
  • Module 8: Extending Hive and HBase Introduction
  • Module 9: Advanced HBase and Oozie Introduction
  • Module 10: Project Set-up Discussion

Corporate Partners

Contact Us

Lines open 24/7. To know more about the course, please contact:

  • IND: +91-90660-20904
  • USA: 1866-607-6547 (Toll Free)

Or reach us at sales@skillspeed.com