Copyright © 2016 NTT DATA Corporation
5/9/2016
NTT DATA Corporation
Akira Ajisaka
Apache Hadoop 3, Current Status
Apache: Big Data North America 2016
Self introduction
- Akira Ajisaka (NTT DATA)
  - Apache Hadoop Committer & PMC member
    - 130+ commits in 2015
    - Working on usability and supportability
  - "Open-Source Professional Services" team
    - Has deployed and supported Hadoop clusters totaling 10k+ nodes over 7 years
    - Ranked 6th in the world for contributions to Apache Hadoop, together with NTT [1]
    - ASF Committers: Spark, Yetus, HTrace (incubating)
[1] The Activities of Apache Hadoop Community 2014 http://ajisakaa.blogspot.com/2015/02/the-activities-of-apache-hadoop.html
Agenda
- What's Apache Hadoop 3?
- Differences between Hadoop 3 and 2
  - New Features
  - Incompatible Changes
- Current Status
- Summary
What's Apache Hadoop 3?
Disclaimer
- Apache Hadoop 3 is not yet defined
  - Not released
  - Not yet decided which features are in or out
- This presentation may not be entirely correct
- I will introduce what I know so far
What's Apache Hadoop 3?
[Figure: release timeline, 2009 to 2016]
- branch-1 (branch-0.20), Hadoop 1 (EOL): 0.20.1, 0.20.205, 1.0.0, 1.1.0, 1.2.1 (stable)
- 0.21.0, 0.22.0
- branch-0.23: 0.23.0 through 0.23.11 (final)
- branch-2, Hadoop 2: 2.0.0-alpha, 2.1.0-beta, 2.2.0, 2.3.0, 2.4.0, 2.5.0, 2.6.0, 2.7.0
- trunk: Hadoop 3
- Feature labels on the timeline: New append; Security; NameNode Federation, YARN; NameNode HA; HDFS Snapshots, NFSv3 support, Windows; Heterogeneous storage, HDFS in-memory caching; HDFS ACLs, HDFS Rolling Upgrades, Application History Server, RM Automatic Failover; YARN Rolling Upgrades, Transparent Encryption, Archival Storage; Drop JDK6 support, Truncate API
Hadoop 3 and 2 diverged in 2011 (5 years ago!)
Discussions for releasing Apache Hadoop 3
- 2014/6
  - Discussed releasing Hadoop 3 together with a JDK version upgrade
  - Decided to make 2.6 the last release that supports JDK6
- 2015/3
  - Oracle JDK7 is EoL, so we need to upgrade
  - In the end, the decision was put off for another 12 months
- 2016/2
  - It's time to revisit Hadoop 3 release plans
  - Developers agreed to release Hadoop 3
When will it be released?
- Developer mailing list:
  - Releasing alphas through the summer
  - Freeze and stabilize for GA in Nov/Dec
- I expect Hadoop 3 GA to be released at the end of 2016
- There are many tasks to do
  - I'll introduce them later in these slides
Difference between Hadoop 3 and 2
YARN is unchanged
- YARN: the biggest difference between Hadoop 1 & 2
- Hadoop 3 still uses YARN
[Figure: Hadoop 1 stack: Hive, Pig / MapReduce / HDFS. Hadoop 2 & Hadoop 3 stack: Hive, Pig, ... / MapReduce, Spark, Tez, Storm, ... / YARN / HDFS]
New features
- 440 issues fixed only in Hadoop 3 (as of 5/8/2016)
  - https://s.apache.org/GpUq
- HDFS erasure coding
- Shell script rewrite
- Task level native optimization
- Derive heap size or mapreduce.*.memory.mb automatically
- Support more than 2 NameNodes
- and more
Break compatibility
- A major version upgrade is the chance to clean up the code
  - Deprecated APIs can be removed only when the major version changes
    - @Public and @Stable Java API
    - REST API
    - Metrics/JMX
    - CLI
    - Environment variables
  - Wire-compatibility can be broken
    - 2.X client cannot talk to 3.X server and vice versa
  - Compatibility Guide:
    - https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/Compatibility.html
New Features
Erasure Coding (HDFS-7285)
- Problem
  - Reduce costs of storage
  - Blocks are replicated to 3 DNs
    - 3x storage overhead is costly
- Solution
  - Use Erasure Code

              3-replication   (6,3)-Reed-Solomon
Tolerates     2 failures      3 failures
Disk Usage    3x              1.5x
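The figures in the comparison above follow from simple arithmetic. A minimal sketch, assuming an n-way replicated block survives n - 1 lost copies and a (k, m) Reed-Solomon stripe survives m lost blocks; the function names are illustrative and are not part of any Hadoop API:

```python
from fractions import Fraction


def replication_profile(replicas):
    """Failure tolerance and disk usage of plain n-way replication."""
    return {
        "tolerated_failures": replicas - 1,  # one surviving copy is enough
        "disk_usage": Fraction(replicas),    # every byte is stored n times
    }


def reed_solomon_profile(data_blocks, parity_blocks):
    """Failure tolerance and disk usage of a (data, parity) Reed-Solomon stripe."""
    return {
        "tolerated_failures": parity_blocks,  # any `parity_blocks` losses are recoverable
        "disk_usage": Fraction(data_blocks + parity_blocks, data_blocks),
    }


if __name__ == "__main__":
    for name, profile in [
        ("3-replication", replication_profile(3)),
        ("(6,3)-Reed-Solomon", reed_solomon_profile(6, 3)),
    ]:
        print(name, profile["tolerated_failures"], float(profile["disk_usage"]))
```

With 6 data blocks and 3 parity blocks, 9 blocks are stored for every 6 blocks of user data, which is where the 1.5x comes from.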
Erasure Coding: Write files using (6,3)-Reed-Solomon
- Write data to 9 DNs in parallel
[Figure: ECClient splits the incoming data into 6 data blocks and 3 parity blocks and writes them to DN1 ... DN9]
Erasure Coding: Read files
- Read data from 6 DNs in parallel
[Figure: ECClient and DataNodes DN1 ... DN9]
Erasure Coding: Read files when DN fails
- Read data from any 6 DNs in parallel
[Figure: ECClient and DataNodes DN1 ... DN9, with one DataNode marked as failed (×)]
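Reed-Solomon decoding itself is too long to show here, so the sketch below swaps in the simplest erasure code, a single XOR parity block, to illustrate the idea of rebuilding a lost block from the survivors. It tolerates only one failure (not three), and it is not the codec HDFS uses; all names are illustrative.

```python
def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)


# Three data blocks plus one parity block, each stored on a different node.
data_blocks = [b"AAAA", b"BBBB", b"CCCC"]
parity_block = xor_blocks(data_blocks)

# Simulate losing the node that held data_blocks[1]:
# XOR-ing the surviving blocks with the parity block yields the missing block.
survivors = [data_blocks[0], data_blocks[2], parity_block]
recovered = xor_blocks(survivors)

if __name__ == "__main__":
    print(recovered == data_blocks[1])
```

Reed-Solomon generalizes this: with 3 parity blocks, any 6 of the 9 blocks are enough to rebuild the original data.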
Erasure Coding: Current Status
- Phase 1: striping layout
  - C = 64KB (default)
  - Works for small files
  - No data locality
  - Available on trunk
- Phase 2: contiguous layout
  - C = 128MB (= HDFS block size)
  - Does not work for small files
  - Data locality
  - Now in progress (HDFS-8030)
[Figure: incoming data is split into cells of cell size C and distributed across DataNode 1 ... DataNode 5, ...]
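To make the striping layout concrete, the sketch below maps a logical file offset to a cell under a plain round-robin assumption: cells of size C are dealt out to the data blocks in order, stripe after stripe. The 6 data blocks come from the (6,3) scheme above and C = 64KB from this slide; the function and its return shape are hypothetical and only approximate what the HDFS client does.

```python
CELL_SIZE = 64 * 1024  # C = 64KB, the Phase 1 default quoted on the slide
DATA_BLOCKS = 6        # data blocks per stripe in (6,3)-Reed-Solomon


def locate(offset, cell_size=CELL_SIZE, data_blocks=DATA_BLOCKS):
    """Map a logical file offset to (stripe, data block index, offset inside that block).

    Assumes simple round-robin placement of cells; parity cells are ignored.
    """
    cell_index = offset // cell_size
    stripe = cell_index // data_blocks
    block_index = cell_index % data_blocks
    offset_in_block = stripe * cell_size + offset % cell_size
    return stripe, block_index, offset_in_block


if __name__ == "__main__":
    print(locate(0))                  # first byte of the file
    print(locate(CELL_SIZE))          # first byte of the second cell
    print(locate(6 * CELL_SIZE + 5))  # sixth byte of the second stripe
```

Under this assumption a 300KB file already touches 5 data blocks, which is why striping suits small files but gives up data locality, while a contiguous layout with C = 128MB keeps a small file on a single node.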
Shell Script Rewrite (HADOOP-9902)
- Hadoop and Shell Script
  - Launching daemons
  - Hadoop CLI
- Difficult to understand
  - What is the correct env var to set an option?
    - java classpath?
    - java.library.path?
    - GC options?
  - How to add the option to the env var?
  - We have to read almost all the shell scripts!
After rewriting the scripts ...
- Easy to understand
- Because shell API doc is available
  - Shelldoc maker generates docs from the scripts
  - Similar to JavaDoc
[Screenshot: "Public API" section of the generated shell API doc]
My documentation build (trunk): http://aajisaka.github.io/hadoop-project/hadoop-project-dist/hadoop-common/UnixShellAPI.html
.hadoop-env and .hadooprc (HADOOP-11353, HADOOP-13045)
- Very similar to .bashrc
  - Read the API doc
  - Create your own ~/.hadoopXX
    - .hadoop-env : hadoop-env.sh for each user
    - .hadooprc : called after shell env vars are configured
  - And that's all :)
- ex.) Set additional classpath (.hadooprc)
  hadoop_add_classpath /path/to/my/jar
--debug option is available
- Useful for troubleshooting
$ hadoop --debug version
DEBUG: hadoop_parse_args: processing version
DEBUG: hadoop_parse: asking caller to skip 1
DEBUG: HADOOP_CONF_DIR=/usr/local/hadoop/etc/hadoop
(snip)
DEBUG: Applying the user's .hadooprc
DEBUG: Initial CLASSPATH=/path/to/my/jar
DEBUG: Initialize CLASSPATH
(snip)
DEBUG: Initial CLASSPATH=/usr/local/hadoop/share/hadoop/common/lib/*
CLASSPATH was overwritten!! (before HADOOP-13045)
Many new features, bug fixes, improvements
- 'hadoop distch' to change the ownership and permissions on many files via a MapReduce job
- 'hadoop jnipath' to print java.library.path
- 'hadoop --daemon' instead of hadoop-daemon.sh
  - ex.) hdfs --daemon status namenode
  - The return code for status is LSB-compatible
  - hadoop-daemon(s).sh are now deprecated
- .out files are now appended (not overwritten)
  - Allows external log rotation
- and many more
  - see https://issues.apache.org/jira/browse/HADOOP-9902
Task level native optimization (MAPREDUCE-2841)
- Add a native implementation of the map output collector
  - Sort, Spill and IFile serialization
- Prerequisites
  - Built with -Pnative option
  - Custom writable types and comparators are not supported
- Setting
  <property name="mapreduce.job.map.output.collector.class" value="org.apache.hadoop.mapred.nativetask.NativeMapOutputCollectorDelegator"/>
Tips: compact xml form will be supported in Hadoop 3 (HADOOP-6964)
Benchmark
- Release Note in the issue:
  - "For shuffle-intensive jobs this may provide speed-ups of 30% or more."
- Benchmarked with 3 slaves (m3.xlarge)
  - CentOS 7.2
  - 3.0.0-SNAPSHOT (revision 5865fe2b)
- A very shuffle-intensive wordcount job
  - Input: 2.6GB (compressed)
  - Shuffle: 14GB
  - Output: 10GB
Benchmark result
- Result: 506sec -> 383sec (25% speedup)
  - Average Map Time: 182sec -> 89sec
- Map task log (native optimization enabled)
INFO [main] org.apache.hadoop.mapred.nativetask.HadoopPlatform: Hadoop platform inited
INFO [main] org.apache.hadoop.mapred.nativetask.NativeRuntime: Nativetask JNI library loaded.
INFO [main] org.apache.hadoop.mapred.nativetask.handlers.NativeCollectorOnlyHandler: NativeTask Combiner is enabled, class = org.apache.hadoop.examples.WordCount$IntSumReducer
INFO [main] org.apache.hadoop.mapred.nativetask.NativeBatchProcessor: NativeHandler: direct buffer size: 1048576
INFO [main] org.apache.hadoop.mapred.nativetask.handlers.NativeCollectorOnlyHandler: [NativeCollectorOnlyHandler] combiner is not null
INFO [main] org.apache.hadoop.mapred.nativetask.NativeBatchProcessor: NativeHandler: direct buffer size: 1048576
INFO [main] org.apache.hadoop.mapred.nativetask.util.OutputUtil: nativetask.output.manager = org.apache.hadoop.mapred.nativetask.util.NativeTaskOutputFiles
INFO [main] org.apache.hadoop.mapred.nativetask.NativeMapOutputCollectorDelegator: Native output collector can be successfully enabled!
INFO [main] org.apache.hadoop.mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.nativetask.NativeMapOutputCollectorDelegator
- Tips: Backported to CDH 5.2.0 or later
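The benchmark numbers can be read two ways, as a reduction in elapsed time or as a throughput ratio. A small sketch of both calculations using the timings from the slide (the helper names are illustrative):

```python
def time_reduction(before_sec, after_sec):
    """Fraction of the original elapsed time that was saved."""
    return (before_sec - after_sec) / before_sec


def speedup_factor(before_sec, after_sec):
    """How many times faster the run became (before / after)."""
    return before_sec / after_sec


if __name__ == "__main__":
    for label, before, after in [("job", 506, 383), ("average map task", 182, 89)]:
        print(
            label,
            "time saved: {:.1%}".format(time_reduction(before, after)),
            "speedup: {:.2f}x".format(speedup_factor(before, after)),
        )
```

By these definitions the job took roughly a quarter less time overall, and the average map task took roughly half as long.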
Derive heap size or mapreduce.*.memory.mb automatically (MAPREDUCE-5785)
- In Hadoop 2, two similar properties must be set :(
  - mapreduce.{map,reduce}.memory.mb
    - The amount of memory to request from the scheduler for each task (ex. 2048)
  - mapreduce.{map,reduce}.java.opts
    - Java options for YARN containers (ex. -Xmx2G)
- In Hadoop 3, either is enough
  - .java.opts is derived from .memory.mb and vice versa
    - .java.opts = .memory.mb * mapreduce.job.heap.memory-mb.ratio
    - .memory.mb = .java.opts / mapreduce.job.heap.memory-mb.ratio
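A worked example of the two formulas above. The ratio value 0.8 and the round-up behaviour are assumptions made for illustration, so check mapred-default.xml of your build for the actual default; the helper names are hypothetical.

```python
import math
from fractions import Fraction

# Assumed value of mapreduce.job.heap.memory-mb.ratio (not stated on the slide).
ASSUMED_RATIO = Fraction("0.8")


def heap_mb_from_memory_mb(memory_mb, ratio=ASSUMED_RATIO):
    """.java.opts heap size (MB) = .memory.mb * ratio, rounded up (assumption)."""
    return math.ceil(memory_mb * Fraction(ratio))


def memory_mb_from_heap_mb(heap_mb, ratio=ASSUMED_RATIO):
    """.memory.mb = heap size (MB) / ratio, rounded up (assumption)."""
    return math.ceil(heap_mb / Fraction(ratio))


def java_opts_for(memory_mb, ratio=ASSUMED_RATIO):
    """Render the derived heap size as a -Xmx option string."""
    return "-Xmx{}m".format(heap_mb_from_memory_mb(memory_mb, ratio))


if __name__ == "__main__":
    print(java_opts_for(2048))           # only .memory.mb = 2048 was set
    print(memory_mb_from_heap_mb(2048))  # only -Xmx2G was set
```

Exact fractions are used instead of floats so that the rounding step is not affected by binary representation of 0.8.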
Support more than two NameNodes (HDFS-6440)
- Hadoop 2 currently supports only 2 NameNodes
  - 1 active and 1 standby
- Hadoop 3 supports 2 or more standby NameNodes
  - provides additional fault-tolerance
  - prevents multiple standby NNs from checkpointing at the same time
  - # of standbys should be kept small due to block reports
- FYI: ResourceManager already supports multi-standby in branch-2
Other new features
- metrics2 sink plugin for Apache Kafka (HADOOP-10949)
- .jhist default format is changed from json to binary (MAPREDUCE-6613)
- Use FileOutputCommitter v2 by default (MAPREDUCE-6336)
- Allow/disallow snapshots via WebHDFS (HDFS-9057)
- Check and make checkpoint before stopping NameNode (HDFS-6353)
- YARN TimelineServer v2 (YARN-2928)
- YARN WebUI v2 (YARN-3368)
- Dynamic subcommands (HADOOP-12930)
- and many other changes
Incompatible changes
- Many deprecated APIs will be removed
  - hftp/hsftp/s3 -> webhdfs/s3{n,a}
  - Metrics v1
  - org.apache.hadoop.Records
  - and more
- Improved CLI output
  - 'mapred job -list' shows the job name as well
  - 'hadoop fs -du' shows the raw disk usage, and its output is aligned in a more Unix-like way
  - and more
- Search 'Incompatible change' flag
  - https://s.apache.org/sMO4
Bump up the versions of the libraries
- Drop JDK7 support (HADOOP-11858)
  - Hello Lambda!
- Dependency Hell
  - Tomcat
  - Jetty
  - Jersey
  - Guava
  - Log4J
  - Jackson
  - and many many many more ...
[Background: STARTUP_MSG classpath dump from a 3.0.0-SNAPSHOT build under /usr/local/hadoop/share/hadoop, listing the bundled third-party jars, including jetty-6.1.26.jar, guava-11.0.2.jar, jersey-core-1.9.jar, jackson-core-asl-1.9.13.jar, log4j-1.2.17.jar, protobuf-java-2.5.0.jar, ...]
Classpath isolation (HADOOP-11656)
- Relaxing the "dependency hell"
  - Separate client and server jars
  - Client jar does not pull in any third-party dependencies
- If the isolation is done ...
  - We can safely upgrade the libraries in server code
  - In branch-2, the upgrade is incompatible :(
Current Status
- Most of the new features are already available
  - Erasure Coding
  - Shell Scripts Rewrite
  - Task level native optimization
- Need more contribution
  - Bumping the library versions
  - Classpath isolation
  - Remove deprecated XXX
- Discussion
  - Release from trunk or cut new branch-3
  - How should we make the releases: alpha, beta, or GA?
Summary
- Apache Hadoop 3 has
  - Many new features and code cleanups
  - Many remaining tasks
- Need your help to release Hadoop 3 earlier!
  - Not only patches but also testing is welcome!
- (Probably) Hadoop 3 GA will be released in 2016
Comments?
- I suppose there are many Apache Hadoop developers/users in this room :)
- Tell me
  - if there are any required tasks not in this talk
  - if I am misunderstanding something
  - anything you want to tell me
- Discussion
  - Release from trunk or cut branch-3
  - When should we support JDK9?
  - What is alpha/beta/GA?
  - ...
- I'd like to pass your feedback on to the community
Appendix
- Native erasure coding support inside HDFS (Strata + Hadoop World New York 2015)
  - http://conferences.oreilly.com/strata/big-data-conference-ny-2015/public/schedule/detail/42957