TriHUG 3/14: HBase in Production

talk by Michael Webster, software engineer at Bronto


  • HBase In Production (Hey, we're hiring!)
  • Contents: Bronto Overview, HBase Architecture, Operations, Table Design, Questions?
  • Bronto Overview: Bronto Software provides a cloud-based marketing platform for organizations to drive revenue through their email, mobile, and social campaigns.
  • Bronto, Cont'd: An ESP for e-commerce retailers. Our customers are marketers: charts, graphs, reports; market segmentation; automation. We are also hiring.
  • Where We Use HBase: High-volume scenarios, realtime data, batch processing, and as an HDFS staging area. Sorting/indexing is not a priority (we are working on this).
  • HBase Overview: An implementation of Google's BigTable. A sparse, sorted, versioned map built on top of HDFS. Row-level ACID; Get, Put, Scan, and assorted RMW (read-modify-write) operations.
  • Tables Overview: Tables are lexicographically sorted key-value pairs of uninterpreted byte[]s. The keyspace is divided into regions of keys, and each region is hosted by exactly one machine.
  • Table Overview (diagram): On Server 1, keys a and aa fall in R1: [a, b); keys b and bb in R2: [b, c); keys c and ca in R3: [c, d). Each key maps to a byte[] value.
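A minimal sketch of the layout above: the region names and boundaries come from the slide, while the lookup helper is hypothetical. Row keys compare lexicographically, and each key lands in exactly one [start, end) region:

```java
public class RegionLookup {
    // Region boundaries from the slide: R1: [a, b), R2: [b, c), R3: [c, d)
    static final String[][] REGIONS = {
        {"R1", "a", "b"}, {"R2", "b", "c"}, {"R3", "c", "d"}
    };

    // Hypothetical helper: find the region whose [start, end) range holds the key.
    static String regionFor(String key) {
        for (String[] r : REGIONS)
            if (key.compareTo(r[1]) >= 0 && key.compareTo(r[2]) < 0) return r[0];
        return "unassigned";
    }

    public static void main(String[] args) {
        // Lexicographic order means "aa" sorts between "a" and "b", so it is in R1.
        System.out.println(regionFor("aa")); // R1
        System.out.println(regionFor("bb")); // R2
        System.out.println(regionFor("ca")); // R3
    }
}
```

Because each key belongs to exactly one region, a Get or Put touches exactly one server; this is also why key design determines how load spreads across the cluster.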
  • Operations: Layers of complexity. Normal failure modes: hardware dies (or combusts), human error, the JVM, HDFS considerations. Lots of knobs.
  • Cascading Failure: 1. High write volume fragments the heap. 2. GC promotion failure. 3. Stop-the-world GC. 4. ZK timeout. 5. Receive YouAreDeadException, die. 6. Failover. 7. Goto 1.
  • Useful Tunings: MSLAB enabled. hbase.regionserver.handler.count — increasing it puts more IO load on the RS; 50 is our sweet spot. JVM tuning: -XX:+UseConcMarkSweepGC and -XX:+UseParNewGC.
  • Monitoring Tools: Nagios for hardware checks. Cloudera Manager for reporting and health checks (Apache Ambari and MapR provide similar tools). Hannibal + custom scripts to identify hot regions for splitting.
  • Table Design: Table design is deceptively simple. Main considerations: row key structure and number of column families. Know your queries in advance.
  • Additional Context: SaaS environment; the Twitter-clone model won't work. Thousands of users, millions of attributes. Skewed customer base: our biggest clients have 10MM+ contacts, the smallest have thousands.
  • Row Keys: The most important decision, and the only (native) index in HBase. Random reads and writes are fast. Sorted on disk and in memory. Bloom filters speed read performance (not in use here).
  • Hotspotting: Associated with monotonically increasing keys (e.g. MySQL AUTO_INCREMENT). Writes lock onto one region at a time. Consequences: flush and compaction storms; a $500K cluster limited by a $10K machine.
  • Row Key Advice: The read/write ratio should drive design; we pay a write-time penalty for faster reads. Identify the queries you need to support. Consider composite keys instead of indexes. Bucketed/salted keys are an option: they distribute writes across N buckets, but rebucketing is difficult, and reading back requires N reads with slow workers.
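The bucketing idea above can be sketched in a few lines. This is an illustrative helper, not Bronto's implementation; the bucket count of 8 and the `::` separator are assumptions, and real HBase keys would be byte[]s rather than Strings:

```java
public class SaltedKey {
    static final int BUCKETS = 8; // assumption: N = 8 write buckets

    // Prefix the key with hash(key) % N so a monotonically increasing key
    // (e.g. an auto-increment id or a timestamp) spreads across N regions
    // instead of hammering the single region at the end of the keyspace.
    static String salt(String key) {
        int bucket = Math.floorMod(key.hashCode(), BUCKETS);
        return bucket + "::" + key;
    }

    public static void main(String[] args) {
        System.out.println(salt("000001"));
        System.out.println(salt("000002"));
    }
}
```

The trade-off the slide mentions falls straight out of the code: a full read now needs one scan per bucket (N reads), and changing BUCKETS later means rewriting every key, which is why rebucketing is painful.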
  • Variable Width Keys: customer_hash::email. Allows scans for a single customer. The hashed id distributes customers; within a customer, rows are sorted by email address. Could also use the reverse domain to group gmail, yahoo, etc.
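A sketch of the customer_hash::email pattern, under assumptions the slide doesn't spell out: MD5 truncated to 4 bytes as the customer hash and `::` as the separator are both illustrative choices:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

public class CustomerKey {
    // Hypothetical sketch: a short fixed-length hash of the customer id
    // spreads customers across regions, while the email suffix keeps one
    // customer's rows contiguous and sorted by address.
    static String rowKey(String customerId, String email) {
        try {
            byte[] h = MessageDigest.getInstance("MD5")
                    .digest(customerId.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (int i = 0; i < 4; i++) hex.append(String.format("%02x", h[i]));
            return hex + "::" + email;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // Both rows share the same prefix, so one prefix Scan returns them together.
        System.out.println(rowKey("customer-42", "alice@example.com"));
        System.out.println(rowKey("customer-42", "bob@example.com"));
    }
}
```

Scanning with the prefix `hash(customerId) + "::"` retrieves exactly one customer's contacts in email order, which is the query this design optimizes for.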
  • Fixed Width Keys: site::contact::create::email. With FuzzyRowFilter you can fix site, contact, and reverse_create while searching for any email address. Could use a fixed-width encoding for the domain to search for just gmail, yahoo, etc. Distributes sites and users; contacts are sorted by create date.
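A hypothetical encoding of that composite key; the field widths and the decimal formatting are assumptions for illustration (a real table would likely pack binary byte[]s). The point is that fixed widths put every component at a known offset, which is what lets a FuzzyRowFilter-style pattern pin some fields and wildcard others:

```java
public class FixedWidthKey {
    // Hypothetical fixed-width encoding of site::contact::reverse_create::email.
    // Reversing the timestamp (MAX_VALUE - create) makes newer contacts sort first.
    static String rowKey(long siteId, long contactId, long createMillis, String email) {
        long reversed = Long.MAX_VALUE - createMillis;
        return String.format("%010d%010d%019d%s", siteId, contactId, reversed, email);
    }

    public static void main(String[] args) {
        // The later create time sorts first because of the reversed timestamp.
        System.out.println(rowKey(7, 123, 2_000L, "a@example.com"));
        System.out.println(rowKey(7, 123, 1_000L, "a@example.com"));
    }
}
```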
  • Column Families: Groupings of named columns, each with its own versioning, compression, and TTL. Different from BigTable: BigTable supports hundreds of families; HBase, 1 or 2.
  • Column Family Example (table): CF d {VERSIONS => 2} maps to demographics and keeps 2 versions of each column, e.g. a (address) and p (phone). CF s7 {TTL => 604800} maps to stats kept for 7 days, e.g. o:3-27 (open) and c:3-20 (click). PROTIP: keep CF and qualifier names short; they are repeated on disk for every cell.
  • Column Families In Depth (diagram): A region (e.g. my_table,,1328551097416.12921bbc0c91869f88ba6a044a6a1c50.) has one MemStore per CF and StoreFile(s) for each CF in HDFS. Sparse. The MemStores must flush together, and compactions happen at the region level.
  • Compactions: Rewrite StoreFiles to improve read performance. IO intensive; region scope. Used to take > 50 hours; a custom script took it down to 18. Can (theoretically) run during the day.
  • Compaction Before and After (diagram): Before, the region's family f1 has many StoreFiles (s1–s6); after, a k-way merge has rewritten them into a single StoreFile. The MemStore and region name are unchanged.
  • The Table From Hell: 19 column families. 60% of our region count. Skewed write pattern; KB-sized store files; frequent compaction storms (hbase.hstore.compaction.min.size, HBASE-5461). Moved to its own cluster.
  • And yet... the cluster remained operational, and the table is still in use today. It met read and write demand. Regions were only briefly active; row keys by date and customer.
  • What Saved Us: Keyed by customer and date, and effectively write-once, which kept the active region count low. A custom compaction script skipped old regions. More hardware; we were able to selectively migrate.
  • Column Family Advice: A bad choice for fine-grained partitioning. Good for similarly typed data with varying versioning/retention requirements. Prefer intra-row scans: CFs and qualifiers are sorted, so ColumnRangeFilter applies.
  • Questions?