

Pixar: Big Data, Big Depots

Mark Harrison, Tech Lead, Data Management Group

Mike Sundy, Senior Asset Administrator

David Baraff, Senior Animation Scientist

Pixar Animation Studios, Emeryville, CA


Three Fundamental Questions of Big Data

•  How big is a big repository?
•  How long:
    •  Do operations take?
    •  Do you use your data?
•  How can you back things up?


Templar Refresher

•  What we store:
    •  Source code, film assets, original artwork, video
•  Scale of what we store:
    •  Many TB, 100+ depots
•  How long we store it for:
    •  50 year charter
•  Why P4 for storing:
    •  Scalability, Off-the-Shelf


Big Data Problems

•  Dealing with slow operations on huge files
•  Dealing with high backup costs
•  Dealing with depots that are 99% R/O, 1% R/W


Slow Operations

•  Biggest asset: 900 GB
•  Biggest checkin: 6.5 TB
•  Biggest depot: 35 TB
•  Rule of thumb:
    •  1 MB = 1 sec, 1 GB = 1 minute, 1 TB = 14 hours
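To put the rule of thumb next to the sizes above, here is a minimal sketch that applies only the "1 TB = 14 hours" figure. The function name, the decimal GB-to-TB conversion, and the assumption that time scales linearly at this size are illustrative, not measurements from the talk.

```python
# Rule-of-thumb figure quoted on the slide: "1 TB = 14 hours".
HOURS_PER_TB = 14.0


def estimate_hours(size_tb: float) -> float:
    """Rough operation time in hours, assuming linear scaling per TB."""
    return size_tb * HOURS_PER_TB


# Sizes from the slide; 900 GB is treated as 0.9 TB (decimal units assumed).
for label, size_tb in [
    ("biggest asset", 0.9),
    ("biggest checkin", 6.5),
    ("biggest depot", 35.0),
]:
    hours = estimate_hours(size_tb)
    print(f"{label}: about {hours:.0f} hours ({hours / 24:.1f} days)")
```

By this estimate the 6.5 TB checkin comes out at roughly 91 hours, which is a multi-day operation.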


How do you back it up?

•  Tape, offsite
•  Expensive
•  Recurring operation
    •  Needs to be verified, refreshed, tapes upgraded
•  Archived/non-archived
    •  Things not actively being changed can be backed up less aggressively (= more cheaply)


A Modest User Request

•  “Can we make the depot read-only to reduce costs, but still write to it just a little bit?”
•  Response 1:
    •  #*@#$(*%%%( !!!
•  Response 2:
    •  “We shall investigate the issue and let you know.”


Problem: Archive + Non Archive

•  Movie process, never totally finished
•  Need to periodically make small additions/tweaks
    •  E.g. ads, interstitials, Oscar promo (hopefully!)
•  Applicable to lots of industries


Archive + Non Archive (requirements)

•  Conflicting goals:
    •  Read-only / writable
    •  Slow cheap storage / fast expensive storage
•  Lots of data duplication = need to dedupe
•  Most common file:
    •  125,241 copies


Archive + Non Archive (requirements)

•  # cueformat = 5;


Ideas which didn’t work for us

•  +X filetype
•  p4 snap
•  Various vendor backup things


Solution: split data onto multiple volumes

•  “old” stuff onto archive R/O volume
•  “new” stuff onto active, writable volume
•  Works because of ,d magic
•  Hooray P4 Super Brains!!


Horizontal linking / Vertical Linking

•  Horizontal = volume splitting = underminer
•  Vertical = deduping = shrink ray


P4 Depot Symlinks

[Figure: a P4 depot spanning writable storage and read-only storage]

Writable storage:
•  Normal Perforce configuration
•  New files added here
•  Perforce doesn’t care if files are symlinks (GENIUS!)

Read-only storage:
•  Files copied from other storage
•  Symlinks established
•  Uses ‘archive’
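The underminer step described here (copy a file to the other volume, then leave a symlink where it used to be) could be sketched roughly as follows. This is not Pixar's tool: the function name `undermine`, the directory layout, and the temporary-link suffix are all hypothetical, and a POSIX filesystem is assumed.

```python
import os
import shutil
import tempfile


def undermine(active_path: str, active_root: str, archive_root: str) -> str:
    """Copy one file to the archive volume and replace it with a symlink."""
    rel = os.path.relpath(active_path, active_root)
    archive_path = os.path.join(archive_root, rel)
    os.makedirs(os.path.dirname(archive_path), exist_ok=True)
    shutil.copy2(active_path, archive_path)  # copy data and timestamps
    # Build the symlink beside the original, then rename it over the
    # original so readers never see a missing file.
    tmp_link = active_path + ".undermine-tmp"  # hypothetical suffix
    os.symlink(archive_path, tmp_link)
    os.replace(tmp_link, active_path)
    return archive_path


# Small demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as root:
    active = os.path.join(root, "active")
    archive = os.path.join(root, "archive")
    os.makedirs(os.path.join(active, "depot"))
    path = os.path.join(active, "depot", "bigfile")
    with open(path, "w") as f:
        f.write("old stuff")
    undermine(path, active, archive)
    print(os.path.islink(path), open(path).read())
```

After the swap, anything that opens the original path reads the same bytes through the symlink, which is what lets the server keep working unchanged.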


Shrink Ray

•  All duplicate files on one volume hard linked together.
•  Reduces storage overhead, but transparent to Perforce.

[Figure: duplicate files connected by shrink ray links; underminer links between active depot storage and read-only storage]


Shrink Ray – P4 Deduper

•  Simple file-level dedupe (not block level)
•  Do this in real-time on checkin
•  Batch shrink ray for undermined files
•  Use p4 checksums, move into database
•  Now all queries are db operations
•  Easily cap number of hard links
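A file-level dedupe of this kind could be sketched as below. It differs from the approach on the slide in one important way: it hashes the files itself with MD5 instead of reading existing p4 checksums from a database. The names `shrink` and `max_links`, and the cap value, are illustrative assumptions.

```python
import hashlib
import os
import tempfile
from collections import defaultdict


def file_digest(path: str, chunk_size: int = 1 << 20) -> str:
    """MD5 of a file, read in chunks so large files do not fill memory."""
    md5 = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
    return md5.hexdigest().upper()


def shrink(root: str, max_links: int = 1000) -> int:
    """Hard-link identical files under root; return how many were relinked."""
    groups = defaultdict(list)
    for dirpath, _dirs, names in os.walk(root):
        for name in names:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                groups[file_digest(path)].append(path)
    relinked = 0
    for paths in groups.values():
        master = None
        for path in sorted(paths):
            # Start a new master once the current inode reaches the cap.
            if master is None or os.stat(master).st_nlink >= max_links:
                master = path
                continue
            if os.path.samefile(master, path):
                continue  # already linked by an earlier run
            tmp = path + ".shrink-tmp"  # hypothetical suffix
            os.link(master, tmp)
            os.replace(tmp, path)  # swap the duplicate for the hard link
            relinked += 1
    return relinked


with tempfile.TemporaryDirectory() as root:
    for name, data in [("a", b"same"), ("b", b"same"), ("c", b"other")]:
        with open(os.path.join(root, name), "wb") as f:
            f.write(data)
    print("relinked:", shrink(root))
```

The `max_links` check is one simple way to cap hard links per inode, since filesystems limit how many links a single file may have.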


Sample db queries

•  Unique files
    select count(*) from (select digest, count(1) from fileinfo group by digest)
•  How many copies of this file's contents?
    select count(*) from fileinfo where digest = '1E8529CE1AE991982A0FB5FD760CE92D'
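To try the two queries above without a production database, a throwaway SQLite table is enough. Only the table name `fileinfo` and the column `digest` come from the slide; the `path` column, the sample rows, and the second digest value are made up for illustration, and the real schema and database engine are not stated in the talk.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical minimal schema: one row per stored file revision.
conn.execute("create table fileinfo (path text primary key, digest text)")
conn.executemany(
    "insert into fileinfo values (?, ?)",
    [
        ("//depot/a", "1E8529CE1AE991982A0FB5FD760CE92D"),
        ("//depot/b", "1E8529CE1AE991982A0FB5FD760CE92D"),
        ("//depot/c", "00000000000000000000000000000000"),  # made-up digest
    ],
)

# Unique files: one group per distinct digest.
unique_files = conn.execute(
    "select count(*) from "
    "(select digest, count(1) from fileinfo group by digest)"
).fetchone()[0]

# How many copies of one file's contents exist.
copies = conn.execute(
    "select count(*) from fileinfo where digest = ?",
    ("1E8529CE1AE991982A0FB5FD760CE92D",),
).fetchone()[0]

print("unique files:", unique_files)
print("copies:", copies)
```

Some database engines require an alias on the subquery in the first statement; SQLite accepts it as written.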


Why not p4 verify?

•  Takes a long time: 1 week on mediavault depot!
•  Due to NFS, can incapacitate p4 server machine
•  Caused “fear pathology” among p4 end users when p4 operations would “hang”
•  False alarms due to “p4 purge” coupled with long running time of “p4 verify”


solution: the suminator (offline verify)

•  Runs on separate machine from P4 server
•  Accesses repository store via NFS
•  Gets checksums via “p4 fstat”
•  Verifies repository based on those checksums
•  Uses in-house Python streaming API to minimize memory footprint and startup delay
•  Parallelizable: can farm out to multiple machines
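The core of an offline verifier like this can be sketched in a few lines. This is not the suminator itself: it assumes the expected MD5 digests have already been collected into a dict (for example from “p4 fstat” output), and that each stored file is kept uncompressed so its bytes hash to the recorded digest. The names `md5_of` and `verify` are hypothetical.

```python
import hashlib
import os
import tempfile
from typing import Dict, Iterator, Tuple


def md5_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream a file through MD5 so memory use stays flat for huge files."""
    md5 = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
    return md5.hexdigest().upper()


def verify(expected: Dict[str, str]) -> Iterator[Tuple[str, str]]:
    """Yield (path, status) pairs one at a time: OK, BAD, or MISSING."""
    for path, digest in expected.items():
        try:
            actual = md5_of(path)
        except FileNotFoundError:
            yield path, "MISSING"
            continue
        yield path, "OK" if actual == digest.upper() else "BAD"


with tempfile.TemporaryDirectory() as root:
    good = os.path.join(root, "good")
    with open(good, "wb") as f:
        f.write(b"film asset")
    expected = {
        good: hashlib.md5(b"film asset").hexdigest(),
        os.path.join(root, "gone"): "0" * 32,
    }
    for path, status in verify(expected):
        print(os.path.basename(path), status)
```

Because each file is checked independently, the dict of expected digests can be split into shards and handed to separate machines.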


problem: p4 submit of big data

•  no idea if submit is still running
•  one mediavault check-in of 6 TB took 3 days
•  all the user sees:

> p4 submit -d "test" bigfile
Submitting change 5784.
Locking 1 files ...
edit //markive/miketest/bigfile#2


solution: progress indicator

•  As of p4 2012.2, you can use ‘p4 -I submit’

> p4 -I submit bigfile
Change 5788 created with 1 open file(s).
Submitting change 5788.
Locking 1 files ...
/home/msundy/depots/markive/miketest/bigfile 43%

•  provides feedback and predictability
•  users are happy


problem: debug big data submit performance

•  triggers can take several minutes with big data
•  p4 log not granular enough to debug trigger performance
•  how long did we spend in each trigger?

2013/02/25 16:35:28 pid 24112 msundy@msundy-home-depot-shaunkive [p4/2012.1/LINUX26X86_64/442152] 'dm-CommitSubmit'
<commit triggers fire; can only see p4 ops from within triggers, not trigger phases>
2013/02/25 16:35:59 pid 24112 completed 30.03s 1+1us 192+264io 0+0net 2672k 0pf


solution: structured logging

•  as of 2012.1
•  results: cut 80 min thumbnail branch bug to 5 seconds

JSON:11,1343771362,805530491,2012/07/31 14:49:22 805530491,6280,17,msundy,focus-msundy-markive,dm-CommitSubmit,,,v66,,1,/usr/anim/modsquad/bin/pyrunmodule prodp4.triggers.thumbnails 5576
JSON:11,1343776124,571539452,2012/07/31 16:08:44 571539452,6280,17,msundy,focus-msundy-markive,dm-CommitSubmit,unknown,,v66,,2,
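The two log lines above are enough to recover how long the trigger ran. The sketch below assumes, based only on how the values line up with the readable timestamp, that the second field is epoch seconds and the third is nanoseconds, and that the two records mark the start and end of the same trigger. Neither assumption is confirmed by the slide, and the helper name `event_time` is hypothetical.

```python
import csv

START = (
    "JSON:11,1343771362,805530491,2012/07/31 14:49:22 805530491,6280,17,"
    "msundy,focus-msundy-markive,dm-CommitSubmit,,,v66,,1,"
    "/usr/anim/modsquad/bin/pyrunmodule prodp4.triggers.thumbnails 5576"
)
END = (
    "JSON:11,1343776124,571539452,2012/07/31 16:08:44 571539452,6280,17,"
    "msundy,focus-msundy-markive,dm-CommitSubmit,unknown,,v66,,2,"
)


def event_time(line: str) -> float:
    """Timestamp in seconds from one comma-separated log record."""
    if line.startswith("JSON:"):  # label as it appears on the slide
        line = line[len("JSON:"):]
    fields = next(csv.reader([line]))
    # Assumed layout: fields[1] = epoch seconds, fields[2] = nanoseconds.
    return int(fields[1]) + int(fields[2]) / 1e9


elapsed = event_time(END) - event_time(START)
print(f"trigger ran for {elapsed:.1f} s ({elapsed / 60:.1f} min)")
```

Under those assumptions the gap is a little over 79 minutes, in line with the roughly 80 minute figure on the slide.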



Mark Harrison (m@pixar.com)

Mike Sundy (msundy@pixar.com)

David Baraff (deb@pixar.com)
