elizabeth-smith
Taming the resource tiger
You cannot hide from Physics!
Mass
How much matter is in an object
Data Storage
• Hard Disk Drive - HDD
  • Magnetizes a thin film of ferromagnetic material on a disk
  • Reads it with a magnetic head on an actuator arm
• Solid State Drive - SSD
  • Uses integrated circuit assemblies as memory to store data persistently
  • No moving parts
Areal Storage Density
• SSD
  • 2.8 Tbit/in²
• HDD
  • 1.5 Tbit/in²
Terabits per square inch – numbers as of 2016 (see Wikipedia, our materials are improving)
When hard drives go bad
What does this have to do with PHP?
The chat that worked for 3 days…
Streams: Computing Concept
Definitions
• Idea originating in the 1950s
• Standard way to get input and output
• A source or sink of data
Who uses them
• C – stdin, stderr, stdout
• C++ iostream
• Perl IO
• Python io
• Java
• C#
What is a Stream?
• Access input and output generically
• Can write and read linearly
• May or may not be seekable
• Comes in chunks of data
Why do I care about streams?
• They are created to handle massive amounts of data
• Assume all files are too large to load into memory
• If this means checking size before load, do it
• If this means always treating a file as very large, do it
• PHP streams were meant for this!
What uses streams in PHP?
• EVERYTHING
• include/require (and the _once variants)
• stream functions
• file system functions
• many other extensions
ALL IO
How PHP Streams Work
[Diagram: Attach Context, Stream Transport, Stream Filter, Stream Wrapper]
Using Streams
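The code from this slide did not survive in the transcript. As a minimal sketch of the idea, assuming a hypothetical helper named `countLines` and a throwaway temp file standing in for a very large file, a stream handle lets you walk a file one line at a time so only a single line sits in memory:

```php
<?php
// Hypothetical helper (not from the talk): count lines without loading the file.
function countLines(string $path): int
{
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    $count = 0;
    try {
        // fgets() returns one line per call, or false at end of file.
        while (fgets($handle) !== false) {
            $count++;
        }
    } finally {
        fclose($handle);
    }
    return $count;
}

// Demo data: a small temp file standing in for a "very large" one.
$path = tempnam(sys_get_temp_dir(), 'stream');
file_put_contents($path, "one\ntwo\nthree\n");
$lineCount = countLines($path);
unlink($path);

echo $lineCount, PHP_EOL;
```

The same loop works unchanged on any readable stream, which is the point of treating every file as if it were huge.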
You can also do logic on the fly!
What are Filters?
• Performs operations on stream data
• Can be prepended or appended (even on the fly)
• Can be attached to read or write
• When a filter is added for read and write, two instances of the filter are created.
Using Filters
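The code from this slide is also missing from the transcript. A minimal sketch, assuming an in-memory stream as stand-in data, attaches PHP's built-in `string.toupper` filter to the read side of a stream so the data is transformed as it is read:

```php
<?php
// Write some data into an in-memory stream, then rewind it for reading.
$stream = fopen('php://memory', 'w+b');
fwrite($stream, "terabytes of cat gifs\n");
rewind($stream);

// Attach the built-in string.toupper filter to the read side only.
stream_filter_append($stream, 'string.toupper', STREAM_FILTER_READ);

// The filter runs as the data passes through; the stored bytes are untouched.
$upper = fgets($stream);
fclose($stream);

echo $upper;
```

Passing `STREAM_FILTER_WRITE` instead would apply the filter to data on its way into the stream.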
Things to watch for!
• Data has an input and output state
• When reading in chunks, you may need to cache in between reads to make filters useful
• Use the right tool for the job
Throw away your assumptions except for:
There will be Terabytes of Cat Gifs!!
Dimension
Both an object’s size and mathematical space
Random Access Memory (RAM)
• The CPU uses RAM to work
• It randomly shoves data inside and pulls data back out
• RAM is faster than SSD and HDD
• It’s also more expensive
Out of Memory
There are two reasons you’ll see that error
• Recursion recursion recursion recursion
  • Solution: install xdebug and get your stacktrace
• Loading too much data into memory
  • Solution: manage your memory
Inherently PHP hides this problem
• Share nothing architecture
• Extensions with C libraries that hide memory consumption
• FastCGI/CGI blows away processes, restoring memory
• Max child and other Apache settings blow away children, restoring memory
How do I fix it!
Halp, I can’t upload!!
Arrays are evil
• There are other ways to store data that are more efficient
• They should be used for small amounts of data
• No matter how hard you try, there is C overhead
Process with the appropriate tools
• Load data into the appropriate place for processing
• Hint – arrays are IN MEMORY – that is generally not an appropriate place for processing
• Datastores are meant for storing and retrieving data, use them
Select * from table
Use the iteration, Luke
• Lazy fetching exists for database fetching – use it!
• Always page (window) your result sets from the database – ALWAYS
• Use filters or generators to format or alter results on the fly
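One way to combine paging and lazy fetching is a generator. The sketch below is an assumption-laden illustration, not the talk's code: the `cats` table, the helper name `fetchPaged`, and the in-memory SQLite database (which needs the `pdo_sqlite` extension) are all made up for the demo.

```php
<?php
// Hypothetical helper: yield rows one at a time, querying one page per round trip.
function fetchPaged(PDO $db, int $pageSize): Generator
{
    $stmt = $db->prepare(
        'SELECT id, name FROM cats WHERE id > :last ORDER BY id LIMIT :size'
    );
    $lastId = 0;
    do {
        $stmt->bindValue(':last', $lastId, PDO::PARAM_INT);
        $stmt->bindValue(':size', $pageSize, PDO::PARAM_INT);
        $stmt->execute();
        $rows = 0;
        // fetch() pulls a single row; fetchAll() would build the whole array.
        while (($row = $stmt->fetch(PDO::FETCH_ASSOC)) !== false) {
            $rows++;
            $lastId = (int) $row['id'];
            yield $row;
        }
    } while ($rows === $pageSize); // a short page means we reached the end
}

// Demo data in an in-memory SQLite database (assumed for illustration).
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE cats (id INTEGER PRIMARY KEY, name TEXT)');
$insert = $db->prepare('INSERT INTO cats (name) VALUES (?)');
foreach (['Tiger', 'Mog', 'Tom', 'Felix', 'Salem'] as $name) {
    $insert->execute([$name]);
}

$names = [];
foreach (fetchPaged($db, 2) as $row) {
    // Format or alter each row on the fly here.
    $names[] = strtoupper($row['name']);
}
echo implode(', ', $names), PHP_EOL;
```

Paging by the last seen `id` rather than by `OFFSET` is a design choice in this sketch; it keeps each page query cheap on large tables.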
The N+1 problem
• In simple terms, nested loops
• Don’t distance yourself too much from your datastore
• Collapse into one or two queries instead
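To make the nested-loop shape concrete, here is a small sketch with made-up `owners` and `cats` tables in an in-memory SQLite database (assumed for illustration; needs `pdo_sqlite`). The first half issues one query per owner, the second collapses the work into a single JOIN and produces the same result.

```php
<?php
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE owners (id INTEGER PRIMARY KEY, name TEXT)');
$db->exec('CREATE TABLE cats (id INTEGER PRIMARY KEY, owner_id INTEGER, name TEXT)');
$db->exec("INSERT INTO owners (id, name) VALUES (1, 'Ann'), (2, 'Bob')");
$db->exec("INSERT INTO cats (owner_id, name) VALUES (1, 'Tiger'), (1, 'Mog'), (2, 'Tom')");

// N+1: one query for the owners, then one more query for every owner.
$nPlusOne = [];
$catStmt = $db->prepare('SELECT name FROM cats WHERE owner_id = ? ORDER BY id');
$owners = $db->query('SELECT id, name FROM owners ORDER BY id')->fetchAll(PDO::FETCH_ASSOC);
foreach ($owners as $owner) {
    $catStmt->execute([$owner['id']]);
    $nPlusOne[$owner['name']] = $catStmt->fetchAll(PDO::FETCH_COLUMN);
}

// Collapsed: a single JOIN returns every owner/cat pair in one round trip.
$joined = [];
$sql = 'SELECT o.name AS owner, c.name AS cat
        FROM owners o JOIN cats c ON c.owner_id = o.id
        ORDER BY o.id, c.id';
foreach ($db->query($sql, PDO::FETCH_ASSOC) as $row) {
    $joined[$row['owner']][] = $row['cat'];
}

var_dump($joined === $nPlusOne);
```

With two owners the difference is three queries versus one; with a million owners it is a million and one versus one.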
Throw away all your assumptions except:
Speed
The rate at which an object covers distance
How does a CPU work?
CPU limitations
• Transmission delays
• Heat
• Both are materials limitations
• http://www.mooreslaw.org/
Why I no longer overclock
What does this have to do with PHP?
• You are limited by the CPU your site is deployed upon.
• Yes even in a cloud – there are still physical systems running your stuff
• Yes even in a VM – there are still physical systems running your stuff
• Follow good programming habits
• PROFILE
Good programming habits
• Turn on opcache in production!
• Keep your code error AND WARNING free
• Watch complex logic in loops
  • Short circuit the loop
  • Rewrite to do the logic on the entire set in one step
  • Calculate values only once
  • On small arrays use array_walk
  • On large arrays use generators/iterators
• Use isset instead of in_array if possible
• Profile to find the place to rewrite for slow code issues
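The isset tip can be sketched as follows. The `$allowed` list is made-up demo data; the idea is to flip the list once so its values become array keys, after which each membership check is a hash lookup instead of a linear scan.

```php
<?php
$allowed = ['gif', 'png', 'jpg'];

// in_array() walks the list on every call.
$viaInArray = in_array('png', $allowed, true);

// Flip once (values become keys), then isset() is a hash lookup per check.
$allowedLookup = array_flip($allowed);
$viaIsset = isset($allowedLookup['png']);
$missing  = isset($allowedLookup['bmp']);

var_dump($viaInArray, $viaIsset, $missing);
```

This only pays off when the same list is checked many times, for example inside a loop, and it only works for values that are valid array keys (strings and integers).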
Profiling Options
Free
• Xdebug
• Xhprof
• Uprofiler
Paid
• NewRelic
• AppDynamics
• Blackfire
• Tideways
• Tracelytics
Distribute the load
• Perfect for heavy processing for some type of data
• Queue code that requires heavy processing but not immediate viewing
• Design your UX so you can inform users of completed jobs
• Cache complex work items
Pick your system
• php-resque
• Gearman
• Beanstalkd
• IronMQ
• RabbitMQ
• ZeroMQ
• AmazonSQS
• Just visit http://queues.io
Job queuing and 10K page pdfs
Keep your CPU happy
• Offload processing
• Use a queue
Velocity
Speed + Direction
Networking 101
• IP – forwards packets of data based on a destination address
• TCP – verifies the correct delivery of data from client to server with error and lost data correction
• Network Sockets – subroutines that provide TCP/IP (and UDP and some other support) on most systems
Packet of Data
Speed in the series of tubes
• Bandwidth – size of your pipe
• Latency – length of your pipe including size changes
• Jitter – air bubbles in your pipe
Network Socket Types
• Stream
  • Connection oriented (tcp)
• Datagram
  • Connectionless (udp)
• Raw
  • Low level protocols
Definitions
• Socket
  • Bidirectional network stream that speaks a protocol
• Transport
  • Tells a network stream how to communicate
• Wrapper
  • Tells a stream how to handle specific protocols and encodings
Using Sockets
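The slide's code is not in the transcript. As a rough sketch under stated assumptions (a loopback listener on an OS-chosen port, both ends in one script purely for demonstration), a TCP socket in PHP is just another stream, so the usual `fwrite` and `fgets` calls apply:

```php
<?php
// Listen on an ephemeral loopback port (port 0 lets the OS choose one).
$server = stream_socket_server('tcp://127.0.0.1:0', $errno, $errstr);
if ($server === false) {
    throw new RuntimeException("Listen failed: $errstr ($errno)");
}
$address = stream_socket_get_name($server, false); // e.g. 127.0.0.1:<port>

// Connect with an explicit 2 second connection timeout.
$client = stream_socket_client("tcp://$address", $errno, $errstr, 2.0);
if ($client === false) {
    throw new RuntimeException("Connect failed: $errstr ($errno)");
}

// Accept the pending connection on the server side, again with a timeout.
$peer = stream_socket_accept($server, 2.0);
if ($peer === false) {
    throw new RuntimeException('Accept timed out');
}

// The socket is an ordinary stream: fwrite() and fgets() work on it.
fwrite($client, "ping\n");
$received = fgets($peer);

fclose($client);
fclose($peer);
fclose($server);

echo $received;
```

In real code the two ends live in different processes or machines; the timeouts on connect and accept are what keep a dead peer from hanging the script.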
What does this have to do with PHP?
• APIs fail
• APIs go bye-bye
• AWS goes down
  • Or loses network connection to a specific area
  • Or otherwise fails
What do you mean we can’t write files?
Prepare for failure
• Handle timeouts
• Handle failures
• Abstract enough to replace systems if necessary, but only as much as necessary
• If you’re not paying for it, don’t base your business model on it
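A minimal sketch of handling timeouts and failures around a remote call. The helper name `fetchOrFallback`, the URL, and the fallback text are hypothetical; the demo assumes nothing is listening on loopback port 1, so the request fails quickly.

```php
<?php
// Hypothetical wrapper: never wait longer than $timeout seconds, and turn
// any failure into a fallback value instead of a warning or a hung page.
function fetchOrFallback(string $url, string $fallback, float $timeout = 2.0): string
{
    $context = stream_context_create([
        'http' => ['timeout' => $timeout],
    ]);
    // file_get_contents() returns false on connection errors, timeouts,
    // and (by default) HTTP error statuses.
    $body = @file_get_contents($url, false, $context);
    return $body === false ? $fallback : $body;
}

// Assumption: no service listens on 127.0.0.1:1, so this fails fast.
$result = fetchOrFallback('http://127.0.0.1:1/status', 'service unavailable');
echo $result, PHP_EOL;
```

Keeping the third-party call behind one small function like this is also what makes it replaceable when the service is retired.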
Checklist
• Cultivate good coding habits
• Try not to loop logic or processing
• Don’t be afraid to offload work to other systems or services
• Assume every file is huge
• Assume there are 1 million rows in your DB table
• Assume that every network request is slow or going to fail
• Profile to find code bottlenecks, DON’T assume you know the bottleneck
• Wrap 3rd party tools enough to deal with downtime or retirement of APIs
About Me
http://emsmith.net
[email protected]
twitter - @auroraeosrose
IRC – freenode – auroraeosrose #phpmentoring
https://joind.in/talk/9f43b