Redis for your boss

  • Published on
    08-Jan-2017

Transcript

<ul>
<li><p>REDIS FOR YOUR BOSS</p><p>ELENA KOLEVSKA</p></li>
<li><p>Who am I?</p><p>A nomad earthling. Lead developer @ www.speedtocontact.com</p><p>BLOG.ELENAKOLEVSKA.COM</p><p>@ELENA_KOLEVSKA</p></li>
<li><p>(Really) SHORT INTRO</p><p>No, seriously!</p></li>
<li><p>REDIS IS AN OPEN SOURCE (BSD licensed), IN-MEMORY DATA STRUCTURE STORE, USED AS DATABASE, CACHE AND MESSAGE BROKER</p><p>The 'Definition' on redis.io</p></li>
<li><p>BASIC FEATURES:</p><p>Different data structures</p><p>Keys with a limited time-to-live</p><p>Transactions</p><p>Pipelining</p><p>Lua scripting</p><p>Pub/Sub</p><p>Built-in replication</p><p>Different levels of on-disk persistence</p></li>
<li><p>SPEED</p></li>
<li><p>AVAILABLE CLIENTS IN:</p><p>ActionScript, bash, C, C#, C++, Clojure, Common lisp, Crystal, D, Dart, Elixir, emacs lisp, Erlang, Fancy, gawk, GNU Prolog, Go, Haskell, Haxe, Io, Java, Javascript, Julia, Lua, Matlab, mruby, Nim, Node.js, Objective-C, OCaml, Pascal, Perl, PHP, Pure Data, Python, R, Racket, Rebol, Ruby, Rust, Scala, Scheme, Smalltalk, Swift, Tcl, VB, VCL</p></li>
<li><p>AVAILABLE DATA STRUCTURES</p><p>Strings (Binary safe, can be anything from "hello world" to a jpeg file)</p><p>Lists (Collections of string elements sorted according to the order of insertion)</p><p>Sets (Collections of unique, unsorted string elements)</p><p>Sorted sets (It's like Sets with a score)</p><p>Hashes (Maps of fields associated to values. 
Think non-nested JSON objects)</p><p>Bitmaps (Manipulate Strings on a bit level)</p><p>HyperLogLogs (Probabilistic data structure used to estimate the cardinality of a set)</p></li>
<li><p>Imagine...</p></li>
<li><p>TWITTER ANALYSIS TOOL</p><p>Track a selected group of hashtags (#gameofthrones, #got, #gotseason7)</p><p>Count mentions of certain keywords ('winter is coming', 'tyrion', 'jon snow', 'stark', 'targaryen', 'cersei', 'asha greyjoy', 'Khaleesi', 'sansa', 'arya')</p><p>METRICS:</p><p>A feed of all tweets containing one of the hashtags</p><p>Total number of tweets with one or more of the selected hashtags</p><p>A leaderboard of keyword frequency</p><p>A feed of tweets per keyword</p></li>
<li><p>[1] CONNECTING TO REDIS</p><p>Install the Predis package using Composer:</p><p>composer require predis/predis</p><p>...
// Initialize the client
$parameters = [
    'scheme' =&gt; 'tcp',
    'host' =&gt; '127.0.0.1',
    'port' =&gt; 6379
];
$client = new Predis\Client($parameters, ['prefix' =&gt; 'twitter_stats:']);</p></li>
<li><p>[2] SET TRACKED DATA</p><p>Use sets to store all the hashtags we'll be looking at and all the keywords as well</p><p>$client-&gt;sadd('hashtags', 'gameofthrones', 'got', 'gotseason7');
// hashtags | 'gameofthrones'
//          | 'got'
//          | 'gotseason7'</p><p>$client-&gt;sadd('keywords', 'winter is coming', 'winterfell', 'jon snow', 'stark', 'targaryen', 'cersei', 'asha greyjoy', 'dorne', 'Khaleesi', 'hodor', 'sansa', 'arya', 'white walkers', 'the night king');</p></li>
<li><p>[3] GET THE DATA</p><p>Use the Twitter Streaming API to receive notifications for tweets containing any of the hashtags we're following</p><p>$hashtags = $client-&gt;smembers('hashtags');
// array (size=3)
//   0 =&gt; string 'got' (length=3)
//   1 =&gt; string 'gameofthrones' (length=13)</p></li>
<li><p>Save every new tweet from the stream as a separate String.
$keyname = 'tweet_id:' . 
$tweet_id;
$tweet_contents = "Winter is coming Khaleesi! #gameofthrones";</p><p>$client-&gt;set($keyname, $tweet_contents);
// 'tweet_id:45645656' | 'Winter is coming Khaleesi! #gameofthrones'</p><p>And then push to a queue to be processed asynchronously:</p><p>// Use the list data structure as a queue
$client-&gt;lpush('message_queue', $keyname);
// 'message_queue' | 'tweet_id:45645656'
//                 | 'tweet_id:44645234'
//                 | 'tweet_id:43645232'</p></li>
<li><p>[4] WORKER TO PROCESS THE QUEUED JOBS</p><p>A separate worker grabs jobs off the opposite end of the queue (LPUSH plus RPOP gives first-in, first-out order) and processes them:</p><p>$keyname = $client-&gt;rpop('message_queue');
// 'message_queue' | 'tweet_id:45645656'
//                 | 'tweet_id:44645234'
//                 | 'tweet_id:43645232'</p><p>Reliable queue: RPOPLPUSH, BRPOPLPUSH</p><p>Blocking queue: BLPOP, BRPOP</p></li>
<li><p>[5] PROCESS THE TWEET CONTENT</p><p>$tweet_contents = $client-&gt;get($keyname);
$keywords = $client-&gt;smembers('keywords');</p><p>foreach ($keywords as $keyword) {
    $tweet_contents = strtolower($tweet_contents);
    $keyword = strtolower($keyword);
    if (strpos($tweet_contents, $keyword) !== false) {
        $client-&gt;zincrby('mention_counter', 1, $keyword); // Increase the counter for this specific keyword
        // mention_counter | 'tyrion' =&gt; 9.00
        //                 | 'the wall' =&gt; 5.00
        //                 | 'arya' =&gt; 4.00
        $keyword_feed_keyname = 'keyword_feeds:' . 
$keyword;
        $client-&gt;lpush($keyword_feed_keyname, $tweet_contents); // Add the tweet to the keyword's feed
        $client-&gt;ltrim($keyword_feed_keyname, 0, 50);
    }
}</p><p>$client-&gt;incr('total_count'); // Increase the general tweet count</p><p>$client-&gt;lpush('main_feed', $tweet_contents);
$client-&gt;ltrim('main_feed', 0, 100);</p></li>
<li><p>[6] SHOW THE STATS</p><p>$total_count = $client-&gt;get('total_count');
// 'total_count' | 259</p><p>$scores = $client-&gt;zrevrangebyscore('mention_counter', '+inf', '-inf', ['withscores' =&gt; 1]);
// mention_counter | 'tyrion' =&gt; 9.00
//                 | 'the wall' =&gt; 5.00
//                 | 'arya' =&gt; 4.00</p><p>// Feed by keyword
foreach ($scores as $keyname =&gt; $score) {
    $keyword_feeds[$keyname] = $client-&gt;lrange('keyword_feeds:' . $keyname, 0, -1);
}</p><p>// Feed of all tweets containing one of the specified hashtags
$main_feed = $client-&gt;lrange('main_feed', 0, -1);</p></li>
<li><p>[7] USEFUL EXTRAS</p><p>API RATE LIMITER</p><p>$ip = $_SERVER['REMOTE_ADDR'];
$timestamp = time(); // Unix timestamp
$key = 'api_rate_limits:' . $timestamp . ':' . $ip;
// $key = 'api_rate_limits:1473613000:192.168.10.1'</p><p>$api_requests = $client-&gt;get($key);</p><p>if (!is_null($api_requests) &amp;&amp; $api_requests &gt;= 3) {
    throw new Exception('Too many requests per second');
} else {
    $client-&gt;multi();
    $client-&gt;incr($key);
    $client-&gt;expire($key, 10);
    $client-&gt;exec();
}</p></li>
<li><p>[7] USEFUL EXTRAS</p><p>PIPELINING</p><p>$keyname = 'tweet_id:' . $tweet_id;
$keywords = $client-&gt;smembers('keywords');</p><p>$pipe = $client-&gt;pipeline();
$pipe-&gt;set($keyname, $tweet_contents);</p><p>foreach ($keywords as $keyword) {
    [...] 
}</p><p>$pipe-&gt;incr('total_count');
$pipe-&gt;lpush('main_feed', $tweet_contents);
$pipe-&gt;ltrim('main_feed', 0, 20);
$replies = $pipe-&gt;execute();</p></li>
<li><p>[7] USEFUL EXTRAS</p><p>LUA SCRIPTING</p></li>
<li><p>Thank you!</p></li>
<li><p>Questions?</p><p>@ELENA_KOLEVSKA</p><p>HTTPS://JOIND.IN/TALK/C68B2</p></li>
</ul>
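
The keyword-matching loop in step [5] can be pulled out of the Redis calls and checked on its own. What follows is a minimal sketch, not the speaker's code: `matchKeywords` is a hypothetical helper name, and it assumes plain case-insensitive substring matching, which is what the slide's strtolower/strpos pair amounts to. It returns the keywords found in a tweet, so the caller can run ZINCRBY and LPUSH once per match while leaving the tweet text itself unmodified.

```php
<?php
declare(strict_types=1);

/**
 * Return the tracked keywords that occur in a tweet, ignoring case.
 *
 * Hypothetical helper for illustration; it is not part of the original slides.
 *
 * @param string   $tweet    Raw tweet text.
 * @param string[] $keywords Tracked keywords, e.g. the members of the 'keywords' set.
 * @return string[] Keywords found in the tweet, in the order they were given.
 */
function matchKeywords(string $tweet, array $keywords): array
{
    $matches = [];
    foreach ($keywords as $keyword) {
        // stripos() is a case-insensitive strpos(); it returns false when
        // the keyword is absent and an integer offset (possibly 0) otherwise.
        if (stripos($tweet, $keyword) !== false) {
            $matches[] = $keyword;
        }
    }
    return $matches;
}

// Usage with the sample tweet from the slides.
$tweet = 'Winter is coming Khaleesi! #gameofthrones';
$matched = matchKeywords($tweet, ['winter is coming', 'Khaleesi', 'arya']);
// $matched is ['winter is coming', 'Khaleesi']
```

Keeping the matching separate from the Redis writes makes it testable without a running server. The matched keywords keep the case they were stored with, so a caller that wants the lowercase counter and feed keys used on the slides would still apply strtolower() before building those keys.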