Mario Wedding Invitations

Awesome!!

Cherenkov Radiation

While electrodynamics holds that the speed of light in a vacuum is a universal constant (c), the speed at which light propagates in a material may be significantly less than c. For example, the speed of the propagation of light in water is only 0.75c. Matter can be accelerated beyond this speed (although still to less than c) during nuclear reactions and in particle accelerators. Cherenkov radiation results when a charged particle, most commonly an electron, travels through a dielectric (electrically polarizable) medium with a speed greater than that at which light would otherwise propagate in the same medium.
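As a rough worked example of that threshold, the sketch below computes the minimum kinetic energy a charged particle needs to outrun light in a medium. It takes the 0.75c figure for water from the text above; the function name and the rounded electron rest energy (0.511 MeV) are assumptions made for illustration.

```python
import math

# Approximate electron rest energy in MeV (rounded value, assumed here).
ELECTRON_REST_ENERGY_MEV = 0.511


def cherenkov_threshold_kinetic_energy(light_speed_fraction,
                                       rest_energy_mev=ELECTRON_REST_ENERGY_MEV):
    """Minimum kinetic energy (MeV) at which a particle moves faster than
    light propagates in the medium.

    light_speed_fraction is the speed of light in the medium as a fraction
    of c, e.g. 0.75 for water.
    """
    if not 0.0 < light_speed_fraction < 1.0:
        raise ValueError("light_speed_fraction must be between 0 and 1")
    beta = light_speed_fraction                # particle speed / c at threshold
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)   # Lorentz factor at that speed
    return (gamma - 1.0) * rest_energy_mev     # relativistic kinetic energy


if __name__ == "__main__":
    print(cherenkov_threshold_kinetic_energy(0.75))
```

For an electron in water this comes out to roughly 0.26 MeV, which is why electrons from nuclear reactions readily exceed it.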

Drawbridge From Microsoft Research

A very interesting and clever method of maintaining support for older applications without carrying that baggage forward in the OS forever. Kudos to Microsoft on this one; I hope to see this in production soon.

Systems We Make

These are indeed great times for Distributed Systems enthusiasts. The boom in the number and variety of systems being built in both academia and the industry has created a strong need to curate interesting creations under one roof.
Systems We Make was conceived to fill this void.
Although Systems We Make is still in its infancy, I hope to shape it into something more than just a catalog. So stay tuned as we evolve this site, and do write to me about how you feel!

Great collection of papers here.

Sharding & IDs at Instagram

We’ve delegated ID creation to each table inside each shard, by using PL/pgSQL, Postgres’ internal programming language, and Postgres’ existing auto-increment functionality.

Each of our IDs consists of:

  • 41 bits for time in milliseconds (2^41 ms gives us roughly 69 years of IDs with a custom epoch)
  • 13 bits that represent the logical shard ID
  • 10 bits that represent an auto-incrementing sequence, modulus 1024. This means we can generate 1024 IDs, per shard, per millisecond
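The bit layout above can be sketched in a few lines. This is only an illustration of the packing, written in Python rather than the PL/pgSQL the excerpt describes; the epoch value and the function names are hypothetical, and the caller is assumed to supply the per-shard sequence number.

```python
# Hypothetical custom epoch in Unix milliseconds; not taken from the source.
EPOCH_MS = 1_300_000_000_000

TIME_BITS = 41
SHARD_BITS = 13
SEQ_BITS = 10


def make_id(now_ms, shard_id, seq):
    """Pack time, logical shard ID and sequence into one 64-bit integer."""
    millis = now_ms - EPOCH_MS
    if not 0 <= millis < (1 << TIME_BITS):
        raise ValueError("timestamp outside the 41-bit range")
    if not 0 <= shard_id < (1 << SHARD_BITS):
        raise ValueError("shard ID outside the 13-bit range")
    return ((millis << (SHARD_BITS + SEQ_BITS))   # top 41 bits: time
            | (shard_id << SEQ_BITS)              # middle 13 bits: shard
            | (seq % (1 << SEQ_BITS)))            # low 10 bits: sequence mod 1024


def split_id(packed):
    """Recover (timestamp in Unix ms, shard ID, sequence) from a packed ID."""
    seq = packed & ((1 << SEQ_BITS) - 1)
    shard_id = (packed >> SEQ_BITS) & ((1 << SHARD_BITS) - 1)
    millis = packed >> (SHARD_BITS + SEQ_BITS)
    return millis + EPOCH_MS, shard_id, seq


if __name__ == "__main__":
    packed = make_id(EPOCH_MS + 1, shard_id=5, seq=1025)
    print(packed, split_id(packed))
```

Because the timestamp occupies the most significant bits, IDs generated later always compare greater than earlier ones, regardless of shard or sequence.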

What Is a DSP?

Note: this was originally a comment on a Hacker News post asking what a DSP is.

DSP (demand-side platform) is a term for a service that connects to multiple ad inventory sources (ad exchanges, ad networks, etc.) and allows an advertiser to upload their creative media and targeting criteria once, then buy on all the inventory sources that the DSP integrates. The demand side refers to the fact that DSPs (Invite Media, Turn, DataXu, etc.) are used by advertisers to buy inventory, as opposed to the supply side: the publishers (web sites) or their representatives (ad networks, Rubicon, Pubmatic, etc.) that have the inventory slots the DSP is to fill. In display advertising, “supply” is code for “audience” (people) and “demand” is code for money.

All of this works via the magic of RTB (real-time bidding) APIs, which certain ad exchanges and networks implement and which allow a DSP to see and bid on each individual impression in real time, right before it is rendered. It works as follows:

  • An end user loads a web page in his browser with an RTB-enabled ad tag on it
  • The ad tag loads and calls back to its origin (e.g. Google AdX) for a creative to show
  • The origin initiates a real-time auction with all of its partners (DSPs or Appnexus¹) to find a creative to show
  • The origin sends each partner a bid request containing such information as the origin IP address, geographical information, the site on which the tag is placed, etc.
  • Each partner searches its inventory for matching creatives
  • The partner picks one to bid on and chooses a bid price to offer
  • It sends that back to the origin
  • The origin chooses the winner of the auction and then:
      • Sends the winning creative tag back to the publisher’s ad tag to be shown
      • Debits the winning partner’s account by the amount bid (or something less, depending on the auction model)
  • The ad loads in the browser and the end user is pissed off that he has yet to enable AdBlock
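The auction steps above can be sketched as a toy, in-process model. Nothing here reflects a real exchange's API: the BidRequest and Bid types, the run_auction function, and the choice of a second-price rule (one way of charging "something less" than the winning bid) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class BidRequest:
    """What the origin tells each partner about the impression (hypothetical fields)."""
    ip_address: str
    geo: str
    site: str


@dataclass
class Bid:
    """A partner's answer: which creative to show and what it offers to pay."""
    partner: str
    creative_tag: str
    price: float


# A partner is modeled as a function that may decline by returning None.
Bidder = Callable[[BidRequest], Optional[Bid]]


def run_auction(request: BidRequest, partners: List[Bidder],
                second_price: bool = True) -> Optional[Tuple[Bid, float]]:
    """Collect bids, pick the highest, and return (winning bid, amount charged)."""
    bids = [bid for bid in (partner(request) for partner in partners)
            if bid is not None]
    if not bids:
        return None  # nobody wanted this impression
    bids.sort(key=lambda bid: bid.price, reverse=True)
    winner = bids[0]
    if second_price and len(bids) > 1:
        charge = bids[1].price   # winner pays the runner-up's price
    else:
        charge = winner.price    # first-price: winner pays its own bid
    return winner, charge


if __name__ == "__main__":
    request = BidRequest("203.0.113.7", "US-NY", "example.com")
    partners = [
        lambda r: Bid("dsp_a", "<creative a>", 2.50),
        lambda r: Bid("dsp_b", "<creative b>", 1.75),
        lambda r: None,  # this partner has no matching creative
    ]
    print(run_auction(request, partners))
```

A real exchange would send the bid requests over the network in parallel and drop any partner that misses a tight response deadline; the sequential loop here only shows the decision logic.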

As for who offers RTB APIs, those are typically ad exchanges, SSPs (“supply-side platforms”, also known as “yield optimizers”, e.g. Rubicon, Pubmatic, AdMeld) and some ad networks. The largest ad exchange, Yahoo! RightMedia, has yet to offer an RTB API, although they are purported to be working on it.

DSPs typically use the information provided in the bid request to inform the bidding decision, but they also rely heavily on third-party data sources for extra information about the user and the publisher domain/URL in order to maximize the likelihood of offering a relevant ad for the cheapest price. Proximic, BlueKai, eXelate, Bizo, TARGUSinfo and many more offer specialized data streams for this purpose.

DSPs serve as aggregators of RTB-enabled inventory sources for advertisers so that they can buy across all of them without having to manage N sets of creatives, N sets of bids, N sets of targeting criteria, N relationships, etc. This allows advertisers to concentrate on the audience they want to reach and not where to find them. Having said that, the larger brand advertisers will typically use an ad agency and not deal with DSPs directly; the agency will use a DSP on their behalf and charge a markup for the service. This is typically called “execution” in the same vein as “execution of trades”. DSPs are the algo traders of the online ad world.

¹ I mention Appnexus separately because they are not technically a DSP. While they do connect to the RTB APIs of many of the inventory sources and can serve as a DSP, they are primarily a platform on which to create a DSP or some other real-time-enabled ad processing engine. Many of the DSPs out there are just whitelabeled Appnexus, in that they use Appnexus’ bidder and UI for their “DSP”. Appnexus also has exchange-like properties, e.g. they have inventory exclusive to themselves (i.e. what used to be called Microsoft AdECN) that is only available by integrating with them/using their platform.

 

The Facebook Equivalent of the 90s

So true :)

Rust Language Keynote

Flajolet-Martin Algorithm

The Flajolet-Martin algorithm approximates the number of unique objects in a stream or a database in one pass. If the stream contains n elements with m of them unique, this algorithm runs in O(n) time and needs O(log(m)) memory. So the real innovation here is the memory usage, in that an exact, brute-force algorithm would need O(m) memory (e.g. think “hash map”).

Twitter Scala School

Lessons

  • Basics: Values, functions, classes, methods, inheritance, try-catch-finally. Expression-oriented programming
  • Basics continued: Case classes, objects, packages, apply, update, Functions are Objects (uniform access principle), pattern matching
  • Collections: Lists, Maps, functional combinators (map, foreach, filter, zip, folds)
  • Pattern matching & functional composition: More functions! PartialFunctions, more Pattern Matching
  • Type & polymorphism basics: Basic Types and type polymorphism, type inference, variance, bounds, quantification
  • Advanced types: Advanced Types, view bounds, higher-kinded types, recursive types, structural types
  • Simple Build Tool: All about SBT, the standard Scala build tool
  • More collections: Tour of the Scala Collections library
  • Testing with specs: Write tests with Specs, a BDD testing framework for Scala
  • Concurrency in Scala: Runnable, callable, threads, Futures, Twitter Futures
  • Java + Scala: Java interop: Using Scala from Java
  • An introduction to Finagle: Finagle primitives: Future, Service, Filter, Builder
  • Searchbird: Building a distributed search engine using Finagle