2010-12-09

Social Proof with Redis

I learned about this method of implementing social proof from the guys at Scribd during their Web 2.0 Expo presentation “Social Design with Facebook”. When implementing avatars, the Scribd guys suggest that you show all users who have liked a particular item; but show the CURRENT user’s friendsFIRST so they can quickly recognize their friends’ avatars at the top of the cluster of avatars. These are the avatars that really matter; the rest of the avatars are just to prove how popular a given item is.

The traditional, relational database solution (e.g., using MySQL) is I/O-intensive for getting this information due to the scale of modern social apps (the number of items and user-friend connections) and due to a number of other issues inherent in how the data is stored and retrieved (such as thecartesian product nature of joins).

The Scribd solution uses Redis which, if you haven’t heard of it, is essentially an in-memory set-based database (mathematical sets, which can support operations such as unions, intersections, subtractions, etc.). Scribd stores two types of sets in memory:

  1. All friend ids for a given user, for all users. So if user 1 has friends 2, 3, and 4 then the key would be 1, and the set would contain the values {2, 3, 4}. This size of storing these sets in memory would be ((# of users) x (number of average friends per user))
  2. All user ids for people who follow a particular item, for all items So if Item 10 is followed by Users 3, 6, and 7, then the key would be 10 and the set would contain the values {3, 6, 7}. This size of storing these sets in memory would be ((# of items) X (average # of people following any given item))

The social proof is quickly generated by doing:

  1. "The current user's friends who follow the current item" would be {the set of the current user's friends} intersect {the set of the current item's followers}. In the above example, it would be {2,3,4}intersect {3,6,7} which results in {3}.
  2. "The other people who follow the current item who are not friends with the current user" would be {the set of the current item's followers} subtract {the current user's friends}. In the above example it would be {3,6,7} subtract {1,2,3} which would be {6,7}.

That’s it! Two quick and painless, in-memory operations with no I/O bottlenecks to worry about. You would still store users, friendships, and item follows in a traditional database for persistence, but you would duplicate some of that content in Redis for these quick operations. Any CRUD operations would be duplicated in Redis. When you restart your servers, you would re-generate the in-memory sets.