Problem of the Week 1111
You just launched www.namemyiguana.com, a web site with a ton of traffic but very little money. You want to analyze your site's users, but you can only afford to store information about 100 users (we're in a recession, remember). Therefore, you decide to sample users uniformly over the lifetime of the site.
You begin by sampling the first 100 users. However, once the user base grows to some size
You must make a one-time decision about whether or not to sample a user the moment they first visit your site. If you choose to include a new user in the sample, you must replace some user currently in the sample with the new user.
What strategy would you use to ensure that, given a user base of size n, each of the n users has an equal probability of being in the sample? You must maintain the property as n grows.
Source: This week's problem is inspired by an interview question.
© Copyright 2009 Shilad Sen. Reproduced with permission.