OnStartups

How I Inadvertently Ran an MIT Student Hacking Contest For $3,001

Posted by Dharmesh Shah on December 6, 2012

It was 6:30pm on a random Tuesday evening in Cambridge. There I was, carefully counting a wad of crisp $100 bills in an unmarked white envelope.   I was parked in my car on Main Street, a block from the MIT campus.  The bills were new enough that they sort of stuck together.  The first time I counted them, I counted only 28 and had a brief moment of panic. There were supposed to be 30.  But, two more recounts, and I was fairly certain that there was $3,000 in the envelope.  This was the closest I think I've come to feeling like I was doing something illicit.  I was just waiting for a police officer to walk up to the car and tap on my window.

I was scheduled to hand this cash over to people I had never met.  At 7pm.  In a classroom at MIT's Stata Center (it's the building that Frank Gehry designed that looks like it came out of a Dr. Seuss book).

The $3,001 was payment for what was intended to be an inbound marketing contest.  The funny thing is, I inadvertently ended up running a hacker contest.  But, looking on the bright side, the payment could have gone as high as $50,000 (which would have maybe required a black briefcase and sunglasses).  So in some sense, I got off easy.  "This could have been much, much worse," I told myself.

[Image: MIT Stata Center]

Here's the story of how I ended up handing an envelope full of $100 bills to two complete strangers.

The Founder's Journey Class at MIT

Back when I was a grad student at MIT (this is back in the 2005/2006 timeframe), I took a class called “New Enterprises” with Ken Zolot and Howard Anderson.  It was the class to take if you were a student at MIT and had entrepreneurial leanings.  It's where I wrote the original business plan for HubSpot (the only business plan I've ever written).

Ken Zolot went on to teach a new class called “Founder's Journey” (6.933).  Students were primarily undergrad computer science majors.  Ken invited me in as a guest speaker, and I've been a regular guest lecturer ever since.

My topic at this guest lecture  has generally been inbound marketing.  To make things fun and interesting, I often hold a small contest/exercise during class (I've done this in other classes I teach as well).  The cash prize was generally between $20 and $100 (usually based on how much cash I had in my pocket at the time, and the nature of the exercise).  It often involved some sort of inbound marketing task:  Write a title for this blog post, figure out some SEO keywords for this company, etc.

 “Hey, why don't we make this more interesting…”

 

A week before I was scheduled to speak to the current Fall term class of “Founder's Journey”, Ken reached out to me and suggested we might want to try a more “in-depth” exercise.  Something the students would have a few days to complete, instead of a few minutes in class.  I thought that was a good idea, and noodled on the idea a bit.  I then sent Ken this fateful email.

[Image: the email to Ken Zolot]

I will summarize the rules for you:  Each student first sets a goal for themselves in terms of how many retweets they're trying to get.  They then try to post a tweet that gets that many (or more) retweets.  Simple enough, right?

Now, the astute among you will immediately recognize a couple of issues with this proposal:

1. I don't specify a maximum payout.

At the time, what I had in my head was: “It's kinda hard to get people to retweet something that much.  I have 160,000+ followers on Twitter, and even one of my popular tweets might get 100–200 retweets.”  None of these students is going to have that kind of following, ergo, the likely payout will be $20 to $50.  And, if someone's really good and gets a couple of hundred retweets, so be it.

2. I'm not explicit about what methods are considered legitimate/fair to acquire the retweets.

My topic was inbound marketing, and part of what I was hoping to teach/convey is that a) it's hard to get people to retweet, and b) it helps to experiment with different kinds of tweets (humorous, useful, controversial, social good, etc.).

What ended up happening was this “perfect storm” of mistakes in structuring this exercise/contest.

“I thought to myself:  Self, we have a problem…”

So, on Thursday night (t minus 6 days to the class), tweets with the #foundersjourney hashtag started showing up on my radar.  Not surprising.  That's the evening the class is held, so Ken had clearly given the students their assignment and the game was on.

Several of the tweets were amusing, like this one:

[Image: #foundersjourney tweet 1]

But, when I saw this tweet, I knew I had a potential problem.

[Image: #foundersjourney tweet 2]

Uh oh.  Exactly 300 retweets (as stated in the tweet), all of them looking like bots.  That's when I thought to myself, “Self, we have a problem…”  I realized that I had not explicitly stated that one couldn't set up a robot army to do the retweeting (that kind of defeats the whole purpose of the exercise, which was meant to be about inbound marketing).  But, the rules are the rules.  I had set myself up, and was prepared to pay the price.

Now, I just needed to figure out what that price was going to be.  So, I sent an email to Ken to ask the simple question: What were the top 5 retweet bids/projections?  (Remember, per the rules, I wouldn't have to pay out any more than what someone had bid).

"The top bids are: 50,000, 11,000, 10,000, 5,000, and 3,000", he said.  “Holy crap!”, I thought.  This is going to get interesting.

Later, I saw this tweet.

[Image: #foundersjourney tweet 3]

The tweet didn't have 3,259 retweets at the time — but it was climbing steadily.  The contest ran for 3 days, and as you might expect, I watched intently as the story unfolded.

Once the contest ended, I knew I was likely going to be out a fair amount of money.  I didn't know how much, because I didn't know what Paul's bid was.  But, I had a sneaking suspicion it was going to be relatively close to that 3,259 figure.
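How much depended entirely on the bid. Here is a minimal sketch of the payout rule as summarized earlier, under assumptions the post does not spell out: $1 per retweet, nothing paid if the tweet falls short of its bid, and never more than the bid. The function name `payout_dollars` is made up for illustration, and the retweet count is the 3,259 figure mentioned above.

```python
def payout_dollars(bid: int, retweets: int, dollars_per_retweet: int = 1) -> int:
    """Payout for one entry under the assumed rules.

    Assumptions (not confirmed by the post): $1 per retweet, nothing paid
    if the tweet falls short of the bid, and never more than the bid.
    """
    if retweets < bid:
        return 0  # goal missed: assumed to pay nothing
    return bid * dollars_per_retweet  # capped at the bid; extra retweets earn nothing


# Top five bids as reported by Ken, checked against a tweet with 3,259 retweets.
top_bids = [50_000, 11_000, 10_000, 5_000, 3_000]
retweets = 3_259

for bid in top_bids:
    print(f"bid {bid:>6,}: pays ${payout_dollars(bid, retweets):,}")

# Worst case for whoever is paying: the highest bidder actually hits the goal.
worst_case = max(payout_dollars(bid, bid) for bid in top_bids)
print(f"worst case: ${worst_case:,}")
```

Under these assumptions only the 3,000 bid pays out at 3,259 retweets, and the worst case comes to $50,000, the figure mentioned at the top of this post.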

So, I figured, what the hell, let me just reach out to this guy and get the skinny.  It's also when I had the idea to write this blog article.  I figured “Hey, maybe we can turn this into somewhat of a happy inbound marketing ending.”  I later learned that Paul and Tim (a classmate) had agreed to join forces and attack the challenge together.

30, Fresh, Crisp, $100 Bills

So, last night (Tuesday, December 5th), I went in to teach the Founder's Journey class (as scheduled).  I met Paul and Tim for the first time and handed them an envelope with 30 crisp $100 bills, while their classmates looked on with a combination of admiration and envy.  I then went about my business of convincing a bunch of undergrad computer science students that marketing was usually necessary, and that inbound marketing was the way to go about it.

Paul tweeted a picture of the cash later that night.  Here's that picture.

[Image: Paul's photo of the cash]

 

Here's a video of them hacking away (don't worry, there's no sound, so it's safe for work -- or classrooms).

CAPTCHAs, consoles and cash, Oh my!

I went ahead and did an email interview (this is all before I actually met the guys).  Here are the questions I asked, and their responses.  Only minor editing, and I've added some of my thoughts in italics.

1. How did you pick 3,000 as your bid/projection?

Paul: First of all, my bid is 3,001. I wanted to be able to say that I won "over $3000." Secondly, I wanted something worth doing. I realized that I got to set how successful this could be, so I didn't want to limit myself. I thought that was about the most I could get away with in 3 days.
Tim: So I picked 460, knowing I could probably get comfortably to that amount without raising any red flags. Once Paul mentioned he picked 3,000, the choice was between solving 2,500 captchas for no benefit… or getting a piece of the action. (Paul, of course, was nice enough to collaborate.)

2. At the time you submitted your bid, had you already thought about some sort of semi-automated approach?

Paul: My first idea was that some service must already exist to turn money into retweets, and I could just buy them for less than a dollar each. I tried  one of them with the ten free retweets, and they took about a day and a half to roll in. Too slow, I would need to make my own solution.
Tim: I was stoked when the contest rules were proposed. I wasn't sure if we were going to automate it at first; but I knew it was feasible, and intentionally asked no questions so that nothing would be off the table.  (This was smart.  No upside to asking a bunch of questions about the rules -Dharmesh)

3. How much software was involved in the process?  What did the software actually do?

Paul: The software did everything except solve the captcha. It scrapes the twitter mobile signup page for the captcha picture, saves that after running edge detection to make it even easier, and displays that on a constantly refreshing page. Type 6 numbers in the console, press enter, repeat. One twitter account after another.
Tim: After our setup, we had a script which would access mobile Twitter, generate a random First + Last name from dictionary words ("Rumply Nectarines"), pick the first Twitter-suggested username (@rumply) and then download a captcha. To us, this was just a page which showed a number, we'd type six numbers, [enter], and the next image pops up. The new account credentials are saved, so five hours later, we just had to run "retweet_this.py [tweet id]" once and slowly refresh our browser as the number ticked up. At about 3am we got our networks blocked, and we started trickling out retweets over my phone connection, slowly inching past 2500 and hoping not to get blocked again... (My guess is that if their network continued to get blocked, they would have found a proxy server somewhere to get around it. -Dharmesh)

4. What was the software developed in?  Did you use any third-party libraries for some of the low-level stuff?

Paul: We used Node.js which we've both used for a lot of personal projects. We used a really cool library called scrapi which did all the heavy lifting for us (and which Tim authored :D).
Tim: Just Node.js for scraping, and ImageMagick to resize and highlight the edges of the captcha. (Paul disagreed this made it easier. We had about five hours to micro-optimize the whole process). I think it's amazing what you can accomplish with just an HTTP client and no rate limiting.

5. How do you and Tim know each other?

Paul: We go way back, since we were first-years at Olin. We made a Lua-running operating system together, made a product designed for LARPers, and we share a provisional patent for a physical identity device as part of Lifegraph Labs that we are running this year with 4 other students. I wanted someone to help make this a reality, and I couldn't think of anyone more qualified than my friend Tim who is also in #foundersjourney. It really worked out for both of us.
Tim: We both attend Olin College, and not only is Olin small, but Paul and I have known and worked with each other on a ton of projects over the past four years, a large number of them gags and things we're terribly embarrassed to talk about. Our shining moment may have been an OS we coded from scratch, just to display pictures of ASCII cats.

6. Had you worked on something similar before?

Paul:  I've never had to make 3000+ Twitterbots, but I have scraped twitter for data for AI research projects and I think we've both had our lion's share of Stay Late And Code projects.
Tim: Paul and I have done hacks like this before, but the timing was funny. I'm currently researching easy and flexible ways to consume APIs and websites. This scraper was built on a library I'd been working on for a few months [1]. Hacker News had a heated discussion last month around Marco's article "What Happens When a Twitter Client Hits the Token Limit" (http://news.ycombinator.com/item?id=4795052). Scraping as a panacea for Twitter API restrictions was brought up in the comments, and as a proof-of-concept I thought I would bang out something to parse Twitter's mobile client, just to test how easy it would be to use Twitter without an API. Not long after, this contest was announced. Good timing, I guess! [2]
[1] see https://github.com/tcr/scrapi, and a similar scraper for Hacker News that uses it https://github.com/tcr/node-hackernews
[2] see https://github.com/tcr/scrapi/blob/b9bc481ba04c5bebf65fa77f4ded5d22929c9b21/examples/twitter.js seriously deleted it later that day so I wouldn't be encouraging TOS violations… whoops.

7.  Did you consider something like Mechanical Turk to automate the last step (the CAPTCHA)?

Paul: We sure did. Even better would be adding a fully automated captcha solver that someone else wrote that works most or some of the time. That's a great step 2 for this project, and next time we won't be doing this manually.
Tim: Mechanical Turk's turnaround would be too slow, plus, better not violate Amazon's TOS and Twitter's. There are captcha solvers available online for cheap ($5 for 1000) but by this point we had paid no money and automated most of the work. It felt slightly less dishonest if we did all the grueling work ourselves.

8. You don't know me from Adam, how did you know I was going to make good on the offer?

Paul: I figured that since you proposed the challenge in the first place, you would be a good sport about it. I figured it would be great for your reputation if you paid out, and maybe not so good if you didn't. In the end, we weren't going to lose more than a day programming and typing captchas, and the promise of a story this exciting was too good to pass up.
Tim: Can 3,000 Twitter bots get #dharmeshisaliar trending on Twitter? ;) Questioning the payout didn't even come up.  More worrisome was that someone would honestly (or dishonestly) beat us. We did the math; we solved 15 captchas a minute, so that's $900/hr, for five hours, with the risk we might get zero payout. That's a better lottery ticket than I've ever bought. It'll be fun Christmas shopping.
For the record, I was going to make the payout regardless of what the outcome was.  A deal's a deal.  I'm just glad I didn't have to pay out $50,000.  That would have stung a bit (because I could have angel invested that kind of money in two up-and-coming startups!) -Dharmesh
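
Tim's back-of-the-envelope math can be checked in a few lines. This is a minimal sketch, assuming each solved captcha yields one account and one retweet worth $1 (the $1 figure is inferred from his $900/hr number rather than stated in the rules summary). The function and constant names are made up for illustration.

```python
import math

CAPTCHAS_PER_MINUTE = 15   # Tim's figure
HOURS_WORKED = 5           # Tim's figure
BID = 3_001                # Paul's bid
DOLLARS_PER_RETWEET = 1    # assumption inferred from the $900/hr number


def dollars_per_hour(captchas_per_minute: int, dollars_per_retweet: int) -> int:
    """Peak earning rate if every solved captcha becomes one paid retweet."""
    return captchas_per_minute * 60 * dollars_per_retweet


def minutes_to_reach(target_accounts: int, captchas_per_minute: int) -> int:
    """Whole minutes of captcha typing needed to create target_accounts accounts."""
    return math.ceil(target_accounts / captchas_per_minute)


rate = dollars_per_hour(CAPTCHAS_PER_MINUTE, DOLLARS_PER_RETWEET)
typing_minutes = minutes_to_reach(BID, CAPTCHAS_PER_MINUTE)
effective_rate = BID * DOLLARS_PER_RETWEET / HOURS_WORKED

print(f"peak rate: ${rate}/hr")
print(f"captcha typing alone: {typing_minutes} minutes")
print(f"effective rate over {HOURS_WORKED} hours: ${effective_rate:.2f}/hr")
```

The $900/hr figure is the peak typing rate. At 15 captchas a minute, 3,001 accounts take a little under three and a half hours of typing, so spread over the full five hours the payout works out to roughly $600/hr. The interview doesn't break down where the rest of the time went.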

Closing thoughts and lessons learned

1. If you ever hold a contest with a bunch of MIT undergrad computer science students, assume that it's ultimately going to become a hacking contest.  Hackers hack.

 

2. Remember that any game that can be gamed will be gamed.  All part of the fun.

3. It's wise to actually state in the contest guidelines what the intent of the exercise is.  This will likely increase the chances that you get what you set out to get.  In this case, the irony was that I care immensely about inbound marketing, and I was worried that the students would walk away with the exact wrong lesson.  But, I think by the time my class was done, they "got it".

4. I'm not going to lose too much sleep over this because although there were Twitter bots involved, at least it wasn't spammy.  The solution did not involve picking some popular meme, following a bunch of real people or otherwise annoying folks that were outside the scope of the contest.

5. I paid for this out of my own pocket.  No HubSpot shareholders were harmed.  

6. I'm glad I was able to at least capture the story and tell it.  It's a fun little story, and had a happy ending.  

And, if you want to help make it an even happier ending, help me find a great Python/Django developer.  This is to work with me on some cool HubSpot Labs projects.  But, you need to be really, really good -- like working over email, and have the ability to create things that are not fugly.  If this is you, or someone you know, email me at dshah {at} onstartups {dot-com}.  Thanks!