Tag Archives: programming

I recently saw The Social Network, and though I have not yet reviewed it, the first thing I noticed after the movie was the lack of a single likable character. The movie clearly isn’t factual, and perhaps it’s that lack of individual kindness that makes the movie so engrossing. Regardless, the movie made me realize that I know plenty of interesting stories about advancements on the internet that will probably go largely unnoticed. So, today I’m going to lift a story that I read recently about the development of a tracker (for torrents). This tracker, as you’ll read in the story, currently supports over five million peers. What that means to you is that there are roughly five million (close to 2% of the US population, or about two thirds of the population of the largest US city, New York City) individuals sharing music together for free. This is a closed community, meaning that each member had to be invited and has to participate by following a rather complicated set of rules. Pretty interesting setting for a story. I took out the name of the tracker, just for fun, and all the names used are aliases, so hopefully they’ll be somewhat anonymous. Here’s the story as I received it:

________ is a private tracker. Thus, the entire site, staff, and community all revolve around a common piece of software – the tracker backend. Complementing the site frontend, which you’re looking at now, the tracker itself handles connections between peers. 

With over five million peers, our tracker receives an average of 3,500 hits per second, although after a period of tracker downtime, load can spike past 12,000 hits per second. This means that, at peak, when your client announces, the tracker has about 80 microseconds to search through its database of over 900,000 torrents and 5,000,000 peers, compute a response, and send it back to you. That’s a lot of stress on a piece of software! 
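
The 80 microsecond figure is simple arithmetic: a tracker that handles requests one at a time has one second divided by the request rate to spend on each of them. Here is a minimal sketch of that calculation using the two rates quoted above. The function name is made up for illustration, and the one-request-at-a-time model is an assumption.

```python
def budget_microseconds(hits_per_second: float) -> float:
    """Time available per request if requests are handled one at a time."""
    return 1_000_000 / hits_per_second


average = budget_microseconds(3_500)   # average load quoted in the story
peak = budget_microseconds(12_000)     # post-downtime spike quoted in the story

print(f"average load: {average:.0f} microseconds per announce")
print(f"peak load:    {peak:.0f} microseconds per announce")
```

At the peak rate this works out to roughly 83 microseconds per announce, which is presumably where the "80 microseconds" comes from. At the average rate the budget is closer to 286 microseconds.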

We anticipated this problem, of course, back before the site even started. That’s why we elected to use what was then the fastest private tracker backend in the world – XBTT. 

Lauded for its speed, XBTT handled the peers very well for the first few months of the site’s existence. We brought on a developer – asm – whose job was to tune it and modify it as needed, and he was able to do that just fine – for a few months. However, asm was reluctant to make any major changes. When we asked why, his response was that XBTT’s code was too weird, and that he was afraid he’d break something. 

A bit surprised, we lead site developers peered into the bowels of XBTT for the first time, and we found that he was correct. XBTT’s internal code worked fine in practice, but strange/outdated design decisions and the inclusion of thousands of lines of unnecessary code gave us worries about how well it would scale to a swarm of the size we had planned, as well as whether we’d be able to continue modifying it to our needs. 

So a plan was formed. We would create a tracker of our own. 

Late winter 2007

It made perfect sense. We were already replacing the outdated TBDev source with our own new Gazelle source, so why not replace XBTT with another piece of software as well? Make it fast, make the code pretty, give it a cool-sounding exotic animal name, and we’d be set. It couldn’t possibly take very long – trackers are very simple pieces of software, after all. The only problem was that XBTT had scared asm into hiding, the other developers were all php developers (php is a language that is fast to write and slow to run) and we wanted the tracker coded in C++ (slow to write, fast to run). The solution was thus to outsource. 

January 2008

Our first developer choice was a young developer called rootkit. Immensely intelligent, but perhaps not the greatest people person in the world, rootkit decided that he wanted to write the tracker in haskell instead. We weren’t too excited to have the tracker written in a weird language that no one understood, but he promised that it’d be fast so we let him go at it. We don’t think he ever wrote more than a hundred lines of it before he gave up and stepped down. 

While we searched for a new developer, WhatMan decided to try an experiment – to see if a php tracker could outperform XBTT. He hacked away for a weekend and created Lioness – a beautiful little tracker, no doubt one of the fastest php trackers ever made. Unfortunately, it wasn’t quite fast enough for our needs – upon testing, the swarm crushed our poor webserver, and we were forced to go back to XBTT.

By this time, XBTT was barely able to keep up with the load. The timeouts had already started, and we did whatever we could, but in the end, the only thing that really helped was when we moved to our then-new, ridiculously oversized server in Canada. 

March – May 2008

Another developer had been found! The guy was smart, mature, well educated, fluent in C++, and seemingly very able. We told him what we needed, and he started coding. A month later, the new dev – lenrek – had created the first tracker to call itself Ocelot. 

lenrek’s ocelot looked promising. It was new, shiny, and multithreaded. We figured that our problems were solved, but when we tried it out, it exploded. It is still unclear exactly why, just that it happened. That ocelot was tweaked and some more tests were run, but we eventually gave up. lenrek’s ocelot was basically shelved, and attention turned, for the next year, back to making XBTT handle its load properly. 

Fortunately for us, lenrek stayed on as a developer – although his ocelot didn’t succeed, he’s responsible in a large part for making the site work as well as it does today. 

June 2009 – February 2010

In the next year of stagnation, ocelot was never quite forgotten, but working on it was never very motivating – especially with only one tracker dev. So we raised the XBTT announce interval from 30 minutes to 35, then to 40, then to 45. In the meantime, the idea of ocelot waited until we found someone to revitalize it. In June 2009, FZeroX found such a person – rconan. 

rconan was incredibly intelligent, and came up with a plan for what everyone was pretty sure was going to be the most awesome tracker ever. High performance event queues, hashmaps, all that cool stuff. We outsourced the project to him, he started coding, and initial progress was very rapid. 

Two hundred changes and additions to rconan’s new ocelot were made between the months of August and October. Before we knew it, the new ocelot was all but finished – 4,000 lines of divine C++ code, with just “a few” bugs and features left to code. And then, rconan’s real life started to get busier. 

A couple of changes were made in November, a couple in December, one in January, and a final flurry of activity took place in February. When we asked for progress updates, ocelot was still a few bugfixes and features away from being ready for production, but no changes were ever made after February. As none of our in-house developers had been closely following the development of the new ocelot, we were unable to take over, and simply hoped that rconan’s real-life obligations would clear up and he’d have the time to finish it. 

In the meantime, we had raised XBTT’s announce interval to the highest point we could justify – 47 minutes – and it was still timing out so often it became a joke. In April 2010, we gave it its own server and started load balancing multiple instances of it – starting out with 2 XBTTs, and then 3, and then 4. This gave us some breathing room, but not for long. 

April – May 2010

At one point, A9 and oorza were arguing about java performance. A9 had the brilliant idea of daring oorza to write a high performance tracker in java, and work began on shadowolf. oorza proclaimed shadowolf “almost completely done” on May 12th, save a few outstanding bugs. We checked in on his progress at the end of August, and he was rewriting the entire plugin architecture, and considering using hadoop to store peers. We’re unsure about shadowolf’s current status. 

August-September 2010

No updates had been made to ocelot in eight months, and rconan was nowhere to be found. The future of shadowolf was unclear. When a thread came up about ocelot in the forums, the staff were forced to admit that development on it had ceased, and that no update was liable to take place in the near future. It was a hard post to write, considering how the timeouts had become so bad that the joke wasn’t funny anymore. Users would sometimes have to wait hours for the tracker to let them download things, stats were being lost left and right, and we were out of hardware to throw at the problem. Something had to be done. 

Enter WhatMan. Having previously stayed out of the C++ tracker development arena due to a lack of confidence in his high-performance C++ coding skills, WhatMan was confused as to why everyone was creating 4,000+ line behemoths when trackers are, in reality, extremely simple pieces of software. So he lifted some key design choices from rconan’s ocelot, created the rest of the design himself, and spent the last week of August hacking away at a brand new ocelot. 
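
To give a sense of how small the core of a tracker can be, here is a toy in-memory sketch in Python. It is not ocelot's design or code (ocelot is written in C++ and its source is not shown here); the class name, the peer cap, and the data layout are all assumptions made for illustration. The idea is only that an announce boils down to "record this peer under this torrent, then hand back some of the other peers".

```python
import random
from collections import defaultdict

ANNOUNCE_INTERVAL = 30 * 60  # seconds; the 30 minute interval mentioned in the story
MAX_PEERS_RETURNED = 50      # assumed cap, not taken from the story


class TinyTracker:
    """Toy in-memory tracker: maps each torrent to the peers announcing on it."""

    def __init__(self):
        # info_hash -> {peer_id: (ip, port)}
        self.swarms = defaultdict(dict)

    def announce(self, info_hash, peer_id, ip, port, event=None):
        swarm = self.swarms[info_hash]
        if event == "stopped":
            swarm.pop(peer_id, None)      # peer is leaving the swarm
        else:
            swarm[peer_id] = (ip, port)   # new peer or periodic refresh

        # Hand back a random sample of the *other* peers on this torrent.
        others = [addr for pid, addr in swarm.items() if pid != peer_id]
        sample = random.sample(others, min(len(others), MAX_PEERS_RETURNED))
        return {"interval": ANNOUNCE_INTERVAL, "peers": sample}


tracker = TinyTracker()
tracker.announce("abc", "peer1", "10.0.0.1", 6881)
reply = tracker.announce("abc", "peer2", "10.0.0.2", 6881)
print(reply["peers"])  # the only other peer on torrent "abc"
```

A real private tracker also has to authenticate users, expire peers that stop announcing, record upload and download stats, and encode its replies in the BitTorrent wire format, none of which is attempted here.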

On September 1st, ocelot was ready for performance testing. We replaced one XBTT instance with it, and it scaled. So we replaced two, and it scaled. We tweaked it a bit, and then replaced the third and fourth instances, tweaked it a bit more, and replaced the load balancer. What four XBTT instances and a load balancer had been failing to handle was now being handled by one single-threaded instance of the latest ocelot. 

Then we pushed it harder – we lowered the announce interval to 40 minutes, and then to 30, and it scaled. Then we lowered it to 20 minutes, and linux broke before ocelot did. It was beautiful. 

The dev team rejoiced, and banded together to add the remaining features and fix the remaining bugs. By September 3rd, ocelot was considered feature complete, and we let it run the entire swarm – one tracker for five million peers, at a 30 minute announce interval. 
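
The reason the announce interval works as a load dial is that, in steady state, every peer checks in about once per interval. Here is a rough sketch of that relationship, assuming perfectly evenly spread announces and nothing else hitting the tracker. The function name is made up, and the model is a simplification rather than anything taken from the story.

```python
def announces_per_second(peers: int, interval_minutes: float) -> float:
    """Steady-state rate if every peer announces exactly once per interval."""
    return peers / (interval_minutes * 60)


PEERS = 5_000_000  # swarm size quoted in the story

for minutes in (47, 40, 30, 20):
    rate = announces_per_second(PEERS, minutes)
    print(f"{minutes:>2} min interval -> about {rate:,.0f} announces/second")
```

Under this model, dropping from 47 minutes to 30 raises the periodic announce rate from about 1,773 to about 2,778 per second, and 20 minutes pushes it past 4,000. The model leaves out start, stop, and completion events as well as scrape requests, so real traffic will be higher.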

September 2010 – Now

Since then, ocelot’s been purring along. It uses up 20%-30% of one CPU core, and 3GB of RAM – for comparison, our four XBTT instances used the same amount of RAM in total, and 50%-100% of a core each. It’s 1,547 lines of code in total, and it will be open-sourced at some point. The dev team has added the occasional bugfix, and there may be some bugs yet to be discovered, but our tracker is now more stable than it’s been since we started. After over two and a half years, ocelot’s journey to creation is finally finished.

The best articles from the New York Times.

In Praise of Progress

David Brooks: Unemployment is high, and there’s suffering, but global poverty is at its lowest point in human history. Afghanistan is depressing, but there are fewer wars these days than ever before, mostly because of a sharp drop in civil wars. In short, everything is better, or nearly everything, and I say that as someone typing with his thumbs while getting eaten alive by mosquitoes in the back yard.

Gail Collins: Wow, the lack of power has really cheered you up. I remember just a few months ago, you were practically suicidal over the toxic politics in Washington.

Building Smarter Machines

Synthetic speech, autonomous robots, computers beating the best humans at chess and checkers. As computers grow ever smarter, a look at developments in the field of artificial intelligence.

Hints of Earth Splash a Saturnian Moon Landscape

However, if prolonged spells of 90-degree temperatures have you yearning for a refreshing icy dip, there are still plenty of bathing opportunities on Titan.

Of course the lakes there are made of liquid methane — and the 90 degrees of temperature are on the Kelvin scale, near enough to absolute zero to challenge even the most cosmically adept polar bear. The atmosphere is nitrogen and methane.

Four Ways to Kill a Climate Bill

But efforts to genetically engineer algae, which usually means to splice in genes from other organisms, worry some experts because algae play a vital role in the environment. The single-celled photosynthetic organisms produce much of the oxygen on earth and are the base of the marine food chain.

“We are not saying don’t do this,” said Gerald H. Groenewold, director of the University of North Dakota’s Energy and Environmental Research Center, who is trying to organize a study of the risks. “We say do this with the knowledge of the implications and how to safeguard what you are doing.”

The Limits of the Coded World

In one set of experiments, researchers attached sensors to the parts of monkeys’ brains responsible for visual pattern recognition. The monkeys were then taught to respond to a cue by choosing to look at one of two patterns. Computers reading the sensors were able to register the decision a fraction of a second before the monkeys’ eyes turned to the pattern. As the monkeys were not deliberating, but rather reacting to visual stimuli, researchers were able to plausibly claim that the computer could successfully predict the monkeys’ reaction. In other words, the computer was reading the monkeys’ minds and knew before they did what their decision would be.

On the Origin of Species (Annotated Text)

Darwin packed this paragraph with all of the elements of the process of natural selection. The phrasing reflects his incomparable knowledge of natural history and his revolutionary new view of nature:

“…variations useful in some way…” – the words of a lifelong collector who appreciated that individual members of a species exhibited variability.

“…the great and complex battle of life…” – unlike his predecessors who viewed nature as a peaceful, harmoniously designed landscape painting, Darwin had observed that nature was a battlefield in which there was tremendous waste and death.

“…thousands of generations?” – Darwin’s grasp of time was critical, his knowledge of geology made him confident that the planet and life were much older than people had once thought, such that there was plenty of time for the process of natural selection to play out.

—Sean B. Carroll, molecular biologist and geneticist; and an investigator of the Howard Hughes Medical Institute at the University of Wisconsin.

The Errors of Our Ways (Book Review)

Schulz begins with a question that should puzzle us more than it does: Why do we love being right? After all, she writes, “unlike many of life’s other delights — chocolate, surfing, kissing — it does not enjoy any mainline access to our biochemistry: to our appetites, our adrenal glands, our limbic systems, our swoony hearts.” Indeed, as she notes, “we can’t enjoy kissing just anyone, but we can relish being right about almost anything,” including that which we’d rather be wrong about, like “the downturn in the stock market, say, or the demise of a friend’s relationship or the fact that at our spouse’s insistence, we just spent 15 minutes schlepping our suitcase in exactly the opposite direction from our hotel.”

Take Ivy (Slideshow)

Time has done little to dim the allure of “Take Ivy,” with its guileless snapshots of handsome, fit and presumably bright young lugs disporting themselves in dining halls, on the College Green at Dartmouth, along Nassau Street in Princeton and in Harvard Yard. Credit: Teruyoshi H Girl

Pop’s Lady Gaga Makeover

Furthermore, the thing that most separates Lady Gaga from the bubblegum sirens of a decade ago is that her capacity for seduction has been neutered, recontextualized. Near the end of her recent Madison Square Garden show she emerged onstage with sparklerlike contraptions on her chest and crotch, spitting out tiny, angry, smoldering bits. “You tell them I burned the place!” she shouted. It was a straightforward repudiation of hypersexualized imagery. There was nowhere to touch without getting hurt.

Plus-Size Wars

Perhaps nowhere is the cultural confusion surrounding the larger woman more pronounced than in the clothing industry’s efforts to dress her. According to a 2008 survey conducted by Mintel, a market-research firm, the most frequently worn size in America is a 14. Government statistics show that 64 percent of American women are overweight (the average woman weighs 164.7 pounds). More than one-third are obese. Yet plus-size clothing (typically size 14 and above) represents only 18 percent of total revenue in the women’s clothing industry. The correlation between obesity and low income goes some way toward explaining the discrepancy — the recession was particularly hard on this segment of the market, with sales declining 10 percent between 2008 and 2009, a drop twice that of the women’s apparel industry over all — but it doesn’t explain it entirely. That figure has been fairly constant for the past 20 years.

Everybody’s a Critic of the Critics’ Rabid Critics

But then a second round of notices tarnished that luster. David Edelstein of New York magazine, Stephanie Zacharek of Movieline.com and Armond White, the reliably oppositional critic at The New York Press, published pans that ranged from frustrated to weary to vitriolic, decrying the rush to inscribe “Inception” in the pantheon of cinematic greatness. For their efforts these and other similarly unimpressed writers were treated like advocates for national health care at a Tea Party rally, their motives, their professionalism, their morals and their sanity questioned, and not always politely. What seemed to provoke the most ire was that these critics had shown the temerity to mention what other critics had written, and to respond to the aggressive marketing and the early effusions.

Facebook Is to Power Company as …

“I worry that we’ll end up with solutions that are familiar but not correct if we start from the wrong metaphor,” she said. “And I’m not sure there is a good metaphor for Facebook.”

Married, but Sleeping Alone

Technology is an even greater intrusion. Forget the tired debate about TV in the bedroom; how about your ex’s Twitter feed? Anyone who’s around teenage girls or techy men knows someone who checks e-mail, text messages or Facebook pages after turning out the light at night and before going to the bathroom in the morning. With all this commotion, it’s no wonder the bed has become such an unappealing place to sleep. Between whining kids, buzzing BlackBerrys, stacks of unpaid bills and overturned bottles of Evian and Ambien, the bedroom has become more crowded than the kitchen. If my house is any indication (“You get up early with the kids on Monday, I’ll move the car on Tuesday”), my bed needs its own Outlook calendar.