Syndication Bandwidth Consumption

This morning Scoble had an interesting post on the bandwidth problems RSS is causing Microsoft. I commented there a few times, but I thought I’d post my thoughts here.

Basically, I don’t think this is a problem with the general syndication model. The problem here is due to the nature of the feed in question.

One of the other commenters over at Scoble’s site quoted Mark Pilgrim:

... [C]onsolidating the full content of 1000 regularly updated sites into a single file of any format is going to consume a lot of bandwidth.

Which is exactly what the problematic feed is doing, and there’s no simple way to get around this. You have lots of new items every hour. If you set a high TTL or Expires, customers will miss items. Conditional GET won’t help because that feed will almost always have new items. Gzip compression can cut your bandwidth bill, but it’s already in place here and it’s still a problem.
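To make that reasoning concrete, here is a back-of-the-envelope sketch. The function name, its parameters, and the 300-byte cost of a 304 response are all my own assumptions for illustration, not measurements from the feed in question.

```python
def daily_bytes(subscribers, polls_per_day, feed_bytes, change_rate,
                gzip_ratio=1.0, not_modified_bytes=300):
    """Rough daily transfer for one feed (hypothetical model).

    change_rate: fraction of polls that find the feed changed (0..1).
    gzip_ratio: compressed size divided by raw size (1.0 means no gzip).
    not_modified_bytes: assumed cost of a bodiless 304 response.
    """
    polls = subscribers * polls_per_day
    # Polls that see a changed feed pay for the whole (compressed) body.
    full = polls * change_rate * feed_bytes * gzip_ratio
    # Polls that see no change only pay for the small 304 response.
    unchanged = polls * (1 - change_rate) * not_modified_bytes
    return full + unchanged


# A feed that has new items on every poll: conditional GET saves nothing.
busy = daily_bytes(1000, 24, 100_000, change_rate=1.0)
# A feed that changes on roughly 1 poll in 20: almost every poll is a 304.
quiet = daily_bytes(1000, 24, 100_000, change_rate=0.05)
```

With a change rate of 1.0 the second term vanishes, so the only lever left is the gzip ratio, which is the situation described above.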

For feeds with more normal characteristics, conditional GET and gzip encoding reduce your bandwidth bills significantly. For a good analysis, see Matt’s excellent post. But when it comes down to it, a huge, item-heavy site will get huge bills; there’s no way around it.
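For anyone who hasn’t seen the two mechanisms side by side, here is a minimal server-side sketch. The `build_response` helper is hypothetical, it assumes header names have already been lowercased, and it compares `If-Modified-Since` by exact string match, which is a simplification of what a real server should do.

```python
import gzip
import hashlib


def build_response(request_headers, feed_body, last_modified):
    """Return (status, headers, body) for a feed request.

    request_headers: dict with lowercase header names (an assumption).
    feed_body: the feed as bytes.
    last_modified: HTTP date string for the feed's last change.
    """
    # Derive a validator from the content itself.
    etag = '"' + hashlib.sha1(feed_body).hexdigest() + '"'
    headers = {"ETag": etag, "Last-Modified": last_modified}

    # Conditional GET: the client already has this version, send no body.
    if (request_headers.get("if-none-match") == etag
            or request_headers.get("if-modified-since") == last_modified):
        return 304, headers, b""

    # gzip: compress only when the client says it can decode it.
    if "gzip" in request_headers.get("accept-encoding", ""):
        headers["Content-Encoding"] = "gzip"
        return 200, headers, gzip.compress(feed_body)

    return 200, headers, feed_body
```

A well-behaved aggregator stores the `ETag` and `Last-Modified` values from its last fetch and sends them back on the next poll, so an unchanged feed costs a few hundred bytes instead of the full document.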

An option that busier sites may want to consider is a more reasonable version of Slashdot’s draconian measures. Basically, if an IP / User-Agent combination is repeatedly requesting your feed, and that combination doesn’t use conditional GET and gzip, return a series of bogus items that instruct the user to get a better aggregator.

Slashdot’s system hurts innocent users thanks to the fact that a large percentage of the internet is behind transparent proxies. Combining the IP with the User Agent string reduces the chances of false positives. Requiring conditional GET and gzip support means that two or more users with the same aggregator will only be banned if their aggregator doesn’t play nice, and hopefully this will encourage the authors to fix their software.
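Here is a rough sketch of how that policy could be tracked. I haven’t deployed this; the `FeedPolicy` class, its threshold, and the lowercase header names are all assumptions for illustration, and a real version would also want a time window so old counts expire.

```python
from collections import defaultdict


class FeedPolicy:
    """Count badly behaved requests per (IP, User-Agent) pair (hypothetical)."""

    def __init__(self, max_bad_requests=10):
        self.max_bad_requests = max_bad_requests
        self.bad_counts = defaultdict(int)

    def should_serve_warning(self, ip, user_agent, headers):
        """Return True when this client should get the bogus warning items."""
        key = (ip, user_agent)
        conditional = ("if-none-match" in headers
                       or "if-modified-since" in headers)
        gzip_ok = "gzip" in headers.get("accept-encoding", "")

        if conditional and gzip_ok:
            # The aggregator plays nice, so forget any earlier strikes.
            self.bad_counts.pop(key, None)
            return False

        # A client's very first fetch has nothing to be conditional about,
        # so one strike is harmless; only repeated ones cross the threshold.
        self.bad_counts[key] += 1
        return self.bad_counts[key] > self.max_bad_requests
```

Because the key includes the User-Agent, two people behind the same proxy only share a strike count if they run the same aggregator, and in that case they are only flagged if that aggregator itself misbehaves.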

Of course, deliberately evil consumers could just generate random User-Agents, but if they’re that determined to DoS your system, something more conventional would probably be easier. Thankfully my feed doesn’t get requested enough to cause me any bandwidth problems, but I think this method is what I’d try if things got out of hand.

Posted on September 11th, 2004
