<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Stories from an Opentracker &#187; coding</title>
	<atom:link href="http://opentracker.blog.h3q.com/category/coding/feed/" rel="self" type="application/rss+xml" />
	<link>http://opentracker.blog.h3q.com</link>
	<description>running an open bittorrent tracker</description>
	<lastBuildDate>Fri, 04 Mar 2011 12:35:50 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.1.3</generator>
		<item>
		<title>The UDP situation</title>
		<link>http://opentracker.blog.h3q.com/2008/01/02/the-udp-situation/</link>
		<comments>http://opentracker.blog.h3q.com/2008/01/02/the-udp-situation/#comments</comments>
		<pubDate>Tue, 01 Jan 2008 23:18:09 +0000</pubDate>
		<dc:creator>erdgeist</dc:creator>
				<category><![CDATA[coding]]></category>
		<category><![CDATA[tech]]></category>

		<guid isPermaLink="false">http://opentracker.blog.h3q.com/?p=40</guid>
		<description><![CDATA[Most of the problems we have to cope with in our everyday opentracker struggle arise from operating systems having to handle several thousand tcp connection setups, reassemblies and teardowns. As a means to help ease the impact of these requests, a udp extension to the tracker protocol has been agreed upon. However that specification [...]]]></description>
			<content:encoded><![CDATA[<p>Most of the problems we have to cope with in our everyday opentracker struggle arise from operating systems having to handle several thousand tcp connection setups, reassemblies and teardowns. As a means to help ease the impact of these requests, a <a href="http://xbtt.sourceforge.net/udp_tracker_protocol.html">udp extension</a> to the tracker protocol has been agreed upon.</p>
<p>However, that specification page does a rather poor job – apart from the obvious html layout deficiencies – when it comes to explaining what design decisions were made and why. This has led many client writers out there to wonder what the connect packet is about.</p>
<p>While there are <a href="http://www.rasterbar.com/products/libtorrent/udp_tracker_protocol.html">sites around with a layout that at least does not try to prevent implementation</a>, a rationale for the extension is completely absent.</p>
<h3>1. The connect packet</h3>
<p>Now, what is the extra connect packet all about? Wasn&#8217;t the udp extension all about saving packets hammering the server? The answer is, it was designed to prevent address spoofing. While tcp inherently prevents address spoofing with its initial handshake – you can only finish it if the recipient of the SYN+ACK is the same address that sent the SYN – udp does not provide any means of ensuring that a packet really originated from the address it claims to come from.</p>
<p><i>So Olaf van der Spek – designer of that protocol extension – demands that a client first uses a connect packet to acquire a unique-ish <code>connection_id</code> that it later presents to the tracker to prove that it is listening on that very address.</i></p>
<p>How a tracker chooses, stores and compares <code>connection_ids</code> is up to the implementer. One possible way would be to choose one very long global tracker secret and then calculate a cryptographically secure checksum of this secret with the client&#8217;s alleged ip address appended. This approach does not require the tracker to store anything per client, because the value can be recalculated every time that client comes back.</p>
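<p>A minimal sketch of such a stateless scheme, in Python. It swaps the plain checksum described above for HMAC-SHA256 truncated to 64 bits; the names <code>TRACKER_SECRET</code>, <code>make_connection_id</code> and <code>check_connection_id</code> are made up for illustration and are not taken from opentracker.</p>
<pre>
```python
import hashlib
import hmac
import os
import struct

# Hypothetical long-lived tracker secret (assumption: generated at startup).
TRACKER_SECRET = os.urandom(32)

def make_connection_id(client_ip: str, secret: bytes = TRACKER_SECRET) -> int:
    """Derive a 64-bit connection_id from the secret and the client address."""
    digest = hmac.new(secret, client_ip.encode("ascii"), hashlib.sha256).digest()
    # The protocol field is 64 bits wide, so keep the first eight bytes only.
    return struct.unpack("!Q", digest[:8])[0]

def check_connection_id(client_ip: str, connection_id: int,
                        secret: bytes = TRACKER_SECRET) -> bool:
    """Recompute the id for the packet source address and compare."""
    expected = struct.pack("!Q", make_connection_id(client_ip, secret))
    return hmac.compare_digest(expected, struct.pack("!Q", connection_id))
```
</pre>
<p>Nothing is stored per client here: the tracker would recompute the value from the source address of every announce or scrape and compare it with the one presented.</p>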
<p>However, when writing the specification, Olaf made the mistake of not mentioning this feature and – even worse – using the term <code>connection_id</code> for two completely unrelated values: a connect packet contains a static protocol identifier at the same offset where announce and scrape packets are expected to carry that unique <code>connection_id</code>, and Olaf calls that static value <code>connection_id</code>, too. Hence half of the non-connect udp packets arriving at opentracker still contain 0x41727101980 as <code>connection_id</code>.</p>
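<p>To make the distinction concrete, here is a sketch of the client side in Python, assuming the connect layout from the specification linked above: the request carries the 64-bit protocol identifier, a 32-bit action of 0 and a 32-bit transaction id, and the response carries action, transaction id and the 64-bit id issued by the tracker. The function names are hypothetical.</p>
<pre>
```python
import os
import struct

PROTOCOL_ID = 0x41727101980  # static value, only valid in connect requests
ACTION_CONNECT = 0

def new_transaction_id() -> int:
    """Random 32-bit value used to match a response to its request."""
    return struct.unpack("!I", os.urandom(4))[0]

def build_connect_request(transaction_id: int) -> bytes:
    """16 bytes: protocol id, action, transaction_id, all big-endian."""
    return struct.pack("!QII", PROTOCOL_ID, ACTION_CONNECT, transaction_id)

def parse_connect_response(packet: bytes, transaction_id: int) -> int:
    """Return the tracker-issued connection_id or raise ValueError."""
    head = packet[:16]
    if len(head) != 16:
        raise ValueError("short connect response")
    action, tid, connection_id = struct.unpack("!IIQ", head)
    if action != ACTION_CONNECT or tid != transaction_id:
        raise ValueError("unexpected connect response")
    return connection_id
```
</pre>
<p>The value returned by <code>parse_connect_response</code>, not <code>PROTOCOL_ID</code>, is what belongs in the first eight bytes of every following announce and scrape packet.</p>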
<p>To be honest, opentracker is not completely innocent. Since we could freely choose which value to put there, we just left the old <code>connection_id</code> in place. This might have given the impression that the whole <code>connection_id</code> thing is just a sanity feature to distinguish stray udp packets arriving on that port.</p>
<p>Now we&#8217;re facing the problem that enforcing <code>connection_id</code> checking right away would really hurt udp announces. This is not in our interest. As a first step we changed our code to return a different value as <code>connection_id</code>. Now we want to encourage all client authors supporting udp to check their implementation against the way the protocol was meant to work. A few months from now opentracker will enforce <code>connection_id</code> checking.</p>
<h3>2. Auto UDP</h3>
<p>As you may have guessed by now, we love udp announces a lot more than the tcp stuff. So introducing ways to increase the number of those more lightweight client interactions is very much in our own interest.</p>
<p>We do know that it is hard to encourage users to add udp:// addresses to their torrents. Still, most tracker requests on earth are being handled by trackers that are capable of speaking udp (and soon also udp-v6). This means that sending udp connect packets to the corresponding udp port of an http tracker url will most likely succeed.</p>
<p><i>We hereby invite all client authors to probe the corresponding udp announce address, store the result on a per-tracker basis, use udp whenever feasible (i.e. for announces and [non-full-]scrapes) and revert to tcp only if udp fails.</i></p>
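<p>One way a client could organise this, sketched in Python. The helper names and the <code>probe</code> callback (which would send a udp connect packet and report whether a valid answer arrived) are assumptions for illustration; only the idea of reusing the host and port of the http url comes from the text above.</p>
<pre>
```python
from urllib.parse import urlsplit

# Per-tracker memory of the probe result: (host, port) -> bool
udp_capable = {}

def udp_candidate(tracker_url: str):
    """Map an http announce url to the (host, port) to probe over udp."""
    parts = urlsplit(tracker_url)
    if parts.scheme != "http" or parts.hostname is None:
        raise ValueError("not an http tracker url")
    # Same port number as the http tracker; 80 if the url names none.
    return parts.hostname, parts.port or 80

def choose_transport(tracker_url: str, probe) -> str:
    """Return "udp" or "tcp", probing each tracker at most once."""
    key = udp_candidate(tracker_url)
    if key not in udp_capable:
        udp_capable[key] = bool(probe(key))
    return "udp" if udp_capable[key] else "tcp"
```
</pre>
<p>A real client would probably also expire cached results after a while, so that a tracker that gains or loses udp support is noticed.</p>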
<h3>3. IPv6</h3>
<p>And while you are at it: you might also want to take a look at our <a href="http://opentracker.blog.h3q.com/?p=38">v6 protocol extension proposal</a>.</p>
<hr/>
<p>As always, feel free to comment on that topic at <a href="mailto:code@denis.stalker.h3q.com">our coders&#8217; mailing list</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://opentracker.blog.h3q.com/2008/01/02/the-udp-situation/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>The IPv6 situation</title>
		<link>http://opentracker.blog.h3q.com/2007/12/28/the-ipv6-situation/</link>
		<comments>http://opentracker.blog.h3q.com/2007/12/28/the-ipv6-situation/#comments</comments>
		<pubDate>Fri, 28 Dec 2007 21:48:09 +0000</pubDate>
		<dc:creator>erdgeist</dc:creator>
				<category><![CDATA[coding]]></category>
		<category><![CDATA[tech]]></category>

		<guid isPermaLink="false">http://opentracker.blog.h3q.com/?p=38</guid>
		<description><![CDATA[If you haven&#8217;t spent the last decade behind a C64 you may have noticed a HugeNewThing™ called IPv6 is coming. There still are some issues that prevent its widespread use. No consumer DSL has it. Most people wouldn&#8217;t even care what *IPv4* is. However, bittorrent has been designed to work with IPv6 out of [...]]]></description>
			<content:encoded><![CDATA[<p>If you haven&#8217;t spent the last decade behind a C64 you may have noticed a HugeNewThing™ called <a href="http://www.ipv6.org/">IPv6</a> is coming. There still are some issues that prevent its widespread use. No consumer DSL has it. Most people wouldn&#8217;t even care what <b>*IPv4*</b> is. However, bittorrent has been designed to work with IPv6 out of the box. Many convenience extensions in the Bittorrent-Tracker protocol – compact tracker responses, for example – do not. Worse: others like the <a href="http://xbtt.sourceforge.net/udp_tracker_protocol.html">UDP-Tracker protocol</a> do not appear to have taken IPv6 into account at all.</p>
<p>Still, we consider IPv6 very important and convenient. Its use could solve all NAT-dependent connectivity issues and heavily reduce complexity in clients. On the other hand we noticed that introducing IPv6 in a transitional way creates many headaches in inter-protocol communication. To remove some barriers we propose the following request for comments which will be incorporated into opentracker and thus available at tpb soon.</p>
<h3>1. The clouds</h3>
<p>To be frank: there will be two clouds for each torrent, one in IPv4 and one in IPv6. Clients capable of handling IPv6 connections and still interested in IPv4 peers will need to announce their torrents to trackers twice. This is a harsh decision, but in the end most IPv4 peers are not interested in IPv6 peers and there is no easy way to tell whom to return which peers.</p>
<p><i>The proposed way to identify a peer&#8217;s internet protocol version defaults to the incoming socket address and can be overridden by the ip parameter from the announce http request.</i></p>
<p>This is very much how Bram Cohen laid it out in his Tracker-Protocol. Of course, overriding only works for trackers that allow overriding ip addresses with the one passed in the query. All others have no means of obtaining the ip address for the other protocol version.</p>
<p><i>Scrapes will be answered for each cloud independently according to the protocol version of the requester&#8217;s ip address.</i></p>
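<p>A small Python sketch of how a tracker might apply these two rules. All names are hypothetical, and falling back to the socket address when the <code>ip</code> parameter is malformed is an assumption of this sketch, not part of the proposal.</p>
<pre>
```python
import ipaddress

def resolve_peer(socket_ip, ip_param=None, allow_override=True):
    """Return (version, address) for an announce: 4 or 6 plus the address."""
    if allow_override and ip_param:
        try:
            addr = ipaddress.ip_address(ip_param)
            return addr.version, str(addr)
        except ValueError:
            pass  # malformed override: fall back to the socket address
    addr = ipaddress.ip_address(socket_ip)
    return addr.version, str(addr)

# One peer set per protocol version, shown here for a single torrent.
clouds = {4: set(), 6: set()}

def announce(socket_ip, port, ip_param=None):
    """File the peer into the cloud matching its protocol version."""
    version, address = resolve_peer(socket_ip, ip_param)
    clouds[version].add((address, port))
    return version

def scrape(requester_ip):
    """Answer with the peer count of the requester's own cloud only."""
    version, _ = resolve_peer(requester_ip)
    return len(clouds[version])
```
</pre>
<p>A dual-stack client that wants peers from both clouds would announce twice, once per protocol version, as described above.</p>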
<h3>2. Tracker TCP Responses</h3>
<p>Most trackers (e.g. opentracker) have completely switched to only returning compact responses in order to save, or better: not to waste, bandwidth. That compact format is just a string whose length is a multiple of six bytes; each six-byte entry contains the ip address and port of one peer, for as many peers as fit. Filling it with or mixing in IPv6 addresses is impossible without breaking backward compatibility.</p>
<p><i>So we propose adding another &#8220;peers6&#8221; entry to the announce response dictionary that contains a string of zero or more entries, each consisting of a 16-byte IPv6 address in network byte order followed by a 2-byte port number in network byte order. The &#8220;peers6&#8221; entry must not be present in non-compact mode.</i></p>
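<p>For client authors, decoding both strings could look roughly like the following Python sketch. It assumes the announce response has already been bdecoded into a dictionary with byte-string keys; the function names are made up for this example.</p>
<pre>
```python
import ipaddress
import struct

def decode_compact(blob: bytes, addr_len: int):
    """Split a compact peer string into (address, port) tuples."""
    entry = addr_len + 2  # address bytes plus a two-byte port
    if len(blob) % entry != 0:
        raise ValueError("length is not a multiple of the entry size")
    peers = []
    for off in range(0, len(blob), entry):
        addr = ipaddress.ip_address(blob[off:off + addr_len])
        (port,) = struct.unpack("!H", blob[off + addr_len:off + entry])
        peers.append((str(addr), port))
    return peers

def decode_announce(response: dict):
    """Collect peers from the compact "peers" and proposed "peers6" keys."""
    return (decode_compact(response.get(b"peers", b""), 4)
            + decode_compact(response.get(b"peers6", b""), 16))
```
</pre>
<p>Old clients simply ignore the unknown &#8220;peers6&#8221; key, which is what keeps the extension backward compatible.</p>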
<h3>3. Tracker UDP Responses</h3>
<p>The UDP-protocol has not left any space for IPv6-addresses in its current form. So we need to define a new action together with its input and output block formats.</p>
<p><i>Proposed extension to the <a href="http://xbtt.sourceforge.net/udp_tracker_protocol.html">UDP Tracker format</a> to work over IPv6; we claim the value &#8220;4&#8221; for the action identifier:</p>
<p><b>Request:</b></p>
<table border="1">
<tr>
<td>Offset</td>
<td>Size</td>
<td>Name</td>
<td>Value</td>
</tr>
<tr>
<td>0</td>
<td>64-bit integer</td>
<td>connection_id</td>
<td></td>
</tr>
<tr>
<td>8</td>
<td>32-bit integer</td>
<td>action</td>
<td>4</td>
</tr>
<tr>
<td>12</td>
<td>32-bit integer</td>
<td>transaction_id</td>
<td></td>
</tr>
<tr>
<td>16</td>
<td>20-byte string</td>
<td>info_hash</td>
<td></td>
</tr>
<tr>
<td>36</td>
<td>20-byte string</td>
<td>peer_id</td>
<td></td>
</tr>
<tr>
<td>56</td>
<td>64-bit integer</td>
<td>downloaded</td>
<td></td>
</tr>
<tr>
<td>64</td>
<td>64-bit integer</td>
<td>left</td>
<td></td>
</tr>
<tr>
<td>72</td>
<td>64-bit integer</td>
<td>uploaded</td>
<td></td>
</tr>
<tr>
<td>80</td>
<td>32-bit integer</td>
<td>event</td>
<td></td>
</tr>
<tr>
<td>84</td>
<td>16-byte string</td>
<td>IP address</td>
<td></td>
</tr>
<tr>
<td>100</td>
<td>32-bit integer</td>
<td>key</td>
<td></td>
</tr>
<tr>
<td>104</td>
<td>32-bit integer</td>
<td>num_want</td>
<td>-1</td>
</tr>
<tr>
<td>108</td>
<td>16-bit integer</td>
<td>port</td>
<td></td>
</tr>
</table>
<p>The corresponding reply looks as follows:</p>
<p><b>Response:</b></p>
<table border="1">
<tr>
<td>Offset</td>
<td>Size</td>
<td>Name</td>
<td>Value</td>
</tr>
<tr>
<td>0</td>
<td>32-bit integer</td>
<td>action</td>
<td>4</td>
</tr>
<tr>
<td>4</td>
<td>32-bit integer</td>
<td>transaction_id</td>
<td></td>
</tr>
<tr>
<td>8</td>
<td>32-bit integer</td>
<td>interval</td>
<td></td>
</tr>
<tr>
<td>12</td>
<td>32-bit integer</td>
<td>leechers</td>
<td></td>
</tr>
<tr>
<td>16</td>
<td>32-bit integer</td>
<td>seeders</td>
<td></td>
</tr>
<tr>
<td>20+n*18</td>
<td>16-byte string</td>
<td>IPv6 address</td>
<td></td>
</tr>
<tr>
<td>36+n*18</td>
<td>16-bit integer</td>
<td>TCP port</td>
<td></td>
</tr>
</table>
<p></i></p>
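<p>The two tables translate fairly directly into fixed-size records. The Python sketch below packs a request and unpacks a response following the offsets given above; it assumes big-endian (network) byte order for all integers and a signed <code>num_want</code> so that -1 fits. The function names are hypothetical.</p>
<pre>
```python
import struct

ACTION_ANNOUNCE6 = 4  # action value claimed by the proposal

# connection_id, action, transaction_id, info_hash, peer_id, downloaded,
# left, uploaded, event, IP address, key, num_want, port
REQUEST6 = struct.Struct("!QII20s20sQQQI16sIiH")

# action, transaction_id, interval, leechers, seeders
RESPONSE6_HEAD = struct.Struct("!IIIII")

def pack_announce6(connection_id, transaction_id, info_hash, peer_id,
                   downloaded, left, uploaded, event, ip6, key, port,
                   num_want=-1):
    """Build the 110-byte request from the Request table."""
    return REQUEST6.pack(connection_id, ACTION_ANNOUNCE6, transaction_id,
                         info_hash, peer_id, downloaded, left, uploaded,
                         event, ip6, key, num_want, port)

def parse_announce6(packet):
    """Split a response into its header fields and 18-byte peer entries."""
    action, tid, interval, leechers, seeders = RESPONSE6_HEAD.unpack_from(packet)
    body = packet[RESPONSE6_HEAD.size:]
    # Each entry: 16 address bytes followed by a 2-byte port.
    peers = [(body[i:i + 16], struct.unpack("!H", body[i + 16:i + 18])[0])
             for i in range(0, len(body) - 17, 18)]
    return action, tid, interval, leechers, seeders, peers
```
</pre>
<p>The format string adds up to 110 bytes, which matches the last offset in the request table (108) plus the two-byte port.</p>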
<hr />
<p>We consider those the most sane design decisions that do not deviate from the existing protocol too much and hope to receive feedback on our proposal, from closed source implementations as well as the usual open source suspects.</p>
]]></content:encoded>
			<wfw:commentRss>http://opentracker.blog.h3q.com/2007/12/28/the-ipv6-situation/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
		<item>
		<title>Review-A-Tracker-Team</title>
		<link>http://opentracker.blog.h3q.com/2007/12/11/review-a-tracker-team/</link>
		<comments>http://opentracker.blog.h3q.com/2007/12/11/review-a-tracker-team/#comments</comments>
		<pubDate>Tue, 11 Dec 2007 00:49:07 +0000</pubDate>
		<dc:creator>erdgeist</dc:creator>
				<category><![CDATA[coding]]></category>
		<category><![CDATA[tech]]></category>

		<guid isPermaLink="false">http://opentracker.blog.h3q.com/?p=37</guid>
		<description><![CDATA[Six days ago – on December 5th – opentracker celebrated its first birthday, just two days before officially becoming the tracker handling the most tracker requests on earth. During that year opentracker already handled several billion http connections in what can only be described as the hardest fuzzing attack ever mounted against a piece of software (: However, [...]]]></description>
			<content:encoded><![CDATA[<p>Six days ago – on December 5th – opentracker <a href="https://erdgeist.org/cvsweb/opentracker/opentracker.c?rev=1.1;content-type=text%2Fx-cvsweb-markup">celebrated its first birthday</a>, just two days before officially becoming the tracker handling the most tracker requests on earth. During that year opentracker already handled several billion http connections in what can only be described as the hardest fuzzing attack ever mounted against a piece of software (:</p>
<p>However, becoming so huge and being so exposed brings its risks. opentracker is written in plain C and handles strings. Strings that have been sent through the internet. So we kindly ask the community to help us make the code more secure by reviewing critical parts of it. Especially the <a href="https://erdgeist.org/cvsweb/opentracker/scan_urlencoded_query.c?rev=HEAD;content-type=text%2Fx-cvsweb-markup">part that parses URIs</a> is a natural point to start looking into.</p>
<p>So if you are experienced in C, serious about helping to review or just need some explanation of opentracker&#8217;s rougher edges, feel free to contact us at <a href="mailto:abuse@denis.stalker.h3q.com">code@denis.stalker.h3q.com</a>. We also appreciate patches that fix bugs and warnings on operating systems we have not tested opentracker on, as well as source packages for package distribution systems.</p>
]]></content:encoded>
			<wfw:commentRss>http://opentracker.blog.h3q.com/2007/12/11/review-a-tracker-team/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>ThePirateBay is testing opentracker!</title>
		<link>http://opentracker.blog.h3q.com/2007/11/22/thepiratebay-is-testing-opentracker/</link>
		<comments>http://opentracker.blog.h3q.com/2007/11/22/thepiratebay-is-testing-opentracker/#comments</comments>
		<pubDate>Wed, 21 Nov 2007 23:40:52 +0000</pubDate>
		<dc:creator>taklamakan</dc:creator>
				<category><![CDATA[coding]]></category>
		<category><![CDATA[history]]></category>
		<category><![CDATA[tech]]></category>

		<guid isPermaLink="false">http://opentracker.blog.h3q.com/?p=33</guid>
		<description><![CDATA[The famous Swedish Bittorrent-Site is currently working with us on testing our opentracker software for ThePirateBay! We are in the stage of tuning and adding features. Our software is already running on four of the piratebay-trackers (vip.tracker.thepiratebay.org, tpb.tracker.thepiratebay.org, tv.tracker.thepiratebay.org, a.tracker.thepiratebay.org) distributed with round-robin DNS over seven servers. Together they are handling nearly 17,000 hits/sec, serve [...]]]></description>
			<content:encoded><![CDATA[<p>The famous Swedish Bittorrent-Site is currently working with us on testing our opentracker software for ThePirateBay!<br />
We are in the stage of tuning and adding features. Our software is already running on four of the piratebay-trackers (vip.tracker.thepiratebay.org, tpb.tracker.thepiratebay.org, tv.tracker.thepiratebay.org, a.tracker.thepiratebay.org) distributed with round-robin DNS over seven servers. Together they are handling nearly 17,000 hits/sec, serve 700 thousand torrents and have about 5 million peers.</p>
<p>ThePirateBay plans to switch the last trackers running hypercube to our opentracker software as soon as we have all features implemented. This should be real soon now.</p>
]]></content:encoded>
			<wfw:commentRss>http://opentracker.blog.h3q.com/2007/11/22/thepiratebay-is-testing-opentracker/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Rambo</title>
		<link>http://opentracker.blog.h3q.com/2007/01/21/rambo/</link>
		<comments>http://opentracker.blog.h3q.com/2007/01/21/rambo/#comments</comments>
		<pubDate>Sun, 21 Jan 2007 17:36:21 +0000</pubDate>
		<dc:creator>erdgeist</dc:creator>
				<category><![CDATA[coding]]></category>

		<guid isPermaLink="false">http://opentracker.blog.h3q.com/?p=7</guid>
		<description><![CDATA[Surely you have seen Weird Al Yankovic making fun of Rambo movies, where dozens of villains were standing on the top of a hill, firing round after round and still being shot – one after another – by our dear Al. You might argue that a bittorrent tracker does not actually seek to kill its clients, [...]]]></description>
			<content:encoded><![CDATA[<p>Surely you have seen Weird Al Yankovic making fun of Rambo movies, where dozens of villains were standing on the top of a hill, firing round after round and still being shot – one after another – by our dear Al.</p>
<p>You might argue that a bittorrent tracker does not actually seek to kill its clients, yet the amount of fire it attracts is comparable and I always wanted to refer to <a href="http://imdb.com/title/tt0098546/">&#8220;UHF&#8221;</a> in a blog posting.</p>
<p>When you have to handle that many requests per second (I&#8217;m told we&#8217;re approaching 350 at the moment), you basically have two choices: try to figure out a complex multithreaded connection handling framework that dispatches to your complex database layout, or go for simplicity and try to handle requests sequentially and be fast to serve them.</p>
<p>Before I can go into that I need to explain what a bittorrent tracker does. In Germany there&#8217;s an institution called Mitfahrzentrale. Basically you tell them that you&#8217;re driving from one city to another on a certain date, or that you would want to but lack the car. Either way, Mitfahrzentrale tells you who else wants to get to the same place at the same time and provides you with contact information.</p>
<p>With bittorrent you acquire all the information you need to share a file from the .torrent file you get from bittorrent search engines. This file lists all trackers that will play Mitfahrzentrale for you plus a token to identify your ride there: a twenty-byte (160-bit) info_hash, calculated over the info dictionary of the .torrent file.</p>
<p>All the contact information you need to connect to another peer wanting to share the same file is its IP address and the port number its client listens on. In IPv4 that&#8217;s six bytes. If you are familiar with organizing data you might have noticed by now that you just need structures holding twenty bytes per torrent plus a pointer to a list of six-byte entries, one for each peer.</p>
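<p>To illustrate the shape of that data (not opentracker&#8217;s actual C structures), here is a toy tracker core in Python. A dictionary of sets is far less memory-efficient than what is described here; the names <code>torrents</code>, <code>pack_peer</code> and <code>announce</code> are invented for the sketch.</p>
<pre>
```python
import ipaddress
import struct
from collections import defaultdict

# info_hash (20 bytes) -> set of six-byte peer entries
torrents = defaultdict(set)

def pack_peer(ip: str, port: int) -> bytes:
    """Four address bytes plus two port bytes, network byte order."""
    return ipaddress.IPv4Address(ip).packed + struct.pack("!H", port)

def announce(info_hash: bytes, ip: str, port: int, num_want: int = 50) -> bytes:
    """Register the peer and return up to num_want other peers, concatenated."""
    me = pack_peer(ip, port)
    others = [p for p in torrents[info_hash] if p != me]
    torrents[info_hash].add(me)
    return b"".join(others[:num_want])
```
</pre>
<p>The returned byte string already has the layout of a compact tracker response: a multiple of six bytes, one entry per peer.</p>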
<p>Before I can describe how our first important design decision evolved, you need to know why others chose different designs. Bram Cohen decided to tunnel all tracker requests through HTTP (over TCP, of course), which has two drawbacks. First: most requests and their answers comfortably fit into one ethernet frame each. Having to set up a TCP connection for each request requires nine packets on average instead of the two you would need with UDP, which would even leave us with enough bandwidth for eight retries before the more reliable TCP protocol could perform better. Second: HTTP itself is an awfully complex protocol to implement. If you&#8217;re just supposed to give a simple answer to a simple question, I would argue its use is inappropriate. Considering that most firewalls let HTTP connections pass unhindered, using HTTP could be excused – if the actual data connections between peers weren&#8217;t just some binary TCP streams doomed to be filtered.</p>
<p>Now, when you&#8217;re using some httpd to handle the protocol for you anyway – why bother doing the rest as a C or C++ CGI? A typical coder&#8217;s reflex is to write a php script using mysql as data backend. You learned in your computer science course that using a database to store your data is the way to go. Right? The framework will do all the threading and forking for you, and your database ensures appropriate locking.</p>
<p>Well, we thought different.</p>
<p>Using a kick-ass all-features-on database? Constructing complex SQL queries to select six-byte structures from lists accessed by twenty-byte indexes that already come pre-hashed? Having a data/metadata ratio of 1:10000? Doing local database connections that involve setting up unix sockets, only to blow up a twenty-byte request and its 50*6-byte answer with several kbytes of overhead? No way!</p>
<p>All the techniques needed to efficiently store that little data (we&#8217;re talking about eight MBs for 100,000 info_hashes and six MBs for 1,000,000 peers here) have been around for forty years. As already mentioned, you can use the info_hash&#8217;s first bytes to address buckets, since they are already hashed. So you can easily keep sorted lists you can do binary searches in. With some minor abstractions you can sort your peers into pools you can choose to release after bittorrent&#8217;s peer timeout.</p>
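<p>The bucket idea can be sketched in a few lines of Python. This is only an illustration of the technique: the bucket count of 256 (one per value of the first byte) and all names are assumptions, and opentracker itself is written in C. The peer figure is plain arithmetic: six bytes times 1,000,000 peers is the six MBs quoted above.</p>
<pre>
```python
import bisect

BUCKET_COUNT = 256  # assumption: one bucket per value of the first byte

# Each bucket is a sorted list of info_hashes.
buckets = [[] for _ in range(BUCKET_COUNT)]

def bucket_for(info_hash: bytes) -> list:
    # info_hashes are already uniformly distributed, so the first byte
    # can address the bucket without any further hashing.
    return buckets[info_hash[0]]

def add_torrent(info_hash: bytes) -> bool:
    """Insert while keeping the bucket sorted; False if already present."""
    bucket = bucket_for(info_hash)
    pos = bisect.bisect_left(bucket, info_hash)
    if pos != len(bucket) and bucket[pos] == info_hash:
        return False
    bucket.insert(pos, info_hash)
    return True

def has_torrent(info_hash: bytes) -> bool:
    """Binary search within the bucket selected by the first byte."""
    bucket = bucket_for(info_hash)
    pos = bisect.bisect_left(bucket, info_hash)
    return pos != len(bucket) and bucket[pos] == info_hash
```
</pre>
<p>With 100,000 torrents spread over 256 buckets, each lookup is a binary search over a few hundred entries.</p>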
<p>All that was left to do was to find a framework to handle HTTP. Fortunately I&#8217;ve been following <a href="http://www.fefe.de/libowfat/">libowfat</a>&#8217;s development for a while. It is a highly scalable multi-platform framework modelled after <a href="http://cr.yp.to/software.html">Prof. Daniel J. Bernstein&#8217;s</a> fascinating low level abstraction library, but exploiting most IO acceleration techniques like kqueue and poll. Felix von Leitner added an example httpd application that became the basis for what opentracker is today. What we do is accept, answer and close all incoming connections as fast as possible, using static buffers. This approach easily scales up to several thousand requests per second. The only reason for not having exact numbers is us being too lazy to implement stress testing software in C and all other test suites being too slow to request more than a few hundred connections per second.</p>
<p>You can follow <a href="http://erdgeist.org/arts/software/opentracker/">opentracker</a>&#8217;s development <a href="https://erdgeist.org/cvsweb/opentracker/">here</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://opentracker.blog.h3q.com/2007/01/21/rambo/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>


