<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>..spOOx?!</title>
	<atom:link href="http://spoox.org/wp/feed/" rel="self" type="application/rss+xml" />
	<link>http://spoox.org/wp</link>
	<description>Random thoughts and nonsense</description>
	<lastBuildDate>Tue, 08 Nov 2011 19:09:51 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>TNT and their annoying tracking form</title>
		<link>http://spoox.org/wp/2011/11/08/tnt-and-their-annoying-tracking-form/</link>
		<comments>http://spoox.org/wp/2011/11/08/tnt-and-their-annoying-tracking-form/#comments</comments>
		<pubDate>Tue, 08 Nov 2011 19:07:50 +0000</pubDate>
		<dc:creator>Rune</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://spoox.org/wp/?p=114</guid>
		<description><![CDATA[So, I&#8217;m waiting for a package that was shipped via TNT. Fine, I get a tracking number and go to TNT&#8217;s web site and do a &#8220;Track package&#8221; search, and get the tracking details. Fine&#8230; Well &#8212; almost&#8230;.. Argh. I look &#8230; <a href="http://spoox.org/wp/2011/11/08/tnt-and-their-annoying-tracking-form/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>So, I&#8217;m waiting for a package that was shipped via TNT. Fine, I get a tracking number and go to <a href="http://www.tnt.com">TNT&#8217;s web site</a> and do a &#8220;Track package&#8221; search, and get the tracking details. Fine&#8230;</p>
<p>Well &#8212; almost&#8230;.. Argh.<span id="more-114"></span></p>
<p>I look at the URL, and it&#8217;s <a href="http://www.tnt.com/webtracker/tracking.do">http://www.tnt.com/webtracker/tracking.do</a>.</p>
<p>No sign of my tracking number, as obviously they&#8217;re submitting the form using METHOD=POST. I.e.: instead of encoding the request in the query string (../tracking.do?id=123456&amp;..), they are submitting the query encoded as POST data.</p>
<p>This means you can&#8217;t bookmark the page, and you must either keep it open and refresh it (answering: &#8220;yes, I want to re-submit the data&#8221;), or each time go to the search form and enter the tracking number again.</p>
<p>If you bookmark the page, then opening the bookmark will give you <a href="http://www.tnt.com/webtracker/tracking.do">this page</a>:</p>
<blockquote>
<pre>Tracker

Sorry we are unable to fulfil your request
Please re-submit your enquiry.
Intellectual and other property rights to the
information contained in this site are held by
TNT Holding B.V. with all rights reserved © 2008</pre>
</blockquote>
<p>Well, thank you very much.</p>
<p>Luckily, it turns out that if you dig around the HTML for the search page and find the form variables (or use an HTTP sniffing tool like <a title="Fiddler2" href="http://www.fiddler2.com">Fiddler2</a>), their backend also supports submitting the form using a query string (METHOD=GET):</p>
<blockquote><p><em>http://www.tnt.com/webtracker/tracking.do?<br />
respCountry=us&amp;respLang=en&amp;navigation=1&amp;page=1<br />
&amp;sourceID=1&amp;sourceCountry=ww&amp;plazaKey=&amp;refs=<br />
&amp;requesttype=GEN&amp;searchType=CON<br />
&amp;cons=</em><strong>your-tracking-number-goes-here</strong></p></blockquote>
<p>Yay, bookmarked!</p>
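<p>If you would rather generate the link than paste it together by hand, here is a minimal C sketch. The base URL and fixed parameters are copied from the URL above; the helper name <tt>build_tracking_url</tt> is made up for illustration, and it assumes the tracking number is plain alphanumeric, so no URL-encoding is done.</p>

```c
#include <stdio.h>
#include <string.h>

/* Base URL and fixed query parameters, copied from the post above. */
static const char TNT_TRACK_URL[] =
    "http://www.tnt.com/webtracker/tracking.do"
    "?respCountry=us&respLang=en&navigation=1&page=1"
    "&sourceID=1&sourceCountry=ww&plazaKey=&refs="
    "&requesttype=GEN&searchType=CON&cons=";

/* Hypothetical helper: writes the bookmarkable URL into out.
   Assumes tracking_number needs no URL-encoding.
   Returns 0 on success, -1 if the buffer is too small. */
int build_tracking_url(char *out, size_t outlen, const char *tracking_number)
{
    int n = snprintf(out, outlen, "%s%s", TNT_TRACK_URL, tracking_number);
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

<p>Whether TNT keeps accepting these parameters is of course entirely up to them.</p>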
<p><strong><em>Now .. why is this hidden away behind METHOD=POST?</em></strong></p>
<p>Sure, it could be out of &#8220;security&#8221; &#8211; so that your tracking number would not be kept in the browser&#8217;s history for others to have a peek at, and possibly to avoid someone intercepting your package.</p>
<p>Well, fine! But then &#8211; why not offer a link from the results page, for those of us who hopefully know what we are doing?</p>
<p>And seriously:</p>
<ul>
<li>Most modern browsers support &#8220;privacy mode&#8221;, if you&#8217;re doing sensitive stuff. (And pretty much all browsers have a &#8220;clear history&#8221; function hidden away somewhere)</li>
<li>If knowing the tracking number is all it takes to hijack and intercept my package, then what kind of security is that..?</li>
</ul>
<div>Oh well.. First world problems ;)</div>
]]></content:encoded>
			<wfw:commentRss>http://spoox.org/wp/2011/11/08/tnt-and-their-annoying-tracking-form/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Lessons learned from Apache mod_chroot vs. PHP&#8217;s mail()</title>
		<link>http://spoox.org/wp/2011/01/04/lessons-from-apache-mod_chroot-vs-phps-mail/</link>
		<comments>http://spoox.org/wp/2011/01/04/lessons-from-apache-mod_chroot-vs-phps-mail/#comments</comments>
		<pubDate>Tue, 04 Jan 2011 21:00:21 +0000</pubDate>
		<dc:creator>Rune</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://spoox.org/wp/?p=76</guid>
		<description><![CDATA[Some lessons learned from setting up Apache with mod_chroot and then trying to get PHP&#8217;s mail() function to work. Sure, there are several mod_chroot caveats listed, but that wasn&#8217;t quite enough. Short story: Problems: mail() doesn&#8217;t work since it won&#8217;t &#8230; <a href="http://spoox.org/wp/2011/01/04/lessons-from-apache-mod_chroot-vs-phps-mail/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>Some lessons learned from setting up Apache with <a href="http://core.segfault.pl/~hobbit/mod_chroot/">mod_chroot</a> and then trying to get PHP&#8217;s <a href="http://php.net/manual/en/function.mail.php">mail()</a> function to work. Sure, there are several mod_chroot <a href="http://core.segfault.pl/~hobbit/mod_chroot/caveats.html">caveats</a> listed, but that wasn&#8217;t quite enough.</p>
<p><span id="more-76"></span></p>
<p>Short story:</p>
<p>Problems:</p>
<ul>
<li> mail() doesn&#8217;t work, since it won&#8217;t have access to sendmail or whichever MTA binaries you use; these are most likely located outside your chroot jail</li>
<li> You&#8217;ll either need to use something other than mail() &#8211; e.g. a replacement library/class &#8211; or make some workaround to get mail() working</li>
<li> Most likely a replacement is not what you want, since it means having to patch e.g. WordPress, Mediawiki and all sorts of existing software that already uses mail() (WordPress has a plugin to replace the call to mail() but I couldn&#8217;t get it working either)</li>
<li> Even after trying a sendmail binary replacement (I went with <a href="http://www.acme.com/software/mini_sendmail/">mini_sendmail</a>), something was just not working (I also tried <tt>ssmtp</tt> and <tt>nbsmtp</tt>, with no luck. In hindsight: probably related to problem #2 listed below)</li>
<li> Yes, that included adding various libraries and stuff to the chrooted area. (hint: <a href="http://linux.about.com/library/cmd/blcmdl1_ldd.htm">ldd</a>. <a href="http://www.cyberciti.biz/files/lighttpd/l2chroot.txt">l2chroot</a> is also nice)</li>
</ul>
<p><strong>Problem 1: mini_sendmail failing to detect the current username</strong></p>
<p>Tested using chroot from the shell first:</p>
<blockquote><pre># chroot /var/www /bin/mini_sendmail
/bin/mini_sendmail: can't determine username</pre>
</blockquote>
<p>So what&#8217;s that about?</p>
<p>Well, there&#8217;s a segment of code that tries to figure out the login/username using getlogin() and it even has some #ifdef&#8217;s to try getpwuid(getuid()) instead, but for some reason both of them fail.</p>
<p>Manually setting it to some dummy value seemed to be the solution:</p>
<blockquote><pre>username = "blah"; // doesn't work:  getlogin();</pre>
</blockquote>
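<p>A less blunt variant of the same hack is to keep both lookups and only fall back to a fixed name when they fail. The sketch below is not mini_sendmail&#8217;s actual code; <tt>lookup_username</tt> and its fallback argument are invented for illustration. A plausible cause for the failures (an assumption, not something verified on this setup) is that the jail has no /etc/passwd and no login session for the libc calls to consult.</p>

```c
#include <pwd.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not mini_sendmail's actual code.
   Tries getlogin(), then the passwd database, then a caller-supplied
   default. The returned pointer may refer to static libc storage. */
const char *lookup_username(const char *fallback)
{
    const char *name = getlogin();
    if (name != NULL && name[0] != '\0')
        return name;

    struct passwd *pw = getpwuid(getuid());
    if (pw != NULL && pw->pw_name != NULL && pw->pw_name[0] != '\0')
        return pw->pw_name;

    /* Both lookups failed, e.g. inside a bare chroot jail. */
    return fallback;
}
```

<p>With something like <tt>username = lookup_username("blah");</tt> the binary would still behave normally outside the jail.</p>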
<p><strong>Problem 2: replacement binaries were simply just not working at all from Apache/PHP</strong></p>
<p>The symptom was &#8230;. PHP&#8217;s mail() was just returning &#8220;false&#8221;. Nothing in the Apache error logs or anywhere else.</p>
<p>I even tried creating a C program that just logged the arguments and any STDIN input to a file, pointed the php.ini file to it, but nothing appeared in the output file either.</p>
<p>I tried calling the binaries using system() and exec() from PHP, but they returned &#8220;&#8221; (blank string) and nothing was logged in the Apache logs. WTF?</p>
<p>Turns out you need to have /bin/sh in your chroot jail, or else PHP fails to run any external program. I guess this is because it uses <a href="http://www.kernel.org/doc/man-pages/online/pages/man3/system.3.html">system(3)</a>, which forks a shell, instead of using <a href="http://www.kernel.org/doc/man-pages/online/pages/man3/exec.3.html">exec(3)</a> or <a href="http://www.kernel.org/doc/man-pages/online/pages/man2/execve.2.html">execve(2)</a>.</p>
<p>I don&#8217;t know. Adding /bin/sh worked. End of story.</p>
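<p>The difference between the two approaches can be sketched in C. system(3) and popen(3) hand the command line to <tt>/bin/sh -c</tt>, so they cannot work in a jail without a shell, while fork() plus execv() starts the binary directly. The helper name <tt>run_without_shell</tt> is made up for this example; it says nothing about what PHP does internally, it only illustrates why the shell matters.</p>

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs argv[0] directly, without going through /bin/sh.
   Returns the child's exit status, or -1 if fork/waitpid failed or the child
   was killed by a signal. An exit status of 127 means execv() itself failed,
   e.g. because the binary does not exist inside the jail. */
int run_without_shell(char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: replace the process image; only returns on failure. */
        execv(argv[0], argv);
        _exit(127);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

<p>Note that argv[0] must be a full path here, since execv() does not search PATH.</p>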
<p><strong>Problem 3: mini_sendmail screwing up while parsing email addresses from mail headers</strong></p>
<p>When I tried to get this working with Mediawiki, I ran into another problem: the activation mails weren&#8217;t being sent out. A little bit of tcpdump was all it took to notice what was going on there: the following command was being rejected by the MTA:</p>
<blockquote><p><tt>RCPT TO:&lt;Real Name &lt;foo@example.org&gt;</tt></p></blockquote>
<p>Here the arg_dumper C binary I wrote came in handy. This was what it showed being passed to mail() from Mediawiki:</p>
<blockquote><p><tt>To: Real Name &lt;foo@example.org&gt;</tt></p></blockquote>
<p>Two problems here:</p>
<ol>
<li> The &#8216;RCPT TO&#8217; command should not include the descriptive name of the recipient; only the address should be included</li>
<li> Where did the last &gt; go in the SMTP &#8216;RCPT TO:&#8217; command, anyhow?</li>
</ol>
<p>Turns out it was the <tt>add_recipient()</tt> function in mini_sendmail that doesn&#8217;t handle this very well:</p>
<blockquote><pre>
    /* Skip leading whitespace. */
    while ( len > 0 &#038;&#038; ( *recipient == ' ' || *recipient == '\t' ) )
        {
        ++recipient;
        --len;
        }

    /* Strip off any angle brackets. */
    while ( len > 0 &#038;&#038; *recipient == '<' )
        {
        ++recipient;
        --len;
        }
    while ( len > 0 &#038;&#038; recipient[len-1] == '>' )
        --len;
</pre>
</blockquote>
<p>So, what does it do? First it chops off any leading whitespace and any leading opening angle brackets (&#8216;&lt;&#8217;). Then it chops off any trailing closing angle brackets (&#8216;&gt;&#8217;).</p>
<p>With the Mediawiki format the string starts with the real name, not with a bracket, so the opening bracket stays and only the last character gets chopped off. So, I modified that to something like:</p>
<blockquote><pre>

    /* Skip leading whitespace. */
    while ( len > 0 &#038;&#038; ( *recipient == ' ' || *recipient == '\t' ) )
        {
        ++recipient;
        --len;
        }

    /* Strip off any angle brackets, but only if there is a trailing one. */
    if ( len > 0 &#038;&#038; recipient[len-1] == '>' )
        {
        /* Drop everything before the opening bracket (e.g. the real name). */
        while ( len > 0 &#038;&#038; *recipient != '<' )
            {
            ++recipient;
            --len;
            }

        /* Skip the leading < we stopped at, if we found one. */
        if ( len > 0 )
            {
            ++recipient;
            --len;
            }

        while ( len > 0 &#038;&#038; recipient[len-1] == '>' )
            --len;
        }
</pre>
</blockquote>
<p>I.e.: only do this bracket-stripping if there is a trailing bracket, and chop off anything before the first bracket.</p>
<p>..which works with all of these:</p>
<blockquote><pre>
To: Real Name &lt;foo@example.org&gt;
To: &lt;foo@example.org&gt;
To: foo@example.org
</pre>
</blockquote>
<p>Produces:</p>
<blockquote><p><tt>RCPT TO:&lt;foo@example.org&gt;</tt></p></blockquote>
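<p>To try the logic outside of mini_sendmail, the same steps can be wrapped in a small stand-alone function. This is only a sketch: <tt>extract_address</tt> is a made-up name, it takes just the header value (without the &#8220;To:&#8221; prefix), and it makes no attempt at full RFC-style address parsing (quoted names, comments, address lists).</p>

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-alone helper, not part of mini_sendmail.
   Finds the bare address inside a recipient string. Returns a pointer to
   its first character and stores its length in *addr_len. */
const char *extract_address(const char *recipient, size_t *addr_len)
{
    size_t len = strlen(recipient);

    /* Skip leading whitespace. */
    while (len > 0 && (*recipient == ' ' || *recipient == '\t')) {
        ++recipient;
        --len;
    }

    /* Only strip brackets when the string ends with a closing bracket. */
    if (len > 0 && recipient[len - 1] == '>') {
        /* Drop the display name, i.e. everything before the first '<'. */
        while (len > 0 && *recipient != '<') {
            ++recipient;
            --len;
        }
        /* Skip the '<' itself, if one was found. */
        if (len > 0) {
            ++recipient;
            --len;
        }
        /* Drop the trailing closing bracket(s). */
        while (len > 0 && recipient[len - 1] == '>')
            --len;
    }

    *addr_len = len;
    return recipient;
}
```

<p>A string that ends in a closing bracket but has no opening one comes back with length 0, so the caller can reject it instead of sending a broken RCPT TO.</p>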
<p>&#8230;</p>
<p><b>Yay.</b></p>
<p>I can haz mod_chroot&#8217;ed mail().</p>
]]></content:encoded>
			<wfw:commentRss>http://spoox.org/wp/2011/01/04/lessons-from-apache-mod_chroot-vs-phps-mail/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Windows sounds&#8230; oh my.</title>
		<link>http://spoox.org/wp/2010/12/22/windows-sounds-oh-my/</link>
		<comments>http://spoox.org/wp/2010/12/22/windows-sounds-oh-my/#comments</comments>
		<pubDate>Wed, 22 Dec 2010 20:48:36 +0000</pubDate>
		<dc:creator>Rune</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://spoox.org/wp/?p=62</guid>
		<description><![CDATA[Hmm, is it just me, or isn&#8217;t selecting the &#8220;No sounds&#8221; scheme one of the first things you would do after installing Windows? I would at least expect that any experienced and sane person would do so&#8230; Hearing &#8220;DING&#8221; and &#8230; <a href="http://spoox.org/wp/2010/12/22/windows-sounds-oh-my/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>Hmm, is it just me, or isn&#8217;t selecting the &#8220;No sounds&#8221; scheme one of the first things you would do after installing Windows? I would at least expect that any experienced and <em>sane</em> person would do so&#8230;</p>
<p>Hearing &#8220;DING&#8221; and Windows startup and shutdown sounds from people in the IT department or who are supposed to be &#8220;system administrators&#8221; really scares me&#8230;</p>
<p>It makes me wonder what other annoying, useless default features they&#8217;ve left enabled&#8230; :P</p>
]]></content:encoded>
			<wfw:commentRss>http://spoox.org/wp/2010/12/22/windows-sounds-oh-my/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Silverlight and socket support &#8211; what were they thinking?</title>
		<link>http://spoox.org/wp/2008/11/07/silverlight-and-socket-support-what-were-they-thinking/</link>
		<comments>http://spoox.org/wp/2008/11/07/silverlight-and-socket-support-what-were-they-thinking/#comments</comments>
		<pubDate>Fri, 07 Nov 2008 17:22:09 +0000</pubDate>
		<dc:creator>Rune</dc:creator>
				<category><![CDATA[Silverlight]]></category>

		<guid isPermaLink="false">http://spoox.org/wp/2008/11/07/silverlight-and-socket-support-what-were-they-thinking/</guid>
		<description><![CDATA[With the beta 2 release of Silverlight 2 (SL2-B2), Microsoft decided to change the socket implementation to require a socket policy server, which is basically a TCP server responding with an XML document if you send it a certain string. Prior &#8230; <a href="http://spoox.org/wp/2008/11/07/silverlight-and-socket-support-what-were-they-thinking/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>With the beta 2 release of <a href="http://www.microsoft.com/silverlight">Silverlight</a> 2 (SL2-B2), Microsoft decided to change the <a href="http://msdn.microsoft.com/en-us/library/cc296248(VS.95).aspx">socket implementation</a> to require a <a href="http://msdn.microsoft.com/en-us/library/cc645032(VS.95).aspx#sectionToggle4">socket policy server</a>, which is basically a TCP server responding with an XML document if you send it a certain string.</p>
<p><span id="more-49"></span></p>
<p>Prior to SL2-B2, the only <a href="http://msdn.microsoft.com/en-us/library/cc645032(VS.95).aspx">restriction</a> for socket connections was that the connection had to be done against the same hostname/IP-address that served the Silverlight application, and that the TCP port needed to be in the range 4502-4534.</p>
<p>Release candidate 0 (SL2-RC0) and the final &#8220;Release-To-Web&#8221; release (SL2-RTW) didn&#8217;t introduce any new changes.</p>
<p>So, the restrictions are now:</p>
<ol>
<li>Connections can only be made to TCP ports in the range 4502-4534</li>
<li>A socket policy server must be running on port 943 on the same server</li>
<li>The policy returned by the policy server must match with the application performing the request</li>
</ol>
<p>The arguments for introducing the policy server seem to be:</p>
<ol>
<li>Added security</li>
<li>A server hosting a socket server is not likely to also run a web server</li>
</ol>
<p>Some related links:</p>
<ol>
<li><a href="http://timheuer.com/blog/archive/2008/06/06/silverlight-sockets-requires-policy-server-beta-2.aspx">http://timheuer.com/blog/archive/2008/06/06/silverlight-sockets-requires-policy-server-beta-2.aspx</a></li>
<li><a href="http://silverlight.net/blogs/msnow/archive/2008/06/26/full-implementation-of-a-silverlight-policy-server.aspx">http://silverlight.net/blogs/msnow/archive/2008/06/26/full-implementation-of-a-silverlight-policy-server.aspx</a></li>
</ol>
<h3>Added security?</h3>
<p>The first point is of course a partially valid one. If you have a socket service, you might want to limit who has access to the service in order to prevent unauthorized use of it, or prevent so-called <a href="http://en.wikipedia.org/wiki/Cross-site_request_forgery">Cross-Site Request Forgery (XSRF)</a> attacks.</p>
<p>Basically this is what&#8217;s going on:</p>
<ol>
<li>The user goes to <b>http://server1.example.org/foo.html</b> and is served a HTML page with an embedded Silverlight application</li>
<li>The Silverlight app tries to do a connection against <b>server2.example.org:4502</b></li>
<li>Silverlight intercepts the request and does a connection against <b>server2.example.org:943</b> and requests the policy XML</li>
<li>Silverlight checks if the policy file accepts connections to server2.example.org:4502 from server1.example.org/foo.html or server1.example.org or *.example.org or some kind of match</li>
<li>The socket request is then either allowed or disallowed, depending on whether a policy server answered and returned an XML policy allowing the request</li>
</ol>
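<p>The server side of step 3 is tiny. The sketch below shows only the decision a policy server has to make once it has read some bytes from the client; the accept/recv/send plumbing is left out. The request string and the policy layout are written down from memory of the Silverlight 2 documentation, so treat both as assumptions to verify, and <tt>policy_reply_for</tt> is a made-up name.</p>

```c
#include <stddef.h>
#include <string.h>

/* The string Silverlight is expected to send to port 943 (assumption). */
static const char POLICY_REQUEST[] = "<policy-file-request/>";

/* Example policy: any origin may connect to TCP ports 4502-4534. */
static const char POLICY_RESPONSE[] =
    "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n"
    "<access-policy>\n"
    "  <cross-domain-access>\n"
    "    <policy>\n"
    "      <allow-from>\n"
    "        <domain uri=\"*\"/>\n"
    "      </allow-from>\n"
    "      <grant-to>\n"
    "        <socket-resource port=\"4502-4534\" protocol=\"tcp\"/>\n"
    "      </grant-to>\n"
    "    </policy>\n"
    "  </cross-domain-access>\n"
    "</access-policy>\n";

/* Hypothetical helper: returns the policy document if the bytes received
   from the client are exactly the policy request, otherwise NULL. */
const char *policy_reply_for(const char *received, size_t received_len)
{
    if (received_len == sizeof POLICY_REQUEST - 1 &&
        memcmp(received, POLICY_REQUEST, received_len) == 0)
        return POLICY_RESPONSE;
    return NULL;
}
```

<p>A real server would send the returned string back on the socket and close the connection; on NULL it would simply hang up.</p>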
<p>This is more or less the same process as for web service calls or web requests (http/https) from within a Silverlight application, with the difference being that the access policy file is then read from <b>http://&lt;server-and-port-being-called&gt;/ClientAccessPolicy.xml</b> (or the Flash <b>crossdomain.xml</b> file).</p>
<p><em>However</em>, the limitation of what services Silverlight applications are able to call is implemented in the Silverlight plugin itself! If you make an external application (i.e. not in Silverlight) in &lt;your favorite programming language&gt; that does socket, http/https or web service calls against these services, these are of course not checked against the access policy files first.</p>
<p>In other words: <b>It leaves these services wide open to any other non-restricted applications doing calls against them. This only protects against XSRF attacks, and nothing else.</b></p>
<p>For the services to be really secure and limit requests, these access checks need to be implemented in the socket services or in the web services etc themselves. You simply cannot rely on some policy file having been checked in advance. Of course, in a socket server you can&#8217;t figure out which web page the application originated from* &#8211; only what remote IP and port the caller has &#8211; so the policy files do still make a little bit of sense there. (* You could maybe record the IP address in an ASPX page serving the Silverlight app and do something smart there and check this again in the socket)</p>
<h3>&#8220;A server hosting a socket server is not likely to also run a web server&#8221;</h3>
<p>Now, I don&#8217;t know where this idea came from, or what the reasoning behind it was. My reactions are:</p>
<ol>
<li>Why not also do a request against an HTTP service on the same server, <i>in case</i> there is a web server running there already?</li>
<li>Why not allow the policy server port to be configurable? If someone is already running a socket server there in the 4502-4534 range, surely they will be able to set up a service running on port 943 also. (Yes, some OSes limit listening sockets below port 1024 or 4096 to the root/administrator user, but how likely is that?). Also, it&#8217;s not like this will be directly insecure either, as the requesting client can&#8217;t affect which socket servers are running on the remote server anyhow, AND the policy file needs to be checked. I guess one argument could be that this allows for an administrator to manage what goes on, instead of permitting the individual socket server developers to do that.</li>
<li>The policy file is a standard XML file. It is fetched by a client connecting to a service, issuing a request, and getting the file back. Does this sound <em>vaguely familiar</em>, like another, very common protocol? Why insist on creating a new, custom protocol for this, instead of just basing this on HTTP? (see #1)</li>
</ol>
<p>(Regarding #1, this makes me think of another weird decision: leaving out double-click mouse events or mouse-wheel support just because some interfaces might not have anything equivalent. It’s like limiting the screen resolution to 640×480 just in case some users don’t have a monitor capable of a higher resolution, or leaving out mouse support altogether in case some devices might only have a keyboard, and no joystick/mouse/touch screen etc. I don&#8217;t get it &#8212; it degrades the user experience for the sake of a hypothetical, uncommon scenario)</p>
<h3>Firewalls</h3>
<p>Most non-standard ports will normally be blocked by most &#8220;secure&#8221; firewall settings, and if you&#8217;re behind a corporate firewall, chances are you&#8217;ll experience even more restrictive policies with only a very few ports being open, e.g. port 80 and 443 for http/https.</p>
<p>Adding both a limit of a policy server running on port 943, AND the request being limited to ports 4502-4534 will make this virtually impossible to get to work in most corporate settings, and also other restrictive firewall setups.</p>
<p>It can be argued in reverse also; enforcing this range makes it easier to standardize the port range for access lists. Yes, but; from my experience with firewall administrators and IT departments, they are pretty sensitive about what they will put into their rules. They will most likely only open up the range to a select destination IP or range, and then they could just as well have done that for another port range. Opening up 4502-4534 to <i>all</i> target destinations is not likely to happen.</p>
<p>The common work-around for this problem is for people to just set up e.g. a non-HTTP service on the common HTTP port, or something similar. As long as the firewall isn&#8217;t inspecting the traffic and doing protocol analysis, this usually works fine. If you have packet inspection, you&#8217;ll probably be screwed anyhow.</p>
<p>Secondly, it limits calls against standard services, such as chat servers, web servers, ftp servers, etc since these do not typically run in the 4502-4534 range, but usually have their own defined ports. If you want to be able to connect to such services, you now need to either reconfigure them to use this port range, or write a proxy server that proxies/tunnels connections and traffic to the correct ports.</p>
<h3>My conclusion</h3>
<p>My conclusion is that Microsoft weighed the security aspects of this to be much more important than offering developers flexibility and possibilities, a choice I can clearly understand and respect. However, in that case I think they really outdid themselves in adding obstacles.</p>
<p>Or; they just didn&#8217;t carefully think this through before rushing out SL2-RTW.. (because let&#8217;s admit it; one RC, ~2 weeks, hardly any changes?)</p>
<p>What I feel would have sufficed:</p>
<ol>
<li>Access policy file must be present either on a provided port (default 943 if they insist), or on http/https on the same server.</li>
<li>Use something equivalent to HTTP instead of that custom protocol. It could perfectly well be VERY limited (&#8220;GET /ClientAccessPolicy.xml\n\n&#8221; + dummy headers + XML response).</li>
<li>No port range restriction, other than what is enforced through the access policy file. (Or at least offer the possibility of requesting &#8220;elevated privileges&#8221; from the user in form of a dialog or something for using non-SL-range ports, e.g. like Java does).</li>
</ol>
]]></content:encoded>
			<wfw:commentRss>http://spoox.org/wp/2008/11/07/silverlight-and-socket-support-what-were-they-thinking/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>A few ways of &#8220;watermarking&#8221; mp3 files</title>
		<link>http://spoox.org/wp/2007/02/12/a-few-ways-of-watermarking-mp3-files/</link>
		<comments>http://spoox.org/wp/2007/02/12/a-few-ways-of-watermarking-mp3-files/#comments</comments>
		<pubDate>Mon, 12 Feb 2007 05:55:48 +0000</pubDate>
		<dc:creator>Rune</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://spoox.org/wp/?p=32</guid>
		<description><![CDATA[Recently there has been talk by some music labels that they will be releasing non-DRMed mp3 files, followed by updates that they are going to &#8220;watermark&#8221; these files. I choose to write &#8220;watermark&#8221; quoted, as this is really more tagging &#8230; <a href="http://spoox.org/wp/2007/02/12/a-few-ways-of-watermarking-mp3-files/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>Recently there has been talk by some music labels that they will be releasing non-DRMed mp3 files, followed by updates that they are going to &#8220;watermark&#8221; these files.</p>
<p>I choose to write &#8220;watermark&#8221; quoted, as this is really more tagging than watermarking, because I doubt they will be applying on-the-fly digital audio processing to watermark, followed by encoding each mp3 specifically for each customer. Let&#8217;s assume they&#8217;re going to just modify stocks of ready-encoded mp3s.</p>
<p><span id="more-32"></span></p>
<p><strong>MP3 file format layout</strong></p>
<p>Before continuing it&#8217;s important to understand how MP3 files are stored, so let&#8217;s take a closer look at how they are structured internally:</p>
<p>ID3v2 tag  (optional)<br />
LAME header  (optional)<br />
MPEG frame 1<br />
MPEG frame 2<br />
..<br />
MPEG frame N<br />
ID3v1/ID3v1.1 tag  (optional)</p>
<p><strong>ID3 tags</strong></p>
<p>The <em>ID3 tags</em> contain various metadata about the file, which is normally either entered automatically by the encoder or edited manually by the person encoding the files. Two standards exist:</p>
<p><a href="http://www.id3.org/ID3v1">ID3v1</a> &#8211; which later got a slight modification and became ID3v1.1 &#8211; is a 128-byte fixed length structure located at the end of files.</p>
<p><a href="http://www.id3.org">ID3v2</a> which is a more dynamic structure, typically located at the beginning of the file. This allows for a plethora of different information to be stored, with several extensions being made all the time. (e.g. Lyrics3 which allows the lyrics transcript of a song to be embedded into the file)</p>
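<p>Spotting the two tag types in a file that has been loaded into memory takes only a few lines of C. This is a sketch with made-up function names; it relies on the &#8220;TAG&#8221; marker of ID3v1 and the 10-byte ID3v2 header (&#8220;ID3&#8221;, two version bytes, one flag byte, four size bytes carrying 7 bits each), and it ignores the optional footer that newer ID3v2 revisions allow.</p>

```c
#include <stddef.h>
#include <string.h>

/* ID3v1: fixed 128-byte block at the very end, starting with "TAG". */
int has_id3v1(const unsigned char *file, size_t len)
{
    return len >= 128 && memcmp(file + len - 128, "TAG", 3) == 0;
}

/* ID3v2: header at the start of the file. The size field uses only the
   lower 7 bits of each of its four bytes.
   Returns the tag length including the 10-byte header, or 0 if absent. */
size_t id3v2_length(const unsigned char *file, size_t len)
{
    if (len < 10 || memcmp(file, "ID3", 3) != 0)
        return 0;

    size_t body = ((size_t)(file[6] & 0x7F) << 21) |
                  ((size_t)(file[7] & 0x7F) << 14) |
                  ((size_t)(file[8] & 0x7F) << 7)  |
                   (size_t)(file[9] & 0x7F);
    return body + 10;
}
```

<p>Everything between the end of the ID3v2 tag and the start of the ID3v1 tag is then (supposedly) MPEG frames.</p>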
<p><strong>MPEG frames</strong></p>
<p><em>MPEG frames</em> are small chunks of audio data. The size of these frames will depend on the bitrate used in encoding the file, but each frame will be prefixed with a 4-byte <em>frame header</em>.</p>
<p>These headers contain information needed to interpret and make use of the encoded data (e.g. which MPEG encoding method is used, which bitrate the frame is encoded at, the sampling frequency etc) but also non-functional data such as a &#8220;protection bit&#8221;, &#8220;private bit&#8221;, &#8220;copyright bit&#8221;, &#8220;original bit&#8221; etc.</p>
<p>For a full description of the MPEG headers see e.g. this article at <a href="http://www.mp3-tech.org/programmer/frame_header.html">mp3-tech.org</a> or search the web.</p>
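<p>To make the header layout concrete, here is a sketch that unpacks the 32 header bits into named fields. The struct and function names are invented for this example, and the function only checks the 11-bit sync marker; it does not reject reserved values in the version, layer, bitrate or sample rate fields.</p>

```c
#include <stdint.h>

/* Field names are illustrative; the values are the raw bit patterns. */
struct mp3_frame_header {
    unsigned version_id;        /* 3 = MPEG-1, 2 = MPEG-2, 0 = MPEG-2.5 */
    unsigned layer_id;          /* 1 = Layer III, 2 = Layer II, 3 = Layer I */
    unsigned no_crc;            /* protection bit: 1 means no CRC follows */
    unsigned bitrate_index;
    unsigned samplerate_index;
    unsigned padding;
    unsigned private_bit;
    unsigned channel_mode;
    unsigned mode_extension;
    unsigned copyright;
    unsigned original;
    unsigned emphasis;
};

/* Returns 1 and fills *h if the 4 bytes start with the sync marker
   (eleven set bits), 0 otherwise. */
int parse_frame_header(const uint8_t b[4], struct mp3_frame_header *h)
{
    if (b[0] != 0xFF || (b[1] & 0xE0) != 0xE0)
        return 0;

    h->version_id       = (b[1] >> 3) & 0x03;
    h->layer_id         = (b[1] >> 1) & 0x03;
    h->no_crc           =  b[1]       & 0x01;
    h->bitrate_index    = (b[2] >> 4) & 0x0F;
    h->samplerate_index = (b[2] >> 2) & 0x03;
    h->padding          = (b[2] >> 1) & 0x01;
    h->private_bit      =  b[2]       & 0x01;
    h->channel_mode     = (b[3] >> 6) & 0x03;
    h->mode_extension   = (b[3] >> 4) & 0x03;
    h->copyright        = (b[3] >> 3) & 0x01;
    h->original         = (b[3] >> 2) & 0x01;
    h->emphasis         =  b[3]       & 0x03;
    return 1;
}
```

<p>The private, copyright and original bits are the ones that carry no information a decoder needs.</p>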
<p><strong>LAME header</strong></p>
<p>Some encoders, such as <a href="http://lame.sourceforge.net/index.php">LAME</a> (and Xing), add frames that appear as regular MPEG frames, but that actually contain additional meta data about the encoding parameters used etc.</p>
<p><strong>SO WHERE DOES THIS LEAD US? </strong></p>
<p>Let&#8217;s have a look at a few (but probably not all) ways of tagging mp3 files with some kind of watermark, in varying depth of detail.</p>
<p><strong>1. Adding out-of-stream data to the MPEG stream</strong></p>
<p>MPEG frame headers contain a sync marker, so that a player can check if the next location it is about to read is a valid MPEG frame. This means that you can place out-of-stream data in between frames.</p>
<p>Players will just skip a few bytes until they find a valid MPEG frame header sync marker, but other applications can choose to store custom data in here.</p>
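<p>The resynchronisation a player does can be sketched as a simple scan for the sync pattern. <tt>find_frame_sync</tt> is a made-up name, and real players usually do more validation than this (for instance checking that the rest of the header makes sense), so take it as an illustration of the idea only.</p>

```c
#include <stddef.h>

/* Returns the offset of the first byte pair that looks like a frame sync
   (0xFF followed by a byte whose top three bits are set), or len if
   there is none. Everything before that offset is out-of-stream data. */
size_t find_frame_sync(const unsigned char *buf, size_t len)
{
    for (size_t i = 0; i + 1 < len; i++)
        if (buf[i] == 0xFF && (buf[i + 1] & 0xE0) == 0xE0)
            return i;
    return len;
}
```

<p>The flip side is that the hidden payload must never contain that byte pattern itself, or a player might mistake it for the start of a frame.</p>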
<p><strong>2. Using the unused MPEG frame header bits</strong></p>
<p>As mentioned above, MPEG frame headers contain bits like the &#8220;original bit&#8221;, &#8220;private bit&#8221; etc that are not of much use to players, so for each frame of the MP3 file you can store up to several bits of information.</p>
<p>Spread across the entire file, depending on the length of the song, this would allow for quite a lot of data which can be used as tracking markers once read by a special program.</p>
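<p>As a rough idea of the capacity: an MPEG-1 Layer III frame holds 1152 samples, so at 44.1 kHz there are about 38 frames per second, and a four-minute song has roughly 9,000 frames. At one bit per frame that is a bit over 1 KB. A sketch of storing and reading a single bit in the private bit (the lowest bit of the third header byte) could look like this; the function names are made up.</p>

```c
#include <stdint.h>

/* Store one payload bit in the private bit of a 4-byte frame header. */
void set_private_bit(uint8_t header[4], unsigned bit)
{
    header[2] = (uint8_t)((header[2] & 0xFE) | (bit & 1u));
}

/* Read the payload bit back. */
unsigned get_private_bit(const uint8_t header[4])
{
    return header[2] & 1u;
}
```

<p>A tracking ID would then be written bit by bit across consecutive frames, probably repeated several times so it survives the file being cut.</p>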
<p><strong>3. Using the ID3v1 tag</strong></p>
<p>I find this not very likely, but as a method you could choose to utilize the &#8220;comment&#8221; field of the ID3v1 header to add some sort of numeric ID etc.</p>
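<p>For completeness, a sketch of what that would look like. The ID3v1 tag is 128 bytes: &#8220;TAG&#8221;, then title, artist and album (30 bytes each), year (4), comment (30) and genre (1), which puts the comment at offset 97. The function name and the ID format are made up, and only the first 28 comment bytes are touched so that an ID3v1.1 track number in the last two bytes survives.</p>

```c
#include <stddef.h>
#include <string.h>

enum {
    ID3V1_SIZE           = 128,
    ID3V1_COMMENT_OFFSET = 97,  /* 3 + 30 + 30 + 30 + 4 */
    ID3V1_COMMENT_LEN    = 28   /* leaves the ID3v1.1 track bytes alone */
};

/* Hypothetical helper: overwrites the comment field with id.
   Returns 0 on success, -1 if tag is not an ID3v1 tag or id is too long. */
int set_id3v1_comment(unsigned char tag[ID3V1_SIZE], const char *id)
{
    size_t n = strlen(id);
    if (memcmp(tag, "TAG", 3) != 0 || n > ID3V1_COMMENT_LEN)
        return -1;

    /* Zero-pad the field, then copy the ID in without a terminator. */
    memset(tag + ID3V1_COMMENT_OFFSET, 0, ID3V1_COMMENT_LEN);
    memcpy(tag + ID3V1_COMMENT_OFFSET, id, n);
    return 0;
}
```

<p>Any tag editor will show (and happily overwrite) this field, which is why it makes for a poor hiding place.</p>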
<p><strong>4. Using the ID3v2 tag</strong></p>
<p>There are primarily two methods here:</p>
<p>a) Using an unused or little-used, or even creating a custom ID3v2 block</p>
<p>b) Using the unused padding space of the ID3v2 tag</p>
<p><strong>5. Constructing special MPEG frames like the LAME header</strong></p>
<p>You could divide this into two methods also:</p>
<p>a) Frames using one of the undefined or disallowed bit combinations of a frame header to mark it as invalid, so that the player will skip it. Then custom data can be stored in the actual data portion of the frame. (I <em>think</em> this is what LAME does)</p>
<p>b) Special valid encoded frames consisting of an audio watermark (e.g. some audio wave) could be added. This would probably play back as a short click or noise, though, so it might not end up sounding too good.</p>
<p><strong>6. Using the MPEG frame header CRC</strong></p>
<p>The encoder can optionally add checksums to each MPEG frame, so the validity of the file can be tested for corrupted frames etc.</p>
<p>Seeing that not many players actually test for this even if it is present, this could be used to insert 2 bytes of data per frame header.</p>
<p>It could be a bit risky doing this, though, should any players actually choose to test the validity of frames against the CRC.</p>
<p><strong>7. Using variations of volume in each MPEG frame</strong></p>
<p>Quite honestly this is beyond what I know too much about, but I know certain tools (e.g. <a href="http://mp3gain.sourceforge.net/">mp3gain</a>) can normalize or change the volume of an mp3 without transcoding (re-encoding), and thus offers a non-destructive way of doing this.</p>
<p>This is based on the fact that data is (I think) stored as floating point values. These values again are stored using a &#8220;sign * mantissa * radix ^ exponent&#8221; format, which means that you can increment or modify these by fixed values back and forth and introduce gradual change with the option of still getting back to the original values (up to a certain level I guess).</p>
<p>Utilizing this, I guess you could somehow introduce short changes in volume from frame to frame that would go by unnoticed by the listener, but through analysis could be detected. E.g. think morse code.</p>
<p>I don&#8217;t know if this would work. Maybe. It&#8217;s just an idea..</p>
<p><strong>CONCLUSION</strong></p>
<p>There are a lot of places you can hide information in an MP3 file. However, it does not take more than having 2 different watermarked copies of the same file to figure out where data is being stored.</p>
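<p>That comparison is about as simple as it sounds. The sketch below assumes both copies are loaded into memory and have the same length, which only holds for in-place tricks like flipped header bits; inserted data shifts everything after it and needs a smarter diff. The function name is made up.</p>

```c
#include <stddef.h>
#include <stdio.h>

/* Reports each contiguous range of differing bytes to out (may be NULL
   to suppress output) and returns the number of ranges found. */
size_t report_differences(const unsigned char *a, const unsigned char *b,
                          size_t len, FILE *out)
{
    size_t ranges = 0;
    size_t i = 0;

    while (i < len) {
        if (a[i] == b[i]) {
            i++;
            continue;
        }
        /* Found a difference: extend it as far as the bytes keep differing. */
        size_t start = i;
        while (i < len && a[i] != b[i])
            i++;
        if (out != NULL)
            fprintf(out, "bytes %zu-%zu differ\n", start, i - 1);
        ranges++;
    }
    return ranges;
}
```

<p>If the reported ranges line up with frame headers or tag fields, you know which of the methods above was used.</p>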
<p>Of course, one could combine all the methods described above, or even additional ones, but in the end all of this information could be stripped away leaving only the MP3 audio data left.</p>
<p>We&#8217;ll find out soon enough as these files hit the streets..</p>
<p><b>Update</b><br />
Someone pointed out the fact that on-the-fly watermarking and transcoding would be perfectly well possible on smaller ranges of the files, an option I never considered&#8230;..</p>
<p><b>Update 2011</b><br />
This article is pretty much outdated, as today&#8217;s computing power allows for on-the-fly watermarking and re-encoding. However, the methods described are still valid, but not very hard to defeat.</p>
]]></content:encoded>
			<wfw:commentRss>http://spoox.org/wp/2007/02/12/a-few-ways-of-watermarking-mp3-files/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
	</channel>
</rss>

