<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Inkzee &#187; benchmarks</title>
	<atom:link href="http://blog.inkzee.com/index.php/tag/benchmarks/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.inkzee.com</link>
	<description>How to read more in less time</description>
	<lastBuildDate>Wed, 12 May 2010 13:23:28 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.1</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<atom:link rel="hub" href="http://pubsubhubbub.appspot.com"/>		<item>
		<title>First stats with Tokyo Cabinet</title>
		<link>http://blog.inkzee.com/index.php/2009/06/25/first-stats-with-tokyo-cabinet/</link>
		<comments>http://blog.inkzee.com/index.php/2009/06/25/first-stats-with-tokyo-cabinet/#comments</comments>
		<pubDate>Thu, 25 Jun 2009 02:15:33 +0000</pubDate>
		<dc:creator>abarrera</dc:creator>
				<category><![CDATA[inkzee]]></category>
		<category><![CDATA[benchmarks]]></category>
		<category><![CDATA[database]]></category>
		<category><![CDATA[pytc]]></category>
		<category><![CDATA[pyTyrant]]></category>
		<category><![CDATA[schemaless]]></category>
		<category><![CDATA[tokyo cabinet]]></category>
		<category><![CDATA[tokyo tyrant]]></category>

		<guid isPermaLink="false">http://blog.inkzee.com/?p=62</guid>
		<description><![CDATA[Today we started testing Tokyo Cabinet as our DBM for the new design. We had heard very good things about it, so we thought we should give it a try.
After setting up Tokyo Cabinet with its Python binding, and Tokyo Tyrant (db server) with its Python bindings too, we ran some quick tests. We drafted a [...]]]></description>
			<content:encoded><![CDATA[<p>Today we <strong>started testing <a href="http://tokyocabinet.sourceforge.net/">Tokyo Cabinet</a></strong> as our DBM for the new design. We had heard very good things about it, so we thought we should give it a try.</p>
<p>After setting up Tokyo Cabinet with its Python binding, and <strong><a href="http://tokyocabinet.sourceforge.net/tyrantdoc/">Tokyo Tyrant</a> (db server)</strong> with its Python bindings too, we ran some quick tests. We drafted a new schema-less design for the new database and <strong>dumped part of some old data</strong> to Tokyo Cabinet.</p>
<p>For those not familiar with the term <strong>schema-less</strong>, it&#8217;s basically a database that has no table structure: everything is stored as a (key, value) tuple. On the one hand, a key-value database is much faster to read and write; on the other, it&#8217;s much harder to maintain and keep in sync.</p>
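<p>To make the idea concrete, here is a minimal sketch of how one record could be flattened into (key, value) pairs. The record, the <code>prefix:id:field</code> key naming and the <code>flatten</code> helper are all assumptions made up for illustration, and a plain Python dict stands in for the database; this is not our actual design.</p>

```python
# Hypothetical feed record, as a row in a relational table might hold it.
feed_row = {"id": 42, "title": "Inkzee blog", "url": "http://blog.inkzee.com"}


def flatten(prefix, record, id_field="id"):
    """Turn one record into flat (key, value) pairs, one pair per field."""
    record_id = record[id_field]
    return {
        f"{prefix}:{record_id}:{field}": str(value)
        for field, value in record.items()
        if field != id_field
    }


store = {}  # a plain dict stands in for the key-value database
store.update(flatten("feed", feed_row))

# Reading one field is a single key lookup; no table or column definition exists.
print(store["feed:42:title"])
```

<p>Nothing in the store knows that the two keys belong to the same feed, which is why keeping such data in sync falls to the application.</p>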
<p>So, we ran some queries (<strong>read only operations</strong>) against both databases, and this is what we saw:</p>
<p><strong>Test 1:</strong></p>
<ul>
<li>All data from a feed (MySQL):  0.01699 s</li>
<li>Partial data from a feed (TC): 0.00174 s</li>
</ul>
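<p>For readers who want to take this kind of measurement themselves, here is a minimal sketch of a timing helper. It is an assumption about one reasonable way to do it, not the harness behind the numbers above; <code>average_seconds</code> and <code>fake_query</code> are hypothetical names, and <code>fake_query</code> merely stands in for a real database call.</p>

```python
import time


def average_seconds(query, repeats=100):
    """Run query() `repeats` times and return the mean wall-clock seconds per run."""
    start = time.perf_counter()
    for _ in range(repeats):
        query()
    return (time.perf_counter() - start) / repeats


def fake_query():
    """Hypothetical stand-in for a real database call."""
    return sum(range(1000))


print(f"{average_seconds(fake_query):.6f} s")
```

<p>Averaging over many runs smooths out noise, which matters when single queries take only a few milliseconds.</p>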
<p>This first test wasn&#8217;t really fair, as MySQL had to retrieve all fields per record, while TC just had to access a bunch of buckets with fewer fields. We ran it anyway because it reflects the real scenario: we currently retrieve many more fields from a Feed than we should, so the new query under TC is faster not only because of the database, but also because it&#8217;s much more lightweight.</p>
<p>Anyway, we modified the test so that <strong>both queries retrieved the same fields per row</strong>:</p>
<p><strong>Test 2:</strong></p>
<ul>
<li>Partial data from a feed (MySQL): 0.00346 s</li>
<li>Partial data from a feed (TC): 0.00151 s</li>
</ul>
<p>Here the gap narrows, although TC is still a bit more than twice as fast. Again, this isn&#8217;t really fair, this time to TC: MySQL executes a single query, whereas with TC we issue several requests. So, we changed the TC query into a <strong>multiget request</strong> (request several keys at the same time):</p>
<p><strong>Test 3:</strong></p>
<ul>
<li>Partial data from a feed (MySQL): 0.003533 s</li>
<li>Partial data from a feed (TC with Multiget): 0.000845 s</li>
</ul>
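<p>The difference between several single gets and one multiget can be sketched as follows. <code>CountingStore</code> is a hypothetical in-memory stand-in, not the pyTyrant API; it only counts how many requests a client would send, which is the cost a multiget saves. The keys are made up for illustration.</p>

```python
class CountingStore:
    """Hypothetical in-memory stand-in for a remote key-value server.

    Every call counts as one round trip to the server.
    """

    def __init__(self, data):
        self.data = dict(data)
        self.round_trips = 0

    def get(self, key):
        self.round_trips += 1  # one request per key
        return self.data[key]

    def multi_get(self, keys):
        self.round_trips += 1  # all keys travel in a single request
        return {key: self.data[key] for key in keys if key in self.data}


store = CountingStore({
    "feed:42:title": "Inkzee blog",
    "feed:42:url": "http://blog.inkzee.com",
})
keys = ["feed:42:title", "feed:42:url"]

one_by_one = [store.get(key) for key in keys]
trips_single = store.round_trips

store.round_trips = 0
batched = store.multi_get(keys)
trips_batched = store.round_trips

print(trips_single, trips_batched)  # prints "2 1"
```

<p>With a real server each round trip adds network latency, so fetching N keys in one request instead of N is a plausible source of the speed-up in Test 3.</p>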
<p>Under comparable circumstances it&#8217;s clear which one is faster: TC with multiget took roughly a quarter of the time MySQL did. So, I think we&#8217;ll continue experimenting with Tokyo Cabinet on more real data and see how it performs.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.inkzee.com/index.php/2009/06/25/first-stats-with-tokyo-cabinet/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
	</channel>
</rss>
