Archive for category inkzee

First stats with Tokyo Cabinet

Today we started testing Tokyo Cabinet as our DBM for the new design. We had some very good references about it, so we thought we should give it a try.

After setting up Tokyo Cabinet and its Python binding, plus Tokyo Tyrant (the database server) and its Python bindings, we ran some quick tests. We drafted a new schema-less design for the new database and dumped part of our old data into Tokyo Cabinet.

For those not familiar with the term, a schema-less database has no table structure: everything is stored as a (key, value) pair. On the one hand, a key-value database is much faster to read and write; on the other, it’s much harder to maintain and keep in sync.
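As a rough illustration of the idea, here is a sketch that uses a plain Python dict as a stand-in for the store. It is not Tokyo Cabinet itself, and the key layout and function names are made up for this example. A feed that would be one row in a relational table becomes a handful of independent key/value pairs:

```python
# A plain dict stands in for the key-value store. Keys and values are strings,
# as they would be in a DBM-style database. The key layout is hypothetical.
store = {}


def put_feed(store, feed_id, fields):
    """Store each field of a feed under its own key, e.g. 'feed:42:title'."""
    for name, value in fields.items():
        store[f"feed:{feed_id}:{name}"] = str(value)


def get_field(store, feed_id, name):
    """Read a single field without touching the others."""
    return store.get(f"feed:{feed_id}:{name}")


put_feed(store, 42, {"title": "Example blog", "url": "http://example.com/rss"})
title = get_field(store, 42, "title")
```

Nothing ties those keys together except the naming convention, which is the "harder to keep in sync" part: deleting or renaming a feed means touching every one of its keys by hand.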

So, we ran some queries (read-only operations) against both databases, and this is what we saw:

Test 1:

  • All data from a feed (MySQL):  0.01699 s
  • Partial data from a feed (TC): 0.00174 s

This first test wasn’t really fair, as MySQL had to retrieve all fields per record, while TC just had to access a bunch of buckets with fewer fields. We ran it anyway because it reflects the real scenario: we currently retrieve many more fields from a feed than we should, so the new query under TC is faster not only because of the database but also because it’s much more lightweight.

Anyway, we modified the test so that both queries retrieved the same fields per row:

Test 2:

  • Partial data from a feed (MySQL): 0.00346 s
  • Partial data from a feed (TC): 0.00151 s

Here the two are much closer, although TC is still a little over twice as fast. Again, this isn’t really fair, as MySQL executes just one query against the several that we run with TC. So, we changed the TC query into a multiget request (requesting several keys at the same time):

Test 3:

  • Partial data from a feed (MySQL): 0.003533 s
  • Partial data from a feed (TC with Multiget): 0.000845 s
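To make the difference between several single gets and one multiget concrete, here is a toy sketch. The class and method names are hypothetical and are not the Tokyo Tyrant binding's API; the store lives in memory and only counts how many requests it receives. Presumably most of the gain in Test 3 comes from making one request instead of many, but the post doesn't break that down.

```python
class ToyStore:
    """In-memory stand-in for a remote key-value server that counts requests."""

    def __init__(self, data):
        self._data = dict(data)
        self.round_trips = 0

    def get(self, key):
        # One request per key.
        self.round_trips += 1
        return self._data.get(key)

    def multiget(self, keys):
        # One request for the whole batch; missing keys are simply left out.
        self.round_trips += 1
        return {k: self._data[k] for k in keys if k in self._data}


keys = [f"post:{i}:title" for i in range(50)]
store = ToyStore({k: f"title {i}" for i, k in enumerate(keys)})

# Fetch the same 50 values one key at a time...
one_by_one = [store.get(k) for k in keys]
trips_single = store.round_trips

# ...and then in a single batched request.
store.round_trips = 0
batched = store.multiget(keys)
trips_batched = store.round_trips
```

Both approaches return the same values; the batched version just asks once instead of fifty times.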

Under equal circumstances it’s clear which one is faster. So, I think we’ll continue experimenting with Tokyo Cabinet and some more real data and see how it performs.
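For a quick sense of scale, the ratios can be worked out from the timings listed in the three tests. This is only arithmetic on the published numbers, not a new measurement:

```python
# Timings in seconds, copied from the three tests above: (MySQL, Tokyo Cabinet).
timings = {
    "Test 1": (0.01699, 0.00174),
    "Test 2": (0.00346, 0.00151),
    "Test 3": (0.003533, 0.000845),
}

# How many times faster the TC query was in each test.
speedups = {name: mysql / tc for name, (mysql, tc) in timings.items()}

for name, ratio in speedups.items():
    print(f"{name}: TC is about {ratio:.1f}x faster")
```

That works out to roughly 9.8x, 2.3x and 4.2x for Tests 1, 2 and 3 respectively.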

When partitioning isn’t enough

These past weeks we’ve been partitioning our database design. The goal was to achieve better scalability. Because Inkzee grows with the number of feeds it holds, not the number of users, we needed to partition the data tables so that we could process feed posts faster.
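The post doesn't say how the tables were split. One common scheme, shown here purely as an assumption, is to route each feed's posts to one of a fixed number of tables based on the feed ID. The table naming and the partition count are hypothetical:

```python
NUM_PARTITIONS = 8  # hypothetical; the post does not give the real number


def posts_table_for(feed_id, num_partitions=NUM_PARTITIONS):
    """Return the name of the table that holds the posts of this feed."""
    return f"posts_{feed_id % num_partitions:02d}"


# All posts of one feed live in the same table, so processing a feed
# only ever touches a single, smaller partition.
tables = {feed_id: posts_table_for(feed_id) for feed_id in (1, 8, 9, 4000)}
```

Because the table is derived from the feed ID alone, adding users does not change where anything lives, which matches a system that grows with feeds rather than users.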

After altering a lot of our current code so that it worked with the new database design, we’ve been experiencing problems with MySQL. It seems that, even though the solution makes the overall system much faster (around 3 to 4 times faster), some operations don’t play too well with MySQL and add unacceptable latency to the system.

We’ve been resisting the urge to migrate to a schema-less database but it seems we have no other option but to transition to it. So, even though we thought we could have the new design working by the end of the week, we are afraid we’ll have to postpone it until further notice. We’ll keep you guys updated though!

The Inkzee Team

Step 2: Database redesign

As part of our milestones towards opening up Inkzee we have the database redesign. We currently manage more than 2 million posts and over 4,000 blogs. And although that might not seem like a lot, our database is starting to complain. A lot of the queries we run against it are getting really sluggish.

That means that if we are to open up Inkzee we need to redesign the database so it can sustain a higher load of blogs and posts. We are currently working on it and we’ve made great progress. We have a prototype working with the new design but there are still some bugs and problems to resolve.

We hope the new design will be finished sometime this week. We’ll then fire up our test cases to check that nothing is broken, and once we’re sure the new design is as flawless as we can get it, we’ll release it to you guys! Hopefully you’ll experience a much faster site, not only on a subscription-by-subscription basis but especially when you request all posts from all blogs.

We’ll keep you posted!

The Inkzee Team

New UI redesign

As we said before, since we got back from the Bay Area we’ve been working on getting the site out of alpha status. Our goal is to open it up to everyone as soon as possible. This plan requires 3 steps:

  1. Redesign the UI and make it more user-friendly
  2. Redesign the database architecture and alleviate the current scalability bottlenecks we are experiencing
  3. Move all infrastructure to Amazon Web Services for easy horizontal scaling.

To date we’ve accomplished the first milestone. We did a complete overhaul of the UI for Inkzee. It wasn’t just the design; we also needed to add a lot of help features. One thing we’ve seen is that most users use just a small fraction of the features the site has (and there aren’t too many). Because the current version was in alpha status, there were no help pages whatsoever. That meant that only the most advanced users were able to navigate the site. Our goal is to make it dead simple for everyone, and so we realized we needed a complete help system.

So, off we went and added a lot of help features to the site. For starters, we added a help page where you’ll find a complete (or so we think) guide to Inkzee. In there you’ll, hopefully, find the answers to the most common questions you might have.

Secondly, we changed most of our cryptic icons and turned them into full-blown buttons that show a text label alongside the icon.

We also added a bunch of widgets to the main page, so you will now see the latest news from this blog there. We also added a much more interactive “get started” widget than the old plain text layer we had before.

And last but not least, we added help bubbles all over the interface with tips about the feature at hand.

The first reviews from our alpha users have been very positive, but we would love to get some feedback from everyone. Found a spelling mistake? Couldn’t find help about a specific topic? Still don’t understand what that button or feature does? Feel free to contact us with your suggestions and ideas!

The Inkzee Team

Back from Silicon Valley

Hello everybody!

Sorry for this long silence. It’s funny how easily we preach the value of blogs but don’t apply it to ourselves. Anyhow, we are back with some news. We recently went to Silicon Valley and had some really amazing conversations with a lot of entrepreneurs there. One of the most enlightening conversations was with Cameron Koczon. We talked about a lot of things, but one struck us as really important. We can still remember his words: “Dude, you’ve got everything, the idea, the prototype, the users, what are you waiting for?” And of course he was right.

There is nothing like getting trapped in your own web. We kept pushing back our final release date because we wanted to finish this or that. After getting back from the Valley we realized we just needed to focus on getting Inkzee out of the door and not so much on individual features. So all of our recent efforts have been oriented towards that goal. The first stage we’ve accomplished was a complete UI redesign. Right now we are working on a couple of things we still need before opening Inkzee to the world.

In conclusion, we are still working for you guys and you’ll see some results very soon!

The Inkzee team

Archiving news, what a wonderful thing

Hi all,

Some days ago we rolled out a new release. Apart from fixing a bunch of minor bugs, we implemented an internal improvement. One of the problems we were running into is the huge number of posts the system is tracking (last week it was somewhere around 650,000 posts). The problem is that accessing a single table with that many records in it can be slow. The solution was simple: most of the stored posts are quite old (more than 3 or 4 months old), so we decided to move all those old posts to other tables where they would be accessed much less often. That meant that we offloaded a lot of information from the main post table, keeping the number of records there under control.

As usual, implementing these things isn’t as straightforward as we would like, and we had to implement specific logic so that if a user wanted to read more posts from a blog whose posts had been archived, we would return them to the user in a seamless way. It’s finally working and the site’s performance is considerably better, so we are quite happy.

During the next few weeks we’ll be partitioning some tables so that the site goes even faster. We’ll keep you posted.

Thanks for your patience!

The Inkzee Team

Alerts are here!

Finally we managed to release alert filters to the system! It’s still a little buggy and we have already detected a couple of glitches that we are working on, but all in all it’s pretty decent. As of today you can create an alert with a bunch of keywords and get all posts within your subscriptions that match those keywords. As usual, the posts that are captured by the alert will also display related clustered posts so you can dig further if you want to.
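In its simplest form, a keyword alert is just a filter over incoming posts. The sketch below is only an illustration of that idea, not Inkzee's implementation: the function names are hypothetical, and treating a post as a match when any one keyword appears as a whole word is an assumption.

```python
import re


def matches_alert(post_text, keywords):
    """True if any keyword appears in the post as a whole word, ignoring case."""
    words = set(re.findall(r"\w+", post_text.lower()))
    return any(keyword.lower() in words for keyword in keywords)


def posts_for_alert(posts, keywords):
    """Filter an iterable of (title, body) pairs down to those that match."""
    return [(title, body) for title, body in posts
            if matches_alert(f"{title} {body}", keywords)]


posts = [
    ("Tokyo Cabinet benchmarks", "Some numbers comparing it with MySQL."),
    ("Holiday photos", "Nothing technical here."),
]
hits = posts_for_alert(posts, ["mysql", "postgres"])
```

Here only the first post ends up in `hits`, because it is the only one that mentions one of the keywords.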

Apart from the alerts, we’ve fixed a lot of bugs, updated our jQuery libs and integrated jQuery UI into our interface. Some nasty bugs when importing certain OPML files have been fixed, as well as some problems when unsubscribing from a bunch of feeds.

As usual, feel free to contact us at support at inkzee dot com with any problems or bugs you find. Want to try the alpha? Request an invite here: http://www.inkzee.com.

Thanks a lot for your support!!

The Inkzee Team

Some good press – Thank you!

Just a quick shout-out to our good friends from the Internet division at L’Echo, Sarah Godard and Roland Legrand, whom we met at the LeWeb conference in Paris this year.

Sarah just wrote her first post on her new blog, Nekstr, and described us as having “undeniable perseverance” and “a wild imagination”. We are very excited to hear such kind words, and from here we send our biggest thanks for them! This year we’ll keep it going and will unveil some great new features and, hopefully, some cash.

For all the startups out there, keep it tight but keep it coming, good times ahead!

The Inkzee Team

The not so simple case of folder management

Today I want to tell all our readers a little tale: the case of folder management. Folders were one of the most requested features in our first round of alpha testing. Users wanted to be able to group their feeds into folders. We started development early and suddenly realized that doing it wasn’t as simple as expected.

Technology seems easy to everyone except the tech guys. Features that seem very straightforward become real nightmares to develop. Folder management was one of those. There were 2 steps we needed to take to implement it. The first one was to add all the necessary code to the user interface so that the user could drag and drop feeds into folders and within folders. That part was kind of easy, just some JavaScript code here and there.

The second part wasn’t easy at all. We needed a way to store the positions of all feeds and folders for a given user at any time. The easy way was to keep a list that stores each feed together with its position for that user. The problem was that we needed to be able to reorder that list, which meant changing the rank of every affected feed whenever a feed was moved. Doing that translated into a lot of write operations to that list (and, in turn, to the server’s disk). For those who aren’t tech geeks: write operations to a hard drive are very expensive; they take a long time and a lot of resources. So if you want your servers to keep up with a heavy load of web traffic, you need to minimize the number of write operations.

So we had this problem: if we had to change the ranking of all feeds every time a user reordered one of them, it meant a lot of write operations. The best-case scenario was that the user dropped a feed at the end of the list, so we only needed to update that feed’s rank. The worst-case scenario was the user dropping a feed at the beginning of the list, in which case we needed to update the rank of every other feed that user had. When the list has only 10 feeds that’s OK, but if the list is as long as 800 feeds, that’s a lot of write operations. Finally, once you consider that several users could be dragging and dropping feeds simultaneously, the number of possible write operations in the worst case was alarmingly high.
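The best and worst cases can be put into numbers with a tiny model. It assumes consecutive ranks (1, 2, 3, ...) and a feed being dropped into a list that already holds the others; the function name is made up for this example:

```python
def writes_for_drop(list_length, position):
    """Rank updates needed when a feed is dropped at `position` (0-based) into
    a list that already holds `list_length` feeds with consecutive ranks.

    The dropped feed gets a rank, and every feed at or after `position`
    has to be shifted down by one.
    """
    if not 0 <= position <= list_length:
        raise ValueError("position out of range")
    return 1 + (list_length - position)


best = writes_for_drop(800, 800)   # dropped at the end: only the new rank
worst = writes_for_drop(800, 0)    # dropped at the beginning: everyone shifts
```

In this model a drop at the end costs 1 write while a drop at the front of an 800-feed list costs 801, and that is per user, per drag.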

To solve the problem we partitioned the rank space so that consecutive ranks had a gap of x numbers between them. That allowed us to minimize the number of ranks we needed to update at any given moment. It also added an extra layer of difficulty, as we needed to track the number of free slots available before we had to reorder the whole feed list.
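A minimal sketch of gapped ranks, under assumptions of my own: the gap size, the function names and the midpoint rule are illustrative, and the post does not describe the real bookkeeping of free slots. A moved feed takes a rank halfway between its new neighbours, and the list is respaced only when no free rank is left between them.

```python
GAP = 100  # hypothetical gap size (must be at least 2); the post only calls it "x"


def initial_ranks(feed_ids, gap=GAP):
    """Assign ranks gap, 2*gap, 3*gap, ... in list order."""
    return {feed_id: (i + 1) * gap for i, feed_id in enumerate(feed_ids)}


def move_between(ranks, feed_id, before_id, after_id, gap=GAP):
    """Place `feed_id` between two neighbours and return the number of writes.

    `before_id` / `after_id` may be None when dropping at either end.
    """
    low = ranks[before_id] if before_id is not None else 0
    high = ranks[after_id] if after_id is not None else low + 2 * gap
    if high - low > 1:
        # A free slot exists: only the moved feed is written.
        ranks[feed_id] = (low + high) // 2
        return 1
    # No free slot left: respace every other feed, then retry the move.
    order = sorted((f for f in ranks if f != feed_id), key=ranks.get)
    ranks.clear()
    ranks.update(initial_ranks(order, gap))
    return len(order) + move_between(ranks, feed_id, before_id, after_id, gap)


ranks = initial_ranks(["a", "b", "c", "d"])
writes = move_between(ranks, "d", None, "a")  # drag "d" to the front
order = sorted(ranks, key=ranks.get)
```

Dragging "d" to the front costs a single write here, where consecutive ranks would have forced an update of every feed in the list. The expensive respacing still exists, but it only happens once a gap has been used up.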

At the same time we needed to track whether a feed was being dropped before or after another feed, before or after a folder, or inside a folder. So we had to check the type of operation, where it happened, which folder was involved and whether we had free slots there.

When we finally had all this stuff right we had to translate this internal structure into the export/import mechanism that produces or reads OPML files (XML files). We needed to be able to read an OPML file with folders and translate it into our internal representation and ranking list so it would be displayed correctly to the users.
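As an illustration of the import side, here is a sketch that reads a small OPML document with Python's standard XML parser and flattens it into ranked rows. The row layout, the gap of 100 and the sample feeds are assumptions for the example, and only one level of folders is handled:

```python
import xml.etree.ElementTree as ET

OPML = """<opml version="1.0">
  <head><title>Subscriptions</title></head>
  <body>
    <outline text="Tech">
      <outline type="rss" text="Blog A" xmlUrl="http://example.com/a.xml"/>
      <outline type="rss" text="Blog B" xmlUrl="http://example.com/b.xml"/>
    </outline>
    <outline type="rss" text="Blog C" xmlUrl="http://example.com/c.xml"/>
  </body>
</opml>"""


def parse_opml(text, gap=100):
    """Flatten an OPML document into (rank, folder, title, url) rows.

    An <outline> with an xmlUrl attribute is a feed; one without is treated
    as a folder whose direct children are its feeds.
    """
    rows = []
    body = ET.fromstring(text).find("body")
    for node in body.findall("outline"):
        if node.get("xmlUrl"):
            feeds, folder = [node], None
        else:
            feeds = [c for c in node.findall("outline") if c.get("xmlUrl")]
            folder = node.get("text")
        for feed in feeds:
            # Ranks follow document order, spaced out by `gap`.
            rank = (len(rows) + 1) * gap
            rows.append((rank, folder, feed.get("text"), feed.get("xmlUrl")))
    return rows


rows = parse_opml(OPML)
```

Each row keeps the folder name next to the feed, so the display order and the folder grouping can both be rebuilt from the flat list.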

All in all, it was fun but very hard work for something that initially seemed so trivial to accomplish. The system is working now and, apart from occasional glitches, it should hold up. We still want a better UI for the drag & drop thing, as it still seems a little clumsy to us, but we’ll get there.

So now you know that even the simplest feature requires hard work, so choose wisely what you want us to develop next ;)

God bless the private invites

Hi all,

After some final tweaks, we managed to push another release today and finished migrating all users from the old alpha architecture. We were so excited that we decided it was time to send some more invites to users who have been waiting for one for ages! (We really regret the wait, sorry guys!)

Suddenly what we can only define as a shit storm came our way. Servers were on fire, even though we didn’t send too many invites. The problem is that our system scales with the number of feeds each user has. Some of our new alpha users are real feed junkies (you know who you are! :P ) and pushed up the total number of feeds the system manages really fast. Here are some stats to give you an idea:

As you can see, the number of feeds being tracked by Inkzee internally nearly tripled in a 20-minute span. We are storing, displaying and sorting 32,859 posts right now and growing fast.

In the process of managing such a large number of new feeds, some parts of the backend went belly up. We are investigating now and hope to fix them soon. Even though some things broke, they weren’t critical parts, so the system recovered from the failures pretty fast, which is good news.

The moment we stabilize the system again we’ll continue sending new invites. Right now we are also focusing on adding new features. The next one on our roadmap will be the ability to create custom filters and keyword alerts. We hope to release that some time next week.

As usual, thanks a lot for your patience and keep coming, but gently ;)

The Inkzee Team
