Posts Tagged python
Efficient communication protocols are critical
As we said before, one of the problems we had when migrating to AWS was that the backend system was putting a lot of stress on the server. After a week of benchmarking we realized that the protocol we were using to communicate between our different subsystems was the main cause of the increased load.
From day one we didn’t want to use complex and cryptic protocols, so we chose XML-RPC for our communication channel. It was easy to implement, had wide support in PHP and Python, and was very easy to debug. We knew that at some point we would need to switch to a more efficient protocol, but we didn’t know it was going to be so soon.
After some extensive benchmarking we realized that the throughput of the protocol was very low. Not only that: if too many XML-RPC connections were spawned, they would eventually consume all the resources of the process (file descriptors, sockets and memory). This was a painful lesson to learn, but we learned it. So we switched to the most efficient kind of protocol we could find, that is, a binary one. To be more precise, we now use Python’s cPickle binary protocol. Saying that it is orders of magnitude more efficient does not even come close.
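To give a feel for the difference, here is a minimal sketch that serializes the same payload with XML-RPC marshalling and with the pickle binary protocol, and compares the sizes. It is written for Python 3, where cPickle has been folded into `pickle` and xmlrpclib lives at `xmlrpc.client`. The payload and the function name `compare_encodings` are made up for illustration and are not our actual messages.

```python
import pickle
import xmlrpc.client


def compare_encodings(payload):
    """Serialize payload both ways and return (xml_bytes, pickle_bytes)."""
    # XML-RPC marshals parameters as a tuple and produces a text document.
    xml_bytes = xmlrpc.client.dumps((payload,)).encode("utf-8")
    # The binary pickle protocol; HIGHEST_PROTOCOL picks the newest one available.
    pickle_bytes = pickle.dumps(payload, protocol=pickle.HIGHEST_PROTOCOL)
    return xml_bytes, pickle_bytes


# Hypothetical message: a batch of feed item ids with a little metadata.
payload = {"feed": "example", "item_ids": list(range(1000)), "unread": True}

xml_bytes, pickle_bytes = compare_encodings(payload)
print("xmlrpc bytes:", len(xml_bytes))
print("pickle bytes:", len(pickle_bytes))

# Both encodings round-trip to the same data.
(decoded_from_xml,), _method = xmlrpc.client.loads(xml_bytes.decode("utf-8"))
decoded_from_pickle = pickle.loads(pickle_bytes)
```

Size is only part of the story, since parsing XML costs CPU as well. Also keep in mind that pickle is only suitable between processes you trust, because unpickling data can execute arbitrary code.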
After switching each subsystem to the new protocol we saw the load on the machines go down. As with all big changes in the backend of any system, it took a while to stabilize. To avoid any havoc we put it into production subsystem by subsystem, so for some time we had both protocols running side by side.
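One way to let two wire formats coexist during a migration like this is to have the receiver look at the first byte of each message. This is only a sketch of the idea, not a description of how our subsystems are wired. The helper `decode_message` and the sample message are hypothetical, and it assumes Python 3 with pickle protocol 2 or later.

```python
import pickle
import xmlrpc.client


def decode_message(raw):
    """Decode a message that may be XML-RPC or binary pickle (hypothetical helper)."""
    if raw.lstrip()[:1] == b"<":
        # XML-RPC payloads are XML text, so they start with '<'.
        params, _method = xmlrpc.client.loads(raw.decode("utf-8"))
        return "xmlrpc", params[0]
    if raw[:1] == b"\x80":
        # Pickle protocol 2 and later start with the PROTO opcode, byte 0x80.
        return "pickle", pickle.loads(raw)
    raise ValueError("unknown wire format")


message = {"job": "refresh", "feed_id": 42}
old_style = xmlrpc.client.dumps((message,)).encode("utf-8")
new_style = pickle.dumps(message, protocol=2)

print(decode_message(old_style))
print(decode_message(new_style))
```

With something like this in place, senders can be moved to the new format one at a time while the receiver keeps accepting both.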
So always remember: the choices you make will come back to haunt you if you don’t make them carefully.
Big test: Scoble’s OPML file
First of all, hello from San Francisco! I’m currently spending a couple of days in SF pitching Inkzee, so I hope many of the people reading this blog will eventually become Inkzee users. Anyhow, last night I saw that Robert Scoble had released his OPML file with all his blog subscriptions. I thought it would be a nice test for the Inkzee alpha to go and import that huge list of blogs (698 subscriptions).
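For anyone who has never looked inside one, an OPML file is just XML in which each subscription is an `outline` element carrying the feed address in its `xmlUrl` attribute. Here is a minimal sketch of pulling the feed URLs out with the standard library. The sample document and the function name `feed_urls_from_opml` are made up for illustration, and this is not the Inkzee importer.

```python
import xml.etree.ElementTree as ET


def feed_urls_from_opml(opml_text):
    """Return the xmlUrl of every outline element, in document order."""
    root = ET.fromstring(opml_text)
    # iter() walks nested outlines too, so folders of feeds are handled.
    return [
        outline.attrib["xmlUrl"]
        for outline in root.iter("outline")
        if "xmlUrl" in outline.attrib
    ]


# Tiny made-up OPML document; the real file has hundreds of entries.
sample = """<opml version="1.1">
  <head><title>Subscriptions</title></head>
  <body>
    <outline text="Tech">
      <outline text="Example A" type="rss" xmlUrl="http://example.com/a.xml"/>
      <outline text="Example B" type="rss" xmlUrl="http://example.com/b.xml"/>
    </outline>
    <outline text="Example C" type="rss" xmlUrl="http://example.org/c.xml"/>
  </body>
</opml>"""

urls = feed_urls_from_opml(sample)
print(len(urls), "subscriptions found")
```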
As expected, the algorithm started choking, so I’ve been working on a new multithreaded version of the Inkzee engine. I finished it a couple of days ago but I’m still doing some tests before replacing the current core. Some parts of the core have been rewritten in Python for better speed, so now it’s time to test it to avoid any further problems. Hopefully the new version will be online at the end of the week. At the same time I’m working on some new features for the AI engine. Those will probably be online during the weekend.
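To illustrate the general idea of going multithreaded for this kind of workload, here is a small sketch that processes a batch of feeds with a thread pool. It is not the Inkzee engine: `fetch_feed` is a stand-in that does no network access, and the URLs, function names and worker count are all assumptions chosen for the example.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def fetch_feed(url):
    """Stand-in for a real download; returns (url, number_of_items)."""
    # A real engine would do network I/O here, which is where threads help.
    return url, len(url)


def process_feeds(urls, max_workers=8):
    """Fetch all feeds concurrently and collect results keyed by URL."""
    results = {}
    errors = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(fetch_feed, url): url for url in urls}
        for future in as_completed(futures):
            url = futures[future]
            try:
                _, item_count = future.result()
                results[url] = item_count
            except Exception as exc:  # one bad feed should not stop the rest
                errors[url] = exc
    return results, errors


urls = ["http://example.com/feed%d.xml" % i for i in range(698)]
results, errors = process_feeds(urls)
print(len(results), "feeds processed,", len(errors), "failed")
```

Threads pay off here because fetching feeds is mostly waiting on the network, so many downloads can overlap even though only one thread runs Python code at a time.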
Take care everybody!
