Sunday, July 8, 2012

redis vs rabbit with django celery

If you're planning on putting a django celery app into heavy production use, your message queue matters. At SimpleRelevance we used RabbitMQ for months, since it's probably the most famous of the available options.
It took about 8 seconds to restart our celery workers, but that was fine. In general it was fairly reliable, and the way it routed tasks to different celery workers on different boxes often felt like magic.

However, getting results of a task back using rabbit as a result backend was a different story - it often took minutes, even hanging the box it was on. And these weren't big results either.

So for the record here, we switched to Redis. Not only is restarting about 3X faster, but as a results backend it also wins - no more hanging, and results come back as soon as they're ready. My sysops also tells me it was much easier to install and configure.

Actually, it turns out Redis starts to perform very badly when faced with a deep queue in our production environment. So the optimal setup for us turns out to be RabbitMQ for the queue handling, and Redis for the result backend.
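
For anyone wanting to try the same split, here is a minimal sketch of what the Django settings might look like. This is not our actual config: the hosts, ports, credentials and Redis database number are placeholders, and the setting names are the old-style uppercase ones, which differ between Celery versions, so check the docs for the version you run.

```python
# settings.py fragment (sketch only, all values are placeholders)

# RabbitMQ handles the task queue (the broker).
BROKER_URL = "amqp://guest:guest@localhost:5672//"

# Redis stores task results (the result backend).
CELERY_RESULT_BACKEND = "redis://localhost:6379/0"
```

The point is just that the broker and the result backend are configured separately, so nothing forces you to use the same service for both.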

Wednesday, July 4, 2012

"Error in service module" Ubuntu pam login fail

This was a strange one. One of the newer computers at the lab where I moonlight, running Ubuntu 11.04 (yes, I probably should upgrade to the next LTS, I guess), suddenly stopped logging in. This was preceded by a pink screen and a bunch of errors about the hard drive (so they tell me; I wasn't there. Oh, the lives of non-IT folk - like living underwater).
So now, whenever they logged in through the GUI, nothing happened - click the user, type the password, straight back to the login screen.
So I hit Ctrl+Alt+F1 to get to TTY1 and logged in, and the only error I got was:
"error in service module". Not helpful.
Googling that was helpful in that it started to point the blame at PAM, or Pluggable Authentication Modules, which I now know way too much about.

But anyway, I ended up booting into recovery mode, which got me past the login, and then overwriting /etc/init/tty1.conf to skip login even when not in recovery mode, following this advice. That allowed me to access the internet and external drives on the machine.

Then I realized that Ubuntu logs everything, and I checked /var/log/auth.log, which mentioned a bunch of missing files in /etc/pam.d/ and other fun places. All such fun directories were empty. Very strange. So I copied over all of the files from another Ubuntu machine into said directories, including a bunch of .so files, and what do you know, login worked. That simple.
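
If the log is long, a few lines of Python can pull out just the PAM-related entries. This is only a sketch of the idea, not what I ran at the time: the function names are made up, and the filter is a plain case-insensitive substring match on "pam", so it will also catch unrelated lines that happen to contain those letters.

```python
def pam_lines(log_lines):
    """Return the log lines that mention PAM (case-insensitive match)."""
    return [line.rstrip("\n") for line in log_lines if "pam" in line.lower()]


def read_auth_log(path="/var/log/auth.log"):
    """Read the auth log at `path` and return only its PAM-related lines."""
    # errors="replace" keeps odd bytes in the log from raising an exception.
    with open(path, errors="replace") as log_file:
        return pam_lines(log_file)
```

Calling read_auth_log() typically needs root (or membership in a group allowed to read the file), since auth.log is usually not world-readable.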

Moral of the story - read the logs when something goes wrong that you don't understand. Ubuntu writes a lot of logs, and there's a reason for that. I know I made it seem easy here, but it probably took 2-3 hours to do all of the above. facepalms: 4