Tuesday, November 20, 2012

how to not run your business

[screenshot: the error message in question]


tell me - if you're going to pay for hosting for a corporate wordpress blog, what's the chance you're not going to use your corporate email account and what's the chance you're not going to host the blog at your corporate url? in other words, for what business would this error message not come up?

None. Facepalms: 5 (horrifically hilarious, luckily not a problem I have to deal with today.)

Wednesday, October 10, 2012

magento and the dreaded WebFault: Product not exists

Sometimes, using SOAP with the Magento V2 API, you may search for a product's images (by calling catalog_product_attribute_media.list with the SKU as the only argument) and receive this hilarious message:

"Product not exists." <-- you can't make this stuff up!

Hilarious but sad, because there's the image right there, laughing at you from the website, and yet your API doesn't have access.

luckily there's an easy fix. in short: append a space to the end of the SKU. done. psha. magento, we love to hate you.
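for the record, here's a rough sketch of the workaround in python with suds (the store url, credentials, and sku are placeholders; catalogProductAttributeMediaList is the V2 SOAP spelling of the call above - double-check it against your API):

from suds.client import Client

client = Client('https://example.com/api/v2_soap/?wsdl')  # your store here
session = client.service.login('api_user', 'api_key')

sku = '12345'
# the fix: append a space to the end of the sku
images = client.service.catalogProductAttributeMediaList(session, sku + ' ')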

facepalms: 5

Monday, October 8, 2012

celery + djcelery problem with virtualenv and virtualenvwrapper



this is a tricky one - out of nowhere, production env boxes started failing at celery startup with:



ImportError: cannot import name current_app

when importing djcelery. The installed versions of celery were fine. It turns out that if you open the env's python and try the import yourself (from celery import current_app), you'll see the real problem: the virtualenv's copied python binary is out of sync with the new system python from a recent security update - specifically, os.urandom has been changed out from under it.
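you can confirm the mismatch from the broken env itself (a quick diagnostic - run the env's own python binary; the exact error text may vary):

import random   # on a broken env this blows up with something like
                # "ImportError: cannot import name urandom"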

if you have virtualenvwrapper, let ENV=YOUR_ENVIRONMENT_NAME. So the answer is:

deactivate            # in case an env is currently active
cd ~/.virtualenvs     # or wherever your envs live
rm $ENV/bin/python
virtualenv $ENV


this will rebuild your python binary against the post-security-fix python, without losing any other packages. happy hacking!

Sunday, July 8, 2012

redis vs rabbit with django celery

if you're planning on putting a django celery app into heavy production use, your message queue matters. At SimpleRelevance we used RabbitMQ for months, since it's probably the most famous of the available options.
It took about 8 seconds to restart our celery workers, but that was fine. In general it was fairly reliable, and the way it routed tasks to different celery workers on different boxes often felt like magic.

However, getting results of a task back using rabbit as a result backend was a different story - it often took minutes, even hanging the box it was on. And these weren't big results either.

So for the record: we switched to Redis. Not only is restarting about 3x faster, but as a result backend it also wins - no more hanging, and results come back as soon as they're ready. Our sysops guy also tells me it was much easier to install and configure.
boom.

----
update!
actually, it turns out redis starts to perform very badly when faced with a deep queue in our production environment. So the optimal setup for us turned out to be RabbitMQ for queue handling and Redis for the result backend.
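for reference, the django-celery settings for that split look roughly like this (hostnames and credentials are placeholders for your own):

import djcelery
djcelery.setup_loader()

BROKER_HOST = 'rabbit.internal'      # RabbitMQ handles the queue
BROKER_PORT = 5672
BROKER_USER = 'myuser'
BROKER_PASSWORD = 'mypassword'
BROKER_VHOST = '/'

CELERY_RESULT_BACKEND = 'redis'      # redis handles the results
CELERY_REDIS_HOST = 'redis.internal'
CELERY_REDIS_PORT = 6379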

Wednesday, July 4, 2012

"Error in service module" Ubuntu pam login fail

This was a strange one. One of the newer computers at the lab where I moonlight, running Ubuntu 11.04 (yes, I probably should upgrade it to the next LTS, I guess), suddenly stopped logging anyone in. This was preceded by a pink screen and a bunch of errors about the hard drive (so they tell me; I wasn't there. oh, the lives of non-IT folk - like living underwater).
So now, whenever they logged in through the GUI, nothing happened - click the user, type the password, straight back to the login screen.
So I Ctrl+Alt+F1'd to TTY1 and logged in, and the only error I got was:
"error in service module". Not helpful.
Googling that was helpful in that it started to point the blame at PAM, or Pluggable Authentication Modules, which I now know way too much about.

But anyway, I ended up booting into recovery mode, which got me past the login, and then overwrote /etc/init/tty1.conf to skip login even when not in recovery mode, following this advice. That let me get at the internet and the external drives on the machine.

Then I realized that ubuntu logs everything, and I checked /var/log/auth.log, which mentioned a bunch of missing files in /etc/pam.d/ and other fun places. All such fun directories were empty. very strange. So I copied over all of the files from another ubuntu machine into said directories, including a bunch of .so files, and what do you know, login worked. That simple.

Moral of the story - read the logs when something goes wrong that you don't understand. Ubuntu writes a lot of logs, and there's a reason for that. I know I made it seem easy here but it probably took 2-3 hours to do all of the above. facepalms: 4

Thursday, May 31, 2012

another python suds tip

If you are having trouble creating nested XML for array objects like this:
<configurations>
 <configuration>
   stuff
 </configuration>
</configurations>

while using suds' factory methods, try building the structure from scratch instead: pass the appropriate service method a list of dictionaries (or whatever nested structure applies):

[{'Configuration':obj},{'Configuration':obj}]
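for example (the endpoint and method here are hypothetical stand-ins):

from suds.client import Client

client = Client('https://example.com/service?wsdl')  # hypothetical endpoint

# skip client.factory.create('Configuration') entirely; hand the method
# plain dicts and let suds serialize the nesting itself
configurations = [
    {'Configuration': {'name': 'first', 'value': 1}},
    {'Configuration': {'name': 'second', 'value': 2}},
]
client.service.UpdateConfigurations(configurations)  # hypothetical method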

facepalms: a million

suds empty tag issue

I just posted a really nice Stack Overflow solution about this:
http://stackoverflow.com/questions/9388180/suds-generates-empty-elements-how-to-remove-them
The general point is that although Suds is an awesome python library that lets you connect to SOAP services (I mean, really? welcome to the 21st century, people) with relative ease, it has the bad habit of adding empty tags for optional properties of objects. This tends to confuse (poorly written) API endpoints.
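the gist of the fix from that answer is a MessagePlugin that prunes empty elements just before the envelope is sent - something like this (a sketch; Element.prune() drops empty leaf nodes from the tree):

from suds.client import Client
from suds.plugin import MessagePlugin

class PruneEmptyTags(MessagePlugin):
    def marshalled(self, context):
        # context.envelope is the outgoing XML tree
        context.envelope.prune()

client = Client('https://example.com/service?wsdl',  # hypothetical endpoint
                plugins=[PruneEmptyTags()])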

facepalms: 6

Friday, March 30, 2012

django celery with remote worker nodes

I set up rabbitmq and celery. Plenty of good tutorials online about how to do that. Then I wanted a bunch of different linode boxen all running the same django project, with the following setup:

1 server running mysql and nothing else
1 server running nginx serving http requests and routing tasks to rabbitmq / celery
1 server running rabbitmq and celery and django
N boxes running django and celery

Turns out, it's easy!
  • All of the above hook into the mysql server by setting HOST and PORT in the DATABASES section of the django settings.
  • Each slave celery box uses an environment variable to take care of any individual settings it might need, but in general each of them uses django-celery's BROKER_HOST and BROKER_PORT options to connect to the rabbitmq server.
  • using fabric makes deploying code to all of them fairly simple
Believe it or not, rabbitmq effortlessly figures out who's got a free worker between all of your boxes and just does it.
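concretely, each box's settings look something like this (hostnames are placeholders, and NODE_ROLE is just an illustrative name for the per-box environment variable):

import os

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.mysql',
        'NAME': 'mydb',
        'USER': 'myuser',
        'PASSWORD': 'mypassword',
        'HOST': 'db.internal',   # the box running mysql and nothing else
        'PORT': '3306',
    }
}

# every box points at the one rabbitmq server
BROKER_HOST = 'rabbit.internal'
BROKER_PORT = 5672

# per-box individuality comes in through an environment variable
NODE_ROLE = os.environ.get('NODE_ROLE', 'worker')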

Wednesday, February 22, 2012

app engine and appcfg import errors


If you get an error that looks like the one below, be sure you are importing the correct version of django
(which is to say, if you are using the builtin version, be sure you are not also importing another version of django from your site-packages). Appcfg automatically follows all symlinks and packages up everything on your pythonpath, and the import order in production is different (actually reversed!) from that in development.

I solved this problem by deactivating my virtualenv whenever I deploy.
facepalms: 8 (3 hours of nonsense for a 1-line fix!)

Traceback (most recent call last):
  File "/base/python27_runtime/python27_lib/versions/1/google/appengine/runtime/wsgi.py", line 193, in Handle
    result = handler(self._environ, self._StartResponse)
  File "/base/python27_runtime/python27_lib/versions/third_party/django-1.2/django/core/handlers/wsgi.py", line 232, in __call__
    self.load_middleware()
  File "/base/python27_runtime/python27_lib/versions/third_party/django-1.2/django/core/handlers/base.py", line 40, in load_middleware
    mod = import_module(mw_module)
  File "/base/python27_runtime/python27_lib/versions/third_party/django-1.2/django/utils/importlib.py", line 35, in import_module
    __import__(name)
  File "/base/python27_runtime/python27_lib/versions/third_party/django-1.2/django/middleware/transaction.py", line 1, in <module>
    from django.db import transaction
  File "/base/python27_runtime/python27_lib/versions/third_party/django-1.2/django/db/__init__.py", line 77, in <module>
    connection = connections[DEFAULT_DB_ALIAS]
  File "/base/python27_runtime/python27_lib/versions/third_party/django-1.2/django/db/utils.py", line 92, in __getitem__
    backend = load_backend(db['ENGINE'])
  File "/base/python27_runtime/python27_lib/versions/third_party/django-1.2/django/db/utils.py", line 50, in load_backend
    raise ImproperlyConfigured(error_msg)
ImproperlyConfigured: 'google.appengine.ext.django.backends.rdbms' isn't an available database backend. Try using django.db.backends.XXX, where XXX is one of: 'dummy', 'mysql', 'oracle', 'postgresql', 'postgresql_psycopg2', 'sqlite3' Error was: cannot import name backends

app engine and datastore keys

say you know the key name (a string) of a Model instance and you want to fetch it from the datastore. It goes like this:

from google.appengine.ext import db

instance = Model.get(db.Key.from_path("Model", key_name))

why so complicated? why can't you just Model.get(key_name)? Because Model.get() expects a full Key object (or its encoded string), not a bare key name - pass the name alone and you get inexplicable errors.

facepalms: 4

Tuesday, February 14, 2012

Porting Postgres to google cloud SQL

I know it's quite a peculiar problem - you have a postgres database, and suddenly you're migrating to Google App Engine and you want to use the new Google Cloud SQL (look it up, it's public and it's pretty cool).

Anyway, you need to port the old data. Google cloud sql is cool except for one big problem - sql dump imports have zero error reporting. If an import fails, the console just turns red and tells you that an unknown error occurred. So here's how I made it work:

  1. dump the postgres data, selecting the tables you want, with a couple of extra options on (note: you need psql privileges already at this point):
     pg_dump --column-inserts --data-only POSTGRES_DATABASE_NAME -t TABLE_NAME -t ANOTHER_TABLE_NAME -f NEW_FILE_NAME.sql
  2. delete the top lines of the dump file created in 1), up to the first "INSERT" line.
  3. load it into mysql locally, where you can catch any errors:
     mysql -u USER -p DATABASE_NAME < NEW_FILE_NAME.sql
  4. dump it back out of the local mysql:
     mysqldump -u USERNAME -p --add-drop-table MYSQL_DATABASE_NAME TABLE_NAME ANOTHER_TABLE_NAME > FIXED_SQL_DUMP_FILE.sql
  5. add "use DATABASE_NAME;" as the first line of the new dump file (no quotes; DATABASE_NAME is the database you want the data loaded into on google).
  6. Now you can load the new file into a google cloud storage bucket using their web browser gui, and from there import it into cloud sql.
  7. pray, as you wait for the stupid thing with no error reporting to turn green.
facepalms: 7




Thursday, February 9, 2012

app engine and django mail

lo and behold, local email (as in, sending email to the console) in django works with the app engine development server, but not with the app engine development backend / taskqueue server. There it fails with an AttributeError: 'module' object has no attribute 'getfqdn'.

Wednesday, February 8, 2012

django content types and south

If you are missing content types in your django.contrib.contenttypes.models.ContentType table, and you are using south on multiple databases, it's likely that you haven't synced AND MIGRATED all of your apps (even the ones you don't use) on your master database.

facepalms: 1

Wednesday, January 25, 2012

tricks for constantcontact oauth2

Just so you know, you have to specify a redirect url for both legs of OAuth2 integration with ConstantContact. And it has to be the exact same url, even though the second (POST) leg probably won't end up redirecting anywhere.

Not only that, it has to exactly match the oauth_callback setting for the api key you're using. Where do you set that? Sign in to the community portal at constant contact, click the "api keys" tab, and then click on the api key itself - it's a link that takes you to its management page. With zero documentation on this, it took me about 20 minutes to figure out, so hopefully this helps somebody else.

Oh - and if you're still getting uri-mismatch errors during this process? remember that the callback has to be https, for real - you need an ssl cert.
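to make it concrete, here's a rough sketch of both legs sharing one redirect url (the endpoint urls and parameter names are from memory - verify them against constantcontact's docs before leaning on this):

import json
import urllib
import requests

REDIRECT_URI = 'https://example.com/oauth/callback'  # https, identical to your key's oauth_callback

# leg 1: send the user off to authorize
authorize_url = ('https://oauth2.constantcontact.com/oauth2/oauth/siteowner/authorize?'
                 + urllib.urlencode({'response_type': 'code',
                                     'client_id': 'YOUR_API_KEY',
                                     'redirect_uri': REDIRECT_URI}))

# leg 2: POST the code back - same redirect_uri again, even though
# nothing actually gets redirected this time
resp = requests.post('https://oauth2.constantcontact.com/oauth2/oauth/token',
                     data={'grant_type': 'authorization_code',
                           'client_id': 'YOUR_API_KEY',
                           'client_secret': 'YOUR_CONSUMER_SECRET',
                           'code': 'CODE_FROM_LEG_1',
                           'redirect_uri': REDIRECT_URI})
access_token = json.loads(resp.text)['access_token']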

facepalms: 4

Wednesday, January 18, 2012

django-celery with virtualenv

this page has some great tips for daemonizing celery for django. They even have a section on running django under a virtualenv.

But just remember: if your django settings rely on environment variables that you export before running django, you should export them in your daemon script too. The django-celery docs don't know about your environment variables (I'm talking about you, export ENV='pro'), so it's up to you to stick 'em in there, as sketched below.
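the pattern I mean looks like this in settings (ENV is our own convention, not a django built-in) - if the daemon script doesn't export it, the wrong branch loads silently:

import os

ENV = os.environ.get('ENV', 'dev')  # defaults to dev if nobody exported it

if ENV == 'pro':
    DEBUG = False
    # production-only database/broker settings go here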

Furthermore, consider restarting the celery daemon whenever you deploy. I hear supervisord makes this easy; I'm going to do that next.

facepalms: 2

email attachments with django postmark

A quick fix - there are some pull requests for this on bitbucket, so feel free to look for them, but you can also do it yourself.
in postmark/backends.py, attachments are expected to be in dictionary form:

{'Name':XXXX,'Content':XXXX,'ContentType':XXXX}

however, django core's email machinery passes attachments around as (filename, content, mimetype) tuples. Make your own patch if you want; it's pretty straightforward. Postmark's error messages are fairly useless.
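the conversion itself is tiny - a sketch (assuming, as I recall, that postmark wants the content base64-encoded):

import base64

def to_postmark_attachment(attachment):
    # django hands attachments around as (filename, content, mimetype);
    # postmark/backends.py wants the dict shape shown above
    filename, content, mimetype = attachment
    return {'Name': filename,
            'Content': base64.b64encode(content),
            'ContentType': mimetype}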

facepalms: 3

Monday, January 9, 2012

a quick gripe

the worst thing about the official google reader app for android (besides the annoying new swipe-based UI) is that there's no way to see alt text on images (think XKCD).

Thursday, January 5, 2012

google gdata oauth

there's a great big honkin awesome library for python called gdata - it lets you interface with a ton of Google's APIs, including spreadsheets and docs. We use it internally at SimpleRelevance to write all sorts of logging directly to shared spreadsheets - a big process will finish and have reams of data to share, but rather than parse out a flatfile every time, I wrote a wrapper that takes the data and spits it into a beautifully formatted google spreadsheet. From there, charts and such are easy.

We also now use it externally - clients can log in and authenticate through oauth with google, and then we can write their predictions to a spreadsheet in their own account.

You probably know about oauth - it's nice because the client never has to supply us with any login credentials - the whole thing is very secure. Unfortunately, it was a little painful to set up. Like, 1 hour of productive work and 2 hours of fighting with stupid. Why?

I'll tell you why.
  1. There are a lot of outdated tutorials.
  2. The gdata plugin, while awesome, has tons of legacy code and 2 completely different and mostly working ways of doing everything.
  3. There are tutorials for the old path, tutorials for the new, and tutorials that mix the two.
  4. This page has the most beautiful, well-written, cogent, perfect, comprehensive example of how to set up oauth with gdata. Unfortunately it gets confusing in a crucial bit at the end.
Still, it's great - check it out. It has examples every step of the way in 4 different languages, for both gdata paths. That's 8 examples at every step (and oauth is a 3-step process, so that's around 24 pieces of code).

It's actually spot on all the way through to the end. The confusing part comes when you have to exchange your oauth token for the longterm access token - the thing that actually authenticates and lets you access stuff.
The tutorial has this line:
access_token = client.UpgradeToOAuthAccessToken()  # calls SetOAuthToken() for you

But I had trouble. Frankly, I messed up. But I couldn't get it to work until I did this:
client.UpgradeToOAuthAccessToken()
access_token = client.token_store.find_token(oauth_token.scopes[0])


It turns out that the first line actually does return the access_token, so the google tutorial is correct. This was fixed some time in the last something or other; it used to be that you had to use that second line to get it, and older tutorials don't reflect the change. I was getting tired by then. I missed the boat. Cue frustration.

Then, my second point of confusion: the request token key and secret (leg 1) are the same as the oauth token key and secret (leg 2), but the access token key and secret are totally different and new. In retrospect this makes perfect sense (from a security POV), but at the time I was baffled. Don't try to upgrade your oauth token and then save its key and secret to authenticate with. It won't work.

The google tutorial tells you to save it but leaves it to you to figure out how to use the access token object. Really, it's easy:
save access_token.key
save access_token.secret
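e.g., stashing them in a django view (the key names are whatever you like):

request.session['gdata_token_key'] = access_token.key
request.session['gdata_token_secret'] = access_token.secret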

You can use the session, you can use a database, you can use a session stored in your database, whatever you want! I think the access token lasts for a while.

Anyway, oauth is hard, but I'm really getting the hang of it. Feel free to email me if you need help with this one.
facepalms: 4.5.