Showing posts with label django.

Tuesday, June 11, 2013

ipython and the django shell: strange scoping errors

this is a minor issue that has annoying repercussions. on most versions of django, if you use ipython and start it with ./manage.py shell, you cannot define global variables and then use them inside functions you define in the same session. it gets ugly quick. more info here:
https://github.com/ipython/ipython/issues/62
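a minimal repro from inside the broken shell (a sketch - the exact error text varies by version):

# inside ./manage.py shell, under an affected ipython + django combo
x = 1

def f():
    return x   # NameError: global name 'x' is not defined

f()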

patch is here:
https://github.com/django/django/pull/512/files

and it definitely fixes the issue.

Monday, October 8, 2012

celery + djcelery problem with virtualenv and virtualenvwrapper



this is a tricky one - out of nowhere, production env boxes started failing at celery startup with:

ImportError: cannot import name current_app

when importing djcelery. The installed versions of celery were fine. It turns out that if you run "from celery import current_app" yourself in a python shell, you'll see the real problem: the virtualenv's python binary is out of sync with the new system python from a recent security update - specifically, os.urandom has been changed/removed.
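to see it for yourself from inside the broken virtualenv (a minimal sketch - the exact traceback depends on your python and celery versions):

# run this with the virtualenv's python binary, not the system one
import os
print os.urandom(8)   # on an affected box, this low-level call is what breaks

from celery import current_app   # and this is where djcelery trips over it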

if you have virtualenvwrapper, let $ENV be your environment's name and $ENV_HOME be the directory your environments live in. The answer is:


deactivate (in case an env is running)
cd ~/$ENV_HOME (.virtualenvs, for me)
rm $ENV/bin/python
virtualenv $ENV


this will rebuild your python binary to match the post-security-fix python, without losing any other installed packages. happy hacking!

Sunday, July 8, 2012

redis vs rabbit with django celery

if you're planning on putting a django celery app into heavy production use, your message queue matters. At SimpleRelevance we used RabbitMQ for months, since it's probably the most famous of the available options.
It took about 8 seconds to restart our celery workers, but that was fine. In general it was fairly reliable, and the way it routed tasks to different celery workers on different boxes often felt like magic.

However, getting results of a task back using rabbit as a result backend was a different story - it often took minutes, even hanging the box it was on. And these weren't big results either.

So for the record: we switched to Redis. Not only is restarting about 3X faster, but as a result backend it also wins - no more hanging, and results come back as soon as they're ready. My sysops guy also tells me it was much easier to install and configure.
boom.

----
update!
actually, it turns out redis performs very badly when faced with a deep queue in our production environment. So the optimal setup for us is RabbitMQ for the queue handling, and Redis for the result backend.
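for reference, a minimal sketch of that hybrid setup, using django-celery era setting names (assuming a celery version new enough to take BROKER_URL - older setups use BROKER_HOST and friends; the hosts and credentials here are placeholders):

# settings.py
BROKER_URL = 'amqp://guest:guest@localhost:5672//'   # rabbitmq handles the queue
CELERY_RESULT_BACKEND = 'redis://localhost:6379/0'   # redis stores task results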

Friday, March 30, 2012

django celery with remote worker nodes

I set up rabbitmq and celery. Plenty of good tutorials online about how to do that. Then I wanted a bunch of different linode boxen all running the same django project, with the following setup:

1 server running mysql and nothing else
1 server running nginx serving http requests and routing tasks to rabbitmq / celery
1 server running rabbitmq and celery and django
N boxes running django and celery

Turns out, it's easy!
  • All of the above hook into the mysql server by setting the HOST and PORT settings in the django settings.
  • Each slave celery box uses an environment variable to take care of any individual settings it might need, but in general each of them uses django-celery's BROKER_HOST and BROKER_PORT options to connect to the rabbitmq server.
  • using fabric makes deploying code to all of them fairly simple
Believe it or not, rabbitmq effortlessly figures out who's got a free worker among all of your boxes and just does it.
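to make that concrete, here's a minimal sketch of the per-box settings, assuming django-celery's old-style BROKER_* options; the IPs, names, and the DB_PASSWORD variable are placeholders:

# settings.py on each worker box
import os

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.mysql',
        'NAME': 'myproject',
        'USER': 'myuser',
        'PASSWORD': os.environ.get('DB_PASSWORD', ''),
        'HOST': '10.0.0.1',   # the box running mysql and nothing else
        'PORT': '3306',
    }
}

BROKER_HOST = '10.0.0.2'      # the box running rabbitmq
BROKER_PORT = 5672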

Wednesday, February 22, 2012

app engine and appcfg import errors


If you get an error that looks like the one below, be sure that you are importing the correct version of django
(which is to say: if you are using the builtin version, be sure you are not also importing another version of django from your site-packages). appcfg automatically follows all symlinks and packages up everything on your pythonpath, and the import order in production is different (actually reversed!) from the import order in development.

I solved this problem by deactivating my virtualenv whenever I deploy.
facepalms: 8 (3 hours of nonsense for a 1-line fix!)

Traceback (most recent call last):
  File "/base/python27_runtime/python27_lib/versions/1/google/appengine/runtime/wsgi.py", line 193, in Handle
    result = handler(self._environ, self._StartResponse)
  File "/base/python27_runtime/python27_lib/versions/third_party/django-1.2/django/core/handlers/wsgi.py", line 232, in __call__
    self.load_middleware()
  File "/base/python27_runtime/python27_lib/versions/third_party/django-1.2/django/core/handlers/base.py", line 40, in load_middleware
    mod = import_module(mw_module)
  File "/base/python27_runtime/python27_lib/versions/third_party/django-1.2/django/utils/importlib.py", line 35, in import_module
    __import__(name)
  File "/base/python27_runtime/python27_lib/versions/third_party/django-1.2/django/middleware/transaction.py", line 1, in <module>
    from django.db import transaction
  File "/base/python27_runtime/python27_lib/versions/third_party/django-1.2/django/db/__init__.py", line 77, in <module>
    connection = connections[DEFAULT_DB_ALIAS]
  File "/base/python27_runtime/python27_lib/versions/third_party/django-1.2/django/db/utils.py", line 92, in __getitem__
    backend = load_backend(db['ENGINE'])
  File "/base/python27_runtime/python27_lib/versions/third_party/django-1.2/django/db/utils.py", line 50, in load_backend
    raise ImproperlyConfigured(error_msg)
ImproperlyConfigured: 'google.appengine.ext.django.backends.rdbms' isn't an available database backend.
Try using django.db.backends.XXX, where XXX is one of:
    'dummy', 'mysql', 'oracle', 'postgresql', 'postgresql_psycopg2', 'sqlite3'
Error was: cannot import name backends

Tuesday, February 14, 2012

Porting Postgres to google cloud SQL

I know it's quite a peculiar problem - you have a postgres database, and suddenly you're migrating to Google App Engine and you want to use the new Google Cloud SQL (look it up, it's public and it's pretty cool).

Anyway, you need to port the old data. Google cloud sql is cool except for one big problem - sql dump imports have zero error reporting. If an import fails, it just turns red and tells you that an unknown error occurred. So here's how I made it work:

  1. dump the postgres data, selecting the tables you want, with a couple of extra options on:
     pg_dump --column-inserts --data-only POSTGRES_DATABASE_NAME -t TABLE_NAME -t ANOTHER_TABLE_NAME -f NEW_FILE_NAME.sql
     (note: you need psql privileges already at this point)
  2. delete the top lines of the dump file created in step 1, up to the first "INSERT" line.
  3. load it into mysql locally, where you can catch any errors:
     mysql -u USER -p DATABASE_NAME < NEW_FILE_NAME.sql
  4. dump it from the local mysql:
     mysqldump -u USERNAME -p --add-drop-table MYSQL_DATABASE_NAME TABLE_NAME ANOTHER_TABLE_NAME > FIXED_SQL_DUMP_FILE.sql
  5. add "use DATABASE_NAME;" as the first line of the new dump file (without the quotes, and with the name of the database you want the data loaded into on google).
  6. now you can load the new file into a google cloud storage bucket using their web gui, and from there import it into cloud sql.
  7. pray, as you wait for the stupid thing with no error reporting to turn green.
facepalms: 7

Thursday, February 9, 2012

app engine and django mail

lo and behold, local email (as in, sending email to the console) in django works with the app engine development server, but not with the app engine development backend / taskqueue server. It gives you an AttributeError: 'module' object has no attribute 'getfqdn'
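one possible workaround (my own hedged sketch, not something from the app engine docs): it's the dev sandbox's stubbed socket module that's missing getfqdn, so you can patch in a stand-in before django's mail code runs:

import socket

# the dev backend's socket stub lacks getfqdn, which django's mail
# machinery calls; any stable hostname string should do here
if not hasattr(socket, 'getfqdn'):
    socket.getfqdn = lambda *args: 'localhost'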

Wednesday, February 8, 2012

django content types and south

If you are missing content types in your django.contrib.contenttypes.models.ContentType table, and you are using south on multiple databases, it's likely that you haven't synced AND MIGRATED all of your apps (even if you don't use them) on your master database.
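in practice that means something like the following, with south-era syntax ('master' stands in for whatever your master database alias is):

./manage.py syncdb --database=master
./manage.py migrate --all --database=master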

facepalms: 1

Wednesday, January 18, 2012

django-celery with virtualenv

this page has some great tips for daemonizing celery for django. They even have a section for django under a virtualenv.

But just remember: if you export any environment variables before running django that your settings rely on to load correctly, you should probably export them in your daemon script too. Unfortunately the django-celery docs don't know about these environment variables (I'm talking about you, export ENV='pro'), so it's up to you to stick 'em in there.
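the pattern looks something like this (a sketch; ENV='pro' is the example above, and the setting it toggles here is made up):

# settings.py
import os

ENV = os.environ.get('ENV', 'dev')   # the daemon script must export this too
if ENV == 'pro':
    DEBUG = False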

Furthermore, consider restarting the celery daemon whenever you deploy. I hear supervisord makes this easy; I'm going to do that next.

facepalms: 2

email attachments with django postmark

A quick fix - and there are some pull requests for this on bitbucket, so feel free to look for them, but you can do it yourself -
in postmark/backends.py, attachments are expected to be in dictionary form:

{'Name':XXXX,'Content':XXXX,'ContentType':XXXX}

however, the default django core email attachment method wants them as a tuple. Make your own patch if you want, it's pretty straightforward. Postmark's error messages are fairly useless.
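if you'd rather see the shape of the fix, here's a hedged sketch of the conversion: django's EmailMessage stores attachments as (filename, content, mimetype) tuples, and postmark wants the dictionary above (the base64 step is an assumption based on postmark's API, so double-check it):

import base64

def attachment_to_postmark(attachment):
    # django core: (filename, content, mimetype) -> postmark: dict
    filename, content, mimetype = attachment
    return {
        'Name': filename,
        'Content': base64.b64encode(content),
        'ContentType': mimetype or 'application/octet-stream',
    }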

facepalms: 3

Monday, December 26, 2011

django multidb support for admin site by subdomain

usecase: the client logs in to the admin site by going to clientname.sitename.com/client_admin.

Why? Because it's nice to use the django admin site, and because we want each client to access only that client's database.

How?
first off, set up your DNS to forward *.sitename to your server. Then in your server config (I use nginx), send *.sitename to uwsgi.
(note: if you're not using uwsgi with django here, you'll have to use your imagination).
In your wsgi script (you know, the one that launches django for each incoming request?), put something like this:

import os

import django.core.handlers.wsgi


class RequestWSGIHandler(django.core.handlers.wsgi.WSGIHandler):
    def get_response(self, request):
        domain = request.META['HTTP_HOST']
        parts = domain.split('.')
        # clientname.sitename.com splits into three labels; tweak the
        # check if your hostnames are shaped differently
        if len(parts) >= 3:
            db_name = parts[0]
            print "caught domain %s" % db_name
            os.environ['SITE_NAME'] = db_name
        return super(RequestWSGIHandler, self).get_response(request)


# Lastly, load the handler.
# application = django.core.handlers.wsgi.WSGIHandler()
application = RequestWSGIHandler()

Yes, it's kind of quick and dirty. I've subclassed the django wsgi handler to intercept every request, grab the subdomain if there is one, and set it in the OS environment. Then, using the (happily outdated) python 2.x super() syntax, I call the standard wsgi handler.

Then when you get to settings.py, you can grab that env variable if it's there and save it to your settings object. It's now available when you need it.

For instance!!! you can subclass admin.ModelAdmin to support multiple databases (see here). In the linked example (you have to scroll halfway down), they basically overwrite the key class methods and add "using=<database name>" to all db queries. There the new database is static, but since you now have a dynamic database name in settings.WHATEVER_YOU_CALLED_IT, you can pick your extra database on the fly - a sketch follows below. hurrah.
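here's roughly what that looks like - a hedged sketch of the django-docs pattern, not the exact linked code; I'm assuming the settings attribute is called SITE_NAME, and the import-style caveat below applies to this file too:

from django.conf import settings
from django.contrib import admin


class ClientDBModelAdmin(admin.ModelAdmin):
    def _db(self):
        # SITE_NAME is whatever the wsgi handler stashed for this request
        return getattr(settings, 'SITE_NAME', 'default')

    def save_model(self, request, obj, form, change):
        obj.save(using=self._db())

    def delete_model(self, request, obj):
        obj.delete(using=self._db())

    def queryset(self, request):
        # pre-django-1.6 name; later versions call this get_queryset
        return super(ClientDBModelAdmin, self).queryset(request).using(self._db())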


A caveat:
(this is a bit scary...) in the django documentation, the good django devs say to always import settings from django.conf. I wonder why? Maybe to avoid the same multithread concurrency issues I very briefly glossed over in my last post. Who knows. Problem is, if you do that with the above example, some of your calls (though not all) won't pick up your dynamic subdomain-influenced database. Why the hell not? Beats me. I have "from django.conf import settings" everywhere in that project except that file, for which I use "import settings".

and as a further caveat, keep in mind that if you import the settings from django.conf in a file that's loaded before admin.py, or wherever you want the real dynamic settings, it won't work!

I didn't really go into it, but this is a 4.5 facepalm issue right here.

Dynamic Databases in Django

My current (paid) project has me managing a separate database for every client in Django. This has been a great challenge. Since 1.2, Django has had multidb support, so that's not hard - the hard part is all of the edge cases.

For instance, we want to be able to add clients. On the fly. We plan to have many - like more than 20. So we certainly don't want our database definitions written out longhand in settings, like in the Django tutorials. At the very least, a loop over db names:

import pro_dbs  # a plain python file holding a list of client db names


def add_db_to_databases(DATABASES, name):
    if name in DATABASES:
        return DATABASES
    DATABASES[name] = {
        'HOST': 'localhost',
        'PORT': '',
        'NAME': name,
        'USER': '',
        'PASSWORD': '',
        'ENGINE': '',  # fill in your backend, e.g. django.db.backends.mysql
        'OPTIONS': {
            'autocommit': True,
        },
    }
    return DATABASES


for name in pro_dbs.names:
    DATABASES = add_db_to_databases(DATABASES, name)

What I did there is take the names from another python file, which contains a simple python list of names.

I needed to be able to add clients on the fly. This is the hard part; as of yet I have two stumbling blocks with only partial workarounds.
  1. I'd love to have the database names in a database themselves. Soooo much better than reading the python file with the names into a list, appending the new name, then writing back to the file. But how to load from a database in the django settings file itself? It's been engineered not to allow that.
  2. I'd love to be able to update settings without restarting the server. You can do certain things in that vein by messing with django.conf.settings (see the sketch below), but it's unclear how well that'll hold up under multithreading.
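that "messing with django.conf.settings" looks roughly like this - a hedged sketch only, since the thread-safety question above is exactly what's unresolved (the engine and connection details are placeholders):

from django.conf import settings
from django.db import connections


def register_client_db(name):
    # mutate the DATABASES dict that the connection handler reads from;
    # in the django 1.2/1.3 era, connections looks aliases up lazily,
    # so a new alias becomes usable on first access
    if name not in settings.DATABASES:
        settings.DATABASES[name] = {
            'ENGINE': 'django.db.backends.mysql',
            'NAME': name,
            'HOST': 'localhost',
            'PORT': '',
            'USER': '',
            'PASSWORD': '',
        }
    return connections[name]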

All in all, not a facepalm worthy subject, but very interesting.