HOWTO : Python Debugging with pdb

March 19, 2008 1 Comment

I'm a dynamically typed language fanboy. I've dabbled in all sorts of languages, but got my first development job working with PHP and JavaScript, graduated to Python, and then did the C/C++ and Java stints as well. One thing I'm quite used to doing is writing print statements for debugging. In large C/C++ or Java environments, I found this quite impractical, and eventually became quite intimate with the debugging tools in those languages. Until recently, however, I continued using print statements when working with scripting languages like Python. Now I use pdb for almost every problem I encounter in Python.

A while ago, I battled a nasty Entertainer bug. It was reproducible, but not consistently. I would start the backend, and then it would start indexing videos from the configured video folders. Then, for no apparent reason, the backend would die. I was able to trace the problem back to the VideoThumbnailer, which would be my baby. Crap. So I started adding print statements, so that I could see if the thumbnailer was consistently dying on the same video (thinking it might have been a codec problem with GStreamer). What I found was that the backend rarely died in the same spot, and since the thumbnailer didn't create thumbnails for videos that already had thumbnails, once it got past a video it had failed on before, it never struggled on it again. With the multi-threaded environment of Entertainer, though, it was quite difficult to debug.

Eventually, I followed the yellow brick road to see the Google. I typed in 'python debugger' and found the first page of results full of pdb articles. "Oh no," I thought, "anything but the confusing and difficult pdb!" Alas, once I forced myself to work with it, I found that it was very simple to use (and that the PyDev Eclipse plugin makes it so much more complicated than it needs to be). I'm still only using a small subset of the things that pdb can do, but frankly, I can't see a need for anything more than what I used it for (setting a breakpoint, reading local variables, seeing a backtrace, and stepping through code).

First, I decided where I wanted to have pdb stop and give me a debugging console (setting the breakpoint). When that was decided, I added the following code:

import pdb
pdb.set_trace()

That's it! Now run the code, and you'll find that instead of complete execution, you'll be at the (Pdb) prompt. Above your prompt, you should see the pdb.set_trace() call that created the console. I'm not one of those people who fears a command prompt, so if you are, I apologize. However, this isn't nearly as complicated as a shell prompt. You've got only a few keys to worry about:

  • n (next) - Hitting 'n' at this prompt will execute the current line of code and go to the next line.
  • c (continue) - If you've gotten to the point where you know your bug is fixed, and you want to just leave the debugger, hit 'c' to continue executing the program. This will complete execution, provided that you have no more calls to pdb.set_trace().
  • s (step into) - If you're at a function or method call on the prompt, and you'd like to see what's going on inside that function, stepping into it will allow you to follow execution into the function. If you don't want to see what's going on in the function, 'n' is your character.
  • l (list) - Wait. Where was I in the code again? Yea, I guarantee that'll happen to you. Hitting 'l' will give you a chunk of code in the area you are, with a nice little -> where you are currently.
  • bt (backtrace) - Okay, so now I know where I am, but how did I get here? 'bt' gives you a backtrace, showing you the start of execution, and all the function and method calls that got you where you are today.
  • p (print) - Okay, so what's the value of variable foo? 'p foo' will print its value, allowing you to see how the variable is changing as you walk through the code.
  • q (quit) - Ah! Found the bug. Let's quit this session and fix it!
  • enter (repeat last command) - Tired of hitting 'n' over and over? How 'bout just using 'n' once, and then hitting 'Enter' over and over. It's a bigger key, and less likely to get fat fingered. I don't use it, but it's there if you want it.
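To make the commands above concrete, here is a minimal sketch of a function with a breakpoint in it. The function name average and the debug flag are made up for illustration; the only pdb call used is pdb.set_trace(). Call it with debug=True to land at the (Pdb) prompt, where 'p total', 'l', 'bt', and 'c' behave as described in the list.

```python
import pdb


def average(values, debug=False):
    """Return the mean of values, optionally pausing in pdb first."""
    total = 0
    for value in values:
        total += value
    if debug:
        # Execution pauses here and the (Pdb) prompt appears.
        # Try 'p total', 'l', 'bt', then 'c' to continue.
        pdb.set_trace()
    return total / len(values)


if __name__ == "__main__":
    # Flip debug to True to get the interactive prompt.
    print(average([3, 4, 8], debug=False))
```

Keeping the breakpoint behind a flag is just one option; it saves you from having to delete the set_trace() call when you're done poking around.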

Needless to say, debuggers are the tools that would make your life easier if you just learned how to use them. Since my experience with the debugger, I've chatted with a few pythonistas and developers from other languages, and to my surprise have received varying responses. I would say that if you've spent more than 5 minutes writing print statements and running your code, do yourself a favor and fire up the debugger. There might be a learning curve now, but you'll be a better developer for it.

Simple django-tagging HOWTO

January 25, 2008 No Comments

After looking through my Google Analytics results, I noticed that a few hits to my site were people looking for a howto on integrating django-tagging, which I have been wanting to write for quite some time. While I'm quite experienced working with SQL schemas and the like, there were some ORM concepts that I had never been exposed to until I started using Django (it was my first framework featuring an ORM). Things like Generics confused me; although I knew how to do something similar in plain ol' SQL, it took me a bit to grasp the concepts. So I figure demonstrating their uses, along with integrating django-tagging into an app, would be a fine use of my precious blog space...

Please keep in mind that I am not a Django expert, nor am I a django-tagging expert. My experience with django-tagging has been adding it to my blog posts. While I've considered making an entire category hierarchy-type site with tags, I haven't actually implemented it. All I can demonstrate are the basics. You'll have to learn the rest on your own. Also, it's important to point out that I have been using the svn version of django-tagging, although a 0.2 version was released recently. I just set up my project's svn to use externals to keep me up to date on django-tagging.

First things first, you need a model. I'm going to use a dumbed down version of the model I use for blog posts. Lay out your model like this:

from django.db import models

class BlogPost(models.Model):

    title = models.CharField(max_length=30)
    body = models.TextField()
    date_posted = models.DateField(auto_now_add=True)

Now you have a pretty simple blog class. After a syncdb, you can fire up the admin app and start blogging! Er, kinda. You won't have any way for a user to see your posts, but you'll have posts in the database. However, you've got no tags! How would you add tags to your newfangled blog? Easy. Download the tagging module and install it (I usually just copy the appropriate files to my $PYTHONPATH). Then it's actually quite simple, and you'll kick yourself for not figuring this out. Modify your blog model, and add this:

from django.db import models
from tagging.fields import TagField

class BlogPost(models.Model):

    title = models.CharField(max_length=30)
    body = models.TextField()
    date_posted = models.DateField(auto_now_add=True)
    tags = TagField()

However, you're going to want a method in your model that will allow you to iterate through the tags, called get_tags. I also have a set_tags method, and I know there was a reason I added that, I just can't remember what it was... (a good case for why you should always comment your code). So modify your model so that it now looks like this:

from django.db import models
from tagging.fields import TagField
from tagging.models import Tag

class BlogPost(models.Model):

    title = models.CharField(max_length=30)
    body = models.TextField()
    date_posted = models.DateField(auto_now_add=True)
    tags = TagField()
    
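    def set_tags(self, tags):
        # Update the tags associated with this post.
        Tag.objects.update_tags(self, tags)

    def get_tags(self):
        # No extra arguments, so a template can call blogpost.get_tags.
        return Tag.objects.get_for_object(self)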

Now you've got a full-blown model with tagging built in. Make sure you've got tagging listed in your INSTALLED_APPS, blow out your database, and syncdb again. Hooray! You'll notice on my blog that I have the tags shown at the top of each post, which is accomplished with the following template code:

{% for tag in blogpost.get_tags %}
  <a href="/blog/tag/{{tag}}" alt="{{tag}}" title="{{tag}}">{{tag}}</a>
{% endfor %}
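For the INSTALLED_APPS step mentioned above, a settings.py fragment might look something like the sketch below. The 'blog' entry is a hypothetical name for the app that holds BlogPost, and your own list will have other entries; the points that matter are that 'tagging' is listed and that the contenttypes app is present, since django-tagging relies on it for its generic relations.

```python
# settings.py (fragment) -- a sketch, not a complete settings file.
INSTALLED_APPS = (
    'django.contrib.contenttypes',  # needed for generic relations
    'django.contrib.admin',
    'tagging',                      # the django-tagging app
    'blog',                         # hypothetical app containing BlogPost
)
```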

I know there are template tags for many of the standard operations for django-tagging, but I haven't used any of them. I noticed that most of them are for tag clouds by model or object, and I would like a tag cloud period, with links to pages on the site with various items that are tagged with the given tag. I have just been far too lazy to actually implement it yet, but there is a weekend coming up...

Edit: I've found some better ways to implement django-tagging, so I've made some changes to this tutorial

Gutsy Gibbon, Meet Samsung ML-2150

November 12, 2007 1 Comment
Tagged as: howto linux ubuntu

I got a steal of a deal on a Samsung ML-2150 laser printer from Buy.com (plus $10 off for signing up with Google Checkout). I was going to hold out for a good duplexing one, but I figured if this printer lasts six months, I got my money's worth. Before I received it, I read somewhere that the Foomatic hpijs driver would work. Upon hooking up the printer, I found this not to be the case. I figured I could post the process of getting it connected here.

First, make sure you have cups installed. I print through a NAS device that doubles as a print server, so I needed Samba installed as well. If you don't have cups installed, a quick apt-get install cupsys will do the trick.

Conveniently, Samsung includes drivers on their disk. However, the installer sets things up badly: it configures all print jobs to run as root, and well, that's just not a good idea. We work around this by using the vendor-supplied Linux drivers without the installer. I went digging around and found more current drivers than the ones on my install CD.

Unpack the tarball. The unpacking will create a directory called cdroot. Execute the following commands, as root:


# cp cdroot/Linux/noarch/at_opt/share/ppd/ML-2510spl2.ppd /usr/share/cups/drivers
# cp cdroot/Linux/x86_64/at_root/usr/lib64/cups/filter/rastertosamsungspl /usr/lib/cups/filter/

You've just installed the needed drivers. Isn't that MUCH less painful than, say, running an install disk? Now, all that's left is to navigate to http://localhost:631, which is a convenient web interface for setting up printers. Select "Add Printer" and go through the steps, naming your printer (without spaces), and giving it some metadata (location, description). When it comes time to select a .ppd file, click the browse button of the file field at the lower part of the screen, and navigate to /usr/share/cups/drivers. You should see ML-2510spl2.ppd there. Select it. Move on, making sure to enter the path to your printer (in my case, it was smb://nas1/lp, but your path will most likely be different).

Once you've completed the wizard, print a test page and pat yourself on the back. While you're at it, pat Samsung on the back for providing Linux drivers that make this process so freakin' easy.

Experience Renaming Deployed Django Apps

October 16, 2007 4 Comments

I've released yet another iteration of The Iron Lion, with the key feature of this new release being my new knowledge of the newforms library. I've been thinking about adding the ability to comment for a while, and have even blogged about some of the ideas I've had/seen to reduce spam and the like. You'll find the new contact form to be my experiment with the newforms library (and also a solution to putting my email out on the web, even though I've gotten plenty of spam at that address anyway).

After I pushed this new version, I decided that my naming scheme for my various apps was becoming more cumbersome than clever. I was naming the different apps after planets in our solar system. Cute naming convention, but since this is becoming more than just a few apps, I found it difficult to remember what neptune was, why I need the earth app in this situation, and what pluto was even created for. So I obviously needed to rename these apps.

I was expecting that the process was just a matter of renaming the folders. There were obvious dependencies to worry about, and import calls to fix. So I started by moving the folders around. I then did an fgrep for all the instances of the old names. Using sed, I did a search and replace of each name. For some reason after that, my Django dev server didn't much like the environment variables I had set, and so I had to work out some problems with my environment before I could test it...
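A rough Python equivalent of that fgrep-and-sed pass might look like the sketch below. The function name rename_app_references and the sample file are made up for illustration; the actual work was done with shell tools. It replaces whole-word occurrences of the old app name in every .py file under a directory and reports which files changed.

```python
import re
import tempfile
from pathlib import Path


def rename_app_references(project_root, old_name, new_name):
    """Replace whole-word uses of old_name in .py files; return changed paths."""
    pattern = re.compile(r"\b%s\b" % re.escape(old_name))
    changed = []
    for path in sorted(Path(project_root).rglob("*.py")):
        text = path.read_text()
        new_text = pattern.sub(new_name, text)
        if new_text != text:
            path.write_text(new_text)
            changed.append(path)
    return changed


# Demo on a throwaway directory so nothing real gets touched.
with tempfile.TemporaryDirectory() as root:
    sample = Path(root) / "urls.py"
    sample.write_text("from neptune.views import index\n")
    changed = rename_app_references(root, "neptune", "blog")
    result = sample.read_text()

print(result)
```

Templates and other non-Python files are not covered here, so a real project would probably want a wider file pattern.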

One day later, and I was back to work. I browsed around the dev server to make sure everything was working, and noticed only one thing broken: tags. This was an opportunity I had been waiting for, actually, and I dove headfirst into the tag database schema. The tags for this site are generated with the django-tagging app, so I hadn't gained much experience with generic relations and the like, and I wanted to. I found a table called 'tagged_item', which I investigated, and found a reference to a content type id. I figured it must be a foreign key to django_content_type, and kept digging. Apparently, Django keeps a table of all the installed models and the apps they belong to. A few updates later, and I had tags working!

So, in summary, if you want to rename a Django app that's deployed in the field:

  1. Rename the folder found in your project root
  2. Update any references to your app wherever it is used, i.e. the app's views and the urls.py and settings.py files.
  3. Edit the database table django_content_type with the following command: UPDATE django_content_type SET app_label='<NewAppName>' WHERE app_label='<OldAppName>' (Note: for renaming models, you'll need to change django_content_type.name)
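To see what the UPDATE in step 3 does, here is a small sketch that runs it against an in-memory SQLite table. The table is a simplified stand-in for django_content_type, and the app names ('neptune' renamed to 'blog') are hypothetical; against a real site you would run the statement in your own database, after taking a backup.

```python
import sqlite3

# Simplified stand-in for Django's content type table (assumption:
# only the columns relevant to the rename are modelled here).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE django_content_type ("
    "id INTEGER PRIMARY KEY, name TEXT, app_label TEXT, model TEXT)"
)
conn.executemany(
    "INSERT INTO django_content_type (name, app_label, model) VALUES (?, ?, ?)",
    [
        ("blog post", "neptune", "blogpost"),
        ("tag", "tagging", "tag"),
    ],
)


def rename_app_label(connection, old_label, new_label):
    """Move every content type row from old_label to new_label."""
    cursor = connection.execute(
        "UPDATE django_content_type SET app_label = ? WHERE app_label = ?",
        (new_label, old_label),
    )
    connection.commit()
    return cursor.rowcount  # number of rows that were updated


changed = rename_app_label(conn, "neptune", "blog")
labels = [
    row[0]
    for row in conn.execute(
        "SELECT app_label FROM django_content_type ORDER BY id"
    )
]
print(changed, labels)
```

Rows belonging to other apps, like tagging here, are left alone because the WHERE clause only matches the old label.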