First Steps Sorting out Python

After wrapping up a little c++ code and a fantastic walk yesterday morning, I settled down in front of my iMac without an immediate looming task and it felt grand. Twitter’s an unmatched wide band serendipity delivery engine, and in a few minutes I found myself reading about pypy’s 1.5 alpha release. Even if you’re far removed from code, you’ve likely at least heard of Python but…

What is PyPy?

PyPy is a very compliant Python interpreter, almost a drop-in replacement for CPython 2.7.1. It’s fast (pypy 1.5 and cpython 2.6.2 performance comparison) due to its integrated tracing JIT compiler.

This release includes the features of CPython 2.6 and 2.7. It also includes a large number of small improvements to the tracing JIT compiler. It supports Intel machines running Linux 32/64 or Mac OS X. Windows is beta (it roughly works but a lot of small issues have not been fixed so far). Windows 64 is not yet supported.

I’ve been wanting to dive into python for a while now and even setup pythonbrew (which regretfully doesn’t play nicely with pypy), but the opportunity never presented itself. I pounced on the momentary lapse of my many time eating overlords, and decided to explore how the sorting code I’d recently written in c++ might take shape in Python.

Low and behold, StackOverflow revealed breadcrumbs of another developer new to Python who shared an example of qsort, thanks Frankie. I eased into this example which was more readable (for a python newb) than the one liner scripts I came across elsewhere. I’m not against one liners, but I’ll work that style after I get the fundamentals. Frankie’s quicksort’s form was similar to the qsort with temporary lists impl I coded Saturday morning, versus the slightly more tricky in place variant (yeah, I screwed that up the first time).

The brand spanking new py-util repo includes only a single short python script which contains a merge sort in addition to a qsort. Without further delay here come…

the Sorts

Update I left out a critical step in mergesort, the MERGE! There was an error with merge sort and I haven’t tested the fix yet. See the comment below for a good example how to mergesort (thanks, appreciate the correction). I’ll test out the fix I edited on github tonight. The good news I got to learn some more python syntax with extend. I coded the merge how I would in c/c++, there may be some tricks in python I’m missing.

Readme & Run Example

The tweaks to Frankie’s version and merge sort took only a few delightful moments with the aid of just two helpful interpreter error messages. The big surprise was in the execution times. PyPy ran ~8x faster than my Mac’s default cpython (7.1) and it’s comparable to my c++ qsort (in place is still faster). I’ve embedded the readme and results file below:

That wraps up my first steps in Python. Future py-posts may include web frameworks Flask and Django, datamining examples, number crunching with numpy (we’ll see if/how this jives with pypy) with an outside chance of opencv python bindings and image processing.

The latter is unlikely because each man or woman is apportioned only so many years of image manipulation per lifetime, and I’ve far exceeded my lot. One of the few forces outside of a paying gig that could coerce me into image processing would be a mobile app. But for that I’d use objective c or java bindings(?) to opencv or roll my own image processing lib.

Gratuitous Notes:

  1. Excellent python references:
  2. Related post: Folding in c++ (the repo referenced includes array splitting and in place qsorts)



Be Sociable, Share!

Categories: Uncategorized
Tags: ,
  • Greg Johnson

    Here was my newb attempt at a python qsort. It uses list comprehensions. I really
    like how clean it is.

    def qsort(a):

    # a really short list is already sorted. this is the (non-recursive)
    # base case.

    if len(a) <= 1:
    return a

    # this is the inductive step, which recursively calls the routine
    # on lists that are known to be shorter than the input list.

    # pick an arbitrary element of list 'a' (we choose a[0] but it
    # could have been anything).

    pivot = a[0]

    # grab the elements smaller than pivot, the elements equal to
    # pivot, and the elements bigger than pivot.

    smaller = [value for value in a if value pivot]

    # recursively sort the lists of smaller and bigger elements,
    # and glue the result together.

    result = qsort(smaller)

    result.extend(equal)

    result.extend(qsort(bigger))

    return result

  • Nobody

    Your merge_sort doesn’t sort the values (nor does it merge them either).
    In fact your merge_sort function split your list into 3 parts (left, right and a middle value), and creates a new list by appending the values in the same order they were to begin with.
    For instance if you send [5, 4, 2, 8, 3] as an input, your function will return [5, 4] + [2] + [8, 3].

    You can look at the pseudocode from wikipedia (http://en.wikipedia.org/wiki/Merge_sort) to see how the merge sort works, the translation to python is pretty straightforward :

    def merge_sort(lst):
    if len(lst) <= 1:
    return lst
    else:
    l = len(lst)
    mid = len(lst)/2
    left, right = [], []
    for x in lst[:mid]:
    left.append(x)
    for x in lst[mid:]:
    right.append(x)
    left = merge_sort(left)
    right = merge_sort(right)
    return merge(left, right)

    def merge(left, right):
    result = []
    while len(left) or len(right):
    if len(left) and len(right):
    if left[0] <= right[0]:
    result.append(left[0])
    left = left[1:]
    else:
    result.append(right[0])
    right = right[1:]
    elif len(left):
    result.append(left[0])
    left = left[1:]
    elif len(right):
    result.append(right[0])
    right = right[1:]
    return result

    As a side note, it is not recommended to use 'sorted' or 'list' as variable names, as they are builtin functions in Python.

  • http://www.victusspiritus.com/ Mark Essel

    crap I forgot the ordered merge and it’s the most important step! Good catch, I didn’t verify the outputs.