Python versus R

A quick web search will find you plenty of people contributing to the debate as to whether R or Python is better for data analysis. I’m not really going to contribute to that debate here, having already expressed my frustrations with R in previous posts. However, I am going to put my neck on the line and state that Python is better for one task: stitching a few images together.

Grand Prix – big prize or high price?

The f1sleuth website presents a statistical view of Formula 1 performances, and I would encourage anyone with an interest in the sport to take a look. In conversation with the author, a committed R user, I learned he was using his language of choice for everything, not just the analysis but also for joining together output images.

It pained me to hear how he was doing this, so I created a quick Python script to do the job. He has confirmed it does what he wants, and I can detect the smallest crack forming in his R-centric view of the world.

Exiting the pits

The simple script I provided is below (and on GitHub for what it’s worth).

# First install PIL
# c:\python27\scripts> pip install Pillow

# Usage (arguments to the script):
# input1.jpg input2.jpg input3.jpg output.jpg

import sys
from PIL import Image

input_files = sys.argv[1:-1]
output_file = sys.argv[-1]
# Parenthesised so the line runs under both Python 2 and Python 3.
print('Merging files %s into output file %s' % (','.join(input_files), output_file))

# Open every input image.
input_images = [Image.open(x) for x in input_files]
# Width of new image is max of the input widths, while height is the sum of the heights.
new_image = Image.new("RGB", (max([x.size[0] for x in input_images]), sum([x.size[1] for x in input_images])))

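# Paste each image below the previous one, tracking the running y offset.
yy = 0
for input_image in input_images:
    new_image.paste(input_image, (0, yy))
    yy = yy + input_image.size[1]

# Write the stitched image to the output file.
new_image.save(output_file)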
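
The size arithmetic in the script (width is the maximum of the input widths, height is the sum of the input heights) can be checked on its own, without any image files. The following is only a sketch: the helper name `vertical_layout` and the sample sizes are made up for illustration and are not part of the original script.

```python
def vertical_layout(sizes):
    """Return (canvas_size, y_offsets) for stacking images top to bottom.

    `sizes` is a list of (width, height) tuples, as in PIL's Image.size.
    """
    canvas_width = max(width for width, _ in sizes)
    y_offsets = []
    y = 0
    for _, height in sizes:
        y_offsets.append(y)  # top edge of this image on the canvas
        y += height
    # After the loop, y is the total height of the stacked images.
    return (canvas_width, y), y_offsets


# Three made-up image sizes, purely for illustration.
canvas, offsets = vertical_layout([(640, 480), (800, 600), (320, 240)])
print(canvas)   # (800, 1320)
print(offsets)  # [0, 480, 1080]
```

Images narrower than the widest one are pasted at x = 0, so the strip to their right keeps the canvas's default fill, which is black for a new RGB image.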

Running the script stitches the images together vertically, in the order given. For example, take the following inputs:

[Images: machu_picchu, saksaywaman, choquequirao]

Run the script: machu_picchu.jpg saksaywaman.jpg choquequirao.jpg inca_ruins.jpg

And you will have output looking like this.



This probably isn’t the most telling contribution to the R versus Python debate, but if another R user out there is trying to use it for something it’s clearly not suited to, I would plead with them to take a look at an alternative.

Although this is only a trivial example, I think it does help illustrate the versatility of Python. At the same time as being considered a serious rival to a dedicated statistical package like R, it can also be used as a very powerful scripting language to do something useful in only a dozen lines of code.

This isn’t to say it’s perfect for all occasions. Because Python is interpreted and dynamically typed, many errors only surface when the offending line actually runs, so you can run into problems if you’re not careful. One example I experienced was a trade loading system that crashed intermittently part way through a large batch job. The problem turned out to be that the wrong type of arguments were being used to initialise an exception. By definition this code path was only exercised in an exceptional case, so it wasn’t picked up by simple sunny-day testing. A statically typed, compiled language would not have had this problem, because the compiler would have caught it. You can also run into performance issues, potentially after you’ve already gone a long way down a non-performant path.
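
To make the exception problem concrete, here is a minimal sketch of how that kind of bug can hide. It is not the code from the trade loading system: `TradeLoadError`, `load_trade`, and the record layout are all hypothetical, and swapping the two constructor arguments is just one assumed way of passing the wrong type.

```python
class TradeLoadError(Exception):
    """Hypothetical error type: expects a numeric trade id, then a reason."""

    def __init__(self, trade_id, reason):
        # %d requires a number, so a string trade_id fails right here.
        message = "trade %d: %s" % (trade_id, reason)
        super(TradeLoadError, self).__init__(message)


def load_trade(trade):
    """Hypothetical loader: rejects trades with a non-positive quantity."""
    if trade["quantity"] <= 0:
        # Bug: the arguments are swapped, so a string arrives where a
        # number is expected. Nothing complains until this line runs.
        raise TradeLoadError("quantity must be positive", trade["id"])
    return trade["id"]


# Sunny-day input never reaches the faulty line, so simple testing passes.
print(load_trade({"id": 1, "quantity": 100}))

# The rare bad record reaches it and fails with a TypeError raised while
# building the exception, instead of the intended TradeLoadError.
try:
    load_trade({"id": 2, "quantity": 0})
except TypeError as exc:
    print("unexpected failure: %s" % exc)
```

A unit test that deliberately feeds in a bad record would exercise the error path and surface the mistake before a batch job does.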

However, for stitching some images together, I would suggest it is a better tool for the job than R.

