Python Test

This repository contains my attempt at a solution to a task I was set for a job interview. The repository will only be public long enough for the interviewers to review it.

Assumptions

Because the task involved providing a web-based user interface, I've gone with the framework I know (Django) to handle this.

To get this working, check out the project as an "application" in a Django project and add it to INSTALLED_APPS. You will also need to hook it up in the project-wide urls.py file.

BOSS API

The Yahoo! BOSS API uses OAuth for authentication. You will need to modify settings.py in this project with your OAuth consumer key and consumer secret.
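As a rough illustration of the settings involved, the credentials could be supplied along the following lines. The setting names here are hypothetical (use whatever names this project's settings.py actually defines), and reading them from environment variables is only one option, chosen so the secret stays out of version control.

```python
import os

# Hypothetical setting names; check the project's settings.py for the real ones.
# The fallback strings are placeholders, not working credentials.
BOSS_CONSUMER_KEY = os.environ.get("BOSS_CONSUMER_KEY", "your-consumer-key")
BOSS_CONSUMER_SECRET = os.environ.get("BOSS_CONSUMER_SECRET", "your-consumer-secret")
```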

Libraries

The project makes use of:

  • python-oauth2
  • anyjson
  • lxml
  • httplib2 (also a requirement of python-oauth2)

Installing from the provided requirements.txt with pip (pip install -r requirements.txt) should ensure they are all installed.

Issues encountered

While building boss.Client and alexa.Client I ran into query string arguments being handled unexpectedly when I tried to have the request automatically dressed up with extra parameters (such as the OAuth parameters).

In the case of boss.Client I changed tack and had to dig through the python-oauth2 source to find out how to sign a request and turn the signed request into an Authorization header.
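To show what "sign a request and turn it into an Authorization header" amounts to, here is a minimal sketch of OAuth 1.0 HMAC-SHA1 signing written against the Python 3 standard library only. It is not the code in boss.Client and does not use python-oauth2; the function name, the example.com URL and the key, secret, nonce and timestamp values are all placeholders made up for illustration.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def _enc(value):
    """Percent-encode a value, leaving only RFC 3986 unreserved characters."""
    return quote(str(value), safe="~")


def build_oauth_header(method, url, query_params, consumer_key, consumer_secret,
                       nonce, timestamp):
    """Return an Authorization header value for a request signed without a token."""
    oauth_params = {
        "oauth_consumer_key": consumer_key,
        "oauth_nonce": nonce,
        "oauth_signature_method": "HMAC-SHA1",
        "oauth_timestamp": str(timestamp),
        "oauth_version": "1.0",
    }
    # The signature covers the query parameters as well as the oauth_* ones.
    all_params = {**query_params, **oauth_params}
    pairs = sorted((_enc(k), _enc(v)) for k, v in all_params.items())
    param_string = "&".join(f"{k}={v}" for k, v in pairs)

    base_string = "&".join([method.upper(), _enc(url), _enc(param_string)])
    # The key is "<consumer secret>&<token secret>"; the token secret is empty here.
    signing_key = f"{_enc(consumer_secret)}&"
    digest = hmac.new(signing_key.encode(), base_string.encode(), hashlib.sha1).digest()
    oauth_params["oauth_signature"] = base64.b64encode(digest).decode()

    # Only the oauth_* values go in the header; the query string stays on the URL.
    fields = ", ".join(f'{k}="{_enc(v)}"' for k, v in sorted(oauth_params.items()))
    return f"OAuth {fields}"


# Placeholder values throughout; a real client would use a random nonce
# and the current time.
header = build_oauth_header(
    "GET", "https://api.example.com/search", {"q": "python", "format": "json"},
    consumer_key="key", consumer_secret="secret",
    nonce="abc123", timestamp=1300000000,
)
```

Because the OAuth values travel in the header, the query string can be built separately and left untouched, which sidesteps the argument-handling problem described above.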

In the case of alexa.Client I found that the request worked when the url argument was not passed through urlencode, so I append it manually to the end of the query string.
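The workaround can be sketched as follows (Python 3). The endpoint, the function name and the parameter names are hypothetical and are not taken from alexa.Client; the point is only that the ordinary parameters go through urlencode while the target URL is appended as-is.

```python
from urllib.parse import urlencode

# Hypothetical endpoint, for illustration only.
ALEXA_ENDPOINT = "https://alexa.example.com/data"


def build_alexa_url(site_url, extra_params):
    """Encode the ordinary parameters, then append the target URL unencoded."""
    query = urlencode(sorted(extra_params.items()))
    return f"{ALEXA_ENDPOINT}?{query}&url={site_url}"


request_url = build_alexa_url("http://www.python.org/", {"format": "xml", "count": 10})
```

One caveat with this approach: a target URL that itself contains characters such as & or # would be misread as part of the outer query string, so it is only safe for simple URLs.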

Optimisations

There are many optimisations that could still be made to this project. Here are just a few simple ideas.

Caching

I've implemented a very simple form of caching: the structured data from a query to either Alexa or BOSS is kept in a dictionary for the life of the thread. This means that changing the order of the results won't require another full set of queries to the external data sources.
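A minimal sketch of this kind of per-thread dictionary cache is shown below. It assumes threading.local is used to hold the dictionary, which may differ from what the project does, and the function names and the fake fetcher are made up for the example.

```python
import threading

_local = threading.local()


def _thread_cache():
    """Return this thread's cache dictionary, creating it on first use."""
    if not hasattr(_local, "cache"):
        _local.cache = {}
    return _local.cache


def cached_fetch(source, query, fetch):
    """Return structured data for (source, query), calling fetch only on a miss."""
    cache = _thread_cache()
    key = (source, query)
    if key not in cache:
        cache[key] = fetch(query)
    return cache[key]


calls = []


def fake_boss_fetch(query):
    """Stand-in for a real BOSS request; records each call it receives."""
    calls.append(query)
    return [{"title": "result for " + query}]


first = cached_fetch("boss", "python", fake_boss_fetch)
second = cached_fetch("boss", "python", fake_boss_fetch)  # served from the cache
```

Re-sorting can then work on the cached list, and the external source is only hit once per (source, query) pair in a given thread.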

In production this could be performed by using Redis or Memcached with an expiry period set to reduce calls to the external source, but also ensure that data returned to users is not too old/stale.
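The expiry idea can be illustrated with a small in-process stand-in. This is not Redis or Memcached and does not use either client library; the class name, the 300-second lifetime and the cached value are placeholders chosen for the example. The clock is injectable so that expiry can be demonstrated without waiting.

```python
import time


class ExpiringCache:
    """In-process stand-in for a Redis/Memcached-style cache with per-key expiry."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}  # key -> (expires_at, value)

    def get(self, key):
        """Return the cached value, or None if it is missing or has expired."""
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self.clock() >= expires_at:
            del self._store[key]  # stale: drop it so the caller refetches
            return None
        return value

    def set(self, key, value):
        """Store the value with an expiry time of now + ttl."""
        self._store[key] = (self.clock() + self.ttl, value)


# Fake clock so the example runs instantly; the cached value is a placeholder.
now = [0.0]
cache = ExpiringCache(ttl_seconds=300, clock=lambda: now[0])
cache.set(("alexa", "python.org"), {"rank": 1})
fresh = cache.get(("alexa", "python.org"))   # still within the expiry period
now[0] = 301.0
stale = cache.get(("alexa", "python.org"))   # expired, so the caller must refetch
```

A shared cache server gives the same behaviour across threads and processes, which the per-thread dictionary cannot.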

Asynchronous API calls

Currently the calls to the different data APIs are performed synchronously, which blocks the user interface in the front end.

We could get some benefit from using a distributed task queue (such as Celery) to perform these requests out of band. In the meantime the server can respond with a "loading" message, so the user is not left wondering whether processing is actually taking place.
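The submit-then-poll pattern can be sketched with the standard library alone. This uses an in-process thread pool as a stand-in for a distributed task queue, so it is not Celery and would not survive a server restart; all of the names below are hypothetical.

```python
import threading
import uuid
from concurrent.futures import ThreadPoolExecutor

_pool = ThreadPoolExecutor(max_workers=4)
_jobs = {}  # job id -> Future


def start_search(query, fetch):
    """Queue the slow lookup and return a job id immediately."""
    job_id = uuid.uuid4().hex
    _jobs[job_id] = _pool.submit(fetch, query)
    return job_id


def poll_search(job_id):
    """Return a 'loading' status until the background work has finished."""
    future = _jobs[job_id]
    if not future.done():
        return {"status": "loading"}
    return {"status": "done", "results": future.result()}


# Demo: an Event holds the fake fetch open so the "loading" state is observable.
release = threading.Event()


def slow_fetch(query):
    """Stand-in for the blocking calls to the external APIs."""
    release.wait(timeout=5)
    return ["result for " + query]


job = start_search("python", slow_fetch)
while_running = poll_search(job)   # the worker is still blocked
release.set()
_jobs[job].result(timeout=5)       # in the demo only: wait for the worker
finished = poll_search(job)
```

In a web setting, the first view would return the job id along with the "loading" page, and the front end would call a second view wrapping poll_search until the status is "done".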
