The Bliu Bliu API is accessible via the HTTP POST method.

To get an API key go to, register, and go to your dashboard. At the bottom of the page you will see a “Developer API key” box containing a code like:


Now you will be able to access the API methods. Currently only one is available: content uploading.


Content uploading




  • youtube_url - URL of the video, if available
  • image - URL of the image, if available
  • author - author of the text
  • tags - a list of tags that might be applied to the text (sports, economy, quotes, etc.)


You need to send an HTTP POST request with a data parameter containing a JSON associative array of the text data.


   "body":"Ashes rivals England and Australia will meet on the opening day of the 2015 World Cup after being drawn in the same group.",  
   "title":"England face Australia at World Cup",


curl -X POST \
 --data 'data={"body": "Ashes rivals England and Australia will meet on the opening day of the 2015 World Cup after being drawn in the same group.", "locale": "en", "tags": ["sports"], "title": "England face Australia at World Cup", "original_source": "", "image": "", "collection": ""}' \

PYTHON REQUEST EXAMPLE (uses python-requests)

import requests
import json

data = json.dumps({
    'title': 'England face Australia at World Cup',
    'body': 'Ashes rivals England and Australia will meet on the opening day of the 2015 World Cup after being drawn in the same group.',
    'original_source': '',
    'locale': 'en',
    'image': '',
    'tags': ['sports'],
})

response = requests.post(
    API_URL,  # API_URL: placeholder for the Bliu Bliu upload endpoint URL
    data={'data': data},
)

print(response.status_code)
print(response.content)


The response is always a JSON associative array containing one element with the key msg.

If the request was processed successfully, the API returns an HTTP 200 (OK) status code with the JSON: {"msg": "OK"}

If request processing failed, the API returns an HTTP 400 (BAD REQUEST) status code with JSON in which the value of the msg key explains why the request failed.
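A small sketch of interpreting such a response (the helper name is hypothetical and the error message is made up; the {"msg": ...} shape and the 200/400 status codes are as described above):

```python
import json

# Hypothetical helper: interpret a Bliu Bliu API response.
# Per the docs, the body is always a JSON associative array with one key, msg:
# HTTP 200 means the upload succeeded, HTTP 400 means it failed and msg says why.
def interpret_response(status_code, body):
    msg = json.loads(body)["msg"]
    if status_code == 200:
        return "uploaded: " + msg
    return "failed: " + msg

print(interpret_response(200, '{"msg": "OK"}'))         # uploaded: OK
print(interpret_response(400, '{"msg": "bad locale"}')) # failed: bad locale
```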


Help Bliu Bliu to add more content

How to start scraping with Scrapy

To scrape websites you will need:

  • Terminal app (on Mac OS X and Linux one is already available; on Windows you should download the PuTTY client).
  • Linux server (scraping from your own computer is not recommended, because it is a long-running task and requires a good internet connection).


Connect to your server with a terminal app (if you do not have a server and want to help us, contact us and we will give you one):


Install OS (Ubuntu) dependencies:

sudo apt-get install build-essential python-virtualenv \
    python-dev libxml2-dev libxslt-dev g++ libyaml-dev mercurial

Clone repository:

hg clone

Navigate to cloned directory:

cd bliubliu_crawler

Create virtualenv:

virtualenv .

Activate virtualenv:

source bin/activate

Install python dependencies:

pip install -r requirements.txt

Scraping a website

First, open ./bliubliu/bliubliu/ and in the line


replace YOUR-API-KEY with your personal API key.

In the directory ./bliubliu/bliubliu/spiders/ you will notice the first demo spider, written for you. It is for scraping the website.

Navigate to ./bliubliu/bliubliu/ directory and run command:

scrapy list

You will see:

(bliubliu_crawler)user@server:~/bliubliu_crawler/bliubliu/bliubliu$ scrapy list

You can run the crawler by entering the following command (where the argument is your spider's name):

scrapy crawl

Now the spider will start crawling the website and you should see output with responses from the API:

$ scrapy crawl
API response: [200] OK
API response: [200] OK
API response: [200] OK

You can stop the spider by pressing Ctrl+C a few times.

Writing your own spider

Go to the spiders/ directory, duplicate the demo spider and rename the copy after the website you are scraping.

Open the file you just created and rename the class after the website domain you are scraping. Also change name, allowed_domains, start_urls and rules to match the website's structure.

Before starting the spider you should also change the XPath selectors to extract the correct data from the website.

To test your selectors, use the scrapy shell command.

If we want to scrape, find a page on which only one quote appears, for example:

Now in shell type:

scrapy shell \

Now you are in a Scrapy shell (to quit, press Ctrl+D).

You can try your own XPath selectors using the hxs object.

For example, to extract the quote body you can try this selector:

In [1]: hxs.select('//div[@class="bq_fq bq_fq_lrg"]/p/text()').extract()[0]
Out[1]: u'Film spectators are quiet vampires.'

After finding the correct selector write it down in your spider's file.
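If you want to sanity-check an XPath expression without a live site or a Scrapy shell, you can run it against a saved HTML fragment. A sketch using only Python's standard library (ElementTree supports just a subset of XPath, so the trailing text() step is replaced by reading .text; the fragment below merely mimics the page's assumed structure):

```python
import xml.etree.ElementTree as ET

# A tiny fragment mimicking the assumed structure of the quote page.
fragment = (
    '<html><body>'
    '<div class="bq_fq bq_fq_lrg"><p>Film spectators are quiet vampires.</p></div>'
    '</body></html>'
)

root = ET.fromstring(fragment)
# Same path as in the scrapy shell session, minus the text() step,
# which ElementTree does not support; read .text on the element instead.
quote = root.find('.//div[@class="bq_fq bq_fq_lrg"]/p').text
print(quote)  # Film spectators are quiet vampires.
```

Note this only checks the path logic; real pages are often not well-formed XML, which is why the Scrapy shell remains the primary tool.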

Find the correct selectors for all required fields and test them with the following command (change it to match the website you are scraping and your spider):

scrapy parse \
    --nolinks -c parse_item

You should see output like:

# Scraped Items  ------------------------------------------------------------
[{'author': u'Jim Morrison',
  'body': u'Film spectators are quiet vampires.',
  'locale': 'en',
  'original_source': '',
  'tags': ['quote'],
  'title': 'Quote'}]

This means that the selectors are correct and you can start your spider and scrape the website.

When you are done, run your spider:

scrapy crawl www.****.com