Usage

python feedcloner.py [--since=YYYY-MM-DD] [--until=YYYY-MM-DD] [--full]

Parameters:

  • --since=YYYY-MM-DD (Optional) The earliest date from which the feed should be archived
  • --until=YYYY-MM-DD (Optional) The latest date up to which the feed should be archived
  • --full (Optional) Retrieves the full feed from Facebook, ignoring the since and until parameters if they are specified
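
To make the interaction between the three options concrete, here is a minimal sketch of how they could be parsed. It is not taken from feedcloner.py: the helper names parse_date and parse_args are hypothetical, and the only behaviour assumed is what is described above (dates in YYYY-MM-DD form, and --full taking precedence over any date range).

```python
import argparse
from datetime import date, datetime


def parse_date(text):
    """Turn a YYYY-MM-DD string into a date, or report a usage error."""
    try:
        return datetime.strptime(text, "%Y-%m-%d").date()
    except ValueError:
        raise argparse.ArgumentTypeError("expected YYYY-MM-DD, got %r" % text)


def parse_args(argv):
    """Parse the options; hypothetical stand-in for the script's own parsing."""
    parser = argparse.ArgumentParser(prog="feedcloner.py")
    parser.add_argument("--since", type=parse_date, default=None)
    parser.add_argument("--until", type=parse_date, default=None)
    parser.add_argument("--full", action="store_true")
    args = parser.parse_args(argv)
    if args.full:
        # --full disregards any date range that was also supplied
        args.since = None
        args.until = None
    return args


if __name__ == "__main__":
    print(parse_args(["--since=2011-01-01", "--until=2011-02-01"]))
```

Dropping the dates when --full is set keeps the rest of a script from having to check both conditions.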

Installation

  1. (Optional, but highly recommended) Set up a new virtualenv environment for this project.
  2. Install all the required packages via pip: pip install -r requirements.txt.
  3. Install the CouchApps

    1. Create a _couchapp/feedcloner/.couchapprc file. Here's a sample configuration file you can use:

      {
          "env": {
              "default": {
                  "db": "http://admin:password@localhost:5984/facebookarchive"
              }
          }
      }
      
    2. Execute the following command to deploy the CouchApp into your database: couchapp pushapp _couchapp/feedcloner default

  4. Create a new Facebook application. Follow the instructions on the page.
  5. In the "Select how your app integrates with Facebook" section of the new app, select "Website" and fill in the Site URL field. Save once done.
  6. Copy config.sample.py to config.py (i.e. cp config.sample.py config.py) and change the following configuration parameters:

      COUCH_CONN_STR = 'http://username:password@localhost:5984/'
      COUCH_DB_NAME = 'database-name'
      FACEBOOK_OAUTH_CLIENT_ID = ''  # This will be the App ID of your FB application
      FACEBOOK_OAUTH_CLIENT_SECRET = ''  # This will be the client secret of your FB application

  7. You're done! Run the script by executing python feedcloner.py.

Running it for the first time

When the script is executed for the first time, you'll need to grant the script permission to archive your feed.

A web browser will be opened automatically in order to show you the Facebook dialog to grant such permissions to your script.

Once permission is granted, you'll be redirected to a page such as https://www.facebook.com/connect/login_success.html?code=AQD6...FqX#=. Copy the string between "code=" and "#" (AQD6...FqX in this example), paste it into feedcloner, and you're done :).
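
If you'd rather not pick the code out of the address bar by hand, the extraction can be sketched in a few lines. This is only an illustration, not part of feedcloner: the function name extract_code is hypothetical, and the URL in the usage line uses a made-up code value.

```python
from urllib.parse import urlsplit


def extract_code(redirect_url):
    """Return the raw value of the "code" parameter from the redirect URL."""
    # urlsplit separates the query string from the "#..." fragment
    query = urlsplit(redirect_url).query
    for pair in query.split("&"):
        name, _, value = pair.partition("=")
        if name == "code" and value:
            # Returned as-is, without percent-decoding
            return value
    raise ValueError("no 'code' parameter found in %r" % redirect_url)


if __name__ == "__main__":
    # EXAMPLECODE is a placeholder, not a real authorization code
    url = "https://www.facebook.com/connect/login_success.html?code=EXAMPLECODE#="
    print(extract_code(url))
```

The value is returned exactly as it appears in the URL, which matches the manual copy-and-paste described above.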

FAQ

How can I retrieve the Group ID?

I hope there's a simpler way to achieve that, but here's the hard way of doing it:

  1. Start up Firebug and enable the "Net" tab
  2. Bring up Facebook and check that the Net panel shows all the HTTP requests being made
  3. Click the "Clear" button so that the panel starts from a blank slate
  4. Click on the link of the Facebook group whose ID you want to retrieve
  5. Look for a GET request to generic.php. It should be the very first request, but hunt it down if it's not. Click on the [+] box to expand the details
  6. Click on the "Parameters" tab to see a key-value dictionary of the request. Two fields in there, "key" and "sk", both carry the same value, such as "group_1656399367890"
  7. The group's ID will be the numeric value after "group_" (in this case, it'll be 1656399367890)
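
The last step can also be done programmatically once you have copied the "key" (or "sk") value. A minimal sketch, assuming the value always has the "group_" prefix followed by digits as in the example above; extract_group_id is a hypothetical helper, not something feedcloner provides.

```python
import re


def extract_group_id(value):
    """Return the numeric part of a value such as "group_1656399367890"."""
    match = re.fullmatch(r"group_(\d+)", value)
    if match is None:
        raise ValueError("unexpected value: %r" % value)
    # Kept as a string, since IDs are identifiers rather than numbers
    return match.group(1)


if __name__ == "__main__":
    print(extract_group_id("group_1656399367890"))
```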

Licensing Terms

This script is licensed under the MIT License.

Copyright (C) 2011 Seh Hui, Leong

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
