lighterfootprint / Open Issues

Miscellaneous Software Design Issues

Graph Database

We're currently evaluating / using Neo4j as a graph database back end to store the "social" data, e.g., edges between participants (the social network), the comments and thumbs-ups they leave for each other, and the activities they have performed. There should be a clear distinction between the responsibilities of the graph database and those of our Django RDBMS.

The graph database is used to calculate similarity between individuals based on the activities they perform (time, location, and frequency) and their communication patterns. Initial algorithm: for each participant, generate a series of vectors, each along one dimension (e.g., comment counts on each activity, thumbs-up counts on each activity, or counts of activities performed directly). *Note* - we need to decide whether we ask the RDBMS or Neo4j for the comment counts. Then apply a cosine similarity function to the vectors of each pair of participants.
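The initial algorithm above can be sketched in plain Python. This is only an illustration under assumptions, not the project's implementation: the function names (`cosine_similarity`, `pairwise_similarity`), the shape of the `counts` dictionary, and the sample activities are all hypothetical, and the sketch does not say whether the counts come from the RDBMS or from Neo4j.

```python
import math
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length count vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # Assumption: a participant with no counts is similar to nobody.
        return 0.0
    return dot / (norm_a * norm_b)


def pairwise_similarity(counts, activities):
    """counts: {participant: {activity: count}} -> {(p, q): similarity}."""
    # One vector per participant, with activities in a fixed order.
    vectors = {
        p: [by_activity.get(a, 0) for a in activities]
        for p, by_activity in counts.items()
    }
    return {
        (p, q): cosine_similarity(vectors[p], vectors[q])
        for p, q in combinations(sorted(vectors), 2)
    }


# Hypothetical sample data: counts of activities performed directly.
activities = ["carpool", "local_lunch", "short_shower"]
counts = {
    "X": {"carpool": 2, "local_lunch": 1},
    "Y": {"carpool": 4, "local_lunch": 2},
    "Z": {"short_shower": 3},
}
similarities = pairwise_similarity(counts, activities)
```

In this sample, X and Y perform the same activities in the same proportions, so their similarity is approximately 1, while Z shares no activities with either and scores 0. The same functions could be run once per dimension (comments, thumbs-ups, activities performed); how the per-dimension scores would then be combined is left open here.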

The main benefit of the graph database over the RDBMS shows up when a query requires a long traversal, i.e., one query gives us an intermediate result that we need in order to ask a second query, whose result we need in order to ask a third, and so on.
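To make "long traversal" concrete, here is a minimal sketch of a multi-hop walk over the social network, written in plain Python rather than against any Neo4j API. The adjacency dictionary `edges` and the function name `participants_within` are hypothetical. Each round of the loop plays the role of one "intermediate query" in the chain described above; in a graph database the whole walk would typically be expressed as a single traversal.

```python
from collections import deque


def participants_within(edges, start, max_hops):
    """Breadth-first walk: return {participant: hops} for everyone
    reachable from `start` in at most `max_hops` edges."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue  # do not expand past the hop limit
        for neighbor in edges.get(node, ()):
            if neighbor not in hops:
                hops[neighbor] = hops[node] + 1
                queue.append(neighbor)
    del hops[start]  # the starting participant is not part of the result
    return hops


# Hypothetical social network: A - B - C - D
edges = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C"],
}
friends_of_friends = participants_within(edges, "A", 2)
```

With a relational schema, each extra hop usually means another self-join or another round trip, which is the cost the graph database is meant to avoid.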

Email Template

Hi, Person XYZZY:

This is a summary of your group's activity in the Lighter Footprints Challenge.

Activities performed
  • You turned off the water while brushing your teeth
  • Participant X performed Don't Flush Your Toilet 3 times
  • Participant Y ate a local lunch
  • Participants A, B, and Z carpooled
Social activity
  • Participant D commented on your post, saying "You are the weakest link!"
  • Participant E liked your post
  • ...