Miki Tebeka committed df58190

mark

Files changed (8)

+{
+ "metadata": {
+  "name": "Untitled0"
+ },
+ "nbformat": 3,
+ "nbformat_minor": 0,
+ "worksheets": []
+}
+#!/usr/bin/env python
+
+import re
+from collections import Counter
+from itertools import combinations
+
+
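+def tokenize(text):
+    """Return the set of lowercase words in text with at least 4 letters."""
+    return set(
+        w.lower()
+        for w in re.findall('[a-z]+', text, re.I)
+        if len(w) >= 4)
+
+
+def histogram(w1, w2):
+    """Count the letters of both words combined."""
+    return Counter(w1 + w2)
+
+
+def is_anagram(w1, w2, w3, w4):
+    """True if the pair (w1, w2) uses exactly the same letters as (w3, w4)."""
+    return histogram(w1, w2) == histogram(w3, w4)
+
+
+def find_2anagrams(words):
+    """Yield (w1, w2, w3, w4) where two disjoint word pairs are anagrams.
+
+    words must be a set; each match is yielded once per pair order.
+    """
+    for w1, w2 in combinations(words, 2):
+        for w3, w4 in combinations(words - {w1, w2}, 2):
+            if is_anagram(w1, w2, w3, w4):
+                yield w1, w2, w3, w4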
+
+
+def process(fo):
+    text = fo.read()
+    words = tokenize(text)
+    seen = set()
+    for w1, w2, w3, w4 in find_2anagrams(words):
+        wset = {w1, w2, w3, w4}
+        if seen & wset:
+            continue
+        print('{} {}, {} {}'.format(w1, w2, w3, w4))
+        seen |= wset
+
+
+if __name__ == '__main__':
+    from sys import stdin
+    process(stdin)
+And this Thing I saw!  How can I describe it?  A monstrous tripod,
+higher than many houses, striding over the young pine trees, and
+smashing them aside in its career; a walking engine of glittering
+metal, striding now across the heather; articulate ropes of steel
+dangling from it, and the clattering tumult of its passage mingling
+with the riot of the thunder.  A flash, and it came out vividly,
+heeling over one way with two feet in the air, to vanish and reappear
+almost instantly as it seemed, with the next flash, a hundred yards
+nearer.  Can you imagine a milking stool tilted and bowled violently
+along the ground?  That was the impression those instant flashes gave.
+But instead of a milking stool imagine it a great body of machinery on
+a tripod stand.
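
The committed `find_2anagrams` compares every word pair against every other pair, which is fine for a passage of this size but grows quickly with vocabulary. As a hedged alternative, not the approach in this commit, pairs can be grouped by a sorted-letter signature so that only pairs sharing a signature are compared. The names `pair_signature` and `find_2anagrams_indexed` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def pair_signature(w1, w2):
    """Sorted letters of both words; equal signatures mean the pairs are anagrams."""
    return ''.join(sorted(w1 + w2))


def find_2anagrams_indexed(words):
    """Yield (w1, w2, w3, w4) for disjoint word pairs with the same letters."""
    groups = defaultdict(list)
    # Sorting makes the output order deterministic.
    for w1, w2 in combinations(sorted(words), 2):
        groups[pair_signature(w1, w2)].append((w1, w2))
    for pairs in groups.values():
        for (w1, w2), (w3, w4) in combinations(pairs, 2):
            # Skip matches that reuse a word, as the original does.
            if not {w1, w2} & {w3, w4}:
                yield w1, w2, w3, w4
```

Unlike the original generator, this sketch yields each match once rather than once per pair order, so the `seen` filter in `process` would see fewer candidates.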

image-processing-pipeline.txt

+- Users upload a receipt; the service extracts data from it and rewards the user
+- Stack
+    - Ubuntu, nginx, redis, s3, mako, mysql, tornado
+    - OpenCV, NumPy, IMagick, Tesseract
+    - Mongo + Hadoop
+- Pipeline: pre process -> OCR -> parsing -> scoring -> select best
+                    <-------------------------+
+    - Internal part runs many times
+- Upload can be several images (long receipt) 
+- Pre processing (OpenCV + NumPy):
+    [1]
+    - color -> b/w
+    - unblur / sharpen
+    - un-highlight color regions
+    - adaptive thresholding
+    [2]
+    - Cropping (carpet story)
+    [3]
+    - Extracting lines (line recognition)
+- Tesseract OCR
+    - Had to train on receipt font
+    - Created shopping dictionary
+- Use Levenshtein distance for fuzzy matching
+- Handling errors
+    - Never lose originals
+    - Have re-run capabilities
+- 80% accuracy on good pictures
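
The notes mention Levenshtein distance for fuzzy matching against a shopping dictionary. A minimal sketch of that idea follows, assuming OCR tokens are matched to the nearest dictionary word within a small edit budget. The function names, the sample words, and the `max_distance=2` cutoff are illustrative assumptions, not details from the talk.

```python
def levenshtein(a, b):
    """Edit distance between a and b (insert, delete, substitute each cost 1)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


def closest_word(token, dictionary, max_distance=2):
    """Return the nearest dictionary word, or None if nothing is close enough."""
    if not dictionary:
        return None
    best = min(dictionary, key=lambda word: levenshtein(token, word))
    return best if levenshtein(token, best) <= max_distance else None
```

For example, an OCR misread such as `m1lk` would map to `milk` under this cutoff, while an unrelated token would be left unmatched.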
+
+- OpenStreetMap
+- Mapnik, ??? (server side)
+- OpenLayers, Leaflet (client side)
+- Geo stack: Varnish, nginx, gunicorn, TileStache, Memcached, Mapnik, PostgreSQL
+- For realtime you need a stateful connection
+    - polling is bad
+- Need pubsub queue
+    - Celery is awesome
+- Change the stack to have node.js with Socket.IO to give replies and Celery in
+  the backend (see diagram)
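
To illustrate why the notes prefer a pub/sub queue over polling, here is a minimal in-process sketch using only the standard library. It is not Celery's or Socket.IO's API; the `Broker` class and the topic name are hypothetical and stand in for whatever message broker the real stack uses.

```python
import queue
from collections import defaultdict


class Broker:
    """Toy publish/subscribe hub: each subscriber gets its own inbox queue."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic):
        """Register interest in a topic and return the inbox to read from."""
        inbox = queue.Queue()
        self._subscribers[topic].append(inbox)
        return inbox

    def publish(self, topic, message):
        """Push message to every subscriber; return how many received it."""
        inboxes = self._subscribers.get(topic, [])
        for inbox in inboxes:
            inbox.put(message)
        return len(inboxes)


broker = Broker()
client = broker.subscribe("tile-updates")
broker.publish("tile-updates", {"zoom": 12, "x": 2200, "y": 1343})
update = client.get_nowait()  # delivered by the publisher, no polling loop
```

The subscriber does no work until a message arrives, which is the property a stateful connection gives the browser in the stack described above.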
+- PyQt/PyQwt, matplotlib, VTK, Mayavi
+- HDF5 for storage
+- PyQwt good for interactive plot

teaching-ipython.txt

+- Cross platform
+- No switching windows
+    - Confuses students
+- ipythonblocks
+    - with animation
+- Help system (?, . + <TAB>)
+- ipython_nose
+- Can save and re-open later
+- Problems
+    - installing
+    - gdb
+    - deleting cell does not remove variables
+- Can style notebook with hidden HTML cell
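
The "deleting cell does not remove variables" problem can be shown without a notebook. The sketch below models the kernel as a plain dictionary shared by all cells, which is a simplification and not how IPython is implemented; `kernel_namespace` and `run_cell` are hypothetical names.

```python
# Stand-in for the namespace that every notebook cell executes in.
kernel_namespace = {}


def run_cell(source):
    """Execute one cell's source in the shared namespace."""
    exec(source, kernel_namespace)


cells = ["total = 40 + 2"]
run_cell(cells[0])

# Deleting the cell removes only its source text from the notebook.
del cells[0]
still_defined = "total" in kernel_namespace

# The variable goes away only when it is removed from the namespace itself,
# which is what `del total` in a new cell would do.
kernel_namespace.pop("total")
```

In a real session, running `del total` or the `%reset` magic clears the leftover state.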
+- System to see who's at the front door + 1-2min notification
+- Proximity - Take picture - Upload - SMS
+   Arduino      RaspberryPi
+- Around $90 (webcam and USB hub free from junk pile)
+- Uses Twilio to send SMS
+- Download image for pi, ssh to it and write code
+    - Proximity sensor on /dev/ttyACM0 (outputs distance in inches)
+- fswebcam
+- Uses git sync to upload image
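
The proximity-to-picture step can be sketched as below. This assumes the sensor writes one numeric distance per line to the serial device and that a reading under some threshold means someone is at the door. The 36-inch threshold and all function names are hypothetical, and the Twilio and git upload steps are left out.

```python
def parse_distance(line):
    """Return the distance in inches from one sensor line, or None if malformed."""
    try:
        return float(line.strip())
    except ValueError:
        return None


def visitor_detected(line, threshold_inches=36.0):
    """True when the reading is valid and closer than the threshold."""
    distance = parse_distance(line)
    return distance is not None and distance < threshold_inches


def capture_command(image_path):
    """Argument list that asks fswebcam to save a frame to image_path."""
    return ["fswebcam", image_path]
```

A main loop would read lines from the serial device, call `visitor_detected` on each, and pass `capture_command(...)` to `subprocess.run` when it returns true.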