Miki Tebeka committed 0c17268

mark

Files changed (5)

Empty file added.

big-data-algorithms.txt

+- Skip lists, HyperLogLog counting, Bloom filters and count-min sketch
+- "Large is hard, infinite is much easier."
+- Often there's a lot of data that doesn't matter to the computation
+    - Example: the mean with the lower 8 bits removed
+- Skip List
+- A good hash function is essentially like a good random number generator
+- HyperLogLog - find the longest run of 0s in the hashes of the objects.
+  It tells you about how many distinct objects you saw
+    - Like coin flips - the longest run of heads hints at how many flips
+- Bloom filter
+    - Only false positives (never false negatives)
+    - He uses it in genome sequencing to reduce time
+- ipython blocks
+-   def __init__(self):
+        self.attr = self._make_attr()  # _make_attr is "pure"
+- Freeze data to make it immutable, can thaw it later
+- Example with a _frozen flag: while it is True, __setitem__ will raise
+    - Good for finding where legacy code mutates things
+- Coroutines: push-style example with yield and send
+    - Harder to debug
+- Some problems are not well suited to functional programming
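
The Bloom filter note above ("only false positives") can be illustrated with a minimal sketch. This is not the speaker's code: the class name, the bit-array size, the number of hashes and the salted SHA-256 scheme are all assumptions made for the example.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: a bit array plus k hash positions per item."""

    def __init__(self, num_bits=1024, num_hashes=3):
        # Sizes are arbitrary illustration values, not tuned for any workload.
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per "bit", for clarity

    def _positions(self, item):
        # Assumption: derive k positions by salting a single hash function.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def __contains__(self, item):
        # True means "possibly seen"; False means "definitely not seen".
        return all(self.bits[pos] for pos in self._positions(item))


seen = BloomFilter()
for kmer in ["ACGT", "TTGA", "GGCC"]:  # made-up k-mers
    seen.add(kmer)
```

Every item that was added always tests as present, so there are no false negatives. An item that was never added may still test as present if its positions happen to collide with bits set by other items.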
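
The freeze/thaw note above can be sketched roughly as follows. The `FreezableDict` name and the `freeze`/`thaw` methods are hypothetical; only the `_frozen` flag and the raising `__setitem__` come from the notes.

```python
class FreezableDict(dict):
    """dict whose item assignment can be switched off and on again."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._frozen = False

    def freeze(self):
        self._frozen = True

    def thaw(self):
        self._frozen = False

    def __setitem__(self, key, value):
        # Only item assignment is guarded in this sketch; update(), pop()
        # and friends would need the same check in real code.
        if self._frozen:
            raise TypeError("cannot assign to a frozen FreezableDict")
        super().__setitem__(key, value)


config = FreezableDict(debug=False)
config.freeze()
try:
    config["debug"] = True  # a mutation like those hiding in legacy code
except TypeError as err:
    caught = str(err)
config.thaw()
config["debug"] = True  # allowed again after thawing
```

Freezing a shared object and running the existing code is one way to make hidden mutations fail loudly at the line that performs them.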

visualizing-github-1.txt

+* acquire parse filter mine | represent refine interact
+* Ben Fry's book on information representation (he wrote Processing)
+* acquire is usually the hardest
+* get a meaningful subset of the data (since we're rate limited)
+* IPython and later tmux
+* ec2 + mongodb
+* celery + Heroku

visualizing-github-2.txt

+- We produce a *story* (about selection visualization)
+- Data as storytelling is new (vs words/picture/music ...)
+- Context: Medium and Audience
+- The eye candy trap (beautiful noise is still just noise)
+- Mapping data onto meaningful visuals
+- D3
+- Animation can help explain an element overloaded with info
+- Can use text, but not too much
+- Stored data in files and loaded it with D3
+    - Browser caching works for you
+- JSON has types but is bloated for data
+- Ended up using CSV with JSON schema
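
The "CSV with JSON schema" idea can be sketched as below. The talk loaded its files with D3 in the browser; this is a Python illustration of the same idea, and the schema layout, column names and sample rows are all invented for the example.

```python
import csv
import io
import json

# Hypothetical schema: column name -> type name. The notes do not say
# what the real schema looked like.
SCHEMA_JSON = '{"repo": "str", "stars": "int", "ratio": "float"}'
CSV_TEXT = "repo,stars,ratio\nalpha,12,0.5\nbeta,7,0.25\n"

CASTS = {"str": str, "int": int, "float": float}


def load_typed(csv_text, schema_json):
    """Parse CSV rows and cast each column as the schema describes."""
    schema = json.loads(schema_json)
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        {name: CASTS[schema[name]](value) for name, value in row.items()}
        for row in reader
    ]


rows = load_typed(CSV_TEXT, SCHEMA_JSON)
```

The CSV stays compact because column names appear once in the header, while the small schema restores the type information that plain CSV lacks.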