demeter-course / Syllabus

|  TIME   |  DURATION | DESCRIPTION
-------------------------------------------------------------------------- 
| 0 min   |   20 min  | Intro to Demeter (specs) and Hadoop 
                        - What is Hadoop?
                        - Demeter vs Minerva
                        - advantages 
                        - use case: AILUN
                            - Perl/MySQL app
                            - transformed to Python scripts + Impala queries
                        - EMR Search
-------------------------------------------------------------------------- 
| 20 min  |   20 min  | HDFS
                        - How to navigate HDFS and basic data loading
                        - Sqoop (Hue interface)
                        - example: Sqoop from EMR or Krakatoa
-------------------------------------------------------------------------- 
| 40 min  |   20 min  | File formats - text/zipped files, Avro and Parquet
-------------------------------------------------------------------------- 
| 60 min  |   10 min  | BREAK
-------------------------------------------------------------------------- 
| 70 min  |   30 min  | Hive and Impala Introduction and Examples 
                        - Revisit AILUN
                        - BAM Queries
-------------------------------------------------------------------------- 
| 100 min |   20 min  | MapReduce
                        - AILUN MapReduce overview
                        - MapReduce Streaming in Python
                        - example: PubMed abstracts
