demeter-course / Syllabus
| TIME    | DURATION | DESCRIPTION
--------------------------------------------------------------------------
| 0 min   | 20 min   | Intro to Demeter (specs) and Hadoop
                       - What is Hadoop?
                       - Demeter vs Minerva
                       - advantages
                       - use case: AILUN
                       - Perl/MySQL app
                       - transformed to Python scripts + Impala queries
                       - EMR Search
--------------------------------------------------------------------------
| 20 min  | 20 min   | HDFS
                       - How to navigate HDFS and basic data loading
                       - sqoop (Hue interface)
                       - example: sqoop from EMR or Krakatoa
--------------------------------------------------------------------------
| 40 min  | 20 min   | File formats
                       - text/zipped files, Avro and Parquet
--------------------------------------------------------------------------
| 60 min  | 10 min   | BREAK
--------------------------------------------------------------------------
| 70 min  | 30 min   | Hive and Impala Introduction and Examples
                       - Revisit AILUN
                       - BAM Queries
--------------------------------------------------------------------------
| 100 min | 20 min   | Map Reduce (AILUN MR overview)
                       Map Reduce Streaming in Python, PubMed Abstracts Example
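
As a preview of the Map Reduce Streaming in Python session, here is a minimal sketch of the mapper/reducer pattern applied to a word count over abstract text. It is not the course's actual PubMed example: the function names `map_words` and `reduce_counts` and the two sample abstracts are made up for illustration, and the sort step is simulated locally with `sorted()` rather than performed by Hadoop.

```python
from itertools import groupby


def map_words(lines):
    """Mapper step: emit one tab-separated 'word<TAB>1' record per word."""
    for line in lines:
        for word in line.lower().split():
            yield f"{word}\t1"


def reduce_counts(records):
    """Reducer step: sum the counts for each word.

    Assumes the records arrive sorted by key, which is what the
    shuffle/sort phase provides between the map and reduce steps.
    """
    pairs = (record.rstrip("\n").split("\t", 1) for record in records)
    for word, group in groupby(pairs, key=lambda pair: pair[0]):
        total = sum(int(count) for _, count in group)
        yield f"{word}\t{total}"


# Hypothetical sample input standing in for abstract text.
abstracts = ["Gene expression in mice", "gene therapy in humans"]

# sorted() stands in for the shuffle/sort phase in this local simulation.
shuffled = sorted(map_words(abstracts))
result = list(reduce_counts(shuffled))

for record in result:
    print(record)
```

In an actual streaming job the mapper and the reducer would typically be two separate scripts that read records from standard input and write them to standard output; they are kept as plain functions here so the logic can be tried without a cluster.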