Material developed for Texas A&M University - Commerce course CSci 574: Introduction to Machine Learning and Data Analysis. These materials were developed in the Fall 2013 semester. The original iPython notebooks were created in 2012 by Hannes Schulz, Andreas Mueller and Nenard Birešev at the University of Bonn (original github repo). I have taken the original material and expanded it for our course.
The materials have been updated for the Fall 2015 semester. Materials from Dr. Ng's Coursera machine learning course were used extensively for the updates and assignments.
- The main content of the course will be delivered as interactive iPython/Jupyter notebooks.
- The lecture notebooks can be found in the lectures subdirectory, and all assignments for the course will be uploaded to the assignments subdirectory.
Suggested material for learning Python:
- Think Python: How to Think Like a Computer Scientist, a free online textbook and a very good resource not only for Python but for learning to program in general.
- Google Developers Python Class, a short course with videos that might be helpful for those looking for video tutorials on Python.
- The Software Carpentry section on learning Python is also very good, and it also includes videos.
Companion Textbooks on Machine Learning:
Segaran. (2007). Programming Collective Intelligence. Already 6 years old, so a bit out of date, and as far as I know there are no new editions. But I will develop some of my optimization and decision tree examples from it. Code examples for the book are available.
Conway & White. (2012). Machine Learning for Hackers. A github repo of the book's source and data is available. Case studies for this book are written in R. The site Will it Python has example reimplementations in iPython notebooks.
Before the end of the first week of class, you need to get a working Python distribution installed on your personal machine, and you need to clone a copy of our class repository. The following video should help you get started.
In order to do the class lectures and readings, you need to be able to run Jupyter notebooks. In general, you need to complete the following four steps.
- Download and install a Python distribution, such as the Anaconda or Enthought Python distribution, that includes support for Jupyter notebooks.
- Download and install a git client on your machine.
- Clone the class repository.
- Test your setup: check that Python runs, that you can start Jupyter notebooks, and that you can open and execute the course lecture notebooks.
Download and Install a Python Distribution
For this course we recommend using a Python scientific distribution. We recommend using the Anaconda distribution, though the Enthought Canopy distribution should be fine as well.
Whether you are using Windows, Mac or Linux, the linked installers should work for you. We are using Python version 3.x for this class, so please download and install the 3.x version of the installer. Python 2.x will probably work fine, but all of the libraries and code we are using have been successfully moved over to Python 3, so you should use Python 3 if at all possible.
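Once the distribution is installed, a quick way to confirm which interpreter you are running is a small check like the one below. This is only a minimal sketch: the helper name `is_supported_python` is made up for illustration and is not part of the course materials.

```python
import sys


def is_supported_python(version_info=sys.version_info, required_major=3):
    """Return True when the major version matches the one used in class.

    `is_supported_python` is a hypothetical helper written for this example.
    """
    return version_info[0] == required_major


# Report the interpreter that is actually running this script.
print("Python version:", ".".join(str(part) for part in sys.version_info[:3]))
print("Matches the 3.x requirement:", is_supported_python())
```

If the second line reports `False`, you are most likely running a Python 2.x interpreter and should install the 3.x version of the distribution.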
Download and Install Git Client
For Linux or Mac users, if git is not already installed, the easiest route is probably to install it with the standard package manager of your OS. For Windows, or to install it by hand on Linux/Mac, you should get the package from the SCM git site:
Clone Class Repository
The class repository for our course can be found at: https://bitbucket.org/dharter/ml-python-class
To clone the repository from a DOS terminal or command line prompt, once git is installed, run the following:
$ git clone https://bitbucket.org/dharter/ml-python-class.git
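To double check the two previous steps from Python, something like the following sketch may help. It assumes the clone was made with the default directory name `ml-python-class` and that the script is run from the parent directory of the clone. The helper names `git_available` and `looks_like_clone` are hypothetical.

```python
import shutil
from pathlib import Path


def git_available():
    """Return True if a `git` executable can be found on the PATH."""
    return shutil.which("git") is not None


def looks_like_clone(repo_dir):
    """Return True if repo_dir contains a `.git` entry, as a git clone does."""
    return (Path(repo_dir) / ".git").exists()


# Assumption: the repository was cloned into the default directory name.
print("git on PATH:", git_available())
print("class repository cloned here:", looks_like_clone("ml-python-class"))
```

If the second check prints `False`, you may simply be in a different directory than the one where you ran `git clone`.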
Test Python, Jupyter and Class Notebooks
There are multiple ways to start up a Jupyter notebook server on your system once you have Python and Jupyter installed. From a DOS prompt or the command line, first change to the directory where you cloned the class repository, and then execute the command
$ jupyter notebook
This will start up a notebook server and, on most systems, open a file browser in your default web browser, where you can browse and select iPython notebooks to execute.
More tips on installing scikit-learn can be found on the scikit-learn website. If you used the Enthought Python distribution, I believe scikit-learn will be installed for you as part of that distribution.
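As a final check, you can ask Python which libraries it can find before opening the lecture notebooks. The sketch below is illustrative only: the package list is an assumption (only scikit-learn and Jupyter are named in this document; `numpy`, `scipy` and `matplotlib` are guesses at what the notebooks use), and `missing_packages` is a hypothetical helper.

```python
import importlib.util


def missing_packages(names):
    """Return the subset of top-level module names that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumption: this list is a guess, adjust it to what the notebooks import.
# Note that scikit-learn is imported under the module name `sklearn`.
COURSE_PACKAGES = ["notebook", "sklearn", "numpy", "scipy", "matplotlib"]

missing = missing_packages(COURSE_PACKAGES)
if missing:
    print("Not found:", ", ".join(missing))
else:
    print("All listed packages were found.")
```

Anything reported as not found can usually be installed through the package manager of your Python distribution.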