Course Description

Overview

An introduction to machine learning, with an emphasis on implementing algorithms in Java that learn from "big data" in order to make predictions.

Requirements

To be successful in this course, students are required to...

  • read articles on machine learning concepts and innovations.
  • attend lecture and take notes to learn about concepts and examples that are not easily understood from the reading alone.
  • engage in class activities to hone problem-solving skills related to machine learning.
  • complete in-class quizzes based on the reading and previous lectures.
  • complete 4-5 programming projects based on the concepts and skills learned in the course.
  • complete a mid-term exam and a comprehensive final exam to demonstrate subject knowledge.


Instructor

Eric Breimer

...has been teaching programming, data structures, theory of computation, web application development, and management information systems courses at Siena since 2002.


Jan 19

Tuesday - 

Day 1

Intro to Machine Learning

Overview of Topics, Supervised vs. Unsupervised Learning, Intro to Models & Cost Functions

Read before Jan 26

Jan 21

Thursday - 

Day 2

Parameter Learning

Gradient Descent for Linear Regression

houses.csv Gradient Descent Linear Regression

HW1 Assigned

Jan 26

Tuesday - 

Day 3

Parameter Learning

Gradient Descent for Linear Regression

QUIZ 1 on reading & basics of linear regression

Jan 28

Thursday - 

Day 4

Linear Algebra & Octave

Implementing Gradient Descent with Octave

gradientDescent.m

data.txt

Feb 2

Tuesday - 

Day 5

Linear Algebra & Octave

Matrix operations: addition, subtraction, scalar multiplication, matrix multiplication, Octave: input and output

HW1 DUE Midnight

All code

Feb 4

Thursday - 

No Class

Feb 9

Tuesday - 

Day 6

Multivariate Linear Regression

Multiple features, gradient descent for multiple variables, feature scaling, learning rate, polynomial regression

HW2 Assigned

Feb 11

Thursday - 

Day 7

Multivariate Linear Regression

Multiple Variable Linear Regression GradientDescent.java houses.xlsx houses.csv

Feb 16

Tuesday - 

Day 8

Classification

Hypothesis representation, decision boundary

QUIZ 2: Multiple Variable Linear Regression

Feb 18

Thursday - 

Day 9

Logistic Regression

Cost Function, Simplified Cost Function and Gradient Descent, Advanced Optimization

HW2 DUE Friday Feb 19 by Midnight

Feb 23

Tuesday - 

No Class

Spring Break

Feb 25

Thursday - 

No Class

Spring Break

Mar 1

Tuesday - 

Day 10

HW2 DUE by Midnight 5% late penalty, otherwise zero

Multiclass Classification

Extending Logistic Regression to handle data with more than two classes

Multiclass Logistic Regression logr.xlsx

Mar 3

Thursday - 

Day 11

Midterm Review

 

Mar 8

Tuesday - 

Day 12

 

Midterm Exam - Part 1

Mar 10

Thursday - 

Day 13

 

Midterm Exam - Part 2

Mar 15

Tuesday - 

Day 14

Neural Networks

Ng's Notes

My Slides

Mar 17

Thursday - 

Day 15

 

 

Mar 22

Tuesday - 

Day 16

Backprop

nn.xlsx My Slides

Mar 24

Thursday - 

No Class

 

 

Mar 29

Tuesday - 

Day 17

LG and NN in Octave

HW3 Assigned

Octave Code

Mar 31

Thursday - 

Day 18

LG and NN in Octave

 

Apr 5

Tuesday - 

Day 19

LG and NN in Octave

Minimizers

Apr 7

Thursday - 

Day 20

Process input & Weka Intro

Weka Manual 3.6

Octave File I/O

Apr 12

Tuesday - 

Day 21

LG and NN in Weka

Precision vs Recall

Apr 14

Thursday - 

Day 22

Precision & Recall

Precision.docx

Precision.xlxs

Apr 19

Tuesday - 

Day 23

HW3 Work Day

 

Apr 21

Thursday - 

Day 24

SVMs

Apr 26

Tuesday - 

Day 25

SVMs & K-Means

Slides

Weighting.xlxs

Apr 28

Thursday - 

Day 26

 

HW3 DUE Friday Apr 29 by Midnight

May 3

Tuesday - 

Reading Day

 

 

May 4-7

Thursday - 

Final Exams

 

 

CSIS-400: Machine Learning

Lecture: Tuesday & Thursday, 11:25am-12:40pm, RB 328

Instructor

Dr. Eric Breimer, ebreimer@siena.edu, 786-5084, RB 320

Office Hours

Prerequisite

CSIS-210: Data Structures or Sufficient Programming Background, especially with arrays

Required Text

None

Required Software (free)

Course Learning Goals

  1. To expose students to concepts and methods in machine learning.
  2. To enhance the student's problem-solving abilities for data-intensive problems.
  3. To further develop the student's ability to develop programming solutions, in particular, programs that can learn from data.
  4. To give students a basic set of machine learning tools applicable to a variety of problems.
  5. To teach students critical analysis of machine learning approaches so that the student can determine when a particular technique is applicable to a given problem.

Topics

For a complete list of topics covered, see the Course Schedule.

Course Details & Grading Weights

Reading

Rather than use a textbook, students will read passages from a variety of sources including scholarly papers.

Attendance & In-class Activities (10%)

Each week, students will complete short activities in lecture in preparation for the homework and a group project.

Homework (10%)

Students will complete 4-5 homework assignments that will include implementing basic algorithms in Java and Octave.

Group Project (30%)

Students will also complete a group (3-4 students) project that will require researching a problem, finding data, implementing an algorithm, producing a poster and making a public presentation.

In-class Quizzes (10%)

Each week, there will be a short quiz to give students feedback in preparation for the mid-term and final exams.

Midterm Exam (15%)

The midterm exam will include a mix of multiple-choice, fill-in and explanation-type questions. A few questions will require students to write basic machine learning algorithms in Java and Octave.

Cumulative Final Exam (25%)

During final exam week, students will take a 2-hour cumulative final exam that will be similar in format to the midterm exam. It will include material covered on the midterm exam but will focus more on the new material covered in the second half of the course.
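The grading weights above add up to 100%, so the numeric final average is a weighted sum of the six components. The following is only a sketch of that arithmetic, assuming every component is scored on a 0-100 scale; the class name, method name, and sample scores are hypothetical and not part of the course materials.

```java
// Illustrative sketch only: names and sample scores are assumptions.
class FinalAverage {
    // Weights in percent, in the order the components are listed above:
    // attendance/activities, homework, group project, quizzes, midterm, final.
    static final double[] WEIGHTS = {10, 10, 30, 10, 15, 25};

    // Each score is assumed to be on a 0-100 scale.
    static double weightedAverage(double[] scores) {
        if (scores.length != WEIGHTS.length) {
            throw new IllegalArgumentException("Expected " + WEIGHTS.length + " scores");
        }
        double total = 0.0;
        for (int i = 0; i < WEIGHTS.length; i++) {
            total += scores[i] * WEIGHTS[i];
        }
        return total / 100.0; // weights sum to 100
    }

    public static void main(String[] args) {
        // Hypothetical student: 100, 90, 85, 80, 70, 88
        double[] scores = {100, 90, 85, 80, 70, 88};
        System.out.println(weightedAverage(scores)); // prints 85.0
    }
}
```

Note that the group project alone carries as much weight as the attendance, homework, and quiz components combined.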

Grading

Letter grades will be assigned based on your numeric final average:

A  >= 93.0    A- >= 90.0    B+ >= 87.0
B  >= 83.0    B- >= 80.0    C+ >= 77.0
C  >= 73.0    C- >= 70.0    D+ >= 67.0
D  >= 63.0    D- >= 60.0    F  <  60.0
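The cutoffs above amount to a simple lookup from highest to lowest. The sketch below assumes the final average is compared against the cutoffs without any rounding, which this page does not specify; the class and method names are hypothetical.

```java
// Illustrative sketch only: assumes no rounding of the final average.
class LetterGrade {
    // Cutoffs and letters taken from the table above, highest first.
    static final double[] CUTOFFS = {93, 90, 87, 83, 80, 77, 73, 70, 67, 63, 60};
    static final String[] LETTERS = {"A", "A-", "B+", "B", "B-", "C+", "C", "C-", "D+", "D", "D-"};

    static String letterFor(double finalAverage) {
        for (int i = 0; i < CUTOFFS.length; i++) {
            if (finalAverage >= CUTOFFS[i]) {
                return LETTERS[i];
            }
        }
        return "F"; // anything below 60.0
    }

    public static void main(String[] args) {
        System.out.println(letterFor(85.0)); // prints B
    }
}
```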

Lecture Attendance

Students are expected to attend every lecture, and it is each student's responsibility to be aware of this policy. Students can receive up to a 10% penalty toward their final average for excessive absence, lateness, or disruption during lecture. Students will be given a warning if they are more than 2 minutes late to lecture. After a warning, subsequent lateness will be recorded. Students who are more than 10 minutes late will be marked absent and penalties will be incurred. Students can have two unexcused absences and two lateness warnings without any penalty. Beyond those two, students will receive a 1% penalty for each unexcused absence and a 0.5% penalty for each unexcused lateness (maximum of 10% total penalty).
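As a worked example of the penalty arithmetic, the sketch below follows one reading of the policy: the first two unexcused absences and the first two latenesses are free, each additional absence costs 1%, each additional lateness costs 0.5%, and the total is capped at 10%. This interpretation, the class name, and the method name are assumptions, and penalties for disruption are not modeled; the instructor's actual bookkeeping may differ.

```java
// Illustrative sketch only: one possible reading of the attendance policy.
class AttendancePenalty {
    static final int FREE_ABSENCES = 2;   // unexcused absences without penalty
    static final int FREE_LATENESSES = 2; // lateness warnings without penalty
    static final double MAX_PENALTY = 10.0;

    // Returns the penalty in percentage points of the final average.
    static double penalty(int unexcusedAbsences, int latenesses) {
        int chargedAbsences = Math.max(0, unexcusedAbsences - FREE_ABSENCES);
        int chargedLatenesses = Math.max(0, latenesses - FREE_LATENESSES);
        double total = chargedAbsences * 1.0 + chargedLatenesses * 0.5;
        return Math.min(MAX_PENALTY, total);
    }

    public static void main(String[] args) {
        // 4 unexcused absences and 5 latenesses: 2 * 1.0 + 3 * 0.5
        System.out.println(penalty(4, 5)); // prints 3.5
    }
}
```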

Excused Absences

Lecture:

Students can be excused (and not penalized) from lecture for illnesses, job interviews, and serious commitments such as athletic or academic trips/competitions. However, students must inform the instructor as soon as possible, provide proof/documentation, and take responsibility to acquire notes and information from other students. The following rules will be strictly enforced:

The instructor makes the final decision to excuse or not to excuse an absence. If you are concerned that an absence will not be excused, you should contact the instructor as soon as possible.

Pandemic/Emergency Preparedness

College Policies

Academic Integrity

Exams:

Students caught cheating on an exam will receive a zero on the exam and will be penalized a full letter-grade in the course, and a letter describing the student's actions will be sent to Siena's Vice President of Academic Affairs. During an exam period, students cannot share information, look at each other's tests, or use unauthorized materials. Unless specific clarification is given, exams are closed-book and closed-notes, no cheat sheets are allowed, and electronic device usage is prohibited.

Plagiarism on Code:

It is very easy to copy code from other sources and claim it as your own. This is academically dishonest and considered plagiarism. Students who present other authors' code, documents, or programs as their own will receive a grade of zero on the entire project or lab. Students who commit plagiarism a second time will again receive a zero, but will also be penalized a full letter-grade in the course and a letter describing the student's violation will be sent to Siena's Vice President of Academic Affairs. Use the following guidelines to avoid code plagiarism:

Do NOT copy code:

You should never use copied code (from the Internet, peers, or other sources). Instead, put the copied code away and try to write the code on your own. If you cannot explain your own code and it happens to match code from other sources, you will be accused of plagiarism.

Do NOT share your code:

While it is natural for students to help each other outside of lab, students retain more knowledge if they attempt to write and debug code on their own. It is acceptable for students to help each other understand general concepts, but students are prohibited from sharing their code or writing code for another student. The only exception is when you are working with a designated partner for a lecture activity or group project, and in these cases, the only collaboration and sharing permitted is between designated lecture activity pairs or group members.

Ask for appropriate help:

It is appropriate to ask for or provide help solving a coding problem as long as it is done in a general or abstract way. Appropriate examples include helping a peer understand an error message, sharing debugging strategies, or explaining a concept related to a specific problem. However, it is inappropriate to have any other students (including tutors) solve your problems directly. Seeking excessive help is a form of cheating. Inappropriate help includes asking a peer or tutor to write code for you, looking at another student's working solution, or receiving excessive (step-by-step) help in directly completing an assignment.

Strive to be independent:

An important goal in this course is for students to continue to hone strategies for becoming more independent with respect to problem solving, coding, and debugging. Requiring excessive help indicates that you have not put forth independent effort on a homework or project. The best way to become an independent programmer is to program often. Experiment with a few lines of code (compile, test and debug constantly). The design of this course will naturally require you to do the programming activities needed to be successful. But, if someone else completes these activities for you, it will show on the midterm and final exam when you are asked to write code without help.