CSIS-400 Machine Learning
Course Description
Overview
An introduction to machine learning with an emphasis on implementing algorithms in Java that learn from "big data" to make predictions.
Requirements
To be successful in this course, students are required to:
- read articles on machine learning concepts and innovations.
- attend lecture and take notes to learn about concepts and examples that are not easily understood from the reading alone.
- engage in class activities to hone problem-solving skills related to machine learning.
- complete in-class quizzes based on the reading and previous lectures.
- complete 4-5 programming projects based on the concepts and skills learned in the course.
- complete a mid-term exam and a comprehensive final exam to demonstrate subject knowledge.
Instructor
Eric Breimer
...has been teaching programming, data structures, theory of computation, web application development, and management information systems courses at Siena since 2002.
Schedule
Jan 19
Tuesday - 
Day 1
Intro to Machine Learning
Overview of Topics, Supervised vs. Unsupervised Learning, Intro to Models & Cost Functions
Jan 21
Thursday - 
Day 2
Parameter Learning
Gradient Descent for Linear Regression
Jan 26
Tuesday - 
Day 3
Parameter Learning
Gradient Descent for Linear Regression
QUIZ 1 on reading & basics of linear regression
Jan 28
Thursday - 
Day 4
Linear Algebra & Octave
Implementing Gradient Descent with Octave
Feb 2
Tuesday - 
Day 5
Linear Algebra & Octave
Matrix operations: addition, subtraction, scalar multiplication, matrix multiplication, Octave: input and output
HW1 DUE Midnight
Feb 4
Thursday - 
No Class
Feb 9
Tuesday - 
Day 6
Multivariate Linear Regression
Multiple features, gradient descent for multiple variables, feature scaling, learning rate, polynomial regression
Feb 11
Thursday - 
Day 7
Multivariate Linear Regression
Multiple Variable Linear Regression GradientDescent.java houses.xlsx houses.csv
Feb 16
Tuesday - 
Day 8
Classification
Hypothesis representation, decision boundary
QUIZ 2: Multiple Variable Linear Regression
Feb 18
Thursday - 
Day 9
Logistic Regression
Cost Function, Simplified Cost Function and Gradient Descent, Advanced Optimization
HW2 DUE Friday Feb 19 by Midnight
Feb 23
Tuesday - 
No Class
Spring Break
Feb 25
Thursday - 
No Class
Spring Break
Mar 1
Tuesday - 
Day 10
HW2 DUE by Midnight (5% late penalty; otherwise zero)
Multiclass Classification
Extending Logistic Regression to handle data with more than two classes
Mar 3
Thursday - 
Day 11
Midterm Review
Mar 8
Tuesday - 
Day 12
Midterm Exam - Part 1
Mar 10
Thursday - 
Day 13
Midterm Exam - Part 2
Mar 17
Thursday - 
Day 15
Mar 31
Thursday - 
Day 18
LG and NN in Octave
Apr 19
Tuesday - 
Day 23
HW3 Work Day
Apr 21
Thursday - 
Day 24
SVMs
Apr 28
Thursday - 
Day 26
HW3 DUE Friday Apr 29 by Midnight
May 3
Tuesday - 
Reading Day
May 4-7
Thursday - 
Final Exams
Syllabus
CSIS-400: Machine Learning
Lecture: Tuesday & Thursday, 11:25am-12:40pm, RB 328
Prerequisite
CSIS-210: Data Structures or Sufficient Programming Background, especially with arrays
Required Text
None
Required Software (free)
Course Learning Goals
- To expose students to concepts and methods in machine learning.
- To enhance the student's problem-solving abilities for data-intensive problems.
- To further develop the student's ability to develop programming solutions, in particular, programs that can learn from data.
- To give students a basic set of machine learning tools applicable to a variety of problems.
- To teach students critical analysis of machine learning approaches so that the student can determine when a particular technique is applicable to a given problem.
Topics
For a complete list of topics covered, see the Course Schedule.
- Basic Linear Algebra
- Supervised vs. Unsupervised Learning
- Linear, Polynomial, and Logistic Regression
- Gradient Descent & the Normal Equation
- Learning Rate, Normalization & Regularization
- Classification & Representation
- The Problem of Overfitting
- Neural Networks & Non-linear Hypotheses
- Support Vector Machines: Large Margin Classification & Kernels
- Clustering, K-Means Algorithm
- Principal Component Analysis
- Anomaly Detection
- Dealing with Large Data Sets
Course Details & Grading Weights
Reading
Attendance & In-class Activities (10%)
Homework (10%)
Group Project (30%)
In-class Quizzes (10%)
Midterm Exam (15%)
Cumulative Final Exam (25%)
Grading
Letter grades will be assigned based on your numeric final average:
A | >= 93.0 | A- | >= 90.0 | B+ | >= 87.0 |
B | >= 83.0 | B- | >= 80.0 | C+ | >= 77.0 |
C | >= 73.0 | C- | >= 70.0 | D+ | >= 67.0 |
D | >= 63.0 | D- | >= 60.0 | F | < 60.0 |
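The weights and cutoffs above can be sketched in Java as a rough illustration, not an official grade calculator. The class name `GradeSketch`, the method names, and the sample scores are hypothetical. The sketch assumes each category is scored from 0 to 100 and that no rounding is applied before the cutoffs are checked; the syllabus does not state either point.

```java
public class GradeSketch {
    // Category weights in percent, in syllabus order (assumed 0-100 scores per category):
    // attendance/activities, homework, group project, quizzes, midterm, final exam.
    static final double[] WEIGHTS = {10, 10, 30, 10, 15, 25};

    // Letter-grade cutoffs from the grading table, highest first.
    static final double[] CUTOFFS = {93, 90, 87, 83, 80, 77, 73, 70, 67, 63, 60};
    static final String[] LETTERS = {"A", "A-", "B+", "B", "B-", "C+", "C", "C-", "D+", "D", "D-"};

    // Weighted average of the six category scores.
    static double finalAverage(double[] scores) {
        if (scores.length != WEIGHTS.length) {
            throw new IllegalArgumentException("expected " + WEIGHTS.length + " category scores");
        }
        double total = 0.0;
        for (int i = 0; i < WEIGHTS.length; i++) {
            total += WEIGHTS[i] * scores[i];
        }
        return total / 100.0;
    }

    // First cutoff the average meets determines the letter; below 60.0 is an F.
    static String letterGrade(double average) {
        for (int i = 0; i < CUTOFFS.length; i++) {
            if (average >= CUTOFFS[i]) {
                return LETTERS[i];
            }
        }
        return "F";
    }

    public static void main(String[] args) {
        // Hypothetical scores for illustration only.
        double[] scores = {100, 90, 85, 80, 75, 88};
        double average = finalAverage(scores);
        System.out.println(average + " -> " + letterGrade(average));
    }
}
```

Any attendance penalty described under Lecture Attendance is not included here.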
Lecture Attendance
Students are expected to attend every lecture, and it is each student's responsibility to be aware of this policy. Students can receive up to a 10% penalty toward their final average for excessive absence, lateness, or disruption during lecture. Students will be given a warning if they are more than 2 minutes late to lecture. After a warning, subsequent lateness will be recorded. Students who are more than 10 minutes late will be marked absent and penalties will be incurred. Students can have two unexcused absences and two lateness warnings without any penalty. After two, students will receive a 1% penalty for each unexcused absence and a 0.5% penalty for each unexcused lateness (maximum of 10% total penalty).
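The penalty arithmetic in this policy can be sketched as follows. This is one reading of the rule, not the instructor's official calculation: it assumes the two free absences and the two free lateness warnings are counted separately, and it ignores penalties for disruption. The class and method names are hypothetical.

```java
public class AttendancePenalty {
    // Returns the penalty, in percentage points, applied to the final average.
    // Assumption: the first two unexcused absences and the first two latenesses
    // are each free, and the two allowances are tracked independently.
    static double penalty(int unexcusedAbsences, int unexcusedLatenesses) {
        int penalizedAbsences = Math.max(0, unexcusedAbsences - 2);
        int penalizedLatenesses = Math.max(0, unexcusedLatenesses - 2);
        double raw = 1.0 * penalizedAbsences + 0.5 * penalizedLatenesses;
        return Math.min(10.0, raw); // capped at 10% total
    }

    public static void main(String[] args) {
        // Example: 4 unexcused absences and 3 latenesses.
        System.out.println(penalty(4, 3));
    }
}
```

Under this reading, four unexcused absences and three latenesses give 2 x 1% + 1 x 0.5% = 2.5%.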
Excused Absences
Lecture:
Students can be excused (and not penalized) from lecture for illnesses, job interviews, and serious commitments such as athletic or academic trips/competitions. However, students must inform the instructor as soon as possible, provide proof/documentation, and take responsibility to acquire notes and information from other students. The following rules will be strictly enforced:
- Practices (athletics), regularly scheduled extracurricular activities, and weekly obligations in other courses will NOT be considered valid excuses for missing lecture. Students should not register for this class if such activities conflict with the scheduled meeting time.
- Travel to athletic games that is documented by the Athletic Department is a valid excuse. However, if you are going to miss more than four lectures due to games, it is recommended that you drop the course.
- For illness or medical emergencies, students WILL have to show documentation (a doctor's note, release form, receipt or equivalent) that verifies the excuse. If an illness is not serious enough to go to a doctor then it is not serious enough to be accepted by the instructor.
- For family emergencies, funerals, or other serious commitments, students should contact the office of Student Affairs or Academic Affairs. If the emergency is serious, ask an authorized school official to contact all your instructors regarding your absence. If an excuse is not serious enough to contact an authorized school official then the excuse is not serious enough to be accepted by the instructor.
The instructor makes the final decision to excuse or not to excuse an absence. If you are concerned that an absence will not be excused, you should contact the instructor as soon as possible.
Pandemic/Emergency Preparedness
- Bring all texts and a copy of the syllabus/course schedule home with you in the event of a College Closure. The Academic Calendar will be adjusted upon Reopening, so be prepared for the possibility of a short mini-semester, a rescheduled class/exam period, and/or rescheduling of the semester, depending on the length of the Closure.
- If your situation permits, you should continue with readings and assignments to the best of your ability, per the course schedule.
- As needed, I will give you instructions on how to handle paper assignments that require library or other research.
- I will hold online office hours to maintain contact with my students. You will be able to “check-in” with questions that you have. If you do not have internet access available, I will also provide my home phone number and home address, as needed. Remember, internet, mail delivery, and telephone services may also be impacted by a Pandemic or other emergency event.
- Finally, stay informed about the College’s status and Reopening schedule by monitoring the Siena web site, www.siena.edu.
College Policies
- Academic integrity policy
- Accommodations policy
- Emergency preparedness
- Attendance policy
- Cell phone use
Academic Integrity
Exams:
Students caught cheating on an exam will receive a zero on the exam and will be penalized a full letter-grade in the course, and a letter describing the student's actions will be sent to Siena's Vice President of Academic Affairs. During an exam period, students cannot share information, look at each other's tests, or use unauthorized materials. Unless specific clarification is given, exams are closed-book and closed-notes, no cheat sheets are allowed, and electronic device usage is prohibited.
Plagiarism on Code:
It is very easy to copy code from other sources and claim it as your own. This is academically dishonest and considered plagiarism. Students who present other authors' code, documents, or programs as their own will receive a grade of zero on the entire project or lab. Students who commit plagiarism a second time will again receive a zero, but will also be penalized a full letter-grade in the course and a letter describing the student's violation will be sent to Siena's Vice President of Academic Affairs. Use the following guidelines to avoid code plagiarism:
Do NOT copy code:
You should never use copied code (from the Internet, peers, or other sources). Instead, put the copied code away and try to write the code on your own. If you cannot explain your own code and if it happens to match code from other sources, you will be accused of plagiarism.
Do NOT share your code:
While it is natural for students to help each other outside of lab, students retain more knowledge if they attempt to write and debug code on their own. It is acceptable for students to help each other understand general concepts, but students are prohibited from sharing their code or writing code for another student. The only exception is when you are working with a designated partner for a lecture activity or group project, and in these cases, the only collaboration and sharing permitted is between designated lecture activity pairs or group members.
Ask for appropriate help:
It is appropriate to ask for or provide help solving a coding problem as long as it is done in a general or abstract way. Appropriate examples include: helping a peer understand an error message, sharing debugging strategies, or explaining a concept related to a specific problem. But it is inappropriate to have any other students (including tutors) solve your problems directly. Seeking excessive help is a form of cheating. Inappropriate help includes: asking a peer or tutor to write code for you, looking at another student's working solution, or receiving excessive (step-by-step) help in directly completing an assignment.
Strive to be independent:
An important goal in this course is for students to continue to hone strategies for becoming more independent with respect to problem solving, coding, and debugging. Requiring excessive help indicates that you have not put forth independent effort on a homework or project. The best way to become an independent programmer is to program often. Experiment with a few lines of code (compile, test and debug constantly). The design of this course will naturally require you to do the programming activities needed to be successful. But, if someone else completes these activities for you, it will show on the midterm and final exam when you are asked to write code without help.