Syllabus#
Welcome to CPSC 330! Below is the course syllabus, which contains many of the important details about the course.
Course description#
Application of machine learning tools, with an emphasis on solving practical problems. Data cleaning, feature extraction, supervised and unsupervised machine learning, reproducible workflows, and communicating results.
Class meetings#
Lectures:
| Section | Day | Time | Location |
|---|---|---|---|
| CPSC 330 | MWF | 10:00 - 12:50 PM | DMP 310 |
Tutorials:
| Section | Days | Time | Location | TA |
|---|---|---|---|---|
| D1B | T and Th | 10:00-11:00 | DMP 201 | Yifei (T) / Joseph (Th) |
| D1C | T and Th | 11:00-12:00 | DMP 201 | Justice (T) / Yifei (Th) |
| D1E | W and F | 9:00-10:00 | DMP 201 | Jun He (W) / Kimia (F) |
| D1F | W and F | 1:00-2:00 | DMP 201 | Atabak (W) / Shadab (F) |
Tutorials for this course will be conducted by TAs, who will guide you through additional exercises and demos on the content covered each week. Attendance at tutorials is not counted toward your final grade, but you are expected to attend in order to succeed in the course. Participating will allow you to see more examples than it is possible to cover in class, engage in more personalized discussions with TAs, and get valuable one-on-one time to deepen your understanding of machine learning concepts.
Student hours:
| Teaching Team member | Day | Time | Location |
|---|---|---|---|
| Dr. Moosvi | Monday | 1:00 - 1:30 | DMP 310 |
| Dr. Moosvi | Wednesday | 1:00 - 1:30 | DMP 310 |
| Dr. Moosvi | Friday | 1:00 - 1:30 | DMP 310 |
| Niki Duan | Thursday | 12:00 - 1:00 | ICCS X139 and Zoom |
| Shashwat Suri | Thursday | 3:00 - 4:00 | TBD and Zoom |
| Patrick Cui | Friday | 2:00 - 3:00 | ICCS X139 and Zoom |
| Junhe Cui | Friday | 6:00 - 7:00 | TBD and Zoom |
Teaching Team#
Instructor:
Firas Moosvi, Student Hours: MWF after class, 1:00 - 1:30 PM, DMP 310
Course co-ordinator#
Ancuta (Anca) Barbu (cpsc330-admin@cs.ubc.ca)
Please reach out to Anca for: admin questions, extensions, academic concessions etc. Include a descriptive subject, your name and student number, and course section so that we can keep track of emails.
TAs#
Jun He Cui
Patrick Cui
Niki Duan
Atabak Eghbal
Kimia Rostin
Justice Sefas
Shadab Shaikh
Joseph Soo
Shashwat Suri
Yifei Xia
Mahsa Zarei
Kelly Zhu
Course Learning Objectives#
The course is designed for a diverse group of students with varying backgrounds, including those from Computer Science and Statistics. It serves as a gentle introduction to machine learning, yet its applied nature is valuable even for those already familiar with the field. The curriculum covers foundational concepts of machine learning and data science, with topics ranging from data preprocessing and supervised learning to clustering, recommendation systems, text data processing, a high-level introduction to neural networks, time series, and survival analysis. A key focus of the course is developing hands-on skills in model development, evaluation, interpretation, ethical considerations, and clear communication.
By the end of this course, the students should be able to:
Describe supervised learning and its suitability for various tasks.
Explain key machine learning concepts such as classification, regression, overfitting, and the trade-off in model complexity.
Identify appropriate data preprocessing techniques for specific scenarios, provide reasons for their selection, and integrate them into machine learning pipelines.
Develop an intuitive understanding of common supervised machine learning algorithms.
Build end-to-end supervised machine learning pipelines using Python and scikit-learn on different types of datasets.
Understand and differentiate between various evaluation metrics used for classification (e.g., accuracy, precision, recall, F1-score, AP score, AUC-ROC) and regression (e.g., mean absolute error, mean squared error, R-squared). Apply the appropriate evaluation metric based on the problem context and interpret the results to assess model performance.
Recognize the significance of feature engineering in improving model performance.
Compare and contrast various feature selection techniques such as model-based feature selection and recursive feature elimination.
Analyze and interpret feature importances to gain insights into the relevance of different features.
Understand the fundamental concepts behind ensemble methods, including averaging and stacking. Use popular ensemble models like Random Forest, LGBM, CatBoost and appreciate their advantages in improving predictive accuracy and mitigating overfitting.
Understand the principles and algorithms behind clustering methods such as K-means, hierarchical clustering, and DBSCAN. Apply clustering techniques to segment data into meaningful groups.
Intuitively understand the core concepts behind recommendation algorithms, including collaborative filtering and content-based filtering.
Delve into word embeddings like Word2Vec and GloVe, understanding their significance in capturing semantic relationships in textual data.
Develop an intuitive understanding of neural networks, their advantages and drawbacks in machine learning contexts, and their superiority in handling image data.
Familiarize yourself with time series data, its appropriate use cases, how to manage data splitting challenges, conduct feature engineering, forecast future time points, and grasp core concepts such as trends and irregular time intervals.
Acquire an understanding of right-censored data, its implications, and the importance of specialized approaches like survival analysis; apply, interpret, and make predictions using tools such as the lifelines package, Kaplan-Meier curves, and Cox proportional hazards models in Python.
Develop an understanding of the ethical implications surrounding data collection, processing, modeling, and interpretation; critically evaluate potential biases, privacy concerns, and societal impacts.
Cultivate advanced communication skills tailored to diverse audiences, emphasizing reader-centric writing, contextual understanding, and critical evaluation of visualizations to ensure clarity, accuracy, and relevance in conveying ML insights and implications to stakeholders.
Describe the goals and challenges of model deployment.
Registration#
Waitlists:
The general seats available in this class usually fill up very quickly. Once the general seats are taken, the only way to register for the course is to sign up for the waiting list. For questions about the waiting list policies, see here. You should sign up for the waiting list even if it is long; a lot of students tend to drop courses. Signing up for the waiting list also makes it more likely that we will open up extra sections, expand class sizes, or offer additional courses on these topics. The instructors have no control over the situation and cannot help you bypass the waiting list.
Because all course material is available to all students through this repository, including those on the waitlist, all students are expected to complete all the assignments by the assigned deadline, regardless of the date on which they joined the course. The course moves at a fast pace and the first weeks cover fundamental concepts that will serve you for the entire semester. You do not want to miss them or find yourself racing to catch up.
Prerequisites: The official prerequisites can be found here. If you do not meet the prerequisites, see here and here. We were told that students should not visit the front desk in the CS main office about prerequisite issues, because the folks at the front desk do not have the authority to resolve prerequisite issues.
In practice, the prerequisite is familiarity with Python programming.
Auditing: If the course is full, we cannot accommodate official auditors. If there is space and you would like to audit the course, please contact the instructor. All UBC students are welcome to audit the course unofficially.
Grading scheme#
The grading scheme for the course is as follows:
| Component | Weight | Location |
|---|---|---|
| Learning Logs | 5% | PrairieLearn |
| Assignments | 25% | |
| Midterm 1 | 20% | PrairieLearn (CBTF) |
| Midterm 2 | 20% | PrairieLearn (CBTF) |
| Final | 30% | PrairieLearn (CBTF) |
Passing Requirements#
All students must satisfy ALL of the following conditions to pass the course:
1. Pass the Assignments component with a grade of at least 40%.
2. Pass the Midterms and Final Exam together with a weighted average grade of at least 50%.
3. Pass the Final Exam with a grade of at least 40%.
If a student does not satisfy these requirements, the student will be assigned the lower of their earned course grade or a maximum overall grade of 45 in the course. In exceptional cases (with approved concessions), passing requirements may be waived at the discretion of the course instructor; if waived, the student will earn a maximum grade of 50% in the course.
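To see how the weights and passing requirements interact, here is a minimal Python sketch. It is an illustration only, not the official grade calculation. The function name `course_grade` and the score dictionary are hypothetical, and the sketch assumes that the "weighted average" of the midterms and final uses the same 20/20/30 weights as the grading scheme. The concession waiver described above is not modelled.

```python
# Weights from the grading scheme table (percent of the course grade).
WEIGHTS = {
    "learning_logs": 5,
    "assignments": 25,
    "midterm1": 20,
    "midterm2": 20,
    "final": 30,
}
EXAM_COMPONENTS = ("midterm1", "midterm2", "final")
FAIL_CAP = 45.0  # maximum overall grade when a passing requirement is not met


def course_grade(scores):
    """Return (grade, passed) for component scores given as percentages (0-100)."""
    earned = sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100

    # Assumption: the exam average is weighted by the course weights (20/20/30).
    exam_weight = sum(WEIGHTS[name] for name in EXAM_COMPONENTS)
    exam_average = (
        sum(WEIGHTS[name] * scores[name] for name in EXAM_COMPONENTS) / exam_weight
    )

    passed = (
        scores["assignments"] >= 40   # requirement 1
        and exam_average >= 50        # requirement 2
        and scores["final"] >= 40     # requirement 3
    )
    # If any requirement fails, the grade is the lower of the earned grade and the cap.
    grade = earned if passed else min(earned, FAIL_CAP)
    return grade, passed


# Hypothetical student: strong coursework, but a final exam below 40%.
example = {
    "learning_logs": 100,
    "assignments": 90,
    "midterm1": 60,
    "midterm2": 50,
    "final": 35,
}
print(course_grade(example))  # (45.0, False)
```

In this hypothetical case the earned grade would be 60, but because the final exam is below 40% the assigned grade is capped at 45.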
iClicker (not for course credit)#
In this version of the course, we will not be awarding points for iClicker questions during lectures. This does not mean students are encouraged to miss lectures! These questions are critical for making you think and challenge yourself during lectures. They are designed to help and facilitate your learning, so please do make an earnest effort when providing your answers.
Assignments#
The plan is that most of the assignments will contribute equally towards the overall Assignments grade, but changes to reflect particularly long or short assignments may be possible. We will drop your lowest homework grade. Some flexibility in the assignment submissions is allowed (see Late policy below). See this document for more detailed instructions on submitting homework assignments.
For the full policy on grades, see this document. We understand that grades are important for you for several reasons. But try not to focus too much on them. You will have a better learning experience and in general, you’ll be happier in life if you focus more on learning the material well. For the grading scheme we wish we could use this.
Late policy
Assignments will be due at 10:00 PM on the due date. If you cannot make this due date, you may use a “late token”. For example:
If an assignment is due on a Monday at 10:00 PM:
Handing it in anytime on Tuesday will cost you 1 late token (irrespective of whether it’s a holiday).
Handing it in anytime on Wednesday will cost you 2 late tokens (irrespective of whether it’s a holiday).
Each student will have 4 late tokens for the entire semester. We will track their use, so no action is required on your end. Just be aware of when you spend tokens as a result of a late submission, and keep track of how many you have left.
There is no penalty for using “late tokens”, but you will get a mark of 0 on an assignment if you:
Use more than 2 late tokens on the assignment.
Use more than 4 late tokens across all assignments.
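The token rules above can be sketched in Python as follows. This is an informal illustration, not the course's actual tracking system. The function names are hypothetical, and two points the policy does not spell out are treated as assumptions: a submission made after the deadline but still on the due date costs 1 token, and a rejected submission spends no tokens.

```python
from datetime import datetime

MAX_TOKENS_PER_ASSIGNMENT = 2
MAX_TOKENS_PER_TERM = 4


def tokens_needed(due, submitted):
    """Late tokens a submission costs, counted in calendar days past the due date."""
    if submitted <= due:
        return 0
    days_late = (submitted.date() - due.date()).days
    # Assumption: submitting after the deadline on the due date itself costs 1 token.
    return max(1, days_late)


def check_submission(due, submitted, tokens_used_so_far):
    """Return (accepted, tokens_spent); a rejected submission gets a mark of 0."""
    needed = tokens_needed(due, submitted)
    if needed > MAX_TOKENS_PER_ASSIGNMENT:
        return False, 0  # more than 2 tokens on one assignment
    if tokens_used_so_far + needed > MAX_TOKENS_PER_TERM:
        return False, 0  # more than 4 tokens across all assignments
    # Assumption: tokens are only spent when the submission is accepted.
    return True, needed


# Illustrative deadline: a Monday at 22:00 (the date itself is made up).
due = datetime(2024, 9, 16, 22, 0)
tuesday = datetime(2024, 9, 17, 9, 0)
wednesday = datetime(2024, 9, 18, 23, 0)
thursday = datetime(2024, 9, 19, 8, 0)
```

With these values, `check_submission(due, tuesday, 0)` accepts the work for 1 token and `check_submission(due, wednesday, 0)` accepts it for 2 tokens. `check_submission(due, thursday, 0)` rejects it, as does `check_submission(due, wednesday, 3)`, because only 1 token would remain.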
Lecture recordings#
This is an in-person class, and we do not livestream or make recordings available by default. If you miss a class, you can catch up by reviewing the lecture notes and associated videos, and talking to your peers.
Use of AI in the course#
Use of AI-based content generation tools, or AI tools, is permitted for assignments and project work in CPSC 330. It is not allowed during the midterms or the final exam.
Additionally, students are required to disclose any use of AI tools for each assignment. This includes:
Referencing the tool used
Including any prompts used to query
Including the output of the prompts and a discussion of if/how you modified the result
Failure to follow this policy will be considered a violation of UBC’s academic policy.
When using AI tools for your assignments, be mindful of their impact on your learning. Consider carefully whether they are improving or hindering your learning, and make a conscious decision about their use.
Midterms#
There will be two midterms in CPSC 330, and both will be conducted in the CBTF via self-reservation over a three-day period. The CBTF (Computer-Based Testing Facility) is designed to enhance the student’s writing experience by providing them with a familiar, secure testing environment with quick access to technical support, as well as support from their instructor for common access issues.
Closer to the midterm dates, the instructors will communicate more details regarding the exam content, how to register for a time slot, what to do in case it is not possible to take the exam, and other relevant information. In general, students are expected to take the exam during their registered slot. Seats are limited and if you miss your registered slot it may not be possible to provide an alternative time, unless it is for a serious, documented reason. If something prevents you from attending one of the midterms, contact the course coordinator immediately.
Centre for Accessibility (CfA) Exam Accommodations#
Students who are registered with the Centre for Accessibility (CfA) with exam accommodations listed below will need to write all of their assessments in the Computer-Based Testing Facility (CBTF). The CBTF will provide the following accommodations:
Extended-time (up to 4x)
Distraction-reduced environment
Close proximity to washroom
Phone permitted for medical purposes
Medical equipment/supplies/food
If you have an accommodation that is not listed above, you will write your assessments with the CfA and will need to book a time by their deadline. Please do not book any assessments with the CfA if you are expected to write in the CBTF, as the CfA will cancel the exam booking and ask you to book it yourself with the CBTF. If you have any concerns about your accommodations being met in the CBTF, please reach out to your Accessibility Advisor.
You are encouraged to see the CBTF page.
Final exam#
The final exam is scheduled for the exam period and is likely to be comprehensive, covering the material taught over the course of the semester. See the Passing Requirements section for more details about the final exam.
Schedule#
The tentative schedule is posted here.
Academic concessions#
UBC has a policy on academic concession for cases in which a student may be unable to complete coursework. According to this policy, grounds for academic concession can be illness, conflicting responsibilities, or compassionate grounds. Examples of compassionate grounds, from the above policy, include “a traumatic event experienced by the student, a family member, or a close friend; an act of sexual assault or other sexual misconduct experienced by the student, a family member, or a close friend; a death in the family or of a close friend.” To request an academic concession, please write to the course coordinator (cpsc330-admin@cs.ubc.ca), with your section instructor copied in the email. Additional documentation might be requested. We will review your situation and determine whether to approve the concession, and if approved, the appropriate steps to follow.
Code of conduct#
This course follows the departmental policy available here: https://my.cs.ubc.ca/docs/collaboration-plagiarism. If you are not sure whether or not what you plan to do constitutes academic misconduct, consult the policy, an instructor or the course coordinator.
In general, the following is expected:
Do not submit work that you did not author. If portions of code are being reused, declare all sources.
If you plan to engage in non-course-related activity in lecture (Facebook, YouTube, chatting with friends, etc), please sit in the last two rows of the room to avoid distracting your classmates.
Do not distribute any course materials (slides, homework assignments, solutions, notes, etc.) without permission.
Do not photograph or record lectures (audio or video) without permission.
If you commit to working with a partner on an assignment, do your fair share of the work.
If you have a problem or complaint, let the instructor(s) know immediately. Maybe we can fix it!
During the exam period, do not disclose, discuss, or share any part of the exam with any other individual, except as directly permitted or required by the course instructors. This includes discussion in person, online, or through any electronic means. Violation of this will result in academic penalties, which may include failure of the exam or failure of the course.
Land acknowledgement#
UBC’s Point Grey Campus is located on the traditional, ancestral, and unceded territory of the xʷməθkʷəy̓əm (Musqueam) people. The land it is situated on has always been a place of learning for the Musqueam people, who for millennia have passed on their culture, history, and traditions from one generation to the next on this site.
It’s important that this recognition of Musqueam territory and our relationship with the Musqueam people does not appear as just a formality. Take a moment to appreciate the meaning behind the words we use:
TRADITIONAL recognizes lands traditionally used and/or occupied by the Musqueam people or other First Nations in other parts of the country.
ANCESTRAL recognizes land that is handed down from generation to generation.
UNCEDED refers to land that was not turned over to the Crown (government) by a treaty or other agreement.
As you proceed through your journey at UBC, take some time to learn about the history of this land and to honour its original inhabitants.