Introduction
- Lecture: S0-Intro
- Version: current
- Please Read: NumPy Tutorial
- Recorded Videos: M2 / M3
Attention: the following markdown text was generated automatically from the corresponding PowerPoint lecture file. Errors and misformatting therefore do exist (a lot!).
Lecture 1 Study Guide
Course Logistics and Overview
Course Staff and Communication
- Instructor: Prof. Yanjun Qi (Pronounced: /ch ee/)
- Communication: Email and Slack (details in the Course Overview)
- TA and Office Hours: Information available on the Course Website
Course Materials
- Textbook: None required; multiple reference books shared via Course Website
- Official Content: Anything not mentioned in Prof. Qi’s slides is not an official topic
- Resources:
- All slides, videos, readings, and homework on Course Website
- Assignments/Project: UVA CourseSite
- Quizzes: Google Forms
Background Needed
- Required:
- Calculus
- Basic linear algebra
- Basic probability
- Basic algorithms
- Recommended: Statistics
- Programming: Python (required for all assignments)
Assessment Breakdown
- Weekly Quizzes: 20% (top 10 scores count, 14 total, 10 minutes each, closed book via Google Form)
- Homework Assignments: 65% (HW1-HW5, 10% each)
- Final Project: 10% (apply ML to real-world problems)
- Weekly Reading Question Submissions: 5%
- Final Exam: 15% (December 9th)
- See CourseWeb for detailed policy
Course Format (2025F Flipped Classroom - Starting Week 3)
In-Class Activities
- Pre-class question submission
- Quizzes
- Quiz review
- Assignment review
- In-class solution examples
- Short presentations/code demos
Emphasis
- Active learning
- Understanding core ML concepts
- Applying models (Python, scikit-learn, Keras)
- Analyzing performance and ethics
- Becoming an ML tool builder
Key Objectives
- Learn fundamental principles, algorithm design, methods, and applications
- Build simple machine learning tools from scratch (not just tool users)
- Understand complex machine learning methods at the source code level
Machine Learning History and Concepts
Artificial Intelligence (AI)
- Definition: The study of computer systems that attempt to model and apply the intelligence of the human mind
- Attributes of Intelligence (for machines):
- Perceive
- Understand
- Interact
- Think/Reason/Learn
- Imagine/Make analogy
- Impact: Economic, cultural, social, health disruptions; potential for job automation; existential threat concerns
History of AI/ML (Timeline)
- 1950: Alan Turing’s “Computing Machinery and Intelligence” (“Can machines think?”)
- 1956: Dartmouth workshop, John McCarthy coins “artificial intelligence”
- 1959: Arthur Samuel popularizes “machine learning” (Samuel Checkers-playing Program)
- 1950s-60s: Enthusiasm, early Neural Networks (NNs) popular
- 1970s-80s: Rule-based/Knowledge-based systems, Expert systems
- 1990s: Classic ML (SVM, RF, BN), Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) invented
- 2000s: Growth of large labeled datasets (ImageNet released in 2009), early Deep Learning progress
- 2010s: Deep Learning breakthroughs (2015-2016: CNN/RNN models reach breakthrough performance; Deep Reinforcement Learning, GANs, and the 2017 Transformer architecture emerge)
- 2023-Now: ChatGPT / Generative AI / Agentic AI
Reasons for 2010s Deep Learning Breakthroughs
- Plenty of good quality (labeled) data (text, visual, audio, user activity, knowledge graphs, genomics, medical imaging)
- Advanced computer architecture that fits Deep Learning (e.g., GPUs for faster, more accurate results)
- Powerful, well-engineered machine learning libraries (easy to learn, use, extend)
Recent Advances (2023-Now) on Generative AI
- Definition: AI systems that create content (text, images, audio, video, code), powered by foundation models (transformers, diffusion, LLMs). Distinct from discriminative AI
- Key Applications:
- LLMs (text/chat)
- Diffusion models (images)
- Gen-2/Sora (video)
- GitHub Copilot (code)
- Multi-modal models
- Enabling Factors: Scaling law, data, advanced architecture, powerful libraries
Machine Learning Basics
Goal
Build computer systems that learn and adapt from “experience” (data examples)
Traditional Programming vs. Machine Learning
- Traditional: Computer + Data + Program → Output
- ML (Training Phase): Computer + Input Data (X) + Output (Y) → Program/Model f()
Common Notations
- Inputs (X):
- Matrix (bold capital)
- p variables, n observations
- Xⱼ is the jth input variable
- xⱼ,ᵢ is the ith observed value of variable Xⱼ
- Vectors are column vectors
- Outputs:
- Quantitative (Y)
- Qualitative/Categorical (C)
- Observed Variables: Lowercase
Supervised Learning
Find a function f() to map input space X to output space Y such that the difference between true y and f(x) is small
Example: Linear Binary Classifier
f(x,w,b) = sign(wᵀx + b)
- wᵀx + b > 0 → predict +1
- wᵀx + b < 0 → predict -1
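As an illustrative sketch (the weight vector w, bias b, and example points below are made up for demonstration, not taken from the lecture), the linear binary classifier f(x, w, b) = sign(wᵀx + b) is only a few lines of Python:

```python
def linear_classify(x, w, b):
    """Linear binary classifier: f(x) = sign(w^T x + b).
    Returns +1 if w^T x + b > 0, else -1."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score > 0 else -1

# Illustrative parameters and 2-D input points
w, b = [2.0, -1.0], 0.5
print(linear_classify([1.0, 1.0], w, b))   # score = 2 - 1 + 0.5 = 1.5 -> +1
print(linear_classify([-1.0, 2.0], w, b))  # score = -2 - 2 + 0.5 = -3.5 -> -1
```

The decision boundary is the hyperplane wᵀx + b = 0; which side a point falls on determines its predicted label.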
Training
Learning parameters w, b by minimizing a loss/cost function L() on the training set (available x and y pairs)
Testing
Evaluating performance on “future” (unseen) points by comparing true y to predicted f(x). Key: testing examples are not in the training set
Basic Concepts
- Loss Function: Measures difference between y and f(x) (e.g., hinge loss for binary classification)
- Regularization: Additional information added to the loss function to control f
- Generalization: The ability to learn a function from past data to “explain,” “predict,” “model,” or “control” new, unseen data examples (inductive reasoning)
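The hinge loss and L2 regularization mentioned above can be sketched as follows; this is a minimal illustration, and the regularization weight `lam` is an arbitrary choice, not a value from the lecture:

```python
def hinge_loss(y, score):
    """Hinge loss for binary labels y in {-1, +1}: max(0, 1 - y * f(x)).
    Zero when the point is correctly classified with margin >= 1."""
    return max(0.0, 1.0 - y * score)

def regularized_loss(w, b, data, lam=0.1):
    """Average hinge loss over (x, y) pairs plus an L2 penalty on w,
    i.e., L() with additional information added to control f."""
    score = lambda x: sum(wj * xj for wj, xj in zip(w, x)) + b
    data_loss = sum(hinge_loss(y, score(x)) for x, y in data) / len(data)
    l2_penalty = lam * sum(wj * wj for wj in w)
    return data_loss + l2_penalty

# A correctly classified point with margin >= 1 contributes only the L2 term
print(regularized_loss([1.0, 0.0], 0.0, [([2.0, 0.0], 1)]))  # -> 0.1
```

Training then amounts to choosing w and b that make this regularized loss small on the available (x, y) pairs.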
Two Modes of ML
- Training: Input-output pairs (X, Y) are fed into a model to learn f()
- Testing/Production: Unseen input X’ is fed into the learned model f() to produce f(X’) (predicted output)
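The two modes can be sketched end to end with a simple perceptron-style learner (an illustrative training rule, not the specific algorithm from the lecture; the toy data points are made up):

```python
def train_perceptron(data, epochs=20, lr=1.0):
    """Training mode: learn (w, b) from labeled pairs (x, y), y in {-1, +1}."""
    dim = len(data[0][0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        for x, y in data:
            score = sum(wj * xj for wj, xj in zip(w, x)) + b
            if y * score <= 0:  # misclassified: nudge the boundary toward x
                w = [wj + lr * y * xj for wj, xj in zip(w, x)]
                b += lr * y
    return w, b

def predict(x, w, b):
    """Testing/production mode: apply the learned f() to an unseen input x'."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b > 0 else -1

# Toy linearly separable training set (illustrative)
train = [([2.0, 2.0], 1), ([1.5, 3.0], 1), ([-1.0, -1.0], -1), ([-2.0, 0.5], -1)]
w, b = train_perceptron(train)
print(predict([3.0, 1.0], w, b))  # an unseen point -> 1
```

Note the key point from the lecture: the input to `predict` is not in the training set, which is exactly what makes the evaluation a test of generalization.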
Three Desired Properties of ML Models
- Robustness
- Computation
- Prediction
General Lessons for Excellence
- Good breadth in fundamentals is key
- Strength in particular targeted topics helps stand out
Recommended Extra-Curriculum Books
- Master Algorithm by Pedro Domingos (explores different “tribes” of ML: Symbolists, Connectionists, Evolutionists, Bayesians, Analogizers)
- Homo Deus: A Brief History of Tomorrow by Yuval Noah Harari
Quiz: UVA CS 4774: Machine Learning - Introduction
Instructions: Answer each question in 2-3 sentences.
- What is the main distinction between “Traditional Programming” and the “Machine Learning (training phase)” as illustrated in the lecture?
- Name three specific reasons cited in the lecture for the significant breakthroughs in Deep Learning during the 2010s.
- Define “Generative AI” and explain how it differs from “discriminative AI” as presented in the source material.
- In the context of supervised learning, what is the purpose of the “training phase” for a linear binary classifier, and what is being minimized?
- What is the concept of “generalization” in machine learning, and why is it crucial for model performance?
Quiz Answer Key
- Machine Learning Goal: The primary goal of machine learning is to build computer systems that can learn and adapt from their “experience.” In this context, “experience” refers to the available data examples, also known as instances or samples, which are described with various properties.
- Traditional vs ML Programming: Traditional Programming involves a computer executing a pre-defined program on data to produce an output. In contrast, the Machine Learning training phase takes input data (X) and corresponding outputs (Y) to learn or create the program/model f().
- Deep Learning Breakthroughs: The significant breakthroughs in Deep Learning during the 2010s were primarily due to the availability of plenty of good quality labeled data, advanced computer architectures suitable for Deep Learning (like GPUs), and powerful, well-engineered machine learning libraries.
- Generative AI: Generative AI refers to AI systems that are capable of creating new content, such as text, images, or code, powered by foundation models. This differs from discriminative AI, which primarily focuses on tasks like classification rather than content generation.
- Training Phase Purpose: In supervised learning, the training phase for a linear binary classifier aims to learn the parameters w and b. This is achieved by minimizing a loss or cost function L(), which quantifies the difference between the true output y and the model’s prediction f(x) on the available training examples.
- Generalization: Generalization in machine learning is the ability of a model to apply the knowledge learned from past, observed data to effectively “explain,” “predict,” or “model” new, unseen data examples. It is crucial because it indicates whether the model has truly learned the underlying patterns rather than just memorizing the training data.