CS 329T | Syllabus

Schedule & syllabus

The lecture slides, labs, and assignments will be posted here as the course progresses.
Lecture times are 3pm-4:20pm PST on Tuesdays and Thursdays. All deadlines are at 11:59pm PST.

This schedule is subject to change according to the pace of the class.

Date	Description	Materials	Events
		Part I: Background (Week 1)
		Week 1
Tue Sept 26	Trustworthiness of LLMs; Course Overview; Projects; Big picture: LLM tech stack	Slides
Thu Sept 28	Guest Lecture: Jerry Liu (LlamaIndex); LlamaIndex for building LLM apps; TruLens for LLM app evaluation; Intro to Homework 1	Slides; Homework 1 Introduction; Supplemental Materials: LlamaIndex, TruLens	Description: Homework 1 is designed to get you bootstrapped to an LLM prototype and set you up for a project. Homework 1 due Oct 9th on Gradescope.
		Part II: Key LLM Application Areas (Weeks 2, 3, and 4)
		Week 2
Tue Oct 3	Guest Lecture: Isabelle Hau (Stanford GSE), Josh Weiss (Stanford GSE); Application areas (Education); Project ideas -- evaluations	Slides
Thu Oct 5	Guest Lecture: Nicholas Carlini (Google DeepMind), Zifan Wang (Center for AI Safety); Application areas (Security); Adversarial attacks on security; LLMs for security; Project ideas -- evals	Slides; Zifan Wang's Slides	Homework 1 Due Oct 9th
		Week 3
Tue Oct 10	Guest Lecture: Monica Agrawal (LayerHealth), Divya Gopinath (LayerHealth); Application areas (Healthcare); Project ideas - evals	Guest Lecture Slides
Thu Oct 12	Space of evaluations: Groundedness, Consistency, Confidence and Uncertainty, Adversarial attacks, Privacy, Fairness; Summarize application areas and explore Trustworthiness angles; Project ideas	Slides	Final project group formations due Saturday Oct. 14th. Refer to Ed for more info.
		Week 4
Tue Oct 17	Project proposals and feedback
Thu Oct 19	Project proposals and feedback
		Part III: LLM Evaluations (Weeks 5 and beyond)
		Week 5
Tue Oct 24	RAG triad: Context relevance, groundedness, QA relevance; Relevance; Groundedness evaluations; Definition, Techniques, Tools	Slides; References: Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks; TRUE: Re-evaluating Factual Consistency Evaluation; Do Language Models Know When They're Hallucinating References?; RARR: Researching and Revising What Language Models Say, Using Language Models; The Internal State of an LLM Knows When It's Lying; SELFCHECKGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models; Measuring Reliability of Large Language Models through Semantic Consistency
Thu Oct 26	Guest Lecture: Eric Mitchell (Stanford University); Confidence, Calibration, Uncertainty; Chelsea Finn’s work on Calibration; Yarin Gal’s work on Uncertainty; Self-Consistency, GD-Consistency, Prompt-Consistency and other topics	Slides; References: Generating with Confidence: Uncertainty Quantification for Black-box Large Language Models; Semantic Uncertainty: Linguistic Invariances for Uncertainty Estimation in Natural Language Generation; Just Ask for Calibration: Strategies for Eliciting Calibrated Confidence Scores from Language Models Fine-Tuned with Human Feedback; Teaching models to express their uncertainty in words; Reducing conversational agents’ overconfidence through linguistic calibration
		Week 6
Tue Oct 31	Guest Lecture: Juhan Bae, Cem Anil (University of Toronto, Anthropic); Explainability: Influence functions; LLM training data privacy: membership inference	Slides; References: Studying Large Language Model Generalization with Influence Functions; Understanding Black-box Predictions via Influence Functions; Estimating Training Data Influence by Tracing Gradient Descent; Representer Point Selection for Explaining Deep Neural Networks
Thu Nov 2	Explainability: Attributions (Mechanistic interpretability); IG for text; Influence patterns for BERT models	Slides; References: Axiomatic Attribution for Deep Networks; The Explanation Game: Explaining Machine Learning Models Using Shapley Values; Influence Patterns for Explaining Information Flow in BERT
		Week 7
Tue Nov 7	No Class (Democracy Day)
Thu Nov 9	Project mid-term presentations and feedback
		Week 8
Tue Nov 14	Project mid-term presentations and feedback
Thu Nov 16	Lecture on LLM agents and multi-modal LLMs	Slides: Evaluating Agents; Slides: Evaluating Multi-Modal RAGs
		Thanksgiving Break (Nov 21, Nov 23)
		Part IV: Project Presentations
		Week 9
Tue Nov 28	No class, extra time to work on final project
Thu Nov 30	Final project presentations (3 - 5 pm, in Allen 101X)
		Week 10
Tue Dec 5	Poster and demo session (Fujitsu Conference Room, 4th floor of Gates)
Thu Dec 7	No class