This course studies the theory, design, and implementation of
text-based search engines. The core components include statistical
characteristics of text, representation of information needs and
documents, several important retrieval models, and experimental
evaluation. The course also covers common elements of commercial
search engines, for example, integration of diverse search engines
into a single search service ("federated search", "vertical
search"), personalized search results, diverse search results, and
sponsored search. The software architecture components include
design and implementation of large-scale, distributed search engines.
This is a full-semester lecture-oriented course worth 12 units.
By the end of the course, students are expected to have
developed the skills listed below.
|Eligibility:||This course is open to all students who meet the prerequisites.|
This course requires good programming skills and an understanding of
computer architectures and operating systems (e.g., memory vs. disk
trade-offs). A basic understanding of probability, statistics, and
linear algebra is helpful. Thus students should have preparation
comparable to the following CMU undergraduate courses.
|Time & Location:||Tu/Th 10:30-11:50, WEH 7500|
The textbook is Introduction to Information Retrieval,
by Christopher D. Manning, Prabhakar Raghavan, and Hinrich Schütze,
Cambridge University Press, 2008. You may use the
printed copy or the online copy, but note that the reading instructions
refer to the printed copy.
There are additional selected readings, which will be available through the class web page (this page).
Online access to some materials (additional readings, lecture notes, datasets, etc.) is restricted to the .cmu.edu domain. CMU students, faculty, and staff can get access from outside .cmu.edu (e.g., from home) using CMU's WebVPN Service.
A discussion forum is provided for students to ask questions, answer questions, and discuss class-related topics. You must register to access the discussion forum. Please provide a CMU email address when you join (you may add other email addresses, too). We will periodically remove students who do not have CMU email addresses.
|Homework:||5 assignments that give hands-on experience with techniques discussed in class.|
|Grading:||Weekly reading summaries (10% total), 5 homework assignments (10% each, 50% total), midterm exam (20%), final exam (20%).|
|Grading Scale:||Grades are assigned using a curve.|
|Course policies:||Attendance, Auditing, Laptops & mobile devices, Late homework, Pass/Fail, Plagiarism & cheating, Recording & videotaping, Waitlist|
(subject to revision):
|Advice From The Faculty:||
This course is a lot of work. Take care of yourself. Do your best to maintain a healthy lifestyle this semester by eating well, exercising, avoiding drugs and alcohol, getting enough sleep and taking some time to relax. This will help you achieve your goals and cope with stress.
If you find yourself struggling with the material or workload, please ask for help. All of us benefit from support during times of struggle. You are not alone. There are many helpful resources available on campus and an important part of the college experience is learning how to ask for help. Asking for support sooner rather than later is often helpful.
If you or anyone you know experiences any academic stress, difficult life events, or feelings like anxiety or depression, we strongly encourage you to seek support. Counseling and Psychological Services (CaPS) is here to help: call 412-268-2922 and visit their website at http://www.cmu.edu/counseling/. Consider reaching out to a friend, faculty or family member you trust for help getting connected to the support that can help.