Description: |
Many organizations need to analyze large amounts of text to
discover useful information. For example, a company may want to
monitor how the public discusses its products in social media, or
a forensics team may need to discover the contents of disk drives
seized by law enforcement. This course provides students with an
understanding of common and emerging methods of organizing,
summarizing, and analyzing large collections of unstructured and
lightly-structured text ('text analytics'). The focus is on
algorithms and techniques; however, the course also provides an
introduction to open-source software tools. This is a 6-unit course. It is offered during Mini-2 and Mini-4. |
|||||||||||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Learning Objectives: |
By the end of the course, students are expected to have
developed the following skills. Skills are assessed by
the homework assignments and the final exam.
|
|||||||||||||||||||||||||||||||||
Prerequisites: | None | |||||||||||||||||||||||||||||||||
Time & Location: | Mini A4, Fr 3:00-5:40pm, TOR Classroom 3 | |||||||||||||||||||||||||||||||||
Instructor: | Jamie Callan | |||||||||||||||||||||||||||||||||
Teaching Assistant: | Himani Gupta (himanig@andrew) | |||||||||||||||||||||||||||||||||
Office hours: |
TBD. Office hours are held using a Google Hangout. You will need a Google account and may need to install a browser plug-in. If you are unable to use Google Hangouts, contact Charles by email to make other meeting arrangements. |
|||||||||||||||||||||||||||||||||
Discussion Forum: | A discussion forum is provided for students to ask questions, answer questions, and discuss class-related topics. You will need a Piazza account to use the discussion forum. Please provide a CMU email address when you join the 95-865 discussion (you can use other email addresses, too). We will periodically remove students who do not have CMU email addresses. | |||||||||||||||||||||||||||||||||
Instructional Materials: |
Some lectures have assigned readings from
Introduction to Information
Retrieval, Christopher D. Manning, Prabhakar Raghavan, and Hinrich
Schütze, Cambridge University Press, 2008. The links next to each lecture
provide access to an online version of the text. Some lectures have assigned readings from other papers, as shown in the link next to the lecture. Online access to some materials is restricted to the .cmu.edu domain. CMU people can get access from outside .cmu.edu (e.g., from home) using CMU's WebVPN Service. |
|||||||||||||||||||||||||||||||||
Recorded Lectures: | Recorded lectures are available via the Heinz College video catalog. An Andrew id is required. | |||||||||||||||||||||||||||||||||
Homework: | 3 assignments that give hands-on experience with techniques discussed in class. | |||||||||||||||||||||||||||||||||
Grading: | 3 assignments (3 x 25%) and a final exam (25%). | |||||||||||||||||||||||||||||||||
Grading Scale: | Grades are assigned using a curve. | |||||||||||||||||||||||||||||||||
Course policies: | Attendance, Auditing, Laptops & mobile devices, Late homework, Pass/fail, Plagiarism & cheating, Waitlist | |||||||||||||||||||||||||||||||||
Syllabus (subject to revision): |
|