CSCI 544 — Applied Natural Language Processing
Spring 2017
Time and location
- Monday 4:00pm–7:50pm, THH 201
Instructor
- Ron Artstein: Monday 2:00pm–3:30pm, PHE 514/516, or by appointment
Teaching Assistants
- Justin Garten: Wednesday 12:00pm–2:00pm, SAL computer lab
- Ramesh Manuvinakurike: Monday 10:00am–11:00am, SAL computer lab
- Siddharth Jain: Wednesday 8:30am–10:30am, SAL computer lab
There will be no office hours on January 16 (Martin Luther King’s
Birthday), February 20 (Presidents’ Day), or March 12–19 (Spring Recess).
There will also be no office hours on April 10 because code
demonstrations are being held that day.
Administrative Matters
- Registration and D-Clearance
- Please consult the page on D-clearance
and waiting list (Updated January 13)
- Graders
- If you’ve taken the course before and wish to be considered as a
grader, please apply through the Computer Science Department.
Graders are usually selected in the first few weeks of the semester.
- Travel
- Students who are absent from class for any reason must make up
the materials themselves, and must submit their assignments on time.
The final exam will be administered according to the Final Exam Schedule.
University regulations do not allow a student to omit a final
examination, or take it in advance of its scheduled time.
- Academic integrity
- Please read my Personal note on
academic integrity.
Synopsis
This course covers both fundamental and cutting-edge topics in
Natural Language Processing (NLP) and provides students with hands-on
experience in NLP applications.
This graduate course is intended for:
- students who want to understand current NLP technologies
- students interested in building NLP applications
- students interested in applications of NLP such as sentiment analysis, dialogue systems, and question answering systems
Recommended preparation:
Proficiency in programming, algorithms and data structures, basic
knowledge of linear algebra and machine learning.
Related Courses
This course is part of USC’s
curriculum in natural
language processing. There is a sister course, CSCI 662 Advanced
Natural Language Processing, offered in the Fall semester, which
covers complementary (and advanced) material and is intended for PhD
students (or students who want to continue to a PhD program).
Coursework and Grading
- Homework assignments (8 × 5%): A mix of programming
assignments and written exercises.
- Reading quizzes (10 × 1%): Short, web-based quizzes
relating to reading assignments.
- Midterm exam (10%): In-class exam with questions similar to
those from the homework assignments and quizzes.
- Final exam (10%): In-class exam with questions similar to those
from the homework assignments and quizzes.
- Research project (30%): Group project, graded on the project’s
relevance to the course, degree of difficulty, amount of work,
correctness, and written report.
Grading scale
The following scale is used for determining final grades (note that
A is the highest grade given by USC).
- A: 92%, A–: 90%,
B+: 87%, B: 82%, B–: 80%,
C+: 77%, C: 72%, C–: 70%,
D+: 67%, D: 62%, D–: 60%
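As a rough illustration of how the component weights and the grading scale above combine, here is a minimal Python sketch. It rests on several assumptions not stated in this syllabus: each component is scored as a percentage from 0 to 100, the percentages in the scale are minimum cutoffs, and no rounding is applied. The names `WEIGHTS`, `SCALE`, `weighted_total`, and `letter_grade`, as well as the sample scores, are hypothetical. This is not the official grading procedure.

```python
# Component weights from the "Coursework and Grading" list,
# expressed as fractions of the final grade.
WEIGHTS = {
    "homework": 0.40,  # 8 assignments x 5%
    "quizzes": 0.10,   # 10 quizzes x 1%
    "midterm": 0.10,
    "final": 0.10,
    "project": 0.30,
}

# Grading scale, read here as minimum percentages (an assumption).
SCALE = [
    (92, "A"), (90, "A-"),
    (87, "B+"), (82, "B"), (80, "B-"),
    (77, "C+"), (72, "C"), (70, "C-"),
    (67, "D+"), (62, "D"), (60, "D-"),
]


def weighted_total(scores):
    """Combine per-component percentages (0-100) into a course percentage."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


def letter_grade(total):
    """Return the first letter whose cutoff the total meets, else None."""
    for cutoff, letter in SCALE:
        if total >= cutoff:
            return letter
    return None  # the scale above lists no grade below 60%


# Hypothetical student scores, for illustration only.
example = {"homework": 90, "quizzes": 100, "midterm": 80,
           "final": 85, "project": 88}
total = weighted_total(example)
print(round(total, 1), letter_grade(total))
```

Because the research project carries 30% of the total and the homework 40%, those two components dominate the result in this sketch; the two exams together account for 20%.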
Grade challenges
- Grades on an assignment will only be changed if there is an error
in grading; a student who wishes to challenge a grade must identify
the grading error before asking for a grade change.
- Students are welcome to discuss any aspect of the homework
assignments, but there will be no negotiation on grades, and no
changes other than the correction of grading errors.
Late Policy
- There will be a penalty for turning in homework late; the
penalty will vary by assignment. In no instance will credit be given
to a homework assignment submitted after the solution has been
discussed in class.
- Reading quizzes are intended to make sure that students are
prepared for class, so no late quizzes will be accepted. (One
exception: students who registered late will take the first quiz
after they have registered.)
- Homework assignments will not be accepted by email. If there are
technical or other issues with the submission system you should
write to us and we will work to fix these issues, but do not send
homework by email just because you weren’t able to submit it through
the system.
- Homework is usually due at the end of the day. We may not be
available to solve issues close to the deadline, so you should plan
on submitting your homework early, even if only as a draft. Multiple
submissions are generally allowed, and the last submission will be
graded. Quizzes may only be submitted once.
Communication
Please use the class discussion boards on
Piazza for questions and
issues regarding homework assignments and the course in general. This
way, the entire class can participate and see the questions and
answers. Email should be reserved for communication of a personal
nature. If we receive questions by email where the response could be
helpful for the class, we may ask you to repost the question on the
discussion boards.
Resources
- Blackboard (for reading quizzes and class announcements)
- Piazza (for class discussion boards)
- Vocareum (for coding assignments)
- Crowdmark (for written assignments)
Schedule
Note: The weeks of January 16 and February 20 are
instructional weeks. Class will not be held on these days because they
are university holidays, but work will be assigned for the week and is
due at the appropriate time.
Topics listed in the schedule are tentative and subject to change.
- January 9:
Introduction and basic concepts; Naive Bayes.
- January 16:
No class (MLK Birthday)
- The first homework assignment is due January 20.
- All students who were in the Blackboard system by
January 11 should have received a personalized link by
email in the evening of January 11;
if you didn’t, please check your email junk/spam settings,
and if you still can’t find the email, please write to
Ramesh.
- Students who registered on or after January 12 will
receive the email link after they are added to Blackboard (with
some delay: we have to manually get the new registrations and send
links).
- January 23:
Text classification: generative and discriminative models.
- January 30:
Perceptron and linear classifiers; project discussion.
- February 6:
Part-of-speech tagging and sequence labeling.
- February 13:
Discriminative sequence labeling (conditional random fields).
- February 20:
No class (Presidents’ Day)
- Research project proposals are due February 22.
- The fourth homework assignment is due February 26.
All students should have received a personalized link by email;
if you didn’t, please check your email junk/spam settings,
and if you still can’t find the email, please write to
Ramesh.
- February 27:
Language modeling and speech recognition.
- March 6:
Midterm; Annotation and evaluation.
- March 13:
Spring Recess
- March 20:
Syntax (parsing).
- March 27:
Semantics, discourse, coreference.
- April 3:
Reinforcement learning (Georgila); educational applications (Core).
- April 10:
Dialogue.
- April 17:
Machine translation.
- April 24:
Language generation.
- May 8 at 4:30pm:
Final Exam
Academic Conduct
Plagiarism – presenting someone else’s ideas as your own, either verbatim or recast in your own words – is a serious academic offense with serious consequences. Please familiarize yourself with the discussion of plagiarism in SCampus in Part B, Section 11, “Behavior Violating University Standards” https://policy.usc.edu/student/scampus/part-b. Other forms of academic dishonesty are equally unacceptable. See additional information in SCampus and university policies on scientific misconduct, http://policy.usc.edu/scientific-misconduct.
Discrimination, sexual assault, intimate partner violence, stalking, and harassment are prohibited by the university. You are encouraged to report all incidents to the Office of Equity and Diversity/Title IX Office http://equity.usc.edu and/or to the Department of Public Safety http://dps.usc.edu. This is important for the health and safety of the whole USC community. Faculty and staff must report any information regarding an incident to the Title IX Coordinator who will provide outreach and information to the affected party. The sexual assault resource center webpage http://sarc.usc.edu fully describes reporting options. Relationship and Sexual Violence Services https://engemannshc.usc.edu/rsvp provides 24/7 confidential support.
Support Systems
A number of USC’s schools provide support for students who need help with scholarly writing. Check with your advisor or program staff to find out more. Students whose primary language is not English should check with the American Language Institute http://ali.usc.edu, which sponsors courses and workshops specifically for international graduate students. The Office of Disability Services and Programs http://dsp.usc.edu provides certification for students with disabilities and helps arrange the relevant accommodations. If an officially declared emergency makes travel to campus infeasible, USC Emergency Information http://emergency.usc.edu will provide safety and other updates, including ways in which instruction will be continued by means of Blackboard, teleconferencing, and other technology.
USC Viterbi Honor Code
The Code was developed by Viterbi students, and its text is as follows:
Engineering enables and empowers our ambitions and is integral to
our identities. In the Viterbi community, accountability is reflected
in all our endeavors.
- Engineering+ Integrity.
- Engineering+ Responsibility.
- Engineering+ Community.
- Think good. Do better. Be great.
These are the pillars we stand upon as we address the challenges of society and enrich lives.
Academic Integrity Violations
All coding and writing must be done individually (unless
instructed otherwise), and not copied from other students. Copying or
plagiarism is grounds for failure of an assignment, or in serious
cases failure of the course.
Use of the internet or other outside resources to find solutions
to homework problems is considered cheating.
Please read my Personal note on academic
integrity.