University of Southern California

CSCI 544 — Applied Natural Language Processing

Research Project

Updates (April 24)

(April 3)



Overview

The research project is an in-depth activity that will be carried out in teams of four. The project can be on any aspect of natural language processing. You will formulate a research question, identify resources and tools to address the question, implement and evaluate a system that uses these resources and tools, demonstrate the system, and write up a report.

Procedure

Proposal structure

The proposal describes your plan for the research project, and will serve as the skeleton for the final report. As a plan it is subject to change and does not represent a firm commitment, but it should show that you’ve thought through the relevant aspects of your research. The proposal should be a document of about 500 words, written in English in good academic style. Proposals that substantially exceed this length (above 600 words) will be penalized. The structure of the document should be as follows.

The proposal should be written after you have received some feedback about the general direction of your project. You will receive written feedback about your proposal, which should help you with writing the final report; however, feedback on the proposal might take some time, so don’t delay collecting your data and implementing your system while waiting for comments on your proposal. For feedback on specific issues that arise with the project, use Piazza.

Code demonstrations

Code demonstrations/presentations will take place on April 10 in PHE 516. Each team will have a 5-minute slot to talk about their work and demonstrate how their code works. No presentation slides (there’s no time for that). The code demo is a progress check; teams should have some working code to show, but it is not expected to be a final version.

Final report

The final report describes the research you have done, reporting on the method and results, relating the research to other work in the field, and offering conclusions and directions for future work. The report should be about 2000 words long, not counting the references; reports that substantially exceed this length will be penalized. The structure is similar to the proposal, but with more detail, and two additional sections following the method section.

The six main content sections (introduction, materials, procedure, evaluation, results, and discussion) carry equal weight, so they should be of similar length: with a 2000-word target, that means reserving about 300–350 words for each section. This is only a general guideline, as you may find that some sections require more text than others. However, if you find you have more to say than fits within the length requirement, then you’ll need to concentrate on the more important aspects of your project.

When giving examples of text in languages other than English, please use the following multi-line format, to make the examples readable to English speakers. Below is an example of how to present a sentence in Hindi.

किसने दवाई को खरीदा (the original text in its native script)
kisne davaaii ko khariidaa (a transcription into Latin script)
who.ERG medicine ACC bought (a word-by-word gloss)
‘Who bought the medicine?’ (a translation into English)

The second line is not needed if the language natively uses a version of the Latin script.
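If you produce many such examples, lining up each word with its gloss makes them easier to read. The following is a minimal sketch of one way to do that in Python, not a required tool. The function names `align_interlinear` and `format_example` are hypothetical, and the sketch assumes that the transcription and the gloss each contain exactly one whitespace-separated token per word. The native-script line is left unaligned, because character counts in scripts such as Devanagari do not reliably match display width.

```python
def align_interlinear(transcription, gloss):
    """Pad each word and its gloss to a shared column width.

    Assumes one whitespace-separated token per word in both strings.
    Returns the two aligned lines as a tuple.
    """
    words = transcription.split()
    glosses = gloss.split()
    if len(words) != len(glosses):
        raise ValueError(
            f"{len(words)} words but {len(glosses)} glosses; "
            "they must match one to one"
        )
    # Each column is as wide as the longer of the word and its gloss.
    widths = [max(len(w), len(g)) for w, g in zip(words, glosses)]
    top = " ".join(w.ljust(n) for w, n in zip(words, widths)).rstrip()
    bottom = " ".join(g.ljust(n) for g, n in zip(glosses, widths)).rstrip()
    return top, bottom


def format_example(native, transcription, gloss, translation):
    """Assemble the four-line format: native, transcription, gloss, translation."""
    top, bottom = align_interlinear(transcription, gloss)
    return "\n".join([native, top, bottom, f"‘{translation}’"])


if __name__ == "__main__":
    print(format_example(
        "किसने दवाई को खरीदा",
        "kisne davaaii ko khariidaa",
        "who.ERG medicine ACC bought",
        "Who bought the medicine?",
    ))
```

The length check is deliberate: a mismatch between the number of words and the number of glosses usually means a word boundary was lost, which is easier to catch as an error than to spot in a misaligned example.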

Grading

The grade for the assignment will be broken down as follows.

The research project counts for 30% of the overall course grade.