The Discourse Parsing and Summarization of Free Texts

Prof. Daniel Marcu
Computer Science Department, University of Southern California
Monday, October 22nd, 218 MLH, 2:30 pm - 3:20 pm

ABSTRACT:


Researchers of natural language have repeatedly acknowledged that coherent texts are not just simple sequences of sentences. Rather, they are complex artifacts whose semantic units are connected by rhetorical, logical, argumentative, and cohesive relations. I discuss research aimed at uncovering the constraints that characterize the abstract structure of well-formed texts, and at producing algorithms for the automatic derivation of these structures. I show how automatically constructed discourse structures can be exploited in the context of several applications that range from text summarization to machine translation and automatic essay scoring.


Prof. Daniel Marcu is a project leader and research scientist at the Information Sciences Institute and an assistant professor of computer science at the University of Southern California. His published work is in discourse, summarization, translation, generation, and knowledge representation. His current focus is on developing statistical models for machine translation and summarization. (web site: http://www.isi.edu/~marcu/)