TY - GEN
T1 - Towards modeling threaded discussions using induced ontology knowledge
AU - Feng, Donghui
AU - Kim, Jihie
AU - Shaw, Erin
AU - Hovy, Eduard
PY - 2006
Y1 - 2006
AB - Online discussion boards are a popular form of web-based computer-mediated communication, especially in the areas of distributed education and customer support. Automatic analysis for discussion understanding would enable better information assessment and assistance. This paper describes an extensive study of the relationship between individual messages and full discussion threads. We present a new approach to classifying discussions using a Rocchio-style classifier with little cost for data labeling. In place of a labeled data set, we employ a coarse domain ontology that is automatically induced from a canonical text in a novel way and use it to build discussion topic profiles. We describe a new classify-by-dominance strategy for classifying discussion threads and demonstrate that in the presence of noise it can perform better than the standard classify-as-a-whole approach with an error rate reduction of 16.8%. This analysis of human conversation via online discussions provides a basis for the development of future information extraction and question answering techniques.
UR - http://www.scopus.com/inward/record.url?scp=33750714405&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:33750714405
SN - 1577352815
SN - 9781577352815
T3 - Proceedings of the National Conference on Artificial Intelligence
SP - 1289
EP - 1294
BT - Proceedings of the 21st National Conference on Artificial Intelligence and the 18th Innovative Applications of Artificial Intelligence Conference, AAAI-06/IAAI-06
T2 - 21st National Conference on Artificial Intelligence and the 18th Innovative Applications of Artificial Intelligence Conference, AAAI-06/IAAI-06
Y2 - 16 July 2006 through 20 July 2006
ER -