
handle: 10810/10415
[Eus]In this thesis we study two well-known problems in Semantic Role Classification (SRC): (1) the suitability of different role sets in practice, and (2) the limited influence of the lexical features used by SRC systems and the sparseness they suffer from. On the first point, we present a thorough comparative analysis of the PropBank and VerbNet role sets, currently the most widely used in our field, measuring the performance, robustness, and generalization ability of classifiers trained with each role set across different experimental settings and domains. On the second point, we analyze the problems posed by lexical features and significantly mitigate their effect by using selectional preferences built with WordNet and with distributional similarity measures. Through in-vitro experiments, we show that these selectional preferences have greater classification power than features derived from the lexicon. Finally, using features extracted from the selectional preferences, we improve the performance of a state-of-the-art SRC system, both in-domain and out-of-domain.
[Eng]This thesis focuses on two well-known open issues in Semantic Role Classification (SRC) research: (1) the suitability of different role inventories in practice, and (2) the limited influence and sparseness of lexical features. On the former, we present an empirical comparative study of PropBank vs. VerbNet roles, the two most widely used role inventories, testing the performance differences for unseen verbs and the robustness on new corpus domains. On the latter, we test the use of automatically learnt selectional preferences as a complement to lexical features, proposing both WordNet-based and distributional-similarity-based models. We show that all our selectional preference models improve over lexical features in in-vitro experiments, and that the models are complementary. Finally, we show that incorporating features based on selectional preferences improves the overall performance of a state-of-the-art SRC system on both in-domain and out-of-domain corpora.
This work was carried out with the support of a research grant from EHU (2005-2009).
158 p. : graphs
computational linguistics, programming languages, artificial intelligence
