Some remarks on vector representations of legal documents

Publication: Article, Conference object
Date: 07 Nov 2002
Country: Austria
Publisher: IEEE Comput. Soc
Journal: Proceedings 11th International Workshop on Database and Expert Systems Applications

Authors: Schweighofer, Erich; Rauber, Andreas; Merkl, Dieter

doi: 10.1109/dexa.2000.875162

Abstract

Vector representation remains the best basis for classifying and clustering legal documents and for labelling their contents. This paper addresses the diversity of legal documents, which makes vector representation a difficult task. Extensive experiments with three text corpora of about 580 documents in three languages have shown that binary or weighted vector representation may not be sufficient. Even quite successful approaches to similarity computation have problems identifying the best context for classification. The LabelSOM method can be seen as a very efficient tool for verifying similarity because common elements are explicitly identified. Finally, some proposals for the "best" vector representation are discussed: weighted vectors, feature vectors, and hierarchies of vectors that use XML information to identify similar contexts.
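
The contrast between binary and weighted vectors mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy sentences, the helper names (`tokenize`, `binary_vector`, `tfidf_vectors`, `cosine`) and the choice of tf-idf as the weighting scheme are all assumptions made for illustration, since the abstract does not say which weighting was used.

```python
import math
from collections import Counter

def tokenize(text):
    # Naive whitespace tokenizer; real legal text would need proper preprocessing.
    return text.lower().split()

def binary_vector(tokens, vocabulary):
    # 1.0 if the term occurs in the document at all, else 0.0.
    present = set(tokens)
    return [1.0 if term in present else 0.0 for term in vocabulary]

def tfidf_vectors(token_lists, vocabulary):
    # Assumed weighting: raw term frequency times log(N / document frequency).
    n_docs = len(token_lists)
    doc_freq = Counter(term for tokens in token_lists for term in set(tokens))
    idf = {term: math.log(n_docs / doc_freq[term]) for term in vocabulary}
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        vectors.append([counts[term] * idf[term] for term in vocabulary])
    return vectors

def cosine(u, v):
    # Cosine similarity; returns 0.0 if either vector is all zeros.
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Hypothetical toy corpus, not taken from the paper's corpora.
docs = [
    "the court applies the treaty",
    "the court rejects the appeal",
    "the treaty binds member states",
]
token_lists = [tokenize(d) for d in docs]
vocabulary = sorted({t for tokens in token_lists for t in tokens})

binary = [binary_vector(tokens, vocabulary) for tokens in token_lists]
weighted = tfidf_vectors(token_lists, vocabulary)

print("binary  :", cosine(binary[0], binary[1]))
print("weighted:", cosine(weighted[0], weighted[1]))
```

In this toy case the binary vectors of the first two documents look fairly similar because both contain "the" and "court", while the weighted vectors discount the term that occurs in every document, so the measured similarity drops. Neither number says which shared terms matter in a legal sense, which is the kind of limitation the abstract points to.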

Country

Austria

Related Organizations

TU Wien, Austria
University of Vienna, Austria

Keywords

505003 European law, 505029 International law

Impact by BIP!

- Selected citations: 3
- Popularity: Average
- Influence: Average
- Impulse: Average