An Evaluation of Linguistically-motivated Indexing Schemes
Avi Arampatzis Th.P. van der Weide C.H.A. Koster P. van Bommel
Technical Report CSI-R9927, December 1999,
Dept. of Information Systems and Information Retrieval,
University of Nijmegen, The Netherlands.
{avgerino,tvdw,kees,pvb}@cs.kun.nl
Proceedings of BCS-IRSG 2000 Colloquium on IR Research, 5th-7th April 2000, Sidney Sussex College, Cambridge, England. To appear.
January 19, 2000
Abstract:
In this article, we describe a number of indexing experiments
based on indexing terms other than simple keywords.
These experiments were conducted as one step in validating
a linguistically-motivated indexing model.
The problem is important but not new;
what is new in this approach is the variety of schemes evaluated.
The problem matters because linguistically-motivated indexing
should help to overcome not only the
well-known problems of bag-of-words representations,
but also the difficulties raised by non-linguistic text
simplification techniques such as stemming, stop-word deletion,
and term selection.
Our approach to the selection of terms is based on
part-of-speech tagging and shallow parsing.
The indexing schemes evaluated vary from simple
keywords to nouns, verbs, adverbs, adjectives,
adjacent word-pairs, and head-modifier pairs.
Our findings apply to Information Retrieval and most related areas.
avi (dot) arampatzis (at) gmail