Seretan, Violeta

Syntax-Based Collocation Extraction (Hardcover)

Series: Text, Speech and Language Technology 44

Springer-Verlag GmbH, Springer Netherland, December 2010

212 pp. - Language: English - 21 black-and-white illustrations, 30 black-and-white tables - 240x164x23 mm

ISBN: 9400701330 EAN: 9789400701335

Syntax-Based Collocation Extraction is the first book to offer a comprehensive, up-to-date review of the theoretical and applied work on word collocations. Backed by solid theoretical results, the computational experiments it describes, based on data in four languages, support the book's basic argument that syntax-driven extraction should be used as an alternative to the current cooccurrence-based techniques in order to extract collocational data efficiently. The work focuses on using linguistic tools for corpus-based identification of collocations. It takes advantage of recent advances in parsing to propose a novel collocation extraction method based on deep syntactic analysis, one that is applicable to a range of important core tasks in Computational Linguistics. The book is useful for anyone interested in the computational analysis of texts, collocation phenomena, and multi-word expressions in general.

Comprehensive, balanced, and up-to-date information on a key topic. Sophisticated, large-scale methods based on the emerging syntactic approach are described and evaluated. Open to multilingualism, providing results and discussing examples in English, French, Italian and Spanish. Discussion of theoretical issues is combined with illustrations of practical implementations.


1. Introduction
1.1 Collocations and Their Relevance for NLP
1.2 The Need for Syntax-Based Collocation Extraction
1.3 Aims
1.4 Chapters Outline

2. On Collocations
2.1 Introduction
2.2 A Survey of Definitions
2.3 Towards a Core Collocation Concept
2.4 Theoretical Perspectives on Collocations
2.5 Linguistic Descriptions
2.6 What Collocation Means in This Book
2.7 Summary

3. Survey of Extraction Methods
3.1 Introduction
3.2 Extraction Techniques
3.3 Linguistic Preprocessing
3.4 Survey of the State of the Art
3.5 Summary

4. Syntax-Based Extraction
4.1 Introduction
4.2 The Fips Multilingual Parser
4.3 Extraction Method
4.4 Evaluation
4.5 Qualitative Analysis
4.6 Discussion
4.7 Summary

5. Extensions
5.1 Identification of Complex Collocations
5.2 Data-Driven Induction of Syntactic Patterns
5.3 Corpus-Based Collocation Translation
5.4 Summary

6. Conclusion
6.1 Main Contributions
6.2 Future Directions

References

A. List of Collocation Dictionaries
B. List of Collocation Definitions
C. Association Measures - Mathematical Notes
D. Monolingual Evaluation (Experiment 1)
E. Cross-lingual Evaluation (Experiment 2)
F. Output Comparison


From the reviews:

"This book tackles the question of Syntax-Based Collocation Extraction from a computational perspective. It first gives an overview of collocation studies over time and then details the creation of a collocation extraction tool developed by the author ... . The book is well written with all the procedures clearly described. ... The computational part of the book is extremely thorough and well documented ... . As it stands, this book could be required reading in NLP ... ." (Geoffrey Williams, International Journal of Lexicography, Vol. 26 (1), 2013)

"This book provides a much-needed overview of the state of the art on collocations and their identification from a NLP perspective. ... The book can also serve as a step-by-step guide to identification in practice, discussing methodological choices and the impact of different levels of pre-processing in terms of performance. ... the book can serve as both a good introductory text for anyone starting on the field and an up-to-date compilation of the main references and recent advances in context for the expert reader." (Aline Villavicencio, Natural Language Engineering, Vol. 18 (4), 2012)

"This relatively short book is very interesting for the justification it makes for syntactic parsing in the extraction of word types from texts. It gives a very clear and painstakingly documented account of two experiments in extracting collocations, or associated words, from digital texts in English and three other languages--French, Italian, and Spanish. ... This book makes an interesting contribution to the challenges of dealing with the properties inherent to natural language that have made computational approaches difficult." (Alice Davison, ACM Computing Reviews, September, 2011)

