Item description for Foundations of Computational Linguistics: Human-Computer Communication in Natural Language by Roland R. Hausser...
The central task of a future-oriented computational linguistics is the development of cognitive machines which humans can freely talk with in their respective natural language. In the long run, this task will ensure the development of a functional theory of language, an objective method of verification, and a wide range of practical applications. Natural communication requires not only verbal processing, but also non-verbal perception and action. Therefore, the content of this textbook is organized as a theory of language for the construction of talking robots. The main topic is the mechanics of natural language communication in both the hearer and the speaker. The content is divided into four parts: Theory of Language, Formal Grammar, Morphology and Syntax, Semantics and Pragmatics. The book contains more than 700 exercises for reviewing key ideas and important problems.
Reviews - What do customers think about Foundations of Computational Linguistics: Human-Computer Communication in Natural Language?
Organized and well-written Jul 17, 2001
This book is an essentially non-mathematical descriptive overview of computational linguistics that emphasizes an historical viewpoint. It is very understandable, even for someone approaching the subject for the first time. In addition, it could be used as a textbook, as there is a large set of exercises at the end of each chapter. Computational linguistics has been applied to biological sequence analysis, which was my primary reason for reading the book.
The author begins with the concept of a language, which is defined as a set of word sequences, with a formal language being a subset of the free monoid over a finite lexicon. The reasons for using generative grammars are discussed, and the author then turns to a special type called categorial (C-) grammar, invented in the 1930s and applied to natural languages in the 1950s. The disadvantages of C-grammar are outlined by the author.
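To make these two ideas concrete, here is a minimal sketch of my own (not taken from the book). It enumerates the start of the free monoid over a tiny lexicon and picks out the subset accepted by a toy categorial grammar. The lexicon, the category encoding as `(argument, result)` pairs, and the function names are all assumptions made for illustration.

```python
from itertools import product

# Hypothetical toy lexicon: "e" is an entity, ("e", "t") is a functor
# that takes an "e" argument and yields a sentence of category "t".
LEXICON = {"Julia": "e", "Susanne": "e", "sleeps": ("e", "t")}

def free_monoid(lexicon, max_len):
    """Yield every word sequence over the lexicon up to max_len words."""
    for n in range(max_len + 1):
        yield from product(lexicon, repeat=n)

def cancel(functor, argument):
    """Apply a functor category to an argument category, or return None."""
    if isinstance(functor, tuple) and functor[0] == argument:
        return functor[1]
    return None

def is_sentence(words):
    """Accept only two-word sequences of the form argument + functor -> "t"."""
    if len(words) != 2:
        return False
    argument, functor = LEXICON[words[0]], LEXICON[words[1]]
    return cancel(functor, argument) == "t"

# The formal language is the accepted subset of the free monoid.
language = [seq for seq in free_monoid(LEXICON, 2) if is_sentence(seq)]
```

With this lexicon, `language` contains only the two sequences ending in "sleeps" that start with a name. Every other sequence in the free monoid is rejected because no category cancellation yields "t".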
A second generative grammar, called phrase structure (PS) grammar, is discussed, and restrictions on the rule schema give four different types of PS-grammars. These types correspond to four complexity classes, where complexity is measured by counting the primitive operations an algorithm needs to analyze an input expression, relative to the length of the input. Context-free PS-grammar is applied to natural language via phrase structures. The author carefully distinguishes C-grammars from PS-grammars. In particular, the goal of PS-grammar is to represent what is called the constituent structure of natural language, which is defined by the author as a formal property of phrase structures. He makes it very clear that there are as yet no complete PS-grammars for natural languages. He also discusses in detail the constituent structure paradox, with examples of discontinuous elements in natural language. The solution of Chomsky to this problem via transformation rules is outlined. This transformational grammar is equivalent to a Turing machine, generates the recursively enumerable languages, and so is undecidable. To resolve this, Chomsky introduced a formal restriction on the transformations called "recoverability of deletions". The author shows, however, via Bach-Peters sentences that this method does not always work. The discussion of parsing distinguishes between morphology parsers, syntax parsers, and semantic parsers. This exemplifies how the declarative-procedural distinction applies to the relation between generative grammars and parsers. These distinctions are important, the author argues, when modeling natural languages on a computer.
That natural languages are not context-free motivates the author to search for other formalisms. A successful formalism must be computationally tractable, and this is reflected in its grammar type. The "type transparency" between the parser and the grammar enables the analysis of the complexity to be done at the parser, since for any language, they will have the same formal grammar. The weaknesses of this approach for PS-grammar are discussed in detail by the author, and he gives other algorithms that restructure the PS-grammar rules in order to obtain parsing of context-free languages. These concerns also arise under the requirement that the grammar formalism be "input-output" equivalent to what is spoken and heard. PS-grammar is shown to be incompatible with this.
The more recent notion of left-associative (LA) grammar is discussed as an alternative to C- and PS-grammars. The irregular bracketing of these grammars is handled by using the principle of possible continuations in LA-grammars. The author shows, interestingly, that the distinction between context-free and context-sensitive languages disappears in LA-grammar. The principle of possible continuations allows a close relation between parsing and generation. Discontinuous elements are dealt with by coding filler positions into a functor category and cancelling them later.
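As an illustration of the principle of possible continuations, here is a small sketch of my own of a left-associative recognizer for the context-sensitive language a^k b^k c^k (k >= 1). The rule names `r1` to `r3`, the tuple encoding of categories, and the function `accepts` are assumptions for this example and should not be read as the book's exact definitions. Each rule combines the category of the sentence start (`ss`) with the category of the next word (`nw`) and names the rules that may be tried next, so the input is consumed strictly left to right.

```python
# Lexicon: each word form maps to a category (a tuple of segments).
LEXICON = {"a": ("b", "c"), "b": ("b",), "c": ("c",)}

def r1(ss, nw):
    # Another "a": record one more pending "b" and one more pending "c".
    if nw == ("b", "c"):
        return ("b",) + ss + ("c",)
    return None

def r2(ss, nw):
    # A "b": cancel one pending "b" at the front of the category.
    if nw == ("b",) and len(ss) >= 2 and ss[0] == "b" and ss[-1] == "c":
        return ss[1:]
    return None

def r3(ss, nw):
    # A "c": cancel one pending "c" once all "b" segments are gone.
    if nw == ("c",) and ss and ss[0] == "c":
        return ss[1:]
    return None

# Each rule is paired with its rule package: the possible continuations.
RULES = {
    "r1": (r1, ("r1", "r2")),
    "r2": (r2, ("r2", "r3")),
    "r3": (r3, ("r3",)),
}

def accepts(words):
    """Return True if the word sequence is in a^k b^k c^k with k >= 1."""
    if not words or LEXICON.get(words[0]) != ("b", "c"):
        return False
    states = [(LEXICON[words[0]], ("r1", "r2"))]  # start state
    for word in words[1:]:
        nw = LEXICON.get(word)
        if nw is None:
            return False
        next_states = []
        for ss, package in states:
            for name in package:
                rule, next_package = RULES[name]
                result = rule(ss, nw)
                if result is not None:
                    next_states.append((result, next_package))
        if not next_states:
            return False
        states = next_states
    # Final state: nothing left to cancel, and the last rule applied was r3.
    return any(ss == () and package == ("r3",) for ss, package in states)
```

For example, `accepts(list("aabbcc"))` succeeds while `accepts(list("aabbc"))` fails. At most one rule applies at each step in this grammar, so the number of rule applications grows linearly with the length of the input.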
There are different types of LA-grammars which are characterized in terms of their generative capacity and computational complexity. Recursion theory plays a role, and the complexity is measured in terms of the operations required to process an input in the worst case. The author discusses in detail the different types of LA-grammars and their subhierarchies, and compares the LA- and PS-hierarchies.
The morphological analysis of natural language can be studied in terms of combination principles, with words being defined in terms of word forms, and a clear distinction is made between the two notions. Word forms in turn are composed of elementary parts called morphemes, and morphemes are associated with analyzed allomorphs. The author explains the steps needed to morphologically analyze an unknown word.
Even more interesting, and more important from a practical point of view, the author discusses methods for automatic word form recognition, including methodologies for investigating the frequency distribution of words. The grammar system of LA-morphology is used for word form recognition of English, German, Italian, French, Japanese, and Polish. The empirical testing of a grammar system via the building of corpora is discussed with an illustration of Zipf's law. Unfortunately, the author does not discuss in detail the use of hidden Markov models in statistical tagging.
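For readers unfamiliar with Zipf's law, it says that a word's frequency in a corpus is roughly inversely proportional to its frequency rank, so the product of rank and frequency stays roughly constant. The sketch below is my own and is not from the book. The function name `rank_frequency` and the toy token list are assumptions, and a meaningful check would need a real corpus.

```python
from collections import Counter

def rank_frequency(tokens):
    """Return (rank, word, frequency, rank * frequency) rows, most frequent first."""
    counts = Counter(tokens)
    # Sort by descending frequency, breaking ties alphabetically.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [
        (rank, word, freq, rank * freq)
        for rank, (word, freq) in enumerate(ranked, start=1)
    ]

# Toy data only; a real test would tokenize a large corpus.
rows = rank_frequency("the the the the of of a".split())
```

Under Zipf's law the last column should vary far less than the frequency column does across ranks.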
Syntax deals with the composition of word forms and uses the combination rules of valency, agreement, and word order. German and English are analyzed in terms of their word order. The ability of LA-grammar to map variable-based rule patterns onto categorially analyzed input expressions using a strictly time-linear order makes it efficient and flexible, argues the author. The LA-syntax for English and German is discussed in detail by the author. The discussion makes heavy use of finite state machines.
There are three different semantic systems, namely the logical, programming, and natural languages, and the author shows how these are related via replication, reconstruction, transfer, and composition. The problems in treating natural languages with logical semantics are discussed in the context of Tarski's work. The author argues that the insistence on using logical semantics for analyzing natural language is incorrect, since natural languages work differently from metalanguage-dependent logical languages. Truth, meaning, and ontology are taken out of the philosophical realm and applied to the logical semantics of natural language.
Computational Linguistics Aug 13, 2000
If you have limited bookshelf space and a limited budget, but would love to have a good study and reference book on Computational Linguistics, then this is the one to have. The subject is treated in breadth and in depth. The 500-page content is well structured and very well written. The author has divided the content into four parts (mentioned in the Preface): Theory of Language, Theory of Grammar, Morphology and Syntax, Semantics and Pragmatics. I have had a hard time putting this book down since it arrived in the mail a few weeks ago.
Real Computational Linguistics Nov 3, 1999
Computational Linguistics is not just a toy in scholars' hands. It should solve the problem of natural language communication between humans and computers. This book gives us a systematic, complete, and 'solid' solution. It is real and true computational linguistics. We need a 'solid' foundation more than 'smart' techniques in CL. If you are interested in what real CL is, this book is worth reading and using as a textbook.