Parsing controlled language to generate formal language expressions
Abstract
This thesis seeks a practical way to generate formal language expressions from natural language sentences. Such a conversion is useful in situations where calculations are required but their input is available only as plain text.
One example is the need to represent differences in energy consumption on a city map with colors corresponding to units on a scale, when only textual information is at hand and the quantities appear in a form that the calculation system cannot access directly. The case analyzed here is the conversion of task descriptions worded in natural language into a task modeling language, TIM (Task-oriented Information Modeling).
The system proposed here consists of a lexical analyzer, a parser, and an interpreter that generates the formal language expressions. As the input language we have chosen a controlled language (Simplified English), since the contexts of use that we envisage can readily accommodate this dialect.
The particular features of our system are that the source and target grammars are freely editable without recompilation, and that we make extensive use of regular expressions to reduce the number of grammar rules needed to describe the controlled natural language sentences. To this end, we use an Extended Context-Free Grammar for parsing and a mapping syntax table to interpret the syntax of the parsed sentences in terms of formal language expressions.
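As an illustration of the pipeline described above, the following minimal Python sketch shows a lexical analyzer, a rule whose right-hand side is a regular expression over word categories, and a mapping table that turns the matched sentence into a formal expression. All names (LEXICON, RULES, MAPPING, translate), the toy vocabulary, and the predicate-style output notation are assumptions made for this illustration. They are not taken from the thesis implementation, and the output is not TIM syntax. The sketch handles a single non-recursive rule, so it is far simpler than a full Extended Context-Free Grammar parser.

```python
import re

# Hypothetical lexicon (word -> category). Plain data, editable at run time.
LEXICON = {
    "the": "DET", "a": "DET",
    "operator": "NOUN", "valve": "NOUN", "pump": "NOUN",
    "opens": "VERB", "closes": "VERB",
    "main": "ADJ", "red": "ADJ",
}

# Hypothetical source grammar: each right-hand side is a regular expression
# over category names (every category is followed by one space).
RULES = {
    "S": r"(?P<subj>(?:DET )?(?:ADJ )*NOUN )"
         r"(?P<verb>VERB )"
         r"(?P<obj>(?:DET )?(?:ADJ )*NOUN )",
}

# Hypothetical mapping table: rule name -> template of the target expression.
MAPPING = {"S": "{verb}({subj}, {obj})"}


def tokenize(sentence):
    """Lexical analysis: pair each lower-cased word with its category."""
    words = re.findall(r"[a-z]+", sentence.lower())
    return [(word, LEXICON[word]) for word in words]  # KeyError if unknown


def translate(sentence, rule="S"):
    """Parse the sentence with one rule and fill the mapped template."""
    tokens = tokenize(sentence)
    cats = "".join(cat + " " for _, cat in tokens)
    match = re.fullmatch(RULES[rule], cats)
    if match is None:
        raise ValueError(f"sentence does not match rule {rule!r}")
    fields = {}
    for name in match.groupdict():
        start, end = match.span(name)
        # Each category ends with a space, so counting spaces before a
        # character offset gives the index of the corresponding token.
        first = cats.count(" ", 0, start)
        last = cats.count(" ", 0, end)
        words = [w for w, cat in tokens[first:last] if cat != "DET"]
        fields[name] = "_".join(words)
    return MAPPING[rule].format(**fields)


print(translate("The operator opens the main valve."))
```

Because the lexicon, the rule, and the mapping template are ordinary data, they can be changed while the program runs; for example, adding `LEXICON["stops"] = "VERB"` lets the same rule accept "A pump stops the red valve". The optional determiner and the repeated adjectives are absorbed by the regular expression, which is the kind of rule-count saving the text attributes to regular expressions.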