In natural language processing (NLP) research, one must continually accumulate and update a growing number of lexical data resources in heterogeneous formats for various applications. These resources are often difficult to maintain and to manipulate, and a specific dictionary has to be rebuilt for every new application. Following the methods of information systems analysis and design, it is necessary to create a conceptual data model and then convert it into a logical data model in order to construct a lexical database. In Vietnam there are already some computerized Vietnamese dictionaries, but there is not yet any dialectal dictionary. In this paper we present a solution for constructing data models in order to create a Nghe-Tinh dialectal dictionary. We construct an entity-association model to represent the relationships between the entry (headword), explanation, popular meaning, phrase and sentence, taken from a published paper Nghe-Tinh dialectal dictionary. This model is converted into a WinWord document format used to update the Nghe-Tinh dialectal lexical database in the pivot Telex code. Using the open-source software system for consulting multilingual lexical databases that we developed at the University of Danang, we have built a first computerized version of the Nghe-Tinh dialectal dictionary. The lexical resource of this dictionary contains about 5000 entries, which can be read and updated. At the same time, the entity-association model is also converted into Access MDB tables and an XML format.
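As an illustration of how an entry of the entity-association model might be carried over into XML, the following is a minimal sketch only. The class name `Entry`, the functions `entry_to_xml` and `dictionary_to_xml`, the element names, and the assumption that one entry may hold several phrases and sentences are all hypothetical choices made for this example; they are not the schema used in the paper. The sample values are placeholders, not real dictionary data.

```python
from dataclasses import dataclass, field
from typing import List
import xml.etree.ElementTree as ET


@dataclass
class Entry:
    """One dictionary entry; field names are assumed, not the paper's schema."""
    headword: str
    explanation: str
    popular_meaning: str
    phrases: List[str] = field(default_factory=list)    # assumed one-to-many
    sentences: List[str] = field(default_factory=list)  # assumed one-to-many


def entry_to_xml(entry: Entry) -> ET.Element:
    """Build an <entry> element holding the fields of a single entry."""
    node = ET.Element("entry")
    ET.SubElement(node, "headword").text = entry.headword
    ET.SubElement(node, "explanation").text = entry.explanation
    ET.SubElement(node, "popular_meaning").text = entry.popular_meaning
    for phrase in entry.phrases:
        ET.SubElement(node, "phrase").text = phrase
    for sentence in entry.sentences:
        ET.SubElement(node, "sentence").text = sentence
    return node


def dictionary_to_xml(entries: List[Entry]) -> str:
    """Serialize a list of entries under a single <dictionary> root."""
    root = ET.Element("dictionary")
    for entry in entries:
        root.append(entry_to_xml(entry))
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Placeholder values only; a real entry would hold text from the dictionary.
    sample = Entry(
        headword="HEADWORD",
        explanation="EXPLANATION",
        popular_meaning="POPULAR MEANING",
        phrases=["EXAMPLE PHRASE"],
        sentences=["EXAMPLE SENTENCE"],
    )
    print(dictionary_to_xml([sample]))
```

Keeping phrases and sentences as repeated child elements mirrors a one-to-many association in the model; the same structure could equally be mapped to separate tables in a relational format.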