Problems of formalizing the entry structure of the “Dictionary of the Russian language of the 18th century” prior to its electronic edition

Authors:
Abstract:

This article discusses the results of manually processing and classifying the structural elements of the “Dictionary of the Russian language of the 18th century”, drawn from the 22 issues published so far. The purpose of this work is to fit the dictionary's varied repertory of metalinguistic techniques and design features into a unified data structure, which could significantly facilitate the preparation of a database-driven digital version. The core difficulty revealed by our analysis of the dictionary structure is that there is no obvious way to determine how far the digital edition may deviate from the printed version. Our taxonomy distinguishes two types of structures, generic and unique. They can be formally represented by a three-level system of components: (1) basic components, (2) components subordinate to the basic ones, and (3) simple items of two types: (3.1) primary elements and (3.2) complex typical structures (component blocks). In this system, generic structures are preserved entirely and without exception, whereas unique ones are included or left out by a separate decision. Particular features of the printed version that do not affect the data structure are preserved when the scope of an element is narrowed or expanded within a single component block. If, however, an atypical or rare use of an element extended beyond the boundaries of a block and it were nevertheless decided to preserve it, the data structure would have to be expanded with a new component that is useless outside this idiosyncratic case. In such situations, we propose to remove these elements from the original text of the dictionary entries in order to bring it into line with the standard metalanguage of the dictionary.
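The three-level system described above can be illustrated with a minimal Python sketch. It is not the data model of the actual digital edition: all class names, field names, the `generic` flag, the `normalize` helper and the sample values are hypothetical and chosen only for illustration. In particular, attaching the generic/unique distinction to subordinate components is an assumption of this sketch.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class PrimaryElement:
    """Level 3.1: an atomic item (hypothetical name/value pair)."""
    name: str
    value: str


@dataclass
class ComponentBlock:
    """Level 3.2: a complex typical structure built from primary elements."""
    name: str
    elements: List[PrimaryElement] = field(default_factory=list)


# A simple item is either a primary element or a component block.
Item = Union[PrimaryElement, ComponentBlock]


@dataclass
class SubordinateComponent:
    """Level 2: a component subordinate to a basic one."""
    name: str
    items: List[Item] = field(default_factory=list)
    # Assumption of this sketch: False marks a unique (atypical) structure.
    generic: bool = True


@dataclass
class BasicComponent:
    """Level 1: a basic component of a dictionary entry."""
    name: str
    children: List[SubordinateComponent] = field(default_factory=list)


def normalize(component: BasicComponent, keep_unique: bool = False) -> BasicComponent:
    """Keep every generic structure; keep unique ones only on explicit request."""
    kept = [c for c in component.children if c.generic or keep_unique]
    return BasicComponent(component.name, kept)


# Hypothetical entry fragment with one generic and one unique structure.
entry = BasicComponent(
    name="meaning",
    children=[
        SubordinateComponent(
            name="definition",
            items=[PrimaryElement("text", "placeholder definition")],
        ),
        SubordinateComponent(
            name="atypical_note",
            items=[ComponentBlock("note_block", [PrimaryElement("label", "placeholder")])],
            generic=False,
        ),
    ],
)

print([c.name for c in normalize(entry).children])
print([c.name for c in normalize(entry, keep_unique=True).children])
```

In this sketch the default call drops the unique structure, which mirrors the proposal to bring entries into line with the standard metalanguage, while `keep_unique=True` stands for the additional decision to retain it.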