<?xml version="1.0" encoding="utf-8"?>
<journal>
  <titleid>80301</titleid>
  <issn>2782-5450</issn>
  <journalInfo lang="ENG">
    <title>Terra Linguistica</title>
  </journalInfo>
  <issue>
    <volume>14</volume>
    <number>1</number>
    <altNumber> </altNumber>
    <dateUni>2023</dateUni>
    <pages>1-107</pages>
    <articles>
      <article>
        <artType>EDI</artType>
        <langPubl>RUS</langPubl>
        <pages>7-10</pages>
        <authors>
          <author num="001">
            <authorCodes>
              <orcid>0000-0002-6425-2050</orcid>
            </authorCodes>
            <individInfo lang="ENG">
              <orgName>National Research University Higher School of Economics</orgName>
              <surname>Kolmogorova</surname>
              <initials>Anastasia</initials>
              <email>nastiakol@mail.ru</email>
              <address>St. Petersburg, Russian Federation</address>
            </individInfo>
          </author>
        </authors>
        <artTitles>
          <artTitle lang="ENG">Engineering linguistic technologies in text studies</artTitle>
        </artTitles>
        <abstracts>
          <abstract lang="ENG">The publication analyzes the current state of engineering linguistics, its main directions, and its research challenges. It defines language technologies and proposes a typology based on the tasks they are used to solve. It notes that the national school of engineering linguistics manages to maintain a balance between technological and linguistic research.</abstract>
        </abstracts>
        <codes>
          <doi>10.18721/JHSS.14101</doi>
          <udk>81'32, 81'33</udk>
        </codes>
        <keywords>
          <kwdGroup lang="ENG">
            <keyword>language technologies</keyword>
            <keyword>engineering linguistics</keyword>
            <keyword>computational linguistics</keyword>
            <keyword>language models</keyword>
          </kwdGroup>
        </keywords>
        <files>
          <furl>https://human.spbstu.ru/article/2023.51.1/</furl>
          <file>7-10.pdf</file>
        </files>
      </article>
      <article>
        <artType>RAR</artType>
        <langPubl>RUS</langPubl>
        <pages>11-20</pages>
        <authors>
          <author num="001">
            <authorCodes>
              <orcid>0000-0001-7580-4386</orcid>
            </authorCodes>
            <individInfo lang="ENG">
              <orgName>Smolensk State University</orgName>
              <surname>Andreev</surname>
              <initials>Vadim</initials>
              <email>vadim.andreev@ymail.com</email>
              <address>Smolensk, Russian Federation</address>
            </individInfo>
          </author>
        </authors>
        <artTitles>
          <artTitle lang="ENG">Evolution of Vladimir Nabokov’s image system: quantitative analysis</artTitle>
        </artTitles>
        <abstracts>
          <abstract lang="ENG">The article deals with the evolution of an important aspect of Vladimir Nabokov’s individual style: his image system. Nabokov is much better known as a prose writer. However, he began his creative career as a poet and continued to write verse all his life. The traditional approach adopted here, which defines an image as a transfer of conceptual characteristics, makes it possible to carry out a quantitative analysis of the concepts used by the author and to compare their quantity in the early and mature periods of his creative activity. Multivariate discriminant analysis is used as a statistical method to differentiate between the periods on the basis of a large number of variables simultaneously (frequencies of concepts functioning as the source domain). The results demonstrate that there are significant changes in the individual style of the poet and, consequently, in his worldview. The resulting discriminant model, which includes eight characteristics (concepts marking style alteration), makes it possible to automatically attribute a text to the correct period in 100% of cases. Qualitative analysis of changes in the frequencies of concepts reveals the complex, polyphonic character of the style alteration, which includes both a transition from complicated to simpler concepts and a shift to a more complex understanding of living beings.</abstract>
        </abstracts>
        <codes>
          <doi>10.18721/JHSS.14102</doi>
          <udk>811'32</udk>
        </codes>
        <keywords>
          <kwdGroup lang="ENG">
            <keyword>stylochronometry</keyword>
            <keyword>individual style</keyword>
            <keyword>image system</keyword>
            <keyword>discriminant analysis</keyword>
            <keyword>Nabokov</keyword>
          </kwdGroup>
        </keywords>
        <files>
          <furl>https://human.spbstu.ru/article/2023.51.2/</furl>
          <file>11-20.pdf</file>
        </files>
      </article>
      <article>
        <artType>RAR</artType>
        <langPubl>RUS</langPubl>
        <pages>21-29</pages>
        <authors>
          <author num="001">
            <authorCodes>
              <orcid>0000-0003-2856-5049</orcid>
            </authorCodes>
            <individInfo lang="ENG">
              <orgName>St. Petersburg State University</orgName>
              <surname>Grebennikov</surname>
              <initials>Alexander</initials>
              <email>agrebennikov@mail.ru</email>
              <address>St. Petersburg, Russian Federation</address>
            </individInfo>
          </author>
          <author num="002">
            <authorCodes>
              <orcid>0000-0002-3347-1373</orcid>
            </authorCodes>
            <individInfo lang="ENG">
              <orgName>St. Petersburg State University</orgName>
              <surname>Marusenko</surname>
              <initials>Natalya</initials>
              <email>n.marusenko@spbu.ru</email>
              <address>St. Petersburg, Russian Federation</address>
            </individInfo>
          </author>
          <author num="003">
            <authorCodes>
              <orcid>0000-0002-7825-1120</orcid>
            </authorCodes>
            <individInfo lang="ENG">
              <orgName>St. Petersburg State University</orgName>
              <surname>Skrebtsova</surname>
              <initials>Tatyana</initials>
              <email>t.skrebtsova@spbu.ru</email>
              <address>St. Petersburg, Russian Federation</address>
            </individInfo>
          </author>
        </authors>
        <artTitles>
          <artTitle lang="ENG">Mapping word frequencies in fiction on sociopolitical context: the case of early 20th century Russian short stories</artTitle>
        </artTitles>
        <abstracts>
          <abstract lang="ENG">The paper deals with the language of Russian short stories written between 1900 and 1930. It is based on the Russian Short Stories Corpus, an ongoing research project that aims to collect, digitally process, and present the Russian literature of the early 20th century in electronic form. The Corpus contains stories written by thousands of Russian authors, both well-known and almost forgotten. A sample was taken from the Corpus to serve as a testbed for linguists, lexicographers, and literary scholars, enabling them to check their intuitions concerning the language and style of the epoch. The sample has been divided into three subsamples along the lines set by the dramatic turns of Russian history. The first subsample contains stories produced from the onset of the 20th century up to WWI (1900–1913), the second covers the tumultuous period of wars and revolutions (1914–1922), and the third comprises stories written in the Soviet Union (1923–1930). The Corpus has proved instrumental in detecting manifold changes in language use, including grammar, vocabulary, syntactic patterns, collocations, and stylistics. In the present paper, frequency-sorted word lists are used to bring out relevant changes in Russian vocabulary and to link them to the sociopolitical context. The results will provide valuable data for lexicographers compiling Russian dictionaries of this period.</abstract>
        </abstracts>
        <codes>
          <doi>10.18721/JHSS.14103</doi>
          <udk>81'33</udk>
        </codes>
        <keywords>
          <kwdGroup lang="ENG">
            <keyword>Russian short stories</keyword>
            <keyword>text corpus</keyword>
            <keyword>frequency dictionary</keyword>
            <keyword>Russian lexicography</keyword>
            <keyword>stylometry</keyword>
          </kwdGroup>
        </keywords>
        <files>
          <furl>https://human.spbstu.ru/article/2023.51.3/</furl>
          <file/>
        </files>
      </article>
      <article>
        <artType>RAR</artType>
        <langPubl>RUS</langPubl>
        <pages>30-40</pages>
        <authors>
          <author num="001">
            <authorCodes>
              <orcid>0000-0001-5338-3656</orcid>
            </authorCodes>
            <individInfo lang="ENG">
              <orgName>Peter the Great St. Petersburg Polytechnic University</orgName>
              <surname>Evtushenko</surname>
              <initials>Tatiana</initials>
              <email>evtushenkotg@gmail.com</email>
              <address>St. Petersburg, Russian Federation</address>
            </individInfo>
          </author>
          <author num="002">
            <authorCodes>
              <researcherid>6523-2016</researcherid>
              <scopusid>57189038663</scopusid>
              <orcid>0000-0002-6326-8392</orcid>
            </authorCodes>
            <individInfo lang="ENG">
              <orgName>St. Petersburg Electrotechnical University “LETI”</orgName>
              <surname>Klochkova</surname>
              <initials>Yelena</initials>
              <email>esklochkova@etu.ru</email>
              <address>St. Petersburg, Russian Federation</address>
            </individInfo>
          </author>
          <author num="003">
            <individInfo lang="ENG">
              <orgName>National Research Tomsk State University</orgName>
              <surname>Laputenko</surname>
              <initials>Andrey</initials>
              <email>laputenko.av@gmail.com</email>
              <address>Tomsk, Russian Federation</address>
            </individInfo>
          </author>
          <author num="004">
            <authorCodes>
              <orcid>0000-0002-4006-1161</orcid>
            </authorCodes>
            <individInfo lang="ENG">
              <orgName>Institute for System Programming of the Russian Academy of Sciences</orgName>
              <surname>Evtushenko</surname>
              <initials>Nina</initials>
              <email>evtushenko@ispras.ru</email>
              <address>Moscow, Russian Federation</address>
            </individInfo>
          </author>
        </authors>
        <artTitles>
          <artTitle lang="ENG">Studying the impact of morphological parameters on text readability using statistical analysis methods</artTitle>
        </artTitles>
        <abstracts>
          <abstract lang="ENG">The paper addresses one important aspect of text complexity, namely the dependence of text readability on a set of morphological and text surface metrics such as the average length of words, sentences, etc. It analyzes the correlation between objective text complexity, which is specified by quantitative parameters of linguistic features, and subjective text complexity, i.e. the difficulty of text comprehension as a psychological phenomenon. To assess morphological text complexity, we used an annotated dataset consisting of 1000 online news texts (140000 tokens) retrieved from the websites of Russian universities. For each text unit, the ratio of each part of speech per token is measured. The online news texts in the dataset were also assessed by the target audience of the websites, i.e. applicants, undergraduate and postgraduate students. As a result, the dataset was automatically annotated based on linguistic features of the texts and human-labelled based on experts’ estimates of text readability on a 5-point scale. To assess the significance of morphological metrics and their influence on text readability, correlation and regression analyses were carried out. To automatically classify a text as ‘easy-to-read’ or not ‘easy-to-read’, both single-feature models and compound models including more than one metric were constructed. In agreement with prior research, the most common metrics influencing text readability appear to be text surface characteristics. However, the proposed models also made it possible to establish the significance of morphological parameters, used in both single-feature and compound models, such as the use of participles, nouns in the genitive case, adjectives, and numerals, which should be taken into account when analyzing the readability of news texts. Moreover, novel formulae for assessing readability were proposed based on the studied coefficients.</abstract>
        </abstracts>
        <codes>
          <doi>10.18721/JHSS.14104</doi>
          <udk>81.32</udk>
        </codes>
        <keywords>
          <kwdGroup lang="ENG">
            <keyword>text complexity</keyword>
            <keyword>readability</keyword>
            <keyword>morphological features</keyword>
            <keyword>media text</keyword>
            <keyword>correlation and regression analysis</keyword>
          </kwdGroup>
        </keywords>
        <files>
          <furl>https://human.spbstu.ru/article/2023.51.4/</furl>
          <file>30-40.pdf</file>
        </files>
      </article>
      <article>
        <artType>RAR</artType>
        <langPubl>RUS</langPubl>
        <pages>41-56</pages>
        <authors>
          <author num="001">
            <individInfo lang="ENG">
              <orgName>Herzen State Pedagogical University of Russia</orgName>
              <surname>Kamshilova</surname>
              <initials>Olga</initials>
              <email>onkamshilova@gmail.com</email>
              <address>St. Petersburg, Russian Federation</address>
            </individInfo>
          </author>
          <author num="002">
            <individInfo lang="ENG">
              <orgName>Herzen State Pedagogical University of Russia</orgName>
              <surname>Belyaeva</surname>
              <initials>Larisa</initials>
              <email>belyaevaln@herzen.spb.ru</email>
              <address>St. Petersburg, Russian Federation</address>
            </individInfo>
          </author>
        </authors>
        <artTitles>
          <artTitle lang="ENG">Machine translation in the age of digitalization: new practices, procedures and resources</artTitle>
        </artTitles>
        <abstracts>
          <abstract lang="ENG">In Russian linguistics, digitalization is traditionally associated with the use of mathematical and computer methods applied mainly to text processing problems in various automated systems. The article analyzes the impact of digitalization on the use and purpose of machine translation (MT) systems in modern conditions. It describes new practices of using MT products both by professional translators and by general users of MT systems for their individual purposes. It highlights the objective advantages and disadvantages of MT application from the point of view of both practicing professional translators and ordinary users. The article considers translators’ new working conditions and their new roles and skills, as determined by the impact of digitalization on working with text. It pays special attention to post-editing MT products as a new professional activity for translators, which is needed to ensure high-quality translation and to extract correct information. It also describes the necessary and sufficient post-editing procedures to be performed by non-professional users pursuing their own goals through MT application. Finally, the research focuses on the analysis of procedures and available linguistic resources that can optimize working with MT systems.</abstract>
        </abstracts>
        <codes>
          <doi>10.18721/JHSS.14105</doi>
          <udk>8'33</udk>
        </codes>
        <keywords>
          <kwdGroup lang="ENG">
            <keyword>digitalization</keyword>
            <keyword>machine translation (MT)</keyword>
            <keyword>MT systems</keyword>
            <keyword>post-editing</keyword>
            <keyword>translation practices</keyword>
            <keyword>linguistic resources</keyword>
          </kwdGroup>
        </keywords>
        <files>
          <furl>https://human.spbstu.ru/article/2023.51.5/</furl>
          <file>41-56.pdf</file>
        </files>
      </article>
      <article>
        <artType>RAR</artType>
        <langPubl>RUS</langPubl>
        <pages>57-69</pages>
        <authors>
          <author num="001">
            <authorCodes>
              <orcid>0000-0001-9085-0284</orcid>
            </authorCodes>
            <individInfo lang="ENG">
              <orgName>St. Petersburg State University</orgName>
              <surname>Khokhlova</surname>
              <initials>Maria</initials>
              <email>m.khokhlova@spbu.ru</email>
              <address>St. Petersburg, Russian Federation</address>
            </individInfo>
          </author>
        </authors>
        <artTitles>
          <artTitle lang="ENG">Learner corpora: relevant information and an overview of the existing frameworks</artTitle>
        </artTitles>
        <abstracts>
          <abstract lang="ENG">In the modern world, there is a constant interest in foreign languages. Therefore, studying the language used by non-native speakers, as well as describing their mistakes, is a highly relevant matter. Learner corpora differ not only in the languages they focus on, but also in a number of their properties. The purpose of the study is to present a review of the learner corpora available for different languages, as well as to compare the approaches that exist for their annotation. The paper considers the origins of learner corpus research; focuses on the main stages of a project, the types of learner corpora (which may differ in their tasks, students’ mother tongue, language proficiency, text genre, data type, etc.), and the linguistic and metatextual information that accompanies texts; and provides a classification of errors. The paper gives a brief overview of annotation tools and corpus platforms that can be used for building a learner corpus.</abstract>
        </abstracts>
        <codes>
          <doi>10.18721/JHSS.14106</doi>
          <udk>81'32</udk>
        </codes>
        <keywords>
          <kwdGroup lang="ENG">
            <keyword>learner corpora</keyword>
            <keyword>typology</keyword>
            <keyword>errors</keyword>
            <keyword>annotation</keyword>
            <keyword>second language acquisition</keyword>
          </kwdGroup>
        </keywords>
        <files>
          <furl>https://human.spbstu.ru/article/2023.51.6/</furl>
          <file/>
        </files>
      </article>
      <article>
        <artType>RAR</artType>
        <langPubl>RUS</langPubl>
        <pages>70-87</pages>
        <authors>
          <author num="001">
            <authorCodes>
              <orcid>0000-0002-3008-5514</orcid>
            </authorCodes>
            <individInfo lang="ENG">
              <orgName>St. Petersburg State University</orgName>
              <surname>Mitrofanova</surname>
              <initials>Olga A.</initials>
              <email>o.mitrofanova@spbu.ru</email>
              <address>St. Petersburg, Russian Federation</address>
            </individInfo>
          </author>
          <author num="002">
            <individInfo lang="ENG">
              <orgName>St. Petersburg State University</orgName>
              <surname>Athugodage</surname>
              <initials>Mark</initials>
              <email>m.athugodage@yahoo.com</email>
              <address>St. Petersburg, Russian Federation</address>
            </individInfo>
          </author>
        </authors>
        <artTitles>
          <artTitle lang="ENG">Dynamic topic modelling of the Russian legal text corpus</artTitle>
        </artTitles>
        <abstracts>
          <abstract lang="ENG">The article is devoted to the dynamic topic modelling analysis of legislative acts, decrees of senior officials, and resolutions of the Supreme and Constitutional Courts dated 2008–2022, which are included in the research corpus of Russian legal documents. The article describes the procedures of corpus construction and preprocessing and the training of topic models on this corpus. We consider both a standard topic model and a dynamic topic model that takes into account changes in topics over time. After training the models under various conditions, a set of optimal training parameters was determined. The BERTopic library was used as the main tool for topic modelling; it combines algorithms for constructing topic models with contextualized neural network models of distributed vectors. The research data may be of interest both to specialists in the field of computational linguistics and to sociologists, political scientists, and lawyers working with legislative documents.</abstract>
        </abstracts>
        <codes>
          <doi>10.18721/JHSS.14107</doi>
          <udk>81'32, 81'33</udk>
        </codes>
        <keywords>
          <kwdGroup lang="ENG">
            <keyword>topic modelling</keyword>
            <keyword>dynamic topic model</keyword>
            <keyword>BERTopic</keyword>
            <keyword>Russian corpus of legal documents</keyword>
            <keyword>Russian gazette</keyword>
          </kwdGroup>
        </keywords>
        <files>
          <furl>https://human.spbstu.ru/article/2023.51.7/</furl>
          <file>70-87.pdf</file>
        </files>
      </article>
      <article>
        <artType>RAR</artType>
        <langPubl>RUS</langPubl>
        <pages>88-97</pages>
        <authors>
          <author num="001">
            <authorCodes>
              <orcid>0000-0002-8815-7920</orcid>
            </authorCodes>
            <individInfo lang="ENG">
              <orgName>Petrozavodsk State University</orgName>
              <surname>Rogov</surname>
              <initials>Alexander</initials>
              <address>Petrozavodsk, Russian Federation</address>
            </individInfo>
          </author>
          <author num="002">
            <authorCodes>
              <orcid>0000-0001-5556-5349</orcid>
            </authorCodes>
            <individInfo lang="ENG">
              <orgName>Petrozavodsk State University</orgName>
              <surname>Moskin</surname>
              <initials>Nikolai</initials>
              <email>moskin@petrsu.ru</email>
              <address>Petrozavodsk, Russian Federation</address>
            </individInfo>
          </author>
          <author num="003">
            <authorCodes>
              <orcid>0000-0001-9939-9389</orcid>
            </authorCodes>
            <individInfo lang="ENG">
              <orgName>Petrozavodsk State University</orgName>
              <surname>Lebedev</surname>
              <initials>Alexander</initials>
              <email>perevodchik88@yandex.ru</email>
              <address>Petrozavodsk, Russian Federation</address>
            </individInfo>
          </author>
        </authors>
        <artTitles>
          <artTitle lang="ENG">On the paradigm shift of the author's invariant</artTitle>
        </artTitles>
        <abstracts>
          <abstract lang="ENG">One of the urgent tasks in philology is the attribution of texts. The quantitative indicator by which one can distinguish between the works of different authors is called the author's invariant. The paper describes a number of studies (the method of G. Kjetsaa, the method of evaluating the pair connection of grammatical classes, the method of "decision trees", the Delta method), the results of which confirm that the initial definition of the author's invariant should be corrected. In particular, this applies to the time interval during which the attribution parameter should keep a "constant value". This interval does not necessarily coincide with the entire period of the writer's work. Also, owing to the lack of a universal criterion that uniquely distinguishes a particular writer from others, one should use a set of characteristics of author's invariants at different levels of the language. The analysis shows that the term "author's invariant" should be divided into two categories, "global author's invariant" and "local author's invariant", which can be consistently studied independently of each other.</abstract>
        </abstracts>
        <codes>
          <doi>10.18721/JHSS.14108</doi>
          <udk>808.1</udk>
        </codes>
        <keywords>
          <kwdGroup lang="ENG">
            <keyword>literary text</keyword>
            <keyword>linguostatistical parameter</keyword>
            <keyword>author's invariant</keyword>
            <keyword>classification</keyword>
            <keyword>data mining</keyword>
          </kwdGroup>
        </keywords>
        <files>
          <furl>https://human.spbstu.ru/article/2023.51.8/</furl>
          <file>88-97.pdf</file>
        </files>
      </article>
      <article>
        <artType>CNF</artType>
        <langPubl>RUS</langPubl>
        <pages>98-107</pages>
        <authors>
          <author num="001">
            <individInfo lang="ENG">
              <orgName>Herzen State Pedagogical University of Russia</orgName>
              <surname>Kamshilova</surname>
              <initials>Olga</initials>
              <email>onkamshilova@gmail.com</email>
              <address>St. Petersburg, Russian Federation</address>
            </individInfo>
          </author>
          <author num="002">
            <individInfo lang="ENG">
              <orgName>Herzen State Pedagogical University of Russia</orgName>
              <surname>Belyaeva</surname>
              <initials>Larisa</initials>
              <email>belyaevaln@herzen.spb.ru</email>
              <address>St. Petersburg, Russian Federation</address>
            </individInfo>
          </author>
          <author num="003">
            <authorCodes>
              <researcherid>J-2590-2015</researcherid>
              <scopusid>57207357482</scopusid>
            </authorCodes>
            <individInfo lang="ENG">
              <orgName>Herzen State Pedagogical University of Russia</orgName>
              <surname>Piotrowska</surname>
              <initials>Xenia</initials>
              <email>krp62@mail.ru</email>
              <address>St. Petersburg, Russian Federation</address>
            </individInfo>
          </author>
        </authors>
        <artTitles>
          <artTitle lang="ENG">Language engineering and applied linguistics today: The chronicle of the IV International conference “R. Piotrowski’s Readings – 2022”</artTitle>
        </artTitles>
        <abstracts>
          <abstract lang="ENG">This chronicle provides an overview of the IV International Conference on Language Engineering and Applied Linguistics “R. Piotrowski’s Readings – 2022”, held on November 22, 2022, at Herzen State University (St. Petersburg, Russia). The conference was organized to mark the 100th anniversary of the birth of Rajmund G. Piotrowski (1922–2009), a Russian scientist, professor, and Honored Scientist of Russia. R.G. Piotrowski was the founder of the Language Engineering School, a pioneer of MT in Russia, and the initiator of the engineering-linguistic strategy in research and practical methodological work and of the evidence-based paradigm in the methodology of humanities research. The article presents a brief outline of R.G. Piotrowski’s scientific legacy. It focuses on the various methodological approaches and research practices in the field of engineering and applied linguistics contributed by the conference participants.</abstract>
        </abstracts>
        <codes>
          <doi>10.18721/JHSS.14109</doi>
          <udk>8'33</udk>
        </codes>
        <keywords>
          <kwdGroup lang="ENG">
            <keyword>R.G. Piotrowski</keyword>
            <keyword>engineering linguistics</keyword>
            <keyword>applied linguistics</keyword>
            <keyword>corpus linguistics</keyword>
            <keyword>machine translation</keyword>
            <keyword>language training computer systems</keyword>
          </kwdGroup>
        </keywords>
        <files>
          <furl>https://human.spbstu.ru/article/2023.51.9/</furl>
          <file>98-107.pdf</file>
        </files>
      </article>
    </articles>
  </issue>
</journal>
