<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.3 20210610//EN" "https://jats.nlm.nih.gov/publishing/1.3/JATS-journalpublishing1-3.dtd">
<article article-type="research-article" dtd-version="1.3" xml:lang="en">
  <front xmlns:xlink="http://www.w3.org/1999/xlink">
    <journal-meta>
      <journal-id journal-id-type="elibrary">80301</journal-id>
      <journal-title-group>
        <journal-title>Terra Linguistica</journal-title>
        <trans-title-group xml:lang="ru">
          <trans-title>Terra Linguistica</trans-title>
        </trans-title-group>
      </journal-title-group>
      <issn pub-type="epub">2782-5450</issn>
    </journal-meta>
    <article-meta xmlns:xlink="http://www.w3.org/1999/xlink">
      <article-id pub-id-type="publisher-id">9</article-id>
      <article-id pub-id-type="doi">10.18721/JHSS.12209</article-id>
      <title-group>
        <article-title>Searching for multicomponent terms in comparable scientific corpora</article-title>
        <trans-title-group xml:lang="ru">
          <trans-title>Searching for multicomponent terms in comparable scientific corpora</trans-title>
        </trans-title-group>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <name>
            <surname>Belyaeva</surname>
            <given-names>Larisa</given-names>
          </name>
          <xref ref-type="aff" rid="aff1"/>
          <email>belyaevaln@herzen.spb.ru</email>
        </contrib>
        <contrib contrib-type="author">
          <name>
            <surname>Kamshilova</surname>
            <given-names>Olga</given-names>
          </name>
          <xref ref-type="aff" rid="aff1"/>
          <email>onkamshilova@gmail.com</email>
        </contrib>
      </contrib-group>
      <aff id="aff1">Herzen State Pedagogical University of Russia</aff>
      <pub-date publication-format="electronic" date-type="pub" iso-8601-date="2021-06-29">
        <day>29</day>
        <month>06</month>
        <year>2021</year>
      </pub-date>
      <volume>12</volume>
      <issue>2</issue>
      <fpage>118</fpage>
      <lpage>124</lpage>
      <self-uri xmlns:xlink="http://www.w3.org/1999/xlink" content-type="pdf" xlink:href="https://human.spbstu.ru/userfiles/files/articles/2021/2/118-124.pdf"/>
      <abstract xml:lang="en">
        <p>The paper proposes using full-text parallel/comparable corpora with a “built-in” component of machine translation (MT) output for term extraction, harmonization and translation, since analyzing and comparing these texts makes it possible to identify terminological units for dictionary entries. We focus on the complex, non-parallel structure of English multicomponent terminological noun phrases (NPs) and on their variants and modifications within the same text, which together call for a three-part text corpus made up of parallel/comparable texts and their MT translations. The research shows that multicomponent terminological NPs are not only characteristic of scientific texts but also exhibit ambiguous dependency relations caused by syntactic compression, which normally results from the convolution of a sentence or of another NP. These modifications result from a number of standard procedures described in the paper.</p>
      </abstract>
      <kwd-group xml:lang="en">
        <kwd>comparable corpora</kwd>
        <kwd>MT</kwd>
        <kwd>multicomponent NPs</kwd>
        <kwd>terminological NPs</kwd>
        <kwd>lexicography</kwd>
        <kwd>noun phrase transformation</kwd>
      </kwd-group>
    </article-meta>
  </front>
</article>
