Quantitative and Statistical Characteristics of the Lexical Environment of the Nouns bran’ ‘battle’ and rat’ ‘war, army’ in Old Church Slavonic and East Slavonic Sources from the 11th–15th Centuries: Experience in Extracting and Analyzing Small Corpus Data

Summary: 

This paper analyzes the lexical distribution of the nouns брань ‘battle’ and рать ‘war, army’ in sixteen Slavic written monuments of the 11th–15th centuries that differ in their textual characteristics. Three methods are applied: comparing the frequencies of the two nouns; extracting bigrams and measuring the closeness of association between each noun and its collocates; and calculating the correlation between the collocate lists of one noun in different subcorpora and between the collocate lists of the two nouns within one subcorpus. The author posits a relationship between the basic characteristics of the lexical environment and the semantics of the words брань and рать, and demonstrates a relationship between the distribution of these words and the characteristics of the texts in which they occur.
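The three measurements named in the summary can be illustrated with a minimal Python sketch. It is not the author's implementation, nor the software of the corpus used in the study. Several choices here are assumptions made only for illustration: a window of directly adjacent tokens, pointwise mutual information (PMI) as the closeness measure, Spearman's rank correlation for comparing collocate lists, and all function names.

```python
import math
from collections import Counter


def relative_frequency(tokens, word, per=1000):
    """Occurrences of `word` per `per` tokens (for comparing subcorpora of unequal size)."""
    return tokens.count(word) * per / len(tokens)


def bigram_counts(tokens):
    """Count adjacent token pairs (bigrams)."""
    return Counter(zip(tokens, tokens[1:]))


def pmi_collocates(tokens, node):
    """Return {collocate: PMI} for words directly adjacent to `node`.

    PMI is approximated as log2(f_xy * N / (f_x * f_y)), with N = token count
    (an assumed measure; the study may use a different one).
    """
    n = len(tokens)
    unigrams = Counter(tokens)
    pair_freq = Counter()
    for (left, right), freq in bigram_counts(tokens).items():
        if left == node:
            pair_freq[right] += freq
        elif right == node:
            pair_freq[left] += freq
    return {
        w: math.log2(f * n / (unigrams[node] * unigrams[w]))
        for w, f in pair_freq.items()
    }


def average_ranks(values):
    """Rank values from 1 (smallest); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman(scores_a, scores_b):
    """Spearman's rho over collocates shared by two score dicts, or None if undefined."""
    shared = sorted(set(scores_a) & set(scores_b))
    if len(shared) < 2:
        return None
    ra = average_ranks([scores_a[w] for w in shared])
    rb = average_ranks([scores_b[w] for w in shared])
    mean_a, mean_b = sum(ra) / len(ra), sum(rb) / len(rb)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(ra, rb))
    var_a = sum((x - mean_a) ** 2 for x in ra)
    var_b = sum((y - mean_b) ** 2 for y in rb)
    if var_a == 0 or var_b == 0:
        return None
    return cov / math.sqrt(var_a * var_b)
```

Under these assumptions, `spearman(pmi_collocates(sub1, node), pmi_collocates(sub2, node))` compares the collocates of one noun across two subcorpora, while passing the collocate scores of two different nouns from the same token list compares the nouns within one subcorpus. With corpora as small as those described, PMI tends to overrate rare pairs, so a frequency threshold or another association measure may be preferable.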

Victor A. Baranov (Izhevsk, Russia)
