A corpus of Slavic dialects in Albania

Link:
Author:
Contributors:
  • Dr. Maxim Makartsev
  • Dr. Elena Uzeneva
  • Dr. Timofey Arkhangelskiy
Publisher/Institution:
Universität Hamburg
Year of publication:
2024
Media type:
Dataset
Keywords:
  • Balkan linguistics
  • language contact
  • sociolinguistics
  • Slavic dialects in Albania
  • Slavic ethnography
  • Macedonian dialectology
  • Štokavian dialectology
  • Albanian language
Description:
  • A corpus of Slavic dialects in Albania

    The user-friendly version of the Corpus with search options is available here.

    These are the main parameters of the corpus:

    1. Korça Macedonian (KM): Rural locations: Boboshtica | Urban locations: Korça | Size: 34.0 thousand words | Morphological analysis: Classla 2.1.1 for Macedonian, partial manual postprocessing.
    2. Prespa Macedonian (PM): Rural locations: Pustec, Gorna Gorica, Dolna Gorica, Shulin | Urban locations: Elbasan, Korça | Size: 171.3 thousand words | Morphological analysis: Classla 2.1.1 for Macedonian, partial manual postprocessing.
    3. Golloborda Macedonian (GM): Rural locations: Trebisht, Vërnica, Malestreni | Urban locations: Durrës, Elbasan, Tirana | Size: 239.7 thousand words | Morphological analysis: Classla 2.1.1 for Macedonian, partial manual postprocessing.
    4. Myzeqe Štokavian (MŠ): Rural locations: Rreth Libofsha, Petova | Urban locations: Fier | Size: 58.8 thousand words | Morphological analysis: Classla 2.1.1 for Serbo-Croatian, partial manual postprocessing.
    5. Shijak Štokavian (SŠ): Rural locations: Borake, Koxhas | Urban locations: Shijak, Sukth | Size: 68.8 thousand words | Morphological analysis: Classla 2.1.1 for Serbo-Croatian, partial manual postprocessing.
    6. Albanian: Rural locations: All of the above | Urban locations: All of the above | Size: 34.7 thousand words | Morphological analysis: uniparser-albanian.
    7. Other languages: Bulgarian, English, French, German, Greek, BCMS, Italian, Russian, Turkish | Size: 4.5 thousand words | Morphological analysis: not analyzed.
    8. Total size: 611.8 thousand words.
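
    As a quick consistency check on the figures above, the sub-corpus sizes can be summed and compared with the stated total. The following is a minimal sketch; the dictionary and function names are illustrative assumptions and are not part of the corpus tooling.

```python
# Sub-corpus sizes in thousands of words, copied from the list above.
SIZES = {
    "Korça Macedonian": 34.0,
    "Prespa Macedonian": 171.3,
    "Golloborda Macedonian": 239.7,
    "Myzeqe Štokavian": 58.8,
    "Shijak Štokavian": 68.8,
    "Albanian": 34.7,
    "Other languages": 4.5,
}


def total_size(sizes):
    """Sum all sub-corpus sizes, rounded to one decimal place."""
    return round(sum(sizes.values()), 1)


def shares(sizes):
    """Return each sub-corpus's share of the whole corpus in percent."""
    total = sum(sizes.values())
    return {name: round(100 * size / total, 1) for name, size in sizes.items()}


if __name__ == "__main__":
    print("Total (thousand words):", total_size(SIZES))
    for name, share in shares(SIZES).items():
        print(f"{name}: {share}%")
```

    The sum matches the stated total of 611.8 thousand words, and Golloborda Macedonian is the largest sub-corpus.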

    Previous research on Slavic dialects in Albania

    This project is not the first study of Slavic dialects in Albania (SDAs), but we did consider varieties that have never been studied (e.g., Slavic speech in urban Albanian settings; Myzeqe Štokavian).

    Starting from the groundbreaking monograph Slavic populations of Albania by A. Seliščev (1931), there have been several publications on SDAs that covered both the history of the Slavic population in this country (e.g., Ylli’s (1997, 2000) monograph on Slavic borrowings in Albanian toponymics) and its current state (Bojović 1991; Tončeva 2014; Vidoeski 1998). Of the utmost importance is the four-volume work “Die slavischen Minderheiten in Albanien” by Steinke and Ylli (2007, 2008, 2010, 2013), supported by a Deutsche Forschungsgemeinschaft (DFG) grant between 2002 and 2011.

    In the selected publications listed, as well as in those dedicated to separate dialects (see the outline of dialects below), you can find descriptions of the language systems of the SDAs, information about the current status of the communities that use the SDAs, and dialectal transcripts.

    Selected language varieties

    The labels for the language varieties included in the corpus do not make any claims about the national, ethnic, or other identities of the speakers; they are provided purely for orientation within the respective dialectologies. Nor are the labels necessarily those that the speakers themselves used. In fact, some speakers do not use any labels at all, while others use a variety of labels, often non-terminologically. The language issue within some of the ethnolinguistic minorities in Albania is seriously politicized; however, this corpus does not advance the political claims of any organization, party, group of individuals, or state. The language labels used in the external sources quoted here are kept as in the original, for identification purposes only.

    Five dialects were chosen for this project, as shown on the Google Map.

    Golloborda Macedonian

    Golloborda Macedonian is a peripheral Balkan Slavic dialect that continues West Macedonian Debar dialects in Albanian territory. It is spoken in 15 villages in the Albanian regions of Dibra and Elbasan, as well as in migrant communities in the cities of Durrës, Tirana, and Elbasan. It has been estimated that it has more than 7,000 speakers in total. In its rural centers, this community has been studied thoroughly by a team of researchers from the Institute of Linguistic Studies (Russian Academy of Sciences), Saint Petersburg State University, and Peter the Great Museum of Anthropology and Ethnography (Kunstkammer); their research resulted in a valuable monograph that was translated into Albanian and Macedonian. Notably, our corpus-based research focused on the sociolinguistic variation and changes within this dialect, specifically between its rural and urban centers, so our methods and data differ from those in the research of our peers from Saint Petersburg.

    Selected literature: Steinke & Ylli (2008); Sobolev & Novik (2013, 2017, 2018).

    Korça Macedonian

    Korça Macedonian is an apparently extinct Balkan Slavic island dialect (structurally close to the dialectal area of Southeastern Macedonian). The corpus includes speech samples from the last six speakers (three of whom lived in the village of Boboshtica and three of whom lived in the town of Korça in Southeastern Albania but were originally from Drenova). Despite the community being so small, this dialect was crucial for the project, as it has been subjected to the most prolonged and intensive Albanian influence. However, family discussions could not be organized in this community.

    Selected literature: Mazon (1936); Mazon and Filipova-Bajrova (1965); Steinke and Ylli (2007)

    Prespa Macedonian

    Prespa Macedonian is a peripheral Balkan Slavic dialect that continues the West Macedonian Ohrid-Prespa dialects on the Albanian side of Great Prespa Lake, transitional to Southeastern Macedonian. According to estimates, it has around 4,500 speakers in nine villages of the region and two large towns, namely Korça and Bilisht.

    Selected literature: Steinke and Ylli (2007); Cvetanovski (2010)

    Myzeqe Štokavian

    Spoken in several quarters in Fier and several villages around this town, Myzeqe Štokavian is a Štokavian island dialect used by recent (1920s) migrants from the Sandžak region (a Novi Pazar-Sjenica dialect of the Zeta-Sjenica dialectal zone) of what is now Southwestern Serbia and the bordering region of Montenegro.

    Selected literature: Makartsev and Kikilo (2022); Makartsev (2023)

    Shijak Štokavian

    Spoken in the village of Borake and its satellite village of Koxhas, Shijak Štokavian is a Štokavian island dialect used by relatively recent (from the 1880s) migrants from the Mostar region in what is now Bosnia and Herzegovina (a central Herzegovinian subdialect of the East Bosnian dialectal zone, spoken in the Mostar–Čapljina–Stolac triangle). It is spoken by 150 to 220 families in both villages. Speakers of this dialect also live in the towns of Shijak and Sukth.

    Selected literature: Steinke and Ylli (2013); Makartsev and Kikilo (2022); Makartsev (2023)

    Sociolinguistic diversity among the SDAs

    These dialects have varying degrees of structural affinity with Albanian due to their differing connections to the Balkan sprachbund. Structurally, the closest to Albanian are the Balkan Slavic dialects of Korça, Golloborda, and Prespa. The Štokavian dialects of Myzeqe and Shijak are not included in the Balkan sprachbund and show less structural affinity with Albanian.

    The selected dialects do not represent all the SDAs; one of the most complete lists of SDAs can be found in Steinke and Ylli’s monograph. However, the selected dialects do cover the range of elements and parameters that have caused the diversity in the SDAs.

    Two of the dialects are Štokavian (Myzeqe and Shijak), but their speakers have differing ethnopolitical and linguistic orientations: Our interviewees in Shijak usually articulated their Bosniak identity; Myzeqe speakers usually clarified their Bosniak or Serbian identity. Three dialects are Balkan Slavic (Golloborda, Korça, Prespa). The orientation of the speakers of these dialects toward a standard language (Macedonian or Bulgarian) is usually individual. The ethnopolitical and linguistic orientations mentioned by the speakers were thus not interpreted as making any political claims, but they allowed us to thematize the orientation of the speakers toward one of the standard Southern Slavic languages and better explain certain features in their speech.

    Four of the communities have a rural (more conservative) and an urban (less conservative) center (except Korça Macedonian, whose number of speakers did not allow us to construct this opposition). For Shijak, the labor activity of the speakers was important, especially in terms of whether their jobs were connected to work at the national road services. (Such workers have daily contact with the language of TIR truck drivers who speak BCMS.) The opposition of rural versus urban was not relevant for the Shijak data because of the short distance between the settlements and the small size of the towns of Sukth and Shijak. The city of Durrës, which is where many of the dialectal speakers work daily, is also located too close to Shijak to allow for the shaping of a significant urban colony with distinct features.

    Religion was another factor that we considered might influence linguistic identity choices (cf. the linguo-confessional situation in Bosnia and Herzegovina and regions of Montenegro and Serbia populated by a Štokavian-speaking but traditionally Muslim population) since it is relevant to numerous distinctive features of the traditional culture. All Myzeqe and Shijak Štokavian speakers that we interviewed culturally and traditionally belong to Sunni Islam. Among the Balkan Slavic communities, all our Korça and Prespa speakers culturally belong to Orthodox Christianity, while Golloborda is heterogeneous, with Sunni Islam predominating.

    Topics of our interviews

    1. Narratives and memorates: Type: Unstructured 

    2. Ethnographical and ethnolinguistic interviews:

    2.1. Calendar, rites of passage (birth, marriage, death), demonology: Type: Semi-structured | Bibliographical reference: (Plotnikova 2009; see the online publication).

    2.2. Rites and beliefs connected to the moon: Type: Semi-structured | Bibliographical reference: (Čëxa 2009).

    2.3. Rites and beliefs connected to the cuckoo: Type: Semi-structured | Bibliographical reference: (Makartsev 2017).

    3. Frog, Where Are You?: Bibliographical reference: (Berman et al. 1994–2004; Mayer 1969; see preview).

    3.1. Conducted by the researchers

    3.2. Conducted by trained local assistants

    4. Family talks: Type: Unstructured | Bibliographical reference: (Hentschel and Zeller 2013)

    The narratives and memorates (T. 1) were unstructured discussions about the oral history and current problems of a given community that also provided insights into the identity and politics of memory of the community. The researchers led these discussions.

    The ethnographical and ethnolinguistic interviews (T. 2) were conducted to collect ethnographical and ethnolinguistic information. They comprised the informants’ answers to our questions and covered various aspects of the traditional culture. We mainly followed the structure of the questionnaires (or interview designs) listed in the table, with slight adaptations.

    Frog, Where Are You? (T. 3) is a book with 24 pictures that combine to form a visual narrative. This section was structured as a questionnaire, ensuring that the researcher was minimally involved. We also asked our trained local assistants to record themselves or their relatives and friends answering this questionnaire; therefore, the data that we collected here resembled real-life language use.

    Our trained local assistants organized the family talks (T. 4) in our absence. The aim was to record spontaneous speech, so the topics were irrelevant. Since the same assistants prepared the transcripts, they could omit any sections that contained potentially harmful information or could have been used to identify the speakers.

    Collecting this type of data was most successful for Golloborda Macedonian speakers since we had a network of trained local assistants upon whom we could rely.

    We managed to arrange a few family talks among Prespa Macedonian speakers and just one family talk with Shijak Štokavian speakers.

    For Myzeqe Štokavian, arranging family talks has not yet been successful.

    For Korça Macedonian, such discussions were impossible since none of our speakers still used the dialect daily, although they could still speak it with us. The epigraph above was spoken by one of the speakers from Drenova (Dre01), wherein he described the frustration he felt while witnessing the attrition and loss of his native dialect.

    Speakers

    The speakers in the transcripts were divided into three main categories:

    1) Native speakers of the respective SDAs. They were anonymized. All information that could be used for their identification was manually removed from the corpus (tagged as ((ERASED))). All speakers of this category were referenced with indices comprising three letters (for the settlement) and two digits. We also referred to them by these indices in our publications based on the corpus.

    2) Researchers. They were only referenced with letter indices. Their names are provided in the Acknowledgments section.

    3) SPK. This abbreviation was used to refer to all other speakers whose speech was transcribed for context but was not annotated for various reasons (an unknown neighbor passing by the window and saying hello, an Albanian-speaking waiter in a village café, some unidentified background voices, etc.).
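
    The three index types described above can be told apart mechanically. The following is a minimal sketch, assuming that native-speaker indices consist of exactly three ASCII letters followed by two digits (as in Dre01) and that researcher indices consist of letters only (as in MC or MMI); the function name and category labels are illustrative and are not part of the corpus tooling.

```python
import re

# Assumed shape of a native-speaker index: three letters for the
# settlement plus two digits, e.g. "Dre01".
SPEAKER_INDEX = re.compile(r"[A-Za-z]{3}\d{2}")


def speaker_category(code):
    """Classify a speaker code into one of the three categories."""
    if code == "SPK":
        # Generic label for speakers transcribed only for context.
        return "other"
    if SPEAKER_INDEX.fullmatch(code):
        return "native speaker"
    if code.isalpha():
        # Researchers are referenced with letter-only indices.
        return "researcher"
    return "unknown"


if __name__ == "__main__":
    for code in ["Dre01", "MC", "MMI", "SPK"]:
        print(code, "->", speaker_category(code))
```

    The SPK check comes first because SPK also consists of letters only and would otherwise be classified as a researcher index.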

    See list of speakers

    Transcripts

    Our team manually prepared all transcripts. When possible, our trained local assistants, speakers of the dialects who also organized the family discussions, prepared the draft versions. Our editors (specialists in their respective philology) proofread these draft versions, following which Dr. Maxim Makartsev double-proofread them. If trained local assistants were unavailable, our editors prepared the transcripts. We used EXMARaLDA Partitur Editor to match the transcripts with the recordings. The original scripts for Albanian and other languages were used. Only transcripts in Slavic dialects were proofread; transcripts for Albanian and other languages are provided for contextual purposes only and were not proofread. (Hence, their spelling ranges from standard to non-orthographical semi-phonetic spelling.)

    Transcription

    See transcription conventions

    Annotation

    We followed several steps for the annotation.

    First, we formulated rules to determine the language of each word form. These rules drew on the tags that the transcribers and editors entered manually in EXMARaLDA Partitur Editor, on language-specific scripts (e.g., Greek, Russian), on symbols (e.g., ë, which is specific to Albanian), and on combinations of symbols (e.g., ll, rr, and initial ng and mb, which are specific or unique to Albanian).
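
    The kind of rule described in this first step can be illustrated with a small sketch. It is not the project's actual rule set, which is not reproduced here. It assumes that a manual tag always takes precedence, that Cyrillic script is labeled "Russian" following the example in the text (a simplification, since other Cyrillic-script languages also occur in the corpus), that the Albanian cues are ë, the digraphs ll and rr, and initial ng and mb, and that everything else defaults to "Slavic". All names are illustrative.

```python
import unicodedata

# Assumed cues for Albanian, following the examples given in the text.
ALBANIAN_LETTERS = ("ë",)
ALBANIAN_CLUSTERS = ("ll", "rr")
ALBANIAN_ONSETS = ("ng", "mb")


def script_of(word):
    """Return 'Greek', 'Cyrillic', or 'Latin' based on the first letter."""
    for ch in word:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name.startswith("GREEK"):
                return "Greek"
            if name.startswith("CYRILLIC"):
                return "Cyrillic"
            return "Latin"
    return "Latin"


def guess_language(word, manual_tag=None, default="Slavic"):
    """Guess the language of a word form; a manual tag always wins."""
    if manual_tag:
        return manual_tag
    script = script_of(word)
    if script == "Greek":
        return "Greek"
    if script == "Cyrillic":
        return "Russian"  # simplification, see the assumptions above
    lower = word.lower()
    if (
        any(letter in lower for letter in ALBANIAN_LETTERS)
        or any(cluster in lower for cluster in ALBANIAN_CLUSTERS)
        or lower.startswith(ALBANIAN_ONSETS)
    ):
        return "Albanian"
    return default


if __name__ == "__main__":
    for w in ["është", "mbret", "καλημέρα", "спасибо", "mačka"]:
        print(w, "->", guess_language(w))
```

    Surface cues of this kind inevitably misclassify some forms, which is one reason why the manual tags from the transcribers and editors are given priority in the sketch.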

    Second, the respective parsers and taggers were applied depending on the language of the word form (see the table above). Following this, only those parts of the transcripts that the speakers uttered (not the researchers) in Slavic dialects were manually and semi-automatically checked, proofread, and edited. The parts of the transcripts that the researchers uttered or that speakers uttered in other language varieties were not proofread. In such cases, we kept the automatic annotation.

    Third, lemmatization was manually checked (for Slavic); the lemmas automatically marked as Albanian were selectively checked and corrected, if needed. For Golloborda, Korça, and Prespa Macedonian, the lemmatization was performed in standard Macedonian—as the closest standard language structurally. For Myzeqe and Shijak, lemmatization was based on Ijekavian standards. For dialectal lexemes that did not exist in the respective standards, standard phonology was applied, resulting in the creation of dummy lemmata that followed standard phonology but cannot be found in standard dictionaries. The only function of these lemmata was to allow for the trans-dialectal search of word forms. If a standard cognate could not be established, we adopted any suitable word form attested in our transcripts.

    Fourth, the morphological tags for Macedonian and Štokavian were harmonized, since Classla 2.1.1 uses slightly different MULTEXT-East conventions for the two sets of varieties. The resulting tag set is provided below.

    Fifth, the results of the morphological tagging were selectively checked. We focused on the word forms with the greatest homonymy and the lexemes with the most frequent tokens.

    This is the beta-version of our corpus, so the manual editing of the morphological tags is ongoing. If you notice an error, please feel free to contact us. When working with the corpus, manual checking of the search results is highly recommended.

    Considering the principally bilingual nature of our data and frequent code-switches, we would like to highlight two instruments here:

    1) The corpus allows for searching sets of word forms that are specifically ordered (one after another or with one or more irrelevant word forms in between). You may compose the search entry by marking one of the word forms as Slavic (you can choose the dialect) and another as Albanian, which will show all cases of code-switches that follow your chosen parameters.

    2) The special field Foreign includes all lexical matter borrowings and congruent lexicalizations from Albanian. (They cannot be formally distinguished since both types have Albanian stems and Slavic inflectional morphology.)
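
    The logic of the ordered search described in 1) can be sketched independently of the corpus interface. The example below is not the query syntax of the corpus platform; it only illustrates the idea on an invented four-word utterance in which every token carries a language label. The token list, the label "GM", and the function name are assumptions made for illustration.

```python
def find_switches(tokens, first_lang, second_lang, max_gap=0):
    """Find index pairs (i, j) where a token in first_lang is followed by a
    token in second_lang with at most max_gap tokens in between."""
    hits = []
    for i, (_, lang_i) in enumerate(tokens):
        if lang_i != first_lang:
            continue
        last = min(i + 1 + max_gap, len(tokens) - 1)
        for j in range(i + 1, last + 1):
            if tokens[j][1] == second_lang:
                hits.append((i, j))
    return hits


# Invented example (not taken from the corpus): (word form, language label).
TOKENS = [
    ("jas", "GM"),
    ("imam", "GM"),
    ("shumë", "Albanian"),
    ("rabota", "GM"),
]

if __name__ == "__main__":
    # Directly adjacent Slavic -> Albanian switches.
    print(find_switches(TOKENS, "GM", "Albanian"))
    # Allow one intervening word form of any language.
    print(find_switches(TOKENS, "GM", "Albanian", max_gap=1))
```

    With max_gap=0 only directly adjacent pairs are returned; increasing max_gap corresponds to allowing irrelevant word forms in between.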

    We also distinguished direct speech (tag OWN) and quotations (XENO), while several other tags provide additional information about the intonation and context of the interview (((LAUGH)), ((COUGH)), ((NOISE))).

    Metadata

    • Transcript ID
    • Year and location of the recording
    • Sub-corpus (dialect)
    • Code of the speaker
    • Code of the researcher who participated in the recording and transcribing of the interview
    • The type of interview (whether the researchers were present or absent)
    • The speaker’s birth place, birth year, gender, and occupation; other sociolinguistic data when relevant; familial relations when relevant
    • Current place of residence of the speaker
    • Genre

    List of transcripts

    See full list

    Tag set for SDAs

    Originally, the tag set was based on MULTEXT-East morphosyntactic specifications. Korça Macedonian, Prespa Macedonian, and Golloborda Macedonian were based on Macedonian specifications; Myzeqe Štokavian and Shijak Štokavian were based on Serbo-Croatian specifications.

    We introduced several changes to 1) harmonize the Macedonian and Štokavian parsers and taggers that otherwise followed somewhat different principles and conventions; 2) adapt the tag set to the terminology most widely used in Slavic linguistics (e.g., the use of the term “imperfective aspect” instead of “progressive aspect”); 3) unify all other possible idiosyncrasies (e.g., MULTEXT-East morphosyntactic specifications for Serbo-Croatian do not have verbal aspects, so these had to be introduced for our corpus).

    The tag set for the Albanian language section was developed by Maria Morozova, Alexander Rusakov, and Timofey Arkhangelskiy for the Albanian National Corpus and can be found here. Albanian tags are preceded by the prefix sq: to avoid confusion with homonymous Slavic tags.
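
    Two of the conventions described above, the renaming of “progressive aspect” to “imperfective aspect” and the sq: prefix for Albanian tags, can be illustrated with a minimal sketch. The tag names used here are hypothetical placeholders and do not reproduce the corpus tag set; the function names are likewise assumptions.

```python
# Label renaming mentioned in the text; any further renamings are not
# listed in this description and are therefore not included.
LABEL_RENAMES = {"progressive": "imperfective"}


def harmonize(tags):
    """Rename labels so that both Slavic taggers use the same terms."""
    return [LABEL_RENAMES.get(tag, tag) for tag in tags]


def namespace(tags, language):
    """Prefix Albanian tags with 'sq:' to keep them apart from
    homonymous Slavic tags; leave all other tags unchanged."""
    if language == "Albanian":
        return ["sq:" + tag for tag in tags]
    return list(tags)


if __name__ == "__main__":
    print(harmonize(["V", "progressive"]))
    print(namespace(["N", "sg"], "Albanian"))
    print(namespace(["N", "sg"], "Macedonian"))
```

    Keeping the prefix as a separate step means that a query for a Slavic tag never matches an Albanian word form by accident.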

    The grammatical features of the words in the corpus are marked with short tags. In tags, abbreviations are capitalized, while full words are not.

    See the tag set

    Frequently asked questions

    — What is the Corpus of Slavic dialects in Albania?

    This is a language corpus, that is, a collection of non-adapted transcripts of interviews conducted in Slavic dialects that are spoken in Albania. Each word form in these dialects included in the corpus is enriched with additional linguistic information or annotations. We also have a user-friendly interface that allows for writing search queries.

    — Who needs corpora?

    Corpora are used by linguists. The search engines and annotations of corpora are designed to allow for easily making linguistic queries such as “find all pronouns in the accusative case” or “find all forms of the word mačka followed by a verb” or “find all instances of a noun followed by an adjective” so that you can retrieve relevant information from the provided linguistic varieties in seconds. Further analyses of this type of data allow linguists to determine how linguistic varieties have changed, how Albanian has influenced these varieties, what the limits of variation are, or whether there are any new and interesting linguistic phenomena that are not found in Macedonian and Štokavian dialects that have had no contact with Albanian.

    Aside from linguists, corpora can be useful tools for language teachers, language learners, and even native speakers.

    A corpus documents linguistic varieties in a given period. For example, one of the varieties included (Korça Macedonian) appeared to have gone extinct during our project (or its last speakers became unavailable to the researchers due to their old age). To the best of our knowledge, our corpus includes the last speech examples of this dialect available. It preserves this dialect and other included varieties for future generations and can be used by language activists for language revitalization.

    — Can I use the corpus for other things beyond purely linguistic research?

    The Corpus of Slavic dialects in Albania makes full transcripts of our recordings available. Aside from merely linguistic interests, the content of the transcripts can also be analyzed since the transcripts are so diverse and include many narratives containing oral history, identity, and anonymized personal biographies. Our transcripts also include much ethnographic and ethnolinguistic information on the traditional culture of the communities, which can be relevant for ethnolinguists, ethnographers, and members of the communities. There are also many examples of oral folk traditions (songs, tales, proverbs, etc.) available for researchers and the general public.

    — Can I use the corpus as a dictionary?

    You might not be able to use this corpus like you would a traditional dictionary because it does not provide translations or explanations of the included words. You may, however, discover in which context the word is used, which you can then use to clarify the word’s meaning.

    — What is a morphological annotation, and how is it obtained?

    Our corpus was lemmatized and morphologically annotated. Lemmatization means that each word in the texts was annotated with its lemma, i.e., its dictionary or citation form. Morphological annotation means that the grammatical features of each word were annotated, including its part of speech, number, case, tense, etc. Since the corpus was too large for manual annotations, it was annotated automatically with programs called morphological analyzers.

    We used analyzers compiled for standard Macedonian since it is closest to Golloborda, Korça, and Prespa Macedonian structurally, as well as analyzers compiled for Štokavian-based standard languages for Myzeqe and Shijak Štokavian (mostly the Croatian analyzer since it could account for dialectal variations in phonology and morphology within the Štokavian dialects). The language denominators for the respective analyzers do not make any claims about the identities of our speakers and were only used as references in external sources.

    The results of the automatic annotation were partially proofread and edited manually. Our corpus still contains homonymy, i.e., cases in which one word form has several possible morphological analyses. For example, ja in Macedonian dialects can mean ‘I’ (first-person singular personal pronoun in the nominative case), ‘her’ (third-person singular feminine personal pronoun in the accusative case), ‘here’ (a deictic particle), etc. Hence, search results may include false positives, and manually checking the data you find in the corpus is strongly recommended.
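
    Why homonymy produces false positives can be shown with a small sketch based on the ja example. The feature names and values below are illustrative placeholders rather than the corpus tag set, and the lookup function is an assumption about how an unresolved analysis behaves in a search, not a description of the search engine.

```python
# Possible analyses of the word form "ja", following the example above.
ANALYSES = {
    "ja": [
        {"gloss": "I", "pos": "pronoun", "case": "nominative"},
        {"gloss": "her", "pos": "pronoun", "case": "accusative"},
        {"gloss": "here", "pos": "particle"},
    ],
}


def matches(form, **wanted):
    """Return True if at least one analysis of the form has all the
    requested feature values."""
    return any(
        all(analysis.get(key) == value for key, value in wanted.items())
        for analysis in ANALYSES.get(form, [])
    )


if __name__ == "__main__":
    # Every token of "ja" satisfies both queries, whatever it means in context.
    print(matches("ja", pos="pronoun", case="accusative"))
    print(matches("ja", pos="particle"))
```

    Because an unresolved token carries all of its analyses, a query for accusative pronouns also returns tokens of ja that are nominative pronouns or particles in context.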

    Acknowledgments

    This corpus is the main research instrument developed for the project “Contact-induced language change in situations of non-stable bilingualism—Its limits and modelling: Slavic (social) dialects in Albania,” funded by the DFG (German Research Foundation), project number 8750/1-1 (October 16th, 2019–April 30th, 2024). The principal investigator was Dr. Maxim Makartsev.

    The concept, development, and realization of this project would not have been possible without the constant support of Prof. Dr. Gerd Hentschel, my deepest gratitude to whom words cannot express. I am deeply indebted to Prof. Dr. Jan Patrick Zeller and my other colleagues from the Institute for Slavistics (Carl von Ossietzky Universität Oldenburg) for their support and thorough feedback on my project during its various stages.

    Authors

    The corpus was developed and is maintained by:

    • Dr. Maxim Makartsev (Institut für Slavistik, Carl von Ossietzky Universität Oldenburg), maxim.makartsev@gmail.com
    • Dr. Timofey Arkhangelskiy (Institut für Finno-Ugristik/Uralistik, Universität Hamburg), timarkh@gmail.com

    The current version of the corpus uses the platform tsakorpus developed by Dr. Timofey Arkhangelskiy. It is stored on the server of Carl von Ossietzky University Oldenburg.

    I am deeply grateful to Dr. Elena Uzeneva, my cooperation partner (between January 1st, 2020, and March 31st, 2022), without whom this project would not have been possible.

    Several field trips were undertaken to collect the speech samples that were included in the corpus. In 2010–2019, Dr. Maxim Makartsev organized these trips using his own resources (see his legacy page on the site of his previous host institution for publications based on those data). The participants in these field trips were:

    • Dr. Mikhail Chivarzin, Moscow—Shenzhen (MC)
    • Alexandra Chivarzina, Moscow—Shenzhen (AC)
    • Renata Hamidullina, Perm—Vienna (RH)
    • Marina Mihajlova, Sofia—Calgary (MMI)

    In 2020–2022, Dr. Maxim Makartsev and Dr. Elena Uzeneva organized the field trips within the framework of the aforementioned project funded by the DFG. The participants in these field trips were:

    The photo on the start page was taken in Trebisht/Требишта, Albania, by Aino Väänänen, an independent documentary photographer, during our joint research field trip for the aforementioned project in 2020 and was used with her kind permission.

    The transcripts for the corpus were prepared by:

    • Hristina Angeleska—Prilep
    • Bojana Damnjanović—Helsinki
    • Pavel Falaleev—Helsinki
    • Đorđe Genović—Belgrade
    • Violeta Jordanova—Skopje
    • Dr. Natalia Kikilo—Moscow
    • Dr. Maxim Makartsev—Oldenburg
    • Milan Milenović—Belgrade
    • Ekaterina Panova—Saint Petersburg
    • Uliana Putilina—Moscow
    • Anđela Redžić—Belgrade
    • Maria Stryszewska—Wrocław
    • Ekaterina Titova—Moscow

    We are deeply grateful to our speakers and local assistants whose hard work and passion made it possible to make this corpus available. We cannot disclose their names and personal details for their protection.

    References

    Berman, Ruth A., Dan I. Slobin, Sven Stromqvist, and Ludo T. Verhoeven. 1994–2004. Relating Events in Narrative. Hillsdale, N.J. L. Erlbaum Associates.

    Bojović, Jovan R., ed. 1991. Stanovništvo slovenskog porijekla u Albaniji : zbornik radova sa međunarodnog naučnog skupa održanog u Cetinju 21, 22. i 23. juna 1990. Titograd: Stručna knjiga.

    Čëxa, Oksana V. 2009. “Novogrečeskaja leksika narodnoj astronomii v sopostavlenii s balkanoslavjanskoj: Luna i lunnoe vremja (ėtnolingvističeskij aspekt).” Ph.D., Institute of Slavic Studies, Russian Academy of Sciences. https://inslav.ru/event/chyoha-oksana-vladimirovna-novogrecheskaya-leksika-narodnoy-astronomii-v-sopostavlenii-s.

    Cvetanovski, Goce. 2010. Govorot na makedoncite vo Mala Prespa: zapadnoprespanski govor. Skopje: Institut za makedonski jazik “Krste Misirkov”.

    Hentschel, Gerd, and Jan P. Zeller. 2013. “Gemischte Rede, gemischter Diskurs, Sprechertypen: Weißrussisch, Russisch und gemischte Rede in der Kommunikation weißrussischer Familien.” In Wiener Slawistischer Almanach 70, edited by Aage A. Hansen-Löve and Tilmann Reuther, 127–55. München, Berlin, Wien: Peter Lang.

    Makartsev, Maxim. 2017. “Ėtjudy k balkanskomu bestiariju: Kukuška.” Živaja starina 95 (3): 46.

    ———. 2023. “Razvoj balkanoslavenskoga tipa futura u štokavskim iseljeničkim dijalektima u Albaniji i jezički kontakti.” Književni jezik (34): 41–69.

    Makartsev, Maxim, and Natalia Kikilo. 2022. “Some Tendencies in the Morphosyntax of the Migrational Shtokavian Dialects in Albania (Shijak and Myzeqe) And Slavic-Albanian Language Contact.” Slavic World in the Third Millennium 17 (1-2): 120–41. doi:10.31168/2412-6446.2022.17.1-2.07.

    Mayer, Mercer. 1969. Frog, Where Are You? Sequel to a Boy, a Dog and a Frog. New York: Dial Books for Young Readers (a division of Penguin Putnam Inc.).

    Mazon, André. 1936. Documents, contes et chansons slaves de l’Albanie du Sud. Bibliothèque d’études balkaniques 5. Paris: Librairie Droz.

    Mazon, André, and Maria Filipova-Bajrova. 1965. Documents slaves de l’Albanie du Sud: II. Pièces complémentaires. Bibliothèque d’études balkaniques 8. Paris: Institut d’études slaves.

    Plotnikova, Anna A. 2009. Materialy dlja ėtnolingvističeskogo izučenija balkanoslavjanskogo areala. 2, revised. Moskva: Institut slavjanovedenija RAN.

    Seliščev, Afanasij M. 1931. Slavjanskoe naselenie v Albanii (s illjustracijami v tekste i s kartoju Albanii). Sofia.

    Sobolev, Andrey N., and Aleksandr Novik. 2013. Golo Bordo (Gollobordë), Albanija: Iz materialov balkanskoj ėkspedicij RAN i SPbGU 2008-2010 gg. Materialien zum Südosteuropasprachatlas Bd. 6. Sankt-Peterburg: Nauka.

    ———. 2017. Gollobordë (Golo Bordo), Shqipëri: Nga materialet e ekspeditës ballkanike të AShR-së dhe UShSt-P-së në vitet 2008-2010. Translated by Ligor Cullufe. Tiranë: Botimet Toena.

    ———. 2018. Golo Brdo: Od materijalite na balkanskata ekspedicija na RAN i SPbDU vo 2008-2010 godina. Materialien zum Südosteuropasprachatlas Band 6. Skopje, Sankt Peterburg: Univerzitet "Sv. Kiril i Metodij"; Institut za makedonski jazik "Krste Misirkov"; "Nauka".

    Steinke, Klaus, and Xhelal Ylli. 2007. Die slavischen Minderheiten in Albanien (SMA): 1. Teil. Prespa-Vërnik-Boboshtica. Slavistische Beiträge 458. München: Otto Sagner.

    ———. 2008. Die slavischen Minderheiten in Albanien (SMA): 2. Teil. Golloborda-Herbel-Kërçishti i Epërm. Slavistische Beiträge 462. München: Otto Sagner.

    ———. 2010. Die slavischen Minderheiten in Albanien (SMA): 3. Teil. Gora. Slavistische Beiträge 474. München: Otto Sagner.

    ———. 2013. Die slavischen Minderheiten in Albanien (SMA): 4. Teil. Vraka-Borakaj. Slavistische Beiträge 491. München, Berlin: Sagner.

    Tončeva, Veselka. 2014. Našencite v Albanija: Istorija, ezik, tradicii. Sofia: Ongăl.

    Vidoeski, Božidar. 1998. Dijalektite na makedonskiot jazik. Vol. 1. Skopje: Makedonska Akademija na naukite i umetnostite.

    Ylli, Xhelal. 1997. Das slavische Lehngut im Albanischen. Teil 1 : Lehnwörter. Slavistische Beiträge. Digitale Ausgabe 350. München: Verlag Otto Sagner.

    ———. 2000. Das slavische Lehngut im Albanischen. Teil 2 : Ortsnamen. Slavistische Beiträge. Digitale Ausgabe 395. München: Verlag Otto Sagner.

    Contact

    If you have questions, would like to propose collaboration, or noticed an error in the corpus, please contact Dr. Maxim Makartsev.

     

  • Funding: Deutsche Forschungsgemeinschaft (DFG); DFG reference number: GZ: MA 8750/1-1, AOBJ: 661378; Project number: 429823235; Project title: Contact-induced language change in situations of non-stable bilingualism – its limits and modelling: Slavic (social) dialects in Albania
Related identifiers:
  • DOI 10.5281/zenodo.14191622
  • DOI 10.25592/uhhfdm.16608
Licenses:
  • https://creativecommons.org/licenses/by/4.0/legalcode
  • info:eu-repo/semantics/openAccess
Source system:
Research Data Repository of Universität Hamburg (Forschungsdatenrepositorium der UHH)

Internal metadata
Source record
oai:fdr.uni-hamburg.de:16609