Ethnicity and language are among the most apparent obstacles to nation-building in Ukraine because of the country’s multiethnic and multilingual character. Kulyk notes that contradictory policies during the Soviet period produced a wide discrepancy between identity and language practice. This discrepancy persists today. For instance, a 2017 Razumkov Center survey finds a gap in ethnolinguistic identities between Ukrainophone and Russophone (only Russian-speaking) Ukrainians: 92% of the population self-identifies as “ethnic” Ukrainian, yet only 72.2% speak Ukrainian as a mother tongue.
Discrepancies between national attachment, ethnicity, and language raise normative issues that have surfaced in salient political debates on language policy, such as the repeal of the largely sardonic Kivalov-Kolesnichenko regional languages law and Poroshenko’s signing of the new Law on Education. Ironically, while attempting to create a European-style nation-state framework, the latter law has alienated Ukraine’s European neighbors, who have shown concern for their national minorities residing in the country. However, this concern should be considered alongside regional elites’ cynical self-legitimation strategies in domestic politics.
All of this language-policy posturing may be unnecessary, because the events of the past five years have led much of the Russophone population to opt for using more Ukrainian rather than speaking their mother tongue. Policymakers and the Western establishment should be acutely aware of these dynamics when crafting and analyzing policies, mainly because configurations of these identities are strong predictors of political attitudes.
The language of educational instruction is a vital instrument of national development: it helps facilitate communication between citizens and the state. It also can help create a common framework for citizens to act as co-equals in the marketplace of ideas, culture, and society.
Importantly, policymakers themselves remain underexplored with respect to identity dynamics, and their influence on policy is yet to be fully understood. Analyzing the development of the new Law on Education may clarify lawmakers’ visions for Ukraine regarding language and ethnicity, and the extent to which these beliefs shape substantive reforms. However, the context in which these policymakers operate is highly specific: their words can only help describe this context (i.e., education policy) and cannot be generalized to the entire Rada policy community.
Methods, Data, and Theory
- I first collected data by web-scraping the round tables and hearings that discussed the Law on Education (2015-2017) from the Rada Committee on Education and Science (KNO) website. Three texts from this period were large enough to warrant further analysis and dealt exclusively with the Law on Education. The texts averaged 23,900 words each, for a final corpus of 71,700 words. The meetings averaged 140 minutes and 23 speakers. The data were compiled into three separate text documents, then organized and prepared for machine-learning analysis and hand-coded discourse analysis. Select sections are translated by me into English for this article.
- It is not easy to extract meaning from this volume of text, so I employ a machine-learning algorithm to cluster the words into topics. Topic modeling (via latent Dirichlet allocation) assumes that each document is a mixture of topics and that each topic is characterized by high frequencies of particular words. It clusters words into topics and reports the probability of each word per topic and of each topic per document. This technique is well suited to clustering textual data and to classifying legislative discourse by topic. Clustering topics in legislative discourse helps establish what policymakers are latently discussing in a given text.
- Topic modeling helps establish the topics of a text, but it can neither distinguish between semantic devices nor analyze rhetoric. Discourse analysis is therefore used to go further into the texts and qualitatively analyze these clusters by evaluating the structure of policymakers’ rhetorical devices. The analysis explores different categories of semantic devices pertinent to identity construction: the contents of speech, the discursive strategies that underlie rhetoric, and the dialectical means by which a strategy is realized.
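To make the mechanics of latent Dirichlet allocation more concrete, the following is a minimal sketch of the technique in Python using collapsed Gibbs sampling. It is not the R code used for this article. The function name `fit_lda`, the hyperparameters `alpha` and `eta`, the number of iterations, and the toy English documents are all assumptions chosen for illustration. The two returned objects correspond to the per-document topic probabilities and per-topic word probabilities described above.

```python
import random
from collections import Counter

def fit_lda(docs, k, iters=200, alpha=0.1, eta=0.01, seed=0):
    """Illustrative collapsed Gibbs sampler for LDA on tokenized documents."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    v = len(vocab)
    # Start by assigning every token to a random topic and counting.
    z = [[rng.randrange(k) for _ in doc] for doc in docs]
    doc_topic = [[0] * k for _ in docs]
    topic_word = [Counter() for _ in range(k)]
    topic_total = [0] * k
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            t = z[d][i]
            doc_topic[d][t] += 1
            topic_word[t][w] += 1
            topic_total[t] += 1
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                t = z[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][t] -= 1
                topic_word[t][w] -= 1
                topic_total[t] -= 1
                # Weight each topic by (its share of this document)
                # times (this word's share of the topic), then resample.
                weights = [
                    (doc_topic[d][j] + alpha)
                    * (topic_word[j][w] + eta) / (topic_total[j] + v * eta)
                    for j in range(k)
                ]
                t = rng.choices(range(k), weights=weights)[0]
                z[d][i] = t
                doc_topic[d][t] += 1
                topic_word[t][w] += 1
                topic_total[t] += 1
    # gamma: topic probabilities per document; beta: word probabilities per topic.
    gamma = [[(c + alpha) / (len(doc) + k * alpha) for c in row]
             for row, doc in zip(doc_topic, docs)]
    beta = [{w: (topic_word[t][w] + eta) / (topic_total[t] + v * eta) for w in vocab}
            for t in range(k)]
    return gamma, beta

# Hypothetical toy corpus: one short "document" per year.
docs = [
    "standards curriculum standards teachers".split(),
    "funding budget funding salaries".split(),
    "minorities language minorities schools".split(),
]
gamma, beta = fit_lda(docs, k=3)
```

Each row of `gamma` sums to one and can be read as the mixture of topics in one document, while each entry of `beta` is a probability distribution over the vocabulary for one topic. With a corpus this small the topics themselves are not meaningful; the sketch is only meant to show what the model estimates.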
Results and Analysis
The topic model found six main topics (see Figure 3). Tracing the probabilities of these topics by year shows essential shifts in the discourse. 2015 saw the greatest mixture of topics, with the conversation centered on education in general, opinions, standards, and the law itself. 2016 focused chiefly on the new Law on Education as well as funding and standards. The discussion of minorities, however, occurs only in 2017. The graph below highlights these results by comparing topic probabilities by year.
Figure 3. Topic Probabilities by Year
With this rough road map, we can begin to analyze national understandings as well as legitimate aspirations and fears within the policy-making community. Across the three years, the discourse analysis qualitatively identified three particularly salient contents spread across each topic except for funding: European integration, ethnocultural criteria of national membership, and an idea that Ukraine is (and should be) the modern Western frontier against the unmodern east and Russian aggression. These contents can be thought of as “conversations” within the topics outlined above, and they underscore how policy framing appropriates each topic differently.
2015: Europe, Democracy, Modernity
In 2015, the discourse centered on Europeanization, modernization, and democracy. For instance, one committee member states that the education system should respond to the needs of a modern European Ukraine, a democratic Ukraine, and a Ukraine at war. Lawmakers express the need to integrate with Europe by modernizing pedagogical standards. Moreover, there is a sense of civic duty and nationhood among the speakers. When talking about educational standards, policymakers invoke civic duty and nationhood as core values for the educational system:
“Therefore, the Law on Education should clearly state the need for the formation of value orientations, valuable skills, a new culture in the educational sphere and, most importantly, the formation of a Ukrainian identity of citizens and should reflect their European affiliations and promote understanding of this belonging” (Rada KNO 2015 [Vasyl Hryhorovych Kremen]).
There is also an intense focus on building a modern European Ukraine and on constructing a future that differs from the present. At the same time, however, speakers make typical nationalistic appeals to the past: a nationalized history is used to justify present policy positions, especially when policymakers discuss education in general or give their opinions on education:
“Ukrainian education also has its peculiarities; its legal norms were laid down in the collection of ‘Ruska Pravda’ by Yaroslav the Wise and Volodymyr Monomakh’s writings. Therefore, on the one hand, we need to preserve the foundations of national education, and on the other to integrate with the world, perhaps even bring to the European educational system our pedagogical heritage” (Rada KNO 2015 [Leonid Vasylovych Gubersky]).
This quote reveals a discursive strategy that perpetuates mythologized national histories and justifies the need to create a Ukrainian education system in the interest of maintaining both tradition and innovation. In 2015, a window of opportunity opens for the debate to widen from raising educational standards to Europeanizing them.
2016: Civilizational War
Rather than focusing on standards and education in general, in 2016 policymakers concentrate on the wording of the law itself and on the Russian threat. These constructions took on a literal Huntingtonian form, with one member quoting from Clash of Civilizations while giving an opinion on the general state of education:
“When I read this book, I had in my hands the book of ‘Ukraine and Russia’, and it was written in black and white that we are different, but there will never be war. But the ink has not dried, and the war has begun” (Rada KNO 2016 [Ivan Hryhorovych Kyrylenko]).
This sort of discursive strategy implicitly contrasts Ukraine with Russia by casting Russian civilization, rather than the Russian state, as the threat to Ukraine’s national identity. The speaker is rhetorically warning against Russian aggression and stressing the need for Ukrainian sovereignty. Interestingly, none of the discourses in 2016 see the European Union as a force of globalization poised to pool Ukrainian sovereignty. Therefore, the law is implicitly about a heightened need for security vis-a-vis the Russian state rather than specifically about maintaining national identity as such: lawmakers see European institutions as essential for establishing substantive sovereignty.
2017: Patriotism, Fear, and Language
The discursive constructions found in 2017 shift the focus from the eastern Other onto Ukraine and its minorities. The majority-minority question becomes particularly salient, and policymakers begin to address the place of national minorities more explicitly rather than issuing generalized calls for homogeneity.
“We have the Ukrainian state, and in our country the only [state] language is Ukrainian. There can be no other [state] languages. We do not want to sow further separatism in the future, as the misunderstandings of the past 25 years have, since, as the honorable chairman of the committee has said, there are schools in the previous 25 years that prepare students who, after graduation, do not speak Ukrainian” (Rada KNO 2017 [Oleh Stepanovich Musiy]).
This straddles the line between ethnic and civic nation-building. On the one hand, it stresses an ethnolinguistic Ukrainian identity as essential for the functioning of the state; on the other, it treats Russophones as co-citizens who are in a vulnerable position and ripe for exploitation. The example above may be problematic insofar as it reduces non-Ukrainophones to persons susceptible to foreign influence (i.e., because they are not Ukrainian). However, this rhetoric diverges from the reality that Russophones have only become more Ukrainian since the war began. This divergence draws attention to the complexities of civic nation-building. One such issue is that policymakers who tacitly aspire to produce integrative policies may endorse homogenizing policies in the name of state harmony. These policies can be problematic in multiethnic and multilingual contexts, and they often drive tensions with minorities’ “national homelands.”
Further examples illustrate how discursive strategies meant to build a civic conception of Ukraine nonetheless use destructive language:
“I absolutely consider the situation unacceptable, especially in the minority schools of other groups; other language groups, like Hungarian, for example, the minority children, earning state-financed education, at the end of school cannot speak Ukrainian” (Rada KNO 2017 [Liliia Mykhailivna Hrynevych]).
Here, even minority language groups that are not Russophone are considered somehow contra-Ukrainian, since they cannot ‘contribute’ to the functioning of society or government in the same language. This marks a shift in the discourse away from the Russian threat and toward the Other within. This strategy seeks to detract from other groups’ legitimacy within the polity by presupposing a national difference between them and Ukrainians, which in turn legitimizes an ethnolinguistic version of Ukrainian identity.
Speakers also express surprise at dissenting deputies:
“Ukraine always provides everything to everyone. And will she provide the Ukrainian language to Ukrainians, please tell me, dear people’s deputy?” (Rada KNO 2017 [O. S. Musiy]).
“…The right of national minorities to learn their native language, but not to replace the Ukrainian state language with their native language” (Rada KNO 2017 [V. H. Kremen]).
The speakers give primacy to Ukrainian by portraying other groups’ languages as somehow foreign and thus native to different nationalities. Policymakers here seem to be conflating language practice, ethnicity, and nationality.
2017 also saw the return of Eurocentric language. However, some of the rhetoric blends Europeanization with “patriotic” elements when talking about educational standards:
“The core idea is a competitive, modern patriot and innovator, a well-developed personality who will build a modern European Ukraine. This is the key idea of this bill” (Rada KNO 2017 [L. M. Hrynevych]).
This all points to an apparent shift in rhetoric and discourse. While this quote underscores civic elements of a Ukrainian nation (e.g., patriotic love for the state and European ideals), it also casts Ukrainophone patriotism as a prerequisite for a European Ukraine. All of this paints a complex portrait of civic nation-building in Ukraine. On the one hand, policymakers are keen to promote a nation that is competitive, integrative, European, and “modern”; on the other hand, ethnic and linguistic identities and linguistic practices are viewed and acted on in ethno-nationalistic terms, since policymakers often conflate ethnic, national, and linguistic identities. Therefore, nation-building in Ukraine fluctuates between ethnolinguistic demarcations and hopeful Europeanization efforts.
Further Analysis: Concluding Thoughts
The topics that policymakers discuss indicate that Ukrainian education policy runs deeper than ethnic divisions. The machine-learning algorithm detected a change in topic from a mix of standards, opinions, and education in general to the subject of national minorities, and the nation-building rhetoric seems to have changed qualitatively alongside this trend. Even when they use language that is essentializing and destructive, lawmakers focus on creating a Ukrainian civic identity much in the classical French fashion: the nation as a daily voluntary act or plebiscite of memories and amnesia. Policymakers thus expect to build a unitary and centralized nation-state, which depends in part, again as in the French case, on a shared language of prestige. In this view, non-Ukrainophones need to adopt the common linguistic framework of Ukraine while limiting their own mother-tongue usage in state contexts. The rhetoric of 2015-2017 contrasts a modern European Ukraine with the eastern and unmodern, a contrast that results from policymakers’ acute awareness of interstate politics: a security crisis underpins their perspective.
The shift from gradual to radical Ukrainization (at least in parliamentary discourse pertaining to education) is the result of two overarching themes: a security threat and the need for national unity, and the desire to join the European community. Policymakers generally follow trends in the polity regarding national identification but need to be wary of the pitfalls of populist ethnonationalism. This sort of patriotism rejects other realities of Ukraine, which include rich sources of culture, information, and dialogue across a multitude of languages and histories. Donald Tusk’s recent speech commemorating the fifth anniversary of Euromaidan states just that: beware of populist nationalism and embrace the civic identity of Ukraine that makes it a part of the European community. In the face of a rising language of prestige, educational reform, and long-term conflict with a larger nuclear neighbor, Ukraine must trust in its citizenry and abandon the balancing act between ethnic and civic Ukrainization.
 A comparable survey by the Kyiv International Institute of Sociology puts national feeling among Ukrainians at 88.3% of all respondents and finds that 68.3% of them identify Ukrainian as their native language. Razumkov’s wording, however, may obscure affective, non-ethnic, or non-linguistic facets of national identity, since it is entirely possible to identify as ethnically Ukrainian but feel nationally Russian or non-Ukrainian.
 Volodymyr Kulyk (2018) Shedding Russianness, recasting Ukrainianness: the post-Euromaidan dynamics of ethnonational identifications in Ukraine, Post-Soviet Affairs, 34:2-3, 119-138, DOI: 10.1080/1060586X.2018.1451232
 Grigore Pop-Eleches & Graeme B. Robertson (2018) Identity and political preferences in Ukraine – before and after the Euromaidan, Post-Soviet Affairs, 34:2-3, 107-118, DOI: 10.1080/1060586X.2018.1452181
 All data were compiled into three samples, one per year, in their original language (.txt files). R was localized for use with Ukrainian. The data were then combined into a single vector, which was used to create a corpus. Stop words were removed in the preprocessing phase; they include uninformative words like “yes,” “no,” and “thank you,” as well as a custom list of legislature-specific redundancies and words outside of the original list, such as “presiding,” “honorable,” “law,” “absolutely,” “what/that,” “committee,” and speakers’ names.
 NVivo was used for qualitative analysis. I color-coded each speaker differently (they were identified clearly in the text) and noted changes in topic and the strategies used. Overlapping content is noted and compared in the software. An overarching methodological framework is used to identify different rhetorical strategies. The method iteratively reads the document for discursive constructions that imply or explicitly mention a national collective or common body, and notes whether the speaker used words identified by the topic model (top 20 list per topic). Then, four macro rhetorical strategies are analyzed in the aforementioned texts: constructive, destructive, perpetuation, and transformation strategies, which either construct or destroy a common nation, or perpetuate or transform historical experiences. These discursive strategies are then analyzed in further depth by looking at their associated micro-strategies (e.g., how the nation is being constructed or transformed) and the argumentative schemes (topoi) that enable the strategy to be realized. Finally, the dialectical and syntactical schemes of the arguments are analyzed to establish any unconscious linguistic cues, but these are omitted from the article for brevity’s sake.
 The model shows reasonably good indicators of fit. The topic probabilities by year are all clear-cut (see Figure 3), and the per-topic word probabilities indicate that the topics are measuring different latent constructs in the text (i.e., they are characterized by different words with little substantive overlap). A six-topic model (k = 6) was chosen to balance parsimony and accuracy. The tidytext package was used to extract beta values (per-topic-per-word probabilities), which were used with the dplyr package to construct a matrix of the top 20 words per topic. Document-topic probabilities (gamma) were analyzed and graphed (Figure 3). R code for document preparation and analysis is available upon request.
 Sergey I. Nikolenko, Sergei Koltcov & Olessia Koltsova (2017) Topic modelling for qualitative studies. Journal of Information Science, 43(1), 88–102. https://doi.org/10.1177/0165551515617393
 Ruth Wodak (2001) The discourse-historical approach. In R. Wodak & M. Meyer (Eds.), Methods of critical discourse analysis (pp. 63-94). London: SAGE Publications Ltd doi: 10.4135/9780857028020.n4
 Topic word and word-frequency lists are available upon request.
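The preprocessing and top-word steps described in the notes above can be sketched roughly as follows. This is an illustrative Python stand-in, not the R code used for the article (which relied on tidytext and dplyr). The stop lists are abbreviated English glosses of the kinds of words mentioned in the notes, the helper names `preprocess` and `top_terms` are hypothetical, and the beta values are invented purely to show the shape of the data.

```python
import re

# Hypothetical stop lists: a short general list plus legislature-specific
# filler, given in English for readability (the source texts are Ukrainian).
GENERAL_STOP_WORDS = {"yes", "no", "thank", "you", "the", "of", "is", "for", "on"}
CUSTOM_STOP_WORDS = {"presiding", "honorable", "law", "absolutely", "committee"}

def preprocess(text, stop_words):
    """Lowercase the text, split it into word tokens, and drop stop words."""
    tokens = re.findall(r"\w+", text.lower())
    return [t for t in tokens if t not in stop_words]

def top_terms(beta, n=20):
    """Return the n highest-probability words for each topic in beta."""
    return {
        topic: [w for w, _ in sorted(probs.items(), key=lambda kv: -kv[1])[:n]]
        for topic, probs in beta.items()
    }

stop_words = GENERAL_STOP_WORDS | CUSTOM_STOP_WORDS
tokens = preprocess(
    "Thank you, honorable committee. The law is on standards for schools.",
    stop_words,
)

# Invented per-topic word probabilities, standing in for a fitted model's beta.
beta = {
    "standards": {"standards": 0.5, "curriculum": 0.3, "budget": 0.2},
    "funding": {"budget": 0.6, "salaries": 0.3, "standards": 0.1},
}
top = top_terms(beta, n=2)
```

Here `tokens` keeps only "standards" and "schools", and `top` lists the two most probable words for each invented topic. In the actual analysis the same idea is applied with the top 20 words per topic.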
The author does not work for, consult to, own shares in, or receive funding from any company or organization that would benefit from this article, and has no relevant affiliations.