

2025 | 3 | 73-104

Article title

Wyzwania i potencjał analiz cyfrowych dużych zbiorów danych tekstowych w badaniach społecznych na przykładzie przygotowania korpusu dokumentów unijnych dotyczących ubóstwa i wykluczenia społecznego z lat 2001-2021

Title variants

EN
Challenges and Promises of Large Textual Data Sets’ Digital Analysis in Social Research: The Case of Preparing a Corpus of EU Documents on Poverty and Social Exclusion from 2001 to 2021

Languages of publication

PL

Abstracts

PL
Artykuł analizuje potencjał i wyzwania związane z wykorzystaniem cyfrowych metod analizy dużych zbiorów tekstów w naukach społecznych na przykładzie badania unijnej polityki przeciwdziałania ubóstwu i wykluczeniu społecznemu w latach 2001-2021. Autorzy przedstawiają proces tworzenia korpusu dokumentów UE - od pozyskiwania danych z baz EUR-Lex, przez ich przetwarzanie, wzbogacanie metadanych i deduplikację, po analizę sieciową. Szczególną uwagę poświęcono wskaźnikom jako elementom infrastruktury kalkulacyjnej UE, które nie tylko opisują rzeczywistość społeczną, lecz także ją współtworzą, wpływając na definiowanie problemów i formułowanie interwencji politycznych. Analiza i wizualizacja sieci ujawniają zmiany w relacjach między wskaźnikami, tematami i instytucjami, trudne do uchwycenia tradycyjnymi metodami, a ich interpretacja została oparta na przeprowadzonych wywiadach i literaturze przedmiotu. Wyniki wskazują, że kluczową rolę w polityce UE odgrywa wskaźnik zagrożenia ubóstwem relatywnym (AROP), podczas gdy nowszy wskaźnik AROPE ma ograniczoną dyfuzję poza administracją unijną. Jednocześnie widoczna jest rosnąca rola agencji, instytutów badawczych, firm konsultingowych i środowisk akademickich, co świadczy o eksternalizacji ekspertyzy. Wprowadzanie nowych wskaźników sprzyja rozszerzaniu wpływu UE na kolejne obszary polityki społecznej. Autorzy proponują model pracy badawczej zgodny z zasadami FAIR i Linked Open Data, podkreślając znaczenie integracji metod cyfrowych z analizą jakościową oraz współpracy interdyscyplinarnej w badaniach procesów politycznych.
EN
The article explores the challenges and promises of using digital methods to analyse large text datasets in the social sciences, focusing on EU policies addressing poverty and social exclusion between 2001 and 2021. It outlines the creation of a corpus of EU documents, from acquiring data via EUR-Lex, through processing, metadata enrichment and deduplication, to network analysis. Indicators are treated as elements of the EU’s calculative infrastructure, which not only describe social reality but also shape it by framing problems and guiding policy interventions. Network analysis and visualisation reveal shifts in the relationships between indicators, topics and institutions that are hard to capture with traditional methods, while their interpretation draws on interviews and the literature. Findings show that the at-risk-of-poverty rate (AROP) indicator is central to EU policy, while the newer AROPE indicator has diffused little beyond the EU administration. The growing involvement of agencies, research institutes, consultancy firms and academia reflects the externalisation of expertise. Introducing new indicators extends the EU’s influence into further areas of social policy. The authors propose a research workflow aligned with the FAIR and Linked Open Data principles, highlighting the value of combining digital and qualitative methods and of interdisciplinary collaboration in research on political processes.

Year

2025

Issue

3

Pages

73-104

Contributors

  • Wydział Socjologii UW
  • Instytut Studiów Politycznych PAN
  • Instytut Badań Literackich PAN
  • NASK PIB

References

  • Bandola-Gill, J., Grek, S., Tichenor, M. (2022). Governing the Sustainable Development Goals: Quantification in Global Public Policy. Springer Nature. https://doi.org/10.1007/978-3-031-03938-6
  • Blom-Hansen, J. (2019). Studying Power and Influence in the European Union: Exploiting the Complexity of Post-Lisbon Legislation with EUR-Lex. European Union Politics, 20(4), 692–706. https://doi.org/10.1177/1465116519851181
  • Boswell, C. (2008). The Political Functions of Expert Knowledge: Knowledge and Legitimation in European Union Immigration Policy. Journal of European Public Policy, 15(4), 471–488. https://doi.org/10.1080/13501760801996634
  • Brennan, T. (2017). The Digital-humanities Bust: After a Decade of Investment and Hype, What Has the Field Accomplished? Not Much. Chronicle of Higher Education, 64(8). https://www.chronicle.com/article/the-digital-humanities-bust/
  • Brosz, M., Bryda, G., Siuda, P. (2017). Big Data i CAQDAS a procedury badawcze w polu socjologii jakościowej. Przegląd Socjologii Jakościowej, 13(2), 6–23.
  • Bruno, I., Jacquot, S., Mandin, L. (2006). Europeanization Through its Instrumentation: Benchmarking, Mainstreaming and the Open Method of Co-ordination... Toolbox or Pandora’s Box? Journal of European Public Policy, 13(4), 519–536. https://doi.org/10.1080/13501760600693895
  • Bruno, I. (2009). The “Indefinite Discipline” of Competitiveness Benchmarking as a Neoliberal Technology of Government. Minerva, 47(3), 261–280. https://doi.org/10.1007/s11024-009-9128-0
  • Copeland, P., Daly, M. (2012). Varieties of Poverty Reduction: Inserting the Poverty and Social Exclusion Target into Europe 2020. Journal of European Social Policy, 22(3), 273–287. https://doi.org/10.1177/0958928712440203
  • de la Porte, C., Pochet, P., Room, B. G. (2001). Social Benchmarking, Policy Making and New Governance in the EU. Journal of European Social Policy, 11(4), 291–307. https://doi.org/10.1177/095892870101100401
  • Diefenbach, D., Wilde, M. D., Alipio, S. (2021). Wikibase as an Infrastructure for Knowledge Graphs: The EU Knowledge Graph. W: International Semantic Web Conference (s. 631–647). Cham.
  • Düro, M. (2009). Crosswalking EUR-Lex: A Proposal for a Metadata Mapping to Improve Access to EU Documents. Office for Official Publications of the European Communities.
  • Golub, J. (1999). In the Shadow of the Vote? Decision Making in the European Community. International Organization, 53(4), 733–764. https://doi.org/10.1162/002081899551057
  • Golub, J. (2007). Survival Analysis and European Union Decision-making. European Union Politics, 8(2), 155–179. https://doi.org/10.1177/1465116507076428
  • Golub, J. (2023). EUPROPS: A New Dataset on Policymaking in the European Union from 1958 to 2021. European Union Politics, 25(1), 197–217. https://doi.org/10.1177/14651165231202034
  • Gornitzka, Å., Sverdrup, U. (2015). Societal Inclusion in Expert Venues: Participation of Interest Groups and Business in the European Commission Expert Groups. Politics and Governance, 3(1), 151–165. https://doi.org/10.17645/pag.v3i1.130
  • Grimmer, J., Roberts, M. E., Stewart, B. M. (2022). Text as Data: A New Framework for Machine Learning and the Social Sciences. Princeton University Press.
  • Gábos, A., Goedemé, T. (2016). The Europe 2020 Social Inclusion Indicators: Main Conclusions of the ImPRovE Project on Validity, Methodological Robustness and Interrelationships. ImPRovE Working Paper, 16(13).
  • Häge, F. M. (2007). Committee Decision-making in the Council of the European Union. European Union Politics, 8(3), 299–328. https://doi.org/10.1177/1465116507079539
  • Hertz, R., Leuffen, D. (2011). Too Big to Run? Analysing the Impact of Enlargement on the Speed of EU Decision-making. European Union Politics, 12(2), 193–215.
  • Hunt, D. (2021). Corpus Linguistics: Examining Tensions in General Practitioners’ Views about Diagnosing and Treating Depression. W: G. Brookes, D. Hunt (red.), Analysing Health Communication. Discourse Approaches (s. 133–160). Palgrave Macmillan.
  • Häge, F. M. (2008). Who Decides in the Council of the European Union? Journal of Common Market Studies, 46(3), 533–558. https://doi.org/10.1111/j.1468-5965.2008.00790.x
  • Häge, F. M. (2011). The European Union Policy-making Dataset. European Union Politics, 12(3), 455–477.
  • Janowicz, K., Hitzler, P., Adams, B., Kolas, D., Vardeman II, C. (2014). Five Stars of Linked Data Vocabulary Use. Semantic Web, 5(3), 173–176. https://doi.org/10.3233/SW-140135
  • Karlińska, A., Rosiński, C., Kubis, M., Hubar, P., Wieczorek, J. (2024). Using Bibliodata LODification to Create Metadata-enriched Literary Corpora in Line with FAIR Principles. W: Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024) (s. 17271–17284). ELRA and ICCL.
  • König, T., Luetgert, B., Dannwolf, T. (2006a). EU-Lex: EU Legislative Databank Based on Celex/Prelex. University of Mannheim. http://www.sowi.unimannheim.de/lsp012/08downloads01.html
  • König, T., Luetgert, B., Dannwolf, T. (2006b). Quantifying European Legislative Research: Using Celex and Prelex in EU Legislative Studies. European Union Politics, 7(4), 553–574.
  • Kurunmäki, L., Miller, P. (2013). Calculating Failure: The Making of a Calculative Infrastructure for Forgiving and Forecasting Failure. Business History, 55(7), 1100–1118. https://doi.org/10.1080/00076791.2013.838036
  • Kurunmäki, L., Mennicken, A., Miller, P. (2019). Assembling Calculative Infrastructures. W: M. Kornberger i in. (red.), Thinking Infrastructures. Emerald Publishing Limited.
  • König, T. (2007). Divergence or Convergence? From Ever-growing to Ever-slowing European Legislative Decision-making. European Journal of Political Research, 46, 417–444. https://doi.org/10.1111/j.1475-6765.2007.00705.x
  • Lelie, P., Vanhercke, B. (2013). Inside the Social OMC’s Learning Tools: How “Benchmarking Social Europe” Really Worked. OSE Research Paper, 10, 2–61. https://doi.org/10.13140/RG.2.1.2419.7202
  • Louis, M., Maertens, L. (2021). Why International Organizations Hate Politics: Depoliticizing the World. Taylor & Francis. https://doi.org/10.4324/9780429466984
  • Maître, B., Nolan, B., Whelan, Ch. T. (2013). A Critical Evaluation of the EU 2020 Poverty and Social Exclusion Target: An Analysis of EU-SILC 2009.
  • McEnery, T., Wilson, A. (2001). Corpus Linguistics: An Introduction. Edinburgh University Press.
  • McEnery, T., Xiao, R., Tono, Y. (2006). Corpus-based Language Studies: An Advanced Resource Book. Routledge.
  • Mehrpouya, A., Samiolo, R. (2019). Indexal Thinking – Reconfiguring Global Topologies for Market-based Intervention. W: M. Kornberger i in. (red.), Thinking Infrastructures. Emerald Publishing Limited.
  • Menyhért, B., Cseres-Gergely, Z., Kvedaras, V., Mina, B., Pericoli, F., Zec, S. (2021). Measuring and Monitoring Absolute Poverty (ABSPO) – Final Report (JRC Technical Report No. JRC127444; EUR 30924 EN). Publications Office of the European Union. https://publications.jrc.ec.europa.eu/repository/handle/JRC127444
  • Metz, J. (2013). Expert Groups in the European Union: A Sui Generis Phenomenon? Policy and Society, 32(3), 267–278. https://doi.org/10.1016/j.polsoc.2013.07.007
  • Năstase, A., Radulova, E. (2025). When Consultants Come to Town: How the European Commission Justifies the Involvement of Private Contractors in Online Public Consultations. International Review of Public Policy, 7(2). https://doi.org/10.4000/14mnk
  • Nolan, B., Whelan, Ch. T. (2011). Poverty and Deprivation in Europe. Oxford University Press.
  • Pfiffner, N. (2021). Identifying Patterns in Communication Science: Mapping Knowledge Structures Using Semantic Network Analysis of Keywords. W: E. Segev (red.), Semantic Network Analysis in Social Sciences. Sage Publishing.
  • Rauh, C. (2021). One Agenda-setter or Many? The Varying Success of Policy Initiatives by Individual Directorates-General of the European Commission 1994–2016. European Union Politics, 22(1), 3–24. https://doi.org/10.1177/1465116520961467
  • Reilley, J., Scheytt, T. (2019). A Calculative Infrastructure in the Making: The Emergence of a Multi-layered Complex for Governing Healthcare. W: M. Kornberger i in. (red.), Thinking Infrastructures. Emerald Publishing Limited.
  • Salganik, M. J. (2018). Bit by Bit: Social Research in the Digital Age. Princeton University Press.
  • Szpunar, M. (2016). Kultura cyfrowego narcyzmu. Wydawnictwa AGH.
  • Thedvall, R. (2006). Eurocrats at Work: Negotiating Transparency in Postnational Employment Policy. Almqvist & Wiksell International.
  • Toshkov, D. (2017). The Impact of the Eastern Enlargement on the Decision-making Capacity of the European Union. Journal of European Public Policy, 24(2), 177–196.
  • Veltri, G. A. (2020). Digital Social Research. Polity.
  • Waagmeester, A., Stupp, G., Burgstaller-Muehlbacher, S., Good, B. M., Griffith, M., Griffith, O. L. (2020). Wikidata as a Knowledge Graph for the Life Sciences. eLife, 9. https://doi.org/10.7554/eLife.52614
  • Wasserman, S., Faust, K. (2012). Social Network Analysis: Methods and Applications. Cambridge University Press.
  • Wiedemann, G. (2013). Opening up to Big Data: Computer-assisted Analysis of Textual Data in Social Sciences. Forum Qualitative Sozialforschung / Forum: Qualitative Social Research, 14(2), 332–357. https://doi.org/10.17169/fqs-14.2.1949
  • Wiedemann, G. (2015). Text Mining for Qualitative Data Analysis in the Social Sciences. Springer Berlin Heidelberg.

YADDA identifier

bwmeta1.element.desklight-9cf75fd5-918c-470e-9c61-60ed4c220535