Leveraging Artificial Intelligence and Automation for Enhancing School Improvement Efforts
Pub. online: 29 January 2026
Type: Data Science In Action
Open Access
Received: 13 August 2025
Accepted: 5 January 2026
Published: 29 January 2026
Abstract
Advances in AI and automation are reshaping qualitative research workflows, making processes more efficient, accurate, consistent, and scalable. This paper presents innovations developed for the Illinois Needs Assessment project, a statewide initiative led by the Illinois State Board of Education and the American Institutes for Research to conduct comprehensive needs assessments for schools that need intensive or comprehensive support. To address the scale and tight timeline requirements of the project, the team designed three interconnected pipelines that work together to produce a finalized report. The first, an Audio Pipeline, uses Whisper and generative AI to automate transcription, text-based speaker role attribution, thematic coding, and insight generation from focus groups and interviews. The second, a Report Generation Pipeline, integrates Airtable automations with AWS infrastructure to produce customized school reports that merge AI-generated findings with survey data, school performance metrics, and contextual comparisons. The third, the Needs Assessment Summary Report, automates the assembly of all quantitative and qualitative inputs into a polished, customizable deliverable that combines efficiency with expert review. Together, these pipelines replace ad hoc manual workflows with reproducible, consistent systems that enhance data quality, reduce error, and broaden access for non-technical users. The integrated design demonstrates how automation and generative AI can reduce manual burdens, shorten delivery timelines, and support timely, data-informed, and human-centered decision-making in education.
Supplementary material
The supplementary material includes a GitHub repository with two subfolders, ‘aiPipeline-SDSS2025’ and ‘autoreportsPipeline-SDSS2025’, which correspond to the Audio Pipeline and to the Report Generation Pipeline and Automated Report described in the manuscript. While the original implementations relied on secure cloud infrastructure, the materials provide insight into system architecture, key processing steps, and expected outputs. Included are mock data, configuration examples, prompts, crosswalks, and selected code, enabling users to review and execute sample scripts to understand each pipeline stage. The supplementary materials also include an additional R Markdown (RMD) file that demonstrates report generation using synthetic school-level data, illustrating how qualitative and quantitative inputs are combined within the automated reporting workflow. Because the pipelines rely on internal systems and proprietary authentication, some components (e.g., secure dataset access, organizational credentials, private APIs) are non-functional outside production. Code exposing security-sensitive logic or deployment details has been removed, but the materials still convey the overall design and practical implementation. Additional documentation in each folder guides navigation of outputs. See https://github.com/gchickering21/SDSS2025_materials for files and documentation.