Brain2Text: A Reproducible, CPU-Deployable Framework for Neural Speech Decoding with Browser-Accessible Inference

Gaurav Kumar Singh; Aayush Chougule; Uday Tomar; Sidheshwar Sharma

doi:https://doi.org/10.47001/IRJIET/2026.105083

Gaurav Kumar Singh, Senior Assistant Professor, School of Computer Science, Engineering and Applications, D Y Patil International University, Akurdi, Pune, India

Aayush Chougule, School of Computer Science, Engineering and Applications, D Y Patil International University, Akurdi, Pune, India

Uday Tomar, School of Computer Science, Engineering and Applications, D Y Patil International University, Akurdi, Pune, India

Sidheshwar Sharma, School of Computer Science, Engineering and Applications, D Y Patil International University, Akurdi, Pune, India

Vol 10 No 5 (2026): Volume 10, Issue 5, May 2026 | Pages: 618-624

International Research Journal of Innovations in Engineering and Technology

OPEN ACCESS | Research Article | Published Date: 29-05-2026

Abstract

Although neural speech decoding models have achieved strong results in laboratory settings, the path to making them usable by clinicians and end users is under-reported and often not demonstrated. Published decoders are typically embedded in Jupyter notebooks, print their output to a terminal, and require GPUs on research infrastructure. Brain2Text was developed to fill this gap. The framework takes pre-extracted 512-dimensional intracortical feature vectors from the Brain-to-Text 2025 T15 CopyTask benchmark and outputs English words through the following stages: (1) the 512-dimensional feature vectors are taken from the benchmark; (2) a five-layer Gated Recurrent Unit (GRU) network is trained on these vectors with a Connectionist Temporal Classification (CTC) loss; (3) the decoded output is mapped to English words using a frequency-weighted CMU Pronouncing Dictionary lookup; and (4) an LLM fallback handles phoneme sequences that the lookup cannot map. The end-to-end inference pipeline runs on CPU with a latency of 90-165 ms for trials of up to 200 time steps. The decoder is wrapped in a Flask REST API that is accessed by a React/Vite front-end application, or it can be run in a demonstration mode that has no dependency on a live back end or files. The whole stack is started with a single shell command.
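
The dictionary-lookup stage can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes greedy (best-path) CTC decoding, which the abstract does not specify, and it uses a hypothetical toy phoneme inventory and lexicon in place of the full CMU Pronouncing Dictionary. The names `PHONEMES`, `LEXICON`, `ctc_greedy_collapse`, and `lookup_word` are invented for illustration only.

```python
from typing import Dict, List, Optional, Sequence, Tuple

BLANK = 0  # assumed index of the CTC blank symbol

# Hypothetical toy inventory: index -> ARPAbet phoneme (index 0 is the blank).
PHONEMES = ["<blank>", "HH", "AH", "L", "OW", "T", "UW"]

# Hypothetical toy lexicon in the spirit of the CMU Pronouncing Dictionary:
# pronunciation -> list of (word, relative frequency).
LEXICON: Dict[Tuple[str, ...], List[Tuple[str, float]]] = {
    ("T", "UW"): [("two", 0.2), ("to", 0.7), ("too", 0.1)],
    ("HH", "AH", "L", "OW"): [("hello", 1.0)],
}


def ctc_greedy_collapse(frame_ids: Sequence[int]) -> List[str]:
    """Collapse consecutive repeated labels, then drop blanks (CTC rule)."""
    out: List[str] = []
    prev = None
    for idx in frame_ids:
        if idx != prev and idx != BLANK:
            out.append(PHONEMES[idx])
        prev = idx
    return out


def lookup_word(phonemes: Sequence[str]) -> Optional[str]:
    """Return the most frequent word for a pronunciation, or None if unmapped."""
    candidates = LEXICON.get(tuple(phonemes))
    if not candidates:
        return None  # this is where a fallback (e.g. an LLM) would take over
    return max(candidates, key=lambda wf: wf[1])[0]


frames = [0, 5, 5, 0, 6, 6, 6, 0]      # per-frame argmax ids for one word
phones = ctc_greedy_collapse(frames)    # ['T', 'UW']
word = lookup_word(phones)              # 'to' (highest frequency among homophones)
```

In this sketch the frequency weights resolve homophones: the pronunciation `T UW` maps to three candidate words, and the most frequent one is returned. A pronunciation missing from the lexicon yields `None`, which marks the point at which a fallback stage would be invoked.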

Keywords

Brain-computer interface, neural speech decoding, GRU, CTC, ARPAbet, phoneme-to-text, Flask, React, reproducibility, low-resource deployment, intracortical.

Citation of this Article

Gaurav Kumar Singh, Aayush Chougule, Uday Tomar, & Sidheshwar Sharma. (2026). Brain2Text: A Reproducible, CPU-Deployable Framework for Neural Speech Decoding with Browser-Accessible Inference. International Research Journal of Innovations in Engineering and Technology - IRJIET, 10(5), 618-624. Article DOI https://doi.org/10.47001/IRJIET/2026.105083

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
