Department of Electronics and Communication, Maharaja Agrasen Institute of Technology, Delhi, India.
International Journal of Science and Research Archive, 2025, 15(01), 1426-1434
Article DOI: 10.30574/ijsra.2025.15.1.1140
Received on 22 February 2025; revised on 22 April 2025; accepted on 24 April 2025
In this work, the development of a basic large language model (LLM) has been presented, with a primary focus on the pre-training process and model architecture. A simplified transformer-based design has been implemented to demonstrate core LLM principles, incorporating reinforcement learning techniques. Key components such as tokenization and training objectives have been discussed to provide a foundational understanding of LLM construction. Additionally, an overview of several established models—including GPT-2, LLaMA 3.1, and DeepSeek—has been provided to contextualize current advancements in the field. Through this comparative and explanatory approach, the essential building blocks of large-scale language models have been explored in a clear and accessible manner.
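To make the components named in the abstract concrete, the sketch below shows, in a hedged form, how tokenization, a simplified decoder-style transformer block, and a next-token (cross-entropy) pre-training objective fit together. It is an illustrative assumption based on standard practice with PyTorch, not the authors' implementation; all names (TinyBlock, TinyLM), sizes, and hyperparameters are hypothetical.

# Minimal sketch: character-level tokenization, one decoder-style transformer
# block, and a single next-token prediction training step. Purely illustrative;
# not the configuration described in the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyBlock(nn.Module):
    """One pre-norm decoder block: causal self-attention + feed-forward."""
    def __init__(self, d_model=64, n_heads=4):
        super().__init__()
        self.ln1 = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ln2 = nn.LayerNorm(d_model)
        self.ff = nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                                nn.Linear(4 * d_model, d_model))

    def forward(self, x):
        T = x.size(1)
        # Causal mask: each position may attend only to earlier positions.
        mask = torch.triu(torch.ones(T, T, dtype=torch.bool), diagonal=1)
        h = self.ln1(x)
        a, _ = self.attn(h, h, h, attn_mask=mask)
        x = x + a
        return x + self.ff(self.ln2(x))

class TinyLM(nn.Module):
    """Token + positional embeddings, a stack of blocks, and an output head."""
    def __init__(self, vocab_size, d_model=64, n_layers=2, max_len=128):
        super().__init__()
        self.tok = nn.Embedding(vocab_size, d_model)
        self.pos = nn.Embedding(max_len, d_model)
        self.blocks = nn.Sequential(*[TinyBlock(d_model) for _ in range(n_layers)])
        self.head = nn.Linear(d_model, vocab_size)

    def forward(self, idx):
        T = idx.size(1)
        x = self.tok(idx) + self.pos(torch.arange(T, device=idx.device))
        return self.head(self.blocks(x))

# Character-level "tokenizer" and one pre-training step on a toy corpus.
text = "building an llm from scratch"
vocab = sorted(set(text))
stoi = {c: i for i, c in enumerate(vocab)}
ids = torch.tensor([[stoi[c] for c in text]])            # shape (1, T)
model = TinyLM(vocab_size=len(vocab))
logits = model(ids[:, :-1])                               # predict the next character
loss = F.cross_entropy(logits.reshape(-1, len(vocab)), ids[:, 1:].reshape(-1))
loss.backward()                                           # one gradient step of pre-training
print(f"toy next-token loss: {loss.item():.3f}")

In a full pipeline this loop would run over a large corpus with a subword tokenizer and an optimizer, and post-training with reinforcement learning (as discussed in the paper) would follow as a separate stage.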
Pretraining; Introduction to the Neural Networks in LLM; Transformer Architecture; Post Training; Post Training with Reinforcement Learning
ML Sharma, Sunil Kumar, Rajveer Mittal, Shubhankar Rai, Akshat Jain, Anurag Gandhi, Swayam Nagpal, Anurag Ranjan, Riya Yadav and Vatshank Mishra. Building an LLM from Scratch. International Journal of Science and Research Archive, 2025, 15(01), 1426-1434. Article DOI: https://doi.org/10.30574/ijsra.2025.15.1.1140.
Copyright © 2025 Author(s) retain the copyright of this article. This article is published under the terms of the Creative Commons Attribution License 4.0.