Department of CSE (Artificial Intelligence and Machine Learning), ACE Engineering College, India.
International Journal of Science and Research Archive, 2025, 14(01), 1124-1128
Article DOI: 10.30574/ijsra.2025.14.1.0192
Received on 09 December 2024; revised on 17 January 2025; accepted on 20 January 2025
Synthetic data, which is produced to mimic the characteristics of actual data without revealing any confidential information, is a much safer option than original data, especially in sensitive cases such as personal data, financial data, or military intelligence. Using real-life data carries substantial risks, including identity theft, fraud, and hacking; because synthetic data (SD) reproduces some of the elements of real data without infringing on anyone’s privacy, it does not carry these risks. The project concentrates on the cutting-edge fields of Large Language Models (LLM) and Deep Learning (DL) to generate synthetic data that mimics real-world data in its intricacy. Advances in Long Short-Term Memory (LSTM) networks and Generative Adversarial Networks (GAN) produce plausible and useful sequential data for natural language processing and for machine learning (ML) data augmentation, respectively. Applications of this technology include, but are not limited to, augmented datasets to improve medical diagnosis, advanced financial fraud detection systems, and fictitious consumer profiles that enhance AI-based recommendation systems. The project, implemented in Python with open-source packages such as SymPy, Pydbgen, Synthetic Data Vault (SDV), and Scikit-learn, offers a solution to data scarcity and quality problems in order to improve the performance of AI models in various sectors.
Deep Learning; Large Language Models (LLM); Synthetic Data; Long Short-Term Memory (LSTM); Generative Adversarial Networks (GAN)
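The core idea of the abstract, producing records that follow the statistics of real data without copying any real record, can be illustrated with a deliberately simplified sketch. It is not the LSTM or GAN approach the article reviews, and it does not use SDV or Pydbgen; it assumes a tiny hypothetical table (`real_rows`, with invented values) and fits one independent normal distribution per numeric column using only the Python standard library. The helper names `fit_columns` and `sample_rows` are illustrative, not taken from the article.

```python
import random
from statistics import NormalDist, mean

# Toy "real" records. These values are hypothetical and exist only
# to illustrate the idea; they do not come from the article.
real_rows = [
    {"age": 34, "balance": 1200.0},
    {"age": 45, "balance": 5300.0},
    {"age": 29, "balance": 800.0},
    {"age": 52, "balance": 7600.0},
    {"age": 41, "balance": 3100.0},
]


def fit_columns(rows):
    """Fit one independent normal distribution per numeric column."""
    columns = rows[0].keys()
    return {c: NormalDist.from_samples([r[c] for r in rows]) for c in columns}


def sample_rows(model, n, seed=0):
    """Draw n synthetic rows from the fitted distributions.

    Values are sampled, so no original row is copied into the output.
    """
    rng = random.Random(seed)
    return [
        {c: rng.gauss(dist.mean, dist.stdev) for c, dist in model.items()}
        for _ in range(n)
    ]


model = fit_columns(real_rows)
synthetic_rows = sample_rows(model, 1000)

# Compare a summary statistic of the real and synthetic tables.
print("real mean age:     ", mean(r["age"] for r in real_rows))
print("synthetic mean age:", mean(r["age"] for r in synthetic_rows))
```

Sampling each column independently discards correlations between columns (for example, between age and balance), which is one reason richer generative models such as GANs are used when realistic joint structure matters.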
Satya Sudha S, Grishitha V, Sai Rajeshwar V V N and Shiva Karthik P. The spectrum of synthetic data generation: A comprehensive review. International Journal of Science and Research Archive, 2025, 14(01), 1124-1128. Article DOI: https://doi.org/10.30574/ijsra.2025.14.1.0192.
Copyright © 2025 Author(s) retain the copyright of this article. This article is published under the terms of the Creative Commons Attribution License 4.0.







