International Journal of Science and Research Archive
International, Peer reviewed, Open access Journal ISSN Approved Journal No. 2582-8185



Optimizing AI Model Inference Performance with Dynamic Profiling


Ankush Jitendrakumar Tyagi * 

University of Texas at Arlington, Texas, USA.

Review Article

International Journal of Science and Research Archive, 2025, 16(01), 2266-2275

Article DOI: 10.30574/ijsra.2025.16.1.2066

DOI URL: https://doi.org/10.30574/ijsra.2025.16.1.2066

Received on 31 May 2025; revised on 18 July 2025; accepted on 27 July 2025

Deep neural networks and other Artificial Intelligence (AI) models have shown great success in areas that include computer vision, natural language processing, and autonomous systems. Yet their use in real-world tasks is often limited by inference performance, particularly when tasks must complete in real time on specialised or resource-constrained edge devices. Optimising inference performance is therefore central to building scalable, efficient, and responsive AI systems. Dynamic profiling, the process of analysing AI model and system performance in real time as the model executes, has become a critical technique both for locating performance bottlenecks and for guiding optimisation at runtime. In contrast to static profiling, which analyses a program using only information available before execution, dynamic profiling allows adaptive, fine-grained inspection of problems such as inefficient memory access, imbalanced compute utilisation, per-layer latency, and power consumption. Profiling and tracing tools and frameworks such as TensorRT, Intel VTune, NVIDIA Nsight, and PyTorch Profiler are available across diverse hardware platforms, including CPUs, GPUs, and edge accelerators. These tools provide the information needed to guide fine-grained optimisations such as operator fusion, quantisation, memory and computation scheduling, and replication strategies. Notably, when dynamic profiling is coupled to automated deployment pipelines, AI systems can optimise themselves at runtime and respond to variations in workload and system constraints.
This coupling enables intelligent, self-optimising AI applications that sustain production-level performance. Integrating dynamic profiling into the AI model lifecycle allows continuous performance tracking and iterative refinement, which facilitates the delivery of scalable, energy-efficient, and high-throughput AI systems in the cloud as well as at the edge. This paper argues that dynamic profiling is an important technique for overcoming inference performance issues and for shaping future AI deployment.
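As an illustration of the per-layer latency measurement described above, the following is a minimal sketch that uses only the Python standard library. It is not the paper's method and does not use any of the named tools; the `LayerProfiler` class and the toy layers `embed`, `heavy_block`, and `head` are hypothetical names introduced here. The sketch times each stage while the pipeline actually runs and reports the slowest one, which is the kind of signal a production profiler such as PyTorch Profiler or NVIDIA Nsight would supply in far greater detail.

```python
import time
from collections import defaultdict
from statistics import mean


class LayerProfiler:
    """Hypothetical helper: records wall-clock latency per named layer at runtime."""

    def __init__(self):
        self.samples = defaultdict(list)  # layer name -> list of latencies in seconds

    def wrap(self, name, fn):
        """Return a version of fn that logs its own execution time on every call."""
        def timed(x):
            start = time.perf_counter()
            out = fn(x)
            self.samples[name].append(time.perf_counter() - start)
            return out
        return timed

    def summary(self):
        """Mean latency in seconds for each layer observed so far."""
        return {name: mean(values) for name, values in self.samples.items()}

    def bottleneck(self):
        """Name of the layer with the highest mean latency."""
        stats = self.summary()
        return max(stats, key=stats.get)


# Toy stand-ins for model layers (assumptions for illustration only).
def embed(x):
    return [v * 0.5 for v in x]


def heavy_block(x):
    # Deliberately expensive so that it dominates the profile.
    total = 0.0
    for _ in range(200):
        total += sum(v * v for v in x)
    return [v + total * 1e-9 for v in x]


def head(x):
    return sum(x)


def run_inference(pipeline, x):
    """Pass the input through each wrapped layer in order."""
    for layer in pipeline:
        x = layer(x)
    return x


profiler = LayerProfiler()
pipeline = [
    profiler.wrap(name, fn)
    for name, fn in [("embed", embed), ("heavy_block", heavy_block), ("head", head)]
]

# Profile while serving a stream of 20 requests.
for _ in range(20):
    run_inference(pipeline, [float(i) for i in range(256)])

for name, seconds in profiler.summary().items():
    print(f"{name}: {seconds * 1e6:.1f} us mean latency")
print("slowest layer:", profiler.bottleneck())
```

In a deployment pipeline, the reported bottleneck could be used to decide where an optimisation such as operator fusion or quantisation is worth applying first; that decision logic is outside the scope of this sketch.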

Dynamic Profiling; Inference Optimisation; Real-Time Performance; Edge AI; Profiling Frameworks; Adaptive Deployment

https://journalijsra.com/sites/default/files/fulltext_pdf/IJSRA-2025-2066.pdf


Ankush Jitendrakumar Tyagi. Optimizing AI Model Inference Performance with Dynamic Profiling. International Journal of Science and Research Archive, 2025, 16(01), 2266-2275. Article DOI: https://doi.org/10.30574/ijsra.2025.16.1.2066.

Copyright © 2025. The author(s) retain the copyright of this article, which is published under the terms of the Creative Commons Attribution License 4.0.

