A Framework for AI Token Usage Forecasting and Capacity Planning in Medium and Large Enterprises

Prasad, Shiva



Article

Data sources: ZENODO


Publication: Article. Status: Under curation. Publisher: Zenodo

Authors: Prasad, Shiva

doi: 10.5281/zenodo.20544663



Abstract

Large language models (LLMs) and other generative AI services are increasingly embedded in enterprise workflows, yet most organizations lack reliable mechanisms for forecasting token-based usage and planning capacity. Unpredictable token consumption leads to budget overruns, exhausted AI credits, and sudden reversions to manual work, all of which undermine confidence in AI adoption. This paper proposes and evaluates a practical framework for forecasting AI token usage at the workflow and department level in medium and large enterprises, and for translating these forecasts into concrete capacity planning decisions. The framework ingests token and telemetry data from AI-enabled applications through a unified gateway, enriches the data with business metadata, and applies a set of baseline and advanced forecasting models to estimate future token consumption under multiple scenarios. A multi-organization field study design is outlined with two aims: to compare forecasting accuracy against naïve and simple trend baselines using mean absolute percentage error (MAPE), root mean squared error (RMSE), and forecast bias, and to assess the impact on budget overruns, credit exhaustion events, and AI workload coverage. The results are expected to demonstrate that workflow-aware forecasting substantially improves planning accuracy and reduces unplanned interruptions without reducing the share of work handled by AI. The proposed framework gives FinOps, IT, and HR leaders an actionable methodology for moving from reactive AI cost control to proactive capacity planning, and it establishes a foundation for more comprehensive AI usage governance and workforce planning.
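To make the evaluation metrics named in the abstract concrete, the following is a minimal sketch of how MAPE, RMSE, and forecast bias might be computed for a naïve baseline and a simple linear trend baseline. It is not the paper's implementation. The function names, the weekly token counts, and the sign convention for bias (forecast minus actual, so that a negative value means under-forecasting) are all assumptions made for illustration.

```python
import math


def mape(actual, forecast):
    """Mean absolute percentage error in percent; actual values must be non-zero."""
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


def rmse(actual, forecast):
    """Root mean squared error, in the same units as the data."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def bias(actual, forecast):
    """Mean signed error (forecast - actual); assumed convention, negative = under-forecast."""
    return sum(f - a for a, f in zip(actual, forecast)) / len(actual)


def naive_forecast(history, horizon):
    """Naive baseline: repeat the last observed value for every future period."""
    return [history[-1]] * horizon


def linear_trend_forecast(history, horizon):
    """Simple trend baseline: ordinary least squares line over the period index."""
    n = len(history)
    x_mean = (n - 1) / 2
    y_mean = sum(history) / n
    slope = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(history)) / sum(
        (x - x_mean) ** 2 for x in range(n)
    )
    intercept = y_mean - slope * x_mean
    return [intercept + slope * (n + h) for h in range(horizon)]


# Hypothetical weekly token consumption (millions of tokens) for one workflow.
history = [10.0, 12.0, 14.0, 16.0]
actual = [18.0, 21.0]  # hypothetical held-out weeks

results = {}
for name, model in [("naive", naive_forecast), ("trend", linear_trend_forecast)]:
    forecast = model(history, len(actual))
    results[name] = {
        "mape": mape(actual, forecast),
        "rmse": rmse(actual, forecast),
        "bias": bias(actual, forecast),
    }
    print(name, {k: round(v, 2) for k, v in results[name].items()})
```

On this toy series the trend baseline tracks the steady growth, while the naïve baseline under-forecasts in both held-out weeks, which shows up as a negative bias. MAPE is undefined when an actual value is zero, so workflows with idle periods would need a different scale-free metric or a guard.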
