Offline Dynamic Inventory and Pricing Strategy: Addressing Censored and Dependent Demand

Name: Offline Dynamic Inventory and Pricing Strategy: Addressing Censored and Dependent Demand
Keywords: FOS: Computer and information sciences, Computer Science - Machine Learning, Artificial Intelligence (cs.AI), Statistics - Machine Learning, Computer Science - Artificial Intelligence, FOS: Mathematics, 90B05, 68T05, 90C40, 62N02, Mathematics - Statistics Theory, Machine Learning (stat.ML), Applications (stat.AP)

Gundem, Korel; Qi, Zhengling

Found an issue? Give us feedback

arXiv.org e-Print Ar...arrow_drop_down

arXiv.org e-Print Archive

Preprint . 2025

Data sources: arXiv.org e-Print Archive

https://dx.doi.org/10.48550/ar...

Article . 2025

License: arXiv Non-Exclusive Distribution

Data sources: Datacite

Offline Dynamic Inventory and Pricing Strategy: Addressing Censored and Dependent Demand

descriptionPublicationkeyboard_double_arrow_right Article , Preprint 01 Jan 2025Embargo end date: 01 Jan 2025Publisher:arXiv

Authors: Gundem, Korel; Qi, Zhengling;

doi: 10.48550/arxiv.2504.09831

arXiv: 2504.09831

Offline Dynamic Inventory and Pricing Strategy: Addressing Censored and Dependent Demand

- Summary
- Subjects
- Metrics

Abstract

In this paper, we study the offline sequential feature-based pricing and inventory control problem where the current demand depends on the past demand levels and any demand exceeding the available inventory is lost. Our goal is to leverage the offline dataset, consisting of past prices, ordering quantities, inventory levels, covariates, and censored sales levels, to estimate the optimal pricing and inventory control policy that maximizes long-term profit. While the underlying dynamic without censoring can be modeled by Markov decision process (MDP), the primary obstacle arises from the observed process where demand censoring is present, resulting in missing profit information, the failure of the Markov property, and a non-stationary optimal policy. To overcome these challenges, we first approximate the optimal policy by solving a high-order MDP characterized by the number of consecutive censoring instances, which ultimately boils down to solving a specialized Bellman equation tailored for this problem. Inspired by offline reinforcement learning and survival analysis, we propose two novel data-driven algorithms to solving these Bellman equations and, thus, estimate the optimal policy. Furthermore, we establish finite sample regret bounds to validate the effectiveness of these algorithms. Finally, we conduct numerical experiments to demonstrate the efficacy of our algorithms in estimating the optimal policy. To the best of our knowledge, this is the first data-driven approach to learning optimal pricing and inventory control policies in a sequential decision-making environment characterized by censored and dependent demand. The implementations of the proposed algorithms are available at https://github.com/gundemkorel/Inventory_Pricing_Control

Keywords

FOS: Computer and information sciences, Computer Science - Machine Learning, Artificial Intelligence (cs.AI), Statistics - Machine Learning, Computer Science - Artificial Intelligence, FOS: Mathematics, 90B05, 68T05, 90C40, 62N02, Mathematics - Statistics Theory, Machine Learning (stat.ML), Applications (stat.AP), Statistics Theory (math.ST), Statistics - Applications, Machine Learning (cs.LG)

Impact byBIP!

	selected citations These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).	0
	popularity This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.	Average
	influence This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).	Average
	impulse This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.	Average

Found an issue? Give us feedback

0

Average

Green