Who reported this story?

This story was reported by arXiv cs.RO.

Robotics

Relating Reinforcement Learning to Dynamic Programming-Based Planning

Robos News Newsroom

Editorial Desk

2026-06-29 · 2 min read

Overview

Abstract (arXiv:2603.07844v2): This paper bridges some of the gap between optimal planning and reinforcement learning (RL), both of which share roots in dynamic programming applied to sequential decision making or optimal control. Whereas planning typically favors deterministic models, goal termination, and cost minimization, RL tends to favor stochastic models, infinite-horizon discounting, and reward maximization, in addition to learning-related parameters such as the learning rate and greediness factor. A derandomized version of RL is developed, analyzed, and implemented to yield performance comparisons with value iteration and Dijkstra's algorithm using simple planning models. Next, mathematical analysis shows: 1) conditions under which cost minimization and reward maximization are equivalent, 2) conditions for equivalence of single-shot goal termination and infinite-horizon episodic learning, and 3) conditions under which discounting causes goal achievement to fail. The paper then advocates for defining and optimizing true cost, rather than inserting arbitrary parameters to guide operations. Performance studies are then extended to the stochastic case, using planning-oriented criteria and comparing value iteration to RL with learning rates and greediness factors.
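To make the comparison between value iteration and Dijkstra's algorithm concrete, here is a minimal Python sketch on a toy deterministic planning model with goal termination and cost minimization. It is not the paper's implementation: the graph `EDGES`, the goal name `GOAL`, and the functions `value_iteration` and `dijkstra_to_goal` are hypothetical names chosen for illustration, and the sketch assumes positive edge costs and no discounting.

```python
import heapq

# Hypothetical toy model: EDGES[u] lists (next_state, step_cost) pairs.
# "G" is the goal and has no outgoing edges (goal termination).
EDGES = {
    "A": [("B", 1.0), ("C", 4.0)],
    "B": [("C", 2.0), ("G", 6.0)],
    "C": [("G", 1.0)],
    "G": [],
}
GOAL = "G"


def value_iteration(edges, goal):
    """Undiscounted cost-to-go by repeated Bellman backups over all states."""
    cost = {s: float("inf") for s in edges}
    cost[goal] = 0.0
    changed = True
    while changed:
        changed = False
        for s in edges:
            if s == goal:
                continue
            # Bellman backup: best step cost plus cost-to-go of the successor.
            best = min((c + cost[v] for v, c in edges[s]), default=float("inf"))
            if best < cost[s]:
                cost[s] = best
                changed = True
    return cost


def dijkstra_to_goal(edges, goal):
    """Cost-to-go computed by Dijkstra's algorithm run backward from the goal."""
    reverse = {s: [] for s in edges}
    for u, outs in edges.items():
        for v, c in outs:
            reverse[v].append((u, c))
    dist = {s: float("inf") for s in edges}
    dist[goal] = 0.0
    heap = [(0.0, goal)]
    while heap:
        d, v = heapq.heappop(heap)
        if d > dist[v]:
            continue  # stale queue entry
        for u, c in reverse[v]:
            if d + c < dist[u]:
                dist[u] = d + c
                heapq.heappush(heap, (dist[u], u))
    return dist


if __name__ == "__main__":
    print(value_iteration(EDGES, GOAL))
    print(dijkstra_to_goal(EDGES, GOAL))
```

On this toy graph both functions return the same cost-to-go table. Value iteration sweeps every state until nothing changes, while Dijkstra's algorithm settles each state once in order of increasing cost, which is the kind of difference a performance comparison on simple planning models would measure.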

Source

Originally published at arxiv.org.

Source: https://arxiv.org/abs/2603.07844

