Afternoon with Green Tea

Reinforcement Learning

Reinforcement Learning in the machine learning category

Diffusion Model via RL

Training Diffusion Models with Reinforcement Learning tries to apply RLHF to diffusion models, aiming to generate higher-quality images. The noising and denoising processes in a diffusion model assume the Markov property and the final latent distribution in order to learn $ p_{\theta}(x_{t-1} \mid x_t) $. The loss function for Diffusion
14 Feb 2026 2 min read

Multiplex Thinking vs Maximum Likelihood Reinforcement Learning (MLRL)

How two 2025 reasoning-training paradigms independently rediscovered “optimize search success instead of single‑trajectory accuracy”. TL;DR: Both papers try to train language models for Pass@K / best‑of‑N decoding success rather than single‑sample correctness, but they approach the problem from completely different directions: Multiplex Thinking improves
14 Feb 2026 2 min read

Convergence

An analysis of how to guarantee that Vs(s) converges under the TD(k) update.
21 Sep 2023

Temporal-Difference Learning

Temporal-difference learning in reinforcement learning, including the favorites: TD(0), TD(1), and TD(lambda).
21 Sep 2023

Basic model of Reinforcement Learning

The basic model of reinforcement learning: the value function V(s), the Q(s, a) function, and the continuation value function C(s, a).
21 Sep 2023 1 min read
Afternoon with Green Tea © 2026