PPO RL Algo Using Python

Fully Distributed Event-Triggered Control for Containment of Fully Heterogeneous MASs Using Filter-Based RL

Abstract: The complexity of multiagent systems poses challenges in containment control, especially with fully heterogeneous agents influenced by multiple leaders. Traditional methods rely on ...

IEEE

IPPO: Enhanced PPO Algorithm for Multi-Drone Control

Abstract: Unmanned aerial vehicles (UAVs) have gained significant attention in recent years due to their wide-ranging applications in areas such as surveillance, delivery, and disaster response.

GitHub

SCOPE-RL: A Python library for offline reinforcement learning, off-policy evaluation, and selection

SCOPE-RL is an open-source Python library for implementing the end-to-end offline reinforcement learning (offline RL) procedure, from data collection to offline policy learning, off-policy ...

GitHub

FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization

Figure 1. FIPO vs. baselines on AIME 2024. FIPO shows that pure RL training alone can outperform reproduced pure-RL baselines such as DAPO and DeepSeek-R1-Zero-32B, surpass o1-mini, and produce ...

Dexerto

PewDiePie reveals how he ‘fixed’ YouTube’s algorithm using his own AI

Frustrated by YouTube feeding him Shorts and advertisements, PewDiePie set out to ‘fix’ the platform’s algorithm with his own AI. YouTube icon Felix ‘PewDiePie’ Kjellberg famously trained his own AI ...
