Publications
Ren Kishimoto*, Rikiya Takehi*, Koichi Tanaka, Riku Togashi, Masahiro Nomura, Yoji Tomita, Yuta Saito.
Beyond Match Maximization and Fairness: Retention-Optimized Two-Sided Matching.
In Proceedings of the 13th International Conference on Learning Representations (ICLR), 2025.

Koichi Tanaka*, Ren Kishimoto*, Bushun Kawagishi, Yusuke Narita, Nobuyuki Shimizu, Yasuo Yamamoto, Yuta Saito.
Off-Policy Learning with Limited Supply.
In Proceedings of the ACM Web Conference (TheWebConf), 2025.

Ren Kishimoto, Tatsuhiro Shimizu, Kazuki Kawamura, Takanori Muroi, Yusuke Narita, Yuki Sasamoto, Kei Tateno, Takuma Udagawa, Yuta Saito.
Offline Contextual Bandits in the Presence of New Actions.
KDD 2025 Workshop - Causal Inference and Machine Learning in Practice.

Tatsuhiro Shimizu*, Koichi Tanaka*, Ren Kishimoto, Haruka Kiyohara, Masahiro Nomura, Yuta Saito.
Effective Off-Policy Evaluation and Learning in Contextual Combinatorial Bandits.
In Proceedings of the 18th ACM Recommender Systems Conference (RecSys), 2024.

Ren Kishimoto*, Koichi Tanaka*, Haruka Kiyohara, Yusuke Narita, Nobuyuki Shimizu, Yasuo Yamamoto, Yuta Saito.
Efficient Offline Learning of Ranking Policies via Top-k Policy Decomposition.
ICML 2024 Workshop on Aligning Reinforcement Learning Experimentalists and Theorists (ARLET), 2024.

Haruka Kiyohara, Ren Kishimoto, Kosuke Kawakami, Ken Kobayashi, Kazuhide Nakata, Yuta Saito.
Towards Assessing and Benchmarking Risk-Return Tradeoff of Off-Policy Evaluation.
In Proceedings of the 12th International Conference on Learning Representations (ICLR), 2024.

Haruka Kiyohara, Ren Kishimoto, Kosuke Kawakami, Ken Kobayashi, Kazuhide Nakata, Yuta Saito.
SCOPE-RL: A Python Library for Offline Reinforcement Learning and Off-Policy Evaluation.
arXiv preprint, 2023.
