Ping-Chun Hsieh (謝秉均)

Associate Professor
Department of Computer Science,
National Yang Ming Chiao Tung University (NYCU)
Office: EC 418 (光復校區-工程三館418室)
Lab: EC 129 (光復校區-工程三館129室)
Email: pinghsieh [AT] nycu [DOT] edu [DOT] tw
[Google Scholar] [Facebook]

About me

Welcome! I am an Associate Professor in the Department of Computer Science at National Yang Ming Chiao Tung University. My research currently focuses on Bandit Learning, Reinforcement Learning, Bayesian Optimization, Meta Learning, and Optimization for Networks.

To Prospective Students

(Admitted master's and PhD students for academic year 115 (ROC calendar) are welcome to email me with inquiries.) I am currently looking for highly motivated students (including postdoctoral researchers, graduate students, undergraduate students, and research assistants) who are interested in either theoretical or system aspects of Reinforcement Learning, Bandit Learning, Bayesian Optimization, or other related areas. Both graduate and undergraduate students at NYCU are welcome to contact me to discuss my research.

Recent Schedule

I will be at NeurIPS on December 2-7, 2025.

News

[Sep. 2025] I gave a talk at NTHU Stats (清大統計與數據科學所) on “First-Order Methods Find Globally Optimal Policies in Reinforcement Learning.”
[Sep. 2025] Our paper titled "Learning Human-Like RL Agents Through Trajectory Optimization With Action Quantization" is accepted to NeurIPS 2025.
[Aug. 2025] Invited to serve as an Area Chair of ICLR 2026.
[Aug. 2025] Congratulations to our lab member Chia-Han Yeh (葉佳翰) on receiving the 18th Taiwan Management Institute Thesis Award (崇越AI應用論文優等獎)!
[Aug. 2025] One paper is accepted to ACML 2025. The title is "Relaxed Transition Kernels can Cure Underestimation in Adversarial Offline Reinforcement Learning." This is a joint work with Prof. YuShuen Wang and Dr. Yun-Hsuan Lien.
[Aug. 2025] One paper is accepted to EMNLP 2025. The title is "Extending Automatic Machine Translation Evaluation to Book-Length Documents." Congratulations to the PhD student Kuang-Da Wang (王廣達)!
[Jun. 2025] Congratulations to the PhD student Yu-Heng Hung (洪鈺恆) on receiving the Hon Hai Technology Award 2025 (鴻海科技獎)!
[May 2025] Our paper titled "Action-Constrained Imitation Learning" is accepted to ICML 2025! Congratulations to our lab members Chia-Han Yeh (葉佳翰), Wei Hung (洪偉), and Hung-Yen Wu (吳泓諺)!
[Jan. 2025] Two papers are accepted to ICLR 2025! The titles are "Efficient Action-Constrained Reinforcement Learning via Acceptance-Rejection Method and Augmented MDPs" and "BOFormer: Learning to Solve Multi-Objective Bayesian Optimization via Non-Markovian RL". Congratulations to the students Wei Hung (洪偉), Yu-Heng Hung (洪鈺恆), Kai-Jie Lin (林楷傑), and Yu-Heng Lin (林禹亨)!
[Dec. 2024] I gave a talk at TAAI 2024 Nectar session on "Enhancing Value Function Estimation through First-Order State-Action Dynamics in Offline Reinforcement Learning."
[Nov. 2024] I gave a talk at the NTU Communication Seminar (台大電信所專題演講) on "From Policy Gradient to Nesterov Acceleration: A Unified Framework for Global Optimality in RL."
[Sep. 2024] I am honored to receive the NYCU Outstanding Teaching Award (校級傑出教學獎)!
[Jun. 2024] Congratulations to the PhD student Yun-Hsuan Lien (連云暄) on winning the Hon Hai Technology Award 2024 (鴻海科技獎)!
[May 2024] One paper titled "Offline Imitation of Badminton Player Behavior via Experiential Contexts and Brownian Motion" is accepted to ECML-PKDD 2024. This is joint work with the PhD student Kuang-Da Wang (王廣達), Dr. Wei-Yao Wang, and Prof. Wen-Chih Peng.
[May 2024] Two papers are accepted to ICML 2024! The titles are "Accelerated Policy Gradient: On the Convergence Rates of the Nesterov Momentum for Reinforcement Learning" and "Enhancing Value Function Estimation through First-Order State-Action Dynamics in Offline Reinforcement Learning". Congratulations to the students Yun-Hsuan Lien (連云暄), Yen-Ju Chen (陳彥儒), and Nai-Chieh Huang (黃迺絜)!
[Dec. 2023] Our paper titled "PPO-Clip Attains Global Optimality: Towards Deeper Understandings of Clipping" has been accepted to AAAI 2024. Congratulations to our undergraduate member Nai-Chieh Huang (黃迺絜) and the PhD student Kuo-Hao Ho (何國豪)! This is joint work with Prof. I-Chen Wu.
[Sep. 2023] Our paper titled "Towards Human-Like RL: Taming Non-Naturalistic Behavior in Deep RL via Adaptive Behavioral Costs in 3D Games" has been accepted to ACML 2023. This is joint work with Prof. I-Chen Wu and the PhD student Kuo-Hao Ho (何國豪).
[Jul. 2023] Our group members Yen-Ju Chen (陳彥儒) and Nai-Chieh Huang (黃迺絜) will present their latest research "Accelerated Policy Gradient: On the Nesterov Momentum for Reinforcement Learning" at the Frontiers4LCD Workshop at ICML 2023. Congratulations!
[Jul. 2023] A paper titled "Generating Turn-Based Player Behavior via Experience from Demonstrations" has been accepted to the 1st SPIGM Workshop at ICML 2023. This is a joint work with Prof. Wen-Chih Peng and the students Kuang-Da Wang (王廣達) and Wei-Yao Wang (王威堯).
[Jul. 2023] Our undergraduate group member Nai-Chieh Huang (黃迺絜) won 1st place in the NYCU CS Undergraduate Research Competition (陽明交大資工系專題競賽特優). Congratulations!
[Jun. 2023] Our undergraduate group members 吳文心 and 廖兆琪 (along with her teammate 孟祥蓉) were awarded the NSTC Research Grant for University Students (國科會大專學生研究計畫). Congratulations!
[Apr. 2023] Our paper titled "Revisiting Domain Randomization via Relaxed State-Adversarial Policy Optimization" has been accepted to ICML 2023. Congratulations to the PhD student Yun-Hsuan Lien (連云暄)!
[Jan. 2023] A paper titled "Q-Pensieve: Boosting Sample Efficiency of Multi-Objective RL Through Memory Sharing of Q-Snapshots" is accepted to ICLR 2023! Congratulations to our group members Wei Hung (洪偉) and Bo-Kai Huang (黃柏愷)!
[Jan. 2023] A paper titled "Coordinate Ascent for Off-Policy RL with Global Convergence Guarantees" is accepted to AISTATS 2023! Congratulations to our group members Hsin-En Su (蘇信恩) and Yen-Ju Chen (陳彥儒)!
[Nov. 2022] Our paper "Reward-Biased Maximum Likelihood Estimation for Neural Contextual Bandits: A Distributional Learning Perspective" is accepted to AAAI 2023! Congratulations to our group member Yu-Heng Hung (洪鈺恆)!
[Sep. 2022] A paper titled "Neural Frank-Wolfe Policy Optimization for Region-of-Interest Intra-Frame Coding with HEVC/H.265" is accepted for VCIP 2022. This is a joint work with Prof. Wen-Hsiao Peng and the students Yung-Han Ho and Chia-Hao Kao.
[Sep. 2022] I gave an invited talk at TSFN Nano-Symposium 2022 about "Rethinking Policy Improvement in Reinforcement Learning."
[Jul. 2022] Our undergraduate group members 林浩君 and 許承壹 won 2nd place in the NYCU CS Undergraduate Research Competition (陽明交大資工系專題競賽優等). Congratulations!
[Jun. 2022] I gave a talk at CS Department at NTHU, and the title is "Rethinking Policy Improvement in Reinforcement Learning."
[Jun. 2022] I gave a talk at ISS at Academia Sinica (中研院統計所) about "Exploration Through Reward Biasing in Bandits."
[Mar. 2022] I gave two talks at TIGP@Academia Sinica and GIBMS@NTU Hospital about "Rethinking Policy Improvement in Reinforcement Learning."
[Mar. 2022] Yu-Heng (PhD student) and I are delighted to share on arXiv about our recent progress on "Reward-Biased Maximum Likelihood Estimation for Neural Contextual Bandits."
[Dec. 2021] I gave an invited talk at the NeurIPS 2021 Taiwan Meetup and was invited to serve as a panelist.
[Nov. 2021] I gave a talk at the Neuroscience Workshop (大腦、認知與演算法工作坊), and the title is "Revisiting Exploration in Bandits: Reward Biasing and Meta Learning."
[Oct. 2021] Two papers are accepted to NeurIPS 2021! The titles are "Reinforced Few-Shot Acquisition Function Learning for Bayesian Optimization" and "NeurWIN: Neural Whittle Index Network for Restless Bandits via Deep RL." Congratulations to our group member Bing-Jing Hsieh (謝秉瑾)!
[Oct. 2021] I gave a talk at TAAI AI Forum 2021, and the title is "Exploration Through Reward Biasing: Bandit Learning via Reward-Biased Maximum Likelihood Estimation."
[Sep. 2021] I gave a talk at the Institute of Neuroscience in NYCU, and the topic is "Rethinking Policy Improvement in Reinforcement Learning: Two Case Studies."
[Jun. 2021] Our undergraduate group member Cheng-Yu Chung (鍾承佑) (along with his teammates Bing-Zhi Ke (柯秉志) and Yu-Lun Hsu (徐煜倫)) was awarded the MOST Research Grant for University Students. Congratulations!
[May 2021] Our paper, "Escaping from Zero Gradient: Revisiting Action-Constrained Reinforcement Learning via Frank-Wolfe Policy Optimization," is accepted to UAI 2021! Congratulations to our group members Jyun-Li Lin (林峻立), Wei Hung (洪偉), and Shang-Hsuan Yang (楊上萱)!
[Apr. 2021] I gave a talk at the Department of Computer Science in NTNU, and the talk is on "Reinforcement Learning and Bandits: Two Case Studies."
[Dec. 2020] A paper, "Reward-Biased Maximum Likelihood Estimation for Linear Stochastic Bandits," is accepted to AAAI 2021! Congratulations to our group member Yu-Heng Hung (洪鈺恆)!
[Oct. 2020] Our paper, "Fresher Content or Smoother Playback? A Brownian-Approximation Framework for Scheduling Real-Time Wireless Video Streams," received the Best Paper Award from MobiHoc 2020!
[Oct. 2020] Our paper, "Rethinking Deep Policy Gradient via State-Wise Policy Improvement," is accepted to ICBINB Workshop @ NeurIPS 2020.
[Jun. 2020] Our paper, "Exploration Through Reward Biasing: Reward-Biased Maximum Likelihood Estimation for Stochastic Multi-Armed Bandits," is accepted to ICML 2020!
[Apr. 2020] A paper, "Fresher Content or Smoother Playback? A Brownian-Approximation Framework for Scheduling Real-Time Wireless Video Streams," is accepted to MobiHoc 2020!
[Dec. 2019] I gave a talk at Academia Sinica and the Department of Communication Engineering in NTPU on "Bandit Learning: Optimality, Scalability, and Reneging."
[Jun. 2019] I attended ICML 2019 in Long Beach, CA and gave a presentation on "Stay With Me: Lifetime Maximization Through Heteroscedastic Linear Bandits With Reneging."
[Apr. 2019] A paper, "Stay With Me: Lifetime Maximization Through Heteroscedastic Linear Bandits With Reneging," is accepted to ICML 2019!
[Feb. 2019] I was awarded the Young Scholar Fellowship (愛因斯坦計畫) from the Ministry of Science and Technology (MOST) in Taiwan.
[Nov. 2018] I gave a talk at the CS Department in NCTU on "Providing Ultra-Low Latency for Wireless Networks: From Theory to Practice."
[Nov. 2018] I gave a talk at the TAMU ECE seminar on "PULS: Processor-Supported Ultra-Low Latency Scheduling."
[Jun. 2018] A paper, "Heavy-Traffic Analysis of QoE Optimality for On-Demand Video Streams Over Fading Channels," is accepted to IEEE/ACM Transactions on Networking.
[May 2018] I attended ICC 2018 in Kansas City, MO and gave a presentation on "An Experimental Study on Coverage Enhancement of LTE Cat-M1 for Machine-Type Communication."
[May 2018] I passed my final defense and received my Ph.D. degree from ECE at TAMU. The thesis title is "Network Algorithms for Control and Communication for IoT Applications."
[Apr. 2018] A paper, "A Decentralized Medium Access Protocol for Real-Time Wireless Ad Hoc Networks With Unreliable Transmissions," is accepted to IEEE ICDCS 2018.
[Mar. 2018] A paper, "PULS: Processor-Supported Ultra-Low Latency Scheduling," is accepted to ACM MobiHoc 2018.
[Feb. 2018] I attended the ITA Workshop in San Diego, CA and gave an invited presentation at the Graduation Day session on "Throughput-Optimal Scheduling for Multi-Hop Networked Transportation Systems With Switch-Over Delay."