Hu Yujing, Gao Yang, An Bo. Online Counterfactual Regret Minimization in Repeated Imperfect Information Extensive Games[J]. Journal of Computer Research and Development, 2014, 51(10): 2160-2170. DOI: 10.7544/issn1000-1239.2014.20130823

Online Counterfactual Regret Minimization in Repeated Imperfect Information Extensive Games

  • Published Date: September 30, 2014
  • In this paper, we consider the problem of exploiting suboptimal opponents in imperfect-information extensive games. Most previous work models the opponent and then computes a best response to that model. A potential drawback of this approach is that the computed best response may not be a true best response, because the modeled strategy may differ from the strategy the opponent actually plays. We address this problem from the perspective of online regret minimization, which avoids opponent modeling, by extending a state-of-the-art equilibrium-computing algorithm called counterfactual regret minimization (CFR). The core problem is how to compute counterfactual values in online scenarios. We propose learning approximations of these values from the results produced by the game and introduce two estimators: a static estimator, which learns the values directly from the distribution of the results, and a dynamic estimator, which assigns larger weights to newly sampled results than to older ones in order to adapt better to dynamic opponents. Based on these two estimators, we propose two algorithms for online regret minimization. We also give the conditions under which the values produced by our estimators equal the true values, which shows the relationship between CFR and our algorithms. Experimental results in one-card poker show that our algorithms not only perform best when exploiting some weak opponents, but also outperform some state-of-the-art algorithms by achieving the highest win rate in matches of only a few hands.
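
The abstract does not give the estimators' formulas, so the following is only a rough illustration of the contrast it describes, not the authors' method. It assumes the static estimator can be approximated by a plain sample mean of observed results and the dynamic estimator by an exponential recency-weighted average; the class names, the `step` parameter, and the sample data are hypothetical. The `regret_matching` helper is the standard rule CFR uses to turn accumulated regrets into a strategy.

```python
from dataclasses import dataclass


@dataclass
class StaticEstimator:
    """Assumed stand-in: plain sample mean, every result weighted equally."""
    total: float = 0.0
    count: int = 0

    def update(self, result: float) -> None:
        self.total += result
        self.count += 1

    @property
    def value(self) -> float:
        return self.total / self.count if self.count else 0.0


@dataclass
class DynamicEstimator:
    """Assumed stand-in: exponential average, newer results weighted more."""
    step: float = 0.5      # hypothetical weight given to the newest result
    estimate: float = 0.0
    seen: bool = False

    def update(self, result: float) -> None:
        if not self.seen:
            # The first observation initializes the estimate.
            self.estimate = result
            self.seen = True
        else:
            # Move the estimate a fraction `step` toward the new result.
            self.estimate += self.step * (result - self.estimate)

    @property
    def value(self) -> float:
        return self.estimate


def regret_matching(regrets: list[float]) -> list[float]:
    """Play each action in proportion to its positive accumulated regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    # No positive regret: fall back to the uniform strategy.
    return [1.0 / len(regrets)] * len(regrets)


# Made-up results for one action: the opponent changes behavior after
# four hands, so the payoff flips from +1 to -1.
results = [1.0, 1.0, 1.0, 1.0, -1.0, -1.0]
static, dynamic = StaticEstimator(), DynamicEstimator()
for r in results:
    static.update(r)
    dynamic.update(r)

print(static.value)    # about 0.333: still dominated by the old behavior
print(dynamic.value)   # -0.5: already tracking the new behavior
print(regret_matching([2.0, -1.0, 2.0]))  # [0.5, 0.0, 0.5]
```

In this toy run the sample mean still reports a positive value after the opponent has switched, while the recency-weighted average has already turned negative, which is the kind of adaptation to dynamic opponents the abstract attributes to the dynamic estimator.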