This thesis explores Deep Reinforcement Learning-based methods for strategy optimization in Contract Bridge, such as Multi-Agent Reinforcement Learning, Value-based Learning, and Policy-based Learning, as well as Knowledge Representation and Reasoning-based methods such as the Multi-agent Epistemic Situation Calculus.
Thesis Type |
Status | Finished
Presentation room | Seminar room I5 6202
Supervisor(s) | Stefan Decker
Advisor(s) | Yongli Mou
Contact | mou@dbis.rwth-aachen.de
Game AI has long been a prominent topic in artificial intelligence research. A landmark moment came in 2016, when AlphaGo defeated the professional Go player Lee Sedol, the first time an AI beat one of the world's top Go players. Later, AlphaGo Master defeated Ke Jie, and AlphaGo Zero, which started without human knowledge, surpassed AlphaGo by winning 100 games to 0 within 3 days and exceeded all earlier AlphaGo versions within 40 days.
Go has an enormous search space, but it is a two-player, perfect-information, strictly competitive game played in a relatively simple environment. Other games may have a smaller search space than Go yet remain challenging for AI research for different reasons, such as complex playing and scoring rules or a large amount of hidden information. Contract Bridge is a multi-player, two-sided (cooperative and competitive), stochastic, imperfect-information trick-taking game. Research on AI for Contract Bridge is still in its relative infancy because the game is stochastic (the cards are shuffled), involves hidden information (players cannot see their opponents' cards), and is complex overall.
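The two sources of difficulty named above, the random deal and the hidden cards, can be illustrated with a minimal Python sketch. The names `deal` and `observation` and the card encoding are assumptions made for illustration only; they are not part of any existing Bridge library.

```python
import random

SUITS = "SHDC"            # spades, hearts, diamonds, clubs
RANKS = "AKQJT98765432"
PLAYERS = ["N", "E", "S", "W"]


def deal(seed=None):
    """Shuffle a 52-card deck and give 13 cards to each player (stochastic part)."""
    rng = random.Random(seed)
    deck = [rank + suit for suit in SUITS for rank in RANKS]
    rng.shuffle(deck)
    return {p: sorted(deck[i * 13:(i + 1) * 13]) for i, p in enumerate(PLAYERS)}


def observation(hands, player):
    """What one player sees before the auction: only their own hand (hidden information)."""
    own = list(hands[player])
    return {"player": player, "own_hand": own, "hidden_cards": 52 - len(own)}


hands = deal(seed=0)
obs = observation(hands, "N")
print(obs["own_hand"], obs["hidden_cards"])
```

Each player starts with 13 known cards and 39 unknown ones, so an agent has to reason over the possible distributions of the remaining cards rather than over a single known state.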
The goals of this thesis are as follows:
- Develop Deep Reinforcement Learning (DRL) models that optimize the strategy in both the bidding stage and the playing stage of Contract Bridge. Different DRL methods, such as Multi-Agent Reinforcement Learning, Value-based Learning, and Policy-based Learning, will be explored.
- In a scenario with imperfect information and uncertainty, a more rational strategy requires better abstraction of state, action and change. It will be interesting to develop a framework to formalize the game and build strategies based on both objective and subjective information obtained by agents. A promising solution to this task is a probabilistic extension of the multi-agent epistemic situation calculus.
- Develop a Contract Bridge game-playing environment for both Deep Reinforcement Learning and Knowledge Representation and Reasoning-based approaches. OpenAI Gym is a toolkit for developing and comparing reinforcement learning algorithms. One task of this thesis is to develop a Contract Bridge game environment based on the OpenAI Gym framework.
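As a rough idea of what such an environment could look like, the following Python sketch models only the bidding stage. It mirrors the classic Gym `reset`/`step` pattern but deliberately does not import Gym, so it runs on its own. The class name `BridgeBiddingEnv`, the action encoding, and the constant zero reward are all assumptions for illustration; doubles and redoubles are omitted, and a real reward would require scoring the played contract.

```python
import random

STRAINS = ["C", "D", "H", "S", "NT"]
PASS = 0  # assumed encoding: actions 1..35 stand for the bids 1C..7NT


def bid_name(action):
    """Translate an action index into a readable bid."""
    if action == PASS:
        return "Pass"
    level, strain = divmod(action - 1, 5)
    return f"{level + 1}{STRAINS[strain]}"


class BridgeBiddingEnv:
    """Hypothetical bidding-only environment with a Gym-like reset/step interface."""

    def __init__(self, seed=None):
        self.rng = random.Random(seed)

    def reset(self):
        deck = list(range(52))
        self.rng.shuffle(deck)
        self.hands = [sorted(deck[i * 13:(i + 1) * 13]) for i in range(4)]
        self.history = []   # all calls made so far
        self.current = 0    # index of the player to act
        return self._obs()

    def legal_actions(self):
        # A new bid must be higher than the highest bid so far; passing is always legal.
        highest = max(self.history, default=PASS)
        return [PASS] + list(range(highest + 1, 36))

    def step(self, action):
        if action not in self.legal_actions():
            raise ValueError(f"illegal call: {bid_name(action)}")
        self.history.append(action)
        self.current = (self.current + 1) % 4
        reward = 0.0  # placeholder: real scoring needs the playing stage
        return self._obs(), reward, self._auction_over(), {}

    def _auction_over(self):
        h = self.history
        if len(h) == 4 and all(a == PASS for a in h):
            return True  # passed out
        return len(h) >= 4 and any(a != PASS for a in h) and h[-3:] == [PASS] * 3

    def _obs(self):
        # The acting player sees only their own hand plus the public auction.
        return {"player": self.current,
                "hand": list(self.hands[self.current]),
                "history": list(self.history)}


env = BridgeBiddingEnv(seed=0)
obs = env.reset()
done = False
while not done:
    # Random policy restricted to passing or one of the two cheapest bids.
    action = env.rng.choice(env.legal_actions()[:3])
    obs, reward, done, info = env.step(action)
print([bid_name(a) for a in env.history])
```

The loop at the end plays one auction with a random policy; a learned DRL agent would replace the random choice, and a knowledge-based agent could use the same observation dictionary as input to its reasoning.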
If you are interested in this thesis, do not hesitate to contact us at mou@dbis.rwth-aachen.de.
References
[1] Silver D, Huang A, Maddison CJ, Guez A, Sifre L, Van Den Driessche G, Schrittwieser J, Antonoglou I, Panneershelvam V, Lanctot M, Dieleman S. Mastering the game of Go with deep neural networks and tree search. Nature. 2016 Jan;529(7587):484-9.
[2] Silver D, Schrittwieser J, Simonyan K, Antonoglou I, Huang A, Guez A, Hubert T, Baker L, Lai M, Bolton A, Chen Y. Mastering the game of Go without human knowledge. Nature. 2017 Oct;550(7676):354-9.
[3] Bellamy-McIntyre J. Applying Case-Based Reasoning to the Game of Bridge (Doctoral dissertation, University of Auckland).
[4] Ho CY, Lin HT. Contract bridge bidding by learning. In: Workshops at the Twenty-Ninth AAAI Conference on Artificial Intelligence. 2015 Apr 1.
[5] Zhang X, Lin R, Bo Y, Yang F. The Synergy of Double Neural Networks for Bridge Bidding. Mathematics. 2022 Sep 3;10(17):3187.
[6] Xiong L, Liu Y. Strategy Representation and Reasoning for Incomplete Information Concurrent Games in the Situation Calculus. In: IJCAI. 2016: 1322-1329.
Knowledge about Deep Reinforcement Learning and Knowledge Representation and Reasoning
Programming language – Python
Deep Learning Framework – PyTorch