Steamhammer has a UCB bug

Ack, Steamhammer has a typo in its UCB formula! A parenthesis is misplaced. What a blunder! In UCB1_bound(), change this inexcusable mistake:

	return sqrt(2.0 * log(double(total)/tries));

to this:

	return sqrt(2.0 * log(total) / tries);

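The typecast was meant for the argument of log(), where it has no formal effect in C++11 and later because log() converts an integer argument to double. All it did was make the error harder to see.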

The current Steamhammer 1.4.1 uses UCB only for deciding whether to steal gas, when AutoGasSteal is turned on. I had been wondering why it chose to steal gas so often against so many opponents. Was the gas steal really that effective? When I looked again at the code, I soon spotted the mistake.
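
For readers who want to see how a UCB1 decision of this kind fits together, here is a minimal sketch. It is not Steamhammer's code: the `Arm` struct, `choose_arm()`, and the `int` parameter types are hypothetical, and only the corrected bound formula comes from the post. Each option (say, "steal gas" and "don't steal gas") keeps a count of tries and wins, and the option with the highest win rate plus exploration bonus is picked.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical record for one option, e.g. "steal gas" or "don't steal gas".
struct Arm {
    int tries;  // games in which this option was chosen
    int wins;   // games won after choosing it
};

// Exploration bonus, using the corrected formula from the post.
// The int parameters are an assumption; the real signature is not shown.
double ucb1_bound(int total, int tries) {
    return std::sqrt(2.0 * std::log(total) / tries);
}

// Return the index of the option with the highest win rate plus bonus.
// An untried option is returned first, so tries is never zero in the formula.
std::size_t choose_arm(const std::vector<Arm>& arms) {
    int total = 0;
    for (const Arm& a : arms) total += a.tries;

    std::size_t best = 0;
    double best_score = -std::numeric_limits<double>::infinity();
    for (std::size_t i = 0; i < arms.size(); ++i) {
        if (arms[i].tries == 0) return i;
        double mean = double(arms[i].wins) / arms[i].tries;
        double score = mean + ucb1_bound(total, arms[i].tries);
        if (score > best_score) {
            best_score = score;
            best = i;
        }
    }
    return best;
}
```

With equal try counts the bonus is the same for every option and the better win rate wins. An option that has rarely been tried gets a large bonus and is revisited even if its record so far is mediocre.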

The behavior is approximately right when the number of games is small. That’s how it passed my end-to-end tests. As the number of games goes up, it gets more and more wrong. It’s impossible to be too careful in testing. :-/
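
To make that concrete, here is a small sketch that puts the two return expressions side by side. The function names and the `int` parameter types are assumptions for illustration; only the two formulas are taken from the post.

```cpp
#include <cmath>

// The expression with the misplaced parenthesis: the division by tries
// happens inside the logarithm. (Hypothetical name, assumed int parameters.)
double bound_buggy(int total, int tries) {
    return std::sqrt(2.0 * std::log(double(total) / tries));
}

// The corrected expression: log(total) is divided by tries.
double bound_fixed(int total, int tries) {
    return std::sqrt(2.0 * std::log(total) / tries);
}
```

With 2 games and 1 try, both return the same value, sqrt(2 ln 2), about 1.18. Now suppose an option has been tried in half of all games. The buggy version depends only on the ratio total/tries, so it stays at about 1.18 however many games are played. The corrected version shrinks, to about 0.43 at 100 games and about 0.17 at 1000. Win rates lie between 0 and 1, so a bonus that never shrinks keeps outweighing the actual results.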

The upcoming version will use UCB for opening selection—not in the most direct way, like most bots, but with a twist to cope with the large number of openings, too many to explore. Good thing I caught the bug in time.

Comments

Paul Goodman:

Would you mind clarifying UCB for us laymen? Google turns up a university and a comedy group.

Jay Scott:

UCB (upper confidence bound) is a family of solutions to the multi-armed bandit problem (https://en.wikipedia.org/wiki/Multi-armed_bandit). It is for making decisions (A or B or C?) when you have to figure out from experience which decision is better. UCB1 is a specific algorithm from the family, popular for its simplicity and generality.
