Exploring the python-chess module
Lichess' accuracy metrics replication as a pretext to explore the python-chess module.
Introduction
In my previous blog, Fitting an Elo model to Titled Tuesday games, I used Python to fit an Elo model to Titled Tuesday blitz games.
To read and parse my PGN files I used python-chess.
In this blog I will try to replicate Lichess' accuracy metrics as a pretext to explore the python-chess module.
I decided to write this blog inspired by Ryan Wingate. Maybe what I have learned will also be useful to someone. Another interesting blog that explores chess engine analysis using python-chess can be found here.
Before continuing, a word of caution: although as a statistical researcher I program in Python every day, I am not a developer. This means that while I typically, but not always, manage to get things done, my programming is not elegant, sophisticated or efficient.
My imports for this project are:
import io
import pandas as pd
import numpy as np
import scipy.stats as sps
import matplotlib.pyplot as plt
import chess
import chess.pgn
import chess.engine
Lichess accuracy metrics
White expected score
For a given position, Lichess converts Stockfish centipawns (for those who don't know, a centipawn is 1/100 of a pawn) into what it calls winning chances using the following equation:
Win% = 50 + 50 * (2 / (1 + exp(-0.00368208 * centipawns)) - 1)
Which can be simplified to:
Win% = 100 / (1 + exp(-0.00368208 * centipawns))
More precisely, this equation gives White's expected score over 100 games.
My White's expected score implementation is:
def lichess_white_expected_score(cp, p=-0.00368208):
    # cp: centipawns from White's perspective; returns a value between 0 and 1
    return 1.0 / (1.0 + np.exp(p * cp))
Note that I use 1 instead of 100, so my equation returns a number between 0 and 1, rather than between 0 and 100.
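As a quick sanity check, the following minimal sketch uses only the standard library to confirm that the original and the simplified forms of the equation agree. The function names `win_percent_original` and `win_percent_simplified` are mine, chosen for illustration; they are not Lichess names.

```python
import math

K = 0.00368208  # coefficient quoted in the equations above

def win_percent_original(cp):
    # Lichess' form: 50 + 50 * (2 / (1 + exp(-K * cp)) - 1)
    return 50 + 50 * (2 / (1 + math.exp(-K * cp)) - 1)

def win_percent_simplified(cp):
    # Simplified form: 100 / (1 + exp(-K * cp))
    return 100 / (1 + math.exp(-K * cp))

# Both columns should match for any centipawn value
for cp in (-300, 0, 27, 300):
    print(cp, round(win_percent_original(cp), 2), round(win_percent_simplified(cp), 2))
```

An evaluation of 0 centipawns gives 50%, and the curve is symmetric around that point.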
To evaluate a move, for instance 1.d4, a board with the move 1.d4 already played can be set up using its FEN (Forsyth–Edwards Notation) as follows:
board = chess.Board('rnbqkbnr/pppppppp/8/8/3P4/8/PPP1PPPP/RNBQKBNR b KQkq - 0 1')
To show this board:
display(board)
Alternatively, a board can be created in the starting position and "a move can be made":
board = chess.Board()
move = chess.Move.from_uci("d2d4")
board.push(move)
display(board)
The difference is that this way the move is highlighted on the board.
Getting the UCI (Universal Chess Interface), SAN (Standard Algebraic Notation) and FEN notations for the current move/board can be done using:
move.uci() # gets the move in UCI notation
board.san(move) # gets the move in SAN notation; the move must be legal on the current board, so call this before board.push(move)
board.fen() # gets the current board's FEN
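To make the order of the calls explicit, here is a small sketch for 1.d4. As far as I can tell, `board.san()` interprets the move in the context of the current position, so I compute the SAN before pushing the move.

```python
import chess

board = chess.Board()                     # standard starting position
move = chess.Move.from_uci("d2d4")

san = board.san(move)                     # SAN is computed on the position before the move
board.push(move)                          # now the move is on the board

print(move.uci())    # d2d4
print(san)           # d4
print(board.fen())   # rnbqkbnr/pppppppp/8/8/3P4/8/PPP1PPPP/RNBQKBNR b KQkq - 0 1
print(board.peek())  # last move on the move stack: d2d4
```

The FEN printed here is the same one used earlier to set up the board directly.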
I can now evaluate this position, at a required depth, using my build of the Stockfish 16 engine:
engine = chess.engine.SimpleEngine.popen_uci('stockfish-ubuntu-x86-64-avx2')
info = engine.analyse(board, limit=chess.engine.Limit(depth=36))
Time taken to evaluate 1.d4 at depth of 36: 0 days 00:00:42.480189
Alternatively a time limit for the analysis can be set:
info = engine.analyse(board, limit=chess.engine.Limit(time=2.0)) # runs engine's analysis for 2 seconds
It is also possible to get the best moves:
info = engine.analyse(board, chess.engine.Limit(time=2.0), multipv=3)  # gets the 3 best moves
In this case info will be a list with 3 elements: info[0], info[1] and info[2].
Without multipv, info is a dictionary that contains useful information, for example:
info['depth'] # gets the depth of the analysis
info['score'] # gets the score from the perspective of the side to move
info['score'].white().score() # gets the score in centipawns from White's perspective (None if the score is a mate)
info['score'].white().is_mate() # True if the score is a mate
info['score'].white().mate() # number of moves until mate (zero if mate is on the board)
info['score'].wdl().white() # Win/Draw/Loss distribution out of 1000 games from White's perspective
You can get Black's perspective by replacing .white() with .black().
A couple of examples:
- 1.d4 centipawns, from White's perspective, at depth=36: 27
- 1.d4 Win/Draw/Loss distribution, from White's perspective, at depth=36: Wdl(wins=21, draws=978, losses=1)
Consequently, White's 1.d4 expected scores are:
- Lichess model (given my Stockfish 16 evaluation at depth 36): 1.0/(1.0+exp(-0.00368208 * 27)) = 0.5248
- Lichess master database: 0.33+0.44/2 = 0.55 (see next figure)
- Stockfish 16 model at depth 36: (21+978/2)/1000 = 0.51
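The last two numbers follow the usual scoring rule: a win counts 1, a draw 1/2 and a loss 0. A minimal sketch of that arithmetic, with a helper name (`expected_score`) that I made up for illustration, and assuming that the remaining 23% of master games are Black wins:

```python
def expected_score(wins, draws, losses):
    """Expected score when a win counts 1, a draw 1/2 and a loss 0."""
    total = wins + draws + losses
    return (wins + 0.5 * draws) / total

# Stockfish Win/Draw/Loss distribution for 1.d4 quoted above (out of 1000 games)
print(round(expected_score(21, 978, 1), 2))    # 0.51

# Master database shares quoted above: 33% White wins, 44% draws
# (assumption: the remaining 23% are Black wins)
print(round(expected_score(33, 44, 23), 2))    # 0.55
```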
The previous figure also shows that Lichess' Stockfish 16 evaluates White's advantage after 1.d4 as 0.1 pawns, or 10 centipawns, at a depth of 36.
Therefore:
- Chess engine evaluations and reality don't necessarily agree. People don't play like engines.
- My version of Stockfish 16 does not automatically give the same evaluations as Lichess' Stockfish 16.
- Running my engine twice does not necessarily yield the same result.
Engines don't necessarily agree with each other (or with themselves). It is important to keep this in mind.
The following figure reproduces Lichess' White's expected score as a function of centipawns.
Move accuracy
Lichess calculates move accuracy as follows:
Accuracy% = 103.1668 * exp(-0.04354 * (winPercentBefore - winPercentAfter)) - 3.1669
My implementation is:
def lichess_move_accuracy(score_100_games_diff, par=(103.1668, -0.04354, -3.1669)):
    # score_100_games_diff: winPercentBefore - winPercentAfter, on a 0-100 scale
    accuracy = par[0] * np.exp(par[1] * score_100_games_diff) + par[2]
    return accuracy
As an example, let's imagine that after 1.d4 Black plays 1...a6 (which is the least popular response to 1.d4 in Lichess' master database).
The board is:
The evaluations are:
Move | Centipawns | White score |
---|---|---|
1.d4 | 27 | 0.524834 |
1...a6 | 57 | 0.552278 |
Accuracy of the move 1...a6 = 103.1668 * exp(-0.04354 * (55.23 - 52.48)) - 3.1669 = 88.38%
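The same calculation can be sketched in plain Python. I am assuming here that the win percentages are taken from the point of view of the player who moved, which is how the 1...a6 example works out: Black's win percentage is 100 minus White's. The function names are illustrative only.

```python
import math

def white_win_percent(cp, k=0.00368208):
    # White's expected score on a 0-100 scale
    return 100 / (1 + math.exp(-k * cp))

def move_accuracy(win_before, win_after):
    # Win percentages from the point of view of the player who moved
    return 103.1668 * math.exp(-0.04354 * (win_before - win_after)) - 3.1669

# Black played 1...a6: the evaluation went from 27 to 57 centipawns for White
black_before = 100 - white_win_percent(27)
black_after = 100 - white_win_percent(57)

print(round(black_before - black_after, 2))                 # drop in Black's win percentage
print(round(move_accuracy(black_before, black_after), 2))   # 88.38
```

A move that loses nothing gives 103.1668 - 3.1669, which is 100% up to rounding.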
The following graph reproduces Lichess' accuracy as a function of the score difference over 100 games.
Reading and parsing a PGN file
I use python-chess to read and parse PGN files in three different ways:
- reading multiple games from one file (by the way, reading large PGN files using this module is extremely slow):
pgn_headers = [
'Event',
'Site',
'Date',
'Round',
'White',
'Black',
'Result',
'ResultDecimal',
'WhiteTitle',
'BlackTitle',
'WhiteElo',
'BlackElo',
'ECO',
'Opening',
'Variation',
'WhiteFideId',
'BlackFideId',
'EventDate',
'Annotator',
'PlyCount',
'TimeControl',
'Time',
'Termination',
'Mode',
'FEN',
'SetUp',
'Moves',
]
games = pd.DataFrame(columns=pgn_headers)
game_n = 0
with open(pgn_file_name) as f:
    while True:
        game = chess.pgn.read_game(f)
        # If there are no more games, exit the loop
        if game is None:
            break
        games.loc[game_n] = pd.Series(game.headers)
        game_n = game_n + 1
- reading a PGN file with a single game:
with open(pgn_file_name) as f:
    game = chess.pgn.read_game(f)
- creating a game from a string of moves (io is part of the standard library: import io):
moves = ('1. e4 e5 2. Nf3 Nh6 3. Nxe5 d6 4. Nf3 Qf6 5. d4 Ng4 6. Bg5 Qg6 7. Bd3 Be7 8. Bxe7 Kxe7 '
         '9. e5 f5 10. exf6+ Qxf6 11. O-O Bf5 12. Re1+ Kf8 13. Bxf5 Qxf5 14. Qe2 Nc6 15. h3 Re8 16. Qxe8#')
game = chess.pgn.read_game(io.StringIO(moves))
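Once a game has been read, its moves can be replayed on a board one at a time. The following sketch, using only the first four moves of the game above to keep it short, collects the turn, SAN and UCI of each move. The `rows` list is just an illustrative container, not part of python-chess.

```python
import io
import chess
import chess.pgn

pgn_text = "1. e4 e5 2. Nf3 Nh6 3. Nxe5 d6 4. Nf3 Qf6"   # first moves of the game above

game = chess.pgn.read_game(io.StringIO(pgn_text))
board = game.board()                 # starting position of the game

rows = []
for move in game.mainline_moves():
    turn = "White" if board.turn == chess.WHITE else "Black"
    rows.append((turn, board.san(move), move.uci()))   # SAN is taken before pushing
    board.push(move)                                    # advance to the next position

for row in rows:
    print(row)
```

Inside the loop, the board after `board.push(move)` is the position that would be sent to the engine for evaluation.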
These days I play on Lichess mostly anonymously but I still like to analyse my games. Hence I use this method for my anonymous games.
To copy the moves, on Lichess' analysis board I right click the game's last move, choose "copy variation PGN" and then paste it into my Python code, as you can see above.
Lichess' average accuracy and advantage chart
For this section I will use one of my games as an example:
Seen from White's perspective, logic demands that when White moves, either White keeps its score (100% accuracy) or White's score gets worse (less than 100% accuracy). When it is Black's turn, either White's score is unchanged (100% accuracy by Black) or White's score improves (less than 100% accuracy for Black).
Of course, engines are not perfect. Therefore it can happen that White's score given by the engine improves after White's move or decreases after Black's move. In both cases I set White's score differential to zero, and I call the result "White's adjusted expected score differential".
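The adjustment described above amounts to clipping the differential at zero in the direction that should be impossible for the player who moved. A minimal sketch, with a helper name (`adjusted_diff`) of my own and scores taken from the game table in this section:

```python
def adjusted_diff(turn, score_before, score_after):
    """White's expected score differential, clipped as described above."""
    diff = score_after - score_before
    if turn == "White":
        return min(diff, 0.0)   # White's own move should not raise White's score
    return max(diff, 0.0)       # Black's move should not lower White's score

# (turn, White's score before the move, White's score after the move)
examples = [
    ("White", 53.03, 51.84),   # 2. Nf3: score drops, the differential is kept
    ("Black", 51.84, 65.41),   # 2... Nh6: score rises, the differential is kept
    ("White", 65.41, 66.73),   # 3. Nxe5: score "improves", set to zero
    ("Black", 66.73, 66.40),   # 3... d6: score "drops", set to zero
]
for turn, before, after in examples:
    print(turn, round(adjusted_diff(turn, before, after), 2))
```

The accuracy of a move is then computed from the absolute value of the adjusted differential, so a clipped move gets 100%.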
Finally, what is the centipawn equivalent of, for instance, a mate in 4? This question is important because to calculate accuracy using Lichess' function it is necessary to know the score, and to know the score it is necessary to know the centipawns.
Theoretically the score of a mate in 4 is 1, meaning that whichever color has a mate in 4 will win the game. In practice chess players frequently miss these opportunities (we all have been there, right?), so the score of a mate in 4 is not 1.
Python-chess converts a mate in m moves to centipawns by subtracting m from an arbitrary large number:
chess.engine.Mate(m).score(mate_score=1800) # converts a mate in m into centipawns: mate_score - m
As you can see above, I chose 1800, equivalent to 2 queens, as this arbitrary large number.
Given my choice, a mate in 11, for instance, is converted to 1800-11 = 1789 centipawns.
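A short sketch of this conversion, including the case where the side being mated is the one the score refers to (a negative mate distance) and an ordinary centipawn score, which `mate_score` leaves untouched. The constant name `MATE_SCORE` is mine.

```python
import chess.engine

MATE_SCORE = 1800  # the arbitrary large number chosen above, in centipawns

print(chess.engine.Mate(11).score(mate_score=MATE_SCORE))    # 1789
print(chess.engine.Mate(-11).score(mate_score=MATE_SCORE))   # -1789
print(chess.engine.Cp(27).score(mate_score=MATE_SCORE))      # 27
print(chess.engine.Mate(4).score())                          # None: no mate_score given
```

Passing `mate_score` everywhere means every evaluation, mate or not, ends up as a plain number of centipawns that can be fed to the expected score function.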
The following table shows:
- whose turn it is and the move in SAN and UCI notations,
- White's advantage in centipawns (CP),
- White's expected score, on a 0 to 100 scale (Score),
- White's expected score differential (Diff),
- White's adjusted expected score differential (Adj_diff),
- the move accuracy (Acc) and
- the depth at which the engine calculated White's advantage in centipawns in 2 seconds of analysis (Depth).
Turn | SAN | UCI | CP | Score | Diff | Adj_diff | Acc | Depth |
---|---|---|---|---|---|---|---|---|
White | e4 | e2e4 | 30 | 52.76 | 0.46 | 0 | 100 | 25 |
Black | e5 | e7e5 | 33 | 53.03 | 0.28 | 0.28 | 98.77 | 26 |
White | Nf3 | g1f3 | 20 | 51.84 | -1.19 | -1.19 | 94.77 | 27 |
Black | Nh6 | g8h6 | 173 | 65.41 | 13.57 | 13.57 | 53.98 | 24 |
White | Nxe5 | f3e5 | 189 | 66.73 | 1.32 | 0 | 100 | 25 |
Black | d6 | d7d6 | 185 | 66.4 | -0.33 | 0 | 100 | 25 |
White | Nf3 | e5f3 | 195 | 67.22 | 0.82 | 0 | 100 | 25 |
Black | Qf6 | d8f6 | 291 | 74.49 | 7.27 | 7.27 | 72 | 25 |
White | d4 | d2d4 | 307 | 75.59 | 1.1 | 0 | 100 | 24 |
Black | Ng4 | h6g4 | 385 | 80.5 | 4.9 | 4.9 | 80.16 | 25 |
White | Bg5 | c1g5 | 296 | 74.84 | -5.66 | -5.66 | 77.47 | 23 |
Black | Qg6 | f6g6 | 313 | 76 | 1.16 | 1.16 | 94.92 | 24 |
White | Bd3 | f1d3 | 242 | 70.91 | -5.09 | -5.09 | 79.51 | 24 |
Black | Be7 | f8e7 | 397 | 81.18 | 10.27 | 10.27 | 62.8 | 24 |
White | Bxe7 | g5e7 | 408 | 81.79 | 0.61 | 0 | 100 | 24 |
Black | Kxe7 | e8e7 | 410 | 81.9 | 0.11 | 0.11 | 99.51 | 25 |
White | e5 | e4e5 | 351 | 78.46 | -3.45 | -3.45 | 85.63 | 24 |
Black | f5 | f7f5 | 406 | 81.68 | 3.23 | 3.23 | 86.48 | 25 |
White | exf6+ | e5f6 | 309 | 75.73 | -5.95 | -5.95 | 76.44 | 24 |
Black | Qxf6 | g6f6 | 319 | 76.4 | 0.67 | 0.67 | 97.03 | 25 |
White | O-O | e1g1 | 320 | 76.46 | 0.07 | 0 | 100 | 26 |
Black | Bf5 | c8f5 | 405 | 81.63 | 5.16 | 5.16 | 79.23 | 25 |
White | Re1+ | f1e1 | 386 | 80.55 | -1.07 | -1.07 | 95.29 | 24 |
Black | Kf8 | e7f8 | 516 | 86.99 | 6.43 | 6.43 | 74.79 | 25 |
White | Bxf5 | d3f5 | 379 | 80.15 | -6.84 | -6.84 | 73.42 | 24 |
Black | Qxf5 | f6f5 | 399 | 81.29 | 1.15 | 1.15 | 94.98 | 26 |
White | Qe2 | d1e2 | 381 | 80.26 | -1.03 | -1.03 | 95.48 | 23 |
Black | Nc6 | b8c6 | 371 | 79.67 | -0.59 | 0 | 100 | 25 |
White | h3 | h2h3 | 364 | 79.25 | -0.42 | -0.42 | 98.13 | 24 |
Black | Re8 | a8e8 | 1799 | 99.87 | 20.61 | 20.61 | 38.88 | 245 |
White | Qxe8# | e2e8 | 1800 | 100 | 0.13 | 0 | 100 | 0 |
Average accuracy
To calculate White's and Black's average accuracy, as far as I understand, Lichess uses a harmonic mean, which can be easily calculated using the function hmean in the module scipy.stats.
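As an illustration, the sketch below takes White's per-move accuracies from the Acc column of the table above. To keep it dependency-free I use `statistics.harmonic_mean` from the standard library as a stand-in; `scipy.stats.hmean` should give the same number.

```python
from statistics import harmonic_mean, mean

# White's per-move accuracies (column "Acc" of the table above)
white_acc = [100.0, 94.77, 100.0, 100.0, 100.0, 77.47, 79.51, 100.0,
             85.63, 76.44, 100.0, 95.29, 73.42, 95.48, 98.13, 100.0]

print(round(harmonic_mean(white_acc)))   # 91
print(round(mean(white_acc), 1))         # arithmetic mean, for comparison
```

The harmonic mean is pulled down by poor moves more than the arithmetic mean is, so a single blunder costs more.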
The following table shows the average accuracy for White and Black as calculated by me and by Lichess.
Calculated by | White | Black |
---|---|---|
Me | 91 | 76 |
Lichess | 91 | 77 |
The next table compares my and Lichess' average accuracy for a few random games.
Game | Calculated by | White | Black |
---|---|---|---|
Carlsen vs Esipenko | Me | 97 | 92 |
Carlsen vs Esipenko | Lichess | 96 | 90 |
Firouzja vs Nepo | Me | 88 | 95 |
Firouzja vs Nepo | Lichess | 87 | 95 |
Game ve1epqzT | Me | 81 | 86 |
Game ve1epqzT | Lichess | 83 | 87 |
Game lO13IlTf | Me | 80 | 60 |
Game lO13IlTf | Lichess | 78 | 63 |
Game sLv7lVYF | Me | 81 | 90 |
Game sLv7lVYF | Lichess | 77 | 87 |
As can be seen, the accuracy numbers are not the same but they are close enough. Certainly close enough for my purpose.
Advantage chart
To replicate Lichess' advantage plot, White's expected score needs to be converted from the [0, 1] interval to [-1, 1] by calculating 2*lichess_white_expected_score-1.
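A minimal, standard-library sketch of this rescaling (the function names are illustrative and independent of my numpy implementation above):

```python
import math

def white_expected_score(cp, k=0.00368208):
    # White's expected score on a 0 to 1 scale
    return 1.0 / (1.0 + math.exp(-k * cp))

def advantage(cp):
    """Rescale White's expected score from [0, 1] to [-1, 1]."""
    return 2.0 * white_expected_score(cp) - 1.0

# An equal position maps to 0, a winning position for White approaches +1
for cp in (-1800, -100, 0, 100, 1800):
    print(cp, round(advantage(cp), 3))
```

This is the same quantity that appears inside the parentheses of Lichess' original winning chances equation, 2 / (1 + exp(-0.00368208 * centipawns)) - 1.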
The next figure reproduces Lichess' advantage chart for my game and also shows the actual Lichess graph for comparison.
Final thoughts
Just for fun, it is possible to make two engines play each other:
engine1 = chess.engine.SimpleEngine.popen_uci('your engine 1')
engine2 = chess.engine.SimpleEngine.popen_uci('your engine 2')
board = chess.Board()
while not board.is_game_over():
    # engine1 plays White and engine2 plays Black; checking the game state
    # before every move avoids asking an engine to move in a finished game
    engine = engine1 if board.turn == chess.WHITE else engine2
    result = engine.play(board, chess.engine.Limit(time=1.0))
    board.push(result.move)
    display(board)
# close the engine processes when the game is over
engine1.quit()
engine2.quit()
The moves played can be found in:
board.move_stack
I hope that this blog will help you start using python-chess, if you are interested in this sort of thing, of course.