PGN Utils
PGNutils.Txt - 8 SEP 2006 - by Tom McCormick - mccormit@sbcglobal.net
----- YOU MAY WISH TO PRINT THIS FILE FOR FUTURE REFERENCE --------
Many of the freeware utility programs described below have been developed
under the Windows XP "command window" which emulates MS DOS. This is
reached by clicking on Start, then on Run, then keying in CMD.EXE (for XP) or
COMMAND.COM (for Windows 9x or ME) and pressing the Enter key. Key in EXIT
to return to Windows.
I expect that these programs would run with ANY version of Windows,
but I have not tested them with all other versions.
All of these programs expect an input file to use Carriage-Return
and Linefeed character pairs as line delimiters. PGN from Unix or
Linux systems should be processed first using the crlf.exe utility
to convert the single-character "newline" to CR/LF before processing.
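The line-ending conversion described above can be pictured with a short
Python sketch. This is not the source of crlf.exe; the function name
to_crlf is hypothetical, and the sketch assumes the whole file fits in
memory.

```python
def to_crlf(data: bytes) -> bytes:
    """Return data with every line ending normalized to CR/LF."""
    # Collapse existing CR/LF pairs to LF first so they are not doubled,
    # then expand every LF to a CR/LF pair.
    return data.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")


# A Unix-style line and a Windows-style line both end up as CR/LF.
sample = b'[Event "?"]\n1.e4 e5\r\n'
converted = to_crlf(sample)
```

Running the conversion twice gives the same result as running it once,
so it is safe to apply to files of mixed or unknown origin.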
All of these utilities will accept ANY size input file. However,
when running PGNTRIM5 or PGNTRIM6 to normalize fresh pgn files,
individual PGN GAMES larger than 8,000 characters will be sent
into the .BAD file for review, and will not be in the new output
file. This lets the user review each rejected game and decide
whether to manually edit the problem or discard that game.
Any PGN game containing a [FEN tag will be passed through
to the output file without any normalizing or correcting.
IMPORTANT...
Run PGNTRIM5 or PGNTRIM6 before running any of the other
PGN utilities described at the end of this document.
PGNTRIM5 and PGNTRIM6
---------------------
The comments about PGNTRIM5 apply to PGNTRIM6, a later version.
PGNTRIM6 differs from PGNTRIM5 in only a few ways...notably that
semicolons are permitted within tags rather than being treated,
according to the PGN standard, as signalling the beginning of a
comment that runs through the end of that line/record.
PGNTRIM5.EXE is a freeware Windows utility to correct most PGN syntax
errors, and to direct games which need human review into a separate
output file named BADTRIM5.BAD. That file usually furnishes enough
information to the user so that a decision can be made to correct the
input file and run again, or to accept the number of games rejected
from the new output file. PGNTRIM5 never changes the original input file.
PGNTRIM5.EXE requires no installation routine, nor any .DLL file(s).
It runs from a Microsoft DOS prompt, or from a Microsoft Windows 95,
98, ME, NT, NT2000, or XP command prompt, and probably under VISTA.
PGNTRIM5 does NOT detect illegal/impossible moves; but PGNSCID.EXE is
freeware, and will catch most of these.
I run pgntrim5 first against newly downloaded PGN files in order to
clean up common syntax problems and omissions, and to drop text info
such as titles and crosstables. Then I run the output file into pgnscid
to catch any illegal moves. This approach greatly reduces the amount of
time needed to edit the PGN file for syntax errors before placing it
into a database.
This reduces the amount of manual review and editing necessary to
rescue important games, or discard others from a file to end up
with a much cleaner PGN file for viewing, or for insertion into
a database such as SCID, CHESSBASE, FRITZ, or BOOKUP Lite.
PGNTRIM5 will "repair" correctable PGN syntax errors such as cd4
which is changed to cxd4 and e8Q to e8=Q and f1 Q + which
becomes f1=Q+ etc. You may run the accompanying TEST.PGN file to see
what PGNTRIMn will do, i.e., PGNTRIM6 TEST.PGN TESTOUT.PGN.... if you
do not enter the filenames in the command tail, you will simply be
prompted for them as the program begins.
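The kinds of repair listed above can be sketched in Python with a few
regular expressions. This is only an illustration of the idea, not the
logic PGNTRIM5 actually uses; the function names repair_token and
repair_spaced_promotion are hypothetical.

```python
import re


def repair_token(tok: str) -> str:
    """Repair two common slips in a single move token."""
    # Pawn capture written without "x": cd4 -> cxd4.
    # Only applied when the two files are adjacent, as a capture requires.
    m = re.fullmatch(r"([a-h])([a-h][1-8])(.*)", tok)
    if m and abs(ord(m.group(1)) - ord(m.group(2)[0])) == 1:
        tok = m.group(1) + "x" + m.group(2) + m.group(3)
    # Promotion written without "=": e8Q -> e8=Q, bxa8Q -> bxa8=Q.
    tok = re.sub(r"^((?:[a-h]x)?[a-h][18])([QRBN])", r"\1=\2", tok)
    return tok


def repair_spaced_promotion(text: str) -> str:
    """Close up a promotion typed with stray spaces: 'f1 Q +' -> 'f1=Q+'."""
    return re.sub(r"\b([a-h][18])\s+([QRBN])\s*([+#])", r"\1=\2\3", text)
```

Tokens that are already correct, such as cxd4 or Nf3, pass through
unchanged.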
(P)ortable (G)ame (N)otation format is rather thoroughly defined and
effective as a means to record and distribute recordings of chess game
moves. This standard is available over the internet from several sources.
Recently, people have been submitting "annofritzed" PGN games to internet
websites. These often reach more than 8,000 characters of movestext...and
all too often contain unbalanced alternate move tokens (..)..) for which
nesting IS permitted, or they may contain unbalanced curly brace tokens
{..} delimiting comments. Fritz will produce correct PGN syntax when
autofritzing, but humans seem driven to "improve or clarify" these comments
and they frequently end up with these tokens unbalanced. The PGN standard
forbids nested {..{...}..} curly braces, anyway.
A common error made by players trying to enhance comments or alternate
moves is to use a semicolon ";". The PGN standard requires that ALL
text following it on that input line be treated as a comment and
dropped. If that occurs within {..} or within (...) then the closing
character is dropped with it, leaving the pair unbalanced.
Ahem...a STANDARD is a STANDARD, thank you.
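A balance check of the kind described here can be sketched in Python as
follows. It is an illustration under my own assumptions, not the code
inside PGNTRIM5; the names strip_semicolon_comment and check_balance are
hypothetical.

```python
def strip_semicolon_comment(line: str) -> str:
    """Drop everything from the first ';' to the end of the line."""
    return line.split(";", 1)[0]


def check_balance(movetext: str) -> list:
    """Return a list of balance problems found in the movetext."""
    problems = []
    depth = 0           # nesting level of ( ... ) alternate moves
    in_comment = False  # True while inside a { ... } comment
    for ch in movetext:
        if in_comment:
            if ch == "{":
                problems.append("nested brace comment")
            elif ch == "}":
                in_comment = False
        elif ch == "{":
            in_comment = True
        elif ch == "}":
            problems.append("unmatched }")
        elif ch == "(":
            depth += 1
        elif ch == ")":
            if depth == 0:
                problems.append("unmatched )")
            else:
                depth -= 1
    if in_comment:
        problems.append("unclosed {")
    if depth:
        problems.append("unclosed (")
    return problems
```

For example, the line `12.Qd3 {good; really good} Nf6` loses its closing
brace once the semicolon rule is applied, and check_balance then reports
an unclosed comment.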
Recently, PGN games have been appearing on the internet which are
Fischer-Random games. If there is no FEN statement, or other indication
of this, then "illegal moves" such as 1.Nb3 will pass through syntax
checking, but will appear to be an illegal move to database programs!
Some standard indication of Fischer-Random games is being debated, and
needs to be added to the PGN standard. Until that is final, PGNTRIM5
or 6 will not recognize a Fischer-Random tag such as
[Variant "Fischerandom"].
Chess magazines and books are not immune from typographical errors and
omissions such as leaving out moves entirely, leaving pieces off the
diagrams, having two black Kings, no White King, displaying entirely
the wrong diagram, etc.
Persons collecting PGN chess game records do not want to end up with such
problems that show up while a game is being studied! Normalization
programs can detect most PGN problems, fix many, and tell the user about
the others so that they can be manually edited, or the game discarded.
PGNTRIM5 directs erroneous games into BADTRIM5.BAD where they can be
reviewed and edited separately from the clean output file. If you were
to edit and delete some games from BADTRIM5.BAD...leaving only games
which you have `fixed`, then you could simply append the cleaned
BADTRIM5.BAD file to the clean output file, for example:
copy newfile.pgn+badtrim5.bad
(NOTE: no spaces in the file list! You may also prefer to drop the
[Warning tag from the corrected game.)
A PGN game recording example follows. Heading records are called "tags",
and seven of them are required as a minimum....the first 7 shown below
are required in any PGN game. Other tags are optional such as the
"Opening" and "ECO" tags shown. All tags must conform to standard in
order to be useful to a wide audience...Each tag must begin with [ and
end with ], and the tag name must begin with one uppercase letter, the
text must be enclosed within quotation marks, etc. It is somewhat
surprising just how many PGN games have simple syntax errors in the
tag records!
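The tag rules just described can be checked mechanically. Here is a
minimal Python sketch, assuming one tag per line; the pattern and the
name parse_tags are my own illustration and are not taken from PGNTRIM5.

```python
import re

# The seven tags every PGN game must carry.
REQUIRED_TAGS = ["Event", "Site", "Date", "Round", "White", "Black", "Result"]

# [Name "text"] where Name begins with an uppercase letter.
TAG_LINE = re.compile(r'^\[([A-Z][A-Za-z0-9_]*) "(.*)"\]$')


def parse_tags(lines):
    """Split tag lines into well-formed tags, bad lines, and missing names."""
    tags = {}
    bad = []
    for line in lines:
        m = TAG_LINE.match(line.strip())
        if m:
            tags[m.group(1)] = m.group(2)
        else:
            bad.append(line)
    missing = [name for name in REQUIRED_TAGS if name not in tags]
    return tags, bad, missing
```

A line such as `[site "Moscow"]` (lowercase name) or `[White "Blaganov"`
(no closing bracket) lands in the bad list, and the corresponding
required tag is then reported as missing.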
NOTE: PGNTRIM5 can be forced to retain all tags into the output file by
adding /alltags to the command tail, otherwise only the following
tags are preserved:
[Event [Site [Date [Round [White [Black [Result
[ECO [Opening [WhiteElo [BlackElo and [Comment
Stripping the [Annotator, [PlyCount, [Clock, etc. etc. saves
considerable file space, but if you MUST have them all then
always include /alltags in the command line for example
pgntrim5 2006WCC.PGN 2006WCC.TRM /alltags
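The effect of /alltags can be pictured with a small Python sketch. The
list of preserved tags comes from the note above; the function name
filter_tags and its all_tags parameter are hypothetical and only mimic
the switch.

```python
# Tags preserved by default, per the list above.
KEPT_TAGS = {
    "Event", "Site", "Date", "Round", "White", "Black", "Result",
    "ECO", "Opening", "WhiteElo", "BlackElo", "Comment",
}


def filter_tags(tag_lines, all_tags=False):
    """Keep only the preserved tags unless all_tags is requested."""
    if all_tags:
        return list(tag_lines)
    kept = []
    for line in tag_lines:
        # The tag name is the text between "[" and the first space.
        name = line.lstrip("[").split(" ", 1)[0]
        if name in KEPT_TAGS:
            kept.append(line)
    return kept
```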
By default, there will be exactly four complete moves per line in
the output file unless you specify between one and seven moves.
One move per line is useful in teaching situations where you want
the students to comment on each move (in writing). Four moves per
line permits printing with a decent sized font without overflows.
These are specified in the command tail, i.e.,
PGNTRIM6 OLDFILE.PGN NEWFILE.PGN /MPL:1 etc.
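The moves-per-line layout can be sketched in Python as below. The
function name format_moves is hypothetical, and placing the result code
at the end of the last line is an assumption based on the sample output
in this file.

```python
def format_moves(halfmoves, result, per_line=4):
    """Lay out halfmoves as numbered full moves, per_line moves per line."""
    if not 1 <= per_line <= 7:
        raise ValueError("moves per line must be between 1 and 7")
    # Pair up halfmoves and number them: ["e4", "e5"] -> "1.e4 e5".
    full = []
    for i in range(0, len(halfmoves), 2):
        pair = " ".join(halfmoves[i:i + 2])
        full.append("%d.%s" % (i // 2 + 1, pair))
    # Group the numbered moves into lines.
    lines = [" ".join(full[i:i + per_line])
             for i in range(0, len(full), per_line)]
    # Assumption: the result code follows the final move on the last line.
    if lines:
        lines[-1] += " " + result
    else:
        lines = [result]
    return lines
```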
Normalization programs detect deviations from standard, and either fix
the problem, notify the user, or both. Missing tags, illegal moves or
incomplete moves such as B7, a8, or Rx cannot be fixed and are
simply reported to the user for editing or discarding the game.
Other problems such as spacing errors can usually be fixed by a
normalization program, so Nxg4Nbd7 (no space between White and Black
halfmoves) can be fixed to Nxg4 Nbd7, and O-O5. can be fixed to
O-O 5. Castling must use the letter O, not zeroes; a normalization
program can easily substitute the correct character, as PGNTRIMn does.
Missing half-moves or entire moves can be detected and reported, as
can a result code which does not match the [Result tag.
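The spacing and castling fixes mentioned above lend themselves to a
short Python sketch. The patterns are my own and cover only the cases
named in the text; fix_spacing is a hypothetical name, and PGNTRIMn may
well go about it differently.

```python
import re


def fix_spacing(text: str) -> str:
    """Apply a few mechanical fixes to movetext."""
    # Castling typed with zeroes becomes letter O (long castling first).
    # The lookbehind keeps result codes such as 1-0 and 0-1 untouched.
    text = re.sub(r"(?<![\d-])0-0-0", "O-O-O", text)
    text = re.sub(r"(?<![\d-])0-0", "O-O", text)
    # A move number glued to castling: O-O5. -> O-O 5.
    text = re.sub(r"(O-O(?:-O)?)(?=\d+\.)", r"\1 ", text)
    # Two halfmoves glued together: Nxg4Nbd7 -> Nxg4 Nbd7.
    # A piece letter followed by a file, rank, or "x" starts a new move.
    text = re.sub(r"([a-h][1-8][+#]?)(?=[KQRBN][a-h1-8x])", r"\1 ", text)
    return text
```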
[Event "Example PGN Chess Game Record"]
[Site "Moscow"]
[Date "2003.12.25"]
[Round "2"]
[White "Blaganov"]
[Black "Dufus"]
[Result "1-0"]
[Opening "Scandinavian"]
[ECO "B01"]
1.e4 d5 2.exd5 Qxd5 3.Nc3 Qd8 4.d4 Nf6
{B01 Scandinavian}
5.Bc4 c6 6.Nf3 Bg4 7.Bxf7 Kxf7 8.Ne5 Kg8
9.Nxg4 Nbd7 10.Qe2 Nxg4 11.Qe6# 1-0
EXAMPLE GAMES NORMALIZED USING PGNTRIM5
---------------------------------------
Here is an example "BEFORE" and "AFTER" using PGNTRIM5.
This very old game was annotated by the computer program
Fritz 6...a process called annofritzing. There are many
comments within curly braces {...}, NAG comments... $17,
move continuations following an alternate move sequence,
and there are many nested alternate moves i.e., ( ( ( ) ) )
[Event "New Orleans"]
[Site "New Orleans"]
[Date "1849.??.??"]
[Round "?"]
[White "Morphy, Paul "]
[Black "J. MacConnell sr"]
[Result "1-0"]
[Annotator "Fritz 6 (6s)"]
[PlyCount "57"]
[EventDate "1849.??.??"]
1. e4 {C39: King's Gambit Accepted: 3 Nf3 g5 4 h4} e5 2. f4 exf4 3. Nf3 g5 4.
h4 g4 5. Ne5 h5 6. Bc4 Rh7 7. d4 d6 8. Nd3 f3 9. g3 (9. gxf3 Be7 10. Be3 Bxh4+
11. Kd2 Bg5 12. f4 Bf6 13. a3 c6 14. Nc3 Bh8 15. f5 Ne7 16. Qe2 Kf8 17. f6 Bxf6
18. Raf1 d5 19. Rxf6 dxc4 20. Ne5 Nd7 21. Nxd7+ Bxd7 22. Rh6 Rg7 23. R6xh5 Ng8
{Pektor,A-Zvara,P/Prague 1992/0-1 (48)}) 9... Nc6 10. Nf4 $146 (10. c3 Nge7 (
10... Nce7 11. Kf2 c6 12. Nf4 Qc7 13. Qb3 b5 14. Bd3 Rh8 15. Re1 Ng6 16. Nxg6
fxg6 17. e5 Ne7 18. Bxg6+ Kd8 19. Qf7 Nxg6 20. Qxg6 Qg7 21. Bg5+ Kc7 22. exd6+
Kb6 23. Bd8+ Ka6 24. Qxg7 Bxg7 25. Bc7 {
Abbe de Lionne & Morant-Maubisson & Auzout/Paris 1680/1-0 (40)}) 11. Nf4 a6 12.
a4 Bg7 13. Qb3 Bh8 14. Nxh5 Kf8 15. Nf4 Na5 16. Qa2 Nxc4 17. Qxc4 c6 18. Nd2 d5
19. exd5 cxd5 20. Qb4 Bf6 21. Nf1 Kg7 22. h5 Nc6 23. Qc5 Be6 24. Qa3 Qd7 {
Jannisson & Maubisson-Lionne & Morant/Paris 1680 (36)}) (10. Bb5 d5 11. Ne5
Bd7 12. Nxd7 Qxd7 $17 (12... Kxd7 $2 13. exd5 Bd6 14. Kf2 $18 (14. dxc6+ $6
bxc6 15. Ba4 Bxg3+ 16. Kf1 Rb8 $16 (16... Bxh4 $4 {
taking the pawn will bring Black grief} 17. Qd3 $18)))) 10... Bd7 (10... Nf6
11. Nc3 $17) 11. Nc3 Nf6 (11... Bg7 12. Be3 $17) 12. Be3 Ne7 (12... Bh6 13. Rf1
$17) 13. Kf2 c6 (13... Bh6 14. e5 dxe5 15. dxe5 $17) 14. Re1 Bg7 15. e5 dxe5
16. dxe5 Nfd5 (16... Nfg8 17. Ne4 Bxe5 18. Ng5 Bxf4 19. Nxh7 Bxe3+ 20. Rxe3 $11
) 17. Bxd5 (17. Nfxd5 Nxd5 18. Nxd5 cxd5 19. Qxd5 Bh8 $14) 17... cxd5 (17...
Nxd5 18. Ncxd5 cxd5 19. Nxd5 Be6 $14 (19... Bxe5 {
Black again will not be able to digest the pawn} 20. Bg5 f6 21. Nxf6+ Kf7 22.
Rxe5 (22. Qxd7+ $6 {is not possible} Qxd7 23. Nxd7 Bd4+ 24. Kf1 Kg6 $18) 22...
Qb6+ 23. Re3 $18)) 18. Bc5 (18. Ncxd5 Nxd5 (18... Bxe5 $2 {
is nothing because of} 19. Bb6 Qb8 20. Bd4 $18) 19. Nxd5 Be6 $14 (19... Bxe5 {
as before the pawn must remain untouched} 20. Bg5 f6 21. Nxf6+ Kf7 22. Rxe5
Qb6+ 23. Re3 $18)) 18... Bc6 (18... Rc8 19. Bxa7 Qa5 20. Bd4 $15) 19. b4 (19.
Qd3 Rh6 $11) 19... b6 (19... d4 $142 20. Qd3 Rh6 $15 (20... dxc3 21. Qxh7 Kf8
22. Rad1 $18 (22. Qxh5 $6 {is the less attractive alternative} Qd2+ 23. Kg1 Kg8
$18) (22. Nxh5 $4 {the pawn is indigestible} Qd2+ 23. Re2 Qxe2+ 24. Kg1 Qg2#)))
20. Bxe7 $14 Qxe7 {The isolani on e5 becomes a target} 21. Nfxd5 Qb7 $4 (21...
Bxd5 $142 {is just about the only chance} 22. Nxd5 Qd8 23. Nf6+ Bxf6 24. exf6+
Kf8 25. Qxd8+ Rxd8 $16) 22. Nf6+ $18 Bxf6 23. exf6+ Kf8 24. Qd6+ Kg8 25. Re7
Qc8 26. Rc7 Qf5 27. Qxc6 {Threatening mate: Qxa8} Qxc2+ (27... Rf8 {
does not save the day} 28. Nd5 Qe5 $18) 28. Ke3 Rd8 (28... Rf8 29. Rxa7 Qb2 30.
Ra8 Qxc3+ 31. Qxc3 Rxa8 32. Qc7 $18) 29. Rd1 $1 {
the end of the story. Threatening mate... how?} (29. Rd1 Rf8 30. Rxa7 $18) 1-0
...after processing the above file through PGNTRIMn, it appears as
[Event "New Orleans"]
[Site "New Orleans"]
[Date "1849.??.??"]
[Round "?"]
[White "Morphy, Paul "]
[Black "J. MacConnell sr"]
[Result "1-0"]
[Annotator "Fritz 6 6s "]
[PlyCount "57"]
[EventDate "1849.??.??"]
1.e4 e5 2.f4 exf4 3.Nf3 g5 4.h4 g4
5.Ne5 h5 6.Bc4 Rh7 7.d4 d6 8.Nd3 f3
9.g3 Nc6 10.Nf4 Bd7 11.Nc3 Nf6 12.Be3 Ne7
13.Kf2 c6 14.Re1 Bg7 15.e5 dxe5 16.dxe5 Nfd5
17.Bxd5 cxd5 18.Bc5 Bc6 19.b4 b6 20.Bxe7 Qxe7
21.Nfxd5 Qb7 22.Nf6+ Bxf6 23.exf6+ Kf8 24.Qd6+ Kg8
25.Re7 Qc8 26.Rc7 Qf5 27.Qxc6 Qxc2+ 28.Ke3 Rd8
29.Rd1 1-0
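The reduction shown above, from annotated movetext down to the bare main
line, can be sketched in Python. This illustrates the idea only and is
not the PGNTRIM5 algorithm; strip_annotations is a hypothetical name,
and the sketch assumes braces and parentheses are already balanced.

```python
import re


def strip_annotations(movetext: str) -> str:
    """Remove comments, alternate moves, and NAGs, keeping the main line."""
    out = []
    depth = 0           # nesting level of ( ... ) alternate moves
    in_comment = False  # True while inside a { ... } comment
    for ch in movetext:
        if in_comment:
            if ch == "}":
                in_comment = False
        elif ch == "{":
            in_comment = True
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth = max(0, depth - 1)
        elif depth == 0:
            out.append(ch)
    text = "".join(out)
    text = re.sub(r"\$\d+", " ", text)         # NAGs such as $17
    text = re.sub(r"\d+\.\.\.", " ", text)     # continuation marks such as 9...
    text = re.sub(r"(\d+\.)\s+", r"\1", text)  # "1. e4" -> "1.e4"
    return " ".join(text.split())
```

Parentheses that appear inside a brace comment, as in `{... (48)}`, are
ignored because the comment is skipped as a whole.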
---------------------------------------------------------
Here is an example "BEFORE" and "AFTER" using PGNTRIM5.
This game was annotated by the computer program Fritz8.
There are many {[%emt 0:00:00]} elapsed-time remarks
which unfortunately use square braces within the movestext!!
Although these are also within curly brace pairs, using
[..] square braces within the moves text area is a violation
of common PGN good practice, if not the standard, itself.
PGNTRIM5 will remove these as shown in the example below.
[Event "Fritz8 commentary removal test file"]
[Site "Howie in the Hills, Florida"]
[Date "2004.05.28"]
[Round "?"]
[White "Fritz 8"]
[Black "McGillicuddy, Sean"]
[Result "1-0"]
[ECO "B06"]
[PlyCount "75"]
[Comment "Unfortunately, Fritz 8 also uses funky comment spacing"
{286MB, Fritz8.ctg, Intel 2.5 WinXP
} 1. Nf3 {[%emt 0:00:00]} g6 {
[%emt 0:00:00]} 2. e4 {[%emt 0:00:00]} Bg7 {[%emt 0:00:03]} 3. d4 {
[%emt 0:00:00]} d6 {[%emt 0:00:04]} 4. Nc3 {[%emt 0:00:00]} Nc6 {[%emt 0:00:12]
} 5. Bb5 {[%emt 0:00:01]} Bd7 {[%emt 0:00:02]} 6. O-O {[%emt 0:00:02]} a6 {
[%emt 0:00:05]} 7. Be2 {[%emt 0:00:01]} Bg4 {[%emt 0:00:17]} 8. Be3 {
[%emt 0:00:01]} Nf6 {[%emt 0:00:10]} 9. h3 {[%emt 0:00:02]} Bd7 {[%emt 0:00:04]
} 10. Qc1 {[%emt 0:00:01]} O-O {[%emt 0:00:25]} 11. Qb1 {[%emt 0:00:02]} e5 {
[%emt 0:00:23]} 12. dxe5 {[%emt 0:00:02]} dxe5 {[%emt 0:00:14]} 13. Kh1 {
[%emt 0:00:01]} Re8 {[%emt 0:00:14]} 14. a3 {[%emt 0:00:01]} b5 {[%emt 0:00:20]
} 15. Bc5 {[%emt 0:00:02]} Be6 {[%emt 0:00:12]} 16. Qc1 {[%emt 0:00:02]} Qc8 {
[%emt 0:00:10]} 17. Qd2 {[%emt 0:00:02]} Bxh3 {[%emt 0:00:19]} 18. gxh3 {
[%emt 0:00:05]} Qxh3+ {[%emt 0:00:02]} 19. Nh2 {[%emt 0:00:00]} Nd4 {
[%emt 0:00:14]} 20. Rfd1 {[%emt 0:00:04]} Rad8 {[%emt 0:00:09]} 21. Qd3 {
[%emt 0:00:03]} Qc8 {[%emt 0:00:37]} 22. b4 {[%emt 0:00:03]} h5 {[%emt 0:00:10]
} 23. Rac1 {[%emt 0:00:03]} Bh6 {[%emt 0:00:07]} 24. Rb1 {[%emt 0:00:04]} Bf4 {
[%emt 0:00:14]} 25. Bf1 {[%emt 0:00:02]} Kg7 {[%emt 0:00:10]} 26. a4 {
[%emt 0:00:05]} c6 {[%emt 0:00:06]} 27. Bg2 {[%emt 0:00:02]} Rh8 {
[%emt 0:00:31]} 28. Nf3 {[%emt 0:00:07]} h4 {[%emt 0:00:08]} 29. Ne2 {
[%emt 0:00:05]} h3 {[%emt 0:00:04]} 30. Nfxd4 {[%emt 0:00:02]} exd4 {
[%emt 0:00:09]} 31. Bf3 {[%emt 0:00:04]} Ng4 {[%emt 0:00:07]} 32. Bxg4 {
[%emt 0:00:02]} Qxg4 {[%emt 0:00:09]} 33. Rg1 {[%emt 0:00:04]} Qh4 {
[%emt 0:00:38]} 34. Bxd4+ {[%emt 0:00:09]} Kg8 {[%emt 0:00:19]} 35. Rbf1 {
[%emt 0:00:06]} Rh6 {[%emt 0:00:24]} 36. Ng3 {[%emt 0:00:03]} h2 {
[%emt 0:00:56]} 37. Rg2 {[%emt 0:00:02]} Be5 {[%emt 0:00:17]} 38. Nf5 {
[%emt 0:00:02]} 1-0
...after processing the above file through PGNTRIMn, it appears as
[Event "Fritz8 commentary removal test file"]
[Site "Howie in the Hills, Florida"]
[Date "2004.05.28"]
[Round "?"]
[White "Fritz 8"]
[Black "McGillicuddy, Sean"]
[Result "1-0"]
[ECO "B06"]
[PlyCount "75"]
[Comment "Unfortunately, Fritz 8 also uses funky comment spacing"
1.Nf3 g6 2.e4 Bg7 3.d4 d6 4.Nc3 Nc6
5.Bb5 Bd7 6.O-O a6 7.Be2 Bg4 8.Be3 Nf6
9.h3 Bd7 10.Qc1 O-O 11.Qb1 e5 12.dxe5 dxe5
13.Kh1 Re8 14.a3 b5 15.Bc5 Be6 16.Qc1 Qc8
17.Qd2 Bxh3 18.gxh3 Qxh3+ 19.Nh2 Nd4 20.Rfd1 Rad8
21.Qd3 Qc8 22.b4 h5 23.Rac1 Bh6 24.Rb1 Bf4
25.Bf1 Kg7 26.a4 c6 27.Bg2 Rh8 28.Nf3 h4
29.Ne2 h3 30.Nfxd4 exd4 31.Bf3 Ng4 32.Bxg4 Qxg4
33.Rg1 Qh4 34.Bxd4+ Kg8 35.Rbf1 Rh6 36.Ng3 h2
37.Rg2 Be5 38.Nf5 1-0
OTHER FREEWARE PGN UTILITIES
----------------------------
PGN2ONE
-------
PGN2ONE reads normalized PGN and creates one-record-per-game
and prepends a 40 character sort key which can be used to sort
by White, Black, ECO, Number of moves in game, Year of game, etc.
Some batch files have been included in PGNutils.zip to use qsort
and perform each of the above sorts.
I refer to this output format as .111 format indicating one
line/record per game. The prepended sort/selection record area
provides exactly consistent locations for important data needed
to sort and select games. This prepended area is removed when the
.111 format is converted back to PGN by either ONE2PGN or PGNUNDUP.
The first 7 letters of player names is optimum because it reduces
misspellings. Before you challenge this approach, look up the
word OPTIMUM. It would also be rather awkward to obtain fixed
positions for full names such as Leko, Nimzowitsch, etc.
Here are some examples of one-record-per-game created by PGN2ONE:
...you can see how easy it is to select or sort by critical elements.
1 5 10 15 20 25 30 35 40 ...see BYYEAR.BAT etc. examples.
White Black Year Mvs Re Site ECO
{ Adams Kasparo 1992 022 0-1 Dor D31} [Event "?"][Site "Dortmund"]...
{ Anand Kasparo 1998 024 1/2 Lin B55} [Event "It "][Site "Linares "]...
{ Bareev Kasparo 1999 021 1/2 Sar D80} [Event "It "][Site "Sarajevo "]...
{ Beliavs Kasparo 1979 035 1-0 Min A61} [Event "?"][Site "Minsk"]...
{ Karpov Kasparo 1996 045 1/2 Las D20} [Event "It "][Site "Las Palmas "]...
{ Kasparo Anand 1999 033 1-0 Wij A45} [Event "Blitz "][Site "Wijk aan Zee "]
{ Kasparo Huebner 1992 048 0-1 Col C23} [Event "?"][Site "Cologne"]...
{ Kasparo Ivanchu 1999 036 1-0 Lin D11} [Event "It "][Site "Linares "]...
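To make the idea of a fixed-position sort key concrete, here is a Python
sketch that builds a key resembling the ones shown above. The field
widths are assumptions (the exact column positions PGN2ONE uses are not
spelled out here), and sort_key is a hypothetical name.

```python
def sort_key(white, black, date, n_moves, result, site, eco):
    """Build a fixed-width key: names, year, move count, result, site, ECO."""

    def name7(name):
        # Surname before any comma, first seven characters, space padded.
        return name.split(",")[0].strip()[:7].ljust(7)

    year = date[:4]              # "1992.??.??" -> "1992"
    res = result[:3].ljust(3)    # "1/2-1/2" -> "1/2"
    return "{ %s %s %s %03d %s %s %s}" % (
        name7(white), name7(black), year, n_moves,
        res, site[:3].ljust(3), eco[:3].ljust(3),
    )
```

Because every field is padded to the same width in every record, a plain
text sort on a column range (for instance the year columns) orders the
games without parsing the PGN tags again.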
Creating one record per game in this way also facilitates the use
of the FIND command to select or reject games containing certain
text strings. FIND comes with all versions of DOS or Windows.
For example:
find "O-O-O" MyBig.111 >CastLong.111
{the > redirects output to a new file instead of the display.}
or
find /V "O-O" MyBig.111 >NoCastl.111
{selects only games in which neither player castles.}
{the "/V" command parameter OMITS matched records.}
or
find "C02" MyBig.111 >FrAdvan.111
{this outputs games of French Defense, Advance var.}
or
find "2004" MyBig.111 >2004Only.111
{this outputs only games played during year 2004.}
or
find /V "1/2" MyBig.111 >NoDraws.111
{drops drawn games of any length.}
Two or more `find` executions can be used to refine selections further:
EXAMPLE 1
---------
find "Leko" MyBig.111 >Leko2.111
{this outputs games played by Leko as White or Black.}
then
find "Kramnik" Leko2.111 >LekoKram.111
{this outputs games played between Leko & Kramnik.}
EXAMPLE 2
---------
find "0-1" MyBig.111 >BlackWin.111
{keeps only games won by Black; drops draws and incomplete games.}
find "1-0" MyBig.111 >WhiteWin.111
{keeps only games won by White; drops draws and incomplete games.}
copy BlackWin.111+WhiteWin.111 WinsOnly.111
find " 26." /V WinsOnly.111 >Miniat25.111
{outputs games shorter than 26 moves.}
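For readers who prefer a script to the find command, the same kind of
selection can be sketched in Python. The function name select and its
parameters are hypothetical. Like find, it matches the text anywhere in
the record unless told to look only at the 40-character key.

```python
def select(records, text, invert=False, key_only=False, key_len=40):
    """Return records that contain text (or that do not, when invert is set)."""
    chosen = []
    for rec in records:
        # Optionally restrict the search to the prepended sort key.
        haystack = rec[:key_len] if key_only else rec
        found = text in haystack
        if found != invert:
            chosen.append(rec)
    return chosen
```

Restricting the search to the key avoids surprises such as "2004"
matching inside an event name of a game played in another year. The
match is case-sensitive, which is also the default behaviour of find.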
ONE2PGN
-------
ONE2PGN reads the one-record-per-game file (any sequence) created
using PGN2ONE, and outputs a new PGN file. If the one-record-per-game
file has been sorted on positions 1 to 40 with the intention of
dropping duplicate games, then PGNUNDUP should be used instead of
ONE2PGN in order to drop duplicate games. ONE2PGN will NEVER drop
a game, even if it is a duplicate game...and therefore the input
file sequence to ONE2PGN is of no concern...you MAY or MAY NOT
sort it into any sequence as you wish.
or......
PGNUNDUP reads the sorted output from PGN2ONE, and creates normal
PGN from the incoming 1-record-per-game. The input file is expected
to be in ascending sequence on positions 1 to 40 so that duplicate
games can be detected and dropped. The input file MUST be in that
sequence to use this program, else it will tell you that the input
file is "out of sequence".
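Dropping duplicates from a sorted file needs only a comparison of each
record with the one before it. A Python sketch follows. It assumes that
two games count as duplicates when their first 40 characters are
identical, which is my reading of the description and not a statement of
how PGNUNDUP is coded; drop_duplicates is a hypothetical name.

```python
def drop_duplicates(records, key_len=40):
    """Yield records from a sorted sequence, skipping repeated keys."""
    previous = None
    for rec in records:
        key = rec[:key_len]
        # The input must be in ascending sequence on the key.
        if previous is not None and key < previous:
            raise ValueError("input file is out of sequence")
        if key != previous:
            yield rec
        previous = key
```

The sequence check is what makes sorting mandatory: an unsorted file
could hide duplicates far apart from each other, so it is rejected
rather than processed quietly.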
PGNBEST6
--------
PGNbest6 reads normal PGN, looks in a plain text table (provided)
for the 3,000 or so greatest player names of all time, and outputs
games if EITHER player is on that list. Kasparov vs Amateur will be
written to the new output file, but NoName vs. Amateur will not.
The PGNbest6.RAT plain text ratings file (user modifiable) should
be in the same folder as the .exe file. This file may be in ANY
sequence, but I find alphabetical by name easier to update!
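The selection rule can be sketched in Python as follows. The layout of
the .RAT file is not described here, so the sketch simply assumes each
line begins with a player name; load_names, is_listed, and keep_game are
hypothetical names, and matching on the first seven characters follows
the convention used elsewhere in these utilities.

```python
def load_names(lines):
    """Collect the first seven characters of each listed name, lowercased."""
    # Assumption: each non-blank line starts with the player's surname.
    return {line.strip()[:7].lower() for line in lines if line.strip()}


def is_listed(player, names):
    """True when the player's surname prefix is on the list."""
    surname = player.split(",")[0].strip()
    return surname[:7].lower() in names


def keep_game(white, black, names):
    """Keep the game if EITHER player is on the list."""
    return is_listed(white, names) or is_listed(black, names)
```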
TIPS ON USING THESE UTILITIES
-----------------------------
Always run PGNTRIM5 first to normalize the PGN syntax.
Example: pgntrim5 05Linar.pgn 05Linar2.pgn
Example: pgntrim5 05Linar.pgn 05Linar2.pgn /MPL:5
NOTE: /MPL:n where n is 1 to 7 sets moves per line in output.
By using PGN2ONE.exe, you create one line (record) per game, and
prepend sort fields (columns) to it. You make sorting or selecting
much simpler since several key sort items are in fixed positions!
Utility programs are provided to convert PGN to one-line-per-game
and back again after sorting or selecting has been done. For example,
here are a few records illustrating this format:
White Black Year Moves Result Site ECO
{ Kramnik Kasparo 2003 018 1/2 Lin D11} [Event "XX SuperGM"][Site "Lin...
{ Radjabo Leko 2003 046 0-1 Lin E12} [Event "XX SuperGM"][Site "Lin...
{ Anand Ponomar 2003 064 1-0 Lin C65} [Event "XX SuperGM"][Site "Lin...
{ Vallejo Anand 2003 030 1/2 Lin A30} [Event "XX SuperGM"][Site "Lin...
{ Kasparo Radjabo 2003 039 0-1 Lin C11} [Event "XX SuperGM"][Site "Lin...
{ Ponomar Kramnik 2003 040 0-1 Lin B30} [Event "XX SuperGM"][Site "Lin...
{ Radjabo Ponomar 2003 011 1/2 Lin D30} [Event "XX SuperGM"][Site "Lin...
{ Kramnik Vallejo 2003 030 1/2 Lin D15} [Event "XX SuperGM"][Site "Lin...
{ Leko Kasparo 2003 087 1/2 Lin B55} [Event "XX SuperGM"][Site "Lin...
{ Bacrot Adams 2003 045 1/2 Rey A45} [Event "Hrokurinn"][Site "Reyk...
The data between the {...} are sort keys. You may use ByECO.BAT to sort
such a file by ECO opening code as shown above. You may use ByYear.bat,
etc. for other sequences. The command-line "find" utility which comes with
DOS and Windows may be used against this format very handily since
selections will be entire games...ready for ONE2PGN to restore to PGN.
Example: pgn2one 03twic.pgn 03twic.111
find "Karpov" 03TWIC.111 > 03Karpov.111
one2pgn 03Karpov.111 03Karpov.pgn
To combine many PGN files, the copy command will suffice.
Example: copy 02linar.pgn+03linar.pgn+04linar.pgn 0204lin.pgn
{NOTE No spaces in the multiple file names} {New output}
Example: copy *.pgn Feb25.pgn
---------------------------------------------------------------------
Additional (and somewhat repetitive) comments about these utilities.
PGN files from Unix or Linux based computers use a single newline
character to terminate each line. Windows PCs require a pair of
characters for this, and the utility program named crlf.exe will
`fix` PGN files from Unix/Linux sources to work properly on Windows
PCs. PGNTRIM5.exe should then be used to normalize the combined PGN
file as follows: pgntrim5 RawPGN.pgn Normal.pgn
When combining several PGN files into one PGN file
(i.e., COPY 2006.PGN+06*.PGN) you may run across some files that
originated on a Unix/Linux computer and therefore have only a linefeed
separator for lines (called a newline, or \n). For Windows PCs, you
need to run the combined file through CRLF.EXE to ensure that every
line is terminated by a CR/LF character pair as Windows expects.
PGNTRIM5, PGN2ECO3, PGN2ONE, ONE2PGN, PGNUNDUP, and PGNBEST6 all
require a new output filename, and DO NOT change the original file.
QSORT and CRLF require only one filename, and that file is changed
in place...so make a backup file first if you are worried.
Games rejected by PGNTRIM5 will appear in a file named
BADTRIM5.BAD where they can be reviewed, and then edited or
discarded. For example, games having zero or one move are rejected,
as are games with nested curly-brace comments such as
{23.Qb6 was better {then if 23...Ka8....}} etc., which are contrary to
the PGN standard. Nested alternate moves within parentheses are
proper, and are handled by pgntrim5 unless they are 'unbalanced'
...in that case they are also sent to the .BAD file. The .BAD file
continues to grow and grow until you delete it, then a new one will
be created when needed.
PGNTRIM5 fixes many common syntax errors and omissions so that the
output file conforms very closely to the `export format` for PGN.
Illegal or impossible moves are detected later when the normalized
PGN files are imported into a database (as with pgnscid, for example).
PGN2ECO3 may optionally be run then to assign ECO codes using
the first 4 moves compared to the PGN2ECO3.eco plain text table
which is provided. Be careful modifying this .eco file since
the sequence AND completeness are important. PGN2ECO3 will use
the last match it found in pgn2eco3.eco as it moves down the
list. The sequence and content of PGN2ECO3.eco is critical!
An [Opening tag for each game will also be inserted into the PGN
output file if no such tag existed. For help, enter PGN2ECO3 /?
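The "last match wins" rule explains why the sequence of the .eco table
matters. A Python sketch of that rule follows. The table layout used
here (ECO code, opening name, list of halfmoves) is invented for the
illustration and is not the real format of PGN2ECO3.eco; assign_eco is a
hypothetical name.

```python
def assign_eco(moves, table):
    """Return (eco, opening) for the last table line matching the game."""
    first_four = moves[:8]  # four full moves = eight halfmoves
    best = None
    for eco, name, line in table:
        # A table line matches when the game begins with its moves.
        if first_four[:len(line)] == line:
            best = (eco, name)  # a later match replaces an earlier one
    return best
```

With general lines placed before specific ones, the specific line is
found last and wins. If the same two entries are reversed, the general
line overrides the specific one, which is the kind of damage a careless
edit to the table can cause.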
PGN2ONE.exe converts normalized PGN format files to one-record-
per-game plus a 40-char prefix of useful sort `fields` (or columns).
QSORT or any such program can then be used to ensure the sequence
of the .111 format file. (.111 is my convention, any filetype may
be used). If the file is sorted from position 1 through 40, then
duplicates may be dropped if PGNUNDUP.exe is then executed.
An example command to drop duplicates from a sorted .111 file is:
pgnundup amber.111 amber.pgn
The one-record-per-game format has many useful functions:
1. It is simpler to select or drop large sets of games as for
example dropping all draws of less than nn moves, selecting
only games from one or several specific years. The `find`
command, or a custom-written program is useful for all this.
2. The games can easily be sorted by White, Black, ECO, number
of moves in the game, result (1-0,0-1, etc.), or year.
3. When sorted on positions 1 to 40, and used as input to
PGNUNDUP.exe, duplicate games are dropped as the new output
file is written.
4. Games having desired characteristics such as neither player
castling can be easily selected. Likewise for castling long,
checkmate, or certain variations within an ECO opening code.
If duplicates need not be dropped, you may use ONE2PGN.exe to convert
the .111 format back to PGN ...as with one2pgn Leko.111 Leko.pgn
Finally, if desired, a PGN file can be reduced to contain only games
where both players are rated 2450 or above by using PGNBEST6.exe i.e.,
pgnbest6 04all.pgn 04best.pgn
or
pgnbest6 04all.pgn 04best.pgn /ELO which causes [WhiteElo and
[BlackElo tags to be updated or created.
PGNBEST6.exe uses the plain text ratings file PGNBEST6.rat to select
the strongest players. This file need not be in alphabetical, or any
other sequence, but may be easier to maintain in alphabetical sequence.
One method is to capture new FIDE ratings lists into the .rat format,
and keep `classic` masters such as Alekhine, Fischer, etc. at the end
where they can simply be copied as one block into a new .rat file.
Since the `Classic` players are not gaining new ratings, an estimated
rating is used, and would only apply to games in which they played.
Up-and-coming new masters (such as Magnus Carlsen) should be added to
cause their games to be selected. PGNBEST6.rat uses the first seven
characters of player names for selections and matches since this has
been found by lengthy testing to be optimum.
-----------------------------------------------------------------------
S U M M A R Y
-------------
I run pgntrim5 immediately after downloading PGN files from the internet.
If I am going to combine two or more PGN files, I do that next using
some variant of the "copy" command to achieve the desired result.
For example: copy twic48*.PGN+twic49*.pgn+twic50*.pgn twic2004.pgn
If I am going to select only the games where both players are very strong,
then I run pgnbest6 using the plain-text user-modifiable table file
named pgnbest6.rat
-----------------------------------------------------------------------
The following files are contained in the archive file PGNUTILS.ZIP
111TOELO.EXE Reads .111 records, adds GM Elo if none present.
111TOELO.RAT Used by above program. Includes classic Grandmasters.
111YEAR.EXE Reads .111 in any sequence, appends .111 to specific year
files such as 1854.111, 2001.111, etc.
2600PLUS.RAT Recent ratings of Grandmasters with FIDE rating of 2600+.
BYECO.BAT These batch files sort .111 files in different ways....
BYEVENT.BAT
BYMOVES.BAT
BYRESULT.BAT
BYYEAR.BAT
BYPLAYER.EXE Reads .111 in any sequence, writes Leko.111, Anand.111, etc.
CHOICE.DOC Free utility from Microsoft for making choices in batch files.
CHOICE.EXE
CRLF.EXE Scans any text input (including PGN), and ensures all lines
CRLF.TXT ...end in CR/LF rather than just \n (Linefeed).
ECOBYDES.TXT Gives ECO code from opening description.
ECOBYECO.TXT Gives opening description from ECO code.
FILE_ID.DIZ Brief summary of PGNUTILS.ZIP for website file listings.
FIXCRLF.EXE Similar to CRLF
ICCF.RAT Correspondence chess ratings list
ONE2PGN.EXE Convert .111 format in any sequence back into PGN.
PGN2ECO3.ECO OPTIONAL: reads and writes PGN assigning ECO only if missing.
PGN2ECO3.EXE -------- Caution...these are APPROXIMATE ECO codes, only.
PGN2ONE.EXE Converts PGN file to .111 file of one line per game.
PGNBEST6.EXE Reads PGN, writes PGN of GMs having FIDE Elo of 2450+
PGNBEST6.RAT
PGNSCID.EXE Loader for SCID database will catch illegal or impossible
...moves which PGNTRIM5 misses.
PGNTRIM5.DOC This is the primary normalization program for PGN files.
PGNTRIM5.EXE Read the .DOC file for further details.
PGNUNDUP.EXE Reads SORTED .111, writes PGN while dropping duplicate games.
PGNUTILS.TXT This file
QSORT.BRF Brief documentation...all you need is in here!
QSORT.DOC Full documentation
QSORT.EXE Command-line sort utility handles ANY size file.
TEST.PGN Important test data exercising PGNTRIM5 & showing functions.
UN_EOF.EXE Removes excessive end-of-file characters leaving only one!
UN_EOF.TXT