Character encoding (Generation III)

From Bulbapedia, the community-driven Pokémon encyclopedia.

The Generation III games use a proprietary character encoding to store text data. This encoding differs substantially from those used in previous generations: the same characters correspond to different byte values. Versions of the games in different languages may also use different encodings, some diverging more than others.

Some text strings are stored in fixed-length structures while others are stored in a block of text with separate strings simply terminated by 0xFF. In the large, variable-length blocks, usually another structure will have pointers to the appropriate string(s) within that block of text. In the fixed-length structures, strings are still terminated by 0xFF, but any remainder of the allotted space is padded out with 0x00.
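The two storage layouts described above can be sketched in Python as follows. This is a minimal illustration, not code from the games: the function names `read_string` and `pack_fixed` are hypothetical, and the sketch assumes a fixed-length field always has room for the 0xFF terminator, since the text above does not say how a string that exactly fills its field is stored.

```python
TERMINATOR = 0xFF  # marks the end of every string
PADDING = 0x00     # fills the rest of a fixed-length field


def read_string(block: bytes, offset: int = 0) -> bytes:
    """Return the raw bytes of the string starting at `offset`.

    Works for a pointer into a variable-length block and for a
    fixed-length field (offset 0). If no terminator is found, the
    remainder of the data is returned unchanged.
    """
    end = block.find(bytes([TERMINATOR]), offset)
    return block[offset:] if end == -1 else block[offset:end]


def pack_fixed(text: bytes, size: int) -> bytes:
    """Store `text` in a field of `size` bytes: text, 0xFF, then 0x00 padding.

    Assumption of this sketch: the terminator must always fit.
    """
    if TERMINATOR in text:
        raise ValueError("text must not contain the terminator byte")
    if len(text) >= size:
        raise ValueError("no room for the terminator")
    padding = bytes([PADDING]) * (size - len(text) - 1)
    return text + bytes([TERMINATOR]) + padding
```

For example, `pack_fixed(b"\xbb\xbc", 5)` yields the five bytes BB BC FF 00 00, and `read_string` on that field returns the first two bytes again.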

Compatibility

Unlike previous generations, all language versions are able to trade or battle across languages. This includes between Japanese games and Western games, which was not previously supported. Due to the encodings mostly being compatible, when trading Pokémon between different languages, nicknames and Original Trainer names are usually displayed correctly, though Japanese games will only display the first five characters of names longer than five characters.

All of the core series games for the Game Boy Advance and the side series games for the Nintendo GameCube losslessly preserve the codepoints used in Trainer names and Pokémon nicknames when traded. A name is therefore displayed correctly again once the Pokémon returns to a game in its language of origin, even if it appeared truncated or was displayed differently in the meantime.

Nintendo GameCube

Main article: GameCube character encoding (Generation III)

When viewing or storing a Pokémon in Pokémon Box Ruby & Sapphire or trading a Pokémon between the Game Boy Advance games and Pokémon Colosseum and XD, its nickname and Original Trainer need to be transcoded between this character encoding and that of the Nintendo GameCube game.

Pal Park

Main article: Pal Park → Nicknames and Original Trainers

When transferring a Pokémon to a Generation IV game via Pal Park, its nickname and Original Trainer need to be transcoded from this character encoding to that of the Generation IV games. Due to a bug, some accented letters that normally cannot be entered by the player are incorrectly turned into a kana character instead.

Character sets

Every Western game in Generation III (English, French, Italian, German, and Spanish games) contains two character sets: their native set and the Japanese set. The different Western character sets are mostly identical, with only a few regional differences.

For most text, the game's native character set is used, but if a Pokémon's origin language is Japanese, its nickname and its Original Trainer's name use the Japanese character set. The Japanese games only have the Japanese character set, but almost all user-enterable characters from the Western versions are encoded to roughly equivalent characters in the Japanese encoding. The key differences are 0xB8 (a comma in the Western versions but a period in Japanese), 0xAE (a hyphen-minus in the Western versions but a chōonpu in Japanese, which is visually similar), and 0xAD and 0xB0-0xB4 (which display as the Japanese equivalents of the Western characters).

Western

The table below shows the English character set in Pokémon Emerald. Some differences do exist between different revisions and games and between different languages, detailed afterward.

Characters on a white background are the only characters that can be input in names; 0xF1 - 0xF6 are only available for input in German games. Those on a light gray background may be used in other text strings (such as dialogue) depending on the language of the game. Characters on a dark gray background are unused values that mostly display as spaces in Pokémon FireRed, LeafGreen, and Emerald; in Pokémon Ruby and Sapphire, they are holdovers from the Japanese encoding.

-0 -1 -2 -3 -4 -5 -6 -7 -8 -9 -A -B -C -D -E -F
0- À Á Â Ç È É Ê Ë Ì Î Ï Ò Ó Ô
1- Œ Ù Ú Û Ñ ß à á ç è é ê ë ì
2- î ï ò ó ô œ ù ú û ñ º ª ᵉʳ & +
3- Lv = ;
4-
5- ¿ ¡ PK MN [0x55 glyph] [0x56 glyph] [0x57 glyph] [0x58 glyph] [0x59 glyph] Í % ( )
6- â í
7- * * *
8- * * * * < >
9-
A- ʳᵉ 0 1 2 3 4 5 6 7 8 9 ! ? . -
B- ' $ , × / A B C D E
C- F G H I J K L M N O P Q R S T U
D- V W X Y Z a b c d e f g h i j k
E- l m n o p q r s t u v w x y z
F- : Ä Ö Ü ä ö ü Control characters
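As a rough illustration of how the contiguous rows of this table can be decoded, the sketch below handles only the digits (0xA1-0xAA), the uppercase letters (0xBB-0xD4), the lowercase letters (0xD5-0xEE), and a few punctuation marks whose positions follow from rows A- and B- and the text above. The function name `decode_basic` is hypothetical, treating 0x00 as a space is an assumption of this sketch, and every other byte is shown as an escaped hex value rather than guessed.

```python
# Single characters whose codepoints follow from the table and text above.
PUNCTUATION = {
    0x00: " ",  # assumption of this sketch: 0x00 is the regular space
    0xAB: "!",
    0xAC: "?",
    0xAD: ".",
    0xAE: "-",
    0xB8: ",",
}


def decode_basic(data: bytes) -> str:
    """Decode the alphanumeric subset of the Western character set.

    Stops at the 0xFF terminator. Bytes outside the handled subset are
    rendered as \\xNN so that nothing is silently dropped.
    """
    out = []
    for b in data:
        if b == 0xFF:
            break
        if 0xA1 <= b <= 0xAA:
            out.append(chr(ord("0") + b - 0xA1))
        elif 0xBB <= b <= 0xD4:
            out.append(chr(ord("A") + b - 0xBB))
        elif 0xD5 <= b <= 0xEE:
            out.append(chr(ord("a") + b - 0xD5))
        elif b in PUNCTUATION:
            out.append(PUNCTUATION[b])
        else:
            out.append(f"\\x{b:02X}")
    return "".join(out)
```

A full decoder would need one table per language and game, because several codepoints differ between them.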

Differences between games and revisions

Codepoint 0xB0 represents an ellipsis. In Pokémon Ruby, Sapphire, Colosseum, XD, and Box Ruby & Sapphire, it renders as a two-dot ellipsis (‥). In Pokémon FireRed, LeafGreen, and Emerald, it renders as a three-dot ellipsis (…) in the main font, but remains a two-dot ellipsis in the small font used on the party screen and the narrow font used in the Pokédex, bag, and shops. In subsequent generations, this character renders consistently as a three-dot ellipsis.

Codepoint 0xB4 represents an apostrophe or right single quotation mark. In Pokémon Box Ruby & Sapphire, it is transcoded as a curly right single quotation mark (’). In Pokémon Colosseum and XD, it is transcoded as a straight apostrophe (').

Codepoints 0x7D-0x83, marked by asterisks (*) above, print spaces 1-7 pixels wide (in ascending order of the hex value). In FireRed and LeafGreen, 0x50 and 0x7D-0x83 are not used and print as regular spaces like other unused characters.
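The relationship between these codepoints and their widths is simple arithmetic, sketched below. The function name `spacer_width` is hypothetical, and the sketch does not cover FireRed and LeafGreen, where these values print as regular spaces.

```python
def spacer_width(codepoint: int) -> int:
    """Width in pixels of a spacing character 0x7D-0x83 (1 to 7 pixels)."""
    if not 0x7D <= codepoint <= 0x83:
        raise ValueError("not a variable-width space codepoint")
    # 0x7D is 1 pixel wide, and each following codepoint adds one pixel.
    return codepoint - 0x7C
```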

In certain languages, codepoints 0x34, 0x57-0x59, and 0x64 differ between games, as detailed below.

In Pokémon Ruby and Sapphire, many values print Japanese characters—holdovers from the original Japanese encoding. These include:

  • All unused characters (on a dark gray background above)
  • 0x50 and 0x7D - 0x83
  • 0x36, 0x84 - 0x86, and 0xA0, in version 1.0 of the English Ruby and Sapphire only

Regional differences

A few characters differ between regions, and among them are quotation marks. These can be input into names, which means a Pokémon with quotation marks in its nickname or OT name will display differently if traded to a game of a different region.

In the table below, the underscores (_) stand for spaces.

English Spanish Italian German French
0x34: English Lv; Spanish Nv (RS) or Nv. (E, FRLG); Italian L.; German Lv.; French N.
0x57 - 0x59: English, Spanish, German, and French each show their own set of three glyphs (images); Italian shows _, _, M (RS) or _, _, _ (E, FRLG)
0x5E - 0x63: Italian glyphs (images)
0x64: Pco (E)
0xB1: «
0xB2: »

Japanese

Only the characters on a white background below can be input in names. The characters on a dark gray background are printed as spaces in Pokémon FireRed, LeafGreen, and Emerald. Otherwise, the Japanese character set has no differences between games or revisions. Codepoint 0xB0 represents an ellipsis. In Pokémon Ruby, Sapphire, Emerald, Colosseum, XD, and Box Ruby & Sapphire, it renders as a two-dot ellipsis (‥). In Pokémon FireRed and LeafGreen, it renders as a three-dot ellipsis (…). In subsequent generations, this character renders consistently as a three-dot ellipsis.

-0 -1 -2 -3 -4 -5 -6 -7 -8 -9 -A -B -C -D -E -F
0-  
1-
2-
3-
4-
5-
6-
7-
8-
9-
A-
B- ×
C-
D-
E-
F- Ä Ö Ü ä ö ü Control characters

Special characters

Extra symbols

Pokémon Ruby and Sapphire

In Pokémon Ruby and Sapphire, the sequence 0xFC 0x0C acts as an escape code: the byte that follows it selects one of the characters below. These characters are stored in the game's fonts at their respective indices, but the escape sequence forces them to be printed directly rather than possibly being interpreted as control characters. If the following byte is not a control character byte, it prints normally.

-0 -1 -2 -3 -4 -5 -6 -7 -8 -9 -A -B -C -D -E -F
F-

As 0xF7-0xF9 are not control characters in these games, they will also print normally even if not escaped, though in-game text always escapes them.

In Japanese games, 0xFB is instead used to produce "=" in the Options screen and "&" in the credits. Western versions almost never use these codepoints, instead preferring to use the equivalent characters in the 0x01-0xA0 range.

The control characters from 0xFC-0xFF do not produce any characters. In the English games, nothing is printed, while in the Japanese games, miscellaneous data appears to be printed.

Pokémon FireRed, LeafGreen, and Emerald

In Pokémon FireRed, LeafGreen, and Emerald, either the escape sequence 0xF9 or 0xFC 0x0C is used as an escape code for the following characters, depending on the following byte. These characters are stored in the game's fonts beginning at index 0x100, after the main set of characters. Characters on a dark gray background are unused values that display as spaces.

Western font
-0 -1 -2 -3 -4 -5 -6 -7 -8 -9 -A -B -C -D -E -F
0- Lv PP ID No _
1- ×
2-
3-
4-
5-
6-
7-
8-
9-
A-
B-
C-
D- Underscore Vertical bar Overline Tilde Left parenthesis Right parenthesis Subset of Greater than Left eye Right eye At sign Semicolon Plus sign Minus sign Equals sign Spiral
E- Tongue Triangle outline Acute accent Grave accent Circle Down-pointing triangle Square Heart Crescent Music note Poké Ball Thunderbolt Leaf Fire Water Right hand
F- Left hand Flower Eye Eye Irritated face Mischievous face Happy face Angry face Surprised face Big smile face Evil face Tired face Neutral face Shocked face Big anger face
Japanese font
-0 -1 -2 -3 -4 -5 -6 -7 -8 -9 -A -B -C -D -E -F
0- Lv PP ID No _
1-
2-
3-
4-
5-
6-
7-
8-
9-
A-
B-
C- R Button 大 小 ゛
D- ゜ Vertical bar Overline Tilde Left parenthesis Right parenthesis Subset of Greater than Left eye Right eye At sign Semicolon Plus sign Minus sign Equals sign Spiral
E- Tongue Triangle outline Acute accent Grave accent Circle Down-pointing triangle Square Heart Crescent Music note Poké Ball Thunderbolt Leaf Fire Water Right hand
F- Left hand Flower Eye Eye Irritated face Mischievous face Happy face Angry face Surprised face Big smile face Evil face Tired face Neutral face Shocked face Big anger face

0xF9 is used in most cases, but most arrows in game text are still written using the escape sequence 0xFC 0x0C in Japanese versions or the 0x79-0x7C range in Western versions.
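Since the extra symbols are stored from font index 0x100 onward, the glyph index for an escaped symbol appears to be 0x100 plus the byte that follows the escape. The sketch below illustrates that reading for both escape forms. The function name `extra_symbol_glyph_index` is hypothetical, and the offset rule is an inference from the indices quoted in this section (for example "×" at 0x117), not a documented routine.

```python
EXTRA_SYMBOL_BASE = 0x100  # first font index after the main character set


def extra_symbol_glyph_index(sequence: bytes) -> int:
    """Map an escape sequence to a font index (FireRed, LeafGreen, Emerald).

    Accepts either 0xF9 <byte> or 0xFC 0x0C <byte>.
    """
    if len(sequence) == 2 and sequence[0] == 0xF9:
        return EXTRA_SYMBOL_BASE + sequence[1]
    if len(sequence) == 3 and sequence[0] == 0xFC and sequence[1] == 0x0C:
        return EXTRA_SYMBOL_BASE + sequence[2]
    raise ValueError("not an extra-symbol escape sequence")
```

Under this reading, 0xF9 0x17 selects index 0x117 and 0xFC 0x0C 0xD0 selects index 0x1D0.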

Differences between games and revisions

In Western versions of Pokémon FireRed and LeafGreen, the character "×" was added at 0x117 to preserve the original appearance of the fullwidth character when displaying type matchups in the Help System, as the halfwidth character "×" at 0xB9 used elsewhere in the game is slightly smaller and does not match the width of "◎" and "△".

In Western versions of Pokémon FireRed and LeafGreen, an 8-pixel wide underscore was added at 0x1D0 for the Union Room chat. It is distinct from the underscore at 0x109, which leaves a 1-pixel gap between itself and the previous character and is used in the easy chat system in Western versions and in both interfaces in Japanese versions.

In Japanese versions of Pokémon Emerald, the characters from 0x1CC-0x1D0 were added to explain what function the R Button performs on the text entry screen, though 0x1CC was not actually used in the final game.

Regional differences

A few characters differ between regions. The table below describes how they are displayed in the main font.

Japanese English Spanish Italian German French
0x105 Lv Nv. L. Lv. N.
0x106 PP AP PP
0x107 ID
0x108 No N.º Nr.

In Japanese versions, the "ID" character is displayed as "ID." in the wider fonts introduced in Pokémon FireRed and LeafGreen.

In the small font, each character is only 8 pixels wide, so the sequence 0x105 0x118 is used in Spanish and German versions to display the full 9-pixel wide "Nv." and "Lv." strings.

Braille

The table below shows the codepoints corresponding to each braille pattern. They are not ordered according to the braille numeric sequence; rather, they are stored in a binary ordering with dots 1, 4, 2, 5, 3, and 6 corresponding to the lowest six bits from lowest to highest.

In Pokémon Ruby, Sapphire, and Emerald, each braille message is preceded by six bytes describing the size and position of the window and text. However, Pokémon Emerald ignores these values and calculates them based on the contents of the message itself.

In Pokémon FireRed and LeafGreen, braille is displayed like normal text in the standard dialogue window.
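The bit ordering described above (dots 1, 4, 2, 5, 3, 6 on bits 0 through 5) can be sketched as follows. The function names are hypothetical. The conversion to Unicode uses the Braille Patterns block, where dot n of dots 1-6 corresponds to bit n-1 above U+2800, and is included only to make the results easy to inspect.

```python
# Bit position of each braille dot in the games' ordering:
# dots 1, 4, 2, 5, 3, 6 occupy bits 0 through 5.
GAME_BIT = {1: 0, 4: 1, 2: 2, 5: 3, 3: 4, 6: 5}


def braille_codepoint(dots) -> int:
    """Return the codepoint (0x00-0x3F) for a set of raised dots."""
    value = 0
    for dot in dots:
        value |= 1 << GAME_BIT[dot]
    return value


def codepoint_to_unicode(codepoint: int) -> str:
    """Convert a braille codepoint to the matching Unicode braille pattern."""
    dots = [dot for dot, bit in GAME_BIT.items() if (codepoint >> bit) & 1]
    # In Unicode, dot n (1-6) is bit n-1 above U+2800.
    return chr(0x2800 + sum(1 << (dot - 1) for dot in dots))
```

For example, the letter B (dots 1 and 2) has codepoint 0x05 in this ordering, while the letter C (dots 1 and 4) has codepoint 0x03.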

Western font
-0 -1 -2 -3 -4 -5 -6 -7 -8 -9 -A -B -C -D -E -F
0-
1-
2-
3-

Keypad icons

In Pokémon FireRed, LeafGreen, and Emerald, the escape sequence 0xF8 is used as an escape code for the following characters, depending on the following byte.

-0 -1 -2 -3 -4 -5 -6 -7 -8 -9 -A -B -C
0- A Button B Button L Button R Button START SELECT Control Pad up Control Pad down Control Pad left Control Pad right Control Pad up-down Control Pad left-right Control Pad
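A lookup for these icons could be sketched as below. The list simply follows the order of the table above, and the names `KEYPAD_ICONS` and `keypad_icon` are hypothetical.

```python
# Icon names in table order, indexed by the byte that follows 0xF8.
KEYPAD_ICONS = [
    "A Button", "B Button", "L Button", "R Button", "START", "SELECT",
    "Control Pad up", "Control Pad down", "Control Pad left",
    "Control Pad right", "Control Pad up-down", "Control Pad left-right",
    "Control Pad",
]


def keypad_icon(sequence: bytes) -> str:
    """Return the icon name for a two-byte 0xF8 escape sequence."""
    if len(sequence) != 2 or sequence[0] != 0xF8:
        raise ValueError("not a keypad icon escape sequence")
    index = sequence[1]
    if index >= len(KEYPAD_ICONS):
        raise ValueError(f"no keypad icon listed for 0x{index:02X}")
    return KEYPAD_ICONS[index]
```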

Control characters

  • 0xF7 is an escape character for dynamic data (FRLG, E).
  • 0xF8 is an escape character for keypad icons (FRLG, E).
  • 0xF9 is an escape character for extra symbols (FRLG, E).
  • 0xFA and 0xFB both mark a prompt for the player to press a button to continue the dialogue. However, they will print the new line of dialogue differently: 0xFA will scroll the previous dialogue up one line before printing the next line, while 0xFB will clear the dialogue box entirely.
  • 0xFC is an escape character that leads to several different functions (see below).
  • 0xFD is an escape character for variables, such as the player's name or a Pokémon's name (see below).
  • 0xFE is a line break.
  • 0xFF is a terminator, marking the ends of strings.
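The list above can be restated as a small lookup, sketched below. The names `CONTROL_BYTES` and `describe_control` are hypothetical. The `frlge` flag reflects that 0xF7-0xF9 act as control characters only in FireRed, LeafGreen, and Emerald.

```python
# (description, only a control character in FireRed/LeafGreen/Emerald)
CONTROL_BYTES = {
    0xF7: ("dynamic data escape", True),
    0xF8: ("keypad icon escape", True),
    0xF9: ("extra symbol escape", True),
    0xFA: ("prompt, then scroll up one line", False),
    0xFB: ("prompt, then clear the dialogue box", False),
    0xFC: ("function escape", False),
    0xFD: ("variable escape", False),
    0xFE: ("line break", False),
    0xFF: ("terminator", False),
}


def describe_control(byte: int, frlge: bool = True):
    """Return a description of a control byte, or None for ordinary bytes.

    Pass frlge=False for Ruby and Sapphire, where 0xF7-0xF9 are not
    control characters.
    """
    entry = CONTROL_BYTES.get(byte)
    if entry is None:
        return None
    description, frlge_only = entry
    if frlge_only and not frlge:
        return None
    return description
```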

0xFC functions

Index Function Length Values
Introduced in Japanese versions of Ruby and Sapphire
0x00 do nothing 1
0x01 change the text color 2 color index (byte)
0x02 change the highlight color 2 color index (byte)
0x03 change the shadow color 2 color index (byte)
0x04 change the text, highlight, and shadow color 4 text color index (byte); highlight color index (byte); shadow color index (byte)
0x05 change the palette (RS) 2 palette index (byte)
0x06 change font 2 font index (byte)
0x07 reset font to the default (RS) 1
0x08 pause text display for a period of time 2 length (byte)
0x09 wait for a button to be pressed 1
0x0A wait for a sound effect to finish 1
0x0B play background music 3 index (two bytes, little endian)
0x0C print an extra symbol 2 character index (byte)
0x0D set the X coordinate of the cursor (in all versions except Western RS) 2 number of pixels (byte)
0x0E set the Y coordinate of the cursor 2 number of pixels (byte)
0x0F clear the window 1
0x10 play a sound effect 3 index (two bytes, little endian)
Introduced in Western versions of Ruby and Sapphire
0x11 print a space of the specified width 2 number of pixels (byte)
0x12 set the X coordinate of the cursor 2 number of pixels (byte)
0x13 clear the line until the specified X coordinate 2 number of pixels (byte)
0x14 set the minimum character width 2 number of pixels (byte)
0x15 display text in the Japanese font 1
0x16 display text in the international font 1
Introduced in Japanese versions of FireRed and LeafGreen
0x17 pause the background music 1
0x18 resume playing the background music 1
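Given the lengths in the table, a parser can step over each 0xFC sequence without interpreting it. The sketch below does this under two assumptions that the table suggests but does not state outright: the Length column counts the function index byte plus its arguments (so 0x04 with length 4 is the index followed by three color bytes), and the 0xFC prefix is not included in that count. The names `FC_LENGTHS` and `split_fc_sequences` are hypothetical. Other escapes (0xF7-0xF9, 0xFD) are not handled here, and the behaviour of 0x0D in Western Ruby and Sapphire is not covered.

```python
# Bytes after the 0xFC prefix, counting the function index itself.
FC_LENGTHS = {
    0x00: 1, 0x01: 2, 0x02: 2, 0x03: 2, 0x04: 4, 0x05: 2, 0x06: 2, 0x07: 1,
    0x08: 2, 0x09: 1, 0x0A: 1, 0x0B: 3, 0x0C: 2, 0x0D: 2, 0x0E: 2, 0x0F: 1,
    0x10: 3, 0x11: 2, 0x12: 2, 0x13: 2, 0x14: 2, 0x15: 1, 0x16: 1,
    0x17: 1, 0x18: 1,
}


def split_fc_sequences(data: bytes):
    """Separate 0xFC sequences from the remaining bytes of one string.

    Returns (plain, sequences): `plain` holds every byte that is not part
    of a 0xFC sequence, and `sequences` is a list of (index, arguments)
    tuples. Parsing stops at the 0xFF terminator.
    """
    plain = bytearray()
    sequences = []
    i = 0
    while i < len(data) and data[i] != 0xFF:
        if data[i] != 0xFC:
            plain.append(data[i])
            i += 1
            continue
        if i + 1 >= len(data):
            raise ValueError("truncated 0xFC sequence")
        index = data[i + 1]
        length = FC_LENGTHS.get(index)
        if length is None:
            raise ValueError(f"unknown 0xFC function 0x{index:02X}")
        end = i + 1 + length
        if end > len(data):
            raise ValueError("truncated 0xFC arguments")
        sequences.append((index, bytes(data[i + 2:end])))
        i = end
    return bytes(plain), sequences
```

Two-byte arguments such as the music index of 0x0B are little endian, so `int.from_bytes(arguments, "little")` recovers the value.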

Although 0xFC 0x00 does not have any effect in regular text, the Western versions of Pokémon Ruby and Sapphire handle it specially in two cases.

  • It is used as a placeholder for a one-digit number in the text "Appeal no. <number>!" used during Contests.
  • It is used to abbreviate the names of cities and towns in the Trainer's Eyes function of the PokéNav. For example, "VERDANTURF TOWN" is shortened to "VERDANTURF".

Color values

A table of available text, highlight, and shadow colors is shown below.

Byte RS FRLG E
0x00 Transparent
0x01
0x02
0x03
0x04
0x05
0x06
0x07
0x08
0x09
0x0A
0x0B
0x0C
0x0D
0x0E
0x0F

0xFD variables

When 0xFD is followed by one of the following bytes, it prints a text variable or version-dependent text. Version-dependent text is only used in Pokémon Ruby, Sapphire, and Emerald; in Pokémon Emerald, all of these values are the same as Pokémon Sapphire, except the version name. The text printed by version-dependent text variables is constant within a single game, but varies between versions and languages.

Text variables
  • 0x01: the player's name
  • 0x02, 0x03, or 0x04: whatever text has been assigned to one of three buffers using a variety of script commands
  • 0x05: an honorific corresponding to the player's gender in Japanese (くん for male, ちゃん for female), or nothing in Western languages
  • 0x06: the rival's name
Version-dependent text
Variable ID Description English content
 R   S   E 
0x07 the game's name RUBY SAPPHIRE EMERALD
0x08 the name of the villainous team MAGMA AQUA
0x09 the name of the non-villainous team AQUA MAGMA
0x0A the name of the villainous team's leader MAXIE ARCHIE
0x0B the name of the non-villainous team's leader ARCHIE MAXIE
0x0C the name of the villainous team's Legendary Pokémon GROUDON KYOGRE
0x0D the name of the opposing Legendary Pokémon KYOGRE GROUDON
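The substitution described in this section can be sketched as a pair of lookups. The English strings are taken from the table above, with Emerald matching Sapphire except for the game's name. The names `TEXT_VARIABLES`, `VERSION_TEXT`, and `expand_variable`, as well as the keys of the `context` dictionary, are hypothetical and only illustrate the idea.

```python
# Text variables: the value comes from the running game's state.
TEXT_VARIABLES = {
    0x01: "player", 0x02: "buffer1", 0x03: "buffer2", 0x04: "buffer3",
    0x05: "honorific", 0x06: "rival",
}

# Version-dependent text (English), keyed by R, S, or E.
VERSION_TEXT = {
    0x07: {"R": "RUBY", "S": "SAPPHIRE", "E": "EMERALD"},
    0x08: {"R": "MAGMA", "S": "AQUA", "E": "AQUA"},
    0x09: {"R": "AQUA", "S": "MAGMA", "E": "MAGMA"},
    0x0A: {"R": "MAXIE", "S": "ARCHIE", "E": "ARCHIE"},
    0x0B: {"R": "ARCHIE", "S": "MAXIE", "E": "MAXIE"},
    0x0C: {"R": "GROUDON", "S": "KYOGRE", "E": "KYOGRE"},
    0x0D: {"R": "KYOGRE", "S": "GROUDON", "E": "GROUDON"},
}


def expand_variable(variable_id: int, version: str, context: dict) -> str:
    """Return the text for the byte following 0xFD.

    `context` supplies the text variables; a missing entry expands to an
    empty string, as the honorific does in Western languages.
    """
    if variable_id in TEXT_VARIABLES:
        return context.get(TEXT_VARIABLES[variable_id], "")
    by_version = VERSION_TEXT.get(variable_id)
    if by_version is None:
        raise ValueError(f"unknown 0xFD variable 0x{variable_id:02X}")
    return by_version[version]
```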

Trivia

  • In the name field for Eggs, the game places the bytes 0x60 0x6F 0x8B corresponding to タマゴ (tamago, the Japanese word for egg). This remains in the English version even though the characters have been replaced.

