Character encoding (Generation I)
Revision as of 02:53, 26 January 2019
This article is incomplete. Please feel free to edit this article to add missing information and complete it. Reason: French, German, Italian, and Spanish character encodings
The Generation I games use a proprietary character encoding to store text data. Versions of the games in different languages may use different encodings, some more different than others.
Fixed-length user-input strings are terminated with 0x50. If a fixed-length string is terminated before using its full capacity, the contents of the remaining space are not specified.
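The terminator rule can be sketched in a few lines of Python. This is only an illustration: the helper name `trim_at_terminator` and the 11-byte field size are assumptions made for the example, not something taken from the games' code. The letter codes come from the English character table in this article.

```python
TERMINATOR = 0x50  # end-of-string marker for user-input strings


def trim_at_terminator(buffer: bytes) -> bytes:
    """Return the bytes before the first 0x50.

    Whatever follows the terminator in a fixed-length field is
    unspecified leftover data, so it is simply dropped here.
    """
    end = buffer.find(bytes([TERMINATOR]))
    if end == -1:
        raise ValueError("no 0x50 terminator in buffer")
    return buffer[:end]


# Hypothetical 11-byte name field: "RED" (0x91 0x84 0x83), the terminator,
# then leftover bytes that must not be interpreted as part of the name.
field = bytes([0x91, 0x84, 0x83, 0x50, 0x8D, 0x50, 0x00, 0x00, 0x00, 0x00, 0x00])
name = trim_at_terminator(field)
```

Here `name` holds only the three bytes for "RED"; the stray 0x8D after the terminator is ignored.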
Character sets
Note that 0x7F is a space (" "), not empty. Every character that is not a control character prints as exactly one character.
In some contexts, some characters may display differently than suggested below. For example, in the character input table, 0xF0 displays as "ED" instead of the Pokémon Dollar symbol, and in the Pokédex (in English), the feet (') and inches (") marks are 0x60 and 0x61.
English
Mechanics
The game sections off various areas of the tilemap loaded into VRAM, and each character code corresponds directly to a tile in the tilemap. Not all tiles in the tilemap are accessible via a character code, but many are.
Control characters work by intercepting the tile that would normally correspond to the control character and performing a different action instead, whether that is ending the text or printing a lengthy message.
Tilemap sections
- VRAM addresses 0x9000 to 0x9480 correspond to a portion of the current map's tileset. Character codes 0x01 to 0x48 and 0x4D correspond directly to these tiles. For example, when you're outdoors, tile 0x03 is the animated flower, so character code 0x03 places the animated flower in text. Anywhere else, such as in battle or in a cave, a completely different tile will likely print.
  - Character codes 0x49 - 0x5F technically fall in this same section, but all of them except 0x4D are control characters and thus link to code rather than to the tile they would normally correspond to.
- VRAM addresses 0x9600 to 0x97F0 partially correspond to character codes 0x60 - 0x7F. This is where the "UI" tiles are, such as assorted bold letters and the border artwork for dialogs and menus. The space character is also here. Tiles here can sometimes change, so characters that reference them may print a different tile image, but they are far more consistent than the first section above.
- VRAM addresses 0x8800 to 0x8BF0 correspond to character codes 0x80 - 0xBF. This is where the main font is placed when text needs to be rendered.
- VRAM addresses 0x8C00 to 0x8FF0 hold two tile sections:
  - 0xC0 - 0xDF (0x8C00 to 0x8DF0) appears to be reserved for areas that need space for extra tiles. Most of the time nothing is there, so only blank characters print. The player info screen is one example that uses part of this area, which changes what any character codes referencing these tiles will print.
  - 0xE0 - 0xFF (0x8E00 to 0x8FF0) reference tiles similar to the second section (0x60 - 0x7F) and can be considered the "other half" of that section, although some player-typeable characters such as "PK", "MN", and the gender symbols are here as well, along with numbers, some symbols, and more UI characters.
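The address ranges listed above are consistent with a simple rule: 16 bytes per tile, with codes 0x00 - 0x7F counted from 0x9000 and codes 0x80 - 0xFF counted from 0x8800. The following is a minimal sketch under that assumption; `tile_address` is a hypothetical helper, not a routine from the game.

```python
TILE_SIZE = 16  # bytes per tile; an assumption inferred from the listed ranges


def tile_address(code: int) -> int:
    """Return the VRAM address of the tile a character code would reference.

    Control codes (0x49 - 0x5F except 0x4D) are intercepted before any
    tile is drawn, so for them this is only where the tile *would* be.
    """
    if not 0x00 <= code <= 0xFF:
        raise ValueError("character codes are single bytes")
    if code < 0x80:
        return 0x9000 + code * TILE_SIZE
    return 0x8800 + (code - 0x80) * TILE_SIZE


# The animated flower example: character code 0x03 while outdoors.
flower = tile_address(0x03)
# First tile of the main font (character code 0x80).
font_start = tile_address(0x80)
```

With this rule, the section boundaries quoted in the list (0x9480, 0x9600, 0x97F0, 0x8BF0, 0x8C00) all fall out of the same formula.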
Character codes
As mentioned above, control codes occupy the 0x49-0x5F range, with the exception of 0x4D, which doesn't map to any code and thus, by default, corresponds to tile 0x4D. All of them are usable in-game, for example in names, and testing never showed any crashing. However, expect small to large graphical glitches, which are usually cleaned up by changing screens or entering a new map through a warp, and some definite annoyances if used long-term.
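As a small illustration of that range, the check below flags control codes in a byte string. The function names are hypothetical and the sketch only encodes the range stated in this section.

```python
def is_control_code(code: int) -> bool:
    """True for 0x49 - 0x5F, except 0x4D, which prints a plain tile."""
    return 0x49 <= code <= 0x5F and code != 0x4D


def control_codes_in(text: bytes) -> list:
    """List the control codes found in a byte string, in order."""
    return [code for code in text if is_control_code(code)]


# "A", the "poke" code (0x54), tile 0x4D, then the terminator (0x50).
found = control_codes_in(bytes([0x80, 0x54, 0x4D, 0x50]))
```

Note that the 0x50 terminator is itself a control code under this definition, so it is reported along with 0x54.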
Dialogue control codes
These control codes handle dialogue text placement, paging, and so on. They can be used in names but cause various temporary graphical glitches.
- 0x49 - "page" - Begins a new Pokédex page. In a name, it forces the player to press a button before the rest of the text is displayed and causes serious graphical glitches, which can be cleared in the usual way.
- 0x4B - "_cont" - Stops and waits for confirmation before scrolling the dialogue down by one line. In names it behaves like 0x49 but with slightly fewer graphical glitches.
- 0x4C - "autocont" - Scrolls the dialogue down one line without waiting for confirmation. In names it is less annoying than 0x4B but still graphically glitchy.
- 0x4E - "next line" - Moves down one line in dialogue. In names it does exactly that: all dialogue moves one line down as soon as it reaches your name, causing odd graphical glitches and text that is overwritten or pushed off the screen.
- 0x4F - "bottom line" - Writes at the last line of dialogue. In names it causes graphics, particularly dialogue text, to behave oddly and overlap.
- 0x50 - "end" - Used all the time, even in names. It marks the end of the text, and nothing is read after it. If 0x50 is removed, the text engine keeps reading until it does reach a 0x50 or, in certain cases involving player/rival names, crashes the game when it reaches a variable that tells it to insert that same name, causing an infinite loop.
- 0x51 - "paragraph" - Begins a new dialogue page after button confirmation. In names it does exactly that, causing graphical glitches, overlapping dialogue text, and considerable annoyance.
- 0x55 - "cont" - A variation of 0x4B and 0x4C.
- 0x57 - "done" - Ends the text box. In names it causes various graphical glitches.
- 0x58 - "prompt" - Prompts to end the text box. In names it is similar to 0x4B, 0x4C, and 0x55.
- 0x5F - "dex" - Ends a Pokédex entry. It simply expands to a period ".", but it is normally used only at the end of Pokédex entries.
Variable control codes
These expand to text of their own, which can vary based on other variables. They are safe to use in names without graphical glitches. However, because they expand to longer text, dialogue can quickly spill over the edges of its container, which temporarily clutters the screen and may overwrite or overlap other text being printed.
- 0x52 - "players name" - Inserts the player's name. It is the only variable that cannot be used in the player's own name, since that leads to an infinite loop that crashes the game. It is safe and fun elsewhere: try it in a Pokémon's name or your rival's.
- 0x53 - "rivals name" - The counterpart of "players name": prints the rival's name and cannot be used in the rival's own name. It is great to place in your HM slave's name, though.
- 0x59 - "target" - Inserts the target's name, which is essentially the Pokémon from your perspective. If the dialogue refers to the enemy's Pokémon, that name is inserted with "Enemy " prepended; if it refers to your Pokémon, only your Pokémon's name is inserted. The last Pokémon you fought is kept in memory, so this still works in names even outside of battle. This is the longest control character in the game and will print far off the screen in all cases: this single control character can expand to up to 16 characters.
- 0x5A - "user" - The inverse of "target": the Pokémon from the enemy's perspective. If used in names, it will likely just be the enemy you last fought, without the "Enemy " prefix.
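The infinite loop described for 0x52 and 0x53 can be modelled with a toy expander. This is a sketch only: the `expand` function and its depth guard are assumptions made for illustration, and the game itself has no such guard, which is why it hangs instead of raising an error.

```python
from typing import Mapping

PLAYER, RIVAL, END = 0x52, 0x53, 0x50


def expand(text: bytes, names: Mapping[int, bytes],
           depth: int = 0, max_depth: int = 8) -> bytes:
    """Expand name variables in text, stopping at the 0x50 terminator.

    max_depth is a safety net added for this sketch; it stands in for
    the crash that a self-referencing name causes in the game.
    """
    if depth > max_depth:
        raise RecursionError("name expansion never terminates")
    out = bytearray()
    for code in text:
        if code == END:
            break
        if code in names:
            out += expand(names[code], names, depth + 1, max_depth)
        else:
            out.append(code)
    return bytes(out)


# Safe: the rival's name is just the player-name variable; the player is "RED".
safe_names = {PLAYER: bytes([0x91, 0x84, 0x83, END]), RIVAL: bytes([PLAYER, END])}
rival_text = expand(bytes([RIVAL, END]), safe_names)

# Unsafe: the player's name contains the player-name variable itself.
looping_names = {PLAYER: bytes([PLAYER, END])}
```

Calling `expand(bytes([PLAYER, END]), looping_names)` raises `RecursionError`, which is this sketch's stand-in for the hang.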
Text control codes
These are like variable control codes, but their output is always the same and never changes.
- 0x4A - "pkmn" - Prints "PK" and "MN" using only one character code. It can be used in names to exceed the 7- or 10-character printed limit without going over the space limit.
- 0x54 - "poke" - Prints the characters "Poké" while taking up only 1 byte. It can be used in names to print more than the 7- or 10-character limit while still fitting in the available space.
- 0x56 - "......" - Prints 2 characters of 3 dots each.
- 0x5B - "pc" - Prints "PC" as 2 tiles.
- 0x5C - "tm" - Prints "TM" as 2 tiles.
- 0x5D - "trainer" - Prints "TRAINER" as individual tiles.
- 0x5E - "rocket" - Prints "ROCKET" as individual tiles.
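To see how a name can print more tiles than it stores bytes, the sketch below sums the printed width of a name using the expansions listed above. The table and the `printed_width` helper are illustrative assumptions; in particular, every byte not in the table is assumed to print as one tile, and "Poké" is assumed to take 4 tiles.

```python
# Tiles printed by each text control code, per the list above.
EXPANSION_TILES = {
    0x4A: 2,  # "PK" + "MN"
    0x54: 4,  # "Poké" (assumed one tile per character)
    0x56: 2,  # two tiles of three dots each
    0x5B: 2,  # "PC"
    0x5C: 2,  # "TM"
    0x5D: 7,  # "TRAINER"
    0x5E: 6,  # "ROCKET"
}


def printed_width(name: bytes) -> int:
    """Count the tiles a name prints, stopping at the 0x50 terminator."""
    width = 0
    for code in name:
        if code == 0x50:
            break
        width += EXPANSION_TILES.get(code, 1)
    return width


# Two stored bytes ("trainer" + "rocket") print as 13 tiles.
wide = printed_width(bytes([0x5D, 0x5E, 0x50]))
```

Dialogue and variable control codes are not modelled here; a fuller version would need their behavior as well.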
Bytes with a dark gray background are not normally used in the English games. Characters with a light gray background are holdovers from the Japanese games that are not used in the English games.
-0 -1 -2 -3 -4 -5 -6 -7 -8 -9 -A -B -C -D -E -F
0- NULL
1- Junk
2-
3-
4- Control characters
5- Control characters
6- A B C D E F G H I V S L M : ぃ ぅ
7- ‘ ’ “ ” ・ … ぁ ぇ ぉ = ||
8- A B C D E F G H I J K L M N O P
9- Q R S T U V W X Y Z ( ) : ; [ ]
A- a b c d e f g h i j k l m n o p
B- q r s t u v w x y z é 'd 'l 's 't 'v
C- Junk
D-
E- ' PK MN - 'r 'm ? ! . ァ ゥ ェ ▷ ▶ ▼ ♂
F- $ × . / , ♀ 0 1 2 3 4 5 6 7 8 9
In the Japanese games (as can be seen below), 0xF2 is distinguishable from 0xE8, with the former meant as a decimal point while the latter is punctuation. Presumably this intention was largely inherited when the English games were made, as most of the game's script uses 0xE8 exclusively; however, 0xF2 appears in the character table for user input, meaning it may appear in user-input names (and, conversely, 0xE8 never should).
The full list of characters that are available for user input is: A-Z and a-z, space, and the following: × ( ) : ; [ ] PK MN - ? ! ♂ ♀ / . ,
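The input list can be turned into a set of byte values by reading positions off the English table above. The sketch below does that and reports bytes in a name that the input screen could not have produced. It is an illustration under assumptions: the code positions are read from the table as printed here, the punctuation is assumed to use the main-font codes (0x9A - 0x9F) rather than the UI tiles, and `non_input_codes` is a hypothetical helper.

```python
INPUT_CODES = (
    set(range(0x80, 0x9A))                    # A-Z
    | set(range(0xA0, 0xBA))                  # a-z
    | {0x7F}                                  # space
    | set(range(0x9A, 0xA0))                  # ( ) : ; [ ]
    | {0xE1, 0xE2, 0xE3, 0xE6, 0xE7, 0xEF}    # PK MN - ? ! and the male symbol
    | {0xF1, 0xF2, 0xF3, 0xF4, 0xF5}          # multiplication sign . / , and the female symbol
)


def non_input_codes(name: bytes) -> list:
    """Return codes before the 0x50 terminator that are not user-typeable."""
    result = []
    for code in name:
        if code == 0x50:
            break
        if code not in INPUT_CODES:
            result.append(code)
    return result


# "A", the script-only period 0xE8, and the player-name variable 0x52.
suspicious = non_input_codes(bytes([0x80, 0xE8, 0x52, 0x50]))
```

A name containing 0xE8 is flagged while one containing 0xF2 is not, matching the distinction between the two periods described above.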
Japanese
Technically, all characters below 0x60 are control characters. Most of them cause a specific character from the main font (0x80-0xFF) to be printed with a diacritic in the space above it. Those that have different, more complicated functions are detailed below.
0xE4 and 0xE5 cause the following character to be printed with that diacritic above it.
Japanese control characters
- 0x4A: Prints が
- 0x52: Prints the player's name.
  - In Pokémon Yellow, the default value is ゲーフリ1 in Japanese games.
- 0x53: Prints the rival's name.
  - In Pokémon Yellow, the default value is クリチャ in Japanese games.
- 0x54: Prints ポケモン in Japanese games.
- 0x59: Prints the inactive Pokémon's name in battle. (In specific circumstances, the game may "pretend" that the inactive Pokémon is actually active and vice versa.)
  - てきの in Japanese games.
- 0x5A: Prints the active Pokémon's name in battle. The default value is empty. (In specific circumstances, the game may "pretend" that the active Pokémon is actually inactive and vice versa.)
- 0x5B: Prints パソコン in Japanese games.
- 0x5C: Prints わざマシン in Japanese games.
- 0x5D: Prints トレーナー in Japanese games.
- 0x5E: Prints ロケットだん in Japanese games.
This data structure article is part of Project Games, a Bulbapedia project that aims to write comprehensive articles on the Pokémon games.