Description

ISO-8859-1 is a single-byte encoding. Software that treats it as a transparent 8-bit container, mapping each byte from 0x00 to 0xFF to the code point of the same value, can accept any byte sequence produced by any other encoding without rejecting or discarding a single byte.
In other words, you can treat any foreign byte stream as if it were ISO-8859-1 without losing data. This is part of why the Latin-1 default in older MySQL versions appeared to work for any text: when the client connection is also set to Latin-1, the server simply stores the bytes it receives. ASCII is a 7-bit bucket; ISO-8859-1 is an 8-bit bucket.
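The "8-bit bucket" property can be checked directly. The following is a minimal Python sketch, assuming Python's built-in "latin-1" codec as the transparent container (other runtimes may treat the range 0x80 to 0x9F differently); the variable names are illustrative only.

```python
# Every possible single-byte value, 0x00 through 0xFF.
all_bytes = bytes(range(256))

# Python's "latin-1" codec maps byte N to code point N, so decoding never fails.
as_text = all_bytes.decode("latin-1")
round_trip = as_text.encode("latin-1")

print(len(as_text))             # 256
print(round_trip == all_bytes)  # True

# ASCII is only a 7-bit bucket: bytes above 0x7F are rejected.
try:
    all_bytes.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False
print(ascii_ok)                 # False
```

Because the round trip is exact for all 256 values, any byte stream survives being labelled as Latin-1, whatever encoding actually produced it.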
Is it OK to store Chinese in Latin-1? You can, but you shouldn’t. Once you do, the database no longer knows anything about the real characters: sorting, comparison, length calculations and string functions all operate on the individual bytes and return nonsense. For example, the UTF-8 sequence for “中” is 0xE4B8AD (three bytes). Inserted into a Latin-1 column, MySQL does not see one Chinese character; it sees three separate Latin-1 code points (0xE4, 0xB8, 0xAD). The on-disk value is still 0xE4B8AD, so nothing is lost, but you must remember to interpret those bytes as UTF-8 when you read them back, or you will see only mojibake.
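To make the loss of character semantics concrete, here is a small Python sketch. It imitates what a Latin-1 column "sees" by decoding the UTF-8 bytes with Python's "latin-1" codec; it illustrates the principle and is not a reproduction of MySQL's own string functions.

```python
utf8_bytes = "中".encode("utf-8")
print(utf8_bytes.hex())                  # e4b8ad

as_latin1 = utf8_bytes.decode("latin-1")  # the Latin-1 view of the bytes
as_utf8 = utf8_bytes.decode("utf-8")      # the correct view

print(len(as_latin1))                    # 3
print(len(as_utf8))                      # 1
print([hex(ord(c)) for c in as_latin1])  # ['0xe4', '0xb8', '0xad']

# The bytes themselves are intact as long as nothing modifies the text.
print(as_latin1.encode("latin-1") == utf8_bytes)  # True

# A "harmless" string function changes the bytes: 0xE4 (ä) is uppercased to 0xC4 (Ä).
shouted = as_latin1.upper().encode("latin-1")
print(shouted.hex())                     # c4b8ad
```

The last step shows why string functions are dangerous here: after uppercasing, the stored bytes are no longer the UTF-8 encoding of “中”, so the value can no longer be recovered by reinterpreting it.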
This tool reverses the mistake: it first turns the “Latin” characters back into raw bytes (shown as hex), then lets you pick the correct charset to decode the original text. Take the character “中”: its UTF-8 bytes are 0xE4B8AD; if those bytes are mistakenly rendered as ISO-8859-1, you see “ä¸” followed by an invisible soft hyphen (U+00AD). To restore it, encode those three characters as ISO-8859-1 to recover E4B8AD, then decode those bytes as UTF-8 to get the original “中”.
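The two steps (characters back to bytes, then bytes decoded with a chosen charset) can be sketched in Python as follows. The helper names `to_hex` and `repair` are hypothetical and only illustrate the idea; they are not the tool's actual implementation.

```python
def to_hex(garbled: str) -> str:
    """Step 1: turn the mis-decoded characters back into raw bytes, shown as hex."""
    return garbled.encode("latin-1").hex().upper()

def repair(garbled: str, charset: str = "utf-8") -> str:
    """Step 2: decode the recovered bytes with the charset the text was really in."""
    raw = garbled.encode("latin-1")
    return raw.decode(charset)

# UTF-8 bytes of "中" misread as ISO-8859-1: ä, ¸, and a soft hyphen.
garbled_utf8 = "\u00e4\u00b8\u00ad"
print(to_hex(garbled_utf8))        # E4B8AD
print(repair(garbled_utf8))        # 中

# The same character stored as GBK needs a different target charset.
garbled_gbk = "中".encode("gbk").decode("latin-1")
print(to_hex(garbled_gbk))         # D6D0
print(repair(garbled_gbk, "gbk"))  # 中
```

Choosing the wrong charset in step 2 either raises a decoding error or yields different garbage, which is why the hex view is useful for judging which encoding the bytes were really written in.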
C# snippet for fixing mojibake:
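using System;
using System.Text;
// Input is the mojibake ("ä¸" plus U+00AD), i.e. the UTF-8 bytes of "中" misread as ISO-8859-1.
byte[] raw = Encoding.GetEncoding("ISO-8859-1").GetBytes("\u00E4\u00B8\u00AD"); // E4 B8 AD
string result = Encoding.UTF8.GetString(raw); // reinterpret the bytes as UTF-8
Console.WriteLine(result);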
Mojibake in a nutshell
  • What you call “ISO-8859-1 garbage” is almost never real ISO-8859-1; it is a UTF-8 (or GBK) byte stream that was decoded as ISO-8859-1 by your terminal, browser or database.
  • To fix it, reverse the error: encode the text back to bytes using ISO-8859-1, then decode those bytes with the actual encoding.
  • As long as the raw bytes are still intact, the process is lossless. If a lossy step has already run, for example one that replaced unrepresentable characters with “?” or U+FFFD and then re-encoded the text, the original bytes may be gone for good.
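Whether a given string is still recoverable can be probed before attempting a repair. This is a rough Python sketch; `is_repairable` is a hypothetical helper, and a True result only means the reversal is mechanically possible, not that the guessed charset is the right one.

```python
def is_repairable(garbled: str, charset: str = "utf-8") -> bool:
    """Return True if the text maps back to bytes that are valid in `charset`."""
    try:
        garbled.encode("latin-1").decode(charset)
        return True
    except (UnicodeEncodeError, UnicodeDecodeError):
        return False

# Intact case: UTF-8 bytes merely mislabelled as Latin-1.
intact = "中".encode("utf-8").decode("latin-1")
print(is_repairable(intact))             # True

# Lossy case: a decoder already replaced each unknown byte with U+FFFD.
damaged = "中".encode("utf-8").decode("ascii", errors="replace")
print(damaged == "\ufffd\ufffd\ufffd")   # True
print(is_repairable(damaged))            # False
```

In the lossy case the three original bytes were each swapped for the same replacement character, so no charset choice can bring them back. Note that pure ASCII text also passes the check trivially, so it is best used on strings that visibly look garbled.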