Next
Previous
Contents
2. Difficulties of Using Chinese on Linux Systems

This section attempts a general description of the obstacles you may meet when using Chinese on Linux, so that you can find the key points more easily when you run into these problems. As a matter of fact, the shortcomings described here appear not only on Linux but on other systems as well; we can even say that the whole computing environment is affected. If this section is not to your taste, or you are eager to act directly, you can jump to the section Display and Input Chinese!

As we all know, a Chinese character is composed of two bytes in a computer. The most popular encodings are BIG5, used in Taiwan, and GB, used in mainland China. The first byte of each character almost always has a numeric value of 128 or greater, which is what we call a non-ASCII code. (ASCII codes are the codes smaller than 128.)

So what? Here is the point. For various reasons, many early programs did not consider the possibility of non-ASCII codes appearing in their input. Such programs assume that the data they manipulate is limited to the ASCII range. Worst of all, when they do meet non-ASCII codes, their most frequent reaction is to assume such codes cannot exist and to truncate the 8th bit. This is the so-called 8-bit clean problem. Your program, for example, may take it for granted that all of its input is 7-bit ASCII. When you enter Chinese characters, it erases the 8th bit, so Chinese input always turns into garbage.

Communication programs on the Internet can usually transmit only 7-bit data. A notorious example is the earlier [...]. This problem seems to be more complicated on the Internet.
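
The truncation of the 8th bit described above can be reproduced in a few lines. This is only a minimal sketch, not code from any particular program: the function name `strip_eighth_bit` is hypothetical, and it assumes Python's standard `big5` codec as a stand-in for the BIG5 encoding.

```python
# Sketch of what a program that is not 8-bit clean does to BIG5 text.
# strip_eighth_bit is a hypothetical name used only for this illustration.

def strip_eighth_bit(data: bytes) -> bytes:
    """Clear the high (8th) bit of every byte, as a 7-bit-only program would."""
    return bytes(b & 0x7F for b in data)


def hex_dump(data: bytes) -> str:
    """Format bytes as space-separated hexadecimal values."""
    return " ".join(f"{b:02x}" for b in data)


raw = "中文".encode("big5")        # two bytes per character
damaged = strip_eighth_bit(raw)    # every byte is now below 128

print("original bytes:", hex_dump(raw))
print("after stripping:", hex_dump(damaged))
print("shown as ASCII:", damaged.decode("ascii"))
```

Once the high bit is gone, every byte falls in the ASCII range, so the receiving side sees ordinary punctuation and letters and has no way to tell that the data was ever Chinese.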
Even if you and your receivers both have machines installed with [...]

Applications that cannot identify the Chinese encoding are also a major problem, apart from those unable to handle non-ASCII data. That is, most programs (even those that handle 8-bit data correctly) treat a Chinese character as two individual bytes. This causes no trouble under some conditions, but it is disastrous in others. The most obvious case: even if you can input Chinese characters properly, when you hit the backspace key once to delete a whole character, the character is split in two, because only one byte (one column) is erased on the screen and the remaining half becomes garbage. Moreover, some text editors may break a line at the second byte of a Chinese character, which also produces garbage. Besides, these editors may treat a long Chinese sentence as a long English sentence and never break it onto a new line, making the screen ugly and chaotic.

There are worse matters, too! Some Chinese characters contain special codes that carry a particular meaning for some applications, and these may make such programs malfunction severely, or simply crash, when they meet those codes.

The sections below propose some solutions, but they are fragmentary, incomplete, and unsatisfactory. Perhaps these problems will be truly resolved only when all software can handle Chinese. However, more and more programs have recognized the importance of internationalization; for example, most hosts'
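
The double-byte problems described in this section (a backspace that removes only half a character, and characters that contain special codes) can be sketched as follows. This is an illustration under stated assumptions, not how any particular editor works: the helper names are hypothetical, and Python's standard `big5` codec stands in for the BIG5 encoding. The "special code" shown is one well-known case, where the second byte of a BIG5 character falls in the ASCII range and equals the backslash (0x5C).

```python
# Illustrative sketch only; all helper names are hypothetical.

def backspace_bytewise(buf: bytes) -> bytes:
    """What a byte-oriented program does: drop the last byte only."""
    return buf[:-1]


def backspace_charwise(buf: bytes, encoding: str = "big5") -> bytes:
    """What an encoding-aware program does: drop the last whole character."""
    return buf.decode(encoding)[:-1].encode(encoding)


def ascii_range_trail_bytes(text: str, encoding: str = "big5"):
    """List (character, second byte) pairs whose second byte is below 128."""
    hits = []
    for ch in text:
        encoded = ch.encode(encoding)
        if len(encoded) == 2 and encoded[1] < 0x80:
            hits.append((ch, encoded[1]))
    return hits


buf = "中文".encode("big5")

# Byte-wise backspace leaves a dangling first byte that no longer decodes.
half = backspace_bytewise(buf)
try:
    half.decode("big5")
except UnicodeDecodeError:
    print("byte-wise backspace left half a character:", half.hex())

# Character-wise backspace removes both bytes of the last character.
print("character-wise backspace leaves:", backspace_charwise(buf).hex())

# Second bytes in the ASCII range can look like special characters
# (for example a backslash) to a program that works byte by byte.
for ch, trail in ascii_range_trail_bytes("功能"):
    print("second byte is ASCII", hex(trail), "=", repr(chr(trail)))
```

A program that knows the encoding works on whole characters and avoids both problems; a program that works byte by byte sees a stray half character in the first case and what looks like an escape character in the second.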