Data encoding

honggarae 14/10/2021 1740

Introduction

Data encoding is the key to computer processing. Different information records should use different codes, and one code can represent one information record. Because the data and information processed by a computer are very complex, the meanings of some data items are difficult to remember. To make them easy to use and remember, the processed objects are often coded, so that a code represents a piece of information or a string of data. Data encoding is very important in computer-based management, because it makes operations such as information classification, verification, aggregation, and retrieval easy to perform. People can use codes to identify each record, distinguish processing methods, and classify and check data, thereby overcoming the drawback of items that vary in length and form, saving storage space, and improving processing speed.

Coding requirements

Data coding should be systematic, standardized, practical, extensible, and efficient.

Encoding establishes internal connections between data items, which makes them convenient for a computer to identify and manage. The main form of data coding in a geographic information system is geocoding, which serves the analysis of spatial information. It is a coding method established to identify the location and attributes of graphical points, lines, areas, or grid cells, and it includes topological coding and coordinate coding. The former expresses the logical adjacency relationships between spatial data positions; the latter is a measurement of spatial data positions in a given coordinate system, which can be implicit (for grid data) or explicit.

Purpose

The main purpose of encoding is to reduce the amount of information to be handled, because the volume of data affects processing efficiency and accuracy. Low efficiency is mainly due to the large number of characters used for names or descriptions, which take a lot of time to report, enter, identify, and understand. More importantly, there must be enough space to store those characters and numbers. This inefficiency has a great impact on both manual operations and computer processing. In addition, to improve the accuracy of computer processing, the definition of data items must be standardized. A well-designed coding structure can solve these problems. For example, a three-digit code from 000 to 999 uniquely and concisely identifies 1000 different entries and occupies significantly less space than a description of each entry in words.

In addition to improving processing efficiency and accuracy, a coding structure can be used to express a specific meaning. For example, a person's ID number can indicate the province, city, birth date, gender, and so on. A computer can then sort, summarize, count, and analyze the data about this person according to a prescribed algorithm.

Encoding is necessary in offline batch processing and online query systems. The structure of the encoding may be very complicated, but it is very important for realizing a modern information processing system.

Types of encoding structure

There are many methods for encoding. It is very important to choose an appropriate encoding structure. The following encoding structures are often used.

Sequence coding

Each item in the sequence represents a piece of information, and its possible structure is expressed as follows:

001 Wang Lin

002 Zhang San

003 Li Si

.....

500 Li Ming


Advantages of the sequential coding scheme:

1. It is the most frequently used structure, because it is simple;

2. The codes are short and unique;

3. When the person querying knows the code, lookup and comparison are convenient: it is enough to find the place where the code appears;

4. Management is simple and convenient.

Disadvantages of the sequential coding scheme: it has no logical basis, so apart from the order in the list it carries no other useful information; it is not fault-tolerant; and changes can only be made by appending after the last record.
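The sequential scheme above can be sketched in a few lines of Python. This is only an illustration: the function name `build_sequence_codes`, the `width` parameter, and the appended entry are assumptions made for the example, not part of any particular system.

```python
def build_sequence_codes(names, width=3):
    """Assign consecutive zero-padded codes (001, 002, ...) in list order."""
    if len(names) > 10 ** width - 1:
        raise ValueError("too many entries for the chosen code width")
    return {f"{i:0{width}d}": name for i, name in enumerate(names, start=1)}


codes = build_sequence_codes(["Wang Lin", "Zhang San", "Li Si"])
print(codes["002"])  # lookup by code is a direct dictionary access

# The code itself says nothing about the person, and a new entry
# can only be appended after the last code in use.
next_code = f"{len(codes) + 1:03d}"
codes[next_code] = "New Entry"  # hypothetical record
print(next_code)
```

The sketch shows both sides of the scheme: lookup by a known code is trivial, but the code carries no information beyond its position in the list.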

Classification code

A classification code works much like an ID number: the code is divided into several smaller fields, each of which carries a specific meaning. Together, these fields must be able to represent all of the data.

The advantages of classification coding:

1. The value and position of each field indicate a specific meaning;

2. The structure is convenient for information processing: each field can easily be retrieved, operated on, analyzed, and sorted;

3. Categories within a field are easy to extend, unless the field has reached its maximum capacity;

4. Classification fields can be added to or removed from the coding structure.

Disadvantages of classification coding: the length of the code is determined by the nature of the classification, which can make the code long; in many cases, code values go unused; and when the structure needs to be modified, it may cause system maintenance problems.
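As a rough illustration of a classification code, the sketch below splits a code into fixed-position fields. The 11-digit layout (2-digit region, 8-digit birth date, 1-digit gender) is purely hypothetical and chosen only for the example; it is not the layout of any real ID number.

```python
from datetime import date

# Hypothetical layout: (field name, start index, end index).
LAYOUT = [("region", 0, 2), ("birth_date", 2, 10), ("gender", 10, 11)]


def parse_code(code):
    """Split a fixed-width classification code into its named fields."""
    if len(code) != 11 or not code.isdigit():
        raise ValueError("expected an 11-digit code")
    fields = {name: code[start:end] for name, start, end in LAYOUT}
    b = fields["birth_date"]
    # The position of each digit determines its meaning (YYYYMMDD here).
    fields["birth_date"] = date(int(b[:4]), int(b[4:6]), int(b[6:8]))
    return fields


record = parse_code("11199003071")
print(record["region"], record["birth_date"], record["gender"])
```

Because each field sits at a fixed position, records can be sorted or grouped by region or birth date without any lookup table, which is the main benefit listed above.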

Common encoding schemes

Common data encoding schemes include unipolar code, polar code, bipolar code, return-to-zero code, bi-phase code, non-return-to-zero code, Manchester encoding, differential Manchester encoding, multi-level coding, and 4B/5B coding.

Unipolar code

In this coding scheme, only a positive (or only a negative) voltage is used to represent data. Unipolar code is used in teletypewriter (TTY) interfaces and in TTY-compatible PC interfaces. This code requires a separate clock signal for timing; otherwise, when a long string of 0s or 1s is transmitted, the clocks of the transmitter and receiver cannot stay synchronized. The noise immunity of unipolar code is poor.
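The timing problem can be made concrete with a small sketch. It assumes that "1" is sent as a positive voltage and "0" as zero volts; the 5.0 V level and the function names are illustrative choices only.

```python
def unipolar_encode(bits, high=5.0):
    """Map each bit to one level: '1' -> +high volts, '0' -> 0 volts (assumed mapping)."""
    if set(bits) - {"0", "1"}:
        raise ValueError("bits must contain only '0' and '1'")
    return [high if b == "1" else 0.0 for b in bits]


def count_transitions(levels):
    """Count level changes between consecutive bit periods."""
    return sum(1 for a, b in zip(levels, levels[1:]) if a != b)


signal = unipolar_encode("11110000")
print(signal)
print(count_transitions(signal))  # a single level change across eight bits
```

With so few level changes, a receiver has nothing to align its clock to during the long runs, which is why a separate clock signal is needed.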

Polar code

In this kind of encoding, positive and negative voltages are used to represent the binary digits "0" and "1" respectively. The level difference of this code is larger than that of unipolar code, so it has better immunity to interference, but it still requires a separate clock signal.

Bipolar code

In a bipolar code, the signal varies among three levels: positive, negative, and zero. A typical bipolar code is alternate mark inversion (AMI). In an AMI signal, each "1" in the data stream makes the level flip alternately between positive and negative, while each "0" keeps the signal at zero level.
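A minimal sketch of the AMI rule follows. Levels are written as +1, 0, and -1, and starting with positive polarity for the first "1" is an assumption of this example.

```python
def ami_encode(bits):
    """Encode a bit string with alternate mark inversion (AMI)."""
    levels = []
    polarity = 1  # polarity of the next "1"; starting positive is an assumption
    for b in bits:
        if b == "1":
            levels.append(polarity)
            polarity = -polarity  # successive 1s alternate in sign
        elif b == "0":
            levels.append(0)  # 0 is always sent as zero level
        else:
            raise ValueError("bits must contain only '0' and '1'")
    return levels


print(ami_encode("1011001"))  # [1, 0, -1, 1, 0, 0, -1]
```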

Return to Zero Code

In return-to-zero (RZ) code, the signal returns to zero level in the middle of each symbol. For example, a transition from positive level to zero level represents the symbol "0", and a transition from negative level to zero level represents the symbol "1".
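The RZ mapping described above can be sketched by representing each bit as two half-bit samples. The levels +1, 0, and -1 and the two-samples-per-bit representation are assumptions made for illustration.

```python
# Each bit becomes two half-bit samples; the second half is always zero.
RZ_HALVES = {"0": (1, 0), "1": (-1, 0)}


def rz_encode(bits):
    """Encode a bit string as return-to-zero half-bit samples."""
    samples = []
    for b in bits:
        if b not in RZ_HALVES:
            raise ValueError("bits must contain only '0' and '1'")
        samples.extend(RZ_HALVES[b])
    return samples


print(rz_encode("010"))  # [1, 0, -1, 0, 1, 0]
```

Because every bit ends at zero level, the signal changes in every bit period, at the cost of needing twice as many signal changes as bits.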

Bi-phase code

Bi-phase code requires a level transition within each bit. The biggest advantage of this code is therefore that it is self-clocking. Bi-phase code can also detect errors: if there is no level transition in the middle of a bit, the bit is treated as an illegal code.
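The error check can be sketched as follows, assuming the received signal has already been sampled twice per bit (once in each half). The function name and the sample representation are illustrative assumptions.

```python
def find_illegal_bits(half_samples):
    """Return the indices of bit periods that lack a mid-bit transition."""
    if len(half_samples) % 2:
        raise ValueError("expected two samples per bit")
    return [
        i // 2
        for i in range(0, len(half_samples), 2)
        if half_samples[i] == half_samples[i + 1]  # both halves equal: no transition
    ]


# Bit 2 has the same level in both halves, so it is flagged.
print(find_illegal_bits([1, 0, 0, 1, 1, 1, 0, 1]))  # [2]
```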

Non-Return to Zero Level Coding

In non-return-to-zero level (NRZ-L) coding, the zero level is not used: a positive level represents "1", and a negative level represents "0".

Non-Return to Zero Inverted Code

In non-return-to-zero inverted (NRZ-I) coding, the level is inverted when a "1" appears and stays unchanged when a "0" appears. This code is also called differential code.
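A minimal sketch of NRZ-I encoding and decoding is shown below. Levels are written as +1 and -1, and the idle level before the first bit (`start_level`) is an assumption that both ends must share.

```python
def nrzi_encode(bits, start_level=-1):
    """Invert the level for every '1'; keep it unchanged for every '0'."""
    level = start_level
    levels = []
    for b in bits:
        if b == "1":
            level = -level
        elif b != "0":
            raise ValueError("bits must contain only '0' and '1'")
        levels.append(level)
    return levels


def nrzi_decode(levels, start_level=-1):
    """A change of level means '1'; no change means '0'."""
    bits = []
    prev = start_level
    for level in levels:
        bits.append("1" if level != prev else "0")
        prev = level
    return "".join(bits)


encoded = nrzi_encode("10110")
print(encoded)               # [1, 1, -1, 1, 1]
print(nrzi_decode(encoded))  # 10110
```

Since only changes of level carry information, the decoder still works if the polarity of the whole signal is inverted along with the assumed starting level.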

Manchester code

In Manchester code, also known as phase encoding (PE), a transition from high level to low level represents "0", and a transition from low level to high level represents "1". The level transition in the middle of each bit serves as both the data signal and the timing signal. Manchester encoding is used in Ethernet.
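Using the convention stated above (high to low for "0", low to high for "1"), Manchester encoding can be sketched with two half-bit samples per bit. Writing the levels as 1 (high) and 0 (low) is an assumption of this example.

```python
# First half and second half of each bit period.
MANCHESTER_HALVES = {"0": (1, 0), "1": (0, 1)}


def manchester_encode(bits):
    """Encode a bit string as Manchester half-bit samples."""
    samples = []
    for b in bits:
        if b not in MANCHESTER_HALVES:
            raise ValueError("bits must contain only '0' and '1'")
        samples.extend(MANCHESTER_HALVES[b])
    return samples


def manchester_decode(samples):
    """Recover bits from the direction of each mid-bit transition."""
    if len(samples) % 2:
        raise ValueError("expected two samples per bit")
    bits = []
    for i in range(0, len(samples), 2):
        pair = (samples[i], samples[i + 1])
        if pair == (1, 0):
            bits.append("0")
        elif pair == (0, 1):
            bits.append("1")
        else:
            raise ValueError(f"no mid-bit transition in bit {i // 2}")
    return "".join(bits)


encoded = manchester_encode("1001")
print(encoded)                     # [0, 1, 1, 0, 1, 0, 0, 1]
print(manchester_decode(encoded))  # 1001
```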

Differential Manchester code

Differential Manchester code is often used in local area network transmission. In differential Manchester encoding, there is a transition in the middle of every bit. A "0" is indicated by a transition at the beginning of the bit, and a "1" by the absence of a transition at the beginning of the bit. The transition in the middle of the bit serves as a clock signal.
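The rule above (a transition at the start of the bit for "0", none for "1", and always a transition in the middle) can be sketched as follows. Levels are written as +1 and -1, and the line level before the first bit (`start_level`) is an assumption of this example.

```python
def diff_manchester_encode(bits, start_level=1):
    """Encode a bit string as differential Manchester half-bit samples."""
    level = start_level  # assumed line level just before the first bit
    samples = []
    for b in bits:
        if b == "0":
            level = -level  # transition at the start of the bit
        elif b != "1":
            raise ValueError("bits must contain only '0' and '1'")
        samples.append(level)
        level = -level      # mandatory mid-bit transition
        samples.append(level)
    return samples


def diff_manchester_decode(samples, start_level=1):
    """Compare the start of each bit with the end of the previous bit."""
    if len(samples) % 2:
        raise ValueError("expected two samples per bit")
    bits = []
    prev = start_level
    for i in range(0, len(samples), 2):
        first, second = samples[i], samples[i + 1]
        if first == second:
            raise ValueError(f"no mid-bit transition in bit {i // 2}")
        bits.append("0" if first != prev else "1")
        prev = second
    return "".join(bits)


encoded = diff_manchester_encode("010")
print(encoded)                          # [-1, 1, 1, -1, 1, -1]
print(diff_manchester_decode(encoded))  # 010
```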

Multi-level coding

Each code element (symbol) can take one of several levels, so a single code element can represent several binary bits.
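As an illustration, the sketch below uses four levels so that each code element carries two bits. The specific level values and the bit-pair assignment are hypothetical choices for this example, not those of any particular standard.

```python
# Hypothetical assignment of bit pairs to four signal levels.
LEVELS = {"00": -3, "01": -1, "10": 1, "11": 3}


def four_level_encode(bits):
    """Group bits in pairs and map each pair to one of four levels."""
    if len(bits) % 2:
        raise ValueError("bit string length must be a multiple of 2")
    symbols = []
    for i in range(0, len(bits), 2):
        pair = bits[i:i + 2]
        if pair not in LEVELS:
            raise ValueError("bits must contain only '0' and '1'")
        symbols.append(LEVELS[pair])
    return symbols


print(four_level_encode("00111001"))  # [-3, 3, 1, -1]
```

Eight bits are carried by four code elements here, so the symbol rate on the line is half the bit rate.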

4B/5B encoding

This is the encoding scheme used in 100 Mbit/s Fast Ethernet and in the Fiber Distributed Data Interface (FDDI). In this scheme, the data stream to be sent is divided into groups of 4 bits, and each 4-bit group is represented by a 5-bit code. The 5-bit code is called a code group and is transmitted using NRZ-I.
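The grouping step can be sketched as below. The mapping table is the commonly published 4B/5B data-symbol table; it is not given in the text above and is reproduced here only for illustration, so it should be checked against the relevant standard before being relied on. The function names are assumptions of this example.

```python
# Commonly published 4B/5B data code groups (verify against the standard).
FOUR_B_FIVE_B = {
    "0000": "11110", "0001": "01001", "0010": "10100", "0011": "10101",
    "0100": "01010", "0101": "01011", "0110": "01110", "0111": "01111",
    "1000": "10010", "1001": "10011", "1010": "10110", "1011": "10111",
    "1100": "11010", "1101": "11011", "1110": "11100", "1111": "11101",
}


def encode_4b5b(bits):
    """Replace every 4-bit group with its 5-bit code group."""
    if len(bits) % 4:
        raise ValueError("bit string length must be a multiple of 4")
    return "".join(FOUR_B_FIVE_B[bits[i:i + 4]] for i in range(0, len(bits), 4))


def nrzi_levels(bits, start_level=-1):
    """Send the code groups with NRZ-I: invert on '1', hold on '0'."""
    level = start_level  # assumed idle level before the first bit
    levels = []
    for b in bits:
        if b == "1":
            level = -level
        levels.append(level)
    return levels


code_groups = encode_4b5b("00001111")
print(code_groups)  # 1111011101
print(nrzi_levels(code_groups))
```

With this table, no sequence of code groups contains more than three consecutive zeros, so the NRZ-I signal changes level often enough for the receiver to recover the clock, at the cost of sending 5 bits for every 4 bits of data.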
