Data logging

honggarae 05/11/2021 1094

Introduction

Data is a symbolic representation of objective things: unprocessed raw material, such as graphic symbols, numbers, and letters, that is used to represent them. In other words, data are facts and concepts obtained through observation, and they describe places, events, objects, or concepts in the real world. A data record is a complete set of related information corresponding to one row in a data source. For example, all the information about one customer in a customer mailing list is a data record.
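The mailing-list example can be sketched in a few lines of Python. The column names and customer values below are invented for illustration; the only point is that each row of the source becomes one record holding all the related fields.

```python
import csv
import io

# Hypothetical customer mailing list; column names and values are made up.
MAILING_LIST_CSV = """name,street,city,postal_code
Ada Lopez,12 Harbor Road,Springfield,12345
Ben Okoro,7 Mill Lane,Riverton,67890
"""


def read_records(text):
    """Return one dict per row; each dict is a single data record."""
    return list(csv.DictReader(io.StringIO(text)))


records = read_records(MAILING_LIST_CSV)
first_record = records[0]  # everything known about one customer
print(first_record["name"], first_record["city"])
```

Here `records` is the data source as a whole, and each element of it is one data record.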

Detailed introduction

In computer science, data is the general term for all symbols, and the media carrying them, that can be input into a computer and processed by a program: numbers, letters, symbols, and analog quantities that carry a certain meaning. Data is also the most basic element of a geographic information system, where it can be classified in several ways.

By nature: ① positional data, such as coordinates; ② qualitative data, which represents the attributes of things (residential areas, rivers, roads, and so on); ③ quantitative data, which reflects measurable characteristics such as geometric quantities (length, area, volume) or physical quantities (weight, speed); ④ temporal data, which reflects time characteristics such as year, month, day, hour, minute, and second.

By form of expression: ① digital data, such as statistics or measurements; ② analog data, which is composed of continuous functions and is further divided into graphic data (points, lines, surfaces), symbol data, text data, and image data.

By recording medium: maps, tables, images, magnetic tapes, and paper tapes. By digital representation: vector data, grid (raster) data, and so on.

In a geographic information system, the choice, type, quantity, collection method, level of detail, and credibility of data depend on the system's application goals, functions, and structure, and on its data processing, management, and analysis requirements.
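The difference between vector data and grid data can be illustrated with a small sketch. It assumes an axis-aligned rectangle with made-up coordinates and a hypothetical helper `rasterize_rectangle`; real geographic software handles arbitrary shapes and coordinate systems, which this example does not attempt.

```python
# Vector form: the shape is described by its corner coordinates (hypothetical values).
rectangle_vertices = [(1, 1), (4, 1), (4, 3), (1, 3)]


def shoelace_area(vertices):
    """Area of a simple polygon computed from its vertex list (shoelace formula)."""
    total = 0
    count = len(vertices)
    for i in range(count):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % count]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2


def rasterize_rectangle(vertices, width, height):
    """Grid form: mark each unit cell covered by an axis-aligned rectangle with 1."""
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    x_min, x_max, y_min, y_max = min(xs), max(xs), min(ys), max(ys)
    return [
        [1 if x_min <= col < x_max and y_min <= row < y_max else 0
         for col in range(width)]
        for row in range(height)
    ]


grid = rasterize_rectangle(rectangle_vertices, width=6, height=5)
vector_area = shoelace_area(rectangle_vertices)   # computed from coordinates
grid_area = sum(sum(row) for row in grid)         # computed by counting cells
```

Both forms describe the same region: the vector form stores four coordinate pairs, while the grid form stores one value per cell.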

There is no uniform definition of the term data warehouse. The well-known data warehouse expert W. H. Inmon gave the following description in his book "Building the Data Warehouse": a data warehouse is a subject-oriented, integrated, relatively stable (non-volatile) collection of data that reflects historical changes (time-variant) and is used to support management decision-making. The concept can be understood on two levels. First, the data warehouse supports decision-making and is oriented to analytical data processing, which distinguishes it from the enterprise's existing operational databases. Second, the data warehouse effectively integrates multiple heterogeneous data sources; after integration, the data is reorganized by subject and includes historical data, and data stored in the warehouse is generally no longer modified.

Data warehouse

Following this definition, a data warehouse has four characteristics:

Subject-oriented

Data in an operational database is organized around transaction-processing tasks, and each business system is separate from the others. Data in a data warehouse is instead organized by subject area. A subject is an abstract concept: it refers to a key aspect that users care about when using the data warehouse to make decisions. One subject usually relates to several operational information systems.

Integrated

Transaction-oriented operational databases are usually tied to specific applications; they are independent of one another and often heterogeneous. Data in the data warehouse is obtained by extracting and cleaning the original scattered database data and then systematically processing, summarizing, and organizing it. Inconsistencies in the source data must be eliminated so that the information in the data warehouse is consistent, global information about the entire enterprise.
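A minimal sketch of this integration step follows. The two source systems, their field names, and the code tables are all hypothetical; the example only shows how differently encoded rows can be cleaned into one consistent form before they are stored.

```python
# Two hypothetical operational sources that describe customers differently.
crm_rows = [
    {"cust_id": 1, "gender": "M", "country": "US"},
    {"cust_id": 2, "gender": "F", "country": "DE"},
]
billing_rows = [
    {"customer": "0002", "sex": "female", "nation": "Germany"},
    {"customer": "0003", "sex": "male", "nation": "United States"},
]

# Assumed lookup tables that map every source encoding to one warehouse code.
GENDER_CODES = {"m": "M", "male": "M", "f": "F", "female": "F"}
COUNTRY_CODES = {"us": "US", "united states": "US", "de": "DE", "germany": "DE"}


def clean(customer_id, gender, country):
    """Convert one source row into the unified warehouse representation."""
    return {
        "customer_id": int(customer_id),
        "gender": GENDER_CODES[gender.strip().lower()],
        "country": COUNTRY_CODES[country.strip().lower()],
    }


def integrate(crm, billing):
    """Merge both sources, keeping one consistent record per customer."""
    unified = {}
    for row in crm:
        record = clean(row["cust_id"], row["gender"], row["country"])
        unified[record["customer_id"]] = record
    for row in billing:
        record = clean(row["customer"], row["sex"], row["nation"])
        unified.setdefault(record["customer_id"], record)  # CRM row wins on overlap
    return [unified[key] for key in sorted(unified)]


warehouse_customers = integrate(crm_rows, billing_rows)
```

Customer 2 appears in both sources under different encodings but ends up as a single record. Letting the CRM row win on overlap is an arbitrary choice made for this sketch.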

Relatively stable

Data in an operational database is usually updated in real time and changes as needed. Data in a data warehouse is mainly used for enterprise decision analysis, so the operations involved are mainly queries. Once data enters the data warehouse, it is generally retained for a long time. In other words, a data warehouse sees a large number of query operations but few modifications and deletions, and usually needs only periodic loading and refreshing.

Reflect historical changes

Operational databases are mainly concerned with data from a particular period of time, while data in a data warehouse usually contains historical information. The system records the enterprise's information at each stage from some point in the past (such as the time the data warehouse went into use) up to the present. With this information, the enterprise's development and future trends can be analyzed quantitatively and forecast.
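The last two characteristics, relative stability and the reflection of historical changes, can be sketched together using Python's built-in sqlite3 module. The table name `sales_snapshot`, its columns, and the figures are assumptions made for this example; the idea is that each periodic load appends dated rows instead of overwriting earlier ones, so history stays available to queries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE sales_snapshot (
        load_date TEXT NOT NULL,
        region    TEXT NOT NULL,
        revenue   REAL NOT NULL,
        PRIMARY KEY (load_date, region)
    )
""")


def load_batch(connection, load_date, rows):
    """Periodic load: append new dated rows, never update earlier ones."""
    with connection:  # commit the whole batch or nothing
        connection.executemany(
            "INSERT INTO sales_snapshot VALUES (?, ?, ?)",
            [(load_date, region, revenue) for region, revenue in rows],
        )


# Hypothetical month-end loads.
load_batch(conn, "2021-01-31", [("north", 100.0), ("south", 80.0)])
load_batch(conn, "2021-02-28", [("north", 120.0), ("south", 70.0)])

# Analytical query: how one region's revenue developed over time.
history = conn.execute(
    "SELECT load_date, revenue FROM sales_snapshot "
    "WHERE region = ? ORDER BY load_date",
    ("north",),
).fetchall()
```

An operational system would typically keep only the latest figure per region; here the January row is still present after the February load.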

An enterprise data warehouse is built on the enterprise's existing business systems and its large accumulation of business data. The data warehouse is not a static concept. Information is useful and meaningful only when it reaches the users who need it in time for them to make decisions that improve business operations. The fundamental task of the data warehouse is to organize, summarize, and restructure information and deliver it promptly to the relevant management decision-makers. From the industry's perspective, therefore, building a data warehouse is both a project and a process.

A database is a collection of data organized according to a data model and stored in secondary storage. Such a collection has the following characteristics: it avoids duplication as far as possible, it serves the various applications of a particular organization in an optimal way, its data structure is independent of the applications that use it, and the insertion, deletion, modification, and retrieval of data are managed and controlled by unified software. Historically, the database is an advanced stage of data management that developed from the file management system.

The basic structure of a database is divided into three levels, which reflect three different perspectives on the database.

(1) Physical data layer. This is the innermost layer of the database: the collection of data actually stored on physical storage devices. This data is the raw data, the object that users ultimately process, and it consists of the bit strings, characters, and words handled by the instructions described in the internal schema.

(2) Conceptual data layer. This is the middle layer and the overall logical representation of the database. It specifies the logical definition of each data item and the logical relationships among them, and is a collection of stored records. It concerns the logical relationships of all objects in the database rather than their physical arrangement, and it is the database as seen by the database administrator.

(3) Logical data layer. This is the database that users see and use. It represents the collection of data used by one or more specific users, that is, a collection of logical records.
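The three levels can be loosely illustrated with Python's built-in sqlite3 module. In this sketch the table definition stands in for the conceptual layer, a view stands in for the logical layer seen by one group of users, and the physical layer is whatever storage SQLite manages internally (here an in-memory database). The table, view, and employee data are hypothetical, and the mapping to the three levels is an approximation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # physical storage is handled by SQLite
conn.executescript("""
    -- Conceptual layer: the overall logical definition of the data.
    CREATE TABLE employee (
        id         INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        department TEXT NOT NULL,
        salary     REAL NOT NULL
    );
    INSERT INTO employee (name, department, salary) VALUES
        ('Ada',  'sales',   5200.0),
        ('Ben',  'sales',   4800.0),
        ('Chen', 'support', 4500.0);

    -- Logical layer: what one specific group of users sees and uses.
    CREATE VIEW sales_directory AS
        SELECT name, department FROM employee WHERE department = 'sales';
""")

cursor = conn.execute("SELECT * FROM sales_directory ORDER BY name")
user_columns = [column[0] for column in cursor.description]
user_rows = cursor.fetchall()
```

Users of `sales_directory` see neither the salary column nor the support staff, and they do not need to know how the rows are stored.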

The different levels of the database are connected through mappings. A database has the following main features:

(1) Data sharing. Data sharing means that all users can access the data in the database at the same time, and also that users can use the database in various ways through interfaces.

(2) Reduced data redundancy. Compared with a file system, a database shares data, so users do not need to create their own application files. This eliminates a large amount of duplicate data, reduces redundancy, and helps maintain data consistency.

(3) Data independence. Data independence includes logical independence (the logical structure of the database and the application programs are independent of each other) and physical independence (changes in the physical structure of the data do not affect its logical structure).

(4) Centralized control of data. Under file management, data is scattered, and the files of different users, or of the same user in different processing tasks, have no relationship to one another. A database allows data to be controlled and managed centrally, and the data model expresses how the data is organized and how data items relate to each other.

(5) Data consistency and maintainability, which ensure data security and reliability. This mainly includes: ① security control, to prevent data loss, incorrect updates, and unauthorized use; ② integrity control, to ensure the correctness, validity, and compatibility of data; ③ concurrency control, to allow multiple users to access the data within the same period while preventing abnormal interactions between them; ④ fault detection and recovery: the database management system provides methods to detect and repair faults promptly, preventing data from being destroyed.
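Integrity control and recovery can be sketched with Python's built-in sqlite3 module. The `account` table, the `transfer` helper, and the balances are hypothetical. The sketch assumes a CHECK constraint as the integrity rule and relies on the connection's context manager to roll back a failed transaction; it does not cover concurrency or security control.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE account (
        id      INTEGER PRIMARY KEY,
        balance REAL NOT NULL CHECK (balance >= 0)  -- integrity rule
    )
""")
with conn:
    conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100.0), (2, 50.0)])


def transfer(connection, source, target, amount):
    """Move an amount atomically; return False if the transfer was rolled back."""
    try:
        with connection:  # commits on success, rolls back on an exception
            connection.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, target),
            )
            connection.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, source),
            )
        return True
    except sqlite3.IntegrityError:
        return False


accepted = transfer(conn, 1, 2, 30.0)    # valid transfer
rejected = transfer(conn, 1, 2, 500.0)   # would make a balance negative
balances = dict(conn.execute("SELECT id, balance FROM account"))
```

In the second call the credit to account 2 has already been applied when the debit violates the constraint, so the rollback undoes it and the stored data stays consistent.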
