CN114676102B - Database control method and control system

Info

Publication number: CN114676102B
Application number: CN202210229737.5A
Authority: CN
Inventors: 沈璐璐; 许峰; 聂晓崧; 祝成成; 梁雅慧; 肖丹丹
Original assignee: 711th Research Institute of CSIC
Current assignee: 711th Research Institute of CSIC
Priority date: 2022-03-10
Filing date: 2022-03-10
Publication date: 2024-05-03
Anticipated expiration: 2042-03-10
Also published as: CN114676102A

Abstract

The application discloses a database control method and a database control system, and relates to the technical field of data services. Data compression techniques commonly used on mainstream platforms are tied to particular operating systems and are not portable across operating systems. The method comprises the following steps: S1: a data reading module reads the data items in a configuration database and a basic database one by one, in sequence, to obtain metadata items; S2: a data acquisition module selectively acquires the data items read by the data reading module, keeping those that meet preset requirements; S3: a data compression module compresses the acquired data items that meet the preset requirements to form data blocks; S4: the data blocks from step S3 that enter the cache pool and meet the preset requirements are compressed again to form a compressed file. The application overcomes the performance limitations of existing domestic operating systems and chips and provides an integrated platform, based on a domestic operating system, for receiving, compressing, storing, and querying real-time data.

Description

Database control method and control system

Technical Field

The application relates to the technical field of data service, in particular to a database control method and a database control system.

Background

Real-time database software is a database control method and control system that runs independently on a real-time operating system and processes large volumes of time-critical, strictly time-ordered data, with the goal of high reliability and high timeliness.

Real-time data has the following characteristics:

the data changes follow a certain waveform pattern;

the data has high concurrency and a high sampling rate;

only a small number of measuring points change frequently;

the values of many measuring points change slowly;

value changes and time changes share common variation characteristics;

users can tolerate a loss of data precision within a certain range;

each data item has three elements: a tag, an index, and a timestamp.

Databases for real-time operating systems are mostly supplied by foreign vendors, and the domestic market lacks mature database products for domestic real-time operating systems. The real-time database management systems currently common in industry are in-memory databases, so an application-layer task that handles memory improperly can easily crash the system. Data compression techniques commonly used on mainstream platforms are tied to particular operating systems and are not portable across operating systems.

Disclosure of Invention

The application provides a database control method and a control system that aim to solve the following problems. The real-time database management systems currently common in industry are in-memory databases, so an application-layer task that handles memory improperly can easily crash the system. Data compression techniques commonly used on mainstream platforms are tied to particular operating systems and are not portable across operating systems. Data queries are likewise limited by the operating system.

In order to achieve the above object, the present application provides a database control method, comprising the steps of:

S1: the data reading module reads the data items in the configuration database and the basic database one by one, in sequence, to obtain metadata items; after each read, the method proceeds to step S2;

S2: the data acquisition module selectively acquires the metadata item: if the metadata item meets a preset acquisition requirement, it is acquired to obtain an acquired data item and the method proceeds to step S3; if it does not, the metadata item is discarded and the method returns to step S1 to read the next data item, until all data items have been processed;

S3: the data compression module selectively compresses the acquired data item: if the acquired data item meets a preset compression requirement, it is compressed to form a data block and the method proceeds to step S4; if it does not, the acquired data item is discarded and the method returns to step S1 to read the next data item, until all data items have been processed;

S4: the cache pool selectively stores the data block: if the data block meets a preset storage requirement, it is compressed again to form a compressed file, which is then stored; if it does not, the data block is discarded and the method returns to step S1 to read the next data item, until all data items have been processed.
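The control flow of steps S1 to S4 can be sketched as a single loop with three gates. This is only an illustrative sketch: the function name `run_pipeline` and all of its parameters are hypothetical, and the preset requirements are passed in as callables because the source does not define them at this point.

```python
from typing import Any, Callable, Iterable, List


def run_pipeline(
    items: Iterable[Any],
    meets_acquisition: Callable[[Any], bool],
    meets_compression: Callable[[Any], bool],
    compress_item: Callable[[Any], Any],
    meets_storage: Callable[[Any], bool],
    compress_block: Callable[[Any], Any],
) -> List[Any]:
    """Apply the S1-S4 gates to each item and return the stored files."""
    stored_files = []
    for metadata_item in items:                    # S1: read items one by one
        if not meets_acquisition(metadata_item):   # S2: discard, back to S1
            continue
        acquired = metadata_item
        if not meets_compression(acquired):        # S3: discard, back to S1
            continue
        block = compress_item(acquired)
        if not meets_storage(block):               # S4: discard, back to S1
            continue
        stored_files.append(compress_block(block))  # S4: second compression
    return stored_files
```

Each `continue` corresponds to a "discard and return to step S1" branch in the text.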

In some embodiments of the present application, the data compression module selectively compresses the acquired data items using a revolving door compression algorithm.

In some embodiments of the application, the parameters of the revolving door compression algorithm include a compression deviation, which is an absolute error value set empirically; a data item within the absolute error range is compressed, and a data item outside that range cannot be compressed.

In some embodiments of the present application, the parameters of the revolving door compression algorithm further include slopes, namely an upper slope K1, a lower slope K2, and an intermediate slope K, calculated as follows:

Upper slope K1 = (current data item value - (last saved data item value - compression deviation)) / (current data item time - last saved data item time);

Lower slope K2 = (current data item value - (last saved data item value + compression deviation)) / (current data item time - last saved data item time);

Intermediate slope K = (current data item value - value of the data item to be saved) / (current data item time - time of the data item to be saved);

The compression and storage criteria are as follows:

If K2 ≤ K ≤ K1, the data item to be saved is compressed;

If K < K2 or K > K1, the data item to be saved is stored.
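The three slopes and the storage test can be written out directly. The sketch below assumes the compression criterion is read as K2 ≤ K ≤ K1, that is, the complement of the storage criterion K < K2 or K > K1. The names `Sample`, `slopes`, and `must_store` are hypothetical, as is the requirement that timestamps strictly increase.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Sample:
    time: float
    value: float


def slopes(current: Sample, last_saved: Sample, pending: Sample,
           deviation: float) -> Tuple[float, float, float]:
    """Return (K1, K2, K) as defined by the three formulas in the text."""
    dt_saved = current.time - last_saved.time
    dt_pending = current.time - pending.time
    if dt_saved <= 0 or dt_pending <= 0:
        raise ValueError("timestamps must be strictly increasing")
    k1 = (current.value - (last_saved.value - deviation)) / dt_saved
    k2 = (current.value - (last_saved.value + deviation)) / dt_saved
    k = (current.value - pending.value) / dt_pending
    return k1, k2, k


def must_store(current: Sample, last_saved: Sample, pending: Sample,
               deviation: float) -> bool:
    """True when K < K2 or K > K1, i.e. the pending item has to be stored."""
    k1, k2, k = slopes(current, last_saved, pending, deviation)
    return k < k2 or k > k1
```

For example, with a last saved item at (0, 10), a pending item at (1, 10.2), and a deviation of 0.5, a current item at (2, 10.4) gives K between K2 and K1, so the pending item is compressed, while a current item at (2, 13.0) gives K > K1, so the pending item is stored.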

In some embodiments of the application, the steps of the revolving door compression algorithm are as follows:

S20: the data acquisition module acquires the metadata item to obtain an acquired data item;

S30: it is judged whether the acquired data item is within the dead zone range: a. if it is within the dead zone range, it is compressed and not saved; b. if it is outside the dead zone range, the method proceeds to step S40;

S40: the three slopes are calculated for the acquired data item outside the dead zone range and compared: a. if K2 ≤ K ≤ K1, the acquired data item is compressed; b. if K < K2 or K > K1, the data item information is collected and stored in the cache pool, where it is compressed to form a data block, and the compression ratio of the data block is calculated; the method then returns to step S30 until all acquired data items have been processed;

S50: the compression ratio of the data block is judged: a. if the threshold is reached, the data block is compressed; b. if the threshold is not reached, the method continues to judge the compression ratio of other data blocks.
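Steps S20 to S40 can be sketched as a streaming filter. The revolving door algorithm named here appears to correspond to what English-language literature calls swinging door trending. Several choices below are assumptions rather than details from the source: the first sample is always stored, the dead zone is measured against the last stored value, the compression criterion is read as K2 ≤ K ≤ K1, the last pending sample is flushed at the end, and the compression ratio in S50 is taken as input count divided by stored count. The function names are hypothetical.

```python
from typing import Iterable, List, Optional, Tuple

Point = Tuple[float, float]  # (time, value), times strictly increasing


def revolving_door(samples: Iterable[Point], dead_zone: float,
                   deviation: float) -> List[Point]:
    """Return the samples kept after the dead zone and slope checks."""
    stored: List[Point] = []
    last_saved: Optional[Point] = None
    pending: Optional[Point] = None
    for time, value in samples:                      # S20
        if last_saved is None:                       # assumption: keep first
            last_saved = (time, value)
            stored.append(last_saved)
            continue
        if abs(value - last_saved[1]) <= dead_zone:  # S30a: inside dead zone
            continue
        if pending is None:                          # nothing to compare yet
            pending = (time, value)
            continue
        dt_saved = time - last_saved[0]
        k1 = (value - (last_saved[1] - deviation)) / dt_saved
        k2 = (value - (last_saved[1] + deviation)) / dt_saved
        k = (value - pending[1]) / (time - pending[0])
        if not (k2 <= k <= k1):                      # S40b: store pending
            stored.append(pending)
            last_saved = pending
        pending = (time, value)                      # S40a: pending dropped
    if pending is not None:                          # assumption: final flush
        stored.append(pending)
    return stored


def compression_ratio(raw_count: int, stored_count: int) -> float:
    """Ratio used for the S50 threshold check (assumed definition)."""
    if stored_count == 0:
        raise ValueError("nothing stored")
    return raw_count / stored_count
```

On a perfectly linear series only the end points survive, while a sudden jump forces the points around the jump to be stored.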

In some embodiments of the present application, when the data block meets the preset storage requirement, it is compressed again into a compressed file using the LZ4 compression algorithm.

In some embodiments of the application, the LZ4 compression algorithm includes the steps of:

S100: the data blocks in the cache pool are scanned, and matching characters are searched for within a scanning window using a moving step of 4 bytes;

S200: the match is judged: a. if no match is found, the method returns to the matching process of step S100; b. if a match is found, matching continues backward, the match length and the character string length are calculated, and the data is output;

S300: it is judged whether scanning of the data blocks in the cache pool that meet the preset requirements is finished: a. if scanning is finished, the program terminates; b. if scanning is not finished, the scanning and matching process of step S100 is repeated.
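The scan-and-match loop of steps S100 to S300 can be illustrated with a toy compressor. The sketch below is not the LZ4 block format and is not the patent's implementation: it emits a hypothetical token list (literal byte runs and `(offset, length)` pairs) and only demonstrates 4-byte match lookup, extension of a match, and a configurable step when no match is found. A real system would call an LZ4 library instead.

```python
from typing import List, Tuple, Union

MIN_MATCH = 4  # matches are looked up on 4-byte sequences
Token = Union[bytes, Tuple[int, int]]  # literal run or (offset, length)


def lz_compress(data: bytes, step: int = 4) -> List[Token]:
    """Scan data, replacing repeated sequences with back-references."""
    table = {}                 # 4-byte sequence -> most recent position
    tokens: List[Token] = []
    literals = bytearray()
    pos, n = 0, len(data)
    while pos + MIN_MATCH <= n:                     # S300: more to scan?
        key = data[pos:pos + MIN_MATCH]             # S100: look for a match
        candidate = table.get(key)
        table[key] = pos
        if candidate is None:                       # S200a: no match
            end = min(pos + step, n)
            literals += data[pos:end]
            pos = end
            continue
        length = MIN_MATCH                          # S200b: extend the match
        while pos + length < n and data[candidate + length] == data[pos + length]:
            length += 1
        if literals:
            tokens.append(bytes(literals))
            literals = bytearray()
        tokens.append((pos - candidate, length))
        pos += length
    literals += data[pos:]                          # trailing bytes
    if literals:
        tokens.append(bytes(literals))
    return tokens


def lz_decompress(tokens: List[Token]) -> bytes:
    """Rebuild the original bytes from the token list."""
    out = bytearray()
    for token in tokens:
        if isinstance(token, bytes):
            out += token
        else:
            offset, length = token
            start = len(out) - offset
            for i in range(length):                 # byte-wise: allows overlap
                out.append(out[start + i])
    return bytes(out)
```

A larger `step` skips more candidate positions after a miss, which is faster but finds fewer matches; this mirrors the speed versus ratio trade-off the description attributes to the 4-byte step.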

In some embodiments of the application, the selective acquisition of the metadata items by the data acquisition module comprises the following steps:

S21: the server is powered on and started, a receiving thread is created, and it is judged whether metadata items exist on the network: a. if no metadata item exists on the network, the method returns to the receiving thread and checks again; b. if metadata items exist on the network, they are checked against the internal communication protocol;

S22: it is judged whether the metadata item conforms to the internal communication protocol: a. if the metadata item does not conform to the internal communication protocol, the method returns to step S21 to continue judging whether other metadata items exist on the network; b. if the metadata item conforms to the internal communication protocol, the change of the metadata item is checked;

S23: the change of the metadata item is judged: a. if the change of the metadata item does not exceed the threshold, the method returns to step S21 to judge whether other metadata items exist on the network; b. if the change of the metadata item exceeds the threshold, the data item is written into the cache pool, and acquisition of the data item is complete.
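The decision logic of steps S21 to S23 can be sketched without any networking. In the sketch below, the frame layout (a dict with `tag` and `value` keys), the `is_valid` callable standing in for the internal communication protocol check, and the rule that the first value of each tag is always written are all assumptions; the source does not specify them.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def collect(frames: Iterable[dict], is_valid: Callable[[dict], bool],
            threshold: float) -> List[Tuple[int, float]]:
    """Return the (tag, value) pairs written to the cache pool."""
    cache_pool: List[Tuple[int, float]] = []
    last_value: Dict[int, float] = {}
    for frame in frames:              # S21: one frame per receive
        if not is_valid(frame):       # S22a: fails the protocol check
            continue
        tag, value = frame["tag"], frame["value"]
        previous = last_value.get(tag)
        if previous is not None and abs(value - previous) <= threshold:
            continue                  # S23a: change within threshold
        last_value[tag] = value       # S23b: write to the cache pool
        cache_pool.append((tag, value))
    return cache_pool
```

In a running server the `frames` iterable would be fed by the receiving thread; here it is any iterable so the filtering can be tested in isolation.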

In some embodiments of the present application, the timed thread that selectively stores the data blocks includes the following steps:

S31: the server creates a sampling thread and checks the data blocks in the cache pool;

S32: it is judged whether a data block exists in the cache pool: a. if no data block exists, the data blocks of other cache pools are checked; b. if a data block exists, its storage time is checked;

S33: the storage time of the data block is judged: a. if the storage time of 1 minute has not been reached, the method returns to step S32 to check the storage time of other data blocks; b. if the storage time of 1 minute has been reached, the data block is compressed to form a compressed file, until all data blocks have been processed.
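One pass of the check in steps S32 and S33 can be sketched as a function over the cache pool. The names and the `(created_at, payload)` layout are hypothetical. Python's standard library has no LZ4 binding, so `zlib.compress` is used purely as a stand-in for the second compression stage; any callable that maps bytes to bytes can be supplied instead.

```python
import zlib
from typing import Callable, List, Tuple

STORAGE_INTERVAL_S = 60.0  # the 1-minute storage time from step S33

Block = Tuple[float, bytes]  # (created_at in seconds, payload)


def flush_due_blocks(
    cache_pool: List[Block],
    now: float,
    interval: float = STORAGE_INTERVAL_S,
    compress: Callable[[bytes], bytes] = zlib.compress,  # stand-in for LZ4
) -> Tuple[List[bytes], List[Block]]:
    """Compress blocks that reached the interval; return (files, remaining)."""
    files: List[bytes] = []
    remaining: List[Block] = []
    for created_at, payload in cache_pool:       # S32: block present
        if now - created_at >= interval:         # S33b: storage time reached
            files.append(compress(payload))
        else:                                    # S33a: check again later
            remaining.append((created_at, payload))
    return files, remaining
```

A sampling thread would call this periodically, for instance with `time.monotonic()` as `now`, and hand the returned files to the storage layer.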

A database control system using the database control method according to any one of claims 1 to 9, the database control system comprising:

a server, in which a microprocessor, a memory, a data acquisition module, a data management module, a data compression module, a data query module, and a data reading module are arranged; the input end and the output end of the memory are communicatively connected to the input end and the output end of the microprocessor; the input and output ends of the data acquisition module, the data management module, the data compression module, the data query module, and the data reading module are each communicatively connected to the input and output ends of the microprocessor; and a first interface is provided on the server;

and a client, which is responsible for interacting with the user to complete data queries, and on which a second interface is provided that exchanges data with the first interface over TCP/IP.

Beneficial effects:

The application overcomes the performance limitations of existing domestic operating systems and chips and provides an integrated platform, based on a domestic operating system, for receiving, compressing, storing, and querying real-time data.

The application adopts a kernel architecture similar to that of an operating system, depends little on the operating system, and is compatible with domestic desktop and embedded operating systems through simple configuration.

The application optimizes the compression strategy without introducing complex calculation: the time and space complexity of the algorithm is essentially unchanged and the operating load of the system is not increased, which makes the method well suited to data compression on current monitoring platforms based on domestic operating systems.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the description of the embodiments will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

FIG. 1 is a flow chart of a database control method of the present application.

FIG. 2 is an execution process diagram.

FIG. 3 is a diagram of the data item collection thread.

FIG. 4 is a diagram of the timed data item storage thread.

FIG. 5 is a data item compression flow diagram.

FIG. 6 is a data file compression flow diagram.

FIG. 7 is a client/server data flow diagram.

FIG. 8 is a server/client two-tier architecture diagram.

Detailed Description

The following description of the embodiments of the present application will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to fall within the scope of the application.

In the description of the present application, it should be understood that the terms "center," "upper," "lower," "front," "rear," "left," "right," "vertical," "horizontal," "top," "bottom," "inner," "outer," and the like indicate orientations or positional relationships based on the orientation or positional relationships shown in the drawings, merely to facilitate describing the present application and simplify the description, and do not indicate or imply that the devices or elements referred to must have a specific orientation, be configured and operated in a specific orientation, and thus should not be construed as limiting the present application.

The terms "first," "second," and the like, are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first" or "a second" may explicitly or implicitly include one or more such feature. In the description of the present application, unless otherwise indicated, the meaning of "a plurality" is two or more.

In the description of the present application, it should be noted that, unless explicitly specified and limited otherwise, the terms "mounted," "coupled," and "connected" are to be construed broadly: a connection may be, for example, fixed, detachable, or integral; it may be mechanical; and it may be direct, indirect through an intermediate medium, or an internal communication between two components. The specific meaning of the above terms in the present application will be understood in specific cases by those of ordinary skill in the art.

The application provides a database control method and a database control system, which are respectively described in detail below. It should be noted that the following description order of the embodiments is not intended to limit the preferred order of the embodiments of the present application. In the following embodiments, the descriptions of the embodiments are focused on, and for the part that is not described in detail in a certain embodiment, reference may be made to the related descriptions of other embodiments.

The platform provides high-frequency, long-duration, reliable data storage on a domestic real-time operating system under the limited hardware resources of an embedded system. To do so, the database must solve a series of problems, including an efficient data storage mechanism, data security control, a real-time transaction management mechanism, a database recovery mechanism, and fast data compression and decompression, and must establish a data model suited to fast data access and operation. It adopts a client-server architecture. The server is responsible for data acquisition, data management, data compression, and data storage, and the client is responsible for interacting with the user to complete data queries. Data acquisition, data management, data storage, and data query each run in an independent thread. The server exchanges data with the application program over UDP. An internal interface carries data between the server and the client over TCP/IP.

The server execution process is divided into an initialization state and an operation state as shown in fig. 1 and 2.

The system is powered on and started, firstly enters an initialization state, and reads a configuration database and a basic database. The configuration database comprises configuration information such as a network, a storage threshold, a log output mode and the like; the basic database comprises configuration information such as data items, data blocks, labels and the like. The initialization mainly completes the system configuration and data management work.

Data management is achieved through data item configuration, data block configuration and tag configuration.

Specifically:

The data item attributes include: type, byte order, basic data type, data length, value flag, and alarm flag.

The data block attributes include: data block identifier, type flag, sample block start time, sample block end time, length of the sample data stored in the file, file index corresponding to the sample data, and position of the data block in the file.

The tag attributes include: ID, name, data item type, storage index, validity flag, dead zone range, and compression accuracy.
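The three configuration records can be modelled as plain data classes. The field names below follow the attribute lists above, but every type and the helper `in_dead_zone` are assumptions made for illustration; the source does not give a concrete layout.

```python
from dataclasses import dataclass


@dataclass
class DataItemConfig:
    item_type: str
    byte_order: str        # e.g. "big" or "little" (assumed encoding)
    base_type: str
    length: int            # data length in bytes (assumed unit)
    value_flag: bool
    alarm_flag: bool


@dataclass
class DataBlockInfo:
    block_id: int
    type_flag: int
    start_time: float      # sample block start time
    end_time: float        # sample block end time
    stored_length: int     # length of the sample data in the file
    file_index: int        # index of the file holding the sample data
    file_offset: int       # position of the block within the file


@dataclass
class TagConfig:
    tag_id: int
    name: str
    item_type: str
    storage_index: int
    valid: bool
    dead_zone: float
    compression_accuracy: float


def in_dead_zone(tag: TagConfig, old_value: float, new_value: float) -> bool:
    """True when the change is small enough to be ignored for this tag."""
    return abs(new_value - old_value) <= tag.dead_zone
```

Keeping the dead zone and compression accuracy on the tag lets each measuring point carry its own tolerance.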

After the initialization of the software is completed, the software automatically enters an operation state to perform data acquisition, data storage and data query tasks.

The data server receives the communication data that the application program sends to the multicast group and verifies it against the internal communication protocol to complete data acquisition. The application program and the server assess the communication quality of the Ethernet link through heartbeat messages, and keep their system clocks consistent through synchronization requests and responses.

The thread in which the data acquisition module selectively acquires metadata items is shown in fig. 3; its steps are steps S21 to S23 described above.

The timed thread that selectively stores data blocks is shown in fig. 4. Data blocks are stored on a timer, once per minute, because the system is required to preserve the data from the 2 minutes before a power failure. The steps of this thread are steps S31 to S33 described above.

Data storage compresses data items with an optimized revolving door compression algorithm and compresses data blocks with LZ4.

Real-time data changes quickly and accumulates in large volumes; if every sample were stored, disk space would be consumed rapidly. Experience shows that the system's data follows a linear trend or some other regular pattern, so the database can apply certain conditions to decide which data is negligible and need not be stored. When such data is later queried, it is reconstructed by linear or step interpolation, which greatly improves storage efficiency and saves disk space.
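Reconstruction at query time can be sketched as a lookup between the two stored samples that bracket the requested timestamp. The function name `query_value` and the `(time, value)` list layout are hypothetical; the two modes correspond to the linear and step interpolation mentioned above.

```python
from bisect import bisect_right
from typing import List, Tuple


def query_value(stored: List[Tuple[float, float]], t: float,
                mode: str = "linear") -> float:
    """Estimate the value at time t from time-sorted stored samples."""
    if mode not in ("linear", "step"):
        raise ValueError("mode must be 'linear' or 'step'")
    if not stored or t < stored[0][0] or t > stored[-1][0]:
        raise ValueError("t is outside the stored time range")
    times = [sample[0] for sample in stored]
    i = bisect_right(times, t) - 1          # last stored sample at or before t
    t0, v0 = stored[i]
    if mode == "step" or i == len(stored) - 1 or t == t0:
        return v0                           # hold the previous stored value
    t1, v1 = stored[i + 1]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
```

With stored samples `(0, 0.0)` and `(4, 4.0)`, a linear query at `t = 2` returns `2.0`, while a step query returns `0.0`.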

The parameters of the revolving door compression algorithm include a compression deviation, which is an absolute error value set empirically; a data item within the absolute error range is compressed, and a data item outside that range cannot be compressed. The parameters further include slopes, namely an upper slope K1, a lower slope K2, and an intermediate slope K, calculated as follows:

Upper slope K1 = (current data item value - (last saved data item value - compression deviation)) / (current data item time - last saved data item time)

Lower slope K2 = (current data item value - (last saved data item value + compression deviation)) / (current data item time - last saved data item time)

Intermediate slope K = (current data item value - value of the data item to be saved) / (current data item time - time of the data item to be saved)

The compression and storage criteria are as follows:

If K2 ≤ K ≤ K1, the data item to be saved is compressed.

If K < K2 or K > K1, the data item to be saved is stored.

The application optimizes the revolving door compression algorithm; the data compression steps, steps S20 to S50 described above, are shown in fig. 5.

Data query is implemented through direct communication between the server and the client: there is no middleware between them, and the server communicates with the client program through a socket. Results are returned quickly, and confusion between different clients is avoided.

The data file is not compressed immediately after a data item completes revolving door compression. The compressed data items first enter the cache pool to form data blocks, and the compression ratio is calculated; once the compression ratio reaches a set value, the data blocks are compressed with the LZ4 algorithm into data files, which are written to the database.

The LZ4 compression algorithm is a classical lossless compression algorithm. During compression, a scanning window looks for matches on 4-byte sequences, advancing by at least 1 byte per scan, and repeated sequences are compressed when found. The scanning step is adjustable; this scheme uses a step of 4 bytes, which increases compression and decompression speed at the cost of a slightly lower compression ratio.

In the application, the data file compression process using the LZ4 compression algorithm is shown in fig. 6; its steps are steps S100 to S300 described above.

The data flow between the client and the server is shown in fig. 7. The client sends a query request; after the server receives the request, it decompresses the relevant data blocks from the database according to the query protocol and returns the data, and the client displays the query result as a table or a curve according to the user's needs.

The data storage scheme can run on an independently developed domestic operating system. It has a sound software architecture, strong compatibility, and compact unit modules, and is suited to a small, lightweight embedded real-time database control method and control system.

Based on research into data storage technology for domestic real-time operating systems, the application solves a series of core technical problems, such as optimized memory management, data compression, and reliability, and independently develops database software that suits domestic real-time operating systems and meets the requirements for high-frequency, long-duration, reliable storage of real-time data. The specific scheme is as follows:

A. A kernel architecture similar to that of an operating system is adopted so that the core modules are independent of one another; even if a particular module fails, the failed module can be restarted and recovered automatically, which ensures the stability of the embedded system.

B. Algorithms and driver-related functions, such as acquisition, storage, and compression, are developed and implemented with standard C/C++ interfaces, so that the real-time database control method and control system can run on multiple platforms; functions that require user participation, such as query and export, are developed with the cross-platform application framework Qt, which improves the portability and extensibility of the software code.

C. The lightweight open-source database SQLite is selected as the database management system, which reduces system occupancy and resource overhead. On embedded devices, the database file occupies only a few hundred kilobytes of memory. SQLite is compatible with desktop and embedded operating systems and can be combined with many programming languages; compared with open-source database management systems such as MySQL and PostgreSQL, SQLite processes faster and more efficiently.

D. The real-time database acquires large volumes of data whose successive values are close to one another, so the data source is compressed with a revolving door algorithm based on linear fitting, which is efficient, achieves a high compression ratio, is simple to implement, and has controllable error. Each data item is configured with a dead zone attribute for preprocessing, which reduces data redundancy and yields the optimized revolving door compression algorithm. The compressed data blocks undergo secondary compression with the LZ4 lossless compression algorithm using a step of 4 bytes, which increases compression and decompression speed without adding system load, and optimizes memory and disk usage.
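Item C names SQLite as the database management system. A minimal sketch of storing and retrieving compressed blocks with Python's standard `sqlite3` module is shown below. The table name, the column names, and the time-range query are all hypothetical, and `zlib` again stands in for LZ4 because the Python standard library has no LZ4 binding.

```python
import sqlite3
import zlib
from typing import List


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open the database and create the (hypothetical) block table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS blocks ("
        " id INTEGER PRIMARY KEY,"
        " tag_id INTEGER NOT NULL,"
        " start_time REAL NOT NULL,"
        " end_time REAL NOT NULL,"
        " payload BLOB NOT NULL)"
    )
    return conn


def save_block(conn: sqlite3.Connection, tag_id: int, start: float,
               end: float, raw: bytes) -> int:
    """Compress a block and insert it; returns the new row id."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO blocks (tag_id, start_time, end_time, payload)"
            " VALUES (?, ?, ?, ?)",
            (tag_id, start, end, zlib.compress(raw)),  # stand-in for LZ4
        )
    return cur.lastrowid


def load_blocks(conn: sqlite3.Connection, tag_id: int, t_from: float,
                t_to: float) -> List[bytes]:
    """Decompress every block of a tag that overlaps [t_from, t_to]."""
    rows = conn.execute(
        "SELECT payload FROM blocks"
        " WHERE tag_id = ? AND end_time >= ? AND start_time <= ?"
        " ORDER BY start_time",
        (tag_id, t_from, t_to),
    ).fetchall()
    return [zlib.decompress(row[0]) for row in rows]
```

Storing the block time range alongside the payload lets a query decompress only the blocks that overlap the requested interval, which matches the query flow in which the server decompresses the relevant data blocks before returning data.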

Referring to fig. 8, a database control system includes a server and a client. A microprocessor, a memory, a data acquisition module, a data management module, a data compression module, a data query module, and a data reading module are arranged in the server; the input and output ends of the memory are communicatively connected to the input and output ends of the microprocessor. The input and output ends of the data acquisition module, the data management module, the data compression module, the data query module, and the data reading module are each communicatively connected to the input and output ends of the microprocessor, and a first interface is provided on the server. The client is responsible for interacting with the user to complete data queries, and a second interface that exchanges data with the first interface over TCP/IP is provided on the client. The cache pool is located in the server.

In the description of the present specification, a particular feature, structure, material, or characteristic may be combined in any suitable manner in one or more embodiments or examples.

The foregoing is merely illustrative of the present application, and the present application is not limited thereto, and any person skilled in the art will readily recognize that variations or substitutions are within the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims. Furthermore, the foregoing description of the principles and embodiments of the application has been provided for the purpose of illustrating the principles and embodiments of the application and for the purpose of providing a further understanding of the principles and embodiments of the application, and is not to be construed as limiting the application.

Claims

1. A database control method, comprising the steps of:

S3: the data compression module adopts a revolving door compression algorithm to selectively compress the acquired data items, if the acquired data items meet the preset compression requirement, the acquired data items are compressed to form data blocks, the step S4 is carried out, if the acquired data items do not meet the preset compression requirement, the acquired data items are discarded, the step S1 is returned to read the next data item until all the data items are completely finished;

S4: the cache pool selectively stores the data blocks, if the data blocks meet the preset storage requirement, the data blocks are compressed again to form a compressed file, then the compressed file is stored, if the data blocks do not meet the preset storage requirement, the data blocks are discarded, and the step S1 is returned to read the next data item until all the data items are finished;

The parameters of the revolving door compression algorithm include a compression deviation, which is an absolute error value set empirically; a data item within the absolute error range is compressed, and a data item outside that range cannot be compressed;

the parameters of the revolving door compression algorithm further include slopes, namely an upper slope, a lower slope, and an intermediate slope, calculated as follows:

The compression and storage criteria are as follows:

If K2 ≤ K ≤ K1, the data item to be saved is compressed;

If K < K2 or K > K1, the data item to be saved is stored;

the revolving door compression algorithm comprises the following steps:

S40: the three slopes are calculated for the acquired data item outside the dead zone range and compared: a. if K2 ≤ K ≤ K1, the acquired data item is compressed; b. if K < K2 or K > K1, the data item information is collected and stored in the cache pool, where it is compressed to form a data block, and the compression ratio of the data block is calculated; the method then returns to step S30 until all acquired data items have been processed.

2. The database control method according to claim 1, wherein the steps of the revolving door compression algorithm further comprise:

3. The method for controlling a database according to claim 2, wherein the data block meets a preset storage requirement, and the data block is compressed again to form a compressed file by adopting an LZ4 compression algorithm.

4. A database control method according to claim 3, wherein the LZ4 compression algorithm comprises the steps of:

5. The database control method according to any one of claims 1 to 4, wherein the data acquisition module selectively acquires the metadata items, comprising the steps of:

6. The database control method according to any one of claims 1 to 4, wherein the timed thread that selectively stores the data blocks includes the steps of:

7. A database control system, characterized in that the database control method according to any one of claims 1 to 6 is used, the database control system comprising:

a server, in which a microprocessor, a memory, a data acquisition module, a data management module, a data compression module, a data query module, and a data reading module are arranged; the input end and the output end of the memory are communicatively connected to the input end and the output end of the microprocessor; the input and output ends of the data acquisition module, the data management module, the data compression module, the data query module, and the data reading module are each communicatively connected to the input and output ends of the microprocessor; and a first interface is provided on the server;