CN110442565B - Data processing method, device, computer equipment and storage medium - Google Patents

Data processing method, device, computer equipment and storage medium

Info

Publication number
CN110442565B
Authority
CN
China
Prior art keywords
data
database
processing
period
enters
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910730003.3A
Other languages
Chinese (zh)
Other versions
CN110442565A (en
Inventor
邵健锋
崔巍
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
New Trend International Logis Tech Co ltd
Original Assignee
New Trend International Logis Tech Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by New Trend International Logis Tech Co ltd filed Critical New Trend International Logis Tech Co ltd
Priority to CN201910730003.3A priority Critical patent/CN110442565B/en
Publication of CN110442565A publication Critical patent/CN110442565A/en
Application granted granted Critical
Publication of CN110442565B publication Critical patent/CN110442565B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/217 Database tuning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2453 Query optimisation
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses a data processing method, a device, computer equipment and a storage medium. The data processing method comprises: dividing a database; when a database enters a grabbing period, writing the grabbed data into that database and, after the grabbing is completed, writing to the database that next enters the grabbing period; when a database enters a processing period, locally processing the data in that database and, after the processing is completed, locally processing the data in the database that next enters the processing period; and, after the local processing is completed, exporting the data in the database. Because the database is divided, the database being written is separated from the database being processed, which reduces the performance requirement on any single database and enables effective storage of mass data.

Description

Data processing method, device, computer equipment and storage medium
Technical Field
The present invention relates to the field of data processing technologies, and in particular, to a data processing method, a data processing device, a computer readable storage medium, and a computer device.
Background
Currently, most data acquisition software gathers data in order to present real-time data in a user interface, with data storage playing only a supporting role. As technology has developed, data has become an important foundation of artificial intelligence. Data acquisition software is still required to acquire the data, but the purpose of acquisition is now to collect data for data modeling and AI analysis, such as data trend analysis that predicts the running state of equipment over a short future period. Existing data acquisition software cannot meet the requirements of data modeling and AI analysis because it places high performance requirements on the database.
Therefore, how to reduce the performance requirements on the database during data processing, and how to optimize data processing so as to reduce the cost of using the database, are technical problems that those skilled in the art currently need to solve.
Disclosure of Invention
The embodiment of the invention provides a data processing method, a data processing device, a computer readable storage medium and computer equipment, which can reduce the performance requirement on a database in the data processing process, and optimize the data processing process to reduce the use cost of the database.
In a first aspect, an embodiment of the present invention provides a data processing method, including:
dividing a database;
when the database enters a grabbing period, the grabbed data are written into the database, and after the grabbing is completed, the database which enters the grabbing period next is written;
when the database enters a processing period, carrying out local processing on the data in the database, and after the processing is finished, carrying out local processing on the data in the database which enters the processing period next;
after the local processing is completed, the data in the database is exported.
In a second aspect, an embodiment of the present invention provides a data processing apparatus, including:
the dividing module is used for dividing the database;
the writing module is used for writing the grabbed data into the database when the database enters the grabbing period, and writing the database which enters the grabbing period next after the grabbing is completed;
the local processing module is used for carrying out local processing on the data in the database when the database enters the processing period, and carrying out local processing on the data in the database which enters the processing period next after the processing is completed;
and the data export module is used for exporting the data in the database after the local processing is completed.
In a third aspect, an embodiment of the present invention further provides a computer apparatus, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the data processing method described in the first aspect when executing the computer program.
In a fourth aspect, an embodiment of the present invention further provides a computer readable storage medium, where the computer readable storage medium stores a computer program, which when executed by a processor, implements the data processing method described in the first aspect.
The embodiment of the invention provides a data processing method, which comprises: dividing a database; when a database enters a grabbing period, writing the grabbed data into that database and, after the grabbing is completed, writing to the database that next enters the grabbing period; when a database enters a processing period, locally processing the data in that database and, after the processing is completed, locally processing the data in the database that next enters the processing period; and, after the local processing is completed, exporting the data in the database. In this method, the database being written is separated from the database being processed, so the performance requirement on a single database is significantly reduced; moreover, the index used when reading data is not continuously rebuilt because of frequent writes, which further reduces the performance requirement on the database. The invention also provides a data processing device, a computer readable storage medium and a computer device, which have the same beneficial effects; these are not repeated here.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a data processing method according to an embodiment of the present invention;
FIG. 2 is a flowchart of a data processing method according to an embodiment of the present invention;
FIG. 3 is a flowchart of a data processing method according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
It should be understood that the terms "comprises" and "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It is also to be understood that the terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in this specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" as used in the present specification and the appended claims refers to any and all possible combinations of one or more of the associated listed items, and includes such combinations.
Referring to FIG. 1, FIG. 1 is a flowchart of a data processing method according to an embodiment of the invention.
The specific steps may include:
S101, dividing a database;
Since database license fees account for a significant proportion of the deployment cost of data processing, most users prefer free databases to paid ones. Free databases have more limitations in size, performance and functionality; however, since complex data modeling is typically not performed locally, the functional limitations have little impact on data processing software. To enable the data output by an ordinary database to meet the requirements of data modeling and AI analysis, the embodiment of the invention improves the database structure accordingly, so that the database suits the application scenario of data modeling and AI analysis.
In a specific application scenario, the database mentioned in the embodiment of the present invention is a local database, and the data processing performed in the embodiment of the present invention is also performed locally, so that the data is prepared locally in advance, and uploaded to the cloud when needed, and data modeling, AI analysis, and the like are performed.
The step divides (or partitions) the database for the purpose of separating data writing from data processing, which can reduce the performance requirements of the database.
For the division of the database, different division modes may be adopted according to the need, for example, the database may be divided according to a predetermined time length, or the database may be divided according to a predetermined size, or of course, the database may be divided according to other modes, which is not particularly limited in the embodiment of the present invention. In the embodiment of the present invention, the term partitioning or dividing refers to creating a database, so that the database satisfies the above-mentioned partitioning manner when writing data.
Since different data of the same time period are processed mainly in units of time at the time of data processing, the division of the database is preferably performed on a time basis. In a specific application scenario, the database may be partitioned according to a preset time length. If the preset time length is a fixed value, the time lengths of the divided databases are the same, and if the preset time length is a variable value, the divided databases determine the time length according to the variable condition.
Specifically, dividing the database by a predetermined time length means determining the grabbing time period of the data to be written into the database, that is, setting an attribute on the database that records the grabbing time period of the written data. For example, a database may be provided for data grabbed from 1 a.m. to 2 a.m., in which case the predetermined time length is 1 hour. To determine the grabbing time period, both the grabbing start time and the grabbing end time of the written data should be specified; in the example above, the grabbing start time is 1 a.m., the grabbing end time is 2 a.m., and the divided database holds the data grabbed in the period bounded by these two time points. This has several advantages. First, data can conveniently be written in time order: the grabbing time is recorded when data is grabbed, so the data that belongs in the database is selected by its grabbing time. Second, subsequent data processing generally handles the data of one period in batches, with time as the unit, so processing efficiency improves. Third, when data for a given time is requested later, the database whose time period contains that time can be located quickly and the corresponding data returned quickly.
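The time-based division described above can be sketched as follows. This is only an illustrative sketch: the names `TimePartition` and `partition_for`, the alignment of windows to midnight, and the choice of an inclusive start and exclusive end are assumptions made for the example and are not taken from the patent.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class TimePartition:
    """One divided database, identified by its grabbing time period (hypothetical model)."""
    start: datetime                      # grabbing start time (assumed inclusive)
    end: datetime                        # grabbing end time (assumed exclusive)
    rows: list = field(default_factory=list)

    def accepts(self, grabbed_at: datetime) -> bool:
        # A record belongs here if its grabbing time falls inside the window.
        return self.start <= grabbed_at < self.end


def partition_for(grabbed_at: datetime, length: timedelta) -> TimePartition:
    """Build the partition whose window contains grabbed_at.

    Windows are assumed to be aligned to midnight of the same day.
    """
    midnight = grabbed_at.replace(hour=0, minute=0, second=0, microsecond=0)
    index = (grabbed_at - midnight) // length    # timedelta // timedelta gives an int
    start = midnight + index * length
    return TimePartition(start=start, end=start + length)


# Data grabbed at 01:30 lands in the 1 a.m. to 2 a.m. database.
p = partition_for(datetime(2019, 8, 8, 1, 30), timedelta(hours=1))
```

Because each partition carries its own start and end time, a later request for a given time only needs to find the partition whose window contains that time.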
Since the total amount of data captured within the predetermined time period is generally within a certain range and does not vary much, a database of suitable size can be created based on the total amount of data that has been written for the predetermined time period.
In particular, dividing the database by a predetermined size means that a predetermined data capacity is set for the database; in this case it is necessary to define when to start and when to stop writing data. In a specific application scenario, the time at which writing starts may follow on continuously from the grabbing times: the start time for the current database is determined by the grabbing time of the last data written to the previous database. For example, if the grabbing time of the last data written to the previous database is about 12 minutes and 45 seconds on the same day, the current database starts writing data from about 12 minutes and 45 seconds on the same day and then writes data in time order until the total amount of written data approaches or reaches the predetermined data capacity of the database. The specific determination method is described in detail below.
S102, when the database enters a grabbing period, the grabbed data are written into the database, and after the grabbing is completed, the database which enters the grabbing period next is written;
In a specific application scenario, after a database is created according to a predetermined time length, the database enters a grabbing period, which means that the grabbed data needs to be written into the database. The amount of data written to the database depends on the predetermined time length: the longer the predetermined time length, the larger the time span of the written data and the larger the amount of data that can be written; the shorter the predetermined time length, the smaller the time span and the smaller the amount of data that can be written. The condition that triggers the end of writing is whether the grabbing time of the data being written has reached the grabbing end time: if so, writing ends; if not, writing continues. Once the end of writing is triggered, writing proceeds to the database that next enters the grabbing period; that is, a new database is created by repeating the above process, and data is written to it after it enters the grabbing period.
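A minimal sketch of this time-based end-of-writing trigger is given below. It assumes that records arrive in time order and models each database as an in-memory list; the class name `TimeRollingWriter` and its fields are hypothetical and chosen only for illustration.

```python
from datetime import datetime, timedelta


class TimeRollingWriter:
    """Writes records to the database in its grabbing period and rolls over
    to a new database once the grabbing end time is reached (sketch only)."""

    def __init__(self, first_start: datetime, length: timedelta):
        self.length = length
        self.start = first_start
        self.end = first_start + length
        self.current = []     # rows of the database currently in its grabbing period
        self.closed = []      # (start, end, rows) of databases that finished grabbing

    def write(self, grabbed_at: datetime, value) -> None:
        # End trigger: the grabbing time has reached the end time of the window.
        # Assumes records arrive in time order.
        while grabbed_at >= self.end:
            self.closed.append((self.start, self.end, self.current))
            self.start, self.end = self.end, self.end + self.length
            self.current = []   # a new database enters the grabbing period
        self.current.append((grabbed_at, value))


writer = TimeRollingWriter(datetime(2019, 8, 8, 1, 0), timedelta(hours=1))
writer.write(datetime(2019, 8, 8, 1, 10), "a")
writer.write(datetime(2019, 8, 8, 1, 50), "b")
writer.write(datetime(2019, 8, 8, 2, 5), "c")   # triggers the rollover
```

After the third write, the 1 a.m. to 2 a.m. database is closed with two rows and the next database holds one row.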
In a specific application scenario, after the database is created according to the predetermined size, the database enters the grabbing period, which means that the grabbed data needs to be written into the database. The total amount of data written from the start to the end of writing should be no greater than, and close to, the preset data capacity; that is, the amount of data written into the database is determined by the preset data capacity. The condition that triggers the end of writing is whether the preset data capacity minus the amount of data currently written is within a preset threshold: if so, writing ends; if not, writing continues. For example, if the preset threshold is 5Mb, writing ends when the preset data capacity minus the written data amount is 2Mb; the written data amount is then no greater than the preset data capacity, and the difference between the two is within the preset threshold. Note that when the difference between the preset data capacity and the written data amount is exactly equal to the preset threshold, writing also ends immediately. Once the end of writing is triggered, writing proceeds to the database that next enters the grabbing period; that is, a new database is created by repeating the above process, and data is written to it after it enters the grabbing period.
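The size-based trigger can be sketched in the same spirit. The function names `should_stop_writing` and `fill_databases` are hypothetical, sizes are plain integers in an arbitrary unit, and the extra check that a record must never push a database past its capacity is an assumption added so that the "no greater than the preset data capacity" condition holds in the example.

```python
def should_stop_writing(capacity: int, written: int, threshold: int) -> bool:
    """True once the remaining capacity is within, or exactly equal to, the threshold."""
    return capacity - written <= threshold


def fill_databases(record_sizes, capacity: int, threshold: int):
    """Distribute records, in order, over databases of a fixed capacity (sketch only)."""
    databases, current, written = [], [], 0
    for size in record_sizes:
        # Close the current database if it is nearly full, or if this record
        # would exceed the capacity (the second check is an assumption).
        if current and (written + size > capacity
                        or should_stop_writing(capacity, written, threshold)):
            databases.append(current)
            current, written = [], 0    # a new database enters the grabbing period
        current.append(size)
        written += size
    if current:
        databases.append(current)
    return databases


# Capacity 100, threshold 5: the first database closes at 96 units written.
layout = fill_databases([40, 40, 16, 40, 10], capacity=100, threshold=5)
```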
S103, when the database enters a processing period, carrying out local processing on the data in the database, and after the processing is finished, carrying out local processing on the data in the database which enters the processing period next time;
When the database enters the processing period, the database needs to be processed locally; in the embodiment of the present invention, local processing of the database refers to local processing of the data in the database. Only databases that have entered the processing period are locally processed, and only databases that are in the grabbing period are written to. In other words, local processing and writing are separated: for example, one database is only (batch) written while another is only locally processed. The performance requirement on a single database over the whole process is therefore significantly reduced, and local processing and writing do not affect each other. In addition, during local processing the index used for reading data is not continuously rebuilt because of frequent data writes, which further reduces the performance requirement on the database.
Similarly, after the local processing of the data in the database is completed, the database in the next processing period can be locally processed, that is, the data in the database in the next processing period is locally processed, so that the data in the database can be continuously and sequentially written in and locally processed.
S104, after the local processing is completed, data in the database are exported.
After the local processing of the data in the database is completed, the database enters an export period for export processing, the data in the database can be exported in the export period so as to release the storage space of the database later, and a new database can be created to enter a grabbing period again for writing new data.
The embodiment comprises dividing a database; when the database enters a grabbing period, the grabbed data are written into the database, and after the grabbing is completed, the database which enters the grabbing period next is written; when a database enters a processing period, carrying out local processing on data in the database, and after the processing is finished, carrying out local processing on the data of the database which enters the processing period next; after the local processing is completed, the data in the database is exported. According to the method, the database which is being subjected to large-batch data writing is separated from the database which is being subjected to data processing, so that the performance requirement on a single database is obviously reduced, the index during data reading is not continuously rebuilt due to frequent writing, and the performance requirement on the database is reduced.
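One way to picture the separation of periods in this embodiment is as a small state machine in which each database accepts writes only during its grabbing period. The sketch below is an assumption-laden illustration: the `Period` enumeration, the `RELEASED` state and the `Database` class are hypothetical names, not terms from the patent.

```python
from enum import Enum


class Period(Enum):
    GRABBING = 1      # grabbed data is written
    PROCESSING = 2    # data is processed locally, no writes
    EXPORT = 3        # data is exported
    RELEASED = 4      # storage space can be released (assumed final state)


NEXT_PERIOD = {
    Period.GRABBING: Period.PROCESSING,
    Period.PROCESSING: Period.EXPORT,
    Period.EXPORT: Period.RELEASED,
}


class Database:
    def __init__(self, name: str):
        self.name = name
        self.period = Period.GRABBING
        self.rows = []

    def write(self, row) -> None:
        # Writing is allowed only in the grabbing period, so a database that
        # is being processed never has its index disturbed by writes.
        if self.period is not Period.GRABBING:
            raise RuntimeError(f"{self.name} is not in its grabbing period")
        self.rows.append(row)

    def advance(self) -> None:
        self.period = NEXT_PERIOD[self.period]


a = Database("A")
a.write("sample")
a.advance()               # A enters the processing period
b = Database("B")         # B takes over the grabbing period
b.write("next sample")
```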
The embodiment of the invention also provides a data processing method, as shown in fig. 2, which comprises the following steps:
S201, dividing a database;
S202, when a database enters a grabbing period, the grabbed data are written into the database, and after the grabbing is completed, the database which enters the grabbing period next is written;
S203, when the database enters a cooling period, cooling the database to provide continuous grabbing and preparation for entering a processing period when needed;
S204, when the database enters a processing period, carrying out local processing on the data in the database, and after the processing is finished, carrying out local processing on the data in the database which enters the processing period next time;
S205, after the local processing is completed, data in a database are exported.
In the above steps, the implementation manners of S201 and S101, S202 and S102, S204 and S103, and S205 and S104 are the same, and specific implementation details may refer to the data processing method provided in the foregoing embodiment, which is not repeated in this embodiment.
S203 added in this embodiment will be described in detail below. After the data writing of the database is completed in S202, the cooling process is performed in S203 to provide continuous grabbing and preparation for entering the processing period when necessary.
In S202, the database has been written with data during the grabbing period. At this point the database enters the cooling period, while a new database is created and enters the grabbing period to receive data. The original database then undergoes the cooling process of S203. Because the embodiment of the invention may use a plurality of grabbers (a grabber being a device that grabs data and then writes the grabbed data into the database), the clocks of the grabbers may differ slightly, and the connection between a grabber and the database may be interrupted for a short time. The database in the cooling period is therefore kept in a writable state (that is, the grabbers can still write data into it) to ensure that the data written to the database is complete. The cooling period also serves to confirm that all data grabbed by the grabbers has been written before subsequent operations begin.
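The effect of the cooling period on late writes can be sketched as a routing rule: a record goes to whichever still-writable database covers its grabbing time. The `Partition` class, the string state names and the `route` function below are hypothetical, and times are plain integers (seconds) for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Partition:
    start: int                      # window start, in seconds (assumed inclusive)
    end: int                        # window end, in seconds (assumed exclusive)
    state: str = "grabbing"         # "grabbing" -> "cooling" -> "processing"
    rows: list = field(default_factory=list)


def route(record_time: int, value, partitions):
    """Write a record into the writable partition whose window covers it.

    Databases in the grabbing or cooling period are writable; once a
    database reaches the processing period, late records are rejected.
    """
    for p in partitions:
        if p.state in ("grabbing", "cooling") and p.start <= record_time < p.end:
            p.rows.append((record_time, value))
            return p
    return None   # no writable partition covers this time


old = Partition(0, 3600, state="cooling")
new = Partition(3600, 7200)
route(3590, "late record from a slow grabber", [old, new])   # still lands in old
route(3700, "current record", [old, new])                    # lands in new
```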
The embodiment of the invention also provides a data processing method, as shown in fig. 3, which comprises the following steps:
S301, dividing a database;
S302, when the database enters a grabbing period, the grabbed data are written into the database, and after the grabbing is completed, the database which enters the grabbing period next is written;
S303, when the database enters the current processing period, carrying out local processing on the data in the database, and after the processing is finished, carrying out local processing on the data in the database which enters the current processing period next time;
S304, when the database enters the next processing period, carrying out local processing on the data in the database, and after the processing is finished, carrying out local processing on the data in the database entering the next processing period;
S305, after the local processing is completed, data in a database are exported.
In this embodiment, steps S301, S302, S305 are the same as corresponding steps of the foregoing embodiments, and specific implementation details may refer to a data processing method provided in the corresponding embodiment of fig. 1, which is not described in detail in this embodiment. In addition, this embodiment may be implemented on the basis of the embodiment corresponding to fig. 2, that is, the two may be combined, that is, a step "when the database enters the cooling period, the database is subjected to the cooling process so as to provide continuous grabbing and preparation for entering the processing period when needed" is added between S302 and S303, which may result in a more preferred embodiment.
S303 and S304 in the present embodiment are described in detail below.
In this embodiment, a plurality of processing periods are provided, that is, the database enters one processing period to perform local processing, and enters the next processing period to perform local processing after the processing is completed until all the processing periods are completed.
Specifically, in S303, when the database enters the current processing period, the data in the database is locally processed, and after the processing is completed, the next database entering the current processing period is locally processed. In S304, after the processing in the current processing period is completed, the database will enter the next processing period to continue the local processing, and after the processing is completed, the database entering the next processing period will be locally processed.
In a specific application scenario, the processing periods are processing period 1, processing period 2, processing period 3, ..., processing period N. If the data is to be processed in exactly this order, database A is first locally processed in processing period 1, then enters processing period 2 for local processing, then processing period 3, and so on until it finally enters processing period N. Meanwhile, when database A enters processing period 2, database B enters processing period 1; when database A enters processing period 3, database B enters processing period 2 and database C enters processing period 1. All databases thus pass through the processing periods in sequence in a pipelined manner, which greatly reduces the performance requirements on the databases. More importantly, depending on the requirements of the data processing algorithm, local processing may need data spanning several times the time length of one database as a reference, or data that is newer than the data being processed. In the embodiment of the invention, a plurality of processing periods are therefore provided so that different databases can be processed simultaneously when needed, and each database enters the processing periods in order according to its timeliness.
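The pipelined ordering of databases A, B and C through the processing periods can be illustrated with a small scheduling sketch. The function name `pipeline_schedule` and the notion of a discrete "tick" are assumptions for illustration; the patent does not prescribe how the periods are timed.

```python
def pipeline_schedule(databases, stage_count: int):
    """Return, for each tick, a mapping {processing period number: database}.

    Each database enters processing period 1 one tick after its predecessor
    and then moves through the periods in order (sketch only).
    """
    ticks = []
    for tick in range(len(databases) + stage_count - 1):
        active = {}
        for stage in range(1, stage_count + 1):
            db_index = tick - (stage - 1)
            if 0 <= db_index < len(databases):
                active[stage] = databases[db_index]
        ticks.append(active)
    return ticks


# With three databases and three processing periods, at the third tick
# A is in period 3, B is in period 2 and C is in period 1.
schedule = pipeline_schedule(["A", "B", "C"], stage_count=3)
```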
Because the modeling and analysis of the data in the database does not have obvious returns in a short period of time, the system deployment cost is generally required to be reduced, the early investment is reduced, and the most important approach is to store mass data in a low-cost mode. In order to achieve the storage of mass data in a low cost manner, efficient processing of the data in the database is required to reduce deployment costs.
In a specific application scenario, the locally processing the data in the database includes:
determining data needing to be exported in the database, and copying the data needing to be exported.
In the foregoing embodiments, the database is divided, and the database division is only used to effectively store mass data. In order to process data in the cloud, the stored data must be effectively compressed, so as to reduce the network requirement during uploading.
After the data in the database is compressed, the data volume in the database is reduced, so that the follow-up data with smaller size is ensured to be uploaded to the cloud for the follow-up data processing.
In a specific application scenario, determining the data that needs to be exported from the database includes: determining the data in the database that does not need to be exported, and determining the remaining data in the database as the data that needs to be exported. Determining the data that does not need to be exported includes one or more of: determining redundant data in the database, determining useless data in the database, and determining data in the database that can be removed when the data dimension is converted.
The above ways are described only as examples of embodiments of the present invention. The embodiments may also be implemented in other ways, and those skilled in the art can conceive of other ways of processing without creative effort; such implementations all fall within the scope of the claims of the present invention.
The database in the processing period is read-only by default, i.e. no substantial compression processing is performed, but only one or more of redundant data, useless data and data which can be rejected in the dimension of conversion data are determined, the data which are not required to be exported are determined, and after the data are excluded, the data which are required to be exported in the database can be determined, and the data which are required to be exported are copied. When the data is exported later, only the copied data is exported, and unnecessary data is abandoned, namely the exported data does not contain redundant data, useless data, and removable data, and the like, so that the deletion of the data is realized, and the aim of substantial compression is fulfilled. That is, during the processing period of the database, no deletion operation is performed on any data, so as to avoid affecting the data processing operation in other processing periods, and meanwhile, the data processing operation of each database can be more complete.
As for determining redundant data: several grabbers may be used during data grabbing, so the grabbed data has a certain redundancy, that is, it contains repeated data. Redundant data is determined according to customizable conditions such as data grabbing time and item. During local processing, the redundant data can therefore be determined, and from it the data to be exported.
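The redundancy check described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the record layout (`variable`, `timestamp`, `value`, `grabber`) and the choice of `(variable, timestamp)` as the duplicate key are assumptions, since the text only says the condition is customizable.

```python
from typing import Iterable, List, NamedTuple, Set, Tuple


class Sample(NamedTuple):
    variable: str     # data item name (hypothetical field)
    timestamp: float  # grabbing time in seconds (hypothetical field)
    value: float
    grabber: str      # which grabber produced the record (hypothetical field)


def select_non_redundant(samples: Iterable[Sample]) -> List[Sample]:
    """Return the first record seen for each (variable, timestamp) pair.

    The input is only read; duplicates are skipped, never deleted,
    which mirrors the read-only processing period.
    """
    seen: Set[Tuple[str, float]] = set()
    kept: List[Sample] = []
    for sample in samples:
        key = (sample.variable, sample.timestamp)
        if key not in seen:
            seen.add(key)
            kept.append(sample)
    return kept
```

The returned list plays the role of the copied data to be exported; the source records stay untouched.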
As for determining useless data: to restore the environment as fully as possible when a fault occurs, the system must collect a large number of data samples over a period (for example, 5 minutes) before the fault, so that the pre-fault environment can be reconstructed. During normal operation, however, these data are of little significance for mathematical modeling. Because the hardware sensors provide only real-time data, the system has no choice but to collect all of it; once some time has passed after acquisition and the system is confirmed to have worked normally, these useless data can be discarded. Specifically, the data to be exported is copied under the corresponding data item name, and the copy can be stored separately at some location (inside the database or elsewhere) to facilitate export.
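One possible reading of this rule, as a hedged sketch: diagnostic samples are kept only if they fall inside the window before a recorded fault. The function name, the tuple layout, and the idea that fault times are available as a list are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

TimedValue = Tuple[float, float]  # (timestamp in seconds, value)


def select_pre_fault_samples(
    samples: Sequence[TimedValue],
    fault_times: Sequence[float],
    window_s: float = 300.0,  # 5 minutes, the example figure in the text
) -> List[TimedValue]:
    """Copy only the samples lying within window_s before some fault.

    With no faults in the period, nothing is selected, i.e. all of
    these samples count as useless data and are not exported.
    """
    kept: List[TimedValue] = []
    for ts, value in samples:
        if any(fault - window_s <= ts <= fault for fault in fault_times):
            kept.append((ts, value))
    return kept
```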
As for determining data that can be removed by converting the data dimension: the data collected by a sensor describes the current state, whereas for part of the data, modeling only needs the times at which the working state changes. For example, for the motor rotation state, every collection reports whether the motor is currently running and in which direction, but data modeling only needs the time of each working-state change, its duration, and the state at that time. During data processing these raw data are converted into the data required for modeling, which leaves a large amount of data that can be removed. Likewise, the data to be exported is copied under the corresponding data item name, and the copy can be stored separately at some location (inside the database or elsewhere) to facilitate export.
It should be noted that the present invention may perform all three types of local processing on the same database in one processing period, may perform one type in each of three processing periods, or may perform one type in one processing period and the other two types in the next processing period. These embodiments are all subordinate concepts of, and obviously fall within, the protection scope of the present invention.
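The motor example can be illustrated with a short sketch that collapses per-sample state reports into one record per state change. The names `StateInterval` and `to_state_intervals`, and the explicit `end_time` argument marking the end of the database's time range, are assumptions for illustration.

```python
from typing import List, NamedTuple, Sequence, Tuple


class StateInterval(NamedTuple):
    start: float     # time at which the state began
    duration: float  # how long the state lasted
    state: str       # e.g. "stopped", "forward", "reverse"


def to_state_intervals(
    samples: Sequence[Tuple[float, str]], end_time: float
) -> List[StateInterval]:
    """Convert time-ordered (timestamp, state) samples into state intervals."""
    intervals: List[StateInterval] = []
    if not samples:
        return intervals
    start, state = samples[0]
    for ts, current in samples[1:]:
        if current != state:
            # The state changed: close the previous interval.
            intervals.append(StateInterval(start, ts - start, state))
            start, state = ts, current
    # Close the last interval at the end of the covered time range.
    intervals.append(StateInterval(start, end_time - start, state))
    return intervals
```

Six raw samples covering three states reduce to three interval records here; on real sensor data sampled many times per second the reduction would be far larger.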
In a specific application scenario, exporting the data in the database includes:
exporting the copied data.
That is, exporting the data in the database in the present invention essentially means exporting the copied data, i.e. the data that needs to be exported. This step is in effect the compression: after the data that does not need to be exported has been excluded, the needed data is exported.
Data is thus exported by copying the useful data, which indirectly deletes the rest.
After the data in the database is exported, the data processing method further comprises:
and uploading the exported data to the cloud end, and deleting the locally exported data and/or the corresponding database.
And finally, when the data is exported, the data to be exported is stored as an independent local file, so that the data size of the local file is greatly reduced relative to that of an original database, the local file is conveniently stored locally or transferred to other positions, and meanwhile, the bandwidth requirement for uploading to a cloud can be reduced.
After the data has been exported, the database corresponding to the exported data can be deleted to release storage space, so that the next round's database can be created and new data written, processed, and exported. Likewise, once the local file has been uploaded, the local file (the exported data) can be deleted, reducing local storage usage and easing subsequent storage and uploading.
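The export, upload, and clean-up sequence might look like the following sketch. SQLite is used only as a convenient stand-in for the local database; the table layout, the function names, and the caller-supplied `upload` callable are all assumptions, and deleting the source database only after a successful upload is one possible ordering rather than the one the text prescribes.

```python
import os
import sqlite3
from typing import Callable, Iterable, Tuple

Row = Tuple[str, float, float]  # (variable, timestamp, value), hypothetical layout


def export_rows(rows: Iterable[Row], export_path: str) -> None:
    """Write the copied rows to an independent local file."""
    con = sqlite3.connect(export_path)
    try:
        with con:  # commits on success, rolls back on error
            con.execute("CREATE TABLE samples (variable TEXT, ts REAL, value REAL)")
            con.executemany("INSERT INTO samples VALUES (?, ?, ?)", rows)
    finally:
        con.close()


def upload_then_clean(
    export_path: str, source_db_path: str, upload: Callable[[str], bool]
) -> bool:
    """Upload the export file; on success delete it and the source database."""
    if not upload(export_path):
        return False  # keep both files so the upload can be retried
    os.remove(export_path)
    if os.path.exists(source_db_path):
        os.remove(source_db_path)
    return True
```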
The embodiment of the invention can be used for slimming mass data and realizing low-cost storage of mass data. The system deployment cost is reduced, and the investment is reduced. In addition, the data is finally required to be uploaded to the cloud for data analysis after being acquired, and the method and the device reduce the transmission of useless data, reduce the data quantity required to be transmitted to the cloud, improve the transmission speed and reduce the network bandwidth requirement.
In embodiments of the invention, one or more processors may be configured and used during a processing period. Each processor corresponds to a particular mathematical algorithm or business function, and specific business logic is implemented by combining processors. It is also possible that no processor is used in a processing period, in which case the period merely serves as a time interval before the database enters the next processing period.
The variables mentioned in the embodiments of the present invention may be understood as the names of each type of data generated by the data capturing devices and the processors, for example "Motor[No. 15].Temperature".
The processor may be a specific program. The program supports obtaining processors from any source trusted by the client, and processors may be provided by multiple parties. The processor interface includes input values and output values, and a processor may maintain state data across multiple runs so as to process multiple databases continuously.
Wherein the processor may designate one or more variables as inputs. Variables may come from the data being grabbed or from the output of other processors. The number and type of variables depends on the processor settings.
After the system is started, all data carrying the first time value of the data currently to be processed are written into the corresponding input slots of the processors, as required by the processing period. After the writing is completed, the processors are triggered in turn. The processor's internal logic handles triggers caused by changes to all variables or to particular variables. At each trigger, the processor runs its custom algorithm: it reads all input variable values at the current time, updates its state, and selects an output value.
In particular, a processor may request a time delay during processing. If it does, the system triggers the processor again once the delay has elapsed. The delay is measured in data-processing time (the timestamps of the data), not in the actual running time of the current system.
When all trigger actions have been executed, the system processes all data of the next (second) time value, and these steps are repeated until the whole database has been processed.
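The write-then-trigger loop above can be sketched as follows. The `Processor` class and its `step` callback are hypothetical stand-ins for the processor interface; requested time delays and chaining one processor's output into another are deliberately left out to keep the sketch short.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, Iterable, List, Tuple

Record = Tuple[float, str, Any]  # (time value, variable name, value)


class Processor:
    """Hypothetical processor: named inputs, private state, one output."""

    def __init__(
        self,
        inputs: List[str],
        step: Callable[[Dict[str, Any], Dict[str, Any]], Any],
    ) -> None:
        self.inputs: Dict[str, Any] = {name: None for name in inputs}
        self.state: Dict[str, Any] = {}
        self.step = step
        self.outputs: List[Tuple[float, Any]] = []

    def trigger(self, now: float) -> None:
        result = self.step(dict(self.inputs), self.state)
        if result is not None:
            # The output is stamped with the time of the current input data.
            self.outputs.append((now, result))


def run_processing_period(
    records: Iterable[Record], processors: List[Processor]
) -> None:
    by_time: Dict[float, Dict[str, Any]] = defaultdict(dict)
    for ts, name, value in records:
        by_time[ts][name] = value
    for ts in sorted(by_time):
        # Step 1: write every value of this time value into matching inputs.
        for proc in processors:
            for name, value in by_time[ts].items():
                if name in proc.inputs:
                    proc.inputs[name] = value
        # Step 2: only after all writes, trigger the processors in turn.
        for proc in processors:
            proc.trigger(ts)
```

A processor that tracks a running maximum, for instance, would keep the maximum in `state` and return it from `step` on every trigger.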
The processor may designate one or more variables as output. The number and type of variables depends on the processor settings. The time corresponding to the output value is the time of the current input data.
The output variables may be set to be written back to the database (see the stability guarantee mechanism below), or to serve only as input to other processors running at that time. If an output is used as the input of another processor, it also triggers that processor. To prevent the resulting loops from deadlocking, the processor's internal logic and the external permission settings together determine whether the corresponding processor is triggered.
The embodiment of the invention provides an interface through which a processor can set a time offset for an input; the time offset is a non-negative time value and defaults to 0, i.e., no offset. When a time offset is set, later data is supplied to that input. For example, when an input of a processor is set to a time offset of 1 minute, then while data at time t is being processed, the data supplied to that input is the data at time t+1 minute. To ensure that such data exists, a processor with a time offset cannot operate in the first processing period; its earliest working period cannot be earlier than the period by which the required data has been prepared. For example, when the databases are divided into 5-minute lengths and an input of a processor is set to a time offset of 8 minutes, the processor can operate in the 3rd processing period at the earliest (which can supply data with a time offset of up to 10 minutes).
In particular, when its internal logic requires it, a processor may directly request a batch read of data. The variable list, start time, and end time are specified when reading, and all matching data is returned at once. This read is not limited by the upper and lower time bounds of the processing period in which the processor runs, and it does not trigger any processor action.
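The worked example suggests a simple rule for the earliest usable processing period. The formula below is inferred from that single example (5-minute databases, 8-minute offset, 3rd period) rather than stated in the text, and the function name is hypothetical.

```python
import math


def earliest_processing_period(offset_minutes: float, slice_minutes: float) -> int:
    """Earliest processing period (1-based) in which the offset data exists.

    Inferred rule: processing period n can look ahead (n - 1) database
    lengths, so n must satisfy (n - 1) * slice_minutes >= offset_minutes.
    """
    if offset_minutes < 0:
        raise ValueError("the time offset must be non-negative")
    if slice_minutes <= 0:
        raise ValueError("the database length must be positive")
    return math.ceil(offset_minutes / slice_minutes) + 1
```

With a 5-minute database length, an 8-minute offset gives period 3, a 1-minute offset gives period 2, and the default offset of 0 gives period 1.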
In the embodiment of the invention, a stability guarantee mechanism is provided because processor logic is complex and its authors are outside the system's control, so the system cannot ensure that a processor executes stably. At the beginning of execution, a temporary database may be built within each database to hold any data to be written as well as the processor state data. During execution, the processor runs in an independent host process, which sends back heartbeat data at intervals. Data written back by the processor can only be stored in the temporary database. After execution completes without fault, the independent host process reports that execution is complete; the data of the temporary database is then saved back to the database currently being processed, and the temporary database is deleted. When execution fails, the independent host process reports the failure, the temporary database is deleted directly, and the processing is re-executed later. If the heartbeat data has not been refreshed within the set time, the independent host process is considered to have failed, and the temporary database is deleted. Because the temporary database has a different name each time, a previous independent host process that has timed out cannot write data again and interfere with a new execution, even when the processing is re-executed later.
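The commit-or-discard behavior of the temporary database can be sketched with in-memory dictionaries standing in for real databases and host processes. The class names `Slice` and `Execution`, the use of a random name for the temporary store, and passing the current time in explicitly are all assumptions made to keep the sketch self-contained.

```python
import uuid
from typing import Any, Dict


class Slice:
    """Stand-in for one database; temporary stores live inside it."""

    def __init__(self) -> None:
        self.data: Dict[str, Any] = {}
        self.temp: Dict[str, Dict[str, Any]] = {}


class Execution:
    """One run of a processor in its (simulated) independent host process."""

    def __init__(self, db: Slice, timeout_s: float, now: float) -> None:
        self.db = db
        self.timeout_s = timeout_s
        self.temp_name = "tmp_" + uuid.uuid4().hex  # different on every run
        db.temp[self.temp_name] = {}
        self.last_heartbeat = now

    def heartbeat(self, now: float) -> None:
        self.last_heartbeat = now

    def write(self, key: str, value: Any) -> bool:
        temp = self.db.temp.get(self.temp_name)
        if temp is None:
            return False  # temporary store is gone: a stale run cannot write
        temp[key] = value
        return True

    def finish(self, ok: bool) -> None:
        temp = self.db.temp.pop(self.temp_name, None)
        if ok and temp is not None:
            self.db.data.update(temp)  # save back to the processed database

    def check_timeout(self, now: float) -> bool:
        if now - self.last_heartbeat > self.timeout_s:
            self.db.temp.pop(self.temp_name, None)  # discard the failed run
            return True
        return False
```

Because a timed-out run holds only the name of a temporary store that no longer exists, its late writes are rejected and cannot leak into a later re-execution.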
Referring to fig. 4, fig. 4 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present invention;
the apparatus may include:
a partitioning module 401, configured to partition a database;
a writing module 402, configured to write the captured data into the database when the database enters the capturing period, and write the database that enters the capturing period next after capturing is completed;
the local processing module 403 is configured to perform local processing on data in the database when the database enters a processing period, and perform local processing on data in the database that enters the processing period next after the processing is completed;
and the data export module 404 is configured to export the data in the database after the local processing is completed.
The device divides a database according to a preset time length; when the database enters a grabbing period, the grabbed data are written into the database, and after the grabbing is completed, the database which enters the grabbing period next is written; when a database enters a processing period, carrying out local processing on data in the database, and after the processing is finished, carrying out local processing on the database which enters the processing period next time; after the local processing is completed, the data in the database is exported. The device separates the database which is being written with a large amount of data from the database which is being processed with the data, so that the performance requirement on a single database is obviously reduced, and the index during data reading is not continuously rebuilt due to frequent writing, thus the performance requirement on the database is also reduced.
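Dividing the database by a preset time length amounts to mapping each grabbing timestamp to one time-sliced database. A minimal sketch, assuming the databases are laid end to end from a designated grabbing start time; the naming scheme `db_<start time>` is purely illustrative.

```python
from datetime import datetime, timedelta


def slice_index(ts: datetime, origin: datetime, length: timedelta) -> int:
    """Index of the time-sliced database that a timestamp belongs to."""
    if length <= timedelta(0):
        raise ValueError("the database length must be positive")
    if ts < origin:
        raise ValueError("the timestamp precedes the grabbing start time")
    return (ts - origin) // length


def slice_name(ts: datetime, origin: datetime, length: timedelta) -> str:
    """Hypothetical database name derived from the start of its time range."""
    start = origin + slice_index(ts, origin, length) * length
    return "db_" + start.strftime("%Y%m%dT%H%M%S")
```

With a 5-minute length, writes arriving at 00:07:30 land in the database that starts at 00:05:00, while the database that started at 00:00:00 is no longer being written and is free to move on to its later periods.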
Further, the device further comprises:
and the cooling module is used for cooling the database when the database enters the cooling period, so as to allow continued grabbing when required and to prepare for entering the processing period.
Further, a plurality of processing periods are provided, and accordingly, the local processing module 403 includes:
the current processing unit is used for carrying out local processing on the data in the database when the database enters the current processing period, and carrying out local processing on the data in the database which enters the current processing period next after the processing is completed;
and the next processing unit is used for carrying out local processing on the data in the database when the database enters the next processing period, and carrying out local processing on the data in the database which enters the next processing period after the processing is completed.
Further, the local processing module 403 includes:
and the replication processing unit is used for determining the data needing to be exported in the database and replicating the data needing to be exported.
Further, the copy processing unit is specifically configured to: determine the data that does not need to be exported in the database, and determine the remaining data in the database as the data that needs to be exported; wherein determining the data that does not need to be exported includes one or more of: determining redundant data in the database, determining useless data in the database, and determining data in the database that can be removed by converting the data dimension.
Further, the data export module 404 is specifically configured to export the data in the database other than one or more of: redundant data, useless data, and data that can be removed by converting the data dimension.
Further, the dividing module 401 includes:
and the dividing unit is used for dividing the database according to the preset time length.
Further, the device further comprises:
and the uploading processing module is used for uploading the exported data to the cloud end and deleting the locally exported data and/or the corresponding database.
Since the embodiments of the apparatus and the embodiments of the method correspond to each other, refer to the description of the method embodiments for the apparatus embodiments; details are not repeated here.
The present invention also provides a computer-readable storage medium having stored thereon a computer program which, when executed, performs the steps provided by the above-described embodiments. The storage medium may include: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or various other media capable of storing program code.
The invention also provides a computer device, which can comprise a memory and a processor, wherein the memory stores a computer program, and the processor can realize the steps provided by the embodiment when calling the computer program in the memory. Of course the computer device may also include various network interfaces, power supplies, and the like.
In this description the embodiments are described progressively: each embodiment focuses on its differences from the others, and the identical or similar parts of the embodiments can be referred to one another. Because the device disclosed in an embodiment corresponds to the method disclosed in the embodiment, its description is relatively brief, and the relevant points can be found in the description of the method. It should be noted that those skilled in the art can make various modifications and adaptations to the invention without departing from its principles, and these modifications and adaptations also fall within the scope of the invention as defined in the following claims.
It should also be noted that in this specification, relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.

Claims (4)

1. A method of data processing, comprising:
dividing a database according to a preset time length, and designating a grabbing starting moment of written data;
when the database enters a grabbing period, the grabbed data are written into the database, and after the grabbing is completed, the database which enters the grabbing period next is written;
when the database enters a processing period, carrying out local processing on the data in the database, and after the processing is finished, carrying out local processing on the data in the database which enters the processing period next; if one or more processors are configured and used in a processing period, each processor corresponds to a service function; if no processor is used in a certain processing period, the processing period is only used as a time interval, so as to wait for the database to enter the next processing period; setting a time offset input by a processor through an interface, wherein the time offset is a non-negative time value and defaults to 0;
after the local processing is completed, exporting data in a database;
when the database enters a processing period, the data in the database is locally processed, and after the processing is finished, before the data in the database entering the processing period is locally processed, the data processing method further comprises the following steps:
When the database enters a cooling period, cooling the database to provide continuous grabbing and preparation for entering a processing period when needed; wherein the database in the cooling period is set to be in a writable state;
the processing period is provided with a plurality of processing periods, and correspondingly, when the database enters the processing period, the data in the database is locally processed, and after the processing is finished, the data in the database which enters the processing period next is locally processed, which comprises the following steps:
when the database enters the current processing period, carrying out local processing on the data in the database, and carrying out local processing on the data in the database which enters the current processing period next after the processing is completed;
when the database enters the next processing period, carrying out local processing on the data in the database, and after the processing is finished, carrying out local processing on the data in the database entering the next processing period;
the local processing of the data in the database comprises:
determining data needing to be exported in a database, and copying the data needing to be exported;
the determining data to be exported in the database includes: determining data which does not need to be exported in the database, and determining the remaining data in the database as the data which needs to be exported; wherein determining the data which does not need to be exported in the database comprises determining one or more of redundant data in the database, useless data in the database and data in the database which can be removed by converting the data dimension;
After the data in the database is exported, the data processing method further comprises:
and uploading the exported data to a cloud, carrying out data modeling and AI analysis, and deleting the locally exported data and/or a corresponding database.
2. A data processing apparatus, comprising:
the dividing module is used for dividing the database according to the preset time length and designating the grabbing starting time of the written data;
the writing module is used for writing the grabbed data into the database when the database enters the grabbing period, and writing the database which enters the grabbing period next after the grabbing is completed;
the local processing module is used for carrying out local processing on the data in the database when the database enters the processing period, and carrying out local processing on the data in the database which enters the processing period next after the processing is completed; if one or more processors are configured and used in a processing period, each processor corresponds to a service function; if no processor is used in a certain processing period, the processing period is only used as a time interval, so as to wait for the database to enter the next processing period; setting a time offset input by a processor through an interface, wherein the time offset is a non-negative time value and defaults to 0;
The data export module is used for exporting the data in the database after the local processing is completed;
the cooling module is used for carrying out cooling treatment on the database when the database enters a cooling period so as to provide continuous grabbing and preparation for entering a treatment period when required; wherein the database in the cooling period is set to be in a writable state;
the processing period is provided with a plurality of, correspondingly, the local processing module comprises:
the current processing unit is used for carrying out local processing on the data in the database when the database enters the current processing period, and carrying out local processing on the data in the database which enters the current processing period next after the processing is completed;
the next processing unit is used for carrying out local processing on the data in the database when the database enters the next processing period, and carrying out local processing on the data in the database entering the next processing period after the processing is completed;
the local processing module comprises:
determining data needing to be exported in a database, and copying the data needing to be exported;
the determining data to be exported in the database includes: determining data which does not need to be exported in the database, and determining the remaining data in the database as the data which needs to be exported; wherein determining the data which does not need to be exported in the database comprises determining one or more of redundant data in the database, useless data in the database and data in the database which can be removed by converting the data dimension;
After the data in the database is exported, the data processing apparatus further includes:
and uploading the exported data to a cloud, carrying out data modeling and AI analysis, and deleting the locally exported data and/or a corresponding database.
3. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the data processing method of claim 1 when executing the computer program.
4. A computer-readable storage medium, characterized in that the computer-readable storage medium has stored thereon a computer program which, when executed by a processor, implements the data processing method according to claim 1.
CN201910730003.3A 2019-08-08 2019-08-08 Data processing method, device, computer equipment and storage medium Active CN110442565B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910730003.3A CN110442565B (en) 2019-08-08 2019-08-08 Data processing method, device, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN110442565A CN110442565A (en) 2019-11-12
CN110442565B true CN110442565B (en) 2023-06-30

Family

ID=68433980

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910730003.3A Active CN110442565B (en) 2019-08-08 2019-08-08 Data processing method, device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN110442565B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH07325839A (en) * 1994-06-02 1995-12-12 Mitsubishi Electric Corp Time series data processor
CN105608202A (en) * 2015-12-25 2016-05-25 北京奇虎科技有限公司 Data packet analysis method and device

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101697152A (en) * 2009-10-23 2010-04-21 金蝶软件(中国)有限公司 Database storage system and method and device for splitting data thereof
US10235044B2 (en) * 2015-07-27 2019-03-19 Datrium, Inc. System and methods for storage data deduplication
CN105159925B (en) * 2015-08-04 2019-08-30 北京京东尚科信息技术有限公司 A kind of data-base cluster data distributing method and system
CN105069134B (en) * 2015-08-18 2018-07-27 上海新炬网络信息技术股份有限公司 A kind of automatic collection method of Oracle statistical informations
US10552454B2 (en) * 2015-11-13 2020-02-04 Sap Se Efficient partitioning of related database tables
CN107179878B (en) * 2016-03-11 2021-03-19 伊姆西Ip控股有限责任公司 Data storage method and device based on application optimization
KR20180047828A (en) * 2016-11-01 2018-05-10 에스케이하이닉스 주식회사 Data processing system and data processing method
CN106649857A (en) * 2016-12-30 2017-05-10 北京恒华伟业科技股份有限公司 Reading and writing separation-based database operation method and apparatus
CN108073703A (en) * 2017-12-14 2018-05-25 郑州云海信息技术有限公司 A kind of comment information acquisition methods, device, equipment and storage medium

Also Published As

Publication number Publication date
CN110442565A (en) 2019-11-12

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant