CN113486018A - Production data storage method, storage device, electronic device and storage medium - Google Patents

Info

Publication number
CN113486018A
Authority
CN
China
Prior art keywords
attribute
production data
attribute name
storage
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110841038.1A
Other languages
Chinese (zh)
Other versions
CN113486018B (en
Inventor
胡建平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingdong Zhenshi Information Technology Co Ltd
Original Assignee
Beijing Jingdong Zhenshi Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jingdong Zhenshi Information Technology Co Ltd
Priority to CN202110841038.1A
Publication of CN113486018A
Application granted
Publication of CN113486018B
Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22: Indexing; Data structures therefor; Storage structures
    • G06F16/2291: User-Defined Types; Storage management thereof
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21: Design, administration or maintenance of databases
    • G06F16/215: Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28: Databases characterised by their database models, e.g. relational or object models
    • G06F16/284: Relational databases
    • G06F16/285: Clustering or classification
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00: Arrangements for program control, e.g. control units
    • G06F9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46: Multiprogramming arrangements
    • G06F9/54: Interprogram communication
    • G06F9/546: Message passing systems or structures, e.g. queues
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00: Indexing scheme relating to G06F9/00
    • G06F2209/54: Indexing scheme relating to G06F9/54
    • G06F2209/547: Messaging middleware
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00: Indexing scheme relating to G06F9/00
    • G06F2209/54: Indexing scheme relating to G06F9/54
    • G06F2209/548: Queue

Abstract

The disclosure provides a production data storage method, a storage device, an electronic device and a storage medium. The method comprises the following steps: sending the production data to a newly created message queue before writing the production data to the storage system; obtaining a plurality of production data from the message queue, wherein each production data comprises an attribute name and an attribute value corresponding to the attribute name; merging the attribute values with the same attribute name in the plurality of production data according to a preset merging rule to obtain a storage set corresponding to the attribute name; and storing the storage set corresponding to the attribute name to a storage system.

Description

Production data storage method, storage device, electronic device and storage medium
Technical Field
The present disclosure relates to the field of computer technology, and more particularly, to a production data storage method, a production data storage apparatus, an electronic device, a computer-readable storage medium, and a computer program product.
Background
With the development of computer technology, the volume of production data generated during production keeps growing, and online business platforms such as express delivery place increasingly high demands on the collection, storage, and monitoring of production process data. Most existing production data acquisition systems simply use a real-time database for acquisition and storage.
In implementing the disclosed concept, the inventors found that there are at least the following problems in the related art: the storage efficiency of production data is low.
Disclosure of Invention
In view of the above, the present disclosure provides a production data storage method, a production data storage apparatus, an electronic device, a computer-readable storage medium, and a computer program product.
One aspect of the present disclosure provides a production data storage method, including:
sending production data to a newly created message queue before writing the production data into a storage system;
obtaining a plurality of pieces of production data from the message queue, wherein each piece of production data comprises an attribute name and an attribute value corresponding to the attribute name;
merging the attribute values with the same attribute name in the plurality of pieces of production data according to a preset merging rule to obtain a storage set corresponding to the attribute name; and
storing the storage set corresponding to the attribute name in the storage system.
According to an embodiment of the present disclosure, the production data storage method further includes:
obtaining a plurality of production data tables;
acquiring initial production data from each production data table by using a log extraction tool; and
preprocessing each piece of initial production data to obtain a plurality of pieces of production data.
According to an embodiment of the present disclosure, the preprocessing includes at least one of: data cleaning, logic processing, data packaging, data statistics and data compression.
According to an embodiment of the present disclosure, the production data is data that is generated by the server at intervals of data generation time points and that carries a classification identifier;
the merging of the attribute values with the same attribute name in the plurality of pieces of production data according to a preset merging rule to obtain a storage set corresponding to the attribute name includes:
performing data conversion on each attribute name and the attribute value corresponding to the attribute name to obtain a plurality of first attribute names and first attribute values corresponding to the first attribute names;
determining, for each classification identifier, the first attribute names having the same classification identifier and the first attribute value corresponding to each of those first attribute names, from among the plurality of first attribute names corresponding to different pieces of production data and the first attribute value corresponding to each first attribute name;
determining, according to the data generation time point, a target attribute name and a target attribute value corresponding to the target attribute name from the plurality of first attribute names having the same classification identifier and the first attribute value corresponding to each of those first attribute names; and
merging the target attribute values corresponding to the target attribute names with different classification identifiers to obtain the storage set.
According to an embodiment of the present disclosure, the data generation time points include a first time point and a second time point, and the second time point is earlier than the first time point;
wherein the determining, according to the data generation time point, of a target attribute name and a target attribute value corresponding to the target attribute name from the plurality of first attribute names having the same classification identifier and the first attribute value corresponding to each of those first attribute names includes:
determining the first attribute name having the first time point and the first attribute value corresponding to that first attribute name from the plurality of first attribute names having the same classification identifier and the first attribute value corresponding to each of those first attribute names; and
determining the first attribute name having the first time point and the first attribute value corresponding to that first attribute name as the target attribute name and the target attribute value.
According to an embodiment of the present disclosure, the production data storage method further includes:
determining the first attribute name having the second time point and the first attribute value corresponding to that first attribute name from the plurality of first attribute names having the same classification identifier and the first attribute value corresponding to each of those first attribute names; and
deleting the first attribute name having the second time point and the first attribute value corresponding to that first attribute name.
Another aspect of the present disclosure provides a production data storage device, including:
a sending module, configured to send production data to a newly created message queue before the production data is written into a storage system;
a first obtaining module, configured to obtain a plurality of pieces of production data from the message queue, where each piece of production data includes an attribute name and an attribute value corresponding to the attribute name;
a merging module, configured to merge the attribute values with the same attribute name in the plurality of pieces of production data according to a preset merging rule, so as to obtain a storage set corresponding to the attribute name; and
a storage module, configured to store the storage set corresponding to the attribute name in the storage system.
Another aspect of the present disclosure provides an electronic device including:
one or more processors;
a memory for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method described above.
Another aspect of the present disclosure provides a computer-readable storage medium storing computer-executable instructions for implementing the method as described above when executed.
Another aspect of the disclosure provides a computer program product comprising computer executable instructions for implementing the method as described above when executed.
According to the embodiment of the disclosure, before the production data is written into the storage system, the production data is sent to a newly created message queue; the attribute values with the same attribute name in the production data are merged according to the preset merging rule to obtain the storage set corresponding to the attribute name; and the storage set is then stored in the storage system. This avoids the slow storage caused by writing a plurality of pieces of production data into the storage system at the same time, at least partially solves the technical problem of low production data storage efficiency, and thereby improves the storage efficiency of the production data.
Drawings
The above and other objects, features and advantages of the present disclosure will become more apparent from the following description of embodiments of the present disclosure with reference to the accompanying drawings, in which:
FIG. 1 schematically illustrates a flow chart of a production data storage method in the related art;
FIG. 2 schematically illustrates an exemplary system architecture for applying the production data storage method according to an embodiment of the present disclosure;
FIG. 3 schematically illustrates a flow chart of a method of production data storage according to an embodiment of the present disclosure;
FIG. 4 schematically illustrates a flow chart of a method of obtaining a storage set according to an embodiment of the present disclosure;
FIG. 5 schematically illustrates a block diagram of a production data storage device, in accordance with an embodiment of the present disclosure; and
FIG. 6 schematically illustrates a block diagram of an electronic device suitable for implementing a production data storage method according to an embodiment of the present disclosure.
Detailed Description
Hereinafter, embodiments of the present disclosure will be described with reference to the accompanying drawings. It should be understood that the description is illustrative only and is not intended to limit the scope of the present disclosure. In the following detailed description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the embodiments of the disclosure. It may be evident, however, that one or more embodiments may be practiced without these specific details. Moreover, in the following description, descriptions of well-known structures and techniques are omitted so as to not unnecessarily obscure the concepts of the present disclosure.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. The terms "comprises," "comprising," and the like, as used herein, specify the presence of stated features, steps, operations, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, or components.
All terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art unless otherwise defined. It is noted that the terms used herein should be interpreted as having a meaning that is consistent with the context of this specification and should not be interpreted in an idealized or overly formal sense.
Where a convention analogous to "at least one of A, B and C, etc." is used, in general such a construction is intended in the sense one having skill in the art would understand the convention (e.g., "a system having at least one of A, B and C" would include but not be limited to systems that have A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B, C together, etc.). Where a convention analogous to "at least one of A, B or C, etc." is used, in general such a construction is intended in the sense one having skill in the art would understand the convention (e.g., "a system having at least one of A, B or C" would include but not be limited to systems that have A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B, C together, etc.).
Fig. 1 schematically shows a flowchart of a production data storage method in the related art.
As shown in FIG. 1, in the related art, a log extraction tool generally extracts real-time data items from the log file of a production data table to generate production data. The production data is then preprocessed by the processing framework of a real-time data computing project, which computes the fields required by the wide table and writes them into the storage system.
However, when the amount of acquired data or of real-time data is large, the production data may contain many pieces with the same attribute value, and the storage system may repeatedly refuse to write the production data submitted to it. This generates back pressure: when the processing speed of the processing framework exceeds the storage speed, the back pressure limits the processing speed and reduces the consumption speed of the processing framework.
The inventor found that, before writing to the storage system, a message queue can be newly created to hold the production data, which decouples data processing from data storage, and the same attribute values in a plurality of pieces of production data can be merged. This avoids the problem of the storage system refusing to write a large amount of production data with the same attribute values.
Accordingly, embodiments of the present disclosure provide a production data storage method, a production data storage apparatus, an electronic device, a computer-readable storage medium, and a computer program product. The method includes sending production data to a newly created message queue before writing the production data to the storage system; obtaining a plurality of production data from the message queue, wherein each production data comprises an attribute name and an attribute value corresponding to the attribute name; merging the attribute values with the same attribute name in the plurality of production data according to a preset merging rule to obtain a storage set corresponding to the attribute name; and storing the storage set corresponding to the attribute name to a storage system.
FIG. 2 schematically illustrates an exemplary system architecture 200 to which the production data storage method may be applied, according to an embodiment of the disclosure. It should be noted that fig. 2 is only an example of a system architecture to which the embodiments of the present disclosure may be applied to help those skilled in the art understand the technical content of the present disclosure, and does not mean that the embodiments of the present disclosure may not be applied to other devices, systems, environments or scenarios.
As shown in fig. 2, the system architecture 200 according to this embodiment may include terminal devices 201, 202, 203, a network 204 and a server 205. The network 204 serves as a medium for providing communication links between the terminal devices 201, 202, 203 and the server 205. Network 204 may include various connection types, such as wired and/or wireless communication links, and so forth.
The user may use the terminal devices 201, 202, 203 to interact with the server 205 via the network 204 to receive or send messages or the like. The terminal devices 201, 202, 203 may have installed thereon various communication client applications, such as a production data storage-like application, a web browser application, a search-like application, an instant messaging tool, a mailbox client, and/or social platform software, etc. (by way of example only).
The terminal devices 201, 202, 203 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, laptop portable computers, desktop computers, and the like.
The server 205 may be a server providing various services, such as a background management server (for example only) providing support for websites browsed by users using the terminal devices 201, 202, 203. The background management server may analyze and perform other processing on the received data such as the user request, and feed back a processing result (e.g., a webpage, information, or data obtained or generated according to the user request) to the terminal device.
It should be noted that the production data storage method provided by the embodiment of the present disclosure may generally be executed by the server 205. Accordingly, the production data storage device provided by the embodiment of the present disclosure may generally be disposed in the server 205. The production data storage method provided by the embodiment of the present disclosure may also be executed by a server or a server cluster different from the server 205 and capable of communicating with the terminal devices 201, 202, 203 and/or the server 205. Accordingly, the production data storage device provided by the embodiment of the present disclosure may also be disposed in a server or a server cluster different from the server 205 and capable of communicating with the terminal devices 201, 202, 203 and/or the server 205. Alternatively, the production data storage method provided by the embodiment of the present disclosure may also be executed by the terminal device 201, 202, or 203, or by another terminal device different from the terminal device 201, 202, or 203. Accordingly, the production data storage device provided by the embodiment of the present disclosure may also be disposed in the terminal device 201, 202, or 203, or in another terminal device different from the terminal device 201, 202, or 203.
It should be understood that the number of terminal devices, networks, and servers in fig. 2 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
FIG. 3 schematically shows a flow diagram of a method of production data storage according to an embodiment of the disclosure.
As shown in fig. 3, the method may include operations S301 to S304.
In operation S301, before writing production data to the storage system, the production data is sent to the newly created message queue.
In operation S302, a plurality of production data is obtained from a message queue, wherein each production data includes an attribute name and an attribute value corresponding to the attribute name.
In operation S303, the attribute values with the same attribute name in the multiple pieces of production data are merged according to a preset merging rule, so as to obtain a storage set corresponding to the attribute name.
In operation S304, a storage set corresponding to the attribute name is stored to the storage system.
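Operations S301 to S304 can be illustrated with a minimal in-process sketch. It is not the patented implementation: the standard-library `queue.Queue` stands in for the message queue, a plain dictionary stands in for the storage system, and the merging rule (drop duplicate values, keep first-seen order) as well as all function and field names are hypothetical.

```python
import queue
from collections import defaultdict


def send_to_queue(mq, records):
    """S301: enqueue production data instead of writing it straight to storage."""
    for record in records:
        mq.put(record)


def drain(mq, max_items=1000):
    """S302: pull a batch of production data from the message queue."""
    batch = []
    while len(batch) < max_items:
        try:
            batch.append(mq.get_nowait())
        except queue.Empty:
            break
    return batch


def merge_by_attribute_name(batch):
    """S303: collect the attribute values that share an attribute name.

    Hypothetical merging rule: drop duplicate values, keep first-seen order.
    """
    storage_sets = defaultdict(list)
    for record in batch:
        for name, value in record.items():
            if value not in storage_sets[name]:
                storage_sets[name].append(value)
    return dict(storage_sets)


def store(storage_system, storage_sets):
    """S304: one write per attribute name rather than one write per record."""
    for name, values in storage_sets.items():
        storage_system[name] = values


mq = queue.Queue()          # stand-in for the newly created message queue
storage_system = {}         # stand-in for the storage system
send_to_queue(mq, [
    {"customer": "Zhang San", "status": 8},
    {"customer": "Zhang San", "status": 9},
    {"customer": "Li Si", "status": 9},
])
store(storage_system, merge_by_attribute_name(drain(mq)))
print(storage_system)
```

In this sketch the three queued records result in two writes, one per attribute name, which is the effect the merging step is meant to achieve.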
According to an embodiment of the present disclosure, the Storage system may include an Embedded Storage system (ES), a CK Storage system, a Doris Storage system, or a MySQL Storage system. The message queue may comprise a Kafka message queue.
According to the embodiment of the disclosure, a message queue is newly created, and the production data is stored in the message queue. The production data is then extracted from the message queue, and the attribute values corresponding to the attribute names in the plurality of pieces of production data are merged according to a preset merging rule to obtain a storage set. The storage set is stored in the storage system.
According to the embodiment of the disclosure, when storing to the storage system, the storage may be performed through a plurality of storage nodes, or the storage performance of the storage system may be improved, so as to improve the storage efficiency.
According to the embodiment of the disclosure, the "time mode" of the related art, in which production data is stored directly in the storage system and storage efficiency is low, is converted into the "space mode" of this embodiment, in which production data is first transferred to a new message queue for merging. The attribute values with the same attribute name can thus be merged to improve the storage efficiency.
According to the embodiment of the disclosure, before the production data is written into the storage system, the production data is sent to a newly created message queue; the attribute values with the same attribute name in the production data are merged according to the preset merging rule to obtain the storage set corresponding to the attribute name; and the storage set is then stored in the storage system. This avoids the slow storage caused by writing a plurality of pieces of production data into the storage system at the same time, at least partially solves the technical problem of low production data storage efficiency, and thereby improves the storage efficiency of the production data.
According to an embodiment of the present disclosure, the production data storage method may further include the following operations.
Obtaining a plurality of production data tables; acquiring initial production data from each production data table by using a log extraction tool; and preprocessing each initial production data to obtain a plurality of production data.
According to embodiments of the present disclosure, tools used for preprocessing may include an Apache Storm streaming framework, an Apache Spark streaming framework, or an Apache Flink open source streaming framework.
According to the embodiment of the disclosure, the log file in the production data table is extracted by using the log extraction tool, so as to obtain the initial production data. The log file may comprise a binlog file.
According to an embodiment of the present disclosure, initial production data is preprocessed to obtain a plurality of production data.
According to the embodiment of the disclosure, the preprocessing and the storage of the production data, which are coupled in the related art, are separated and decoupled. This addresses the reduced consumption speed and delayed processing of the production data table that result from the "back pressure" generated in the related art when there is a large amount of production data.
According to embodiments of the present disclosure, the pre-processing may include at least one of: data cleaning, logic processing, data packaging, data statistics and data compression.
According to an embodiment of the present disclosure, the data cleansing may include deleting the attribute values corresponding to attribute names in the initial production data whose loss rate is greater than or equal to a loss rate threshold; for example, the loss rate threshold may be 50%.
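The loss rate rule can be sketched as follows. This is an illustration under stated assumptions, not the disclosed implementation: the loss rate of an attribute name is taken to be the fraction of records in which that attribute is absent or `None`, and the function name `drop_sparse_attributes` and the sample fields are hypothetical.

```python
def drop_sparse_attributes(records, loss_rate_threshold=0.5):
    """Remove every attribute whose loss rate is >= the threshold.

    Assumption: an attribute counts as lost in a record when it is
    absent from the record or present with the value None.
    """
    if not records:
        return []
    total = len(records)
    names = {name for record in records for name in record}
    keep = set()
    for name in names:
        missing = sum(1 for record in records if record.get(name) is None)
        if missing / total < loss_rate_threshold:
            keep.add(name)
    return [{k: v for k, v in record.items() if k in keep} for record in records]


rows = [
    {"customer": "Zhang San", "status": 8, "note": None},
    {"customer": "Li Si", "status": 9, "note": None},
    {"customer": "Zhang San", "status": None, "note": "fragile"},
]
cleaned = drop_sparse_attributes(rows)
print(cleaned)
```

Here `note` is lost in two of three records (above the 50% threshold) and is removed, while `status` is lost in only one of three and is kept.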
According to an embodiment of the present disclosure, the logic processing may include modifying an attribute value of a numeric type corresponding to an attribute name in the initial production data.
According to embodiments of the present disclosure, data encapsulation may include mapping initial production data into a payload of an encapsulation protocol, then padding a header of the corresponding protocol to form a data packet of the encapsulation protocol, and completing rate adaptation.
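The payload-plus-header idea can be sketched as below. The encapsulation protocol is not specified in the source, so the header layout here (two magic bytes, a one-byte version, and a four-byte payload length) and the JSON payload encoding are purely hypothetical, and rate adaptation is not modeled.

```python
import json
import struct

# Hypothetical header: 2-byte magic, 1-byte version, 4-byte payload length,
# all in network byte order with no padding (7 bytes in total).
HEADER = struct.Struct("!2sBI")
MAGIC = b"PD"


def encapsulate(record, version=1):
    """Map a record into the payload and prepend the protocol header."""
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return HEADER.pack(MAGIC, version, len(payload)) + payload


def decapsulate(packet):
    """Parse the header, check the magic bytes, and recover the record."""
    magic, version, length = HEADER.unpack_from(packet)
    if magic != MAGIC:
        raise ValueError("unexpected magic bytes")
    payload = packet[HEADER.size:HEADER.size + length]
    return version, json.loads(payload.decode("utf-8"))


packet = encapsulate({"customer": "Zhang San", "status": 8})
print(decapsulate(packet))
```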
According to an embodiment of the present disclosure, the data statistics may include performing statistics on attribute names in the initial production data and deleting extraneous feature data that does not need to be counted.
According to embodiments of the present disclosure, data compression may be used to compress initial production data that has undergone data cleansing, logic processing, data encapsulation, data statistics, to form desired production data.
According to an embodiment of the present disclosure, the preprocessing may further include a rule engine processing, and the rule engine processing may include processing the screening condition according to the data processing rule.
FIG. 4 schematically illustrates a flow chart of a method of obtaining a storage set according to an embodiment of the disclosure.
According to the embodiment of the disclosure, the production data is data that is generated by the server at intervals of data generation time points and that carries a classification identifier.
As shown in fig. 4, merging the attribute values with the same attribute name in the multiple production data according to a preset merging rule to obtain a storage set corresponding to the attribute name may include operations S401 to S404.
In operation S401, data conversion is performed on each attribute name and the attribute value corresponding to the attribute name to obtain a plurality of first attribute names and first attribute values corresponding to the first attribute names.
In operation S402, for each classification identifier, the first attribute names having the same classification identifier and the first attribute value corresponding to each of those first attribute names are determined from among the plurality of first attribute names corresponding to different pieces of production data and the first attribute value corresponding to each first attribute name.
In operation S403, a target attribute name and a target attribute value corresponding to the target attribute name are determined, according to the data generation time point, from the plurality of first attribute names having the same classification identifier and the first attribute value corresponding to each of those first attribute names.
In operation S404, the target attribute values corresponding to the target attribute names with different classification identifiers are merged to obtain a storage set.
According to the embodiments of the present disclosure, for convenience of description, the following embodiments will be exemplified with production data of the express industry as a specific use scenario. It should be noted that the following examples are for convenience of description only and are not intended to limit the scope of the present disclosure.
According to the embodiment of the disclosure, since the production data tables are transmitted in real time, the production data tables acquired within a window partitioned by Flink windows may include production data tables that have the same classification identifier but different data generation time points.
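Why one window can hold several records with the same classification identifier can be seen in a plain-Python sketch of fixed-size (tumbling) windows. This does not use the Flink API; the function `tumbling_windows`, the six-hour window size, and the field names are assumptions chosen for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def tumbling_windows(records, size):
    """Group records into fixed-size, non-overlapping windows keyed by window start."""
    epoch = datetime(1970, 1, 1)
    windows = defaultdict(list)
    for record in records:
        index = (record["time"] - epoch) // size  # whole windows elapsed since epoch
        windows[epoch + index * size].append(record)
    return dict(windows)


records = [
    {"customer": "Zhang San", "status": 8, "time": datetime(2021, 1, 1, 15, 0)},
    {"customer": "Zhang San", "status": 9, "time": datetime(2021, 1, 1, 16, 0)},
    {"customer": "Li Si", "status": 9, "time": datetime(2021, 1, 1, 15, 0)},
]
windows = tumbling_windows(records, timedelta(hours=6))
for start, items in windows.items():
    print(start, [item["customer"] for item in items])
```

With a six-hour window, both "Zhang San" records (15:00 and 16:00) fall into the window that starts at 12:00, so the later merging step has to choose between them.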
According to an embodiment of the present disclosure, for example, production data A (product: garment A; customer name: Zhang III; logistics state: 8; time: 2021.01.01, 15: 00); production data B (product: garment A; customer name: Zhang III; logistics status: 9; time: 2021.01.01, 16: 00); production data C (product: garment A; customer name: Liqu; logistics state: 9; time: 2021.01.01, 15: 00). The classification identifier may be an attribute value of a client name, and the attribute name may include the client name, a logistics state, and time.
According to an embodiment of the present disclosure, the attribute values are subjected to data conversion, and the converted production data may be: production data A (product: clothes A; customer name: Zhang III; logistics status: ex warehouse; time: 2021.01.01, 15: 00); production data B (product: clothing A; customer name: Zhang III; logistics status: delivered; time: 2021.01.01, 16: 00); production data C (product: garment B; customer name: Liqun; logistics status: delivered; time: 2021.01.01, 15: 00). In the data conversion process, the number "8" in the logistics state may refer to "ex-warehouse", the number "9" may refer to "delivered", and the first attribute name may include a customer name, a logistics state, and a time.
According to the embodiment of the present disclosure, it is determined that the production data having the same classification flag has the production data a and the production data B among the three production data according to the classification flag "zhang san". And determining the first attribute name and the corresponding attribute value in the production data B as the target attribute name and the corresponding target attribute value from the production data A and the production data B according to the data generation time point.
According to the embodiment of the disclosure, the target attribute names and the corresponding target attribute values in the generated data B and the data C are merged according to different classification identifiers "zhang san" and "lie si", so as to obtain a storage set.
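The worked example above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the disclosure: the field names, the STATUS_LABELS mapping, and the helper functions convert and build_storage_set are hypothetical, and the times are rewritten in a format that Python can parse.

```python
from datetime import datetime

# Hypothetical code-to-label mapping taken from the example
# (8 -> ex-warehouse, 9 -> delivered).
STATUS_LABELS = {8: "ex-warehouse", 9: "delivered"}

# Production data A, B and C from the example; field names are assumptions.
records = [
    {"customer_name": "Zhang San", "logistics_status": 8, "time": "2021-01-01 15:00"},
    {"customer_name": "Zhang San", "logistics_status": 9, "time": "2021-01-01 16:00"},
    {"customer_name": "Li Si", "logistics_status": 9, "time": "2021-01-01 15:00"},
]


def convert(record):
    """Data conversion: map the status code to its label and parse the time."""
    converted = dict(record)
    converted["logistics_status"] = STATUS_LABELS[record["logistics_status"]]
    converted["time"] = datetime.strptime(record["time"], "%Y-%m-%d %H:%M")
    return converted


def build_storage_set(records, class_key="customer_name"):
    """Keep the most recent record per classification identifier, then merge."""
    latest = {}
    for record in map(convert, records):
        key = record[class_key]
        if key not in latest or record["time"] > latest[key]["time"]:
            latest[key] = record
    # The storage set holds one record per distinct classification identifier.
    return list(latest.values())


storage_set = build_storage_set(records)
```

Here storage_set contains two entries: the converted production data B for "Zhang San" and the converted production data C for "Li Si". Production data A is left out because it is the earlier record for the same classification identifier.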
According to the embodiment of the disclosure, by combining the attribute values with the same attribute name in the plurality of production data according to the preset combination rule, the number of interfaces of the storage system used during storage can be reduced, the storage efficiency during storage of the storage set is indirectly improved, and the problem that the production data submitted to the storage system in the related art contains a large number of attribute values with the same attribute name, so that the preprocessing tool is down is avoided.
According to an embodiment of the present disclosure, the data generation time point may include a first time point and a second time point, the second time point having a time earlier than that of the first time point.
Determining, according to the data generation time point, a target attribute name and a target attribute value corresponding to the target attribute name from among a plurality of first attribute names having the same classification identifier and the first attribute value corresponding to each first attribute name may include the following operations.
A first attribute name having the first time point and a first attribute value corresponding to the first attribute name are determined from the plurality of first attribute names having the same classification identifier and the first attribute value corresponding to each first attribute name.
A first attribute name having a first time point and a first attribute value corresponding to the first attribute name are determined as a target attribute name and a target attribute value.
According to an embodiment of the present disclosure, production data A and production data B, which have the same classification identifier "Zhang San", carry different times. For example, the first time point may refer to the time in production data B (16:00), and the second time point may refer to the earlier time in production data A (15:00). The first attribute names in production data B and the first attribute values corresponding to them are determined as the target attribute names and the target attribute values.
According to an embodiment of the present disclosure, the production data storage method may further include the following operations.
The first attribute name having the second time point and the first attribute value corresponding to the first attribute name are determined from a plurality of first attribute names having the same class identification and the first attribute value corresponding to each first attribute name.
Deleting the first attribute name having the second time point and the first attribute value corresponding to the first attribute name.
According to an embodiment of the present disclosure, production data A and production data B, which have the same classification identifier "Zhang San", carry different times. For example, the first time point may refer to the time in production data B (16:00), and the second time point may refer to the earlier time in production data A (15:00). The first attribute names of production data A, which has the second time point, and the first attribute values corresponding to them are deleted.
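The selection of the record with the first (later) time point and the deletion of records with the second (earlier) time point can be sketched as below. This is only an illustrative sketch: the function split_by_time_point and the field names are hypothetical, and it assumes that times are strings in a fixed "YYYY-MM-DD HH:MM" layout so that string comparison matches chronological order.

```python
from collections import defaultdict


def split_by_time_point(records, class_key="customer_name", time_key="time"):
    """Return (targets, discarded).

    targets   -- the latest record for each classification identifier
    discarded -- all earlier records, which would be deleted
    """
    groups = defaultdict(list)
    for record in records:
        groups[record[class_key]].append(record)

    targets, discarded = [], []
    for group in groups.values():
        newest = max(group, key=lambda r: r[time_key])
        targets.append(newest)
        discarded.extend(r for r in group if r is not newest)
    return targets, discarded


# Converted production data A, B and C; the "name" field is only a label.
records = [
    {"name": "A", "customer_name": "Zhang San",
     "logistics_status": "ex-warehouse", "time": "2021-01-01 15:00"},
    {"name": "B", "customer_name": "Zhang San",
     "logistics_status": "delivered", "time": "2021-01-01 16:00"},
    {"name": "C", "customer_name": "Li Si",
     "logistics_status": "delivered", "time": "2021-01-01 15:00"},
]

targets, discarded = split_by_time_point(records)
```

With this input, targets holds production data B and C, and discarded holds production data A.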
FIG. 5 schematically shows a block diagram of a production data storage device according to an embodiment of the disclosure.
As shown in fig. 5, the production data storage device 500 may include a sending module 510, a first obtaining module 520, a merging module 530, and a storage module 540.
The sending module 510 is configured to send the production data to the newly created message queue before the production data is written to the storage system.
The first obtaining module 520 is configured to obtain a plurality of production data from the message queue, where each production data includes an attribute name and an attribute value corresponding to the attribute name.
The merging module 530 is configured to merge attribute values with the same attribute name in the plurality of production data according to a preset merging rule, so as to obtain a storage set corresponding to the attribute name.
The storage module 540 is configured to store the storage set corresponding to the attribute name in the storage system.
According to an embodiment of the present disclosure, before the production data is written into the storage system, the production data is sent to the newly created message queue; the attribute values with the same attribute name in the production data are merged according to the preset merging rule to obtain the storage set corresponding to the attribute name; and the storage set is then stored in the storage system. This avoids the slow storage caused by writing a plurality of production data into the storage system at the same time, at least partially solves the technical problem of low storage efficiency for production data, and thereby achieves the technical effect of improving the storage efficiency of production data.
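The cooperation of the four modules can be illustrated with the following Python sketch. It is a simplified, single-process stand-in under stated assumptions: the standard-library queue.Queue takes the place of the message queue, InMemoryStorage is a hypothetical storage system that merely counts write calls, and the "keep the latest record per classification identifier" rule is used as one possible preset merging rule.

```python
import queue


class InMemoryStorage:
    """Hypothetical storage system that records how often it is written to."""

    def __init__(self):
        self.rows = []
        self.write_calls = 0

    def write_batch(self, rows):
        self.write_calls += 1
        self.rows.extend(rows)


def send(message_queue, records):
    """Role of the sending module 510: enqueue data before any storage write."""
    for record in records:
        message_queue.put(record)


def obtain(message_queue):
    """Role of the first obtaining module 520: drain the queued production data."""
    batch = []
    while True:
        try:
            batch.append(message_queue.get_nowait())
        except queue.Empty:
            return batch


def merge(batch, class_key="customer_name", time_key="time"):
    """Role of the merging module 530: keep the latest record per identifier."""
    latest = {}
    for record in batch:
        key = record[class_key]
        if key not in latest or record[time_key] > latest[key][time_key]:
            latest[key] = record
    return list(latest.values())


def store(storage, storage_set):
    """Role of the storage module 540: write the storage set in one call."""
    storage.write_batch(storage_set)


message_queue = queue.Queue()
storage = InMemoryStorage()
send(message_queue, [
    {"customer_name": "Zhang San", "logistics_status": "ex-warehouse", "time": "2021-01-01 15:00"},
    {"customer_name": "Zhang San", "logistics_status": "delivered", "time": "2021-01-01 16:00"},
    {"customer_name": "Li Si", "logistics_status": "delivered", "time": "2021-01-01 15:00"},
])
store(storage, merge(obtain(message_queue)))
```

In this sketch three pieces of production data result in a single write of two merged records, which reflects the intended reduction in the number of writes reaching the storage system.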
According to an embodiment of the present disclosure, the production data storage device 500 may further include a second obtaining module, an extracting module, and a preprocessing module.
The second obtaining module is configured to obtain a plurality of production data tables.
The extraction module is configured to acquire initial production data from each production data table by using a log extraction tool.
The preprocessing module is configured to preprocess each initial production data to obtain the plurality of production data.
According to an embodiment of the present disclosure, the pre-processing comprises at least one of: data cleaning, logic processing, data packaging, data statistics and data compression.
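As a rough illustration of three of the listed preprocessing steps (data cleaning, data packaging, and data compression), the following Python sketch may be considered. The disclosure does not specify how these steps are carried out, so the functions clean, package, and compress, the cleaning rule, and the choice of JSON and zlib are all assumptions made for the example.

```python
import json
import zlib


def clean(rows):
    """Data cleaning: strip whitespace and drop rows with missing values."""
    cleaned = []
    for row in rows:
        stripped = {k: v.strip() if isinstance(v, str) else v for k, v in row.items()}
        if all(v not in ("", None) for v in stripped.values()):
            cleaned.append(stripped)
    return cleaned


def package(rows):
    """Data packaging: serialise the rows to compact UTF-8 JSON bytes."""
    return json.dumps(rows, ensure_ascii=False, separators=(",", ":")).encode("utf-8")


def compress(payload):
    """Data compression: shrink the packaged bytes with zlib."""
    return zlib.compress(payload)


# Hypothetical initial production data extracted from a production data table.
initial = [
    {"customer_name": " Zhang San ", "logistics_status": 8},
    {"customer_name": "", "logistics_status": 9},
]

payload = compress(package(clean(initial)))
restored = json.loads(zlib.decompress(payload).decode("utf-8"))
```

After decompression, restored contains only the cleaned first row; the second row is dropped because its customer name is empty.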
According to the embodiment of the disclosure, the production data is data which is generated by the server according to the data generation time point interval and has the classification identification.
According to an embodiment of the present disclosure, the merging module 530 may include a conversion unit, a first determination unit, a second determination unit, and a merging unit.
The conversion unit is configured to perform data conversion on each attribute name and the attribute value corresponding to the attribute name to obtain a plurality of first attribute names and first attribute values corresponding to the first attribute names.
The first determination unit is configured to determine, according to each classification identifier, the first attribute names having the same classification identifier and the first attribute value corresponding to each of those first attribute names from among the plurality of first attribute names corresponding to different production data and the first attribute value corresponding to each first attribute name.
The second determination unit is configured to determine, according to the data generation time point, a target attribute name and a target attribute value corresponding to the target attribute name from among the plurality of first attribute names having the same classification identifier and the first attribute value corresponding to each of those first attribute names.
The merging unit is configured to merge the target attribute values corresponding to the target attribute information with different classification identifiers to obtain a storage set.
According to an embodiment of the present disclosure, the data generation time point may include a first time point and a second time point, the second time point having a time earlier than that of the first time point.
According to an embodiment of the present disclosure, the second determination unit may include a first determination subunit and a second determination subunit.
A first determining subunit, configured to determine, from the plurality of first attribute names having the same class identifier and the first attribute value corresponding to each first attribute name, a first attribute name having a first time point and a first attribute value corresponding to the first attribute name.
A second determining subunit, configured to determine the first attribute name having the first time point and the first attribute value corresponding to the first attribute name as a target attribute name and a target attribute value.
According to an embodiment of the present disclosure, the second determination unit may further include a third determination subunit and a deletion subunit.
A third determining subunit, configured to determine, from the plurality of first attribute names having the same class identifier and the first attribute value corresponding to each first attribute name, the first attribute name having the second time point and the first attribute value corresponding to the first attribute name.
And a deletion subunit, configured to delete the first attribute name having the second time point and the first attribute value corresponding to the first attribute name.
Any of the modules, units, sub-units, or at least part of the functionality of any of them according to embodiments of the present disclosure may be implemented in one module. Any one or more of the modules, units and sub-units according to the embodiments of the present disclosure may be implemented by being split into a plurality of modules. Any one or more of the modules, units, sub-units according to the embodiments of the present disclosure may be implemented at least partially as a hardware circuit, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a system on a chip, a system on a substrate, a system on a package, an Application Specific Integrated Circuit (ASIC), or may be implemented in any other reasonable manner of hardware or firmware by integrating or packaging a circuit, or in any one of three implementations of software, hardware, and firmware, or in any suitable combination of any of them. Alternatively, one or more of the modules, units, sub-units according to embodiments of the disclosure may be at least partially implemented as computer program modules, which, when executed, may perform the corresponding functions.
For example, any plurality of the sending module 510, the first obtaining module 520, the merging module 530 and the storage module 540 may be combined and implemented in one module/unit/sub-unit, or any one of the modules/units/sub-units may be split into a plurality of modules/units/sub-units. Alternatively, at least part of the functionality of one or more of these modules/units/sub-units may be combined with at least part of the functionality of other modules/units/sub-units and implemented in one module/unit/sub-unit. According to an embodiment of the present disclosure, at least one of the sending module 510, the first obtaining module 520, the merging module 530 and the storage module 540 may be implemented at least partially as a hardware circuit, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a system on a chip, a system on a substrate, a system on a package, an Application Specific Integrated Circuit (ASIC), or may be implemented by hardware or firmware in any other reasonable manner of integrating or packaging a circuit, or implemented by any one of three implementations of software, hardware and firmware, or any suitable combination of any of them. Alternatively, at least one of the sending module 510, the first obtaining module 520, the merging module 530 and the storage module 540 may be at least partially implemented as a computer program module, which when executed may perform a corresponding function.
It should be noted that the production data storage device portion in the embodiment of the present disclosure corresponds to the production data storage method portion in the embodiment of the present disclosure, and the description of the production data storage device portion specifically refers to the production data storage method portion, and is not repeated herein.
Fig. 6 schematically shows a block diagram of an electronic device adapted to implement the above described method according to an embodiment of the present disclosure. The electronic device shown in fig. 6 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present disclosure.
As shown in fig. 6, an electronic device 600 according to an embodiment of the present disclosure includes a processor 601, which can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM)602 or a program loaded from a storage section 608 into a Random Access Memory (RAM) 603. Processor 601 may include, for example, a general purpose microprocessor (e.g., a CPU), an instruction set processor and/or associated chipset, and/or a special purpose microprocessor (e.g., an Application Specific Integrated Circuit (ASIC)), among others. The processor 601 may also include onboard memory for caching purposes. Processor 601 may include a single processing unit or multiple processing units for performing different actions of a method flow according to embodiments of the disclosure.
In the RAM 603, various programs and data necessary for the operation of the electronic apparatus 600 are stored. The processor 601, the ROM 602, and the RAM 603 are connected to each other via a bus 604. The processor 601 performs various operations of the method flows according to the embodiments of the present disclosure by executing programs in the ROM 602 and/or RAM 603. It is to be noted that the programs may also be stored in one or more memories other than the ROM 602 and RAM 603. The processor 601 may also perform various operations of the method flows according to embodiments of the present disclosure by executing programs stored in the one or more memories.
According to an embodiment of the present disclosure, the electronic device 600 may also include an input/output (I/O) interface 605, which is also connected to the bus 604. The electronic device 600 may also include one or more of the following components connected to the I/O interface 605: an input section 606 including a keyboard, a mouse, and the like; an output section 607 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), and a speaker; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card, a modem, or the like. The communication section 609 performs communication processing via a network such as the internet. A drive 610 is also connected to the I/O interface 605 as needed. A removable medium 611 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 610 as necessary, so that a computer program read out therefrom is installed in the storage section 608 as necessary.
According to embodiments of the present disclosure, method flows according to embodiments of the present disclosure may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable storage medium, the computer program containing program code for performing the method illustrated by the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 609, and/or installed from the removable medium 611. The computer program, when executed by the processor 601, performs the above-described functions defined in the system of the embodiments of the present disclosure. The systems, devices, apparatuses, modules, units, etc. described above may be implemented by computer program modules according to embodiments of the present disclosure.
The present disclosure also provides a computer-readable storage medium, which may be contained in the apparatus/device/system described in the above embodiments; or may exist separately and not be assembled into the device/apparatus/system. The computer-readable storage medium carries one or more programs which, when executed, implement the method according to an embodiment of the disclosure.
According to an embodiment of the present disclosure, the computer-readable storage medium may be a non-volatile computer-readable storage medium. Examples may include, but are not limited to: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
For example, according to embodiments of the present disclosure, a computer-readable storage medium may include the ROM 602 and/or RAM 603 described above and/or one or more memories other than the ROM 602 and RAM 603.
Embodiments of the present disclosure also include a computer program product comprising a computer program containing program code for performing the method provided by the embodiments of the present disclosure, when the computer program product is run on an electronic device, the program code being adapted to cause the electronic device to carry out the method of production data storage provided by the embodiments of the present disclosure.
The computer program, when executed by the processor 601, performs the above-described functions defined in the system/apparatus of the embodiments of the present disclosure. The systems, apparatuses, modules, units, etc. described above may be implemented by computer program modules according to embodiments of the present disclosure.
In one embodiment, the computer program may be hosted on a tangible storage medium such as an optical storage device, a magnetic storage device, or the like. In another embodiment, the computer program may also be transmitted, distributed in the form of a signal on a network medium, downloaded and installed through the communication section 609, and/or installed from the removable medium 611. The computer program containing program code may be transmitted using any suitable network medium, including but not limited to: wireless, wired, etc., or any suitable combination of the foregoing.
In accordance with embodiments of the present disclosure, program code for executing computer programs provided by embodiments of the present disclosure may be written in any combination of one or more programming languages, and in particular, these computer programs may be implemented using high level procedural and/or object oriented programming languages, and/or assembly/machine languages. The programming languages include, but are not limited to, Java, C++, Python, the "C" language, or the like. The program code may execute entirely on the user computing device, partly on the user device, partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., through the internet using an internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions. Those skilled in the art will appreciate that the features recited in the various embodiments and/or claims of the present disclosure can be combined and/or incorporated in various ways, even if such combinations or incorporations are not expressly recited in the present disclosure. In particular, the features recited in the various embodiments and/or claims of the present disclosure may be combined and/or incorporated in various ways without departing from the spirit or teaching of the present disclosure. All such combinations and/or incorporations are within the scope of the present disclosure.
The embodiments of the present disclosure have been described above. However, these examples are for illustrative purposes only and are not intended to limit the scope of the present disclosure. Although the embodiments are described separately above, this does not mean that the measures in the embodiments cannot be used in advantageous combination. The scope of the disclosure is defined by the appended claims and equivalents thereof. Various alternatives and modifications can be devised by those skilled in the art without departing from the scope of the present disclosure, and such alternatives and modifications are intended to be within the scope of the present disclosure.

Claims (10)

1. A production data storage method comprising:
sending the production data to a newly created message queue before writing the production data to a storage system;
obtaining a plurality of production data from the message queue, wherein each production data comprises an attribute name and an attribute value corresponding to the attribute name;
merging the attribute values with the same attribute name in the plurality of production data according to a preset merging rule to obtain a storage set corresponding to the attribute name; and
storing the storage set corresponding to the attribute name to the storage system.
2. The method of claim 1, further comprising:
obtaining a plurality of production data tables;
acquiring initial production data from each production data table by using a log extraction tool; and
preprocessing each initial production data to obtain a plurality of production data.
3. The method of claim 2, the pre-processing comprising at least one of: data cleaning, logic processing, data packaging, data statistics and data compression.
4. The method of claim 1, wherein the production data is data generated by a server according to a data generation time point interval and having a classification identifier;
the merging the attribute values with the same attribute name in the plurality of production data according to a preset merging rule to obtain a storage set corresponding to the attribute name includes:
performing data conversion on each attribute name and an attribute value corresponding to the attribute name to obtain a plurality of first attribute names and first attribute values corresponding to the first attribute names;
determining, according to each classification identifier, the first attribute names having the same classification identifier and the first attribute value corresponding to each of the first attribute names from a plurality of first attribute names corresponding to different ones of the production data and the first attribute value corresponding to each of the first attribute names;
determining a target attribute name and the target attribute value corresponding to the target attribute name from a plurality of the first attribute names having the same classification identifier and the first attribute value corresponding to each of the first attribute names according to the data generation time point; and
merging the target attribute values corresponding to the target attribute information with different classification identifiers to obtain the storage set.
5. The method of claim 4, the data generation time points comprising a first time point and a second time point, the second time point being earlier in time than the first time point;
wherein the determining, according to the data generation time point, a target attribute name and the target attribute value corresponding to the target attribute name from among a plurality of the first attribute names having the same classification identifier and the first attribute value corresponding to each of the first attribute names includes:
determining the first attribute name having the first time point and the first attribute value corresponding to the first attribute name from a plurality of the first attribute names having the same classification identification and the first attribute value corresponding to each of the first attribute names; and
determining the first attribute name having the first time point and the first attribute value corresponding to the first attribute name as the target attribute name and the target attribute value.
6. The method of claim 5, further comprising:
determining the first attribute name having the second time point and the first attribute value corresponding to the first attribute name from a plurality of the first attribute names having the same class identification and the first attribute value corresponding to each of the first attribute names; and
deleting the first attribute name having the second time point and the first attribute value corresponding to the first attribute name.
7. A production data storage device comprising:
the sending module is used for sending the production data to a newly created message queue before the production data is written into the storage system;
the first acquisition module is used for acquiring a plurality of production data from the message queue, wherein each production data comprises an attribute name and an attribute value corresponding to the attribute name;
the merging module is used for merging the attribute values with the same attribute name in the plurality of production data according to a preset merging rule to obtain a storage set corresponding to the attribute name; and
the storage module is used for storing the storage set corresponding to the attribute name to the storage system.
8. An electronic device, comprising:
one or more processors;
a memory for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any of claims 1-6.
9. A computer readable storage medium having stored thereon executable instructions which, when executed by a processor, cause the processor to carry out the method of any one of claims 1 to 6.
10. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1 to 6.
CN202110841038.1A 2021-07-23 2021-07-23 Production data storage method, storage device, electronic equipment and storage medium Active CN113486018B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110841038.1A CN113486018B (en) 2021-07-23 2021-07-23 Production data storage method, storage device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN113486018A true CN113486018A (en) 2021-10-08
CN113486018B CN113486018B (en) 2023-09-26

Family

ID=77942438

CN114329612A (en) Sensitive data identification method and device, electronic equipment and storage medium
CN114218198A (en) Service information migration method, device, equipment and medium
CN117785413A (en) Task forwarding method, device, equipment and storage medium
CN114911858A (en) Cloud platform interface generation method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant