WO2022108523A1 - Method and apparatus for compressing data, computer device and storage medium - Google Patents

Method and apparatus for compressing data, computer device and storage medium

Info

Publication number
WO2022108523A1
Authority
WO
WIPO (PCT)
Prior art keywords
target data
compression
parameter
data segment
segment
Prior art date
Application number
PCT/SG2021/050697
Other languages
French (fr)
Inventor
JinJin LIU
Hong Zhao
Xiaomeng Chen
Degang NING
Jinghui Zhao
Original Assignee
Envision Digital International Pte. Ltd.
Shanghai Envision Digital Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Envision Digital International Pte. Ltd. and Shanghai Envision Digital Co., Ltd.
Publication of WO2022108523A1


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/174 Redundancy elimination performed by the file system
    • G06F16/1744 Redundancy elimination performed by the file system using compression, e.g. sparse files
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/16 File or folder operations, e.g. details of user interfaces specifically adapted to file systems
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • the present disclosure relates to the field of data processing, and in particular, relates to a method and an apparatus for compressing data, a computer device and a storage medium.
  • Embodiments of the present disclosure provide a method and an apparatus for compressing data, a computer device, and a storage medium, which may adjust compression parameters and improve compression efficiency in the case of ensuring the compression accuracy.
  • the technical solutions are as follows.
  • a method for compressing data includes:
  • target data includes at least two target data segments
  • acquiring a compression parameter corresponding to an i-th target data segment of the at least two target data segments by updating, based on compressed data information corresponding to an (i-1)-th target data segment of the at least two target data segments, a compression parameter corresponding to the (i-1)-th target data segment using a parameter update model; wherein i is an integer greater than or equal to 2, the compressed data information includes at least one of a compression proportion and a compression standard deviation, the parameter update model is acquired by reinforcement learning based on a history compression parameter and compressed data information corresponding to the history compression parameter, the compression parameter is indicative of a compression accuracy of data compression on the at least two target data segments, and the history compression parameter is a compression parameter corresponding to a history target data segment; and
  • an apparatus for compressing data includes: [0011] a target data acquiring module, configured to acquire target data; wherein the target data includes at least two target data segments;
  • a compression parameter updating module configured to acquire a compression parameter corresponding to an i-th target data segment of the at least two target data segments by updating, based on compressed data information corresponding to an (i-1)-th target data segment of the at least two target data segments, a compression parameter corresponding to the (i-1)-th target data segment using a parameter update model; wherein i is an integer greater than or equal to 2, the compressed data information includes at least one of a compression proportion and a compression standard deviation, the parameter update model is acquired by reinforcement learning based on a history compression parameter and compressed data information corresponding to the history compression parameter, the compression parameter is indicative of a compression accuracy of data compression on the at least two target data segments, and the history compression parameter is a compression parameter corresponding to a history target data segment; and
  • a data compressing module configured to perform, based on the compression parameter corresponding to the i-th target data segment, data compression on the i-th target data segment.
  • the history target data segment includes the target data segment prior to the i-th target data segment.
  • the apparatus further includes:
  • a model updating module configured to update, based on compression parameters corresponding to prior N target data segments of the i-th target data segment, and compressed data information corresponding to the prior N target data segments, the parameter update model in the case that the i-th target data segment meets a specified condition; wherein the prior N target data segments are N target data segments prior to the i-th target data segment in the target data, and N is an integer greater than or equal to 1, and less than i.
  • the model updating module is configured to:
  • the parameter update model includes a first model branch and a second model branch
  • the first model branch is configured to update the compression parameter corresponding to the (i-1)-th target data segment based on the compressed data information corresponding to the (i-1)-th target data segment;
  • the second model branch is configured to instruct the first model branch to update the compression parameter to increase the compression proportion corresponding to the compression parameter and decrease the compression standard deviation corresponding to the compression parameter.
  • the compression parameter updating module is further configured to:
  • [0023] acquire the compression parameter corresponding to the i-th target data segment by updating, based on the compressed data information corresponding to the (i-1)-th target data segment, the compression parameter corresponding to the (i-1)-th target data segment using the first model branch;
  • the model updating module is further configured to:
  • [0025] acquire, based on the compression parameters corresponding to the prior N target data segments of the i-th target data segment and the compressed data information corresponding to the prior N target data segments, value information corresponding to the i-th target data segment using the second model branch; wherein the value information is configured to instruct the first model branch to update the compression parameter to increase the compression proportion corresponding to the compression parameter and decrease the compression standard deviation corresponding to the compression parameter; and
  • the target data further includes an initial data segment
  • the compression parameter updating module is configured to: [0029] acquire, based on the initial data segment, an initial compression parameter; wherein the initial compression parameter is a compression parameter corresponding to a first target data segment; and
  • [0030] acquire a compression parameter of the first target data segment and compressed data information corresponding to the first target data segment by performing, based on the initial compression parameter, data compression on the first target data segment.
  • the history target data segment includes a sample target data segment in sample data; the sample data is data of a same type as the target data; the sample data includes at least two sample target data segments; and
  • the apparatus further includes:
  • a sample updating module configured to acquire an updated parameter update model by training, based on the at least two sample target data segments, the parameter update model.
  • a computer device includes a processor and a memory configured to store at least one instruction, at least one program, a code set, or an instruction set, wherein the processor, when loading and executing the at least one instruction, the at least one program, the code set, or the instruction set, is caused to perform the method for compressing data as described above.
  • a non-transitory computer-readable storage medium stores at least one instruction, at least one program, a code set, or an instruction set, wherein the at least one instruction, the at least one program, the code set, or the instruction set, when loaded and executed by a processor of a computer device, causes the computer device to perform the method for compressing data as described above.
  • a computer program product or a computer program includes one or more computer instructions.
  • the one or more computer instructions are stored in a non-transitory computer-readable storage medium.
  • the one or more computer instructions when loaded and executed by a processor of a computer device, cause the computer device to perform the method for compressing data as described above.
  • the parameter update model is updated through the history compression parameter acquired during the compression process, and the compression proportion and the compression error which correspond to the history compression parameter.
  • the updated compression parameter is acquired by updating the compression parameter corresponding to the previous target data segment using the updated parameter update model, and the compressed data information corresponding to the target data segment is acquired by compressing the target data segment based on the updated compression parameter. Continuously, based on the compressed data information, the compression parameter corresponding to the next target data segment is acquired using the parameter update model.
  • the parameter update model is updated based on the compression proportion and the compression error of the history compression parameter, the parameter update model adjusts the compression parameter corresponding to the target data segment based on the compression proportion and the compression error of the previous target data segment, and a value of the compression parameter is accurately adjusted, which improves the compression efficiency in the case of ensuring the compression accuracy.
  • FIG. 1 is a schematic structural diagram of a data compression system according to an exemplary embodiment
  • FIG. 2 shows a schematic diagram of a swinging door trending algorithm involved in an embodiment of the present disclosure
  • FIG. 3 is a schematic flowchart of a method for compressing data according to an exemplary embodiment
  • FIG. 4 is a flowchart of a method for compressing data according to an exemplary embodiment
  • FIG. 5 is a schematic flowchart of a method for training the parameter update model involved in the embodiment shown in FIG. 4;
  • FIG. 6 is a schematic flowchart of a method for compressing data involved in the embodiment shown in FIG. 4;
  • FIG. 7 is a schematic diagram of application of the data compression involved in the embodiment shown in FIG. 4;
  • FIG. 8 is a schematic diagram of application of the data compression involved in the embodiment shown in FIG. 4;
  • FIG. 9 is a frame diagram of a data compression process according to an exemplary embodiment
  • FIG. 10 is a block diagram of a structure of an apparatus for compressing data according to an exemplary embodiment.
  • FIG. 11 is a schematic structural diagram of a computer device according to an exemplary embodiment.
  • AI is a theory, a method, a technology or an application system which simulates, extends and expands human intelligence to perceive an environment, acquire knowledge and obtain the best result using the knowledge by using a digital computer or a machine controlled by the digital computer.
  • AI is a comprehensive technology in computer science, which intends to understand the essence of intelligence and produce a new type of intelligent machines which can respond in a fashion similar to human intelligence. That is, AI is to study design principles and implementation methods of various intelligent machines, such that the machines have functions of perception, reasoning and decision-making.
  • AI technology is a comprehensive discipline, and involves both hardware technologies and software technologies in a wide range of fields.
  • Basic AI technologies generally include a sensor technology, a dedicated AI chip technology, a cloud computing technology, a distributed storage technology, a big data processing technology, an operating/interaction system technology, an electromechanical integration technology, and the like.
  • Software AI technologies mainly include a computer vision technology, a voice processing technology, a natural language processing technology, a machine learning/deep learning technology, and the like.
  • DC refers to a technological method of reducing a data amount to reduce a storage space and improve the transmission, storage and processing efficiency, or reorganizing the data by algorithms to reduce data redundancy and the storage space without losing useful information.
  • DC includes lossy compression and lossless compression.
  • RL, also known as reinforced learning, evaluation learning, or enhancement learning, is a paradigm and methodology of machine learning, and is configured to describe and solve the problem of an agent achieving a maximum return or a specific purpose based on learning strategies in the process of interacting with the environment.
  • a common model of RL is the standard Markov decision process (MDP). According to given conditions, RL may be divided into model-based RL, model-free RL, active RL and passive RL.
  • the essence of RL is that the agent learns in a "trial and error" fashion; its behavior is guided by the rewards acquired by interacting with the environment, with the purpose that the agent gains the greatest reward.
  • RL differs from supervised learning in connectionist learning mainly in the reinforcement signal.
  • the reinforcement signal provided by the environment in RL is an evaluation of the quality of a generated action (usually a scalar signal), rather than telling a reinforcement learning system (RLS) how to generate the correct action.
  • the RLS must rely on its own experience to learn. In this way, the RLS gains knowledge in an action-evaluation environment and improves action plans to adapt to the environment.
  • FIG. 1 is a schematic structural diagram of a data compression system according to an exemplary embodiment.
  • the system includes a data storage device 120 and a data compression device 140.
  • the data storage device 120 may include a data storage module (not shown in the drawing), and data to be compressed may be stored in the data storage module in advance; or the data storage device 120 is directly connected to a sensor, and the sensor may be one sensor or several sensors. The sensor generates corresponding timing data via changes in the external environment, and sends the timing data to the data storage device for storage.
  • the data compression device 140 may include a data compressing module and a data processing module.
  • the data to be compressed may be processed into a data form suitable for compression by the data processing module, or the data to be compressed may be directly analyzed by the data processing module.
  • the data compression device may further compress the data to be compressed using the data compressing module to acquire compressed data after compression.
  • the data compression device 140 may include a compressed data storage module, and the data compressing module compresses the data to be compressed, and saves the compressed data after compression to the compressed data storage module.
  • the data compression device 140 may be a server, and may include one server, or several servers, or may be a distributed computer cluster composed of several servers, or may be a virtualization platform, or may be a cloud computing service center, etc., which is not limited in the present disclosure.
  • the data storage device 120 and the data compression device 140 are connected via a communication network. In some embodiments, the communication network is a wired network or a wireless network.
  • the system may further include a management device (not shown in FIG. 1), and the management device is connected to the data storage device 120 and the data compression device 140 via a communication network.
  • the communication network is a wired network or a wireless network.
  • the above wireless network or wired network uses a standard communication technology and/or protocol.
  • the network is usually Internet, but may also be any network, including but not limited to any combination of a local area network (LAN), a metropolitan area network (MAN), a wide area network (WAN), a mobile network, a wired network or a wireless network, a private network or a virtual private network.
  • the data exchanged over the network is represented by using technologies and/or formats including hyper text mark-up language (HTML), extensible markup language (XML), and the like.
  • all or some links may also be encrypted by using Secure Socket Layer (SSL), Transport Layer Security (TLS), virtual private network (VPN), Internet Protocol Security (IPsec) and other conventional encryption technologies.
  • some customized and/or dedicated data communication technologies may also be configured to replace or supplement the data communication technologies above.
  • FIG. 2 is a schematic diagram of a swinging door trending (SDT) algorithm involved in an embodiment of the present disclosure.
  • FIG. 2 shows the process of a method for compressing a timing data segment using the SDT algorithm.
  • the SDT algorithm takes the timing data node corresponding to t0 as a starting point to start the first compression process.
  • the compression process is as follows.
  • the timing data node corresponding to t1 is determined by the pivot 1 and the pivot 2.
  • the timing data node corresponding to t1 and the two pivots form a triangle 1.
  • a triangle inner angle corresponding to the pivot 1 in the triangle 1 is recorded as an inner angle 1; and a triangle inner angle corresponding to the pivot 2 in the triangle 1 is recorded as an inner angle 2.
  • a sum of the inner angle 1 and the inner angle 2 is less than 180 degrees, and the timing data node corresponding to t1 may be normally compressed in the compression process.
  • the timing data node corresponding to t2 is determined.
  • the determination process is the same as the determination process of t1.
  • the timing data node corresponding to t2 is connected to the pivot 1 and the pivot 2 to form a triangle 2.
  • the inner angle 1 is compared with a triangle inner angle corresponding to the pivot 1 in the triangle 2, and the larger inner angle is taken as the inner angle 1.
  • the inner angle 2 is compared with a triangle inner angle corresponding to the pivot 2 in the triangle 2, the larger inner angle is taken as the inner angle 2, and then whether the sum of the inner angle 1 and the inner angle 2 is less than or equal to 180 degrees is determined.
  • the timing data node corresponding to t2 may also be compressed normally in the compression process.
  • timing data node corresponding to t3 and the timing data node corresponding to t4 may further be compressed normally in the compression process, which is not repeated here.
  • when the timing data node corresponding to t5 in FIG. 2 is determined, it can be seen that in a triangle 5 formed by connecting t5 and the two pivots, in the case that the triangle inner angle corresponding to the pivot 2 is updated to the inner angle 2, the sum of the inner angle 2 and the inner angle 1 is greater than 180 degrees. This indicates that the timing data node corresponding to t5 cannot be compressed normally in the compression process. In this case, the timing data node corresponding to t5 is taken as a new starting point to start the next compression process. Besides, the timing data nodes corresponding to t0 to t4 in the previous compression process are represented by a data segment, so as to complete the data compression process of the timing data nodes corresponding to t0 to t4.
  • the SDT algorithm is a lossy compression algorithm.
  • the compression parameter ΔE is configured to control the compression accuracy and compression effect of the SDT algorithm.
  • in the case that ΔE is smaller, a difference value between the timing data nodes allowed to be compressed in one compression process is smaller, such that the compression standard deviation is smaller.
  • in the case that ΔE is too small, more useless data points are retained, and the compression proportion is lower.
  • in the case that ΔE is larger, the difference value between the timing data nodes allowed to be compressed in one compression process is larger, such that the compression proportion is higher.
  • the larger compression difference value may lead to too much loss of compressed information of compression points in one compression process, and the compression standard deviation is larger.
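  • for illustration only, a minimal Python sketch of the SDT compression behaviour described above is given below; it uses the conventional slope-based formulation, which is equivalent in effect to the "sum of inner angles less than 180 degrees" test, assumes strictly increasing timestamps, and is not the exact implementation of the present disclosure.

```python
# Minimal sketch of swinging door trending (SDT) compression, using the
# conventional slope formulation (equivalent to the triangle inner-angle test
# described above). delta_e is the compression parameter ΔE; timestamps are
# assumed strictly increasing. Illustrative only.
def sdt_compress(points, delta_e):
    """points: list of (t, v) pairs; returns the retained (t, v) pairs."""
    if len(points) < 3:
        return list(points)
    kept = [points[0]]
    anchor_t, anchor_v = points[0]              # starting point of the current process
    s_up_max, s_low_min = float("-inf"), float("inf")
    last = points[0]
    for t, v in points[1:]:
        dt = t - anchor_t
        # slopes through the upper pivot (anchor_v + delta_e) and the
        # lower pivot (anchor_v - delta_e)
        s_up_max = max(s_up_max, (v - (anchor_v + delta_e)) / dt)
        s_low_min = min(s_low_min, (v - (anchor_v - delta_e)) / dt)
        if s_up_max > s_low_min:
            # the doors have opened past parallel: the nodes from the anchor up
            # to the previous node form one compressed segment, and the current
            # node becomes the starting point of the next compression process
            kept.append(last)
            kept.append((t, v))
            anchor_t, anchor_v = t, v
            s_up_max, s_low_min = float("-inf"), float("inf")
        last = (t, v)
    if kept[-1] != points[-1]:
        kept.append(points[-1])
    return kept
```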
  • FIG. 3 is a schematic flowchart of a method for compressing data according to an exemplary embodiment.
  • the method may be executed by a computer device, and the computer device may be the data compression device 140 in the embodiment shown in FIG. 1.
  • the process of the method for compressing data may include the following steps.
  • target data is acquired, wherein the target data includes at least two target data segments.
  • the target data may be timing data.
  • the timing data refers to time sequence data.
  • the time sequence data is a data sequence of a same indicator recorded in chronological order.
  • the data in the same data sequence is of the same size and is comparable.
  • the target data may be segmented into respective target data segments according to time identifiers.
  • the timing data may include the time identifiers for indicating time information in the timing data, and the target data may be determined as respective target data segments according to the time identifiers.
  • the data amount of each target data segment is the same.
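  • for illustration, the segmentation described above could be sketched as follows; the (timestamp, value) record layout and the segment size are assumptions, not requirements of the present disclosure.

```python
# Illustrative sketch: split timing data records into equally sized target
# data segments ordered by the time identifier. The record layout and
# segment_size are assumptions; the tail segment may be shorter in this
# simple sketch.
def split_into_segments(records, segment_size):
    """records: list of (timestamp, value) pairs."""
    records = sorted(records, key=lambda r: r[0])   # order by time identifier
    return [records[i:i + segment_size]
            for i in range(0, len(records), segment_size)]
```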
  • a compression parameter corresponding to an i-th target data segment of the at least two target data segments is acquired by updating, based on compressed data information corresponding to an (i-1)-th target data segment of the at least two target data segments, a compression parameter corresponding to the (i-1)-th target data segment using a parameter update model.
  • the compressed data information includes at least one of a compression proportion and a compression standard deviation
  • the parameter update model is acquired by reinforcement learning according to a history compression parameter and compressed data information corresponding to the history compression parameter, the compression parameter is indicative of a compression accuracy of data compression on the at least two target data segments; and the history compression parameter is a compression parameter corresponding to a history target data segment.
  • the parameter update model includes a first model branch and a second model branch.
  • the first model branch is configured to update the compression parameter corresponding to the (i-1)-th target data segment based on the compressed data information corresponding to the (i-1)-th target data segment; and the second model branch is configured to instruct the first model branch to update the compression parameter to increase the compression proportion corresponding to the compression parameter and decrease the compression standard deviation corresponding to the compression parameter.
  • the history target data segment includes the target data segment prior to the i-th target data segment.
  • the history target data segment includes a sample target data segment in sample data; the sample data is data of a same type as the target data, and the sample data includes at least two sample target data segments.
  • data compression is performed on the i-th target data segment based on the compression parameter corresponding to the i-th target data segment.
  • the parameter update model is updated based on the history compression parameter acquired in the compression process, and the compression proportion and the compression standard deviation corresponding to the history compression parameter.
  • the updated compression parameter is acquired by updating the compression parameter corresponding to the previous target data segment using the updated parameter update model, and the compressed data information corresponding to the target data segment is acquired by compressing the target data segment based on the updated compression parameter. Continuously, based on the compressed data information, the compression parameter corresponding to the next target data segment is acquired using the parameter update model.
  • the parameter update model is updated based on the compression proportion and the compression standard deviation of the history compression parameter, the parameter update model adjusts the compression parameter corresponding to the target data segment based on the compression proportion and the compression standard deviation of the previous target data segment, and a value of the compression parameter is accurately adjusted, which improves the compression efficiency in the case of ensuring the compression accuracy.
  • FIG. 4 is a flowchart of a method for compressing data according to an exemplary embodiment.
  • the method may be executed by a computer device, and the computer device may be the data compression device 140 in the embodiment shown in FIG. 1, and the data compression device may be a server.
  • the method for compressing data may include the following steps.
  • the sample target data segment is a data segment in sample data.
  • an updated parameter update model is acquired by training, based on the at least two sample target data segments, a parameter update model.
  • the sample data further includes an initial sample data segment.
  • the initial sample data segment is a first data segment in the sample data.
  • an initial sample compression parameter is acquired based on the initial sample data segment, wherein the initial sample compression parameter is a compression parameter corresponding to the first sample target data segment.
  • the parameter update model is initialized based on the initial sample compression parameter and a parameter preset by a user.
  • the parameter update model may be initially constructed by the initial sample compression parameter and the preset parameter before being updated based on the at least two sample target data segments.
  • the server then trains the initialized parameter update model based on the at least two sample target data segments.
  • 402 may include 402a, 402b, and 402c.
  • a sample compression parameter corresponding to an n-th sample target data segment is acquired by updating, based on compressed data information corresponding to an (n-1)-th sample target data segment, a sample compression parameter corresponding to the (n-1)-th sample target data segment using the parameter update model; n is an integer greater than or equal to 2.
  • the compressed data information corresponding to the (n-1)-th sample target data segment includes at least one of a compression proportion and a compression standard deviation corresponding to the (n-1)-th sample target data segment.
  • the compression proportion corresponding to the (n-1)-th sample target data segment is a ratio of the number of data points before the (n-1)-th sample target data segment is compressed to the number of data points after the (n-1)-th sample target data segment is compressed.
  • the compression proportion is shown in the following formula: Proportion = N / N*, wherein N represents the number of data points contained in the data segment before compression, and N* represents the number of data points contained in the data segment after compression.
  • the compression standard deviation Std is shown in the following formula: Std = sqrt( Σ (y - u)^2 / n ), wherein y represents a data value corresponding to each data point in the at least two sample target data segments, u represents an average value of the data values corresponding to the data points in the at least two sample target data segments, and n represents the number of the data points in the at least two sample target data segments.
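  • a possible computation of the two quantities defined above is sketched below; the function names are illustrative.

```python
import math

# Sketch of the compressed data information defined above: the compression
# proportion N / N* and the standard deviation of the data values y around
# their mean u. Function names are illustrative.
def compression_proportion(n_before, n_after):
    return n_before / n_after

def compression_std(values):
    u = sum(values) / len(values)
    return math.sqrt(sum((y - u) ** 2 for y in values) / len(values))
```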
  • compressed data of the n-th sample target data segment, and compressed data information corresponding to the n-th sample target data segment are acquired by performing data compression on the n-th sample target data segment based on the sample compression parameter corresponding to the n-th sample target data segment.
  • the sample compression parameter is the compression parameter ΔE in the SDT algorithm, which is configured to control the compression accuracy and the compression effect of the SDT algorithm.
  • the parameter update model is updated based on sample compression parameters corresponding to the at least two sample target data segments and compressed data information corresponding to the at least two sample target data segments.
  • the parameter update model is updated based on compressed data information corresponding to prior N sample target data segments of the n-th sample target data segment.
  • the prior N sample target data segments are N data segments prior to the n-th sample target data segment in the sample data; and N is an integer greater than or equal to 1, and less than n.
  • the parameter update model includes a first model branch and a second model branch.
  • the first model branch is configured to update the sample compression parameter corresponding to the (n-1)-th sample target data segment based on the compressed data information corresponding to the (n-1)-th sample target data segment; and the second model branch is configured to instruct the first model branch to update the compression parameter to increase the compression proportion corresponding to the compression parameter and decrease the compression standard deviation corresponding to the compression parameter.
  • the parameter update model is a model constructed by an Actor-Critic-based proximal policy optimization (PPO) algorithm. Therefore, in the parameter update model, the first model branch is an Actor network model, and the second model branch is a Critic network model.
  • the sample compression parameter corresponding to the n-th sample target data segment is acquired by updating the sample compression parameter corresponding to the (n-1)-th sample target data segment using the first model branch; based on the sample compression parameters corresponding to the at least two sample target data segments, and the compressed data information corresponding to the at least two sample target data segments, value information corresponding to the n-th sample target data segment is acquired using the second model branch; and the value information is configured to instruct the first model branch to update the compression parameter to increase the compression proportion corresponding to the compression parameter and decrease the compression standard deviation corresponding to the compression parameter.
  • the first model branch and the second model branch are updated.
  • the Actor network model takes the compression parameter ΔE, the compression proportion and the compression standard deviation Std as input parameters, and combines some random numbers to calculate a change value of ΔE, and the change value of ΔE is taken as an Action in the RL concept.
  • the compression parameter ΔE, the compression proportion and the compression standard deviation Std are taken as a State in the RL concept.
  • a difference value between the compression proportion corresponding to the n-th sample target data segment and the compression proportion corresponding to the (n-1)-th sample target data segment is calculated, a difference value between the Std corresponding to the n-th sample target data segment and the Std corresponding to the (n-1)-th sample target data segment is calculated, and a ratio of the two difference values is taken as a Reward corresponding to the n-th sample target data segment.
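  • for illustration, the State and Reward construction described above could be sketched as follows; the tuple layout and the small epsilon guard against a zero denominator are assumptions, not part of the disclosure.

```python
# Sketch of the RL quantities described above. The State bundles the
# compression parameter delta_e, the compression proportion and the standard
# deviation; the Reward is the ratio of the change in proportion to the change
# in standard deviation between consecutive segments. The eps guard is an
# assumption added for numerical robustness.
def make_state(delta_e, proportion, std):
    return (delta_e, proportion, std)

def make_reward(prop_n, prop_prev, std_n, std_prev, eps=1e-8):
    return (prop_n - prop_prev) / (std_n - std_prev + eps)
```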
  • the Actor-Critic network model is updated based on the States, Rewards, and Actions corresponding to the prior N sample target data segments of the n-th sample target data segment. Specifically, the Actor in the model is replicated as Old_Actor before being updated based on the States of the prior N sample target data segments. The Actor updated based on the States corresponding to the prior N sample target data segments is compared with the non-updated Old_Actor to calculate an a_loss, and the Rewards and States corresponding to the prior N sample target data segments are taken as input parameters of the Critic network model in the model to acquire value information corresponding to the prior N sample target data segments (that is, Value in the RL concept).
  • a c_loss is acquired based on the value information, wherein the a_loss is a loss function value of the Actor network, and the c_loss is a loss function value of the Critic network. Based on the a_loss and the c_loss, back propagation is performed on both the Actor network and the Critic network to update the parameter update model.
  • the target data is acquired, wherein the target data includes at least two target data segments.
  • the target data is data of the same type as the sample target data.
  • an initial compression parameter is acquired based on an initial data segment, wherein the initial compression parameter is a compression parameter corresponding to a first target data segment.
  • the target data further includes the initial data segment, and the initial data segment is a first data segment in the target data.
  • the initial compression parameter is acquired based on a standard deviation of data values corresponding to the data points of the initial data segment.
  • the server may directly analyze the distribution of the data values corresponding to the data points in the initial data segment, for example, the distribution of the data values corresponding to the data points in the initial data segment is analyzed according to the standard deviation, which can partially reflect the distribution of the data values corresponding to the overall data points of the corresponding target data of the initial data segment.
  • the initial compression parameter may be positively correlated with the standard deviation of the data values corresponding to the data points in the initial data segment. Where the standard deviation is larger, the difference value of the data values corresponding to the data points in the initial data segment is larger, and the data points are relatively scattered. Therefore, it can be inferred that the distribution of the overall data points of the target data may further be relatively scattered.
  • a larger initial compression parameter may be set, and then the initial compression parameter is updated according to the compression situation of a subsequent target data segment.
  • in the case that the standard deviation is smaller, the difference value of the data values corresponding to respective data points in the initial data segment is smaller; in the case of ensuring the compression proportion, a smaller initial compression parameter may be set to improve the compression accuracy, and then the initial compression parameter is updated according to the compression situation of the subsequent target data segment.
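  • as an illustrative sketch of the relationship described above, the initial compression parameter could be set in proportion to the standard deviation of the initial data segment; the scaling factor k is an assumption.

```python
import statistics

# Illustrative sketch: the initial compression parameter is positively
# correlated with the standard deviation of the data values in the initial
# data segment. The scaling factor k is an assumption.
def initial_compression_parameter(initial_values, k=0.5):
    return k * statistics.pstdev(initial_values)
```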
  • a compression parameter of the first target data segment and compressed data information corresponding to the first target data segment are acquired by performing data compression on the first target data segment based on the initial compression parameter.
  • data compression is performed on the first target data segment using the SDT algorithm based on the initial compression parameter to acquire the compression parameter of the first target data segment and the compressed data information corresponding to the first target data segment.
  • the initial compression parameter is the compression parameter ΔE corresponding to the SDT algorithm.
  • a compression parameter corresponding to an i-th target data segment of the at least two target data segments is acquired by updating, based on compressed data information corresponding to an (i-1)-th target data segment of the at least two target data segments, a compression parameter corresponding to the (i-1)-th target data segment using the parameter update model.
  • the first model branch is configured to update the compression parameter corresponding to the (i-1)-th target data segment based on the compressed data information corresponding to the (i-1)-th target data segment.
  • the compression parameter corresponding to the i-th target data segment is acquired by updating the compression parameter corresponding to the (i-1)-th target data segment using the first model branch.
  • data compression is performed on the i-th target data segment based on the compression parameter corresponding to the i-th target data segment.
  • the parameter update model is updated; and the prior N target data segments are N data segments prior to the i-th target data segment in the target data; N is an integer greater than or equal to 1.
  • the parameter update model is updated.
  • the preset value may be one value or several values.
  • the user may set the preset value to 10.
  • the parameter update model is updated; or, the preset value may be a multiple of 5, that is, 5, 10, 15, 20, or the like. In this case, when i reaches the preset value, the parameter update model is updated.
  • the parameter update model includes a second model branch, and the second model branch is configured to instruct the first model branch to update the compression parameter to increase the compression proportion corresponding to the compression parameter and decrease the compression standard deviation corresponding to the compression parameter.
  • value information corresponding to the i-th target data segment is acquired using the second model branch; the value information is configured to instruct the first model branch to update the compression parameter to increase the compression proportion corresponding to the compression parameter and decrease the compression standard deviation corresponding to the compression parameter; and the first model branch and the second model branch are updated based on the value information.
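  • a minimal sketch of this periodic update trigger is given below; the buffer layout, the preset value of 10 and the model's update interface are assumptions.

```python
# Sketch of the periodic update of the parameter update model: every time the
# segment index i reaches a preset value (a multiple of 10 here), the buffered
# (State, Action, Reward) tuples of the prior segments are used for one update.
PRESET = 10
buffer = []   # one (state, action, reward) tuple per compressed target data segment

def maybe_update(i, parameter_update_model):
    if i % PRESET == 0 and buffer:
        parameter_update_model.update(list(buffer))   # assumed update method
        buffer.clear()
```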
  • FIG. 5 schematically describes the process of updating the parameter update model based on the compression parameters corresponding to the prior N target data segments of the i-th target data segment and the compressed data information corresponding to the prior N target data segments.
  • FIG. 5 is a schematic flowchart of a method for training a parameter update model involved in an embodiment of the present disclosure. As shown in FIG. 5, data compression is performed on the i-th target data segment using the SDT algorithm to obtain a compression proportion and a compression standard deviation (Std) corresponding to the i-th target data segment.
  • the embodiment adopts a proximal policy optimization (PPO) algorithm in the RL method.
  • the parameter update model may be an Actor-Critic model in an RL model, and an Actor network in the Actor-Critic model is configured to acquire a compression parameter of an (i+1)-th target data segment based on the State information of the i-th target data segment. That is, the compression parameter ΔE, the compression proportion and the compression standard deviation Std are taken as input parameters, and a change value of the compression parameter ΔE of the (i+1)-th target data segment is calculated by combining some random numbers (that is, an Action in the RL concept). Besides, difference values between the compression proportion and the compression standard deviation corresponding to the i-th target data segment, and the compression proportion and the compression standard deviation corresponding to the (i+1)-th target data segment are calculated (that is, a Reward in the RL concept).
  • a round of update process of the Actor-critic model is as follows.
  • the State information corresponding to the i-th target data segment is input into the Actor network (S502) to acquire two parameters μ and σ, and the two parameters are taken as a mean and a variance of a normal distribution to construct the normal distribution.
  • the normal distribution is configured to express a probability distribution of the Action (the change value of ΔE) in the RL method, and then the normal distribution is sampled based on the random number to acquire a Sample Action as the change value of ΔE of the next target data segment (S503), and the change value is recorded as the Action corresponding to the i-th target data segment (S504).
  • the change value of ΔE is input into a compression program (i.e., compression environment) to acquire the compression parameter ΔE corresponding to the i-th target data segment, and the compression parameter ΔE corresponding to the (i+1)-th target data segment is determined. Then, based on the compression parameter ΔE corresponding to the (i+1)-th target data segment, data compression is performed on the (i+1)-th target data segment using the SDT algorithm to acquire compressed data corresponding to the (i+1)-th target data segment, and a compression proportion and a compression standard deviation corresponding to the (i+1)-th target data segment (S505).
  • the compression parameter, compression proportion and compression standard deviation corresponding to the (i+1)-th target data segment are saved as the State corresponding to the (i+1)-th target data segment (S506) and are input to the Actor network, and the next Action (the next process, not shown in the drawing) is acquired.
  • the Actor network acquires a Reward value corresponding to the (i+1)-th target data segment based on the compression proportion and the compression standard deviation Std corresponding to the i-th target data segment, and the compression proportion and the compression standard deviation Std corresponding to the (i+1)-th target data segment (S507).
  • a model update device stores parameters (s, a, r) corresponding to a number of target data segments, that is, the parameters (State, Action, Reward) corresponding to all target data segments.
  • based on the parameters (s, a, r) corresponding to the number of target data segments, a Critic network in the Actor-Critic model acquires a Value corresponding to the number of target data segments, and based on the parameters (s, a, r) corresponding to the number of target data segments and the Value, the Critic network and the Actor network are updated.
  • the model update device stores the parameters (s, a, r) corresponding to the number of target data segments trained in the above steps
  • the parameters (s, a, r) acquired after the above cyclic compression parameter update process are input into the Critic network (S508), so as to acquire v_ (the value) corresponding to the above cyclically updated model, and a discount reward corresponding to the above cyclically updated model is calculated based on v_.
  • the discount reward is calculated from the following quantities:
  • r[t] represents a Reward value of the target data segment corresponding to the parameter update model at moment t
  • γ represents a discount coefficient in the RL concept
  • γ is greater than or equal to 0, and less than or equal to 1
  • γ is configured to indicate an influence rate of the Reward at the current moment on a moment in the future
  • v_ is the value acquired by the target data segment via the critic network at the moment t.
  • the discount rewards corresponding to the number of target data segments are calculated using the above discount reward formula.
  • the discount rewards corresponding to the T+1 target data segments are R[0], R[1], R[2], R[3]...R[T] respectively, wherein T is the last moment (time step).
  • a loss function value is acquired (S510).
  • the loss function may be c_loss, which is equal to mean(square(At)), and the loss function value is an average of the square values of the first difference values At corresponding to the target data segments (S511).
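  • the discount reward and c_loss computation could be sketched as follows, assuming the common convention of accumulating rewards backwards with the discount coefficient and bootstrapping from the critic's value v_ of the last state; the discount value 0.9 and this exact recursion are assumptions.

```python
# Sketch of the discount reward and the critic loss c_loss described above,
# assuming the usual backward-accumulation convention bootstrapped by the
# critic's value v_last of the final state. gamma = 0.9 is an assumption.
def discounted_rewards(rewards, v_last, gamma=0.9):
    R, out = v_last, []
    for r in reversed(rewards):
        R = r + gamma * R
        out.append(R)
    return list(reversed(out))            # R[0], R[1], ..., R[T]

def c_loss(discounted, values):
    # first difference value At = R[t] - v[t]; c_loss = mean(square(At))
    diffs = [R - v for R, v in zip(discounted, values)]
    return sum(d * d for d in diffs) / len(diffs)
```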
  • the parameters (s, a, r) corresponding to the number of target data segments are input into an Old_Actor network and the Actor network (S512).
  • the Old_Actor network and the Actor network have the same network structure, wherein the Old_Actor is acquired after updating the compression parameter, the compression proportion and the compression standard deviation corresponding to the first target data segment; and the Actor network is acquired after updating the compression parameters, the compression proportions and the compression standard deviations corresponding to the number of target data segments.
  • a normal distribution Normal1 corresponding to the Old_Actor network and a normal distribution Normal2 corresponding to the Actor network are constructed.
  • Normal1 is indicative of the probability that the Action (that is, the change value of ΔE) takes each value in Old_Actor
  • Normal2 is indicative of the probability that the Action takes each value in Actor.
  • the Actions corresponding to the number of target data segments are input into both the normal distributions Normal1 and Normal2, so as to acquire the probabilities prob1 and prob2 of each Action in the two networks (Actor and Old_Actor), and then prob2 is divided by prob1 to acquire an importance weight, which is a ratio of the updated Actor to Old_Actor.
  • the Actor network is updated by back propagation of the loss function (S513), wherein the loss function a_loss may be expressed as mean(min(ratio*At, clip(ratio, 1-ε, 1+ε)*At)).
  • the clip function is a clipping function, that is, the ratio is clipped to the interval (1-ε, 1+ε), wherein ε is a constant. Therefore, from an intuitive point of view, for the above loss function, the ratio is first clipped to form a range, the clipped ratio is applied to the first difference value At to obtain a value, which is compared with the value of the ratio directly applied to At, and the minimum value is taken as the updated loss function value of the Actor.
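  • the clipped loss described above could be sketched as follows; epsilon = 0.2 is an assumption, and in training the negative of this surrogate would be minimized by back propagation.

```python
# Sketch of the clipped actor loss a_loss described above. prob_new / prob_old
# are the probabilities of each Action under the updated Actor and Old_Actor,
# advantages are the first difference values At, and epsilon = 0.2 is an
# assumption.
def a_loss(prob_new, prob_old, advantages, epsilon=0.2):
    terms = []
    for p2, p1, at in zip(prob_new, prob_old, advantages):
        ratio = p2 / p1                                   # importance weight
        clipped = max(1 - epsilon, min(1 + epsilon, ratio))
        terms.append(min(ratio * at, clipped * at))       # pessimistic bound
    # mean(min(ratio*At, clip(ratio, 1-eps, 1+eps)*At)); the negative of this
    # surrogate is what back propagation would minimize in practice
    return sum(terms) / len(terms)
```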
  • FIG. 6 is a schematic flowchart of a method for compressing data involved in an embodiment of the present disclosure. As shown in FIG. 6,
  • a target data 61 may be divided into an initial data 611 (an initial data segment) and several target data segments (a first target data segment 612, a second target data segment 613, and several subsequent target data segments).
  • an initial compression parameter corresponding to the initial data segment is acquired by performing data analysis on the initial data and calculating parameters such as a standard deviation and an average of the data contained in the initial data, and the initial compression parameter corresponding to the initial data is input into a parameter update model to acquire an initialized parameter update model.
  • the parameter update model does not adjust the initial compression parameter, and directly inputs the initial compression parameter into an SDT module 623 as a compression parameter of an SDT algorithm, and SDT is performed on the first target data segment 612 (that is, the first of the target data segments) based on the initial compression parameter, to acquire compressed data corresponding to the first target data segment 612, and a compression proportion and a compression standard deviation corresponding to the first target data segment 612.
  • the compression proportion and the compression standard deviation corresponding to the first target data segment, and the compression parameter corresponding to the first target data segment are input into the parameter update model 621 to acquire an updated compression parameter.
  • the updated compression parameter as a compression parameter of the second target data segment (that is, the second of the target data segments), is input into the SDT module, and SDT is performed on the second target data segment 613 to acquire compressed data corresponding to the second target data segment 613, and a compression proportion and a compression standard deviation corresponding to the second target data segment 613.
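  • tying these steps together, the compression flow of FIG. 6 could be sketched as below; the sketch reuses the illustrative helpers given earlier in this document, and the update_parameter interface of the parameter update model is an assumption.

```python
# End-to-end sketch of the flow of FIG. 6: the initial data segment fixes the
# first compression parameter, each later target data segment is compressed
# with SDT, and its compression proportion and standard deviation are fed back
# into the parameter update model to produce the parameter for the next
# segment. Reuses the illustrative helpers sketched above; the model's
# update_parameter interface is an assumption.
def compress_target_data(segments, parameter_update_model):
    initial, targets = segments[0], segments[1:]
    delta_e = initial_compression_parameter([v for _, v in initial])
    compressed = []
    for segment in targets:
        kept = sdt_compress(segment, delta_e)
        compressed.append(kept)
        proportion = compression_proportion(len(segment), len(kept))
        std = compression_std([v for _, v in segment])
        # the parameter update model adjusts delta_e from the latest feedback
        delta_e = parameter_update_model.update_parameter(delta_e, proportion, std)
    return compressed
```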
  • FIG. 7 is a schematic diagram of application of the data compression involved in an embodiment of the present disclosure.
  • real-time data or history data 702 may be acquired by performing data access and processing on data generated by a physical device 701, and the physical device 701 may be a sensor, and the real-time data or history data may be data acquired by data access and processing of each sensor according to a timing.
  • corresponding compressed data 704 is output from the real-time data or history data via a lossy compression program 703 (that is, the compression program constructed according to the compression method in the embodiment shown in FIG. 4), and the compressed data 704 is stored in a time-series database (TSDB).
  • FIG. 8 is a schematic diagram of application of the data compression involved in an embodiment of the present disclosure.
  • target data may be a timing data file 801 pre-stored in a database.
  • the timing file may be a comma-separated values (CSV) file.
  • the timing data file is subjected to data processing into compressible history data 802, and then data compression is performed on the history data using an envision swinging door trending (ESDT) algorithm 803, that is, the compression method in the embodiments shown in FIG. 4, and the output compressed data 804 is stored in the TSDB.
  • the embodiments of the present disclosure provide an improved lossy compression algorithm ESDT based on the SDT algorithm; an RL-related algorithm is used, and learning is performed based on the timing data, such that an SDT parameter can be adaptively adjusted to acquire the compression effect of a high compression proportion and a low compression standard deviation.
  • the compression proportion and the compression standard deviation which are continuously fed back by the compressed data are taken as a reward and punishment mechanism, and an RL model is configured to dynamically adjust the compression parameter to acquire a better compression effect.
  • the parameter update model is updated based on a history compression parameter acquired in the compression process, and a compression proportion and a compression standard deviation corresponding to the history compression parameter.
  • the compression parameter corresponding to the previous target data segment is updated using an updated parameter update model to acquire the updated compression parameter, and the target data segment is compressed based on the updated compression parameter to acquire compressed data information corresponding to the target data segment. Continuously, based on the compressed data information, a compression parameter corresponding to the next target data segment is acquired using the parameter update model.
  • the parameter update model is updated based on the compression proportion and the compression standard deviation of the history compression parameter, the parameter update model adjusts the compression parameter corresponding to the target data segment based on the compression proportion and the compression standard deviation of the previous target data segment, and a value of the compression parameter is accurately adjusted, which can improve the compression efficiency in the case of ensuring the compression accuracy.
  • FIG. 9 is a framework diagram of a data compression process according to an exemplary embodiment.
  • the method for compressing data is jointly executed by a model training device 90 and a data compression device 91.
  • the model training device 90 and the data compression device 91 may be servers.
  • a parameter update model is trained in the model training device 90 based on sample data 900.
  • the sample data 900 includes an initial sample data segment 901, a first sample target data segment (the first of the sample target data segments) 902, a second sample target data segment (the second of the sample target data segments) 903, and several sample target data segments including an N-th sample target data segment 904.
  • a parameter update model 905 is updated.
  • the parameter update model 905 may be an Actor-Critic network model shown in FIG. 9, and the parameter update model 905 includes an Actor network 905a and a Critic network 905b.
  • the parameter update model 905 directly inputs the initial compression parameter into an SDT module 907 without adjusting a compression parameter 906 corresponding to the initial sample data segment, and the SDT module 907, by performing data compression on the first sample target data segment 902 based on the compression parameter 906 corresponding to the initial sample data segment, acquires compressed data corresponding to the first sample target data segment 902, and a compression proportion and a compression standard deviation corresponding to the first sample target data segment 902.
  • the compression proportion and the compression standard deviation corresponding to the first sample target data segment, and the compression parameter corresponding to the first sample target data segment are input to the parameter update model 905, and the compression parameter is updated via the Actor network 905a in the parameter update model 905 to acquire the updated compression parameter.
  • the updated compression parameter as a compression parameter corresponding to the second sample target data segment, is input into the SDT module, and compressed data corresponding to the second sample target data segment 903, and a compression proportion and a compression standard deviation corresponding to the second sample target data segment 903 are acquired by performing SDT on the second sample target data segment 903.
  • SDT compression is performed on the N-th sample target data segment 904 by the SDT module based on a compression parameter corresponding to the N-th sample target data segment 904, to acquire a compression proportion and a compression standard deviation corresponding to the N-th sample target data segment 904; and the compression parameters corresponding to the prior N sample target data segments, and the compression proportions and the compression standard deviations corresponding to the prior N sample target data segments are input into the Critic network 905b to acquire a Value corresponding to the prior N sample target data segments.
  • the parameter update model is updated.
  • the update process is the same as corresponding content of FIG. 5 in the embodiment corresponding to FIG. 4, and is not repeated here.
  • the target data 910 is the same type of data as the sample data 900.
  • the target data may be divided into an initial data segment 911, a first target data segment 912 (the first target data segment of the target data segments), a second target data segment 913 (the second target data segment of the target data segments), and several target data segments including an Nth target data segment 914.
  • a parameter update model 915 is a model trained with the sample data in the model training device 90.
  • the parameter update model 915 and the parameter update model 905 have the same structure, and both are Actor-Critic network models (not shown in the drawing).
  • a compression parameter 916 in the data compression device 91 is data of the same type as the compression parameter 906 in the model training device 90; and an SDT module 917 in the data compression device 91 is the same as the SDT module 907 in the model training device 90. Therefore, the compression process of the target data in the data compression device 91 and the update process of the parameter update model are consistent with the compression process of the sample data in the model training device 90, and are not repeated here.
  • the data compression device 91 updates the compression parameter corresponding to each of the target data segments using the parameter update model trained in the model training device to acquire the compression parameter of the next target data segment corresponding to each of the target data segments.
  • the data compression device may update the parameter update model based on the compression parameters, the compression proportions, and the compression standard deviations corresponding to the predetermined number of target data segments, and the updated model is further in line with characteristics of the target data, thereby improving the compression proportion of the data acquired by compression using the SDT algorithm, and decreasing the compression standard deviation of the target data.
  • the above model training process collects a real data training set that needs to be compressed. As model training and compression calculation are two separate processes, while the training program learns, a compression program for compressing the training set (the sample data) may be preset in the server; after the training set is analyzed, the compression calculation is started, and the calculation result is sent to a neural network (the parameter update model).
  • the training program for the parameter update model is started first, an initial model is acquired by learning the training set data, and then the initial model, as the parameter update model, is placed in a compression environment to acquire real-time swinging door calculation data, learning is performed at regular time intervals, and the updated parameter update model is saved.
  • FIG. 10 is a block diagram of the structure of an apparatus for compressing data according to an exemplary embodiment.
  • the apparatus for compressing data may implement all or part of the steps in the method according to the embodiment shown in FIG. 3 or FIG. 4.
  • The apparatus for compressing data may include:
  • a target data acquiring module 1001 configured to acquire target data; wherein the target data includes at least two target data segments;
  • a compression parameter updating module 1002 configured to acquire a compression parameter corresponding to an ith target data segment of the at least two target data segments by updating, based on compressed data information corresponding to an (i-1)th target data segment of the at least two target data segments, a compression parameter corresponding to the (i-1)th target data segment using a parameter update model; wherein i is an integer greater than or equal to 2, the compressed data information includes at least one of a compression proportion and a compression standard deviation, the parameter update model is acquired by reinforcement learning based on a history compression parameter and compressed data information corresponding to the history compression parameter, the compression parameter is indicative of a compression accuracy of data compression on the at least two target data segments, and the history compression parameter is a compression parameter corresponding to a history target data segment; and
  • a data compressing module 1003 configured to perform, based on the compression parameter corresponding to the i th target data segment, data compression on the i th target data segment.
  • the history target data segment includes the target data segment prior to the ith target data segment; and the apparatus further includes:
  • a model updating module configured to update, based on compression parameters corresponding to prior N target data segments of the i th target data segment, and compressed data information corresponding to the prior N target data segments, the parameter update model in the case that the i th target data segment meets a specified condition; wherein the prior N target data segments are N target data segments prior to the i th target data segment in the target data, and N is an integer greater than or equal to 1, and less than i.
  • the model updating module is configured to: update, based on the compression parameters corresponding to the prior N target data segments of the i th target data segment and the compressed data information corresponding to the prior N target data segments, the parameter update model in the case that i is a preset value.
  • the parameter update model includes a first model branch and a second model branch; wherein the first model branch is configured to update the compression parameter corresponding to the (i-1)th target data segment based on the compressed data information corresponding to the (i-1)th target data segment; and
  • the second model branch is configured to instruct the first model branch to update the compression parameter to increase the compression proportion corresponding to the compression parameter and decrease the compression standard deviation corresponding to the compression parameter.
  • the compression parameter updating module 1002 is further configured to:
  • acquire the compression parameter corresponding to the ith target data segment by updating, based on the compressed data information corresponding to the (i-1)th target data segment, the compression parameter corresponding to the (i-1)th target data segment using the first model branch;
  • the model updating module is further configured to:
  • acquire, based on the compression parameters corresponding to the prior N target data segments of the ith target data segment and the compressed data information corresponding to the prior N target data segments, value information corresponding to the ith target data segment using the second model branch; wherein the value information is configured to instruct the first model branch to update the compression parameter to increase the compression proportion corresponding to the compression parameter and decrease the compression standard deviation corresponding to the compression parameter; and update, based on the value information, the first model branch and the second model branch.
  • the target data further includes an initial data segment; and the compression parameter updating module 1002 is configured to: acquire, based on the initial data segment, an initial compression parameter, wherein the initial compression parameter is a compression parameter corresponding to a first target data segment; and
  • acquire a compression parameter of the first target data segment and compressed data information corresponding to the first target data segment by performing, based on the initial compression parameter, data compression on the first target data segment.
  • the history target data segment includes a sample target data segment in sample data; the sample data is data of the same type as the target data; and the sample data includes at least two sample target data segments; and the apparatus further includes: a sample updating module, configured to acquire an updated parameter update model by training, based on the at least two sample target data segments, the parameter update model.
  • the parameter update model is updated based on the history compression parameter acquired during the compression process, and the compression proportion and the compression standard deviation corresponding to the history compression parameter.
  • the compression parameter corresponding to the previous target data segment is updated using the updated parameter update model to acquire the updated compression parameter
  • the target data segment is compressed based on the updated compression parameter to acquire the compressed data information corresponding to the target data segment.
  • the compression parameter corresponding to the next target data segment is acquired using the parameter update model.
  • the parameter update model is updated based on the compression proportion and the compression standard deviation of the history compression parameter, the parameter update model adjusts the compression parameter corresponding to the target data segment based on the compression proportion and the compression standard deviation of the previous target data segment, and a value of the compression parameter is accurately adjusted, which can improve the compression efficiency in the case of ensuring the compression accuracy.
  • FIG. 11 is a schematic structural diagram of a computer device according to an exemplary embodiment.
  • the computer device may be implemented as the model training device and/or the data compression device in the above method embodiments.
  • the computer device 1100 includes a central processing unit (CPU) 1101, a system memory 1104 including a random-access memory (RAM) 1102 and a read-only memory (ROM) 1103, and a system bus 1105 connecting the system memory 1104 and the CPU 1101.
  • the computer device 1100 further includes a basic input/output system 1106 which helps to transmit information between various components in a computer, and a high-capacity storage device 1107 configured to store an operating system 1113, an application program 1114 and other program modules 1115.
  • the high-capacity storage device 1107 is connected to the CPU 1101 via a high-capacity storage controller (not shown) that is connected to the system bus 1105.
  • the high-capacity storage device 1107 and an associated computer-readable medium thereof provide non-volatile storage for the computer device 1100. That is, the high-capacity storage device 1107 may include a computer-readable medium (not shown), such as a hard disk, a compact disc read-only memory (CD-ROM) drive, or the like.
  • the computer-readable medium may include a computer storage medium and a communication medium.
  • the computer storage medium includes volatile and non-volatile, and removable and non-removable mediums implemented in any method or technology for storing information such as a computer-readable instruction, a data structure, a program module or other data.
  • the computer storage medium includes an RAM, an ROM, a flash memory or other solid-state storage technologies, a CD-ROM or other optical storage, a tape cartridge, a magnetic tape, a disk storage or other magnetic storage devices.
  • the computer storage medium is not limited to above.
  • the above system memory 1104 and high-capacity storage device 1107 may be collectively referred to as the memory.
  • the computer device 1100 may be connected to the Internet or other network devices via a network interface unit 1111 that is connected to the system bus 1105.
  • the memory further includes one or more above programs, and the one or more above programs are stored in the memory.
  • the CPU 1101 implements all or part of the steps of the method shown in FIG. 3, FIG. 4 or FIG. 9 by executing the one or more above programs.
  • a non-transitory computer-readable storage medium including instructions, for example, a memory including computer programs (instructions), is further provided, and the above programs (instructions) are executable by a processor of a computer device to complete the method executed by a server or user terminal in the methods shown in the embodiments of the present disclosure.
  • the non-transitory computer-readable storage medium may be an ROM, an RAM, a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, etc.
  • a computer program product or a computer program is further provided.
  • the computer program product or the computer program includes a computer instruction.
  • the computer instruction is stored in a computer-readable storage medium.
  • a processor of a computer device reads the computer instruction from the computer-readable storage medium.
  • the processor executes the computer instruction, such that the computer device executes the method shown in the above embodiments.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

Provided is a method and an apparatus for compressing data, a computer device and a storage medium. The method includes: acquiring target data; acquiring a compression parameter corresponding to an ith target data segment of the at least two target data segments by updating, based on compressed data information corresponding to an (i-1)th target data segment of the at least two target data segments, a compression parameter corresponding to the (i-1)th target data segment using a parameter update model, wherein the parameter update model is acquired by reinforcement learning based on a history compression parameter and compressed data information corresponding to the history compression parameter; and performing, based on the compression parameter corresponding to the ith target data segment, data compression on the ith target data segment.

Description

METHOD AND APPARATUS FOR COMPRESSING DATA,
COMPUTER DEVICE AND STORAGE MEDIUM
TECHNICAL FIELD
[0001] The present disclosure relates to the field of data processing, and in particular, relates to a method and an apparatus for compressing data, a computer device and a storage medium.
BACKGROUND
[0002] With the rapid development of Internet of Things technology, the problem of big data in the Internet of Things has become prominent. The massive scale of Internet of Things data has brought great challenges to data quality control, data storage, data compression, data integration, data fusion, and data query. The demand for data compression capability is precisely a weakness of the development of informatization and digitalization of the Internet of Things.
[0003] In the related art, under the premise of meeting the data quality control, developers adopt appropriate and efficient compression methods to minimize redundant storage of timing data and increase a space utilization rate, which may effectively reduce storage costs and increase storage performances.
[0004] However, in the related art, during data compression, compression parameters with a larger compression proportion may result in a lower compression accuracy, while compression parameters with a higher compression accuracy may result in an excessively low compression proportion. Thus, it is difficult for the developers to balance the compression parameters between the compression proportion and the compression accuracy.
SUMMARY
[0005] Embodiments of the present disclosure provide a method and an apparatus for compressing data, a computer device, and a storage medium, which may adjust compression parameters and improve compression efficiency in the case of ensuring the compression accuracy. The technical solutions are as follows.
[0006] In one aspect, a method for compressing data is provided. The method includes:
[0007] acquiring target data; wherein the target data includes at least two target data segments;
[0008] acquiring a compression parameter corresponding to an ith target data segment of the at least two target data segments by updating, based on compressed data information corresponding to an (i-1)th target data segment of the at least two target data segments, a compression parameter corresponding to the (i-1)th target data segment using a parameter update model; wherein i is an integer greater than or equal to 2, the compressed data information includes at least one of a compression proportion and a compression standard deviation, the parameter update model is acquired by reinforcement learning based on a history compression parameter and compressed data information corresponding to the history compression parameter, the compression parameter is indicative of a compression accuracy of data compression on the at least two target data segments; and the history compression parameter is a compression parameter corresponding to a history target data segment; and
[0009] performing, based on the compression parameter corresponding to the ith target data segment, data compression on the ith target data segment.
[0010] In another aspect, an apparatus for compressing data is provided. The apparatus includes: [0011] a target data acquiring module, configured to acquire target data; wherein the target data includes at least two target data segments;
[0012] a compression parameter updating module, configured to acquire a compression parameter corresponding to an ith target data segment of the at least two target data segments by updating, based on compressed data information corresponding to an (i-1)th target data segment of the at least two target data segments, a compression parameter corresponding to the (i-1)th target data segment using a parameter update model; wherein i is an integer greater than or equal to 2, the compressed data information includes at least one of a compression proportion and a compression standard deviation, the parameter update model is acquired by reinforcement learning based on a history compression parameter and compressed data information corresponding to the history compression parameter, the compression parameter is indicative of a compression accuracy of data compression on the at least two target data segments, and the history compression parameter is a compression parameter corresponding to a history target data segment; and
[0013] a data compressing module, configured to perform, based on the compression parameter corresponding to the ith target data segment, data compression on the ith target data segment.
[0014] In some embodiments, the history target data segment includes the target data segment prior to the ith target data segment; and
[0015] the apparatus further includes:
[0016] a model updating module, configured to update, based on compression parameters corresponding to prior N target data segments of the ith target data segment, and compressed data information corresponding to the prior N target data segments, the parameter update model in the case that the ith target data segment meets a specified condition; wherein the prior N target data segments are N target data segments prior to the ith target data segment in the target data, and N is an integer greater than or equal to 1, and less than i.
[0017] In some embodiments, the model updating module is configured to:
[0018] update, based on the compression parameters corresponding to the prior N target data segments of the ith target data segment and the compressed data information corresponding to the prior N target data segments, the parameter update model in the case that i is a preset value.
[0019] In some embodiments, the parameter update model includes a first model branch and a second model branch;
[0020] the first model branch is configured to update the compression parameter corresponding to the (i-1)th target data segment based on the compressed data information corresponding to the (i-1)th target data segment; and
[0021] the second model branch is configured to instruct the first model branch to update the compression parameter to increase the compression proportion corresponding to the compression parameter and decrease the compression standard deviation corresponding to the compression parameter.
[0022] In some embodiments, the compression parameter updating module is further configured to:
[0023] acquire the compression parameter corresponding to the ith target data segment by updating, based on the compressed data information corresponding to the (i-1)th target data segment, the compression parameter corresponding to the (i-1)th target data segment using the first model branch; and
[0024] the model updating module is further configured to:
[0025] acquire, based on the compression parameters corresponding to the prior N target data segments of the ith target data segment and the compressed data information corresponding to the prior N target data segments, value information corresponding to the ith target data segment using the second model branch; wherein the value information is configured to instruct the first model branch to update the compression parameter to increase the compression proportion corresponding to the compression parameter and decrease the compression standard deviation corresponding to the compression parameter; and
[0026] update, based on the value information, the first model branch and the second model branch.
[0027] In some embodiments, the target data further includes an initial data segment;
[0028] the compression parameter updating module is configured to: [0029] acquire, based on the initial data segment, an initial compression parameter; wherein the initial compression parameter is a compression parameter corresponding to a first target data segment; and
[0030] acquire a compression parameter of the first target data segment and compressed data information corresponding to the first target data segment by performing, based on the initial compression parameter, data compression on the first target data segment.
[0031] In some embodiments, the history target data segment includes a sample target data segment in sample data; the sample data is data of a same type as the target data; the sample data includes at least two sample target data segments; and
[0032] the apparatus further includes:
[0033] a sample updating module, configured to acquire an updated parameter update model by training, based on the at least two sample target data segments, the parameter update model.
[0034] In yet another aspect, a computer device is provided. The computer device includes a processor and a memory configured to store at least one instruction, at least one program, a code set, or an instruction set, wherein the processor, when loading and executing the at least one instruction, the at least one program, the code set, or the instruction set, is caused to perform the method for compressing data as described above.
[0035] In still a further aspect, a non-transitory computer-readable storage medium is provided. The storage medium stores at least one instruction, at least one program, a code set, or an instruction set, wherein the at least one instruction, the at least one program, the code set, or the instruction set, when loaded and executed by a processor of a computer device, causes the computer device to perform the method for compressing data as described above.
[0036] In one more aspect, a computer program product or a computer program is provided. The computer program product or computer program includes one or more computer instructions. The one or more computer instructions are stored in a non-transitory computer-readable storage medium. The one or more computer instructions, when loaded and executed by a processor of a computer device, cause the computer device to perform the method for compressing data as described above.
[0037] The technical solutions according to the present disclosure achieve the following beneficial effects:
[0038] The parameter update model is updated through the history compression parameter acquired during the compression process, and the compression proportion and the compression error which correspond to the history compression parameter. The updated compression parameter is acquired by updating the compression parameter corresponding to the previous target data segment using the updated parameter update model, and the compressed data information corresponding to the target data segment is acquired by compressing the target data segment based on the updated compression parameter. Continuously, based on the compressed data information, the compression parameter corresponding to the next target data segment is acquired using the parameter update model. With the above solutions, the parameter update model is updated based on the compression proportion and the compression error of the history compression parameter, the parameter update model adjusts the compression parameter corresponding to the target data segment based on the compression proportion and the compression error of the previous target data segment, and a value of the compression parameter is accurately adjusted, which improves the compression efficiency in the case of ensuring the compression accuracy.
[0039] It should be understood that the above general description and the following detailed description are merely exemplary and explanatory, and are not intended to limit the present disclosure.
BRIEF DESCRIPTION OF THE DRAWINGS
[0040] The accompanying drawings incorporated in the specification and forming a part thereof illustrate the embodiments in line with the present disclosure, and are configured to explain principles of the present disclosure in combination with the specification.
[0041] FIG. 1 is a schematic structural diagram of a data compression system according to an exemplary embodiment;
[0042] FIG. 2 shows a schematic diagram of a swinging door trending algorithm involved in an embodiment of the present disclosure;
[0043] FIG. 3 is a schematic flowchart of a method for compressing data according to an exemplary embodiment;
[0044] FIG. 4 is a flowchart of a method for compressing data according to an exemplary embodiment;
[0045] FIG. 5 is a schematic flowchart of a method for training the parameter update model involved in the embodiment shown in FIG. 4;
[0046] FIG. 6 is a schematic flowchart of a method for compressing data involved in the embodiment shown in FIG. 4;
[0047] FIG. 7 is a schematic diagram of application of the data compression involved in the embodiment shown in FIG. 4; [0048] FIG. 8 is a schematic diagram of application of the data compression involved in the embodiment shown in FIG. 4;
[0049] FIG. 9 is a frame diagram of a data compression process according to an exemplary embodiment;
[0050] FIG. 10 is a block diagram of a structure of an apparatus for compressing data according to an exemplary embodiment; and
[0051] FIG. 11 is a schematic structural diagram of a computer device according to an exemplary embodiment.
DETAILED DESCRIPTION
[0052] Exemplary embodiments are explained in detail hereinafter, and examples of the exemplary embodiments are illustrated in the accompanying drawings. Where the description hereinafter relates to the accompanying drawings, unless otherwise specified, identical reference numerals in the accompanying drawings denote identical or like elements. The implementations set forth in the following description of exemplary embodiments do not represent all implementations consistent with the present disclosure. Instead, they are merely examples of apparatuses and methods consistent with some aspects related to the present disclosure as recited in the appended claims.
[0053] Prior to describing various embodiments of the present disclosure, several concepts involved in the present disclosure are introduced at first.
[0054] 1) Artificial Intelligence (AI)
[0055] AI is a theory, a method, a technology or an application system which simulates, extends and expands human intelligence to perceive an environment, acquire knowledge and obtain the best result using the knowledge by using a digital computer or a machine controlled by the digital computer. In other words, AI is a comprehensive technology in computer science, which intends to understand the essence of intelligence and produce a new type of intelligent machines which can respond in a fashion similar to human intelligence. That is, AI is to study design principles and implementation methods of various intelligent machines, such that the machines have functions of perception, reasoning and decision-making.
[0056] AI technology is a comprehensive discipline, and involves both hardware technologies and software technologies in a wide range of fields. Basic AI technologies generally include a sensor technology, a dedicated AI chip technology, a cloud computing technology, a distributed storage technology, a big data processing technology, an operating/interaction system technology, an electromechanical integration technology, and the like. Software AI technologies mainly include a computer vision technology, a voice processing technology, a natural language processing technology, a machine learning/deep learning technology, and the like.
[0057] 2) Data Compression (DC)
[0058] DC refers to a technological method of reducing a data amount to reduce a storage space and improve the transmission, storage and processing efficiency, or reorganizing the data by algorithms to reduce data redundancy and the storage space without losing useful information. DC includes lossy compression and lossless compression.
[0059] 3) Reinforcement Learning (RL)
[0060] RL, also known as reinforced learning, evaluation learning, or enhancement learning, is a paradigm and methodology of machine learning, and is configured to describe and solve the problem of achieving a maximum return or a specific purpose of an agent based on learning strategies in the process of interacting with the environment.
[0061] A common model of RL is the standard Markov decision process (MDP). According to given conditions, RL may be divided into model-based RL, model-free RL, active RL and passive RL. In RL, the agent learns in a "trial and error" fashion, and its behavior is guided by the rewards acquired by interacting with the environment, with the purpose that the agent gains the greatest reward. RL differs from supervised learning in connectionist learning mainly in the reinforcement signal. The reinforcement signal provided by the environment in RL is an evaluation of the quality of a generated action (usually a scalar signal), rather than telling a reinforcement learning system (RLS) how to generate the correct action. In the case that the external environment provides little information, the RLS must rely on its own experience to learn. In this way, the RLS gains knowledge in an action-evaluation environment and improves action plans to adapt to the environment.
[0062] FIG. 1 is a schematic structural diagram of a data compression system according to an exemplary embodiment. The system includes a data storage device 120 and a data compression device 140.
[0063] The data storage device 120 may include a data storage module (not shown in the drawing), and data to be compressed may be stored in the data storage module in advance; or the data storage device 120 is directly connected to a sensor, and the sensor may be one sensor or several sensors. The sensor generates corresponding timing data via changes in the external environment, and sends the timing data to the data storage device for storage.
[0064] The data compression device 140 may include a data compressing module and a data processing module. The data to be compressed may be processed into a data form suitable for compression by the data processing module, or the data to be compressed may be directly analyzed by the data processing module. The data compression device may further compress the data to be compressed using the data compressing module to acquire compressed data after compression.
[0065] In some embodiments, the data compression device 140 may include a compressed data storage module, and the data compressing module compresses the data to be compressed, and saves the compressed data after compression to the compressed data storage module.
[0066] In some embodiments, the data compression device 140 may be a server, and may include one server, or several servers, or may be a distributed computer cluster composed of several servers, or may be a virtualization platform, or may be a cloud computing service center, etc., which is not limited in the present disclosure.
[0067] The data storage device 120 and the data compression device 140 are connected via a communication network. In some embodiments, the communication network is a wired network or a wireless network.
[0068] In some embodiments, the system may further include a management device (not shown in FIG. 1), and the management device is connected to the data storage device 120 and the data compression device 140 via a communication network. In some embodiments, the communication network is a wired network or a wireless network.
[0069] In some embodiments, the above wireless network or wired network uses a standard communication technology and/or protocol. The network is usually Internet, but may also be any network, including but not limited to any combination of a local area network (LAN), a metropolitan area network (MAN), a wide area network (WAN), a mobile network, a wired network or a wireless network, a private network or a virtual private network. In some embodiments, the data exchanged over the network is represented by using technologies and/or formats including hyper text mark-up language (HTML), extensible markup language (XML), and the like. In addition, all or some links may also be encrypted by using Secure Socket Layer (SSL), Transport Layer Security (TLS), virtual private network (VPN), Internet Protocol Security (IPsec) and other conventional encryption technologies. In some embodiments, some customized and/or dedicated data communication technologies may also be configured to replace or supplement the data communication technologies above.
[0070] Referring to FIG. 2, FIG. 2 is a schematic diagram of a swinging door trending (SDT) algorithm involved in an embodiment of the present disclosure. As shown in FIG. 2, FIG. 2 shows the process of a method for compressing a timing data segment using the SDT algorithm. By taking nine timing data nodes corresponding to nine time nodes from t0 to t8 as an example, the SDT algorithm takes the timing data node corresponding to t0 as a starting point to start the first compression process. The compression process is as follows.
[0071] Firstly, based on a compression parameter AE, two points on the drawing with a vertical distance of AE from the starting point are taken as pivots (a pivot 1 and a pivot 2).
[0072] The timing data node corresponding to t1 is determined by the pivot 1 and the pivot 2. By connecting the timing data node corresponding to t1 to the two pivots, the timing data node corresponding to t1 and the two pivots form a triangle 1. In this case, a triangle inner angle corresponding to the pivot 1 in the triangle 1 is recorded as an inner angle 1; and a triangle inner angle corresponding to the pivot 2 in the triangle 1 is recorded as an inner angle 2. Obviously, a sum of the inner angle 1 and the inner angle 2 is less than 180 degrees, and the timing data node corresponding to t1 may be normally compressed in the compression process.
[0073] Then, the timing data node corresponding to t2 is determined. The determination process is the same as the determination process of t1. The timing data node corresponding to t2 is connected to the pivot 1 and the pivot 2 to form a triangle 2. In this case, the inner angle 1 is compared with a triangle inner angle corresponding to the pivot 1 in the triangle 2, and the larger inner angle is taken as the inner angle 1. The inner angle 2 is compared with a triangle inner angle corresponding to the pivot 2 in the triangle 2, the larger inner angle is taken as the inner angle 2, and then whether the sum of the inner angle 1 and the inner angle 2 is less than or equal to 180 degrees is determined. In FIG. 2, as the sum of the inner angle 1 and the inner angle 2 is still less than or equal to 180 degrees upon the determination of t2, the timing data node corresponding to t2 may also be compressed normally in the compression process.
[0074] In the same way, the timing data node corresponding to t3 and the timing data node corresponding to t4 may further be compressed normally in the compression process, which is not repeated here.
[0075] When the timing data node corresponding to t5 in FIG. 2 is determined, it can be seen that in a triangle 5 formed by connecting t5 and the two pivots, in the case that the triangle inner angle corresponding to the pivot 2 is updated to the inner angle 2, the sum of the inner angle 2 and the inner angle 1 is greater than 180 degrees. This indicates that the timing data node corresponding to t5 cannot be compressed normally in the compression process. In this case, the timing data node corresponding to t5 is taken as a new starting point to start the next compression process. Besides, the corresponding timing data nodes of t0 to t4 corresponding to the previous compression process are represented by a data segment, so as to complete the data compression process of the timing data nodes corresponding to t0 to t4.
[0076] The SDT algorithm is a lossy compression algorithm. In the SDT algorithm, the compression parameter AE is configured to control the compression accuracy and compression effect of the SDT algorithm. In the case that AE is smaller, a difference value between the timing data nodes allowed to be compressed in one compression process is smaller, such that the compression standard deviation is smaller. However, where AE is too small, more useless data points are retained, and the compression proportion is lower. In the case that AE is larger, the difference value between the timing data nodes allowed to be compressed in one compression process is larger, such that the compression proportion is higher. The larger compression difference value may lead to too much loss of compressed information of compression points in one compression process, and the compression standard deviation is larger. Therefore, in the process of data compression using the SDT algorithm, it is necessary to accurately control the value of AE, and increase the compression proportion as much as possible on the premise of ensuring the smaller error. Therefore, in the SDT algorithm, the selection of the value of AE is relatively difficult, which directly affects the compression performance.
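To make the door-closing test above concrete, the following is a minimal Python sketch of one common slope-based variant of the SDT algorithm; the function name sdt_compress, the (timestamp, value) input format, and the choice to restart the segment at the last representable point are assumptions for illustration rather than the exact variant of the embodiments.

```python
from typing import List, Tuple

def sdt_compress(points: List[Tuple[float, float]], ae: float) -> List[Tuple[float, float]]:
    """Swinging door trending (SDT): keep only the points needed to reconstruct
    the series within the tolerance set by the compression parameter AE.
    Timestamps are assumed to be strictly increasing."""
    if len(points) <= 2:
        return list(points)

    kept = [points[0]]
    t0, y0 = points[0]            # starting point of the current compression process
    up_slope = float("-inf")      # steepest slope through the upper pivot (y0 + AE)
    low_slope = float("inf")      # shallowest slope through the lower pivot (y0 - AE)
    prev = points[0]

    for t, y in points[1:]:
        up_slope = max(up_slope, (y - (y0 + ae)) / (t - t0))
        low_slope = min(low_slope, (y - (y0 - ae)) / (t - t0))
        if up_slope > low_slope:
            # the two "doors" have swung past parallel (the inner angles sum to more
            # than 180 degrees): close the segment at the previous node and restart
            kept.append(prev)
            t0, y0 = prev
            up_slope = (y - (y0 + ae)) / (t - t0)
            low_slope = (y - (y0 - ae)) / (t - t0)
        prev = (t, y)

    kept.append(points[-1])
    return kept
```

A larger ae keeps fewer points (a higher compression proportion) at the cost of a larger compression standard deviation, which is exactly the trade-off the parameter update model is intended to balance.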
[0077] Referring to FIG. 3, FIG. 3 is a schematic flowchart of a method for compressing data according to an exemplary embodiment. The method may be executed by a computer device, and the computer device may be the data compression device 140 in the embodiment shown in FIG. 1. As shown in FIG. 3, the process of the method for compressing data may include the following steps.
[0078] In 31, target data is acquired, wherein the target data includes at least two target data segments.
[0079] In some embodiments, the target data may be timing data.
[0080] The timing data refers to time sequence data. The time sequence data is a data sequence of a same indicator recorded in chronological order. The data in the same data sequence is of the same size and is comparable.
[0081] In some embodiments, the target data may be segmented into respective target data segments according to time identifiers.
[0082] The timing data may include the time identifiers for indicating time information in the timing data, and the target data may be determined as respective target data segments according to the time identifiers.
[0083] In some embodiments, the data amount of each target data segment is the same.
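As an illustration of the segmentation step, a minimal helper is sketched below; the (time identifier, value) record format and the fixed segment_size are assumptions, and the last segment may contain fewer points when the data amount is not an exact multiple of the segment size.

```python
from typing import List, Tuple

def split_into_segments(records: List[Tuple[float, float]],
                        segment_size: int) -> List[List[Tuple[float, float]]]:
    """Split chronologically ordered (time identifier, value) records into
    target data segments of (roughly) equal data amount."""
    ordered = sorted(records, key=lambda r: r[0])   # order by the time identifier
    return [ordered[i:i + segment_size]
            for i in range(0, len(ordered), segment_size)]
```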
[0084] In 32, a compression parameter corresponding to an ith target data segment of the at least two target data segments is acquired by updating, based on compressed data information corresponding to an (i-1)th target data segment of the at least two target data segments, a compression parameter corresponding to the (i-1)th target data segment using a parameter update model. [0085] Here, i is an integer greater than or equal to 2, the compressed data information includes at least one of a compression proportion and a compression standard deviation; the parameter update model is acquired by reinforcement learning according to a history compression parameter and compressed data information corresponding to the history compression parameter, the compression parameter is indicative of a compression accuracy of data compression on the at least two target data segments; and the history compression parameter is a compression parameter corresponding to a history target data segment.
[0086] In some embodiments, the parameter update model includes a first model branch and a second model branch.
[0087] The first model branch is configured to update the compression parameter corresponding to the (i-1)th target data segment based on the compressed data information corresponding to the (i-1)th target data segment; and the second model branch is configured to instruct the first model branch to update the compression parameter to increase the compression proportion corresponding to the compression parameter and decrease the compression standard deviation corresponding to the compression parameter.
[0088] In some embodiments, the history target data segment includes the target data segment prior to the ith target data segment.
[0089] In some embodiments, the history target data segment includes a sample target data segment in sample data; and the sample data is data of a same type as the target data, and the sample data includes at least two sample target data segments.
[0090] In 33, data compression is performed on the ith target data segment based on the compression parameter corresponding to the ith target data segment.
[0091] In summary, in the solution shown in the embodiments of the present disclosure, the parameter update model is updated based on the history compression parameter acquired in the compression process, and the compression proportion and the compression standard deviation corresponding to the history compression parameter. The updated compression parameter is acquired by updating the compression parameter corresponding to the previous target data segment using the updated parameter update model, and the compressed data information corresponding to the target data segment is acquired by compressing the target data segment based on the updated compression parameter. Continuously, based on the compressed data information, the compression parameter corresponding to the next target data segment is acquired using the parameter update model. With the above solution, the parameter update model is updated based on the compression proportion and the compression standard deviation of the history compression parameter, the parameter update model adjusts the compression parameter corresponding to the target data segment based on the compression proportion and the compression standard deviation of the previous target data segment, and a value of the compression parameter is accurately adjusted, which improves the compression efficiency in the case of ensuring the compression accuracy.
[0092] Referring to FIG. 4, FIG. 4 is a flowchart of a method for compressing data according to an exemplary embodiment. The method may be executed by a computer device, and the computer device may be the data compression device 140 in the embodiment shown in FIG. 1, and the data compression device may be a server. As shown in FIG. 4, the method for compressing data may include the following steps.
[0093] In 401, at least two sample target data segments are acquired.
[0094] The sample target data segment is a data segment in sample data.
[0095] In 402, an updated parameter update model is acquired by training, based on the at least two sample target data segments, a parameter update model.
[0096] In some embodiments, the sample data further includes an initial sample data segment.
[0097] The initial sample data segment is a first data segment in the sample data.
[0098] In some embodiments, an initial sample compression parameter is acquired based on the initial sample data segment, wherein the initial sample compression parameter is a compression parameter corresponding to the first sample target data segment.
[0099] In some embodiments, the parameter update model is initialized based on the initial sample compression parameter and a parameter preset by a user.
[00100] That is, the parameter update model may be initially constructed by the initial sample compression parameter and the preset parameter before being updated based on the at least two sample target data segments. The server then trains the initialized parameter update model based on the at least two sample target data segments.
[00101] In some embodiments, 402 may include 402a, 402b, and 402c.
[00102] In 402a, a sample compression parameter corresponding to an nth sample target data segment is acquired by updating, based on compressed data information corresponding to an (n-1)th sample target data segment, a sample compression parameter corresponding to the (n-1)th sample target data segment using the parameter update model; n is an integer greater than or equal to 2.
[00103] The compressed data information corresponding to the (n-1)th sample target data segment includes at least one of a compression proportion and a compression standard deviation corresponding to the (n-1)th sample target data segment.
[00104] In some embodiments, the compression proportion corresponding to the (n-1)th sample target data segment is a ratio of the number of data points before the (n-1)th sample target data segment is compressed to the number of data points after the (n-1)th sample target data segment is compressed. The compression proportion is shown in the following formula:
[00105] Proportion = N / N*
[00106] wherein N represents the number of data points contained in the data segment before compression, and N* represents the number of data points contained in the data segment after compression.
[00107] In some embodiments, the compression standard deviation Std is shown in the following formula:
[00108] Std = √( Σᵢ (yᵢ − u)² / n )
[00109] wherein yᵢ represents a data value corresponding to each data point in the at least two sample target data segments, u represents an average value of the data values corresponding to each data point in the at least two sample target data segments, and n represents the number of the data points in the at least two sample target data segments.
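Both quantities can be computed directly from a segment before and after compression; the sketch below simply mirrors the two formulas above (the function names are illustrative).

```python
import math
from typing import List

def compression_proportion(n_before: int, n_after: int) -> float:
    """Proportion = N / N*: N data points before compression, N* after."""
    return n_before / n_after

def compression_std(values: List[float]) -> float:
    """Std = sqrt(sum((y_i - u)^2) / n) over the data values of the segment(s)."""
    n = len(values)
    u = sum(values) / n
    return math.sqrt(sum((y - u) ** 2 for y in values) / n)
```

Together with the compression parameter AE, these two numbers form the State fed to the parameter update model in the reinforcement learning loop.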
[00110] In 402b, compressed data of the nth sample target data segment, and compressed data information corresponding to the nth sample target data segment are acquired by performing data compression on the nth sample target data segment based on the sample compression parameter corresponding to the nth sample target data segment.
[00111] In some embodiments, in the case that the compression method is the SDT algorithm, the sample compression parameter is the compression parameter AE in the SDT algorithm, which is configured to control the compression accuracy and the compression effect of the SDT algorithm.
[00112] In 402c, the parameter update model is updated based on sample compression parameters corresponding to the at least two sample target data segments and compressed data information corresponding to the at least two sample target data segments.
[00113] In some embodiments, in the case that the nth sample target data segment meets a specified condition, the parameter update model is updated based on compressed data information corresponding to prior N sample target data segments of the nth sample target data segment.
[00114] The prior N sample target data segments are N data segments prior to the nth sample target data segment in the sample data; and N is an integer greater than or equal to 1, and less than n.
[00115] In some embodiments, the parameter update model includes a first model branch and a second model branch. The first model branch is configured to update the sample compression parameter corresponding to the (n-1)th sample target data segment based on the compressed data information corresponding to the (n-1)th sample target data segment; and the second model branch is configured to instruct the first model branch to update the compression parameter to increase the compression proportion corresponding to the compression parameter and decrease the compression standard deviation corresponding to the compression parameter.
[00116] The parameter update model is a model constructed by an Actor-Critic-based proximal policy optimization (PPO) algorithm. Therefore, in the parameter update model, the first model branch is an Actor network model, and the second model branch is a Critic network model.
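As a rough sketch of how the two branches might be realized (the three-dimensional state of (AE, proportion, Std), the layer sizes and the Gaussian action head are assumptions; the disclosure does not fix a network architecture), an Actor that outputs a change of AE and a Critic that outputs a Value could be written as:

```python
import torch
import torch.nn as nn

class ActorNet(nn.Module):
    """First model branch: maps a state (AE, proportion, Std) to a Gaussian
    over the change value of AE (the Action)."""
    def __init__(self, state_dim: int = 3, hidden: int = 64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, 1),
        )
        self.log_std = nn.Parameter(torch.zeros(1))  # scale of the "random numbers"

    def forward(self, state: torch.Tensor) -> torch.distributions.Normal:
        return torch.distributions.Normal(self.body(state), self.log_std.exp())

class CriticNet(nn.Module):
    """Second model branch: maps a state to a scalar Value estimate."""
    def __init__(self, state_dim: int = 3, hidden: int = 64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, 1),
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.body(state)
```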
[00117] In some embodiments, based on the compressed data information corresponding to the (n-1)th sample target data segment, the sample compression parameter corresponding to the nth sample target data segment is acquired by updating the sample compression parameter corresponding to the (n-1)th sample target data segment using the first model branch; based on the sample compression parameters corresponding to the at least two sample target data segments, and the compressed data information corresponding to the at least two sample target data segments, value information corresponding to the nth sample target data segment is acquired using the second model branch; and the value information is configured to instruct the first model branch to update the compression parameter to increase the compression proportion corresponding to the compression parameter and decrease the compression standard deviation corresponding to the compression parameter.
[00118] Based on the value information, the first model branch and the second model branch are updated.
[00119] In the case that the parameter update model is the Actor-Critic network model, the Actor network model takes the compression parameter AE, the compression proportion and the compression standard deviation Std as input parameters, and combines some random numbers to calculate a change value of AE, and the change value of AE is taken as an Action in the RL concept. The compression parameter AE, the compression proportion and the compression standard deviation Std are taken as a state in the RL concept. A difference value between the proportion corresponding to the nth sample target data segment and the proportion corresponding to the (n-1)th sample target data segment is calculated, a difference value between the Std corresponding to the nth sample target data segment and the Std corresponding to the (n-1)th sample target data segment is calculated, and a ratio of the two difference values is taken as a Reward corresponding to the nth sample target data segment.
[00120] When the nth sample target data segment meets the specified condition, the Actor-Critic network model is updated based on the States, Rewards, and Actions corresponding to the prior N sample target data segments of the nth sample target data segment. Specifically, the Actor in the model is replicated as Old_Actor before being updated based on the states of the prior N sample target data segments. The Actor updated based on the states corresponding to the prior N sample target data segments is compared with the non-updated Old_Actor to calculate an a_loss, and the Rewards and states corresponding to the prior N sample target data segments are taken as input parameters of the Critic network model in the model to acquire value information corresponding to the prior N sample target data segments (that is, Value in the RL concept). Besides, a c_loss is acquired based on the value information, wherein the a_loss is a loss function value of the Actor network, and the c_loss is a loss function value of the Critic network. Based on the a_loss and the c_loss, back propagation is performed on both the Actor network and the Critic network to update the parameter update model.
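A simplified version of that update is sketched below; it assumes an actor that returns a torch Normal distribution and a critic that returns a Value (such as the illustrative ActorNet and CriticNet above), and the clipping constant, the use of undiscounted reward-to-go as the return, and the optimizer handling are assumptions rather than details taken from the disclosure.

```python
import torch
import torch.nn.functional as F

def ppo_update(actor, old_actor, critic, actor_opt, critic_opt,
               states, actions, rewards, clip_eps: float = 0.2, epochs: int = 10):
    """PPO-style update from the prior N sample target data segments.
    states: (N, 3) tensor of (AE, proportion, Std); actions: (N, 1) tensor of
    AE changes; rewards: (N,) tensor of per-segment Rewards.
    old_actor is the frozen copy of the Actor (e.g. copy.deepcopy) taken before
    this update."""
    with torch.no_grad():
        # undiscounted reward-to-go used as the return target
        returns = torch.flip(torch.cumsum(torch.flip(rewards, [0]), 0), [0]).unsqueeze(-1)
        old_log_prob = old_actor(states).log_prob(actions)

    for _ in range(epochs):
        value = critic(states)                      # Value of the prior N segments
        advantage = (returns - value).detach()

        # a_loss: clipped surrogate comparing the updated Actor with Old_Actor
        ratio = (actor(states).log_prob(actions) - old_log_prob).exp()
        surrogate = torch.min(ratio * advantage,
                              torch.clamp(ratio, 1 - clip_eps, 1 + clip_eps) * advantage)
        a_loss = -surrogate.mean()

        # c_loss: how far the Critic's Value is from the observed return
        c_loss = F.mse_loss(value, returns)

        actor_opt.zero_grad()
        a_loss.backward()
        actor_opt.step()

        critic_opt.zero_grad()
        c_loss.backward()
        critic_opt.step()
```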
[00121] In 403, the target data is acquired, wherein the target data includes at least two target data segments.
[00122] The target data is data of the same type as the sample target data.
[00123] In 404, an initial compression parameter is acquired based on an initial data segment, wherein the initial compression parameter is a compression parameter corresponding to a first target data segment.
[00124] The target data further includes the initial data segment, and the initial data segment is a first data segment in the target data.
[00125] In some embodiments, the initial compression parameter is acquired based on a standard deviation of data values corresponding to the data points of the initial data segment.
[00126] The server may directly analyze the distribution of the data values corresponding to the data points in the initial data segment, for example, the distribution of the data values corresponding to the data points in the initial data segment is analyzed according to the standard deviation, which can partially reflect the distribution of the data values corresponding to the overall data points of the corresponding target data of the initial data segment. The initial compression parameter may be positively correlated with the standard deviation of the data values corresponding to the data points in the initial data segment. Where the standard deviation is larger, the difference value of the data values corresponding to the data points in the initial data segment is larger, and the data points are relatively scattered. Therefore, it can be inferred that the distribution of the overall data points of the target data may further be relatively scattered. Therefore, in order to ensure a compression proportion, a larger initial compression parameter may be set, and then the initial compression parameter is updated according to the compression situation of a subsequent target data segment. Similarly, where the standard deviation is smaller, the difference value of the data values corresponding to respective data points in the initial data segment is smaller, in the case of ensuring the compression proportion, a smaller initial compression parameter may be set to improve the compression accuracy, and then the initial compression parameter is updated according to the compression situation of the subsequent target data segment.
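One plausible way to derive the initial compression parameter in the manner described above is to scale the standard deviation of the data values in the initial data segment; the scaling factor k below is an assumed hyperparameter, not a value given in the disclosure.

```python
import math
from typing import List

def initial_compression_parameter(initial_segment: List[float], k: float = 0.5) -> float:
    """Initial AE, positively correlated with the spread of the initial data segment."""
    n = len(initial_segment)
    u = sum(initial_segment) / n
    std = math.sqrt(sum((y - u) ** 2 for y in initial_segment) / n)
    return k * std
```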
[00127] In 405, a compression parameter of the first target data segment and compressed data information corresponding to the first target data segment are acquired by performing data compression on the first target data segment based on the initial compression parameter.
[00128] In some embodiments, data compression is performed on the first target data segment using the SDT algorithm based on the initial compression parameter, to acquire the compression parameter of the first target data segment and the compressed data information corresponding to the first target data segment.
[00129] The initial compression parameter is the compression parameter AE corresponding to the SDT algorithm.
[00130] In 406, a compression parameter corresponding to an 1th target data segment of the at least two target data segments is acquired by updating, based on compressed data information corresponding to an (i-l)lh target data segment of the at least two target data segments, a compression parameter corresponding to the (i-l)111 target data segment using the parameter update model.
[00131] In some embodiments, in the case that the parameter update model includes a first model branch, the first model branch is configured to update the compression parameter corresponding to the (i-l)01 target data segment based on the compressed data information corresponding to the (i- l)th target data segment.
[00132] In some embodiments, based on the compressed data information corresponding to the (i-l)1*1 target data segment, the compression parameter corresponding to the ith target data segment is acquired by updating the compression parameter corresponding to the (i-l)111 target data segment using the first model branch.
[00133] The process of updating the compression parameter corresponding to the (i-1)th target data segment using the parameter update model, and the process of updating the compression parameter corresponding to the (i-1)th target data segment using the first model branch and the second model branch, are the same as the process of updating the compression parameter in the model training process in 402, which is not repeated here.
[00134] In 407, data compression is performed on the ith target data segment based on the compression parameter corresponding to the ith target data segment.
[00135] In some embodiments, in the case that the ith target data segment meets a specified condition, the parameter update model is updated based on compression parameters corresponding to prior N target data segments of the ith target data segment and compressed data information corresponding to the prior N target data segments; the prior N target data segments are N data segments prior to the ith target data segment in the target data, and N is an integer greater than or equal to 1.
[00136] In some embodiments, in the case that i is a preset value, based on the compression parameters corresponding to the prior N target data segments of the ith target data segment and the compressed data information corresponding to the prior N target data segments, the parameter update model is updated.
[00137] The preset value may be one value or several values. For example, in the case that the preset value is one value, the user may set the preset value to 10. In this case, based on the compression parameters corresponding to the prior N target data segments of the tenth target data segment and the compressed data information corresponding to the prior N target data segments, the parameter update model is updated; or, the preset value may be a multiple of 5, that is, 5, 10, 15, 20, or the like. In this case, when i reaches the preset value, the parameter update model is updated.
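A minimal sketch of this trigger, assuming the multiple-of-5 rule given as an example above (the helper name is hypothetical):

```python
def should_update_model(i, preset=5):
    # Update the parameter update model each time i reaches a multiple of the
    # preset value (5, 10, 15, ...); the value 5 is only illustrative.
    return i >= preset and i % preset == 0
```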
[00138] In some embodiments, the parameter update model includes a second model branch, and the second model branch is configured to instruct the first model branch to update the compression parameter to increase the compression proportion corresponding to the compression parameter and decrease the compression standard deviation corresponding to the compression parameter.
[00139] In some embodiments, based on the compression parameters corresponding to the prior N target data segments of the ith target data segment and the compressed data information corresponding to the prior N target data segments, value information corresponding to the ith target data segment is acquired using the second model branch; and the value information is configured to instruct the first model branch to update the compression parameter to increase the compression proportion corresponding to the compression parameter and decrease the compression standard deviation corresponding to the compression parameter; and the first model branch and the second model branch are updated based on the value information.
[00140] The process of updating the parameter update model based on the compression parameters corresponding to the prior N target data segments of the ith target data segment and the compressed data information corresponding to the prior N target data segments is the same as the process of updating the parameter update model in the model training process in the above 402, which is not repeated here.
[00141] FIG. 5 schematically describes the process of updating the parameter update model based on the compression parameters corresponding to the prior N target data segments of the ith target data segment and the compressed data information corresponding to the prior N target data segments. Referring to FIG. 5, FIG. 5 is a schematic flowchart of a method for training a parameter update model involved in an embodiment of the present disclosure. As shown in FIG. 5, data compression is performed on the ith target data segment using the SDT algorithm to obtain a compression proportion and a compression standard deviation (Std) corresponding to the ith target data segment. Then the compression proportion and the Std corresponding to the ith target data segment, and the compression parameter ΔE corresponding to the ith target data segment, are taken as a State in the RL method and input into the parameter update model (S501). The embodiment adopts a proximal policy optimization (PPO) algorithm in the RL method.
[00142] In some embodiments, the parameter update model may be an Actor-critic model in an RL model, and an Actor network in the Actor-critic model is configured to acquire a compression parameter of an (i+1)th target data segment based on State information of the ith target data segment. That is, the compression parameter ΔE, the compression proportion and the compression standard deviation Std are taken as input parameters, and a change value of the compression parameter ΔE of the (i+1)th target data segment is calculated by combining a random number (that is, Action in the RL concept). Besides, difference values between the compression proportion and the compression standard deviation corresponding to the ith target data segment, and the compression proportion and the compression standard deviation corresponding to the (i+1)th target data segment, are calculated (that is, Reward in the RL concept).
[00143] A round of the update process of the Actor-critic model is as follows.
[00144] The State information corresponding to the ith target data segment is input into the Actor network (S502) to acquire two parameters μ and σ, and the two parameters are taken as a mean and a variance of a normal distribution to construct the normal distribution. The normal distribution is configured to express a probability distribution of the Action (the change value of ΔE) in the RL method. Then the normal distribution is sampled based on the random number to acquire a Sample Action as the change value of ΔE of the next target data segment (S503), and the change value is recorded as the Action corresponding to the ith target data segment (S504).
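A minimal sketch of steps S502 to S504 is given below, assuming a small fully connected Actor network implemented with PyTorch; the layer sizes are illustrative, and the second output head is treated as the scale of the normal distribution.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Actor(nn.Module):
    # State = (compression parameter deltaE, compression proportion, compression Std)
    def __init__(self, state_dim=3, hidden=64):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(state_dim, hidden), nn.Tanh())
        self.mu_head = nn.Linear(hidden, 1)      # mean of the Action distribution
        self.sigma_head = nn.Linear(hidden, 1)   # scale of the Action distribution

    def forward(self, state):
        h = self.body(state)
        mu = self.mu_head(h)
        sigma = F.softplus(self.sigma_head(h)) + 1e-4  # keep the scale positive
        return mu, sigma

def sample_action(actor, state):
    # S502-S504: build the normal distribution from (mu, sigma) and sample the
    # change value of deltaE for the next target data segment.
    mu, sigma = actor(state)
    dist = torch.distributions.Normal(mu, sigma)
    action = dist.sample()
    return action, dist.log_prob(action)
```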
[00145] The change value of ΔE is input into a compression program (i.e., the compression environment) and applied to the compression parameter ΔE corresponding to the ith target data segment, and the compression parameter ΔE corresponding to the (i+1)th target data segment is determined. Then, based on the compression parameter ΔE corresponding to the (i+1)th target data segment, data compression is performed on the (i+1)th target data segment using the SDT algorithm to acquire compressed data corresponding to the (i+1)th target data segment, and a compression proportion and a compression standard deviation corresponding to the (i+1)th target data segment (S505). The compression parameter, compression proportion and compression standard deviation corresponding to the (i+1)th target data segment are saved as the State corresponding to the (i+1)th target data segment (S506) and are input into the Actor network, and the next Action (the next process, not shown in the drawing) is acquired. In addition, the Actor network acquires a reward value corresponding to the (i+1)th target data segment based on the compression proportion and the compression standard deviation Std corresponding to the ith target data segment, and the compression proportion and the compression standard deviation Std corresponding to the (i+1)th target data segment (S507). The above update process of the compression parameter is cycled until a model update device (server) stores parameters (s, a, r) corresponding to a number of target data segments, that is, the parameters (State, Action, Reward) corresponding to all of these target data segments.
[00146] Then, via a critic network in the Actor-critic model, based on the parameters (s, a, r) corresponding to the number of target data segments, a value corresponding to the number of target data segments is acquired, and based on the parameters (s, a, r) corresponding to the number of target data segments and the value, the critic network and the Actor network are updated.
[00147] Specifically, in the case that the model update device stores the parameters (s, a, r) corresponding to the number of target data segments trained in the above steps, the parameters (s, a, r) acquired after the above cyclic update process of the compression parameter are input into the critic network (S508), so as to acquire the value v_ corresponding to the above cyclically updated model, and a discount reward corresponding to the above cyclically updated model is calculated based on v_. The formula of the discount reward is as follows:
[00148] R[t] = r[t] + γ·R[t+1], wherein the recursion starts from the last moment T with R[T] = r[T] + γ·v_
[00149] wherein r[t] represents a Reward value of the target data segment corresponding to the parameter update model at moment t, γ represents a discount coefficient in the RL concept, γ is greater than or equal to 0 and less than or equal to 1, and γ is indicative of an influence rate of the Reward at the current moment on a future moment; and v_ is the value acquired by the target data segment via the critic network at moment t.
[00150] For the number of target data segments, the discount rewards corresponding to the number of target data segments are calculated using the above discount reward formula. For example, in the case that a value of the number is T+1, the discount rewards corresponding to the T+1 target data segments are R[0], R[1], R[2], R[3], ..., R[T] respectively, wherein T is the last moment (time step).
[00151] Then the parameters (s, a, r) of the number (T+1) stored in the model update device are input into the critic network to acquire the value V_ corresponding to the number of target data segments, and then a first difference value (At) between the V_ value of each of the target data segments and the discount reward is calculated by the formula At = R - V_ (S509).
[00152] Based on the first difference value taken as an input value of a critic network loss function, a loss function value is acquired (S510), and the critic network is updated using back propagation of the loss function value. The loss function may be c_loss, which is equal to mean(square(At)), that is, the loss function value is an average of the squares of the first difference values corresponding to the target data segments (S511).
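Steps S508 to S511 may be sketched as follows; gamma = 0.9 is an illustrative discount coefficient, and the helper names are hypothetical.

```python
def discounted_rewards(rewards, v_last, gamma=0.9):
    # S508: back up the rewards r[0..T] into discount rewards R[0..T],
    # bootstrapping from v_, the critic value at the last moment.
    R, running = [], v_last
    for r in reversed(rewards):
        running = r + gamma * running
        R.append(running)
    R.reverse()
    return R

def critic_update_terms(R, V):
    # S509-S511: first difference values At = R - V_ and the critic loss
    # c_loss = mean(square(At)).
    At = [r - v for r, v in zip(R, V)]
    c_loss = sum(a * a for a in At) / len(At)
    return At, c_loss
```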
[00153] Then the parameters (s, a, r) corresponding to the number of target data segments are input into an Old_Actor network and the Actor network (S512). The Old_Actor network and the Actor network have the same network structure, wherein the Old_Actor network is acquired after updating based on the compression parameter, the compression proportion and the compression standard deviation corresponding to the first target data segment; and the Actor network is acquired after updating based on the compression parameters, the compression proportions and the compression standard deviations corresponding to the number of target data segments.
[00154] Then, based on the parameters (s, a, r) corresponding to the number of target data segments, a normal distribution Normal1 corresponding to the Old_Actor network and a normal distribution Normal2 corresponding to the Actor network are constructed. Normal1 is indicative of the probability that the Action (that is, the change value of ΔE) takes each value in the Old_Actor network, and Normal2 is indicative of the probability that the Action takes each value in the Actor network. The Actions corresponding to the number of target data segments are input into both the normal distributions Normal1 and Normal2, so as to acquire the probabilities prob1 and prob2 of each Action in the two networks (Actor and Old_Actor), and then prob2 is divided by prob1 to acquire an importance weight, which is a ratio of the updated Actor network to the Old_Actor network.
[00155] Then, based on the ratio and the first difference value At, the Actor network is updated by back propagation of the loss function (S513), wherein the loss function a_loss may be expressed as mean(min(ratio*At, clip(ratio, 1-ξ, 1+ξ)*At)).
[00156] The clip function is a clipping function, that is, the ratio is clipped based on the interval (1-ξ, 1+ξ), and ξ is a constant. Therefore, from an intuitive point of view, for the above loss function, the ratio is firstly clipped into a range by the clip function, the clipped ratio is applied to the first difference value At to obtain one value, which is compared with the value obtained by applying the ratio directly to At, and the minimum value is taken as the updated loss function value of the Actor network.
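A sketch of the clipped update of S512-S513 is given below, assuming the Actions' log-probabilities under the Actor and Old_Actor distributions are available as tensors; the negative sign follows the common convention of minimising the negative surrogate, and xi = 0.2 is an illustrative clipping constant.

```python
import torch

def actor_loss(log_prob_new, log_prob_old, At, xi=0.2):
    # Importance weight ratio = prob2 / prob1, then the clipped surrogate
    # a_loss = mean(min(ratio * At, clip(ratio, 1 - xi, 1 + xi) * At)).
    # At is the tensor of first difference values (R - V_).
    ratio = torch.exp(log_prob_new - log_prob_old)           # prob2 / prob1
    surrogate = torch.min(ratio * At, torch.clamp(ratio, 1 - xi, 1 + xi) * At)
    return -surrogate.mean()                                  # minimised by back propagation
```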
[00157] The stored parameters (s, a, r) corresponding to the number of target data segments are cyclically input into the Old_Actor network and the Actor network, and the Actor network is updated. After a number of cycles, the network weights of the Actor network updated for the number of times are used to update the Old_Actor network (S514), such that in the next update process of the Actor network, the updated Old_Actor network may be used as a comparison network to construct an updated Normal1, thereby making the model update more efficient.
[00158] Referring to FIG. 6, FIG. 6 is a schematic flowchart of a method for compressing data involved in an embodiment of the present disclosure. As shown in FIG. 6, target data 61 may be divided into initial data 611 (an initial data segment) and several target data segments (a first target data segment 612, a second target data segment 613, and several subsequent target data segments). Firstly, an initial compression parameter corresponding to the initial data segment is acquired by performing data analysis on the initial data, that is, by calculating parameters such as a standard deviation and an average of the data contained in the initial data, and the initial compression parameter corresponding to the initial data is input into a parameter update model to acquire an initialized parameter update model.
[00159] The parameter update model does not adjust the initial compression parameter, and directly inputs the initial compression parameter into an SDT module 623 as a compression parameter of an SDT algorithm, and SDT is performed on the first target data segment 612 (that is, the first of the target data segments) based on the initial compression parameter, to acquire compressed data corresponding to the first target data segment 612, and a compression proportion and a compression standard deviation corresponding to the first target data segment 612.
[00160] The compression proportion and the compression standard deviation corresponding to the first target data segment, and the compression parameter corresponding to the first target data segment are input into the parameter update model 621 to acquire an updated compression parameter. The updated compression parameter, as a compression parameter of the second target data segment (that is, the second of the target data segments), is input into the SDT module, and SDT is performed on the second target data segment 613 to acquire compressed data corresponding to the second target data segment 613, and a compression proportion and a compression standard deviation corresponding to the second target data segment 613.
[00161] The above process is iterated. In the case that SDT is performed on an Nth target data segment 614 by the SDT module based on a compression parameter corresponding to the Nth target data segment 614, to acquire a compression proportion and a compression standard deviation corresponding to the Nth target data segment 614, the parameter update model is updated, based on compression parameters corresponding to the prior N target data segments, and compression proportions and compression standard deviations corresponding to the prior N target data segments, to acquire the updated parameter update model, wherein N is an integer greater than or equal to 2.
[00162] Then, based on the compression parameter corresponding to the Nth target data segment 614, and the compression proportion and the compression standard deviation corresponding to the Nth target data segment 614, a compression parameter corresponding to an (N+1)th target data segment is acquired using the updated parameter update model.
[00163] The above SDT process and the update process of the parameter update model are repeated until the target data 61 is completely compressed to acquire compressed data corresponding to the target data 61.
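The overall flow of FIG. 6 may be summarised by the following sketch; sdt_compress refers to the SDT sketch given earlier, while compression_standard_deviation, model.adjust and model.update are hypothetical helpers standing in for the SDT module and the two branches of the parameter update model.

```python
def compress_target_data(segments, initial_delta_e, model, n_update=10):
    # segments: the target data segments of the target data (after the initial data segment)
    # model: the trained parameter update model; adjust/update are assumed interfaces
    delta_e, compressed, history = initial_delta_e, [], []
    for i, segment in enumerate(segments, start=1):
        archived = sdt_compress(segment, delta_e)                 # SDT module
        proportion = 1 - len(archived) / len(segment)             # compression proportion (illustrative definition)
        std = compression_standard_deviation(segment, archived)   # compression Std (hypothetical helper)
        compressed.append(archived)
        history.append((delta_e, proportion, std))                # compression parameter and compressed data information
        if i % n_update == 0:                                     # specified condition for updating the model
            model.update(history[-n_update:])
        delta_e = model.adjust(delta_e, proportion, std)          # compression parameter for the next segment
    return compressed
```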
[00164] Referring to FIG. 7, FIG. 7 is a schematic diagram of application of the data compression involved in an embodiment of the present disclosure.
[00165] As shown in FIG. 7, real-time data or history data 702 may be acquired by performing data access and processing on data generated by a physical device 701, and the physical device 701 may be a sensor, and the real-time data or history data may be data acquired by data access and processing of each sensor according to a timing.
[00166] Then corresponding compressed data 704 is output from the real-time data or history data via a lossy compression program 703 (that is, the compression program constructed according to the compression method in the embodiment shown in FIG. 4), and the compressed data 704 is stored in a time-series database (TSDB).
[00167] Referring to FIG. 8, FIG. 8 is a schematic diagram of application of the data compression involved in an embodiment of the present disclosure.
[00168] As shown in FIG. 8, the target data may be a timing data file 801 pre-stored in a database. For example, the timing data file may be a comma-separated values (CSV) file. The timing data file is subjected to data processing into compressible history data 802, and then data compression is performed on the history data using an envision swinging door trending (ESDT) algorithm 803, that is, the compression method in the embodiments shown in FIG. 4, and the output compressed data 804 is stored in the TSDB.
[00169] The embodiments of the present disclosure provide an improved lossy compression algorithm ESDT based on the SDT algorithm; an RL-related algorithm is used, and learning is performed based on the timing data, such that an SDT parameter can be adaptively adjusted to acquire the compression effect of a high compression proportion and a low compression standard deviation. The compression proportion and the compression standard deviation which are continuously fed back by the compressed data are taken as a reward and punishment mechanism, and an RL model is configured to dynamically adjust the compression parameter to acquire a better compression effect.
[00170] In summary, in the solution shown in the embodiments of the present disclosure, the parameter update model is updated based on a history compression parameter acquired in the compression process, and a compression proportion and a compression standard deviation corresponding to the history compression parameter. The compression parameter corresponding to the previous target data segment is updated using an updated parameter update model to acquire the updated compression parameter, and the target data segment is compressed based on the updated compression parameter to acquire compressed data information corresponding to the target data segment. Continuously, based on the compressed data information, a compression parameter corresponding to the next target data segment is acquired using the parameter update model. With the above solution, the parameter update model is updated based on the compression proportion and the compression standard deviation of the history compression parameter, the parameter update model adjusts the compression parameter corresponding to the target data segment based on the compression proportion and the compression standard deviation of the previous target data segment, and a value of the compression parameter is accurately adjusted, which can improve the compression efficiency in the case of ensuring the compression accuracy.
[00171] Referring to FIG. 9, FIG. 9 is a framework diagram of a data compression process according to an exemplary embodiment. The method for compressing data is jointly executed by a model training device 90 and a data compression device 91. The model training device 90 and the data compression device 91 may be servers. As shown in FIG. 9, firstly, a parameter update model is trained in the model training device 90 based on sample data 900. The sample data 900 includes an initial sample data segment 901, a first sample target data segment (the first of the sample target data segments) 902, a second sample target data segment (the second of the sample target data segments) 903, and several sample target data segments including an Nth sample target data segment 904.
[00172] Firstly, by data analysis on the initial sample data, that is, by calculating the parameters such as a standard deviation and an average of data in the initial sample data, an initial compression parameter corresponding to the initial sample data segment is acquired, and based on the initial compression parameter corresponding to the initial sample data and a preset parameter, a parameter update model 905 is updated. The parameter update model 905 may be an Actor-Critic network model shown in FIG. 9, and the parameter update model 905 includes an Actor network 905a and a Critic network 905b.
[00173] The parameter update model 905 directly inputs the initial compression parameter into an SDT module 907 without adjusting a compression parameter 906 corresponding to the initial sample data segment, and the SDT module 907, by performing data compression on the first sample target data segment 902 based on the compression parameter 906 corresponding to the initial sample data segment, acquires compressed data corresponding to the first sample target data segment 902, and a compression proportion and a compression standard deviation corresponding to the first sample target data segment 902.
[00174] The compression proportion and the compression standard deviation corresponding to the first sample target data segment, and the compression parameter corresponding to the first sample target data segment are input to the parameter update model 905, and the compression parameter is updated via the Actor network 905a in the parameter update model 905 to acquire the updated compression parameter. Then the updated compression parameter, as a compression parameter corresponding to the second sample target data segment, is input into the SDT module, and compressed data corresponding to the second sample target data segment 903, and a compression proportion and a compression standard deviation corresponding to the second sample target data segment 903 are acquired by performing SDT on the second sample target data segment 903.
[00175] The above process is iterated. In the case that SDT is performed on the Nth sample target data segment 904 by the SDT module based on a compression parameter corresponding to the Nth sample target data segment 904, to acquire a compression proportion and a compression standard deviation corresponding to the Nth sample target data segment 904, the compression parameters corresponding to the prior N sample target data segments, and the compression proportions and the compression standard deviations corresponding to the prior N sample target data segments, are input into the Critic network 905b to acquire the Value corresponding to the prior N sample target data segments. Based on the Value corresponding to the prior N sample target data segments, and the State and Action which are acquired from the prior N sample target data segments via the Actor network, the parameter update model is updated. The update process is the same as the corresponding content of FIG. 5 in the embodiment corresponding to FIG. 4, and is not repeated here.
[00176] The above SDT process and the update process of the parameter update model are repeated until the sample data are completely compressed, that is, the model training device 90 has completed the training process of the parameter update model based on the sample data 900. In this case, the updated parameter update model is sent to the data compression device 91 to compress target data that needs to be compressed.
[00177] In the data compression process, the target data 910 is the same type of data as the sample data 900. The target data may be divided into an initial data segment 911, a first target data segment 912 (the first of the target data segments), a second target data segment 913 (the second of the target data segments), and several target data segments including an Nth target data segment 914.
[00178] A parameter update model 915 is a model trained with the sample data in the model training device 90. The parameter update model 915 and the parameter update model 905 have the same structure, and both are Actor-Critic network models (not shown in the drawing). Besides, a compression parameter 916 in the data compression device 91 is data of the same type as the compression parameter 906 in the model training device 90; and an SDT module 917 in the data compression device 91 is the same as the SDT module 907 in the model training device 90. Therefore, the compression process of the target data in the data compression device 91 and the update process of the parameter update model are consistent with the compression process of the sample data in the model training device 90, and are not repeated here. That is, the data compression device 91 updates the compression parameter corresponding to each of the target data segments using the parameter update model trained in the model training device to acquire the compression parameter of the next target data segment corresponding to each of the target data segments. In the case that a predetermined number of target data segments are compressed, the data compression device may update the parameter update model based on the compression parameters, the compression proportions, and the compression standard deviations corresponding to the predetermined number of target data segments, and the updated model is further in line with characteristics of the target data, thereby improving the compression proportion of the data acquired by compression using the SDT algorithm, and decreasing the compression standard deviation of the target data.
[00179] The above model training process collects a real data training set that needs to be compressed. As model training and compression calculation are two separate processes, while the training program learns, a compression program for compressing the training set (sample data) may be preset in the server; after analyzing the training set, the compression calculation is started, and a calculation result is sent to a neural network (the parameter update model). In the model training process, the training program for the parameter update model is firstly started, an initial model is acquired by learning the training set data, and then the initial model, as the parameter update model, is placed in a compression environment to acquire real-time swinging door calculation data; learning is performed at regular time intervals, and the updated parameter update model is saved.
[00180] FIG. 10 is a block diagram of the structure of an apparatus for compressing data according to an exemplary embodiment. The apparatus for compressing data may implement all or part of the steps in the method according to the embodiment shown in FIG. 3 or FIG. 4. The apparatus for compressing data may include:
[00181] a target data acquiring module 1001, configured to acquire target data; wherein the target data includes at least two target data segments;
[00182] a compression parameter updating module 1002, configured to acquire a compression parameter corresponding to an ith target data segment of the at least two target data segments by updating, based on compressed data information corresponding to an (i-1)th target data segment of the at least two target data segments, a compression parameter corresponding to the (i-1)th target data segment using a parameter update model; wherein i is an integer greater than or equal to 2, the compressed data information includes at least one of a compression proportion and a compression standard deviation, the parameter update model is acquired by reinforcement learning based on a history compression parameter and compressed data information corresponding to the history compression parameter, the compression parameter is indicative of a compression accuracy of data compression on the at least two target data segments, and the history compression parameter is a compression parameter corresponding to a history target data segment; and
[00183] a data compressing module 1003, configured to perform, based on the compression parameter corresponding to the ith target data segment, data compression on the ith target data segment.
[00184] In some embodiments, the history target data segment includes the target data segment prior to the ith target data segment; and [00185] the apparatus further includes:
[00186] a model updating module, configured to update, based on compression parameters corresponding to prior N target data segments of the ith target data segment, and compressed data information corresponding to the prior N target data segments, the parameter update model in the case that the ith target data segment meets a specified condition; wherein the prior N target data segments are N target data segments prior to the ith target data segment in the target data, and N is an integer greater than or equal to 1, and less than i.
[00187] In some embodiments, the model updating module is configured to: update, based on the compression parameters corresponding to the prior N target data segments of the ith target data segment and the compressed data information corresponding to the prior N target data segments, the parameter update model in the case that i is a preset value.
[00188] In some embodiments, the parameter update model includes a first model branch and a second model branch; wherein
[00189] the first model branch is configured to update the compression parameter corresponding to the (i-1)th target data segment based on the compressed data information corresponding to the (i-1)th target data segment; and
[00190] the second model branch is configured to instruct the first model branch to update the compression parameter to increase the compression proportion corresponding to the compression parameter and decrease the compression standard deviation corresponding to the compression parameter.
[00191] In some embodiments, the compression parameter updating module 1002 is further configured to:
[00192] acquire the compression parameter corresponding to the ith target data segment by updating, based on the compressed data information corresponding to the (i-1)th target data segment, the compression parameter corresponding to the (i-1)th target data segment using the first model branch; and
[00193] the model updating module is further configured to:
[00194] acquire, based on the compression parameters corresponding to the prior N target data segments of the ith target data segment and the compressed data information corresponding to the prior N target data segments, value information corresponding to the ith target data segment using the second model branch; wherein the value information is configured to instruct the first model branch to update the compression parameter to increase the compression proportion corresponding to the compression parameter and decrease the compression standard deviation corresponding to the compression parameter; and
[00195] update, based on the value information, the first model branch and the second model branch.
[00196] In some embodiments, the target data further includes an initial data segment; and [00197] the compression parameter updating module 1002 is configured to:
[00198] acquire, based on the initial data segment, an initial compression parameter; wherein the initial compression parameter is a compression parameter corresponding to a first target data segment; and
[00199] acquire a compression parameter of the first target data segment and compressed data information corresponding to the first target data segment by performing, based on the initial compression parameter, data compression on the first target data segment.
[00200] In some embodiments, the history target data segment includes a sample target data segment in sample data; the sample data is data of the same type as the target data; and the sample data includes at least two sample target data segments; and [00201] the apparatus further includes: a sample updating module, configured to acquire an updated parameter update model by training, based on the at least two sample target data segments, the parameter update model.
[00202] In summary, in the solution shown in the embodiment of the present disclosure, the parameter update model is updated based on the history compression parameter acquired during the compression process, and the compression proportion and the compression standard deviation corresponding to the history compression parameter. The compression parameter corresponding to the previous target data segment is updated using the updated parameter update model to acquire the updated compression parameter, and the target data segment is compressed based on the updated compression parameter to acquire the compressed data information corresponding to the target data segment. Continuously, based on the compressed data information, the compression parameter corresponding to the next target data segment is acquired using the parameter update model. With the above solution, the parameter update model is updated based on the compression proportion and the compression standard deviation of the history compression parameter, the parameter update model adjusts the compression parameter corresponding to the target data segment based on the compression proportion and the compression standard deviation of the previous target data segment, and a value of the compression parameter is accurately adjusted, which can improve the compression efficiency in the case of ensuring the compression accuracy.
[00203] FIG. 11 is a schematic structural diagram of a computer device according to an exemplary embodiment. The computer device may be implemented as the model training device and/or the data compression device in each of the above method embodiments. The computer device 1100 includes a central processing unit (CPU) 1101, a system memory 1104 including a random-access memory (RAM) 1102 and a read-only memory (ROM) 1103, and a system bus 1105 connecting the system memory 1104 and the CPU 1101. The computer device 1100 further includes a basic input/output system 1106 which helps to transmit information between various components in the computer, and a high-capacity storage device 1107 configured to store an operating system 1113, an application program 1114 and other program modules 1115.
[00204] The high-capacity storage device 1107 is connected to the CPU 1101 via a high-capacity storage controller (not shown) that is connected to the system bus 1105. The high-capacity storage device 1107 and an associated computer-readable medium thereof provide non-volatile storage for the computer device 1100. That is, the high-capacity storage device 1107 may include a computer-readable medium (not shown), such as a hard disk, a compact disc read-only memory (CD-ROM) drive, or the like.
[00205] Without loss of generality, the computer-readable medium may include a computer storage medium and a communication medium. The computer storage medium includes volatile and non-volatile, and removable and non-removable mediums implemented in any method or technology for storing information such as a computer-readable instruction, a data structure, a program module or other data. The computer storage medium includes an RAM, an ROM, a flash memory or other solid-state storage technologies, a CD-ROM or other optical storage, a tape cartridge, a magnetic tape, a disk storage or other magnetic storage devices. Of course, it is known by persons skilled in the art that the computer storage medium is not limited to the above. The above system memory 1104 and high-capacity storage device 1107 may be collectively referred to as the memory.
[00206] The computer device 1100 may be connected to the Internet or other network devices via a network interface unit 1111 that is connected to the system bus 1105.
[00207] The memory further includes one or more above programs, and the one or more above programs are stored in the memory. The CPU 1101 implements all or part of the steps of the method shown in FIG. 3, FIG. 4 or FIG. 9 by executing the one or more above programs.
[00208] In the exemplary embodiments, a non-transitory computer-readable storage medium including instructions, for example, a memory including computer programs (instructions), is further provided, and the above programs (instructions) are executable by a processor of a computer device to complete the method executed by a server or user terminal in the methods shown in the embodiments of the present disclosure. For example, the non-transitory computer-readable storage medium may be an ROM, an RAM, a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, etc.
[00209] In the exemplary embodiments, a computer program product or a computer program is further provided. The computer program product or the computer program includes a computer instruction. The computer instruction is stored in a computer-readable storage medium. A processor of a computer device reads the computer instruction from the computer-readable storage medium. The processor executes the computer instruction, such that the computer device executes the method shown in the above embodiments.
[00210] Other implementations of the present disclosure will be apparent to those skilled in the art from consideration of the description and practice of the present disclosure. The present disclosure is intended to cover any variations, uses, or adaptations of the present disclosure, which follow the general principles of the present disclosure and include common general knowledge or commonly used technical measures not disclosed by the present disclosure in the related art. The specification and the embodiments are considered as exemplary only, and a true scope and spirit of the present disclosure are indicated by claims.
[00211] It should be understood that the present disclosure is not limited to the exact construction that has been described above and illustrated in the accompanying drawings, and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is merely limited by the appended claims.

Claims

CLAIMS What is claimed is:
1. A method for compressing data, comprising: acquiring target data, wherein the target data comprises at least two target data segments; acquiring a compression parameter corresponding to an ith target data segment of the at least two target data segments by updating, based on compressed data information corresponding to an (i-1)th target data segment of the at least two target data segments, a compression parameter corresponding to the (i-1)th target data segment using a parameter update model; wherein i is an integer greater than or equal to 2, the compressed data information comprises at least one of a compression proportion and a compression standard deviation, the parameter update model is acquired by reinforcement learning according to a history compression parameter and compressed data information corresponding to the history compression parameter, the compression parameter is indicative of a compression accuracy of data compression on the at least two target data segments, and the history compression parameter is a compression parameter corresponding to a history target data segment; and performing, based on the compression parameter corresponding to the ith target data segment, data compression on the ith target data segment.
2. The method according to claim 1, wherein the history target data segment comprises the target data segment prior to the ith target data segment; and the method further comprises: updating, based on compression parameters corresponding to prior N target data segments of the ith target data segment and compressed data information corresponding to the prior N target data segments, the parameter update model in the case that the ith target data segment meets a specified condition; wherein the prior N target data segments are N target data segments prior to the ith target data segment in the target data, and N is an integer greater than or equal to 1, and less than i.
3. The method according to claim 2, wherein updating, based on the compression parameters corresponding to the prior N target data segments of the ith target data segment, and the compressed data information corresponding to the prior N target data segments, the parameter update model in the case that the ith target data segment meets the specified condition comprises: updating, based on the compression parameters corresponding to the prior N target data segments of the ith target data segment and the compressed data information corresponding to the prior N target data segments, the parameter update model in the case that i is a preset value.
4. The method according to claim 2, wherein the parameter update model comprises a first model branch and a second model branch; wherein the first model branch is configured to update the compression parameter corresponding to the (i-1)th target data segment based on the compressed data information corresponding to the (i-1)th target data segment; and the second model branch is configured to instruct the first model branch to update the compression parameter to increase the compression proportion corresponding to the compression parameter and decrease the compression standard deviation corresponding to the compression parameter.
5. The method according to claim 4, wherein acquiring the compression parameter corresponding to the ith target data segment by updating, based on the compressed data information corresponding to the (i-1)th target data segment, the compression parameter corresponding to the (i-1)th target data segment using the parameter update model comprises: acquiring the compression parameter corresponding to the ith target data segment by updating, based on the compressed data information corresponding to the (i-1)th target data segment, the compression parameter corresponding to the (i-1)th target data segment using the first model branch; and updating, based on the compression parameters corresponding to the prior N target data segments of the ith target data segment and the compressed data information corresponding to the prior N target data segments, the parameter update model comprises: acquiring, based on the compression parameters corresponding to the prior N target data segments of the ith target data segment and the compressed data information corresponding to the prior N target data segments, value information corresponding to the ith target data segment using the second model branch; wherein the value information is configured to instruct the first model branch to update the compression parameter to increase the compression proportion corresponding to the compression parameter and decrease the compression standard deviation corresponding to the compression parameter; and updating, based on the value information, the first model branch and the second model branch.
6. The method according to claim 1, wherein the target data further comprises an initial data segment; and prior to acquiring the compression parameter corresponding to the ith target data segment by updating, based on the compression proportion and the compression error which correspond to the (i-1)th target data segment, the compression parameter corresponding to the (i-1)th target data segment using the parameter update model, the method further comprises: acquiring, based on the initial data segment, an initial compression parameter; wherein the initial compression parameter is a compression parameter corresponding to a first target data segment; and acquiring a compression parameter of the first target data segment and compressed data information corresponding to the first target data segment by performing, based on the initial compression parameter, data compression on the first target data segment.
7. The method according to claim 1, wherein the history target data segment comprises a sample target data segment in sample data; wherein the sample data is data of a same type as the target data; wherein the sample data comprises at least two sample target data segments; and prior to acquiring the target data, the method further comprises: acquiring an updated parameter update model by training, based on the at least two sample target data segments, the parameter update model.
8. An apparatus for compressing data, comprising: a target data acquiring module, configured to acquire target data; wherein the target data comprises at least two target data segments; a compression parameter updating module, configured to acquire a compression parameter corresponding to an ith target data segment of the at least two target data segments by updating, based on compressed data information corresponding to an (i-1)th target data segment of the at least two target data segments, a compression parameter corresponding to the (i-1)th target data segment using a parameter update model; wherein i is an integer greater than or equal to 2, the compressed data information comprises at least one of a compression proportion and a compression standard deviation, the parameter update model is acquired by reinforcement learning based on a history compression parameter and compressed data information corresponding to the history compression parameter, the compression parameter is indicative of a compression accuracy of data compression on the at least two target data segments, and the history compression parameter is a compression parameter corresponding to a history target data segment; and a data compressing module, configured to perform, based on the compression parameter corresponding to the ith target data segment, data compression on the ith target data segment.
9. A computer device, comprising a processor and a memory configured to store at least one instruction, at least one program, a code set, or an instruction set; wherein the processor, when loading and executing the at least one instruction, the at least one program, the code set, or the instruction set, is caused to perform the method for compressing data as defined in any one of claims 1 to 7.
10. A non-transitory computer-readable storage medium storing at least one instruction, at least one program, a code set, or an instruction set; wherein the at least one instruction, the at least one program, the code set, or the instruction set, when loaded and executed by a processor of a computer device, causes the computer device to perform the method for compressing data as defined in any one of claims 1 to 7.
PCT/SG2021/050697 2020-11-18 2021-11-15 Method and apparatus for compressing data, computer device and storage medium WO2022108523A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011295604.5A CN112269769B (en) 2020-11-18 2020-11-18 Data compression method, device, computer equipment and storage medium
CN202011295604.5 2020-11-18

Publications (1)

Publication Number Publication Date
WO2022108523A1 true WO2022108523A1 (en) 2022-05-27

Family

ID=74340240

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/SG2021/050697 WO2022108523A1 (en) 2020-11-18 2021-11-15 Method and apparatus for compressing data, computer device and storage medium

Country Status (2)

Country Link
CN (1) CN112269769B (en)
WO (1) WO2022108523A1 (en)


Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113659993B (en) * 2021-08-17 2022-06-17 深圳市康立生物医疗有限公司 Immune batch data processing method and device, terminal and readable storage medium
CN114120915A (en) * 2021-11-11 2022-03-01 合肥维信诺科技有限公司 Data compression method and device and data decompression method and device
CN114547030B (en) * 2022-01-20 2023-03-24 清华大学 Multi-stage time sequence data compression method and device, electronic equipment and storage medium
CN114547144B (en) * 2022-01-30 2023-03-24 清华大学 Time sequence data range query method, device and equipment
CN114547027B (en) * 2022-02-11 2023-01-31 清华大学 Data compression processing method and device with capacity and value constraint and storage medium
CN116320042B (en) * 2023-05-16 2023-08-04 陕西思极科技有限公司 Internet of things terminal monitoring control system for edge calculation

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN100430943C (en) * 2006-01-09 2008-11-05 中国科学院自动化研究所 Intelligent two-stage compression method for process industrial historical data
US20180026649A1 (en) * 2016-07-20 2018-01-25 Georges Harik Method for data compression
US20200311552A1 (en) * 2019-03-25 2020-10-01 Samsung Electronics Co., Ltd. Device and method for compressing machine learning model

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2001065395A1 (en) * 2000-03-03 2001-09-07 Vill Ab Infinite level meta-learning through compression
CN102611454B (en) * 2012-01-29 2014-12-24 上海锅炉厂有限公司 Dynamic lossless compressing method for real-time historical data
CN103309889A (en) * 2012-03-15 2013-09-18 华北计算机系统工程研究所 Method for realizing of real-time data parallel compression by utilizing GPU (Graphic processing unit) cooperative computing
US10466921B1 (en) * 2017-10-31 2019-11-05 EMC IP Holding Company LLC Accelerating data reduction through reinforcement learning
CN108197181B (en) * 2017-12-25 2023-04-18 广州亦云信息技术股份有限公司 Compression storage method of time sequence data, electronic equipment and storage medium
CN110163367B (en) * 2018-09-29 2023-04-07 腾讯科技(深圳)有限公司 Terminal deployment method and device
CN110532466A (en) * 2019-08-21 2019-12-03 广州华多网络科技有限公司 Processing method, device, storage medium and the equipment of platform training data is broadcast live
CN110851699A (en) * 2019-09-16 2020-02-28 中国平安人寿保险股份有限公司 Deep reinforcement learning-based information flow recommendation method, device, equipment and medium
CN111191791B (en) * 2019-12-02 2023-09-29 腾讯云计算(北京)有限责任公司 Picture classification method, device and equipment based on machine learning model
CN110985346B (en) * 2019-12-10 2022-10-28 江西莱利电气有限公司 After-cooling control method for air compressor
CN111556294B (en) * 2020-05-11 2022-03-08 腾讯科技(深圳)有限公司 Safety monitoring method, device, server, terminal and readable storage medium

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN100430943C (en) * 2006-01-09 2008-11-05 中国科学院自动化研究所 Intelligent two-stage compression method for process industrial historical data
US20180026649A1 (en) * 2016-07-20 2018-01-25 Georges Harik Method for data compression
US20200311552A1 (en) * 2019-03-25 2020-10-01 Samsung Electronics Co., Ltd. Device and method for compressing machine learning model

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
SEKINE MASATOSHI; IKADA SATOSHI: "LACSLE: Lightweight and Adaptive Compressed Sensing Based on Deep Learning for Edge Devices", 2019 IEEE GLOBAL COMMUNICATIONS CONFERENCE (GLOBECOM), IEEE, 9 December 2019 (2019-12-09), pages 1 - 7, XP033722765, DOI: 10.1109/GLOBECOM38437.2019.9014058 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114726379A (en) * 2022-06-13 2022-07-08 西安热工研究院有限公司 Self-adaptive compression method and system based on time sequence database sample storage characteristics
CN115359807A (en) * 2022-10-21 2022-11-18 金叶仪器(山东)有限公司 Noise online monitoring system for urban noise pollution
CN115359807B (en) * 2022-10-21 2023-01-20 金叶仪器(山东)有限公司 Noise online monitoring system for urban noise pollution
CN116131860A (en) * 2022-12-28 2023-05-16 山东华科信息技术有限公司 Data compression system and data compression method for distributed energy grid-connected monitoring
CN116131860B (en) * 2022-12-28 2023-09-05 山东华科信息技术有限公司 Data compression system and data compression method for distributed energy grid-connected monitoring

Also Published As

Publication number Publication date
CN112269769B (en) 2023-12-05
CN112269769A (en) 2021-01-26


Legal Events

Date Code Title Description
DPE2 Request for preliminary examination filed before expiration of 19th month from priority date (pct application filed from 20040101)
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21895242

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21895242

Country of ref document: EP

Kind code of ref document: A1