CN115543528A - Data processing method, related device and storage medium - Google Patents


Info

Publication number
CN115543528A
Authority
CN
China
Prior art keywords
data, compressed, compression algorithm, processing module, target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110731228.8A
Other languages
Chinese (zh)
Inventor
张同宝
向伟
施慧
王超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Tencent Computer Systems Co Ltd
Original Assignee
Shenzhen Tencent Computer Systems Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Application filed by Shenzhen Tencent Computer Systems Co Ltd filed Critical Shenzhen Tencent Computer Systems Co Ltd
Priority to CN202110731228.8A priority Critical patent/CN115543528A/en
Publication of CN115543528A publication Critical patent/CN115543528A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00: Arrangements for program control, e.g. control units
    • G06F 9/06: Arrangements for program control using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/44: Arrangements for executing specific programs
    • G06F 9/455: Emulation; interpretation; software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F 9/45533: Hypervisors; virtual machine monitors
    • G06F 9/45558: Hypervisor-specific management and integration aspects

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The embodiment of the application discloses a data processing method, a related device and a storage medium. The method includes: determining a target compression algorithm corresponding to data to be compressed; if the compression algorithm currently applied by the first processing module is not the target compression algorithm, writing the target compression algorithm into the first processing module through the virtual machine; locking the target compression algorithm as the compression algorithm applied by the first processing module; inputting the data to be compressed into the first processing module through the virtual machine; and compressing the data to be compressed in the first processing module with the target compression algorithm. Because the virtual machine hides the differences between operating platforms and provides a unified communication interface, and because the virtual machine both writes the compression algorithm and transmits the data to be compressed into the first processing module, the compression algorithm of the first processing module does not need to be written for each application, and no transmission code needs to be written for the data to be compressed. This improves the efficiency of compressing data with the first processing module and brings convenience to file storage.

Description

Data processing method, related device and storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to a data processing method, a related apparatus, and a storage medium.
Background
With the rapid development of big data, the volume of data grows explosively, posing huge challenges for data storage management. To improve utilization of storage space, data compression is an indispensable key technology in storage systems.
To reduce the computational burden on the central processing unit (CPU), a field-programmable gate array (FPGA) may be used to compress data. However, for each JAVA application that compresses data with an FPGA, a user must write the FPGA's compression algorithm, write that algorithm into the FPGA hardware, and also write the code that transmits the data to be compressed to the FPGA for compression. The whole process of compressing data with an FPGA is therefore relatively complex and inefficient.
Therefore, an efficient and convenient data compression scheme is needed.
Disclosure of Invention
In view of the foregoing, the present application provides a data processing method, a related apparatus and a storage medium for improving efficiency of data compression.
One aspect of the present application provides a data processing method, including:
acquiring data to be compressed;
determining a target compression algorithm required by data to be compressed;
judging whether the compression algorithm currently applied by the first processing module is the target compression algorithm;
if not, writing the target compression algorithm into the first processing module through the virtual machine;
judging whether the compression algorithm applied by the first processing module has been successfully switched to the target compression algorithm;
if so, locking the target compression algorithm as the compression algorithm applied by the first processing module;
inputting data to be compressed to a first processing module by adopting a virtual machine;
and in the first processing module, compressing the data to be compressed by adopting a target compression algorithm.
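As a rough illustration only, the sequence of steps above can be sketched in Java. The class, enum and method names below (FpgaModule, writeAlgorithm, lock, and so on) are hypothetical stand-ins, not part of the application; the compress method is a placeholder for what the hardware would do.

```java
import java.util.Arrays;

public class CompressionFlow {
    public enum Algorithm { GZIP, LZ4, ZSTD }   // hypothetical algorithm set

    // Hypothetical stand-in for the first processing module (e.g. an FPGA).
    public static class FpgaModule {
        private Algorithm current;   // algorithm currently written into the hardware
        private boolean locked;

        public boolean applies(Algorithm a) { return current == a; }
        public void writeAlgorithm(Algorithm a) { current = a; }  // step: write target algorithm
        public void lock() { locked = true; }                     // step: lock the applied algorithm
        public Algorithm current() { return current; }

        public byte[] compress(byte[] data) {
            // Placeholder: a real module would run the written algorithm in hardware.
            return Arrays.copyOf(data, data.length);
        }
    }

    // Mirrors the method steps: rewrite the module's algorithm only when it does
    // not already match the target, then lock it and compress the input data.
    public static byte[] process(FpgaModule module, Algorithm target, byte[] toCompress) {
        if (!module.applies(target)) {
            module.writeAlgorithm(target);  // done via the virtual machine in the application
        }
        module.lock();
        return module.compress(toCompress); // data handed to the module via the virtual machine
    }
}
```

Note that when the module already applies the target algorithm, the write step is skipped entirely, which is the branch described in step 107 below.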
Another aspect of the present application provides a data processing apparatus, including:
an acquisition unit configured to acquire data to be compressed;
the determining unit is used for determining a target compression algorithm required by the data to be compressed;
the judging unit is used for judging whether the compression algorithm currently applied by the first processing module is the target compression algorithm;
the writing unit is used for writing the target compression algorithm into the first processing module through the virtual machine when the compression algorithm currently applied by the first processing module is not the target compression algorithm;
the judging unit is also used for judging whether the compression algorithm applied by the first processing module is successfully switched to the target compression algorithm;
the locking unit is used for locking the target compression algorithm into the compression algorithm applied by the first processing module;
the input unit is used for inputting data to be compressed to the first processing module by adopting a virtual machine;
and the compression unit is used for compressing the data to be compressed by adopting a target compression algorithm in the first processing module.
In one possible design, in one implementation of another aspect of an embodiment of the present application,
the input unit is also used for inputting data to be compressed to the first processing module by adopting a virtual machine when the compression algorithm currently applied by the first processing module is a target compression algorithm;
and the compression unit is also used for compressing the data to be compressed by adopting a target compression algorithm in the first processing module.
In one possible design, in one implementation of another aspect of the embodiments of the present application, the data processing apparatus further includes a configuration unit,
the configuration unit is used for configuring N compression algorithms to the virtual machine, wherein N is an integer greater than or equal to 1;
the determining unit is specifically used for receiving a compression instruction for the data to be compressed, the compression instruction specifying the target compression algorithm required by the data to be compressed;
and for determining the target compression algorithm from the N compression algorithms of the virtual machine.
In one possible design, in one implementation of another aspect of an embodiment of the present application,
the configuration unit is also used for configuring N compression algorithms to the virtual machine, wherein N is an integer greater than or equal to 1;
the determining unit is further used for determining a default compression algorithm from the N compression algorithms;
and the determining unit is specifically used for determining that the default compression algorithm is the target compression algorithm required by the data to be compressed.
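A minimal sketch of the configuration and determining units above, under assumed names (AlgorithmRegistry, configure, determineTarget are illustrative only): N algorithms are registered with the virtual machine, the first registered one plays the role of the default, and a compression instruction naming a configured algorithm overrides that default.

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class AlgorithmRegistry {
    private final Set<String> algorithms = new LinkedHashSet<>();
    private String defaultAlgorithm;

    // Configuration unit: register one of the N compression algorithms with the
    // virtual machine. Here the first registered algorithm becomes the default.
    public void configure(String name) {
        algorithms.add(name);
        if (defaultAlgorithm == null) defaultAlgorithm = name;
    }

    // Determining unit: a compression instruction naming a configured algorithm
    // selects it as the target; otherwise the default compression algorithm is
    // the target required by the data to be compressed.
    public String determineTarget(String instruction) {
        if (instruction != null && algorithms.contains(instruction)) {
            return instruction;
        }
        return defaultAlgorithm;
    }
}
```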
In one possible design, in one implementation of another aspect of an embodiment of the present application,
the determining unit is also used for determining the data volume of the data to be compressed;
the judging unit is also used for judging whether the data volume exceeds a first preset threshold value;
the triggering unit is used for triggering the determining unit when the data volume exceeds the first preset threshold value;
the input unit is also used for transmitting data to be compressed to the second processing module when the data volume does not exceed a first preset threshold value;
and the compression unit is also used for compressing the data to be compressed by adopting the second processing module.
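The scheduling decision described above can be sketched as follows; the class name, enum and threshold value are assumptions for illustration, not part of the application. Data whose volume exceeds the first preset threshold goes to the first processing module (FPGA), smaller data to the second (CPU).

```java
public class SizeScheduler {
    public enum Target { FPGA, CPU }   // first and second processing modules

    // Route by data volume: only data larger than the threshold is worth the
    // transfer overhead of the first processing module.
    public static Target choose(long dataVolumeBytes, long thresholdBytes) {
        return dataVolumeBytes > thresholdBytes ? Target.FPGA : Target.CPU;
    }
}
```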
In one possible design, in one implementation of another aspect of an embodiment of the present application, the first processing module includes a plurality of threads,
the compression unit is specifically used for writing data to be compressed into a first processing queue of the first processing module;
judging whether idle threads exist in the multiple threads or not;
when idle threads exist in the multiple threads, acquiring data to be compressed from a first processing queue;
in an idle thread, compressing data to be compressed by adopting a target compression algorithm;
and the compression unit is also used for compressing the data to be compressed by adopting the second processing module when no idle thread exists in the multiple threads.
In one possible design, in one implementation of another aspect of an embodiment of the present application,
the determining unit is further used for determining the number of tasks in a waiting queue in the multiple threads when no idle thread exists in the multiple threads, wherein the waiting queue represents a task queue to be processed in the multiple threads;
the judging unit is also used for judging whether the number of the tasks exceeds a second preset threshold value;
and the compression unit is also used for compressing the data to be compressed by adopting the second processing module when the number of the tasks exceeds a second preset threshold value.
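The fallback logic of the two designs above can be sketched in one decision function, with hypothetical names: a task goes to the first processing module if an idle thread exists, or if the waiting queue is still within the second preset threshold; otherwise the second processing module (CPU) compresses it.

```java
public class QueueDispatcher {
    public enum Target { FPGA, CPU }

    public static Target dispatch(int idleThreads, int waitingTasks, int maxWaitingTasks) {
        if (idleThreads > 0) {
            return Target.FPGA;            // an idle thread can take the task immediately
        }
        if (waitingTasks <= maxWaitingTasks) {
            return Target.FPGA;            // queue acceptable: enqueue and wait for the FPGA
        }
        return Target.CPU;                 // queue too long: fall back to CPU compression
    }
}
```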
In one possible design, in one implementation of another aspect of the embodiments of the present application, the data processing apparatus further includes a preemption unit,
the determining unit is further configured to determine a first priority corresponding to the first processing queue when no idle thread exists in the plurality of threads;
the acquiring unit is further used for acquiring a second processing queue of the plurality of threads, wherein the second processing queue is an executing processing queue;
the determining unit is further used for determining a second priority corresponding to the second processing queue;
the preemption unit is used for preempting the target thread occupied by the second processing queue for the first processing queue when the first priority is higher than the second priority;
and the compression unit is also used for compressing the data to be compressed in the first processing queue by adopting a target compression algorithm in the target thread.
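The preemption rule above reduces to a single comparison, sketched here with assumed names: the first processing queue may take over a thread occupied by the executing second processing queue only when its priority is strictly higher.

```java
public class Preemption {
    // First priority higher than second: preempt the target thread for the
    // first processing queue. Equal or lower priority: the first queue waits.
    public static boolean shouldPreempt(int firstPriority, int secondPriority) {
        return firstPriority > secondPriority;
    }
}
```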
In one possible design, in one implementation of another aspect of the embodiments of the present application, the data processing apparatus further includes an output unit,
the output unit is used for outputting the compressed data to obtain target data;
and the determining unit is used for determining, once the target data has been output, that a new idle thread exists in the first processing module.
In one possible design, in one implementation manner of another aspect of the embodiment of the present application, the data processing apparatus further includes a parsing unit,
the acquisition unit is also used for acquiring a file to be compressed;
the parsing unit is used for parsing the file to be compressed into a byte array;
and the determining unit is also used for determining the byte array as the data to be compressed.
Another aspect of the present application provides a computer device, including: a memory, a processor, and a bus system; the memory is used for storing program codes; the processor is configured to perform the method of any one of the above aspects according to instructions in the program code.
Another aspect of the present application provides a computer-readable storage medium having stored therein instructions, which when executed on a computer, cause the computer to perform the method of data processing of any one of the above aspects.
According to another aspect of the present application, there is provided a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and executes the computer instructions, so that the computer device executes the data processing method of any one of the above aspects.
According to the technical scheme, the embodiment of the application has the following advantages:
in the embodiment of the application, a data processing method is provided. After data to be compressed is obtained, a target compression algorithm required by the data to be compressed is determined. If the compression algorithm currently applied by the first processing module is not the target compression algorithm, the virtual machine writes the target compression algorithm into the first processing module, and the target compression algorithm is locked as the compression algorithm applied by the first processing module, so that the algorithm applied by the first processing module matches the target compression algorithm required by the data to be compressed. The virtual machine then inputs the data to be compressed into the first processing module, where it is compressed with the target compression algorithm. In this way, because the virtual machine hides the differences between operating platforms and provides them with a unified communication interface, both the compression algorithm and the data to be compressed are written into the first processing module through the virtual machine. The compression algorithm of the first processing module does not need to be written for each application program, and no transmission code needs to be written to deliver the data to be compressed to the first processing module. This improves the efficiency of data compression by the first processing module and brings convenience to file storage.
Drawings
To describe the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings needed in that description are briefly introduced below. Obviously, the following drawings show only embodiments of the present application, and those skilled in the art can derive other drawings from them without creative effort.
FIG. 1 is a diagram of a network architecture in which a data compression system operates;
fig. 2 is a schematic view of an application scenario of the data processing method provided in the present application;
FIG. 3 is a flow chart of a data processing method in an embodiment of the present application;
FIG. 4 is another flow chart of a data processing method in an embodiment of the present application;
FIG. 5A is another flow chart of a data processing method in the embodiment of the present application;
FIG. 5B is another flow chart of a data processing method according to an embodiment of the present application;
FIG. 6 is a flow chart illustrating data compression according to a first priority of a first processing queue according to an embodiment of the present application;
fig. 7 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present application;
fig. 8 is a schematic structural diagram of a computer device in an embodiment of the present application.
Detailed Description
The embodiment of the application provides a data processing method, a related device and a storage medium, which are used for improving the efficiency of data compression by adopting an FPGA.
The terms "first," "second," "third," "fourth," and the like in the description and in the claims of the present application and in the drawings described above, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the application described herein are, for example, capable of operation in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "corresponding" and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
First, some nouns that may appear in the embodiments of the present application are explained.
Data compression: in computer science and information theory, data compression is the process of representing information, according to a particular encoding scheme, in fewer bits (or other information-bearing units) than the unencoded representation. Without losing useful information, it reduces the data size to save storage space and to improve the efficiency of transmitting, storing and processing the data, or reorganizes the data according to a certain algorithm to reduce its redundancy and storage footprint.
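As a toy illustration of the definition above, a run-length encoder replaces each run of repeated symbols with the symbol and its count, so redundant input occupies fewer units. This example is only illustrative and is unrelated to the algorithms used by the application.

```java
public class RunLength {
    // Encode each maximal run of equal characters as: character + run length.
    // "aaabcc" becomes "a3b1c2".
    public static String encode(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int j = i;
            while (j < s.length() && s.charAt(j) == s.charAt(i)) j++;
            out.append(s.charAt(i)).append(j - i);
            i = j;
        }
        return out.toString();
    }
}
```

Note that run-length encoding only pays off on redundant input; "abc" encodes to the longer "a1b1c1", which is why real schemes are more elaborate.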
Field-Programmable Gate Array (FPGA): a semi-custom circuit in the field of Application-Specific Integrated Circuits (ASIC). It overcomes the inflexibility of fully custom circuits while also overcoming the limited gate count of earlier programmable devices.
Virtual machine: a complete computer system, simulated in software, that has full hardware-system functionality and runs in a completely isolated environment. Work that can be done on a physical computer can also be done in a virtual machine. When a virtual machine is created on a computer, part of the physical machine's hard disk and memory capacity is used as the virtual machine's hard disk and memory. Each virtual machine has its own Complementary Metal Oxide Semiconductor (CMOS) configuration, hard disk and operating system, and can be operated just like a physical machine.
It should be understood that the data processing method provided by the present application may be applied to a system or program with a data compression function in a terminal device, such as video or audio compression software. Specifically, the data compression system may operate in the network architecture shown in fig. 1, which is a network architecture diagram of the operation of the data compression system. Fig. 1 shows several terminal devices, which may be computer devices; in an actual scenario, more or fewer types of terminal devices may participate in data compression, the specific number and type depending on the scenario, which is not limited here. Fig. 1 shows one server, but multiple servers may also participate, especially in multi-model training interaction scenarios, with the specific number again depending on the actual scenario.
In this embodiment, the server may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server that provides basic cloud computing services such as a cloud service, a cloud database, cloud computing, a cloud function, cloud storage, a network service, cloud communication, a middleware service, a domain name service, a security service, a Content Delivery Network (CDN), a big data and artificial intelligence platform, and the like. The terminal may be, but is not limited to, a smart phone, a tablet computer, a laptop computer, a desktop computer, a smart speaker, a smart watch, and the like. The terminal and the server may be directly or indirectly connected through a wired or wireless communication manner, and the terminal and the server may be connected to form a block chain network, which is not limited herein.
Specifically, cloud technology (Cloud technology) refers to a hosting technology that unifies hardware, software, network and other resources in a wide area network or a local area network to realize the calculation, storage, processing and sharing of data.
Cloud technology is the general term for the network, information, integration, management-platform and application technologies applied in the cloud computing business model. It can form a resource pool that is used on demand, flexibly and conveniently, and cloud computing will become an important support for it. Background services of technical network systems, such as video websites, picture websites and other web portals, require large amounts of computing and storage resources. With the development of the internet industry, each article may carry its own identification mark that must be transmitted to a background system for logical processing; data at different levels are processed separately, and all kinds of industry data need strong system background support, which can only be realized through cloud computing.
It should be understood that the data processing method provided in the embodiment of the present application may be applied to the server shown in fig. 1, and may also be applied to the terminal device shown in fig. 1. After receiving the data to be compressed from the terminal device, the server may compress it using the data processing method provided by the embodiment of the present application; or, after the terminal device receives the data to be compressed from the server, the terminal device compresses it using the same method. The server or the terminal device may also compress local data stored on the device itself according to the needs of the user, so as to reduce the occupation of storage resources, which is not limited here.
The data processing method provided by the application can improve the efficiency of data compression with the first processing module. The first processing module may be a processor with programmable characteristics, for example an FPGA, an ARM (Advanced RISC Machines) processor or a digital signal processor (DSP), which is not limited here. The second processing module may be any of various processor types, including but not limited to a central processing unit (CPU), a microprocessor, a microcontroller, an artificial intelligence processor or a heterogeneous processor. For ease of understanding, the embodiments of the present application take an FPGA as the first processing module and a CPU as the second processing module as an example.
For ease of understanding, please refer to fig. 2, which is a schematic view of an application scenario of the data processing method provided in the present application. As shown in fig. 2, with the rapid development of big data, the volume of data grows explosively, bringing great challenges to data storage management. To improve the utilization of storage space and reduce the occupation of the computer device's storage resources, the computer device usually needs to compress the data it acquires. In a conventional data compression flow, a CPU is generally used to compress data. However, compressing data with a CPU consumes considerable CPU computing power, and the compression speed is slow. With the rapid development of the FPGA, heterogeneous computing that combines a CPU with an FPGA has become one of the main ways of increasing computing speed.
In the existing scheme of compressing data with an FPGA, for each different application (for example, a JAVA application), a user is required to write the FPGA's compression algorithm, write that algorithm into the FPGA hardware, and also write the code that transmits the data to be compressed to the FPGA for compression. This flow is relatively complex and inefficient. As shown in fig. 2, in the computer device, the JAVA application may transmit the data to be compressed to the FPGA or the CPU for compression through the JAVA virtual machine. With the compression API module shown in fig. 2, in the data compression process of the present application a user does not need to modify code, but only needs to pass the relevant parameters when starting the virtual machine. The existing compression method implementations for JAVA applications will therefore all gain FPGA compression capability, while the original CPU compression capability is retained by default. The scheduling module can classify received compression tasks by data volume to decide whether to compress with the FPGA or with the CPU, and can make the same decision according to the state of the waiting queue in the FPGA.
Because the virtual machine shields the difference between different operating platforms and provides a uniform communication interface, the virtual machine writes a compression algorithm and transmits data to be compressed into the FPGA, so that the compression algorithm of the FPGA does not need to be written for each application program, related codes do not need to be written, the data to be compressed is transmitted to the FPGA for compression, and the efficiency of data compression by adopting the FPGA is improved.
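The CPU compression capability retained by default in a JAVA application is reachable through the standard java.util.zip API, which the JVM implements identically on every operating platform, so no platform-specific code is needed. The sketch below is a plain DEFLATE round trip for illustration; it does not represent the application's actual FPGA path.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class CpuCompress {
    // Compress with the JVM's built-in DEFLATE implementation.
    public static byte[] deflate(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);   // drain compressed bytes
            out.write(buf, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    // Restore the original bytes from a DEFLATE stream.
    public static byte[] inflate(byte[] compressed) {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && inflater.needsInput()) break;  // guard: truncated stream
                out.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("invalid DEFLATE data", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }
}
```

The same two methods behave identically on Windows and Linux JVMs, which is the uniformity the paragraph above relies on.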
On the other hand, in the existing scheme for compressing data with an FPGA, even after a user has written the code for transmitting data to the FPGA and the compression algorithm, neither is applicable when ported to a different operating system (e.g., a windows or linux operating system), and the user must maintain the transmission code and the compression algorithm again. Because the virtual machine hides the differences between operating platforms and provides a unified communication interface, the transmission code and the compression algorithm do not need to be modified for different operating systems, which brings great convenience to the flow of compressing data with the FPGA.
It should be understood that the data processing method provided in the embodiment of the present application is not limited to be applied to the JAVA programming language, and may also be applied to other types of managed type languages, for example, a program written in C # language, and the Common Language Runtime (CLR) may be used as a virtual machine to write a compression algorithm into the FPGA and transmit data to be compressed, which is not limited herein.
With the above description, a method of data processing in the present application is described below. Referring to fig. 3, fig. 3 is a flowchart of a data processing method according to an embodiment of the present disclosure, and as shown in fig. 3, an embodiment of the data processing method according to the embodiment of the present disclosure includes:
101. acquiring data to be compressed;
in the embodiment of the present application, the acquired data to be compressed should be input to the virtual machine in a data format of a byte array. Specifically, in computer programming, a continuous variable sequence whose data type is byte is called a byte array. An array is one of the most basic data structures, and bytes are the smallest standard scalar type in most programming languages. Byte array is invaluable when reading files stored in unknown or arbitrary binary formats, or when large amounts of data need to be stored efficiently to save memory. There are also instances where arrays of bytes may be used to store string data to help reduce memory usage. Some optimizations may be made using byte arrays to make accessing and changing information in the array faster than using other types.
The data processing method provided by the embodiment of the present application may be applicable to compression of files of multiple data types, where the data types include, but are not limited to, video files, audio files, texts, images, and the like, and is not limited herein. Taking JAVA application as an example, after obtaining the file of the data type, the JAVA application will parse the file into a format of a byte array as data to be compressed. And inputting the data to be compressed into the virtual machine, and executing step 102.
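The parsing step above (file of any type into a byte array for the virtual machine) can be done with the standard java.nio.file API alone; the class and method names in this sketch are assumptions for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class FileParser {
    // Read the whole file into a byte array, the data format expected by the
    // virtual machine for the data to be compressed.
    public static byte[] toByteArray(String path) {
        try {
            return Files.readAllBytes(Paths.get(path));
        } catch (IOException e) {
            throw new UncheckedIOException(e);   // keep the caller's signature clean
        }
    }
}
```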
102. Determining a target compression algorithm required by data to be compressed;
and (3) compression algorithm: the compression algorithm (compression algorithm) refers to an algorithm for data compression, and is also commonly referred to as signal encoding in the electronic and communication fields, and includes two steps of compression and restoration (decoding and encoding).
In order to adapt to the requirements of different operating systems, a plurality of compression algorithms, such as various types of entropy coding or source coding, are preset in the virtual machine. Generally, the virtual machine selects one of the plurality of compression algorithms as a default compression algorithm and compresses the data to be compressed with it; that is, the default compression algorithm of the current virtual machine is directly used as the target compression algorithm required by the data to be compressed in the present application.
In practical applications, a user may also select a compression algorithm (i.e., a target compression algorithm) to be used by the data to be compressed according to different needs of the user (e.g., needs of the user for a compression rate or a compression format, etc.).
103. Judging whether the compression algorithm currently applied by the first processing module is a target compression algorithm or not;
in the process of compressing data by using the first processing module, a corresponding compression algorithm (i.e., a target compression algorithm) needs to be written into the first processing module, and then the first processing module can compress the data by using the compression algorithm.
In practical applications, the compression algorithm applied by the first processing module may be in various states. Taking an FPGA as the first processing module as an example, the FPGA may already have some compression algorithm written into it because it previously executed other tasks. Therefore, it is first judged whether the target compression algorithm required by the data to be compressed is consistent with the algorithm currently applied by the FPGA; if not, the compression algorithm applied by the FPGA needs to be rewritten to the target compression algorithm. Alternatively, the first processing module may have just been initialized, with no compression algorithm written into it; in this case, the compression algorithm currently applied by the first processing module is likewise judged inconsistent with the required target compression algorithm. If consistent, step 107 is executed: the virtual machine is adopted to input the data to be compressed to the first processing module. If not, step 104 is executed: the virtual machine is adopted to write the target compression algorithm into the first processing module.
104. If not, writing a target compression algorithm into the first processing module by adopting the virtual machine;
As mentioned in step 103, in practical applications some compression algorithm may already have been written into the first processing module because it previously executed other tasks. Therefore, after step 103 determines whether the compression algorithm currently applied by the first processing module is the target compression algorithm: if it is exactly the target compression algorithm, then in order to improve processing efficiency and shorten the processing flow, the target compression algorithm does not need to be written again, and the one currently applied by the first processing module is used directly.
If the compression algorithm currently applied by the first processing module does not belong to the target compression algorithm, that is, the compression algorithm currently applied by the first processing module is inconsistent with the required target compression algorithm, at this time, the target compression algorithm needs to be written into the first processing module, and the algorithm state of the first processing module is updated, so that the algorithm applied by the first processing module is the target compression algorithm.
For the virtual machine, a plurality of compression algorithms are often preset in a standard library; the user does not need to implement the compression algorithms manually, and the virtual machine directly writes the target compression algorithm in the standard library into the first processing module.
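The judgment in step 103 and the conditional write in step 104 can be sketched as follows, assuming a simple in-memory model of the first processing module's algorithm state (a real FPGA rewrite would go through a device driver, which is not shown; all names are illustrative):

```java
import java.util.Objects;

// Illustrative model of steps 103-104: track which compression algorithm
// the first processing module currently applies, and rewrite it only when
// it differs from the required target algorithm.
public class ModuleAlgorithmState {
    private String current; // null when the module is just initialized

    // Returns true if a write was needed (algorithm mismatched or absent).
    boolean ensureAlgorithm(String target) {
        if (Objects.equals(current, target)) {
            return false; // already the target algorithm: skip the write
        }
        current = target; // write the target algorithm into the module
        return true;
    }

    String current() { return current; }

    public static void main(String[] args) {
        ModuleAlgorithmState m = new ModuleAlgorithmState();
        System.out.println(m.ensureAlgorithm("LZW")); // true: write needed
        System.out.println(m.ensureAlgorithm("LZW")); // false: already applied
    }
}
```

Note that the just-initialized case (a `null` current algorithm) falls out of the same comparison, matching the text above.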
105. Judging whether the compression algorithm applied by the first processing module is successfully switched to a target compression algorithm;
In practical applications, the writing of the target compression algorithm cannot be guaranteed to succeed one hundred percent of the time. For example, the first processing module may be incompatible with the written target compression algorithm, or another high-priority task request may forcibly overwrite the compression algorithm applied by the first processing module, causing the write of the target compression algorithm to fail.
Therefore, before the data to be compressed is input to the first processing module, it is judged whether the compression algorithm applied by the first processing module has been successfully switched to the target compression algorithm. If yes, that is, the target compression algorithm was successfully written, step 106 is executed: the target compression algorithm is locked as the compression algorithm applied by the first processing module, so that it cannot be tampered with during compression, which improves the stability and reliability of the scheme. If not, that is, the write of the target compression algorithm failed, step 109 is executed: the second processing module is adopted to compress the data to be compressed, so that the data is still compressed even when the write fails.
106. If yes, locking the target compression algorithm as the compression algorithm applied by the first processing module;
if the compression algorithm applied by the first processing module has been successfully switched to the target compression algorithm, it may be determined that the target compression algorithm was successfully written. Further, in order to prevent other tasks of the first processing module from forcibly tampering with the compression algorithm during the execution of data compression, in this embodiment, the state lock may be used to lock the target compression algorithm to the compression algorithm applied by the first processing module, so as to ensure the stability of the compression algorithm. Then, the virtual machine can be used to input the data to be compressed to the first processing module, so as to execute the subsequent data compression process.
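The state lock described here can be sketched as a minimal host-side model using `java.util.concurrent` locking (the class and method names are illustrative assumptions; a real FPGA driver may expose its own locking mechanism):

```java
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Function;

// Sketch of step 106: while a compression task holds the state lock,
// other tasks cannot swap out the algorithm applied by the module.
public class LockedAlgorithmState {
    private final ReentrantLock stateLock = new ReentrantLock();
    private String algorithm;

    // Compression path: holds the lock for the whole compression task.
    <T> T withLockedAlgorithm(String target, Function<String, T> task) {
        stateLock.lock();
        try {
            algorithm = target;           // write succeeded; now locked in
            return task.apply(algorithm); // compress under the locked algorithm
        } finally {
            stateLock.unlock();
        }
    }

    // Other tasks: fail instead of tampering mid-compression.
    boolean trySwitch(String target) {
        if (!stateLock.tryLock()) {
            return false; // a compression task currently holds the lock
        }
        try {
            algorithm = target;
            return true;
        } finally {
            stateLock.unlock();
        }
    }

    public static void main(String[] args) {
        LockedAlgorithmState s = new LockedAlgorithmState();
        System.out.println(s.trySwitch("RLE")); // prints true: lock was free
    }
}
```

A concurrent `trySwitch` call from another thread returns `false` while a compression task is inside `withLockedAlgorithm`, which is the tamper-prevention property the text requires.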
107. Inputting data to be compressed to a first processing module by adopting a virtual machine;
through the above steps, the algorithm currently applied by the first processing module is determined to be the target compression algorithm required by the data to be compressed, the data to be compressed can be compressed by the first processing module, and the data to be compressed needs to be transmitted to the first processing module. Because the virtual machine shields the difference between different operating platforms and provides a uniform communication interface, in the embodiment of the application, the virtual machine is adopted to directly transmit the data to be compressed to the first processing module, so that the process of writing codes to transmit the data to be compressed to the first processing module is omitted.
108. In a first processing module, compressing data to be compressed by adopting a target compression algorithm;
through the steps, the first processing module already acquires the data to be compressed and the target compression algorithm required to be used. At this time, in the first processing module, the target compression algorithm may be adopted to compress the data to be compressed.
109. Compressing the data to be compressed by adopting a second processing module;
in this embodiment of the present application, if the target compression algorithm fails to be written, the compression algorithm applied by the first processing module cannot be rewritten into the target compression algorithm, and at this time, in order to continue to compress the data to be compressed, a task of performing data compression by using the first processing module may be abandoned, and the data to be compressed may be compressed by using the second processing module instead.
The embodiment of the application provides a data processing method. After the data to be compressed is obtained, the target compression algorithm required by the data to be compressed is determined. If the compression algorithm currently applied by the first processing module is not the target compression algorithm, the virtual machine is adopted to write the target compression algorithm into the first processing module, and the target compression algorithm is locked as the compression algorithm applied by the first processing module, so that the compression algorithm applied by the first processing module matches the target compression algorithm required by the data to be compressed. The virtual machine is then adopted to input the data to be compressed to the first processing module, and in the first processing module the data to be compressed is compressed with the target compression algorithm. In this way, because the virtual machine shields the differences between different operating platforms and provides them with a uniform communication interface, the virtual machine both writes the compression algorithm and transmits the data to be compressed into the first processing module. The compression algorithm of the first processing module does not need to be written into each application program, and no dedicated code needs to be written to transmit the data to be compressed to the first processing module for compression. This improves the efficiency of data compression with the first processing module and brings convenience to file storage.
On the other hand, taking the first processing module as an FPGA and the second processing module as a CPU as an example, in a compression test on a 1 GB file, compared with the conventional scheme of performing data compression with a CPU, the data processing method provided by the present application compresses the file 8 times faster and consumes half the CPU utilization. This can greatly improve application throughput. Likewise, the reduced CPU utilization lowers device power consumption, and since the power consumption of the FPGA is much lower than that of the CPU, the overall power consumption of the device is reduced.
On the other hand, in the embodiment of the application, the compression algorithm is written into the FPGA and the data to be compressed are transmitted through the virtual machine, so that the parameters of the original application program (such as JAVA application) do not need to be modified, and the compatibility and the stability of the scheme are ensured.
It should be understood that the data processing method provided by the present application may be applied to a cloud server configured with a first processing module and a second processing module (e.g., an FPGA and a CPU). When such a cloud server needs to compress local data or other received data, the data processing method provided by the present application may be adopted. A cloud server has strong computing power and can execute computation-intensive tasks for the terminal devices linked to it. For example, the server shown in fig. 1 may be a cloud server that receives data to be compressed from a plurality of terminal devices; because the amount of such data tends to be relatively large, the cloud server may use its strong computing power to perform the data processing method provided by the present application on the data to be compressed.
Cloud computing (cloud computing) is a computing model that distributes computing tasks over a pool of resources formed by a large number of computers, enabling various application systems to obtain computing power, storage space, and information services as needed. The network that provides the resources is called the "cloud". Resources in the "cloud" appear to the user as if they are infinitely expandable and can be acquired at any time, used on demand, expanded at any time, and paid for use.
As a basic capability provider of cloud computing, a cloud computing resource pool (referred to as an IaaS (Infrastructure as a Service) platform for short) is established, and multiple types of virtual resources are deployed in the resource pool for external clients to use selectively.
According to logical function division, a PaaS (Platform as a Service) layer can be deployed on the IaaS (Infrastructure as a Service) layer, and a SaaS (Software as a Service) layer deployed on the PaaS layer; SaaS can also be deployed directly on IaaS. PaaS is a platform on which software runs, such as a database or a web container. SaaS is the various kinds of business software, such as a web portal or an SMS bulk sender. Generally speaking, SaaS and PaaS are upper layers relative to IaaS.
Optionally, on the basis of the embodiment corresponding to fig. 3, in another optional embodiment of the data processing method provided in the embodiment of the present application, the method may further include the following steps:
if the compression algorithm currently applied by the first processing module is a target compression algorithm, the virtual machine is adopted to input data to be compressed to the first processing module;
and in the first processing module, compressing the data to be compressed by adopting a target compression algorithm.
In the application, in the process of data compression, it is required to ensure that the compression algorithm applied by the first processing module is consistent with the target compression algorithm required by the data to be compressed. As can be seen from the foregoing step 103, if the compression algorithm currently applied by the first processing module is inconsistent with the required target compression algorithm, the target compression algorithm needs to be written into the first processing module at this time, and the algorithm state of the first processing module is updated, so that the algorithm applied by the first processing module is the target compression algorithm.
For easy understanding, please refer to fig. 4, in which fig. 4 is another flowchart of a data processing method in the embodiment of the present application. As shown in fig. 4, in the present embodiment, after steps 201 to 202 (refer to the description of step 101 to step 102), when the currently applied compression algorithm of the first processing module is consistent with the required target compression algorithm, the target compression algorithm does not need to be written into the first processing module, that is, step 203 does not need to be executed, and step 204 is directly executed. On the other hand, after the data compression process of the data to be compressed is completed through steps 101 to 105 shown in fig. 3, if the first processing module is subsequently required to compress other data, when the required target compression algorithm of the subsequent data to be compressed is not changed, that is, is consistent with the previous target compression algorithm, the algorithm state of the first processing module does not need to be rewritten, that is, step 203 does not need to be executed, and step 204 is directly executed.
For example, before the first processing module processes the compression task of the data to be compressed, the target compression algorithm is just adopted to execute other data compression tasks, and at this time, when the first processing module needs to compress the data to be compressed, the algorithm state of the first processing module does not need to be updated, and the target compression algorithm currently applied by the first processing module is directly used to compress the data to be compressed. That is, when the compression algorithm currently applied by the first processing module is the target compression algorithm, the step 204 is directly executed without executing the step of writing the target compression algorithm into the first processing module by using the virtual machine, so that the execution efficiency of the data compression process is improved.
For another example, in a factory setting of the current algorithm state of the first processing module, the default compression algorithm is exactly the target compression algorithm that needs to be used by the data to be compressed, and the algorithm state of the first processing module is not modified after the factory setting, so that the algorithm state of the first processing module does not need to be updated, and the data to be compressed can be compressed by directly using the target compression algorithm that is currently applied by the first processing module.
Optionally, on the basis of the embodiment corresponding to fig. 3, in another optional embodiment of the data processing method provided in the embodiment of the present application, before acquiring the data to be compressed, the method may further include the following steps:
configuring N compression algorithms to a virtual machine, wherein N is an integer greater than or equal to 1;
the target compression algorithm required for determining the data to be compressed comprises the following steps:
receiving a compression instruction aiming at data to be compressed, wherein the compression instruction is used for specifying a target compression algorithm required by the data to be compressed;
a target compression algorithm is determined from N compression algorithms of the virtual machine.
In this embodiment, in order to adapt to the different requirements of different operating systems and the user's different requirements on the encoding manner of data compression, N compression algorithms may be configured in the virtual machine in advance, where N is an integer greater than or equal to 1. Illustratively, the preset N compression algorithms may include run-length encoding (RLE), Lempel-Ziv-Welch (LZW) compression, Shannon coding, Huffman coding, or arithmetic coding among entropy coding methods, and fast Fourier transform (FFT) coding or sub-band coding (SBC) among source coding methods; other compression algorithms besides these may also be used, which is not limited herein.
After the data to be compressed is obtained, a target compression algorithm to be adopted by the data to be compressed is determined, and a user can select a compression algorithm (i.e., a target compression algorithm) to be used by the data to be compressed from preset N compression algorithms according to different needs of the user (for example, needs of the user on a compression rate, a compression format and the like). Specifically, after a compression instruction for specifying a target compression algorithm is received, the target compression algorithm may be determined from N compression algorithms of the virtual machine.
In this embodiment, by presetting a plurality of compression algorithms in the virtual machine, different requirements of different operating systems and different requirements of a user on a coding mode of data compression can be adapted, and flexibility of a scheme is improved.
Optionally, on the basis of the embodiment corresponding to fig. 3, in another optional embodiment of the data processing method provided in the embodiment of the present application, before acquiring the data to be compressed, the method may further include the following steps:
configuring N compression algorithms to a virtual machine, wherein N is an integer greater than or equal to 1;
determining a default compression algorithm from the N compression algorithms;
the target compression algorithm required for determining the data to be compressed comprises the following steps:
and determining the default compression algorithm as the target compression algorithm required by the data to be compressed.
In this embodiment, in order to adapt to the different requirements of different operating systems and the user's different requirements on the encoding manner of data compression, N compression algorithms may be configured in the virtual machine in advance, and one of them may be selected as the default compression algorithm. Thus, when the user has no explicit requirement for the compression algorithm, the default compression algorithm can be used as the target compression algorithm for the subsequent data compression process.
In this embodiment, by presetting a plurality of compression algorithms in the virtual machine, different requirements of different operating systems and different requirements of a user on a coding mode of data compression can be adapted, and flexibility of a scheme is improved.
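The two embodiments above (target selected by a compression instruction, or the default used when no instruction names one) can be sketched in one small registry; all names here are illustrative assumptions, not a fixed API of the application:

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Sketch: N compression algorithms configured in the virtual machine,
// a default chosen among them, and the target resolved either from a
// compression instruction or from the default.
public class CompressionRegistry {
    private final Set<String> algorithms = new LinkedHashSet<>();
    private String defaultAlgorithm;

    void configure(String name) {
        algorithms.add(name);
    }

    void setDefault(String name) {
        if (!algorithms.contains(name)) {
            throw new IllegalArgumentException("not configured: " + name);
        }
        defaultAlgorithm = name;
    }

    // instructionTarget == null models "no explicit user requirement".
    String determineTarget(String instructionTarget) {
        if (instructionTarget == null) {
            return defaultAlgorithm; // fall back to the default algorithm
        }
        if (!algorithms.contains(instructionTarget)) {
            throw new IllegalArgumentException("unknown: " + instructionTarget);
        }
        return instructionTarget;
    }

    public static void main(String[] args) {
        CompressionRegistry r = new CompressionRegistry();
        r.configure("RLE");
        r.configure("LZW");
        r.setDefault("LZW");
        System.out.println(r.determineTarget(null)); // prints LZW
    }
}
```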
Optionally, on the basis of the embodiment corresponding to fig. 3, in another optional embodiment of the data processing method provided in the embodiment of the present application, after acquiring the data to be compressed, the method may further include the following steps:
determining the data volume of data to be compressed;
judging whether the data volume exceeds a first preset threshold value or not;
and if the data volume exceeds a first preset threshold value, triggering a step of determining a target compression algorithm required by the data to be compressed.
As can be seen from the embodiment shown in fig. 3, in the flow of performing data compression with the first processing module, corresponding preparation work needs to be performed before the compression task itself (such as copying and transmitting the data to be compressed, and rewriting the compression algorithm of the first processing module). Therefore, when the data amount of the data to be compressed is small, using the first processing module for data compression cannot avoid this preparation work, which brings unnecessary overhead.
In this embodiment, the first preset threshold may be configured according to actual needs, with respect to the data size of the data to be compressed. After the data to be compressed are obtained, the data volume of the data to be compressed is determined. And judging whether the data volume exceeds a first preset threshold, and if so, triggering a step of determining a target compression algorithm required by the data to be compressed, thereby executing a subsequent data compression process. And if the data volume does not exceed the first preset threshold, the first processing module is not adopted to compress the data to be compressed.
For example, taking the first processing module as an FPGA and the second processing module as a CPU, it is assumed that the data size of the data to be compressed defined by the first preset threshold is 2 Gigabytes (GB). If the data volume of the data to be compressed is determined to be 1.5GB after the data to be compressed is acquired, and the data volume of the data to be compressed does not exceed a first preset threshold value because the 1.5GB is less than 2GB (the first preset threshold value), the data to be compressed can be compressed without adopting an FPGA (field programmable gate array); if the data volume of the data to be compressed is determined to be 2.5GB after the data to be compressed is acquired, and the data volume of the data to be compressed exceeds a first preset threshold value because the 2.5GB is greater than 2GB (the first preset threshold value), the data to be compressed may be compressed by using the FPGA.
In this embodiment, by limiting the data size of the data to be compressed in the above manner, the situation that the data to be compressed with too small data size still adopts the first processing module to perform data compression is avoided, and thus the system overhead is saved.
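The gate described above can be sketched as follows, using the 2 GB first preset threshold from the FPGA/CPU example (the threshold is configurable in practice; the class name is illustrative):

```java
// Sketch of the data-volume gate: route to the first processing module
// (e.g. FPGA) only when the data amount exceeds the first preset threshold.
public class SizeGate {
    static final long FIRST_THRESHOLD = 2L * 1024 * 1024 * 1024; // 2 GB

    // true: worth paying the FPGA preparation overhead;
    // false: compress on the second processing module (e.g. CPU) instead.
    static boolean useFirstModule(long dataSizeBytes) {
        return dataSizeBytes > FIRST_THRESHOLD;
    }

    public static void main(String[] args) {
        System.out.println(useFirstModule(1_610_612_736L)); // 1.5 GB: false
        System.out.println(useFirstModule(2_684_354_560L)); // 2.5 GB: true
    }
}
```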
Optionally, on the basis of the embodiment corresponding to fig. 3, in another optional embodiment of the data processing method provided in the embodiment of the present application, the method may further include the following steps:
if the data volume does not exceed the first preset threshold, transmitting data to be compressed to a second processing module;
and compressing the data to be compressed by adopting a second processing module.
When the first processing module is used for data compression, operations such as judgment and rewriting of a compression algorithm of the first processing module are inevitably needed, and if the data amount of the data to be compressed is too small, the benefit obtained by compression by using the first processing module is not high. Therefore, in this embodiment, if the data amount of the data to be compressed does not exceed the first preset threshold, the data to be compressed may be transmitted to the second processing module, and the data to be compressed is compressed by the second processing module, so that the situation that the data to be compressed with the excessively small data amount still adopts the first processing module to perform data compression is avoided, and thus the system overhead is saved.
Optionally, on the basis of the embodiment corresponding to fig. 3, in another optional embodiment of the data processing method provided in the embodiment of the present application, the first processing module includes a plurality of threads, and in the first processing module, the target compression algorithm is adopted to compress the data to be compressed, which specifically includes the following steps:
writing data to be compressed into a first processing queue of a first processing module;
judging whether idle threads exist in the multiple threads or not;
if yes, acquiring data to be compressed from the first processing queue;
and in the idle thread, compressing the data to be compressed by adopting a target compression algorithm.
In this embodiment, using the characteristic that the first processing module includes a plurality of threads, after the data to be compressed is input to the first processing module by the virtual machine, it is written into the first processing queue of the first processing module. It is then judged whether an idle thread exists among the plurality of threads; if so, the data to be compressed is taken out of the first processing queue, and in the idle thread the data to be compressed is compressed with the target compression algorithm.
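A minimal sketch of this queue-and-idle-thread logic, with the module's thread bookkeeping simplified to a counter (an assumption for illustration; real scheduling would track actual threads):

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Sketch of this embodiment: data is written into a first processing
// queue, and a task is taken out only when an idle thread exists.
public class FirstProcessingQueue {
    private final Queue<byte[]> queue = new ArrayDeque<>();
    private int idleThreads;

    FirstProcessingQueue(int threads) {
        this.idleThreads = threads;
    }

    void submit(byte[] data) {
        queue.add(data); // write into the first processing queue
    }

    // Returns the next task if an idle thread exists, otherwise null
    // (the caller then falls back to the second processing module).
    byte[] takeIfIdle() {
        if (idleThreads == 0 || queue.isEmpty()) {
            return null;
        }
        idleThreads--; // the idle thread now runs the compression task
        return queue.poll();
    }

    public static void main(String[] args) {
        FirstProcessingQueue q = new FirstProcessingQueue(1);
        q.submit(new byte[]{1});
        q.submit(new byte[]{2});
        System.out.println(q.takeIfIdle() != null); // true: thread was idle
        System.out.println(q.takeIfIdle() == null); // true: no idle thread
    }
}
```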
Optionally, on the basis of the embodiment corresponding to fig. 3, in another optional embodiment of the data processing method provided in the embodiment of the present application, the method may further include the following steps:
and if no idle thread exists in the multiple threads, compressing the data to be compressed by adopting a second processing module.
In this embodiment, if all the threads of the first processing module have task flows being executed, that is, no idle thread exists in the threads of the first processing module, at this time, the first processing module may be abandoned for data compression, and the second processing module is adopted for compressing the data to be compressed, so that the compression task of the data to be compressed is processed in time, the situation that the task waiting time is too long is avoided, and the efficiency of the scheme is improved.
Optionally, on the basis of the embodiment corresponding to fig. 3, in another optional embodiment of the data processing method provided in the embodiment of the present application, after compressing the data to be compressed by using the target compression algorithm, the method may further include the following steps:
if no idle thread exists in the multiple threads, determining the number of tasks in a waiting queue in the multiple threads, wherein the waiting queue represents a task queue to be processed in the multiple threads;
judging whether the number of tasks exceeds a second preset threshold value or not;
and if so, compressing the data to be compressed by adopting a second processing module.
In this embodiment, if all of the multiple threads of the first processing module have task flows being executed, that is, no idle thread exists among them, then besides handing the data compression task over to the second processing module as described above, the task may also queue in the first processing queue. However, to prevent the waiting time of the data compression task from becoming too long when many tasks are queued, in this embodiment a second preset threshold may be set for the number of tasks in the waiting queue of the multiple threads. The waiting queue refers to the task queue to be processed by the multiple threads: the larger the number of tasks in the waiting queue, the more tasks the current first processing module has to process, the more tasks are stacked up, and the later the queue position of the compression task of the data to be compressed, the longer its waiting time.
Therefore, when there is no idle thread among the plurality of threads of the first processing module, the number of tasks currently in the waiting queue may be determined, and it is judged whether this number exceeds the second preset threshold. If the number of tasks exceeds the second preset threshold, it indicates that the current first processing module has many pending tasks stacked up, the queuing time may be long, and the efficiency of the data compression process would suffer; in this case, the second processing module is selected to process the data to be compressed. If the number of tasks is below the second preset threshold, it indicates that the first processing module does not have many pending tasks; the first processing module can continue to be used for the data compression process, and the data to be compressed continues to queue in the first processing queue.
Illustratively, taking the first processing module as an FPGA and the second processing module as a CPU as an example, it is assumed that the number of tasks in the waiting queue defined by the second preset threshold is 3. After the data to be compressed is written into the first processing queue of the FPGA, if the number of tasks waiting for the queue in the FPGA is 4 (greater than the second preset threshold), the CPU may be selected to process the data to be compressed. If the number of the tasks in the waiting queue in the FPGA is 2 (smaller than the second preset threshold), the FPGA can be selected to continue to use for the data compression process, and the first processing queue of the data to be compressed continues to be queued for waiting.
By the above mode, when no idle thread exists in the multiple threads of the first processing module, whether the first processing module needs to be continuously adopted for data compression can be determined according to the number of the tasks waiting for the queue in the multiple threads, the situation that the queuing waiting time of the task flow of the data compression is too long is avoided, and the execution efficiency of the data compression is improved.
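The queue-depth decision above can be sketched with the second preset threshold of 3 waiting tasks from the FPGA/CPU example (the threshold value and class name are illustrative):

```java
// Sketch of the queue-depth decision: with no idle thread available,
// compare the waiting-task count against the second preset threshold.
public class QueueDepthGate {
    static final int SECOND_THRESHOLD = 3;

    // true: hand the task to the second processing module (e.g. CPU);
    // false: keep queuing on the first processing module (e.g. FPGA).
    static boolean routeToSecondModule(int waitingTasks) {
        return waitingTasks > SECOND_THRESHOLD;
    }

    public static void main(String[] args) {
        System.out.println(routeToSecondModule(4)); // 4 > 3: true, use CPU
        System.out.println(routeToSecondModule(2)); // 2 < 3: false, queue
    }
}
```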
With reference to the foregoing embodiments, for ease of understanding, please refer to fig. 5A, which is another flowchart of the data processing method in the embodiment of the present application. As shown in fig. 5A, taking the first processing module as an FPGA and the second processing module as a CPU as an example, the flow of the data processing method may specifically include the following steps:
301. acquiring data to be compressed, wherein if the data to be compressed is smaller than a first preset threshold, executing step 302; if the data to be compressed reaches a first preset threshold value, judging whether a compression algorithm currently applied by the FPGA is a target compression algorithm required by the data to be compressed, if so, executing a step 304, and if not, executing a step 303;
302. adopting a CPU to compress;
303. writing a target compression algorithm into the FPGA by using a virtual machine, if the target compression algorithm is successfully written, executing a step 304, and if the target compression algorithm is failed to be written, executing a step 302;
304. inputting data to be compressed to the FPGA by adopting a virtual machine, if the queue in the FPGA is full, executing step 302, and if the queue in the FPGA is not full, executing step 305;
305. compressing data to be compressed by adopting an FPGA;
306. the compression is complete.
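The branching of steps 301-306 can be expressed as a small dispatch function. This is a hedged sketch under assumed inputs; the function name, parameters, and return values are illustrative, not a real API.

```python
# Hedged sketch of the fig. 5A dispatch (steps 301-306). All names, parameters,
# and return values are illustrative assumptions.
def dispatch(data: bytes, fpga_algo: str, target_algo: str,
             first_threshold: int, fpga_queue_full: bool,
             write_ok: bool = True) -> str:
    if len(data) < first_threshold:   # step 301: data too small
        return "CPU"                  # step 302: compress on the CPU
    if fpga_algo != target_algo and not write_ok:
        return "CPU"                  # step 303 failed: fall back to the CPU
    if fpga_queue_full:               # step 304: FPGA queue is full
        return "CPU"
    return "FPGA"                     # step 305: compress on the FPGA

print(dispatch(b"x" * 10, "zlib", "zlib", 1024, False))    # too small -> CPU
print(dispatch(b"x" * 2048, "zlib", "zlib", 1024, False))  # -> FPGA
```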
Optionally, on the basis of the embodiment corresponding to fig. 3, in another optional embodiment of the data processing method provided in the embodiment of the present application, the method may further include the following steps:
if no idle thread exists in the multiple threads, determining a first priority corresponding to the first processing queue;
acquiring a second processing queue of the plurality of threads, wherein the second processing queue is an executing processing queue;
determining a second priority corresponding to the second processing queue;
if the first priority is higher than the second priority, a target thread occupied by the second processing queue is seized for the first processing queue;
and in the target thread, compressing the data to be compressed in the first processing queue by adopting a target compression algorithm.
In this embodiment, when every one of the multiple threads of the first processing module has a task flow being executed, that is, when no idle thread exists among the multiple threads, if the first processing module still needs to be used to compress the data to be compressed, the priority of the first processing queue in which the data to be compressed is located may be compared with the priorities of the tasks being processed in the multiple threads, so as to determine whether to compress the data to be compressed with the first processing module preferentially.
Referring to fig. 5B, fig. 5B is another flowchart of a data processing method according to an embodiment of the present application. As shown in fig. 5B, taking the first processing module as an FPGA and the second processing module as a CPU as an example, the flow of the data processing method may specifically include the following steps:
401. preparing a byte array to be compressed;
step 401 is similar to step 101 shown in fig. 3, and is not described herein again specifically;
402. judging whether the data to be compressed is too small;
configuring a first preset threshold, and if the data amount of the data to be compressed exceeds the first preset threshold, it indicates that the size of the data to be compressed is sufficient, executing step 403; if the data amount of the data to be compressed does not exceed the first preset threshold, it indicates that the data to be compressed is too small, and step 404 is executed.
403. Judging the algorithm state of the FPGA;
and judging whether the currently applied compression algorithm of the FPGA is consistent with a target algorithm required by the data to be compressed or not according to the algorithm state of the FPGA, if so, executing step 408, and if not, executing step 405.
404. Compressing by adopting a CPU;
if compression is complete, step 415 is performed.
405. Writing a needed FPGA algorithm into the FPGA;
step 405 is similar to step 103 shown in fig. 3, and is not described herein again;
406. judging whether the writing is successful;
in practical applications, writing the compression algorithm cannot be guaranteed to succeed one hundred percent of the time; for example, the FPGA may fail to accept the compression algorithm when the written algorithm is incompatible. Therefore, after the compression algorithm is written, it is determined whether the writing was successful; if so, step 407 is executed; if not, step 404 is executed.
407. Updating the state of the FPGA;
and if the compression algorithm is successfully written, updating the written compression algorithm to be the currently applied compression algorithm.
408. Submitting a compression task to a work queue;
409. judging whether the queue is full;
if the queue is full, the CPU can be selected for compression, i.e. step 404 is executed; if the queue is not full, writing the compression task into a waiting queue of the FPGA, namely executing step 410;
410. writing the compression task into a waiting queue of the FPGA;
411. taking out the byte array to be compressed from the queue;
and when, in the task execution order, the turn of the data to be compressed in the waiting queue arrives, the byte array to be compressed is taken out of the queue and the compression process is executed.
412. Compressing by using a compression algorithm of the FPGA;
and compressing the data to be compressed by adopting a corresponding compression algorithm. Step 412 is similar to step 105 shown in fig. 3, and details thereof are not repeated herein;
413. returning the compressed byte array;
and if the compression is finished, returning the compressed byte array.
414. Informing the thread submitting the compression task of completing compression;
after the compression is completed, a notification that the compression task has finished executing is fed back, so that other compression tasks can continue to be executed.
415. The compression is complete.
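Steps 410-414 amount to a producer/consumer pattern around a bounded waiting queue. The following sketch uses Python's standard `queue` and `threading` modules, with `zlib` as an assumed software stand-in for the FPGA's compression engine; none of this is the patent's actual implementation.

```python
# Sketch of steps 410-414: enqueue the compression task, have a worker take the
# byte array out, compress it, and notify the submitting thread.
import queue
import threading
import zlib

work_queue: "queue.Queue[bytes]" = queue.Queue(maxsize=4)  # FPGA waiting queue
done = threading.Event()  # step 414: "compression complete" notification
results = []

def worker() -> None:
    data = work_queue.get()               # step 411: take out the byte array
    results.append(zlib.compress(data))   # step 412: compress
    done.set()                            # step 414: notify the submitter

payload = b"to-be-compressed" * 64
work_queue.put(payload)                   # step 410: write the task into the queue
t = threading.Thread(target=worker)
t.start()
t.join()
print(done.is_set())                      # prints True
print(len(results[0]) < len(payload))     # compressed output is smaller: True
```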
For ease of understanding, please refer to fig. 6, fig. 6 is a flowchart illustrating data compression according to the first priority of the first processing queue according to an embodiment of the present application. As shown in fig. 6, this embodiment may specifically include the following steps:
501. determining a first priority corresponding to the first processing queue;
at this time, since no idle thread exists among the multiple threads of the first processing module, whether the first processing queue requires priority handling may be determined according to its first priority; therefore, the first priority corresponding to the first processing queue needs to be determined.
502. Determining a second priority corresponding to the second processing queue;
specifically, in this embodiment, the second processing queue refers to a processing queue being executed by the first processing module. It should be noted that, in the embodiment of the present application, the execution order of step 501 and step 502 is not limited, for example, step 501 may be executed first, and then step 502 may be executed; step 502 may be executed first, and then step 501 may be executed; step 501 and step 502 may also be performed simultaneously, which is not limited herein.
503. Comparing the first priority with the second priority;
after the first priority and the second priority are determined, the two priorities are compared. If the first priority is higher than the second priority, step 504 is executed; if the first priority is lower than or equal to the second priority, the first processing queue continues to wait in the queue.
504. Preempting a target thread occupied by the second processing queue;
if the first priority is higher than the second priority, it indicates that the first processing queue corresponding to the data to be compressed should take precedence over the second processing queue, and the target thread occupied by the second processing queue needs to be preempted for the first processing queue; that is, the target thread stops executing the lower-priority second processing queue and starts executing the higher-priority first processing queue.
505. And in the target thread, compressing the data to be compressed in the first processing queue.
In this embodiment, under the condition that no idle thread exists in the multiple threads of the first processing module, whether to preferentially compress the data to be compressed is determined by combining the priority of the first processing queue corresponding to the data to be compressed, so that the queue waiting time of the data to be compressed is reduced, and the execution efficiency of the scheme is improved.
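The priority comparison of steps 501-505 reduces to a simple scheduling rule. The sketch below is illustrative only; a real FPGA scheduler would have to suspend the running task, which ordinary software threads cannot do safely, so only the decision itself is modeled.

```python
# Illustrative sketch of the priority rule in steps 501-505: a higher-priority
# waiting queue preempts the thread of the running queue; otherwise it waits.
def schedule(first_priority: int, second_priority: int) -> str:
    """Higher number means more urgent; ties do not preempt."""
    if first_priority > second_priority:
        return "preempt"  # steps 504-505: take over the target thread
    return "wait"         # keep queuing behind the running task

print(schedule(5, 3))  # prints "preempt"
print(schedule(2, 2))  # prints "wait"
```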
Optionally, on the basis of the embodiment corresponding to fig. 3, in another optional embodiment of the data processing method provided in the embodiment of the present application, after compressing data to be compressed by using a target compression algorithm, the method may further include the following steps:
outputting the compressed data to be compressed to obtain target data;
and determining that a new idle thread exists in the first processing module according to the target data.
In this embodiment, after the target compression algorithm is used to compress the data to be compressed, the compressed data, that is, the target data, is output. Obtaining the target data indicates that the compression process for the data to be compressed has been completed, so in the first processing module the thread previously used for that compression returns to an idle state; in other words, it becomes a new idle thread. The new idle thread may be used to execute other task queues, including subsequent data compression tasks. In this manner, after the compression process is completed, the existence of a new idle thread can be determined and the thread reused for other task queues, which improves the flexibility of the scheme.
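The idle-thread bookkeeping described above might look like the following sketch; the `ThreadState` class and its methods are invented for illustration and are not part of the patent.

```python
# Hedged sketch of the idle-thread bookkeeping: finishing a compression task
# returns its thread to the idle pool, where it can pick up new tasks.
class ThreadState:
    def __init__(self, n: int) -> None:
        self.busy = [False] * n

    def start(self, i: int) -> None:
        self.busy[i] = True

    def finish(self, i: int) -> list:
        self.busy[i] = False  # target data has been output: thread idle again
        return self.idle()

    def idle(self) -> list:
        return [i for i, b in enumerate(self.busy) if not b]

ts = ThreadState(2)
ts.start(0)
ts.start(1)
print(ts.idle())     # prints [] while both threads are compressing
print(ts.finish(0))  # prints [0]: a new idle thread exists
```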
It should be understood that the data processing method provided by the embodiment of the present application may be applied to the technical field of cloud storage. For example, when the server or the terminal shown in fig. 2 executes the data processing method provided by the present application on the data to be compressed and outputs the resulting target data, the target data can be stored on a cloud server, which saves the storage resources of the local device and reduces the cost of transmitting the target data from the local device to the cloud server.
A distributed cloud storage system (hereinafter referred to as a storage system) is a storage system that, through functions such as cluster application, grid technology, and a distributed storage file system, aggregates a large number of storage devices of various types in a network (storage devices are also referred to as storage nodes) to work cooperatively via application software or application interfaces, and that provides data storage and service access functions externally.
At present, the storage method of a storage system is as follows: logical volumes are created, and when a logical volume is created, it is allocated physical storage space, which may be composed of the disks of one or several storage devices. A client stores data on a certain logical volume, that is, the data is stored on a file system. The file system divides the data into multiple parts, each part being an object; an object contains not only the data but also additional information such as a data identifier (ID). The file system writes each object into the physical storage space of the logical volume and records the storage location information of each object, so that when the client requests access to the data, the file system can allow the client to access the data according to the storage location information of each object.
The process by which the storage system allocates physical storage space for the logical volume is specifically as follows: the physical storage space is divided in advance into stripes according to estimates of the capacity of the objects to be stored in the logical volume (the estimates often leave a large margin relative to the capacity of the objects actually to be stored) and the Redundant Array of Independent Disks (RAID) configuration; one logical volume can be understood as one stripe, and physical storage space is thereby allocated to the logical volume.
Optionally, on the basis of the embodiment corresponding to fig. 3, in another optional embodiment of the data processing method provided in the embodiment of the present application, before acquiring the data to be compressed, the method may further include the following steps:
acquiring a file to be compressed;
analyzing a file to be compressed into a byte array;
and determining the byte array as data to be compressed.
The data processing method provided by the embodiment of the present application may be applicable to compression of files of multiple data types, where the data types include, but are not limited to, video files, audio files, texts, images, and the like, and is not limited herein. After the file to be compressed of any data type is acquired, the file to be compressed can be analyzed into a byte array format and used as data to be compressed. By the method, the files to be compressed of various data types can be compressed, and the realizability of the scheme is improved.
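The preparation step above, parsing an arbitrary file into a byte array that serves as the data to be compressed, can be sketched with the standard library; the file name and its contents are placeholders.

```python
# Sketch of the preparation step: read any file as raw bytes and treat the
# resulting byte array as the data to be compressed.
import pathlib
import tempfile

with tempfile.TemporaryDirectory() as d:
    path = pathlib.Path(d) / "sample.txt"          # placeholder file
    path.write_text("file to be compressed")
    to_be_compressed = bytearray(path.read_bytes())  # byte-array form
    print(isinstance(to_be_compressed, bytearray))   # prints True
    print(len(to_be_compressed) > 0)                 # prints True
```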
In order to better implement the above-mentioned aspects of the embodiments of the present application, the following also provides related apparatuses for implementing the above-mentioned aspects. Referring to fig. 7, fig. 7 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present disclosure, where the data processing apparatus includes:
an obtaining unit 601 configured to obtain data to be compressed;
a determining unit 602, configured to determine a target compression algorithm required by data to be compressed;
a determining unit 603, configured to determine whether a compression algorithm currently applied by the first processing module is a target compression algorithm;
a writing unit 604, configured to write the target compression algorithm into the first processing module by using a virtual machine if the compression algorithm currently applied by the first processing module is not the target compression algorithm;
the determining unit 603 is further configured to determine whether the compression algorithm applied by the first processing module is successfully switched to the target compression algorithm;
a locking unit 605 configured to lock the target compression algorithm to the compression algorithm applied by the first processing module;
an input unit 606, configured to input data to be compressed to the first processing module by using a virtual machine;
the compressing unit 607 is configured to compress, in the first processing module, the data to be compressed by using the target compression algorithm.
Optionally, on the basis of the embodiment corresponding to fig. 7, in an embodiment of the data processing apparatus provided in the embodiment of the present application, the data processing apparatus further includes a configuration unit 608,
a configuration unit 608, configured to configure N compression algorithms to a virtual machine, where N is an integer greater than or equal to 1;
a determining unit 602, specifically configured to receive a compression instruction for the data to be compressed, where the compression instruction is used to specify the target compression algorithm required by the data to be compressed;
a target compression algorithm is determined from N compression algorithms of the virtual machine.
Optionally, on the basis of the embodiment corresponding to fig. 7, in an embodiment of the data processing apparatus provided in this application,
a configuration unit 608, further configured to configure N compression algorithms to the virtual machine, where N is an integer greater than or equal to 1;
a determining unit 602, further configured to determine a default compression algorithm from the N compression algorithms;
the determining unit 602 is specifically configured to determine that the default compression algorithm is a target compression algorithm required by the data to be compressed.
Optionally, on the basis of the embodiment corresponding to fig. 7, in an embodiment of the data processing apparatus provided in the embodiment of the present application, the data processing apparatus further includes a triggering unit 609,
a determining unit 602, configured to determine a data amount of data to be compressed;
the determining unit 603 is further configured to determine whether the data amount exceeds a first preset threshold;
the triggering unit 609 is further configured to trigger the determining unit when the data amount exceeds the first preset threshold;
the input unit 606 is further configured to transmit data to be compressed to the second processing module when the data amount does not exceed the first preset threshold;
the compressing unit 607 is further configured to compress the data to be compressed by using the second processing module.
Optionally, on the basis of the embodiment corresponding to fig. 7, in an embodiment of the data processing apparatus provided in the embodiment of the present application, the first processing module includes a plurality of threads,
the compressing unit 607 is specifically configured to write data to be compressed into a first processing queue of the first processing module;
judging whether idle threads exist in the multiple threads or not;
when idle threads exist in the multiple threads, acquiring data to be compressed from a first processing queue;
in an idle thread, compressing data to be compressed by adopting a target compression algorithm;
the compressing unit 607 is further configured to compress the data to be compressed by using the second processing module when there is no idle thread in the multiple threads.
Optionally, on the basis of the embodiment corresponding to fig. 7, in an embodiment of the data processing apparatus provided in the embodiment of the present application,
a determining unit 602, further configured to determine, when there is no idle thread in the multiple threads, a number of tasks in a waiting queue in the multiple threads, where the waiting queue represents a task queue to be processed in the multiple threads;
the determining unit 603 is further configured to determine whether the number of tasks exceeds a second preset threshold;
the compressing unit 607 is further configured to compress the data to be compressed by using the second processing module when the number of tasks exceeds the second preset threshold.
In one possible design, in one implementation of another aspect of the embodiment of the present application, the data processing apparatus further includes a preemption unit 610,
a determining unit 602, further configured to determine, when there is no idle thread in the multiple threads, a first priority corresponding to the first processing queue;
the obtaining unit 601 is further configured to obtain a second processing queue of the multiple threads, where the second processing queue is an executing processing queue;
a determining unit 602, further configured to determine a second priority corresponding to the second processing queue;
a preemption unit 610, configured to preempt, when the first priority is higher than the second priority, a target thread occupied by the second processing queue for the first processing queue;
the compressing unit 607 is further configured to compress, in the target thread, the data to be compressed in the first processing queue by using a target compression algorithm.
In one possible design, in one implementation of another aspect of the embodiment of the present application, the data processing apparatus further includes an output unit 611,
an output unit 611, configured to output the compressed data to be compressed to obtain target data;
a determining unit 602, configured to determine that a new idle thread exists in the first processing module according to the target data.
In this embodiment, the data processing apparatus may perform the operations of any one of the embodiments shown in fig. 3 to fig. 6, which will not be described herein again.
Embodiments of the present application further provide a computer device configured to perform the operations of any one of the embodiments shown in fig. 3 to 6. Referring to fig. 8, fig. 8 is a schematic structural diagram of a computer device 700 according to an embodiment of the present application. As shown, the computer device 700 may vary widely in configuration or performance, and may include one or more central processing units (CPUs) 722 (e.g., one or more processors), memory 732, and one or more storage media 730 (e.g., one or more mass storage devices) storing applications 742 or data 744. The memory 732 and the storage medium 730 may be transitory or persistent storage. The program stored on the storage medium 730 may include one or more modules (not shown), each of which may include a series of instruction operations on the computer device. Further, the central processing unit 722 may be configured to communicate with the storage medium 730 to execute the series of instruction operations in the storage medium 730 on the computer device 700.
The computer device 700 may also include one or more power supplies 726, one or more wired or wireless network interfaces 750, one or more input/output interfaces 758, and/or one or more operating systems 741, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, and so on.
The steps performed in the above-described embodiments may be based on the structure of the computer device shown in fig. 8.
Embodiments of the present application also provide a computer-readable storage medium, in which a computer program is stored, and when the computer program runs on a computer, the computer is caused to execute the method described in the foregoing embodiments.
Also provided in embodiments herein is a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer readable storage medium, and the processor executes the computer instructions to cause the computer device to execute the method of data processing of any one of the embodiments shown in fig. 3 to 6.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one type of logical functional division, and other divisions may be realized in practice, for example, multiple units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application, in essence or in the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a management apparatus for interactive video, or a network device) to execute all or part of the steps of the methods according to the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
The above embodiments are only used to illustrate the technical solutions of the present application, and not to limit the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions in the embodiments of the present application.

Claims (12)

1. A method of data processing, comprising:
acquiring data to be compressed;
determining a target compression algorithm required by the data to be compressed;
judging whether the compression algorithm currently applied by the first processing module is the target compression algorithm or not;
if not, writing the target compression algorithm into the first processing module by adopting a virtual machine;
judging whether the compression algorithm applied by the first processing module is successfully switched to a target compression algorithm;
if yes, locking the target compression algorithm as the compression algorithm applied by the first processing module;
inputting the data to be compressed to the first processing module by adopting the virtual machine;
and in the first processing module, compressing the data to be compressed by adopting the target compression algorithm.
2. The method of claim 1, further comprising:
if the compression algorithm currently applied by the first processing module is the target compression algorithm, the virtual machine is adopted to input the data to be compressed to the first processing module;
and in the first processing module, compressing the data to be compressed by adopting the target compression algorithm.
3. The method of claim 1, wherein prior to obtaining the data to be compressed, the method further comprises:
configuring N compression algorithms to the virtual machine, wherein N is an integer greater than or equal to 1;
the determining of the target compression algorithm required by the data to be compressed includes:
receiving a compression instruction aiming at the data to be compressed, wherein the compression instruction is used for specifying a target compression algorithm required by the data to be compressed;
determining a target compression algorithm from the N compression algorithms of the virtual machine.
4. The method of claim 1, wherein prior to obtaining the data to be compressed, the method further comprises:
configuring N compression algorithms to the virtual machine, wherein N is an integer greater than or equal to 1;
determining a default compression algorithm from the N compression algorithms;
the determining of the target compression algorithm required by the data to be compressed comprises:
and determining the default compression algorithm as a target compression algorithm required by the data to be compressed.
5. The method of claim 1, 2, 3 or 4, wherein after the obtaining the data to be compressed, the method further comprises:
determining the data volume of the data to be compressed;
judging whether the data volume exceeds a first preset threshold value or not;
if the data volume exceeds the first preset threshold, triggering the step of determining a target compression algorithm required by the data to be compressed;
if the data volume does not exceed the first preset threshold, transmitting the data to be compressed to a second processing module;
and compressing the data to be compressed by adopting the second processing module.
6. The method of claim 1, wherein the first processing module comprises a plurality of threads, and wherein compressing the data to be compressed using the target compression algorithm in the first processing module comprises:
writing the data to be compressed into a first processing queue of the first processing module;
judging whether an idle thread exists in the multiple threads;
if idle threads exist in the multiple threads, acquiring the data to be compressed from the first processing queue;
in the idle thread, compressing the data to be compressed by adopting the target compression algorithm;
and if no idle thread exists in the plurality of threads, compressing the data to be compressed by adopting the second processing module.
7. The method of claim 6, wherein after compressing the data to be compressed using the target compression algorithm, the method further comprises:
if no idle thread exists in the multiple threads, determining the number of tasks in a waiting queue in the multiple threads, wherein the waiting queue represents a task queue to be processed in the multiple threads;
judging whether the task quantity exceeds a second preset threshold value;
and if so, compressing the data to be compressed by adopting the second processing module.
8. The method of claim 6, further comprising:
if no idle thread exists in the multiple threads, determining a first priority corresponding to the first processing queue;
acquiring a second processing queue of the plurality of threads, wherein the second processing queue is an executing processing queue;
determining a second priority corresponding to the second processing queue;
if the first priority is higher than the second priority, preempting a target thread occupied by the second processing queue for the first processing queue;
and in the target thread, compressing the data to be compressed in the first processing queue by adopting the target compression algorithm.
9. The method according to any one of claims 6 to 8, wherein after compressing the data to be compressed using the target compression algorithm, the method further comprises:
outputting the compressed data to be compressed to obtain target data;
and determining that a new idle thread exists in the first processing module according to the target data.
10. A data processing apparatus, comprising:
an acquisition unit, configured to acquire data to be compressed;
a determining unit, configured to determine a target compression algorithm required by the data to be compressed;
a judging unit, configured to judge whether a compression algorithm currently applied by a first processing module is the target compression algorithm;
a writing unit, configured to write the target compression algorithm to the first processing module by using a virtual machine if the compression algorithm currently applied by the first processing module is not the target compression algorithm;
the judging unit being further configured to judge whether the compression algorithm applied by the first processing module is successfully switched to the target compression algorithm;
a locking unit, configured to lock the target compression algorithm as the compression algorithm applied by the first processing module;
an input unit, configured to input the data to be compressed to the first processing module by using the virtual machine;
and a compression unit, configured to compress the data to be compressed using the target compression algorithm in the first processing module.
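The flow through claim 10's units (acquire → determine → judge → write → lock → input → compress) can be sketched as one class whose methods play the role of the units. This is an illustration only: the first processing module is modeled as a plain dict, whereas the patent contemplates a module accessed through a virtual machine, and the algorithm choice and `zlib` call are stand-ins.

```python
import zlib


class DataProcessingApparatus:
    """Sketch of claim 10: each method plays the role of one claimed unit."""

    def __init__(self):
        # Stand-in for the first processing module's state.
        self.module = {"algorithm": None, "locked": False}

    def acquire(self, data: bytes) -> bytes:            # acquisition unit
        return data

    def determine_algorithm(self, data: bytes) -> str:  # determining unit
        return "deflate"                                # illustrative choice

    def needs_switch(self, target: str) -> bool:        # judging unit
        return self.module["algorithm"] != target

    def write_algorithm(self, target: str):             # writing unit
        self.module["algorithm"] = target               # via VM in the patent

    def lock_algorithm(self):                           # locking unit
        self.module["locked"] = True

    def compress(self, data: bytes) -> bytes:           # input + compression units
        return zlib.compress(data)
```

A typical pass: acquire the data, determine the target algorithm, switch and lock the module only when the currently applied algorithm differs, then input and compress.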
11. A computer device, the computer device comprising a processor and a memory:
the memory is configured to store program code; and the processor is configured to perform the data processing method of any one of claims 1 to 9 according to instructions in the program code.
12. A computer-readable storage medium having stored therein instructions which, when run on a computer, cause the computer to carry out the data processing method of any one of claims 1 to 9.
CN202110731228.8A 2021-06-29 2021-06-29 Data processing method, related device and storage medium Pending CN115543528A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110731228.8A CN115543528A (en) 2021-06-29 2021-06-29 Data processing method, related device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110731228.8A CN115543528A (en) 2021-06-29 2021-06-29 Data processing method, related device and storage medium

Publications (1)

Publication Number Publication Date
CN115543528A true CN115543528A (en) 2022-12-30

Family

ID=84717372

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110731228.8A Pending CN115543528A (en) 2021-06-29 2021-06-29 Data processing method, related device and storage medium

Country Status (1)

Country Link
CN (1) CN115543528A (en)

Similar Documents

Publication Publication Date Title
US11836533B2 (en) Automated reconfiguration of real time data stream processing
US10444995B2 (en) Automated selection of functions to reduce storage capacity based on performance requirements
US10380048B2 (en) Suspend and resume in a time shared coprocessor
US11442627B2 (en) Data compression utilizing low-ratio compression and delayed high-ratio compression
US10592447B1 (en) Accelerated data handling in cloud data storage system
US20190235758A1 (en) Autonomic Data Compression for Balancing Performance and Space
US10701154B2 (en) Sharding over multi-link data channels
US9916319B2 (en) Effective method to compress tabular data export files for data movement
CN111857550A (en) Method, apparatus and computer readable medium for data deduplication
CN111104258A (en) MongoDB database backup method and device and electronic equipment
US10592169B2 (en) Methods and systems that efficiently store metric data to enable period and peak detection
US10031764B2 (en) Managing executable files
US11119977B2 (en) Cognitive compression with varying structural granularities in NoSQL databases
CN115039091A (en) Multi-key-value command processing method and device, electronic equipment and storage medium
US9298487B2 (en) Managing virtual machine images in a distributed computing environment
US20170109367A1 (en) Early compression related processing with offline compression
CN115543528A (en) Data processing method, related device and storage medium
US20230061902A1 (en) Intelligent dataset slicing during microservice handshaking
US11630738B2 (en) Automatic objective-based compression level change for individual clusters
US11940998B2 (en) Database compression oriented to combinations of record fields
US11791835B1 (en) Compression improvement in data replication
CN112667607B (en) Historical data management method and related equipment
CA3190288A1 (en) Fault tolerant big data processing
CN116910004A (en) Data storage method, device, server and storage medium
CN113778323A (en) File processing method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 40079472

Country of ref document: HK