CN114328400A - Data processing method and related equipment - Google Patents

Info

Publication number
CN114328400A
Authority
CN
China
Prior art keywords
files
file
data processing
original
multiple threads
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011057183.2A
Other languages
Chinese (zh)
Inventor
刘俊杰
江锦毅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Priority to CN202011057183.2A
Publication of CN114328400A

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

An embodiment of the application discloses a data processing method and related equipment, which are used for exporting files and improving compression efficiency. The method comprises: obtaining N original files; splitting the N original files through multiple threads to obtain M first files, wherein M is smaller than N; compressing the M first files respectively through multiple threads to obtain M second files; and combining the M second files to obtain a target file. On the one hand, compressing the first files separately in multiple threads improves compression efficiency and therefore file export efficiency. On the other hand, no extra file-slice information header is generated or recorded.

Description

Data processing method and related equipment
Technical Field
The embodiment of the application relates to the field of communication, and in particular relates to a data processing method and related equipment.
Background
In an Internet Technology (IT) offline operation and maintenance scenario, an operation and maintenance tool or its data often needs to be exchanged with the outside by means of import and export.
At present, in the import and export process, a compression technology is used to compress the target files into a package. During compression, all files must be traversed, and file streams are opened, read, and closed one by one, so I/O efficiency is low.
Moreover, on the one hand, compression can only run in a single thread, so the performance of a multi-core CPU cannot be fully exploited. On the other hand, when the target contains hundreds of thousands of small files or more, compression efficiency is very low, resulting in an excessively long export time.
Disclosure of Invention
The embodiment of the application provides a data processing method and related equipment. The method can be used for exporting the file, and improves the compression efficiency.
A first aspect of an embodiment of the present application provides a data processing method, where the method includes: acquiring N original files, wherein N is a positive integer greater than 1; splitting N original files through multiple threads to obtain M first files, wherein M is a positive integer larger than 1 and is smaller than N; respectively compressing the M first files through multiple threads to obtain M second files, wherein the multiple threads correspond to the M second files one by one, and the M first files correspond to the M second files one by one; and combining the M second files to obtain a target file.
In the embodiment of the application, the N original files are split to obtain M first files, the M first files are compressed respectively through multiple threads to obtain M second files, and the M second files are combined to obtain a target file. On the one hand, compressing the first files separately in multiple threads improves compression efficiency and therefore file export efficiency. On the other hand, no extra header recording file fragment or block information is generated or recorded.
Optionally, in a possible implementation manner of the first aspect, the method further includes: acquiring a target file; splitting the target file to obtain M second files; decompressing the M second files respectively through multiple threads to obtain M first files; and splitting the M first files to obtain N original files.
In this possible implementation, decompression is performed through multiple threads, which makes full use of the physical configuration of the system, improves the I/O read/write performance of decompressing the original files, and improves both decompression performance and the efficiency of importing the target file.
Optionally, in a possible implementation manner of the first aspect, the splitting of the N original files through multiple threads to obtain M first files includes: splitting the N original files according to a preset byte size through multiple threads to obtain M first files.
In this possible implementation, the byte size of the first file can be controlled by splitting the N original files by bytes.
Optionally, in a possible implementation manner of the first aspect, the first file in the above step is a tar file, the second file is a zip file, and the target file is a tar file.
In this possible implementation, because the first file is a tar file, the second file is a zip file, and the target file is a tar file, the method can run on various operating systems and is highly versatile.
A second aspect of the embodiments of the present application provides a data processing apparatus, including:
an obtaining unit, configured to obtain N original files, where N is a positive integer greater than 1;
an archiving unit, configured to split the N original files through multiple threads to obtain M first files, where M is a positive integer greater than 1 and M is smaller than N;
a compression unit, configured to compress the M first files respectively through multiple threads to obtain M second files, where the multiple threads correspond one-to-one to the M second files, and the M first files correspond one-to-one to the M second files;
where the archiving unit is further configured to combine the M second files to obtain a target file.
Optionally, in a possible implementation manner of the second aspect, the obtaining unit of the data processing apparatus is further configured to obtain a target file;
the archiving unit is further configured to split the target file to obtain M second files;
the compression unit is further configured to decompress the M second files respectively through multiple threads to obtain M first files;
and the archiving unit is further configured to split the M first files to obtain N original files.
Optionally, in a possible implementation manner of the second aspect, the archiving unit of the data processing apparatus is specifically configured to split the N original files according to a preset byte size through multiple threads to obtain M first files.
Optionally, in a possible implementation manner of the second aspect, the first file of the data processing apparatus is a tar file, the second file is a zip file, and the target file is a tar file.
A third aspect of the application provides a data processing apparatus comprising a processor coupled to a memory for storing a computer program or instructions, the processor being adapted to execute the computer program or instructions in the memory such that the data processing apparatus performs the method of the first aspect.
A fourth aspect of the present application provides a chip, which includes a processor and a communication interface, where the communication interface is coupled to the processor, and the processor is configured to execute a computer program or instructions, so that the chip implements the method in the first aspect or any possible implementation manner of the first aspect.
A fifth aspect of the present application provides a computer storage medium having stored therein instructions that, when executed on a computer, cause the computer to perform the method of the first aspect or any possible implementation manner of the first aspect.
A sixth aspect of the present application provides a computer program product which, when executed on a computer, causes the computer to perform the method of the preceding first aspect or any possible implementation manner of the first aspect.
For technical effects brought by the second, third, fourth, fifth, and sixth aspects or any one of possible implementation manners, reference may be made to technical effects brought by the first aspect or different possible implementation manners of the first aspect, and details are not described here.
It can be seen from the foregoing technical solutions that the embodiments of the application have the following advantages: the N original files are split to obtain M first files, the M first files are compressed respectively through multiple threads to obtain M second files, and the M second files are combined to obtain a target file. Compressing the first files separately in multiple threads improves compression efficiency and therefore file export efficiency.
Drawings
FIG. 1 is a schematic flow chart of a data processing method according to an embodiment of the present application;
FIG. 2 is another schematic flow chart illustrating a data processing method according to an embodiment of the present application;
FIG. 3 is another schematic flow chart illustrating a data processing method according to an embodiment of the present application;
FIG. 4 is a schematic view of an application scenario of the data processing method in the embodiment of the present application;
FIG. 5 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present application;
FIG. 6 is a schematic structural diagram of another data processing apparatus according to an embodiment of the present application.
Detailed Description
The embodiment of the application provides a data processing method and related equipment. The method can be used for exporting and/or importing the file, and compression efficiency is improved.
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments.
In an IT offline operation and maintenance scenario, an operation and maintenance tool or its data often needs to be exchanged with the outside by means of import and export. In the import and export process, a compression technology is used to compress the target files into a package. During compression, all files must be traversed, and file streams are opened, read, and closed one by one, so I/O efficiency is low. On the one hand, the compression can only run in a single thread and cannot fully exploit the performance of a multi-core CPU. On the other hand, when the target contains hundreds of thousands of small files or more, compression or decompression efficiency is very low, so importing and exporting take a long time.
In view of the foregoing problems, an embodiment of the present application provides a data processing method. The following describes a data processing method in the embodiment of the present application.
Referring to fig. 1, an embodiment of a data processing method in the embodiment of the present application includes:
the method of the present embodiment may be applied to a process of data export and/or data import.
101. N original files are obtained.
In this embodiment of the application, the data processing apparatus may obtain the N original files in various ways: it may read them directly or receive them from another device, which is not specifically limited here. N is a positive integer greater than 1.
For example, referring to FIG. 2, the N original files may include File-1, File-2, File-3, and File-X, where X is a positive integer greater than 3.
Optionally, the N original files number in the hundreds of thousands. This count is only an example; the larger the number of original files, the greater the efficiency gain of the method.
102. And splitting the N original files through multiple threads to obtain M first files.
There are various ways for the data processing apparatus to split the N original files by multithreading, which are described below:
1. Splitting the N original files according to a preset byte size.
The data processing apparatus may use a tar technique to write the N original files into M first files according to a preset byte size, where each first file may be a tar archive file. At least two of the first files have the same number of bytes. M is a positive integer greater than 1, and M is far smaller than N.
Optionally, each first file may also carry a sequential fragment number, which makes it easier to restore the N original files in order during a subsequent import.
Illustratively, the N original files total 4.5 GB, or are contained in one folder whose N original files total 4.5 GB. The N original files may be split under the preset condition that each first file holds at most 1 GB, which generates 5 first files. Each of the 5 first files may hold 0.9 GB, or 4 of the first files may hold 1 GB each and the remaining first file 0.5 GB. The preset byte size is not limited here.
Illustratively, continuing the above example, as shown in FIG. 2, the data processing apparatus slices the N original files to obtain M first files, that is, n temporary archive files. The n temporary archive files include xxx.tar.01, xxx.tar.02, and xxx.tar.n, where n is a positive integer greater than 2. The number of temporary archive files may be 2 or more, and is not limited here.
2. And splitting the original file according to a preset directory.
The data processing apparatus may also split the N original files according to M preset directories and store them in those M directories.
In the embodiment of the present application, the above two ways are only examples, and the specific rule for splitting the N original files is not limited here.
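The two splitting rules described above can be sketched as pure grouping functions. This is a minimal illustration, not the method of the application: the function names `group_by_size`, `group_by_directory`, and `shard_name` are hypothetical, and the sketch assumes that each original file is kept whole inside one group (the text does not say whether a file may be cut across two first files). With nine 0.5 GB files and a 1 GB limit, the size rule yields five groups, four of 1 GB and one of 0.5 GB, which matches the 4.5 GB example.

```python
from collections import defaultdict
from pathlib import PurePosixPath
from typing import Dict, List, Tuple


def group_by_size(files: List[Tuple[str, int]], max_bytes: int) -> List[List[str]]:
    """Greedily pack (name, size) pairs into groups of at most max_bytes.

    Assumption: files are never cut in two, so a single file larger than
    max_bytes simply ends up in a group of its own.
    """
    groups: List[List[str]] = []
    current: List[str] = []
    used = 0
    for name, size in files:
        # Close the current group when the next file would overflow it.
        if current and used + size > max_bytes:
            groups.append(current)
            current, used = [], 0
        current.append(name)
        used += size
    if current:
        groups.append(current)
    return groups


def group_by_directory(paths: List[str]) -> Dict[str, List[str]]:
    """Group relative paths by their top-level directory ("." for bare files)."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for path in paths:
        parts = PurePosixPath(path).parts
        key = parts[0] if len(parts) > 1 else "."
        groups[key].append(path)
    return dict(groups)


def shard_name(base: str, index: int) -> str:
    """Build a first-file name with a two-digit sequential fragment number."""
    return f"{base}.tar.{index:02d}"
```

Either function returns M groups; each group would then become one first file, named for instance with `shard_name("xxx", i)`.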
In this step, the N original files are split into M first files through multiple threads, so a large number of original files becomes a small number of first files, which reduces the amount of I/O reading and writing in the subsequent compression.
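As a rough sketch of step 102, the following Python code writes each group of original files into its own uncompressed tar file, one task per group, using a thread pool. The helper names `write_shard` and `write_shards`, the default base name `xxx`, and the worker count are assumptions made for illustration; the application does not prescribe a language or an API.

```python
import tarfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import List, Sequence


def write_shard(out_path: Path, members: Sequence[Path], root: Path) -> Path:
    """Archive one group of original files into a single uncompressed tar."""
    with tarfile.open(str(out_path), "w") as tar:
        for member in members:
            # Store paths relative to the export root so they can be restored.
            tar.add(str(member), arcname=member.relative_to(root).as_posix())
    return out_path


def write_shards(
    groups: Sequence[Sequence[Path]],
    root: Path,
    out_dir: Path,
    base: str = "xxx",
    workers: int = 4,
) -> List[Path]:
    """Write M tar shards in parallel, named <base>.tar.01, <base>.tar.02, ..."""
    out_dir.mkdir(parents=True, exist_ok=True)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(write_shard, out_dir / f"{base}.tar.{i:02d}", group, root)
            for i, group in enumerate(groups, start=1)
        ]
        # Collect results in submission order so fragment numbers stay ordered.
        return [future.result() for future in futures]
```

Each thread opens only the files of its own group, so the N small reads are spread across workers while the output is just M larger files.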
103. And respectively compressing the M first files through multiple threads to obtain M second files.
The data processing device compresses the M first files respectively through multiple threads to obtain M second files.
Alternatively, the M first files and the M second files may correspond one to one. Illustratively, the number of first files is 5, and the number of second files is also 5.
In this embodiment of the application, the first file may be a tar file and the second file may be a zip file. The first file may also be an archive file in another format such as 7z, and the second file may also be a compressed file in another format such as gz, which is not limited here.
Illustratively, continuing the above example, as shown in FIG. 2, the data processing apparatus compresses the plurality of first files through a plurality of compression threads to obtain a plurality of second files, that is, n temporary compressed files, where the n temporary compressed files include xxx. The number of temporary compressed files may be 2 or more, and is not limited here.
In this step, the M first files are compressed respectively through multiple threads to obtain M second files, and no extra header recording file fragment or block information is generated or recorded.
In the embodiment of the present application, the threads used in step 102 and the threads used in step 103 may be the same set of threads or two different sets. In practical applications, using the same set of threads is preferable because it saves resources.
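Step 103 can be sketched as follows, assuming the first files are tar shards on disk and the second files are zip files. The names `compress_one` and `compress_all` are hypothetical, and DEFLATE is chosen here only as a common zip compression method. Each first file maps to exactly one second file, which mirrors the one-to-one correspondence stated above.

```python
import zipfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import List, Sequence


def compress_one(first_file: Path) -> Path:
    """Compress one first file (e.g. xxx.tar.01) into xxx.tar.01.zip."""
    second_file = first_file.with_name(first_file.name + ".zip")
    with zipfile.ZipFile(str(second_file), "w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.write(str(first_file), arcname=first_file.name)
    return second_file


def compress_all(first_files: Sequence[Path], workers: int = 4) -> List[Path]:
    """Compress the M first files in parallel and return the M second files."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() keeps the input order, so second files line up with first files.
        return list(pool.map(compress_one, first_files))
```

The same `ThreadPoolExecutor` instance could in principle be reused for step 102 and step 103, in line with the remark that one set of threads saves resources.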
104. And combining the M second files to obtain a target file.
After the data processing device obtains the M second files, the M second files may be merged to obtain one target file.
The target file in the embodiment of the present application may be a tar file, or may be an archive file in another format such as gz, which is not limited here.
Illustratively, continuing the above example, as shown in fig. 2, the data processing apparatus merges M second files (i.e., n temporary compressed files) into one target file (i.e., final export file: xxx.
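A minimal sketch of step 104, assuming the target file is an uncompressed tar that simply contains the M zip files as members. The function name `merge_into_target` is hypothetical. Because tar records each member's name and size itself, no separate slice header has to be written.

```python
import tarfile
from pathlib import Path
from typing import Sequence


def merge_into_target(second_files: Sequence[Path], target: Path) -> Path:
    """Merge the M second files into one tar target file, ordered by name."""
    with tarfile.open(str(target), "w") as tar:
        for path in sorted(second_files, key=lambda p: p.name):
            # Store only the base name; the fragment number is part of it.
            tar.add(str(path), arcname=path.name)
    return target
```

The second files are already compressed, so the outer tar is left uncompressed in this sketch.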
105. A target file is obtained, this step being optional.
In the embodiment of the present application, there are various ways for the data processing apparatus to obtain the target file, and the data processing apparatus may directly read the target file, or may receive the target file sent by other devices, which is not limited herein.
For example, referring to fig. 3, the target file is xxx.
In the foregoing steps, compression is performed through multiple threads, which makes full use of the physical configuration of the system, improves the I/O read/write performance of compressing the original files, and improves both compression performance and the efficiency of exporting the original files. Furthermore, because the first file is a tar file, the second file is a zip file, and the target file is a tar file, the method can run on various operating systems and is highly versatile.
106. Splitting a target file to obtain M second files, which is optional in this step.
After the data processing apparatus obtains the target file, it splits the target file to obtain M second files.
Optionally, the data processing apparatus splits the target file using a tar technique.
Illustratively, continuing the above example, as shown in FIG. 3, the data processing apparatus splits the target file xxx.exp.tar into M second files, namely xxx.tar.01.zip, xxx.tar.02.zip, and xxx.tar.nn.zip.
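Step 106 can be sketched as unpacking the members of the target tar, assuming the target was built as a plain tar of zip files. The function name `split_target` is hypothetical. The sketch copies each member through `extractfile` and keeps only the base name, a cautious choice so that a crafted member name cannot write outside the output directory.

```python
import shutil
import tarfile
from pathlib import Path
from typing import List


def split_target(target: Path, out_dir: Path) -> List[Path]:
    """Split one tar target file back into its M second files."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written: List[Path] = []
    with tarfile.open(str(target), "r") as tar:
        for member in tar.getmembers():
            if not member.isfile():
                continue
            # Keep only the base name so members cannot escape out_dir.
            dest = out_dir / Path(member.name).name
            with tar.extractfile(member) as src, open(dest, "wb") as dst:
                shutil.copyfileobj(src, dst)
            written.append(dest)
    return sorted(written)
```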
107. And respectively decompressing the M second files through multiple threads to obtain M first files, wherein the step is optional.
And the data processing device respectively decompresses the M second files through multiple threads to obtain M first files.
Optionally, each first file may also carry a sequential slice number.
Alternatively, the M second files may correspond to the M first files one to one. Illustratively, the number of second files is 5, and the number of first files is also 5.
Illustratively, continuing with the above example, as shown in FIG. 3, the data processing apparatus decompresses the M second files respectively through a plurality of threads to obtain M first files: xxx.tar.01.zip, xxx.tar.02.zip, and xxx.tar.nn.zip are decompressed into xxx.tar.01, xxx.tar.02, and xxx.tar.nn, respectively.
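Step 107 is the mirror image of the compression step and can be sketched as follows. The names `decompress_one` and `decompress_all` are hypothetical, and the sketch assumes that each zip file holds exactly one entry, namely the corresponding first file.

```python
import shutil
import zipfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import List, Sequence


def decompress_one(second_file: Path, out_dir: Path) -> Path:
    """Decompress one second file (xxx.tar.01.zip) into one first file."""
    with zipfile.ZipFile(str(second_file)) as zf:
        # Assumption: the zip holds a single entry, the tar shard itself.
        name = zf.namelist()[0]
        dest = out_dir / Path(name).name
        with zf.open(name) as src, open(dest, "wb") as dst:
            shutil.copyfileobj(src, dst)
    return dest


def decompress_all(second_files: Sequence[Path], out_dir: Path, workers: int = 4) -> List[Path]:
    """Decompress the M second files in parallel, preserving input order."""
    out_dir.mkdir(parents=True, exist_ok=True)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda p: decompress_one(p, out_dir), second_files))
```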
108. The M first files are split to obtain N original files, and this step is optional.
After the data processing device obtains the M first files, the M first files are split to obtain N original files.
Optionally, if the first file carries the sequential fragment number, the data processing apparatus splits the M first files according to the sequential fragment number to obtain N original files. Furthermore, all the original files are combined into a folder.
Illustratively, continuing the above example, as shown in FIG. 3, the data processing apparatus splits the M first files to obtain the N original files, that is, the smaller number of first files is split into the larger number of original files. Furthermore, all the original files are placed in one folder.
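Step 108 can be sketched as unpacking the first files in the order of their sequential fragment numbers into one folder. The names `fragment_number` and `restore_originals` are hypothetical, and the sketch assumes that the fragment number is the numeric suffix of the file name, as in xxx.tar.01. The path check is an added precaution, not something the text requires.

```python
import shutil
import tarfile
from pathlib import Path
from typing import List, Sequence


def fragment_number(first_file: Path) -> int:
    """Read the sequential fragment number from a name such as xxx.tar.01."""
    return int(first_file.name.rsplit(".", 1)[1])


def restore_originals(first_files: Sequence[Path], dest_dir: Path) -> List[str]:
    """Unpack the M first files, in fragment order, into one folder."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest_root = dest_dir.resolve()
    restored: List[str] = []
    for shard in sorted(first_files, key=fragment_number):
        with tarfile.open(str(shard), "r") as tar:
            for member in tar.getmembers():
                if not member.isfile():
                    continue
                target = (dest_root / member.name).resolve()
                # Refuse member names that would land outside dest_dir.
                if dest_root not in target.parents:
                    raise ValueError(f"unsafe member path: {member.name}")
                target.parent.mkdir(parents=True, exist_ok=True)
                with tar.extractfile(member) as src, open(target, "wb") as dst:
                    shutil.copyfileobj(src, dst)
                restored.append(member.name)
    return restored
```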
The embodiments of the present application include multiple possibilities, one embodiment may include step 101 to step 104, another embodiment may include step 101 to step 108, and yet another embodiment may include step 105 to step 108, which is not limited herein.
Steps 101 to 104 in the embodiment of the present application may be applied to data export, and steps 105 to 108 may be applied to data import.
In the embodiment of the application, in steps 101 to 104, the N original files are split into M first files through multiple threads, so a large number of original files becomes a small number of first files, which reduces the amount of I/O reading and writing in the subsequent compression. Compression is performed through multiple threads, which makes full use of the physical configuration of the system, improves the I/O read/write performance of compressing the original files, and improves both compression performance and the efficiency of exporting the original files. In steps 105 to 108, decompression is performed through multiple threads, which likewise makes full use of the physical configuration of the system, improves the I/O read/write performance of decompressing the original files, and improves both decompression performance and the efficiency of importing the target file. Furthermore, because the first file is a tar file, the second file is a zip file, and the target file is a tar file, the method can run on various operating systems and is highly versatile. Furthermore, the first file may also carry a sequential fragment number, which makes it easier to restore the N original files in order during a subsequent import.
The following describes an application scenario of the data processing method in the present application, and it is understood that the application scenario described below is only an example, and a specific application scenario is not limited herein.
Referring to fig. 4, an application scenario of the data processing method in the present application is shown.
The data processing method can be applied to a XXX tool platform, and the XXX tool platform comprises an archiving module and a compression module.
The archiving module is used to merge the original folders "tool 1" through "tool n" (typically, the user specifies the range of tools to export, and each tool directory contains several small files) into one file input stream.
The archiving module is also used to slice the file stream into n tar files (i.e., "tool 1.tar", "tool 2.tar", and "tool n.tar") in units of tool names.
The compression module is used to start a plurality of threads that read and compress the n tar files individually and in parallel to generate n zip files (i.e., "tool 1.tar.zip", "tool 2.tar.zip", and "tool n.tar.zip").
The archiving module is also used to merge the n zip files into one final tar file (i.e., XXX.exp.tar) for export.
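The scenario above can be sketched end to end in memory. This is an illustration under stated assumptions, not the tool platform's implementation: the function names `tar_tool`, `zip_blob`, and `export_tools` are hypothetical, each tool is modeled as a dictionary from file name to bytes, and real tools would of course live on disk.

```python
import io
import tarfile
import zipfile
from concurrent.futures import ThreadPoolExecutor
from typing import Dict


def tar_tool(name: str, files: Dict[str, bytes]) -> bytes:
    """Archive the small files of one tool into an in-memory tar."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for fname, data in sorted(files.items()):
            info = tarfile.TarInfo(name=f"{name}/{fname}")
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def zip_blob(entry_name: str, blob: bytes) -> bytes:
    """Compress one tar blob into an in-memory zip with a single entry."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr(entry_name, blob)
    return buf.getvalue()


def export_tools(tools: Dict[str, Dict[str, bytes]], workers: int = 4) -> bytes:
    """Archive per tool, compress in parallel, then merge into one final tar."""
    names = sorted(tools)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        tars = list(pool.map(lambda n: tar_tool(n, tools[n]), names))
        zips = list(pool.map(lambda n, t: zip_blob(f"{n}.tar", t), names, tars))
    out = io.BytesIO()
    with tarfile.open(fileobj=out, mode="w") as tar:
        for name, blob in zip(names, zips):
            info = tarfile.TarInfo(name=f"{name}.tar.zip")
            info.size = len(blob)
            tar.addfile(info, io.BytesIO(blob))
    return out.getvalue()
```

Calling `export_tools({"tool1": {"a.txt": b"A"}})` returns the bytes of a tar whose single member is `tool1.tar.zip`.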
Corresponding to the method provided by the above method embodiment, the embodiment of the present application further provides a corresponding apparatus, which includes a module for executing the above embodiment. The module may be software, hardware, or a combination of software and hardware.
Referring to fig. 5, in an embodiment of a data processing apparatus in the embodiment of the present application, the data processing apparatus includes:
an obtaining unit 501, configured to obtain N original files, where N is a positive integer greater than 1;
the archiving unit 502 is configured to split N original files by multiple threads to obtain M first files, where M is a positive integer greater than 1 and M is smaller than N;
a compressing unit 503, configured to compress the M first files respectively through multiple threads to obtain M second files, where the multiple threads correspond to the M second files one to one, and the M first files correspond to the M second files one to one;
the archiving unit 502 is further configured to merge the M second files to obtain a target file.
Optionally, the obtaining unit 501 is further configured to obtain a target file;
optionally, the archiving unit 502 is further configured to split the target file to obtain M second files;
optionally, the compressing unit 503 is further configured to decompress the M second files respectively through multiple threads to obtain M first files;
optionally, the archiving unit 502 is further configured to split the M first files to obtain N original files.
Optionally, the archiving unit 502 is specifically configured to split the N original files according to the preset byte size through multiple threads to obtain M first files.
Optionally, the first file is a tar file, the second file is a zip file, and the target file is a tar file.
In this embodiment, operations performed by each unit in the data processing apparatus are similar to those performed by the data processing apparatus in the embodiment shown in fig. 1 to 4, and are not described again here.
In this embodiment, the archiving unit 502 splits the N original files into M first files through multiple threads, so a large number of original files becomes a small number of first files, which reduces the amount of I/O reading and writing in the subsequent compression. The compression unit 503 compresses through multiple threads, which makes full use of the physical configuration of the system, improves the I/O read/write performance of compressing the original files, and improves both compression performance and the efficiency of exporting the original files. Optionally, the compression unit 503 decompresses through multiple threads, which likewise improves the I/O read/write performance of decompressing the original files, decompression performance, and the efficiency of importing the target file. Optionally, the first file is a tar file, the second file is a zip file, and the target file is a tar file, so the method can run on various operating systems and is highly versatile. Optionally, the first file may also carry a sequential fragment number, which makes it easier to restore the N original files in order during a subsequent import.
Referring to FIG. 6, an embodiment of the present application provides a possible schematic diagram of the data processing apparatus 600 involved in the foregoing embodiments. The data processing apparatus 600 may specifically be the data processing apparatus in the foregoing embodiments, and may include, but is not limited to, a processor 601, a communication port 602, a memory 603, and a bus 604. In the embodiment of the present application, the processor 601 is used to control and process the operation of the data processing apparatus 600.
Further, the processor 601 may be a central processing unit, a general purpose processor, a digital signal processor, an application specific integrated circuit, a field programmable gate array or other programmable logic device, transistor logic, a hardware component, or any combination thereof. The processor may implement or perform the various illustrative logical blocks, modules, and circuits described in connection with the disclosure. The processor may also be a combination that implements computing functions, e.g., a combination of one or more microprocessors, or a combination of a digital signal processor and a microprocessor.
It should be noted that the data processing apparatus shown in fig. 6 may be specifically configured to implement the functions of the steps executed by the data processing apparatus in the method embodiments corresponding to fig. 1 to fig. 4, and implement the technical effect corresponding to the data processing apparatus, and specific implementation manners of the data processing apparatus shown in fig. 6 may refer to descriptions in each of the method embodiments corresponding to fig. 1 to fig. 4, and are not described in detail here.
The present application further provides a computer-readable storage medium storing one or more computer-executable instructions. When the computer-executable instructions are executed by a processor, the processor executes the method according to the possible implementation manners of the data processing apparatus in the foregoing embodiments, where the data processing apparatus may specifically be the data processing apparatus in the foregoing method embodiments corresponding to FIG. 1 to FIG. 4.
The present application further provides a computer program product storing one or more computer-executable instructions. When the computer program product is executed by a processor, the processor executes the method according to the possible implementation manners of the data processing apparatus, where the data processing apparatus may specifically be the data processing apparatus in the method embodiments corresponding to FIG. 1 to FIG. 4.
The embodiment of the present application further provides a chip system, where the chip system includes a processor, and is used to support a data processing apparatus to implement the functions related to the possible implementation manners of the data processing apparatus. In one possible design, the system-on-chip may further include a memory, which stores program instructions and data necessary for the data processing apparatus. The chip system may be formed by a chip, or may include a chip and other discrete devices, where the data processing apparatus may specifically be the data processing apparatus in the method embodiments corresponding to fig. 1 to fig. 4.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as a stand-alone product, it may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

Claims (11)

1. A data processing method, comprising:
acquiring N original files, wherein N is a positive integer greater than 1;
splitting the N original files through multiple threads to obtain M first files, wherein M is a positive integer larger than 1 and is smaller than N;
respectively compressing the M first files through the multiple threads to obtain M second files, wherein the multiple threads correspond to the M second files one by one, and the M first files correspond to the M second files one by one;
and combining the M second files to obtain a target file.
2. The method of claim 1, further comprising:
acquiring the target file;
splitting the target file to obtain the M second files;
respectively decompressing the M second files through the multiple threads to obtain the M first files;
and splitting the M first files to obtain the N original files.
3. The method of claim 1, wherein the splitting the N original files through multiple threads to obtain M first files comprises:
splitting the N original files according to a preset byte size through the multiple threads to obtain the M first files.
4. The method according to any one of claims 1 to 3, wherein the first file is a tar file, the second file is a zip file, and the target file is a tar file.
5. A data processing apparatus, comprising:
an acquisition unit, configured to acquire N original files, wherein N is a positive integer greater than 1;
an archiving unit, configured to split the N original files through multiple threads to obtain M first files, wherein M is a positive integer greater than 1 and is smaller than N;
a compression unit, configured to respectively compress the M first files through the multiple threads to obtain M second files, wherein the multiple threads correspond to the M second files one by one, and the M first files correspond to the M second files one by one;
wherein the archiving unit is further configured to merge the M second files to obtain a target file.
6. The data processing apparatus according to claim 5, wherein the acquisition unit is further configured to acquire the target file;
the archiving unit is further configured to split the target file to obtain the M second files;
the compression unit is further configured to decompress the M second files respectively through the multiple threads to obtain the M first files;
the archiving unit is further configured to split the M first files to obtain the N original files.
7. The data processing apparatus according to claim 5, wherein the archiving unit is specifically configured to split the N original files according to a preset byte size through the multiple threads to obtain the M first files.
8. The data processing apparatus according to any of claims 5 to 7, wherein the first file is a tar file, the second file is a zip file, and the target file is a tar file.
9. A data processing apparatus comprising a processor coupled to a memory, the memory being configured to store a computer program or instructions, and the processor being configured to execute the computer program or instructions in the memory such that the method of any of claims 1 to 4 is performed.
10. A chip comprising a processor and a communication interface, the communication interface being coupled to the processor, the processor being configured to execute a computer program or instructions such that the method of any of claims 1 to 4 is performed.
11. A computer storage medium storing a program or instructions for implementing the method of any one of claims 1 to 4.
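For illustration only, the flow recited in claims 1 to 4 might be sketched as follows. This is a minimal in-memory sketch under stated assumptions, not the applicant's implementation: the function names (`group_by_size`, `make_tar`, `read_tar`, `pack_and_zip`, `unzip_and_unpack`, `export_files`, `import_files`), the member names `part<i>.tar` and `part<i>.zip`, and the greedy grouping rule are all hypothetical. The N original files are grouped by a preset byte size, each group is archived into a tar (first file) and compressed into a zip (second file) by its own worker thread, and the M zips are merged into one outer tar (target file). The reverse path follows claim 2.

```python
import io
import tarfile
import zipfile
from concurrent.futures import ThreadPoolExecutor


def group_by_size(files, max_bytes):
    """Greedily partition {name: bytes} into groups of at most max_bytes of payload.
    A single file larger than max_bytes still gets a group of its own."""
    groups, current, size = [], {}, 0
    for name, data in files.items():
        if current and size + len(data) > max_bytes:
            groups.append(current)
            current, size = {}, 0
        current[name] = data
        size += len(data)
    if current:
        groups.append(current)
    return groups


def make_tar(members):
    """Archive {name: bytes} into an uncompressed tar held in memory."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in members.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def read_tar(blob):
    """Return {name: bytes} for every regular member of an in-memory tar."""
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r") as tar:
        return {m.name: tar.extractfile(m).read()
                for m in tar.getmembers() if m.isfile()}


def pack_and_zip(item):
    """Worker: build one first file (tar) and compress it into one second file (zip)."""
    index, members = item
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr(f"part{index}.tar", make_tar(members))
    return f"part{index}.zip", buf.getvalue()


def unzip_and_unpack(blob):
    """Worker: decompress one second file and unpack the first file inside it."""
    with zipfile.ZipFile(io.BytesIO(blob)) as zf:
        return read_tar(zf.read(zf.namelist()[0]))


def export_files(files, max_bytes):
    """N original files -> M tars -> M zips (one thread each) -> one target tar."""
    groups = group_by_size(files, max_bytes)
    with ThreadPoolExecutor(max_workers=max(1, len(groups))) as pool:
        second_files = dict(pool.map(pack_and_zip, enumerate(groups)))
    return make_tar(second_files)


def import_files(target):
    """Reverse path: target tar -> M zips -> M tars (one thread each) -> N files."""
    second_files = read_tar(target)
    with ThreadPoolExecutor(max_workers=max(1, len(second_files))) as pool:
        parts = list(pool.map(unzip_and_unpack, second_files.values()))
    restored = {}
    for part in parts:
        restored.update(part)
    return restored
```

In this sketch the outer tar is the only container around the M second files, so the member boundaries come from the standard tar format rather than from a separately recorded slice header. A real exporter would presumably stream to disk instead of holding every archive in memory.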
CN202011057183.2A 2020-09-29 2020-09-29 Data processing method and related equipment Pending CN114328400A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011057183.2A CN114328400A (en) 2020-09-29 2020-09-29 Data processing method and related equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011057183.2A CN114328400A (en) 2020-09-29 2020-09-29 Data processing method and related equipment

Publications (1)

Publication Number Publication Date
CN114328400A true CN114328400A (en) 2022-04-12

Family

ID=81011123

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011057183.2A Pending CN114328400A (en) 2020-09-29 2020-09-29 Data processing method and related equipment

Country Status (1)

Country Link
CN (1) CN114328400A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117076388A (en) * 2023-10-12 2023-11-17 中科信工创新技术(北京)有限公司 File processing method and device, storage medium and electronic equipment

Similar Documents

Publication Publication Date Title
US11334255B2 (en) Method and device for data replication
US9031997B2 (en) Log file compression
Lin et al. Migratory compression: Coarse-grained data reordering to improve compressibility
CN110377226B (en) Compression method and device based on storage engine bluestore and storage medium
US20110184908A1 (en) Selective data deduplication
CN107682016B (en) Data compression method, data decompression method and related system
CN106547911B (en) Access method and system for massive small files
KR20110025359A (en) Block unit data compression and decompression method and apparatus thereof
CN104123280A (en) File comparison method and device
US20190272257A1 (en) Tape drive memory deduplication
CN110062028A (en) Data synchronous method, apparatus, computer equipment and computer storage medium
US20220335013A1 (en) Generating readable, compressed event trace logs from raw event trace logs
US8909606B2 (en) Data block compression using coalescion
CN114328400A (en) Data processing method and related equipment
CN111061428B (en) Data compression method and device
WO2018103490A1 (en) A method and system for compressing data
CN112765112A (en) Installation package packing and unpacking method
US20170293452A1 (en) Storage apparatus
Schatz Wirespeed: Extending the AFF4 forensic container format for scalable acquisition and live analysis
WO2019119336A1 (en) Multi-thread compression and decompression methods in generic data gz format, and device
US20160254824A1 (en) Determining compression techniques to apply to documents
CN111625186B (en) Data processing method, device, electronic equipment and storage medium
US11397586B1 (en) Unified and compressed statistical analysis data
Zhang et al. A Compatible LZMA ORC-Based Optimization for High Performance Big Data Load
US20090094392A1 (en) System and Method for Data Operations in Memory

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination