CN104484174B - Method and apparatus for processing a compressed file in RAR format

Info

Publication number: CN104484174B
Authority: CN (China)
Application number: CN201410773628.5A
Other versions: CN104484174A (en)
Other languages: Chinese (zh)
Inventor: 谢宁
Current assignee: Beijing Gridsum Technology Co Ltd
Original assignee: Beijing Gridsum Technology Co Ltd
Legal status: Active
Prior art keywords: file, function, class, decompression, decompressed file

Abstract

The invention discloses a method and an apparatus for processing a compressed file in RAR format. The processing method includes: determining the RAR-format compressed file to be processed; obtaining a pre-created file loading class function and a pre-created file decompression class function; decompressing the RAR-format compressed file to be processed by calling the file decompression class function from within the file loading class function, to obtain a decompressed file; obtaining the storage path of the decompressed file; and analyzing the decompressed file with a data analysis function to obtain a processing result. The invention solves the problem that Hadoop in the prior art cannot read and analyze compressed files in RAR format.

Description

Method and apparatus for processing a compressed file in RAR format
Technical field
The present invention relates to the field of data processing, and in particular to a method and an apparatus for processing a compressed file in RAR format.
Background art
In practice, Hadoop is commonly used for the routine analysis of log data. Hadoop is an open-source software framework for the distributed processing of massive data sets. As a distributed big-data computing platform, it consists mainly of two parts: the distributed file system (Hadoop Distributed File System, HDFS) and the distributed computing framework MapReduce. HDFS supports creating, deleting, moving, and renaming files, and offers high fault tolerance and scalability. MapReduce comprises a mapping part (Map) and a merging part (Reduce); data analysis is generally carried out in Map, and the analysis results are output after being merged by Reduce. The process by which Hadoop reads a file is shown in Fig. 1, a flow chart of Hadoop reading a file according to the prior art. The input split function InputFormat splits the file data stored in HDFS into multiple file data fragments (Splits). The read function RecordReader reads the Splits and passes what it reads to Map as input parameters. Map analyzes the data, Reduce merges the results, and the output function OutputFormat stores the output in HDFS.
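The read flow of Fig. 1 can be illustrated with a small model. The sketch below is not Hadoop code (Hadoop is a Java framework); it is plain Python that mimics the roles only. The function names, the fixed-size splitting rule, and the log-level count used as the analysis are all assumptions made for illustration.

```python
from collections import Counter
from typing import Iterable, Iterator, List


def input_format(data: str, split_size: int) -> List[List[str]]:
    """Model of InputFormat: cut the stored lines into fixed-size Splits."""
    lines = data.splitlines()
    return [lines[i:i + split_size] for i in range(0, len(lines), split_size)]


def record_reader(split: List[str]) -> Iterator[str]:
    """Model of RecordReader: hand the records of one Split to Map."""
    yield from split


def map_fn(records: Iterable[str]) -> Counter:
    """Model of Map: analyze one Split (here, count the first token per line)."""
    return Counter(line.split()[0] for line in records if line.strip())


def reduce_fn(partials: Iterable[Counter]) -> Counter:
    """Model of Reduce: merge the per-Split results."""
    total: Counter = Counter()
    for partial in partials:
        total.update(partial)
    return total


def run_job(data: str, split_size: int = 2) -> dict:
    """Chain the stages; a real OutputFormat would persist the result."""
    splits = input_format(data, split_size)
    partials = [map_fn(record_reader(split)) for split in splits]
    return dict(reduce_fn(partials))
```

Calling `run_job` on a few log lines returns one merged count per log level, regardless of how the lines were distributed over the Splits.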
However, as business volume keeps growing, the amount of log data that servers generate each day increases rapidly. To improve the space utilization of the computer system, log data usually needs to be compressed before it is stored. The compressed formats that Hadoop can read in the prior art include gzip, bzip, lzo, and snappy, but not RAR; that is, Hadoop cannot read RAR-format compressed files. RAR is a common compression format with advantages such as a high compression ratio and fast compression speed, and most files in HDFS are compressed in RAR format. The inability of Hadoop in the prior art to read RAR-format compressed files therefore complicates data processing and seriously reduces its efficiency.
No effective solution has yet been proposed for the problem that Hadoop in the prior art cannot read and analyze RAR-format compressed files.
Summary of the invention
The main object of the present invention is to provide a method and an apparatus for processing a compressed file in RAR format, so as to solve the problem that Hadoop in the prior art cannot read and analyze RAR-format compressed files.
To achieve this object, according to one aspect of the present invention, a method for processing a compressed file in RAR format is provided.
The method for processing a compressed file in RAR format includes: determining the RAR-format compressed file to be processed; obtaining a pre-created file loading class function and a pre-created file decompression class function; decompressing the RAR-format compressed file to be processed by calling the file decompression class function from within the file loading class function, to obtain a decompressed file; obtaining the storage path of the decompressed file; and analyzing the decompressed file with a data analysis function to obtain a processing result.
Further, decompressing the RAR-format compressed file to be processed by calling the file decompression class function from within the file loading class function, to obtain the decompressed file, includes: calling the file decompression class function from within the file loading class function; and calling, from within the file decompression class function, a decompression function in a decompression package, to obtain the decompressed file, where the decompression package stores the decompression function that decompresses the RAR-format compressed file to be processed.
Further, analyzing the decompressed file with the data analysis function to obtain the processing result includes: obtaining the return value of the file decompression class function, where the return value is the character string corresponding to the storage path of the decompressed file; and sending the return value of the file decompression class function to the data analysis function for analysis, to obtain the processing result.
Further, sending the return value of the file decompression class function to the data analysis function for analysis, to obtain the processing result, includes: converting the return value of the file decompression class function into a path-class address; obtaining the decompressed file stored at the path-class address; and analyzing the decompressed file with the data analysis function to obtain the processing result.
Further, after the data analysis function analyzes the decompressed file and obtains the processing result, the method also includes: deleting the decompressed file stored at the path-class address; and storing the processing result at the address corresponding to a preset storage path.
Further, the processing method starts multiple processes simultaneously, and in each process the data analysis function analyzes a decompressed file. After the data analysis function analyzes the decompressed file and obtains a processing result, the processing method also includes: merging the multiple processing results obtained by the data analysis functions in the multiple processes, to obtain a merged processing result; and outputting the merged processing result.
To achieve this object, according to another aspect of the present invention, an apparatus for processing a compressed file in RAR format is provided.
The apparatus for processing a compressed file in RAR format includes: a determining module, configured to determine the RAR-format compressed file to be processed; a first acquisition module, configured to obtain the pre-created file loading class function and file decompression class function; a decompression module, configured to decompress the RAR-format compressed file to be processed by calling the file decompression class function from within the file loading class function, to obtain a decompressed file; a second acquisition module, configured to obtain the storage path of the decompressed file; and a processing module, configured to analyze the decompressed file with a data analysis function to obtain a processing result.
Further, the decompression module includes: a first calling module, configured to call the file decompression class function from within the file loading class function; and a second calling module, configured to call, from within the file decompression class function, the decompression function in the decompression package, to obtain the decompressed file, where the decompression package stores the decompression function that decompresses the RAR-format compressed file to be processed.
Further, the processing module includes: a second acquisition submodule, configured to obtain the return value of the file decompression class function, where the return value is the character string corresponding to the storage path of the decompressed file; and a first processing submodule, configured to send the return value of the file decompression class function to the data analysis function for analysis, to obtain the processing result. The first processing submodule includes: a conversion module, configured to convert the return value of the file decompression class function into a path-class address; a third acquisition submodule, configured to obtain the decompressed file stored at the path-class address; and a second processing submodule, configured to have the data analysis function analyze the decompressed file, to obtain the processing result.
Further, the apparatus for processing a compressed file in RAR format also includes: a deletion module, configured to delete the decompressed file stored at the path-class address; and a storage module, configured to store the processing result at the address corresponding to the preset storage path.
The present invention determines the RAR-format compressed file to be processed; obtains the pre-created file loading class function and file decompression class function; decompresses the RAR-format compressed file to be processed by calling the file decompression class function from within the file loading class function, to obtain a decompressed file; obtains the storage path of the decompressed file; and analyzes the decompressed file with a data analysis function to obtain a processing result. It thereby solves the problem that Hadoop in the prior art cannot read and analyze RAR-format compressed files. The invention creates the file loading class function and the file decompression class function on the basis of the input split function and the read function. The file decompression class function reads and decompresses the RAR-format compressed files in HDFS to obtain decompressed files, which are then sent to Map for analysis, and the processing results are finally stored in HDFS. In the invention, Hadoop can start multiple Map tasks simultaneously. Each Map corresponds to one file decompression class function, and each file decompression class function reads one RAR-format compressed file, so multiple RAR-format compressed files are processed at the same time and execution efficiency improves. In addition, once Map has finished the analysis and obtained the processing result, the temporarily stored decompressed file is deleted, which saves system space.
Description of the drawings
The accompanying drawings, which form a part of this application, are provided for a further understanding of the present invention. The illustrative embodiments of the present invention and their descriptions are used to explain the present invention and do not unduly limit it. In the drawings:
Fig. 1 is a flow chart of Hadoop reading a file according to the prior art;
Fig. 2 is a flow chart of the method for processing a compressed file in RAR format according to an embodiment of the present invention;
Fig. 3 is a flow chart of Hadoop reading and analyzing a compressed file in RAR format according to an embodiment of the present invention; and
Fig. 4 is a schematic diagram of the apparatus for processing a compressed file in RAR format according to an embodiment of the present invention.
Detailed description of the embodiments
It should be noted that, provided there is no conflict, the embodiments of this application and the features in the embodiments may be combined with one another. The present invention is described in detail below with reference to the accompanying drawings and in conjunction with the embodiments.
To help those skilled in the art better understand the solution of this application, the technical solutions in the embodiments of this application are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of this application. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort shall fall within the scope of protection of this application.
It should be noted that the terms "first", "second", and the like in the description, the claims, and the above drawings of this application are used to distinguish similar objects and do not necessarily describe a specific order or sequence. It should be understood that data so designated are interchangeable where appropriate, so that the embodiments of this application described herein can be implemented. In addition, the terms "include" and "have", and any variants of them, are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that comprises a series of steps or units is not necessarily limited to the steps or units expressly listed, and may include other steps or units that are not expressly listed or that are inherent to the process, method, product, or device.
The present invention aims to provide a method and an apparatus for processing a compressed file in RAR format.
Fig. 2 is a flow chart of the method for processing a compressed file in RAR format according to an embodiment of the present invention. As shown in Fig. 2, the method includes steps S101 to S105:
Step S101: determine the RAR-format compressed file to be processed.
HDFS usually stores many RAR-format compressed files. The processing method of this embodiment can read and process a single RAR-format compressed file or several of them. The number of RAR-format compressed files to be processed can be determined according to the specific analysis requirements. Preferably, the method determines multiple RAR-format compressed files to be processed; compared with reading the compressed files one by one and processing each separately, this greatly improves processing efficiency.
Preferably, determining the RAR-format compressed file to be processed also includes obtaining the storage path of that compressed file, so that the compressed file stored at the address corresponding to the storage path can be retrieved accurately.
Step S102: obtain the pre-created file loading class function and file decompression class function.
Reading and analyzing the RAR-format compressed file to be processed relies on class functions: for example, reading the compressed file requires a file loading function, and decompressing it requires a file decompression function. In the processing method of this embodiment, the file loading class function RarInputFormat for RAR-format compressed files is created by inheriting from InputFormat, the input split function for ordinary files in the Hadoop framework, and the file decompression class function RarRecordReader for RAR-format compressed files is created by inheriting from RecordReader, the read function for ordinary files in the Hadoop framework. RarInputFormat can read one or more RAR-format compressed files from HDFS and split their data into several data file fragments. RarRecordReader reads these data file fragments, decompresses them to generate the decompressed file, obtains the storage path at which the decompressed file is stored, and passes that storage path to the data analysis function Map as a parameter. Together, the two class functions implement the reading and decompression of RAR-format compressed files.
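The inheritance relationship described above can be sketched as follows. This is a plain-Python model, not the Hadoop Java API: the base classes are stand-ins, the method names get_splits, create_record_reader, and read are hypothetical, and the ".unpacked" suffix is only a placeholder for a real storage path. The sketch also assumes one split per archive, which is consistent with the later statement that each file decompression class function reads one RAR-format compressed file.

```python
from typing import Iterator, List, Tuple


class InputFormat:
    """Stand-in for the generic input split class of the framework."""

    def get_splits(self, paths: List[str]) -> List[str]:
        return list(paths)

    def create_record_reader(self, split: str) -> "RecordReader":
        return RecordReader(split)


class RecordReader:
    """Stand-in for the generic read class of the framework."""

    def __init__(self, split: str) -> None:
        self.split = split

    def read(self) -> Iterator[Tuple[str, str]]:
        # Generic behaviour: the value is the split itself.
        yield self.split, self.split


class RarRecordReader(RecordReader):
    """Overrides reading so the value is the path of the decompressed file."""

    def read(self) -> Iterator[Tuple[str, str]]:
        # Decompression is omitted here; only the key/value contract is shown.
        yield self.split, self.split + ".unpacked"


class RarInputFormat(InputFormat):
    """Overrides splitting and reader creation for RAR archives."""

    def get_splits(self, paths: List[str]) -> List[str]:
        # Assumption: one split per archive, never a partial archive.
        return [p for p in paths if p.lower().endswith(".rar")]

    def create_record_reader(self, split: str) -> RecordReader:
        return RarRecordReader(split)
```

Because both subclasses keep the interface of their base classes, the rest of the job can use them wherever the generic classes were used before.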
Step S103: decompress the RAR-format compressed file to be processed by calling the file decompression class function from within the file loading class function, to obtain a decompressed file.
The file decompression class function RarRecordReader is called from within the file loading class function RarInputFormat to decompress the RAR-format compressed file and obtain the decompressed file. Specifically, this includes: calling the file decompression class function RarRecordReader from within the file loading class function RarInputFormat; and calling, from within RarRecordReader, the decompression function in a decompression package to obtain the decompressed file, where the decompression package stores the decompression function that decompresses the RAR-format compressed file to be processed. The decompression package in this embodiment is preferably the java-unrar-0.5.jar package, which can be downloaded from https://clojars.org/org.clojars.bonega/java-unrar. RarRecordReader calls the decompression function in this jar package to decompress the RAR-format compressed file. Decompressing the file makes its content easy to read and process.
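The call chain of this step (the loading class calls the decompression class, which in turn calls a function from a decompression package) can be modeled as below. This is a sketch under stated assumptions: the Python standard library has no RAR decoder, so zlib is used purely as a stand-in for the decompression function of the package, and the method names load and decompress, the work_dir parameter, and the ".unpacked" suffix are hypothetical.

```python
import os
import zlib
from typing import Callable


def stand_in_unpack(compressed: bytes) -> bytes:
    """Stand-in for the decompression function of the decompression package.

    NOT a RAR decoder: zlib is used only so the sketch runs with the
    standard library. A real job would call the RAR library here.
    """
    return zlib.decompress(compressed)


class RarRecordReader:
    """Model of the file decompression class function."""

    def __init__(self, unpack: Callable[[bytes], bytes], work_dir: str) -> None:
        self.unpack = unpack
        self.work_dir = work_dir

    def decompress(self, archive_path: str) -> str:
        # Delegate the actual decoding to the injected package function.
        with open(archive_path, "rb") as src:
            data = self.unpack(src.read())
        name = os.path.basename(archive_path) + ".unpacked"
        out_path = os.path.join(self.work_dir, name)
        with open(out_path, "wb") as dst:
            dst.write(data)
        return out_path  # storage path of the decompressed file, as a string


class RarInputFormat:
    """Model of the file loading class function."""

    def __init__(self, reader: RarRecordReader) -> None:
        self.reader = reader

    def load(self, archive_path: str) -> str:
        # The loading class calls the decompression class.
        return self.reader.decompress(archive_path)
```

Passing the decoder in as a callable keeps the reader independent of any particular decompression package, so the stand-in can be swapped for a real RAR decoder without changing the two classes.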
Step S104: obtain the storage path of the decompressed file.
After the file decompression class function RarRecordReader, called from within the file loading class function RarInputFormat, has decompressed the RAR-format compressed file to be processed, the decompressed file is obtained. Preferably, the decompressed file is stored temporarily in HDFS. The storage path of the decompressed file in HDFS is passed to the data analysis function Map as the return value Value of the file decompression function RarRecordReader. Supplying the storage path as a parameter allows Map to retrieve the decompressed file that corresponds to the RAR-format compressed file and to analyze it.
Step S105: analyze the decompressed file with the data analysis function to obtain a processing result.
After the data analysis function Map receives the return value Value of the file decompression function RarRecordReader, it can analyze the decompressed file. Preferably, in the processing method of this embodiment, analyzing the decompressed file with the data analysis function to obtain the processing result may specifically include: obtaining the return value of the file decompression class function, where the return value is the character string corresponding to the storage path of the decompressed file; and sending the return value of the file decompression class function to the data analysis function for analysis, to obtain the processing result.
Specifically, sending the return value of the file decompression class function to the data analysis function for analysis, to obtain the processing result, may include: converting the return value of the file decompression class function into a path-class address by means of the path class function Path in Hadoop; obtaining, by means of the data acquisition function FSDataInputStream, the decompressed file stored at the path-class address to which Path points; and having the data analysis function analyze the decompressed file according to the business analysis logic, to obtain the processing result.
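The three sub-steps above (string to path object, path object to input stream, stream to analysis) can be sketched with the Python standard library. Here pathlib.Path and an ordinary file handle play the roles that the text assigns to Hadoop's Path and its input stream; they are analogues, not the Hadoop classes. The function name map_fn and the counting logic are hypothetical stand-ins for the business analysis logic.

```python
from collections import Counter
from pathlib import Path


def map_fn(value: str) -> Counter:
    """Model of Map receiving the reader's return value (a path string)."""
    # 1. Convert the returned string into a path object.
    path = Path(value)
    # 2. Open the decompressed file stored at that path as a stream.
    with path.open("r", encoding="utf-8") as stream:
        # 3. Hypothetical business logic: count the first token of each line.
        return Counter(line.split()[0] for line in stream if line.strip())
```

For a decompressed access log, for instance, this would return the number of requests per HTTP method.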
Preferably, after the data analysis function analyzes the decompressed file and obtains the processing result, the processing method of this embodiment also includes: deleting the decompressed file temporarily stored in HDFS at the path-class address to which the path class function Path points, which releases disk space; and outputting the processing result through the output function OutputFormat, which stores it at the address corresponding to the preset storage path. Deleting the decompressed file that was temporarily stored in HDFS frees system storage space.
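The clean-up described above can be sketched as follows, using the local file system in place of HDFS. The function name analyze_then_clean, the line count used as the analysis, and the JSON result file are assumptions for illustration. Deleting the temporary file in a finally clause, so that it is removed even when the analysis fails, is a design choice of this sketch and is not stated in the text.

```python
import json
from pathlib import Path


def analyze_then_clean(unpacked: str, result_path: str) -> dict:
    """Analyze a temporary decompressed file, delete it, then store the result."""
    path = Path(unpacked)
    try:
        with path.open("r", encoding="utf-8") as stream:
            # Hypothetical analysis: count the lines of the file.
            result = {"lines": sum(1 for _ in stream)}
    finally:
        # Release the space held by the temporary decompressed file.
        path.unlink(missing_ok=True)
    # Store the processing result at the preset storage path.
    Path(result_path).write_text(json.dumps(result), encoding="utf-8")
    return result
```

After the call, only the result file remains; the decompressed copy no longer occupies storage.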
Preferably, the processing method of this embodiment can start multiple processes simultaneously, and in each process the data analysis function Map analyzes a decompressed file. After the data analysis function Map analyzes the decompressed file and obtains a processing result, the processing method of this embodiment may also include: merging, through the merge function Reduce, the multiple processing results obtained by the data analysis function Map in the multiple processes, to obtain a merged processing result; and finally outputting the merged processing result through the output function OutputFormat. By starting multiple Map tasks simultaneously, the processing method of this embodiment processes multiple RAR-format compressed files at the same time and improves execution efficiency. Fig. 3 is a flow chart of Hadoop reading and analyzing a compressed file in RAR format according to an embodiment of the present invention.
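The parallel analysis and merge can be modeled as below. In this sketch a thread pool stands in for the separate Map tasks that Hadoop would schedule, which is a simplification chosen so the example runs anywhere. The names map_fn, reduce_fn, and run_parallel are hypothetical, and each input is assumed to be the already decompressed lines of one archive.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce
from typing import Dict, List


def map_fn(lines: List[str]) -> Counter:
    """Model of one Map task: analyze the decompressed content of one archive."""
    return Counter(line.split()[0] for line in lines if line.strip())


def reduce_fn(left: Counter, right: Counter) -> Counter:
    """Model of Reduce: merge two partial results."""
    return left + right


def run_parallel(files: Dict[str, List[str]], workers: int = 4) -> Counter:
    """Run one map_fn per archive concurrently, then merge all partial results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(map_fn, files.values()))
    return reduce(reduce_fn, partials, Counter())
```

Because the merge is associative, the merged result does not depend on which task finishes first.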
The processing method of this embodiment determines the RAR-format compressed file to be processed; obtains the pre-created file loading class function and file decompression class function; decompresses the RAR-format compressed file to be processed by calling the file decompression class function from within the file loading class function, to obtain a decompressed file; obtains the storage path of the decompressed file; and analyzes the decompressed file with the data analysis function to obtain a processing result. It thereby solves the problem that Hadoop in the prior art cannot read and analyze RAR-format compressed files. The method also starts multiple Map tasks simultaneously, so multiple RAR-format compressed files are processed at the same time and execution efficiency improves. Moreover, once Map has finished the analysis and obtained the processing result, the temporarily stored decompressed file is deleted, which saves system space.
As can be seen from the above description, the processing method of the embodiment of the present invention creates the file loading class function and the file decompression class function on the basis of the input split function and the read function. The file decompression class function reads and decompresses the RAR-format compressed files in HDFS to obtain decompressed files, which are then sent to Map for analysis, and the processing results are finally stored in HDFS. This solves the problem that Hadoop in the prior art cannot read and analyze RAR-format compressed files. In the embodiment, Hadoop can also start multiple Map tasks simultaneously. Each Map corresponds to one file decompression class function, and each file decompression class function reads one RAR-format compressed file, so multiple RAR-format compressed files are processed at the same time and execution efficiency improves. In addition, once Map has finished the analysis and obtained the processing result, the temporarily stored decompressed file is deleted, which greatly saves system space.
It should be noted that the steps shown in the flow charts of the drawings may be executed in a computer system, for example as a set of computer-executable instructions, and that although a logical order is shown in the flow charts, in some cases the steps shown or described may be executed in a different order.
An embodiment of the present invention also provides an apparatus for processing a compressed file in RAR format. It should be noted that this apparatus can be used to execute the method for processing a compressed file in RAR format of the embodiment of the present invention.
Fig. 4 is a schematic diagram of the apparatus for processing a compressed file in RAR format according to an embodiment of the present invention. As shown in Fig. 4, the apparatus includes: a determining module 10, a first acquisition module 20, a decompression module 30, a second acquisition module 40, and a processing module 50.
The determining module 10 is configured to determine the RAR-format compressed file to be processed.
The first acquisition module 20 is configured to obtain the pre-created file loading class function and file decompression class function.
The decompression module 30 is configured to decompress the RAR-format compressed file to be processed by calling the file decompression class function from within the file loading class function, to obtain a decompressed file.
Preferably, the decompression module 30 includes: a first calling module, configured to call the file decompression class function from within the file loading class function; and a second calling module, configured to call, from within the file decompression class function, the decompression function in the decompression package, to obtain the decompressed file, where the decompression package stores the decompression function that decompresses the RAR-format compressed file to be processed.
The second acquisition module 40 is configured to obtain the storage path of the decompressed file.
The processing module 50 is configured to analyze the decompressed file with the data analysis function to obtain a processing result.
Preferably, the processing module 50 includes: a second acquisition submodule, configured to obtain the return value of the file decompression class function, where the return value is the character string corresponding to the storage path of the decompressed file; and a first processing submodule, configured to send the return value of the file decompression class function to the data analysis function for analysis, to obtain the processing result.
Specifically, the first processing submodule includes: a conversion module, configured to convert the return value of the file decompression class function into a path-class address; a third acquisition submodule, configured to obtain the decompressed file stored at the path-class address; and a second processing submodule, configured to have the data analysis function analyze the decompressed file, to obtain the processing result.
Preferably, the apparatus of this embodiment also includes: a deletion module, configured to delete the decompressed file stored at the path-class address; and a storage module, configured to store the processing result at the address corresponding to the preset storage path.
The apparatus of this embodiment includes the determining module 10, the first acquisition module 20, the decompression module 30, the second acquisition module 40, and the processing module 50. The apparatus solves the problem that Hadoop in the prior art cannot read and analyze RAR-format compressed files. It also improves processing efficiency by reading and processing multiple RAR-format compressed files at the same time, and saves system space by deleting the temporarily stored decompressed files.
Obviously, those skilled in the art should understand that the modules or steps of the present invention described above can be implemented with a general-purpose computing device. They can be concentrated on a single computing device or distributed over a network formed by multiple computing devices. Optionally, they can be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device; alternatively, they can each be fabricated as an individual integrated circuit module, or several of the modules or steps can be fabricated as a single integrated circuit module. The present invention is thus not limited to any specific combination of hardware and software.
The above are only preferred embodiments of the present invention and are not intended to limit it. For those skilled in the art, the present invention may be modified and varied in many ways. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (10)

1. A method for processing a compressed file in RAR format, characterized by comprising:
determining a to-be-processed RAR-format compressed file in the Hadoop Distributed File System (HDFS);
obtaining a file-loading class function and a file-decompression class function created in advance in the Hadoop framework;
decompressing the to-be-processed RAR-format compressed file by calling the file-decompression class function within the file-loading class function, to obtain a decompressed file;
obtaining the storage path of the decompressed file; and
passing the storage path of the decompressed file as a parameter into the data analysis function Map, so that the data analysis function Map obtains the decompressed file and performs analysis processing on the decompressed file, to obtain a processing result.
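The flow recited in claim 1 can be illustrated with a minimal, non-authoritative sketch. It is written in plain Python rather than against the Hadoop API, and every name in it (`load`, `unpack_to_temp`, `map_fn`, `fake_extract`) is a hypothetical chosen for illustration. Because the Python standard library has no RAR extractor, the extraction routine is passed in as a parameter, and the stand-in used here only copies bytes.

```python
import os
import tempfile
from typing import Callable, Dict

# Signature assumed for an extraction routine: (archive_path, output_path).
Extractor = Callable[[str, str], None]


def unpack_to_temp(archive_path: str, extract: Extractor) -> str:
    """Hypothetical 'file-decompression' step.

    Extracts the archive into a temporary directory and returns the
    storage path of the decompressed file as a string.
    """
    out_dir = tempfile.mkdtemp(prefix="unpacked_")
    out_path = os.path.join(out_dir, "data.txt")
    extract(archive_path, out_path)
    return out_path


def load(archive_path: str, extract: Extractor) -> str:
    """Hypothetical 'file-loading' step; it calls the decompression step."""
    return unpack_to_temp(archive_path, extract)


def map_fn(decompressed_path: str) -> Dict[str, int]:
    """Stand-in analysis function: counts words in the decompressed file."""
    counts: Dict[str, int] = {}
    with open(decompressed_path, encoding="utf-8") as fh:
        for line in fh:
            for word in line.split():
                counts[word] = counts.get(word, 0) + 1
    return counts


def fake_extract(src: str, dst: str) -> None:
    """Assumption: stands in for a real RAR extractor by copying bytes."""
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        fout.write(fin.read())
```

In this sketch the analysis function receives only the path string, which mirrors the claim's requirement that the storage path, not the file contents, is passed as the parameter.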
2. The method for processing a compressed file in RAR format according to claim 1, characterized in that decompressing the to-be-processed RAR-format compressed file by calling the file-decompression class function within the file-loading class function, to obtain a decompressed file, comprises:
calling the file-decompression class function within the file-loading class function; and
calling, within the file-decompression class function, a decompression function in a decompression package, to obtain the decompressed file, wherein the decompression package stores the decompression function for decompressing the to-be-processed RAR-format compressed file.
3. The method for processing a compressed file in RAR format according to claim 1, characterized in that passing the storage path of the decompressed file as a parameter into the data analysis function Map, so that the data analysis function Map obtains the decompressed file and performs analysis processing on the decompressed file, to obtain a processing result, comprises:
obtaining the return value of the file-decompression class function, wherein the return value of the file-decompression class function is the character string corresponding to the storage path of the decompressed file; and
sending the return value of the file-decompression class function to the data analysis function for analysis processing, to obtain the processing result.
4. The method for processing a compressed file in RAR format according to claim 3, characterized in that sending the return value of the file-decompression class function to the data analysis function for analysis processing, to obtain the processing result, comprises:
converting the return value of the file-decompression class function into a path-class address;
obtaining the decompressed file stored at the path-class address; and
performing, by the data analysis function, analysis processing on the decompressed file, to obtain the processing result.
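The three steps of claim 4 (string to path-class address, fetch the file, analyze it) might look as follows in a plain-Python sketch. Here `pathlib.Path` stands in for the path class, the function name `analyze_from_return_value` is hypothetical, and line counting is only a placeholder for the real analysis.

```python
from pathlib import Path


def analyze_from_return_value(return_value: str) -> int:
    """Turn the returned path string into a path object, read the file, analyze it.

    The analysis here (counting lines) is a placeholder assumption.
    """
    # Step 1: convert the string return value into a path-class object.
    path = Path(return_value)
    # Step 2: obtain the decompressed file stored at that path.
    text = path.read_text(encoding="utf-8")
    # Step 3: run the (stand-in) analysis on the file contents.
    return len(text.splitlines())
```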
5. The method for processing a compressed file in RAR format according to claim 4, characterized in that, after passing the storage path of the decompressed file as a parameter into the data analysis function Map, so that the data analysis function Map obtains the decompressed file and performs analysis processing on the decompressed file, to obtain a processing result, the method further comprises:
deleting the decompressed file stored at the path-class address; and
storing the processing result at the address corresponding to a preset storage path.
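The cleanup recited in claim 5 can be sketched as below, again in plain Python and under stated assumptions: the function name `finish` is hypothetical, and JSON is chosen as the on-disk format of the result purely for illustration, since the source does not specify one.

```python
import json
from pathlib import Path


def finish(decompressed: Path, result: dict, output_path: Path) -> None:
    """Delete the temporary decompressed file, then persist the result."""
    # Remove the temporary decompressed file to free space.
    decompressed.unlink()
    # Store the processing result at the preset output location
    # (JSON is an assumption; the source does not name a format).
    output_path.parent.mkdir(parents=True, exist_ok=True)
    output_path.write_text(json.dumps(result, sort_keys=True), encoding="utf-8")
```

Deleting the file only after the analysis has returned keeps the temporary copy available for the whole analysis step, which matches the ordering stated in the claim.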
6. The method for processing a compressed file in RAR format according to claim 5, characterized in that the method starts multiple processes simultaneously, wherein in each process the data analysis function performs analysis processing on the decompressed file, and after the data analysis function performs analysis processing on the decompressed file and obtains the processing result, the method further comprises:
merging the multiple processing results obtained from the analysis processing by the data analysis function in the multiple processes, to obtain a merged processing result; and
outputting the merged processing result.
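The parallel analysis and merge of claim 6 can be approximated with the sketch below. It is an illustration under assumptions, not the patented Hadoop implementation: `count_words` and `analyze_and_merge` are hypothetical names, word counting stands in for the analysis, and the function accepts any `concurrent.futures.Executor` so that the caller chooses the concurrency model.

```python
from collections import Counter
from concurrent.futures import Executor
from typing import Iterable


def count_words(path: str) -> Counter:
    """Stand-in per-file analysis: word counts for one decompressed file."""
    with open(path, encoding="utf-8") as fh:
        return Counter(fh.read().split())


def analyze_and_merge(paths: Iterable[str], executor: Executor) -> Counter:
    """Analyze each file concurrently and merge the partial results."""
    merged: Counter = Counter()
    # executor.map runs count_words once per path on the executor's workers.
    for partial in executor.map(count_words, paths):
        # Counter.update adds the counts of the partial result to the total.
        merged.update(partial)
    return merged
```

Passing a `concurrent.futures.ProcessPoolExecutor` would give one worker process per task, which is the closest analogue to the multiple processes in the claim; in a script that should be done under an `if __name__ == "__main__":` guard.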
7. An apparatus for processing a compressed file in RAR format, characterized by comprising:
a determining module, configured to determine a to-be-processed RAR-format compressed file in the Hadoop Distributed File System (HDFS);
a first acquisition module, configured to obtain a file-loading class function and a file-decompression class function created in advance in the Hadoop framework;
a decompression module, configured to decompress the to-be-processed RAR-format compressed file by calling the file-decompression class function within the file-loading class function, to obtain a decompressed file;
a second acquisition module, configured to obtain the storage path of the decompressed file; and
a processing module, configured to pass the storage path of the decompressed file as a parameter into the data analysis function Map, so that the data analysis function Map obtains the decompressed file and performs analysis processing on the decompressed file, to obtain a processing result.
8. The apparatus for processing a compressed file in RAR format according to claim 7, characterized in that the decompression module comprises:
a first calling module, configured to call the file-decompression class function within the file-loading class function; and
a second calling module, configured to call, within the file-decompression class function, a decompression function in a decompression package, to obtain the decompressed file, wherein the decompression package stores the decompression function for decompressing the to-be-processed RAR-format compressed file.
9. The apparatus for processing a compressed file in RAR format according to claim 7, characterized in that the processing module comprises:
a second acquisition submodule, configured to obtain the return value of the file-decompression class function, wherein the return value of the file-decompression class function is the character string corresponding to the storage path of the decompressed file; and
a first processing submodule, configured to send the return value of the file-decompression class function to the data analysis function for analysis processing, to obtain the processing result,
wherein the first processing submodule comprises:
a conversion module, configured to convert the return value of the file-decompression class function into a path-class address;
a third acquisition submodule, configured to obtain the decompressed file stored at the path-class address; and
a second processing module, configured to cause the data analysis function to perform analysis processing on the decompressed file, to obtain the processing result.
10. The apparatus for processing a compressed file in RAR format according to claim 9, characterized in that the apparatus further comprises:
a deletion module, configured to delete the decompressed file stored at the path-class address; and
a storage module, configured to store the processing result at the address corresponding to a preset storage path.
CN201410773628.5A 2014-12-12 2014-12-12 The treating method and apparatus of the compressed file of RAR forms Active CN104484174B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410773628.5A CN104484174B (en) 2014-12-12 2014-12-12 The treating method and apparatus of the compressed file of RAR forms

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410773628.5A CN104484174B (en) 2014-12-12 2014-12-12 The treating method and apparatus of the compressed file of RAR forms

Publications (2)

Publication Number Publication Date
CN104484174A CN104484174A (en) 2015-04-01
CN104484174B true CN104484174B (en) 2018-06-22

Family

ID=52758718

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410773628.5A Active CN104484174B (en) 2014-12-12 2014-12-12 The treating method and apparatus of the compressed file of RAR forms

Country Status (1)

Country Link
CN (1) CN104484174B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106406923B (en) * 2015-07-30 2020-09-04 腾讯科技(深圳)有限公司 Method and device for processing dynamic library file
CN106844766A (en) * 2017-02-23 2017-06-13 郑州云海信息技术有限公司 The method and device of a kind of compressed file decompression
CN108197204B (en) * 2017-12-28 2021-12-21 北京安博通科技股份有限公司 File processing method and device

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101668018A (en) * 2009-10-13 2010-03-10 金蝶软件(中国)有限公司 Network transmission method and system therefor
CN103235829B (en) * 2013-05-14 2016-03-02 厦门市美亚柏科信息股份有限公司 The decompression method of RAR file and device

Also Published As

Publication number Publication date
CN104484174A (en) 2015-04-01

Similar Documents

Publication Publication Date Title
US9542461B2 (en) Enhancing performance of extract, transform, and load (ETL) jobs
CN102609462A (en) Method for compressed storage of massive SQL (structured query language) by means of extracting SQL models
CN110362544A (en) Log processing system, log processing method, terminal and storage medium
CN113360554B (en) Method and equipment for extracting, converting and loading ETL (extract transform load) data
CN104484174B (en) The treating method and apparatus of the compressed file of RAR forms
CN108304538A (en) A kind of ETL system and its method based entirely on distributed memory calculating
CN104572679B (en) Public sentiment data storage method and device
CN105493095A (en) Adaptive and recursive filtering for sample submission
US9966971B2 (en) Character conversion
CN106407442B (en) A kind of mass text data processing method and device
CN106649676A (en) Duplication eliminating method and device based on HDFS storage file
CN114417408B (en) Data processing method, device, equipment and storage medium
CN109669976A (en) Data service method and equipment based on ETL
CN109471893B (en) Network data query method, equipment and computer readable storage medium
CN104166701A (en) Machine learning method and system
CN112817926B (en) File processing method and device, storage medium and electronic device
CN105308579A (en) Series data parallel analysis infrastructure and parallel distributed processing method therefor
KR20200103133A (en) Method and apparatus for performing extract-transfrom-load procedures in a hadoop-based big data processing system
CN106599244B (en) General original log cleaning device and method
CN111723063A (en) Method and device for processing offline log data
CN115840765A (en) Data processing method and device based on rule engine
KR20120084100A (en) Inputformat for binary format data in hadoop mapreduce and binary data analysis using the same
CN111125161B (en) Method, device, equipment and storage medium for processing data in real time
CN113704203A (en) Log file processing method and device
CN114328400A (en) Data processing method and related equipment

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: Processing method and processing device for compressed file with RAR (Roshal ARchive) format

Effective date of registration: 20190531

Granted publication date: 20180622

Pledgee: Shenzhen Black Horse World Investment Consulting Co., Ltd.

Pledgor: Beijing Guoshuang Technology Co.,Ltd.

Registration number: 2019990000503

CP02 Change in the address of a patent holder

Address after: 100083 No. 401, 4th Floor, Haitai Building, 229 North Fourth Ring Road, Haidian District, Beijing

Patentee after: BEIJING GRIDSUM TECHNOLOGY Co.,Ltd.

Address before: 100086 Beijing city Haidian District Shuangyushu Area No. 76 Zhichun Road cuigongfandian 8 layer A

Patentee before: BEIJING GRIDSUM TECHNOLOGY Co.,Ltd.
