CN108108226A - Method for analyzing and processing large data files in a cloud computing environment - Google Patents
- Publication number
- CN108108226A (application number CN201711380414.1A)
- Authority
- CN
- China
- Prior art keywords
- file
- analysis
- large data
- data files
- exploration
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/455—Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
- G06F9/45504—Abstract machines for programme code execution, e.g. Java virtual machine [JVM], interpreters, emulators
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
- H04L67/1097—Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
Abstract
The invention discloses a method for analyzing and processing large data files in a cloud computing environment. A file upload module first uploads a large exploration and development data file to a cloud storage system. A data analysis and processing module then sends the ID of that file to a message broker. An agent running in a virtual machine in the cloud computing environment receives the message, downloads the file with that ID from the cloud storage system into a specified directory in the virtual machine, and starts the professional analysis software installed there. After connecting to the virtual machine through the data analysis and processing module, the user analyzes and processes the data file directly in the running software. The invention stores large data files in a cloud storage system, shares large professional analysis software among multiple researchers through virtual machines, and downloads large exploration and development data files automatically into a virtual machine for the analysis software to use. Its advantages include fast data processing and high stability.
Description
Technical field
The present invention relates to methods for analyzing and processing large data files, and in particular to a method for analyzing and processing large data files in a cloud computing environment. It belongs to the field of cloud computing technology.
Background
Data analysis and processing in the exploration and development field often involves large data files of more than 100GB, and the analysis tools are large professional software packages that can easily cost millions of yuan. The work is usually carried out on dedicated mainframes. This approach is expensive, and the analysis work cannot be shared among several people. With the rapid development of cloud computing, moving both the storage and the analysis of data files to the cloud is a reasonable solution: it allows several people to collaborate on analysis and research, lowers the overall system cost, and makes efficient use of computing resources.
It is therefore important to develop a method that stores large exploration and development data files in a cloud storage system, shares large professional analysis software among multiple researchers through virtual machines, and uses real-time push technology to download the data files automatically into a virtual machine for the analysis software to use. Such a method also has significant application prospects.
Summary of the invention
To address the drawbacks of existing data analysis and processing technology in the exploration and development field, the present invention provides a method for analyzing and processing large data files in a cloud computing environment. Large exploration and development data files are stored in a cloud storage system, professional exploration and development analysis software is shared among multiple researchers through virtual machines, and real-time push technology downloads the data files automatically into a virtual machine for the analysis software to use.
To achieve this goal, the invention adopts the following technical solution:
A method for analyzing and processing large data files in a cloud computing environment, characterized in that the method comprises the following steps:
S1. A file upload module uploads a large exploration and development data file to a cloud storage system;
S2. Professional exploration and development analysis software is installed in a virtual machine in the cloud computing environment;
S3. Agent software is installed in each virtual machine;
S4. The user selects a file ID, and a data analysis and processing module sends the file ID to a message broker;
S5. The agent software in the virtual machine listens on its own message channel in the message broker and, on receiving the message sent by the data analysis and processing module, obtains the ID of the file in the cloud storage system;
S6. The agent software in the virtual machine looks up the file in the cloud storage system by file ID and, once it is found, downloads the large exploration and development data file into the directory of the professional analysis software in that virtual machine;
S7. The agent software in the virtual machine starts the professional exploration and development analysis software installed in the virtual machine;
S8. The user connects to the virtual machine through the data analysis and processing module and processes the large exploration and development data file in the professional analysis software;
S9. The agent in the virtual machine uploads the processed data file to the cloud storage system.
In the foregoing method, in step S1, the file upload module first slices the large exploration and development data file; the file upload module client then uploads all slice files and the total number of slices to the file upload module server; the server merges the slice files back into a single file; finally, the server calls the cloud storage system interface to write the file to the cloud storage system.
In the foregoing method, in step S1, the large exploration and development data file is larger than 100GB.
In the foregoing method, in step S2, the operating system installed in the virtual machine includes but is not limited to Windows and Linux.
In the foregoing method, in step S2, the professional analysis software includes but is not limited to EPoffice and Petrel.
In the foregoing method, in step S3, the agent software is a cross-platform program that runs automatically after the virtual machine starts, connects to the message broker, listens on the message channel, reads messages, obtains the file ID, automatically downloads the file from the cloud storage system, and saves it under the specified directory.
In the foregoing method, in step S4, the data analysis and processing module is a module that lets the user select the ID of the file to be analyzed and sends that ID to the message channel, in the message broker, of a virtual machine able to process the file.
In the foregoing method, in step S4, the data analysis and processing module includes a table mapping virtual machines to types of large exploration and development data files; after looking up the virtual machine that corresponds to the file type, the module sends a message containing the file ID to that virtual machine's message channel in the message broker.
Compared with the prior art, the beneficial effects of the invention are:
(1) Large data files are stored in a cloud storage system, large professional analysis software is shared among multiple researchers through virtual machines, and real-time push technology downloads large exploration and development data files automatically into a virtual machine for the analysis software to use, providing a one-stop solution centered on exploration and development big data;
(2) The method offers advantages such as fast data processing, high stability, strong practicality, and wide applicability.
Description of the drawings
Fig. 1 is a system structure diagram for the method of the invention for analyzing and processing large data files in a cloud computing environment;
Fig. 2 is a flow diagram of the method shown in Fig. 1.
Detailed description
The invention is described in detail below with reference to the drawings and a specific embodiment.
Referring to Fig. 1 and Fig. 2, the method of the invention for analyzing and processing large data files in a cloud computing environment comprises the following steps:
S1. A file upload module uploads a large exploration and development data file to a cloud storage system;
S2. Professional exploration and development analysis software is installed in a virtual machine in the cloud computing environment;
S3. Agent software is installed in each virtual machine;
S4. The user selects a file ID, and a data analysis and processing module sends the file ID to a message broker. The message broker preferably runs the RabbitMQ message broker software in a separate virtual machine, with one message channel configured for each virtual machine and the exchange preferably in direct mode;
S5. The agent software in the virtual machine listens on its own message channel in the message broker and, on receiving the message sent by the data analysis and processing module, obtains the ID of the file in the cloud storage system;
S6. The agent software in the virtual machine looks up the file in the cloud storage system by file ID and, once it is found, downloads the large exploration and development data file into the directory of the professional analysis software in that virtual machine, for example C:\analysisfile;
S7. The agent software in the virtual machine starts the professional exploration and development analysis software installed in the virtual machine;
S8. The user connects to the virtual machine through the data analysis and processing module and processes the large exploration and development data file in the professional analysis software;
S9. The agent in the virtual machine uploads the processed data file to the cloud storage system.
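The routing in steps S4 and S5 (one message channel per virtual machine, direct exchange mode) can be illustrated without a running RabbitMQ instance. The following is a minimal in-memory sketch, not the patent's implementation: the class `DirectBroker`, the functions `send_file_id` and `receive_file_id`, the JSON message format, and the virtual machine names are all assumptions made for illustration.

```python
import json
import queue


class DirectBroker:
    """In-memory stand-in for a message broker that routes in direct mode."""

    def __init__(self):
        self._queues = {}  # routing key (virtual machine name) -> message queue

    def declare_channel(self, vm_name):
        # One channel per virtual machine; repeated calls return the same queue.
        return self._queues.setdefault(vm_name, queue.Queue())

    def publish(self, routing_key, body):
        # Direct routing: deliver only to the queue whose name equals the key.
        if routing_key not in self._queues:
            raise KeyError(f"no channel declared for {routing_key!r}")
        self._queues[routing_key].put(body)


def send_file_id(broker, vm_name, file_id):
    """Step S4: the data analysis module sends the file ID to one VM's channel."""
    broker.publish(vm_name, json.dumps({"file_id": file_id}))


def receive_file_id(broker, vm_name):
    """Step S5: the agent reads the next message from its own channel."""
    message = broker.declare_channel(vm_name).get_nowait()
    return json.loads(message)["file_id"]
```

Because the routing key is the name of the target virtual machine, a message published for one virtual machine never reaches the channel of another. A deployment would replace `DirectBroker` with a client for the actual broker.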
As a preferred solution, in step S1, the file upload module first slices the large exploration and development data file, cutting the very large file into multiple 20MB subfiles and numbering each subfile in order. The file upload module client then uploads all slice files and the total number of slices to the file upload module server. Each time it receives a file, the server checks whether the total number of slices has been reached; once all slices have been uploaded, the server merges them back into a single file. The server preferably generates the file ID with a UUID generation algorithm and uses that UUID as the identifier of the file. Finally, the server calls the cloud storage system interface to write the file to the cloud storage system, which is preferably Ceph or OpenStack Swift.
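The slicing and merging in step S1 can be sketched as follows. This is a simplified illustration under stated assumptions: `split_file` and `UploadServer` are hypothetical names, slices are passed as local paths rather than sent over a network, and the merged file is written to a local directory as a stand-in for the cloud storage interface call.

```python
import uuid
from pathlib import Path

CHUNK_SIZE = 20 * 1024 * 1024  # 20 MB, the slice size given in the embodiment


def split_file(path, out_dir, chunk_size=CHUNK_SIZE):
    """Client side: cut the file into numbered slices and return their paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    parts = []
    with open(path, "rb") as src:
        index = 0
        while True:
            data = src.read(chunk_size)
            if not data:
                break
            part = out_dir / f"{index:08d}.part"  # slice number in the name
            part.write_bytes(data)
            parts.append(part)
            index += 1
    return parts


class UploadServer:
    """Server side: collect slices and merge them once all have arrived."""

    def __init__(self, total_parts):
        self.total_parts = total_parts
        self.received = {}  # slice number -> slice path

    def receive(self, index, part_path):
        # After every slice, check whether the announced total has been reached.
        self.received[index] = Path(part_path)
        return len(self.received) == self.total_parts

    def merge(self, dest_dir):
        if len(self.received) != self.total_parts:
            raise RuntimeError("not all slices have been uploaded")
        file_id = str(uuid.uuid4())  # the UUID doubles as the file identifier
        dest = Path(dest_dir) / file_id
        with open(dest, "wb") as out:
            for index in sorted(self.received):  # restore the original order
                out.write(self.received[index].read_bytes())
        return file_id, dest
```

Numbering the slices lets the server merge them correctly even when they arrive out of order, and sending the total count lets it detect completion without a separate end-of-upload message.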
As a preferred solution, in step S1, the large exploration and development data file is larger than 100GB.
As a preferred solution, in step S2, the operating system installed in the virtual machine includes but is not limited to Windows and Linux.
As a preferred solution, in step S2, the professional analysis software includes but is not limited to EPoffice and Petrel.
As a preferred solution, in step S3, the agent software is a cross-platform program that runs automatically after the virtual machine starts, connects to the message broker, listens on the message channel, reads messages, obtains the file ID, automatically downloads the file from the cloud storage system, and saves it under the specified directory. The agent is preferably written in Python and registered as a system service so that it starts automatically when the system boots. After starting, the program connects to the message broker using the broker's IP address and port, and then begins listening on the channel whose name is the name of its virtual machine.
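The agent's handling of a single message (steps S5 to S7) might look like the sketch below. It is an illustration only: the function names, the JSON message format, and the injected `download` and `launch` callables are assumptions standing in for the cloud storage client and for starting the analysis software, neither of which the source specifies in detail.

```python
import json
from pathlib import Path


def handle_message(body, download, target_dir, launch):
    """Process one message from the virtual machine's own channel.

    body       -- JSON text carrying the file ID (message format is assumed)
    download   -- callable(file_id, dest_path), stand-in for the storage client
    target_dir -- directory the analysis software reads from
    launch     -- callable(path), stand-in for starting the analysis software
    """
    file_id = json.loads(body)["file_id"]        # S5: obtain the file ID
    target_dir = Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    local_path = target_dir / file_id
    download(file_id, local_path)                # S6: fetch into the directory
    launch(local_path)                           # S7: start the software
    return local_path


def run_agent(messages, download, target_dir, launch):
    """Handle an iterable of messages; a real agent would block on the broker."""
    return [handle_message(m, download, target_dir, launch) for m in messages]
```

Passing `download` and `launch` in as parameters keeps the message-handling logic independent of any particular storage system or analysis package, which suits an agent that has to run on both Windows and Linux.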
As a preferred solution, in step S4, the data analysis and processing module is a module that lets the user select the ID of the file to be analyzed and sends that ID to the message channel, in the message broker, of a virtual machine able to process the file.
As a preferred solution, in step S4, the data analysis and processing module includes a table mapping virtual machines to types of large exploration and development data files. After looking up the virtual machine that corresponds to the file type, the module sends a message containing the file ID to that virtual machine's message channel in the message broker. That is, once the user selects the file to be processed, the module obtains the ID of that file from the user's selection and then uses the mapping table to find a virtual machine able to process the file.
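The lookup from file type to virtual machine can be sketched as a small table plus a routing function. The table contents below are purely illustrative: the file type keys, the virtual machine names, and the function names `select_vm` and `route_request` are assumptions, and `publish` is any callable that delivers a file ID to the named channel.

```python
# Hypothetical mapping table: file type -> name of the virtual machine
# (and therefore of the message channel) able to process that type.
VM_BY_FILE_TYPE = {
    "type-a": "vm-1",
    "type-b": "vm-2",
}


def select_vm(file_type, table=VM_BY_FILE_TYPE):
    """Return the virtual machine registered for this file type."""
    try:
        return table[file_type.lower()]
    except KeyError:
        raise LookupError(f"no virtual machine for file type {file_type!r}") from None


def route_request(file_id, file_type, publish, table=VM_BY_FILE_TYPE):
    """Look up the virtual machine, then send the file ID to its channel."""
    vm_name = select_vm(file_type, table)
    publish(vm_name, file_id)
    return vm_name
```

Keeping the mapping in one table means that supporting a new file type or a new virtual machine only requires adding a row, with no change to the routing logic.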
In summary, the invention stores large exploration and development data files in a cloud storage system, shares large professional analysis software among multiple researchers through virtual machines, and uses real-time push technology to download the data files automatically into a virtual machine for the analysis software to use, providing a one-stop solution centered on exploration and development big data. It offers advantages such as fast data processing, high stability, strong practicality, and wide applicability.
It should be noted that the foregoing is only a preferred embodiment of the invention and is not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the invention shall fall within its scope of protection.
Claims (8)
1. A method for analyzing and processing large data files in a cloud computing environment, characterized in that the method comprises the following steps:
S1. A file upload module uploads a large exploration and development data file to a cloud storage system;
S2. Professional exploration and development analysis software is installed in a virtual machine in the cloud computing environment;
S3. Agent software is installed in each virtual machine;
S4. The user selects a file ID, and a data analysis and processing module sends the file ID to a message broker;
S5. The agent software in the virtual machine listens on its own message channel in the message broker and, on receiving the message sent by the data analysis and processing module, obtains the ID of the file in the cloud storage system;
S6. The agent software in the virtual machine looks up the file in the cloud storage system by file ID and, once it is found, downloads the large exploration and development data file into the directory of the professional analysis software in that virtual machine;
S7. The agent software in the virtual machine starts the professional exploration and development analysis software installed in the virtual machine;
S8. The user connects to the virtual machine through the data analysis and processing module and processes the large exploration and development data file in the professional analysis software;
S9. The agent in the virtual machine uploads the processed data file to the cloud storage system.
2. The method for analyzing and processing large data files in a cloud computing environment according to claim 1, characterized in that, in step S1, the file upload module first slices the large exploration and development data file; the file upload module client then uploads all slice files and the total number of slices to the file upload module server; the server merges the slice files back into a single file; finally, the server calls the cloud storage system interface to write the file to the cloud storage system.
3. The method for analyzing and processing large data files in a cloud computing environment according to claim 1, characterized in that, in step S1, the large exploration and development data file is larger than 100GB.
4. The method for analyzing and processing large data files in a cloud computing environment according to claim 1, characterized in that, in step S2, the operating system installed in the virtual machine includes but is not limited to Windows and Linux.
5. The method for analyzing and processing large data files in a cloud computing environment according to claim 1, characterized in that, in step S2, the professional analysis software includes but is not limited to EPoffice and Petrel.
6. The method for analyzing and processing large data files in a cloud computing environment according to claim 1, characterized in that, in step S3, the agent software is a cross-platform program that runs automatically after the virtual machine starts, connects to the message broker, listens on the message channel, reads messages, obtains the file ID, automatically downloads the file from the cloud storage system, and saves it under the specified directory.
7. The method for analyzing and processing large data files in a cloud computing environment according to claim 1, characterized in that, in step S4, the data analysis and processing module is a module that lets the user select the ID of the file to be analyzed and sends that ID to the message channel, in the message broker, of a virtual machine able to process the file.
8. The method for analyzing and processing large data files in a cloud computing environment according to claim 1, characterized in that, in step S4, the data analysis and processing module includes a table mapping virtual machines to types of large exploration and development data files; after looking up the virtual machine that corresponds to the file type, the module sends a message containing the file ID to that virtual machine's message channel in the message broker.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711380414.1A CN108108226A (en) | 2017-12-20 | 2017-12-20 | A kind of large data files analysis and processing method under cloud computing environment |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711380414.1A CN108108226A (en) | 2017-12-20 | 2017-12-20 | A kind of large data files analysis and processing method under cloud computing environment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108108226A true CN108108226A (en) | 2018-06-01 |
Family
ID=62210378
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711380414.1A Pending CN108108226A (en) | 2017-12-20 | 2017-12-20 | A kind of large data files analysis and processing method under cloud computing environment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108108226A (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102789477A (en) * | 2011-05-19 | 2012-11-21 | 巴比禄股份有限公司 | File managing apparatus for processing an online storage service |
CN103257958A (en) * | 2012-02-16 | 2013-08-21 | 中兴通讯股份有限公司 | Cloud storage based translating method and system |
CN106020902A (en) * | 2016-05-31 | 2016-10-12 | 浪潮(北京)电子信息产业有限公司 | Virtual machine mirror image file management method and system applied to cloud platform |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109656682A (en) * | 2018-12-03 | 2019-04-19 | 中国石油化工股份有限公司 | A kind of system and method for the exploration and development big data processing platform based on container technique |
CN111125050A (en) * | 2019-12-26 | 2020-05-08 | 浪潮云信息技术有限公司 | CephFS-based file storage method for providing NFS protocol in openstack environment |
CN111125050B (en) * | 2019-12-26 | 2023-08-22 | 浪潮云信息技术股份公司 | File storage method based on CephFS to provide NFS protocol in openstack environment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20180601 |