CN106250380B - Custom data partitioning method for a Hadoop file system - Google Patents
- Publication number
- CN106250380B (application CN201510320303.6A)
- Authority
- CN
- China
- Prior art keywords
- data
- input data
- block
- file system
- sequence
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/18—File system types
- G06F16/182—Distributed file systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/11—File system administration, e.g. details of archiving or snapshots
- G06F16/113—Details of archiving
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/18—File system types
- G06F16/1858—Parallel file systems, i.e. file systems supporting multiple processors
Abstract
A custom data partitioning method for a Hadoop file system is proposed, comprising: sorting input data; partitioning the sorted input data into data blocks according to preset partitioning parameters, wherein partitioning the sorted input data comprises recording the start position and end position of each data block within the sorted input data in block information corresponding to that data block; and, based on the block information, reading the corresponding data blocks from the sorted input data for parallel processing.
Description
Technical field
The invention belongs to the field of parallel file system data management within computing, and in particular relates to a custom data partitioning method for the Hadoop file system.
Background art
The Hadoop Distributed File System (HDFS) is the open-source counterpart of the Google File System (GFS). It is a fault-tolerant distributed file system suited to deployment on large clusters of inexpensive machines. HDFS provides high-throughput data access and supports the storage of large files, making it well suited to applications over large data sets. As a subproject of Hadoop, HDFS provides scalable, high-throughput storage of large files for upper-layer Hadoop applications and forms the foundation of Hadoop cloud computing.
Fig. 1 is a schematic diagram of the structure of HDFS in the prior art. HDFS uses a master-slave architecture. An HDFS cluster contains one namenode, the primary server that manages the file namespace and regulates client access to files, together with a number of datanodes, usually one per machine, each of which manages the storage of its node. HDFS exposes the file namespace and allows user data to be stored in the form of files.
Internally, HDFS divides a file into one or more blocks, which are stored on a set of datanodes. The namenode performs file and directory operations on the namespace, such as open, close, and rename, and determines the mapping of blocks to datanodes. The datanodes serve read and write requests from file system clients, and also create, delete, and replicate blocks under instruction from the namenode.
HDFS is designed to support large files, and the programs that run on HDFS likewise process large data sets. These programs write their data once and read it one or more times, and the reads are expected to proceed at streaming speed; HDFS thus supports write-once, read-many operation on files. The typical HDFS block size is 64 MB, and an HDFS file is cut into multiple blocks of 64 MB. This fixed partitioning limits the application domains of Hadoop: in prestack seismic migration processing, for example, a single data input must be processed under several different partitioning schemes, which the fixed data partitioning of HDFS cannot satisfy.
Summary of the invention
On the basis of the fixed data partitioning of HDFS, the present invention proposes a descriptive, user-defined partitioning method. It realizes custom, descriptive partitioning of data in the HDFS file system, solves the problem that HDFS adopts a fixed partitioning of the physical data in its data access mode and therefore cannot adapt to varying data access requirements, and improves the versatility and flexibility of HDFS data file access.
One aspect of the present invention proposes a custom data partitioning method for a Hadoop file system, comprising: sorting input data; partitioning the sorted input data according to preset partitioning parameters to obtain data blocks, wherein partitioning the sorted input data comprises recording the start position and end position of each data block within the sorted input data in block information corresponding to that data block; and, based on the block information, reading the corresponding data blocks from the sorted input data for parallel processing.
According to another embodiment of the present invention, a custom data partitioning apparatus for a Hadoop file system is proposed, comprising: a component for sorting input data; a component for partitioning the sorted input data into data blocks according to preset partitioning parameters, wherein partitioning the sorted input data comprises recording the start position and end position of each data block within the sorted input data in block information corresponding to that data block; and a component for reading, based on the block information, the corresponding data blocks from the sorted input data for parallel processing.
Aspects of the present invention improve the HDFS file access method, enhance the versatility and flexibility of HDFS data file access, and provide a more efficient file storage service for the popularization of Hadoop technology.
Brief description of the drawings
The above and other objects, features, and advantages of the disclosure will become more apparent from the following more detailed description of exemplary embodiments of the disclosure taken in conjunction with the accompanying drawings, in which identical reference labels generally denote identical parts.
Fig. 1 shows a schematic diagram of the structure of HDFS in the prior art.
Fig. 2 shows a flow chart of a custom data partitioning method for a Hadoop file system according to an embodiment of the invention.
Detailed description
Preferred embodiments of the disclosure are described more fully below with reference to the accompanying drawings. Although the drawings show preferred embodiments of the disclosure, it should be appreciated that the disclosure may be realized in various forms and should not be limited by the embodiments illustrated here. Rather, these embodiments are provided so that the disclosure will be thorough and complete and will fully convey its scope to those skilled in the art.
Fig. 2 shows the flow chart of a custom data partitioning method for a Hadoop file system according to an embodiment of the invention. In this embodiment, the method comprises:
Step 201: sort the input data;
Step 202: partition the sorted input data according to preset partitioning parameters to obtain data blocks, wherein partitioning the sorted input data comprises recording the start position and end position of each data block within the sorted input data in block information corresponding to that data block;
Step 203: based on the block information, read the corresponding data blocks from the sorted input data for parallel processing.
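Steps 201-203 can be illustrated with a short sketch. This is an illustrative sketch only, not the patented implementation: the names (`partition_sorted`, `read_block`, `block_size`) are assumptions, and an in-memory list stands in for the physically stored HDFS data.

```python
# Illustrative sketch of steps 201-203: sort the input, describe blocks
# by (start, end) positions only, then read each block back by its
# block information. The list stands in for the unchanged physical data.

def partition_sorted(data, block_size):
    """Step 202: descriptive partitioning - record only start/end
    positions per block; the data itself is not moved or copied."""
    blocks = []
    for start in range(0, len(data), block_size):
        end = min(start + block_size, len(data))
        blocks.append({"start": start, "end": end})  # block information
    return blocks

def read_block(data, info):
    """Step 203: read one block from the sorted data via its info."""
    return data[info["start"]:info["end"]]

if __name__ == "__main__":
    raw = [5, 3, 9, 1, 7, 2, 8]
    sorted_data = sorted(raw)                                 # step 201
    block_info = partition_sorted(sorted_data, block_size=3)  # step 202
    blocks = [read_block(sorted_data, b) for b in block_info] # step 203
    print(block_info)
    print(blocks)  # [[1, 2, 3], [5, 7, 8], [9]]
```

Because only the block information changes between runs, the same sorted data can be re-described with a different `block_size` without rewriting any stored bytes.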
This embodiment partitions the sorted input data according to preset partitioning parameters and records the block information, then reads the data blocks according to that block information. This is a descriptive, user-defined partitioning mode distinct from the traditional fixed partitioning of physical data. It solves the problem that HDFS adopts a fixed partitioning of the physical data in its data access mode and cannot adapt to varying data access requirements, improves the versatility and flexibility of HDFS data file access, and extends the range of application of Hadoop cloud computing.
Data sorting
The purpose of sorting the input data is to provide regularized input for the subsequent descriptive partitioning. Sorting guarantees the contiguity of the data blocks, allows the block information in the subsequent partitioning step to be reduced to the start and end position of each block of data, and makes the parallel processing more efficient.
Those skilled in the art will understand that the principle by which the input data is sorted can be chosen arbitrarily as needed. In one example, the input data can be sorted according to the characteristics of the parallel processing to be performed (for example, the order in which the parallel processing requires the input data).
In one example, the input data can first be classified before sorting, so that data with the same attribute are grouped together, and then sorted. This further realizes attribute-wise classification and ordering of the data.
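The classify-then-sort step described above can be sketched as follows. The record layout (an attribute plus a sort key) and the function names are assumptions for demonstration only.

```python
# Illustrative sketch: group records sharing an attribute together
# first (classification), then order the records within each group
# (sorting), yielding attribute-wise classified and ordered data.
from collections import defaultdict

def classify_then_sort(records, attr_of, key_of):
    groups = defaultdict(list)
    for rec in records:
        groups[attr_of(rec)].append(rec)      # classification step
    ordered = []
    for attr in sorted(groups):               # deterministic group order
        ordered.extend(sorted(groups[attr], key=key_of))  # sort per group
    return ordered

if __name__ == "__main__":
    recs = [("b", 2), ("a", 3), ("b", 1), ("a", 1)]
    print(classify_then_sort(recs,
                             attr_of=lambda r: r[0],
                             key_of=lambda r: r[1]))
    # [('a', 1), ('a', 3), ('b', 1), ('b', 2)]
```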
In one example, the input data of this embodiment can be data that has already been stored after the fixed partitioning of physical data by the Hadoop file system. That is, this embodiment can build a secondary partitioning on top of the original fixed blocks. Those skilled in the art will understand, however, that this embodiment can also be used to replace the fixed partitioning in the Hadoop file system.
Data partitioning
This embodiment partitions the sorted input data according to preset partitioning parameters, realizing custom partitioning. In addition, by recording the start position and end position of each data block within the sorted input data in block information corresponding to that data block, the embodiment realizes descriptive partitioning. After such custom, descriptive partitioning, each data block corresponds to its block information while the physical data remains unchanged. Thus, without altering the physical storage of the data, the data can at any time be repartitioned in any user-defined manner according to the demands of the parallel processing.
In one example, the partitioning parameters are the parameters required to perform the partitioning. They can be set arbitrarily by the user as needed, so as to satisfy a variety of partitioning modes. That the partitioning parameters can be set arbitrarily is what makes the "custom partitioning" of this embodiment possible.
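The point that different parameter choices yield different block layouts over the same unchanged data can be sketched as follows; the parameter name `max_per_block` is an assumption, standing in for whatever partitioning parameters the user supplies.

```python
# Illustrative sketch: the same physical data is described by two
# different block layouts just by changing a partitioning parameter -
# only the (start, end) block information differs between the runs.

def describe_blocks(n_items, max_per_block):
    """Build block information for n_items under one parameter choice."""
    return [(s, min(s + max_per_block, n_items))
            for s in range(0, n_items, max_per_block)]

if __name__ == "__main__":
    data = list(range(10))          # physical data, never rewritten
    layout_a = describe_blocks(len(data), max_per_block=4)
    layout_b = describe_blocks(len(data), max_per_block=3)
    print(layout_a)  # [(0, 4), (4, 8), (8, 10)]
    print(layout_b)  # [(0, 3), (3, 6), (6, 9), (9, 10)]
```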
Parallel processing
In this embodiment, the corresponding data blocks are read from the sorted input data based on the block information, for parallel processing. This mode is not constrained by the physical block storage of the data; blocks can be accessed and processed in real time during a processing run.
In one example, the method can further comprise, before the parallel processing: starting parallel processing units according to the number of data blocks obtained, wherein one parallel processing unit can be started for each data block. After the parallel processing units are started, each unit reads its corresponding data block from the sorted input data based on the block information and processes it in parallel.
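One unit per block can be sketched with Python's standard thread pool; the threads stand in for the parallel processing units, and the per-block work (summing values) is an arbitrary placeholder.

```python
# Illustrative sketch: start one worker per data block; each worker
# reads its own block from the sorted data via the block information
# and processes it independently.
from concurrent.futures import ThreadPoolExecutor

def process_block(data, info):
    """One parallel unit: read the block (start, end) and process it
    (here, simply summing the block's values)."""
    start, end = info
    return sum(data[start:end])

if __name__ == "__main__":
    sorted_data = [1, 2, 3, 4, 5, 6, 7]
    block_info = [(0, 3), (3, 6), (6, 7)]
    # One parallel processing unit per block:
    with ThreadPoolExecutor(max_workers=len(block_info)) as pool:
        results = list(pool.map(lambda b: process_block(sorted_data, b),
                                block_info))
    print(results)  # [6, 15, 7]
```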
Data reduction
In one example, the method of this embodiment can further comprise a data reduction step that reduces the results of the parallel processing. The reduction can be performed separately as each parallel task completes, or it can be performed on the results of all the parallel tasks after they have all completed.
The reduction principle can be determined from the partitioning principle. For example, if the partitioning principle assigns input data with the same attribute to one data block, the reduction principle can combine the output results of the individual data blocks into one data result. After the reduction is completed, the processing result is output.
Those skilled in the art will understand that in application scenarios where no reduction operation is required, this embodiment may omit the reduction operation of this example.
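The reduction step tied to an attribute-based partitioning can be sketched as follows. The per-block output layout (attribute, value) is an assumption for demonstration.

```python
# Illustrative sketch of the reduction step: blocks were formed per
# attribute, so the per-block outputs are combined into one result
# keyed by attribute, summing values that share an attribute.

def reduce_results(per_block_outputs):
    """Combine (attribute, value) outputs of all block tasks into a
    single result."""
    combined = {}
    for attr, value in per_block_outputs:
        combined[attr] = combined.get(attr, 0) + value
    return combined

if __name__ == "__main__":
    # outputs of three parallel block tasks:
    outputs = [("near", 10), ("near", 5), ("far", 7)]
    print(reduce_results(outputs))  # {'near': 15, 'far': 7}
```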
Application example
Below, an application example of the embodiment of the present invention is given, using seismic gather data as the input data. Those skilled in the art should understand that this application example is intended only to aid understanding of the invention; none of its details limits the invention.
Data sorting
Common seismic gather data include common-shot gather data and common-midpoint (CMP) gather data. A common-shot gather collects the traces acquired from the same shot, while a common-midpoint gather collects the traces whose source-receiver midpoints coincide.
Before parallel processing of the seismic gather data, the sorting according to the embodiment of the present invention can be applied; in addition, classification can be performed before the sorting. In this application example, classification can be realized by gather extraction, that is, a gather-extraction operation converts the gather data (for example common-shot gather data or common-midpoint (CMP) gather data) into the required gather form. For example, in an application scenario where the partitioning of common-offset data is the main problem to be solved, the required gather form is the common-offset gather. In common-offset gather data, every trace in a gather has the same offset (the distance from the shot point to the receiver point); traces with identical offsets can therefore be grouped together to form common-offset gather data.
The classified gather data can then be sorted. For the common-offset gathers obtained by classification, the sorting can order the common-offset gather data by the magnitude of the offset value.
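The classification into common-offset gathers and the subsequent sort by offset can be sketched as follows; the trace layout (an offset paired with trace samples) is an assumption for demonstration.

```python
# Illustrative sketch: form common-offset gathers by grouping traces
# with equal offset (shot-to-receiver distance), then order the
# gathers by the magnitude of the offset value.
from collections import defaultdict

def to_common_offset_gathers(traces):
    gathers = defaultdict(list)
    for offset, samples in traces:
        gathers[offset].append(samples)      # group equal offsets
    # sort the resulting gathers by offset magnitude:
    return [(off, gathers[off]) for off in sorted(gathers)]

if __name__ == "__main__":
    traces = [(200, "t1"), (100, "t2"), (200, "t3"), (100, "t4")]
    print(to_common_offset_gathers(traces))
    # [(100, ['t2', 't4']), (200, ['t1', 't3'])]
```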
Data partitioning
The sorted seismic gather data can be partitioned according to preset partitioning parameters. In this application example, again taking the partitioning of common-offset data as the example, the partitioning parameters can include, but are not limited to, one or more of: the minimum offset value, the maximum offset value, the offset class interval, and the maximum number of traces in each data block. These parameters can be supplied by the user to determine the partitioning mode. The start position and end position of each data block within the sorted input data are recorded in block information corresponding to that data block, thereby realizing custom, descriptive partitioning of the seismic gather data.
It should be noted that, for ease of description, this application example uses the "common offset" principle as its example. Those skilled in the art will understand, however, that the partitioning and processing principle is not limited to "common offset"; any principle actually required by the data processing can be used. In shot-domain processing, for example, the data must be partitioned into shot gathers. In the field of seismic data processing, the main partitioning principles include offset partitioning, CMP gather partitioning, and shot gather partitioning.
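An offset-based partitioning under the four example parameters named above can be sketched as follows. All names, the trace layout, and the splitting policy are assumptions for demonstration, not the patent's implementation.

```python
# Illustrative sketch: build (start, end) block information over traces
# sorted by offset, using minimum offset, maximum offset, offset class
# interval, and maximum traces per block as the partitioning parameters.
# An offset class holding more than max_traces is split further.

def partition_by_offset(offsets_sorted, min_off, max_off, interval,
                        max_traces):
    blocks = []
    i = 0
    lo = min_off
    while lo < max_off and i < len(offsets_sorted):
        hi = min(lo + interval, max_off)
        start = i
        while i < len(offsets_sorted) and offsets_sorted[i] < hi:
            i += 1                       # traces in this offset class
        for s in range(start, i, max_traces):
            blocks.append((s, min(s + max_traces, i)))
        lo = hi
    return blocks

if __name__ == "__main__":
    offs = [100, 110, 120, 210, 220, 230, 240, 310]
    print(partition_by_offset(offs, 100, 400, 100, 3))
    # [(0, 3), (3, 6), (6, 7), (7, 8)]
```

Here the class 200-300 holds four traces and is split into two blocks because `max_traces` is 3.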
Parallel processing
Based on the block information, the corresponding data blocks can be read from the sorted input seismic gather data for parallel processing. The original processing mode of the Hadoop file system is fixed partitioning: when input data is stored in the file system it has already been divided into blocks, and this block storage is fixed once the input data has been imported. In seismic data processing, however, and especially in prestack migration, the required partitioning of the data varies; at each program run, the user may need a different data partitioning for processing. In this application example, the custom, descriptive block-parallel processing mode of the embodiment of the present invention solves the problem that the fixed block storage of Hadoop cannot adapt to seismic data processing. It realizes arbitrary, real-time, user-defined data partitioning during seismic data processing and satisfies the specific demands of seismic data processing.
Data reduction
This application example can also include a reduction, and the reduction varies with the partitioning principle. Still taking common-offset processing as the example, the reduction mode can be determined from the grouping of the offsets. For example, data of the same offset group (that is, the same data block) can undergo stacking reduction: the corresponding values of the results of the several parallel tasks for that offset group are added to produce one result. Data of different offset groups can undergo combining reduction: the results of the parallel tasks for the different offset groups are combined together. For a shot-domain processing program, the final reduction can be a stacking reduction. After the reduction is completed, the processing result is output.
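The two reductions described above can be sketched as follows; the result layout (lists of sample values keyed by offset group) is an assumption for demonstration.

```python
# Illustrative sketch of the two reductions: stacking reduction adds
# corresponding values of partial results from the same offset group,
# and combining reduction gathers the per-group results together.

def stack_reduce(partials):
    """Add corresponding values of several partial results for one
    offset group (i.e. one data block)."""
    return [sum(vals) for vals in zip(*partials)]

def combine_reduce(group_results):
    """Combine the (already stacked) results of different offset
    groups into one output."""
    return {off: res for off, res in group_results}

if __name__ == "__main__":
    # two parallel partial results for the same offset group:
    stacked = stack_reduce([[1, 2, 3], [10, 20, 30]])
    print(stacked)  # [11, 22, 33]
    # combine stacked results of different offset groups:
    print(combine_reduce([(100, stacked), (200, [5, 5, 5])]))
```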
According to another embodiment of the present invention, a custom data partitioning apparatus for a Hadoop file system is proposed, comprising: a component for sorting input data; a component for partitioning the sorted input data into data blocks according to preset partitioning parameters, wherein partitioning the sorted input data comprises recording the start position and end position of each data block within the sorted input data in block information corresponding to that data block; and a component for reading, based on the block information, the corresponding data blocks from the sorted input data for parallel processing.
In one example, the apparatus may further comprise: a component for classifying the input data before it is sorted, so that data with the same attribute are grouped together.
In one example, the input data can be data that has been stored after the fixed partitioning of physical data by the Hadoop file system.
In one example, the apparatus may further comprise: a component for starting parallel processing units according to the number of data blocks obtained, wherein one parallel processing unit can be started for each data block; and a component for reading, using the started parallel processing units and based on the block information, the corresponding data blocks from the sorted input data for parallel processing.
In one example, the apparatus may further comprise a component for reducing the processing results of the parallel processing.
In one example, the input data can be seismic gather data.
In one example, sorting the input data may comprise: grouping seismic gather data with identical offsets together to form common-offset gather data; and sorting the common-offset gather data by the magnitude of the offset value.
In one example, partitioning the sorted input data according to preset partitioning parameters may comprise: partitioning the sorted common-offset gather data according to one or more of the following partitioning parameters: the minimum offset value, the maximum offset value, the offset class interval, and the maximum number of traces in each data block.
In one example, the apparatus may further comprise a component for adding the corresponding values of the results of the several parallel tasks for the same data block, so as to realize the reduction.
In one example, the apparatus may further comprise a component for combining the results of the parallel tasks for different data blocks, so as to realize the reduction.
The embodiment of the present invention proposes a custom, descriptive data partitioning mechanism that realizes custom, descriptive partitioning of data in the HDFS file system. On the basis of the fixed data partitioning mechanism of HDFS, the data is partitioned in a user-defined manner using data description information, realizing arbitrary partitioning of the data without changing the physical data storage. This realizes flexible data partitioning and extends the data management capability and application domain of the HDFS file system.
In seismic data processing scenarios, applying the custom, descriptive partitioning mechanism of the embodiment of the present invention enables Hadoop parallel processing of prestack seismic migration and improves the capacity for processing massive data.
The disclosure may be a system, a method, and/or a computer program product. The computer program product may include a computer-readable storage medium carrying computer-readable program instructions for causing a processor to carry out aspects of the disclosure.
The computer-readable storage medium can be a tangible device that can retain and store instructions for use by an instruction execution device. The computer-readable storage medium may be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. A non-exhaustive list of more specific examples of the computer-readable storage medium includes: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disc (DVD), a memory stick, a floppy disk, a mechanically encoded device such as punch cards or raised structures in a groove having instructions recorded thereon, and any suitable combination of the foregoing. A computer-readable storage medium, as used herein, is not to be construed as being a transitory signal per se, such as a radio wave or other freely propagating electromagnetic wave, an electromagnetic wave propagating through a waveguide or other transmission medium (for example, a light pulse passing through a fiber-optic cable), or an electrical signal transmitted through a wire.
Computer-readable program instructions described herein can be downloaded to respective computing/processing devices from a computer-readable storage medium, or to an external computer or external storage device via a network, for example, the Internet, a local area network, a wide area network, and/or a wireless network. The network may comprise copper transmission cables, optical transmission fibers, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers. A network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network and forwards the computer-readable program instructions for storage in a computer-readable storage medium within the respective computing/processing device.
Computer-readable program instructions for carrying out operations of the disclosure may be assembler instructions, instruction-set-architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or either source code or object code written in any combination of one or more programming languages, including an object-oriented programming language such as Smalltalk, C++, or the like, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, electronic circuitry including, for example, programmable logic circuitry, field-programmable gate arrays (FPGA), or programmable logic arrays (PLA) may execute the computer-readable program instructions by utilizing state information of the computer-readable program instructions to personalize the electronic circuitry, in order to carry out aspects of the disclosure.
Aspects of the disclosure are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the disclosure. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions.
These computer-readable program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, when executed via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams. These computer-readable program instructions may also be stored in a computer-readable storage medium that can direct a computer, a programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer-readable medium having instructions stored therein comprises an article of manufacture including instructions which implement aspects of the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams. The computer-readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other device to cause a series of operational steps to be performed on the computer, other programmable apparatus, or other device to produce a computer-implemented process, such that the instructions which execute on the computer, other programmable apparatus, or other device implement the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the disclosure. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of instructions, which comprises one or more executable instructions for implementing the specified logical functions. In some alternative implementations, the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may in fact be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by special-purpose hardware-based systems that perform the specified functions or acts, or by combinations of special-purpose hardware and computer instructions.
The embodiments of the disclosure have been described above. The foregoing description is illustrative, not exhaustive, and is not limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein was chosen to best explain the principles of the embodiments, the practical application, or the technical improvement over technologies found in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.
Claims (8)
1. A customized data partitioning method for Hadoop file system data, comprising:
sorting input data;
partitioning the sorted input data into data blocks according to preset data partitioning parameters, wherein partitioning the sorted input data comprises: recording the start position and the end position of each data block within the sorted input data in block information corresponding to that data block; and
reading, based on the block information, the corresponding data blocks from the sorted input data for parallel processing;
wherein the input data are data to be stored after raw data are divided into fixed-size blocks based on the Hadoop file system; and
wherein partitioning the sorted input data according to the preset data partitioning parameters comprises: partitioning the sorted common-offset gather data according to one or more of the following data partitioning parameters: a minimum offset value, a maximum offset value, an offset class interval, and a maximum number of traces in each data block.
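For illustration only, the partitioning step of claim 1 can be sketched as follows. This is a minimal sketch, not the patented implementation; the names `partition_by_offset`, `min_offset`, `max_offset`, `class_interval`, and `max_traces` are assumptions chosen to mirror the four partitioning parameters named in the claim, and each block is described only by its start and end positions, as the claim requires.

```python
def partition_by_offset(offsets, min_offset, max_offset, class_interval, max_traces):
    """Partition traces (already sorted by offset) into data blocks.

    Each block is recorded as its start and end position in the sorted
    input rather than as a copy of the data (the "block information" of
    claim 1). A new block starts when the offset crosses into the next
    class interval or the current block reaches max_traces traces.
    """
    blocks = []
    start = 0
    class_end = min_offset + class_interval  # upper edge of the current offset class
    for i, offset in enumerate(offsets):
        crossed = offset >= class_end or offset > max_offset
        full = (i - start) >= max_traces
        if i > start and (crossed or full):
            blocks.append({"start": start, "end": i})  # end position is exclusive
            start = i
        while offset >= class_end:  # advance to the class containing this offset
            class_end += class_interval
    if start < len(offsets):
        blocks.append({"start": start, "end": len(offsets)})
    return blocks
```

In this sketch only index pairs are stored, so the sorted input is never copied; a worker can later seek directly to its block, which matches the claim's use of recorded start/end positions.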
2. the customized method of partition of Hadoop file system data according to claim 1, further includes:
Before being ranked up to input data, classification processing is carried out to input data, so that the data set with same alike result
In together.
3. the customized method of partition of Hadoop file system data according to claim 1, further includes:
Start parallel processing element according to the number of obtained data block, wherein one can be started for each data block parallel
Processing unit;And
It utilizes started parallel processing element to be based on the blocking information, corresponding number is read from the input data after sequence
According to block, to carry out parallel processing.
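A minimal sketch of claim 3 follows (all names hypothetical; Python threads stand in for Hadoop's parallel processing units): one worker is started per data block, and each worker reads only the slice identified by its block information.

```python
from concurrent.futures import ThreadPoolExecutor

def process_block(data, info):
    # Read only this block from the sorted input, using the recorded
    # start/end positions; sum() stands in for the real per-block computation.
    return sum(data[info["start"]:info["end"]])

def run_parallel(data, block_infos):
    # Start one processing unit per data block (claim 3).
    with ThreadPoolExecutor(max_workers=len(block_infos)) as pool:
        futures = [pool.submit(process_block, data, info) for info in block_infos]
        return [f.result() for f in futures]
```

Because each worker seeks to its own start position, no worker touches another block's data, which is what makes the per-block processing independent and parallelizable.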
4. the customized method of partition of Hadoop file system data according to claim 1, further includes:
Reduction process is carried out to the processing result of parallel processing.
5. the customized method of partition of Hadoop file system data according to claim 1, wherein
The input data is seismic channel set data.
6. the customized method of partition of Hadoop file system data according to claim 5, wherein being carried out to input data
Sequence includes:
The identical seismic channel set data of offset distance are referred to together, to form common offset trace gather data;And
Common offset trace gather data are ranked up according to the size of offset distance value.
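The sorting of claim 6 can be illustrated with a sketch (hypothetical names; each trace is modelled as an (offset, samples) pair):

```python
from collections import defaultdict

def build_common_offset_gathers(traces):
    """Group traces with the same offset into common-offset gathers,
    then order the gathers by offset value (claim 6)."""
    gathers = defaultdict(list)
    for offset, samples in traces:
        gathers[offset].append(samples)
    return [(offset, gathers[offset]) for offset in sorted(gathers)]
```

Sorting the gathers by offset value gives exactly the ordering that the offset-class partitioning of claim 1 assumes.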
7. the customized method of partition of Hadoop file system data according to claim 5, wherein same data will be directed to
The result respective value of multiple parallel processings of block is added, to carry out reduction process.
8. the customized method of partition of Hadoop file system data according to claim 5, wherein different data will be directed to
The result of each parallel processing of block is combined, to carry out reduction process.
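Claims 7 and 8 describe two complementary reduction modes, sketched below with hypothetical names: partial results for the same block are added element-wise, while results for different blocks are simply combined.

```python
def reduce_same_block(partial_results):
    # Claim 7: add corresponding values of several parallel results
    # computed for the same data block.
    return [sum(values) for values in zip(*partial_results)]

def reduce_across_blocks(block_results):
    # Claim 8: combine the per-block results of different data blocks
    # into a single overall result.
    combined = []
    for result in block_results:
        combined.extend(result)
    return combined
```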
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510320303.6A CN106250380B (en) | 2015-06-12 | 2015-06-12 | Customized data partitioning method for Hadoop file system data |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106250380A CN106250380A (en) | 2016-12-21 |
CN106250380B true CN106250380B (en) | 2019-08-23 |
Family
ID=57626402
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510320303.6A Active CN106250380B (en) | 2015-06-12 | 2015-06-12 | Customized data partitioning method for Hadoop file system data
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106250380B (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109655911A (en) * | 2017-10-11 | 2019-04-19 | 中国石油化工股份有限公司 | Seismic data visualization system and method based on WebService |
CN110954941B (en) * | 2018-09-26 | 2021-08-24 | 中国石油化工股份有限公司 | Automatic first arrival picking method and system |
CN114463962A (en) * | 2020-10-21 | 2022-05-10 | 中国石油化工股份有限公司 | Intelligent node data acquisition method, electronic device and storage medium |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102231155A (en) * | 2011-06-03 | 2011-11-02 | Geophysical Exploration Company, CNPC Chuanqing Drilling Engineering Co., Ltd. | Method for managing and organizing three-dimensional seismic data |
CN102508902A (en) * | 2011-11-08 | 2012-06-20 | Xidian University | Block size variable data blocking method for cloud storage system |
CN103428494A (en) * | 2013-08-01 | 2013-12-04 | Zhejiang University | Image sequence coding and recovering method based on cloud computing platform |
WO2014209375A1 (en) * | 2013-06-28 | 2014-12-31 | Landmark Graphics Corporation | Smart grouping of seismic data in inventory trees |
2015-06-12: application CN201510320303.6A granted as patent CN106250380B (status: Active)
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102231155A (en) * | 2011-06-03 | 2011-11-02 | Geophysical Exploration Company, CNPC Chuanqing Drilling Engineering Co., Ltd. | Method for managing and organizing three-dimensional seismic data |
CN102508902A (en) * | 2011-11-08 | 2012-06-20 | Xidian University | Block size variable data blocking method for cloud storage system |
WO2014209375A1 (en) * | 2013-06-28 | 2014-12-31 | Landmark Graphics Corporation | Smart grouping of seismic data in inventory trees |
CN103428494A (en) * | 2013-08-01 | 2013-12-04 | Zhejiang University | Image sequence coding and recovering method based on cloud computing platform |
Non-Patent Citations (1)
Title |
---|
Research on a Hadoop-based Distributed Storage Strategy for Seismic Data; Feng Xiang; China Master's Theses Full-text Database, Information Science and Technology; 2015-02-15; chapters 2-5 |
Also Published As
Publication number | Publication date |
---|---|
CN106250380A (en) | 2016-12-21 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106250987B (en) | Machine learning method, apparatus and big data platform | |
US10263879B2 (en) | I/O stack modeling for problem diagnosis and optimization | |
CN105446896B (en) | Cache management method and apparatus for MapReduce applications | |
US10970322B2 (en) | Training an artificial intelligence to generate an answer to a query based on an answer table pattern | |
US10373071B2 (en) | Automated intelligent data navigation and prediction tool | |
CN103177329B (en) | Rule-based determination and checking in business object processing | |
CN106250380B (en) | Customized data partitioning method for Hadoop file system data | |
US10223435B2 (en) | Data transfer between multiple databases | |
CN105988911B (en) | Establishing a trust chain in system logs | |
US11321318B2 (en) | Dynamic access paths | |
US10394788B2 (en) | Schema-free in-graph indexing | |
JP2022524006A (en) | Development and training of deep forest models | |
CN110019111A (en) | Data processing method, device, storage medium and processor | |
CN106250101A (en) | MapReduce-based prestack migration parallel processing method and apparatus | |
CN109408601B (en) | Data model conversion method based on graph data and graph data structure converter | |
US10685294B2 (en) | Hardware device based software selection | |
CN110069453A (en) | Operation and maintenance data processing method and apparatus | |
CN106570572B (en) | Travel time calculation method and device based on MapReduce | |
CN104572921B (en) | Cross-data-center data synchronization method and apparatus | |
CN109582476A (en) | Data processing method, apparatus and system | |
US20200387507A1 (en) | Optimization of database execution planning | |
CN103176843B (en) | File migration method and apparatus for MapReduce distributed systems | |
JP2023088289A (en) | Computer-implemented method for building decision tree in machine learning, computer program product including computer-readable storage medium implemented therein with program instructions, and system (for performance improvement of classification and regression trees through dimensionality reduction) | |
CN109101641A (en) | Form processing method, device, system and medium | |
CN110287977A (en) | Content clustering method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||