CN112860734A - Seismic data multi-dimensional range query method and device - Google Patents

Info

Publication number
CN112860734A
Authority
CN
China
Prior art keywords
query
data
keywords
group
seismic data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911181069.8A
Other languages
Chinese (zh)
Inventor
赵长海
文佳敏
王增波
杜吉国
尚民强
孙孝萍
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China National Petroleum Corp
BGP Inc
Original Assignee
China National Petroleum Corp
BGP Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China National Petroleum Corp and BGP Inc
Priority to CN201911181069.8A
Publication of CN112860734A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2453 Query optimisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2246 Trees, e.g. B+trees
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2291 User-Defined Types; Storage management thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/283 Multi-dimensional databases or data warehouses, e.g. MOLAP or ROLAP

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method and a device for multi-dimensional range query of seismic data. The method comprises the following steps: acquiring a group of trace header keywords of the seismic data and query condition data corresponding to each trace header keyword in the group; determining one or more pointers corresponding to the group of trace header keywords according to the query condition data corresponding to each trace header keyword in the group and a pre-established query model, wherein each pointer carries its corresponding pointer information, and the query model is pre-established according to a plurality of historical trace header keywords of the seismic data and query order data; and querying the seismic data according to the one or more pointers corresponding to the group of trace header keywords. The invention enables fast multi-dimensional range queries of seismic data, avoids accessing a large amount of redundant data during a query, improves query efficiency, and optimizes user experience.

Description

Seismic data multi-dimensional range query method and device
Technical Field
The invention relates to the technical field of high-performance computing and big data, and in particular to a seismic data multi-dimensional range query method and device.
Background
Seismic data processing is an important technology in the petroleum exploration industry. It processes seismic data acquired in the field with specific processing algorithms to obtain an image of the underground geological structure, which guides subsequent drilling and petroleum exploitation. With the continued application of new exploration technology and high-precision acquisition technology in oil exploration, the quantity of raw seismic data acquired in the field is increasing rapidly: the scale of a single data volume now exceeds the PB level, and the number of seismic traces can reach the billions. The object processed by a seismic application is typically a large seismic data volume that is logically similar to a data table in a relational database, organized row by row, with each row record called a seismic trace. A seismic trace consists of two parts, a trace header and a trace body. The trace header stores attribute information related to the seismic trace, and each attribute is called a trace header keyword. The trace body is a floating-point array, and each floating-point number is called a sample point. Because the seismic data volume is high-dimensional structured data, each seismic trace has hundreds of kinds of attribute information, stored in different trace header keywords.
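As a rough illustration of the layout described above, a seismic trace can be modeled as a header of named attributes plus a floating-point sample array. This is a minimal Python sketch, not the storage format of the embodiment; the class name `Trace` and the keyword names `shot` and `trace` are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Trace:
    """One row of a seismic data volume (hypothetical model for illustration)."""
    header: Dict[str, int]   # trace header: keyword name -> attribute value
    samples: List[float]     # trace body: one float per sample point


# A tiny data volume: each row is one seismic trace, stored in row order.
volume = [
    Trace(header={"shot": 1, "trace": 1}, samples=[0.0, 0.5, -0.2]),
    Trace(header={"shot": 1, "trace": 2}, samples=[0.1, 0.4, -0.1]),
    Trace(header={"shot": 2, "trace": 1}, samples=[0.2, 0.3, 0.0]),
]

# Selecting a partial data set by the value of one trace header keyword.
shot_1 = [row for row, t in enumerate(volume) if t.header["shot"] == 1]
```

A real trace header holds hundreds of keywords, so scanning every row as done here is exactly what an index is meant to avoid.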
However, a large number of interactive seismic applications are typically interested in only part of a seismic data volume when accessing it. Thus, many seismic data accesses specify value ranges for some attributes to select a particular data set, and may also specify that the query results be ordered by some attributes.
Since the multi-dimensional range query is the most common data query mode in seismic applications, its query speed is critical to the performance and user experience of seismic applications, especially interactive ones. Efficient index query is the basis for ensuring query efficiency and reducing seismic data query latency.
In the prior art, the query range of the first trace header keyword is usually used to determine the data range to be scanned, and the query ranges of the other keywords are used only to filter data records while the data is being scanned. When the selectivity of the first trace header keyword is low, a large amount of redundant data is accessed during the query, which seriously degrades query efficiency and user experience.
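The cost of the prior-art approach can be illustrated with a small Python sketch. It assumes a single composite index, represented here by a sorted list of `(key_a, key_b, trace_no)` tuples standing in for the leaf level of a B+ tree; the function name `first_key_scan` and the synthetic data are hypothetical.

```python
import bisect


def first_key_scan(index, a_range, b_range):
    """Scan range is fixed by the first keyword only; the second keyword just filters."""
    a_lo, a_hi = a_range
    b_lo, b_hi = b_range
    # Locate the leaf range using the first keyword alone.
    start = bisect.bisect_left(index, (a_lo,))
    stop = bisect.bisect_right(index, (a_hi, float("inf")))
    scanned = index[start:stop]
    # Every scanned record is read, even if the second keyword rejects it.
    hits = [trace_no for (_, b, trace_no) in scanned if b_lo <= b <= b_hi]
    return hits, len(scanned)


# 10 distinct key_a values x 100 distinct key_b values = 1000 index records.
index = sorted((a, b, a * 100 + b) for a in range(10) for b in range(100))

# The first keyword is not selective (it matches everything), the second is.
hits, scanned = first_key_scan(index, a_range=(0, 9), b_range=(5, 5))
```

In this synthetic case the query returns 10 traces but reads all 1000 index records, which is the redundant access the paragraph above describes.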
Disclosure of Invention
The embodiment of the invention provides a seismic data multi-dimensional range query method for quickly querying seismic data over a multi-dimensional range, avoiding access to a large amount of redundant data during a query, improving query efficiency and optimizing user experience. The method comprises:
acquiring a group of trace header keywords of the seismic data and query condition data corresponding to each trace header keyword in the group;
determining one or more pointers corresponding to the group of trace header keywords according to the query condition data corresponding to each trace header keyword in the group and a pre-established query model, wherein each pointer carries its corresponding pointer information, and the query model is pre-established according to a plurality of historical trace header keywords of the seismic data and query order data;
and querying the seismic data according to the one or more pointers corresponding to the group of trace header keywords.
The embodiment of the invention provides a seismic data multi-dimensional range query device for quickly querying seismic data over a multi-dimensional range, avoiding access to a large amount of redundant data during a query, improving query efficiency and optimizing user experience. The device comprises:
a data acquisition module, used for acquiring a group of trace header keywords of the seismic data and query condition data corresponding to each trace header keyword in the group;
a pointer determining module, used for determining one or more pointers corresponding to the group of trace header keywords according to the query condition data corresponding to each trace header keyword in the group and a pre-established query model, wherein each pointer carries its corresponding pointer information, and the query model is pre-established according to a plurality of historical trace header keywords of the seismic data and query order data;
and a data query module, used for querying the seismic data according to the one or more pointers corresponding to the group of trace header keywords.
The embodiment of the invention also provides computer equipment which comprises a memory, a processor and a computer program which is stored on the memory and can run on the processor, wherein the processor executes the computer program to realize the seismic data multi-dimensional range query method.
In the prior art, the query range of the first trace header keyword determines the data range to be scanned, and the query ranges of the other keywords are used only to filter data records during the scan. By contrast, the embodiment of the invention acquires a group of trace header keywords of the seismic data and query condition data corresponding to each trace header keyword in the group; determines one or more pointers corresponding to the group of trace header keywords according to the query condition data corresponding to each trace header keyword in the group and a pre-established query model, wherein each pointer carries its corresponding pointer information, and the query model is pre-established according to a plurality of historical trace header keywords of the seismic data and query order data; and queries the seismic data according to the one or more pointers corresponding to the group of trace header keywords. Because the pointers are determined from the query condition data of every trace header keyword together with the pre-established query model, and the seismic data is then queried through those pointers, the scanning range of the query is effectively reduced. Seismic data can thus be queried quickly over a multi-dimensional range, access to a large amount of redundant data is avoided, query efficiency is improved, and user experience is optimized.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. The drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort. In the drawings:
FIG. 1 is a schematic diagram of a seismic data multi-dimensional range query method in an embodiment of the invention;
FIG. 2 is a diagram illustrating a two-level B+ tree structure in an embodiment of the invention;
FIG. 3 is a diagram of the bitmap used in the filtering phase of a seismic data multi-dimensional range query in an embodiment of the invention;
FIG. 4 is a flow chart of a distributed index construction algorithm in an embodiment of the invention;
FIG. 5 is a diagram illustrating the structure and merging process of temporary trees in an embodiment of the invention;
FIG. 6 is a schematic diagram of a seismic data multi-dimensional range query method in an embodiment of the invention;
FIG. 7 is a schematic diagram illustrating index construction by the IndexBTreeWriter class in an embodiment of the invention;
FIG. 8 is a diagram illustrating multi-dimensional range querying by the IndexBTreeReader class in an embodiment of the invention;
FIG. 9 is a block diagram of a seismic data multi-dimensional range query device in an embodiment of the invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the embodiments of the present invention are further described in detail below with reference to the accompanying drawings. The exemplary embodiments and descriptions of the present invention are provided to explain the present invention, but not to limit the present invention.
As mentioned above, since the multi-dimensional range query is the most common data query mode in seismic applications, its query speed is critical to the performance and user experience of seismic applications, especially interactive ones. Efficient index query is the basis for ensuring query efficiency and reducing seismic data query latency. The B+ tree index is a balanced search tree designed for disks and other direct-access secondary storage devices, and it effectively reduces the number of disk I/O operations during a query. Because the B+ tree supports fast range scans along its leaf nodes, it offers good range query performance and is widely used in current seismic data queries. However, when performing a multi-dimensional range query, the B+ tree index uses only the query range of the first trace header keyword to determine the data range to be scanned, and the query ranges of the other keywords are used only to filter data records while the data is being scanned. When the selectivity of the first trace header keyword is low, a large amount of redundant data is accessed during the query, which seriously degrades query efficiency and user experience.
In order to query seismic data quickly over a multi-dimensional range, avoid accessing a large amount of redundant data during a query, improve query efficiency, and optimize user experience, an embodiment of the present invention provides a seismic data multi-dimensional range query method which, as shown in fig. 1, may include:
step 101, acquiring a group of trace header keywords of the seismic data and query condition data corresponding to each trace header keyword in the group;
step 102, determining one or more pointers corresponding to the group of trace header keywords according to the query condition data corresponding to each trace header keyword in the group and a pre-established query model, wherein each pointer carries its corresponding pointer information, and the query model is pre-established according to a plurality of historical trace header keywords of the seismic data and query order data;
step 103, querying the seismic data according to the one or more pointers corresponding to the group of trace header keywords.
As shown in fig. 1, the embodiment of the present invention acquires a group of trace header keywords of the seismic data and query condition data corresponding to each trace header keyword in the group; determines one or more pointers corresponding to the group of trace header keywords according to that query condition data and a pre-established query model; and queries the seismic data according to those pointers. Because the pointers are determined from the query condition data of every trace header keyword together with the pre-established query model, the scanning range of the query is effectively reduced. Seismic data can thus be queried quickly over a multi-dimensional range, access to a large amount of redundant data is avoided, query efficiency is improved, and user experience is optimized.
In a specific implementation, a group of trace header keywords of the seismic data and query condition data corresponding to each trace header keyword in the group are acquired.
In an embodiment, the trace header keywords comprise: shot point coordinates, receiver point coordinates, number of sample points, shot number and trace number.
In an embodiment, the query condition data corresponding to each trace header keyword may be one or more values, or a value range, and may be set as needed.
In a specific implementation, one or more pointers corresponding to the group of trace header keywords are determined according to the query condition data corresponding to each trace header keyword in the group and a pre-established query model, wherein each pointer carries its corresponding pointer information, and the query model is pre-established according to a plurality of historical trace header keywords of the seismic data and query order data; the seismic data is then queried according to the one or more pointers corresponding to the group of trace header keywords.
In an embodiment, the query model being pre-established according to a plurality of historical trace header keywords of the seismic data and query order data comprises: the query model is pre-established according to the query order data and one or more B+ trees corresponding to each of the plurality of historical trace header keywords of the seismic data.
In an embodiment, the query model is a multi-level B+ tree index structure. The multi-level B+ tree index has one level per historical trace header keyword, and each level consists of a plurality of mutually independent B+ trees built from the trace header keyword corresponding to that level. This structure ensures that, during a query, the query condition data of each trace header keyword can be used to search the B+ trees of its own level, which reduces access to redundant data and yields higher query performance.
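One possible way to represent such query condition data is sketched below in Python. The convention used here (a tuple for an inclusive range, a set for one or more discrete values) and the names `matches`, `trace_matches`, `shot` and `trace` are assumptions made for illustration and are not taken from the embodiment.

```python
def matches(value, condition):
    """Check one keyword value against its query condition.

    condition is either an inclusive (lo, hi) tuple or a set of allowed values.
    """
    if isinstance(condition, tuple):
        lo, hi = condition
        return lo <= value <= hi
    return value in condition


def trace_matches(header, conditions):
    """A trace qualifies only if every keyword in the group satisfies its condition."""
    return all(matches(header[key], cond) for key, cond in conditions.items())


# A group of two trace header keywords: a range on one, discrete values on the other.
conditions = {"shot": (10, 20), "trace": {1, 3}}
```

For example, a header with `shot` 15 and `trace` 3 satisfies both conditions, while one with `shot` 25 fails the range.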
For example, fig. 2 shows the structure of a two-level B+ tree built from a trace header keyword KeyA and a trace header keyword KeyB. The first level (level 0) contains one B+ tree, TreeA_0, built from all the distinct values of KeyA in the data volume (a_0, a_1, …, a_{n-1}, n in total). Each value a_i in the leaf nodes of TreeA_0 has a pointer to a B+ tree TreeB_i in the second level (level 1). The second level contains n B+ trees built from KeyB: TreeB_0, TreeB_1, …, TreeB_{n-1}. The KeyB values in TreeB_i come from all the seismic trace records in the data volume that satisfy KeyA = a_i. The second level is the lowest level of the multi-level B+ tree, and the leaf nodes of each TreeB_i store pointers to the corresponding locations in the data volume. Each pointer carries its corresponding pointer information, which comprises the seismic trace number.
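The two-level layout of fig. 2 can be mimicked with sorted Python lists standing in for B+ trees. This is only a structural sketch under that simplifying assumption: a real B+ tree has internal nodes and fixed-size pages, which are omitted here, and the function name `build_two_level` is hypothetical.

```python
from collections import defaultdict


def build_two_level(traces):
    """Build a two-level index from (key_a, key_b) pairs.

    The position of each pair in `traces` is used as the seismic trace number.
    Returns (tree_a0, tree_b), where tree_b[i] plays the role of TreeB_i for
    the value a_i = tree_a0[i].
    """
    groups = defaultdict(list)
    for trace_no, (a, b) in enumerate(traces):
        groups[a].append((b, trace_no))
    tree_a0 = sorted(groups)                        # distinct KeyA values a_0..a_{n-1}
    tree_b = [sorted(groups[a]) for a in tree_a0]   # one sorted KeyB "tree" per a_i
    return tree_a0, tree_b


tree_a0, tree_b = build_two_level([(2, 7), (1, 5), (2, 3), (1, 9)])
```

Here `tree_a0` holds the two distinct KeyA values, and each entry of `tree_b` lists the `(KeyB value, trace number)` pairs of the traces sharing that KeyA value.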
In an embodiment, the query model, i.e. the multi-level B+ tree index structure, has the following features:
1. the number of levels equals the number of trace header keywords;
2. starting from the top level, each level corresponds to one trace header keyword, following the order of the trace header keywords;
3. within each level, the index is organized into a plurality of mutually independent B+ trees;
4. each lower-level B+ tree is a subtree of an upper-level B+ tree, and the leaf nodes of the upper-level B+ tree store pointers to its child B+ trees;
5. the leaf nodes of the lowest-level B+ trees store pointers to the corresponding locations in the seismic data volume;
6. compared with a single-level B+ tree, the multi-level B+ tree index structure is a general index structure in which each trace header keyword independently determines how much data must be read at its own level during a query. This avoids reading redundant data, so less data is accessed over the whole query and query performance is higher.
In an embodiment, the query order data is obtained as follows: determining an approximate value of the data quantity to be read for each of a plurality of preset ordering schemes; and obtaining the query order data according to the approximate value determined for each preset ordering scheme.
The inventors have found that the order in which the trace header keywords are used to build the query model greatly affects the performance of the model when querying seismic data. In order to select the most suitable index structure, the query performance of specific index candidates (i.e. the preset ordering schemes) needs to be evaluated.
The prior art proposes the concept of the three-star index; an ideal index should satisfy all three of its conditions, which are as follows:
1. if the index rows that a query needs to access are adjacent, or at least close enough together, the index earns the first star; this minimizes the amount of index data that must be scanned;
2. if the order of the index rows matches the order required by the query, the index earns the second star; an index satisfying this condition avoids a sort operation;
3. if the index rows contain all the trace header keywords in the query, no further reads from the storage device are needed while querying the index, and the index earns the third star.
However, when selecting an index, the first star and the second star usually cannot both be satisfied. To minimize the index data read, the trace header keyword that selects the least data must be placed first in the index, which may make the order of the index rows differ from the order the query requires. Therefore, the present invention proposes three preset ordering schemes as follows:
scheme A: place the trace header keyword with the best selectivity first, then add the trace header keywords that require ordering in the correct order, and finally add the remaining trace header keywords involved in the query to the index in any order;
scheme B: place the trace header keywords that require ordering at the front of the index, in order, and then add the remaining trace header keywords involved in the query in any order;
scheme C: divide the query into two phases, a filtering phase and a scanning phase. In the filtering phase, an index containing only the in-line trace header keyword is used to find all pointers to valid seismic traces. These pointers are used to construct a bitmap, as shown in fig. 3, that marks all valid seismic traces. In the scanning phase, an index containing all the trace header keywords in the ordering requirement is used: the leaf nodes at its lowest level are scanned sequentially, and all seismic trace records marked as valid in the bitmap are selected as the final result.
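The two phases of scheme C can be sketched in Python as follows. The sketch assumes two prebuilt indexes represented as plain lists: `filter_index`, a sorted list of `(filter_key_value, trace_no)` pairs, and `sort_index`, the trace numbers in the leaf order of the ordering index. These names, the `bytearray` used as a bitmap, and the sample values are hypothetical.

```python
import bisect


def filter_phase(filter_index, lo, hi, n_traces):
    """Mark every trace whose filter keyword lies in [lo, hi] in a bitmap."""
    bitmap = bytearray(n_traces)  # one byte per trace; a real bitmap would pack bits
    start = bisect.bisect_left(filter_index, (lo,))
    stop = bisect.bisect_right(filter_index, (hi, float("inf")))
    for _, trace_no in filter_index[start:stop]:
        bitmap[trace_no] = 1
    return bitmap


def scan_phase(sort_index, bitmap):
    """Walk the ordering index sequentially and keep only the marked traces."""
    return [trace_no for trace_no in sort_index if bitmap[trace_no]]


# Six traces; the list position is the trace number.
offset = [30, 10, 20, 40, 10, 20]   # keyword used for filtering
cdp = [5, 3, 4, 1, 2, 6]            # keyword that defines the required order

filter_index = sorted((v, t) for t, v in enumerate(offset))
sort_index = [t for _, t in sorted((c, t) for t, c in enumerate(cdp))]

bitmap = filter_phase(filter_index, lo=10, hi=20, n_traces=len(offset))
result = scan_phase(sort_index, bitmap)
```

The result is already in the required order, so no sort is needed, at the price of one sequential pass over the ordering index.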
During querying, the relative performance of the three preset ordering schemes changes dynamically with the characteristics of the data and the query conditions. In order to select the most appropriate index, the embodiment of the invention determines an approximate value of the data quantity to be read for each of the preset ordering schemes, and obtains the query order data from these values. The inventors have found that query performance depends largely on the size of the data that is read. Therefore, when an application queries data, the approximate value of the data quantity to be read (ERS) is determined for each preset ordering scheme, and the scheme with the smallest ERS is selected as the query order data. The specific steps are as follows:
1. except for the top-level B+ tree, the number of B+ trees to be searched in each level is determined by the number of valid index entries obtained from the query of the level above;
2. the number of query results at each level is estimated from the query range of each trace header keyword;
3. the node size of each B+ tree in the multi-level B+ tree is preset, so the height of the B+ trees in each level can be estimated from the node size and the number of distinct values of that level's trace header keyword, and from this the data quantity that must be read during the search is estimated. The data quantities read at each level are accumulated to obtain the ERS. Since scheme A must also account for sorting time, the dynamic index strategy records, each time a query statement is executed, the proportion of query time spent on sorting, and computes its average value avr_sort_ratio. The ERS value of scheme A is then adjusted to ERS × (1 + avr_sort_ratio);
4. the ERS values of the three schemes are compared to obtain the query order data.
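The estimation steps above might be sketched in Python as follows. The cost model is an assumption made for illustration: the fan-out is taken as node size divided by a fixed entry size, each search reads one node per internal tree level plus the leaf nodes covering the matched entries, and the default values of `node_size` and `entry_size` are arbitrary. The source does not specify these details, and the function names are hypothetical.

```python
import math


def tree_height(distinct_values, fanout):
    """Estimated height of a B+ tree holding `distinct_values` keys."""
    height, capacity = 1, fanout
    while capacity < distinct_values:
        capacity *= fanout
        height += 1
    return height


def estimate_read_size(levels, node_size=4096, entry_size=16):
    """Accumulate the estimated bytes read over all levels, top to bottom.

    levels: list of (distinct_values_per_tree, matched_entries_per_tree).
    """
    fanout = node_size // entry_size
    trees_to_search = 1           # only one tree at the top level
    total = 0
    for distinct, matched in levels:
        leaf_nodes = math.ceil(matched / fanout)
        nodes_per_tree = tree_height(distinct, fanout) - 1 + leaf_nodes
        total += trees_to_search * nodes_per_tree * node_size
        trees_to_search *= matched  # each valid entry points to one lower tree
    return total


def choose_scheme(ers_by_scheme, avr_sort_ratio):
    """Pick the scheme with the smallest ERS after penalising scheme A for sorting."""
    adjusted = dict(ers_by_scheme)
    adjusted["A"] = adjusted["A"] * (1 + avr_sort_ratio)
    return min(adjusted, key=adjusted.get)


selective_first = estimate_read_size([(1000, 10), (1000, 500)])
selective_last = estimate_read_size([(1000, 500), (1000, 10)])
```

Under this model, placing the more selective keyword at the top level gives a much smaller estimate, and a large `avr_sort_ratio` can tip the choice away from scheme A.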
In an embodiment, the user directly submits the filtering conditions and the ordering conditions of the query, without needing to know whether scheme A, scheme B or scheme C is used.
In an embodiment, the index selection process is hidden from the user, and the most appropriate index scheme is computed automatically.
In an embodiment, the multi-level B+ tree, that is, the query model, may be constructed quickly using the computing resources of a plurality of computing nodes, by means of a distributed index construction algorithm based on the MapReduce programming model. The B+ trees in all levels other than the top level of the multi-level B+ tree are independent of one another, so the tree building process can be divided into multiple subtasks based on the value of the first trace header keyword and executed concurrently. Using the computing resources of multiple nodes, the index can be built quickly and concurrently, which improves index construction efficiency and the user experience of interactive applications. The flow of the distributed index construction algorithm is shown in fig. 4, and the specific steps are as follows:
1. in the Map stage, the trace header data is divided evenly into M data slices, each data slice represents a Map task, and the tasks are then randomly assigned to Map Workers;
2. after receiving a Map task, a Map Worker executes the Map function to process the data slice: it reads all the trace headers in the slice and generates a key/value pair for each seismic trace. The Key consists of the first trace header keyword of the index; the values of the other trace header keywords in the index are stored in the Value field, which also contains the pointer information pointing to the seismic trace;
3. the Map Worker uses the partition function hash(key) mod R to divide the locally generated key/value pairs into R groups, each group belonging to one Reduce task, and then sends each group of data to the corresponding Reduce Worker;
4. the Reduce Worker receives the key/value pairs sent from the Map Workers, then executes the Reduce function and builds a temporary tree locally. The top level of the temporary tree is organized as an ordered array, and the structure of the other levels is the same as that of a complete multi-level B+ tree;
5. after the Reduce Workers have generated all temporary trees, the master process merges them; the structure and merging process of the temporary trees are shown in fig. 5. After the top-level arrays of the temporary trees are merged, a new B+ tree is built as the top-level B+ tree of the multi-level B+ tree. The leaf nodes of the top-level B+ tree store pointers to the location of each lower-level subtree, and each pointer includes information such as the file identifier and the storage offset.
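The five steps above can be simulated in a single Python process to show the data flow. This is a sketch only: there is no real MapReduce framework or worker distribution here, sorted lists stand in for B+ trees, the index is limited to two keywords, and all function names are hypothetical.

```python
from collections import defaultdict


def map_slice(header_slice):
    """Map: emit (first keyword value, (other keyword value, trace pointer)) per trace."""
    return [(a, (b, trace_no)) for trace_no, a, b in header_slice]


def partition(pairs, r):
    """Split key/value pairs into R groups with hash(key) mod R."""
    groups = [[] for _ in range(r)]
    for key, value in pairs:
        groups[hash(key) % r].append((key, value))
    return groups


def reduce_group(pairs):
    """Reduce: build a temporary tree, an ordered top-level array over sorted subtrees."""
    subtrees = defaultdict(list)
    for key, value in pairs:
        subtrees[key].append(value)
    return [(key, sorted(subtrees[key])) for key in sorted(subtrees)]


def merge(temp_trees):
    """Master: merge the top-level arrays; keys never overlap between reducers."""
    merged = [entry for tree in temp_trees for entry in tree]
    merged.sort(key=lambda entry: entry[0])
    return merged


def build_index(headers, m, r):
    """headers: list of (trace_no, key_a, key_b); m map slices, r reduce tasks."""
    size = max(1, -(-len(headers) // m))   # ceiling division for even slices
    slices = [headers[i:i + size] for i in range(0, len(headers), size)]
    reducer_inputs = [[] for _ in range(r)]
    for header_slice in slices:
        for i, group in enumerate(partition(map_slice(header_slice), r)):
            reducer_inputs[i].extend(group)
    return merge([reduce_group(group) for group in reducer_inputs])


headers = [(0, 2, 7), (1, 1, 5), (2, 2, 3), (3, 1, 9), (4, 3, 1)]
index = build_index(headers, m=2, r=2)
```

Because the partition function sends every pair with the same first keyword value to the same reducer, each lower-level subtree is built entirely by one reducer, which is what allows the subtasks to run concurrently.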
In an embodiment, when an application executes a query, the most appropriate query model needs to be selected. If that query model does not exist, a distributed index building program is started to create the index. A system administrator can configure a list of usable nodes for the index building program, and the program occupies only a small part of the computing resources of each node, so that the operation of other applications is not affected.
In an embodiment, as shown in fig. 6, the top-level B+ tree is searched according to the query condition of its corresponding trace header keyword to find the valid index records in its leaf nodes. The pointer information in these index records is then used to find the set of lower-level child B+ trees. Each B+ tree found is searched in turn according to the query condition of its corresponding trace header keyword to find the pointer information of its valid index records, and it is judged whether the nodes pointed to belong to the lowest level. If not, the search continues in the next set of lower-level child B+ trees. Otherwise, the lowest-level B+ trees are searched according to the query condition of their corresponding trace header keyword, the final seismic data corresponding to all valid index entries is found, and the query is complete.
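The level-by-level descent of fig. 6 can be sketched as a short recursive function in Python. The sketch assumes the index is a nested structure of sorted lists standing in for B+ trees, with one inclusive range per level; the function name `query` and the sample index are hypothetical.

```python
import bisect


def query(tree, ranges):
    """Search a multi-level index.

    tree: sorted list of (key_value, child); child is a lower-level tree, or a
          seismic trace number at the lowest level.
    ranges: one inclusive (lo, hi) range per level, from top to bottom.
    """
    (lo, hi), rest = ranges[0], ranges[1:]
    keys = [key for key, _ in tree]
    start = bisect.bisect_left(keys, lo)
    stop = bisect.bisect_right(keys, hi)
    hits = []
    for _, child in tree[start:stop]:      # only the valid index records
        if rest:
            hits.extend(query(child, rest))  # descend into the child tree
        else:
            hits.append(child)               # lowest level: pointer to a trace
    return hits


index = [
    (1, [(5, 1), (9, 3)]),
    (2, [(3, 2), (7, 0)]),
    (3, [(1, 4)]),
]
traces = query(index, [(1, 2), (4, 8)])
```

Each level narrows the search with its own range, so subtrees whose top-level key falls outside the first range are never opened.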
In an embodiment, the index is constructed by the IndexBTreeWriter class, as shown in Fig. 7. The specific steps are as follows:
1. calling the constructor IndexBTreeWriter(const IndexAttr& index_attr, HeadInfo* head_info) to create an IndexBTreeWriter object;
2. calling the OpenWrite(int64_t diff_key_num) function to estimate, from the filtering condition and the sorting condition of the query, the total number of index slices that each of the three index selection schemes would need to read, and then selecting from scheme A, scheme B and scheme C the index method best suited to the query;
3. repeatedly calling the WriteOneIndex(const IndexElement& index) function to add the index slice generated for each piece of seismic data according to the query, until all index slices have been constructed.
In an embodiment, the multi-dimensional range query is implemented by the IndexBTreeReader class, as shown in Fig. 8. The specific steps are as follows:
1. calling the constructor IndexBTreeReader(const IndexAttr& index_attr, HeadInfo* head_info) to create an IndexBTreeReader object;
2. calling the GetValidTraces(const RowFilterByKey& row_filter, std::vector<int64_t>* valid_trace_ar) function to query the existing multi-level B+ tree index and record all traces that satisfy the filtering condition in the integer vector pointed to by valid_trace_ar.
In the embodiment, each level of the multi-level B+ tree contains a plurality of mutually independent B+ trees. This structure ensures that, during a query, the B+ trees of each level can be searched using the query condition of the corresponding trace header keyword, which reduces access to redundant data and yields higher query performance.
Based on the same inventive concept, an embodiment of the present invention further provides a seismic data multi-dimensional range query device, as described in the following embodiments. Because the principle by which the device solves the problem is similar to that of the seismic data multi-dimensional range query method, the implementation of the device may refer to the implementation of the method, and repeated descriptions are omitted.
Fig. 9 is a block diagram of a seismic data multidimensional range query device in an embodiment of the invention, and as shown in fig. 9, the device includes:
a data obtaining module 901, configured to obtain a group of trace header keywords of the seismic data and query condition data corresponding to each trace header keyword in the group of trace header keywords;
a pointer determining module 902, configured to determine one or more pointers corresponding to the group of trace header keywords according to the query condition data corresponding to each trace header keyword in the group of trace header keywords and a pre-established query model, where the one or more pointers carry pointer information corresponding to each pointer, and the query model is pre-established according to a plurality of historical trace header keywords of the seismic data and query order data;
and a data query module 903, configured to query the seismic data according to the one or more pointers corresponding to the group of trace header keywords.
In one embodiment, the trace header keywords include: shot point coordinates, receiver point coordinates, sampling point number, shot number and trace number.
In one embodiment, the query order data is obtained as follows:
determining, for each of a plurality of preset sorting schemes, an approximate value of the amount of data to be read under that scheme;
and obtaining the query order data according to the approximate amount of data to be read under each preset sorting scheme.
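The two steps above can be illustrated with a short sketch. It assumes that the scheme with the smallest estimated read volume is the one chosen, which the embodiment does not state explicitly; the names `SchemeEstimate` and `ChooseScheme` are hypothetical, and the estimates are taken as given inputs rather than computed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical description of one preset sorting scheme: the order in which the
// trace header keywords are indexed, and the approximate amount of data a query
// would have to read under that order.
struct SchemeEstimate {
    std::vector<std::string> key_order;
    int64_t estimated_reads;
};

// Returns the index of the scheme with the smallest estimated read volume
// (assumed selection rule). Ties are resolved in favor of the earlier scheme.
std::size_t ChooseScheme(const std::vector<SchemeEstimate>& schemes) {
    assert(!schemes.empty());
    std::size_t best = 0;
    for (std::size_t i = 1; i < schemes.size(); ++i) {
        if (schemes[i].estimated_reads < schemes[best].estimated_reads) best = i;
    }
    return best;
}
```

The `key_order` of the chosen scheme then plays the role of the query order data: it fixes which trace header keyword is searched at which level of the index.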
In one embodiment, the query model being pre-established according to a plurality of historical trace header keywords of the seismic data and query order data comprises: the query model is pre-established according to the query order data and one or more B+ trees corresponding to each of the plurality of historical trace header keywords of the seismic data.
In summary, in the embodiments of the present invention, a group of trace header keywords of the seismic data and the query condition data corresponding to each trace header keyword in the group are obtained; one or more pointers corresponding to the group of trace header keywords are determined according to the query condition data corresponding to each trace header keyword in the group and a pre-established query model, wherein the one or more pointers carry pointer information corresponding to each pointer, and the query model is pre-established according to a plurality of historical trace header keywords of the seismic data and query order data; and the seismic data is queried according to the one or more pointers corresponding to the group of trace header keywords. Because the pointers are first determined from the query condition data and the pre-established query model and the seismic data is then queried according to the determined pointers, the scanning range of the query is effectively reduced, the seismic data can be queried over a multi-dimensional range, access to a large amount of redundant data during the query is avoided, query efficiency is improved, and the user experience is optimized.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above-mentioned embodiments are intended to illustrate the objects, technical solutions and advantages of the present invention in further detail, and it should be understood that the above-mentioned embodiments are only exemplary embodiments of the present invention, and are not intended to limit the scope of the present invention, and any modifications, equivalent substitutions, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (10)

1. A seismic data multidimensional range query method is characterized by comprising the following steps:
acquiring a group of trace header keywords of the seismic data and query condition data corresponding to each trace header keyword in the group of trace header keywords;
determining one or more pointers corresponding to the group of trace header keywords according to the query condition data corresponding to each trace header keyword in the group of trace header keywords and a pre-established query model, wherein the one or more pointers carry pointer information corresponding to each pointer, and the query model is pre-established according to a plurality of historical trace header keywords of the seismic data and query order data;
and querying the seismic data according to the one or more pointers corresponding to the group of trace header keywords.
2. The method of claim 1, wherein the trace header keywords comprise: shot point coordinates, receiver point coordinates, sampling point number, shot number and trace number.
3. The method of claim 1, wherein the query order data is obtained as follows:
determining, for each of a plurality of preset sorting schemes, an approximate value of the amount of data to be read under that scheme;
and obtaining the query order data according to the approximate amount of data to be read under each preset sorting scheme.
4. The method of claim 1, wherein the query model being pre-established according to a plurality of historical trace header keywords of the seismic data and query order data comprises: the query model is pre-established according to the query order data and one or more B+ trees corresponding to each of the plurality of historical trace header keywords of the seismic data.
5. A seismic data multidimensional range query device, comprising:
a data acquisition module, configured to acquire a group of trace header keywords of the seismic data and query condition data corresponding to each trace header keyword in the group of trace header keywords;
a pointer determining module, configured to determine one or more pointers corresponding to the group of trace header keywords according to the query condition data corresponding to each trace header keyword in the group of trace header keywords and a pre-established query model, wherein the one or more pointers carry pointer information corresponding to each pointer, and the query model is pre-established according to a plurality of historical trace header keywords of the seismic data and query order data;
and a data query module, configured to query the seismic data according to the one or more pointers corresponding to the group of trace header keywords.
6. The apparatus of claim 5, wherein the trace header keywords comprise: shot point coordinates, receiver point coordinates, sampling point number, shot number and trace number.
7. The apparatus of claim 5, wherein the query order data is obtained as follows:
determining, for each of a plurality of preset sorting schemes, an approximate value of the amount of data to be read under that scheme;
and obtaining the query order data according to the approximate amount of data to be read under each preset sorting scheme.
8. The apparatus of claim 5, wherein the query model being pre-established according to a plurality of historical trace header keywords of the seismic data and query order data comprises: the query model is pre-established according to the query order data and one or more B+ trees corresponding to each of the plurality of historical trace header keywords of the seismic data.
9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the method of any one of claims 1 to 4 when executing the computer program.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program for executing the method of any one of claims 1 to 4.
CN201911181069.8A 2019-11-27 2019-11-27 Seismic data multi-dimensional range query method and device Pending CN112860734A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911181069.8A CN112860734A (en) 2019-11-27 2019-11-27 Seismic data multi-dimensional range query method and device

Publications (1)

Publication Number Publication Date
CN112860734A true CN112860734A (en) 2021-05-28

Family

ID=75985537

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911181069.8A Pending CN112860734A (en) 2019-11-27 2019-11-27 Seismic data multi-dimensional range query method and device

Country Status (1)

Country Link
CN (1) CN112860734A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114153851A (en) * 2021-12-06 2022-03-08 智慧足迹数据科技有限公司 GEOHASH indexing method, GEOHASH indexing device, computer equipment and storage medium

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020107860A1 (en) * 2000-11-29 2002-08-08 Gobeille Robert C. Data structure and storage and retrieval method supporting ordinality based searching and data retrieval
CN101676899A (en) * 2008-09-18 2010-03-24 上海宝信软件股份有限公司 Profiling and inquiring method for massive database records
CN102073727A (en) * 2011-01-12 2011-05-25 中国石油集团川庆钻探工程有限公司 Method for describing seismic data
CN102890722A (en) * 2012-10-25 2013-01-23 国家电网公司 Indexing method applied to time sequence historical database
CN105550241A (en) * 2015-12-07 2016-05-04 珠海多玩信息技术有限公司 Multidimensional database query method and apparatus
CN109343117A (en) * 2018-11-10 2019-02-15 北京科胜伟达石油科技股份有限公司 Double buffer dual-thread seismic data display methods
CN109446293A (en) * 2018-11-13 2019-03-08 嘉兴学院 A kind of parallel higher-dimension nearest Neighbor

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
祝媛媛;: "地震数据可视化技术的优化研究与开发", 西部探矿工程, vol. 27, no. 04, pages 76 - 79 *
邵彬;来志强;彭波;: "磁盘存储SEG-D地震数据格式的解编方法", 中国石油和化工标准与质量, vol. 31, no. 02, pages 33 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination