CN114371501A - SEG D (seismic isolation image) segmentation classification mixed compression method for mass seismic source data - Google Patents

SEG D (seismic isolation image) segmentation classification mixed compression method for mass seismic source data Download PDF

Info

Publication number
CN114371501A
CN114371501A CN202011114683.5A CN202011114683A CN114371501A CN 114371501 A CN114371501 A CN 114371501A CN 202011114683 A CN202011114683 A CN 202011114683A CN 114371501 A CN114371501 A CN 114371501A
Authority
CN
China
Prior art keywords
data
seismic
compression
seg
value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011114683.5A
Other languages
Chinese (zh)
Inventor
徐雷良
徐维秀
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Petrochemical Corp
Sinopec Oilfield Service Corp
Sinopec Petroleum Engineering Geophysics Co Ltd
Sinopec Petroleum Engineering Geophysics Co Ltd Shengli Branch
Original Assignee
China Petrochemical Corp
Sinopec Oilfield Service Corp
Sinopec Petroleum Engineering Geophysics Co Ltd
Sinopec Petroleum Engineering Geophysics Co Ltd Shengli Branch
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Petrochemical Corp, Sinopec Oilfield Service Corp, Sinopec Petroleum Engineering Geophysics Co Ltd, Sinopec Petroleum Engineering Geophysics Co Ltd Shengli Branch filed Critical China Petrochemical Corp
Priority to CN202011114683.5A priority Critical patent/CN114371501A/en
Publication of CN114371501A publication Critical patent/CN114371501A/en
Pending legal-status Critical Current

Links

Images

Classifications

    • GPHYSICS
    • G01MEASURING; TESTING
    • G01VGEOPHYSICS; GRAVITATIONAL MEASUREMENTS; DETECTING MASSES OR OBJECTS; TAGS
    • G01V1/00Seismology; Seismic or acoustic prospecting or detecting
    • G01V1/28Processing seismic data, e.g. for interpretation or for event detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Remote Sensing (AREA)
  • Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Acoustics & Sound (AREA)
  • Environmental & Geological Engineering (AREA)
  • Geology (AREA)
  • General Life Sciences & Earth Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Geophysics (AREA)
  • Geophysics And Detection Of Objects (AREA)

Abstract

The invention provides an SEG D (seismic isolation image) segmentation, classification and mixed compression method for mass seismic source data, which comprises the following steps: decoding the SEG D data file, and performing Huffman coding on the acquired quality control related parameters; recording the take-off time and amplitude information in the specific time window of the auxiliary track; sequentially solving the time difference correction factor of each seismic channel in the array, performing time difference correction on the seismic channel samples by using the time difference correction factor, and calculating the variation coefficient of the samples after the time difference correction of the array and the last array; if the variation coefficient is larger than the threshold value, integral wavelet transform is carried out on the sampled data, the wavelet coefficient is quantized by utilizing the dynamic threshold value, and the quantized data is subjected to arithmetic coding; if the variation coefficient is not larger than the threshold value, carrying out lossless coding on the two-row data difference value; and carrying out code grouping on the segmented and classified data according to a certain rule to form a character stream. The method meets the requirement of quality control aging of mass earthquake acquisition sites, and does not cause distortion due to data compression.

Description

SEG D (seismic isolation image) segmentation classification mixed compression method for mass seismic source data
Technical Field
The invention relates to the technical field of oil-gas seismic exploration and development, in particular to an SEG D (seismic isolation image) segmented classification mixed compression method for mass seismic source data.
Background
Seismic data compression methods are gradually emerging and developing as the amount of oil and gas seismic survey data increases. Before and after the beginning of this century, the compression technology of data transmission of a data coding and acquisition system of a seismic acquisition instrument has been reported and invented sporadically, and the compression research of seismic data of which the main content is transformed by wavelet in the later period has been long and not decayed.
At present, the following modes mainly exist in seismic data compression: starting from the seismic acquisition signals, the seismic signals are appropriately transformed, such as wavelet transformation, and the acquisition system group codes and the formatted data are based on the transformed values; starting from the seismic acquisition system, compressing the channel data stream of the seismic acquisition equipment so as to improve the transmission efficiency; extracting information representing the attributes of the SEG Y file of the general dumped seismic data, realizing the random access of the data indoors and saving the storage space; data compression methods, which are dedicated to data processing and maintain spatial attribute information, are commonly used for data processing. Through retrieval, relevant reports about massive seismic source data compression of a seismic acquisition field are rarely seen at present (the seismic source data refers to seismic data directly generated by a field seismic acquisition system, generally refers to SEG D standard, and individual equipment adopts SEG Y standard); the data compression method for indoor data processing is particularly characterized in that the data compression and reconstruction method is driven by a compressed sensing technology.
From the compression technique employed, seismic data compression mainly takes the following approaches: the wavelet transform method is used for seismic data compression for a long time, a large number of compression technologies appear from early to present, the seismic data compression thinking is different greatly by utilizing the wavelet transform, and the difference of data compression ratio and fidelity is larger by utilizing different wavelet bases, code combination modes and coding algorithms; compressed sensing is a new technology generated along with the progress of seismic exploration technology, although the time is short, research results are rich, the literature amount of massive seismic data compressed sensing reconstruction is more and more, research hotspots are to perform compressed sensing reconstruction on seismic data based on methods and technologies in different fields, the methods in the fields comprise wavelet transformation, Fourier discrete cosine transformation and the like, and at present, the data reconstruction of compressed sensing by combining with the wavelet transformation is a great trend; the method is characterized in that the seismic single shot record is regarded as an image, and the seismic single shot record is compressed by utilizing an image compression technology, wherein JPEG 2000 and an improved algorithm thereof, and a multi-level tree set splitting algorithm (SPIHT) are rather representative algorithms; in addition, the method for compressing the seismic data and compressing the seismic data without loss through the parting technology and the chaos theory is also researched, but the application effect is not obvious. In conclusion, the massive seismic data compression methods which are characterized by wavelet transformation and compressed sensing are more productive, and the methods effectively explore the aspects of ensuring the data noise-signal ratio and improving the compression ratio and are tested in engineering practice.
The existing seismic data compression methods and technologies are more, but the data compression for field seismic data monitoring is lack of pertinence, and the result is the phoenix-hair unicorn. In addition, some common methods, for example, the error of the seismic data reconstruction based on the prediction filtering method is large, and the seismic data reconstruction based on some mathematical transformations is low in precision; in the field of compressed sensing, the data reconstruction aims at obtaining seismic data with better imaging on the premise of reducing data volume, so that the starting point and the footfall point of the data reconstruction are different from seismic source data compression oriented to on-site quality monitoring, and the compressed sensing seismic data compression is not suitable for on-site quality monitoring of the seismic source data due to the difference of the target and means; the wavelet transformation utilizes different wavelet bases to compress data to obtain different application effects, however, the quality control requirements based on seismic source data are different from the methods, and the field quality control needs to take compression ratio and compression and decompression time into consideration; the method is similar to seismic image data compression, particularly adopts JPEG 2000 technology to carry out data compression to obtain certain technical results, but the seismic data is far from enough to be treated as common image data, and the information characteristics reflected by the seismic data are submerged by a simple image compression mode.
In the application No.: the Chinese patent application of CN201310570935.9 relates to a method for compressing massive seismic data in parallel, which belongs to the technical field of petroleum seismic exploration data processing and is characterized in that a CPU is used for converting three-dimensional seismic data to obtain seismic data volume data, the seismic data volume data is decomposed and secondarily decomposed to obtain seismic data subblocks and seismic data blocks, the seismic data blocks are transmitted to a GPU, a visual perception model is used for parallel compression, and the compression results of all the seismic data blocks are fused to obtain the compression result of the whole seismic data. The visual perception model adopted by the patent is an intuitive understanding of human naked eyes on seismic data, therefore, in the patent, the seismic data are firstly converted into voxel data, the voxel is taken as a compression object, the seismic data are divided into a strong edge block, a detail block and a flat sliding block according to the entropy value and the variance of data blocks, different visual weights are given to the strong edge block, the detail block and the flat sliding block, wavelet transformation is adopted for compression, the priority of the most visual coefficient is ensured, the typical application of image compression in the seismic data is provided, and the image compression covers the physical characteristics of the seismic data such as frequency, signal-to-noise ratio and the like, so the method has no universality; in addition, at present, seismic acquisition is widely distributed in various complex earth surface and geological conditions, the entropy and variance calculated by dividing data subblocks and blocks with fixed sizes blur the boundary between large local visual variation in the blocks and large overall visual difference of the blocks, and the established visual weight greatly influences the final compression effect.
In the application No.: in chinese patent application No. cn201310495175.x, a method and an apparatus for compressing massive seismic data with spatial attribute information are provided, which are applied to the technical field of seismic data compression processing, and the method includes: determining seismic data meeting preset compression conditions as seismic data to be compressed; determining a compression ratio according to the parameters of the observation system, the attribute of the common-center-point CMP gather surface element, the form of an underground interface and the characteristics of a seismic wave field; determining the direction in which the seismic wave field is most densely distributed; according to the data attribute distribution condition, carrying out grid division on the seismic data according to the spatial position; selecting a compression function according to the seismic data distribution condition and/or the compression requirement; and compressing the seismic data to be compressed in each grid along the direction of the most dense seismic wave field spatial distribution by using the compression function and the compression ratio. The compression formula is only linear processing of the whole numerical value of the seismic channel sampling data, and the goal of reducing the seismic data volume is achieved by deleting a large number of seismic channels which do not have strong influence on the spatial attribute, so that the compressed data cannot be reconstructed at all, and the influence on the seismic data processing in the later period is serious due to the fact that the shot-geophone relationship is completely changed, and the seismic data processing is effective only on a specific target.
In the application No.: CN201810010947.9, chinese patent application, relates to a seismic data compression and reconstruction method, which comprises the following steps: firstly, performing wavelet transformation on seismic data to increase the compressibility of the seismic data; then, constructing a measurement matrix which can be realized by hardware according to the chaotic sequence, and carrying out compression observation on the wavelet-transformed seismic data by using the chaotic measurement matrix; and finally, improving a Bayesian wavelet tree structure compressed sensing reconstruction algorithm, and recovering complete seismic data by using the improved BTSWCS-vb algorithm. The method is used for data compression of seismic sampling data flow depending on hardware, a measuring matrix used for compression is realized by hardware, and a mode of collecting and compressing simultaneously is adopted, so that in order to ensure reconstruction accuracy, the compression ratio is low and is less than 1:5 at most, and the requirement of field quality control of seismic source data is difficult to meet.
Therefore, a new SEG D sectional classification mixed compression method for mass seismic source data is invented, and the technical problems are solved.
Disclosure of Invention
The invention aims to provide an SEG (seismic source data) segmented, classified and mixed compression method for providing an engineering practical data compression solution for monitoring and analyzing the quality of a mass seismic acquisition site.
The aim of the invention can be achieved by the following technical measures: the SEG D sectional classification mixed compression method for the mass seismic source data comprises the following steps: step 1: decoding the SEG D data file, acquiring all quality control related parameters from various header blocks and track headers, and performing Huffman coding on the parameters; step 2: recording the takeoff time and amplitude information in a specific time window for the auxiliary track data; and step 3: according to the requirement of a work area exploration target, eliminating sampling data irrelevant to quality control in arrangement according to a null range; and 4, step 4: calculating time difference correction factors of seismic channels in the array, performing time difference correction on the seismic channels, and calculating variation coefficients of all samples of the array and all samples of the previous array after time difference correction of all channels of the array; and 5: if the variation coefficient is larger than a set threshold value, performing integer normalization processing on the array samples; step 6: performing integer wavelet transformation on the normalized data by using Daubechies 9 wavelet basis functions; and 7: calculating a dynamic threshold value by using the wavelet coefficients, quantizing the wavelet coefficients by adopting a multilevel tree set splitting algorithm, and limiting precision control by using controlled factors; and 8: carrying out arithmetic coding on the quantized data, and counting coding summary information; and step 9: if the coefficient of variation obtained in step 4 is not greater than the set threshold, carrying out differential coding and normalization on the difference value between the array and the previous array sample; step 10: performing Huffman coding on the normalized data, and counting coding summary information; step 11: and carrying out code grouping on the segmented and classified data according to a certain rule to form a character stream.
The object of the invention can also be achieved by the following technical measures:
in step 1, a general head block is firstly de-coded, channel group positions and data channel positions are calculated, auxiliary channels and data channels are identified, various head blocks and channel heads are then de-coded, channel head marking words, seismic channel index information and other parameters irrelevant to field data analysis and quality control are eliminated, and then Hoffman coding is carried out on quality control parameters.
In step 2, the auxiliary track data is scanned and typical data of the reference signal and the verification signal, including the jump-off time and amplitude, within a specific time window are recorded and directly used for group coding.
In step 3, the data elimination of the space-time range is determined based on the monitoring target, and the eliminated content is far-bias arrangement or seismic channels and deep seismic data.
In step 4, the moveout correction factor is used for moveout correction of the spread seismic trace data, and the calculation formula is as follows:
Figure BDA0002727850080000041
in the formula: Δ ts,iIs the time difference correction factor of the ith track in the current arrangement s and has the unit of s, px、pyIs the horizontal and vertical coordinates of the shot point and has the unit of m, ps,i(x)、ps,i(y) is the horizontal and vertical coordinates of the ith track in the current arrangement s, the unit is m, and v is the speed of the low deceleration belt, and the unit is m/s;
the coefficient of variation reflects the degree of similarity of two adjacent seismic trace samples, and the calculation formula is as follows:
Figure BDA0002727850080000051
in the formula: cv is the coefficient of variation, M is the number of the seismic traces in the permutation, and σ is the standard deviation, and is defined as follows:
Figure BDA0002727850080000052
in the formula: t is the maximum sampling number, x, of the seismic channel after the elimination of the arranged datas,i(tj)、xs-1,i(tj) Respectively are sampling values of the jth discrete point of the ith track in the current permutation s and the last permutation s-1.
In step 5, the setting of the threshold is closely related to the type of the seismic source, the earth surface and the underground geological conditions and is determined by analyzing single shot test data before construction; the normalization processing calculation formula is as follows:
Figure BDA0002727850080000053
in the formula: z is a radical ofs,i(tj) Normalizing the ith sampled data in the ith channel in the current permutation s; int [. C]Is an integral function; k is a compression bit, which is a positive integer value no greater than 32, typically 8, 16, or 24; x is the number ofs,i(tj) Sampling values of jth discrete points of an ith channel in the current arrangement s; min (x)s,i)、max(xs,i) Respectively, the minimum sample value and the maximum sample value of the ith track in the current permutation s.
In step 6, the integer wavelet transform adopts Daubechies 9 wavelet basis functions, and the number of decomposition layers is 3-5.
In step 7, the dynamic threshold of the multilevel tree set splitting algorithm is calculated by the wavelet coefficient, and the calculation formula of the dynamic threshold is as follows:
Figure BDA0002727850080000054
in the formula: t ispIs a dynamic threshold, p is the cycle number, p is 0,1,2, …, k-1;
Figure BDA0002727850080000055
is a rounded down function; lg (-) is a logarithmic function with base 10; max (·) is a function of the maximum; wfAnd (a, b) represents wavelet coefficients, wherein a is a time scale and b is a frequency scale.
In step 7, the controlled factor is obtained from the compression position, the compression time, and the mean square error of the data before and after compression, and the calculation formula is as follows:
Figure BDA0002727850080000061
in the formula: CF is a controlled factor, unit: db; xi is a compression time factor used for regulating and controlling compression time and distortion degree, 0< xi <100 is taken, k is still a compression bit, MSE is a mean square error, and the definition is as follows:
Figure BDA0002727850080000062
in the formula: m is the number of seismic channels after the current arrangement data are removed, T is the maximum sampling number of the seismic channels after the current arrangement data are removed, and zs,i(tj)、
Figure BDA0002727850080000068
Respectively are the data before and after the compression of the ith sampling point of the current permutation s.
In step 8, the statistical summary information content includes the space-time range M and T of data elimination, compression bit k, compression time factor xi, normalization factor, seismic channel sampling extreme value min (x)s,i) And max (x)s,i)。
In step 9, if the coefficient of variation obtained in step 4 is not greater than the set threshold, performing differential encoding on the difference between the array and the previous array sample and performing integer normalization, where the normalization calculation formula is:
Figure BDA0002727850080000063
in the formula: z is a radical ofs,i(tj) Normalizing the ith sampled data in the ith channel in the current permutation s; int [. C]Is an integral function; k is a compression position;
Figure BDA0002727850080000064
the difference value of the jth sampling value of the ith channel in the current permutation s and the corresponding sampling value of the previous permutation is obtained;
Figure BDA0002727850080000065
the minimum value and the maximum value of the ith track difference value in the current permutation s.
In step 10, the normalized data is huffman coded, and the coded summary information is counted, the content includes the minimum value of the threshold, the compression bit and the sampling difference value
Figure BDA0002727850080000066
And maximum value
Figure BDA0002727850080000067
In step 11, the group code is organized according to the sequence of the data in step 1, step 2, step 8 and step 10, wherein a corresponding serialization rule is established for each type of information.
The SEG D segmentation classification mixed compression method for the mass seismic source data is oriented to the seismic source data and has strong pertinence; the compression target is used for efficient quality monitoring and data analysis of field seismic data, and the balance among a noise-signal ratio, a distortion rate, a compression ratio and compression time is required; in the compression of seismic sampling, the similarity of adjacent array samples is fully considered, namely the more similar the adjacent array samples are, the higher the compression ratio is, and the shorter the compression time is. Therefore, the invention aims to provide a practical data compression solution for field quality monitoring and analysis of large-data-volume single-gun data acquisition in the field of mass seismic exploration and acquisition engineering, so that the field aging requirement is met, and the data cannot be seriously distorted due to compression.
Drawings
FIG. 1 is a flow chart of an embodiment of the SEG D segmented classification hybrid compression method for mass seismic source data of the present invention;
FIG. 2 is a comparison of data compression before and after single shot records using a compression bit k of 8 and a controlled factor CF of 90 in an embodiment of the present invention.
Detailed Description
In order to make the aforementioned and other objects, features and advantages of the present invention comprehensible, preferred embodiments accompanied with figures are described in detail below.
As shown in fig. 1, fig. 1 is a flow chart of the SEG D segmentation classification hybrid compression method for mass seismic source data according to the present invention.
In step 1, the SEG D data file is de-encoded, all quality control related parameters are obtained from various header blocks and track headers, and Huffman coding is carried out on the parameters;
and eliminating marked channel head characters, seismic channel index information and other parameters irrelevant to field data analysis and quality control, particularly redundant data in a standard, and then carrying out Huffman coding on the quality control parameters.
In step 2, recording the takeoff time and amplitude information in a specific time window for the auxiliary track data;
typical data, including the jump-off time and amplitude, of the reference signal and the verification signal within a particular time window in the secondary track are recorded and used directly for group coding.
In step 3, according to the requirements of the exploration target of the work area, eliminating sampling data which are arranged and are irrelevant to quality control according to the empty range;
according to the requirement of a work area exploration target, sampling data irrelevant to quality control in arrangement are removed according to a space range, and generally removed contents are remote arrangement or seismic traces and deep seismic data.
In step 4, time difference correction is carried out on the seismic channel by solving the time difference correction factor of the seismic channel in the array, and after time difference correction is carried out on all channels in the array, the variation coefficients of all samples of the array and all samples of the previous array are calculated;
sequentially solving the time difference correction factors of the seismic channels in the current arrangement, and performing time difference correction on each seismic channel, wherein the calculation formula is as follows:
Figure BDA0002727850080000071
in the formula: Δ ts,iIs the time difference correction factor of the ith track in the current arrangement s and has the unit of s, px、pyIs the horizontal and vertical coordinates of the shot point and has the unit of m, ps,i(x)、ps,iAnd (y) is the ith horizontal and vertical coordinates in the current arrangement s, the unit is m, and v is the speed of the low deceleration belt, and the unit is m/s.
Then, obtaining the variation coefficient of all samples of the last arrayed seismic trace, wherein the calculation formula is as follows:
Figure BDA0002727850080000081
in the formula: cv is the coefficient of variation, M is the number of the seismic traces in the permutation, and σ is the standard deviation, and is defined as follows:
Figure BDA0002727850080000082
in the formula: t is the maximum sampling number, x, of the seismic channel after the data of the array s are removeds,i(tj)、xs-1,i(tj) Sampling values of the jth discrete point of the ith track in the current permutation s and the last permutation s-1.
In step 5, if the coefficient of variation is greater than the set threshold, performing integer normalization on the permutation sample, where the calculation formula of the normalization process is:
Figure BDA0002727850080000083
in the formula: z is a radical ofs,i(tj) Normalizing the ith sampled data in the ith channel in the current permutation s; int [. C]Is an integral function; the compression bit k is a positive integer value no greater than 32, typically 8, 16 or 24; x is the number ofs,i(tj) Sampling values of jth discrete points of an ith channel in the current arrangement s; min (x)s,i)、max(xs,i) The minimum sample value and the maximum sample value of the ith track in the current permutation s.
In step 6, integer wavelet transform is carried out on the normalized data by using Daubechies 9 wavelet basis functions;
the integer wavelet transform adopts Daubechies 9 wavelet basis functions, and the number of wavelet decomposition layers is preferably 3-5.
In step 7, calculating a dynamic threshold value by the wavelet coefficient, quantizing the wavelet coefficient by adopting a multilevel tree set Splitting (SPIHT) algorithm, and controlling the precision by a controlled factor;
calculating a dynamic threshold according to the wavelet coefficients, and quantizing the wavelet coefficients by adopting a multi-level tree set Splitting (SPIHT) algorithm, wherein the calculation formula of the dynamic threshold is as follows:
Figure BDA0002727850080000084
in the formula: t ispIs a dynamic threshold, p is the cycle number, p is 0,1,2, …, k-1;
Figure BDA0002727850080000091
is a rounded down function; lg (-) is a logarithmic function with base 10; max (·) is a function of the maximum; wfAnd (a, b) represents wavelet coefficients, wherein a is a time scale and b is a frequency scale.
The precision control is determined by a controlled factor, which is defined by a compression position, a compression time and a mean square error of data before and after compression, and a calculation formula is as follows:
Figure BDA0002727850080000092
in the formula: CF is a controlled factor (unit: db), xi is a compression time factor, mainly controls compression time and distortion, generally takes 0< xi <100, k is still a compression bit, MSE is a mean square error, and is defined as follows:
Figure BDA0002727850080000093
in the formula: m is the number of seismic channels after the data of the array s are removed, T is the maximum sampling number of the seismic channels after the data of the array s are removed, and z iss,i(tj)、
Figure BDA0002727850080000094
Respectively are the data before and after the compression of the ith sampling point in the current arrangement s.
In step 8, performing arithmetic coding on the quantized data, and counting coding summary information;
carrying out arithmetic coding on the quantized data, and counting coding summary information, wherein the content comprises space-time ranges M and T of data elimination, a compression bit k, a compression time factor xi, a normalization factor, and a sampling extreme value min (x) of each seismic channel in the current arrangement ss,i) And max (x)s,i)。
In step 9, if the coefficient of variation obtained in step 4 is not greater than the set threshold, performing differential coding and integer normalization on the difference between the array and the previous array sample, where the normalization calculation formula is:
Figure BDA0002727850080000095
in the formula: z is a radical ofs,i(tj) Normalizing the ith sampling difference value of the ith channel in the current permutation s; int [. C]Is an integral function; k is a compression position;
Figure BDA0002727850080000096
the difference value of the jth sampling value of the ith channel in the current permutation s and the corresponding sampling value of the previous permutation s-1 is obtained;
Figure BDA0002727850080000097
the minimum value and the maximum value of the ith track difference value in the current permutation s.
In step 10, the normalized data is huffman coded, and the coded summary information is counted, the content includes the compression bit and the minimum value of the sampling difference value in the current arrangement s
Figure BDA0002727850080000101
And
Figure BDA0002727850080000102
in step 11, after all the arrangements are finished in steps 3-10, the data obtained in step 1, step 2, step 8 and step 10 are subjected to code combination according to a certain rule to form a character stream.
The group code is organized according to the data sequence in the steps 1,2, 8 and 10, wherein a corresponding serialization rule is established for each type of information.
In one embodiment of the present invention, the method comprises the following steps:
(1) inputting an original seismic record SEG D file, distinguishing versions 1.0, 2.1 and 3.0 according to SEG D related standards, performing de-editing on the file, calculating channel group positions and data channel positions, identifying auxiliary channels and data channels, then performing de-editing on various channel heads, and identifying file header blocks, channel headers and sampling data through processing, wherein the file header blocks, the channel headers and the sampling data comprise data indexes and redundant information, and quality control parameters and non-quality control parameters are particularly distinguished;
(2) screening keywords, selecting keyword data related to quality control to perform Huffman coding, wherein the keyword data comprises a shot position and a sampling rate;
(3) for auxiliary channel data, identifying and extracting reference signal and verification signal data in a specific time window, and for vibroseis acquisition, scanning type data which are directly used for code grouping;
(4) processing according to the array in sequence, setting a monitoring range according to the exploration target requirement, and removing far-offset array or seismic channel and deep seismic data;
(5) according to the low velocity-drop zone velocity, calculating the time difference correction factor of each seismic channel in the current arrangement, and performing time difference correction on each seismic channel, wherein the calculation formula is
Figure BDA0002727850080000103
In the formula,. DELTA.ts,iIs the time difference correction factor of the ith track in the current arrangement s and has the unit of s, px、pyIs the horizontal and vertical coordinates of the shot point and has the unit of m, ps,i(x)、ps,i(y) is the horizontal and vertical coordinates of the ith track in the current arrangement s, the unit is m, and v is the speed of the low deceleration belt, and the unit is m/s;
(6) the coefficient of variation of all seismic trace samples of the current array and the last array is obtained, and the calculation formula is
Figure BDA0002727850080000104
In the formula, cv is the coefficient of variation, M is the number of seismic traces in the current permutation s, σ is the standard deviation, which is defined as
Figure BDA0002727850080000105
In the formula, T is the maximum sampling number of the seismic channel after the data of the current array s is removed, xs,i(tj)、xs-1,i(tj) Sampling values of the jth discrete point of the ith channel in the current permutation s and the last permutation s-1;
(7) if the variation coefficient is larger than the set threshold value, the permutation sampling is processed with integer normalization, the threshold value is related to earthquake exciting factors, earth surface and geological conditions, environmental noise, permutation length and recording time, the value can be easily obtained through test analysis before formal production, and the normalization calculation formula is
Figure BDA0002727850080000111
In the formula, zs,i(tj) Normalizing the ith sampled data in the ith channel in the current permutation s; int [. C]Is an integral function; k is a compression bit, which is a positive integer value no greater than 32, typically 8, 16, or 24; x is the number ofs,i(tj) Sampling values of jth discrete points of an ith channel in the current arrangement s; min (x)s,i)、max(xs,i) The minimum sampling value and the maximum sampling value of the ith channel in the current permutation s are obtained;
(8) performing integer wavelet transform on the normalized data by using a Daubechies 9 wavelet basis function, wherein the number of layers of wavelet decomposition is 3;
(9) for coefficients with different frequencies, a multilevel tree set Splitting (SPIHT) algorithm is used for performing wavelet coefficient quantization processing, and a dynamic threshold is defined as
Figure BDA0002727850080000112
In the formula: t ispIs a dynamic threshold, p is the cycle number, p is 0,1,2, …, k-1;
Figure BDA0002727850080000113
is a rounded down function; lg (-) is a logarithmic function with base 10; max (·) is a function of the maximum;Wf(a, b) represents wavelet coefficients, a is a time scale, b is a frequency scale, wherein p is 0 and is an initial threshold;
(10) the controlled factor controls the data compression precision, which is defined by the compression ratio, the compression quality and the compression time, and the calculation formula is
Figure BDA0002727850080000114
In the formula, CF is a controlled factor, and the units db and xi are compression time factors, mainly controlling compression time and distortion degree, and generally taking 0<ξ<100, k is still the compression bit, MSE is the mean square error, defined as
Figure BDA0002727850080000115
In the formula, M is the number of seismic channels with the data of the current array s removed, T is the maximum sampling number of the seismic channels with the data of the current array s removed, and zs,i(tj)、
Figure BDA0002727850080000116
Respectively compressing the data of the ith sampling point in the current arrangement s before and after compression;
(11) coding the obtained quantized sampling data by using arithmetic coding, and counting the related summary information of the arrangement in the above process, including the arrangement serial number, space-time range M and T, compression bit k, compression time factor xi, normalization factor, ith sampling extreme value min (x) in the current arrangement ss,i) And max (x)s,i);
(12) If the variation coefficient in (6) is not larger than the set threshold, the difference value between the array and the previous array sample is differentially encoded and integer normalized, and the normalization formula is
Figure BDA0002727850080000121
In the formula, zs,i(tj) Normalizing the ith sampled data in the ith channel in the current permutation s; int [. C]Is an integral function; k is a compression position;
Figure BDA0002727850080000122
for the ith track j in the current permutation sThe difference between the sample value and the corresponding sample of the previous permutation s-1;
Figure BDA0002727850080000123
the minimum value and the maximum value of the difference value of the ith track in the current permutation s are obtained;
(13) performing Huffman coding on the normalized data, and counting coding summary information including compression bit and minimum value of sampling difference values in current arrangement s
Figure BDA0002727850080000124
And maximum value
Figure BDA0002727850080000125
(14) The data in the above (2), (3), (11) and (13) are coded according to a certain formatting rule, and the rules comprise: the content is organized according to the file header, the auxiliary channel information, the summary information of each permutation code, the seismic channel header summary information and the coding data, and the sizes of the items are respectively as follows: the file header is 256B, the auxiliary trace is 64B, the arrangement is 128B, the seismic trace header 32B and the seismic trace sampling data are not limited, the size of the data can be properly adjusted, and finally, a compressed file is formed for output.
The character stream file formed by the method can be directly used for seismic source data SEG D compression, and provides an efficient and relatively fidelity compression technology for field seismic data monitoring and analysis.
In one embodiment of the present invention, FIG. 2 is a comparison of seismic data compressed by the present method using a compression bit of 8 and a controlled factor of 90, where the upper graph is the original seismic single shot (local), the middle graph is the seismic record (local) of the corresponding left graph after compression and decompression, and the lower graph is the difference between the compressed and original seismic records. Table 1 shows a comparison of parameters for seismic data compression using multi-threading in parallel with 8-bit compression and different controlled factors.
TABLE 1 parameter Table resulting from 8-bit compression and different controlled factor compression
Figure BDA0002727850080000126
Figure BDA0002727850080000131
A plurality of three-dimensional earthquake acquisition work areas in east old oil areas and west desert areas of China are tested to show that: the data compression is carried out by adopting a compression bit of 8 and a controlled factor of 60 dB, the compression ratio is about 16.32, the noise-signal ratio is only about 0.09%, the average time consumption of ten thousand seismic data compression is 2.71s, and the requirements of quality control, analysis and data dump of on-site mass seismic source data are completely met.
The invention discloses a SEG D (seismic source data-based) segmented, classified, mixed and combined compression method for massive seismic source data applied to a field, which provides effective support for analysis and quality monitoring of massive seismic acquisition field data and greatly improves the production efficiency. At present, the main targets of oil and gas exploration and development in China are lithology, hidden oil and gas reservoirs, complex fault blocks, small fractures, micro-amplitude structures and other complex areas, and high-quality seismic data are needed as supports. With the progress of oil-gas seismic exploration technology, seismic equipment, computers and network technology, the acquisition of large data volume of a work area and instantaneous single-shot big data flow becomes the main characteristic of the acquisition of massive seismic data, and the traditional seismic data monitoring mode meets the challenge in the massive acquisition and needs to accelerate the field data analysis and quality control processing rate through data compression. The seismic source data SEG D is compressed respectively according to the track head and the sampling data and then is coded, particularly the similarity of the seismic data arranged adjacently is considered, the SEG D data compression can be realized rapidly and efficiently, and the requirement of on-site near-real-time quality control is met. Firstly, acquiring quality control related parameters from SEG D file de-encoding to perform Huffman encoding; recording key parameters of the seismic auxiliary channel data; cutting the sampling data in a space-time dimension to obtain quality control range data; solving a time difference correction factor of the current arrangement, performing time difference correction on all seismic channels, and calculating the variation coefficient of the arrangement and the previous arrangement; if the coefficient is larger than the set threshold, calculating to obtain wavelet coefficient by integer wavelet transform for each arranged data, quantizing the wavelet coefficient by dynamic threshold, controlling the precision by controlled factors limited by compression position, compression time and mean square error of data before and after compression, and arithmetically encoding the quantized data; if the coefficient of variation is smaller than the set threshold, the integer normalization is carried out on the difference value between the array and the last array sample, differential coding is carried out firstly, then Huffman coding is carried out, and the relevant information of the process is recorded; different codes and related information of the keywords, the auxiliary channels and the seismic sampling data are combined according to a certain rule to form a compressed character stream for field data analysis and quality control.
The above-mentioned embodiment is only one of the preferred embodiments of the present invention, and general changes and substitutions by those skilled in the art within the technical scope of the present invention are included in the protection scope of the present invention.

Claims (13)

1. The SEG D sectional classification mixed compression method for the mass seismic source data is characterized by comprising the following steps of:
step 1: decoding the SEG D data file, acquiring all quality control related parameters from various header blocks and track headers, and performing Huffman coding on the parameters;
step 2: recording the takeoff time and amplitude information in a specific time window for the auxiliary track data;
and step 3: according to the requirement of a work area exploration target, eliminating sampling data irrelevant to quality control in arrangement according to a null range;
and 4, step 4: calculating time difference correction factors of seismic channels in the array, performing time difference correction on the seismic channels, and calculating variation coefficients of all samples of the array and all samples of the previous array after time difference correction of all channels of the array;
and 5: if the variation coefficient is larger than a set threshold value, performing integer normalization processing on the array samples;
step 6: performing integer wavelet transformation on the normalized data by using Daubechies 9 wavelet basis functions;
and 7: calculating a dynamic threshold value by using the wavelet coefficients, quantizing the wavelet coefficients by adopting a multilevel tree set splitting algorithm, and limiting precision control by using controlled factors;
and 8: carrying out arithmetic coding on the quantized data, and counting coding summary information;
and step 9: if the coefficient of variation obtained in step 4 is not greater than the set threshold, carrying out differential coding and normalization on the difference value between the array and the previous array sample;
step 10: performing Huffman coding on the normalized data, and counting coding summary information;
step 11: and carrying out code grouping on the segmented and classified data according to a certain rule to form a character stream.
2. The SEG D (SEG-classification-mixing-compression) method for massive seismic source data according to claim 1, wherein in the step 1, a general header block is firstly de-encoded, channel group positions and data channel positions are calculated, auxiliary channels and data channels are identified, then various types of header blocks and channel headers are de-encoded, marked channel header characters, seismic channel index information and other parameters irrelevant to field data analysis and quality control are removed, and then Hoffman encoding is carried out on quality control parameters.
3. The SEG D (SEG-classification-mixing-compression) method for compressing seismic data according to claim 1, wherein in step 2, auxiliary trace data are scanned, and typical data of reference signals and verification signals in a specific time window, including jump-off time and amplitude, are recorded and directly used for code grouping.
4. The SEG D segmentation, classification, mixing and compression method for massive seismic source data as claimed in claim 1, wherein in step 3, the elimination of the spatiotemporal data is determined based on the monitoring target, and the eliminated contents are far-offset arrays or seismic traces and deep seismic data.
5. The SEG D segmentation classification hybrid compression method for massive seismic source data according to claim 1, wherein in step 4, the moveout correction factor is used for moveout correction of the seismic trace data, and the calculation formula is as follows:
Figure FDA0002727850070000021
in the formula: Δ ts,iIs the time difference correction factor of the ith track in the current arrangement s and has the unit of s, px、pyIs the horizontal and vertical coordinates of the shot point and has the unit of m, ps,i(x)、ps,i(y) is the horizontal and vertical coordinates of the ith track in the current arrangement s, the unit is m, and v is the speed of the low deceleration belt, and the unit is m/s;
the coefficient of variation reflects the degree of similarity of two adjacent seismic trace samples, and the calculation formula is as follows:
Figure FDA0002727850070000022
in the formula: cv is the coefficient of variation, M is the number of the seismic traces in the permutation, and σ is the standard deviation, and is defined as follows:
Figure FDA0002727850070000023
in the formula: t is the maximum sampling number, x, of the seismic channel after the elimination of the arranged datas,i(tj)、xs-1,i(tj) Respectively are sampling values of the jth discrete point of the ith track in the current permutation s and the last permutation s-1.
6. The SEG D (SEG-classification-mixing-compression) method for massive seismic source data according to claim 1, wherein in the step 5, the setting of the threshold is closely related to the type of a seismic source, the surface and the underground geological conditions and is determined by analyzing single shot test data before construction; the normalization processing calculation formula is as follows:
Figure FDA0002727850070000024
in the formula: z is a radical ofs,i(tj) Normalizing the ith sampled data in the ith channel in the current permutation s; int [. C]Is an integral function; k is a compression bit, which is a positive integer value no greater than 32, typically 8, 16, or 24; x is the number ofs,i(tj) Sampling values of jth discrete points of an ith channel in the current arrangement s; min (x)s,i)、max(xs,i) Respectively, the minimum sample value and the maximum sample value of the ith track in the current permutation s.
7. The SEG D (SEG-classification-mixing) compression method for massive seismic source data according to claim 1, wherein in the step 6, the integer wavelet transform adopts Daubechies 9 wavelet basis functions, and the number of decomposition layers is 3-5.
8. The SEG D (SEG-classification-mixing) compression method for massive seismic source data according to claim 1, wherein in step 7, the dynamic threshold of the multilevel tree set splitting algorithm is calculated from wavelet coefficients, and the calculation formula of the dynamic threshold is as follows:
Figure FDA0002727850070000031
in the formula: t ispIs a dynamic threshold, p is the cycle number, p is 0,1,2, …, k-1;
Figure FDA0002727850070000032
is a rounded down function; lg (-) is a logarithmic function with base 10; max (·) is a function of the maximum; wfAnd (a, b) represents wavelet coefficients, wherein a is a time scale and b is a frequency scale.
9. The SEG D (SEG-classification-mixing) compression method for massive seismic source data according to claim 8, wherein in step 7, the controlled factors are obtained from the compression position, the compression time and the mean square error of the data before and after compression, and the calculation formula is as follows:
Figure FDA0002727850070000033
in the formula: CF is a controlled factor, unit: db; xi is a compression time factor used for regulating and controlling compression time and distortion degree, 0< xi <100 is taken, k is still a compression bit, MSE is a mean square error, and the definition is as follows:
Figure FDA0002727850070000034
in the formula: m is the number of seismic channels after the current arrangement data are removed, T is the maximum sampling number of the seismic channels after the current arrangement data are removed, and zs,i(tj)、
Figure FDA0002727850070000035
Respectively are the data before and after the compression of the ith sampling point of the current permutation s.
10. The SEG D (SEG-classification-mixing-compression) method for seismic data mass according to claim 1, wherein in step 8, the statistical summary information content comprises space-time ranges M and T of data elimination, compression bit k, compression time factor xi, normalization factor, seismic trace sampling extreme value min (x) and seismic trace sampling extreme value xis,i) And max (x)s,i)。
11. The SEG D (SEG-classification-mixing-compression) method for massive seismic source data according to claim 1, wherein in step 9, the difference between the array and the previous array sample is differentially encoded and normalized by an integer, and the normalization calculation formula is as follows:
Figure FDA0002727850070000041
in the formula: z is a radical ofs,i(tj) Normalizing the ith sampled data in the ith channel in the current permutation s; int [. C]Is an integral function; k is a compression position;
Figure FDA0002727850070000042
the difference value of the jth sampling value of the ith channel in the current permutation s and the corresponding sampling value of the previous permutation is obtained;
Figure FDA0002727850070000043
the minimum value and the maximum value of the ith track difference value in the current permutation s.
12. The SEG D (SEG-classified mixed compression method for massive seismic source data) according to claim 1, wherein in the step 10, the normalized data is subjected to Huffman coding, and coding summary information is counted, wherein the content comprises a threshold value, a compression position and a minimum value of a sampling difference value
Figure FDA0002727850070000044
And maximum value
Figure FDA0002727850070000045
13. The SEG D (SEG-classification-mixing-compression) method for massive seismic source data according to claim 1, wherein in step 11, the group codes are organized according to the sequence of the data in steps 1,2, 8 and 10, and a corresponding serialization rule is formulated for each type of information.
CN202011114683.5A 2020-10-16 2020-10-16 SEG D (seismic isolation image) segmentation classification mixed compression method for mass seismic source data Pending CN114371501A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011114683.5A CN114371501A (en) 2020-10-16 2020-10-16 SEG D (seismic isolation image) segmentation classification mixed compression method for mass seismic source data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011114683.5A CN114371501A (en) 2020-10-16 2020-10-16 SEG D (seismic isolation image) segmentation classification mixed compression method for mass seismic source data

Publications (1)

Publication Number Publication Date
CN114371501A true CN114371501A (en) 2022-04-19

Family

ID=81138554

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011114683.5A Pending CN114371501A (en) 2020-10-16 2020-10-16 SEG D (seismic isolation image) segmentation classification mixed compression method for mass seismic source data

Country Status (1)

Country Link
CN (1) CN114371501A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116405037A (en) * 2023-03-28 2023-07-07 昆明理工大学 Astronomical star table-oriented compression preprocessing encoder and application

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116405037A (en) * 2023-03-28 2023-07-07 昆明理工大学 Astronomical star table-oriented compression preprocessing encoder and application
CN116405037B (en) * 2023-03-28 2024-04-30 昆明理工大学 Astronomical star table-oriented compression preprocessing encoder and application

Similar Documents

Publication Publication Date Title
US20200394450A1 (en) An enhanced graph transformation-based point cloud attribute compression method
CN111046341B (en) Unconventional natural gas fracturing effect evaluation and productivity prediction method based on principal component analysis
CN108897975B (en) Method for predicting gas content of coal bed gas logging based on deep belief network
CN102186069B (en) Remote sensing image data compression method capable of maintaining measurement performance
CN110320555B (en) Seismic data reconstruction method
CN113865868B (en) Rolling bearing fault diagnosis method based on time-frequency domain expression
CN108960333A (en) Lossless compression method for high spectrum image based on deep learning
CN110248190A (en) A kind of compressed sensing based multilayer residual error coefficient image encoding method
CN107240136A (en) A kind of Still Image Compression Methods based on deep learning model
RU2004101179A (en) ANALYSIS OF NMR DATA OF REPEATED MEASUREMENTS BASED ON MAXIMUM ENTROPY
Donoho et al. High-performance seismic trace compression
CN114492521A (en) Intelligent lithology while drilling identification method and system based on acoustic vibration signals
CN114371501A (en) SEG D (seismic isolation image) segmentation classification mixed compression method for mass seismic source data
CN111523572A (en) Real-time bridge structure damage state identification method and system
CN112836393B (en) Method for analyzing reservoir heterogeneity based on multi-scale entropy
CN108390871B (en) Radar data compression method based on autoregressive model frame prediction
CN113935240A (en) Artificial seismic wave simulation method based on generative confrontation network algorithm
CN110907996B (en) Automatic dense gas reservoir identification method
CN114885036B (en) Real-time lossy compression method and system for ground penetrating radar data
CN105528623A (en) Imaging spectrum image sparse representation method based on ground object class classification redundant dictionary
CN115269679A (en) Multidimensional time series overall complexity evaluation method
CN114896468A (en) File type matching method and intelligent data entry method based on neural network
Braverman et al. Semi-streaming quantization for remote sensing data
Purohit Image Compression Using Wavelets and Vector Quantization Techniques
Bogachev et al. Method of Reversible Compression of Frames of Measurement Data Based on Parquet Partition

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination