CA1173145A

CA1173145A - Method of seismic exploration

Info

Publication number: CA1173145A
Application number: CA000389084A
Authority: CA
Inventors: Aaron J. Davis
Original assignee: Mobil Oil Corp
Current assignee: ExxonMobil Oil Corp
Priority date: 1980-12-31
Filing date: 1981-10-30
Publication date: 1984-08-21
Also published as: GB2090408B; GB2090408A; FR2497357B1; NO814321L; DE3149771A1; FR2497357A1

Abstract

METHOD OF SEISMIC EXPLORATION
Abstract A seismic data compression technique comprises sampling each individual seismic trace, operating upon a set number of samples to generate a predicted sample, quantizing the difference between the next sample and the predicted value of the sample, and storing and/or transmitting the quantum number so that the amount of information which need be stored and/or transmitted is limited. In a preferred feature, a linear prediction differential pulse code modulation scheme is used to provide the predicted value, while an adaptive quantization scheme is used to quantize the error value to be transmitted, thus yielding further improvements in accuracy. A
feedback loop can be applied to the decompression operation to limit quantization noise and further improve the fidelity of representation of the decompressed signals.

Description

~ 31 ~ 5 F-0746 L -l-METHDD OF SEISMIC EXPLORATION
.
This invention relates to a method of seismic exploration, and more particularly to a method o~ compressing large quantities of digital data resulting from seismic exp.loration so as to be transmitted and/or stored more efficiently.
It has been common for many years to perform seismic exploration ~or oil, gas and other minerals, typical techniques involving generating an acoustic wave at the surface of the earth or ocean. This wave travels downwardly into the earth and is reflected upwardly from subterranean layers of rock towards the surface of the earth or ocean where its return may be detected. Typically, the output of the detectors are analog electrical signals which are converted into digital form and are recorded. The recorded signals can then be analyzed and used to yield a picture of the subterranean structure of the earth which can be interpreted by geophysicists in their search for oil, gas and other valuable minerals.
Within this field of seismic exploration, it is desirable to generate ever better pictures which in turn requires more digital data to be recorded. At the present time, a typical shipboard exploration system will involve the recordal of samples from 208 trailing geophones, each sampled every 4 msec. to 16 bit accuracy.
Recordal of this data proceeds at such a rate that an entire reel of magnetlc tape is filled with such data in the order of lO minutes.
Clearly it would be desirable to reduce the amount of data ~o be stored, if this could be accomplished without loss of accuracy.
A further impetus for the development of useful data compression schemes is that it is frequently desirable that the data is transmitted from an ocean-going exploration vessel to a home base computer for analysis, while the ocean-going vessel is still in the area of exploration. In this way, if the data is of particular interest, or if for some reason it was improperly recorded requiring re-exploration, the ship can return to the area of exploration without having to travel an excessive distance. However, to transmit full digital representations of the seismic data recorded ~' 1~7~ 5 as mentioned above is prohibitively expensive using present day transmission facilities, and this situation is not likely to improve. This is therefore another area in which compression of seismic data could be of use.
Clearly, in order to be useful, data compression techniques must not be permitted to distort the signal too greatly upon decompression. Two particular factors are of concern here; firstly, the RMS difference between the full word trace and the same trace after compression and decompression; and secondly, the correlation of the difference trace with a trace of full word accuracy. The RMS
error describes quantitatively how much distortion is present in the compressed and decompressed traces. The correlation indicates whether the errors made will stack out (that is, whether the errors will be reduced upon employment of prior art techniques to reduce random noise in seismic records by summing) and, if they will not, what will be the pattern of the distortion.
Prior art techniques of data compression in seismic systems have failed to provide adequate results. These techniques have fallen into two basic categories. One class involves frequency domain techniques, such as shown best in "Seismic Data Compression Methods" by L.C. Woods in Geophysics, 30, No. 4 (1974) pp. 499-525, and the other involves time-domain techniques. Use has been made of gain bit only quantization, that is, characterization of the full word by the exponent of the power of 2 nearest the sample so that only the exponent needs to be transmitted. Clearly, this may involve errors equal to half the difference between successive powers of 2. Moreover, the errors are correlated with the amplitude of the signal - as the amplitude of the signal rises, so too does the probable error. Two variations of differential pulse code modulation (DPCM) have also been tried, known as simple DPCM and running sum DPCM. These techniques in general attempt to use the fact that the difference between successive samples is smaller than the samples themselves. However, the running sum DPCM method, although very fast, results in rather high RMS noise. Simple DPCM

~L~ ~7~3~ ~ S

has smaller distortion but requires a long low cut filter to eliminate propagation errors. Accordingly, while DPCM techniques are useful, there remains a need in the seismic art for improvement in the technique employed.
~ pplication of DPCM techniques and speech encoding are discussed generally by N.S. Jayant in "Digital Coding of Speech Waveforms: PCM, DPCM and DM Quantizers", Proceedings of the IEEE, 62, No. 5 (May, 1964), but in connection with speech waveform coding and not seismic data coding.
It will be appreciated that seismic data is of a generally repetitive character, that is, it involves a sinusoidal wave convolved with a reflectivity series. As such, the wave nature of the records can be utilized in data compression and it need not be assumed that the data are random. It is this fact which is exploited by the differential pulse code modulation techniques mentioned above. The advantages of differential pulse code modulation techniques over gain bit only coding or sign bit coding, in which a sign only is transmitted, indicating whether a given sample is positive or negative, are threefold. Firstly, the errors are many tlmes smaller than those that occur when quantizing the full wave~orm into a fixed number of binary gain bits. Secondly, the errors are much less correlated than with either gain bit coding or slgn bit coding, so that it is less likely that the errors will be enhanced when seismic data processing is performed thereon.
Thirdly, differential pulse code modulation techniques are less sensitive to low frequency noise than are the gain bit and sign bit coding techniques and are therefore much more suitable for f-k filtering, a well known technique to reduce noise and commonly employed in seismic data processing. Therefore, it is desirable to use some form of differential pulse code modulation, but as mentioned above, neither simple nor running sum DPCM techniques yield results of sufficiently low distortion to be useful.
The present invention seeks to provide a method of seismic exploration in which data can be recorded at an in-field location , ~

1~'7~ 5 and transmitted to a home base location for processing which is economical of transmission time but not sacrifical of accuracy. In this way, the results of the processing can be transmitted back to the in-field location during exploration so that if it is necessary to resurvey the area or if an area proves to be of particular interest indicating further study, the appropriate decision can be made while the exploration team, whether a land-based or shipboard crew, is still in the vicinity of interest. In accordance with the invention, this is achieved by an improved method of differential pulse code modulation coding for compression of the seismic data.
The present invention therefore provides a method of seismic exploration in which acoustic energy is transmitted through a subterranean structure, energy reflected from a subterranean interface and returned upwardly to the surface is detected, electrical output data signals are generated in response to that detected energy and the data signals are compresses for storage and/or transmission, the step of data compression comprising the steps of sampling the signals at a fixed rate;
analyzing pluralities of output samples successively in accordance with a prediction operator to yield a predicted value for the next input sample;
comparing the next input sample with the corresponding predicted value to yield a prediction error value;
quantizing the prediction error value to yield a quantum number; and storing and/or transmitting the quantum number;
and the step of data compression being such that upon inversion of the operator for decompression of the stored quantum numbers, feedback may be applied to the inverted operator to cause noise in the quantized samples to pass unfiltered.
Thus, the present invention overcomes the disadvantages of the known systems described above by using a seismic data compression scheme which utilizes linear prediction differential 3~.~5 pulse code modulation coding by exploiting the convoluted nature of seismic data. In this method, a given number of samples are quantized and analyzed to yield a predicted next sample. The difference between the actual next sample and the predicted next sample is then encoded and transmitted. The quantized sample then becomes the last of the preceding samples, used to predict the next actual sample. This technique is referred to as linear prediction waveform coding. It is used in con~unction with an adaptive quantization scheme in which the difference between the predicted and the actual samples is compared with à predetermined number of "bins" or "windows" into which the signal value may be expected to fall. The windows are expanded or contracted in accordance with the resùlts of this comparison. In this way the linear prediction coding scheme with uniform interval adaptive quantization results in the least RMS error as well as the most random correlation of errors with data, as compared with a quantizaticn technique which adapts itself to the signal, resulting in correlation of the errors with the signal. The linear prediction scheme also permits the design of a ~ixed length data format. Moreover, the`linear prediction coding method is not as sensitive to transmission errors since the prediction error is distributed over the len~qth of the operator;
that is, over the preceding samples compared with the next sample.
The linear prediction differential pulse code modulation scheme of the invention is dependent on the band width of the signal (that is, the broader the band width, the larger the prediction error;
therefore, the larger the quantization error). Decimating the data, for example taking every other sample, reduces the volume of data by a factor of 2, but increases the quantization error as well.
Fortunately, seismic data has a relatively invariant low-pass periodic component and can therefore be compressed well using linear prediction since it is the sinusoidal nature of the seismic wave which allows the predicted value to be relatively close to the actual value, allowing a relatively small difference signal to be coded and transmitted at minimum error.

~ 4 5 Although in many cases the coded error signals are themselves of interest, without decoding and decompression, upon receipt of the transmitted encoded signal, therefore, the inverse operation can be performed to yield an output waveform substantially similar to that encoded. Feedback may be employed during decompression to filter noise, improving the signal-to-noise ratio.
The method of the present invention is described below in greater detail by way of example only with reference to the accompanying drawings, in which:
Fig. l represents an overall view of a seismic exploration system according to the invention;
Fig. 2 shows a graphic comparison between full word coding of seismic signals and four compression coding methods considered;
Fig. 3 represents the differences between full word coding and the four compression coding methods shown in Fig. 2;
Fig. 4 represents the five records of Fig. 2 after f-k and spike deconvolution processing;
Fig. 5 represents the differences between the full word and the four other encoding techniques considered after f-k and spike deconvolution processing;
Fig. 6 shows the records of Fig. 4 after normal moveout and stacking operations;
Fig. 7 shows the differences between the normal moveout corrected and stacked full word seismic sample and the four encoded records;
Fig. 8 shows a schematic flowchart of an adaptive quantization process used in the method of the invention;
Fig. 9 shows pictorially how the adaptive quantization method of Fig. 8 operates;
Fig. lO shows the invertibility of the deconvolution used in the technique of the invention;
Fig. ll shows the manner in which the linear prediction coding operator can be used to reduce error;

S

Fig. 12 shows how feedback can be applied to the overall system shown in Fig. 10;
Fig. 13 shows a summary flowchart of the compression technique used according to the invention;
Fig. 14 shows a summary flowchart of the decompression technique used according to the invention;
Fig. 15 shows a display of actual seismic data using full word coding;
Fig. 16 shows the same data having been coded at two bits per word and decompressed using the method of the invention;
Fig. 17 shows a detailed flowchart of the operator design routine;
Fig. 18 shows a detailed flowchart of the coding routine;
Fig. 19 shows an enlargement of the quantization portion of the coding routine shown in Fig. 18; and Figs. 20a and'20b show detailed flowcharts depicting the unpacking and decoding performed on a compressed data stream after transmission or storage.
Referring to the drawings, Fig. 1 shows in general terms the'arrangement of a marine seismic exploration system according to the in,vention (although it must be borne in mind that the invention has~equal applicabillty'to land-based operations). An exploration vessel 10 has mounted thereon a plurality of sources 12 of acoustic energy such as compressed air guns which are cont mlled to emit an acoustic wave 14 downwardly into the sea. The wave passes through the ocean floor and the first sub-sea layer 16 and is eventually reflected from a lower layer 18 upwardly to reach a plurality of hydrophones 20 streamed behlnd the vessel. The output signals of the hydrophones 20 are passed by means of a cable 22 into conventional p mcessing equipment carried onboard the exploration vessel 10 such as amplifiers 24 and filters 26. The data may be converted to digital format in analog-to-digital converter 28 and may then be encoded in accordance with the invention in a ccding processor 30. The compressed data may be stored and/or transmltted .
, ~7~ S
F-07~6-L -8-by means of a transmitter 32 and an antenna 34, optionally via a satellite 36 to an earth-bound antenna 38, possibly for further p mcessing in a computer indicated generally at 40.
As noted above, vast quantities of data are generated during seismic explorations whether on land or on sea and it is therefore desirable to provide methods of data compression whereby the amount of data which must be transmitted is limited. Also as noted above, the seismic data is of a generally wave-like character so that successive samples of the analog output signals of the hydrophones 20 do not vary greatly from one to the next. Thus, for example, a simple differential pulse code modulation scheme may be used which encodes the difference between two successive samples rather than the full amplitude of the data sample. This difference is typically considerably less than the overall amplitude and bits are thereby saved. When the correlation between two samples is high (greater than about 0.5) the differences between successive samples are small.
While this method can be workable, it does not ensure the absence of residual low frequency errors which tend to be greatly amplified upon reconstruction by integration at the earth bound processing station 40. Such amplified low frequencies cause dri~ting and sometimes DC shifts which can be unstable in further processing operations. Therefore, it is necessary to apply a long low cut filter to the samples prior to integration and to remove the mean later. Unfortunately, this causes unwanted distortion in the ~
early stage in the processing operation, which is further aggravated by later stages.
An improvement on the simple digital differential pulse code modulation, known as Nnning sum differential pulse code modulation, makes use of recursion to alleviate the low frequency problem mentioned above. The object of this system is continually to adjust for low frequencies by taking the difference between the incoming signal and the running sum over all the previous quantized differences. If there was no error in the quantization, this :1~ 731~5 operation would be simply the difference trace as above. However, since errors must be expected to occur in the quantization process, the difference to be quantized is the change between two successive inputs plus the quantization error in the previous difference. Each time a large error occurs, it is thus corrected for by the next difference instead of propagating through the entire trace. While this technique is better than simply differencing the data from an error propagation standpoint and allows elimination of the long low cut filter and mean removal mentioned above, there is correlation of the error with the data.
The method which is usually referred to as differential pulse code modulation is here referred to as linear prediction differential pulse code modulation for the sake of clarity. As compared to running sum differential pulse code modulation, instead of predicting the last sample using the sum of the previous samples, in this method one attempts to predict the next sample from a linear combination, controlled by a prediction operator, of the past several samples. For example, if the quantization was error free, the difference between the current input and the prediction sample is simply the prediction error. Such a sequence is particularly desirable for quantlzing purposes since the spectrum is whitened (i.e., has a broader bandwidth) and the RMS amplitude level is minimized. However, quantization will add noise to the system that must be controlled upon reconstruction or it will render reconstruction unreliable. Rccording to the method of the present invention, the quantization noise is controlled by feeding back the reconstructed output to the preceding input so as to correct the prediction operator, rather than simply predicting on the basis of a series of inputs.
Fig. 2 shows comparisons of input data with output data generated using four of the techniques discussed above. The five records are all comparable to seismograms, that is, their vertical axis is time while the horizontal axis is distance from a given origin. Fig. 2a represents a display of synthetic data coded 1 ~73i F-0746-L -lO-according to the full 32-bit word sample; this is the type of data which is to be compressed according to the method of the invention.
Fig. 2b shows the full word coded data of Fig. 2a after it has been encoded using linear prediction differential pulse code modulation techniques. Fig. 2c is comparable, using running sum differential pulse code modulation. Fig. 2d shows simple differential pulse code modulation. Finally, Fig. 2e shows gain bit only coding. Those skilled in the art will recognize that while the five seismograms shown in Fig. 2 are generally similar in outline, nevertheless the quality of the coding drops off appreciably from Fig. 2b to Fig.
2e. This is shown graphically in Fig. 3 which represents results of subtraction of the coded data shown in Figs. 2b through 2e from the full word data of Fig. 2a, thus proviping an indication of the accuracy of the coding scheme employed. Fig. 3a, linear prediction differential pulse code modulation, is clearly the best of the coding techniques tried; significant error is limited to a region between 1.8 and 2.3 sec. It can be seen that only where the amplitudes are high and the predictability is low, such as the superposition of all the events on the further traces, are the errors in the linear prediction differential pulse code modulation representation significant.
Fig. 4 shows seismograms similar to those of Fig. 2 but after conventional data processing operations have been performed thereon; that is, f-k filtering and spike deconvolution methods.
Both of these methods relate to the use of mathematical p mcesses for generating coherent frequency information from seismic records.
In this case, the "airburst", the surface noise generated by the acoustic energy was first removed by f-k filtering; spike deconvolution was then applied to the input records. Fig. 4a shows very little high frequency noise and only a small remnant airburst remains in the full word coding system. Likewise, in all three of the differential pulse code modulation methods, shown in Figs. 4b, 4c and 4d, the airburst was successfully attenuated. The gain bit only coding, of Fig. 4e, however, has in addition to a higher noise 1~73145 F-0746-L -ll-level, a large remnant airburst. The simple differential pulse codemodulation method (Fig. 4d) has been improved over the running sum method of Fig. 4c by a deconvolution operation which removes some effect of the low cut filter discussed above.
The difference records of Fig. 5 are comparable to those of Fig. 3, that is, Figs. 4b, 4c, 4d and 4e were subtracted from Fig.
4a to yield Figs. 5a, 5b, 5c and 5d, respectively. Again, the linear prediction differential pulse code modulation method of Fig.
5a has the least error; however, it appears that most of the error lies along reflection events which have been deconvoluted. Simple differential pulse code modulation also shows a relatively good signal (Fig. 5c) and again the gain bit only coding (Fig. 5d) is very noisy. Comparison of the signal-to-noise ratios of these records bears this conclusion out. The average RMS signal-to-noise ratio of linear prediction differential pulse code modulation signals averaged over 25 traces is 3.93; of running sum 1.43; of simple differential pulse code modulation 2.03; and gain bit only coding 1.05.
~ nother well known prior art technique which can be used to improve signal-to-noise ratio in seismic data is normal moveout correction and stacking. In these processes, the records are time corrected for the varying offsets between source and receiver and are stacked, i.e., are summed either mathematically or electrically, whereby the noise, being random, tends to be cancelled out, while the signals are amplified. Fig. 6 shows from left to right examples of this technique performed on the five test records of Figs. 4a, 4b, 4c, 4d and 4e, while Fig. 7 shows difference traces of the five traces shown in Fig. 6; that is, the four encoded traces are subtracted from the full word trace of Fig. 6. After application of this technique the signal-to-quantization-noise ratio of the signals processed by the linear prediction differential pulse coding method is 5.7; of running sum 2.63; of simple differential pulse code modulation 3.17 and of gain bit only 1.47. Since none of these ratios was improved by normal moveout correction and stacking by a 1 ~ 7 ~L~ 5 factor greater than 1.8 (in the case of running sum differential pulse code modulation) it can be concluded that the errors in all techniques are not uncorrelated with the data. While linear predictive differential pulse code modulation has the least distortion after processing, from the difference traces of Fig. 7 the errors appear to be less correlated with the data after stacking. Even though simple differential pulse code modulation has a higher signal-to-noise ratio than running sum differential pulse code modulation, the distortion caused by the low cut fflter is much more noticeable on the simple DPoM stacked traces than on those processed in accordance with the running sum technique. The gain bit only difference trace exhibits all the major events, but the noise level is very high and every wavelet shows noticeable distortion.
As discussed above, it can be shown that linear prediction differential pulse code modulation coding, assuming a noiseless channel, can be successfully applied to seismic data with minimal distortion. The same results can be demonstrated with real data.
Moreover, it was demonstrated that less complicated running sum and simple differential pulse code modulation schemes yield better results than gain bit only codinq, but still below the quality received from linear prediction differential pulse code modulation.
Another important aspect of the present invention relates to a ~urther improvement on linear prediction differential pulse code modulation. A fi~ed word length coding scheme is used which, unlike gain only, implies a specific compression ratio. Because seismic data requires an automatic gain ranging capability, the adaptive coding scheme i~ required. According to this adaptive coding scheme, the quantization windows into which all data must fit are varied with respect to the overall amplitude of the signal. Simply put, the adaptive coding scheme when applied to linear prediction differential pulse code modulation, varies the spacing of the bins or ranges of possible values into which the prediction error trace falls and expands or contracts these range in accordance with the 1~7~31~S

previous output For example, if the last output indicated that theerror was in the outermost bin, the bin would be expanded for the next input. If the next sample fell in an inner bin, the bin would be contracted. In this way, better matching of bin size with error can be achieved, still using a limited number of bins; as only the identification of the bin need be transmitted and/or stored, substantial data compression is achieved. Thus, while the adaptive quantization scheme is very parsimonious with respect to the number of bits required, it still has the ability to adjust itself to the magnitude of the signal. It is particularly useful in the case of a seismic data recording, wherein the amplitude decays slowly as a function of time as well as varying gradually from one sample to the next.
The method of the invention is illustrated in block diagram and pictorially in Figs. 8 and 9 respectively. Initially, a finite number of "bins" is set up which accommodates the approximate range of values expected for the first few data samples. The number of bins is equal to two raised to the power equal to the number of binary digits per word specified. In other words, two bits (b = 2) would allow four bins or states; three or four bits to be transmitted per sample would allow ~ or 16 bins, respectively. As a sample S(t) comes in, the identification M(t) of the bin which accommodates the sample is recorded. Then, in accordance with a time invariant formula involving only the bin number of the previous samples, the bin size is changed with respect to the bin within which the sample was chosen. Thus, for example, if the signal falls in the largest bin, the bins are expanded for the next sample. The bin sizes represent uniform intervals so all the bins are changed by the same fæ tor. These factors may be determined for normally distributed signals with various correlation coefficient as set forth in the article by N.S. Jayant referred to above, as well as in his article "Adaptive Quantization with a One-Word Memory", Bell System Technical Journal, 52, No. 7, (September 1973) pp. 1119-1144 and in "Quantizing for Minimum Distortion" by J. Max in IRE
Transactions on Information Theory, 6, pp. 7-12 (1960).

1~731~c5 F-07~6-L -14-The error E in any particular sample can be written as E = le ¦ - ¦ e ¦

where e is the actual sample and e is the quantized sample.
Absolute values are chosen for the definition, as the sign of the signal is preferably handled separately; that is, both e and e are made to have the same sign if e is not equal to zero. The quantization is performed such that any e in the interval (n - 1) Q ~ e ~ nQ
is quantized as (n - 1/2) Q

where Q is the variable quantization interval and n is the bin number (1, 2, 3..., 2b 1, 2b). Where N is equal to 2b, the largest bin, any e in the interval (N - 1) Q ~ e ~ ~yD
is quantized as e = (N - 1/2) Q.

Assuming that the actual sample has an equal probability o~
being anywhere in the interval, it follows that the error E will be roughly proportional to Q, the quantization interval. Since the adaptive quantization scheme allows Q to increase and decrease as the signal does, the errors (assuming a uniform probability distribtuion in the interval) also increase and decrease. This correlation of the magnitude of the errors with the signal result in errors in the spectral bandwidth of the signal. This situation, a ~7~3~

consequence of any adaptive coding scheme, is contrary to the object that the errors be small and outside the bandwidth of the signal, as discussed above. The solution according to the method of the invention is to ~ilter the signal prior to quantization such that the output is as broad band as possible, with the smallest possible average amplitude. In this way, variations in the bin size can be minimized and therefore so can the errors.
In addition to the above listed two criteria for the filtering operation, that is, reduction of the amplitude variation and broadening of the spectra, it is required that any filtering be invertible at least in the noiseless case, so that the original data can be recaptured from the filtered and coded sequence. It has been observed that the most sensitive process in normal seismic processing operations is deconvolution; that is, separation out of repetitive events in the seismic data. It follows that if the seismic records were to be corrupted at any stage by coding, it would preferably be after deconvolution. The fact that deconvolution can be inverted additionally aids in this judgment.
The deconvolution process may be broken down so as to be invertible is shown in Fig. 10. Incoming samples S(t) are broken down by the summation over all n of the product ~(t - n) A(n) where A(n) is the prediction operator. Comparison of S(t) and ~(t) yields R(t), the departure of the predicted sample from the actual sample or "predicted error." At a later time, the inverse operation may be performed on R(t) to yield the exact input S(t).
It is well known that the deconvolved trace has a broader spectrum. This may be accomplished by solving for the prediction operator A(n) which in a preferred embodiment is chosen to minimize the sum of the squares of the prediction error trace. A
minimization process for this variance is outlined in Fig. 11. I is the sum of all the squares of all the R(t)'s each of which = S(t) -S(t), where S(t) is in turn equal to the sum overall the terms ~S(t - n) A(n)]. Therefore, one can solve a series of n linear equations for A(n) where 1~ 73i~5

2 I , o ~ A(n) to minimize I. In a particularly preferred embodiment of the invention, solution of the resulting set of linear equations amounts to solvent a Toeplitz matrix for the autocorrelation vector shifted one sample ahead.
Regardless of what prefiltering operation is performed on data be~ore quantization, it will be appreciated that the inverse filter when applied to reconstructed data will affect the quantization nolse in the same manner as the signal. The inverse filter effect is more noticeable on data quantized with the simple DPCM coding scheme. The errors even though barely noticeable in the di~ference trace, are greatly amplified upon inverse filtering by integration. Consequently a feedback mechanism must be employed in order to control the noise. The requirement of ~eedback in turn adds two more requirements to the pre-iltering operation: that its operator be minimum delay and be designed from a zero-mean data window. There are numerous well-known methods available to solve for such a prediction operator, but all involve a trado-of~ between the degree to which the mean squared error is minimized and the rate of convergence Or the reciprocal operator. Hbwever, it has now been ound that the autocorrelation method results in the most computationally e~ficient operator as well as resulting in a suitably ront loaded operator, which allows smoothing of the prediction error resulting from the adqptive quantizer. As the non-linear quantization results in noise of both low and high frequencies, the autocorrelation method should be applied to the autocovarlance (mean removed from the data) rather than the autocorrelation in the design of the prediction operator so that low requency dri-ts should not be predicted when actual data is processed. This provides an instructive constrast to the speech compression techniques mentioned above. In speech compression, the 1~73~5 low frequencies (i.e. below about 200 Hz) can be filtered as they contribute little to the signal, and can be disregarded in operator design. In seismic data compression, the typical frequency range is 5-50 Hz, and accordingly such low-cut filtration would remove all data of interest. Hence, an autocorrelation technique as mentioned above should be used (see Rubiner and Schafer, Digital Processing of Speech Signals, Prentice Hall, Englewood U iffs, New Jersey (1978), especially Chapter 8, Section 8.1.1). ~y subtraction of the mean value from each of the samples in the window of interest, the overall sample can be centered on zero, thus avoiding prediction of low-frequency drift and unstable coding.
Fig. 12 shows a flowchart of the loop ~lith feedback added.
In addition to the loop shown in Fig. 10, a second summation loop comprising a summa~ion node and a prediction step yielding the formulation of an S(t) fed back to the main summation node is incorporated. In this way an R(t) is what is transmitted; that is, an estimated error, calculated according to a definable process, which can be decoded at the receive side, as also indicated in Fig.
12, is transmitted.
Summarizing, the method of the invention involves solutions of several problems as follows. The first problem is to minimize the number of possible states or bins into which the data can be transferred. The solution is to use an adaptive quantization scheme; since the number of the bin into which a given sample falls is that which is transmitted, this maximizes the advantages of the data compression method of the invention. Preferably, a uniform interval, adaptive quantization method is used; this involves the use of a one-word memory to retain the bin size from one sampling interval to the next. However, adaptive quantization results in noise which is proportional to the signal and which has similar spectral components. The solution to this problem is to transform the signal into a prediction error trace R(t) before quantization;
however, this raises the problem that the inverse transform of the prediction error scheme amplifies the noise and narrows the 1~ 731~5 spectrum. The solution to this problem is to incorporate a feedback loop in the system, forcing the noise to pass unfiltered. The flowchart resulting from application of these concepts is that shown in Fig. 12.
Figs. 13 and 14 show the compression and decompression routines respectively. The compression of the data as shown in Fig.
13 is done at the sending location, or at the time o~ storage of data, and the corresponding decompression routine is performed at the receiving end, or upon retrieval. The compression flowchart shown in Fig. 13 includes computing the prediction operators, setting up the initial quantization levels and computing the error, coding it and packing the compressed trace. As shown in Fig. 14, the operator information A defining the initial conditions becomes a header and is stored or transmitted with the packed difference codes M(t), and the decompression routine operates upon these. In this connection, it is again noted that seismic data differs significantly from for example speech, in that it has an increasingly convoluted nature; this permits a single operator A(n) to be defined for the entire trace, record, line or region of exploration. By comparison, in speech encoding the operator must be recalculated on the order of every 20 msec; indeed, the operator itsel~ is typically all that is transmitted and the prediction errors are synthetically generated. After storage and/or transmission, the sample is unpacked yielding M(t), the data which is decoded to yield R(t), and this is added to the predicted sample calculated in accorda~ce with the previous samples to yield the reconstituted sample S(t). S(t) is used to predict the next output. The inverse of the compression scheme shown pictorially in Fig. 9 is then used to perform the inverse of the changing of the bin size to prepare for the next sample M(t). R(t) is then used to correct that predicted output, which is then the reconstituted sample S(t).
Those skilled in the art will recognize that correct design of the prediction operator is important to the accuracy and stability of the output of the method of the invention. In apreferred feature of this invention, the data used in computing the autocorrelation function either follow a mute curve or are set at a constant start time. The length of the window employed is at least 10 times the maximum operator length; if a short record occurs, the operator length is reduced. The mean value is removed from the data window but not from the data itself. The autocorrelation is computed assuming all zeroes outside the window of interest. The initial level of white noise is added to the center ordinant, so as to avoid difficulties resulting from inverting zeroes; the Levinson recùrsion routine is then used to compute the operator. The error condition is checked and if positive, computes another operator with 1/2 percent more white noise added. If the white noise required is greater than 10% the trace is "killed"; i.e., not transmitted or recorded, as the case may be.
The compression routine itself can be summarized as follows. The trace is first initialized by zeroing a first predetermined number of points. Then the initial bin size is set as equal to the average of some predetermined number of non-zero absolute values encountered. The expansion and contraction factors are set according to the 1973 Jayant article referred to above. The predictinn error is then computed, the compressed sample is coded and packed, the next sample is predicted and so on, repeating the computation of error, coding and packing processes for each data sample. Stability may be confirmed continually. Thus the output of the compression routine shown in Fig. 13 is a packed integer trace containing the coded random components indicating the departure fmm the error and the header information including the operator and the initial bin size.
The decompression routine is shown in Fig. 14 and follows substantially the inverse p mcess. First, the data is unpacked into integer representation. This coded trace can, in fact, be used as a gain equalized deconvolved trace or the integers can be decoded to yield an approximation of the full dynamic range of the deconvolved ~ l~73~4S

trace; finally, of course, using the operator supplied in theheader, the original input can be reconstructed at very high fidelity. Those skilled in the art will recognize that the compression mutine according to the invention, while providing up to a 10 to 1 reduction in the amount of time required, for example, for transmission of the data over a satellite radio link or for storage of the data on magnetic tape, requires some computational time for implementation. To this end, it is presently envisioned that a single computer would desirably be dedicated to this purpose, i.e., be programmed to perform the compression routine, for example, shipboard. It presently appears that use of, for example, a "PHOENIX I" minicomputer as the coding unit 30 of Fig. 1 would be suitable.
Figs. 15 and 16 demonstrate dramatically the success of the method. Fig. 15 is a representation of seismic data having been stored in full 32-bit word array; while Fig. 16 is the same data having been compressed to a 2-bit per sample level and decampressed as above. The data of Fig. 16 could thus be transmitted at a cost of less than one-tenth that of Fig. 15. Those skilled in the art, however, will recognize that Fig. 16 is very comparable to Fig. 15 and would be quite useful in determining the success of ongoing seismic prospecting operations. Moreover, it is found that variation in the output with respect to the original is related to the scale on which the data is plotted. That shown in Fig. 16 was plotted at 12 traces per inch plotted at a speed of 5 inches per second. At lower resolution, e.g., above about 20 traces per inch, no differences at all are noted between the full word and the two-bit compressed/decompressed records. At the 12 trace per inch scale shown in Figs. 15 and 16, Fig. 16 does show a slight high frequency ~itter riding on the very long wave length seismic data, which can be mitigated readily by applying a high cut ~ilter to the deep data in the section section.
Those skilled in the art will recognize also that while decompression and reconstitution of the data in its original format 1~73~g~5 is possible, it is not required before the data can be made useful.
The seismic data can, for example, be processed as two-bit integers directly from the compressed format. In such a case, gain equalization and deconvolution before stacking are redundant in the normal seismic processing flow. Another output option skips the inverse filter in the decompression routine. Such an output can be processed in the usual way without deconvolution before stack. For the first option method, the four state [-2, -1, +1, +2] trace is gain equalized on a sample-by-sample basis. Any amplitude differences on the stack are due to correlation between traces.
Reference is now made to Figs. 17 to 20 for detailed descriptions of the processes of operator design and coding performed prior to transmission and/or storage of a compressed signal, and the unpacking and decoding routines performed after storage is terminated or upon receipt of a compressed signal. Fig.
17 shows the operator design flowchart. The incoming samples S(t) having been received, a series of samples of predetermined length representative of the data which will be compressed, is chosen for computation of the autocorrelation. The mean is removed from the series, so as to eliminate any DC level. The autocovariance is then computed over the samples between zero and Nmax lags so as to determine the linear relationship between the present sample and one at each lag. The maximum absolute value between Nmin and Nmax is then located and is the location of the operator length Na. The zero lag is then used to normalize the autocorrelation and to set up the right side of the Toeplitz equations referred to above used to solve for the operator A(n). An amount "Pcent" of white noise is added and the center ordinate of the autocovariance and the operator A(n) is then computed by solution of the simultaneous differential equations mentioned above. If a stable solution to the equations is found in this routine, the error flag will be negative and the values between Na ~ 1 to Nmax can be set to zero and A(n) used in the compression. The length of the operator A(n) depends on the maximum stable operator length, i.e., on the error. If the error 1 ~ ~7~3~ ~ 5 flag is raised, indicating that the operator is not stable, one-half percent additional white noise is added to Pcent and the calculation reperformed. If Pcent is greater than 10%, indicating that the procedure has been performed 18 times, no operator is computed based on that series of samples.
Fig. 18 shows the coding operation performed after the operator A(n) is calculated as above. The routine is ~irst initialized by computing the value "Sig", the average of the absolute values of the first 25 non-zero samples. The expansion and contraction factors are then set in accordance with Jayant's articles depending on the number of bits of coded data desired, and the quantum bin size Q is set proportional to Sig, calculated as above. The value "Xpret", which represents the next expected value, is set to 0 and a counter registert "Last," is set to equal 1. The error E is then calculated using the operator as calculated above, and is equal to the actual value S(t) less Xpret. The difference is quantized by comparison of E with the various bins, which yields the quantized error Value ET. The number of the bin within which the quantized error fit, M(t), the quantum number, is packed and stored as part of the record to be transmitted. The quantization process is detailed below in connection with Fig. 19. The stability of the process is then checked by comparing the absolute value of the sample with the quantum interval, that is, against the bin size.
Clearly, if the bin size is much greater than the absolute value of the sample, for several samples, an error has occurred. If the stability check reveals that the operation is proceeding properly, the approximate input value S can then be calculated from ET I
Xpret. This in turn can be convolved in the next step with A(n) to yield the next expected value Xpret. If this is the last sample the next trace is read; if not, the process begins again.
As mentioned above, Fig. 19 details the quantization process used in the coding process detailed with reference to Fig.
18. Here the error E is to be separated into a quantized bin size and an estimated error, M(t) and ET respectively. The process ~l73~.~5 begins by scaling of Q in accordance with the "Last" value available; that is, the bin size is set to a beginning value. The bin size thus determined is compared with Qmin and Qmax, that is, with the maximum and minimum bin sizes previously determined, which amounts to an error check. The absolute value of E is used in the calculations which follow. Variables x and n are initialized, while Last is set equal to "ISTATE", which equals half the number of total bins available. For example, in the two-bit case, four bins are available, as one of four bins can be characterized by a two-bit code, thus: small positive samples fall in bin 0; large positive samples fall in bin 1, bin 2 is for small negative samples, and bin

3 is for large negative samples. In this case, ISTATE would be 2.
X is then set equal to x + Q, and the absolute value of E is compared with x, which amounts to determining whether E fits within this bin; if yes, then M(t) has been determined. Last, the last bin selected, is set equal to n; the quantized error is set equal to the bin size Q(Last - 1/2). If E is less than zero, ET is inverted while M is set equal to Last + ISTATE, that is, to the bin within which E fell. Where E is greater than or equal to x, x is incremented by Q, the bin size, and the process repeated. It will be aopreciated that these last few steps are necessitated by the fact that it is desired to transmit 2 bits without a sign. Thus, the numbers of the bins corresponding to the values centered around 0, that is, to O~E~Q, E7Q, o7E~- Q, and E~ -- Q are set equal to 0, 1, 2 and 3 respectively. The carry over of Last to the "scale Q"
step, with which the quantization routine begins allows the bin size to vary as the process continues thus making the quantization adaptive. For example, if bin 1, the largest bin, were that within which the absolute value of E fell, the overall bin size would then be incremented by the scaling operation. If on the other hand the absolute value of E fell within the 0 bin, the bin size would be made smaller, as clearly E would be of a small value and therefore in order to increase accuracy, the bin size should be made relatively smaller.

~L~ 7~.4c~

Finally, Figs. 20a and 20b show the unpacking and decoding process performed upon receipt of data compressed in accordance with the method of the invention at a receiving location or upon release from storage. As noted above, the data can be decoded to give a reconstructed, a merely deconvolved, or a gain equalized and deconvolved trace, depending on the choice of the operation. The variables coming in are records of M(t), the bins into which each quantized error sample fit, A(n) the operator, and Sig, the initial value assigned to the bin size. The initial samples are zeroed and the expansion and contraction factors are set up as above. Q is likewise initialized as above. Q is compared with Qmax and Qmin to determine validity of this value. The quantized predicted value is first given magnitude by the bin number and the quantization level Q
and then the sign is applied.
If Iout, a variable indicating the desired mode of processing the data, is equal to 2, indicating that the stored compressed data is to be gain equalized and deconvoluted, S(t) is simply set equal to Last if M is less than ISTATE; and to -~ast if not; that is to say the output function is simply equal to the relative amplitude and sign of the error sample. It is found that in general even this yields a meaningful result. If Iout is not equal to 2, the meaning being that the trace is either to be deconvolved or reconstructed, further processing is performed. If M
is greater than ISTATE, the sign of ET is negative. If Iout is equal to 1, the trace is deconvolved only which involves merely setting S(t) = ET; if not, then S(t) is set equal to Xpret + E(t) which yields a fully reconstructed trace point; Xpret is equal to the sum of A(n)S(t - n) which yields the next expected value Xpret and the process is repeated.
Given the flowcharts of Figs. 17 through 20, those skilled in the art will readily be able to fully implement the method of the invention.
It will thus be appreciated by those skilled in the art that there has been described a method for data compression ,.~

~ 4 5 specifically useful in seismic exploration applications which allows the representation of 32-bit floating point samples by a two-bit integer without loss of fidelity of reproduction. The use of this concept of adaptive quantization and differential pulse code modulation alone is insufficient to achieve this result as the noise generated by such a scheme when used on seismic data results in unstable, meaningless data. Accordingly, the signal is instead transformed into a prediction error trace, rather than a data trace, prior to quantization; in a particularly preferred feature, this is achieved through the use of linear prediction differential pulse code modulation modffied as above.

Claims

CLAIMS:

1. A method of seismic exploration in which acoustic energy is transmitted through a subterranean structure, energy reflected from a subterranean interface and returned upwardly to the surface is detected, electrical output data signals are generated in response to that detected energy and the data signals are compressed for storage and/or transmission, the step of data compression comprising the steps of sampling the signals at a fixed rate;
analyzing pluralities of output samples successively in accordance with a prediction operator to yield a predicted value for the next input sample;
comparing the next input sample with the corresponding predicted value to yield a prediction error value;
quantizing the prediction error value to yield a quantum number; and storing and/or transmitting the quantum number;
and the step of data compression being such that upon inversion of the operator for decompression of the stored quantum numbers, feedback may be applied to the inverted operator to cause noise in the quantized samples to pass unfiltered.

2. A method according to claim 1, wherein the prediction operator comprises linear prediction differential pulse code modulation in which a predetermined number of successive samples is used to predict the next succeeding sample and the next succeeding sample thereafter becomes the last of the predetermined number of samples, and the earliest of the predetermined number of samples is deleted from the determined number of samples.

3. A method according to claim 1 or claim 2, wherein the quantization is adaptive whereby increased accuracy can be realized in the quantization of the samples.

4. A method according to claim 1, wherein the quantization step comprises categorizing each sample as belonging to one of a fixed number of discrete states, the states forming a continuous range in which those states disposed in the center of the range have fixed bounds and those states on the ends of the range have one bound fixed to mate with those disposed in the center of the range and one bound open, whereby all sample values may be categorized as belonging to only one of the states.

5. A method according to claim 4, wherein the bounds of the states in the center of the range are varied in response to the categorization of one or more prior samples whereby the accuracy of the quantization is further improved.

6. A method according to any one of claims 1,2 or 4, wherein a plurality of quantized samples corresponding to a given period of seismic exploration is preceded by a header giving information as to the initial conditions of the data compression method whereby data decompression may be performed.

7. A method according to any one of claims 1,2 or 4, wherein the analyzing step comprises performing autocovariance on each plurality of samples, whereby low-frequency drift of the compression is avoided.

8. A method for compressing samples of signals of a convoluted nature, comprising the steps of defining a prediction operator by analysis performed with respect to a predetermined number of the samples for operating on a series of the samples of predetermined length to yield a predicted value for a next sample;
measuring the difference between the predicted value and the next sample;
comparing the difference to a series of numbered bins of variable size;
recording the number of the bin within which the difference fits; and varying the size of the bins with respect to the number.

1527n