DK201400248A1

DK201400248A1 - A computer assisted method for quantification of total hydrocarbon concentrations and pollution type apportionment in soil samples by use of gc-fid chromatograms

Info

Publication number: DK201400248A1
Application number: DKPA201400248A
Authority: DK
Inventors: Søren Furbo; Jan Henning Christensen; Giorgio Tomasi
Original assignee: Københavns Uni
Priority date: 2014-05-06
Filing date: 2014-05-06
Publication date: 2015-11-16
Also published as: WO2015169686A2; DK178302B1; WO2015169686A3

Abstract

There is provided a computer assisted method for producing a data set for adjusting GC-FID (gas-chromatography/flame ionization detector) chromatograms for retention time related changes in the sensitivity of the GC-FID system. There is also provided a computer assisted method for obtaining a corrected GC-FID chromatogram of a soil sample comprising one or more hydrocarbon compounds, wherein the correction of the chromatogram includes an adjustment of the chromatogram for retention time related changes in the sensitivity of the GC-FID system, and wherein the adjustment may be performed by use of the produced data set. Furthermore, there is provided a computer assisted method for producing a number of calibration curves to be used in determining total hydrocarbon concentrations of soil samples from GC-FID chromatograms, and there is provided a computer assisted method for determining the total hydrocarbon concentrations by use of GC-FID chromatograms and the produced calibration curves. In addition, there is provided a computer assisted method for obtaining a pollution type model, and a method for using such a pollution type model to determine the pollution type apportionment of a soil sample.

Description

A COMPUTER ASSISTED METHOD FOR QUANTIFICATION OF TOTAL HYDROCARBON CONCENTRATIONS AND POLLUTION TYPE APPORTIONMENT IN SOIL SAMPLES BY USE OF GC-FID CHROMATOGRAMS

FIELD OF THE INVENTION

The present invention relates in general to a computer assisted method for determining the total hydrocarbon concentrations in soil samples by use of GC-FID (gas-chromatography/flame ionization detector) chromatograms. The invention also relates to a method for producing calibration curves to be used in determining the total hydrocarbon concentrations of soil samples. The invention further relates to a method for producing a data set for adjusting GC-FID chromatograms for retention time related changes in the sensitivity of the GC-FID system. The invention additionally relates to a method for determining the relative influences of different pollution types on a soil sample.

BACKGROUND OF THE INVENTION

Approximately 150,000 analyses of total petroleum hydrocarbon (TPH) concentrations and hydrocarbon fractions are performed yearly by Danish environmental laboratories alone. Quantification of TPH concentrations and hydrocarbon fractions is today based on manual integration of GC-FID chromatograms, which is prone to human error and is time consuming. Furthermore, the GC-FID chromatograms often contain information, e.g. about the source of the hydrocarbons, that is not exploited today. Thus, the available data is not used to determine whether the hydrocarbon content is of biological origin or stems from petroleum (e.g., lubricating oils, heavy oil or lighter oil fractions), creosote, diffuse pollution from combustion processes, etc. Such additional information is valuable for the customer, especially to provide data for a proper risk assessment/soil classification and for liability cases.

It is the aim of the present invention to provide a computer assisted method for use in analyses of total petroleum hydrocarbon (TPH) concentrations, and thereby reduce the risks for human errors in TPH concentration measures, and further to provide a user with additional information on the type of hydrocarbon contamination.

SUMMARY OF THE INVENTION

According to a first aspect of the invention there is provided a computer assisted method for producing a data set for adjusting GC-FID (gas-chromatography/flame ionization detector) chromatograms for retention time related changes in the sensitivity of the GC-FID system, the method comprising the steps of: a) selecting a number of characteristic hydrocarbon compounds; b) producing a number of liquid sample solutions (standard solutions) having different but known concentrations of a mixture of the characteristic hydrocarbon compounds; c) obtaining a GC-FID chromatogram for each or at least some of the sample solutions with each chromatogram consisting of a detector signal intensity curve as a function of retention time, wherein each characteristic hydrocarbon compound of a sample solution is represented by a maximum intensity peak at a peak apex retention time, which peak retention time is characteristic for the corresponding hydrocarbon compound, and wherein each maximum intensity peak has a corresponding intensity peak area being a function of the sample concentration of the corresponding hydrocarbon compound; d) storing data representing the obtained chromatograms; e) calculating peak area data for each or at least part of the intensity peak areas for each or at least part of the stored chromatograms; f) producing a set of area-concentration data for each of the selected hydrocarbon compounds, each set of area-concentration data being based on the calculated peak area data and the known sample concentrations of the selected hydrocarbon compound; g) determining a response factor, RF, for each of the selected hydrocarbon compounds, said response factor being determined as a slope representing at least part of an area-concentration curve obtained from the set of area-concentration data, which curve represents the calculated peak areas as a function of the concentration of the selected hydrocarbon compound in the different sample solutions; h) for each selected hydrocarbon compound storing the obtained response factor together with the corresponding maximum peak retention time thereby obtaining a first set of response factor-retention time data; i) calculating by use of interpolation one or more new response factors for retention times between two nearest neighbor maximum peak retention times of the first set of response factor-retention time data, thereby obtaining a second and expanded set of response factor-retention time data; and j) storing said second set of response factor-retention time data.

It is preferred that for step e), then for each maximum intensity peak a start peak retention time, spRT, being lower than the maximum peak or peak apex retention time, paRT, is defined, and an end peak retention time, epRT, being higher than the maximum peak retention time, paRT, is defined, and the intensity peak area is calculated as the area covered by the intensity curve above an intensity baseline being drawn from the intensity curve at start peak retention time, spRT, to end peak retention time, epRT. Here, the start peak retention time may be found as the last RT before the paRT where the intensity function has a negative slope, and the end peak retention time is found as the first RT after the paRT where the intensity function has a positive slope, respectively.
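As a minimal illustrative sketch of the peak area calculation described above, and not as a definitive implementation, the baseline rule can be expressed as follows. The function names `peak_bounds` and `peak_area` are hypothetical, the slope is approximated by differences between neighboring samples, and the area is integrated with the trapezoidal rule; none of these choices is prescribed by the text.

```python
def peak_bounds(y, apex):
    """Return (start, end) sample indices bracketing the peak at index `apex`.

    Assumption: the slope "at" a sample is the difference to its neighbor,
    so both bounds land on the local minima next to the peak.
    """
    start = 0
    for i in range(apex - 1, 0, -1):
        if y[i] - y[i - 1] < 0:      # last sample before the apex reached by a negative slope
            start = i
            break
    end = len(y) - 1
    for i in range(apex + 1, len(y) - 1):
        if y[i + 1] - y[i] > 0:      # first sample after the apex left by a positive slope
            end = i
            break
    return start, end


def peak_area(t, y, apex):
    """Area between the intensity curve and a straight baseline from spRT to epRT."""
    start, end = peak_bounds(y, apex)
    t0, t1, y0, y1 = t[start], t[end], y[start], y[end]

    def above_baseline(i):
        baseline = y0 + (y1 - y0) * (t[i] - t0) / (t1 - t0)
        return y[i] - baseline

    area = 0.0
    for i in range(start, end):      # trapezoidal rule (an assumption)
        area += 0.5 * (above_baseline(i) + above_baseline(i + 1)) * (t[i + 1] - t[i])
    return area


# Invented example data: one peak with its apex at index 3.
t = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [3.0, 1.0, 2.0, 6.0, 2.0, 1.0, 3.0]
area = peak_area(t, y, apex=3)
```

In this example the bounds fall on the two local minima at indices 1 and 5, so the baseline is flat and only the signal above it contributes to the area.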

For step g) it is preferred that the curve slope defining a response factor, RF, is calculated by minimizing the unweighted or weighted sum of squares of the differences between a linear curve and the area-concentration data.
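The slope calculation for step g) may be sketched as below. This is an illustration under stated assumptions: the linear curve is taken to include an intercept, the function name `response_factor` is hypothetical, and the example weights of 1/concentration are only one possible weighting.

```python
def response_factor(conc, area, weights=None):
    """Slope of the (weighted) least-squares line through area-concentration data.

    Assumption: the fitted line has a free intercept; the text only requires
    that the response factor is the slope of a linear curve.
    """
    if weights is None:
        weights = [1.0] * len(conc)          # unweighted case
    w_sum = sum(weights)
    x_mean = sum(w * x for w, x in zip(weights, conc)) / w_sum
    y_mean = sum(w * y for w, y in zip(weights, area)) / w_sum
    sxy = sum(w * (x - x_mean) * (y - y_mean)
              for w, x, y in zip(weights, conc, area))
    sxx = sum(w * (x - x_mean) ** 2 for w, x in zip(weights, conc))
    return sxy / sxx


# Invented example: peak areas of one compound in four standard solutions.
conc = [1.0, 2.0, 5.0, 10.0]                 # known concentrations (ppm)
area = [5.0, 7.0, 13.0, 23.0]                # calculated peak areas
rf_unweighted = response_factor(conc, area)
rf_weighted = response_factor(conc, area, weights=[1.0 / c for c in conc])
```

Weighting by the inverse concentration gives the low-concentration standards more influence on the slope, which may be useful when the spread of the peak areas grows with concentration.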

For step i) it is preferred that the new response factors are calculated by use of linear interpolation using the response factor values of two nearest neighbor maximum peak retention times. However, it is also within an embodiment of the first aspect of the invention that for step i) the new response factors are calculated by use of linear interpolation using the response factor values of two nearest neighbor maximum peak retention times and by use of response factor values for maximum peak retention times closest to the two nearest neighbor maximum peak retention times. For step i) it is also preferred that the number of calculated new interpolated response factors in-between two nearest neighbor retention times equals the number of data points obtained between these two retention times determined by the sampling rate of the flame ionization detector.
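The linear interpolation of step i) onto the sampling points of the detector may be sketched as follows. The name `interpolate_rf` is hypothetical, and holding the response factor constant outside the first and last peak apex retention times is an assumption; the text does not state how such retention times are handled.

```python
from bisect import bisect_right


def interpolate_rf(anchor_rt, anchor_rf, grid_rt):
    """Piecewise-linear response factor at every retention time in `grid_rt`.

    anchor_rt / anchor_rf: first set of response factor-retention time data
                           (sorted by retention time).
    grid_rt:               retention times sampled by the detector.
    """
    result = []
    for rt in grid_rt:
        if rt <= anchor_rt[0]:               # assumption: constant before first peak
            result.append(anchor_rf[0])
        elif rt >= anchor_rt[-1]:            # assumption: constant after last peak
            result.append(anchor_rf[-1])
        else:
            hi = bisect_right(anchor_rt, rt)  # nearest neighbor above rt
            lo = hi - 1                       # nearest neighbor at or below rt
            frac = (rt - anchor_rt[lo]) / (anchor_rt[hi] - anchor_rt[lo])
            result.append(anchor_rf[lo] + frac * (anchor_rf[hi] - anchor_rf[lo]))
    return result


# Invented example: three characteristic compounds, seven detector samples.
anchor_rt = [2.0, 4.0, 8.0]
anchor_rf = [10.0, 20.0, 12.0]
grid_rt = [1.0, 2.0, 3.0, 4.0, 6.0, 8.0, 9.0]
expanded_rf = interpolate_rf(anchor_rt, anchor_rf, grid_rt)
```

Evaluating the interpolation on the retention time axis of the chromatogram yields one response factor per recorded data point, which corresponds to the preferred number of interpolated values.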

According to an embodiment of the first aspect of the invention, then before the area calculating of step e), a reduction of non-sample information (data artifacts) is performed on the chromatograms represented by the data sets stored in step d). The reduction of non-sample information may include subtraction of data, which represents a chromatogram obtained by GC-FID analysis of a sample containing no hydrocarbon compounds, from each of the stored standard solution chromatograms. It is also within an embodiment of the first aspect of the invention that the GC-FID system used for obtaining the chromatograms comprises a gas chromatograph with a column, wherein the reduction of non-sample information includes subtraction of intensity data representing a blank chromatogram from each of the standard solution chromatograms, and wherein the intensity data for the blank chromatogram is obtained by taking the average of the intensities recorded before any compounds elute from the column, or wherein the blank chromatogram results from the injection of a sample into the column, which sample does not contain any hydrocarbon compounds. The reduction of non-sample information may include retention time alignment of the chromatograms. The retention time alignment of a chromatogram may comprise shifting retention time sections of the chromatogram by a constant value (rigid alignment), and/or comprise shifting the retention times by a value being a function of retention time (non-rigid alignment). The non-rigid alignment may consist of sequential stretching and compression of the chromatogram in order to best align the chromatogram to a target chromatogram so that the retention time of each compound in the aligned chromatogram is the same as the retention time for these compounds in the target chromatogram.
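Two of the reductions of non-sample information described above, namely subtraction of a blank obtained as the average of the early intensities and rigid alignment, may be sketched as follows. The function names are hypothetical, the shift is chosen by maximizing a simple overlap product with the target chromatogram, and the ends are padded by repeating the edge values; these are assumptions for illustration. Non-rigid alignment is not covered by this sketch.

```python
def subtract_blank_level(y, n_pre):
    """Subtract the mean of the first `n_pre` intensities, i.e. the signal
    recorded before any compounds elute from the column."""
    level = sum(y[:n_pre]) / n_pre
    return [v - level for v in y]


def best_shift(y, target, max_shift):
    """Integer shift (in samples) that best aligns `y` to `target`.

    Assumption: alignment quality is scored by the sum of products of
    overlapping samples; a positive shift moves `y` to later retention times.
    """
    def score(s):
        return sum(y[i - s] * target[i]
                   for i in range(len(target)) if 0 <= i - s < len(y))
    return max(range(-max_shift, max_shift + 1), key=score)


def shift_rigid(y, s):
    """Shift `y` by `s` samples, repeating the edge values at the ends."""
    n = len(y)
    return [y[min(max(i - s, 0), n - 1)] for i in range(n)]


# Invented example: the sample peak elutes two samples later than in the target.
target = [0.0, 0.0, 0.0, 1.0, 5.0, 1.0, 0.0, 0.0, 0.0, 0.0]
raw = [2.0, 2.0, 2.0, 2.0, 2.0, 3.0, 7.0, 3.0, 2.0, 2.0]
corrected = subtract_blank_level(raw, n_pre=3)
shift = best_shift(corrected, target, max_shift=3)
aligned = shift_rigid(corrected, shift)
```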

For the first aspect of the invention it is preferred that each of the standard solutions contains between 10-20 individual hydrocarbon compounds. The number of standard solutions may be selected to be in the range of 4-12. The concentrations of characteristic hydrocarbon compounds in the standard solutions may be in the range of 0.2 - 100 ppm.

According to a second aspect of the invention there is provided a computer assisted method for producing a number of calibration curves to be used in determining total hydrocarbon concentrations of soil samples from GC-FID (gas-chromatography/flame ionization detector) chromatograms, the method comprising the steps of: a) selecting a number of characteristic hydrocarbon compounds; b) producing a number of liquid sample solutions (standard solutions) having different but known concentrations of a mixture of the characteristic hydrocarbon compounds; c) obtaining a GC-FID chromatogram for each or at least part of the sample solutions with each chromatogram consisting of a detector signal intensity curve as a function of retention time, wherein each characteristic hydrocarbon compound of a sample solution is represented by a maximum intensity peak at a maximum peak retention time, which peak retention time is characteristic for the corresponding hydrocarbon compound, and wherein each maximum intensity peak has a corresponding intensity peak area being a function of the sample concentration of the corresponding hydrocarbon compound; d) storing data representing the obtained chromatograms; e) calculating peak area data for each or at least part of the intensity peak areas for each or at least part of the stored chromatograms, whereby each calculated peak area corresponds to the sample concentration of the characteristic hydrocarbon compound with its maximum peak retention time; f) dividing at least part of the stored chromatograms for which peak area data are calculated in step e) in two or more consecutive retention time groups; g) calculating corresponding group area data and group concentration data for each retention time group within each of the divided chromatograms, said group area data representing the sum of the calculated peak areas represented by the peak area data of step e) within the respective retention time group, and said group concentration data representing the sum of the sample concentrations corresponding to the calculated peak areas being summed within the retention time group; h) producing a calibration data set for each of the retention time groups, each calibration data set holding retention time group area data with corresponding retention time group concentration data for each of the sample solutions for which a GC-FID chromatogram is obtained; and i) producing a calibration curve giving retention time group area as a function of retention time group concentration for each of the retention time groups based on the obtained calibration data set.

For the second aspect of the invention the calibration curve may be a first or second order polynomial model calculated by least squares regression or weighted least squares regression of the obtained calibration data set.
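Steps f) to i) of the second aspect may be sketched as follows for the first order, unweighted case. The names `group_sums`, `fit_line` and `calibration_curves`, the representation of a peak as a tuple (retention time, area, concentration), and the half-open retention time groups are assumptions made for this illustration only.

```python
from bisect import bisect_right


def group_sums(peaks, edges):
    """Sum peak areas and concentrations per retention time group.

    peaks: list of (retention_time, peak_area, concentration) tuples.
    edges: sorted group boundaries; group g covers [edges[g], edges[g + 1]).
    """
    n_groups = len(edges) - 1
    area = [0.0] * n_groups
    conc = [0.0] * n_groups
    for rt, a, c in peaks:
        g = bisect_right(edges, rt) - 1
        if 0 <= g < n_groups:                # peaks outside all groups are ignored
            area[g] += a
            conc[g] += c
    return area, conc


def fit_line(x, y):
    """Unweighted least-squares line y = intercept + slope * x."""
    n = len(x)
    x_mean, y_mean = sum(x) / n, sum(y) / n
    slope = (sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
             / sum((xi - x_mean) ** 2 for xi in x))
    return y_mean - slope * x_mean, slope


def calibration_curves(standards, edges):
    """One (intercept, slope) pair per retention time group."""
    n_groups = len(edges) - 1
    data = [([], []) for _ in range(n_groups)]   # (group conc, group area) per group
    for peaks in standards:
        area, conc = group_sums(peaks, edges)
        for g in range(n_groups):
            data[g][0].append(conc[g])
            data[g][1].append(area[g])
    return [fit_line(c, a) for c, a in data]


# Invented example: three standard solutions, four compounds, two groups.
edges = [0.0, 10.0, 20.0]
standards = [
    [(3.0, 2.0 * c, c), (7.0, 2.0 * c, c), (12.0, 3.0 * c, c), (18.0, 3.0 * c, c)]
    for c in (1.0, 2.0, 4.0)
]
curves = calibration_curves(standards, edges)
```

A second order model or a weighted regression, both of which are mentioned above, would replace `fit_line` only; the grouping logic would stay the same.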

It is preferred that for step e) in the second aspect of the invention, then for each maximum intensity peak a start peak retention time, spRT, being lower than the maximum peak or peak apex retention time, paRT, is defined, and an end peak retention time, epRT, being higher than the maximum peak retention time, paRT, is defined, and the intensity peak area is calculated as the area covered by the intensity curve above an intensity baseline being drawn from the intensity curve at start peak retention time, spRT, to end peak retention time, epRT. The start peak retention time may be found as the last RT before the paRT where the intensity function has a negative slope, and the end peak retention time may be found as the first RT after the paRT where the intensity function has a positive slope, respectively.

For the second aspect of the invention it is preferred that before the area calculating of step e), a reduction of non-sample information (data artifacts) is performed on the chromatograms represented by the data sets stored in step d). The reduction of non-sample information may include subtraction of data, which represents a chromatogram obtained by GC-FID analysis of a sample containing no hydrocarbon compounds, from each of the stored standard solution chromatograms. It is also within an embodiment of the second aspect of the invention that the GC-FID system used for obtaining the chromatograms comprises a gas chromatograph with a column, wherein the reduction of non-sample information includes subtraction of intensity data representing a blank chromatogram from each of the standard solution chromatograms, and wherein the intensity data for the blank chromatogram is obtained by taking the average of the intensities recorded before any compounds elute from the column, or wherein the blank chromatogram results from the injection of a sample into the column, which sample does not contain any hydrocarbon compounds. The reduction of non-sample information may include retention time alignment of the chromatograms. The retention time alignment of a chromatogram may comprise shifting retention time sections of the chromatogram by a constant value (rigid alignment), and/or comprise shifting the retention times by a value being a function of retention time (non-rigid alignment). The non-rigid alignment may consist of sequential stretching and compression of the chromatogram in order to best align the chromatogram to a target chromatogram so that the retention time of each compound in the aligned chromatogram is the same as the retention time for these compounds in the target chromatogram.
The reduction of non-sample information may include an adjustment of the chromatograms for retention time related changes in the sensitivity of the GC-FID system, and the adjustment of the chromatograms may be performed by dividing each intensity of a chromatogram with the response factor corresponding to the retention time of the intensity, where the response factor is found from the second set of response factor-retention time data according to any one of the methods of the first aspect of the invention.
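The adjustment for retention time related changes in sensitivity amounts to an element-wise division, which may be sketched as follows. The name `adjust_sensitivity` is hypothetical, and the sketch assumes that the second set of response factor-retention time data has already been evaluated at every sampled retention time of the chromatogram.

```python
def adjust_sensitivity(intensities, response_factors):
    """Divide each intensity by the response factor at the same retention time."""
    if len(intensities) != len(response_factors):
        raise ValueError("one response factor per intensity is required")
    if any(rf <= 0 for rf in response_factors):
        raise ValueError("response factors must be positive")
    return [y / rf for y, rf in zip(intensities, response_factors)]


# Invented example: two compounds at equal concentration, the later one
# detected with twice the sensitivity of the earlier one.
raw = [0.0, 10.0, 0.0, 20.0, 0.0]
rf = [1.0, 1.0, 1.5, 2.0, 2.0]
adjusted = adjust_sensitivity(raw, rf)
```

After the division both peaks have the same height, so the adjusted signal reflects concentration rather than the retention time dependent sensitivity of the detector.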

Also for the second aspect of the invention it is preferred that each of the standard solutions contains between 10-20 individual hydrocarbon compounds. The number of standard solutions may be selected to be in the range of 4-12. The concentrations of characteristic hydrocarbon compounds in the standard solutions may be in the range of 0.2 - 100 ppm.

According to a third aspect of the invention there is provided a computer assisted method for determining the total hydrocarbon concentrations by use of GC-FID (gas-chromatography/flame ionization detector) chromatograms and calibration curves obtained according to any of the methods of the second aspect of the invention, the method comprising the steps of: a) obtaining a liquid extract of a soil sample for which the hydrocarbon concentration is to be determined; b) obtaining a GC-FID chromatogram for the liquid extract which chromatogram shows a detector signal intensity curve with intensity peaks as a function of retention time; c) storing data representing the obtained chromatogram; d) dividing the stored chromatogram data in a number of consecutive retention time groups being equal to the retention time groups represented by the obtained calibration curves; e) calculating intensity group area data for each retention time group, each group area data being representative of the intensity curve area covered by the intensity peaks within the corresponding retention time group; and f) determining the total hydrocarbon concentration for a retention time group from the calibration curve, which calibration curve represents the selected retention time group, as the retention time group concentration having a retention time group area equal to the obtained intensity group area.

For the third aspect of the invention it is preferred that before the area calculating of step e), a reduction of non-sample information (data artifacts) is performed on the chromatogram represented by the data set stored in step c). The reduction of non-sample information may include subtraction of data, which represents a chromatogram obtained by GC-FID analysis of a sample containing no hydrocarbon compounds, from the stored chromatogram data. It is also within an embodiment of the third aspect of the invention that the GC-FID system used for obtaining the chromatogram comprises a gas chromatograph with a column, wherein the reduction of non-sample information includes subtraction of intensity data representing a blank chromatogram from the stored chromatogram, and wherein the intensity data for the blank chromatogram is obtained by taking the average of the intensities recorded before any compounds elute from the column, or wherein the blank chromatogram results from the injection of a sample into the column, which sample does not contain any hydrocarbon compounds. The reduction of non-sample information may include retention time alignment of the chromatogram. The retention time alignment of a chromatogram may comprise shifting retention time sections of the chromatogram by a constant value (rigid alignment), and/or comprise shifting the retention times by a value being a function of retention time (non-rigid alignment). The non-rigid alignment may consist of sequential stretching and compression of the chromatogram in order to best align the chromatogram to a target chromatogram so that the retention time of each compound in the aligned chromatogram is the same as the retention time for these compounds in the target chromatogram. 
The reduction of non-sample information may include an adjustment of the chromatogram for retention time related changes in the sensitivity of the GC-FID system, and the adjustment of the chromatogram may be performed by dividing each intensity of a chromatogram with the response factor corresponding to the retention time of the intensity, where the response factor is found from the second set of response factor-retention time data according to any of the methods of the first aspect of the invention.

For the third aspect of the invention it is preferred that for step e), the calculation of the intensity group area data for each retention time group is performed by summing all intensities of the chromatogram within the retention time group.
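Steps d) to f) of the third aspect may be sketched as follows for first order calibration curves. The names `group_intensity_sums` and `group_concentrations` are hypothetical, and the sketch assumes that each calibration curve is given as an (intercept, slope) pair relating group area to group concentration and that the curves were built from areas on the same scale as the summed intensities.

```python
from bisect import bisect_right


def group_intensity_sums(t, y, edges):
    """Sum all intensities within each retention time group.

    edges: sorted group boundaries; group g covers [edges[g], edges[g + 1]).
    """
    sums = [0.0] * (len(edges) - 1)
    for ti, yi in zip(t, y):
        g = bisect_right(edges, ti) - 1
        if 0 <= g < len(sums):
            sums[g] += yi
    return sums


def group_concentrations(areas, curves):
    """Invert area = intercept + slope * concentration for every group."""
    return [(a - intercept) / slope
            for a, (intercept, slope) in zip(areas, curves)]


# Invented example: six samples of a corrected chromatogram, two groups.
t = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
edges = [0.0, 3.0, 6.0]
curves = [(0.0, 2.0), (3.0, 4.0)]            # hypothetical (intercept, slope) pairs
areas = group_intensity_sums(t, y, edges)
conc = group_concentrations(areas, curves)
```

For a second order calibration curve the inversion would instead require solving a quadratic equation and selecting the root that lies within the calibrated concentration range.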

According to a fourth aspect of the invention there is provided a method or computer assisted method for obtaining a corrected GC-FID (gas-chromatography/flame ionization detector) chromatogram of a soil sample comprising one or more hydrocarbon compounds, the method comprising the steps of: a) obtaining a liquid extract of the soil sample comprising the hydrocarbon compounds; b) obtaining a GC-FID chromatogram for the liquid extract, which chromatogram shows a detector signal intensity curve with intensity peaks as a function of retention time; c) performing a reduction of non-sample information on the obtained chromatogram, wherein the reduction of non-sample information includes an adjustment of the chromatogram for retention time related changes in the sensitivity of the GC-FID system; and d) storing data representing the chromatogram being corrected for non-sample information.

For the fourth aspect of the invention it is preferred that the adjustment of the chromatogram is performed by adjusting the intensity curve of the chromatogram by response factor values given by a response factor function being a function of retention time and expressing variation in the sensitivity of the GC-FID system as a function of retention time. The response factor function may be based on GC-FID chromatograms representing a number of liquid standard sample solutions having different but known concentrations of a mixture of a number of selected characteristic hydrocarbon compounds. The standard sample solution chromatograms may show a detector signal intensity curve as a function of retention time, with each characteristic hydrocarbon compound of a sample solution being represented by a maximum intensity peak at a maximum peak retention time, which peak retention time is characteristic for the corresponding hydrocarbon compound, and with each maximum intensity peak having a corresponding intensity peak area being a function of the sample concentration of the corresponding hydrocarbon compound, and it is preferred that the response factor function is based on calculated peak area data for each or at least part of the intensity peak areas for each or at least part of the standard sample solution chromatograms.

For the fourth aspect of the invention it is preferred that the response factor function is based on a set of area-concentration data produced for each of the selected hydrocarbon compounds, each set of area-concentration data being based on the calculated peak area data and the known sample concentrations of the selected hydrocarbon compound. The response factor function may be based on response factor values being determined for each of the selected characteristic hydrocarbon compounds, where the response factor values are determined as a slope representing at least part of an area-concentration curve obtained from the set of area-concentration data, which curve represents the calculated peak areas as a function of the concentration of the selected hydrocarbon compound in the different standard sample solutions. It is preferred that for each selected hydrocarbon compound, the obtained response factor value is paired together with the corresponding maximum peak retention time, whereby a first set of response factor-retention time data providing at least part of the response factor function is obtained.

For the fourth aspect of the invention it is preferred that one or more new response factor values are calculated by use of interpolation. The new response factor value(s) is/are calculated for retention times between two nearest neighbor maximum peak retention times of the first set of response factor-retention time data, whereby a second and expanded set of response factor-retention time data defining the response factor function is obtained.

Also for the fourth aspect of the invention, the reduction of non-sample information may include subtraction of data, which represents a chromatogram obtained by GC-FID analysis of a sample containing no hydrocarbon compounds, from the stored chromatogram. Also for the fourth aspect of the invention, the GC-FID system used for obtaining the chromatogram may comprise a gas chromatograph with a column, wherein the reduction of non-sample information includes subtraction of intensity data representing a blank chromatogram from the obtained chromatogram, and wherein the intensity data for the blank chromatogram is obtained by taking the average of the intensities recorded before any compounds elute from the column, or wherein the blank chromatogram results from the injection of a sample into the column, which sample does not contain any hydrocarbon compounds. It is preferred that the reduction of non-sample information includes retention time alignment of the chromatogram. The retention time alignment of the chromatogram may comprise shifting retention time sections of the chromatogram by a constant value (rigid alignment), and/or comprise shifting the retention times by a value being a function of retention time (non-rigid alignment). The non-rigid alignment may consist of sequential stretching and compression of the chromatogram in order to best align the chromatogram to a target chromatogram so that the retention time of each compound in the aligned chromatogram is the same as the retention time for these compounds in the target chromatogram.

It is within an embodiment of the fourth aspect of the invention that the response factor values used for adjusting the intensity curve of the chromatogram are found from the second set of response factor-retention time data according to any of the methods of the first aspect of the invention.

According to a fifth aspect of the invention there is provided a computer assisted method for constructing a pollution type model for hydrocarbon pollutions of soil, where a pollution type model comprises a number of chromatographic pollution profiles with corresponding hydrocarbon pollution types, the method comprising the steps of: a) obtaining a liquid extract of a number of reference oil containing soil samples for which samples the contained type or types of oil is/are known; b) obtaining a reference GC-FID chromatogram for each of the reference soil samples, each chromatogram showing a detector signal intensity curve with intensity peaks as a function of retention time; c) storing data representing the obtained reference chromatograms; d) generating and storing a predetermined number of chromatographic pollution profiles based on the stored set of chromatograms; e) classifying the chromatographic pollution profiles to identify which oil pollution type is matched to a chromatographic pollution profile, thereby obtaining a reference library listing chromatographic pollution profiles with matching pollution types.

For the fifth aspect of the invention it is preferred that step d) comprises a principal convex hull analysis in which the chromatographic pollution profiles are weighted averages of the stored reference chromatograms. Here, the weights for the weighted averages may be chosen to give the best fit of weighted sums of the chromatographic pollution profiles to the stored reference chromatograms.
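The relation between the reference chromatograms and the pollution profiles may be written compactly as below. This is a sketch in notation introduced here for illustration: the symbols X, C, S and A do not appear in the original text. X is taken to be the N x T matrix whose rows are the N stored reference chromatograms sampled at T retention times, and A the K x T matrix whose rows are the K pollution profiles. Measuring the fit by a sum of squared differences and requiring S to be non-negative are assumptions, since the text only states that the weights are chosen to give the best fit.

```latex
% Profiles as weighted averages of the reference chromatograms
A = C X, \qquad c_{kn} \ge 0, \qquad \sum_{n=1}^{N} c_{kn} = 1 \quad (k = 1, \dots, K)

% Weights chosen so that weighted sums of the profiles best fit the references
\min_{C,\, S} \; \lVert X - S\, C\, X \rVert_F^2, \qquad s_{nk} \ge 0
```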

For the fifth aspect of the invention, then the classification of step e) may be performed based on the prior knowledge of the oil pollution types in the samples. The classification can be performed by comparing the pollution profiles with prior knowledge about the chromatographic fingerprints of certain pollution types, or by comparing the sample distributions with knowledge of the pollution types present in the samples.

Also for the fifth aspect of the invention it is preferred that before the generation of pollution profiles of step d), a reduction of non-sample information (data artifacts) is performed on any of the chromatograms represented by the data set stored in step c).

The reduction of non-sample information may include subtraction of data, which represents a chromatogram obtained by GC-FID analysis of a sample containing no hydrocarbon compounds, from the stored chromatogram data. The GC-FID system used for obtaining the chromatograms comprises a gas chromatograph with a column, and the reduction of non-sample information may include subtraction of intensity data representing a blank chromatogram from the stored chromatogram, and the intensity data for the blank chromatogram may be obtained by taking the average of the intensities recorded before any compounds elute from the column, or the blank chromatogram may result from the injection of a sample into the column, which sample does not contain any hydrocarbon compounds. The reduction of non-sample information may include retention time alignment of the chromatogram. The retention time alignment of a chromatogram may comprise shifting retention time sections of the chromatogram by a constant value (rigid alignment), and/or comprise shifting the retention times by a value being a function of retention time (non-rigid alignment). The non-rigid alignment may consist of sequential stretching and compression of the chromatogram in order to best align the chromatogram to a target chromatogram so that the retention time of each compound in the aligned chromatogram is the same as the retention time for these compounds in the target chromatogram. The reduction of non-sample information may include an adjustment of the chromatogram for retention time related changes in the sensitivity of the GC-FID system, and the adjustment of the chromatogram may be performed by dividing each intensity of a chromatogram with the response factor corresponding to the retention time of the intensity, where the response factor is found from the second set of response factor-retention time data according to any of the methods of the first aspect of the invention.

According to a sixth aspect of the invention there is provided a computer assisted method for pollution type apportionment of oil polluted soil samples, by use of GC-FID (gas-chromatography/flame ionization detector) chromatograms and a pollution type model obtained according to any of the methods of the fifth aspect of the invention, the method comprising the steps of: a) obtaining a liquid extract of a soil sample for which the pollution type apportionment is to be determined; b) obtaining a GC-FID chromatogram for the liquid extract which chromatogram shows a detector signal intensity curve with intensity peaks as a function of retention time; c) storing data representing the obtained chromatogram; d) comparing the contaminated soil chromatogram with the chromatographic pollution profiles of the pollution type model, and determining a representation of the contaminated soil chromatogram based on the chromatographic pollution profiles, which representation gives a fraction measure for each contributing pollution profile; e) comparing the fraction measures and the contributing chromatographic pollution profiles of the representation with the reference library, and determining based on said comparison a pollution type measure, which pollution type measure gives a fraction measure of the pollution types being represented by the chromatographic pollution profiles of the representation, thereby obtaining a measure of identified pollution types within said contaminated soil.

The sixth aspect of the invention may further comprise a step f) of comparing the obtained representation of step d) with the original stored contaminated soil chromatogram, and determining based on said comparison a measure for the difference or mismatch between the obtained representation and the contaminated soil chromatogram.

Also for the sixth aspect of the invention it is preferred that before the comparison in step d), a reduction of non-sample information (data artifacts) is performed on the chromatogram represented by the data set stored in step c). This reduction of non-sample information may include any of the steps described for the fifth aspect of the invention.

BRIEF DESCRIPTION OF THE DRAWINGS

Fig. 1 is a block diagram illustrating different measurement and computational methods used in accordance with an aspect of the invention,

Fig. 2 is a flow chart illustrating a method according to an aspect of the invention for producing a data set for adjusting GC-FID (gas-chromatography/flame ionization detector) chromatograms for retention time related changes in the sensitivity of the GC-FID system,

Fig. 3 is a flow chart illustrating a method according to an aspect of the invention for correcting chromatograms for non-sample contributions,

Figs. 4a-4d show examples of chromatograms for different sample types,

Figs. 5a-5f show chromatograms for a sample at different stages of removal of non-sample contributions,

Figs. 6a-6e illustrate different steps of the method of Fig. 2 for producing a data set for adjusting GC-FID chromatograms for retention time related changes in the sensitivity of the GC-FID system,

Fig. 7 is a flow chart illustrating a method according to an aspect of the invention for producing a number of calibration curves to be used in determining total hydrocarbon concentrations of contaminated soil samples from GC-FID chromatograms,

Figs. 8a-8c illustrate different steps of the method of Fig. 7 for producing a calibration curve to be used in determining total hydrocarbon concentrations,

Fig. 9 is a flow chart illustrating a method according to an aspect of the invention for determining the total hydrocarbon concentrations of contaminated soil samples by use of GC-FID chromatograms and obtained calibration curves,

Figs. 10a-10h illustrate the use of calibration curves for determination of total hydrocarbon concentrations of contaminated soil,

Fig. 11 is a flow chart illustrating a method according to an aspect of the invention for construction of pollution type model,

Fig. 12 is a flow chart illustrating a method according to an aspect of the invention for determining the distribution of types of hydrocarbon pollution in soil samples by applying the pollution type model of Fig. 11,

Figs. 13a-13c are GC-FID chromatograms representing soil samples with different types of pollution, which chromatograms may be used for constructing the pollution type model of Fig. 11,

Fig. 14 is a diagram illustrating the distribution of types of hydrocarbon pollution in 15 different soil samples, where the distribution is determined following the method of Fig. 12 using a pollution type model based on the chromatograms of Figs. 13a-13c, and

Fig. 15 is a block diagram illustrating a GC-FID system, which can be used in accordance with embodiments of the methods of the present invention.

DETAILED DESCRIPTION OF THE INVENTION

Fig. 1 is a block diagram illustrating different measurement and computational methods used in accordance with an aspect of the invention. Fig. 1 shows five methods. In the first method 101, GC-FID (gas-chromatography/flame ionization detector) chromatograms of liquid standard solutions are used to create standard calibration curves for total petroleum hydrocarbon (TPH) determination. The chromatograms of the liquid standard solutions may also be used in the second method 102 to determine response factors (RFs) as a function of retention time (RT), where the response factors can be used for adjusting obtained chromatograms for retention time related changes in the sensitivity of the GC-FID system. The third method 103 makes use of the standard curves of the first method 101 and the response factors of the second method 102 in order to calculate the total petroleum hydrocarbon concentrations in polluted soil samples. However, besides determining the hydrocarbon concentration, there may also be a need for determining the distribution of the types of hydrocarbon compounds within the soil samples. The fourth method 104 is a method of constructing a hydrocarbon pollution type model (source model), which may make use of the response factors of the second method 102, and the fifth method 105 is a method making use of the source model 104 to determine the distribution of hydrocarbon compounds in polluted samples (source apportionment).

When analyzing GC-FID chromatograms representing hydrocarbon contaminated soil samples, it may be necessary to correct the chromatograms for non-sample contributions in order to obtain a good estimate of the total hydrocarbon concentration in the samples.

According to an aspect of the present invention there is provided a new computer assisted method for producing a data set, which can be used for adjusting GC-FID chromatograms for retention time related changes in the sensitivity of the GC-FID system. This method is illustrated in Fig. 2 and discussed in the following.

Select hydrocarbon compounds - 201

First a number of characteristic hydrocarbon compounds are selected. Typically, straight-chain alkanes covering the boiling point range of the compounds to be analyzed are used.

Produce liquid standard solutions - 202

Then a number of liquid sample solutions, which may be referred to as standard solutions, are produced. Each standard solution has a known concentration of a mixture of the characteristic hydrocarbon compounds, and the concentration differs from solution to solution. It is preferred to use a mixture of between 10 and 20 characteristic hydrocarbon compounds. In order to obtain the standard solutions, a solution that contains the hydrocarbon compounds in high concentrations, typically between 50 and 200 milligrams per litre of solvent (mg/L), is produced. This solution is named the ‘stock solution’. The stock solution is then diluted with solvent into 4 to 12 standard solutions with concentrations between 0.2 and 100 mg/L. The concentrations of the standard solutions should be chosen so that the intensities of the GC-FID chromatograms of the standard solutions span the intensities in the GC-FID chromatograms of the soil extracts or samples to be analysed. It is preferred that each of the standard solutions contains between 10 and 20 individual hydrocarbon compounds.

Obtain GC-FID chromatograms of standard samples - 203

For each of the standard solutions a GC-FID chromatogram is obtained, where each chromatogram shows a detector signal intensity curve as a function of retention time. For each of the standard solution chromatograms, each of the characteristic hydrocarbon compounds of the solution is represented by a maximum intensity peak at a maximum peak retention time, which peak retention time is characteristic for the corresponding hydrocarbon compound, and where each maximum intensity peak has a corresponding intensity peak area being a function of the sample concentration of the corresponding hydrocarbon compound.

In order to obtain the standard solution chromatograms, the standard solutions are pipetted into suitable vials which are sealed with a septum. A gas chromatography (GC) method is constructed, consisting of an injection method (split, splitless, on-column etc.), an injection volume, a temperature program, a flow rate, column characteristics (length, internal diameter, stationary phase chemistry and thickness) etc. The vials are placed in the liquid auto-sampler of a GC. For each standard solution, a chromatogram is recorded by injecting a volume of the standard solution in the inlet of the GC and heating the column according to the temperature program. The effluent of the column is passed through a flame ionization detector (FID), and the current through the FID is recorded at each retention time. This is called a GC-FID chromatogram of the standard solution.

Store chromatograms - 204

The GC-FID chromatograms can be stored on the computer that controls the GC-FID used to record them, or elsewhere.

Remove non-sample variation - 205

Optionally, variation arising from e.g. small changes in the chromatographic column or instrument parameters between runs, or from differences in the sensitivity of the instrumentation towards different compounds, can be removed before further analysis. The procedure for removing non-sample variation or non-sample contributions from the chromatograms is described below in connection with the flow chart of Fig. 3.

Calculate peak areas - 206

For all hydrocarbon compounds in each GC-FID chromatogram, peak area data are calculated for each of the intensity peak areas, where each calculated peak area is proportional to the sample concentration of the characteristic hydrocarbon compound having the maximum peak retention time belonging to the maximum intensity peak. The peak areas are calculated by first determining the start and end of each peak and then by integrating the area above the baseline (and below the intensity curve) connecting the start and end of the peak.

Thus, for each maximum intensity peak a start peak retention time, spRT, being lower than the maximum peak or peak apex retention time, paRT, is defined, and an end peak retention time, epRT, being higher than the maximum peak retention time, paRT, is defined, and the intensity peak area is calculated as the area covered by the intensity curve above an intensity baseline being drawn from the intensity curve at start peak retention time, spRT, to end peak retention time, epRT. The spRT and epRT can be found as the last RT before the paRT where the intensity function has a negative slope, and the first RT after the paRT where the intensity function has a positive slope, respectively.
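By way of illustration only, the peak integration of step 206 may be sketched in Python as follows. This is a minimal sketch and not a definitive implementation: the function names `peak_bounds` and `peak_area` are hypothetical, the index of the peak apex is assumed to be known, the peak limits are found by walking outward from the apex until the curve stops descending, and the area is approximated with the trapezoidal rule.

```python
def peak_bounds(intensity, apex):
    """Return (start, end) indices of the peak around index `apex`.

    Assumption: the peak starts where the curve stops rising towards the
    apex (walking leftwards) and ends where it stops falling (walking
    rightwards), i.e. at the slope sign changes on either side.
    """
    start = apex
    while start > 0 and intensity[start - 1] < intensity[start]:
        start -= 1
    end = apex
    while end < len(intensity) - 1 and intensity[end + 1] < intensity[end]:
        end += 1
    return start, end


def peak_area(rt, intensity, apex):
    """Area between the intensity curve and a straight baseline drawn
    from the curve at the peak start (spRT) to the peak end (epRT)."""
    start, end = peak_bounds(intensity, apex)
    # Trapezoidal area under the intensity curve from start to end.
    under_curve = 0.0
    for i in range(start, end):
        under_curve += 0.5 * (intensity[i] + intensity[i + 1]) * (rt[i + 1] - rt[i])
    # Area under the straight baseline connecting the two end points.
    under_baseline = 0.5 * (intensity[start] + intensity[end]) * (rt[end] - rt[start])
    return under_curve - under_baseline
```

On noisy data a slope-based search of this kind would normally be applied to a smoothed intensity curve; smoothing is not part of the sketch.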

Produce area-concentration data - 207

For each hydrocarbon compound, a set of area-concentration data is produced. The data set consists of the peak area data calculated from the chromatograms and the corresponding concentration for each standard solution.

Calculate response factors, RFs, for hydrocarbon compounds - 208

For each of the selected hydrocarbon compounds a response factor, RF, is determined. The response factor, RF, is determined as a slope or slope part representing an area-concentration curve obtained from the set of area-concentration data. The area-concentration curve represents the calculated peak areas as a function of the concentration of the selected hydrocarbon compound in the different sample solutions.

The response factors, RFs, may be calculated by least squares regression or weighted least squares regression. Here, a model is fitted to the area-concentration data: the differences (errors) between the model and the observed data are found, and the model parameters are chosen so that the sum of the squares of these errors is minimized.
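As a hedged illustration of step 208, the slope of a straight-line model fitted by (optionally weighted) least squares may be computed as sketched below. The function name `fit_line`, the choice of a straight line with intercept, and the optional `weights` argument are assumptions made for the example only.

```python
def fit_line(x, y, weights=None):
    """Fit y = slope * x + intercept by (weighted) least squares.

    x: concentrations of one compound in the standard solutions
    y: corresponding peak areas
    Returns (slope, intercept); the slope plays the role of the RF.
    """
    if weights is None:
        weights = [1.0] * len(x)
    total_w = sum(weights)
    mean_x = sum(w * xi for w, xi in zip(weights, x)) / total_w
    mean_y = sum(w * yi for w, yi in zip(weights, y)) / total_w
    # Weighted sums of squares and cross products about the means.
    sxx = sum(w * (xi - mean_x) ** 2 for w, xi in zip(weights, x))
    sxy = sum(w * (xi - mean_x) * (yi - mean_y)
              for w, xi, yi in zip(weights, x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept
```

For example, `fit_line(concentrations, areas)[0]` would give the response factor of one compound from its area-concentration data.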

Store response factors, RFs, for hydrocarbon compounds - 209

For each of the selected hydrocarbon compounds, the obtained response factor, RF, is stored together with the maximum peak retention time, where the peak retention time is characteristic for the selected hydrocarbon compound. The result is a first set of response factor-peak retention time data, holding the obtained response factors as a function of retention time.

Calculate new response factors, RF, as a function of retention time, RT - 210

The first set of response factor-peak retention time data only holds data for a limited number of retention times, namely the peak retention times of the selected hydrocarbon compounds. In order to adjust chromatograms representing hydrocarbon compounds with other peak retention times, a number of new response factors are calculated for retention times between the peak retention times represented in the first data set. The new response factors are calculated by use of interpolation, whereby a second and expanded set of response factor-retention time data is obtained.

The new response factors for retention times between two nearest neighbor peak retention times of the first data set may be calculated by use of linear interpolation using the already obtained response factor values belonging to these two nearest neighbor peak retention times. The new response factors may also be calculated by use of linear interpolation using the response factor values of the two nearest neighbor peak retention times and by use of response factor values for peak retention times closest to the two nearest neighbor peak retention times. The number of calculated new interpolated response factors in-between two nearest neighbor retention times may be equal to the number of data points obtained between these two retention times, which is determined by the sampling rate of the flame ionization detector.
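The linear interpolation between nearest neighbor peak retention times may, purely as an example, be sketched as follows. The function name `expand_response_factors` is hypothetical, the peak retention times are assumed to be sorted in ascending order, and holding the first and last response factor constant outside the covered retention time range is an assumption of the sketch that is not prescribed by the method.

```python
from bisect import bisect_right


def expand_response_factors(peak_rts, peak_rfs, rt_axis):
    """Return one response factor for every retention time in `rt_axis`.

    peak_rts, peak_rfs: first data set (sorted peak RTs and their RFs)
    rt_axis: retention times of the detector data points
    """
    expanded = []
    for rt in rt_axis:
        if rt <= peak_rts[0]:
            # Assumption: before the first standard peak, reuse its RF.
            expanded.append(peak_rfs[0])
        elif rt >= peak_rts[-1]:
            # Assumption: after the last standard peak, reuse its RF.
            expanded.append(peak_rfs[-1])
        else:
            j = bisect_right(peak_rts, rt)  # index of right-hand neighbor
            t0, t1 = peak_rts[j - 1], peak_rts[j]
            f0, f1 = peak_rfs[j - 1], peak_rfs[j]
            fraction = (rt - t0) / (t1 - t0)
            expanded.append(f0 + fraction * (f1 - f0))
    return expanded
```

Passing the full retention time axis of the detector as `rt_axis` yields one interpolated response factor per recorded data point, in line with the sampling rate of the flame ionization detector.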

Store new response factors, RF, as a function of retention time, RT - 211

The second set of response factor-retention time data, holding the first data set and the newly calculated response factors, is stored. This second set of response factor-retention time data may be used for adjusting GC-FID chromatograms for retention time related changes in the sensitivity of the GC-FID system.

As already mentioned above, it may be necessary to correct the chromatograms for non-sample contributions in order to obtain a good estimate of the total hydrocarbon concentration in the samples. Fig. 3 is a flow chart illustrating a method according to an aspect of the invention for correcting chromatograms for non-sample contributions, which method is discussed in the following.

Subtract chromatograms from analysis of pure solvent or extract made without soil - 301

To correct for the part of the detector signal that originates from electronic noise, column bleed or other non-sample sources, the most likely shape of these interferences (background) can be subtracted from the obtained chromatograms before further analysis. The background can either be the average of the intensity recorded before any part of the sample elutes from the column, a chromatogram of a sample containing no hydrocarbon compounds or an average of several such chromatograms. Thus, a reduction of non-sample information may include subtraction of data representing a chromatogram obtained by GC-FID analysis of a sample containing no hydrocarbon compounds from the obtained chromatograms.
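The two background options of step 301 may be sketched as shown below. The function names and the parameter `n_blank_points` (the number of data points recorded before anything elutes) are hypothetical and chosen for the example only.

```python
def subtract_pre_elution_mean(intensity, n_blank_points):
    """Subtract the mean of the first `n_blank_points` intensities,
    assumed to be recorded before any compound elutes from the column."""
    background = sum(intensity[:n_blank_points]) / n_blank_points
    return [value - background for value in intensity]


def subtract_blank(intensity, blank):
    """Subtract a blank chromatogram point by point. Both chromatograms
    are assumed to share the same retention time axis."""
    if len(intensity) != len(blank):
        raise ValueError("chromatograms must have the same length")
    return [value - b for value, b in zip(intensity, blank)]
```

An average of several blank chromatograms could be formed point by point first and then passed to `subtract_blank`.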

Rigid or non-rigid alignment of chromatograms - 302

To correct for changes in the retention times at which compounds are detected, the retention times in the obtained chromatograms can be aligned before further analysis. The retention time alignment can either change the retention times of an entire chromatogram in the same direction and by the same amount (rigid alignment); or the retention time alignment can be made by changing the retention time of sections of the chromatogram in the same direction and by the same amount (interval rigid alignment); or the retention time alignment can be made by changing the retention time of the chromatograms by a continuous, non-constant function of retention time (warping). Typically, warping will be preceded by rigid alignment. Non-rigid alignment could be correlation optimized warping or dynamic time warping. These techniques consist of sequential stretching and compression of the chromatogram in order to best align each chromatogram to a target chromatogram, such that the RT of each compound in each chromatogram is the same as the RT of that compound in the target chromatogram (the chromatograms are aligned). A facilitator chromatogram is a good choice as a target chromatogram.
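The simplest of the alignment variants, rigid alignment of an entire chromatogram, may be sketched as below. The sketch is an assumption-laden example: it searches whole data point shifts only, scores each shift by the sum of products with the target chromatogram, and pads with zeros. The function names are hypothetical, and warping methods such as correlation optimized warping or dynamic time warping are not shown.

```python
def best_rigid_shift(chromatogram, target, max_shift):
    """Return the shift (in data points) that best matches `target`.

    A positive shift moves the chromatogram towards later retention
    times. The score is the sum of products of overlapping points.
    """
    best_shift, best_score = 0, float("-inf")
    for shift in range(-max_shift, max_shift + 1):
        score = 0.0
        for i in range(len(target)):
            j = i - shift  # source index that lands on target index i
            if 0 <= j < len(chromatogram):
                score += chromatogram[j] * target[i]
        if score > best_score:
            best_shift, best_score = shift, score
    return best_shift


def apply_shift(chromatogram, shift):
    """Shift the chromatogram by `shift` points, padding with zeros."""
    n = len(chromatogram)
    shifted = [0.0] * n
    for i in range(n):
        j = i - shift
        if 0 <= j < n:
            shifted[i] = chromatogram[j]
    return shifted
```

Interval rigid alignment could reuse the same two functions on individual retention time sections.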

The above described steps 301 and 302 of reduction or correction for non-sample information (data artifacts) may be used in step 205 for the chromatograms stored in step 204 in the method described above and illustrated in Fig. 2.

Correct chromatograms using response factor, RF, function - 303

To correct for retention time-correlated changes in the sensitivity of the GC-FID system, each intensity value in the obtained chromatograms can be divided by the response factor, RF, value at the corresponding retention time, where the response factor value is found from the second set of response factor-retention time data 211.
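Step 303 amounts to a point-by-point division, which may be sketched as follows. The function name `correct_for_response` is hypothetical, and the response factor curve is assumed to hold one positive value per data point of the chromatogram (for example the expanded second data set).

```python
def correct_for_response(intensity, rf_curve):
    """Divide each intensity by the response factor at the same
    retention time (same index in both sequences)."""
    if len(intensity) != len(rf_curve):
        raise ValueError("one response factor is needed per data point")
    if any(rf <= 0 for rf in rf_curve):
        # Assumption of the sketch: response factors must be positive.
        raise ValueError("response factors must be positive")
    return [value / rf for value, rf in zip(intensity, rf_curve)]
```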

Figs. 4a-4d show examples of chromatograms for different sample types, where Fig. 4a shows a blank sample chromatogram, i.e. a sample that does not contain any hydrocarbon compounds, Fig. 4b shows a facilitator sample chromatogram, i.e. a chromatogram of a sample that is a mix of several soil extracts, Fig. 4c shows a contaminated soil sample chromatogram, and Fig. 4d shows a standard solution sample chromatogram.

Figs. 5a-5f show chromatograms for a facilitator sample at different stages of removal of non-sample contributions. Fig. 5a shows the raw chromatogram, with Fig. 5b showing a detailed part of Fig. 5a. Fig. 5c shows the chromatogram of Fig. 5a after background removal as discussed in step 301 of Fig. 3, where the background removal is performed by subtracting the mean of the blank part of the chromatograms. Fig. 5d shows a detailed part of Fig. 5c. Fig. 5e shows the chromatogram of Fig. 5c after retention time alignment as discussed in step 302 of Fig. 3, where the retention time alignment is performed by correlation optimized warping. Fig. 5f shows a detailed part of Fig. 5e.

Figs. 6a-6d illustrate different steps of the method of Fig. 2 for producing a data set for adjusting GC-FID chromatograms for retention time related changes in the sensitivity of the GC-FID system. Fig. 6a shows a chromatogram of one of the liquid standard solutions obtained in step 203, but after removal of non-sample variation in step 205. Fig. 6b is a zoomed-in view of one of the maximum intensity peaks of Fig. 6a having an intensity peak area. Below this, in dashed lines, is the slope of the intensity curve at each point. The vertical, dot-dashed lines indicate the last RT before the peak apex where the slope is negative and the first RT after the peak apex where the slope is positive. These are the peak start RT, psRT, and the peak end RT, peRT, respectively (the spRT and epRT of step 206). The dotted line is the line between the intensity curve at psRT and at peRT. The integration finds the area above this line, but below the intensity curve, corresponding to step 206. The peak area is calculated for each of the standard solutions for this hydrocarbon compound, and a set of area-concentration data is produced, which data corresponds to the selected hydrocarbon compound and its peak retention time, step 207. An area-concentration curve can be obtained for the hydrocarbon compound as illustrated in Fig. 6c.

A set of area-concentration data is produced for each of the intensity peaks of Fig. 6a, where each peak represents a selected characteristic hydrocarbon compound. A response factor, RF, can be calculated based on the curve of Fig. 6c, step 208. The response factor may be determined as the slope of the curve or as the slope of part of the curve of Fig. 6c.

A response factor, RF, is determined for each set of area-concentration data, and thereby for each selected hydrocarbon compound. The determined response factors are stored together with the peak retention times of the corresponding hydrocarbon compounds, and a first set of response factor-peak retention time data is obtained, step 209. The number of determined response factor values is given by the number of selected hydrocarbon compounds, but in order to adjust chromatograms for hydrocarbon contaminated soil samples, response factor values for a higher number of retention times are needed. These extra response factor values may be obtained by use of interpolation between the response factor values of the first set, step 210. This is illustrated in Fig. 6d, which shows the first determined response factor values together with the new interpolated response factor values as a function of retention time in solid lines, together with a chromatogram of a standard sample in dotted lines. This second set of response factor-retention time data, holding the first data set and the newly calculated response factors, is stored, step 211, and can be used for adjusting GC-FID chromatograms, as shown in Figs. 6e and 6f, with the unadjusted chromatogram in dotted lines, Fig. 6f, below the adjusted chromatogram in solid lines, Fig. 6e.

When determining the total hydrocarbon concentrations in soil samples by use of GC-FID chromatograms, the chromatograms are divided in consecutive retention time groups representing boiling point regions of compounds, such as benzene - C10 (decane); C10 - C25 (pentacosane); C25 - C35 (pentatriacontane) and > C35 and the total concentration of hydrocarbon compounds within each retention time group is determined.

In order to improve the determination of the total hydrocarbon concentrations in soil samples, then according to an aspect of the present invention, there is provided a computer assisted method for producing a number of calibration curves to be used in the determination of total hydrocarbon concentrations of soil samples from GC-FID (gas-chromatography/flame ionization detector) chromatograms. This method is illustrated in Fig. 7 and discussed in the following.

The first steps 701-704 of Fig. 7 are similar to the steps 201-204 of Fig. 2, and the chromatograms obtained from steps 201-204 may also be used for the method of Fig. 7.

Select hydrocarbon compounds - 701

Produce liquid standard solutions - 702

Obtain GC-FID chromatograms of standard samples - 703

Store chromatograms - 704

Remove non-sample variation - 705

This step 705 is similar to step 205 of Fig. 2, but since the response factor function may already have been determined by the process of Fig. 2, it is preferred that step 705 includes all three steps 301, 302 and 303 of Fig. 3.

Calculate peak areas - 706

This step is similar to step 206 of Fig. 2.

For all hydrocarbon compounds in each GC-FID chromatogram peak area data are calculated for each of the intensity peak areas, where each calculated peak area corresponds to the sample concentration of the characteristic hydrocarbon compound having the maximum peak retention time belonging to the maximum intensity peak. The peak areas are calculated by first determining the start and end of each peak and then by integrating the area above the baseline (and below the intensity curve) connecting the start and end of the peak.

Thus, for each maximum intensity peak a start peak retention time, spRT, being lower than the peak apex retention time, paRT, is defined, and an end peak retention time, epRT, being higher than the peak apex retention time, paRT, is defined, and the intensity peak area is calculated as the area covered by the intensity curve above an intensity baseline being drawn from the intensity curve at start peak retention time, spRT, to end peak retention time, epRT. The spRT and epRT can be found as the last RT before the paRT where the intensity function has a negative slope, and the first RT after the paRT where the intensity function has a positive slope, respectively.

Calculate sum of areas of all peaks in a retention time region - 707

The stored chromatograms for which peak area data are calculated are divided into the required retention time groups or regions. For each retention time group or region of the divided chromatograms, corresponding group area data and group concentration data are calculated. Here, the group area data represents the sum of the calculated peak areas within the respective retention time group, and the group concentration data represents the sum of the sample concentrations corresponding to the calculated peak areas being summed within the retention time group.

Make concentration/area curve (calibration curves) - 708

A calibration data set is produced for each of the retention time groups, where each calibration data set holds retention time group area data with corresponding retention time group concentration data for each of the sample solutions for which a GC-FID chromatogram is obtained. For each retention time group, a calibration curve is produced based on the obtained calibration data set, where the calibration curve gives the retention time group concentration as a function of the retention time group area.

The calibration curves may be first or second order polynomial models calculated by least squares regression or weighted least squares regression of the obtained calibration data set. The differences (errors) between the model and the observed data are found, and the model parameters are chosen so that the sum of the squares of these errors is minimized.
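Steps 707 and 708 may be illustrated by the following sketch, which sums peak areas and concentrations per retention time group and fits a first order calibration curve by unweighted least squares. The function names, the tuple layout `(apex_rt, area, concentration)` and the use of half-open retention time intervals are assumptions made for the example; a second order model or weighting is not shown.

```python
def group_totals(peaks, boundaries):
    """Sum peak areas and concentrations per retention time group.

    peaks: list of (apex_rt, area, concentration) for one standard solution
    boundaries: ascending RT limits; group g covers
                boundaries[g] <= RT < boundaries[g + 1]
    """
    n_groups = len(boundaries) - 1
    areas = [0.0] * n_groups
    concentrations = [0.0] * n_groups
    for apex_rt, area, concentration in peaks:
        for g in range(n_groups):
            if boundaries[g] <= apex_rt < boundaries[g + 1]:
                areas[g] += area
                concentrations[g] += concentration
                break
    return areas, concentrations


def fit_calibration_line(group_areas, group_concentrations):
    """Least squares fit of concentration = slope * area + intercept,
    using one (area, concentration) point per standard solution."""
    n = len(group_areas)
    mean_a = sum(group_areas) / n
    mean_c = sum(group_concentrations) / n
    saa = sum((a - mean_a) ** 2 for a in group_areas)
    sac = sum((a - mean_a) * (c - mean_c)
              for a, c in zip(group_areas, group_concentrations))
    slope = sac / saa
    return slope, mean_c - slope * mean_a
```

Calling `group_totals` once per standard solution and collecting the results for a given group provides the points from which `fit_calibration_line` produces that group's calibration curve.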

Figs. 8a-8c illustrate different steps of the method of Fig. 7 for producing a calibration curve to be used in determining total hydrocarbon concentration. Fig. 8a shows an obtained chromatogram for one standard solution, where non-sample variations have been removed, as described in step 705. Fig. 8b illustrates calculation of the peak area of one intensity peak, as described in step 706. All the calculated peak areas within one retention time region are summed, and all the sample concentrations corresponding to the intensity peaks within this retention time group are summed, and a data set of the summed peak areas and the summed sample concentrations is obtained. Thus, from one standard solution chromatogram of Fig. 8a, one point for a calibration curve as shown in Fig. 8c is obtained. A number of standard solution chromatograms with different solution concentrations are obtained, resulting in a corresponding number of points for producing the calibration curve of Fig. 8c.

A main object of the methods of the present invention is to provide an improved determination of the total hydrocarbon concentrations in soil samples, and according to an aspect of the present invention, there is provided a computer assisted method for determining the total hydrocarbon concentrations by use of GC-FID chromatograms and by use of the calibration curves obtained by the above discussed method illustrated in Fig. 7. This method is illustrated in Fig. 9 and discussed in the following.

Obtain liquid extracts of soil samples - 901

The first step is to obtain a liquid extract of the soil sample or samples for which the hydrocarbon concentration is to be determined. The soil samples are extracted with one or more suitable solvents in order to dissolve the hydrocarbon compounds present. This is done to facilitate the transfer to the gas-chromatograph (GC).

Obtain GC-FID chromatograms - 902

For each soil sample, which is to be analyzed, obtain a GC-FID chromatogram for the liquid extract, where the chromatogram shows a detector signal intensity curve with intensity peaks as a function of retention time. The sample solutions are pipetted into suitable vials which are sealed with a septum. The vials are placed in the liquid autosampler of a GC. For each sample, a chromatogram is recorded by injecting a volume of the sample solution in the inlet of the GC and heating the column according to the temperature program. The effluent of the column is passed through a flame ionization detector (FID), and the current through the FID is recorded at each retention time to produce a GC-FID chromatogram of the sample.

Store chromatogram - 903

For each chromatogram, data representing the chromatogram is stored. The GC-FID chromatograms can be stored on the computer that controls the GC-FID used to record them, or elsewhere.

Remove non-sample variation - 904

Optionally, non-sample variation can be removed before further analysis. This step 904 is similar to step 705 of Fig. 7, and when the response factor function determined by the process of Fig. 2 is used, it is preferred that step 904 includes all three steps 301, 302 and 303 of Fig. 3.

Calculate areas of retention time regions - 905

For each chromatogram being analyzed, the stored chromatogram data is divided into a number of consecutive retention time groups or regions being equal to the retention time groups represented by the calibration curves obtained by the method illustrated in Fig. 7. For each retention time group or region of the chromatogram being analyzed, intensity group area data is calculated, where each group area data is representative of the intensity curve area covered by the intensity peaks within the corresponding retention time group. The calculation of the intensity group area data for each retention time group is performed by summing all intensities of the chromatogram in the retention time group after the non-sample information has been removed from the chromatogram.

Calculate total hydrocarbon concentrations - 906

Using the calibration curves obtained by the method illustrated in Fig. 7, where a calibration curve is produced for each retention time group or region, the total hydrocarbon concentration can be determined for each retention time group of a chromatogram, by looking up the hydrocarbon concentration corresponding to the calculated intensity group area data for a retention time group or region.
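Steps 905 and 906 may be sketched as follows, purely as an example. The function names are hypothetical, the retention time regions are treated as half-open intervals, and the calibration curve is assumed to be available as polynomial coefficients `(c0, c1, c2)` giving concentration as a function of group area, with `c2` set to zero for a first order curve.

```python
def region_area(rt, intensity, rt_start, rt_end):
    """Sum all intensities with rt_start <= RT < rt_end. The chromatogram
    is assumed to be already corrected for non-sample contributions."""
    return sum(value for t, value in zip(rt, intensity)
               if rt_start <= t < rt_end)


def concentration_from_area(area, coefficients):
    """Evaluate concentration = c0 + c1 * area + c2 * area**2."""
    c0, c1, c2 = coefficients
    return c0 + c1 * area + c2 * area ** 2


def total_hydrocarbons(rt, intensity, regions, calibrations):
    """Return one concentration per retention time region.

    regions: list of (rt_start, rt_end)
    calibrations: list of (c0, c1, c2), one per region
    """
    return [concentration_from_area(region_area(rt, intensity, start, end), coeffs)
            for (start, end), coeffs in zip(regions, calibrations)]
```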

Figs. 10a-10h illustrate the use of calibration curves for determination of total hydrocarbon concentrations in a contaminated soil sample. Four calibration curves, each representing a retention time group or region, have been obtained as described above in connection with Figs. 7 and 8. Fig. 10b shows the calibration curve for C6-C10 (benzene to decane), with a retention time region I from 0 to 4.4993 min. Fig. 10d shows the calibration curve for C10-C25 (decane to pentacosane), with a retention time region II from 4.4993 to 13.0093 min. Fig. 10f shows the calibration curve for C25-C35 (pentacosane to pentatriacontane), with a retention time region III from 13.0093 to 16.1993 min. Fig. 10h shows the calibration curve for hydrocarbons above C35 (pentatriacontane), with a retention time region IV from 16.1993 to 20 min.

A liquid extract of the soil sample to be analyzed is obtained, step 901, a GC-FID chromatogram is obtained and non-sample variations are removed, steps 902-904. The chromatogram obtained from step 904 is then divided into the four retention time groups or regions I-IV as described in connection with Figs. 10b, 10d, 10f and 10h, and the total area covered by the intensity curve within each retention time region is calculated, step 905. This is illustrated in Fig. 10a for region I, in Fig. 10c for region II, in Fig. 10e for region III, and in Fig. 10g for region IV. From the calculated intensity areas, the corresponding hydrocarbon concentrations are found, step 906. For region I, Fig. 10a, the intensity area is 8.4 and from the curve of Fig. 10b, the concentration in the sample of hydrocarbons C6-C10 is found to be 2.1 µg/ml. For region II, Fig. 10c, the intensity area is 2641.4 and from the curve of Fig. 10d, the concentration in the sample of hydrocarbons C10-C25 is found to be 653.1 µg/ml. For region III, Fig. 10e, the intensity area is 68.7 and from the curve of Fig. 10f, the concentration in the sample of hydrocarbons C25-C35 is found to be 17.0 µg/ml. For region IV, Fig. 10g, the intensity area is 28.4 and from the curve of Fig. 10h, the concentration in the sample of hydrocarbons above C35 is found to be 7.0 µg/ml.

As discussed above, a main object of the methods of the present invention is to provide an improved determination of the total hydrocarbon concentrations in samples of polluted soil. However, besides determining the total concentration of hydrocarbon compounds, there is also a need to determine the oil types or types of hydrocarbon compounds within the soil samples, and further to determine the distribution of the oil types or types of hydrocarbon compounds within the soil samples.

According to an aspect of the present invention, there is provided a two-step solution for determining the types of hydrocarbon compounds within the soil samples, and for determining the distribution of the types of hydrocarbon compounds within the soil samples. The methods of this two-step solution are illustrated in Figs. 11 and 12 and discussed in the following, where Fig. 11 is a flow chart illustrating a computer assisted method for construction of a pollution type model, and Fig. 12 is a flow chart illustrating a computer assisted method for determining the type of hydrocarbon contamination in soil samples by applying the pollution type model of Fig. 11.

Fig. 11 - construction of a pollution type model:

The model consists of a number of chromatographic pollution profiles and their identity, that is, what type of pollution each chromatographic pollution profile represents. Each sample chromatogram can be described as a weighted sum of these chromatographic pollution profiles. The weights represent the proportions of the chromatographic pollution profiles in the sample chromatograms. The identity of the corresponding chromatographic pollution profiles can be used to establish the proportion of the hydrocarbons in a sample that originates from different pollution types.

Select soil samples -1101

First, a set of reference soil samples containing different pollution types is selected so that a range of hydrocarbon pollutions is covered. For example, the samples could include soils polluted with combinations of creosote, heating oil and lubricants with varying degrees and types of weathering (e.g. evaporation or biological degradation). The model will not be able to classify pollution types not present in these samples.

Obtain liquid extracts of soil samples -1102

Then the soil samples are extracted with one or more suitable solvents in order to dissolve the hydrocarbon compounds present. This is done to facilitate the transfer to the gas-chromatograph, GC, of the GC-FID system.

Obtain GC-FID chromatograms -1103

For each reference soil sample, which is to be analyzed, obtain a reference GC-FID chromatogram for the liquid extract, where the reference chromatogram shows a detector signal intensity curve with intensity peaks as a function of retention time. The sample solutions are pipetted into suitable vials which are sealed with a septum. The vials are placed in the liquid autosampler of a GC. For each sample, a chromatogram is recorded by injecting a volume of the sample solution in the inlet of the GC and heating the column according to the temperature program. The effluent of the column is passed through a flame ionization detector (FID), and the current through the FID is recorded at each retention time to produce a GC-FID chromatogram of the sample.

Store chromatograms -1104

For each reference chromatogram, data representing the chromatogram is stored. The GC-FID chromatograms can be stored on the computer that controls the GC-FID used to record them, or elsewhere.

Remove non-sample variation -1105

Optionally, non-sample variation can be removed before further analysis. This step 1105 is similar to step 705 of Fig. 7. When the response factor function determined by the process of Fig. 2 is used, it is preferred that step 1105 includes all three steps 301, 302 and 303 of Fig. 3.

Normalize reference chromatograms -1106

Optionally, for each chromatogram, the intensities may be divided by the total intensity in that chromatogram. This is done to ensure that the differences between the chromatograms are primarily due to the type of pollution and not the amount. The total intensity can be e.g. the sum or the Euclidean norm (the square root of the sum of the squares) of the intensities at all retention times, RTs.
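As a non-limiting illustration, the two normalization options mentioned above may be sketched as follows. The function name, the `method` argument and the example intensities are hypothetical.

```python
import numpy as np

def normalize(chromatogram, method="sum"):
    """Divide the intensities by the total intensity of the chromatogram."""
    x = np.asarray(chromatogram, dtype=float)
    if method == "sum":
        total = x.sum()
    elif method == "euclidean":
        total = np.sqrt(np.sum(x ** 2))   # square root of the sum of the squares
    else:
        raise ValueError("method must be 'sum' or 'euclidean'")
    if total == 0:
        raise ValueError("chromatogram has zero total intensity")
    return x / total

example = np.array([3.0, 4.0, 0.0])       # hypothetical intensities
by_sum = normalize(example, "sum")        # intensities now sum to 1
by_norm = normalize(example, "euclidean") # vector now has unit length
```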

Estimate chromatographic pollution profiles and estimate the proportion of the chromatographic pollution profiles in the reference sample chromatograms -1107 and 1108

A pollution type model comprises a predetermined number of chromatographic pollution profiles and the corresponding pollution type of each pollution profile. In order to obtain a pollution type model, a number of chromatographic pollution model profiles are generated based on the stored reference chromatograms.

The chromatographic pollution profiles are chosen so that the pollution type model of the chromatograms, made by adding the pollution model profiles according to their contributions, is as close to the reference chromatograms as possible.

The construction of the chromatographic pollution profiles may be done iteratively. In this approach, an initial estimate of the chromatographic pollution profiles is made. This estimate can consist of random numbers. From this, the contributions of the estimated chromatographic pollution profiles to each reference chromatogram are determined, so that the pollution type model of the chromatograms is as close to the reference chromatograms as possible. From these estimated contributions, a new set of chromatographic pollution profiles is determined, so that the pollution type model of the chromatograms is as close to the reference chromatograms as possible. When the difference between the pollution type model and the reference chromatograms is less than a set threshold, or when the change in the chromatographic pollution profiles in a step is smaller than a set threshold, the estimation is terminated, and the final estimate of the chromatographic pollution profiles is used in the pollution type model.
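The iterative scheme described above may, as a non-limiting sketch, be written as an alternating least-squares loop. This is an assumed realization and not necessarily the procedure used by the inventors: "as close as possible" is taken to mean the smallest sum of squared differences, no constraints are imposed on the profiles or contributions, and the function name, thresholds and synthetic data are hypothetical.

```python
import numpy as np

def estimate_profiles(X, n_profiles, max_iter=200, fit_tol=1e-10,
                      change_tol=1e-10, seed=0):
    """X holds one reference chromatogram per row; returns C, S, fit_error with X ~ C @ S."""
    rng = np.random.default_rng(seed)
    S = rng.random((n_profiles, X.shape[1]))      # initial estimate: random numbers
    for _ in range(max_iter):
        # Contributions of the current profiles to each reference chromatogram.
        C = np.linalg.lstsq(S.T, X.T, rcond=None)[0].T
        # New profiles from the estimated contributions.
        S_new = np.linalg.lstsq(C, X, rcond=None)[0]
        fit_error = float(np.sum((X - C @ S_new) ** 2))
        change = float(np.max(np.abs(S_new - S)))
        S = S_new
        if fit_error < fit_tol or change < change_tol:
            break                                  # either threshold terminates
    return C, S, fit_error

# Synthetic reference chromatograms mixed from two hypothetical profiles.
rt = np.linspace(0.0, 1.0, 40)
S0 = np.vstack([np.exp(-((rt - 0.3) / 0.05) ** 2),
                np.exp(-((rt - 0.7) / 0.10) ** 2)])
C0 = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [0.2, 0.8]])
X = C0 @ S0

C, S, fit_error = estimate_profiles(X, n_profiles=2)
```

An unconstrained factorization of this kind is not unique, so further restrictions on the profiles are generally needed before the estimated profiles can be interpreted as pollution types.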

When generating and storing a predetermined number of chromatographic pollution profiles based on the stored set of chromatograms, the generation of the chromatographic pollution profiles may comprise a principal convex hull analysis in which the chromatographic pollution profiles are weighted averages of the stored reference chromatograms. The weights for the weighted averages may be chosen to give the best fit of a weighted sum of the chromatographic pollution profiles to the stored reference chromatograms.

Identify pollution types -1109

The estimated pollution type profiles are classified according to the type of pollution they represent. This classification can be performed by comparing the pollution profiles with prior knowledge about the chromatographic fingerprints of certain pollution types, or by comparing the sample distributions with knowledge of the pollution types present in the samples.

Store pollution type model -1110

Each chromatographic pollution profile is stored along with information about which type of pollution it represents.

Figs. 13a-13c are chromatographic pollution profiles obtained according to the method described in connection with Fig. 11. Fig. 13a shows a chromatographic pollution profile representing a pollution with lubricant oil, Fig. 13b shows a chromatographic pollution profile representing a pyrogenic pollution, and Fig. 13c shows a chromatographic pollution profile representing a pollution with diesel oil.

Fig. 12 is a flow chart illustrating a method according to an aspect of the invention for determining the distribution of types of hydrocarbon pollution in soil samples by applying the pollution type model of Fig. 11:

Obtain liquid extracts of soil samples -1201

The soil samples are extracted with one or more suitable solvents in order to dissolve the hydrocarbon compounds present. This is done to facilitate the transfer to the GC.

Obtain GC-FID chromatograms - 1202

For each soil sample, which is to be analyzed, obtain a GC-FID chromatogram for the liquid extract, where the chromatogram shows a detector signal intensity curve with intensity peaks as a function of retention time. The sample solutions are pipetted into suitable vials which are sealed with a septum. A GC-FID method is constructed. The vials are placed in the liquid autosampler of a GC. For each sample, a chromatogram is recorded by injecting a volume of the sample solution in the inlet of the GC and heating the column according to the temperature program. The effluent of the column is passed through a flame ionization detector, FID, and the current through the FID is recorded at each retention time to produce a GC-FID chromatogram of the sample.

Store chromatograms - 1203

Remove non-sample variation - 1204

Optionally, non-sample variation can be removed before further analysis. This step 1204 is similar to step 705 of Fig. 7. When the response factor function determined by the process of Fig. 2 is used, it is preferred that step 1204 includes all three steps 301, 302 and 303 of Fig. 3.

Normalize chromatograms -1205

For each chromatogram, the intensities are divided by the total intensity in that chromatogram. This is done to ensure that the differences between the chromatograms are primarily due to the type of pollution and not the amount. The total intensity can be e.g. the sum of all intensities or the Euclidean norm (the square root of the sum of the squares of the intensities).

Chromatographic pollution profiles - 1206

The chromatographic pollution profiles from the pollution type model, 1110, are used.

Determine the proportion of chromatographic pollution profiles in sample chromatograms -1207

Each chromatogram is modelled as a weighted sum of the chromatographic pollution profiles. The weights are chosen to minimize the differences between the chromatogram and the weighted sum ('the residuals'). To measure the difference, the squares of the differences are summed over all retention times, RTs.

Store unexplained parts of sample chromatograms - 1208

For each sample, the residuals are stored. The residuals are used to identify samples that contain pollution types not described by the model. Samples with residuals above a user-defined threshold cannot be described by the current pollution type model.
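Steps 1207 and 1208 may, as a non-limiting illustration, be sketched as an ordinary least-squares fit followed by a threshold test on the residual sum of squares. The function name, the profiles, the sample chromatograms and the threshold value are hypothetical, and no constraint is placed on the weights in this sketch.

```python
import numpy as np

def fit_sample(chromatogram, profiles):
    """Least-squares weights of the profiles and the unexplained residuals."""
    weights = np.linalg.lstsq(profiles.T, chromatogram, rcond=None)[0]
    residuals = chromatogram - weights @ profiles
    return weights, residuals

# Hypothetical profiles (one per row) on a five-point retention time axis.
profiles = np.array([[1.0, 1.0, 0.0, 0.0, 0.0],
                     [0.0, 0.0, 1.0, 1.0, 0.0]])
threshold = 0.01   # user-defined limit on the residual sum of squares (assumed)

described = 0.3 * profiles[0] + 0.7 * profiles[1]
unknown = described + np.array([0.0, 0.0, 0.0, 0.0, 0.5])   # signal outside the model

w_ok, r_ok = fit_sample(described, profiles)
w_bad, r_bad = fit_sample(unknown, profiles)
sse_ok = float(np.sum(r_ok ** 2))     # squares summed over all retention times
sse_bad = float(np.sum(r_bad ** 2))
flag_ok = sse_ok > threshold          # False: sample is described by the model
flag_bad = sse_bad > threshold        # True: sample is not described by the model
```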

Determine pollution distribution in samples - 1209

For each sample, the proportion of each pollution type is determined from the proportions of chromatographic pollution profiles in the corresponding sample chromatogram. For each pollution type, the proportion of the corresponding chromatographic pollution profile describes the proportion of hydrocarbons in the sample that originates from that pollution type.
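As a non-limiting illustration, step 1209 may be sketched as follows. The rescaling of the weights so that they sum to one is an assumption made for the sketch, and the function name, the weights and the pollution type labels are hypothetical.

```python
def pollution_distribution(weights, profile_types):
    """Share of each pollution type, from the weights of its profile(s)."""
    total = float(sum(weights))
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    distribution = {}
    for weight, pollution_type in zip(weights, profile_types):
        # Profiles carrying the same label are added together.
        distribution[pollution_type] = distribution.get(pollution_type, 0.0) + weight / total
    return distribution

weights = [2.0, 1.0, 1.0]   # hypothetical weights from the preceding fit
profile_types = ["lubricant oil", "pyrogenic", "diesel oil"]
distribution = pollution_distribution(weights, profile_types)
```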

Fig. 14 is a diagram illustrating the distribution of types of hydrocarbon pollution in 15 different soil samples, where the distribution is determined following the method of Fig. 12 using a pollution type model according to the method of Fig. 11 and based on the chromatograms of Figs. 13a-13c. Each column corresponds to a soil sample, where the black part shows the percentage of pollution being from the lubricant oil, the grey part shows the percentage of pyrogenic pollution, and the white part shows the percentage of pollution being from a diesel oil.

In the following is described an example of a method of constructing a pollution type model and a method of identifying oil types and the distribution of the identified oil types in oil contaminated soil by use of the pollution type model:

According to an aspect of the invention there is provided a computer assisted method for constructing a pollution type model for hydrocarbon pollutions of soil, where a pollution type model comprises a number of chromatographic pollution profiles and the corresponding pollution type, which method is based on GC-FID (gas-chromatography/flame ionization detector) chromatograms having a number of intensity peaks with corresponding retention times. The method comprises the following steps:

a) selecting a number of reference GC-FID chromatograms, each reference chromatogram representing an oil containing soil sample for which sample the contained type or types of oil is/are known; this corresponds to steps 1101, 1102 and 1103 of Fig. 11.

b) storing a data set representing each of the selected reference chromatograms; step 1104 of Fig. 11. It is preferred that a reduction of non-sample information is performed on stored data representing the reference GC-FID chromatograms; this is described in step 1105 of Fig. 11. It is also preferred that the stored chromatograms are normalized as described in step 1106 of Fig. 11.

c) generating and storing a predetermined number of pure archetype chromatograms based on the stored set of reference chromatograms; this corresponds to the estimated model described in steps 1107 and 1108.

d) classifying the pure archetype chromatograms against the set of reference chromatograms to identify which oil type or oil types is/are matched to a pure archetype chromatogram, thereby obtaining a reference library listing pure archetype chromatograms with matching oil type or oil types, which reference library can be stored; this corresponds to step 1109 of identifying pollution types, where the estimated pollution type models are classified, and to the storage of pollution type model of step 1110 of Fig. 11.

For step c) it is preferred that the number of pure archetypes is determined to be at least 3. For step c) it is also preferred that a data set boundary, the convex hull, is defined for the number of reference chromatograms, and that the pure archetypes are calculated as convex combinations of the reference chromatograms, with each pure archetype lying on the convex hull, and with the pure archetypes being selected by minimizing the squared error in representing each reference chromatogram as a mixture of the pure archetypes. The calculation of the pure archetypes may comprise the use of an alternating minimization algorithm.

According to an aspect of the invention, there is provided a computer assisted method for identifying and determining the relative amounts of oil types in oil contaminated soil by use of an obtained pollution type model, which method is based on GC-FID (gas-chromatography/flame ionization detector) chromatograms having a number of intensity peaks with corresponding retention times. The method comprises the following steps:

a) obtaining a GC-FID chromatogram for a liquid sample of oil contaminated soil, for which soil the contained oil types are to be identified, and storing data representing said contaminated soil chromatogram; this corresponds to steps 1201, 1202 and 1203 of Fig. 12. It is preferred that a reduction of non-sample information is performed on stored data representing the soil sample GC-FID chromatogram; this is described in step 1204 of Fig. 12. It is also preferred that the stored chromatogram is normalized as described in step 1205 of Fig. 12.

b) comparing the contaminated soil chromatogram with a predetermined pollution type model comprising a number of chromatographic pollution profiles and the corresponding pollution type. Here, the chromatographic pollution profiles of the pollution type model may be pure archetype chromatograms, which may be determined as described above in connection with Fig. 11. Step b) may further comprise determining, by use of archetypal analysis based on the pure archetype chromatograms of the pollution type model, a representation being a pure archetype chromatogram representation of the contaminated soil chromatogram, which representation gives a fraction measure for each contributing pure archetype chromatogram; and

c) comparing the fraction measures and the contributing pure archetype chromatograms of the representation with a reference library listing pure archetype chromatograms with matching oil type or oil types, and determining based on said comparison an oil fraction measure, which oil fraction measure gives a fraction measure of the oil types being represented by the pure archetype chromatograms of the representation, thereby obtaining a measure of identified oil types within said oil contaminated soil.

The obtained measure giving the distribution of oil types in the soil sample may be stored. Steps b) and c) correspond to steps 1207 and 1209 of Fig. 12.

The method for identifying oil types may further comprise a step d) of comparing the obtained representation of step b) with the original stored contaminated soil chromatogram, and then determining based on this comparison a measure for the difference or mismatch between the obtained representation and the contaminated soil chromatogram. The difference measure of step d) may be determined as the sum or weighted sum of squares of the difference in intensity between the stored contaminated chromatogram and the obtained representation, where the summation is performed for all retention times. Samples with a difference or residuals above a user-defined threshold cannot be described by the current pollution type model. Step d) corresponds to step 1208 of Fig. 12.

The method for identifying oil types may also comprise an update of the reference library, where the update may comprise the steps of:

aa) selecting an update GC-FID chromatogram representing an oil contaminated soil sample, for which sample a difference or mismatch measure has been determined according to process step d);

bb) updating the stored data set of step b) by including the update chromatogram to obtain an updated set of reference chromatograms;

cc) generating and storing a predetermined number of updated pure archetype chromatograms based on the stored updated set of reference chromatograms;

dd) classifying the updated pure archetype chromatograms against the set of updated reference chromatograms to identify which oil type or oil types is/are matched to a pure archetype chromatogram, thereby obtaining an updated reference library listing the updated pure archetype chromatograms with matching oil type or oil types.

The oil type or types represented by the update chromatogram may be determined by conventional methods, such as visual inspection or knowledge of the source. The update chromatogram may be selected from chromatograms having a mismatch or difference measure larger than or equal to a predetermined threshold value. This threshold value may be at least 10%, such as 20%, such as 25%, such as 30%.

For step b) it is preferred that the representation is determined as a best match combination of pure archetype chromatograms to the contaminated soil chromatogram.

Fig. 15 is a block diagram illustrating a system, which can be used in accordance with embodiments of the methods of the present invention. The system comprises a GC-FID (gas-chromatography/flame ionization detector) 1501 for obtaining chromatograms for soil samples, a computer 1502 and a computer storage 1503 for storing chromatogram data and for performing computational steps based on the stored chromatogram data.

Claims

1. A computer assisted method for producing a data set for adjusting GC-FID (gas-chromatography/flame ionization detector) chromatograms for retention time related changes in the sensitivity of the GC-FID system, the method comprising the steps of: a) selecting a number of characteristic hydrocarbon compounds; b) producing a number of liquid sample solutions (standard solutions) having different but known concentrations of a mixture of the characteristic hydrocarbon compounds; c) obtaining a GC-FID chromatogram for each or at least part of the sample solutions with each chromatogram consisting of a detector signal intensity curve as a function of retention time, wherein each characteristic hydrocarbon compound of a sample solution is represented by a maximum intensity peak at a maximum peak retention time, which peak retention time is characteristic for the corresponding hydrocarbon compound, and wherein each maximum intensity peak has a corresponding intensity peak area being a function of the sample concentration of the corresponding hydrocarbon compound; d) storing data representing the obtained chromatograms; e) calculating peak area data for each or at least part of the intensity peak areas for each or at least part of the stored chromatograms; f) producing a set of area-concentration data for each of the selected hydrocarbon compounds, each set of area-concentration data being based on the calculated peak area data and the known sample concentrations of the selected hydrocarbon compound; g) determining a response factor, RF, for each of the selected hydrocarbon compounds, said response factor being determined as a slope representing at least part of an area-concentration curve obtained from the set of area-concentration data, which curve represents the calculated peak areas as a function of the concentration of the selected hydrocarbon compound in the different sample solutions; h) for each selected hydrocarbon compound storing the obtained response factor together with the corresponding maximum peak retention time thereby obtaining a first set of response factor-retention time data; i) calculating by use of interpolation one or more new response factors for retention times between two nearest neighbor maximum peak retention times of the first set of response factor-retention time data, thereby obtaining a second and expanded set of response factor-retention time data; and j) storing said second set of response factor-retention time data.

2. A method according to claim 1, wherein for step e), then for each maximum intensity peak a start peak retention time, spRT, being lower than the maximum peak or peak apex retention time, paRT, is defined, and an end peak retention time, epRT, being higher than the maximum peak retention time, paRT, is defined, and the intensity peak area is calculated as the area covered by the intensity curve above an intensity baseline being drawn from the intensity curve at start peak retention time, spRT, to end peak retention time, epRT.

3. A method according to claim 2, wherein the start peak retention time is found as the last RT before the paRT where the intensity function has a negative slope, and the end peak retention time is found as the first RT after the paRT where the intensity function has a positive slope, respectively.
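By way of non-limiting illustration only, the peak area calculation of claims 2 and 3 may be sketched as follows. How the slope criterion is applied to sampled data is an assumption of the sketch: the slope is taken as the difference between neighbouring data points, and the start and end points are placed at the local minima on either side of the peak apex. The function name and the example data are hypothetical.

```python
import numpy as np

def peak_area(rt, intensity, apex):
    """Return (start index, end index, area above a straight baseline)."""
    rt = np.asarray(rt, dtype=float)
    y = np.asarray(intensity, dtype=float)
    slope = np.diff(y)                      # slope[i]: change from point i to i + 1
    falling = np.nonzero(slope[:apex] < 0)[0]
    start = int(falling[-1]) + 1 if falling.size else 0
    rising = np.nonzero(slope[apex:] > 0)[0]
    end = apex + int(rising[0]) if rising.size else len(y) - 1
    x_seg, y_seg = rt[start:end + 1], y[start:end + 1]
    # Area under the curve (trapezoidal rule) minus area under the baseline.
    curve = np.sum(0.5 * (y_seg[1:] + y_seg[:-1]) * np.diff(x_seg))
    baseline = 0.5 * (y[start] + y[end]) * (rt[end] - rt[start])
    return start, end, float(curve - baseline)

rt = np.arange(9.0)
intensity = np.array([2.0, 1.5, 1.0, 2.0, 4.0, 2.0, 1.0, 1.5, 2.0])
start, end, area = peak_area(rt, intensity, apex=4)
```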

4. A method according to any one of the claims 1-3, wherein for step g) the curve slope defining a response factor, RF, is calculated by minimizing the unweighted or weighted sum of squares of the differences between a linear curve and the area-concentration data.

5. A method according to any one of the claims 1-4, wherein for step i) the new response factors are calculated by use of linear interpolation using the response factor values of two nearest neighbor maximum peak retention times.
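By way of non-limiting illustration only, steps g) to i) of claim 1, with the least-squares slope of claim 4 and the linear interpolation of claim 5, may be sketched as follows. The function names, the concentrations, the peak areas and the retention times are hypothetical.

```python
import numpy as np

def response_factor(concentrations, peak_areas):
    """Slope of the least-squares straight line through the area-concentration data."""
    slope, _intercept = np.polyfit(concentrations, peak_areas, 1)
    return float(slope)

def expand_response_factors(peak_rts, rfs, all_rts):
    """Linear interpolation between neighbouring peak retention times."""
    # np.interp holds the end values constant outside the range of peak_rts.
    return np.interp(all_rts, peak_rts, rfs)

# Hypothetical standards: two compounds measured at three concentrations.
conc = np.array([1.0, 2.0, 4.0])
areas_a = np.array([2.0, 4.0, 8.0])     # compound with peak retention time 1.0
areas_b = np.array([4.0, 8.0, 16.0])    # compound with peak retention time 3.0

# First set of response factor-retention time data.
first_set = {1.0: response_factor(conc, areas_a),
             3.0: response_factor(conc, areas_b)}

# Second, expanded set: one response factor per detector data point.
detector_rts = np.array([1.0, 1.5, 2.0, 2.5, 3.0])
second_set = expand_response_factors(list(first_set), list(first_set.values()),
                                     detector_rts)
```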

6. A method according to any one of the claims 1-4, wherein for step i) the new response factors are calculated by use of linear interpolation using the response factor values of two nearest neighbor maximum peak retention times and by use of response factor values for maximum peak retention times closest to the two nearest neighbor maximum peak retention times.

7. A method according to any one of the claims 1-6, wherein for step i), the number of calculated new interpolated response factors in-between two nearest neighbor retention times equals the number of data points obtained between these two retention times determined by the sampling rate of the flame ionization detector.

8. A method according to any one of the claims 1-7, wherein before the area calculating of step e), a reduction of non-sample information (data artifacts) is performed on the chromatograms represented by the data sets stored in step d).

9. A method according to claim 8, wherein the reduction of non-sample information includes subtraction of data, which represents a chromatogram obtained by GC-FID analysis of a sample containing no hydrocarbon compounds, from each of the stored standard solution chromatograms.

10. A method according to claim 8, wherein the GC-FID system used for obtaining the chromatograms comprises a gas chromatograph with a column, wherein the reduction of non-sample information includes subtraction of intensity data representing a blank chromatogram from each of the standard solution chromatograms, and wherein the intensity data for the blank chromatogram is obtained by taking the average of the intensities recorded before any compounds elute from the column, or wherein the blank chromatogram results from the injection of a sample into the column, which sample does not contain any hydrocarbon compounds.
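By way of non-limiting illustration only, the first alternative of claim 10, in which the blank intensity is the average of the intensities recorded before any compounds elute, may be sketched as follows. The function name, the example data and the retention time of the first eluting compound are hypothetical.

```python
import numpy as np

def subtract_pre_elution_blank(rt, intensity, first_elution_rt):
    """Subtract the mean intensity recorded before the first compound elutes."""
    rt = np.asarray(rt, dtype=float)
    intensity = np.asarray(intensity, dtype=float)
    before = rt < first_elution_rt
    if not before.any():
        raise ValueError("no data points before the first eluting compound")
    return intensity - intensity[before].mean()

rt = np.arange(6.0)
intensity = np.array([2.0, 2.0, 2.0, 7.0, 4.0, 2.0])
corrected = subtract_pre_elution_blank(rt, intensity, first_elution_rt=3.0)
```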

11. A method according to any one of the claims 8-10, wherein the reduction of non-sample information includes retention time alignment of the chromatograms.

12. A method according to claim 11, wherein the retention time alignment of a chromatogram comprises shifting retention time sections of the chromatogram by a constant value (rigid alignment), and/or comprises shifting the retention times by a value being a function of retention time (non-rigid alignment).

13. A method according to claim 12, wherein the non-rigid alignment consists of sequential stretching and compression of the chromatogram in order to best align the chromatogram to a target chromatogram so that the retention time of each compound in the aligned chromatogram is the same as the retention time for these compounds in the target chromatogram.
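By way of non-limiting illustration only, the rigid alignment of claim 12 may be sketched as follows for a whole chromatogram. The criterion used to select the shift (the largest inner product with the target chromatogram), the zero padding at the ends, the function names and the example data are assumptions of the sketch; the non-rigid alignment of claim 13 is not shown.

```python
import numpy as np

def shift(intensity, k):
    """Shift a chromatogram by k data points, padding with zeros."""
    out = np.zeros_like(intensity)
    if k > 0:
        out[k:] = intensity[:-k]
    elif k < 0:
        out[:k] = intensity[-k:]
    else:
        out[:] = intensity
    return out

def rigid_align(intensity, target, max_shift):
    """Pick the constant shift that best matches the target chromatogram."""
    candidates = range(-max_shift, max_shift + 1)
    best = max(candidates, key=lambda k: float(np.dot(shift(intensity, k), target)))
    return best, shift(intensity, best)

target = np.array([0.0, 0.0, 1.0, 3.0, 1.0, 0.0, 0.0, 0.0])
sample = np.array([0.0, 0.0, 0.0, 0.0, 1.0, 3.0, 1.0, 0.0])   # peak two points late
best_shift, aligned = rigid_align(sample, target, max_shift=3)
```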

14. A method according to any one of the claims 1-13, wherein each of the standard solutions contains between 10-20 individual hydrocarbon compounds.

15. A method according to any one of the claims 1-14, wherein the number of standard solutions is selected to be in the range of 4-12.

16. A method according to any one of the claims 1-15, wherein the concentrations of characteristic hydrocarbon compounds in the standard solutions are in the range of 0.2-100 ppm.

17. A computer assisted method for producing a number of calibration curves to be used in determining total hydrocarbon concentrations of soil samples from GC-FID (gas-chromatography/flame ionization detector) chromatograms, the method comprising the steps of: a) selecting a number of characteristic hydrocarbon compounds; b) producing a number of liquid sample solutions (standard solutions) having different but known concentrations of a mixture of the characteristic hydrocarbon compounds; c) obtaining a GC-FID chromatogram for each or at least part of the sample solutions with each chromatogram consisting of a detector signal intensity curve as a function of retention time, wherein each characteristic hydrocarbon compound of a sample solution is represented by a maximum intensity peak at a maximum peak retention time, which peak retention time is characteristic for the corresponding hydrocarbon compound, and wherein each maximum intensity peak has a corresponding intensity peak area being a function of the sample concentration of the corresponding hydrocarbon compound; d) storing data representing the obtained chromatograms; e) calculating peak area data for each or at least part of the intensity peak areas for each or at least part of the stored chromatograms, whereby each calculated peak area corresponds to the sample concentration of the characteristic hydrocarbon compound with its maximum peak retention time; f) dividing at least part of the stored chromatograms for which peak area data are calculated in step e) in two or more consecutive retention time groups; g) calculating corresponding group area data and group concentration data for each retention time group within each of the divided chromatograms, said group area data representing the sum of the calculated peak areas represented by the peak area data of step e) within the respective retention time group, and said group concentration data representing the sum of the sample concentrations corresponding to the calculated peak areas being summed within the retention time group; h) producing a calibration data set for each of the retention time groups, each calibration data set holding retention time group area data with corresponding retention time group concentration data for each of the sample solutions for which a GC-FID chromatogram is obtained; and i) producing a calibration curve giving retention time group concentration as a function of retention time group area for each of the retention time groups based on the obtained calibration data set.

18. A method according to claim 17, wherein the calibration curve is a first or second order polynomial model calculated by least squares regression or weighted least squares regression of the obtained calibration data set.
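By way of non-limiting illustration only, steps g) to i) of claim 17 with the unweighted least-squares polynomial of claim 18 may be sketched as follows. The function names, the group limits, the peak retention times, the peak areas and the concentrations are hypothetical.

```python
import numpy as np

def group_sums(peak_rts, values, boundaries):
    """Sum per-peak values within each consecutive retention time group."""
    group = np.digitize(peak_rts, boundaries[1:-1])   # group index of every peak
    return np.bincount(group, weights=values, minlength=len(boundaries) - 1)

def calibration_curves(peak_rts, areas, concs, boundaries, degree=1):
    """One polynomial per group giving group concentration from group area."""
    # areas and concs hold one row per standard solution and one column per peak.
    group_area = np.array([group_sums(peak_rts, a, boundaries) for a in areas])
    group_conc = np.array([group_sums(peak_rts, c, boundaries) for c in concs])
    return [np.polyfit(group_area[:, g], group_conc[:, g], degree)
            for g in range(group_area.shape[1])]

# Hypothetical standards: three peaks, three dilutions of the same mixture.
peak_rts = np.array([1.0, 2.0, 6.0])
boundaries = [0.0, 5.0, 10.0]                 # two retention time groups
scales = np.array([1.0, 2.0, 4.0])
areas = np.outer(scales, [4.0, 8.0, 12.0])    # peak areas per solution
concs = np.outer(scales, [1.0, 2.0, 3.0])     # known concentrations per solution

curves = calibration_curves(peak_rts, areas, concs, boundaries)
```

Each entry of `curves` holds the polynomial coefficients, highest power first, of the calibration curve of one retention time group.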

19. A method according to claim 17 or 18, wherein for step e), then for each maximum intensity peak a start peak retention time, spRT, being lower than the maximum peak or peak apex retention time, paRT, is defined, and an end peak retention time, epRT, being higher than the maximum peak retention time, paRT, is defined, and the intensity peak area is calculated as the area covered by the intensity curve above an intensity baseline being drawn from the intensity curve at start peak retention time, spRT, to end peak retention time, epRT.

20. A method according to claim 19, wherein the start peak retention time is found as the last RT before the paRT where the intensity function has a negative slope, and the end peak retention time is found as the first RT after the paRT where the intensity function has a positive slope, respectively.

21. A method according to any one of the claims 17-20, wherein before the area calculating of step e), a reduction of non-sample information (data artifacts) is performed on the chromatograms represented by the data sets stored in step d).

22. A method according to claim 21, wherein the reduction of non-sample information includes subtraction of data, which represents a chromatogram obtained by GC-FID analysis of a sample containing no hydrocarbon compounds, from each of the stored standard solution chromatograms.

23. A method according to claim 21 or 22, wherein the GC-FID system used for obtaining the chromatograms comprises a gas chromatograph with a column, wherein the reduction of non-sample information includes subtraction of intensity data representing a blank chromatogram from each of the standard solution chromatograms, and wherein the intensity data for the blank chromatogram is obtained by taking the average of the intensities recorded before any compounds elute from the column, or wherein the blank chromatogram results from the injection of a sample into the column, which sample does not contain any hydrocarbon compounds.

24. A method according to any one of the claims 21-23, wherein the reduction of non-sample information includes retention time alignment of the chromatograms.

25. A method according to claim 24, wherein the retention time alignment of a chromatogram comprises shifting retention time sections of the chromatogram by a constant value (rigid alignment), and/or comprises shifting the retention times by a value being a function of retention time (non-rigid alignment).

26. A method according to claim 25, wherein the non-rigid alignment consists of sequential stretching and compression of the chromatogram in order to best align the chromatogram to a target chromatogram so that the retention time of each compound in the aligned chromatogram is the same as the retention time for these compounds in the target chromatogram.

27. A method according to any one of the claims 21-26, wherein the reduction of non-sample information includes an adjustment of the chromatograms for retention time related changes in the sensitivity of the GC-FID system.

28. A method according to claim 27, wherein said adjustment of the chromatograms is performed by dividing each intensity of a chromatogram with the response factor corresponding to the retention time of the intensity, where the response factor is found from the second set of response factor-retention time data according to any one of the claims 1-16.

29. A method according to any one of the claims 17-28, wherein each of the standard solutions contains between 10-20 individual hydrocarbon compounds.

30. A method according to any one of the claims 17-29, wherein the number of standard solutions is selected to be in the range of 4-12.

31. A method according to any one of the claims 17-30, wherein the concentrations of characteristic hydrocarbon compounds in the standard solutions are in the range of 0.2-100 ppm.

32. A computer assisted method for determining the total hydrocarbon concentrations by use of GC-FID (gas-chromatography/flame ionization detector) chromatograms and calibration curves obtained according to any one of the claims 17-31, the method comprising the steps of: a) obtaining a liquid extract of a soil sample for which the hydrocarbon concentration is to be determined; b) obtaining a GC-FID chromatogram for the liquid extract which chromatogram shows a detector signal intensity curve with intensity peaks as a function of retention time; c) storing data representing the obtained chromatogram; d) dividing the stored chromatogram data in a number of consecutive retention time groups being equal to the retention time groups represented by the obtained calibration curves; e) calculating intensity group area data for each retention time group, each group area data being representative of the intensity curve area covered by the intensity peaks within the corresponding retention time group; and f) determining the total hydrocarbon concentration for a retention time group from the calibration curve, which calibration curve represents the selected retention time group, as the retention time group concentration having a retention time group area equal to the obtained intensity group area.

33. A method according to claim 32, wherein before the area calculating of step e), a reduction of non-sample information (data artifacts) is performed on the chromatogram represented by the data set stored in step c).

34. A method according to claim 33, wherein the reduction of non-sample information includes subtraction of data, which represents a chromatogram obtained by GC-FID analysis of a sample containing no hydrocarbon compounds, from the stored chromatogram data.

35. A method according to claim 34, wherein the GC-FID system used for obtaining the chromatograms comprises a gas chromatograph with a column, wherein the reduction of non-sample information includes subtraction of intensity data representing a blank chromatogram from the stored chromatogram, and wherein the intensity data for the blank chromatogram is obtained by taking the average of the intensities recorded before any compounds elute from the column, or wherein the blank chromatogram results from the injection of a sample into the column, which sample does not contain any hydrocarbon compounds.
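Illustrative sketch (not part of the claims): the two blank-subtraction options of claim 35 could look like this. The parameter `first_elution_time`, which marks where the first compound elutes, and the function names are assumptions made for this example.

```python
def subtract_baseline(times, intensities, first_elution_time):
    """Subtract the mean of the intensities recorded before any compound elutes."""
    pre_elution = [y for t, y in zip(times, intensities) if t < first_elution_time]
    if not pre_elution:
        raise ValueError("no data points before the first elution time")
    baseline = sum(pre_elution) / len(pre_elution)
    return [y - baseline for y in intensities]


def subtract_blank(intensities, blank_intensities):
    """Subtract a blank chromatogram recorded on the same retention time axis."""
    if len(intensities) != len(blank_intensities):
        raise ValueError("sample and blank must have the same number of points")
    return [y - b for y, b in zip(intensities, blank_intensities)]
```

The first function corresponds to the averaged pre-elution baseline, the second to a blank obtained by injecting a sample that contains no hydrocarbon compounds.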

36. A method according to any one of the claims 33-35, wherein the reduction of non-sample information includes retention time alignment of the chromatogram.

37. A method according to claim 36, wherein the retention time alignment of a chromatogram comprises shifting retention time sections of the chromatogram by a constant value (rigid alignment), and/or comprises shifting the retention times by a value being a function of retention time (non-rigid alignment).
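Illustrative sketch (not part of the claims): a rigid alignment in the sense of claim 37 could be found by trying a range of constant shifts and keeping the one that best matches a target chromatogram. Scoring by the sum of products, zero-padding at the ends, shifting the whole chromatogram rather than individual sections, and all names are assumptions made for this example.

```python
def shift_signal(signal, shift):
    """Shift a signal by `shift` sampling points, padding with zeros.

    A positive shift moves peaks to later retention times, a negative shift to
    earlier ones. The output has the same length as the input.
    """
    n = len(signal)
    if shift >= 0:
        return ([0.0] * shift + list(signal))[:n]
    return (list(signal[-shift:]) + [0.0] * (-shift))[:n]


def best_rigid_shift(signal, target, max_shift):
    """Return the shift in [-max_shift, max_shift] that best matches the target."""
    best_shift, best_score = 0, float("-inf")
    for shift in range(-max_shift, max_shift + 1):
        shifted = shift_signal(signal, shift)
        score = sum(a * b for a, b in zip(shifted, target))
        if score > best_score:
            best_shift, best_score = shift, score
    return best_shift
```

Applying `shift_signal` with the returned shift gives the aligned chromatogram. A non-rigid alignment would instead let the shift vary with retention time.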

38. A method according to claim 37, wherein the non-rigid alignment consists of sequential stretching and compression of the chromatogram in order to best align the chromatogram to a target chromatogram so that the retention time of each compound in the aligned chromatogram is the same as the retention time for these compounds in the target chromatogram.

39. A method according to any one of the claims 33-38, wherein the reduction of non-sample information includes an adjustment of the chromatogram for retention time related changes in the sensitivity of the GC-FID system.

40. A method according to claim 39, wherein said adjustment of the chromatogram is performed by dividing each intensity of a chromatogram with the response factor corresponding to the retention time of the intensity, where the response factor is found from the second set of response factor-retention time data according to any one of the claims 1-16.

41. A method according to any one of the claims 32-40, wherein for step e) calculation of the intensity group area data for each retention time group is performed by summing all intensities of the chromatograms within the retention time group.

42. A method or computer assisted method for obtaining a corrected GC-FID (gas-chromatography/flame ionization detector) chromatogram of a soil sample comprising one or more hydrogen compounds, the method comprising the steps of: a) obtaining a liquid extract of the hydrogen compound comprising soil sample; b) obtaining a GC-FID chromatogram for the liquid extract, which chromatogram shows a detector signal intensity curve with intensity peaks as a function of retention time; c) performing a reduction of non-sample information on the obtained chromatogram, wherein the reduction of non-sample information includes an adjustment of the chromatogram for retention time related changes in the sensitivity of the GC-FID system; and d) storing data representing the chromatogram being corrected for non-sample information.

43. A method according to claim 42, wherein said adjustment of the chromatogram is performed by adjusting the intensity curve of the chromatogram by response factor values given by a response factor function being a function of retention time and expressing variation in the sensitivity of the GC-FID system as a function of retention time.

44. A method according to claim 43, wherein the response factor function is based on GC-FID chromatograms representing a number of liquid standard sample solutions having different but known concentrations of a mixture of a number of selected characteristic hydrocarbon compounds.

45. A method according to claim 44, wherein the standard sample solution chromatograms show a detector signal intensity curve as a function of retention time, with each characteristic hydrocarbon compound of a sample solution being represented by a maximum intensity peak at a maximum peak retention time, which peak retention time is characteristic for the corresponding hydrocarbon compound, and with each maximum intensity peak having a corresponding intensity peak area being a function of the sample concentration of the corresponding hydrocarbon compound, and wherein the response factor function is based on calculated peak area data for each or at least part of the intensity peak areas for each or at least part of the standard sample solution chromatograms.

46. A method according to claim 45, wherein the response factor function is based on a set of area-concentration data produced for each of the selected hydrocarbon compounds, each set of area-concentration data being based on the calculated peak area data and the known sample concentrations of the selected hydrocarbon compound.

47. A method according to claim 46, wherein the response factor function is based on response factor values being determined for each of the selected characteristic hydrocarbon compounds, said response factor values being determined as a slope representing at least part of an area-concentration curve obtained from the set of area-concentration data, which curve represents the calculated peak areas as a function of the concentration of the selected hydrocarbon compound in the different standard sample solutions.

48. A method according to claim 47, wherein for each selected hydrocarbon compound the obtained response factor value is paired together with the corresponding maximum peak retention time thereby obtaining a first set of response factor-retention time data providing at least part of the response factor function.

49. A method according to claim 48, wherein one or more new response factor values are calculated by use of interpolation for retention times between two nearest neighbor maximum peak retention times of the first set of response factor-retention time data, thereby obtaining a second and expanded set of response factor-retention time data defining the response factor function.
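Illustrative sketch (not part of the claims): claims 47 to 49 describe a chain from area-concentration data to an expanded response factor table, which might be coded as follows. Using an ordinary least-squares fit with intercept for the slope, the fixed interpolation spacing `step`, and all names are assumptions made for this example.

```python
def least_squares_slope(concentrations, areas):
    """Slope of the least-squares line of peak area versus concentration."""
    n = len(concentrations)
    mean_c = sum(concentrations) / n
    mean_a = sum(areas) / n
    sxy = sum((c - mean_c) * (a - mean_a) for c, a in zip(concentrations, areas))
    sxx = sum((c - mean_c) ** 2 for c in concentrations)
    return sxy / sxx


def first_set(compounds):
    """Pair each compound's peak retention time with its response factor.

    compounds is a list of (peak_retention_time, concentrations, areas).
    """
    pairs = [(rt, least_squares_slope(conc, areas)) for rt, conc, areas in compounds]
    return sorted(pairs)


def second_set(first, step):
    """Expand the first set by linear interpolation between neighbouring peaks."""
    expanded = []
    for (t0, r0), (t1, r1) in zip(first, first[1:]):
        n = max(1, round((t1 - t0) / step))  # number of sub-intervals
        for k in range(n):
            f = k / n
            expanded.append((t0 + f * (t1 - t0), r0 + f * (r1 - r0)))
    expanded.append(first[-1])
    return expanded
```

The expanded table returned by `second_set` plays the role of the second set of response factor-retention time data used when adjusting a chromatogram.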

50. A method according to any one of the claims 42-49, wherein the reduction of non-sample information includes subtraction of data representing a chromatogram obtained by GC-FID analysis of a sample containing no hydrocarbon compounds from the stored chromatogram.

51. A method according to any one of the claims 42-50, wherein the GC-FID system used for obtaining the chromatogram comprises a gas chromatograph with a column, wherein the reduction of non-sample information includes subtraction of intensity data representing a blank chromatogram from the obtained chromatogram, and wherein the intensity data for the blank chromatogram is obtained by taking the average of the intensities recorded before any compounds elute from the column, or wherein the blank chromatogram results from the injection of a sample into the column, which sample does not contain any hydrocarbon compounds.

52. A method according to any one of the claims 42-51, wherein the reduction of non-sample information includes retention time alignment of the chromatogram.

53. A method according to claim 52, wherein the retention time alignment of the chromatogram comprises shifting retention time sections of the chromatogram by a constant value (rigid alignment), and/or comprises shifting the retention times by a value being a function of retention time (non-rigid alignment).

54. A method according to claim 53, wherein the non-rigid alignment consists of sequential stretching and compression of the chromatogram in order to best align the chromatogram to a target chromatogram so that the retention time of each compound in the aligned chromatogram is the same as the retention time for these compounds in the target chromatogram.

55. A method according to claim 43, wherein the response factor values are found from the second set of response factor-retention time data according to any one of the claims 1-16.

56. A computer assisted method for constructing a pollution type model for hydrocarbon pollutions of soil, where a pollution type model comprises a number of chromatographic pollution profiles with corresponding hydrocarbon pollution types, the method comprising the steps of: a) obtaining a liquid extract of a number of reference oil containing soil samples for which samples the contained type or types of oil is/are known; b) obtaining a reference GC-FID chromatogram for each of the reference soil samples, each chromatogram showing a detector signal intensity curve with intensity peaks as a function of retention time; c) storing data representing the obtained reference chromatograms; d) generating and storing a predetermined number of chromatographic pollution profiles based on the stored set of chromatograms; and e) classifying the chromatographic pollution profiles to identify which oil pollution type is matched to a chromatographic pollution profile, thereby obtaining a reference library listing chromatographic pollution profiles with matching pollution types.

57. A method according to claim 56, wherein step d) comprises a principal convex hull analysis in which the chromatographic pollution profiles are weighted averages of the stored reference chromatograms.

58. A method according to claim 57, wherein the weights for the weighted averages are chosen to give the best fit of weighted sums of the chromatographic pollution profiles to the stored reference chromatograms.

59. A method according to any one of the claims 56-58, where step e) is performed based on the prior knowledge of the oil pollution types in the samples.

60. A method according to any one of the claims 56-59, wherein before the generation of pollution profiles of step d), a reduction of non-sample information (data artifacts) is performed on the chromatogram represented by the data set stored in step c).

61. A method according to claim 60, wherein the reduction of non-sample information includes subtraction of data, which represents a chromatogram obtained by GC-FID analysis of a sample containing no hydrocarbon compounds, from the stored chromatogram data.

62. A method according to claim 60, wherein the GC-FID system used for obtaining the chromatograms comprises a gas chromatograph with a column, wherein the reduction of non-sample information includes subtraction of intensity data representing a blank chromatogram from the stored chromatogram, and wherein the intensity data for the blank chromatogram is obtained by taking the average of the intensities recorded before any compounds elute from the column, or wherein the blank chromatogram results from the injection of a sample into the column, which sample does not contain any hydrocarbon compounds.

63. A method according to any one of the claims 60-62, wherein the reduction of non-sample information includes retention time alignment of the chromatogram.

64. A method according to claim 63, wherein the retention time alignment of a chromatogram comprises shifting retention time sections of the chromatogram by a constant value (rigid alignment), and/or comprises shifting the retention times by a value being a function of retention time (non-rigid alignment).

65. A method according to claim 64, wherein the non-rigid alignment consists of sequential stretching and compression of the chromatogram in order to best align the chromatogram to a target chromatogram so that the retention time of each compound in the aligned chromatogram is the same as the retention time for these compounds in the target chromatogram.

66. A method according to any one of the claims 60-65, wherein the reduction of non-sample information includes an adjustment of the chromatogram for retention time related changes in the sensitivity of the GC-FID system.

67. A method according to claim 66, wherein said adjustment of the chromatogram is performed by dividing each intensity of a chromatogram with the response factor corresponding to the retention time of the intensity, where the response factor is found from the second set of response factor-retention time data according to any one of the claims 1-16.

68. A computer assisted method for pollution type apportionment of oil polluted soil samples, by use of GC-FID (gas-chromatography/flame ionization detector) chromatograms and a pollution type model obtained according to any one of the claims 56-67, the method comprising the steps of: a) obtaining a liquid extract of a soil sample for which the pollution type apportionment is to be determined; b) obtaining a GC-FID chromatogram for the liquid extract which chromatogram shows a detector signal intensity curve with intensity peaks as a function of retention time; c) storing data representing the obtained chromatogram; d) comparing the contaminated soil chromatogram with the chromatographic pollution profiles of the pollution type model, and determining a representation of the contaminated soil chromatogram based on the chromatographic pollution profiles, which representation gives a fraction measure for each contributing pollution profile; e) comparing the fraction measures and the contributing chromatographic pollution profiles of the representation with the reference library, and determining based on said comparison a pollution type measure, which pollution type measure gives a fraction measure of the pollution types being represented by the chromatographic pollution profiles of the representation, thereby obtaining a measure of identified pollution types within said contaminated soil.
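Illustrative sketch (not part of the claims): steps d) and e) of claim 68 could be approximated by fitting the sample chromatogram as a non-negative weighted sum of the pollution profiles and then summing the normalised weights per pollution type. The multiplicative-update fitting scheme, the requirement that all intensities are non-negative, the normalisation of weights to fractions, and all names are assumptions made for this example. The claims do not prescribe a fitting algorithm.

```python
def fit_nonnegative_weights(sample, profiles, iterations=500):
    """Fit sample ~ sum(w[i] * profiles[i]) with w[i] >= 0.

    Uses multiplicative updates, which assume non-negative sample and profile
    intensities (an assumption of this sketch).
    """
    k = len(profiles)
    gram = [[sum(a * b for a, b in zip(p, q)) for q in profiles] for p in profiles]
    proj = [sum(a * b for a, b in zip(p, sample)) for p in profiles]
    weights = [1.0] * k
    for _ in range(iterations):
        denom = [sum(gram[i][j] * weights[j] for j in range(k)) for i in range(k)]
        weights = [
            weights[i] * proj[i] / denom[i] if denom[i] > 0 else 0.0
            for i in range(k)
        ]
    return weights


def apportion(sample, profiles, profile_types, iterations=500):
    """Return (fraction per profile, fraction per pollution type)."""
    weights = fit_nonnegative_weights(sample, profiles, iterations)
    total = sum(weights)
    fractions = [w / total if total > 0 else 0.0 for w in weights]
    by_type = {}
    for fraction, pollution_type in zip(fractions, profile_types):
        by_type[pollution_type] = by_type.get(pollution_type, 0.0) + fraction
    return fractions, by_type
```

Here `profile_types` stands in for the reference library: entry i names the pollution type matched to profile i. Because the fractions are normalised weights, they are only comparable between profiles if the profiles are scaled consistently.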

69. A method according to claim 68, further comprising a step f) of comparing the obtained representation of step d) with the original stored contaminated soil chromatogram, and determining based on said comparison a measure for the difference or mismatch between the obtained representation and the contaminated soil chromatogram.
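Illustrative sketch (not part of the claims): one possible mismatch measure for step f) of claim 69 is the residual sum of squares relative to the total sum of squares of the sample chromatogram. This particular measure and the function name are assumptions made for this example. The claim leaves the choice of measure open.

```python
def relative_mismatch(sample, profiles, weights):
    """Return the residual sum of squares divided by the sample sum of squares.

    The representation is rebuilt as the weighted sum of the profiles and then
    compared point by point with the sample chromatogram.
    """
    reconstruction = [
        sum(w * p[i] for w, p in zip(weights, profiles))
        for i in range(len(sample))
    ]
    residual = sum((s - r) ** 2 for s, r in zip(sample, reconstruction))
    total = sum(s ** 2 for s in sample)
    return residual / total
```

A value near 0 means the pollution profiles describe the sample well, while a large value suggests the sample contains a pollution type not covered by the model.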

70. A method according to claim 68 or 69, wherein before the comparison in step d), a reduction of non-sample information (data artifacts) is performed on the chromatogram represented by the data set stored in step c).