WO2014037481A1 - Method of estimating the mutagenicity of hydrocarbon samples

Info

Publication number: WO2014037481A1
Application number: PCT/EP2013/068430
Authority: WO
Inventors: Bhaduri ANIRBAN; Amjad Nissar Chaudry; Girish Rao
Original assignee: Shell Internationale Research; Shell Oil Company
Priority date: 2012-09-06
Filing date: 2013-09-06
Publication date: 2014-03-13

Abstract

A method of estimating the mutagenicity of a hydrocarbon sample is disclosed. The concentrations of at least two PACs in the hydrocarbon sample are measured, and a predictive model and classification rules are applied to the concentrations of at least two PACs.

Description

METHOD OF ESTIMATING THE MUTAGENICITY OF HYDROCARBON SAMPLES

Field of the Invention

The invention is directed to a method of estimating the mutagenicity of hydrocarbon samples, particularly bituminous samples.

Background of the Invention

Bitumen is a complex mixture of hydrocarbons and hydrocarbon derivatives, including aliphatic, naphthenic and aromatic compounds. Bituminous materials may contain polycyclic aromatic compounds (PACs). During handling of bituminous materials at elevated temperatures (e.g. during road paving or roofing) fumes are emitted that may contain traces of PACs. Although PAC concentrations are small, worker exposure to bitumen fumes is of potential concern because some PACs are considered to be carcinogenic and/or mutagenic. There is therefore a need for assessing the mutagenic activity of a sample that contains hydrocarbons, such as a bitumen sample. A property of the sample, known as its Mutagenicity Index (MI) value, is determined in order to ensure compliance with international safety standards for handling and use.

Mutagenicity Index is conventionally determined using a biological assay method known as the Modified Ames Test, as described in U.S. Pat. No. 4,499,187. The test is time-consuming (e.g. a turnaround time of about one month for the assay) and is typically carried out in specialist laboratories. It is desirable to develop alternative methods for estimating Mutagenicity Index which are quicker and simpler.

Brandt et al, in Polycyclic Aromatic Compounds 16 (1999) 21, describe a method of measuring the quantity of PACs in bituminous materials. A sample is subjected to solvent extraction using dimethylsulfoxide in a Flow Injection Analysis coil. The extraction is followed by normal phase liquid chromatography and then gas chromatography. Brandt describes that it is possible to correlate the PAC concentration of a bituminous material with its mutagenic tendency. This method is effective but complex.

WO 2008 107477 describes a method of analysing hydrocarbon compounds in a bituminous material by means of comprehensive multi-dimensional gas chromatography. The method may be used to measure the quantity of PACs in a bituminous material and to determine the mutagenic tendency of the material.

The present inventors have sought to provide a reliable and quick method of estimating the mutagenicity of hydrocarbon samples such as bitumen samples.

Summary of the Invention

Accordingly, the present invention provides a method of estimating the mutagenicity of a hydrocarbon sample, comprising steps of

a) measuring the concentrations of at least two PACs in the hydrocarbon sample;

b) applying a predictive model to the concentrations of at least two PACs to estimate a Mutagenicity Index; and

c) applying classification rules to the concentrations of at least two PACs to classify the hydrocarbon sample into one of two or more mutagenicity categories;

wherein the predictive model has been developed by a process of:

i) selecting a set of hydrocarbon samples;

ii) measuring the concentrations of at least two PACs in each hydrocarbon sample;

iii) determining the Mutagenicity Index of each hydrocarbon sample; and

iv) using regression analysis to determine the correlation between the concentrations of at least two PACs and the Mutagenicity Index;

and the classification rules have been developed by a process of

v) selecting a set of hydrocarbon samples;

vi) measuring the concentrations of at least two PACs in each hydrocarbon sample;

vii) determining the Mutagenicity Index of each hydrocarbon sample; and

viii) using discriminant analysis to develop functions that separate the hydrocarbon samples into mutagenicity categories based upon the concentrations of at least two PACs.

The present inventors have found that this two-fold statistical approach, combining regression analysis and discriminant analysis, ensures reliable estimation of mutagenicity of hydrocarbon samples, particularly bituminous samples.

Description of Figures

Figure 1 shows a diagram illustrating a preferred method according to the present invention.

Figure 2 shows a graph illustrating quadratic discriminant analysis of PAC concentrations and Mutagenicity Index for a sample set of bituminous compounds.

Detailed Description of the Invention

The present invention provides a method of estimating the mutagenicity of a hydrocarbon sample. The term "hydrocarbon sample" is used to describe any sample that contains at least a proportion of hydrocarbon components, including samples that contain only trace amounts of hydrocarbon components. Preferably the method is used for samples that are predominantly hydrocarbon, e.g. at least 90 wt% hydrocarbon based upon the weight of the sample. However, the method may also be used for predominantly aqueous samples that may contain trace amounts (e.g. less than 1000 ppm) of hydrocarbon, e.g. water samples that have been in contact with crude oils. Preferably the sample is a residue from a crude oil refining process, and most preferably the sample is a bituminous sample.

In step (a), the concentrations of at least two PACs in the hydrocarbon sample are measured. Suitable measuring techniques are known to the skilled person. A preferred technique is GC-MS (gas chromatography - mass spectrometry). These methods are described in, for example, "Characterization of polynuclear aromatic hydrocarbons in bitumen, heavy oil fractions boiling above 350°C by GCMS", Poirier & Das, Fuel, vol. 63, 3, 361-367, 1984; "Air sampling and determination of vapours and aerosols of bitumen and polycyclic aromatic hydrocarbons in the Human Bitumen Study", Breuer et al, Archives of Toxicology, vol. 85, Supplement 1, 11-20, 2011; and "Determination of polycyclic aromatic hydrocarbons from bitumen concrete roads in drainage water by microextraction, large-volume sampling and gas chromatography-mass spectrometry with selected ion monitoring", Kubince, Kuran, Ostrovsky and Sojak, Journal of Chromatography A, vol. 653, 2, 363-368, 1993.

Alternative measuring techniques include comprehensive multi-dimensional gas chromatography as described in WO 2008 107477 and the method described by Brandt et al, in Polycyclic Aromatic Compounds 16 (1999) 21. High Performance Liquid Chromatography combined with fluorescence could also be used.

The number of PAC concentrations that should be measured in step (a) will be determined by how many PAC concentrations have been measured when developing the predictive model (in step (ii)) and the classification rules (in step (vi)). Suitably the concentrations of at least five PACs are measured in step (a), preferably the concentrations of at least 10 PACs, more preferably at least 20 PACs. Preferably the PAC concentrations that are measured in step (a) include the concentrations of at least one of benzo[a]pyrene, benzo[k]fluoranthene and benzo[b]fluoranthene.

In step (b), a predictive model is applied to the concentrations of at least two PACs to estimate a Mutagenicity Index. The predictive model has been developed by a process of:

i) selecting a set of hydrocarbon samples;

ii) measuring the concentrations of at least two PACs in each hydrocarbon sample;

iii) determining the Mutagenicity Index of each hydrocarbon sample; and

iv) using regression analysis to determine a correlation between the concentrations of at least two PACs and the Mutagenicity Index.

The set of hydrocarbon samples in step (i) suitably includes at least 10 samples, preferably at least 20 samples. A greater number of samples will improve the reliability of the predictive model. To obtain a reliable model the skilled person should select a set of hydrocarbon samples that reflect the type of hydrocarbon samples that will be tested using the method of the invention. For example, if the method of estimating the mutagenicity is to be applied to bituminous samples, then the set of hydrocarbon samples in step (i) should suitably include a proportion of, preferably a majority of, and most preferably all bituminous samples. If the method is to be applied to two different types of samples, e.g. bituminous samples and cracked residue samples, it may be preferable to use two sets of hydrocarbon samples (e.g. bituminous samples and cracked residue samples) and develop two predictive models (e.g. one for bituminous samples and one for cracked residue samples).

It is preferred that, when developing the predictive model, a greater number of PAC concentrations are measured because this can help to improve the reliability of the model. Suitably the concentrations of at least five PACs are measured in step (ii), preferably the concentrations of at least 10 PACs, more preferably at least 20 PACs.

Different PACs contribute differently to the Mutagenicity Index, and it is preferred to measure the concentrations of the PACs that seem to have the most significant effect on the Mutagenicity Index. Preferably the PAC concentrations that are measured in step (ii) include the concentration of at least one of benzo[a]pyrene, benzo[k]fluoranthene and benzo[b]fluoranthene.

The methods that may be used to measure the concentrations in step (ii) are the same as the methods that may be used to measure the concentrations in step (a). Preferably, GC-MS is used in step (ii). However, it is not necessary to use the same method in step (ii) as in step (a), e.g. the predictive model could be developed using GC-MS results from step (ii) and the PAC concentrations could be measured using a GCxGC measurement in step (a).

In step (iii) the Mutagenicity Index of each hydrocarbon sample is determined. Preferably the Mutagenicity Index is determined using the Modified Ames Test, for example as described in ASTM E 1687-10. The test has been described in a number of references including G.R. Blackburn, R.A. Deitch, C.A. Schreiner and C.R. Mackerer, Cell Biology and Toxicology, 2, 1, 63-84, 31, 1986; G.R. Blackburn, R.A. Deitch and C.A. Schreiner, Cell Biology and Toxicology, 1, 1, 67-80, 16, 1984; T.A. Roy, S.W. Johnson, G.R. Blackburn and C.R. Mackerer, Fundamental & Applied Toxicology, 10, 466-476, 1988.

In step (iv), regression analysis is used to determine the correlation between the concentrations of at least two PACs and the Mutagenicity Index. Preferably, multiple linear regression analysis is used. In a preferred method, multiple subsets of PAC concentrations are randomly selected, and multiple predictive models are developed based upon regression analysis of the subsets of PAC concentrations and Mutagenicity Index. The quality of each model is assessed and then the best models are chosen and are used to estimate mutagenicity. A most preferred method has the following steps:

1) Randomly generate a number between 1 and 5 (nv)

2) Randomly shuffle a basis set of PAC concentrations and select the first nv variables

3) Build a Multiple Linear Regression Model using the subset of PACs and the Mutagenicity Index

4) Assess the quality of the Model

5) Repeat steps 1-4, e.g. 1000 times

6) Filter the models, removing repeat selections and models of lower quality

7) Use the remaining models to predict Mutagenicity Index
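By way of illustration only, steps 1) to 7) above could be sketched in Python as follows. This is not the implementation used by the inventors. The function names (solve, fit_mlr, build_models, predict_all), the random seed and the use of ordinary least squares via the normal equations are all assumptions made for the sketch. For brevity only an adjusted R² threshold is applied in the filtering step; a p-value filter is omitted.

```python
import random


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_mlr(rows, y):
    """Least-squares fit with intercept; returns (coefficients, adjusted R^2)."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, d)) for d in design]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    n, p = len(y), k - 1
    adj_r2 = 1.0 - (ss_res / ss_tot) * (n - 1) / (n - p - 1)
    return beta, adj_r2


def build_models(x, y, n_iter=1000, max_vars=5, min_adj_r2=0.7, seed=0):
    """x: one list of PAC concentrations per sample; y: measured MI values."""
    rng = random.Random(seed)
    n_pacs = len(x[0])
    models = {}
    for _ in range(n_iter):                              # step 5: repeat
        nv = rng.randint(1, min(max_vars, n_pacs))       # step 1
        order = list(range(n_pacs))
        rng.shuffle(order)                               # step 2
        subset = tuple(sorted(order[:nv]))
        if subset in models:                             # step 6: repeat selection
            continue
        rows = [[row[j] for j in subset] for row in x]
        try:
            models[subset] = fit_mlr(rows, y)            # steps 3 and 4
        except ZeroDivisionError:                        # degenerate fit, skip it
            continue
    # step 6: drop models of lower quality
    return {s: m for s, m in models.items() if m[1] >= min_adj_r2}


def predict_all(models, sample):
    """Step 7: one Mutagenicity Index prediction per retained model."""
    return [beta[0] + sum(b * sample[j] for b, j in zip(beta[1:], subset))
            for subset, (beta, _) in models.items()]
```

Keying the retained models by the sorted tuple of selected PAC indices is one simple way of detecting repeat selections; other bookkeeping would serve equally well.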

The restriction of including up to five variables in step (1) keeps the model degrees of freedom as high as possible in order to minimise the risk of overfitting the training data, which would lead to poorer predictive models. The quality of each of the predictive models may be assessed in step (4) using techniques known to the skilled statistician, e.g. by calculating p values for the parameter estimates or by recording the adjusted R² value (adjusted for the number of variables in the model). In step (6), repeat selections (models using identical PAC concentrations) are removed and thresholds are applied to the quality assessments to remove models of lower quality, e.g. models where adjusted R² < 0.7 could be removed, and models where p-value > 0.2 could be removed. The remaining models may be used to estimate the Mutagenicity Index.

The outcome of step (b) is an estimated Mutagenicity Index. In the preferred method where multiple linear regression is used, a set of predicted Mutagenicity Index values is provided from a set of predictive models. The median value of this set of predicted values can be taken as the estimated Mutagenicity Index. This median value can be reported with a confidence interval. Alternatively, the set of values can be analysed to calculate the proportion of predicted values that fall within specified ranges, e.g. the proportion of values where the Mutagenicity Index is greater than 2, the proportion between 1 and 2, and the proportion less than 1.

Classification rules can be applied to the estimated Mutagenicity Index from step (b) to classify the samples into more than one mutagenicity category. For example, median values > 2 could be classified as "non-compliant", median values < 1 could be classified as "compliant" and other values could be classified as "test".

Alternatively, if the proportion of predicted values with Mutagenicity Index > 2 was less than 10%, this could be classified as "compliant", if the proportion of predicted values with Mutagenicity Index > 2 was more than 80%, this could be classified as "non-compliant" and all other samples could be classified as "test".
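The two example rule sets above (one based on the median, one based on the proportion of predictions above 2) could be sketched as follows. The function and parameter names are hypothetical, and values falling exactly on a threshold are assigned to "test" here, which is an assumption since the text does not address boundary values.

```python
from statistics import median


def classify_by_median(predictions):
    """Classify from the median of the predicted Mutagenicity Index values."""
    m = median(predictions)
    if m > 2:
        return "non-compliant"
    if m < 1:
        return "compliant"
    return "test"


def classify_by_proportion(predictions, mi_threshold=2.0, low=0.10, high=0.80):
    """Classify from the share of predictions with MI above mi_threshold."""
    share = sum(p > mi_threshold for p in predictions) / len(predictions)
    if share < low:
        return "compliant"
    if share > high:
        return "non-compliant"
    return "test"
```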

In step (c), classification rules are applied to the concentrations of at least two PACs to classify the hydrocarbon sample into one of two or more mutagenicity categories. The classification rules have been developed by a process of

v) selecting a set of hydrocarbon samples;

vi) measuring the concentrations of at least two PACs in every hydrocarbon sample;

vii) determining the Mutagenicity Index of each hydrocarbon sample; and

viii) using discriminant analysis to develop functions that separate the hydrocarbon samples into mutagenicity categories based upon the concentrations of at least two PACs.

The set of hydrocarbon samples for use in step (v) can be the same as or different from the set of hydrocarbon samples used in step (i). For practical reasons, the skilled person may often use the same set. Essentially the same considerations apply when choosing the samples for step (v) as for step (i), but there may be more freedom to choose a broad range of hydrocarbon samples in step (v) as the discriminant analysis may be less affected by outlying results than the regression analysis.

The methods that may be used to measure the concentrations in step (vi) are the same as the methods that may be used to measure the concentrations in step (a) and step (ii). Preferably, GC-MS is used in step (vi). However, it is not necessary to use the same method in step (vi) as in steps (a) and (ii), e.g. the classification rules could be developed using GC-MS results from step (vi) and the PAC concentrations could be measured using a GCxGC measurement in step (a).

The determination of the Mutagenicity Index in step (vii) is as outlined above for step (iii).

In step (viii), discriminant analysis is used to develop functions that separate the hydrocarbon samples into mutagenicity categories based upon the concentrations of at least two PACs. Preferably, quadratic discriminant analysis is used. It is preferred that, when developing the classification rules, the concentrations of only two PACs per sample are used in the analysis. Most preferably the PAC concentrations that are used include benzo[a]pyrene and phenanthrene.

It is not necessary that the same PAC concentrations are used in step (viii) as in step (iv). It is preferred that many PAC concentrations (e.g. more than 10) will be used to develop the predictive model and only two PAC concentrations will be used in the discriminant analysis.

Methods of discriminant analysis are well known to the skilled statistician and computer programs may be used to carry out the analysis. Preferably there are three mutagenicity categories that can be denoted as "compliant", "non-compliant" and "test". In a less preferred embodiment, there are two mutagenicity categories that can be denoted as "compliant" and "non-compliant". The categories correspond to Mutagenicity Index values, e.g. "compliant" could be < 1, "non-compliant" could be > 2 and "test" could be > 1 and < 2. For the set of hydrocarbon samples selected in step (v), the skilled person obtains PAC concentration data and Mutagenicity Index values and can then categorise the samples according to their Mutagenicity Index values.

Discriminant analysis is used to define boundaries between the categories, based upon PAC concentrations. These boundaries are represented by functions (quadratic functions when quadratic discriminant analysis is used).
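As a non-limiting sketch, a textbook two-variable quadratic discriminant analysis could be written as below. It assumes each category is modelled by its own Gaussian distribution with priors taken from the category frequencies in the training set; the text does not specify these details or the software used, and the names fit_qda, qda_score and qda_probabilities are hypothetical. Each point is a pair of PAC concentrations (for instance benzo[a]pyrene and phenanthrene), and each category needs at least three non-collinear training points.

```python
import math


def fit_qda(points, labels):
    """Estimate prior, mean and 2x2 covariance for each mutagenicity category."""
    params = {}
    for cat in set(labels):
        pts = [p for p, lab in zip(points, labels) if lab == cat]
        n = len(pts)
        mx = sum(p[0] for p in pts) / n
        my = sum(p[1] for p in pts) / n
        sxx = sum((p[0] - mx) ** 2 for p in pts) / (n - 1)
        syy = sum((p[1] - my) ** 2 for p in pts) / (n - 1)
        sxy = sum((p[0] - mx) * (p[1] - my) for p in pts) / (n - 1)
        params[cat] = (n / len(points), (mx, my), (sxx, sxy, syy))
    return params


def qda_score(x, prior, mean, cov):
    """Quadratic discriminant function (log prior + Gaussian log density,
    with the constant term common to all categories dropped)."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    maha = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return math.log(prior) - 0.5 * math.log(det) - 0.5 * maha


def qda_probabilities(params, x):
    """Posterior probability of each category for one pair of concentrations."""
    scores = {cat: qda_score(x, *p) for cat, p in params.items()}
    top = max(scores.values())
    weights = {cat: math.exp(s - top) for cat, s in scores.items()}
    total = sum(weights.values())
    return {cat: w / total for cat, w in weights.items()}
```

The boundary between two categories is the set of points where their scores are equal, which is a quadratic curve because each score is quadratic in the two concentrations.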

In step (c) the functions developed in step (viii) are used to calculate the probability that a hydrocarbon sample, having particular PAC concentrations, falls into each of the defined mutagenicity categories. Classification rules are applied to these probabilities, e.g. if the probability of falling into the "non-compliant" category is greater than 80%, then classify as "non-compliant"; if the probability of falling into the "compliant" category is greater than 20% and the probability of falling into the "non-compliant" category is less than 10%, then classify as "compliant"; otherwise categorise as "test".

With the present invention, the skilled person obtains an estimated Mutagenicity Index from step (b) (either a value or set of values), which can be subjected to classification rules to categorise the hydrocarbon sample. The skilled person obtains probabilities of falling within mutagenicity categories in step (c) and these are subjected to classification rules to categorise the hydrocarbon sample. The two-fold statistical approach of the present invention provides the skilled person with two estimates of mutagenicity which can help the skilled person to understand whether a sample should be handled, not handled or further tested. The two estimates of mutagenicity are suitably combined to give a final estimate of mutagenicity.
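The example probability rules given above for step (c) could be expressed as the following sketch. The function name and the default thresholds as keyword parameters are illustrative choices; the thresholds themselves are the example values from the text.

```python
def classify_by_probability(p_compliant, p_non_compliant,
                            nc_cut=0.80, c_cut=0.20, nc_max_for_c=0.10):
    """Apply the example rules to category probabilities from step (c)."""
    if p_non_compliant > nc_cut:
        return "non-compliant"
    if p_compliant > c_cut and p_non_compliant < nc_max_for_c:
        return "compliant"
    return "test"
```

The two explicit conditions cannot both hold, so the order in which they are checked does not affect the result.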

Figure 1 shows a diagram which provides an example of how the skilled person can use the results of the present method. In step (a) the skilled person tests the sample, measuring the concentrations of at least two PACs (and probably more than 10 PACs). In step (b) the skilled person applies the predictive model to all or some of the measured PAC concentrations. The outcome of step (b) is a set of Mutagenicity Index predictions based upon a set of predictive models. The proportion of predictions of Mutagenicity Index that are greater than 2 (termed "M") is calculated. The samples are categorised based upon the following rules:

If M < 10%: "Compliant"

If M > 80%: "Non-compliant"

If 10% < M < 80%: "Test"

Then the sample is given a score based upon the categorisation: "Compliant" is 0, "Test" is 1 and "Non-compliant" is 2.

In step (c) the skilled person applies the classification rules to two of the measured PAC concentrations. The outcome is a set of probabilities that the sample falls into the "Compliant", "Non-compliant" and "Test" categories (termed P(C), P(NC) and P(T)). The samples are categorised based upon the following rules:

If P(C) > 20% and P(NC) < 10%: "Compliant"

If P(NC) > 80%: "Non-compliant"

Else: "Test"

Then the sample is given a score based upon the categorisation: "Compliant" is 0, "Test" is 1 and "Non-compliant" is 2.

In a final step, the scores from the two methods are added together. This provides a final categorisation. A combined score of 0 or 1 is categorised as "Compliant". A score of 2 is categorised as "Test" and a score of 3 or 4 is categorised as "Non-compliant". The skilled person can use the final score to decide whether to use the sample (if it is "Compliant"), not use the sample (if it is "Non-compliant") or send it for further testing.
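The final scoring step described above amounts to a small lookup and sum, sketched here. The names SCORES and final_category are hypothetical.

```python
# Score assigned to each category by either method.
SCORES = {"Compliant": 0, "Test": 1, "Non-compliant": 2}


def final_category(regression_category, discriminant_category):
    """Add the two scores and map the total (0 to 4) to a final category."""
    total = SCORES[regression_category] + SCORES[discriminant_category]
    if total <= 1:
        return "Compliant"
    if total == 2:
        return "Test"
    return "Non-compliant"
```

Under this scheme a sample is finally "Compliant" only if at least one method says "Compliant" and the other says no worse than "Test"; a "Compliant" result from one method and a "Non-compliant" result from the other gives a total of 2 and therefore "Test".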

The invention will now be described by reference to an example which is not intended to be limiting of the invention.

Example

Development of Multiple Linear Regression Predictive Model

A training set of 33 bitumen samples was selected. The concentrations of 16 different PACs were measured using GC-MS at the Doring laboratory. The PACs were Naphthalene, Acenaphthylene, Acenaphthene, Fluorene, Phenanthrene, Anthracene, Fluoranthene, Pyrene, Benzo[a]anthracene, Chrysene, Benzo[b]fluoranthene, Benzo[k]fluoranthene, Benzo[a]pyrene, Indeno[1,2,3-cd]pyrene, Dibenzo[a,h]anthracene and Benzo[g,h,i]perylene.

The Mutagenicity Index of each bitumen sample was measured using the Modified Ames Test according to ASTM E 1687-10.

One sample was excluded from the training set for the multiple linear regression analysis because the Mutagenicity Index for this sample was significantly higher than all the other values (almost 4 times the next highest value).

The regression analysis was carried out using the following steps:

• Randomly generate a number between 1 and 5 (nv)

• Randomly shuffle a basis set of 16 PAC concentrations and select the first nv variables

• Build a Multiple Linear Regression Model using the subset of PACs and the Mutagenicity Index

• Calculate the p values for the parameter estimates and the adjusted R² value (adjusted for the number of PAC concentrations in the model)

• Repeat 1000 times

• Filter the models, removing repeat selections (models using identical PAC concentrations) and removing models of lower quality (models where adjusted R² < 0.7 and models where p-value > 0.2)

This provided a set of models which were used to estimate mutagenicity values from the concentrations of 16 PACs.

Development of Classification Rules using Quadratic Discriminant Analysis

The same training set was used as for the Multiple Linear Regression model development. However, the high MI value sample was not excluded, so the training set included all 33 samples.

The concentrations of benzo[a]pyrene and phenanthrene for each sample were used to classify the mutagenicity categories. The Mutagenicity Index for each sample was known and those with values > 2 were categorised as "non-compliant"; those with values between 1 and 2 were categorised as "test" and those with values < 1 were categorised as "compliant". Quadratic discriminant analysis was used to develop quadratic functions that mark boundaries between the categories. Figure 2 shows the 33 samples plotted according to their benzo[a]pyrene and phenanthrene concentrations. The symbol (T, A or ■) denotes whether the measured Mutagenicity Index fell into the test, non-compliant or compliant category. The quadratic curves show how the quadratic discriminant analysis has provided boundaries between test, non-compliant and compliant regions.

Assessment of Predictive Model and Classification Rules

During development, the robustness of the two modelling approaches (regression and discriminant analysis) was tested by cross-validation using the test dataset. A leave-one-out approach was the technique most commonly applied: each sample in turn is left out of the training dataset and used as a validation sample. Models were built using the remaining training data and used to predict each left-out sample in turn. The modelling was then assessed using the cross-validated samples, whose response was already known. The results of this assessment helped to define the classification rules.
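A leave-one-out loop of the kind described can be sketched generically as below. The helper names (leave_one_out, fit_line, predict_line) are hypothetical, and a one-variable straight-line fit stands in for the actual regression or discriminant models purely to keep the example short.

```python
def leave_one_out(samples, targets, fit, predict):
    """Predict each sample from a model trained on all the other samples."""
    predictions = []
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = targets[:i] + targets[i + 1:]
        model = fit(train_x, train_y)
        predictions.append(predict(model, samples[i]))
    return predictions


def fit_line(xs, ys):
    """Stand-in model: least-squares straight line through (x, y) pairs."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return my - slope * mx, slope


def predict_line(model, x):
    intercept, slope = model
    return intercept + slope * x
```

Comparing the returned predictions with the known responses gives the cross-validated error for whichever model is passed in as fit and predict.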

Claims

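1. A method of estimating the mutagenicity of a hydrocarbon sample, comprising steps of

a) measuring the concentrations of at least two PACs in the hydrocarbon sample;

b) applying a predictive model to the concentrations of at least two PACs to estimate a Mutagenicity Index; and

c) applying classification rules to the concentrations of at least two PACs to classify the hydrocarbon sample into one of two or more mutagenicity categories;

wherein the predictive model has been developed by a process of:

i) selecting a set of hydrocarbon samples;

ii) measuring the concentrations of at least two PACs in each hydrocarbon sample;

iii) determining the Mutagenicity Index of each hydrocarbon sample; and

iv) using regression analysis to determine the correlation between the concentrations of at least two PACs and the Mutagenicity Index;

and the classification rules have been developed by a process of

v) selecting a set of hydrocarbon samples;

vi) measuring the concentrations of at least two PACs in each hydrocarbon sample;

vii) determining the Mutagenicity Index of each hydrocarbon sample; and

viii) using discriminant analysis to develop functions that separate the hydrocarbon samples into mutagenicity categories based upon the concentrations of at least two PACs.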

2. A method according to claim 1, wherein the hydrocarbon sample is a bituminous sample, the set of hydrocarbon samples in step (i) is a set of bituminous samples and the set of hydrocarbon samples in step (v) is a set of bituminous samples.

3. A method according to claim 1 or claim 2, wherein in step (a) the concentrations of the at least two PACs are measured by gas chromatography - mass spectrometry.

4. A method according to any preceding claim, wherein in steps (ii) and (vi) the concentrations of the PACs are measured by gas chromatography - mass spectrometry.

5. A method according to any preceding claim, wherein the concentrations of at least ten PACs are measured in step (a) and in step (ii) and this includes the concentration of at least one of benzo[a]pyrene, benzo[k]fluoranthene and benzo[b]fluoranthene.

6. A method according to any preceding claim, wherein at least 20 hydrocarbon samples are selected in step (i) and at least 20 hydrocarbon samples are selected in step (v).

7. A method according to any preceding claim, wherein in steps (iii) and (vii) the Mutagenicity Index is determined using the Modified Ames Test.

8. A method according to any preceding claim, wherein in step (iv) multiple linear regression analysis is used.

9. A method according to any preceding claim, wherein in step (a) and step (vi) the concentrations of benzo[a]pyrene and phenanthrene are measured and in step (viii) discriminant analysis is used to develop functions that separate the hydrocarbon samples into mutagenicity categories based upon the concentrations of benzo[a]pyrene and phenanthrene.

10. A method according to any preceding claim, wherein in step (viii) quadratic discriminant analysis is used.