CN116359420A - Chromatographic data impurity qualitative analysis method based on clustering algorithm and application - Google Patents

Info

Publication number
CN116359420A
Authority
CN
China
Prior art keywords
impurity
peak
retention time
cluster
clustering
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310378197.1A
Other languages
Chinese (zh)
Other versions
CN116359420B (en)
Inventor
李奇文
柳彦宏
刘鹏飞
张�浩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yantai Guogong Intelligent Technology Co ltd
Original Assignee
Yantai Guogong Intelligent Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Yantai Guogong Intelligent Technology Co ltd
Priority to CN202310378197.1A
Publication of CN116359420A
Application granted
Publication of CN116359420B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N30/00 Investigating or analysing materials by separation into components using adsorption, absorption or similar phenomena or using ion-exchange, e.g. chromatography or field flow fractionation
    • G01N30/02 Column chromatography
    • G01N30/86 Signal analysis
    • G01N30/8675 Evaluation, i.e. decoding of the signal into analytical information
    • G01N30/8679 Target compound analysis, i.e. whereby a limited number of peaks is analysed
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N30/00 Investigating or analysing materials by separation into components using adsorption, absorption or similar phenomena or using ion-exchange, e.g. chromatography or field flow fractionation
    • G01N30/02 Column chromatography
    • G01N30/86 Signal analysis
    • G01N30/8693 Models, e.g. prediction of retention times, method development and validation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/762 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using clustering, e.g. of similar faces in social networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Immunology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Analytical Chemistry (AREA)
  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Pathology (AREA)
  • Artificial Intelligence (AREA)
  • Biochemistry (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • Library & Information Science (AREA)
  • Treatment Of Liquids With Adsorbents In General (AREA)

Abstract

The invention relates to the technical field of impurity analysis methods, and in particular to a qualitative analysis method for impurities in chromatographic data based on a clustering algorithm, and to its application. The method comprises: calculating relative retention times; preliminary clustering, in which a preset number of chromatograms are clustered starting from randomly selected samples and the clustering results form a cluster set; and cluster reclassification, in which the preliminary clustering result is clustered again, and the cluster set with the most clusters obtained after all chromatograms have been processed is the impurity model of the substance. A chromatogram is compared with the impurity model to identify an abnormal number of impurities, new impurities, and abnormal single-impurity areas. The method clusters and re-clusters chromatographic data using K-Means and the Euclidean distance, and compares chromatographic data with the impurity model. It is mainly used in fine chemical production to detect new impurities and to control the number and area of impurities; it addresses the low efficiency and errors of manual discrimination and improves working efficiency.

Description

Chromatographic data impurity qualitative analysis method based on clustering algorithm and application
Technical Field
The invention relates to a chromatographic data impurity qualitative analysis method based on a clustering algorithm and application thereof, belonging to the technical field of impurity analysis methods.
Background
In the fine chemical industry, an important means of controlling production quality is to inspect samples with a chromatograph and use the resulting chromatogram to judge the number of impurities and the area of each single impurity. Because the industry makes many kinds of products in small batches, a large number of samples must be inspected every day, most of them by chromatography: impurity testing of raw materials, quality control of the production process, quality control of finished products, and so on. Every day workers must visually compare the peaks in the chromatograms and classify and count the impurities.
At present, most quality control personnel tally a large volume of chromatogram peak data every day or every week to classify and confirm impurities, and for many unidentified peaks they must look up the chromatogram and identify and classify the peak by eye. In addition, during chromatographic testing each peak of the current chromatogram must be compared with other chromatograms to confirm whether the number of impurities or the area of a single impurity has increased. This adds to the workload and to the risk of quality control errors.
For these reasons, how to use chromatogram data to reduce the work of classifying, counting and comparing chromatograms, and to detect new impurities, increases in the number of impurities and abnormal single-impurity areas, is a technical problem to be solved.
Disclosure of Invention
In view of the shortcomings of the prior art, the invention provides a qualitative analysis method for impurities in chromatographic data based on a clustering algorithm, and its application. The method makes impurity comparison convenient and fast, removes the present reliance on the historical experience of staff for impurity comparison, and supports reminders and early warnings for new and abnormal impurities as well as impurity trend analysis, thereby effectively improving working efficiency.
The technical scheme for solving the technical problems is as follows: a qualitative analysis method for chromatographic data impurities based on a clustering algorithm comprises the following steps:
S1, calculating relative retention time
Calculating the relative retention time of the impurity peaks in the existing chromatograms of a certain substance;
S2, preliminary clustering
Taking the impurity peak relative retention times calculated in step S1 as samples, clustering a preset number of chromatograms with the K-Means algorithm to obtain a cluster set, and assigning a cluster identifier to every cluster in the cluster set;
the preliminary clustering output comprises: the cluster identifiers of all clusters in the cluster set and the center retention time of each cluster;
S3, cluster reclassification
For the cluster set obtained in step S2, finding, for each cluster, the data with the most impurity peaks in the existing chromatographic data of that cluster; performing a clustering calculation with the K-Means algorithm, taking the relative retention times of those impurity peaks as samples, to form an impurity data model; calculating, during clustering, the average area of the impurity peaks gathered into each cluster; and taking the maximum and minimum relative retention times of all impurity peaks contained in each cluster as the impurity boundaries;
the cluster reclassification output comprises: the cluster identifiers of the impurity peaks in the impurity data model, the center retention time of each cluster, the average area of the impurity peaks in each cluster, and the impurity boundaries;
S4, comparing new chromatogram data with the impurity data model
Finding the cluster in the impurity model whose center retention time has the shortest Euclidean distance to the relative retention time of the peak to be judged in the new chromatogram; judging whether the relative retention time of the peak is within the impurity boundaries of the corresponding cluster; and comparing the area of the peak to be judged with the average area of the impurity peaks in the corresponding cluster, so as to confirm the number of impurities in the new chromatogram, any new impurities, and any abnormal single-impurity area;
the output of step S4 is: the cluster identifier of each impurity peak in the new chromatogram, the relative retention time of each impurity peak, the area of each impurity peak, and the number of impurities in the new chromatogram.
Further, in step S1, the relative retention time of the impurity peak is calculated according to the following formulas (1) - (2):
[Formula (1), defining the time offset coefficient coef: image BDA0004171110470000021, not reproduced]
Rt = t - t*coef    (2)
wherein coef is the time offset coefficient of the impurity peak, t is the retention time of the impurity peak, T is the retention time of the main peak, and Rt is the relative retention time of the impurity peak;
Further, the main peak is the peak whose area percentage in the chromatogram is greater than or equal to 95%; an impurity peak is a peak, other than the main peak, whose peak area is greater than or equal to 10 ppt.
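The peak selection rule above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `Peak` class, the function name `split_peaks` and the parameter `min_impurity_fraction` are hypothetical, and because the source gives the impurity threshold only as 10 ppt, the sketch assumes the caller converts that threshold to a fraction of total peak area.

```python
from dataclasses import dataclass


@dataclass
class Peak:
    retention_time: float  # retention time t of the peak
    area: float            # integrated peak area, arbitrary units


def split_peaks(peaks, main_fraction=0.95, min_impurity_fraction=0.0):
    """Return (main_peak, impurity_peaks) for one chromatogram.

    main_fraction: minimum share of total area for the main peak (95% in the source).
    min_impurity_fraction: assumed encoding of the impurity area threshold.
    """
    total = sum(p.area for p in peaks)
    if total <= 0:
        raise ValueError("chromatogram has no positive peak area")
    main = max(peaks, key=lambda p: p.area)
    if main.area / total < main_fraction:
        raise ValueError("no peak reaches the main-peak area fraction")
    impurities = [
        p for p in peaks
        if p is not main and p.area / total >= min_impurity_fraction
    ]
    return main, impurities
```

Peaks below the threshold are simply dropped, which matches the later step in which the main peak and the small peaks are removed before clustering.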
Further, in step S4, when the relative retention time of the peak to be judged has the shortest Euclidean distance to the center retention time of a certain cluster in the impurity model and falls within the impurity boundaries of that cluster, then, if the area of the peak to be judged is greater than 120% of the average area of the impurity peaks in that cluster, the single-impurity area of the peak is considered too large and an alarm is raised.
Further, in step S4, when the relative retention time of the peak to be judged has the shortest Euclidean distance to the center retention time of a certain cluster in the impurity model but lies outside the impurity boundaries of that cluster, the peak to be judged is considered a new impurity and an alarm is raised.
The invention also discloses application of the chromatographic data impurity qualitative analysis method based on the clustering algorithm, wherein the method is applied to analysis of impurity peaks in chromatograms in the chemical production or research and development process, and whether the number of impurities is increased or the area of single impurities is increased is confirmed.
Further, the method is applied to the quality control of production raw materials, the quality control of production processes and the quality control of finished products.
The beneficial effects of the invention are as follows:
In the conventional approach, quality inspectors confirm retention times from experience, whereas the present method calculates them with a mathematical formula, which markedly improves data accuracy. Likewise, quality control personnel usually classify the impurity peaks of samples of many product varieties from experience; the mathematical approach of preliminary clustering followed by cluster reclassification puts the classification on a more scientific, well-founded basis and reduces human misjudgment.
The method enables impurity control alarms in fine chemical production: peak identities are confirmed by comparing chromatogram data with the impurity model, and the identified peaks are then compared between chromatograms to raise an alarm when the number of impurities increases, so that new impurities and abnormal single-impurity areas are discovered in time.
The method addresses the low efficiency and errors of manual discrimination; controlling impurities through a computer program effectively improves working efficiency and discrimination accuracy.
Drawings
Fig. 1 is a flowchart of a qualitative analysis method for chromatographic data impurities based on a clustering algorithm in the embodiment.
FIG. 2 is a schematic diagram of the preliminary clustering described in the examples;
FIG. 3 is a schematic diagram of cluster reclassification as described in the examples.
Detailed Description
The following describes the present invention in detail. The present invention may be embodied in many other forms than described herein and similarly modified by those skilled in the art without departing from the spirit of the invention, so that the invention is not limited to the specific embodiments disclosed.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention.
In this application, the impurity control service is typically implemented by a computer program. Taking the production of an OLED intermediate chemical as an example, the whole process is: impurity model setup, impurity model training, quality control of production raw materials, quality control of the production process, and quality control of finished products entering the warehouse. The qualitative analysis method for chromatographic data impurities based on a clustering algorithm is shown in Fig. 1.
Step one, calculation of relative retention time
Impurity data models are preset for the OLED intermediate and for its raw materials, each trained on 200 chromatograms (liquid or gas chromatograms, chosen according to the detection requirements of the chemical). The main peak is the peak whose area percentage in the chromatogram is greater than or equal to 95%; apart from the main peak, peaks with an area greater than 10 ppt are taken as impurity peaks, as required by the operating instructions.
Because of the chromatographic instrument, the farther a peak lies from the main peak, the larger its time offset. The time offset coefficient coef therefore has to be calculated for every impurity peak (every peak other than the main peak) in all chromatogram data, using formula (1). Once the time offset coefficient is obtained, the relative retention time Rt of the impurity peak is calculated using formula (2):
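[Formula (1), defining the time offset coefficient coef: image BDA0004171110470000041, not reproduced]
Rt = t - t*coef    (2)
where coef is the time offset coefficient of the impurity peak, t is the retention time of the impurity peak, T is the retention time of the main peak, and Rt is the relative retention time of the impurity peak.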
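Formula (2) can be applied directly once coef is known. The short Python sketch below assumes that coef has already been computed by formula (1), which is not reproduced in this text; the function names are illustrative only.

```python
def relative_retention_time(t: float, coef: float) -> float:
    """Formula (2): Rt = t - t * coef, for one impurity peak."""
    return t - t * coef


def relative_retention_times(peaks):
    """peaks: iterable of (t, coef) pairs for the impurity peaks of one chromatogram."""
    return [relative_retention_time(t, coef) for t, coef in peaks]
```

With a coef of 0 the relative retention time equals the measured retention time, so the coefficient acts purely as a proportional correction.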
Step two, preliminary clustering
When a sample undergoes chromatographic inspection, the 200 most recent earlier chromatograms of the same sample are retrieved in time order, the main peak and the peaks with an area below 10 ppt are removed, and the impurity peaks of these chromatograms are clustered with the K-Means algorithm as follows:
The relative retention times of all impurity peaks are calculated according to the formulas in step one.
A chromatogram sample consists of several impurity peaks; assuming chromatogram A consists of peaks A1, A2, A3 and so on, it is written A {A1, A2, A3 ……}.
Find the chromatogram data sample with the most impurity peaks among all chromatograms of the sample, and select the relative retention times of a random number of its peaks as the cluster center retention times (assumed to be D {D1, D2, D3 ……}).
Preliminary classification is performed as shown in Fig. 2: select a chromatogram data sample (assumed to be E {E1, E2, E3 ……}), calculate the Euclidean distance from the relative retention time of each of its impurity peaks to the center retention times of the clusters D {D1, D2, D3 ……}, and assign each peak to the cluster whose center retention time is nearest. For example, if E1 and D1 are considered the same cluster, identified as B1, and the average retention time of D1 and E1 is 4.4, then the center retention time of cluster B1 is reset to 4.4. The other impurity peaks of chromatograms D and E are processed in the same way, forming the cluster set B {B1, B2, B3 ……}.
Repeat the above operation with each remaining chromatogram and B {B1, B2, B3 ……} until all chromatogram data have been iterated, forming the final cluster set.
Output: the cluster identifiers of all clusters in the cluster set and the center retention time of each cluster.
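The preliminary clustering pass described above can be sketched in Python as follows. This is a hedged illustration, not the patented implementation: it seeds one cluster per peak of the chromatogram with the most impurity peaks (the source selects a random number of them), and it updates each center as the running mean of all assigned peaks, which reproduces the 4.4 example for two peaks but is an assumption for later iterations. In one dimension the Euclidean distance is simply the absolute difference. The function name and the dictionary layout are hypothetical.

```python
def preliminary_clustering(chromatograms):
    """chromatograms: list of lists of impurity-peak relative retention times.

    Returns one dict per cluster with its identifier, center retention time
    and the member retention times assigned to it.
    """
    # Seed with the chromatogram that has the most impurity peaks.
    seed = max(chromatograms, key=len)
    members = [[rt] for rt in seed]
    centers = list(seed)

    for chrom in chromatograms:
        if chrom is seed:
            continue
        for rt in chrom:
            # Nearest center by 1-D Euclidean distance.
            k = min(range(len(centers)), key=lambda i: abs(rt - centers[i]))
            members[k].append(rt)
            # Reset the center to the mean of the cluster's members.
            centers[k] = sum(members[k]) / len(members[k])

    return [
        {"id": f"B{i + 1}", "center": c, "members": m}
        for i, (c, m) in enumerate(zip(centers, members))
    ]
```

For example, seeding with the peaks 4.3, 7.0 and 9.1 and then adding a chromatogram with peaks 4.5 and 7.2 moves the first two centers to about 4.4 and 7.1 and leaves the third unchanged.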
Step three, cluster reclassifying
Based on the result of the preliminary clustering in step two, the data containing the most peak retention times in each cluster are clustered again, as follows:
As shown in Fig. 3, for each cluster in the preliminary clustering result, find the chromatogram data containing the most impurity peaks in that cluster. Assume the preliminary clustering yields the cluster set L {L1, L2, L3 ……}; in cluster L1 the chromatogram data sample with the most impurity peaks is H {H1, H2, H3 ……}, contributing H1, H2 and H3, and in cluster L2 the chromatogram data sample with the most impurity peaks is J {J1, J2, J3 ……}, contributing J3 and J4. Proceeding in the same way gives the set {H1, H2, H3, J4 ……} containing the largest number of impurity peaks over all preliminary clusters L.
K-Means clustering is then carried out on the 200 chromatograms based on the relative retention times of these impurity peaks to form the impurity model. During clustering, the areas of all impurity peaks in a cluster are averaged to give the average impurity peak area of that cluster, and the maximum and minimum relative retention times of all impurity peaks contained in a new cluster are taken as its impurity boundaries.
Output: the cluster identifiers of the impurity peaks in the impurity data model, the center retention time of each cluster, the average area of the impurity peaks in each cluster, and the impurity boundaries.
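The model-building step can be illustrated with a small pure-Python sketch. It is an assumption-laden illustration rather than the patented implementation: `kmeans_1d` is a plain Lloyd-style K-Means on one-dimensional relative retention times, the initial centers are assumed to come from the preliminary clustering, and the names `build_impurity_model`, `mean_area` and `boundary` are hypothetical.

```python
def nearest(value, centers):
    """Index of the center closest to value (1-D Euclidean distance)."""
    return min(range(len(centers)), key=lambda i: abs(value - centers[i]))


def kmeans_1d(values, initial_centers, max_iter=100):
    """Lloyd-style K-Means on scalars; returns (labels, centers)."""
    centers = list(initial_centers)
    for _ in range(max_iter):
        labels = [nearest(v, centers) for v in values]
        updated = []
        for i, old in enumerate(centers):
            group = [v for v, lab in zip(values, labels) if lab == i]
            updated.append(sum(group) / len(group) if group else old)
        if updated == centers:
            break
        centers = updated
    # Recompute labels so they always match the returned centers.
    return [nearest(v, centers) for v in values], centers


def build_impurity_model(peaks, initial_centers):
    """peaks: list of (relative_retention_time, area) impurity peaks."""
    labels, centers = kmeans_1d([rt for rt, _ in peaks], initial_centers)
    model = []
    for i, center in enumerate(centers):
        group = [p for p, lab in zip(peaks, labels) if lab == i]
        if not group:
            continue  # skip clusters that attracted no peaks
        rts = [rt for rt, _ in group]
        areas = [area for _, area in group]
        model.append({
            "id": f"A{len(model) + 1}",
            "center": center,
            "mean_area": sum(areas) / len(areas),
            "boundary": (min(rts), max(rts)),  # impurity boundaries
        })
    return model
```

Each model entry carries the four attributes listed in the output above: cluster identifier, center retention time, average area and impurity boundaries.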
Step four, comparing the chromatogram data with the impurity data model
The chromatogram data are compared with the impurity data model as follows:
Assume that A {A1, A2, A3 ……} is the impurity model, in which each impurity element has the attributes cluster identifier, center retention time, average area and impurity boundaries, and that B {B1, B2, B3 ……} is the chromatogram data to be compared, in which each peak has the attributes relative retention time and area.
Calculate the Euclidean distance between the relative retention time of every impurity peak in B {B1, B2, B3 ……} and the center retention time of every cluster element in the set A {A1, A2, A3 ……}.
Suppose B1 is closest in Euclidean distance to the center retention time of A1 and the relative retention time of B1 lies within the impurity boundaries of A1. Then impurity B1 is considered to be A1, and if the area of B1 is greater than 120% of the average area of A1, an alarm is raised that the single-impurity area of B1 is too large.
Suppose the relative retention time of B2 is closest in Euclidean distance to the center retention time of A2 but lies outside the impurity boundaries of A2. Then B2 is considered a new impurity and an alarm is raised.
Compare all elements of B with all elements of A in this way and label each with its cluster identifier.
Output: the cluster identifier, relative retention time and area of each impurity peak, and the number of impurities in the chromatogram.
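The comparison rules of step four can be sketched as follows. This is a minimal Python illustration assuming the impurity model is a list of dictionaries with hypothetical keys `id`, `center`, `mean_area` and `boundary`; the 120% threshold is taken from the text and exposed as the parameter `area_ratio`.

```python
def classify_peak(rt, area, model, area_ratio=1.2):
    """Compare one peak (relative retention time, area) with the impurity model."""
    # Cluster whose center retention time is nearest (1-D Euclidean distance).
    nearest = min(model, key=lambda c: abs(rt - c["center"]))
    low, high = nearest["boundary"]
    if not (low <= rt <= high):
        # Outside the impurity boundaries of the nearest cluster.
        return {"rt": rt, "area": area, "cluster": None, "alarm": "new impurity"}
    alarm = None
    if area > area_ratio * nearest["mean_area"]:
        alarm = "single impurity area too large"
    return {"rt": rt, "area": area, "cluster": nearest["id"], "alarm": alarm}


def compare_chromatogram(peaks, model, area_ratio=1.2):
    """peaks: list of (relative_retention_time, area) impurity peaks."""
    results = [classify_peak(rt, area, model, area_ratio) for rt, area in peaks]
    return {"peaks": results, "impurity_count": len(results)}
```

A peak that falls inside the boundaries of its nearest cluster inherits that cluster's identifier; a peak outside them is reported as a new impurity without an identifier.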
The specific application scene is as follows:
1. Quality control of production raw materials. Before raw materials are warehoused, a raw material sample is inspected by chromatography to obtain chromatogram data, and the chromatogram data of the three batches inspected before the current sample are retrieved. Steps two and three are carried out on the chromatogram data of the 200 batches of samples inspected before these four batches, in time order, to obtain the impurity data model, and step four is carried out on the four batches of chromatogram data against the impurity data model one by one. An alarm is raised when a new impurity is found; when the area of an impurity is greater than 120% of the average area of the corresponding impurity in the impurity model; when the number of impurities in the current chromatogram is greater than that of any one of the previous three batches; and when an impurity in the current chromatogram is new relative to the impurities of the previous three batches.
2. Quality control of the production process. A sample from a given process in production is inspected by chromatography to obtain chromatogram data.
Horizontal comparison and early warning: the chromatogram data of the current sample and of similar samples from the same order are retrieved, the 200 most recent sample chromatograms from other orders in the same process are retrieved in time order, and steps two and three are executed to obtain the impurity data model. Step four is then executed on the current sample and the other samples from the same process of the same order against the impurity data model. An alarm is raised when a new impurity is found; when the area of an impurity is greater than 120% of the average area of the corresponding impurity in the impurity model; when the number of impurities in the current chromatogram is greater than that of any other similar sample from the same process of the same order; and when an impurity in the current chromatogram is new relative to the impurities of the similar samples from the same process of the same order.
Longitudinal comparison and early warning: the result of comparing the sample from the previous processing step of the current sample with the impurity data model is retrieved, and if the number of impurities in the current sample is larger than in the previous processing step, an early warning is issued. In chemical production, several post-treatment processes are needed after the reaction to remove impurities, so the number of impurities should decrease step by step. If a sample from a later treatment process has more impurities than one from the preceding process, there may be a problem, and the method issues an early warning to remind workers to check the production process and the product.
3. Quality control of finished products. This is handled in the same way as the control of production raw materials.
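The batch-to-batch checks in the scenarios above can be sketched in Python as follows. This is a hedged illustration: it assumes each chromatogram has already been reduced to the set of cluster identifiers assigned in step four, and the function names, alarm strings and process names are hypothetical.

```python
def batch_alarms(current, previous_batches):
    """current, previous_batches[i]: sets of cluster identifiers per chromatogram.

    Implements the two batch-level checks: impurity count larger than any one
    earlier batch, and impurities not seen in any earlier batch.
    """
    alarms = []
    if any(len(current) > len(prev) for prev in previous_batches):
        alarms.append("impurity count increased")
    seen = set().union(*previous_batches)
    for cluster_id in sorted(current - seen):
        alarms.append(f"new relative to previous batches: {cluster_id}")
    return alarms


def longitudinal_warnings(impurity_counts):
    """impurity_counts: list of (process_name, count) in processing order.

    Returns (previous_process, process, previous_count, count) for every
    step where the impurity count rises instead of falling.
    """
    warnings = []
    for (prev_name, prev_n), (name, n) in zip(impurity_counts, impurity_counts[1:]):
        if n > prev_n:
            warnings.append((prev_name, name, prev_n, n))
    return warnings
```

The longitudinal check only flags increases between consecutive processing steps, reflecting the expectation that post-treatment removes impurities step by step.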
The method was applied to a large number of chromatographic data samples from a leading Chinese enterprise producing OLED intermediates. The impurity data models trained on 12000 chromatograms of 60 samples were compared against the retention times in the enterprise's daily classified impurity statistics reports: the accuracy reached 98.3%, a further 0.9% were within the range specified by the enterprise, and 16 impurities were found that the customer had not counted, which exceeded the expectations of the quality control personnel. When 200 chromatograms of a certain substance were compared with the impurity data model, the results were 94.5% consistent with the enterprise's daily comparison records, and a further 3% were errors by quality control personnel or peaks that they had not recorded as impurities. Analysis of the results shows that the method is reliable and effective; the very small inaccurate portion is attributed to excessive deviation of the enterprise's chromatographs and to differences between inspectors' methods.
The method correctly classifies peaks and compares chromatograms using the chromatogram data, addresses the low efficiency and errors of manual discrimination, effectively improves working efficiency and reduces labor.
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as a combination of technical features involves no contradiction, it should be considered within the scope of this description.
It will be apparent to those skilled in the art that various modifications and variations can be made in the present invention without departing from the spirit or scope of the invention as defined in the appended claims.

Claims (7)

1. The qualitative analysis method for chromatographic data impurities based on the clustering algorithm is characterized by comprising the following steps of:
S1, calculating relative retention time
Calculating the relative retention time of the impurity peaks in the existing chromatograms of a certain substance;
S2, preliminary clustering
Taking the impurity peak relative retention times calculated in step S1 as samples, clustering a preset number of chromatograms with the K-Means algorithm to obtain a cluster set, and assigning a cluster identifier to every cluster in the cluster set;
the preliminary clustering output comprises: the cluster identifiers of all clusters in the cluster set and the center retention time of each cluster;
S3, cluster reclassification
For the cluster set obtained in step S2, finding, for each cluster, the data with the most impurity peaks in the existing chromatographic data of that cluster; performing a clustering calculation with the K-Means algorithm, taking the relative retention times of those impurity peaks as samples, to form an impurity data model; calculating, during clustering, the average area of the impurity peaks gathered into each cluster; and taking the maximum and minimum relative retention times of all impurity peaks contained in each cluster as the impurity boundaries;
the cluster reclassification output comprises: the cluster identifiers of the impurity peaks in the impurity data model, the center retention time of each cluster, the average area of the impurity peaks in each cluster, and the impurity boundaries;
S4, comparing new chromatogram data with the impurity data model
Finding the cluster in the impurity model whose center retention time has the shortest Euclidean distance to the relative retention time of the peak to be judged in the new chromatogram; judging whether the relative retention time of the peak is within the impurity boundaries of the corresponding cluster; and comparing the area of the peak to be judged with the average area of the impurity peaks in the corresponding cluster, so as to confirm the number of impurities in the new chromatogram, any new impurities, and any abnormal single-impurity area;
the output of step S4 is: the cluster identifier of each impurity peak in the new chromatogram, the relative retention time of each impurity peak, the area of each impurity peak, and the number of impurities in the new chromatogram.
2. The qualitative analysis method of chromatographic data according to claim 1, wherein in step S1, the relative retention time of the impurity peak is calculated according to the following formulas (1) - (2):
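[Formula (1), defining the time offset coefficient coef: image FDA0004171110460000011, not reproduced]
Rt = t - t*coef    (2)
where coef is the time offset coefficient of the impurity peak, t is the retention time of the impurity peak, T is the retention time of the main peak, and Rt is the relative retention time of the impurity peak.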
3. The qualitative analysis method of chromatographic data impurities based on a clustering algorithm according to claim 2, wherein the main peak is a peak with a peak area of more than or equal to 95% in a chromatogram; the impurity peak is a peak with a peak area of more than or equal to 10ppt after the main peak is removed from the chromatogram.
4. The qualitative analysis method of chromatographic data impurities based on a clustering algorithm according to claim 1, wherein in step S4, when the relative retention time of the peak to be judged is shortest in euclidean distance from the retention time of a center of a certain corresponding cluster in the impurity model, and the relative retention time of the peak to be judged falls within the impurity boundary of the certain corresponding cluster, if the area of the peak to be judged is greater than 120% of the average area of the impurity peaks in the certain corresponding cluster, the single impurity area of the peak to be judged is considered to be too large for alarming.
5. The qualitative analysis method of chromatographic data impurities based on a clustering algorithm according to claim 1, wherein in step S4, when the relative retention time of the peak to be judged has the shortest Euclidean distance to the retention time of the center of a given cluster in the impurity model but lies outside the impurity boundary of that cluster, the peak to be judged is regarded as a new impurity and an alarm is raised.
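The judging logic of step S4 in claims 4 and 5 can be sketched roughly as below. The `Cluster` record, the representation of the impurity boundary as a closed interval `[lower, upper]`, and the status labels are assumptions made for illustration; the claims only require a nearest-center lookup, a boundary check, and a comparison against 120% of the cluster's average area.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    cluster_id: int
    center: float     # relative retention time of the cluster center
    lower: float      # lower edge of the impurity boundary (assumed interval form)
    upper: float      # upper edge of the impurity boundary (assumed interval form)
    mean_area: float  # average area of the impurity peaks in the cluster


def judge_peak(rt, area, clusters, area_ratio=1.2):
    """Return (nearest_cluster_id, status) for one peak of a new chromatogram."""
    # In one dimension the Euclidean distance is the absolute difference.
    nearest = min(clusters, key=lambda c: abs(rt - c.center))
    if not (nearest.lower <= rt <= nearest.upper):
        return nearest.cluster_id, "new_impurity"    # claim 5: outside the boundary
    if area > area_ratio * nearest.mean_area:
        return nearest.cluster_id, "area_too_large"  # claim 4: more than 120% of the mean
    return nearest.cluster_id, "normal"


# Illustrative model with two clusters; values are not from the patent.
model = [Cluster(0, 2.0, 1.8, 2.2, 100.0), Cluster(1, 5.0, 4.7, 5.3, 50.0)]
print(judge_peak(2.1, 90.0, model))  # (0, 'normal')
print(judge_peak(5.1, 70.0, model))  # (1, 'area_too_large')
print(judge_peak(3.0, 10.0, model))  # (0, 'new_impurity')
```

For a new impurity the sketch still reports the nearest cluster's identifier, which only records which cluster the peak was compared against.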
6. Application of the qualitative analysis method of chromatographic data impurities based on a clustering algorithm according to any one of claims 1-5, wherein the method is applied to the analysis of impurity peaks in chromatograms during chemical production or research and development, and is used to confirm whether the number of impurities has increased or the area of a single impurity has increased.
7. The application of the qualitative analysis method of chromatographic data impurities based on a clustering algorithm according to claim 6, wherein the method is applied to quality control of production raw materials, production processes, and finished products.
CN202310378197.1A 2023-04-11 2023-04-11 Chromatographic data impurity qualitative analysis method based on clustering algorithm and application Active CN116359420B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310378197.1A CN116359420B (en) 2023-04-11 2023-04-11 Chromatographic data impurity qualitative analysis method based on clustering algorithm and application

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310378197.1A CN116359420B (en) 2023-04-11 2023-04-11 Chromatographic data impurity qualitative analysis method based on clustering algorithm and application

Publications (2)

Publication Number Publication Date
CN116359420A (en) 2023-06-30
CN116359420B (en) 2023-08-18

Family

ID=86908442

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310378197.1A Active CN116359420B (en) 2023-04-11 2023-04-11 Chromatographic data impurity qualitative analysis method based on clustering algorithm and application

Country Status (1)

Country Link
CN (1) CN116359420B (en)

Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1154268A2 (en) * 2000-05-09 2001-11-14 Air Products And Chemicals, Inc. Method for operating an ion mobility spectrometer used to detect trace atmospheric impurities in gases
WO2007135408A1 (en) * 2006-05-18 2007-11-29 Pliva Hrvatska D.O.O. Impurities of donepezil
CN102778517A (en) * 2012-07-16 2012-11-14 无锡市产品质量监督检验中心 Method for detecting cocoa powder adulteration based on lipid clustering analysis
CN103226140A (en) * 2013-03-21 2013-07-31 中国科学院大学 High performance liquid chromatography based illegal cooking oil cluster analysis method
US20160231297A1 (en) * 2013-10-16 2016-08-11 Shimadzu Corporation Chromatogram data processing system
CN108037207A (en) * 2017-12-22 2018-05-15 广东方制药有限公司 A kind of UPLC-MS-MS rapid screenings Artemisia capillaris and the method for artemisia scoparia difference base source otherness
CN109416926A (en) * 2016-04-11 2019-03-01 迪森德克斯公司 MASS SPECTRAL DATA ANALYSIS workflow
CN110161161A (en) * 2019-07-01 2019-08-23 汕头出入境检验检疫局检验检疫技术中心 Redwood identification method based on high-efficiency liquid-phase fingerprint and clustering
JP2021535387A (en) * 2018-08-30 2021-12-16 マイクロマス ユーケー リミテッド Mass correction
CN114236000A (en) * 2021-12-13 2022-03-25 山东省药学科学院 High performance liquid chromatography method for determining defluorinated impurities in ezetimibe
CN114295766A (en) * 2021-12-24 2022-04-08 中国科学院上海有机化学研究所 Metabonomics data processing method and device based on stable isotope labeling
CN114755357A (en) * 2022-04-14 2022-07-15 武汉迈特维尔生物科技有限公司 Automatic integration method, system, equipment and medium for chromatographic mass spectrometry
CN115083549A (en) * 2022-07-18 2022-09-20 烟台国工智能科技有限公司 Product raw material ratio reverse derivation method based on data mining
CN115601412A (en) * 2022-10-18 2023-01-13 浙江大学(Cn) Method for detecting crystalline polymer powder impurities based on multispectral image processing

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
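易伦朝 et al.: "Chromatographic fingerprinting and quality control of traditional Chinese medicine", 色谱 (Chinese Journal of Chromatography), vol. 26, no. 02, pages 166 - 171 *
朱臻宇 et al.: "Study of a full-ranking template matching algorithm for chromatographic fingerprints of traditional Chinese medicine", 第二军医大学学报 (Academic Journal of Second Military Medical University), vol. 28, no. 02, pages 183 - 187 *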

Also Published As

Publication number Publication date
CN116359420B (en) 2023-08-18

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant