CN108334839B - Chemical information identification method based on deep learning image identification technology - Google Patents
Chemical information identification method based on deep learning image identification technology
- Publication number
- CN108334839B
- Authority
- CN
- China
- Prior art keywords
- atoms
- chemical
- deep learning
- identified
- chemical bond
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/12—Use of codes for handling textual entities
- G06F40/126—Character encoding
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2413—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
- G06F18/24133—Distances to prototypes
- G06F18/24137—Distances to cluster centroïds
- G06F18/2414—Smoothing the distance, e.g. radial basis function networks [RBFN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/103—Formatting, i.e. changing of presentation of documents
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
- G06V10/443—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
- G06V10/449—Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
- G06V10/451—Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters with interaction between the filter responses, e.g. cortical complex cells
- G06V10/454—Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
- G06V10/457—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by analysing connectivity, e.g. edge linking, connected component analysis or slices
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/22—Character recognition characterised by the type of writing
- G06V30/226—Character recognition characterised by the type of writing of cursive writing
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Artificial Intelligence (AREA)
- Multimedia (AREA)
- Evolutionary Computation (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- General Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Data Mining & Analysis (AREA)
- Life Sciences & Earth Sciences (AREA)
- Software Systems (AREA)
- Medical Informatics (AREA)
- Databases & Information Systems (AREA)
- Biodiversity & Conservation Biology (AREA)
- Biomedical Technology (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Biology (AREA)
- Character Discrimination (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention belongs to the technical field of image recognition, and particularly relates to a chemical information recognition method based on deep learning image recognition technology. The method comprises the following steps: (1) recognizing nodes in the input image using a node target recognizer; (2) recognizing the text content of each node identified in step (1) using a handwritten font target recognizer, thereby determining the specific atom corresponding to the node; (3) combining the recognized atoms pairwise and recognizing the chemical bond between each pair of atoms using a chemical bond target recognizer; (4) looking up the attributes of the recognized atoms in a database, calculating the related attributes of the structural formula and outputting them; or storing the recognized atoms and the chemical bonds between them as a file in the custom king format and outputting the file; or drawing the recognized atoms and chemical bonds in a new picture and outputting the picture. The invention addresses the problem of recognizing chemical structural formulas or reaction formulas in hand-drawn sketches and pictures, and can be widely applied in the daily work of chemists.
Description
Technical Field
The invention belongs to the technical field of image recognition, and particularly relates to a chemical information recognition method based on a deep learning image recognition technology.
Background
At present, deep learning is widely applied to image recognition; its main application scenarios are face recognition, license plate recognition, common object recognition and plant recognition. However, deep learning image recognition technology has not yet been applied to recognizing images of chemical structural formulas or reaction formulas.
Disclosure of Invention
In order to solve the problem of recognizing chemical structural formulas or reaction formulas in hand-drawn sketches and pictures, the invention aims to let a user photograph a chemical structural formula or reaction formula drawn with a tool or on paper and upload the picture, so as to obtain the composition of the corresponding structural formula or reaction formula and the related attributes of the structural formula.
In order to achieve the above object, the present invention provides a chemical information recognition method based on a deep learning image recognition technology, which includes the following steps:
(1) recognizing nodes in an input image using a node target recognizer based on deep learning image recognition technology;
(2) recognizing the text content of the nodes identified in step (1) using a handwritten font target recognizer based on deep learning image recognition technology, thereby determining the specific atom corresponding to each node;
(3) combining the recognized atoms pairwise, and recognizing the chemical bond between each pair of atoms using a chemical bond target recognizer based on deep learning image recognition technology;
(4) looking up the attributes of the recognized atoms in a database, the attributes including relative atomic mass, isotope masses and abundances, common valences and the like, then calculating the related attributes of the structural formula and outputting them;
or storing the recognized atoms and the chemical bonds between them as a file in the custom king format and outputting the file;
or drawing the recognized atoms and the chemical bonds between them in a new picture and outputting the picture.
In addition, the method also comprises the following steps:
(5) performing arrow recognition on the input image by using an arrow target recognizer based on a deep learning image recognition technology;
then storing the identified arrows and the atoms identified in the steps (2) and (3) and chemical bonds among the atoms into a file with a custom king format, and outputting the file;
or, drawing the identified arrow and the atom and the chemical bond between the atoms identified in the steps (2) and (3) in a new picture and outputting the new picture.
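Steps (1) to (5) above can be pictured as a simple pipeline. The following is only a minimal sketch under stated assumptions: the four recognizers are passed in as plain callables (`detect_nodes`, `read_label`, `find_bonds`, `detect_arrows` are hypothetical names, not part of the patent), and the trained fast-rcnn and LeNet models themselves are not shown.

```python
from dataclasses import dataclass, field


@dataclass
class Recognition:
    """Container for everything recognized in one image (illustrative only)."""
    atoms: list = field(default_factory=list)   # (element, x, y) tuples
    bonds: list = field(default_factory=list)   # (atom_i, atom_j, bond_type) tuples
    arrows: list = field(default_factory=list)  # (x1, y1, x2, y2) tuples


def recognize(image, detect_nodes, read_label, find_bonds, detect_arrows=None):
    """Chain the recognizers in the order described in steps (1)-(5).

    All four callables are hypothetical stand-ins for the trained models.
    """
    result = Recognition()
    nodes = detect_nodes(image)                       # step (1): node positions
    result.atoms = [(read_label(image, x, y), x, y)   # step (2): node -> atom
                    for x, y in nodes]
    result.bonds = find_bonds(image, result.atoms)    # step (3): pairwise bonds
    if detect_arrows is not None:                     # step (5): optional arrows
        result.arrows = detect_arrows(image)
    return result
```

Passing the recognizers as arguments keeps the sketch independent of any particular deep learning framework; the returned `Recognition` can then be fed to attribute calculation, file export, or redrawing.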
The target recognizers based on deep learning image recognition technology in steps (1), (3) and (5) are obtained in advance by offline training with the fast-rcnn algorithm proposed by Ross Girshick's team, and are used to recognize arrows, atoms, the spatial coordinates of the atoms, and chemical bonds in the image.
The handwritten font target recognizer based on deep learning image recognition technology in step (2) is obtained in advance by offline training with the LeNet model in Caffe, and is used to recognize text content in the image.
Preferably, the offline training of the target recognizers uses an image set.
The image set used to train the target recognizers includes: (a) pictures of handwritten characters; (b) nodes connected by multiple chemical bonds of multiple types; (c) common chemical bonds such as single, double and triple bonds; (d) pictures of arrows commonly used in chemistry.
More preferably, image set (a) is used to train a handwritten font recognizer with the LeNet model, for determining whether a node is an element of the periodic table, plain text, or a "carbon" element that is not displayed.
Image set (b) is used to train a node target recognizer with the fast-rcnn algorithm, for determining all nodes in the image and their spatial coordinates.
Image set (c) is used to train a chemical bond target recognizer with the fast-rcnn algorithm, for determining whether a chemical bond exists between atoms and, if so, its type.
Image set (d) is used to train an arrow target recognizer with the fast-rcnn algorithm, for determining whether an arrow is present in the input image and, if so, its spatial coordinates.
Step (3) comprises the following steps:
step (31): combining all recognized atoms pairwise, using the chemical bond target recognizer to determine whether each pair is connected by a chemical bond and, if so, recognizing the type of the chemical bond;
step (32): adding an association between the two atoms according to whether a chemical bond was recognized and its type, the association type being the type of the recognized chemical bond.
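Steps (31) and (32) amount to enumerating every unordered pair of atoms and recording an association for each pair in which a bond is found. A minimal sketch follows; `classify_pair` is a hypothetical stand-in for the trained chemical bond recognizer and is assumed to return a bond type string, or `None` when the pair is not bonded.

```python
from itertools import combinations


def find_bonds(atoms, classify_pair):
    """Build a symmetric association map {atom_index: {neighbor_index: bond_type}}.

    atoms:          list of atom records (only its length is used here)
    classify_pair:  hypothetical callable (i, j) -> bond type or None
    """
    associations = {i: {} for i in range(len(atoms))}
    # step (31): every unordered pair is examined exactly once
    for i, j in combinations(range(len(atoms)), 2):
        bond_type = classify_pair(i, j)
        if bond_type is None:
            continue  # no chemical bond between this pair
        # step (32): the association type is the recognized bond type
        associations[i][j] = bond_type
        associations[j][i] = bond_type
    return associations
```

For n recognized atoms this makes n(n-1)/2 recognizer calls, so the cost grows quadratically with the size of the structure.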
Calculating the related attributes of the structural formula in step (4) comprises:
step (41): according to the atoms and the chemical bonds between them, automatically supplementing hydrogen atoms so that each atom has a stable structure with 8 electrons in its outermost shell, counting the types and numbers of atoms, and generating the molecular formula of the chemical structural formula;
step (42): converting the structural formula, according to the atoms and the chemical bonds between them, into a SMILES string in the public SMILES format;
step (43): looking up the English name corresponding to the chemical structural formula in the database using the SMILES string;
step (44): calculating the exact molecular mass and the relative molecular mass of the molecular formula, and the mass-to-charge ratios with their corresponding abundances.
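Step (41) can be illustrated with a short sketch. It assumes a small table of common valences (`COMMON_VALENCE`, an illustrative subset rather than the patent's database), fills each atom's unused valence with hydrogen, and prints the formula with carbon first, hydrogen second and the remaining elements alphabetically, which is one common convention and not something the source prescribes.

```python
from collections import Counter

# Assumed common valences for a few elements (illustrative subset only).
COMMON_VALENCE = {"H": 1, "C": 4, "N": 3, "O": 2, "S": 2,
                  "F": 1, "Cl": 1, "Br": 1, "I": 1}
BOND_ORDER = {"single": 1, "double": 2, "triple": 3}


def implicit_hydrogens(atoms, bonds):
    """Return the number of hydrogens to add to each atom.

    atoms: list of element symbols; bonds: list of (i, j, bond_type) tuples.
    """
    used = [0] * len(atoms)
    for i, j, bond_type in bonds:
        order = BOND_ORDER[bond_type]
        used[i] += order
        used[j] += order
    return [max(COMMON_VALENCE[el] - u, 0) for el, u in zip(atoms, used)]


def molecular_formula(atoms, bonds):
    """Count atoms (including supplemented hydrogens) and format the formula."""
    counts = Counter(atoms)
    counts["H"] += sum(implicit_hydrogens(atoms, bonds))
    if "C" in counts:
        order = ["C", "H"] + sorted(e for e in counts if e not in ("C", "H"))
    else:
        order = sorted(counts)
    return "".join(f"{e}{counts[e] if counts[e] > 1 else ''}"
                   for e in order if counts[e] > 0)


# Ethanol drawn as a C-C-O skeleton without explicit hydrogens.
print(molecular_formula(["C", "C", "O"], [(0, 1, "single"), (1, 2, "single")]))
```

Charged atoms, radicals and elements with several common valences would need extra handling that this sketch leaves out.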
In step (44), the exact molecular mass is obtained by adding, for every atom in the molecular formula, the atomic mass of its most abundant isotope; the relative molecular mass is obtained by adding the relative atomic masses of all atoms in the molecular formula; and the abundances corresponding to the mass-to-charge ratios are obtained from the coefficients of the expansion of $(a + b)^n$, where a and b represent isotopes of the same atom and n represents the number of that atom in the molecule.
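The two mass calculations described for step (44) can be sketched as follows. The `ISOTOPES` table is a tiny illustrative stand-in for the database: the chlorine values are the ones quoted in this document, the hydrogen values are standard reference figures, and computing the relative atomic mass as an abundance-weighted mean is an assumption of this sketch (the method itself looks the value up).

```python
# element -> list of (isotope mass, natural abundance); illustrative subset only
ISOTOPES = {
    "H": [(1.00782503, 0.999885), (2.01410178, 0.000115)],
    "Cl": [(34.96885, 0.7578), (36.9659, 0.2422)],
}


def exact_mass(formula_counts):
    """Sum the mass of the most abundant isotope of every atom."""
    return sum(
        n * max(ISOTOPES[el], key=lambda iso: iso[1])[0]
        for el, n in formula_counts.items()
    )


def relative_mass(formula_counts):
    """Sum relative atomic masses, here taken as abundance-weighted means."""
    return sum(
        n * sum(mass * abundance for mass, abundance in ISOTOPES[el])
        for el, n in formula_counts.items()
    )


cl2 = {"Cl": 2}
print(round(exact_mass(cl2), 4))     # mass of the all-35Cl molecule
print(round(relative_mass(cl2), 4))  # average mass over natural isotopes
```

The exact mass corresponds to the lightest peak of the Cl2 pattern, while the relative mass is the average that appears in ordinary stoichiometric calculations.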
The custom king format in step (4) is a text file encoded in UTF-8; each online structural formula editor can automatically parse the file content, which can then be edited again in the editor.
Compared with the prior art, the invention has the following beneficial effects: by recognizing nodes, atoms, chemical bonds and arrows, the invention processes hand-drawn chemical structural formulas or reaction formulas, as well as those in pictures, into a chemical structural formula or reaction formula that a computer can recognize, and obtains the related attributes of the structural formula through calculation. It can therefore be widely applied in the daily work of chemists, for example in structural formula editors and Word documents, saving drawing time and improving working efficiency.
Drawings
FIG. 1 is a schematic flow diagram of the process of the present invention;
FIG. 2 is a diagram illustrating a custom king format according to the present invention.
Detailed Description
The invention is further explained below with reference to specific embodiments and the accompanying drawings.
Example 1
As shown in fig. 1, a chemical information recognition method based on a deep learning image recognition technology includes the following steps:
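step 1, recognizing arrows in the input image using an arrow target recognizer based on deep learning image recognition technology;
step 2, recognizing nodes in the input image using a node target recognizer based on deep learning image recognition technology;
step 3, recognizing the text content of the nodes identified in step 2 using a handwritten font recognizer based on deep learning image recognition technology, thereby determining the specific atom corresponding to each node;
step 4, combining the recognized atoms pairwise, and recognizing the chemical bond between each pair of atoms using a chemical bond target recognizer based on deep learning image recognition technology;
step 5, looking up the attributes of the recognized atoms in a database, calculating the related attributes of the structural formula and outputting them;
or, step 6, storing the recognized arrows, atoms and chemical bonds between the atoms as a file in the custom king format and outputting the file;
or, step 7, drawing the recognized arrows, atoms and chemical bonds between the atoms in a new picture and outputting the new picture.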
The target recognizers based on deep learning image recognition technology in steps 1, 2 and 4 are obtained in advance by offline training with the fast-rcnn algorithm proposed by Ross Girshick's team, and are used to recognize atoms and their spatial coordinates in the image and to recognize chemical bonds in the image. The handwritten font recognizer based on deep learning image recognition technology in step 3 is obtained in advance by offline training with the LeNet model in Caffe, and is used to recognize text content in the image. The offline training of the target recognizers uses an image set, which comprises:
(a) pictures of handwritten characters: used to train a handwritten font recognizer with the LeNet model in Caffe, for determining whether a node is an element of the periodic table, plain text, or a "carbon" element that is not displayed;
(b) nodes connected by multiple chemical bonds of multiple types: used to train a node target recognizer with the fast-rcnn algorithm proposed by Ross Girshick's team, for determining all nodes in the image and their spatial coordinates;
(c) common chemical bonds such as single, double and triple bonds: used to train a chemical bond target recognizer with the fast-rcnn algorithm proposed by Ross Girshick's team, for determining whether a chemical bond exists between atoms and, if so, its type;
(d) pictures of arrows commonly used in chemistry: used to train an arrow target recognizer with the fast-rcnn algorithm proposed by Ross Girshick's team, for determining whether an arrow is present in the input image and, if so, its spatial coordinates.
Step 4 specifically comprises the following steps:
step 41, combining all recognized atoms pairwise, using the chemical bond target recognizer to determine whether each pair is connected by a chemical bond and, if so, recognizing the type of the chemical bond;
step 42, adding an association between the two atoms according to whether a chemical bond was recognized and its type, the association type being the type of the recognized chemical bond.
Calculating the related attributes of the structural formula in step 5 comprises:
step 51, according to the atoms and the chemical bonds between them, automatically supplementing hydrogen atoms so that each atom has a stable structure with 8 electrons in its outermost shell, and counting the types and numbers of atoms to generate the molecular formula of the chemical structural formula;
step 52, converting the structural formula, according to the atoms and the chemical bonds between them, into a SMILES string in the public SMILES format;
step 53, looking up the English name corresponding to the chemical structural formula in the database using the SMILES string;
step 54, calculating the exact molecular mass and the relative molecular mass of the molecular formula, and the mass-to-charge ratios with their corresponding abundances: the exact molecular mass is obtained by adding, for every atom in the molecular formula, the atomic mass of its most abundant isotope; the relative molecular mass is obtained by adding the relative atomic masses of all atoms in the molecular formula; the abundances corresponding to the mass-to-charge ratios are obtained from the coefficients of the expansion of $(a + b)^n$, where a and b represent isotopes of the same atom and n represents the number of that atom in the molecule. For example, the element chlorine (Cl) has the isotopes $^{35}\mathrm{Cl}$ (mass 34.96885) and $^{37}\mathrm{Cl}$ (mass 36.9659), with abundances of 75.78% and 24.22% respectively. For the molecular formula $\mathrm{Cl}_2$, the mass-to-charge ratios and corresponding abundances are calculated from $({}^{35}\mathrm{Cl} + {}^{37}\mathrm{Cl})^2$, whose expansion is $({}^{35}\mathrm{Cl})^2 + 2\,{}^{35}\mathrm{Cl}\,{}^{37}\mathrm{Cl} + ({}^{37}\mathrm{Cl})^2$. There are therefore three mass-to-charge ratios m/z: ${}^{35}\mathrm{Cl} + {}^{35}\mathrm{Cl} = 34.96885 + 34.96885 = 69.9377$; ${}^{35}\mathrm{Cl} + {}^{37}\mathrm{Cl} = 34.96885 + 36.9659 = 71.93475$; ${}^{37}\mathrm{Cl} + {}^{37}\mathrm{Cl} = 36.9659 + 36.9659 = 73.9318$. The corresponding abundances are: ${}^{35}\mathrm{Cl} \times {}^{35}\mathrm{Cl} = 75.78\% \times 75.78\% = 0.57426084$; $2 \times {}^{35}\mathrm{Cl} \times {}^{37}\mathrm{Cl} = 2 \times 75.78\% \times 24.22\% = 0.36707832$; ${}^{37}\mathrm{Cl} \times {}^{37}\mathrm{Cl} = 24.22\% \times 24.22\% = 0.05866084$. The corresponding abundances after normalization to the largest value are shown in Table 1:
TABLE 1 Normalized abundances for the molecular formula $\mathrm{Cl}_2$
m/z | Normalized abundance |
---|---|
69.9377 | 100% |
71.93475 | 63.9% |
73.9318 | 10.2% |
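The Cl2 example and Table 1 can be reproduced with a few lines of code. This is a minimal sketch for a single element with exactly two isotopes, assuming a charge of 1 so that m/z equals the summed isotope mass; `isotope_pattern` is an illustrative name, and the isotope data are the values quoted above.

```python
from math import comb


def isotope_pattern(mass_a, abund_a, mass_b, abund_b, n):
    """Expand (a + b)^n for n atoms of one element with isotopes a and b.

    Returns a list of (m/z, abundance, abundance relative to the largest peak in %).
    """
    peaks = []
    for k in range(n + 1):  # k = number of atoms that are isotope b
        mz = (n - k) * mass_a + k * mass_b
        # binomial coefficient times the isotope abundances
        abundance = comb(n, k) * abund_a ** (n - k) * abund_b ** k
        peaks.append((mz, abundance))
    top = max(abundance for _, abundance in peaks)
    return [(mz, abundance, 100 * abundance / top) for mz, abundance in peaks]


for mz, abundance, relative in isotope_pattern(34.96885, 0.7578, 36.9659, 0.2422, 2):
    print(f"{mz:.5f}  {abundance:.8f}  {relative:.1f}%")
```

Molecules containing several multi-isotope elements would need the per-element patterns combined, which this sketch does not attempt.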
The Atom Block in FIG. 2 stores the atoms and has the following format:
Begin Atom
Index Type x y HCount
End Atom
where, when multiple atoms are recognized, multiple groups of text in the same format are added between Begin Atom and End Atom. Index is the ordinal number, increasing from 1; Type is the element symbol, for example "C"; x is the x coordinate of the atom in the plane; y is the y coordinate of the atom in the plane; HCount is the number of hydrogen atoms attached to the atom.
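Reading the Atom Block back is straightforward. The sketch below assumes one atom per line with whitespace-separated fields in the order Index, Type, x, y, HCount; FIG. 2 is not reproduced here, so the exact separators are an assumption, and `parse_atom_block` is an illustrative name.

```python
def parse_atom_block(text):
    """Extract atom records from the text between 'Begin Atom' and 'End Atom'."""
    atoms = []
    inside = False
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if line == "Begin Atom":
            inside = True
        elif line == "End Atom":
            inside = False
        elif inside and line:
            # assumed field order: Index Type x y HCount
            index, element, x, y, hcount = line.split()
            atoms.append({
                "index": int(index),
                "type": element,
                "x": float(x),
                "y": float(y),
                "hcount": int(hcount),
            })
    return atoms


sample = "Begin Atom\n1 C 10.0 20.0 3\n2 O 30.0 20.0 1\nEnd Atom\n"
print(parse_atom_block(sample))
```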
The Bond Block in FIG. 2 stores the chemical bonds between atoms and has the following format:
Begin Bond
Index Type Atom1index Atom2index
End Bond
where, when multiple chemical bonds are recognized, multiple groups of text in the same format are added between Begin Bond and End Bond. Index is the ordinal number, increasing from 1; Type is the chemical bond type; Atom1index is the ordinal number, in the Atom Block, of one of the connected atoms; Atom2index is the ordinal number, in the Atom Block, of the other connected atom.
The Text Block in FIG. 2 stores plain text information and has the following format:
Begin Text
Index x y Text
End Text
where, when multiple pieces of plain text are recognized, multiple groups of text in the same format are added between Begin Text and End Text. Index is the ordinal number, increasing from 1; x is the x coordinate of the plain text in the plane; y is the y coordinate of the plain text in the plane; Text is the content of the plain text.
The Shape Block in FIG. 2 stores arrow information and has the following format:
Begin Shape
Index x1,y1;x2,y2
End Shape
where, when multiple arrows are recognized, multiple groups of text in the same format are added between Begin Shape and End Shape. Index is the ordinal number, increasing from 1; x1 is the x coordinate of the starting point of the arrow in the plane; y1 is the y coordinate of the starting point of the arrow in the plane; x2 is the x coordinate of the end point of the arrow in the plane; y2 is the y coordinate of the end point of the arrow in the plane.
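The four blocks described above can be written out with a small serializer. This is a sketch under assumptions: one record per line, single spaces between fields, the block order Atom, Bond, Text, Shape, and bond types written as plain words are all guesses made for illustration, since FIG. 2 is not reproduced here; `write_king` is a hypothetical name.

```python
def write_king(atoms, bonds, texts, arrows):
    """Serialize recognized content into the four Begin/End blocks.

    atoms:  list of (element, x, y, hcount)
    bonds:  list of (bond_type, atom1_index, atom2_index), indices starting at 1
    texts:  list of (x, y, text)
    arrows: list of (x1, y1, x2, y2)
    """
    lines = ["Begin Atom"]
    for i, (element, x, y, hcount) in enumerate(atoms, start=1):
        lines.append(f"{i} {element} {x} {y} {hcount}")
    lines += ["End Atom", "Begin Bond"]
    for i, (bond_type, a1, a2) in enumerate(bonds, start=1):
        lines.append(f"{i} {bond_type} {a1} {a2}")
    lines += ["End Bond", "Begin Text"]
    for i, (x, y, text) in enumerate(texts, start=1):
        lines.append(f"{i} {x} {y} {text}")
    lines += ["End Text", "Begin Shape"]
    for i, (x1, y1, x2, y2) in enumerate(arrows, start=1):
        lines.append(f"{i} {x1},{y1};{x2},{y2}")
    lines.append("End Shape")
    return "\n".join(lines) + "\n"


content = write_king(
    atoms=[("C", 0, 0, 3), ("O", 1, 0, 1)],
    bonds=[("single", 1, 2)],
    texts=[],
    arrows=[(0, 0, 5, 0)],
)
print(content)
```

To match the UTF-8 requirement, the returned string can be saved with `open(path, "w", encoding="utf-8")`.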
Of course, the foregoing is only a preferred embodiment of the invention and should not be taken as limiting the scope of the embodiments of the invention. The present invention is not limited to the above examples, and equivalent changes and modifications made by those skilled in the art within the spirit and scope of the present invention should be construed as being included in the scope of the present invention.
Claims (10)
1. A chemical information identification method based on a deep learning image identification technology is characterized by comprising the following steps:
(1) identifying nodes by using a node target identifier based on a deep learning image identification technology on an input image;
(2) identifying text contents of the nodes identified in the step (1) by using a handwritten font target identifier based on a deep learning image identification technology, and further determining specific atoms corresponding to the nodes;
(3) combining the plurality of recognized atoms pairwise, and recognizing the chemical bond between the two atoms by using a chemical bond target recognizer based on a deep learning image recognition technology;
(4) searching the attribute of the identified atom in a database, calculating the related attribute of the structural formula and outputting the attribute;
or storing the identified atoms and chemical bonds among the atoms as a file with a user-defined king format and outputting the file;
or, the identified atoms and chemical bonds between the atoms are drawn in a new picture and output.
2. The chemical information identification method based on the deep learning image identification technology according to claim 1, characterized by further comprising the following steps:
(5) performing arrow recognition on the input image by using an arrow target recognizer based on a deep learning image recognition technology;
then storing the identified arrows and the atoms identified in the steps (2) and (3) and chemical bonds among the atoms into a file with a custom king format, and outputting the file;
or, drawing the identified arrow and the atom and the chemical bond between the atoms identified in the steps (2) and (3) in a new picture and outputting the new picture.
3. The method for identifying chemical information based on deep learning image identification technology as claimed in claim 2, wherein the target recognizer based on deep learning image identification technology in the steps (1) (2) (3) (5) is obtained by performing offline training using deep learning image identification technology in advance.
4. The method of claim 3, wherein the step of training the object recognizer offline comprises training the object recognizer offline by using an image set.
5. The method of claim 4, wherein the image set used to train the target recognizers comprises: (a) pictures of handwritten characters; (b) nodes connected by multiple chemical bonds of multiple types; (c) common chemical bonds; (d) pictures of arrows commonly used in chemistry.
6. The method of claim 5, wherein image set (a) is used to train a handwritten font recognizer with the LeNet model, for determining whether a node is an element of the periodic table, plain text, or a "carbon" element that is not displayed;
image set (b) is used to train a node target recognizer with the fast-rcnn algorithm, for determining all nodes in the image and their spatial coordinates;
image set (c) is used to train a chemical bond target recognizer with the fast-rcnn algorithm, for determining whether a chemical bond exists between atoms and, if so, its type;
image set (d) is used to train an arrow target recognizer with the fast-rcnn algorithm, for determining whether an arrow is present in the input image and, if so, its spatial coordinates.
7. The chemical information identification method based on the deep learning image identification technology according to any one of claims 1 to 6, wherein the step (3) specifically comprises the following steps:
step (31): combining all recognized atoms pairwise, using the chemical bond target recognizer to determine whether each pair is connected by a chemical bond and, if so, recognizing the type of the chemical bond;
step (32): adding an association between the two atoms according to whether a chemical bond was recognized and its type, the association type being the type of the recognized chemical bond.
8. The method for identifying chemical information based on deep learning image identification technology as claimed in any one of claims 1 to 6, wherein the calculating the correlation attribute of the structural formula in step (4) comprises:
step (41): according to the atoms and the chemical bonds between them, automatically supplementing hydrogen atoms so that each atom has a stable structure with 8 electrons in its outermost shell, counting the types and numbers of atoms, and generating the molecular formula of the chemical structural formula;
step (42): converting the structural formula, according to the atoms and the chemical bonds between them, into a SMILES string in the public SMILES format;
step (43): looking up the English name corresponding to the chemical structural formula in the database using the SMILES string;
step (44): calculating the exact molecular mass and the relative molecular mass of the molecular formula, and the mass-to-charge ratios with their corresponding abundances.
9. The method for recognizing chemical information based on deep learning image recognition technology as claimed in claim 8, wherein in step (44) the exact molecular mass is obtained by adding, for every atom in the molecular formula, the atomic mass of its most abundant isotope; the relative molecular mass is obtained by adding the relative atomic masses of all atoms in the molecular formula; and the abundances corresponding to the mass-to-charge ratios are obtained from the coefficients of the expansion of $(a + b)^n$, where a and b represent isotopes of the same atom and n represents the number of that atom in the molecule.
10. The method for identifying chemical information based on deep learning image identification technology as claimed in any one of claims 1 to 6, wherein the custom king format in step (4) is a text file encoded in UTF-8.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810098220.0A CN108334839B (en) | 2018-01-31 | 2018-01-31 | Chemical information identification method based on deep learning image identification technology |
PCT/CN2018/105414 WO2019148852A1 (en) | 2018-01-31 | 2018-09-13 | Chemical information identification method based on deep learning image identification technology |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810098220.0A CN108334839B (en) | 2018-01-31 | 2018-01-31 | Chemical information identification method based on deep learning image identification technology |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108334839A (en) | 2018-07-27 |
CN108334839B (en) | 2021-09-14 |
Family
ID=62927657
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810098220.0A Active CN108334839B (en) | 2018-01-31 | 2018-01-31 | Chemical information identification method based on deep learning image identification technology |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN108334839B (en) |
WO (1) | WO2019148852A1 (en) |
Families Citing this family (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108334839B (en) * | 2018-01-31 | 2021-09-14 | 青岛清原精准农业科技有限公司 | Chemical information identification method based on deep learning image identification technology |
CN110413740B (en) * | 2019-08-06 | 2022-10-14 | 百度在线网络技术(北京)有限公司 | Query method and device of chemical expression, electronic equipment and storage medium |
WO2021125206A1 (en) | 2019-12-16 | 2021-06-24 | 富士フイルム株式会社 | Image analysis device, image analysis method, and program |
EP3937106A1 (en) * | 2020-07-08 | 2022-01-12 | Tata Consultancy Services Limited | System and method of extraction of information and graphical representation for design of formulated products |
CN111897987B (en) * | 2020-07-10 | 2022-05-31 | 山西大学 | Molecular structure diagram retrieval method based on evolution calculation multi-view fusion |
WO2023277725A1 (en) * | 2021-06-28 | 2023-01-05 | Autonomous Non-Profit Organization For Higher Education "Skolkovo Institute Of Science And Technology" | Method and system for recognizing chemical information from document images |
CN115908775A (en) * | 2021-08-16 | 2023-04-04 | 中国科学院上海药物研究所 | Chemical structural formula identification method and device, storage medium and electronic equipment |
CN114464273A (en) * | 2021-12-22 | 2022-05-10 | 天翼云科技有限公司 | Molecular structure database construction method and device, electronic equipment and storage medium |
CN114581924A (en) * | 2022-03-01 | 2022-06-03 | 苏州阿尔脉生物科技有限公司 | Method and device for extracting elements in chemical reaction flow chart |
CN114842486A (en) * | 2022-07-04 | 2022-08-02 | 南昌大学 | Handwritten chemical structural formula recognition method, system, storage medium and equipment |
CN114898391A (en) * | 2022-07-12 | 2022-08-12 | 苏州阿尔脉生物科技有限公司 | Method and device for determining chemical reaction route and electronic equipment |
Family Cites Families (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5157736A (en) * | 1991-04-19 | 1992-10-20 | International Business Machines Corporation | Apparatus and method for optical recognition of chemical graphics |
JP3545075B2 (en) * | 1994-12-28 | 2004-07-21 | 富士通株式会社 | Compound analyzer |
CN101261554A (en) * | 2008-04-21 | 2008-09-10 | 东莞市步步高教育电子产品有限公司 | Formula, expression hand-written inputting and computing system and method |
CN101329731A (en) * | 2008-06-06 | 2008-12-24 | 南开大学 | Automatic recognition method of mathematical formula in image |
US20100163316A1 (en) * | 2008-12-30 | 2010-07-01 | Microsoft Corporation | Handwriting Recognition System Using Multiple Path Recognition Framework |
CN102033866A (en) * | 2009-09-29 | 2011-04-27 | 国际商业机器公司 | Method and system for checking chemical name |
US8718375B2 (en) * | 2010-12-03 | 2014-05-06 | Massachusetts Institute Of Technology | Sketch recognition system |
US9558403B2 (en) * | 2011-08-26 | 2017-01-31 | Council Of Scientific And Industrial Research | Chemical structure recognition tool |
CN102693303B (en) * | 2012-05-18 | 2017-06-06 | 上海极值信息技术有限公司 | The searching method and device of a kind of formulation data |
CN103700084A (en) * | 2012-09-28 | 2014-04-02 | 淮海工学院 | Chemical molecular structure chart partition method based on area size and curvature |
US10346681B2 (en) * | 2015-09-26 | 2019-07-09 | Wolfram Research, Inc. | Method and computing device for optically recognizing mathematical expressions |
CN106980856B (en) * | 2016-01-15 | 2020-11-27 | 北京字节跳动网络技术有限公司 | Formula identification method and system and symbolic reasoning calculation method and system |
CN105894931A (en) * | 2016-06-06 | 2016-08-24 | 宁波市铭时三维科技发展有限公司 | Two-dimensional code containing three-dimensional printing method for using molecular structure model as chemical training aid |
CN106372456B (en) * | 2016-08-26 | 2019-01-22 | 浙江工业大学 | Protein structure prediction method based on deep learning |
CN106650686A (en) * | 2016-12-30 | 2017-05-10 | 南开大学 | Online hand-written chemical symbol identification method based on Hidden Markov model |
CN106874688B (en) * | 2017-03-01 | 2019-03-12 | 中国药科大学 | Intelligent lead compound based on convolutional neural networks finds method |
CN107169485B (en) * | 2017-03-28 | 2020-10-09 | 北京捷通华声科技股份有限公司 | Mathematical formula identification method and device |
CN108334839B (en) * | 2018-01-31 | 2021-09-14 | 青岛清原精准农业科技有限公司 | Chemical information identification method based on deep learning image identification technology |
2018
- 2018-01-31: CN application CN201810098220.0A (patent CN108334839B), status: Active
- 2018-09-13: WO application PCT/CN2018/105414 (publication WO2019148852A1), status: Application Filing
Also Published As
Publication number | Publication date |
---|---|
CN108334839A (en) | 2018-07-27 |
WO2019148852A1 (en) | 2019-08-08 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN108334839B (en) | Chemical information identification method based on deep learning image identification technology | |
WO2023138023A1 (en) | Multimodal document information extraction method based on graph neural network, device and medium | |
WO2022142011A1 (en) | Method and device for address recognition, computer device, and storage medium | |
CN109472234B (en) | Intelligent recognition method for handwriting input | |
CN105574133A (en) | Multi-mode intelligent question answering system and method | |
CN113052023A (en) | CAD drawing analysis method, device, equipment and storage medium | |
CN105335348A (en) | Object statement based dependency syntax analysis method and apparatus and server | |
CN109918351B (en) | Method and system for converting Beamer presentation into PowerPoint presentation | |
CN110083580B (en) | Method and system for converting Word document into PowerPoint document | |
CN113010711B (en) | Method and system for automatically generating movie poster based on deep learning | |
CN115917613A (en) | Semantic representation of text in a document | |
CN112650858A (en) | Method and device for acquiring emergency assistance information, computer equipment and medium | |
CN108537109B (en) | OpenPose-based monocular camera sign language identification method | |
CN103678593A (en) | Interactive space scene retrieval method based on space scene draft description | |
CN109784236B (en) | Method for identifying table contents in railway drawing | |
CN114821255A (en) | Method, apparatus, device, medium and product for fusion of multimodal features | |
CN105912723A (en) | Storage method of custom field | |
CN115359492A (en) | Text image matching model training method, picture labeling method, device and equipment | |
CN113536798A (en) | Multi-instance document key information extraction method and system | |
CN112231473A (en) | Commodity classification method based on multi-mode deep neural network model | |
CN113065475A (en) | Rapid and accurate CAD (computer aided design) legend identification method | |
CN111144256A (en) | Spreadsheet formula synthesis and error detection method based on video dynamic analysis | |
CN113393179B (en) | Data integration system based on time sequence difference | |
CN111062419B (en) | Compression and recovery method for deep learning data set | |
Qiu et al. | A font style learning and transferring method based on strokes and structure of Chinese characters |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||