CN110991149A - Multi-modal entity linking method and entity linking system - Google Patents

Info

Publication number
CN110991149A
CN110991149A
Authority
CN
China
Prior art keywords
entity
object recognition
picture
model
recognition model
Prior art date
Legal status
Pending
Application number
CN201911101194.3A
Other languages
Chinese (zh)
Inventor
徐叶强
王峰
窦任荣
吴云标
谢海博
Current Assignee
Guangzhou Aixue Information Technology Co Ltd
Original Assignee
Guangzhou Aixue Information Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Guangzhou Aixue Information Technology Co Ltd filed Critical Guangzhou Aixue Information Technology Co Ltd
Priority to CN201911101194.3A priority Critical patent/CN110991149A/en
Publication of CN110991149A publication Critical patent/CN110991149A/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a multi-modal entity linking method and system. The linking method comprises the following steps. Generating an object recognition model: collecting and labeling pictures, and preprocessing the collected and labeled pictures; constructing an object recognition model; training the object recognition model. Generating an entity link library: acquiring an entity corpus, and associating entities with picture tags to obtain an entity link library. Entity linking: preprocessing a picture obtained by shooting, inputting it into the object recognition model to obtain an object recognition result, and looking up the recognition result in the entity link library to obtain the entity's text result. The invention achieves entity disambiguation through object recognition on pictures and realizes multi-modal entity linking from pictures to text. Specifically, common everyday objects are photographed with a camera, the objects in the pictures are recognized, and the recognition results are linked to the corresponding entities, thereby realizing multi-modal picture-to-text entity linking.

Description

Multi-modal entity linking method and entity linking system
Technical Field
The invention relates to the fields of deep learning, digital image processing, and knowledge graphs, and in particular to the application of image recognition and entity linking technology.
Background
Entity linking refers to extracting entity mentions from a piece of text and, after disambiguation, mapping each mention to a unique entity in a specified knowledge base. Entity linking helps computers find important semantic information in sentences and distinguish the different meanings a word takes in different contexts, and is indispensable for helping computers understand natural language.
At present, entity linking technology is widely applied in information extraction, information retrieval, content analysis, automatic question answering, knowledge base expansion, and other fields. Its limitation, however, is that it applies only to text.
In real life, information is carried not only by text but also by other modalities such as voice, video, and pictures. To date, no entity linking technique spanning the two different modalities of picture and text has appeared.
In view of the above, it is desirable to provide a multi-modal entity linking technique based on picture object recognition.
Disclosure of Invention
The invention first provides a multi-modal entity linking method that performs object recognition on pictures of common everyday objects and links the recognition results to the corresponding entities, thereby realizing multi-modal entity linking from pictures to text.
The invention further provides a multi-modal entity linking system.
To achieve this purpose, the technical scheme of the invention is as follows:
A multi-modal entity linking method, comprising the following steps:
(I) generating an object recognition model: collecting and labeling pictures, and preprocessing the collected and labeled pictures; constructing an object recognition model; training the object recognition model;
(II) generating an entity link library: acquiring an entity corpus, and associating entities with picture tags to obtain an entity link library;
(III) entity linking: preprocessing a picture obtained by shooting, inputting it into the object recognition model to obtain an object recognition result, and looking up the recognition result in the entity link library to obtain the entity's text result.
Preferably, the picture preprocessing methods include encoding, thresholding or filtering operations, and normalization.
Preferably, the process of constructing the object recognition model is as follows: an Inception V3 deep neural network model is adopted to construct the object recognition model; the model's input is a picture of the object to be recognized, and its output is the object's name and corresponding probability. The Inception structure uses 1×1 convolution kernels to reduce dimensionality, and the fully connected layer is replaced by simple global average pooling.
Preferably, the specific process of training the object recognition model is as follows: the model is trained with the deep learning software library TensorFlow; the preprocessed pictures are input as training samples; parameters such as the learning rate and number of iterations are set; and model training is performed to finally obtain the Inception V3 model with the best training effect.
Preferably, entities and picture tags are associated in the entity link library as follows: after the entity library is obtained, its entities are associated with the picture entities through manual labeling; after manual labeling, an entity-picture label library is obtained.
Preferably, the entity linking step further includes displaying the entity linking result after the entity's text result is obtained; the retrieval result is presented through a visual display or voice broadcast.
The invention also provides a multi-modal entity linking system comprising the following modules:
an object recognition model generation module, which collects and labels pictures, preprocesses the collected and labeled pictures, constructs an object recognition model, and trains the object recognition model;
an entity link library generation module, which acquires an entity corpus and associates entities with picture tags to obtain an entity link library;
an entity linking module, which preprocesses a picture obtained by shooting, inputs it into the object recognition model to obtain an object recognition result, and looks up the recognition result in the entity link library to obtain the entity's text result.
The invention also proposes a readable storage medium on which a computer program is stored; when executed by a processor, the program carries out the steps of the above method.
The invention also provides a multi-modal entity link generation device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor; the processor executes the program to realize the steps of the above method.
Preferably, the device further comprises an intelligent desk lamp; the memory and the processor are embedded in the intelligent desk lamp, which includes a sound pickup device.
The innovation of the invention is that object recognition on pictures is used to achieve entity disambiguation, realizing multi-modal entity linking from pictures to text. In practice, common everyday objects are photographed with camera-equipped hardware, the objects in the pictures are recognized, and the recognition results are finally linked to the corresponding entities, thereby achieving multi-modal picture-to-text entity linking.
Drawings
FIG. 1 is a multi-modal entity linking flow diagram;
FIG. 2 is a diagram of the Inception V3 model network architecture;
FIGS. 3 and 4 are schematic diagrams illustrating the relationship between the apple parent class and its child classes.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present invention. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the invention, as detailed in the appended claims.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in this specification and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items.
It is to be understood that although the terms first, second, third, etc. may be used herein to describe various information, such information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information, and similarly, second information may also be referred to as first information, without departing from the scope of the present invention. The word "if" as used herein may be interpreted as "upon", "when", or "in response to determining", depending on the context.
A multi-modal entity linking method, comprising the steps of:
(I) Object recognition model generation
1. Picture collection and labeling
One purpose of the invention is to recognize objects that are common in daily life, so object pictures can be captured and collected with photographic equipment. In addition, the ImageNet project published by Fei-Fei Li's team comprises 20,000 categories and more than 14 million labeled pictures, and can serve as one of the picture training sets. The collection and labeling results for the picture training set are stored in a picture-entity label library.
2. Picture preprocessing
After the training set is collected, to improve object recognition, the apparent characteristics of each picture (such as color distribution, overall brightness, and size) should be made as consistent as possible, so the pictures are preprocessed as needed. Common preprocessing methods include encoding, thresholding or filtering operations, and normalization.
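As a minimal illustration of these preprocessing steps, the following Python sketch decodes a picture, applies a simple filtering operation, and normalizes pixel values; the file path, the median-filter choice, and the 299×299 target size (Inception V3's input size) are assumptions for illustration, not values fixed by the invention.

```python
import numpy as np
from PIL import Image, ImageFilter

def preprocess(path, size=(299, 299)):
    """Decode, filter, resize, and normalize one training picture."""
    img = Image.open(path).convert("RGB")          # decode to a fixed color space
    img = img.filter(ImageFilter.MedianFilter(3))  # simple denoising filter operation
    img = img.resize(size)                         # unify picture size
    arr = np.asarray(img, dtype=np.float32)
    return arr / 127.5 - 1.0                       # scale pixel values to [-1, 1]
```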
3. Object recognition model construction
Traditional object recognition adopts statistics-based methods, but with the development of deep learning in recent years, practice has shown that deep learning methods perform far better than statistical ones. The invention adopts an Inception V3 deep neural network model to build the object recognition model. Compared with other deep neural network models such as AlexNet and VGGNet, the Inception V3 model has fewer network parameters, which speeds up model training and loading. The network also introduces the Inception structure to replace the traditional pattern of a simple convolution followed by an activation function.
FIG. 2 shows the Inception V3 object recognition model network architecture. The model's input is a picture of the object to be recognized, and its output is the object's name and corresponding probability. The Inception structure uses 1×1 convolution kernels to reduce dimensionality, which effectively addresses the otherwise large amount of computation, and replacing the fully connected layer with simple global average pooling reduces the number of parameters.
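As a sketch of this construction step, the stock Keras Inception V3 architecture (which already contains the 1×1 convolutions and global average pooling described above) can be instantiated as follows; training from scratch with 1000 output classes is an assumption that mirrors the embodiment below.

```python
import tensorflow as tf

# Build an Inception V3 recognizer: input is a 299x299 RGB object picture,
# output is a probability for each of the 1000 predefined object classes.
model = tf.keras.applications.InceptionV3(
    weights=None,                 # train from scratch on the collected pictures
    input_shape=(299, 299, 3),
    classes=1000,
)
```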
4. Object recognition model training
After the model is built, it is trained with the open-source deep learning software library TensorFlow. The pictures obtained from preprocessing are input as training samples, parameters such as the learning rate and number of iterations are set, and model training is performed to finally obtain the Inception V3 model with the best training effect.
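Continuing the construction sketch above, a minimal TensorFlow/Keras training loop might look as follows; the learning rate, epoch count, and the random stand-in data are illustrative assumptions only.

```python
import numpy as np

# Stand-in training data; in practice these are the preprocessed, labeled pictures.
train_images = np.random.rand(8, 299, 299, 3).astype("float32")
train_labels = np.random.randint(0, 1000, size=8)

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),  # set the learning rate
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)
model.fit(train_images, train_labels, epochs=5)  # set the iteration count and train
model.save("inception_v3_best.h5")               # keep the best-performing model
```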
(II) Entity link library generation
1. Entity corpus collection
Knowledge bases usable as the entity corpus include Wikipedia, Baidu Baike, Freebase, YAGO, and the like. These knowledge bases contain abundant entities and entity attribute values, and entity data can be collected from them with a web crawler to obtain an entity library.
2. Entity-picture tag association
After the entity library is obtained, its entities need to be associated with the picture entities through manual labeling. For example, the entity "apple" may cover the two entities "fruit apple" and "apple computer", which are labeled to correspond to the fruit apple and the Apple computer in the picture library, respectively.
After manual labeling, an entity-picture label library is obtained.
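A toy sketch of the resulting entity-picture label library follows: each picture-library label is mapped to a knowledge-base entity record. The dictionary structure and the kb_id values are illustrative assumptions, mirroring the "apple" example above.

```python
# Entity-picture label library: picture label -> knowledge-base entity record.
entity_link_library = {
    "fruit apple":    {"entity": "apple (fruit)",    "kb_id": "E_fruit_apple"},
    "apple computer": {"entity": "apple (computer)", "kb_id": "E_apple_computer"},
    "orange":         {"entity": "orange (fruit)",   "kb_id": "E_orange"},
}
```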
(III) Entity linking
1. Picture taking
A target picture is acquired with photographic equipment.
2. Picture preprocessing
The picture must be preprocessed before object recognition, for the same purpose as in object recognition model generation.
3. Object recognition
The preprocessed picture is input into the object recognition model to obtain an object recognition result.
4. Entity linking
The object recognition result is looked up in the entity-picture mapping library to obtain the entity's text result. For example, a photographed "fruit apple" is linked to the "fruit apple" entity, as shown in FIG. 3, and a photographed "Apple computer" is linked to the "apple computer" entity, as shown in FIG. 4. In the text domain, however, both are "apple" entities.
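A sketch of this lookup step, reusing the toy library from section (II); the function name and the fallback behavior for unlabeled results are assumptions.

```python
def link_entity(recognized_label, library):
    """Map a recognized picture label to its knowledge-base entity."""
    record = library.get(recognized_label)
    if record is None:
        return None  # no entity has been associated with this picture label yet
    return record["entity"]

print(link_entity("fruit apple", entity_link_library))  # -> apple (fruit)
```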
5. Entity linking result presentation
The retrieval result is displayed through a visualization tool such as ECharts; it can also be announced by voice through audio equipment such as an intelligent desk lamp.
To address problems of traditional image recognition technology such as the need for manual preprocessing and low accuracy, many deep learning models have gradually been applied in the image recognition field, for example Deep Belief Networks (DBNs), Convolutional Neural Networks (CNNs), and Recurrent Neural Networks (RNNs). These network models can automatically learn and extract image features.
Entity linking refers to finding the correct candidate entity description in a knowledge base and is a key technology for constructing a knowledge graph. It mainly comprises candidate entity generation and candidate entity ranking; the finally linked entity is determined by computing the similarity between the entity mention and the candidate entities. Traditional entity linking generally requires named entity recognition first, i.e., predefined entities are extracted from a given text, and entity linking is performed on the extracted entities. In the embodiment of the invention, an object name, i.e., an entity mention, is obtained by constructing, training, and packaging an object recognition model; the input image-modality information is converted into text-modality information; knowledge base entity linking is performed using the entity mention; and the result is finally displayed visually, realizing the multi-modal entity linking method and entity linking system.
The invention uses the Inception V3 model for object recognition: an object picture is input, the model is called to recognize the object in the picture, and the recognition result is returned. For example, when a picture containing an "orange" is input, the recognition result "orange, 0.9564" is returned on the page. This process converts object picture information in the image modality into object name information in the text modality. The object name is an entity name, and entity linking is performed by querying the corresponding entity in the knowledge base to obtain its description information. Finally, the entity description information is displayed visually with ECharts.
In addition, the invention can be combined with an intelligent desk lamp in the following application scenario: a child places an object (such as an apple) under the intelligent desk lamp; an object picture is obtained through the lamp's camera; the object recognition model is called to return the object name; and the object name is linked to the corresponding entity in the knowledge base to obtain the entity's description information. The desk lamp can read out the object recognition result and the description information. A mini-program bound to the intelligent desk lamp can also synchronize the object recognition and entity linking history. For preschool and lower-grade children, the invention can help them learn about common everyday objects and help parents with the difficulties of early knowledge education.
Embodiment:
A multi-modal entity linking method, comprising the following steps:
step S101, an object recognition model is constructed and trained.
This step builds an inclusion V3 object recognition model based on the TensorFlow framework. Compared with AlexNet and VGGNet models, the Incepton V3 model has fewer network parameters and can accelerate the training and loading speed of the model. Meanwhile, the network model introduces an inclusion structure to replace the traditional operation technology of simple convolution plus an activation function. Fig. 2 is a diagram of an inclusion V3 object identification model network architecture. The input of the model is an object picture to be recognized, and the output is the name of the object and the corresponding probability thereof. The Incepton structure uses a 1 x 1 convolution kernel to reduce the dimension, and can effectively solve the problem of large calculation amount. The number of parameters can be reduced by replacing the fully connected layer with a simple global average pooling. The inclusion V3 model can identify predefined 1000 classes of common objects, each with a separate numbering correspondence. And training an Incep V3 model to obtain a pb model file, loading files of classification names corresponding to the classification character strings and files of classification numbers corresponding to the classification character strings respectively, establishing a mapping relation between the classification numbers and the corresponding classification names, transmitting the classification numbers, and returning object classification names.
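A sketch of the loading and label plumbing this step describes; the file names and the tab-separated format of the two lookup files are assumptions for illustration.

```python
import tensorflow as tf

# Load the frozen .pb graph produced by training (file name assumed).
graph_def = tf.compat.v1.GraphDef()
with tf.io.gfile.GFile("inception_v3.pb", "rb") as f:
    graph_def.ParseFromString(f.read())
graph = tf.Graph()
with graph.as_default():
    tf.compat.v1.import_graph_def(graph_def, name="")

def load_label_map(name_file, number_file):
    """Chain the two lookup files into a {class_number: class_name} mapping."""
    string_to_name = dict(line.rstrip("\n").split("\t")
                          for line in open(name_file, encoding="utf-8"))
    string_to_number = dict(line.rstrip("\n").split("\t")
                            for line in open(number_file, encoding="utf-8"))
    return {int(num): string_to_name[s] for s, num in string_to_number.items()}
```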
Since the Inception V3 object recognition model can only recognize the 1000 predefined classes of common objects, its recognition is poor for object pictures outside those classes. Therefore, pictures of new object classes must be added and the model trained again so that the new classes can be recognized. Specifically, the training pictures of each new class are added to the original training set to form a new training set for training, as sketched below.
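Reusing preprocess() and the training arrays from the sketches above, extending the training set with one new class might look as follows; the picture paths and the new class id are assumptions.

```python
import numpy as np

new_pics = np.stack([preprocess(p)
                     for p in ["new_class/sample_01.jpg", "new_class/sample_02.jpg"]])
new_labels = np.full(len(new_pics), 1000)  # id assigned to the newly added class

train_images = np.concatenate([train_images, new_pics])
train_labels = np.concatenate([train_labels, new_labels])
# The model's output layer must also grow (classes=1001) before retraining on
# this enlarged set, since the original model only covers the 1000 predefined classes.
```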
Step S102: the object recognition model is packaged.
First, the input object picture is initialized and loaded and the Inception V3 model file is loaded; a model object is then constructed, and the picture is computed with TensorFlow. Finally, the probabilities of the 1000 predefined classes for the input object picture are obtained through the Inception V3 model and sorted; the class with the highest probability among the 1000 predefined classes is returned; and the object's name and probability are output as the recognition result.
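A sketch of this packaging step, combining the earlier model and label-map sketches; the function name and the top-k interface are assumptions.

```python
import numpy as np

def recognize(model, picture, id_to_name, top_k=1):
    """Sort the 1000 class probabilities and return the most likely object(s)."""
    probs = model.predict(picture[np.newaxis, ...])[0]  # shape (1000,)
    ranked = np.argsort(probs)[::-1][:top_k]            # classes by descending probability
    return [(id_to_name[int(i)], float(probs[i])) for i in ranked]
```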
Step S103: the object recognition model is called and the recognition result is returned.
In this step, the Inception V3 object recognition model is packaged as a RESTful interface. The back end mainly receives the object picture transmitted by the front end and stores it at the designated path on the server. Specifically, the path of the object recognition interface and the upload path for object picture files are first added to the configuration file. It is then checked whether the upload path for the object picture file exists; if so, a timestamp and the file name are spliced into a new file name, which is joined to the path, and the file is stored. An HttpPost object is then created; the HTTP request uses the POST method, with the packaged object recognition model interface path as its input. The object picture path information is received, the picture is read, the object recognition model is called, and the recognition result is returned. Finally, the obtained result is returned to the front end for display.
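A hedged sketch of such a RESTful wrapper using Flask; the route name, the upload directory, and the recognize_picture helper (assumed to wrap the packaged model) are illustrative assumptions, not the patent's actual interface.

```python
import os
import time

from flask import Flask, jsonify, request

app = Flask(__name__)
UPLOAD_DIR = "uploads"  # configured upload path for object picture files

@app.route("/recognize", methods=["POST"])  # assumed object recognition interface path
def recognize_endpoint():
    f = request.files.get("picture")
    if f is None or f.filename == "":
        return jsonify(error="the uploaded picture file must not be empty"), 400
    os.makedirs(UPLOAD_DIR, exist_ok=True)         # make sure the upload path exists
    filename = f"{int(time.time())}_{f.filename}"  # splice timestamp and file name
    path = os.path.join(UPLOAD_DIR, filename)
    f.save(path)
    name, prob = recognize_picture(path)  # assumed helper wrapping the packaged model
    return jsonify(name=name, probability=prob)    # result goes back to the front end
```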
Step S104: the recognition result is linked to the corresponding entity in the knowledge base.
The object name, i.e., the entity name, is obtained through the object recognition model; this step converts the input image-modality information into text-modality information. The entity and attribute information corresponding to the entity name are queried through a knowledge graph constructed on an ontology (a knowledge base built on the ontology, in which the recognized object is linked to an entity in the knowledge base), thereby linking the entity name to the corresponding entity and obtaining knowledge related to it. The generated owl file is loaded with the Apache Jena tool, and the SPARQL API is called from a Java program to use the query processing function of the Jena framework, querying the knowledge base for the entity concept and attribute information corresponding to the object recognition result. Finally, the queried knowledge is returned to a system page for display. FIG. 3 illustrates the relationship between the apple parent class and its child classes: the parent class is apple, and the child classes comprise ten categories such as ID, alias, nature and taste, distribution area, and nutritive value. For example, if the object recognition result is an apple, "apple" is the entity name; entity linking technology links it to the corresponding entity "apple" in the knowledge base, and the attribute information of that entity is returned for visual display.
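The patent performs this query with Apache Jena from Java; as a hedged Python-side equivalent, rdflib can load the same owl file and run a SPARQL query for an entity's attributes. The file name and the use of rdfs:label for the entity name are assumptions.

```python
import rdflib

g = rdflib.Graph()
g.parse("knowledge_base.owl")  # the generated owl ontology file (name assumed)

# Query every property and value of the entity labelled "apple".
query = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?property ?value WHERE {
    ?entity rdfs:label "apple" .
    ?entity ?property ?value .
}
"""
for prop, value in g.query(query):
    print(prop, value)
```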
Step S105: visual display with ECharts.
The overall effect of the entity linking system is as follows: a picture of the object to be recognized is uploaded; clicking the "recognize" button yields the picture's recognition result, including the object's name and corresponding probability, which is displayed on the page. The uploaded picture file must not be empty and must not exceed 1 MB in size; the size of the uploaded file is checked, and otherwise a picture file smaller than 1 MB must be re-uploaded.
Meanwhile, knowledge related to the object can be obtained from the object name, i.e., the entity name, through entity linking. For example, if the recognition result of the uploaded picture is an apple, knowledge of the apple, including its ID, functions, distribution area, nutritive value, and the like, can be obtained through entity linking. Finally, the obtained encyclopedic knowledge of the object is displayed visually in chart form with ECharts.
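The display itself uses ECharts; as a hedged sketch on the Python side, pyecharts (the Python ECharts binding) can render the linked entity's attributes as a chart. The attribute names and counts below are stand-in values.

```python
from pyecharts import options as opts
from pyecharts.charts import Bar

chart = (
    Bar()
    .add_xaxis(["alias", "distribution area", "nutritive value"])
    .add_yaxis("apple", [3, 5, 8])  # stand-in counts of attribute entries
    .set_global_opts(title_opts=opts.TitleOpts(title="Entity: apple"))
)
chart.render("entity_apple.html")  # writes an interactive HTML chart
```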
The above-described embodiments of the present invention do not limit the scope of the present invention. Any modification, equivalent replacement, and improvement made within the spirit and scope of the present invention shall be included in the protection scope of the claims of the present invention.

Claims (10)

1. A multi-modal entity linking method, comprising the steps of:
(I) generating an object recognition model: collecting and labeling pictures, and preprocessing the collected and labeled pictures; constructing an object recognition model; training the object recognition model;
(II) generating an entity link library: acquiring an entity corpus, and associating entities with picture tags to obtain an entity link library;
(III) entity linking: preprocessing a picture obtained by shooting, inputting it into the object recognition model to obtain an object recognition result, and looking up the recognition result in the entity link library to obtain the entity's text result.
2. The method of claim 1, wherein the picture preprocessing methods are: encoding, thresholding or filtering operations, and normalization.
3. The method of claim 2, wherein the process of constructing the object recognition model is: an Inception V3 deep neural network model is adopted to construct the object recognition model, the model's input being a picture of the object to be recognized and its output being the object's name and corresponding probability; the Inception structure uses 1×1 convolution kernels to reduce dimensionality, and the fully connected layer is replaced by simple global average pooling.
4. The method of claim 3, wherein the specific process of training the object recognition model is: the model is trained with the deep learning software library TensorFlow; the preprocessed pictures are input as training samples; parameters such as the learning rate and number of iterations are set; and model training is performed to finally obtain the Inception V3 model with the best training effect.
5. The method of claim 4, wherein entities and picture tags are associated in the entity link library as follows: after the entity library is obtained, its entities are associated with the picture entities through manual labeling; after manual labeling, an entity-picture label library is obtained.
6. The method of claim 5, wherein the entity linking further comprises displaying the entity linking result after the entity's text result is obtained, the retrieval result being presented through a visual display or voice broadcast.
7. A multi-modal entity linking system, comprising the following modules:
an object recognition model generation module, which collects and labels pictures, preprocesses the collected and labeled pictures, constructs an object recognition model, and trains the object recognition model;
an entity link library generation module, which acquires an entity corpus and associates entities with picture tags to obtain an entity link library;
an entity linking module, which preprocesses a picture obtained by shooting, inputs it into the object recognition model to obtain an object recognition result, and looks up the recognition result in the entity link library to obtain the entity's text result.
8. A readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, carries out the steps of the method of any one of claims 1-6.
9. A multi-modal entity link generation device, comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor executes the program to realize the steps of the method of any one of claims 1-6.
10. The device of claim 9, further comprising an intelligent desk lamp, wherein the memory and the processor are embedded in the intelligent desk lamp, and the intelligent desk lamp comprises a sound pickup device.
CN201911101194.3A 2019-11-12 2019-11-12 Multi-mode entity linking method and entity linking system Pending CN110991149A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911101194.3A CN110991149A (en) 2019-11-12 2019-11-12 Multi-mode entity linking method and entity linking system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911101194.3A CN110991149A (en) 2019-11-12 2019-11-12 Multi-mode entity linking method and entity linking system

Publications (1)

Publication Number Publication Date
CN110991149A true CN110991149A (en) 2020-04-10

Family

ID=70083926

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911101194.3A Pending CN110991149A (en) 2019-11-12 2019-11-12 Multi-mode entity linking method and entity linking system

Country Status (1)

Country Link
CN (1) CN110991149A (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108416384A (en) * 2018-03-05 2018-08-17 苏州大学 A kind of image tag mask method, system, equipment and readable storage medium storing program for executing
CN108509420A (en) * 2018-03-29 2018-09-07 赵维平 Gu spectrum and ancient culture knowledge mapping natural language processing method
CN109002834A (en) * 2018-06-15 2018-12-14 东南大学 Fine granularity image classification method based on multi-modal characterization
CN109034248A (en) * 2018-07-27 2018-12-18 电子科技大学 A kind of classification method of the Noise label image based on deep learning
CN109299274A (en) * 2018-11-07 2019-02-01 南京大学 A kind of natural scene Method for text detection based on full convolutional neural networks

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Min Lin, Qiang Chen, Shuicheng Yan: "Network in Network", arXiv:1312.4400 [cs.NE] *
C. Szegedy, W. Liu, Y. Jia, et al.: "Going Deeper with Convolutions", 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) *
Jiang Xinmeng: "Research on the Application of Convolutional Neural Networks Based on TensorFlow", China Master's Theses Full-text Database (Electronic Journal), Information Science and Technology *
Wang Meng: "Research on Entity Linking Methods and Implementation of an Entity Linking System in the Information Security Field", China Master's Theses Full-text Database (Electronic Journal), Information Science and Technology, 2018, No. 12 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111581973A (en) * 2020-04-24 2020-08-25 中国科学院空天信息创新研究院 Entity disambiguation method and system
CN111949815A (en) * 2020-08-17 2020-11-17 西南石油大学 Artificial intelligent corrosion material selection system based on image recognition technology and working method
CN112256951A (en) * 2020-09-09 2021-01-22 青岛大学 Intelligent household seedling planting system
CN112163109A (en) * 2020-09-24 2021-01-01 中国科学院计算机网络信息中心 Entity disambiguation method and system based on picture
CN112347768A (en) * 2020-10-12 2021-02-09 出门问问(苏州)信息科技有限公司 Entity identification method and device
CN113672092A (en) * 2021-08-26 2021-11-19 南京邮电大学 VR live-action teaching model big data teaching knowledge mining method and system

Similar Documents

Publication Publication Date Title
CN110837579B (en) Video classification method, apparatus, computer and readable storage medium
CN110119786B (en) Text topic classification method and device
CN110991149A (en) Multi-mode entity linking method and entity linking system
CN113312500B (en) Method for constructing event map for safe operation of dam
CN109271493B (en) Language text processing method and device and storage medium
CN113395578B (en) Method, device, equipment and storage medium for extracting video theme text
CN112819023B (en) Sample set acquisition method, device, computer equipment and storage medium
CN113010702B (en) Interactive processing method and device for multimedia information, electronic equipment and storage medium
CN112995690B (en) Live content category identification method, device, electronic equipment and readable storage medium
CN115203338A (en) Label and label example recommendation method
CN116955591A (en) Recommendation language generation method, related device and medium for content recommendation
CN114329181A (en) Question recommendation method and device and electronic equipment
CN113011126A (en) Text processing method and device, electronic equipment and computer readable storage medium
CN117709866A (en) Method and system for generating bidding document and computer readable storage medium
Jishan et al. Bangla language textual image description by hybrid neural network model
CN117216535A (en) Training method, device, equipment and medium for recommended text generation model
CN114676705B (en) Dialogue relation processing method, computer and readable storage medium
CN113569068B (en) Descriptive content generation method, visual content encoding and decoding method and device
CN114491209A (en) Method and system for mining enterprise business label based on internet information capture
CN114239730A (en) Cross-modal retrieval method based on neighbor sorting relation
CN116383426B (en) Visual emotion recognition method, device, equipment and storage medium based on attribute
Ullah et al. A review of multi-modal learning from the text-guided visual processing viewpoint
US11354894B2 (en) Automated content validation and inferential content annotation
CN116523041A (en) Knowledge graph construction method, retrieval method and system for equipment field and electronic equipment
Alberola et al. Artificial Vision and Language Processing for Robotics: Create end-to-end systems that can power robots with artificial vision and deep learning techniques

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20200410