CN111309969A - Video retrieval method matched with text information - Google Patents

Video retrieval method matched with text information

Info

Publication number
CN111309969A
Authority
CN
China
Prior art keywords
video
information
vector matrix
model
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010046793.6A
Other languages
Chinese (zh)
Inventor
邓清勇
钱利智
谭智辉
向懿
房海鹏
徐康宇
曾艳
欧阳艳
关屋大雄
胡怡玮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xiangtan University
Original Assignee
Xiangtan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xiangtan University filed Critical Xiangtan University
Priority to CN202010046793.6A priority Critical patent/CN111309969A/en
Publication of CN111309969A publication Critical patent/CN111309969A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/78Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/7867Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using information manually generated, e.g. tags, keywords, comments, title and artist information, manually generated time, location and usage information, user ratings
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367Ontology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks

Abstract

The invention provides a video retrieval method matched with text information. First, a knowledge graph is used to expand the input text information and build a text feature vector matrix; second, an FCN model is trained with reference to the text feature vector matrix to establish a relation between videos and text information, and a unidirectional LSTM neural network is used to generate feature descriptions for videos and build a video feature vector matrix; the two vector matrices are then imported into an RNN recurrent neural network model for training; finally, the methods for generating the text feature vector matrix and the video feature vector matrix are added to the trained model as interfaces for processing text and video, realizing video retrieval matched with text information. Given input text, the invention can retrieve the videos in a video library whose content best matches it. Because screening and retrieval are completed inside the RNN, the feature description information of the videos does not need to be stored, which reduces the amount of key data stored, improves retrieval efficiency, and realizes video retrieval based on video content.

Description

Video retrieval method matched with text information
Technical Field
The invention relates to the technical field of video retrieval, in particular to a video retrieval method matched with text information.
Background
With the rapid development of internet technology and the continuous upgrading of video shooting, editing and collecting equipment, the number of network videos has grown explosively. People can view videos more conveniently, and they also demand more efficient and accurate video retrieval. Traditional text-based video retrieval requires annotating video information manually and then retrieving videos with a text-based database management system, which consumes a large amount of time and index storage space during retrieval. With the rapid growth of video data, text-based video retrieval can no longer meet retrieval needs: it is difficult to retrieve videos from a small amount of brief text, and retrieval based on video content is inefficient or even infeasible.
In summary, the keys to solving the video retrieval problem are how to expand the text information so as to reduce retrieval complexity, and how to realize retrieval based on video content. With the development of artificial intelligence, deep learning offers new ideas for these problems. A neural network is a machine learning technique that imitates the human brain to realize artificial intelligence. Knowledge graph technology can organize and present a complex knowledge field through data mining, information processing, knowledge measurement and graph drawing, and can be used to expand text information. Video captioning technology can generate textual descriptions for video, that is, convert from the video image domain to the text domain. A recurrent neural network (RNN) can be used to implement the overall functionality of the video retrieval system. Based on these techniques, a video retrieval method matched with text information is designed.
Disclosure of Invention
The invention discloses a video retrieval method matched with text information. It mainly applies knowledge graph and video captioning technology to process text and video, realizes video retrieval based on video content, improves retrieval efficiency and reduces the amount of data stored.
According to the application background of the invention, a video retrieval method matched with text information is provided, which comprises the following steps:
Step 1, using a knowledge graph to expand text information and establish a text feature vector matrix: train a full convolutional neural network (FCN) model with reference to the text feature vector matrix to establish a relation between videos and text information; use a unidirectional LSTM neural network to generate feature descriptions for videos and establish a video feature vector matrix; record the method and parameters for expanding the text information and generating the text feature vector matrix; and generate descriptions for the video under test with video captioning technology, establishing the video feature vector matrix through a word2vec model:
1) Use a knowledge graph to expand the input retrieval text: split the input text into a group of words; obtain vector representations of the words and of knowledge base entities with a word2vec model and a knowledge graph embedding model; map the vectors into the same vector space through a nonlinear transformation and use them to construct a KCNN neural network; given a vocabulary database, obtain the vector representation of the input text for vocabulary retrieval; predict the association probability between the text and candidate expansion information with a DNN neural network model; establish the text feature vector matrix, adding the feature information vectors with the highest association to the matrix so as to expand the input text; and record the method and parameters for expanding the text information and generating the text feature vector matrix;
2) With reference to the information vocabulary in the text feature matrix, establish a corresponding feature vocabulary library and generate descriptions for the video under test with video captioning technology: establish a lexical full convolutional neural network (Lexical-FCN) model; generate data for each frame of the video through the FCN; establish, through model training, a weak mapping between the data and the lexicon gathered from the text feature vector matrix; on the last layer of the FCN output, coarsely divide 16 regions with the anchor method from object detection; confirm the types of the region sequences and select a subset of sequences; generate descriptions based on the text feature vector matrix with a unidirectional LSTM neural network; establish the video feature vector matrix with a word2vec model; and record the method and parameters for generating descriptions for the video under test with video captioning technology and establishing the video feature vector matrix through the word2vec model.
Step 2, importing the text feature vector matrix and the video feature vector matrix into an RNN recurrent neural network model for training: take the method of expanding text information with the knowledge graph and generating the text feature vector matrix as the interface for processing input text; take the method of generating descriptions for the video under test with video captioning technology and establishing the video feature vector matrix through the word2vec model as the interface for processing video; load the whole model into a video retrieval engine; and verify whether the usability of the model reaches the target:
1) Match and relate the text feature vector matrix and the video feature vector matrix imported into the RNN recurrent neural network model. Inputting multiple times generates activation functions of different types and contents, which improves the precision of screening and matching; parameters are continuously adjusted and propagated, generating a multilayer network, and the training process iterates, adjusting the parameters until training is complete. The stored methods for generating the text feature vector matrix and the video feature vector matrix serve as the input interfaces of the model, which is finally loaded into a video retrieval engine;
2) Input text information describing the salient features of the desired video and connect to a video resource library. The expanded text enters the engine as input through the video search engine and participates in the screening and matching process, while the videos of the video library enter the engine and their corresponding features are extracted; the engine then performs its own matching and screening and finally returns the best processed result as the retrieval result.
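To make the two steps concrete, the following is a minimal, hypothetical Python skeleton of the retrieval flow. The names expand_text, caption_video and rank_videos, the vector shapes, and the cosine-similarity scoring are all illustrative assumptions standing in for the trained models described above; they are not the patent's actual implementation.

```python
import numpy as np

DIM = 128  # assumed embedding dimension

def expand_text(query: str) -> np.ndarray:
    """Stand-in for step 1.1: knowledge-graph expansion + word2vec,
    returning a text feature vector matrix (n_terms x DIM)."""
    rng = np.random.default_rng(abs(hash(query)) % 2**32)
    return rng.normal(size=(8, DIM))

def caption_video(video_id: str) -> np.ndarray:
    """Stand-in for step 1.2: Lexical-FCN + LSTM captioning + word2vec,
    returning a video feature vector matrix (n_captions x DIM)."""
    rng = np.random.default_rng(abs(hash(video_id)) % 2**32)
    return rng.normal(size=(8, DIM))

def rank_videos(query: str, video_ids: list[str]) -> list[str]:
    """Stand-in for step 2: score each video against the query text.
    Cosine similarity of the pooled matrices replaces the trained RNN here."""
    t = expand_text(query).mean(axis=0)
    scores = {}
    for vid in video_ids:
        v = caption_video(vid).mean(axis=0)
        scores[vid] = float(t @ v / (np.linalg.norm(t) * np.linalg.norm(v)))
    return sorted(scores, key=scores.get, reverse=True)

print(rank_videos("a dog catching a frisbee", ["vid_001", "vid_002", "vid_003"]))
```

In the invention itself the matching happens inside the trained RNN, so the per-video caption matrices never need to be stored.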
Compared with the prior art, the method has the following advantages:
1. The data used comes from the input text information and from the feature descriptions of the videos, realizing video retrieval based on video content.
2. The retrieval process is optimized: a method for retrieving videos with text information is provided, improving the recognition rate and the retrieval rate.
3. By matching text information against video features, the method avoids storing the many empirically chosen video features required by manually built video indexes, reducing the storage of key data and the number of retrieval operations.
4. A deep learning algorithm is used, with the constructed text feature vector matrix and video feature vector matrix as training samples, to learn the relation for each feature dimension. This overcomes the subjectivity of the empirically assigned feature relations in existing index methods, weights the influence of a video's information elements on the retrieval results more appropriately, improves the screening effect of the video search engine, and returns results that better match user needs, improving the user experience and retrieval efficiency.
Drawings
FIG. 1 is a flow chart of the present invention;
FIG. 2 is a schematic diagram of text information expansion according to the present invention;
FIG. 3 is a schematic diagram of the text feature vector matrix according to the present invention;
FIG. 4 is a schematic diagram of video feature generation according to the present invention;
FIG. 5 is a schematic diagram of the video feature vector matrix according to the present invention;
FIG. 6 is a schematic diagram of RNN model training according to the present invention.
Detailed Description
As shown in fig. 1, the technical scheme of the invention comprises the following specific steps:
Step 1, using a knowledge graph to expand text information and establish a text feature vector matrix: train a full convolutional neural network (FCN) model with reference to the text feature vector matrix to establish a relation between videos and text information; use a unidirectional LSTM neural network to generate feature descriptions for videos and establish a video feature vector matrix; record the method and parameters for expanding the text information and generating the text feature vector matrix; and generate descriptions for the video under test with video captioning technology, establishing the video feature vector matrix through a word2vec model:
1) As shown in fig. 2, use a knowledge graph to expand the input retrieval text: split the input text into a group of words; obtain vector representations of the words and of knowledge base entities with a word2vec model and a knowledge graph embedding model; map the vectors into the same vector space through a nonlinear transformation and use them to construct a KCNN neural network; given a vocabulary database, obtain the vector representation of the input text for vocabulary retrieval; predict the association probabilities between the text and candidate expansion information with another DNN neural network model; establish the text feature vector matrix shown in fig. 3, adding the feature information vectors with the highest association to the matrix so as to expand the input text; and record the method and parameters for expanding the text information and generating the text feature vector matrix:
Split the input text into a group of words, link the words with entities of the knowledge base, and find all adjacent entities within one hop of each linked entity; obtain vector representations of the words with a word2vec model and vector representations of the knowledge base entities with a knowledge graph embedding model;
Map the vector representations of the input words, linked entities and context entities into the same vector space through a nonlinear transformation g:
g(e_{1:n}) = [g(e_1) g(e_2) … g(e_n)]
g(e) = tanh(M·e + b)
where M is the trainable transformation matrix and b the bias;
Then, analogously to the three RGB channels of an image, the vector representations of the words, linked entities and context entities are used as the multi-channel input of a CNN, constructing a KCNN neural network, so that the input of the KCNN model can be expressed as:
w_{1:n} = [w_1 w_2 … w_n]

ē = (1/|context(e)|) Σ_{e_i ∈ context(e)} e_i

W = [w_{1:n} g(e_{1:n}) g(ē_{1:n})]
Given a vocabulary database, obtain the vector representation of the text information through the KCNN neural network, and calculate the normalized influence weights by using a DNN neural network model as an attention network with a softmax normalization function:
s_i = exp(H(v, v_i)) / Σ_j exp(H(v, v_j))

where v is the KCNN representation of the input text, v_i that of the i-th vocabulary entry, and H the attention network;
and obtain the vector representation of the vocabulary database with respect to the input text:
v̄ = Σ_i s_i v_i
Use another DNN model to predict the association probability between the text and the expansion information. The results of the two models represent the input at both the semantic and the knowledge level, and the alignment mechanism between entities and words fuses the heterogeneous information sources, so that the implicit relations within the text are better captured and the input text can be expanded through them. Record the method and parameters for expanding the text information and generating the text feature vector matrix. A sketch of this expansion step follows.
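The following minimal Python sketch illustrates the shape of this step: the multi-channel stacking of word, entity and context-entity vectors that feeds the KCNN, and the softmax attention over a vocabulary database. All dimensions, the tanh transformation, and the dot-product attention score (a DNN in the description above) are illustrative assumptions; entity linking is assumed already done.

```python
import numpy as np

dim, n_words, n_vocab = 64, 6, 100
rng = np.random.default_rng(0)

# word2vec word vectors and knowledge-graph entity / context-entity vectors (assumed given)
w = rng.normal(size=(n_words, dim))      # w_{1:n}
e = rng.normal(size=(n_words, dim))      # linked-entity embeddings e_{1:n}
e_ctx = rng.normal(size=(n_words, dim))  # averaged context-entity embeddings

# nonlinear transformation g(e) = tanh(M e + b) into the word-vector space
M = 0.1 * rng.normal(size=(dim, dim))
b = np.zeros(dim)
g = lambda x: np.tanh(x @ M.T + b)

# multi-channel KCNN input, analogous to the RGB channels of an image
W = np.stack([w, g(e), g(e_ctx)])        # shape (3, n_words, dim)

# stand-in for the KCNN itself: pool channels into one text representation v
v = W.mean(axis=(0, 1))

# softmax-normalized influence weights over a vocabulary database
# (a dot product stands in for the DNN attention network H)
vocab = rng.normal(size=(n_vocab, dim))
scores = vocab @ v
s = np.exp(scores - scores.max())
s /= s.sum()

# vector representation of the vocabulary database w.r.t. the input text
v_bar = s @ vocab

# the highest-weight vocabulary vectors become the expansion information
top = np.argsort(s)[::-1][:3]
print("expansion entries:", top, "weights:", s[top].round(3))
```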
2) As shown in fig. 4, with reference to the information vocabulary in the text feature matrix, establish a corresponding feature vocabulary library and generate descriptions for the video under test with video captioning technology: establish a lexical full convolutional neural network (Lexical-FCN) model; generate data for each frame of the video through the FCN; establish, through model training, a weak mapping between the data and the lexicon gathered from the text feature vector matrix; on the last layer of the FCN output, coarsely divide 16 regions with the anchor method from object detection; confirm the types of the region sequences and select a subset of sequences; generate descriptions based on the text feature vector matrix with a unidirectional LSTM neural network; then establish the video feature vector matrix shown in fig. 5 with a word2vec model; and record the method and parameters for generating descriptions for the video under test with video captioning technology and establishing the video feature vector matrix through the word2vec model:
Establish the Lexical-FCN model: generate data for each frame of the video through the FCN, and establish, through model training, a weak mapping between the data and the lexicon gathered from the text feature vector matrix; on the last layer of the FCN output, coarsely divide 16 regions with the anchor method from object detection to serve region-sequence generation;
Region-sequence generation uses a submodular maximization method: extract 30 frames from the video, confirm the type of each region sequence, and select a subset of sequences for description generation. The selection criterion is

A* = argmax_{A ⊆ V} R(A; w)

that is, the region sequence A* is selected to maximize its correlation with the features, where R is specifically

R(A; w) = w^T f(A) = Σ_i w_i f_i(A)

a linear combination of functions f over each sequence A. f must cover three aspects, informativeness, coherence and diversity, with one component function for each: f_inf(A), f_coh(A) and f_div(A).
The region sequence is obtained greedily step by step; the gain of adding a region r at each step is

G(r) = R(A ∪ {r}; w) − R(A; w)

and at each step the r that maximizes this increment is chosen under the parameter weights w:

r* = argmax_{r ∈ V \ A} G(r)

Step by step, this greedy procedure extracts region sequences that are informative and coherent and that differ substantially from one another (diversity), as sketched after this list;
Use a unidirectional LSTM neural network with type information c added, S* = argmax_S P(S | v, c), to generate description words for the different types of sequences; establish a structurally symmetric feature vector matrix through a word2vec model; and record the method and parameters for generating descriptions for the video under test with video captioning technology and establishing the video feature vector matrix through the word2vec model.
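As referenced above, here is a minimal Python sketch of the greedy submodular region-sequence selection. The three component functions are simple stand-ins (feature coverage for informativeness, temporal adjacency for coherence, pairwise dissimilarity for diversity), since their exact definitions are not reproduced in the source; the weights w and all shapes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n_regions, dim = 16, 32
regions = rng.normal(size=(n_regions, dim))  # per-region FCN features (assumed)

def f_inf(A):
    # stand-in informativeness: magnitude of the pooled features
    return float(np.linalg.norm(regions[A].sum(axis=0))) if A else 0.0

def f_coh(A):
    # stand-in coherence: reward temporally adjacent region indices
    idx = sorted(A)
    return float(sum(1.0 for a, b in zip(idx, idx[1:]) if b - a == 1))

def f_div(A):
    # stand-in diversity: penalize pairwise similarity among selected regions
    if len(A) < 2:
        return 0.0
    X = regions[A]
    norms = np.linalg.norm(X, axis=1)
    sim = (X @ X.T) / np.outer(norms, norms)
    return -float(sim[np.triu_indices(len(A), k=1)].mean())

w = np.array([1.0, 0.5, 0.5])  # linear-combination weights w

def R(A):
    # R(A; w) = w^T f(A)
    return float(w @ np.array([f_inf(A), f_coh(A), f_div(A)]))

# greedy maximization: at each step add the region r with the largest gain
A, k = [], 5
for _ in range(k):
    gains = {r: R(A + [r]) - R(A) for r in range(n_regions) if r not in A}
    A.append(max(gains, key=gains.get))
print("selected region sequence:", A)
```

Each selected sequence would then be fed to the unidirectional LSTM to generate its description, and the descriptions embedded by word2vec to form the video feature vector matrix.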
Step 2, importing the text feature vector matrix and the video feature vector matrix into an RNN recurrent neural network model for training: take the method of expanding text information with the knowledge graph and generating the text feature vector matrix as the interface for processing input text; take the method of generating descriptions for the video under test with video captioning technology and establishing the video feature vector matrix through the word2vec model as the interface for processing video; load the whole model into a video retrieval engine; and verify whether the usability of the model reaches the target:
1) As shown in fig. 6, match and relate the text feature vector matrix and the video feature vector matrix imported into the RNN recurrent neural network model. Inputting multiple times generates activation functions of different types and contents, which improves the precision of screening and matching; parameters are continuously adjusted and propagated, generating a multilayer network, and the training process iterates, adjusting the parameters until training is complete. The stored methods for generating the text feature vector matrix and the video feature vector matrix serve as the input interfaces of the model, which is loaded into a video search engine:
The network receives two input feature vectors X_t and Y_t at time t; the value of the hidden layer is then S_t and the output is O_t. The critical point is that S_t depends not only on X_t and Y_t but also on S_{t−1}, so the following formulas can be used, as implemented in the sketch after step 2):
O_t = g(V·S_t)
S_t = f(U·X_t + T·Y_t + W·S_{t−1})
2) Input text information describing the salient features of the desired video and connect to a video resource library. The expanded text enters the engine as input through the video search engine and participates in the screening and matching process, while the videos of the video library enter the engine and their corresponding features are extracted; the engine then performs its own matching and screening and finally returns the best processed result as the retrieval result.
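The following minimal Python sketch implements the two-input recurrent cell given by the formulas in step 1), with tanh and softmax as assumed choices for f and g; all weight shapes and the initialization are illustrative, and real training would learn U, T, W and V by backpropagation.

```python
import numpy as np

dim_x, dim_y, dim_s, dim_o = 128, 128, 64, 10
rng = np.random.default_rng(2)
U = 0.1 * rng.normal(size=(dim_s, dim_x))  # weights for the text input X_t
T = 0.1 * rng.normal(size=(dim_s, dim_y))  # weights for the video input Y_t
W = 0.1 * rng.normal(size=(dim_s, dim_s))  # recurrent weights on S_{t-1}
V = 0.1 * rng.normal(size=(dim_o, dim_s))  # output weights

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def step(X_t, Y_t, S_prev):
    """S_t = f(U·X_t + T·Y_t + W·S_{t-1});  O_t = g(V·S_t)."""
    S_t = np.tanh(U @ X_t + T @ Y_t + W @ S_prev)
    O_t = softmax(V @ S_t)
    return S_t, O_t

# feed paired rows of the text and video feature vector matrices step by step
text_mat = rng.normal(size=(5, dim_x))
video_mat = rng.normal(size=(5, dim_y))
S = np.zeros(dim_s)
for X_t, Y_t in zip(text_mat, video_mat):
    S, O = step(X_t, Y_t, S)
print("final matching output:", O.round(3))
```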

Claims (4)

1. A video retrieval method matched with text information is characterized by at least comprising the following steps:
step 1, using a knowledge graph to expand text information and establish a text feature vector matrix, training a full convolutional neural network (FCN) model with reference to the text feature vector matrix to establish a relation between videos and text information, using a unidirectional LSTM neural network to generate feature descriptions for videos and establish a video feature vector matrix, recording the method and parameters for expanding the text information and generating the text feature vector matrix, and generating descriptions for the video under test with video captioning technology and establishing the video feature vector matrix through a word2vec model;
step 2, importing the text feature vector matrix and the video feature vector matrix into an RNN recurrent neural network model for training, taking the method of expanding text information with the knowledge graph and generating the text feature vector matrix as the interface for processing text information, taking the method of generating descriptions for the video under test with video captioning technology and establishing the video feature vector matrix through the word2vec model as the interface for processing video, and finally loading the whole model into a video retrieval engine to process and judge whether the usability of the model reaches the target.
2. The method of claim 1, wherein said using a knowledge-graph to perform information expansion on textual information and to establish a textual feature vector matrix further comprises:
1) splitting the input text information into a group of words, linking the words with entities of the knowledge base, finding all adjacent entities within one hop of each linked entity, obtaining vector representations of the words by using a word2vec model, and obtaining vector representations of the knowledge base entities by using a knowledge graph embedding model;
2) mapping the vector representations of the input words, linked entities and context entities into the same vector space through a nonlinear transformation g:
g(e_{1:n}) = [g(e_1) g(e_2) … g(e_n)]
g(e) = tanh(M·e + b)
where M is the trainable transformation matrix and b the bias;
3) then, analogously to the three RGB channels of an image, using the vector representations of the words, linked entities and context entities as the multi-channel input of a CNN to construct a KCNN neural network, so that the input of the KCNN model can be expressed as:
w_{1:n} = [w_1 w_2 … w_n]

ē = (1/|context(e)|) Σ_{e_i ∈ context(e)} e_i

W = [w_{1:n} g(e_{1:n}) g(ē_{1:n})]
4) given a vocabulary database, obtaining the vector representation of the text information through the KCNN neural network, and calculating the normalized influence weights by using a DNN neural network model as an attention network with a softmax normalization function:
s_i = exp(H(v, v_i)) / Σ_j exp(H(v, v_j))

where v is the KCNN representation of the input text, v_i that of the i-th vocabulary entry, and H the attention network;
obtaining a vector representation of the lexical database with respect to the input text:
v̄ = Σ_i s_i v_i
and using another DNN neural network model to predict the association probability between the text and the expansion information; the results of the two models represent the input at both the semantic and the knowledge level, and the alignment mechanism between entities and words fuses heterogeneous information sources, so that the implicit relations within the text are better captured and the input text information can be expanded through these implicit relations.
3. The method of claim 1, wherein training a full convolutional neural network (FCN) model with reference to the text feature vector matrix to establish a relation between the video and the text information, and using a unidirectional LSTM neural network to generate feature descriptions for the video and establish a video feature vector matrix, further comprises:
1) establishing a Lexical-FCN model, generating data for each frame of a video through the FCN, establishing through model training a weak mapping between the data and the lexicon gathered from the text feature vector matrix, coarsely dividing 16 regions with the anchor method from object detection on the last layer of the FCN output, and serving region-sequence generation;
2) region-sequence generation uses a submodular maximization method: extract 30 frames from the video, confirm the type of each region sequence, and select a subset of sequences for description generation; the selection criterion is

A* = argmax_{A ⊆ V} R(A; w)

that is, the region sequence A* is selected to maximize its correlation with the features, where R is specifically

R(A; w) = w^T f(A) = Σ_i w_i f_i(A)

a linear combination of functions f over each sequence A; f must cover three aspects, informativeness, coherence and diversity, with one component function for each: f_inf(A), f_coh(A) and f_div(A); the region sequence is obtained greedily step by step, the gain of adding a region r at each step being

G(r) = R(A ∪ {r}; w) − R(A; w)

and at each step the r that maximizes this increment is chosen under the parameter weights w:

r* = argmax_{r ∈ V \ A} G(r)

so that the greedy method step by step extracts region sequences that are informative and coherent and that differ substantially from one another;
3) using a unidirectional LSTM neural network with type information c added, S* = argmax_S P(S | v, c), to generate description words for the different types of sequences, establishing a structurally symmetric feature vector matrix through a word2vec model, and recording the method and parameters for generating descriptions for the video under test with video captioning technology and establishing the video feature vector matrix through the word2vec model.
4. The method of claim 1, wherein the step of importing the text feature vector matrix and the video feature vector matrix into the RNN recurrent neural network model for training further comprises:
1) matching and relating the text feature vector matrix and the video feature vector matrix imported into the RNN recurrent neural network model, inputting multiple times to generate activation functions of different types and contents so as to improve the precision of screening and matching, continuously adjusting and propagating parameters to generate a multilayer network, iterating the training process to keep adjusting the parameters until training is complete, using the stored methods for generating the text feature vector matrix and the video feature vector matrix as the input interfaces of the model, and finally loading the model into a video retrieval engine;
2) inputting text information describing the salient features of the desired video and connecting to a video resource library; the expanded text enters the engine as input through the video search engine and participates in the screening and matching process, while the videos of the video library enter the engine and their corresponding features are extracted; the engine then performs its own matching and screening and finally returns the best processed result as the retrieval result.
CN202010046793.6A 2020-01-16 2020-01-16 Video retrieval method matched with text information Pending CN111309969A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010046793.6A CN111309969A (en) 2020-01-16 2020-01-16 Video retrieval method matched with text information

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010046793.6A CN111309969A (en) 2020-01-16 2020-01-16 Video retrieval method matched with text information

Publications (1)

Publication Number Publication Date
CN111309969A (en) 2020-06-19

Family

ID=71145139

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010046793.6A Pending CN111309969A (en) 2020-01-16 2020-01-16 Video retrieval method matched with text information

Country Status (1)

Country Link
CN (1) CN111309969A (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107291825A (en) * 2017-05-26 2017-10-24 北京奇艺世纪科技有限公司 With the search method and system of money commodity in a kind of video
CN110598048A (en) * 2018-05-25 2019-12-20 北京中科寒武纪科技有限公司 Video retrieval method and video retrieval mapping relation generation method and device
CN109992676A (en) * 2019-04-01 2019-07-09 中国传媒大学 Across the media resource search method of one kind and searching system

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113641854A (en) * 2021-07-28 2021-11-12 上海影谱科技有限公司 Method and system for converting characters into video
CN113641854B (en) * 2021-07-28 2023-09-26 上海影谱科技有限公司 Method and system for converting text into video
CN113643588A (en) * 2021-08-17 2021-11-12 国匠堂(郑州)教育科技有限公司 Chinese character calligraphy teaching system and using method

Similar Documents

Publication Publication Date Title
US11875267B2 (en) Systems and methods for unifying statistical models for different data modalities
CN109934261B (en) Knowledge-driven parameter propagation model and few-sample learning method thereof
CN112015868B (en) Question-answering method based on knowledge graph completion
US8676725B1 (en) Method and system for entropy-based semantic hashing
CN110362660A (en) A kind of Quality of electronic products automatic testing method of knowledge based map
CN108932342A (en) A kind of method of semantic matches, the learning method of model and server
CN109344266B (en) Dual-semantic-space-based antagonistic cross-media retrieval method
CN110795527B (en) Candidate entity ordering method, training method and related device
CN108446334B (en) Image retrieval method based on content for unsupervised countermeasure training
CN111242033B (en) Video feature learning method based on discriminant analysis of video and text pairs
CN113177141B (en) Multi-label video hash retrieval method and device based on semantic embedded soft similarity
CN111737432A (en) Automatic dialogue method and system based on joint training model
CN113626589B (en) Multi-label text classification method based on mixed attention mechanism
CN113268609A (en) Dialog content recommendation method, device, equipment and medium based on knowledge graph
CN113806554B (en) Knowledge graph construction method for massive conference texts
CN111309969A (en) Video retrieval method matched with text information
CN111512299A (en) Method for content search and electronic device thereof
CN113468891A (en) Text processing method and device
CN113392265A (en) Multimedia processing method, device and equipment
CN111930894A (en) Long text matching method and device, storage medium and electronic equipment
CN110516240B (en) Semantic similarity calculation model DSSM (direct sequence spread spectrum) technology based on Transformer
CN114239730A (en) Cross-modal retrieval method based on neighbor sorting relation
CN110990630B (en) Video question-answering method based on graph modeling visual information and guided by using questions
CN114386424B (en) Industry professional text automatic labeling method, industry professional text automatic labeling device, industry professional text automatic labeling terminal and industry professional text automatic labeling storage medium
CN116452353A (en) Financial data management method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20200619