WO2018205838A1 - Similar video retrieval method, apparatus and storage medium - Google Patents

Similar video retrieval method, apparatus and storage medium

Info

Publication number
WO2018205838A1
WO2018205838A1 (PCT/CN2018/084580)
Authority
WO
WIPO (PCT)
Prior art keywords
video
candidate
word
entity
information
Prior art date
Application number
PCT/CN2018/084580
Other languages
English (en)
French (fr)
Inventor
张媛媛
于群
占飞
华枭
張永燊
熊磊
Original Assignee
腾讯科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 腾讯科技(深圳)有限公司
Publication of WO2018205838A1
Priority to US16/509,289 (published as US10853660B2)

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/3331Query processing
    • G06F16/3332Query translation
    • G06F16/3338Query expansion
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/3331Query processing
    • G06F16/3332Query translation
    • G06F16/3334Selection or weighting of terms from queries, including natural language queries
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/3331Query processing
    • G06F16/3332Query translation
    • G06F16/3335Syntactic pre-processing, e.g. stopword elimination, stemming
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/5846Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using extracted text
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/5866Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using information manually generated, e.g. tags, keywords, comments, manually generated location and time information
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/78Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/7867Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using information manually generated, e.g. tags, keywords, comments, title and artist information, manually generated time, location and usage information, user ratings
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • G06V20/46Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • G06V20/48Matching video sequences
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • G06V30/19Recognition using electronic means
    • G06V30/196Recognition using electronic means using sequential comparisons of the image signals with a plurality of references
    • G06V30/1983Syntactic or structural pattern recognition, e.g. symbolic string recognition

Definitions

  • The present application relates to the field of communications technologies, and in particular to a method, an apparatus, and a storage medium for retrieving similar videos.
  • When retrieving similar videos, a text depth representation model (word2vec) is generally used for modeling; the similarity of video titles is then calculated based on the model, and similar videos are found according to that similarity.
  • The corpus for training the word2vec model mainly comes from the web; if the corpus is updated, the word2vec model needs to be retrained.
  • The embodiments of the present application provide a similar video retrieval method, apparatus, and storage medium, which can not only improve the recall rate and the accuracy of retrieval results, but also reduce the frequency of model training and save computing resources.
  • An embodiment of the present application provides a similar video retrieval method, applied to a computing device, which includes: acquiring video information for which similar videos need to be retrieved, where the video information includes a video tag and a video title; obtaining, from a video library according to a preset knowledge graph, videos matching the video information to obtain a first candidate video set; training the video information using a preset word2vec model to convert the video information into word vectors; filtering, from the video library according to the word vectors, videos similar to the video information to obtain a second candidate video set; and determining similar videos of the video information according to the first candidate video set and the second candidate video set.
  • An embodiment of the present application further provides a similar video retrieval device, including a processor and a memory coupled to the processor, the memory storing machine readable instructions executable by the processor, the processor executing the machine readable instructions to perform the operations of the method above.
  • An embodiment of the present application further provides a non-transitory computer readable storage medium storing machine readable instructions, the machine readable instructions being executable by a processor to perform the operations of the method above.
  • FIG. 1a is a schematic diagram of a scenario of the similar video retrieval method provided by an embodiment of the present application;
  • FIG. 1b is a schematic diagram of another scenario of the similar video retrieval method provided by an embodiment of the present application;
  • FIG. 1c is a flowchart of the similar video retrieval method provided by an embodiment of the present application;
  • FIG. 2a is another flowchart of the similar video retrieval method provided by an embodiment of the present application;
  • FIG. 2b is an example diagram of relationship edges in the similar video retrieval method provided by an embodiment of the present application;
  • FIG. 3a is a schematic structural diagram of the similar video retrieval apparatus provided by an embodiment of the present application;
  • FIG. 3b is another schematic structural diagram of the similar video retrieval apparatus provided by an embodiment of the present application;
  • FIG. 4 is a schematic structural diagram of the computing device provided by an embodiment of the present application.
  • Traditional methods of retrieving similar videos rely heavily on the quality of word segmentation, the scale of the corpus, and the timeliness of corpus updates. If the corpus is not updated in time, correct segmentation results may not be obtained for some newly appearing video information, which affects the training result of the word2vec model and ultimately leads to a low recall rate (R, Recall) and inaccurate retrieval results. If the corpus is updated too frequently, a great deal of training time and computing resources are consumed, causing serious waste of resources.
  • In view of this, the embodiments of the present application provide a similar video retrieval method, apparatus, and storage medium, which can not only improve the recall rate and the accuracy of retrieval results, but also reduce the frequency of model training and save computing resources.
  • The similar video retrieval apparatus may be integrated into a computing device such as a server or a terminal. Taking integration into a server as an example, referring to FIG. 1a: after acquiring video information for which similar videos need to be retrieved, such as a video tag and a video title, the server may, on the one hand, obtain videos matching the video information from a video library according to a preset knowledge graph to obtain a first candidate video set; on the other hand, it may train the video information using a preset word2vec model (text depth representation model) to convert the video information into word vectors, and filter videos similar to the video information from the video library according to the word vectors to obtain a second candidate video set. Similar videos of the video information are then determined according to the first and second candidate video sets.
  • The resulting similar videos can be used in a variety of scenarios. For example, referring to FIG. 1b, the similar videos may be provided or recommended to the user, or videos may be classified in this manner, and so on.
  • The present embodiment is described from the perspective of a similar video retrieval apparatus, which may be integrated into a computing device such as a server or a terminal.
  • A similar video retrieval method, applied to a computing device, includes: acquiring video information for which similar videos need to be retrieved, the video information including a video tag and a video title; obtaining videos matching the video information from a video library according to a preset knowledge graph to obtain a first candidate video set; training the video information using a preset word2vec model to convert the video information into word vectors; filtering videos similar to the video information from the video library according to the word vectors to obtain a second candidate video set; and determining similar videos of the video information according to the first candidate video set and the second candidate video set.
  • the specific process of the similar video retrieval method can be as follows:
  • Step 101 Obtain video information that needs to retrieve similar videos.
  • For example, the similar video retrieval apparatus may receive a retrieval request sent by a terminal, where the retrieval request indicates the video information for which similar videos need to be retrieved; alternatively, when videos need to be classified, a corresponding retrieval request may be generated locally (i.e., by the similar video retrieval apparatus) or by another device, and the apparatus then acquires, according to the retrieval request, the video information for which similar videos need to be retrieved, and so on.
  • The video information may include information such as a video tag and a video title.
  • The video tag refers to information that can be used to represent the content and/or type of the video, such as movie, TV series, comedy, or adventure film.
  • The video tag may also be information associated with the video, such as a star, a director, an attraction, or a production company.
  • The video title refers to the title content of the video, and may specifically include the text and symbols in the title.
  • Step 102: Obtain videos matching the video information from the video library according to the preset knowledge graph to obtain a first candidate video set.
  • For example, this may specifically be as follows: (1) extract entity words from the video tag and the video title to obtain seeds (also called seed words); (2) obtain videos matching the seeds from the video library according to the preset knowledge graph to obtain the first candidate video set.
  • Entity words are words with specific semantics, specifically nouns that can refer to a certain thing, such as Zhang San or TV series XX.
  • The video library stores multiple videos, each with corresponding video information, which may include information such as a video tag and a video title.
  • Specifically, entity words having a strong association with the seed may be determined according to the preset knowledge graph and taken as candidate words, and videos whose video information contains the candidate words are obtained from the video library to get the first candidate video set.
  • A strong association means that the relationship degree is less than or equal to a set value; that is, if the relationship degree between an entity word and the seed is less than or equal to the set value, the entity word has a strong association with the seed. In other words, the step of "determining entity words having a strong association with the seed according to the preset knowledge graph and taking them as candidate words" may include:
  • mapping the seed onto entity words in a preset entity library, determining the relationship degree between the seed and each entity word in the entity library, and selecting entity words whose relationship degree is less than or equal to the set value as candidate words, where the entity library can be built from the preset knowledge graph.
  • The set value can be configured according to the requirements of the actual application. For example, entities with a relationship degree of 1 or 2 are commonly called "near" entities, so 1 degree may be used as the set value, and so on.
  • For example, the seed may be mapped onto the corresponding entity in the preset knowledge base by means such as Named Entity Linking (NEL), and the number of relationship edges between the seed and other entities in the knowledge base is then obtained, yielding the relationship degree between the seed and those entities.
  • Named entity linking, or entity linking for short, is the process of linking a seed to an unambiguous entity in the knowledge base, including merging synonymous entities and disambiguating ambiguous entities.
  • The number of relationship edges describes how close the association between entities is: it is the number of relationship edges that must be traversed from entity A to entity B in the entity relationship graph, and is generally expressed as a degree. For convenience, this degree is referred to as the relationship degree in the embodiments of the present application. The fewer the relationship edges, the lower the relationship degree (i.e., the degree), and the lower the relationship degree, the closer an entity is to the seed itself; for example, an entity's relationship degree to itself is 0, and so on. Details are not repeated here.
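  • As an illustration of the relationship degree just described, the following minimal Python sketch computes it as the shortest-path edge count over a toy entity relationship graph; the graph contents and the `degree_between` helper are illustrative assumptions, not part of the patent:

```python
from collections import deque

# Toy entity relationship graph: each key maps to its directly linked
# entities (one relationship edge each). A real knowledge graph is far larger.
GRAPH = {
    "X List": {"Zhang San", "Wang Wu"},
    "Zhang San": {"X List", "Zhang Taitai"},
    "Wang Wu": {"X List", "Li Si"},
    "Zhang Taitai": {"Zhang San"},
    "Li Si": {"Wang Wu"},
}

def degree_between(graph, source, target):
    """Relationship degree = number of relationship edges on the shortest
    path from source to target (0 for the entity itself)."""
    if source == target:
        return 0
    seen, queue = {source}, deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor == target:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None  # no path: the entities are unrelated

# Entities within 1 degree of the seed are the "near" candidates.
seed = "X List"
candidates = [e for e in GRAPH if e != seed and degree_between(GRAPH, seed, e) == 1]
print(candidates)  # ['Zhang San', 'Wang Wu']
```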
  • In some embodiments, the entity library may be preset by operation and maintenance personnel, or may be built by the similar video retrieval apparatus itself. That is, before the step of "mapping the seed onto entity words in the preset entity library", the similar video retrieval method may further include:
  • setting a basic vocabulary, obtaining Internet information according to the basic vocabulary, cleaning non-entity words from the Internet information, and constructing triple relationships between entity words according to the basic vocabulary and the cleaned Internet information, to obtain the entity library.
  • It should be noted that, to ensure the accuracy of retrieval results, the basic vocabulary and the Internet information may be updated periodically or in real time, and the entity library updated accordingly.
  • When updating, the entity library may be rebuilt in the manner described above, or a differential update may be performed: for newly added information, it is first analyzed to determine whether the entity words it contains already exist in the entity library; if so, they need not be added; otherwise, the entity words are new, in which case the corresponding Internet information is obtained and cleaned of non-entity words, triple relationships between the new entity words and the entity words in the cleaned Internet information are constructed, and the constructed triples are added to the entity library.
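  • The differential update described above can be sketched as follows, assuming the entity library is stored as a set of (head, relation, tail) triples; the `fetch_triples` callback stands in for the unspecified crawling, cleaning, and triple-extraction steps:

```python
# A hypothetical entity library: a set of (head, relation, tail) triples,
# mirroring the example (<Zhang San, star>, <belongs to>, <One Two Three Four, TV series>).
entity_library = {
    (("Zhang San", "star"), "belongs to", ("One Two Three Four", "TV series")),
}
known_entities = {h[0] for h, _, _ in entity_library} | {t[0] for _, _, t in entity_library}

def differential_update(new_words, fetch_triples):
    """Differential update: only entity words absent from the library trigger
    crawling. `fetch_triples(word)` stands in for obtaining Internet
    information, cleaning non-entity words, and extracting triples."""
    for word in new_words:
        if word in known_entities:
            continue  # already in the entity library; nothing to add
        for triple in fetch_triples(word):
            entity_library.add(triple)
        known_entities.add(word)

# Usage with a stub crawler for one new entity word.
differential_update(
    ["Li Si"],
    lambda w: [((w, "star"), "acted in", ("Some Drama", "TV series"))],
)
```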
  • Step 103: Train the video information using the preset word2vec model to convert the video information into word vectors.
  • For example, the video tag and the video title may be segmented to obtain segmented video text, and the segmented video text is then trained using the preset word2vec model to obtain the word vectors of the segmented video text, i.e., the word vector corresponding to each word in the segmented text.
  • A word vector expresses a word as a vector. It should be noted that the word vector obtained for each word has a fixed dimension, which effectively prevents dimension explosion and reduces the amount of computation in the subsequent similarity calculation.
  • The word2vec model may be preset by operation and maintenance personnel, or pre-built by the similar video retrieval apparatus. That is, before the step of "training the segmented video text using the preset word2vec model", the similar video retrieval method may further include:
  • obtaining a preset corpus, segmenting the sentences in the corpus, and training a preset original model on the segmented sentences to obtain the word2vec model.
  • The content of the corpus can be set according to the needs of the actual application. For example, user-generated content (UGC) can be sampled within a certain period of time, and network information such as various encyclopedia materials can be crawled, to build the corpus, and so on.
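  • As a concrete illustration, the training step could be implemented with the open-source gensim library (named here only as an example; the patent does not prescribe an implementation, and the parameter names below follow gensim 4.x). The toy corpus stands in for the sampled UGC and crawled encyclopedia sentences, already segmented into words:

```python
from gensim.models import Word2Vec

# Toy pre-segmented corpus standing in for sampled UGC and crawled
# encyclopedia sentences; each item is one segmented sentence.
corpus = [
    ["X List", "first episode", "costume", "TV drama"],
    ["Zhang San", "stars", "X List"],
    ["costume", "TV drama", "recommendation"],
]

# Every word is mapped to a fixed-dimension vector (100 here), which keeps
# later similarity computations cheap and avoids dimension explosion.
model = Word2Vec(corpus, vector_size=100, window=5, min_count=1, sg=1)
vector = model.wv["costume"]  # the word vector of one segmented word
```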
  • steps 102 and 103 can be performed in no particular order.
  • Step 104: Filter videos similar to the video information from the video library according to the word vectors to obtain a second candidate video set. For example, this may be as follows:
  • calculate, according to the word vectors, the similarity between the video information and each video in the video library, and select videos whose similarity is higher than a preset threshold to obtain the second candidate video set.
  • The preset threshold may be set according to the requirements of the actual application. The similarity between two videos may be obtained by calculating the dot product of their word vectors, or other similarity algorithms may be used; details are not repeated here.
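  • One simple way to realize this similarity computation, reusing the `model` from the previous sketch, is to average the word vectors of each video's segmented text and compare the resulting video vectors with a normalized dot product (cosine similarity); this is an illustrative sketch, not a formula prescribed by the patent:

```python
import numpy as np

def video_vector(words, model):
    """Average the word vectors of a video's segmented text."""
    vecs = [model.wv[w] for w in words if w in model.wv]
    return np.mean(vecs, axis=0) if vecs else np.zeros(model.vector_size)

def similarity(v1, v2):
    """Normalized dot product (cosine similarity) of two video vectors."""
    denom = np.linalg.norm(v1) * np.linalg.norm(v2)
    return float(np.dot(v1, v2) / denom) if denom else 0.0

def second_candidate_set(query_words, video_library, model, threshold=0.8):
    """Keep videos whose similarity to the query video exceeds the threshold.
    `video_library` maps video id -> list of segmented words."""
    q = video_vector(query_words, model)
    return [vid for vid, words in video_library.items()
            if similarity(q, video_vector(words, model)) > threshold]
```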
  • Step 105: Determine similar videos of the video information according to the first candidate video set and the second candidate video set.
  • For example, the videos in the first candidate video set and the second candidate video set may be scored separately, a comprehensive score value for each video is calculated from these scores, and videos with higher comprehensive scores, e.g., videos whose comprehensive score value is greater than a preset score value, are determined as similar videos of the video information, and so on.
  • In some embodiments, determining similar videos of the video information according to the first candidate video set and the second candidate video set may include: scoring each video in the first candidate video set to obtain a first score value; scoring each video in the second candidate video set to obtain a second score value; calculating the weighted sum of the first score value and the corresponding second score value as the comprehensive score value of each video; and determining videos whose comprehensive score value is greater than the preset score value as similar videos of the video information. This can be expressed by the formula: S = αA + βB, where S is the comprehensive score of video X, A is the first score of video X in the first candidate video set, B is the second score of video X in the second candidate video set, α is the weight of the first score value (i.e., the weight of the video in the first candidate video set), β is the weight of the second score value (i.e., the weight of the video in the second candidate video set), and the sum of α and β is 1. The specific values of α and β can be set according to the actual application requirements, for example tuned through user feedback.
  • The scoring range of the first score value and the second score value may be set according to the requirements of the actual application, for example to [0, 1].
  • If a video X in the first candidate video set is not in the second candidate video set, the second score of video X is 0; likewise, if a video L in the second candidate video set is not in the first candidate video set, the first score of video L is 0.
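  • The weighted fusion above is straightforward to express in code. A minimal sketch follows, assuming scores in [0, 1] and a score of 0 for a video missing from one set, as described; the values of α and of the preset score threshold are illustrative:

```python
def fuse_scores(first_scores, second_scores, alpha=0.6, min_score=0.5):
    """Comprehensive score S = alpha * A + beta * B, with beta = 1 - alpha.
    A video missing from one candidate set scores 0 in that set."""
    beta = 1.0 - alpha
    videos = set(first_scores) | set(second_scores)
    fused = {v: alpha * first_scores.get(v, 0.0) + beta * second_scores.get(v, 0.0)
             for v in videos}
    # Keep only videos whose comprehensive score exceeds the preset value.
    return {v: s for v, s in fused.items() if s > min_score}

# Video "L" appears only in the second candidate set, so its first score is 0.
print(fuse_scores({"K2": 0.9, "K3": 0.7}, {"K2": 0.8, "L": 0.95}))
```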
  • As can be seen from the above, this embodiment can obtain videos matching the video information from the video library according to the preset knowledge graph to obtain a first candidate video set; it can also train the video information using a preset word2vec model and filter videos similar to the video information from the video library according to the training result to obtain a second candidate video set; similar videos of the video information are then determined according to the first candidate video set and the second candidate video set, achieving the purpose of retrieving similar videos.
  • Because the scheme retrieves similar videos by combining a knowledge graph with text similarity, it can use the knowledge graph to compensate for the low recall rate and the large computing-resource consumption caused by the frequency and scale of corpus updates, and it can use the similarity calculation to supplement the context information of the requested video (i.e., the video for which similar videos are retrieved), avoiding biased recall results caused by ambiguity. Therefore, the scheme can not only improve the recall rate and the accuracy of retrieval results, but also reduce the frequency of model training and save computing resources.
  • In this embodiment, the similar video retrieval apparatus is integrated into a server, which is taken as an example for description.
  • Step 201 The server acquires video information that needs to retrieve a similar video.
  • For example, the server may receive a retrieval request sent by a terminal, where the retrieval request indicates the video information for which similar videos need to be retrieved; alternatively, when videos need to be classified, a corresponding retrieval request may be generated locally (i.e., by the server) or by another device, and the server then acquires, according to the retrieval request, the video information for which similar videos need to be retrieved, and so on.
  • The video information may include information such as a video tag and a video title.
  • The video tag refers to information that can be used to represent the content and/or type of the video, such as movie, TV series, comedy, or adventure film.
  • The video tag may also be information associated with the video, such as a star, a director, an attraction, or a production company.
  • The video title refers to the title content of the video, and may specifically include the text and symbols in the title.
  • Step 202: The server extracts entity words from the video information, such as the video tag and the video title, to obtain seeds.
  • For example, suppose the video for which similar videos need to be retrieved is video K, video K is a costume drama "X List", its video title is "X List, first episode", and its video tags are "costume" and "TV drama". Entity words such as "X List", "costume", and "TV drama" can then be extracted from the video title and video tags of video K to obtain the seeds.
  • Step 203: The server determines, according to the preset knowledge graph, entity words that have a strong association with the seeds, takes these entity words as candidate words, and obtains videos whose video information contains the candidate words from the video library to obtain a first candidate video set.
  • A strong association means that the relationship degree is less than or equal to a set value; that is, if the relationship degree between an entity word and the seed is less than or equal to the set value, the entity word has a strong association with the seed. For example, words consistent with the seed, synonyms and near-synonyms of the seed, and words having a predetermined specific relationship with the seed may all be considered to have a strong association with it. That is, the step of "the server determining entity words having a strong association with the seed according to the preset knowledge graph and taking them as candidate words" may include:
  • the server maps the seed onto entity words in the preset entity library, determines the relationship degree between the seed and each entity word in the entity library, and selects entity words whose relationship degree is less than or equal to the set value as candidate words.
  • The set value can be configured according to the requirements of the actual application, and the entity library can be built from the preset knowledge graph.
  • For example, the seed can be mapped onto the preset knowledge base using NEL or similar techniques, the number of relationship edges between the seed and other entities in the knowledge base is obtained, and the relationship degree between the seed and those entities is thus determined.
  • For example: the number of relationship edges between the seed "X List" and the entity word "Wang Wu" is 1, so the corresponding relationship degree is 1; the number of relationship edges between the seed "X List" and the entity word "Zhang San" is 1, so the corresponding relationship degree is 1; the number of relationship edges between the seed "X List" and the entity word "Zhang Taitai" is 2, so the corresponding relationship degree is 2; and the number of relationship edges between the seed "X List" and the entity word "Li Si" is 2, so the corresponding relationship degree is 2. If the set value is 1 degree, "Zhang San" and "Wang Wu" can be used as candidate words.
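  • Putting these pieces together, the construction of the first candidate video set can be sketched as follows, reusing `GRAPH` and `degree_between` from the earlier relationship-degree sketch; the video library contents are illustrative:

```python
def first_candidate_set(seeds, graph, video_library, set_value=1):
    """Expand seeds into candidate words within `set_value` relationship
    degrees, then keep videos whose tag/title words contain any of them.
    `video_library` maps video id -> set of words in its tag and title."""
    candidate_words = set(seeds)
    for seed in seeds:
        for entity in graph:
            d = degree_between(graph, seed, entity)  # from the earlier sketch
            if d is not None and 0 < d <= set_value:
                candidate_words.add(entity)
    return [vid for vid, words in video_library.items() if words & candidate_words]

video_library = {
    "K2": {"Zhang San", "interview", "TV drama"},
    "K3": {"Wang Wu", "costume"},
    "K4": {"cooking", "show"},
}
print(first_candidate_set({"X List"}, GRAPH, video_library))  # ['K2', 'K3']
```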
  • In some embodiments, the entity library may be set up by operation and maintenance personnel in advance, or may be built by the server. For example, it may be built as follows:
  • the server sets a basic vocabulary, obtains Internet information according to the basic vocabulary, cleans non-entity words from the Internet information, and constructs triple relationships between entity words according to the basic vocabulary and the cleaned Internet information, to obtain the entity library.
  • It should be noted that, to ensure the accuracy of retrieval results, the basic vocabulary and the Internet information may be updated periodically or in real time, and the entity library updated accordingly.
  • Step 204: The server segments the video tag, the video title, and the like, to obtain segmented video text.
  • Taking video K as an example: if the video title of video K is "X List, first episode" and its video tags are "costume" and "TV drama", the text can be segmented; for example, the title "X List, first episode" is divided into "X List" and "first episode", while the tags "costume" and "TV drama" are kept as the words "costume" and "TV drama", and so on, to obtain the segmented video text.
  • Steps 202 and 204 may be performed in no particular order.
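  • For Chinese video titles and tags, the segmentation in step 204 could use an off-the-shelf segmenter such as jieba (named here only as an example; the patent does not specify a tool). A minimal sketch:

```python
import jieba

# Register domain words so the (masked) drama title and episode marker
# segment as single tokens; these entries are illustrative.
jieba.add_word("X榜")
jieba.add_word("第一集")

title = "X榜第一集"        # "X List, first episode"
tags = ["古装", "电视剧"]   # "costume", "TV drama"

segmented = jieba.lcut(title) + [w for tag in tags for w in jieba.lcut(tag)]
print(segmented)  # e.g. ['X榜', '第一集', '古装', '电视剧']
```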
  • Step 205: The server trains the segmented video text using the preset word2vec model to obtain the word vectors of the segmented video text.
  • For example, the word2vec model may be used to train on the segmented words to obtain the word vector corresponding to each segmented word.
  • The word2vec model may be preset by operation and maintenance personnel, or pre-built by the similar video retrieval device. For example, the server may obtain a preset corpus, segment the sentences in the corpus, and then train a preset original model on the segmented sentences to obtain the word2vec model.
  • The content of the corpus can be set according to the needs of the actual application. For example, UGC can be sampled within a certain period, such as a whole year of UGC, and network information such as various encyclopedias can be crawled, to build the corpus, and so on; details are not repeated here.
  • Step 206: The server selects videos similar to the video information from the video library according to the word vectors of the segmented video text, to obtain a second candidate video set. For example, this may specifically be as follows:
  • the server calculates, according to the word vectors of the segmented video text, the similarity between the video information and each video in the video library, and selects videos whose similarity is higher than a preset threshold to obtain the second candidate video set.
  • The preset threshold may be set according to the requirements of the actual application. The similarity between two videos may be obtained by calculating the dot product of their word vectors, or other similarity algorithms may be used; details are not repeated here.
  • Step 207: The server determines similar videos of the video information according to the first candidate video set and the second candidate video set.
  • For example, the server may score the videos in the first candidate video set and the second candidate video set separately, calculate the comprehensive score value of each video from these scores, and then determine videos with higher comprehensive scores, e.g., videos whose comprehensive score value is greater than a preset score value, as similar videos of the video information, and so on.
  • In some embodiments, to improve flexibility, the server may also fine-tune the corresponding weights of the videos in the first candidate video set and the second candidate video set, so that the retrieval result is more accurate. That is, the step of "the server determining similar videos of the video information according to the first candidate video set and the second candidate video set" may specifically include:
  • the server scores each video in the first candidate video set to obtain a first score value;
  • the server scores each video in the second candidate video set to obtain a second score value;
  • the server calculates the weighted sum of the first score value and the corresponding second score value to obtain the comprehensive score value of each video;
  • the server determines videos whose comprehensive score value is greater than the preset score value as similar videos of the video information. This can be expressed by the formula: S = αA + βB, where S is the comprehensive score of video X, A is the first score of video X in the first candidate video set, B is the second score of video X in the second candidate video set, α is the weight of the first score value (i.e., the weight of the video in the first candidate video set), β is the weight of the second score value (i.e., the weight of the video in the second candidate video set), and the sum of α and β is 1. The specific values of α and β can be set according to the actual application requirements, for example tuned through user feedback.
  • The scoring range of the first score value and the second score value may be set according to the requirements of the actual application, for example to [0, 1].
  • If a video X in the first candidate video set is not in the second candidate video set, the second score of video X is 0; likewise, if a video L in the second candidate video set is not in the first candidate video set, the first score of video L is 0.
  • As can be seen from the above, this embodiment can obtain videos matching the video information from the video library according to the preset knowledge graph to obtain a first candidate video set; it can also train the video information using a preset word2vec model and filter videos similar to the video information from the video library according to the training result to obtain a second candidate video set; similar videos of the video information are then determined according to the first candidate video set and the second candidate video set, achieving the purpose of retrieving similar videos.
  • Because the scheme retrieves similar videos by combining a knowledge graph with text similarity, it can use the knowledge graph to compensate for the low recall rate and the large computing-resource consumption caused by the frequency and scale of corpus updates, and it can use the similarity calculation to supplement the context information of the requested video (i.e., the video for which similar videos are retrieved), avoiding biased recall results caused by ambiguity. Therefore, the scheme can not only improve the recall rate and the accuracy of retrieval results, but also reduce the frequency of model training and save computing resources.
  • the embodiment of the present application further provides a similar video retrieval device, and the similar video retrieval device may be specifically integrated in a computing device such as a server or a terminal.
  • the similar video retrieval device may include an acquisition unit 301, a matching unit 302, a training unit 303, a screening unit 304, and a determination unit 305.
  • the obtaining unit 301 is configured to acquire video information that needs to retrieve a similar video, where the video information includes a video tag and a video title.
  • the video information may include information such as a video tag and a video title
  • The video tag refers to information that can be used to represent the content and/or type of the video, or information associated with the video.
  • the video title refers to the title content of the video, and may specifically include text and symbols in the title.
  • The matching unit 302 is configured to obtain videos matching the video information from the video library according to the preset knowledge graph, to obtain a first candidate video set.
  • the matching unit 302 can include an extracting subunit and a matching subunit.
  • The extracting subunit may be configured to extract entity words from the video tag and the video title to obtain seeds.
  • The matching subunit may be configured to obtain videos matching the seeds from the video library according to the preset knowledge graph, to obtain the first candidate video set.
  • For example, the matching subunit may be specifically configured to determine, according to the preset knowledge graph, entity words that have a strong association with the seed, take these entity words as candidate words, and obtain videos whose video information contains the candidate words from the video library to get the first candidate video set.
  • A strong association means that the relationship degree is less than or equal to a set value; that is, if the relationship degree between an entity word and the seed is less than or equal to the set value, the entity word has a strong association with the seed. Namely:
  • the matching subunit may be specifically configured to map the seed onto entity words in the preset entity library, determine the relationship degree between the seed and each entity word in the entity library, and select entity words whose relationship degree is less than or equal to the set value as candidate words, where the entity library can be built from the preset knowledge graph.
  • The set value can be configured according to the requirements of the actual application. For example, entities with a relationship degree of 1 or 2 are commonly called "near" entities, so 1 degree may be used as the set value, and so on.
  • For example, the seed may be mapped onto the corresponding entity in the preset knowledge base by means of NEL, and the number of relationship edges between the seed and other entities in the knowledge base is then obtained, yielding the relationship degree between the seed and those entities in the knowledge base.
  • In some embodiments, the entity library may be preset by operation and maintenance personnel, or may be built by the similar video retrieval device itself; that is, as shown in FIG. 3b, the similar video retrieval device may further include an entity library establishing unit 306.
  • The entity library establishing unit 306 may be configured to set a basic vocabulary, obtain Internet information according to the basic vocabulary, clean non-entity words from the Internet information, and construct triple relationships between entity words according to the basic vocabulary and the cleaned Internet information, to obtain the entity library.
  • For example, the entity library establishing unit 306 may obtain basic classified entity words, such as stars and movies, from the cell thesauruses of some applications as the basic vocabulary, then obtain Internet information according to the basic vocabulary, for example webpages containing encyclopedia data, clean the non-entity words from these webpages, and construct triple relationships between entity words, thereby obtaining an entity library in which these triple relationships are stored.
  • It should be noted that, to ensure the accuracy of retrieval results, the entity library establishing unit 306 may also update the basic vocabulary and the Internet information periodically or in real time, and then update the entity library. For details, refer to the previous embodiment; they are not repeated here.
  • The training unit 303 is configured to train the video information using the preset word2vec model to convert the video information into word vectors.
  • For example, the training unit 303 may be specifically configured to segment the video tag and the video title to obtain segmented video text, and train the segmented video text using the preset word2vec model to obtain the word vectors of the segmented video text.
  • The word2vec model may be preset by operation and maintenance personnel, or pre-built by the similar video retrieval device; that is, the similar video retrieval device may further include a model establishing unit 307, as follows:
  • the model establishing unit 307 may be configured to obtain a preset corpus, segment the sentences in the corpus, and train a preset original model on the segmented sentences to obtain the word2vec model.
  • The content of the corpus can be set according to the needs of the actual application. For example, UGC can be sampled within a certain period of time, and network information such as various encyclopedias can be crawled, to build the corpus, and so on.
  • The screening unit 304 is configured to filter, according to the word vectors, videos similar to the video information from the video library to obtain a second candidate video set.
  • For example, the screening unit 304 may filter videos similar to the video information from the video library according to the word vectors of the segmented video text to obtain the second candidate video set, as follows:
  • the screening unit 304 is specifically configured to calculate, according to the word vectors of the segmented video text, the similarity between the video information and each video in the video library, and select videos whose similarity is higher than a preset threshold to obtain the second candidate video set.
  • The preset threshold may be set according to the requirements of the actual application. The similarity between two videos may be obtained by calculating the dot product of their word vectors, or other similarity algorithms may be used.
  • The determining unit 305 is configured to determine similar videos of the video information according to the first candidate video set and the second candidate video set.
  • For example, the determining unit 305 may be specifically configured to score each video in the first candidate video set to obtain a first score value; score each video in the second candidate video set to obtain a second score value; calculate the weighted sum of the first score value and the corresponding second score value to obtain the comprehensive score value of each video; and determine videos whose comprehensive score value is greater than the preset score value as similar videos of the video information. For details, refer to the foregoing method embodiments; they are not repeated here.
  • In specific implementation, the foregoing units may each be implemented as a separate entity, or may be combined arbitrarily and implemented as one or several entities. For specific implementation, refer to the foregoing method embodiments; details are not described herein.
  • As can be seen from the above, in the similar video retrieval device of this embodiment, the matching unit 302 can obtain videos matching the video information from the video library according to the preset knowledge graph to obtain a first candidate video set; the training unit 303 can train the video information using a preset word2vec model to convert the video information into word vectors, and the screening unit 304 can filter videos similar to the video information from the video library according to the word vectors to obtain a second candidate video set; the determining unit 305 then determines similar videos of the video information according to the first candidate video set and the second candidate video set, achieving the purpose of retrieving similar videos.
  • Because the scheme retrieves similar videos by combining a knowledge graph with text similarity, it can use the knowledge graph to compensate for the low recall rate and the large computing-resource consumption caused by the frequency and scale of corpus updates, and it can use the similarity calculation to supplement the context information of the requested video (i.e., the video for which similar videos are retrieved), avoiding biased recall results caused by ambiguity. Therefore, the scheme can not only improve the recall rate and the accuracy of retrieval results, but also reduce the frequency of model training and save computing resources.
  • the embodiment of the present application further provides a computing device (such as the foregoing server), as shown in FIG. 4, which shows a schematic structural diagram of a computing device according to an embodiment of the present application, specifically:
  • The computing device may include one or more processors 401 having one or more processing cores, a memory 402 of one or more computer readable storage media, a power supply 403, an input unit 404, and other components.
  • A person skilled in the art can understand that the structure shown in FIG. 4 does not constitute a limitation on the computing device, which may include more or fewer components than illustrated, combine certain components, or use a different arrangement of components. Specifically:
  • The processor 401 is the control center of the computing device. It connects the various parts of the entire computing device using various interfaces and lines, and performs the various functions of the computing device and processes data by running or executing the software programs and/or modules stored in the memory 402 and by invoking the data stored in the memory 402, thereby monitoring the computing device as a whole.
  • In some embodiments, the processor 401 may include one or more processing cores. The processor 401 may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, user interface, applications, and the like, and the modem processor mainly handles wireless communications. It can be understood that the modem processor may also not be integrated into the processor 401.
  • The memory 402 can be used to store software programs and modules, and the processor 401 executes various functional applications and performs data processing by running the software programs and modules stored in the memory 402.
  • The memory 402 may mainly include a program storage area and a data storage area, where the program storage area may store the operating system, applications required by at least one function (such as a sound playing function or an image playing function), and the like, and the data storage area may store data created according to the use of the computing device, and the like.
  • In addition, the memory 402 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or another solid-state storage device. Accordingly, the memory 402 may also include a memory controller to provide the processor 401 with access to the memory 402.
  • The computing device also includes a power supply 403 for powering the various components. In some embodiments, the power supply 403 may be logically connected to the processor 401 through a power management system, so that charging, discharging, and power-consumption management are handled through the power management system. The power supply 403 may also include any one or more of a DC or AC power source, a recharging system, a power failure detection circuit, a power converter or inverter, a power status indicator, and other components.
  • The computing device may also include an input unit 404, which can be used to receive input numeric or character information and to generate keyboard, mouse, joystick, optical, or trackball signal inputs related to user settings and function control.
  • the computing device may further include a display unit or the like, which will not be described herein.
  • Specifically, in this embodiment, the processor 401 in the computing device loads the executable file (machine readable instructions) corresponding to the processes of one or more applications into the memory 402, and the processor 401 executes the machine readable instructions stored in the memory 402 (such as an application implementing the similar video retrieval method described above) to implement various functions, as follows:
  • acquire video information for which similar videos need to be retrieved, where the video information includes a video tag and a video title; obtain videos matching the video information from the video library according to the preset knowledge graph to obtain a first candidate video set; train the video information using the preset word2vec model to convert the video information into word vectors; filter videos similar to the video information from the video library according to the word vectors to obtain a second candidate video set; and determine similar videos of the video information according to the first candidate video set and the second candidate video set.
  • For example, the processor 401 executing the machine readable instructions stored in the memory 402 may perform the following operations: segment the video tag and the video title to obtain segmented video text, train the segmented video text using the preset word2vec model to obtain the word vectors of the segmented video text, and then filter videos similar to the video information from the video library according to these word vectors.
  • The word2vec model may be preset by operation and maintenance personnel, or pre-built by the computing device; that is, the processor 401 may also run the application (i.e., machine readable instructions) stored in the memory 402 to implement the following function:
  • obtain a preset corpus, segment the sentences in the corpus, and train a preset original model on the segmented sentences to obtain the word2vec model.
  • The content of the corpus can be set according to the needs of the actual application. For example, UGC can be sampled within a certain period of time, and network information such as various encyclopedias can be crawled, to build the corpus, and so on.
  • As can be seen from the above, the computing device of this embodiment can obtain videos matching the video information from the video library according to the preset knowledge graph to obtain a first candidate video set; it can also train the video information using the preset word2vec model to convert the video information into word vectors, and filter videos similar to the video information from the video library according to the word vectors to obtain a second candidate video set; similar videos of the video information are then determined according to the first candidate video set and the second candidate video set, achieving the purpose of retrieving similar videos.
  • Because the scheme retrieves similar videos by combining a knowledge graph with text similarity, it can use the knowledge graph to compensate for the low recall rate and the large computing-resource consumption caused by the frequency and scale of corpus updates, and it can use the similarity calculation to supplement the context information of the requested video (i.e., the video for which similar videos are retrieved), avoiding biased recall results caused by ambiguity. Therefore, the scheme can not only improve the recall rate and the accuracy of retrieval results, but also reduce the frequency of model training and save computing resources.
  • An embodiment of the present application further provides a storage medium storing machine readable instructions, which can be loaded by a processor to perform the steps in any of the similar video retrieval methods provided by the embodiments of the present application. For example, the machine readable instructions can be executed by a processor to perform the following operations:
  • acquire video information for which similar videos need to be retrieved, where the video information includes a video tag and a video title; obtain videos matching the video information from the video library according to the preset knowledge graph to obtain a first candidate video set; train the video information using the preset word2vec model to convert the video information into word vectors; filter videos similar to the video information from the video library according to the word vectors to obtain a second candidate video set; and determine similar videos of the video information according to the first candidate video set and the second candidate video set.
  • The storage medium may include a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or the like.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Library & Information Science (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of the present application disclose a similar video retrieval method, apparatus, and storage medium. The method includes: acquiring video information for which similar videos need to be retrieved, where the video information includes a video tag and a video title; obtaining, from a video library according to a preset knowledge graph, videos matching the video information to obtain a first candidate video set; training the video information using a preset text depth representation model to convert the video information into word vectors; filtering, from the video library according to the word vectors, videos similar to the video information to obtain a second candidate video set; and determining similar videos of the video information according to the first candidate video set and the second candidate video set.

Description

Similar video retrieval method, apparatus, and storage medium
This application claims priority to Chinese Patent Application No. 201710331203.2, entitled "Similar video retrieval method, apparatus, and storage medium", filed with the Chinese Patent Office on May 11, 2017, the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the field of communications technologies, and in particular to a similar video retrieval method, apparatus, and storage medium.
Background
In the era of information explosion, faced with a massive number of videos, retrieving similar videos accurately and comprehensively is of positive significance for scenarios such as user queries and video recommendation.
When retrieving similar videos, a text depth representation model (word2vec) is generally used for modeling; the similarity of video titles is then calculated based on the model, and similar videos are found according to that similarity. The corpus for training the word2vec model mainly comes from the web, and if the corpus is updated, the word2vec model needs to be retrained.
Summary
The embodiments of the present application provide a similar video retrieval method, apparatus, and storage medium, which can not only improve the recall rate and the accuracy of retrieval results, but also reduce the frequency of model training and save computing resources.
An embodiment of the present application provides a similar video retrieval method, applied to a computing device, including:
acquiring video information for which similar videos need to be retrieved, where the video information includes a video tag and a video title;
obtaining, from a video library according to a preset knowledge graph, videos matching the video information to obtain a first candidate video set;
training the video information using a preset text depth representation model to convert the video information into word vectors;
filtering, from the video library according to the word vectors, videos similar to the video information to obtain a second candidate video set; and
determining similar videos of the video information according to the first candidate video set and the second candidate video set.
Correspondingly, an embodiment of the present application further provides a similar video retrieval apparatus, including:
a processor and a memory connected to the processor, the memory storing machine readable instructions executable by the processor, the processor executing the machine readable instructions to complete the following operations:
acquiring video information for which similar videos need to be retrieved, where the video information includes a video tag and a video title;
obtaining, from a video library according to a preset knowledge graph, videos matching the video information to obtain a first candidate video set;
training the video information using a preset text depth representation model to convert the video information into word vectors;
filtering, from the video library according to the word vectors, videos similar to the video information to obtain a second candidate video set; and
determining similar videos of the video information according to the first candidate video set and the second candidate video set.
An embodiment of the present application further provides a non-volatile computer readable storage medium storing machine readable instructions, the machine readable instructions being executable by a processor to complete the following operations:
acquiring video information for which similar videos need to be retrieved, where the video information includes a video tag and a video title;
obtaining, from a video library according to a preset knowledge graph, videos matching the video information to obtain a first candidate video set;
training the video information using a preset text depth representation model to convert the video information into word vectors;
filtering, from the video library according to the word vectors, videos similar to the video information to obtain a second candidate video set; and
determining similar videos of the video information according to the first candidate video set and the second candidate video set.
Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present application more clearly, the accompanying drawings required for describing the embodiments are briefly introduced below. Apparently, the accompanying drawings in the following description show merely some embodiments of the present application, and a person skilled in the art may still derive other drawings from these accompanying drawings without creative efforts.
FIG. 1a is a schematic diagram of a scenario of the method for retrieving similar videos according to an embodiment of the present application;
FIG. 1b is a schematic diagram of another scenario of the method for retrieving similar videos according to an embodiment of the present application;
FIG. 1c is a flowchart of the method for retrieving similar videos according to an embodiment of the present application;
FIG. 2a is another flowchart of the method for retrieving similar videos according to an embodiment of the present application;
FIG. 2b is a diagram of an example of relation edges in the method for retrieving similar videos according to an embodiment of the present application;
FIG. 3a is a schematic structural diagram of the apparatus for retrieving similar videos according to an embodiment of the present application;
FIG. 3b is another schematic structural diagram of the apparatus for retrieving similar videos according to an embodiment of the present application;
FIG. 4 is a schematic structural diagram of a computing device according to an embodiment of the present application.
Detailed Description
The following clearly and completely describes the technical solutions in the embodiments of the present application with reference to the accompanying drawings in the embodiments of the present application. Apparently, the described embodiments are merely some rather than all of the embodiments of the present application. All other embodiments obtained by a person skilled in the art based on the embodiments of the present application without creative efforts shall fall within the protection scope of the present application.
Traditional methods for retrieving similar videos rely heavily on the quality of word segmentation, the magnitude of the corpus, and the timeliness of corpus updates. If the corpus is not updated in time, correct word segmentation results may not be obtained for newly appearing video information, which in turn affects the training results of the word2vec model and ultimately leads to a low recall rate (R, Recall) and inaccurate retrieval results. If, on the other hand, the corpus is updated too frequently, a large amount of training time and computing resources is consumed, resulting in serious waste of resources.
In view of this, the embodiments of the present application provide a method, an apparatus, and a storage medium for retrieving similar videos, which can not only improve the recall rate and the accuracy of retrieval results, but can also reduce the frequency of model training, saving computing resources.
The apparatus for retrieving similar videos may be integrated into a computing device such as a server or a terminal. For example, taking integration into a server as an example, referring to FIG. 1a, after obtaining the video information of a video for which similar videos need to be retrieved, such as a video tag and a video title, the server may, on the one hand, obtain videos matching the video information from a video library according to a preset knowledge graph to obtain a first candidate video set; on the other hand, it may train the video information with a preset word2vec model (text depth representation model) to convert the video information into word vectors, and filter videos similar to the video information from the video library according to the word vectors to obtain a second candidate video set. Then, the similar videos of the video information are determined according to the first candidate video set and the second candidate video set. Thereafter, the similar videos can be provided for use in various scenarios; for example, referring to FIG. 1b, the similar videos may be provided or recommended to users, or videos may be classified in this way, and so on.
Detailed descriptions are given below.
This embodiment is described from the perspective of the apparatus for retrieving similar videos, which may be integrated into a computing device such as a server or a terminal.
A method for retrieving similar videos, applied to a computing device, includes: obtaining video information of a video for which similar videos need to be retrieved, the video information including a video tag and a video title; obtaining videos matching the video information from a video library according to a preset knowledge graph to obtain a first candidate video set; training the video information with a preset word2vec model to convert the video information into word vectors; filtering videos similar to the video information from the video library according to the word vectors to obtain a second candidate video set; and determining similar videos of the video information according to the first candidate video set and the second candidate video set.
As shown in FIG. 1c, a specific flow of the method for retrieving similar videos may be as follows:
Step 101: Obtain video information of a video for which similar videos need to be retrieved.
For example, the apparatus for retrieving similar videos may receive a retrieval request sent by a terminal, where the retrieval request indicates the video information of the video for which similar videos need to be retrieved; or, when videos need to be classified, a corresponding retrieval request may be generated locally (that is, by the apparatus for retrieving similar videos) or by another device, and the apparatus then obtains, according to the retrieval request, the video information of the video for which similar videos need to be retrieved, and so on.
The video information may include information such as a video tag and a video title. A video tag is information that can be used to indicate the content and/or type of a video, such as movie, TV series, comedy, or adventure film; in some embodiments of the present application, the video tag may also be information associated with the video, such as a star, a director, a scenic spot, or a production company. The video title is the title content of the video, which may include the characters and symbols in the title.
Step 102: Obtain videos matching the video information from a video library according to the preset knowledge graph to obtain a first candidate video set. For example, this may be done as follows:
(1) Extract entity words from the video tag and the video title to obtain seeds (also called seed words).
An entity word is a word with specific semantics, typically a noun that can refer to a particular thing, such as 张三 or TV series XX.
(2) Obtain videos matching the seeds from the video library according to the preset knowledge graph to obtain the first candidate video set.
The video library stores multiple videos, each of which has corresponding video information, and the video information may include information such as a video tag and a video title.
For example, entity words having a strong association relationship with the seeds may be determined according to the preset knowledge graph and taken as candidate words, and videos whose video information contains the candidate words are then obtained from the video library to obtain the first candidate video set.
A strong association relationship means that the relation degree is less than or equal to a set value; that is, if the relation degree between an entity word and the seed is less than or equal to the set value, the entity word has a strong association relationship with the seed. In other words, the step of "determining, according to the preset knowledge graph, entity words having a strong association relationship with the seeds, and taking the entity words as candidate words" may include:
mapping the seeds onto entity words in a preset entity library, determining the relation degree between the seeds and each entity word in the entity library, and selecting entity words whose relation degree is less than or equal to a set value as candidate words, where the entity library may be built according to the preset knowledge graph.
The set value may be set according to the needs of the actual application. For example, entities with a relation degree of 1 or 2 are generally called "close" entities, so 1 degree may be used as the set value, and so on.
For example, the seeds may be mapped onto the corresponding entities in a preset knowledge base by means of Named Entity Linking (NEL) or a similar technique; the number of relation edges between a seed and the other entities in the knowledge base is then obtained, yielding the relation degree between the seed and those entities in the knowledge base.
Named Entity Linking, or Entity Linking for short, is the process of linking a seed to an unambiguous entity in the knowledge base, which includes merging synonymous entities, disambiguating ambiguous entities, and so on.
The number of relation edges describes how close the association between entities is: it is the number of relation edges that must be traversed to get from entity A to entity B in the entity relation graph, and it can generally be expressed as a degree; for convenience of description, this degree is called the relation degree in the embodiments of the present application. The fewer the relation edges, the lower the relation degree (that is, the degree); and the lower the relation degree, the closer an entity is to itself. For example, if an entity points to itself, its relation degree (degree) is 0, and so on, which is not repeated here.
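As an illustration of how the relation degree might be computed, the following is a minimal Python sketch, not part of the application itself: an alias table stands in for entity linking, the entity graph is held as an adjacency list, and a breadth-first search counts the relation edges from a seed's entity to every other entity. The graph data reuses the example entities of FIG. 2b discussed in the second embodiment below; all names are illustrative.

    from collections import deque

    ALIASES = {"琅X榜": "琅X榜(电视剧)"}  # entity linking: seed -> unambiguous entity
    GRAPH = {  # adjacency list; each relation edge joins two entity words
        "琅X榜(电视剧)": ["张三", "王五"],  # leading actors
        "张三": ["琅X榜(电视剧)", "张太太", "李四"],  # wife and partner of 张三
        "王五": ["琅X榜(电视剧)"],
        "张太太": ["张三"],
        "李四": ["张三"],
    }

    def relation_degree(seed, target):
        """Number of relation edges from the seed's entity to the target
        (0 for the entity itself, -1 if the target is unreachable)."""
        start = ALIASES.get(seed, seed)
        if start == target:
            return 0
        seen, queue = {start}, deque([(start, 0)])
        while queue:
            node, degree = queue.popleft()
            for neighbor in GRAPH.get(node, ()):
                if neighbor == target:
                    return degree + 1
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append((neighbor, degree + 1))
        return -1

    # Entity words whose relation degree does not exceed the set value
    # (here 1 degree) become candidate words.
    SET_VALUE = 1
    candidates = [w for w in GRAPH
                  if w != ALIASES["琅X榜"]
                  and 0 < relation_degree("琅X榜", w) <= SET_VALUE]
    # candidates -> ['张三', '王五']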
In some embodiments of the present application, the entity library may be preset by operation and maintenance personnel, or may be built by the apparatus for retrieving similar videos itself; that is, before the step of "mapping the seeds onto entity words in a preset entity library", the method for retrieving similar videos may further include:
setting a base lexicon, obtaining Internet information according to the base lexicon, cleaning non-entity words from the Internet information, and building triple relationships between entity words according to the base lexicon and the cleaned Internet information to obtain the entity library.
For example, basic categorized entity words, such as stars and movies, may be taken from the cell lexicons of some applications as the base lexicon, and Internet information is then obtained according to this base lexicon; for instance, some web pages containing encyclopedia materials may be fetched, the non-entity words in these web pages are cleaned out, and triple relationships between entity words are built, such as (<张三, star>, <belongs to>, <一二三四, TV series>), and so on, thereby obtaining an entity library storing these triple relationships.
It should be noted that, to ensure the accuracy of retrieval results, the base lexicon and the Internet information may be updated periodically or in real time, so as to update the entity library. A specific update may follow the entity-library building procedure described above, or a differential update may be performed: newly added information is first analyzed to determine whether the entity words it contains already exist in the entity library; if they exist, they do not need to be added; if they do not exist, the entity words contained in the newly added information are new entity words, in which case the corresponding Internet information may be fetched and cleaned of non-entity words, triple relationships between the new entity words and the entity words in the cleaned Internet information are built, and the built triple relationships are added to the entity library.
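A minimal sketch of such a differential update is given below; it is illustrative only, and fetch_and_clean is a hypothetical helper standing in for fetching Internet information (for example, encyclopedia pages) and cleaning non-entity words. Triples are stored as ((head word, head type), relation, (tail word, tail type)), matching the example above.

    entity_library = {
        (("张三", "star"), "belongs to", ("一二三四", "TV series")),
    }

    def known_entity_words(library):
        """Collect every entity word already stored in the library."""
        words = set()
        for (head, _), _, (tail, _) in library:
            words.update((head, tail))
        return words

    def differential_update(library, new_words, fetch_and_clean):
        """Build triples only for entity words the library does not yet contain;
        fetch_and_clean(word) is assumed to return the new word's triples."""
        existing = known_entity_words(library)
        for word in new_words:
            if word not in existing:  # a new entity word
                library.update(fetch_and_clean(word))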
Step 103: Train the video information with the preset word2vec model to convert the video information into word vectors.
For example, word segmentation may be performed on the video tag and the video title to obtain segmented video text; the segmented video text is then trained with the preset word2vec model to obtain the word vectors of the segmented video text, that is, the word vector corresponding to each word in the segmented video text.
A word vector, as the name implies, expresses a word as a vector. It should be noted that each word vector obtained by training has a fixed dimension, which effectively prevents dimension explosion and reduces the amount of computation in subsequent similarity calculations.
The word2vec model may be preset by operation and maintenance personnel, or may be built in advance by the apparatus for retrieving similar videos; that is, before the step of "training the segmented video text with the preset word2vec model", the method for retrieving similar videos may further include:
obtaining a preset corpus, performing word segmentation on the sentences in the corpus, and training a preset original model on the segmented sentences to obtain the word2vec model.
The content of the corpus may be set according to the needs of the actual application; for example, user generated content (UGC) within a certain period may be sampled, and network information, such as various encyclopedia corpora, may be crawled, to build the corpus, and so on.
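As an illustration, such a word2vec model could be trained with the open-source gensim library roughly as follows; this is a sketch under the assumption that the corpus has already been segmented, and the two sentences shown are placeholders rather than the actual corpus.

    from gensim.models import Word2Vec

    # Placeholder corpus: each sentence is a list of segmented words, standing in
    # for sampled UGC and crawled encyclopedia text.
    corpus = [
        ["琅X榜", "第一集", "古装", "电视剧"],
        ["某某", "冒险片", "第二集", "预告"],
    ]

    # Every word is mapped to a vector of fixed dimension (vector_size), which
    # keeps subsequent similarity computations cheap.
    model = Word2Vec(sentences=corpus, vector_size=100, window=5, min_count=1)

    vector = model.wv["电视剧"]  # the trained word vector for one token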
It should be noted that steps 102 and 103 may be performed in any order.
Step 104: Filter videos similar to the video information from the video library according to the word vectors to obtain a second candidate video set.
For example, if the word vectors of the segmented video text are obtained by training in step 103, videos similar to the video information may be filtered from the video library according to the word vectors of the segmented video text to obtain the second candidate video set, for example, as follows:
According to the word vectors of the segmented video text, the similarity between the video information and each video in the video library is calculated separately, and videos whose similarity is higher than a preset threshold are selected to obtain the second candidate video set.
The preset threshold may be set according to the needs of the actual application, and the similarity between two videos may be obtained by calculating the dot product of the word vectors of the two videos, or may be computed with another similarity algorithm, which is not repeated here.
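One simple reading of this dot-product similarity is sketched below; it is illustrative, reducing each video to the mean of its word vectors and normalising so that the dot product behaves like a cosine similarity, and it assumes every token is in the model's vocabulary.

    import numpy as np

    def video_vector(tokens, model):
        """Normalised mean of the word vectors of the segmented video text."""
        v = np.mean([model.wv[t] for t in tokens], axis=0)
        return v / np.linalg.norm(v)

    def second_candidate_set(query_tokens, video_library, model, threshold=0.8):
        """video_library maps video id -> segmented tag-and-title text; videos
        whose similarity to the query exceeds the preset threshold are kept."""
        q = video_vector(query_tokens, model)
        return {vid for vid, tokens in video_library.items()
                if float(np.dot(q, video_vector(tokens, model))) > threshold}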
Step 105: Determine similar videos of the video information according to the first candidate video set and the second candidate video set.
For example, the videos in the first candidate video set and the second candidate video set may be scored separately, a comprehensive score is calculated for each video from these scores, and videos with higher comprehensive scores, for example videos whose comprehensive score is greater than a preset score, are then determined as the similar videos of the video information, and so on.
In some embodiments of the present application, to improve flexibility, fine-tuning may also be performed by setting corresponding weights for the videos in the first candidate video set and the second candidate video set, making the retrieval results more accurate; that is, the step of "determining similar videos of the video information according to the first candidate video set and the second candidate video set" may include:
scoring the videos in the first candidate video set separately to obtain first scores; scoring the videos in the second candidate video set separately to obtain second scores; calculating the weighted value of each first score and the corresponding second score to obtain the comprehensive score of each video; and determining videos whose comprehensive score is greater than a preset score as the similar videos of the video information, which can be expressed by the following formula:
S = α*A + β*B;
where S is the comprehensive score of a video X, A is the first score of video X in the first candidate video set, B is the second score of video X in the second candidate video set, α is the weight of the first score (that is, the weight of the videos in the first candidate video set), β is the weight of the second score (that is, the weight of the videos in the second candidate video set), and the sum of α and β is 1. The specific values of α and β may be set according to the needs of the actual application, for example based on user feedback.
It should be noted that the scoring ranges of the first score and the second score may be set according to the needs of the actual application; for example, they may be set within [0, 1], and so on. It should also be noted that if a video X does not exist in the second candidate video set, the second score of video X is 0; similarly, if a video L in the second candidate video set is not in the first candidate video set, the first score of video L is 0.
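A minimal sketch of this weighted combination follows; the weights, score values, and video names are illustrative.

    def comprehensive_scores(first_scores, second_scores, alpha=0.6, beta=0.4):
        """S = alpha*A + beta*B; a video absent from one candidate set
        contributes a score of 0 for that set; alpha and beta sum to 1."""
        assert abs(alpha + beta - 1.0) < 1e-9
        videos = set(first_scores) | set(second_scores)
        return {v: alpha * first_scores.get(v, 0.0) + beta * second_scores.get(v, 0.0)
                for v in videos}

    first = {"video_x": 0.9}                   # first scores, in [0, 1]
    second = {"video_x": 0.7, "video_l": 0.8}  # video_l is absent from the first set
    similar = [v for v, s in comprehensive_scores(first, second).items() if s > 0.5]
    # video_x: 0.6*0.9 + 0.4*0.7 = 0.82 > 0.5; video_l: 0.6*0.0 + 0.4*0.8 = 0.32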
As can be seen from the above, in this embodiment, after the video information of a video for which similar videos need to be retrieved is obtained, videos matching the video information can, on the one hand, be obtained from the video library according to the preset knowledge graph to obtain the first candidate video set; on the other hand, the video information can be trained with the preset word2vec model, and videos similar to the video information can be filtered from the video library according to the training results to obtain the second candidate video set; the similar videos of the video information are then determined according to the first candidate video set and the second candidate video set, thereby achieving the purpose of retrieving similar videos. Since this solution combines a knowledge graph with text similarity to retrieve similar videos, the knowledge graph can be used to compensate for the low recall rate and the large consumption of computing resources caused by the frequency and quantity of corpus updates, and the similarity calculation can be used to add context information about the requested video (that is, the video to be retrieved), constraining the deviation of recall results caused by polysemy. Therefore, this solution can not only improve the recall rate and the accuracy of retrieval results, but can also reduce the frequency of model training, saving computing resources.
According to the method described in the foregoing embodiment, a further detailed description is given below by way of example.
In this embodiment, the description takes the apparatus for retrieving similar videos being integrated into a server as an example.
As shown in FIG. 2a, a specific flow of a method for retrieving similar videos may be as follows:
Step 201: The server obtains video information of a video for which similar videos need to be retrieved.
For example, the server may receive a retrieval request sent by a terminal, where the retrieval request indicates the video information of the video for which similar videos need to be retrieved; or, when videos need to be classified, a corresponding retrieval request may be generated locally (that is, at the server) or by another device, and the server then obtains, according to the retrieval request, the video information of the video for which similar videos need to be retrieved, and so on.
The video information may include information such as a video tag and a video title. A video tag is information that can be used to indicate the content and/or type of a video, such as movie, TV series, comedy, or adventure film; in some embodiments of the present application, the video tag may also be information associated with the video, such as a star, a director, a scenic spot, or a production company. The video title is the title content of the video, which may include the characters and symbols in the title.
Step 202: The server extracts entity words from the video information, such as the video tag and the video title, to obtain seeds.
For example, suppose the video for which similar videos need to be retrieved is video K, video K is the costume TV series "琅X榜", its video title is "琅X榜第一集" (episode 1 of 琅X榜), and its video tags are "古装" (costume drama) and "电视剧" (TV series). Entity words such as "琅X榜", "古装", and "电视剧" may then be extracted from the video title and video tags of video K to obtain the seeds.
Step 203: The server determines, according to the preset knowledge graph, entity words having a strong association relationship with the seeds, takes the entity words as candidate words, and obtains, from the video library, videos whose video information contains the candidate words to obtain the first candidate video set.
A strong association relationship means that the relation degree is less than or equal to a set value; that is, if the relation degree between an entity word and the seed is less than or equal to the set value, the entity word has a strong association relationship with the seed. For example, words identical to the seed, near-synonyms and synonyms of the seed, and words having a preset specific relationship with the seed can all be regarded as having a strong association relationship with the seed. In other words, the step of "the server determining, according to the preset knowledge graph, entity words having a strong association relationship with the seeds, and taking the entity words as candidate words" may include:
The server maps the seeds onto entity words in a preset entity library, determines the relation degree between the seeds and each entity word in the entity library, and selects entity words whose relation degree is less than or equal to a set value as candidate words.
The set value may be set according to the needs of the actual application, and the entity library may be built according to the preset knowledge graph. For example, the seeds may be mapped onto the corresponding entities in a preset knowledge base by means of NEL or a similar technique; the number of relation edges between a seed and the other entities in the knowledge base is then obtained, yielding the relation degree between the seed and those entities in the knowledge base.
For example, still taking the seed "琅X榜" of video K as an example, as shown in FIG. 2b, the leading actors of the TV series 《琅X榜》 are the stars "张三" and "王五", the wife of "张三" is "张太太", and his partner is "李四". Accordingly, the number of relation edges between the seed "琅X榜" and the entity word "王五" is 1, and the corresponding relation degree is 1 degree; the number of relation edges between the seed "琅X榜" and the entity word "张三" is 1, and the corresponding relation degree is 1 degree; the number of relation edges between the seed "琅X榜" and the entity word "张太太" is 2, and the corresponding relation degree is 2 degrees; the number of relation edges between the seed "琅X榜" and the entity word "李四" is 2, and the corresponding relation degree is 2 degrees. If the set value is 1 degree, "张三" and "王五" may be taken as candidate words.
In some embodiments of the present application, the entity library may be preset by operation and maintenance personnel, or may be built by the server, for example, as follows:
The server sets a base lexicon, obtains Internet information according to the base lexicon, cleans non-entity words from the Internet information, and builds triple relationships between entity words according to the base lexicon and the cleaned Internet information to obtain the entity library.
For example, basic categorized entity words, such as stars and movies, may be taken from the cell lexicons of some applications as the base lexicon, and Internet information is then obtained according to this base lexicon; for instance, some web pages containing encyclopedia materials may be fetched, the non-entity words in these web pages are cleaned out, and triple relationships between entity words are built, such as (<张三, star>, <belongs to>, <琅X榜, TV series>), and so on, thereby obtaining an entity library storing these triple relationships.
It should be noted that, to ensure the accuracy of retrieval results, the base lexicon and the Internet information may be updated periodically or in real time, so as to update the entity library; for details, refer to the description in the foregoing embodiment, which is not repeated here.
Step 204: The server performs word segmentation on the video tag and the video title to obtain segmented video text.
For example, still taking video K as an example, whose video title is "琅X榜第一集" and whose video tags are "古装" and "电视剧", word segmentation may be performed on these texts: the video title "琅X榜第一集" is divided into "琅X榜" and "第一集", the video tag "古装" is segmented as "古装", and the video tag "电视剧" is segmented as "电视剧", and so on, yielding the segmented video text.
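A sketch of this segmentation step is shown below, assuming the open-source jieba segmenter; without the user words registered first, the exact split may differ.

    import jieba

    # Registering the entities as user words keeps them from being split apart.
    jieba.add_word("琅X榜")
    jieba.add_word("第一集")

    tokens = jieba.lcut("琅X榜第一集") + jieba.lcut("古装") + jieba.lcut("电视剧")
    # tokens -> ['琅X榜', '第一集', '古装', '电视剧']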
It should be noted that steps 202 and 204 may be performed in any order.
Step 205: The server trains the segmented video text with the preset word2vec model to obtain the word vectors of the segmented video text.
For example, if the segmented video text obtained in step 204 includes "琅X榜", "第一集", "古装", and "电视剧", the preset word2vec model may be used to train these segmented words separately to obtain the word vector corresponding to each segmented word.
The word2vec model may be preset by operation and maintenance personnel, or may be built in advance by the apparatus for retrieving similar videos. For example, the server may obtain a preset corpus, perform word segmentation on the sentences in the corpus, and then train a preset original model on the segmented sentences to obtain the word2vec model.
The content of the corpus may be set according to the needs of the actual application; for example, UGC within a certain period, such as the UGC of a whole year, may be sampled, and network information, such as various encyclopedia corpora, may be crawled, to build the corpus, and so on, which is not repeated here.
Step 206: The server filters videos similar to the video information from the video library according to the word vectors of the segmented video text to obtain a second candidate video set. For example, this may be done as follows:
According to the word vectors of the segmented video text, the server calculates the similarity between the video information and each video in the video library separately, and selects videos whose similarity is higher than a preset threshold to obtain the second candidate video set.
The preset threshold may be set according to the needs of the actual application, and the similarity between two videos may be obtained by calculating the dot product of the word vectors of the two videos, or may be computed with another similarity algorithm, which is not repeated here.
Step 207: The server determines similar videos of the video information according to the first candidate video set and the second candidate video set.
For example, the server may score the videos in the first candidate video set and the second candidate video set separately, calculate the comprehensive score of each video from these scores, and then determine videos with higher comprehensive scores, for example videos whose comprehensive score is greater than a preset score, as the similar videos of the video information, and so on.
In some embodiments of the present application, to improve flexibility, fine-tuning may also be performed by setting corresponding weights for the videos in the first candidate video set and the second candidate video set, making the retrieval results more accurate; that is, the step of "the server determining similar videos of the video information according to the first candidate video set and the second candidate video set" may include:
(1) The server scores the videos in the first candidate video set separately to obtain first scores.
(2) The server scores the videos in the second candidate video set separately to obtain second scores.
(3) The server calculates the weighted value of each first score and the corresponding second score to obtain the comprehensive score of each video.
(4) The server determines videos whose comprehensive score is greater than a preset score as the similar videos of the video information, which can be expressed by the following formula:
S = α*A + β*B;
where S is the comprehensive score of a video X, A is the first score of video X in the first candidate video set, B is the second score of video X in the second candidate video set, α is the weight of the first score (that is, the weight of the videos in the first candidate video set), β is the weight of the second score (that is, the weight of the videos in the second candidate video set), and the sum of α and β is 1. The specific values of α and β may be set according to the needs of the actual application, for example based on user feedback.
It should be noted that the scoring ranges of the first score and the second score may be set according to the needs of the actual application; for example, they may be set within [0, 1], and so on. It should also be noted that if a video X does not exist in the second candidate video set, the second score of video X is 0; similarly, if a video L in the second candidate video set is not in the first candidate video set, the first score of video L is 0.
As can be seen from the above, in this embodiment, after the video information of a video for which similar videos need to be retrieved is obtained, videos matching the video information can, on the one hand, be obtained from the video library according to the preset knowledge graph to obtain the first candidate video set; on the other hand, the video information can be trained with the preset word2vec model, and videos similar to the video information can be filtered from the video library according to the training results to obtain the second candidate video set; the similar videos of the video information are then determined according to the first candidate video set and the second candidate video set, thereby achieving the purpose of retrieving similar videos. Since this solution combines a knowledge graph with text similarity to retrieve similar videos, the knowledge graph can be used to compensate for the low recall rate and the large consumption of computing resources caused by the frequency and quantity of corpus updates, and the similarity calculation can be used to add context information about the requested video (that is, the video to be retrieved), constraining the deviation of recall results caused by polysemy. Therefore, this solution can not only improve the recall rate and the accuracy of retrieval results, but can also reduce the frequency of model training, saving computing resources.
To better implement the foregoing method, an embodiment of the present application further provides an apparatus for retrieving similar videos, which may be integrated into a computing device such as a server or a terminal.
For example, as shown in FIG. 3a, the apparatus for retrieving similar videos may include an obtaining unit 301, a matching unit 302, a training unit 303, a filtering unit 304, and a determining unit 305.
The obtaining unit 301 is configured to obtain video information of a video for which similar videos need to be retrieved, the video information including a video tag and a video title.
The video information may include information such as a video tag and a video title. A video tag is information that can be used to indicate the content and/or type of a video, as well as information associated with the video. The video title is the title content of the video, which may include the characters and symbols in the title.
The matching unit 302 is configured to obtain videos matching the video information from a video library according to a preset knowledge graph to obtain a first candidate video set.
For example, the matching unit 302 may include an extraction subunit and a matching subunit.
The extraction subunit may be configured to extract entity words from the video tag and the video title to obtain seeds.
The matching subunit may be configured to obtain videos matching the seeds from the video library according to the preset knowledge graph to obtain the first candidate video set.
For example, the matching subunit may be configured to determine, according to the preset knowledge graph, entity words having a strong association relationship with the seeds, take the entity words as candidate words, and obtain, from the video library, videos whose video information contains the candidate words to obtain the first candidate video set.
A strong association relationship means that the relation degree is less than or equal to a set value; that is, if the relation degree between an entity word and the seed is less than or equal to the set value, the entity word has a strong association relationship with the seed. That is:
The matching subunit may be configured to map the seeds onto entity words in a preset entity library, determine the relation degree between the seeds and each entity word in the entity library, and select entity words whose relation degree is less than or equal to a set value as candidate words, where the entity library may be built according to the preset knowledge graph.
The set value may be set according to the needs of the actual application. For example, entities with a relation degree of 1 or 2 are generally called "close" entities, so 1 degree may be used as the set value, and so on.
For example, the seeds may be mapped onto the corresponding entities in a preset knowledge base by means of NEL or a similar technique; the number of relation edges between a seed and the other entities in the knowledge base is then obtained, yielding the relation degree between the seed and those entities in the knowledge base.
In some embodiments of the present application, the entity library may be preset by operation and maintenance personnel, or may be built by the apparatus for retrieving similar videos itself; that is, as shown in FIG. 3b, the apparatus for retrieving similar videos may further include an entity library building unit 306.
The entity library building unit 306 may be configured to set a base lexicon, obtain Internet information according to the base lexicon, clean non-entity words from the Internet information, and build triple relationships between entity words according to the base lexicon and the cleaned Internet information to obtain the entity library.
For example, the entity library building unit 306 may take basic categorized entity words, such as stars and movies, from the cell lexicons of some applications as the base lexicon, then obtain Internet information according to this base lexicon, for instance by fetching some web pages containing encyclopedia materials, clean out the non-entity words in these web pages, and build triple relationships between entity words, thereby obtaining an entity library storing these triple relationships.
It should be noted that, to ensure the accuracy of retrieval results, the entity library building unit 306 may also update the base lexicon and the Internet information periodically or in real time, so as to update the entity library; for details, refer to the foregoing embodiments, which are not repeated here.
The training unit 303 is configured to train the video information with a preset word2vec model to convert the video information into word vectors.
For example, the training unit 303 may be configured to perform word segmentation on the video tag and the video title to obtain segmented video text, and to train the segmented video text with the preset word2vec model to obtain the word vectors of the segmented video text.
The word2vec model may be preset by operation and maintenance personnel, or may be built in advance by the apparatus for retrieving similar videos; that is, as shown in FIG. 3b, the apparatus for retrieving similar videos may further include a model building unit 307, as follows:
The model building unit 307 may be configured to obtain a preset corpus, perform word segmentation on the sentences in the corpus, and train a preset original model on the segmented sentences to obtain the word2vec model.
The content of the corpus may be set according to the needs of the actual application; for example, UGC within a certain period may be sampled, and network information, such as various encyclopedia corpora, may be crawled, to build the corpus, and so on; for details, refer to the foregoing embodiments, which are not repeated here.
The filtering unit 304 is configured to filter videos similar to the video information from the video library according to the word vectors to obtain a second candidate video set.
For example, if the training unit 303 obtains the word vectors of the segmented video text by training, the filtering unit 304 may filter videos similar to the video information from the video library according to the word vectors of the segmented video text to obtain the second candidate video set, for example, as follows:
The filtering unit 304 may be configured to calculate, according to the word vectors of the segmented video text, the similarity between the video information and each video in the video library separately, and to select videos whose similarity is higher than a preset threshold to obtain the second candidate video set.
The preset threshold may be set according to the needs of the actual application, and the similarity between two videos may be obtained by calculating the dot product of the word vectors of the two videos, or may be computed with another similarity algorithm.
The determining unit 305 is configured to determine similar videos of the video information according to the first candidate video set and the second candidate video set.
For example, the determining unit 305 may be configured to: score the videos in the first candidate video set separately to obtain first scores; score the videos in the second candidate video set separately to obtain second scores; calculate the weighted value of each first score and the corresponding second score to obtain the comprehensive score of each video; and determine videos whose comprehensive score is greater than a preset score as the similar videos of the video information; for details, refer to the foregoing embodiments, which are not repeated here.
In specific implementation, the foregoing units may be implemented as independent entities, or combined arbitrarily and implemented as one or several entities; for the specific implementation of the foregoing units, refer to the foregoing method embodiments, which are not repeated here.
As can be seen from the above, after the apparatus for retrieving similar videos provided in this embodiment obtains the video information of a video for which similar videos need to be retrieved, the matching unit 302 can, on the one hand, obtain videos matching the video information from the video library according to the preset knowledge graph to obtain the first candidate video set; on the other hand, the training unit 303 can train the video information with the preset word2vec model to convert the video information into word vectors, and the filtering unit 304 filters videos similar to the video information from the video library according to the word vectors to obtain the second candidate video set; the determining unit 305 then determines the similar videos of the video information according to the first candidate video set and the second candidate video set, thereby achieving the purpose of retrieving similar videos. Since this solution combines a knowledge graph with text similarity to retrieve similar videos, the knowledge graph can be used to compensate for the low recall rate and the large consumption of computing resources caused by the frequency and quantity of corpus updates, and the similarity calculation can be used to add context information about the requested video (that is, the video to be retrieved), constraining the deviation of recall results caused by polysemy. Therefore, this solution can not only improve the recall rate and the accuracy of retrieval results, but can also reduce the frequency of model training, saving computing resources.
An embodiment of the present application further provides a computing device (such as the foregoing server). FIG. 4 shows a schematic structural diagram of the computing device involved in the embodiments of the present application. Specifically:
The computing device may include components such as a processor 401 with one or more processing cores, a memory 402 with one or more computer readable storage media, a power supply 403, and an input unit 404. A person skilled in the art can understand that the computing device structure shown in FIG. 4 does not constitute a limitation on the computing device, which may include more or fewer components than shown, combine certain components, or use a different component arrangement. Specifically:
The processor 401 is the control center of the computing device and connects all parts of the entire computing device by using various interfaces and lines. By running or executing the software programs and/or modules stored in the memory 402 and calling the data stored in the memory 402, the processor 401 performs the various functions of the computing device and processes data, thereby monitoring the computing device as a whole. In some embodiments of the present application, the processor 401 may include one or more processing cores; the processor 401 may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, user interfaces, application programs, and the like, and the modem processor mainly handles wireless communication. It can be understood that the modem processor may alternatively not be integrated into the processor 401.
The memory 402 may be configured to store software programs and modules, and the processor 401 executes various functional applications and performs data processing by running the software programs and modules stored in the memory 402. The memory 402 may mainly include a program storage area and a data storage area: the program storage area may store the operating system, the application programs required by at least one function (such as a sound playback function and an image playback function), and the like; the data storage area may store data created according to the use of the computing device, and the like. In addition, the memory 402 may include a high-speed random access memory, and may further include a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or another solid-state storage device. Correspondingly, the memory 402 may further include a memory controller to provide the processor 401 with access to the memory 402.
The computing device further includes a power supply 403 that supplies power to the components. In some embodiments of the present application, the power supply 403 may be logically connected to the processor 401 through a power management system, so that functions such as charge management, discharge management, and power consumption management are implemented through the power management system. The power supply 403 may further include any component such as one or more direct current or alternating current power supplies, a recharging system, a power failure detection circuit, a power converter or inverter, and a power status indicator.
The computing device may further include an input unit 404, which may be configured to receive input digit or character information and to generate keyboard, mouse, joystick, optical, or trackball signal inputs related to user settings and function control.
Although not shown, the computing device may further include a display unit and the like, which are not described here. Specifically, in this embodiment, the processor 401 in the computing device loads the executable files (machine readable instructions) corresponding to the processes of one or more application programs into the memory 402 according to the following instructions, and runs the machine readable instructions stored in the memory 402 (such as an application program implementing the foregoing method for retrieving similar videos), thereby implementing various functions, as follows:
obtaining video information of a video for which similar videos need to be retrieved, the video information including a video tag and a video title; obtaining videos matching the video information from a video library according to a preset knowledge graph to obtain a first candidate video set; training the video information with a preset word2vec model to convert the video information into word vectors; filtering videos similar to the video information from the video library according to the word vectors to obtain a second candidate video set; and determining similar videos of the video information according to the first candidate video set and the second candidate video set.
The processor 401 may execute the machine readable instructions stored in the memory 402 to perform the following operations:
extracting entity words from the video tag and the video title to obtain seeds, and then obtaining videos matching the seeds from the video library according to the preset knowledge graph to obtain the first candidate video set; and performing word segmentation on the video tag and the video title to obtain segmented video text, training the segmented video text with the preset word2vec model to obtain the word vectors of the segmented video text, and then filtering videos similar to the video information from the video library according to the word vectors of the segmented video text to obtain the second candidate video set.
The word2vec model may be preset by operation and maintenance personnel, or may be built in advance by the computing device; that is, the processor 401 may further run the application program (i.e., the machine readable instructions) stored in the memory 402 to implement the following functions:
obtaining a preset corpus, performing word segmentation on the sentences in the corpus, and training a preset original model on the segmented sentences to obtain the word2vec model.
The content of the corpus may be set according to the needs of the actual application; for example, UGC within a certain period may be sampled, and network information, such as various encyclopedia corpora, may be crawled, to build the corpus, and so on.
For the specific implementation of the foregoing operations, refer to the foregoing embodiments, which are not repeated here.
As can be seen from the above, after the computing device of this embodiment obtains the video information of a video for which similar videos need to be retrieved, it can, on the one hand, obtain videos matching the video information from the video library according to the preset knowledge graph to obtain the first candidate video set; on the other hand, it can train the video information with the preset word2vec model to convert the video information into word vectors, and filter videos similar to the video information from the video library according to the word vectors to obtain the second candidate video set; the similar videos of the video information are then determined according to the first candidate video set and the second candidate video set, thereby achieving the purpose of retrieving similar videos. Since this solution combines a knowledge graph with text similarity to retrieve similar videos, the knowledge graph can be used to compensate for the low recall rate and the large consumption of computing resources caused by the frequency and quantity of corpus updates, and the similarity calculation can be used to add context information about the requested video (that is, the video to be retrieved), constraining the deviation of recall results caused by polysemy. Therefore, this solution can not only improve the recall rate and the accuracy of retrieval results, but can also reduce the frequency of model training, saving computing resources.
A person of ordinary skill in the art can understand that all or some of the steps in the various methods of the foregoing embodiments may be completed by instructions, or by instructions controlling related hardware; the instructions may be stored in a non-volatile computer readable storage medium and loaded and executed by a processor.
To this end, an embodiment of the present application provides a storage medium storing a plurality of machine readable instructions, which can be loaded by a processor to perform the steps in any method for retrieving similar videos provided by the embodiments of the present application. For example, the machine readable instructions may be executed by a processor to perform the following operations:
obtaining video information of a video for which similar videos need to be retrieved, the video information including a video tag and a video title; obtaining videos matching the video information from a video library according to a preset knowledge graph to obtain a first candidate video set; training the video information with a preset word2vec model to convert the video information into word vectors; filtering videos similar to the video information from the video library according to the word vectors to obtain a second candidate video set; and determining similar videos of the video information according to the first candidate video set and the second candidate video set.
For the specific implementation of the foregoing operations, refer to the foregoing embodiments, which are not repeated here.
The storage medium may include: a read only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or the like.
Since the instructions stored in the storage medium can perform the steps in any method for retrieving similar videos provided by the embodiments of the present application, they can achieve the beneficial effects achievable by any method for retrieving similar videos provided by the embodiments of the present application; for details, refer to the foregoing embodiments, which are not repeated here.
The method, apparatus, and storage medium for retrieving similar videos provided by the embodiments of the present application are described in detail above. Specific examples are used herein to explain the principles and implementations of the present application, and the descriptions of the foregoing embodiments are merely intended to help understand the method of the present application and its core ideas. Meanwhile, a person skilled in the art may make changes to the specific implementations and the application scope according to the ideas of the present application. In conclusion, the content of this specification should not be construed as a limitation on the present application.

Claims (23)

  1. A method for retrieving similar videos, applied to a computing device, comprising:
    obtaining video information of a video for which similar videos need to be retrieved, wherein the video information comprises a video tag and a video title;
    obtaining videos matching the video information from a video library according to a preset knowledge graph to obtain a first candidate video set;
    training the video information with a preset text depth representation model to convert the video information into word vectors;
    filtering videos similar to the video information from the video library according to the word vectors to obtain a second candidate video set; and
    determining similar videos of the video information according to the first candidate video set and the second candidate video set.
  2. The method according to claim 1, wherein the obtaining videos matching the video information from a video library according to a preset knowledge graph to obtain a first candidate video set comprises:
    extracting entity words from the video tag and the video title to obtain seeds; and
    obtaining videos matching the seeds from the video library according to the preset knowledge graph to obtain the first candidate video set.
  3. The method according to claim 2, wherein the obtaining videos matching the seeds from the video library according to the preset knowledge graph to obtain the first candidate video set comprises:
    determining, according to the preset knowledge graph, entity words having a strong association relationship with the seeds, and taking the entity words as candidate words; and
    obtaining, from the video library, videos whose video information contains the candidate words to obtain the first candidate video set.
  4. The method according to claim 3, wherein the determining, according to the preset knowledge graph, entity words having a strong association relationship with the seeds, and taking the entity words as candidate words comprises:
    mapping the seeds onto entity words in a preset entity library, wherein the entity library is built according to the preset knowledge graph;
    determining the relation degree between the seeds and each entity word in the entity library; and
    selecting entity words whose relation degree is less than or equal to a set value as the candidate words.
  5. The method according to claim 4, before the mapping the seeds onto entity words in a preset entity library, further comprising:
    setting a base lexicon;
    obtaining Internet information according to the base lexicon, and cleaning non-entity words from the Internet information; and
    building triple relationships between entity words according to the base lexicon and the cleaned Internet information to obtain the entity library.
  6. The method according to any one of claims 1 to 5, wherein the training the video information with a preset text depth representation model to convert the video information into word vectors comprises:
    performing word segmentation on the video tag and the video title to obtain segmented video text; and
    training the segmented video text with the preset text depth representation model to obtain word vectors of the segmented video text;
    and the filtering videos similar to the video information from the video library according to the word vectors to obtain a second candidate video set comprises:
    filtering videos similar to the video information from the video library according to the word vectors of the segmented video text to obtain the second candidate video set.
  7. The method according to claim 6, wherein the filtering videos similar to the video information from the video library according to the word vectors of the segmented video text to obtain the second candidate video set comprises:
    calculating, according to the word vectors of the segmented video text, the similarity between the video information and each video in the video library separately; and
    selecting videos whose similarity is higher than a preset threshold to obtain the second candidate video set.
  8. The method according to claim 6, before the training the segmented video text with the preset text depth representation model, further comprising:
    obtaining a preset corpus, and performing word segmentation on the sentences in the corpus; and
    learning a preset original model from the segmented sentences to obtain the text depth representation model.
  9. The method according to any one of claims 1 to 5, wherein the determining similar videos of the video information according to the first candidate video set and the second candidate video set comprises:
    scoring the videos in the first candidate video set separately to obtain first scores;
    scoring the videos in the second candidate video set separately to obtain second scores;
    calculating the weighted value of each first score and the corresponding second score to obtain a comprehensive score of each video; and
    determining videos whose comprehensive score is greater than a preset score as the similar videos of the video information.
  10. An apparatus for retrieving similar videos, comprising:
    a processor and a memory connected to the processor, the memory storing machine readable instructions executable by the processor, and the processor executing the machine readable instructions to perform the following operations:
    obtaining video information of a video for which similar videos need to be retrieved, wherein the video information comprises a video tag and a video title;
    obtaining videos matching the video information from a video library according to a preset knowledge graph to obtain a first candidate video set;
    training the video information with a preset text depth representation model to convert the video information into word vectors;
    filtering videos similar to the video information from the video library according to the word vectors to obtain a second candidate video set; and
    determining similar videos of the video information according to the first candidate video set and the second candidate video set.
  11. The apparatus according to claim 10, wherein the processor executes the machine readable instructions to perform the following operations:
    extracting entity words from the video tag and the video title to obtain seeds; and
    obtaining videos matching the seeds from the video library according to the preset knowledge graph to obtain the first candidate video set.
  12. The apparatus according to claim 11, wherein the processor executes the machine readable instructions to perform the following operations:
    determining, according to the preset knowledge graph, entity words having a strong association relationship with the seeds, taking the entity words as candidate words, and obtaining, from the video library, videos whose video information contains the candidate words to obtain the first candidate video set.
  13. The apparatus according to claim 12, wherein the processor executes the machine readable instructions to perform the following operations:
    mapping the seeds onto entity words in a preset entity library, wherein the entity library is built according to the preset knowledge graph, determining the relation degree between the seeds and each entity word in the entity library, and selecting entity words whose relation degree is less than or equal to a set value as the candidate words.
  14. The apparatus according to any one of claims 10 to 13, wherein the processor executes the machine readable instructions to perform the following operations:
    performing word segmentation on the video tag and the video title to obtain segmented video text, and training the segmented video text with the preset text depth representation model to obtain word vectors of the segmented video text; and
    filtering videos similar to the video information from the video library according to the word vectors of the segmented video text to obtain the second candidate video set.
  15. A non-volatile computer readable storage medium storing machine readable instructions, the machine readable instructions being executable by a processor to perform the following operations:
    obtaining video information of a video for which similar videos need to be retrieved, wherein the video information comprises a video tag and a video title;
    obtaining videos matching the video information from a video library according to a preset knowledge graph to obtain a first candidate video set;
    training the video information with a preset text depth representation model to convert the video information into word vectors;
    filtering videos similar to the video information from the video library according to the word vectors to obtain a second candidate video set; and
    determining similar videos of the video information according to the first candidate video set and the second candidate video set.
  16. The non-volatile computer readable storage medium according to claim 15, wherein the obtaining videos matching the video information from a video library according to a preset knowledge graph to obtain a first candidate video set comprises:
    extracting entity words from the video tag and the video title to obtain seeds; and
    obtaining videos matching the seeds from the video library according to the preset knowledge graph to obtain the first candidate video set.
  17. The non-volatile computer readable storage medium according to claim 16, wherein the obtaining videos matching the seeds from the video library according to the preset knowledge graph to obtain the first candidate video set comprises:
    determining, according to the preset knowledge graph, entity words having a strong association relationship with the seeds, and taking the entity words as candidate words; and
    obtaining, from the video library, videos whose video information contains the candidate words to obtain the first candidate video set.
  18. The non-volatile computer readable storage medium according to claim 17, wherein the determining, according to the preset knowledge graph, entity words having a strong association relationship with the seeds, and taking the entity words as candidate words comprises:
    mapping the seeds onto entity words in a preset entity library, wherein the entity library is built according to the preset knowledge graph;
    determining the relation degree between the seeds and each entity word in the entity library; and
    selecting entity words whose relation degree is less than or equal to a set value as the candidate words.
  19. The non-volatile computer readable storage medium according to claim 18, before the mapping the seeds onto entity words in a preset entity library, further comprising:
    setting a base lexicon;
    obtaining Internet information according to the base lexicon, and cleaning non-entity words from the Internet information; and
    building triple relationships between entity words according to the base lexicon and the cleaned Internet information to obtain the entity library.
  20. The non-volatile computer readable storage medium according to any one of claims 15 to 19, wherein the training the video information with a preset text depth representation model to convert the video information into word vectors comprises:
    performing word segmentation on the video tag and the video title to obtain segmented video text; and
    training the segmented video text with the preset text depth representation model to obtain word vectors of the segmented video text;
    and the filtering videos similar to the video information from the video library according to the word vectors to obtain a second candidate video set comprises:
    filtering videos similar to the video information from the video library according to the word vectors of the segmented video text to obtain the second candidate video set.
  21. The non-volatile computer readable storage medium according to claim 20, wherein the filtering videos similar to the video information from the video library according to the word vectors of the segmented video text to obtain the second candidate video set comprises:
    calculating, according to the word vectors of the segmented video text, the similarity between the video information and each video in the video library separately; and
    selecting videos whose similarity is higher than a preset threshold to obtain the second candidate video set.
  22. The non-volatile computer readable storage medium according to claim 20, before the training the segmented video text with the preset text depth representation model, further comprising:
    obtaining a preset corpus, and performing word segmentation on the sentences in the corpus; and
    learning a preset original model from the segmented sentences to obtain the text depth representation model.
  23. The non-volatile computer readable storage medium according to any one of claims 15 to 19, wherein the determining similar videos of the video information according to the first candidate video set and the second candidate video set comprises:
    scoring the videos in the first candidate video set separately to obtain first scores;
    scoring the videos in the second candidate video set separately to obtain second scores;
    calculating the weighted value of each first score and the corresponding second score to obtain a comprehensive score of each video; and
    determining videos whose comprehensive score is greater than a preset score as the similar videos of the video information.
PCT/CN2018/084580 2017-05-11 2018-04-26 Method, apparatus, and storage medium for retrieving similar videos WO2018205838A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US16/509,289 US10853660B2 (en) 2017-05-11 2019-07-11 Method and apparatus for retrieving similar video and storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710331203.2 2017-05-11
CN201710331203.2A CN107066621B (zh) 2017-05-11 2017-05-11 Method, apparatus, and storage medium for retrieving similar videos

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US16/509,289 Continuation US10853660B2 (en) 2017-05-11 2019-07-11 Method and apparatus for retrieving similar video and storage medium

Publications (1)

Publication Number Publication Date
WO2018205838A1 true WO2018205838A1 (zh) 2018-11-15

Family

ID=59597017

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/084580 WO2018205838A1 (zh) 2018-04-26 Method, apparatus, and storage medium for retrieving similar videos

Country Status (3)

Country Link
US (1) US10853660B2 (zh)
CN (1) CN107066621B (zh)
WO (1) WO2018205838A1 (zh)


Families Citing this family (28)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107066621B (zh) 2017-05-11 2022-11-08 腾讯科技(深圳)有限公司 Method, apparatus, and storage medium for retrieving similar videos
CN110019665A (zh) 2017-09-30 2019-07-16 北京国双科技有限公司 Text retrieval method and apparatus
CN110019669B (zh) 2017-10-31 2021-06-29 北京国双科技有限公司 Text retrieval method and apparatus
CN110019668A (zh) 2017-10-31 2019-07-16 北京国双科技有限公司 Text retrieval method and apparatus
CN110019670A (zh) 2017-10-31 2019-07-16 北京国双科技有限公司 Text retrieval method and apparatus
CN108038183B (zh) 2017-12-08 2020-11-24 北京百度网讯科技有限公司 Structured entity recording method, apparatus, server, and storage medium
CN108012192A (zh) 2017-12-25 2018-05-08 北京奇艺世纪科技有限公司 Method and system for identifying and aggregating video resources
CN108694223B (zh) 2018-03-26 2021-07-16 北京奇艺世纪科技有限公司 Method and apparatus for building a user profile library
CN108810577B (zh) 2018-06-15 2021-02-09 深圳市茁壮网络股份有限公司 User profile building method and apparatus, and electronic device
CN109255037B (zh) 2018-08-31 2022-03-08 北京字节跳动网络技术有限公司 Method and apparatus for outputting information
CN109492687A (zh) 2018-10-31 2019-03-19 北京字节跳动网络技术有限公司 Method and apparatus for processing information
CN110166650B (zh) 2019-04-29 2022-08-23 北京百度网讯科技有限公司 Video set generation method and apparatus, computer device, and readable medium
CN110245259B (zh) 2019-05-21 2021-09-21 北京百度网讯科技有限公司 Knowledge-graph-based video tagging method and apparatus, and computer readable medium
CN110446065A (zh) 2019-08-02 2019-11-12 腾讯科技(武汉)有限公司 Video recall method, apparatus, and storage medium
CN110532404B (zh) 2019-09-03 2023-08-04 北京百度网讯科技有限公司 Source multimedia determining method, apparatus, device, and storage medium
CN110688529A (zh) 2019-09-26 2020-01-14 北京字节跳动网络技术有限公司 Method, apparatus, and electronic device for retrieving videos
CN110598049A (zh) 2019-09-26 2019-12-20 北京字节跳动网络技术有限公司 Method, apparatus, electronic device, and computer readable medium for retrieving videos
CN111046227B (zh) 2019-11-29 2023-04-07 腾讯科技(深圳)有限公司 Video duplicate checking method and apparatus
CN111026913B (zh) 2019-12-10 2024-04-23 北京奇艺世纪科技有限公司 Video distribution method and apparatus, electronic device, and storage medium
CN111274445B (zh) 2020-01-20 2021-04-23 山东建筑大学 Similar video content retrieval method and system based on triple deep learning
CN111325033B (zh) 2020-03-20 2023-07-11 中国建设银行股份有限公司 Entity recognition method and apparatus, electronic device, and computer readable storage medium
CN112911331A (zh) 2020-04-15 2021-06-04 腾讯科技(深圳)有限公司 Music recognition method, apparatus, device, and storage medium for short videos
CN111522994B (zh) 2020-04-15 2023-08-01 北京百度网讯科技有限公司 Method and apparatus for generating information
CN111639228B (zh) 2020-05-29 2023-07-18 北京百度网讯科技有限公司 Video retrieval method, apparatus, device, and storage medium
CN111737595B (zh) 2020-06-24 2024-02-06 支付宝(杭州)信息技术有限公司 Candidate word recommendation method, lexicon ranking model training method, and apparatus
US11809480B1 (en) * 2020-12-31 2023-11-07 Meta Platforms, Inc. Generating dynamic knowledge graph of media contents for assistant systems
CN114282059A (zh) 2021-08-24 2022-04-05 腾讯科技(深圳)有限公司 Video retrieval method, apparatus, device, and storage medium
CN116028668A (zh) 2021-10-27 2023-04-28 腾讯科技(深圳)有限公司 Information processing method, apparatus, computer device, and storage medium


Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9361523B1 (en) * 2010-07-21 2016-06-07 Hrl Laboratories, Llc Video content-based retrieval
JP2016502194A (ja) * 2012-11-30 2016-01-21 トムソン ライセンシングThomson Licensing Video retrieval method and apparatus
CN103984772B (zh) * 2014-06-04 2017-07-18 百度在线网络技术(北京)有限公司 Text retrieval subtitle library generation method and apparatus, and video retrieval method and apparatus
CN104834686B (zh) * 2015-04-17 2018-12-28 中国科学院信息工程研究所 Video recommendation method based on a hybrid semantic matrix
CN105956053B (zh) * 2016-04-27 2019-07-16 海信集团有限公司 Network-information-based search method and apparatus
CN106126619A (zh) * 2016-06-20 2016-11-16 中山大学 Video retrieval method and system based on video content
CN106156365B (zh) * 2016-08-03 2019-06-18 北京儒博科技有限公司 Knowledge graph generation method and apparatus
CN106484664B (zh) * 2016-10-21 2019-03-01 竹间智能科技(上海)有限公司 Method for calculating similarity between short texts

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2007200249A (ja) * 2006-01-30 2007-08-09 Nippon Telegr & Teleph Corp <Ntt> Video retrieval method, apparatus, program, and computer readable recording medium
CN101976258A (zh) * 2010-11-03 2011-02-16 上海交通大学 Video semantic extraction method based on object segmentation and weighted feature fusion
CN106326388A (zh) * 2016-08-17 2017-01-11 乐视控股(北京)有限公司 Information processing method and apparatus
CN107066621A (zh) * 2017-05-11 2017-08-18 腾讯科技(深圳)有限公司 Method, apparatus, and storage medium for retrieving similar videos

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111339360A (zh) * 2020-02-24 2020-06-26 北京奇艺世纪科技有限公司 Video processing method and apparatus, electronic device, and computer readable storage medium
CN111339360B (zh) * 2020-02-24 2024-03-26 北京奇艺世纪科技有限公司 Video processing method and apparatus, electronic device, and computer readable storage medium
CN111767796A (zh) * 2020-05-29 2020-10-13 北京奇艺世纪科技有限公司 Video association method, apparatus, server, and readable storage medium
CN111767796B (zh) * 2020-05-29 2023-12-15 北京奇艺世纪科技有限公司 Video association method, apparatus, server, and readable storage medium
CN111950360A (zh) * 2020-07-06 2020-11-17 北京奇艺世纪科技有限公司 Method and apparatus for identifying infringing users
CN111950360B (zh) * 2020-07-06 2023-08-18 北京奇艺世纪科技有限公司 Method and apparatus for identifying infringing users
CN112312205A (zh) * 2020-10-21 2021-02-02 腾讯科技(深圳)有限公司 Video processing method and apparatus, electronic device, and computer storage medium
CN112312205B (zh) * 2020-10-21 2024-03-22 腾讯科技(深圳)有限公司 Video processing method and apparatus, electronic device, and computer storage medium

Also Published As

Publication number Publication date
CN107066621A (zh) 2017-08-18
US10853660B2 (en) 2020-12-01
US20190332867A1 (en) 2019-10-31
CN107066621B (zh) 2022-11-08


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18799168

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18799168

Country of ref document: EP

Kind code of ref document: A1