CN108460085A - Method and device for constructing a video search ranking training set based on user logs - Google Patents

Method and device for constructing a video search ranking training set based on user logs

Info

Publication number
CN108460085A
Authority
CN
China
Prior art keywords
user
log
video
playing duration
sample
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810052822.2A
Other languages
Chinese (zh)
Inventor
赵晓萌
胡军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing QIYI Century Science and Technology Co Ltd
Original Assignee
Beijing QIYI Century Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing QIYI Century Science and Technology Co Ltd
Priority to CN201810052822.2A
Publication of CN108460085A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/71 Indexing; Data structures therefor; Storage structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/73 Querying
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Television Signal Processing For Recording (AREA)

Abstract

Embodiments of the present invention provide a method and device for constructing a video search ranking training set based on user logs. The method includes: obtaining user search logs, where the obtained user search logs contain video-related features corresponding to the videos found by the search; sampling the obtained user search logs and using the sampled user search logs as training samples; obtaining the video play duration of each training sample and, according to a preset correspondence between scores and video play durations, obtaining the score of each training sample; and constructing a training set from all training samples and their corresponding scores. Embodiments of the present invention can build the training set automatically, saving labor costs.

Description

Method and device for constructing a video search ranking training set based on user logs
Technical field
The present invention relates to the technical field of video search, and in particular to a method and device for constructing a video search ranking training set based on user logs.
Background
With the rise of machine-learned ranking models (Learning to Rank), major companies in the search engine field have one after another tried to replace existing rule-based ranking models with Learning to Rank. A machine-learned ranking model must be trained on training set data before it can rank, so a training set has to be built before the model can be used for ranking.
Existing Learning to Rank training sets, such as Microsoft's LETOR and MSLR-WEB30K and the Yahoo Learning to Rank Challenge training set, are training sets for web search engines. These training sets are built by searching with query terms and then, for the search results, manually evaluating each query term and document pair to judge whether the documents under the query term are relevant to it. After relevance is judged, the results are manually graded and assigned scores, from which the training set is built. Here, a document pair refers to the documents retrieved under one query term. In this existing construction method for web search engines, the search results for a given query term are manually evaluated and assigned scores to build the training set, and the construction process is simple and easy to implement.
However, in the course of implementing the present invention, the inventors found that the prior art has at least the following problem:
Existing training sets for web search engines are built by searching with query terms and manually grading and scoring the search results. Building such a training set consumes a large amount of labor, and the results are strongly affected by the subjectivity of the human evaluators.
Summary of the invention
The purpose of the embodiments of the present invention is to provide a method and device for constructing a video search ranking training set based on user logs, so that the training set is built automatically and labor costs are saved. The specific technical solution is as follows:
To achieve the above object, an embodiment of the present invention discloses a method for constructing a video search ranking training set based on user logs. The method includes:
obtaining user search logs, where the obtained user search logs contain video-related features corresponding to the videos found by the search, and the video-related features include at least the video play duration;
sampling the obtained user search logs and using the sampled user search logs as training samples;
obtaining the video play duration of each training sample and, according to a preset correspondence between scores and video play durations, obtaining the score of each training sample;
constructing a training set from all training samples and their corresponding scores.
Optionally, the step of obtaining user search logs is: each time a user searches with a search term, obtaining the search log generated by the user's search.
Optionally, the video-related features further include: the video's own features, video text relevance features, and user-dimension features;
the step of obtaining user search logs includes:
when a user searches with a search term, generating an original user search log for that search term, where the original user search log contains the video length and video data type features among the video's own features, and the feature value corresponding to each feature;
calculating the video freshness feature value among the video's own features, the video text relevance feature value between the search term and the search result, and the user-dimension feature values corresponding to user clicks, historical click-through rate, and search time;
adding all features, including the feature values of the video's own features, the video text relevance feature value, and the user-dimension feature values, to the original user search log, and generating and saving the final user search log.
Optionally, the training samples include positive samples and negative samples;
the step of sampling the obtained user search logs and using the sampled user search logs as training samples includes:
obtaining, from the obtained user search logs, the logs in which the user clicked to play, and using each such log as one positive training sample;
performing negative sampling on the obtained user search logs other than the logs in which the user clicked to play, obtaining negative sample logs, and using the obtained negative sample logs as negative training samples.
Optionally, the step of obtaining, from the obtained user search logs, the logs in which the user clicked to play and using each such log as one positive training sample further includes: setting the type of each positive sample to a first type identifier;
the step of negative sampling includes:
obtaining, from the user search logs other than the logs in which the user clicked to play, the logs in which the user did not click to play as negative sample logs, and setting their type to a second type identifier;
obtaining, from the negative sample logs corresponding to the second type identifier, the negative sample logs whose text relevance between the search term and the search result is below a preset threshold, and setting their type to a third type identifier;
obtaining, from the negative sample logs corresponding to the second type identifier, the negative sample logs determined according to preset rules, and setting their type to a fourth type identifier.
Optionally, the step of obtaining the video play duration of each training sample and, according to a preset correspondence between scores and video play durations, obtaining the score of each training sample includes:
comparing the play duration of each positive sample with a preset first play duration threshold, a preset second play duration threshold, and a preset third play duration threshold, where the preset first play duration threshold is less than the preset second play duration threshold, and the preset second play duration threshold is less than the preset third play duration threshold;
if the video play duration of the positive sample is less than the preset first play duration threshold, the score of the positive sample is the preset lowest score;
or, if the video play duration of the positive sample is greater than the preset first play duration threshold and less than the preset second play duration threshold, the score of the positive sample is the preset second-lowest score;
or, if the video play duration of the positive sample is greater than the preset second play duration threshold and less than the preset third play duration threshold, the score of the positive sample is the preset second-highest score;
or, if the video play duration of the positive sample is greater than the preset third play duration threshold, the score of the positive sample is the preset highest score;
the score of each negative sample is set to the preset lowest score.
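The threshold comparison above can be sketched in a few lines of Python. This is only an illustrative sketch: the threshold values, the score values, and the function name `score_sample` are assumptions and do not come from the patent, which only requires that the first threshold be less than the second and the second less than the third. The text also does not say how a duration exactly equal to a threshold is treated; here such a duration is assumed to fall into the higher band.

```python
from bisect import bisect_right

# Hypothetical play duration thresholds in seconds (first < second < third).
THRESHOLDS = (30.0, 300.0, 1200.0)
# Hypothetical scores: lowest, second-lowest, second-highest, highest.
SCORES = (0, 1, 2, 3)


def score_sample(is_positive: bool, play_duration: float) -> int:
    """Map a training sample to a score based on its play duration."""
    if not is_positive:
        # Every negative sample receives the preset lowest score.
        return SCORES[0]
    # bisect_right counts how many thresholds are <= play_duration,
    # which is exactly the index of the score band.
    return SCORES[bisect_right(THRESHOLDS, play_duration)]
```

For example, with these assumed values a positive sample played for 100 seconds lies between the first and second thresholds and receives the second-lowest score.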
To achieve the above object, an embodiment of the present invention further discloses a device for constructing a video search ranking training set based on user logs. The device includes:
a log acquisition module, configured to obtain user search logs, where the obtained user search logs contain video-related features corresponding to the videos found by the search, and the video-related features include at least the video play duration;
a sample acquisition module, configured to sample the obtained user search logs and use the sampled user search logs as training samples;
a score acquisition module, configured to obtain the video play duration of each training sample and, according to a preset correspondence between scores and video play durations, obtain the score of each training sample;
a training set construction module, configured to construct a training set from all training samples and their corresponding scores.
Optionally, the log acquisition module is specifically configured to:
each time a user searches with a search term, obtain the search log generated by the user's search.
Optionally, the user search logs obtained by the log acquisition module contain video-related features corresponding to the videos found by the search, and the video-related features further include: the video's own features, video text relevance features, and user-dimension features;
the log acquisition module is specifically configured to:
when a user searches with a search term, generate an original user search log for that search term, where the original user search log contains the video length and video data type features among the video's own features, and the feature value corresponding to each feature;
calculate the video freshness feature value among the video's own features, the video text relevance feature value between the search term and the search result, and the user-dimension feature values corresponding to user clicks, historical click-through rate, and search time;
add all features, including the feature values of the video's own features, the video text relevance feature value, and the user-dimension feature values, to the original user search log, and generate and save the final user search log.
Optionally, the training samples include positive samples and negative samples;
the sample acquisition module includes:
a positive sample acquisition submodule, configured to obtain, from the obtained user search logs, the logs in which the user clicked to play, and use each such log as one positive training sample;
a negative sample acquisition submodule, configured to perform negative sampling on the obtained user search logs other than the logs in which the user clicked to play, obtain negative sample logs, and use the obtained negative sample logs as negative training samples.
Optionally, the positive sample acquisition submodule is further configured to set the type of each positive sample to a first type identifier;
the negative sample acquisition submodule includes a negative sampling submodule;
the negative sampling submodule is configured to:
obtain, from the user search logs other than the logs in which the user clicked to play, the logs in which the user did not click to play as negative sample logs, and set their type to a second type identifier;
obtain, from the negative sample logs corresponding to the second type identifier, the negative sample logs whose text relevance between the search term and the search result is below a preset threshold, and set their type to a third type identifier;
obtain, from the negative sample logs corresponding to the second type identifier, the negative sample logs determined according to preset rules, and set their type to a fourth type identifier.
Optionally, the score acquisition module is specifically configured to:
compare the play duration of each positive sample with a preset first play duration threshold, a preset second play duration threshold, and a preset third play duration threshold, where the preset first play duration threshold is less than the preset second play duration threshold, and the preset second play duration threshold is less than the preset third play duration threshold;
if the video play duration of the positive sample is less than the preset first play duration threshold, the score of the positive sample is the preset lowest score;
or, if the video play duration of the positive sample is greater than the preset first play duration threshold and less than the preset second play duration threshold, the score of the positive sample is the preset second-lowest score;
or, if the video play duration of the positive sample is greater than the preset second play duration threshold and less than the preset third play duration threshold, the score of the positive sample is the preset second-highest score;
or, if the video play duration of the positive sample is greater than the preset third play duration threshold, the score of the positive sample is the preset highest score;
the score of each negative sample is set to the preset lowest score.
In another aspect of the present invention, an electronic device is further provided, including a processor, a communication interface, a memory, and a communication bus, where the processor, the communication interface, and the memory communicate with one another via the communication bus;
the memory is configured to store a computer program;
the processor is configured to implement, when executing the program stored in the memory, the method for constructing a video search ranking training set based on user logs described in the first aspect above.
In another aspect of the present invention, a computer-readable storage medium is further provided. Instructions are stored in the computer-readable storage medium and, when run on a computer, cause the computer to execute any of the above methods for constructing a video search ranking training set based on user logs.
In another aspect of the present invention, an embodiment of the present invention further provides a computer program product containing instructions that, when run on a computer, cause the computer to execute any of the above methods for constructing a video search ranking training set based on user logs.
The method and device for constructing a video search ranking training set based on user logs provided by the embodiments of the present invention obtain user search logs, sample the obtained user search logs to obtain training samples, obtain the score of each training sample from how long the user played the video corresponding to the search log, and construct the video search ranking training set from the training samples and their corresponding scores. The training set is thus built automatically, saving labor costs. Of course, a product or method implementing the present invention does not necessarily need to achieve all of the above advantages at the same time.
Description of the drawings
To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of a method for constructing a video search ranking training set based on user logs provided by an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of a device for constructing a video search ranking training set based on user logs provided by an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of an electronic device provided by an embodiment of the present invention.
Detailed description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments of the present invention. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
In the prior art, training sets for web search engines are built by searching with query terms and manually grading and scoring the search results, so building a training set consumes a large amount of labor and is strongly affected by human subjectivity. To solve this problem, an embodiment of the present invention proposes a method for constructing a video search ranking training set based on user logs.
In the method for constructing a video search ranking training set based on user logs according to the embodiment of the present invention, user search logs are obtained and sampled to obtain training samples, the score of each training sample is obtained from how long the user played the video corresponding to the search log, and the video search ranking training set is constructed from the training samples and their corresponding scores. The training set is thus built automatically, saving labor costs.
Fig. 1 is a flowchart of a method for constructing a video search ranking training set based on user logs provided by an embodiment of the present invention. The method includes:
S110: obtain user search logs, where the obtained user search logs contain video-related features corresponding to the videos found by the search, and the video-related features include at least the video play duration.
In the embodiment of the present invention, user search logs are collected. One optional implementation is to record and save user search logs in real time, where the saved user search logs contain the video-related features corresponding to the videos found by the search.
In practice, user search logs can also be obtained by offline restoration, which saves storage space on the online servers. However, the video-related features of the retrieved videos include many real-time features, for example a video's historical 2-hour click-through rate, that are computed online in real time when the user searches and obtained through online pushing and similar processes. With offline restoration, the actual statistics time and push time cannot be reproduced accurately enough. Moreover, user search logs restored offline inevitably suffer from log loss and similar problems, which makes them inconsistent with the logs recorded in real time online. Therefore, in the embodiments of the present application, user search logs are preferably obtained by recording and saving them online in real time.
In the embodiment of the present invention, an optional way of obtaining user search logs in step S110 is: each time a user searches with a search term, obtain the search log generated by the user's search.
It can be understood that, in the embodiment of the present invention, when a user searches with a search term, each search result generated by the user's search can be obtained by recording online in real time. Each search result is treated as one user search log, and every user search log generated by the search is saved. Each user search log contains the video-related features corresponding to the video found by that search, and the video-related features include at least the play duration of the video.
In the embodiment of the present invention, another implementation of step S110 is to record online in real time, with a certain probability or frequency, the search logs generated when users search with search terms, so as to reduce the load on the server. For example, the search logs generated by user searches can be collected at regular or irregular intervals of a few minutes or a few tens of minutes.
Of course, the present application only uses the above implementations as examples; in practice, the ways of obtaining user search logs are not limited to these.
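Recording logs "with a certain probability", as mentioned above, could look roughly like the following sketch. The function names `should_record` and `collect_logs`, the fixed seed, and the probability value in the usage note are illustrative assumptions; the patent does not specify how the probability or interval is chosen.

```python
import random
from typing import Iterable, List


def should_record(rng: random.Random, probability: float) -> bool:
    """Decide whether to record one search event, given a recording probability."""
    # rng.random() returns a float in [0.0, 1.0).
    return rng.random() < probability


def collect_logs(events: Iterable[str], probability: float, seed: int = 0) -> List[str]:
    """Keep each search event with the given probability (hypothetical helper)."""
    rng = random.Random(seed)
    return [event for event in events if should_record(rng, probability)]
```

Calling `collect_logs(events, 0.1)` would keep roughly one in ten search events, which trades some coverage for lower server load.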
In the embodiment of the present invention, optionally, the video-related features in step S110 further include: the video's own features, video text relevance features, and user-dimension features;
the step of obtaining user search logs includes:
when a user searches with a search term, generating an original user search log for that search term, where the original user search log contains the video length and video data type features among the video's own features, and the feature value corresponding to each feature;
calculating the video freshness feature value among the video's own features, the video text relevance feature value between the search term and the search result, and the user-dimension feature values corresponding to user clicks, historical click-through rate, and search time;
adding all features, including the feature values of the video's own features, the video text relevance feature value, and the user-dimension feature values, to the original user search log, and generating and saving the final user search log.
In the embodiment of the present invention, an optional procedure for obtaining user search logs is as follows. When a user searches with a search term, an original user search log is generated for that search term. The original user search log contains some of the own features of the video corresponding to the search log, for example the video length and video data type features, and the feature value corresponding to each feature.
Meanwhile, the video freshness feature value among the video's own features can be calculated from the user's search time and the time of the video's latest update. The text relevance feature value between the search term and the search log can be calculated using one or more of BM25, VSM, a language model, or text overlap probability, yielding the video text relevance feature value. It is also possible to collect statistics in real time on whether the user clicked the video, the click time, the search time, the video's average play duration, and whether the video had been updated at the time of the click, and to calculate the user's historical click-through rate; the statistics and calculation results are pushed online to obtain the user-dimension feature values for the searched video.
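Two of the features named above can be illustrated with a minimal sketch. Both functions are assumptions made for illustration: the patent does not define how freshness is computed from the two timestamps (elapsed hours is assumed here), and the token overlap ratio below is a simple stand-in for "text overlap probability", not BM25, VSM, or a language model. Whitespace tokenization is also an assumption; Chinese text would need a proper word segmenter.

```python
def freshness_hours(search_ts: float, last_update_ts: float) -> float:
    """Hours between the video's latest update and the search (assumed definition)."""
    return max(0.0, (search_ts - last_update_ts) / 3600.0)


def text_overlap(query: str, title: str) -> float:
    """Fraction of distinct query tokens that also appear in the video title."""
    query_tokens = set(query.lower().split())
    title_tokens = set(title.lower().split())
    if not query_tokens:
        return 0.0
    return len(query_tokens & title_tokens) / len(query_tokens)
```

Under these assumptions, a smaller `freshness_hours` value means a more recently updated video, and a `text_overlap` close to 1 means most query tokens appear in the title.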
Then, all of the obtained features, including the feature values of the video's own features, the video text relevance feature value, and the user-dimension feature values, are added to the generated original user search log, and the final user search log is generated and saved. The saved format may be:
event_id\tuid\tquery\tvideo_id\tposition\tis_click\tfeature_id:value ...
Here, event_id is the identifier of the user search event, uid is the user identifier, query is the search term entered, video_id is the video identifier, position is the position identifier of the video, is_click indicates whether the video was clicked by the user, feature_id is the feature identifier of the video, and value is the feature value of the video.
Of course, the embodiment of the present invention only uses the above implementation as an example; in practice, the format in which user search logs are saved is not limited to this.
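The tab-separated format described above can be read back with a small parser such as the following sketch. Several details are assumptions, since the patent only shows the field order: the `SearchLog` class and `parse_log_line` function are hypothetical names, each `feature_id:value` pair is assumed to be a separate tab-separated field with a numeric value, and `is_click` is assumed to be encoded as `1` or `0`.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class SearchLog:
    event_id: str
    uid: str
    query: str
    video_id: str
    position: int
    is_click: bool
    features: Dict[str, float] = field(default_factory=dict)


def parse_log_line(line: str) -> SearchLog:
    """Parse one saved user search log line (format details are assumed)."""
    parts = [part.strip() for part in line.rstrip("\n").split("\t")]
    event_id, uid, query, video_id, position, is_click = parts[:6]
    features: Dict[str, float] = {}
    for pair in parts[6:]:
        if not pair:
            continue
        # Split on the first colon only: "feature_id:value".
        feature_id, _, value = pair.partition(":")
        features[feature_id] = float(value)
    return SearchLog(event_id, uid, query, video_id, int(position), is_click == "1", features)
```

A line with fewer than six tab-separated fields raises a `ValueError` during unpacking, which makes malformed log lines easy to detect and skip.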
S120: sample the obtained user search logs, and use the sampled user search logs as training samples.
In the embodiment of the present invention, one implementation is to sample the user search logs obtained by real-time recording and saving. The user search logs collected on the current day may be sampled, or those collected on the previous day may be sampled, and the sampled user search logs are used as training samples. In practice, a person skilled in the art can choose according to actual needs; the embodiment of the present invention places no restriction on this.
In the embodiment of the present invention, the training samples may include positive samples and negative samples. An optional step of sampling the obtained user search logs and using the sampled user search logs as training samples includes:
obtaining, from the obtained user search logs, the logs in which the user clicked to play, and using each such log as one positive training sample;
performing negative sampling on the obtained user search logs other than the logs in which the user clicked to play, obtaining negative sample logs, and using the obtained negative sample logs as negative training samples.
In practice, an optional implementation of step S120 is: from the obtained user search logs, select the logs in which the user clicked to play, and use each such log as one positive training sample; perform negative sampling on the user search logs other than those in which the user clicked to play, and use the resulting negative sample logs as negative training samples. Here, the user search logs may include logs in which the user clicked to play and logs in which the user did not click to play.
Optionally, the step of obtaining the click-to-play logs from the obtained user search logs and using each click-to-play log as one positive training sample may further include: setting the type of the sample to a first-type identifier;
In that case, the negative sampling step may include:
obtaining, from the user search logs other than the click-to-play logs, the logs in which the user did not click to play, using them as negative sampling logs, and setting their type to a second-type identifier;
obtaining, among the negative sampling logs carrying the second-type identifier, those in which the text relevance between the search term and the search result is less than a preset threshold, and setting their type to a third-type identifier, where the preset threshold can be set by those skilled in the art according to actual needs;
obtaining, among the negative sampling logs carrying the second-type identifier, the negative sampling logs determined according to preset rules, and setting their type to a fourth-type identifier. The preset rules may be a set of cheating videos that need to be suppressed, as specified by those skilled in the art. Alternatively, a rule score may be calculated when the user search log is obtained, pushed online, and recorded; during sampling, the rule score is then used directly to judge whether the log can serve as a third class of negative sampling log. Those skilled in the art can set the specific preset rules according to actual needs; the embodiment of the present invention does not limit this.
In an embodiment of the present invention, one optional implementation of step S120 is to use the logs in which the user clicked to play, among the obtained user search logs, as positive training samples, and to set their type to the first-type identifier. The first-type identifier may be set to the number 1 or another number, or to the letter A or another letter; the other type identifiers may correspondingly be set to 2, 3, 4 or B, C, D. Those skilled in the art can set the specific type identifiers according to actual needs; the embodiment of the present invention does not limit this.
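The type assignment described above can be sketched as follows. This is only an illustrative sketch under stated assumptions: the `SearchLog` fields, the default relevance threshold of 0.2, and the order in which the rule check and the relevance check are applied are all hypothetical, since the text leaves them to those skilled in the art.

```python
from dataclasses import dataclass

# Type identifiers; the text allows 1-4 or A-D, the numeric form is used here.
POSITIVE, NEG_UNCLICKED, NEG_LOW_RELEVANCE, NEG_RULE = 1, 2, 3, 4


@dataclass
class SearchLog:
    """Hypothetical in-memory form of one user search log."""
    query: str
    video_id: str
    is_click: bool
    text_relevance: float    # relevance between search term and result text
    rule_hit: bool = False   # True if a preset rule flags this log


def assign_label_type(log, relevance_threshold=0.2):
    """Return the type identifier for one user search log.

    Assumption: when an unclicked log matches both the preset rules and
    the low-relevance condition, the rule-based type takes priority.
    """
    if log.is_click:
        return POSITIVE
    if log.rule_hit:
        return NEG_RULE
    if log.text_relevance < relevance_threshold:
        return NEG_LOW_RELEVANCE
    return NEG_UNCLICKED
```

Keeping the type as a separate field, rather than folding it into the score, is what later allows the proportions of the different negative sample types to be adjusted.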
In an embodiment of the present invention, the sampling logs obtained after sampling are used as training samples, and may be saved in the following format:
event_id\t uid\t query\t video_id\t position\t is_click\t label_type\t feature_id:Value ...
where label_type is the type of the training sample.
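A minimal sketch of writing and reading one line in this format is given below. The function names are hypothetical, and it is an assumption that successive feature_id:Value pairs are also separated by tabs and that feature values are numeric; the text does not state either.

```python
FIXED_FIELDS = ["event_id", "uid", "query", "video_id",
                "position", "is_click", "label_type"]


def format_sample(sample, features):
    """Serialize one training sample as a tab-separated line."""
    fixed = [str(sample[name]) for name in FIXED_FIELDS]
    feats = [f"{fid}:{value}" for fid, value in features.items()]
    return "\t".join(fixed + feats)


def parse_sample(line):
    """Parse a line produced by format_sample back into (sample, features)."""
    parts = line.rstrip("\n").split("\t")
    sample = dict(zip(FIXED_FIELDS, parts[:len(FIXED_FIELDS)]))
    features = {}
    for item in parts[len(FIXED_FIELDS):]:
        fid, _, value = item.partition(":")
        features[fid] = float(value)   # assumption: numeric feature values
    return sample, features
```

For example, `format_sample` applied to a sample with event_id "e1" and the features {"101": 0.5} yields a line whose last field is "101:0.5".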
S130: obtain the video playing duration of each training sample, and obtain the score of each training sample according to the preset correspondence between scores and video playing durations.
In an embodiment of the present invention, optionally, one implementation of step S130 may be:
comparing the playing duration of each positive sample with a preset first playing duration threshold, a preset second playing duration threshold, and a preset third playing duration threshold, where the first playing duration threshold is less than the second playing duration threshold, and the second playing duration threshold is less than the third playing duration threshold;
if the video playing duration of the positive sample is less than the preset first playing duration threshold, the score of the positive sample is the preset lowest score;
or, if the video playing duration of the positive sample is greater than the preset first playing duration threshold and less than the preset second playing duration threshold, the score of the positive sample is the preset second-lowest score;
or, if the video playing duration of the positive sample is greater than the preset second playing duration threshold and less than the preset third playing duration threshold, the score of the positive sample is the preset second-highest score;
or, if the video playing duration of the positive sample is greater than the preset third playing duration threshold, the score of the positive sample is the preset highest score;
determining the score of each negative sample as the preset lowest score.
In an embodiment of the present invention, the preset lowest score may be 0, the second-lowest score 1, the second-highest score 2, and the highest score 3. The embodiment of the present invention may set a correspondence between preset scores and video playing durations as shown in Table 1. The first playing duration threshold may be denoted short_ploytime_threshold, the second playing duration threshold may be denoted middle_ploytime_threshold, and the third playing duration threshold may be denoted long_ploytime_threshold.
For example, the score of each positive sample can be obtained from its video playing duration, according to the preset correspondence between scores and video playing durations, as follows. When the video playing duration of the positive sample is less than short_ploytime_threshold, the score is 0 and its label is set to 0, indicating that the click was a misoperation or that the video the user clicked to play is not the video the user wanted to watch. When the video playing duration is greater than short_ploytime_threshold and less than middle_ploytime_threshold, the score is 1 and its label is set to 1, indicating that the user is dissatisfied with this search result. When the video playing duration is greater than middle_ploytime_threshold and less than long_ploytime_threshold, the score is 2 and its label is set to 2, indicating that the user is moderately satisfied with this search result. When the video playing duration is greater than long_ploytime_threshold, the score is 3 and its label is set to 3, indicating that the user is very satisfied with this search result. Optionally, the lowest score is expressed as label 0, the second-lowest score as label 1, the second-highest score as label 2, and the highest score as label 3.
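Table 1: A correspondence between preset scores and video playing durations
Video playing duration of the positive sample | Score (label)
less than short_ploytime_threshold | 0
greater than short_ploytime_threshold and less than middle_ploytime_threshold | 1
greater than middle_ploytime_threshold and less than long_ploytime_threshold | 2
greater than long_ploytime_threshold | 3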
In an embodiment of the present invention, the preset playing duration thresholds short_ploytime_threshold, middle_ploytime_threshold, and long_ploytime_threshold can be set per class of video length, with videos of similar length sharing the same playing duration thresholds. For example, videos with a total duration within 30 minutes are classed as short videos, videos with a total duration within 2 hours are classed as medium videos, and videos with a total duration of more than 3 hours are classed as long videos. For short videos, short_ploytime_threshold may be set to 5 minutes, middle_ploytime_threshold to 10 minutes, and long_ploytime_threshold to 20 minutes. For medium videos, short_ploytime_threshold may be set to 20 minutes, middle_ploytime_threshold to 40 minutes, and long_ploytime_threshold to 70 minutes. For long videos, short_ploytime_threshold may be set to 1 hour, middle_ploytime_threshold to 2 hours, and long_ploytime_threshold to 3 hours, and so on. Of course, the embodiment of the present invention only uses the above implementation as an illustration; in practice, the way videos are classified by duration and the way the playing thresholds are set are not limited to this.
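The per-length threshold selection in the example above can be sketched as a small lookup. The function name is hypothetical, the boundaries are treated as inclusive, and, because the example does not say how videos between 2 and 3 hours are classed, this sketch assumes that everything longer than 2 hours uses the long-video thresholds.

```python
def thresholds_for(total_minutes):
    """Return (short, middle, long) playing duration thresholds in minutes.

    Values follow the example in the text. Assumption: videos longer than
    2 hours are all treated as long videos.
    """
    if total_minutes <= 30:        # short video
        return (5, 10, 20)
    if total_minutes <= 120:       # medium video
        return (20, 40, 70)
    return (60, 120, 180)          # long video
```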
The score of each negative sample is determined as the preset lowest score, that is, the score is 0 and its label is set to 0.
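The scoring rule of step S130 can be sketched as a single function. The function name is hypothetical and the parameter names reuse the threshold names from the text. The text does not say which score applies when the playing duration exactly equals a threshold; this sketch assumes such a duration falls into the higher band.

```python
def score_sample(is_click, play_duration,
                 short_ploytime_threshold,
                 middle_ploytime_threshold,
                 long_ploytime_threshold):
    """Map one training sample to a score (label) in 0..3.

    Negative samples (is_click is False) always get the lowest score.
    Assumption: a duration equal to a threshold belongs to the higher band.
    """
    if not is_click:
        return 0
    if play_duration < short_ploytime_threshold:
        return 0   # misoperation, or not the video the user wanted
    if play_duration < middle_ploytime_threshold:
        return 1   # dissatisfied
    if play_duration < long_ploytime_threshold:
        return 2   # moderately satisfied
    return 3       # very satisfied
```

With the short-video thresholds of 5, 10, and 20 minutes, a clicked result played for 15 minutes receives a score of 2, while an unclicked result receives 0 regardless of duration.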
In an embodiment of the present invention, after the score of each training sample has been obtained from its video playing duration according to the preset correspondence between scores and video playing durations, the samples may be saved in the following format:
event_id\t uid\t query\t video_id\t position\t label_type\t label\t feature_id:Value ...
where label is the score of the training sample.
S140: construct the training set from all training samples and their corresponding scores.
In an embodiment of the present invention, the obtained training samples and their corresponding scores are constructed into the training set. After the training set is built, the type label_type of each training sample in the training set also identifies which kind of negative sample the negative sampling produced, so those skilled in the art can control the proportions of the various types of negative samples when using the training set in order to tune the training effect.
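One way to control the proportions of the negative sample types when using the training set is to downsample by label_type, as in the sketch below. The text does not prescribe a mechanism; the function name, the keep_ratio mapping, and the use of a seeded random generator are all assumptions made for illustration.

```python
import random


def resample_by_type(samples, keep_ratio, seed=0):
    """Keep each sample with a probability chosen per label_type.

    samples: list of dicts, each with a "label_type" key.
    keep_ratio: dict mapping label_type to a probability in [0, 1];
    types not listed are always kept.
    """
    rng = random.Random(seed)   # seeded so that a run can be reproduced
    kept = []
    for sample in samples:
        ratio = keep_ratio.get(sample["label_type"], 1.0)
        if rng.random() < ratio:
            kept.append(sample)
    return kept
```

For example, `resample_by_type(samples, {2: 0.5})` keeps roughly half of the second-type negative samples and all other samples.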
The method for constructing a video search ranking training set based on user logs provided by the embodiment of the present invention obtains user search logs, samples the obtained user search logs to obtain training samples, derives the score of each training sample from the playing duration of the video corresponding to the search log, and constructs the training samples and their corresponding scores into a video search ranking training set. The training set is thus built automatically, which saves labor cost and avoids the problem of manual grading being heavily influenced by human subjectivity.
Corresponding to the foregoing method for constructing a video search ranking training set based on user logs, a second aspect of the embodiments of the present invention further provides a device for constructing a video search ranking training set based on user logs. Fig. 2 is a schematic structural diagram of such a device provided by an embodiment of the present invention. The device includes:
a log acquisition module 210, configured to obtain user search logs, where the obtained user search logs contain the video-related features corresponding to the searched videos, and the video-related features include at least the video playing duration;
a sample acquisition module 220, configured to sample the obtained user search logs and use the sampled user search logs as training samples;
a score acquisition module 230, configured to obtain the video playing duration of each training sample and obtain the score of each training sample according to the preset correspondence between scores and video playing durations;
a training set construction module 240, configured to construct the training set from all training samples and their corresponding scores.
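The way the four modules chain together can be sketched with one function per module. This is only a schematic illustration: the function names and the dict-based log representation are hypothetical, every unclicked log is kept as a second-type negative sample for simplicity, and a playing duration equal to a threshold is assumed to fall into the higher score band.

```python
def acquire_logs(source):
    """Module 210: collect user search logs that carry a play_duration."""
    return [dict(log) for log in source]


def sample_logs(logs):
    """Module 220: clicked logs become positive samples (type 1);
    here every other log is kept as a negative sample (type 2)."""
    return [{**log, "label_type": 1 if log["is_click"] else 2} for log in logs]


def score_samples(samples, thresholds):
    """Module 230: score positives by play duration; negatives get 0."""
    scored = []
    for s in samples:
        if s["is_click"]:
            # Count how many of the three thresholds the duration reaches.
            label = sum(s["play_duration"] >= t for t in thresholds)
        else:
            label = 0
        scored.append({**s, "label": label})
    return scored


def build_training_set(source, thresholds):
    """Module 240: assemble samples and scores into the training set."""
    return score_samples(sample_logs(acquire_logs(source)), thresholds)
```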
The device for constructing a video search ranking training set based on user logs provided by the embodiment of the present invention obtains user search logs, samples the obtained user search logs to obtain training samples, derives the score of each training sample from the playing duration of the video corresponding to the search log, and constructs the training samples and their corresponding scores into a video search ranking training set. The training set is thus built automatically, which saves labor cost and avoids the problem of manual grading being heavily influenced by human subjectivity.
It should be noted that the device of the embodiment of the present invention is a device that applies the above method for constructing a video search ranking training set based on user logs, so all embodiments of that method are applicable to the device and can achieve the same or similar beneficial effects.
Optionally, the log acquisition module 210 is specifically configured to:
each time a user searches according to a search term, obtain the search log generated by that search.
Optionally, the user search logs obtained by the log acquisition module 210 contain the video-related features corresponding to the searched videos, and the video-related features further include: the video's own features, video text relevance features, and user-dimension features;
the log acquisition module 210 is specifically configured to:
when a user searches according to a search term, generate an original user search log for that search term and search, where the original user search log includes: the video length and video data type features among the video's own features, and the feature value corresponding to each feature;
calculate the video freshness feature value among the video's own features, the video text relevance feature value of the search term and the search result, and the user-dimension feature values corresponding to user clicks, historical click-through rate, and search time;
add all features, including the feature values of the video's own features, the video text relevance feature value, and the user-dimension feature values, to the original user search log, and generate and save the final user search log.
Optionally, the training samples include positive samples and negative samples;
the sample acquisition module 220 includes:
a positive sample acquisition submodule, configured to obtain, from the obtained user search logs, the logs in which the user clicked to play, and use each click-to-play log as one positive training sample;
a negative sample acquisition submodule, configured to perform negative sampling on the user search logs other than the click-to-play logs to obtain negative sampling logs, and use the obtained negative sampling logs as negative training samples.
Optionally, the positive sample acquisition submodule is further configured to: set the type of the sample to a first-type identifier;
the negative sample acquisition submodule includes a negative sampling submodule;
the negative sampling submodule is configured to:
obtain, from the user search logs other than the click-to-play logs, the logs in which the user did not click to play, use them as negative sampling logs, and set their type to a second-type identifier;
obtain, among the negative sampling logs carrying the second-type identifier, those in which the text relevance between the search term and the search result is less than a preset threshold, and set their type to a third-type identifier;
obtain, among the negative sampling logs carrying the second-type identifier, the negative sampling logs determined according to preset rules, and set their type to a fourth-type identifier.
Optionally, the score acquisition module 230 is specifically configured to:
compare the playing duration of each positive sample with a preset first playing duration threshold, a preset second playing duration threshold, and a preset third playing duration threshold, where the first playing duration threshold is less than the second playing duration threshold, and the second playing duration threshold is less than the third playing duration threshold;
if the video playing duration of the positive sample is less than the preset first playing duration threshold, the score of the positive sample is the preset lowest score;
or, if the video playing duration of the positive sample is greater than the preset first playing duration threshold and less than the preset second playing duration threshold, the score of the positive sample is the preset second-lowest score;
or, if the video playing duration of the positive sample is greater than the preset second playing duration threshold and less than the preset third playing duration threshold, the score of the positive sample is the preset second-highest score;
or, if the video playing duration of the positive sample is greater than the preset third playing duration threshold, the score of the positive sample is the preset highest score;
determine the score of each negative sample as the preset lowest score.
The device for constructing a video search ranking training set based on user logs provided by the embodiment of the present invention obtains user search logs, samples the obtained user search logs to obtain training samples, derives the score of each training sample from the playing duration of the video corresponding to the search log, and constructs the training samples and their corresponding scores into a video search ranking training set. The training set is thus built automatically, which saves labor cost and avoids the problem of manual grading being heavily influenced by human subjectivity.
An embodiment of the present invention further provides an electronic device which, as shown in Fig. 3, includes a processor 301, a communication interface 302, a memory 303, and a communication bus 304, where the processor 301, the communication interface 302, and the memory 303 communicate with one another through the communication bus 304;
the memory 303 is configured to store a computer program;
the processor 301 is configured to implement the following steps when executing the program stored in the memory 303:
obtaining user search logs, where the obtained user search logs contain the video-related features corresponding to the searched videos, and the video-related features include at least the video playing duration;
sampling the obtained user search logs, and using the sampled user search logs as training samples;
obtaining the video playing duration of each training sample, and obtaining the score of each training sample according to the preset correspondence between scores and video playing durations;
constructing the training set from all training samples and their corresponding scores.
The communication bus 304 of the above electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, it is shown as a single thick line in the figure, which does not mean that there is only one bus or one type of bus.
The communication interface 302 is used for communication between the above electronic device and other devices.
The memory 303 may include random access memory (RAM) and may also include non-volatile memory, for example at least one disk storage. Optionally, the memory may also be at least one storage device located remotely from the aforementioned processor.
The above processor 301 may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
The electronic device provided by the embodiment of the present invention obtains user search logs, samples the obtained user search logs to obtain training samples, derives the score of each training sample from the playing duration of the video corresponding to the search log, and constructs the training samples and their corresponding scores into a video search ranking training set. The training set is thus built automatically, which saves labor cost and avoids the problem of manual grading being heavily influenced by human subjectivity.
In another embodiment provided by the present invention, a computer-readable storage medium is further provided. Instructions are stored in the computer-readable storage medium which, when run on a computer, cause the computer to execute the method for constructing a video search ranking training set based on user logs described in any of the above embodiments, achieving the same technical effect.
In another embodiment provided by the present invention, a computer program product containing instructions is further provided which, when run on a computer, causes the computer to execute the method for constructing a video search ranking training set based on user logs described in any of the above embodiments, achieving the same technical effect.
The above embodiments may be implemented wholly or partly in software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented wholly or partly in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the flows or functions described in the embodiments of the present invention are generated wholly or partly. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired means (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless means (such as infrared, radio, or microwave). The computer-readable storage medium may be any usable medium that a computer can access, or a data storage device such as a server or data center that integrates one or more usable media. The usable medium may be a magnetic medium (for example, a floppy disk, hard disk, or magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a solid state disk (SSD)).
It should be noted that, in this document, relational terms such as first and second are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise", and any other variants are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of other identical elements in the process, method, article, or device that includes the element.
The embodiments in this specification are described in a related manner; for identical or similar parts, the embodiments may refer to one another, and each embodiment focuses on its differences from the others. In particular, because the device, electronic device, and storage medium embodiments are substantially similar to the method embodiments, they are described relatively briefly; for relevant details, refer to the description of the method embodiments.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit the protection scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention falls within the protection scope of the present invention.

Claims (13)

  1. A method for constructing a video search ranking training set based on user logs, characterized by comprising:
    obtaining user search logs, where the obtained user search logs contain the video-related features corresponding to the searched videos, and the video-related features include at least the video playing duration;
    sampling the obtained user search logs, and using the sampled user search logs as training samples;
    obtaining the video playing duration of each training sample, and obtaining the score of each training sample according to the preset correspondence between scores and video playing durations;
    constructing the training set from all training samples and their corresponding scores.
  2. The method according to claim 1, characterized in that the step of obtaining user search logs is: each time a user searches according to a search term, obtaining the search log generated by that search.
  3. The method according to claim 1, characterized in that the video-related features further include: the video's own features, video text relevance features, and user-dimension features;
    the step of obtaining user search logs includes:
    when a user searches according to a search term, generating an original user search log for that search term and search, where the original user search log includes: the video length and video data type features among the video's own features, and the feature value corresponding to each feature;
    calculating the video freshness feature value among the video's own features, the video text relevance feature value of the search term and the search result, and the user-dimension feature values corresponding to user clicks, historical click-through rate, and search time;
    adding all features, including the feature values of the video's own features, the video text relevance feature value, and the user-dimension feature values, to the original user search log, and generating and saving the final user search log.
  4. The method according to claim 3, characterized in that the training samples include positive samples and negative samples;
    the step of sampling the obtained user search logs and using the sampled user search logs as training samples includes:
    obtaining, from the obtained user search logs, the logs in which the user clicked to play, and using each click-to-play log as one positive training sample;
    performing negative sampling on the user search logs other than the click-to-play logs to obtain negative sampling logs, and using the obtained negative sampling logs as negative training samples.
  5. The method according to claim 4, characterized in that the step of obtaining, from the obtained user search logs, the logs in which the user clicked to play, and using each click-to-play log as one positive training sample, further includes: setting the type of the sample to a first-type identifier;
    the negative sampling step includes:
    obtaining, from the user search logs other than the click-to-play logs, the logs in which the user did not click to play, using them as negative sampling logs, and setting their type to a second-type identifier;
    obtaining, among the negative sampling logs carrying the second-type identifier, those in which the text relevance between the search term and the search result is less than a preset threshold, and setting their type to a third-type identifier;
    obtaining, among the negative sampling logs carrying the second-type identifier, the negative sampling logs determined according to preset rules, and setting their type to a fourth-type identifier.
  6. The method according to claim 4, characterized in that the step of obtaining the video playing duration of each training sample and obtaining the score of each training sample according to the preset correspondence between scores and video playing durations includes:
    comparing the playing duration of each positive sample with a preset first playing duration threshold, a preset second playing duration threshold, and a preset third playing duration threshold, where the first playing duration threshold is less than the second playing duration threshold, and the second playing duration threshold is less than the third playing duration threshold;
    if the video playing duration of the positive sample is less than the preset first playing duration threshold, the score of the positive sample is the preset lowest score;
    or, if the video playing duration of the positive sample is greater than the preset first playing duration threshold and less than the preset second playing duration threshold, the score of the positive sample is the preset second-lowest score;
    or, if the video playing duration of the positive sample is greater than the preset second playing duration threshold and less than the preset third playing duration threshold, the score of the positive sample is the preset second-highest score;
    or, if the video playing duration of the positive sample is greater than the preset third playing duration threshold, the score of the positive sample is the preset highest score;
    determining the score of each negative sample as the preset lowest score.
  7. A device for constructing a video search ranking training set based on user logs, characterized by comprising:
    a log acquisition module, configured to obtain user search logs, where the obtained user search logs contain the video-related features corresponding to the searched videos, and the video-related features include at least the video playing duration;
    a sample acquisition module, configured to sample the obtained user search logs and use the sampled user search logs as training samples;
    a score acquisition module, configured to obtain the video playing duration of each training sample and obtain the score of each training sample according to the preset correspondence between scores and video playing durations;
    a training set construction module, configured to construct the training set from all training samples and their corresponding scores.
  8. The device according to claim 7, characterized in that the log acquisition module is specifically configured to:
    each time a user searches according to a search term, obtain the search log generated by that search.
  9. The device according to claim 7, characterized in that, in the user search logs obtained by the log acquisition module, the video-related features corresponding to the searched videos further include: features of the video itself, video text relevance features, and user-dimension features;
    the log acquisition module is specifically configured to:
    when a user performs a search with a search term, generate an original user search log for the search term and the search, the original user search log including: the video length and video data type features among the features of the video itself, and the feature value corresponding to each feature;
    calculate the video freshness feature value among the features of the video itself, the video text relevance feature value between the search term and the search result, and the user-dimension feature values corresponding to user clicks, historical click-through rate, and search time;
    add all the features, comprising the feature values of the features of the video itself, the video text relevance feature value, and the user-dimension feature values, to the original user search log, and generate and save a final user search log.
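As a rough illustration of the enrichment step in claim 9, the sketch below copies an original search log and adds computed feature values to it. The claim does not say how freshness, text relevance, or click-through rate are calculated, so the formulas here (days since upload, query-token overlap with the title, clicks divided by impressions) and all field names are assumptions made for the example.

```python
from datetime import date


def enrich_log(raw: dict, today: date) -> dict:
    """Return a copy of the original log with computed feature values added."""
    final = dict(raw)  # the original log (video length, data type, ...) is preserved

    # Video freshness: days since upload (assumed definition).
    final["freshness_days"] = (today - raw["upload_date"]).days

    # Text relevance: share of query tokens found in the title (toy stand-in).
    query_tokens = raw["query"].lower().split()
    title_tokens = set(raw["title"].lower().split())
    final["text_relevance"] = (
        sum(t in title_tokens for t in query_tokens) / len(query_tokens)
        if query_tokens
        else 0.0
    )

    # User-dimension feature: historical click-through rate (assumed definition).
    impressions = raw["past_impressions"]
    final["historical_ctr"] = raw["past_clicks"] / impressions if impressions else 0.0
    return final


raw_log = {
    "query": "funny cat",
    "title": "Funny cat compilation",
    "video_length": 600,
    "data_type": "clip",
    "upload_date": date(2018, 1, 9),
    "past_clicks": 5,
    "past_impressions": 20,
}
final_log = enrich_log(raw_log, today=date(2018, 1, 19))
```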
  10. The device according to claim 9, characterized in that the training samples include positive samples and negative samples;
    the sample acquisition module comprises:
    a positive sample acquisition submodule, configured to obtain, from the obtained user search logs, the logs in which the user clicked to play a video, and use each such click-to-play log as one positive training sample;
    a negative sample acquisition submodule, configured to perform negative sampling on the obtained user search logs other than the click-to-play logs to obtain negative sample logs, and use the obtained negative sample logs as negative training samples.
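The positive/negative split of claim 10 might be sketched as follows. Claim 10 does not fix a negative sampling strategy, so uniform random sampling at a fixed rate is an assumption made here, as are the field name `clicked_play` and the parameters `negative_rate` and `seed`.

```python
import random


def split_samples(logs, negative_rate=0.5, seed=0):
    """Split logs into positive samples and a random subset of negative samples."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    # Every click-to-play log becomes one positive sample.
    positives = [log for log in logs if log["clicked_play"]]
    # Negative sampling runs over the remaining logs only (uniform sampling is assumed).
    rest = [log for log in logs if not log["clicked_play"]]
    k = min(len(rest), round(len(rest) * negative_rate))
    negatives = rng.sample(rest, k)
    return positives, negatives


logs = [
    {"video_id": "v1", "clicked_play": True},
    {"video_id": "v2", "clicked_play": False},
    {"video_id": "v3", "clicked_play": False},
    {"video_id": "v4", "clicked_play": True},
    {"video_id": "v5", "clicked_play": False},
    {"video_id": "v6", "clicked_play": False},
]
positives, negatives = split_samples(logs)
```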
  11. The device according to claim 10, characterized in that the positive sample acquisition submodule is further configured to: set the type of each positive sample to a first-type identifier;
    the negative sample acquisition submodule comprises a negative sampling submodule;
    the negative sampling submodule is configured to:
    obtain, from the user search logs other than the click-to-play logs, the logs in which the user did not click to play, as negative sample logs, and set their type to a second-type identifier;
    obtain, from the negative sample logs corresponding to the second-type identifier, the negative sample logs in which the text relevance between the search term and the search result is below a preset threshold, and set their type to a third-type identifier;
    obtain, from the negative sample logs corresponding to the second-type identifier, the negative sample logs determined according to preset rules, and set their type to a fourth-type identifier.
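One way to picture the four type identifiers of claim 11 is a tagging function like the sketch below. The claim leaves the "preset rules" unspecified and does not say which identifier wins when a log qualifies for both the third and fourth types, so the `rule` predicate, the relevance threshold value, the field names, and the precedence (third type checked before fourth) are all assumptions.

```python
from enum import IntEnum
from typing import Callable


class SampleType(IntEnum):
    CLICKED = 1        # first-type identifier: positive sample
    NOT_CLICKED = 2    # second-type identifier: shown but not clicked to play
    LOW_RELEVANCE = 3  # third-type identifier: text relevance below threshold
    RULE_BASED = 4     # fourth-type identifier: selected by a preset rule


def tag_type(log: dict, relevance_threshold: float,
             rule: Callable[[dict], bool]) -> SampleType:
    """Assign a type identifier to one search log."""
    if log["clicked_play"]:
        return SampleType.CLICKED
    # Everything below starts from the second type and may be refined.
    if log["text_relevance"] < relevance_threshold:
        return SampleType.LOW_RELEVANCE
    if rule(log):
        return SampleType.RULE_BASED
    return SampleType.NOT_CLICKED


# Hypothetical preset rule: results ranked far down the page.
def far_down(log: dict) -> bool:
    return log["position"] > 20


tagged = [
    tag_type({"clicked_play": True, "text_relevance": 0.1, "position": 1}, 0.3, far_down),
    tag_type({"clicked_play": False, "text_relevance": 0.1, "position": 1}, 0.3, far_down),
    tag_type({"clicked_play": False, "text_relevance": 0.8, "position": 50}, 0.3, far_down),
    tag_type({"clicked_play": False, "text_relevance": 0.8, "position": 3}, 0.3, far_down),
]
```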
  12. The device according to claim 10, characterized in that the score acquisition module is specifically configured to:
    compare the playing duration of each positive sample with a preset first playing duration threshold, a preset second playing duration threshold, and a preset third playing duration threshold, respectively, wherein the preset first playing duration threshold is less than the preset second playing duration threshold, and the preset second playing duration threshold is less than the preset third playing duration threshold;
    if the video playing duration of the positive sample is less than the preset first playing duration threshold, the score of the positive sample is the preset lowest score;
    or, if the video playing duration of the positive sample is greater than the preset first playing duration threshold and less than the preset second playing duration threshold, the score of the positive sample is the preset second-lowest score;
    or, if the video playing duration of the positive sample is greater than the preset second playing duration threshold and less than the preset third playing duration threshold, the score of the positive sample is the preset second-highest score;
    or, if the video playing duration of the positive sample is greater than the preset third playing duration threshold, the score of the positive sample is the preset highest score;
    the score of each negative sample is set to the preset lowest score.
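The threshold comparison in claim 12 amounts to a four-bucket lookup, which the following sketch expresses with `bisect`. The threshold values (30, 300, 1200 seconds) and score values (0 to 3) are hypothetical, since the claim only requires that they be preset. The claim also does not say what happens when a duration exactly equals a threshold; this sketch assumes such a duration falls into the higher bucket.

```python
from bisect import bisect_right


def score_sample(is_positive, play_seconds,
                 thresholds=(30, 300, 1200),  # first < second < third (hypothetical values)
                 scores=(0, 1, 2, 3)):        # lowest, second-lowest, second-highest, highest
    """Map a sample to its score from the video playing duration."""
    if not is_positive:
        return scores[0]  # every negative sample gets the lowest score
    # bisect_right counts how many thresholds are <= the duration, which is
    # exactly the index of the score bucket (equality goes to the higher bucket).
    return scores[bisect_right(thresholds, play_seconds)]


example_scores = [
    score_sample(True, 10),     # below the first threshold
    score_sample(True, 100),    # between the first and second thresholds
    score_sample(True, 600),    # between the second and third thresholds
    score_sample(True, 1500),   # above the third threshold
    score_sample(False, 1500),  # negative sample, duration ignored
]
```

Keeping the thresholds in one sorted tuple makes the ordering constraint from the claim (first below second below third) easy to check in one place.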
  13. An electronic device, characterized by comprising a processor, a communication interface, a memory, and a communication bus, wherein the processor, the communication interface, and the memory communicate with one another through the communication bus;
    the memory is configured to store a computer program;
    the processor is configured to implement the method steps of any one of claims 1-6 when executing the program stored in the memory.
CN201810052822.2A 2018-01-19 2018-01-19 A kind of video search sequence training set construction method and device based on user journal Pending CN108460085A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810052822.2A CN108460085A (en) 2018-01-19 2018-01-19 A kind of video search sequence training set construction method and device based on user journal

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810052822.2A CN108460085A (en) 2018-01-19 2018-01-19 A kind of video search sequence training set construction method and device based on user journal

Publications (1)

Publication Number Publication Date
CN108460085A true CN108460085A (en) 2018-08-28

Family

ID=63220890

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810052822.2A Pending CN108460085A (en) 2018-01-19 2018-01-19 A kind of video search sequence training set construction method and device based on user journal

Country Status (1)

Country Link
CN (1) CN108460085A (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110072466A1 (en) * 1999-04-19 2011-03-24 At&T Intellectual Property Ii, L.P. Via Transfer From At&T Corp. Browsing and Retrieval of Full Broadcast-Quality Video
CN104994424A (en) * 2015-06-30 2015-10-21 北京奇艺世纪科技有限公司 Method and device for constructing audio/video standard data set
CN106897398A (en) * 2017-02-08 2017-06-27 北京奇艺世纪科技有限公司 A kind of video display method and device
CN107122467A (en) * 2017-04-26 2017-09-01 努比亚技术有限公司 The retrieval result evaluation method and device of a kind of search engine, computer-readable medium
CN107368573A (en) * 2017-07-14 2017-11-21 北京奇艺世纪科技有限公司 Video quality evaluation method and device
CN107577707A (en) * 2017-07-31 2018-01-12 北京奇艺世纪科技有限公司 A kind of target data set creation method, device and electronic equipment
CN107341272A (en) * 2017-08-25 2017-11-10 北京奇艺世纪科技有限公司 A kind of method for pushing, device and electronic equipment

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111199728A (en) * 2018-10-31 2020-05-26 阿里巴巴集团控股有限公司 Training data acquisition method and device, intelligent sound box and intelligent television
CN109753601B (en) * 2018-11-28 2021-10-22 北京奇艺世纪科技有限公司 Method and device for determining click rate of recommended information and electronic equipment
CN109753601A (en) * 2018-11-28 2019-05-14 北京奇艺世纪科技有限公司 Recommendation information clicking rate determines method, apparatus and electronic equipment
CN109857845A (en) * 2019-01-03 2019-06-07 北京奇艺世纪科技有限公司 Model training and data retrieval method, device, terminal and computer readable storage medium
CN109857845B (en) * 2019-01-03 2021-06-22 北京奇艺世纪科技有限公司 Model training and data retrieval method, device, terminal and computer-readable storage medium
WO2021042826A1 (en) * 2019-09-05 2021-03-11 苏宁云计算有限公司 Video playback completeness prediction method and apparatus
CN111061954A (en) * 2019-12-19 2020-04-24 腾讯音乐娱乐科技(深圳)有限公司 Search result sorting method and device and storage medium
CN111061954B (en) * 2019-12-19 2022-03-15 腾讯音乐娱乐科技(深圳)有限公司 Search result sorting method and device and storage medium
CN112084150A (en) * 2020-09-09 2020-12-15 北京百度网讯科技有限公司 Model training method, data retrieval method, device, equipment and storage medium
CN112084150B (en) * 2020-09-09 2024-07-26 北京百度网讯科技有限公司 Model training and data retrieval method, device, equipment and storage medium
CN113326363A (en) * 2021-05-27 2021-08-31 北京百度网讯科技有限公司 Searching method and device, prediction model training method and device, and electronic device
CN113326363B (en) * 2021-05-27 2023-07-25 北京百度网讯科技有限公司 Searching method and device, prediction model training method and device and electronic equipment
CN113077020A (en) * 2021-06-07 2021-07-06 广东电网有限责任公司湛江供电局 Transformer cluster management method and system
CN113378781A (en) * 2021-06-30 2021-09-10 北京百度网讯科技有限公司 Training method and device of video feature extraction model and electronic equipment
CN114501151A (en) * 2022-01-24 2022-05-13 青岛聚看云科技有限公司 Display device and media asset recommendation method
CN114501151B (en) * 2022-01-24 2024-02-23 青岛聚看云科技有限公司 Display equipment and media asset recommendation method
CN115048587A (en) * 2022-08-12 2022-09-13 中博信息技术研究院有限公司 LambdaMart-based address book search intelligent sorting method

Similar Documents

Publication Publication Date Title
CN108460085A (en) A kind of video search sequence training set construction method and device based on user journal
CN108304512A (en) A kind of thick sort method of video search engine, device and electronic equipment
CN110781317B (en) Method and device for constructing event map and electronic equipment
CN102737029B (en) Searching method and system
CN110008378B (en) Corpus collection method, device, equipment and storage medium based on artificial intelligence
CN110737859B (en) UP master matching method and device
CN108304490B (en) Text-based similarity determination method and device and computer equipment
CN110222975A (en) A kind of loss customer analysis method, apparatus, electronic equipment and storage medium
CN112104495B (en) System fault root cause positioning method based on network topology
CN108345601A (en) Search result ordering method and device
CN105159910A (en) Information recommendation method and device
CN108959595B (en) Website construction and experience method and device based on virtual and reality
CN112995690B (en) Live content category identification method, device, electronic equipment and readable storage medium
CN109388634B (en) Address information processing method, terminal device and computer readable storage medium
CN112199582B (en) Content recommendation method, device, equipment and medium
CN105915960A (en) User type determination method and device
CN105701097A (en) Social-network-platform-based public opinion analysis method and system
CN110321845A (en) A kind of method, apparatus and electronic equipment for extracting expression packet from video
CN110659311A (en) Topic pushing method and device, electronic equipment and storage medium
CN107766234A (en) A kind of assessment method, the apparatus and system of the webpage health degree based on mobile device
CN105574030A (en) Information search method and device
CN110209551A (en) A kind of recognition methods of warping apparatus, device, electronic equipment and storage medium
CN115510202A (en) Intelligent question-answering system based on power grid equipment knowledge graph
CN112700203B (en) Intelligent marking method and device
CN110674632A (en) Method and device for determining security level, storage medium and equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20180828