CN108629224B - Information presentation method and device - Google Patents
Information presentation method and device
- Publication number
- CN108629224B CN108629224B CN201710152564.0A CN201710152564A CN108629224B CN 108629224 B CN108629224 B CN 108629224B CN 201710152564 A CN201710152564 A CN 201710152564A CN 108629224 B CN108629224 B CN 108629224B
- Authority
- CN
- China
- Prior art keywords
- presented
- information
- image
- frame
- target item
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/46—Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2415—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
- G06F18/24155—Bayesian classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/25—Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
- H04N21/266—Channel or content management, e.g. generation and management of keys and entitlement messages in a conditional access system, merging a VOD unicast channel into a multicast channel
- H04N21/2668—Creating a channel for a dedicated end-user group, e.g. insertion of targeted commercials based on end-user profiles
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/431—Generation of visual interfaces for content selection or interaction; Content or additional data rendering
Abstract
This application discloses an information presentation method and device. One specific embodiment of the method includes: detecting a key frame in a target video, where a key frame is a frame whose image entropy is greater than a preset image entropy threshold; in response to detecting a key frame, detecting an image of a target item in the key frame; in response to detecting the image of the target item in the key frame, determining whether the number of frames after the key frame that continuously present the image of the target item is greater than a predetermined frame count; and, if it is greater, obtaining to-be-presented information matching the image of the target item and presenting that information in the frames that continuously present the image of the target item. The embodiment presents to-be-presented information targeted at the target item in the target video, improving the accuracy of information push.
Description
Technical field
This application relates to the field of computer technology, specifically to the field of video technology, and more particularly to an information presentation method and device.
Background
With the rapid spread of the internet and the development of digital image acquisition and processing technology, the online video industry has risen quickly and plays an increasingly important role in people's daily lives. As a comprehensive medium combining images, sound, and text, video has a powerful capacity for carrying and transmitting information, so semantic analysis and understanding of video has long been an important research direction in multimedia signal processing. Meanwhile, with the fast growth of e-commerce platforms, online shopping has increasingly become a preferred way to shop, which creates business opportunities for combining the online video industry with e-commerce.
Analyzing video content and combining it with personalized user information to form a personalized advertisement recommender system helps increase the click-through rate and conversion rate of advertisements; personalized recommendation also effectively reduces viewers' discomfort at passively receiving fixed advertisements. Therefore, content analysis of various online videos together with personalized recommendation of related services such as online shopping has significant research value and practical value.
Summary of the invention
The purpose of this application is to propose an improved information presentation method and device to solve the technical problems mentioned in the background section above.
In a first aspect, an embodiment of this application provides an information presentation method, comprising: detecting a key frame in a target video, where a key frame is a frame whose image entropy is greater than a preset image entropy threshold; in response to detecting a key frame, detecting an image of a target item in the key frame; in response to detecting the image of the target item in the key frame, determining whether the number of frames after the key frame that continuously present the image of the target item is greater than a predetermined frame count; and, if it is greater than the predetermined frame count, obtaining to-be-presented information matching the image of the target item and presenting the to-be-presented information in the frames that continuously present the image of the target item.
In some embodiments, detecting the key frame in the target video comprises: taking a frame whose image entropy is greater than the preset image entropy threshold as a key frame; following the playing order of the target video, obtaining the first subsequent frame whose image entropy is greater than the preset image entropy threshold; determining whether the similarity between that frame and the key frame is less than a preset similarity threshold; and, if it is less than the preset similarity threshold, determining that the frame is also a key frame.
In some embodiments, detecting the image of the target item in the key frame comprises: detecting the image of the target item in the key frame based on a pre-trained convolutional neural network, where the convolutional neural network is used to identify image features of the target item and to determine the image of the target item according to those features.
In some embodiments, determining whether the number of frames after the key frame that continuously present the image of the target item is greater than the predetermined frame count comprises: using a compressive tracking algorithm to determine whether the image of the target item is continuously presented in the frames after the key frame; and, if so, accumulating the number of frames that continuously present the image of the target item and determining whether that number is greater than the predetermined frame count.
In some embodiments, presenting the to-be-presented information in the frames that continuously present the image of the target item comprises: determining position information of the image of the target item in those frames; determining a presentation position for the to-be-presented information according to the position information; and presenting the to-be-presented information at the presentation position.
In some embodiments, obtaining the to-be-presented information matching the image of the target item comprises: obtaining a set of to-be-presented information, where each item of to-be-presented information includes a picture; determining the similarity between the picture in each item of to-be-presented information and the image of the target item; and selecting at least one item of to-be-presented information from the set in descending order of similarity.
In some embodiments, the to-be-presented information includes text information, and obtaining the to-be-presented information matching the image of the target item comprises: obtaining text information that matches the category of the image of the target item.
In some embodiments, obtaining the to-be-presented information matching the image of the target item comprises: obtaining a category label of the user watching the target video through a terminal, where the category label of the user is obtained through big-data analysis of the user's behavioral data; and obtaining, from the set of to-be-presented information, at least one item that matches the category label of the user.
In a second aspect, an embodiment of this application provides an information presentation device, comprising: a key frame detection unit for detecting a key frame in a target video, where a key frame is a frame whose image entropy is greater than a preset image entropy threshold; an image detection unit for detecting, in response to detecting a key frame, an image of a target item in the key frame; a determination unit for determining, in response to detecting the image of the target item in the key frame, whether the number of frames after the key frame that continuously present the image of the target item is greater than a predetermined frame count; and a display unit for obtaining, if the count is greater than the predetermined frame count, to-be-presented information matching the image of the target item and presenting it in the frames that continuously present the image of the target item.
In some embodiments, the key frame detection unit is further configured to: take a frame whose image entropy is greater than the preset image entropy threshold as a key frame; following the playing order of the target video, obtain the first subsequent frame whose image entropy is greater than the preset image entropy threshold; determine whether the similarity between that frame and the key frame is less than a preset similarity threshold; and, if it is less than the preset similarity threshold, determine that the frame is also a key frame.
In some embodiments, the image detection unit is further configured to detect the image of the target item in the key frame based on a pre-trained convolutional neural network, where the convolutional neural network is used to identify image features of the target item and to determine the image of the target item according to those features.
In some embodiments, the determination unit is further configured to: use a compressive tracking algorithm to determine whether the image of the target item is continuously presented in the frames after the key frame; and, if so, accumulate the number of frames that continuously present the image of the target item and determine whether that number is greater than the predetermined frame count.
In some embodiments, the display unit is further configured to: determine position information of the image of the target item in the frames that continuously present it; determine a presentation position for the to-be-presented information according to the position information; and present the to-be-presented information at the presentation position.
In some embodiments, the display unit is further configured to: obtain a set of to-be-presented information, where each item includes a picture; determine the similarity between the picture in each item and the image of the target item; and select at least one item of to-be-presented information from the set in descending order of similarity.
In some embodiments, the to-be-presented information includes text information, and the display unit is further configured to obtain text information matching the category of the image of the target item.
In some embodiments, the display unit is further configured to: obtain the category label of the user watching the target video through a terminal, where the category label of the user is obtained through big-data analysis of the user's behavioral data; and obtain, from the set of to-be-presented information, at least one item matching the category label of the user.
In a third aspect, an embodiment of this application provides a device, comprising: one or more processors; and a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method of any embodiment of the first aspect.
In a fourth aspect, an embodiment of this application provides a computer-readable storage medium on which a computer program is stored; when the program is executed by a processor, it implements the method of any embodiment of the first aspect.
The information presentation method and device provided by the embodiments of this application detect the image of a target item in the key frames of a target video and present to-be-presented information on the frames that continuously present that image. By presenting information targeted at the content of the target video, the application improves the precision of information presentation, thereby reducing cost and improving the user's click-through rate.
Brief description of the drawings
Other features, objects, and advantages of this application will become more apparent upon reading the following detailed description of non-restrictive embodiments with reference to the accompanying drawings:
Fig. 1 is an exemplary system architecture to which this application can be applied;
Fig. 2 is a flowchart of one embodiment of the information presentation method according to this application;
Fig. 3a is a schematic diagram of the construction of the compressed feature vector in the information presentation method of this application;
Fig. 3b is a schematic diagram of the information presentation process of the information presentation method of this application;
Fig. 4 is a flowchart of another embodiment of the information presentation method according to this application;
Fig. 5 is a structural schematic diagram of one embodiment of the information presentation device according to this application;
Fig. 6 is a structural schematic diagram of a computer system adapted to implement a device of the embodiments of this application.
Detailed description of embodiments
This application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are used only to explain the related invention and do not limit it. It should also be noted that, for ease of description, only the parts relevant to the related invention are shown in the drawings.
It should be noted that, in the absence of conflict, the embodiments of this application and the features in the embodiments may be combined with each other. This application is described in detail below with reference to the drawings and in conjunction with the embodiments.
Fig. 1 shows an exemplary system architecture 100 in which embodiments of the information presentation method or information presentation device of this application can be applied.
As shown in Fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 is the medium providing communication links between the terminal devices 101, 102, 103 and the server 105, and may include various connection types, such as wired links, wireless communication links, or fiber-optic cables.
A user can use the terminal devices 101, 102, 103 to interact with the server 105 through the network 104 to receive or send messages. Various client applications supporting video playback can be installed on the terminal devices 101, 102, 103, such as web browsers, shopping applications, search applications, instant messaging tools, and social platform software.
The terminal devices 101, 102, 103 can be various electronic devices with a display screen that support video playback, including but not limited to smartphones, tablet computers, e-book readers, MP3 players (Moving Picture Experts Group Audio Layer III), MP4 players (Moving Picture Experts Group Audio Layer IV), laptop computers, and desktop computers.
The server 105 can be a server providing various services, for example a background video server supporting the video displayed on the terminal devices 101, 102, 103. The background video server can analyze and process received data such as video playback requests and feed the processing result (such as video data) back to the terminal devices.
It should be noted that the information presentation method provided by the embodiments of this application is generally executed by the server 105; accordingly, the information presentation device is generally disposed in the server 105.
It should be understood that the numbers of terminal devices, networks, and servers in Fig. 1 are merely illustrative. There can be any number of terminal devices, networks, and servers according to implementation needs.
With continued reference to Fig. 2, a flow 200 of one embodiment of the information presentation method according to this application is shown. The information presentation method comprises the following steps:
Step 201: detect a key frame in the target video.
In this embodiment, the electronic device on which the information presentation method runs (such as the server shown in Fig. 1) can receive, through a wired or wireless connection, a video playback request from the terminal on which the user plays video, obtain the target video according to the video playback request, and detect the key frames in the target video. A key frame is a frame of the target video whose image entropy is greater than a preset image entropy threshold. Image entropy expresses the average number of bits of the image's set of gray levels, in bits per pixel, and reflects the average information content of the video source. Image entropy is defined as:

H = -Σᵢ pᵢ log₂(pᵢ) (formula 1)

where H is the image entropy and pᵢ is the probability that a pixel in the image has gray level i. Selecting only the frames of the target video whose image entropy exceeds the preset image entropy threshold removes blank frames from the video and further reduces the complexity of the algorithm.
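A minimal sketch of this entropy-based blank-frame filter follows; the threshold value here is a hypothetical choice for illustration, not one given by the application:

```python
import numpy as np

def image_entropy(gray_frame: np.ndarray) -> float:
    """Shannon entropy in bits/pixel of an 8-bit grayscale frame (formula 1)."""
    hist = np.bincount(gray_frame.ravel(), minlength=256).astype(np.float64)
    p = hist / hist.sum()              # p_i: probability of gray level i
    p = p[p > 0]                       # zero-probability levels contribute nothing
    return float(-(p * np.log2(p)).sum())

ENTROPY_THRESHOLD = 4.0  # hypothetical preset image entropy threshold

def is_candidate_key_frame(gray_frame: np.ndarray) -> bool:
    return image_entropy(gray_frame) > ENTROPY_THRESHOLD
```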
In some optional implementations of this embodiment, detecting the key frames in the target video comprises: taking a frame whose image entropy is greater than the preset image entropy threshold as a key frame; following the playing order of the target video, obtaining the first subsequent frame whose image entropy is greater than the preset image entropy threshold; determining whether the similarity between that frame and the key frame is less than a preset similarity threshold; and, if so, determining that the frame is also a key frame. Under normal circumstances a target video contains multiple independent scenes; extracting, in each independent scene, a key frame containing the image of the target item helps avoid repeated detection, thereby reducing the complexity of the algorithm. This application detects key frames using the event information of consecutive frames in the video. An "event" here means dividing the video into independent frame units: within each unit the continuity between frames is strong and the difference in image information is small, while the image difference between different units is large. The similarity between images is characterized by the pixel difference between them, as in the following formula:

sim = -abs(curFrame - preFrame) (formula 2)

where sim is the similarity, curFrame and preFrame are the values of the same pixel position in two consecutive frames, and abs is the absolute value. Following the playing order of the video, the first frame whose image entropy exceeds the preset image entropy threshold is taken as a key frame; the value of a given pixel on that key frame is preFrame, and the value of the pixel at the same position in a later frame is curFrame. If the sim value calculated according to formula 2 is less than the preset similarity threshold, the later frame is also determined to be a key frame.
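A sketch of the resulting key-frame scan, reusing the entropy helper from the previous sketch; aggregating the per-pixel similarity of formula 2 by its mean over the frame, and the threshold value, are assumptions for illustration:

```python
def detect_key_frames(gray_frames, sim_threshold=-30.0):
    """Emit a new key frame whenever the mean per-pixel similarity
    (formula 2) to the last key frame falls below the threshold,
    i.e. a new independent scene begins."""
    key_frames, last_key = [], None
    for idx, frame in enumerate(gray_frames):
        if image_entropy(frame) <= ENTROPY_THRESHOLD:
            continue                                  # skip blank frames
        if last_key is None:
            key_frames.append(idx)
            last_key = frame
            continue
        diff = frame.astype(np.int16) - last_key.astype(np.int16)
        sim = -np.abs(diff).mean()                    # formula 2, averaged
        if sim < sim_threshold:                       # scene change
            key_frames.append(idx)
            last_key = frame
    return key_frames
```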
Step 202: in response to detecting a key frame, detect the image of the target item in the key frame.
In this embodiment, a key frame may contain images of multiple items, for example T-shirts, caps, shoes, and beverages. The image of the target item can be detected among these images so that information is presented in a targeted manner, rather than presenting information related to every item appearing in the key frame. For example, when information relevant to T-shirts needs to be presented, the T-shirt is taken as the target item and the image of the T-shirt is detected.
In some optional implementations of this embodiment, detecting the image of the target item in the key frame comprises: detecting the image of the target item in the key frame based on a pre-trained convolutional neural network, where the convolutional neural network is used to identify image features of the target item and to determine the image of the target item according to those features. Extracting items with a convolutional neural network effectively identifies the position and category of the target item's image in the key frame, which facilitates subsequent target tracking and item recommendation. For a picture input to the convolutional neural network, candidate regions are first extracted (1,000 candidate regions per picture); each candidate region is then normalized in size; the high-dimensional features of each candidate region are extracted with the convolutional network; and finally the candidate regions are classified through fully connected layers. By classifying each region, the image of the target item in the key frame is extracted and its position can also be determined. The targets detected by the pre-trained network of this application may include garment categories such as shoes, jackets, shorts, skirts, and dresses. This information is significant for subsequent item recommendation, and the position information of the target item serves to initialize subsequent target tracking.
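The application's own detection network is not published; as a stand-in, the same region-proposal-plus-classification idea can be sketched with an off-the-shelf detector. The garment label set and score threshold below are assumptions:

```python
import torch
import torchvision

# Stand-in detector; the patent's pre-trained CNN is not specified.
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

GARMENT_CLASSES = {"shoes", "jacket", "shorts", "skirt", "dress"}  # hypothetical labels

def detect_target_items(frame_tensor, label_names, score_threshold=0.7):
    """Return (label, box, score) for garment-type items found in a key frame.
    frame_tensor: float CHW tensor in [0, 1]; label_names: index -> class name."""
    with torch.no_grad():
        out = model([frame_tensor])[0]
    hits = []
    for label, box, score in zip(out["labels"], out["boxes"], out["scores"]):
        name = label_names[int(label)]
        if float(score) >= score_threshold and name in GARMENT_CLASSES:
            hits.append((name, box.tolist(), float(score)))
    return hits
```

The returned boxes give both the category used to match to-be-presented information and the position used to initialize the tracker.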
Convolutional neural networks (CNNs) are a type of artificial neural network. A convolutional neural network is a feedforward neural network whose artificial neurons respond to surrounding units within a local coverage area, which makes it perform well on large-scale image processing. In general, the basic structure of a CNN includes two kinds of layers. The first is the feature extraction layer: the input of each neuron is connected to a local receptive field of the previous layer, and the local feature is extracted; once a local feature is extracted, its positional relationship with other features is also determined. The second is the computation layer: each computation layer of the network consists of multiple feature maps, each feature map is a plane, and all neurons in a plane share equal weights. The feature-mapping structure uses a sigmoid function with a small influence-function kernel as the activation function of the convolutional network, giving the feature maps shift invariance. Furthermore, because the neurons on a mapping plane share weights, the number of free parameters of the network is reduced. Each feature extraction layer in a convolutional neural network is followed by a computation layer for local averaging and secondary extraction; this distinctive two-stage feature extraction structure reduces the feature resolution. Convolutional neural networks form more abstract high-level representations of attribute categories or features by combining low-level features, so as to discover distributed feature representations of the data. The essence of deep learning is to learn more useful features by building machine-learning models with many hidden layers and massive training data, thereby improving the accuracy of classification or prediction. The convolutional neural network here can be used to identify features of the target item in the key frame, where the features of the target item may include color, texture, shading, direction change, and material.
Step 203: in response to detecting the image of the target item in the key frame, determine whether the number of frames after the key frame that continuously present the image of the target item is greater than a predetermined frame count.
In this embodiment, various tracking algorithms can be used to track, in subsequent frames, the image of the target item detected in step 202. Presenting the to-be-presented information is meaningful only when the image of the target item appears in many consecutive frames. Choosing only those spans where the image of the target item appears for longer than a certain threshold means, on one hand, that the user has enough time to click on the to-be-presented information (such as an advertisement) and, on the other hand, that the amount of presented information is effectively reduced so that the user's viewing experience is not disturbed. Clicking an information entry takes the user to the webpage of the item corresponding to the to-be-presented information. Tracking algorithms such as tracking-learning-detection (TLD) can be used to track the image of the target item.
In some optional implementations of this embodiment, determining whether the number of frames after the key frame that continuously present the image of the target item is greater than the predetermined frame count comprises: using a compressive tracking algorithm to determine whether the image of the target item is continuously presented in the frames after the key frame; and, if so, accumulating the number of frames that continuously present the image and determining whether it is greater than the predetermined frame count. Compressive tracking is a simple and efficient tracking algorithm based on compressed sensing. It first reduces the dimensionality of multi-scale image features with a random measurement matrix satisfying the restricted isometry property (RIP) condition of compressed sensing, and then classifies the reduced features with a simple naive Bayes classifier. It follows the general pattern-classification framework of first extracting image features and then classifying them; the differences are that the feature extraction here uses compressed sensing and the classifier is naive Bayes. The classifier is then updated every frame by online learning.
The compressive tracking algorithm proceeds as follows:
(1) At frame t, sample image patches of the target (positive samples) and of the background (negative samples), apply a multi-scale transform to them, reduce the dimensionality of the multi-scale image features with a sparse measurement matrix, and train a naive Bayes classifier on the reduced features (target vs. background, a two-class problem).
(2) At frame t+1, sample n scanning windows around the target position tracked in the previous frame (avoiding a scan of the entire image), reduce their dimensionality with the same sparse measurement matrix to extract features, classify them with the naive Bayes classifier trained at frame t, and take the window with the highest classification score as the target window. This achieves target tracking from frame t to frame t+1.
The construction of the compressed feature vector is shown in Fig. 3a, which depicts a sparse n × m matrix that transforms x (the m-dimensional image space) into a low-dimensional space v (n dimensions); mathematically, v = Rx. In the matrix R, elements 301, 303, and 302 represent negative, positive, and zero entries respectively. The arrows indicate that a nonzero element in a row of the measurement matrix R senses one element of x, which is equivalent to convolving a rectangular window filter with the gray level at a fixed position of the input image.

By using the sparse random matrix R above, x is projected to v in the lower-dimensional space. The random matrix R only needs to be computed once when the program starts and then remains unchanged during tracking. Using integral images, v can be computed efficiently.
The classifier is constructed as follows. For each sample z (an m-dimensional vector), its low-dimensional representation is v (an n-dimensional vector, with n much smaller than m). Assuming the elements of v are independently distributed, they can be modeled with a naive Bayes classifier:

H(v) = Σᵢ₌₁ⁿ log( (p(vᵢ|y=1) p(y=1)) / (p(vᵢ|y=0) p(y=0)) ) (formula 3)

where H(v) is the classifier and y ∈ {0, 1} is the sample label, with y = 0 for negative samples and y = 1 for positive samples. The prior probabilities of the two classes are assumed equal: p(y=1) = p(y=0) = 0.5. The conditional probabilities p(vᵢ|y=1) and p(vᵢ|y=0) in the classifier H(v) are assumed Gaussian, with means and standard deviations μᵢ¹, σᵢ¹ and μᵢ⁰, σᵢ⁰ respectively. To adapt to long-term tracking, the model must be updated continuously, i.e. the means and standard deviations of the positive and negative samples are recomputed from the newly detected samples. The update rules (shown for the positive class; the negative class is analogous) are:

μᵢ¹ ← λμᵢ¹ + (1 − λ)μ¹ (formula 4)

σᵢ¹ ← sqrt( λ(σᵢ¹)² + (1 − λ)(σ¹)² + λ(1 − λ)(μᵢ¹ − μ¹)² ) (formula 5)

where λ > 0 in formulas 4 and 5 is the learning factor, and μ¹ and σ¹ are the mean and standard deviation estimated from the newly sampled positive patches. In practical applications, to avoid the accumulation of error, this application takes λ = 0.85.
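A compact sketch of the online naive Bayes classifier at the core of compressive tracking (formulas 3-5), under the stated Gaussian assumption and λ = 0.85; the random projection step is omitted and the feature dimension n is left to the caller:

```python
import numpy as np

LAMBDA = 0.85  # learning factor from the text

class CompressiveClassifier:
    """Online naive Bayes over n compressed features (formulas 3-5)."""

    def __init__(self, n: int):
        self.mu = np.zeros((2, n))      # row 0: negative class, row 1: positive
        self.sigma = np.ones((2, n))    # per-feature standard deviations

    def update(self, v_samples: np.ndarray, label: int) -> None:
        """v_samples: (k, n) compressed features of newly sampled patches."""
        mu_new = v_samples.mean(axis=0)
        sigma_new = v_samples.std(axis=0)
        mu, sigma = self.mu[label], self.sigma[label]
        # apply formula 5 first (it needs the old mean), then formula 4
        self.sigma[label] = np.sqrt(
            LAMBDA * sigma**2 + (1 - LAMBDA) * sigma_new**2
            + LAMBDA * (1 - LAMBDA) * (mu - mu_new) ** 2)
        self.mu[label] = LAMBDA * mu + (1 - LAMBDA) * mu_new

    def score(self, v: np.ndarray) -> float:
        """H(v), formula 3: log-likelihood ratio of positive vs. negative."""
        def log_gauss(mu, sigma):
            s = np.maximum(sigma, 1e-6)
            return -0.5 * ((v - mu) / s) ** 2 - np.log(s)
        return float((log_gauss(self.mu[1], self.sigma[1])
                      - log_gauss(self.mu[0], self.sigma[0])).sum())
```

At frame t+1 the tracker scores each scanning window's compressed feature vector with score() and keeps the window with the highest value.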
Step 204: if the count is greater than the predetermined frame count, obtain the to-be-presented information matching the image of the target item and present it in the frames that continuously present the image of the target item.
In this embodiment, based on the detection of the target item image in step 202 and the tracking in step 203, the category, trajectory, frames of appearance, and duration of the target item can be extracted from the target video. This information helps realize personalized recommendation of information for the user. The to-be-presented information is matched in a preset to-be-presented information bank; by modifying frame data or by superposition, the to-be-presented information is combined with the frames presenting the image of the target item into new frames, so that the to-be-presented information is presented in the newly generated frames. The to-be-presented information can be text or a picture linked to a webpage. As shown in Fig. 3b, the target item "T-shirt" 304 is detected in a key frame of the target video; a picture 305 that links to a webpage associated with "T-shirt" is matched from the preset to-be-presented information bank and presented in the key frame. After the user clicks picture 305, the related webpage opens and information associated with "T-shirt" can be browsed. Likewise, the target item "shoes" 306 is detected in a key frame of the target video; a picture 307 that links to a webpage associated with "shoes" is matched from the preset to-be-presented information bank and presented in the key frame. After the user clicks picture 307, the related webpage opens and information associated with "shoes" can be browsed.
In some optional implementations of this embodiment, presenting the to-be-presented information in the frames that continuously present the image of the target item comprises: determining the position information of the image of the target item in those frames; determining the presentation position of the to-be-presented information according to the position information; and presenting the to-be-presented information at the presentation position. The presentation position can be near the image of the target item, or at another position that does not occlude the image of the target item. The presentation position can be determined according to the size of the image of the target item. For example, if the target item is a pair of shoes and the to-be-presented information is a shoe advertisement occupying more area than the shoe image itself, the advertisement should not be pasted over the shoe image but placed beside it; if the target item is a wardrobe, the wardrobe image is relatively large, so it is suitable to superimpose the to-be-presented information directly on the wardrobe image.
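One way to realize this size-based placement rule is sketched below; the margin and the centering policy are assumptions not given by the application:

```python
def choose_overlay_position(item_box, overlay_size, frame_size, margin=10):
    """Superimpose the overlay on the item if it fits inside the item's
    bounding box, otherwise place it beside the item, clamped to the frame.
    Boxes are (x, y, w, h); sizes are (w, h)."""
    x, y, w, h = item_box
    ow, oh = overlay_size
    fw, fh = frame_size
    if ow <= w and oh <= h:                       # large item: overlay on top
        return x + (w - ow) // 2, y + (h - oh) // 2
    if x + w + margin + ow <= fw:                 # small item: place to the right
        bx = x + w + margin
    else:                                         # otherwise to the left
        bx = max(0, x - ow - margin)
    by = min(max(0, y), fh - oh)
    return bx, by
```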
The method provided by the above embodiment of this application associates the content of the target video with the to-be-presented information, realizing targeted information presentation and improving the hit rate of the to-be-presented information.
With further reference to Fig. 4, a flow 400 of another embodiment of the information presentation method is shown. The flow 400 of this information presentation method comprises the following steps:
Step 401: detect a key frame in the target video.
Step 402: in response to detecting a key frame, detect the image of the target item in the key frame.
Step 403: in response to detecting the image of the target item in the key frame, determine whether the number of frames after the key frame that continuously present the image of the target item is greater than a predetermined frame count.
Steps 401-403 are essentially the same as steps 201-203 and are not described again.
Step 404: if the count is greater than the predetermined frame count, obtain a set of to-be-presented information.
In this embodiment, when the frame count determined in step 403 is greater than the predetermined frame count, to-be-presented information with higher similarity to the target item image is matched from a preset to-be-presented information bank. The to-be-presented information may include pictures.
Step 405: determine the similarity between the picture in each item of to-be-presented information in the set and the image of the target item.
In this embodiment, if the to-be-presented information contains a picture, the similarity between the histogram of the picture and the histogram of the image of the target item can be determined. First, histogram data are generated from the pixel data of the target item image and of the picture in the to-be-presented information; the histogram data are normalized; and the Bhattacharyya coefficient algorithm is applied to the histogram data to obtain the image similarity value. The value lies in [0, 1], where 0 indicates extremely different and 1 indicates extremely similar (identical).
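A minimal sketch of this histogram comparison on 8-bit grayscale images (the application may equally use color histograms; that detail is not specified):

```python
import numpy as np

def histogram_similarity(img_a: np.ndarray, img_b: np.ndarray, bins=256) -> float:
    """Bhattacharyya coefficient of normalized histograms: 0 = disjoint, 1 = identical."""
    ha = np.bincount(img_a.ravel(), minlength=bins).astype(np.float64)
    hb = np.bincount(img_b.ravel(), minlength=bins).astype(np.float64)
    ha /= ha.sum()                          # normalize to probability distributions
    hb /= hb.sum()
    return float(np.sqrt(ha * hb).sum())    # value in [0, 1]
```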
In some optional implementations of this embodiment, if the to-be-presented information includes text information, text information matching the category of the image of the target item is obtained. The category is determined from keywords in the text information and matched against the category of the target item's image to obtain a similarity. For example, for the text information "XX sneakers, price 299 yuan", the similarity with the image of the target item "sneakers" can reach 90%; the similarity of the same image with the text information "XX leather shoes, price 299 yuan" can reach 70%; and its similarity with the text information "XX basketball, price 299 yuan" may be only 10%.
Step 406: select at least one item of to-be-presented information from the set in descending order of similarity.
In this embodiment, at least one item of to-be-presented information is selected based on the similarity determined in step 405. The number of selected items can be proportional to the size of the image of the target item. For example, an image occupying a larger area ratio can show several items of to-be-presented information, while an image occupying a smaller area ratio preferably shows only one, so that the presented information does not overwhelm the video content.
In some optional implementations of this embodiment, obtaining the to-be-presented information matching the image of the target item comprises: obtaining the category label of the user watching the target video through a terminal, where the category label of the user is obtained through big-data analysis of the user's behavioral data; and obtaining, from the set of to-be-presented information, at least one item matching the category label of the user. That is, the to-be-presented information is further filtered based on the user's personal characteristics, so that it is selected in a user-targeted way. For example, if big-data analysis determines that the user watching the target video is female, information related to women's items can be selected as the to-be-presented information.
A to-be-presented information recommendation model combining the user, the to-be-presented information, and the image of the target item can be established to effectively predict the click-through rate (CTR) of the to-be-presented information and push the information with the highest estimated CTR, thereby improving the conversion rate of the delivered information. The features of this recommendation model mainly comprise three kinds: user features, features of the item involved in the to-be-presented information, and features of the image of the target item detected in the target video. The user features mainly include information obtainable from the user's big-data profile, such as age, gender, region, occupation, and viewing platform. The features of the item involved in the to-be-presented information mainly include the item's category, price, place of origin (or seller location), and historical click-through rate. The features of the image of the target item mainly include the similarity between the detected target item image and the item involved in the to-be-presented information, and the duration for which the image of the target item appears in the target video.
Feature processing for the item involved in the to-be-presented information mainly comprises discretization and feature crossing.
(1) Discretization
The features of the to-be-presented information recommendation model mainly comprise the three kinds discussed above. The initial features include discrete features (such as user gender and user region) and continuous features (such as item price, user age, the similarity between the target item image and the item involved in the to-be-presented information, and the click-through rate of the to-be-presented information). Although click-through rate and age are both continuous values, their meanings differ: comparing age magnitudes is meaningless for recommendation, while comparing click-through rates is meaningful. The above features therefore need to be discretized.
Discretization is done by segmenting each continuous feature. For example, the click-through rate ctr is divided into 10 segments; if ctr = 0.05, the corresponding feature position is set to 1. Other feature types are processed similarly.
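A sketch of this segmentation, assuming equal-width buckets (the application does not specify the bucket boundaries):

```python
import numpy as np

def one_hot_bucket(value: float, low: float, high: float, n_buckets: int) -> np.ndarray:
    """Discretize a continuous feature into an n_buckets-wide one-hot segment."""
    idx = int(np.clip((value - low) / (high - low) * n_buckets, 0, n_buckets - 1))
    vec = np.zeros(n_buckets)
    vec[idx] = 1.0                     # set the corresponding feature position to 1
    return vec

ctr_segment = one_hot_bucket(0.05, 0.0, 1.0, 10)  # ctr = 0.05 lands in the first bucket
```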
(2) Feature crossing
After discretization, the processed features can be stretched into a single vector as the final feature. But this alone is a linear model that ignores interactions between features; for example, the combination of gender and item category has a very direct influence on the click-through rate of the to-be-presented information. Crossing features can therefore effectively improve the model's prediction accuracy. Feature crossing combines two features into a new feature: for example, crossing gender with item category (m classes) generates 2m discrete features, as shown in the sketch after this paragraph.
Let the discrete feature vector formed by this application be x, with dimension 113, where x1~x10 is the user-age feature segment; x11~x18 the user-region segment; x19~x25 the user-occupation segment; x26~x30 the viewing-platform segment; x31~x38 the item-category segment; x39~x50 the item-price segment; x51~x58 the item place-of-origin segment; x59~x60 the item click-through-rate segment; x61~x65 the detected-target duration segment; x66~x75 the detected-target/advertised-item similarity segment; x76~x91 the item-category/user-gender cross-feature segment; and x92~x113 the user-gender/item-price cross-feature segment.
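A sketch of the crossing operation, reusing the one-hot helper from the previous sketch; the two-level gender encoding is a hypothetical example:

```python
import numpy as np

def cross_features(a_one_hot: np.ndarray, b_one_hot: np.ndarray) -> np.ndarray:
    """Cross two one-hot segments into one of length len(a) * len(b)."""
    return np.outer(a_one_hot, b_one_hot).ravel()

gender = np.array([1.0, 0.0])              # hypothetical encoding: [female, male]
category = one_hot_bucket(3, 0, 8, 8)      # 8 item-category buckets (m = 8)
gender_x_category = cross_features(gender, category)  # 2m = 16 crossed features
```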
The to-be-presented information is recommended with a logistic regression (LR) model, an algorithm widely used in advertisement recommendation. Let the training dataset be D = (x₁, y₁), (x₂, y₂), ..., (x_N, y_N), where xᵢ is the constructed feature vector (113 dimensions here) and yᵢ indicates whether the advertisement was clicked, with 1 for clicked and 0 for not clicked (consistent with formulas 8 and 9 below).
The basic assumption of LR is that the conditional probability P(y=1|x; θ) satisfies the following expression:

P(y=1|x; θ) = g(θᵀx) = 1 / (1 + exp(−θᵀx)) (formula 6)

where g(θᵀx) is the sigmoid function mentioned earlier, x is the feature vector, and θ is the parameter vector. The corresponding decision function is:

y* = 1, if P(y=1|x) > 0.5 (formula 7)

Once the mathematical form of the model is determined, the next step is to solve for its parameters. Maximum likelihood estimation is used: find the set of parameters under which the likelihood (probability) of the data is largest. In the logistic regression model, the likelihood L(θ) can be expressed as:

L(θ) = P(D|θ) = ∏ P(y|x; θ) = ∏ g(θᵀx)ʸ (1 − g(θᵀx))¹⁻ʸ (formula 8)

Taking the logarithm gives the log-likelihood l(θ):

l(θ) = Σ [ y log g(θᵀx) + (1 − y) log(1 − g(θᵀx)) ] (formula 9)

In the LR model, maximizing the above likelihood function yields the optimal parameters. This application solves for the parameters by gradient-descent iteration, approaching the optimum by choosing, at each step, the direction in which the objective function changes fastest and adjusting the parameters along it.
After model training is complete, the recommender system for to-be-presented information is obtained. Click-through-rate prediction is performed on a predetermined number of items of to-be-presented information retrieved from the to-be-presented information bank, and the item with the highest estimated click-through rate is selected for presentation.
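A self-contained sketch of this training and ranking loop (formulas 6-9), using plain gradient ascent on the log-likelihood with labels y ∈ {0, 1}; the learning rate and epoch count are assumptions:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_lr(X: np.ndarray, y: np.ndarray, lr=0.1, epochs=200) -> np.ndarray:
    """Maximize the log-likelihood l(theta) (formula 9) by gradient ascent."""
    theta = np.zeros(X.shape[1])
    for _ in range(epochs):
        grad = X.T @ (y - sigmoid(X @ theta))   # gradient of formula 9
        theta += lr * grad / len(y)
    return theta

def predict_ctr(theta: np.ndarray, x: np.ndarray) -> float:
    """Estimated click probability P(y=1 | x; theta) (formula 6)."""
    return float(sigmoid(x @ theta))

# Rank the retrieved candidates by estimated CTR and present the best one:
# best = max(candidates, key=lambda c: predict_ctr(theta, c.feature_vector))
```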
Figure 4, it is seen that compared with the corresponding embodiment of Fig. 2, the process of the information demonstrating method in the present embodiment
400 highlight the step of selecting information to be presented.So as to accurately select information to be presented, letter to be presented is extracted
Effective information to be presented is presented as far as possible, reduces the cost for launching information to be presented for the hit rate of breath.
With further reference to Fig. 5, as an implementation of the methods shown in the figures above, this application provides an embodiment of an information presentation device. This device embodiment corresponds to the method embodiment shown in Fig. 2, and the device can be applied to various electronic devices.
As shown in Fig. 5, the information presentation device 500 of this embodiment includes: a key frame detection unit 501, an image detection unit 502, a determination unit 503, and a display unit 504. The key frame detection unit 501 detects a key frame in the target video, where a key frame is a frame whose image entropy is greater than a preset image entropy threshold; the image detection unit 502 detects, in response to detecting a key frame, the image of the target item in the key frame; the determination unit 503 determines, in response to detecting the image of the target item in the key frame, whether the number of frames after the key frame that continuously present the image of the target item is greater than a predetermined frame count; and the display unit 504 obtains, if the count is greater than the predetermined frame count, the to-be-presented information matching the image of the target item and presents it in the frames that continuously present the image of the target item.
In this embodiment, for the specific processing of the key frame detection unit 501, the image detection unit 502, the determination unit 503, and the display unit 504 of the information presentation device 500, reference may be made to steps 201, 202, 203, and 204 in the embodiment corresponding to Fig. 2.
In some optional implementations of this embodiment, the key frame detection unit 501 is further configured to: take a frame whose image entropy is greater than the preset image entropy threshold as a key frame; following the playing order of the target video, obtain the first subsequent frame whose image entropy is greater than the preset image entropy threshold; determine whether the similarity between that frame and the key frame is less than a preset similarity threshold; and, if so, determine that the frame is also a key frame.
In some optional implementations of this embodiment, the image detection unit 502 is further configured to detect the image of the target item in the key frame based on a pre-trained convolutional neural network, where the convolutional neural network is used to identify image features of the target item and to determine the image of the target item according to those features.
In some optional implementations of this embodiment, the determination unit 503 is further configured to: use a compressive tracking algorithm to determine whether the image of the target item is continuously presented in the frames after the key frame; and, if so, accumulate the number of frames that continuously present the image of the target item and determine whether it is greater than the predetermined frame count.
In some optional implementations of this embodiment, the display unit 504 is further configured to: determine the position information of the image of the target item in the frames that continuously present it; determine the presentation position of the to-be-presented information according to the position information; and present the to-be-presented information at the presentation position.
In some optional implementations of this embodiment, the display unit 504 is further configured to: obtain a set of to-be-presented information, where each item includes a picture; determine the similarity between the picture in each item and the image of the target item; and select at least one item of to-be-presented information from the set in descending order of similarity.
In some optional implementations of this embodiment, the to-be-presented information includes text information, and the display unit 504 is further configured to obtain text information matching the category of the image of the target item.
In some optional implementations of this embodiment, the display unit 504 is further configured to: obtain the category label of the user watching the target video through a terminal, where the category label is obtained through big-data analysis of the user's behavioral data; and obtain, from the set of to-be-presented information, at least one item matching the category label of the user.
Referring now to Fig. 6, a structural schematic diagram of a computer system 600 suitable for implementing a device of the embodiments of this application is shown. The device shown in Fig. 6 is only an example and should not impose any limitation on the functions and scope of use of the embodiments of this application.
As shown in Fig. 6, the computer system 600 includes a central processing unit (CPU) 601, which can execute various appropriate actions and processes according to a program stored in read-only memory (ROM) 602 or a program loaded from a storage section 608 into random access memory (RAM) 603. The RAM 603 also stores various programs and data required for the operation of the system 600. The CPU 601, ROM 602, and RAM 603 are connected to each other by a bus 604, and an input/output (I/O) interface 605 is also connected to the bus 604.
The following components are connected to the I/O interface 605: an input section 606 including a keyboard, a mouse, etc.; an output section 607 including a cathode-ray tube (CRT), a liquid crystal display (LCD), a loudspeaker, etc.; a storage section 608 including a hard disk, etc.; and a communication section 609 including a network interface card such as a LAN card or a modem. The communication section 609 performs communication processing via a network such as the internet. A driver 610 is also connected to the I/O interface 605 as needed. A removable medium 611, such as a magnetic disk, an optical disc, a magneto-optical disk, or a semiconductor memory, is mounted on the driver 610 as needed, so that a computer program read from it can be installed into the storage section 608 as needed.
In particular, according to embodiments of the present disclosure, the process described above with reference to the flowchart may be implemented as a computer software program. For example, an embodiment of the present disclosure includes a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for executing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication section 609, and/or installed from the removable medium 611. When the computer program is executed by the central processing unit (CPU) 601, it performs the above-described functions defined in the method of the present application. It should be noted that the computer-readable medium described herein may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of the computer-readable storage medium include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present application, a computer-readable storage medium may be any tangible medium containing or storing a program that can be used by or in connection with an instruction execution system, apparatus, or device. In the present application, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device. The program code contained on a computer-readable medium may be transmitted using any suitable medium, including but not limited to wireless, wire, optical cable, RF, or any suitable combination of the above.
The flowcharts and block diagrams in the accompanying drawings illustrate the possible architectures, functions, and operations of systems, methods, and computer program products according to various embodiments of the present application. In this regard, each block in a flowchart or block diagram may represent a module, program segment, or portion of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur out of the order shown in the figures. For example, two blocks shown in succession may in fact be executed substantially in parallel, or sometimes in the reverse order, depending on the functions involved. It should further be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units described in the embodiments of the present application may be implemented in software or in hardware. The described units may also be provided in a processor, which may, for example, be described as: a processor comprising a key frame detection unit, an image detection unit, a determination unit, and a display unit. The names of these units do not in some cases limit the units themselves; for example, the key frame detection unit may also be described as "a unit for detecting key frames in a target video".
In another aspect, the present application also provides a computer-readable medium, which may be included in the device described in the above embodiments, or may exist separately without being assembled into the device. The computer-readable medium carries one or more programs which, when executed by the device, cause the device to: detect a key frame in a target video, where the key frame is a frame in the target video whose image entropy is greater than a preset image entropy threshold; in response to detecting the key frame, detect the image of a target item from the key frame; in response to detecting the image of the target item in the key frame, determine whether the number of frames after the key frame in which the image of the target item is continuously presented is greater than a predetermined frame count; and if it is greater than the predetermined frame count, obtain to-be-presented information matching the image of the target item and present the to-be-presented information in the frames in which the image of the target item is continuously presented.
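The image entropy test that anchors this pipeline can be sketched as follows: the Shannon entropy of each frame's gray-level histogram is compared against the preset threshold, whose numerical value here is an assumed placeholder.

```python
# Sketch: key frame test by image entropy of an 8-bit grayscale frame.
import numpy as np

ENTROPY_THRESHOLD = 4.5  # assumed preset image entropy threshold

def image_entropy(gray_frame):
    """Shannon entropy (in bits) of the frame's 256-bin gray-level histogram."""
    histogram, _ = np.histogram(gray_frame, bins=256, range=(0, 256))
    p = histogram / histogram.sum()
    p = p[p > 0]                     # drop empty bins; 0*log(0) is taken as 0
    return float(-(p * np.log2(p)).sum())

def is_key_frame(gray_frame):
    return image_entropy(gray_frame) > ENTROPY_THRESHOLD
```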
The above description covers only the preferred embodiments of the present application and the technical principles applied. Those skilled in the art should understand that the scope of the invention involved in the present application is not limited to technical solutions formed by the specific combination of the above technical features, and also covers other technical solutions formed by any combination of the above technical features or their equivalents without departing from the inventive concept, for example, technical solutions formed by replacing the above features with (but not limited to) technical features with similar functions disclosed in the present application.
Claims (18)
1. An information presentation method, characterized in that the method comprises:
detecting a key frame in a target video, wherein the key frame is a frame in the target video whose image entropy is greater than a preset image entropy threshold;
in response to detecting the key frame, detecting an image of a target item from the key frame;
in response to detecting the image of the target item from the key frame, determining whether the number of frames after the key frame in which the image of the target item is continuously presented is greater than a predetermined frame count;
if it is greater than the predetermined frame count, obtaining to-be-presented information matching the image of the target item, and presenting the to-be-presented information in the frames in which the image of the target item is continuously presented;
wherein the obtaining of to-be-presented information matching the image of the target item comprises:
building a recommendation model over combinations of the user, the to-be-presented information, and the image of the target item, predicting the click-through rate of each piece of to-be-presented information, and pushing the piece with the highest predicted click-through rate (an illustrative sketch of such a model follows the claims).
2. the method according to claim 1, wherein the key frame in the detection target video, comprising:
It obtains image entropy and is greater than the frame of preset image entropy threshold as key frame;
According to the playing sequence of the target video, obtains the image entropy after the key frame and be greater than preset image entropy threshold
First frame;
Determine whether the first frame and the similarity of the key frame are less than preset similarity threshold;
If being less than preset similarity threshold, it is determined that going out the first frame is key frame.
3. The method according to claim 1, wherein the detecting an image of a target item from the key frame comprises:
detecting the image of the target item from the key frame based on a pre-trained convolutional neural network, wherein the convolutional neural network is used to identify image features of the target item and to determine the image of the target item according to the image features.
4. the method according to claim 1, wherein described in the determination is continuously presented after the key frame
Whether the number of the frame of the image of target item is greater than scheduled frame number, comprising:
Determine whether the image of the target item is continuously presented on the difference after the key frame using compression track algorithm
Frame in;
If continuous be presented, the number of the frame of the accumulative image that the target item is continuously presented, and determine the number of the frame
Whether scheduled frame number is greater than.
5. the method according to claim 1, wherein described in the image that the target item is continuously presented
Frame in the information to be presented is presented, comprising:
Determine location information of the image of the target item in the frame of the image that the target item is continuously presented;
The position of appearing of the information to be presented is determined according to the positional information;
The information to be presented is presented on the position of appearing.
6. The method according to any one of claims 1-5, wherein the obtaining to-be-presented information matching the image of the target item comprises:
obtaining a set of to-be-presented information, wherein each piece of to-be-presented information includes a picture;
determining the similarity between the picture in each piece of to-be-presented information in the set and the image of the target item;
selecting at least one piece of to-be-presented information from the set in descending order of similarity.
7. the method according to claim 1, wherein the information to be presented includes text information;And
The information to be presented of the acquisition and the images match of the target item, comprising:
Obtain the text information with the categorical match of the image of the target item.
8. the method according to claim 1, wherein the acquisition and the images match of the target item to
Information is presented, comprising:
Obtain the class label that the user of the target video is watched by terminal, wherein the class label of the user is logical
It crosses and what big data analysis obtained is carried out to the behavioral data of the user;
The information to be presented of class label matched at least one with the user is obtained from information aggregate to be presented.
9. An information presentation apparatus, characterized in that the apparatus comprises:
a key frame detection unit for detecting a key frame in a target video, wherein the key frame is a frame in the target video whose image entropy is greater than a preset image entropy threshold;
an image detection unit for detecting an image of a target item from the key frame in response to detecting the key frame;
a determination unit for determining, in response to detecting the image of the target item from the key frame, whether the number of frames after the key frame in which the image of the target item is continuously presented is greater than a predetermined frame count;
a display unit for obtaining, if the number is greater than the predetermined frame count, to-be-presented information matching the image of the target item, and presenting the to-be-presented information in the frames in which the image of the target item is continuously presented;
wherein the obtaining of to-be-presented information matching the image of the target item comprises:
building a recommendation model over combinations of the user, the to-be-presented information, and the image of the target item, predicting the click-through rate of each piece of to-be-presented information, and pushing the piece with the highest predicted click-through rate.
10. The apparatus according to claim 9, wherein the key frame detection unit is further configured to:
obtain a frame whose image entropy is greater than the preset image entropy threshold as a key frame;
obtain, according to the playing sequence of the target video, the first frame after the key frame whose image entropy is greater than the preset image entropy threshold;
determine whether the similarity between the first frame and the key frame is less than a preset similarity threshold;
if it is less than the preset similarity threshold, determine that the first frame is also a key frame.
11. The apparatus according to claim 9, wherein the image detection unit is further configured to:
detect the image of the target item from the key frame based on a pre-trained convolutional neural network, wherein the convolutional neural network is used to identify image features of the target item and to determine the image of the target item according to the image features.
12. The apparatus according to claim 9, wherein the determination unit is further configured to:
determine, using a compressive tracking algorithm, whether the image of the target item is continuously presented in the frames following the key frame;
if it is continuously presented, accumulate the number of frames in which the image of the target item is continuously presented, and determine whether that number is greater than the predetermined frame count.
13. The apparatus according to claim 9, wherein the display unit is further configured to:
determine position information of the image of the target item within the frames in which it is continuously presented;
determine a presentation position for the to-be-presented information according to the position information;
present the to-be-presented information at the presentation position.
14. The apparatus according to any one of claims 9-13, wherein the display unit is further configured to:
obtain a set of to-be-presented information, wherein each piece of to-be-presented information includes a picture;
determine the similarity between the picture in each piece of to-be-presented information in the set and the image of the target item;
select at least one piece of to-be-presented information from the set in descending order of similarity.
15. The apparatus according to claim 9, wherein the to-be-presented information includes text information, and the display unit is further configured to:
obtain text information matching the category of the image of the target item.
16. The apparatus according to claim 9, wherein the display unit is further configured to:
obtain the class label of the user watching the target video through a terminal, wherein the class label of the user is obtained by big data analysis of the user's behavioral data;
obtain, from a set of to-be-presented information, at least one piece of to-be-presented information matching the class label of the user.
17. An information presentation device, comprising:
one or more processors; and
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method according to any one of claims 1-8.
18. A computer-readable storage medium on which a computer program is stored, characterized in that the program, when executed by a processor, implements the method according to any one of claims 1-8.
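Claims 1 and 9 recite a recommendation model built over combinations of the user, the to-be-presented information, and the image of the target item, which predicts a click-through rate and pushes the best candidate. The sketch below uses logistic regression over concatenated feature vectors purely as a stand-in for that unspecified model; the feature dimensions and training data are invented placeholders.

```python
# Sketch: click-through-rate prediction over (user, info, item-image) features.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Stand-in training set: rows are concatenated user/info/item feature vectors
# (4 + 4 + 4 dimensions here); labels mark clicked vs. not clicked. In practice
# these would come from historical click logs.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 12))
y_train = rng.integers(0, 2, size=200)
model = LogisticRegression().fit(X_train, y_train)

def push_best(user_vec, item_vec, candidates):
    """candidates: list of (info, info_vec); returns the info with the highest CTR."""
    rows = np.stack([np.concatenate([user_vec, info_vec, item_vec])
                     for _, info_vec in candidates])
    ctr = model.predict_proba(rows)[:, 1]  # predicted probability of a click
    return candidates[int(np.argmax(ctr))][0]
```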
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710152564.0A CN108629224B (en) | 2017-03-15 | 2017-03-15 | Information demonstrating method and device |
PCT/CN2018/072285 WO2018166288A1 (en) | 2017-03-15 | 2018-01-11 | Information presentation method and device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710152564.0A CN108629224B (en) | 2017-03-15 | 2017-03-15 | Information demonstrating method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108629224A CN108629224A (en) | 2018-10-09 |
CN108629224B true CN108629224B (en) | 2019-11-05 |
Family
ID=63522608
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710152564.0A Active CN108629224B (en) | 2017-03-15 | 2017-03-15 | Information demonstrating method and device |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN108629224B (en) |
WO (1) | WO2018166288A1 (en) |
Families Citing this family (21)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111125501B (en) * | 2018-10-31 | 2023-07-25 | 北京字节跳动网络技术有限公司 | Method and device for processing information |
CN109495784A (en) * | 2018-11-29 | 2019-03-19 | 北京微播视界科技有限公司 | Information-pushing method, device, electronic equipment and computer readable storage medium |
CN111683267A (en) * | 2019-03-11 | 2020-09-18 | 阿里巴巴集团控股有限公司 | Method, system, device and storage medium for processing media information |
CN110570318B (en) * | 2019-04-18 | 2023-01-31 | 创新先进技术有限公司 | Vehicle loss assessment method and device executed by computer and based on video stream |
CN110311945B (en) * | 2019-04-30 | 2022-11-08 | 上海掌门科技有限公司 | Method and equipment for presenting resource pushing information in real-time video stream |
CN110177250A (en) * | 2019-04-30 | 2019-08-27 | 上海掌门科技有限公司 | A kind of method and apparatus for the offer procurement information in video call process |
CN110189242B (en) * | 2019-05-06 | 2023-04-11 | 阿波罗智联(北京)科技有限公司 | Image processing method and device |
CN110610510B (en) * | 2019-08-29 | 2022-12-16 | Oppo广东移动通信有限公司 | Target tracking method and device, electronic equipment and storage medium |
CN110853124B (en) * | 2019-09-17 | 2023-09-08 | Oppo广东移动通信有限公司 | Method, device, electronic equipment and medium for generating GIF dynamic diagram |
CN110764726B (en) * | 2019-10-18 | 2023-08-22 | 网易(杭州)网络有限公司 | Target object determination method and device, terminal equipment and storage medium |
CN112749326B (en) * | 2019-11-15 | 2023-10-03 | 腾讯科技(深圳)有限公司 | Information processing method, information processing device, computer equipment and storage medium |
CN110941594B (en) * | 2019-12-16 | 2023-04-18 | 北京奇艺世纪科技有限公司 | Splitting method and device of video file, electronic equipment and storage medium |
CN111079864A (en) * | 2019-12-31 | 2020-04-28 | 杭州趣维科技有限公司 | Short video classification method and system based on optimized video key frame extraction |
CN111611417B (en) * | 2020-06-02 | 2023-09-01 | Oppo广东移动通信有限公司 | Image de-duplication method, device, terminal equipment and storage medium |
CN112085120B (en) * | 2020-09-17 | 2024-01-02 | 腾讯科技(深圳)有限公司 | Multimedia data processing method and device, electronic equipment and storage medium |
CN113312951B (en) * | 2020-10-30 | 2023-11-07 | 阿里巴巴集团控股有限公司 | Dynamic video target tracking system, related method, device and equipment |
CN113763098B (en) * | 2020-12-21 | 2024-08-20 | 北京沃东天骏信息技术有限公司 | Method and device for determining an article |
CN113792037A (en) * | 2021-02-03 | 2021-12-14 | 北京沃东天骏信息技术有限公司 | Method and apparatus for determining image information |
CN113033475B (en) * | 2021-04-19 | 2024-01-12 | 北京百度网讯科技有限公司 | Target object tracking method, related device and computer program product |
CN113766330A (en) * | 2021-05-26 | 2021-12-07 | 腾讯科技(深圳)有限公司 | Method and device for generating recommendation information based on video |
CN114640863B (en) * | 2022-03-04 | 2024-09-24 | 广州方硅信息技术有限公司 | Character information display method, system and device in live broadcasting room and computer equipment |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103810711A (en) * | 2014-03-03 | 2014-05-21 | 郑州日兴电子科技有限公司 | Keyframe extracting method and system for monitoring system videos |
CN104715023A (en) * | 2015-03-02 | 2015-06-17 | 北京奇艺世纪科技有限公司 | Commodity recommendation method and system based on video content |
CN105282573A (en) * | 2014-07-24 | 2016-01-27 | 腾讯科技(北京)有限公司 | Embedded information processing method, client side and server |
CN105679017A (en) * | 2016-01-27 | 2016-06-15 | 福建工程学院 | Slight traffic accident assistant evidence collection method and system |
CN105872588A (en) * | 2015-12-09 | 2016-08-17 | 乐视网信息技术(北京)股份有限公司 | Method and device for loading advertisement in video |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR100355382B1 (en) * | 2001-01-20 | 2002-10-12 | 삼성전자 주식회사 | Apparatus and method for generating object label images in video sequence |
- 2017-03-15 CN CN201710152564.0A patent/CN108629224B/en active Active
- 2018-01-11 WO PCT/CN2018/072285 patent/WO2018166288A1/en active Application Filing
Also Published As
Publication number | Publication date |
---|---|
WO2018166288A1 (en) | 2018-09-20 |
CN108629224A (en) | 2018-10-09 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108629224B (en) | Information demonstrating method and device | |
US20200175550A1 (en) | Method for identifying advertisements for placement in multimedia content elements | |
US10713794B1 (en) | Method and system for using machine-learning for object instance segmentation | |
EP3267362B1 (en) | Machine learning image processing | |
CN108446390B (en) | Method and device for pushing information | |
US10902262B2 (en) | Vision intelligence management for electronic devices | |
CN110390033B (en) | Training method and device for image classification model, electronic equipment and storage medium | |
WO2021155691A1 (en) | User portrait generating method and apparatus, storage medium, and device | |
CN111859149A (en) | Information recommendation method and device, electronic equipment and storage medium | |
US9286623B2 (en) | Method for determining an area within a multimedia content element over which an advertisement can be displayed | |
CN111709398A (en) | Image recognition method, and training method and device of image recognition model | |
CN111292168B (en) | Data processing method, device and equipment | |
CN113569129A (en) | Click rate prediction model processing method, content recommendation method, device and equipment | |
CN112364204A (en) | Video searching method and device, computer equipment and storage medium | |
CN113766330A (en) | Method and device for generating recommendation information based on video | |
WO2024041483A1 (en) | Recommendation method and related device | |
CN113434716A (en) | Cross-modal information retrieval method and device | |
CN112766284B (en) | Image recognition method and device, storage medium and electronic equipment | |
CN110415009A (en) | Computerized system and method for intra-video modification | |
US11823217B2 (en) | Advanced segmentation with superior conversion potential | |
US12079856B2 (en) | Method for providing shopping information for individual products and electronic device performing same | |
CN112862538A (en) | Method, apparatus, electronic device, and medium for predicting user preference | |
US20150052086A1 (en) | System and method for identifying a target area in a multimedia content element | |
CN114330519A (en) | Data determination method and device, electronic equipment and storage medium | |
Rungruangbaiyok et al. | Probabilistic static foreground elimination for background subtraction |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||