CN109325148A - Method and apparatus for generating information - Google Patents
Method and apparatus for generating information
- Publication number
- CN109325148A (application number CN201810878632.6A)
- Authority
- CN
- China
- Prior art keywords
- video
- label
- identified
- frame
- classification
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
Abstract
Embodiments of the present application disclose a method and apparatus for generating information. One specific embodiment of the method includes: obtaining a video to be identified; understanding the content of the video to be identified using video understanding technology to obtain video content labels; analyzing the text of the video to be identified using text data analysis technology to obtain video text labels; and determining semantic labels of the video to be identified based on the video content labels and the video text labels. This embodiment automatically extracts video content labels and video text labels through video content understanding and video text analysis, and determines the semantic labels of the video to be identified from both, effectively improving the accuracy and completeness of the semantic labels of the video to be identified.
Description
Technical field
This application relates to the field of computer technology, in particular to the field of computer network technology, and more particularly to a method and apparatus for generating information.
Background
Short videos have become an important channel through which people obtain information. The number of short videos is increasing sharply, video sources are diversifying, and the proportion of UGC (User Generated Content) video is rising significantly. Helping users quickly find videos of interest has therefore become an urgent problem. Current video recommendation and video retrieval technologies rely mainly on the text information of a video; since short videos carry little text information, there is as yet no mature solution for constructing general-purpose description information from video content.
Existing video label extraction schemes mainly use the title data of a video, applying NLP (natural language processing) techniques such as word segmentation, part-of-speech analysis, and entity analysis to extract candidate entity words from the title. The candidate words are then filtered against a manually built knowledge base for a given scenario (usually a multistage tag database) to obtain the video labels and the upper-level category information of those labels.
Summary of the invention
Embodiments of the present application provide a method and apparatus for generating information.
In a first aspect, an embodiment of the present application provides a method for generating information, including: obtaining a video to be identified; understanding the content of the video to be identified using video understanding technology to obtain video content labels; analyzing the text of the video to be identified using text data analysis technology to obtain video text labels; and determining semantic labels of the video to be identified based on the video content labels and the video text labels.
In some embodiments, understanding the video content using video understanding technology to obtain video content labels includes at least one of the following: inputting the video to be identified into a video classification model to obtain classification labels; detecting, frame by frame, faces in the video frames of the video to be identified, matching the detected faces against face samples in a face database, and obtaining the person-name label of the matching face sample and the person-information label associated with that name; identifying, frame by frame, actions in the video frames of the video to be identified using a pre-trained action detection model to obtain action information, and merging the action information of the frames to obtain action labels; and identifying, frame by frame, scenes and entities in the video frames of the video to be identified using a pre-trained recognition classification model and fusing the recognition results to obtain scene labels and entity labels in the video frames.
In some embodiments, inputting the video to be identified into a video classification model to obtain classification labels includes: uniformly sampling video frames of the video to be identified to obtain a video frame sequence; performing feature extraction on the video frame sequence with an image classification network to obtain an image feature sequence of the video to be identified; extracting the audio signal of the video to be identified; inputting the audio signal of the video to be identified into a speech classification convolutional neural network, performing feature extraction on each second of speech to obtain a speech feature sequence of the video to be identified; determining, based on the image feature sequence and the speech feature sequence, the probability value of the video to be identified for each label; and determining the labels whose probability value exceeds a threshold as the classification labels of the video to be identified.
In some embodiments, determining the probability value of the video to be identified for each label based on the image feature sequence and the speech feature sequence includes: inputting the image feature sequence and the speech feature sequence into a pre-trained two-stream long short-term memory network to obtain the probability value of the video to be identified for each label.
In some embodiments, the image classification network is trained on features of video frames modeled with a temporal segment network and the labels of the corresponding video samples; and/or the speech classification convolutional neural network is determined by the following steps: extracting Mel-scale filter bank features from the audio signals of video samples; and training the speech classification convolutional neural network based on the Mel-scale filter bank features and the labels of the corresponding audio signals.
In some embodiments, the text of the video to be identified includes at least one of the following: the title text of the video to be identified; and text detected in the video frames of the video to be identified using video OCR.
In some embodiments, analyzing the text of the video to be identified using text data analysis technology to obtain video text labels includes: extracting candidate entity labels of the video to be identified from a multistage tag database based on the text of the video to be identified; and analyzing the part of speech and importance of the entity labels with NLP technology to screen out the video text labels.
In some embodiments, determining the semantic labels of the video to be identified based on the video content labels and the video text labels includes: determining, based on a pre-established multistage tag database, the categories to which the video content labels belong and the relationships between the video content labels and other labels in the multistage tag database; analyzing, using natural language processing technology, the part of speech and importance of the video content labels, of the labels of the categories to which they belong, and of the labels determined from the relationships; and ranking and screening the labels among the video content labels and the video text labels based on part of speech and importance to obtain the semantic labels of the video to be identified.
In some embodiments, the method further includes: pushing videos to users based on the video text labels.
In a second aspect, an embodiment of the present application provides an apparatus for generating information, including: a video acquisition unit configured to obtain a video to be identified; a video understanding unit configured to understand the content of the video to be identified using video understanding technology to obtain video content labels; a video analysis unit configured to analyze the text of the video to be identified using text data analysis technology to obtain video text labels; and a label determination unit configured to determine semantic labels of the video to be identified based on the video content labels and the video text labels.
In some embodiments, the video understanding unit includes at least one of the following: a video classification subunit configured to input the video to be identified into a video classification model to obtain classification labels; a face detection subunit configured to detect, frame by frame, faces in the video frames of the video to be identified, match the detected faces against face samples in a face database, and obtain the person-name label of the matching face sample and the person-information label associated with that name; an action recognition subunit configured to identify, frame by frame, actions in the video frames of the video to be identified using a pre-trained action detection model to obtain action information, and to merge the action information of the frames to obtain action labels; and a scene and entity recognition subunit configured to identify, frame by frame, scenes and entities in the video frames of the video to be identified using a pre-trained recognition classification model and to fuse the recognition results to obtain scene labels and entity labels in the video frames.
In some embodiments, the video classification subunit includes: a video frame extraction subunit configured to uniformly sample video frames of the video to be identified to obtain a video frame sequence; an image feature extraction subunit configured to perform feature extraction on the video frame sequence with an image classification network to obtain an image feature sequence of the video to be identified; an audio signal extraction subunit configured to extract the audio signal of the video to be identified; a speech feature extraction subunit configured to input the audio signal of the video to be identified into a speech classification convolutional neural network and perform feature extraction on each second of speech to obtain a speech feature sequence of the video to be identified; a probability value determination subunit configured to determine, based on the image feature sequence and the speech feature sequence, the probability value of the video to be identified for each label; and a classification label determination subunit configured to determine the labels whose probability value exceeds a threshold as the classification labels of the video to be identified.
In some embodiments, the probability value determination subunit is further configured to input the image feature sequence and the speech feature sequence into a pre-trained two-stream long short-term memory network to obtain the probability value of the video to be identified for each label.
In some embodiments, the image classification network in the image feature extraction subunit is trained on features of video frames modeled with a temporal segment network and the labels of the corresponding video samples; and/or the speech classification convolutional neural network in the speech feature extraction subunit is determined by the following steps: extracting Mel-scale filter bank features from the audio signals of video samples; and training the speech classification convolutional neural network based on the Mel-scale filter bank features and the labels of the corresponding audio signals.
In some embodiments, the text of the video to be identified in the video analysis unit includes at least one of the following: the title text of the video to be identified; and text detected in the video frames of the video to be identified using video OCR.
In some embodiments, the video analysis unit is further configured to: extract candidate entity labels of the video to be identified from a multistage tag database based on the text of the video to be identified; and analyze the part of speech and importance of the entity labels with NLP technology to screen out the video text labels.
In some embodiments, the label determination unit includes: a label relationship determination subunit configured to determine, based on a pre-established multistage tag database, the categories to which the video content labels belong and the relationships between the video content labels and other labels in the multistage tag database; a part-of-speech and importance determination subunit configured to analyze, using natural language processing technology, the part of speech and importance of the video content labels, of the labels of the categories to which they belong, and of the labels determined from the relationships; and a label ranking and screening subunit configured to rank and screen the labels among the video content labels and the video text labels based on part of speech and importance to obtain the semantic labels of the video to be identified.
In some embodiments, the apparatus further includes: a video push unit configured to push videos to users based on the video text labels.
In a third aspect, an embodiment of the present application provides a device, including: one or more processors; and a storage device for storing one or more programs. When the one or more programs are executed by the one or more processors, the one or more processors implement any of the methods described above.
In a fourth aspect, an embodiment of the present application provides a computer-readable medium on which a computer program is stored; when the program is executed by a processor, it implements any of the methods described above.
With the method and apparatus for generating information provided by the embodiments of the present application, a video to be identified is first obtained; then the content of the video to be identified is understood using video understanding technology to obtain video content labels; then the associated text of the video to be identified is analyzed using text data analysis technology to obtain video text labels; finally, the semantic labels of the video to be identified are determined based on the video content labels and the video text labels. In this process, video content labels and video text labels can be extracted automatically through video content understanding and video text analysis, and the semantic labels of the video to be identified are determined from both, effectively improving the accuracy and completeness of the semantic labels of the video to be identified.
Brief description of the drawings
Other features, objects, and advantages will become more apparent by reading the following detailed description of non-limiting embodiments with reference to the accompanying drawings:
Fig. 1 is a diagram of an exemplary system architecture to which the present application can be applied;
Fig. 2 is a flow diagram of one embodiment of the method for generating information according to the present application;
Fig. 3 is a schematic diagram of an application scenario of the method for generating information according to an embodiment of the present application;
Fig. 4a is a flow diagram of one embodiment of the method for determining the classification labels of the video to be identified in the method for generating information according to the present application;
Fig. 4b is an exemplary block diagram of one embodiment of the two-stream long short-term memory network in Fig. 4a;
Fig. 5 is a structural schematic diagram of one embodiment of the apparatus for generating information of the present application;
Fig. 6 is a structural schematic diagram of a computer system adapted to implement the server of an embodiment of the present application.
Detailed description
The present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are used only to explain the related invention and do not restrict that invention. It should also be noted that, for convenience of description, only the parts relevant to the related invention are shown in the drawings.
It should be noted that, in the absence of conflict, the embodiments of the present application and the features in the embodiments may be combined with one another. The present application is described in detail below with reference to the accompanying drawings and in conjunction with the embodiments.
As shown in Fig. 1, the system architecture 100 may include terminal devices 101, 102, and 103, a network 104, and servers 105 and 106. The network 104 provides the medium of the communication links between the terminal devices 101, 102, 103 and the servers 105, 106. The network 104 may include various connection types, such as wired or wireless communication links or fiber optic cables.
A user 110 can use the terminal devices 101, 102, 103 to interact with the servers 105, 106 through the network 104 to receive or send messages. Various communication client applications, such as search engine applications, shopping applications, instant messaging tools, mailbox clients, social platform software, and video playback applications, can be installed on the terminal devices 101, 102, 103.
The terminal devices 101, 102, 103 can be various electronic devices with a display screen, including but not limited to smartphones, tablet computers, e-book readers, MP3 players (Moving Picture Experts Group Audio Layer III), MP4 players (Moving Picture Experts Group Audio Layer IV), laptop portable computers, desktop computers, and so on.
The servers 105, 106 can be servers that provide various services, such as background servers that provide support for the terminal devices 101, 102, 103. A background server can process (for example, analyze, store, or compute) the data submitted by a terminal and push the analysis, storage, or computation results to the terminal device.
It should be noted that, in practice, the method for generating information provided by the embodiments of the present application is generally executed by the servers 105, 106, and correspondingly, the apparatus for generating information is generally arranged in the servers 105, 106. However, when the performance of a terminal device satisfies the execution conditions of the method or the setting conditions of the apparatus, the method for generating information provided by the embodiments of the present application can also be executed by the terminal devices 101, 102, 103, and the apparatus for generating information can also be arranged in the terminal devices 101, 102, 103.
It should be understood that the numbers of terminal devices, networks, and servers in Fig. 1 are merely illustrative. There can be any number of terminal devices, networks, and servers according to implementation needs.
With continued reference to Fig. 2, a flow 200 of one embodiment of the method for generating information according to the present application is shown. The method for generating information includes the following steps:
Step 201: obtain a video to be identified.
In this embodiment, the electronic device on which the method for generating information runs (such as the server or terminal shown in Fig. 1) can obtain the video to be identified from a video library or from other terminals.
Step 202: understand the content of the video to be identified using video understanding technology to obtain video content labels.
In this embodiment, the content of the video to be identified can be understood from multiple dimensions of interest, for example from the dimensions of video classification labels, person names and person information in the video, action information, scene information, and entity information. Here, the video understanding technology can be chosen per dimension of interest, applying the technology suited to that angle to understand the content of the video to be identified, for example by using artificial intelligence recognition models to identify the focus corresponding to each dimension of interest.
In some optional implementations of this embodiment, understanding the video content using video understanding technology to obtain video content labels may include at least one of the following: inputting the video to be identified into a video classification model to obtain classification labels; detecting faces in the video frames frame by frame, matching the detected faces against face samples in a face database, and obtaining the person-name label of the matching face sample and the person-information label associated with that name; identifying actions in the video frames of the video to be identified frame by frame using a pre-trained action detection model to obtain action information, and merging the action information of the frames to obtain action labels; and identifying scenes and entities in the video frames of the video to be identified frame by frame using a pre-trained recognition classification model and fusing the recognition results to obtain scene labels and entity labels in the video frames.
In this implementation, a video classification model is applied to the video to be identified to obtain video labels; face detection is performed on the video, and the detected faces are matched against labeled samples to determine the person-name labels and the associated person-information labels of the people in the video; an action detection model identifies the action information in the video frames, and the per-frame results are merged to obtain action labels; and a recognition classification model identifies the scenes and entities in the video frames and fuses the recognition results to obtain the scene labels and entity labels in the video frames. Through these recognition and detection steps, video content labels of the video to be identified can be obtained from each dimension, improving the comprehensiveness and accuracy of the video content labels.
Here, the video classification model can be trained by machine learning methods. The video classification model is a machine learning model that has video classification ability after training and obtains a video classification result for an input video. The machine learning model can be a neural network model, a support vector machine, a logistic regression model, and so on. Neural network models include convolutional neural networks, back-propagation neural networks, feedback neural networks, radial basis neural networks, self-organizing neural networks, and the like. When training the video classification model, a fine-grained general video label taxonomy can be constructed in advance, covering most common video text labels and commonly used video types.
When detecting faces in video frames, face detection methods in the prior art or in technologies developed in the future can be used; the present application does not limit this. For example, face detection can be realized with methods such as active shape models (ASM, Active Shape Models), active appearance models (AAM, Active Appearance Models), cascaded pose regression (CPR, Cascaded Pose Regression), or deep convolutional neural networks (DCNN, Deep Convolutional Neural Networks).
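The matching of a detected face against the face database can be sketched as a nearest-neighbor search over face feature vectors; the embeddings, names, and similarity threshold below are illustrative assumptions, not part of the application:

```python
import math

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def match_face(query_vec, face_db, threshold=0.8):
    """face_db: list of (person_name, person_info, feature_vector).
    Returns the (name label, info label) of the best match if its
    similarity passes the threshold, otherwise None."""
    best = max(face_db, key=lambda entry: cosine(query_vec, entry[2]))
    if cosine(query_vec, best[2]) >= threshold:
        return best[0], best[1]
    return None

db = [
    ("Alice", "actor", [0.9, 0.1, 0.0]),
    ("Bob", "athlete", [0.0, 0.8, 0.6]),
]
print(match_face([0.88, 0.12, 0.02], db))  # → ('Alice', 'actor')
```

In a real system the feature vectors would come from a face recognition network; the threshold guards against labeling unknown faces with the nearest database entry.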
Here, the action detection model can be realized with action detection methods in the prior art or in technologies developed in the future; the present application does not limit this. For example, action detection can be realized with single-frame-based recognition methods, CNN-based recognition methods, and the like. After the action in each frame is detected, the per-frame recognition results can be merged to obtain objective and comprehensive action information for the video.
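Merging the per-frame action recognition results into video-level action labels can be sketched, for example, as keeping actions that appear in at least a minimum fraction of frames; the action names and the fraction are illustrative assumptions:

```python
from collections import Counter

def merge_frame_actions(frame_actions, min_fraction=0.2):
    """frame_actions: one predicted action string per frame.
    Returns action labels seen in at least min_fraction of frames,
    most frequent first."""
    counts = Counter(frame_actions)
    n = len(frame_actions)
    return [a for a, c in counts.most_common() if c / n >= min_fraction]

frames = ["run", "run", "jump", "run", "wave",
          "run", "jump", "run", "run", "run"]
print(merge_frame_actions(frames))  # → ['run', 'jump']
```

The fraction cutoff filters out one-off misrecognitions, which is one simple way to make the merged action information "objective and comprehensive" as described above.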
Here, the recognition classification model can also be trained by machine learning methods. The recognition classification model is a machine learning model that has recognition and classification ability after training and obtains the classification results of the scenes and entities in an input video. The machine learning model can be a neural network model, a support vector machine, a logistic regression model, and so on. Neural network models include convolutional neural networks, back-propagation neural networks, feedback neural networks, radial basis neural networks, self-organizing neural networks, and the like.
When performing scene and entity recognition, a fine-grained label taxonomy of scenes and entities (such as vehicles, animals, etc.) can be constructed in advance, and a fine-grained classification model trained for each vertical category. The scenes and entities in the video frames are predicted frame by frame, and the scenes and entities extracted from the frames are then fused to obtain the main entity labels and scene labels in the video.
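The frame-by-frame prediction and fusion described above can be sketched as follows; the class names and the fusion rule (dominant scene, entities seen in several frames) are illustrative assumptions:

```python
from collections import Counter

def fuse_frames(frame_results, min_entity_frames=2):
    """frame_results: per frame, a (scene, [entities]) pair.
    Returns (dominant scene tag, entity tags seen in at least
    min_entity_frames frames, most frequent first)."""
    scene_counts = Counter(scene for scene, _ in frame_results)
    entity_counts = Counter(e for _, ents in frame_results for e in set(ents))
    scene_tag = scene_counts.most_common(1)[0][0]
    entity_tags = [e for e, c in entity_counts.most_common()
                   if c >= min_entity_frames]
    return scene_tag, entity_tags

frames = [
    ("street", ["car", "dog"]),
    ("street", ["car"]),
    ("park", ["dog"]),
    ("street", ["car", "bicycle"]),
]
print(fuse_frames(frames))  # → ('street', ['car', 'dog'])
```

Fusing across frames is what turns noisy per-frame predictions into the "main" entity and scene labels of the whole video.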
Step 203: analyze the text of the video to be identified using text data analysis technology to obtain video text labels.
In this embodiment, the associated text here may include the title text, additional information text (such as synopses and description information other than the title), text in the video frames, and so on. For the title text and additional information text, the text data analysis technology can use text data analysis technologies in the prior art or in technologies developed in the future; the present application does not limit this. For example, the title text and additional information text of the video to be identified are segmented, and the part of speech and importance of the words are determined, to obtain the information of the title text and the additional information text. For the text in video frames, video text recognition technology is first used to detect the text, which is then further segmented and analyzed for part of speech, importance, and so on, to obtain the detected text information. Finally, the video text labels are obtained from the information of the title text and additional information text and from the detected text information.
In some optional implementations of this embodiment, the text of the video to be identified may mainly include at least one of the following: the title text of the video to be identified; and text detected in the video frames of the video to be identified using video OCR. Limiting the text of the video to be identified to the title text and the text detected in the video frames by video OCR can reduce the computational cost of the text data analysis.
In some optional implementations of this embodiment, analyzing the text of the video to be identified using text data analysis technology to obtain video text labels may include: extracting candidate entity labels of the video to be identified from a multistage tag database based on the text of the video to be identified; and analyzing the part of speech and importance of the entity labels with NLP technology to screen out the video text labels.
In this implementation, the candidate entity labels corresponding to the associated text can be extracted from the multistage tag database, and NLP technology is then used to analyze the part of speech and importance of the entity labels so as to screen out the video text labels.
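A minimal sketch of this two-step screening, assuming a toy multistage tag database and hand-assigned part-of-speech and importance scores in place of a real NLP toolkit (all names and values below are hypothetical):

```python
# Toy multistage tag database: tag -> (part of speech, upper-level category).
TAG_DB = {
    "basketball": ("noun", "sports"),
    "highlight": ("noun", "media"),
    "amazing": ("adjective", None),
}

def extract_candidates(text, tag_db):
    """Candidate entity labels: database tags found in the text."""
    return [tag for tag in tag_db if tag in text.lower()]

def screen_labels(candidates, tag_db, importance, min_importance=0.5):
    """Keep noun candidates whose importance score passes the threshold,
    highest importance first."""
    kept = [t for t in candidates
            if tag_db[t][0] == "noun"
            and importance.get(t, 0.0) >= min_importance]
    return sorted(kept, key=lambda t: -importance[t])

title = "Amazing basketball highlight reel"
cands = extract_candidates(title, TAG_DB)
scores = {"basketball": 0.9, "highlight": 0.6, "amazing": 0.8}  # stand-in for NLP importance
print(screen_labels(cands, TAG_DB, scores))  # → ['basketball', 'highlight']
```

A production system would use proper word segmentation and an importance model rather than substring matching and fixed scores; the sketch only shows the extract-then-screen structure.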
Step 204: determine the semantic labels of the video to be identified based on the video content labels and the video text labels.
In this embodiment, the labels among the above video content labels and video text labels are ranked and screened based on predetermined label weights. In this way, the final semantic labels of the video to be identified can be obtained by re-ranking and re-screening the video content labels and the video text labels.
In some optional implementations of this embodiment, determining the semantic labels of the video to be identified based on the video content labels and the video text labels includes: determining, based on a pre-established multistage tag database, the categories to which the video content labels belong and the relationships between the video content labels and other labels in the multistage tag database; analyzing, using natural language processing (NLP) technology, the part of speech and importance of the video content labels, of the labels of the categories to which they belong, and of the labels determined from the relationships; and ranking and screening the labels among the video content labels and the video text labels based on part of speech and importance to obtain the semantic labels of the video to be identified.
In this implementation, by establishing the relationships between the video content labels, the video text labels, and the labels in the multistage tag database, the semantic labels of the video to be identified can be obtained by re-ranking and re-screening the above video content labels and video text labels.
In some optional implementations of this embodiment, the above method further includes: pushing videos to users based on the video text labels. This can improve the accuracy of the videos pushed to users.
An exemplary application scenario of the method for generating information of the present application is described below in conjunction with Fig. 3.
As shown in Fig. 3, Fig. 3 shows a schematic flow diagram of an application scenario of the method for generating information according to the present application.
As shown in Fig. 3, the method 300 for generating information runs in an electronic device 310 and may include:
first, obtaining a video 301 to be identified;
then, understanding the content of the video to be identified using video understanding technology 302 to obtain video content labels 303;
then, analyzing the text of the video to be identified using text data analysis technology 304 to obtain video text labels 305;
and finally, determining the semantic labels 306 of the video to be identified based on the video content labels 303 and the video text labels 305.
It should be appreciated that the application scenario of the method for generating information shown in Fig. 3 above is only an exemplary description of the method for generating information and does not represent a restriction on the method. For example, each step shown in Fig. 3 above can further use more detailed implementation methods.
With the method for generating information of the above embodiments of the present application, a video to be identified can be obtained; the content of the video to be identified is understood using video understanding technology to obtain video content labels; the associated text of the video to be identified is analyzed using text data analysis technology to obtain video text labels; and the semantic labels of the video to be identified are determined based on the video content labels and the video text labels. In this process, video content labels and video text labels can be extracted automatically through video content understanding and video text analysis, and the semantic labels of the video to be identified are determined from both, effectively improving the accuracy and completeness of the semantic labels of the video to be identified. In some optional implementations, videos can be recommended to users based on the semantic labels of the video to be identified, which can solve the cold-start recommendation problem for new videos, realize personalized recommendation, and improve the targeting of new videos pushed to users.
Referring to Fig. 4, it shows a flowchart of an embodiment of the method for determining the classification labels of the video to be identified in the method for generating information according to the present application.
As shown in Fig. 4, the process 400 of the method for generating information of this embodiment may include the following steps.
In step 401, video frames of the video to be identified are uniformly sampled to obtain a sequence of video frames to be identified.
In this embodiment, uniformly sampling video frames greatly reduces the amount of data of the video to be identified, thereby speeding up the computation of the final result.
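The uniform sampling of step 401 can be sketched as follows; this is a minimal illustration, and the function name and frame counts are assumptions rather than details from the application. One common choice, also used in TSN-style pipelines, is to take the center frame of each of several equal-length segments:

```python
import numpy as np

def uniform_frame_indices(total_frames: int, num_samples: int) -> np.ndarray:
    """Pick num_samples frame indices spread evenly across the video,
    taking the center of each of num_samples equal-length segments."""
    seg_len = total_frames / num_samples
    return np.array([int(i * seg_len + seg_len / 2) for i in range(num_samples)])

# A hypothetical 2-minute video at 25 fps, reduced to 10 frames.
idx = uniform_frame_indices(total_frames=3000, num_samples=10)
print(idx.tolist())  # [150, 450, 750, 1050, 1350, 1650, 1950, 2250, 2550, 2850]
```

Sampling segment centers rather than the first frames of segments avoids biasing the sequence toward the start of the video.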
In step 402, feature extraction is performed on the sequence of video frames to be identified using an image classification network to obtain the image feature sequence of the video to be identified.
In this embodiment, the image classification network is a trained convolutional neural network with image classification capability, which produces an image classification result from the features of each input image. The convolutional neural network may use AlexNet, VGG, GoogLeNet, ResNet, etc. as its backbone architecture.
In a specific example, the image classification network is trained on the features of video frames modeled with a Temporal Segment Network (TSN) and the labels corresponding to the video samples.
In this implementation, the TSN consists of a two-stream CNN, including a temporal convolutional neural network and a spatial convolutional neural network. Video clips are extracted from the frames of a video sample, each clip containing one frame image; the sequence of clips is fed into the two-stream CNN of the TSN, producing a feature for each clip; the clip features are then fed into a segmental consensus module to produce the output feature of the whole video. Based on the output features and the labels corresponding to the video samples, the image classification network can be trained.
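The segmental consensus step described above can be sketched as follows; this is a toy illustration with made-up dimensions, and averaging is assumed as the consensus function, as in the original TSN design:

```python
import numpy as np

def segmental_consensus(segment_features: np.ndarray) -> np.ndarray:
    """Aggregate per-segment CNN features into one video-level feature.
    Averaging is the consensus function commonly used with TSN."""
    return segment_features.mean(axis=0)

# Hypothetical features: 10 segments, each a 2048-d backbone feature.
rng = np.random.default_rng(0)
seg_feats = rng.normal(size=(10, 2048))
video_feat = segmental_consensus(seg_feats)
print(video_feat.shape)  # (2048,)
```

The resulting video-level feature can then be paired with the sample's label to train the classifier head.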
In step 403, the audio signal of the video to be identified is extracted.
In this embodiment, the audio signal of the video to be identified can be extracted using any existing method for extracting audio from video, which is not limited by this application. For example, the audio file of the video may be obtained directly, or a tool may be used to convert the video format to an audio format to obtain the audio signal.
In step 404, the audio signal of the video to be identified is fed into a convolutional neural network for speech classification, which performs feature extraction on each second of speech to obtain the speech feature sequence of the video to be identified.
In this embodiment, the convolutional neural network for speech classification is a trained convolutional neural network with speech classification capability, which produces an audio classification result from the features of each input audio. The convolutional neural network may use AlexNet, VGG, GoogLeNet, ResNet, etc. as its backbone architecture.
In a specific example, the convolutional neural network for speech classification is determined by the following steps: extracting Mel-scale filter bank features from the audio signals of video samples; and training the convolutional neural network for speech classification based on the Mel-scale filter bank features and the labels corresponding to the audio signals.
In this implementation, the features extracted for the convolutional neural network for speech classification are the Mel-scale filter bank (Fbank) features of the audio signal; using these features and the labels corresponding to the audio signals of the video samples, the convolutional neural network for speech classification can be trained.
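A minimal numpy sketch of extracting log Mel-scale filter bank (Fbank) features from a waveform might look like this; the parameter values, such as the 16 kHz sample rate and 40 Mel bands, are illustrative assumptions, not values from the application:

```python
import numpy as np

def fbank(signal, sr=16000, n_fft=512, hop=160, n_mels=40):
    """Log Mel-scale filter bank (Fbank) features from a mono waveform."""
    # Frame the signal, apply a Hann window, and compute the power spectrum.
    frames = np.lib.stride_tricks.sliding_window_view(signal, n_fft)[::hop]
    frames = frames * np.hanning(n_fft)
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2
    # Build triangular Mel filters between 0 Hz and sr/2.
    mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    inv = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    pts = inv(np.linspace(mel(0), mel(sr / 2), n_mels + 2))
    bins = np.floor((n_fft + 1) * pts / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(1, n_mels + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        fb[i - 1, l:c] = (np.arange(l, c) - l) / max(c - l, 1)
        fb[i - 1, c:r] = (r - np.arange(c, r)) / max(r - c, 1)
    return np.log(power @ fb.T + 1e-10)   # (num_frames, n_mels)

sig = np.random.randn(16000)   # one second of synthetic "audio"
feats = fbank(sig)             # roughly one feature row per 10 ms
```

Such per-second blocks of Fbank frames would then serve as the input to the speech classification CNN.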
In step 405, the image feature sequence and the speech feature sequence are fed into a pre-trained dual-stream long short-term memory (LSTM) network to obtain the probability value of the video to be identified for each label.
In this embodiment, the pre-trained dual-stream LSTM network takes the image feature sequence and the speech feature sequence as input. For each of the two sequences, it models the features at different time steps separately and extracts a new feature sequence; an attention mechanism is then applied to merge the extracted image feature sequence into a longer vector and to merge the speech feature sequence into a longer vector; the two merged vectors are concatenated into a still longer vector; the learned "distributed feature representation" is mapped to the sample label space using a fully connected layer; and finally a classifier determines the probability value of the video to be identified for each label.
In a specific example, the pre-trained dual-stream LSTM network can be illustrated with reference to Fig. 4b. As shown in Fig. 4b, the dual-stream LSTM network may include a bidirectional sequence model, an attention model, fully connected layers, and a sigmoid classifier. The bidirectional sequence model recursively processes the RGB image feature sequence and the speech feature sequence of the input video separately; the attention model merges the processed image feature sequence into a longer vector and merges the speech feature sequence into a longer vector; the two merged vectors are concatenated into a still longer vector; the learned "distributed feature representation" is mapped to the sample label space using two fully connected layers, which improves the accuracy of the final classification result; and the sigmoid classifier finally determines the probability value of the video to be identified for each label. Since the sigmoid classifier has relatively good noise resistance, an artificial neural network built from sigmoid units also has good robustness.
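The attention-based merging and sigmoid mapping described above can be illustrated with a toy numpy sketch; random stand-ins replace the trained bidirectional LSTM outputs and learned weights, and all dimensions are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def attention_pool(seq, w):
    """Weighted sum of a feature sequence (T, D) using attention scores."""
    scores = seq @ w                          # one score per time step
    alpha = np.exp(scores - scores.max())
    alpha /= alpha.sum()                      # softmax over time steps
    return alpha @ seq                        # (D,) pooled vector

# Toy stand-ins for the recurrent outputs of the two streams.
img_seq = rng.normal(size=(10, 256))          # image-feature sequence
aud_seq = rng.normal(size=(100, 128))         # per-second speech features

img_vec = attention_pool(img_seq, rng.normal(size=256))
aud_vec = attention_pool(aud_seq, rng.normal(size=128))
fused = np.concatenate([img_vec, aud_vec])    # the longer merged vector

W = rng.normal(size=(fused.size, 3000))       # fully connected layer -> label space
probs = 1.0 / (1.0 + np.exp(-(fused @ W)))    # independent sigmoid per label
```

Using an independent sigmoid per label (rather than a softmax) allows the network to assign multiple labels to one video.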
Returning to Fig. 4a: in a specific example, the pre-trained dual-stream LSTM network is determined by the following steps: obtaining video samples with video labels; uniformly sampling the video frames of the video samples; performing feature extraction on the sampled video frames using the image classification network to obtain the image feature sequences of the video samples; extracting the audio signals of the video samples; feeding the audio signals of the video samples into the convolutional neural network for speech classification to perform feature extraction on each second of speech and obtain the speech feature sequences of the video samples; and training the dual-stream LSTM network with the image feature sequences and speech feature sequences of the video samples as input and the video labels of the video samples as output.
In this implementation, training the dual-stream LSTM network with the image feature sequences and speech feature sequences as input and the video labels of the video samples as output lets the network model the features at different time steps separately for each stream when producing its output, which improves the accuracy of the classification results of the dual-stream LSTM network.
The above video samples may be obtained directly from the labeled tag set of an information-flow library, or further data cleaning may be performed on the labeled tag set obtained from the information-flow library to obtain the video samples used for training.
In a specific example, the video samples can be determined by the following steps: obtaining the labeled tag set of all videos in an information-flow database; sorting the labeled tags by frequency of occurrence from high to low; extracting a preset number of tags from the sorted labeled tags as a candidate tag set; screening the candidate tag set and filtering out words that meet the filtering rules; vectorizing the tags in the filtered candidate tag set and computing the pairwise similarity between candidate tags; merging any two candidate tags whose similarity exceeds a preset threshold; judging whether the videos under each of the merged candidate tags have appearance consistency and semantic similarity, and filtering out ambiguous tags to obtain the selected tags; and constructing the video samples based on the selected tags.
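The frequency ranking and similarity-based merging steps above can be sketched as follows; the tags, the 2-d vectors, and the threshold are illustrative assumptions, and a real system would use learned word vectors over the full candidate set:

```python
import numpy as np
from collections import Counter

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def build_candidate_tags(all_tags, top_n, vectors, merge_threshold=0.9):
    """Rank raw tags by frequency, keep the top_n, then merge near-duplicate
    tags whose word vectors are almost parallel."""
    ranked = [tag for tag, _ in Counter(all_tags).most_common(top_n)]
    kept = []
    for tag in ranked:
        # Merge this tag into an earlier, more frequent one if nearly synonymous.
        if not any(cosine(vectors[tag], vectors[k]) > merge_threshold for k in kept):
            kept.append(tag)
    return kept

# Hypothetical raw annotations and toy 2-d word vectors.
tags = ["soccer"] * 5 + ["football"] * 4 + ["piano"] * 3
vecs = {"soccer": np.array([1.0, 0.10]),
        "football": np.array([1.0, 0.12]),   # near-duplicate of "soccer"
        "piano": np.array([0.0, 1.00])}
print(build_candidate_tags(tags, top_n=3, vectors=vecs))  # ['soccer', 'piano']
```

Merging near-duplicates before the manual consistency check keeps the tag vocabulary compact and reduces the manual review workload.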
In this implementation, the selected tags can also be organized into a multi-level tag database according to their major class and subclass, so that the final output labels can be adjusted according to the probability of the subclass label. If the probability of a subclass label is relatively high, it is considered credible, and its corresponding second-level and first-level labels can be output along with it, increasing the number of labels and the label granularity; if the probability of a subclass label is relatively low, it is considered unreliable, and the label can be mapped to its second-level or first-level label, since on a coarse-grained label the accuracy is generally somewhat higher.
In a specific example of this implementation, since the videos in a feed (information-flow) library have outsourced annotation results at the million scale, all labels can be sorted by frequency of occurrence from high to low after the annotation results are collected, and the top 10,000 labels can be taken as the candidate tag set.
Then these 10,000 entity tag words can be inspected manually to filter out the words that meet the filtering rules, i.e. words that do not meet the requirements of video tags, such as adjectives, verbs, concepts that cannot be recognized visually (e.g. tongue twisters), and celebrity names (which can be recognized by face recognition technology and are therefore not added to the video tag set).
Then, for each tag, the corresponding video content is watched to judge whether the videos under the same tag have appearance consistency and semantic similarity. For example, the tag "koala" is both the common name of an animal and the nickname of a certain celebrity's daughter; because it is ambiguous, it is filtered out directly.
Finally, through the above steps, about 3,000 tags can be obtained, and each tag is built into a three-level system, such as sports -> ball games -> football. All video data corresponding to these tags are retained at the same time, totaling about 10 million videos, and these data can be used for subsequent model training. For example, the third-level tags can be used directly for training: if the probability of a tag is relatively high, it is considered credible, and its corresponding second-level and first-level tags can be output along with it, increasing the number of labels and the label granularity; if the probability of a tag is relatively low, it is considered unreliable, and the tag can be mapped to its second-level or first-level tag, since on a coarse-grained tag the accuracy is generally somewhat higher.
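The probability-driven mapping between hierarchy levels described above can be sketched as follows; the thresholds and the `parents` table are illustrative assumptions:

```python
def resolve_labels(fine_label, prob, parents, hi=0.8, lo=0.3):
    """Map a fine-grained (third-level) label to the final output set
    based on its predicted probability. parents maps a label to its
    (second-level, first-level) ancestors."""
    second, first = parents[fine_label]
    if prob >= hi:
        # Confident: keep the fine label and add its ancestors.
        return [fine_label, second, first]
    if prob >= lo:
        # Less confident: back off to the coarser, more reliable levels.
        return [second, first]
    return []   # too unreliable to emit anything

parents = {"football": ("ball games", "sports")}   # example three-level hierarchy
print(resolve_labels("football", 0.92, parents))   # ['football', 'ball games', 'sports']
print(resolve_labels("football", 0.45, parents))   # ['ball games', 'sports']
```

This trades label granularity for accuracy: confident predictions yield more and finer labels, while uncertain ones fall back to coarse labels that are easier to get right.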
In step 406, the labels whose probability value is greater than a threshold are determined as the classification labels of the video to be identified.
In this embodiment, after the probability of the video to be identified for each label is determined, the labels whose probability value is greater than the threshold can be taken as valuable labels and determined as the classification labels of the video to be identified.
In some optional implementations of this embodiment, the method for generating information, on the basis of the embodiments described in Fig. 2 to Fig. 4, further includes the following steps: extracting the feature vector output by the fully connected layer of the dual-stream LSTM network; comparing this feature vector with the feature vectors of candidate videos to obtain video similarities; and determining, based on the video similarities, the videos to recommend to the user from among the candidate videos. This implementation can improve the accuracy of the videos recommended to the user.
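The feature-vector comparison for recommendation can be sketched with cosine similarity; the 512-d vectors and pool size here are illustrative assumptions:

```python
import numpy as np

def recommend(query_vec, candidate_vecs, top_k=2):
    """Rank candidate videos by cosine similarity between their feature
    vectors and the query video's fully-connected-layer feature vector."""
    q = query_vec / np.linalg.norm(query_vec)
    c = candidate_vecs / np.linalg.norm(candidate_vecs, axis=1, keepdims=True)
    sims = c @ q
    return np.argsort(-sims)[:top_k]   # indices of the most similar videos

rng = np.random.default_rng(1)
query = rng.normal(size=512)            # feature vector of the watched video
pool = rng.normal(size=(100, 512))      # feature vectors of candidate videos
pool[7] = query * 2.0                   # plant a near-duplicate in the pool
print(recommend(query, pool))           # index 7 should rank first
```

Because the features come from the penultimate layer rather than the final labels, two videos can be matched even when they share no output label.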
The method for generating information of the above embodiments of the present application can use an LSTM recurrent neural network to model a complete event from the temporal structure of the video, while also considering the dual-stream features of image and speech, so that the output classification labels are more accurate and richer.
In this embodiment, the two feature sequences serve as the input of the dual-stream LSTM network, so they are merged at the feature-sequence stage, and the probability value of the video to be identified for each label is obtained from the merged features.
It should be understood that after the image feature sequence and the speech feature sequence are obtained in steps 401-404, the probability value of the video to be identified for each label can also be determined directly based on the image feature sequence and the speech feature sequence, and the labels whose probability value is greater than a threshold are determined as the classification labels of the video to be identified.
Specifically, after the image feature sequence and the speech feature sequence are obtained from the image classification network and the convolutional neural network for speech classification respectively, image classification labels and speech classification labels can be determined from the two feature sequences respectively, and the score of each label can then be obtained from the preset weight and preset score of each label among the image classification labels and speech classification labels, thereby determining the probability value of the video to be identified for each label. The preset weights and preset scores here can be determined based on NLP (natural language processing) techniques.
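The late fusion of image and speech classification scores described above can be sketched as follows; the labels, scores, and weights are illustrative assumptions, and the application suggests the weights could instead be derived with NLP techniques:

```python
def fuse_label_scores(image_scores, speech_scores, w_image=0.6, w_speech=0.4):
    """Combine per-label scores from the image and speech classifiers with
    fixed weights; a label missing from one classifier contributes zero."""
    labels = set(image_scores) | set(speech_scores)
    return {l: w_image * image_scores.get(l, 0.0)
               + w_speech * speech_scores.get(l, 0.0) for l in labels}

img = {"sports": 0.9, "music": 0.2}
spk = {"sports": 0.7, "news": 0.5}
fused = fuse_label_scores(img, spk)
print(fused["sports"])   # 0.6*0.9 + 0.4*0.7 = 0.82
```

In contrast to the dual-stream LSTM, which fuses at the feature-sequence stage, this variant fuses at the score stage, trading joint modeling for simplicity.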
With further reference to Fig. 5, as an implementation of the methods shown in the above figures, the present application provides an embodiment of an apparatus for generating information. This apparatus embodiment corresponds to the method embodiments shown in Fig. 2 to Fig. 4, and the apparatus can be applied to various electronic devices.
As shown in Fig. 5, the apparatus 500 for generating information of this embodiment may include: a video acquisition unit 510, configured to obtain a video to be identified; a video understanding unit 520, configured to understand the content of the video to be identified using a video understanding technique to obtain video content labels; a video analysis unit 530, configured to analyze the text of the video to be identified using a text data analysis technique to obtain video text labels; and a label determination unit 540, configured to determine the semantic labels of the video to be identified based on the video content labels and the video text labels.
In some optional implementations of this embodiment, the video understanding unit 520 includes at least one of the following: a video classification subunit 521, configured to feed the video to be identified into a video classification model to obtain classification labels; a face detection subunit 522, configured to detect faces in the video frames of the video to be identified frame by frame, match the detected faces with face samples in a face database, and obtain the person-name labels of the face samples matching the detected faces and the person information labels associated with those names; an action recognition subunit 523, configured to identify, frame by frame, the actions in the video frames of the video to be identified using a pre-trained action detection model to obtain action information, and merge the action information of each frame to obtain action labels; and a scene and entity recognition subunit 524, configured to identify, frame by frame, the scenes and entities in the video frames of the video to be identified using a pre-trained recognition and classification model and fuse the recognition results to obtain the scene labels and entity labels in the video frames.
In some optional implementations of this embodiment, the video classification subunit 521 includes (not shown): a video frame extraction subunit, configured to uniformly sample the video frames of the video to be identified to obtain a sequence of video frames to be identified; an image feature extraction subunit, configured to perform feature extraction on the sequence of video frames to be identified using an image classification network to obtain the image feature sequence of the video to be identified; an audio signal extraction subunit, configured to extract the audio signal of the video to be identified; a speech feature extraction subunit, configured to feed the audio signal of the video to be identified into a convolutional neural network for speech classification to perform feature extraction on each second of speech and obtain the speech feature sequence of the video to be identified; a probability value determination subunit, configured to determine the probability value of the video to be identified for each label based on the image feature sequence and the speech feature sequence; and a classification label determination subunit, configured to determine the labels whose probability value is greater than a threshold as the classification labels of the video to be identified.
In some optional implementations of this embodiment, the probability value determination subunit is further configured to: feed the image feature sequence and the speech feature sequence into a pre-trained dual-stream LSTM network to obtain the probability value of the video to be identified for each label.
In some optional implementations of this embodiment, the image classification network in the image feature extraction subunit is trained on the features of video frames modeled with a temporal segment network and the labels corresponding to the video samples; and/or the convolutional neural network for speech classification in the speech feature extraction subunit is determined by the following steps: extracting Mel-scale filter bank features from the audio signals of video samples; and training the convolutional neural network for speech classification based on the Mel-scale filter bank features and the labels corresponding to the audio signals.
In some optional implementations of this embodiment, the text of the video to be identified in the video analysis unit includes at least one of the following: the title text of the video to be identified; and the text detected in the video frames of the video to be identified using video OCR.
In some optional implementations of this embodiment, the video analysis unit is further configured to: extract candidate entity labels of the video to be identified from a multi-level tag database based on the text of the video to be identified; and analyze the part of speech and importance of the entity labels based on NLP techniques, screening them to obtain the video text labels.
In some optional implementations of this embodiment, the label determination unit includes (not shown): a label relationship determination subunit, configured to determine, based on a pre-established multi-level tag database, the classes to which the labels among the video content labels belong and the relationships between the labels among the video content labels and other labels in the multi-level tag database; a part-of-speech and importance determination subunit, configured to use natural language processing techniques to analyze the parts of speech and importance of the labels among the video content labels, the labels of the classes to which those labels belong, and the labels determined from the relationships; and a label sorting and screening subunit, configured to sort and screen the labels among the video content labels and the video text labels based on their parts of speech and importance to obtain the semantic labels of the video to be identified.
In some optional implementations of this embodiment, the apparatus further includes: a video push unit 550, configured to push videos to users based on the video text labels.
It should be understood that the units recorded in the apparatus 500 correspond to the steps in the methods described with reference to Fig. 2 to Fig. 4. Therefore, the operations and features described above for the methods are equally applicable to the apparatus 500 and the units contained therein, and are not repeated here.
Referring to Fig. 6, it shows a structural schematic diagram of a computer system 600 of a server suitable for implementing the embodiments of the present application. The terminal device or server shown in Fig. 6 is only an example and should not impose any restriction on the functions and scope of use of the embodiments of the present application.
As shown in Fig. 6, the computer system 600 includes a central processing unit (CPU) 601, which can perform various appropriate actions and processing according to a program stored in a read-only memory (ROM) 602 or a program loaded into a random access memory (RAM) 603 from a storage section 608. The RAM 603 also stores the various programs and data required for the operation of the system 600. The CPU 601, the ROM 602, and the RAM 603 are connected to each other through a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
The following components are connected to the I/O interface 605: an input section 606 including a keyboard, a mouse, etc.; an output section 607 including a cathode ray tube (CRT), a liquid crystal display (LCD), etc., and a loudspeaker; a storage section 608 including a hard disk, etc.; and a communication section 609 including a network interface card such as a LAN card or a modem. The communication section 609 performs communication processing via a network such as the Internet. A drive 610 is also connected to the I/O interface 605 as needed. A removable medium 611, such as a magnetic disk, an optical disc, a magneto-optical disc, or a semiconductor memory, is mounted on the drive 610 as needed, so that a computer program read therefrom is installed into the storage section 608 as needed.
In particular, according to embodiments of the present disclosure, the process described above with reference to the flowchart may be implemented as a computer software program. For example, an embodiment of the present disclosure includes a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for executing the method shown in the flowchart. In such an embodiment, the computer program can be downloaded and installed from a network through the communication section 609 and/or installed from the removable medium 611. When the computer program is executed by the central processing unit (CPU) 601, the above functions defined in the methods of the present application are executed. It should be noted that the computer-readable medium described in the present application may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example but not limited to, an electric, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present application, a computer-readable storage medium may be any tangible medium that contains or stores a program, where the program can be used by or in combination with an instruction execution system, apparatus, or device. In the present application, a computer-readable signal medium may include a data signal propagated in a baseband or as part of a carrier wave, in which computer-readable program code is carried. Such a propagated data signal may take various forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium; such a computer-readable medium can send, propagate, or transmit the program for use by or in combination with an instruction execution system, apparatus, or device. The program code contained on the computer-readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wire, optical cable, RF, or any suitable combination of the above.
The flowcharts and block diagrams in the accompanying drawings illustrate the possible architectures, functions, and operations of the systems, methods, and computer program products according to the various embodiments of the present application. In this regard, each box in a flowchart or block diagram may represent a module, a program segment, or a part of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that in some alternative implementations, the functions marked in the boxes may occur in a different order than that indicated in the drawings. For example, two boxes shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each box in the block diagrams and/or flowcharts, and combinations of boxes in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units involved in the embodiments of the present application may be implemented in software or in hardware. The described units may also be provided in a processor; for example, it may be described as: a processor including a video acquisition unit, a video understanding unit, a video analysis unit, and a label determination unit. The names of these units do not, in some cases, constitute a restriction on the units themselves; for example, the video acquisition unit may also be described as "a unit for obtaining the video to be identified".
As another aspect, the present application also provides a computer-readable medium, which may be included in the apparatus described in the above embodiments, or may exist separately without being assembled into the apparatus. The computer-readable medium carries one or more programs that, when executed by the apparatus, cause the apparatus to: obtain a video to be identified; understand the content of the video to be identified using a video understanding technique to obtain video content labels; analyze the text of the video to be identified using a text data analysis technique to obtain video text labels; and determine the semantic labels of the video to be identified based on the video content labels and the video text labels.
The above description is only a preferred embodiment of the present application and an explanation of the technical principles applied. Those skilled in the art should understand that the scope of the invention involved in the present application is not limited to technical solutions formed by the specific combination of the above technical features, and should also cover other technical solutions formed by any combination of the above technical features or their equivalent features without departing from the above inventive concept, for example, technical solutions formed by replacing the above features with (but not limited to) technical features with similar functions disclosed in the present application.
Claims (20)
1. A method for generating information, comprising:
obtaining a video to be identified;
understanding the content of the video to be identified using a video understanding technique to obtain video content labels;
analyzing the text of the video to be identified using a text data analysis technique to obtain video text labels;
determining the semantic labels of the video to be identified based on the video content labels and the video text labels.
2. The method according to claim 1, wherein understanding the video content using a video understanding technique to obtain video content labels comprises at least one of the following:
feeding the video to be identified into a video classification model to obtain classification labels;
detecting faces in the video frames of the video to be identified frame by frame, matching the detected faces with face samples in a face database, and obtaining the person-name labels of the face samples matching the detected faces and the person information labels associated with those names;
identifying, frame by frame, the actions in the video frames of the video to be identified using a pre-trained action detection model to obtain action information, and merging the action information of each frame to obtain action labels;
identifying, frame by frame, the scenes and entities in the video frames of the video to be identified using a pre-trained recognition and classification model and fusing the recognition results to obtain the scene labels and entity labels in the video frames.
3. The method according to claim 2, wherein said inputting the video to be identified into a video classification model to obtain a category label comprises:
uniformly sampling video frames of the video to be identified to obtain a video frame sequence to be identified;
performing feature extraction on the video frame sequence using an image classification network to obtain an image feature sequence of the video to be identified;
extracting the audio signal of the video to be identified;
inputting the audio signal of the video to be identified into a convolutional neural network for speech classification and performing feature extraction on each second of speech to obtain a speech feature sequence of the video to be identified;
determining, based on the image feature sequence and the speech feature sequence, a probability value for each label for the video to be identified;
determining labels whose probability values exceed a threshold as the category labels of the video to be identified.
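Two steps of claim 3 are simple enough to sketch directly: uniform frame sampling and the final probability-threshold cut. The index arithmetic and the 0.5 default threshold are illustrative assumptions:

```python
def uniform_sample(n_frames, n_samples):
    """Uniformly pick n_samples frame indices out of n_frames."""
    step = n_frames / n_samples
    return [int(i * step) for i in range(n_samples)]

def labels_over_threshold(probs, threshold=0.5):
    """Keep labels whose probability value exceeds the threshold."""
    return sorted(label for label, p in probs.items() if p > threshold)

print(uniform_sample(100, 5))  # → [0, 20, 40, 60, 80]
print(labels_over_threshold({"sports": 0.9, "news": 0.2, "music": 0.6}))
```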
4. The method according to claim 3, wherein said determining, based on the image feature sequence and the speech feature sequence, a probability value for each label comprises:
inputting the image feature sequence and the speech feature sequence into a pre-trained dual-stream long short-term memory network to obtain the probability value for each label for the video to be identified.
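The dual-stream network of claim 4 runs an LSTM over each modality and fuses the two streams into per-label probabilities. As a toy stand-in (mean pooling replaces the LSTMs, and the weights and bias are invented), the fusion-then-probability step looks like:

```python
import math

def mean_pool(seq):
    """Average a feature sequence over time (one vector per step)."""
    dim = len(seq[0])
    return [sum(v[d] for v in seq) / len(seq) for d in range(dim)]

def two_stream_probs(image_seq, audio_seq, weights, bias):
    """Toy stand-in for the dual-stream LSTM of claim 4.

    Each stream is mean-pooled, the pooled vectors are concatenated,
    and one logistic unit per label produces a probability value.
    """
    fused = mean_pool(image_seq) + mean_pool(audio_seq)
    probs = {}
    for label, w in weights.items():
        z = sum(wi * xi for wi, xi in zip(w, fused)) + bias[label]
        probs[label] = 1.0 / (1.0 + math.exp(-z))
    return probs

img = [[1.0, 0.0], [1.0, 2.0]]   # two frames, 2-d image features
aud = [[0.5], [1.5]]             # two seconds, 1-d audio features
p = two_stream_probs(img, aud, {"sports": [1.0, 0.5, 1.0]}, {"sports": -2.0})
print(round(p["sports"], 3))  # → 0.622
```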
5. The method according to claim 3, wherein the image classification network is trained based on video frame features modeled with a temporal segment network and the labels of the corresponding video samples; and/or
the convolutional neural network for speech classification is determined by the following steps: extracting Mel-scale filter bank features from the audio signals of video samples; and training the convolutional neural network for speech classification based on the Mel-scale filter bank features and the labels of the corresponding audio signals.
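The Mel-scale filter bank features of claim 5 rest on the standard Hz-to-Mel conversion; the band-edge helper below (equal spacing on the Mel scale) is a common construction shown for illustration, not the patent's specific feature extractor:

```python
import math

def hz_to_mel(f_hz):
    """Standard Hz-to-Mel conversion used for Mel-scale filter banks."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_band_edges(f_min, f_max, n_bands):
    """Filter band edges equally spaced on the Mel scale (illustrative)."""
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    return [lo + i * (hi - lo) / (n_bands + 1) for i in range(n_bands + 2)]

print(round(hz_to_mel(700.0), 3))  # 2595 * log10(2) → 781.173
```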
6. The method according to claim 1, wherein the text of the video to be identified comprises at least one of the following:
the title text of the video to be identified;
text detected in the video frames of the video to be identified using video OCR.
7. The method according to claim 1 or 6, wherein said analyzing the text of the video to be identified using a text data analysis technique to obtain a video text label comprises:
extracting candidate entity labels of the video to be identified from a multi-level tag database based on the text of the video to be identified;
analyzing the part of speech and importance of the entity labels using NLP techniques, and screening them to obtain the video text labels.
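A minimal sketch of claim 7's two steps: candidate extraction against a tag database, then screening by part of speech and importance. The tiny `TAG_DB`, its scores, and the noun-only screening rule are invented for illustration:

```python
# Tiny stand-in for the multi-level tag database of claim 7;
# entries and importance scores are invented for illustration.
TAG_DB = {
    "tennis": {"pos": "noun", "importance": 0.9},
    "match":  {"pos": "noun", "importance": 0.6},
    "watch":  {"pos": "verb", "importance": 0.2},
}

def candidate_entity_labels(text):
    """Extract candidate entity labels: words found in the tag database."""
    return [w for w in text.lower().split() if w in TAG_DB]

def screen_labels(candidates, min_importance=0.5):
    """Screen by part of speech and importance (keep important nouns)."""
    return [w for w in candidates
            if TAG_DB[w]["pos"] == "noun"
            and TAG_DB[w]["importance"] >= min_importance]

cands = candidate_entity_labels("Watch the tennis match highlights")
print(screen_labels(cands))  # → ['tennis', 'match']
```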
8. The method according to claim 1, wherein said determining the semantic label of the video to be identified based on the video content label and the video text label comprises:
determining, based on a pre-established multi-level tag database, the categories to which the labels in the video content label belong and the relationships between the labels in the video content label and other labels in the multi-level tag database;
analyzing, using natural language processing techniques, the labels in the video content label, the labels of the categories to which they belong, and the labels determined by the relationships, to obtain their parts of speech and importance;
ranking and screening the labels in the video content label and the video text label based on the parts of speech and importance to obtain the semantic label of the video to be identified.
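Claim 8's final step merges, ranks, and screens the two label sets. The score dictionary and top-k cutoff below are illustrative assumptions; in the patent the importance comes from NLP analysis over the multi-level tag database:

```python
def rank_and_screen(content_labels, text_labels, importance, top_k=3):
    """Merge content and text labels, rank by importance, keep top_k."""
    merged = list(dict.fromkeys(content_labels + text_labels))  # dedupe
    ranked = sorted(merged, key=lambda lab: importance.get(lab, 0.0),
                    reverse=True)
    return ranked[:top_k]

scores = {"tennis": 0.9, "sports": 0.8, "Li Na": 0.7, "video": 0.1}
print(rank_and_screen(["sports", "Li Na"], ["tennis", "video"], scores))
# → ['tennis', 'sports', 'Li Na']
```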
9. The method according to claim 1, wherein the method further comprises:
pushing videos to users based on the video text label.
10. An apparatus for generating information, comprising:
a video acquisition unit configured to acquire a video to be identified;
a video understanding unit configured to understand the content of the video to be identified using a video understanding technique to obtain a video content label;
a video analysis unit configured to analyze the text of the video to be identified using a text data analysis technique to obtain a video text label;
a label determination unit configured to determine the semantic label of the video to be identified based on the video content label and the video text label.
11. The apparatus according to claim 10, wherein the video understanding unit comprises at least one of the following:
a video classification subunit configured to input the video to be identified into a video classification model to obtain a category label;
a face detection subunit configured to detect faces in the video frames of the video to be identified frame by frame, match the detected faces against face samples in a face database, and obtain the person-name label of the face sample that matches a detected face together with the person-information label associated with that person name;
an action recognition subunit configured to identify actions in the video frames of the video to be identified frame by frame using a pre-trained action detection model to obtain action information, and fuse the action information of the frames to obtain an action label;
a scene and entity recognition subunit configured to identify scenes and entities in the video frames of the video to be identified frame by frame using a pre-trained recognition and classification model, and fuse the recognition results to obtain scene labels and entity labels for the video frames.
12. The apparatus according to claim 11, wherein the video classification subunit comprises:
a video frame extraction subunit configured to uniformly sample video frames of the video to be identified to obtain a video frame sequence to be identified;
an image feature extraction subunit configured to perform feature extraction on the video frame sequence using an image classification network to obtain an image feature sequence of the video to be identified;
an audio signal extraction subunit configured to extract the audio signal of the video to be identified;
a speech feature extraction subunit configured to input the audio signal of the video to be identified into a convolutional neural network for speech classification and perform feature extraction on each second of speech to obtain a speech feature sequence of the video to be identified;
a probability value determination subunit configured to determine, based on the image feature sequence and the speech feature sequence, a probability value for each label for the video to be identified;
a category label determination subunit configured to determine labels whose probability values exceed a threshold as the category labels of the video to be identified.
13. The apparatus according to claim 12, wherein the probability value determination subunit is further configured to:
input the image feature sequence and the speech feature sequence into a pre-trained dual-stream long short-term memory network to obtain the probability value for each label for the video to be identified.
14. The apparatus according to claim 12, wherein the image classification network in the image feature extraction subunit is trained based on video frame features modeled with a temporal segment network and the labels of the corresponding video samples; and/or
the convolutional neural network for speech classification in the speech feature extraction subunit is determined by the following steps: extracting Mel-scale filter bank features from the audio signals of video samples; and training the convolutional neural network for speech classification based on the Mel-scale filter bank features and the labels of the corresponding audio signals.
15. The apparatus according to claim 10, wherein the text of the video to be identified in the video analysis unit comprises at least one of the following:
the title text of the video to be identified;
text detected in the video frames of the video to be identified using video OCR.
16. The apparatus according to claim 10 or 15, wherein the video analysis unit is further configured to:
extract candidate entity labels of the video to be identified from a multi-level tag database based on the text of the video to be identified;
analyze the part of speech and importance of the entity labels using NLP techniques, and screen them to obtain the video text labels.
17. The apparatus according to claim 10, wherein the label determination unit comprises:
a label relationship determination subunit configured to determine, based on a pre-established multi-level tag database, the categories to which the labels in the video content label belong and the relationships between the labels in the video content label and other labels in the multi-level tag database;
a part-of-speech and importance determination subunit configured to analyze, using natural language processing techniques, the labels in the video content label, the labels of the categories to which they belong, and the labels determined by the relationships, to obtain their parts of speech and importance;
a label ranking and screening subunit configured to rank and screen the labels in the video content label and the video text label based on the parts of speech and importance to obtain the semantic label of the video to be identified.
18. The apparatus according to claim 10, wherein the apparatus further comprises:
a video pushing unit configured to push videos to users based on the video text label.
19. A server, comprising:
one or more processors; and
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method according to any one of claims 1-9.
20. A computer-readable medium having a computer program stored thereon, wherein the program, when executed by a processor, implements the method according to any one of claims 1-9.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810878632.6A CN109325148A (en) | 2018-08-03 | 2018-08-03 | The method and apparatus for generating information |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810878632.6A CN109325148A (en) | 2018-08-03 | 2018-08-03 | The method and apparatus for generating information |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109325148A true CN109325148A (en) | 2019-02-12 |
Family
ID=65263242
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810878632.6A Pending CN109325148A (en) | 2018-08-03 | 2018-08-03 | The method and apparatus for generating information |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109325148A (en) |
Cited By (63)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109275027A (en) * | 2018-09-26 | 2019-01-25 | Tcl海外电子(惠州)有限公司 | Speech output method, electronic playback devices and the storage medium of video |
CN109819284A (en) * | 2019-02-18 | 2019-05-28 | 平安科技(深圳)有限公司 | A kind of short video recommendation method, device, computer equipment and storage medium |
CN109886335A (en) * | 2019-02-21 | 2019-06-14 | 厦门美图之家科技有限公司 | Disaggregated model training method and device |
CN109933688A (en) * | 2019-02-13 | 2019-06-25 | 北京百度网讯科技有限公司 | Determine the method, apparatus, equipment and computer storage medium of video labeling information |
CN109947989A (en) * | 2019-03-18 | 2019-06-28 | 北京字节跳动网络技术有限公司 | Method and apparatus for handling video |
CN110147711A (en) * | 2019-02-27 | 2019-08-20 | 腾讯科技(深圳)有限公司 | Video scene recognition methods, device, storage medium and electronic device |
CN110213668A (en) * | 2019-04-29 | 2019-09-06 | 北京三快在线科技有限公司 | Generation method, device, electronic equipment and the storage medium of video title |
CN110222234A (en) * | 2019-06-14 | 2019-09-10 | 北京奇艺世纪科技有限公司 | A kind of video classification methods and device |
CN110267097A (en) * | 2019-06-26 | 2019-09-20 | 北京字节跳动网络技术有限公司 | Video pushing method, device and electronic equipment based on characteristic of division |
CN110263214A (en) * | 2019-06-21 | 2019-09-20 | 北京百度网讯科技有限公司 | Generation method, device, server and the storage medium of video title |
CN110278447A (en) * | 2019-06-26 | 2019-09-24 | 北京字节跳动网络技术有限公司 | Video pushing method, device and electronic equipment based on continuous feature |
CN110287371A (en) * | 2019-06-26 | 2019-09-27 | 北京字节跳动网络技术有限公司 | Video pushing method, device and electronic equipment end to end |
CN110300329A (en) * | 2019-06-26 | 2019-10-01 | 北京字节跳动网络技术有限公司 | Video pushing method, device and electronic equipment based on discrete features |
CN110503076A (en) * | 2019-08-29 | 2019-11-26 | 腾讯科技(深圳)有限公司 | Video classification methods, device, equipment and medium based on artificial intelligence |
CN110532433A (en) * | 2019-09-03 | 2019-12-03 | 北京百度网讯科技有限公司 | Entity recognition method, device, electronic equipment and the medium of video scene |
CN110598651A (en) * | 2019-09-17 | 2019-12-20 | 腾讯科技(深圳)有限公司 | Information processing method, device and storage medium |
CN110674349A (en) * | 2019-09-27 | 2020-01-10 | 北京字节跳动网络技术有限公司 | Video POI (Point of interest) identification method and device and electronic equipment |
CN110688526A (en) * | 2019-11-07 | 2020-01-14 | 山东舜网传媒股份有限公司 | Short video recommendation method and system based on key frame identification and audio textualization |
CN110704680A (en) * | 2019-08-20 | 2020-01-17 | 咪咕文化科技有限公司 | Label generation method, electronic device and storage medium |
CN110769267A (en) * | 2019-10-30 | 2020-02-07 | 北京达佳互联信息技术有限公司 | Video display method and device, electronic equipment and storage medium |
CN110781348A (en) * | 2019-10-25 | 2020-02-11 | 北京威晟艾德尔科技有限公司 | Video file analysis method |
CN110781347A (en) * | 2019-10-23 | 2020-02-11 | 腾讯科技(深圳)有限公司 | Video processing method, device, equipment and readable storage medium |
CN110826471A (en) * | 2019-11-01 | 2020-02-21 | 腾讯科技(深圳)有限公司 | Video label labeling method, device, equipment and computer readable storage medium |
CN111143611A (en) * | 2019-12-31 | 2020-05-12 | 新疆联海创智信息科技有限公司 | Information acquisition method and device |
CN111222011A (en) * | 2020-01-06 | 2020-06-02 | 腾讯科技(深圳)有限公司 | Video vector determination method and device |
CN111274442A (en) * | 2020-03-19 | 2020-06-12 | 聚好看科技股份有限公司 | Method for determining video label, server and storage medium |
CN111444331A (en) * | 2020-03-12 | 2020-07-24 | 腾讯科技(深圳)有限公司 | Content-based distributed feature extraction method, device, equipment and medium |
CN111582360A (en) * | 2020-05-06 | 2020-08-25 | 北京字节跳动网络技术有限公司 | Method, apparatus, device and medium for labeling data |
CN111586494A (en) * | 2020-04-30 | 2020-08-25 | 杭州慧川智能科技有限公司 | Intelligent strip splitting method based on audio and video separation |
CN111626202A (en) * | 2020-05-27 | 2020-09-04 | 北京百度网讯科技有限公司 | Method and device for identifying video |
CN111695422A (en) * | 2020-05-06 | 2020-09-22 | Oppo(重庆)智能科技有限公司 | Video tag acquisition method and device, storage medium and server |
CN111737523A (en) * | 2020-04-22 | 2020-10-02 | 聚好看科技股份有限公司 | Video tag, search content generation method and server |
WO2020199904A1 (en) * | 2019-04-02 | 2020-10-08 | 腾讯科技(深圳)有限公司 | Video description information generation method, video processing method, and corresponding devices |
CN111767765A (en) * | 2019-04-01 | 2020-10-13 | Oppo广东移动通信有限公司 | Video processing method and device, storage medium and electronic equipment |
CN111767796A (en) * | 2020-05-29 | 2020-10-13 | 北京奇艺世纪科技有限公司 | Video association method, device, server and readable storage medium |
CN111797850A (en) * | 2019-04-09 | 2020-10-20 | Oppo广东移动通信有限公司 | Video classification method and device, storage medium and electronic equipment |
CN111859947A (en) * | 2019-04-24 | 2020-10-30 | 北京嘀嘀无限科技发展有限公司 | Text processing device and method, electronic equipment and storage medium |
CN112052356A (en) * | 2020-08-14 | 2020-12-08 | 腾讯科技(深圳)有限公司 | Multimedia classification method, apparatus and computer-readable storage medium |
CN112115299A (en) * | 2020-09-17 | 2020-12-22 | 北京百度网讯科技有限公司 | Video searching method and device, recommendation method, electronic device and storage medium |
CN112163560A (en) * | 2020-10-22 | 2021-01-01 | 腾讯科技(深圳)有限公司 | Video information processing method and device, electronic equipment and storage medium |
CN112784111A (en) * | 2021-03-12 | 2021-05-11 | 有半岛(北京)信息科技有限公司 | Video classification method, device, equipment and medium |
CN112822506A (en) * | 2021-01-22 | 2021-05-18 | 百度在线网络技术(北京)有限公司 | Method and apparatus for analyzing video stream |
WO2021099858A1 (en) * | 2019-11-19 | 2021-05-27 | International Business Machines Corporation | Video segmentation based on weighted knowledge graph |
CN112948631A (en) * | 2019-12-11 | 2021-06-11 | 北京金山云网络技术有限公司 | Video tag generation method and device and electronic terminal |
CN113038175A (en) * | 2021-02-26 | 2021-06-25 | 北京百度网讯科技有限公司 | Video processing method and device, electronic equipment and computer readable storage medium |
CN113032342A (en) * | 2021-03-03 | 2021-06-25 | 北京车和家信息技术有限公司 | Video labeling method and device, electronic equipment and storage medium |
CN113132752A (en) * | 2019-12-30 | 2021-07-16 | 阿里巴巴集团控股有限公司 | Video processing method and device |
CN113163272A (en) * | 2020-01-07 | 2021-07-23 | 海信集团有限公司 | Video editing method, computer device and storage medium |
CN113254814A (en) * | 2021-05-12 | 2021-08-13 | 平安国际智慧城市科技股份有限公司 | Network course video labeling method and device, electronic equipment and medium |
CN113365102A (en) * | 2020-03-04 | 2021-09-07 | 阿里巴巴集团控股有限公司 | Video processing method and device and label processing method and device |
CN113382279A (en) * | 2021-06-15 | 2021-09-10 | 北京百度网讯科技有限公司 | Live broadcast recommendation method, device, equipment, storage medium and computer program product |
CN113435443A (en) * | 2021-06-28 | 2021-09-24 | 中国兵器装备集团自动化研究所有限公司 | Method for automatically identifying landmark from video |
CN113569088A (en) * | 2021-09-27 | 2021-10-29 | 腾讯科技(深圳)有限公司 | Music recommendation method and device and readable storage medium |
CN113673427A (en) * | 2021-08-20 | 2021-11-19 | 北京达佳互联信息技术有限公司 | Video identification determination method and device, electronic equipment and storage medium |
CN113821681A (en) * | 2021-09-17 | 2021-12-21 | 深圳力维智联技术有限公司 | Video tag generation method, device and equipment |
CN113901263A (en) * | 2021-09-30 | 2022-01-07 | 宿迁硅基智能科技有限公司 | Label generating method and device for video material |
CN113987267A (en) * | 2021-10-28 | 2022-01-28 | 上海数禾信息科技有限公司 | Video file label generation method and device, computer equipment and storage medium |
CN114140673A (en) * | 2022-02-07 | 2022-03-04 | 人民中科(济南)智能技术有限公司 | Illegal image identification method, system and equipment |
CN114693353A (en) * | 2022-03-31 | 2022-07-01 | 方付春 | Electronic commerce data processing method, electronic commerce system and cloud platform |
CN116028593A (en) * | 2022-12-14 | 2023-04-28 | 北京百度网讯科技有限公司 | Character identity information recognition method and device in text, electronic equipment and medium |
CN116680624A (en) * | 2023-08-03 | 2023-09-01 | 国网浙江省电力有限公司宁波供电公司 | Classification method, system and storage medium for metadata of power system |
CN117573870A (en) * | 2023-11-20 | 2024-02-20 | 中国人民解放军国防科技大学 | Text label extraction method, device, equipment and medium for multi-mode data |
CN111859947B (en) * | 2019-04-24 | 2024-05-10 | 北京嘀嘀无限科技发展有限公司 | Text processing device, method, electronic equipment and storage medium |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP1909195A1 (en) * | 2006-10-05 | 2008-04-09 | Kubj Limited | Various methods and apparatuses for moving thumbnails with metadata |
CN103164471A (en) * | 2011-12-15 | 2013-06-19 | 盛乐信息技术(上海)有限公司 | Recommendation method and system of video text labels |
CN103377381A (en) * | 2012-04-26 | 2013-10-30 | 富士通株式会社 | Method and device for identifying content attribute of image |
US20150139610A1 (en) * | 2013-11-15 | 2015-05-21 | Clipmine, Inc. | Computer-assisted collaborative tagging of video content for indexing and table of contents generation |
CN105046630A (en) * | 2014-04-04 | 2015-11-11 | 影像搜索者公司 | image tag add system |
CN105095288A (en) * | 2014-05-14 | 2015-11-25 | 腾讯科技(深圳)有限公司 | Data analysis method and data analysis device |
CN105930841A (en) * | 2016-05-13 | 2016-09-07 | 百度在线网络技术(北京)有限公司 | Method and device for automatic semantic annotation of image, and computer equipment |
CN108229662A (en) * | 2018-01-03 | 2018-06-29 | 华南理工大学 | A kind of multi-modal time series modeling method based on two benches study |
CN108228911A (en) * | 2018-02-11 | 2018-06-29 | 北京搜狐新媒体信息技术有限公司 | The computational methods and device of a kind of similar video |
CN108256513A (en) * | 2018-03-23 | 2018-07-06 | 中国科学院长春光学精密机械与物理研究所 | A kind of intelligent video analysis method and intelligent video record system |
- 2018-08-03: CN application CN201810878632.6A filed; published as CN109325148A (en); status: Pending
Non-Patent Citations (1)
Title |
---|
LUO SHICAO: "Research on Image Semantic Extraction and Image Retrieval Technology Based on Deep Learning", China Masters' Theses Full-text Database, Information Science and Technology * |
Cited By (102)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109275027A (en) * | 2018-09-26 | 2019-01-25 | Tcl海外电子(惠州)有限公司 | Speech output method, electronic playback devices and the storage medium of video |
CN109933688A (en) * | 2019-02-13 | 2019-06-25 | 北京百度网讯科技有限公司 | Determine the method, apparatus, equipment and computer storage medium of video labeling information |
CN109819284A (en) * | 2019-02-18 | 2019-05-28 | 平安科技(深圳)有限公司 | A kind of short video recommendation method, device, computer equipment and storage medium |
CN109819284B (en) * | 2019-02-18 | 2022-11-15 | 平安科技(深圳)有限公司 | Short video recommendation method and device, computer equipment and storage medium |
CN109886335B (en) * | 2019-02-21 | 2021-11-26 | 厦门美图之家科技有限公司 | Classification model training method and device |
CN109886335A (en) * | 2019-02-21 | 2019-06-14 | 厦门美图之家科技有限公司 | Disaggregated model training method and device |
CN110147711A (en) * | 2019-02-27 | 2019-08-20 | 腾讯科技(深圳)有限公司 | Video scene recognition methods, device, storage medium and electronic device |
CN110147711B (en) * | 2019-02-27 | 2023-11-14 | 腾讯科技(深圳)有限公司 | Video scene recognition method and device, storage medium and electronic device |
CN109947989A (en) * | 2019-03-18 | 2019-06-28 | 北京字节跳动网络技术有限公司 | Method and apparatus for handling video |
CN109947989B (en) * | 2019-03-18 | 2023-08-29 | 北京字节跳动网络技术有限公司 | Method and apparatus for processing video |
CN111767765A (en) * | 2019-04-01 | 2020-10-13 | Oppo广东移动通信有限公司 | Video processing method and device, storage medium and electronic equipment |
WO2020199904A1 (en) * | 2019-04-02 | 2020-10-08 | 腾讯科技(深圳)有限公司 | Video description information generation method, video processing method, and corresponding devices |
US11861886B2 (en) | 2019-04-02 | 2024-01-02 | Tencent Technology (Shenzhen) Company Limited | Method and apparatus for generating video description information, and method and apparatus for video processing |
CN111797850A (en) * | 2019-04-09 | 2020-10-20 | Oppo广东移动通信有限公司 | Video classification method and device, storage medium and electronic equipment |
CN111859947A (en) * | 2019-04-24 | 2020-10-30 | 北京嘀嘀无限科技发展有限公司 | Text processing device and method, electronic equipment and storage medium |
CN111859947B (en) * | 2019-04-24 | 2024-05-10 | 北京嘀嘀无限科技发展有限公司 | Text processing device, method, electronic equipment and storage medium |
CN110213668A (en) * | 2019-04-29 | 2019-09-06 | 北京三快在线科技有限公司 | Generation method, device, electronic equipment and the storage medium of video title |
CN110222234A (en) * | 2019-06-14 | 2019-09-10 | 北京奇艺世纪科技有限公司 | A kind of video classification methods and device |
CN110263214A (en) * | 2019-06-21 | 2019-09-20 | 北京百度网讯科技有限公司 | Generation method, device, server and the storage medium of video title |
CN110278447A (en) * | 2019-06-26 | 2019-09-24 | 北京字节跳动网络技术有限公司 | Video pushing method, device and electronic equipment based on continuous feature |
CN110267097A (en) * | 2019-06-26 | 2019-09-20 | 北京字节跳动网络技术有限公司 | Video pushing method, device and electronic equipment based on characteristic of division |
CN110300329B (en) * | 2019-06-26 | 2022-08-12 | 北京字节跳动网络技术有限公司 | Video pushing method and device based on discrete features and electronic equipment |
CN110287371A (en) * | 2019-06-26 | 2019-09-27 | 北京字节跳动网络技术有限公司 | Video pushing method, device and electronic equipment end to end |
CN110278447B (en) * | 2019-06-26 | 2021-07-20 | 北京字节跳动网络技术有限公司 | Video pushing method and device based on continuous features and electronic equipment |
CN110300329A (en) * | 2019-06-26 | 2019-10-01 | 北京字节跳动网络技术有限公司 | Video pushing method, device and electronic equipment based on discrete features |
CN110704680A (en) * | 2019-08-20 | 2020-01-17 | 咪咕文化科技有限公司 | Label generation method, electronic device and storage medium |
CN110704680B (en) * | 2019-08-20 | 2022-10-04 | 咪咕文化科技有限公司 | Label generation method, electronic device and storage medium |
CN110503076B (en) * | 2019-08-29 | 2023-06-30 | 腾讯科技(深圳)有限公司 | Video classification method, device, equipment and medium based on artificial intelligence |
CN110503076A (en) * | 2019-08-29 | 2019-11-26 | 腾讯科技(深圳)有限公司 | Video classification methods, device, equipment and medium based on artificial intelligence |
CN110532433A (en) * | 2019-09-03 | 2019-12-03 | 北京百度网讯科技有限公司 | Entity recognition method, device, electronic equipment and the medium of video scene |
CN110598651B (en) * | 2019-09-17 | 2021-03-12 | 腾讯科技(深圳)有限公司 | Information processing method, device and storage medium |
CN110598651A (en) * | 2019-09-17 | 2019-12-20 | 腾讯科技(深圳)有限公司 | Information processing method, device and storage medium |
CN110674349A (en) * | 2019-09-27 | 2020-01-10 | 北京字节跳动网络技术有限公司 | Video POI (Point of interest) identification method and device and electronic equipment |
CN110674349B (en) * | 2019-09-27 | 2023-03-14 | 北京字节跳动网络技术有限公司 | Video POI (Point of interest) identification method and device and electronic equipment |
CN110781347A (en) * | 2019-10-23 | 2020-02-11 | 腾讯科技(深圳)有限公司 | Video processing method, device, equipment and readable storage medium |
CN110781348A (en) * | 2019-10-25 | 2020-02-11 | 北京威晟艾德尔科技有限公司 | Video file analysis method |
CN110769267A (en) * | 2019-10-30 | 2020-02-07 | 北京达佳互联信息技术有限公司 | Video display method and device, electronic equipment and storage medium |
CN110769267B (en) * | 2019-10-30 | 2022-02-08 | 北京达佳互联信息技术有限公司 | Video display method and device, electronic equipment and storage medium |
CN110826471B (en) * | 2019-11-01 | 2023-07-14 | 腾讯科技(深圳)有限公司 | Video tag labeling method, device, equipment and computer readable storage medium |
CN110826471A (en) * | 2019-11-01 | 2020-02-21 | 腾讯科技(深圳)有限公司 | Video label labeling method, device, equipment and computer readable storage medium |
CN110688526A (en) * | 2019-11-07 | 2020-01-14 | 山东舜网传媒股份有限公司 | Short video recommendation method and system based on key frame identification and audio textualization |
WO2021099858A1 (en) * | 2019-11-19 | 2021-05-27 | International Business Machines Corporation | Video segmentation based on weighted knowledge graph |
US11093755B2 (en) | 2019-11-19 | 2021-08-17 | International Business Machines Corporation | Video segmentation based on weighted knowledge graph |
CN114746857B (en) * | 2019-11-19 | 2023-05-09 | 国际商业机器公司 | Video segmentation based on weighted knowledge graph |
CN114746857A (en) * | 2019-11-19 | 2022-07-12 | 国际商业机器公司 | Video segmentation based on weighted knowledge graph |
GB2605723A (en) * | 2019-11-19 | 2022-10-12 | Ibm | Video segmentation based on weighted knowledge graph |
CN112948631A (en) * | 2019-12-11 | 2021-06-11 | 北京金山云网络技术有限公司 | Video tag generation method and device and electronic terminal |
CN113132752A (en) * | 2019-12-30 | 2021-07-16 | 阿里巴巴集团控股有限公司 | Video processing method and device |
CN113132752B (en) * | 2019-12-30 | 2023-02-24 | 阿里巴巴集团控股有限公司 | Video processing method and device |
CN111143611B (en) * | 2019-12-31 | 2024-01-16 | 新疆联海创智信息科技有限公司 | Information acquisition method and device |
CN111143611A (en) * | 2019-12-31 | 2020-05-12 | 新疆联海创智信息科技有限公司 | Information acquisition method and device |
CN111222011B (en) * | 2020-01-06 | 2023-11-14 | 腾讯科技(深圳)有限公司 | Video vector determining method and device |
CN111222011A (en) * | 2020-01-06 | 2020-06-02 | 腾讯科技(深圳)有限公司 | Video vector determination method and device |
CN113163272A (en) * | 2020-01-07 | 2021-07-23 | 海信集团有限公司 | Video editing method, computer device and storage medium |
CN113365102B (en) * | 2020-03-04 | 2022-08-16 | 阿里巴巴集团控股有限公司 | Video processing method and device and label processing method and device |
CN113365102A (en) * | 2020-03-04 | 2021-09-07 | 阿里巴巴集团控股有限公司 | Video processing method and device and label processing method and device |
CN111444331A (en) * | 2020-03-12 | 2020-07-24 | 腾讯科技(深圳)有限公司 | Content-based distributed feature extraction method, device, equipment and medium |
CN111444331B (en) * | 2020-03-12 | 2023-04-07 | 腾讯科技(深圳)有限公司 | Content-based distributed feature extraction method, device, equipment and medium |
CN111274442A (en) * | 2020-03-19 | 2020-06-12 | 聚好看科技股份有限公司 | Method for determining video label, server and storage medium |
CN111274442B (en) * | 2020-03-19 | 2023-10-27 | 聚好看科技股份有限公司 | Method for determining video tag, server and storage medium |
CN111737523B (en) * | 2020-04-22 | 2023-11-14 | 聚好看科技股份有限公司 | Video tag, generation method of search content and server |
CN111737523A (en) * | 2020-04-22 | 2020-10-02 | 聚好看科技股份有限公司 | Video tag, search content generation method and server |
CN111586494A (en) * | 2020-04-30 | 2020-08-25 | 杭州慧川智能科技有限公司 | Intelligent strip splitting method based on audio and video separation |
CN111695422A (en) * | 2020-05-06 | 2020-09-22 | OPPO (Chongqing) Intelligent Technology Co., Ltd. | Video tag acquisition method and device, storage medium and server |
CN111582360A (en) * | 2020-05-06 | 2020-08-25 | Beijing ByteDance Network Technology Co., Ltd. | Method, apparatus, device and medium for labeling data |
CN111695422B (en) * | 2020-05-06 | 2023-08-18 | OPPO (Chongqing) Intelligent Technology Co., Ltd. | Video tag acquisition method and device, storage medium and server |
CN111582360B (en) * | 2020-05-06 | 2023-08-15 | Beijing ByteDance Network Technology Co., Ltd. | Method, apparatus, device and medium for labeling data |
CN111626202A (en) * | 2020-05-27 | 2020-09-04 | Beijing Baidu Netcom Science and Technology Co., Ltd. | Method and device for identifying video |
CN111626202B (en) * | 2020-05-27 | 2023-08-29 | Beijing Baidu Netcom Science and Technology Co., Ltd. | Method and device for identifying video |
US11657612B2 (en) | 2020-05-27 | 2023-05-23 | Beijing Baidu Netcom Science And Technology Co., Ltd. | Method and apparatus for identifying video |
CN111767796B (en) * | 2020-05-29 | 2023-12-15 | Beijing QIYI Century Science & Technology Co., Ltd. | Video association method, device, server and readable storage medium |
CN111767796A (en) * | 2020-05-29 | 2020-10-13 | Beijing QIYI Century Science & Technology Co., Ltd. | Video association method, device, server and readable storage medium |
CN112052356A (en) * | 2020-08-14 | 2020-12-08 | Tencent Technology (Shenzhen) Co., Ltd. | Multimedia classification method, apparatus and computer-readable storage medium |
CN112052356B (en) * | 2020-08-14 | 2023-11-24 | Tencent Technology (Shenzhen) Co., Ltd. | Multimedia classification method, apparatus and computer-readable storage medium |
CN112115299A (en) * | 2020-09-17 | 2020-12-22 | Beijing Baidu Netcom Science and Technology Co., Ltd. | Video searching method and device, recommendation method, electronic device and storage medium |
CN112163560A (en) * | 2020-10-22 | 2021-01-01 | Tencent Technology (Shenzhen) Co., Ltd. | Video information processing method and device, electronic equipment and storage medium |
CN112163560B (en) * | 2020-10-22 | 2024-03-05 | Tencent Technology (Shenzhen) Co., Ltd. | Video information processing method and device, electronic equipment and storage medium |
CN112822506A (en) * | 2021-01-22 | 2021-05-18 | Baidu Online Network Technology (Beijing) Co., Ltd. | Method and apparatus for analyzing video stream |
CN113038175A (en) * | 2021-02-26 | 2021-06-25 | Beijing Baidu Netcom Science and Technology Co., Ltd. | Video processing method and device, electronic equipment and computer-readable storage medium |
CN113032342A (en) * | 2021-03-03 | 2021-06-25 | Beijing CHJ Information Technology Co., Ltd. | Video labeling method and device, electronic equipment and storage medium |
CN113032342B (en) * | 2021-03-03 | 2023-09-05 | Beijing CHJ Information Technology Co., Ltd. | Video labeling method and device, electronic equipment and storage medium |
CN112784111A (en) * | 2021-03-12 | 2021-05-11 | Youbandao (Beijing) Information Technology Co., Ltd. | Video classification method, device, equipment and medium |
CN113254814A (en) * | 2021-05-12 | 2021-08-13 | Ping An International Smart City Technology Co., Ltd. | Network course video labeling method and device, electronic equipment and medium |
CN113382279A (en) * | 2021-06-15 | 2021-09-10 | Beijing Baidu Netcom Science and Technology Co., Ltd. | Live broadcast recommendation method, device, equipment, storage medium and computer program product |
CN113382279B (en) * | 2021-06-15 | 2022-11-04 | Beijing Baidu Netcom Science and Technology Co., Ltd. | Live broadcast recommendation method, device, equipment, storage medium and computer program product |
CN113435443A (en) * | 2021-06-28 | 2021-09-24 | China South Industries Group Automation Research Institute Co., Ltd. | Method for automatically identifying landmark from video |
CN113673427B (en) * | 2021-08-20 | 2024-03-22 | Beijing Dajia Internet Information Technology Co., Ltd. | Video identification method, device, electronic equipment and storage medium |
CN113673427A (en) * | 2021-08-20 | 2021-11-19 | Beijing Dajia Internet Information Technology Co., Ltd. | Video identification determination method and device, electronic equipment and storage medium |
CN113821681B (en) * | 2021-09-17 | 2023-09-26 | Shenzhen ZNV Technology Co., Ltd. | Video tag generation method, device and equipment |
CN113821681A (en) * | 2021-09-17 | 2021-12-21 | Shenzhen ZNV Technology Co., Ltd. | Video tag generation method, device and equipment |
CN113569088A (en) * | 2021-09-27 | 2021-10-29 | Tencent Technology (Shenzhen) Co., Ltd. | Music recommendation method and device and readable storage medium |
CN113569088B (en) * | 2021-09-27 | 2021-12-21 | Tencent Technology (Shenzhen) Co., Ltd. | Music recommendation method and device and readable storage medium |
CN113901263A (en) * | 2021-09-30 | 2022-01-07 | Suqian Silicon-Based Intelligent Technology Co., Ltd. | Label generating method and device for video material |
CN113987267A (en) * | 2021-10-28 | 2022-01-28 | Shanghai Shuhe Information Technology Co., Ltd. | Video file label generation method and device, computer equipment and storage medium |
CN114140673A (en) * | 2022-02-07 | 2022-03-04 | Renmin Zhongke (Jinan) Intelligent Technology Co., Ltd. | Illegal image identification method, system and equipment |
CN114140673B (en) * | 2022-02-07 | 2022-05-20 | Renmin Zhongke (Beijing) Intelligent Technology Co., Ltd. | Method, system and equipment for identifying violation image |
CN114693353A (en) * | 2022-03-31 | 2022-07-01 | Fang Fuchun | Electronic commerce data processing method, electronic commerce system and cloud platform |
CN116028593A (en) * | 2022-12-14 | 2023-04-28 | Beijing Baidu Netcom Science and Technology Co., Ltd. | Character identity information recognition method and device in text, electronic equipment and medium |
CN116680624B (en) * | 2023-08-03 | 2023-10-20 | State Grid Zhejiang Electric Power Co., Ltd. Ningbo Power Supply Company | Classification method, system and storage medium for metadata of power system |
CN116680624A (en) * | 2023-08-03 | 2023-09-01 | State Grid Zhejiang Electric Power Co., Ltd. Ningbo Power Supply Company | Classification method, system and storage medium for metadata of power system |
CN117573870A (en) * | 2023-11-20 | 2024-02-20 | National University of Defense Technology | Text label extraction method, device, equipment and medium for multi-mode data |
CN117573870B (en) * | 2023-11-20 | 2024-05-07 | National University of Defense Technology | Text label extraction method, device, equipment and medium for multi-mode data |
Similar Documents
Publication | Title
---|---
CN109325148A (en) | Method and apparatus for generating information
CN109117777A (en) | Method and apparatus for generating information
CN111259215B (en) | Multi-mode-based topic classification method, device, equipment and storage medium
CN112632385A (en) | Course recommendation method and device, computer equipment and medium
CN111090763B (en) | Automatic picture labeling method and device
CN111708913B (en) | Label generation method and device and computer-readable storage medium
CN109034069A (en) | Method and apparatus for generating information
CN113254711B (en) | Interactive image display method and device, computer equipment and storage medium
CN111783712A (en) | Video processing method, device, equipment and medium
CN113761253A (en) | Video tag determination method, device, equipment and storage medium
Soltanian et al. | Hierarchical concept score postprocessing and concept-wise normalization in CNN-based video event recognition
CN112015928A (en) | Information extraction method and device of multimedia resource, electronic equipment and storage medium
CN108062416B (en) | Method and apparatus for generating label on map
CN115131698A (en) | Video attribute determination method, device, equipment and storage medium
CN111488813A (en) | Video emotion marking method and device, electronic equipment and storage medium
CN115248855A (en) | Text processing method and device, electronic equipment and computer-readable storage medium
CN116955591A (en) | Recommendation language generation method, related device and medium for content recommendation
CN116977701A (en) | Video classification model training method, video classification method and device
CN112565903A (en) | Video recommendation method and device, server and storage medium
CN116010545A (en) | Data processing method, device and equipment
CN116955707A (en) | Content tag determination method, device, equipment, medium and program product
WO2021147084A1 (en) | Systems and methods for emotion recognition in user-generated video (UGV)
CN115114469A (en) | Picture identification method, device and equipment and storage medium
CN115130453A (en) | Interactive information generation method and device
CN111782762A (en) | Method and device for determining similar questions in question answering application and electronic equipment
Legal Events
Code | Title
---|---
PB01 | Publication
SE01 | Entry into force of request for substantive examination