CN109635158A - Method and apparatus, medium, and electronic device for automatic video labeling - Google Patents
- Publication number
- CN109635158A (application CN201811542198.0A)
- Authority
- CN
- China
- Prior art keywords
- video
- frame
- sequence
- label
- frames
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Image Analysis (AREA)
Abstract
The disclosure relates to a method and apparatus, medium, and electronic device for automatic video labeling, and belongs to the technical field of video processing. The method comprises: in response to obtaining a video, decomposing the video into frames; grouping the decomposed frames according to a predefined rule; concatenating each group of frames into a video frame sequence; inputting the video frame sequences into a machine learning model, which outputs a label for each video frame sequence; and labeling the video based on the labels of the video frame sequences. By labeling videos automatically through a machine learning model, the disclosure improves both the accuracy and the efficiency of labeling.
Description
Technical field
The disclosure relates to the technical field of video processing, and in particular to a method and apparatus, medium, and electronic device for automatic video labeling.
Background technique
A video tag is a label assigned according to the attributes of a video. It is an important basis for organizing video content and for providing personalized recommendations to users.
In recent years, spreading information and presenting oneself through video has become very popular. Both users looking for videos of interest and merchants or platforms recommending videos rely on video tags. In particular, a large proportion of videos contain no speech or subtitle information, so the conventional approach of labeling a video from its speech and subtitles is not feasible for them. Relying on manual annotation for this portion of videos, in turn, leads to low labeling efficiency and low accuracy.
Accordingly, it is desirable to provide a new method and apparatus, medium, and electronic device for automatic video labeling.
It should be noted that the information disclosed in the above Background section is only intended to enhance understanding of the background of the disclosure, and may therefore include information that does not constitute prior art known to a person of ordinary skill in the art.
Summary of the invention
The disclosure aims to provide a scheme for automatic video labeling that overcomes, at least to some extent, the low efficiency and low accuracy of video labeling caused by the limitations and defects of the related art.
According to one aspect of the disclosure, a method for automatic video labeling is provided, comprising:
in response to obtaining a video, decomposing the video into frames;
grouping the decomposed frames according to a predefined rule;
concatenating each group of frames into a video frame sequence;
inputting the video frame sequences into a machine learning model, which outputs a label for each video frame sequence; and
labeling the video based on the labels of the video frame sequences,
wherein the machine learning model is trained as follows: each video frame sequence sample in a sample set is input into the machine learning model, the samples having been produced by grouping videos with various known labels according to the predefined rule and concatenating each group of frames; the machine learning model outputs a label for the video from which each sample was taken, and this label is compared with the video's known label; if they are inconsistent, the coefficients of the machine learning model are adjusted until the label output by the model is consistent with the known label.
In an exemplary embodiment of the disclosure, grouping the decomposed frames according to the predefined rule comprises: taking a predetermined number of consecutive frames as one group.
In an exemplary embodiment of the disclosure, grouping the decomposed frames according to the predefined rule comprises: randomly taking a predetermined number of frames from the decomposed frames as one group.
In an exemplary embodiment of the disclosure, grouping the decomposed frames according to the predefined rule comprises: dividing the frames of the video into N groups, where N is a positive integer and the number of video frame sequences is also N, with the frames numbered aN+i forming the i-th group, where a is a non-negative integer and i is a positive integer, 0 ≤ a ≤ N-1, 1 ≤ i ≤ N.
In an exemplary embodiment of the disclosure, concatenating each group of frames into a video frame sequence comprises: concatenating the frames of each group in the order of their frame numbers.
In an exemplary embodiment of the disclosure, labeling the video based on the labels of the video frame sequences comprises: taking the label with the largest probability weight among the labels of the obtained video frame sequences as the final label of the video.
In an exemplary embodiment of the disclosure, labeling the video based on the labels of the video frame sequences comprises: taking the N most frequent labels among the labels of the obtained video frame sequences as the labels assigned to the video.
In an exemplary embodiment of the disclosure, taking the most frequent label among the labels of the obtained video frame sequences as the label assigned to the video comprises: if several labels are tied for the maximum count, increasing the number of groups into which the decomposed frames are divided.
In an exemplary embodiment of the disclosure, increasing the number of groups into which the decomposed frames are divided comprises: if the predefined rule takes a predetermined number of consecutive frames as one group, increasing the number of groups so that the frames contained in at least some of the groups partially overlap.
In an exemplary embodiment of the disclosure, grouping the decomposed frames according to the predefined rule comprises: grouping the decomposed frames according to a first predefined rule and also according to a second predefined rule, the first predefined rule being different from the second predefined rule.
According to one aspect of the disclosure, an apparatus for automatic video labeling is provided, comprising:
a decomposition module, configured to decompose the video into frames in response to obtaining the video;
a grouping module, configured to group the decomposed frames according to a predefined rule;
a concatenation module, configured to concatenate each group of frames into a video frame sequence;
a first labeling module, configured to input the video frame sequences into a machine learning model, which outputs a label for each video frame sequence; and
a second labeling module, configured to label the video based on the labels of the video frame sequences.
According to one aspect of the disclosure, a computer-readable storage medium is provided, on which a computer program is stored, wherein the computer program, when executed by a processor, implements the method of any of the above embodiments.
According to one aspect of the disclosure, an electronic device is provided, comprising:
a processor; and
a memory for storing instructions executable by the processor;
wherein the processor is configured to perform the method of any of the above embodiments by executing the executable instructions.
The disclosure provides a scheme for automatic video labeling. In this scheme, in response to obtaining a video, the video is decomposed into frames; the decomposed frames are grouped according to a predefined rule; each group of frames is concatenated into a video frame sequence; the video frame sequences are input into a machine learning model, which outputs a label for each sequence; and the video is labeled based on these labels. By labeling videos automatically through a machine learning model, the disclosure improves labeling accuracy and efficiency. Moreover, to avoid the inefficiency of feeding frames into the machine learning model one by one, the frames are concatenated after grouping, so that the model's input is regular; this further increases input efficiency and labeling accuracy.
It should be understood that the above general description and the following detailed description are merely exemplary and explanatory, and do not limit the disclosure.
Detailed description of the invention
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the disclosure and, together with the specification, serve to explain the principles of the disclosure. Evidently, the drawings described below are only some embodiments of the disclosure; a person of ordinary skill in the art could derive further drawings from them without creative effort.
Fig. 1 schematically shows a flowchart of a method for automatic video labeling.
Fig. 2 schematically shows an example application scenario of a method for automatic video labeling.
Fig. 3 schematically shows a block diagram of an apparatus for automatic video labeling.
Fig. 4 schematically shows an example block diagram of an electronic device implementing the above method for automatic video labeling.
Fig. 5 schematically shows a computer-readable storage medium implementing the above method for automatic video labeling.
Specific embodiment
Example embodiments will now be described more fully with reference to the accompanying drawings. However, example embodiments can be implemented in many forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that the disclosure will be thorough and complete and will fully convey the concepts of the example embodiments to those skilled in the art. The described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided to give a full understanding of the embodiments of the disclosure. Those skilled in the art will recognize, however, that the technical solutions of the disclosure may be practiced while omitting one or more of the specific details, or with other methods, components, devices, steps, and so on. In other instances, well-known solutions are not shown or described in detail, to avoid letting them dominate and thereby obscure aspects of the disclosure.
Furthermore, the drawings are merely schematic illustrations of the disclosure and are not necessarily drawn to scale. Identical reference numerals in the drawings denote identical or similar parts, and their repeated description is omitted. Some of the block diagrams shown in the drawings are functional entities that do not necessarily correspond to physically or logically independent entities; such functional entities may be implemented in software, in one or more hardware modules or integrated circuits, or in different network and/or processor devices and/or microcontroller devices.
This example embodiment first provides a method for automatic video labeling. In one application scenario of the method, videos are first obtained according to the user's needs, which may include business-oriented recommendation, category display on a video platform, and so on. The videos may be crawled from the public network or obtained from a device with a shooting function; they may or may not contain subtitles or speech, and this example embodiment places no particular limitation on this. A machine learning model is then used to assign labels to these videos automatically, so that during recommendation or classification the videos can be recommended or classified according to the assigned labels. Before the machine learning model labels the videos, each video is decomposed into frames, the frames are grouped according to a predefined rule, and each group is concatenated, in frame-number order, into a video frame sequence; the video frame sequences are then input into the machine learning model for labeling. This reduces the load on the machine learning model and increases the input rate of the model. Finally, the label of the video is derived from the labels the machine learning model assigns to the video frame sequences. The embodiments of the disclosure are generally used to label a video before it goes online; once the video is online, it can be recommended to users according to these labels, and users searching for videos can find it quickly through them. This approach improves, to a certain extent, the accuracy and speed of labeling videos without subtitles and speech, and applies equally well to videos with subtitles and speech. The method for automatic video labeling may run on a server, a server cluster, or a cloud server; of course, those skilled in the art may also run the method of the disclosure on other platforms as required, and this example embodiment places no particular limitation on this. As shown in Fig. 1, the method for automatic video labeling may comprise the following steps:
Step S110. In response to obtaining a video, decompose the video into frames.
Step S120. Group the decomposed frames according to a predefined rule.
Step S130. Concatenate each group of frames into a video frame sequence.
Step S140. Input the video frame sequences into a machine learning model, which outputs a label for each video frame sequence.
Step S150. Label the video based on the labels of the video frame sequences,
wherein the machine learning model is trained as follows: each video frame sequence sample in a sample set is input into the machine learning model, the samples having been produced by grouping videos with various known labels according to the predefined rule and concatenating each group of frames; the model outputs a label for the video from which each sample was taken, and this label is compared with the video's known label; if they are inconsistent, the coefficients of the machine learning model are adjusted until the label output by the model is consistent with the known label.
In the above method for automatic video labeling, the video is decomposed into frames in response to being obtained, the decomposed frames are grouped according to a predefined rule, each group of frames is concatenated into a video frame sequence, the video frame sequences are input into a machine learning model that outputs their labels, and finally the video is labeled based on those labels. This solves the problem of automatically labeling obtained videos according to video attributes, classification needs, and so on. Labeling videos automatically through a machine learning model improves labeling accuracy and efficiency; and to avoid the inefficiency of feeding frames into the model one by one, the frames are concatenated after grouping, so that the model's input is regular, further improving input efficiency and labeling accuracy.
In the following, each step of the above method for automatic video labeling in this example embodiment is explained and described in detail with reference to the accompanying drawings.
In step S110, in response to obtaining a video, the video is decomposed into frames.
In this example embodiment, as shown in Fig. 2, the server 201 first obtains the video from a user terminal 202 or from another server 203. When the video is obtained from the user terminal 202, it is obtained through the user's upload of the video. When the video is obtained from another server 203, it may be obtained by periodically or aperiodically crawling websites that also display videos; of course, in the crawling case, authorization from those websites is obtained.
The user terminal may be a mobile terminal device (for example, a mobile phone) or another terminal device capable of storing or shooting video (for example, a camera or a watch); this example places no special restriction on this. Further, there may be one user terminal or several, and this example places no special restriction on this; likewise, the other server may be any server on the internet, or other storage device, capable of storing video, and there may be one or several of them.
The video may then be decomposed into frames according to the frame-header identifier of each frame. A frame header is added during transmission, so each frame, i.e., each picture of the video, carries a frame-header identifier. Using this identifier, the video can be accurately decomposed into frames.
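The frame-header decomposition described above can be sketched as follows. This is a minimal illustration under assumptions not in the disclosure: the two-byte header marker and the flat byte-stream layout are hypothetical, and a real implementation would follow the actual container format of the video.

```python
# Hypothetical sketch: split a raw video byte stream into frames by scanning
# for a per-frame header marker. The marker value is an illustrative assumption.
FRAME_HEADER = b"\xff\xd8"

def decompose_into_frames(stream: bytes) -> list[bytes]:
    """Return the list of frames in `stream`, one per header marker found."""
    positions = []
    start = 0
    while True:
        idx = stream.find(FRAME_HEADER, start)
        if idx == -1:
            break
        positions.append(idx)
        start = idx + len(FRAME_HEADER)
    # Each frame runs from one header marker to the next (or to end of stream).
    return [stream[p:q] for p, q in zip(positions, positions[1:] + [len(stream)])]

video = FRAME_HEADER + b"aaa" + FRAME_HEADER + b"bb" + FRAME_HEADER + b"c"
frames = decompose_into_frames(video)
print(len(frames))  # 3
```

In practice a frame-extraction library would be used instead of hand-parsing, but the sketch shows how the header identifier makes an accurate split possible.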
In step S120, the decomposed frames are grouped according to a predefined rule.
In an embodiment of this example, the predefined rule takes a predetermined number of consecutive frames as one group: after the video is decomposed into frames, consecutive frames are grouped according to the frame count the machine learning model is adapted to. For example, if the optimal number of frames contained in each concatenated sequence for the label-assigning machine learning model is 12, then every 12 consecutive decomposed frames form one group. Further, the number of frames each group contains need not be exactly the optimal number.
The benefit of taking a predetermined number of consecutive frames as one group is simplified processing: it is convenient and easy.
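A minimal sketch of this consecutive-grouping rule, using the 12-frames-per-group figure from the example above (integer frame numbers stand in for decoded frames):

```python
def group_consecutive(frames, group_size=12):
    """Cut the decomposed frames into runs of `group_size` consecutive frames;
    the last group may be shorter, since the group size need not be exact."""
    return [frames[i:i + group_size] for i in range(0, len(frames), group_size)]

frames = list(range(1, 31))        # frame numbers 1..30
groups = group_consecutive(frames)
print([len(g) for g in groups])    # [12, 12, 6]
```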
In an embodiment of this example, the predefined rule randomly takes a predetermined number of frames from the decomposed frames as one group: M frames are randomly selected from the video frames obtained after decomposition, according to the frame count the machine learning model is adapted to, to form one group. Further, the number of frames each group contains need not be exactly the optimal number.
The benefit of the random-selection scheme is that grouping after shuffling the frames prevents temporally close frames from all being assigned to the same group, so that the frames in one group may reflect various points in time. This makes each group of frames more representative and thereby improves the labeling of the video.
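The random-selection rule can be sketched as shuffling all decomposed frames before cutting groups, so that temporally close frames are scattered across groups. The fixed seed is only for reproducibility of the illustration:

```python
import random

def group_random(frames, group_size, seed=0):
    """Shuffle the decomposed frames, then cut the shuffled list into groups
    of `group_size`, so each group mixes frames from across the whole video."""
    shuffled = list(frames)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i:i + group_size] for i in range(0, len(shuffled), group_size)]

groups = group_random(range(1, 25), group_size=12)
print(len(groups))  # 2 — every frame still appears in exactly one group
```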
In an embodiment of this example, the predefined rule divides the frames of the video into N groups, where N is a positive integer and the number of video frame sequences is also N, with the frames numbered aN+i forming the i-th group, where a is a non-negative integer and i is a positive integer, 0 ≤ a ≤ N-1, 1 ≤ i ≤ N. For example, after a video is decomposed into 100 frames, they are divided into 5 groups: the frames numbered 1, 6, 11, 16, 21, 26, and so on form one group; the frames numbered 2, 7, 12, 17, 22, 27, and so on form another group; and likewise for the remaining groups. Such a combination distributes the frames of each group evenly across the video and introduces a degree of randomness, improving to a certain extent the accuracy with which the machine learning model assigns labels and widening the range of videos the model can label.
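The interleaved rule above is equivalent to collecting every N-th frame, which can be sketched with the 100-frame, 5-group example:

```python
def group_strided(frames, n_groups):
    """Group i (1-based) collects the frames numbered a*N + i, i.e. every
    N-th frame starting at frame i, so each group spans the whole video."""
    return [frames[i::n_groups] for i in range(n_groups)]

frames = list(range(1, 101))      # frame numbers 1..100
groups = group_strided(frames, 5)
print(groups[0][:6])  # [1, 6, 11, 16, 21, 26]
print(groups[1][:6])  # [2, 7, 12, 17, 22, 27]
```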
Although three grouping embodiments are illustrated above, those skilled in the art should understand that the disclosure is not limited to these three grouping approaches.
In an embodiment of this example, the decomposed frames are grouped according to a first predefined rule and also according to a second predefined rule, where each of the first and second predefined rules is any one of the predefined rules of the three preceding embodiments of this example, or another embodiment's rule not disclosed above that those skilled in the art could conceive from the above teaching, and the first predefined rule differs from the second predefined rule. For example, the first predefined rule may be the rule of the first embodiment, i.e., a predetermined number of consecutive frames as one group, and the second predefined rule the rule of the second embodiment, i.e., M frames randomly selected from the decomposed frames, according to the frame count the machine learning model is adapted to, as one group; but the first and second predefined rules cannot both be the rule of the first embodiment, or both be the rule of the second embodiment. When the video frames obtained after decomposition are grouped by the first and second predefined rules simultaneously, the final number of groups may be the same as the number obtained by grouping with the first or second predefined rule alone, or may be twice the number obtained by grouping with one rule alone.
In one embodiment, making the final number of groups the same as the number obtained by grouping with the first or second predefined rule alone can be implemented as follows:
all the decomposed frames are divided into a first part and a second part;
the first part is grouped using the first predefined rule and the second part is grouped using the second predefined rule, and the groups produced by the first predefined rule together with the groups produced by the second predefined rule form the groups into which the decomposed frames are divided.
For example, the first part is the first half of all the decomposed frames: if 100 frames are decomposed, the first 50 frames are the first part. The first predefined rule, taking a predetermined number of consecutive frames as one group, is applied to this first half, for example 10 frames per group, giving 5 groups. The second part is the second half of all the decomposed frames, i.e., the last 50 frames, and the second predefined rule is applied to it, again for example 10 frames per group, giving 5 groups. In total there are 10 groups, which are the groups into which the 100 decomposed frames are divided.
In one embodiment, making the final number of groups twice the number obtained by grouping with the first or second predefined rule alone can be implemented as follows:
all the decomposed frames are grouped using the first predefined rule, and all the decomposed frames are also grouped using the second predefined rule; the groups produced by the first predefined rule together with the groups produced by the second predefined rule form the groups into which the decomposed frames are divided.
For example, the first predefined rule, taking a predetermined number of consecutive frames as one group, is applied to all 100 decomposed frames, for example 10 frames per group, giving 10 groups; then the second predefined rule is applied to the same 100 frames, again for example 10 frames per group, giving another 10 groups. In total there are 20 groups, which are the groups into which the 100 decomposed frames are divided.
This grouping approach increases the randomness of the grouping and thereby improves, to a certain extent, the accuracy of video labeling.
In step S130, each group of frames is concatenated into a video frame sequence.
In this example embodiment, after all the video frames are grouped, the video frames within each group are concatenated into a video frame sequence. The video frames obtained after decomposition are individually separate, and inputting them into the machine learning model one by one would be inefficient; inputting them as video frame sequences effectively improves input efficiency and, in turn, labeling efficiency.
In an embodiment of this example, concatenating each group of frames into a video frame sequence comprises concatenating the frames of each group in the order of their frame numbers. For example, if a group contains the frames numbered 11, 1, 6, 26, 16, and 21, all the frames are concatenated in the order 1, 6, 11, 16, 21, 26. This keeps the sequence consistent with the order of the frames in the original video and effectively improves labeling accuracy.
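The ordered concatenation, with the group from the example above, can be sketched as follows; frames are represented as (frame_number, content) pairs, an illustrative stand-in for decoded frames:

```python
def concatenate_group(group):
    """Serialize one group of (frame_number, content) pairs into a video
    frame sequence ordered by frame number, restoring playback order."""
    return [content for _, content in sorted(group)]

group = [(11, "f11"), (1, "f1"), (6, "f6"), (26, "f26"), (16, "f16"), (21, "f21")]
print(concatenate_group(group))  # ['f1', 'f6', 'f11', 'f16', 'f21', 'f26']
```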
In step S140, the video frame sequences are input into the machine learning model, which outputs a label for each video frame sequence.
In this example embodiment, each video frame sequence is first input into a machine learning model trained in advance, and the pre-trained machine learning model assigns the label of the video according to the video frame sequence.
This example embodiment also includes a training method for the machine learning model. The training method comprises: first, each video frame sequence sample in a sample set is input into the machine learning model, the samples having been produced by grouping videos with various known labels according to the predefined rule and concatenating each group of frames; the machine learning model outputs a label for the video from which each sample was taken, and this label is compared with the video's known label; if they are inconsistent, the coefficients of the machine learning model are adjusted until the label output by the model is consistent with the known label.
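The disclosure does not specify the model architecture, so the training loop can only be sketched abstractly. Below, a toy linear scorer with one weight vector per label, entirely a placeholder, stands in for the machine learning model; the coefficient adjustment on disagreement mirrors the compare-and-adjust procedure described above.

```python
def train(weights, samples, known_labels, epochs=20, lr=0.1):
    """Perceptron-style sketch: score each sequence sample's feature vector
    against per-label weights and adjust coefficients whenever the predicted
    label disagrees with the video's known label."""
    for _ in range(epochs):
        for feats, known in zip(samples, known_labels):
            scores = {lbl: sum(w * f for w, f in zip(ws, feats))
                      for lbl, ws in weights.items()}
            predicted = max(scores, key=scores.get)
            if predicted != known:  # inconsistent with the known label: adjust
                weights[known] = [w + lr * f for w, f in zip(weights[known], feats)]
                weights[predicted] = [w - lr * f for w, f in zip(weights[predicted], feats)]
    return weights

# Hypothetical per-sequence feature vectors for two videos with known labels.
samples = [[1.0, 0.0], [0.0, 1.0]]
known_labels = ["funny", "travel"]
weights = train({"funny": [0.0, 0.0], "travel": [0.0, 0.0]}, samples, known_labels)
```

A real model would be a deep network trained by gradient descent; the sketch only shows the disclosure's training contract (predict, compare with the known label, adjust on mismatch).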
In step S150, the video is labeled based on the labels of the video frame sequences.
In an embodiment of this example, the label with the largest probability weight among the labels of the obtained video frame sequences is taken as the final label of the video. For example, if among all the resulting video frame sequence labels 5 are 'funny', 2 are 'life', and 1 is 'talk show', then 'funny' is taken as the final label of the video.
In an embodiment of this example, the N most frequent labels among the labels of the obtained video frame sequences are all taken as labels assigned to the video. For example, if the predetermined N is 3 and the resulting sequence labels include 5 'travel', 4 'outdoor', 3 'funny', 2 'life', 2 'talk show', and 1 'food', then 'travel', 'outdoor', and 'funny' are all taken as labels of the video.
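Both aggregation variants, the single most frequent label and the top-N labels, reduce to a counting step over the per-sequence labels; the counts below reproduce the example above:

```python
from collections import Counter

def majority_label(sequence_labels):
    """Single-label variant: the most frequent per-sequence label wins."""
    return Counter(sequence_labels).most_common(1)[0][0]

def top_n_labels(sequence_labels, n):
    """Multi-label variant: keep the n most frequent per-sequence labels."""
    return [label for label, _ in Counter(sequence_labels).most_common(n)]

labels = (["travel"] * 5 + ["outdoor"] * 4 + ["funny"] * 3
          + ["life"] * 2 + ["talk show"] * 2 + ["food"])
print(majority_label(labels))   # travel
print(top_n_labels(labels, 3))  # ['travel', 'outdoor', 'funny']
```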
In an embodiment of this example, if several labels are tied for the maximum count, the number of groups into which the decomposed frames are divided is increased; and if the predefined rule takes a predetermined number of consecutive frames as one group, the number of groups is increased so that the frames contained in at least some of the groups partially overlap. For example, suppose the labels assigned to the initially obtained video frame sequences include 3 'funny', 3 'travel', 2 'outdoor', and 1 'scenery': two labels, 'funny' and 'travel', are tied for the maximum count of 3, but the single most frequent label is to be selected as the video tag, so relabeling is needed and a second round of grouping and concatenation is performed. In the second round, the video frame sequences are cut at different points, for example at the midpoints of the first-round sequences: if the first round concatenated frames 1-20 as video frame sequence 1, frames 20-40 as video frame sequence 2, ..., and frames 80-100 as video frame sequence 5, the second round can take different starting points, for example frames 10-30 as video frame sequence 1, frames 30-50 as video frame sequence 2, ..., and frames 90-100 as video frame sequence 5. The labels of all 10 video frame sequences from the two rounds are then counted together, and the label assigned to the most sequences is taken as the label of the video. This labeling approach can improve the accuracy of the assigned label to a certain extent.
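The tie-breaking second pass described above, shifted overlapping windows followed by a pooled vote, can be sketched as follows. Window size and shift follow the 100-frame example, and the per-sequence label lists are hypothetical:

```python
from collections import Counter

def consecutive_windows(n_frames, size, start=1):
    """Cut frames start..n_frames into consecutive windows of `size`;
    shifting `start` in a second pass makes its windows overlap the first."""
    return [list(range(s, min(s + size - 1, n_frames) + 1))
            for s in range(start, n_frames + 1, size)]

first = consecutive_windows(100, 20)             # frames 1-20, 21-40, ..., 81-100
second = consecutive_windows(100, 20, start=11)  # frames 11-30, 31-50, ..., 91-100

# Hypothetical per-sequence labels from the two passes; pooled, the first-pass
# tie between 'funny' and 'travel' is broken by the overlapping second pass.
first_labels = ["funny", "travel", "funny", "travel", "outdoor"]
second_labels = ["funny", "funny", "travel", "funny", "outdoor"]
pooled = Counter(first_labels + second_labels)
print(pooled.most_common(1)[0][0])  # funny
```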
The disclosure further provides an apparatus for automatic video labeling. As shown in Fig. 3, the apparatus for automatic video labeling may comprise a decomposition module 310, a grouping module 320, a concatenation module 330, a first labeling module 340, and a second labeling module 350, wherein:
the decomposition module 310 is configured to decompose the video into frames in response to obtaining the video;
the grouping module 320 is configured to group the decomposed frames according to a predefined rule;
the concatenation module 330 is configured to concatenate each group of frames into a video frame sequence;
the first labeling module 340 is configured to input the video frame sequences into a machine learning model, which outputs a label for each video frame sequence; and
the second labeling module 350 is configured to label the video based on the labels of the video frame sequences.
The specific details of each module of the above apparatus for automatic video labeling have already been described in detail in the corresponding method for automatic video labeling, and are therefore not repeated here.
It should be noted that although several modules or units of the device for performing actions are mentioned in the detailed description above, this division is not mandatory. Indeed, according to embodiments of the disclosure, the features and functions of two or more modules or units described above may be embodied in a single module or unit; conversely, the features and functions of one module or unit described above may be further divided and embodied in multiple modules or units.
Furthermore, although the steps of the methods of the disclosure are depicted in the drawings in a particular order, this does not require or imply that the steps must be performed in that particular order, or that all of the illustrated steps must be performed, to achieve the desired result. Additionally or alternatively, certain steps may be omitted, multiple steps may be merged into one step, and/or one step may be decomposed into multiple steps, and so on.
Through the above description of the embodiments, those skilled in the art will readily understand that the exemplary embodiments described here may be implemented by software, or by software in combination with necessary hardware. Therefore, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (which may be a CD-ROM, USB flash drive, removable hard disk, etc.) or on a network, and which includes instructions that cause a computing device (which may be a personal computer, server, mobile terminal, network device, etc.) to perform the method according to the embodiments of the present disclosure.
In an exemplary embodiment of the present disclosure, an electronic device capable of implementing the above method is also provided.
Those of ordinary skill in the art will understand that various aspects of the present invention may be implemented as a system, a method or a program product. Therefore, various aspects of the present invention may be embodied in the following forms: a complete hardware embodiment, a complete software embodiment (including firmware, microcode, etc.), or an embodiment combining hardware and software aspects, which may be collectively referred to here as a "circuit", "module" or "system".
The electronic device 400 according to this embodiment of the present invention is described below with reference to Fig. 4. The electronic device 400 shown in Fig. 4 is only an example and should not impose any limitation on the functions and scope of use of the embodiments of the present invention.
As shown in Fig. 4, the electronic device 400 takes the form of a general-purpose computing device. Components of the electronic device 400 may include, but are not limited to: at least one processing unit 410, at least one storage unit 420, and a bus 430 connecting different system components (including the storage unit 420 and the processing unit 410).
The storage unit stores program code that can be executed by the processing unit 410, so that the processing unit 410 performs the steps of the various exemplary embodiments of the present invention described in the "Exemplary Methods" section of this specification. For example, the processing unit 410 may perform step S110 as shown in Fig. 1: in response to obtaining a video, decomposing the video into frames; step S120: grouping the decomposed frames according to a predetermined rule; step S130: concatenating each group of frames into a video frame sequence; step S140: inputting the video frame sequence into a machine learning model, which outputs the label of the video frame sequence; and step S150: labeling the video based on the labels of the video frame sequences. The machine learning model is trained as follows: each video frame sequence sample in a sample set is input into the machine learning model, the samples having been produced from videos with various known labels by grouping their frames according to the predetermined rule and concatenating each group of frames; the label that the machine learning model outputs for a video frame sequence sample is compared with the known label of the video it came from, and if the two are inconsistent, coefficients in the machine learning model are adjusted so that the label output by the machine learning model becomes consistent with the known label of the video.
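The training procedure just described — compare the model's output label with the video's known label and adjust coefficients whenever they disagree — can be sketched with a deliberately simple stand-in model. The patent leaves the machine learning model unspecified; the multi-class perceptron, the feature vectors and the labels below are illustrative assumptions only.

```python
# A toy stand-in for the machine learning model: a multi-class
# perceptron over fixed-length feature vectors.
def predict(weights, features):
    """Return the label whose weight vector scores the features highest."""
    scores = {label: sum(w * f for w, f in zip(ws, features))
              for label, ws in weights.items()}
    return max(scores, key=scores.get)

def train(samples, labels, dim, epochs=10):
    """Adjust coefficients whenever the output label is inconsistent
    with the known label, as in the training procedure above."""
    weights = {label: [0.0] * dim for label in labels}
    for _ in range(epochs):
        for features, known_label in samples:
            out = predict(weights, features)
            if out != known_label:                 # labels inconsistent:
                for j, f in enumerate(features):   # adjust coefficients
                    weights[known_label][j] += f   # pull toward known label
                    weights[out][j] -= f           # push away from wrong label
    return weights

# Each sample: (feature vector of one video frame sequence, known label).
samples = [([1.0, 0.0], "cat"), ([0.9, 0.1], "cat"),
           ([0.0, 1.0], "dog"), ([0.1, 0.9], "dog")]
weights = train(samples, ["cat", "dog"], dim=2)
assert all(predict(weights, x) == y for x, y in samples)
```

The update loop mirrors the patent's description: the model's output is compared with the known label, and coefficients are changed only on disagreement, repeating until the outputs are consistent with the known labels.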
The storage unit 420 may include a readable medium in the form of a volatile memory unit, such as a random access memory (RAM) unit 4201 and/or a cache memory unit 4202, and may further include a read-only memory (ROM) unit 4203.
The storage unit 420 may also include a program/utility 4204 having a set of (at least one) program modules 4205, such program modules 4205 including, but not limited to: an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination of them, may include an implementation of a network environment.
The bus 430 may represent one or more of several types of bus structures, including a storage unit bus or storage unit controller, a peripheral bus, an accelerated graphics port, and a processing unit or local bus using any of a variety of bus architectures.
The electronic device 400 may also communicate with one or more external devices 600 (such as a keyboard, pointing device, Bluetooth device, etc.), with one or more devices that enable a user to interact with the electronic device 400, and/or with any device (such as a router, modem, etc.) that enables the electronic device 400 to communicate with one or more other computing devices. Such communication may take place through an input/output (I/O) interface 450. Furthermore, the electronic device 400 may communicate with one or more networks (such as a local area network (LAN), a wide area network (WAN) and/or a public network such as the Internet) through a network adapter 460. As shown, the network adapter 460 communicates with the other modules of the electronic device 400 through the bus 430. It should be understood that, although not shown in the drawings, other hardware and/or software modules may be used in conjunction with the electronic device 400, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, data backup storage systems, and so on.
Through the above description of the embodiments, those skilled in the art will readily understand that the exemplary embodiments described here may be implemented by software, or by software in combination with necessary hardware. Therefore, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (which may be a CD-ROM, USB flash drive, removable hard disk, etc.) or on a network, and which includes instructions that cause a computing device (which may be a personal computer, server, terminal device or network device, etc.) to perform the method according to the embodiments of the present disclosure.
In an exemplary embodiment of the present disclosure, a computer-readable storage medium is also provided, on which is stored a program product capable of implementing the method described above in this specification. In some possible embodiments, various aspects of the present invention may also be implemented in the form of a program product comprising program code; when the program product runs on a terminal device, the program code causes the terminal device to perform the steps of the various exemplary embodiments of the present invention described in the "Exemplary Methods" section of this specification.
Referring to Fig. 5, a program product 500 for implementing the above method according to an embodiment of the present invention is described. It may take the form of a portable compact disc read-only memory (CD-ROM) containing program code, and may run on a terminal device such as a personal computer. However, the program product of the present invention is not limited to this; in this document, a readable storage medium may be any tangible medium that contains or stores a program that can be used by, or in connection with, an instruction execution system, apparatus or device.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared or semiconductor system, apparatus or device, or any combination of the above. More specific examples (a non-exhaustive list) of readable storage media include: an electrical connection with one or more wires, a portable disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A readable signal medium may also be any readable medium other than a readable storage medium that can send, propagate or transmit a program for use by, or in connection with, an instruction execution system, apparatus or device.
The program code contained on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wired, optical cable, RF, etc., or any suitable combination of the above.
Program code for carrying out the operations of the present invention may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on a remote computing device or server. In cases involving a remote computing device, the remote computing device may be connected to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (for example, through the Internet using an Internet service provider).
In addition, the above drawings are merely schematic illustrations of the processing included in the methods according to exemplary embodiments of the present invention, and are not intended to be limiting. It is easy to understand that the processing shown in the drawings does not indicate or limit the chronological order of these processes. It is also easy to understand that these processes may be performed, for example, synchronously or asynchronously in multiple modules.
Those skilled in the art, after considering the specification and practicing the invention disclosed here, will readily think of other embodiments of the present disclosure. This application is intended to cover any variations, uses or adaptations of the present disclosure that follow its general principles and include common knowledge or conventional techniques in the art not disclosed in the present disclosure. The specification and examples are to be regarded as illustrative only, with the true scope and spirit of the disclosure being indicated by the claims.
Claims (10)
1. A method for automatic video labeling, characterized by comprising:
in response to obtaining a video, decomposing the video into frames;
grouping the decomposed frames according to a predetermined rule;
concatenating each group of frames into a video frame sequence;
inputting the video frame sequence into a machine learning model, which outputs the label of the video frame sequence; and
labeling the video based on the labels of the video frame sequences,
wherein the machine learning model is trained as follows: each video frame sequence sample in a sample set is input into the machine learning model, the video frame sequence samples having been produced from videos with various known labels by grouping their frames according to the predetermined rule and concatenating each group of frames; the label that the machine learning model outputs for a video frame sequence sample is compared with the known label of the video it came from, and if the two are inconsistent, coefficients in the machine learning model are adjusted so that the label output by the machine learning model becomes consistent with the known label of the video.
2. The method according to claim 1, characterized in that the predetermined rule comprises: taking a continuous predetermined number of frames as one group.
3. The method according to claim 1, characterized in that the predetermined rule comprises: dividing the frames into which the video is decomposed into N groups, where N is a positive integer and the number of frames in each group is also N, and forming the i-th group from the frames whose frame number is aN+i, where a is a non-negative integer, i is a positive integer, 0≤a≤N-1, and 1≤i≤N.
4. The method according to claim 1, characterized in that concatenating each group of frames into a video frame sequence comprises:
concatenating each group of frames into a video frame sequence according to the order of the frame numbers of the frames.
5. The method according to claim 1, characterized in that labeling the video based on the labels of the video frame sequences comprises:
taking the label that occurs the most times among the labels of the obtained video frame sequences as the label given to the video.
6. The method according to claim 5, characterized in that taking the label that occurs the most times among the labels of the obtained video frame sequences as the label given to the video comprises:
if more than one label occurs the most times, increasing the number of groups into which the decomposed frames are divided.
7. The method according to claim 6, characterized in that increasing the number of groups into which the decomposed frames are divided comprises: if the predetermined rule comprises taking a continuous predetermined number of frames as one group, increasing the number of groups so that the frames contained in at least some of the groups partially overlap.
8. A device for automatic video labeling, characterized by comprising:
a decomposing module, configured to decompose the video into frames in response to obtaining the video;
a grouping module, configured to group the decomposed frames according to a predetermined rule;
a concatenation module, configured to concatenate each group of frames into a video frame sequence;
a first labeling module, configured to input the video frame sequence into a machine learning model, which outputs the label of the video frame sequence; and
a second labeling module, configured to label the video based on the labels of the video frame sequences,
wherein the machine learning model is trained as follows: each video frame sequence sample in a sample set is input into the machine learning model, the video frame sequence samples having been produced from videos with various known labels by grouping their frames according to the predetermined rule and concatenating each group of frames; the label that the machine learning model outputs for a video frame sequence sample is compared with the known label of the video it came from, and if the two are inconsistent, coefficients in the machine learning model are adjusted so that the label output by the machine learning model becomes consistent with the known label of the video.
9. A computer-readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the method according to any one of claims 1-7.
10. An electronic device, characterized by comprising:
a processor; and
a memory for storing instructions executable by the processor;
wherein the processor is configured to perform the method according to any one of claims 1-7 by executing the executable instructions.
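The grouping rule of claim 3 — the frame numbered aN+i goes into group i — can be sketched as follows. The 12-frame video and N = 3 are arbitrary example values chosen for illustration.

```python
def interleaved_groups(frame_numbers, n):
    """Claim-3 style grouping: the frame numbered a*n + i goes into
    group i (1 <= i <= n), so each group samples the whole video."""
    groups = {i: [] for i in range(1, n + 1)}
    for num in frame_numbers:
        i = num % n
        if i == 0:          # frame numbers n, 2n, ... belong to group n
            i = n
        groups[i].append(num)
    return groups

g = interleaved_groups(range(1, 13), 3)  # frames 1..12, N = 3
# group 1: frames 1, 4, 7, 10; group 2: 2, 5, 8, 11; group 3: 3, 6, 9, 12
```

Because consecutive frames land in different groups, every group spans the whole video rather than a single contiguous segment, unlike the consecutive-frame rule of claim 2.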
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811542198.0A CN109635158A (en) | 2018-12-17 | 2018-12-17 | For the method and device of video automatic labeling, medium and electronic equipment |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811542198.0A CN109635158A (en) | 2018-12-17 | 2018-12-17 | For the method and device of video automatic labeling, medium and electronic equipment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109635158A true CN109635158A (en) | 2019-04-16 |
Family
ID=66074656
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811542198.0A Pending CN109635158A (en) | 2018-12-17 | 2018-12-17 | For the method and device of video automatic labeling, medium and electronic equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109635158A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110163115A (en) * | 2019-04-26 | 2019-08-23 | 腾讯科技(深圳)有限公司 | A kind of method for processing video frequency, device and computer readable storage medium |
CN111144508A (en) * | 2019-12-30 | 2020-05-12 | 中国矿业大学(北京) | Automatic control system and control method for coal mine auxiliary shaft rail transportation |
CN111491206A (en) * | 2020-04-17 | 2020-08-04 | 维沃移动通信有限公司 | Video processing method, video processing device and electronic equipment |
CN111797801A (en) * | 2020-07-14 | 2020-10-20 | 北京百度网讯科技有限公司 | Method and apparatus for video scene analysis |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP2869236A1 (en) * | 2013-10-31 | 2015-05-06 | Alcatel Lucent | Process for generating a video tag cloud representing objects appearing in a video content |
CN106874827A (en) * | 2015-12-14 | 2017-06-20 | 北京奇虎科技有限公司 | Video frequency identifying method and device |
CN106878632A (en) * | 2017-02-28 | 2017-06-20 | 北京知慧教育科技有限公司 | A kind for the treatment of method and apparatus of video data |
CN107277650A (en) * | 2017-07-25 | 2017-10-20 | 中国华戎科技集团有限公司 | video file cutting method and device |
CN108694217A (en) * | 2017-04-12 | 2018-10-23 | 合信息技术(北京)有限公司 | The label of video determines method and device |
-
2018
- 2018-12-17 CN CN201811542198.0A patent/CN109635158A/en active Pending
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP2869236A1 (en) * | 2013-10-31 | 2015-05-06 | Alcatel Lucent | Process for generating a video tag cloud representing objects appearing in a video content |
CN106874827A (en) * | 2015-12-14 | 2017-06-20 | 北京奇虎科技有限公司 | Video frequency identifying method and device |
CN106878632A (en) * | 2017-02-28 | 2017-06-20 | 北京知慧教育科技有限公司 | A kind for the treatment of method and apparatus of video data |
CN108694217A (en) * | 2017-04-12 | 2018-10-23 | 合信息技术(北京)有限公司 | The label of video determines method and device |
CN107277650A (en) * | 2017-07-25 | 2017-10-20 | 中国华戎科技集团有限公司 | video file cutting method and device |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110163115A (en) * | 2019-04-26 | 2019-08-23 | 腾讯科技(深圳)有限公司 | A kind of method for processing video frequency, device and computer readable storage medium |
CN110163115B (en) * | 2019-04-26 | 2023-10-13 | 腾讯科技(深圳)有限公司 | Video processing method, device and computer readable storage medium |
CN111144508A (en) * | 2019-12-30 | 2020-05-12 | 中国矿业大学(北京) | Automatic control system and control method for coal mine auxiliary shaft rail transportation |
CN111491206A (en) * | 2020-04-17 | 2020-08-04 | 维沃移动通信有限公司 | Video processing method, video processing device and electronic equipment |
CN111797801A (en) * | 2020-07-14 | 2020-10-20 | 北京百度网讯科技有限公司 | Method and apparatus for video scene analysis |
CN111797801B (en) * | 2020-07-14 | 2023-07-21 | 北京百度网讯科技有限公司 | Method and apparatus for video scene analysis |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109635158A (en) | For the method and device of video automatic labeling, medium and electronic equipment | |
US10970334B2 (en) | Navigating video scenes using cognitive insights | |
CN105654950B (en) | Adaptive voice feedback method and device | |
CN107430858B (en) | Communicating metadata identifying a current speaker | |
CN110674350B (en) | Video character retrieval method, medium, device and computing equipment | |
CN104735468B (en) | A kind of method and system that image is synthesized to new video based on semantic analysis | |
CN109614517B (en) | Video classification method, device, equipment and storage medium | |
CN109688463A (en) | A kind of editing video generation method, device, terminal device and storage medium | |
US8064641B2 (en) | System and method for identifying objects in video | |
CN109660865A (en) | Make method and device, medium and the electronic equipment of video tab automatically for video | |
CN109919244B (en) | Method and apparatus for generating a scene recognition model | |
US20200371741A1 (en) | Electronic apparatus, document displaying method thereof and non-transitory computer readable recording medium | |
CN110321958A (en) | Training method, the video similarity of neural network model determine method | |
CN108319723A (en) | A kind of picture sharing method and device, terminal, storage medium | |
CN103052953A (en) | Information processing device, method of processing information, and program | |
CN111309200B (en) | Method, device, equipment and storage medium for determining extended reading content | |
US20170115853A1 (en) | Determining Image Captions | |
CN110990598B (en) | Resource retrieval method and device, electronic equipment and computer-readable storage medium | |
US10162879B2 (en) | Label filters for large scale multi-label classification | |
US20200081981A1 (en) | System and method for a scene builder | |
CN113810742A (en) | Virtual gift processing method and device, electronic equipment and storage medium | |
CN111989930A (en) | Display device and control method of display device | |
US20240022620A1 (en) | System and method of communications using parallel data paths | |
CN117036827A (en) | Multi-mode classification model training, video classification method, device, medium and equipment | |
CN114722234B (en) | Music recommendation method, device and storage medium based on artificial intelligence |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20190416 |