CN108235114A - Content analysis method and system for a video stream, electronic device, and storage medium - Google Patents
- Publication number: CN108235114A (application CN201711066691.5A)
- Authority: CN (China)
- Prior art keywords: video, GPU, information, sub, video stream
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/44—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
- H04N21/44008—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/172—Classification, e.g. identification
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Oral & Maxillofacial Surgery (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
Abstract
Embodiments of the present disclosure provide a content analysis method and system for a video stream, an electronic device, and a storage medium. The system comprises a central processing unit (CPU) and multiple graphics processing units (GPUs). The CPU obtains a video stream, divides it into multiple sub-video streams corresponding to the multiple GPUs, and distributes the sub-video streams to the corresponding GPUs. Each GPU performs content analysis on the video images in its corresponding sub-video stream, obtains the content analysis result of that sub-video stream, and reports the result to the CPU. The CPU then obtains the content analysis result of the video stream from the results reported by the multiple GPUs. The embodiments thereby analyze video content automatically and, by processing the video stream in parallel on multiple GPUs, improve processing speed and efficiency.
Description
Technical field
The present disclosure relates to computer vision technology, and in particular to a content analysis method and system for a video stream, an electronic device, and a storage medium.
Background
With the development and spread of video services, demand for understanding and annotating video content has grown. In the prior art, however, this work is still mostly done manually, which is inefficient and consumes substantial human resources. Film and television videos in particular, such as movies and TV series, contain a wide variety of scenes and items, which makes manual annotation even more challenging.
Summary
Embodiments of the present disclosure provide a content analysis method and system for a video stream, and an electronic device.
According to one aspect of the embodiments of the present disclosure, a content analysis system for a video stream is provided, comprising a central processing unit (CPU) and multiple graphics processing units (GPUs), wherein:
The CPU is configured to obtain a video stream, divide the video stream into multiple sub-video streams corresponding to the multiple GPUs, and distribute the multiple sub-video streams to the corresponding GPUs, wherein each sub-video stream comprises at least one frame of consecutive video images in the video stream, and different GPUs correspond to different sub-video streams;
Each GPU is configured to perform content analysis on the video images in its corresponding sub-video stream, obtain the content analysis result of that sub-video stream, and report the result to the CPU;
The CPU is further configured to obtain the content analysis result of the video stream from the content analysis results reported by the multiple GPUs.
In another embodiment based on the above system of the present disclosure, the CPU dividing the video stream into multiple sub-video streams corresponding to the multiple GPUs comprises:
The CPU divides the video stream, according to the number of the multiple GPUs, into multiple sub-video streams corresponding to the multiple GPUs.
In another embodiment based on the above system of the present disclosure, the content analysis result comprises at least one of the following: person information, clothing information, item information, and scene information.
In another embodiment based on the above system of the present disclosure, the clothing information comprises at least one of the following: category information, color information, texture information, neckline information, cuff information, and image coordinate information of the clothing; and/or
The face information comprises at least one of the following: name information, face attribute information, and image coordinate information of the face.
In another embodiment based on the above system of the present disclosure, the GPU comprises a face recognition module configured to:
Obtain the person information database of the video stream, which contains the face images and name information of the persons in the video stream;
Determine, using the person information database of the video stream, the face information of at least one frame of video image in the sub-video stream corresponding to the GPU.
In another embodiment based on the above system of the present disclosure, the face recognition module is specifically configured to:
Perform face detection on each frame of the at least one frame of video image of the corresponding sub-video stream to obtain at least one face image in each frame, wherein face images that correspond to the same person and appear in at least one consecutive frame of the sub-video stream share the same face tracking identifier;
Determine a target face image from the at least one face image in the sub-video stream that corresponds to the same face tracking identifier;
Compare the target face image with the face images in the person information database of the video stream to determine the person corresponding to the face tracking identifier of the target face image.
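The target-selection step above can be sketched as follows. This is a minimal illustration, not the patented implementation: the text does not say how the target face image is chosen, so the sketch assumes a hypothetical per-detection confidence `score` and simply keeps the highest-scoring detection for each face tracking identifier. `FaceDetection` and `select_target_faces` are illustrative names.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class FaceDetection:
    frame_index: int  # frame in which the face was detected
    track_id: int     # face tracking identifier shared across consecutive frames
    score: float      # hypothetical detector confidence (an assumption)


def select_target_faces(detections):
    """Return one representative detection per face tracking identifier.

    Assumption: the highest-scoring detection is the best one to compare
    against the person information database.
    """
    by_track = defaultdict(list)
    for det in detections:
        by_track[det.track_id].append(det)
    return {tid: max(dets, key=lambda d: d.score) for tid, dets in by_track.items()}


detections = [
    FaceDetection(0, track_id=1, score=0.71),
    FaceDetection(1, track_id=1, score=0.93),
    FaceDetection(1, track_id=2, score=0.64),
]
targets = select_target_faces(detections)
```

Only the selected target per identifier then needs to be compared with the database, rather than every detected face in every frame.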
In another embodiment based on the above system of the present disclosure, the GPU further comprises a clothing recognition module configured to:
Perform clothing detection on each frame of the at least one frame of video image of the sub-video stream, based on the face images detected in that frame by the face recognition module, to obtain at least one clothing image in each frame, wherein clothing images in the sub-video stream that correspond to the same face tracking identifier and to the same item of clothing share the same clothing tracking identifier;
Determine, from the at least one clothing image in the sub-video stream that has the same clothing tracking identifier, the clothing corresponding to that clothing tracking identifier.
In another embodiment based on the above system of the present disclosure, the clothing recognition module is further configured to establish an association between the clothing tracking identifier and the face tracking identifier, and the content analysis result comprises the person information corresponding to the face tracking identifier and the clothing information corresponding to the clothing tracking identifier associated with that face tracking identifier.
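The association between clothing tracking identifiers and face tracking identifiers can be pictured as a simple join. The sketch below is an assumption-laden illustration: the dictionary layout and the name `build_person_records` are hypothetical and are not taken from the patent.

```python
def build_person_records(face_names, clothing_info, clothing_to_face):
    """Join per-track results into one record per face tracking identifier.

    face_names:       face tracking id -> recognized name
    clothing_info:    clothing tracking id -> clothing attributes
    clothing_to_face: clothing tracking id -> associated face tracking id
    """
    records = {fid: {"name": name, "clothing": []} for fid, name in face_names.items()}
    for cid, fid in clothing_to_face.items():
        # Skip associations that point at unknown tracks.
        if fid in records and cid in clothing_info:
            records[fid]["clothing"].append(clothing_info[cid])
    return records


records = build_person_records(
    face_names={1: "Person A", 2: "Person B"},
    clothing_info={10: {"category": "shirt", "color": "white"}},
    clothing_to_face={10: 1},
)
```

Each record then carries both the person information and the clothing information linked to it, which matches the combined result described above.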
In another embodiment based on the above system of the present disclosure, the CPU is further configured to render the content analysis result of the video stream in the video stream.
In another embodiment based on the above system of the present disclosure, the CPU obtaining the video stream comprises:
The CPU obtains a video stream uploaded by a user over the Internet.
In another embodiment based on the above system of the present disclosure, the CPU is further configured to authenticate the user, in response to the user's video upload request, before obtaining the video stream uploaded by the user over the Internet.
According to one aspect of the embodiments of the present disclosure, an electronic device is provided, comprising:
A communication unit configured to send a video stream to a content analysis system in response to a received user request, and to receive the content analysis result of the video stream sent by the content analysis system;
A storage unit configured to store the content analysis result of the video stream.
In another embodiment based on the above electronic device of the present disclosure, the device further comprises:
A rendering unit configured to render the content analysis result of the video stream in the video stream and display the rendering result.
According to one aspect of the embodiments of the present disclosure, a content analysis method for a video stream is provided, applied to a content analysis system comprising a central processing unit (CPU) and multiple graphics processing units (GPUs), the method comprising:
The CPU obtains a video stream, divides the video stream into multiple sub-video streams corresponding to the multiple GPUs, and distributes the multiple sub-video streams to the corresponding GPUs, wherein each sub-video stream comprises at least one frame of consecutive video images in the video stream, and different GPUs correspond to different sub-video streams;
Each GPU performs content analysis on the video images in its corresponding sub-video stream, obtains the content analysis result of that sub-video stream, and reports the result to the CPU;
The CPU obtains the content analysis result of the video stream from the content analysis results reported by the multiple GPUs.
In another embodiment based on the above method of the present disclosure, dividing the video stream into multiple sub-video streams corresponding to the multiple GPUs comprises:
Dividing the video stream, according to the number of the multiple GPUs, into multiple sub-video streams corresponding to the multiple GPUs.
In another embodiment based on the above method of the present disclosure, the content analysis result comprises at least one of the following: person information, clothing information, item information, and scene information.
In another embodiment based on the above method of the present disclosure, the clothing information comprises at least one of the following: category information, color information, texture information, neckline information, cuff information, and image coordinate information of the clothing; and/or
The face information comprises at least one of the following: name information, face attribute information, and image coordinate information of the face.
In another embodiment based on the above method of the present disclosure, the GPU performing content analysis on the video images in its corresponding sub-video stream comprises:
Obtaining the person information database of the video stream, which contains the face images and name information of the persons in the video stream;
Determining, using the person information database of the video stream, the face information of at least one frame of video image in the sub-video stream corresponding to the GPU.
In another embodiment based on the above method of the present disclosure, determining, using the person information database of the video stream, the face information of at least one frame of video image in the sub-video stream corresponding to the GPU comprises:
Performing face detection on each frame of the at least one frame of video image of the corresponding sub-video stream to obtain at least one face image in each frame, wherein face images that correspond to the same person and appear in at least one consecutive frame of the sub-video stream share the same face tracking identifier;
Determining a target face image from the at least one face image in the sub-video stream that corresponds to the same face tracking identifier;
Comparing the target face image with the face images in the person information database of the video stream to determine the person corresponding to the face tracking identifier of the target face image.
In another embodiment based on the above method of the present disclosure, the GPU performing content analysis on the video images in its corresponding sub-video stream further comprises:
Performing clothing detection on each frame of the at least one frame of video image of the sub-video stream, based on the face images detected in that frame by the face recognition module, to obtain at least one clothing image in each frame, wherein clothing images in the sub-video stream that correspond to the same face tracking identifier and to the same item of clothing share the same clothing tracking identifier;
Determining, from the at least one clothing image in the sub-video stream that has the same clothing tracking identifier, the clothing corresponding to that clothing tracking identifier.
In another embodiment based on the above method of the present disclosure, the method further comprises:
Establishing an association between the clothing tracking identifier and the face tracking identifier, wherein the content analysis result comprises the person information corresponding to the face tracking identifier and the clothing information corresponding to the clothing tracking identifier associated with that face tracking identifier.
In another embodiment based on the above method of the present disclosure, the method further comprises:
The CPU renders the content analysis result of the video stream in the video stream.
In another embodiment based on the above method of the present disclosure, the CPU obtaining the video stream comprises:
The CPU obtains a video stream uploaded by a user over the Internet.
In another embodiment based on the above method of the present disclosure, before the CPU obtains the video stream uploaded by the user over the Internet, the method further comprises:
Authenticating the user in response to the user's video upload request.
According to one aspect of the embodiments of the present disclosure, an electronic device is provided, comprising: a memory configured to store executable instructions;
And a processor configured to communicate with the memory to execute the executable instructions and thereby complete the operations of the content analysis method for a video stream described above.
According to one aspect of the embodiments of the present disclosure, a computer storage medium is provided for storing computer-readable instructions which, when executed, perform the operations of the content analysis method for a video stream described above.
According to one aspect of the embodiments of the present disclosure, a computer program is provided, comprising computer-readable code, wherein, when the computer-readable code runs on a device, a processor in the device executes instructions for implementing each step of the content analysis method for a video stream described above.
With the content analysis method and system and the electronic device provided by the embodiments of the present disclosure, the CPU divides the video stream into multiple sub-video streams corresponding to multiple GPUs and distributes them to the corresponding GPUs, and the multiple GPUs perform content analysis on their respective sub-video streams in parallel, thereby improving processing speed and efficiency.
Description of the drawings
Fig. 1 is a structural diagram of the content analysis system for a video stream provided by an embodiment of the present disclosure.
Fig. 2 is a structural diagram of the electronic device provided by an embodiment of the present disclosure.
Fig. 3 is a schematic flowchart of the content analysis method for a video stream provided by an embodiment of the present disclosure.
Fig. 4 is a structural diagram of an electronic device suitable for implementing a terminal device or server of an embodiment of the present application.
Detailed description
Various exemplary embodiments of the present disclosure will now be described in detail with reference to the accompanying drawings. It should be noted that, unless otherwise specifically stated, the relative arrangement of components and steps, the numerical expressions, and the numerical values set forth in these embodiments do not limit the scope of the present disclosure.
It should also be understood that, for ease of description, the parts shown in the drawings are not drawn to actual scale.
The following description of at least one exemplary embodiment is merely illustrative and in no way limits the present disclosure or its application or use.
Techniques, methods, and devices known to those of ordinary skill in the relevant art may not be discussed in detail, but where appropriate they should be regarded as part of the specification.
It should be noted that similar reference numerals and letters denote similar items in the following drawings, so once an item is defined in one drawing, it need not be discussed further in subsequent drawings.
The embodiments of the present disclosure can be applied to a computer system/server, which can operate with numerous other general-purpose or special-purpose computing system environments or configurations. Examples of well-known computing systems, environments, and/or configurations suitable for use with the computer system/server include, but are not limited to: personal computer systems, server computer systems, thin clients, thick clients, handheld or laptop devices, microprocessor-based systems, set-top boxes, programmable consumer electronics, network PCs, minicomputer systems, mainframe computer systems, and distributed cloud computing environments that include any of the above systems.
The computer system/server may be described in the general context of computer-system-executable instructions, such as program modules, executed by a computer system. In general, program modules may include routines, programs, object programs, components, logic, data structures, and so on, which perform particular tasks or implement particular abstract data types. The computer system/server may be implemented in a distributed cloud computing environment, in which tasks are performed by remote processing devices linked through a communication network. In a distributed cloud computing environment, program modules may be located on local or remote computing system storage media that include storage devices.
Fig. 1 is an exemplary structural diagram of the content analysis system for a video stream provided by an embodiment of the present disclosure. As shown in Fig. 1, the system includes a central processing unit (CPU) 110 and multiple graphics processing units (GPUs) 120.
Specifically, the CPU 110 is configured to obtain a video stream. Optionally, the video stream may include multiple frames of video images. For example, the video stream may be a movie or a TV series; the present disclosure does not limit this.
The CPU 110 is further configured to divide the video stream into multiple sub-video streams corresponding to the multiple GPUs 120 and distribute them to the corresponding GPUs 120, wherein each sub-video stream includes at least one frame of consecutive video images in the video stream, and different GPUs 120 correspond to different sub-video streams.
Optionally, one GPU 120 may correspond to one or more sub-video streams, and the number of sub-video streams corresponding to different GPUs 120 may be the same or different. As an example, the number of sub-video streams may be an integer multiple of the number of GPUs 120, in which case different GPUs 120 may optionally correspond to the same number of sub-video streams. For example, the number of sub-video streams may equal the number of GPUs 120, with the sub-video streams in one-to-one correspondence with the GPUs 120, but the embodiments of the present disclosure are not limited to this.
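One way to picture the correspondence between sub-video streams and GPUs is a round-robin assignment, sketched below. Round-robin is an assumption made for illustration; the text only requires that different GPUs handle different sub-video streams. The function name `assign_substreams` is hypothetical.

```python
def assign_substreams(num_substreams, num_gpus):
    """Map each GPU index to the list of sub-stream indices it will process."""
    if num_gpus <= 0:
        raise ValueError("num_gpus must be positive")
    assignment = {gpu: [] for gpu in range(num_gpus)}
    for sub in range(num_substreams):
        # Round-robin: sub-stream i goes to GPU i mod num_gpus.
        assignment[sub % num_gpus].append(sub)
    return assignment


one_to_one = assign_substreams(3, 3)   # as many sub-streams as GPUs
two_each = assign_substreams(6, 3)     # an integer multiple of the GPU count
```

When the number of sub-video streams is an integer multiple of the GPU count, every GPU receives the same number of sub-video streams, as in the example described above.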
Each GPU 120 is configured to perform content analysis on the video images in its corresponding sub-video stream, obtain the content analysis result of that sub-video stream, and report the result to the CPU 110.
Each of the multiple GPUs 120 may perform content analysis on its own corresponding sub-video stream. Optionally, every GPU 120 may use the same procedure for content analysis; for ease of understanding, the following description takes the operation of one GPU 120 as an example.
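The idea that every GPU runs the same procedure on its own sub-video stream can be illustrated with a small CPU-only sketch. Worker threads stand in for GPUs here, and `analyze_substream` is a hypothetical placeholder for the real per-GPU analysis; nothing below touches an actual GPU.

```python
from concurrent.futures import ThreadPoolExecutor


def analyze_substream(frames):
    """Stand-in for one GPU's content analysis: one result per frame."""
    return [f"analyzed:{frame}" for frame in frames]


def analyze_in_parallel(substreams):
    # One worker per sub-stream; Executor.map returns results in input order.
    with ThreadPoolExecutor(max_workers=max(1, len(substreams))) as pool:
        return list(pool.map(analyze_substream, substreams))


results = analyze_in_parallel([["f0", "f1"], ["f2", "f3"]])
```

Because the same function is applied to every sub-video stream, describing one worker is enough to describe them all.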
Optionally, the content analysis result may include at least one of the following: person information, clothing information, item information, and scene information.
As an example, the clothing information may include at least one of the following: category information, color information, texture information, neckline information, cuff information, and image coordinate information of the clothing.
The category information of the clothing may indicate the clothing category, such as coat or shirt. The image coordinate information of the clothing may indicate the position of the clothing in the video image. Optionally, the clothing information may also include other information, which the embodiments of the present disclosure do not limit.
As an example, the face information may include at least one of the following: name information, face attribute information, and image coordinate information of the face.
The name information may be the name of a person in the video, such as a character name, or the person's real name or stage name. The person attribute information may include face similarity information, gender information, age information, and so on. The image coordinate information of the face may indicate the position of the face image in the video image. Optionally, the face information may also include other information, which the embodiments of the present disclosure do not limit.
As an example, the item information may include at least one of the following: item name information, item attribute information, and image coordinate information of the item.
The item attribute information may include information such as the material and brand of the item. Optionally, the item information may also include other information, which the embodiments of the present disclosure do not limit.
As an example, the scene information may include scene name information, such as beach or airport, which the embodiments of the present disclosure do not limit.
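The four kinds of information listed above lend themselves to a structured record. The sketch below shows one possible layout using Python dataclasses; the class names, field names, and the `(x, y, width, height)` box convention are all assumptions chosen for illustration, since the patent does not specify a data format.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Box = Tuple[int, int, int, int]  # assumed (x, y, width, height) in pixels


@dataclass
class PersonInfo:
    name: str
    box: Box
    gender: Optional[str] = None
    age: Optional[int] = None
    similarity: Optional[float] = None  # face similarity to the database entry


@dataclass
class ClothingInfo:
    category: str
    box: Box
    color: Optional[str] = None
    texture: Optional[str] = None
    neckline: Optional[str] = None
    cuff: Optional[str] = None


@dataclass
class ItemInfo:
    name: str
    box: Box
    material: Optional[str] = None
    brand: Optional[str] = None


@dataclass
class FrameResult:
    frame_index: int
    scene: Optional[str] = None
    persons: List[PersonInfo] = field(default_factory=list)
    clothing: List[ClothingInfo] = field(default_factory=list)
    items: List[ItemInfo] = field(default_factory=list)


result = FrameResult(frame_index=0, scene="airport")
result.persons.append(PersonInfo(name="Person A", box=(10, 20, 64, 64)))
```

Every field other than the name or category and the box is optional, reflecting the "at least one of the following" wording in the text.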
Optionally, the GPU 120 may perform content analysis on the video images in the sub-video stream using a convolutional neural network. This automates the understanding of the video image content in the sub-video stream, overcomes the drawbacks of manual content annotation, and improves the efficiency of content understanding.
In addition, persons, clothing, items, and scenes may each be recognized with a separately trained convolutional neural network. Through this automated analysis of the video content, structured information such as faces, clothing, items, and scenes is extracted from the video and combined systematically, thereby achieving intelligent analysis and structured output of the video content.
The CPU 110 is further configured to obtain the content analysis result of the video stream from the content analysis results reported by the multiple GPUs 120.
In the content analysis system provided by the above embodiment of the present disclosure, the CPU divides the video stream into multiple sub-video streams, and the sub-video streams are then analyzed in parallel on the respective GPUs, which effectively increases processing speed.
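How the CPU might combine the reported results can be sketched as a merge keyed by each sub-video stream's starting frame. This is a hypothetical illustration: the patent does not describe the merge procedure, and the `(start_frame, results)` report format and the name `merge_results` are assumptions.

```python
def merge_results(substream_results):
    """Combine per-sub-stream results into one list ordered by frame index.

    substream_results: iterable of (start_frame, [per-frame result, ...]),
    in whatever order the GPUs happened to report them.
    """
    merged = []
    for start, frames in sorted(substream_results, key=lambda item: item[0]):
        for offset, frame_result in enumerate(frames):
            merged.append((start + offset, frame_result))
    return merged


# The GPU handling frames 4-5 reports before the one handling frames 0-3.
reports = [(4, ["e", "f"]), (0, ["a", "b", "c", "d"])]
merged = merge_results(reports)
```

Sorting by starting frame makes the merged result independent of the order in which the GPUs finish.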
Optionally, in a specific example, the CPU 110 may divide the video stream equally, according to the number of GPUs 120, into multiple sub-video streams corresponding to the GPUs 120.
Specifically, in a multi-GPU content analysis system, the CPU 110 may divide the video dynamically according to the number of currently available GPUs. The CPU 110 may divide the video stream into multiple sub-video streams; for example, it may divide the video stream equally into multiple sub-video streams in one-to-one correspondence with the GPUs 120, but the embodiments of the present disclosure are not limited to this. Optionally, the CPU 110 may also divide the video stream in other ways, which the embodiments of the present disclosure do not limit.
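Equal division by GPU count can be sketched as splitting the frame indices into contiguous ranges. The function below is a minimal illustration under stated assumptions: the name `split_frames` is hypothetical, the video is addressed by frame index, and leftover frames are spread over the first ranges so that sizes differ by at most one. It also assumes there are at least as many frames as GPUs, so that no range is empty.

```python
def split_frames(total_frames, num_gpus):
    """Split frame indices [0, total_frames) into num_gpus contiguous ranges.

    Returns (start, end) pairs with end exclusive. Any remainder is spread
    over the first ranges, so range sizes differ by at most one frame.
    """
    if num_gpus <= 0:
        raise ValueError("num_gpus must be positive")
    base, extra = divmod(total_frames, num_gpus)
    ranges, start = [], 0
    for gpu in range(num_gpus):
        size = base + (1 if gpu < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


even = split_frames(9, 3)     # divides exactly
uneven = split_frames(10, 3)  # one leftover frame goes to the first range
```

Passing the number of currently available GPUs as `num_gpus` gives the dynamic division described above; each range keeps its frames consecutive, as the sub-video stream definition requires.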
Optionally, the CPU 110 may distribute each sub-video stream to its corresponding GPU. The GPU then performs frame extraction and decoding on the received sub-video stream to obtain the at least one frame of video image it contains, and performs content analysis on each of those frames, for example at least one of face recognition, clothing recognition, item recognition, and scene recognition. Alternatively, the CPU 110 may first perform frame extraction and decoding on each sub-video stream to obtain the at least one frame of video image it contains, and transmit those frames to the corresponding GPU 120; in that case the GPU 120 can perform content analysis directly on each received frame. The embodiments of the present disclosure do not limit this.
In an optional example of the present disclosure, the GPU 120 may include a face recognition module, configured to obtain the person information database of the video stream and to use it to determine the face information of at least one frame of video image in the sub-video stream.
Specifically, the person information database of the video stream may include the face images and name information of the persons in the video stream, and may further include other information. Optionally, if the video stream is a film or television drama, or another video clip featuring known persons, detection can be based on the person information database corresponding to that video stream, which improves the efficiency of content analysis.
Optionally, a convolutional neural network may be used to perform face detection on each frame of video image in the sub-video stream. Face detection yields the face position information in the video image; based on that position information, the face image is cropped out of the video image and compared with the face images in the person information database to obtain the required face information, thereby achieving face recognition on video images. Optionally, face recognition may also be implemented by means other than convolutional neural networks, which the embodiments of the present disclosure do not limit.
As an example, the face recognition module may perform face detection on each frame of the at least one frame of video image of the corresponding sub-video stream to obtain at least one face image in each frame, where face images that correspond to the same person and appear in at least one frame of consecutive video images in the sub-video stream share the same face tracking identifier. Further, the face recognition module determines a target face image from the at least one face image in the sub-video stream that corresponds to the same face tracking identifier, compares the target face image with the face images in the person information database of the video stream, and thereby determines the face information corresponding to that face tracking identifier.
The face recognition module may assign a face tracking identifier to each face obtained by face detection and track the face based on that identifier. Different faces correspond to different face tracking identifiers. The same face appearing in consecutive frames of video images may correspond to the same face tracking identifier. Optionally, if the same face appears in non-consecutive frames, it may correspond to different face tracking identifiers, but the embodiments of the present disclosure do not limit this.
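One possible way to assign face tracking identifiers is sketched below. The disclosure does not state how faces in consecutive frames are matched; this sketch assumes matching by bounding-box overlap (intersection over union, IoU) against the immediately preceding frame, which is a swapped-in technique for illustration only. The names `iou`, `assign_track_ids`, and `min_iou` are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def assign_track_ids(frames, min_iou=0.5):
    """frames: one list of face boxes per frame. Returns one list of IDs per frame."""
    next_id = 0
    previous = []  # (track_id, box) pairs from the immediately preceding frame
    result = []
    for boxes in frames:
        current = []
        used = set()
        for box in boxes:
            best_id, best_score = None, min_iou
            for track_id, prev_box in previous:
                score = iou(box, prev_box)
                if track_id not in used and score >= best_score:
                    best_id, best_score = track_id, score
            if best_id is None:
                # No sufficiently overlapping face in the previous frame: new ID.
                best_id = next_id
                next_id += 1
            used.add(best_id)
            current.append((best_id, box))
        result.append([track_id for track_id, _ in current])
        previous = current
    return result
```

Because only the preceding frame is consulted, a face that disappears and later reappears receives a new identifier, which matches the optional behaviour for non-consecutive frames described above.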
Specifically, for a given face tracking identifier, feature values may be extracted from the face image in each frame of the at least one frame of consecutive video images in the sub-video stream that corresponds to that identifier, and the confidence of the face image in each frame may be determined from the extracted feature values. The target face image corresponding to the face tracking identifier can then be determined from the confidence of the face image in each of those frames. For example, the face image with the highest confidence among the at least one face image corresponding to the face tracking identifier may be taken as the target face image for that identifier, but the embodiments of the present disclosure do not limit this. The target face image can be regarded as the face image of the person that has the best quality in the sub-video stream, or that is best suited for face comparison. Comparing this target face image with the face images in the person information database of the video stream avoids, as far as possible, misjudgments caused by poor image quality and/or face angle, and effectively improves the accuracy and efficiency of face recognition.
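The highest-confidence selection given as an example above can be sketched as follows. This is a minimal sketch that assumes each detected face already carries a tracking identifier and a confidence score; the `FaceDetection` record and `select_target_faces` function are hypothetical names, and how the confidence is derived from the extracted feature values is left outside the sketch.

```python
from dataclasses import dataclass


@dataclass
class FaceDetection:
    track_id: int      # face tracking identifier
    frame_index: int   # frame in which the face image was detected
    confidence: float  # assumed to be computed from the extracted feature values


def select_target_faces(detections):
    """Return, per face tracking identifier, the detection with the highest confidence."""
    best = {}
    for detection in detections:
        current = best.get(detection.track_id)
        if current is None or detection.confidence > current.confidence:
            best[detection.track_id] = detection
    return best
```

Only the selected target face image of each identifier then needs to be compared against the person information database, rather than every face image in every frame.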
Optionally, the GPU 120 further includes a clothing recognition module, configured to perform clothing detection on the video images in the sub-video stream.
Specifically, the clothing recognition module may perform clothing detection on each frame of video image, according to the face image that the face recognition module detected in each frame of the at least one frame of video image of the sub-video stream, to obtain at least one clothing image in each frame.
Optionally, a clothing tracking identifier may be assigned to each item of clothing of each person in the sub-video stream, and the clothing may be tracked and recognized based on that identifier. As an example, clothing images in the sub-video stream that correspond to the same face tracking identifier and to the same item of clothing share the same clothing tracking identifier. Because viewers generally care most about what the persons in a video stream are wearing, the clothing recognition module may perform clothing detection based on the face detection results of the face recognition module. That is, clothing that appears on its own in a video image, such as clothes displayed in a shopping mall, is not recognized. Because the clothing recognition module detects only clothing matched to a face image, it obtains more useful clothing information, namely what each person is wearing, and reduces the detection and recognition load on the system.
After clothing tracking identifiers have been assigned, the clothing recognition module may determine the clothing information corresponding to a clothing tracking identifier from the at least one clothing image in the sub-video stream that has that identifier.
Specifically, for a given clothing tracking identifier, a vote may be held over the clothing images corresponding to that identifier in at least one frame of video image in the sub-video stream, and the clothing with the most votes is selected as the clothing corresponding to the identifier. For example, the clothing image corresponding to the clothing tracking identifier in each frame of video image may be determined, and a candidate clothing for the identifier may be determined from each such clothing image. All frames in the sub-video stream in which a clothing image corresponding to the identifier appears can then be analyzed statistically to count how many times each candidate clothing occurs, and the candidate that occurs most often in the sub-video stream is determined to be the clothing corresponding to the clothing tracking identifier. Optionally, clothing recognition may also be performed in other ways, which the embodiments of the present disclosure do not limit.
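The voting step above amounts to a majority count per clothing tracking identifier. The following is a minimal sketch, assuming that the per-frame candidate clothing has already been reduced to a label string; the function name `vote_clothing` and the label values in the usage are hypothetical.

```python
from collections import Counter


def vote_clothing(candidates_by_track):
    """Pick the most frequent candidate label for each clothing tracking identifier.

    candidates_by_track maps a clothing tracking identifier to the list of
    candidate labels recognized in the frames where that clothing appears.
    """
    result = {}
    for track_id, labels in candidates_by_track.items():
        if labels:  # skip identifiers with no recognized candidate
            result[track_id] = Counter(labels).most_common(1)[0][0]
    return result
```

How ties between equally frequent candidates should be resolved is not specified in the text; this sketch simply takes whichever candidate `Counter.most_common` returns first.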
Optionally, the clothing recognition module may also establish an association between a clothing tracking identifier and a face tracking identifier. Correspondingly, the content analysis result includes the person information corresponding to the face tracking identifier and the clothing information corresponding to the clothing tracking identifier associated with that face tracking identifier. This makes the content analysis result easier to visualize.
Specifically, the clothing images detected by the clothing recognition module are associated with face images. The clothing recognition module can therefore associate the two based on the positional relationship between the face image and the clothing image (or other information from which the positions of the face image and the clothing image can be determined). The output content analysis result then includes not only the information of the person recognized in the video image but also the clothing information of that person, associated with the person information, so that users obtain more structured information.
Specifically, the association may be determined from the coordinate information of the detected clothing image and of the detected face image. When the distance (for example, the Euclidean distance) between the detected clothing image and the detected face image is less than a set value, the clothing image is considered to correspond to the face image; that is, the person to whom the detected face image belongs is wearing the clothing shown in the detected clothing image. The embodiments of the present disclosure are not limited thereto.
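The distance test above can be sketched as follows. This sketch assumes that each face image and clothing image is represented by the (x, y) coordinates of its center and, as a further assumption not stated in the text, that a clothing image is linked to the nearest face when several faces fall within the set value. The names `associate_clothing` and `max_distance` are hypothetical.

```python
import math


def associate_clothing(faces, clothes, max_distance):
    """Link each clothing image to a face image by Euclidean distance.

    faces and clothes map an identifier to the (x, y) center of the detected
    image. Returns a dict mapping clothing identifier to face identifier.
    """
    links = {}
    for clothing_id, clothing_pos in clothes.items():
        nearest_id, nearest_dist = None, None
        for face_id, face_pos in faces.items():
            dist = math.dist(face_pos, clothing_pos)
            if nearest_dist is None or dist < nearest_dist:
                nearest_id, nearest_dist = face_id, dist
        # Associate only when the distance is less than the set value.
        if nearest_id is not None and nearest_dist < max_distance:
            links[clothing_id] = nearest_id
    return links
```

Clothing that is far from every detected face stays unlinked, which is consistent with ignoring clothing that appears on its own in the picture.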
In one example of the present disclosure, the GPU 120 may further include a scene recognition module, configured to:
perform shot segmentation on the sub-video stream to obtain at least one shot and at least one frame of consecutive video images corresponding to each shot; and
perform scene recognition on each frame of the at least one frame of consecutive video images corresponding to each shot to obtain the scene information corresponding to each shot.
Optionally, when multiple scenes appear in one shot (the same scene being identified as several scenes because of misjudgment), an average confidence is calculated for each recognized scene, and the scene information corresponding to the shot is obtained by comparing these average confidences.
For scene recognition, shot segmentation is performed first. For the consecutive frames within one shot, multiple scenes and their confidences are detected in each frame; the confidences of each scene are summed over the consecutive frames in the shot and then averaged. If the average exceeds a preset threshold, that scene is output. A single shot may therefore output multiple scenes, for example: sea, beach.
Specifically, scene recognition may be performed on each frame of the at least one frame of consecutive video images corresponding to a shot, to obtain at least one scene for each frame and the confidence of each of those scenes. Then, for a given recognized scene, the target confidence of the scene can be determined from its confidence in each frame of the at least one frame of consecutive video images corresponding to the shot. For example, the target confidence of a scene may be the average confidence obtained by averaging the scene's confidence over each frame of the shot's consecutive video images, but the embodiments of the present disclosure are not limited thereto. Optionally, after the target confidence of each of the multiple scenes corresponding to the shot has been obtained, the target scene of the shot can be determined from those target confidences. For example, if the target confidence of a scene is equal to or higher than a preset threshold, that scene is determined to be a target scene of the shot, but the embodiments of the present disclosure are not limited thereto.
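The per-shot averaging and thresholding can be sketched as follows. The text does not say whether a scene's confidence is averaged over all frames of the shot or only over the frames in which the scene was detected; this sketch assumes the former, so a frame in which a scene is not detected contributes zero. The function name `shot_scenes` and the data layout are hypothetical.

```python
from collections import defaultdict


def shot_scenes(frame_scores, threshold):
    """Return {scene: average confidence} for scenes that reach the threshold.

    frame_scores holds one dict per frame of the shot, mapping each scene
    detected in that frame to its confidence.
    """
    if not frame_scores:
        return {}
    totals = defaultdict(float)
    for scores in frame_scores:
        for scene, confidence in scores.items():
            totals[scene] += confidence
    frame_count = len(frame_scores)
    averages = {scene: total / frame_count for scene, total in totals.items()}
    # Keep every scene whose average is equal to or higher than the threshold,
    # so one shot may yield several target scenes.
    return {scene: avg for scene, avg in averages.items() if avg >= threshold}
```

With a shot whose frames mostly score high for both "sea" and "beach", both scenes pass the threshold, which corresponds to the sea and beach example in the text.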
In one example of the embodiments of the present disclosure, the GPU 120 may perform item detection on each frame of video image of the sub-video stream to obtain at least one item image in each frame, and then track and recognize the item images that correspond to the same item across the at least one frame of video image of the sub-video stream to determine the item information corresponding to each item image.
In this way, items in the video images are recognized, and structured information about the items in the sub-video stream is obtained.
Optionally, the CPU 110 may obtain a video stream uploaded by a user over the Internet.
Specifically, the video stream received by the CPU 110 may have been uploaded by the user over the Internet or uploaded locally. The CPU 110 can receive the video stream by any legitimate means of upload, and the present disclosure does not limit the manner of uploading.
For example, the user may open a browser and access the content analysis system through the web. The browser may be integrated with the server or separate from it, which the examples of the present disclosure do not limit.
Optionally, the CPU 110 is further configured to authenticate the user in response to the user's video upload request, before obtaining the video stream uploaded by the user over the Internet.
Specifically, the CPU 110 may implement video data management functions and be responsible for user management and authorization, managing user permissions, and so on. In practice, after a user request is received, the user's permissions can be checked. Only requests sent by users who have permission are processed; for requests sent by users without permission, a rejection message is returned directly.
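The permission check can be illustrated with a very small sketch. It assumes permissions are kept as a mapping from user identifier to a set of permitted actions; the storage model, the action name "upload", and the function `handle_upload_request` are all hypothetical and only stand in for whatever authentication scheme an implementation actually uses.

```python
def handle_upload_request(user_id, permissions):
    """Return (accepted, message) for a video upload request.

    permissions maps a user identifier to the set of actions the user may perform.
    """
    if "upload" in permissions.get(user_id, set()):
        return True, "request accepted"
    # Users without permission get a rejection and the request is not processed.
    return False, "request rejected: no permission"
```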
Optionally, the user here may specifically be a terminal or a video service provider; correspondingly, the content analysis system may specifically be a server or a content provider, but the embodiments of the present disclosure do not limit this.
Optionally, the system may also feed the content analysis result of the video stream back to the user over a network (for example, the Internet), where the content analysis result may optionally take the form of a file or another form. Optionally, the system may render the content analysis result of the video stream in the video stream, and may further display the video stream in which the content analysis result has been rendered, but the embodiments of the present disclosure are not limited thereto.
As an example, real-time rendering (for example, web rendering technology) may be used to render the content analysis result of the video stream in the browser in real time, which improves the user experience. This web-based visualization presents the content analysis result more intuitively.
Optionally, the system may further process the video stream according to its content analysis result, for example by displaying a purchase link or related advertising information for the clothing worn by a certain person in a certain scene of the video stream. The embodiments of the present disclosure do not limit this.
As an example, a user can upload a TV series through the web. After the system performs content analysis on the series, it generates a corresponding video structure file, which may include spatio-temporal information (at what time and where they appear) for faces, clothing, and items, as well as scene information. The user can use this structured information to interact with viewers. For example, a star's name can be used to search the Internet for other information related to that star. Semantic search can also be performed using scene and person information, for example to find the segment in which a certain star first appears in a certain scene. Person information can also be used for selective viewing, and so on.
The embodiments of the present disclosure are described below taking the case where the video stream is a film or television drama as an example:
1. The user uploads, through the browser (web), a video file and the actor information file for that video, where the actor information file contains the information of the actors in the video file.
For example, the video file is the first episode of the TV series 《Song of Joy 2》, and the actor information file contains the information of the actors in 《Song of Joy 2》, one actor per line. Optionally, the system may also obtain the actor information file in other ways, such as a web search, which the embodiments of the present disclosure do not limit.
The browser may be part of the system, or the browser and the system may be located on different physical devices, which the embodiments of the present disclosure do not limit.
2. The system passes the associated actor information file and video file to the system's video analysis engine for processing.
3. The video analysis engine may divide the video file equally in the CPU 110 environment according to the number of GPUs 120, and then distribute the resulting sub-video streams, together with the actor information file, to the respective GPU 120 environments for the actual video analysis.
4. The GPU 120 may perform content analysis on the received sub-video stream to obtain a content analysis result. The content analysis result may include structured metadata, specifically: the star's name, face-related attributes, and position coordinates in the picture for each face in the current frame; the category, color, texture, neckline, and cuff information of the clothing the star wears in the current frame, and its position coordinates in the picture; the name information of the items in the current frame and their position coordinates in the picture; and the scene information of the current frame.
The user can preview a visualization of the content analysis result in the browser, or download the structured metadata file of the video, which may contain the structured information of all frames of the video.
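The structured metadata listed in step 4 could be laid out per frame as in the sketch below. The disclosure names the kinds of information but not a file format, so the JSON serialization, every class name, and every field name here are assumptions made for illustration.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class Box:
    """Position coordinates in the picture (hypothetical layout)."""
    x: int
    y: int
    width: int
    height: int


@dataclass
class FaceInfo:
    name: str         # star's name
    attributes: dict  # face-related attributes
    box: Box


@dataclass
class ClothingInfo:
    category: str
    color: str
    texture: str
    neckline: str
    cuff: str
    box: Box


@dataclass
class ItemInfo:
    name: str
    box: Box


@dataclass
class FrameMetadata:
    frame_index: int
    faces: list = field(default_factory=list)
    clothing: list = field(default_factory=list)
    items: list = field(default_factory=list)
    scenes: list = field(default_factory=list)


def to_json(frames):
    """Serialize per-frame metadata; asdict converts nested dataclasses to dicts."""
    return json.dumps([asdict(frame) for frame in frames], ensure_ascii=False)
```

A file holding one such record per frame would correspond to the downloadable structured metadata file mentioned above.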
FIG. 2 is a structural diagram of an electronic device provided by an embodiment of the present disclosure.
The communication unit 210 may be configured to send a video stream to the content analysis system in response to a received user request.
Optionally, the user may send the user request, such as a video upload/analysis request, through the browser of the electronic device.
The content analysis system may perform content analysis on the video stream to obtain the content analysis result of the video stream.
The communication unit 210 may also receive the content analysis result of the video stream sent by the content analysis system.
The storage unit 220 may be configured to store the content analysis result of the video stream received by the communication unit 210.
Specifically, an electronic device independent of the content analysis system receives user requests and may also check the user's permissions, sending to the content analysis system for analysis only video streams from users who have permission. The electronic device sends the video stream to the content analysis system and receives and stores the content analysis result fed back by the content analysis system, so that the content analysis result can be displayed in the video stream synchronously or displayed all together during a certain period.
Optionally, the electronic device 200 may further include a rendering unit, configured to render the content analysis result of the video stream in the video stream and to display the video stream in which the content analysis result has been rendered.
Specifically, the electronic device may use real-time rendering (for example, web rendering technology) to render the content analysis result in the browser in real time, which improves the user experience. This web-based visualization presents the content analysis result more intuitively.
FIG. 3 is a schematic flowchart of the video stream content analysis method provided by an embodiment of the present disclosure. As shown in FIG. 3, the method 300 includes:
S301: the CPU obtains a video stream, divides the video stream into multiple sub-video streams corresponding to multiple GPUs, and distributes the sub-video streams to the corresponding GPUs.
Each sub-video stream includes at least one frame of consecutive video images from the video stream, and different GPUs correspond to different sub-video streams.
Optionally, one GPU 120 may correspond to one or more sub-video streams, and the number of sub-video streams corresponding to different GPUs 120 may be the same or different. As an example, the number of sub-video streams may be an integer multiple of the number of GPUs 120, in which case, optionally, the different GPUs 120 may each correspond to the same number of sub-video streams. For example, the number of sub-video streams may equal the number of GPUs 120, and the sub-video streams may correspond one-to-one to the GPUs 120, but the embodiments of the present disclosure are not limited thereto.
S302: each GPU performs content analysis on the video images in its corresponding sub-video stream, obtains the content analysis result of that sub-video stream, and reports the result to the CPU.
Each of the GPUs 120 may perform content analysis on its own corresponding sub-video stream. Optionally, every GPU 120 may use the same procedure for content analysis; for ease of understanding, the operation of one GPU 120 is described below as an example.
S303: the CPU obtains the content analysis result of the video stream from the content analysis results reported by the multiple GPUs.
In the video stream content analysis system provided by the above embodiments of the present disclosure, the CPU divides the video stream into multiple sub-video streams, and each GPU then analyzes its sub-video stream in parallel, which effectively increases processing speed.
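Operations S301 to S303 can be sketched end to end as follows. This is an illustrative sketch only: worker threads stand in for the GPUs, the per-GPU analysis is a placeholder that merely tags each frame, and the number of sub-video streams is assumed to be an integer multiple of the number of GPUs with round-robin assignment. The names `analyze_substream`, `analyze_stream`, and `streams_per_gpu` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def analyze_substream(gpu_id, frames):
    # Hypothetical stand-in for the content analysis a GPU performs in S302.
    return [{"frame": frame, "gpu": gpu_id} for frame in frames]


def analyze_stream(frames, num_gpus, streams_per_gpu=1):
    # S301: divide the stream into contiguous sub-streams of near-equal length.
    count = num_gpus * streams_per_gpu
    base, extra = divmod(len(frames), count)
    substreams = []
    start = 0
    for i in range(count):
        end = start + base + (1 if i < extra else 0)
        substreams.append(frames[start:end])
        start = end
    # Round-robin assignment gives every GPU the same number of sub-streams.
    gpu_ids = [i % num_gpus for i in range(count)]

    # S302: analyze the sub-streams in parallel. The gpu_id is only a label
    # here; the thread pool does not pin a sub-stream to a particular device.
    with ThreadPoolExecutor(max_workers=num_gpus) as pool:
        partial_results = list(pool.map(analyze_substream, gpu_ids, substreams))

    # S303: pool.map returns results in input order, so concatenating the
    # partial results restores the original frame order.
    return [record for part in partial_results for record in part]
```

Merging by concatenation works because each sub-video stream is a contiguous run of frames; a different division scheme would need the CPU to reorder the reported results.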
Optionally, the content analysis result may include at least one of the following: person information, clothing information, item information, and scene information.
As an example, the clothing information may include at least one of the following: category information, color information, texture information, neckline information, and cuff information of the clothing, and image coordinate information of the clothing.
The category information indicates the type of clothing, such as coat, shirt, and so on. The image coordinate information indicates the position of the clothing in the video image. Optionally, the clothing information may also include other information, which the embodiments of the present disclosure do not limit.
As an example, the face information may include at least one of the following: name information, face attribute information, and image coordinate information of the face.
The name information may specifically be the name of the person in the video, such as the name of the character in the drama, or the person's real name or stage name, and so on. The face attribute information may include face similarity information, the person's gender information, the person's age information, and so on. The image coordinate information of the face indicates the position of the face image in the video image. Optionally, the face information may also include other information, which the embodiments of the present disclosure do not limit.
As an example, the item information may include at least one of the following: item name information, item attribute information, and image coordinate information of the item.
The item attribute information may include information such as the material and brand of the item. Optionally, the item information may also include other information, which the embodiments of the present disclosure do not limit.
As an example, the scene information may include scene name information, for example beach, airport, and so on, which the embodiments of the present disclosure do not limit.
Optionally, the GPU 120 may specifically use a convolutional neural network to perform content analysis on the video images in the sub-video stream. This achieves automatic understanding of the content of the video images in the sub-video stream, overcomes the shortcomings of manual content annotation, and improves the efficiency of content understanding.
In addition, persons, clothing, items, and scenes may each be recognized using a different, separately trained convolutional neural network. Through intelligent analysis of the video content, structured information such as faces, clothing, items, and scenes is effectively extracted from the video and combined systematically, thereby achieving intelligent analysis and structured output of video content.
Optionally, in a specific example, operation S301 may include:
dividing the video stream, according to the number of GPUs, into multiple sub-video streams corresponding to the multiple GPUs.
Specifically, in a multi-GPU content analysis system, the CPU 110 may divide the video dynamically according to the number of currently available GPUs. The CPU 110 may divide the video stream into multiple sub-video streams; for example, it may divide the video stream equally into sub-video streams in one-to-one correspondence with the GPUs 120, although the embodiments of the present disclosure are not limited thereto. Optionally, the CPU 110 may also divide the video stream in other ways, which the embodiments of the present disclosure do not limit.
In an optional example of the present disclosure, operation S302 may include:
obtaining the person information database of the video stream, which includes the face images and name information of the persons in the video stream; and
using the person information database of the video stream to determine the face information of at least one frame of video image in the sub-video stream corresponding to the GPU.
Optionally, a convolutional neural network may be used to perform face detection on each frame of video image in the sub-video stream. Face detection yields the face position information in the video image; based on that position information, the face image is cropped out of the video image and compared with the face images in the person information database to obtain the required face information, thereby achieving face recognition on video images. Optionally, face recognition may also be implemented by means other than convolutional neural networks, which the embodiments of the present disclosure do not limit.
As an example, using the person information database of the video stream to determine the face information of at least one frame of video image in the sub-video stream corresponding to the GPU may include:
performing face detection on each frame of the at least one frame of video image of the corresponding sub-video stream to obtain at least one face image in each frame, where face images that correspond to the same person and appear in at least one frame of consecutive video images in the sub-video stream share the same face tracking identifier;
determining a target face image from the at least one face image in the sub-video stream that corresponds to the same face tracking identifier; and
comparing the target face image with the face images in the person information database of the video stream to determine the person corresponding to the face tracking identifier of the target face image.
Specifically, for a given face tracking identifier, feature values may be extracted from the face image in each frame of the at least one frame of consecutive video images in the sub-video stream that corresponds to that identifier, and the confidence of the face image in each frame may be determined from the extracted feature values. The target face image corresponding to the face tracking identifier can then be determined from the confidence of the face image in each of those frames. For example, the face image with the highest confidence among the at least one face image corresponding to the face tracking identifier may be taken as the target face image for that identifier, but the embodiments of the present disclosure do not limit this. The target face image can be regarded as the face image of the person that has the best quality in the sub-video stream, or that is best suited for face comparison. Comparing this target face image with the face images in the person information database of the video stream avoids, as far as possible, misjudgments caused by poor image quality and/or face angle, and effectively improves the accuracy and efficiency of face recognition.
Optionally, operation S302 may further include:
performing clothing detection on each frame of video image, according to the face image detected by the face recognition module in each frame of the at least one frame of video image of the sub-video stream, to obtain at least one clothing image in each frame, where clothing images in the sub-video stream that correspond to the same face tracking identifier and to the same item of clothing share the same clothing tracking identifier; and
determining the clothing corresponding to a clothing tracking identifier from the at least one clothing image in the sub-video stream that has that identifier.
Specifically, for a given clothing tracking identifier, a vote may be held over the clothing images corresponding to that identifier in at least one frame of video image in the sub-video stream, and the clothing with the most votes is selected as the clothing corresponding to the identifier. For example, the clothing image corresponding to the clothing tracking identifier in each frame of video image may be determined, and a candidate clothing for the identifier may be determined from each such clothing image. All frames in the sub-video stream in which a clothing image corresponding to the identifier appears can then be analyzed statistically to count how many times each candidate clothing occurs, and the candidate that occurs most often in the sub-video stream is determined to be the clothing corresponding to the clothing tracking identifier. Optionally, clothing recognition may also be performed in other ways, which the embodiments of the present disclosure do not limit.
In one example of the present disclosure, the method further includes:
establishing an association between the clothing tracking identifier and the face tracking identifier, where the content analysis result includes the person information corresponding to the face tracking identifier and the clothing information corresponding to the clothing tracking identifier associated with that face tracking identifier.
Specifically, there are incidence relations with facial image for the dress ornament image that dress ornament identification module detects.Therefore, dress ornament
Identification module can be based on facial image and dress ornament image position relationship (or other can determine facial image and dress ornament image position
The information put) incidence relation is established to the two, at this point, the Context resolution result of output not only includes to being identified in video image
People information, further include the dress ornament information with the personage of personage's information association, enable users to recognize more structurings
Information.
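One possible way to derive such an association from positions is sketched below. The disclosure does not specify the rule, so the heuristic here is purely an assumption: boxes are `(x, y, width, height)` in image coordinates with y increasing downward, and each face is linked to the clothing box that extends below the face and whose horizontal center is nearest. The names `associate` and `center_x` are hypothetical.

```python
from typing import Dict, Optional, Tuple

Box = Tuple[float, float, float, float]  # x, y, width, height (assumed layout)


def center_x(box: Box) -> float:
    """Horizontal center of a bounding box."""
    return box[0] + box[2] / 2


def associate(faces: Dict[int, Box],
              clothes: Dict[int, Box]) -> Dict[int, Optional[int]]:
    """Map each face tracking identifier to a clothing tracking identifier.

    Assumed heuristic: a clothing box qualifies if it extends below the
    bottom edge of the face box; among qualifying boxes, the one with the
    nearest horizontal center wins. Faces with no match map to None.
    """
    links: Dict[int, Optional[int]] = {}
    for face_id, face_box in faces.items():
        face_bottom = face_box[1] + face_box[3]
        best_id, best_dist = None, float("inf")
        for cloth_id, cloth_box in clothes.items():
            cloth_bottom = cloth_box[1] + cloth_box[3]
            if cloth_bottom <= face_bottom:
                continue  # clothing does not extend below the face
            dist = abs(center_x(face_box) - center_x(cloth_box))
            if dist < best_dist:
                best_id, best_dist = cloth_id, dist
        links[face_id] = best_id
    return links


faces = {10: (100, 50, 40, 40), 11: (300, 60, 40, 40)}
clothes = {20: (80, 100, 80, 150), 21: (290, 110, 70, 140)}
links = associate(faces, clothes)
```

With the links in hand, the person information for face 10 and the clothing information for clothing 20 can be reported together as one structured record.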
Optionally, the method further includes: the CPU rendering the content analysis result of the video stream into the video stream. The video stream is thereby further processed, for example to display in the video stream a purchase link for the clothing worn by a certain person in a certain scene, or related advertising information, and so on, which the embodiments of the present disclosure do not limit.
Optionally, the CPU obtaining the video stream includes: the CPU obtaining a video stream uploaded by a user over the Internet.
Specifically, the video stream received by the CPU may be uploaded by the user over the Internet or uploaded locally by the user. As long as the upload means is legitimate, the CPU can receive the video stream; the present disclosure does not limit the upload mode.
Optionally, before the CPU obtains the video stream uploaded by the user over the Internet, the method further includes: performing user authentication on the user in response to the user's video upload request.
Specifically, the CPU can implement a video data management function and is responsible for user management and authorization, managing user permissions, and so on. In practical applications, after a user request is received, the user's permissions can be checked: only a request sent by a user who has permission is processed, while a request sent by a user without permission is answered directly with a rejection message.
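The permission check described above can be sketched as follows. The permission table, the `"upload"` permission name, and the returned status strings are assumptions made for illustration; the disclosure does not prescribe how permissions are stored or how rejection is signalled.

```python
from typing import Dict, Set


def handle_upload_request(user_id: str,
                          permissions: Dict[str, Set[str]]) -> str:
    """Accept an upload request only from a user holding the upload permission.

    `permissions` is a hypothetical in-memory table mapping a user identifier
    to the set of permissions granted to that user. Unknown users are treated
    as having no permissions.
    """
    if "upload" in permissions.get(user_id, set()):
        return "accepted"   # the request goes on to normal processing
    return "rejected"       # a rejection message is fed back directly


permissions = {"alice": {"upload"}, "bob": set()}
alice_status = handle_upload_request("alice", permissions)
bob_status = handle_upload_request("bob", permissions)
```

Checking permissions before the stream is read keeps unauthorized uploads from consuming any GPU time.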
According to one aspect of the embodiments of the present disclosure, an electronic device is provided, including: a memory for storing executable instructions;
and a processor for communicating with the memory to execute the executable instructions so as to complete the operations of the content analysis method of a video stream according to any of the above embodiments of the present invention.
According to one aspect of the embodiments of the present disclosure, a computer program is provided, including computer-readable code. When the computer-readable code runs on a device, a processor in the device executes instructions for implementing each step of the content analysis method of a video stream according to the above embodiments of the present disclosure.
According to one aspect of the embodiments of the present disclosure, a computer storage medium is provided for storing computer-readable instructions. When executed, the instructions perform the operations of the content analysis method of a video stream according to any of the above embodiments of the present invention.
The embodiments of the present invention additionally provide an electronic device, which may be, for example, a mobile terminal, a personal computer (PC), a tablet computer, or a server. Referring now to Fig. 4, a structural diagram is shown of an electronic device 400 suitable for implementing a terminal device or server of the embodiments of the present application. As shown in Fig. 4, the computer system 400 includes one or more processors, a communication unit, and so on. The one or more processors are, for example, one or more central processing units (CPU) 401 and/or one or more graphics processors (GPU) 413. The processor can perform various appropriate actions and processing according to executable instructions stored in read-only memory (ROM) 402 or executable instructions loaded from storage section 408 into random access memory (RAM) 403. The communication unit 412 may include, but is not limited to, a network card, and the network card may include, but is not limited to, an IB (InfiniBand) network card.
The processor can communicate with the read-only memory 402 and/or the random access memory 403 to execute executable instructions, is connected to the communication unit 412 via bus 404, and communicates with other target devices through the communication unit 412, thereby completing the operations corresponding to any method provided by the embodiments of the present application. For example: the CPU obtains a video stream, divides the video stream into multiple sub-video streams corresponding to multiple GPUs, and distributes the sub-video streams to the corresponding GPUs; each GPU performs content analysis processing on the video images in its corresponding sub-video stream, obtains the content analysis result of that sub-video stream, and reports it to the CPU; and the CPU obtains the content analysis result of the video stream from the content analysis results reported by the multiple GPUs.
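The split, distribute, and merge flow just described can be sketched as below. This is only an illustration under stated assumptions: worker threads stand in for GPUs, the frames are split into contiguous chunks of near-equal size according to the number of GPUs, and `parse_sub_stream` is a hypothetical placeholder for the per-GPU content analysis. None of these names come from the disclosure.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


def split_stream(frames: Sequence, num_gpus: int) -> List[Sequence]:
    """Split frames into at most num_gpus contiguous, non-empty sub-streams."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    base, extra = divmod(len(frames), num_gpus)
    sub_streams, start = [], 0
    for i in range(num_gpus):
        size = base + (1 if i < extra else 0)  # spread the remainder evenly
        if size == 0:
            break  # fewer frames than GPUs: the remaining GPUs get nothing
        sub_streams.append(frames[start:start + size])
        start += size
    return sub_streams


def parse_video(frames: Sequence, num_gpus: int,
                parse_sub_stream: Callable[[Sequence], list]) -> list:
    """Dispatch sub-streams to workers and merge their results in stream order."""
    sub_streams = split_stream(frames, num_gpus)
    if not sub_streams:
        return []
    with ThreadPoolExecutor(max_workers=len(sub_streams)) as pool:
        # map() yields results in the order of the inputs, so the merged
        # result follows the original frame order.
        partial_results = list(pool.map(parse_sub_stream, sub_streams))
    merged: list = []
    for partial in partial_results:
        merged.extend(partial)
    return merged


frames = list(range(10))
chunk_sizes = [len(s) for s in split_stream(frames, 3)]
merged = parse_video(frames, 3, lambda sub: [f"frame-{f}" for f in sub])
```

Keeping each sub-stream contiguous means that per-stream state such as tracking identifiers stays local to one worker, which is one plausible reason for splitting by consecutive frames.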
In addition, the RAM 403 can also store various programs and data needed for the operation of the device. The CPU 401, ROM 402, and RAM 403 are connected to one another by bus 404. Where RAM 403 is present, ROM 402 is an optional module. The RAM 403 stores executable instructions, or executable instructions are written into ROM 402 at runtime, and the executable instructions cause the processor 401 to perform the operations corresponding to the method described above. An input/output (I/O) interface 405 is also connected to bus 404. The communication unit 412 may be integrated, or may be provided with multiple sub-modules (for example, multiple IB network cards) linked on the bus.
The following components are connected to the I/O interface 405: an input section 406 including a keyboard, a mouse, and the like; an output section 407 including a cathode ray tube (CRT), a liquid crystal display (LCD), and the like, as well as a speaker; a storage section 408 including a hard disk and the like; and a communication section 409 including a network interface card such as a LAN card or a modem. The communication section 409 performs communication processing via a network such as the Internet. A drive 410 is also connected to the I/O interface 405 as needed. A removable medium 411, such as a magnetic disk, an optical disc, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 410 as needed, so that a computer program read from it can be installed into the storage section 408 as needed.
It should be noted that the architecture shown in Fig. 4 is only one optional implementation. In practice, the number and types of the components in Fig. 4 may be selected, removed, added, or replaced according to actual needs. Different functional components may also be arranged separately or in an integrated manner: for example, the GPU and the CPU may be arranged separately or the GPU may be integrated on the CPU, and the communication unit may be arranged separately or integrated on the CPU or GPU, and so on. These alternative embodiments all fall within the protection scope disclosed by the present invention.
In particular, according to the embodiments of the present disclosure, the process described above with reference to the flowchart may be implemented as a computer software program. For example, the embodiments of the present disclosure include a computer program product that includes a computer program tangibly embodied on a machine-readable medium. The computer program contains program code for executing the method shown in the flowchart, and the program code may include instructions corresponding to the method steps provided by the embodiments of the present application, for example: the CPU obtains a video stream, divides the video stream into multiple sub-video streams corresponding to multiple GPUs, and distributes the sub-video streams to the corresponding GPUs; each GPU performs content analysis processing on the video images in its corresponding sub-video stream, obtains the content analysis result of that sub-video stream, and reports it to the CPU; and the CPU obtains the content analysis result of the video stream from the content analysis results reported by the multiple GPUs. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 409 and/or installed from the removable medium 411. When the computer program is executed by the central processing unit (CPU) 401, the above functions defined in the method of the present application are performed.
The methods, apparatuses, and devices of the present invention may be implemented in many ways, for example by software, hardware, firmware, or any combination of software, hardware, and firmware. The above order of the method steps is given only for illustration, and the steps of the method of the present invention are not limited to the order described in detail above unless otherwise specified. In addition, in some embodiments, the present invention may also be embodied as programs recorded in a recording medium, the programs including machine-readable instructions for implementing the method according to the present invention. The present invention thus also covers a recording medium storing a program for executing the method according to the present invention.
The description of the present invention is given for the sake of example and description, and is not intended to be exhaustive or to limit the invention to the disclosed form. Many modifications and variations will be obvious to those of ordinary skill in the art. The embodiments were selected and described in order to better illustrate the principles and practical application of the present invention, and to enable those of ordinary skill in the art to understand the present invention and thereby design various embodiments, with various modifications, suited to particular uses.
Claims (10)
1. A content analysis system for a video stream, characterized by including:
a central processing unit (CPU) and multiple graphics processing units (GPUs), wherein
the CPU is configured to obtain a video stream, divide the video stream into multiple sub-video streams corresponding to the multiple GPUs, and distribute the multiple sub-video streams to the corresponding GPUs, wherein each sub-video stream includes at least one frame of consecutive video images in the video stream, and different GPUs correspond to different sub-video streams;
each GPU is configured to perform content analysis processing on the video images in the sub-video stream corresponding to that GPU, obtain the content analysis result of the corresponding sub-video stream, and report the content analysis result of the corresponding sub-video stream to the CPU;
the CPU is further configured to obtain the content analysis result of the video stream according to the content analysis results reported by the multiple GPUs.
2. The system according to claim 1, characterized in that the CPU dividing the video stream into multiple sub-video streams corresponding to the multiple GPUs includes:
the CPU dividing the video stream into multiple sub-video streams corresponding to the multiple GPUs according to the number of the multiple GPUs.
3. The system according to claim 1 or 2, characterized in that the content analysis result includes at least one of the following: person information, clothing information, item information, and scene information.
4. The system according to claim 3, characterized in that the clothing information includes at least one of the following: category information, color information, texture information, neckline information, and cuff information of the clothing, and image coordinate information of the clothing; and/or
the face information includes at least one of the following: name information, face attribute information, and image coordinate information of the face.
5. The system according to any one of claims 1 to 4, characterized in that the GPU includes a face recognition module configured to:
obtain a person information library of the video stream, the person information library of the video stream including face images and name information of the persons in the video stream;
determine, using the person information library of the video stream, the face information of at least one frame of video image in the sub-video stream corresponding to the GPU.
6. An electronic device, characterized by including:
a communication unit configured to, in response to a user's video upload request, send a video stream to a content analysis system and receive the content analysis result of the video stream sent by the content analysis system;
a storage unit configured to save the content analysis result of the video stream.
7. A content analysis method for a video stream, characterized by being applied to a content analysis system including a central processing unit (CPU) and multiple graphics processing units (GPUs), and including:
the CPU obtaining a video stream, dividing the video stream into multiple sub-video streams corresponding to the multiple GPUs, and distributing the multiple sub-video streams to the corresponding GPUs, wherein each sub-video stream includes at least one frame of consecutive video images in the video stream, and different GPUs correspond to different sub-video streams;
each GPU performing content analysis processing on the video images in the sub-video stream corresponding to that GPU, obtaining the content analysis result of the corresponding sub-video stream, and reporting the content analysis result of the corresponding sub-video stream to the CPU;
the CPU obtaining the content analysis result of the video stream according to the content analysis results reported by the multiple GPUs.
8. An electronic device, characterized by including: a memory for storing executable instructions;
and a processor for communicating with the memory to execute the executable instructions so as to complete the operations of the content analysis method of a video stream according to claim 7.
9. A computer storage medium for storing computer-readable instructions, characterized in that the instructions, when executed, perform the operations of the content analysis method of a video stream according to claim 7.
10. A computer program including computer-readable code, characterized in that when the computer-readable code runs on a device, a processor in the device executes instructions for implementing each step of the content analysis method of a video stream according to claim 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711066691.5A CN108235114A (en) | 2017-11-02 | 2017-11-02 | Content analysis method and system, electronic equipment, the storage medium of video flowing |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108235114A true CN108235114A (en) | 2018-06-29 |
Family
ID=62655006
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711066691.5A Pending CN108235114A (en) | 2017-11-02 | 2017-11-02 | Content analysis method and system, electronic equipment, the storage medium of video flowing |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108235114A (en) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2009009198A (en) * | 2007-06-26 | 2009-01-15 | Sony Corp | Image processor, imaging device, image processing method, and program |
CN101833569A (en) * | 2010-04-08 | 2010-09-15 | 中国科学院自动化研究所 | Method for automatically identifying film human face image |
US20110274314A1 (en) * | 2010-05-05 | 2011-11-10 | Nec Laboratories America, Inc. | Real-time clothing recognition in surveillance videos |
CN102541640A (en) * | 2011-12-28 | 2012-07-04 | 厦门市美亚柏科信息股份有限公司 | Cluster GPU (graphic processing unit) resource scheduling system and method |
CN105100894A (en) * | 2014-08-26 | 2015-11-25 | Tcl集团股份有限公司 | Automatic face annotation method and system |
CN105279480A (en) * | 2014-07-18 | 2016-01-27 | 顶级公司 | Method of video analysis |
CN105447529A (en) * | 2015-12-30 | 2016-03-30 | 商汤集团有限公司 | Costume detection and attribute value identification method and system |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108921773A (en) * | 2018-07-04 | 2018-11-30 | 百度在线网络技术(北京)有限公司 | Human body tracking processing method, device, equipment and system |
CN110769257A (en) * | 2018-07-25 | 2020-02-07 | 北京深鉴智能科技有限公司 | Intelligent video structured analysis device, method and system |
CN110427265A (en) * | 2019-07-03 | 2019-11-08 | 平安科技(深圳)有限公司 | Method, apparatus, computer equipment and the storage medium of recognition of face |
CN111414517A (en) * | 2020-03-26 | 2020-07-14 | 成都市喜爱科技有限公司 | Video face analysis method and device and server |
CN111414517B (en) * | 2020-03-26 | 2023-05-19 | 成都市喜爱科技有限公司 | Video face analysis method, device and server |
CN113923472A (en) * | 2021-09-01 | 2022-01-11 | 北京奇艺世纪科技有限公司 | Video content analysis method and device, electronic equipment and storage medium |
CN113923472B (en) * | 2021-09-01 | 2023-09-01 | 北京奇艺世纪科技有限公司 | Video content analysis method, device, electronic equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20180629 |