WO2017092127A1 - Video classification method and apparatus - Google Patents

Video classification method and apparatus

Info

Publication number
WO2017092127A1
Authority
WO
WIPO (PCT)
Prior art keywords
video
face
category
determining
picture
Prior art date
Application number
PCT/CN2015/099610
Other languages
French (fr)
Chinese (zh)
Inventor
陈志军
侯文迪
龙飞
Original Assignee
小米科技有限责任公司
Priority date
Filing date
Publication date
Application filed by 小米科技有限责任公司
Priority to RU2016136707A (RU2667027C2)
Priority to JP2016523976A (JP6423872B2)
Priority to MX2016005882A
Priority to KR1020167010359A (KR101952486B1)
Publication of WO2017092127A1

Classifications

    • G06F16/7837: Information retrieval of video data; retrieval characterised by using metadata automatically derived from the content, using objects detected or recognised in the video content
    • G06F16/784: Information retrieval of video data; retrieval using objects detected or recognised in the video content, the detected or recognised objects being people
    • G06F16/7867: Information retrieval of video data; retrieval characterised by using manually generated metadata, e.g. tags, keywords, comments, title and artist information
    • G06F18/22: Pattern recognition; analysing; matching criteria, e.g. proximity measures
    • G06F18/23: Pattern recognition; analysing; clustering techniques
    • G06F18/24: Pattern recognition; analysing; classification techniques
    • G06V10/462: Extraction of image or video features; salient features, e.g. scale invariant feature transforms [SIFT]
    • G06V20/41: Scenes and scene-specific elements in video content; higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G06V40/168: Recognition of human faces; feature extraction; face representation
    • G06V40/172: Recognition of human faces; classification, e.g. identification
    • G11B27/10: Editing; indexing; addressing; timing or synchronising; measuring tape travel
    • H04N21/8456: Structuring of content, e.g. decomposing content into time segments, by decomposing the content in the time domain

Abstract

The present disclosure relates to a video classification method and apparatus. The method comprises: acquiring a key frame that contains a human face from a video; acquiring a human face feature from the key frame; acquiring a human face feature corresponding to a picture category; determining, according to the human face feature in the key frame and the human face feature corresponding to the picture category, the picture category to which the video belongs; and assigning the video to the picture category to which the video belongs. This technical solution can intelligently and automatically classify a video into the picture category corresponding to a person appearing in the video, so that manual classification by the user is not needed and the classification accuracy is high.

Description

Video categorization method and apparatus
The present application is based upon, and claims priority to, Chinese Patent Application No. 2015108674365, filed on December 1, 2015, the entire contents of which are incorporated herein by reference.
Technical Field
The present disclosure relates to the field of multimedia clustering technologies, and in particular, to a video categorization method and apparatus.
Background
At present, a user can capture multimedia data such as videos and photos with a shooting device. For photos, face clustering techniques already exist that can group photos in which a given person appears into that person's photo collection. However, there is currently no technique for face-clustering videos together with photos in which the same person appears, so the user can only classify videos manually, which is neither intelligent nor efficient.
Summary
Embodiments of the present disclosure provide a video categorization method and apparatus. The technical solution is as follows.
According to a first aspect of the embodiments of the present disclosure, a video categorization method is provided, including:
acquiring a key frame that contains a human face from a video;
acquiring a face feature from the key frame;
acquiring a face feature corresponding to a picture category;
determining, according to the face feature in the key frame and the face feature corresponding to the picture category, the picture category to which the video belongs; and
assigning the video to the picture category to which the video belongs.
In one embodiment, acquiring a key frame that contains a human face from the video includes:
acquiring, from the video, at least one video frame that contains a human face;
determining a face parameter in each of the at least one video frame, the face parameter including either or both of the number of faces and the face positions; and
determining the key frame in the video according to the face parameter in each video frame.
In one embodiment, determining the key frame in the video according to the face parameter in each video frame includes:
determining, according to the face parameter in each video frame, non-repeated video frames whose face parameters do not recur in other video frames; and
determining at least one of the non-repeated video frames as the key frame.
In one embodiment, determining the key frame in the video according to the face parameter in each video frame includes:
determining, according to the face parameter in each video frame, at least one group of repeated video frames having the same face parameter, where each group of repeated video frames includes at least two video frames, the difference between the capture time of the latest-captured video frame and that of the earliest-captured video frame in each group is less than or equal to a preset duration, and all video frames in each group have the same face parameter; and
determining any one video frame in each group of repeated video frames as the key frame.
In one embodiment, determining the picture category to which the video belongs according to the face feature in the key frame and the face feature corresponding to the picture category includes: when there are at least two videos, determining the face feature in the key frame of each video; performing face clustering on the at least two videos according to the face feature in the key frame of each video to obtain at least one video category; and determining, according to the face feature corresponding to each of the at least one video category and the face feature corresponding to the picture category, the video category and the picture category that correspond to the same face feature.
Assigning the video to the picture category to which the video belongs then includes: assigning the videos in each video category to the picture category corresponding to the same face feature.
In one embodiment, determining the picture category to which the video belongs according to the face feature in the key frame and the face feature corresponding to the picture category includes:
determining, among the face features corresponding to the picture categories, the picture category that matches the face feature in the key frame; and
determining the matched picture category as the picture category to which the video belongs.
In one embodiment, the method further includes:
acquiring the shooting time and the shooting location of the video;
determining a target picture whose shooting time and shooting location are the same as those of the video; and
assigning the video to the picture category to which the target picture belongs.
According to a second aspect of the embodiments of the present disclosure, a video categorization apparatus is provided, including:
a first acquiring module, configured to acquire a key frame that contains a human face from a video;
a second acquiring module, configured to acquire the face feature in the key frame acquired by the first acquiring module;
a third acquiring module, configured to acquire a face feature corresponding to a picture category;
a first determining module, configured to determine, according to the face feature in the key frame acquired by the second acquiring module and the face feature corresponding to the picture category acquired by the third acquiring module, the picture category to which the video belongs; and
a first assigning module, configured to assign the video to the picture category, determined by the first determining module, to which the video belongs.
In one embodiment, the first acquiring module includes:
an acquiring submodule, configured to acquire, from the video, at least one video frame that contains a human face;
a first determining submodule, configured to determine a face parameter in each of the at least one video frame acquired by the acquiring submodule, the face parameter including either or both of the number of faces and the face positions; and
a second determining submodule, configured to determine the key frame in the video according to the face parameter in each video frame.
In one embodiment, the second determining submodule is further configured to determine, according to the face parameter in each video frame, non-repeated video frames whose face parameters do not recur in other video frames, and to determine at least one of the non-repeated video frames as the key frame.
In one embodiment, the second determining submodule is further configured to determine, according to the face parameter in each video frame, at least one group of repeated video frames having the same face parameter, where each group of repeated video frames includes at least two video frames, the difference between the capture time of the latest-captured video frame and that of the earliest-captured video frame in each group is less than or equal to a preset duration, and all video frames in each group have the same face parameter; and to determine any one video frame in each group of repeated video frames as the key frame.
In one embodiment, the first determining module includes:
a third determining submodule, configured to: when there are at least two videos, determine the face feature in the key frame of each video; perform face clustering on the at least two videos according to the face feature in the key frame of each video to obtain at least one video category; and determine, according to the face feature corresponding to each of the at least one video category and the face feature corresponding to the picture category, the video category and the picture category that correspond to the same face feature.
The first assigning module includes:
a first assigning submodule, configured to assign the videos in each video category determined by the third determining submodule to the picture category corresponding to the same face feature.
In one embodiment, the first determining module includes:
a fourth determining submodule, configured to determine, among the face features corresponding to the picture categories, the picture category that matches the face feature in the key frame; and
a second assigning submodule, configured to determine the matched picture category determined by the fourth determining submodule as the picture category to which the video belongs.
In one embodiment, the apparatus further includes:
a fourth acquiring module, configured to acquire the shooting time and the shooting location of the video;
a second determining module, configured to determine a target picture whose shooting time and shooting location are the same as those of the video acquired by the fourth acquiring module; and
a second assigning module, configured to assign the video to the picture category, determined by the second determining module, to which the target picture belongs.
According to a third aspect of the embodiments of the present disclosure, a video classification apparatus is provided, including:
a processor; and
a memory for storing instructions executable by the processor;
wherein the processor is configured to:
acquire a key frame that contains a human face from a video;
acquire the face feature in the key frame;
acquire a face feature corresponding to a picture category;
determine, according to the face feature in the key frame and the face feature corresponding to the picture category, the picture category to which the video belongs; and
assign the video to the picture category to which the video belongs.
The technical solutions provided by the embodiments of the present disclosure may have the following beneficial effects:
the above technical solution can intelligently and automatically classify a video into the picture category corresponding to a person appearing in the video, which requires no manual classification by the user and achieves high classification accuracy.
It should be understood that the above general description and the following detailed description are merely exemplary and explanatory, and do not limit the present disclosure.
Brief Description of the Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the present disclosure.
FIG. 1 is a flowchart of a video categorization method according to an exemplary embodiment.
FIG. 2 is a flowchart of another video categorization method according to an exemplary embodiment.
FIG. 3 is a flowchart of still another video categorization method according to an exemplary embodiment.
FIG. 4 is a block diagram of a video categorization apparatus according to an exemplary embodiment.
FIG. 5 is a block diagram of another video categorization apparatus according to an exemplary embodiment.
FIG. 6 is a block diagram of still another video categorization apparatus according to an exemplary embodiment.
FIG. 7 is a block diagram of yet another video categorization apparatus according to an exemplary embodiment.
FIG. 8 is a block diagram of yet another video categorization apparatus according to an exemplary embodiment.
FIG. 9 is a block diagram applicable to a network-connected apparatus, according to an exemplary embodiment.
Detailed Description
Exemplary embodiments will be described in detail herein, examples of which are illustrated in the accompanying drawings. Where the following description refers to the drawings, the same numerals in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present disclosure; rather, they are merely examples of apparatuses and methods consistent with some aspects of the present disclosure as detailed in the appended claims.
The embodiments of the present disclosure provide a video categorization technique that can intelligently and automatically classify a video into the picture category corresponding to a person appearing in the video, which requires no manual classification by the user and achieves high classification accuracy.
Before describing the method provided by the embodiments of the present disclosure, the picture category and the way it is generated are explained first. One picture category corresponds to one face: every picture in a picture category contains the same face, so it can also be said that one picture category corresponds to one person. Each picture category therefore contains a group of pictures sharing the same face feature. The embodiments of the present disclosure may use the following face clustering method to generate picture categories, but are not limited to this method.
In face clustering, the first clustering pass is usually initialized with a full clustering over all data, and subsequent clustering is generally performed incrementally. The face clustering method may include the following steps A1-A5; a code sketch follows the steps.
Step A1: Acquire the face feature contained in each of N pictures to obtain N face features, where N is greater than or equal to 2. At the start of clustering, each face is treated as its own class, so initially there are N classes.
Step A2: Among the N classes, compute the distance between every pair of classes; the distance between two classes is the distance between the faces they contain.
Step A3: Set a distance threshold θ in advance. When the distance between two classes is less than θ, the two classes are considered to correspond to the same person, and this round of iteration merges the two classes into one new class.
Step A4: Repeat step A3 iteratively until a round of iteration produces no new class, at which point the iteration terminates.
Step A5: The result is M classes in total, each containing at least one face, where one class represents one person.
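A minimal Python sketch of this threshold-based agglomerative clustering (steps A1-A5) is given below. The face embeddings, the choice of Euclidean distance, and the use of minimum distance as the class-to-class distance are illustrative assumptions; the patent does not fix these details.

```python
import numpy as np

def cluster_faces(features, theta):
    """Agglomerative face clustering (steps A1-A5).

    features: list of 1-D numpy arrays, one face embedding per picture (step A1).
    theta: distance threshold below which two classes are merged (step A3).
    Returns a list of clusters, each a list of picture indices (step A5).
    """
    # Initially every face is its own class.
    clusters = [[i] for i in range(len(features))]

    def class_distance(a, b):
        # Class-to-class distance: here the minimum distance between their faces (an assumption).
        return min(np.linalg.norm(features[i] - features[j]) for i in a for j in b)

    merged = True
    while merged:                      # step A4: iterate until no new class appears
        merged = False
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                if class_distance(clusters[x], clusters[y]) < theta:   # step A3
                    clusters[x] = clusters[x] + clusters[y]
                    del clusters[y]
                    merged = True
                    break
            if merged:
                break
    return clusters                    # M classes, one class per person
```

This brute-force version is quadratic in the number of classes per round; it is meant to mirror the steps, not to be an efficient implementation.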
FIG. 1 is a flowchart of a video categorization method provided by an embodiment of the present disclosure. The method may be executed by an application used to manage multimedia files; in that case, the videos, picture categories, and pictures under those categories involved in the method are the videos, picture categories, and pictures stored on the device on which the application runs. The method may also be executed by an electronic device that stores multimedia files; in that case, the videos, picture categories, and pictures under those categories involved in the method are those stored in the electronic device. The application or electronic device may trigger the method automatically on a periodic basis, upon receiving an instruction from the user, or automatically upon detecting that at least one new video has been produced. The method can be triggered in many other ways as well, not limited to the examples listed here; its ultimate purpose is to classify videos intelligently and to save manual effort. As shown in FIG. 1, the method includes steps S101-S105:
In step S101, a key frame containing a human face is acquired from a video.
In one embodiment, any one or more video frames containing a face may be selected from the video as key frames, or the key frame may be acquired as shown in FIG. 2. As shown in FIG. 2, step S101 may be implemented as the following steps S201-S203:
In step S201, at least one video frame containing a human face is acquired from the video.
In step S202, a face parameter is determined in each of the at least one video frame, the face parameter including either or both of the number of faces and the face positions.
In step S203, the key frame in the video is determined according to the face parameter in each video frame.
Step S203 may be implemented in either or both of the following two manners.
Manner 1: according to the face parameter in each video frame, determine non-repeated video frames whose face parameters do not recur in any other video frame, and determine at least one non-repeated video frame as the key frame.
That is, a non-repeated video frame is a video frame whose face parameter differs from that of every other video frame, meaning its face picture does not recur elsewhere in the video; therefore, one or more non-repeated video frames may be selected arbitrarily as key frames.
Manner 2: according to the face parameter in each video frame, determine at least one group of repeated video frames having the same face parameter, where each group includes at least two video frames, the difference between the capture time of the latest-captured frame and that of the earliest-captured frame in each group is less than or equal to a preset duration, and all frames in a group share the same face parameter; then determine any one video frame in each group of repeated video frames as the key frame.
The preset duration can be set in advance. Since the same picture in a video usually does not last very long, the preset duration should not be too long. Considering that video is played at 24 frames per second, the preset duration can be kept within N/24 seconds, where N is greater than or equal to 1 and less than or equal to 24 (or 36, or another value chosen as needed); the shorter the preset duration, the more accurate the finally selected key frames. In other words, the face picture of every video frame in a group of repeated video frames is the same, i.e., the same face picture appears in multiple video frames. Therefore, any one video frame in each group of repeated video frames can be selected as the key frame, which removes duplicates and improves the efficiency of key frame selection.
Manner 1 and Manner 2 above may be implemented separately or in combination; a combined code sketch is given below.
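The following sketch illustrates one way to combine the two manners: frames whose face parameters recur within the preset duration form a group represented by a single key frame (Manner 2), while frames whose parameters never recur are kept directly (Manner 1). The frame representation, the exact-equality test on face parameters, and the choice of N = 6 at 24 fps are illustrative assumptions, not requirements of the patent.

```python
from dataclasses import dataclass

@dataclass
class Frame:
    timestamp: float          # capture time in seconds
    face_count: int           # number of faces detected in the frame
    face_positions: tuple     # e.g. tuple of (x, y, w, h) boxes

def select_key_frames(frames, preset_duration=6 / 24):
    """Select key frames from frames that contain faces (step S203).

    preset_duration defaults to N/24 seconds with N = 6, an illustrative choice.
    Frames with identical face parameters captured within preset_duration seconds
    of each other form one group of repeated frames; one frame per group is kept.
    A frame whose parameters never recur is a non-repeated frame and is kept too.
    """
    key_frames = []
    group = []                                        # current group of repeated frames
    for frame in sorted(frames, key=lambda f: f.timestamp):
        # Exact equality is used for simplicity; a tolerance on positions could be used instead.
        same_params = group and (
            frame.face_count == group[0].face_count
            and frame.face_positions == group[0].face_positions
        )
        within_window = group and (frame.timestamp - group[0].timestamp) <= preset_duration
        if same_params and within_window:
            group.append(frame)                       # still the same repeated picture
        else:
            if group:
                key_frames.append(group[0])           # any frame of the group will do
            group = [frame]
    if group:
        key_frames.append(group[0])
    return key_frames
```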
In step S102, the face feature in the key frame is acquired.
In step S103, the face feature corresponding to a picture category is acquired.
In step S104, the picture category to which the video belongs is determined according to the face feature in the key frame and the face feature corresponding to the picture category.
In step S105, the video is assigned to the picture category to which the video belongs.
The above method provided by the embodiments of the present disclosure can intelligently and automatically classify videos together with pictures; it requires no manual classification by the user and, because the classification is based on face features, its accuracy is high.
In one embodiment, step S104 may be implemented as steps B1-B2. Step B1: among the face features corresponding to the picture categories, determine the picture category that matches the face feature in the key frame; for example, steps A1-A5 above may be performed so that face clustering determines, from the face feature in the key frame, the picture category to which the key frame belongs, and that picture category is the picture category matching the face feature in the key frame. Step B2: determine the matched picture category determined in step B1 as the picture category to which the video belongs. A sketch of this matching step follows.
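As a rough illustration of steps B1-B2, the sketch below matches a key-frame face feature against per-category reference features using the same distance threshold θ as in the clustering sketch above. Keeping a single representative embedding per picture category is an assumption made for brevity; the patent does not specify how the category's face feature is represented.

```python
import numpy as np

def match_picture_category(key_frame_feature, category_features, theta):
    """Steps B1-B2: find the picture category whose face feature matches the key frame.

    category_features: dict mapping category id -> representative face embedding
                       (assumed here to be one embedding per category).
    Returns the matching category id, or None if no category is within theta.
    """
    best_category, best_distance = None, theta
    for category_id, feature in category_features.items():
        distance = np.linalg.norm(key_frame_feature - feature)
        if distance < best_distance:                 # step B1: this category matches better
            best_category, best_distance = category_id, distance
    return best_category                             # step B2: category the video belongs to
```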
In another embodiment, step S104 may be implemented as steps C1-C3:
Step C1: when there are at least two videos, determine the face feature in the key frame of each video. Step C2: perform face clustering on the at least two videos according to the face feature in the key frame of each video to obtain at least one video category, where one video category corresponds to one face. Specifically, the face clustering method shown in steps A1-A5 above may be applied to the key frames to obtain at least one class; each class is a video category, so each video category corresponds to one face feature, and the video category to which a video's key frame belongs is the video category to which that video belongs. Step C3: according to the face feature corresponding to each of the at least one video category and the face feature corresponding to the picture category, determine the video category and the picture category that correspond to the same face feature. Correspondingly, step S105 above may be implemented as: assigning the videos in each video category to the picture category corresponding to the same face feature. In this manner, the videos are first face-clustered to obtain video categories, then the video categories and the picture categories are matched by face clustering to find the video category and picture category corresponding to the same face, and the videos in each video category are assigned to the picture category corresponding to the same face feature, thereby achieving the categorization of the videos. A sketch of this category-to-category mapping follows.
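A minimal sketch of steps C1-C3 is shown below, reusing the cluster_faces and match_picture_category helpers from the earlier sketches. Using the mean embedding of a video cluster as its representative face feature, and one key-frame embedding per video, are simplifying assumptions made for illustration only.

```python
import numpy as np

def categorize_videos(video_key_frame_features, category_features, theta):
    """Steps C1-C3: cluster videos by key-frame face features, then map each
    video category to the picture category with the same face feature.

    video_key_frame_features: list of 1-D embeddings, one per video (step C1).
    category_features: dict of picture-category id -> representative embedding.
    Returns a dict mapping picture-category id -> list of video indices.
    """
    assignment = {}
    video_clusters = cluster_faces(video_key_frame_features, theta)              # step C2
    for cluster in video_clusters:
        # Representative face feature of the video category (assumed: mean embedding).
        representative = np.mean([video_key_frame_features[i] for i in cluster], axis=0)
        category_id = match_picture_category(representative, category_features, theta)  # step C3
        if category_id is not None:
            assignment.setdefault(category_id, []).extend(cluster)               # step S105
    return assignment
```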
In one embodiment, the above method may also categorize videos in the following way, which requires no face clustering but instead roughly assumes that a video and a picture shot at the same time and place involve the same person and can therefore be put into the same category; this approach has a certain degree of accuracy and is fast. As shown in FIG. 3, the above method may further include steps S301-S303. Step S301: acquire the shooting time and shooting location of the video. Step S302: determine a target picture whose shooting time and shooting location are the same as those of the video. Step S303: assign the video to the picture category to which the target picture belongs. A sketch of this time-and-place matching is given below.
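The sketch below illustrates steps S301-S303 under the assumption that "same time and place" means within a small time window and coordinate radius; the metadata fields and the tolerance values are illustrative choices, not specified by the patent.

```python
from dataclasses import dataclass

@dataclass
class MediaItem:
    timestamp: float        # shooting time, e.g. a Unix timestamp in seconds
    latitude: float         # shooting location
    longitude: float
    category_id: int = -1   # picture category the item belongs to (-1 if none)

def assign_by_time_and_place(video, pictures, time_tol=3600.0, degree_tol=0.001):
    """Steps S301-S303: assign the video to the category of a picture taken at
    (approximately) the same time and place. Tolerances are illustrative."""
    for picture in pictures:
        same_time = abs(video.timestamp - picture.timestamp) <= time_tol
        same_place = (abs(video.latitude - picture.latitude) <= degree_tol
                      and abs(video.longitude - picture.longitude) <= degree_tol)
        if same_time and same_place and picture.category_id != -1:
            return picture.category_id      # step S303: category of the target picture
    return None
```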
A second aspect of the embodiments of the present disclosure provides a video categorization apparatus. The apparatus may be used in an application that manages multimedia files, in which case the videos, picture categories, and pictures under those categories involved in the apparatus are those stored on the device on which the application runs. The apparatus may also be used in an electronic device that stores multimedia files, in which case the videos, picture categories, and pictures under those categories involved in the apparatus are those stored in the electronic device. The application or electronic device may trigger the apparatus to operate automatically on a periodic basis, upon receiving an instruction from the user, or automatically upon detecting that at least one new video has been produced. The apparatus can be triggered in many other ways as well, not limited to the examples listed here; its ultimate purpose is to classify videos intelligently and to save manual effort. As shown in FIG. 4, the apparatus includes:
a first acquiring module 41, configured to acquire a key frame containing a human face from a video;
a second acquiring module 42, configured to acquire the face feature in the key frame acquired by the first acquiring module 41;
a third acquiring module 43, configured to acquire the face feature corresponding to a picture category;
a first determining module 44, configured to determine, according to the face feature in the key frame acquired by the second acquiring module 42 and the face feature corresponding to the picture category acquired by the third acquiring module 43, the picture category to which the video belongs; and
a first assigning module 45, configured to assign the video to the picture category, determined by the first determining module 44, to which the video belongs.
The above apparatus provided by the embodiments of the present disclosure can intelligently and automatically classify videos together with pictures; it requires no manual classification by the user and, because the classification is based on face features, its accuracy is high.
In one embodiment, as shown in FIG. 5, the first acquiring module 41 includes:
an acquiring submodule 51, configured to acquire, from the video, at least one video frame containing a human face;
a first determining submodule 52, configured to determine a face parameter in each of the at least one video frame acquired by the acquiring submodule 51, the face parameter including either or both of the number of faces and the face positions; and
a second determining submodule 53, configured to determine the key frame in the video according to the face parameter in each video frame.
In one embodiment, the second determining submodule 53 is further configured to determine, according to the face parameter in each video frame, non-repeated video frames whose face parameters do not recur in other video frames, and to determine at least one non-repeated video frame as the key frame. That is, a non-repeated video frame is a video frame whose face parameter differs from that of every other video frame, meaning its face picture does not recur elsewhere in the video; therefore, one or more non-repeated video frames may be selected arbitrarily as key frames.
In one embodiment, the second determining submodule 53 is further configured to determine, according to the face parameter in each video frame, at least one group of repeated video frames having the same face parameter, where each group includes at least two video frames, the difference between the capture time of the latest-captured frame and that of the earliest-captured frame in each group is less than or equal to a preset duration, and all frames in a group share the same face parameter; and to determine any one video frame in each group of repeated video frames as the key frame.
The preset duration can be set in advance. Since the same picture in a video usually does not last very long, the preset duration should not be too long. Considering that video is played at 24 frames per second, the preset duration can be kept within N/24 seconds, where N is greater than or equal to 1 and less than or equal to 24 (or 36, or another value chosen as needed); the shorter the preset duration, the more accurate the finally selected key frames. In other words, the face picture of every video frame in a group of repeated video frames is the same, i.e., the same face picture appears in multiple video frames. Therefore, any one video frame in each group of repeated video frames can be selected as the key frame, which removes duplicates and improves the efficiency of key frame selection.
In one embodiment, as shown in FIG. 6, the first determining module 44 includes:
a third determining submodule 61, configured to: when there are at least two videos, determine the face feature in the key frame of each video; perform face clustering on the at least two videos according to the face feature in the key frame of each video to obtain at least one video category, where one video category corresponds to one face (specifically, the face clustering method shown in steps A1-A5 above may be applied to the key frames to obtain at least one class, each class being a video category, so that each video category corresponds to one face feature, and the video category to which a video's key frame belongs is the video category to which the video belongs); and determine, according to the face feature corresponding to each of the at least one video category and the face feature corresponding to the picture category, the video category and the picture category that correspond to the same face feature.
The first assigning module 45 includes:
a first assigning submodule 62, configured to assign the videos in each video category determined by the third determining submodule 61 to the picture category corresponding to the same face feature.
In the above apparatus, the videos are first face-clustered to obtain video categories, then the video categories and the picture categories are matched by face clustering to find the video category and picture category corresponding to the same face, and the videos in each video category are assigned to the picture category corresponding to the same face feature, thereby achieving the categorization of the videos.
In one embodiment, as shown in FIG. 7, the first determining module 44 includes:
a fourth determining submodule 71, configured to determine, among the face features corresponding to the picture categories, the picture category that matches the face feature in the key frame; and
a second assigning submodule 72, configured to determine the matched picture category determined by the fourth determining submodule 71 as the picture category to which the video belongs.
In one embodiment, as shown in FIG. 8, the above apparatus further includes:
a fourth acquiring module 81, configured to acquire the shooting time and shooting location of the video;
a second determining module 82, configured to determine a target picture whose shooting time and shooting location are the same as those of the video acquired by the fourth acquiring module 81; and
a second assigning module 83, configured to assign the video to the picture category to which the target picture determined by the second determining module 82 belongs.
The above apparatus requires no face clustering; instead, it roughly assumes that a video and a picture shot at the same time and place involve the same person and can be put into the same category. This approach has a certain degree of accuracy and is fast.
According to a third aspect of the embodiments of the present disclosure, a video classification apparatus is provided, including:
a processor; and
a memory for storing instructions executable by the processor;
wherein the processor is configured to:
acquire a key frame containing a human face from a video;
acquire the face feature in the key frame;
acquire the face feature corresponding to a picture category;
determine, according to the face feature in the key frame and the face feature corresponding to the picture category, the picture category to which the video belongs; and
assign the video to the picture category to which the video belongs.
FIG. 9 is a block diagram of an apparatus 800 for video categorization according to an exemplary embodiment. For example, the apparatus 800 may be a mobile device such as a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, a fitness device, or a personal digital assistant.
Referring to FIG. 9, the apparatus 800 may include one or more of the following components: a processing component 802, a memory 804, a power component 806, a multimedia component 808, an audio component 810, an input/output (I/O) interface 812, a sensor component 814, and a communication component 816.
The processing component 802 typically controls the overall operation of the apparatus 800, such as operations associated with display, telephone calls, data communication, camera operation, and recording operation. The processing component 802 may include one or more processors 820 to execute instructions to complete all or part of the steps of the above methods. In addition, the processing component 802 may include one or more modules to facilitate interaction between the processing component 802 and the other components. For example, the processing component 802 may include a multimedia module to facilitate interaction between the multimedia component 808 and the processing component 802.
The memory 804 is configured to store various types of data to support operation of the apparatus 800. Examples of such data include instructions for any application or method operating on the apparatus 800, contact data, phone book data, messages, pictures, videos, and the like. The memory 804 may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disk.
The power component 806 provides power to the various components of the apparatus 800. The power component 806 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the apparatus 800.
The multimedia component 808 includes a screen providing an output interface between the apparatus 800 and the user. In some embodiments, the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. The touch sensors may not only sense the boundary of a touch or swipe action but also detect the duration and pressure associated with the touch or swipe operation. In some embodiments, the multimedia component 808 includes a front camera and/or a rear camera. When the apparatus 800 is in an operating mode, such as a shooting mode or a video mode, the front camera and/or the rear camera can receive external multimedia data. Each front or rear camera may be a fixed optical lens system or have focal length and optical zoom capability.
The audio component 810 is configured to output and/or input audio signals. For example, the audio component 810 includes a microphone (MIC) configured to receive external audio signals when the apparatus 800 is in an operating mode such as a call mode, a recording mode, or a voice recognition mode. The received audio signal may be further stored in the memory 804 or transmitted via the communication component 816. In some embodiments, the audio component 810 further includes a speaker for outputting audio signals.
The I/O interface 812 provides an interface between the processing component 802 and peripheral interface modules, which may be a keyboard, a click wheel, buttons, and the like. These buttons may include, but are not limited to, a home button, a volume button, a start button, and a lock button.
The sensor component 814 includes one or more sensors for providing status assessments of various aspects of the apparatus 800. For example, the sensor component 814 may detect the on/off state of the apparatus 800 and the relative positioning of components, for example the display and the keypad of the apparatus 800; the sensor component 814 may also detect a change in position of the apparatus 800 or of one of its components, the presence or absence of user contact with the apparatus 800, the orientation or acceleration/deceleration of the apparatus 800, and temperature changes of the apparatus 800. The sensor component 814 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact. The sensor component 814 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 814 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 816 is configured to facilitate wired or wireless communication between the apparatus 800 and other devices. The apparatus 800 may access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof. In one exemplary embodiment, the communication component 816 receives a broadcast signal or broadcast-related information from an external broadcast management system via a broadcast channel. In one exemplary embodiment, the communication component 816 further includes a near field communication (NFC) module to facilitate short-range communication. For example, the NFC module may be implemented based on radio frequency identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the apparatus 800 may be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic elements, for performing the above methods.
In an exemplary embodiment, there is also provided a non-transitory computer-readable storage medium including instructions, such as the memory 804 including instructions, which are executable by the processor 820 of the apparatus 800 to perform the above methods. For example, the non-transitory computer-readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
There is provided a non-transitory computer-readable storage medium; when the instructions in the storage medium are executed by a processor of a mobile terminal, the mobile terminal is enabled to perform a video categorization method, the method including:
acquiring a key frame containing a human face from a video;
acquiring the face feature in the key frame;
acquiring the face feature corresponding to a picture category;
determining, according to the face feature in the key frame and the face feature corresponding to the picture category, the picture category to which the video belongs; and
assigning the video to the picture category to which the video belongs.
Other embodiments of the present disclosure will be readily apparent to those skilled in the art from consideration of the specification and practice of the disclosure herein. This application is intended to cover any variations, uses, or adaptations of the present disclosure that follow its general principles and include common general knowledge or customary technical means in the art not disclosed herein. The specification and examples are to be regarded as exemplary only, with the true scope and spirit of the disclosure being indicated by the following claims.
It is to be understood that the present disclosure is not limited to the precise structures described above and illustrated in the accompanying drawings, and that various modifications and changes may be made without departing from its scope. The scope of the present disclosure is limited only by the appended claims.

Claims (15)

  1. A video categorization method, comprising:
    acquiring a key frame including a face from a video;
    acquiring a face feature in the key frame;
    acquiring a face feature corresponding to a picture category;
    determining, according to the face feature in the key frame and the face feature corresponding to the picture category, a picture category to which the video belongs; and
    assigning the video to the picture category to which the video belongs.
  2. The method according to claim 1, wherein acquiring the key frame including a face from the video comprises:
    acquiring at least one video frame including a face from the video;
    determining a face parameter in each video frame of the at least one video frame, the face parameter comprising either or both of a face number and a face position; and
    determining the key frame in the video according to the face parameter in each video frame.
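For illustration only, a sketch of the frame-level step in claim 2: sample frames from the video, detect faces, and record the face parameters (face count and face positions). OpenCV's bundled Haar cascade is used purely as a stand-in detector; the detector choice, the sampling stride, and all names below are assumptions rather than the claimed implementation.

```python
import cv2

def face_parameters(video_path, sample_stride=30):
    """Yield (frame_index, face_count, face_boxes) for sampled frames that contain faces."""
    detector = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    cap = cv2.VideoCapture(video_path)
    index = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if index % sample_stride == 0:
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
            if len(faces) > 0:
                # Face parameters per the claim: number of faces and their positions.
                yield index, len(faces), [tuple(box) for box in faces]
        index += 1
    cap.release()
```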
  3. The method according to claim 2, wherein determining the key frame in the video according to the face parameter in each video frame comprises:
    determining, according to the face parameter in each video frame, non-repetitive video frames whose face parameters do not recur in other video frames; and
    determining at least one of the non-repetitive video frames as the key frame.
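One possible reading of the claim 3 rule, sketched under the assumption that two frames "repeat" when their face count and face positions are exactly equal; a real system would more likely compare positions with a tolerance. The input format mirrors the hypothetical face_parameters sketch above.

```python
from collections import Counter

def non_repetitive_key_frames(frame_params):
    """frame_params: list of (frame_index, face_count, face_boxes) tuples.
    Returns the frames whose face parameters appear in no other frame."""
    signature = lambda p: (p[1], tuple(sorted(p[2])))  # (face count, sorted face boxes)
    counts = Counter(signature(p) for p in frame_params)
    return [p for p in frame_params if counts[signature(p)] == 1]
```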
  4. The method according to claim 2, wherein determining the key frame in the video according to the face parameter in each video frame comprises:
    determining, according to the face parameter in each video frame, at least one group of repetitive video frames having the same face parameter, wherein each group of repetitive video frames includes at least two video frames, the difference between the capture time of the latest-captured video frame and the capture time of the earliest-captured video frame in each group is less than or equal to a preset duration, and all video frames in each group have the same face parameter; and
    determining any one video frame in each group of repetitive video frames as the key frame.
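A hedged sketch of the claim 4 grouping rule, assuming frames arrive sorted by capture time (in seconds) and that equality of face parameters is exact; the 5-second preset duration is an arbitrary placeholder, not a value taken from the disclosure.

```python
def key_frames_from_repetitive_groups(frames, preset_duration=5.0):
    """frames: list of (capture_time, face_params) tuples sorted by capture_time.
    Returns one key frame per group of repetitive frames (same face params,
    time span <= preset_duration, at least two frames per group)."""
    key_frames, group = [], []
    for frame in frames:
        # Close the current group if the face parameters change or the time span is exceeded.
        if group and (frame[1] != group[0][1] or
                      frame[0] - group[0][0] > preset_duration):
            if len(group) >= 2:                  # qualifies as a repetitive group
                key_frames.append(group[0])      # any member may serve as the key frame
            group = []
        group.append(frame)
    if len(group) >= 2:
        key_frames.append(group[0])
    return key_frames
```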
  5. The method according to claim 1, wherein
    determining, according to the face feature in the key frame and the face feature corresponding to the picture category, the picture category to which the video belongs comprises:
    when the number of videos is at least two, determining the face feature in the key frame of each video;
    performing face clustering on the at least two videos according to the face feature in the key frame of each video, to obtain at least one video category; and
    determining, according to the face feature corresponding to each of the at least one video category and the face feature corresponding to the picture category, a video category and a picture category corresponding to the same face feature; and
    assigning the video to the picture category to which the video belongs comprises:
    assigning the videos in each video category to the picture category corresponding to the same face feature.
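An illustrative sketch of the claim 5 clustering step, assuming each video is summarized by a single key-frame face embedding. DBSCAN with a cosine metric stands in for the unspecified face clustering algorithm; the eps value and all names are assumptions.

```python
import numpy as np
from sklearn.cluster import DBSCAN

def cluster_videos(video_features):
    """video_features: list of per-video key-frame face embeddings (1-D np.ndarray).
    Returns a mapping: video category label -> list of video indices."""
    labels = DBSCAN(eps=0.4, min_samples=1, metric="cosine").fit_predict(
        np.vstack(video_features))
    clusters = {}
    for video_index, label in enumerate(labels):
        clusters.setdefault(label, []).append(video_index)
    return clusters

def match_clusters_to_categories(clusters, video_features, category_features):
    """Pair each video category with the picture category of the same face feature,
    here approximated by the most cosine-similar category reference feature."""
    def similarity(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))
    mapping = {}
    for label, members in clusters.items():
        centroid = np.mean([video_features[i] for i in members], axis=0)
        mapping[label] = max(category_features,
                             key=lambda c: similarity(centroid, category_features[c]))
    return mapping
```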
  6. The method according to claim 1, wherein determining, according to the face feature in the key frame and the face feature corresponding to the picture category, the picture category to which the video belongs comprises:
    determining, among the face features corresponding to the picture categories, a picture category matching the face feature in the key frame; and
    determining the matched picture category as the picture category to which the video belongs.
  7. The method according to claim 1, further comprising:
    acquiring a shooting time and a shooting location of the video;
    determining a target picture having the same shooting time and shooting location as the video; and
    assigning the video to the picture category to which the target picture belongs.
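A sketch of the claim 7 metadata-based assignment, assuming the shooting time is available as Unix seconds and the shooting location as latitude/longitude; the matching tolerances are arbitrary placeholders, since the claim only requires "the same" time and location, and real EXIF/video metadata parsing is omitted.

```python
def find_matching_picture(video_meta, pictures, time_tolerance=60.0, loc_tolerance=1e-3):
    """video_meta and each picture: dict with 'time' (Unix seconds), 'lat'/'lon'
    (degrees); each picture additionally carries its 'category'.
    Returns the category of the first picture shot at (roughly) the same time and place."""
    for picture in pictures:
        same_time = abs(picture["time"] - video_meta["time"]) <= time_tolerance
        same_place = (abs(picture["lat"] - video_meta["lat"]) <= loc_tolerance and
                      abs(picture["lon"] - video_meta["lon"]) <= loc_tolerance)
        if same_time and same_place:
            return picture["category"]
    return None
```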
  8. A video categorization apparatus, comprising:
    a first acquiring module, configured to acquire a key frame including a face from a video;
    a second acquiring module, configured to acquire a face feature in the key frame acquired by the first acquiring module;
    a third acquiring module, configured to acquire a face feature corresponding to a picture category;
    a first determining module, configured to determine, according to the face feature in the key frame acquired by the second acquiring module and the face feature corresponding to the picture category acquired by the third acquiring module, a picture category to which the video belongs; and
    a first assigning module, configured to assign the video to the picture category, determined by the first determining module, to which the video belongs.
  9. The apparatus according to claim 8, wherein the first acquiring module comprises:
    an acquiring submodule, configured to acquire at least one video frame including a face from the video;
    a first determining submodule, configured to determine a face parameter in each video frame of the at least one video frame acquired by the acquiring submodule, the face parameter comprising either or both of a face number and a face position; and
    a second determining submodule, configured to determine the key frame in the video according to the face parameter in each video frame.
  10. The apparatus according to claim 9, wherein
    the second determining submodule is further configured to determine, according to the face parameter in each video frame, non-repetitive video frames whose face parameters do not recur in other video frames, and to determine at least one of the non-repetitive video frames as the key frame.
  11. The apparatus according to claim 9, wherein
    the second determining submodule is further configured to determine, according to the face parameter in each video frame, at least one group of repetitive video frames having the same face parameter, wherein each group of repetitive video frames includes at least two video frames, the difference between the capture time of the latest-captured video frame and the capture time of the earliest-captured video frame in each group is less than or equal to a preset duration, and all video frames in each group have the same face parameter; and to determine any one video frame in each group of repetitive video frames as the key frame.
  12. The apparatus according to claim 8, wherein
    the first determining module comprises:
    a third determining submodule, configured to: when the number of videos is at least two, determine the face feature in the key frame of each video; perform face clustering on the at least two videos according to the face feature in the key frame of each video, to obtain at least one video category; and determine, according to the face feature corresponding to each of the at least one video category and the face feature corresponding to the picture category, a video category and a picture category corresponding to the same face feature; and
    the first assigning module comprises:
    a first assigning submodule, configured to assign the videos in each video category determined by the third determining submodule to the picture category corresponding to the same face feature.
  13. The apparatus according to claim 8, wherein the first determining module comprises:
    a fourth determining submodule, configured to determine, among the face features corresponding to the picture categories, a picture category matching the face feature in the key frame; and
    a second assigning submodule, configured to determine the matched picture category, determined by the fourth determining submodule, as the picture category to which the video belongs.
  14. The apparatus according to claim 8, further comprising:
    a fourth acquiring module, configured to acquire a shooting time and a shooting location of the video;
    a second determining module, configured to determine a target picture having the same shooting time and shooting location as the video acquired by the fourth acquiring module; and
    a second assigning module, configured to assign the video to the picture category to which the target picture determined by the second determining module belongs.
  15. A video classification apparatus, comprising:
    a processor; and
    a memory for storing instructions executable by the processor;
    wherein the processor is configured to:
    acquire a key frame including a face from a video;
    acquire a face feature in the key frame;
    acquire a face feature corresponding to a picture category;
    determine, according to the face feature in the key frame and the face feature corresponding to the picture category, a picture category to which the video belongs; and
    assign the video to the picture category to which the video belongs.
PCT/CN2015/099610 2015-12-01 2015-12-29 Video classification method and apparatus WO2017092127A1 (en)

Priority Applications (4)

Application Number Priority Date Filing Date Title
RU2016136707A RU2667027C2 (en) 2015-12-01 2015-12-29 Method and device for video categorization
JP2016523976A JP6423872B2 (en) 2015-12-01 2015-12-29 Video classification method and apparatus
MX2016005882A MX2016005882A (en) 2015-12-01 2015-12-29 Video classification method and apparatus.
KR1020167010359A KR101952486B1 (en) 2015-12-01 2015-12-29 Video categorization method and apparatus

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510867436.5 2015-12-01
CN201510867436.5A CN105426515B (en) 2015-12-01 2015-12-01 video classifying method and device

Publications (1)

Publication Number Publication Date
WO2017092127A1 true WO2017092127A1 (en) 2017-06-08

Family

ID=55504727

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2015/099610 WO2017092127A1 (en) 2015-12-01 2015-12-29 Video classification method and apparatus

Country Status (8)

Country Link
US (1) US10115019B2 (en)
EP (1) EP3176709A1 (en)
JP (1) JP6423872B2 (en)
KR (1) KR101952486B1 (en)
CN (1) CN105426515B (en)
MX (1) MX2016005882A (en)
RU (1) RU2667027C2 (en)
WO (1) WO2017092127A1 (en)

Families Citing this family (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106227868A (en) * 2016-07-29 2016-12-14 努比亚技术有限公司 The classifying method of video file and device
CN106453916B (en) * 2016-10-31 2019-05-31 努比亚技术有限公司 Object classification device and method
KR20190007816A (en) 2017-07-13 2019-01-23 삼성전자주식회사 Electronic device for classifying video and operating method thereof
CN108830151A (en) * 2018-05-07 2018-11-16 国网浙江省电力有限公司 Mask detection method based on gauss hybrid models
CN108986184B (en) * 2018-07-23 2023-04-18 Oppo广东移动通信有限公司 Video creation method and related device
CN110334753B (en) * 2019-06-26 2023-04-07 Oppo广东移动通信有限公司 Video classification method and device, electronic equipment and storage medium
CN110516624A (en) * 2019-08-29 2019-11-29 北京旷视科技有限公司 Image processing method, device, electronic equipment and storage medium
CN110580508A (en) * 2019-09-06 2019-12-17 捷开通讯(深圳)有限公司 video classification method and device, storage medium and mobile terminal
CN111177086A (en) * 2019-12-27 2020-05-19 Oppo广东移动通信有限公司 File clustering method and device, storage medium and electronic equipment
CN111553191A (en) * 2020-03-30 2020-08-18 深圳壹账通智能科技有限公司 Video classification method and device based on face recognition and storage medium
CN112069875A (en) * 2020-07-17 2020-12-11 北京百度网讯科技有限公司 Face image classification method and device, electronic equipment and storage medium
CN112835807B (en) * 2021-03-02 2022-05-31 网易(杭州)网络有限公司 Interface identification method and device, electronic equipment and storage medium
CN115115822B (en) * 2022-06-30 2023-10-31 小米汽车科技有限公司 Vehicle-end image processing method and device, vehicle, storage medium and chip

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040228504A1 (en) * 2003-05-13 2004-11-18 Viswis, Inc. Method and apparatus for processing image
CN103207870A (en) * 2012-01-17 2013-07-17 华为技术有限公司 Method, server, device and system for photo sort management
CN103530652A (en) * 2013-10-23 2014-01-22 北京中视广信科技有限公司 Face clustering based video categorization method and retrieval method as well as systems thereof
CN103827856A (en) * 2011-09-27 2014-05-28 惠普发展公司,有限责任合伙企业 Retrieving visual media
CN104284240A (en) * 2014-09-17 2015-01-14 小米科技有限责任公司 Video browsing method and device
CN104317932A (en) * 2014-10-31 2015-01-28 小米科技有限责任公司 Photo sharing method and device

Family Cites Families (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2005227957A (en) * 2004-02-12 2005-08-25 Mitsubishi Electric Corp Optimal face image recording device and optimal face image recording method
AR052601A1 (en) 2005-03-10 2007-03-21 Qualcomm Inc CLASSIFICATION OF CONTENTS FOR MULTIMEDIA PROCESSING
JP4616091B2 (en) * 2005-06-30 2011-01-19 株式会社西部技研 Rotary gas adsorption concentrator
US8150155B2 (en) * 2006-02-07 2012-04-03 Qualcomm Incorporated Multi-mode region-of-interest video object segmentation
KR100771244B1 (en) * 2006-06-12 2007-10-29 삼성전자주식회사 Method and apparatus for processing video data
JP4697106B2 (en) * 2006-09-25 2011-06-08 ソニー株式会社 Image processing apparatus and method, and program
JP2008117271A (en) * 2006-11-07 2008-05-22 Olympus Corp Object recognition device of digital image, program and recording medium
US8488901B2 (en) * 2007-09-28 2013-07-16 Sony Corporation Content based adjustment of an image
JP5278425B2 (en) * 2008-03-14 2013-09-04 日本電気株式会社 Video segmentation apparatus, method and program
JP5134591B2 (en) * 2009-06-26 2013-01-30 京セラドキュメントソリューションズ株式会社 Wire locking structure
JP2011100240A (en) * 2009-11-05 2011-05-19 Nippon Telegr & Teleph Corp <Ntt> Representative image extraction method, representative image extraction device, and representative image extraction program
US8452778B1 (en) * 2009-11-19 2013-05-28 Google Inc. Training of adapted classifiers for video categorization
JP2011234180A (en) * 2010-04-28 2011-11-17 Panasonic Corp Imaging apparatus, reproducing device, and reproduction program
US9405771B2 (en) * 2013-03-14 2016-08-02 Microsoft Technology Licensing, Llc Associating metadata with images in a personal image collection
US9471675B2 (en) * 2013-06-19 2016-10-18 Conversant Llc Automatic face discovery and recognition for video content analysis
EP3089102B1 (en) * 2013-12-03 2019-02-20 ML Netherlands C.V. User feedback for real-time checking and improving quality of scanned image
CN104133875B (en) * 2014-07-24 2017-03-22 北京中视广信科技有限公司 Face-based video labeling method and face-based video retrieving method
CN104361128A (en) * 2014-12-05 2015-02-18 河海大学 Data synchronization method of PC (Personnel Computer) end and mobile terminal based on hydraulic polling business

Also Published As

Publication number Publication date
EP3176709A1 (en) 2017-06-07
US10115019B2 (en) 2018-10-30
KR20180081637A (en) 2018-07-17
JP6423872B2 (en) 2018-11-14
US20170154221A1 (en) 2017-06-01
KR101952486B1 (en) 2019-02-26
RU2016136707A3 (en) 2018-03-16
RU2016136707A (en) 2018-03-16
CN105426515A (en) 2016-03-23
MX2016005882A (en) 2017-08-02
RU2667027C2 (en) 2018-09-13
CN105426515B (en) 2018-12-18
JP2018502340A (en) 2018-01-25

Similar Documents

Publication Publication Date Title
WO2017092127A1 (en) Video classification method and apparatus
WO2017031875A1 (en) Method and apparatus for changing emotion icon in chat interface, and terminal device
WO2021031609A1 (en) Living body detection method and device, electronic apparatus and storage medium
WO2016090829A1 (en) Image shooting method and device
WO2016029641A1 (en) Photograph acquisition method and apparatus
WO2016090822A1 (en) Method and device for upgrading firmware
WO2017096782A1 (en) Method of preventing from blocking camera view and device
RU2648625C2 (en) Method and apparatus for determining spatial parameter by using image, and terminal device
WO2015169061A1 (en) Image segmentation method and device
US20170154206A1 (en) Image processing method and apparatus
WO2017084183A1 (en) Information displaying method and device
WO2018120906A1 (en) Buffer state report (bsr) report trigger method, device and user terminal
US10230891B2 (en) Method, device and medium of photography prompts
WO2021036382A9 (en) Image processing method and apparatus, electronic device and storage medium
JP6333990B2 (en) Panorama photo generation method and apparatus
WO2017000491A1 (en) Iris image acquisition method and apparatus, and iris recognition device
WO2018228422A1 (en) Method, device, and system for issuing warning information
CN106534951B (en) Video segmentation method and device
WO2016078394A1 (en) Voice call reminding method and device
WO2016110146A1 (en) Mobile terminal and virtual key processing method
US20170090684A1 (en) Method and apparatus for processing information
WO2017080084A1 (en) Font addition method and apparatus
WO2016173246A1 (en) Telephone call method and device based on name card in cloud
WO2017219497A1 (en) Message generation method and apparatus
WO2017140108A1 (en) Pressure detection method and apparatus

Legal Events

Date Code Title Description
ENP Entry into the national phase

Ref document number: 2016523976

Country of ref document: JP

Kind code of ref document: A

ENP Entry into the national phase

Ref document number: 20167010359

Country of ref document: KR

Kind code of ref document: A

WWE Wipo information: entry into national phase

Ref document number: MX/A/2016/005882

Country of ref document: MX

ENP Entry into the national phase

Ref document number: 2016136707

Country of ref document: RU

Kind code of ref document: A

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15909641

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 15909641

Country of ref document: EP

Kind code of ref document: A1