WO2020173379A1 - Picture grouping method and device - Google Patents
- Publication number
- WO2020173379A1 (PCT/CN2020/076040)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- face
- video
- category
- user
- electronic device
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/55—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/75—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/172—Classification, e.g. identification
Definitions
- the embodiments of the present application relate to the field of electronic technology, and in particular, to a method and device for grouping pictures.
- Current clustering methods mainly use face detection algorithms to detect faces and feature points in pictures (such as key points such as corners of eyes, nose tip, and mouth corners), extract facial features, and use facial features to cluster pictures.
- This method achieves high clustering accuracy for frontal face pictures, but low clustering accuracy for face pictures taken from other angles.
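The conventional pipeline described above (detect faces, extract features, cluster by feature similarity) can be sketched as follows. This is an illustrative toy, not the patented method: the hand-made feature vectors stand in for the output of a hypothetical face feature extractor, and a simple greedy threshold rule stands in for a production clustering algorithm.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two face feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def cluster_by_threshold(features, threshold=0.8):
    """Greedy clustering: assign each picture to the first cluster whose
    representative (first) feature is similar enough, else start a new cluster."""
    clusters = []  # each cluster: list of indices into `features`
    for i, feat in enumerate(features):
        for cluster in clusters:
            if cosine_similarity(feat, features[cluster[0]]) >= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return clusters

# Toy features: two near-identical frontal faces and one dissimilar side face.
feats = [[1.0, 0.0, 0.1], [0.9, 0.1, 0.1], [0.0, 1.0, 0.2]]
print(cluster_by_threshold(feats))  # → [[0, 1], [2]]
```

The side-face vector lands in its own cluster, which is exactly the accuracy problem the background paragraph notes for non-frontal pictures.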
- the embodiments of the present application provide a picture grouping method and device, which can cluster face pictures stored in an electronic device according to face images of different shapes in a reference image set obtained by the electronic device, and improve clustering accuracy.
- an embodiment of the present application provides a picture grouping method, which can be applied to an electronic device.
- the electronic device obtains at least one face picture.
- the method includes: the electronic device obtains at least one video. Then, the electronic device extracts multiple face image frames from at least one video.
- the electronic device performs clustering processing on at least one face picture according to the multiple face image frames. After that, the electronic device displays at least one group according to the clustering result, and each group includes at least one face picture of a user.
- the electronic device can use the multiple face image frames in the at least one video as prior information and cluster the face pictures according to those frames, thereby grouping the face pictures by user so that face pictures of the same user fall into the same group, which improves the accuracy of face picture clustering and grouping.
- the electronic device performs clustering processing on at least one face picture according to the multiple face image frames, including: the electronic device divides the multiple face image frames into at least one category, where each category corresponds to multiple face image frames of different forms of one user. The electronic device then performs clustering processing on the at least one face picture according to the classification results of the multiple face image frames.
- the electronic device can group the face pictures and the divided categories into a group according to the classification results, or regroup the face pictures into a group.
- the electronic device can accurately group face pictures of different face angles, expressions, and other forms according to the different face images of different users, which improves the accuracy of clustering and grouping and reduces clustering dispersion.
- the electronic device classifying the multiple face image frames into at least one category includes: the electronic device separately classifies the face image frames in each video into at least one category. If the similarity between the facial features of a first face image frame in a first category and the facial features of a second face image frame in a second category is greater than or equal to a preset value, the electronic device merges the first category and the second category into the same category.
- the electronic device can first classify the face image frames within each video, and then merge categories from different videos whose face image frames have high similarity; that is, face image frames of the same user in different videos are merged into the same category.
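The cross-video merge step above can be sketched as follows, under stated assumptions: per-video categories are given as lists of already-extracted feature vectors (a hypothetical extractor), similarity is cosine similarity, and a small union-find structure merges any two categories containing a frame pair at or above the preset value. None of these implementation choices are mandated by the text; they are illustrative.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two face feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(x * x for x in b)))

def merge_categories(categories, threshold=0.8):
    """categories: one list of face feature vectors per per-video category.
    Merge any two categories that contain a pair of frames whose feature
    similarity is >= threshold, using union-find to chain merges."""
    parent = list(range(len(categories)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(categories)):
        for j in range(i + 1, len(categories)):
            if any(cosine_similarity(f1, f2) >= threshold
                   for f1 in categories[i] for f2 in categories[j]):
                parent[find(i)] = find(j)  # union

    merged = {}
    for i, cat in enumerate(categories):
        merged.setdefault(find(i), []).extend(cat)
    return list(merged.values())
```

With three per-video categories where the first two hold near-identical features, `merge_categories` returns two categories: the same user's frames from different videos end up together.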
- the electronic device dividing the face image frames in each video into at least one category includes: the electronic device uses a face tracking algorithm to classify temporally continuous face image frames of the same user in each video into the same category.
- the face image frames of the same user with temporal continuity may be adjacent image frames.
- the face images tracked in the same video by the face tracking algorithm have temporal continuity and satisfy the must-link constraint: they belong to the same user's face and can therefore be classified into the same category.
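A minimal sketch of how temporal continuity could yield such must-link categories, assuming a simple stand-in tracker: faces are linked across adjacent frames when their bounding boxes overlap strongly (intersection-over-union), which is one common tracking heuristic; the text does not specify a particular face tracking algorithm, so both the IoU rule and its threshold are assumptions.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two (x1, y1, x2, y2) face boxes."""
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def track_faces(detections, iou_threshold=0.5):
    """detections: per-frame lists of face boxes, in temporal order.
    Link a face to an open track when its box overlaps the track's last
    box from the previous frame; otherwise open a new track. Each track
    is one must-link category of the same user's face image frames."""
    tracks = []  # each track: list of (frame_index, box)
    for t, boxes in enumerate(detections):
        for box in boxes:
            for track in tracks:
                last_t, last_box = track[-1]
                if last_t == t - 1 and iou(last_box, box) >= iou_threshold:
                    track.append((t, box))
                    break
            else:
                tracks.append([(t, box)])
    return tracks
```

A face box drifting slightly between frames 0 and 1 stays in one track, while a box appearing elsewhere in frame 2 starts a new track, i.e. a new candidate category.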
- each group also includes any one or a combination of the following: the video in which the user's face image frame is located, the video segment in which the user's face image frame is located, or at least one face image frame of the user.
- the electronic device can not only group face pictures, but also group videos, video segments, and face image frames, and jointly manage face pictures together with videos, video segments, and face image frames, improving the user's search efficiency and management experience.
- At least one face picture of a user included in each group is a single photo or a group photo.
- obtaining at least one video by the electronic device includes: the electronic device obtains at least one video from a storage area of the electronic device.
- the at least one video may be a video previously captured, downloaded, copied, or obtained by other means by the electronic device.
- the electronic device acquiring at least one video includes: the electronic device prompts the user to shoot a video including human face image frames.
- the electronic device records and generates at least one video after detecting the operation instructed by the user to shoot the video.
- the electronic device can record a video in real time for use in grouping face pictures.
- the method further includes: the electronic device acquires at least one image group, and each image group includes multiple image frames of the same user in different forms.
- At least one image group includes any one or a combination of the following: moving pictures, a pre-photographed image group that includes different forms of the same user's face, an image group formed by multiple frames of images collected in real time during shooting preview, or an image group formed by multiple frames of images taken during continuous shooting.
- the electronic device extracting multiple face image frames from at least one video includes: the electronic device extracts multiple face image frames from at least one video and at least one image group.
- the electronic device can classify face pictures not only according to videos, but also according to various image groups including user face picture frames of different forms, such as moving pictures.
- in response to the user's instruction, the electronic device performs clustering processing on at least one face picture according to the multiple face image frames; and according to the clustering result, displays at least one group, each group including at least one face picture of a user.
- the electronic device can display the grouping result of the face pictures in response to the user's instruction.
- after the album is opened, the electronic device automatically performs clustering processing on at least one face picture according to the multiple face image frames; and according to the clustering result, displays at least one group, each group including at least one face picture of a user.
- that is, after the album is opened, the electronic device can automatically perform the clustering and group display processing.
- during charging, when the battery level is higher than a preset level, the electronic device automatically performs clustering processing on at least one face picture according to the multiple face image frames; then, according to the clustering result, it displays at least one group, each group including at least one face picture of a user.
- the electronic device can automatically perform clustering and display grouping processing at different times.
- when the electronic device displays the at least one group, it may also prompt the user that the groups are obtained by grouping face pictures according to face image frames in videos.
- an embodiment of the present application provides a method for grouping pictures, which is applied to an electronic device.
- the electronic device stores at least one video and at least one face picture.
- the method includes: after the electronic device detects a user operation for viewing image classification, it displays at least one group.
- each group includes at least one face picture of a user, and any one or a combination of the following: the video in which the user's face image frame is located, the video segment in which the user's face image frame is located, or at least one face image frame of the user.
- an embodiment of the present application provides a picture grouping method, which is applied to an electronic device, and at least one face picture is stored on the electronic device.
- the method includes: the electronic device acquires at least one reference image set, where the reference image set includes a series of face image frames with temporal continuity. Then, the electronic device performs clustering processing on at least one face picture according to the face image frames. After that, the electronic device can display at least one group according to the clustering result, and each group includes at least one face picture of a user.
- the reference image set may be the face image frames in a video; the face image frames in an animation; a collection of temporally continuous multi-frame images collected in real time in the shooting preview state; a collection of temporally continuous multi-frame images captured in the capture mode; a collection of temporally continuous multi-frame images captured by the electronic device during continuous shooting; or a user-preset image group including different forms of the same user's face; etc.
- an embodiment of the present application provides a picture grouping method, which is applied to an electronic device, and at least one picture is stored on the electronic device.
- the method includes: the electronic device acquires at least one video, where the video includes image frames; performs clustering processing on at least one picture according to the image frames; and, according to the clustering result, displays at least one group, each group including at least one picture of an entity.
- the entity may include human faces, dogs, cats, houses, etc.
- an embodiment of the present application provides a picture grouping device, which is included in an electronic device, and the device has a function of implementing the behavior of the electronic device in any of the foregoing aspects and possible implementation manners.
- This function can be realized by hardware, or by hardware executing corresponding software.
- the hardware or software includes at least one module or unit corresponding to the above-mentioned functions, for example, an acquiring module or unit, an extracting module or unit, a clustering module or unit, and a displaying module or unit.
- an embodiment of the present application provides an electronic device, including at least one processor and at least one memory.
- the at least one memory is coupled with at least one processor, and the at least one memory is used to store computer program code, and the computer program code includes computer instructions.
- when the at least one processor executes the computer instructions, the electronic device is made to perform the picture grouping method in any of the possible implementations of the foregoing aspects.
- an embodiment of the present application provides a computer storage medium, including computer instructions, which when the computer instructions run on an electronic device, cause the electronic device to execute the picture grouping method in any of the possible implementations of the foregoing aspects.
- an embodiment of the present application provides a computer program product, which when the computer program product runs on a computer, causes the computer to execute the picture grouping method in any one of the possible implementations of the foregoing aspects.
- FIG. 1 is a schematic structural diagram of an electronic device provided by an embodiment of the application.
- Figure 2 is a schematic diagram of a set of interfaces provided by an embodiment of the application.
- FIG. 3 is a schematic diagram of an interface provided by an embodiment of the application.
- FIG. 4 is a schematic diagram of another interface provided by an embodiment of the application.
- Figure 5 is a schematic diagram of another interface provided by an embodiment of the application.
- FIG. 6 is a schematic diagram of another interface provided by an embodiment of the application.
- FIG. 7A is a schematic diagram of another interface provided by an embodiment of the application.
- FIG. 7B is a schematic diagram of a video and a face image frame in the video provided by an embodiment of the application.
- FIG. 8A is a schematic diagram of a classification effect provided by an embodiment of this application.
- FIG. 8B is a schematic diagram of another classification effect provided by an embodiment of the application.
- FIG. 9A is a schematic diagram of another interface provided by an embodiment of the application.
- FIG. 9B is a schematic diagram of another interface provided by an embodiment of the application.
- FIG. 9C is a schematic diagram of another interface provided by an embodiment of the application.
- FIG. 10 is a schematic diagram of another set of interfaces provided by an embodiment of the application.
- FIG. 11 is a schematic diagram of another set of interfaces provided by an embodiment of the application.
- FIG. 12 is a schematic diagram of another set of interfaces provided by an embodiment of the application.
- FIG. 13 is a schematic diagram of another set of interfaces provided by an embodiment of the application.
- FIG. 14 is a schematic diagram of another set of interfaces provided by an embodiment of the application.
- FIG. 15 is a schematic diagram of another interface provided by an embodiment of the application.
- FIG. 16 is a schematic diagram of another set of interfaces provided by an embodiment of the application.
- FIG. 17 is a schematic diagram of face image frames in an image group provided by an embodiment of the application.
- FIG. 18 is a schematic diagram of another set of interfaces provided by an embodiment of the application.
- FIG. 19A is a schematic diagram of another interface provided by an embodiment of this application.
- FIG. 19B is a schematic diagram of another interface provided by an embodiment of the application.
- FIG. 20 is a schematic diagram of a face image frame in another image group provided by an embodiment of the application.
- FIG. 21 is a schematic diagram of another interface provided by an embodiment of the application.
- FIG. 22 is a schematic diagram of another set of interfaces provided by an embodiment of the application.
- FIG. 23 is a flowchart of a method for grouping pictures according to an embodiment of this application.
- FIG. 24 is a schematic structural diagram of another electronic device provided by an embodiment of the application.
- the embodiment of the present application provides a method for grouping pictures, which can be applied to electronic devices.
- the electronic device may cluster the face pictures (that is, pictures containing the face images) stored on the electronic device according to the reference image set.
- the reference image set includes multiple face images of different shapes with temporal continuity.
- the form here can include the angle of the face (such as a side face, or a face tilted up or down), the facial expression (such as laughing, crying, or a funny expression), whether there is a beard, whether sunglasses are worn, whether the face is covered by a hat, whether the face is covered by hair, and so on.
- the face pictures stored on the electronic device refer to static pictures that exist independently.
- the reference image set may include a collection of a series of image frames with temporal continuity in the video acquired by the electronic device.
- the video may be a video taken by the camera of the electronic device, a video obtained by the electronic device from an application (App) (such as Douyin, Kuaishou, Meipai, YOYO, etc.), a video obtained by the electronic device from another device, or a video saved during a video call, etc.
- the reference image set may also include animated images (Gif) acquired by the electronic device, and the animated images include multiple frames of images with temporal continuity.
- the reference image set may also include an image group composed of a series of images with time continuity acquired by the electronic device.
- the image group may be a collection of time-continuous multi-frame images collected in real time by the electronic device in the shooting preview state.
- the image group may be a collection of time-continuous multi-frame images captured by the electronic device in the capture mode (the electronic device or the user may designate one of the images as the captured image).
- the image group may be a collection of multiple frames of images with time continuity captured by the electronic device during continuous shooting.
- the image group may be a user-preset image group including different forms of the face of the same user (for example, one or more of a pre-photographed frontal face image, side face image, and laughing face image of the same user), etc.
- the electronic device can use the face images in the reference image set as prior information. According to the face images of different forms in the reference image set, the electronic device clusters the stored face pictures, so that face pictures of different forms can also be accurately clustered, which improves the clustering accuracy of face pictures.
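The prior-information idea can be illustrated as follows, with heavy assumptions: `reference_categories` is a hypothetical mapping from a user label to feature vectors extracted from that user's temporally continuous reference frames (covering many face forms), and each stored still picture is assigned to the category containing its most similar reference frame. The actual method is not limited to this nearest-reference rule; this is only a sketch of why multi-form reference frames help non-frontal pictures.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two face feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(x * x for x in b)))

def group_pictures(picture_feats, reference_categories, threshold=0.7):
    """picture_feats: face features of the stored still pictures.
    reference_categories: {user_label: [feature, ...]} built from the
    reference image set (frames of the same user in different forms).
    Assign each picture to the category holding its most similar
    reference frame; pictures below `threshold` fall into 'ungrouped'."""
    groups = {label: [] for label in reference_categories}
    groups["ungrouped"] = []
    for i, feat in enumerate(picture_feats):
        best_label, best_sim = "ungrouped", threshold
        for label, refs in reference_categories.items():
            sim = max(cosine_similarity(feat, r) for r in refs)
            if sim >= best_sim:
                best_label, best_sim = label, sim
        groups[best_label].append(i)
    return groups
```

Because each user's category holds several reference forms (frontal, side, laughing, ...), a side-face still that matches any one of them is pulled into the right group rather than scattering into its own cluster.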
- the electronic device may be a mobile phone, a tablet computer, a wearable device, a vehicle-mounted device, an augmented reality (AR)/virtual reality (VR) device, a notebook computer, an ultra-mobile personal computer (UMPC), a netbook, a personal digital assistant (PDA), or the like.
- FIG. 1 shows a schematic structural diagram of an electronic device 100.
- the electronic device 100 may include a processor 110, an external memory interface 120, an internal memory 121, a universal serial bus (USB) interface 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, and an antenna 2.
- a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, an earphone jack 170D, a sensor module 180, a button 190, a motor 191, an indicator 192, a camera 193, a display screen 194, a subscriber identification module (SIM) card interface 195, and so on.
- the sensor module 180 may include a pressure sensor 180A, a gyroscope sensor 180B, an air pressure sensor 180C, a magnetic sensor 180D, an acceleration sensor 180E, a distance sensor 180F, a proximity light sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, an ambient light sensor 180L, a bone conduction sensor 180M, etc.
- the structure illustrated in the embodiment of the present application does not constitute a specific limitation on the electronic device 100.
- the electronic device 100 may include more or fewer components than shown, or combine certain components, or split certain components, or use a different arrangement of components.
- the components shown in the figure can be implemented in hardware, software or a combination of software and hardware.
- the processor 110 may include one or more processing units.
- the processor 110 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a memory, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU), etc.
- the controller can be the nerve center and command center of the electronic device 100.
- the controller can generate operation control signals according to the command operation code and timing signals to complete the control of fetching and executing commands.
- a memory may also be provided in the processor 110 to store instructions and data.
- the memory in the processor 110 is a cache memory.
- the memory can store instructions or data that the processor 110 has just used or uses cyclically. If the processor 110 needs to use the instructions or data again, it can call them directly from the memory. This avoids repeated accesses, reduces the waiting time of the processor 110, and improves system efficiency.
- the processor 110 may include one or more interfaces.
- the interface may include an inter-integrated circuit (I2C) interface, an inter-integrated circuit sound (I2S) interface, a pulse code modulation (PCM) interface, a universal asynchronous receiver/transmitter (UART) interface, a mobile industry processor interface (MIPI), a general-purpose input/output (GPIO) interface, a subscriber identity module (SIM) interface, and/or a universal serial bus (USB) interface, etc.
- the I2C interface is a two-way synchronous serial bus that includes a serial data line (SDA) and a serial clock line (SCL).
- the processor 110 may include multiple sets of I2C buses.
- the processor 110 may be coupled to the touch sensor 180K, charger, flash, camera 193, etc., through different I2C bus interfaces.
- the processor 110 may couple the touch sensor 180K through the I2C interface, so that the processor 110 and the touch sensor 180K communicate through the I2C bus interface to realize the touch function of the electronic device 100.
- the I2S interface can be used for audio communication.
- the processor 110 may include multiple sets of I2S buses.
- the processor 110 may be coupled with the audio module 170 through an I2S bus to implement communication between the processor 110 and the audio module 170.
- the audio module 170 may transmit audio signals to the wireless communication module 160 through the I2S interface, so as to realize the function of answering calls through the Bluetooth headset.
- the PCM interface can also be used for audio communication to sample, quantize and encode analog signals.
- the audio module 170 and the wireless communication module 160 may be coupled through a PCM bus interface.
- the audio module 170 may also transmit audio signals to the wireless communication module 160 through the PCM interface, so as to realize the function of answering calls through the Bluetooth headset. Both I2S interface and PCM interface can be used for audio communication.
- the UART interface is a universal serial data bus used for asynchronous communication.
- the bus can be a two-way communication bus. It converts the data to be transmitted between serial communication and parallel communication.
- the UART interface is generally used to connect the processor 110 and the wireless communication module 160.
- the processor 110 communicates with the Bluetooth module in the wireless communication module 160 through the UART interface to realize the Bluetooth function.
- the audio module 170 can transmit audio signals to the wireless communication module 160 through the UART interface, so as to realize the function of playing music through the Bluetooth headset.
- the MIPI interface can be used to connect the processor 110 with the display screen 194, the camera 193 and other peripheral devices.
- MIPI interface includes camera serial interface (CSI), display serial interface (DSI) and so on.
- the processor 110 and the camera 193 communicate through a CSI interface to implement the shooting function of the electronic device 100.
- the processor 110 and the display screen 194 communicate through the DSI interface to realize the display function of the electronic device 100.
- the GPIO interface can be configured through software.
- the GPIO interface can be configured as a control signal or as a data signal.
- the GPIO interface may be used to connect the processor 110 with the camera 193, the display screen 194, the wireless communication module 160, the audio module 170, the sensor module 180, and so on.
- the GPIO interface can also be configured as an I2C interface, an I2S interface, a UART interface, a MIPI interface, etc.
- the USB interface 130 is an interface that complies with the USB standard specification, and specifically may be a Mini USB interface, a Micro USB interface, a USB Type C interface, and so on.
- the USB interface 130 can be used to connect a charger to charge the electronic device 100, and can also be used to transfer data between the electronic device 100 and peripheral devices. It can also be used to connect headphones and play audio through headphones. This interface can also be used to connect other electronic devices, such as AR devices.
- the interface connection relationship between the modules illustrated in the embodiment of the present application is merely illustrative, and does not constitute a structural limitation of the electronic device 100.
- the electronic device 100 may also adopt different interface connection modes in the foregoing embodiments, or a combination of multiple interface connection modes.
- the charging management module 140 is used to receive charging input from the charger.
- the charger can be a wireless charger or a wired charger.
- the charging management module 140 may receive the charging input of the wired charger through the USB interface 130.
- the charging management module 140 may receive the wireless charging input through the wireless charging coil of the electronic device 100. While the charging management module 140 charges the battery 142, the power management module 141 can also supply power to the electronic device.
- the power management module 141 is used to connect the battery 142, the charging management module 140 and the processor 110.
- the power management module 141 receives input from the battery 142 and/or the charge management module 140, and supplies power to the processor 110, the internal memory 121, the external memory, the display screen 194, the camera 193, and the wireless communication module 160.
- the power management module 141 can also be used to monitor parameters such as battery capacity, battery cycle times, and battery health status (leakage, impedance).
- the power management module 141 may also be provided in the processor 110. In other embodiments, the power management module 141 and the charging management module 140 may also be provided in the same device.
- the wireless communication function of the electronic device 100 can be implemented by the antenna 1, the antenna 2, the mobile communication module 150, the wireless communication module 160, the modem processor, and the baseband processor.
- Antenna 1 and Antenna 2 are used to transmit and receive electromagnetic wave signals.
- Each antenna in the electronic device 100 can be used to cover a single or multiple communication frequency bands. Different antennas can also be reused to improve antenna utilization.
- Antenna 1 can be multiplexed as a diversity antenna for wireless LAN.
- the antenna can be used in combination with a tuning switch.
- the mobile communication module 150 can provide a wireless communication solution including 2G/3G/4G/5G and the like applied to the electronic device 100.
- the mobile communication module 150 may include at least one filter, a switch, a power amplifier, a low noise amplifier (LNA), etc.
- the mobile communication module 150 can receive electromagnetic waves by the antenna 1, and perform processing such as filtering and amplifying the received electromagnetic waves, and then transmitting them to the modem processor for demodulation.
- the mobile communication module 150 can also amplify the signal modulated by the modem processor, and convert it into electromagnetic waves through the antenna 1 for radiation.
- At least part of the functional modules of the mobile communication module 150 may be provided in the processor 110. In some embodiments, at least part of the functional modules of the mobile communication module 150 and at least part of the modules of the processor 110 may be provided in the same device.
- the modem processor may include a modulator and a demodulator.
- the modulator is used to modulate the low frequency baseband signal to be sent into a medium and high frequency signal.
- the demodulator is used to demodulate the received electromagnetic wave signal into a low-frequency baseband signal. Then the demodulator transmits the demodulated low-frequency baseband signal to the baseband processor for processing. After the low-frequency baseband signal is processed by the baseband processor, it is passed to the application processor.
- the application processor outputs sound signals through audio equipment (not limited to speakers 170A, receiver 170B, etc.), or displays images or videos through the display 194.
- the modem processor may be an independent device. In other embodiments, the modem processor may be independent of the processor 110 and be provided in the same device as the mobile communication module 150 or other functional modules.
- the wireless communication module 160 can provide wireless communication solutions applied to the electronic device 100, including wireless local area network (WLAN) (such as wireless fidelity (Wi-Fi) networks), Bluetooth (BT), global navigation satellite system (GNSS), frequency modulation (FM), near field communication (NFC), infrared (IR) technology, and so on.
- the wireless communication module 160 may be one or more devices integrating at least one communication processing module.
- the wireless communication module 160 receives electromagnetic waves via the antenna 2, frequency modulates and filters the electromagnetic wave signals, and sends the processed signals to the processor 110.
- the wireless communication module 160 may also receive a signal to be sent from the processor 110, perform frequency modulation, amplify it, and convert it into electromagnetic waves through the antenna 2 for radiation.
- the antenna 1 of the electronic device 100 is coupled with the mobile communication module 150, and the antenna 2 is coupled with the wireless communication module 160, so that the electronic device 100 can communicate with the network and other devices through wireless communication technology.
- the wireless communication technologies can include global system for mobile communications (GSM), general packet radio service (GPRS), code division multiple access (CDMA), wideband code division multiple access (WCDMA), time-division synchronous code division multiple access (TD-SCDMA), long term evolution (LTE), BT, GNSS, WLAN, NFC, FM, and/or IR technology, etc.
- GNSS can include the global positioning system (GPS), among other satellite systems.
- the electronic device 100 implements a display function through the GPU, the display screen 194, and the application processor.
- the GPU is a microprocessor for image processing, connected to the display 194 and the application processor.
- the GPU is used to perform mathematical and geometric calculations, and is used for graphics rendering.
- the processor 110 may include one or more GPUs, which execute program instructions to generate or change display information.
- the display screen 194 is used to display images, videos, etc.
- the display screen 194 includes a display panel.
- the display panel can adopt a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), a MiniLed, a MicroLed, a Micro-oLed, quantum dot light emitting diodes (QLED), etc.
- the electronic device 100 may include one or N display screens 194, and N is a positive integer greater than one.
- the electronic device 100 can implement shooting functions through an ISP, a camera 193, a video codec, a GPU, a display screen 194, and an application processor.
- the ISP is used to process the data fed back from the camera 193. For example, when taking a picture, the shutter opens, light is transmitted through the lens to the photosensitive element of the camera, the optical signal is converted into an electrical signal, and the photosensitive element passes the electrical signal to the ISP for processing, which converts it into an image visible to the naked eye.
- ISP can also optimize the image noise, brightness, and skin color.
- the ISP can also optimize the exposure, color temperature and other parameters of the shooting scene.
- the ISP may be provided in the camera 193.
- the camera 193 is used to capture still images or videos.
- the object generates an optical image through the lens and projects it to the photosensitive element.
- the photosensitive element can be a charge coupled device (CCD) or a complementary metal-oxide-semiconductor (CMOS) phototransistor.
- CMOS complementary metal-oxide-semiconductor
- the photosensitive element converts the optical signal into an electrical signal, and then transfers the electrical signal to the ISP to convert it into a digital image signal.
- ISP outputs digital image signals to DSP for processing.
- DSP converts digital image signals into standard RGB, YUV and other formats.
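As an illustration of the conversion step described above (not part of the embodiment itself), a DSP converting YUV samples to RGB could apply the common BT.601 full-range equations; the function below is a minimal per-pixel sketch under that assumption:

```python
def yuv_to_rgb(y, u, v):
    """Convert one full-range BT.601 YUV sample (0-255) to RGB.

    Illustrative sketch only; real DSPs operate on whole frames in
    fixed-point hardware.
    """
    r = y + 1.402 * (v - 128)
    g = y - 0.344136 * (u - 128) - 0.714136 * (v - 128)
    b = y + 1.772 * (u - 128)
    clamp = lambda x: max(0, min(255, int(round(x))))  # keep values in 8-bit range
    return clamp(r), clamp(g), clamp(b)
```

For a neutral gray sample (U = V = 128), the chroma terms vanish and R = G = B = Y, as expected.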
- the electronic device 100 may include one or N cameras 193, and N is a positive integer greater than one.
- the digital signal processor is used to process digital signals. In addition to processing digital image signals, it can also process other digital signals. For example, when the electronic device 100 selects a frequency point, the digital signal processor is used to perform a Fourier transform on the frequency point energy, and so on.
- Video codecs are used to compress or decompress digital video.
- the electronic device 100 may support one or more video codecs. In this way, the electronic device 100 can play or record videos in multiple encoding formats, such as moving picture experts group (MPEG)-1, MPEG-2, MPEG-3, MPEG-4, etc.
- NPU is a neural-network (NN) computing processor.
- through the NPU, applications such as intelligent cognition of the electronic device 100 can be implemented, for example image recognition, face recognition, voice recognition, text understanding, and so on.
- the NPU or another processor may be used to perform face detection, face tracking, facial feature extraction, and image clustering on the face images in videos stored by the electronic device 100; to perform operations such as face detection and facial feature extraction on the face images in stored pictures; and to cluster the pictures stored in the electronic device 100 according to the facial features of the pictures and the clustering results of the face images in the videos.
- the external memory interface 120 may be used to connect an external memory card, such as a Micro SD card, to expand the storage capacity of the electronic device 100.
- the external memory card communicates with the processor 110 through the external memory interface 120 to realize the data storage function. For example, save music, video and other files in an external memory card.
- the internal memory 121 may be used to store computer executable program code, and the executable program code includes instructions.
- the processor 110 executes various functional applications and data processing of the electronic device 100 by running instructions stored in the internal memory 121.
- the internal memory 121 may include a program storage area and a data storage area.
- the storage program area can store an operating system, at least one application program (such as a sound playback function, an image playback function, etc.) required by at least one function.
- the data storage area can store data (such as audio data, phone book, etc.) created during the use of the electronic device 100.
- the internal memory 121 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, a universal flash storage (UFS), and the like.
- the electronic device 100 can implement audio functions through an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, a headphone interface 170D, and an application processor. For example, music playback, recording, etc.
- the audio module 170 is used to convert digital audio information into an analog audio signal for output, and also used to convert an analog audio input into a digital audio signal.
- the audio module 170 can also be used to encode and decode audio signals.
- the audio module 170 may be disposed in the processor 110, or some functional modules of the audio module 170 may be disposed in the processor 110.
- the speaker 170A, also called the "horn", is used to convert audio electrical signals into sound signals.
- the electronic device 100 can listen to music through the speaker 170A, or listen to a hands-free call.
- the receiver 170B, also called the "handset", is used to convert audio electrical signals into sound signals.
- when the electronic device 100 answers a phone call or a voice message, the voice can be heard by bringing the receiver 170B close to the ear.
- the microphone 170C, also called the "mouthpiece" or "transmitter", is used to convert sound signals into electrical signals.
- when making a call or sending a voice message, the user can speak close to the microphone 170C to input a sound signal into it.
- the electronic device 100 may be provided with at least one microphone 170C.
- the electronic device 100 may be provided with two microphones 170C, which can not only collect sound signals, but also implement noise reduction functions.
- the electronic device 100 may also be provided with three, four or more microphones 170C, which can collect sound signals, reduce noise, identify sound sources, and realize directional recording functions.
- the earphone interface 170D is used to connect wired earphones.
- the earphone interface 170D can be the USB interface 130, a 3.5 mm open mobile terminal platform (OMTP) standard interface, or a cellular telecommunications industry association of the USA (CTIA) standard interface.
- the pressure sensor 180A is used to sense the pressure signal and can convert the pressure signal into an electrical signal.
- the pressure sensor 180A may be provided on the display screen 194.
- the capacitive pressure sensor may include at least two parallel plates with conductive material. When a force is applied to the pressure sensor 180A, the capacitance between the electrodes changes.
- the electronic device 100 determines the intensity of the pressure according to the change in capacitance. When a touch operation acts on the display 194, the electronic device 100 detects the intensity of the touch operation according to the pressure sensor 180A. The electronic device 100 may also calculate the touched position according to the detection signal of the pressure sensor 180A.
- touch operations acting on the same touch position but with different touch operation strengths may correspond to different operation instructions. For example: when a touch operation whose intensity is less than the first pressure threshold is applied to the short message application icon, the instruction to view the short message is executed. When a touch operation with a touch operation intensity greater than or equal to the first pressure threshold acts on the short message application icon, an instruction to create a new short message is executed.
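The pressure-dependent dispatch described above can be sketched as a simple threshold test. The threshold value and instruction names below are illustrative assumptions, not values from the embodiment:

```python
def dispatch_touch(app, pressure, first_pressure_threshold=0.5):
    """Map a touch on an app icon to an instruction based on touch pressure.

    Illustrative sketch: a light press on the short-message icon views
    messages; a firm press (>= threshold) creates a new message.
    The threshold value is an assumption.
    """
    if app == "short_message":
        if pressure < first_pressure_threshold:
            return "view_short_message"
        return "new_short_message"
    return "open_app"  # default behavior for other icons
```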
- the gyro sensor 180B can be used to determine the movement posture of the electronic device 100.
- the angular velocity of the electronic device 100 around three axes can be determined through the gyro sensor 180B.
- the gyro sensor 180B can be used for shooting anti-shake.
- the gyro sensor 180B detects the shake angle of the electronic device 100, calculates the distance that the lens module needs to compensate according to the angle, and allows the lens to counteract the shake of the electronic device 100 through a reverse movement to achieve anti-shake.
- the gyro sensor 180B can also be used for navigation and somatosensory game scenes.
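The anti-shake compensation above can be approximated with a simple optical model: for a detected shake angle theta and lens focal length f, the lens module must shift by roughly f * tan(theta) in the opposite direction. This is a simplified sketch under that assumed model, not the embodiment's actual algorithm:

```python
import math

def ois_compensation_mm(shake_angle_deg, focal_length_mm):
    """Lens displacement needed to counteract a shake angle.

    Simplified optical-image-stabilization model (assumed for
    illustration): displacement ~= f * tan(theta).
    """
    return focal_length_mm * math.tan(math.radians(shake_angle_deg))
```

With no shake the compensation is zero; a 1-degree shake on a 4 mm lens needs a displacement of about 0.07 mm.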
- the air pressure sensor 180C is used to measure air pressure.
- the electronic device 100 calculates the altitude based on the air pressure value measured by the air pressure sensor 180C, and assists positioning and navigation.
- the magnetic sensor 180D includes a Hall sensor.
- the electronic device 100 can use the magnetic sensor 180D to detect the opening and closing of the flip holster.
- when the electronic device 100 is a flip phone, it can detect the opening and closing of the flip cover according to the magnetic sensor 180D, and then set features such as automatic unlocking upon opening according to the detected open/closed state of the holster or the flip cover.
- the acceleration sensor 180E can detect the magnitude of the acceleration of the electronic device 100 in various directions (generally three axes). When the electronic device 100 is stationary, the magnitude and direction of gravity can be detected. It can also be used to identify the posture of electronic equipment, and can be used in applications such as horizontal and vertical screen switching, pedometers and so on.
- the distance sensor 180F is used to measure distance. The electronic device 100 can measure distance by infrared or laser. In some embodiments, when shooting a scene, the electronic device 100 may use the distance sensor 180F to measure distance to achieve fast focusing.
- the proximity light sensor 180G may include, for example, a light emitting diode (LED) and a light detector, such as a photodiode.
- the light emitting diode may be an infrared light emitting diode.
- the electronic device 100 emits infrared light to the outside through the light emitting diode.
- the electronic device 100 uses a photodiode to detect infrared reflected light from nearby objects. When sufficient reflected light is detected, it can be determined that there is an object near the electronic device 100. When insufficient reflected light is detected, the electronic device 100 can determine that there is no object near the electronic device 100.
- the electronic device 100 can use the proximity light sensor 180G to detect that the user holds the electronic device 100 close to the ear to talk, so as to automatically turn off the screen to save power.
- the proximity light sensor 180G can also be used in holster mode and pocket mode to automatically unlock and lock the screen.
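The "sufficient reflected light" decision described above amounts to a threshold test on the photodiode reading. The sketch below adds a hysteresis band so the state does not flicker near the boundary; the threshold values and the hysteresis itself are illustrative assumptions:

```python
def near_object(reflected_light, on_threshold=80, off_threshold=40, was_near=False):
    """Decide whether an object is near from the infrared reflection reading.

    Sketch with assumed thresholds: readings above on_threshold mean
    "near", readings below off_threshold mean "far", and readings in
    between keep the previous state (hysteresis).
    """
    if reflected_light >= on_threshold:
        return True
    if reflected_light <= off_threshold:
        return False
    return was_near  # in the hysteresis band, keep the previous state
```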
- the ambient light sensor 180L is used to sense the brightness of the ambient light.
- the electronic device 100 can adjust the brightness of the display screen 194 automatically according to the perceived brightness of the ambient light.
- the ambient light sensor 180L can also be used to automatically adjust the white balance when taking pictures.
- the ambient light sensor 180L can also cooperate with the proximity light sensor 180G to detect whether the electronic device 100 is in the pocket to prevent accidental touch.
- the fingerprint sensor 180H is used to collect fingerprints.
- the electronic device 100 can use the collected fingerprint characteristics to realize fingerprint unlocking, access to the application lock, fingerprint photographs, fingerprint answering calls, and so on.
- the temperature sensor 180J is used to detect temperature.
- the electronic device 100 executes a temperature processing strategy based on the temperature detected by the temperature sensor 180J. For example, when the temperature reported by the temperature sensor 180J exceeds a threshold, the electronic device 100 reduces the performance of a processor located near the temperature sensor 180J, so as to reduce power consumption and implement thermal protection.
- in other embodiments, when the temperature is lower than another threshold, the electronic device 100 heats the battery 142 to avoid an abnormal shutdown of the electronic device 100 caused by low temperature.
- in some other embodiments, when the temperature is lower than still another threshold, the electronic device 100 boosts the output voltage of the battery 142 to avoid an abnormal shutdown caused by low temperature.
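The temperature processing strategy above can be sketched as a small policy function. The specific threshold values and action names are illustrative assumptions; the embodiment only states that distinct thresholds trigger throttling, battery heating, and voltage boosting:

```python
def thermal_policy(temp_c, high=45.0, low=0.0, very_low=-10.0):
    """Pick a temperature-handling action (threshold values are assumptions).

    - above `high`: throttle the nearby processor for thermal protection
    - below `very_low`: boost battery output voltage to avoid shutdown
    - below `low`: heat the battery to avoid low-temperature shutdown
    """
    if temp_c > high:
        return "throttle_processor"
    if temp_c < very_low:
        return "boost_battery_voltage"
    if temp_c < low:
        return "heat_battery"
    return "normal"
```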
- the touch sensor 180K is also called a "touch panel".
- the touch sensor 180K can be disposed on the display screen 194; the touch sensor 180K and the display screen 194 form a touch screen, also called a "touchscreen".
- the touch sensor 180K is used to detect touch operations on or near it.
- the touch sensor can transmit the detected touch operation to the application processor to determine the type of touch event.
- the display screen 194 can provide visual output related to the touch operation.
- the touch sensor 180K may also be disposed on the surface of the electronic device 100, which is different from the position of the display screen 194.
- the bone conduction sensor 180M can acquire vibration signals.
- the bone conduction sensor 180M can obtain the vibration signal of the vibrating bone mass of the human voice.
- the bone conduction sensor 180M can also contact the human pulse and receive the blood pressure pulse signal.
- the bone conduction sensor 180M may also be provided in the earphone, combined with the bone conduction earphone.
- the audio module 170 may parse the voice signal based on the vibration signal of the vibrating bone block of the voice obtained by the bone conduction sensor 180M, and implement the voice function.
- the application processor can analyze the heart rate information based on the blood pressure beating signal obtained by the bone conduction sensor 180M, and realize the heart rate detection function.
- the button 190 includes a power button, a volume button, and so on.
- the button 190 may be a mechanical button. It can also be a touch button.
- the electronic device 100 can receive key input, and generate key signal input related to user settings and function control of the electronic device 100.
- the motor 191 can generate vibration prompts.
- the motor 191 can be used for incoming call vibration notification, and can also be used for touch vibration feedback.
- touch operations applied to different applications can correspond to different vibration feedback effects.
- for touch operations acting on different areas of the display screen 194, the motor 191 can also produce different vibration feedback effects.
- different application scenarios (for example, time reminders, receiving messages, alarm clocks, and games) can also correspond to different vibration feedback effects.
- the touch vibration feedback effect can also support customization.
- the indicator 192 may be an indicator light, which may be used to indicate a charging state, a change in power, and may also be used to indicate messages, missed calls, notifications, and the like.
- the SIM card interface 195 is used to connect to the SIM card.
- the SIM card can be inserted into the SIM card interface 195 or pulled out from the SIM card interface 195 to achieve contact and separation with the electronic device 100.
- the electronic device 100 may support one or N SIM card interfaces, and N is a positive integer greater than one.
- SIM card interface 195 can support Nano SIM card, Micro SIM card, SIM card and so on.
- multiple cards can be inserted into the same SIM card interface 195 at the same time. The types of the multiple cards can be the same or different.
- the SIM card interface 195 can also be compatible with different types of SIM cards.
- the SIM card interface 195 is also compatible with external memory cards.
- the electronic device 100 interacts with the network through the SIM card to realize functions such as call and data communication.
- the electronic device 100 adopts an eSIM, that is, an embedded SIM card.
- the eSIM card can be embedded in the electronic device 100 and cannot be separated from the electronic device 100.
- the following mainly takes the electronic device 100 as a mobile phone as an example to describe the image grouping method provided in the embodiment of the present application.
- the above-mentioned reference image set, such as a video or an image group, usually records a continuous, dynamically changing process in time. Therefore, the reference image set often includes a series of face images of the same user in different forms during this dynamic process. The mobile phone can first track the face images in each reference image set to obtain face images of the same user that are temporally continuous and have different angles, different expressions, different decorations, and different hairstyles, automatically group the face images in each reference image set into one category, and obtain the reference image set clustering result. Then, according to the similarity between the facial features in the face pictures and the facial features in the reference image set clustering result, the face pictures are clustered, so that face pictures of different forms can also be clustered correctly, improving the clustering accuracy of face pictures.
- the reference image set clustering process and the face image clustering process can be automatically performed.
- if the mobile phone obtains a reference image set, it can automatically perform the reference image set clustering process; after detecting the user's instruction to classify portraits or to turn on the portrait classification function, the mobile phone clusters the face pictures stored on it according to the clustering result of the reference image set.
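The two-stage grouping idea described above can be sketched as follows: faces tracked within one reference set (e.g. one video track of the same user) are first merged into a category, and standalone face pictures are then assigned to the most similar category by feature similarity. The feature vectors, cosine-similarity measure, centroid averaging, and threshold below are all illustrative assumptions, not the embodiment's exact algorithm:

```python
import math

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def cluster_with_reference(tracks, pictures, threshold=0.8):
    """Cluster face pictures using tracked reference sets as prior information.

    tracks: list of tracks; each track is a list of feature vectors of the
            same user obtained by face tracking in one reference image set.
    pictures: list of (picture_id, feature_vector) pairs.
    """
    # Stage 1: each track forms one category, represented by its mean vector.
    categories = []
    for track in tracks:
        dim = len(track[0])
        centroid = [sum(f[i] for f in track) / len(track) for i in range(dim)]
        categories.append({"centroid": centroid, "pictures": []})
    # Stage 2: assign each picture to the most similar category, if any.
    unassigned = []
    for pic_id, feat in pictures:
        best = max(categories, key=lambda c: cosine(feat, c["centroid"]))
        if cosine(feat, best["centroid"]) >= threshold:
            best["pictures"].append(pic_id)
        else:
            unassigned.append(pic_id)
    return categories, unassigned
```

Because each track already spans different angles and expressions of one user, its centroid tolerates more variation than any single picture would, which is the source of the accuracy gain claimed above.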
- when the mobile phone detects that the user clicks the album icon 201 shown in (a) in FIG. 2, the mobile phone opens the album and displays the interface shown in (b) in FIG. 2. After the mobile phone detects that the user clicks the control 202 shown in (b) in FIG. 2, it displays the portrait classification control 203 shown in (c) in FIG. 2; after the mobile phone detects that the user clicks the control 203, it determines that the user's instruction to classify portraits, or to enable the portrait classification function, has been detected. Alternatively, the mobile phone displays the interface shown in (d) in FIG. 2 after opening the album.
- after the mobile phone detects that the user clicks the discovery control 204 shown in (d) in FIG. 2, it can display the portrait classification control 205, the detail control 206, and so on, shown in (e) in FIG. 2. After the mobile phone detects that the user clicks the control 206, it displays the portrait classification function description 207 shown in (f) in FIG. 2, so that the user can understand the specific content of the function. After the mobile phone detects that the user clicks the control 205, it can determine that the user's instruction to classify portraits, or to enable the portrait classification function, has been detected.
- after the mobile phone obtains the reference image set and detects the user's operation instructing portrait classification or instructing to turn on the portrait classification function, it performs reference image set clustering and face picture clustering.
- the user can also choose whether to classify the face pictures according to the reference image set.
- the mobile phone may display the interface shown in FIG. 3. If the mobile phone detects that the user clicks the control 302, it indicates that the user chooses to classify the face pictures according to the reference image set; if the mobile phone detects that the user clicks the control 301, it indicates that the user chooses not to use the reference image set and to classify portraits directly based on the face pictures. As another example, the user can instruct the mobile phone by voice or by a preset operation to classify the face pictures according to the reference image set.
- the user can also set the content of the reference image set.
- the mobile phone can set the reference image set to include content such as videos obtained by the mobile phone.
- the mobile phone may also prompt the user whether to cluster face pictures according to the reference image set.
- the mobile phone may prompt the user through a prompt box 501.
- after the album is opened on the mobile phone, or after detecting that the user chooses to cluster face pictures according to the reference image set, if the mobile phone has not obtained a reference image set, it may prompt the user to add one or more reference image sets that include face images of the users corresponding to the face pictures, so that the mobile phone can cluster the face pictures more accurately according to the reference image sets.
- the mobile phone can prompt the user to shoot (or download, copy) a video about the target user in the face picture through the prompt box 601.
- the mobile phone can prompt the user to let the target user play a game that can collect the face image of the target user, such as YOYO Xuanwu, and the mobile phone can record a video about the target user during the game.
- the mobile phone may prompt the user to add an image group, which may be multiple face pictures of the same user in different forms selected by the user from the face pictures. Then, the mobile phone can cluster the face pictures of the target user according to the acquired reference image sets, such as the video or the image group.
- a large number of face pictures and videos can be stored on the mobile phone.
- the face picture may be taken by the user through the camera of the mobile phone, or downloaded through the network or an App, or obtained through screenshots, or copied from other devices, or obtained by other means.
- the video can be a video taken by the user through the camera of a mobile phone, or a video downloaded through the network or an App, or a video saved during a video call, or a video copied from other devices, or a video obtained by other means.
- the video and face picture may include face images of the user or other users (such as relatives, friends, celebrities, etc.).
- the video records a continuous and dynamic process, so the video can often include multiple face images of the same user during the dynamic process.
- the mobile phone can first track the face images in the video to obtain face images of the same user that are temporally continuous and have different forms, such as different angles, different expressions, different decorations, and different hairstyles, and automatically cluster these face images into one category to obtain the video clustering result. Then, using the video clustering result as prior information, the face pictures are clustered based on the similarity between the facial features in the face pictures and the facial features in the video clustering result, so that face pictures of different forms can also be clustered correctly, improving the clustering accuracy of face pictures.
- the mobile phone stores video 1 and a large number of face pictures.
- the face picture includes face picture 1, face picture 2, face picture 3, and face picture 4.
- The mobile phone detects face 1 at time 1, where face 1 is the face on frontal face A, and continues to track face 1 during time period 1; the mobile phone detects face 2 at time 2, where face 2 is the face on smiling face D, and continues to track face 2 during time period 2; the mobile phone detects face 3 at time 3, where face 3 is the face on upward face G, and continues to track face 3 during time period 3.
- The face images tracked by the mobile phone in time period 1 include frontal face A, side face B, face C wearing sunglasses, etc.; the face images tracked in time period 2 include smiling face D, face E with closed eyes, face F with a funny expression, etc.; the face images tracked in time period 3 include upward face G and downward face H.
- the skin color model method detects human faces based on the relatively concentrated distribution of facial skin color in the color space.
- In the template method, one or several standard face templates are preset; the matching degree between the tested sample image and the standard template is then calculated, and a threshold is used to determine whether a face is present.
- Other detection methods include the eigenface (feature sub-face) method, the face rule method, the sample learning method, etc.
- There are also many face tracking methods.
- Model-based tracking methods may include the skin color model, ellipse model, texture model, and binocular template.
- The tracking method based on motion information mainly uses the continuity of target motion between consecutive frames to predict the face area and achieve fast tracking. Methods such as motion segmentation, optical flow, and stereo vision are usually used, and spatio-temporal gradients and Kalman filters are often used for tracking.
- tracking methods based on local features of human faces and tracking methods based on neural networks.
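- The idea of tracking a face across consecutive frames can be illustrated with a minimal sketch (not the patent's implementation): detected face boxes in adjacent frames are associated by intersection-over-union (IoU), so that boxes of the same face join the same track. The box format, function names, and the 0.5 threshold are all illustrative assumptions.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def track_faces(frames, iou_threshold=0.5):
    """frames: list of per-frame lists of face boxes.
    Returns a list of tracks; each track is a list of boxes for one face."""
    tracks = []  # each track's last element is its most recent box
    for boxes in frames:
        unmatched = list(boxes)
        for track in tracks:
            # greedily extend each track with the best-overlapping new box
            best = max(unmatched, key=lambda b: iou(track[-1], b), default=None)
            if best is not None and iou(track[-1], best) >= iou_threshold:
                track.append(best)
                unmatched.remove(best)
        # boxes that matched no existing track start new tracks
        tracks.extend([b] for b in unmatched)
    return tracks
```

A face that moves slightly between frames (high IoU) stays in one track, which is what lets the phone later cluster all of that track's face images into one category.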
- Based on the tracking results, the mobile phone can automatically cluster frontal face A, side face B, and face C wearing sunglasses in time period 1 into one category, for example, category 1; automatically cluster smiling face D, face E with closed eyes, and face F with a funny expression in time period 2 into one category, for example, category 2; and automatically cluster upward face G and downward face H in time period 3 into one category, for example, category 3.
- the categories here can also be called cluster centers. It is understandable that, for videos other than Video 1 stored on the mobile phone, the mobile phone may also use face detection and face tracking methods to cluster the face images in the video.
- The mobile phone can also cluster the face images across different tracking results. Specifically, the mobile phone can extract the facial features of the face images in different tracking results (before extracting the facial features, the mobile phone may also perform face correction on the face images, that is, convert face images at other angles into frontal face images). If a face image in one category has high similarity with a face image in another category, the two face images can be grouped into one category, and then all the face images in these two categories can be grouped into one category.
- For example, if the mobile phone determines that frontal face A in category 1 has a high similarity to upward face G in category 3, the two can be grouped into one category, and then frontal face A, side face B, face C wearing sunglasses, upward face G, and downward face H in categories 1 and 3 can all be grouped together.
- There are many face clustering methods for grouping different face images into one category, such as hierarchical clustering methods, partition-based clustering methods, density-based clustering methods, and grid-based clustering methods.
- Other methods include the model-based clustering method, the distance-based clustering method, the interconnection-based clustering method, etc.
- Specific clustering algorithms include the K-Means algorithm, DBSCAN algorithm, BIRCH algorithm, MeanShift algorithm, etc.
- the mobile phone can extract the facial features of different facial images, and perform clustering according to the similarity of the different facial features.
- face feature extraction can be understood as a process of mapping a face image to an n-dimensional vector (n is a positive integer), and the n-dimensional vector has the ability to characterize the face image.
- For example, when the face feature is a multi-dimensional vector, the similarity may be the distance between the multi-dimensional vectors corresponding to the face features of different faces.
- the distance may be Euclidean distance, Mahalanobis distance, Manhattan distance, etc.
- the similarity can be the cosine similarity, correlation coefficient, information entropy, etc. between the facial features of different faces.
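- As a sketch of the measures named above, the Euclidean distance and cosine similarity between two feature vectors can be computed as follows; the feature values are hypothetical, not taken from the patent.

```python
import math

def euclidean_distance(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def cosine_similarity(u, v):
    """Cosine of the angle between two feature vectors (1.0 = same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Two hypothetical face features of the same person: similar vectors yield
# high cosine similarity and small Euclidean distance.
f1 = [0.9, 0.6, 0.5]
f2 = [0.8, 0.7, 0.5]
```

The other measures mentioned above (Mahalanobis distance, Manhattan distance, correlation coefficient, information entropy) would simply replace these functions without changing the clustering logic.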
- For example, when the face feature is a multi-dimensional vector, suppose face feature 1 of frontal face A in category 1 extracted by the mobile phone is [0.88, 0.64, 0.58, 0.11, ..., 0.04, 0.23], and face feature 2 of upward face G in category 3 is [0.68, 0.74, 0.88, 0.81, ..., 0.14, 0.53]. The similarity between face feature 1 and face feature 2 is measured by the cosine similarity between the corresponding multi-dimensional vectors.
- the cosine similarity is 0.96.
- That is, the similarity between face feature 1 and face feature 2 is 96%. Suppose the similarity threshold for clustering is 80%; since the similarity of 96% is greater than the threshold of 80%, frontal face A in category 1 and upward face G in category 3 can be grouped into one category, and all face images in category 1 and category 3 can be grouped into one category.
- For another example, the face feature is a multi-dimensional vector, and the correspondence between each face image and its face feature is shown in Table 1.
- Suppose the distance threshold for clustering is 5. If the Euclidean distance between face feature A of frontal face A in category 1 and face feature G of upward face G in category 3 (see Table 1) is less than the distance threshold of 5, then frontal face A in category 1 and upward face G in category 3 can be grouped into one category, and all face images in category 1 and category 3 can be grouped into one category.
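- This cross-category merging step can be sketched as follows (a hypothetical illustration, not the patent's exact procedure): two categories are merged whenever some pair of their face features is closer than the distance threshold (5 in the example above), and merging repeats until no such pair remains.

```python
import math

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def merge_close_categories(categories, threshold=5.0):
    """categories: list of categories, each a list of feature vectors.
    Repeatedly merge any two categories that contain a feature pair
    closer than the threshold; return the merged categories."""
    cats = [list(c) for c in categories]
    merged = True
    while merged:
        merged = False
        for i in range(len(cats)):
            for j in range(i + 1, len(cats)):
                if any(euclidean(f, g) < threshold
                       for f in cats[i] for g in cats[j]):
                    cats[i].extend(cats.pop(j))  # fold category j into i
                    merged = True
                    break
            if merged:
                break
    return cats
```

Note that this single-link rule can chain merges: once two categories are folded together, the combined feature set is compared against the remaining categories again.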
- Then, based on the video clustering results (such as category 1, category 2, and category 3) and the extracted facial features of the face images in each category, the mobile phone can use an incremental clustering algorithm or another clustering algorithm to cluster face picture 1, face picture 2, face picture 3, and face picture 4, thereby fusing the face images in the video with the face pictures and clustering the face pictures into the previous video clustering results. For example, if the facial features of a face picture stored on the mobile phone are similar to the facial features of a face image in a certain category (such as category 1), for example, the similarity is greater than or equal to preset value 1, the face picture can be clustered into that category. In this way, the video clustering result can be expanded incrementally. If the similarity between the facial features of a certain face picture and the facial features of the face images in every category of the video clustering result is small (for example, less than preset value 1), the face picture is grouped into a new category.
- the correspondence between face pictures and face features can be found in Table 1 above.
- For example, suppose the clustering distance threshold is 5. If the Euclidean distance between face feature a of face picture 1 and face feature A of frontal face A is less than the distance threshold of 5, face picture 1 can be clustered into category 1 where frontal face A is located; if the Euclidean distance between face feature b of face picture 2 and face feature B of side face B is less than the distance threshold of 5, face picture 2 can be clustered into category 1 where side face B is located; if the Euclidean distance between face feature c of face picture 3 and face feature D of smiling face D is less than the distance threshold of 5, face picture 3 can be clustered into category 2 where smiling face D is located; and if the Euclidean distance between face feature d of face picture 4 and face feature G of upward face G is less than the distance threshold of 5, face picture 4 can be clustered into category 3 where upward face G is located.
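- The incremental step described above can be sketched as follows. The feature vectors and the threshold are hypothetical stand-ins for Table 1: each stored face picture joins the existing video category whose nearest face feature lies within the distance threshold, and otherwise starts a new category.

```python
import math

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def assign_pictures(categories, picture_features, threshold=5.0):
    """categories: dict of category name -> list of feature vectors
    (the video clustering result); mutated in place.
    picture_features: dict of picture name -> feature vector."""
    for name, feat in picture_features.items():
        best_cat, best_d = None, threshold
        for cat, feats in categories.items():
            d = min(euclidean(feat, f) for f in feats)
            if d < best_d:  # closer than the threshold and any previous category
                best_cat, best_d = cat, d
        if best_cat is None:
            categories[f"new_{name}"] = [feat]  # no category close enough
        else:
            categories[best_cat].append(feat)   # extend the video category
    return categories
```

Because assigned pictures are appended to their category, later pictures can also match features contributed by earlier pictures, which is the incremental expansion the text describes.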
- For another example, suppose the range of the Euclidean distance between the face features corresponding to category 1 and a reference feature is [0, 50], and the range for the face features corresponding to category 3 is [100, 150]. If category 1 and category 3 are clustered into the same category 4, the range of the Euclidean distance between the face features corresponding to category 4 and the reference feature is [0, 50] ∪ [100, 150]. Since the face features of face picture 1, face picture 2, and face picture 4 all fall within [0, 50] ∪ [100, 150], face picture 1, face picture 2, and face picture 4 can all be clustered into category 4, that is, into the same category.
- a schematic diagram of the clustering effect can be seen in Figure 8A.
- In one implementation, the mobile phone can separately extract the facial features of all the face pictures and face images listed in Table 1, and then cluster them according to the similarity of the facial features.
- In this way, the mobile phone can group multiple face images of the same user in different forms into the same category, so that face pictures similar to the different-form face images in the category are also clustered into this category. Compared with the prior art, this reduces the dispersion of clustering, improves the accuracy of face clustering, facilitates management by the user, and improves the user experience.
- In contrast, suppose the similarity between face features is measured by Euclidean distance and the clustering distance threshold is 5. Without the video clustering result, if face picture 1, face picture 2, face picture 3, and face picture 4 are clustered directly, then since the Euclidean distance between the facial features of every two face pictures is greater than the distance threshold of 5, no two pictures can be clustered into one category, and each face picture ends up in its own category.
- the clustering effect diagram can be seen in FIG. 8B. Compared with FIG. 8A, the face clustering shown in FIG. 8B has a greater degree of dispersion and a lower clustering accuracy, which causes problems such as false positives or false negatives in the clustering results.
- the mobile phone may also perform identity marking on the face in the video according to the video clustering result.
- the above description mainly takes the reference image set as the video as an example.
- When the reference image set is the image group mentioned above (for example, an image group formed by moving pictures, or the image group corresponding to the multi-frame images obtained during shooting preview), or when the reference image set includes both the video and the aforementioned image group, the mobile phone can still perform clustering in a manner similar to the video processing process, which will not be repeated here.
- When the reference image set is an image group preset by the user, the images in the image group are usually different face images of the same user set by the user. Therefore, the mobile phone does not need to perform face detection and tracking, and can directly and automatically group the face images in the image group into one category.
- the mobile phone can also cluster the newly added face pictures according to the clustering result of the reference image set.
- the mobile phone can extend the newly added face pictures to the previous clustering results through incremental clustering.
- After the mobile phone clusters face picture 1, face picture 2, face picture 3, and face picture 4 according to the reference image set clustering result, if the mobile phone subsequently obtains a new reference image set (such as video 2), then in one solution, the mobile phone performs face detection, tracking, and clustering on both the previous reference image set and the new reference image set, and clusters face picture 1, face picture 2, face picture 3, and face picture 4 according to the resulting clustering result; in another solution, the mobile phone does not immediately re-cluster face picture 1, face picture 2, face picture 3, and face picture 4, and re-clusters only after detecting a user operation instructing portrait classification.
- In another solution, regardless of whether the mobile phone acquires a new reference image set, the mobile phone periodically performs face detection, tracking, and clustering on the currently acquired reference image set, and clusters the currently stored face pictures based on the reference image set clustering result.
- In another solution, after the mobile phone detects the user's instruction to classify portraits, it performs face detection, tracking, and clustering on the currently acquired reference image set, and clusters the face pictures according to the reference image set clustering result.
- In addition, the mobile phone can perform face detection, tracking, and clustering on the currently acquired reference image set, and cluster the currently stored face pictures, during a preset time period (for example, 00:00-6:00 at night), in an idle state (for example, when the mobile phone is not performing other services), or when the mobile phone is charging and the battery level is greater than or equal to preset value 2.
- the mobile phone can display the clustering results.
- For example, the mobile phone can display the clustering results in groups (for example, folders).
- The following description takes as an example the case where the reference image set is still video 1, category 1 and category 3 are clustered into category 4, and the face pictures stored on the mobile phone include face picture 1, face picture 2, face picture 3, and face picture 4.
- the mobile phone may display the video clustering result.
- On the video portrait classification interface (that is, the video clustering result interface), the mobile phone can display group 1 corresponding to category 4 and group 2 corresponding to category 2.
- the group corresponding to each cluster category may include the video where the face image of the category is located.
- group 1 corresponding to category 4 and group 2 corresponding to category 2 both include video 1.
- the cover image displayed by the thumbnail of the video in the group may be a face image belonging to the category in the video.
- The cover image may be a relatively frontal face image, or an image designated by the user.
- For example, the mobile phone can put the video into the groups corresponding to all categories of the face images appearing in the video; or, when the face image of a certain category appears in the video for a duration greater than or equal to a preset duration, the mobile phone puts the video into the group corresponding to that category; or, when the number of frames containing the face image of a certain category in the video is greater than or equal to preset value 3, the mobile phone puts the video into the group corresponding to that category; or, when a frontal face image of a certain category appears in the video, the mobile phone puts the video into the group corresponding to that category.
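- The alternative rules above can be combined into a single check, sketched below. The statistics structure, field names, and threshold values are illustrative assumptions, not taken from the patent.

```python
def video_belongs_to_group(stats, min_duration=3.0, min_frames=30):
    """Decide whether a video joins a category's group.
    stats describes one category's appearance in the video:
      'duration'    - seconds the category's face is on screen
      'frames'      - number of frames containing the category's face
      'has_frontal' - whether a frontal face of the category appears."""
    return (stats["duration"] >= min_duration
            or stats["frames"] >= min_frames
            or stats["has_frontal"])
```

A phone could evaluate this once per (video, category) pair and place the video into every group whose check passes.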
- the group corresponding to each cluster category may include a video segment where the face image of the category is located, and the video segment of the face image of the category appears.
- For example, group 1 corresponding to category 4 may be group 1A in FIG. 9B, and group 1A may include video segment 1 corresponding to time period 1 in video 1 and video segment 3 corresponding to time period 3; in addition, group 2 corresponding to category 2 may be group 2A, and group 2A may include video segment 2 corresponding to time period 2 in video 1.
- the group corresponding to each cluster category may include the face image frame of the category in the video where the face image of the category is located.
- For example, group 1 corresponding to category 4 may be group 1B in FIG. 9C, and group 1B may include frontal face A, side face B, face C wearing sunglasses, upward face G, and downward face H.
- the group 2 corresponding to category 2 can be group 2B, and group 2B can include smiling face D, face E with closed eyes, and face F with funny expressions.
- the group corresponding to the same category may include multiple sub-groups, and the face image frames belonging to the category in the same video belong to the same sub-group.
- the group corresponding to each cluster category may include the video segment in which the face image of the category is located in the video where the face image of the category appears, and the face image frame of the category.
- In this way, the mobile phone can display the video clustering result, which makes it convenient for users to categorize and manage videos based on the face images in them, improves the efficiency of searching for and managing videos, and improves the user experience.
- Alternatively, the mobile phone may not display the video clustering result immediately, and instead display the clustering result only after face picture clustering is completed.
- the mobile phone may display the face image clustering result.
- the group corresponding to each cluster category includes the face pictures of that category.
- For example, the mobile phone can display group 3 corresponding to category 4 and group 4 corresponding to category 2, where group 3 includes face picture 1, face picture 2, and face picture 4, and group 4 includes face picture 3.
- the mobile phone can display the video clustering result and the face image clustering result.
- the video clustering result and the face image clustering result can be displayed in different groups respectively, or can be combined and displayed in the same group.
- When the video clustering result and the face picture clustering result are displayed in different groups, the video clustering result can be displayed in group 5, and the face picture clustering result can be displayed in group 6.
- The content in group 5 may be the video clustering result described above (for example, as shown in FIGS. 9A-9C), and the content in group 6 may be the face picture clustering result described above (for example, as shown in (a)-(c) in FIG. 10).
- When combined and displayed in the same group, the group corresponding to each clustering category may include both the face picture clustering result and the video clustering result described above.
- category 4 corresponds to group 7, and category 2 corresponds to group 8.
- For example, the group corresponding to each cluster category may include the face pictures of the category and the video where the face images of the category are located.
- Group 7 corresponding to category 4 is group 7A, and group 7A includes face picture 1, face picture 2, face picture 4, and video 1; referring to (c) in FIG. 11, group 8 corresponding to category 2 is group 8A, and group 8A includes face picture 3 and video 1.
- The cover image of the group can be a face picture of this category, or a face image of this category in the video.
- the cover image of video 1 in group 7 and the cover image of video 1 in group 8 may be the same or different.
- the cover image of Video 1 may be a face image included in the category corresponding to the group.
- the face image of this category may belong to one sub-group, and the video where the face image of this category is located may belong to another sub-group.
- group 7 corresponding to category 4 is group 7B, and group 7B includes sub-group 7-1 corresponding to face pictures and sub-group 7-2 corresponding to videos. See (b) in Figure 12, sub-group 7-1 includes face picture 1, face picture 2 and face picture 4; see (c) in Figure 12, sub-group 7-2 includes video 1.
- the group corresponding to each cluster category may include the face image of the category, and the video segment in which the face image of the category appears in the video where the face image of the category is located.
- For example, group 7 corresponding to category 4 is group 7C, and group 7C includes face picture 1, face picture 2, face picture 4, video segment 1, and video segment 3; group 8 corresponding to category 2 is group 8C, and group 8C includes face picture 3 and video segment 2.
- the face pictures of this category may belong to one sub-group, and the video segment may belong to another sub-group.
- Alternatively, the group corresponding to each cluster category may include the face pictures of the category and the image frames captured or selected from the video where the face images of the category are located.
- For example, group 7 corresponding to category 4 is group 7D, and group 7D includes face picture 1, face picture 2, face picture 4, and face images A, B, C, G, H in video 1; group 8 corresponding to category 2 is group 8D, and group 8D includes face picture 3 and face images D, E, F in video 1.
- the face pictures of this category may belong to one subgroup; in the video where the face images of this category are located, the face image frames of this category may belong to another subgroup.
- Alternatively, the group corresponding to each cluster category may include face pictures of that category, and may also include one or more of: the video where the face images of the category are located; the video segments, in that video, in which the face images of the category appear; and the captured or selected image frames.
- the face pictures of this category and the face image frames of this category may belong to one subgroup; the video or video segment corresponding to this category may belong to another subgroup.
- the face pictures of this category may belong to one subgroup, and the face image frames and videos or video segments of this category may belong to another subgroup.
- the face images of this category, the face image frames, videos, and video segments of this category belong to different subgroups respectively.
- the mobile phone may display the face picture clustering result, and determine whether to display the video clustering result according to the user's instruction.
- the name of the group may be a name manually input by the user; or it may be a name obtained by the mobile phone itself through learning.
- the mobile phone can determine the user identity in the picture, such as father, mother, wife (or husband), son, daughter, etc., according to the actions and intimacy between users in the picture or video, and set the user identity as the group name.
- When the mobile phone displays the face picture clustering result for the first time (or every time), it may also prompt the user that the face picture clustering result is obtained by classifying the face pictures according to a reference image set such as a video.
- the mobile phone may prompt the user by displaying information 1501, so that the user can learn the portrait classification function of the mobile phone.
- Based on the clustering results, the mobile phone can comprehensively manage and display the face pictures and videos, improving the efficiency with which the user finds and manages them and improving the user experience.
- The user can actively add a reference image set corresponding to a user in the face pictures. For example, if the clustering result of face picture 5 is wrong, see (a) in FIG. 16: the user can click control 1601, or click control 1601 after selecting face picture 5; then, the user can add a reference image set through control 1602 shown in (b) in FIG. 16. Alternatively, the user can add a reference image set through voice, a preset gesture, or the like.
- The reference image set may be a video or a set of images captured by the user in real time, or a set of images otherwise obtained by the user through the mobile phone, and the image set includes face images, in different forms, of the user corresponding to the incorrectly clustered face picture.
- For example, the reference image set may be the image set shown in (a)-(h) in FIG. 17. After the reference image set is added, the mobile phone can re-cluster the incorrectly clustered face pictures in combination with the reference image set added by the user, or re-cluster all face pictures stored on the phone.
- The clustering method described in the above embodiment classifies the face pictures of different users according to the similarity of facial features, so the groups corresponding to different clustering categories can also be understood as the groups corresponding to different users.
- the groups corresponding to different cluster categories displayed on the mobile phone may correspond to different priorities.
- The users corresponding to high-priority groups may be the users the user is more concerned about.
- For example, the mobile phone can determine that the users who appear most frequently in the saved face pictures and videos are the users the user cares most about, and the groups corresponding to these users have the highest priority.
- For another example, the mobile phone can determine that groups corresponding to users who have high intimacy with the mobile phone user have higher priority.
- Specifically, the mobile phone can use an emotion analysis algorithm, based on factors such as the intimacy of actions between different users, the expressions of different users, the frequency with which different users appear in videos and face pictures, and the positions of different users in videos and face pictures, to determine the intimacy between different users and the mobile phone user. Users with higher intimacy are the users the user cares more about, and the groups corresponding to these users have higher priority.
- For another example, relatives are usually the users the user cares more about, and the user prefers that the groups corresponding to relatives be displayed first; the mobile phone can therefore determine that groups corresponding to users whose face information is close to that of the mobile phone user have higher priority.
- groups with high priority may be displayed first.
- the mobile phone can display the high-priority groups on the top of the portrait classification interface, and the low-priority groups need to be viewed by the user by sliding up or switching pages.
- Alternatively, the mobile phone may display only the top N (N is a positive integer) groups with the highest priority on the portrait classification interface, and may not display the groups corresponding to other users that the user does not care about.
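- A minimal sketch of the frequency-based priority described above (group names and counts are hypothetical): count how often each user appears across the saved pictures and videos, then keep only the N highest-priority groups for display.

```python
from collections import Counter

def top_groups(appearances, n=3):
    """appearances: list of group (user) names, one entry per occurrence of
    that user in a saved picture or video. Returns the n most frequent
    groups, in descending order of appearance count."""
    return [group for group, _ in Counter(appearances).most_common(n)]
```

The intimacy-based or face-similarity-based priorities mentioned above would replace the raw counts with a different score, but the ranking and top-N display would work the same way.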
- In another example, a certain user may be a user the user is more concerned about, and the mobile phone may display the group corresponding to that user on the classification interface.
- a group photo of a certain user and another user in the face picture may be in the group corresponding to the certain user, and may also be in the group where the other user is located.
- For example, face picture 6 is a group photo of user 1 and user 2; referring to (b) in FIG. 18, face picture 6 is in both the group corresponding to user 1 and the group corresponding to user 2.
- Alternatively, the groups corresponding to different users include only single-person photos of each user, and group photos of multiple users are displayed additionally or placed in another group.
- the mobile phone can also mark the face in the picture according to the clustering result.
- The mobile phone may also display the clustering results in a personalized manner. For example, for pictures in a group, when the mobile phone detects the user's operation instructing color retention, the mobile phone can retain the area indicated by the user or a preset area in the grouped picture as a color image, while other areas of the picture become a grayscale image.
- For example, when the preset area is the area where the user is located, the mobile phone may retain the color of the image in the area where the user is located, while the images in other areas become grayscale images.
- For another example, the image of the area where the target user is located is retained, and the images of other areas disappear, that is, the other areas may be blank, black, gray, or another preset color.
- the mobile phone may also generate a protagonist story.
- The protagonist story may include a series of multiple images of a certain user.
- the images in the protagonist's story are images of the same category. Specifically, they can be images in the reference image set (for example, they can include video segments in a video or face image frames in a video), or they can be images in a face picture.
- In this way, the mobile phone can not only extract face pictures to edit the protagonist story, but also combine face images in reference image sets such as videos, so that the sources of the protagonist's images are wider and the protagonist story is more vivid, interesting, and colorful.
- the above description has taken the video as the reference image set as an example.
- When the reference image set is another kind of reference image set (for example, the image group captured by the mobile phone shown in (a)-(f) in FIG. 20), the face pictures can still be clustered according to that reference image set in the manner described in the above embodiment, which will not be repeated here.
- the above description is based on an example of a human face as a classification object.
- When the classification object is another object, the clustering method provided in this embodiment of the present application can still be used to cluster the pictures on the mobile phone.
- the user can also set the classification objects that the mobile phone can cluster.
- The classification objects may be animal faces (such as the face of a dog or a cat), objects (such as a house, a car, a mobile phone, a water cup, etc.), logo marks (such as the Olympic rings logo), etc.
- For example, taking a house as the classification object, the mobile phone may first cluster, in the manner described in the above embodiments (for example, tracking and automatic clustering), the reference image sets of houses acquired at different angles, in different directions and positions, under different brightness, and in different scenes, and then cluster the pictures of houses stored on the mobile phone according to the reference image set clustering results. In this way, pictures of houses of different appearances are clustered more accurately, which makes it convenient for users to find and manage pictures of houses.
- The clustering results displayed by the mobile phone can include both the grouping of face pictures and the grouping of other classification objects; in other words, the mobile phone can perform clustering and grouping according to different entities.
- the mobile phone can display in the clustering results the groups corresponding to the faces of different users (such as user 1 and user 2), and different dogs (such as Dog 1) The corresponding grouping, and the corresponding grouping of different houses (such as house 1).
- the clustering result may include a group 9 corresponding to a human face, a group 10 corresponding to a dog, and a group 11 corresponding to a house.
- group 9 may include sub-groups corresponding to different users (for example, user 1 and user 2), group 10 may include sub-groups corresponding to different dogs, and group 11 may include sub-groups corresponding to different houses.
- the sub-groups may include image clustering results, or include image clustering results and reference image set clustering results, which will not be described in detail here.
- the user can also select the classification result of the classification object currently to be displayed.
- after the mobile phone detects that the user taps the control 2201 shown in (a) in Figure 22, it can display the interface shown in (b) in Figure 22; after the mobile phone detects that the user taps the control 2202 shown in (b) in Figure 22, it can display the interface shown in (c) in Figure 22. Then, when the user selects the portrait category, the mobile phone displays only faces; when the user selects the dog category, the mobile phone displays only dogs; when the user selects the house category, the mobile phone displays only houses; when the user selects another classification object, the mobile phone displays the clustering results of that classification object. It should be noted that there can be multiple ways for the user to select the clustering results of the classification objects to be displayed, which is not limited to the example shown in FIG. 22.
- another embodiment of the present application provides a method for grouping pictures, which can be implemented in an electronic device having the hardware structure shown in FIG. 1. At least one face picture is saved on the electronic device. As shown in Figure 23, the method may include:
- the electronic device acquires at least one video.
- the at least one video obtained by the electronic device may together include multiple face image frames, and each individual video may also include multiple face image frames.
- the at least one face picture saved on the electronic device is a static picture previously taken by the user, or a static picture obtained by the electronic device through downloading, copying, or the like.
- the at least one face picture may be face picture 1-face picture 4 shown in FIG. 8A.
- the storage area of the electronic device stores at least one video
- the electronic device obtains at least one video from the storage area.
- the video stored in the storage area may be taken by the user before, downloaded by the electronic device, or obtained by the electronic device during the running of the application program.
- the electronic device may prompt the user to shoot a video including face image frames, and after detecting the user's instruction to shoot the video, record and generate at least one video.
- the electronic device prompts the user to download at least one video, and the downloaded video is obtained after the user instructs to download.
- the at least one video acquired by the electronic device may include video 1 shown in FIG. 7B.
- the electronic device extracts multiple face image frames from at least one video.
- after acquiring the at least one video, the electronic device can extract multiple face image frames from the at least one video, so that the face pictures can subsequently be grouped according to the extracted face image frames.
- the video acquired by the electronic device includes the video 1 shown in FIG. 7B
- the face image frame extracted by the electronic device from the video 1 may be the face image frame A-face image frame H in FIG. 7B.
- the electronic device may also extract a face image frame from at least one video, so that the face images may be grouped according to the extracted face image frame later.
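The frame-extraction step above (step 2302) can be pictured with a minimal sketch. This is purely illustrative and not part of the patent's disclosure: `sample_face_frames`, the dict-based toy frames, and the `has_face` predicate are invented stand-ins for a real video decoder and face detector.

```python
# Hypothetical sketch of step 2302: sample every `stride`-th frame of a video
# and keep only frames in which a face is detected. `has_face` stands in for
# a real face detector; frames are modeled as plain dicts.

def sample_face_frames(frames, stride=5, has_face=lambda f: f.get("face") is not None):
    """Return every `stride`-th frame that contains a detected face."""
    return [f for i, f in enumerate(frames) if i % stride == 0 and has_face(f)]

# Toy "video": 20 frames, a face appears from frame 6 onward.
video = [{"idx": i, "face": "user1" if i >= 6 else None} for i in range(20)]
face_frames = sample_face_frames(video, stride=5)
print([f["idx"] for f in face_frames])  # [10, 15]
```

In practice the stride and detector would be chosen to balance coverage of the user's different face forms against processing cost.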
- the electronic device performs clustering processing on at least one face image according to multiple face image frames.
- the electronic device may perform clustering processing on the face picture 1-the face picture 4 according to the extracted face image frame A-face image frame H.
- there may be multiple applicable clustering algorithms.
- the electronic device displays at least one group according to the clustering result, and each group includes at least one face picture of one user.
- each group obtained by the clustering process may include at least one face picture of one user; that is, a group may include at least one face picture of the same user, and the face pictures of the same user can be placed in the same group.
- the electronic device may use the multiple face image frames in the at least one video as prior information and cluster the face pictures according to those frames, so that the face pictures are grouped by user, face pictures of the same user are clustered into the same group, and the accuracy of face picture grouping is improved.
- At least one face picture included in a group may be a face picture of the same user determined by the electronic device.
- the electronic device may calculate the similarity between the facial features in the face pictures, and determine that different face pictures whose similarity is greater than or equal to a first preset value are face pictures of the same user.
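The similarity comparison in the bullet above can be sketched as follows, assuming face pictures have already been reduced to feature vectors. The patent does not specify the feature extractor, the metric, or the grouping strategy; cosine similarity and the greedy first-match grouping below are illustrative assumptions.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

def group_pictures(features, threshold=0.8):
    """Greedily group pictures: a picture joins the first group whose
    representative (first member) is at least `threshold` similar,
    mirroring the "first preset value" comparison described above."""
    groups = []
    for pic, feat in features.items():
        for g in groups:
            if cosine_similarity(feat, features[g[0]]) >= threshold:
                g.append(pic)
                break
        else:
            groups.append([pic])
    return groups

# Toy 2-D features: p1 and p2 point the same way (same user), p3 does not.
feats = {"p1": [1.0, 0.0], "p2": [0.9, 0.1], "p3": [0.0, 1.0]}
print(group_pictures(feats))  # [['p1', 'p2'], ['p3']]
```

A production system would compare against high-dimensional embeddings from a face recognition network rather than hand-made 2-D vectors.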
- the resulting groups may be group 3 shown in (b) in FIG. 10 and group 4 shown in (c) in FIG. 10.
- Group 3 includes the face picture of user 1
- group 4 includes the face picture of user 2.
- each group may further include any one or any combination of the following: the video in which the user's face image frames appear, the video segment in which the user's face image frames appear, or at least one face image frame of the user. That is, the electronic device can group face pictures, videos, video segments, and face image frames by user, and manage a user's videos and pictures in a unified or joint manner, which makes it convenient for users to find and manage them and improves the user experience.
- the group 7A corresponding to user 1 includes the face picture of the user and the video 1 where the face image frame of the user 1 is located.
- the group 7C corresponding to user 1 includes the user's face picture and the video segment where the user's face image frame is located.
- the group 7D corresponding to the user 1 includes the face picture of the user and multiple face image frames of the user.
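One way to picture the per-user group described in the bullets above is as a record holding the user's pictures alongside the related videos, segments, and frames. This dataclass is purely an illustrative assumption; the patent does not prescribe any particular data structure.

```python
from dataclasses import dataclass, field

@dataclass
class UserGroup:
    """Hypothetical per-user group: face pictures plus the videos, video
    segments, and face image frames of the same user, managed jointly."""
    user: str
    face_pictures: list = field(default_factory=list)
    videos: list = field(default_factory=list)
    video_segments: list = field(default_factory=list)
    face_frames: list = field(default_factory=list)

# Mirrors the group 7A example: user 1's picture together with video 1.
group_7a = UserGroup(user="user 1")
group_7a.face_pictures.append("face picture of user 1")
group_7a.videos.append("video 1")
```

`field(default_factory=list)` gives each group its own empty lists, so adding a video to one user's group never leaks into another's.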
- At least one face picture of a user included in each group is a single photo or a group photo.
- group 3 in FIG. 10 includes a single photo of user 1
- group 4 shown in (c) in FIG. 10 includes a single photo of user 2.
- Group 9 shown in (a) in FIG. 18 includes the single photo and group photo of user 1, and group 10 shown in (b) in FIG. 18 includes the single photo and group photo of user 2.
- the foregoing step 2303 may specifically include:
- the electronic device divides the multiple face image frames into at least one category, and each category corresponds to multiple face image frames of different forms of a user.
- the electronic device may divide face image frames A-C into category 1, where category 1 includes multiple face image frames of user 1 in different forms; divide face image frames D-F into category 2, where category 2 includes multiple face image frames of user 2 in different forms; and divide face image frames G-H into category 3, where category 3 includes multiple face image frames of user 1 in different forms.
- the electronic device performs clustering processing on at least one face image according to the classification results of multiple face image frames.
- the electronic device may perform clustering processing on the face pictures 1-4 according to category 1, category 2, and category 3 shown in FIG. 8A.
- the electronic device may group the face pictures and the divided categories into a group according to the classification results, or divide the face pictures into a new group.
- the face image frames in a video usually change dynamically and may include face images of different forms.
- therefore, the electronic device can accurately group face pictures of different forms, such as different face angles and expressions, according to the different forms of face images of different users in the different categories, improving the accuracy of grouping.
- the above step 2303A may specifically include: the electronic device separately classifies the face image frames in each video into at least one category.
- adjacent image frames in the same video have temporal continuity
- multiple face image frames of the same user with temporal continuity in the video can be classified into one category.
- the face image frames of the same user with temporal continuity can usually be adjacent face image frames.
- the face images in the same video tracked by the electronic device through the face tracking algorithm have temporal continuity, meet the must-link constraint, are the faces of the same user, and can be classified into the same category. Therefore, the electronic device can separately classify multiple face image frames of the same user with temporal continuity in each video into the same category through the face tracking algorithm. In this way, the face image frames of multiple users in the same video can correspond to multiple categories.
- the result of the electronic device classifying the face image frame in the video 1 may be category 1, category 2, and category 3 shown in FIG. 8A.
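As a minimal sketch of the per-video division just described, suppose a face tracker has already attached a track identity to each detected face frame; frames sharing a track have temporal continuity (the must-link constraint) and fall into one category. The `track_id` labels below are invented for illustration.

```python
def categorize_by_track(face_frames):
    """Group a video's face image frames by tracker identity.

    `face_frames` is a time-ordered list of (frame_index, track_id) pairs;
    frames sharing a track_id satisfy the must-link constraint and are
    placed in the same category.
    """
    categories = {}
    for frame_idx, track_id in face_frames:
        categories.setdefault(track_id, []).append(frame_idx)
    return categories

# Mirroring the FIG. 8A example: frames A-C, D-F, and G-H as three tracks.
video1 = [(0, "t1"), (1, "t1"), (2, "t1"),
          (3, "t2"), (4, "t2"), (5, "t2"),
          (6, "t3"), (7, "t3")]
print(categorize_by_track(video1))  # {'t1': [0, 1, 2], 't2': [3, 4, 5], 't3': [6, 7]}
```

Note that tracks from different videos stay separate at this stage; merging them is the cross-video step discussed next.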
- the above step 2303A may specifically further include: if the similarity between the facial features of a first face image frame in a first category of the at least one category and the facial features of a second face image frame in a second category is greater than or equal to a second preset value, the electronic device may merge the first category and the second category into the same category.
- in this case, the two face image frames are generally face image frames of the same user, and the categories to which the two face image frames belong are also categories of the same user.
- the electronic device can merge the categories of the two face image frames into the same category.
- the electronic device can first divide the face image frames in the same video into categories, and then merge the categories containing highly similar face image frames from different videos, so that the face image frames of the same user in different videos are merged into the same category.
- for example, the electronic device merges category 1 and category 3 into category 4.
- the electronic device may perform clustering processing on at least one face image saved (acquired) by the electronic device according to category 2 and category 4.
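The cross-video merge above (categories 1 and 3 collapsing into category 4) can be sketched as follows, assuming each category is summarized by one representative feature vector and compared with cosine similarity against the "second preset value". Both choices are illustrative assumptions, not the patent's specified method.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

def merge_categories(cat_features, threshold=0.85):
    """Merge categories whose representative features are at least
    `threshold` similar, so the same user's categories from different
    videos collapse into one."""
    merged = []
    for name, feat in cat_features.items():
        for group in merged:
            if cosine_similarity(feat, group["feat"]) >= threshold:
                group["members"].append(name)
                break
        else:
            merged.append({"feat": feat, "members": [name]})
    return [g["members"] for g in merged]

# Category 1 and category 3 (user 1 in two videos) merge; category 2 stays.
cats = {"category 1": [1.0, 0.0], "category 2": [0.0, 1.0], "category 3": [0.95, 0.05]}
print(merge_categories(cats))  # [['category 1', 'category 3'], ['category 2']]
```

A representative could be the mean embedding of a category's frames; comparing single frame pairs, as the patent's wording allows, works the same way with more comparisons.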
- the method may further include:
- the electronic device acquires at least one image group, and each image group includes multiple image frames of the same user in different forms.
- the at least one image group includes any one or a combination of any of the following: a moving image, a pre-photographed image group including different forms of the same user's face, an image formed by multiple frames of images collected in real time during the shooting preview Group, or an image group formed by multiple frames of images taken during continuous shooting.
- step 2302 may specifically include: the electronic device extracts multiple face image frames from at least one video and at least one image group.
- the image group in step 2305 and the video in step 2301 may be the reference image sets described in the foregoing embodiments of this application. That is, the electronic device can obtain multiple face image frames of the same user in different poses from one or more reference image sets, so that it can accurately group the face pictures according to those frames and reduce the dispersion of the clusters.
- an electronic device includes hardware structures and/or software modules corresponding to each function. With reference to the algorithm steps of the examples described in the embodiments disclosed herein, the present application can be implemented in the form of hardware or a combination of hardware and computer software. Whether a certain function is executed by hardware or by computer-software-driven hardware depends on the specific application and the design constraints of the technical solution. Those skilled in the art can use different methods to implement the described functions for each specific application in combination with the embodiments, but such implementations should not be considered beyond the scope of the present application.
- the embodiment of the present application may divide the electronic device into functional modules according to the foregoing method examples.
- each functional module may be divided corresponding to each function, or two or more functions may be integrated into one processing module.
- the above integrated modules can be implemented in the form of hardware. It should be noted that the division of modules in the embodiments of the present application is illustrative, and is only a logical function division, and there may be other division methods in actual implementation.
- FIG. 24 shows a schematic diagram of a possible composition of the electronic device 2400 involved in the foregoing embodiment.
- the electronic device 2400 may include: an acquiring unit 2401, extraction unit 2402, clustering unit 2403, display unit 2404, and so on.
- the obtaining unit 2401 may be used to support the electronic device 2400 to perform the foregoing step 2301, and/or other processes used in the technology described herein.
- the extracting unit 2402 may be used to support the electronic device 2400 to perform the foregoing step 2302, etc., and/or used in other processes of the technology described herein.
- the clustering unit 2403 may be used to support the electronic device 2400 to perform the above steps 2303, 2303A, 2303B, etc., and/or other processes of the technology described herein.
- the display unit 2404 may be used to support the electronic device 2400 to perform the above-mentioned steps 2304, etc., and/or used in other processes of the technology described herein.
- the electronic device provided in the embodiment of the present application is used to perform the above-mentioned grouping method for pictures, and therefore can achieve the same effect as the above-mentioned implementation method.
- the electronic device may include a processing module and a storage module.
- the processing module can be used to control and manage the actions of the electronic device. For example, it can be used to support the electronic device to execute the steps performed by the above-mentioned obtaining unit 2401, extraction unit 2402, clustering unit 2403, and display unit 2404.
- the storage module can be used to support electronic devices to store reference image sets such as face pictures and videos, moving pictures, and to store program codes and data.
- the electronic device may also include a communication module, which may be used to support communication between the electronic device and other devices.
- the processing module may be a processor or a controller. It can implement or execute various exemplary logical blocks, modules, and circuits described in conjunction with the disclosure of this application.
- the processor may also be a combination of computing functions, for example, a combination of one or more microprocessors, a combination of digital signal processing (DSP) and a microprocessor, etc.
- the storage module may be a memory.
- the communication module may specifically be a radio frequency circuit, a Bluetooth chip, a Wi-Fi chip, or another device that interacts with other electronic devices.
- the electronic device involved in the embodiment of the present application may be an electronic device having the structure shown in FIG. 1.
- the internal memory 121 shown in FIG. 1 may store computer program instructions, and when the instructions are executed by the processor 110, the electronic device can execute: acquire at least one video; extract multiple face image frames from the at least one video; Perform clustering processing on at least one face picture according to multiple face image frames; and according to the clustering processing result, display at least one group, and each group includes at least one face picture of a user.
- the electronic device can specifically execute: dividing the multiple face image frames into at least one category, each category corresponding to multiple face image frames of one user in different forms; performing clustering processing on the at least one face picture according to the category division result of the multiple face image frames; and the other steps in the foregoing method embodiments.
- the embodiment of the present application also provides a computer storage medium, the computer storage medium stores computer instructions, when the computer instructions run on the electronic device, the electronic device executes the above-mentioned related method steps to implement the picture grouping method in the above-mentioned embodiment .
- the embodiments of the present application also provide a computer program product.
- the computer program product runs on a computer, the computer is caused to execute the above-mentioned related steps, so as to realize the picture grouping method in the above-mentioned embodiment.
- the embodiments of the present application also provide a device.
- the device may specifically be a chip, component or module.
- the device may include a connected processor and a memory; where the memory is used to store computer execution instructions, and when the device is running, The processor can execute the computer-executable instructions stored in the memory, so that the chip executes the picture grouping methods in the foregoing method embodiments.
- the electronic device, computer storage medium, computer program product, or chip provided in the embodiments of the present application are all used to execute the corresponding methods provided above; therefore, for the beneficial effects they can achieve, refer to the beneficial effects of the corresponding methods provided above, which are not repeated here.
- the disclosed device and method can be implemented in other ways.
- the device embodiments described above are merely illustrative. For example, the division into modules or units is only a logical function division, and there may be other division methods in actual implementation; for example, multiple units or components may be combined or integrated into another device, or some features may be ignored or not implemented.
- the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.
- the units described as separate parts may or may not be physically separate.
- the parts displayed as a unit may be one physical unit or multiple physical units, that is, they may be located in one place or distributed to multiple different places. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
- each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
- the above-mentioned integrated unit can be realized in the form of hardware or software functional unit.
- the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a readable storage medium.
- the technical solutions of the embodiments of the present application, in essence, or the part contributing to the prior art, or all or part of the technical solutions, can be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions for causing a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to execute all or part of the steps of the methods described in the embodiments of the present application.
- the aforementioned storage media include: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, or other media that can store program code.
Abstract
A picture grouping method and device, relating to the technical field of electronics, used for grouping face pictures and capable of clustering the face pictures on an electronic device according to face images of different forms in a reference image set obtained by the electronic device, improving clustering accuracy. The method comprises: an electronic device obtains at least one video; extracts a plurality of face image frames from the at least one video; performs, according to the plurality of face image frames, clustering processing on at least one face picture obtained by the electronic device; and displays at least one group according to the clustering result, each group comprising at least one face picture of one user.
Description
Picture grouping method and device. This application claims priority to the Chinese patent application filed with the China National Intellectual Property Administration on February 27, 2019, with application number 201910147299.6 and entitled "Picture grouping method and device", the entire content of which is incorporated herein by reference.

Technical Field

The embodiments of the present application relate to the field of electronic technology, and in particular to a picture grouping method and device.

Background

With the continuous development of terminal technology, users take more and more pictures with mobile phones and other terminal devices; some users even store thousands of pictures on their mobile phones. Manually searching for a target picture among a large number of pictures, and classifying and managing a large number of pictures, often costs the user considerable time and effort.

With the advancement of facial feature extraction technology, clustering different face pictures by means of face information provides an effective picture clustering method, which makes it convenient for users to manage and find face pictures on a mobile phone.

Current clustering methods mainly use face detection algorithms to detect the faces and feature points in a picture (such as key points at the corners of the eyes, the tip of the nose, and the corners of the mouth), extract facial features, and use the facial features to cluster pictures. This method achieves high clustering accuracy for frontal face pictures but low accuracy for face pictures taken from other angles.

Summary

The embodiments of the present application provide a picture grouping method and device, which can cluster the face pictures stored on an electronic device according to face images of different forms in a reference image set obtained by the electronic device, improving clustering accuracy.

To achieve the foregoing objectives, the embodiments of this application adopt the following technical solutions:

In one aspect, an embodiment of the present application provides a picture grouping method, which can be applied to an electronic device that has obtained at least one face picture. The method includes: the electronic device obtains at least one video; the electronic device then extracts multiple face image frames from the at least one video; the electronic device performs clustering processing on the at least one face picture according to the multiple face image frames; afterwards, the electronic device displays at least one group according to the clustering result, each group including at least one face picture of one user.

In this way, the electronic device can use the multiple face image frames in the at least one video as prior information and cluster the face pictures according to those frames, so that the face pictures are grouped by user, face pictures of the same user are clustered into the same group, and the accuracy of face picture clustering and grouping is improved.
In a possible design, the electronic device performing clustering processing on the at least one face picture according to the multiple face image frames includes: the electronic device divides the multiple face image frames into at least one category, each category corresponding to multiple face image frames of one user in different forms; the electronic device then performs clustering processing on the at least one face picture according to the category division result of the multiple face image frames.

In this way, the electronic device can, according to the category division result, place a face picture into an already divided category or into a new group. When each category includes face images of the same user in different forms, the electronic device can accurately group face pictures of different forms, such as different face angles and expressions, according to the different forms of face images of different users, improving the accuracy of clustering and grouping and reducing the dispersion of clusters.

In another possible design, the electronic device dividing the multiple face image frames into at least one category includes: the electronic device separately divides the face image frames in each video into at least one category. If the similarity between the facial features of a first face image frame in a first category of the at least one category and the facial features of a second face image frame in a second category is greater than or equal to a preset value, the electronic device merges the first category and the second category into the same category.

That is to say, the electronic device can first divide the face image frames in the same video into categories, and then merge the categories containing highly similar face image frames from different videos, that is, merge the face image frames of the same user in different videos into the same category.

In another possible design, the electronic device separately dividing the face image frames in each video into at least one category includes: the electronic device, through a face tracking algorithm, divides the multiple face image frames of the same user that have temporal continuity in each video into the same category.

The face image frames of the same user that have temporal continuity may be adjacent image frames. For example, the face images in the same video tracked by the electronic device through the face tracking algorithm have temporal continuity, satisfy the must-link constraint, and belong to the same user's face, and therefore can be classified into the same category.
在另一种可能的设计中, 每个分组还包括以下任意一项或任意多项的组合: 用户的人脸 图像帧所在的视频, 用户的人脸图像帧所在的视频分段, 或用户的至少一个人脸图像帧。 In another possible design, each group also includes any one or a combination of any of the following: the video where the user’s face image frame is located, the video segment where the user’s face image frame is located, or the user’s At least one face image frame.
这样, 电子设备不仅可以对人脸图片进行分组, 还可以对视频、 视频分段和人脸图像帧 等进行分组, 并且联合管理人脸图片和视频、 视频分段以及人脸图像帧, 提高用户查找效率 和管理体验。 In this way, the electronic device can not only group face pictures, but also group videos, video segments and face image frames, etc., and jointly manage face pictures and videos, video segments, and face image frames to improve users Find efficiency and management experience.
在另一种可能的设计中,每个分组包括的一个用户的至少一张人脸图片为单人照或合影。 在另一种可能的设计中, 电子设备获取至少一个视频, 包括: 电子设备从电子设备的存 储区获取至少一个视频。 In another possible design, at least one face picture of a user included in each group is a single photo or a group photo. In another possible design, obtaining at least one video by the electronic device includes: the electronic device obtains at least one video from a storage area of the electronic device.
其中, 该至少一个视频可以是电子设备之前拍摄、 下载、 拷贝或通过其他方式获取到的 视频。 Wherein, the at least one video may be a video previously captured, downloaded, copied, or obtained by other means by the electronic device.
在另一种可能的设计中, 电子设备获取至少一个视频, 包括: 电子设备提示用户拍摄包 括人脸图像帧的视频。 电子设备在检测到用户指示拍摄视频的操作后, 录制并生成至少一个 视频。 In another possible design, the electronic device acquiring at least one video includes: the electronic device prompts the user to shoot a video including human face image frames. The electronic device records and generates at least one video after detecting the operation instructed by the user to shoot the video.
在该方案中, 电子设备可以实时录制一个视频, 以便用于人脸图片分组。 In this solution, the electronic device can record a video in real time for use in grouping face pictures.
在另一种可能的设计中, 该方法还包括: 电子设备获取至少一个图像组, 每个图像组中 包括同一用户不同形态的多个图像帧。至少一个图像组包括以下任意一项或任意多项的组合: 动图, 预先拍摄的包括同一用户不同形态的人脸的图像组, 在拍摄预览时实时采集的多帧图 像形成的图像组, 或在连拍时拍摄到的多帧图像形成的图像组。 电子设备从至少一个视频中 提取多个人脸图像帧, 包括: 电子设备从至少一个视频以及至少一个图像组中, 提取多个人 脸图像帧。 In another possible design, the method further includes: the electronic device acquires at least one image group, and each image group includes multiple image frames of the same user in different forms. At least one image group includes any one or a combination of any of the following: moving pictures, a pre-photographed image group that includes different forms of the same user's face, an image group formed by multiple frames of images collected in real time during shooting preview, or An image group formed by multiple frames of images taken during continuous shooting. The electronic device extracting multiple face image frames from at least one video includes: the electronic device extracts multiple face image frames from at least one video and at least one image group.
这样, 电子设备不仅可以根据视频, 还可以根据动图等多种包括用户不同形态的人脸图像帧的图像组, 对人脸图片进行分类。 In this way, the electronic device can classify face pictures not only according to videos, but also according to various image groups, such as moving pictures, that include face image frames of a user in different forms.
在另一种可能的设计中, 电子设备在检测到用户用于查看图像分类的操作后, 或者在检测到用户指示开启人脸分类的功能后, 根据多个人脸图像帧, 对至少一张人脸图片进行聚类处理; 并根据聚类处理结果, 显示至少一个分组, 每个分组分别包括一个用户的至少一张人脸图片。 In another possible design, after detecting a user operation for viewing image classification, or after detecting that the user instructs to enable the face classification function, the electronic device performs clustering processing on the at least one face picture according to the multiple face image frames; and according to the clustering result, displays at least one group, each group including at least one face picture of a user.
这样, 电子设备可以响应于用户的指示, 再显示人脸图片的分组结果。 In this way, the electronic device can display the grouping result of the face pictures in response to the user's instruction.
在另一种可能的设计中, 电子设备在打开相册后, 自动根据多个人脸图像帧, 对至少一张人脸图片进行聚类处理; 并根据聚类处理结果, 显示至少一个分组, 每个分组分别包括一个用户的至少一张人脸图片。 In another possible design, after the album is opened, the electronic device automatically performs clustering processing on the at least one face picture according to the multiple face image frames; and according to the clustering result, displays at least one group, each group including at least one face picture of a user.
在该方案中, 在打开相册后, 电子设备可以自动进行聚类和显示分组的处理。 In this solution, after opening the album, the electronic device can automatically perform clustering and display grouping processing.
在另一种可能的设计中, 电子设备在充电过程中, 电量高于预设电量值的情况下, 自动根据多个人脸图像帧, 对至少一张人脸图片进行聚类处理; 在打开相册后, 根据聚类处理结果, 显示至少一个分组, 每个分组分别包括一个用户的至少一张人脸图片。 In another possible design, during charging and when the battery level is higher than a preset value, the electronic device automatically performs clustering processing on the at least one face picture according to the multiple face image frames; after the album is opened, it displays, according to the clustering result, at least one group, each group including at least one face picture of a user.
在该方案中, 电子设备分别可以在不同时机自动进行聚类和显示分组的处理。 In this solution, the electronic device can automatically perform clustering and display grouping processing at different times.
在另一种可能的设计中, 电子设备在显示至少一个分组时, 还可以提示用户该分组是根 据视频中的人脸图像帧, 对人脸图片进行分组得到的。 In another possible design, when the electronic device displays at least one group, it may also prompt the user that the group is obtained by grouping face pictures according to face image frames in the video.
这样, 可以便于用户获知电子设备当前是根据视频进行人脸图片分组的。 In this way, it is convenient for the user to know that the electronic device currently groups the face pictures according to the video.
另一方面, 本申请实施例提供了一种图片分组方法, 应用于电子设备, 电子设备上保存有至少一个视频和至少一张人脸图片, 该方法包括: 电子设备在检测到用户用于查看图像分类的操作后, 显示至少一个分组。 其中, 每个分组分别包括一个用户的至少一张人脸图片, 以及以下任意一项或任意多项的组合: 用户的人脸图像帧所在的视频, 用户的人脸图像帧所在的视频分段, 或用户的至少一个人脸图像帧。 On another aspect, an embodiment of the present application provides a picture grouping method applied to an electronic device on which at least one video and at least one face picture are stored. The method includes: after detecting a user operation for viewing image classification, the electronic device displays at least one group. Each group includes at least one face picture of a user, plus any one or a combination of the following: the video in which the user's face image frames are located, the video segment in which the user's face image frames are located, or at least one face image frame of the user.
另一方面, 本申请实施例提供了一种图片分组方法, 应用于电子设备, 电子设备上保存有至少一张人脸图片, 该方法包括: 电子设备获取至少一个参考图像集, 参考图像集包括具有时间连续性的一系列的人脸图像帧。 而后, 电子设备根据人脸图像帧, 对至少一张人脸图片进行聚类处理。 之后, 电子设备可以根据聚类处理结果, 显示至少一个分组, 每个分组分别包括一个用户的至少一张人脸图片。 On another aspect, an embodiment of the present application provides a picture grouping method applied to an electronic device on which at least one face picture is stored. The method includes: the electronic device acquires at least one reference image set, where the reference image set includes a series of face image frames with temporal continuity. The electronic device then performs clustering processing on the at least one face picture according to the face image frames. After that, the electronic device may display, according to the clustering result, at least one group, each group including at least one face picture of a user.
在一种可能的设计中, 该参考图像集可以是视频中的人脸图像帧; 动图中的人脸图像帧; 或者在拍摄预览状态实时采集的具有时间连续性的多帧图像的集合, 在抓拍模式采集到的具有时间连续性的多帧图像的集合, 电子设备在连拍时拍摄到的具有时间连续性的多帧图像的集合; 或者用户预设的包括同一用户的不同形态的人脸的图像组等。 In a possible design, the reference image set may be: face image frames in a video; face image frames in a moving picture; a set of temporally continuous multi-frame images collected in real time in the shooting preview state; a set of temporally continuous multi-frame images collected in the capture mode; a set of temporally continuous multi-frame images shot by the electronic device during continuous shooting; or a user-preset image group that includes faces of the same user in different forms; and so on.
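The "series of face image frames with temporal continuity" described above can be sketched concretely. The following is a minimal, hypothetical Python illustration (not the patented implementation): per-frame face bounding boxes are linked into tracks when boxes in nearby frames overlap, so each track gathers different forms of one person's face from a video. The `iou` helper, the `max_gap` and `iou_thresh` thresholds, and the greedy matching rule are all illustrative assumptions.

```python
def iou(a, b):
    """Intersection-over-union of two (x, y, w, h) boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    x1, y1 = max(ax, bx), max(ay, by)
    x2, y2 = min(ax + aw, bx + bw), min(ay + ah, by + bh)
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    return inter / float(aw * ah + bw * bh - inter)

def build_face_tracks(detections, max_gap=2, iou_thresh=0.3):
    """Group per-frame face boxes into temporally continuous tracks.

    detections: list of (frame_index, box), box = (x, y, w, h).
    A detection joins an existing track when its frame index is within
    max_gap of the track's last frame AND its box overlaps the track's
    last box; otherwise it starts a new track. Each resulting track is
    one candidate "reference image set" for a single person.
    """
    tracks = []  # each track: list of (frame_index, box)
    for frame_idx, box in sorted(detections):
        for track in tracks:
            last_idx, last_box = track[-1]
            if frame_idx - last_idx <= max_gap and iou(box, last_box) >= iou_thresh:
                track.append((frame_idx, box))
                break
        else:
            tracks.append([(frame_idx, box)])
    return tracks
```

With toy detections, two temporally separated faces yield two tracks; in a real pipeline, the face crops along each track would then be embedded and used as one person's reference image set.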
另一方面, 本申请实施例提供了一种图片分组方法, 应用于电子设备, 电子设备上保存有至少一张图片, 该方法包括: 电子设备获取至少一个视频, 视频包括图像帧; 根据图像帧, 对至少一张图片进行聚类处理; 根据聚类处理结果, 显示至少一个分组, 每个分组分别包括一个实体的至少一张图片。 例如, 该实体可以包括人脸、 狗、 猫、 房子等。 On another aspect, an embodiment of the present application provides a picture grouping method applied to an electronic device on which at least one picture is stored. The method includes: the electronic device acquires at least one video, where the video includes image frames; performs clustering processing on the at least one picture according to the image frames; and displays, according to the clustering result, at least one group, each group including at least one picture of an entity. For example, the entity may be a human face, a dog, a cat, a house, and so on.
另一方面, 本申请实施例提供了一种图片分组装置, 该装置包含在电子设备中, 该装置具有实现上述方面及可能的实现方式中任一方法中电子设备行为的功能。 该功能可以通过硬件实现, 也可以通过硬件执行相应的软件实现。 硬件或软件包括至少一个与上述功能相对应的模块或单元。 例如, 获取模块或单元、 提取模块或单元、 聚类模块或单元以及显示模块或单元等。 On another aspect, an embodiment of the present application provides a picture grouping apparatus, which is included in an electronic device and has the function of implementing the behavior of the electronic device in any of the methods of the foregoing aspects and possible implementations. The function may be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes at least one module or unit corresponding to the foregoing function, for example, an acquiring module or unit, an extracting module or unit, a clustering module or unit, and a display module or unit.
又一方面, 本申请实施例提供了一种电子设备, 包括至少一个处理器和至少一个存储器。 该至少一个存储器与至少一个处理器耦合, 至少一个存储器用于存储计算机程序代码, 计算机程序代码包括计算机指令, 当至少一个处理器执行计算机指令时, 使得电子设备执行上述方面任一项可能的实现中的图片分组方法。 In another aspect, an embodiment of the present application provides an electronic device including at least one processor and at least one memory. The at least one memory is coupled to the at least one processor and is used to store computer program code, where the computer program code includes computer instructions. When the at least one processor executes the computer instructions, the electronic device is caused to perform the picture grouping method in any possible implementation of the foregoing aspects.
另一方面, 本申请实施例提供了一种计算机存储介质, 包括计算机指令, 当计算机指令在电子设备上运行时, 使得电子设备执行上述方面任一项可能的实现中的图片分组方法。 On another aspect, an embodiment of the present application provides a computer storage medium including computer instructions that, when run on an electronic device, cause the electronic device to execute the picture grouping method in any possible implementation of the foregoing aspects.
又一方面, 本申请实施例提供了一种计算机程序产品, 当计算机程序产品在计算机上运行时, 使得计算机执行上述方面任一项可能的实现中的图片分组方法。 In another aspect, an embodiment of the present application provides a computer program product that, when run on a computer, causes the computer to execute the picture grouping method in any possible implementation of the foregoing aspects.
附图说明 In another aspect, an embodiment of the present application provides a computer program product, which when the computer program product runs on a computer, causes the computer to execute the picture grouping method in any one of the possible implementations of the foregoing aspects. Description of the drawings
图 1为本申请实施例提供的一种电子设备的结构示意图; FIG. 1 is a schematic structural diagram of an electronic device provided by an embodiment of the application;
图 2为本申请实施例提供的一组界面示意图; Figure 2 is a schematic diagram of a set of interfaces provided by an embodiment of the application;
图 3为本申请实施例提供的一种界面示意图; FIG. 3 is a schematic diagram of an interface provided by an embodiment of the application;
图 4为本申请实施例提供的另一种界面示意图; FIG. 4 is a schematic diagram of another interface provided by an embodiment of the application;
图 5为本申请实施例提供的另一种界面示意图; Figure 5 is a schematic diagram of another interface provided by an embodiment of the application;
图 6为本申请实施例提供的另一种界面示意图; FIG. 6 is a schematic diagram of another interface provided by an embodiment of the application;
图 7A为本申请实施例提供的另一种界面示意图; FIG. 7A is a schematic diagram of another interface provided by an embodiment of the application;
图 7B为本申请实施例提供的一个视频及视频中的人脸图像帧的示意图; FIG. 7B is a schematic diagram of a video and a face image frame in the video provided by an embodiment of the application;
图 8A为本申请实施例提供的一种分类效果示意图; FIG. 8A is a schematic diagram of a classification effect provided by an embodiment of this application;
图 8B为本申请实施例提供的另一种分类效果示意图; FIG. 8B is a schematic diagram of another classification effect provided by an embodiment of the application;
图 9A为本申请实施例提供的另一种界面示意图; FIG. 9A is a schematic diagram of another interface provided by an embodiment of the application;
图 9B为本申请实施例提供的另一种界面示意图; FIG. 9B is a schematic diagram of another interface provided by an embodiment of the application;
图 9C为本申请实施例提供的另一种界面示意图; FIG. 9C is a schematic diagram of another interface provided by an embodiment of the application;
图 10为本申请实施例提供的另一组界面示意图; FIG. 10 is a schematic diagram of another set of interfaces provided by an embodiment of the application;
图 11为本申请实施例提供的另一组界面示意图; FIG. 11 is a schematic diagram of another set of interfaces provided by an embodiment of the application;
图 12为本申请实施例提供的另一组界面示意图; FIG. 12 is a schematic diagram of another set of interfaces provided by an embodiment of the application;
图 13为本申请实施例提供的另一组界面示意图; FIG. 13 is a schematic diagram of another set of interfaces provided by an embodiment of the application;
图 14为本申请实施例提供的另一组界面示意图; FIG. 14 is a schematic diagram of another set of interfaces provided by an embodiment of the application;
图 15为本申请实施例提供的另一种界面示意图; FIG. 15 is a schematic diagram of another interface provided by an embodiment of the application;
图 16为本申请实施例提供的另一组界面示意图; FIG. 16 is a schematic diagram of another set of interfaces provided by an embodiment of the application;
图 17为本申请实施例提供的一个图像组中的人脸图像帧的示意图; FIG. 17 is a schematic diagram of face image frames in an image group provided by an embodiment of the application;
图 18为本申请实施例提供的另一组界面示意图; FIG. 18 is a schematic diagram of another set of interfaces provided by an embodiment of the application;
图 19A为本申请实施例提供的另一种界面示意图; FIG. 19A is a schematic diagram of another interface provided by an embodiment of this application;
图 19B为本申请实施例提供的另一种界面示意图; FIG. 19B is a schematic diagram of another interface provided by an embodiment of the application;
图 20为本申请实施例提供的另一个图像组中的人脸图像帧的示意图; FIG. 20 is a schematic diagram of a face image frame in another image group provided by an embodiment of the application;
图 21为本申请实施例提供的另一种界面示意图; FIG. 21 is a schematic diagram of another interface provided by an embodiment of the application;
图 22为本申请实施例提供的另一组界面示意图; FIG. 22 is a schematic diagram of another set of interfaces provided by an embodiment of the application;
图 23为本申请实施例提供的一种图片分组方法流程图; FIG. 23 is a flowchart of a method for grouping pictures according to an embodiment of this application;
图 24为本申请实施例提供的另一种电子设备的结构示意图。 FIG. 24 is a schematic structural diagram of another electronic device provided by an embodiment of the application.
具体实施方式 Detailed Description
下面将结合本申请实施例中的附图, 对本申请实施例中的技术方案进行描述。 其中, 在本申请实施例的描述中, 除非另有说明, “/”表示或的意思, 例如, A/B可以表示 A或 B; 本文中的“和/或”仅仅是一种描述关联对象的关联关系, 表示可以存在三种关系, 例如, A 和/或 B, 可以表示: A, B, 以及 AB这三种情况。 另外, 在本申请实施例的描述中, “多个”是指两个或多于两个。 The technical solutions in the embodiments of the present application are described below with reference to the drawings in the embodiments of the present application. In the description of the embodiments of the present application, unless otherwise specified, "/" means "or"; for example, A/B may mean A or B. "And/or" in this document merely describes an association relationship between the associated objects, indicating that three relationships may exist; for example, "A and/or B" may mean three cases: A alone, B alone, and both A and B. In addition, in the description of the embodiments of the present application, "multiple" means two or more.
本申请实施例提供一种图片分组方法, 可以应用于电子设备。 电子设备可以根据参考图像集对电子设备上存储的人脸图片 (即包含人脸图像的图片)进行聚类。 参考图像集中包括具有时间连续性的多张不同形态的人脸图像。 其中, 这里的形态可以包括人脸的角度(例如侧脸、 仰脸或俯脸等), 人脸的表情(例如大笑、 大哭或搞怪表情等), 是否留胡子, 是否戴墨镜, 脸部是否被帽子遮挡, 脸部是否被头发遮挡等。 与参考图像集中的视频或图像组不同, 电子设备上存储的人脸图片是指独立存在的一张一张的静态图片。 The embodiments of the present application provide a picture grouping method that can be applied to an electronic device. The electronic device may cluster the face pictures (that is, pictures containing face images) stored on the electronic device according to a reference image set. The reference image set includes multiple temporally continuous face images in different forms. Here, the forms may include the angle of the face (for example, a side face, an upturned face, or a downturned face), the expression of the face (for example, laughing, crying, or a funny expression), whether there is a beard, whether sunglasses are worn, whether the face is blocked by a hat, whether the face is blocked by hair, and so on. Unlike the videos or image groups in the reference image set, the face pictures stored on the electronic device are static pictures that exist independently, one by one.
其中, 参考图像集可以包括电子设备获取的视频中具有时间连续性的一系列图像帧的集合。 例如, 该视频可以是电子设备的摄像头拍摄的视频, 电子设备从应用程序 (application, App) (例如抖音、 快手、 美拍、 YOYO炫舞等)获取的视频, 电子设备从其他设备获取到的视频, 或者视频通话过程中保存的视频等。 The reference image set may include a collection of a series of temporally continuous image frames in a video acquired by the electronic device. For example, the video may be a video shot by a camera of the electronic device, a video the electronic device obtained from an application (app) (for example, Douyin, Kuaishou, Meipai, or YOYO), a video obtained from another device, or a video saved during a video call.
参考图像集还可以包括电子设备获取的动图 ( Gif ), 动图中包括具有时间连续性的多帧 图像。 The reference image set may also include animated images (Gif) acquired by the electronic device, and the animated images include multiple frames of images with temporal continuity.
此外, 参考图像集还可以包括电子设备获取的具有时间连续性的一系列图像组成的图像组。 例如, 该图像组可以是电子设备在拍摄预览状态实时采集的具有时间连续性的多帧图像的集合。 再例如, 该图像组可以是电子设备在抓拍模式采集到的具有时间连续性的多帧图像的集合 (电子设备或用户可以指定其中一张图像为抓拍获得的图像)。 又例如, 该图像组可以是电子设备在连拍时拍摄到的具有时间连续性的多帧图像的集合。 再例如, 该图像组可以是用户预设的包括同一用户的不同形态的人脸的图像组 (例如预先拍摄的同一用户的正面人脸图像、 侧面人脸图像、 大笑的人脸图像等组成的图像组)等中的一种或多种。 In addition, the reference image set may also include an image group composed of a series of temporally continuous images acquired by the electronic device. For example, the image group may be a set of temporally continuous multi-frame images collected in real time by the electronic device in the shooting preview state. For another example, the image group may be a set of temporally continuous multi-frame images collected by the electronic device in the capture mode (the electronic device or the user may designate one of the images as the captured image). For another example, the image group may be a set of temporally continuous multi-frame images shot by the electronic device during continuous shooting. For yet another example, the image group may be an image group preset by the user that includes faces of the same user in different forms (for example, an image group composed of a pre-photographed frontal face image, side face image, and laughing face image of the same user); the reference image set may include one or more of the foregoing.
由于参考图像集中通常包括同一用户的多种不同形态的人脸图像, 因而电子设备可以将参考图像集中的人脸图像作为先验信息, 根据参考图像集中不同形态的人脸图像, 对电子设备上存储的图片进行聚类处理, 使得不同形态的人脸图片也能够准确聚类, 提高人脸图片的聚类精度。 Since the reference image set usually includes face images of the same user in multiple different forms, the electronic device can use the face images in the reference image set as prior information and, according to the face images in different forms in the reference image set, perform clustering processing on the pictures stored on the electronic device, so that face pictures in different forms can also be clustered accurately, improving the clustering accuracy of face pictures.
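As a rough sketch of how such prior information could be used (a hypothetical illustration only, not the implementation claimed here): each video track contributes several embeddings of one person in different forms, and a stored photo joins that person's group if it is close to any frame of the track, so a side-face photo can match via a side-face frame even when it is far from the frontal frame. The toy 2-D embedding vectors and the 0.8 cosine threshold are assumptions for illustration.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def cluster_with_video_prior(photo_embs, video_tracks, thresh=0.8):
    """Assign stored photos to person groups using video tracks as priors.

    video_tracks: list of tracks; each track is a list of embeddings of
    temporally continuous frames known to show ONE person in different
    forms. A photo joins group i if it matches ANY frame of track i,
    which is what lets varied poses in a video pull matching photos
    into the same group. Unmatched photos are returned separately.
    """
    groups = [[] for _ in video_tracks]
    leftovers = []
    for idx, p in enumerate(photo_embs):
        best, best_sim = None, thresh
        for g, track in enumerate(video_tracks):
            sim = max(cosine(p, f) for f in track)
            if sim > best_sim:
                best, best_sim = g, sim
        (groups[best] if best is not None else leftovers).append(idx)
    return groups, leftovers
```

In the toy run below, the photo `[0.68, 0.73]` (a "side-face" embedding) is far from the track's frontal frame `[1, 0]` but matches its side-face frame `[0.7, 0.7]`, so it still lands in that person's group.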
其中, 该电子设备可以是手机、 平板电脑、 可穿戴设备、 车载设备、 增强现实 (augmented reality, AR)/虚拟现实 (virtual reality, VR)设备、 笔记本电脑、 超级移动个人计算机 (ultra-mobile personal computer, UMPC)、 上网本、 个人数字助理 (personal digital assistant, PDA)等电子设备, 本申请实施例对电子设备的具体类型不作任何限制。 The electronic device may be a mobile phone, a tablet computer, a wearable device, a vehicle-mounted device, an augmented reality (AR)/virtual reality (VR) device, a notebook computer, an ultra-mobile personal computer (UMPC), a netbook, a personal digital assistant (PDA), or another electronic device; the embodiments of this application do not impose any restriction on the specific type of the electronic device.
示例性的, 图 1示出了电子设备 100的一种结构示意图。 电子设备 100可以包括处理器 110,外部存储器接口 120,内部存储器 121 ,通用串行总线 (universal serial bus, USB)接口 130, 充电管理模块 140, 电源管理模块 141 , 电池 142, 天线 1 , 天线 2, 移动通信模块 150, 无线 通信模块 160, 音频模块 170, 扬声器 170A, 受话器 170B , 麦克风 170C, 耳机接口 170D, 传感器模块 180, 按键 190, 马达 191 , 指示器 192, 摄像头 193 , 显示屏 194, 以及用户标 识模块 (subscriber identification module, SIM)卡接口 195等。 其中传感器模块 180可以包括 压力传感器 180A, 陀螺仪传感器 180B , 气压传感器 180C, 磁传感器 180D, 加速度传感器 180E, 距离传感器 180F, 接近光传感器 180G, 指纹传感器 180H, 温度传感器 180J, 触摸传 感器 180K, 环境光传感器 180L, 骨传导传感器 180M等。 Exemplarily, FIG. 1 shows a schematic structural diagram of an electronic device 100. The electronic device 100 may include a processor 110, an external memory interface 120, an internal memory 121, a universal serial bus (USB) interface 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, and an antenna 2. , Mobile communication module 150, wireless communication module 160, audio module 170, speaker 170A, receiver 170B, microphone 170C, earphone jack 170D, sensor module 180, button 190, motor 191, indicator 192, camera 193, display 194, and Subscriber identification module (subscriber identification module, SIM) card interface 195, etc. The sensor module 180 may include pressure sensor 180A, gyroscope sensor 180B, air pressure sensor 180C, magnetic sensor 180D, acceleration sensor 180E, distance sensor 180F, proximity light sensor 180G, fingerprint sensor 180H, temperature sensor 180J, touch sensor 180K, ambient light Sensor 180L, bone conduction sensor 180M, etc.
可以理解的是, 本申请实施例示意的结构并不构成对电子设备 100的具体限定。 在本申 请另一些实施例中, 电子设备 100可以包括比图示更多或更少的部件, 或者组合某些部件, 或者拆分某些部件, 或者不同的部件布置。 图示的部件可以以硬件, 软件或软件和硬件的组 合实现。 It can be understood that the structure illustrated in the embodiment of the present application does not constitute a specific limitation on the electronic device 100. In other embodiments of the present application, the electronic device 100 may include more or fewer components than shown, or combine certain components, or disassemble certain components, or arrange different components. The components shown in the figure can be implemented in hardware, software or a combination of software and hardware.
处理器 110 可以包括一个或多个处理单元, 例如: 处理器 110 可以包括应用处理器 (application processor, AP), 调制解调处理器, 图形处理器 (graphics processing unit, GPU), 图像信号处理器 (image signal processor, ISP), 控制器, 存储器, 视频编解码器, 数字信号处理器 (digital signal processor, DSP), 基带处理器, 和/或神经网络处理器 (neural-network processing unit, NPU)等。 其中, 不同的处理单元可以是独立的器件, 也可以集成在一个或多个处理器中。 The processor 110 may include one or more processing units. For example, the processor 110 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a memory, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU). Different processing units may be independent devices, or may be integrated into one or more processors.
其中, 控制器可以是电子设备 100的神经中枢和指挥中心。 控制器可以根据指令操作码 和时序信号, 产生操作控制信号, 完成取指令和执行指令的控制。 Wherein, the controller can be the nerve center and command center of the electronic device 100. The controller can generate operation control signals according to the command operation code and timing signals to complete the control of fetching and executing commands.
处理器 110中还可以设置存储器, 用于存储指令和数据。 在一些实施例中, 处理器 110 中的存储器为高速缓冲存储器。 该存储器可以保存处理器 110刚用过或循环使用的指令或数 据。如果处理器 110需要再次使用该指令或数据,可从存储器中直接调用。避免了重复存取, 减少了处理器 110的等待时间, 因而提高了系统的效率。 A memory may also be provided in the processor 110 to store instructions and data. In some embodiments, the memory in the processor 110 is a cache memory. The memory can store instructions or data that the processor 110 has just used or used cyclically. If the processor 110 needs to use the instruction or data again, it can be called directly from the memory. Repeated access is avoided, the waiting time of the processor 110 is reduced, and the efficiency of the system is improved.
在一些实施例中, 处理器 110 可以包括一个或多个接口。 接口可以包括集成电路 (inter-integrated circuit, I2C)接口, 集成电路内置音频 (inter-integrated circuit sound, I2S)接口, 脉冲编码调制 (pulse code modulation, PCM)接口,通用异步收发传输器 (universal asynchronous receiver/transmitter , UART)接口, 移动产业处理器接口 (mobile industry processor interface, MIPI) , 通用输入输出 (general-purpose input/output , GPIO)接口, 用户标识模块 (subscriber identity module, SIM)接口, 和 /或通用串行总线 (universal serial bus, USB)接口等。 In some embodiments, the processor 110 may include one or more interfaces. The interface may include an integrated circuit (inter-integrated circuit, I2C) interface, an integrated circuit audio (inter-integrated circuit sound, I2S) interface, a pulse code modulation (PCM) interface, and a universal asynchronous transmitter (universal asynchronous transmitter) interface. receiver/transmitter, UART) interface, mobile industry processor interface (MIPI), general-purpose input/output (GPIO) interface, subscriber identity module (SIM) interface, and / Or universal serial bus (USB) interface, etc.
I2C接口是一种双向同步串行总线, 包括一根串行数据线 (serial data line, SDA)和一根串 行时钟线 (derail clock line, SCL)。 在一些实施例中, 处理器 110可以包含多组 I2C总线。 处 理器 110可以通过不同的 I2C总线接口分别耦合触摸传感器 180K, 充电器, 闪光灯, 摄像头 193等。 例如: 处理器 110可以通过 I2C接口耦合触摸传感器 180K, 使处理器 110与触摸传 感器 180K通过 I2C总线接口通信, 实现电子设备 100的触摸功能。 The I2C interface is a two-way synchronous serial bus that includes a serial data line (SDA) and a derail clock line (SCL). In some embodiments, the processor 110 may include multiple sets of I2C buses. The processor 110 may be coupled to the touch sensor 180K, charger, flash, camera 193, etc., through different I2C bus interfaces. For example, the processor 110 may couple the touch sensor 180K through the I2C interface, so that the processor 110 and the touch sensor 180K communicate through the I2C bus interface to realize the touch function of the electronic device 100.
I2S接口可以用于音频通信。 在一些实施例中, 处理器 110可以包含多组 I2S总线。 处理 器 110可以通过 I2S总线与音频模块 170耦合,实现处理器 110与音频模块 170之间的通信。 在一些实施例中, 音频模块 170可以通过 I2S接口向无线通信模块 160传递音频信号, 实现 通过蓝牙耳机接听电话的功能。 The I2S interface can be used for audio communication. In some embodiments, the processor 110 may include multiple sets of I2S buses. The processor 110 may be coupled with the audio module 170 through an I2S bus to implement communication between the processor 110 and the audio module 170. In some embodiments, the audio module 170 may transmit audio signals to the wireless communication module 160 through the I2S interface, so as to realize the function of answering calls through the Bluetooth headset.
PCM接口也可以用于音频通信, 将模拟信号抽样, 量化和编码。 在一些实施例中, 音频 模块 170与无线通信模块 160可以通过 PCM总线接口耦合。 The PCM interface can also be used for audio communication to sample, quantize and encode analog signals. In some embodiments, the audio module 170 and the wireless communication module 160 may be coupled through a PCM bus interface.
在一些实施例中,音频模块 170也可以通过 PCM接口向无线通信模块 160传递音频信号, 实现通过蓝牙耳机接听电话的功能。 I2S接口和 PCM接口都可以用于音频通信。 In some embodiments, the audio module 170 may also transmit audio signals to the wireless communication module 160 through the PCM interface, so as to realize the function of answering calls through the Bluetooth headset. Both I2S interface and PCM interface can be used for audio communication.
UART接口是一种通用串行数据总线, 用于异步通信。 该总线可以为双向通信总线。 它 将要传输的数据在串行通信与并行通信之间转换。 The UART interface is a universal serial data bus used for asynchronous communication. The bus can be a two-way communication bus. It converts the data to be transmitted between serial communication and parallel communication.
在一些实施例中, UART接口通常被用于连接处理器 110与无线通信模块 160。 例如: 处 理器 110通过 UART接口与无线通信模块 160中的蓝牙模块通信, 实现蓝牙功能。 在一些实 施例中, 音频模块 170可以通过 UART接口向无线通信模块 160传递音频信号, 实现通过蓝 牙耳机播放音乐的功能。 In some embodiments, the UART interface is generally used to connect the processor 110 and the wireless communication module 160. For example, the processor 110 communicates with the Bluetooth module in the wireless communication module 160 through the UART interface to realize the Bluetooth function. In some embodiments, the audio module 170 can transmit audio signals to the wireless communication module 160 through the UART interface, so as to realize the function of playing music through the Bluetooth headset.
MIPI接口可以被用于连接处理器 110与显示屏 194, 摄像头 193等外围器件。 MIPI接口 包括摄像头串行接口 (camera serial interface, CSI) , 显示屏串行接口 (display serial interface, DSI)等。 在一些实施例中, 处理器 110和摄像头 193通过 CSI接口通信, 实现电子设备 100 的拍摄功能。 处理器 110和显示屏 194通过 DSI接口通信, 实现电子设备 100的显示功能。 The MIPI interface can be used to connect the processor 110 with the display screen 194, the camera 193 and other peripheral devices. MIPI interface includes camera serial interface (CSI), display serial interface (DSI) and so on. In some embodiments, the processor 110 and the camera 193 communicate through a CSI interface to implement the shooting function of the electronic device 100. The processor 110 and the display screen 194 communicate through the DSI interface to realize the display function of the electronic device 100.
GPIO接口可以通过软件配置。 GPIO接口可以被配置为控制信号, 也可被配置为数据信号。 在一些实施例中, GPIO接口可以用于连接处理器 110与摄像头 193, 显示屏 194, 无线通信模块 160, 音频模块 170, 传感器模块 180等。 GPIO接口还可以被配置为 I2C接口, I2S接口, UART接口, MIPI接口等。 The GPIO interface can be configured through software. The GPIO interface can be configured as a control signal or as a data signal. In some embodiments, the GPIO interface may be used to connect the processor 110 with the camera 193, the display screen 194, the wireless communication module 160, the audio module 170, the sensor module 180, and so on. The GPIO interface can also be configured as an I2C interface, an I2S interface, a UART interface, a MIPI interface, and so on.
USB接口 130是符合 USB标准规范的接口, 具体可以是 Mini USB接口, Micro USB接 口, USB Type C接口等。 USB接口 130可以用于连接充电器为电子设备 100充电, 也可以用 于电子设备 100与外围设备之间传输数据。 也可以用于连接耳机, 通过耳机播放音频。 该接 口还可以用于连接其他电子设备, 例如 AR设备等。 The USB interface 130 is an interface that complies with the USB standard specification, and specifically may be a Mini USB interface, a Micro USB interface, a USB Type C interface, and so on. The USB interface 130 can be used to connect a charger to charge the electronic device 100, and can also be used to transfer data between the electronic device 100 and peripheral devices. It can also be used to connect headphones and play audio through headphones. This interface can also be used to connect other electronic devices, such as AR devices.
可以理解的是, 本申请实施例示意的各模块间的接口连接关系, 只是示意性说明, 并不 构成对电子设备 100的结构限定。 在本申请另一些实施例中, 电子设备 100也可以采用上述 实施例中不同的接口连接方式, 或多种接口连接方式的组合。 It can be understood that the interface connection relationship between the modules illustrated in the embodiment of the present application is merely illustrative, and does not constitute a structural limitation of the electronic device 100. In other embodiments of the present application, the electronic device 100 may also adopt different interface connection modes in the foregoing embodiments, or a combination of multiple interface connection modes.
充电管理模块 140用于从充电器接收充电输入。 其中, 充电器可以是无线充电器, 也可 以是有线充电器。 在一些有线充电的实施例中, 充电管理模块 140可以通过 USB接口 130接 收有线充电器的充电输入。 在一些无线充电的实施例中, 充电管理模块 140可以通过电子设 备 100的无线充电线圈接收无线充电输入。 充电管理模块 140为电池 142充电的同时, 还可 以通过电源管理模块 141为电子设备供电。 The charging management module 140 is used to receive charging input from the charger. Among them, the charger can be a wireless charger or a wired charger. In some wired charging embodiments, the charging management module 140 may receive the charging input of the wired charger through the USB interface 130. In some embodiments of wireless charging, the charging management module 140 may receive the wireless charging input through the wireless charging coil of the electronic device 100. While the charging management module 140 charges the battery 142, the power management module 141 can also supply power to electronic devices.
电源管理模块 141用于连接电池 142, 充电管理模块 140与处理器 110。 电源管理模块 141接收电池 142和 /或充电管理模块 140的输入, 为处理器 110, 内部存储器 121 , 外部存储 器, 显示屏 194, 摄像头 193 , 和无线通信模块 160等供电。 电源管理模块 141还可以用于监 测电池容量, 电池循环次数, 电池健康状态(漏电, 阻抗)等参数。 The power management module 141 is used to connect the battery 142, the charging management module 140 and the processor 110. The power management module 141 receives input from the battery 142 and/or the charge management module 140, and supplies power to the processor 110, the internal memory 121, the external memory, the display screen 194, the camera 193, and the wireless communication module 160. The power management module 141 can also be used to monitor parameters such as battery capacity, battery cycle times, and battery health status (leakage, impedance).
在其他一些实施例中, 电源管理模块 141也可以设置于处理器 110中。 在另一些实施例 中, 电源管理模块 141和充电管理模块 140也可以设置于同一个器件中。 In some other embodiments, the power management module 141 may also be provided in the processor 110. In other embodiments, the power management module 141 and the charging management module 140 may also be provided in the same device.
电子设备 100的无线通信功能可以通过天线 1, 天线 2, 移动通信模块 150, 无线通信模 块 160, 调制解调处理器以及基带处理器等实现。 The wireless communication function of the electronic device 100 can be implemented by the antenna 1, the antenna 2, the mobile communication module 150, the wireless communication module 160, the modem processor, and the baseband processor.
天线 1和天线 2用于发射和接收电磁波信号。 电子设备 100中的每个天线可用于覆盖单 个或多个通信频带。 不同的天线还可以复用, 以提高天线的利用率。 例如: 可以将天线 1复 用为无线局域网的分集天线。 在另外一些实施例中, 天线可以和调谐开关结合使用。 Antenna 1 and Antenna 2 are used to transmit and receive electromagnetic wave signals. Each antenna in the electronic device 100 can be used to cover a single or multiple communication frequency bands. Different antennas can also be reused to improve antenna utilization. For example: Antenna 1 can be multiplexed as a diversity antenna for wireless LAN. In other embodiments, the antenna can be used in combination with a tuning switch.
移动通信模块 150可以提供应用在电子设备 100上的包括 2G/3G/4G/5G等无线通信的解 决方案。 移动通信模块 150可以包括至少一个滤波器, 开关, 功率放大器, 低噪声放大器(low noise amplifier, LNA)等。 移动通信模块 150可以由天线 1接收电磁波, 并对接收的电磁波进 行滤波, 放大等处理, 传送至调制解调处理器进行解调。 移动通信模块 150还可以对经调制 解调处理器调制后的信号放大, 经天线 1转为电磁波辐射出去。 The mobile communication module 150 can provide a wireless communication solution including 2G/3G/4G/5G and the like applied to the electronic device 100. The mobile communication module 150 may include at least one filter, a switch, a power amplifier, a low noise amplifier (LNA), etc. The mobile communication module 150 can receive electromagnetic waves by the antenna 1, and perform processing such as filtering and amplifying the received electromagnetic waves, and then transmitting them to the modem processor for demodulation. The mobile communication module 150 can also amplify the signal modulated by the modem processor, and convert it into electromagnetic waves through the antenna 1 for radiation.
In some embodiments, at least some functional modules of the mobile communication module 150 may be disposed in the processor 110. In some embodiments, at least some functional modules of the mobile communication module 150 and at least some modules of the processor 110 may be disposed in the same device.
The modem processor may include a modulator and a demodulator. The modulator is used to modulate a to-be-sent low-frequency baseband signal into a medium- or high-frequency signal. The demodulator is used to demodulate a received electromagnetic wave signal into a low-frequency baseband signal, and then transmits the demodulated low-frequency baseband signal to the baseband processor for processing. After being processed by the baseband processor, the low-frequency baseband signal is passed to the application processor. The application processor outputs a sound signal through an audio device (not limited to the speaker 170A, the receiver 170B, and the like), or displays an image or a video through the display screen 194. In some embodiments, the modem processor may be an independent device. In other embodiments, the modem processor may be independent of the processor 110 and disposed in the same device as the mobile communication module 150 or other functional modules.
The wireless communication module 160 can provide wireless communication solutions applied to the electronic device 100, including a wireless local area network (WLAN) (such as a wireless fidelity (Wi-Fi) network), Bluetooth (BT), a global navigation satellite system (GNSS), frequency modulation (FM), near field communication (NFC), infrared (IR) technology, and the like. The wireless communication module 160 may be one or more devices integrating at least one communication processing module. The wireless communication module 160 receives electromagnetic waves via the antenna 2, performs frequency modulation and filtering on the electromagnetic wave signals, and sends the processed signals to the processor 110. The wireless communication module 160 may also receive a to-be-sent signal from the processor 110, perform frequency modulation and amplification on it, and convert it into an electromagnetic wave for radiation through the antenna 2.
In some embodiments, the antenna 1 of the electronic device 100 is coupled to the mobile communication module 150, and the antenna 2 is coupled to the wireless communication module 160, so that the electronic device 100 can communicate with a network and other devices through wireless communication technologies. The wireless communication technologies may include global system for mobile communications (GSM), general packet radio service (GPRS), code division multiple access (CDMA), wideband code division multiple access (WCDMA), time-division code division multiple access (TD-SCDMA), long term evolution (LTE), BT, GNSS, WLAN, NFC, FM, and/or IR technologies. The GNSS may include the global positioning system (GPS), the global navigation satellite system (GLONASS), the BeiDou navigation satellite system (BDS), the quasi-zenith satellite system (QZSS), and/or satellite based augmentation systems (SBAS).
The electronic device 100 implements a display function through the GPU, the display screen 194, the application processor, and the like. The GPU is a microprocessor for image processing, connecting the display screen 194 and the application processor. The GPU is used to perform mathematical and geometric computation for graphics rendering. The processor 110 may include one or more GPUs, which execute program instructions to generate or change display information.
The display screen 194 is used to display images, videos, and the like. The display screen 194 includes a display panel. The display panel may adopt a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), a Mini-LED, a Micro-LED, a Micro-OLED, quantum dot light emitting diodes (QLED), or the like. In some embodiments, the electronic device 100 may include one or N display screens 194, where N is a positive integer greater than 1.
The electronic device 100 can implement a shooting function through the ISP, the camera 193, the video codec, the GPU, the display screen 194, the application processor, and the like.
The ISP is used to process data fed back by the camera 193. For example, when a photo is taken, the shutter is opened and light is transmitted through the lens to the photosensitive element of the camera; the light signal is converted into an electrical signal, and the photosensitive element of the camera passes the electrical signal to the ISP for processing, which converts it into an image visible to the naked eye. The ISP can also perform algorithm optimization on image noise, brightness, and skin color, and can optimize parameters such as exposure and color temperature of the shooting scene. In some embodiments, the ISP may be disposed in the camera 193.
The camera 193 is used to capture still images or videos. An optical image of an object is generated through the lens and projected onto the photosensitive element. The photosensitive element may be a charge coupled device (CCD) or a complementary metal-oxide-semiconductor (CMOS) phototransistor. The photosensitive element converts the light signal into an electrical signal, and then passes the electrical signal to the ISP to be converted into a digital image signal. The ISP outputs the digital image signal to the DSP for processing. The DSP converts the digital image signal into an image signal in a standard format such as RGB or YUV. In some embodiments, the electronic device 100 may include one or N cameras 193, where N is a positive integer greater than 1.
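As an illustration of the standard-format conversion mentioned above, the following sketch converts one RGB pixel to YUV using the widely used BT.601 full-range coefficients. The coefficients come from the public BT.601 definition, and the function name is illustrative; this application does not prescribe a specific conversion matrix.

```python
def rgb_to_yuv(r: float, g: float, b: float):
    """Convert one 8-bit RGB pixel to full-range YUV (BT.601 coefficients).

    Illustrative sketch only; a DSP may use other matrices or ranges.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b          # luma
    u = -0.14713 * r - 0.28886 * g + 0.436 * b     # blue-difference chroma
    v = 0.615 * r - 0.51499 * g - 0.10001 * b      # red-difference chroma
    return y, u, v
```

For a pure white pixel (255, 255, 255), the luma is 255 and both chroma components are approximately zero, which is a quick sanity check for the matrix.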
The digital signal processor is used to process digital signals; in addition to digital image signals, it can also process other digital signals. For example, when the electronic device 100 selects a frequency point, the digital signal processor is used to perform Fourier transform and the like on the frequency point energy.
The video codec is used to compress or decompress digital video. The electronic device 100 may support one or more video codecs. In this way, the electronic device 100 can play or record videos in multiple encoding formats, such as moving picture experts group (MPEG)-1, MPEG-2, MPEG-3, and MPEG-4.
The NPU is a neural-network (NN) computing processor. By drawing on the structure of biological neural networks, for example the transfer mode between neurons in the human brain, it processes input information quickly and can also continuously self-learn. Applications such as intelligent cognition of the electronic device 100 can be implemented through the NPU, for example image recognition, face recognition, speech recognition, and text understanding.
In the embodiments of the present application, the NPU or another processor may be used to perform operations such as face detection, face tracking, facial feature extraction, and image clustering on the face images in the videos stored by the electronic device 100; to perform operations such as face detection and facial feature extraction on the face images in the pictures stored by the electronic device 100; and to cluster the pictures stored by the electronic device 100 according to the facial features of the pictures and the clustering results of the face images in the videos.
The external memory interface 120 may be used to connect an external memory card, such as a Micro SD card, to expand the storage capacity of the electronic device 100. The external memory card communicates with the processor 110 through the external memory interface 120 to implement a data storage function, for example, saving files such as music and videos in the external memory card.
The internal memory 121 may be used to store computer-executable program code, where the executable program code includes instructions. The processor 110 executes various functional applications and data processing of the electronic device 100 by running the instructions stored in the internal memory 121. The internal memory 121 may include a program storage area and a data storage area. The program storage area can store an operating system and an application program required by at least one function (such as a sound playback function or an image playback function). The data storage area can store data created during use of the electronic device 100 (such as audio data and a phone book).
In addition, the internal memory 121 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or a universal flash storage (UFS).
The electronic device 100 can implement audio functions, such as music playback and recording, through the audio module 170, the speaker 170A, the receiver 170B, the microphone 170C, the headset interface 170D, the application processor, and the like.
The audio module 170 is used to convert digital audio information into an analog audio signal for output, and also to convert an analog audio input into a digital audio signal. The audio module 170 can also be used to encode and decode audio signals. In some embodiments, the audio module 170 may be disposed in the processor 110, or some functional modules of the audio module 170 may be disposed in the processor 110.
The speaker 170A, also called a "horn", is used to convert an audio electrical signal into a sound signal. The electronic device 100 can be used to listen to music or a hands-free call through the speaker 170A.
The receiver 170B, also called an "earpiece", is used to convert an audio electrical signal into a sound signal. When the electronic device 100 answers a call or a voice message, the voice can be heard by bringing the receiver 170B close to the ear.
The microphone 170C, also called a "mike" or "mic", is used to convert a sound signal into an electrical signal. When making a call or sending a voice message, the user can speak with the mouth close to the microphone 170C to input the sound signal into the microphone 170C. The electronic device 100 may be provided with at least one microphone 170C. In other embodiments, the electronic device 100 may be provided with two microphones 170C, which, in addition to collecting sound signals, can also implement a noise reduction function. In still other embodiments, the electronic device 100 may be provided with three, four, or more microphones 170C to collect sound signals, reduce noise, identify sound sources, implement a directional recording function, and so on.
The headset interface 170D is used to connect a wired headset. The headset interface 170D may be the USB interface 130, a 3.5 mm open mobile terminal platform (OMTP) standard interface, or a cellular telecommunications industry association of the USA (CTIA) standard interface.
The pressure sensor 180A is used to sense a pressure signal, and can convert the pressure signal into an electrical signal. In some embodiments, the pressure sensor 180A may be disposed on the display screen 194. There are many types of pressure sensors 180A, such as resistive, inductive, and capacitive pressure sensors. A capacitive pressure sensor may include at least two parallel plates made of conductive material. When a force acts on the pressure sensor 180A, the capacitance between the electrodes changes, and the electronic device 100 determines the intensity of the pressure according to the change in capacitance. When a touch operation acts on the display screen 194, the electronic device 100 detects the intensity of the touch operation according to the pressure sensor 180A. The electronic device 100 may also calculate the touched position according to the detection signal of the pressure sensor 180A.
In some embodiments, touch operations acting on the same touch position but with different touch operation intensities may correspond to different operation instructions. For example, when a touch operation whose intensity is less than a first pressure threshold acts on the short message application icon, an instruction to view the short message is executed; when a touch operation whose intensity is greater than or equal to the first pressure threshold acts on the short message application icon, an instruction to create a new short message is executed.
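The two-threshold behavior described above can be sketched as follows. The concrete threshold value, the function name, and the instruction strings are illustrative assumptions, not values specified by this application:

```python
# Assumed normalized pressure scale in [0, 1]; the application does not
# specify a concrete first pressure threshold.
FIRST_PRESSURE_THRESHOLD = 0.5

def handle_sms_icon_touch(pressure: float) -> str:
    """Map the intensity of a touch on the SMS app icon to an instruction."""
    if pressure < FIRST_PRESSURE_THRESHOLD:
        return "view_sms"     # light press: view the short message
    return "compose_sms"      # press at or above the threshold: new message
```

The same pattern generalizes to any icon whose light press and deep press trigger different instructions.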
The gyroscope sensor 180B may be used to determine the motion posture of the electronic device 100. In some embodiments, the angular velocities of the electronic device 100 around three axes (that is, the x, y, and z axes) can be determined through the gyroscope sensor 180B. The gyroscope sensor 180B can be used for image stabilization during shooting. For example, when the shutter is pressed, the gyroscope sensor 180B detects the shake angle of the electronic device 100 and calculates, based on the angle, the distance that the lens module needs to compensate, allowing the lens to counteract the shake of the electronic device 100 through reverse movement to achieve image stabilization. The gyroscope sensor 180B can also be used for navigation and somatosensory game scenarios.
The air pressure sensor 180C is used to measure air pressure. In some embodiments, the electronic device 100 calculates altitude from the air pressure value measured by the air pressure sensor 180C, to assist positioning and navigation.
The magnetic sensor 180D includes a Hall sensor. The electronic device 100 can use the magnetic sensor 180D to detect the opening and closing of a flip leather case. In some embodiments, when the electronic device 100 is a flip phone, the electronic device 100 can detect the opening and closing of the flip cover according to the magnetic sensor 180D, and then set features such as automatic unlocking upon flip-open according to the detected opening/closing state of the leather case or the flip cover.
The acceleration sensor 180E can detect the magnitude of acceleration of the electronic device 100 in various directions (generally along three axes). When the electronic device 100 is stationary, the magnitude and direction of gravity can be detected. It can also be used to identify the posture of the electronic device, and is applied to landscape/portrait switching, pedometers, and other applications.
The distance sensor 180F is used to measure distance. The electronic device 100 can measure distance by infrared or laser. In some embodiments, in a shooting scene, the electronic device 100 may use the distance sensor 180F to measure distance to achieve fast focusing.
The proximity light sensor 180G may include, for example, a light emitting diode (LED) and a light detector such as a photodiode. The light emitting diode may be an infrared light emitting diode. The electronic device 100 emits infrared light outward through the light emitting diode and uses the photodiode to detect infrared light reflected from a nearby object. When sufficient reflected light is detected, it can be determined that there is an object near the electronic device 100; when insufficient reflected light is detected, the electronic device 100 can determine that there is no object nearby. The electronic device 100 can use the proximity light sensor 180G to detect that the user is holding the electronic device 100 close to the ear during a call, so as to automatically turn off the screen to save power. The proximity light sensor 180G can also be used for automatic unlocking and screen locking in leather case mode and pocket mode.
The ambient light sensor 180L is used to sense ambient light brightness. The electronic device 100 can adaptively adjust the brightness of the display screen 194 according to the perceived ambient light brightness. The ambient light sensor 180L can also be used to automatically adjust the white balance when taking pictures, and can cooperate with the proximity light sensor 180G to detect whether the electronic device 100 is in a pocket, to prevent accidental touches.
The fingerprint sensor 180H is used to collect fingerprints. The electronic device 100 can use the collected fingerprint characteristics to implement fingerprint unlocking, application lock access, fingerprint photographing, fingerprint call answering, and so on.
The temperature sensor 180J is used to detect temperature. In some embodiments, the electronic device 100 executes a temperature processing policy based on the temperature detected by the temperature sensor 180J. For example, when the temperature reported by the temperature sensor 180J exceeds a threshold, the electronic device 100 reduces the performance of a processor located near the temperature sensor 180J, so as to reduce power consumption and implement thermal protection. In other embodiments, when the temperature is lower than another threshold, the electronic device 100 heats the battery 142 to avoid an abnormal shutdown of the electronic device 100 caused by low temperature. In still other embodiments, when the temperature is lower than yet another threshold, the electronic device 100 boosts the output voltage of the battery 142 to avoid an abnormal shutdown caused by low temperature.
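A minimal sketch of such a temperature processing policy follows. The threshold values and action names are illustrative assumptions; the application does not specify concrete numbers:

```python
def temperature_policy(temp_c: float):
    """Return the list of actions taken at a reported temperature.

    HIGH, LOW, and VERY_LOW are assumed example thresholds (degrees Celsius).
    """
    HIGH, LOW, VERY_LOW = 45.0, 0.0, -10.0
    actions = []
    if temp_c > HIGH:
        actions.append("throttle_nearby_processor")   # thermal protection
    if temp_c < LOW:
        actions.append("heat_battery")                # avoid cold shutdown
    if temp_c < VERY_LOW:
        actions.append("boost_battery_output_voltage")
    return actions
```

Note that the two low-temperature branches are independent thresholds, so at a very low temperature both battery heating and voltage boosting apply.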
The touch sensor 180K is also called a "touch panel". The touch sensor 180K may be disposed on the display screen 194, and the touch sensor 180K and the display screen 194 form a touchscreen, also called a "touch screen". The touch sensor 180K is used to detect a touch operation acting on or near it. The touch sensor can pass the detected touch operation to the application processor to determine the type of touch event. Visual output related to the touch operation can be provided through the display screen 194. In other embodiments, the touch sensor 180K may also be disposed on the surface of the electronic device 100, at a position different from that of the display screen 194.
The bone conduction sensor 180M can acquire vibration signals. In some embodiments, the bone conduction sensor 180M can acquire the vibration signal of the vibrating bone of the human vocal part. The bone conduction sensor 180M can also contact the human pulse and receive a blood pressure beating signal.
In some embodiments, the bone conduction sensor 180M may also be disposed in a headset to form a bone conduction headset. The audio module 170 can parse out a voice signal based on the vibration signal of the vocal-part vibrating bone acquired by the bone conduction sensor 180M, to implement a voice function. The application processor can parse heart rate information based on the blood pressure beating signal acquired by the bone conduction sensor 180M, to implement a heart rate detection function.
The buttons 190 include a power button, volume buttons, and so on. The buttons 190 may be mechanical buttons or touch buttons. The electronic device 100 can receive button input and generate button signal input related to the user settings and function control of the electronic device 100.
The motor 191 can generate vibration prompts. The motor 191 can be used for incoming call vibration prompts as well as touch vibration feedback. For example, touch operations acting on different applications (such as photographing and audio playback) can correspond to different vibration feedback effects, and touch operations acting on different areas of the display screen 194 can also correspond to different vibration feedback effects of the motor 191. Different application scenarios (for example, time reminders, received messages, alarm clocks, and games) can also correspond to different vibration feedback effects. The touch vibration feedback effect can also be customized.
The indicator 192 may be an indicator light, which can be used to indicate the charging state and power change, and can also be used to indicate messages, missed calls, notifications, and the like.
The SIM card interface 195 is used to connect a SIM card. The SIM card can be inserted into or pulled out of the SIM card interface 195 to achieve contact with and separation from the electronic device 100. The electronic device 100 may support one or N SIM card interfaces, where N is a positive integer greater than 1. The SIM card interface 195 can support a Nano SIM card, a Micro SIM card, a SIM card, and so on. Multiple cards can be inserted into the same SIM card interface 195 at the same time; the types of the multiple cards may be the same or different. The SIM card interface 195 is also compatible with different types of SIM cards and with external memory cards. The electronic device 100 interacts with the network through the SIM card to implement functions such as calls and data communication. In some embodiments, the electronic device 100 adopts an eSIM, that is, an embedded SIM card. The eSIM card can be embedded in the electronic device 100 and cannot be separated from the electronic device 100.
The following mainly takes the electronic device 100 being a mobile phone as an example to describe the picture grouping method provided in the embodiments of the present application. A reference image set such as the above-mentioned video or image group usually records a temporally continuous, dynamically changing process, so a reference image set can often include a series of face images of the same user in different forms during the dynamic change. Therefore, the mobile phone can first track the face images in each reference image set to obtain, for the same user in each reference image set, temporally continuous face images in different forms, such as different angles, expressions, decorations, and hairstyles, and automatically group these face images in each reference image set into one category, obtaining a reference image set clustering result. Then, according to the similarity between the facial features in the face pictures and the facial features of the face images in the reference image set clustering result, the face pictures are clustered, so that face pictures of different forms can also be clustered correctly, improving the clustering accuracy of the face pictures.
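The second clustering step described above, assigning each face picture to the most similar reference-image-set category by facial-feature similarity, can be sketched as follows. Cosine similarity, the similarity threshold, and all names are illustrative assumptions; the application does not prescribe a specific similarity measure or feature extractor:

```python
import math

def cosine(a, b):
    """Cosine similarity between two feature vectors (assumed non-zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def assign_to_reference_clusters(face_features, reference_centroids, threshold=0.6):
    """Assign each face picture's feature vector to the most similar
    reference-set category; -1 marks pictures matching no category."""
    labels = []
    for feat in face_features:
        sims = [cosine(feat, c) for c in reference_centroids]
        best = max(range(len(sims)), key=lambda i: sims[i])
        labels.append(best if sims[best] >= threshold else -1)
    return labels
```

Because each reference-set category already covers one user's face in many forms (angles, expressions, hairstyles), a face picture in an unusual form can still land near that user's category, which is the intuition behind the improved clustering accuracy.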
In an embodiment of the present application, if the mobile phone obtains a reference image set, it can automatically perform the reference image set clustering process and the face picture clustering process.
In another embodiment of the present application, if the mobile phone obtains a reference image set, it can automatically perform the reference image set clustering process; after detecting the user's operation indicating portrait classification or the user's instruction to turn on the portrait classification function, it then clusters the face pictures stored on the mobile phone according to the reference image set clustering result.
Exemplarily, when the mobile phone detects that the user taps the album icon 201 shown in (a) of FIG. 2, the mobile phone opens the album and displays the interface shown in (b) of FIG. 2. After detecting the user's operation of tapping the control 202 shown in (b) of FIG. 2, the mobile phone displays the portrait classification control 203 shown in (c) of FIG. 2; after detecting the user's operation of tapping the control 203, the mobile phone determines that it has detected the user's operation indicating portrait classification, or the user's instruction to enable the portrait classification function. Alternatively, after opening the album, the mobile phone displays the interface shown in (d) of FIG. 2; after detecting that the user taps the discovery control 204 shown in (d) of FIG. 2, the mobile phone can display the portrait classification control 205, the details control 206, and the like shown in (e) of FIG. 2. After detecting the user's operation of tapping the control 206, the mobile phone displays the portrait classification function description 207 shown in (f) of FIG. 2, so that the user can understand the specific content of the function. After detecting the user's operation of tapping the control 205, the mobile phone can determine that it has detected the user's operation indicating portrait classification, or the user's instruction to enable the portrait classification function.
In yet another embodiment of this application, if the mobile phone obtains a reference image set and detects an operation by which the user instructs portrait classification or turns on the portrait classification function, it performs both reference-image-set clustering and face-picture clustering.

In another embodiment of this application, the user may also choose whether the face pictures are classified according to the reference image set.

For example, after detecting that the user taps the control 202 shown in (b) of FIG. 2, the mobile phone may display the interface shown in FIG. 3. If the mobile phone detects that the user taps the control 302, the user has chosen to classify the face pictures according to the reference image set; if the mobile phone detects that the user taps the control 301, the user has chosen to classify the face pictures directly, without using the reference image set. As another example, the user may instruct the mobile phone, by voice or by a preset gesture, to classify the face pictures according to the reference image set.

In addition, the user may also configure the content of the reference image set. For example, referring to FIG. 4, the mobile phone may be set so that the reference image set includes content such as videos obtained by the mobile phone.
In yet another embodiment of this application, when the mobile phone opens the album for the first time (or every time), or after the mobile phone detects an operation by which the user instructs portrait classification, the mobile phone may also ask the user whether to cluster the face pictures according to the reference image set. For example, after the mobile phone detects that the user taps the control 203 or the control 205, referring to FIG. 5, the mobile phone may prompt the user through a prompt box 501.

In another embodiment of this application, after the mobile phone opens the album, or after it detects that the user has chosen to cluster the face pictures according to the reference image set, if the mobile phone has not obtained a reference image set, it may prompt the user to add one or more reference image sets containing face images of the users who appear in the face pictures, so that the mobile phone can cluster the face pictures more accurately according to the reference image set.

For example, referring to FIG. 6, the mobile phone may use a prompt box 601 to prompt the user to shoot (or download, or copy) a video of a target user who appears in the face pictures. As another example, the mobile phone may suggest that the user let the target user play a game, such as YOYO dancing, through which the target user's face images can be captured, and the mobile phone may record a video of the target user during the game. As yet another example, the mobile phone may prompt the user to add an image group, which may be multiple face pictures, in different forms, of the same user that the user selects from the face pictures. The mobile phone can then cluster the target user's face pictures according to the obtained reference image set, such as the video or the image group.
In the following, taking a video as the reference image set, the clustering of the reference image set by the mobile phone, and the clustering of the face pictures according to the reference-image-set clustering result, are described as examples.

A large number of face pictures and videos may be stored on the mobile phone. The face pictures may be taken by the user with the mobile phone's camera, downloaded via the network or an app, captured as screenshots, copied from other devices, or obtained in other ways. The videos may be videos shot by the user with the mobile phone's camera, videos downloaded via the network or an app, videos saved during video calls, videos copied from other devices, or videos obtained in other ways. The videos and face pictures may contain face images of the user or of other people (such as relatives, friends, or celebrities).
A video records a continuous, dynamically changing process, so a video often contains face images of the same user in many forms as that process unfolds. The mobile phone can first track the face images in the video to obtain temporally continuous face images of the same user in different forms (different angles, expressions, accessories, hairstyles, and so on), and automatically group these face images into one category to obtain a video clustering result. The video clustering result is then used as prior information: the face pictures are clustered according to the similarity between the facial features of the face pictures and the facial features of the face images in the video clustering result. In this way, face pictures of the same user in different forms can also be clustered correctly, improving the clustering accuracy of the face pictures.

For example, referring to FIG. 7A, the mobile phone stores video 1 and a large number of face pictures, for example face picture 1, face picture 2, face picture 3, and face picture 4.

In video 1, referring to FIG. 7B, the mobile phone detects face 1 at time 1, which is the face in frontal face A, and continuously tracks face 1 during time period 1; the mobile phone detects face 2 at time 2, which is the face in smiling face D, and continuously tracks face 2 during time period 2; the mobile phone detects face 3 at time 3, which is the face in upward-tilted face G, and continuously tracks face 3 during time period 3. The face images tracked by the mobile phone during time period 1 include frontal face A, profile face B, face C wearing sunglasses, and so on; the face images tracked during time period 2 include smiling face D, face E with closed eyes, face F with a funny expression, and so on; the face images tracked during time period 3 include upward-tilted face G and downward-tilted face H.
There are multiple possible face detection methods. For example, the skin-color model method detects faces based on the fact that facial skin colors are distributed in a relatively concentrated region of a color space. As another example, the reference template method presets one or several standard face templates, computes the degree of match between a captured test image and the standard templates, and uses a threshold to decide whether a face is present. Other examples include the eigenface method, face-rule methods, and sample-learning methods.

There are also multiple possible face tracking methods. For example, in model-based tracking, common tracking models include skin-color models, ellipse models, texture models, and two-eye templates. As another example, tracking based on motion information mainly exploits the continuity of target motion across consecutive frames to predict the face region and achieve fast tracking; such methods typically use motion segmentation, optical flow, or stereo vision, and often rely on spatio-temporal gradients or Kalman filters. Other examples include tracking based on local facial features and tracking based on neural networks.
Because the face images tracked by the mobile phone are temporally continuous and satisfy a must-link constraint — they belong to the same user — the mobile phone can automatically group frontal face A, profile face B, and face C wearing sunglasses from time period 1 into one category, for example category 1; automatically group smiling face D, face E with closed eyes, and face F with a funny expression from time period 2 into one category, for example category 2; and automatically group upward-tilted face G and downward-tilted face H from time period 3 into one category, for example category 3. The categories here may also be called cluster centers.

It can be understood that, for videos other than video 1 stored on the mobile phone, the mobile phone can also use face detection and face tracking to cluster the face images in those videos.
After clustering the face images within each tracking result, in one solution the mobile phone can further cluster face images across different tracking results. Specifically, the mobile phone can extract the facial features of the face images in the different tracking results (before extracting facial features, the mobile phone may also frontalize the face images, that is, convert face images taken from other angles into frontal face images). If a face image in one category is similar enough to a face image in another category to be grouped together, then all the face images in the two categories can be merged into one category. For example, if the mobile phone determines that frontal face A in category 1 and upward-tilted face G in category 3 are similar enough to be grouped together, then frontal face A, profile face B, face C wearing sunglasses, upward-tilted face G, and downward-tilted face H in categories 1 and 3 can be merged into one category.
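The cross-track merging just described can be sketched as a single-linkage merge over per-track clusters: two clusters are joined whenever any cross-cluster pair of faces is similar enough. The cosine measure and the 0.8 threshold below are illustrative assumptions, not values fixed by the embodiment.

```python
import math

def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def merge_track_clusters(clusters, threshold=0.8):
    """Merge per-track clusters (one list of feature vectors per tracked
    time period) whenever any cross-cluster face pair is similar enough.
    Single-linkage merging implemented with union-find."""
    n = len(clusters)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if any(cosine_similarity(a, b) >= threshold
                   for a in clusters[i] for b in clusters[j]):
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[rj] = ri  # merge the two track clusters

    merged = {}
    for i in range(n):
        merged.setdefault(find(i), []).extend(clusters[i])
    return list(merged.values())
```

With this sketch, the must-link constraint is already encoded by the per-track clusters; the merge step only decides which tracks show the same user.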
There are many possible face clustering methods for grouping different face images into one category, for example hierarchy-based, partition-based, density-based, grid-based, model-based, distance-based, and connectivity-based clustering methods. Specific algorithms include K-Means, DBSCAN, BIRCH, and MeanShift. For example, in one clustering method, the mobile phone extracts the facial features of different face images and clusters them according to the similarity of those features. Facial feature extraction can be understood as a process of mapping a face image to an n-dimensional vector (n being a positive integer) that is able to characterize the face image. The higher the similarity between the facial features of different face images, the more likely those face images are to be grouped into one category.

There are multiple ways of measuring similarity. For example, if a facial feature is a multi-dimensional vector, the similarity may be the distance between the multi-dimensional vectors corresponding to the facial features of different faces, such as the Euclidean distance, Mahalanobis distance, or Manhattan distance. As another example, the similarity may be the cosine similarity, correlation coefficient, information entropy, or the like, between the facial features of different faces.

For example, suppose a facial feature is a multi-dimensional vector, the facial feature 1 that the mobile phone extracts for frontal face A in category 1 is [0.88, 0.64, 0.58, 0.11, ..., 0.04, 0.23], and the facial feature 2 that it extracts for upward-tilted face G in category 3 is [0.68, 0.74, 0.88, 0.81, ..., 0.14, 0.53]. The similarity between facial feature 1 and facial feature 2 is measured by the cosine similarity between the two corresponding multi-dimensional vectors. If the cosine similarity is 0.96, the similarity between facial feature 1 and facial feature 2 is determined to be 96%; if the similarity threshold for clustering is 80%, then the similarity of 96% exceeds the 80% threshold, so frontal face A in category 1 and upward-tilted face G in category 3 can be grouped into one category, and all the face images in categories 1 and 3 can be merged into one category.
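This similarity check can be sketched as follows. Note that the full feature vectors above are elided ("..."), so the shortened six-component vectors below are purely illustrative and do not reproduce the 0.96 figure from the example.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two feature vectors: dot(u, v) / (|u| * |v|)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Only the components visible in the example; the elided middle is dropped.
feature_1 = [0.88, 0.64, 0.58, 0.11, 0.04, 0.23]  # frontal face A, category 1
feature_2 = [0.68, 0.74, 0.88, 0.81, 0.14, 0.53]  # upward-tilted face G, category 3

similarity = cosine_similarity(feature_1, feature_2)
same_category = similarity >= 0.80  # clustering similarity threshold of 80%
```

Even on these truncated vectors, the similarity clears the 80% threshold, so the two categories would be merged.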
As another example, suppose a facial feature is a multi-dimensional vector; the correspondence between each face image and its facial feature is shown in Table 1.

Table 1

If the similarity between facial features is measured by Euclidean distance and the clustering distance threshold is 5, then because the Euclidean distance between facial feature A of frontal face A in category 1, shown in Table 1, and facial feature G of upward-tilted face G in category 3 is less than the distance threshold of 5, frontal face A in category 1 and upward-tilted face G in category 3 can be grouped into one category, and all the face images in categories 1 and 3 can be merged into one category.
After the video clustering is completed, the mobile phone can use an incremental clustering algorithm or another clustering method, based on the video clustering result (for example categories 1, 2, and 3) and the extracted facial features of the face images in each category, to cluster face picture 1, face picture 2, face picture 3, and face picture 4, thereby fusing the face images in the video with the face pictures and clustering the face pictures into the earlier video clustering result. For example, if the facial feature of a face picture stored on the mobile phone is sufficiently similar (for example, the similarity is greater than or equal to a preset value 1) to the facial feature of some face image in a category (for example category 1), the face picture can be clustered into the category that the face image belongs to. When an incremental clustering algorithm is used, the face pictures extend the video clustering result incrementally. If the similarity between the facial feature of a stored face picture and the facial features of the face images in every category of the video clustering result is low (for example, less than the preset value 1), the face picture is placed into a new category.
For example, the correspondence between face pictures and facial features can be seen in Table 1 above. In one example, the similarity between facial features is measured by Euclidean distance and the clustering distance threshold is 5. If the Euclidean distance between facial feature a of face picture 1 and facial feature A of frontal face A is less than the distance threshold of 5, face picture 1 can be clustered into category 1, where frontal face A belongs; if the Euclidean distance between facial feature b of face picture 2 and facial feature B of profile face B is less than the distance threshold of 5, face picture 2 can be clustered into category 1, where profile face B belongs; if the Euclidean distance between facial feature c of face picture 3 and facial feature D of smiling face D is less than the distance threshold of 5, face picture 3 can be clustered into category 2, where smiling face D belongs; and if the Euclidean distance between facial feature d of face picture 4 and facial feature G of upward-tilted face G is less than the distance threshold of 5, face picture 4 can be clustered into category 3, where upward-tilted face G belongs.

In another example, the similarity between facial features is measured by the Euclidean distance to a reference feature: the Euclidean distances between the facial features of category 1 and the reference feature fall in the range 0-50, and the clustering range of the facial features of category 3 is 100-150. If category 1 and category 3 are merged into the same category 4, the range of Euclidean distances between the facial features of category 4 and the reference feature is [0,50] ∪ [100,150]. The facial features of face picture 1, face picture 2, and face picture 4 all fall within [0,50] ∪ [100,150], so face picture 1, face picture 2, and face picture 4 can all be clustered into category 4, that is, into the same category. An illustration of this clustering effect is shown in FIG. 8A.

In yet another example, the mobile phone may separately extract the facial features of all the face images and face pictures listed in Table 1, and then cluster them according to the similarity of those facial features.
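The incremental assignment described above — join the nearest existing category when the distance is under the threshold, otherwise open a new category — can be sketched as follows. The Euclidean metric and the threshold of 5 follow the examples; the function and field names are hypothetical.

```python
import math

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def assign_incrementally(clusters, picture_features, distance_threshold=5.0):
    """Extend a video clustering result with face pictures, incrementally.

    clusters: dict mapping category name -> list of feature vectors
    (the video clustering result used as prior information).
    """
    new_id = 0
    for feature in picture_features:
        best_name, best_dist = None, float("inf")
        for name, members in clusters.items():
            for member in members:
                d = euclidean(feature, member)
                if d < best_dist:
                    best_name, best_dist = name, d
        if best_dist < distance_threshold:
            clusters[best_name].append(feature)  # extend the existing category
        else:
            new_id += 1
            clusters[f"new_{new_id}"] = [feature]  # no match: open a new category
    return clusters
```

Because each assigned picture immediately becomes part of its category, later pictures can match it even if they are far from the original video faces — which is how the result is extended "in an incremental manner".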
As can be seen from the above, after tracking and automatically clustering the faces in a video, the mobile phone can group face images of the same user in many different forms into the same category, so that face pictures resembling the different-form face images in that category are also clustered into it. Compared with the prior art, this reduces the dispersion of the clustering, improves the accuracy of face clustering, makes management more convenient for the user, and improves the user experience.

By contrast, suppose the facial features of face picture 1, face picture 2, face picture 3, and face picture 4 stored on the mobile phone are still as shown in Table 1, the similarity between facial features is measured by Euclidean distance, and the clustering distance threshold is 5. If the video clustering result is ignored and face picture 1, face picture 2, face picture 3, and face picture 4 are clustered directly, then because the Euclidean distance between the facial features of every pair of face pictures is greater than the distance threshold of 5, no two pictures can be grouped together, and each face picture ends up as its own category. An illustration of this clustering effect is shown in FIG. 8B. Compared with FIG. 8A, the face clustering shown in FIG. 8B is more dispersed and less accurate, leading to problems such as false positives or false negatives in the clustering result.
In addition, in the embodiments of this application, after the video clustering is completed, the mobile phone may also label the identities of the faces in the video according to the video clustering result.

The above description mainly uses a video as the reference image set. When the reference image set is an image group as mentioned above (for example, an image group forming an animated picture, or an image group corresponding to the multiple frames captured during shooting preview), or when the reference image set includes both videos and such image groups, the mobile phone can still perform clustering in a manner similar to the video processing described above, which is not repeated here.

It should be noted that if the reference image set is an image group preset by the user, the images in the group are usually face images of the same user in different forms that the user has deliberately selected, so the mobile phone can directly and automatically group the face images in the image group into one category without performing face detection and tracking.
In addition, after face picture 1, face picture 2, face picture 3, and face picture 4 have been clustered according to the reference-image-set clustering result, if face pictures of user 1 or user 2 are later added to the mobile phone, the mobile phone can, similarly, cluster the newly added face pictures according to the reference-image-set clustering result. In one specific implementation, the mobile phone can extend the earlier clustering result with the newly added face pictures through incremental clustering.

After face picture 1, face picture 2, face picture 3, and face picture 4 have been clustered according to the reference-image-set clustering result, if the mobile phone later obtains a new reference image set (for example video 2), then in one solution the mobile phone performs face detection, tracking, and clustering on both the earlier and the new reference image sets, and clusters face picture 1, face picture 2, face picture 3, and face picture 4 according to the video clustering result; in another solution, the mobile phone does not immediately re-cluster face picture 1, face picture 2, face picture 3, and face picture 4, but re-clusters them after detecting an operation by which the user instructs portrait classification.

In another embodiment, regardless of whether the mobile phone obtains a new reference image set, the mobile phone periodically performs face detection, tracking, and clustering on the currently obtained reference image sets, and clusters the currently stored face pictures according to the reference-image-set clustering result.

In another embodiment, after detecting an operation by which the user instructs portrait classification, the mobile phone performs face detection, tracking, and clustering on the currently obtained reference image sets, and clusters the currently stored face pictures according to the reference-image-set clustering result.

In another embodiment, because face clustering consumes considerable resources, the mobile phone may perform face detection, tracking, and clustering on the currently obtained reference image sets, and cluster the currently stored face pictures according to the result, during a preset time period (for example 00:00-6:00 at night), or while idle (for example when the mobile phone is not running other services), or while the mobile phone is charging and its battery level is greater than or equal to a preset value 2.
After the clustering is completed, the mobile phone can display the clustering result, for example in the form of groups (which may be folders). In the following, the reference image set is still video 1, categories 1 and 3 have been merged into category 4, and the face pictures stored on the mobile phone include face picture 1, face picture 2, face picture 3, and face picture 4.

In one embodiment of this application, after the video clustering is completed, the mobile phone can display the video clustering result. For example, in the video portrait classification interface (that is, the video clustering result interface) shown in FIG. 9A, the mobile phone can display group 1 corresponding to category 4 and group 2 corresponding to category 2.
In one solution, the group corresponding to each cluster category may include the videos in which face images of that category appear. For example, group 1 corresponding to category 4 and group 2 corresponding to category 2 both include video 1. In one implementation, referring to FIG. 9A, the cover image displayed as the thumbnail of a video in a group may be a face image in that video belonging to the category; in particular, the cover image may be a relatively frontal face image, or an image designated by the user.

For a video, the mobile phone may put the video into the groups corresponding to all the categories of the face images that appear in it; or the mobile phone may put the video into the group corresponding to a category only when face images of that category appear in the video for a duration greater than or equal to a preset duration; or only when the number of frames containing face images of that category is greater than or equal to a preset value 3; or only when a frontal face image of that category appears in the video.
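The four alternative grouping criteria above can be sketched as one filter over per-category statistics. The statistics dictionary, field names, and thresholds here are hypothetical; the embodiment would typically apply only one of the criteria at a time.

```python
def groups_for_video(category_stats, min_duration_s=None,
                     min_frames=None, require_frontal=False):
    """Decide which category groups a video belongs to.

    category_stats: dict mapping category name -> dict with keys
    'duration_s', 'frames', and 'has_frontal', summarizing how that
    category's face images appear in the video.  With no criteria set,
    the video joins every category whose faces appear in it.
    """
    groups = []
    for category, stats in category_stats.items():
        if min_duration_s is not None and stats["duration_s"] < min_duration_s:
            continue  # this category's face is on screen too briefly
        if min_frames is not None and stats["frames"] < min_frames:
            continue  # too few frames contain this category's face
        if require_frontal and not stats["has_frontal"]:
            continue  # no frontal face image of this category appears
        groups.append(category)
    return groups
```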
In another solution, the group corresponding to each cluster category may include, from the videos in which face images of that category appear, the video segments in which those face images appear.

For example, group 1 corresponding to category 4 may be group 1A in FIG. 9B, which may include video segment 1 corresponding to time period 1 and video segment 3 corresponding to time period 3 of video 1; and group 2 corresponding to category 2 may be group 2A, which may include video segment 2 corresponding to time period 2 of video 1.
In yet another solution, the group corresponding to each cluster category may include, from the videos in which face images of that category appear, the frames containing face images of that category.

For example, group 1 corresponding to category 4 may be group 1B in FIG. 9C, which may include frontal face A, profile face B, face C wearing sunglasses, upward-tilted face G, and downward-tilted face H; group 2 corresponding to category 2 may be group 2B, which may include smiling face D, face E with closed eyes, and face F with a funny expression. In one implementation, the group corresponding to a category may include multiple subgroups, with the face image frames of that category from the same video belonging to the same subgroup.
在另一种方案中,每个聚类类别对应的分组中可以包括该类别的人脸图像所在的视频中, 出现该类别的人脸图像的视频分段, 以及该类别的人脸图像帧。 In another solution, the group corresponding to each cluster category may include the video segment in which the face image of the category is located in the video where the face image of the category appears, and the face image frame of the category.
In this embodiment, the mobile phone can display the video clustering results, which helps the user categorize and manage videos according to the video images, improves the efficiency of searching for and managing videos, and improves the user experience.
In another embodiment of the present application, the mobile phone may refrain from displaying the clustering results after video clustering is completed, and display them only after face picture clustering is completed.
In one solution, after face picture clustering is completed, the mobile phone may display the face picture clustering results, where the group corresponding to each cluster category includes the face pictures of that category.
For example, referring to (a)-(c) in FIG. 10, the mobile phone may display group 3 corresponding to category 4 and group 4 corresponding to category 2; group 3 includes face picture 1, face picture 2, and face picture 4, and group 4 includes face picture 3.
In another solution, after face picture clustering is completed, the mobile phone may display both the video clustering results and the face picture clustering results. The video clustering results and the face picture clustering results may be displayed in separate groups, or combined and displayed in the same group.
When the video clustering results and the face picture clustering results are displayed in separate groups, the video clustering results may be displayed in group 5 and the face picture clustering results in group 6. The content of group 5 may be the video clustering results described above (for example, as shown in FIGS. 9A-9C), and the content of group 6 may be the face picture clustering results described above (for example, as shown in (a)-(c) in FIG. 10).
When the video clustering results and the face picture clustering results are combined and displayed in the same group, the group corresponding to each cluster category may include both the face picture clustering results and the video clustering results described above.
For example, referring to (a) in FIG. 11, category 4 corresponds to group 7 and category 2 corresponds to group 8. In one solution, the group corresponding to each cluster category may include the face pictures of that category as well as the videos in which face images of that category appear. Exemplarily, referring to (b) in FIG. 11, group 7 corresponding to category 4 is group 7A, which includes face picture 1, face picture 2, face picture 4, and video 1; referring to (c) in FIG. 11, group 8 corresponding to category 2 is group 8A, which includes face picture 3 and video 1.
It should be noted that the cover image of a group may be a face picture in that category or a face image of that category from a video. The cover image of video 1 in group 7 and the cover image of video 1 in group 8 may be the same or different. Preferably, the cover image of video 1 is a face image belonging to the category corresponding to the group it is in.
In one implementation, within the group corresponding to a category, the face pictures of that category may belong to one sub-group, and the videos containing face images of that category may belong to another sub-group. Exemplarily, referring to (a) in FIG. 12, group 7 corresponding to category 4 is group 7B, which includes sub-group 7-1 for face pictures and sub-group 7-2 for videos. Referring to (b) in FIG. 12, sub-group 7-1 includes face picture 1, face picture 2, and face picture 4; referring to (c) in FIG. 12, sub-group 7-2 includes video 1.
In another solution, the group corresponding to each cluster category may include the face pictures of that category, together with the video segments, within the videos containing face images of that category, in which face images of that category appear. Exemplarily, as an alternative to (b) and (c) in FIG. 11, referring to (a) and (b) in FIG. 13, group 7 corresponding to category 4 is group 7C, which includes face picture 1, face picture 2, face picture 4, video segment 1, and video segment 3; group 8 corresponding to category 2 is group 8C, which includes face picture 3 and video segment 2. In one implementation, within the group corresponding to a category, the face pictures of that category may belong to one sub-group and the video segments may belong to another sub-group.
In yet another solution, the group corresponding to each cluster category may include the face pictures of that category as well as image frames captured or selected from the videos containing face images of that category. Exemplarily, as an alternative to (b) and (c) in FIG. 11, referring to (a) and (b) in FIG. 14, group 7 corresponding to category 4 is group 7D, which includes face picture 1, face picture 2, face picture 4, and face images A, B, C, G, and H from video 1; group 8 corresponding to category 2 is group 8D, which includes face picture 3 and face images D, E, and F from video 1. In one implementation, the face pictures of the category may belong to one sub-group, and the face image frames of the category taken from the videos may belong to another sub-group.
In another solution, the group corresponding to each cluster category may include the face pictures of that category, and may further include one or more of the following: the videos containing face images of that category; the video segments of those videos in which face images of that category appear; and the captured or selected image frames. In one implementation, the face pictures and the face image frames of the category may belong to one sub-group, and the videos or video segments corresponding to the category may belong to another sub-group.
In another implementation, the face pictures of the category may belong to one sub-group, and the face image frames together with the videos or video segments of the category may belong to another sub-group. In yet another implementation, the face pictures, face image frames, videos, and video segments of the category belong to separate sub-groups.
In other solutions, after face picture clustering is completed, the mobile phone may display the face picture clustering results and determine, according to the user's instruction, whether to display the video clustering results.
It should be noted that, in the embodiments of the present application, the name of a group may be a name manually entered by the user, or a name obtained by the mobile phone itself through learning. For example, the mobile phone may determine the identity of a person in a picture, such as father, mother, wife (or husband), son, or daughter, according to the actions and intimacy between people in pictures or videos, and set that identity as the group name.
In addition, in this embodiment, when displaying the face picture clustering results for the first time or each time, the mobile phone may also prompt the user that the face picture clustering results are obtained by classifying the face pictures according to a reference image set such as a video. Exemplarily, when displaying group 7 corresponding to category 4 and group 8 corresponding to category 2, referring to FIG. 15, the mobile phone may prompt the user by displaying information 1501, so that the user learns of the portrait classification function of the mobile phone.
In this embodiment, by managing and displaying the clustering results of face pictures and videos in a unified manner, the mobile phone improves the efficiency with which the user searches for and manages face pictures and videos, and improves the user experience.
In another embodiment of the present application, after face picture clustering is completed, if the user finds that the clustering result for a face picture is wrong, the user may actively add a reference image set for the person shown in that face picture. For example, if the clustering result for face picture 5 is wrong, then referring to (a) in FIG. 16, the user may tap control 1601, or tap control 1601 after selecting face picture 5; the user may then tap control 1602 shown in (b) in FIG. 16 to add a reference image set. Alternatively, the user may add a reference image set by voice, a preset gesture, or the like.
The reference image set may be a video or a set of images captured by the user in real time, or a set of images obtained by the user through the mobile phone, where the set includes faces, in different forms, of the person shown in the incorrectly clustered face picture. Exemplarily, the reference image set may be the image group shown in (a)-(h) in FIG. 17. After the reference image set is added, the mobile phone may re-cluster the incorrectly clustered face pictures in combination with the reference image set added by the user, or re-cluster all the face pictures stored on the mobile phone.
It should be noted that the clustering method described in the above embodiments classifies the face pictures of different users according to the similarity of facial features; therefore, the groups corresponding to different cluster categories can also be understood as the groups corresponding to different users.
In some embodiments of the present application, the groups corresponding to different cluster categories displayed on the mobile phone, that is, the groups corresponding to different users, may have different priorities. The person corresponding to a high-priority group is likely to be someone the user cares about more.
In one technical solution, the more the user cares about a person, the more face pictures and videos of that person the user typically saves on the mobile phone. The mobile phone can therefore determine that the people who appear most frequently in the saved face pictures and videos are the ones the user cares about most, and the groups corresponding to these people have the highest priority.
In another technical solution, the mobile phone may assign higher priority to the groups corresponding to people who are close to the owner of the phone. For example, based on factors such as the intimacy of the actions between different people, their expressions, how frequently they appear in videos and face pictures, and their positions within videos and face pictures, the mobile phone may use a sentiment analysis algorithm to determine each person's closeness to the owner, conclude that closer people are the ones the owner cares about more, and assign higher priority to their groups.
In yet another technical solution, since the facial information of the user's relatives is usually more similar to the user's own, relatives are usually the people the user cares about more, and the user prefers their groups to be displayed first. The mobile phone may therefore assign higher priority to the groups corresponding to people whose facial information is closer to that of the phone's owner.
In some embodiments, high-priority groups may be displayed first. In one technical solution, the mobile phone may display high-priority groups at the top of the portrait classification interface, while low-priority groups must be viewed by the user by sliding up or switching pages. In another technical solution, the mobile phone may display only the top N (N is a positive integer) highest-priority groups on the portrait classification interface, and may omit the groups corresponding to other people the user cares less about.
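The frequency-based priority scheme combined with top-N display can be sketched as follows. Scoring a group by its raw appearance count is an assumption made for illustration; as described above, priority may equally be derived from intimacy or facial similarity:

```python
# Illustrative sketch: rank face groups by how often each person appears in the
# saved pictures and videos, then show only the top-N groups. The identifiers
# below are hypothetical examples.

from collections import Counter

def top_groups(appearances, n=3):
    """appearances: one person id per saved face picture or video.
    Returns the n person ids with the highest appearance counts."""
    counts = Counter(appearances)
    return [person for person, _ in counts.most_common(n)]

media_owners = ["dad", "dad", "mom", "mom", "mom", "friend", "coworker"]
print(top_groups(media_owners, n=2))  # ['mom', 'dad']
```

Only the returned groups would appear on the portrait classification interface; the remaining groups are either hidden or reached by sliding up, as described above.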
In other embodiments, if the number of face pictures and videos of a person saved on the mobile phone exceeds a preset value 4 (which may be, for example, 5), that person may be someone the user cares about, and the mobile phone may display the corresponding group on the portrait classification interface.
In some embodiments of the present application, a group photo of one user together with another user may be placed in the group corresponding to the first user and, at the same time, in the group corresponding to the other user.
Exemplarily, referring to (a) in FIG. 18, face picture 6 is a group photo of user 1 and user 2; referring to (b) in FIG. 18, face picture 6 appears both in the group corresponding to user 1 and in the group corresponding to user 2.
In other embodiments of the present application, referring to FIG. 19A, the group corresponding to each user includes only individual photos of that user, and group photos of multiple users are displayed separately.
In still other embodiments of the present application, referring to FIG. 19B, the group corresponding to each user includes only individual photos of that user, and group photos of multiple users are placed in another group.
In addition, after face picture clustering is completed, the mobile phone may also tag the identities of the faces in the pictures according to the clustering results.
In other embodiments of the present application, after face picture clustering is completed, the mobile phone may also display the clustering results in a personalized manner. For example, for a picture in a group, when the mobile phone detects an operation in which the user requests color retention, it may keep the area of the picture indicated by the user, or a preset area, as a color image, while the other areas of the picture become grayscale.
Exemplarily, the preset area is the area where the person is located: the mobile phone keeps the colors of the image within that area, and the images in the other areas are grayscale. As another example, for a picture in the group corresponding to a target user, when the mobile phone detects an operation in which the user requests that only the target user be retained, the image content in the area where the target user is located is kept while the image content in the other areas disappears; that is, the other areas may be blank, black, gray, or another preset color.
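The color-retention display can be illustrated with a minimal sketch that keeps color inside a selected rectangular region and converts everything else to grayscale. The rectangular region, the tuple-based pixel model, and the channel-averaging grayscale formula are simplifying assumptions; a real implementation would operate on the segmented person region of a bitmap:

```python
# Minimal sketch of the "color retention" effect: pixels are (R, G, B) tuples
# in a nested list; everything outside `region` is converted to gray.

def retain_color(image, region):
    """region = (top, left, bottom, right), half-open row/column bounds."""
    top, left, bottom, right = region
    out = []
    for r, row in enumerate(image):
        new_row = []
        for c, (red, green, blue) in enumerate(row):
            if top <= r < bottom and left <= c < right:
                new_row.append((red, green, blue))   # inside: keep color
            else:
                g = (red + green + blue) // 3        # outside: simple grayscale
                new_row.append((g, g, g))
        out.append(new_row)
    return out

img = [[(255, 0, 0), (0, 255, 0)],
       [(0, 0, 255), (90, 90, 90)]]
print(retain_color(img, (0, 0, 1, 1)))
# [[(255, 0, 0), (85, 85, 85)], [(85, 85, 85), (90, 90, 90)]]
```

The variant in which the other areas "disappear" would simply replace the grayscale branch with a constant fill color.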
In other embodiments of the present application, after face picture clustering is completed, the mobile phone may also generate a "protagonist story". A protagonist story may include a series of images of a particular person. The images in a protagonist story belong to the same category; specifically, they may be images from the reference image set (for example, video segments or face image frames from a video) or images from the face pictures.
That is, the mobile phone can not only extract face pictures from still pictures to compose a protagonist story, but can also incorporate face images from reference image sets such as videos. This broadens the sources of protagonist images and makes the protagonist story more vivid, interesting, and colorful.
It should be noted that the above description uses a video as the reference image set as an example. When the reference image set is another type of reference image set (for example, the image group captured by the mobile phone in burst mode shown in (a)-(f) in FIG. 20), the face pictures can still be clustered according to that reference image set in the manner described in the above embodiments, and details are not repeated here.
The above description uses human faces as the classification object as an example. When the classification object is another type of object, the clustering method provided in the embodiments of this application can still be used to cluster the pictures on the mobile phone. In addition, the user may set which classification objects the mobile phone clusters.
For example, the classification object may be an animal's face (such as a dog's or a cat's face), an object (such as a house, a car, a mobile phone, or a cup), or a logo (such as the logo of the Olympic rings). Taking a house as the classification object, the mobile phone may first cluster, in the manner described in the above embodiments (for example, through tracking and automatic clustering), the reference image sets it has obtained that contain the house at different angles, orientations, positions, brightness levels, and scenes, and then cluster the stored pictures of houses according to the clustering results of the reference image set. This yields higher clustering accuracy for pictures of houses with different appearances, making it easier for the user to find and manage pictures of houses.
When the classification objects include multiple types of objects other than human faces, the clustering results displayed by the mobile phone may include groups of face pictures as well as groups of the other classification objects; in other words, the mobile phone may cluster and group by different entities.
For example, when the classification objects include human faces, dogs, and houses, referring to FIG. 21, the mobile phone may display in the clustering results the groups corresponding to the faces of different users (for example, user 1 and user 2), the groups corresponding to different dogs (for example, dog 1), and the groups corresponding to different houses (for example, house 1).
In another example, when the classification objects include human faces, dogs, and houses, the clustering results may include group 9 corresponding to human faces, group 10 corresponding to dogs, and group 11 corresponding to houses. Group 9 may include sub-groups corresponding to different users (for example, user 1 and user 2), group 10 may include sub-groups corresponding to different dogs, and group 11 may include sub-groups corresponding to different houses. Moreover, a sub-group may include the picture clustering results, or both the picture clustering results and the reference image set clustering results, which is not described in detail here.
In another solution, the user may also choose which classification objects' results are currently displayed. Exemplarily, referring to (a) in FIG. 22, after the mobile phone detects that the user taps control 2201 shown in (a) in FIG. 22, it may display the interface shown in (b) in FIG. 22; after the mobile phone detects that the user taps control 2202 shown in (b) in FIG. 22, it may display the interface shown in (c) in FIG. 22. Then, when the user selects the portrait category, the mobile phone displays only faces; when the user selects the dog category, the mobile phone displays only dogs; when the user selects the house category, the mobile phone displays only houses; and when the user selects another classification object, the clustering results of that object are displayed. It should be noted that there are many ways for the user to choose which classification objects' clustering results are currently displayed, and they are not limited to the example shown in FIG. 22.
In combination with the above embodiments and the corresponding drawings, another embodiment of the present application provides a picture grouping method, which may be implemented in an electronic device having the hardware structure shown in FIG. 1. At least one face picture is stored on the electronic device. As shown in FIG. 23, the method may include the following steps.
2301. The electronic device acquires at least one video.
The at least one video acquired by the electronic device may include multiple face image frames, and each video may also include multiple face image frames. The at least one face picture stored on the electronic device is a static picture previously taken by the user, or obtained by the electronic device through downloading, copying, or the like.
Exemplarily, the at least one face picture may be face pictures 1-4 shown in FIG. 8A.
The electronic device may acquire the at least one video in various ways. For example, the storage area of the electronic device stores at least one video, and the electronic device obtains the at least one video from the storage area. The video in the storage area may have been previously shot by the user, downloaded by the electronic device, or obtained by the electronic device while an application was running.
As another example, referring to FIG. 6, the electronic device may prompt the user to shoot a video that includes face image frames, and, after detecting the user's operation instructing video shooting, record and generate at least one video.
As yet another example, the electronic device prompts the user to download at least one video and obtains the downloaded video after the user instructs the download. Exemplarily, the at least one video acquired by the electronic device may include video 1 shown in FIG. 7B.
2302. The electronic device extracts multiple face image frames from the at least one video.
After acquiring the at least one video, the electronic device may extract multiple face image frames from it, so that the face pictures can subsequently be grouped according to the extracted frames. Exemplarily, when the video acquired by the electronic device includes video 1 shown in FIG. 7B, the face image frames extracted from video 1 may be face image frames A-H in FIG. 7B.
In other embodiments, the electronic device may also extract a single face image frame from the at least one video, so that the face pictures can subsequently be grouped according to that extracted frame.
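Step 2302 can be sketched as sampling the video at an interval and keeping only the frames in which a face is detected. The sampling step and the stubbed detector are assumptions made for illustration; in practice the per-frame detection would be done by a face detection algorithm, not by precomputed labels:

```python
# Hypothetical sketch of step 2302: sample the video every `step` frames and
# keep the sampled frames that contain a face. `has_face` stands in for the
# output of a real face detector.

def extract_face_frames(frames, has_face, step=2):
    """frames: list of frame ids; has_face: parallel list of booleans."""
    return [f for i, f in enumerate(frames)
            if i % step == 0 and has_face[i]]

frames = ["f0", "f1", "f2", "f3", "f4", "f5"]
has_face = [True, True, False, True, True, False]
print(extract_face_frames(frames, has_face))  # ['f0', 'f4']
```

Sampling keeps the cost of the subsequent clustering bounded even for long videos, at the price of possibly skipping brief appearances; setting `step=1` recovers exhaustive extraction.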
2303. The electronic device performs clustering processing on the at least one face picture according to the multiple face image frames.
Exemplarily, the electronic device may perform clustering processing on face pictures 1-4 according to the extracted face image frames A-H.
Various clustering algorithms may be used; for details, refer to the relevant descriptions in the above embodiments and to the related technologies of existing clustering algorithms.
2304. The electronic device displays at least one group according to the clustering processing results, where each group includes at least one face picture of one user.
In this step, each group obtained by the clustering processing may include at least one face picture of one user; that is, a group may include at least one face picture of the same user, and the face pictures of the same user may be placed in the same group.
In other words, the electronic device can use the multiple face image frames in the at least one video as prior information and cluster the face pictures according to those frames, thereby grouping the face pictures by user so that the face pictures of the same user are clustered into the same group, which improves the accuracy of face picture grouping.
The at least one face picture included in a group may be the face pictures that the electronic device has determined to belong to the same user. The electronic device may compute the similarity between the facial features in the face pictures and determine that different face pictures whose similarity is greater than or equal to a first preset value are face pictures of the same user.
Exemplarily, after the electronic device performs clustering processing on face pictures 1-4 according to face image frames A-H, the resulting groups may be group 3 shown in (b) in FIG. 10 and group 4 shown in (c) in FIG. 10. Group 3 includes the face pictures of user 1, and group 4 includes the face picture of user 2.
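The same-user test against the first preset value can be sketched as follows. Using cosine similarity over facial feature vectors and a threshold of 0.8 are assumptions made for illustration; the application does not fix a particular similarity measure or threshold value:

```python
# Sketch of the similarity test: two face pictures are attributed to the same
# user when the similarity of their feature vectors reaches the "first preset
# value". The feature vectors below are hypothetical.

import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def same_user(feat_a, feat_b, threshold=0.8):   # threshold = first preset value
    return cosine(feat_a, feat_b) >= threshold

print(same_user([0.9, 0.1, 0.2], [0.8, 0.2, 0.3]))  # True  (similar features)
print(same_user([0.9, 0.1, 0.2], [0.1, 0.9, 0.1]))  # False (dissimilar)
```

Raising the threshold makes the grouping stricter (fewer false merges, more split identities); lowering it has the opposite effect.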
在一种技术方案中, 每个分组还包括以下任意一项或任意多项的组合: 用户的人脸图像 帧所在的视频, 用户的人脸图像帧所在的视频分段, 或用户的至少一个人脸图像帧。 也就是 说, 电子设备可以根据不同用户对人脸图片、 视频、 视频分段和人脸图像帧进行分组, 统一 或联合管理用户的视频和图片, 方便用户查找和管理, 提高用户使用体验。 In a technical solution, each group further includes any one or a combination of any of the following: the video where the user's face image frame is located, the video segment where the user's face image frame is located, or at least one of the user's face image frames Face image frame. That is, the electronic device can group face pictures, videos, video segments, and face image frames according to different users, unified or jointly manage users' videos and pictures, which is convenient for users to find and manage, and improve user experience.
示例性的, 参见图 11中的 (b), 用户 1对应的分组 7A中包括用户的人脸图片以及用户 1的人脸图像帧所在的视频 1。 Exemplarily, referring to (b) in FIG. 11, the group 7A corresponding to user 1 includes face pictures of the user and video 1, in which the face image frames of user 1 are located.
再示例性的, 参见图 13中的 (a), 用户 1对应的分组 7C中包括用户的人脸图片以及用 户的人脸图像帧所在的视频分段。 For another example, referring to (a) in FIG. 13, the group 7C corresponding to user 1 includes the user's face picture and the video segment where the user's face image frame is located.
再示例性的, 参见图 13中的 (a), 用户 1对应的分组 7D中包括用户的人脸图片以及用 户的多个人脸图像帧。 For another example, referring to (a) in FIG. 13, the group 7D corresponding to the user 1 includes the face picture of the user and multiple face image frames of the user.
在一种技术方案中, 每个分组包括的一个用户的至少一张人脸图片为单人照或合影。 示例性的, 图 10中的 (b)分组 3中包括用户 1的单人照, 图 10中的 (c)所示的分组 4中包括用户 2的单人照。 图 18中的 (a)所示的分组 9中包括用户 1的单人照和合影, 图 18中的 ( b)所示的分组 10中包括用户 2的单人照和合影。 In a technical solution, at least one face picture of a user included in each group is a single photo or a group photo. Exemplarily, (b) group 3 in FIG. 10 includes a single photo of user 1, and group 4 shown in (c) in FIG. 10 includes a single photo of user 2. Group 9 shown in (a) in FIG. 18 includes the single photo and group photo of user 1, and group 10 shown in (b) in FIG. 18 includes the single photo and group photo of user 2.
如图 23所示, 上述步骤 2303具体可以包括: As shown in Figure 23, the foregoing step 2303 may specifically include:
2303A、 电子设备将多个人脸图像帧划分为至少一个类别, 每个类别分别对应于一个用 户不同形态的多个人脸图像帧。 2303A. The electronic device divides the multiple face image frames into at least one category, and each category corresponds to multiple face image frames of different forms of a user.
示例性的, 参见图 8A, 电子设备可以将人脸图像帧 A-C划分为类别 1, 类别 1中包括用户 1不同形态的多个人脸图像帧; 将人脸图像帧 D-F划分为类别 2, 类别 2中包括用户 2不同形态的多个人脸图像帧; 将人脸图像帧 G-H划分为类别 3, 类别 3中包括用户 1不同形态的多个人脸图像帧。 Exemplarily, referring to FIG. 8A, the electronic device may divide face image frames A to C into category 1, where category 1 includes multiple face image frames of user 1 in different forms; divide face image frames D to F into category 2, where category 2 includes multiple face image frames of user 2 in different forms; and divide face image frames G to H into category 3, where category 3 includes multiple face image frames of user 1 in different forms.
2303B、 电子设备根据多个人脸图像帧的类别划分结果, 对至少一张人脸图片进行聚类 处理。 2303B. The electronic device performs clustering processing on at least one face image according to the classification results of multiple face image frames.
示例性的, 电子设备可以根据图 8A所示的类别 1, 类别 2和类别 3, 对人脸图片 1-4进行聚类处理。 电子设备可以根据类别划分结果, 将人脸图片与已划分的类别归为一组, 或者将人脸图片划分至一个新的分组。 Exemplarily, the electronic device may perform clustering processing on face pictures 1 to 4 according to category 1, category 2, and category 3 shown in FIG. 8A. According to the category division results, the electronic device may group a face picture together with an already divided category, or place the face picture in a new group.
其中, 视频中的人脸图像帧通常是动态变化的人脸图像帧, 可以包括不同形态的人脸图像。 当至少一个视频中的多个人脸图像帧划分的每个类别中, 分别包括同一用户不同形态的人脸图像时, 电子设备可以根据不同类别中不同用户不同形态的人脸图像, 对不同人脸角度、 表情等不同形态的人脸图片进行准确分组, 提高分组的准确性。 Here, the face image frames in a video are usually dynamically changing face image frames and may include face images in different forms. When each category obtained by dividing the multiple face image frames in the at least one video includes face images of one user in different forms, the electronic device can, according to the face images of different users in different forms in the different categories, accurately group face pictures in different forms such as different face angles and expressions, improving the accuracy of the grouping.
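Steps 2303A and 2303B can be sketched as follows, with the video-frame categories acting as prior anchors for the standalone face pictures. The cosine metric, the threshold, and all names are illustrative assumptions rather than the patent's mandated implementation:

```python
def cosine(a, b):
    # Cosine similarity between two face feature vectors.
    dot = sum(x * y for x, y in zip(a, b))
    return dot / ((sum(x * x for x in a) ** 0.5) * (sum(x * x for x in b) ** 0.5))

def cluster_pictures(pictures, categories, first_preset=0.8):
    """Step 2303B sketch: assign each standalone face picture to the most
    similar video-frame category, or open a new group when no category is
    similar enough. `pictures` maps picture id -> feature vector;
    `categories` maps category id -> list of frame feature vectors that
    serve as prior information (hypothetical data layout)."""
    groups = {cid: [] for cid in categories}
    new_groups = []
    for pid, feat in pictures.items():
        best_cid, best_sim = None, 0.0
        for cid, frames in categories.items():
            # A category holds frames of one user in different forms, so
            # compare against every frame and keep the best match.
            sim = max(cosine(feat, f) for f in frames)
            if sim > best_sim:
                best_cid, best_sim = cid, sim
        if best_cid is not None and best_sim >= first_preset:
            groups[best_cid].append(pid)
        else:
            new_groups.append([pid])
    return groups, new_groups
```

A picture matching no category strongly enough opens a new group, mirroring the text's "group the face picture together with an already divided category, or place it in a new group".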
上述步骤 2303A具体可以包括: 电子设备分别将每个视频中的人脸图像帧划分为至少一个类别。 The above step 2303A may specifically include: the electronic device separately divides the face image frames in each video into at least one category.
其中, 同一视频中的相邻图像帧具有时间连续性, 视频中具有时间连续性的同一用户的多个人脸图像帧可以归为一个类别。 而在视频中, 具有时间连续性的同一用户的人脸图像帧通常可以是相邻的人脸图像帧。 Here, adjacent image frames in the same video have temporal continuity, and multiple face image frames of the same user that have temporal continuity in a video can be classified into one category. In a video, the face image frames of the same user that have temporal continuity are usually adjacent face image frames.
例如, 电子设备通过人脸跟踪算法跟踪到的同一视频中的人脸图像具有时间连续性, 满 足 must- link约束, 是同一个用户的人脸, 可以归为同一个类别。 因而, 电子设备可以通过人 脸跟踪算法, 分别将每个视频中具有时间连续性的同一用户的多个人脸图像帧划分为同一个 类别。 这样, 同一视频中多个用户的人脸图像帧就可以对应多个类别。 For example, the face images in the same video tracked by the electronic device through the face tracking algorithm have temporal continuity, meet the must-link constraint, are the faces of the same user, and can be classified into the same category. Therefore, the electronic device can separately classify multiple face image frames of the same user with temporal continuity in each video into the same category through the face tracking algorithm. In this way, the face image frames of multiple users in the same video can correspond to multiple categories.
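A minimal stand-in for the face tracking step is a greedy tracker that links face boxes in adjacent frames when they overlap, so each temporally continuous track becomes one category (the must-link constraint). The IoU criterion and its threshold are assumptions; the patent does not specify a particular tracking algorithm:

```python
def track_faces(detections, iou_thresh=0.5):
    """Greedy tracker sketch: a face box in the current frame inherits the
    track id of an overlapping box from the previous frame; otherwise it
    starts a new track. `detections` is a list of per-frame lists of
    bounding boxes (x1, y1, x2, y2)."""
    def iou(a, b):
        # Intersection-over-union of two axis-aligned boxes.
        ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
        ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
        inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
        area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
        union = area(a) + area(b) - inter
        return inter / union if union else 0.0

    next_id = 0
    prev = []     # (track_id, box) pairs from the previous frame
    labels = []   # per-frame list of track ids (one category per track)
    for boxes in detections:
        cur, frame_labels = [], []
        for box in boxes:
            match = next((tid for tid, pb in prev if iou(pb, box) >= iou_thresh), None)
            if match is None:
                match, next_id = next_id, next_id + 1
            cur.append((match, box))
            frame_labels.append(match)
        prev = cur
        labels.append(frame_labels)
    return labels
```

Because only adjacent frames are linked, each track id groups exactly the temporally continuous face image frames of one user, matching the must-link behavior described above.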
示例性的, 电子设备对视频 1 中的人脸图像帧划分类别后的结果, 可以为图 8A所示的 类别 1、 类别 2和类别 3。 Exemplarily, the result of the electronic device classifying the face image frame in the video 1 may be category 1, category 2, and category 3 shown in FIG. 8A.
上述步骤 2303A具体还可以包括: 若至少一个类别中第一类别中的第一人脸图像帧的人脸特征, 与第二类别中的第二人脸图像帧的人脸特征之间的相似度大于或者等于第二预设值, 则电子设备可以将第一类别和第二类别合并为同一个类别。 The above step 2303A may further include: if the similarity between the face features of a first face image frame in a first category of the at least one category and the face features of a second face image frame in a second category is greater than or equal to a second preset value, the electronic device may merge the first category and the second category into the same category.
其中, 由于人脸特征之间的相似度大于或者等于第二预设值的两个人脸图像帧, 一般为同一用户的人脸图像帧, 这两个人脸图像帧分别所在的类别也与同一用户对应, 因而电子设备可以将这两个人脸图像帧分别所在的类别合并为同一类别。 Here, two face image frames whose face-feature similarity is greater than or equal to the second preset value are generally face image frames of the same user, and the categories in which these two frames are respectively located also correspond to that same user; therefore, the electronic device can merge the two categories into one category.
这样, 电子设备可以先将同一个视频中的人脸图像帧划分类别, 而后再将不同视频中相似度较大的人脸图像帧所在的类别合并, 即将不同视频中同一用户的人脸图像帧合并为同一个类别。 In this way, the electronic device can first divide the face image frames in the same video into categories, and then merge the categories containing highly similar face image frames from different videos, that is, merge the face image frames of the same user in different videos into the same category.
示例性的, 若类别 1中的第一人脸图像帧与类别 3中的第二人脸图像帧的人脸特征之间的相似度大于或者等于第二预设值, 则电子设备将类别 1和类别 3合并为类别 4。 Exemplarily, if the similarity between the face features of the first face image frame in category 1 and the second face image frame in category 3 is greater than or equal to the second preset value, the electronic device merges category 1 and category 3 into category 4.
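The category merging step can be sketched with a union-find structure: any two categories containing a frame pair whose feature similarity reaches the second preset value are merged, and merges chain transitively across videos. The cosine metric and the threshold are hypothetical choices:

```python
def merge_categories(categories, second_preset=0.85):
    """Union-find sketch of cross-video category merging.
    `categories` maps category id -> list of frame feature vectors."""
    ids = list(categories)
    parent = {cid: cid for cid in ids}

    def find(c):
        # Find the root of c with path halving.
        while parent[c] != c:
            parent[c] = parent[parent[c]]
            c = parent[c]
        return c

    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        return dot / ((sum(x * x for x in a) ** 0.5) * (sum(x * x for x in b) ** 0.5))

    for i, ci in enumerate(ids):
        for cj in ids[i + 1:]:
            # One sufficiently similar frame pair merges the two categories.
            if any(cosine(f1, f2) >= second_preset
                   for f1 in categories[ci] for f2 in categories[cj]):
                parent[find(cj)] = find(ci)

    merged = {}
    for cid in ids:
        merged.setdefault(find(cid), []).extend(categories[cid])
    return merged
```

With the example above, category 1 and category 3 (same user in two videos) collapse into one merged category while category 2 stays separate.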
在后续的步骤 2303B中, 电子设备可以根据类别 2和类别 4, 对电子设备保存的 (获取 的) 至少一张人脸图片进行聚类处理。 In the subsequent step 2303B, the electronic device may perform clustering processing on at least one face image saved (acquired) by the electronic device according to category 2 and category 4.
此外, 参见图 23, 该方法还可以包括: In addition, referring to FIG. 23, the method may further include:
2305、 电子设备获取至少一个图像组, 每个图像组中包括同一用户不同形态的多个图像 帧。 2305. The electronic device acquires at least one image group, and each image group includes multiple image frames of the same user in different forms.
其中, 该至少一个图像组包括以下任意一项或任意多项的组合: 动图, 预先拍摄的包括同一用户不同形态的人脸的图像组, 在拍摄预览时实时采集的多帧图像形成的图像组, 或在连拍时拍摄到的多帧图像形成的图像组。 Here, the at least one image group includes any one or a combination of any of the following: a moving picture, a pre-captured image group including faces of the same user in different forms, an image group formed by multiple frames of images collected in real time during shooting preview, or an image group formed by multiple frames of images captured during continuous shooting.
在步骤 2305的基础上, 上述步骤 2302具体可以包括: 电子设备从至少一个视频以及至 少一个图像组中, 提取多个人脸图像帧。 On the basis of step 2305, the above step 2302 may specifically include: the electronic device extracts multiple face image frames from at least one video and at least one image group.
其中, 步骤 2305中的图像组以及步骤 2301中的视频, 可以为本申请上述实施例描述的参考图像集。 也就是说, 电子设备可以从一个或多个参考图像集中获取同一用户不同姿态的多个人脸图像帧, 以便于电子设备根据同一用户不同姿态的多个人脸图像帧对人脸图片进行精确分组, 降低聚类的分散度。 Here, the image groups in step 2305 and the videos in step 2301 may be the reference image sets described in the foregoing embodiments of this application. That is, the electronic device can obtain multiple face image frames of the same user in different poses from one or more reference image sets, so that the electronic device can accurately group the face pictures according to those frames, reducing the dispersion of the clusters.
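Collecting the prior face frames from both kinds of reference image sets can be sketched as below; the frame sampling rate is a hypothetical choice, since the patent does not state how frames are sampled from a video:

```python
def collect_reference_frames(videos, image_groups, sample_rate=10):
    """Pool prior face frames from videos (every Nth frame, a hypothetical
    sampling choice) and from image groups (all frames, since moving
    pictures and burst shots are already short)."""
    frames = []
    for video in videos:
        frames.extend(video[::sample_rate])
    for group in image_groups:
        frames.extend(group)
    return frames
```

The pooled frames then feed the category division and clustering steps described above.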
可以理解的是, 电子设备为了实现上述功能, 其包含了执行各个功能相应的硬件和软件模块。 结合本文中所公开的实施例描述的各示例的算法步骤, 本申请能够以硬件或硬件和计算机软件的结合形式来实现。 某个功能究竟以硬件还是计算机软件驱动硬件的方式来执行, 取决于技术方案的特定应用和设计约束条件。 本领域技术人员可以结合实施例对每个特定的应用来使用不同方法来实现所描述的功能, 但是这种实现不应认为超出本申请的范围。 It can be understood that, in order to implement the above functions, the electronic device includes corresponding hardware and software modules for performing each function. With reference to the algorithm steps of the examples described in the embodiments disclosed herein, this application can be implemented in the form of hardware or a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends on the specific application and design constraints of the technical solution. Those skilled in the art may use different methods to implement the described functions for each specific application in combination with the embodiments, but such implementations should not be considered beyond the scope of this application.
本申请实施例可以根据上述方法示例对电子设备进行功能模块的划分, 例如, 可以对应 各个功能划分各个功能模块, 也可以将两个或两个以上的功能集成在一个处理模块中。 上述 集成的模块可以采用硬件的形式实现。 需要说明的是, 本申请实施例中对模块的划分是示意 性的, 仅仅为一种逻辑功能划分, 实际实现时可以有另外的划分方式。 The embodiment of the present application may divide the electronic device into functional modules according to the foregoing method examples. For example, each functional module may be divided corresponding to each function, or two or more functions may be integrated into one processing module. The above integrated modules can be implemented in the form of hardware. It should be noted that the division of modules in the embodiments of the present application is illustrative, and is only a logical function division, and there may be other division methods in actual implementation.
在采用对应各个功能划分各个功能模块的情况下,图 24示出了上述实施例中涉及的电子 设备 2400的一种可能的组成示意图, 如图 24所示, 该电子设备 2400可以包括: 获取单元 2401、 提取单元 2402、 聚类单元 2403和显示单元 2404等。 In the case of dividing each functional module corresponding to each function, FIG. 24 shows a schematic diagram of a possible composition of the electronic device 2400 involved in the foregoing embodiment. As shown in FIG. 24, the electronic device 2400 may include: an acquiring unit 2401, extraction unit 2402, clustering unit 2403, display unit 2404, and so on.
其中, 获取单元 2401可以用于支持电子设备 2400执行上述步骤 2301, 和 /或用于本文所 描述的技术的其他过程。 Wherein, the obtaining unit 2401 may be used to support the electronic device 2400 to perform the foregoing step 2301, and/or other processes used in the technology described herein.
提取单元 2402可以用于支持电子设备 2400执行上述步骤 2302等, 和 /或用于本文所描述的技术的其他过程。 The extracting unit 2402 may be used to support the electronic device 2400 in performing the foregoing step 2302 and the like, and/or used in other processes of the technology described herein.
聚类单元 2403可以用于支持电子设备 2400执行上述步骤 2303、步骤 2303A、步骤 2303B 等, 和 /或用于本文所描述的技术的其他过程。 The clustering unit 2403 may be used to support the electronic device 2400 to perform the above steps 2303, 2303A, 2303B, etc., and/or other processes of the technology described herein.
显示单元 2404可以用于支持电子设备 2400执行上述步骤 2304等, 和 /或用于本文所描 述的技术的其他过程。 The display unit 2404 may be used to support the electronic device 2400 to perform the above-mentioned steps 2304, etc., and/or used in other processes of the technology described herein.
需要说明的是, 上述方法实施例涉及的各步骤的所有相关内容均可以援引到对应功能模 块的功能描述, 在此不再赘述。 It should be noted that all relevant content of the steps involved in the above method embodiments can be cited in the functional description of the corresponding functional module, which will not be repeated here.
本申请实施例提供的电子设备, 用于执行上述图片的分组方法, 因此可以达到与上述实 现方法相同的效果。 The electronic device provided in the embodiment of the present application is used to perform the above-mentioned grouping method for pictures, and therefore can achieve the same effect as the above-mentioned implementation method.
在采用集成的单元的情况下, 电子设备可以包括处理模块和存储模块。 其中, 处理模块 可以用于对电子设备的动作进行控制管理, 例如, 可以用于支持电子设备执行上述获取单元 2401、 提取单元 2402、 聚类单元 2403和显示单元 2404执行的步骤。 In the case of an integrated unit, the electronic device may include a processing module and a storage module. The processing module can be used to control and manage the actions of the electronic device. For example, it can be used to support the electronic device to execute the steps performed by the above-mentioned obtaining unit 2401, extraction unit 2402, clustering unit 2403, and display unit 2404.
存储模块可以用于支持电子设备存储人脸图片和视频、 动图等参考图像集, 以及存储程 序代码和数据等。 The storage module can be used to support electronic devices to store reference image sets such as face pictures and videos, moving pictures, and to store program codes and data.
另外, 电子设备还可以包括通信模块, 可以用于支持电子设备与其他设备的通信。 In addition, the electronic device may also include a communication module, which may be used to support communication between the electronic device and other devices.
其中, 处理模块可以是处理器或控制器。 其可以实现或执行结合本申请公开内容所描述的各种示例性的逻辑方框, 模块和电路。 处理器也可以是实现计算功能的组合, 例如包含一个或多个微处理器组合, 数字信号处理 (digital signal processing, DSP) 和微处理器的组合等等。 存储模块可以是存储器。 通信模块具体可以为射频电路、 蓝牙芯片、 Wi-Fi 芯片等与其他电子设备交互的设备。 Here, the processing module may be a processor or a controller, which may implement or execute the various exemplary logical blocks, modules, and circuits described with reference to the disclosure of this application. The processor may also be a combination implementing computing functions, for example, a combination including one or more microprocessors, or a combination of digital signal processing (DSP) and a microprocessor. The storage module may be a memory. The communication module may specifically be a radio frequency circuit, a Bluetooth chip, a Wi-Fi chip, or another device that interacts with other electronic devices.
在一个实施例中, 当处理模块为处理器, 存储模块为存储器时, 本申请实施例所涉及的电子设备可以为具有图 1所示结构的电子设备。 具体的, 图 1所示的内部存储器 121可以存储有计算机程序指令, 当指令被处理器 110执行时, 使得电子设备可以执行: 获取至少一个视频; 从至少一个视频中提取多个人脸图像帧; 根据多个人脸图像帧, 对至少一张人脸图片进行聚类处理; 根据聚类处理结果, 显示至少一个分组, 每个分组分别包括一个用户的至少一张人脸图片。 In an embodiment, when the processing module is a processor and the storage module is a memory, the electronic device involved in the embodiments of this application may be an electronic device having the structure shown in FIG. 1. Specifically, the internal memory 121 shown in FIG. 1 may store computer program instructions, and when the instructions are executed by the processor 110, the electronic device is caused to: acquire at least one video; extract multiple face image frames from the at least one video; perform clustering processing on at least one face picture according to the multiple face image frames; and display at least one group according to the clustering processing result, where each group includes at least one face picture of one user.
具体的, 当指令被处理器 110执行时, 使得电子设备具体可以执行: 将多个人脸图像帧划分为至少一个类别, 每个类别分别对应于一个用户不同形态的多个人脸图像帧; 并根据多个人脸图像帧的类别划分结果, 对至少一张人脸图片进行聚类处理等上述方法实施例中的步骤。 Specifically, when the instructions are executed by the processor 110, the electronic device is caused to: divide the multiple face image frames into at least one category, where each category corresponds to multiple face image frames of one user in different forms; and perform clustering processing on the at least one face picture according to the category division results of the multiple face image frames, among other steps of the foregoing method embodiments.
本申请实施例还提供一种计算机存储介质, 该计算机存储介质中存储有计算机指令, 当 该计算机指令在电子设备上运行时, 使得电子设备执行上述相关方法步骤实现上述实施例中 的图片分组方法。 The embodiment of the present application also provides a computer storage medium, the computer storage medium stores computer instructions, when the computer instructions run on the electronic device, the electronic device executes the above-mentioned related method steps to implement the picture grouping method in the above-mentioned embodiment .
本申请实施例还提供一种计算机程序产品, 当该计算机程序产品在计算机上运行时, 使 得计算机执行上述相关步骤, 以实现上述实施例中的图片分组方法。 The embodiments of the present application also provide a computer program product. When the computer program product runs on a computer, the computer is caused to execute the above-mentioned related steps, so as to realize the picture grouping method in the above-mentioned embodiment.
另外, 本申请的实施例还提供一种装置, 这个装置具体可以是芯片, 组件或模块, 该装置可包括相连的处理器和存储器; 其中, 存储器用于存储计算机执行指令, 当装置运行时, 处理器可执行存储器存储的计算机执行指令, 以使芯片执行上述各方法实施例中的图片分组方法。 In addition, the embodiments of this application further provide an apparatus, which may specifically be a chip, a component, or a module. The apparatus may include a processor and a memory that are connected, where the memory is used to store computer-executable instructions. When the apparatus runs, the processor can execute the computer-executable instructions stored in the memory, so that the chip executes the picture grouping methods in the foregoing method embodiments.
其中, 本申请实施例提供的电子设备、 计算机存储介质、 计算机程序产品或芯片均用于执行上文所提供的对应的方法, 因此, 其所能达到的有益效果可参考上文所提供的对应的方法中的有益效果, 此处不再赘述。 The electronic device, computer storage medium, computer program product, and chip provided in the embodiments of this application are all used to perform the corresponding methods provided above. Therefore, for the beneficial effects they can achieve, reference may be made to the beneficial effects of the corresponding methods provided above, which are not repeated here.
通过以上实施方式的描述, 所属领域的技术人员可以清楚地了解到, 为描述的方便和简洁, 仅以上述各功能模块的划分进行举例说明, 实际应用中, 可以根据需要而将上述功能分配由不同的功能模块完成, 即将装置的内部结构划分成不同的功能模块, 以完成以上描述的全部或者部分功能。 Through the description of the above implementations, those skilled in the art can clearly understand that, for convenience and brevity of description, only the division of the above functional modules is used as an example for illustration. In actual applications, the above functions can be allocated to different functional modules as needed, that is, the internal structure of the apparatus is divided into different functional modules to complete all or part of the functions described above.
在本申请所提供的几个实施例中, 应该理解到, 所揭露的装置和方法, 可以通过其它的 方式实现。 例如, 以上所描述的装置实施例仅仅是示意性的, 例如, 模块或单元的划分, 仅 仅为一种逻辑功能划分, 实际实现时可以有另外的划分方式, 例如多个单元或组件可以结合 或者可以集成到另一个装置, 或一些特征可以忽略, 或不执行。 另一点, 所显示或讨论的相 互之间的耦合或直接耦合或通信连接可以是通过一些接口, 装置或单元的间接耦合或通信连 接, 可以是电性, 机械或其它的形式。 In the several embodiments provided in this application, it should be understood that the disclosed device and method can be implemented in other ways. For example, the device embodiments described above are merely illustrative, for example, the division of modules or units is only a logical function division, and there may be other division methods in actual implementation, for example, multiple units or components may be combined or It can be integrated into another device, or some features can be ignored or not implemented. In addition, the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.
作为分离部件说明的单元可以是或者也可以不是物理上分开的, 作为单元显示的部件可 以是一个物理单元或多个物理单元,即可以位于一个地方,或者也可以分布到多个不同地方。 可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。 The units described as separate parts may or may not be physically separate. The parts displayed as a unit may be one physical unit or multiple physical units, that is, they may be located in one place or distributed to multiple different places. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
另外, 在本申请各个实施例中的各功能单元可以集成在一个处理单元中, 也可以是各个 单元单独物理存在, 也可以两个或两个以上单元集成在一个单元中。 上述集成的单元既可以 采用硬件的形式实现, 也可以采用软件功能单元的形式实现。 In addition, each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit. The above-mentioned integrated unit can be realized in the form of hardware or software functional unit.
集成的单元如果以软件功能单元的形式实现并作为独立的产品销售或使用时, 可以存储在一个可读取存储介质中。 基于这样的理解, 本申请实施例的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的全部或部分可以以软件产品的形式体现出来, 该软件产品存储在一个存储介质中, 包括若干指令用以使得一个设备 (可以是单片机, 芯片等) 或处理器 (processor) 执行本申请各个实施例所述方法的全部或部分步骤。 而前述的存储介质包括: U盘、 移动硬盘、 只读存储器 (read only memory, ROM)、 随机存取存储器 (random access memory, RAM)、 磁碟或者光盘等各种可以存储程序代码的介质。 If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a readable storage medium. Based on this understanding, the technical solutions of the embodiments of this application essentially, or the part contributing to the prior art, or all or part of the technical solutions, may be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions used to cause a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to execute all or part of the steps of the methods described in the embodiments of this application. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
以上内容, 仅为本申请的具体实施方式, 但本申请的保护范围并不局限于此, 任何在本申请揭露的技术范围内的变化或替换, 都应涵盖在本申请的保护范围之内。 因此, 本申请的保护范围应以所述权利要求的保护范围为准。 The above content is only specific implementations of this application, but the protection scope of this application is not limited thereto. Any variation or replacement within the technical scope disclosed in this application shall be covered by the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.
The above content is only the specific implementation manners of this application, but the protection scope of this application is not limited to this. Changes or replacements within the scope of the technology disclosed in the application shall be covered by the scope of protection of this application. Therefore, the protection scope of this application should be subject to the protection scope of the claims.
Claims
1、 一种图片分组方法, 应用于电子设备, 所述电子设备上保存有至少一张人脸图片, 其 特征在于, 所述方法包括: 1. A picture grouping method applied to an electronic device, where at least one face picture is stored on the electronic device, characterized in that the method includes:
获取至少一个视频; acquiring at least one video;
从所述至少一个视频中提取多个人脸图像帧; Extracting multiple face image frames from the at least one video;
根据所述多个人脸图像帧, 对所述至少一张人脸图片进行聚类处理; Perform clustering processing on the at least one face picture according to the multiple face image frames;
根据所述聚类处理结果, 显示至少一个分组, 每个所述分组分别包括一个用户的至少一张人脸图片。 displaying at least one group according to the clustering processing result, where each group includes at least one face picture of one user.
2、 根据权利要求 1所述的方法, 其特征在于, 所述根据所述多个人脸图像帧, 对所述至 少一张人脸图片进行聚类处理, 包括: 2. The method according to claim 1, wherein the clustering of the at least one face picture according to the multiple face image frames comprises:
将所述多个人脸图像帧划分为至少一个类别, 每个所述类别分别对应于一个用户不同形 态的多个人脸图像帧; Dividing the multiple face image frames into at least one category, and each category corresponds to multiple face image frames of different forms of a user;
根据所述多个人脸图像帧的类别划分结果, 对所述至少一张人脸图片进行聚类处理。 Perform clustering processing on the at least one face picture according to the classification results of the multiple face image frames.
3、根据权利要求 2所述的方法, 其特征在于, 所述将所述多个人脸图像帧划分为至少一 个类别, 包括: 3. The method according to claim 2, wherein the dividing the plurality of face image frames into at least one category comprises:
分别将每个所述视频中的人脸图像帧划分为至少一个类别; Dividing the face image frames in each video into at least one category;
若所述至少一个类别中第一类别中的第一人脸图像帧的人脸特征, 与第二类别中的第二人脸图像帧的人脸特征之间的相似度大于或者等于预设值, 则将所述第一类别和所述第二类别合并为同一个类别。 if the similarity between face features of a first face image frame in a first category of the at least one category and face features of a second face image frame in a second category is greater than or equal to a preset value, merging the first category and the second category into the same category.
4、根据权利要求 3所述的方法, 其特征在于, 所述分别将每个所述视频中的人脸图像帧 划分为至少一个类别, 包括: 4. The method according to claim 3, wherein the separately dividing the face image frames in each of the videos into at least one category comprises:
通过人脸跟踪算法, 分别将每个所述视频中, 具有时间连续性的同一用户的多个人脸图 像帧划分为同一个类别。 Through the face tracking algorithm, the multiple face image frames of the same user with temporal continuity in each video are divided into the same category.
5、 根据权利要求 1-4任一项所述的方法, 其特征在于, 每个所述分组还包括以下任意一 项或任意多项的组合: 所述用户的人脸图像帧所在的视频, 所述用户的人脸图像帧所在的视 频分段, 或所述用户的至少一个人脸图像帧。 5. The method according to any one of claims 1-4, wherein each of the groups further includes any one or a combination of any of the following: the video where the user's face image frame is located, The video segment in which the face image frame of the user is located, or at least one face image frame of the user.
6、 根据权利要求 1-5任一项所述的方法, 其特征在于, 每个所述分组包括的一个用户的 至少一张人脸图片为单人照或合影。 6. The method according to any one of claims 1-5, wherein at least one face picture of a user included in each group is a single photo or a group photo.
7、 根据权利要求 1-6任一项所述的方法, 其特征在于, 所述获取至少一个视频, 包括: 从所述电子设备的存储区获取所述至少一个视频。 7. The method according to any one of claims 1-6, wherein the acquiring at least one video comprises: acquiring the at least one video from a storage area of the electronic device.
8、 根据权利要求 1-6任一项所述的方法, 其特征在于, 所述获取至少一个视频, 包括: 提示用户拍摄包括人脸图像帧的视频; 8. The method according to any one of claims 1-6, wherein said acquiring at least one video comprises: prompting a user to shoot a video including face image frames;
在检测到用户指示拍摄视频的操作后, 录制并生成至少一个视频。 After detecting the user's instruction to shoot the video, at least one video is recorded and generated.
9、 根据权利要求 1-8任一项所述的方法, 其特征在于, 所述方法还包括: 9. The method according to any one of claims 1-8, wherein the method further comprises:
获取至少一个图像组, 每个所述图像组中包括同一用户不同形态的多个图像帧; 所述至 少一个图像组包括以下任意一项或任意多项的组合: 动图, 预先拍摄的包括同一用户不同形 态的人脸的图像组, 在拍摄预览时实时采集的多帧图像形成的图像组, 或在连拍时拍摄到的 多帧图像形成的图像组; At least one image group is acquired, and each image group includes multiple image frames of the same user in different forms; the at least one image group includes any one or a combination of any of the following: moving pictures, pre-photographed including the same Image groups of different forms of the user’s face, an image group formed by multiple frames of images captured in real-time during shooting preview, or an image group formed by multiple frames of images captured during continuous shooting;
所述从所述至少一个视频中提取多个人脸图像帧, 包括: The extracting multiple face image frames from the at least one video includes:
从所述至少一个视频以及所述至少一个图像组中, 提取所述多个人脸图像帧。
Extracting the plurality of face image frames from the at least one video and the at least one image group.
10、 一种电子设备, 其特征在于, 所述电子设备包括: 至少一个处理器; 至少一个存储器; 其中, 所述至少一个存储器中存储有计算机程序指令, 当所述指令被所述至少一个处理器执行时, 使得所述电子设备执行以下步骤: 10. An electronic device, characterized in that the electronic device comprises: at least one processor; and at least one memory; wherein the at least one memory stores computer program instructions, and when the instructions are executed by the at least one processor, the electronic device is caused to perform the following steps:
获取至少一个视频; acquiring at least one video;
从所述至少一个视频中提取多个人脸图像帧; Extracting multiple face image frames from the at least one video;
根据所述多个人脸图像帧, 对所述至少一张人脸图片进行聚类处理; Perform clustering processing on the at least one face picture according to the multiple face image frames;
根据所述聚类处理结果, 显示至少一个分组, 每个所述分组分别包括一个用户的至少一张人脸图片。 displaying at least one group according to the clustering processing result, where each group includes at least one face picture of one user.
11、 根据权利要求 10所述的电子设备, 其特征在于, 所述根据所述多个人脸图像帧, 对 所述至少一张人脸图片进行聚类处理, 具体包括: 11. The electronic device according to claim 10, wherein the clustering process on the at least one face image according to the multiple face image frames specifically comprises:
将所述多个人脸图像帧划分为至少一个类别, 每个所述类别分别对应于一个用户不同形 态的多个人脸图像帧; Dividing the multiple face image frames into at least one category, and each category corresponds to multiple face image frames of different forms of a user;
根据所述多个人脸图像帧的类别划分结果, 对所述至少一张人脸图片进行聚类处理。 Perform clustering processing on the at least one face picture according to the classification results of the multiple face image frames.
12、 根据权利要求 11所述的电子设备, 其特征在于, 所述将所述多个人脸图像帧划分为 至少一个类别, 具体包括: 12. The electronic device according to claim 11, wherein the dividing the plurality of face image frames into at least one category specifically comprises:
分别将每个所述视频中的人脸图像帧划分为至少一个类别; Dividing the face image frames in each video into at least one category;
若所述至少一个类别中第一类别中的第一人脸图像帧的人脸特征, 与第二类别中的第二人脸图像帧的人脸特征之间的相似度大于或者等于预设值, 则将所述第一类别和所述第二类别合并为同一个类别。 if the similarity between face features of a first face image frame in a first category of the at least one category and face features of a second face image frame in a second category is greater than or equal to a preset value, merging the first category and the second category into the same category.
13、根据权利要求 12所述的电子设备, 其特征在于, 所述分别将每个所述视频中的人脸 图像帧划分为至少一个类别, 具体包括: 13. The electronic device according to claim 12, wherein said separately dividing the face image frames in each said video into at least one category specifically comprises:
通过人脸跟踪算法, 分别将每个所述视频中, 具有时间连续性的同一用户的多个人脸图 像帧划分为同一个类别。 Through the face tracking algorithm, the multiple face image frames of the same user with temporal continuity in each video are divided into the same category.
14、 根据权利要求 10-13任一项所述的电子设备, 其特征在于, 每个所述分组还包括以 下任意一项或任意多项的组合: 所述用户的人脸图像帧所在的视频, 所述用户的人脸图像帧 所在的视频分段, 或所述用户的至少一个人脸图像帧。 14. The electronic device according to any one of claims 10-13, wherein each of the groups further comprises any one or a combination of any of the following: the video where the user's face image frame is located , The video segment where the face image frame of the user is located, or at least one face image frame of the user.
15、 根据权利要求 10-14任一项所述的电子设备, 其特征在于, 每个所述分组包括的一 个用户的至少一张人脸图片为单人照或合影。 15. The electronic device according to any one of claims 10-14, wherein at least one face picture of a user included in each group is a single photo or a group photo.
16、根据权利要求 10-15任一项所述的电子设备,其特征在于,所述获取至少一个视频, 具体包括: 16. The electronic device according to any one of claims 10-15, wherein said acquiring at least one video specifically comprises:
从所述至少一个存储器获取所述至少一个视频。 The at least one video is acquired from the at least one memory.
17、根据权利要求 10-15任一项所述的电子设备,其特征在于,所述获取至少一个视频, 具体包括: 17. The electronic device according to any one of claims 10-15, wherein said acquiring at least one video specifically comprises:
提示用户拍摄包括人脸图像帧的视频; Prompt the user to shoot a video including face image frames;
在检测到用户指示拍摄视频的操作后, 录制并生成至少一个视频。 After detecting the user's instruction to shoot the video, at least one video is recorded and generated.
18、 根据权利要求 10-17任一项所述的电子设备, 其特征在于, 当所述指令被所述至少 一个处理器执行时, 还使得所述电子设备执行以下步骤: 18. The electronic device according to any one of claims 10-17, wherein when the instruction is executed by the at least one processor, the electronic device is further caused to execute the following steps:
获取至少一个图像组, 每个所述图像组中包括同一用户不同形态的多个图像帧; 所述至少一个图像组包括以下任意一项或任意多项的组合: 动图, 预先拍摄的包括同一用户不同形态的人脸的图像组, 在拍摄预览时实时采集的多帧图像形成的图像组, 或在连拍时拍摄到的多帧图像形成的图像组; acquiring at least one image group, where each image group includes multiple image frames of the same user in different forms, and the at least one image group includes any one or a combination of any of the following: a moving picture, a pre-captured image group including faces of the same user in different forms, an image group formed by multiple frames of images collected in real time during shooting preview, or an image group formed by multiple frames of images captured during continuous shooting;
所述从所述至少一个视频中提取多个人脸图像帧, 具体包括: The extracting multiple face image frames from the at least one video specifically includes:
从所述至少一个视频以及所述至少一个图像组中, 提取所述多个人脸图像帧。 Extracting the plurality of face image frames from the at least one video and the at least one image group.
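The pipeline described in claims 10-18 can be summarized as: extract face image frames from videos (and image groups), compute a feature for each face, and cluster the frames so that each group holds one user's faces. Below is a minimal, hedged sketch of that grouping step. The `embed` function and the toy feature vectors are hypothetical stand-ins; a real implementation would use the face-feature extraction model that the patent leaves unspecified.

```python
# Sketch only: greedy similarity clustering of face frames into per-user
# groups. The embeddings here are hand-made stand-ins for real face
# features, not the patent's actual algorithm.
import math

def cosine_similarity(a, b):
    # Cosine similarity between two feature vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def group_face_frames(frames, embed, threshold=0.9):
    """Assign each frame to the first group whose representative
    embedding is similar enough; otherwise start a new group."""
    groups = []  # each entry: (representative_embedding, [frames])
    for frame in frames:
        vec = embed(frame)
        for rep, members in groups:
            if cosine_similarity(vec, rep) >= threshold:
                members.append(frame)
                break
        else:
            groups.append((vec, [frame]))
    return [members for _, members in groups]

# Toy demo: two frames of one user, one frame of another.
fake_embeddings = {
    "video1_frame3": [1.0, 0.0],
    "video1_frame9": [0.98, 0.05],
    "video2_frame1": [0.0, 1.0],
}
groups = group_face_frames(fake_embeddings, fake_embeddings.get)
print(len(groups))  # two users -> two groups
```

In practice, a production system would replace the greedy pass with a proper clustering algorithm (the claims mention a clustering algorithm without naming one) and would attach each group's source video and video segments as claims 14 and 18 describe.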
19. A computer storage medium, comprising computer instructions which, when run on an electronic device, cause the electronic device to perform the picture grouping method according to any one of claims 1-9.
20. A computer program product which, when run on a computer, causes the computer to perform the picture grouping method according to any one of claims 1-9.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910147299.6A CN111625670A (en) | 2019-02-27 | 2019-02-27 | Picture grouping method and device |
CN201910147299.6 | 2019-02-27 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2020173379A1 true WO2020173379A1 (en) | 2020-09-03 |
Family
ID=72239103
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2020/076040 WO2020173379A1 (en) | 2019-02-27 | 2020-02-20 | Picture grouping method and device |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN111625670A (en) |
WO (1) | WO2020173379A1 (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112364688B (en) * | 2020-09-30 | 2022-04-08 | 北京奇信智联科技有限公司 | Face clustering method and device, computer equipment and readable storage medium |
CN113542594B (en) * | 2021-06-28 | 2023-11-17 | 惠州Tcl云创科技有限公司 | High-quality image extraction processing method and device based on video and mobile terminal |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20130076943A1 (en) * | 2007-08-23 | 2013-03-28 | Samsung Electronics Co., Ltd. | Apparatus and method for image recognition of facial areas in photographic images from a digital camera |
CN103034714A (en) * | 2012-12-11 | 2013-04-10 | 北京百度网讯科技有限公司 | Mobile terminal, picture sorting management method and picture sorting management device of mobile terminal |
CN105608425A (en) * | 2015-12-17 | 2016-05-25 | 小米科技有限责任公司 | Method and device for sorted storage of pictures |
CN105631408A (en) * | 2015-12-21 | 2016-06-01 | 小米科技有限责任公司 | Video-based face album processing method and processing device |
CN105740850A (en) * | 2016-03-04 | 2016-07-06 | 北京小米移动软件有限公司 | Method and device for classifying photos |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104731964A (en) * | 2015-04-07 | 2015-06-24 | 上海海势信息科技有限公司 | Face abstracting method and video abstracting method based on face recognition and devices thereof |
CN105243098B (en) * | 2015-09-16 | 2018-10-26 | 小米科技有限责任公司 | The clustering method and device of facial image |
CN106776662B (en) * | 2015-11-25 | 2020-03-03 | 腾讯科技(深圳)有限公司 | Photo sorting method and device |
CN107977674B (en) * | 2017-11-21 | 2020-02-18 | Oppo广东移动通信有限公司 | Image processing method, image processing device, mobile terminal and computer readable storage medium |
- 2019
- 2019-02-27: CN application CN201910147299.6A, publication CN111625670A, status: active, Pending
- 2020
- 2020-02-20: WO application PCT/CN2020/076040, publication WO2020173379A1, status: active, Application Filing
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112287152A (en) * | 2020-10-26 | 2021-01-29 | 山东晨熙智能科技有限公司 | Photo classification method and system |
CN112395037A (en) * | 2020-12-07 | 2021-02-23 | 深圳云天励飞技术股份有限公司 | Dynamic cover selection method and device, electronic equipment and storage medium |
CN112950601A (en) * | 2021-03-11 | 2021-06-11 | 成都微识医疗设备有限公司 | Method, system and storage medium for screening pictures for esophageal cancer model training |
CN112950601B (en) * | 2021-03-11 | 2024-01-09 | 成都微识医疗设备有限公司 | Picture screening method, system and storage medium for esophageal cancer model training |
CN113111934A (en) * | 2021-04-07 | 2021-07-13 | 杭州海康威视数字技术股份有限公司 | Image grouping method and device, electronic equipment and storage medium |
CN113111934B (en) * | 2021-04-07 | 2023-08-08 | 杭州海康威视数字技术股份有限公司 | Image grouping method and device, electronic equipment and storage medium |
CN113378764A (en) * | 2021-06-25 | 2021-09-10 | 深圳市斯博科技有限公司 | Video face acquisition method, device, equipment and medium based on clustering algorithm |
CN113378764B (en) * | 2021-06-25 | 2022-11-29 | 深圳万兴软件有限公司 | Video face acquisition method, device, equipment and medium based on clustering algorithm |
CN116708751A (en) * | 2022-09-30 | 2023-09-05 | 荣耀终端有限公司 | Method and device for determining photographing duration and electronic equipment |
CN116708751B (en) * | 2022-09-30 | 2024-02-27 | 荣耀终端有限公司 | Method and device for determining photographing duration and electronic equipment |
Also Published As
Publication number | Publication date |
---|---|
CN111625670A (en) | 2020-09-04 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2021213120A1 (en) | Screen projection method and apparatus, and electronic device | |
WO2020259038A1 (en) | Method and device for capturing images | |
WO2020186969A1 (en) | Multi-path video recording method and device | |
WO2020173379A1 (en) | Picture grouping method and device | |
WO2020078299A1 (en) | Method for processing video file, and electronic device | |
AU2019418925B2 (en) | Photographing method and electronic device | |
WO2020238775A1 (en) | Scene recognition method, scene recognition device, and electronic apparatus | |
CN111132234B (en) | Data transmission method and corresponding terminal | |
US20220180485A1 (en) | Image Processing Method and Electronic Device | |
WO2021057277A1 (en) | Photographing method in dark light and electronic device | |
WO2021068926A1 (en) | Model updating method, working node, and model updating system | |
WO2022022319A1 (en) | Image processing method, electronic device, image processing system and chip system | |
CN112840635A (en) | Intelligent photographing method, system and related device | |
WO2021115483A1 (en) | Image processing method and related apparatus | |
WO2021057626A1 (en) | Image processing method, apparatus, device, and computer storage medium | |
CN112150499B (en) | Image processing method and related device | |
WO2021208677A1 (en) | Eye bag detection method and device | |
WO2023241209A9 (en) | Desktop wallpaper configuration method and apparatus, electronic device and readable storage medium | |
WO2020062304A1 (en) | File transmission method and electronic device | |
WO2022135144A1 (en) | Self-adaptive display method, electronic device, and storage medium | |
WO2024045661A1 (en) | Image processing method and electronic device | |
WO2023005882A1 (en) | Photographing method, photographing parameter training method, electronic device, and storage medium | |
WO2020078267A1 (en) | Method and device for voice data processing in online translation process | |
WO2022214004A1 (en) | Target user determination method, electronic device and computer-readable storage medium | |
WO2022143158A1 (en) | Data backup method, electronic device, data backup system and chip system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 20763392; Country of ref document: EP; Kind code of ref document: A1 |
| NENP | Non-entry into the national phase | Ref country code: DE |
| 122 | Ep: pct application non-entry in european phase | Ref document number: 20763392; Country of ref document: EP; Kind code of ref document: A1 |