WO2021036906A1 - Picture processing method and device - Google Patents

Picture processing method and device

Info

Publication number
WO2021036906A1
Authority
WO
WIPO (PCT)
Prior art keywords
picture
newly added
pictures
sets
electronic device
Prior art date
Application number
PCT/CN2020/110307
Other languages
English (en)
French (fr)
Inventor
赵一麟
郭宏伟
黄向东
李凯
马杰延
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2021036906A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/51 Indexing; Data structures therefor; Storage structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/55 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 Detection; Localisation; Normalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172 Classification, e.g. identification

Definitions

  • This application relates to the field of terminal technology, and in particular to a picture processing method and device.
  • Smart terminal devices store a large number of user photos, so classifying those photos so that all photos meeting specific conditions can be retrieved quickly and in real time has become an urgent problem.
  • Existing picture collection and retrieval technology is based on face recognition, neural network, and clustering algorithms. That is, the smart terminal device performs face recognition on a picture, obtains the feature vector of the picture through a neural network algorithm, and compares it one by one with the feature vectors of all photos in the device's gallery to cluster the pictures. Based on these one-by-one comparisons, the picture whose feature vector is closest to that of the query picture is returned as the retrieval result.
  • This application provides a picture processing method and device, which solve the problems in the prior art that classifying and retrieving by comparing feature vectors one by one is too inefficient, that picture retrieval results cannot be obtained in real time, and that new classifications and indexes cannot be created quickly and in real time from newly added photos.
  • In a first aspect, an image processing method is provided, which is applied to an electronic device. The method includes: the electronic device clusters a plurality of pictures to obtain a plurality of picture sets; the electronic device obtains the similarity distance between a newly added picture and the center point of each of the plurality of picture sets, and determines, from the plurality of picture sets, the first picture set with the smallest similarity distance to the newly added picture, where a smaller similarity distance indicates higher similarity and the center point is the average of the feature vectors of all pictures in the picture set; if the similarity distance between the newly added picture and the center point of the first picture set is less than or equal to the classification threshold of the first picture set, the electronic device determines that the newly added picture belongs to the first picture set; if that similarity distance is greater than the classification threshold of the first picture set, the electronic device creates a second picture set, and the newly added picture belongs to the second picture set.
  • the electronic device may be a smart terminal device.
  • The electronic device calculates the similarity distance between the newly added picture and the center points of the picture sets obtained through clustering, and selects the picture set with the smallest similarity distance as the archiving result for the newly added picture. This avoids comparing the feature vector of the newly added picture with those of the original pictures one by one, so newly added pictures can be clustered quickly and in real time, and a new picture set can be created for a picture that fails to cluster. This improves the efficiency of picture clustering and the user experience.
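The nearest-center archiving step described above can be sketched roughly as follows. This is a minimal illustration, not the applicant's implementation: the Euclidean metric, the function names, and the `default_threshold` given to a newly created set are all assumptions, since the text does not specify them.

```python
import math


def distance(a, b):
    # Euclidean distance; the choice of metric is an assumption of this sketch.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def center_point(vectors):
    # Center point = element-wise average of all feature vectors in the set.
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def assign_new_picture(new_vec, picture_sets, thresholds, default_threshold):
    """Archive one newly added picture and return the index of its set.

    picture_sets: list of picture sets, each a list of feature vectors.
    thresholds: classification threshold of each set, in the same order.
    default_threshold: hypothetical threshold for a newly created set.
    """
    if picture_sets:
        dists = [distance(new_vec, center_point(s)) for s in picture_sets]
        # "First picture set": the one whose center is closest to the picture.
        first = min(range(len(dists)), key=dists.__getitem__)
        if dists[first] <= thresholds[first]:
            picture_sets[first].append(new_vec)
            return first
    # No existing set is close enough: create a new ("second") picture set.
    picture_sets.append([new_vec])
    thresholds.append(default_threshold)
    return len(picture_sets) - 1
```

Only one distance per existing picture set is computed, rather than one per stored picture, which is the saving the text attributes to comparing against center points.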
  • The electronic device clusters multiple pictures to obtain multiple picture sets, which specifically includes: according to the similarity distances between the feature vector of each picture and the feature vectors of the multiple pictures, obtaining, for each picture, the top K similarity distances sorted from smallest to largest, where K is a positive integer greater than or equal to 1; performing a difference operation on the K similarity distances corresponding to each picture to obtain the outlier among them, where the similarity distance corresponding to the outlier is the classification threshold corresponding to that picture; among the similarity distances between each picture and the feature vectors of the multiple pictures, marking the pictures whose similarity distance is less than or equal to the classification threshold as one picture set, thereby obtaining the picture set corresponding to each picture; and merging the generated picture sets to obtain the multiple picture sets.
  • The electronic device uses an artificial intelligence algorithm to calculate the classification threshold automatically when clustering multiple pictures; that is, each group of pictures is classified with its own automatically determined threshold. This avoids the inaccuracy of applying a fixed threshold to a large number of pictures whose similarity distances differ greatly, and improves the flexibility and accuracy of picture clustering.
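One way to read the automatic threshold is sketched below. The text does not say how the outlier is identified after the difference operation, so this sketch assumes the largest jump between consecutive sorted distances marks the outlier and takes the distance just before that jump as the threshold. The Euclidean metric and all function names are likewise assumptions.

```python
import math


def distance(a, b):
    # Euclidean distance; the choice of metric is an assumption of this sketch.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classification_threshold(top_k):
    """Derive a threshold from the top-K distances, sorted ascending.

    Assumed rule: difference consecutive distances, find the largest jump,
    and return the distance just before it.
    """
    if not top_k:
        return 0.0
    if len(top_k) == 1:
        return top_k[0]
    gaps = [b - a for a, b in zip(top_k, top_k[1:])]
    jump = max(range(len(gaps)), key=gaps.__getitem__)
    return top_k[jump]


def per_picture_sets(vectors, k):
    """For each picture, return (member indices, threshold) of its own set."""
    result = []
    for i, v in enumerate(vectors):
        ranked = sorted(
            (distance(v, w), j) for j, w in enumerate(vectors) if j != i
        )
        threshold = classification_threshold([d for d, _ in ranked[:k]])
        members = {i} | {j for d, j in ranked if d <= threshold}
        result.append((members, threshold))
    return result
```

Because each picture derives its threshold from its own neighborhood, a tight group and a loose group can each be separated without a single global cut-off.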
  • Before performing the difference operation on the K similarity distances corresponding to each picture, the method further includes: for any one of the pictures, if the m-th similarity distance among the top K similarity distances corresponding to that picture is greater than a second threshold, marking that picture as a picture set on its own, where m is a positive integer greater than or equal to 1.
  • The electronic device thus eliminates discrete points before classifying pictures according to the classification threshold, which reduces the amount of subsequent calculation and improves the efficiency of picture clustering.
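The discrete-point filter can be illustrated with a short sketch. The dictionary layout, the function name, and the requirement that m does not exceed K are assumptions made for the example.

```python
def split_discrete_points(top_k_by_picture, m, second_threshold):
    """Separate discrete points from pictures that still need clustering.

    top_k_by_picture: dict mapping a picture id to its top-K similarity
        distances, sorted ascending (hypothetical layout).
    m: 1-based position of the distance to test; assumed to satisfy m <= K.
    Returns (singleton picture sets, ids of the remaining pictures).
    """
    singletons, remaining = [], []
    for pic, dists in top_k_by_picture.items():
        if dists[m - 1] > second_threshold:
            # Even the m-th nearest picture is far away: its own picture set.
            singletons.append({pic})
        else:
            remaining.append(pic)
    return singletons, remaining
```

Pictures returned in `remaining` would then go through the difference operation, while the singleton sets skip it.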
  • Merging the generated picture sets to obtain the multiple picture sets includes: obtaining the center point of the picture set corresponding to each picture; and if the similarity distance between the center points of any two picture sets is less than the classification threshold of either of the two picture sets, merging the two picture sets into one picture set.
  • The electronic device merges picture sets with higher similarity across classes according to the similarity distance between their center points, which can improve the efficiency and accuracy of picture clustering.
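The inter-class merge might look like the sketch below. It assumes the picture sets are disjoint lists of feature vectors and uses Euclidean distance. The text does not say what threshold the merged set receives, so keeping the larger of the two is a labeled assumption.

```python
import math


def distance(a, b):
    # Euclidean distance; the choice of metric is an assumption of this sketch.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def center_point(vectors):
    # Center point = element-wise average of all feature vectors in the set.
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def merge_picture_sets(sets, thresholds):
    """Repeatedly merge any two sets whose centers are closer than either
    set's classification threshold. Returns new (sets, thresholds) lists."""
    sets = [list(s) for s in sets]
    thresholds = list(thresholds)
    merged = True
    while merged:
        merged = False
        for i in range(len(sets)):
            for j in range(i + 1, len(sets)):
                d = distance(center_point(sets[i]), center_point(sets[j]))
                if d < thresholds[i] or d < thresholds[j]:
                    sets[i].extend(sets[j])
                    # Assumption: the merged set keeps the larger threshold.
                    thresholds[i] = max(thresholds[i], thresholds[j])
                    del sets[j], thresholds[j]
                    merged = True
                    break
            if merged:
                break  # centers changed, so restart the pairwise scan
    return sets, thresholds
```

Restarting the scan after every merge keeps the centers current, at the cost of extra passes; that trade-off is a choice of this sketch rather than something the text prescribes.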
  • The electronic device obtaining the similarity distance between the newly added picture and the center points of the multiple picture sets specifically includes: if there are at least two newly added pictures, clustering the newly added pictures to obtain multiple newly added picture sets; and calculating the similarity distance between each newly added picture set and the center points of the multiple picture sets.
  • The newly added pictures are first clustered into multiple newly added picture sets, and the similarity distance is then calculated between each newly added picture set and the pre-stored picture clustering results in order to cluster them. This speeds up picture archiving, enables real-time and rapid clustering of newly added pictures, and improves the efficiency of picture clustering.
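For a batch of newly added pictures, the set-level comparison could be sketched as follows. The sketch takes the newly added picture sets as already clustered and compares each one's center point with the stored centers. Representing a new set by its center, the Euclidean metric, and the `default_threshold` parameter are assumptions.

```python
import math


def distance(a, b):
    # Euclidean distance; the choice of metric is an assumption of this sketch.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def center_point(vectors):
    # Center point = element-wise average of all feature vectors in the set.
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def archive_new_sets(new_sets, picture_sets, thresholds, default_threshold):
    """Archive each newly added picture set; return the target set indices."""
    placements = []
    for new_set in new_sets:
        target = None
        if picture_sets:
            c_new = center_point(new_set)
            dists = [distance(c_new, center_point(s)) for s in picture_sets]
            nearest = min(range(len(dists)), key=dists.__getitem__)
            if dists[nearest] <= thresholds[nearest]:
                target = nearest
        if target is None:
            # No stored set is close enough: keep the new set as its own set.
            picture_sets.append(list(new_set))
            thresholds.append(default_threshold)
            target = len(picture_sets) - 1
        else:
            picture_sets[target].extend(new_set)
        placements.append(target)
    return placements
```

With this layout the number of distance computations grows with the number of new sets times the number of stored sets, not with the number of individual pictures.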
  • In a second aspect, an electronic device is provided, which includes a processor and a memory connected to the processor.
  • The memory is used to store instructions.
  • The electronic device is used to perform: clustering multiple pictures to obtain multiple picture sets; obtaining the similarity distance between a newly added picture and the center points of the multiple picture sets, and determining, from the multiple picture sets, the first picture set with the smallest similarity distance to the newly added picture, where a smaller similarity distance indicates higher similarity and the center point is the average of the feature vectors of all pictures in the picture set; if the similarity distance between the newly added picture and the center point of the first picture set is less than or equal to the classification threshold of the first picture set, determining that the newly added picture belongs to the first picture set; if that similarity distance is greater than the classification threshold of the first picture set, creating a second picture set, to which the newly added picture belongs.
  • Clustering multiple pictures to obtain multiple picture sets includes: according to the similarity distances between the feature vector of each picture and the feature vectors of the multiple pictures, obtaining, for each picture, the top K similarity distances sorted from smallest to largest, where K is a positive integer greater than or equal to 1; performing a difference operation on the K similarity distances corresponding to each picture to obtain the outlier among them, where the similarity distance corresponding to the outlier is the classification threshold corresponding to that picture; among the similarity distances between each picture and the feature vectors of the multiple pictures, marking the pictures whose similarity distance is less than or equal to the classification threshold as one picture set, thereby obtaining the picture set corresponding to each picture; and merging the generated picture sets to obtain the multiple picture sets.
  • Before performing the difference operation on the K similarity distances corresponding to each picture, the electronic device is also used to perform: for any one of the pictures, if the m-th similarity distance among the top K similarity distances corresponding to that picture is greater than a second threshold, marking that picture as a picture set on its own, where m is a positive integer greater than or equal to 1.
  • Merging the generated picture sets to obtain the multiple picture sets includes: obtaining the center point of the picture set corresponding to each picture; and if the similarity distance between the center points of any two picture sets is less than the classification threshold of either of the two picture sets, merging the two picture sets into one picture set.
  • Obtaining the similarity distance between the newly added picture and the center points of the multiple picture sets specifically includes: if there are at least two newly added pictures, clustering the newly added pictures to obtain multiple newly added picture sets; and calculating the similarity distance between each newly added picture set and the center points of the multiple picture sets.
  • A chip system is provided, which is applied to an electronic device. The chip system includes one or more interface circuits and one or more processors, and the interface circuit and the processor are interconnected by wires. The interface circuit is used to receive a signal from the memory of the electronic device and send the signal to the processor, where the signal includes a computer instruction stored in the memory. When the processor executes the computer instruction, the electronic device executes the method of the first aspect and any of its possible designs.
  • A readable storage medium is provided, in which instructions are stored; when the instructions run on an electronic device, the electronic device executes the method of the first aspect and any of its possible designs.
  • A computer program product is provided; when the computer program product runs on a computer, the computer is caused to execute the method of the first aspect and any of its possible designs.
  • FIG. 1 is a schematic diagram of the hardware structure of an electronic device provided by an embodiment of the application.
  • FIG. 2 is a software system architecture diagram of an electronic device provided by an embodiment of the application.
  • FIG. 3 is a schematic flowchart of an image processing method provided by an embodiment of this application.
  • FIG. 4 is a schematic diagram of a process of obtaining a feature vector corresponding to a picture by an electronic device according to an embodiment of the application;
  • FIG. 5 is a schematic diagram of a flow of obtaining outliers in a similarity list of pictures by an electronic device according to an embodiment of the application;
  • FIG. 6 is a schematic diagram of similar distance distribution in a picture clustering result provided by an embodiment of this application.
  • FIG. 7 is a schematic flowchart of another image processing method provided by an embodiment of the application.
  • FIG. 8 is a schematic diagram of the effect of another image processing method provided by an embodiment of the application.
  • FIG. 9 is a schematic flowchart of another image processing method provided by an embodiment of the application.
  • FIG. 10 is a schematic structural diagram of a chip system provided by an embodiment of the application.
  • The terms "first" and "second" are used for descriptive purposes only and should not be understood as indicating or implying relative importance or implicitly indicating the number of the indicated technical features. Therefore, a feature defined with "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of the present embodiment, unless otherwise specified, "plurality" means two or more.
  • the embodiment of the present application provides a picture processing method, which can be applied to the process of electronic equipment classifying and displaying pictures or retrieving user photos, especially in the process of classifying and retrieving photos of people on the electronic equipment.
  • the electronic device can quickly perform image retrieval in real time, and can quickly perform classification according to newly added images, or create new classifications and indexes, which can improve the user experience.
  • the electronic device in the embodiments of the present application may be a mobile phone, a tablet computer, a desktop computer, a laptop, a handheld computer, a notebook computer, an ultra-mobile personal computer (UMPC), a netbook, and a cellular phone.
  • FIG. 1 shows a schematic diagram of the structure of an electronic device 100.
  • The electronic device 100 may include a processor 110, an external memory interface 120, an internal memory 121, a universal serial bus (USB) interface 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, an earphone jack 170D, a sensor module 180, buttons 190, a motor 191, an indicator 192, a camera 193, a display screen 194, a subscriber identification module (SIM) card interface 195, and so on.
  • The sensor module 180 may include a pressure sensor 180A, a gyroscope sensor 180B, an air pressure sensor 180C, a magnetic sensor 180D, an acceleration sensor 180E, a distance sensor 180F, a proximity light sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, an ambient light sensor 180L, a bone conduction sensor 180M, and so on.
  • the structure illustrated in the embodiment of the present application does not constitute a specific limitation on the electronic device 100.
  • the electronic device 100 may include more or fewer components than shown, or combine certain components, or split certain components, or arrange different components.
  • the illustrated components can be implemented in hardware, software, or a combination of software and hardware.
  • the processor 110 may include one or more processing units.
  • The processor 110 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a memory, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU).
  • the different processing units may be independent devices or integrated in one or more processors.
  • the controller may be the nerve center and command center of the electronic device 100.
  • the controller can generate operation control signals according to the instruction operation code and timing signals to complete the control of fetching and executing instructions.
  • a memory may also be provided in the processor 110 to store instructions and data.
  • the memory in the processor 110 is a cache memory.
  • the memory can store instructions or data that the processor 110 has just used or used cyclically. If the processor 110 needs to use the instruction or data again, it can be directly called from the memory. Repeated accesses are avoided, the waiting time of the processor 110 is reduced, and the efficiency of the system is improved.
  • the processor 110 may include one or more interfaces.
  • The interfaces may include an inter-integrated circuit (I2C) interface, an inter-integrated circuit sound (I2S) interface, a pulse code modulation (PCM) interface, a universal asynchronous receiver/transmitter (UART) interface, a mobile industry processor interface (MIPI), a general-purpose input/output (GPIO) interface, a subscriber identity module (SIM) interface, and/or a universal serial bus (USB) interface.
  • the interface connection relationship between the modules illustrated in the embodiment of the present application is merely a schematic description, and does not constitute a structural limitation of the electronic device 100.
  • the electronic device 100 may also adopt different interface connection modes in the foregoing embodiments, or a combination of multiple interface connection modes.
  • the charging management module 140 is used to receive charging input from the charger.
  • the charger can be a wireless charger or a wired charger.
  • the charging management module 140 may receive the charging input of the wired charger through the USB interface 130.
  • the charging management module 140 may receive the wireless charging input through the wireless charging coil of the electronic device 100. While the charging management module 140 charges the battery 142, it can also supply power to the electronic device through the power management module 141.
  • the power management module 141 is used to connect the battery 142, the charging management module 140 and the processor 110.
  • the power management module 141 receives input from the battery 142 and/or the charge management module 140, and supplies power to the processor 110, the internal memory 121, the external memory, the display screen 194, the camera 193, and the wireless communication module 160.
  • the power management module 141 can also be used to monitor parameters such as battery capacity, battery cycle times, and battery health status (leakage, impedance).
  • the power management module 141 may also be provided in the processor 110.
  • the power management module 141 and the charging management module 140 may also be provided in the same device.
  • the wireless communication function of the electronic device 100 can be implemented by the antenna 1, the antenna 2, the mobile communication module 150, the wireless communication module 160, the modem processor, and the baseband processor.
  • the electronic device 100 implements a display function through a GPU, a display screen 194, an application processor, and the like.
  • the GPU is an image processing microprocessor, which is connected to the display screen 194 and the application processor.
  • the GPU is used to perform mathematical and geometric calculations for graphics rendering.
  • the processor 110 may include one or more GPUs, which execute program instructions to generate or change display information.
  • the display screen 194 is used to display images, videos, and the like.
  • the display screen 194 includes a display panel.
  • The display panel may use a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), Miniled, MicroLed, Micro-oLed, a quantum dot light-emitting diode (QLED), and so on.
  • the electronic device 100 may include one or N display screens 194, and N is a positive integer greater than one.
  • the electronic device 100 can realize a shooting function through an ISP, a camera 193, a video codec, a GPU, a display screen 194, and an application processor.
  • the electronic device 100 can implement audio functions through the audio module 170, the speaker 170A, the receiver 170B, the microphone 170C, the earphone interface 170D, and the application processor. For example, music playback, recording, etc.
  • the software system of the electronic device 100 may adopt a layered architecture, an event-driven architecture, a microkernel architecture, a microservice architecture, or a cloud architecture.
  • the embodiment of the present application takes an Android system with a layered architecture as an example to illustrate the software structure of the electronic device 100 by way of example.
  • FIG. 2 is a block diagram of the software structure of the electronic device 100 according to an embodiment of the present application.
  • The layered architecture divides the software into several layers, each with a clear role and division of labor. The layers communicate with each other through software interfaces.
  • The Android system is divided into four layers, which are, from top to bottom: the application layer, the application framework layer, the Android runtime and system library, and the kernel layer.
  • the application layer can include a series of application packages. As shown in Figure 2, the application package can include applications such as camera, gallery, calendar, call, map, navigation, WLAN, Bluetooth, music, video, short message, etc.
  • the application framework layer provides an application programming interface (application programming interface, API) and a programming framework for applications in the application layer.
  • the application framework layer includes some predefined functions.
  • the application framework layer can include a window manager, a content provider, a view system, a phone manager, a resource manager, and a notification manager.
  • the window manager is used to manage window programs.
  • the window manager can obtain the size of the display, determine whether there is a status bar, lock the screen, take a screenshot, etc.
  • the content provider is used to store and retrieve data and make these data accessible to applications.
  • the data may include videos, images, audios, phone calls made and received, browsing history and bookmarks, phone book, etc.
  • the view system includes visual controls, such as controls that display text, controls that display pictures, and so on.
  • the view system can be used to build applications.
  • the display interface can be composed of one or more views.
  • a display interface that includes a short message notification icon may include a view that displays text and a view that displays pictures.
  • the phone manager is used to provide the communication function of the electronic device 100. For example, the management of the call status (including connecting, hanging up, etc.).
  • the resource manager provides various resources for the application, such as localized strings, icons, pictures, layout files, video files, and so on.
  • the notification manager enables the application to display notification information in the status bar, which can be used to convey notification-type messages, and it can disappear automatically after a short stay without user interaction.
  • the notification manager is used to notify download completion, message reminders, and so on.
  • A notification may also appear in the status bar at the top of the system in the form of a chart or scroll-bar text, such as a notification of an application running in the background, or appear on the screen in the form of a dialog window. For example, a text message is shown in the status bar, a prompt sound is played, the electronic device vibrates, or an indicator light flashes.
  • Android Runtime includes core libraries and virtual machines. Android runtime is responsible for the scheduling and management of the Android system.
  • The core library consists of two parts: one part is the functions that the Java language needs to call, and the other part is the core library of Android.
  • the application layer and the application framework layer run in a virtual machine.
  • The virtual machine executes the Java files of the application layer and the application framework layer as binary files.
  • the virtual machine is used to perform functions such as object life cycle management, stack management, thread management, security and exception management, and garbage collection.
  • the system library can include multiple functional modules. For example: surface manager (surface manager), media library (Media Libraries), three-dimensional graphics processing library (for example: OpenGL ES), 2D graphics engine (for example: SGL), etc.
  • the surface manager is used to manage the display subsystem and provides a combination of 2D and 3D layers for multiple applications.
  • the media library supports playback and recording of a variety of commonly used audio and video formats, as well as still image files.
  • the media library can support a variety of audio and video encoding formats, such as: MPEG4, H.264, MP3, AAC, AMR, JPG or PNG.
  • the 3D graphics processing library is used to realize 3D graphics drawing, image rendering, synthesis and layer processing.
  • the 2D graphics engine is a drawing engine for 2D drawing.
  • the kernel layer is the layer between hardware and software.
  • the kernel layer contains at least display driver, camera driver, audio driver, and sensor driver.
  • the methods in the following embodiments can all be implemented in the electronic device 100 having the above-mentioned hardware structure and software structure.
  • The following describes the method of the embodiments of the present application using only the case where the electronic device 100 is a mobile phone as an example.
  • The pictures stored in the gallery application on the mobile phone can be photos taken by the mobile phone camera or pictures obtained by other applications on the mobile phone.
  • All pictures in the gallery can be classified.
  • The pictures in the gallery can be divided into picture sets of people, natural scenery, animals, landmarks, buildings, things, and so on. They can be classified further: for example, photos of people can be classified by person and labeled as photos of the phone's owner, of the mother, of the daughter, of a certain friend, and so on. An even more detailed classification is possible: photos of the same person can be classified by the person's age in the photo, for example into childhood, adolescence, and middle age.
  • the following embodiments of the present application only take the classification of different people as an example.
  • The pictures of the same person are clustered into a picture set by the full clustering algorithm, and all pictures in the picture set are marked with a classification label so that users can view them conveniently.
  • the embodiment of the application also supports image retrieval.
  • The user can search based on an existing picture, search based on keywords, or take a photo in real time with the camera application of the mobile phone and search with it. The mobile phone can then return all pictures that belong to the same picture set as the query picture, for example all pictures of the same person as in the query picture.
  • the embodiment of the present application provides an image processing method. As shown in FIG. 3, the method may include:
  • the electronic device performs face detection on the picture to obtain coordinate information of the face frame.
  • Face detection technology is used to quickly detect face information in the picture: it locates the key points of the facial features and contours, identifies a variety of face attributes, and returns the position of the face frame as coordinate information.
  • The electronic device performs face detection on the picture shown in (a) in FIG. 4. After face detection recognizes the object in the picture, it segments the face image and outputs the face frame shown in (b) in FIG. 4, together with the coordinate information of the face frame.
  • Face feature extraction technology can be used to extract features from the face image within the face frame. Specifically, a deep learning technique using a neural network can extract a high-dimensional feature vector. For example, as shown in FIG. 4, a high-dimensional feature vector such as (13, 423, 3...190) is extracted from the face frame obtained by face detection.
  • the electronic device obtains and saves the feature vector corresponding to each picture, so as to subsequently perform the comparison calculation between the feature vectors.
  • The similarity distance can be used to indicate the degree of similarity between two pictures: the greater the similarity distance, the lower the similarity, that is, the less alike the two pictures are; the smaller the similarity distance, the higher the similarity, that is, the more alike they are.
  • The similarity distance can be computed as the Euclidean distance or the cosine distance between the feature vectors corresponding to the two pictures.
  • The embodiment of the present application does not specifically limit this calculation algorithm; a specific algorithm for calculating the similarity distance based on the cosine distance is given below as an example.
  • the cosine distance between picture A1 and picture A2 is:
  • The feature vector corresponding to a picture can be normalized first, that is, rescaled so that the length of the normalized feature vector is 1; in other words, the modulus of the normalized feature vector is 1.
  • Normalization can be used to reduce the amount of calculation and speed up data processing.
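The cosine distance formula itself is not reproduced in the text above. The following is a minimal sketch that assumes the common definition, 1 minus the cosine similarity of the two feature vectors, which agrees with the rule that a smaller distance means more similar pictures. The function names are illustrative and do not come from the application. The sketch also shows why normalization reduces work: for unit-length vectors the distance needs only one dot product.

```python
import math

def normalize(vec):
    """Rescale a feature vector so that its modulus is 1."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]

def cosine_distance(a, b):
    """Assumed definition: 1 minus the cosine of the angle between a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def cosine_distance_unit(a_unit, b_unit):
    """Same distance for vectors that are already normalized: one dot product."""
    return 1.0 - sum(x * y for x, y in zip(a_unit, b_unit))
```

If the feature vectors are normalized once when they are saved, every later comparison can use `cosine_distance_unit` and skip the two square roots.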
  • The similarity distances between each picture and the feature vectors of the multiple pictures are sorted in ascending order, and the first K similarity distances are taken to generate the similarity list corresponding to each picture.
  • K is a positive integer greater than or equal to 1.
  • For example, the similarity distances between the feature vector of picture A1 and the feature vectors of the N pictures, including A1 itself, are calculated and arranged in ascending order, and the first K similarity distance values form the similarity list corresponding to A1.
  • Each picture has a corresponding similarity list, so there are N similarity lists, each holding K similarity distances sorted in ascending order; together they form an N*K similarity matrix.
  • K can be pre-set by those skilled in the art according to the number of pictures in the gallery, the accuracy of the clustering algorithm and other parameters.
  • K can be set to 50 in the embodiment of the present application.
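The construction of the N*K similarity matrix described above can be sketched as follows. This is only an illustration under stated assumptions: the distance is the assumed "1 minus cosine similarity" form, each list entry keeps the index of the compared picture alongside the distance, and the name `similarity_lists` is hypothetical.

```python
import math

def cosine_distance(a, b):
    """Assumed distance: 1 minus cosine similarity, clamped at 0 against rounding."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return max(0.0, 1.0 - dot / (norm_a * norm_b))

def similarity_lists(features, k):
    """For each picture, return its K smallest distances in ascending order.

    Each entry is a (distance, picture_index) pair; the picture itself is
    included, so the first entry is normally the picture at distance 0.
    """
    lists = []
    for fa in features:
        dists = sorted((cosine_distance(fa, fb), j) for j, fb in enumerate(features))
        lists.append(dists[:k])
    return lists
```

This brute-force version computes all N*N distances, which is acceptable for the one-time full clustering; the incremental steps described later avoid repeating it for every new picture.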
  • m is a positive integer greater than or equal to 1.
  • m may be set to 3, that is, at least three pictures are required to form a picture set. Therefore, when the third similarity distance in the similarity list of a picture, that is, Top3, is greater than a preset threshold, fewer than three pictures are close enough to that picture, which is not enough to form a category. The picture is then marked as a discrete point for which clustering failed, and the discrete point is removed; alternatively, the picture is treated as a category of its own and marked with a label.
  • The preset threshold is the similarity distance within which three pictures can be grouped into a picture set; it can be set in advance by those skilled in the art according to parameters such as the number of pictures in the gallery and the accuracy of the clustering algorithm.
  • The preset threshold may be set to 0.2. That is, when Top3 in the similarity list of a certain picture is greater than 0.2, the picture is marked as a separate category and is temporarily not clustered with other pictures.
  • In the subsequent step 307, when the merging judgment between picture sets is performed, it can be determined whether the picture meets the conditions for inter-class merging with other picture sets. If the inter-class merging conditions are not met, or the number of pictures in the merged picture set is still less than 3, the picture is marked as not yet clustered successfully. It can continue to participate in clustering when new pictures arrive, to determine whether any newly added pictures can be clustered with it into a picture set.
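A minimal sketch of the discrete-point check follows. It assumes the reading given in the summary of the application, namely that a picture is set aside when its m-th smallest similarity distance exceeds the preset threshold. The function name and the default values (m = 3, threshold 0.2, taken from the examples above) are illustrative.

```python
def split_discrete_points(distance_lists, m=3, preset_threshold=0.2):
    """Split picture indices into (clusterable, discrete).

    distance_lists[i] holds the K smallest similarity distances of picture i
    in ascending order, including the distance 0 to the picture itself.
    """
    clusterable, discrete = [], []
    for idx, dists in enumerate(distance_lists):
        # Top-m is the m-th smallest distance. If it is too large (or missing),
        # fewer than m pictures are close enough to form a picture set.
        if len(dists) < m or dists[m - 1] > preset_threshold:
            discrete.append(idx)
        else:
            clusterable.append(idx)
    return clusterable, discrete
```

Pictures returned in `discrete` are not dropped for good: as described above, they can be kept as single-picture categories and reconsidered during inter-class merging or when new pictures arrive.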
  • The classification threshold corresponding to each picture is obtained from the similarity distances between that picture and the multiple pictures.
  • The pictures whose similarity distance is less than or equal to the classification threshold are marked as one picture set.
  • The classification threshold can be calculated by performing a difference operation on the K similarity distances corresponding to each picture to find the outlier among them; the similarity distance corresponding to that outlier is the classification threshold for the picture.
  • An outlier is an extremely large or extremely small value that lies far from the general level of a numerical sequence.
  • Outliers can be obtained by a difference operation. For example, a difference operation is performed on the K similarity distances to obtain a difference function, and the similarity distance corresponding to the largest peak of the difference function is the outlier.
  • the similar distance distribution function shown in the left figure is obtained according to the similarity list of a certain picture, and the similar distance distribution function is subjected to a difference operation to obtain the difference function shown in the right figure.
  • the maximum peak value can be expressed as an outlier in the similarity list of the picture, and the similarity distance corresponding to the outlier can be set as the classification threshold.
  • Pictures whose similarity distances are less than or equal to the classification threshold can be marked as one picture set.
  • The classification threshold is generated dynamically from the similarity distances between the pictures in the picture set. Adding a picture to the picture set, or deleting one from it, may change the classification threshold. It is therefore not a fixed threshold but is obtained by an artificial intelligence algorithm, so that the threshold is selected dynamically according to the similarity between pictures in different picture sets, which can improve the accuracy of the picture sets.
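The difference operation can be sketched as below. The text does not state which side of the largest jump supplies the threshold, so this sketch assumes it is the distance just before the jump, which keeps "distance less than or equal to the threshold" equivalent to "on the near side of the gap". Both function names are hypothetical.

```python
def classification_threshold(sorted_dists):
    """Return the distance just before the largest jump in an ascending list.

    Assumption: the 'outlier' is located at the largest peak of the first
    difference, and the last distance before that peak is the threshold.
    """
    if not sorted_dists:
        raise ValueError("need at least one distance")
    if len(sorted_dists) == 1:
        return sorted_dists[0]
    diffs = [b - a for a, b in zip(sorted_dists, sorted_dists[1:])]
    peak = max(range(len(diffs)), key=diffs.__getitem__)
    return sorted_dists[peak]

def initial_picture_set(entries, threshold):
    """Collect the picture indices whose distance is within the threshold.

    entries is a list of (distance, picture_index) pairs.
    """
    return [idx for dist, idx in entries if dist <= threshold]
```

Because the threshold is read off each picture's own distance profile, a person photographed in varied conditions gets a looser threshold than one photographed in a single scene, which is the behavior the dynamic threshold is meant to provide.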
  • the similarity distance between the multiple photos of the person is relatively high.
  • If the multiple photos of another person share a single scene and similar ages, expressions, and other factors, it can be foreseen that the similarity distance between those photos is generally low.
  • If a fixed similarity threshold is used as the criterion for deciding whether the people in different photos are different people, clustering errors are difficult to avoid.
  • Inter-class merging can proceed as follows: obtain the center point of the picture set corresponding to each picture, where the center point is the average of the feature vectors of all pictures in the picture set. If the similarity distance between the center points of any two picture sets is less than the classification threshold of either of the two picture sets, the two picture sets can be merged into one picture set.
  • The center point represents the average of the feature vectors in the picture set. The relationship between the center-point distance and the classification thresholds of the picture sets can therefore be used to determine whether two picture sets can be merged, and picture sets with higher similarity are merged and updated into a new picture set.
  • For example, the center point of picture set X is calculated as X_1, with an automatically calculated classification threshold of 0.5; the center point of picture set Y is Y_1, with a classification threshold of 0.6. The similarity distance between X_1 and Y_1 is calculated as 0.55. Because 0.55 is less than 0.6, the classification threshold of picture set Y, the merging condition is met, and picture set X and picture set Y are merged into one category. The center point and the K value corresponding to the merged picture set are then updated.
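The center-point calculation and the merge test can be sketched as follows, assuming the condition stated in the summary of the application: two picture sets merge when the distance between their center points is less than the classification threshold of either set. The helper names are illustrative.

```python
def center_point(vectors):
    """Average the feature vectors of all pictures in a picture set."""
    n = len(vectors)
    if n == 0:
        raise ValueError("a picture set needs at least one picture")
    return [sum(column) / n for column in zip(*vectors)]

def should_merge(center_distance, threshold_x, threshold_y):
    """Merge when the center distance is below either set's threshold."""
    return center_distance < threshold_x or center_distance < threshold_y

def merge_sets(vectors_x, vectors_y):
    """Merge two picture sets and return (merged_vectors, new_center)."""
    merged = list(vectors_x) + list(vectors_y)
    return merged, center_point(merged)
```

With the numbers from the example, `should_merge(0.55, 0.5, 0.6)` is true because 0.55 is below the threshold of picture set Y, even though it is above that of picture set X. The classification threshold of the merged set still has to be recalculated afterwards.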
  • It can then be determined, according to the inter-class merging conditions above, whether any picture sets can still be merged. If no picture set needs to be merged, step 310 is executed; if there is still a picture set that needs to be merged, step 309 is executed.
  • The termination condition of the inter-class merging may further include: if the second similarity distance in the similarity list of a picture set is greater than a certain preset threshold, the picture set is not merged with any other picture set.
  • This preset threshold is the similarity distance above which a picture set is too dissimilar from the other picture sets for inter-class merging to be needed. It can be set in advance by those skilled in the art according to parameters such as the number of pictures in the gallery and the accuracy of the clustering algorithm, and is not specifically limited in the embodiment of the present application.
  • After picture sets are merged, the number of pictures in the picture sets also changes. Therefore, it is necessary to update the value K, the number of pictures in the picture set, and perform step 306 to recalculate the classification threshold of the picture set.
  • The N pictures can be clustered into n picture sets, which are displayed in the gallery application of the electronic device. Specifically, the gallery application may display the labels of the picture sets.
  • A label represents a picture set, and the user can click on the label to view all or some of the pictures in the picture set.
  • The classification threshold is calculated automatically by means of artificial intelligence, which realizes the full clustering of user pictures and applies different classification criteria according to the similarity of different users' pictures, thereby improving the accuracy of picture classification.
  • The similarity distance between pictures within a picture set is smaller than the similarity distance between different picture sets, so the photos on the user's mobile phone can be quickly grouped into sets by person.
  • The mobile phone can store the center points of the generated picture sets, the number of pictures in each picture set, and the labels of the picture sets in its memory. When newly added pictures are clustered later, the pictures that have already been clustered do not need to be clustered again; comparison calculations can be performed against the picture set information stored in memory, so that picture classification and picture retrieval can be performed quickly.
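One way to picture the stored per-set information is the record below. It is a hypothetical structure, not one defined by the application: it keeps the label, center point, and picture count named above, plus the classification threshold as an assumed extra field, and updates the center as a running mean so that earlier feature vectors need not be revisited.

```python
from dataclasses import dataclass

@dataclass
class PictureSetRecord:
    """Hypothetical cached summary of one picture set."""
    label: str
    center: list        # average feature vector of the set
    count: int          # number of pictures in the set
    threshold: float    # classification threshold (assumed to be cached too)

    def add(self, feature):
        """Fold one new feature vector into the center as a running mean."""
        self.count += 1
        self.center = [c + (f - c) / self.count
                       for c, f in zip(self.center, feature)]
```

The running mean gives the same center as averaging all feature vectors again. The cached threshold would still need to be refreshed whenever the set changes, since the text describes it as dynamic.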
  • When the electronic device acquires a batch of new photos, it can use an incremental clustering algorithm to quickly cluster the newly added pictures with the original picture sets, that is, archive them into existing picture sets. If a newly added photo cannot be archived into an existing picture set and meets the conditions for a new picture set, the electronic device can create a new picture set based on the newly added photo.
  • the specific incremental clustering process can include the following two processes.
  • The first is to calculate, one by one, the similarity distance between each of the M newly added pictures and the original picture sets, and to take the picture set with the smallest similarity distance as the set to which the picture belongs.
  • Specifically, the similarity distance between the newly added picture and the center point of each of the original picture sets is calculated, and the picture set with the smallest similarity distance is found, for example the first picture set. If the similarity distance between the newly added picture and the first picture set is less than or equal to the classification threshold of the first picture set, the electronic device determines that the newly added picture belongs to the first picture set; if it is greater than the classification threshold of the first picture set, the electronic device creates a second picture set, and the newly added picture belongs to the second picture set.
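The first incremental process can be sketched as below. The dictionary layout, the function name, and the placeholder threshold of 0.0 for a freshly created set are assumptions made for illustration; the distance is again the assumed "1 minus cosine similarity" form.

```python
import math

def cosine_distance(a, b):
    """Assumed distance: 1 minus cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def assign_picture(feature, picture_sets):
    """Archive one new picture; return the index of the set it ends up in.

    picture_sets is a list of dicts with keys 'center', 'threshold', 'members'.
    """
    best, best_d = None, math.inf
    for i, s in enumerate(picture_sets):
        d = cosine_distance(feature, s["center"])
        if d < best_d:
            best, best_d = i, d
    if best is not None and best_d <= picture_sets[best]["threshold"]:
        s = picture_sets[best]
        s["members"].append(feature)
        n = len(s["members"])
        s["center"] = [sum(col) / n for col in zip(*s["members"])]
        return best
    # No existing set is close enough: create a new one. Its threshold is a
    # placeholder until enough pictures arrive to compute a real one.
    picture_sets.append({"center": list(feature), "threshold": 0.0,
                         "members": [feature]})
    return len(picture_sets) - 1
```

Each new picture is compared against one center point per picture set rather than against every stored picture, which is where the speed-up over one-by-one comparison comes from.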
  • the second method is to perform full clustering processing on the newly added M pictures first.
  • The full clustering process can be implemented as steps 301-310 above to generate m picture sets. The similarity distances between the center points of the m picture sets and those of the n picture sets generated above are then calculated, and inter-class merging is performed: the original picture set whose similarity distance is the smallest and is less than the classification threshold of either picture set is taken as the set to which the newly added picture set belongs, and the two picture sets are merged. Alternatively, when the smallest similarity distance is greater than the classification thresholds of both picture sets and the number of pictures in the newly added picture set is greater than or equal to 3, a new picture set is created from the newly added picture set.
  • The second type of incremental clustering described above requires less computation and is the better of the two algorithms.
  • the following embodiments of the present application only take the second type of incremental clustering as an example.
  • The specific process can be as shown in Figure 7, and is achieved through the following steps:
  • the electronic device extracts the feature vector of the newly added picture.
  • the electronic device obtains the feature vector corresponding to each newly added picture and saves it for subsequent comparison and calculation between the feature vectors.
  • the similarity distance can be used to indicate the degree of similarity.
  • For the specific similarity distance calculation and the elimination of discrete points, refer to steps 303, 304 and 305 above; details are not repeated here.
  • the similarity distance is used to indicate the similarity between the newly added picture set and the original picture set, and the picture set with the highest similarity is obtained.
  • the center point is the average value of the feature vectors of all pictures in the newly added picture set or the original picture set.
  • the electronic device creates a second picture set, and the newly added picture set belongs to the second picture set.
  • The condition for the electronic device to create the second picture set may also include that the number of pictures in the newly added picture set is greater than or equal to m; if so, a new picture set is created.
  • m may be set to 3.
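The second incremental process, in which the new batch has already been clustered into newly added picture sets, can be sketched as follows. The dictionary keys, the function name, and the three outcome strings are illustrative assumptions; merging uses a count-weighted average of the two center points.

```python
import math

def cosine_distance(a, b):
    """Assumed distance: 1 minus cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def merge_new_sets(new_sets, existing_sets, m=3):
    """Merge each new set into its nearest existing set, or create/defer it.

    Every set is a dict with keys 'center', 'threshold', 'count'.
    Returns one decision per new set: 'merged', 'created', or 'pending'.
    """
    decisions = []
    for ns in new_sets:
        best, best_d = None, math.inf
        for es in existing_sets:
            d = cosine_distance(ns["center"], es["center"])
            if d < best_d:
                best, best_d = es, d
        if best is not None and (best_d < ns["threshold"] or best_d < best["threshold"]):
            total = best["count"] + ns["count"]
            best["center"] = [(bc * best["count"] + nc * ns["count"]) / total
                              for bc, nc in zip(best["center"], ns["center"])]
            best["count"] = total  # the threshold should be recalculated afterwards
            decisions.append("merged")
        elif ns["count"] >= m:
            existing_sets.append(dict(ns))
            decisions.append("created")
        else:
            decisions.append("pending")  # too few pictures; retry with later batches
    return decisions
```

The number of distance calculations here grows with the number of picture sets on each side, not with the number of pictures.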
  • It can then be determined, according to the aforementioned inter-class merging conditions, whether any picture sets can still be merged. If no picture set needs to be merged, step 707 is executed; if there is still a picture set that needs to be merged, step 706 is executed.
  • The termination condition of the inter-class merging may further include: if the second similarity distance in the similarity list of a picture set is greater than a preset threshold, the picture set is not merged with any other picture set.
  • This preset threshold is the similarity distance above which a picture set is too dissimilar from the other picture sets for inter-class merging to be needed. It can be set in advance by those skilled in the art according to parameters such as the number of pictures in the gallery and the accuracy of the clustering algorithm, and is not specifically limited in the embodiment of the present application.
  • the number of pictures in the picture collection will also change, so it is necessary to update the value K of the number of pictures, and update the classification threshold for calculating the picture collection.
  • the three newly added pictures are classified into different picture sets after incremental clustering processing.
  • For example, the picture shown in (b) in Figure 8 is classified into the picture set of label 2 shown in (c) in Figure 8.
  • the user can view the above-mentioned intelligently classified picture collection of the album by clicking on the mobile phone, and view all the pictures in the picture collection by viewing the picture classification label.
  • The foregoing embodiment realizes rapid clustering of one or more newly added pictures. The new pictures are compared with the cluster centers of the original clustering results to determine the picture set to which they belong, or a new picture set is quickly created. This simplifies the clustering of newly added pictures, improves clustering efficiency, and improves user experience.
  • The image retrieval process performed by the electronic device is similar to the incremental clustering process above. For example, as shown in Fig. 9, the process by which the electronic device obtains a search result from a newly added face picture may include:
  • the electronic device obtains a face picture.
  • the electronic device can select one or more local or network pictures to obtain the face picture, or it can use the camera to obtain real-time photographed pictures.
  • For the conditions of inter-class merging, refer to step 307 of the foregoing embodiment; details are not repeated here.
  • A newly added picture set whose center point lies farther than the classification threshold from the center points of the original picture sets can be marked as a new picture set, and the relevant parameters of the picture sets are updated.
  • The foregoing embodiment of the present application uses incremental clustering and stores existing photo processing results, such as face clustering results or scene, time and other information, in a database, caches them, and sets an index.
  • The stored clustering results can then be called directly to realize real-time search.
  • the foregoing embodiment of the present application adopts an incremental clustering technique to directly call the clustering results cached in the gallery, avoiding the comparison and calculation of the feature vectors of the photos one by one.
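Retrieval against the cached clustering results can be sketched as below. The layout of the cache (a dictionary from label to center point, threshold, and picture names) and the function name are assumptions for illustration; the text only states that cached results are called directly instead of comparing feature vectors one by one.

```python
import math

def cosine_distance(a, b):
    """Assumed distance: 1 minus cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def retrieve(query_feature, cached_sets):
    """Return (label, pictures) of the closest cached set, or (None, [])."""
    best_label, best_d = None, math.inf
    for label, info in cached_sets.items():
        d = cosine_distance(query_feature, info["center"])
        if d < best_d:
            best_label, best_d = label, d
    if best_label is not None and best_d <= cached_sets[best_label]["threshold"]:
        return best_label, list(cached_sets[best_label]["pictures"])
    return None, []
```

A query for a face that matches no stored picture set returns an empty result rather than the nearest unrelated set.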
  • the clustering method using dynamic thresholds can increase the accuracy of face clustering when the sample distribution is irregular, and improve user experience.
  • the electronic device in the foregoing embodiment may also be implemented as a cloud device such as a cloud server.
  • the electronic device may include a memory and one or more processors, and the memory and the processor are coupled.
  • the memory is used to store computer program code, and the computer program code includes computer instructions.
  • When the processor executes the computer instructions, the electronic device can execute the various functions or steps in the foregoing method embodiments.
  • the chip system includes at least one processor 1001 and at least one interface circuit 1002.
  • the processor 1001 and the interface circuit 1002 may be interconnected by wires.
  • the interface circuit 1002 can be used to receive signals from other devices (such as the memory of an electronic device).
  • the interface circuit 1002 may be used to send signals to other devices (such as the processor 1001).
  • the interface circuit 1002 can read an instruction stored in the memory, and send the instruction to the processor 1001.
  • When the instruction is executed by the processor 1001, the electronic device can perform the various functions or steps performed by the electronic device in the foregoing embodiments.
  • the chip system may also include other discrete devices, which are not specifically limited in the embodiment of the present application.
  • The embodiments of the present application also provide a computer storage medium that includes computer instructions. When the computer instructions run on the above-mentioned electronic device, they cause the electronic device to perform each function or step in the above-mentioned method embodiment.
  • The embodiments of the present application also provide a computer program product. When the computer program product runs on a computer, it causes the computer to execute each function or step in the foregoing method embodiment.
  • the disclosed device and method can be implemented in other ways.
  • the device embodiments described above are merely illustrative.
  • The division of the modules or units is only a division by logical function. In actual implementation there may be other ways of dividing them; for example, multiple units or components may be combined or integrated into another device, or some features may be omitted or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.
  • the units described as separate parts may or may not be physically separate.
  • The parts displayed as units may be one physical unit or multiple physical units, that is, they may be located in one place or distributed across multiple different places. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • the functional units in the various embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated unit can be implemented in the form of hardware or software functional unit.
  • If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a readable storage medium.
  • The technical solutions of the embodiments of the present application, in essence, or the part that contributes to the prior art, or all or part of the technical solutions, can be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions that cause a device (which may be a single-chip microcomputer, a chip, etc.) or a processor to execute all or part of the steps of the methods described in the various embodiments of the present application.
  • The foregoing storage media include media that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, or an optical disc.

Abstract

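A picture processing method and apparatus, relating to the field of terminal technologies, use artificial intelligence techniques to solve the problem that comparing feature vectors one by one makes picture classification and retrieval too inefficient and gives a poor user experience. The specific solution includes: an electronic device clusters multiple pictures to obtain multiple picture sets; obtains the similarity distances between a newly added picture and the center points of the multiple picture sets, and determines, from the multiple picture sets, a first picture set with the smallest similarity distance to the newly added picture; if the similarity distance between the newly added picture and the center point of the first picture set is less than or equal to the classification threshold of the first picture set, determines that the newly added picture belongs to the first picture set; if the similarity distance between the newly added picture and the center point of the first picture set is greater than the classification threshold of the first picture set, creates a second picture set, to which the newly added picture belongs.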

Description

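Picture processing method and apparatus
This application claims priority to Chinese patent application No. 201910798295.4, filed with the China National Intellectual Property Administration on August 27, 2019 and entitled "Picture processing method and apparatus", which is incorporated herein by reference in its entirety.
Technical Field
This application relates to the field of terminal technologies, and in particular to a picture processing method and apparatus.
Background
Smart terminal devices currently store massive numbers of user photos. How to classify these photos, so that all photos meeting a specific condition can be found among them quickly and in real time, has become a pressing problem.
Existing picture grouping and retrieval techniques are based on face recognition, neural networks, and clustering algorithms. That is, the smart terminal device performs face recognition on a picture, obtains the feature vector of the picture through a neural network algorithm, and compares the feature vectors of all photos in the gallery of the device one by one to cluster the pictures. Based on the one-by-one comparison results, the picture whose feature vector is closest to that of the query picture can then be found and returned as the retrieval result.
This one-by-one comparison approach makes picture retrieval inefficient, so retrieval results cannot be obtained in real time. Moreover, new categories and indexes cannot be created quickly and in real time for newly added pictures, and the user experience is poor.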
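Summary
This application provides a picture processing method and apparatus, which solve the problems in the prior art that classification and retrieval based on one-by-one comparison of feature vectors are too inefficient, that retrieval results cannot be obtained in real time, and that new categories and indexes cannot be created quickly and in real time for newly added photos.
To achieve the above objectives, this application adopts the following technical solutions:
In a first aspect, a picture processing method is provided, applied to an electronic device. The method includes: the electronic device clusters multiple pictures to obtain multiple picture sets; the electronic device obtains the similarity distances between a newly added picture and the center points of the multiple picture sets, and determines, from the multiple picture sets, a first picture set with the smallest similarity distance to the newly added picture, where a smaller similarity distance indicates a higher similarity, and the center point is the average of the feature vectors of all pictures in a picture set; if the similarity distance between the newly added picture and the center point of the first picture set is less than or equal to the classification threshold of the first picture set, the electronic device determines that the newly added picture belongs to the first picture set; if the similarity distance between the newly added picture and the center point of the first picture set is greater than the classification threshold of the first picture set, the electronic device creates a second picture set, and the newly added picture belongs to the second picture set.
In the embodiments of this application, the electronic device may be a smart terminal device. The electronic device calculates the similarity distances between the newly added picture and the center points of the picture sets obtained by clustering, and selects the picture set with the smallest similarity distance as the archiving result for the newly added picture. This avoids comparing the feature vectors of the newly added picture with those of the existing pictures one by one, so newly added pictures can be clustered quickly and in real time, and a new picture set can be created for pictures that fail to cluster, which improves clustering efficiency and user experience.
In a possible design, the electronic device clustering multiple pictures to obtain multiple picture sets specifically includes: obtaining, from the similarity distances between the feature vector of each picture and those of the multiple pictures, the first K similarity distances of each picture sorted in ascending order, where K is a positive integer greater than or equal to 1; performing a difference operation on the K similarity distances of each picture to obtain the outlier among them, where the similarity distance corresponding to the outlier is the classification threshold of that picture; for the similarity distances between the feature vector of each picture and those of the multiple pictures, marking the pictures whose similarity distance is less than or equal to the classification threshold as one picture set, to obtain the picture set corresponding to each picture; and merging the generated picture sets to obtain the multiple picture sets. In this possible implementation, the electronic device clusters the pictures with an artificial intelligence algorithm that calculates the classification threshold automatically, that is, each group of pictures is clustered using an automatic threshold. This avoids the inaccuracy of applying a fixed threshold to large numbers of pictures whose similarity distances vary widely, and improves the flexibility and accuracy of picture clustering.
In a possible design, before the difference operation is performed on the K similarity distances of each picture, the method further includes: for any one of the pictures, if the m-th similarity distance among the first K similarity distances of that picture is greater than a second threshold, marking that picture as a picture set of its own, where m is a positive integer greater than or equal to 1. In this possible implementation, the electronic device first removes discrete points before classifying pictures by the classification threshold, which reduces the amount of subsequent calculation and improves clustering efficiency.
In a possible design, merging the generated picture sets to obtain the multiple picture sets includes: obtaining the center point of the picture set corresponding to each picture; and if the similarity distance between the center points of any two picture sets is less than the classification threshold of either of the two picture sets, merging the two picture sets into one picture set. In this possible implementation, the electronic device performs inter-class merging of highly similar picture sets according to the similarity distance between their center points, which can improve the efficiency and accuracy of picture clustering.
In a possible design, the electronic device obtaining the similarity distances between the newly added picture and the center points of the multiple picture sets specifically includes: if there are at least two newly added pictures, clustering the newly added pictures to obtain multiple newly added picture sets; and calculating the similarity distance between each newly added picture set and the center points of the multiple picture sets. In this possible implementation, the newly added pictures are first clustered into newly added picture sets, and similarity distances are then calculated between these sets and the pre-stored clustering results to complete the clustering. This speeds up picture archiving, so that newly added pictures are clustered quickly and in real time, and improves clustering efficiency.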
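In a second aspect, an electronic device is provided. The electronic device includes a processor and a memory connected to the processor. The memory is configured to store instructions which, when executed by the processor, cause the electronic device to: cluster multiple pictures to obtain multiple picture sets; obtain the similarity distances between a newly added picture and the center points of the multiple picture sets, and determine, from the multiple picture sets, a first picture set with the smallest similarity distance to the newly added picture, where a smaller similarity distance indicates a higher similarity, and the center point is the average of the feature vectors of all pictures in a picture set; if the similarity distance between the newly added picture and the center point of the first picture set is less than or equal to the classification threshold of the first picture set, determine that the newly added picture belongs to the first picture set; if the similarity distance between the newly added picture and the center point of the first picture set is greater than the classification threshold of the first picture set, create a second picture set, to which the newly added picture belongs.
In a possible design, clustering multiple pictures to obtain multiple picture sets specifically includes: obtaining, from the similarity distances between the feature vector of each picture and those of the multiple pictures, the first K similarity distances of each picture sorted in ascending order, where K is a positive integer greater than or equal to 1; performing a difference operation on the K similarity distances of each picture to obtain the outlier among them, where the similarity distance corresponding to the outlier is the classification threshold of that picture; for the similarity distances between the feature vector of each picture and those of the multiple pictures, marking the pictures whose similarity distance is less than or equal to the classification threshold as one picture set, to obtain the picture set corresponding to each picture; and merging the generated picture sets to obtain the multiple picture sets.
In a possible design, before the difference operation is performed on the K similarity distances of each picture, the electronic device is further configured to: for any one of the pictures, if the m-th similarity distance among the first K similarity distances of that picture is greater than a second threshold, mark that picture as a picture set of its own, where m is a positive integer greater than or equal to 1.
In a possible design, merging the generated picture sets to obtain the multiple picture sets includes: obtaining the center point of the picture set corresponding to each picture; and if the similarity distance between the center points of any two picture sets is less than the classification threshold of either of the two picture sets, merging the two picture sets into one picture set.
In a possible design, obtaining the similarity distances between the newly added picture and the center points of the multiple picture sets specifically includes: if there are at least two newly added pictures, clustering the newly added pictures to obtain multiple newly added picture sets; and calculating the similarity distance between each newly added picture set and the center points of the multiple picture sets.
In a third aspect, a chip system is provided, applied to an electronic device. The chip system includes one or more interface circuits and one or more processors; the interface circuits and the processors are interconnected by lines; the interface circuits are configured to receive signals from a memory of the electronic device and send signals to the processors, the signals including computer instructions stored in the memory; when the processors execute the computer instructions, the electronic device performs the method of the first aspect or any of its possible designs.
In a fourth aspect, a readable storage medium is provided. The readable storage medium stores instructions which, when run on an electronic device, cause the electronic device to perform the method of the first aspect or any of its possible designs.
In a fifth aspect, a computer program product is provided which, when run on a computer, causes the computer to perform the method of the first aspect or any of its possible designs.
It can be understood that the electronic device, chip system, readable storage medium, and computer program product for picture processing provided above are all used to perform the corresponding method provided above. For the beneficial effects they can achieve, refer to the beneficial effects of the first aspect and any of its possible designs; details are not repeated here.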
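Brief Description of the Drawings
FIG. 1 is a schematic diagram of the hardware structure of an electronic device according to an embodiment of this application;
FIG. 2 is a diagram of the software system architecture of an electronic device according to an embodiment of this application;
FIG. 3 is a schematic flowchart of a picture processing method according to an embodiment of this application;
FIG. 4 is a schematic flowchart of an electronic device obtaining the feature vector corresponding to a picture according to an embodiment of this application;
FIG. 5 is a schematic flowchart of an electronic device obtaining the outlier in the similarity list of a picture according to an embodiment of this application;
FIG. 6 is a schematic diagram of the similarity distance distribution in a picture clustering result according to an embodiment of this application;
FIG. 7 is a schematic flowchart of another picture processing method according to an embodiment of this application;
FIG. 8 is a schematic diagram of the effect of another picture processing method according to an embodiment of this application;
FIG. 9 is a schematic flowchart of another picture processing method according to an embodiment of this application;
FIG. 10 is a schematic structural diagram of a chip system according to an embodiment of this application.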
具体实施方式
以下,术语“第一”、“第二”仅用于描述目的,而不能理解为指示或暗示相对重要性或者隐含指明所指示的技术特征的数量。由此,限定有“第一”、“第二”的特征可以明示或者隐含地包括一个或者更多个该特征。在本实施例的描述中,除非另有说明,“多个”的含义是两个或两个以上。
本申请实施例提供一种图片处理方法,该方法可以应用于电子设备将图片进行分类显示,或者检索用户照片的过程中,尤其是电子设备上人物照片的分类和检索的使用过程中。通过该方法,电子设备可以实时快速地进行图片检索,并可以根据新增的图片快速进行归类,或创建新的分类和索引,可以提升用户的使用体验。
示例性的,本申请实施例中的电子设备可以是手机、平板电脑、桌面型、膝上型、手持计算机、笔记本电脑、超级移动个人计算机(ultra-mobile personal computer,UMPC)、上网本,以及蜂窝电话、个人数字助理(personal digital assistant,PDA)、增强现实(augmented reality,AR)\虚拟现实(virtual reality,VR)设备等可以显示图片的电子设备,本申请实施例对该电子设备的具体形态不作特殊限制。
下面将结合附图对本申请实施例的实施方式进行详细描述。图1示出了电子设备100的结构示意图。
电子设备100可以包括处理器110,外部存储器接口120,内部存储器121,通用串行总线(universal serial bus,USB)接口130,充电管理模块140,电源管理模块141,电池142,天线1,天线2,移动通信模块150,无线通信模块160,音频模块170,扬声器170A,受话器170B,麦克风170C,耳机接口170D,传感器模块180,按键190,马达191,指示器192,摄像头193,显示屏194,以及用户标识模块(subscriber identification module,SIM)卡接口195等。
其中传感器模块180可以包括压力传感器180A,陀螺仪传感器180B,气压传感器180C,磁传感器180D,加速度传感器180E,距离传感器180F,接近光传感器180G,指纹传感器180H,温度传感器180J,触摸传感器180K,环境光传感器180L,骨传导传感器180M等。
可以理解的是,本申请实施例示意的结构并不构成对电子设备100的具体限定。在本申请另一些实施例中,电子设备100可以包括比图示更多或更少的部件,或者组合某些部件,或者拆分某些部件,或者不同的部件布置。图示的部件可以以硬件,软件或软件和硬件的组合实现。
处理器110可以包括一个或多个处理单元,例如:处理器110可以包括应用处理器(application processor,AP),调制解调处理器,图形处理器(graphics processing unit, GPU),图像信号处理器(image signal processor,ISP),控制器,存储器,视频编解码器,数字信号处理器(digital signal processor,DSP),基带处理器,和/或神经网络处理器(neural-network processing unit,NPU)等。其中,不同的处理单元可以是独立的器件,也可以集成在一个或多个处理器中。
其中,控制器可以是电子设备100的神经中枢和指挥中心。控制器可以根据指令操作码和时序信号,产生操作控制信号,完成取指令和执行指令的控制。
处理器110中还可以设置存储器,用于存储指令和数据。在一些实施例中,处理器110中的存储器为高速缓冲存储器。该存储器可以保存处理器110刚用过或循环使用的指令或数据。如果处理器110需要再次使用该指令或数据,可从所述存储器中直接调用。避免了重复存取,减少了处理器110的等待时间,因而提高了系统的效率。
在一些实施例中,处理器110可以包括一个或多个接口。接口可以包括集成电路(inter-integrated circuit,I2C)接口,集成电路内置音频(inter-integrated circuit sound,I2S)接口,脉冲编码调制(pulse code modulation,PCM)接口,通用异步收发传输器(universal asynchronous receiver/transmitter,UART)接口,移动产业处理器接口(mobile industry processor interface,MIPI),通用输入输出(general-purpose input/output,GPIO)接口,用户标识模块(subscriber identity module,SIM)接口,和/或通用串行总线(universal serial bus,USB)接口等。
可以理解的是,本申请实施例示意的各模块间的接口连接关系,只是示意性说明,并不构成对电子设备100的结构限定。在本申请另一些实施例中,电子设备100也可以采用上述实施例中不同的接口连接方式,或多种接口连接方式的组合。
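The charging management module 140 receives charging input from a charger, which may be a wireless charger or a wired charger. In some wired-charging embodiments, the charging management module 140 may receive the charging input of a wired charger through the USB interface 130. In some wireless-charging embodiments, the charging management module 140 may receive wireless charging input through a wireless charging coil of the electronic device 100. While charging the battery 142, the charging management module 140 may also supply power to the electronic device through the power management module 141.
The power management module 141 connects the battery 142, the charging management module 140, and the processor 110. The power management module 141 receives input from the battery 142 and/or the charging management module 140 and supplies power to the processor 110, the internal memory 121, the external memory, the display 194, the camera 193, the wireless communication module 160, and the like. The power management module 141 may also monitor parameters such as battery capacity, battery cycle count, and battery health (leakage, impedance). In some other embodiments, the power management module 141 may be provided in the processor 110. In still other embodiments, the power management module 141 and the charging management module 140 may be provided in the same device.
The wireless communication function of the electronic device 100 may be implemented by the antenna 1, the antenna 2, the mobile communication module 150, the wireless communication module 160, the modem processor, the baseband processor, and the like.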
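The electronic device 100 implements the display function through the GPU, the display 194, the application processor, and the like. The GPU is a microprocessor for image processing and connects the display 194 and the application processor. The GPU performs mathematical and geometric calculations for graphics rendering. The processor 110 may include one or more GPUs that execute program instructions to generate or change display information.
The display 194 displays images, videos, and the like. The display 194 includes a display panel. The display panel may be a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), a Mini LED, a Micro LED, a Micro OLED, a quantum dot light-emitting diode (QLED), or the like. In some embodiments, the electronic device 100 may include 1 or N displays 194, where N is a positive integer greater than 1.
The electronic device 100 may implement the shooting function through the ISP, the camera 193, the video codec, the GPU, the display 194, the application processor, and the like.
The electronic device 100 may implement audio functions, such as music playback and recording, through the audio module 170, the speaker 170A, the receiver 170B, the microphone 170C, the headset jack 170D, the application processor, and the like.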
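The software system of the electronic device 100 may use a layered architecture, an event-driven architecture, a microkernel architecture, a microservice architecture, or a cloud architecture. The embodiments of this application take the Android system with a layered architecture as an example to illustrate the software structure of the electronic device 100.
FIG. 2 is a block diagram of the software structure of the electronic device 100 according to an embodiment of this application. The layered architecture divides the software into several layers, each with a clear role and division of labor. The layers communicate through software interfaces. In some embodiments, the Android system is divided into four layers, which from top to bottom are the application layer, the application framework layer, the Android runtime and system libraries, and the kernel layer.
The application layer may include a series of application packages. As shown in FIG. 2, the application packages may include applications such as Camera, Gallery, Calendar, Phone, Maps, Navigation, WLAN, Bluetooth, Music, Video, and Messages.
The application framework layer provides an application programming interface (API) and a programming framework for the applications in the application layer. The application framework layer includes some predefined functions.
As shown in FIG. 2, the application framework layer may include a window manager, a content provider, a view system, a telephony manager, a resource manager, a notification manager, and the like.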
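The window manager manages window programs. The window manager can obtain the display size, determine whether there is a status bar, lock the screen, take screenshots, and so on.
The content provider stores and retrieves data and makes the data accessible to applications. The data may include videos, images, audio, calls made and received, browsing history and bookmarks, the phone book, and the like.
The view system includes visual controls, such as controls for displaying text and controls for displaying pictures. The view system can be used to build applications. A display interface may consist of one or more views. For example, a display interface that includes an SMS notification icon may include a view for displaying text and a view for displaying pictures.
The telephony manager provides the communication functions of the electronic device 100, for example management of call states (including connected, hung up, and so on).
The resource manager provides applications with various resources, such as localized strings, icons, pictures, layout files, and video files.
The notification manager enables an application to display notification information in the status bar. It can be used to convey informational messages that disappear automatically after a short stay without user interaction, for example to announce that a download is complete or to deliver a message reminder. A notification may also appear in the status bar at the top of the system as a chart or as scrolling text, for example a notification from an application running in the background, or may appear on the screen as a dialog window. Examples include prompting text in the status bar, playing a prompt tone, vibrating the electronic device, and blinking the indicator light.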
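The Android runtime includes core libraries and a virtual machine. The Android runtime is responsible for scheduling and managing the Android system. The core libraries consist of two parts: the functions that the Java language needs to call, and the Android core libraries.
The application layer and the application framework layer run in the virtual machine. The virtual machine executes the Java files of the application layer and the application framework layer as binary files. The virtual machine performs functions such as object lifecycle management, stack management, thread management, security and exception management, and garbage collection.
The system libraries may include multiple functional modules, for example a surface manager, media libraries (Media Libraries), a three-dimensional graphics processing library (for example, OpenGL ES), and a 2D graphics engine (for example, SGL).
The surface manager manages the display subsystem and provides fusion of 2D and 3D layers for multiple applications.
The media libraries support playback and recording of a variety of common audio and video formats, as well as still image files. The media libraries can support multiple audio, video, and image encoding formats, such as MPEG4, H.264, MP3, AAC, AMR, JPG, and PNG.
The three-dimensional graphics processing library implements three-dimensional graphics drawing, image rendering, compositing, layer processing, and the like. The 2D graphics engine is a drawing engine for 2D drawing.
The kernel layer is the layer between hardware and software. The kernel layer contains at least a display driver, a camera driver, an audio driver, and a sensor driver.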
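The methods in the following embodiments can all be implemented in the electronic device 100 having the hardware structure and software structure described above. The following embodiments describe the method of the embodiments of this application by taking a mobile phone as an example of the electronic device 100.
The gallery application on a mobile phone stores many pictures, which may be photos taken by the phone's camera or pictures obtained by other applications on the phone. With this method, all pictures in the gallery can be classified. For example, by picture content, the pictures in the gallery can be divided into picture sets such as people, natural scenery, animals, landmark buildings, and objects. The classification can be refined further: photos of people can be classified by person, for example under labels such as the phone owner, the owner's mother, the owner's daughter, and a particular friend. A still finer classification is possible, in which photos of the same person are classified by the person's age in the photo, for example childhood, adolescence, and middle age.
The following embodiments of this application take classification by person as the only example. A full clustering algorithm clusters the pictures of the same person into one picture set and marks all pictures in that picture set with one classification label, so that the user can browse them conveniently.
Further, the embodiments of this application also support picture retrieval. The user can search with an existing picture, search by keyword, or take a photo in real time with the phone's camera application and search with it. The phone then returns, according to the query picture, all pictures that belong to the same picture set, for example all pictures showing the same person as the query picture.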
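An embodiment of this application provides a picture processing method. As shown in FIG. 3, the method may include the following steps.
301: The electronic device performs face detection on a picture to obtain the coordinate information of a face frame.
Face detection quickly detects the face information in a picture, returns the position of the face frame, locates the facial features and contour key points, recognizes multiple face attributes, and returns the coordinate information of the face frame.
For example, as shown in FIG. 4, the electronic device performs face detection on the picture shown in FIG. 4(a). After the object in the picture is recognized, the face region is cropped out, and the face frame shown in FIG. 4(b) is output together with the coordinate information of that face frame.
302: Extract features according to the coordinate information of the face frame to obtain the feature vector corresponding to the picture.
A face feature extraction technique can be applied to the facial image inside the face frame. Specifically, a deep neural network can be used to extract a high-dimensional feature vector. For example, from the face frame obtained by face detection in FIG. 4, a high-dimensional feature vector such as (13, 423, 3, ..., 190) is extracted.
Further, the electronic device obtains and saves the feature vector corresponding to each picture for later comparison between feature vectors.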
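303: Calculate the similar distance between the feature vector of each picture and the feature vectors of the multiple pictures, where a smaller similar distance means a higher similarity.
The similar distance between a picture and any other picture is calculated from their feature vectors and expresses how similar the two pictures are. The larger the similar distance, the lower the similarity between the two pictures; the smaller the similar distance, the higher the similarity.
Specifically, the similar distance may be the Euclidean distance, the cosine distance, or a similar measure between the feature vectors of the two pictures. The embodiments of this application do not limit the algorithm; the following shows only the calculation based on the cosine distance as an example.
Let the feature vector of picture A1 be A1 = {x_1, x_2, x_3, ..., x_n} and the feature vector of picture A2 be A2 = {y_1, y_2, y_3, ..., y_n}. The cosine of the angle θ between the two feature vectors is:
$$\cos\theta = \frac{\sum_{i=1}^{n} x_i y_i}{\sqrt{\sum_{i=1}^{n} x_i^{2}}\,\sqrt{\sum_{i=1}^{n} y_i^{2}}}$$
The similar distance between picture A1 and picture A2 is then $S = 1 - \cos\theta$.
Optionally, before the similar distance is calculated, the feature vector of each picture can be normalized, that is, transformed so that the normalized feature vector has a length (norm) of 1. For normalized vectors the denominator above equals 1, so normalization reduces the amount of computation and speeds up processing.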
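The cosine-based similar distance and the optional normalization step can be sketched as follows. This is a minimal illustration using only the Python standard library; the function names `normalize` and `cosine_distance` are assumptions made for this example and do not come from the application.

```python
import math


def normalize(vec):
    """Scale a feature vector so that its L2 norm is 1."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def cosine_distance(a, b):
    """Return S = 1 - cos(theta) for two feature vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    unit_a, unit_b = normalize(a), normalize(b)
    # For unit vectors the cosine of the angle is simply the dot product.
    cos_theta = sum(x * y for x, y in zip(unit_a, unit_b))
    return 1.0 - cos_theta
```

With this definition, identical directions give a distance of 0, orthogonal vectors give 1, and opposite directions give 2. If the stored feature vectors are already normalized, the two `normalize` calls can be dropped and only the dot product remains.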
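304: Obtain the similarity list corresponding to each picture.
Specifically, the similar distances between the feature vector of each picture and the feature vectors of the multiple pictures are sorted in ascending order, and the first K similar distances form the similarity list of that picture, where K is a positive integer greater than or equal to 1.
For example, suppose the gallery contains N pictures A1, A2, A3, ..., AN. The similar distance is calculated between the feature vector of picture A1 and the feature vectors of all N pictures, including A1 itself. The resulting similar distances are sorted in ascending order, and the first K values form the similarity list of A1. The similar distance between picture A1 and itself is 0, which is the smallest value, so it is the first entry of the similarity list, that is, Top1 = 0 for A1.
Every picture has its own similarity list, so there are N similarity lists, each containing K similar distances in ascending order. Together they form an N*K similarity matrix. K can be preset by a person skilled in the art according to parameters such as the number of pictures in the gallery and the required accuracy of the clustering algorithm. For example, K may be set to 50 in the embodiments of this application.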
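The construction of the N*K similarity matrix can be sketched as below. This is a brute-force illustration under assumptions: `similarity_lists` is a hypothetical helper name, each list entry is stored as a `(distance, picture_index)` pair so that the matching picture can be recovered later, and the cosine-based similar distance is used.

```python
import math


def cosine_distance(a, b):
    """Similar distance S = 1 - cos(theta), clamped at 0 against rounding error."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return max(0.0, 1.0 - dot / (norm_a * norm_b))


def similarity_lists(features, k):
    """Return, for each picture, its k smallest (distance, index) pairs in ascending order."""
    lists = []
    for a in features:
        # Compare the picture with all N pictures, including itself.
        row = sorted((cosine_distance(a, b), j) for j, b in enumerate(features))
        lists.append(row[:k])
    return lists
```

This sketch compares every pair of pictures, which costs on the order of N*N distance computations; a production implementation would more likely use an approximate nearest-neighbour index, which the application does not discuss.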
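305: If the m-th similar distance among the first K similar distances of a picture is greater than a preset threshold, mark that picture as a picture set on its own.
Here m is a positive integer greater than or equal to 1. For example, m may be 3, which means that at least three pictures are required to form a picture set. When the third similar distance in the similarity list of a picture, that is, Top3, is greater than the preset threshold, fewer than three pictures are close enough to that picture to form a category. The picture is then either marked as a discrete point for which clustering failed and removed, or treated as a category of its own and marked with a label.
The preset threshold is the similar-distance limit within which three pictures can be clustered into one picture set. It can be preset by a person skilled in the art according to parameters such as the number of pictures in the gallery and the required accuracy of the clustering algorithm. For example, the preset threshold may be set to 0.2 in the embodiments of this application. That is, when Top3 in the similarity list of a picture is greater than 0.2, the picture is marked as a separate category and is not clustered with other pictures for the time being.
Further, when the subsequent step 307 determines whether picture sets can be merged, it can also determine whether this picture satisfies the condition for inter-class merging with another picture set. If the condition is not satisfied, or the merged picture set would still contain fewer than 3 pictures, the picture is marked as not yet clustered. When new pictures are added later, this picture can take part in clustering again to determine whether any new picture can be clustered with it into one picture set.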
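The discrete-point check of step 305 can be sketched as a single comparison on the sorted similarity list. The function name `is_discrete_point` is hypothetical, and the defaults m = 3 and 0.2 are simply the example values quoted above. Treating a list with fewer than m entries as a discrete point is an assumption of this sketch, not something the application states.

```python
def is_discrete_point(sorted_distances, m=3, preset_threshold=0.2):
    """Return True if the m-th smallest similar distance exceeds the preset threshold.

    sorted_distances is one picture's similarity list in ascending order;
    its first entry is the distance of the picture to itself (0).
    """
    if len(sorted_distances) < m:
        # Assumption: with fewer than m candidates, no set of m pictures can form.
        return True
    return sorted_distances[m - 1] > preset_threshold
```

For the list `[0.0, 0.1, 0.35]`, Top3 is 0.35, which is greater than 0.2, so the picture is held back as a discrete point. For `[0.0, 0.05, 0.15, 0.6]`, Top3 is 0.15 and the picture proceeds to step 306.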
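306: Obtain the classification threshold of a picture from its similarity list, and mark the pictures whose similar distance in the similarity list is less than or equal to the classification threshold as one picture set.
Specifically, the classification threshold of each picture is obtained from the similar distances between that picture and the multiple pictures. Among those similar distances, the pictures whose similar distance is less than or equal to the classification threshold are marked as one picture set.
Further, the classification threshold can be calculated as follows: a difference operation is performed on the K similar distances of each picture to find the outlier among them, and the similar distance corresponding to that outlier is the classification threshold of the picture.
An outlier is a value in a numerical sequence that is extremely large or extremely small relative to the general level of the sequence. It can be found with a difference operation. For example, a difference operation on the K similar distances yields a difference function, and the similar distance corresponding to the largest peak of that difference function is the outlier.
For example, as shown in FIG. 5, the similarity list of a picture gives the similar-distance distribution function in the left panel. A difference operation on this distribution function gives the difference function in the right panel. The largest peak of the difference function marks an outlier in the similarity list of the picture, and the similar distance corresponding to that outlier can be set as the classification threshold.
Next, the pictures whose similar distance in the similarity list is less than or equal to the classification threshold are marked as one picture set.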
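The dynamic threshold of step 306 can be sketched as a first-order difference over the sorted similar distances. The text does not say whether the threshold is the distance just before or just after the largest jump. This sketch assumes the distance just before the jump, so that the picture on the far side of the jump is excluded from the set. The names `classification_threshold` and `picture_set` are hypothetical.

```python
def classification_threshold(sorted_distances):
    """Pick a per-picture threshold at the largest jump in the ascending distances."""
    if not sorted_distances:
        raise ValueError("similarity list is empty")
    if len(sorted_distances) == 1:
        return sorted_distances[0]
    # First-order difference between neighbouring similar distances.
    diffs = [b - a for a, b in zip(sorted_distances, sorted_distances[1:])]
    peak = max(range(len(diffs)), key=diffs.__getitem__)
    # Assumption: use the last distance before the jump as the threshold.
    return sorted_distances[peak]


def picture_set(similarity_list):
    """similarity_list: ascending (distance, picture_index) pairs for one picture."""
    threshold = classification_threshold([d for d, _ in similarity_list])
    members = [idx for d, idx in similarity_list if d <= threshold]
    return threshold, members
```

For the list `[(0.0, 0), (0.05, 4), (0.08, 2), (0.45, 7), (0.5, 1)]` the differences are 0.05, 0.03, 0.37 and 0.05. The largest jump lies between 0.08 and 0.45, so the threshold is 0.08 and pictures 0, 4 and 2 form the set.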
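With this way of setting the classification threshold, different picture sets have different classification thresholds. The classification threshold is generated dynamically from the similar distances between the pictures in the picture set, and adding or deleting a picture may change it. It is therefore not a fixed threshold but one obtained by an artificial-intelligence algorithm, so that the threshold is chosen dynamically according to how similar the pictures in each picture set are to one another. This improves the accuracy of the picture sets.
For example, as shown in FIG. 6, if the photos of one person in the gallery span a wide range of ages, the similar distances between those photos are relatively high. If the photos of another person show a single scene, similar ages, and little variation in expression and other factors, the similar distances between those photos can be expected to be low overall. In this situation, using one fixed similarity threshold as the criterion for deciding whether two photos show different people would inevitably cause clustering errors. The dynamic-threshold clustering method of the foregoing embodiments can instead cluster the pictures dynamically and intelligently.
307: Determine whether the picture sets can be merged, and merge the generated picture sets.
Specifically, the center point of the picture set corresponding to each picture is obtained, where the center point is the average of the feature vectors of all pictures in the picture set. If the similar distance between the center points of any two picture sets is less than the classification threshold of either of the two picture sets, the two picture sets can be merged into one picture set.
It should be noted that the center point represents the average value toward which the feature vectors in the picture set tend. The relationship between the center points and the classification thresholds of the picture sets can therefore be used to decide whether two picture sets can undergo inter-class merging. Picture sets with high similarity are merged and updated into one new picture set.
For example, the center point of picture set X is X_1 and its automatic threshold is 0.5; the center point of picture set Y is Y_1 and its automatic threshold is 0.6. The similar distance between X_1 and Y_1 is calculated as 0.55. Because 0.55 is less than 0.6, the threshold of picture set Y, picture set X and picture set Y are merged into one category. The center point of the merged picture set and its picture count K are then updated.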
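The merge test of step 307 can be sketched as follows. The helper names `center_point` and `can_merge` are hypothetical. The sketch reads "less than the classification threshold of either of the two picture sets" as a logical OR, which is an interpretation rather than something the text states outright.

```python
def center_point(vectors):
    """Element-wise mean of the feature vectors in one picture set."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def can_merge(center_distance, threshold_x, threshold_y):
    """Merge when the centre distance is below either set's classification threshold."""
    return center_distance < threshold_x or center_distance < threshold_y
```

With thresholds 0.5 and 0.6, a centre distance of 0.55 satisfies the test through the second threshold, while a centre distance of 0.65 satisfies neither and the two sets stay separate.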
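308: Determine whether any picture sets still need to be merged.
Specifically, whether any picture sets can still be merged is determined according to the inter-class merging condition described above. If no picture sets need to be merged, step 310 is performed. If picture sets still need to be merged, step 309 is performed.
Further, the termination condition for inter-class merging may also include the following: if the second similar distance in the similarity list of a picture set is greater than a preset threshold, that picture set is not merged with any other picture set.
This preset threshold is the similar-distance limit beyond which the picture set is too dissimilar to other picture sets for inter-class merging to be worthwhile. It can be preset by a person skilled in the art according to parameters such as the number of pictures in the gallery and the required accuracy of the clustering algorithm, and the embodiments of this application do not specifically limit it.
309: Update the center point of the picture set and the picture count K of the picture set.
As described above, after picture sets are merged, the center point of the new merged picture set may change, so the center point of the picture set needs to be updated.
Further, after picture sets are merged, the number of pictures in the picture set also changes, so the picture count K of the picture set needs to be updated, and step 306 is performed again to recalculate the classification threshold of the picture set.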
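The loop formed by steps 307 to 310 can be sketched as repeated pairwise merging until no pair satisfies the merge condition. Several things here are assumptions made for illustration: each picture set is a dict with `"vectors"` and `"threshold"` keys, the merge condition is an OR over the two thresholds, and the recalculation of the classification threshold is delegated to a caller-supplied `recompute_threshold` function because the text does not spell out how step 306 is applied to a merged set.

```python
import math


def cosine_distance(a, b):
    """Similar distance S = 1 - cos(theta), clamped at 0 against rounding error."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return max(0.0, 1.0 - dot / (norm_a * norm_b))


def center_point(vectors):
    """Element-wise mean of the feature vectors in one picture set."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def merge_until_stable(picture_sets, recompute_threshold):
    """Merge picture sets pairwise until no pair satisfies the merge condition."""
    sets = [dict(s) for s in picture_sets]  # work on a shallow copy
    merged = True
    while merged:  # step 308: is there anything left to merge?
        merged = False
        for i in range(len(sets)):
            for j in range(i + 1, len(sets)):
                d = cosine_distance(center_point(sets[i]["vectors"]),
                                    center_point(sets[j]["vectors"]))
                if d < sets[i]["threshold"] or d < sets[j]["threshold"]:
                    vectors = sets[i]["vectors"] + sets[j]["vectors"]
                    # Step 309: update the members and the threshold of the merged set.
                    sets[i] = {"vectors": vectors,
                               "threshold": recompute_threshold(vectors)}
                    del sets[j]
                    merged = True
                    break
            if merged:
                break
    return sets  # step 310: the final picture sets
```

Each merge reduces the number of sets by one, so the loop always terminates. A caller could pass, for example, `lambda vectors: 0.1` as a fixed placeholder for `recompute_threshold` while experimenting.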
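310: Complete the picture clustering and output the resulting picture sets.
When it is determined that no picture sets can be merged, the termination condition of clustering has been reached and the full clustering of all pictures in the gallery is complete. The clustering result, for example N pictures clustered into n picture sets, can be displayed in the gallery application of the electronic device. Specifically, the gallery application may display the labels of the picture sets, with one label representing one picture set. The user taps a label to view all or some of the pictures in that picture set.
In the foregoing embodiment, the classification threshold is calculated automatically by artificial intelligence, which achieves full clustering of the user's pictures. Different classification criteria can be applied according to how similar the user's pictures are, which improves the accuracy of picture classification. With picture clustering implemented in this way, the similar distances between pictures within one picture set are smaller than the similar distances between different picture sets. The photos on the user's phone can be quickly organized into separate albums for different people, each with its own classification label.
Further, the phone can store information such as the center point, the picture count, and the label of each generated picture set in its memory. When newly added pictures are clustered later, the pictures that have already been clustered do not need to be clustered again. Instead, the comparison is made against the picture-set information stored in memory, so that picture classification and picture retrieval can be performed quickly.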
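Because the center point is defined as the mean of all feature vectors in a picture set, the center of a merged set can be computed from the stored centers and picture counts alone, without revisiting the individual pictures. The sketch below illustrates this; the function name `merge_centers` and the idea of passing the center and count as separate arguments are assumptions for this example.

```python
def merge_centers(center_a, count_a, center_b, count_b):
    """Return the centre and picture count of the union of two picture sets.

    The merged centre is the count-weighted average of the two stored centres,
    which equals the mean of all feature vectors in both sets.
    """
    total = count_a + count_b
    merged = [(count_a * a + count_b * b) / total
              for a, b in zip(center_a, center_b)]
    return merged, total
```

For example, merging a set of one picture centered at `[1.0, 0.0]` with a set of three pictures centered at `[0.0, 1.0]` gives the center `[0.25, 0.75]` and a picture count of 4.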
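In actual use, after the electronic device obtains a batch of new photos, it can use an incremental clustering algorithm to quickly cluster the newly added pictures with the existing picture sets, that is, file them into the existing picture sets. If a newly added photo cannot be filed into an existing picture set and the condition for adding a picture set is satisfied, the electronic device can create a new picture set from the newly added photo.
The incremental clustering process may follow either of the two flows below. In the first flow, the similar distance between each of the M newly added pictures and the existing picture sets is calculated one by one, and the picture set with the smallest similar distance is taken as the set to which the picture belongs.
Specifically, the similar distance between the newly added picture and the center point of each of the existing picture sets is calculated, and the picture set with the smallest similar distance is identified, for example a first picture set. If the similar distance between the newly added picture and the first picture set is less than or equal to the classification threshold of the first picture set, the electronic device determines that the newly added picture belongs to the first picture set. If the similar distance between the newly added picture and the first picture set is greater than the classification threshold of the first picture set, the electronic device creates a second picture set, and the newly added picture belongs to the second picture set.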
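The first incremental flow can be sketched as a nearest-center lookup followed by one threshold comparison. The dict layout with `"label"`, `"center"` and `"threshold"` keys and the function name `assign_new_picture` are assumptions for illustration. Returning `None` stands for "create a second picture set".

```python
import math


def cosine_distance(a, b):
    """Similar distance S = 1 - cos(theta), clamped at 0 against rounding error."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return max(0.0, 1.0 - dot / (norm_a * norm_b))


def assign_new_picture(feature, picture_sets):
    """Return the label of the set the new picture joins, or None for a new set."""
    if not picture_sets:
        return None
    # The "first picture set" is the one whose centre is nearest to the new picture.
    first = min(picture_sets, key=lambda s: cosine_distance(feature, s["center"]))
    if cosine_distance(feature, first["center"]) <= first["threshold"]:
        return first["label"]
    return None
```

Only one distance per existing picture set is computed, so the cost of filing a new picture grows with the number of picture sets rather than with the number of pictures already in the gallery.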
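In the second flow, full clustering is first performed on the M newly added pictures, which can be implemented as in steps 301 to 310 above, to generate m picture sets. The similar distances between the center points of the m picture sets and the center points of the n picture sets generated earlier are then calculated, and inter-class merging is performed. The existing picture set whose similar distance is the smallest and is less than the classification threshold of either picture set is taken as the set to which the new pictures belong, and the two picture sets are merged. Alternatively, when the smallest similar distance is greater than the classification thresholds of both picture sets and the new picture set contains 3 or more pictures, a new category is created from the new picture set.
The second incremental clustering flow requires less computation and is the better algorithm. The following embodiments of this application take only the second flow as an example. As shown in FIG. 7, the flow can be implemented through the following steps.
701: The electronic device extracts the feature vectors of the newly added pictures.
For the specific process, refer to steps 301 and 302 above; details are not repeated here.
The electronic device obtains and saves the feature vector corresponding to each newly added picture for later comparison between feature vectors.
702: Calculate the similarity between the newly added pictures and the picture sets, and remove discrete points.
The similar distance can be used to express the degree of similarity. For the calculation of the similar distance and the removal of discrete points, refer to steps 303, 304, and 305 above; details are not repeated here.
703: Cluster the newly added pictures to obtain multiple newly added picture sets corresponding to the newly added pictures.
For the specific process, refer to steps 306 to 310 above; details are not repeated here. For example, full clustering is performed on the M newly added pictures to generate m newly added picture sets.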
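704: Calculate the similar distance between the center point of each newly added picture set and the center points of the existing picture sets, and perform inter-class merging.
The similar distance expresses the degree of similarity between a newly added picture set and an original picture set, and is used to find the original picture set with the highest similarity.
The center point is the average of the feature vectors of all pictures in the newly added picture set or in the original picture set.
If the similar distance between the newly added picture set and the first picture set is less than or equal to the classification threshold of either picture set, it is determined that the newly added picture set belongs to the first picture set, and the newly added picture set and the first picture set are merged and updated into one picture set. If the similar distance between the newly added picture set and the first picture set is greater than the classification thresholds of both picture sets, the electronic device creates a second picture set, and the newly added picture set belongs to the second picture set.
Further, the condition for the electronic device to create the second picture set may also include that the number of pictures in the newly added picture set is greater than or equal to m, where m is the minimum set size used in step 305 and not the number of newly added picture sets. For example, if the embodiments of this application consider fewer than three pictures insufficient to form a category, m can be set to 3.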
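The decision made in step 704 for one newly added picture set can be sketched as a three-way outcome: merge into the nearest existing set, create a new set, or leave the pictures pending. The dict layout with `"center"`, `"threshold"` and `"count"` keys, the function name `place_new_set`, and the `"pending"` outcome for sets that are too small are assumptions made for this illustration.

```python
import math


def cosine_distance(a, b):
    """Similar distance S = 1 - cos(theta), clamped at 0 against rounding error."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return max(0.0, 1.0 - dot / (norm_a * norm_b))


def place_new_set(new_set, existing_sets, min_size=3):
    """Return ("merge", index), ("create", None) or ("pending", None)."""
    if existing_sets:
        distances = [cosine_distance(new_set["center"], s["center"])
                     for s in existing_sets]
        nearest = min(range(len(distances)), key=distances.__getitem__)
        d = distances[nearest]
        # Merge when the centre distance is within either set's threshold.
        if d <= new_set["threshold"] or d <= existing_sets[nearest]["threshold"]:
            return ("merge", nearest)
    # Not close enough to any existing set: create a set only if it is large enough.
    if new_set["count"] >= min_size:
        return ("create", None)
    return ("pending", None)
```

A caller would apply the `"merge"` outcome by updating the stored center point, picture count and classification threshold of the existing set, as described for step 706.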
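705: Determine whether any picture sets still need to be merged.
Specifically, whether any picture sets can still be merged is determined according to the inter-class merging condition described above. If no picture sets need to be merged, step 707 is performed. If picture sets still need to be merged, step 706 is performed.
Further, the termination condition for inter-class merging may also include the following: if the second similar distance in the similarity list of a picture set is greater than a preset threshold, that picture set is not merged with any other picture set.
This preset threshold is the similar-distance limit beyond which the picture set is too dissimilar to other picture sets for inter-class merging to be worthwhile. It can be preset by a person skilled in the art according to parameters such as the number of pictures in the gallery and the required accuracy of the clustering algorithm, and the embodiments of this application do not specifically limit it.
706: Update the center point of the picture set and the picture count K of the picture set.
As described above, after picture sets are merged, the center point of the new merged picture set may change, so the center point of the picture set needs to be updated.
Further, after picture sets are merged, the number of pictures in the picture set also changes, so the picture count K needs to be updated and the classification threshold of the picture set needs to be recalculated.
707: Complete the picture clustering and output the resulting picture sets.
When it is determined that no picture sets can be merged, the termination condition of clustering has been reached and the incremental clustering of the newly added pictures is complete. The clustering result can be displayed in the gallery application of the electronic device.
For example, as shown in FIG. 8, three newly added pictures are each classified by incremental clustering. The picture shown in FIG. 8(a) is filed into the picture set with label 1, and the picture shown in FIG. 8(b) is filed into the picture set with label 2. Classification of the picture shown in FIG. 8(c) fails, and the interface can indicate that the classification failed or that no pictures of the same category have been found yet. The user can tap on the phone to view the picture sets produced by the smart album classification, and can tap a classification label to view all pictures in that picture set.
The foregoing embodiment clusters one or more newly added pictures quickly. By comparing the newly added pictures with the class centers of the existing clustering result, it determines the picture set to which a newly added picture belongs or quickly creates a new picture set. This simplifies the clustering of newly added pictures, improves clustering efficiency, and improves the user experience.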
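In an actual application scenario, when the user opens the gallery application on the electronic device and needs the electronic device to search by a face picture, the picture retrieval performed by the electronic device is similar to the incremental clustering process above. For example, as shown in FIG. 9, the flow in which the electronic device obtains a retrieval result from a newly added face picture may include the following steps.
901: The electronic device obtains a face picture.
The electronic device may obtain the face picture by selecting one or more local or network pictures, or by capturing a picture in real time with the camera.
902: Perform feature extraction on the obtained face picture to obtain a feature vector. If there are multiple pictures, they are first clustered to obtain newly added picture sets.
903: Calculate the similar distance between the center point of the newly added picture set and the center points of the existing picture sets. The picture set whose similar distance is the smallest and is less than the classification threshold is output as the retrieval result.
Further, the pairwise similar distances between the center points of the newly added picture sets and the center points of the existing picture sets are calculated, whether the inter-class merging condition is satisfied is determined, and the newly added picture sets are merged with the existing picture sets accordingly.
For the inter-class merging condition, refer to step 307 of the foregoing embodiment; details are not repeated here.
After the picture sets are merged, the classification labels of the newly added pictures are updated, and the corresponding parameters of the original picture set, such as its class center and its picture count K, are updated.
904: If the smallest of the similar distances is greater than the classification threshold, create a new picture set from the newly added picture set.
Further, the newly added picture sets whose similar distance to the center points of the existing picture sets is greater than the classification threshold can be marked as new picture sets, and the related parameters of the picture sets are updated.
In the foregoing embodiments of this application, incremental clustering stores the results of processing existing photos, such as face clustering results or information such as scene and time, in a database, caches them, and builds an index. When a newly added photo is used for a search, the stored clustering result can be used directly, without repeating the comparison against every earlier photo, which enables real-time search. Because the incremental clustering technique uses the clustering result cached in the gallery directly, it avoids comparing feature vectors photo by photo. In addition, the dynamic-threshold clustering method can increase the accuracy of face clustering when the samples are irregularly distributed, which improves the user experience.
It should be noted that the electronic device in the foregoing embodiments may also be implemented by a cloud device such as a cloud server.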
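Other embodiments of this application provide an electronic device. The electronic device may include a memory and one or more processors, with the memory coupled to the processors. The memory stores computer program code, and the computer program code includes computer instructions. When the processors execute the computer instructions, the electronic device can perform the functions or steps of the foregoing method embodiments.
An embodiment of this application further provides a chip system. As shown in FIG. 10, the chip system includes at least one processor 1001 and at least one interface circuit 1002. The processor 1001 and the interface circuit 1002 may be interconnected by lines. For example, the interface circuit 1002 may receive signals from other apparatuses (for example, the memory of the electronic device). As another example, the interface circuit 1002 may send signals to other apparatuses (for example, the processor 1001). For example, the interface circuit 1002 may read instructions stored in the memory and send the instructions to the processor 1001. When the instructions are executed by the processor 1001, the electronic device can be caused to perform the functions or steps performed by the electronic device in the foregoing embodiments. The chip system may also include other discrete devices, which is not specifically limited in the embodiments of this application.
An embodiment of this application further provides a computer storage medium. The computer storage medium includes computer instructions. When the computer instructions are run on the foregoing electronic device, the electronic device is caused to perform the functions or steps of the foregoing method embodiments.
An embodiment of this application further provides a computer program product. When the computer program product is run on a computer, the computer is caused to perform the functions or steps of the foregoing method embodiments.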
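From the description of the foregoing implementations, a person skilled in the art can clearly understand that, for convenience and brevity of description, the division into the functional modules above is used only as an example. In practical applications, the functions can be assigned to different functional modules as required, that is, the internal structure of the apparatus can be divided into different functional modules to complete all or part of the functions described above.
In the several embodiments provided in this application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative. The division into modules or units is only a division by logical function, and other divisions are possible in an actual implementation. For example, multiple units or components may be combined or integrated into another apparatus, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be indirect coupling or communication connection through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may be one physical unit or multiple physical units, that is, they may be located in one place or distributed across multiple different places. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, the functional units in the embodiments of this application may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a readable storage medium. Based on this understanding, the technical solution of the embodiments of this application in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions for causing a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to perform all or some of the steps of the methods described in the embodiments of this application. The foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The foregoing is only a specific implementation of this application, but the protection scope of this application is not limited thereto. Any change or replacement within the technical scope disclosed in this application shall fall within the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.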

Claims (13)

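  1. A picture processing method, applied to an electronic device, wherein the method comprises:
    clustering, by the electronic device, multiple pictures to obtain multiple picture sets;
    obtaining, by the electronic device, similar distances between a newly added picture and center points of the multiple picture sets, and determining, from the multiple picture sets, a first picture set having the smallest similar distance to the newly added picture, wherein a smaller similar distance indicates a higher similarity, and the center point is the average of the feature vectors of all pictures in a picture set;
    if the similar distance between the newly added picture and the center point of the first picture set is less than or equal to a classification threshold of the first picture set, determining, by the electronic device, that the newly added picture belongs to the first picture set; and
    if the similar distance between the newly added picture and the center point of the first picture set is greater than the classification threshold of the first picture set, creating, by the electronic device, a second picture set, wherein the newly added picture belongs to the second picture set.
  2. The method according to claim 1, wherein the clustering, by the electronic device, multiple pictures to obtain multiple picture sets specifically comprises:
    obtaining, according to the similar distances between the feature vector of each picture and the feature vectors of the multiple pictures, the first K similar distances of each picture sorted in ascending order, wherein K is a positive integer greater than or equal to 1;
    performing a difference operation on the K similar distances of each picture to obtain an outlier among the K similar distances of each picture, wherein the similar distance corresponding to the outlier is the classification threshold of each picture;
    marking, among the similar distances between the feature vector of each picture and the feature vectors of the multiple pictures, the pictures whose similar distance is less than or equal to the classification threshold as one picture set, to obtain the picture set corresponding to each picture; and
    merging the generated picture sets to obtain the multiple picture sets.
  3. The method according to claim 2, wherein before the difference operation is performed on the K similar distances of each picture, the method further comprises:
    for any picture among the pictures, if the m-th similar distance among the first K similar distances of the picture is greater than a preset threshold, marking the picture as one picture set, wherein m is a positive integer greater than or equal to 1.
  4. The method according to claim 2 or 3, wherein the merging the generated picture sets to obtain the multiple picture sets comprises:
    obtaining the center point of the picture set corresponding to each picture; and
    if the similar distance between the center points of any two picture sets is less than the classification threshold of either of the two picture sets, merging the two picture sets into one picture set.
  5. The method according to claim 4, wherein the obtaining, by the electronic device, similar distances between a newly added picture and center points of the multiple picture sets specifically comprises:
    if there are at least two newly added pictures, clustering the newly added pictures to obtain multiple newly added picture sets corresponding to the newly added pictures; and
    calculating the similar distance between each newly added picture set and the center points of the multiple picture sets.
  6. An electronic device, wherein the electronic device comprises a processor and a memory connected to the processor, the memory is configured to store instructions, and when the instructions are executed by the processor, the electronic device is caused to perform:
    clustering multiple pictures to obtain multiple picture sets;
    obtaining similar distances between a newly added picture and center points of the multiple picture sets, and determining, from the multiple picture sets, a first picture set having the smallest similar distance to the newly added picture, wherein a smaller similar distance indicates a higher similarity, and the center point is the average of the feature vectors of all pictures in a picture set;
    if the similar distance between the newly added picture and the center point of the first picture set is less than or equal to a classification threshold of the first picture set, determining that the newly added picture belongs to the first picture set; and
    if the similar distance between the newly added picture and the center point of the first picture set is greater than the classification threshold of the first picture set, creating a second picture set, wherein the newly added picture belongs to the second picture set.
  7. The electronic device according to claim 6, wherein the clustering multiple pictures to obtain multiple picture sets specifically comprises:
    obtaining, according to the similar distances between the feature vector of each picture and the feature vectors of the multiple pictures, the first K similar distances of each picture sorted in ascending order, wherein K is a positive integer greater than or equal to 1;
    performing a difference operation on the K similar distances of each picture to obtain an outlier among the K similar distances of each picture, wherein the similar distance corresponding to the outlier is the classification threshold of each picture;
    marking, among the similar distances between the feature vector of each picture and the feature vectors of the multiple pictures, the pictures whose similar distance is less than or equal to the classification threshold as one picture set, to obtain the picture set corresponding to each picture; and
    merging the generated picture sets to obtain the multiple picture sets.
  8. The electronic device according to claim 7, wherein before the difference operation is performed on the K similar distances of each picture, the electronic device is further configured to perform:
    for any picture among the pictures, if the m-th similar distance among the first K similar distances of the picture is greater than a preset threshold, marking the picture as one picture set, wherein m is a positive integer greater than or equal to 1.
  9. The electronic device according to claim 7 or 8, wherein the merging the generated picture sets to obtain the multiple picture sets comprises:
    obtaining the center point of the picture set corresponding to each picture; and
    if the similar distance between the center points of any two picture sets is less than the classification threshold of either of the two picture sets, merging the two picture sets into one picture set.
  10. The electronic device according to claim 9, wherein the obtaining similar distances between a newly added picture and center points of the multiple picture sets specifically comprises:
    if there are at least two newly added pictures, clustering the newly added pictures to obtain multiple newly added picture sets corresponding to the newly added pictures; and
    calculating the similar distance between each newly added picture set and the center points of the multiple picture sets.
  11. A chip system, wherein the chip system is applied to an electronic device; the chip system comprises one or more interface circuits and one or more processors; the interface circuits and the processors are interconnected by lines; the interface circuits are configured to receive a signal from a memory of the electronic device and send the signal to the processors, the signal comprising computer instructions stored in the memory; and when the processors execute the computer instructions, the electronic device performs the picture processing method according to any one of claims 1 to 5.
  12. A readable storage medium, wherein the readable storage medium stores instructions, and when the readable storage medium is run on an electronic device, the electronic device is caused to perform the picture processing method according to any one of claims 1 to 5.
  13. A computer program product, wherein when the computer program product is run on a computer, the computer is caused to perform the picture processing method according to any one of claims 1 to 5.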
PCT/CN2020/110307 2019-08-27 2020-08-20 Picture processing method and apparatus WO2021036906A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910798295.4A CN112445922A (zh) 2019-08-27 2019-08-27 Picture processing method and apparatus
CN201910798295.4 2019-08-27

Publications (1)

Publication Number Publication Date
WO2021036906A1 true WO2021036906A1 (zh) 2021-03-04

Family

ID=74684051

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/110307 WO2021036906A1 (zh) 2019-08-27 2020-08-20 Picture processing method and apparatus

Country Status (2)

Country Link
CN (1) CN112445922A (zh)
WO (1) WO2021036906A1 (zh)

Also Published As

Publication number Publication date
CN112445922A (zh) 2021-03-05

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20856907

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20856907

Country of ref document: EP

Kind code of ref document: A1