WO2022063189A1 - Salient element recognition method and apparatus - Google Patents

Salient element recognition method and apparatus

Info

Publication number
WO2022063189A1
Authority
WO
WIPO (PCT)
Prior art keywords
information
image
feature information
identifier
sub
Prior art date
Application number
PCT/CN2021/119974
Other languages
English (en)
French (fr)
Inventor
梁宇
孙赟
Original Assignee
维沃移动通信有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 维沃移动通信有限公司
Publication of WO2022063189A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/46 Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462 Salient features, e.g. scale invariant feature transforms [SIFT]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/2148 Generating training patterns; Bootstrap methods, e.g. bagging or boosting, characterised by the process organisation or structure, e.g. boosting cascade
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/24323 Tree-organised classifiers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/10 Terrestrial scenes

Definitions

  • The present application belongs to the field of mobile communications, and in particular relates to a salient element recognition method and apparatus.
  • The camera of an electronic device usually performs salient element detection on the captured picture. For example, in a portrait composition scene, the camera first performs salient element detection on the person and other subjects in the image to obtain the positions of the salient elements; in the subsequent process of recommending a composition position, those positions can be referenced for further processing to obtain a better result.
  • The result of salient element detection directly affects the probability of salient subjects (the person or other subjects) being falsely detected, missed, or cropped during portrait composition; improving the detection rate of saliency detection is therefore of great significance for judging the positional relationship between the person and the other salient subjects.
  • At present, salient element detection mainly relies on traditional image processing: relatively low-level visual features in the image, such as color, brightness, contrast, and edge information, are combined and processed to simulate the human visual attention mechanism and obtain the salient elements.
  • In this scheme, some low-level first feature information of the image has to be designed manually for combination. When too much interference information exists in the image, the method lacks universality and is prone to false detection, which in turn affects the accuracy of salient element recognition.
  • The purpose of the embodiments of the present application is to provide a salient element recognition method and apparatus, which can solve the problem that the salient element detection methods in the prior art are prone to false detection.
  • An embodiment of the present application provides a salient element recognition method, the method comprising: acquiring a first image and first feature information associated with the first image; determining a first element recognizer according to the first feature information; and inputting the first image to the first element recognizer to obtain the salient element output by the first element recognizer; wherein the sub-information of the first feature information includes at least one of user feature information, geographic location information, time information, and scene information.
  • Optionally, determining the first element recognizer according to the first feature information includes: acquiring second element recognizers, where each second element recognizer corresponds to one piece of the sub-information and is the recognizer with the highest recognition rate for that sub-information; and performing fusion training on the second element recognizers to obtain the first element recognizer.
  • Optionally, acquiring the second element recognizers includes: acquiring a preset second element recognizer, the second element recognizer being obtained by training on a first sample image; and training, according to the sub-information of the first feature information, a second element recognizer corresponding to the sub-information.
  • Optionally, the sub-information includes feature information extracted from the first image, or feature information input by a user.
  • Optionally, the first image includes a second image displayed on a shooting preview interface of the electronic device, or a third image stored in the electronic device.
  • An embodiment of the present application further provides a salient element recognition apparatus, the apparatus including: an acquisition module, configured to acquire a first image and first feature information associated with the first image; a determining module, configured to determine a first element recognizer according to the first feature information; and a recognition module, configured to input the first image to the first element recognizer and obtain the salient element output by the first element recognizer; wherein the sub-information of the first feature information includes at least one of user feature information, geographic location information, time information, and scene information.
  • Optionally, the determining module includes: a recognizer acquisition sub-module, configured to acquire second element recognizers, where each second element recognizer corresponds to one piece of the sub-information and is the recognizer with the highest recognition rate for that sub-information; and a fusion sub-module, configured to perform fusion training on the second element recognizers to obtain the first element recognizer.
  • Optionally, the recognizer acquisition sub-module is configured to: acquire a preset second element recognizer, the second element recognizer being obtained by training on a first sample image; and train, according to the sub-information of the first feature information, a second element recognizer corresponding to the sub-information.
  • Optionally, the sub-information includes feature information extracted from the first image, or feature information input by a user.
  • Optionally, the first image includes a second image displayed on a shooting preview interface of the electronic device, or a third image stored in the electronic device.
  • An embodiment of the present application further provides an electronic device, the electronic device including a memory, a processor, and a program or instruction stored in the memory and executable on the processor, where the processor, when executing the program or instruction, implements the steps of the salient element recognition method described above.
  • An embodiment of the present application further provides a readable storage medium, on which a program or instruction is stored, where the program or instruction, when executed by a processor, implements the steps of the salient element recognition method described above.
  • An embodiment of the present application further provides a chip, the chip including a processor and a communication interface, the communication interface being coupled to the processor, and the processor being configured to run a program or instruction to implement the method described in the first aspect.
  • In the embodiments of the present application, a first image and first feature information associated with the first image are acquired; a first element recognizer is determined according to the first feature information; and the first image is input to the first element recognizer to obtain the salient element output by the first element recognizer, where the sub-information of the first feature information includes at least one of user feature information, geographic location information, time information, and scene information. Based on the real-time, dynamic first feature information, the salient elements of the first image are recognized dynamically, which avoids interference from other information in the first image and improves recognition accuracy.
  • FIG. 1 shows a flowchart of the salient element recognition method provided by an embodiment of the present application;
  • FIG. 2 shows a flowchart of training a second element recognizer provided by an embodiment of the present application;
  • FIG. 3 shows a flowchart of a first example of an embodiment of the present application;
  • FIG. 4 shows a block diagram of the salient element recognition apparatus provided by an embodiment of the present application;
  • FIG. 5 shows a first block diagram of the electronic device provided by an embodiment of the present application;
  • FIG. 6 shows a second block diagram of the electronic device provided by an embodiment of the present application.
  • An embodiment of the present application provides a salient element recognition method.
  • Optionally, the method can be applied to electronic devices, including various handheld devices, vehicle-mounted devices, wearable devices, computing devices or other processing devices connected to a wireless modem, as well as various forms of mobile stations (Mobile Station, MS), terminal devices (Terminal Device), and so on.
  • The method includes:
  • Step 101: Acquire a first image and first feature information associated with the first image.
  • The first image is the image on which salient element recognition is to be performed. Optionally, the first image includes a third image stored in the electronic device, or a second image displayed on a shooting preview interface of the electronic device.
  • The third image is an image already stored in the electronic device, such as a picture the electronic device has taken or received; while performing image processing or optimization on the first image, the electronic device can perform salient element recognition on the third image.
  • The second image is the image formed on the shooting preview interface of the electronic device. For example, while shooting an image or video, the electronic device performs real-time salient element detection on the shooting preview interface to obtain the salient elements in the picture, and then composes the shot according to the relevant information of the salient elements.
  • Salient element detection is visual saliency detection: by simulating the characteristics of human vision, the salient regions of an image are extracted, a salient region being a region of human interest, and the objects in a salient region being the salient elements. The human visual system, when facing a natural scene, can quickly search for and locate the object of interest; this visual attention mechanism is an important mechanism by which people process visual information in daily life. Saliency detection has important application value in object recognition, image and video compression, image retrieval, image retargeting, and other fields.
  • The sub-information of the first feature information includes at least one of user feature information, geographic location information, time information, and scene information. Optionally, the first feature information may be feature information extracted from the first image, or feature information input by a user.
  • The user feature information includes user attribute information, such as the group type to which the user belongs and the usage characteristics of the electronic device. The group type covers characteristics such as age, gender, and hobbies. The usage characteristics of the electronic device can also reflect the user's characteristics: for example, clustering the pictures in the device's album yields the user's preference characteristics, and the videos played on the device or the user's shopping lists in shopping APPs can likewise be clustered to obtain user characteristics.
  • The geographic location information is used to obtain the environment where the electronic device (that is, its user) is currently located; the time information is used to obtain time-related factors such as the light intensity at the time of shooting; the scene information is used to obtain the shooting scene, for example indoor or outdoor scenes, landscapes, sky, night scenes, buildings, and the like.
  • The first feature information determines the salient elements the user pays attention to in the current shooting situation. For example, when the geographic location information is a scenic spot, the time information is daytime, and the scene information is scenery, the user pays more attention to the overall natural scenery, so the salient element should be the sky or landscape occupying a large proportion of the pixels in the first image.
  • Or, when the geographic location is downtown, the shooting time is at night, and the scene is recognized as a night scene, the salient element is a relatively clear and bright object in the environment, such as a luminous logo.
  • Or, when the geographic location is a shopping mall and the scene is recognized as indoor, the salient elements are objects around the user, such as a doll posed with in a photo or food held in the hand.
  • Step 102: Determine a first element recognizer according to the first feature information.
  • The first element recognizer is used to perform salient element recognition on the first image according to the first feature information, so as to obtain the salient elements; saliency detection here is the visual saliency detection described above, which extracts the regions of human interest whose objects are the salient elements.
  • The first element recognizer may be pre-trained, or may be obtained by training on the first image together with training samples; for example, before step 101, the first element recognizer is trained using the first image and preset first sample images.
  • Step 103: Input the first image to the first element recognizer to obtain the salient element output by the first element recognizer.
  • The first element recognizer executes a salient element detection algorithm to detect the salient elements in the first image. The detection algorithm is obtained in advance through machine learning, for example with a convolutional neural network (Convolutional Neural Network, CNN) or a random forest; machine learning yields an element recognizer whose accuracy meets the requirements, which then performs salient element recognition on the first image (a minimal sketch follows below).
  • Specifically, during recognition the first element recognizer performs recognition based on the first feature information of the first image. Obtaining the salient elements in this way makes it possible to recognize salient elements under more sub-information scenarios and to obtain the salient element subject the user's perspective focuses on.
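A minimal sketch of such a machine-learning backend, using a small PyTorch CNN that maps an image to a per-pixel saliency map. The architecture, layer sizes, and threshold are illustrative assumptions, not the patent's actual model.

```python
# Minimal CNN-based salient element recognizer (illustrative assumption).
import torch
import torch.nn as nn

class SaliencyCNN(nn.Module):
    """Maps an RGB image to a per-pixel saliency map with values in [0, 1]."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
        )
        # A 1x1 convolution collapses the feature maps to one saliency channel.
        self.head = nn.Conv2d(32, 1, kernel_size=1)

    def forward(self, x):
        return torch.sigmoid(self.head(self.features(x)))

model = SaliencyCNN()
image = torch.rand(1, 3, 224, 224)   # stand-in for the first image
saliency_map = model(image)          # shape (1, 1, 224, 224)
salient_mask = saliency_map > 0.5    # thresholded salient region
```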
  • In the embodiments of the present application, a first image and first feature information associated with the first image are acquired; a first element recognizer is determined according to the first feature information; and the first image is input to the first element recognizer to obtain the salient element output by the first element recognizer. The first feature information includes at least one of user feature information, geographic location information, time information, and scene information; based on this real-time, dynamic first feature information, the salient elements of the first image are recognized dynamically, avoiding interference from other information in the first image and improving recognition accuracy. The embodiments of the present application thus solve the problem that the salient element detection methods in the prior art are prone to false detection.
  • In an optional embodiment, determining the first element recognizer according to the first feature information includes: acquiring second element recognizers, where each second element recognizer corresponds to one piece of the sub-information and is the recognizer with the highest recognition rate for that sub-information, that is, each piece of sub-information in the first feature information corresponds to one second element recognizer; and then performing fusion training on the second element recognizers to obtain the first element recognizer.
  • In this way, the recognizer with the highest recognition rate for each piece of sub-information is selected, and the selected recognizers are fused to obtain the final first element recognizer. Optionally, the fusion can be performed according to a preset fusion algorithm, such as bootstrapping (bagging), boosting, or stacking; in a bootstrapping approach such as a random forest model, each intermediate recognizer serves as one decision tree in the forest. A voting-based sketch of such a fusion follows below.
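The following is a minimal sketch of such a fusion, combining hypothetical per-sub-information recognizers by majority voting (a bagging-style strategy); the recognizer interface and the labels are assumptions for illustration.

```python
# Fuse several "second element recognizers" into one "first element recognizer"
# by majority voting (illustrative assumption, not the patent's exact scheme).
from collections import Counter

def fuse_recognizers(recognizers):
    """Return a recognizer that votes over the given per-sub-info recognizers."""
    def first_element_recognizer(image):
        votes = [rec(image) for rec in recognizers]   # each vote is a label
        label, _ = Counter(votes).most_common(1)[0]
        return label
    return first_element_recognizer

# Hypothetical recognizers, each best for one piece of sub-information.
recognizers = [
    lambda img: "sky",        # best for the scene sub-information
    lambda img: "sky",        # best for the time sub-information
    lambda img: "building",   # best for the location sub-information
]
first_recognizer = fuse_recognizers(recognizers)
print(first_recognizer(object()))  # -> "sky"
```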
  • In an optional embodiment, the first feature information includes feature information extracted by the element recognizer from the first image, or feature information input by a user.
  • For example, a preset feature recognizer performs feature extraction on the first image to obtain the first feature information. The first feature information may further include feature information input by the user; for example, when inputting the first image, the user simultaneously inputs a certain feature as the first feature information, such as specifying the scene information as sky.
  • Optionally, if input first feature information exists, the first element recognizer preferentially uses it for salient element recognition. For example, when the first element recognizer recognizes the scene information of the first image as outdoor, but the user has manually input the scene information as scenery, the user-input feature is used as the final scene information, so as to meet the user's recognition needs.
  • In an optional embodiment, after the salient element output by the first element recognizer is obtained, the method includes: performing image processing on the first image according to the salient element to obtain a target image.
  • After the salient element is recognized, image processing is performed on the first image, such as composition during shooting, to obtain the target image. The image processing may further include operations such as object recognition, image and video compression, image retrieval, and image retargeting, which are not described again in the embodiments of the present application.
  • In an optional embodiment, determining the first element recognizer according to the first feature information includes: acquiring a preset second element recognizer, the second element recognizer being obtained by training on a first sample image, that is, pre-trained; and training, according to the sub-information of the first feature information, a second element recognizer corresponding to the sub-information.
  • Referring to FIG. 2, the process of training the second element recognizer corresponding to the sub-information is as follows:
  • Step 201: Obtain sample images including the first image, where the sample images include training samples and test samples.
  • The sample images include the first image and second sample images, and are used to train the first element recognizer (the salient element detection algorithm it executes). To improve the prediction accuracy of the first element recognizer, a large number of second sample images can be acquired for training. The training samples are used to train the first element recognizer, while the test samples are used to test the accuracy of the trained recognizer, so as to obtain a first element recognizer that meets the accuracy requirements; a minimal sketch of such a split follows below.
  • Each second sample image includes the image and its salient elements; that is, the salient elements in the second sample images are known. During training of the first element recognizer, the salient elements in the training samples are used to reversely optimize the recognizer; during testing, the salient elements in the test samples are used to judge the accuracy of the recognition results output by the recognizer.
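A minimal sketch of the training/test split and the accuracy check, using scikit-learn with random placeholder features and labels; the model choice is an assumption.

```python
# Split sample images into training and test samples, train a recognizer,
# and judge its accuracy on the test samples (illustrative assumption).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X = np.random.rand(200, 16)        # stand-in image features
y = np.random.randint(0, 3, 200)   # stand-in known salient-element labels

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25)

recognizer = RandomForestClassifier(n_estimators=50).fit(X_train, y_train)

# Only a recognizer whose test accuracy meets the preset requirement is kept.
accuracy = accuracy_score(y_test, recognizer.predict(X_test))
print(f"test accuracy: {accuracy:.2f}")
```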
  • Specifically, salient element detection is visual saliency detection as described above: simulating human visual characteristics, it extracts the salient regions of an image, the regions of human interest, whose objects are the salient elements.
  • The first element recognizer executes a salient element detection algorithm to detect the salient elements in an image. The algorithm is obtained in advance through machine learning, for example a convolutional neural network or a random forest; machine learning training yields a first element recognizer with the required accuracy, which performs salient element recognition on the image.
  • Step 202: Train an initial recognizer according to the training samples to obtain a recognizer to be tested.
  • Each training sample is input into the initial recognizer to obtain a recognition result, the initial recognizer is reversely optimized according to the known salient elements, and the process then iterates over the next training sample; the loop continues until a recognizer to be tested that meets the accuracy requirement is obtained. The initial recognizer is determined by the machine learning algorithm adopted, for example an initial random forest model.
  • Specifically, during recognition the first element recognizer performs recognition based on the first feature information of the first image, where the first feature information includes at least one of geographic location information, time information, and scene information. Optionally, the first feature information may be obtained by the first element recognizer recognizing the first image, or may be a feature input to the target recognizer together with the first image.
  • The geographic location information is used to obtain the environment where the electronic device (that is, its user) is currently located; the time information is used to obtain time-related factors such as the light intensity at the time of shooting; the scene information is used to obtain the shooting scene, for example indoor or outdoor scenes, landscapes, sky, night scenes, buildings, and the like.
  • As described for step 101, the first feature information determines the salient elements the user pays attention to in the current shooting situation: a scenic spot in daytime with a scenery scene points to the sky or landscape occupying a large share of the pixels; a downtown night scene points to a relatively clear and bright object such as a luminous logo; an indoor shopping-mall scene points to objects around the user, such as a doll posed with in a photo or food held in the hand.
  • Performing salient element recognition on the first image in this way makes it possible to recognize salient elements in many scenes and to obtain the salient element subject the user's perspective focuses on.
  • In an optional embodiment, an ensemble learning method can be used to train multiple intermediate recognizers that solve the same problem, and to combine them into the final element recognizer for better recognition results. For example, a random forest is an ensemble learning model composed of multiple decision tree classifiers.
  • In the process of training the first element recognizer, the same weight is first randomly assigned to each feature value to obtain intermediate recognizers; then, according to a large number of known salient elements and first-feature-information values, the intermediate recognizers are repeatedly classified and voted on, finally producing the set of weight values with the highest accuracy (in this set, each feature value corresponds to its own weight). The recognizer to be tested is formed from this set of weight values and the corresponding intermediate recognizers.
  • Concretely, a bootstrap resampling technique repeatedly and randomly selects k sample images from the sample image set, with replacement, to generate a new bootstrap sample image set, from which k classification trees are generated (a minimal sketch follows below). In a random forest, each tree is built on an independently drawn sample, all trees in the forest share the same distribution, and the classification error depends on the classification ability of each tree and the correlation between trees. For each feature, the nodes are split randomly, the errors produced in different situations are compared, and the intrinsic estimation error, classification ability, and correlation can be measured to determine the number of selected features.
  • The classification ability of a single tree may be small, but after a large number of decision trees are generated randomly, the classification ability inevitably increases, and the most likely classification is selected by statistics. Through a large amount of classification and regression training, the set of weight values with the highest accuracy is finally obtained as the weight values of the recognizer to be tested.
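A minimal sketch of the bootstrap resampling described above: k samples drawn with replacement train each tree of a random-forest-style ensemble that classifies by majority vote. The data, k, and tree settings are illustrative assumptions.

```python
# Bootstrap resampling: train each decision tree on k samples drawn with
# replacement, then classify by majority vote (illustrative assumption).
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.random((100, 8))       # stand-in sample-image features
y = rng.integers(0, 2, 100)    # stand-in salient-element labels

k, n_trees, trees = 100, 10, []
for _ in range(n_trees):
    idx = rng.integers(0, len(X), size=k)   # k indices, with replacement
    trees.append(DecisionTreeClassifier(max_features="sqrt").fit(X[idx], y[idx]))

# Majority vote over the independently trained trees.
votes = np.stack([tree.predict(X) for tree in trees])
prediction = (votes.mean(axis=0) > 0.5).astype(int)
```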
  • Step 203: Test the recognizer to be tested according to the test samples to obtain the first element recognizer.
  • The recognizer to be tested is tested with the test samples; if its accuracy rate meets the preset requirement, the first element recognizer is obtained.
  • In an optional embodiment, training the initial recognizer includes: inputting the training samples into the initial element recognizer to obtain the first feature information of the training samples, where the initial element recognizer performs feature extraction on the training samples to obtain initial features, and clusters the initial features according to a preset clustering algorithm to obtain the features.
  • The preset clustering algorithm may be K-Means clustering or density-based clustering, among others; processing the initial features in this way to obtain the first feature information improves the recognition accuracy of the initial recognizer. A K-Means sketch follows below.
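A minimal K-Means sketch of this clustering step, with random placeholder features; the number of clusters and the use of cluster centres as the aggregated feature information are assumptions.

```python
# Cluster the initial features with K-Means to obtain the first feature
# information (illustrative assumption).
import numpy as np
from sklearn.cluster import KMeans

initial_features = np.random.rand(500, 32)   # features from the training samples

kmeans = KMeans(n_clusters=5, n_init=10).fit(initial_features)

# Each cluster centre summarises one group of initial features.
first_feature_information = kmeans.cluster_centers_
cluster_of_each_sample = kmeans.labels_
```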
  • In an optional embodiment, training the initial recognizer to obtain the recognizer to be tested includes: training the initial recognizer to obtain multiple recognizers to be fused, where each recognizer to be fused corresponds to one salient element and is used to recognize that salient element; and training the recognizer to be tested from the recognizers to be fused.
  • In the process of training the recognizer to be tested, multiple intermediate recognizers are first obtained from the training samples and the initial recognizer; from the intermediate recognizers, the recognizer to be fused with the highest recognition rate is selected for each salient element (a selection sketch follows below), the salient elements here being those recognized by the initial recognizer from the training samples. After the recognizer to be fused for each salient element is obtained, all recognizers to be fused undergo fusion training to obtain the recognizer to be tested.
  • The fusion can be performed according to a preset fusion algorithm, such as bootstrapping (bagging), boosting, or stacking; in a bootstrapping approach such as the above-mentioned random forest model, each intermediate recognizer serves as one decision tree in the forest.
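A minimal sketch of the per-element selection: for each salient element, keep the intermediate recognizer with the highest recognition rate, then hand the winners to the fusion step. The recognizer names and rates are illustrative assumptions.

```python
# Pick, for each salient element, the intermediate recognizer with the highest
# recognition rate (illustrative assumption).
intermediate_recognizers = {
    "rec_a": {"sky": 0.91, "person": 0.62, "logo": 0.40},
    "rec_b": {"sky": 0.70, "person": 0.88, "logo": 0.55},
    "rec_c": {"sky": 0.65, "person": 0.60, "logo": 0.83},
}

def best_per_element(recognizers):
    """For every salient element, keep the recognizer with the top rate."""
    elements = next(iter(recognizers.values())).keys()
    return {
        element: max(recognizers, key=lambda name: recognizers[name][element])
        for element in elements
    }

to_be_fused = best_per_element(intermediate_recognizers)
print(to_be_fused)   # {'sky': 'rec_a', 'person': 'rec_b', 'logo': 'rec_c'}
```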
  • In a first example, FIG. 3 shows the main flow of a salient element recognition method, including the following steps:
  • Step 301: Extract the first feature information from the training samples, such as geographic location information and time information; some other features, such as scene information, are extracted according to the information on the image.
  • Step 302: Cluster the first feature information with a clustering method to determine the salient elements.
  • The first feature information may include geographic location information, time information, and scene information, or a combination of the three, to determine different salient elements. The geographic location information obtains the current environment of the electronic device (that is, its user); the time information obtains time-related factors such as light intensity during shooting; the scene information obtains the shooting scene, for example indoor or outdoor scenes, landscapes, sky, night scenes, or buildings. Combining the three features determines more accurately the salient elements the user pays attention to in the current photographing scene.
  • Step 303: Train the intermediate recognizers. For each salient element (or class), a training sample set containing that salient element is selected, and an intermediate recognizer with a higher recognition rate for that salient element is obtained by training.
  • Step 304: Fuse the intermediate recognizers to obtain the element recognizer. During the fusion training, the first feature information and the corresponding salient elements are received; since the salient elements can guide the training of the element recognizer, the element recognizer can output recognition results for specific salient elements according to different user characteristics.
  • Step 305: Apply the element recognizer. In actual application, the element recognizer receives the first image as input, extracts the first feature information as a reference input, and outputs the salient elements of the first image. An end-to-end sketch of this flow follows below.
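An end-to-end sketch of steps 301 to 305 at inference time, under the assumption that recognizers are looked up by the extracted sub-information; the extractor, registry, and labels are hypothetical.

```python
# End-to-end application of the element recognizer (illustrative assumption).
def extract_first_feature_information(image):
    # Stand-in for real extraction of location / time / scene sub-information.
    return {"location": "scenic_area", "time": "daytime", "scene": "landscape"}

RECOGNIZERS = {
    ("scenic_area", "daytime", "landscape"): lambda img: "sky",
    ("downtown", "night", "night_scene"): lambda img: "luminous_logo",
}

def recognize_salient_element(image):
    info = extract_first_feature_information(image)
    key = (info["location"], info["time"], info["scene"])
    recognizer = RECOGNIZERS.get(key, lambda img: "person")  # fallback
    return recognizer(image)

print(recognize_salient_element(object()))  # -> "sky"
```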
  • In summary, in the embodiments of the present application, a first image and first feature information associated with the first image are acquired; a first element recognizer is determined according to the first feature information; and the first image is input to the first element recognizer to obtain the salient element output by the first element recognizer, where the sub-information of the first feature information includes at least one of user feature information, geographic location information, time information, and scene information. Based on the real-time, dynamic first feature information, the salient elements of the first image are recognized dynamically, avoiding interference from other information in the first image and improving recognition accuracy; the embodiments of the present application solve the problem that the salient element detection methods in the prior art are prone to false detection.
  • It should be noted that the execution subject of the salient element recognition method provided by the embodiments of the present application may be a salient element recognition apparatus, or a control module in that apparatus for executing the method. In the embodiments of the present application, the method is described by taking the salient element recognition apparatus performing it as an example.
  • Referring to FIG. 4, an embodiment of the present application further provides a salient element recognition apparatus 400, including:
  • an acquisition module 401, configured to acquire a first image and first feature information associated with the first image;
  • a determining module 402, configured to determine a first element recognizer according to the first feature information;
  • a recognition module 403, configured to input the first image to the first element recognizer and obtain the salient element output by the first element recognizer, where the sub-information of the first feature information includes at least one of user feature information, geographic location information, time information, and scene information.
  • The meanings of the first image (the stored third image or the previewed second image), of the first feature information and its sub-information (user feature, geographic location, time, and scene information), and of the first element recognizer, as well as the machine-learning training of the salient element detection algorithm (for example with a convolutional neural network or a random forest), are the same as described above for steps 101 to 103 and are not repeated here.
  • Optionally, the determining module 402 includes: a recognizer acquisition sub-module, configured to acquire second element recognizers, where each second element recognizer corresponds to one piece of the sub-information and is the recognizer with the highest recognition rate for that sub-information; and a fusion sub-module, configured to perform fusion training on the second element recognizers to obtain the first element recognizer.
  • Optionally, the recognizer acquisition sub-module is configured to: acquire a preset second element recognizer, the second element recognizer being obtained by training on a first sample image; and train, according to the sub-information of the first feature information, a second element recognizer corresponding to the sub-information.
  • Optionally, the sub-information includes feature information extracted from the first image, or feature information input by a user.
  • Optionally, the first image includes a second image displayed on a shooting preview interface of the electronic device, or a third image stored in the electronic device.
  • In the embodiments of the present application, the acquisition module 401 acquires the first image and the first feature information associated with it; the determining module 402 determines the first element recognizer according to the first feature information; and the recognition module 403 inputs the first image to the first element recognizer and obtains the salient element it outputs, where the sub-information of the first feature information includes at least one of user feature information, geographic location information, time information, and scene information. Based on the real-time, dynamic first feature information, the salient elements of the first image are recognized dynamically, avoiding interference from other information in the first image and improving recognition accuracy; the embodiments of the present application solve the problem that the salient element detection methods in the prior art are prone to false detection.
  • The salient element recognition apparatus 400 in the embodiments of the present application may be a device, or may be a component, an integrated circuit, or a chip in a terminal. The apparatus 400 may be a mobile electronic device or a non-mobile electronic device. For example, the mobile electronic device may be a mobile phone, a tablet computer, a notebook computer, a palmtop computer, an in-vehicle electronic device, a wearable device, an ultra-mobile personal computer (UMPC), a netbook, or a personal digital assistant (PDA); the non-mobile electronic device may be a server, a network attached storage (NAS), a personal computer (PC), a television (TV), a teller machine, a self-service machine, or the like, which is not specifically limited in the embodiments of the present application.
  • The salient element recognition apparatus 400 in the embodiments of the present application may be an apparatus having an operating system. The operating system may be an Android operating system, an iOS operating system, or another possible operating system, which is not specifically limited in the embodiments of the present application.
  • The salient element recognition apparatus 400 provided in the embodiments of the present application can implement each process implemented by the apparatus in the method embodiments of FIG. 1 to FIG. 3; to avoid repetition, details are not repeated here.
  • Referring to FIG. 5, an embodiment of the present application further provides an electronic device 500, including a processor 501, a memory 502, and a program or instruction stored in the memory 502 and executable on the processor 501. When the program or instruction is executed by the processor 501, each process of the above salient element recognition method embodiments can be realized with the same technical effect; to avoid repetition, details are not repeated here.
  • It should be noted that the electronic devices in the embodiments of the present application include the aforementioned mobile and non-mobile electronic devices.
  • FIG. 6 is a schematic diagram of the hardware structure of an electronic device 600 for implementing various embodiments of the present application. The electronic device 600 includes, but is not limited to: a radio frequency unit 601, a network module 602, an audio output unit 603, an input unit 604, a sensor 605, a display unit 606, a user input unit 607, an interface unit 608, a memory 609, a processor 610, a power supply 611, and other components.
  • Those skilled in the art can understand that the electronic device 600 may also include a power source (such as a battery) for supplying power to the various components; the power source may be logically connected to the processor 610 through a power management system, so that the power management system implements functions such as charging, discharging, and power consumption management. The structure of the electronic device shown in FIG. 6 does not constitute a limitation on the electronic device; the electronic device may include more or fewer components than shown, combine some components, or arrange the components differently, which is not repeated here.
  • The processor 610 is configured to: acquire a first image and first feature information associated with the first image; determine a first element recognizer according to the first feature information; and input the first image to the first element recognizer to obtain the salient element output by the first element recognizer; wherein the sub-information of the first feature information includes at least one of user feature information, geographic location information, time information, and scene information.
  • Optionally, the processor 610 is configured to acquire second element recognizers, where each second element recognizer corresponds to one piece of the sub-information and is the recognizer with the highest recognition rate for that sub-information.
  • Optionally, the processor 610 is configured to: acquire a preset second element recognizer, the second element recognizer being obtained by training on a first sample image; and train, according to the sub-information of the first feature information, a second element recognizer corresponding to the sub-information.
  • Optionally, the sub-information includes feature information extracted from the first image, or feature information input by a user.
  • Optionally, the first image includes a second image displayed on a shooting preview interface of the electronic device, or a third image stored in the electronic device.
  • In the embodiments of the present application, a first image and associated first feature information are acquired, a first element recognizer is determined according to the first feature information, and the first image is input to that recognizer to obtain the salient element it outputs. Based on the real-time, dynamic first feature information, the salient elements of the first image are recognized dynamically, avoiding interference from other information in the first image and improving recognition accuracy; the embodiments of the present application solve the problem that the salient element detection methods in the prior art are prone to false detection.
  • It should be understood that, in the embodiments of the present application, the input unit 604 may include a graphics processing unit (Graphics Processing Unit, GPU) 6041 and a microphone 6042; the graphics processing unit 6041 processes image data of still pictures or video obtained by an image capture device (such as a camera) in a video capture mode or an image capture mode.
  • The display unit 606 may include a display panel 6061, which may be configured in the form of a liquid crystal display, an organic light-emitting diode, or the like. The user input unit 607 includes a touch panel 6071, also called a touch screen, and other input devices 6072. The touch panel 6071 may include two parts: a touch detection device and a touch controller. The other input devices 6072 may include, but are not limited to, a physical keyboard, function keys (such as volume control keys and switch keys), a trackball, a mouse, and a joystick, which are not described here again.
  • The memory 609 may be used to store software programs as well as various data, including but not limited to application programs and an operating system. The processor 610 may integrate an application processor, which mainly handles the operating system, user interface, application programs, and the like, and a modem processor, which mainly handles wireless communication. It can be understood that the modem processor may alternatively not be integrated into the processor 610.
  • Embodiments of the present application further provide a readable storage medium, on which a program or instruction is stored; when the program or instruction is executed by a processor, each process of the above salient element recognition method embodiments can be implemented with the same technical effect, which is not repeated here to avoid repetition. The processor is the processor in the electronic device described in the foregoing embodiments. The readable storage medium includes a computer-readable storage medium, such as a computer read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk, or an optical disk.
  • An embodiment of the present application further provides a chip, which includes a processor and a communication interface; the communication interface is coupled to the processor, and the processor is configured to run a program or instruction to implement each process of the above salient element recognition method embodiments, which is not repeated here to avoid repetition. It should be understood that the chip mentioned in the embodiments of the present application may also be referred to as a system-on-chip or system-on-a-chip.
  • Through the description of the above embodiments, those skilled in the art can clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and of course also by hardware, but in many cases the former is the better implementation. Based on this understanding, the technical solution of the present application, in essence or in the part contributing to the prior art, can be embodied in the form of a software product stored in a storage medium (such as a ROM/RAM, a magnetic disk, or a CD-ROM) and including several instructions to make a terminal (which may be a mobile phone, a computer, a server, an air conditioner, a network device, or the like) execute the methods described in the various embodiments of this application.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The present application discloses a salient element recognition method and apparatus. The method includes: acquiring a first image and first feature information associated with the first image; determining a first element recognizer according to the first feature information; and inputting the first image to the first element recognizer to obtain the salient element output by the first element recognizer; wherein the sub-information of the first feature information includes at least one of user feature information, geographic location information, time information, and scene information.

Description

Salient element recognition method and apparatus
CROSS-REFERENCE TO RELATED APPLICATIONS
This application claims priority to Chinese Patent Application No. 202011020259.4, filed in China on September 24, 2020, the entire contents of which are incorporated herein by reference.
TECHNICAL FIELD
The present application belongs to the field of mobile communications, and in particular relates to a salient element recognition method and apparatus.
BACKGROUND
With the rapid development of mobile communication technology, electronic devices, led by smartphones, have become indispensable tools in every aspect of people's lives. The functions of the various applications (Application, APP) on electronic devices have also gradually matured; they no longer merely serve communication, but increasingly provide users with all kinds of intelligent services, bringing great convenience to users' work and lives.
As far as the shooting function is concerned, the camera of an electronic device usually performs salient element detection on the captured picture. For example, in a portrait composition scene, the camera first performs salient element detection on the person and other subjects in the image to obtain the positions of the salient elements; in the subsequent process of recommending a composition position, those positions can be referenced for further processing to obtain a better result. The result of salient element detection directly affects the probability of the salient subjects (the person or other subjects) being falsely detected, missed, or cropped during portrait composition; improving the detection rate of saliency detection is therefore of great significance for judging the positional relationship between the person and the other salient subjects.
At present, salient element detection mainly relies on traditional image processing: relatively low-level visual features in the image, such as color, brightness, contrast, and edge information, are combined and processed to simulate the human visual attention mechanism and obtain the salient elements. In this scheme, some low-level first feature information of the image has to be designed manually for combination; when too much interference information exists in the image, the method lacks universality and is prone to false detection, which in turn affects the accuracy of salient element recognition.
SUMMARY
The purpose of the embodiments of the present application is to provide a salient element recognition method and apparatus, which can solve the problem that the salient element detection methods in the prior art are prone to false detection.
To solve the above technical problem, the present application is implemented as follows:
In a first aspect, an embodiment of the present application provides a salient element recognition method, the method comprising:
acquiring a first image and first feature information associated with the first image;
determining a first element recognizer according to the first feature information; and
inputting the first image to the first element recognizer to obtain the salient element output by the first element recognizer; wherein the sub-information of the first feature information includes at least one of user feature information, geographic location information, time information, and scene information.
Optionally, determining the first element recognizer according to the first feature information includes:
acquiring second element recognizers, wherein each second element recognizer corresponds to one piece of the sub-information and is the recognizer with the highest recognition rate for that sub-information; and
performing fusion training on each of the second element recognizers to obtain the first element recognizer.
Optionally, acquiring the second element recognizers includes:
acquiring a preset second element recognizer, the second element recognizer being obtained by training on a first sample image; and
training, according to the sub-information of the first feature information, a second element recognizer corresponding to the sub-information.
Optionally, the sub-information includes feature information extracted from the first image, or feature information input by a user.
Optionally, the first image includes a second image displayed on a shooting preview interface of the electronic device, or a third image stored in the electronic device.
In a second aspect, an embodiment of the present application further provides a salient element recognition apparatus, the apparatus including:
an acquisition module, configured to acquire a first image and first feature information associated with the first image;
a determining module, configured to determine a first element recognizer according to the first feature information; and
a recognition module, configured to input the first image to the first element recognizer and obtain the salient element output by the first element recognizer; wherein the sub-information of the first feature information includes at least one of user feature information, geographic location information, time information, and scene information.
Optionally, the determining module includes:
a recognizer acquisition sub-module, configured to acquire second element recognizers, wherein each second element recognizer corresponds to one piece of the sub-information and is the recognizer with the highest recognition rate for that sub-information; and
a fusion sub-module, configured to perform fusion training on each of the second element recognizers to obtain the first element recognizer.
Optionally, the recognizer acquisition sub-module is configured to:
acquire a preset second element recognizer, the second element recognizer being obtained by training on a first sample image; and
train, according to the sub-information of the first feature information, a second element recognizer corresponding to the sub-information.
Optionally, the sub-information includes feature information extracted from the first image, or feature information input by a user.
Optionally, the first image includes a second image displayed on a shooting preview interface of the electronic device, or a third image stored in the electronic device.
In a third aspect, an embodiment of the present application further provides an electronic device, the electronic device including a memory, a processor, and a program or instruction stored in the memory and executable on the processor, wherein the processor, when executing the program or instruction, implements the steps of the salient element recognition method described above.
In a fourth aspect, an embodiment of the present application further provides a readable storage medium, on which a program or instruction is stored, wherein the program or instruction, when executed by a processor, implements the steps of the salient element recognition method described above.
In a fifth aspect, an embodiment of the present application provides a chip, the chip including a processor and a communication interface, the communication interface being coupled to the processor, the processor being configured to run a program or instruction to implement the method described in the first aspect.
In the embodiments of the present application, a first image and first feature information associated with the first image are acquired; a first element recognizer is determined according to the first feature information; and the first image is input to the first element recognizer to obtain the salient element output by the first element recognizer, wherein the sub-information of the first feature information includes at least one of user feature information, geographic location information, time information, and scene information. Based on the real-time, dynamic first feature information, the salient elements of the first image are recognized dynamically, which avoids interference from other information in the first image and improves recognition accuracy.
BRIEF DESCRIPTION OF THE DRAWINGS
To describe the technical solutions of the embodiments of the present application more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present application; for those of ordinary skill in the art, other drawings can be obtained from these drawings without creative effort.
FIG. 1 shows a flowchart of the salient element recognition method provided by an embodiment of the present application;
FIG. 2 shows a flowchart of training a second element recognizer provided by an embodiment of the present application;
FIG. 3 shows a flowchart of a first example of an embodiment of the present application;
FIG. 4 shows a block diagram of the salient element recognition apparatus provided by an embodiment of the present application;
FIG. 5 shows a first block diagram of the electronic device provided by an embodiment of the present application;
FIG. 6 shows a second block diagram of the electronic device provided by an embodiment of the present application.
DETAILED DESCRIPTION
The technical solutions in the embodiments of the present application will be described clearly and completely below with reference to the drawings in the embodiments of the present application. Obviously, the described embodiments are only some, not all, of the embodiments of the present application. Based on the embodiments in the present application, all other embodiments obtained by those of ordinary skill in the art without creative effort fall within the protection scope of the present application.
It should be understood that "one embodiment" or "an embodiment" mentioned throughout the specification means that a particular feature, structure, or characteristic related to the embodiment is included in at least one embodiment of the present application. Therefore, "in one embodiment" or "in an embodiment" appearing throughout the specification does not necessarily refer to the same embodiment. Furthermore, these particular features, structures, or characteristics may be combined in one or more embodiments in any suitable manner. In the various embodiments of the present application, it should be understood that the sequence numbers of the following processes do not imply an order of execution; the execution order of the processes should be determined by their functions and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present application.
The terms "first", "second", and the like in the specification and claims of the present application are used to distinguish similar objects, not to describe a specific order or sequence. It should be understood that data used in this way are interchangeable under appropriate circumstances, so that the embodiments of the present application can be implemented in orders other than those illustrated or described here. In addition, "and/or" in the specification and claims denotes at least one of the connected objects, and the character "/" generally indicates an "or" relationship between the associated objects.
The salient element recognition method provided by the embodiments of the present application is described in detail below through specific embodiments and application scenarios with reference to the drawings.
参见图1,本申请一实施例提供了一种显著性元素识别方法,可选地,所述方法可应用于电子设备,所述电子设备包括各种手持设备、车载设备、可穿戴设备、计算设备或连接到无线调制解调器的其它处理设备,以及各种形式的移动台(Mobile Station,MS),终端设备(Terminal Device)等等。
所述方法包括:
步骤101,获取第一图像和与所述第一图像相关联的第一特征信息。
其中,待识别的第一图像即待进行显著性元素识别的图像;可选地,所述第一图像包括电子设备中存储的第三图像或所述电子设备的拍摄预览界面显示的第二图像;第三图像即电子设备中已经存储的图像,比如电子设备已经拍摄完成的图片,或电子设备接收的图片,电子设备在对第一图像进行图像处理或优化的过程中,可以对第三图像进行显著性元素识别。第二图像即电子设备的拍摄预览界面所形成的图像,比如电子设备在拍摄图像或视频的过程中,对拍摄预览界面进行实时的显著性元素检测,以获得画面中的显著性元素,进而根据显著性元素的相关信息进行构图。
具体地,显著性元素检测即视觉显著性检测,通过模拟人的视觉特点,提取图像中的显著区域,显著区域即人类感兴趣的区域。显著区域内的对象即显著性元素。通常情况下,人类视觉系统在面对自然场景时,具有快速搜索和定位其所感兴趣目标的能力,这种视觉注意机制是人们日常生活中处理视觉信息的重要机制。显著性检测在目标识别、图像视频压缩、图像检索、图像重定向等领域中有着重要的应用价值。
The sub-information of the first feature information includes at least one of user feature information, geographic location information, time information, and scene information. Optionally, the first feature information may be feature information extracted from the first image, or feature information input by a user.
User feature information includes user attribute information, such as the demographic group the user belongs to and the usage characteristics of the electronic device. The demographic group covers features such as age, gender, and hobbies. Usage characteristics of the electronic device can likewise reflect the user's traits: for example, the pictures in the device's photo album can be clustered to obtain the user's preference features, and the videos played on the device or the user's shopping lists in shopping apps can also be clustered to obtain user features.
Geographic location information is used to obtain the environment in which the electronic device (that is, its user) is currently located; time information is used to obtain time-related factors such as the light intensity at the time of shooting; scene information is used to obtain the shooting scene, for example indoor or outdoor, landscape, sky, night scene, or buildings.
The first feature information is used to judge which salient elements the user cares about in the current shooting situation. For example, when the geographic location is a scenic area, the time is daytime, and the scene is landscape, the user is mainly interested in the overall natural scenery; the salient elements should then be scenery that occupies a large proportion of the pixels in the first image, such as the sky or mountains and water.
Alternatively, when the location is an urban area, the shooting time is night, and the scene is recognized as a night scene, the salient elements are the relatively clear and bright objects in the environment, such as an illuminated logo.
Alternatively, when the location is a shopping mall and the scene is recognized as indoor, the salient elements are objects near the user, such as a doll posed with for a photo or food held in the hand.
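As an illustrative, non-limiting sketch of the mapping just described, the rule table below maps a (location, time, scene) combination to the element categories a user is likely to care about. The function name, category labels, and concrete rules are assumptions introduced for illustration; the disclosure does not prescribe a specific rule set.

def saliency_prior(location: str, time_of_day: str, scene: str) -> list[str]:
    """Return the salient element categories the user is likely to care about."""
    if location == "scenic_area" and time_of_day == "day" and scene == "landscape":
        # Wide natural scenery: favor large-area regions such as sky or mountains.
        return ["sky", "mountains", "water"]
    if location == "urban" and time_of_day == "night" and scene == "night_scene":
        # Night shots: favor bright, high-contrast objects such as lit signage.
        return ["illuminated_logo", "light_source"]
    if location == "mall" and scene == "indoor":
        # Indoor shots: favor objects close to the user.
        return ["person", "handheld_object", "doll"]
    return ["person"]  # default prior: people are the most common subject

print(saliency_prior("scenic_area", "day", "landscape"))  # ['sky', 'mountains', 'water']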
Step 102: determine a first element recognizer according to the first feature information.
The first element recognizer is used to perform salient element recognition on the first image according to the first feature information, obtaining the salient elements.
The first element recognizer may be pre-trained, or may be trained from the first image together with training samples; for example, before step 101, the first element recognizer is trained using the first image and preset first sample images.
Step 103: input the first image into the first element recognizer to obtain the salient element output by the first element recognizer.
The first element recognizer executes a salient element detection algorithm to detect the salient elements in the first image. The salient element detection algorithm is obtained in advance through machine learning, for example a convolutional neural network (CNN) or a random forest; an element recognizer whose accuracy meets the requirements is trained through machine learning and used to perform salient element recognition on the first image.
Specifically, during recognition, the first element recognizer (the salient element detection algorithm) recognizes on the basis of the first feature information of the first image to obtain the salient elements; it can recognize salient elements under a wide range of sub-information scenarios and identify the salient subject as seen from the user's perspective.
In the embodiments of the present application, a first image and first feature information associated with the first image are acquired; a first element recognizer is determined according to the first feature information; and the first image is input into the first element recognizer to obtain the salient element output by the first element recognizer, where the sub-information of the first feature information includes at least one of user feature information, geographic location information, time information, and scene information. Recognizing the salient elements of the first image dynamically, on the basis of real-time and dynamic first feature information, avoids interference from other distracting information in the first image and improves recognition accuracy. The embodiments of the present application thus solve the problem that existing salient element detection approaches are prone to false detections.
In an optional embodiment, determining the first element recognizer according to the first feature information includes:
acquiring second element recognizers, where each second element recognizer corresponds to one piece of the sub-information and is the recognizer with the highest recognition rate for that sub-information; that is, each piece of sub-information in the first feature information corresponds to one second element recognizer; and
then fusion-training the second element recognizers to obtain the first element recognizer.
In this way, the recognizer with the highest recognition rate for each piece of sub-information is selected, and the selected recognizers are fused to obtain the final first element recognizer. Optionally, the fusion can follow a preset fusion algorithm, such as bootstrapping, boosting, or stacking; a bootstrapping example is the random forest model, in which each intermediate recognizer serves as a decision tree in the forest.
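The snippet below is a minimal sketch of one plausible reading of the fusion step above: the per-sub-information recognizers are combined by weighted soft voting over their saliency probabilities. The recognizer names, the weights, and the two-class (salient/background) setup are illustrative assumptions.

import numpy as np

def soft_vote(probas: list, weights: list) -> np.ndarray:
    """Weighted average of per-recognizer class probabilities."""
    w = np.asarray(weights, dtype=float)
    w /= w.sum()
    stacked = np.stack(probas)               # (n_recognizers, n_regions, n_classes)
    return np.tensordot(w, stacked, axes=1)  # fused (n_regions, n_classes)

# e.g. saliency probabilities from a scene-specific and a location-specific recognizer
p_scene = np.array([[0.2, 0.8], [0.9, 0.1]])
p_location = np.array([[0.4, 0.6], [0.7, 0.3]])
fused = soft_vote([p_scene, p_location], weights=[0.6, 0.4])
labels = fused.argmax(axis=1)  # 1 = salient, 0 = background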
In an optional embodiment, the first feature information includes feature information extracted from the first image by the element recognizer, or feature information input by a user.
The first feature information may include features extracted from the first image by the element recognizer, for example by performing feature extraction on the first image with a preset feature recognizer to obtain the first feature information. The first feature information may also include feature information input by the user: for example, when providing the first image, the user simultaneously inputs some feature as the first feature information, such as specifying that the scene information is sky.
Optionally, if first feature information has been input by the user, the first element recognizer gives priority to the input first feature information when performing salient element recognition. For example, if the first element recognizer identifies the scene information of the first image as outdoor while the user has manually input the scene information as landscape, the user-input feature is taken as the final scene information, so as to satisfy the user's recognition needs.
In an optional embodiment, after obtaining the salient element output by the first element recognizer, the method includes:
performing image processing on the first image according to the salient element to obtain a target image.
After the salient elements have been recognized, image processing is performed on the first image, for example composition operations during shooting, to obtain the target image. Image processing may also include object recognition, image and video compression, image retrieval, image retargeting, and other processing operations, which are not detailed further here.
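As one concrete example of such processing, the sketch below crops the first image around a detected salient element, a simple stand-in for the composition operation mentioned above. The bounding-box format and the margin value are illustrative assumptions.

import numpy as np

def crop_around_salient(img: np.ndarray, box: tuple, margin: float = 0.2) -> np.ndarray:
    """img: HxWxC array; box: (x0, y0, x1, y1) of the salient element."""
    h, w = img.shape[:2]
    x0, y0, x1, y1 = box
    dx, dy = (x1 - x0) * margin, (y1 - y0) * margin  # keep some context around it
    x0, y0 = max(0, int(x0 - dx)), max(0, int(y0 - dy))
    x1, y1 = min(w, int(x1 + dx)), min(h, int(y1 + dy))
    return img[y0:y1, x0:x1]

frame = np.zeros((480, 640, 3), dtype=np.uint8)          # stand-in for the first image
crop = crop_around_salient(frame, (200, 120, 440, 360))  # -> shape (336, 336, 3)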
In an optional embodiment, determining the first element recognizer according to the first feature information includes:
acquiring preset second element recognizers, which have been trained on first sample images; that is, the second element recognizers are trained in advance; and
training, according to the sub-information of the first feature information, the second element recognizer corresponding to that sub-information.
Referring to Fig. 2, the process of training the second element recognizer corresponding to the sub-information, according to the sub-information of the first feature information, is as follows:
Step 201: acquire sample images including the first image, where the sample images include training samples and test samples.
The sample images include the first image and second sample images.
The sample images are used to train the first element recognizer (the salient element detection algorithm it executes). In step 201, sample images are acquired; to improve the prediction accuracy of the first element recognizer, a large number of second sample images can be acquired for training. The sample images include training samples and test samples: the training samples are used to train the first element recognizer, while the test samples are used to test the accuracy of the trained first element recognizer, so as to obtain a first element recognizer that meets the accuracy requirement. Each second sample image includes an image together with the salient elements of that image; that is, the salient elements of the second sample images are known. During training, the salient elements in the training samples are used for backward optimization of the first element recognizer; during testing, the salient elements in the test samples are used to judge the accuracy of the recognition results output by the first element recognizer.
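The sketch below shows one simple way to form the training/test division of labeled sample images described in step 201. The 80/20 ratio and the pair layout are illustrative assumptions.

import random

def split_samples(samples: list, test_ratio: float = 0.2, seed: int = 0):
    """samples: list of (image, salient_elements) pairs with known annotations."""
    shuffled = samples[:]                        # leave the caller's list untouched
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]  # (training samples, test samples)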
As described above, the first element recognizer executes a salient element detection algorithm obtained in advance through machine learning, for example a convolutional neural network or a random forest; a first element recognizer whose accuracy meets the requirements is trained through machine learning and used to perform salient element recognition on images.
Step 202: train an initial recognizer according to the training samples to obtain a to-be-tested recognizer.
The training samples are input into the initial recognizer to obtain recognition results, the initial recognizer is backward-optimized according to the known salient elements, and the next training sample is then iterated; this loop continues until a to-be-tested recognizer that meets the accuracy requirement is obtained. The initial recognizer is determined by the machine learning algorithm adopted, for example an initial random forest model.
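Written as a generic supervised loop, the iterate-and-optimize procedure of step 202 might look like the sketch below. The model, loss, and data loader are illustrative assumptions; the disclosure leaves the learner open (for example, a CNN or a random forest).

import torch
import torch.nn as nn

def train_initial_recognizer(model, loader, epochs=10, lr=1e-3):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.BCEWithLogitsLoss()           # per-pixel salient / not-salient
    for _ in range(epochs):
        for images, salient_masks in loader:   # known salient elements as masks
            opt.zero_grad()
            loss = loss_fn(model(images), salient_masks)
            loss.backward()                    # the "backward optimization" step
            opt.step()
    return model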
Specifically, during recognition, the first element recognizer recognizes on the basis of the first feature information of the first image, the first feature information including at least one of geographic location information, time information, and scene information. Optionally, the first feature information may be obtained by the first element recognizer recognizing the first image, or may be a feature input to the recognizer together with the first image.
The roles of the geographic location, time, and scene information, and the way they jointly indicate the salient elements the user cares about (a scenic area in daytime with a landscape scene, an urban area at night with a night scene, or a shopping mall with an indoor scene), are the same as described for step 101 above and are not repeated here.
By performing salient element recognition on the first image according to the first feature information, the salient elements are obtained; salient elements can be recognized in a wider range of scenes, yielding the salient subject as seen from the user's perspective.
Optionally, in the process of obtaining the first element recognizer through machine learning, ensemble learning can be adopted: multiple intermediate recognizers that solve the same problem are trained and then combined into the final element detector to obtain better recognition results. Take random forest as an example of the ensemble learning process: a random forest is an ensemble learning model composed of multiple decision-tree classifiers. When training the first element recognizer, each feature value is first randomly assigned the same weight to obtain an intermediate recognizer; then, based on a large number of known salient elements and first feature information values, the intermediate recognizers are repeatedly classified and voted on, finally yielding the set of weight values with the highest accuracy (in this set, each feature value has its own corresponding weight). The to-be-tested recognizer is composed of this set of weight values and the corresponding intermediate recognizers.
Specifically, when selecting sample images, bootstrap resampling is used: k sample images are repeatedly drawn at random, with replacement, from the sample image set to generate new bootstrap sample sets; k decision trees for classification are then generated from the bootstrap sample sets, and the decision trees are combined into a random forest model.
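The sketch below shows the bootstrap-resampling ensemble described above using scikit-learn's RandomForestClassifier, which performs the with-replacement sampling internally. The feature layout, one row of first-feature-information values per image region, is an illustrative assumption.

import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.random((500, 6))       # e.g. color/contrast statistics plus encoded context
y = rng.integers(0, 2, 500)    # 1 = region belongs to a salient element

forest = RandomForestClassifier(
    n_estimators=100,          # k decision trees, each fit on a bootstrap sample
    bootstrap=True,
    oob_score=True,            # out-of-bag estimate of the generalization error
)
forest.fit(X, y)
print(forest.oob_score_)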
In a random forest, each tree is built on an independently drawn sample, and all trees in the forest share the same distribution; the classification error depends on the classification ability of each individual tree and the correlation between the trees. For each feature, a random method is used to split each node, and the errors produced under different conditions are compared; the inherent estimation error, the classification ability, and the correlation can thereby be detected, which determines the number of features to select. The classification ability of a single tree may be small, but after a large number of decision trees are generated at random, the classification ability necessarily increases, and the most likely classification is selected statistically. Through extensive classification and regression training, the set of weight values with the highest accuracy is finally obtained and used as the weight values of the to-be-tested recognizer.
Step 203: test the to-be-tested recognizer according to the test samples to obtain the first element recognizer.
The to-be-tested recognizer is tested with the test samples; if its accuracy meets the preset requirement, the first element recognizer is obtained.
In an optional embodiment, training the initial recognizer according to the training samples includes:
inputting the training samples into the initial element recognizer to obtain the first feature information of the training samples, where the first feature information of a training sample is obtained by the initial element recognizer performing feature extraction on the training sample to obtain initial features, and then clustering the initial features according to a preset clustering algorithm.
During training, the extracted first feature information of the training samples is clustered. Optionally, the preset clustering algorithm may be K-Means clustering, density-based clustering, or the like; processing the initial features by clustering to obtain the first feature information improves the recognition accuracy of the initial recognizer.
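A minimal sketch of this clustering step is given below, using K-Means as one of the preset clustering algorithms named above. The number of clusters and the feature dimensionality are illustrative assumptions.

import numpy as np
from sklearn.cluster import KMeans

initial_features = np.random.default_rng(0).random((200, 16))  # per-sample features
kmeans = KMeans(n_clusters=5, n_init=10, random_state=0).fit(initial_features)

# Each training sample's first feature information becomes its cluster id
# (or the cluster centroid), a compact summary of the raw initial features.
first_feature_info = kmeans.labels_
centroids = kmeans.cluster_centers_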
In an optional embodiment, training the initial recognizer according to the training samples to obtain the to-be-tested recognizer includes:
training the initial recognizer according to the training samples to obtain multiple to-be-fused recognizers, where each to-be-fused recognizer corresponds to one salient element and is the recognizer with the highest recognition rate for that salient element; and
training the to-be-tested recognizer from the to-be-fused recognizers.
In training the to-be-tested recognizer, multiple intermediate recognizers are first obtained from the training samples and the initial recognizer; from the intermediate recognizers, the recognizer with the highest recognition rate is selected for each salient element as a to-be-fused recognizer, the salient elements here being those recognized by the initial recognizer from the training samples. After a to-be-fused recognizer has been obtained for each salient element, all the to-be-fused recognizers are fusion-trained to obtain the to-be-tested recognizer.
Optionally, the fusion can follow a preset fusion algorithm, such as bootstrapping, boosting, or stacking; a bootstrapping example is the above random forest model, in which each intermediate recognizer serves as a decision tree in the forest.
As a first example, referring to Fig. 3, Fig. 3 shows the main flow of a salient element recognition method, including the following steps:
Step 301: extract the first feature information from the training samples according to the initial recognizer.
The first feature information, such as geographic location information and time information, is extracted from the information generated when a training sample was captured; at the same time, some other features, such as scene information, are extracted from the information in the image itself.
Step 302: cluster the first feature information by a clustering method and determine the salient elements.
The salient element to be determined for each class of first feature information is output by the preset clustering method. The first feature information may include geographic location information, time information, and scene information, or a combination of the three, to determine different salient elements.
As described above, the geographic location, time, and scene information respectively characterize the user's current environment, time-related shooting conditions such as light intensity, and the shooting scene; combining the three features makes it possible to judge more precisely which salient elements the user cares about in the current shooting scenario.
Step 303: train intermediate recognizers according to the salient elements.
Optionally, for each salient element, a training sample set containing that salient element (or that class) is selected, and an intermediate recognizer is trained from it; this intermediate recognizer has a relatively high recognition rate for that salient element.
Step 304: fuse the intermediate recognizers to obtain the element recognizer.
The intermediate recognizers, together with the corresponding first feature information, are taken as reference inputs to train the element recognizer. During training, the first feature information and the corresponding salient elements are received; since the salient elements can guide the training of the element recognizer, the element recognizer can output recognition results for specific salient elements according to different user features.
Step 305: apply the element recognizer.
When the element recognizer is actually applied, it receives the first image as input, extracts the first feature information as a reference input, and outputs the salient elements of the first image.
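An end-to-end sketch of step 305 is given below: the fused recognizer takes the first image, derives the first feature information as a reference input, and outputs salient elements. The function, class, and return formats are assumptions introduced for illustration; the disclosure does not fix an API.

def recognize_salient_elements(image, extract_features, recognizer):
    """Apply the trained element recognizer to a first image."""
    # Reference input: first feature information (user/location/time/scene).
    feature_info = extract_features(image)
    # Conditioned prediction: the recognizer uses both the pixels and the context.
    return recognizer.predict(image, feature_info)

class _StubRecognizer:
    def predict(self, image, feature_info):
        # Toy behavior: report the sky as salient only for landscape scenes.
        return [("sky", (0, 0, 10, 5))] if feature_info.get("scene") == "landscape" else []

elements = recognize_salient_elements(
    image=None,  # stand-in for the first image
    extract_features=lambda img: {"scene": "landscape"},
    recognizer=_StubRecognizer(),
)
print(elements)  # [('sky', (0, 0, 10, 5))]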
In this embodiment of the present application, a first image and first feature information associated with the first image are acquired; a first element recognizer is determined according to the first feature information; and the first image is input into the first element recognizer to obtain the salient element output by the first element recognizer, where the sub-information of the first feature information includes at least one of user feature information, geographic location information, time information, and scene information. Recognizing the salient elements of the first image dynamically, on the basis of real-time and dynamic first feature information, avoids interference from other distracting information in the first image and improves recognition accuracy; the embodiments of the present application thus solve the problem that existing salient element detection approaches are prone to false detections.
The salient element recognition method provided by the embodiments of the present application has been described above; the salient element recognition apparatus provided by the embodiments of the present application is described below with reference to the accompanying drawings.
It should be noted that the salient element recognition method provided by the embodiments of the present application may be executed by a salient element recognition apparatus, or by a control module in the salient element recognition apparatus for executing the method. In the embodiments of the present application, the method is described taking the case where the salient element recognition apparatus executes it as an example.
Referring to Fig. 4, an embodiment of the present application further provides a salient element recognition apparatus 400, including:
an acquisition module 401, configured to acquire a first image and first feature information associated with the first image.
The first image to be recognized, the sub-information of the first feature information (user feature information, geographic location information, time information, and scene information), and the way these features indicate which salient elements the user cares about in the current shooting situation are as described above for step 101; to avoid repetition, the details are not repeated here.
a determination module 402, configured to determine a first element recognizer according to the first feature information.
The first element recognizer is used to perform salient element recognition on the first image according to the first feature information; the recognizer and the way it is obtained (pre-trained, or trained from the first image together with sample images) are as described above for step 102 and are not repeated here.
a recognition module 403, configured to input the first image into the first element recognizer to obtain the salient element output by the first element recognizer, where the sub-information of the first feature information includes at least one of user feature information, geographic location information, time information, and scene information.
The salient element detection algorithm executed by the first element recognizer and the feature-based recognition process are as described above for step 103 and are not repeated here.
Optionally, in this embodiment of the present application, the determination module 402 includes:
a recognizer acquisition sub-module, configured to acquire second element recognizers, where each second element recognizer corresponds to one piece of the sub-information and is the recognizer with the highest recognition rate for that sub-information; and
a fusion sub-module, configured to fusion-train the second element recognizers to obtain the first element recognizer.
Optionally, in this embodiment of the present application, the recognizer acquisition sub-module is configured to:
acquire preset second element recognizers, the second element recognizers having been trained on first sample images; and
train, according to the sub-information of the first feature information, the second element recognizer corresponding to that sub-information.
Optionally, in this embodiment of the present application, the sub-information includes feature information extracted from the first image, or feature information input by a user.
Optionally, in this embodiment of the present application, the first image includes a second image displayed on a shooting preview interface of an electronic device, or a third image stored in the electronic device.
In this embodiment of the present application, the acquisition module 401 acquires a first image and first feature information associated with the first image; the determination module 402 determines a first element recognizer according to the first feature information; and the recognition module 403 inputs the first image into the first element recognizer to obtain the salient element output by the first element recognizer, where the sub-information of the first feature information includes at least one of user feature information, geographic location information, time information, and scene information. Recognizing the salient elements of the first image dynamically, on the basis of real-time and dynamic first feature information, avoids interference from other distracting information in the first image, improves recognition accuracy, and solves the problem that existing salient element detection approaches are prone to false detections.
The salient element recognition apparatus 400 in this embodiment of the present application may be an apparatus, or a component, integrated circuit, or chip in a terminal. The apparatus 400 may be a mobile electronic device or a non-mobile electronic device. By way of example, the mobile electronic device may be a mobile phone, a tablet computer, a laptop computer, a palmtop computer, a vehicle-mounted electronic device, a wearable device, an ultra-mobile personal computer (UMPC), a netbook, or a personal digital assistant (PDA), and the non-mobile electronic device may be a server, network attached storage (NAS), a personal computer (PC), a television (TV), a teller machine, or a self-service kiosk, which is not specifically limited in the embodiments of the present application.
The salient element recognition apparatus 400 in this embodiment of the present application may be an apparatus 400 having an operating system. The operating system may be the Android operating system, the iOS operating system, or another possible operating system, which is not specifically limited in the embodiments of the present application.
The salient element recognition apparatus 400 provided by this embodiment of the present application can implement the processes implemented by the salient element recognition apparatus 400 in the method embodiments of Figs. 1 to 3; to avoid repetition, the details are not repeated here.
Optionally, as shown in Fig. 5, an embodiment of the present application further provides an electronic device 500, including a processor 501, a memory 502, and a program or instructions stored in the memory 502 and executable on the processor 501. When executed by the processor 501, the program or instructions implement the processes of the above salient element recognition method embodiments and achieve the same technical effects; to avoid repetition, the details are not repeated here.
It should be noted that the electronic devices in the embodiments of the present application include the mobile and non-mobile electronic devices described above.
Fig. 6 is a schematic diagram of the hardware structure of an electronic device 600 implementing the embodiments of the present application.
The electronic device 600 includes, but is not limited to, components such as a radio frequency unit 601, a network module 602, an audio output unit 603, an input unit 604, a sensor 605, a display unit 606, a user input unit 607, an interface unit 608, a memory 609, a processor 610, and a power supply 611. A person skilled in the art will understand that the electronic device 600 may further include a power supply (such as a battery) supplying power to the components; the power supply may be logically connected to the processor 610 through a power management system, thereby implementing functions such as managing charging, discharging, and power consumption through the power management system. The structure of the electronic device shown in Fig. 6 does not constitute a limitation on the electronic device; the electronic device may include more or fewer components than shown, combine certain components, or arrange the components differently, which is not detailed further here.
The processor 610 is configured to: acquire a first image and first feature information associated with the first image;
determine a first element recognizer according to the first feature information; and
input the first image into the first element recognizer to obtain the salient element output by the first element recognizer, where the sub-information of the first feature information includes at least one of user feature information, geographic location information, time information, and scene information.
Optionally, the processor 610 is configured to acquire second element recognizers, where each second element recognizer corresponds to one piece of the sub-information and is the recognizer with the highest recognition rate for that sub-information; and
fusion-train the second element recognizers to obtain the first element recognizer.
Optionally, the processor 610 is configured to acquire preset second element recognizers, the second element recognizers having been trained on first sample images; and
train, according to the sub-information of the first feature information, the second element recognizer corresponding to that sub-information.
Optionally, the sub-information includes feature information extracted from the first image, or feature information input by a user.
Optionally, the first image includes a second image displayed on a shooting preview interface of an electronic device, or a third image stored in the electronic device.
In the embodiments of the present application, a first image and first feature information associated with the first image are acquired; a first element recognizer is determined according to the first feature information; and the first image is input into the first element recognizer to obtain the salient element output by the first element recognizer, where the sub-information of the first feature information includes at least one of user feature information, geographic location information, time information, and scene information. Recognizing the salient elements of the first image dynamically, on the basis of real-time and dynamic first feature information, avoids interference from other distracting information in the first image and improves recognition accuracy; the embodiments of the present application thus solve the problem that existing salient element detection approaches are prone to false detections.
It should be understood that, in this embodiment of the present application, the input unit 604 may include a graphics processing unit (GPU) 6041 and a microphone 6042; the graphics processing unit 6041 processes image data of still pictures or video obtained by an image capture device (such as a camera) in video capture mode or image capture mode. The display unit 606 may include a display panel 6061, which may be configured in the form of a liquid crystal display, an organic light-emitting diode, or the like. The user input unit 607 includes a touch panel 6071 and other input devices 6072. The touch panel 6071 is also called a touch screen and may include two parts: a touch detection device and a touch controller. The other input devices 6072 may include, but are not limited to, a physical keyboard, function keys (such as volume control keys and switch keys), a trackball, a mouse, and a joystick, which are not detailed further here. The memory 609 may be used to store software programs and various data, including but not limited to application programs and an operating system. The processor 610 may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, the user interface, and application programs, and the modem processor mainly handles wireless communication. It will be understood that the modem processor may also not be integrated into the processor 610.
An embodiment of the present application further provides a readable storage medium storing a program or instructions that, when executed by a processor, implement the processes of the above salient element recognition method embodiments and achieve the same technical effects; to avoid repetition, the details are not repeated here.
The processor is the processor in the electronic device described in the above embodiments. The readable storage medium includes computer-readable storage media such as a computer read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
An embodiment of the present application further provides a chip, including a processor and a communication interface coupled to the processor, where the processor is configured to run a program or instructions to implement the processes of the above salient element recognition method embodiments and achieve the same technical effects; to avoid repetition, the details are not repeated here.
It should be understood that the chip mentioned in the embodiments of the present application may also be called a system-level chip, a system chip, a chip system, or a system-on-chip chip.
It should be noted that, as used herein, the terms 'comprise', 'include', and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitation, an element defined by the phrase 'comprising a ...' does not exclude the presence of additional identical elements in the process, method, article, or apparatus that comprises the element. Furthermore, it should be pointed out that the scope of the methods and apparatuses in the embodiments of the present application is not limited to performing functions in the order shown or discussed; the functions may also be performed in a substantially simultaneous manner or in the reverse order according to the functions involved. For example, the described methods may be performed in an order different from that described, and various steps may also be added, omitted, or combined. In addition, features described with reference to certain examples may be combined in other examples.
From the description of the above embodiments, a person skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software plus the necessary general-purpose hardware platform, or of course by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present application, in essence or in the part contributing to the prior art, can be embodied in the form of a software product stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc), including several instructions for causing a terminal (which may be a mobile phone, a computer, a server, an air conditioner, or a network device) to perform the methods described in the embodiments of the present application.
The embodiments of the present application have been described above with reference to the accompanying drawings, but the present application is not limited to the specific implementations described above, which are merely illustrative rather than restrictive. Enlightened by the present application, a person of ordinary skill in the art can devise many other forms without departing from the spirit of the present application and the scope protected by the claims, all of which fall within the protection of the present application.

Claims (14)

  1. A salient element recognition method, the method comprising:
    acquiring a first image and first feature information associated with the first image;
    determining a first element recognizer according to the first feature information; and
    inputting the first image into the first element recognizer to obtain a salient element output by the first element recognizer, wherein sub-information of the first feature information comprises at least one of user feature information, geographic location information, time information, and scene information.
  2. The salient element recognition method according to claim 1, wherein determining the first element recognizer according to the first feature information comprises:
    acquiring second element recognizers, wherein each second element recognizer corresponds to one piece of the sub-information and is the recognizer with the highest recognition rate for that sub-information; and
    fusion-training the second element recognizers to obtain the first element recognizer.
  3. The salient element recognition method according to claim 2, wherein acquiring the second element recognizers comprises:
    acquiring preset second element recognizers, the second element recognizers having been trained on first sample images; and
    training, according to the sub-information of the first feature information, the second element recognizer corresponding to that sub-information.
  4. The salient element recognition method according to claim 1, wherein the sub-information comprises feature information extracted from the first image, or feature information input by a user.
  5. The salient element recognition method according to claim 1, wherein the first image comprises a second image displayed on a shooting preview interface of an electronic device, or a third image stored in the electronic device.
  6. A salient element recognition apparatus, the apparatus comprising:
    an acquisition module, configured to acquire a first image and first feature information associated with the first image;
    a determination module, configured to determine a first element recognizer according to the first feature information; and
    a recognition module, configured to input the first image into the first element recognizer to obtain a salient element output by the first element recognizer, wherein sub-information of the first feature information comprises at least one of user feature information, geographic location information, time information, and scene information.
  7. The salient element recognition apparatus according to claim 6, wherein the determination module comprises:
    a recognizer acquisition sub-module, configured to acquire second element recognizers, wherein each second element recognizer corresponds to one piece of the sub-information and is the recognizer with the highest recognition rate for that sub-information; and
    a fusion sub-module, configured to fusion-train the second element recognizers to obtain the first element recognizer.
  8. The salient element recognition apparatus according to claim 7, wherein the recognizer acquisition sub-module is configured to:
    acquire preset second element recognizers, the second element recognizers having been trained on first sample images; and
    train, according to the sub-information of the first feature information, the second element recognizer corresponding to that sub-information.
  9. The salient element recognition apparatus according to claim 6, wherein the sub-information comprises feature information extracted from the first image, or feature information input by a user.
  10. The salient element recognition apparatus according to claim 6, wherein the first image comprises a second image displayed on a shooting preview interface of an electronic device, or a third image stored in the electronic device.
  11. A salient element recognition apparatus, wherein the apparatus is configured to perform the salient element recognition method according to any one of claims 1 to 5.
  12. A chip, comprising a processor and a communication interface coupled to the processor, wherein the processor is configured to run a program or instructions to implement the salient element recognition method according to any one of claims 1 to 5.
  13. An electronic device, comprising a memory, a processor, and a program or instructions stored in the memory and executable on the processor, wherein the processor, when executing the program or instructions, implements the steps of the salient element recognition method according to any one of claims 1 to 5.
  14. A readable storage medium storing a program or instructions that, when executed by a processor, implement the steps of the salient element recognition method according to any one of claims 1 to 5.
PCT/CN2021/119974 2020-09-24 2021-09-23 Salient element recognition method and apparatus WO2022063189A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011020259.4A CN112101387A (zh) 2020-09-24 2020-09-24 Salient element recognition method and apparatus
CN202011020259.4 2020-09-24

Publications (1)

Publication Number Publication Date
WO2022063189A1 true WO2022063189A1 (zh) 2022-03-31

Family

ID=73756248

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/119974 WO2022063189A1 (zh) 2020-09-24 2021-09-23 Salient element recognition method and apparatus

Country Status (2)

Country Link
CN (1) CN112101387A (zh)
WO (1) WO2022063189A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112101387A (zh) * 2020-09-24 2020-12-18 维沃移动通信有限公司 显著性元素识别方法及装置

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103793925A (zh) * 2014-02-24 2014-05-14 Beijing University of Technology Visual saliency detection method for video images fusing spatiotemporal features
CN107622280A (zh) * 2017-09-14 2018-01-23 Henan University of Science and Technology Modular prescription-based image saliency detection method based on scene classification
US20190132520A1 (en) * 2017-11-02 2019-05-02 Adobe Inc. Generating image previews based on capture information
CN110348291A (zh) * 2019-05-28 2019-10-18 Huawei Technologies Co., Ltd. Scene recognition method, scene recognition apparatus, and electronic device
CN112101387A (zh) * 2020-09-24 2020-12-18 Vivo Mobile Communication Co., Ltd. Salient element recognition method and apparatus

Also Published As

Publication number Publication date
CN112101387A (zh) 2020-12-18

Similar Documents

Publication Publication Date Title
US11483268B2 (en) Content navigation with automated curation
CN111556278B (zh) Video processing method, video display method, apparatus, and storage medium
CN111047621B (zh) Target object tracking method, system, device, and readable medium
CN111506758B (zh) Article name determination method and apparatus, computer device, and storage medium
CN109977859A (zh) Icon recognition method and related apparatus
WO2022042573A1 (zh) Application control method and apparatus, electronic device, and readable storage medium
CN104200249B (zh) Method, apparatus, and system for automatic clothing matching
CN111491187B (zh) Video recommendation method, apparatus, device, and storage medium
CN106462349B (zh) Electronic photo display method and apparatus, and mobile device
WO2022227393A1 (zh) Image capturing method and apparatus, electronic device, and computer-readable storage medium
JP7231638B2 (ja) Video-based information acquisition method and apparatus
CN107300967A (zh) Intelligent navigation method, apparatus, storage medium, and terminal
CN110084204B (zh) Image processing method and apparatus based on target object pose, and electronic device
CN112203115B (zh) Video recognition method and related apparatus
CN107025441B (zh) Skin color detection method and apparatus
CN110263729A (zh) Shot boundary detection method, model training method, and related apparatus
CN103105924A (zh) Human-computer interaction method and apparatus
WO2023197648A1 (zh) Screenshot processing method and apparatus, electronic device, and computer-readable medium
WO2022063189A1 (zh) Salient element recognition method and apparatus
CN111951157A (zh) Image processing method, device, and storage medium
CN112200844A (zh) Method and apparatus for generating an image, electronic device, and medium
CN109302528A (zh) Photographing method, mobile terminal, and computer-readable storage medium
CN108052506B (zh) Natural language processing method and apparatus, storage medium, and electronic device
CN110491384B (zh) Voice data processing method and apparatus
WO2023066373A1 (zh) Method, apparatus, device, and storage medium for determining sample images

Legal Events

Date Code Title Description
NENP: Non-entry into the national phase (Ref country code: DE)
122: Ep: PCT application non-entry in European phase (Ref document number: 21871558; Country of ref document: EP; Kind code of ref document: A1)