WO2020221177A1 - Image recognition method and device, storage medium and electronic device - Google Patents

Image recognition method and device, storage medium and electronic device

Info

Publication number
WO2020221177A1
WO2020221177A1 (PCT/CN2020/087071)
Authority
WO
WIPO (PCT)
Prior art keywords
image
area
contour
processing
cluster
Prior art date
Application number
PCT/CN2020/087071
Other languages
English (en)
French (fr)
Inventor
屈奇勋
胡雯
廖奎翔
张磊
石瑗璐
李宛庭
沈凌浩
郑汉城
Original Assignee
深圳数字生命研究院
深圳碳云智能数字生命健康管理有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳数字生命研究院 and 深圳碳云智能数字生命健康管理有限公司
Publication of WO2020221177A1


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/56 Extraction of image or video features relating to colour
    • G06V10/46 Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462 Salient features, e.g. scale invariant feature transforms [SIFT]
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/08 Learning methods
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis

Definitions

  • This application relates to the field of image recognition, and specifically to an image recognition method and device, a storage medium, and an electronic device.
  • The embodiments of the present application provide an image recognition method and device, a storage medium, and an electronic device, to at least solve the related-art problem of manually recognizing color and texture features in an image.
  • According to one embodiment, an image recognition method is provided, including: acquiring a first image containing a target object; extracting from the first image a second image corresponding to a designated area, where the designated area is the area of the first image containing the target object; and extracting the color feature and texture feature of the target object from the second image and, based on the color feature and the texture feature, identifying the color and traits of the target object in the first image.
  • Traits refer to characteristics such as the composition and physical state of the target object, as reflected by its texture features in the image.
  • For stool images specifically, the traits of stool include but are not limited to the composition and/or physical state of the stool. The composition of stool includes whether milk curds, foam, blood streaks or mucus are present; the physical state of stool includes, but is not limited to, loose/muddy, egg-drop-like, watery, mucous, banana-shaped, toothpaste-like, sea-cucumber-shaped, clay-like, tarry, or pellet-like (sheep-dropping) forms.
  • According to one embodiment, an image recognition device is provided, including: an acquisition module, configured to acquire a first image containing a target object; a first extraction module, configured to extract from the first image a second image corresponding to a designated area, where the designated area is the area of the first image containing the target object; and a second extraction module, configured to extract the color feature and texture feature of the target object from the second image and identify, based on them, the color and traits of the target object in the first image.
  • According to another embodiment, a computer-readable storage medium is provided, in which a computer program is stored, where the computer program is configured to execute, when run, the steps of the above image recognition method embodiment.
  • According to another embodiment, an electronic device is provided, including a memory and a processor, where a computer program is stored in the memory and the processor is configured to run the computer program to perform the steps of the above image recognition method embodiment.
  • In this way, the area corresponding to the target object is extracted from the first image containing the target object as the second image, and the color and texture features of the target object are extracted from the second image to identify the target object's color and traits, which solves the related-art problem of manually identifying color and texture features in images and achieves improved recognition efficiency and reduced cost.
  • FIG. 1 is a hardware structure block diagram of a terminal of an image recognition method according to an embodiment of the present application
  • Fig. 2 is a flowchart of an image recognition method according to an embodiment of the present application.
  • Fig. 3 is a structural block diagram of an image recognition device according to an embodiment of the present application.
  • FIG. 1 is a hardware structural block diagram of a terminal of an image recognition method according to an embodiment of the present application.
  • The terminal may include one or more processors 102 (only one is shown in FIG. 1; the processor 102 may include, but is not limited to, a processing device such as a microprocessor (MCU) or a programmable logic device (FPGA)) and a memory 104 configured to store data.
  • The aforementioned terminal may also include a transmission device 106 and an input/output device 108 configured for communication functions.
  • the structure shown in FIG. 1 is only for illustration, and does not limit the structure of the foregoing terminal.
  • The terminal may also include more or fewer components than shown in FIG. 1, or have a different configuration from that shown in FIG. 1.
  • The memory 104 may be configured to store computer programs, for example, software programs and modules of application software, such as the computer programs corresponding to the image recognition method in the embodiment of the present application.
  • The processor 102 runs the computer programs stored in the memory 104, thereby performing various functional applications and data processing, that is, implementing the above method.
  • the memory 104 may include a high-speed random access memory, and may also include a non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory.
  • The memory 104 may further include memories provided remotely from the processor 102; these remote memories may be connected to the terminal through a network. Examples of such networks include, but are not limited to, the Internet, corporate intranets, local area networks, mobile communication networks, and combinations thereof.
  • the transmission device 106 is configured to receive or transmit data via a network.
  • the foregoing specific examples of the network may include a wireless network provided by a communication provider of the terminal.
  • the transmission device 106 includes a network adapter (Network Interface Controller, NIC for short), which can be connected to other network devices through a base station to communicate with the Internet.
  • the transmission device 106 may be a radio frequency (Radio Frequency, referred to as RF) module, which is configured to communicate with the Internet in a wireless manner.
  • FIG. 2 is a flowchart of an image recognition method according to an embodiment of the present application. As shown in Fig. 2, the process includes the following steps:
  • Step S202: acquiring a first image containing the target object;
  • Step S204: extracting a second image corresponding to the designated area from the first image, where the designated area is the area containing the target object in the first image;
  • Step S206: extracting the color feature and texture feature of the target object from the second image, and identifying the color and traits of the target object in the first image based on the color feature and texture feature.
  • Through the above steps, the area corresponding to the target object is extracted from the first image containing the target object as the second image, and the color feature and texture feature of the target object are extracted from the second image to identify the color and traits of the target object.
  • When the method is applied to the scenario of feces-image recognition, it solves the related-art problem of manually identifying the color and traits of feces in feces images, improving recognition efficiency and saving cost.
  • It should be noted that traits refer to characteristics such as the composition and physical state of the target object, as reflected by its texture features in the image.
  • The target object can be feces, sputum, soil, a tissue sample, etc.; the description here is specific to the recognition of feces images.
  • the characteristics of feces include but are not limited to the composition and/or physical state of feces.
  • The composition of feces includes whether milk curds, foam, blood streaks or mucus are present; the physical state of feces includes loose/muddy, egg-drop-like, watery, mucous, banana-shaped, toothpaste-like, sea-cucumber-shaped, clay-like, tarry, or pellet-like (sheep-dropping) forms.
  • the manner of acquiring the first image containing the target object involved in step S202 may be:
  • Step S202-11: collecting multiple image data containing the target object, and establishing an image database based on the collected image data;
  • Step S202-12: training the first convolutional neural network based on the image data in the image database to obtain a second convolutional neural network for classification;
  • Step S202-13: analyzing the input image data through the second convolutional neural network to obtain the first image.
  • the target object is baby feces.
  • In practice, the target object can also be soil, adult feces, sputum samples, tissue samples, etc.; infant feces is used here only as an example.
  • The specific process can be as follows: (1) input the training-set images into a first convolutional neural network (a feedforward model) whose network parameters are randomly initialized, each training-set image having a corresponding label, and compute reference annotation results for the training-set images through this first network; (2) determine the loss-function value of the current first network on the training-set annotation results from the reference annotation results and the actual labels of the training-set images; (3) adjust the network parameters of the current first network through the backpropagation algorithm according to that loss value, and feed the validation-set images into the adjusted first network; (4) compute reference annotation results for the validation-set images through the first network; (5) determine the current first network's loss-function value from the reference annotation results and the actual labels of the validation-set images, iterating in this way until the loss converges, the trained network serving as the second convolutional neural network.
  • the convolutional neural network is preferably SqueezeNet or MobileNet, because these two types of convolutional neural networks have fewer parameters, require less computing resources, and run faster on the CPU.
  • Other convolutional neural networks can also be used, such as ResNet, Xception, Inception, DenseNet, LeNet, AlexNet and many other classification networks; any neural network for classification is applicable.
  • the above-mentioned steps S202-11 to S202-13 are used to detect whether the input image contains a stool image by using a convolutional neural network.
  • the three steps include:
  • Step S11 (corresponding to step S202-11), collect data
  • Step S12 (corresponding to step S202-12), train a convolutional neural network
  • All the images collected in step S11 are divided into three parts: a training set, a validation set and a test set.
  • In order for the convolutional neural network trained in step S12 to obtain better classification results, all images are adjusted to a uniform size before training.
  • Different convolutional neural networks require input images of different sizes; for example, SqueezeNet and MobileNet take 224×224 inputs while Xception takes 299×299. The required size is adjusted according to actual needs and is not specifically limited here.
  • the convolutional neural network before training is equivalent to the first convolutional neural network in this application
  • the convolutional neural network after training is equivalent to the second convolutional neural network in this application.
  • Step S13 (corresponding to step S202-13), use a convolutional neural network to make predictions
  • That is, the trained convolutional neural network is used to determine whether the input picture contains feces: the network outputs the probability that the input image contains feces, and when this probability is greater than a preset probability value (for example, 50%), the input picture is considered to contain feces. Other preset values, such as 40%, 60% or 65%, can also be chosen according to actual needs, without specific limitation here.
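  • As a concrete illustration of steps S202-11 to S202-13, the following is a minimal sketch, not the patent's actual implementation, of a binary "contains feces / does not" classifier in Keras. The MobileNet backbone, the 224-pixel input size and the 50% decision threshold come from the text; the optimizer, loss and the `build_classifier`/`contains_target` names are illustrative assumptions.

```python
# Sketch of steps S202-12/S202-13: fine-tune a small CNN as a binary
# classifier, then threshold its output probability at 50%.
import numpy as np
import tensorflow as tf

IMG_SIZE = 224  # MobileNet/SqueezeNet-style input size mentioned in the text

def build_classifier() -> tf.keras.Model:
    # "First network": randomly initialized head on a MobileNet backbone.
    base = tf.keras.applications.MobileNet(
        input_shape=(IMG_SIZE, IMG_SIZE, 3), include_top=False, pooling="avg")
    out = tf.keras.layers.Dense(1, activation="sigmoid")(base.output)
    model = tf.keras.Model(base.input, out)
    # Training details (optimizer, loss) are assumptions, not from the patent.
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model  # after model.fit(...) it plays the "second network" role

def contains_target(model: tf.keras.Model, image: np.ndarray,
                    threshold: float = 0.5) -> bool:
    """Step S202-13: predict and compare against the preset probability."""
    x = tf.image.resize(image.astype("float32") / 255.0,
                        (IMG_SIZE, IMG_SIZE))[None, ...]
    return float(model(x)[0, 0]) > threshold
```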
  • the method of extracting the second image corresponding to the designated area from the first image in step S204 involved in the present application can be implemented in the following manner:
  • Step S204-11 Perform brightness normalization processing on the first image
  • this step S204-11 can be further implemented in the following manner:
  • Step S21: converting the first image from RGB space to CIELAB space under the D65 standard illuminant;
  • Step S22: enhancing the L channel of the CIELAB-space image by contrast-limited adaptive histogram equalization, completing the brightness normalization of the L channel;
  • Step S23: combining the brightness-normalized L channel with the unprocessed A and B channels of the CIELAB-space image to obtain the brightness-normalized CIELAB-space image, and then using the RGB-space image converted from this CIELAB-space image as the brightness-normalized first image.
  • The A channel represents the range from red to green, with values in [-128, 127]; the B channel represents the range from yellow to blue, with values in [-128, 127].
  • This brightness normalization step uses an existing brightness normalization method; in other embodiments, other prior-art brightness normalization methods may also be used, without specific limitation here.
  • Steps S21 to S23 amount to preprocessing of the first image; adjusting the picture size can also be a preprocessing step, but whether the size is adjusted does not affect the output of this step. In addition, because the shooting environment of input images is complex, different images differ in brightness, and different areas of the same image differ in brightness, the input image should be brightness-normalized during preprocessing to improve the segmentation accuracy of the target image and achieve a more accurate segmentation result.
  • step S21 to step S23 can be implemented in the following manner in a specific application scenario:
  • the contrast-limited adaptive histogram equalization (CLAHE) method is used to enhance the L channel of the converted CIELAB space image.
  • the purpose of the enhancement is to normalize the brightness of different regions of the same image.
  • The specific CLAHE process is: compute the gray-level histogram (as probabilities) of the L-channel image; set a clipping threshold, preferably 3.0% in this application (any threshold greater than 0 and less than 100% is usable; for a better effect the range is preferably 2.0% to 5.0%); within an adaptive window (ranging from 5×5 to 21×21 pixels, preferably 11×11 pixels), clip the gray-level histogram of the pixels at the threshold and distribute the clipped portion evenly across all gray levels, which completes the brightness normalization of the L channel.
  • The brightness-normalized L channel is then combined with the unprocessed A and B channels and converted back to an RGB-space image, completing the preprocessing of the image, that is, the brightness normalization of the first image.
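  • A minimal OpenCV sketch of steps S21 to S23 follows, assuming cv2 is available. OpenCV's LAB conversion assumes a D65 white point as the text requires; note that cv2's clipLimit is a multiplier rather than the percentage quoted above, and tileGridSize counts tiles rather than the 11×11-pixel window, so both values are approximations.

```python
# Sketch of brightness normalization (steps S21-S23): RGB -> CIELAB,
# CLAHE on the L channel only, merge with untouched A/B, convert back.
import cv2
import numpy as np

def normalize_brightness(rgb: np.ndarray) -> np.ndarray:
    lab = cv2.cvtColor(rgb, cv2.COLOR_RGB2LAB)   # D65-based conversion
    l, a, b = cv2.split(lab)
    clahe = cv2.createCLAHE(clipLimit=3.0, tileGridSize=(11, 11))
    lab = cv2.merge((clahe.apply(l), a, b))      # A and B stay unprocessed
    return cv2.cvtColor(lab, cv2.COLOR_LAB2RGB)  # normalized first image
```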
  • Step S204-12 performing clustering and segmentation processing on the first image after brightness normalization processing
  • step S204-12 can optionally be implemented in the following manner:
  • Step S31 establishing a feature vector for each pixel in the first image after the brightness normalization process is performed
  • Step S32 clustering the feature vectors of the pixels into a preset number of categories through a clustering algorithm, and obtaining a preset number of clustered images corresponding to the preset number of categories;
  • Step S33: extracting a third image from the first image, where the third image is centered on the center of the first image and has an area equal to a preset percentage of the first image;
  • Step S34 Count the number of pixels in the preset number of cluster images in the third image, and mark the cluster image with the most pixels in the third image as the center image.
  • Steps S31 to S34 perform clustering segmentation of the color region: the input is the preprocessed first image. Since this application takes infant stool as the example, the color of the stool area is largely recovered by the above steps; the stool area is generally a continuous color block of relatively uniform color, and exploiting this feature, pixel-based clustering segmentation by color and position can proceed as follows in a specific application scenario: a feature vector is created for each pixel, comprising the pixel's RGB values and its coordinates (X, Y) in the image, five values in total.
  • A clustering algorithm such as KMeans, Fuzzy C-Means, Gaussian-mixture-model expectation-maximization clustering, or any other method that accepts a specified number of cluster centers is used to cluster the feature vectors of all pixels into a preset number of categories. Clustering into 2 to 5 categories is usable, and 3 is more preferable; it should be noted that if the stool area is distinct and the background pure, clustering into 2 categories already yields a good result, whereas more than 5 clusters gives a poor clustering effect in any situation. Clustering into 3 categories divides the input image into three regions (correspondingly, 2 categories divide the image into two regions, 4 into four, and 5 into five); in other words, the input image is treated as composed of three parts. For the images obtained by the above clustering algorithm, taking 3 clusters as the example, the cluster images corresponding to the three regions may be called "cluster image 1", "cluster image 2" and "cluster image 3".
  • The central region of the segmented image is then extracted (the central region shares the physical center of the complete image), its area being 25% of the complete image (anywhere from 10% to 50% of the full image area is acceptable, preferably 10% to 30%). Within this central region, the numbers of pixels belonging to "cluster image 1", "cluster image 2" and "cluster image 3" are counted separately, and the cluster image with the largest pixel count is the "center image".
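  • As one way to realize steps S31 to S34, the sketch below clusters per-pixel (R, G, B, X, Y) vectors with scikit-learn's KMeans and picks the "center image" by pixel count inside a central window of 25% area. Whether or how the coordinates are scaled against the color values is not specified in the text and is left raw here.

```python
# Sketch of steps S31-S34: 5-D pixel features, k=3 clusters, then the
# cluster dominating the central 25%-area window is the "center image".
import numpy as np
from sklearn.cluster import KMeans

def center_cluster_mask(rgb: np.ndarray, k: int = 3,
                        center_frac: float = 0.25) -> np.ndarray:
    h, w = rgb.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    feats = np.column_stack([rgb.reshape(-1, 3),
                             xs.reshape(-1, 1), ys.reshape(-1, 1)])
    labels = KMeans(n_clusters=k, n_init=10).fit_predict(feats).reshape(h, w)

    side = np.sqrt(center_frac)         # central window with 25% of the area
    y0, y1 = int(h * (1 - side) / 2), int(h * (1 + side) / 2)
    x0, x1 = int(w * (1 - side) / 2), int(w * (1 + side) / 2)
    winner = np.bincount(labels[y0:y1, x0:x1].ravel()).argmax()
    return labels == winner             # boolean mask of the "center image"
```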
  • Step S204-13 performs contour detection and segmentation processing on the first image after brightness normalization processing
  • Step S41 Convert a preset number of cluster images to HSV channels, and extract S channels;
  • Step S42: performing adaptive binarization processing on the S-channel image;
  • Step S43 performing a closing operation on the result of the adaptive binarization processing and then performing contour detection
  • Step S44: judging whether the area enclosed by each detected contour is greater than a first preset threshold;
  • Step S45: when the judgment is yes, retaining the cluster images whose object area is greater than the first preset threshold, and selecting from them the cluster image with the largest object area as the contour image;
  • Step S46 In the case where the judgment result is no, discard the cluster images whose object area is less than or equal to the first preset threshold.
  • In a specific application scenario, steps S41 to S46 can be implemented as follows: convert the cluster image to HSV channels (H: hue, i.e., which color; S: saturation, i.e., color purity; V: value, i.e., brightness), extract the S channel, and adaptively binarize the S-channel image, with an adaptive window preferably of 15×15 to 27×27 pixels, more preferably 21×21 pixels. Perform a closing operation on the binarization result and then perform contour detection, judging the object enclosed by each detected contour: if the object's area is smaller than a preset ratio of the input-image area (preferably 1/10; the value can also be adjusted in practice, e.g., 1/5, 1/8, 1/9, 1/11 or 2/11), discard the object, otherwise keep it. If no object is kept, the output of contour detection is none; if objects are kept, the one with the largest area among them is recorded as the "contour object", corresponding to the aforementioned contour image.
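  • The following OpenCV sketch shows one way steps S41 to S46 could run on a single cluster image; the 21×21 adaptive window and the 1/10 area ratio are from the text, while the 5×5 closing kernel is an assumption the patent does not state.

```python
# Sketch of steps S41-S46: HSV S channel -> adaptive binarization ->
# closing -> contour detection -> keep contours above 1/10 of image area.
import cv2
import numpy as np

def contour_object(rgb: np.ndarray, min_frac: float = 0.1):
    s = cv2.cvtColor(rgb, cv2.COLOR_RGB2HSV)[:, :, 1]      # saturation
    binary = cv2.adaptiveThreshold(s, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
                                   cv2.THRESH_BINARY, 21, 0)
    closed = cv2.morphologyEx(binary, cv2.MORPH_CLOSE,
                              np.ones((5, 5), np.uint8))   # kernel assumed
    contours, _ = cv2.findContours(closed, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    min_area = min_frac * rgb.shape[0] * rgb.shape[1]
    kept = [c for c in contours if cv2.contourArea(c) > min_area]
    # "None" when nothing is kept; otherwise the largest kept object.
    return max(kept, key=cv2.contourArea) if kept else None
```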
  • Step S204-14 Determine the second image according to the degree of matching between the first image after the clustering and segmentation processing and the first image after the contour detection and segmentation processing.
  • this step S204-14 can be implemented in the following manner:
  • Step S51: when no contour image exists in the first image after contour detection and segmentation processing, using the center image as the third image;
  • Step S52: when a contour image exists in the first image after contour detection and segmentation processing, and the ratio of the area of the intersection region of the contour image and the center image to the area of their union region is greater than or equal to a second preset threshold, taking the image in the intersection region of the contour image and the center image as the fourth image;
  • Step S53: when a contour image exists in the first image after contour detection and segmentation processing, and that area ratio is less than the second preset threshold, taking, among the preset number of cluster images, the cluster image whose intersection region with the contour image has the largest area as the fifth image;
  • Step S54: judging whether the third image, the fourth image, or the fifth image is a single connected region or multiple connected regions;
  • Step S55: when the third image, the fourth image, or the fifth image is a single connected region or multiple connected regions, and its area ratio to the first image is less than a third preset threshold, terminating the image recognition process;
  • Step S56: when the third image, the fourth image, or the fifth image comprises multiple connected regions, and its area ratio to the first image is greater than or equal to the third preset threshold, using the third image, the fourth image, or the fifth image as the second image.
  • That is, when the area of the intersection of the "center image" and the "contour image" is greater than or equal to 80% of the area of their union (80% being the second preset threshold; in other embodiments it may be 70% to 90%), the final segmentation result is the intersection region of the "center image" and the "contour image".
  • The following post-processing is then required: if the segmentation result is a single connected region but its area is less than 10% of the input-image area (10% being the preferred third preset threshold; other values may be chosen in other embodiments according to the actual situation), the above method is considered unable to segment the feces accurately and the analysis terminates. If the segmentation result comprises multiple connected regions, the largest one is retained; if its area is less than 10% of the input-image area, the method is likewise considered unable to segment the feces accurately and the analysis terminates; otherwise, this connected region is the target feces area.
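  • The matching logic of steps S51 to S56 can be condensed as below, with the center and contour results given as boolean masks; the 80% and 10% thresholds follow the text, and the step-S53 branch is simplified here to falling back on the center mask rather than re-selecting among all cluster images.

```python
# Sketch of steps S51-S56: reconcile "center image" and "contour image"
# masks by intersection-over-union, then apply the minimum-area check.
from typing import Optional
import numpy as np

def select_target_mask(center: np.ndarray, contour: Optional[np.ndarray],
                       iou_thresh: float = 0.8,
                       min_frac: float = 0.1) -> Optional[np.ndarray]:
    if contour is None:                          # step S51
        result = center
    else:
        inter = center & contour
        union = center | contour
        iou = inter.sum() / max(union.sum(), 1)
        # step S52 when IoU >= 80%; the step-S53 branch (below threshold)
        # is simplified here: the patent re-selects among cluster images.
        result = inter if iou >= iou_thresh else center
    if result.sum() < min_frac * result.size:    # steps S55/S56
        return None                              # terminate recognition
    return result                                # target feces area
```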
  • the method of extracting the color feature of the target object from the second image involved in step S206 can be implemented in the following manner:
  • Step S206-11 extracting the RGB channel values of all pixels in the second image, and composing each channel value into a vector separately;
  • Step S206-12 Count the pixel values of each channel vector, and select pixel values that meet the preset quantile range;
  • Step S206-13 Calculate the average value of the pixel value of each channel vector based on the selected pixel value satisfying the preset quantile range;
  • Step S206-14 Combine the mean values in the order of R, G, B into a vector with a preset length, and use the vector with a preset length as a color feature.
  • In a specific application scenario, steps S206-11 to S206-14 first extract the RGB channel values of all pixels in the fecal area, each channel's values forming a vector. For each channel vector, the 5% and 95% quantiles are obtained (this quantile range is the preferred preset quantile range in this application and can be adjusted to the actual situation), and only the values greater than the 5% quantile and less than the 95% quantile are kept; this step removes outliers. After the outliers are removed, the mean of each channel vector is calculated, and the results are combined in R, G, B order into a vector of length 3. This vector is the color feature of the stool area.
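  • A short sketch of steps S206-11 to S206-14, assuming NumPy; the 5%/95% quantile trim and the R, G, B ordering follow the text.

```python
# Sketch of the color feature: per-channel quantile-trimmed means of the
# pixels inside the segmented region, combined into a length-3 vector.
import numpy as np

def color_feature(rgb: np.ndarray, mask: np.ndarray,
                  lo: float = 5.0, hi: float = 95.0) -> np.ndarray:
    pixels = rgb[mask].astype(float)        # (N, 3) pixels of the region
    feature = []
    for c in range(3):                      # R, G, B in order
        v = pixels[:, c]
        q_lo, q_hi = np.percentile(v, [lo, hi])
        kept = v[(v > q_lo) & (v < q_hi)]   # drop outliers
        feature.append(kept.mean() if kept.size else v.mean())
    return np.array(feature)                # color feature of the region
```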
  • the method of extracting the texture feature of the target object from the second image in step S206 involved in this embodiment can be implemented in the following manner:
  • Step S206-21 extract the first largest inscribed rectangle from the area of the second image
  • Step S206-22 in the case where the ratio of the area of the first largest inscribed rectangle to the area of the second image is greater than or equal to the fourth preset threshold, perform texture feature extraction on the largest inscribed rectangle;
  • Step S206-23: when the ratio of the area of the first largest inscribed rectangle to the area of the second image is less than the fourth preset threshold, dividing the area of the second image into N regions of equal area, where N is a positive integer greater than or equal to 2;
  • Step S206-24: searching each of the N regions for a second largest inscribed rectangle, and determining the inscribed union rectangles formed by the first largest inscribed rectangle and the multiple second largest inscribed rectangles;
  • Step S206-25 Determine the areas of multiple inscribed union rectangles according to different values of N, and select multiple inscribed union rectangles with the largest area for texture feature extraction.
  • Since the stool area is mostly irregular, the relational information among its internal pixels is very important; deforming the irregular area into a regular rectangle would therefore destroy the internal pixel relationships of the area.
  • If the minimum bounding rectangle of the feces area is used for feature extraction, many points outside the feces area are introduced and there is more noise; if only the largest inscribed rectangle is used, too many internal pixels are discarded and the chance that key features are lost increases. It is therefore proposed to divide the fecal area into multiple regions, find the largest inscribed rectangle in each region, and extract features from these rectangles. Based on this, in a specific application scenario of the present application, the foregoing steps S206-21 to S206-25 may proceed as follows:
  • Step S61: extract the largest inscribed rectangle of the feces area and record it as "inscribed rectangle 0" (corresponding to the first largest inscribed rectangle). If the area of inscribed rectangle 0 is greater than or equal to 60% of the area of the stool area (60% being the fourth preset threshold of this embodiment; the threshold is preferably at least 30%, more preferably 50% to 70%, and most preferably 60%), inscribed rectangle 0 is used for feature extraction; if its area is less than 60% of the stool area, steps S62 to S65 need to be executed. If the area of inscribed rectangle 0 is greater than 80% of the feces area, there is no need to perform steps S62 to S65;
  • Step S63: extract the largest inscribed rectangle from each of the N divided regions, denoted "inscribed rectangle 1", ..., "inscribed rectangle N" (corresponding to the second largest inscribed rectangles); together with "inscribed rectangle 0", obtain the union region of all the "inscribed rectangles", the "inscribed union", and calculate the area of this inscribed-union region;
  • Step S65: perform feature extraction on each "inscribed rectangle". First, extract from the preprocessed input image all image regions corresponding to the "inscribed rectangle" areas, and separate the RGB channels of each image region to obtain each channel's grayscale image; for each channel's grayscale image, extract gray-level co-occurrence matrix features and local binary pattern features.
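  • The patent does not say how a largest inscribed rectangle is computed. One standard option, sketched below under that assumption, is the largest-rectangle-in-histogram dynamic program applied row by row to the boolean region mask.

```python
# Sketch: largest axis-aligned all-True rectangle inside a boolean mask,
# via the classic histogram-stack method (O(rows * cols)).
import numpy as np

def largest_inscribed_rect(mask: np.ndarray):
    """Return (top, left, height, width) of the largest inscribed rectangle."""
    h, w = mask.shape
    heights = np.zeros(w, dtype=int)          # per-column run lengths
    best = (0, 0, 0, 0)
    for row in range(h):
        heights = np.where(mask[row], heights + 1, 0)
        stack = []                            # column indices, heights rising
        for col in range(w + 1):
            cur = heights[col] if col < w else 0   # sentinel flushes stack
            while stack and heights[stack[-1]] >= cur:
                top_h = heights[stack.pop()]
                left = stack[-1] + 1 if stack else 0
                if top_h * (col - left) > best[2] * best[3]:
                    best = (row - top_h + 1, left, top_h, col - left)
            stack.append(col)
    return best
```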
  • The specific way of extracting the gray-level co-occurrence matrix features and the local binary pattern features can be as follows:
  • To extract the gray-level co-occurrence matrix features, first compute the gray-level co-occurrence matrix. The pixel distances to scan are preferably 1, 2, 3 and 4 (an overly large distance requires more computing time), and the scanning angles are 0°, 45°, 90°, 135°, 180°, 225°, 270° and 315° (the angular interval is preferably 30°, 45°, 60° or 90°, more preferably 45°). Features are then extracted from the gray-level co-occurrence matrix, including contrast, inverse difference moment, entropy, autocorrelation, energy, dissimilarity, and angular second moment, and the extracted features are combined into a feature vector, which is the gray-level co-occurrence matrix feature of that channel's grayscale image.
  • To extract the local binary pattern features, first compute the local binary pattern matrix with the following parameters: a circular neighborhood of radius 3 (the radius is preferably 1 to 5, more preferably 3, which gives a better effect) containing 24 sampling points (the number of sampling points is adjusted with the radius; different radii call for different numbers of points, which existing software and algorithms can provide). The local binary pattern operator yields the matrix, from which the local-binary-pattern histogram is then computed (the number of histogram bins is preferably 32 to 256, more preferably 128), each bin counting the pixels that fall into it.
  • For feature merging, the gray-level co-occurrence matrix feature vector and the local binary pattern feature vector of each channel's grayscale image are concatenated into one long vector, recorded as the "channel feature vector"; the channel feature vectors of the R, G and B channels are then concatenated into one long vector, recorded as the "feature vector of inscribed rectangle n".
  • In other words, steps S61 to S65 first extract "inscribed rectangle 0" from the stool area; if its area is greater than or equal to 60% of the stool area, the "feature vector of inscribed rectangle 0" is extracted from it, and if it is less than 60%, steps S62 and S63 divide the stool area into N parts and extract an "inscribed rectangle n" for each part, after which step S65 is performed for each "inscribed rectangle n" to extract its feature vector. Thus, for each input image, N inscribed-rectangle feature vectors are generated.
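  • A sketch of the per-channel texture features using scikit-image follows. The distances 1 to 4, the 45° angle step, and the radius-3/24-point LBP come from the text; scikit-image's property names ("dissimilarity", "ASM", etc.) stand in for the listed statistics (entropy and autocorrelation are not built in), and the "uniform" LBP mapping with 26 bins replaces the preferred 128-bin raw-code histogram.

```python
# Sketch of steps S63-S65 for one channel: GLCM scalar statistics plus a
# local-binary-pattern histogram, concatenated into one feature vector.
import numpy as np
from skimage.feature import graycomatrix, graycoprops, local_binary_pattern

GLCM_PROPS = ("contrast", "homogeneity", "energy",
              "correlation", "dissimilarity", "ASM")

def channel_texture_feature(gray: np.ndarray) -> np.ndarray:
    # gray: one channel's 8-bit grayscale image (uint8).
    glcm = graycomatrix(gray, distances=[1, 2, 3, 4],
                        angles=np.deg2rad([0, 45, 90, 135]),
                        levels=256, symmetric=True, normed=True)
    glcm_feats = np.concatenate(
        [graycoprops(glcm, p).ravel() for p in GLCM_PROPS])

    lbp = local_binary_pattern(gray, P=24, R=3, method="uniform")
    # "uniform" codes occupy P + 2 = 26 levels; the text's preferred
    # 128-bin histogram applies to the raw-code LBP variant instead.
    hist, _ = np.histogram(lbp, bins=26, range=(0, 26), density=True)
    return np.concatenate([glcm_feats, hist])  # one channel feature vector
```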
  • In this embodiment, the Delta-E metric is used to classify the color of stool, and a statistical probability model is used to analyze the traits of stool.
  • CIELAB Delta-E (or ΔE) 2000: CIELAB Delta-E (ΔE) is a standard for measuring color difference issued by the International Commission on Illumination (CIE) that reflects the color difference perceived by the human eye comparatively well; "2000" denotes the version of the standard released in 2000, a revision of the 1994 standard.
  • The calculation of CIELAB Delta-E 2000 follows the published CIEDE2000 formula (illustrated by a figure in the original document, not reproduced here).
  • Other methods for calculating color difference include the Euclidean distance method, CIELAB Delta-E 1976, CIELAB Delta-E 1994, and Delta-E CMC; the 2000 standard is a step-by-step improvement over these methods.
  • In this embodiment, the steps for color classification using Delta-E are as follows.
  • The color classification categories include but are not limited to: yellow, dark green, brown, red and black.
  • Step S71: after extensive experiments, the inventors of this application obtained standard RGB values suitable for color classification of infant feces, specifically: yellow [200,200,0], dark green [0,70,0], brown [180,60,60], red [220,0,0] and black [0,0,0]. Color categories can be added, but each color category needs a corresponding standard RGB value. These standard colors are then converted from RGB space to CIELAB space under the D65 standard illuminant (the color-space conversion is consistent with the image preprocessing) and recorded as the "LAB-space standard colors".
  • Step S72: convert the mean color extracted from the stool area of the input image from RGB space to CIELAB space under the D65 standard illuminant; using the CIELAB Delta-E 2000 standard, compare the stool area's mean color with the standard colors one by one and find the standard color closest to it (the one with the smallest CIELAB Delta-E 2000 result); that standard color is the color of the stool area, and the color classification process ends.
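  • Steps S71 and S72 can be sketched with scikit-image, whose rgb2lab defaults to a D65 white point and which ships a CIEDE2000 implementation; the standard RGB values are the ones listed above.

```python
# Sketch of color classification: nearest LAB-space standard color under
# the CIEDE2000 (Delta-E 2000) difference.
import numpy as np
from skimage.color import rgb2lab, deltaE_ciede2000

STANDARD_RGB = {                       # standard colors from the text
    "yellow": (200, 200, 0), "dark green": (0, 70, 0),
    "brown": (180, 60, 60), "red": (220, 0, 0), "black": (0, 0, 0),
}

def to_lab(rgb):
    return rgb2lab(np.asarray(rgb, float).reshape(1, 1, 3) / 255.0)[0, 0]

STANDARD_LAB = {name: to_lab(c) for name, c in STANDARD_RGB.items()}

def classify_color(mean_rgb) -> str:
    sample = to_lab(mean_rgb)          # mean color of the stool area
    return min(STANDARD_LAB, key=lambda n:
               float(deltaE_ciede2000(STANDARD_LAB[n], sample)))

# e.g. classify_color((210, 195, 40)) is expected to return "yellow"
```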
  • Step S73: classifying the appearance of feces, which mainly comprises five tasks: physical form (trait), presence or absence of milk curds, presence or absence of foam, presence or absence of blood streaks, and presence or absence of mucus.
  • The classification categories of each task, shown in Table 2, are the final results.
  • The specific process has four steps: stool-image data collection and labeling; image-data preprocessing, stool-region segmentation and feature extraction; model training; and using the models to predict on input images.
  • Step S74: data collection and labeling. There is currently no relevant stool-image dataset, so stool images (mainly infant stool images) are collected, and a doctor annotates each image with: stool color; trait (the nine trait types of Table 2); presence or absence of milk curds; presence or absence of foam; presence or absence of blood streaks; and presence or absence of mucus.
  • The dataset images are divided into three parts: a training set, a validation set and a test set.
  • Step S75: perform preprocessing, fecal-area segmentation and feature extraction on each part of the image data: processes 3 and 4 are executed for each image, and each image yields at least one inscribed-rectangle feature vector; the dataset therefore contains inscribed-rectangle feature vectors in three parts: training set, validation set and test set.
  • Step S76: for each classification task, use the training set to train XGBoost (XGB, Extreme Gradient Boosting), a Support Vector Machine (SVM) and a Random Forest (RF); with five classification tasks in this step and three models per task, 15 models are trained in total (note: XGB, SVM and RF are relatively common classifiers). Use the validation set to tune the hyperparameters of the three models, and the test set to evaluate their training results.
  • The specific hyperparameter settings of the three models for each classification task are shown in Table 1.
  • Step S77: using the models to predict on the input image.
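  • For steps S76 and S77, a minimal per-task training and prediction sketch with xgboost and scikit-learn follows; the hyperparameters of Table 1 are not reproduced in this extract, so library defaults are used, and probability averaging is an assumed combination rule the patent does not specify.

```python
# Sketch of steps S76/S77: train XGB, SVM and RF on inscribed-rectangle
# feature vectors for one task, then average their predicted probabilities.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from xgboost import XGBClassifier

def train_task_models(X_train: np.ndarray, y_train: np.ndarray):
    models = [XGBClassifier(), SVC(probability=True),
              RandomForestClassifier()]
    for m in models:
        m.fit(X_train, y_train)      # hyperparameters: library defaults
    return models

def predict_task(models, X: np.ndarray) -> np.ndarray:
    probs = np.mean([m.predict_proba(X) for m in models], axis=0)
    return probs.argmax(axis=1)      # class index per input vector
```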
  • It should be noted that the infant-feces recognition results of this embodiment are limited to recognizing whether an image contains feces and to identifying and classifying the characteristics of the feces in such images. On the basis of these results, a person skilled in the art cannot directly evaluate an infant's health status or diagnose/treat an infant's disease, and the results of this embodiment cannot directly reflect the infant's health status.
  • The model used is an ensemble of the gradient-boosted trees, the support vector machine and the random forest.
  • Through the description of the above embodiments, those skilled in the art can clearly understand that the method of the above embodiment can be implemented by software plus a necessary general-purpose hardware platform, or of course by hardware, although in many cases the former is the better implementation.
  • Based on this understanding, the technical solution of this application, in essence or in the part that contributes beyond the existing technology, can be embodied in the form of a software product: the computer software product is stored in a storage medium (such as a ROM/RAM, magnetic disk, or optical disc) and includes several instructions that enable a terminal device (which may be a mobile phone, a computer, a server, a network device, or the like) to execute the methods described in the embodiments of the present application.
  • In this embodiment, an image recognition device is also provided; it is used to implement the above embodiments and preferred implementations, and what has already been described will not be repeated.
  • As used below, the term "module" can be a combination of software and/or hardware that realizes predetermined functions.
  • Although the devices described in the following embodiments are preferably implemented in software, implementations in hardware, or in a combination of software and hardware, are also possible and conceived.
  • Fig. 3 is a structural block diagram of an image recognition device according to an embodiment of the present application.
  • The device includes: an acquisition module 32, configured to acquire a first image containing a target object; a first extraction module 34, coupled to the acquisition module 32 and configured to extract from the first image a second image corresponding to a designated area, where the designated area is the area of the first image containing the target object; and a second extraction module 36, coupled to the first extraction module 34 and configured to extract the color feature and texture feature of the target object from the second image and, based on these features, identify the color and traits of the target object in the first image.
  • When applied to the scenario of feces-image recognition, the device likewise solves the related-art problem of manually identifying the color and traits of feces in feces images, improving recognition efficiency and saving cost.
  • As before, traits refer to characteristics such as the composition and physical state of the target object, as reflected by its texture features in the image.
  • The target object can be feces, sputum, soil, a tissue sample, etc.; the description here is specific to the recognition of feces images.
  • the characteristics of feces include but are not limited to the composition and/or physical state of feces.
  • The composition of feces includes whether milk curds, foam, blood streaks or mucus are present; the physical state of feces includes loose/muddy, egg-drop-like, watery, mucous, banana-shaped, toothpaste-like, sea-cucumber-shaped, clay-like, tarry, or pellet-like (sheep-dropping) forms.
  • The acquisition module 32 in this embodiment includes: an establishing unit, configured to collect multiple image data containing the target object and establish an image database from the collected data; a training unit, configured to train the first convolutional neural network on the image data in the image database to obtain the second convolutional neural network for classification; and an analysis unit, configured to analyze the input image data through the second convolutional neural network to obtain the first image.
  • The operation performed by the establishing unit corresponds to step S202-11 of the above embodiment, the operation performed by the training unit corresponds to step S202-12, and the operation performed by the analysis unit corresponds to step S202-13.
  • The first extraction module 34 in this embodiment includes: a normalization unit, configured to perform brightness normalization processing on the first image; a cluster segmentation unit, configured to perform clustering segmentation processing on the brightness-normalized first image; a contour detection and segmentation unit, configured to perform contour detection and segmentation processing on the brightness-normalized first image; and a determining unit, configured to determine the second image according to the degree of matching between the first image after clustering segmentation and the first image after contour detection and segmentation.
  • The normalization unit in the present application may further include: a conversion subunit, configured to convert the first image from RGB space to CIELAB space under the D65 standard illuminant; a first normalization subunit, configured to enhance the L channel of the CIELAB-space image by contrast-limited adaptive histogram equalization, completing the brightness normalization of the L channel; and a second normalization subunit, configured to combine the brightness-normalized L channel with the unprocessed A and B channels of the CIELAB-space image and use the resulting RGB-space image as the brightness-normalized first image.
  • The operations performed by the subunits of the normalization unit are equivalent to steps S21 to S23 of the foregoing embodiment and amount to preprocessing of the first image; adjusting the picture size can also be a preprocessing step, but whether the size is adjusted does not affect the output of this step. In addition, because the shooting environment of input images is complex, different images differ in brightness and different areas of the same image differ in brightness, so the input image should be brightness-normalized during preprocessing to improve the segmentation accuracy of the target image and achieve a more accurate segmentation result.
  • The cluster segmentation unit in the present application may further include: a creation subunit, configured to create a feature vector for each pixel of the brightness-normalized first image; a cluster segmentation subunit, configured to cluster the pixels' feature vectors into a preset number of categories through a clustering algorithm and obtain the corresponding preset number of cluster images; a first extraction subunit, configured to extract a third image from the first image, where the third image is centered on the center of the first image and has an area equal to a preset percentage of the first image; and a marking subunit, configured to count, for each of the preset number of cluster images, the number of its pixels inside the third image, and to mark the cluster image with the most pixels in the third image as the center image.
  • The operations performed by the subunits of the cluster segmentation unit are equivalent to steps S31 to S34 of the foregoing embodiment, that is, clustering segmentation of the color region. Since the feces area is generally a continuous color block of relatively uniform color, cluster images of different colors are segmented based on pixel color and position; with 3 clusters, the cluster images corresponding to the three regions may be called "cluster image 1", "cluster image 2" and "cluster image 3". The central region of the segmented image is then extracted (the central region shares the physical center of the complete image), its area being 25% of the complete image (anywhere from 10% to 50% of the full image area is acceptable, preferably 10% to 30%); within this central region, the numbers of pixels of "cluster image 1", "cluster image 2" and "cluster image 3" are counted separately, and the cluster image with the most pixels is the "center image".
  • The contour detection and segmentation unit in the present application may further include: a second extraction subunit, configured to convert the preset number of cluster images to HSV channels and extract the S channel; a first processing subunit, configured to perform adaptive binarization processing on the S-channel image; a contour detection subunit, configured to perform contour detection after a closing operation on the result of the adaptive binarization; a judgment subunit, configured to judge whether the area of each detected contour is greater than the first preset threshold; a second processing subunit, configured, when the judgment is yes, to retain the cluster images whose object area is greater than the first preset threshold and select among them the cluster image with the largest object area as the contour image; and a third processing subunit, configured, when the judgment is no, to discard the cluster images whose object area is less than or equal to the first preset threshold.
  • The operations performed by the subunits of the contour detection and segmentation unit correspond to steps S41 to S46 of the above embodiment and can be implemented as follows: convert the cluster image to HSV channels (H: hue, i.e., which color; S: saturation, i.e., color purity; V: value, i.e., brightness), extract the S channel, and adaptively binarize the S-channel image, the adaptive window size preferably being 15×15 to 27×27 pixels, more preferably 21×21 pixels; perform a closing operation on the binarization result, then perform contour detection and judge the object enclosed by each detected contour: if the object's area is smaller than the preset ratio of the input-image area (preferably 1/10; the value can also be adjusted according to the actual situation, e.g., 1/5, 1/8, 1/9, 1/11 or 2/11), discard it, otherwise keep it. If no object is kept, the output of contour detection is none; among all kept objects, the one with the largest area is recorded as the contour object.
  • The determining unit in the present application may further include: a fourth processing subunit, configured to use the center image as the third image when no contour image exists in the first image after contour detection and segmentation processing; a fifth processing subunit, configured, when a contour image exists and the ratio of the area of the intersection region of the contour image and the center image to the area of their union region is greater than or equal to the second preset threshold, to take the image in the intersection region as the fourth image; a sixth processing subunit, configured, when a contour image exists and that area ratio is less than the second preset threshold, to take, among the preset number of cluster images, the cluster image whose intersection region with the contour image has the largest area as the fifth image; a seventh processing subunit, configured to judge whether the third image, the fourth image, or the fifth image is a single connected region or multiple connected regions; an eighth processing subunit, configured to terminate the image recognition process when the third image, the fourth image, or the fifth image (as a single connected region or multiple connected regions) has an area ratio to the first image that is less than the third preset threshold; and a ninth processing subunit, configured to use the third image, the fourth image, or the fifth image as the second image when it comprises multiple connected regions and its area ratio to the first image is greater than or equal to the third preset threshold.
  • The second extraction module 36 in the present application may further include: a first extraction unit, configured to extract the RGB channel values of all pixels in the second image, each channel's values forming a vector; a first processing unit, configured to count the pixel values of each channel vector and select the pixel values within the preset quantile range; a calculation unit, configured to calculate the mean pixel value of each channel vector from the selected pixel values within the preset quantile range; and a second processing unit, configured to combine the means in R, G, B order into a vector of preset length and use this vector as the color feature.
  • The second extraction module in the present application may further include: a second extraction unit, configured to extract the first largest inscribed rectangle from the area of the second image; a third extraction unit, configured to perform texture feature extraction on that rectangle when the ratio of its area to the area of the second image is greater than or equal to the fourth preset threshold; a fourth processing unit, configured, when that ratio is less than the fourth preset threshold, to divide the area of the second image into N regions of equal area, N being a positive integer greater than or equal to 2; a fifth processing unit, configured to search each of the N regions for a second largest inscribed rectangle and determine the inscribed union rectangles of the first largest inscribed rectangle and the multiple second largest inscribed rectangles; and a sixth processing unit, configured to determine the areas of the inscribed union rectangles for different values of N and select the inscribed union rectangles with the largest area for texture feature extraction.
  • It should be noted that each of the above modules can be implemented by software or hardware. In the latter case it can be implemented in, but is not limited to, the following manner: the above modules are all located in the same processor, or they are distributed across different processors in any combination.
  • the embodiments of the present application also provide a computer-readable storage medium in which a computer program is stored, wherein the computer program is configured to execute the steps in any one of the foregoing method embodiments when running.
  • Optionally, in this embodiment, the foregoing computer-readable storage medium may be configured to store a computer program for executing the steps of the above method embodiments.
  • Optionally, in this embodiment, the foregoing storage medium may include, but is not limited to: a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, an optical disc, and other media that can store computer programs.
  • an embodiment of the present application also provides an electronic device, including a memory and a processor; the memory stores a computer program, and the processor is configured to run the computer program to execute the steps in any of the foregoing method embodiments.
  • the aforementioned electronic device may further include a transmission device and an input/output device, where the transmission device and the input/output device are each connected to the aforementioned processor.
  • the foregoing processor may be configured to execute the following steps through a computer program: S1, acquiring a first image containing a target object; S2, extracting, from the first image, a second image corresponding to a designated area, where the designated area is the area of the first image that contains the target object; S3, extracting color features and texture features of the target object from the second image, and identifying the color and characteristics of the target object in the first image based on the color features and the texture features.
  • the modules or steps of this application can be implemented by a general-purpose computing device; they can be concentrated on a single computing device or distributed across a network composed of multiple computing devices.
  • they can be implemented with program code executable by the computing device, so that they can be stored in a storage device for execution by the computing device, and in some cases the steps can be executed in a different order than described here.
  • the image recognition method and apparatus, storage medium, and electronic apparatus provided by the embodiments of the present application have the following beneficial effect: they solve the problem in the related art of manually identifying color and texture features in an image, improving recognition efficiency and saving costs.


Abstract

The present application provides an image recognition method and apparatus, a storage medium, and an electronic apparatus. The method includes: acquiring a first image containing a target object; extracting, from the first image, a second image corresponding to a designated area, where the designated area is the area of the first image that contains the target object; and extracting color features and texture features of the target object from the second image, and identifying the color and characteristics of the target object in the first image based on the color features and the texture features. The present application solves the problem in the related art of manually identifying color and texture features in an image, improving recognition efficiency and saving costs.

Description

Image recognition method and apparatus, storage medium, and electronic apparatus

Technical Field

This application relates to the field of image recognition, and in particular to an image recognition method and apparatus, a storage medium, and an electronic apparatus.
Background

In the related art, stool characteristics and color are usually assessed by taking an image sample of the stool and judging it with the human eye. This approach requires a technician to perform manual observation, and is therefore costly and inefficient.

No effective solution to the above problem in the related art currently exists.
Summary

Embodiments of the present application provide an image recognition method and apparatus, a storage medium, and an electronic apparatus, so as to at least solve the problem in the related art of manually identifying color and texture features in an image.

According to one embodiment of the present application, an image recognition method is provided, including: acquiring a first image containing a target object; extracting, from the first image, a second image corresponding to a designated area, where the designated area is the area of the first image that contains the target object; and extracting color features and texture features of the target object from the second image, and identifying the color and characteristics of the target object in the first image based on the color features and the texture features.

It should be noted that "characteristics" refers to properties of the target object, such as its composition and physical state, that are reflected by the texture features of the target object in the image. In the specific case of a stool-image recognition method, the characteristics of stool include, but are not limited to, its composition and/or physical state: the composition includes whether milk curds, foam, blood streaks, or mucus are present, and the physical state includes, but is not limited to, muddy, egg-drop-like, watery, mucous, banana-shaped, toothpaste-like, sea-cucumber-shaped, clay-like, tarry, or pellet-like (sheep-dropping) stool.

According to one embodiment of the present application, an image recognition apparatus is provided, including: an acquisition module configured to acquire a first image containing a target object; a first extraction module configured to extract, from the first image, a second image corresponding to a designated area, where the designated area is the area of the first image that contains the target object; and a second extraction module configured to extract color features and texture features of the target object from the second image and identify the color and characteristics of the target object in the first image based on the color features and the texture features.

According to yet another embodiment of the present application, a computer-readable storage medium is further provided, the computer-readable storage medium storing a computer program, where the computer program is configured to, when run, execute the steps in the above image recognition method embodiments.

According to yet another embodiment of the present application, an electronic apparatus is further provided, including a memory and a processor, the memory storing a computer program and the processor being configured to run the computer program so as to execute the steps in the above image recognition method embodiments.

Through the present application, the area corresponding to the target object is extracted from the first image containing the target object as the second image, and the color features and texture features of the target object are extracted from the second image so as to identify the color and characteristics of the target object. This solves the problem in the related art of manually identifying color and texture features in an image, improving recognition efficiency and saving costs.
Brief Description of the Drawings

FIG. 1 is a block diagram of the hardware structure of a terminal for an image recognition method according to an embodiment of the present application;

FIG. 2 is a flowchart of an image recognition method according to an embodiment of the present application;

FIG. 3 is a structural block diagram of an image recognition apparatus according to an embodiment of the present application.
Detailed Description

The present application is described in detail below with reference to the drawings and in combination with the embodiments. It should be noted that, where no conflict arises, the embodiments of the present application and the features in the embodiments may be combined with one another.

It should be noted that the terms "first", "second", and the like in the specification, claims, and drawings of the present application are used to distinguish similar objects and are not necessarily used to describe a particular order or sequence.
The method embodiments provided in the present application may be executed on a terminal, a computer terminal, or a similar computing device. Taking execution on a terminal as an example, FIG. 1 is a block diagram of the hardware structure of a terminal for an image recognition method according to an embodiment of the present application. As shown in FIG. 1, the terminal may include one or more processors 102 (only one is shown in FIG. 1; the processor 102 may include, but is not limited to, a processing device such as a microprocessor (MCU) or a programmable logic device (FPGA)) and a memory 104 configured to store data. Optionally, the terminal may further include a transmission device 106 configured for communication functions and an input/output device 108. A person of ordinary skill in the art will understand that the structure shown in FIG. 1 is merely illustrative and does not limit the structure of the terminal. For example, the terminal may include more or fewer components than shown in FIG. 1, or have a configuration different from that shown in FIG. 1.

The memory 104 may be configured to store computer programs, for example, software programs and modules of application software, such as the computer program corresponding to the image recognition method in the embodiments of the present application. By running the computer program stored in the memory 104, the processor 102 executes various functional applications and data processing, i.e., implements the above method. The memory 104 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory located remotely from the processor 102, which may be connected to the terminal via a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.

The transmission device 106 is configured to receive or send data via a network. Specific examples of the network may include a wireless network provided by the communication provider of the terminal. In one example, the transmission device 106 includes a network interface controller (NIC), which can connect to other network devices through a base station so as to communicate with the Internet. In another example, the transmission device 106 may be a radio frequency (RF) module configured to communicate with the Internet wirelessly.
This embodiment provides an image recognition method running on the above terminal. FIG. 2 is a flowchart of an image recognition method according to an embodiment of the present application. As shown in FIG. 2, the flow includes the following steps:

Step S202: acquire a first image containing a target object;

Step S204: extract, from the first image, a second image corresponding to a designated area, where the designated area is the area of the first image that contains the target object;

Step S206: extract color features and texture features of the target object from the second image, and identify the color and characteristics of the target object in the first image based on the color features and texture features.
Through the above steps S202 to S206, the area corresponding to the target object is extracted from the first image as the second image, and the color features and texture features of the target object are extracted from the second image so as to identify the color and characteristics of the target object. In a specific embodiment of the present application, the method is applied to the scenario of stool image recognition, thereby solving the problem in the related art of manually identifying the color and characteristics of stool in stool images, improving recognition efficiency and saving costs.

It should be noted that characteristics refer to properties, such as the composition and physical state of the target object, that are reflected by its texture features in the image. The target object may be stool, sputum, soil, a tissue sample, and so on. In the specific case of a stool-image recognition method, the characteristics of stool include, but are not limited to, its composition and/or physical state: the composition includes whether milk curds, foam, blood streaks, or mucus are present, and the physical state includes muddy, egg-drop-like, watery, mucous, banana-shaped, toothpaste-like, sea-cucumber-shaped, clay-like, tarry, or pellet-like stool.
In an optional implementation of the present application, the manner of acquiring the first image containing the target object in step S202 may be:

Step S202-11: collect multiple items of image data containing the target object, and build an image database based on the collected image data;

Step S202-12: train a first convolutional neural network based on the image data in the image database to obtain a second convolutional neural network for classification;

Step S202-13: analyze input image data through the second convolutional neural network to obtain the first image.

For steps S202-11 to S202-13, a specific application scenario takes an infant's stool as the target object as an example; of course, the target object may also be soil, adult stool, a sputum sample, a tissue sample, etc., and this is merely illustrative.
The specific process may be: ① input training-set images into a first convolutional neural network (a feed-forward neural network model) whose network parameters are randomly initialized, each training-set image having a corresponding label; compute, through the first convolutional neural network, the predicted labels of the training-set images; ② determine the loss function value of the current first convolutional neural network on the training-set labeling results according to the predicted labels and the actual labels of the training-set images; ③ adjust the network parameters of the current first convolutional neural network through the back-propagation algorithm according to this loss function value, and input the validation-set images into the first convolutional neural network with adjusted parameters; ④ compute, through the first convolutional neural network, the predicted labels of the validation-set images; ⑤ determine the loss function value of the current first convolutional neural network on the validation-set labeling results according to the predicted labels and the actual labels of the validation-set images; ⑥ when the loss function value on the validation set satisfies a preset condition, obtain the target values of the network parameters of the second convolutional neural network, obtain the second convolutional neural network, and determine the image recognition model; otherwise, repeat ①, ②, ③, ④, and ⑤ until the loss function value on the validation set satisfies the preset condition.
In addition, in the present application the convolutional neural network is preferably SqueezeNet or MobileNet, because these two networks have fewer parameters, require fewer computational resources, and run faster on a CPU. Of course, other convolutional neural networks may also be used, such as ResNet, Xception, Inception, DenseNet, LeNet, AlexNet, and many other networks used for classification; any neural network designed for classification is applicable.

Based on this, the purpose of using a convolutional neural network in steps S202-11 to S202-13 is to detect whether an input image contains a stool image. In a specific application scenario, the three steps include:
Step S11 (corresponding to step S202-11): collect data.

Since no relevant stool-image dataset currently exists, a stool-image dataset must first be built. More than two thousand stool images (mainly images of infant stool) were collected from the Internet and other channels; in addition, as controls, more than ten thousand non-stool images in five major categories (household scenes, stool-like foods, infant-care surroundings, portraits, and others) and thirty subcategories may also be collected.
Step S12 (corresponding to step S202-12): train the convolutional neural network.

All images collected in step S11 are divided into three parts: a training set, a validation set, and a test set. The training set is used to train the convolutional neural network, computing and adjusting the weights (parameters) in the network; the validation set is used to tune the hyperparameters used during training (such as the learning rate, regularization coefficient, and dropout rate) so that the network achieves good results on the validation set; finally, the test set is used to test the classification performance of the trained network. To obtain a better classification result in step S12, all images are resized to a uniform size before training. It should be noted that different convolutional neural networks require input images of different sizes: for example, SqueezeNet and MobileNet take 224 and Xception takes 299. The required size is adjusted according to actual needs and is not specifically limited here.

It should be noted that the convolutional neural network before training corresponds to the first convolutional neural network in the present application, and the trained convolutional neural network corresponds to the second convolutional neural network in the present application.
Step S13 (corresponding to step S202-13): use the convolutional neural network for prediction.

First, the input image is resized to the size required by the convolutional neural network; then the resized image is input to the convolutional neural network, and the network outputs the probability that the input image contains "stool". When this probability is greater than a preset probability value (for example, 50%), the input image is considered to contain stool. In other implementations, when using the trained convolutional neural network to judge whether an input image contains "stool", other probabilities may be chosen according to actual needs, such as 40%, 60%, or 65%, which is not specifically limited here.
In another optional implementation of the present application, the manner of extracting the second image corresponding to the designated area from the first image in step S204 may be implemented as follows:

Step S204-11: perform brightness normalization on the first image.

Step S204-11 may further be implemented as follows:
Step S21: convert the first image from RGB space to CIELAB space under the standard illuminant D65;

Step S22: enhance the L channel of the CIELAB-space image by contrast-limited adaptive histogram equalization to complete brightness normalization of the L channel;

Step S23: combine the brightness-normalized L channel with the unprocessed A and B channels of the CIELAB-space image to obtain a brightness-normalized CIELAB-space image, then convert the CIELAB-space image back to RGB space and take the result as the brightness-normalized first image. The A channel represents the range from red to green, with values in [127, -128]; the B channel represents the range from yellow to blue, with values in [127, -128]. This brightness-normalization step is an existing brightness-normalization method; in other implementations, other existing brightness-normalization methods may also be used, which is not specifically limited here.

It should be noted that steps S21 to S23 amount to preprocessing the first image; resizing the image may also be a preprocessing step, but whether or not resizing is done does not affect the output of this step. Furthermore, because input images are captured in complex environments, brightness varies across images and across regions of the same image; the input image should therefore be preprocessed with brightness normalization to improve the segmentation accuracy of the target image and achieve a more precise segmentation result.
Optionally, steps S21 to S23 may be implemented in a specific application scenario as follows:

First, the input image is converted from RGB space to CIELAB space under the standard illuminant D65; this step normalizes brightness across different images. Here, CIE refers to the International Commission on Illumination standard, L represents lightness, A and B represent the associated color ranges, and D65 denotes the CIE standard daylight illuminant with a correlated color temperature of approximately 6500 K (standard white light).

Next, the L channel of the converted CIELAB-space image is enhanced using contrast-limited adaptive histogram equalization (CLAHE); the purpose of this enhancement is to normalize brightness across different regions of the same image. The specific CLAHE process is as follows: compute the gray-level histogram probabilities of the L-channel image, and set a clip limit, preferably 3.0% in the present application. It should be noted that any threshold greater than 0 and less than 100% is usable; for a better result, the threshold is preferably in the range of 2.0% to 5.0%. When the proportion of pixels of a given gray level within an adaptive window (in the present application, the window ranges from 5x5 pixels to 21x21 pixels, preferably 11x11 pixels) relative to all pixels in the window exceeds the given threshold, the gray-level histogram is clipped, and the clipped portion of the histogram is redistributed evenly across all gray levels, completing the brightness normalization of the L channel.

Finally, the brightness-normalized L channel is combined with the unprocessed A and B channels and converted back to an RGB-space image, completing the preprocessing of the image, i.e., the brightness normalization of the first image.
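Steps S21 to S23 can be sketched with OpenCV as follows. This is a minimal illustration rather than the exact procedure of the disclosure: OpenCV's 8-bit LAB conversion is D65-referenced, but its clipLimit is a histogram-height multiplier rather than the percentage quoted above, and tileGridSize counts tiles rather than window pixels, so the parameter values only approximate the preferred 3.0% / 11x11-pixel settings.

```python
# Sketch of steps S21-S23: per-image and per-region brightness normalization
# by applying CLAHE to the L channel in CIELAB space.
import cv2

def normalize_brightness(bgr):
    lab = cv2.cvtColor(bgr, cv2.COLOR_BGR2LAB)   # step S21 (D65-referenced)
    l, a, b = cv2.split(lab)
    # Step S22: contrast-limited adaptive histogram equalization on L only.
    # clipLimit/tileGridSize are illustrative approximations of the clip
    # threshold and adaptive window discussed above.
    clahe = cv2.createCLAHE(clipLimit=3.0, tileGridSize=(11, 11))
    l_eq = clahe.apply(l)
    # Step S23: recombine with the untouched A and B channels, back to RGB.
    return cv2.cvtColor(cv2.merge((l_eq, a, b)), cv2.COLOR_LAB2BGR)
```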
Step S204-12: perform cluster segmentation on the brightness-normalized first image.

Optionally, step S204-12 may be implemented as follows:

Step S31: build a feature vector for each pixel of the brightness-normalized first image;

Step S32: cluster the pixel feature vectors into a preset number of classes through a clustering algorithm, obtaining a preset number of cluster images corresponding to the preset number of classes;

Step S33: extract a third image from the first image, where the third image is centered at the center of the first image and has an area equal to a preset percentage of the first image;

Step S34: separately count the number of pixels of each of the preset number of cluster images falling within the third image, and mark the cluster image with the most pixels in the third image as the center image.
It should be noted that steps S31 to S34 perform cluster segmentation of color regions; the input is the preprocessed first image. Since the present application takes infant stool as an example, what the above steps aim to obtain is the color of the stool region. Because the stool region is generally a continuous color patch of relatively uniform color, cluster segmentation based on pixel color and position is used according to this characteristic. In a specific application scenario this may be:

First, a feature vector is built for each pixel of the brightness-normalized first image; the vector contains the pixel's RGB values and its coordinates (X, Y) in the image, five values in total. A clustering algorithm (KMeans, fuzzy C-means, expectation-maximization clustering with a Gaussian mixture model, or any other clustering method that specifies the number of cluster centers may be used) clusters the feature vectors of all pixels into 3 classes (preferably 2 to 5 classes; for a better color cluster segmentation result, more preferably 3 classes; it should be noted that if the stool region is distinct and the background is clean, 2 classes already give a good clustering result, and in any case using more than 5 classes gives poor results). With 3 classes, the input image is divided into three regions (correspondingly, 2 classes divide the image into two regions, 4 classes into four regions, and 5 classes into five regions), i.e., the input image can be considered to consist of three parts. In the present application, for the cluster images obtained by the above clustering algorithm, taking 3 classes as an example, the cluster images corresponding to the three regions may be called "cluster image 1", "cluster image 2", and "cluster image 3".

Finally, after the input image is divided into three regions, the central region of the segmented image is extracted (the central region is the region at the physical center of the full image); its area is 25% of the full image (anything from 10% to 50% of the full image is acceptable; for a better central-region segmentation result, the region's area is preferably 10% to 30% of the full image area). Within this central region, the pixel counts of "cluster image 1", "cluster image 2", and "cluster image 3" are counted separately, and the object with the most pixels is recorded as the "center image".
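A minimal sketch of steps S31 to S34, assuming scikit-learn's KMeans as the clustering algorithm named above; the five-dimensional pixel features and the 25% central window follow the description, while the function and variable names are hypothetical.

```python
# Sketch of steps S31-S34: cluster pixels on (R, G, B, X, Y) features and
# pick the cluster that dominates the central 25% area as the "center image".
import numpy as np
from sklearn.cluster import KMeans

def cluster_and_pick_center(rgb, k=3, center_frac=0.25):
    h, w, _ = rgb.shape
    ys, xs = np.mgrid[0:h, 0:w]
    feats = np.column_stack([rgb.reshape(-1, 3),
                             xs.reshape(-1, 1),
                             ys.reshape(-1, 1)]).astype(float)
    labels = KMeans(n_clusters=k, n_init=10).fit_predict(feats).reshape(h, w)
    # Central window whose area is center_frac of the full image.
    side = np.sqrt(center_frac)
    y0, y1 = int(h * (1 - side) / 2), int(h * (1 + side) / 2)
    x0, x1 = int(w * (1 - side) / 2), int(w * (1 + side) / 2)
    counts = np.bincount(labels[y0:y1, x0:x1].ravel(), minlength=k)
    center_cluster = counts.argmax()
    return labels, center_cluster   # labels == center_cluster is the center image
```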
Step S204-13: perform contour-detection segmentation on the brightness-normalized first image.

In an optional implementation of this embodiment, step S204-13 may be implemented as follows:

Step S41: convert the preset number of cluster images to HSV channels and extract the S channel;

Step S42: perform adaptive binarization on the S-channel image;

Step S43: apply a closing operation to the result of the adaptive binarization and then perform contour detection;

Step S44: judge whether the area enclosed by each detected contour is greater than a first preset threshold;

Step S45: if the judgment is yes, keep the cluster images whose object area is greater than the first preset threshold, and from the cluster images whose object area is greater than the first preset threshold select the cluster image with the largest object area as the contour image;

Step S46: if the judgment is no, discard the cluster images whose object area is less than or equal to the first preset threshold.
For the contour-detection method of steps S41 to S46, in a specific application scenario it may be implemented as follows: convert the above cluster images to HSV channels (H: hue, i.e., which color; S: saturation, the purity of the color; V: value, i.e., brightness), extract the S channel, and adaptively binarize the S-channel image, with an adaptive window size preferably from 15x15 to 27x27 pixels, more preferably 21x21 pixels; apply a closing operation to the binarization result and then perform contour detection, judging the object contained by each detected contour: when the object's area is smaller than a preset proportion of the input image area (preferably 1/10; the value may also be adjusted to the actual situation, e.g., 1/5, 1/8, 1/9, 1/11, or 2/11), discard the object; otherwise keep it. If no object is kept, the output of contour detection is none; if objects are kept, the object with the largest area among all kept objects is recorded as the "contour object", corresponding to the above contour image.
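The contour-detection branch of steps S41 to S46 might look as follows with OpenCV; the 21x21 adaptive window and the 1/10 area proportion follow the preferred values above, while the 5x5 closing kernel is an assumption the text does not fix.

```python
# Sketch of steps S41-S46: binarize the saturation channel of a cluster
# image, close small gaps, then keep only contours that are large enough.
import cv2
import numpy as np

def find_contour_object(cluster_bgr, min_area_frac=0.1):
    hsv = cv2.cvtColor(cluster_bgr, cv2.COLOR_BGR2HSV)
    s = hsv[:, :, 1]
    # Adaptive binarization with the 21x21 window discussed above.
    binary = cv2.adaptiveThreshold(s, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
                                   cv2.THRESH_BINARY, 21, 0)
    closed = cv2.morphologyEx(binary, cv2.MORPH_CLOSE,
                              np.ones((5, 5), np.uint8))  # kernel size assumed
    contours, _ = cv2.findContours(closed, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    min_area = min_area_frac * s.size
    kept = [c for c in contours if cv2.contourArea(c) >= min_area]
    if not kept:
        return None                          # "no contour object"
    return max(kept, key=cv2.contourArea)    # the "contour object"
```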
Step S204-14: determine the second image according to the degree of match between the cluster-segmented first image and the contour-detection-segmented first image.

Step S204-14 may be implemented as follows:

Step S51: when no contour image exists in the contour-detection-segmented first image, take the center image as a third image;

Step S52: when a contour image exists in the contour-detection-segmented first image, and the ratio of the area of the intersection of the contour image and the center image to the area of the union of the contour image and the center image is greater than or equal to a second preset threshold, take the image of the intersection of the contour image and the center image as a fourth image;

Step S53: when a contour image exists in the contour-detection-segmented first image, and the ratio of the area of the intersection of the contour image and the center image to the area of the union of the contour image and the center image is less than the second preset threshold, take, from the preset number of cluster images, the cluster image with the largest intersection area with the contour image as a fifth image;

Step S54: judge whether the third image, the fourth image, or the fifth image is a single connected region or multiple connected regions;

Step S55: when the third image, the fourth image, or the fifth image is a single connected region or multiple connected regions, and the ratio of its area to that of the first image is less than a third preset threshold, terminate the image recognition process;

Step S56: when the third image, the fourth image, or the fifth image consists of multiple connected regions, and the ratio of its area to that of the first image is greater than or equal to the third preset threshold, take the third image, the fourth image, or the fifth image as the second image.
In a specific application scenario, steps S51 to S56 may be:

If there is no "contour image", the final segmentation result is the "center image" region.

If a "contour object" exists, and the area of the intersection of the "center image" and the "contour image" is greater than or equal to 80% of the area of their union (80% being the second preset threshold; in other implementations the second preset threshold may be 70% to 90%), the final segmentation result is the intersection region of the "center image" and the "contour image";

If a "contour object" exists, and the area of the intersection of the "center image" and the "contour object" is less than 80% (70% to 90%) of the area of their union, the areas of "cluster image 1", "cluster image 2", and "cluster image 3" within the "contour object" are counted, and the object region with the largest area is the final segmentation result.

After this segmentation result is obtained, the following post-processing is required: if the segmentation result is a single connected region but its area is less than 10% of the input image area (i.e., the third preset threshold is preferably 10%; in other implementations another third preset threshold may be chosen according to the actual situation), the above method is considered unable to segment the stool accurately and the analysis terminates; if the segmentation result consists of multiple connected regions, the connected region with the largest area is kept, and if this connected region's area is less than 10% of the input image area, the above method is considered unable to segment the stool accurately and the analysis terminates; otherwise, this connected region is the target stool region.
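The matching logic of steps S51 to S56 reduces to an intersection-over-union test between binary masks plus a connected-region post-check; the sketch below assumes uint8 {0, 255} masks, with fallback_mask standing in for the step-S53 cluster image whose overlap with the contour object is largest (its selection is omitted here).

```python
# Sketch of steps S51-S56: pick a segmentation by the 80% intersection/union
# rule, then keep the largest connected region if it is big enough.
import cv2
import numpy as np

def pick_segmentation(center_mask, contour_mask, fallback_mask,
                      iou_thresh=0.8, min_frac=0.1):
    if contour_mask is None:
        result = center_mask                                   # step S51
    else:
        inter = cv2.bitwise_and(center_mask, contour_mask)
        union = cv2.bitwise_or(center_mask, contour_mask)
        if np.count_nonzero(inter) >= iou_thresh * np.count_nonzero(union):
            result = inter                                     # step S52
        else:
            result = fallback_mask                             # step S53
    # Post-processing: largest connected region, at least min_frac of image.
    n, labels, stats, _ = cv2.connectedComponentsWithStats(
        (result > 0).astype(np.uint8))
    if n <= 1:
        return None                                            # nothing found
    largest = 1 + stats[1:, cv2.CC_STAT_AREA].argmax()         # skip background
    if stats[largest, cv2.CC_STAT_AREA] < min_frac * result.size:
        return None                                            # analysis terminates
    return (labels == largest).astype(np.uint8) * 255          # target region
```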
In another optional implementation of this embodiment, the manner of extracting the color features of the target object from the second image in step S206 may be implemented as follows:

Step S206-11: extract the RGB channel values of all pixels in the second image, and form the values of each channel into a separate vector;

Step S206-12: compute statistics of the pixel values of each channel vector, and select the pixel values that fall within a preset quantile range;

Step S206-13: compute the mean pixel value of each channel vector based on the selected pixel values that fall within the preset quantile range;

Step S206-14: combine the means in the order R, G, B into a vector of preset length, and take the vector of preset length as the color feature.
In a specific application scenario, steps S206-11 to S206-14 may be: first extract the RGB channel values of all pixels within the stool region, the values of each channel forming a separate vector; for each channel vector, obtain the 5% and 95% quantiles (this quantile range is the preferred preset quantile range in the present application and may be adjusted according to the actual situation); for each channel vector, keep the values greater than the 5% quantile and less than the 95% quantile, a step that removes outliers; after outlier removal, compute the mean of each channel vector and combine the results in the order R, G, B into a vector of length 3; this vector is the color feature of the stool region.
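Steps S206-11 to S206-14 amount to a per-channel trimmed mean; below is a minimal sketch, assuming an RGB array and a binary mask of the segmented region (the helper name is hypothetical).

```python
# Sketch of steps S206-11 to S206-14: trimmed-mean RGB color feature of the
# segmented region (pixels strictly between the 5% and 95% quantiles).
import numpy as np

def color_feature(rgb, mask, lo=0.05, hi=0.95):
    means = []
    for ch in range(3):                                 # order: R, G, B
        values = rgb[:, :, ch][mask > 0].astype(float)
        qlo, qhi = np.quantile(values, [lo, hi])
        kept = values[(values > qlo) & (values < qhi)]  # drop outliers
        means.append(kept.mean())
    return np.array(means)                              # length-3 color feature
```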
In yet another optional implementation of this embodiment, the manner of extracting the texture features of the target object from the second image in step S206 may be implemented as follows:

Step S206-21: extract a first maximum inscribed rectangle from the region of the second image;

Step S206-22: when the ratio of the area of the first maximum inscribed rectangle to the area of the second image is greater than or equal to a fourth preset threshold, perform texture feature extraction on the maximum inscribed rectangle;

Step S206-23: when the ratio of the area of the first maximum inscribed rectangle to the area of the second image is less than the fourth preset threshold, divide the region of the second image into N regions of equal area, where N is a positive integer greater than or equal to 2;

Step S206-24: find a second maximum inscribed rectangle in each of the N regions, and determine multiple inscribed-union rectangles of the first maximum inscribed rectangle and the multiple second maximum inscribed rectangles;

Step S206-25: determine the areas of the multiple inscribed unions for different values of N, and select the inscribed-union rectangles with the largest area for texture feature extraction.
It should be noted that the stool region is usually irregular while the relational information among its interior pixels is very important, so: if the stool region is deformed so that the irregular region becomes a regular rectangle, the relational features between interior pixels are destroyed; if the minimum bounding rectangle of the stool region is used for feature extraction, many points outside the stool region are introduced, adding noise; and if only the maximum inscribed rectangle of the stool region is used for feature extraction, too many interior pixels are discarded and the likelihood of discarding key features increases. Therefore, it is proposed to divide the stool region into multiple regions of equal area, find the maximum inscribed rectangle within each region, and further extract features from these rectangles. On this basis, in a specific application scenario of the present application, steps S206-21 to S206-25 may proceed as follows:
Step S61: extract the maximum inscribed rectangle of the stool region, recorded as "inscribed rectangle 0" (corresponding to the first maximum inscribed rectangle). If the area of inscribed rectangle 0 is greater than or equal to 60% of the stool region's area (60% being the fourth preset threshold of this embodiment; the fourth preset threshold is preferably at least 30%, more preferably 50% to 70%, more preferably 60%), inscribed rectangle 0 is used for feature extraction; if the area of inscribed rectangle 0 is less than 60% of the stool region's area, steps S62 to S65 are performed. If the area of inscribed rectangle 0 is greater than 80% of the stool region's area, steps S62 to S65 need not be performed.

Step S62: divide the irregular stool region into N regions of equal area (N = 2, 3, 4, 5, ...; values of 2 or 3 work well; N = 1 means no division): compute the interior center point of the stool region, denoted "point C"; traverse all points on the edge of the stool region and find the point closest to point C, denoted "point 1"; starting from point 1, traverse all edge points clockwise, and at each point compute the area enclosed by the lines from the current point to point C, from point C to point 1, and the region edge; when this area equals 1/N of the stool region's area, the current point is "point n", with n = 2, ..., N; when n = N, the equal-area division stops; connecting point C with points 1 through n divides the stool region into N regions of equal area.

Step S63: extract the maximum inscribed rectangle from each of the N divided regions, denoted "inscribed rectangle 1", ..., "inscribed rectangle N" (corresponding to the second maximum inscribed rectangles); together with "inscribed rectangle 0", obtain the union region of all "inscribed rectangles" (this union region being the "inscribed union") and compute the area of this inscribed-union region.

Step S64: preferably N is at most 10, more preferably 3 or 4 (if N is too large, especially greater than 10, the number of lost pixels gradually increases, affecting texture feature extraction). Traverse N, obtaining an "inscribed union N" for each value of N; for example, for N = 2, 3, or 4, obtain "inscribed union 2", "inscribed union 3", and "inscribed union 4". Compute the area of each "inscribed union" and take the N value corresponding to the "inscribed union" with the largest area; this is the number of equal-area divisions required for the stool region, and all the "inscribed rectangles" composing this "inscribed union" (including inscribed rectangle 0) will be used for feature extraction.
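The building block of step S61, the maximum inscribed (axis-aligned) rectangle of a binary region, can be computed with the classic largest-rectangle-in-histogram technique; the sketch below is one such implementation and is not taken from the disclosure, which does not fix an algorithm. The equal-area partition of step S62 would simply call it once per sub-region mask.

```python
# Sketch for step S61: largest axis-aligned rectangle of foreground pixels
# in a binary mask, via the row-by-row histogram-of-heights method.
import numpy as np

def max_inscribed_rect(mask):
    """mask: 2-D array of {0, 1}. Returns (top, left, height, width)."""
    h, w = mask.shape
    heights = np.zeros(w, dtype=int)      # consecutive 1s ending at this row
    best, best_area = (0, 0, 0, 0), 0
    for row in range(h):
        heights = np.where(mask[row] > 0, heights + 1, 0)
        stack = []                        # column indices, increasing heights
        for col in range(w + 1):
            cur = int(heights[col]) if col < w else 0   # col == w is a sentinel
            while stack and heights[stack[-1]] >= cur:
                top_h = int(heights[stack.pop()])
                left = stack[-1] + 1 if stack else 0
                area = top_h * (col - left)
                if area > best_area:
                    best_area = area
                    best = (row - top_h + 1, left, top_h, col - left)
            stack.append(col)
    return best
```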
Step S65: perform feature extraction on each "inscribed rectangle". First extract, from the preprocessed input image, the image regions corresponding to all "inscribed rectangles"; separate the RGB channels of each image region to obtain a grayscale image per channel; for each channel's grayscale image, extract gray-level co-occurrence matrix (GLCM) features and local binary pattern (LBP) features.

The specific way of extracting the GLCM features and LBP features may be:
Extracting GLCM features: first compute the gray-level co-occurrence matrix. The pixel distances to scan are preferably 1, 2, 3, and 4 (too large a pixel distance requires more computation time), and the angles to scan are 0°, 45°, 90°, 135°, 180°, 225°, 270°, and 315° (the angular interval is preferably 30°, 45°, 60°, or 90°, more preferably 45°). Further, features are extracted from the co-occurrence matrix, including contrast, inverse difference moment, entropy, autocorrelation, energy, dissimilarity, second-order moment, etc.; the extracted features are combined into a feature vector, which is the GLCM feature of the channel's grayscale image.

Extracting LBP features: first compute the local binary pattern matrix, with the following parameters: a rotation-invariant LBP operator with 24 sampling points in a circular region of radius 3 (the radius is preferably 1 to 5, more preferably 3, which gives a better result; the number of sampling points may be adjusted according to the radius, as different radii use different numbers of sampling points, and the specific adjustment can be obtained with existing software or algorithms for computing LBP matrices). Compute the LBP matrix, then compute the LBP histogram (the number of bins is preferably 32 to 256, more preferably 128), and combine the pixel counts of each bin into a vector of length 128 (the length is preferably 32 to 256); this vector is the LBP feature of the channel's grayscale image.
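The GLCM and LBP extraction of the two preceding paragraphs can be sketched with scikit-image as follows (spelled greycomatrix/greycoprops on older versions); the distances 1 to 4, the 45° angle step, the radius-3 / 24-point rotation-invariant LBP, and the 128-bin histogram follow the preferred values, while the entropy feature is computed by hand because graycoprops does not provide it.

```python
# Sketch of the per-channel texture features of step S65: GLCM statistics
# plus a rotation-invariant LBP histogram.
import numpy as np
from skimage.feature import graycomatrix, graycoprops, local_binary_pattern

def texture_features(gray):
    """gray: 2-D uint8 image of one color channel (one inscribed rectangle)."""
    angles = np.deg2rad([0, 45, 90, 135, 180, 225, 270, 315])
    glcm = graycomatrix(gray, distances=[1, 2, 3, 4], angles=angles,
                        levels=256, symmetric=False, normed=True)
    props = ['contrast', 'homogeneity', 'correlation',
             'energy', 'dissimilarity', 'ASM']
    glcm_feats = np.concatenate([graycoprops(glcm, p).ravel() for p in props])
    # Entropy per (distance, angle) pair, computed directly from the matrix.
    entropy = -np.sum(glcm * np.log2(glcm + 1e-12), axis=(0, 1)).ravel()
    # Rotation-invariant LBP: 24 samples on a radius-3 circle, 128-bin histogram.
    lbp = local_binary_pattern(gray, P=24, R=3, method='ror')
    hist, _ = np.histogram(lbp, bins=128)
    return np.concatenate([glcm_feats, entropy, hist])
```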
Finally, feature merging: for each channel's grayscale image, merge its GLCM feature vector and LBP feature vector into one long vector, recorded as the "channel feature vector"; then merge the "channel feature vectors" of the RGB channels into one long vector, recorded as the "feature vector of inscribed rectangle n".

That is, for steps S61 to S65: first extract "inscribed rectangle 0" from the stool region. If the area of "inscribed rectangle 0" is greater than or equal to 60% of the stool region's area, extract "the feature vector of inscribed rectangle 0" from "inscribed rectangle 0"; if the area of "inscribed rectangle 0" is less than 60% of the stool region's area, perform steps S62 to S63 to divide the stool region into N equal parts, extract an "inscribed rectangle n" from each part, and perform step S65 on each "inscribed rectangle n" to extract "the feature vector of inscribed rectangle n". Thus, for each input image, N inscribed-rectangle feature vectors are produced.
It should be noted that, in the present application, the Delta-E metric is used to classify stool color, and statistical probability models are used to analyze stool features.

The Delta-E used is, in full, CIELAB Delta-E 2000 (CIEDE2000). Delta-E, or ΔE, is the standard for measuring color difference published by the International Commission on Illumination; this standard better reflects the color difference perceived by the human eye. "2000" denotes the standard published in 2000, which modified the 1994 standard; the computation of CIELAB Delta-E 2000 follows the procedure defined in that standard. Other methods for computing color difference include the Euclidean distance method, CIELAB Delta-E 1976, CIELAB Delta-E 1994, and Delta-E CMC; the 2000 version is a step-by-step improvement on these methods.
Optionally, the steps of color classification using Delta-E are as follows:

Step S71: the color classes include, but are not limited to, the following: yellow, dark green, brown, red, and black. First, a standard RGB value is set for each color. Through extensive experiments in research and practical application, the inventors of the present application obtained RGB values suitable for classifying infant stool colors, specifically: yellow [200, 200, 0], dark green [0, 70, 0], brown [180, 60, 60], red [220, 0, 0], and black [0, 0, 0]. It should be noted that more color classes can be added, but each color class needs a corresponding standard RGB value. Then, these standard colors are converted from RGB space to CIELAB space under the standard illuminant D65 (the color-space conversion is the same as in image preprocessing), recorded as "LAB-space standard colors".

Step S72: convert the average color extracted from the stool region of the input image from RGB space to CIELAB space under the standard illuminant D65; using the CIELAB Delta-E 2000 standard, compare the stool region's average color with each standard color one by one, and find the standard color closest to the stool region's average color (the smallest CIELAB Delta-E 2000 result); the stool region's color is then this standard color, and the color classification process ends.
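Steps S71 and S72 can be sketched with scikit-image's CIEDE2000 implementation; the standard RGB values are those listed above, and the conversion assumes D65 (scikit-image's default reference white).

```python
# Sketch of steps S71-S72: convert the standard colors and the region's
# average color to CIELAB (D65) and pick the nearest color under CIEDE2000.
import numpy as np
from skimage.color import rgb2lab, deltaE_ciede2000

STANDARD_COLORS = {            # standard RGB values from the description
    'yellow':     (200, 200, 0),
    'dark green': (0, 70, 0),
    'brown':      (180, 60, 60),
    'red':        (220, 0, 0),
    'black':      (0, 0, 0),
}

def classify_color(mean_rgb):
    """mean_rgb: length-3 array in 0-255, the trimmed-mean color feature."""
    lab = rgb2lab(np.asarray(mean_rgb, float)[None, None, :] / 255.0)
    best, best_de = None, np.inf
    for name, rgb in STANDARD_COLORS.items():
        ref = rgb2lab(np.asarray(rgb, float)[None, None, :] / 255.0)
        de = deltaE_ciede2000(lab, ref)[0, 0]
        if de < best_de:
            best, best_de = name, de
    return best
```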
Step S73: classify the appearance of the stool, which mainly involves five tasks: form (characteristics), presence of milk curds, presence of foam, presence of blood streaks, and presence of mucus; the classification categories of each task are shown in the final results of Table 2. The specific process has four steps: collecting and labeling stool image data; preprocessing the image data, segmenting the stool region, and extracting features; training the models; and using the models to predict input images.

Step S74: data collection and labeling. Since no relevant stool image dataset exists, more than two thousand stool images (mainly of infant stool) were first collected from the Internet, the same collection as in step S11 above; professional pediatricians were asked to label each image, the labels including: stool color, form (the 9 form classes in Table 2), presence of milk curds, presence of foam, presence of blood streaks, and presence of mucus. The dataset images are divided into three parts: a training set, a validation set, and a test set.

Step S75: preprocess each part of the image data, segment the stool region, and extract features. Processes 3 and 4 above (preprocessing/segmentation and feature extraction) are applied to each image, and at least one inscribed-rectangle feature vector can be obtained per image; the dataset therefore contains inscribed-rectangle feature vectors for the three parts: the training set, the validation set, and the test set.
Step S76: for each classification task, use the training set to train XGBoost (XGB, Extreme Gradient Boosting), a support vector machine (SVM), and a random forest (RF) (this step covers 5 classification tasks in total, three models per task, i.e., 15 models in total; note that XGB, SVM, and RF are fairly common classifiers); use the validation set to tune the hyperparameters of the three models, and use the test set to evaluate the training results of the three models. The specific hyperparameter settings of the three models for each classification task are given in Table 1:
(Table 1, the per-task hyperparameter settings of the three models, is provided only as an image (PCTCN2020087071-appb-000001) in the original publication and cannot be recovered from this text.)
Step S77: use the models to predict the input image.

After an input image is obtained, first detect whether it contains stool; if no stool is detected, the process terminates; if stool is detected, obtain all the inscribed-rectangle feature vectors of the input image and the average color of the stool region, and then classify the stool color. Input each inscribed-rectangle feature vector into the three classifiers of each task (XGBoost, SVM, RF) to obtain classification results; for each task, collect all the classification results and tally them, and the class with the most votes is the final classification result. If several classes tie for the most votes, then for the four tasks of milk curds, foam, blood streaks, and mucus, return the positive result, and for the form task, return all classes with the same score.

For example: when an image without stool is input and the image is detected not to contain stool, the recognition process terminates. When an image containing stool is input and stool is detected: segment the stool region; extract 3 inscribed-rectangle vectors; input the 3 inscribed-rectangle vectors into the 3 classifiers for detecting milk curds, obtaining 9 classification results in total; tally the class with the most votes among the classification results, which is the classification result for the presence of milk curds; the same process applies to the other classification tasks.
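The voting scheme of step S77 and the example above can be sketched as follows, assuming the fifteen trained classifiers expose a scikit-learn-style predict(); the tie-breaking rules follow the description (positive result for the four presence tasks, all tied classes for the form task), and all names are hypothetical.

```python
# Sketch of step S77's voting: every inscribed-rectangle feature vector is
# scored by each trained classifier, and the class with the most votes wins.
from collections import Counter

def vote(feature_vectors, classifiers, binary_task=True, positive=1):
    votes = Counter()
    for fv in feature_vectors:        # e.g. 3 rectangles x 3 models = 9 votes
        for clf in classifiers:
            votes[clf.predict([fv])[0]] += 1
    top = max(votes.values())
    winners = [cls for cls, n in votes.items() if n == top]
    if len(winners) == 1:
        return winners[0]
    if binary_task:
        return positive               # ties on presence tasks -> "present"
    return winners                    # the form task returns all tied classes
```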
It should be noted that the results of this embodiment regarding infant stool recognition only concern identifying whether an image contains stool and identifying and classifying the characteristics of the stool in a stool image. Based on the results of this embodiment, a person skilled in the art cannot directly assess an infant's state of health or diagnose/treat an infant's disease, and the results of this embodiment do not directly reflect an infant's state of health.

It should be noted that the model used is an ensemble model of gradient-boosted trees, support vector machines, and random forests.

The final stool classification results are shown in Table 2.
(Table 2, the final classification results for the five appearance tasks, is provided only as images (PCTCN2020087071-appb-000002 and PCTCN2020087071-appb-000003) in the original publication and cannot be recovered from this text.)
From the description of the above implementations, a person skilled in the art can clearly understand that the method according to the above embodiments may be implemented by means of software plus the necessary general-purpose hardware platform, or of course by hardware, but in many cases the former is the better implementation. Based on this understanding, the technical solution of the present application, in essence or in the part contributing to the existing art, can be embodied in the form of a software product stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc), including several instructions to cause a terminal device (which may be a mobile phone, a computer, a server, a network device, etc.) to execute the methods described in the embodiments of the present application.

This embodiment further provides an image recognition apparatus for implementing the above embodiments and preferred implementations; what has already been explained will not be repeated. As used below, the term "module" may be a combination of software and/or hardware that implements a predetermined function. Although the apparatus described in the following embodiments is preferably implemented in software, implementation in hardware, or in a combination of software and hardware, is also possible and conceivable.

FIG. 3 is a structural block diagram of an image recognition apparatus according to an embodiment of the present application. As shown in FIG. 3, the apparatus includes: an acquisition module 32 configured to acquire a first image containing a target object; a first extraction module 34, coupled to the acquisition module 32, configured to extract, from the first image, a second image corresponding to a designated area, where the designated area is the area of the first image that contains the target object; and a second extraction module 36, coupled to the first extraction module, configured to extract color features and texture features of the target object from the second image and identify the color and characteristics of the target object in the first image based on the color and texture features. In a specific embodiment of the present application, the method is applied to the scenario of stool image recognition, thereby solving the problem in the related art of manually identifying the color and characteristics of stool in stool images, improving recognition efficiency and saving costs.

It should be noted that characteristics refer to properties, such as the composition and physical state of the target object, that are reflected by its texture features in the image. The target object may be stool, sputum, soil, a tissue sample, and so on. In the specific case of a stool-image recognition method, the characteristics of stool include, but are not limited to, its composition and/or physical state: the composition includes whether milk curds, foam, blood streaks, or mucus are present, and the physical state includes muddy, egg-drop-like, watery, mucous, banana-shaped, toothpaste-like, sea-cucumber-shaped, clay-like, tarry, or pellet-like stool.
Optionally, the acquisition module 32 in this embodiment includes: a building unit configured to collect multiple items of image data containing the target object and build an image database based on the collected image data; a training unit configured to train a first convolutional neural network based on the image data in the image database to obtain a second convolutional neural network for classification; and an analysis unit configured to analyze input image data through the second convolutional neural network to obtain the first image.

It should be noted that the operations performed by the building unit correspond to the method steps of step S202-11 in the above embodiment, the operations performed by the training unit correspond to the method steps of step S202-12 in the above embodiment, and the operations performed by the analysis unit correspond to the method steps of step S202-13 in the above embodiment.
Optionally, the first extraction module 34 in this embodiment includes: a normalization unit configured to perform brightness normalization on the first image; a cluster segmentation unit configured to perform cluster segmentation on the brightness-normalized first image; a contour-detection segmentation unit configured to perform contour-detection segmentation on the brightness-normalized first image; and a determination unit configured to determine the second image according to the degree of match between the cluster-segmented first image and the contour-detection-segmented first image.

The normalization unit in the present application may further include: a conversion subunit configured to convert the first image from RGB space to CIELAB space under the standard illuminant D65; a first normalization subunit configured to enhance the L channel of the CIELAB-space image by contrast-limited adaptive histogram equalization to complete brightness normalization of the L channel; and a second normalization subunit configured to take, as the brightness-normalized first image, the RGB-space image obtained by combining the brightness-normalized L channel with the unprocessed A and B channels of the CIELAB-space image.

It should be noted that the operations performed by the subunits of the normalization unit correspond to steps S21 to S23 in the above embodiment and amount to preprocessing of the first image; resizing the image may also be a preprocessing step, but whether or not resizing is done does not affect the output of this step. Furthermore, because input images are captured in complex environments, with brightness varying across images and across regions of the same image, the input image should be preprocessed with brightness normalization to improve the segmentation accuracy of the target image and achieve a more precise segmentation result.
The cluster segmentation unit in the present application may further include: a building subunit configured to build a feature vector for each pixel of the brightness-normalized first image; a cluster segmentation subunit configured to cluster the pixel feature vectors into a preset number of classes through a clustering algorithm and obtain the preset number of cluster images corresponding to the preset number of classes; a first extraction subunit configured to extract a third image from the first image, where the third image is centered at the center of the first image and has an area equal to a preset percentage of the first image; and a marking subunit configured to separately count the number of pixels of each of the preset number of cluster images falling within the third image and mark the cluster image with the most pixels in the third image as the center image.

It should be noted that the operations performed by the subunits of the cluster segmentation unit correspond to steps S31 to S34 in the above embodiment, i.e., cluster segmentation of color regions. Taking infant stool as an example, since the stool region is generally a continuous color patch of relatively uniform color, cluster segmentation based on pixel color and position is used, according to this characteristic, to segment cluster images of different colors; taking 3 classes as an example, the cluster images corresponding to the three regions may be called "cluster image 1", "cluster image 2", and "cluster image 3". Finally, after the input image is divided into three regions, the central region of the segmented image is extracted (the central region is the region at the physical center of the full image); its area is 25% of the full image (10% to 50% is acceptable; for a better central-region segmentation result, the region's area is preferably 10% to 30% of the full image area). Within this central region, the pixel counts of "cluster image 1", "cluster image 2", and "cluster image 3" are counted separately, and the object with the most pixels is recorded as the "center image".
The contour-detection segmentation unit in the present application may further include: a second extraction subunit configured to convert the preset number of cluster images to HSV channels and extract the S channel; a first processing subunit configured to perform adaptive binarization on the S-channel image; a contour-detection subunit configured to apply a closing operation to the result of the adaptive binarization and then perform contour detection; a judgment subunit configured to judge whether the area of a detected contour is greater than the first preset threshold; a second processing subunit configured to, if the judgment is yes, keep the cluster images whose object area is greater than the first preset threshold and select, from the cluster images whose object area is greater than the first preset threshold, the cluster image with the largest object area as the contour image; and a third processing subunit configured to, if the judgment is no, discard the cluster images whose object area is less than or equal to the first preset threshold.

It should be noted that the operations performed by the subunits of the contour-detection segmentation unit correspond to the method steps S41 to S46 in the above embodiment and, in a specific application scenario, may be implemented as follows: convert the above cluster images to HSV channels (H: hue, i.e., which color; S: saturation, the purity of the color; V: value, i.e., brightness), extract the S channel, and adaptively binarize the S-channel image with an adaptive window size preferably from 15x15 to 27x27 pixels, more preferably 21x21 pixels; apply a closing operation to the binarization result and then perform contour detection, judging the object contained by each detected contour: when the object's area is smaller than a preset proportion of the input image area (preferably 1/10, adjustable in practice, e.g., 1/5, 1/8, 1/9, 1/11, or 2/11), discard the object; otherwise keep it; if no object is kept, the output of contour detection is none; otherwise, among all kept objects, the object with the largest area is recorded as the "contour object", corresponding to the above contour image.
The determination unit in the present application may further include: a fourth processing subunit configured to take the center image as the third image when no contour image exists in the contour-detection-segmented first image; a fifth processing subunit configured to, when a contour image exists in the contour-detection-segmented first image and the ratio of the area of the intersection of the contour image and the center image to the area of the union of the contour image and the center image is greater than or equal to the second preset threshold, take the image of the intersection of the contour image and the center image as the fourth image; a sixth processing subunit configured to, when a contour image exists in the contour-detection-segmented first image and the ratio of the area of the intersection of the contour image and the center image to the area of the union of the contour image and the center image is less than the second preset threshold, take, from the preset number of cluster images, the cluster image with the largest intersection area with the contour image as the fifth image; a seventh processing subunit configured to judge whether the third image, the fourth image, or the fifth image is a single connected region or multiple connected regions; an eighth processing subunit configured to terminate the image recognition process when the third image, the fourth image, or the fifth image is a single connected region or multiple connected regions and the ratio of its area to that of the first image is less than the third preset threshold; and a ninth processing subunit configured to take the third image, the fourth image, or the fifth image as the second image when it consists of multiple connected regions and the ratio of its area to that of the first image is greater than or equal to the third preset threshold.

It should be noted that the operations performed by the subunits of the determination unit correspond to the method steps S51 to S56 in the above embodiment.
Optionally, the second extraction module 36 in the present application may further include: a first extraction unit configured to extract the RGB channel values of all pixels in the second image and form the values of each channel into a separate vector; a first processing unit configured to compute statistics of the pixel values of each channel vector and select the pixel values that fall within the preset quantile range; a calculation unit configured to compute the mean pixel value of each channel vector based on the selected pixel values that fall within the preset quantile range; and a second processing unit configured to combine the means in the order R, G, B into a vector of preset length and take the vector of preset length as the color feature.

Optionally, the second extraction module in the present application may further include: a second extraction unit configured to extract a first maximum inscribed rectangle from the region of the second image; a third extraction unit configured to perform texture feature extraction on the maximum inscribed rectangle when the ratio of the area of the first maximum inscribed rectangle to the area of the second image is greater than or equal to the fourth preset threshold; a fourth processing unit configured to divide the region of the second image into N regions of equal area when the ratio of the area of the first maximum inscribed rectangle to the area of the second image is less than the fourth preset threshold, where N is a positive integer greater than or equal to 2; a fifth processing unit configured to find a second maximum inscribed rectangle in each of the N regions and determine multiple inscribed-union rectangles of the first maximum inscribed rectangle and the multiple second maximum inscribed rectangles; and a sixth processing unit configured to determine the areas of the multiple inscribed-union rectangles for different values of N and select the inscribed-union rectangles with the largest area for texture feature extraction.

It should be noted that the operations performed by the units of the second extraction module correspond to the method steps S206-21 to S206-25 in the above embodiment.
It should be noted that each of the above modules can be implemented by software or hardware. For the latter, this can be achieved in the following way, though it is not limited to it: the above modules are all located in the same processor, or the above modules are located in different processors in any combination.

Embodiments of the present application further provide a computer-readable storage medium storing a computer program, where the computer program is configured to, when run, execute the steps in any one of the above method embodiments.
Optionally, in this embodiment, the computer-readable storage medium may be configured to store a computer program for executing the following steps:

S1: acquire a first image containing a target object;

S2: extract, from the first image, a second image corresponding to a designated area, where the designated area is the area of the first image that contains the target object;

S3: extract color features and texture features of the target object from the second image, and identify the color and characteristics of the target object in the first image based on the color and texture features.

Optionally, in this embodiment, the storage medium may include, but is not limited to: a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, an optical disc, or any other medium that can store a computer program.
Embodiments of the present application further provide an electronic apparatus, including a memory and a processor, the memory storing a computer program and the processor being configured to run the computer program to execute the steps in any one of the above method embodiments.

Optionally, the electronic apparatus may further include a transmission device and an input/output device, where the transmission device is connected to the processor and the input/output device is connected to the processor.

Optionally, in this embodiment, the processor may be configured to execute the following steps through a computer program:

S1: acquire a first image containing a target object;

S2: extract, from the first image, a second image corresponding to a designated area, where the designated area is the area of the first image that contains the target object;

S3: extract color features and texture features of the target object from the second image, and identify the color and characteristics of the target object in the first image based on the color and texture features.

Optionally, for specific examples in this embodiment, reference may be made to the examples described in the above embodiments and optional implementations, which will not be repeated here.
Obviously, a person skilled in the art should understand that the modules or steps of the present application described above can be implemented by a general-purpose computing device; they can be concentrated on a single computing device or distributed across a network composed of multiple computing devices; optionally, they can be implemented with program code executable by the computing device, so that they can be stored in a storage device and executed by the computing device; and in some cases, the steps shown or described can be executed in a different order than here, or they can be made into individual integrated circuit modules, or multiple of their modules or steps can be made into a single integrated circuit module. Thus, the present application is not limited to any specific combination of hardware and software.

The above are only preferred embodiments of the present application and are not intended to limit the present application; for a person skilled in the art, the present application may have various modifications and changes. Any modification, equivalent replacement, improvement, etc. made within the principles of the present application shall be included in the scope of protection of the present application.
Industrial Applicability

As described above, the image recognition method and apparatus, storage medium, and electronic apparatus provided by the embodiments of the present application have the following beneficial effect: they solve the problem in the related art of manually identifying color and texture features in an image, improving recognition efficiency and saving costs.

Claims (20)

  1. An image recognition method, comprising:
    acquiring a first image containing a target object;
    extracting, from the first image, a second image corresponding to a designated area, wherein the designated area is the area of the first image that contains the target object; and
    extracting color features and texture features of the target object from the second image, and identifying the color and characteristics of the target object in the first image based on the color features and the texture features.
  2. The method according to claim 1, wherein acquiring the first image containing the target object comprises:
    collecting multiple items of image data containing the target object, and building an image database based on the collected image data;
    training a first convolutional neural network based on the image data in the image database to obtain a second convolutional neural network for classification; and
    analyzing input image data through the second convolutional neural network to obtain the first image.
  3. The method according to claim 1, wherein extracting, from the first image, the second image corresponding to the designated area comprises:
    performing brightness normalization on the first image;
    performing cluster segmentation on the brightness-normalized first image;
    performing contour-detection segmentation on the brightness-normalized first image; and
    determining the second image according to the degree of match between the cluster-segmented first image and the contour-detection-segmented first image.
  4. The method according to claim 3, wherein performing brightness normalization on the first image comprises:
    converting the first image from RGB space to CIELAB space under the standard illuminant D65;
    enhancing the L channel of the CIELAB-space image by contrast-limited adaptive histogram equalization to complete brightness normalization of the L channel; and
    taking, as the brightness-normalized first image, the RGB-space image obtained by combining the brightness-normalized L channel with the unprocessed A and B channels of the CIELAB-space image.
  5. The method according to claim 3, wherein performing cluster segmentation on the brightness-normalized first image comprises:
    building a feature vector for each pixel of the brightness-normalized first image;
    clustering the pixel feature vectors into a preset number of classes through a clustering algorithm, and obtaining a preset number of cluster images corresponding to the preset number of classes;
    extracting a third image from the first image, wherein the third image is centered at the center of the first image and has an area equal to a preset percentage of the first image; and
    separately counting the number of pixels of each of the preset number of cluster images falling within the third image, and marking the cluster image with the most pixels in the third image as a center image.
  6. The method according to claim 5, wherein performing contour-detection segmentation on the brightness-normalized first image comprises:
    converting the preset number of cluster images to HSV channels, and extracting the S channel;
    performing adaptive binarization on the S-channel image;
    applying a closing operation to the result of the adaptive binarization and then performing contour detection;
    judging whether the area of a detected contour is greater than a first preset threshold;
    if the judgment is yes, keeping the cluster images whose object area is greater than the first preset threshold, and selecting, from the cluster images whose object area is greater than the first preset threshold, the cluster image with the largest object area as a contour image; and
    if the judgment is no, discarding the cluster images whose object area is less than or equal to the first preset threshold.
  7. The method according to claim 6, wherein determining the second image according to the degree of match between the cluster-segmented first image and the contour-detection-segmented first image comprises:
    when the contour image does not exist in the contour-detection-segmented first image, taking the center image as a third image;
    when the contour image exists in the contour-detection-segmented first image, and the ratio of the area of the intersection of the contour image and the center image to the area of the union of the contour image and the center image is greater than or equal to a second preset threshold, taking the image of the intersection of the contour image and the center image as a fourth image;
    when the contour image exists in the contour-detection-segmented first image, and the ratio of the area of the intersection of the contour image and the center image to the area of the union of the contour image and the center image is less than the second preset threshold, taking, from the preset number of cluster images, the cluster image with the largest intersection area with the contour image as a fifth image;
    judging whether the third image, the fourth image, or the fifth image is a single connected region or multiple connected regions;
    when the third image, the fourth image, or the fifth image is a single connected region or multiple connected regions, and the ratio of its area to that of the first image is less than a third preset threshold, terminating the image recognition process; and
    when the third image, the fourth image, or the fifth image consists of multiple connected regions, and the ratio of its area to that of the first image is greater than or equal to the third preset threshold, taking the third image, the fourth image, or the fifth image as the second image.
  8. The method according to claim 1, wherein extracting the color features of the target object from the second image comprises:
    extracting the RGB channel values of all pixels in the second image, and forming the values of each channel into a separate vector;
    computing statistics of the pixel values of each channel vector, and selecting the pixel values that fall within a preset quantile range;
    computing the mean pixel value of each channel vector based on the selected pixel values that fall within the preset quantile range; and
    combining the means in the order R, G, B into a vector of preset length, and taking the vector of preset length as the color feature.
  9. The method according to claim 1, wherein extracting the texture features of the target object from the second image comprises:
    extracting a first maximum inscribed rectangle from the region of the second image;
    when the ratio of the area of the first maximum inscribed rectangle to the area of the second image is greater than or equal to a fourth preset threshold, performing texture feature extraction on the maximum inscribed rectangle;
    when the ratio of the area of the first maximum inscribed rectangle to the area of the second image is less than the fourth preset threshold, dividing the region of the second image into N regions of equal area, wherein N is a positive integer greater than or equal to 2;
    finding a second maximum inscribed rectangle in each of the N regions, and determining multiple inscribed-union rectangles of the first maximum inscribed rectangle and the multiple second maximum inscribed rectangles; and
    determining the areas of the multiple inscribed-union rectangles for different values of N, and selecting the inscribed-union rectangles with the largest area for texture feature extraction.
  10. An image recognition apparatus, comprising:
    an acquisition module configured to acquire a first image containing a target object;
    a first extraction module configured to extract, from the first image, a second image corresponding to a designated area, wherein the designated area is the area of the first image that contains the target object; and
    a second extraction module configured to extract color features and texture features of the target object from the second image, and identify the color and characteristics of the target object in the first image based on the color features and the texture features.
  11. The apparatus according to claim 10, wherein the acquisition module comprises:
    a building unit configured to collect multiple items of image data containing the target object, and build an image database based on the collected image data;
    a training unit configured to train a first convolutional neural network based on the image data in the image database to obtain a second convolutional neural network for classification; and
    an analysis unit configured to analyze input image data through the second convolutional neural network to obtain the first image.
  12. The apparatus according to claim 10, wherein the first extraction module comprises:
    a normalization unit configured to perform brightness normalization on the first image;
    a cluster segmentation unit configured to perform cluster segmentation on the brightness-normalized first image;
    a contour-detection segmentation unit configured to perform contour-detection segmentation on the brightness-normalized first image; and
    a determination unit configured to determine the second image according to the degree of match between the cluster-segmented first image and the contour-detection-segmented first image.
  13. The apparatus according to claim 12, wherein the normalization unit comprises:
    a conversion subunit configured to convert the first image from RGB space to CIELAB space under the standard illuminant D65;
    a first normalization subunit configured to enhance the L channel of the CIELAB-space image by contrast-limited adaptive histogram equalization to complete brightness normalization of the L channel; and
    a second normalization subunit configured to take, as the brightness-normalized first image, the RGB-space image obtained by combining the brightness-normalized L channel with the unprocessed A and B channels of the CIELAB-space image.
  14. The apparatus according to claim 13, wherein the cluster segmentation unit comprises:
    a building subunit configured to build a feature vector for each pixel of the brightness-normalized first image;
    a cluster segmentation subunit configured to cluster the pixel feature vectors into a preset number of classes through a clustering algorithm, and obtain a preset number of cluster images corresponding to the preset number of classes;
    a first extraction subunit configured to extract a third image from the first image, wherein the third image is centered at the center of the first image and has an area equal to a preset percentage of the first image; and
    a marking subunit configured to separately count the number of pixels of each of the preset number of cluster images falling within the third image, and mark the cluster image with the most pixels in the third image as a center image.
  15. The apparatus according to claim 14, wherein the contour-detection segmentation unit comprises:
    a second extraction subunit configured to convert the preset number of cluster images to HSV channels and extract the S channel;
    a first processing subunit configured to perform adaptive binarization on the S-channel image;
    a contour-detection subunit configured to apply a closing operation to the result of the adaptive binarization and then perform contour detection;
    a judgment subunit configured to judge whether the area of a detected contour is greater than a first preset threshold;
    a second processing subunit configured to, if the judgment is yes, keep the cluster images whose object area is greater than the first preset threshold, and select, from the cluster images whose object area is greater than the first preset threshold, the cluster image with the largest object area as a contour image; and
    a third processing subunit configured to, if the judgment is no, discard the cluster images whose object area is less than or equal to the first preset threshold.
  16. The apparatus according to claim 15, wherein the determination unit comprises:
    a fourth processing subunit configured to take the center image as a third image when the contour image does not exist in the contour-detection-segmented first image;
    a fifth processing subunit configured to, when the contour image exists in the contour-detection-segmented first image and the ratio of the area of the intersection of the contour image and the center image to the area of the union of the contour image and the center image is greater than or equal to a second preset threshold, take the image of the intersection of the contour image and the center image as a fourth image;
    a sixth processing subunit configured to, when the contour image exists in the contour-detection-segmented first image and the ratio of the area of the intersection of the contour image and the center image to the area of the union of the contour image and the center image is less than the second preset threshold, take, from the preset number of cluster images, the cluster image with the largest intersection area with the contour image as a fifth image;
    a seventh processing subunit configured to judge whether the third image, the fourth image, or the fifth image is a single connected region or multiple connected regions;
    an eighth processing subunit configured to terminate the image recognition process when the third image, the fourth image, or the fifth image is a single connected region or multiple connected regions and the ratio of its area to that of the first image is less than a third preset threshold; and
    a ninth processing subunit configured to take the third image, the fourth image, or the fifth image as the second image when it consists of multiple connected regions and the ratio of its area to that of the first image is greater than or equal to the third preset threshold.
  17. The apparatus according to claim 10, wherein the second extraction module comprises:
    a first extraction unit configured to extract the RGB channel values of all pixels in the second image, and form the values of each channel into a separate vector;
    a first processing unit configured to compute statistics of the pixel values of each channel vector, and select the pixel values that fall within a preset quantile range;
    a calculation unit configured to compute the mean pixel value of each channel vector based on the selected pixel values that fall within the preset quantile range; and
    a second processing unit configured to combine the means in the order R, G, B into a vector of preset length, and take the vector of preset length as the color feature.
  18. The apparatus according to claim 10, wherein the second extraction module comprises:
    a second extraction unit configured to extract a first maximum inscribed rectangle from the region of the second image;
    a third extraction unit configured to perform texture feature extraction on the maximum inscribed rectangle when the ratio of the area of the first maximum inscribed rectangle to the area of the second image is greater than or equal to a fourth preset threshold;
    a fourth processing unit configured to divide the region of the second image into N regions of equal area when the ratio of the area of the first maximum inscribed rectangle to the area of the second image is less than the fourth preset threshold, wherein N is a positive integer greater than or equal to 2;
    a fifth processing unit configured to find a second maximum inscribed rectangle in each of the N regions, and determine multiple inscribed-union rectangles of the first maximum inscribed rectangle and the multiple second maximum inscribed rectangles; and
    a sixth processing unit configured to determine the areas of the multiple inscribed-union rectangles for different values of N, and select the inscribed-union rectangles with the largest area for texture feature extraction.
  19. A computer-readable storage medium storing a computer program, wherein the computer program is configured to, when run, execute the method according to any one of claims 1 to 9.
  20. An electronic apparatus, comprising a memory and a processor, the memory storing a computer program, and the processor being configured to run the computer program to execute the method according to any one of claims 1 to 9.
PCT/CN2020/087071 2019-04-30 2020-04-26 Image recognition method and apparatus, storage medium, and electronic apparatus WO2020221177A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910365149.2 2019-04-30
CN201910365149.2A CN111860533B (zh) 2019-04-30 2019-04-30 Image recognition method and apparatus, storage medium, and electronic apparatus
