WO2021057395A1 - Heel type identification method, device, and storage medium - Google Patents

Info

Publication number
WO2021057395A1
Authority
WO
WIPO (PCT)
Prior art keywords
heel
image
candidate
network
camera
Application number
PCT/CN2020/112536
Other languages
French (fr)
Chinese (zh)
Inventor
翟懿奎
邓文博
周文略
柯琪锐
甘俊英
应自炉
曾军英
Original Assignee
五邑大学
Application filed by 五邑大学
Publication of WO2021057395A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/30 Noise filtering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V10/443 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/56 Extraction of image or video features relating to colour
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 Computing systems specially adapted for manufacturing

Definitions

  • the present invention relates to the technical field of image processing, in particular to a method, a device and a storage medium for identifying a heel model.
  • the purpose of the present invention is to provide a heel model recognition method, device and storage medium, which can increase the speed of heel recognition, improve the accuracy of heel model recognition, and greatly reduce the workload of merchants.
  • an embodiment of the present invention proposes a method for identifying a heel model, which includes the following steps:
  • collecting a heel image, and preprocessing the heel image to obtain a heel chromaticity diagram includes the following steps:
  • a bilateral filter is used to perform highlight denoising processing on the sharpened image to obtain a heel chromaticity diagram.
  • obtaining a heel image of the side of the heel at a horizontal angle includes the following steps:
  • the feature extraction network includes a residual network and a feature pyramid network; the residual network includes several residual blocks, the feature pyramid network includes several feature pyramid network layers, and each feature pyramid network layer is connected after a residual block.
  • using the regional candidate network to process the feature output map to obtain the candidate image of the heel includes the following steps:
  • the bilinear interpolation method is used to perform alignment processing on the optimal candidate region to obtain a candidate image.
  • performing pixel-level recognition on the candidate image through the output network to obtain the heel height and heel shape of the candidate image includes the following steps:
  • a classification network is used to classify the heel pixels to obtain the heel shape of the candidate image.
  • recognizing the height of the heel and the shape of the heel through the heel database to obtain the model of the heel includes the following steps:
  • an embodiment of the present invention also provides a heel model recognition device, which includes at least one control processor and a memory communicatively connected to the at least one control processor; the memory stores instructions executable by the at least one control processor, and the instructions, when executed by the at least one control processor, enable it to execute the method for identifying a heel model as described in any one of the above.
  • the embodiments of the present invention also provide a computer-readable storage medium that stores computer-executable instructions, and the computer-executable instructions are used to make a computer execute the method for identifying a heel model as described in any one of the above.
  • the technical solutions provided in the embodiments of the present invention have at least the following beneficial effects: collecting heel images and preprocessing them enhances the clarity and resolution of the heel images; using the feature extraction network to perform feature extraction on the heel chromaticity diagram improves the speed of feature extraction and enhances the resolution of features; using the regional candidate network to identify, filter, and classify the feature output map yields a normalized candidate image, which improves the accuracy of candidate region selection and reduces the overlap between candidate regions; performing pixel-level recognition of the candidate image through the output network improves the accuracy of the heel height and heel shape obtained; and storing heel information in the heel database and using it to recognize the heel height and heel shape of the candidate image improves the speed and efficiency of heel recognition.
  • Fig. 1 is an overall flowchart of an embodiment of a method for identifying a heel model of the present invention.
  • the present invention provides a heel model recognition method, device and storage medium, which can increase the speed of heel recognition, improve the accuracy of heel model recognition, and greatly reduce the workload of merchants.
  • an embodiment of the present invention provides a method for identifying a heel model, which includes the following steps:
  • Step S100 Collect a heel image, and preprocess the heel image to obtain a heel chromaticity diagram
  • Step S200 Perform feature extraction on the heel chromaticity map by using a feature extraction network to obtain a feature output map
  • Step S300 Use the regional candidate network to process the feature output map to obtain a candidate image of the heel;
  • Step S400 Perform pixel-level recognition on the candidate image through the output network to obtain the heel height and heel shape of the candidate image;
  • Step S500 Recognizing the height of the heel and the shape of the heel through the heel database to obtain the model of the heel.
  • Step S100 collects a heel image and preprocesses it to enhance the clarity and resolution of the heel image; the preprocessing may include sharpening, denoising, brightness adjustment, saturation adjustment, and the like.
  • Step S200 uses the feature extraction network to perform feature extraction on the heel chromaticity diagram, which improves the speed of feature extraction and enhances the resolution of features.
  • Step S300 uses the regional candidate network to identify, filter, and classify the feature output map to obtain a normalized candidate image, which improves the accuracy of candidate region selection and reduces the overlap between candidate regions.
  • Step S400 performs pixel-level recognition on the candidate image through the output network, which improves the accuracy of obtaining the heel height and heel shape of the candidate image.
  • Step S500 stores the heel information in the heel database and compares the heel height and heel shape of the candidate image against it, which improves the speed and efficiency of heel model recognition.
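  • As a rough illustration of how steps S100 to S500 chain together, the following Python sketch wires five stages into one function. The function name and the five callables are hypothetical placeholders introduced only for this example; the patent does not prescribe any particular interface.

```python
from typing import Any, Callable, List, Tuple


def identify_heel_model(
    heel_image: Any,
    preprocess: Callable[[Any], Any],                  # S100: image -> heel chromaticity diagram
    extract_features: Callable[[Any], Any],            # S200: chromaticity diagram -> feature output map
    propose_candidate: Callable[[Any], Any],           # S300: feature output map -> candidate image
    measure_heel: Callable[[Any], Tuple[float, Any]],  # S400: candidate image -> (heel height, heel shape)
    lookup_models: Callable[[float, Any], List[Any]],  # S500: (height, shape) -> ranked heel models
) -> List[Any]:
    """Run the five stages in order; every stage is supplied by the caller (all hypothetical)."""
    chromaticity = preprocess(heel_image)
    features = extract_features(chromaticity)
    candidate = propose_candidate(features)
    height, shape = measure_heel(candidate)
    return lookup_models(height, shape)
```

  • Passing the stages in as callables keeps the sketch independent of any specific network or database implementation.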
  • Another embodiment of the present invention also provides a method for recognizing a heel model, in which collecting a heel image and preprocessing the heel image to obtain a heel chromaticity diagram includes the following steps:
  • Step S110 Acquire a heel image of the side of the heel at a horizontal angle within the camera distance range and the camera brightness range;
  • Step S120 Use high-pass filtering to perform sharpening processing on the heel image to obtain a sharpened image
  • Step S130 Use a bilateral filter to perform highlight denoising processing on the sharpened image to obtain a heel chromaticity diagram.
  • Step S110 collects the heel image within the camera distance range and the camera brightness range to ensure the clarity of the heel image; the image is captured at a horizontal angle so that the heel shape is not distorted by the viewing angle.
  • The slope and thickness of a high-heeled shoe's heel can be obtained from a side image of the heel, but not from the front, back, or bottom of the heel. Acquiring an image of the side of the heel therefore improves the accuracy of heel recognition.
  • Step S120 High-pass filtering is a filtering method in which high-frequency signals pass normally while low-frequency signals below a set cutoff are blocked or attenuated. That is, high-pass filtering attenuates only the frequency components below the cutoff frequency and passes the components above it, with no phase shift; it is mainly used to eliminate low-frequency noise and is also called a low-cut filter.
  • The sharpened image is obtained as $$y(n,m) = x(n,m) + \lambda\, z(n,m)$$
  • x(n,m) is the heel image
  • y(n,m) is the sharpened image after high-pass filtering
  • z(n,m) is the correction signal, which is generally obtained by high-pass filtering x
  • λ is a scaling factor used to control the enhancement effect.
  • The high-frequency part of the heel image is extracted through high-pass filtering and superimposed on the heel image, which enhances the edge information, sharpens the heel image, and improves its clarity.
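  • The sharpening idea of step S120 (add a scaled high-pass correction signal back onto the heel image) can be sketched in plain Python as follows. The 3x3 Laplacian-style kernel, the edge-replicating border handling, and the default scale of 0.5 are assumptions chosen for illustration; the patent does not specify the high-pass kernel or the scaling factor.

```python
def high_pass(image):
    """Laplacian-style high-pass response of a 2D grayscale image (list of rows).

    Borders are handled by replicating the nearest edge pixel (an assumption).
    """
    rows, cols = len(image), len(image[0])

    def px(r, c):
        r = min(max(r, 0), rows - 1)
        c = min(max(c, 0), cols - 1)
        return image[r][c]

    return [
        [
            4 * px(r, c) - px(r - 1, c) - px(r + 1, c) - px(r, c - 1) - px(r, c + 1)
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def sharpen(image, scale=0.5):
    """Superimpose the scaled high-frequency part on the image: y = x + scale * z."""
    correction = high_pass(image)
    return [
        [x + scale * z for x, z in zip(row_x, row_z)]
        for row_x, row_z in zip(image, correction)
    ]
```

  • Flat regions are left unchanged because their high-pass response is zero, while values on either side of an edge are pushed apart; a real pipeline would normally clip the result back to the valid intensity range.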
  • Step S130 Bilateral filtering is a nonlinear filtering method that combines the spatial proximity of pixels with the similarity of their values. Here it considers both the spatial information and the similarity of the maximum diffuse reflection chromaticity to achieve edge preservation and denoising. It is simple and non-iterative, and its output is a weighted combination of neighboring pixel values. The estimated maximum diffuse reflection chromaticity of each pixel guides the smoothing through the combined value-domain and spatial-domain weights; the sharpened image is denoised with its edges preserved, and the maximum chromaticity value of each pixel is then retrieved. That is, the sharpened image is entered into the following formula for bilateral filtering:
  • $$\sigma_{\max}^{F}(p) = \frac{\sum_{q} D(p,q)\, R\big(\lambda_{\max}(p), \lambda_{\max}(q)\big)\, \sigma_{\max}(q)}{\sum_{q} D(p,q)\, R\big(\lambda_{\max}(p), \lambda_{\max}(q)\big)}$$
  • D is the spatial weight function
  • R is the estimated maximum diffuse reflection chromaticity similarity weight function
  • p is the pixel point after bilateral filtering
  • q is the pixel point of the sharpened image
  • $\sigma_{\max}$ is the maximum diffuse chromaticity when the pixel contains highlights
  • $\lambda_{\max}(x)$ is the estimated maximum diffuse chromaticity
  • $\sigma_{\max}^{F}$ is the filtered maximum chromaticity.
  • The chromaticity value will be reduced, so that the maximum chromaticity of the filtered pixel is closer to the true maximum diffuse chromaticity.
  • The estimated chromaticity of pixels that contain only diffuse reflection will also be affected by pixels that contain specular reflection and become smaller. Therefore, to reduce the influence of pixels with specular components on the chromaticity of pixels with only diffuse reflection, the maximum diffuse reflection chromaticity $\sigma_{\max}$ of a pixel containing highlights is compared with the filtered estimate $\sigma_{\max}^{F}$ of the maximum diffuse reflection chromaticity in the highlight-free state, and the larger value is taken as the maximum chromaticity value of each pixel:
  • $$\sigma_{\max}(p) = \max\big(\sigma_{\max}(p),\ \sigma_{\max}^{F}(p)\big)$$
  • The $\sigma_{\max}$ in the above formula is iterated using bilateral filtering so that the maximum diffuse reflection chromaticity diagram of regions of the same color becomes smooth.
  • After each iteration, the filtered value $\sigma_{\max}^{F}$ is compared with $\sigma_{\max}$; when their difference is less than the threshold at every pixel, the filter value is considered to have converged and the iteration is complete. The threshold can be set according to the actual situation, for example to 0.02.
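  • The iterative procedure of step S130 can be sketched as below, under several assumptions that are not stated in the patent: both weight functions are taken to be Gaussians, the window radius and the two Gaussian widths are arbitrary illustrative values, and the two inputs are plain 2D lists holding, per pixel, the maximum chromaticity (possibly lowered by highlights) and the estimated maximum diffuse chromaticity used as the similarity guide. The names `max_chroma` and `est_diffuse_chroma` are hypothetical.

```python
import math


def bilateral_pass(max_chroma, est_diffuse_chroma, radius=1, sigma_s=1.0, sigma_r=0.05):
    """One bilateral filtering pass over the maximum-chromaticity map.

    Spatial weight: Gaussian of pixel distance. Range weight: Gaussian of the
    difference in estimated maximum diffuse chromaticity (both are assumptions).
    """
    rows, cols = len(max_chroma), len(max_chroma[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            num = den = 0.0
            for rr in range(max(0, r - radius), min(rows, r + radius + 1)):
                for cc in range(max(0, c - radius), min(cols, c + radius + 1)):
                    spatial = math.exp(-((rr - r) ** 2 + (cc - c) ** 2) / (2 * sigma_s ** 2))
                    diff = est_diffuse_chroma[rr][cc] - est_diffuse_chroma[r][c]
                    similarity = math.exp(-(diff ** 2) / (2 * sigma_r ** 2))
                    num += spatial * similarity * max_chroma[rr][cc]
                    den += spatial * similarity
            out[r][c] = num / den  # den >= 1 because the center pixel has weight 1
    return out


def smooth_max_chromaticity(max_chroma, est_diffuse_chroma, threshold=0.02, max_iters=50):
    """Repeat: filter, keep the per-pixel maximum, stop once the filtered map
    exceeds the current map by less than `threshold` at every pixel."""
    current = [row[:] for row in max_chroma]
    for _ in range(max_iters):
        filtered = bilateral_pass(current, est_diffuse_chroma)
        converged = all(
            f - s < threshold
            for f_row, s_row in zip(filtered, current)
            for f, s in zip(f_row, s_row)
        )
        current = [
            [max(s, f) for s, f in zip(s_row, f_row)]
            for s_row, f_row in zip(current, filtered)
        ]
        if converged:
            break
    return current
```

  • Because each pass keeps the larger of the current and filtered values, a pixel's maximum chromaticity can only rise toward that of its similar neighbors, which is the intended effect on highlight pixels.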
  • another embodiment of the present invention also provides a method for recognizing a heel model, in which, within the camera distance range and the camera brightness range, acquiring the heel image of the side of the heel at a horizontal angle includes the following steps:
  • Step S111 Obtain the camera distance of the heel, and if the camera distance is not within the camera distance range, return the camera distance error message, and the camera distance range is 10 cm to 30 cm;
  • Step S112 Obtain the camera brightness of the heel. If the camera brightness is not within the camera brightness range, return the camera brightness error message; the camera brightness range requires the brightness superimposed value of the three color channels of red, green, and blue to be not less than 0.4;
  • Step S113 Obtain the camera focal length of the heel and the heel image of the side of the heel at a horizontal angle.
  • The camera distance range of step S111 is set to 10 cm to 30 cm so that the size of the heel image acquired by the camera device falls within a certain range: the camera device can capture the heel completely, the heel image is neither too small nor too large, and its definition is guaranteed. When the user collects the heel image, the camera distance is obtained; if it is not within the camera distance range, the camera distance error message is returned so that the user can adjust the camera distance accordingly, which ensures the accuracy of heel image acquisition.
  • In step S112, the camera brightness of the heel is expressed by the superimposed brightness values of the heel image in the three color channels of red, green, and blue; that is, the camera brightness is calculated as:
  • $$RGB = R + G + B$$
  • RGB is the brightness superposition value of the three color channels of red, green and blue
  • R is the brightness value of the heel image in the red channel
  • G is the brightness value of the heel image in the green channel
  • B is the brightness value of the heel image in the blue channel.
  • In step S113, the camera focal length is adjusted automatically by the camera equipment according to the actual shooting environment, and obtaining it helps to calculate the actual height of the heel. Capturing the heel at a horizontal angle ensures the accuracy of the heel shape in the heel image, and capturing the side of the heel lets the image contain more characteristic points of the heel, which improves the accuracy of heel recognition.
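  • The capture checks of steps S111 and S112 amount to two range tests. The sketch below assumes that the channel brightness values are normalized to [0, 1] and that the superimposed value is their plain sum; both are assumptions, as are the function name `check_capture` and the message texts.

```python
def check_capture(distance_cm, channel_brightness,
                  min_cm=10.0, max_cm=30.0, min_brightness=0.4):
    """Return a list of error messages; an empty list means the capture is acceptable.

    channel_brightness: (red, green, blue) brightness values, assumed normalized to [0, 1].
    """
    errors = []
    if not (min_cm <= distance_cm <= max_cm):
        errors.append(
            f"camera distance error: {distance_cm} cm is outside {min_cm}-{max_cm} cm"
        )
    red, green, blue = channel_brightness
    superimposed = red + green + blue  # assumed: plain sum of the three channels
    if superimposed < min_brightness:
        errors.append(
            f"camera brightness error: {superimposed:.2f} is below {min_brightness}"
        )
    return errors
```

  • Returning messages instead of raising lets the caller show the user which condition to correct before retaking the image.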
  • another embodiment of the present invention also provides a method for recognizing a shoe heel model, wherein the feature extraction network includes a residual network and a feature pyramid network; the residual network includes a plurality of residual blocks, the feature pyramid network includes several feature pyramid network layers, and the feature pyramid network layers are connected after the residual blocks.
  • The residual network is easy to optimize and can gain accuracy from a considerable increase in depth. Its residual blocks use skip connections, which alleviate the vanishing gradient problem caused by increasing depth in deep neural networks. The feature pyramid is used to detect and recognize objects of different scales: it exploits the inherent multi-scale pyramidal hierarchy of deep convolutional networks to construct the feature pyramid at marginal extra cost, and its top-down architecture with lateral connections builds high-level semantic feature maps at all scales.
  • The residual network includes several residual blocks, and a feature pyramid network layer can be connected after any residual block; that is, at most one feature pyramid network layer is connected after each residual block.
  • The number of feature pyramid network layers is not limited and is set according to the actual number of residual blocks.
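  • To make the two structural ideas concrete, the following toy sketch shows a skip connection and the top-down merging used by a feature pyramid, on single-channel 2D lists. It is a simplification under stated assumptions: each stage output is exactly half the size of the previous one, upsampling is nearest-neighbor, and the convolutions a real network would apply to the lateral and merged maps are omitted. All function names are hypothetical.

```python
def add_maps(a, b):
    """Element-wise sum of two equally sized 2D maps."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def residual_block(x, transform):
    """Skip connection: the block output is the input plus the transformed input."""
    return add_maps(x, transform(x))


def upsample2x(fmap):
    """Nearest-neighbor upsampling by a factor of two in both directions."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(wide[:])
    return out


def top_down_pyramid(stage_maps):
    """Merge residual-stage outputs (ordered fine to coarse) from the top down.

    Each pyramid level is the lateral map plus the upsampled coarser level.
    """
    pyramid = [stage_maps[-1]]
    for lateral in reversed(stage_maps[:-1]):
        pyramid.insert(0, add_maps(lateral, upsample2x(pyramid[0])))
    return pyramid
```

  • The finest pyramid level ends up carrying information from every coarser stage, which is how high-level semantics reach all scales.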
  • another embodiment of the present invention also provides a method for recognizing a shoe heel model, wherein, using a regional candidate network to process the feature output map to obtain a candidate image of the heel includes the following steps:
  • Step S310 Identify the heel in the feature output map by using the regional candidate network to obtain several candidate regions;
  • Step S320 Obtain the confidence of the candidate regions by using a classifier, and screen the candidate regions according to the confidence to obtain the confidence candidate regions;
  • Step S330 Obtain the area overlap degree between the confidence candidate areas, and obtain the area overlap degree data set of the confidence candidate area;
  • Step S340 Use non-maximum value suppression to process the region overlap degree data set to obtain an optimal candidate region
  • Step S350 Perform alignment processing on the optimal candidate region using a bilinear interpolation method to obtain a candidate image.
  • In step S310, the regional candidate network uses a sliding window to traverse all points on the feature output map, evaluates all regions of interest on the feature output map, and obtains several candidate regions.
  • Step S320 uses the classifier to calculate the confidence of the candidate regions and, according to the confidence values, selects the candidate regions with the highest confidence, which are recorded as confidence candidate regions.
  • Step S330 obtains the region overlap between the confidence candidate regions to form the region overlap data set; that is, the region overlap data set of each confidence candidate region contains the overlap values between that region and every other confidence candidate region.
  • Step S340 Non-maximum suppression suppresses elements that are not maxima, which amounts to a local maximum search. "Local" here refers to a neighborhood, which has two variable parameters: its dimension and its size.
  • The specific search steps for the region overlap data set are: start with the confidence candidate region with the highest confidence and record it as the first confidence candidate region; from its region overlap data set, select the confidence candidate regions whose region overlap value is not greater than the threshold, which form the first-type confidence candidate regions. Then select the region with the highest confidence among the first-type confidence candidate regions as the second confidence candidate region, select the second-type confidence candidate regions from its region overlap data set in the same way, and so on.
  • The (N-1)th-type confidence candidate regions include the Nth-type confidence candidate regions.
  • For example, suppose the confidence candidate regions are A, B, C, D, E, and F, with region confidence A < B < C < D < E < F. First, F, which has the highest region confidence, is marked as the first confidence candidate region. From F's region overlap data set, the confidence candidate regions whose region overlap is not greater than the threshold are selected; assuming the overlap values of A and F, B and F, and C and F are not greater than the threshold, A, B, and C are selected and recorded as the first-type confidence candidate regions. Because the region confidence of C is greater than that of A and B, C is marked as the second confidence candidate region.
  • In C's region overlap data set, only the overlap of C with A and of C with B is considered. These two overlap values are filtered, and suppose only A has an overlap value not greater than the threshold; the optimal candidate region is then the merged region of F, C, and A.
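  • The search described for step S340 corresponds to greedy non-maximum suppression. The sketch below assumes that regions are axis-aligned boxes `(x1, y1, x2, y2)` and that region overlap is measured as intersection over union; the patent does not fix either choice, and the threshold of 0.5 is only an illustrative default.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, threshold=0.5):
    """Return the indices of the kept boxes, highest confidence first.

    Repeatedly keep the most confident remaining box and retain only the boxes
    whose overlap with it is not greater than the threshold.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept = []
    while order:
        best = order.pop(0)
        kept.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) <= threshold]
    return kept


# Regions A..F with confidence A < B < C < D < E < F, mirroring the example above.
boxes = [
    (40, 0, 50, 10),  # A
    (21, 0, 31, 10),  # B
    (20, 0, 30, 10),  # C
    (0, 1, 10, 11),   # D
    (1, 1, 11, 11),   # E
    (0, 0, 10, 10),   # F
]
scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
kept = non_max_suppression(boxes, scores)  # indices of F, C, A
```

  • With these illustrative boxes, D and E overlap F heavily and B overlaps C heavily, so the kept regions are F, C, and A, in that order.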
  • The bilinear interpolation of step S350 extends linear interpolation to functions of two variables; its core idea is to perform linear interpolation in each of the two directions in turn.
  • Four fixed-position pixels are selected in the optimal candidate region, and bilinear interpolation is performed on each of them.
  • The bilinear interpolation process is: for each fixed-position pixel, the four heel pixels adjacent to it are selected in the optimal candidate region and linearly interpolated in the horizontal and vertical directions. That is, the weights are determined by the distances between the fixed-position pixel and its four adjacent heel pixels, from which the interpolated value at the fixed position is calculated.
  • In other words, the distances from the fixed-position pixel to its four nearest heel pixels serve as the reference weights, and two rounds of linear interpolation give the interpolated value at the fixed position. The optimal candidate region is then aligned according to the interpolation results of the four fixed-position pixels to obtain a normalized candidate image, which improves the accuracy of heel image recognition.
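  • A minimal sketch of the alignment in step S350 follows. It assumes the region is given in fractional coordinates `(y1, x1, y2, x2)` on a single-channel 2D map and that the four fixed positions are the centers of a 2x2 grid of bins; the patent does not say where the fixed positions lie, so this placement and the function names are assumptions.

```python
import math


def bilinear_sample(fmap, y, x):
    """Interpolate fmap at fractional (y, x) from its four nearest pixels."""
    rows, cols = len(fmap), len(fmap[0])
    y = min(max(y, 0.0), rows - 1.0)
    x = min(max(x, 0.0), cols - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, rows - 1), min(x0 + 1, cols - 1)
    wy, wx = y - y0, x - x0
    # First interpolate horizontally on the two rows, then vertically between them.
    top = (1 - wx) * fmap[y0][x0] + wx * fmap[y0][x1]
    bottom = (1 - wx) * fmap[y1][x0] + wx * fmap[y1][x1]
    return (1 - wy) * top + wy * bottom


def align_region(fmap, box, out_size=2):
    """Resample a fractional box to an out_size x out_size grid of bin-center samples."""
    y1, x1, y2, x2 = box
    bin_h = (y2 - y1) / out_size
    bin_w = (x2 - x1) / out_size
    return [
        [
            bilinear_sample(fmap, y1 + (i + 0.5) * bin_h, x1 + (j + 0.5) * bin_w)
            for j in range(out_size)
        ]
        for i in range(out_size)
    ]
```

  • Sampling at fractional positions avoids rounding the region boundaries to whole pixels, so regions of different sizes map to the same fixed output size without misalignment.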
  • another embodiment of the present invention also provides a method for recognizing a heel model, wherein performing pixel-level recognition on the candidate image through the output network to obtain the heel height and heel shape of the candidate image includes the following steps:
  • Step S410 Perform pixel-level recognition on the candidate image by using a segmentation network to obtain heel pixels in the candidate image;
  • Step S420 Obtain the actual heel height according to the heel pixel points and the camera focal length
  • Step S430 Use a classification network to classify the heel pixels to obtain the heel shape of the candidate image.
  • There are many types of segmentation networks that can be used in step S410. Commonly used segmentation networks include FCN, UNet, SegNet, and DeepLab.
  • The segmentation network identifies and classifies the candidate image at the pixel level, labeling each pixel of the candidate image as either a heel pixel or a non-heel pixel. The pixels of the candidate image are then filtered to obtain the heel pixels.
  • In step S420, the height of the heel in the candidate image can be calculated from the positions of the heel pixels, and the actual height of the heel can then be calculated by the following formula:
  • $$H = \frac{h \cdot D}{f}$$
  • f is the camera focal length
  • h is the heel height of the candidate image
  • D is the camera distance
  • H is the actual heel height
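  • As a worked illustration of step S420, the sketch below measures the heel height in the image from a binary heel mask and converts it to an actual height. It assumes the standard pinhole-camera relation h / f = H / D, with h and f expressed in the same unit (for example pixels) so that H comes out in the unit of D; the mask representation and function names are hypothetical.

```python
def pixel_height(mask):
    """Vertical extent, in pixels, of the heel pixels in a binary mask (list of rows)."""
    heel_rows = [r for r, row in enumerate(mask) if any(row)]
    return heel_rows[-1] - heel_rows[0] + 1 if heel_rows else 0


def actual_heel_height(image_height, camera_distance, focal_length):
    """Pinhole-camera estimate: H = h * D / f (h and f must share the same unit)."""
    return image_height * camera_distance / focal_length


mask = [
    [0, 0, 0],
    [0, 1, 0],
    [0, 1, 1],
    [0, 1, 0],
    [0, 0, 0],
]
h = pixel_height(mask)  # 3 rows contain heel pixels
```

  • For example, a heel spanning 400 pixels, photographed from 20 cm with a focal length equivalent to 1000 pixels, gives an actual height of 400 * 20 / 1000 = 8 cm.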
  • Step S430 There are many types of classification networks; commonly used ones include LeNet-5, AlexNet, ZFNet, VGGNet, GoogLeNet, and ResNet. Classification networks mainly use convolution, parameter sharing, pooling, and similar operations to extract features, and use a fully connected neural network to classify the features, which greatly reduces the amount of computation. Here, a classification network classifies the heel pixels to obtain the shape formed by all pixels of the same heel, that is, the heel shape of the candidate image.
  • another embodiment of the present invention also provides a method for identifying a heel model, wherein identifying the heel height and the heel shape through the heel database to obtain the model of the heel includes the following steps:
  • Step S510 Input the height of the heel and the shape of the heel into the heel database;
  • Step S520 Obtain the heel overlap between the heel height and heel shape and the data in the heel database, then filter and sort the heel overlap values to obtain several heel models arranged by heel overlap value.
  • A large amount of heel information is stored in the heel database of step S520.
  • The heel information includes the real heel height and the real heel shape. The heel overlap between the measured heel height and heel shape and the real heel height and real heel shape in the heel database is calculated; the entries are sorted by heel overlap, several heel models with the largest heel overlap are selected, and they are output in order of overlap value. The user thus obtains the heel models most similar to the heel in the heel image, which improves the accuracy of heel model recognition and greatly reduces the workload of merchants. For example, the several heel models can be set to the ten heel models with the largest heel overlap.
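  • The patent does not define how the heel overlap is computed, so the following ranking sketch uses a purely hypothetical score: the intersection over union of two binary shape masks multiplied by a relative height agreement term. The database layout (a list of dictionaries with `model`, `height`, and `shape` keys) is likewise an assumption made for illustration.

```python
def mask_iou(a, b):
    """Intersection over union of two equally sized binary masks."""
    inter = union = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            inter += 1 if (pa and pb) else 0
            union += 1 if (pa or pb) else 0
    return inter / union if union else 0.0


def heel_overlap(height, shape, entry):
    """Hypothetical overlap score in [0, 1]: height agreement times shape IoU."""
    denom = max(height, entry["height"]) or 1.0
    height_score = 1.0 - abs(height - entry["height"]) / denom
    return height_score * mask_iou(shape, entry["shape"])


def rank_models(height, shape, database, top_n=10):
    """Return up to top_n (score, model) pairs, largest overlap first."""
    scored = [(heel_overlap(height, shape, entry), entry["model"]) for entry in database]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_n]
```

  • The default of `top_n=10` follows the example of returning the ten heel models with the largest heel overlap; any other similarity measure could be substituted for `heel_overlap` without changing the ranking logic.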
  • another embodiment of the present invention also provides a method for identifying a heel model, which includes the following steps:
  • Step S111 Obtain the camera distance of the heel, and if the camera distance is not within the camera distance range, return the camera distance error message, and the camera distance range is 10 cm to 30 cm;
  • Step S112 Obtain the camera brightness of the heel. If the camera brightness is not within the camera brightness range, return the camera brightness error message; the camera brightness range requires the brightness superimposed value of the three color channels of red, green, and blue to be not less than 0.4;
  • Step S113 Obtain the camera focal length of the heel and the heel image of the side of the heel at a horizontal angle
  • Step S120 Use high-pass filtering to perform sharpening processing on the heel image to obtain a sharpened image
  • Step S130 Use a bilateral filter to perform highlight denoising processing on the sharpened image to obtain a heel chromaticity diagram
  • Step S200 Perform feature extraction on the heel chromaticity map by using a feature extraction network to obtain a feature output map
  • Step S310 Identify the heel in the feature output map by using the regional candidate network to obtain several candidate regions;
  • Step S320 Obtain the confidence of the candidate regions by using a classifier, and screen the candidate regions according to the confidence to obtain the confidence candidate regions;
  • Step S330 Obtain the area overlap degree between the confidence candidate areas, and obtain the area overlap degree data set of the confidence candidate area;
  • Step S340 Use non-maximum value suppression to process the region overlap degree data set to obtain an optimal candidate region
  • Step S350 aligning the optimal candidate area by using a bilinear interpolation method to obtain a candidate image
  • Step S410 Perform pixel-level recognition on the candidate image by using a segmentation network to obtain heel pixels in the candidate image;
  • Step S420 Obtain the actual heel height according to the heel pixel points and the camera focal length
  • Step S430 Use a classification network to classify the heel pixels to obtain the heel shape of the candidate image
  • Step S510 Input the height of the heel and the shape of the heel into the heel database
  • Step S520 Obtain the heel overlap between the heel height and heel shape and the data in the heel database, then filter and sort the heel overlap values to obtain several heel models arranged by heel overlap value.
  • The heel image is collected and preprocessed to enhance the clarity and resolution of the heel image; the feature extraction network performs feature extraction on the heel chromaticity diagram, which improves the speed of feature extraction and enhances the resolution of features; the regional candidate network identifies, filters, and classifies the feature output map to obtain a normalized candidate image, which improves the accuracy of candidate region selection and reduces the overlap between candidate regions; pixel-level recognition of the candidate image through the output network improves the accuracy of the heel height and heel shape obtained; and storing heel information in the heel database and using it to recognize the heel height and heel shape of the candidate image improves the speed and efficiency of heel recognition.
  • another embodiment of the present invention also provides a heel model identification device, which includes at least one control processor and a memory communicatively connected to the at least one control processor; the memory stores instructions executable by the at least one control processor, and the instructions, when executed by the at least one control processor, enable it to execute the method for identifying a heel model as described in any one of the above.
  • the identification device includes: one or more control processors and memories, and the control processors and memories may be connected by a bus or in other ways.
  • the memory can be used to store non-transitory software programs, non-transitory computer executable programs and modules, such as program instructions/modules corresponding to the identification method in the embodiment of the present invention.
  • the control processor executes various functional applications and data processing of the identification device by running the non-transitory software programs, instructions, and modules stored in the memory, that is, realizes the identification method of the foregoing method embodiment.
  • the memory may include a program storage area and a data storage area, where the program storage area may store an operating system and an application program required by at least one function; the data storage area may store data created according to the use of the identification device and the like.
  • the memory may include a high-speed random access memory, and may also include a non-transitory memory, such as at least one magnetic disk storage device, a flash memory device, or other non-transitory solid-state storage devices.
  • the memory may optionally include a memory remotely provided with respect to the control processor, and these remote memories may be connected to the identification device via a network. Examples of the aforementioned networks include, but are not limited to, the Internet, corporate intranets, local area networks, mobile communication networks, and combinations thereof.
  • the one or more modules are stored in the memory and, when executed by the one or more control processors, perform the identification method of the above method embodiment, for example the functions of steps S100 to S500, S110 to S130, S111 to S113, S310 to S350, S410 to S430, and S510 to S520.
  • the embodiment of the present invention also provides a computer-readable storage medium that stores computer-executable instructions; when executed by one or more control processors, the instructions cause the one or more control processors to execute the identification method of the above method embodiment, for example the functions of method steps S100 to S500, S110 to S130, S111 to S113, S310 to S350, S410 to S430, and S510 to S520.
  • the device embodiments described above are merely illustrative, and the units described as separate components may or may not be physically separated, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the modules can be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • each implementation manner can be implemented by means of software plus a general hardware platform.
  • All or part of the processes in the methods of the above embodiments can be implemented by computer programs instructing relevant hardware.
  • the programs can be stored in a computer-readable storage medium and, when executed, may include the flows of the above method embodiments.
  • the storage medium may be a magnetic disk, an optical disc, a read-only memory (Read Only Memory, ROM), or a random access memory (Random Access Memory, RAM), etc.

Abstract

A heel type identification method comprises the following steps: acquiring a heel image, performing pre-processing on the heel image, and obtaining a heel chromaticity diagram (S100); performing feature extraction on the heel chromaticity diagram by using a feature extraction network, and obtaining a feature output diagram (S200); processing the feature output diagram by using a region candidate network, and obtaining a heel candidate image (S300); performing pixel-based identification on the candidate image by means of an output network, and obtaining a heel height and a heel shape of the candidate image (S400); and performing, by means of a heel database, identification on the heel height and the heel shape, and obtaining a heel type (S500). The identification method improves the speed of heel identification, and increases accuracy of heel type identification, thereby reducing the workload of merchants.

Description

Method, device and storage medium for identifying a heel model
Technical field
The present invention relates to the technical field of image processing, and in particular to a method, a device and a storage medium for identifying a heel model.
Background
Traditional heel model identification means that a merchant, given a heel image or a physical sample supplied by a customer, roughly judges the category and model of the heel by eye and from memory. With rapid economic development, living standards have risen markedly and the variety of shoes has grown enormously. The corresponding heel market has changed just as much: according to incomplete statistics, the number of heel models updated nationwide each month has reached the order of ten thousand, and visual identification combined with searching from memory can no longer meet market demand. In addition, heel images provided by customers are not captured under a unified acquisition system or acquisition standard, which makes traditional heel identification even harder for merchants: it consumes labor and time, and the accuracy of heel model identification cannot be guaranteed.
Summary of the invention
To solve the above problems, the purpose of the present invention is to provide a heel model identification method, device and storage medium that increase the speed of heel identification, improve the accuracy of heel model identification, and greatly reduce the workload of merchants.
The technical solution adopted by the present invention to solve the problem is as follows. In a first aspect, an embodiment of the present invention proposes a method for identifying a heel model, which includes the following steps:
acquiring a heel image, and preprocessing the heel image to obtain a heel chromaticity diagram;
performing feature extraction on the heel chromaticity diagram by using a feature extraction network to obtain a feature output map;
processing the feature output map by using a region candidate network to obtain a candidate image of the heel;
performing pixel-level recognition on the candidate image through an output network to obtain the heel height and heel shape of the candidate image;
identifying the heel height and the heel shape through a heel database to obtain the model of the heel.
Further, acquiring a heel image and preprocessing the heel image to obtain a heel chromaticity diagram includes the following steps:
acquiring, within the camera distance range and the camera brightness range, a heel image of the side of the heel at a horizontal angle;
sharpening the heel image by high-pass filtering to obtain a sharpened image;
performing highlight denoising on the sharpened image with a bilateral filter to obtain the heel chromaticity diagram.
Further, acquiring, within the camera distance range and the camera brightness range, a heel image of the side of the heel at a horizontal angle includes the following steps:
acquiring the camera distance of the heel, and returning camera distance error information if the camera distance is not within the camera distance range, the camera distance range being 10 cm to 30 cm;
acquiring the camera brightness of the heel, and returning camera brightness error information if the camera brightness is not within the camera brightness range, the camera brightness range being a superimposed brightness value of the red, green and blue color channels of not less than 0.4;
acquiring the camera focal length of the heel and the heel image of the side of the heel at a horizontal angle.
Further, the feature extraction network includes a residual network and a feature pyramid network; the residual network includes several residual blocks, the feature pyramid network includes several feature pyramid network layers, and a feature pyramid network layer is connected after a residual block.
Further, processing the feature output map by using the region candidate network to obtain the candidate image of the heel includes the following steps:
identifying the heel in the feature output map by using the region candidate network to obtain several candidate regions;
obtaining the confidence of each candidate region by using a classifier, and screening the candidate regions according to the confidence to obtain confidence candidate regions;
obtaining the region overlap degree between the confidence candidate regions to obtain a region overlap degree data set for each confidence candidate region;
processing the region overlap degree data sets by non-maximum suppression to obtain the optimal candidate regions;
aligning the optimal candidate regions by bilinear interpolation to obtain the candidate image.
Further, performing pixel-level recognition on the candidate image through the output network to obtain the heel height and heel shape of the candidate image includes the following steps:
performing pixel-level recognition on the candidate image by using a segmentation network to obtain the heel pixels in the candidate image;
obtaining the actual heel height from the heel pixels and the camera focal length;
classifying the heel pixels by using a classification network to obtain the heel shape of the candidate image.
Further, identifying the heel height and the heel shape through the heel database to obtain the model of the heel includes the following steps:
inputting the heel height and the heel shape into the heel database;
obtaining the heel overlap degree between the heel height and heel shape and the data in the heel database, then filtering and sorting the heel overlap degrees to obtain several heel models ordered by heel overlap degree.
In a second aspect, an embodiment of the present invention also proposes a heel model identification device, which includes at least one control processor and a memory communicatively connected to the at least one control processor; the memory stores instructions executable by the at least one control processor, and the instructions are executed by the at least one control processor so that the at least one control processor can execute the heel model identification method described in any one of the above.
In a third aspect, an embodiment of the present invention also proposes a computer-readable storage medium that stores computer-executable instructions, the computer-executable instructions being used to make a computer execute the heel model identification method described in any one of the above.
The technical solutions provided in the embodiments of the present invention have at least the following beneficial effects: the heel image is collected and preprocessed, which enhances the clarity and resolution of the heel image; the feature extraction network performs feature extraction on the heel chromaticity diagram, which improves the speed of feature extraction and enhances the resolution of the features; the region candidate network identifies, filters, and classifies the feature output map to obtain a normalized candidate image, which improves the accuracy of candidate region selection and reduces the overlap between candidate regions; pixel-level recognition of the candidate image through the output network improves the accuracy of the acquired heel height and shape; storing heel information in the heel database and using it to recognize the heel height and shape of the candidate image improves the speed and efficiency of heel recognition.
Description of the drawings
The present invention is further explained below with reference to the drawings and examples.
Fig. 1 is an overall flowchart of an embodiment of the heel model identification method of the present invention.
Detailed description
With rapid economic development, living standards have risen markedly and the variety of shoes has grown enormously. The corresponding heel market has changed just as much: the number of heel models updated nationwide each month has reached the order of ten thousand, and visual identification combined with searching from memory can no longer meet market demand. Moreover, heel images provided by customers are not captured under a unified acquisition system or acquisition standard, which makes identification harder for merchants: it consumes labor and time, and the accuracy of heel model identification cannot be guaranteed.
On this basis, the present invention provides a heel model identification method, device and storage medium that increase the speed of heel identification, improve the accuracy of heel model identification, and greatly reduce the workload of merchants.
The embodiments of the present invention are further described below with reference to the accompanying drawings.
Referring to Fig. 1, an embodiment of the present invention provides a method for identifying a heel model, which includes the following steps:
Step S100: acquire a heel image, and preprocess the heel image to obtain a heel chromaticity diagram;
Step S200: perform feature extraction on the heel chromaticity diagram by using a feature extraction network to obtain a feature output map;
Step S300: process the feature output map by using a region candidate network to obtain a candidate image of the heel;
Step S400: perform pixel-level recognition on the candidate image through an output network to obtain the heel height and heel shape of the candidate image;
Step S500: identify the heel height and the heel shape through a heel database to obtain the model of the heel.
In this embodiment, step S100 acquires a heel image and preprocesses it, which enhances the clarity and resolution of the heel image; the preprocessing may include sharpening, denoising, brightness adjustment, saturation adjustment, and the like. Step S200 uses the feature extraction network to perform feature extraction on the heel chromaticity diagram, which improves the speed of feature extraction and enhances the resolution of the features. Step S300 uses the region candidate network to identify, filter, and classify the feature output map to obtain a normalized candidate image, which improves the accuracy of candidate region selection and reduces the duplication between candidate regions. Step S400 performs pixel-level recognition on the candidate image through the output network, which improves the accuracy of the acquired heel height and heel shape. Step S500 stores heel information in the heel database, and recognizes and compares the heel height and heel shape of the candidate image against it, which improves the speed and efficiency of heel model identification.
Further, another embodiment of the present invention also provides a method for identifying a heel model, in which acquiring a heel image and preprocessing the heel image to obtain a heel chromaticity diagram includes the following steps:
Step S110: acquire, within the camera distance range and the camera brightness range, a heel image of the side of the heel at a horizontal angle;
Step S120: sharpen the heel image by high-pass filtering to obtain a sharpened image;
Step S130: perform highlight denoising on the sharpened image with a bilateral filter to obtain the heel chromaticity diagram.
In this embodiment, step S110 acquires the heel image within the camera distance range and the camera brightness range, which ensures the clarity of the heel image. Acquiring the image at a horizontal angle avoids distortion of the heel shape and ensures that the heel shape in the image is accurate. Acquiring the image of the side of the heel lets the image capture more of the heel's feature points. For a high-heeled shoe, for example, the slope and thickness of the heel can be obtained from a side image, whereas none of the front, back, or bottom views provides these data. Acquiring the side of the heel therefore improves the accuracy of heel identification.
Step S120: High-pass filtering passes high-frequency signals normally while blocking or attenuating low-frequency signals below a set cutoff. In other words, a high-pass filter attenuates only the frequency components below a given cutoff frequency, lets the components above that cutoff through, and introduces no phase shift. It is mainly used to remove low-frequency noise and is also called a low-cut filter.
The heel image is input into the following formula for high-pass filtering:
y(n,m) = x(n,m) + λz(n,m);
where x(n,m) is the heel image, y(n,m) is the sharpened image after high-pass filtering, z(n,m) is the correction signal, generally obtained by high-pass filtering x, and λ is a scaling factor that controls the strength of the enhancement. High-pass filtering extracts the high-frequency part of the heel image, and this high-frequency part is superimposed on the heel image, which strengthens the edge information, sharpens the heel image, and improves its clarity.
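The sharpening formula above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the patent's implementation: the correction signal z is obtained here as the image minus a box-blurred copy, which is one common way to high-pass filter x, and the names `box_blur`, `sharpen`, `lam` and `radius` are hypothetical. A single-channel (grayscale) image is assumed.

```python
import numpy as np

def box_blur(img, radius=1):
    """Low-pass filter: mean over a (2r+1) x (2r+1) window with edge padding."""
    k = 2 * radius + 1
    h, w = img.shape
    padded = np.pad(img, radius, mode="edge")
    out = np.zeros((h, w), dtype=np.float64)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + h, dx:dx + w]
    return out / (k * k)

def sharpen(x, lam=1.0, radius=1):
    """y = x + lam * z, where z = x - lowpass(x) is the high-pass correction signal."""
    x = np.asarray(x, dtype=np.float64)
    z = x - box_blur(x, radius)  # assumption: high-pass as "image minus blur"
    return x + lam * z
```

On a flat region z is zero and the image is unchanged; across an edge, z is positive on the bright side and negative on the dark side, so the contrast at the edge increases, and a larger `lam` increases it further.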
Step S130: Bilateral filtering is a nonlinear filtering method that strikes a compromise between the spatial proximity of pixels and the similarity of their values. It considers both the spatial information and the similarity of the maximum diffuse chromaticity, which achieves edge-preserving denoising. The filter itself is simple and non-iterative, and its output is a weighted combination of neighboring pixel values. The estimated maximum diffuse chromaticity of each pixel is used, through a weighted combination of the range and spatial domains, to guide the smoothing; the sharpened image is denoised while its edges are preserved, and the maximum chromaticity value of each pixel is then obtained again. That is, the sharpened image is input into the following formula for bilateral filtering:
Figure PCTCN2020112536-appb-000001
where D is the spatial weight function, R is the weight function for the similarity of the estimated maximum diffuse chromaticity, p is a pixel after bilateral filtering, q is a pixel of the sharpened image, σ max is the maximum diffuse chromaticity when highlights are present,
Figure PCTCN2020112536-appb-000002
is the maximum diffuse chromaticity in the highlight-free state, and Λmax(x) is the estimated maximum diffuse chromaticity.
After bilateral filtering is applied to the maximum chromaticity values of pixels that contain specular reflection, the chromaticity value decreases, so that the filtered maximum chromaticity is closer to the true maximum diffuse chromaticity. At the same time, the chromaticity estimate of pixels that contain only diffuse reflection is also reduced by the influence of pixels that contain specular reflection. Therefore, to reduce the influence of pixels with a specular component on the chromaticity of purely diffuse pixels, the maximum diffuse chromaticity σ max of the pixel when highlights are present is compared with the estimated maximum diffuse chromaticity in the highlight-free state
Figure PCTCN2020112536-appb-000003
and the larger of the two is taken as the maximum chromaticity value of each pixel:
Figure PCTCN2020112536-appb-000004
Bilateral filtering is applied iteratively to σ max in the above formula, so that the maximum diffuse chromaticity map is smooth within regions of the same color. After each iteration, the filtered value
Figure PCTCN2020112536-appb-000005
is compared with σ max; when their difference at every pixel is less than the threshold, the filtered value is considered to have converged and the iteration ends. The per-pixel threshold can be set according to the actual situation, for example to 0.02.
The bilateral filter thus performs highlight denoising on the sharpened image, and the filtered value
Figure PCTCN2020112536-appb-000006
is iteratively compared with σ max, yielding a heel chromaticity diagram that contains the maximum chromaticity value of each heel pixel. This improves the recognition of heel pixels and thereby the recognition of the heel image.
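The iteration described above (filter, take the per-pixel maximum, stop when the change falls below the threshold) can be sketched as follows. This is a sketch under stated assumptions, not the patent's formula, which is only available as a figure: the filter is a generic joint bilateral filter with Gaussian spatial and similarity weights, `guide` stands in for the estimated maximum diffuse chromaticity that drives the similarity weight, convergence is tested on the largest increase of the filtered value over σ max, and all function names, `sigma_s`, `sigma_r` and `max_iter` are hypothetical. Only the example threshold 0.02 comes from the text.

```python
import numpy as np

def joint_bilateral(values, guide, radius=2, sigma_s=2.0, sigma_r=0.1):
    """Filter `values`; similarity weights are computed on `guide` (both H x W)."""
    h, w = values.shape
    pv = np.pad(values, radius, mode="edge")
    pg = np.pad(guide, radius, mode="edge")
    num = np.zeros((h, w))
    den = np.zeros((h, w))
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            ys, xs = radius + dy, radius + dx
            v = pv[ys:ys + h, xs:xs + w]
            g = pg[ys:ys + h, xs:xs + w]
            # Spatial weight (role of D) and chromaticity-similarity weight (role of R).
            spatial = np.exp(-(dy * dy + dx * dx) / (2.0 * sigma_s ** 2))
            similar = np.exp(-((g - guide) ** 2) / (2.0 * sigma_r ** 2))
            num += spatial * similar * v
            den += spatial * similar
    return num / den

def refine_max_chromaticity(sigma_max, guide, tol=0.02, max_iter=50):
    """Repeat: filter, keep the per-pixel maximum, stop once the increase < tol."""
    sigma = np.asarray(sigma_max, dtype=np.float64).copy()
    guide = np.asarray(guide, dtype=np.float64)
    for _ in range(max_iter):
        filtered = joint_bilateral(sigma, guide)
        increase = float(np.max(filtered - sigma))
        sigma = np.maximum(sigma, filtered)  # keep the larger value at each pixel
        if increase < tol:
            break
    return sigma
```

Because each update keeps the larger of the two values, a pixel's chromaticity never drops below its input value; isolated low values inside a region of similar `guide` are pulled up toward their neighbors.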
Further, another embodiment of the present invention also provides a method for identifying a heel model, in which acquiring, within the camera distance range and the camera brightness range, a heel image of the side of the heel at a horizontal angle includes the following steps:
Step S111: acquire the camera distance of the heel, and return camera distance error information if the camera distance is not within the camera distance range, the camera distance range being 10 cm to 30 cm;
Step S112: acquire the camera brightness of the heel, and return camera brightness error information if the camera brightness is not within the camera brightness range, the camera brightness range being a superimposed brightness value of the red, green and blue color channels of not less than 0.4;
Step S113: acquire the camera focal length of the heel and the heel image of the side of the heel at a horizontal angle.
In this embodiment, the camera distance range in step S111 is set to 10 cm to 30 cm, so that the size of the heel in the captured image stays within a certain range. The camera can then capture the whole heel, the heel is neither too small nor too large in the image, and the image is sharp. When the user captures the heel image, the camera distance is measured; if it is outside the camera distance range, camera distance error information is returned so that the user can adjust the distance, which ensures that the heel image is acquired accurately.
Step S112: The camera brightness of the heel can be expressed by the superimposed brightness of the heel image over the red, green and blue color channels; that is, the camera brightness is calculated as:
Brightness(RGB) = 0.26 × Red(R) + 0.67 × Green(G) + 0.07 × Blue(B);
where Brightness(RGB) is the superimposed brightness value of the red, green and blue color channels, Red(R) is the brightness value of the heel image in the red channel, Green(G) is its brightness value in the green channel, and Blue(B) is its brightness value in the blue channel. When the superimposed brightness value is less than 0.4, the heel image is too dark, which hinders image processing. Therefore, when the user captures the heel image, the camera brightness is measured; if it is less than 0.4, camera brightness error information is returned so that the user can adjust the brightness, which ensures that the heel image is acquired accurately.
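The distance and brightness checks of steps S111 and S112 can be sketched as a small validation routine. The channel weights, the 10 cm to 30 cm range and the 0.4 threshold come from the text; everything else is an assumption made for illustration: channel values are taken to be scaled to [0, 1], the per-channel brightness is taken as the mean over the image, and the function names and error strings are hypothetical.

```python
import numpy as np

MIN_DISTANCE_CM, MAX_DISTANCE_CM = 10.0, 30.0  # camera distance range from the text
MIN_BRIGHTNESS = 0.4                           # brightness threshold from the text

def superimposed_brightness(rgb):
    """0.26*R + 0.67*G + 0.07*B for an H x W x 3 image scaled to [0, 1] (assumption)."""
    rgb = np.asarray(rgb, dtype=np.float64)
    # Assumption: each channel's brightness is its mean over the whole image.
    r, g, b = rgb[..., 0].mean(), rgb[..., 1].mean(), rgb[..., 2].mean()
    return 0.26 * r + 0.67 * g + 0.07 * b

def check_capture(distance_cm, rgb):
    """Return a list of error messages; an empty list means the capture is acceptable."""
    errors = []
    if not (MIN_DISTANCE_CM <= distance_cm <= MAX_DISTANCE_CM):
        errors.append("camera distance error")
    if superimposed_brightness(rgb) < MIN_BRIGHTNESS:
        errors.append("camera brightness error")
    return errors
```

Returning a list rather than raising on the first failure lets a capture interface report both problems at once, so the user can correct distance and lighting in one attempt.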
Step S113: The camera focal length is adjusted automatically by the camera according to the actual shooting environment, and recording it helps calculate the actual height of the heel. Capturing the heel at a horizontal angle ensures that the heel shape in the image is accurate, and capturing the side of the heel lets the image include more of the heel's feature points, which improves the accuracy of heel identification.
Further, another embodiment of the present invention also provides a method for identifying a heel model, in which the feature extraction network includes a residual network and a feature pyramid network; the residual network includes several residual blocks, the feature pyramid network includes several feature pyramid network layers, and a feature pyramid network layer is connected after a residual block.
In this embodiment, the residual network is easy to optimize and can gain accuracy from considerably increased depth; its residual blocks use skip connections, which alleviate the vanishing-gradient problem caused by adding depth to a deep neural network. The feature pyramid is used to detect and recognize objects at different scales. It exploits the inherent multi-scale pyramidal hierarchy of a deep convolutional network to build feature pyramids at marginal extra cost, giving the network a top-down architecture with lateral connections that can build high-level semantic feature maps at all scales. The residual network includes several residual blocks, and a feature pyramid network layer can be connected after any of them; at most, every residual block is followed by a feature pyramid network layer. Feeding the heel chromaticity diagram into the feature extraction network improves the speed of heel feature extraction and enhances the resolution of the heel features. The number of feature pyramid network layers is not limited and is set according to the actual number of residual blocks.
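The structure described above, residual blocks with skip connections followed by a top-down pyramid with lateral connections, can be illustrated with a deliberately simplified NumPy sketch. It is not the network used in the patent: convolutions are replaced by a per-pixel linear map, the 1x1 lateral convolutions of a real feature pyramid network are omitted, and every name (`residual_block`, `downsample2`, `upsample2`, `feature_pyramid`) is hypothetical.

```python
import numpy as np

def residual_block(x, weight):
    """y = relu(x + F(x)); F is a per-pixel linear map standing in for conv layers."""
    fx = np.maximum(x @ weight, 0.0)  # x: H x W x C, weight: C x C
    return np.maximum(x + fx, 0.0)    # skip connection adds the input back

def downsample2(x):
    """Halve the spatial resolution by keeping every second row and column."""
    return x[::2, ::2]

def upsample2(x):
    """Nearest-neighbor upsampling by a factor of two."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

def feature_pyramid(stage_outputs):
    """Top-down pathway: upsample the coarser map and add the finer stage output."""
    pyramid = [stage_outputs[-1]]  # start from the coarsest, most semantic map
    for feat in reversed(stage_outputs[:-1]):
        top = upsample2(pyramid[0])[:feat.shape[0], :feat.shape[1]]
        pyramid.insert(0, feat + top)  # lateral connection
    return pyramid
```

A usage sketch: run the input through three residual stages, downsampling between them, collect each stage's output in a list, and pass that list to `feature_pyramid`. The result has one feature map per stage, each at that stage's resolution, with coarse-scale information merged into the finer maps.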
Further, another embodiment of the present invention also provides a method for identifying a heel model, in which processing the feature output map by using the region candidate network to obtain the candidate image of the heel includes the following steps:
Step S310: identify the heel in the feature output map by using the region candidate network to obtain several candidate regions;
Step S320: obtain the confidence of each candidate region by using a classifier, and screen the candidate regions according to the confidence to obtain confidence candidate regions;
Step S330: obtain the region overlap degree between the confidence candidate regions to obtain a region overlap degree data set for each confidence candidate region;
Step S340: process the region overlap degree data sets by non-maximum suppression to obtain the optimal candidate regions;
Step S350: align the optimal candidate regions by bilinear interpolation to obtain the candidate image.
In this embodiment, in step S310 the region proposal network slides a window over every point of the feature output map, evaluates all regions of interest on the map, and obtains several candidate regions. In step S320 a classifier computes the confidence of each candidate region, and the several candidate regions with the highest confidence are selected and recorded as confidence candidate regions. In step S330 the region overlap degree between the confidence candidate regions is obtained, giving a region overlap data set for each confidence candidate region; that is, the data set of each confidence candidate region contains the overlap degree between that region and every other confidence candidate region.
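The region overlap data set of step S330 can be illustrated with a short sketch. It assumes that the overlap degree is the intersection-over-union (IoU) of axis-aligned boxes given as (x1, y1, x2, y2); the source does not state which overlap measure is used, and the function names are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two boxes (x1, y1, x2, y2), x1 < x2, y1 < y2."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def overlap_table(boxes):
    """For each named region, its overlap degree with every other region."""
    return {a: {b: iou(boxes[a], boxes[b]) for b in boxes if b != a}
            for a in boxes}


table = overlap_table({
    "A": (0, 0, 2, 2),
    "B": (1, 1, 3, 3),
    "C": (5, 5, 6, 6),
})
```

Here `table["A"]` plays the role of the region overlap data set of region A: one overlap value per other confidence candidate region.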
In step S340, non-maximum suppression suppresses elements that are not maxima, which amounts to a local maximum search. "Local" here means a neighbourhood with two variable parameters: its dimensionality and its size. The search over the region overlap data sets proceeds as follows. Starting from the confidence candidate region with the highest confidence, recorded as the first confidence candidate region, the regions whose overlap with it does not exceed the threshold are selected from its region overlap data set as the first class of confidence candidate regions. The region with the highest confidence in the first class is then taken as the second confidence candidate region, and the regions whose overlap with it does not exceed the threshold are selected from its region overlap data set as the second class of confidence candidate regions. The screening continues until the highest-confidence region of every N-th class has been selected; the selected regions are then merged to obtain the optimal candidate region of the heel. This improves the accuracy of candidate region selection and reduces the overlap between candidate regions. The (N-1)-th class of confidence candidate regions contains the N-th class.
For example, suppose the confidence candidate regions are A, B, C, D, E and F, with confidences ordered A<B<C<D<E<F. F, which has the highest confidence, is extracted first and recorded as the first confidence candidate region. The regions whose overlap with F does not exceed the threshold are selected from F's region overlap data set; assuming the overlaps of A with F, B with F and C with F are all no greater than the threshold, A, B and C are selected and recorded as the first class of confidence candidate regions. Because C has a higher confidence than A and B, C is recorded as the second confidence candidate region. C's region overlap data set now contains only the overlaps of C with A and of C with B; screening these two values yields A, whose overlap with C does not exceed the threshold. The optimal candidate region is then the merged region of F, C and A.
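The A to F example can be reproduced with a minimal sketch of greedy non-maximum suppression. The confidence values, the overlap values and the threshold of 0.5 are invented for illustration; only the ordering A<B<C<D<E<F and the pattern of which overlaps exceed the threshold follow the example. The final merging of the selected regions is not shown.

```python
def greedy_nms(scores, overlap, threshold):
    """Return the regions selected by greedy non-maximum suppression.

    scores:  dict mapping region name -> confidence
    overlap: function (a, b) -> overlap degree between two regions
    """
    remaining = sorted(scores, key=scores.get, reverse=True)
    kept = []
    while remaining:
        best = remaining.pop(0)          # highest remaining confidence
        kept.append(best)
        # keep only regions whose overlap with the pick is within the threshold
        remaining = [r for r in remaining if overlap(best, r) <= threshold]
    return kept


# Hypothetical confidences consistent with A < B < C < D < E < F.
scores = {"A": 0.1, "B": 0.2, "C": 0.3, "D": 0.4, "E": 0.5, "F": 0.6}

# Hypothetical overlaps: D and E overlap F heavily, B overlaps C heavily.
high_pairs = {frozenset("DF"), frozenset("EF"), frozenset("BC")}


def example_overlap(a, b):
    return 0.8 if frozenset((a, b)) in high_pairs else 0.1


selected = greedy_nms(scores, example_overlap, threshold=0.5)
```

With these assumed values the selection order is F, then C, then A, matching the regions that the example merges into the optimal candidate region.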
In step S350, bilinear interpolation extends linear interpolation to a function of two variables; its core idea is to perform one linear interpolation in each of the two directions. Four pixels at fixed positions are selected within the optimal candidate region, and bilinear interpolation is applied to each of them as follows: for each fixed-position pixel, the four adjacent heel pixels in the optimal candidate region are selected and interpolated linearly in the horizontal and vertical directions; that is, weights are determined from the distances between the fixed-position pixel and its four heel pixels, and the interpolated position of the fixed-position pixel is computed from them. In other words, the distances from the fixed-position pixel to its four nearest heel pixels serve as reference weights, and two linear interpolations yield the interpolated position of that pixel. The optimal candidate region is aligned according to the interpolated positions of the four fixed-position pixels to obtain a normalized candidate image, which improves the accuracy of heel image recognition.
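The two-pass interpolation can be sketched as follows. The sketch interpolates a value at a fractional coordinate from its four neighbouring grid points, which is the usual way bilinear interpolation is applied when aligning a region; how the source derives an "interpolated position" from it is not spelled out, so this is an illustration of the interpolation itself rather than of the complete alignment step. The function name is hypothetical.

```python
import math


def bilinear_sample(grid, x, y):
    """Interpolate grid (indexed grid[row][col]) at column x and row y.

    x and y must lie inside the grid. Weights are the distances to the
    four surrounding grid points, applied horizontally and then vertically.
    """
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1 = min(x0 + 1, len(grid[0]) - 1)   # clamp at the right edge
    y1 = min(y0 + 1, len(grid) - 1)      # clamp at the bottom edge
    dx, dy = x - x0, y - y0

    # First pass: linear interpolation along the horizontal direction.
    top = grid[y0][x0] * (1 - dx) + grid[y0][x1] * dx
    bottom = grid[y1][x0] * (1 - dx) + grid[y1][x1] * dx

    # Second pass: linear interpolation along the vertical direction.
    return top * (1 - dy) + bottom * dy


grid = [[0, 10],
        [20, 30]]
centre = bilinear_sample(grid, 0.5, 0.5)
```

At the centre of the four points all weights are equal, so `centre` is the mean of the four values.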
Further, another embodiment of the present invention provides a heel model identification method in which performing pixel-level recognition on the candidate image through the output network to obtain the heel height and heel shape of the candidate image comprises the following steps:
Step S410: performing pixel-level recognition on the candidate image with a segmentation network to obtain the heel pixels in the candidate image;
Step S420: obtaining the actual heel height from the heel pixels and the camera focal length;
Step S430: classifying the heel pixels with a classification network to obtain the heel shape of the candidate image.
In this embodiment, many kinds of segmentation network can be used in step S410; common ones include FCN, UNet, SegNet and DeepLab. The segmentation network recognizes and classifies the candidate image at the pixel level, labelling each pixel as a heel pixel or a non-heel pixel; the pixels of the candidate image are then screened to obtain the heel pixels.
In step S420, the heel height in the candidate image can be computed from the positions of the heel pixels, and the actual heel height can then be computed by the following formula:
Figure PCTCN2020112536-appb-000007
where f is the camera focal length, h is the heel height in the candidate image, D is the camera distance, and H is the actual heel height.
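The formula itself is an image that is not reproduced in this text. As a hedged illustration only, the sketch below uses the standard pinhole-camera relation of similar triangles, H / D = h / f, which is consistent with the four symbols defined above but is an assumption rather than a transcription of the original formula. It also assumes that h and f are expressed in the same unit (for example pixels), so that H comes out in the unit of D. The function names and the mask representation are hypothetical.

```python
def heel_pixel_height(mask):
    """Vertical extent, in pixels, of the heel in a binary mask.

    mask is a list of rows; 1 marks a heel pixel, 0 a non-heel pixel.
    """
    rows = [r for r, row in enumerate(mask) if any(row)]
    return rows[-1] - rows[0] + 1 if rows else 0


def actual_heel_height(h_image, distance, focal_length):
    """Pinhole relation H = h * D / f (assumed, see the lead-in)."""
    if focal_length <= 0:
        raise ValueError("focal length must be positive")
    return h_image * distance / focal_length


mask = [
    [0, 0],
    [0, 1],
    [1, 1],
    [0, 0],
]
h = heel_pixel_height(mask)                        # heel spans rows 1 to 2
H = actual_heel_height(200, distance=20.0, focal_length=500.0)
```

With a heel 200 pixels tall, a camera distance of 20 cm and a focal length of 500 pixels, the relation gives an actual height of 8 cm.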
Many kinds of classification network can be used in step S430; common ones include LeNet-5, AlexNet, ZFNet, VGGNet, GoogLeNet and ResNet. A classification network extracts features mainly through convolution, parameter sharing and pooling, and classifies them with a fully connected neural network, which greatly reduces the amount of computation. Here the classification network classifies the heel pixels to obtain the shape formed by all pixels of the same heel, i.e. the heel shape of the candidate image.
Further, another embodiment of the present invention provides a heel model identification method in which identifying the heel height and the heel shape through the heel database to obtain the model of the heel comprises the following steps:
Step S510: inputting the heel height and the heel shape into the heel database;
Step S520: obtaining the heel overlap degree between the heel height and heel shape and the data in the heel database, and screening and sorting the heel overlap degrees to obtain several heel models ordered by heel overlap degree.
In this embodiment, the heel database used in step S520 stores a large amount of heel information; for each heel, the information includes the real heel height and the real heel shape. The heel overlap degree between the computed heel height and heel shape and the real heel heights and real heel shapes in the database is calculated, the database entries are sorted by heel overlap degree, and the several heel models with the largest overlap are selected and output in order of overlap value. The user thus obtains the heel models most similar to the heel in the image, which improves the accuracy of heel model identification and greatly reduces the merchant's workload. The several heel models with the largest overlap may be set to the ten heel models with the largest heel overlap degree.
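The ranking in step S520 can be sketched as a top-k selection. The source does not define how the heel overlap degree is computed, so the measure below is purely an assumption: a shape is represented as a set of (row, column) pixels on a common grid, shape overlap is the ratio of shared pixels to all pixels, height agreement is the ratio of the smaller to the larger height, and the two are multiplied. The record layout and function names are likewise hypothetical.

```python
import heapq


def heel_overlap(query, entry):
    """Assumed overlap measure: shape IoU multiplied by a height ratio."""
    union = query["shape"] | entry["shape"]
    shape_iou = len(query["shape"] & entry["shape"]) / len(union) if union else 0.0
    larger = max(query["height"], entry["height"])
    height_ratio = min(query["height"], entry["height"]) / larger if larger > 0 else 0.0
    return shape_iou * height_ratio


def top_models(query, database, k=10):
    """Return up to k (overlap, model) pairs, largest overlap first."""
    scored = ((heel_overlap(query, e), e["model"]) for e in database)
    return heapq.nlargest(k, scored)


column = {(0, 0), (1, 0), (2, 0)}
query = {"height": 8.0, "shape": column}
database = [
    {"model": "M1", "height": 8.0, "shape": column},
    {"model": "M2", "height": 4.0, "shape": column},
    {"model": "M3", "height": 8.0, "shape": {(5, 5)}},
]
ranked = top_models(query, database, k=2)
```

The default `k=10` mirrors the ten heel models mentioned in the embodiment; `ranked` here holds the two best matches, M1 followed by M2.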
In addition, referring to Fig. 1, another embodiment of the present invention provides a heel model identification method comprising the following steps:
Step S111: obtaining the camera distance of the heel, and if the camera distance is not within the camera distance range, returning camera distance error information, the camera distance range being 10 cm to 30 cm;
Step S112: obtaining the camera brightness of the heel, and if the camera brightness is not within the camera brightness range, returning camera brightness error information, the camera brightness range being a superimposed brightness value of the red, green and blue color channels of not less than 0.4;
Step S113: obtaining the camera focal length of the heel, and a heel image of the side of the heel at a horizontal angle;
Step S120: sharpening the heel image with high-pass filtering to obtain a sharpened image;
Step S130: performing highlight denoising on the sharpened image with a bilateral filter to obtain a heel chromaticity diagram;
Step S200: performing feature extraction on the heel chromaticity diagram with a feature extraction network to obtain a feature output map;
Step S310: performing heel recognition on the feature output map with the region proposal network to obtain several candidate regions;
Step S320: obtaining the confidence of each candidate region with a classifier, and screening the candidate regions according to the confidence to obtain confidence candidate regions;
Step S330: obtaining the region overlap degree between the confidence candidate regions to obtain a region overlap data set for each confidence candidate region;
Step S340: processing the region overlap data sets with non-maximum suppression to obtain the optimal candidate region;
Step S350: aligning the optimal candidate region by bilinear interpolation to obtain the candidate image;
Step S410: performing pixel-level recognition on the candidate image with a segmentation network to obtain the heel pixels in the candidate image;
Step S420: obtaining the actual heel height from the heel pixels and the camera focal length;
Step S430: classifying the heel pixels with a classification network to obtain the heel shape of the candidate image;
Step S510: inputting the heel height and the heel shape into the heel database;
Step S520: obtaining the heel overlap degree between the heel height and heel shape and the data in the heel database, and screening and sorting the heel overlap degrees to obtain several heel models ordered by heel overlap degree.
In this embodiment, the heel image is acquired and preprocessed, which enhances its clarity and resolution. Feature extraction on the heel chromaticity diagram with the feature extraction network speeds up feature extraction and makes the features more discriminative. The region proposal network recognizes, screens and classifies the feature output map to obtain a normalized candidate image, which improves the accuracy of candidate region selection and reduces the overlap between candidate regions. Pixel-level recognition of the candidate image through the output network improves the accuracy with which the heel height and shape are obtained. Storing heel information in the heel database and using it to identify the height and shape of the heel in the candidate image improves the speed and efficiency of heel identification.
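The capture checks of steps S111 and S112 can be sketched as a simple validation routine. The thresholds (10 cm to 30 cm, and a superimposed brightness of at least 0.4) come from the steps above. The sketch assumes that each channel brightness is normalized to the range 0 to 1 and that "superimposed" means the sum of the three channels; neither is stated in the source. The function name and error strings are hypothetical.

```python
MIN_DISTANCE_CM = 10.0     # lower bound of the camera distance range (S111)
MAX_DISTANCE_CM = 30.0     # upper bound of the camera distance range (S111)
MIN_BRIGHTNESS_SUM = 0.4   # minimum superimposed RGB brightness (S112)


def check_capture(distance_cm, rgb_brightness):
    """Return a list of error messages; an empty list means capture may proceed.

    rgb_brightness is a (red, green, blue) tuple, each value assumed
    to be normalized to the range 0..1.
    """
    errors = []
    if not MIN_DISTANCE_CM <= distance_cm <= MAX_DISTANCE_CM:
        errors.append("camera distance out of range")
    if sum(rgb_brightness) < MIN_BRIGHTNESS_SUM:
        errors.append("camera brightness out of range")
    return errors


ok = check_capture(20.0, (0.2, 0.2, 0.2))
too_close_and_dark = check_capture(5.0, (0.1, 0.1, 0.1))
```

Only when the returned list is empty would the flow continue to step S113 and acquire the focal length and the side-view heel image.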
In addition, another embodiment of the present invention provides a heel model identification device comprising at least one control processor and a memory communicatively connected to the at least one control processor. The memory stores instructions executable by the at least one control processor, and the instructions are executed by the at least one control processor so that the at least one control processor can perform the heel model identification method of any of the embodiments above.
In this embodiment, the identification device includes one or more control processors and a memory, which may be connected by a bus or in other ways.
As a non-transitory computer-readable storage medium, the memory can store non-transitory software programs, non-transitory computer-executable programs and modules, such as the program instructions/modules corresponding to the identification method in the embodiments of the present invention. By running the non-transitory software programs, instructions and modules stored in the memory, the control processor executes the various functional applications and data processing of the identification device, that is, it implements the identification method of the method embodiments above.
The memory may include a program storage area and a data storage area. The program storage area may store an operating system and the application programs required by at least one function; the data storage area may store data created through use of the identification device. The memory may include high-speed random access memory and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device or other non-transitory solid-state storage device. In some implementations, the memory optionally includes memory located remotely from the control processor, and such remote memory may be connected to the identification device over a network. Examples of such networks include, but are not limited to, the Internet, corporate intranets, local area networks, mobile communication networks and combinations thereof.
The one or more modules are stored in the memory and, when executed by the one or more control processors, perform the identification method of the method embodiments above, for example the functions of method steps S100 to S500, S110 to S130, S111 to S113, S310 to S350, S410 to S430, and S510 to S520 described above.
An embodiment of the present invention also provides a computer-readable storage medium storing computer-executable instructions. When executed by one or more control processors, for example by a single control processor, the instructions cause the one or more control processors to perform the identification method of the method embodiments above, for example the functions of method steps S100 to S500, S110 to S130, S111 to S113, S310 to S350, S410 to S430, and S510 to S520 described above.
The device embodiments described above are merely illustrative. Units described as separate components may or may not be physically separate; they may be located in one place or distributed over several network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
From the description of the implementations above, those skilled in the art will clearly understand that each implementation can be realized by software together with a general-purpose hardware platform. Those skilled in the art will also understand that all or part of the processes of the method embodiments above can be carried out by a computer program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, may include the processes of the method embodiments above. The storage medium may be a magnetic disk, an optical disc, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), or the like.
The preferred implementations of the present invention have been described in detail above, but the present invention is not limited to these implementations. Those skilled in the art can make various equivalent modifications or substitutions without departing from the spirit of the present invention, and all such equivalent modifications or substitutions fall within the scope defined by the claims of this application.

Claims (9)

  1. A heel model identification method, characterized by comprising the following steps:
    acquiring a heel image and preprocessing the heel image to obtain a heel chromaticity diagram;
    performing feature extraction on the heel chromaticity diagram with a feature extraction network to obtain a feature output map;
    processing the feature output map with a region proposal network to obtain a candidate image of the heel;
    performing pixel-level recognition on the candidate image through an output network to obtain the heel height and heel shape of the candidate image;
    identifying the heel height and the heel shape through a heel database to obtain the model of the heel.
  2. The heel model identification method according to claim 1, characterized in that acquiring a heel image and preprocessing the heel image to obtain a heel chromaticity diagram comprises the following steps:
    acquiring, within the camera distance range and the camera brightness range, a heel image of the side of the heel at a horizontal angle;
    sharpening the heel image with high-pass filtering to obtain a sharpened image;
    performing highlight denoising on the sharpened image with a bilateral filter to obtain the heel chromaticity diagram.
  3. The heel model identification method according to claim 2, characterized in that acquiring, within the camera distance range and the camera brightness range, a heel image of the side of the heel at a horizontal angle comprises the following steps:
    obtaining the camera distance of the heel, and if the camera distance is not within the camera distance range, returning camera distance error information, the camera distance range being 10 cm to 30 cm;
    obtaining the camera brightness of the heel, and if the camera brightness is not within the camera brightness range, returning camera brightness error information, the camera brightness range being a superimposed brightness value of the red, green and blue color channels of not less than 0.4;
    obtaining the camera focal length of the heel, and the heel image of the side of the heel at a horizontal angle.
  4. The heel model identification method according to claim 1, characterized in that the feature extraction network comprises a residual network and a feature pyramid network; the residual network comprises several residual blocks, the feature pyramid network comprises several feature pyramid network layers, and a feature pyramid network layer is connected after a residual block.
  5. The heel model identification method according to claim 1, characterized in that processing the feature output map with the region proposal network to obtain a candidate image of the heel comprises the following steps:
    performing heel recognition on the feature output map with the region proposal network to obtain several candidate regions;
    obtaining the confidence of each candidate region with a classifier, and screening the candidate regions according to the confidence to obtain confidence candidate regions;
    obtaining the region overlap degree between the confidence candidate regions to obtain a region overlap data set for each confidence candidate region;
    processing the region overlap data sets with non-maximum suppression to obtain the optimal candidate region;
    aligning the optimal candidate region by bilinear interpolation to obtain the candidate image.
  6. The heel model identification method according to claim 3, characterized in that performing pixel-level recognition on the candidate image through the output network to obtain the heel height and heel shape of the candidate image comprises the following steps:
    performing pixel-level recognition on the candidate image with a segmentation network to obtain the heel pixels in the candidate image;
    obtaining the actual heel height from the heel pixels and the camera focal length;
    classifying the heel pixels with a classification network to obtain the heel shape of the candidate image.
  7. The heel model identification method according to claim 1, characterized in that identifying the heel height and the heel shape through the heel database to obtain the model of the heel comprises the following steps:
    inputting the heel height and the heel shape into the heel database;
    obtaining the heel overlap degree between the heel height and heel shape and the data in the heel database, and screening and sorting the heel overlap degrees to obtain several heel models ordered by heel overlap degree.
  8. A heel model identification device, characterized by comprising at least one control processor and a memory communicatively connected to the at least one control processor; the memory stores instructions executable by the at least one control processor, and the instructions are executed by the at least one control processor so that the at least one control processor can perform the heel model identification method according to any one of claims 1-7.
  9. A computer-readable storage medium, characterized in that the computer-readable storage medium stores computer-executable instructions for causing a computer to perform the heel model identification method according to any one of claims 1-7.
PCT/CN2020/112536 2019-09-29 2020-08-31 Heel type identification method, device, and storage medium WO2021057395A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910930378.4 2019-09-29
CN201910930378.4A CN110705634B (en) 2019-09-29 2019-09-29 Heel model identification method and device and storage medium

Publications (1)

Publication Number Publication Date
WO2021057395A1 true WO2021057395A1 (en) 2021-04-01

Family

ID=69197862

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/112536 WO2021057395A1 (en) 2019-09-29 2020-08-31 Heel type identification method, device, and storage medium

Country Status (2)

Country Link
CN (1) CN110705634B (en)
WO (1) WO2021057395A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113221795A (en) * 2021-05-24 2021-08-06 大连恒锐科技股份有限公司 Feature extraction, fusion and comparison method and device for shoe sample retrieval in video

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110705634B (en) * 2019-09-29 2023-02-28 五邑大学 Heel model identification method and device and storage medium
CN113496221B (en) * 2021-09-08 2022-02-01 湖南大学 Point supervision remote sensing image semantic segmentation method and system based on depth bilateral filtering
CN114619663A (en) * 2022-03-22 2022-06-14 芜湖风雪橡胶有限公司 Rubber shoe sole antiskid line pressing forming equipment and pressing method

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104680163A (en) * 2015-02-10 2015-06-03 柳州市金旭节能科技有限公司 Licence plate recognition system
CN106875381A (en) * 2017-01-17 2017-06-20 同济大学 A kind of phone housing defect inspection method based on deep learning
US20180307940A1 (en) * 2016-01-13 2018-10-25 Peking University Shenzhen Graduate School A method and a device for image matching
CN109902749A (en) * 2019-03-04 2019-06-18 沈阳建筑大学 A kind of print recognition methods of shoes and system
CN109916308A (en) * 2019-01-14 2019-06-21 佛山市南海区广工大数控装备协同创新研究院 A kind of information collecting method and its system of sole
CN110705634A (en) * 2019-09-29 2020-01-17 五邑大学 Heel model identification method and device and storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102324020B (en) * 2011-09-02 2014-06-11 北京新媒传信科技有限公司 Method and device for identifying human skin color region
CN107346413A (en) * 2017-05-16 2017-11-14 北京建筑大学 Traffic sign recognition method and system in a kind of streetscape image

Also Published As

Publication number Publication date
CN110705634A (en) 2020-01-17
CN110705634B (en) 2023-02-28

Similar Documents

Publication Publication Date Title
WO2021057395A1 (en) Heel type identification method, device, and storage medium
EP3455782B1 (en) System and method for detecting plant diseases
CN107862698B (en) Light field foreground segmentation method and device based on K mean cluster
US9483835B2 (en) Depth value restoration method and system
WO2018107939A1 (en) Edge completeness-based optimal identification method for image segmentation
CN108765465B (en) Unsupervised SAR image change detection method
CN113344849B (en) Microemulsion head detection system based on YOLOv5
CN110309781B (en) House damage remote sensing identification method based on multi-scale spectrum texture self-adaptive fusion
CN106846339A (en) A kind of image detecting method and device
TW200834459A (en) Video object segmentation method applied for rainy situations
WO2019071976A1 (en) Panoramic image saliency detection method based on regional growth and eye movement model
CN107944403B (en) Method and device for detecting pedestrian attribute in image
CN107633491A (en) A kind of area image Enhancement Method and storage medium based on target detection
CN109685045A (en) A kind of Moving Targets Based on Video Streams tracking and system
CN114118144A (en) Anti-interference accurate aerial remote sensing image shadow detection method
CN103839267A (en) Building extracting method based on morphological building indexes
US20170178341A1 (en) Single Parameter Segmentation of Images
Trivedi et al. Automatic segmentation of plant leaves disease using min-max hue histogram and k-mean clustering
CN110310291A (en) A kind of rice blast hierarchy system and its method
CN111462027A (en) Multi-focus image fusion method based on multi-scale gradient and matting
CN106599891A (en) Remote sensing image region-of-interest rapid extraction method based on scale phase spectrum saliency
Wang et al. An efficient method for image dehazing
CN114399480A (en) Method and device for detecting severity of vegetable leaf disease
JP6334281B2 (en) Forest phase analysis apparatus, forest phase analysis method and program
JP6218678B2 (en) Forest phase analysis apparatus, forest phase analysis method and program

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application. Ref document number: 20867341; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase. Ref country code: DE
122 EP: PCT application non-entry in European phase. Ref document number: 20867341; Country of ref document: EP; Kind code of ref document: A1