WO2020168814A1 - Method, apparatus, device and storage medium for clothing identification, classification and retrieval - Google Patents

Method, apparatus, device and storage medium for clothing identification, classification and retrieval

Info

Publication number
WO2020168814A1
WO2020168814A1 · PCT/CN2019/127660 · CN2019127660W
Authority
WO
WIPO (PCT)
Prior art keywords
target image
clothing
key feature
image
feature points
Prior art date
Application number
PCT/CN2019/127660
Other languages
English (en)
French (fr)
Inventor
谢宏斌
Original Assignee
Beijing Jingdong Shangke Information Technology Co., Ltd.
Beijing Jingdong Century Trading Co., Ltd.
Priority date
Filing date
Publication date
Application filed by Beijing Jingdong Shangke Information Technology Co., Ltd. and Beijing Jingdong Century Trading Co., Ltd.
Priority to US17/295,337 (US11977604B2)
Priority to EP19915612.6A (EP3876110A4)
Publication of WO2020168814A1

Classifications

    • G06Q30/0282 Rating or review of business operators or products
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G06F18/211 Selection of the most significant subset of features
    • G06N3/045 Combinations of networks
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G06N7/01 Probabilistic graphical models, e.g. probabilistic networks
    • G06V10/40 Extraction of image or video features
    • Y02P90/30 Computing systems specially adapted for manufacturing

Definitions

  • The present invention is filed on the basis of a Chinese patent application with application number 201910123577.4 and a filing date of February 18, 2019, and claims the priority of that Chinese patent application, the entire content of which is hereby incorporated into the present invention by reference.
  • the present invention relates to the field of computer vision technology, and in particular to a method, device, equipment and storage medium for clothing identification, classification and retrieval.
  • Clothing recognition is one of the most important and challenging problems in the field of image retrieval. On today's Internet, much of the content users search for relates to online shopping for clothing. Clothing identification is therefore a key problem in same-style search, style identification and outfit recommendation.
  • the main purpose of the present invention is to provide a method, device, equipment and storage medium for clothing identification, classification and retrieval, which can identify clothing more accurately.
  • an embodiment of the present invention provides a clothing identification method, the method including:
  • the heat map set is processed based on the shape constraint conditions corresponding to the target image, and the position probability information of the key feature points contained in the target image is determined.
  • an embodiment of the present invention provides a clothing classification method implemented by using the clothing identification method according to any embodiment of the present invention, including:
  • the category attribute includes one of the following: shape, version and style;
  • the corresponding apparel category is determined based on the category attribute.
  • an embodiment of the present invention provides a clothing retrieval method implemented by the clothing identification method according to any embodiment of the present invention, including:
  • the category attribute includes one of the following: shape, version and style;
  • determining the corresponding retrieval element based on the category attribute; wherein the retrieval element includes retrieval keywords and/or images;
  • a clothing image collection corresponding to the target image is retrieved based on the retrieval element.
  • an embodiment of the present invention provides a clothing identification device, including:
  • the first acquisition subsystem is configured to acquire a target image containing clothing to be identified
  • the first image processing subsystem is configured to determine, based on the target image, a heat map set corresponding to the key feature points contained in the target image, the heat map set including a position probability heat map corresponding to each key feature point contained in the target image, and to process the heat map set based on the shape constraint conditions corresponding to the target image to determine the position probability information of the key feature points contained in the target image.
  • an embodiment of the present invention provides a clothing classification device, including:
  • the second acquisition subsystem is configured to acquire a target image containing clothing to be identified
  • the second image processing subsystem is configured to determine, based on the target image, a heat map set corresponding to the key feature points contained in the target image, the heat map set including a position probability heat map corresponding to each key feature point contained in the target image, and to process the heat map set based on the shape constraint conditions corresponding to the target image to determine the position probability information of the key feature points contained in the target image;
  • the classification subsystem is configured to determine the category attribute of the clothing based on the position probability information of the key feature points contained in the target image, the category attribute including one of the following: shape, version and style; and to determine the corresponding clothing category based on the category attribute.
  • an embodiment of the present invention provides a clothing retrieval device, including:
  • the third acquisition subsystem is configured to acquire a target image containing the clothing to be identified
  • the third image processing subsystem is configured to determine, based on the target image, a heat map set corresponding to the key feature points contained in the target image, the heat map set including a position probability heat map corresponding to each key feature point contained in the target image, and to process the heat map set based on the shape constraint conditions corresponding to the target image to determine the position probability information of the key feature points contained in the target image;
  • the retrieval element determination subsystem is configured to determine the category attribute of the clothing based on the position probability information of the key feature points contained in the target image, the category attribute including one of the following: shape, version and style; and to determine the corresponding retrieval element based on the category attribute, wherein the retrieval element includes retrieval keywords and/or images;
  • the retrieval subsystem is configured to retrieve a clothing image set corresponding to the target image based on the retrieval element.
  • an embodiment of the present invention provides a computer device, including: a processor and a memory configured to store a computer program that can run on the processor;
  • the processor is configured to execute the clothing identification method provided by any embodiment of the present invention, or the provided clothing classification method, or the provided clothing retrieval method when running the computer program.
  • an embodiment of the present invention provides a computer storage medium in which a computer program is stored; when the computer program is executed by a processor, it implements the clothing identification method, the clothing classification method, or the clothing retrieval method provided in any embodiment of the present invention.
  • The clothing identification, classification and retrieval method, apparatus, device and storage medium provided by the embodiments of the present invention acquire a target image containing the clothing to be recognized and determine, based on the target image, a heat map set corresponding to the key feature points contained in the target image, the heat map set including a position probability heat map corresponding to each key feature point contained in the target image. In this way, the initial position information of each key feature point of the clothing to be recognized in the target image is obtained as an initial position probability heat map per key feature point. The heat map set is then processed based on the shape constraint conditions corresponding to the target image, and the position probability information of the key feature points contained in the target image is determined.
  • Processing the heat map set based on the shape constraint conditions corresponding to the clothing parts optimizes the identification of the position probabilities of the key feature points contained in the clothing to be recognized. Based on the determined position probability information of these key feature points, accurate clothing identification is realized, and by acquiring the key feature points of the clothing, the method can be applied more widely in online shopping, smart dressing and clothing design.
  • FIG. 1 is a schematic flowchart of a clothing identification method according to an embodiment of the present invention
  • Figure 2 (a) is an example sample diagram of key feature points of clothing of the first type of clothing provided by an embodiment of the present invention
  • Figure 2(b) is an example sample diagram of key feature points of clothing of the second type of clothing provided by an embodiment of the present invention.
  • Figure 2(c) is a sample diagram of examples of key feature points of clothing of the third type of clothing provided by an embodiment of the present invention.
  • Figure 2(d) is a sample diagram of examples of key feature points of clothing of the fourth type of clothing provided by an embodiment of the present invention.
  • Figure 2(e) is a sample diagram of an example of key feature points of clothing of the fifth type of clothing provided by an embodiment of the present invention
  • FIG. 3 is a schematic structural diagram of a first neural network provided by an embodiment of the present invention.
  • FIG. 4 is a schematic diagram of a shape constraint condition set provided by an embodiment of the present invention.
  • FIG. 5 is a schematic diagram of a bidirectional recurrent convolutional neural network with shape constraint input provided by an embodiment of the present invention.
  • FIG. 6 is a schematic flowchart of a clothing classification method provided by an embodiment of the present invention.
  • FIG. 7 is a schematic flowchart of a clothing retrieval method provided by an embodiment of the present invention.
  • FIG. 8 is a schematic structural diagram of a clothing identification device provided by an embodiment of the present invention.
  • FIG. 9 is a schematic structural diagram of a clothing classification device provided by an embodiment of the present invention.
  • FIG. 10 is a schematic structural diagram of a clothing retrieval device provided by an embodiment of the present invention.
  • FIG. 11 is a schematic structural diagram of a computer device provided by an embodiment of the present invention.
  • Target image: in this document, this refers to the digitized image used for clothing key point detection, for example an image in JPEG or another digital format.
  • the loss function is also called the cost function, which is the objective function of neural network optimization.
  • A neural network is a complex network system formed by a large number of simple processing units (called neurons) that are widely interconnected. It reflects many basic characteristics of human brain function and is a highly complex nonlinear dynamic learning system.
  • an embodiment of the present invention provides a clothing identification method, which is executed by a clothing identification device, and the method includes the following steps:
  • Step S101: Obtain a target image containing the clothing to be recognized, and determine, based on the target image, a heat map set corresponding to the key feature points contained in the target image, the heat map set including a position probability heat map corresponding to each key feature point contained in the target image;
  • The target image is a photographed or drawn picture intended for clothing key point detection. Determining the heat map set corresponding to the key feature points contained in the target image based on the target image means extracting the corresponding features contained in the target image, including the heat map set corresponding to the key feature points; the heat map set includes a position probability heat map corresponding to each key feature point contained in the target image.
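As a sketch of what such a heat map set looks like, one heat map can be rendered per key feature point. The patent does not specify the kernel; a Gaussian peaked at each annotated point is a common construction and is assumed here:

```python
import numpy as np

def keypoint_heatmap(h, w, cx, cy, sigma=2.0):
    """Render one key feature point as a 2-D Gaussian position-probability
    heat map: the peak value 1.0 sits at the annotated point (cx, cy) and
    values decay with distance from it."""
    ys, xs = np.mgrid[0:h, 0:w]
    return np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2 * sigma ** 2))

def heatmap_set(h, w, keypoints, sigma=2.0):
    """Stack one heat map per key feature point -> (num_points, h, w)."""
    return np.stack([keypoint_heatmap(h, w, x, y, sigma) for x, y in keypoints])

# Three illustrative key feature points on a 28x28 grid.
hmaps = heatmap_set(28, 28, [(5, 5), (20, 10), (14, 25)])
```

Each channel of `hmaps` is then the position probability heat map of one key feature point.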
  • The current system mainly accepts five clothing types, and the key feature points included in each of the five types are determined respectively:
  • As shown in Figure 2(a), the first clothing type includes clothing key feature points 1 to 13, 13 key feature points in total.
  • As shown in Figure 2(b), the second clothing type includes clothing key feature points 1 to 12, 12 key feature points in total.
  • As shown in Figure 2(c), the third clothing type includes clothing key feature points 1 to 7, 7 key feature points in total.
  • As shown in Figure 2(d), the fourth clothing type includes clothing key feature points 1 to 4, 4 key feature points in total.
  • As shown in Figure 2(e), the fifth clothing type includes clothing key feature points 1 to 6, 6 key feature points in total.
  • Clothing key points are local positions that appear on most garments in the category to which the clothing belongs and that, in terms of function and structure, can be used to distinguish different styles within that clothing category. Each garment can have one or more key points.
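The per-category key feature point counts above can be captured in a small lookup table (category numbering follows Figures 2(a) to 2(e)):

```python
# Number of clothing key feature points per clothing type,
# taken from Figures 2(a)-(e) of the description.
KEYPOINTS_PER_CATEGORY = {
    1: 13,  # first type: points 1-13
    2: 12,  # second type: points 1-12
    3: 7,   # third type: points 1-7
    4: 4,   # fourth type: points 1-4
    5: 6,   # fifth type: points 1-6
}

def num_keypoints(category):
    """Look up how many key feature points a clothing type defines."""
    return KEYPOINTS_PER_CATEGORY[category]
```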
  • Step S102 Process the heat map set based on the shape constraint conditions corresponding to the target image, and determine the position probability information of the key feature points contained in the target image.
  • The heat map set is processed using the shape constraint conditions corresponding to the target image: the shape constraint conditions are used to optimize the heat map set, and the position probability information of the key feature points contained in the target image is determined.
  • the shape constraint condition may be a constraint condition corresponding to a part of the clothing, and is used to characterize the key feature of the clothing part.
  • A clothing recognition method is provided: a target image containing the clothing to be recognized is acquired, and a heat map set corresponding to the key feature points contained in the target image is determined based on the target image. The heat map set includes a position probability heat map corresponding to each key feature point contained in the target image; in this way, the initial position information of each key feature point of the clothing to be recognized is obtained as an initial position probability heat map per key feature point. The heat map set is processed based on the shape constraint conditions corresponding to the target image to determine the position probability information of the key feature points contained in the target image.
  • Processing the heat map set based on the shape constraint conditions corresponding to the clothing parts optimizes the identification of the position probabilities of the key feature points contained in the clothing to be recognized, and accurate clothing identification is realized based on the determined position probability information. By obtaining the key feature points of the clothing, the method can be applied more widely in online shopping, smart dressing and clothing design.
  • determining the heat map set corresponding to the key feature points contained in the target image based on the target image includes:
  • the trained second neural network processes the heat map set based on the shape constraint conditions corresponding to the target image.
  • Processing the target image through the trained first neural network means inputting the target image into the trained first neural network and extracting, through the first neural network, the corresponding features contained in the target image, including the heat map set corresponding to the key feature points; the heat map set includes the position probability heat map corresponding to each key feature point contained in the target image.
  • The trained first neural network is used to determine the heat map set corresponding to the key feature points contained in the target image, that is, the initial position information of each key feature point of the target image; specifically, the position probability heat map corresponding to each key feature point is obtained.
  • The heat map set and the shape constraint conditions corresponding to the target image are input into the trained second neural network; the heat map set is optimized through the shape constraints, and the position probability information of the key feature points contained in the target image is determined.
  • the shape constraint condition may be a constraint condition corresponding to a part of the clothing, and is used to characterize the key feature of the clothing part.
  • The target image containing the clothing to be recognized is acquired, and the target image is processed through the trained first neural network to determine the heat map set corresponding to the key feature points contained in the target image. The heat map set includes the position probability heat map corresponding to each key feature point contained in the target image; in this way, the initial position information of each key feature point of the clothing to be recognized is obtained as an initial position probability heat map per key feature point. The trained second neural network then processes the heat map set based on the shape constraint conditions corresponding to the target image and determines the position probability information of the key feature points contained in the target image.
  • Processing the heat map set based on the shape constraint conditions corresponding to the clothing parts optimizes the identification of the position probabilities of the key feature points contained in the clothing to be recognized, and accurate clothing identification is realized based on the determined position probability information. By obtaining the key feature points of the clothing, the method can be applied more widely in online shopping, smart dressing and clothing design.
  • Before acquiring the target image containing the clothing to be identified, the method includes:
  • the initial convolutional neural network is iteratively trained based on the image training set until the loss function meets the convergence condition, and the first neural network after training is obtained.
  • Obtaining an image training set containing multiple clothing training images: new images can be used as sample images to construct a multi-batch training set.
  • the training image can be collected from an image library currently published on the Internet.
  • The training images are labeled according to a predetermined labeling method, so that the local regions of the original images are clearly annotated.
  • the loss function is also called the cost function, which is the objective function of neural network optimization.
  • The process of training or optimizing a neural network is the process of minimizing the loss function: the smaller the loss function value, the closer the corresponding prediction result is to the true result.
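For example, with a mean-squared-error loss (one common choice; the patent does not fix the exact form of the loss function), the cost shrinks as predictions approach the labels:

```python
import numpy as np

def mse_loss(pred, target):
    """Mean-squared-error cost: the objective being minimized during training.
    Smaller values mean predictions closer to the labeled (true) results."""
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    return float(np.mean((pred - target) ** 2))
```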
  • The initial neural network model may be built from a neural network model pre-trained on an image data set such as ImageNet, for example convolutional neural network models such as Inception V1, V2, V3, V4 or DenseNet; of course, it may also be any neural network model pre-trained on another image data set. The initial neural network model is constructed using the parameters of the pre-trained neural network model.
  • the first neural network may include a data preprocessing module 21, a feature extraction module 22, a region detection module 23, a target detection module 24, and a key point positioning module 25;
  • The feature extraction module 22 is configured to extract and output image feature maps from the target image data. The feature extraction network is initialized with ImageNet network parameters. Feature maps are extracted from the outputs of the first convolutional layer, the second unit of the first residual block, the third unit of the second residual block, the fifth unit of the third residual block, and the last unit of the fourth residual block; each is in turn upsampled and channel-transformed, added element-wise to the corresponding positions of the preceding feature map, and passed through a 3x3 convolution to eliminate superposition artifacts. This constructs a multi-scale feature pyramid with scales of 2, 4, 8, 16 and 32 times. Multi-scale features help to detect objects of different scales, making the overall detection method more robust.
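A minimal sketch of one top-down pyramid step, with nearest-neighbour upsampling standing in for the learned upsampling, and channel counts assumed to have already been matched by the channel transformation:

```python
import numpy as np

def upsample2x(fmap):
    """Nearest-neighbour 2x spatial upsampling of a (c, h, w) feature map."""
    return fmap.repeat(2, axis=1).repeat(2, axis=2)

def merge_pyramid(coarse, fine):
    """One top-down feature-pyramid step: upsample the coarser-scale map and
    add it element-wise to the finer-scale map (a 3x3 convolution would then
    smooth superposition artifacts in the real network)."""
    return upsample2x(coarse) + fine

coarse = np.ones((8, 4, 4))   # e.g. stride-16 pyramid level
fine = np.ones((8, 8, 8))     # e.g. stride-8 pyramid level
merged = merge_pyramid(coarse, fine)
```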
  • The region detection module 23 is configured to process the feature map at each scale as follows. First, a 3x3 convolution adjusts the features; then two fully connected branches are attached, one branch used to predict the object position and the other to predict the object probability. To further increase prediction robustness, several reference boxes with different aspect ratios and scales are introduced for each pixel, and each reference box serves as a basis for decision-making; this refines the decision granularity, and dense reference boxes draw on broader collective evidence, reducing the instability of predictions. During training, the object bounding box with the largest overlap-area ratio with each reference box is found first and used as the supported object of that reference box.
  • Each reference box supports, or votes for, only one object; if its overlap-area ratio with the supported object is greater than the specified threshold it is a positive sample, otherwise it is treated as a negative sample (a reference box with an overlap-area ratio greater than zero can distinguish an object boundary better than one with no overlap). To increase density, the reference box with the largest overlap-area ratio with each object bounding box is further designated a positive sample. During training, in each iteration, some reference boxes are filtered out to reduce the amount of computation:
  • reference boxes whose predicted object bounding boxes have a small area and a low score are removed; the non-maximum suppression method then eliminates further reference boxes; next, some bounding boxes are filtered out again according to the threshold on the overlap-area ratio between the predicted and the labeled object bounding boxes; finally, the bounding boxes located in the positive and negative sample sets are selected to participate in the computation of the final loss function.
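The overlap-area ratio and the positive/negative sample assignment described above can be sketched as follows (the 0.5 threshold is illustrative, not a value fixed by the patent):

```python
import numpy as np

def iou(box_a, box_b):
    """Overlap-area ratio (intersection over union) of two [x1, y1, x2, y2] boxes."""
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / float(area_a + area_b - inter)

def label_anchor(anchor, gt_boxes, pos_thresh=0.5):
    """Find the ground-truth box with the largest overlap-area ratio (the
    anchor's supported object); the anchor is a positive sample when that
    ratio exceeds the threshold, otherwise negative."""
    ious = [iou(anchor, gt) for gt in gt_boxes]
    best = int(np.argmax(ious))
    return best, ious[best] >= pos_thresh
```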
  • The object bounding box positions and object scores predicted by the reference boxes are filtered by non-maximum suppression, and the best object instances and object positions are selected.
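Non-maximum suppression, as used here, keeps the highest-scoring box and discards boxes that overlap it beyond a threshold, repeating on the remainder; a minimal sketch:

```python
import numpy as np

def _iou(a, b):
    """Overlap-area ratio between two [x1, y1, x2, y2] boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / float(union)

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy non-maximum suppression: keep the highest-scoring box, drop
    boxes overlapping it beyond the threshold, repeat on what remains."""
    order = list(np.argsort(scores)[::-1])
    keep = []
    while order:
        best = order.pop(0)
        keep.append(int(best))
        order = [i for i in order if _iou(boxes[best], boxes[i]) <= iou_thresh]
    return keep

boxes = [[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60]]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
```

Here the second box overlaps the first too strongly and is suppressed, while the distant third box survives.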
  • Each predicted object area is regarded as an area of interest, and the feature map of the area is scaled to a size of 14 ⁇ 14 through bilinear interpolation and output to the target detection module 24.
  • The target detection module 24 is configured to max-pool the input fixed-size 14x14 object feature map to a size of 7x7, expand the feature map into a one-dimensional vector, and then connect two fully connected layers for feature transformation; finally it is divided into two branches, each branch being a fully connected layer, used to refine object positions and classify objects.
  • As in the region detection module 23, the object bounding box is the smallest enclosing bounding box of all key points of the object class.
  • The key point positioning module 25 is configured to perform 4 consecutive 3x3 convolution feature transformations on the 14x14 object feature map output by the region detection module 23, outputting a new feature map of the same size, and then perform a deconvolution so that the feature map size is 28x28. Finally, channel transformation and sigmoid activation are applied so that the number of channels is 22, i.e. the number of key points, with each channel corresponding to one key point heat map.
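The final channel transformation plus sigmoid can be sketched as follows; a 1x1 linear channel mix stands in for the convolution, and the weights here are placeholders, not trained values:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def keypoint_head_output(features, weights):
    """Last step of the key point positioning module: a 1x1 channel
    transformation followed by sigmoid, so each of the 22 output channels
    is a per-pixel key point probability heat map.
    features: (c_in, 28, 28); weights: (22, c_in)."""
    logits = np.einsum('oc,chw->ohw', weights, features)
    return sigmoid(logits)

feat = np.zeros((64, 28, 28))   # placeholder 28x28 feature map, 64 channels
w = np.zeros((22, 64))          # placeholder channel-transform weights
heat = keypoint_head_output(feat, w)
```

With all-zero placeholders the logits are zero, so every heat map value is exactly sigmoid(0) = 0.5.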
  • The initial convolutional neural network is iteratively trained on the image training set until the loss function meets the convergence condition, yielding the trained first neural network. That is, the image training set is input into the initial convolutional neural network for iterative training: through forward propagation, the cost is computed using the label information and the loss function; the gradients of the loss function are then backpropagated to update the parameters in each layer and adjust the weights of the initial convolutional neural network, until its loss function satisfies the convergence condition and the trained first neural network is obtained.
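The forward-pass / cost / backpropagation / weight-update cycle can be illustrated on a toy one-weight linear model (a stand-in for the convolutional network, not the patent's architecture):

```python
import numpy as np

# Toy training data whose true relation is y = 2x.
x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.0, 6.0])

w = 0.0    # initial weight
lr = 0.1   # learning rate

for _ in range(200):
    pred = w * x                        # forward propagation
    loss = np.mean((pred - y) ** 2)     # cost from label info + loss function
    grad = np.mean(2 * (pred - y) * x)  # backpropagated gradient of the loss
    w -= lr * grad                      # parameter update
# The loop converges: w approaches 2 and the loss approaches 0.
```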
  • By acquiring an image training set containing training images of multiple garments and iteratively training the initial convolutional neural network on it, the trained first neural network for clothing recognition on the target image is constructed.
  • The network and the training method are simple, which alleviates the problems of few training samples and slow computation in clothing recognition.
  • Before the iterative training of the initial convolutional neural network based on the image training set, the method further includes:
  • the image training set also includes augmented images.
  • performing image augmentation on the original image to obtain the corresponding augmented image includes:
  • performing image augmentation on the original image to obtain the corresponding augmented image refers to increasing the amount of data without changing the original image category.
  • Many augmentations of the original image are possible: from a geometric point of view, horizontal translation, vertical translation and image rotation; from a pixel-transformation point of view, color disturbance.
  • the original image is augmented by different methods to obtain the augmented image, so that the sample of the image training set is expanded, and the amount of data is increased.
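Two of the listed augmentations, translation and colour disturbance, can be sketched as follows (zero-padding at the border and a simple intensity scale are illustrative choices; the patent does not specify them):

```python
import numpy as np

def translate(img, dx, dy):
    """Shift an (h, w, c) image by (dx, dy) pixels, zero-padding the border.
    The image category is unchanged; only the sample is varied."""
    out = np.zeros_like(img)
    h, w = img.shape[:2]
    xs = slice(max(dx, 0), w + min(dx, 0))
    ys = slice(max(dy, 0), h + min(dy, 0))
    xs_src = slice(max(-dx, 0), w + min(-dx, 0))
    ys_src = slice(max(-dy, 0), h + min(-dy, 0))
    out[ys, xs] = img[ys_src, xs_src]
    return out

def color_jitter(img, scale):
    """Pixel-level colour disturbance: scale intensities and clip to [0, 255]."""
    return np.clip(img.astype(float) * scale, 0, 255).astype(np.uint8)

img = np.arange(1, 17, dtype=np.uint8).reshape(4, 4, 1)
shifted = translate(img, 1, 0)   # shift one pixel to the right
brighter = color_jitter(img, 2.0)
```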
  • After the initial convolutional neural network is iteratively trained on the image training set and the trained first neural network is obtained, the method includes:
  • Obtaining a training sample set, the training sample set including the heat map sets, output by the first neural network, corresponding to the key feature points contained in the training images;
  • The initial bidirectional recurrent convolutional neural network is iteratively trained until the loss function meets the convergence condition, and the trained second neural network is obtained.
• the loss function, also called the cost function, is the objective function of neural network optimization.
• the process of neural network training or optimization is the process of minimizing the loss function: the smaller the value of the loss function, the closer the corresponding prediction result is to the real result.
• the shape constraint condition set is obtained by modeling the characteristic deformation structure of clothing key points; the node numbering is described in Table 1.
• the design of these shapes and structures conforms to the characteristics of clothing design and human body dynamics: it not only incorporates the advantages of the joint model of the skeletal system, but also models the local deformation constraints of clothing key points more fully.
  • the shape constraint condition set includes a shape constraint condition formed by using a triangular substructure and/or a quadrangular substructure to respectively represent the constraint relationship between a plurality of key feature points.
• the shape constraint condition set mostly uses triangular and/or quadrilateral substructures to represent the constraint relationships between multiple key feature points; these substructures have the characteristics of complete stability (the triangle) and incomplete stability (the quadrilateral), respectively.
• the combination of incomplete and complete stability gives the overall global structure great deformation flexibility and can fully model the globally loose constraint relationships; in addition, each shape constraint condition also has the characteristic of symmetry.
• another characteristic of the clothing structure is local-area symmetry between different shape constraint conditions, such as between the left sleeve and the right sleeve; this feature is weaker because the two areas are not directly connected but are linked through intermediate shape constraint conditions. Finally, different shape constraint condition sets can also model topological constraints unique to the human body, such as the topological relationships among the shoulders, chest and abdomen. Therefore, a single shape constraint condition can fully model a local deformation constraint, different shape constraint condition sets can implement the globally loose constraint relationships, and the overall design has the advantage of global optimization.
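A shape constraint condition set of this kind can be represented simply as tuples of key-point indices. The tuples below follow part of the example given later for key feature point 5; the actual node numbering is defined in the patent's Table 1 and Fig. 4, so these values are illustrative.

```python
# Triangular (3-node) and quadrangular (4-node) substructures over clothing
# key-point indices. Node numbers are illustrative, taken from the example
# constraints that contain key feature point 5.
SHAPE_CONSTRAINTS = [
    (5, 6, 13, 14),   # quadrangular substructure
    (1, 3, 5),        # triangular substructure
    (1, 2, 5, 6),
    (5, 6, 11, 12),
    (3, 5, 7, 9),
]

def constraints_containing(key_point):
    """Return every shape constraint condition whose node set includes
    the given key feature point, i.e. the constraints that apply when
    that key point is detected in the target image."""
    return [c for c in SHAPE_CONSTRAINTS if key_point in c]
```

Looking up the constraints that mention a detected key point is exactly the step described below for determining the shape constraint conditions corresponding to a target image.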
• the bidirectional recurrent neural network (RNN) is the initial neural network model. A bidirectional RNN can more easily capture the dependencies between sequences from the data, unlike a conditional random field, which requires manually designed dependency modes or compatibility functions between key points.
• the initial bidirectional recurrent convolutional neural network is iteratively trained until the loss function satisfies the convergence condition.
• the trained second neural network means that the training sample set and the shape constraint condition set are input into the bidirectional recurrent convolutional neural network for iterative training: through forward propagation, the label information and the loss function are used to calculate the cost, and the gradient of the loss function is backpropagated to update the parameters in each layer and adjust the weights of the network, until the loss function of the bidirectional recurrent convolutional neural network meets the convergence condition and the trained second neural network is obtained.
• the initial bidirectional recurrent convolutional neural network is iteratively trained based on the training sample set and the shape constraint condition set until the loss function meets the convergence condition, and the trained second neural network is obtained.
• the training method is simple and improves the accuracy and speed of clothing recognition.
• before the heat map set is processed by the trained second neural network based on the shape constraint conditions corresponding to the target image, the method further includes:
• determining the shape constraint conditions corresponding to the target image, which means determining the key feature points in the target image output by the first neural network and then finding, in the shape constraint condition set, the shape constraint conditions corresponding to those key feature points. For example, if the target image is determined to contain key feature point 5, it can be seen from Fig. 4 that the applicable shape constraint conditions include shape constraint condition 5-6-13-14, shape constraint condition 1-3-5, shape constraint condition 1-2-5-6, shape constraint condition 22-5-6, shape constraint condition 5-6-11-12 and shape constraint condition 3-5-7-9.
• the corresponding shape constraint conditions are determined according to the key feature points contained in the target image, which effectively captures the loosely constrained relationships between local areas of the clothing.
• the iterative training of the initial bidirectional recurrent convolutional neural network based on the training sample set and the shape constraint condition set includes:
• the bidirectional recurrent convolutional neural network inputs each shape constraint condition in the shape constraint condition set into the forward layer in the set order, and inputs each shape constraint condition into the backward layer in the reverse of the set order, for iterative training.
• the shape constraint condition set can be divided into multiple shape constraint condition groups, such as groups A, B, C, D, E and F, where each group includes multiple shape constraint conditions. A group whose shape constraint conditions share key feature points can be further subdivided into subgroups, so that no key feature point is repeated within a subgroup while symmetry requirements are still taken into account; for example, the shape constraint conditions 1-3-5 and 2-4-6 in group A can be placed in one subgroup of group A, and each of the remaining three shape constraint conditions forms its own subgroup.
• the bidirectional recurrent convolutional neural network inputs each shape constraint condition in the shape constraint condition set into the forward layer in the set order, and inputs each shape constraint condition into the backward layer in the reverse of the set order, for iterative training.
• inputting each shape constraint condition into the forward layer in the set order means inputting the groups into the forward layer in the order A-B-C-D-E-F; if a group contains subgroups, a subgroup is randomly selected for optimization, and after that subgroup's optimization is completed, the next group continues to be optimized. Note that a global shape-constraint split list needs to be maintained to prevent the same key points from being repeatedly selected as split nodes.
• inputting each shape constraint condition into the backward layer in the reverse of the set order means optimizing each group in the reverse order F-E-D-C-B-A; the message propagation iterates in this way for several rounds, so that the dependency constraints are finally propagated globally.
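The forward/backward group schedule described above (groups in order A-B-C-D-E-F for the forward layer, F-E-D-C-B-A for the backward layer, repeated for several rounds) can be sketched as follows; subgroup selection and the split list are omitted for brevity, so this shows only the visiting order.

```python
GROUPS = ["A", "B", "C", "D", "E", "F"]   # illustrative constraint groups

def message_passing_rounds(groups, rounds=2):
    """Build the visit sequence: the forward layer processes the groups in
    the set order, the backward layer in the reverse order, and the whole
    pass is repeated for several rounds so that the dependency constraints
    propagate globally."""
    schedule = []
    for _ in range(rounds):
        schedule.extend(("forward", g) for g in groups)             # A..F
        schedule.extend(("backward", g) for g in reversed(groups))  # F..A
    return schedule

schedule = message_passing_rounds(GROUPS)
```

Two rounds over six groups give 24 visits; in the actual training, each visit would run one optimization step over that group's shape constraint conditions.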
  • a new key feature point heat map constrained by the local shape constraint condition set is obtained.
  • the new key point heat map refers to the position probability information of the key feature points contained in the target image.
• the bidirectional recurrent convolutional neural network inputs the shape constraint conditions into the forward layer in the set order, and inputs the shape constraint conditions into the backward layer in the reverse order;
• carrying out the iterative training includes:
• the bidirectional recurrent convolutional neural network sets the connection order of the key feature points according to the constraint relationships between the key feature points contained in the corresponding shape constraint conditions;
• the position probability heat map corresponding to each key feature point is input into the forward layer in the connection order and, at the same time, into the backward layer in the reverse of the connection order, for iterative training.
• each shape constraint condition expresses the shape constraint characteristics between local key feature points and is a ring structure; an RNN cannot directly model this kind of loop dependency, so each shape constraint condition is disassembled here into two parts modeled by two RNNs separately. As shown in Figure 5, the bidirectional recurrent convolutional neural network inputs the shape constraint condition into the forward layer in the set connection order (5-6-14-13).
• for each shape constraint condition in the shape constraint condition set, a node is randomly selected as the starting node;
• from the starting node, the ring is decomposed into two chains, one clockwise and one counterclockwise, and each chain corresponds to a bidirectional RNN;
• the starting points selected across the overall shape constraint condition set should not be repeated; this prevents local nodes from learning insufficient constraints, which would cause global message propagation to fail.
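Decomposing a ring-shaped shape constraint condition into a clockwise and a counterclockwise chain from a chosen starting node can be sketched as follows, using the quadrilateral 5-6-14-13 from Figure 5 as the example; the random choice of the start node and the no-repetition bookkeeping across the constraint set are left out.

```python
def split_ring(constraint, start):
    """Decompose a ring-shaped shape constraint into a clockwise and a
    counterclockwise chain from the chosen start node, so that each chain
    can be modeled by its own bidirectional RNN (the ring itself cannot
    be modeled directly by an RNN)."""
    i = constraint.index(start)
    ring = constraint[i:] + constraint[:i]          # rotate so start is first
    clockwise = list(ring)
    counterclockwise = [ring[0]] + list(reversed(ring[1:]))
    return clockwise, counterclockwise

cw, ccw = split_ring((5, 6, 14, 13), start=5)
```

Both chains begin at the same starting node and traverse the ring in opposite directions, so together they cover every pairwise dependency of the original ring.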
• the trained second neural network processes the heat map set based on the shape constraint conditions corresponding to the target image to determine the position probability information of the key feature points contained in the target image, including:
• the second neural network adjusts the display parameters of the heat map set according to the shape constraint conditions corresponding to the target image;
• the display parameters of the heat map set are adjusted to obtain the corresponding target parameters, and the position probability information of the key feature points contained in the target image is obtained according to the target parameters.
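One simple reading of "position probability information" is the location and value of the maximum of each key feature point's position probability heat map; the patent does not fix this exact decoding, so the sketch below is an assumption.

```python
import numpy as np

def keypoint_from_heatmap(heatmap):
    """Read a key feature point's position probability information out of
    its heat map: the argmax gives the most probable location, and the
    value there gives the confidence at that location."""
    idx = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    return idx, float(heatmap[idx])

# Toy heat map with a single peak at row 2, column 5.
h = np.zeros((8, 8))
h[2, 5] = 0.9
pos, conf = keypoint_from_heatmap(h)
```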
• the heat map set includes the position probability heat map corresponding to each key feature point contained in the target image. Taking an RNN as the second neural network as an example, each R node is a bidirectional RNN node: it receives a forward message from the preceding node and a backward message from the succeeding node, combines them with the node likelihood term x_i, and jointly outputs a new confidence y_i; the forward-pass and backward-pass RNN dependency terms are offset and aligned in value range to obtain the updated forward and backward RNN dependency terms. For the specific formulation, refer to formula (1) to formula (5);
• the symbol f represents forward: it specifies that the direction from node i-1 to node i is forward;
• the symbol b represents backward: it specifies that the direction from node i to node i-1 is backward;
• x_i represents the input original key point heat map i;
• y_i represents the output new key point heat map i;
• W_x, W_f and W_b are parameters to be estimated.
• key feature point 5, key feature point 6 and key feature point 14 correspond to key feature point heat map i-1, key feature point heat map i and key feature point heat map i+1, respectively.
• the forward constraint relationship information and the backward constraint relationship information are combined with the likelihood information, i.e. the heat map of the original key feature point 6 output by the first neural network, to obtain the posterior distribution of key feature point 6 contained in the target image; in this way, a refined set of key point heat maps that conform to the local shape constraints of the clothing can be output.
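An illustrative version of this node update follows: each bidirectional RNN node combines its likelihood heat map x_i with a forward and a backward message into a new confidence map y_i. The patent's actual formulas (1) to (5), including the learned parameters W_x, W_f and W_b, are not reproduced in this text, so a plain product of the three probability maps stands in for them here.

```python
import numpy as np

def node_update(x_i, msg_forward, msg_backward):
    """Combine the node likelihood term x_i with the forward message
    (from node i-1) and the backward message (from node i+1) into a new
    confidence map y_i, renormalised to a probability map. This is a
    stand-in for the patent's learned combination, not its formulas."""
    y = x_i * msg_forward * msg_backward
    return y / y.sum()

rng = np.random.default_rng(0)
x = rng.random((8, 8))   # likelihood heat map from the first network
f = rng.random((8, 8))   # forward message
b = rng.random((8, 8))   # backward message
y = node_update(x, f, b)
```

The product form has the intended qualitative effect: locations supported by the likelihood and by both neighbouring constraints keep high confidence, while locations contradicted by either message are suppressed.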
  • a clothing classification method is also provided.
  • the clothing classification method is executed by the clothing classification device, and the clothing classification method includes:
  • Step S201 Determine the category attribute of the clothing based on the position probability information of the key feature points contained in the target image; the category attribute includes one of the following: shape, version and style;
  • Step S202 Determine the corresponding apparel category based on the category attribute.
  • the clothing is accurately identified, and the category attribute corresponding to the clothing is determined.
• taking the category attribute of shape as an example, the clothing is determined to be pants and is then classified into the pants category.
• each target image is classified according to its shape, version and style. For example, when a user shops online, the pictures provided by merchants can be classified; when the user enters the keyword "trousers", all the target images classified under that clothing category are displayed, providing a more convenient and quicker shopping experience. Further, the clothing is subdivided: for example, pants are divided by version into calf pants, straight pants and wide-leg pants, and skirts are divided by style into short skirts, miniskirts, over-the-knee skirts and long skirts. This not only enriches the classification options, but also provides more material for the fields of clothing design and smart dressing, and allows wider application in online shopping, smart dressing and clothing design.
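The two-level classification described above (the category attribute selects the clothing category, and version or style refines it) can be sketched as a simple lookup; the attribute names and subcategories are the examples from the text, not an exhaustive taxonomy.

```python
# Subdivisions by version (pants) and by style (skirts), as in the text.
SUBCATEGORIES = {
    "pants": ["calf pants", "straight pants", "wide-leg pants"],
    "skirt": ["short skirt", "miniskirt", "over-the-knee skirt", "long skirt"],
}

def classify(shape, version_or_style):
    """Map a recognised shape attribute and a version/style attribute to
    a (category, subcategory) pair."""
    if version_or_style not in SUBCATEGORIES[shape]:
        raise ValueError("unknown subdivision for category %r" % shape)
    return shape, version_or_style
```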
  • a clothing retrieval method is also provided.
• the clothing retrieval method is executed by the clothing retrieval device, and the clothing retrieval method includes:
  • Step S301 Determine the category attribute of the clothing based on the position probability information of the key feature points contained in the target image; the category attribute includes one of the following: shape, version and style;
  • Step S302 Determine the corresponding retrieval element based on the category attribute; wherein, the retrieval element includes retrieval keywords and/or images;
  • Step S303 retrieve a clothing image set corresponding to the target image based on the retrieval element.
  • the clothing is accurately identified, and the category attribute corresponding to the clothing is determined.
• taking the category attribute of shape as an example, the clothing is determined to be pants and is then classified into the pants category.
• the generated retrieval elements are the retrieval keyword "pants" and/or the "pants image" corresponding to the pants, which are then matched and queried against the image features stored in the image feature pool. Based on the query result, the similar clothing pictures found and the corresponding clothing information can be presented to the user through a mobile terminal; the clothing information includes the clothing brand, price and material.
• clothing retrieval based on the target image effectively solves the difficulty that users cannot, or do not want to, search with words, and reduces the difficulty of shopping; the visual features meet users' higher demands on retrieval functions and help users quickly find clothing information, greatly enhancing the user experience.
• by category attribute, pants are divided by version into calf pants, straight-leg pants and wide-leg pants, and skirts are divided by style into short skirts, mini skirts, over-the-knee skirts and long skirts, so that the user's clothing search is more fine-grained, the retrieval difficulty is lower and the screening is more precise. Further, this provides more material for the fields of clothing design and smart dressing, and allows wider application in online shopping, smart dressing and clothing design.
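Matching a query image against the image feature pool can be sketched with cosine similarity; the disclosure does not specify the similarity measure or the feature dimensionality, so both are assumptions here.

```python
import numpy as np

def retrieve(query_feature, feature_pool, top_k=3):
    """Rank the image feature pool by cosine similarity to the query
    feature and return the indices of the top_k most similar clothing
    images (an assumed similarity measure, not specified by the patent)."""
    pool = feature_pool / np.linalg.norm(feature_pool, axis=1, keepdims=True)
    q = query_feature / np.linalg.norm(query_feature)
    scores = pool @ q
    return np.argsort(scores)[::-1][:top_k]

rng = np.random.default_rng(0)
pool = rng.random((10, 16))                # 10 stored feature vectors
query = pool[4] + 0.01 * rng.random(16)    # near-duplicate of item 4
ranked = retrieve(query, pool)
```

The returned indices would then be mapped to the stored clothing pictures and information (brand, price, material) for display on the user's mobile terminal.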
  • a clothing recognition device including:
  • the first acquisition subsystem 31 is configured to acquire a target image containing clothing to be identified
• the first image processing subsystem 32 is configured to process the target image through the trained first neural network to determine the heat map set corresponding to the key feature points contained in the target image;
• the heat map set includes the position probability heat map corresponding to each key feature point contained in the target image; the trained second neural network processes the heat map set based on the shape constraint conditions corresponding to the target image to determine the position probability information of the key feature points contained in the target image.
• a target image containing clothing to be recognized is acquired, and the target image is processed through the trained first neural network to determine the heat map set corresponding to the key feature points contained in the target image;
• the heat map set includes the position probability heat map corresponding to each key feature point contained in the target image; in this way, the initial position information of each key feature point in the target image of the clothing to be recognized is obtained as the initial position probability heat map corresponding to each key feature point;
  • the second neural network after training processes the heat map set based on the shape constraint conditions corresponding to the target image, and determines the position probability information of the key feature points contained in the target image.
• processing the heat map set based on the shape constraint conditions corresponding to the clothing parts optimizes the accurate identification of the position probabilities of the key feature points contained in the clothing to be recognized, and realizes accurate identification of the clothing based on the determined position probability information of its key feature points;
• by obtaining the key feature points of clothing, the method can be applied more widely in online shopping, smart dressing and clothing design.
  • a network training subsystem 33 configured to obtain an image training set containing training images of multiple clothing, the training image includes an original image carrying key feature point annotation information;
• the initial convolutional neural network is iteratively trained based on the image training set until the loss function meets the convergence condition, and the trained first neural network is obtained.
  • the network training subsystem 33 is further configured to perform image augmentation on the original image to obtain a corresponding augmented image, and the image training set also includes augmented images.
  • the network training subsystem 33 is also configured to perform image horizontal translation, image vertical translation, color perturbation, and/or image rotation on the original image to obtain the corresponding augmented image.
• the network training subsystem 33 is further configured to obtain a training sample set, the training sample set including the heat map sets output by the first neural network for the key feature points contained in the training images; and to iteratively train the initial bidirectional recurrent convolutional neural network based on the training sample set and the shape constraint condition set until the loss function meets the convergence condition, obtaining the trained second neural network.
• the network training subsystem 33 is further configured to input the training sample set and the shape constraint condition set into the initial bidirectional recurrent convolutional neural network;
• the bidirectional recurrent convolutional neural network inputs each shape constraint condition into the forward layer in the set order, and inputs each shape constraint condition into the backward layer in the reverse of the set order, for iterative training.
• the network training subsystem 33 is also configured so that, in one iteration, the bidirectional recurrent convolutional neural network sets the connection order of the key feature points according to the constraint relationships between the key feature points contained in the corresponding shape constraint condition; taking a key feature point contained in the heat map set corresponding to the training image as a starting point, the position probability heat map corresponding to each key feature point is input into the forward layer in the connection order and, at the same time, into the backward layer in the reverse of the connection order, for iterative training.
• the first acquisition subsystem 31 can be implemented by a camera or a terminal with a drawing function;
• the first image processing subsystem 32 can be implemented by an image processor or a server;
• the network training subsystem 33 can be implemented by a processor or a server;
• Figure 8 does not show the camera, the terminal with a drawing function, the image processor, the server or the processor.
• the image processor or the processor may specifically be a central processing unit (CPU, Central Processing Unit), a microprocessor (MPU, Microprocessor Unit), a digital signal processor (DSP, Digital Signal Processor) or a field programmable gate array (FPGA, Field Programmable Gate Array).
  • a clothing classification device including:
  • the second acquisition subsystem 41 is configured to acquire a target image containing clothing to be identified
  • the second image processing subsystem 42 is configured to determine a heat map set corresponding to the key feature points contained in the target image based on the target image, the heat map set including a position probability heat map corresponding to each key feature point contained in the target image; and Process the heat map set based on the shape constraint conditions corresponding to the target image, and determine the position probability information of the key feature points contained in the target image;
  • the classification subsystem 43 is configured to determine the category attributes of clothing based on the position probability information of the key feature points contained in the target image; the category attributes include one of the following: shape, version, and style; and determine the corresponding clothing based on the category attributes category.
• the second acquisition subsystem 41 can be implemented by a camera or a terminal with a drawing function;
• the second image processing subsystem 42 can be implemented by an image processor or a server;
• the classification subsystem 43 can be implemented by a processor or a server;
• Figure 9 does not show the camera, the terminal with a drawing function, the image processor, the server or the processor.
  • a clothing retrieval device including:
  • the third acquisition subsystem 51 is configured to acquire a target image containing clothing to be identified
  • the third image processing subsystem 52 is configured to determine a heat map set corresponding to the key feature points contained in the target image based on the target image, the heat map set including a position probability heat map corresponding to each key feature point contained in the target image; and Process the heat map set based on the shape constraint conditions corresponding to the target image, and determine the position probability information of the key feature points contained in the target image;
• the retrieval element determination subsystem 53 is configured to determine the category attribute of the clothing based on the position probability information of the key feature points contained in the target image, the category attribute including one of the following: shape, version and style; and to determine the corresponding retrieval elements based on the category attribute, where the retrieval elements include retrieval keywords and/or images;
  • the retrieval subsystem 54 is configured to retrieve a clothing image set corresponding to the target image based on retrieval elements.
• the third acquisition subsystem 51 can be implemented by a camera or a terminal with a drawing function;
• the third image processing subsystem 52 can be implemented by an image processor or a server;
• the retrieval element determination subsystem 53 can be implemented by a processor;
• the retrieval subsystem 54 can be implemented by search engine equipment; FIG. 10 does not show the camera, the terminal with a drawing function, the image processor, the server, the processor or the search engine equipment.
• a computer device including: at least one processor 210 and a memory 211 for storing a computer program that can run on the processor 210;
• the single processor 210 shown in FIG. 11 is not used to indicate that the number of processors is one, but only to indicate the positional relationship of the processor relative to other devices; in practical applications, the number of processors can be one or more.
• the memory 211 illustrated in FIG. 11 has the same meaning: it is only used to indicate the positional relationship of the memory relative to other devices, and in practical applications the number of memories can be one or more.
• when the processor 210 is used to run a computer program, the following steps are executed:
• acquiring the target image containing the clothing to be recognized, and processing the target image through the trained first neural network to determine the heat map set corresponding to the key feature points contained in the target image, the heat map set including the position probability heat map corresponding to each key feature point contained in the target image;
  • the second neural network after training processes the heat map set based on the shape constraint conditions corresponding to the target image, and determines the position probability information of the key feature points contained in the target image.
  • processor 210 is further configured to execute the following steps when running a computer program:
  • the initial convolutional neural network is iteratively trained based on the image training set until the loss function meets the convergence condition, and the first neural network after training is obtained.
  • processor 210 is further configured to execute the following steps when running a computer program:
  • the image training set also includes augmented images.
  • processor 210 is further configured to execute the following steps when running a computer program:
• when the processor 210 is further configured to run a computer program, the following steps are executed:
• obtaining a training sample set, the training sample set including the heat map sets output by the first neural network for the key feature points contained in the training images;
• the initial bidirectional recurrent convolutional neural network is iteratively trained until the loss function meets the convergence condition, and the trained second neural network is obtained.
  • processor 210 is further configured to execute the following steps when running a computer program:
  • the shape constraint condition set includes a shape constraint condition formed by using a triangular substructure and/or a quadrangular substructure to respectively represent the constraint relationship between multiple key feature points.
  • processor 210 is further configured to execute the following steps when running a computer program:
  • processor 210 is further configured to execute the following steps when running a computer program:
• the bidirectional recurrent convolutional neural network inputs each shape constraint condition in the shape constraint condition set into the forward layer in the set order, and inputs each shape constraint condition into the backward layer in the reverse of the set order, for iterative training.
  • processor 210 is further configured to execute the following steps when running a computer program:
• the bidirectional recurrent convolutional neural network sets the connection order of the key feature points according to the constraint relationships between the key feature points contained in the corresponding shape constraint conditions;
• the position probability heat map corresponding to each key feature point is input into the forward layer in the connection order and, at the same time, into the backward layer in the reverse of the connection order, for iterative training.
  • processor 210 is further configured to execute the following steps when running a computer program:
• the second neural network adjusts the display parameters of the heat map set according to the shape constraint conditions corresponding to the target image;
• the display parameters of the heat map set are adjusted to obtain the corresponding target parameters, and the position probability information of the key feature points contained in the target image is obtained according to the target parameters.
  • the clothing identification device further includes: at least one network interface 212.
  • the various components in the sending end are coupled together through the bus system 213.
  • the bus system 213 is used to implement connection and communication between these components.
  • the bus system 213 also includes a power bus, a control bus, and a status signal bus.
  • various buses are marked as the bus system 213 in FIG. 11.
  • the memory 211 may be a volatile memory or a non-volatile memory, and may also include both volatile and non-volatile memory.
• the non-volatile memory can be a read-only memory (ROM, Read Only Memory), a programmable read-only memory (PROM, Programmable Read-Only Memory), an erasable programmable read-only memory (EPROM, Erasable Programmable Read-Only Memory), an electrically erasable programmable read-only memory (EEPROM, Electrically Erasable Programmable Read-Only Memory), a magnetic random access memory (FRAM, ferromagnetic random access memory), a flash memory (Flash Memory), a magnetic surface memory, an optical disc, or a compact disc read-only memory (CD-ROM, Compact Disc Read-Only Memory); the magnetic surface memory can be a magnetic disk memory or a magnetic tape memory.
  • the volatile memory may be random access memory (RAM, Random Access Memory), which is used as an external cache.
• many forms of RAM can be used, for example: static random access memory (SRAM, Static Random Access Memory), synchronous static random access memory (SSRAM, Synchronous Static Random Access Memory), dynamic random access memory (DRAM, Dynamic Random Access Memory), synchronous dynamic random access memory (SDRAM, Synchronous Dynamic Random Access Memory), double data rate synchronous dynamic random access memory (DDRSDRAM, Double Data Rate Synchronous Dynamic Random Access Memory), enhanced synchronous dynamic random access memory (ESDRAM, Enhanced Synchronous Dynamic Random Access Memory), synclink dynamic random access memory (SLDRAM, SyncLink Dynamic Random Access Memory) and direct rambus random access memory (DRRAM, Direct Rambus Random Access Memory).
  • the memory 211 described in the embodiment of the present invention is intended to include, but is not limited to, these and any other suitable types of memory.
  • the memory 211 in the embodiment of the present invention is used to store various types of data to support the operation of the sender.
  • Examples of such data include: any computer programs used to operate on the sending end, such as operating systems and applications.
  • the operating system contains various system programs, such as a framework layer, a core library layer, a driver layer, etc., for implementing various basic services and processing hardware-based tasks.
  • the application program can include various application programs for realizing various application services.
  • a program that implements the method of the embodiment of the present invention may be included in an application program.
  • This embodiment also provides a computer storage medium, for example, including a memory 211 storing a computer program.
  • the computer program can be executed by the processor 210 in the sending end to complete the steps of the foregoing method.
• the computer storage medium can be an FRAM, a ROM, a PROM, an EPROM, an EEPROM, a flash memory, a magnetic surface memory, an optical disc or a CD-ROM; it can also be any device including one or any combination of the above memories, such as a smart phone, a tablet computer or a laptop computer.
  • acquire a target image containing the clothing to be recognized, process the target image through the trained first neural network, and determine the heat map set corresponding to the key feature points contained in the target image;
  • the heat map set includes a position probability heat map corresponding to each key feature point contained in the target image;
  • the trained second neural network processes the heat map set based on the shape constraints corresponding to the target image, and determines the position probability information of the key feature points contained in the target image.
  • the initial convolutional neural network is iteratively trained based on the image training set until the loss function meets the convergence condition, yielding the trained first neural network.
  • the image training set also includes augmented images.
  • the training sample set includes a heat map set, output by the first neural network, corresponding to the key feature points contained in the training images;
  • the initial bidirectional recurrent convolutional neural network is iteratively trained until the loss function meets the convergence condition, yielding the trained second neural network.
  • the shape constraint set includes shape constraints formed by using triangular and/or quadrilateral substructures to respectively represent the constraint relationships among multiple key feature points.
  • the bidirectional recurrent convolutional neural network inputs each shape constraint in the shape constraint set into the forward layer in a set order, and inputs each shape constraint into the backward layer in a set reverse order, for iterative training;
  • the bidirectional recurrent convolutional neural network sets the connection order of the key feature points according to the constraint relationships between the key feature points contained in the corresponding shape constraint;
  • the position probability heat map corresponding to each key feature point is input into the forward layer and the backward layer simultaneously, in the connection order and in the reverse of that order, for iterative training.
  • the second neural network adjusts the display parameters of the heat map set according to the shape constraints corresponding to the target image;
  • the display parameters of the heat map set are adjusted, according to the conditions they need to satisfy, to obtain the corresponding target parameters, and the position probability information of the key feature points contained in the target image is obtained from the target parameters.
  • a target image containing the clothing to be recognized is acquired, and a heat map set corresponding to the key feature points contained in the target image is determined based on the target image; the heat map set includes a position probability heat map corresponding to each key feature point contained in the target image. In this way, the initial position information of each key feature point in the target image of the clothing to be recognized is obtained, giving an initial position probability heat map for each key feature point. The heat map set is then processed based on the shape constraints corresponding to the target image, and the position probability information of the key feature points contained in the target image is determined.
  • processing the heat map set based on the shape constraints corresponding to local garment regions refines the recognition of the position probabilities of the key feature points contained in the clothing to be recognized; accurate recognition of the clothing is achieved from the determined position probability information of its key feature points, and acquiring the key feature points of the clothing makes the method widely applicable in fields such as online shopping, smart dressing, and clothing design.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Business, Economics & Management (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Strategic Management (AREA)
  • Development Economics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Multimedia (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Game Theory and Decision Science (AREA)
  • Image Analysis (AREA)

Abstract

一种服饰识别、分类及检索的方法、装置、设备及存储介质,包括:获取包含待识别服饰的目标图像,基于目标图像确定目标图像包含的关键特征点对应的热图集,热图集包括目标图像中包含的每一关键特征点对应的位置概率热图(101);基于目标图像对应的形状约束条件对热图集进行处理,确定目标图像中包含的关键特征点的位置概率信息(102)。

Description

服饰识别、分类及检索的方法、装置、设备及存储介质
相关申请的交叉引用
本发明基于申请号为201910123577.4、申请日为2019年02月18日的中国专利申请提出,并要求该中国专利申请的优先权,该中国专利申请的全部内容在此以引入方式并入本发明。
技术领域
本发明涉及计算机视觉技术领域,尤其涉及一种服饰识别、分类及检索的方法、装置、设备及存储介质。
背景技术
服饰识别是图像检索领域最重要也是最有挑战性的问题之一。在当今互联网上,多数用户的搜索和网上购物内容都与服饰相关。因此,服饰识别是解决同款检索、风格识别以及穿搭推荐等需求中的关键问题。
发明内容
本发明的主要目的在于提出一种服饰识别、分类及检索的方法、装置、设备及存储介质,能够更加精确的识别服饰。
为达到上述目的,本发明的技术方案是这样实现的:
第一方面,本发明实施例提供了一种服饰识别方法,所述方法包括:
获取包含待识别服饰的目标图像,基于所述目标图像确定所述目标图像包含的关键特征点对应的热图集,所述热图集包括所述目标图像中包含的每一关键特征点对应的位置概率热图;
基于所述目标图像对应的形状约束条件对所述热图集进行处理,确定所述目标图像中包含的关键特征点的位置概率信息。
第二方面,本发明实施例提供了一种采用本发明任一实施例所述的服饰识别方法实现的服饰分类方法,包括:
基于目标图像中包含的关键特征点的位置概率信息确定服饰的类别属性;所述类别属性包括以下其中之一:形状、版型和风格;
基于所述类别属性确定对应的服饰类别。
第三方面,本发明实施例提供了一种采用本发明任一实施例所述的服饰识别方法实现的服饰检索方法,包括:
基于目标图像中包含的关键特征点的位置概率信息确定服饰的类别属性;所述类别属性包括以下其中之一:形状、版型和风格;
基于所述类别属性确定对应的检索要素;其中,所述检索要素包括检索关键字和/或图像;
基于所述检索要素检索与所述目标图像对应的服饰图像集。
第四方面,本发明实施例提供了一种服饰识别装置,包括:
第一采集子系统,被配置为获取包含待识别服饰的目标图像;
第一图像处理子系统,被配置为基于所述目标图像确定所述目标图像包含的关键特征点对应的热图集,所述热图集包括所述目标图像中包含的每一关键特征点对应的位置概率热图;以及基于所述目标图像对应的形状约束条件对所述热图集进行处理,确定所述目标图像中包含的关键特征点的位置概率信息。
第五方面,本发明实施例提供了一种服饰分类装置,包括:
第二采集子系统,被配置为获取包含待识别服饰的目标图像;
第二图像处理子系统,被配置为基于所述目标图像确定所述目标图像包含的关键特征点对应的热图集,所述热图集包括所述目标图像中包含的每一关键特征点对应的位置概率热图;以及基于所述目标图像对应的形状约束条件对所述热图集进行处理,确定所述目标图像中包含的关键特征点的位置概率信息;
分类子系统,被配置为基于目标图像中包含的关键特征点的位置概率信息确定服饰的类别属性;所述类别属性包括以下其中之一:形状、版型和风格;以及基于所述类别属性确定对应的服饰类别。
第六方面,本发明实施例提供了一种服饰检索装置,包括:
第三采集子系统,被配置为获取包含待识别服饰的目标图像;
第三图像处理子系统,被配置为基于所述目标图像确定所述目标图像包含的关键特征点对应的热图集,所述热图集包括所述目标图像中包含的每一关键特征点对应的位置概率热图;以及基于所述目标图像对应的形状约束条件对所述热图集进行处理,确定所述目标图像中包含的关键特征点的位置概率信息;
检索要素确定子系统,被配置为基于目标图像中包含的关键特征点的位置概率信息确定服饰的类别属性;所述类别属性包括以下其中之一:形状、版型和风格;以及基于所述类别属性确定对应的检索要素;其中,所述检索要素包括检索关键字和/或图像;
检索子系统,被配置为基于所述检索要素检索与所述目标图像对应的服饰图像集。
第七方面,本发明实施例提供了一种计算机设备,包括:处理器和被配置为存储能够在处理器上运行的计算机程序的存储器;
其中,所述处理器被配置为运行所述计算机程序时,实现本发明任一实施例所提供的服饰识别方法、或所提供的服饰分类方法、或所提供的服饰检索方法。
第八方面,本发明实施例提供了一种计算机存储介质,所述计算机存储介质中存储有计算机程序,所述计算机程序被处理器执行时实现本发明任一实施例所提供的服饰识别方法、或所提供的服饰分类方法、或所提供的服饰检索方法。
本发明实施例所提供的服饰识别、分类及检索的方法、装置、设备及存储介质,获取包含待识别服饰的目标图像,基于所述目标图像确定所述目标图像包含的关键特征点对应的热图集,所述热图集包括所述目标图像中包含的每一关键特征点对应的位置概率热图;如此,获得待识别服饰的目标图像中每一关键特征点的初始位置信息,得到每一关键特征点对应的初始位置概率热图;基于所述目标图像对应的形状约束条件对所述热图集进行处理,确定所述目标图像中包含的关键特征点的位置概率信息。如此,基于与服饰局部对应的形状约束条件对热图集进行处理,能够优化对待识别服饰包含的关键特征点的位置概率的精确识别,根据所述确定的待识别服饰的关键特征点的位置概率信息实现对服饰的精确识别,通过获取所述服饰的关键特征点能够更加广泛地适用于网络购物、智能穿搭、服装设计等领域的应用。
附图说明
图1为本发明一实施例提供的服饰识别方法的流程示意图;
图2(a)为本发明一实施例提供的第一类服饰类型的服饰关键特征点实例样图;
图2(b)为本发明一实施例提供的第二类服饰类型的服饰关键特征点实例样图;
图2(c)为本发明一实施例提供的第三类服饰类型的服饰关键特征点实例样图;
图2(d)为本发明一实施例提供的第四类服饰类型的服饰关键特征点实例样图;
图2(e)为本发明一实施例提供的第五类服饰类型的服饰关键特征点实例样图;
图3为本发明一实施例提供的第一神经网络的结构示意图;
图4为本发明一实施例提供的形状约束条件集的示意图;
图5为本发明一实施例提供的形状约束条件输入双向循环卷积神经网络示意图;
图6为本发明一实施例提供的服饰分类方法的流程示意图;
图7为本发明一实施例提供的服饰检索方法的流程示意图;
图8为本发明一实施例提供的服饰识别装置的结构示意图;
图9为本发明一实施例提供的服饰分类装置的结构示意图;
图10为本发明一实施例提供的服饰检索装置的结构示意图;
图11为本发明一实施例提供的计算机设备的结构示意图。
具体实施方式
以下结合附图及实施例,对本发明进行进一步详细说明。应当理解,此处所描述的具体实施例仅仅用以解释本发明,并不用于限定本发明。
除非另有定义,本文所使用的所有的技术和科学术语与属于本发明的技术领域的技术人员通常理解的含义相同。本文中在本发明的说明书中所使用的术语只是为了描述具体的实施例的目的,不是旨在于限制本发明。本文所使用的术语“及/或”包括一个或多个相关的所列项目的任意的和所有的组合。
对本发明进行进一步详细说明之前,对本发明实施例中涉及的名词和术语进行说明,本发明实施例中涉及的名词和术语适用于如下的解释。
1)目标图像,本文中指用于进行服饰关键点检测的成像的图像,例如JPEG等各种数字格式的图像。
2)训练图像,用于图像训练的样本图像。
3)损失函数(loss function)也叫代价函数(cost function),是神经网络优化的目标函数。
4)神经网络(Neural Networks,NN),是由大量的、简单的处理单元(称为神经元)广泛地互相连接而形成的复杂网络系统,它反映了人脑功能的许多基本特征,是一个高度复杂的非线性动力学系统。
如图1所示,本发明一实施例提供了一种服饰识别方法,通过服饰识别装置执行该方法,该方法包括如下步骤:
步骤S101:获取包含待识别服饰的目标图像,基于目标图像确定目标图像包含的关键特征点对应的热图集,热图集包括目标图像中包含的每一关键特征点对应的位置概率热图;
目标图像是指针对需要进行服饰关键点检测所拍摄或者绘制的图片。基于目标图像确定目标图像包含的关键特征点对应的热图集,是指提取目标图像中包含的相应特征,得到关键特征点对应的热图集,热图集包括目标图像中包含的每一关键特征点对应的位置概率热图。
不同类型的服饰具有不完全相同的关键点,当前系统主要接受5类服饰类型,对这5类服饰类型分别确定包含的关键特征点,如图2(a)所示的5类服饰类型中的第一类服饰类型的服饰关键特征点实例样图,第一类服饰类型包括服饰关键特征点1至13,共计13个关键特征点;如图2(b)所示的5类服饰类型中的第二类服饰类型的服饰关键特征点实例样图,第二类服饰类型包括服饰关键特征点1至12,共计12个关键特征点;如图2(c)所示的5类服饰类型中的第三类服饰类型的服饰关键特征点实例样图,第三类服饰类型包括服饰关键特征点1至7,共计7个关键特征点;如图2(d)所示的5类服饰类型中的第四类服饰类型的服饰关键特征点实例样图,第四类服饰类型包括服饰关键特征点1至4,共计4个关键特征点;如图2(e)所示的5类服饰类型中的第五类服饰类型的服饰关键特征点实例样图,第五类服饰类型包括服饰关键特征点1至6,共计6个关键特征点。这里,服饰关键点为服饰所属类别中大多数服饰具有的、在功能和结构上可用于区分服饰类别中不同款式服饰的局部位置,每个服饰可以具有一个或者多个关键点。
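上述5类服饰类型各自包含的关键特征点数量,可以用一个简单映射示意(纯Python示例,类型编号与图2(a)至图2(e)对应,仅为示意):

```python
# 5类服饰类型对应的关键特征点数量(编号1至5对应图2(a)至图2(e))
KEYPOINT_COUNTS = {1: 13, 2: 12, 3: 7, 4: 4, 5: 6}

def num_keypoints(clothing_type: int) -> int:
    """返回某一服饰类型包含的关键特征点个数。"""
    return KEYPOINT_COUNTS[clothing_type]
```

例如,可据此确定各类服饰需要预测的关键点热图个数(此处仅为示意,并非专利限定的实现)。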
确定目标图像包含的关键特征点对应的热图集,即获得目标图像中每一关键特征点的初始位置信息;具体地,获得每一关键特征点对应的位置概率热图。
步骤S102:基于目标图像对应的形状约束条件对热图集进行处理,确定目标图像中包含的关键特征点的位置概率信息。
对热图集和目标图像对应的形状约束条件进行处理,通过形状约束条件实现对热图集的优化,确定目标图像中包含的关键特征点的位置概率信息。
这里,形状约束条件可以是与服饰局部对应的约束条件,用于表征服饰局部的关键特征。
本发明实施例提供的一种服饰识别方法,获取包含待识别服饰的目标图像,基于目标图像确定目标图像包含的关键特征点对应的热图集,热图集包括目标图像中包含的每一关键特征点对应的位置概率热图;如此,获得待识别服饰的目标图像中每一关键特征点的初始位置信息,得到每一关键特征点对应的初始位置概率热图;基于目标图像对应的形状约束条件对热图集进行处理,确定目标图像中包含的关键特征点的位置概率信息。如此,基于与服饰局部对应的形状约束条件对热图集进行处理,能够优化对待识别服饰包含的关键特征点的位置概率的精确识别,根据确定的待识别服饰的关键特征点的位置概率信息实现对服饰的精确识别,通过获取服饰的关键特征点能够更加广泛的适用于在网络购物、智能穿搭、服装设计等领域的应用。
在一实施方式中,基于目标图像确定目标图像包含的关键特征点对应的热图集,包括:
通过训练后的第一神经网络对目标图像进行处理,确定目标图像包含的关键特征点对应的热图集;
基于目标图像对应的形状约束条件对热图集进行处理,包括:
通过训练后的第二神经网络基于目标图像对应的形状约束条件对热图集进行处理。
通过训练后的第一神经网络对目标图像进行处理是指将目标图像输入训练后的第一神经网络,通过第一神经网络抓取目标图像中所包含对应的特征,包含关键特征点对应的热图集,热图集包括目标图像中包含的每一关键特征点对应的位置概率热图。
利用训练后的第一神经网络确定目标图像包含的关键特征点对应的热图集,即对目标图像的每一关键特征点的初始位置信息,具体地,获得每一关键特征点对应的位置概率热图。
将热图集和目标图像对应的形状约束条件输入训练后的第二神经网络,通过形状约束条件实现对热图集的优化,确定目标图像中包含的关键特征点的位置概率信息。
这里,形状约束条件可以是与服饰局部对应的约束条件,用于表征服饰局部的关键特征。
在上述实施方式中,获取包含待识别服饰的目标图像,通过训练后的第一神经网络对目标图像进行处理,确定目标图像包含的关键特征点对应的热图集,热图集包括目标图像中包含的每一关键特征点对应的位置概率热图;如此,获得待识别服饰的目标图像中每一关键特征点的初始位置信息,得到每一关键特征点对应的初始位置概率热图;通过训练后的第二神经网络基于目标图像对应的形状约束条件对热图集进行处理,确定目标图像中包含的关键特征点的位置概率信息。如此,基于与服饰局部对应的形状约束条件对热图集进行处理,能够优化对待识别服饰包含的关键特征点的位置概率的精确识别,根据确定的待识别服饰的关键特征点的位置概率信息实现对服饰的精确识别,通过获取服饰的关键特征点能够更加广泛的适用于在网络购物、智能穿搭、服装设计等领域的应用。
在一实施方式中,获取包含待识别服饰的目标图像之前,包括:
获取包含有多个服饰的训练图像的图像训练集,训练图像包括携带有关键特征点标注信息的原始图像;
基于图像训练集对初始的卷积神经网络进行迭代训练,直至损失函数满足收敛条件,得到训练后的第一神经网络。
获取包含有多个服饰的训练图像的图像训练集,可以是基于新图像为样本图像构建多批次的训练集,例如,训练图像可以是基于互联网中当前已公开的图像库中收集得到的,训练图像是根据预先确定的标注方式对原始图像中局部图像进行标注予以明确。
这里,损失函数(loss function)也叫代价函数(cost function),是神经网络优化的目标函数,神经网络训练或者优化的过程就是最小化损失函数的过程,损失函数值越小,对应预测的结果和真实结果的值就越接近。
本发明实施例中,初始的神经网络模型可以是基于预训练好的图像数据集上预训练好的神经网络模型,如基于预训练好的ImageNet、DenseNet等图像数据集上预训练得到的Inception V1、V2、V3、V4等卷积神经网络模型,当然,也可以是基于预训练好的其它图像数据集上预训练好的任意神经网络模型,通过利用基于预训练好的图像数据集上预训练好的神经网络模型的参数搭建初始的神经网络模型。
具体地,参阅图3,第一神经网络可以包括数据预处理模块21、特征提取模块22、区域检测模块23、目标检测模块24、关键点定位模块25;
特征提取模块22,被配置为从目标图像数据中提取并输出图像特征图,以Imagenet网络参数初始化特征提取网络,从第一卷积层、第一残差块第二单元、第二残差块第三单元、第三残差块第五单元、第四残差块最后一个单元输出分别提取特征图并依次进行上采样和通道变换,然后和上一级特征图对应位置相加并进行一次3x3卷积消除叠加褶叠影响,如此构造出多尺度特征金字塔,尺度分别为2,4,8,16,32倍,多尺度的特征有助于检测出不同尺度的对象,使得整体检测方法更加鲁棒。
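上述特征金字塔的逐级融合(低分辨率特征图上采样后与上一级特征图对应位置相加)可以用如下简化的纯Python示意;这里用二维列表代替特征图、最近邻上采样代替反卷积,并省略了通道变换与3x3卷积(均为示意性假设,并非专利中的确切实现):

```python
def upsample2x(fmap):
    """最近邻2倍上采样:每个元素在行、列方向各复制一次。"""
    out = []
    for row in fmap:
        expanded = []
        for v in row:
            expanded.extend([v, v])
        out.append(expanded)
        out.append(list(expanded))
    return out

def merge(coarse, fine):
    """金字塔融合:低分辨率特征图上采样后与高分辨率特征图对应位置相加。"""
    up = upsample2x(coarse)
    return [[up[i][j] + fine[i][j] for j in range(len(fine[0]))]
            for i in range(len(fine))]

coarse = [[1, 2],
          [3, 4]]          # 低分辨率特征图
fine = [[0, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 1, 0],
        [0, 0, 0, 0]]      # 高一级分辨率特征图
fused = merge(coarse, fine)
```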
区域检测模块23,被配置为分别对每个尺度的特征图进行如下处理: 首先进行一个3x3卷积的特征调整,然后分别连接两个全连接分支,一个分支用来预测对象位置,一个分支用来预测对象概率。为了进一步增加预测的鲁棒性,对每个像素引入若干参考盒,它们具有不同的纵横比和尺度,决策时以每个参考盒为基础,这样就细化了决策粒度,稠密的参考盒充分利用了更广泛的集体智慧,减少了预测的不稳定性。训练时,首先通过寻找与每个参考盒重叠面积比例最大的对象边界盒作为相应参考盒的支持对象,每个参考盒仅支持或投票一个对象,与支持对象的重叠面积比例大于指定阈值的视为正样本,否则视为负样本(重叠面积比例大于零的参考盒要比不重叠的参考盒更具有区分对象边界的能力),为了增加稠密性,进一步指定与每个对象边界盒具有最大重叠面积比例的参考盒为正样本。训练时,每次迭代过程中,为了减少计算量,需要滤除部分参考盒,这些参考盒预测的对象边界盒的面积较小、分数较低,然后进一步使用非极大值压缩方法消除部分参考盒,再根据预测的对象边界盒与标注的对象边界盒的重叠面积比例阈值再次滤除部分边界盒,最后选择位于正负样本集合内的那些边界盒参与最终损失函数的计算。测试时,对每个参考盒预测的对象边界盒的位置和对象分数进行非极大值压缩,选取最佳的对象实例和对象位置。最终需要将每个尺度特征图预测的对象区域进行合并,这些对象区域要调整到原图像尺寸。每个预测的对象区域视为兴趣区域,通过双线性插值将该区域的特征图缩放到14x14大小输出到目标检测模块24。
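上述以参考盒与对象边界盒的重叠面积比例划分正负样本的判定逻辑,可用如下纯Python示意(阈值0.5仅为假设示例,并非专利中指定的数值):

```python
def iou(box_a, box_b):
    """计算两个边界盒 (x1, y1, x2, y2) 的交并比(重叠面积比例)。"""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    iw, ih = max(0, ix2 - ix1), max(0, iy2 - iy1)
    inter = iw * ih
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def assign_label(anchor, gt_box, pos_thresh=0.5):
    """重叠比例大于阈值视为正样本,否则视为负样本。"""
    return "positive" if iou(anchor, gt_box) > pos_thresh else "negative"
```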
目标检测模块24,被配置为对于输入的固定大小的14x14的对象特征图,进行最大池化到尺寸7x7,然后将特征图展开为一维向量,再连接两个全连接层进行特征变换,最后再分两支,每个分支为一个全连接层,分别用来精细对象位置和分类对象。训练时,对象边界盒和区域检测模块23相同,即为对象类的所有关键点的最小外包围盒。
关键点定位模块25,被配置为对区域检测模块23输出的14x14的对象特征图连续进行4次3x3卷积特征变换,输出同等大小的新的特征图,然后进行一次反卷积使得特征图大小为28x28,最后应用通道变换和sigmoid激活,使得通道个数为22,即关键点个数,每个通道对应一张关键点热图。
基于图像训练集对初始卷积神经网络进行迭代训练,直至损失函数满足收敛条件,得到训练后的第一神经网络是指将图像训练集输入初始卷积神经网络进行迭代训练,通过前向传导、利用标注信息和损失函数来计算代价、通过反向传播损失函数梯度更新每一层中的参数,以调整初始卷积神经网络的权重,直至初始卷积神经网络的损失函数满足收敛条件,得到训练后的第一神经网络。
在上述实施方式中,通过获取包含有多个服饰的训练图像的图像训练集作为基础,对初始的卷积神经网络进行迭代训练,构造用于对目标图像进行服饰识别的训练后的第一神经网络,训练方式简单,解决了服饰识别训练样本少和运算慢的问题。
在一实施方式中,基于图像训练集对初始的卷积神经网络进行迭代训练之前,还包括:
对原始图像进行图像增广得到对应的增广图像,图像训练集还包括增广图像。
进一步地,对原始图像进行图像增广得到对应的增广图像,包括:
对原始图像分别进行图像水平平移、图像竖直平移、颜色扰动和/或图像旋转,得到对应的增广图像。
这里,对原始图像进行图像增广得到对应的增广图像是指在不改变原始图像类别的情况下,增加数据量。原始图像的增广包括很多,从几何角度来看,有水平平移、竖直平移和图像旋转,从像素变换来看,有颜色扰动。
在上述实施方式中,通过不同的方式对原始图像实现图像增广,得到增广图像,从而扩充了图像训练集的样本,增加了数据量,如此,在通过图像训练集对初始卷积神经网络进行迭代训练时,能大大提高模型的泛化能力,能够更加精确地识别服饰。
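上述平移与颜色扰动两种增广方式可以用如下纯Python示意:图像以二维灰度列表表示,平移越界处补0,颜色扰动以乘性系数近似(均为示意性假设,实际实现通常借助图像处理库):

```python
def shift_horizontal(img, dx):
    """图像水平平移 dx 像素,空出的位置补 0。"""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            nj = j + dx
            if 0 <= nj < w:
                out[i][nj] = img[i][j]
    return out

def color_jitter(img, scale):
    """颜色扰动:按系数缩放像素值并截断到 [0, 255]。"""
    return [[min(255, int(v * scale)) for v in row] for row in img]

img = [[10, 20, 30],
       [40, 50, 60]]
shifted = shift_horizontal(img, 1)   # 右移一列
jittered = color_jitter(img, 1.5)
```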
在一实施方式中,得到训练后的第一神经网络之后,包括:
获取训练样本集,训练样本集包括第一神经网络输出的训练图像包含的关键特征点对应的热图集;
基于训练样本集以及形状约束条件集对初始的双向循环卷积神经网络进行迭代训练,直至损失函数满足收敛条件,得到训练后的第二神经网络。
这里,损失函数(loss function)也叫代价函数(cost function),是神经网络优化的目标函数,神经网络训练或者优化的过程就是最小化损失函数的过程,损失函数值越小,对应预测的结果和真实结果的值就越接近。
这里,参见图4,形状约束条件集是对服饰关键点特有的形变结构进行建模得到的,节点编号说明见表1。这些形状结构的设计符合服装设计特点以及人体动力学特性,它既能综合骨骼系统中关节模型的优点也能更充分地建模服饰关键点的局部形变约束关系。
表1
(表1节点编号说明,原文以图像形式给出)
进一步地,形状约束条件集包括以三角形子结构和/或四边形子结构分别表征多个关键特征点之间约束关系所形成的形状约束条件。
这里,再次参见图4,形状约束条件集多为以三角形和/或四边形子结构分别表征多个关键特征点之间的约束关系,二者分别具有不完全稳定性和完全稳定性的特点,这种不完全与完全稳定性间的组合使得整体的全局结构具有很大的形变灵活性,能够充分建模全局的松散约束关系;另外,每个形状约束条件还具有对称性的特点,这种设计挖掘了服饰结构特点;再者,不同形状约束条件之间也具有局部区域对称特点,如左袖和右袖,但这种特点较弱,因为它们并没有完全连接到一起,而是通过不同的形状约束条件进行传递实施的;最后,不同形状约束条件集之间还能建模人体特有的拓扑约束关系,如肩部、胸部、腹部等拓扑关系。因此,单个形状约束条件可以充分建模局部形变约束,不同形状约束条件集又可以实施全局的松散约束关系,综合在一起使得整个设计具有全局优化的优势。
这里,双向循环卷积神经网络RNN是初始的神经网络模型,主要是基于双向循环卷积神经网络可以更方便地从数据中捕捉序列之间的依赖关系,而不像条件随机场中需要人工设计关键点之间的依赖模式或配置兼容函数。
基于训练样本集以及形状约束条件集对初始的双向循环卷积神经网络进行迭代训练,直至损失函数满足收敛条件,得到训练后的第二神经网络是指将训练样本集以及形状约束条件集输入双向循环卷积神经网络中进行迭代训练,通过前向传导、利用标注信息和损失函数来计算代价、通过反向传播损失函数梯度更新每一层中的参数,以调整双向循环卷积神经网络的权重,直至双向循环卷积神经网络的损失函数满足收敛条件,得到训练后的第二神经网络。
在上述实施方式中,基于训练样本集以及形状约束条件集对初始的双向循环卷积神经网络进行迭代训练,直至损失函数满足收敛条件,得到训练后的第二神经网络,训练方式简单,提高了服饰识别的精度和速度。
在一实施方式中,通过训练后的第二神经网络基于目标图像对应的形状约束条件对热图集进行处理之前,还包括:
根据目标图像包含的关键特征点对应的热图集,确定目标图像对应的形状约束条件,目标图像包括至少一个形状约束条件。
根据目标图像包含的关键特征点对应的热图集,确定目标图像对应的形状约束条件,是指对从第一神经网络中输出的目标图像所确定的关键特征点,再从形状约束条件集中找出该关键特征点对应的形状约束条件。例如,确定目标图像包含关键特征点5,则参考图4可知,其包含的形状约束条件包括形状约束条件5-6-13-14、形状约束条件1-3-5、形状约束条件1-2-5-6、形状约束条件22-5-6、形状约束条件5-6-11-12、形状约束条件3-5-7-9。
如此,根据目标图像包含的关键特征点确定对应的形状约束条件,有效地捕捉了服饰局部区域之间的松散约束特性。
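上文由关键特征点5查找其所属形状约束条件的过程,可用如下纯Python示意(约束以关键点编号元组表示,取自上文示例;对照项2-4-6为假设添加):

```python
# 形状约束条件集合(元组即构成该约束的关键点编号)
SHAPE_CONSTRAINTS = [
    (5, 6, 13, 14),
    (1, 3, 5),
    (1, 2, 5, 6),
    (22, 5, 6),
    (5, 6, 11, 12),
    (3, 5, 7, 9),
    (2, 4, 6),          # 不含关键点5的对照约束(假设示例)
]

def constraints_containing(keypoint, constraints=SHAPE_CONSTRAINTS):
    """返回包含给定关键特征点的全部形状约束条件。"""
    return [c for c in constraints if keypoint in c]
```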
在一实施方式中,基于训练样本集以及形状约束条件集对初始的双向循环卷积神经网络进行迭代训练,包括:
将训练样本集以及形状约束条件集输入初始的双向循环卷积神经网络;
双向循环卷积神经网络分别以设置顺序将形状约束条件集中每一形状约束条件输入前向层、以及以设置逆向顺序将形状约束条件集中每一形状约束条件输入后向层进行迭代训练。
再次参阅图4,形状约束条件集可以分为多个形状约束条件组,如A组、B组、C组、D组、E组、F组,其中,每个形状约束条件组包括多个形状约束条件,对于形状约束条件组具有关键特征点重合的形状约束条件组可以继续细分为多个形状约束条件,使得形状约束条件组内无关键特征点重复,同时兼顾对称性要求,如A组内1-3-5,2-4-6两个形状约束条件可以放到A组的一个子组内,剩余的三个形状约束条件各自一子组。
双向循环卷积神经网络分别以设置顺序将形状约束条件集中每一形状约束条件输入前向层、以及以设置逆向顺序将形状约束条件集中每一形状约束条件输入后向层进行迭代训练。参见一具体实施方式:以设置顺序将形状约束条件集中每一形状约束条件输入前向层,是指以A-B-C-D-E-F为序输入前向层,每组内如果含有子组,则随机选取一个子组优化,直到子组优化完成,再继续优化下一个组,这里注意需要维持一个全局的形状约束条件拆分列表,以防止选择相同的关键点作为拆分节点。以设置逆向顺序将形状约束条件集中每一形状约束条件输入后向层,是指以组F-E-D-C-B-A为序逆向再次优化各组,进行消息传播,依次迭代数个回合,最终使得依赖约束得到全局传播,最后获得经局部形状约束条件集约束后的新的关键特征点热图。这里,新的关键点热图是指目标图像中包含的关键特征点的位置概率信息。这些热图充分集成了似然和先验知识,具有精细的空间定位;输入是初步预测的关键特征点热图集,即包含的每一关键特征点对应的位置概率热图,经过在RNN形状约束集合上依据上述算法反复迭代后,输出新的关键特征点热图集。
在一实施方式中,双向循环卷积神经网络分别以设置顺序将形状约束条件集中每一形状约束条件输入前向层、以及以设置逆向顺序将形状约束条件集中每一形状约束条件输入后向层进行迭代训练,包括:
在一次迭代中,双向循环卷积神经网络根据对应的形状约束条件包含的关键特征点之间的约束关系,设置关键特征点的连接顺序;
以训练图像对应的热图集中包含的一关键特征点为起点,将每一关键特征点对应的位置概率热图按连接顺序以及连接顺序的逆向顺序同时输入前向层和后向层进行迭代训练。
这里,对于每个形状约束条件,它表达了局部关键特征点之间的形状约束特性,是一种环形结构,直接使用RNN无法建模这种带环依赖,因此将每个形状约束条件拆分成两个RNN来分别建模。如图5所示,双向循环卷积神经网络分别以设置的连接顺序(5-6-14-13)将形状约束条件集中每一形状约束条件输入前向层、以及以连接顺序的逆向顺序(5-13-14-6)将形状约束条件集中每一形状约束条件输入后向层进行迭代训练,是指初始时在每个形状约束条件中随机选择一个节点作为起始节点,分别按照顺时针和逆时针分解成两个链条,每个链条对应一个双向RNN。这里,为了促进依赖的全局传播,整体形状约束条件集中的初始点选择不应该重复,如此防止了局部节点难以学习到充分的约束,进而导致全局消息传播失败。
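上述将环形约束从起始节点拆分为顺时针与逆时针两条链的做法,可示意如下(纯Python,以文中的约束5-6-14-13为例):

```python
def split_ring(ring, start):
    """从 start 节点把环形约束拆成顺、逆时针两条链。"""
    i = ring.index(start)
    rotated = ring[i:] + ring[:i]               # 以 start 为首重排环
    clockwise = rotated                          # 顺时针链
    counterclockwise = [rotated[0]] + rotated[:0:-1]  # 逆时针链
    return clockwise, counterclockwise

cw, ccw = split_ring([5, 6, 14, 13], start=5)
```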
在一实施方式中,通过训练后的第二神经网络基于目标图像对应的形状约束条件对热图集进行处理,确定目标图像中包含的关键特征点的位置概率信息,包括:
第二神经网络根据目标图像对应的形状约束条件调整热图集的显示参数;
根据显示参数需要满足的条件,对热图集的显示参数进行调整得到对应的目标参数,根据目标参数得到目标图像中包含的关键特征点的位置概率信息。
热图集包括目标图像中包含的每一关键特征点对应的位置概率热图;以RNN为第二神经网络为例,每个R节点为双向RNN节点,它从上下节点中分别接收前向消息 $h_{i-1}^{f}$ 和后向消息 $h_{i+1}^{b}$,然后结合节点似然项 $x_i$,共同输出新的置信 $y_i$;这里 $h_i^{f}$、$h_i^{b}$ 分别为前向和后向传递的RNN依赖,$b^{f}$、$b^{b}$ 为偏移,起到取值范围对齐作用,更新的前后向RNN依赖分别为 $h_i^{f}$、$h_i^{b}$,具体规划参见如下公式(1)至公式(5):

$h_i^{f}=\tanh\left(W_x x_i+W_f h_{i-1}^{f}+b^{f}\right)$  (1)

$h_i^{b}=\tanh\left(W_x x_i+W_b h_{i+1}^{b}+b^{b}\right)$  (2)

$m_{i\to i+1}^{f}=h_i^{f}$  (3)

$m_{i\to i-1}^{b}=h_i^{b}$  (4)

$y_i=\sigma\left(x_i+h_i^{f}+h_i^{b}\right)$  (5)

这里,符号 $f$ 表示前向(forward):规定从节点 $i-1$ 到节点 $i$ 为前向;符号 $b$ 表示后向(backward):规定从节点 $i$ 到节点 $i-1$ 为后向;$x_i$ 表示输入的原关键点热图 $i$;$y_i$ 表示输出的新关键点热图 $i$;$h_i^{f}$ 代表了关键点热图 $i-1$ 与关键点热图 $i$ 之间的约束关系(条件概率分布),概率术语为置信;$h_i^{b}$ 代表了关键点热图 $i$ 与关键点热图 $i-1$ 之间的约束关系(条件概率分布),概率术语为置信;综合原关键点热图 $x_i$、前向关键点约束 $h_i^{f}$、后向关键点约束 $h_i^{b}$ 的信息之后得到新的关键点热图 $y_i$;$h_i^{f}$、$h_i^{b}$ 是内部状态,分别表达了前向、后向历史信息;$W_x$、$W_f$、$b^{f}$、$W_b$、$b^{b}$ 为待估计参数。
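上述双向消息传递的计算流程可以用如下极简的纯Python示意:用标量序列代替关键点热图,tanh与sigmoid的具体组合方式为假设的标准双向RNN形式,并非专利中的确切公式:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def bidirectional_pass(xs, w_x=0.5, w_f=0.5, w_b=0.5, b_f=0.0, b_b=0.0):
    """对似然序列 xs 做一次前向、后向状态传播,输出新的置信序列。"""
    n = len(xs)
    h_f = [0.0] * n
    h_b = [0.0] * n
    for i in range(n):                      # 前向:从节点 i-1 到节点 i
        prev = h_f[i - 1] if i > 0 else 0.0
        h_f[i] = math.tanh(w_x * xs[i] + w_f * prev + b_f)
    for i in range(n - 1, -1, -1):          # 后向:从节点 i+1 到节点 i
        nxt = h_b[i + 1] if i < n - 1 else 0.0
        h_b[i] = math.tanh(w_x * xs[i] + w_b * nxt + b_b)
    # 结合似然项与前、后向内部状态,得到新置信
    return [sigmoid(xs[i] + h_f[i] + h_b[i]) for i in range(n)]

ys = bidirectional_pass([0.2, 0.9, 0.1])
```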
这里,作为一具体实施方式,再次参见图4中D组形状约束条件,关键特征点5、关键特征点6、关键特征点14分别对应关键特征点热图i-1、关键特征点热图i,关键特征点热图i+1。通过训练后的第二神经网络对关键特征点6进行优化,我们需要利用先验知识即关键特征点5与关键特征点6之间的前向约束和关键特征点6与关键特征点14之间的后向约束关系信息,以及似然信息即第一神经网络输出的原关键特征点6,综合这三种信息获取后验分布即目标图像中包含的关键特征点6。如此,能够输出精细的、符合服饰局部形状约束的关键点热图集合。
在另一实施方式中,如图6所示,还提供了一种服饰分类方法,通过服饰分类装置执行服饰分类方法,服饰分类方法包括;
步骤S201:基于目标图像中包含的关键特征点的位置概率信息确定服饰的类别属性;类别属性包括以下其中之一:形状、版型和风格;
步骤S202:基于类别属性确定对应的服饰类别。
这里,根据目标图像的关键特征点的位置概率信息实现对服饰的精确识别,确定服饰对应的类别属性,例如以类别属性为形状为例,确定服饰为裤子,则将其分类到裤子类别。如此,对于每一目标图像按照形状、版型和风格进行分类。例如,用户进行网络购物时,可以对商户提供的图片进行分类,用户在输入关键字“裤子”时,将属于该服饰类别的所有目标图像集展现出来,为购物提供了更加方便快捷的体验;进一步地,对服饰分类,例如按照版型将裤子分成小腿裤、直筒裤、阔腿裤,按照风格将裙子分成短裙、超短裙、过膝裙、长裙,如此不仅丰富了分类选项,也为服装设计以及智能穿搭领域提供了更多的素材,还可以更加广泛地应用于网络购物、智能穿搭、服装设计等领域。
在另一实施方式中,如图7所示,还提供了一种服饰检索方法,通过服饰检索装置执行服饰检索方法,服饰检索方法包括;
步骤S301:基于目标图像中包含的关键特征点的位置概率信息确定服饰的类别属性;类别属性包括以下其中之一:形状、版型和风格;
步骤S302:基于类别属性确定对应的检索要素;其中,检索要素包括检索关键字和/或图像;
步骤S303:基于检索要素检索与目标图像对应的服饰图像集。
这里,根据目标图像的关键特征点的位置概率信息实现对服饰的精确识别,确定服饰对应的类别属性,例如以类别属性为形状为例,确定服饰为裤子,则将其分类到裤子类别。
基于类别属性确定对应的检索要素,是指确定服饰为裤子后,生成的检索要素为检索关键字“裤子”和/或与裤子对应的“裤子图像”,然后与图像特征池中存储的图像特征进行匹配查询。这里可以根据查询到的服饰结果,将查询到的相似服饰图片及相应的服饰信息通过移动终端展示给用户,服饰信息包括服饰的品牌、价格和材质。如此,基于目标图像的服饰检索,有效地解决了用户不能或不愿用文字进行搜索的困难,降低了购物难度;可视化的特点满足了用户对检索功能的更高需求,方便用户快速找到服饰信息,极大地提升了用户体验。同时,类别属性按照版型将裤子分成小腿裤、直筒裤、阔腿裤,按照风格将裙子分成短裙、超短裙、过膝裙、长裙,如此在对服饰进行检索时更加细致,用户的检索难度更小,筛选结果也更加精确;进一步地,为服装设计以及智能穿搭领域提供了更多的素材,也可以更加广泛地应用于网络购物、智能穿搭、服装设计等领域。
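上述与图像特征池进行匹配查询的步骤,常见做法之一是按特征向量相似度排序并返回最相近的若干服饰图像;下面用余弦相似度给出一个纯Python示意(特征向量与条目名称均为虚构示例,并非专利限定的匹配方式):

```python
import math

def cosine(a, b):
    """两个特征向量的余弦相似度。"""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def search(query_feat, feature_pool, top_k=2):
    """按与查询特征的余弦相似度从高到低返回前 top_k 个条目名。"""
    ranked = sorted(feature_pool.items(),
                    key=lambda kv: cosine(query_feat, kv[1]),
                    reverse=True)
    return [name for name, _ in ranked[:top_k]]

# 虚构的图像特征池:条目名 -> 特征向量
pool = {
    "pants_a": [1.0, 0.1, 0.0],
    "skirt_b": [0.0, 1.0, 0.2],
    "pants_c": [0.9, 0.2, 0.1],
}
hits = search([1.0, 0.0, 0.0], pool)
```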
在另一实施方式中,如图8所示,还提供了一种服饰识别装置,包括:
第一采集子系统31,被配置为获取包含待识别服饰的目标图像;
第一图像处理子系统32,被配置为通过训练后的第一神经网络对目标图像进行处理,确定目标图像包含的关键特征点对应的热图集,热图集包括目标图像中包含的每一关键特征点对应的位置概率热图;以及通过训练后的第二神经网络基于目标图像对应的形状约束条件对热图集进行处理,确定目标图像中包含的关键特征点的位置概率信息。
在本发明上述实施方式中,获取包含待识别服饰的目标图像,通过训练后的第一神经网络对目标图像进行处理,确定目标图像包含的关键特征点对应的热图集,热图集包括目标图像中包含的每一关键特征点对应的位置概率热图;如此,获得待识别服饰的目标图像中每一关键特征点的初始位置信息,得到每一关键特征点对应的初始位置概率热图;通过训练后的第二神经网络基于目标图像对应的形状约束条件对热图集进行处理,确定目标图像中包含的关键特征点的位置概率信息。如此,基于与服饰局部对应的形状约束条件对热图集进行处理,能够优化对待识别服饰包含的关键特征点的位置概率的精确识别,根据确定的待识别服饰的关键特征点的位置概率信息实现对服饰的精确识别,通过获取服饰的关键特征点能够更加广泛的适用于在网络购物、智能穿搭、服装设计等领域的应用。
可选地,还包括:网络训练子系统33,被配置为获取包含有多个服饰的训练图像的图像训练集,训练图像包括携带有关键特征点标注信息的原始图像;基于图像训练集对初始的卷积神经网络进行迭代训练,直至损失函数满足收敛条件,得到训练后的第一神经网络。
可选地,网络训练子系统33,还被配置为对原始图像进行图像增广得到对应的增广图像,图像训练集还包括增广图像。
可选地,网络训练子系统33,还被配置为对原始图像分别进行图像水平平移、图像竖直平移、颜色扰动和/或图像旋转,得到对应的增广图像。
可选地,网络训练子系统33,还被配置为获取训练样本集,训练样本集包括第一神经网络输出的训练图像包含的关键特征点对应的热图集;基于训练样本集以及形状约束条件集对初始的双向循环卷积神经网络进行迭代训练,直至损失函数满足收敛条件,得到训练后的第二神经网络。
可选地,网络训练子系统33,还被配置为将训练样本集以及形状约束条件集输入初始的双向循环卷积神经网络;双向循环卷积神经网络分别以设置顺序将形状约束条件集中每一形状约束条件输入前向层、以及以设置逆向顺序将形状约束条件集中每一形状约束条件输入后向层进行迭代训练。
可选地,网络训练子系统33,还被配置为在一次迭代中,双向循环卷积神经网络根据对应的形状约束条件包含的关键特征点之间的约束关系,设置关键特征点的连接顺序;以训练图像对应的热图集中包含的一关键特征点为起点,将每一关键特征点对应的位置概率热图按连接顺序以及连接顺序的逆向顺序同时输入前向层和后向层进行迭代训练。
需要说明的是,在实际应用中,第一采集子系统31可由摄像头或具有绘图功能的终端实现,第一图像处理子系统32可由图像处理器或服务器实现,网络训练子系统33可由处理器或服务器实现;图8中没有示出摄像头、具有绘图功能的终端、图像处理器、服务器和处理器;其中,图像处理器或处理器具体为中央处理器(CPU,Central Processing Unit)、微处理器(MPU,Microprocessor Unit)、数字信号处理器(DSP,Digital Signal Processing)或现场可编程门阵列(FPGA,Field Programmable Gate Array)。
在另一实施方式中,如图9所示,还提供了一种服饰分类装置,包括:
第二采集子系统41,被配置为获取包含待识别服饰的目标图像;
第二图像处理子系统42,被配置为基于目标图像确定目标图像包含的关键特征点对应的热图集,热图集包括目标图像中包含的每一关键特征点对应的位置概率热图;以及基于目标图像对应的形状约束条件对热图集进行处理,确定目标图像中包含的关键特征点的位置概率信息;
分类子系统43,被配置为基于目标图像中包含的关键特征点的位置概率信息确定服饰的类别属性;类别属性包括以下其中之一:形状、版型和风格;以及基于类别属性确定对应的服饰类别。
需要说明的是,在实际应用中,第二采集子系统41可由摄像头或具有绘图功能的终端实现,第二图像处理子系统42可由图像处理器或服务器实现,分类子系统43可由处理器或服务器实现;图9中没有示出摄像头、具有绘图功能的终端、图像处理器、服务器和处理器。
在另一实施方式中,如图10所示,还提供了一种服饰检索装置,包括:
第三采集子系统51,被配置为获取包含待识别服饰的目标图像;
第三图像处理子系统52,被配置为基于目标图像确定目标图像包含的关键特征点对应的热图集,热图集包括目标图像中包含的每一关键特征点对应的位置概率热图;以及基于目标图像对应的形状约束条件对热图集进行处理,确定目标图像中包含的关键特征点的位置概率信息;
检索要素确定子系统53,被配置为基于目标图像中包含的关键特征点的位置概率信息确定服饰的类别属性;类别属性包括以下其中之一:形状、版型和风格;以及基于类别属性确定对应的检索要素;其中,检索要素包括检索关键字和/或图像;
检索子系统54,被配置为基于检索要素检索与目标图像对应的服饰图像集。
需要说明的是,在实际应用中,第三采集子系统51可由摄像头或具有绘图功能的终端实现,第三图像处理子系统52可由图像处理器或服务器实现,检索要素确定子系统53可由处理器或服务器实现,检索子系统54可由搜索引擎设备实现;图10中没有示出摄像头、具有绘图功能的终端、图像处理器、服务器、处理器和搜索引擎设备。
在另一实施方式中,如图11所示,还提供了一种计算机设备,包括:至少一个处理器210和用于存储能够在处理器210上运行的计算机程序的存储器211;其中,图11中示意的处理器210并非用于指代处理器的个数为一个,而是仅用于指代处理器相对其他器件的位置关系,在实际应用中,处理器的个数可以为一个或多个;同样,图11中示意的存储器211也是同样的含义,即仅用于指代存储器相对其他器件的位置关系,在实际应用中,存储器的个数可以为一个或多个。
其中,处理器210用于运行计算机程序时,执行如下步骤:
获取包含待识别服饰的目标图像,通过训练后的第一神经网络对目标图像进行处理,确定目标图像包含的关键特征点对应的热图集,热图集包括目标图像中包含的每一关键特征点对应的位置概率热图;
通过训练后的第二神经网络基于目标图像对应的形状约束条件对热图集进行处理,确定目标图像中包含的关键特征点的位置概率信息。
在一个可选的实施例中,处理器210还用于运行计算机程序时,执行如下步骤:
获取包含有多个服饰的训练图像的图像训练集,训练图像包括携带有关键特征点标注信息的原始图像;
基于图像训练集对初始的卷积神经网络进行迭代训练,直至损失函数满足收敛条件,得到训练后的第一神经网络。
在一个可选的实施例中,处理器210还用于运行计算机程序时,执行如下步骤:
对原始图像进行图像增广得到对应的增广图像,图像训练集还包括增广图像。
在一个可选的实施例中,处理器210还用于运行计算机程序时,执行如下步骤:
对原始图像分别进行图像水平平移、图像竖直平移、颜色扰动和/或图像旋转,得到对应的增广图像。
在一个可选的实施例中,处理器210还用于运行计算机程序时,执行如下步骤:
获取训练样本集,训练样本集包括第一神经网络输出的训练图像包含的关键特征点对应的热图集;
基于训练样本集以及形状约束条件集对初始的双向循环卷积神经网络进行迭代训练,直至损失函数满足收敛条件,得到训练后的第二神经网络。
在一个可选的实施例中,处理器210还用于运行计算机程序时,执行如下步骤:
形状约束条件集包括以三角形子结构和/或四边形子结构分别表征多个关键特征点之间约束关系所形成的形状约束条件。
在一个可选的实施例中,处理器210还用于运行计算机程序时,执行如下步骤:
根据目标图像包含的关键特征点对应的热图集,确定目标图像对应的形状约束条件,目标图像包括至少一个形状约束条件。
在一个可选的实施例中,处理器210还用于运行计算机程序时,执行如下步骤:
将训练样本集以及形状约束条件集输入初始的双向循环卷积神经网络;
双向循环卷积神经网络分别以设置顺序将形状约束条件集中每一形状约束条件输入前向层、以及以设置逆向顺序将形状约束条件集中每一形状约束条件输入后向层进行迭代训练。
在一个可选的实施例中,处理器210还用于运行计算机程序时,执行如下步骤:
在一次迭代中,双向循环卷积神经网络根据对应的形状约束条件包含的关键特征点之间的约束关系,设置关键特征点的连接顺序;
以训练图像对应的热图集中包含的一关键特征点为起点,将每一关键特征点对应的位置概率热图按连接顺序以及连接顺序的逆向顺序同时输入前向层和后向层进行迭代训练。
在一个可选的实施例中,处理器210还用于运行计算机程序时,执行如下步骤:
第二神经网络根据目标图像对应的形状约束条件调整热图集的显示参数;
根据显示参数需要满足的条件,对热图集的显示参数进行调整得到对应的目标参数,根据目标参数得到目标图像中包含的关键特征点的位置概率信息。
该服饰识别装置还包括:至少一个网络接口212。发送端中的各个组件通过总线系统213耦合在一起。可理解,总线系统213用于实现这些组件之间的连接通信。总线系统213除包括数据总线之外,还包括电源总线、控制总线和状态信号总线。但是为了清楚说明起见,在图11中将各种总线都标为总线系统213。
其中,存储器211可以是易失性存储器或非易失性存储器,也可包括易失性和非易失性存储器两者。其中,非易失性存储器可以是只读存储器(ROM,Read Only Memory)、可编程只读存储器(PROM,Programmable Read-Only Memory)、可擦除可编程只读存储器(EPROM,Erasable Programmable Read-Only Memory)、电可擦除可编程只读存储器(EEPROM,Electrically Erasable Programmable Read-Only Memory)、磁性随机存取存储器(FRAM,ferromagnetic random access memory)、快闪存储器(Flash Memory)、磁表面存储器、光盘、或只读光盘(CD-ROM,Compact Disc Read-Only Memory);磁表面存储器可以是磁盘存储器或磁带存储器。易失性存储器可以是随机存取存储器(RAM,Random Access Memory),其用作外部高速缓存。通过示例性但不是限制性说明,许多形式的RAM可用,例如静态随机存取存储器(SRAM,Static Random Access Memory)、同步静态随机存取存储器(SSRAM,Synchronous Static Random Access Memory)、动态随机存取存储器(DRAM,Dynamic Random Access Memory)、同步动态随机存取存储器(SDRAM,Synchronous Dynamic Random Access Memory)、双倍数据速率同步动态随机存取存储器(DDRSDRAM,Double Data Rate Synchronous Dynamic Random Access Memory)、增强型同步动态随机存取存储器(ESDRAM,Enhanced Synchronous Dynamic Random Access Memory)、同步连接动态随机存取存储器(SLDRAM,SyncLink Dynamic Random Access Memory)、直接内存总线随机存取存储器(DRRAM,Direct Rambus Random Access Memory)。本发明实施例描述的存储器211旨在包括但不限于这些和任意其它适合类型的存储器。
本发明实施例中的存储器211用于存储各种类型的数据以支持发送端的操作。这些数据的示例包括:用于在发送端上操作的任何计算机程序,如操作系统和应用程序。其中,操作系统包含各种系统程序,例如框架层、核心库层、驱动层等,用于实现各种基础业务以及处理基于硬件的任务。应用程序可以包含各种应用程序,用于实现各种应用业务。这里,实现本发明实施例方法的程序可以包含在应用程序中。
本实施例还提供了一种计算机存储介质,例如包括存储有计算机程序的存储器211,上述计算机程序可由发送端中的处理器210执行,以完成前述方法所述步骤。计算机存储介质可以是FRAM、ROM、PROM、EPROM、EEPROM、Flash Memory、磁表面存储器、光盘、或CD-ROM等存储器;也可以是包括上述存储器之一或任意组合的各种设备,如智能手机、平板电脑、笔记本电脑等。一种计算机存储介质,计算机存储介质中存储有计算机程序,计算机程序被处理器运行时,执行如下步骤:
获取包含待识别服饰的目标图像,通过训练后的第一神经网络对目标图像进行处理,确定目标图像包含的关键特征点对应的热图集,热图集包括目标图像中包含的每一关键特征点对应的位置概率热图;
通过训练后的第二神经网络基于目标图像对应的形状约束条件对热图集进行处理,确定目标图像中包含的关键特征点的位置概率信息。
在一个可选的实施例中,计算机程序被处理器运行时,还执行如下步骤:
获取包含有多个服饰的训练图像的图像训练集,训练图像包括携带有关键特征点标注信息的原始图像;
基于图像训练集对初始的卷积神经网络进行迭代训练,直至损失函数满足收敛条件,得到训练后的第一神经网络。
在一个可选的实施例中,计算机程序被处理器运行时,还执行如下步骤:
对原始图像进行图像增广得到对应的增广图像,图像训练集还包括增广图像。
在一个可选的实施例中,计算机程序被处理器运行时,还执行如下步骤:
对原始图像分别进行图像水平平移、图像竖直平移、颜色扰动和/或图像旋转,得到对应的增广图像。
在一个可选的实施例中,计算机程序被处理器运行时,还执行如下步骤:
获取训练样本集,训练样本集包括第一神经网络输出的训练图像包含的关键特征点对应的热图集;
基于训练样本集以及形状约束条件集对初始的双向循环卷积神经网络进行迭代训练,直至损失函数满足收敛条件,得到训练后的第二神经网络。
在一个可选的实施例中,计算机程序被处理器运行时,还执行如下步骤:
形状约束条件集包括以三角形子结构和/或四边形子结构分别表征多个关键特征点之间约束关系所形成的形状约束条件。
在一个可选的实施例中,计算机程序被处理器运行时,还执行如下步骤:
根据目标图像包含的关键特征点对应的热图集,确定目标图像对应的形状约束条件,目标图像包括至少一个形状约束条件。
在一个可选的实施例中,计算机程序被处理器运行时,还执行如下步骤:
将训练样本集以及形状约束条件集输入初始的双向循环卷积神经网络;
双向循环卷积神经网络分别以设置顺序将形状约束条件集中每一形状约束条件输入前向层、以及以设置逆向顺序将形状约束条件集中每一形状约束条件输入后向层进行迭代训练。
在一个可选的实施例中,计算机程序被处理器运行时,还执行如下步骤:
在一次迭代中,双向循环卷积神经网络根据对应的形状约束条件包含的关键特征点之间的约束关系,设置关键特征点的连接顺序;
以训练图像对应的热图集中包含的一关键特征点为起点,将每一关键特征点对应的位置概率热图按连接顺序以及连接顺序的逆向顺序同时输入前向层和后向层进行迭代训练。
在一个可选的实施例中,计算机程序被处理器运行时,还执行如下步骤:
第二神经网络根据目标图像对应的形状约束条件调整热图集的显示参数;
根据显示参数需要满足的条件,对热图集的显示参数进行调整得到对应的目标参数,根据目标参数得到目标图像中包含的关键特征点的位置概率信息。
以上所述,仅为本发明的较佳实施例而已,并非用于限定本发明的保护范围。
工业实用性
本发明实施例中,获取包含待识别服饰的目标图像,基于所述目标图像确定所述目标图像包含的关键特征点对应的热图集,所述热图集包括所述目标图像中包含的每一关键特征点对应的位置概率热图;如此,获得待识别服饰的目标图像中每一关键特征点的初始位置信息,得到每一关键特征点对应的初始位置概率热图;基于所述目标图像对应的形状约束条件对所述热图集进行处理,确定所述目标图像中包含的关键特征点的位置概率信息。如此,基于与服饰局部对应的形状约束条件对热图集进行处理,能够优化对待识别服饰包含的关键特征点的位置概率的精确识别,根据所述确定的待识别服饰的关键特征点的位置概率信息实现对服饰的精确识别,通过获取所述服饰的关键特征点能够更加广泛的适用于在网络购物、智能穿搭、服装设计等领域的应用。

Claims (18)

  1. 一种服饰识别方法,其中,所述方法包括:
    获取包含待识别服饰的目标图像,基于所述目标图像确定所述目标图像包含的关键特征点对应的热图集,所述热图集包括所述目标图像中包含的每一关键特征点对应的位置概率热图;
    基于所述目标图像对应的形状约束条件对所述热图集进行处理,确定所述目标图像中包含的关键特征点的位置概率信息。
  2. 如权利要求1所述的服饰识别方法,其中,所述基于所述目标图像确定所述目标图像包含的关键特征点对应的热图集,包括:
    通过训练后的第一神经网络对所述目标图像进行处理,确定所述目标图像包含的关键特征点对应的热图集;
    所述基于所述目标图像对应的形状约束条件对所述热图集进行处理,确定所述目标图像中包含的关键特征点的位置概率信息,包括:
    通过训练后的第二神经网络基于所述目标图像对应的形状约束条件对所述热图集进行处理,确定所述目标图像中包含的关键特征点的位置概率信息。
  3. 如权利要求2所述的服饰识别方法,其中,所述获取包含待识别服饰的目标图像之前,包括:
    获取包含有多个服饰的训练图像的图像训练集,所述训练图像包括携带有关键特征点标注信息的原始图像;
    基于所述图像训练集对初始的卷积神经网络进行迭代训练,直至损失函数满足收敛条件,得到所述训练后的第一神经网络。
  4. 如权利要求3所述的服饰识别方法,其中,所述基于所述图像训练集对初始的卷积神经网络进行迭代训练之前,还包括:
    对所述原始图像进行图像增广得到对应的增广图像,所述图像训练集还包括所述增广图像。
  5. 如权利要求4所述的服饰识别方法,其中,所述对所述原始图像进行图像增广得到对应的增广图像,包括:
    对所述原始图像分别进行图像水平平移、图像竖直平移、颜色扰动和/或图像旋转,得到对应的增广图像。
  6. 如权利要求3所述的服饰识别方法,其中,所述得到所述训练后的第一神经网络之后,包括:
    获取训练样本集,所述训练样本集包括所述第一神经网络输出的所述训练图像包含的关键特征点对应的热图集;
    基于所述训练样本集以及形状约束条件集对初始的双向循环卷积神经网络进行迭代训练,直至损失函数满足收敛条件,得到所述训练后的第二神经网络。
  7. 如权利要求6所述的服饰识别方法,其中,所述形状约束条件集包括以三角形子结构和/或四边形子结构分别表征多个关键特征点之间约束关系所形成的形状约束条件。
  8. 如权利要求6所述的服饰识别方法,其中,所述通过训练后的第二神经网络基于所述目标图像对应的形状约束条件对所述热图集进行处理之前,还包括:
    根据所述目标图像包含的关键特征点对应的热图集,确定所述目标图像对应的形状约束条件,所述目标图像包括至少一个形状约束条件。
  9. 如权利要求6所述的服饰识别方法,其中,所述基于所述训练样本集以及形状约束条件集对初始的双向循环卷积神经网络进行迭代训练,包括:
    将所述训练样本集以及形状约束条件集输入初始的双向循环卷积神经网络;
    所述双向循环卷积神经网络分别以设置顺序将所述形状约束条件集中每一所述形状约束条件输入前向层、以及以设置逆向顺序将所述形状约束条件集中每一所述形状约束条件输入后向层进行迭代训练。
  10. 如权利要求9所述的服饰识别方法,其中,所述双向循环卷积神经网络分别以设置顺序将所述形状约束条件集中每一所述形状约束条件输入前向层、以及以设置逆向顺序将所述形状约束条件集中每一所述形状约束条件输入后向层进行迭代训练,包括:
    在一次迭代中,所述双向循环卷积神经网络根据对应的形状约束条件包含的关键特征点之间的约束关系,设置所述关键特征点的连接顺序;
    以所述训练图像对应的热图集中包含的一关键特征点为起点,将每一所述关键特征点对应的位置概率热图按所述连接顺序以及所述连接顺序的逆向顺序同时输入前向层和后向层进行迭代训练。
  11. 如权利要求2所述的服饰识别方法,其中,所述通过训练后的第二神经网络基于所述目标图像对应的形状约束条件对所述热图集进行处理,确定所述目标图像中包含的关键特征点的位置概率信息,包括:
    所述第二神经网络根据所述目标图像对应的形状约束条件调整所述热图集的显示参数;
    根据所述显示参数需要满足的条件,对所述热图集的所述显示参数进行调整得到对应的目标参数,根据所述目标参数得到所述目标图像中包含的关键特征点的位置概率信息。
  12. 一种采用如权利要求1至11任一项所述的服饰识别方法实现的服饰分类方法,其中,包括:
    基于目标图像中包含的关键特征点的位置概率信息确定服饰的类别属性;所述类别属性包括以下其中之一:形状、版型和风格;
    基于所述类别属性确定对应的服饰类别。
  13. 一种采用如权利要求1至11任一项所述的服饰识别方法实现的服饰检索方法,其中,包括:
    基于目标图像中包含的关键特征点的位置概率信息确定服饰的类别属性;所述类别属性包括以下其中之一:形状、版型和风格;
    基于所述类别属性确定对应的检索要素;其中,所述检索要素包括检索关键字和/或图像;
    基于所述检索要素检索与所述目标图像对应的服饰图像集。
  14. 一种服饰识别装置,其中,包括:
    第一采集子系统,被配置为获取包含待识别服饰的目标图像;
    第一图像处理子系统,被配置为基于所述目标图像确定所述目标图像包含的关键特征点对应的热图集,所述热图集包括所述目标图像中包含的每一关键特征点对应的位置概率热图;以及基于所述目标图像对应的形状约束条件对所述热图集进行处理,确定所述目标图像中包含的关键特征点的位置概率信息。
  15. 一种服饰分类装置,其中,包括:
    第二采集子系统,被配置为获取包含待识别服饰的目标图像;
    第二图像处理子系统,被配置为基于所述目标图像确定所述目标图像包含的关键特征点对应的热图集,所述热图集包括所述目标图像中包含的每一关键特征点对应的位置概率热图;以及基于所述目标图像对应的形状约束条件对所述热图集进行处理,确定所述目标图像中包含的关键特征点的位置概率信息;
    分类子系统,被配置为基于目标图像中包含的关键特征点的位置概率信息确定服饰的类别属性;所述类别属性包括以下其中之一:形状、版型和风格;以及基于所述类别属性确定对应的服饰类别。
  16. 一种服饰检索装置,其中,包括:
    第三采集子系统,被配置为获取包含待识别服饰的目标图像;
    第三图像处理子系统,被配置为基于所述目标图像确定所述目标图像包含的关键特征点对应的热图集,所述热图集包括所述目标图像中包含的每一关键特征点对应的位置概率热图;以及基于所述目标图像对应的形状约束条件对所述热图集进行处理,确定所述目标图像中包含的关键特征点的位置概率信息;
    检索要素确定子系统,被配置为基于目标图像中包含的关键特征点的位置概率信息确定服饰的类别属性;所述类别属性包括以下其中之一:形状、版型和风格;以及基于所述类别属性确定对应的检索要素;其中,所述检索要素包括检索关键字和/或图像;
    检索子系统,被配置为基于所述检索要素检索与所述目标图像对应的服饰图像集。
  17. 一种计算机设备,其中,包括:处理器和用于存储能够在处理器上运行的计算机程序的存储器;
    其中,所述处理器用于运行所述计算机程序时,实现权利要求1至11任一项所述的服饰识别方法、或实现权利要求12所述的服饰分类方法、或实现权利要求13所述的服饰检索方法。
  18. 一种计算机存储介质,其中,所述计算机存储介质中存储有计算机程序,其中,所述计算机程序被处理器执行时实现权利要求1至11中任一项所述服饰识别方法、或实现权利要求12所述的服饰分类方法、或实现权利要求13所述的服饰检索方法。
PCT/CN2019/127660 2019-02-18 2019-12-23 服饰识别、分类及检索的方法、装置、设备及存储介质 WO2020168814A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US17/295,337 US11977604B2 (en) 2019-02-18 2019-12-23 Method, device and apparatus for recognizing, categorizing and searching for garment, and storage medium
EP19915612.6A EP3876110A4 (en) 2019-02-18 2019-12-23 METHOD, DEVICE AND EQUIPMENT FOR DETECTION, CATEGORIZATION AND SEARCH FOR AN ITEM OF CLOTHING AND STORAGE MEDIUM

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910123577.4 2019-02-18
CN201910123577.4A CN111581414B (zh) 2019-02-18 2019-02-18 服饰识别、分类及检索的方法、装置、设备及存储介质

Publications (1)

Publication Number Publication Date
WO2020168814A1 true WO2020168814A1 (zh) 2020-08-27

Family

ID=72112922

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/127660 WO2020168814A1 (zh) 2019-02-18 2019-12-23 服饰识别、分类及检索的方法、装置、设备及存储介质

Country Status (4)

Country Link
US (1) US11977604B2 (zh)
EP (1) EP3876110A4 (zh)
CN (1) CN111581414B (zh)
WO (1) WO2020168814A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116310406A (zh) * 2023-05-22 2023-06-23 浙江之科云创数字科技有限公司 一种图像检测的方法、装置、存储介质及电子设备

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102295202B1 (ko) * 2020-01-31 2021-08-27 중앙대학교 산학협력단 다중 객체 검출 방법 및 그 장치
US11473927B2 (en) * 2020-02-05 2022-10-18 Electronic Arts Inc. Generating positions of map items for placement on a virtual map
US11462033B2 (en) * 2020-09-30 2022-10-04 Wipro Limited Method and system for performing classification of real-time input sample using compressed classification model
CN112580652B (zh) * 2020-12-24 2024-04-09 咪咕文化科技有限公司 虚拟装饰方法、装置、电子设备及存储介质
CN112802108B (zh) * 2021-02-07 2024-03-15 上海商汤科技开发有限公司 目标对象定位方法、装置、电子设备及可读存储介质
CN116071359B (zh) * 2023-03-08 2023-06-23 中汽研新能源汽车检验中心(天津)有限公司 一种电池老化程度检测方法、电子设备及存储介质
US11804057B1 (en) * 2023-03-23 2023-10-31 Liquidx, Inc. Computer systems and computer-implemented methods utilizing a digital asset generation platform for classifying data structures

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108256431A (zh) * 2017-12-20 2018-07-06 中车工业研究院有限公司 一种手部位置标识方法及装置
CN108549844A (zh) * 2018-03-22 2018-09-18 华侨大学 一种基于多层分形网络和关节亲属模式的多人姿态估计方法
US10096122B1 (en) * 2017-03-28 2018-10-09 Amazon Technologies, Inc. Segmentation of object image data from background image data
CN108932495A (zh) * 2018-07-02 2018-12-04 大连理工大学 一种汽车前脸参数化模型全自动生成方法

Family Cites Families (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6022681B2 (ja) * 2012-06-08 2016-11-09 ナショナル ユニヴァーシティー オブ シンガポール オンラインストアにおける対話型衣料品検索
WO2017015947A1 (en) * 2015-07-30 2017-02-02 Xiaogang Wang A system and a method for object tracking
CN106874924B (zh) * 2015-12-14 2021-01-29 Alibaba Group Holding Limited Picture style recognition method and apparatus
CN105447529B (zh) * 2015-12-30 2020-11-03 SenseTime Group Limited Method and system for clothing detection and attribute value recognition
CN108229488B (zh) * 2016-12-27 2021-01-01 Beijing SenseTime Technology Development Co., Ltd. Method, apparatus and electronic device for detecting object key points
AU2018236433B2 (en) * 2017-03-17 2022-03-03 Magic Leap, Inc. Room layout estimation methods and techniques
US10943176B2 (en) * 2017-03-22 2021-03-09 Ebay Inc. Visual aspect localization presentation
CN108229496B (zh) * 2017-07-11 2021-07-06 Beijing SenseTime Technology Development Co., Ltd. Clothing key point detection method and apparatus, electronic device, storage medium and program
US10733431B2 (en) * 2017-12-03 2020-08-04 Facebook, Inc. Systems and methods for optimizing pose estimation
US10706262B2 (en) * 2018-01-08 2020-07-07 3DLOOK Inc. Intelligent body measurement
US11631193B1 (en) * 2018-06-15 2023-04-18 Bertec Corporation System for estimating a pose of one or more persons in a scene
CN109166147A (zh) * 2018-09-10 2019-01-08 Shenzhen Malong Technology Co., Ltd. Picture-based clothing size measurement method and apparatus
JP7209333B2 (ja) * 2018-09-10 2023-01-20 The University of Tokyo Joint position acquisition method and apparatus, and motion acquisition method and apparatus
CN109325952B (zh) * 2018-09-17 2022-07-08 Shanghai Baozun E-Commerce Co., Ltd. Deep learning-based fashion clothing image segmentation method
US11375176B2 (en) * 2019-02-05 2022-06-28 Nvidia Corporation Few-shot viewpoint estimation
US11948401B2 (en) * 2019-08-17 2024-04-02 Nightingale.ai Corp. AI-based physical function assessment system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3876110A4 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116310406A (zh) * 2023-05-22 2023-06-23 Zhejiang Zhike Yunchuang Digital Technology Co., Ltd. Image detection method, apparatus, storage medium and electronic device
CN116310406B (zh) * 2023-05-22 2023-08-11 Zhejiang Zhike Yunchuang Digital Technology Co., Ltd. Image detection method, apparatus, storage medium and electronic device

Also Published As

Publication number Publication date
EP3876110A4 (en) 2022-01-05
US20220019840A1 (en) 2022-01-20
CN111581414A (zh) 2020-08-25
EP3876110A1 (en) 2021-09-08
CN111581414B (zh) 2024-01-16
US11977604B2 (en) 2024-05-07

Similar Documents

Publication Publication Date Title
WO2020168814A1 (zh) Method, apparatus, device and storage medium for clothing recognition, classification and retrieval
Zhou et al. Torchreid: A library for deep learning person re-identification in pytorch
WO2021227726A1 (zh) Face detection and image-detection neural network training method, apparatus and device
CN111797893B (zh) Neural network training method, image classification system and related devices
WO2022213879A1 (zh) Target object detection method and apparatus, computer device and storage medium
Zhang et al. Actively learning human gaze shifting paths for semantics-aware photo cropping
WO2021027789A1 (zh) Object recognition method and apparatus
US9460518B2 (en) Visual clothing retrieval
US20220058429A1 (en) Method for fine-grained sketch-based scene image retrieval
US20220222918A1 (en) Image retrieval method and apparatus, storage medium, and device
CN110222718B (zh) Image processing method and apparatus
CN111507285A (zh) Face attribute recognition method and apparatus, computer device and storage medium
Leng et al. Context-aware attention network for image recognition
Qin et al. Depth estimation by parameter transfer with a lightweight model for single still images
CN111582449B (zh) Training method, apparatus, device and storage medium for a target-domain detection network
CN111191065B (zh) Homologous image determination method and apparatus
Dong et al. A detection-regression based framework for fish keypoints detection
Walch et al. Deep Learning for Image-Based Localization
CN114821140A (zh) Manhattan distance-based image clustering method, terminal device and storage medium
Wang et al. SPGNet: Spatial projection guided 3D human pose estimation in low dimensional space
Xu et al. Feature fusion capsule network for cow face recognition
Zhuang et al. Pose prediction of textureless objects for robot bin picking with deep learning approach
Li et al. Rule of thirds-aware reinforcement learning for image aesthetic cropping
Liu et al. Feature matching via guided motion field consensus
Guo et al. Indoor visual positioning based on image retrieval in dense connected convolutional network

Legal Events

Date Code Title Description
121 Ep: The EPO has been informed by WIPO that EP was designated in this application
Ref document number: 19915612
Country of ref document: EP
Kind code of ref document: A1
ENP Entry into the national phase
Ref document number: 2019915612
Country of ref document: EP
Effective date: 20210604
NENP Non-entry into the national phase
Ref country code: DE