CN109993128B - Method for quantifying attribute features of pedestrian image - Google Patents

Method for quantifying attribute features of pedestrian image

Info

Publication number
CN109993128B
CN109993128B (application CN201910268568.4A)
Authority
CN
China
Prior art keywords
color
word
image
pedestrian
probability
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910268568.4A
Other languages
Chinese (zh)
Other versions
CN109993128A (en)
Inventor
王蕴红
张少雄
刘峥
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beihang University
Original Assignee
Beihang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beihang University
Priority to CN201910268568.4A
Publication of CN109993128A
Application granted
Publication of CN109993128B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/50 Extraction of image or video features by performing operations within image blocks; by using histograms, e.g. histogram of oriented gradients [HoG]; by summing image-intensity values; Projection analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/56 Extraction of image or video features relating to colour
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands

Abstract

The invention discloses a method for quantifying the attribute features of a pedestrian image, comprising the following steps: for each pixel value in the image to be quantized, determining the corresponding color word according to a preset mapping from pixel value intervals to color words, where the color words are obtained by summarizing all pixel values in each training image containing pedestrian information; calculating the probability that each pixel point's color word belongs to each category of color theme, and representing the image to be quantized as theme probability vectors whose dimension equals the number of color themes, where the relevance between different color words is computed from their contributions to the pedestrian features in the training data, and closely related color words are aggregated into color themes of different categories; and concatenating the theme probability vectors of all pixel points in the image to be quantized, and describing the attribute quantization features of the image to be quantized with the concatenation result. The invention improves the accuracy and robustness of the pedestrian feature quantization result.

Description

Method for quantifying attribute features of pedestrian image
Technical Field
The invention relates to the technical field of pedestrian image feature extraction, in particular to a pedestrian image attribute feature quantization method.
Background
Color feature recognition and quantization processing techniques are widely used to represent the attribute features of pedestrian images. The basic goal of color features is to encode color information in an image, enabling color features to be robust to illumination and viewpoint variations.
At present, many methods for extracting the color features of pedestrian images express them as histograms over color spaces such as HSV and RGB. These methods divide the image to be quantized into small blocks, count the pixel values of the pixel points in each block, and split the value range of the pixel values into several regions for quantization. Such methods merely tally the values of the pixels in each local region of the image; they cannot intuitively convey the overall color trend of the image to be quantized, and they are sensitive to changes in illumination and viewing angle.
In addition, among existing pedestrian image color feature extraction techniques, there are some feature quantization methods based on color words. However, these methods rely heavily on external data; for example, image data consisting of natural scene pictures without pedestrian features is used to obtain the color words. When color words derived from external data are used to express the current image to be quantized containing pedestrian information, factors such as the illumination intensity in the external data bias the final quantization result. Because the data distribution of the color words obtained by such summarization methods is inconsistent with the color feature distribution of the current pedestrian feature extraction task, the color words may include words irrelevant to the task while lacking words strongly correlated with it. This causes a deviation between the final quantization result and the actual pedestrian features and degrades the accuracy and precision of the quantitative expression of the color attribute features in the final pedestrian image.
Disclosure of Invention
To solve this technical problem, the invention provides a method for quantifying the attribute features of a pedestrian image, comprising the following steps: for each pixel value in the image to be quantized, determining the corresponding color word according to a preset mapping from pixel value intervals to color words, where the color words are obtained by summarizing all pixel values in each training image containing pedestrian information; calculating the probability that each pixel point's color word belongs to each category of color theme, and representing the image to be quantized as theme probability vectors whose dimension equals the number of color themes, where the relevance between different color words is computed from their contributions to the pedestrian features in the training data, and closely related color words are aggregated into color themes of different categories; and concatenating the theme probability vectors of the pixel points in the image to be quantized, and describing the color attribute features of the image to be quantized with the concatenation result.
Preferably, the process of forming the color themes comprises the following steps: calculating the semantic relevance between any two color words from the pixel values of the different channels of each color word; representing each training image as word probability vectors, one per pixel point, whose dimension equals the number of color words, and from these vectors calculating the word frequency of each color word in each training image and the probability of each color word appearing across all training images, thereby obtaining the word frequency and image frequency index of each color word relative to all training images, which represents the contribution of that color word to the pedestrian features in the training data; under paired pedestrian identity relevance constraint conditions, computing, from the semantic relevance between different color words and the word frequency and image frequency indexes, the relevance between any two of all the color words appearing in the training data, to obtain the corresponding pedestrian attribute word relevances; and aggregating the pedestrian attribute word relevances of the color pairs to form the color themes.
Preferably, the step of computing, under paired pedestrian identity relevance constraint conditions, the relevance between any two of all the color words appearing in the training data from the semantic relevance between different color words and the word frequency and image frequency indexes further includes: S1, arbitrarily selecting two training images from the training data and, under the pedestrian identity relevance constraint condition of the image pair, calculating the contribution rate index correlation of each color pair appearing in the image pair from the word frequency and image frequency index of each color word appearing in the two training images; S2, judging whether further training image pairs need to be drawn; if so, selecting any two training images and executing step S1, otherwise proceeding to step S3; and S3, based on the results of steps S1 and S2, summing the contribution rate index correlations of the same color pair across all drawn image pairs, and combining them with the semantic relevances of the color pairs to obtain the corresponding pedestrian attribute word relevances.
Preferably, in step S1, the method further includes: s11, selecting a pixel point in the first training image, and recording the number of lines of the pixel point; s12, judging whether the image pair meets the pedestrian identity correlation constraint condition, respectively calculating the correlation between the color word corresponding to the pixel point and the color word corresponding to each pixel point in the same row in the second image, and obtaining a plurality of corresponding first correlation results based on the correlation; s13, judging whether each pixel point in the first training image is traversed or not, if not, selecting the next pixel point, and sequentially implementing the step S11 and the step S12, otherwise, entering the step S14; and S14, summing the first correlation results belonging to the same color pair in the two training images to obtain the contribution rate exponential correlation for the color pair.
Preferably, step S1 further includes: dividing the two drawn training images into a plurality of local image blocks according to a preset local block threshold of a certain area, treating each local image block as a single pixel point when calculating the contribution rate index correlations, and determining the pixel value corresponding to each local image block; and determining the corresponding color word from the pixel value of each local image block using the mapping relationship between pixel value intervals and color words.
Preferably, the step of aggregating the pedestrian attribute word relevances of the color pairs to form the color themes further includes: taking each color word as a node and taking the color words that have a relevance pairing relation with the current node as its adjacent nodes, to construct a color-pair relevance graph model; and traversing each node in the color-pair relevance graph model and judging whether the pedestrian attribute word relevance between the current node and an adjacent node satisfies a theme induction condition; if so, the current node and that adjacent node are closely related, and the two color nodes are grouped into the same color theme.
Preferably, the process of summarizing the color words includes the following step: performing cluster analysis, using a K nearest neighbor algorithm, on all pixel values in each training image of the training data within a color space of a preset type, to form color words in that color space that conform to the distribution of the pedestrian features.
Preferably, the step of determining the corresponding color word for each pixel value in the image to be quantized according to the preset mapping from pixel value intervals to color words, where the color words are obtained by summarizing all pixel values in each training image containing pedestrian information, includes: determining the interval unit in which the current pixel point of the image lies, where, based on the N color channels contained in each color word in the current color space, each color channel is divided into m intervals, giving a color space composed of m^N interval units; and calculating the probability that the current interval unit maps to each color word using the mapping relationship from pixel value intervals to color words, and determining the color word with the maximum probability among them as the color word of the current pixel point.
Preferably, the step of connecting the theme probability vectors corresponding to the pixel points in the image to be quantized in series and describing the color attribute characteristics of the image to be quantized by using the connection result includes: dividing the image to be quantized into a plurality of local area blocks; determining the mean value vector of the theme probability vectors corresponding to the pixel points in each local area block, and taking the mean value vector as the theme probability vector corresponding to each local area block; and connecting the mean vectors corresponding to each local area block in series to obtain a corresponding series result.
Preferably, the step of calculating the probability that the current interval unit maps to each of the color words includes: calculating a first probability of selecting any one of the color words; calculating a second probability of falling into the current interval unit, based on the pixel values of all pixel points in the training data; calculating a third probability that, given a predetermined color word, a pixel belongs to the current interval unit; and obtaining the probability that the current interval unit maps to the color word from the first, second, and third probabilities.
Compared with the prior art, one or more embodiments in the above scheme can have the following advantages or beneficial effects:
the invention provides a method for quantifying the color attribute features of a pedestrian image. The method adopts a layered framework to quantize the color attributes in pedestrian images: through a color theme model that conforms to the actual color distribution of pedestrian features, the color attributes of each pedestrian image are quantized into a set of intuitive visual color themes, improving the interpretability of the color feature quantization result. The training data of the invention is drawn from the current pedestrian feature extraction task, so the distribution of its color features is highly consistent with, and shares the same distribution characteristics as, the actual pedestrian features; the summarized color words and the constructed color theme model therefore fit the training data better and convey the overall color trend of the image. The final pedestrian feature quantization result obtained by the method does not deviate from the actual color features of the pedestrian when factors such as illumination, brightness, and photographing angle change, improving the accuracy and precision of the pedestrian image color feature quantization result. Furthermore, because the result does not change with such factors, the description of the same pedestrian's color features remains consistent, enhancing the stability, accuracy, and robustness of the pedestrian image color feature quantization result.
Additional advantages, objects, and features of the invention will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention and not to limit the invention. In the drawings:
fig. 1 is a step diagram of a method for quantifying an attribute feature of a pedestrian image according to an embodiment of the present application.
Fig. 2 is a schematic diagram of a color word induction result in the method for quantifying pedestrian image attribute features according to the embodiment of the present application.
Fig. 3 is a step diagram of a process of determining color words corresponding to pixel points in the method for quantifying pedestrian image attribute features according to the embodiment of the present application.
Fig. 4 is a step diagram of a color theme forming process in the method for quantifying the pedestrian image attribute feature according to the embodiment of the application.
Fig. 5 is a flowchart of a pedestrian attribute word correlation generation process in the method for quantifying pedestrian image attribute features according to the embodiment of the present application.
Fig. 6 is a flowchart of a process of generating a contribution rate index correlation of each color pair in the method for quantifying an attribute feature of a pedestrian image according to the embodiment of the present application.
Fig. 7 is a schematic diagram illustrating the principle of the pedestrian attribute word relevance aggregation process and the generation of the color topic model in the method for quantifying the attribute features of a pedestrian image according to the embodiment of the present application.
Detailed Description
The following detailed description of the embodiments of the present invention will be provided with reference to the drawings and examples, so that how to apply the technical means to solve the technical problems and achieve the technical effects can be fully understood and implemented. It should be noted that, as long as there is no conflict, the embodiments and the features of the embodiments of the present invention may be combined with each other, and the technical solutions formed are within the scope of the present invention.
Color feature recognition and quantization processing techniques are widely used to represent the attribute features of pedestrian images. The basic goal of color features is to encode color information in an image, enabling color features to be robust to illumination and viewpoint variations.
At present, many methods for extracting the color features of pedestrian images express them as histograms over color spaces such as HSV and RGB. These methods divide the image to be quantized into small blocks, count the pixel values of the pixel points in each block, and split the value range of the pixel values into several regions for quantization. Such methods merely tally the values of the pixels in each local region of the image; they cannot intuitively convey the overall color trend of the image to be quantized, and they are sensitive to changes in illumination and viewing angle. Specifically, the color histogram feature widely used in the prior art is merely a statistic over the local positions of pixel points and cannot extract higher-level feature semantic information. Moreover, these histogram-based color feature quantization methods require the statistical intervals to be divided manually and cannot adaptively discover highly representative information in an image.
In addition, among existing pedestrian image color feature extraction techniques, there are some feature quantization methods based on color words. However, these methods rely heavily on external data; for example, image data consisting of natural scene pictures without pedestrian features is used to obtain the color words. When color words derived from external data are used to express the current image to be quantized containing pedestrian information, factors such as the illumination intensity in the external data bias the final quantization result. Because the data distribution of the color words obtained by such summarization methods is inconsistent with the color feature distribution of the current pedestrian feature extraction task, the color words may include words irrelevant to the task while lacking words strongly correlated with it. This causes a deviation between the final quantization result and the actual pedestrian features and degrades the accuracy and precision of the quantitative expression of the color attribute features in the final pedestrian image.
For example, under the influence of external data, the pixel values of colors strongly correlated with the pedestrian feature extraction task may shift (e.g., brightness increases), so that color words summarized from the shifted pixel values cannot accurately express the color features of pedestrians. Consequently, when extracting the color features of the image to be quantized, the existing summarized color words cannot be used to accurately express the color attributes in the pedestrian image.
To solve this technical problem, the invention provides an attribute feature quantization method and system for pedestrian images. First, based on a color space of a specific type, color words that conform to the distribution of actual pedestrian features are summarized from training data comprising a plurality of training images (each containing a target pedestrian). Then a mapping from pixel value intervals to color words is constructed: each pixel point in the image to be quantized is assigned to a pixel value interval unit, and from the position of that interval unit in the color space, the mapping directly yields the color word to which the pixel value belongs. In addition, the invention provides a novel method for quantizing the color attribute features in pedestrian images with a hierarchical framework: according to the pedestrian identity feature constraint conditions and the degree to which different color words contribute to the pedestrian features in the training data, closely associated color words are aggregated into color themes of the same category, converting color words that represent color features at a low quantization level into color themes that represent color features at a high quantization level, which are then used to quantize and encode the image to be quantized. In this way, the color feature attributes of each pedestrian image to be quantized can be quantized into a set of intuitive color themes, improving the interpretability of the pedestrian features.
It should be noted that each training image in the present invention is an image containing a target pedestrian, where the target pedestrians are a numbered set of pedestrians. Each training image in the training data corresponds to one of these numbered pedestrians; specifically, training images of each target pedestrian are acquired under different pedestrian movement angles (for example, turning, back, front, and side directions), different shooting angles relative to the pedestrian, different ambient light, and so on. The invention then adaptively trains and extracts color themes of different categories from the color feature data in these training images (each a pedestrian image containing a target pedestrian) to quantize the color features of the pedestrian image to be quantized; unlike histogram-based color feature quantization, the training process needs no manually set limiting conditions. After the color theme model is obtained, the method can take an image of any other pedestrian as the image to be quantized and perform the color feature quantization operation based on the color themes.
Fig. 1 is a step diagram of a method for quantifying an attribute feature of a pedestrian image according to an embodiment of the present application. The following describes specific steps of the new pedestrian image color feature quantization method proposed by the present invention with reference to fig. 1.
Step S110 determines a color word corresponding to each pixel point from a pixel value corresponding to each pixel point in the image to be quantized according to a mapping relationship from a preset pixel value interval to the color word. The color words are obtained by summarizing pixel values corresponding to all pixel points in each training image containing pedestrian characteristic information based on a specific type of color space.
In the embodiment of the present invention, the color word induction process uses the training data as its data source: based on a color space of a preset type, cluster analysis is performed with a K nearest neighbor algorithm on the pixel values of all pixel points in each training image of the training data, forming the color words of the training data range within that color space. Through this process, the color words summarized by the embodiment of the invention contain the color features of all target pedestrians and conform to the color feature recognition rules of the target pedestrians.
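For illustration, the following is a minimal sketch of the induction step, not the patented implementation. The patent names a K nearest neighbor algorithm for the cluster analysis; the sketch substitutes plain k-means (Lloyd's algorithm) as a generic clustering stand-in, and the function name and the parameters n_words, n_iter, and the subsampling cap are illustrative assumptions.

```python
import numpy as np

def induce_color_words(training_images, n_words=32, n_iter=50, seed=0):
    """Cluster all training pixel values into color words (a k-means sketch).

    training_images: list of (H, W, N) arrays in the chosen color space.
    Returns an (n_words, N) array of channel pixel value vectors.
    """
    pixels = np.concatenate([img.reshape(-1, img.shape[-1])
                             for img in training_images], axis=0).astype(np.float64)
    rng = np.random.default_rng(seed)
    if len(pixels) > 100_000:  # subsample so the distance matrix stays small
        pixels = pixels[rng.choice(len(pixels), 100_000, replace=False)]
    # Initialize the word centers from randomly chosen training pixels.
    centers = pixels[rng.choice(len(pixels), n_words, replace=False)]
    for _ in range(n_iter):
        # Assign each pixel to its nearest center, then move each center
        # to the mean of its assigned pixels (Lloyd's iteration).
        d = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=-1)
        labels = d.argmin(axis=1)
        for j in range(n_words):
            members = pixels[labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)
    return centers
```

The returned centers play the role of the color words in fig. 2: each is a channel pixel value vector in the chosen color space, induced only from pedestrian training images.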
It should be noted that, the color space in the embodiment of the present invention is selected from any one of an RGB color space, an HSV color space, an LAB color space, and the like, the category of the color space is not specifically limited in the present invention, and a person skilled in the art can select the color space according to actual needs.
Fig. 2 is a schematic diagram of a color word induction result in the method for quantifying pedestrian image attribute features according to the embodiment of the present application. FIG. 2(a) illustrates generalized color words based on the RGB color space obtained on the VIPeR training dataset; FIG. 2(b) shows generalized color words based on the HSV color space obtained on the VIPeR training dataset; FIG. 2(c) illustrates generalized color words based on the LAB color space obtained on the PRID 450S training dataset.
Fig. 3 is a step diagram of a process of determining color words corresponding to pixel points in the method for quantifying pedestrian image attribute features according to the embodiment of the present application. As shown in fig. 3, in the present invention, through steps S301 to S302, a color word corresponding to a pixel value of each pixel point in the image is determined according to a pixel value corresponding to each pixel point in the image to be quantized, and the process of step S110 is completed.
First, step S301 determines the (pixel value) interval unit in which the current pixel point of the image (the image to be quantized or a training image) lies. In the embodiment of the present invention, the color space is configured as a collection of pixel value interval units. Specifically, each color word is expressed as a channel-pixel value vector composed of the pixel values of a plurality of color channels ($N$ denotes the number of color channels). Therefore, in the embodiment of the present invention, each color channel is divided into $m$ sections based on the $N$ color channels contained in the summarized color words in the current color space, so that the color space is composed of $M = m^N$ interval units, where $M$ denotes the number of interval units in the color space; $N$, $m$, and $M$ are positive integers.
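As a concrete illustration of the interval units, the sketch below maps one N-channel pixel value to its interval unit index, assuming equal-width bins and a per-channel value range of [0, vmax); both assumptions, like the function name, are not fixed by the patent.

```python
import numpy as np

def interval_unit_index(pixel, m=8, vmax=256):
    """Map an N-channel pixel value to the index of its interval unit.

    Each of the N channels is split into m equal bins, so the color space
    holds M = m**N interval units; vmax is the assumed channel value range.
    """
    pixel = np.asarray(pixel)
    # Per-channel bin index in [0, m), clamped so pixel == vmax - 1 stays valid.
    bins = np.minimum((pixel.astype(int) * m) // vmax, m - 1)
    index = 0
    for b in bins:  # flatten the N bin indices into one unit index in [0, m**N)
        index = index * m + int(b)
    return index
```

For example, with m = 8 in an RGB space (N = 3), M = 512 interval units cover the whole space, and interval_unit_index((255, 0, 128)) returns the index of the unit holding that pixel.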
After the interval unit in which the current pixel point lies has been determined, the process proceeds to step S302. Step S302 calculates the probability that the current interval unit maps to each color word, using the pre-constructed mapping relationship between pixel value intervals and color words, and determines the color word with the maximum probability among them as the color word of the current pixel point. In this way, the color word to which a pixel value belongs is determined by calculating the probability that its interval unit maps to each color word: large numbers of pixel points are merged into interval units, and only the mapping probabilities from interval units to color words need to be calculated.
The following describes the construction of the mapping relationship between pixel value intervals and color words in the embodiment of the present invention. The invention applies Bayes' rule to calculate the posterior probability that the current interval unit maps to a specified color word, thereby obtaining the probability that the current interval unit maps to that color word.
Specifically, the process of calculating the probability that the current interval unit maps to a specified color word further comprises the following steps: 1) calculating a first probability of selecting any one color word; 2) calculating a second probability of falling into the current interval unit, from the number of pixel points in the training data whose values belong to the current interval unit and the total number of pixel points in the training data; 3) calculating a third probability that, given a predetermined color word, a pixel belongs to the current interval unit; 4) calculating the probability that the current interval unit maps to the color word from the first, second, and third probabilities.
In one embodiment, let $W = \{w_1, \ldots, w_{N_w}\}$ denote the color words obtained by one K-neighbor clustering, and let $C = \{c_1, \ldots, c_M\}$ denote the set of pixel value interval units in the color space. The posterior probability $p(w_j \mid c_i)$ that the pixel value interval unit $c_i$ maps to the color word $w_j$ is defined as the corresponding likelihood probability and is expressed as

$$p(w_j \mid c_i) = \frac{p(c_i \mid w_j)\, p(w_j)}{p(c_i)} \qquad (1)$$
wherein, p (c)i|wj) Unit c for indicating interval of pixel valueiA priori probability (third probability) of mapping to a given color word, i.e. representing the category w of the color word given the current assignmentjUnder the condition (c), the pixel point falls into a pixel value interval unitiThe probability of (d); p (w)j) A weight indicating a color word that affects preference of a pixel point (section unit) when selecting the color word to which the current section belongs, that is, the first probability; p (c)i) Representing the second probability. Next, we need to calculate these three probabilities separately. First, when calculating the first probability, the selection preference of each color word is expressed as being consistent according to practical experience, and subjective interference is eliminated, namely, the same probability is given when each color word is selected. Wherein the first probability is calculated using the following expression:
Figure BDA0002017616580000091
where $N_w$ denotes the number of color words summarized from the training data by the embodiment of the present invention. Further, the second probability is

$$p(c_i) = \frac{N_{c_i}}{N_P} \qquad (3)$$
where $N_{c_i}$ denotes the number of pixel points in the entire training data set whose pixel values fall into (belong to) the current interval unit $c_i$, and $N_P$ denotes the total number of pixel points in the entire training data set. Finally, the third probability is modeled as a Gaussian density:

$$p(c_i \mid w_j) = \frac{1}{(2\pi)^{N/2}\,|\Sigma_{w_j}|^{1/2}} \exp\!\Big(-\frac{1}{2}\,(\bar{c}_i - w_j)^{\top}\,\Sigma_{w_j}^{-1}\,(\bar{c}_i - w_j)\Big) \qquad (4)$$
wherein, p (c)i|wj) Indicating falling with a given color wordProbability of entering the unit of the interval of the current pixel value, i.e. the third probability, wjA vector of channel pixel values representing the currently specified color word,
Figure BDA0002017616580000094
c, all pixel values in the whole training data set fall in the current pixel value interval unitiAnd is expressed by the following expression (5),
Figure BDA0002017616580000095
indicating the currently assigned color word wjThe covariance matrix of (2). Wherein, the expression (5) is as follows:
Figure BDA0002017616580000096
where $P_{c_i}$ denotes one of the pixel points falling into the current pixel value interval unit $c_i$, represented by its pixel value vector. In this way, the first, second, and third probabilities can all be calculated through the Bayes rule formulas above, so that for any given pixel value (vector) in the color space, the mapping relationship directly yields the probability that the pixel value maps to each color word. This forms a word probability vector for the pixel point's pixel value; the maximum element of that vector is then selected, and the corresponding color word is determined as the specific color word of the current pixel value. The word probability vector is defined per pixel point: it holds one sub-probability per summarized color word, each representing the probability that the current pixel value maps to the corresponding kind of color word (for example, a word probability vector (pa, pb, pc) means pa is the probability that the current pixel value maps to color word a, pb to color word b, and pc to color word c), and the dimension of the vector equals the number of color words.
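A sketch of the resulting lookup table follows. It assumes the Gaussian form reconstructed in expression (4) and per-unit statistics gathered from the training data; the array names are illustrative. Because the table is indexed by interval unit rather than by raw pixel value, each pixel of an image is quantized by a single table lookup.

```python
import numpy as np

def word_posteriors(unit_means, unit_counts, words, word_covs):
    """Word probability vector p(w_j | c_i) of every interval unit, expr. (1)-(5).

    unit_means:  (M, N)  mean training pixel value of each unit (expr. (5))
    unit_counts: (M,)    number of training pixels falling in each unit
    words:       (Nw, N) channel pixel value vector of each color word
    word_covs:   (Nw, N, N) covariance matrix of each color word
    Returns an (M, Nw) matrix whose rows are word probability vectors.
    """
    M, N = unit_means.shape
    Nw = len(words)
    p_w = 1.0 / Nw                          # expr. (2): uniform first probability
    p_c = unit_counts / unit_counts.sum()   # expr. (3): second probability
    post = np.zeros((M, Nw))
    for j in range(Nw):
        inv = np.linalg.inv(word_covs[j])
        det = np.linalg.det(word_covs[j])
        diff = unit_means - words[j]
        maha = np.einsum('mi,ij,mj->m', diff, inv, diff)
        # expr. (4): Gaussian likelihood of each unit mean under color word j
        like = np.exp(-0.5 * maha) / np.sqrt((2 * np.pi) ** N * det)
        post[:, j] = like * p_w / np.maximum(p_c, 1e-12)  # expr. (1): Bayes rule
    # Renormalize each row so the word probability vector sums to 1;
    # the color word of a pixel is then post[unit_index].argmax().
    return post / post.sum(axis=1, keepdims=True)
```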
Referring again to fig. 1, after the color word to which each pixel value in the image to be quantized belongs has been determined, the process proceeds to step S120. Step S120 calculates the (posterior) probability that the color word of each pixel belongs to each color theme and represents the image to be quantized, pixel by pixel, as theme probability vectors whose dimension equals the number of color themes. The relevance between different color words is calculated from the contribution-rate influence weights of the different color words on the pedestrian features in the training data, and closely related color words are aggregated to obtain the color themes of different categories.
Specifically, the color topic model of the embodiment of the present invention, which contains the color themes of different categories, is formed by aggregating color words that are strongly correlated in the extraction of person color features; the number of color themes is smaller than the number of summarized color words. The theme probability vector is defined per pixel point: it holds one sub-probability per constructed color theme, each sub-probability representing the probability that the color word of the current pixel value belongs to the corresponding color theme (for example, a theme probability vector (pA, pB, pC) means pA is the probability that the color word of the current pixel value belongs to color theme A, pB to color theme B, and pC to color theme C), and the dimension of the vector equals the number of color themes.
Therefore, in the embodiment of the present invention, the probability that the color word of the current pixel value belongs to each color theme must be calculated. Bayes' rule is again used to compute the posterior probability that the color word of the current pixel value maps to a specified color theme, yielding the probability for each color theme, from which a corresponding theme probability vector is constructed for each pixel point's pixel value. Since this posterior calculation parallels the method above for the posterior probability that an interval unit maps to a specified color word, it is not repeated here.
Fig. 4 is a step diagram of a color theme forming process in the method for quantifying the pedestrian image attribute feature according to the embodiment of the application. As shown in fig. 4, the present invention completes the generation process of the color topic model through steps S401 to S404.
First, step S401 calculates the semantic relevance between any two color words from the pixel values of each color word's (color) channels. Since each color word is represented as a channel-pixel value vector, the embodiment of the present invention computes a relevance score between any two color words from their semantic values, i.e., from the pixel values of the channels within the color words. Preferably, the Euclidean distance between two color words is calculated, and their relevance score, i.e., their semantic relevance, is determined by the following semantic relevance calculation formula:
$$R(w_i, w_j) = \exp\!\left(-\|w_i - w_j\|_2^2\right) \qquad (6)$$
in the formula (6), R represents a semantic relevance score, wi、wjRespectively representing the channel pixel value vectors corresponding to the two word colors. In general, the semantic relevance between pairs of color words in similar color families should be closer than the semantic relevance between pairs of color words in different color families. However, the distance is also affected by saturation and brightness, so that it is not accurate enough to use only the distance-based semantic relevance analysis, and it is also necessary to refer to the factor of contribution rate or influence degree of different color words to pedestrian features in training data.
After the semantic relevance has been calculated for the color pairs formed by all pairwise combinations of the color words, the process proceeds to step S402. Step S402 first represents each training image as word probability vectors, one per pixel point, whose dimension equals the number of color words. On this basis it calculates the word frequency of each color word in each training image and the probability of each color word appearing across all training images (the training data set), and from these obtains the word frequency and image frequency index of each color word relative to all training images, representing the contribution of that color word to the pedestrian features in the training data. The same color word has a separate word frequency (parameter) with respect to each training image; the word frequency parameter thus represents the importance of a color word's association with one training image.
It should be noted that different color words have different importance for the color features of pedestrian images. For example, dark grey is very common in most training images of the training dataset as a shadow region, while red and orange are not common as shadow regions but are common colors among pedestrian image feature attributes. Therefore, when evaluating the contribution rate or influence factor, the contribution of dark grey must be decreased and the contributions of red and orange increased. To this end, the embodiment of the present invention computes the degree of contribution of each color word to the pedestrian features in the training data, represented by a Word Frequency-Inverse Image Frequency index.
In step S402, a process of calculating a word frequency and an image frequency index corresponding to each color word for a current training image will be described by taking one of the training images as an example.
First, following the above method for computing the posterior probability that an interval unit maps to a specified color word, the probability that each pixel value in the current training image maps to each kind of color word is calculated, and each pixel point's pixel value in the training image is converted into the corresponding word probability vector, yielding the corresponding color word image $W_I$; each word probability vector gives the probabilities of mapping the pixel value to the different kinds of color words.
Then the word frequency of each color word appearing in the current image is calculated with the following word frequency calculation formula (see steps one to three), where the word frequency represents the importance of a given color word in the current training image. The calculation of the word frequency of a specific color word of type $j$ in the current training image is described as an example. Step one: from the word probability vector of each pixel point in the current color word image, extract the probability of mapping to the type-$j$ color word, and compute the mean of these probabilities to obtain the word likelihood probability mean of the type-$j$ color word:
$$K_I(w_j) = \frac{1}{N_{PI}} \sum_{y=1}^{\mathrm{high}} \sum_{x=1}^{\mathrm{wide}} W_I(y, x, j) \qquad (7)$$
where $K_I(w_j)$ denotes the word likelihood probability mean of the color word of type $j$; $N_{PI}$ denotes the number of pixel points in the current training image; $W_I(y, x, j)$ denotes the probability that the pixel value at location $(y, x)$ maps to the color word of category $j$; high denotes the height of the current training image; and wide denotes its width.
Step two: among all pixel points in the current training image, determine the pixel point or points whose color word is of type $j$, compute the pedestrian feature weight parameter of the type-$j$ color word at each of these pixel points, and sum the parameters to obtain the pedestrian feature weight of the type-$j$ color word, measuring its importance with respect to the current training image. The pedestrian feature weight parameter is guided by the characteristics of pedestrian images: pixel points closer to the horizontal center receive a higher weight value, because the effective area of a pedestrian image (containing the pedestrian rather than the background) generally lies near the horizontal center. The pedestrian feature weight parameter thus represents the weight of the influence of the current pixel point's position on the pedestrian image features. For each pixel point whose color word is of type $j$ in the current training image, the parameter is

$$\mathrm{weight}_I(y, x, j) = \begin{cases} g(x), & K_I(w_j) \ge \mu \cdot \dfrac{1}{N_w} \displaystyle\sum_{k=1}^{N_w} K_I(w_k) \\ 0, & \text{otherwise} \end{cases} \qquad (8)$$

where $g(x)$ is a center-weighting term that grows as $x$ approaches the horizontal center $\mathrm{wide}/2$; $\mu$ is a user-settable parameter, specifically a threshold factor for the word frequency and image frequency calculations, related to the number of qualified color words to be screened, typically taking the value 1 when computing word frequency and 0.5 when computing image frequency; and $k$ indexes the color words when traversing all of them. As expression (8) shows, when the word likelihood probability mean of the type-$j$ color word is at least $\mu$ times the mean of the word likelihood probability means of all color words, the pedestrian feature weight parameter equals the center-weighting term $g(x)$; otherwise it is 0.
Step three: from the results of steps one and two, the importance of the type-$j$ color word appearing in the current training image, i.e., the word frequency result, is obtained with the word frequency calculation formula

$$\mathrm{WF}_{w_j} = \mathrm{Weight}_I(w_j) \cdot \frac{K_I(w_j)}{\sum_{k=1}^{N_w} K_I(w_k)} \qquad (9)$$
where $\mathrm{WF}_{w_j}$ denotes the importance of the color word $w_j$ in the current training image $I$, i.e., the word frequency; $\mathrm{Weight}_I(w_j)$ denotes the pedestrian feature weight value, i.e., the summed pedestrian feature weight parameter measuring the importance of the type-$j$ color word with respect to the current training image; and $\sum_k K_I(w_k)$ denotes the sum of the word likelihood probability means of the color words in the current color word image.
Then, after the word frequency of every color word in every training image has been computed (the word frequency parameter of color word $w_j$ with respect to image $I$), the image frequency, i.e., the probability that each color word appears across all training images (the training data set), is computed from all the word frequency data with the image frequency calculation formula

$$\mathrm{IF}_{w_j} = \frac{N_I(w_j)}{N_I} \qquad (10)$$
where $\mathrm{IF}_{w_j}$ denotes the frequency with which the color word $w_j$ occurs across all images (the image frequency), $N_I(w_j)$ denotes the number of training images in which $w_j$ appears, and $N_I$ denotes the number of training images in the training data.
Finally, from the computed word frequency of each color word in each training image and the probability of each color word across all training images, the word frequency and image frequency index of each color word relative to all training images is obtained. Specifically, the ratio of the word frequency parameter to the image frequency parameter of the same color word is taken with the following index calculation formula:

$$\gamma(w_j) = \frac{\mathrm{WF}_{w_j}}{\mathrm{IF}_{w_j}} \qquad (11)$$
where $\gamma$ denotes the word frequency and image frequency index of the color word $w_j$ relative to all training images. This index represents the contribution rate, or importance, of each color word to the color feature attributes of pedestrian images in the training data. As expression (11) shows, a high image frequency yields a small word frequency and image frequency index, indicating that the color word discriminates poorly, i.e., it cannot distinguish well between the color features of different pedestrians. Conversely, a high word frequency in the current image combined with a low image frequency yields a large index, indicating that the color word distinguishes the current pedestrian's color features well from those of other pedestrians.
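The sketch below assembles expressions (7) through (11) for a set of color word images. The center-weighting term g(x), the use of μ as a screening threshold in both calculations, the per-image appearance test, and the averaging of word frequencies across images are all assumptions filled in where the original expressions were not recoverable.

```python
import numpy as np

def wf_iif_index(word_images, mu_wf=1.0, mu_if=0.5, eps=1e-12):
    """Word frequency and image frequency index, per expressions (7)-(11).

    word_images: list of (high, wide, Nw) word probability tensors W_I,
    one per training image. Returns a (Nw,) array of gamma indexes.
    """
    Nw = word_images[0].shape[-1]
    n_img = len(word_images)
    WF = np.zeros((n_img, Nw))
    appears = np.zeros((n_img, Nw), dtype=bool)
    for t, WI in enumerate(word_images):
        high, wide, _ = WI.shape
        K = WI.reshape(-1, Nw).mean(axis=0)        # expr. (7): K_I(w_j)
        # Assumed center-weighting g(x): a Gaussian over horizontal position.
        g = np.exp(-((np.arange(wide) - wide / 2) ** 2) / (2 * (wide / 4) ** 2))
        hard = WI.argmax(axis=-1)                  # color word of each pixel
        for j in range(Nw):
            if K[j] >= mu_wf * K.mean():           # expr. (8) gate with mu = 1
                _, xs = np.nonzero(hard == j)
                weight = g[xs].sum()               # summed pedestrian feature weight
            else:
                weight = 0.0
            WF[t, j] = weight * K[j] / (K.sum() + eps)   # expr. (9)
            appears[t, j] = K[j] >= mu_if * K.mean()     # gate with mu = 0.5
    IF = appears.mean(axis=0)                      # expr. (10): image frequency
    return WF.mean(axis=0) / (IF + eps)            # expr. (11): ratio index
```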
After the word frequency and image frequency index of every color word with respect to all training images has been calculated, the process proceeds to step S403. Step S403, under paired pedestrian identity relevance constraint conditions, computes, from the semantic relevance between different color words and from each color word's word frequency and image frequency index, the relevance between any two of all the summarized color words, obtaining the pedestrian attribute word relevance of each color pair. It should be noted that the color pairs in the present invention are formed by all pairwise combinations of the summarized color words, i.e., every possible pairing of the color words. The specific flow of step S403 is described in detail in example two below and therefore is not described here.
After step S403 is completed, step S404 aggregates the pedestrian attribute word relevances of the color pairs to form the color themes of different categories in the embodiment of the present invention, thereby constructing the corresponding color theme model. Preferably, this aggregation can be performed with a densest connected subgraph algorithm.
Fig. 7 is a schematic diagram illustrating the principle of the pedestrian attribute word relevance aggregation process and the generation of the color topic model in the method for quantifying the attribute features of a pedestrian image according to the embodiment of the present application. As shown in fig. 7, the present invention forms a color theme model containing color themes of different categories through steps S701 to S702.
Specifically, in step S701, each summarized color word is taken as a node, and the color words that have a relevance pairing relation with the current node are taken as its adjacent nodes, constructing a color-pair relevance graph model, as shown in the middle panel of fig. 7; there, the squares represent nodes, i.e., the different kinds of color words, and the lines between squares represent the relevance between different color word pairs, namely the pedestrian attribute word relevance between them.
Further, in step S702, each node in the color-pair relevance graph model is traversed, and it is judged whether the pedestrian attribute word relevance between the current node and an adjacent node satisfies the theme induction condition. If so, the current node and that adjacent node are closely related, and the two color nodes can be grouped into the same color theme; if the theme induction condition is not satisfied, the two are not closely related, and the two color nodes do not belong to the same color theme. The right panel of fig. 7 shows a constructed color theme model, in which each square represents a color theme.
In one embodiment, the theme induction condition is that the pedestrian attribute word relevance between a node and its adjacent node reaches or exceeds a preset theme evaluation threshold. If it does, the theme induction condition is satisfied, the association between the two is close, and the two color nodes can be grouped into the same color theme; if not, the condition is not satisfied, the association is not close, and the two color nodes do not belong to the same color theme.
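The following sketch carries out the aggregation on the color-pair relevance graph. The patent suggests a densest connected subgraph algorithm; as a simplified stand-in, this version thresholds edges by the theme induction condition and groups the resulting connected components into themes.

```python
import numpy as np

def aggregate_color_themes(relevance, threshold):
    """Group color words into color themes on the relevance graph (S701-S702).

    relevance: (Nw, Nw) symmetric pedestrian attribute word relevance matrix.
    Words joined by an edge meeting `threshold` (the theme induction
    condition) fall into the same theme. Returns (theme labels, theme count).
    """
    Nw = relevance.shape[0]
    theme = [-1] * Nw
    n_themes = 0
    for start in range(Nw):
        if theme[start] != -1:
            continue
        # Flood-fill one connected component of the thresholded graph.
        theme[start] = n_themes
        stack = [start]
        while stack:
            u = stack.pop()
            for v in range(Nw):
                if theme[v] == -1 and relevance[u, v] >= threshold:
                    theme[v] = n_themes
                    stack.append(v)
        n_themes += 1
    return theme, n_themes
```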
Thus, through the above steps S401 to S404, the color themes of different categories in the present example are formed, which highly generalize and accurately explain the color features of pedestrian images.
Further, referring to fig. 1 again, in step S130, topic probability vectors corresponding to each pixel point in the current image to be quantized (the current image to be quantized, in which each pixel point is converted into a corresponding topic probability vector) converted in step S120 are connected in series, and the color attribute characteristics of the current image to be quantized are described by using the series connection result.
Preferably, in order to reduce the data amount of the color attribute feature quantization result of the image to be quantized, the image to be quantized needs to be subjected to a blocking process before a final quantization result is obtained. Specifically, first, step S131 divides the image to be quantized into a plurality of local area blocks according to a preset number of blocks, and then, the process proceeds to step S132.
Step S132 computes the mean of the theme probability vectors of the pixel points in each local area block and uses each mean vector as the theme probability vector of the corresponding local area block.
Further, in step S133, the mean vectors corresponding to the local area blocks are concatenated to obtain the corresponding concatenation result. In a preferred embodiment, the local area blocks are numbered sequentially, the mean vectors corresponding to the blocks are then connected in series according to the numbering order to form a new color attribute quantization feature vector, and the pedestrian color features in the image to be quantized are described by this color attribute quantization feature vector. A sketch of this block-pooling step follows.
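The following NumPy sketch illustrates steps S131 to S133 under stated assumptions: the per-pixel theme probability vectors are given as an H x W x K array (K color themes), and the image is divided into a regular grid of blocks. The function name pool_theme_vectors and the grid parameters are illustrative, not taken from the patent.

import numpy as np

def pool_theme_vectors(theme_probs, blocks_h, blocks_w):
    """Steps S131-S133 (sketch): split the H x W x K map of per-pixel theme
    probability vectors into a blocks_h x blocks_w grid, take the mean vector
    of each block, and concatenate the means in block-number order."""
    h, w, k = theme_probs.shape
    bh, bw = h // blocks_h, w // blocks_w
    means = []
    for i in range(blocks_h):          # traverse blocks in numbering order
        for j in range(blocks_w):
            block = theme_probs[i * bh:(i + 1) * bh, j * bw:(j + 1) * bw, :]
            means.append(block.reshape(-1, k).mean(axis=0))  # S132: mean vector
    return np.concatenate(means)       # S133: series connection

# Example: a 64 x 32 image, 8 color themes, and a 4 x 2 block grid.
probs = np.random.dirichlet(np.ones(8), size=(64, 32))
feature = pool_theme_vectors(probs, blocks_h=4, blocks_w=2)
print(feature.shape)  # (64,) = 4 * 2 blocks * 8 themes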
In summary, the embodiment of the present invention uses the constructed color theme model to give a highly generalized description of the color features of the pedestrian picture to be quantized, and captures the overall color trend of the image. Because the color words summarized by the method and the different types of color themes constructed from the closeness of color word association are all obtained from target pedestrian features (with background information excluded), the final pedestrian feature quantization result obtained by the embodiment of the invention does not alter its depiction of the pedestrian image color features under changes in illumination, brightness, photographing angle, and similar factors, which enhances the robustness and accuracy of the pedestrian image color feature quantization result.
Example two
Fig. 5 is a flowchart of the pedestrian attribute word correlation generation process in the method for quantifying pedestrian image attribute features according to the embodiment of the present application. As shown in fig. 5, the present invention calculates the correlation between any two colors among all the color words through steps S501 to S503 to obtain the corresponding pedestrian attribute word correlation, thereby completing step S403.
In step S501, two training images are randomly selected from the training data, and, based on the pedestrian identity correlation constraint condition of the image pair, the contribution rate index correlation of each color pair appearing in the image pair is calculated from the word frequency and image frequency index of each color word appearing in the two training images.
Fig. 6 is a flowchart of a process of generating a contribution rate index correlation of each color pair in the method for quantifying an attribute feature of a pedestrian image according to the embodiment of the present application. As shown in fig. 6, the present invention calculates the contribution rate index correlation of each color pair appearing in the image pair through steps S601 to S604, thereby completing step S501.
First, in step S601, one pixel point (denoted the first pixel point) is selected in the first training image of the image pair, the row number of this pixel point is recorded, and the process then proceeds to step S602.
In step S602, it is determined whether the image pair satisfies the pedestrian identity correlation constraint condition, and the correlation between the color word corresponding to the selected pixel point and the color word corresponding to each pixel point in the same row of the second image is calculated using the first correlation calculation formula, yielding a plurality of corresponding first correlation results. Specifically, it is first determined whether the pedestrians in the two training images are the same person, that is, whether the image pair satisfies the pedestrian identity correlation constraint condition. If the pedestrians in the image pair are the same person, the judgment result of the pedestrian identity correlation constraint condition takes the value 1; if they are not the same person, it takes the value -1.
Then, the correlation between the color word corresponding to the selected (first) pixel point in the first training image and the color word corresponding to each pixel point in the same row of the second image is calculated, as shown in the left diagram of fig. 7. In that diagram, the upper row represents the first training image of the current training image pair, the lower row represents the second training image, the first square on the left of the upper row represents the currently selected (first) pixel point, and the 4 squares in the lower row represent the pixel points of the second training image lying in the same row as the selected pixel point. For example, suppose the same row of the second image contains 4 pixel points, each corresponding to a color word: the correlation between the red color word of the selected pixel point and the orange color words of the first, second, and third pixel points yields three first correlation results for the red-orange color pair, and the correlation with the dark gray color word of the fourth pixel point yields a first correlation result for the red-dark gray color pair, giving 4 first correlation results in total. The 4 first correlation results may thus belong to the same color pair (e.g., the red-orange pair) or to different color pairs (the red-orange pair and the red-dark gray pair).
Wherein the first correlation calculation formula is expressed by the following expression:
β = λ·sign(O,G)·γ_i·γ_j (12)
In formula (12), β represents the first correlation result; O represents the first training image; G represents the second training image; sign(O,G) represents the judgment result of the pedestrian identity correlation constraint condition; γ_i represents the word frequency and image frequency index, relative to all training images, of the color word w_i corresponding to the selected pixel point in the first training image; γ_j represents the word frequency and image frequency index, relative to all training images, of the color word w_j corresponding to a pixel point in the same row of the second image; and λ represents the pedestrian identity constraint weight factor, which expresses how strongly the pedestrian identity constraint influences the correlation between the current color pair and generally takes a value between 0.1 and 1. As can be seen from expression (12), if the two training images come from the same pedestrian, the color word correlation is large; conversely, it is small.
Then, in step S603, it is determined whether every pixel point in the first training image has been traversed; if not, the next pixel point is selected and steps S601 and S602 are carried out in turn, until every pixel point in the first training image has been traversed and the corresponding plurality of first correlation results has been obtained. After all the pixel points in the first training image have been traversed, the process jumps to step S604.
In step S604, the first correlation results belonging to the same color pair in the two training images are summed to obtain the contribution rate index correlation for that color pair. It should be noted that the contribution rate index correlation here characterizes the correlation between a given color pair within the range of the two current training images. A sketch of this traversal follows.
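As an illustration of steps S601 to S604, the following Python sketch traverses a row-aligned image pair and accumulates the first correlation results of formula (12) per color pair. It assumes each image has already been mapped to a 2-D array of color word labels and that the word frequency and image frequency index γ of each color word is given; the names (contribution_rate_correlation, gamma, same_person) are illustrative assumptions, not from the patent.

from collections import defaultdict

def contribution_rate_correlation(words_a, words_b, gamma, same_person, lam=0.5):
    """Steps S601-S604 (sketch): for every pixel of the first image, pair its
    color word with the color words of all same-row pixels of the second image,
    score each pair with formula (12), and sum the scores per color pair."""
    sign = 1 if same_person else -1           # pedestrian identity constraint
    totals = defaultdict(float)               # S604: sum per color pair
    for row_a, row_b in zip(words_a, words_b):    # same-row pairing (S601)
        for wi in row_a:
            for wj in row_b:                      # S602: formula (12)
                beta = lam * sign * gamma[wi] * gamma[wj]
                totals[tuple(sorted((wi, wj)))] += beta
    return dict(totals)

# Example: two tiny 1 x 4 "images" of color word labels from the same pedestrian.
a = [["red", "red", "orange", "red"]]
b = [["orange", "orange", "orange", "dark_gray"]]
g = {"red": 0.9, "orange": 0.7, "dark_gray": 0.3}
print(contribution_rate_correlation(a, b, g, same_person=True))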
Further, in order to reduce the amount of calculation incurred by the traversal that produces the first correlation results, it is preferable to replace the pixel points involved in steps S601 to S604, that is, in step S501, with local image blocks. Specifically, in the first step, the two selected training images are each divided into a plurality of local image blocks according to a preset local block threshold of a certain area, each local image block is treated as one pixel point in the calculation of the contribution rate index correlation, and a pixel value is determined for each local image block, namely the mean of the pixel values of the individual pixel points within the block. In the second step, using the mapping relationship between pixel value intervals and color words constructed in the first embodiment, the color word to which each local image block belongs is determined from its pixel value.
It should be noted that, in order to maintain the calculation accuracy of the contribution rate index correlation and ensure an accurate expression of the correlation between color pairs, the area of the local block threshold may be set freely; the present invention places no specific limitation on it, and those skilled in the art may adjust or set it according to actual requirements. In practice, if the process of step S501 (steps S601 to S604) is traversed pixel by pixel, the computational complexity increases greatly and the process is easily affected by illumination noise, so that individual pixels cannot be guaranteed to match their expected values; traversing by local image blocks suppresses the noise caused by the color distortion of individual pixels and greatly reduces the computational complexity. Generally, a local image block size within 8x8 is sufficient to preserve accuracy. A sketch of this block replacement follows.
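The following NumPy sketch illustrates the block replacement under stated assumptions: color words are represented by their cluster-center pixel values, and each block is assigned the nearest center, which stands in for the interval-to-word mapping of the first embodiment. The names (blocks_to_color_words, centers) are illustrative, not from the patent.

import numpy as np

def blocks_to_color_words(image, centers, block=8):
    """Replace pixels with local image blocks: average each block x block
    region into one pixel value, then assign each block the color word whose
    representative pixel value (cluster center) is nearest to that mean."""
    h, w, c = image.shape
    labels = np.empty((h // block, w // block), dtype=int)
    for i in range(h // block):
        for j in range(w // block):
            patch = image[i * block:(i + 1) * block, j * block:(j + 1) * block]
            mean = patch.reshape(-1, c).mean(axis=0)          # block pixel value
            labels[i, j] = np.argmin(((centers - mean) ** 2).sum(axis=1))
    return labels

# Example: a 32 x 16 RGB image and 3 illustrative color word centers.
img = np.random.randint(0, 256, (32, 16, 3)).astype(float)
centers = np.array([[200.0, 40, 40], [230, 140, 30], [60, 60, 60]])  # red/orange/gray
print(blocks_to_color_words(img, centers))  # 4 x 2 grid of color word indices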
Further, after calculating the contribution rate index correlation of each color pair appearing in the current training image pair, the process proceeds from step S501 to step S502.
In step S502, it is determined whether further training image pairs need to be extracted. If so, two training images are again randomly selected and the contribution rate index correlation of each color pair appearing in that image pair is calculated by returning to step S501, with the contribution rate index correlations obtained so far serving as the data basis for forming the final pedestrian attribute word correlation of the different color pairs, until no further training image pairs need to be extracted. Once no more image pairs need to be extracted, the process proceeds from step S502 to step S503.
In step S503, using the calculation results of steps S501 and S502 and the calculation expression of the pedestrian attribute word correlation, the contribution rate index correlations of the same color pair across all the extracted image pairs are summed, and the pedestrian attribute word correlation is obtained by combining the semantic correlation of the color pair. The contribution rate index correlation characterizes the correlation between a color pair within the range of one image pair, while the pedestrian attribute word correlation characterizes the correlation between the same color pair across the multiple groups of image pairs produced by repeated extraction. Therefore, in the embodiment of the present invention, the contribution rate index correlations of the same color pair are selected from the multiple sets of results obtained in the former stage and summed again to obtain the correlation of the latter. The pedestrian attribute word correlation calculation formula is expressed as follows:
R'(w_t, w_l) = R(w_t, w_l) + Σ_{u=1}^{U} α_u
wherein R' represents the pedestrian attribute word correlation between the color word w_t and the color word w_l; R represents the semantic correlation between the color word w_t and the color word w_l; α_u represents the contribution rate index correlation of the color pair (w_t, w_l) appearing in the u-th extracted image pair; and U represents the number of contribution rate index correlation parameters of the same color pair (w_t, w_l) across the multiple groups of image pairs. A sketch of this aggregation follows.
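As an illustration of step S503, the following Python sketch pools the per-pair contribution rate index correlations produced by repeated extraction and adds the semantic correlation, following the additive reading of the formula above. The names (pedestrian_word_correlation, semantic) and the additive combination are assumptions, not verbatim from the patent.

from collections import defaultdict

def pedestrian_word_correlation(per_pair_results, semantic):
    """Step S503 (sketch): sum the contribution rate index correlations of the
    same color pair over all extracted image pairs, then combine each sum with
    the semantic correlation of that color pair."""
    summed = defaultdict(float)
    for result in per_pair_results:          # one dict per extracted image pair
        for pair, alpha in result.items():
            summed[pair] += alpha            # sum alphas of the same color pair
    return {pair: semantic.get(pair, 0.0) + total
            for pair, total in summed.items()}

# Example: results from two extracted image pairs plus semantic correlations.
results = [{("orange", "red"): 0.9, ("dark_gray", "red"): -0.2},
           {("orange", "red"): 0.4}]
sem = {("orange", "red"): 0.6, ("dark_gray", "red"): 0.1}
print(pedestrian_word_correlation(results, sem))
# -> {('orange', 'red'): 1.9, ('dark_gray', 'red'): -0.1}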
In summary, in the embodiment of the present invention, the pedestrian attribute word correlation is obtained from the pedestrian identity correlation constraint condition, the semantic correlation, and the word frequency and image frequency index. The semantic correlation reflects the intrinsic nature of color features: colors of similar color families are highly correlated. For example, red remains essentially within the same color family even under different lighting conditions, so the semantic correlation uses this intrinsic characteristic of color to build correlations between different color words. The pedestrian identity correlation constraint condition addresses the color distortion in images caused by factors such as the shooting angle, illumination, and the pedestrian's motion posture. For example, the colors on the body of the same pedestrian remain consistent as long as the clothes are not changed, yet may appear inconsistent in the images owing to these factors; in this case the pedestrian identity constraint is used to establish correlations between such colors. Finally, the word frequency and image frequency index uses the statistical information of all training images in the training data set to select discriminative colors (color features that distinguish different pedestrians) and to strengthen the correlation of colors with strong discriminability. Therefore, the pedestrian attribute word correlation obtained through these three parameters conforms better to the rules of pedestrian feature recognition, yields a more practical and more accurate description of pedestrian features, and has high generality.
The invention provides a method and a system for quantifying the color attribute features of a pedestrian image. The method and system employ a layered framework to quantify the color attributes in pedestrian images, in which the low-level and high-level color attribute encodings are referred to as Color Words and Color Topics, respectively. In the color theme construction process, color words are first generated from the training data, and each pixel is mapped to one color word through a posterior probability; then, the importance of each color word is calculated according to its discrimination ability; finally, a correlation graph representing the closeness of correlation among the color words is established under the pedestrian identity constraint, and strongly correlated color words are grouped into the same color theme to obtain the corresponding color theme model. The color theme model is thus a set of the most relevant color words based on the correlation graph. Through the color theme model, the color attributes of each pedestrian image can be quantized into a series of intuitive color theme sets, which improves the interpretability of the color feature quantization result. Because the training data of the invention are taken from the current pedestrian feature extraction task, the distribution of their color features is highly consistent with the actual pedestrian features and shares the same distribution characteristics, so the derived color words and the constructed color theme model fit the training data well.
In summary, the embodiment of the present invention uses the constructed color theme model to give a highly generalized description of the color features of the pedestrian picture to be quantized, and captures the overall color trend of the image. Because the color words summarized by the method and the different types of color themes constructed from the closeness of color word association are all obtained from target pedestrian features (with background information excluded) and are insensitive to factors such as the external environment and the shooting angle, the final pedestrian feature quantization result obtained by the method exhibits no deviation between the color feature quantization result of the pedestrian image to be quantized and the corresponding actual color features of the pedestrian under changes in illumination, brightness, shooting angle, and similar factors, which improves the accuracy and precision of the pedestrian image color feature quantization result. Furthermore, since the final pedestrian feature quantization result does not alter the description of the same pedestrian's color features under variations in illumination, brightness, or photographing angle, the stability, accuracy, and robustness of the pedestrian image color feature quantization result are enhanced.
The above description is only a preferred embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (10)

1. A method for quantifying pedestrian image attribute features, comprising:
determining, for each pixel value in an image to be quantized, a corresponding color word according to a mapping relation from preset pixel value intervals to color words, wherein the color words, which conform to the pedestrian feature distribution rule in the color space, are obtained by performing cluster analysis induction processing on all pixel values in each training image containing pedestrian information, based on a preset type of color space;
calculating the probability that the color word of each pixel point belongs to each type of color theme, and representing the image to be quantized as a theme probability vector whose dimension is consistent with the number of the color themes, wherein the correlation among different color words is calculated based on the contribution of the different color words to the pedestrian features in the training data, and closely correlated color words are aggregated to obtain the color themes of different types;
and connecting the theme probability vectors corresponding to the pixel points in the image to be quantized in series, and describing the color attribute characteristics of the image to be quantized by using a serial connection result.
2. The method of claim 1, wherein in forming the color theme, comprising:
calculating semantic relevance between any two color words according to different channel pixel values of each color word;
representing each training image as a word probability vector with corresponding dimension of each pixel point consistent with the number of color words, calculating the word frequency of each color word in each training image and the probability of each color word appearing in all training images based on the word probability vector, and further obtaining the word frequency and image frequency index of each color word relative to all training images to represent the contribution of the color word to the pedestrian characteristics in the training data;
based on paired pedestrian identity relevance constraint conditions, according to semantic relevance between words with different colors and the word frequency and image frequency indexes, counting the relevance between any two color pairs in all color words appearing in the training data to obtain corresponding pedestrian attribute word relevance;
and aggregating the pedestrian attribute word correlation of a plurality of pairs aiming at each color to form the color theme.
3. The method according to claim 2, wherein in the step of obtaining the corresponding pedestrian attribute word correlation by counting the correlation between any two color pairs in all the color words appearing in the training data according to the semantic correlation between different color words and the word frequency and image frequency index based on the paired pedestrian identity correlation constraint conditions, the method further comprises:
s1, arbitrarily selecting two training images from the training data, and calculating the contribution rate index correlation of each color pair appearing in the image pair according to the word frequency and the image frequency index of each color word appearing in the two training images based on the pedestrian identity correlation constraint condition of the image pair;
s2, judging whether the training image pair needs to be continuously extracted, if so, selecting any two training images, and executing the step S1, otherwise, entering the step S3;
and S3, summing the contribution rate index correlations of the same color pair in all the extracted image pairs based on the calculation results of the step S1 and the step S2, and combining the semantic correlations of the color pairs to obtain the corresponding pedestrian attribute word correlations.
4. The method according to claim 3, wherein in step S1, further comprising:
s11, selecting a pixel point in the first training image, and recording the number of lines of the pixel point;
s12, judging whether the image pair meets the pedestrian identity correlation constraint condition, respectively calculating the correlation between the color word corresponding to the pixel point and the color word corresponding to each pixel point in the same row in the second image, and obtaining a plurality of corresponding first correlation results based on the correlation;
s13, judging whether each pixel point in the first training image is traversed or not, if not, selecting the next pixel point, and sequentially implementing the step S11 and the step S12, otherwise, entering the step S14;
and S14, summing the first correlation results belonging to the same color pair in the two training images to obtain the contribution rate exponential correlation for the color pair.
5. The method according to claim 4, wherein in step S1, further comprising:
dividing the two extracted training images into a plurality of local image blocks respectively according to a preset local block threshold with a certain area, taking each local image block as a pixel point in the process of calculating the contribution rate index correlation, and determining a pixel value corresponding to each local image block;
and determining the corresponding color word according to the pixel value of each local image block by utilizing the mapping relation between the pixel value interval and the color word.
6. The method according to any one of claims 2 to 5, wherein in the step of aggregating a plurality of said pedestrian attribute word correlations for each color pair to form said color theme, further comprising:
taking each color word as a node, taking the color word which has a relevance pairing relation with the current node as an adjacent node of the current node, and constructing a color-pair relevance graph model;
traversing each node in the color pair correlation graph model, judging whether the pedestrian attribute word correlation between the current node and its neighboring node meets a theme induction condition, and if so, indicating that the correlation between the current node and the neighboring node is close and grouping the two color nodes into the same color theme.
7. The method according to any one of claims 1 to 5, wherein in the process of summarizing the color word, the method comprises the following steps:
and performing cluster analysis processing on all pixel values in each training image in the training data based on a preset type color space by adopting a K nearest neighbor algorithm to form color words which accord with the pedestrian characteristic distribution rule in the color space.
8. The method according to any one of claims 1 to 5, wherein the step of determining a corresponding color word from each pixel value in the image to be quantized, according to a mapping relationship from preset pixel value intervals to color words obtained by summarizing all pixel values in each training image containing pedestrian information, comprises:
determining the interval unit in which the current pixel point of the image is located, wherein, based on the N color channels contained in each color word in the current color space, each color channel is divided into m intervals, yielding a color space comprising m^N interval units;
and calculating the probability of mapping the current interval unit to each color word by using the mapping relation from the pixel value interval to the color word, and determining the color word corresponding to the maximum value in all the probability as the color word of the current pixel point.
9. The method according to any one of claims 1 to 5, wherein in the step of concatenating the theme probability vectors corresponding to each pixel point in the image to be quantized and describing the color attribute characteristics of the image to be quantized by using the concatenation result, the method comprises:
dividing the image to be quantized into a plurality of local area blocks;
determining the mean value vector of the theme probability vectors corresponding to the pixel points in each local area block, and taking the mean value vector as the theme probability vector corresponding to each local area block;
and connecting the mean vectors corresponding to each local area block in series to obtain a corresponding series result.
10. The method of claim 8, wherein the step of calculating the probability that the current interval unit is mapped to each of the color words comprises:
calculating a first probability of selecting any one of the color words;
calculating a second probability of falling in the current interval unit based on pixel values corresponding to all pixel points in the training data;
under the condition that a color word is determined in advance, calculating a third probability that the color word belongs to the current interval unit;
and obtaining the probability of the current interval unit mapping to the color word according to the first probability, the second probability and the third probability.
CN201910268568.4A 2019-04-04 2019-04-04 Method for quantifying attribute features of pedestrian image Active CN109993128B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910268568.4A CN109993128B (en) 2019-04-04 2019-04-04 Method for quantifying attribute features of pedestrian image

Publications (2)

Publication Number Publication Date
CN109993128A CN109993128A (en) 2019-07-09
CN109993128B (en) 2020-12-18

Family

ID=67131000

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910268568.4A Active CN109993128B (en) 2019-04-04 2019-04-04 Method for quantifying attribute features of pedestrian image

Country Status (1)

Country Link
CN (1) CN109993128B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113408552A (en) * 2020-03-16 2021-09-17 京东方科技集团股份有限公司 Feature quantization model training, feature quantization and data query methods and systems
CN111540191B (en) * 2020-04-27 2021-07-02 英华达(上海)科技有限公司 Driving warning method, system, equipment and storage medium based on Internet of vehicles

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107273804A (en) * 2017-05-18 2017-10-20 东北大学 Pedestrian recognition method based on SVMs and depth characteristic
CN107346414B (en) * 2017-05-24 2020-06-12 北京航空航天大学 Pedestrian attribute identification method and device
CN107844775A (en) * 2017-11-13 2018-03-27 常州大学 A kind of recognition methods again of the pedestrian based on block rarefaction representation
CN109117723B (en) * 2018-07-06 2020-10-30 中国科学院自动化研究所 Blind road detection method based on color pattern analysis and semantic segmentation
CN109214420A (en) * 2018-07-27 2019-01-15 北京工商大学 The high texture image classification method and system of view-based access control model conspicuousness detection
CN109271932A (en) * 2018-09-17 2019-01-25 中国电子科技集团公司第二十八研究所 Pedestrian based on color-match recognition methods again

Also Published As

Publication number Publication date
CN109993128A (en) 2019-07-09

Similar Documents

Publication Publication Date Title
CN109670528B (en) Data expansion method facing pedestrian re-identification task and based on paired sample random occlusion strategy
Sajid et al. Universal multimode background subtraction
CN108764085B (en) Crowd counting method based on generation of confrontation network
CN110059581A (en) People counting method based on depth information of scene
Duque-Arias et al. On power Jaccard losses for semantic segmentation
CN110428432A (en) The deep neural network algorithm of colon body of gland Image Automatic Segmentation
CN110288555B (en) Low-illumination enhancement method based on improved capsule network
Sathya et al. Classification and segmentation in satellite imagery using back propagation algorithm of ann and k-means algorithm
JPH06150000A (en) Image clustering device
JP2000215318A (en) Method for clustering input vector
CN109800817B (en) Image classification method based on fusion semantic neural network
CN109558781A (en) A kind of multi-angle video recognition methods and device, equipment and storage medium
CN102236899A (en) Method and device for detecting objects
CN111709290B (en) Crowd counting method based on coding and decoding-jump connection scale pyramid network
CN106778635A (en) A kind of human region detection method of view-based access control model conspicuousness
CN110807396B (en) Face changing video tampering detection method and system based on illumination direction consistency
CN113011329A (en) Pyramid network based on multi-scale features and dense crowd counting method
CN109993128B (en) Method for quantifying attribute features of pedestrian image
CN114913403B (en) Visual question-answering method based on metric learning
CN110222718A (en) The method and device of image procossing
CN106682684A (en) K-means clustering-based target recognition method
CN108345900A (en) Pedestrian based on color and vein distribution characteristics recognition methods and its system again
CN109740527B (en) Image processing method in video frame
CN116227624A (en) Federal knowledge distillation method and system oriented to heterogeneous model
CN106250828B (en) A kind of people counting method based on improved LBP operator

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant