WO2022228142A1 - Object density determination method and apparatus, computer device and storage medium - Google Patents

Object density determination method and apparatus, computer device and storage medium

Info

Publication number
WO2022228142A1
Authority
WO
WIPO (PCT)
Prior art keywords
density
image
standard
predicted
value
Prior art date
Application number
PCT/CN2022/086848
Other languages
French (fr)
Chinese (zh)
Inventor
Wang Changan (王昌安)
Original Assignee
Tencent Technology (Shenzhen) Company Limited (腾讯科技(深圳)有限公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology (Shenzhen) Company Limited
Publication of WO2022228142A1 publication Critical patent/WO2022228142A1/en

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G06N3/084 - Backpropagation, e.g. using gradient descent

Definitions

  • The present application relates to the technical field of image processing, and in particular to an object density determination method, apparatus, computer device, and storage medium.
  • Object density determination technology can automatically infer the density of a crowd in an image, which plays an important role in video surveillance, public transportation safety, and other fields.
  • In the related art, object density map regression is mainly used for prediction, with deep learning technology based on artificial intelligence used for end-to-end training and inference.
  • However, the density values in the object density map output by the trained object density determination model are often inaccurate, resulting in low accuracy of the acquired object density map.
  • According to various embodiments provided in the present application, an object density determination method, apparatus, computer device, and storage medium are provided.
  • An object density determination method executed by a computer device, the method comprising: acquiring a training sample image and a standard density map corresponding to the training sample image; inputting the training sample image into an object density determination model to be trained, and obtaining the predicted density map output by the object density determination model; and dividing the standard density map and the predicted density map respectively to obtain multiple standard image blocks corresponding to the standard density map and multiple predicted image blocks corresponding to the predicted density map.
  • An object density determination device, comprising: an image acquisition module for acquiring a training sample image and a standard density map corresponding to the training sample image; an image input module for inputting the training sample image into an object density determination model to be trained and obtaining the predicted density map output by the object density determination model; an image division module for dividing the standard density map and the predicted density map respectively to obtain multiple standard image blocks corresponding to the standard density map and multiple predicted image blocks corresponding to the predicted density map; a density statistics module configured to perform statistics on the object densities in the standard image blocks to obtain the standard density statistic values corresponding to the standard image blocks, and to count the object densities in the predicted image blocks to obtain the predicted density statistic values corresponding to the predicted image blocks; and a training module configured to form image pairs from each standard image block and the predicted image block that has an image position correspondence with it, and, based on the difference between the standard density statistic value and the predicted density statistic value corresponding to each image pair, adjust the parameters of the object density determination model to be trained to obtain a trained object density determination model, the trained object density determination model being used to generate an object density map.
  • A computer device comprising a memory and one or more processors, the memory storing computer-readable instructions which, when executed by the processors, cause the processors to perform the steps of the above object density determination method.
  • One or more non-volatile readable storage media storing computer readable instructions which, when executed by one or more processors, cause the processors to perform the steps of the above object density determination method.
  • a computer program product comprising computer readable instructions that, when executed by a processor, implement the steps of the above object density determination method.
  • Another object density determination method, apparatus, computer device, and storage medium are also provided.
  • An object density determination method executed by a computer device, the method comprising: acquiring a target image whose density is to be determined; inputting the target image into a trained object density determination model and performing object density determination through the object density determination model, where the object density determination model is obtained by adjusting the parameters of an object density determination model to be trained based on the difference between the standard density statistic value and the predicted density statistic value corresponding to an image pair; the image pair consists of a standard image block and a predicted image block that has an image position correspondence with the standard image block, the standard image block is obtained by dividing the standard density map corresponding to a training sample image, and the predicted image block is obtained by dividing the predicted density map, the predicted density map being obtained by inputting the training sample image into the object density determination model to be trained; and acquiring the object density map, corresponding to the target image, output by the object density determination model.
  • An object density determination device, comprising: an image acquisition module for acquiring a target image whose density is to be determined; a density determination module for inputting the target image into a trained object density determination model and performing object density determination through the model, where the object density determination model is obtained by adjusting the parameters of an object density determination model to be trained based on the difference between the standard density statistic value and the predicted density statistic value corresponding to an image pair, the image pair is composed of a standard image block and a predicted image block that has an image position correspondence with the standard image block, the standard image block is obtained by dividing the standard density map corresponding to a training sample image, and the predicted image block is obtained by dividing the predicted density map, which is obtained by inputting the training sample image into the object density determination model to be trained; and a density map acquisition module for acquiring the object density map, corresponding to the target image, output by the object density determination model.
  • A computer device comprising a memory and a processor, the memory storing computer-readable instructions which, when executed by the processor, cause the processor to perform the steps of the above object density determination method.
  • One or more non-volatile readable storage media storing computer readable instructions which, when executed by one or more processors, cause the processors to perform the steps of the above object density determination method.
  • a computer program product comprising computer readable instructions that, when executed by a processor, implement the steps of the above object density determination method.
  • FIG. 1 is an application environment diagram of the object density determination method in some embodiments.
  • FIG. 2 is a schematic flowchart of an object density determination method in some embodiments.
  • FIG. 3 is a schematic structural diagram of an object density determination model in some embodiments.
  • FIG. 5 is a schematic diagram of image position correspondence in some embodiments.
  • FIG. 6 is a schematic flowchart of a step of determining the loss value weight of an image pair loss value in some embodiments.
  • FIG. 7 is a schematic flowchart of an object density determination method in other embodiments.
  • FIG. 8 is a schematic diagram of Gaussian kernels at two different human head sizes in some embodiments.
  • FIG. 9 is a schematic diagram of an application of the object density determination method in some embodiments.
  • FIG. 10 is a structural block diagram of an object density determination apparatus in some embodiments.
  • FIG. 11 is a structural block diagram of an object density determination apparatus in other embodiments.
  • FIG. 12 is a diagram of the internal structure of a computer device in some embodiments.
  • The standard density map and the predicted density map obtained from the same training sample image have the same size, and the image position correspondence described below exists between them.
  • The object density determination model provided by the embodiments of the present application can be applied to cloud services based on artificial intelligence.
  • For example, the object density determination model can be deployed in a cloud server; the cloud server obtains the target image whose density is to be determined, determines the object density map corresponding to the target image based on the model, and returns it to the terminal for display.
  • The training sample image and the object density map generated by the object density determination model can be saved on a blockchain.
  • The blockchain can generate a query code for the saved training sample image and object density map respectively, and return the generated query codes to the terminal.
  • Through a query code, the training sample image and the corresponding object density map can be queried.
  • the object density determination method provided in this application can be applied to the application environment shown in FIG. 1 .
  • the terminal 102 and the camera device 106 respectively communicate with the server 104 through the network.
  • the network may be a wired network or a wireless network, and the wireless network may be any one of a local area network, a metropolitan area network, and a wide area network.
  • the terminal 102 may be a smart phone, a tablet computer, a notebook computer, a desktop computer, a smart speaker, a smart watch, etc., but is not limited thereto.
  • The server 104 may be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, CDN, and big data and artificial intelligence platforms.
  • Camera device 106 may include one or more cameras.
  • a plurality of servers can be formed into a blockchain, and the servers are nodes on the blockchain.
  • In some embodiments, the server 104 trains the object density determination model to be trained through the acquired training sample images and, after obtaining the trained object density determination model, deploys it; the server 104 may then receive images captured by the camera device 106 and determine their object density maps through the deployed model.
  • In other embodiments, the server 104 trains the object density determination model to be trained through the acquired training sample images and obtains the trained object density determination model, and then sends the trained model to the terminal 102 in a wired or wireless manner; the terminal 102 receives the trained object density determination model and deploys it.
  • After deployment, the terminal can use the object density determination model to process image data to realize object density determination.
  • In some embodiments, an object density determination method is provided. The method can be applied to a computer device; the computer device may be the terminal or the server in FIG. 1, or an interactive system composed of a terminal and a server. The method specifically includes the following steps:
  • Step 202: Obtain a training sample image and the standard density map corresponding to the training sample image.
  • the training sample images refer to images used for supervised training of the object density determination model to be trained.
  • One or more target objects are included in the training sample images.
  • the target object may specifically be an independent living body or object, such as a natural person, an animal, a vehicle, a virtual character, etc., or a specific part, such as a head, a hand, and the like.
  • Since the training is supervised, there is a corresponding standard density map for each training sample image.
  • the standard density map is a density map that truly reflects the object density of the training sample images, and is a density map that supervises model training.
  • the standard density map corresponding to the training sample image may be a density map determined according to the object position points in the training sample image.
  • the density map reflects the number of objects in each position of the image. For example, the crowd density map can reflect the average number of people in the corresponding position of the unit pixel in the actual scene.
  • the density map can determine the total number of target objects in the image.
  • the computer device may acquire an image marked with an object position point of the target object as a training sample image, and the object position point may specifically be the position center point of the target object.
  • the object position point of the target object is the center point of the human head.
  • For example, a computer device can acquire an image containing one or more target objects by photographing a scene containing one or more target objects, and the image containing the target objects can be used as a training sample image after the object position points are manually marked.
  • the computer device can also obtain images including one or more target objects and marked object position points from a third-party computer device in a wired or wireless manner as a training sample image.
  • After acquiring the training sample image, the computer device determines an object response map corresponding to the training sample image according to the object position points corresponding to the training sample image, and obtains the standard density map corresponding to the training sample image according to the object response map.
  • the computer device may also directly acquire the image for which the standard density map has been determined as a training sample image.
  • the computer device may obtain images for which a standard density map has been determined from a public database of a third party as a training sample image.
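The construction of a standard density map from annotated object position points can be sketched as follows. This is a minimal illustration that places one unit-mass Gaussian at each position point; the function names, fixed kernel size, and sigma are illustrative choices, not details taken from the application (which, per FIG. 8, may vary the kernel with head size).

```python
import numpy as np

def gaussian_kernel(size, sigma):
    """Normalized 2-D Gaussian kernel whose values sum to 1."""
    ax = np.arange(size) - (size - 1) / 2.0
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    return k / k.sum()

def make_standard_density_map(shape, points, size=15, sigma=4.0):
    """Place one unit-mass Gaussian at every annotated object position.

    shape  -- (height, width) of the training sample image
    points -- iterable of (row, col) object position points
    The integral of the returned map equals the number of annotated
    objects (up to mass clipped at the image border).
    """
    h, w = shape
    density = np.zeros((h, w), dtype=np.float64)
    k = gaussian_kernel(size, sigma)
    r = size // 2
    for (y, x) in points:
        top, left = y - r, x - r
        # Clip the kernel where it extends past the image border.
        kt, kl = max(0, -top), max(0, -left)
        kb = size - max(0, top + size - h)
        kr = size - max(0, left + size - w)
        density[max(0, top):top + kb, max(0, left):left + kr] += k[kt:kb, kl:kr]
    return density
```

Because each kernel integrates to one, summing the resulting map over any region approximates the object count in that region, which is what the later block-wise statistics rely on.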
  • Step 204: Input the training sample image into the object density determination model to be trained, and obtain the predicted density map output by the object density determination model.
  • the object density determination model to be trained refers to an object density determination model that needs to be trained to determine model parameters.
  • the object density determination model is a machine learning model for determining the density of target objects in an image.
  • the object density determination model may employ a deep learning model comprising multiple convolutional neural networks.
  • The computer device inputs the training sample image into the object density determination model to be trained; the model predicts the object density in the training sample image to obtain a predicted density map, and the computer device obtains the predicted density map output by the model. It can be understood that both the standard density map and the predicted density map are obtained based on the same training sample image, so the standard density map and the predicted density map can be images of the same size.
  • In some embodiments, the object density determination model includes an encoding layer, a decoding layer, and a prediction layer. Inputting the training sample image into the object density determination model to be trained and obtaining the predicted density map output by the model includes: inputting the training sample image into the encoding layer and performing down-sampling through the encoding layer to obtain a first target feature; inputting the first target feature into the decoding layer and performing up-sampling through the decoding layer to obtain a second target feature; and finally inputting the second target feature into the prediction layer, where density prediction is performed to obtain the predicted density map.
  • The encoding layer and decoding layer can use neural networks of the VGGNet (Visual Geometry Group, Oxford University Computer Vision Group) series, the ResNet (residual network) series, and the like.
  • VGGNet is a deep convolutional neural network developed by the Visual Geometry Group of Oxford University together with researchers from Google DeepMind; it is composed of five convolutional stages, three fully connected layers, and a softmax output layer, with the stages separated by max-pooling, and the activation units of all hidden layers use the ReLU function.
  • The ResNet series of neural networks are neural networks constructed from residual blocks.
  • High-level semantic information of the training sample image can be extracted by down-sampling the training sample image through the encoding layer; the obtained first target feature is a low-resolution feature map carrying high-level semantic information.
  • Through up-sampling in the decoding layer, the high-level semantic information is restored to higher resolution; the final second target feature is a high-resolution feature map carrying high-level semantic information.
  • In some embodiments, the encoding layer and the decoding layer adopt skip links; the encoding layer includes a plurality of first convolutional layers, and the decoding layer includes a plurality of second convolutional layers. Inputting the training sample image into the encoding layer and performing down-sampling through the encoding layer to obtain the first target feature includes: in the encoding layer, down-sampling the intermediate feature output by the previous first convolutional layer through the current first convolutional layer, and obtaining the output of the last first convolutional layer as the first target feature. Inputting the first target feature into the decoding layer and performing up-sampling through the decoding layer to obtain the second target feature includes: in the decoding layer, up-sampling through the current second convolutional layer according to the intermediate feature output by the previous second convolutional layer and the intermediate feature output by the skip-connected first convolutional layer, and obtaining the output of the last second convolutional layer as the second target feature.
  • In this way, the features output by earlier convolutional layers can be integrated with the features output by later convolutional layers as the input of a convolutional layer, so that the input features of that convolutional layer include both the contextual features with high-level semantic information obtained by the layer-by-layer convolution process and local detail information, and the extracted features are more complete and accurate.
  • For example, the encoding layer includes five first convolutional layers connected end to end; each first convolutional layer performs convolution on the intermediate feature output by the previous layer to realize down-sampling, outputting five features in turn, namely V1, V2, V3, V4, V5. The output V5 of the last first convolutional layer is obtained as the first target feature and input into the decoding layer. The decoding layer includes five second convolutional layers connected end to end; each second convolutional layer performs up-sampling according to the intermediate feature output by the previous second convolutional layer and the intermediate feature output by the skip-connected first convolutional layer, outputting five features in turn, namely P5, P4, P3, P2, P1. The feature P1 output by the last second convolutional layer is obtained as the second target feature and input to the prediction layer.
  • In the prediction layer, three parallel convolutional layers respectively perform convolution on the second target feature; the output of each convolutional layer is channel-wise concatenated with the second target feature, and the result is then processed through a further convolutional layer. The final output of the prediction layer is the predicted density map.
  • FIG. 4 is a specific schematic diagram of the skip connection. The output feature obtained by up-sampling the input feature P i+1 through the second convolutional layer and the intermediate feature output by the skip-connected first convolutional layer are first channel-concatenated to obtain an intermediate feature, which is then fused through a convolutional layer to obtain P i as the input feature of the next second convolutional layer.
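The skip-connected fusion step of FIG. 4 can be sketched as follows. This is a toy numpy illustration (nearest-neighbour upsampling and a 1x1 fusion convolution are simplifying assumptions; the actual layer types, shapes, and weights are not specified at this level of detail in the application).

```python
import numpy as np

def upsample2x(x):
    """Nearest-neighbour 2x upsampling of a (C, H, W) feature map."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def conv1x1(x, weight):
    """1x1 convolution: weight has shape (C_out, C_in)."""
    c, h, w = x.shape
    return np.tensordot(weight, x.reshape(c, h * w), axes=1).reshape(-1, h, w)

def skip_fuse(p_next, v_skip, weight):
    """One decoding step with a skip link: upsample P_{i+1},
    channel-concatenate it with the intermediate feature V_i from the
    skip-connected first convolutional layer, then fuse the result
    through a convolutional layer to obtain P_i."""
    up = upsample2x(p_next)                     # (C1, 2H, 2W)
    cat = np.concatenate([up, v_skip], axis=0)  # channel-wise concatenation
    return conv1x1(cat, weight)                 # fusion convolution

# Hypothetical shapes: P_{i+1} is (8, 4, 4), V_i is (16, 8, 8).
p_next = np.random.rand(8, 4, 4)
v_skip = np.random.rand(16, 8, 8)
weight = np.random.rand(16, 24)                 # 24 = 8 + 16 input channels
p_i = skip_fuse(p_next, v_skip, weight)         # (16, 8, 8), input to next layer
```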
  • Step 206: Divide the standard density map and the predicted density map respectively to obtain multiple standard image blocks corresponding to the standard density map and multiple predicted image blocks corresponding to the predicted density map.
  • the computer device divides the standard density map to obtain multiple standard image blocks, and divides the predicted density map to obtain multiple predicted image blocks.
  • The division here refers to dividing the pixels of the image into areas.
  • In some embodiments, the computer device may first divide the standard density map to obtain standard image blocks, and then divide the predicted density map according to the position of at least one standard image block in the standard density map to obtain the predicted image block that has an image position correspondence with that standard image block.
  • For example, the computer device can, based on the location of each pixel in standard image block A, divide the predicted density map so that the pixels in the predicted density map at the same positions as the pixels in image block A are divided into the same area, obtaining the predicted image block corresponding to image block A.
  • the computer device may divide the standard density map and the predicted density map respectively by using the same image block division method, so that the number, position and size of the predicted image blocks and the standard image blocks are matched.
  • For example, the computer device may acquire a sliding window, slide it on the standard density map according to a preset sliding method and take the image area within the window as a standard image block, and then slide the same window on the predicted density map according to the same preset sliding method and take the image area within the window as a predicted image block, so that standard image blocks and predicted image blocks of the same size and quantity, with image positions corresponding one to one, are obtained.
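The sliding-window division above can be sketched as follows (a minimal non-overlapping illustration; the window size and stride are hypothetical choices, since the application does not fix them at this point):

```python
import numpy as np

def divide_into_blocks(density_map, win=8, stride=8):
    """Slide a win x win window over the map according to a preset
    sliding method, returning the blocks in sliding order. Applying
    the same window and sliding method to the standard and predicted
    density maps yields blocks whose equal indices form image pairs."""
    h, w = density_map.shape
    blocks = []
    for top in range(0, h - win + 1, stride):
        for left in range(0, w - win + 1, stride):
            blocks.append(density_map[top:top + win, left:left + win])
    return blocks

# Same division applied to both maps of the same size.
standard = np.random.rand(32, 32)
predicted = np.random.rand(32, 32)
std_blocks = divide_into_blocks(standard)
pred_blocks = divide_into_blocks(predicted)  # same count, positions, sizes
```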
  • Step 208: Count the density of objects in the standard image block to obtain a standard density statistic value corresponding to the standard image block, and perform statistics on the object density in the predicted image block to obtain a predicted density statistic value corresponding to the predicted image block.
  • the object density refers to the density value of each pixel in the image block, and the pixel density value is used to represent the density of the object at the location of the pixel.
  • Counting the density of objects in an image block refers to representing the density values of all pixels in the image block with one statistical value, such as the cumulative sum of the density values of all pixels in the image block, their average density value, or their median density value.
  • After obtaining the standard image blocks and the predicted image blocks, the computer device performs statistics on the object density in each standard image block to obtain the standard density statistic value corresponding to that standard image block, and performs statistics on the object density in each predicted image block in the same way to obtain the predicted density statistic value corresponding to that predicted image block. For example, if the object densities in the standard image block are accumulated to obtain the standard density statistic value, then the object densities in the predicted image block are also accumulated to obtain the predicted density statistic value.
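The block statistics can be sketched as follows (the statistic choices mirror the examples above; the function name and the toy block are illustrative):

```python
import numpy as np

def block_statistic(block, mode="sum"):
    """Represent the density values of all pixels in a block with one
    statistic. The cumulative sum of a density-map block approximates
    the number of objects inside that block."""
    if mode == "sum":
        return float(block.sum())
    if mode == "mean":
        return float(block.mean())
    if mode == "median":
        return float(np.median(block))
    raise ValueError(f"unknown statistic mode: {mode}")

# Toy 8x8 block with a uniform density of 0.01 per pixel.
block = np.full((8, 8), 0.01)
sum_stat = block_statistic(block, "sum")      # ~0.64 objects in the block
mean_stat = block_statistic(block, "mean")
```

Whichever statistic is chosen, the same one must be applied to the standard block and the predicted block of an image pair so that their difference is meaningful.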
  • Step 210: Form each standard image block and the predicted image block that has an image position correspondence with it into an image pair, and, based on the difference between the standard density statistic value and the predicted density statistic value corresponding to the image pair, adjust the parameters of the object density determination model to be trained to obtain a trained object density determination model, the trained object density determination model being used to generate an object density map.
  • The image position correspondence between a standard image block and a predicted image block means that the position of the standard image block in the standard density map corresponds to the position of the predicted image block in the predicted density map; that is, for each pixel in the standard image block there is a pixel at the same position in the corresponding predicted image block.
  • the standard density statistic value corresponding to the image pair refers to the standard density statistic value corresponding to the standard image block in the image pair.
  • the predicted density statistic value corresponding to the image pair refers to the predicted density statistic value corresponding to the predicted image block in the image pair.
  • An example of the image position correspondence is shown in FIG. 5, in which the dotted arrows indicate the correspondence.
  • There is an image position correspondence between standard image block A1 and predicted image block B1, between standard image block A2 and predicted image block B2, between standard image block A3 and predicted image block B3, and between standard image block A4 and predicted image block B4; that is, the positions in the image of the standard image block and the predicted image block constituting an image pair are consistent.
  • Specifically, for each standard image block, the computer device determines, from the multiple predicted image blocks obtained by division, the predicted image block that has an image position correspondence with it, and forms the two into an image pair. Based on the difference between the standard density statistic value and the predicted density statistic value corresponding to the image pair, the computer device can obtain the image pair loss value corresponding to the image pair; by performing statistics on the image pair loss values, the target loss value can be obtained, and based on the target loss value the computer device can adjust the parameters of the object density determination model to be trained to obtain the trained object density determination model.
  • Using the object density determination model to generate an object density map means that the model can output the object density values corresponding to each position in the image, such as the number of people corresponding to each position.
  • Different forms can be used as required to reflect the object density value corresponding to each position in the object density map; for example, the color corresponding to each object density value can be determined and the object density map displayed in the form of a heat map.
  • the computer device when dividing the standard density map and the predicted density map, slides the same sliding window on the standard density map and the predicted density map in the same sliding manner, so as to obtain the multi-point density map corresponding to the standard density map.
  • standard image blocks and multiple predicted image blocks corresponding to the predicted density map then when determining the image position correspondence, the computer device can determine the corresponding relationship according to the sliding sequence, number the obtained standard image blocks based on the sliding sequence, and based on the sliding sequence
  • the obtained predicted image blocks are sequentially numbered, and two image blocks with the same number are determined as image blocks with a corresponding relationship of image positions, and the two image blocks are formed into an image pair.
  • The parameters of the object density determination model to be trained are adjusted to obtain the trained object density determination model.
  • The computer device may generate an object density map through the object density determination model. Specifically, the target image whose density is to be determined is input into the trained object density determination model, the object density is determined by the model, and the object density map corresponding to the target image output by the model is obtained.
  • the computer device may integrate the object density map to obtain the total number of target objects in the target image.
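As a concrete illustration of this integration step, the following sketch (the function name is illustrative, not from the patent) sums a density map to estimate the total object count:

```python
import numpy as np

def count_from_density_map(density_map: np.ndarray) -> float:
    """Integrate (sum) a predicted object density map to estimate the
    total number of objects in the image.  Each pixel of the density
    map holds a fractional object count, so the integral over the
    whole map is simply the sum of all pixel values."""
    return float(density_map.sum())

# A toy 4x4 density map whose values sum to 3.0, i.e. three objects.
toy_map = np.full((4, 4), 3.0 / 16)
print(count_from_density_map(toy_map))  # 3.0
```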
  • The standard density map and the predicted density map are divided respectively to obtain multiple standard image blocks corresponding to the standard density map and multiple predicted image blocks corresponding to the predicted density map. The object density in each standard image block is counted to obtain the standard density statistic value corresponding to the standard image block, and the object density in each predicted image block is counted to obtain the predicted density statistic value corresponding to the predicted image block. During training, a standard image block and the predicted image block that has an image position correspondence with it form an image pair, and the parameters of the object density determination model to be trained are adjusted based on the difference between the standard density statistic value and the predicted density statistic value corresponding to the image pair. In this way, the density value of a local area can be fitted in units of image blocks, the overall density value of the local area is comprehensively considered, and the accuracy of the trained object density determination model in determining object density is improved.
  • Adjusting the parameters of the object density determination model to be trained based on the difference between the standard density statistic value and the predicted density statistic value corresponding to the image pair, and obtaining the trained object density determination model, includes: obtaining the image pair loss value corresponding to the image pair based on the difference between the standard density statistic value and the predicted density statistic value corresponding to the image pair; performing statistics on the image pair loss values to obtain the target loss value; and adjusting the parameters of the object density determination model to be trained based on the target loss value to obtain the trained object density determination model.
  • Each standard image block among the multiple standard image blocks obtained by dividing the standard density map has, in the predicted density map, a predicted image block with an image position correspondence to it.
  • The object density in each standard image block in the standard density map is counted to obtain the standard density statistic value corresponding to each standard image block, and the standard density statistic value is used to replace the density values of the area where the standard image block is located, which is equivalent to obtaining a standard local count map corresponding to the standard density map. Similarly, the object density in each predicted image block in the predicted density map is counted to obtain the predicted density statistic value corresponding to each predicted image block, and the predicted density statistic value replaces the density values of the area where the predicted image block is located, so that a predicted local count map corresponding to the predicted density map is obtained. Image blocks with an image position correspondence in the standard local count map and the predicted local count map form image pairs, and the image pair loss value corresponding to each image pair is obtained based on the difference between the standard density statistic value and the predicted density statistic value corresponding to the image pair. During training, the computer device can perform statistics on the image pair loss values of all image pairs to obtain the target loss value between the standard local count map and the predicted local count map. Finally, the computer device can back-propagate the target loss value to the object density determination model and adjust the model parameters of the object density determination model through a gradient descent algorithm until a stopping condition is satisfied, thereby obtaining the trained object density determination model.
  • The gradient descent algorithm includes but is not limited to the stochastic gradient descent algorithm, Adagrad (Adaptive Gradient), Adadelta (an improvement of Adagrad), RMSprop (an improvement of Adagrad), and so on.
  • the computer device may construct a loss function based on the difference between the standard density statistic value corresponding to the image pair and the predicted density statistic value, and obtain the image pair loss value corresponding to the image pair based on the loss function.
  • The loss function can be one of the cross entropy (Cross Entropy) loss function, the MSE (mean square error) loss function, and so on.
  • The computer device performs statistics on the image pair loss values to obtain the target loss value, specifically by summing the respective image pair loss values of all the image pairs. In some other embodiments, the computer device performs statistics on the image pair loss values to obtain the target loss value by averaging the respective image pair loss values of all the image pairs.
  • The target loss value is obtained by performing statistics on the image pair loss values, and the parameters of the object density determination model to be trained are adjusted based on the target loss value to obtain the trained object density determination model, which can avoid, to the greatest extent, training errors caused by fitting density values pixel by pixel.
  • Obtaining the image pair loss value corresponding to the image pair includes: shrinking the standard density statistic value corresponding to the image pair according to a target shrinking method to obtain the shrunk standard density statistic value, where the shrinkage amplitude corresponding to the target shrinking method is positively correlated with the size of the value to be shrunk; shrinking the predicted density statistic value corresponding to the image pair according to the target shrinking method to obtain the shrunk predicted density statistic value; and obtaining the image pair loss value corresponding to the image pair based on the difference between the shrunk standard density statistic value and the shrunk predicted density statistic value, where the image pair loss value is positively correlated with the difference.
  • the target shrinking method refers to a mathematical operation method that can shrink the numerical value to reduce the numerical value.
  • The shrinkage amplitude corresponding to the target shrinking method is positively correlated with the value to be shrunk; that is, the larger the value to be shrunk, the greater the shrinkage amplitude, and conversely, the smaller the value to be shrunk, the smaller the shrinkage amplitude.
  • the to-be-shrinked value in the embodiment of the present application refers to a standard density statistic value or a predicted density statistic value.
  • Shrinkage amplitude refers to the difference between the value after shrinkage and the value before shrinkage.
  • The computer device can shrink the standard density statistic value corresponding to the image pair according to the target shrinking method to obtain the shrunk standard density statistic value, and shrink the predicted density statistic value corresponding to the image pair according to the target shrinking method to obtain the shrunk predicted density statistic value. The computer device can further subtract the shrunk standard density statistic value from the shrunk predicted density statistic value: when the obtained difference value is greater than 0, the difference value is used as the image pair loss value corresponding to the image pair; when the obtained difference value is less than 0, the absolute value of the difference value is used as the image pair loss value corresponding to the image pair.
  • the image pair loss value is positively correlated with the difference value.
  • The difference value here refers to the absolute difference value: the larger the absolute difference value, the larger the image pair loss value; conversely, the smaller the absolute difference value, the smaller the image pair loss value.
  • Shrinking the standard density statistic value corresponding to the image pair according to the target shrinking method to obtain the shrunk standard density statistic value includes: taking a preset value as the base, performing a logarithmic transformation with the standard density statistic value as the antilogarithm, and using the obtained logarithm as the shrunk standard density statistic value, where the preset value is greater than 1. Shrinking the predicted density statistic value corresponding to the image pair according to the target shrinking method to obtain the shrunk predicted density statistic value includes: taking the preset value as the base, performing a logarithmic transformation with the predicted density statistic value as the antilogarithm, and using the obtained logarithm as the shrunk predicted density statistic value.
  • The computer device can obtain the image pair loss value corresponding to the image pair according to the difference between log_a N and log_a M, where N and M denote the standard density statistic value and the predicted density statistic value respectively, and a is the preset base.
  • the preset value is greater than 1, for example, it may be e.
  • For a local area containing no target objects, the density statistic value of this area in the standard density map and the predicted density map may be 0.
  • Since the logarithm of 0 is undefined, a constant deviation can be added to each density statistic value. The constant deviation can be set as required, for example, 1e-3 (i.e., 0.001), and the logarithmic transformation is then performed according to the method in the above embodiment. The specific calculation method of the image pair loss value refers to the following formula (1), where pred is the predicted density statistic value corresponding to the predicted image block in a certain image pair, gt is the standard density statistic value corresponding to the standard image block in the image pair, Loss is the image pair loss value, log refers to the logarithmic transformation, and the base of log can be a number greater than 1, such as e:

    Loss = |log(pred) - log(gt)|    (1)
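The log-shrinkage loss of formula (1) can be sketched as follows. The function name is illustrative, the natural logarithm is one admissible base, and the 1e-3 constant deviation follows the description above:

```python
import math

EPS = 1e-3  # the constant deviation mentioned above, to avoid log(0)

def image_pair_loss(pred: float, gt: float) -> float:
    """Loss for one image pair: absolute difference of the
    log-shrunk (natural log) density statistic values."""
    return abs(math.log(pred + EPS) - math.log(gt + EPS))

# A large raw deviation in a high-density block shrinks after the log:
print(image_pair_loss(80.0, 100.0))   # ~0.223
print(image_pair_loss(0.8, 1.0))      # also ~0.223 -- the relative error matters
```

Note how the raw deviation of 20 in the high-density pair and the deviation of 0.2 in the low-density pair produce nearly the same loss, which is exactly the gradient-flattening effect described in the following paragraphs.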
  • The standard density statistic value and the predicted density statistic value corresponding to the image pair are respectively shrunk according to the target shrinking method, and the image pair loss value is obtained according to the difference between the shrunk standard density statistic value and the shrunk predicted density statistic value. In this way, the prediction deviation of hard-to-predict samples (i.e., image blocks in high-density areas) is reduced before back-propagation is performed, so the gradient is also reduced accordingly. Since image blocks in high-density areas are likely to be mislabeled samples, this helps weaken the excessive gradients caused by some erroneous samples and highlights the gradients of useful samples, which is conducive to the optimization of the model parameters during training.
  • Performing statistics on the image pair loss values to obtain the target loss value includes: determining a loss value weight for each image pair loss value according to the standard density statistic value corresponding to the image pair, where the loss value weight is negatively correlated with the standard density statistic value; and performing a weighted summation of the image pair loss values based on the loss value weights to obtain the target loss value.
  • The computer device can determine the loss value weight of the image pair loss value according to the standard density statistic value corresponding to the image pair, where the loss value weight is negatively correlated with the standard density statistic value; that is, the larger the standard density statistic value, the smaller the loss value weight, and the smaller the standard density statistic value, the greater the loss value weight.
  • For example, a preset threshold value Y may be set. When the computer device determines that the standard density statistic value corresponding to the image pair is greater than Y, a smaller loss value weight a is determined for the image pair loss value corresponding to the image pair; otherwise, the computer device determines a larger loss value weight b for the image pair loss value corresponding to the image pair, where b is greater than a.
  • determining the loss value weight of the loss value of the image pair according to the standard density statistic value corresponding to the image pair includes:
  • Step 602: Perform density interval division on the standard density statistic values to obtain a plurality of density intervals.
  • Suppose N standard image blocks are obtained by dividing the standard density map, among whose standard density statistic values the minimum value is a. The standard density statistic values can be divided into K (K ≥ 2) density intervals, where K can be specified as needed; for example, K can be 4. The statistical value range of the i-th (1 ≤ i ≤ K) density interval is shown in the following formula (2):
  • Step 604: For each density interval, acquire the number of standard image blocks whose standard density statistic values are in the density interval.
  • Step 606: Determine the loss value weight of the image pair loss value corresponding to the standard image block based on the number of image blocks in the density interval corresponding to the standard image block, where the number of image blocks is positively correlated with the loss value weight.
  • For each density interval i, the computer device counts the number n_i of standard image blocks whose standard density statistic values fall within the density interval.
  • The computer device may calculate the ratio p_i of the number of image blocks n_i in the density interval i to the total number N of standard image blocks with reference to the following formula (3):

    p_i = n_i / N    (3)
  • the loss value weight of the image pair loss value corresponding to the standard image block in the density interval is determined according to the ratio.
  • The computer device may directly determine the ratio as the loss value weight of the image pair loss values corresponding to the standard image blocks within the density interval i.
  • Alternatively, the computer device can calculate the loss value weight corresponding to the standard image blocks in the density interval i with reference to the following formula (4):
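A hedged sketch of the interval-based weighting in steps 602 to 606: since formulas (2) and (4) are not reproduced here, the equal-width interval split and the use of the ratio p_i = n_i / N from formula (3) as the weight are assumptions for illustration.

```python
import numpy as np

def interval_loss_weights(std_stats: np.ndarray, k: int = 4) -> np.ndarray:
    """For each standard image block, weight its image-pair loss by the
    fraction p_i = n_i / N of blocks whose standard density statistic
    falls in the same density interval.  Rare, high-density intervals
    thus receive smaller weights."""
    lo, hi = std_stats.min(), std_stats.max()
    # k equal-width intervals over [lo, hi] (an assumption; the exact
    # split of formula (2) is not reproduced in this extraction).
    edges = np.linspace(lo, hi, k + 1)
    idx = np.clip(np.searchsorted(edges, std_stats, side="right") - 1, 0, k - 1)
    counts = np.bincount(idx, minlength=k)   # n_i per interval
    return counts[idx] / len(std_stats)      # p_i for each block

stats = np.array([0.1, 0.2, 0.15, 0.12, 5.0])   # one rare high-density block
print(interval_loss_weights(stats))  # the last block gets the smallest weight
```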
  • performing statistics on the image pair loss value to obtain the target loss value includes: attenuating the image pair loss value according to the target attenuation mode to obtain the attenuated image pair loss value, wherein the attenuation corresponding to the target attenuation mode The magnitude is positively correlated with the image pair loss value; the attenuated image pair loss value is summed to obtain the target loss value.
  • the target attenuation method refers to a method that can reduce the loss value of the image pair.
  • the attenuation amplitude corresponding to the target attenuation mode is positively correlated with the loss value of the image pair, that is, the larger the loss value of the image pair, the greater the attenuation amplitude; on the contrary, the smaller the loss value of the image pair, the smaller the attenuation amplitude.
  • the attenuation magnitude refers to the difference between the image pair loss value before attenuation and the image pair loss value after attenuation.
  • When training the object density determination model, the computer device can attenuate the image pair loss values according to the target attenuation method to obtain the attenuated image pair loss values, and perform a sum operation on the attenuated image pair loss values to obtain the target loss value.
  • In some embodiments, the image pair loss values of all image pairs may be sorted, a preset number (e.g., 10%) of the image pair loss values with the largest values may be selected according to the sorting result, and these image pair loss values are set to 0, so that samples that may be mislabeled are filtered out during training, thereby stabilizing the training process of the network.
  • For example, with 100 image pairs, the computer device can sort the 100 image pair loss values in descending order, select the loss values of the top 10 image pairs, and set these image pair loss values directly to 0.
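The top-loss filtering described above can be sketched as follows (the function name is illustrative):

```python
import numpy as np

def zero_top_losses(pair_losses: np.ndarray, frac: float = 0.1) -> np.ndarray:
    """Set the largest `frac` of image-pair loss values to 0, filtering
    out likely mislabeled samples before summing the target loss."""
    out = pair_losses.copy()
    k = int(len(out) * frac)
    if k > 0:
        top = np.argsort(out)[-k:]   # indices of the k largest losses
        out[top] = 0.0
    return out

losses = np.arange(1.0, 11.0)        # 10 pair losses: 1.0 .. 10.0
print(zero_top_losses(losses).sum()) # 45.0 -- the largest loss (10.0) is zeroed
```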
  • The computer device may also obtain a preset exponential function and weight the image pair loss values by the exponential function, where the value of the exponential function is negatively correlated with the image pair loss value; that is, the larger the image pair loss value, the smaller the value of the exponential function.
  • the exponential function may be, for example, e -x , where x is the image pair loss value, and xe -x is the attenuated image pair loss value.
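A minimal sketch of this exponential attenuation, x * e^(-x), where x is the image pair loss value:

```python
import math

def attenuate(loss: float) -> float:
    """Weight an image-pair loss by e**(-loss): the larger the loss,
    the stronger the attenuation.  The product loss * exp(-loss) peaks
    at loss = 1 and then decays, so outlier losses contribute a
    vanishing amount to the target loss."""
    return loss * math.exp(-loss)

for x in (0.5, 1.0, 4.0, 10.0):
    print(x, "->", attenuate(x))   # large losses are suppressed sharply
```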
  • The computer device attenuates the image pair loss values according to the target attenuation method to obtain the attenuated image pair loss values, performs a sum operation on the attenuated image pair loss values to obtain the target loss value, and back-propagates the target loss value to adjust the model parameters of the object density determination model. Since the small fraction of samples with the largest image pair loss values is suppressed by attenuation, the gradient information brought by useful samples can be highlighted; because the proportion of useful gradient information from correctly labeled samples becomes higher, this is more helpful for training the model.
  • Dividing the standard density map and the predicted density map respectively to obtain multiple standard image blocks corresponding to the standard density map and multiple predicted image blocks corresponding to the predicted density map includes: acquiring a sliding window; sliding the sliding window on the standard density map according to a preset sliding method, and taking the image areas in the sliding window as the standard image blocks; and sliding the sliding window on the predicted density map according to the preset sliding method, and taking the image areas in the sliding window as the predicted image blocks.
  • the size of the sliding window can be determined according to needs, for example, it can be determined according to the size of the training sample image.
  • the sizes of multiple sliding windows can be the same or different.
  • The preset sliding method refers to determining a sliding starting point in the training sample image and sliding in a certain order so as to traverse the entire training sample image.
  • The computer device slides the sliding window on the standard density map according to the preset sliding method, and the image area in the sliding window is used as a standard image block; the sliding window is likewise slid on the predicted density map, and the image area within the sliding window is used as a predicted image block.
  • In order to improve sliding efficiency, when sliding the sliding window on an image, the sliding window can be slid without overlapping.
  • Non-overlapping means that there are no overlapping pixels between two adjacent image blocks obtained by sliding.
  • For example, suppose the size of the standard density map is 128*128. If a sliding window of size 4*4 is slid non-overlapping on the standard density map, 1024 standard image blocks of size 4*4 can be obtained; a sliding window of size 8*8 yields 256 standard image blocks of size 8*8; a sliding window of size 16*16 yields 64 standard image blocks of size 16*16; and a sliding window of size 32*32 yields 16 standard image blocks of size 32*32.
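For non-overlapping windows, the per-block density statistic values (the "local count map" described earlier) can be computed with a simple reshape, as in this illustrative sketch:

```python
import numpy as np

def block_sums(density_map: np.ndarray, win: int) -> np.ndarray:
    """Slide a win x win window over a square density map without
    overlap and sum the density inside each window, yielding the
    per-block density statistic values (a local count map)."""
    h, w = density_map.shape
    assert h % win == 0 and w % win == 0, "map size must be divisible by window"
    # Reshape into (rows of blocks, win, cols of blocks, win) and sum
    # over the two within-block axes.
    return density_map.reshape(h // win, win, w // win, win).sum(axis=(1, 3))

dm = np.ones((128, 128))
print(block_sums(dm, 4).shape)    # (32, 32) -> 1024 blocks of size 4*4
print(block_sums(dm, 32).shape)   # (4, 4)   -> 16 blocks of size 32*32
```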
  • the object position point is used to represent the actual position of the target object in the training sample image.
  • the object position point may specifically be the center point of the object.
  • the center point of the object may specifically be the center point of the human head.
  • the object response map refers to the image obtained by responding to the position of the center point of the object, and the image is the same size as the training sample image.
  • In the object response map, the pixel value of an object position point is a first pixel value and the pixel value of a non-object position point is a second pixel value, where the first pixel value and the second pixel value are different pixel values, so that object position points and non-object position points can be distinguished in the object response map.
  • The first pixel value may be, for example, 1, and the second pixel value may be, for example, 0.
  • The computer device can respond to each object position point corresponding to the training sample image respectively to obtain a response map for each object position point, where each response map is the same size as the training sample image; all the response maps are then superimposed pixel by pixel to obtain the object response map corresponding to the training sample image. The computer device can further perform convolution processing on the object response map with a preset Gaussian kernel to obtain the standard density map corresponding to the training sample image.
  • Suppose the target object is a natural person and the training sample image is marked with N head center points x_1, x_2, ..., x_N.
  • Each head center point x_i can be expressed as an image δ(x − x_i) of the same size as the training sample image, that is, only the position x_i is 1 and the remaining positions are 0. The N heads can then be expressed as H(x), with reference to the following formula (5):

    H(x) = Σ_{i=1}^{N} δ(x − x_i)    (5)

  • The total number of people in the training sample image can be obtained by integrating H(x); H(x) is then convolved with a Gaussian kernel G_σ to obtain the standard density map D corresponding to the training sample image, with reference to the following formula (6):

    D(x) = H(x) * G_σ(x)    (6)
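Formulas (5) and (6) can be sketched as follows. The Gaussian standard deviation and kernel size are illustrative assumptions, and the convolution is written out explicitly rather than relying on a library routine:

```python
import numpy as np

def standard_density_map(shape, points, sigma=1.5, ksize=7):
    """Build the object response map H(x) of formula (5): 1 at each
    annotated head center, 0 elsewhere, then convolve with a Gaussian
    kernel G_sigma per formula (6) to obtain the standard density map D.
    The normalized kernel preserves the integral, so D.sum() equals the
    number of annotated points (away from image borders)."""
    h, w = shape
    H = np.zeros((h, w))
    for y, x in points:
        H[y, x] = 1.0
    # Normalized ksize x ksize Gaussian kernel.
    ax = np.arange(ksize) - ksize // 2
    g = np.exp(-(ax[:, None] ** 2 + ax[None, :] ** 2) / (2 * sigma ** 2))
    g /= g.sum()
    # Explicit same-size convolution with zero padding.
    pad = ksize // 2
    Hp = np.pad(H, pad)
    D = np.zeros_like(H)
    for i in range(h):
        for j in range(w):
            D[i, j] = (Hp[i:i + ksize, j:j + ksize] * g).sum()
    return D

D = standard_density_map((32, 32), [(8, 8), (20, 15), (16, 16)])
print(round(D.sum(), 6))   # 3.0 -- the integral equals the number of heads
```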
  • The computer device determines the object response map corresponding to the training sample image according to the object position points corresponding to the training sample image, and then performs convolution processing on the object response map to obtain the standard density map corresponding to the training sample image, which can eliminate the sparsity of the features in the object response map; the obtained standard density map is more conducive to the learning of the model.
  • A method for determining the density of objects is provided. The object density determination method can be applied to a computer device, which may be the terminal or the server in FIG. 1, or an interactive system composed of a terminal and a server. The method specifically includes the following steps:
  • Step 702: Acquire the target image whose density is to be determined.
  • The target image whose density is to be determined refers to the image for which the object density needs to be determined.
  • the target image contains one or more target objects.
  • the computer device may photograph a scene containing one or more target objects to obtain target images of the density to be determined.
  • the computer device can also acquire the target image whose density is to be determined from other computer devices through the network.
  • the target image can be the image of various scenes.
  • the target image may be an image for monitoring crowds in a target place, and the target place may be, for example, a subway, a shopping mall, or the like.
  • Step 704 Input the target image into the trained object density determination model, and determine the object density through the object density determination model.
  • The object density determination model is obtained by adjusting the parameters of the object density determination model to be trained based on the difference between the standard density statistic value and the predicted density statistic value corresponding to the image pair, where the image pair is composed of a standard image block and a predicted image block with an image position correspondence to the standard image block. The standard image block is obtained by dividing the standard density map corresponding to the training sample image, and the predicted image block is obtained by dividing the predicted density map, which is itself obtained by inputting the training sample image into the object density determination model to be trained.
  • Step 706 Obtain an object density map corresponding to the target image output by the object density determination model.
  • For the detailed description of steps 702 to 704, reference may be made to the foregoing embodiments, which will not be repeated in this application.
  • In the above object density determination method, the object density determination model is obtained by adjusting the parameters of the object density determination model to be trained based on the difference between the standard density statistic value and the predicted density statistic value corresponding to the image pair, where the image pair is composed of a standard image block and a predicted image block corresponding to the image position of the standard image block. The standard image block is obtained by dividing the standard density map corresponding to the training sample image, and the predicted image block is obtained by dividing the predicted density map, which is obtained by inputting the training sample image into the object density determination model to be trained. During training, the density value of a local area can be fitted in units of image blocks, which comprehensively considers the overall density value of the local area and improves the accuracy of the trained object density determination model in determining object density, so that when the target image is input into the trained object density determination model, the model can output an accurate object density map.
  • the computer device may integrate the object density map to determine the total number of target objects in the target image.
  • the computer device may display the object density map in the form of a heat map. In the displayed object density map, the darker the color, the denser the target object.
  • the object density determination method further includes a training step of the object density determination model, and the training step specifically includes: acquiring a training sample image and a standard density map corresponding to the training sample image; inputting the training sample image into the object to be trained In the density determination model, the predicted density map output by the object density determination model is obtained; the standard density map and the predicted density map are divided respectively, and multiple standard image blocks corresponding to the standard density map and multiple predicted image blocks corresponding to the predicted density map are obtained.
  • The standard image blocks and the predicted image blocks that have an image position correspondence with them are composed into image pairs, and based on the difference between the standard density statistic value and the predicted density statistic value corresponding to each image pair, the parameters of the object density determination model to be trained are adjusted to obtain the trained object density determination model.
  • the present application also provides an application scenario, where the above-mentioned object density determination method is applied to realize intelligent transportation.
  • The object density determination method provided by the embodiments of the present application can perform passenger flow statistics for any traffic location: crowd images captured at the traffic location are sent to the server, on which the trained crowd density determination model (i.e., the object density determination model in the above embodiments) is deployed.
  • the application of the object density determination method in this application scenario is as follows:
  • the object density determination model is obtained by pre-training on the server through the following steps:
  • The server obtains a training sample set in which the training sample images are marked with head center points, and obtains a crowd response map of the same size for each training sample image. In the crowd response map, the pixel value at each head center point is 1 and the pixel value at all other positions is 0. The server further uses a preset Gaussian kernel to perform convolution processing on the response map to obtain the standard density map corresponding to the training sample image.
  • the standard deviation of the Gaussian kernel here is manually specified or estimated. Therefore, for heads of different scales, the area covered by the Gaussian kernel is inconsistent.
  • FIG. 8 is a schematic diagram of the Gaussian kernel on human heads of two different sizes, in which the area covered by the Gaussian kernel in figure (a) is area 802 and the area covered by the Gaussian kernel in figure (b) is area 804. It can be clearly seen that the semantic information of these two areas is not identical.
  • The crowd density determination model is based on deep learning technology: it takes a single image as input and extracts image features through a deep convolutional network. Since the crowd density determination task requires both contextual features with high-level semantic information and local detailed information, a U-shaped network structure that first down-samples and then up-samples is usually used in order to obtain high-resolution feature maps containing both high-level semantic information and detailed information, and skip links are introduced to supply detailed information during up-sampling; finally, the output crowd density map is predicted. The network structure of the crowd density determination model is shown in FIG. 3.
  • the server may accumulate the crowd density values in the standard image block to obtain the standard density statistical value corresponding to the standard image block.
  • The server can likewise accumulate the crowd density values in each predicted image block to obtain the predicted density statistic value corresponding to the predicted image block. The model parameters of the crowd density determination model are adjusted until a convergence condition is met, and the trained crowd density determination model is obtained.
  • The server inputs a crowd image into the trained crowd density determination model, determines the density of the crowd image through the model to obtain the crowd density map corresponding to the crowd image, and performs integration based on the crowd density map to obtain the total number of people in the crowd image (the number of people in the image is counted based on head center points). The crowd density map and the total number of people are sent to the terminal, and the terminal can display the crowd density map in the form of a heat map.
  • the server can determine the object density on image (a) in FIG. 9 to obtain a crowd density image, and can also determine the total number of people in the crowd image, for example 208. The server sends the crowd density map to the terminal, and the terminal displays the degree of crowd density in the image, as shown in (b) in Figure 9.
  • besides the total number of people, 208, shown in figure (b), the density of people may differ between image areas; this can be displayed with different colors, and figure (b) uses different patterns instead of colors to indicate it.
  • the terminal can also generate prompt information to warn that the flow of people may be excessive.
  • the present application also provides another application scenario, where the above-mentioned object density determination method is applied to realize a smart supermarket.
  • in this application scenario, by obtaining the crowd density map of each target area of the supermarket, the terminal can count the flow of people in each area of the supermarket over a given period and generate a report of the statistical results for relevant personnel, so that the layout of the target areas can be adjusted to ease crowding in some areas.
  • the present application also provides another application scenario, where the above-mentioned object density determination method is applied to monitor the crowd density of tourist attractions.
  • the crowd density of various popular scenic spots in tourist attractions can be monitored.
  • the monitoring personnel can be prompted in the form of text or voice to improve the security of the target area.
  • the object density determination method provided by the embodiments of the present application can alleviate, from multiple perspectives, the problems existing in the related art when regressing artificially generated density maps.
  • first, the regression of the standard density map is transformed into the regression of density statistics; a logarithmic transformation of the density statistics then reduces the gradient generated by samples with large prediction deviations, and finally the gradient information of samples with large prediction errors is filtered out, stabilizing the optimization process of the network. After eliminating the negative effects of inaccurate artificially generated density maps, the network can be optimized to a better local optimum, resulting in better generalization ability. At the same time, this scheme fully considers the contribution of the majority of samples with low density values to the final counting error; therefore, during optimization, inter-partition mining is used to alleviate this problem, which helps further reduce the training error.
  • although the steps in the flowcharts of FIGS. 2-9 are shown in sequence according to the arrows, these steps are not necessarily executed in the sequence indicated by the arrows. Unless explicitly stated herein, the execution order of these steps is not strictly limited, and they may be performed in other orders. Moreover, at least a part of the steps in FIGS. 2-9 may include multiple sub-steps or stages, which are not necessarily completed at the same moment but may be executed at different times; their execution order is also not necessarily sequential, and they may be performed in turn or alternately with other steps, or with at least a part of the sub-steps or stages within other steps.
  • an object density determination apparatus 1000 is provided.
  • the apparatus may adopt software modules or hardware modules, or a combination of the two to become a part of computer equipment.
  • the apparatus specifically includes:
  • An image acquisition module 1002 configured to acquire training sample images and standard density maps corresponding to the training sample images
  • the image input module 1004 is used to input the training sample image into the object density determination model to be trained, and obtain the predicted density map output by the object density determination model;
  • the image division module 1006 is used to divide the standard density map and the predicted density map respectively, to obtain multiple standard image blocks corresponding to the standard density map and multiple predicted image blocks corresponding to the predicted density map;
  • the density statistics module 1008 is configured to perform statistics on the object density in the standard image block, obtain the standard density statistics value corresponding to the standard image block, and perform statistics on the object density in the predicted image block to obtain the predicted density statistics value corresponding to the predicted image block ;
  • the training module 1010 is used to form an image pair from a standard image block and a predicted image block that has an image position correspondence with the standard image block, and, based on the difference between the standard density statistical value and the predicted density statistical value corresponding to the image pair, adjust the parameters of the object density determination model to be trained to obtain a trained object density determination model, where the trained object density determination model is used to generate an object density map.
  • with the above-mentioned object density determination device, the standard density map and the predicted density map are divided separately to obtain a plurality of standard image blocks corresponding to the standard density map and a plurality of predicted image blocks corresponding to the predicted density map; the object density in each standard image block is counted to obtain the corresponding standard density statistical value, and the object density in each predicted image block is counted to obtain the corresponding predicted density statistical value. In the training process, each standard image block and the predicted image block having an image position correspondence with it form an image pair, and the parameters of the object density determination model to be trained are adjusted accordingly, so that the density value of each local area is fitted in units of image blocks. Comprehensively considering the overall density value of the local area improves the accuracy of the trained object density determination model in determining object density.
  • the training module 1010 is further configured to obtain the image pair loss value corresponding to each image pair based on the difference between the standard density statistical value and the predicted density statistical value corresponding to the image pair; perform statistics on the image pair loss values to obtain the target loss value; and adjust the parameters of the object density determination model to be trained based on the target loss value, obtaining the trained object density determination model.
  • the training module 1010 is further configured to shrink the standard density statistical value corresponding to the image pair according to a target shrinkage mode to obtain a shrunk standard density statistical value, where the shrinkage amplitude of the target shrinkage mode is positively correlated with the magnitude of the value to be shrunk; shrink the predicted density statistical value corresponding to the image pair according to the same target shrinkage mode to obtain a shrunk predicted density statistical value; and obtain the image pair loss value corresponding to the image pair from the difference between the shrunk standard density statistical value and the shrunk predicted density statistical value, where the image pair loss value is positively correlated with the difference.
  • the training module 1010 is further configured to perform a logarithmic transformation with a preset value as the base and the standard density statistical value as the antilogarithm, using the obtained logarithm as the shrunk standard density statistical value, where the preset value is greater than 1; and to perform a logarithmic transformation with the preset value as the base and the predicted density statistical value as the antilogarithm, using the obtained logarithm as the shrunk predicted density statistical value.
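A minimal sketch of this logarithmic shrinkage follows; the base of 2 and the +1 offset (so that zero-density blocks remain finite) are assumptions for illustration, not values fixed by the text:

```python
import math

def shrink(value: float, base: float = 2.0) -> float:
    """Logarithmic shrinkage: larger values are shrunk by a larger amplitude.
    The +1 offset keeps the antilogarithm positive for zero-density blocks."""
    return math.log(1.0 + value, base)

def pair_loss(standard_stat: float, predicted_stat: float) -> float:
    # The image pair loss grows with the gap between the shrunk statistics.
    return abs(shrink(standard_stat) - shrink(predicted_stat))
```

Because the shrinkage amplitude grows with the value being shrunk, blocks with very large prediction deviations contribute a much smaller gradient than they would under a plain difference.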
  • the training module 1010 is further configured to determine the loss value weight of each image pair loss value according to the standard density statistical value corresponding to the image pair, where the loss value weight is negatively correlated with the standard density statistical value; and to perform a weighted sum of the image pair loss values based on the loss value weights to obtain the target loss value.
  • the training module 1010 is further configured to perform density interval division on the standard density statistical values to obtain multiple density intervals; obtain, for each density interval, the number of standard image blocks whose standard density statistical value falls within that interval; and determine the loss value weight of the image pair loss value corresponding to a standard image block from the number of image blocks in the corresponding density interval, where the number of image blocks is positively correlated with the loss value weight.
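One possible reading of this interval-based weighting can be sketched as follows; the interval edges, the direct proportionality to the interval's block count, and the normalization are all assumptions for illustration:

```python
import numpy as np

def interval_weights(standard_stats, edges):
    """Assign each block a weight derived from the population of its density
    interval; the weight grows with the number of blocks in that interval."""
    stats = np.asarray(standard_stats, dtype=float)
    idx = np.digitize(stats, edges)                    # interval index per block
    counts = np.bincount(idx, minlength=len(edges) + 1)
    weights = counts[idx].astype(float)                # per-block interval population
    return weights / weights.sum()                     # normalize the weights

def target_loss(pair_losses, weights):
    # Weighted sum of the image pair loss values.
    return float(np.sum(np.asarray(pair_losses) * np.asarray(weights)))
```

Since most image blocks have low density values, the crowded low-density intervals receive larger weights, which is the inter-partition mining idea described earlier in the text.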
  • the training module 1010 is further configured to attenuate each image pair loss value according to a target attenuation method to obtain the attenuated image pair loss value, where the attenuation amplitude of the target attenuation method is positively correlated with the image pair loss value; and to sum the attenuated image pair loss values to obtain the target loss value.
  • the image division module 1006 is further configured to obtain a sliding window; slide the sliding window on the standard density map according to a preset sliding method, using the image area within the sliding window as a standard image block; and slide the sliding window on the predicted density map according to the same preset sliding method, using the image area within the sliding window as a predicted image block.
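The sliding-window division can be sketched as follows; the window size and stride are assumptions. Applying the same routine to both maps preserves the image position correspondence, so blocks with the same index form an image pair:

```python
import numpy as np

def sliding_blocks(density_map: np.ndarray, win: int, stride: int):
    """Slide a win x win window over the map with the given stride and
    collect each covered image area as one block."""
    h, w = density_map.shape
    blocks = []
    for top in range(0, h - win + 1, stride):
        for left in range(0, w - win + 1, stride):
            blocks.append(density_map[top:top + win, left:left + win])
    return blocks
```

With a stride equal to the window size the blocks tile the map without overlap; a smaller stride would give overlapping blocks, which the preset sliding method may also allow.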
  • the training sample image is marked with a plurality of object position points; the image division module 1006 is further configured to determine the object response map corresponding to the training sample image according to the object position points, where the pixel value at an object position point is a first pixel value and the pixel value at a non-object position point is a second pixel value, and to convolve the object response map to obtain the standard density map corresponding to the training sample image.
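Generating the standard density map from annotated position points can be sketched as follows, assuming a fixed-size Gaussian kernel (the description elsewhere notes that the region covered by the kernel varies with head size, which this sketch ignores):

```python
import numpy as np

def gaussian_kernel(size: int, sigma: float) -> np.ndarray:
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx ** 2 + yy ** 2) / (2 * sigma ** 2))
    return k / k.sum()  # normalize so each point contributes a count of 1

def standard_density_map(shape, points, size=5, sigma=1.0):
    """Response map: first pixel value 1 at each object position point,
    second pixel value 0 elsewhere, then convolved with a Gaussian kernel
    (implemented here by placing the kernel directly at each point)."""
    h, w = shape
    density = np.zeros(shape)
    kernel = gaussian_kernel(size, sigma)
    r = size // 2
    for y, x in points:
        for dy in range(-r, r + 1):
            for dx in range(-r, r + 1):
                yy2, xx2 = y + dy, x + dx
                if 0 <= yy2 < h and 0 <= xx2 < w:
                    density[yy2, xx2] += kernel[dy + r, dx + r]
    return density
```

Because each normalized kernel integrates to one, summing the resulting density map recovers the annotated object count whenever the kernels fall fully inside the image.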
  • an object density determination apparatus 1100 is provided.
  • the apparatus may adopt software modules or hardware modules, or a combination of the two to become a part of computer equipment.
  • the apparatus specifically includes:
  • an image acquisition module 1102 configured to acquire a target image whose density is to be determined
  • the density determination module 1104 is used to input the target image into the trained object density determination model, which is used to determine the object density; the object density determination model is obtained by adjusting the parameters of the object density determination model to be trained based on the difference between the standard density statistical value and the predicted density statistical value corresponding to each image pair, where an image pair is composed of a standard image block and a predicted image block that has an image position correspondence with the standard image block; the standard image block is obtained by dividing the standard density map corresponding to the training sample image, the predicted image block is obtained by dividing the predicted density map, and the predicted density map is obtained by inputting the training sample image into the object density determination model to be trained;
  • the density map acquisition module 1106 is configured to acquire the object density map corresponding to the target image output by the object density determination model.
  • with the above apparatus, the object density determination model is obtained by adjusting the parameters of the model to be trained, where each image pair is composed of a standard image block and the predicted image block at the corresponding image position; the standard image block is obtained by dividing the standard density map corresponding to the training sample image, the predicted image block is obtained by dividing the predicted density map, and the predicted density map is obtained by inputting the training sample image into the object density determination model to be trained. The density value of each local area can therefore be fitted in units of image blocks, which comprehensively considers the overall density value of the local area and improves the accuracy of the trained object density determination model, so that when the target image is input into the trained model, it can output an accurate object density map.
  • the above-mentioned device further includes a training module for: acquiring training sample images and the standard density maps corresponding to the training sample images; inputting the training sample images into the object density determination model to be trained to obtain the predicted density map output by the model; dividing the standard density map and the predicted density map to obtain multiple standard image blocks and multiple predicted image blocks; counting the object density in each standard image block to obtain the corresponding standard density statistical value, and counting the object density in each predicted image block to obtain the corresponding predicted density statistical value; forming image pairs from the standard image blocks and the predicted image blocks at corresponding image positions; and adjusting the parameters of the object density determination model to be trained based on the difference between the standard density statistical value and the predicted density statistical value of each image pair, to obtain the trained object density determination model.
  • Each module in the above-mentioned object density determination device may be implemented in whole or in part by software, hardware and combinations thereof.
  • the above modules can be embedded in or independent of the processor in the computer device in the form of hardware, or stored in the memory in the computer device in the form of software, so that the processor can call and execute the operations corresponding to the above modules.
  • a computer device is provided, and the computer device may be a server, and its internal structure diagram may be as shown in FIG. 12 .
  • the computer device includes a processor, a memory and a network interface connected by a system bus, where the processor of the computer device is used to provide computing and control capabilities.
  • the memory of the computer device includes a non-volatile storage medium and an internal memory.
  • the non-volatile storage medium stores an operating system, computer readable instructions and a database.
  • the internal memory provides an environment for the execution of the operating system and computer-readable instructions in the non-volatile storage medium.
  • the computer device's database is used to store training sample image data.
  • the network interface of the computer device is used to communicate with an external terminal through a network connection.
  • the computer readable instructions when executed by a processor, implement an object density determination method.
  • FIG. 12 is only a block diagram of a partial structure related to the solution of the present application and does not constitute a limitation on the computer equipment to which the solution is applied. A specific computer device may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.
  • a computer device is provided, including a memory and a processor, where computer-readable instructions are stored in the memory and, when executed by the processor, cause the processor to execute the steps in the above method embodiments.
  • one or more non-volatile readable storage media are provided, storing computer-readable instructions which, when executed by one or more processors, cause the processors to perform the steps in the above method embodiments.
  • a computer program product comprising computer-readable instructions, which, when executed by a processor, implement the steps in each of the foregoing method embodiments.
  • Non-volatile memory may include read-only memory (Read-Only Memory, ROM), magnetic tape, floppy disk, flash memory, or optical memory, and the like.
  • Volatile memory may include random access memory (RAM) or external cache memory.
  • the RAM may be in various forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM).

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

The present application relates to an object density determination method and apparatus, a computer device and a storage medium, which can be applied to scenarios such as intelligent transportation and intelligent supermarkets. The method comprises: inputting a training sample image into an object density determination model to be trained, so as to obtain a predictive density map output by the object density determination model; acquiring a plurality of standard image blocks corresponding to a standard density map and a plurality of predicted image blocks corresponding to the predictive density map; respectively compiling statistics on object densities in the standard image blocks and the predicted image blocks, so as to obtain a standard density statistical value corresponding to each standard image block and a predictive density statistical value corresponding to each predicted image block; and training the object density determination model on the basis of the difference between the standard density statistical value and the predictive density statistical value that correspond to an image pair. The object density determination model is an artificial intelligence model, and the object density determination model can be deployed in a cloud server, thereby improving an artificial intelligence cloud service.

Description

Object density determination method, apparatus, computer equipment and storage medium
This application claims priority to the Chinese patent application with application number 202110453975X, filed with the China Patent Office on April 26, 2021 and entitled "Object Density Determination Method, Apparatus, Computer Equipment and Storage Medium", the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the technical field of image processing, and in particular to an object density determination method, apparatus, computer device and storage medium.
Background Art
With the development of image processing technology in artificial intelligence, techniques for determining object density from images have emerged. Object density determination technology can automatically infer the density of a crowd in an image and plays an important role in fields such as video surveillance and public transportation safety.
In the conventional technology, object density determination mainly relies on object density map regression for prediction, using artificial-intelligence-based deep learning for end-to-end training and inference. However, the density values in the object density map output by the trained object density determination model are often inaccurate, resulting in low accuracy of the acquired object density map.
SUMMARY OF THE INVENTION
According to various embodiments provided in the present application, an object density determination method, apparatus, computer device and storage medium are provided.
An object density determination method, executed by a computer device, the method comprising: acquiring a training sample image and a standard density map corresponding to the training sample image; inputting the training sample image into an object density determination model to be trained to obtain a predicted density map output by the object density determination model; dividing the standard density map and the predicted density map respectively to obtain a plurality of standard image blocks corresponding to the standard density map and a plurality of predicted image blocks corresponding to the predicted density map; counting the object density in each standard image block to obtain a standard density statistical value corresponding to the standard image block, and counting the object density in each predicted image block to obtain a predicted density statistical value corresponding to the predicted image block; and forming an image pair from a standard image block and a predicted image block that has an image position correspondence with the standard image block, and adjusting parameters of the object density determination model to be trained based on the difference between the standard density statistical value and the predicted density statistical value corresponding to the image pair, to obtain a trained object density determination model, where the trained object density determination model is used to generate an object density map.
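The core loss of the claimed training procedure can be condensed into a short sketch; the block size and the log1p shrinkage are illustrative assumptions, and the network producing the predicted map is omitted:

```python
import numpy as np

def block_sums(density_map: np.ndarray, block: int = 4) -> np.ndarray:
    # Divide the map into block x block patches and sum each patch.
    h, w = density_map.shape
    h_t, w_t = h - h % block, w - w % block
    p = density_map[:h_t, :w_t].reshape(h_t // block, block, w_t // block, block)
    return p.sum(axis=(1, 3))

def training_loss(standard_map: np.ndarray, predicted_map: np.ndarray,
                  block: int = 4) -> float:
    """Pair corresponding blocks of the two maps and compare their
    log-shrunk density sums (log1p is an assumed shrinkage choice)."""
    s = np.log1p(block_sums(standard_map, block))
    p = np.log1p(block_sums(predicted_map, block))
    return float(np.abs(s - p).mean())
```

During training this scalar would be minimized over the model parameters; the per-pair weighting and attenuation described in the embodiments can be layered on top of the absolute differences before averaging.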
An object density determination apparatus, the apparatus comprising: an image acquisition module for acquiring a training sample image and a standard density map corresponding to the training sample image; an image input module for inputting the training sample image into an object density determination model to be trained to obtain a predicted density map output by the object density determination model; an image division module for dividing the standard density map and the predicted density map respectively to obtain a plurality of standard image blocks corresponding to the standard density map and a plurality of predicted image blocks corresponding to the predicted density map; a density statistics module for counting the object density in each standard image block to obtain a standard density statistical value corresponding to the standard image block, and counting the object density in each predicted image block to obtain a predicted density statistical value corresponding to the predicted image block; and a training module for forming an image pair from a standard image block and a predicted image block that has an image position correspondence with the standard image block, and adjusting parameters of the object density determination model to be trained based on the difference between the standard density statistical value and the predicted density statistical value corresponding to the image pair, to obtain a trained object density determination model used to generate an object density map.
A computer device, comprising a memory and one or more processors, where the memory stores computer-readable instructions which, when executed by the processor, cause the processor to perform the steps of the above object density determination method.
One or more non-volatile readable storage media storing computer-readable instructions which, when executed by one or more processors, cause the processors to perform the steps of the above object density determination method.
A computer program product, comprising computer-readable instructions which, when executed by a processor, implement the steps of the above object density determination method.
According to various embodiments provided in the present application, another object density determination method, apparatus, computer device and storage medium are also provided.
An object density determination method, executed by a computer device, the method comprising: acquiring a target image whose density is to be determined; inputting the target image into a trained object density determination model, the object density determination model being used to determine object density, where the object density determination model is obtained by adjusting parameters of an object density determination model to be trained based on the difference between a standard density statistical value and a predicted density statistical value corresponding to an image pair; the image pair is composed of a standard image block and a predicted image block that has an image position correspondence with the standard image block; the standard image block is obtained by dividing a standard density map corresponding to a training sample image; the predicted image block is obtained by dividing a predicted density map, and the predicted density map is obtained by inputting the training sample image into the object density determination model to be trained; and acquiring an object density map corresponding to the target image output by the object density determination model.
An object density determination apparatus, the apparatus comprising: an image acquisition module for acquiring a target image whose density is to be determined; a density determination module for inputting the target image into a trained object density determination model used to determine object density, where the object density determination model is obtained by adjusting parameters of an object density determination model to be trained based on the difference between a standard density statistical value and a predicted density statistical value corresponding to an image pair, the image pair is composed of a standard image block and a predicted image block that has an image position correspondence with the standard image block, the standard image block is obtained by dividing a standard density map corresponding to a training sample image, the predicted image block is obtained by dividing a predicted density map, and the predicted density map is obtained by inputting the training sample image into the object density determination model to be trained; and a density map acquisition module for acquiring an object density map corresponding to the target image output by the object density determination model.
A computer device, comprising a memory and a processor, where the memory stores computer-readable instructions which, when executed by the processor, cause the processor to perform the steps of the above object density determination method.
One or more non-volatile readable storage media storing computer-readable instructions which, when executed by one or more processors, cause the processors to perform the steps of the above object density determination method.
A computer program product, comprising computer-readable instructions which, when executed by a processor, implement the steps of the above object density determination method.
The details of one or more embodiments of the present application are set forth in the accompanying drawings and the description below. Other features, objects and advantages of the present application will become apparent from the description, the drawings and the claims.
Description of Drawings
In order to illustrate the technical solutions in the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present application; for those of ordinary skill in the art, other drawings can be obtained from these drawings without creative effort.
FIG. 1 is a diagram of an application environment of an object density determination method in some embodiments;
FIG. 2 is a schematic flowchart of an object density determination method in some embodiments;
FIG. 3 is a schematic structural diagram of an object density determination model in some embodiments;
FIG. 4 is a specific schematic diagram of a skip connection in some embodiments;
FIG. 5 is a schematic diagram of an image position correspondence in some embodiments;
FIG. 6 is a schematic flowchart of a step of determining the loss value weight of an image pair loss value in some embodiments;
FIG. 7 is a schematic flowchart of an object density determination method in other embodiments;
FIG. 8 is a schematic diagram of a Gaussian kernel at two human heads of different sizes in some embodiments;
FIG. 9 is a schematic diagram of an application of an object density determination method in some embodiments;
FIG. 10 is a structural block diagram of an object density determination apparatus in some embodiments;
FIG. 11 is a structural block diagram of an object density determination apparatus in other embodiments; and
FIG. 12 is a diagram of the internal structure of a computer device in some embodiments.
Detailed Description

To make the objectives, technical solutions, and advantages of the present application clearer, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are intended only to explain the present application and not to limit it.
It should be noted that, in the embodiments of the present application, the standard density map and the predicted density map obtained from the same training sample image have the same size, and the following correspondences exist:

1) Position correspondence between pixels: when coordinate systems are established in the same way for the standard density map and the predicted density map, if a pixel in the standard density map and a pixel in the predicted density map have the same coordinates, the two pixels are position-corresponding pixels.

2) Position correspondence between image blocks: when the standard density map is divided into multiple standard image blocks and the predicted density map is divided into multiple predicted image blocks, if every pixel contained in a standard image block has a position-corresponding pixel in a given predicted image block, and every pixel in that predicted image block has a position-corresponding pixel in the standard image block, then an image position correspondence exists between the two image blocks.

It should also be noted that "multiple", as used in the embodiments of the present application, means at least two.
The object density determination model provided by the embodiments of the present application can be applied to artificial-intelligence-based cloud services. For example, the model can be deployed on a cloud server; the cloud server acquires a target image whose density is to be determined, determines the object density map corresponding to the target image based on the model, and returns the map to a terminal for display.

In the object density determination method provided by the embodiments of the present application, the training sample images and the object density maps generated by the object density determination model can be stored on a blockchain. The blockchain can generate query codes for the stored training sample images and the stored object density maps respectively and return the generated query codes to the terminal; the training sample images can then be queried based on their query codes, and the object density maps based on theirs.

The solutions provided in the embodiments of the present application relate to artificial intelligence technologies such as computer vision and machine learning, and are specifically described through the following embodiments:
The object density determination method provided in the present application can be applied in the application environment shown in FIG. 1, in which a terminal 102 and a camera device 106 each communicate with a server 104 through a network. The network may be a wired network or a wireless network, and the wireless network may be any one of a local area network, a metropolitan area network, and a wide area network.

The terminal 102 may be, but is not limited to, a smartphone, a tablet computer, a notebook computer, a desktop computer, a smart speaker, or a smart watch. The server 104 may be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, CDN, and big data and artificial intelligence platforms. The camera device 106 may include one or more cameras.

In the object density determination method or apparatus provided by the embodiments of the present application, multiple servers may form a blockchain, with each server serving as a node on the blockchain.
In some embodiments, the server 104 trains the object density determination model to be trained using acquired training sample images to obtain a trained object density determination model, and then deploys the trained model. Thereafter, the server 104 can receive images collected and transmitted in real time by the camera device 106, perform object density determination on these images to obtain object density maps, and send the object density maps to the terminal, which can display them in the form of heat maps.

In some embodiments, after the server 104 trains the object density determination model to be trained using acquired training sample images and obtains the trained model, it can, upon receiving a request from the terminal 102, send the trained model to the terminal 102 in a wired or wireless manner. The terminal 102 receives and deploys the trained model; when a user processes images with the terminal, the terminal can process the image data with the trained object density determination model to determine object density.
In some embodiments, as shown in FIG. 2, an object density determination method is provided. The method can be applied to a computer device, which may be the terminal or the server in FIG. 1, or an interactive system composed of the terminal and the server. The method specifically includes the following steps:

Step 202: acquire a training sample image and a standard density map corresponding to the training sample image.

A training sample image is an image used for supervised training of the object density determination model to be trained and includes one or more target objects. A target object may specifically be an independent living body or object, such as a natural person, an animal, a vehicle, or a virtual character, or a specific part, such as a head or a hand. Because the training is supervised, each training sample image has a corresponding standard density map. The standard density map is a density map that truly reflects the object density of the training sample image and supervises the model training; it may be a density map determined according to the object position points in the training sample image. A density map reflects the number of objects at each position in an image; for example, a crowd density map can reflect the average number of people at the position in the actual scene corresponding to a unit pixel. The total number of target objects in an image can be determined from its density map.
In some embodiments, the computer device may acquire images annotated with the object position points of target objects as training sample images. An object position point may specifically be the position center point of the target object; for example, when the target object is a natural person, the object position point in the training sample image is the center point of the person's head.

For example, the computer device may acquire an image containing one or more target objects by photographing a scene containing those objects; after the object position points are manually annotated, the image can serve as a training sample image. The computer device may also acquire, in a wired or wireless manner from a third-party computer device, images that contain one or more target objects and whose object position points have already been annotated, as training sample images.

After acquiring a training sample image, the computer device determines an object response map corresponding to the training sample image according to the object position points of the training sample image, and obtains the standard density map corresponding to the training sample image from the object response map.

In other embodiments, the computer device may also directly acquire images for which standard density maps have already been determined as training sample images. For example, it may acquire such images from a third-party public database.
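A common way to construct such a standard density map from annotated object position points, widely used in crowd-counting work, is to place a normalized Gaussian kernel at each point so that the whole map sums to the number of annotated objects. The following is a minimal pure-Python sketch under that assumption; the function name, kernel width `sigma`, and window `radius` are illustrative choices, not taken from the present application:

```python
import math

def gaussian_density_map(height, width, points, sigma=2.0, radius=6):
    """Build a standard density map from annotated object position points.

    A normalized Gaussian kernel is placed at each point, so the sum of the
    whole map equals the number of annotated objects. (Illustrative sketch;
    the present application only requires that the standard density map be
    derived from the object position points, e.g. via an object response map.)
    """
    density = [[0.0] * width for _ in range(height)]
    for (cx, cy) in points:                      # one (x, y) annotation per object
        # Evaluate the kernel on a small window around the point.
        weights, coords = [], []
        for y in range(max(0, cy - radius), min(height, cy + radius + 1)):
            for x in range(max(0, cx - radius), min(width, cx + radius + 1)):
                w = math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma ** 2))
                weights.append(w)
                coords.append((y, x))
        total = sum(weights)                     # normalize so each object sums to 1
        for w, (y, x) in zip(weights, coords):
            density[y][x] += w / total
    return density

dm = gaussian_density_map(32, 32, [(8, 8), (20, 16)])
count = sum(sum(row) for row in dm)              # integrates to the object count
```

Summing the resulting map recovers the annotated object count, which is the property that later allows counts to be read off by integrating a density map.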
Step 204: input the training sample image into the object density determination model to be trained, and obtain the predicted density map output by the model.

The object density determination model to be trained is an object density determination model whose parameters still need to be determined through training. The object density determination model is a machine learning model for determining the density of the target objects in an image, and may adopt a deep learning model comprising multiple convolutional neural networks.

Specifically, the computer device inputs the training sample image into the object density determination model to be trained; the model predicts the object density in the training sample image to obtain a predicted density map, and the computer device acquires the predicted density map output by the model. It can be understood that, since the standard density map and the predicted density map are both obtained from the same training sample image, the two maps can be images of the same size.
In some embodiments, the object density determination model includes an encoding layer, a decoding layer, and a prediction layer. Inputting the training sample image into the object density determination model to be trained and obtaining the predicted density map output by the model includes: inputting the training sample image into the encoding layer and performing downsampling through the encoding layer to obtain a first target feature; inputting the first target feature into the decoding layer and performing upsampling through the decoding layer to obtain a second target feature; and inputting the second target feature into the prediction layer and performing density prediction through the prediction layer to obtain the predicted density map.

The encoding layer and the decoding layer may adopt neural networks of the VGGNet series, of the ResNet (residual network) series, and so on. A VGGNet is a deep convolutional neural network developed by the Visual Geometry Group at the University of Oxford together with researchers at Google DeepMind; it is composed of five convolutional stages and three fully connected layers followed by a softmax output layer, with max-pooling separating the stages and ReLU activation units in all hidden layers. A ResNet is a neural network constructed from residual blocks.

Downsampling the training sample image through the encoding layer extracts the high-level semantic information of the image, and the resulting first target feature is a low-resolution representation carrying that high-level semantic information. Upsampling through the decoding layer restores the low-resolution high-level semantic information to a higher resolution, and the resulting second target feature is a high-resolution feature map with high-level semantic information.
In some embodiments, the encoding layer and the decoding layer are connected by skip connections; the encoding layer includes multiple first convolutional layers, and the decoding layer includes multiple second convolutional layers. Inputting the training sample image into the encoding layer and performing downsampling to obtain the first target feature includes: in the encoding layer, downsampling the intermediate features output by the previous first convolutional layer through the current first convolutional layer, and taking the output of the last first convolutional layer as the first target feature. Inputting the first target feature into the decoding layer and performing upsampling to obtain the second target feature includes: in the decoding layer, performing upsampling through the current second convolutional layer according to the intermediate features output by the previous second convolutional layer and the intermediate features output by the skip-connected first convolutional layer, and taking the output of the last second convolutional layer as the second target feature.

Through skip connections, the features output by an earlier convolutional layer can be combined with the features output by a later convolutional layer as the input of a given convolutional layer. The input features of that layer thus include both the contextual features with high-level semantic information obtained by step-by-step convolution through multiple layers and local detail information, so the extracted features are more complete and accurate.
For example, FIG. 3 is a schematic structural diagram of the object density determination model in some specific embodiments. The encoding layer includes five first convolutional layers connected end to end; each first convolutional layer convolves the intermediate features output by the previous layer to perform downsampling, successively outputting the five features V1, V2, V3, V4, and V5. The output feature V5 of the last first convolutional layer is taken as the first target feature and input into the decoding layer. The decoding layer includes five second convolutional layers connected end to end; each second convolutional layer performs upsampling according to the intermediate features output by the previous second convolutional layer and the intermediate features output by the skip-connected first convolutional layer, successively outputting the five features P5, P4, P3, P2, and P1. The feature P′1 output by the last second convolutional layer is taken as the second target feature and input into the prediction layer. The prediction layer convolves the second target feature through three parallel convolutional layers, performs channel-wise concatenation of the outputs of these convolutional layers with the second target feature, performs a further convolution through another convolutional layer, and finally outputs the predicted density map.

FIG. 4 is a detailed schematic diagram of a skip connection. Referring to FIG. 4, the output feature obtained by a second convolutional layer upsampling its input feature Pi+1 is first channel-concatenated with the intermediate feature output by the skip-connected first convolutional layer to obtain an intermediate feature, which is then fused through a convolutional layer to obtain Pi as the input feature of the next second convolutional layer.
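The data flow of FIG. 3 and FIG. 4 can be illustrated by tracing feature-map sizes through the five encoder stages and five decoder stages. This bookkeeping sketch assumes each stage changes the spatial resolution by a factor of 2 and omits channel widths and the convolutions themselves; it illustrates the skip-connection wiring rather than reproducing the patented model:

```python
def trace_fpn_shapes(h=256, w=256):
    """Trace the spatial sizes of the FIG. 3-style encoder/decoder features.

    The encoder halves the resolution at each of five stages (V1..V5); the
    decoder starts from V5 and, as in FIG. 4, upsamples P(i+1) by 2x and
    fuses it with the skip-connected encoder feature Vi of the same size
    to produce Pi. Sizes only; channel widths are omitted.
    """
    V = []
    hh, ww = h, w
    for _ in range(5):                      # V1..V5, each stage downsamples by 2
        hh, ww = hh // 2, ww // 2
        V.append((hh, ww))
    P = {5: V[4]}                           # P5 is derived from V5 directly
    for i in range(4, 0, -1):               # P4..P1
        up = (P[i + 1][0] * 2, P[i + 1][1] * 2)   # upsample P(i+1) by 2x
        assert up == V[i - 1], "skip feature Vi must match the upsampled size"
        P[i] = up                           # concat with Vi, then fuse by convolution
    return V, P

V, P = trace_fpn_shapes(256, 256)
# V = [(128,128), (64,64), (32,32), (16,16), (8,8)]; P1 has size (128,128)
```

The assertion makes the design constraint explicit: a skip connection only works when the upsampled decoder feature and the encoder feature it is concatenated with have the same spatial size.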
Step 206: divide the standard density map and the predicted density map respectively to obtain multiple standard image blocks corresponding to the standard density map and multiple predicted image blocks corresponding to the predicted density map.

Specifically, the computer device divides the standard density map to obtain multiple standard image blocks, and divides the predicted density map to obtain multiple predicted image blocks. Division here means partitioning the pixels of a map into regions. In some embodiments, at least one of the multiple predicted image blocks has an image position correspondence with a standard image block; when dividing the image blocks, the computer device may first divide the standard density map to obtain the multiple standard image blocks, and then divide the predicted density map according to the position of at least one standard image block in the standard density map, obtaining predicted image blocks that have image position correspondences with the standard image blocks.

For example, suppose the standard density map is divided into four standard image blocks: image block A, image block B, image block C, and image block D. The computer device can then divide the predicted density map according to the positions of the pixels in image block A, so that the pixels in the predicted density map at the same positions as the pixels in image block A are assigned to one region, yielding the predicted image block corresponding to image block A.
In some embodiments, the computer device may divide the standard density map and the predicted density map using the same block division manner, so that the predicted image blocks match the standard image blocks in number, position, and size.

In some embodiments, the computer device may acquire a sliding window, slide the window over the standard density map in a preset sliding manner, and take each image region within the window as a standard image block; sliding the window over the predicted density map in the same preset manner and taking each image region within the window as a predicted image block then yields standard image blocks and predicted image blocks that are identical in size and number and whose image positions correspond one to one.
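The sliding-window division described above can be sketched as follows; a non-overlapping window is assumed here (stride equal to the window size), and the window size itself is an illustrative choice:

```python
def divide_into_blocks(density_map, win, stride):
    """Divide a density map into blocks by sliding a win x win window.

    Applying the same window and stride to the standard density map and the
    predicted density map yields blocks that are identical in size and number
    and that correspond one-to-one by position (i.e. by sliding order).
    """
    h, w = len(density_map), len(density_map[0])
    blocks = []
    for top in range(0, h - win + 1, stride):
        for left in range(0, w - win + 1, stride):
            block = [row[left:left + win] for row in density_map[top:top + win]]
            blocks.append(block)
    return blocks

std_map = [[0.1] * 8 for _ in range(8)]          # toy 8x8 standard density map
pred_map = [[0.2] * 8 for _ in range(8)]         # same-size predicted density map
std_blocks = divide_into_blocks(std_map, win=4, stride=4)
pred_blocks = divide_into_blocks(pred_map, win=4, stride=4)
# blocks with the same index occupy the same position in both maps
```

Because the same window and stride are applied to both maps, blocks with the same index occupy the same image position and can later be paired.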
Step 208: perform statistics on the object density in each standard image block to obtain the standard density statistic corresponding to the standard image block, and perform statistics on the object density in each predicted image block to obtain the predicted density statistic corresponding to the predicted image block.

Object density refers to the density values of the pixels in an image block; a pixel's density value characterizes how densely objects are distributed at the pixel's location. Performing statistics on the object density in an image block means representing the density values of all pixels in the block with a single statistic, which may specifically be the sum, the mean, or the median of the density values of all pixels in the block, among other choices.

Specifically, after obtaining the standard image blocks and the predicted image blocks, the computer device performs statistics on the object density in each standard image block to obtain the standard density statistic corresponding to that block, and performs statistics on the object density in each predicted image block in the same way to obtain the predicted density statistic corresponding to that block. For example, if the object densities in a standard image block are accumulated to obtain the block's standard density statistic, then the object densities in the corresponding predicted image block are likewise accumulated to obtain the block's predicted density statistic.
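A sketch of the per-block statistic computation, supporting the sum, mean, and median statistics mentioned above; the function and parameter names are illustrative:

```python
def block_statistic(block, mode="sum"):
    """Aggregate the per-pixel density values of a block into one statistic.

    The text allows the sum, the mean, or the median of all pixel density
    values in the block; the same mode must be applied to the standard block
    and to the predicted block so that the two statistics are comparable.
    """
    values = sorted(v for row in block for v in row)
    if mode == "sum":
        return sum(values)
    if mode == "mean":
        return sum(values) / len(values)
    if mode == "median":
        n = len(values)
        mid = n // 2
        return values[mid] if n % 2 else (values[mid - 1] + values[mid]) / 2
    raise ValueError(mode)

block = [[0.0, 0.1], [0.3, 0.2]]
s = block_statistic(block, "sum")       # ~0.6: expected object count in the block
m = block_statistic(block, "mean")      # ~0.15
```

With the sum statistic, the block statistic has a direct interpretation: it is the expected number of objects inside that block.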
Step 210: form an image pair from each standard image block and the predicted image block that has an image position correspondence with it, and adjust the parameters of the object density determination model to be trained based on the differences between the standard density statistics and the predicted density statistics corresponding to the image pairs, obtaining a trained object density determination model; the trained object density determination model is used to generate object density maps.

An image position correspondence between a standard image block and a predicted image block means that the position of the standard image block in the standard density map corresponds to the position of the predicted image block in the predicted density map; every pixel of the standard image block then has a pixel at the same position in the position-corresponding predicted image block. The standard density statistic corresponding to an image pair is the standard density statistic of the standard image block in the pair, and the predicted density statistic corresponding to an image pair is the predicted density statistic of the predicted image block in the pair.

For example, suppose standard density map A is divided into four standard image blocks A1, A2, A3, and A4, and predicted density map B is divided in the same way into four predicted image blocks B1, B2, B3, and B4 of the same size, position, and number. The image position correspondences are then as shown in FIG. 5, where the dotted arrows indicate the correspondences: an image position correspondence exists between standard image block A1 and predicted image block B1, between A2 and B2, between A3 and B3, and between A4 and B4. That is, the standard image block and the predicted image block forming an image pair occupy the same position in their respective maps.
Specifically, for each standard image block obtained by division, the computer device determines, from the multiple predicted image blocks obtained by division, the predicted image block that has an image position correspondence with the standard image block, and forms an image pair from the two. Based on the difference between the standard density statistic and the predicted density statistic corresponding to the image pair, the computer device obtains the image-pair loss value of that pair; by aggregating the image-pair loss values, it obtains a target loss value, based on which it adjusts the parameters of the object density determination model to be trained, obtaining the trained object density determination model. That the object density determination model is used to generate object density maps means that the model can output the object density value corresponding to each position in an image, for example the number of people corresponding to each position. In practice, when an object density map needs to be displayed, the object density values at each position can be presented in different forms as required; for example, a color can be determined for each object density value and the object density map displayed as a heat map.
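The image-pair loss computation can be sketched as follows, assuming the sum statistic, a squared difference per pair, and a simple average over pairs as the target loss. The text does not fix these exact choices (it also describes weighting the image-pair loss values, see FIG. 6), so this is an illustrative instance only:

```python
def block_pair_losses(std_blocks, pred_blocks):
    """Per-image-pair loss from the difference of the two density statistics.

    Blocks with the same index (same sliding position) form an image pair;
    the squared difference of their summed densities is used here as the
    image-pair loss, and the target loss averages over all pairs. The exact
    loss form (squared vs. absolute difference, per-pair weighting) is a
    design choice not fixed by the text.
    """
    losses = []
    for std, pred in zip(std_blocks, pred_blocks):
        std_stat = sum(v for row in std for v in row)      # standard density statistic
        pred_stat = sum(v for row in pred for v in row)    # predicted density statistic
        losses.append((std_stat - pred_stat) ** 2)
    target_loss = sum(losses) / len(losses)
    return losses, target_loss

std_blocks = [[[1.0, 0.0], [0.0, 1.0]], [[0.0, 0.0], [0.0, 0.0]]]
pred_blocks = [[[0.5, 0.5], [0.5, 0.5]], [[0.1, 0.1], [0.0, 0.0]]]
pair_losses, target = block_pair_losses(std_blocks, pred_blocks)
```

In the first pair the predicted density is spatially smeared but the block totals agree, so the pair contributes no loss; this is the block-level tolerance to small localization differences that fitting local-region density values in units of image blocks provides.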
In some embodiments, when dividing the standard density map and the predicted density map, the computer device slides the same sliding window over the standard density map and over the predicted density map in the same sliding manner to obtain the multiple standard image blocks and the multiple predicted image blocks. When determining the image position correspondences, the computer device can then proceed according to the sliding order: it numbers the obtained standard image blocks based on the sliding order, numbers the obtained predicted image blocks based on the sliding order, determines the two image blocks with the same number as image blocks having an image position correspondence, and forms those two image blocks into an image pair.

In some embodiments, when the multiple predicted image blocks include an image block that has no image position correspondence with any standard image block, the computer device computes, for each pixel in that image block, the difference between that pixel's density value and the density value of the pixel at the corresponding position in the standard density map, and finally adjusts the parameters of the object density determination model to be trained based on these pixel-level density differences together with the differences between the standard density statistics and the predicted density statistics corresponding to the image pairs, obtaining the trained object density determination model.
In some embodiments, after obtaining the trained object density determination model, the computer device can generate an object density map with it. Specifically, the target image whose density is to be determined is input into the trained object density determination model, object density determination is performed by the model, and the object density map corresponding to the target image output by the model is obtained.
In some embodiments, after obtaining the object density map, the computer device can integrate over the object density map to obtain the total number of target objects in the target image.
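For a discrete density map, "integration" amounts to summing the pixel values; a minimal sketch with a toy two-object map (the specific pixel layout is invented for illustration):

```python
import numpy as np

# A density map integrates (sums) to the object count. Toy map with 2 objects:
density_map = np.zeros((8, 8))
density_map[2, 2] = 0.5
density_map[2, 3] = 0.5   # object 1, mass spread over two pixels
density_map[6, 6] = 1.0   # object 2, concentrated in one pixel

total_count = float(density_map.sum())  # discrete integral over the map
```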
In the above object density determination method, the standard density map and the predicted density map are each divided, yielding multiple standard image blocks corresponding to the standard density map and multiple predicted image blocks corresponding to the predicted density map. The object density within each standard image block is aggregated into a standard density statistic, and the object density within each predicted image block is aggregated into a predicted density statistic. During training, a standard image block and the predicted image block having an image-position correspondence with it form an image pair, and the parameters of the object density determination model to be trained are adjusted based on the difference between the standard density statistic and the predicted density statistic of the image pair. Density values of local regions are thus fitted at the granularity of image blocks, taking the overall density of each local region into account, which improves the accuracy of the trained object density determination model when it is used to determine object density.
In some embodiments, adjusting the parameters of the object density determination model to be trained based on the difference between the standard density statistic and the predicted density statistic of an image pair, and obtaining the trained object density determination model, includes: obtaining the image-pair loss value of each image pair based on the difference between its standard density statistic and its predicted density statistic; aggregating the image-pair loss values to obtain a target loss value; and adjusting the parameters of the object density determination model to be trained based on the target loss value, obtaining the trained object density determination model.
In this embodiment, every standard image block obtained by dividing the standard density map has a predicted image block with an image-position correspondence in the predicted density map. The computer device aggregates the object density in each standard image block to obtain its standard density statistic; replacing the density values of the region covered by each standard image block with its standard density statistic is equivalent to obtaining a standard local count map corresponding to the standard density map. Likewise, it aggregates the object density in each predicted image block to obtain its predicted density statistic, and replacing the density values of the region covered by each predicted image block with its predicted density statistic yields a predicted local count map corresponding to the predicted density map. Image blocks of the standard local count map and the predicted local count map that have an image-position correspondence are paired, and the image-pair loss value of each pair is obtained from the difference between its standard density statistic and its predicted density statistic. During training, the computer device can then aggregate the image-pair loss values of all image pairs into a target loss value between the standard local count map and the predicted local count map. Finally, the computer device can back-propagate this target loss value into the object density determination model and adjust its model parameters with a gradient descent algorithm until a stopping condition is satisfied, obtaining the trained object density determination model. The gradient descent algorithm includes, but is not limited to, stochastic gradient descent, Adagrad (Adaptive Gradient), Adadelta (an improvement of AdaGrad), RMSprop (an improvement of AdaGrad), and so on.
In some embodiments, the computer device can construct a loss function based on the difference between the standard density statistic and the predicted density statistic of an image pair, and obtain the image-pair loss value from that loss function. The loss function can be, among others, the cross-entropy loss function or the MSE (mean squared error) loss function.
In some embodiments, aggregating the image-pair loss values into the target loss value can specifically be: summing the image-pair loss values of all image pairs to obtain the target loss value. In some other embodiments, it can also be: averaging the image-pair loss values of all image pairs to obtain the target loss value.
In the above embodiments, the target loss value is obtained by aggregating the image-pair loss values, and the parameters of the object density determination model to be trained are adjusted based on the target loss value to obtain the trained model, which avoids, to the greatest extent, the training error introduced by fitting density values pixel by pixel.
In some embodiments, as shown in FIG. 5, obtaining the image-pair loss value based on the difference between the standard density statistic and the predicted density statistic of an image pair includes: shrinking the standard density statistic of the image pair according to a target shrinking method to obtain a shrunk standard density statistic, where the shrinkage magnitude of the target shrinking method is positively correlated with the size of the value to be shrunk; shrinking the predicted density statistic of the image pair according to the target shrinking method to obtain a shrunk predicted density statistic; and obtaining the image-pair loss value from the difference between the shrunk standard density statistic and the shrunk predicted density statistic, where the image-pair loss value is positively correlated with that difference.
Here, the target shrinking method refers to a mathematical operation that shrinks a value so as to reduce it. The shrinkage magnitude of the target shrinking method is positively correlated with the size of the value to be shrunk: the larger the value to be shrunk, the larger the shrinkage; conversely, the smaller the value, the smaller the shrinkage. In the embodiments of this application, the value to be shrunk is a standard density statistic or a predicted density statistic. The shrinkage magnitude is the difference between the value after shrinking and the value before shrinking.
Specifically, the computer device can shrink the standard density statistic of the image pair according to the target shrinking method to obtain the shrunk standard density statistic, and shrink the predicted density statistic according to the same method to obtain the shrunk predicted density statistic. The computer device can then subtract the shrunk predicted density statistic from the shrunk standard density statistic: when the resulting difference is greater than 0, the difference is used as the image-pair loss value; when it is less than 0, the absolute value of the difference is used as the image-pair loss value. The image-pair loss value is positively correlated with the difference, where the difference here means the absolute difference: the larger the absolute difference, the larger the image-pair loss value; conversely, the smaller the absolute difference, the smaller the image-pair loss value.
In some embodiments, shrinking the standard density statistic of the image pair according to the target shrinking method to obtain the shrunk standard density statistic includes: performing a logarithmic transform with a preset value (greater than 1) as the base and the standard density statistic as the argument, and taking the resulting logarithm as the shrunk standard density statistic. Shrinking the predicted density statistic of the image pair according to the target shrinking method to obtain the shrunk predicted density statistic includes: performing a logarithmic transform with the preset value as the base and the predicted density statistic as the argument, and taking the resulting logarithm as the shrunk predicted density statistic.
Specifically, suppose the preset value is a, the standard density statistic is N, and the predicted density statistic is M. The shrunk standard density statistic is then log_a N and the shrunk predicted density statistic is log_a M, and the computer device can obtain the image-pair loss value from the difference between log_a N and log_a M. The preset value is greater than 1 and can be, for example, e.
In some other embodiments, considering that some regions of a training sample image may contain no target object, the density statistic of such a region may be 0 in both the standard density map and the predicted density map. To avoid errors when taking the logarithm, a constant offset can be added to each density statistic; the offset can be set as required, for example 1e-3 (i.e., 0.001), and the logarithmic transform is then applied as in the above embodiments. The image-pair loss value is computed as in the following formula (1), where pred is the predicted density statistic of the predicted image block in an image pair, gt is the standard density statistic of the standard image block in that pair, Loss is the image-pair loss value, and log denotes a logarithmic transform whose base can be any number greater than 1, for example e:
Loss = |log(pred + 1e-3) - log(gt + 1e-3)|       (1)
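A minimal sketch of formula (1), using the natural logarithm and the 1e-3 offset from the text; the function name is illustrative:

```python
import math

def image_pair_loss(pred, gt, eps=1e-3):
    """Formula (1): |log(pred + 1e-3) - log(gt + 1e-3)|, natural log."""
    return abs(math.log(pred + eps) - math.log(gt + eps))

# The log shrinks large counts more, so the same absolute error of 5 objects
# produces a much smaller loss in a dense patch than in a sparse one:
dense_loss = image_pair_loss(pred=105.0, gt=100.0)   # error of 5 at count 100
sparse_loss = image_pair_loss(pred=6.0, gt=1.0)      # error of 5 at count 1
```

This illustrates why the shrinkage damps the gradient contribution of hard, high-density patches, as discussed below formula (1).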
In the above embodiments, the standard density statistic and the predicted density statistic of each image pair are shrunk according to the target shrinking method, and the image-pair loss value is obtained from the difference between the shrunk standard density statistic and the shrunk predicted density statistic. Because the shrinkage magnitude is positively correlated with the size of the value being shrunk, the prediction deviation of hard-to-predict samples (i.e., image blocks in high-density regions) is reduced, and the back-propagated gradient shrinks accordingly. Since image blocks in high-density regions are likely to be erroneous samples, this helps suppress the excessive gradients caused by some erroneous samples and highlights the gradients of useful samples, which benefits the optimization of model parameters during training.
In some embodiments, aggregating the image-pair loss values into the target loss value includes: determining a loss-value weight for each image-pair loss value according to the standard density statistic of the image pair, where the loss-value weight is negatively correlated with the standard density statistic; and computing a weighted sum of the image-pair loss values using the loss-value weights to obtain the target loss value.
Specifically, considering that regions with small density values usually occupy most of an image, the image blocks corresponding to such regions can be given more attention during training so that the total density-statistic error of these samples (i.e., image blocks) is smaller. Based on this, the computer device can determine the loss-value weight of an image-pair loss value from the standard density statistic of the image pair, with the weight negatively correlated with the statistic: the larger the standard density statistic, the smaller the loss-value weight; the smaller the standard density statistic, the larger the loss-value weight.
In some embodiments, a preset threshold Y can be set. When the standard density statistic X of an image pair is greater than the preset threshold Y, the statistic is judged to be large, and the computer device assigns a smaller loss-value weight a to the corresponding image-pair loss value; when X is less than Y, the statistic is judged to be small, and the computer device assigns a larger loss-value weight b, where b is greater than a.
In some embodiments, as shown in FIG. 6, determining the loss-value weight of an image-pair loss value according to the standard density statistic of the image pair includes:
Step 602: divide the standard density statistics into density intervals to obtain multiple density intervals.
Specifically, suppose dividing the standard density map yields N standard image blocks and, excluding the standard image blocks containing zero objects, the minimum of the standard density statistics of the remaining blocks is a and the maximum is b. The standard density statistics can then be divided into K (K ≥ 2) density intervals, where K can take any required value, for example K = 4, such that the statistic range of the i-th (1 ≤ i ≤ K) density interval is given by formula (2):
[e^{i*(log(b)-log(a))/K + log(a)}, e^{(i+1)*(log(b)-log(a))/K + log(a)}]       (2)
Step 604: obtain the number of standard image blocks whose standard density statistic falls within each density interval.
Step 606: based on the number of image blocks in the density interval to which a standard image block belongs, determine the loss-value weight of the image-pair loss value corresponding to that standard image block; the number of image blocks is positively correlated with the loss-value weight.
Specifically, for each density interval i, the computer device counts the number n_i of standard image blocks that fall within that interval.
In some embodiments, the computer device can compute the ratio p_i of the number of image blocks n_i in density interval i to the total number of standard image blocks N, as in formula (3):
p_i = n_i / N       (3)
The loss-value weight of the image-pair loss values corresponding to the standard image blocks in that density interval is then determined from this ratio. For example, the computer device can directly use the ratio as the loss-value weight of the image-pair loss values corresponding to the standard image blocks in density interval i.
In some specific embodiments, after computing the ratio of the number of image blocks n_i in density interval i to the total number of standard image blocks N, the computer device can compute the image-pair loss value for the standard image blocks in that interval according to formula (4), where α can take any required value, for example 20.0:
Loss = |log(pred + 1e-3) - log(gt + 1e-3)| * (1 + α * p_i)       (4)
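Formulas (2) through (4) can be combined into one sketch. The binning below is a plausible reading of the log-spaced intervals in formula (2) (the exact index convention is ambiguous in the text), and the function name is illustrative:

```python
import math

def interval_weighted_losses(std_counts, pred_counts, K=4, alpha=20.0, eps=1e-3):
    """Weight each patch loss by the fraction p_i of patches whose standard
    count falls in the same log-spaced density interval (formulas (2)-(4))."""
    nonzero = [c for c in std_counts if c > 0]
    a, b = min(nonzero), max(nonzero)

    def interval_of(c):
        if c <= 0 or b == a:
            return 0
        # log-spaced bin index, clipped to [0, K-1]
        i = int(K * (math.log(c) - math.log(a)) / (math.log(b) - math.log(a) + 1e-12))
        return min(max(i, 0), K - 1)

    bins = [interval_of(c) for c in std_counts]
    n = [bins.count(i) for i in range(K)]
    p = [ni / len(std_counts) for ni in n]          # formula (3): p_i = n_i / N
    losses = []
    for c_gt, c_pred, b_i in zip(std_counts, pred_counts, bins):
        base = abs(math.log(c_pred + eps) - math.log(c_gt + eps))
        losses.append(base * (1 + alpha * p[b_i]))  # formula (4)
    return losses

# Three sparse patches share one interval (p = 0.75), one dense patch is
# alone in its interval (p = 0.25), so sparse-patch losses are upweighted.
losses = interval_weighted_losses([1.0, 1.2, 1.1, 50.0], [1.5, 1.0, 1.3, 40.0])
```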
In the above embodiments, the standard density statistics are divided into multiple density intervals, the number of standard image blocks whose statistic falls in each interval is obtained, and the loss-value weight of each standard image block's image-pair loss value is determined from the number of blocks in its interval. Because the number of image blocks is positively correlated with the loss-value weight, the image blocks of heavily populated density intervals receive more attention, making the total prediction error of those blocks smaller.
In some embodiments, aggregating the image-pair loss values into the target loss value includes: attenuating the image-pair loss values according to a target attenuation method to obtain attenuated image-pair loss values, where the attenuation magnitude of the target attenuation method is positively correlated with the image-pair loss value; and summing the attenuated image-pair loss values to obtain the target loss value.
Here, the target attenuation method refers to a method capable of reducing an image-pair loss value. The attenuation magnitude is positively correlated with the size of the image-pair loss value: the larger the loss value, the larger the attenuation; conversely, the smaller the loss value, the smaller the attenuation. The attenuation magnitude is the difference between the image-pair loss value before attenuation and the value after attenuation.
Specifically, the more erroneous a sample is (i.e., a standard image block with an inaccurate object density value), the larger its prediction error is likely to be. Based on this, when training the object density determination model, the computer device can attenuate the image-pair loss values according to the target attenuation method to obtain attenuated image-pair loss values, and sum the attenuated values to obtain the target loss value.
In some embodiments, the image-pair loss values of all image pairs can be sorted, a preset fraction (for example 10%) of the largest loss values can be selected according to the sorting result, and those loss values can be set to 0, so that these possibly mislabeled samples are filtered out during training, stabilizing the training process of the network. For example, with 100 image pairs, the computer device can sort their image-pair loss values in descending order, select the top 10 loss values, and set them directly to 0.
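The filtering step above can be sketched as follows; the function name and the list-based representation of per-pair losses are illustrative assumptions:

```python
def filter_top_losses(pair_losses, drop_frac=0.1):
    """Zero out the largest drop_frac of per-pair losses (treated as likely
    mislabeled samples) before summing into the target loss."""
    k = int(len(pair_losses) * drop_frac)
    if k == 0:
        return list(pair_losses)
    # indices of the k largest losses
    worst = set(sorted(range(len(pair_losses)),
                       key=lambda i: pair_losses[i], reverse=True)[:k])
    return [0.0 if i in worst else l for i, l in enumerate(pair_losses)]

losses = [0.1] * 18 + [5.0, 9.0]           # 20 pairs; two suspiciously large
filtered = filter_top_losses(losses, 0.1)  # drops the top 10% (2 pairs)
target_loss = sum(filtered)
```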
In some other embodiments, the computer device can obtain a preset exponential function and use it to weight the image-pair loss values, where the value of the exponential function is negatively correlated with the image-pair loss value: the larger the loss value, the smaller the value of the exponential function; conversely, the smaller the loss value, the larger the value of the exponential function. In this way, samples with large prediction errors can still participate in training while being prevented from dominating the gradient information of the whole training process. The exponential function can be, for example, e^{-x}, where x is the image-pair loss value and x*e^{-x} is the attenuated image-pair loss value.
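A minimal sketch of the x*e^{-x} attenuation; the function name is illustrative:

```python
import math

def decayed_loss(x):
    """Attenuate a pair loss x by the factor e^{-x}: large (likely noisy)
    losses are damped, small losses pass through almost unchanged."""
    return x * math.exp(-x)

small = decayed_loss(0.1)   # close to the raw loss of 0.1
large = decayed_loss(5.0)   # heavily suppressed despite the larger raw loss
```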
In the above embodiments, the computer attenuates the image-pair loss values according to the target attenuation method, obtains the attenuated loss values, and sums them to obtain the target loss value. When this target loss value is back-propagated to adjust the model parameters of the object density determination model, the samples with the largest image-pair loss values have been suppressed by the attenuation, so the gradient information contributed by useful samples is highlighted; since a larger proportion of this beneficial gradient information comes from correctly labeled samples, it is more helpful for training the model.
In some embodiments, dividing the standard density map and the predicted density map to obtain multiple standard image blocks corresponding to the standard density map and multiple predicted image blocks corresponding to the predicted density map includes: obtaining a sliding window; sliding the sliding window over the standard density map in a preset sliding manner and taking the image region inside the sliding window as a standard image block; and sliding the sliding window over the predicted density map in the same preset sliding manner and taking the image region inside the sliding window as a predicted image block.
There can be one or more sliding windows, where "more" means at least two. The size of a sliding window can be set as required, for example according to the size of the training sample image, and multiple sliding windows can have the same or different sizes. The preset sliding manner refers to determining a sliding starting point on the training image and sliding so as to traverse the entire training sample image in a fixed order.
Specifically, after obtaining the sliding window, the computer device slides it over the standard density map in the preset sliding manner and, at each sliding step, takes the image region inside the window as a standard image block; the computer device then slides the window over the predicted density map in the same sliding manner and, at each step, takes the image region inside the window as a predicted image block.
In some embodiments, to improve sliding efficiency, the sliding window can be slid without overlap, meaning that no pixels are shared between the two image blocks obtained by two adjacent sliding steps.
For example, suppose the standard density map is 128*128. Sliding a 4*4 window over it without overlap yields 1024 standard image blocks of size 4*4; an 8*8 window yields 256 standard image blocks of size 8*8; a 16*16 window yields 64 standard image blocks of size 16*16; and a 32*32 window yields 16 standard image blocks of size 32*32.
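The patch counts in the example follow directly from (map size / window size) squared; a quick check:

```python
# Number of non-overlapping windows of size w x w on a 128 x 128 density map:
map_size = 128
patch_counts = {w: (map_size // w) ** 2 for w in (4, 8, 16, 32)}
# yields 1024, 256, 64, and 16 patches, matching the example in the text
```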
In the above embodiments, because the same sliding window can be slid over the standard density map and the predicted density map in the same sliding manner, standard image blocks and predicted image blocks of identical size and one-to-one corresponding positions are obtained, ensuring the accuracy of the image-position correspondence between standard image blocks and predicted image blocks.
In some embodiments, the training sample image is annotated with multiple object position points, and obtaining the training sample image and its corresponding standard density map includes: determining the object response map corresponding to the training sample image from its object position points, where the pixel value at an object position point in the object response map is a first pixel value and the pixel value at a non-object position point is a second pixel value; and performing convolution on the object response map to obtain the standard density map corresponding to the training sample image.
其中,对象位置点用于表征目标对象在训练样本图像中的实际位置。对象位置点具体可以是对象中心点,例如,当目标对象为自然人时,对象中心点具体可以是人头中心点。对象响应图指的是对对象中心点位置进行响应得到的图像,该图像与训练样本图像的尺寸相同。在对象响应图中,对象位置点的像素值为第一像素值,非对象位置点的像素值为第二像素值,第一像素值与第二像素值为不同的像素值,从而可以在对象响应图中区别对象位置点和非对象位置点。第一像素值例如可以是1,第二相似值例如可以是0。Among them, the object position point is used to represent the actual position of the target object in the training sample image. The object position point may specifically be the center point of the object. For example, when the target object is a natural person, the center point of the object may specifically be the center point of the human head. The object response map refers to the image obtained by responding to the position of the center point of the object, and the image is the same size as the training sample image. In the object response map, the pixel value of the object position point is the first pixel value, the pixel value of the non-object position point is the second pixel value, the first pixel value and the second pixel value are different pixel values, so that the object Distinguish between object location points and non-object location points in the response graph. The first pixel value may be, for example, 1, and the second similarity value may be, for example, 0.
具体地,计算机设备可以分别对训练样本图像所对应的各个对象位置点进行响应,得到各个对象位置点的响应图,该响应图与训练样本图像尺寸相同,然后将所有的响应图进行像素叠加,得到训练样本图像对应的对象响应图,计算机设备进一步可以按照预设的高斯核对对象响应图进行卷积处理,得到训练样本图像对应的标准密度图。Specifically, the computer equipment can respectively respond to each object position point corresponding to the training sample image to obtain a response map of each object position point, the response map is the same size as the training sample image, and then all the response maps are pixel-superimposed, The object response map corresponding to the training sample image is obtained, and the computer device can further perform convolution processing on the object response map according to a preset Gaussian kernel to obtain a standard density map corresponding to the training sample image.
For example, assume the target object is a natural person and the training sample image is annotated with N head center points x_1, x_2, ..., x_N. A given head center point x_i (1 ≤ i ≤ N) can be represented as an image δ(x - x_i) of the same size as the training sample image, in which only position x_i has the value 1 and all other positions are 0. The N heads can then be represented as H(x), as in the following formula (5):

H(x) = Σ_{i=1}^{N} δ(x - x_i)     (5)

Note that integrating this image yields the total number of people in the training sample image. Convolving the image with a Gaussian kernel G_σ then yields the standard density map D corresponding to the training sample image, as in the following formula (6):

D = G_σ * H(x)     (6)
Understandably, since the Gaussian kernel is normalized, integrating the convolved density map D likewise yields the total number of people in the training sample image.
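The construction of formulas (5) and (6) can be sketched as follows. This is a minimal illustration, not the embodiment's implementation: the kernel radius, the σ value, and the assumption that head centers lie at least `radius` pixels from the image border are all simplifying assumptions.

```python
import numpy as np

def standard_density_map(shape, head_points, sigma=4.0, radius=16):
    """Build H(x) per formula (5) and convolve it with a normalized
    Gaussian per formula (6). Assumes heads lie away from the border."""
    yy, xx = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    kernel = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))
    kernel /= kernel.sum()          # normalized kernel, so the integral of D equals N
    density = np.zeros(shape)
    for y, x in head_points:        # one delta function delta(x - x_i) per head center
        density[y - radius:y + radius + 1, x - radius:x + radius + 1] += kernel
    return density

density = standard_density_map((128, 128), [(20, 30), (64, 64), (100, 90)])
print(round(float(density.sum()), 6))   # prints 3.0: integrating D recovers the head count
```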
In the above embodiment, the computer device determines the object response map corresponding to the training sample image according to the object position points corresponding to the training sample image, and then performs convolution processing on the object response map to obtain the standard density map corresponding to the training sample image. This eliminates the sparsity of the features in the object response map, and the resulting standard density map is more conducive to model learning.
In some embodiments, as shown in FIG. 7, an object density determination method is provided. The method may be applied to a computer device, which may be the terminal or the server in FIG. 1, or an interactive system composed of a terminal and a server. The method specifically includes the following steps:

Step 702: acquire a target image whose density is to be determined.

The target image whose density is to be determined is a target image on which density determination needs to be performed. The target image contains one or more target objects.

Specifically, the computer device may photograph a scene containing one or more target objects to obtain the target image whose density is to be determined. The computer device may also acquire the target image from another computer device over a network. Depending on requirements, the target image may depict various scenes. For example, the target image may be an image used to monitor a crowd in a target place, such as a subway station or a shopping mall.
Step 704: input the target image into a trained object density determination model, and perform object density determination through the object density determination model.

The object density determination model is obtained by adjusting the parameters of an object density determination model to be trained based on the difference between the standard density statistic and the predicted density statistic corresponding to each image pair. An image pair is composed of a standard image block and a predicted image block that has an image-position correspondence with the standard image block. The standard image blocks are obtained by dividing the standard density map corresponding to a training sample image; the predicted image blocks are obtained by dividing the predicted density map, which is obtained by inputting the training sample image into the object density determination model to be trained.

Step 706: obtain the object density map corresponding to the target image output by the object density determination model.
For a detailed description of steps 702 to 704, reference may be made to the foregoing embodiments, which will not be repeated here.
In the above object density determination method, the object density determination model is obtained by adjusting the parameters of the model to be trained based on the difference between the standard density statistic and the predicted density statistic corresponding to each image pair, where an image pair is composed of a standard image block and a predicted image block that has an image-position correspondence with the standard image block, the standard image blocks are obtained by dividing the standard density map corresponding to the training sample image, and the predicted image blocks are obtained by dividing the predicted density map, which is obtained by inputting the training sample image into the model to be trained. During training, the model can therefore fit the density values of local regions in units of image blocks, taking the overall density value of each local region into account, which improves the accuracy of the trained object density determination model when it is used to determine object density. Consequently, when the target image is input into the trained object density determination model, the model can output an accurate object density map.
In some embodiments, after obtaining the object density map corresponding to the target image output by the object density determination model, the computer device may integrate the object density map to determine the total number of target objects in the target image.
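For a discrete density map, the "integration" amounts to a pixel-wise sum. A hypothetical helper illustrating this step (the function name and the rounding policy are assumptions):

```python
import numpy as np

def count_objects(object_density_map):
    # the discrete integral of the density map is its pixel-wise sum;
    # round to the nearest integer to report a whole object count
    return int(round(float(np.asarray(object_density_map).sum())))

# a uniform 64x64 map whose values sum to exactly 208
print(count_objects(np.full((64, 64), 208 / 4096.0)))   # 208
```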
In some embodiments, after obtaining the object density map corresponding to the target image output by the object density determination model, the computer device may display the object density map in the form of a heat map. In the displayed object density map, a darker color indicates a denser concentration of target objects.
In some embodiments, the object density determination method further includes a training step for the object density determination model. The training step specifically includes: acquiring a training sample image and a standard density map corresponding to the training sample image; inputting the training sample image into the object density determination model to be trained to obtain a predicted density map output by the model; dividing the standard density map and the predicted density map respectively to obtain a plurality of standard image blocks corresponding to the standard density map and a plurality of predicted image blocks corresponding to the predicted density map; performing statistics on the object density in each standard image block to obtain the standard density statistic corresponding to that standard image block, and performing statistics on the object density in each predicted image block to obtain the predicted density statistic corresponding to that predicted image block; composing each standard image block and the predicted image block that has an image-position correspondence with it into an image pair; and adjusting the parameters of the object density determination model to be trained based on the difference between the standard density statistic and the predicted density statistic corresponding to each image pair, to obtain the trained object density determination model.

For a specific description of the training step, reference may be made to the foregoing embodiments, which will not be repeated here.
This application further provides an application scenario in which the above object density determination method is applied to implement intelligent transportation. In this scenario, the object density determination method provided by the embodiments of this application can perform passenger flow statistics for any traffic location: a real-time crowd image of the monitored traffic location is captured by a monitoring device such as a camera and sent to a server on which a trained crowd density determination model (that is, the object density determination model in the above embodiments) is deployed.

Specifically, the object density determination method is applied in this scenario as follows:

(1) The object density determination model is trained in advance on the server through the following steps:
1. The server obtains a training sample set in which the training sample images are annotated with head center points. For each training sample image, a crowd response map of the same size is obtained, in which the pixel value at each head center point is 1 and the pixel values at all other positions are 0. The server then convolves the response map with a preset Gaussian kernel to obtain the standard density map corresponding to the training sample image.
It should be noted that the standard deviation of the Gaussian kernel here is manually specified or estimated, so for heads of different scales the regions covered by the Gaussian kernel are inconsistent. FIG. 8 illustrates the Gaussian kernel at two heads of different sizes: the region covered by the Gaussian kernel in part (a) is region 802, and in part (b) it is region 804. It is evident that the semantic information of these two regions is not the same.

This inconsistency of semantic information makes the density values in the standard density maps corresponding to the training sample images inaccurate. In the related art, these density values must be fitted pixel by pixel during training, so the resulting crowd density determination model has low accuracy when used to determine crowd density. The object density determination method provided by the embodiments of this application can effectively avoid this problem.
2. The training sample images are input into the crowd density determination model to be trained to obtain the predicted density maps output by the model.

The crowd density determination model is based on deep learning technology: it takes a single image as input and extracts image features through a deep convolutional network. Since the crowd density determination task requires both contextual features with high-level semantic information and local detail information, a U-shaped network structure that first downsamples and then upsamples is typically used to obtain high-resolution feature maps carrying both, with skip connections introduced to supply detail information during upsampling; finally, the crowd density map is predicted as output. The network structure of the crowd density determination model is shown in FIG. 3.
3. A preset sliding window is obtained and slid over the standard density map in a preset sliding manner, the image region within the sliding window being taken as a standard image block, yielding a plurality of standard image blocks. The sliding window is slid over the predicted density map in the same preset sliding manner, the image region within the sliding window being taken as a predicted image block, yielding a plurality of predicted image blocks.
4. The object density in each standard image block is counted to obtain the standard density statistic corresponding to that standard image block, and the object density in each predicted image block is counted to obtain the predicted density statistic corresponding to that predicted image block.

Specifically, for each standard image block, the server may accumulate the crowd density values in that block to obtain the standard density statistic corresponding to that block; likewise, for each predicted image block, the server may accumulate the crowd density values in that block to obtain the predicted density statistic corresponding to that block.
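Steps 3 and 4 (non-overlapping division followed by per-block accumulation) can be sketched as follows. This is an illustrative NumPy sketch assuming map sides divisible by the window size; the function name is an assumption:

```python
import numpy as np

def block_density_sums(density_map, win):
    """Accumulate the density values inside each non-overlapping win x win block."""
    h, w = density_map.shape
    blocks = density_map.reshape(h // win, win, w // win, win).swapaxes(1, 2)
    return blocks.sum(axis=(2, 3)).ravel()     # one density statistic per image block

# a toy "standard" map with two heads and a uniform "predicted" map, both summing to 2
standard = np.zeros((8, 8)); standard[1, 1] = 1.0; standard[6, 6] = 1.0
predicted = np.full((8, 8), 2.0 / 64)
s_stats = block_density_sums(standard, 4)      # [1.0, 0.0, 0.0, 1.0]
p_stats = block_density_sums(predicted, 4)     # [0.5, 0.5, 0.5, 0.5]
print(s_stats.tolist(), p_stats.tolist())
```

Applying the same window to both maps keeps the statistics aligned block by block, so `s_stats[i]` and `p_stats[i]` always describe the same image region.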
5. Each standard image block and the predicted image block that has an image-position correspondence with it are composed into an image pair, yielding a plurality of image pairs. For the standard density statistic and predicted density statistic of each image pair: a constant offset is first added to each; then, with e as the base, a logarithmic transformation is applied with the standard density statistic and the predicted density statistic respectively as the antilogarithm, yielding the logarithm corresponding to the standard density statistic and the logarithm corresponding to the predicted density statistic; the difference between these two logarithms is taken, and its absolute value is used as the image pair loss value of that image pair.
6. Density intervals are divided based on the standard density statistics of the standard image blocks, yielding a plurality of density intervals.

7. For each density interval, the number of standard image blocks falling in that interval is counted, and the proportion of that number to the total number of image blocks is calculated. The loss value weight of the image pair loss values of the image pairs corresponding to the standard image blocks in that interval is determined according to the proportion, where the proportion is positively correlated with the loss value weight.

8. For each image pair, the difference between its standard density statistic and predicted density statistic is calculated; the 10% of image pairs with the largest differences are selected, and their image pair loss values are set to 0. The image pair loss values of the remaining image pairs are weighted and summed to obtain the target loss value, and the model parameters of the crowd density determination model are adjusted by back-propagation according to the target loss value until the convergence condition is met, yielding the trained crowd density determination model.
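Steps 5 through 8 can be combined into a single loss sketch. The constant offset value, the number of density intervals, and the exact weight normalization below are illustrative assumptions; the natural-log shrink and the 10% filter ratio follow the description above.

```python
import numpy as np

def target_loss(s_stats, p_stats, offset=1.0, n_bins=4, drop_ratio=0.1):
    """Sketch of steps 5-8: log-shrunk pair losses, density-interval
    weighting, and filtering of the worst-predicted pairs."""
    s = np.asarray(s_stats, dtype=float)
    p = np.asarray(p_stats, dtype=float)
    # step 5: add a constant offset, log-transform, per-pair loss |ln(s+c) - ln(p+c)|
    pair_loss = np.abs(np.log(s + offset) - np.log(p + offset))
    # steps 6-7: weight each pair by the share of blocks in its density interval
    edges = np.linspace(s.min(), s.max() + 1e-9, n_bins + 1)[1:-1]
    bins = np.digitize(s, edges)
    counts = np.bincount(bins, minlength=n_bins)
    weights = counts[bins] / len(s)            # proportion, positively related to weight
    # step 8: zero out the losses of the pairs with the largest raw differences
    n_drop = max(1, int(drop_ratio * len(s)))
    worst = np.argsort(np.abs(s - p))[-n_drop:]
    pair_loss[worst] = 0.0
    return float(np.sum(weights * pair_loss))

s = np.array([0.0, 0.1, 0.2, 0.1, 5.0, 0.0, 0.3, 0.2, 0.1, 0.0])
p = np.array([0.1, 0.1, 0.3, 0.2, 9.0, 0.1, 0.2, 0.1, 0.2, 0.1])
print(target_loss(s, p) >= 0.0)   # True
```

In this toy example, the pair with standard statistic 5.0 has the largest raw difference (4.0), so its loss is filtered out, and the remaining low-density pairs dominate the weighted sum, which mirrors the per-interval mining described above.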
(2) The server inputs the crowd image into the trained crowd density determination model, performs density determination on the crowd image through the model to obtain the crowd density map corresponding to the crowd image, and integrates the crowd density map to obtain the total number of people in the crowd image (people are counted by head center points in the image). The crowd density map and the total count are sent to the terminal, which can display the crowd density map in the form of a heat map.

For example, as shown in FIG. 9, applying the object density determination method provided by this application, the server can perform object density determination on part (a) of FIG. 9 to obtain a crowd density map, and can also determine the total number of people in the crowd image from the crowd density map, for example a total of 208. The server sends the crowd density map to the terminal, which displays the degree of crowd density in the image, as shown in part (b) of FIG. 9. Part (b) shows the total count of 208; the crowd density may differ between image regions, which can be displayed in different colors (different patterns are used in place of colors in part (b) of the figure). When a density value greater than a preset threshold is detected in the image, the terminal may also generate prompt information to indicate a possible excessive flow of people.
This application further provides another application scenario in which the above object density determination method is applied to implement a smart supermarket. In this scenario, by obtaining crowd density maps of the various target areas of a supermarket, the terminal can count the flow of people in each area on a periodic basis and generate reports from the statistics for relevant personnel, who can use them to adjust the floor area allocated to target areas and relieve crowding in some areas.

This application further provides another application scenario in which the above object density determination method is applied to crowd density monitoring in tourist attractions. In this scenario, the crowd density at each popular spot in a tourist attraction can be monitored; when the crowd density in a target area exceeds a threshold, monitoring personnel can be alerted in text or voice form to improve the safety of the target area.
The object density determination method provided by the embodiments of this application alleviates, from several angles, the problems that arise in the related art when regressing artificially generated density maps. First, standard density map regression is converted into density statistic regression; then a logarithmic transformation is applied to the density statistics to reduce the gradients produced by samples with large prediction deviations; finally, the gradient information of samples with large prediction errors is filtered out, stabilizing the optimization process of the network. With the negative effects of inaccurate artificially generated density maps removed, the network can be optimized to a better local optimum and thus achieves better generalization. At the same time, this scheme fully accounts for the contribution of the majority of samples with low density values to the final counting error, and mitigates this in the optimization process through per-interval mining, which helps further reduce the training error.
It should be understood that although the steps in the flowcharts of FIGS. 2-9 are shown in sequence as indicated by the arrows, these steps are not necessarily executed in that order. Unless explicitly stated herein, there is no strict ordering restriction on the execution of these steps, and they may be executed in other orders. Moreover, at least some of the steps in FIGS. 2-9 may include multiple sub-steps or stages, which are not necessarily completed at the same moment but may be executed at different times; the execution order of these sub-steps or stages is likewise not necessarily sequential, and they may be executed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
In some embodiments, as shown in FIG. 10, an object density determination apparatus 1000 is provided. The apparatus may be implemented as software modules or hardware modules, or a combination of the two, as part of a computer device, and specifically includes:

an image acquisition module 1002, configured to acquire a training sample image and a standard density map corresponding to the training sample image;

an image input module 1004, configured to input the training sample image into an object density determination model to be trained, to obtain a predicted density map output by the object density determination model;

an image division module 1006, configured to divide the standard density map and the predicted density map respectively, to obtain a plurality of standard image blocks corresponding to the standard density map and a plurality of predicted image blocks corresponding to the predicted density map;

a density statistics module 1008, configured to perform statistics on the object density in each standard image block to obtain the standard density statistic corresponding to the standard image block, and to perform statistics on the object density in each predicted image block to obtain the predicted density statistic corresponding to the predicted image block; and

a training module 1010, configured to compose a standard image block and a predicted image block that has an image-position correspondence with the standard image block into an image pair, and to adjust the parameters of the object density determination model to be trained based on the difference between the standard density statistic and the predicted density statistic corresponding to the image pair, to obtain a trained object density determination model, the trained object density determination model being used to generate object density maps.
In the above object density determination apparatus, since the standard density map and the predicted density map are each divided to obtain a plurality of standard image blocks corresponding to the standard density map and a plurality of predicted image blocks corresponding to the predicted density map, and the object density in each standard image block and each predicted image block is counted to obtain the corresponding standard density statistic and predicted density statistic, during training a standard image block and the predicted image block that has an image-position correspondence with it can be composed into an image pair, and the parameters of the model to be trained can be adjusted based on the difference between the standard density statistic and the predicted density statistic corresponding to the image pair. The model can therefore fit the density values of local regions in units of image blocks, taking the overall density value of each local region into account, which improves the accuracy of the trained object density determination model when it is used to determine object density.
在一些实施例中,训练模块1010还用于基于图像对所对应的标准密度统计值与预测密度统计值之间的差异,得到图像对所对应的图像对损失值;对图像对损失值进行统计,得到目标损失值;基于目标损失值对待训练的对象密度确定模型进行参数调整,得到训练后的对象密度确定模型。In some embodiments, the training module 1010 is further configured to obtain the image pair loss value corresponding to the image pair based on the difference between the standard density statistic value corresponding to the image pair and the predicted density statistic value; perform statistics on the image pair loss value , obtain the target loss value; adjust the parameters of the object density determination model to be trained based on the target loss value, and obtain the trained object density determination model.
在一些实施例中,训练模块1010还用于按照目标收缩方式对图像对所对应的标准密度统计值进行收缩,得到收缩后的标准密度统计值,目标收缩方式所对应的收缩幅度与待收缩 数值的大小成正相关关系;按照目标收缩方式对图像对所对应的预测密度统计值进行收缩,得到收缩后的预测密度统计值;根据收缩后的标准密度统计值与收缩后的预测密度统计值的差值,得到图像对所对应的图像对损失值,其中图像对损失值与差值的成正相关关系。In some embodiments, the training module 1010 is further configured to shrink the standard density statistic value corresponding to the image pair according to the target shrinkage mode to obtain the shrunk standard density statistic value, the shrinkage amplitude corresponding to the target shrinkage mode and the value to be shrunk The size of the shrinkage is positively correlated; the predicted density statistic value corresponding to the image pair is shrunk according to the target shrinkage method to obtain the shrunk predicted density statistic value; according to the difference between the shrunk standard density statistic value and the shrunk predicted density statistic value value, the loss value of the image pair corresponding to the image pair is obtained, wherein the loss value of the image pair is positively correlated with the difference value.
在一些实施例中,训练模块1010还用于将预设数值作为底数,以标准密度统计值作为真数进行对数变换,将所得到的对数作为收缩后的标准密度统计值,预设数值大于1;将预设数值作为底数,以预测密度统计值作为真数进行对数变换,将所得到的对数作为收缩后的预测密度统计值。In some embodiments, the training module 1010 is further configured to use the preset value as the base, perform logarithmic transformation with the standard density statistic value as the true number, and use the obtained logarithm as the shrunk standard density statistic value, the preset value greater than 1; take the preset value as the base, perform logarithmic transformation with the predicted density statistic value as the true number, and use the obtained logarithm as the shrunk predicted density statistic value.
在一些实施例中,训练模块1010还用于根据图像对所对应的标准密度统计值确定图像对损失值的损失值权重,损失值权重与标准密度统计值成负相关关系;基于损失值权重以及图像对损失值进行加权求和,得到目标损失值。In some embodiments, the training module 1010 is further configured to determine the loss value weight of the loss value of the image pair according to the standard density statistic value corresponding to the image pair, and the loss value weight has a negative correlation with the standard density statistic value; based on the loss value weight and The image performs a weighted sum of the loss values to obtain the target loss value.
In some embodiments, the training module 1010 is further configured to divide the standard density statistics into density intervals to obtain multiple density intervals; obtain the number of standard image blocks whose standard density statistics fall within each density interval; and determine the loss-value weight of the image-pair loss value corresponding to a standard image block based on the number of image blocks in the density interval to which that block belongs, where the number of image blocks is positively correlated with the loss-value weight.
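A minimal sketch of the interval-based weighting: blocks whose statistics fall in a well-populated density interval receive a larger weight, so the rare, very dense blocks (which tend to carry large statistics) contribute less, consistent with the negative correlation to the standard density statistic. Equal-width binning and the normalisation to a unit sum are assumptions for illustration:

```python
import numpy as np

def interval_weights(std_stats, num_bins=4):
    """Weight each image-pair loss by how populated its density interval is."""
    std_stats = np.asarray(std_stats, dtype=np.float64)
    edges = np.linspace(std_stats.min(), std_stats.max(), num_bins + 1)
    # Assign each statistic to one of num_bins equal-width intervals.
    idx = np.clip(np.digitize(std_stats, edges[1:-1]), 0, num_bins - 1)
    counts = np.bincount(idx, minlength=num_bins)
    weights = counts[idx].astype(np.float64)   # weight ∝ interval population
    return weights / weights.sum()

def target_loss(pair_losses, std_stats, num_bins=4):
    # Weighted summation of the image-pair loss values.
    w = interval_weights(std_stats, num_bins)
    return float(np.sum(w * np.asarray(pair_losses, dtype=np.float64)))
```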
In some embodiments, the training module 1010 is further configured to attenuate the image-pair loss values according to a target attenuation mode to obtain attenuated image-pair loss values, where the attenuation amplitude of the target attenuation mode is positively correlated with the image-pair loss value; and to sum the attenuated image-pair loss values to obtain the target loss value.
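One attenuation mode satisfying this property is a log transform of the loss itself; the text does not fix the function, so `log1p` is an assumption here. Its attenuation amplitude `L - log(1 + L)` increases with `L`, so outlier pairs dominate the summed target loss less:

```python
import numpy as np

def attenuate(pair_losses):
    """Attenuate each image-pair loss before summation (log1p assumed)."""
    return np.log1p(np.asarray(pair_losses, dtype=np.float64))

def target_loss_attenuated(pair_losses):
    # Sum of the attenuated image-pair loss values.
    return float(attenuate(pair_losses).sum())
```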
In some embodiments, the image division module 1006 is further configured to obtain a sliding window; slide the sliding window over the standard density map in a preset sliding manner, taking the image region within the sliding window as a standard image block; and slide the sliding window over the predicted density map in the same preset sliding manner, taking the image region within the sliding window as a predicted image block.
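The sliding-window division can be sketched as below. A non-overlapping stride equal to the window size is one plausible "preset sliding manner" (an assumption; the text does not fix the stride). Applying the same call to the standard and the predicted density map yields position-aligned standard/predicted block pairs, and summing each block gives its density statistic:

```python
import numpy as np

def split_blocks(density_map, win=4, stride=4):
    """Slide a win×win window over a density map and return the blocks."""
    h, w = density_map.shape
    blocks = []
    for y in range(0, h - win + 1, stride):
        for x in range(0, w - win + 1, stride):
            blocks.append(density_map[y:y + win, x:x + win])
    return blocks

def block_stats(density_map, win=4, stride=4):
    # Per-block density statistic: the sum of the density values,
    # i.e. the (fractional) object count inside the block.
    return [float(b.sum()) for b in split_blocks(density_map, win, stride)]
```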
In some embodiments, the training sample image is annotated with multiple object position points. The image division module 1006 is further configured to determine an object response map corresponding to the training sample image according to those object position points, where in the object response map the pixel value at an object position point is a first pixel value and the pixel value at a non-object position point is a second pixel value; and to convolve the object response map to obtain the standard density map corresponding to the training sample image.
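A common realisation of this step (an assumption; the embodiment only specifies "convolution processing") builds a binary response map with 1 at each annotated point and 0 elsewhere, then convolves it with a normalised Gaussian kernel, so the resulting density map integrates to the number of annotated objects (up to border truncation). The `sigma` and kernel radius below are illustrative values:

```python
import numpy as np

def standard_density_map(shape, points, sigma=1.5):
    """Build a standard density map from annotated object position points."""
    response = np.zeros(shape, dtype=np.float64)
    for y, x in points:
        response[y, x] = 1.0          # first pixel value at object points
    # Separable Gaussian convolution with 'same' padding.
    radius = int(3 * sigma)
    t = np.arange(-radius, radius + 1)
    k = np.exp(-t ** 2 / (2 * sigma ** 2))
    k /= k.sum()                      # normalised kernel preserves total mass
    out = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, response)
    out = np.apply_along_axis(lambda c: np.convolve(c, k, mode="same"), 0, out)
    return out
```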
In some embodiments, as shown in FIG. 11, an apparatus 1100 for determining object density is provided. The apparatus may be implemented as software modules or hardware modules, or a combination of the two, forming part of a computer device. The apparatus specifically includes:
an image acquisition module 1102, configured to acquire a target image whose density is to be determined;
a density determination module 1104, configured to input the target image into a trained object density determination model and determine the object density through the model. The object density determination model is obtained by adjusting parameters of the object density determination model to be trained based on the differences between the standard density statistics and the predicted density statistics corresponding to image pairs, where each image pair consists of a standard image block and a predicted image block whose image position corresponds to that standard image block; the standard image blocks are obtained by dividing the standard density map corresponding to a training sample image; the predicted image blocks are obtained by dividing the predicted density map; and the predicted density map is obtained by inputting the training sample image into the object density determination model to be trained; and
a density map acquisition module 1106, configured to acquire the object density map corresponding to the target image output by the object density determination model.
With the above apparatus for determining object density, the object density determination model is obtained by adjusting parameters of the model under training based on the differences between the standard density statistics and the predicted density statistics corresponding to image pairs, where each image pair consists of a standard image block and a predicted image block whose image position corresponds to that standard image block, the standard image blocks are obtained by dividing the standard density map corresponding to the training sample image, the predicted image blocks are obtained by dividing the predicted density map, and the predicted density map is obtained by inputting the training sample image into the model under training. During training, the model therefore fits the density of local regions at the granularity of image blocks and takes the overall density of each local region into account, which improves the accuracy of the trained model when it is used to determine object density. Consequently, when the target image is input into the trained object density determination model, the model can output an accurate object density map.
In some embodiments, the above apparatus further includes a training module configured to: acquire a training sample image and a standard density map corresponding to the training sample image; input the training sample image into the object density determination model to be trained to obtain a predicted density map output by the model; divide the standard density map and the predicted density map respectively to obtain multiple standard image blocks corresponding to the standard density map and multiple predicted image blocks corresponding to the predicted density map; perform statistics on the object density in each standard image block to obtain its standard density statistic, and on the object density in each predicted image block to obtain its predicted density statistic; and form image pairs from each standard image block and the predicted image block whose image position corresponds to it, then adjust the parameters of the model to be trained based on the differences between the standard and predicted density statistics of the image pairs, obtaining the trained object density determination model.
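The training module's objective can be condensed into one patch-wise loss computation, sketched below: split both maps into position-aligned blocks, compute each block's density statistic, and average the per-pair differences. The L1 form and non-overlapping windows are assumptions; in gradient-based training this scalar would be backpropagated into the model:

```python
import numpy as np

def patchwise_loss(standard_map, predicted_map, win=4):
    """Mean absolute difference between aligned per-block density statistics."""
    h, w = standard_map.shape
    losses = []
    for y in range(0, h - win + 1, win):
        for x in range(0, w - win + 1, win):
            std_stat = standard_map[y:y + win, x:x + win].sum()
            pred_stat = predicted_map[y:y + win, x:x + win].sum()
            losses.append(abs(std_stat - pred_stat))  # image-pair loss value
    return float(np.mean(losses))
```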
For specific limitations on the apparatus for determining object density, refer to the limitations on the method for determining object density above; they are not repeated here. Each module in the above apparatus may be implemented in whole or in part by software, hardware, or a combination of the two. The modules may be embedded in, or independent of, a processor of a computer device in hardware form, or stored in a memory of the computer device in software form, so that the processor can invoke them to perform the operations corresponding to each module.
In some embodiments, a computer device is provided. The computer device may be a server, and its internal structure may be as shown in FIG. 12. The computer device includes a processor, a memory, and a network interface connected by a system bus. The processor provides computing and control capabilities. The memory includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer-readable instructions, and a database. The internal memory provides an environment for running the operating system and the computer-readable instructions in the non-volatile storage medium. The database is used to store training sample image data. The network interface is used to communicate with an external terminal through a network connection. When executed by the processor, the computer-readable instructions implement a method for determining object density.
Those skilled in the art will understand that the structure shown in FIG. 12 is only a block diagram of a partial structure related to the solution of this application and does not limit the computer device to which the solution is applied. A specific computer device may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.
In some embodiments, a computer device is further provided, including a memory and a processor. The memory stores computer-readable instructions which, when executed by the processor, cause the processor to perform the steps in the above method embodiments.
In some embodiments, one or more non-volatile readable storage media are provided, storing computer-readable instructions which, when executed by one or more processors, cause the processors to perform the steps in the above method embodiments.
In some embodiments, a computer program product is provided, including computer-readable instructions which, when executed by a processor, implement the steps in the above method embodiments.
Those of ordinary skill in the art will understand that all or part of the procedures in the above method embodiments can be completed by computer-readable instructions instructing relevant hardware. The computer-readable instructions may be stored in a non-volatile computer-readable storage medium, and when executed may include the procedures of the above method embodiments. Any reference to memory, storage, database, or other media used in the embodiments provided in this application may include at least one of non-volatile and volatile memory. Non-volatile memory may include read-only memory (ROM), magnetic tape, floppy disk, flash memory, or optical memory. Volatile memory may include random access memory (RAM) or an external cache. By way of illustration and not limitation, RAM may take many forms, such as static random access memory (SRAM) or dynamic random access memory (DRAM).
The technical features of the above embodiments can be combined arbitrarily. For brevity, not all possible combinations of the technical features in the above embodiments are described; however, as long as a combination of these technical features involves no contradiction, it should be considered within the scope of this specification.
The above embodiments express only several implementations of this application, and their descriptions are specific and detailed, but they should not be construed as limiting the scope of the invention patent. It should be noted that those of ordinary skill in the art can make several modifications and improvements without departing from the concept of this application, all of which fall within the protection scope of this application. Therefore, the protection scope of this patent shall be subject to the appended claims.

Claims (18)

  1. A method for determining object density, performed by a computer device, the method comprising:
    acquiring a training sample image and a standard density map corresponding to the training sample image;
    inputting the training sample image into an object density determination model to be trained, to obtain a predicted density map output by the object density determination model;
    dividing the standard density map and the predicted density map respectively, to obtain a plurality of standard image blocks corresponding to the standard density map and a plurality of predicted image blocks corresponding to the predicted density map;
    performing statistics on the object density in each standard image block to obtain a standard density statistic corresponding to the standard image block, and performing statistics on the object density in each predicted image block to obtain a predicted density statistic corresponding to the predicted image block; and
    forming an image pair from a standard image block and a predicted image block whose image position corresponds to the standard image block, and adjusting parameters of the object density determination model to be trained based on the difference between the standard density statistic and the predicted density statistic corresponding to the image pair, to obtain a trained object density determination model, the trained object density determination model being used to generate an object density map.
  2. The method according to claim 1, wherein the adjusting parameters of the object density determination model to be trained based on the difference between the standard density statistic and the predicted density statistic corresponding to the image pair, to obtain the trained object density determination model, comprises:
    obtaining an image-pair loss value corresponding to the image pair based on the difference between the standard density statistic and the predicted density statistic corresponding to the image pair;
    performing statistics on the image-pair loss values to obtain a target loss value; and
    adjusting parameters of the object density determination model to be trained based on the target loss value, to obtain the trained object density determination model.
  3. The method according to claim 2, wherein the obtaining the image-pair loss value corresponding to the image pair based on the difference between the standard density statistic and the predicted density statistic corresponding to the image pair comprises:
    shrinking the standard density statistic corresponding to the image pair according to a target shrinkage mode, to obtain a shrunk standard density statistic, the shrinkage amplitude of the target shrinkage mode being positively correlated with the magnitude of the value to be shrunk;
    shrinking the predicted density statistic corresponding to the image pair according to the target shrinkage mode, to obtain a shrunk predicted density statistic; and
    obtaining the image-pair loss value corresponding to the image pair based on the difference between the shrunk standard density statistic and the shrunk predicted density statistic, wherein the image-pair loss value is positively correlated with the difference.
  4. The method according to claim 3, wherein the shrinking the standard density statistic corresponding to the image pair according to the target shrinkage mode, to obtain the shrunk standard density statistic, comprises:
    performing a logarithmic transformation with a preset value as the base and the standard density statistic as the antilogarithm, and taking the resulting logarithm as the shrunk standard density statistic, the preset value being greater than 1; and
    wherein the shrinking the predicted density statistic corresponding to the image pair according to the target shrinkage mode, to obtain the shrunk predicted density statistic, comprises:
    performing a logarithmic transformation with the preset value as the base and the predicted density statistic as the antilogarithm, and taking the resulting logarithm as the shrunk predicted density statistic.
  5. The method according to claim 2, wherein the performing statistics on the image-pair loss values to obtain the target loss value comprises:
    determining a loss-value weight for the image-pair loss value according to the standard density statistic corresponding to the image pair, the loss-value weight being negatively correlated with the standard density statistic; and
    performing a weighted summation of the image-pair loss values based on the loss-value weights, to obtain the target loss value.
  6. The method according to claim 5, wherein the determining the loss-value weight for the image-pair loss value according to the standard density statistic corresponding to the image pair comprises:
    dividing the standard density statistics into density intervals, to obtain a plurality of density intervals;
    obtaining a number of standard image blocks whose standard density statistics fall within each density interval; and
    determining, based on the number of image blocks in the density interval corresponding to a standard image block, the loss-value weight of the image-pair loss value corresponding to the standard image block, the number of image blocks being positively correlated with the loss-value weight.
  7. The method according to claim 2, wherein the performing statistics on the image-pair loss values to obtain the target loss value comprises:
    attenuating the image-pair loss values according to a target attenuation mode, to obtain attenuated image-pair loss values, the attenuation amplitude of the target attenuation mode being positively correlated with the image-pair loss value; and
    summing the attenuated image-pair loss values, to obtain the target loss value.
  8. The method according to any one of claims 1 to 7, wherein the dividing the standard density map and the predicted density map respectively, to obtain the plurality of standard image blocks corresponding to the standard density map and the plurality of predicted image blocks corresponding to the predicted density map, comprises:
    obtaining a sliding window;
    sliding the sliding window over the standard density map in a preset sliding manner, and taking an image region within the sliding window as a standard image block; and
    sliding the sliding window over the predicted density map in the preset sliding manner, and taking an image region within the sliding window as a predicted image block.
  9. The method according to claim 8, wherein the training sample image is annotated with a plurality of object position points, and the acquiring the training sample image and the standard density map corresponding to the training sample image comprises:
    determining an object response map corresponding to the training sample image according to the object position points corresponding to the training sample image, wherein in the object response map the pixel value at each object position point is a first pixel value and the pixel value at each non-object position point is a second pixel value; and
    performing convolution processing on the object response map, to obtain the standard density map corresponding to the training sample image.
  10. The method according to claim 1, wherein the object density determination model comprises an encoding layer, a decoding layer and a prediction layer, and the inputting the training sample image into the object density determination model to be trained, to obtain the predicted density map output by the object density determination model, comprises:
    inputting the training sample image into the encoding layer and performing downsampling through the encoding layer, to obtain a first target feature;
    inputting the first target feature into the decoding layer and performing upsampling through the decoding layer, to obtain a second target feature; and
    inputting the second target feature into the prediction layer and performing object density prediction through the prediction layer, to obtain the predicted density map.
  11. The method according to claim 10, wherein the encoding layer and the decoding layer are connected by skip connections, the encoding layer comprises a plurality of first convolutional layers, and the decoding layer comprises a plurality of second convolutional layers;
    the inputting the training sample image into the encoding layer and performing downsampling through the encoding layer, to obtain the first target feature, comprises:
    in the encoding layer, downsampling, by a current first convolutional layer, an intermediate feature output by the preceding first convolutional layer, and taking the output of the last first convolutional layer as the first target feature; and
    the inputting the first target feature into the decoding layer and performing upsampling through the decoding layer, to obtain the second target feature, comprises:
    in the decoding layer, upsampling, by a current second convolutional layer, according to an intermediate feature output by the preceding second convolutional layer and an intermediate feature output by the first convolutional layer connected to it, and taking the output of the last second convolutional layer as the second target feature.
  12. A method for determining object density, performed by a computer device, the method comprising:
    acquiring a target image whose density is to be determined;
    inputting the target image into a trained object density determination model and determining the object density through the object density determination model, the object density determination model being obtained by adjusting parameters of the object density determination model to be trained based on the difference between the standard density statistic and the predicted density statistic corresponding to an image pair, wherein the image pair consists of a standard image block and a predicted image block whose image position corresponds to the standard image block, the standard image block is obtained by dividing a standard density map corresponding to a training sample image, the predicted image block is obtained by dividing a predicted density map, and the predicted density map is obtained by inputting the training sample image into the object density determination model to be trained for processing; and
    acquiring an object density map corresponding to the target image output by the trained object density determination model.
  13. The method according to claim 12, wherein the generating of the object density determination model comprises:
    acquiring a training sample image and a standard density map corresponding to the training sample image;
    inputting the training sample image into the object density determination model to be trained, to obtain a predicted density map output by the object density determination model;
    dividing the standard density map and the predicted density map respectively, to obtain a plurality of standard image blocks corresponding to the standard density map and a plurality of predicted image blocks corresponding to the predicted density map;
    performing statistics on the object density in each standard image block to obtain a standard density statistic corresponding to the standard image block, and performing statistics on the object density in each predicted image block to obtain a predicted density statistic corresponding to the predicted image block; and forming an image pair from a standard image block and a predicted image block whose image position corresponds to the standard image block, and adjusting parameters of the object density determination model to be trained based on the difference between the standard density statistic and the predicted density statistic corresponding to the image pair, to obtain the trained object density determination model.
  14. An apparatus for determining object density, the apparatus comprising:
    an image acquisition module, configured to acquire a training sample image and a standard density map corresponding to the training sample image;
    an image input module, configured to input the training sample image into an object density determination model to be trained, to obtain a predicted density map output by the object density determination model;
    an image division module, configured to divide the standard density map and the predicted density map respectively, to obtain a plurality of standard image blocks corresponding to the standard density map and a plurality of predicted image blocks corresponding to the predicted density map;
    a density statistics module, configured to perform statistics on the object density in each standard image block to obtain a standard density statistic corresponding to the standard image block, and to perform statistics on the object density in each predicted image block to obtain a predicted density statistic corresponding to the predicted image block; and
    a training module, configured to form an image pair from a standard image block and a predicted image block whose image position corresponds to the standard image block, and to adjust parameters of the object density determination model to be trained based on the difference between the standard density statistic and the predicted density statistic corresponding to the image pair, to obtain a trained object density determination model, the trained object density determination model being used to generate an object density map.
  15. An apparatus for determining object density, the apparatus comprising:
    an image acquisition module, configured to acquire a target image whose density is to be determined;
    a density determination module, configured to input the target image into a trained object density determination model and to determine the object density through the object density determination model, the object density determination model being obtained by adjusting parameters of the object density determination model to be trained based on the difference between the standard density statistic and the predicted density statistic corresponding to an image pair, wherein the image pair consists of a standard image block and a predicted image block whose image position corresponds to the standard image block, the standard image block is obtained by dividing a standard density map corresponding to a training sample image, the predicted image block is obtained by dividing a predicted density map, and the predicted density map is obtained by inputting the training sample image into the object density determination model to be trained for processing; and
    a density map acquisition module, configured to acquire an object density map corresponding to the target image output by the trained object density determination model.
  16. A computer device comprising a memory and one or more processors, the memory storing computer-readable instructions which, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1 to 11 or 12 to 13.
  17. One or more non-transitory computer-readable storage media storing computer-readable instructions which, when executed by one or more processors, cause the one or more processors to implement the method of any one of claims 1 to 11 or 12 to 13.
  18. A computer program product comprising computer-readable instructions which, when executed by a processor, implement the method of any one of claims 1 to 11 or 12 to 13.
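The block-pairing loss recited in the claims above — divide the standard density map and the predicted density map into position-matched blocks, take a density statistic per block, and compare each pair — can be sketched as follows. This is a minimal illustration only, not the patented implementation; the NumPy helpers, the block size, and the choice of the per-block sum (the object count inside each block) as the density statistic are assumptions made for the example.

```python
import numpy as np

def divide_into_blocks(density_map, block_size):
    """Split a 2-D density map into non-overlapping blocks, ordered by image position."""
    h, w = density_map.shape
    blocks = []
    for i in range(0, h - h % block_size, block_size):
        for j in range(0, w - w % block_size, block_size):
            blocks.append(density_map[i:i + block_size, j:j + block_size])
    return blocks

def pairwise_statistic_loss(standard_map, predicted_map, block_size=4):
    """Form image pairs from position-corresponding standard and predicted blocks,
    then average the absolute difference between their density statistics
    (here: the per-block sum, i.e. the object count inside each block)."""
    std_blocks = divide_into_blocks(standard_map, block_size)
    pred_blocks = divide_into_blocks(predicted_map, block_size)
    diffs = [abs(s.sum() - p.sum()) for s, p in zip(std_blocks, pred_blocks)]
    return float(np.mean(diffs))
```

A perfect prediction yields a loss of zero; because the statistic is the per-block count, a loss of this form penalizes local counting errors rather than only the global count, which is the apparent motivation for pairing blocks by image position.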
PCT/CN2022/086848 2021-04-26 2022-04-14 Object density determination method and apparatus, computer device and storage medium WO2022228142A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110453975.X 2021-04-26
CN202110453975.XA CN112862023B (en) 2021-04-26 2021-04-26 Object density determination method and device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
WO2022228142A1 true WO2022228142A1 (en) 2022-11-03

Family

ID=75992901

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/086848 WO2022228142A1 (en) 2021-04-26 2022-04-14 Object density determination method and apparatus, computer device and storage medium

Country Status (2)

Country Link
CN (1) CN112862023B (en)
WO (1) WO2022228142A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112862023B (en) * 2021-04-26 2021-07-16 腾讯科技(深圳)有限公司 Object density determination method and device, computer equipment and storage medium
CN114758243B (en) * 2022-04-29 2022-11-11 广东技术师范大学 Tea leaf picking method and device based on supplementary training and dual-class position prediction

Citations (8)

Publication number Priority date Publication date Assignee Title
CN106157307A (en) * 2016-06-27 2016-11-23 浙江工商大学 A kind of monocular image depth estimation method based on multiple dimensioned CNN and continuous CRF
CN110705698A (en) * 2019-10-16 2020-01-17 南京林业大学 Target counting depth network design method based on scale self-adaptive perception
CN111178276A (en) * 2019-12-30 2020-05-19 上海商汤智能科技有限公司 Image processing method, image processing apparatus, and computer-readable storage medium
CN111582252A (en) * 2020-06-16 2020-08-25 上海眼控科技股份有限公司 Crowd density map acquisition method and device, computer equipment and storage medium
US20200387718A1 (en) * 2019-06-10 2020-12-10 City University Of Hong Kong System and method for counting objects
CN112101195A (en) * 2020-09-14 2020-12-18 腾讯科技(深圳)有限公司 Crowd density estimation method and device, computer equipment and storage medium
CN112560829A (en) * 2021-02-25 2021-03-26 腾讯科技(深圳)有限公司 Crowd quantity determination method, device, equipment and storage medium
CN112862023A (en) * 2021-04-26 2021-05-28 腾讯科技(深圳)有限公司 Object density determination method and device, computer equipment and storage medium

Family Cites Families (3)

Publication number Priority date Publication date Assignee Title
CN108615027B (en) * 2018-05-11 2021-10-08 常州大学 Method for counting video crowd based on long-term and short-term memory-weighted neural network
CN111898578B (en) * 2020-08-10 2023-09-19 腾讯科技(深圳)有限公司 Crowd density acquisition method and device and electronic equipment
CN111985381B (en) * 2020-08-13 2022-09-09 杭州电子科技大学 Guidance area dense crowd counting method based on flexible convolution neural network


Non-Patent Citations (2)

Title
CAO, XINKUN: "Surveillance Video Analysis and Event Detection Based on Deep Learning", CHINESE SELECTED DOCTORAL DISSERTATIONS AND MASTER'S THESES FULL-TEXT DATABASES (MASTER), INFORMATION SCIENCE AND TECHNOLOGY, 15 September 2019 (2019-09-15), XP055980519, [retrieved on 20221111] *
WANG, LUYANG: "Image Crowd Counting Based on Convolutional Neural Network", INFORMATION & TECHNOLOGY, CHINA DOCTORAL DISSERTATIONS FULL-TEXT DATABASE, vol. 1, 15 January 2021 (2021-01-15), pages 43 - 59, XP055980523 *

Also Published As

Publication number Publication date
CN112862023A (en) 2021-05-28
CN112862023B (en) 2021-07-16

Similar Documents

Publication Publication Date Title
US10943145B2 (en) Image processing methods and apparatus, and electronic devices
AU2019213369B2 (en) Non-local memory network for semi-supervised video object segmentation
CN108876792B (en) Semantic segmentation method, device and system and storage medium
WO2022083536A1 (en) Neural network construction method and apparatus
US11354906B2 (en) Temporally distributed neural networks for video semantic segmentation
CN106203376B (en) Face key point positioning method and device
KR101880907B1 (en) Method for detecting abnormal session
Arietta et al. City forensics: Using visual elements to predict non-visual city attributes
WO2022228142A1 (en) Object density determination method and apparatus, computer device and storage medium
CN112639828A (en) Data processing method, method and equipment for training neural network model
AU2021354030B2 (en) Processing images using self-attention based neural networks
CN112052837A (en) Target detection method and device based on artificial intelligence
CN109033107A (en) Image search method and device, computer equipment and storage medium
CN112330684B (en) Object segmentation method and device, computer equipment and storage medium
CN111008631B (en) Image association method and device, storage medium and electronic device
JP7357176B1 (en) Night object detection, training method and device based on self-attention mechanism in frequency domain
WO2021249114A1 (en) Target tracking method and target tracking device
JP2023536025A (en) Target detection method, device and roadside equipment in road-vehicle cooperation
CN113011562A (en) Model training method and device
CN114360067A (en) Dynamic gesture recognition method based on deep learning
CN113807361A (en) Neural network, target detection method, neural network training method and related products
CN113554653A (en) Semantic segmentation method for long-tail distribution of point cloud data based on mutual information calibration
CN115294337B (en) Method for training semantic segmentation model, image semantic segmentation method and related device
CN114358186A (en) Data processing method and device and computer readable storage medium
CN113808151A (en) Method, device and equipment for detecting weak semantic contour of live image and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22794613

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 22794613

Country of ref document: EP

Kind code of ref document: A1