CN115100500A - Target detection method and device and readable storage medium - Google Patents


Info

Publication number
CN115100500A
Authority
CN
China
Prior art keywords
image
target detection
network
parameters
processed
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210684927.6A
Other languages
Chinese (zh)
Inventor
谢旭
凌明
杨作兴
杨敏
艾国
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen MicroBT Electronics Technology Co Ltd
Original Assignee
Shenzhen MicroBT Electronics Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen MicroBT Electronics Technology Co Ltd filed Critical Shenzhen MicroBT Electronics Technology Co Ltd
Priority to CN202210684927.6A priority Critical patent/CN115100500A/en
Publication of CN115100500A publication Critical patent/CN115100500A/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/73Deblurring; Sharpening
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/90Dynamic range modification of images or parts thereof
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/32Normalisation of the pattern dimensions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/36Applying a local operator, i.e. means to operate on image points situated in the vicinity of a given point; Non-linear local filtering operations, e.g. median filtering
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07Target detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Nonlinear Science (AREA)
  • Image Processing (AREA)

Abstract

The embodiment of the invention provides a target detection method, a target detection device and a readable storage medium. The method comprises the following steps: inputting an image to be processed into a parameter prediction network, and outputting tuning parameters through the parameter prediction network, wherein the tuning parameters comprise at least one of defogging parameters, white balance parameters, contrast parameters, hue parameters, sharpening parameters and correction parameters; performing image enhancement processing on the image to be processed according to the tuning parameters output by the parameter prediction network to obtain an optimized image; inputting the optimized image corresponding to the image to be processed into a target detection network for target detection, and outputting a target detection result through the target detection network, wherein the target detection network and the parameter prediction network are neural networks obtained by joint training in advance by using training data, and the training data comprises images meeting preset conditions. The embodiment of the invention can reduce the operation cost of the user and improve the accuracy of target detection.

Description

Target detection method and device and readable storage medium
Technical Field
The present invention relates to the field of image processing technologies, and in particular, to a target detection method, an apparatus, and a readable storage medium.
Background
Artificial Intelligence (AI) is a technology for studying and developing theories, methods, and application systems for simulating, extending, and expanding human intelligence. Computer Vision (CV) is a branch of artificial intelligence that attempts to build systems capable of obtaining information from images or multidimensional data.
Object detection is an important application of computer vision, such as detecting faces, vehicles or buildings in images. For high-quality images, an object detection model can usually detect the objects in them accurately. However, for low-quality images, such as images shot in severe weather or dim light, it is difficult to detect the targets accurately, which greatly reduces the accuracy of target detection. Under conditions such as severe weather or dim light, camera parameters must be adjusted manually to improve the quality of the captured image and thereby the accuracy of target detection, but this approach cannot adapt to different shooting scenes and places high professional demands on the photographer.
Disclosure of Invention
The embodiment of the invention provides a target detection method, a target detection device and a readable storage medium, which can reduce the operation cost of a user and improve the accuracy of target detection.
In a first aspect, an embodiment of the present invention discloses a target detection method, where the method includes:
inputting an image to be processed into a parameter prediction network, and outputting tuning parameters through the parameter prediction network, wherein the tuning parameters comprise at least one of defogging parameters, white balance parameters, contrast parameters, hue parameters, sharpening parameters and correction parameters;
performing image enhancement processing on the image to be processed according to the tuning parameters output by the parameter prediction network to obtain an optimized image corresponding to the image to be processed;
inputting the optimized image corresponding to the image to be processed into a target detection network for target detection, and outputting a target detection result through the target detection network, wherein the target detection network and the parameter prediction network are neural networks obtained by joint training in advance by using training data, and the training data comprises images meeting preset conditions.
In a second aspect, an embodiment of the present invention discloses an apparatus for detecting a target, where the apparatus includes:
the parameter prediction module is used for inputting the image to be processed into a parameter prediction network and outputting tuning parameters through the parameter prediction network, wherein the tuning parameters comprise at least one of defogging parameters, white balance parameters, contrast parameters, hue parameters, sharpening parameters and correction parameters;
the image enhancement module is used for carrying out image enhancement processing on the image to be processed according to the tuning parameters output by the parameter prediction network to obtain an optimized image corresponding to the image to be processed;
and the target detection module is used for inputting the optimized image corresponding to the image to be processed into a target detection network for target detection, and outputting a target detection result through the target detection network, wherein the target detection network and the parameter prediction network are neural networks obtained by joint training by utilizing training data in advance, and the training data comprise images meeting preset conditions.
In a third aspect, embodiments of the invention disclose a machine-readable medium having instructions stored thereon, which when executed by one or more processors of an apparatus, cause the apparatus to perform an object detection method as described in one or more of the preceding.
The embodiment of the invention has the following advantages:
In the embodiment of the invention, a parameter prediction network and a target detection network are jointly trained in advance using training data. The image to be processed is input into the trained parameter prediction network, which outputs tuning parameters; image enhancement processing is performed on the image to be processed using these tuning parameters to obtain a corresponding optimized image; and the optimized image is input into the trained target detection network, which outputs a target detection result. Because the embodiment of the invention automatically predicts the tuning parameters required by the image to be processed through the parameter prediction network, no special expertise is required of the person shooting the image, and the operation cost for the user is reduced. In addition, the parameter prediction network is a neural network trained on a large amount of training data that includes images under preset conditions (such as severe weather), so it can accurately predict the tuning parameters required by images under those conditions, and its adaptability to such images is enhanced. Compared with setting parameters manually, the embodiment of the invention can therefore improve the accuracy of the tuning parameters and, in turn, the accuracy of target detection. Moreover, the parameter prediction network and the target detection network of the embodiment of the invention can be obtained through end-to-end training and testing, which reduces the cost of manual debugging.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required to be used in the description of the embodiments of the present invention will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without inventive labor.
FIG. 1 is a flow chart of the steps of one embodiment of a method of target detection of the present invention;
FIG. 2 is a schematic diagram of an image enhancement module process flow in one example of the invention;
FIG. 3 is a schematic diagram of an end-to-end system architecture of the present invention;
FIG. 4 is a schematic structural diagram of an embodiment of an object detection apparatus according to the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The terms "first", "second" and the like in the description and in the claims of the present invention are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It will be appreciated that the data so used may be interchanged under appropriate circumstances, so that embodiments of the invention may be practiced in sequences other than those illustrated or described herein. Objects identified as "first", "second", etc. are not limited in number; for example, the first object may be one or more. Furthermore, the term "and/or" as used in the specification and claims to describe an association between objects means that three relationships may exist; for example, "A and/or B" may mean: A exists alone, A and B exist simultaneously, or B exists alone. The character "/" generally indicates that the former and latter associated objects are in an "or" relationship. The term "plurality" in the embodiments of the present invention means two or more, and other terms are to be understood similarly.
Referring to fig. 1, a flow chart of steps of an embodiment of a method of object detection of the present invention is shown, which may include the steps of:
step 101, inputting an image to be processed into a parameter prediction network, and outputting tuning parameters through the parameter prediction network, wherein the tuning parameters comprise at least one of defogging parameters, white balance parameters, contrast parameters, hue parameters, sharpening parameters and correction parameters;
step 102, performing image enhancement processing on the image to be processed according to the tuning parameters output by the parameter prediction network, to obtain an optimized image corresponding to the image to be processed;
step 103, inputting the optimized image corresponding to the image to be processed into a target detection network for target detection, and outputting a target detection result through the target detection network, wherein the target detection network and the parameter prediction network are neural networks obtained by joint training in advance by using training data, and the training data comprises images meeting preset conditions.
Target detection is the task of finding specific targets in an image and determining their positions in the image. The target detection method provided by the embodiment of the invention can be used in a face recognition scene to detect face targets in an image, or in an automatic driving scene to detect targets such as pedestrians, obstacles and traffic signals in an image.
The embodiment of the invention realizes the target detection method through an end-to-end system. And inputting the image to be processed into the end-to-end system, and outputting a target detection result of the image to be processed. The end-to-end system mainly comprises the following three parts: the system comprises a parameter prediction network, an image enhancement module and a target detection network.
The parameter prediction network receives the image to be processed and outputs the tuning parameters corresponding to it. The tuning parameters are parameters, determined by the parameter prediction network from the global information of the image to be processed, that are used to optimize the image to be processed and improve its quality. The tuning parameters may include, but are not limited to, at least one of a defogging parameter, a white balance parameter, a contrast parameter, a hue parameter, a sharpening parameter, and a correction parameter. The image enhancement module receives the image to be processed together with the tuning parameters output by the parameter prediction network, performs image enhancement processing on the image according to the tuning parameters, and outputs an optimized image. That is, the image enhancement module optimizes the image to be processed according to the tuning parameters provided by the parameter prediction network. For example, if the brightness of the image to be processed is too low, it can be enhanced according to the correction parameter output by the parameter prediction network; similarly, if the image to be processed was shot on a foggy day, it can be defogged according to the defogging parameter output by the parameter prediction network. By optimizing the image to be processed, the image enhancement module improves image quality and thereby the accuracy of target detection. It should be noted that the type and number of tuning parameters output by the parameter prediction network may be set according to actual needs.
The target detection network is used for receiving the optimized image output by the image enhancement module, carrying out target detection on the received optimized image and outputting a target detection result.
The embodiment of the invention does not limit the source of the image to be processed. For example, the image to be processed may be an image in a traffic monitoring video, or the image to be processed may be an image in a video recorded by a mobile phone of a user, and the like.
The target detection network and the parameter prediction network are neural networks obtained by joint training in advance by utilizing training data, and the training data comprises images meeting preset conditions. The image of the preset condition refers to an image with poor shooting condition, for example, the preset condition may include, but is not limited to, any one or more of the following shooting conditions: rainy day, snowy day, cloudy day, night, foggy day, dim light, strong light, etc.
The embodiment of the invention obtains the parameter prediction network and the target detection network by utilizing training data joint training in advance. Because the training data includes images meeting preset conditions, such as images shot in rainy days, snowy days, cloudy days, nights, foggy days, dim lights, strong lights and the like, the parameter prediction network and the target detection network are jointly trained based on the training data, the trained parameter prediction network can accurately predict tuning parameters corresponding to the images under the preset conditions, and the trained target detection network can accurately detect targets in the images under the preset conditions.
In an optional embodiment of the present invention, performing image enhancement processing on the image to be processed according to the tuning parameters output by the parameter prediction network in step 102 may include:
step S11, inputting the image to be processed and the tuning parameters output by the parameter prediction network into an image enhancement module, wherein the image enhancement module comprises filters corresponding to the tuning parameters output by the parameter prediction network;
and step S12, sequentially performing image enhancement processing on the images to be processed by using the corresponding tuning parameters through each filter.
In an embodiment of the invention, the image enhancement module may include several differentiable filters, each filter being used to perform some kind of optimization on the image. The tuning parameters output by the parameter prediction network have a one-to-one correspondence relationship with the filters included in the image enhancement module, and the parameter prediction network can input the corresponding tuning parameters into the corresponding filters in the image enhancement module, so that each filter can sequentially perform image enhancement processing on the image to be processed by using the corresponding tuning parameters.
Further, the image enhancement module may include, but is not limited to, at least one of a defogging filter, a white balance filter, a contrast filter, a hue filter, a sharpening filter, and a correction filter. The parameter prediction network may input the defogging parameters into the defogging filter, the white balance parameters into the white balance filter, the contrast parameters into the contrast filter, the hue parameters into the hue filter, the sharpening parameters into the sharpening filter, and the correction parameters into the correction filter.
The defogging filter can be used for defogging the received image according to the received defogging parameters; the white balance filter is used for carrying out white balance adjustment processing on the received image according to the received white balance parameters; the contrast filter is used for carrying out contrast adjustment processing on the received image according to the received contrast parameters; the tone filter is used for carrying out tone adjustment processing on the received image according to the received tone parameters; the sharpening filter is used for sharpening the received image according to the received sharpening parameter; the correction filter is used for adjusting the brightness of the received image according to the received correction parameters.
Referring to fig. 2, a schematic diagram of an image enhancement process flow in one example of the invention is shown. The flow diagram shown in fig. 2 includes a parameter prediction network 201 and an image enhancement module 202, and the image enhancement module 202 sequentially includes a correction filter, a white balance filter, a sharpening filter, a contrast filter, and a defogging filter. The filters shown in fig. 2 are connected in sequence, and the output of the previous filter is used as the input of the next filter, thereby realizing the gradual optimization of the image to be processed. As shown in fig. 2, the input of the correction filter is the image to be processed, the image to be processed is input into the white balance filter after being processed by the correction filter, is input into the sharpening filter after being processed by the white balance filter, is input into the contrast filter after being processed by the sharpening filter, is input into the defogging filter after being processed by the contrast filter, and is output to obtain the optimized image after being processed by the defogging filter.
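As an illustrative sketch (not the patent's implementation), the sequential filter chain described above can be expressed as follows. Each filter consumes the previous filter's output together with its own tuning parameter; the stand-in filter bodies and parameter values here are hypothetical, and in the real pipeline the parameters would come from the parameter prediction network:

```python
import numpy as np

def apply_filter_chain(image: np.ndarray, filters) -> np.ndarray:
    """Run the image through an ordered list of (filter_fn, tuning_param) pairs.

    The output of each filter is the input of the next, mirroring the
    chain in FIG. 2 (correction -> white balance -> sharpening -> ...).
    """
    for filter_fn, param in filters:
        image = filter_fn(image, param)
    return image

# Simplified stand-in filters with hypothetical tuning parameters.
chain = [
    (lambda img, g: img ** g, 0.8),   # correction (Gamma) filter
    (lambda img, w: img * w, 1.10),   # white balance (single weight for brevity)
    (lambda img, a: a * img, 1.20),   # contrast filter
]

optimized = apply_filter_chain(np.array([1.0, 0.25]), chain)
```

Because the filters are chained, reordering the list changes the result, which is why the connection order is an implementation choice.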
It should be noted that the connection sequence of the filters shown in fig. 2 is only an application example of the present invention, and the connection sequence of the filters included in the image enhancement module is not limited in the embodiment of the present invention, that is, the embodiment of the present invention does not limit the sequence of sequentially performing image enhancement processing on the image to be processed by using corresponding tuning parameters through each filter.
The correction filter may perform Gamma correction on the image by:
f(I) = I^γ    (1)
Gamma correction is an important nonlinear transformation that applies an exponential transform to the gray values of the input image to correct its brightness deviation, and is typically used to expand the details of dark tones. In equation (1), f(I) is the image output by the correction filter, I is the image input to the correction filter, and γ is the Gamma value used for the correction. In the embodiment of the present invention, the correction parameter output by the parameter prediction network is the Gamma value γ used for performing Gamma correction on the image to be processed. Taking FIG. 2 as an example, I is the image to be processed, and the parameter prediction network outputs the correction parameter γ for it. The image to be processed I and the correction parameter γ are input into the correction filter, which performs Gamma correction on I according to equation (1). By choosing γ, the correction filter can raise or lower the luminance of the image to be processed through equation (1).
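A minimal sketch of equation (1) in NumPy, assuming pixel values normalised to [0, 1] (the function and variable names are illustrative, not from the patent); under that convention, γ < 1 lifts dark tones and γ > 1 compresses them:

```python
import numpy as np

def gamma_correct(image: np.ndarray, gamma: float) -> np.ndarray:
    """Apply f(I) = I ** gamma (equation (1)) to an image in [0, 1]."""
    return np.clip(image, 0.0, 1.0) ** gamma

img = np.array([[0.25, 0.5],
                [0.75, 1.0]])
brighter = gamma_correct(img, 0.5)  # gamma < 1 lifts dark tones
darker = gamma_correct(img, 2.0)    # gamma > 1 compresses them
```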
The white balance filter may adjust the white balance of the image by:
f(I_r) = W_r · I_r    (2)
f(I_g) = W_g · I_g    (3)
f(I_b) = W_b · I_b    (4)
White balance describes how accurately a neutral white is reproduced when the three primary colors red, green and blue are mixed in a display. In equations (2), (3) and (4), f(I_r), f(I_g) and f(I_b) are the values of the R (red), G (green) and B (blue) channels of the image output by the white balance filter; I_r, I_g and I_b are the values of the R, G and B channels of the image input to the white balance filter; and W_r, W_g and W_b are the weights applied to the three RGB channels. In the embodiment of the present invention, the white balance parameters output by the parameter prediction network are the values of the weights W_r, W_g and W_b used for adjusting the white balance of the image to be processed. Taking FIG. 2 as an example, I is the image to be processed, the parameter prediction network outputs W_r, W_g and W_b for it, and the image input to the white balance filter is the image output by the correction filter. The image output by the correction filter and the white balance parameters (the values of W_r, W_g and W_b) are input into the white balance filter, which adjusts the white balance of the image output by the correction filter according to equations (2), (3) and (4).
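A sketch of equations (2)–(4), assuming an (H, W, 3) array in R, G, B channel order (the names and weight values are illustrative assumptions):

```python
import numpy as np

def white_balance(image: np.ndarray, w_r: float, w_g: float, w_b: float) -> np.ndarray:
    """Scale the R, G and B channels by their weights (equations (2)-(4)).

    `image` is assumed to have shape (H, W, 3) with channel order R, G, B.
    """
    weights = np.array([w_r, w_g, w_b])
    return image * weights  # broadcasts over the last (channel) axis

img = np.full((2, 2, 3), 0.5)
balanced = white_balance(img, 1.2, 1.0, 0.8)  # hypothetical weights: warm the image slightly
```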
The sharpening filter may sharpen the image by:
F(x,λ)=I(x)+λ(I(x)-Gau(I(x))) (5)
Sharpening improves the clarity of image edges and highlights the detail information of the image. In equation (5), F(x, λ) is the image output by the sharpening filter and I(x) is the image input to it; taking FIG. 2 as an example, the image input to the sharpening filter is the image output by the white balance filter. Gau(I(x)) denotes Gaussian filtering of the image input to the sharpening filter, and λ is a positive scale factor that controls the degree of sharpening. In the embodiment of the present invention, the sharpening parameter output by the parameter prediction network is the value of the positive scale factor λ used for sharpening the image to be processed. Taking FIG. 2 as an example, I is the image to be processed, and the parameter prediction network outputs the value of λ for it. The image output by the white balance filter and the sharpening parameter (the value of λ) are input into the sharpening filter, which sharpens the image output by the white balance filter according to equation (5).
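Equation (5) is the classic unsharp-masking formula. A hedged sketch follows, using a small 3×3 Gaussian kernel as a stand-in for Gau(I(x)) (the kernel size and function names are illustrative assumptions):

```python
import numpy as np

def gaussian_blur3(image: np.ndarray) -> np.ndarray:
    """3x3 Gaussian blur with reflected borders (stand-in for Gau(I(x)))."""
    kernel = np.array([[1.0, 2.0, 1.0],
                       [2.0, 4.0, 2.0],
                       [1.0, 2.0, 1.0]]) / 16.0
    padded = np.pad(image, 1, mode="reflect")
    h, w = image.shape
    out = np.zeros((h, w))
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out

def sharpen(image: np.ndarray, lam: float) -> np.ndarray:
    """Unsharp masking: F(x, lambda) = I(x) + lambda * (I(x) - Gau(I(x)))."""
    return image + lam * (image - gaussian_blur3(image))
```

On a perfectly flat image the blurred copy equals the original, so sharpening leaves it unchanged; edges, by contrast, are amplified in proportion to λ.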
The contrast filter may adjust the contrast of the image by:
F(x,α)=αI(x) (6)
Contrast measures the difference in brightness between the brightest white and the darkest black of the bright and dark regions in an image: the larger the difference, the greater the contrast. In equation (6), F(x, α) is the image output by the contrast filter and I(x) is the image input to it; taking FIG. 2 as an example, the image input to the contrast filter is the image output by the sharpening filter. α is a contrast adjustment factor, and the contrast of the image can be adjusted by adjusting α. In the embodiment of the present invention, the contrast parameter output by the parameter prediction network is the value of the contrast adjustment factor α used for adjusting the contrast of the image to be processed. Taking FIG. 2 as an example, I is the image to be processed, and the parameter prediction network outputs the value of α for it. The image output by the sharpening filter and the contrast parameter (the value of α) are input into the contrast filter, which adjusts the contrast of the image output by the sharpening filter according to equation (6).
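A one-line sketch of equation (6), assuming pixel values in [0, 1]; the clipping step is an added assumption to keep the result in range and is not stated in the patent:

```python
import numpy as np

def adjust_contrast(image: np.ndarray, alpha: float) -> np.ndarray:
    """F(x, alpha) = alpha * I(x) (equation (6)); clipped to stay in [0, 1]."""
    return np.clip(alpha * image, 0.0, 1.0)

boosted = adjust_contrast(np.array([0.2, 0.5, 0.9]), 1.5)
```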
The defogging filter may be used to defog an image, and the image input to the defogging filter can be modeled as follows:
I(x)=J(x)t(x)+A(1-t(x)) (7)
In equation (7), I(x) represents the image input to the defogging filter (the foggy image), J(x) represents the normal (fog-free) image recovered from I(x) by the defogging process, A represents the atmospheric brightness of the image input to the defogging filter, and t(x) represents the medium transmission map. The key to the defogging process is the estimation of t(x).
From equation (7) above, it can be derived that t(x) can be expressed approximately as follows:
t(x) = 1 − min_C ( min_y ( I^C(y) / A^C ) )    (8)
in the above equation (8), C represents R, G, B three channels of the image input to the defogging filter, and the function t (x) processes these three channels in turn. When each channel is processed, the processing can be carried out according to the area, and the size of the area can be set according to the requirement. For example, assume that the image size of the input defogging filter is 416 × 416 pixels, and the region size per process is 3 × 3 pixels. For an image input to the defogging filter, y represents the pixel value of the currently processed region, min c Is the minimum pixel value, min, of the current channel y Is the minimum pixel value of the current region, I C (y) is the pixel value of the current region in the current channel, A C Is the brightness of the current channelAnd (4) degree.
Further, according to the above formula (8), a parameter ω can be introduced to control the degree of defogging as follows:
t(x, ω) = 1 − ω · min_C ( min_{y∈Ω(x)} ( I_C(y) / A_C ) )    (9)
In the above equation (9), t(x, ω) is the transmission map used by the defogging filter; taking fig. 2 as an example, the image input to the defogging filter is the image output by the contrast filter. In the embodiment of the present invention, the defogging parameters output by the parameter prediction network include the value of a defogging control parameter ω used for defogging the image to be processed. Taking fig. 2 as an example, I is the image to be processed, and the defogging parameter output by the parameter prediction network according to the image to be processed I is the value of the defogging control parameter ω. The image output by the contrast filter and the defogging parameter (the value of ω) output by the parameter prediction network are input into the defogging filter, and the defogging filter defogs the image output by the contrast filter according to the value of ω in accordance with equation (9). In the example shown in fig. 2, the image output by the defogging filter is the optimized image corresponding to the image I to be processed, that is, the image to be input into the target detection network.
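Equations (7)-(9) can be sketched in a few lines of numpy: estimate the transmission map over 3 × 3 regions, then invert equation (7) to recover the defogged image (the clipping floor t_min is an added numerical safeguard, not part of the patent):

```python
import numpy as np

def transmission_map(image, A, omega, patch=3):
    """t(x, omega) = 1 - omega * min over channels C and a local patch
    of I_C(y) / A_C (equations (8)-(9)); image is HxWx3 in [0, 1],
    A is the atmospheric brightness (scalar or per channel)."""
    h, w, _ = image.shape
    norm = image / np.asarray(A, dtype=float)   # I_C(y) / A_C
    dark = norm.min(axis=2)                     # min over R, G, B
    r = patch // 2
    padded = np.pad(dark, r, mode='edge')
    t = np.empty((h, w))
    for i in range(h):                          # min over each 3x3 region
        for j in range(w):
            t[i, j] = padded[i:i + patch, j:j + patch].min()
    return 1.0 - omega * t

def defog(image, A, omega, t_min=0.1):
    """Invert equation (7): J(x) = (I(x) - A) / max(t, t_min) + A."""
    t = np.clip(transmission_map(image, A, omega), t_min, 1.0)
    return (image - A) / t[..., None] + A
```

A larger ω removes fog more aggressively (t approaches 1 − ω in hazy regions), which is exactly the degree-of-defogging control described above.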
In an optional embodiment of the present invention, before the step 101 of inputting the image to be processed into the parameter prediction network, the method may further include: adjusting the original image to obtain a to-be-processed image of a first size and a to-be-processed image of a second size; the step 101 of inputting the image to be processed into the parameter prediction network may include: inputting the to-be-processed image of the first size into the parameter prediction network; and in step 102, performing image enhancement processing on the image to be processed according to the tuning parameters output by the parameter prediction network may include: performing image enhancement processing on the to-be-processed image of the second size according to the tuning parameters output by the parameter prediction network.
In the embodiment of the present invention, the image to be processed may be an image obtained by preprocessing an original image. The pre-processing may include resizing the original image. Further, the embodiment of the invention adjusts the original image to obtain the image to be processed with the first size and the image to be processed with the second size respectively.
The original image is an image which needs to be subjected to target detection, such as an image in a traffic monitoring video, or an image in a video recorded by a mobile phone of a user, and the like.
The embodiment of the invention inputs the to-be-processed image of the first size into the parameter prediction network to predict the tuning parameters. In order to improve the efficiency of target detection, the parameter prediction network may be a small neural network; in the embodiment of the present invention, the original image is adjusted to the first size (a smaller size) before being input into the parameter prediction network, so as to reduce the amount of calculation of the parameter prediction network, thereby improving the efficiency of parameter prediction and, in turn, the efficiency of target detection. Illustratively, an image preprocessing module may be provided, and the original image is input into the module for preprocessing, which may include, but is not limited to, resizing the original image (e.g., to the first size and the second size, respectively), removing noise in the original image, and the like. The to-be-processed image of the first size output by the image preprocessing module is input into the parameter prediction network, and the to-be-processed image of the second size output by the image preprocessing module is input into the image enhancement module. It should be noted that the method adopted by the embodiment of the present invention to adjust the size of the original image is not limited.
In a specific implementation, the larger the first size and the second size, the more complete the original information retained in the image, and the more accurate the target detection result. Processing the original image directly yields the most accurate target detection result, but the calculation efficiency is low, and feasibility and real-time performance are difficult to guarantee. Therefore, the first size and the second size may be chosen as the minimum sizes capable of securing the target detection effect. Illustratively, the first size may be 256 × 256 pixels, and the second size may be 416 × 416 pixels. The first size is chosen smaller so as to reduce the amount of calculation of the parameter prediction network and improve target detection efficiency, and the second size is chosen larger than the first size so as to ensure the image enhancement effect and, in turn, the target detection accuracy.
It should be noted that, in the embodiment of the present invention, the first size and the second size are not limited. For example, the first size may be smaller than the second size. Of course, the first size may also be greater than or equal to the second size.
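The two-size preprocessing described above can be sketched as follows (nearest-neighbour resizing is used for brevity; the patent does not fix a resizing method):

```python
import numpy as np

def nearest_resize(image, size):
    """Nearest-neighbour resize of an HxW(xC) image to size x size."""
    h, w = image.shape[:2]
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return image[rows][:, cols]

def preprocess(original, first_size=256, second_size=416):
    """Produce the two working copies: the first-size image feeds the
    parameter prediction network, the second-size image feeds the
    image enhancement module."""
    return nearest_resize(original, first_size), nearest_resize(original, second_size)
```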
Referring to fig. 3, a schematic diagram of an end-to-end system architecture of the present invention is shown. The system architecture shown in fig. 3 includes a parameter prediction network 301, an image enhancement module 302, and an object detection network 303.
In the embodiment of the present invention, the original image is adjusted to obtain a to-be-processed image of a first size (e.g., 256 × 256 pixels) and a to-be-processed image of a second size (e.g., 416 × 416 pixels), and the to-be-processed image of the first size is input into the parameter prediction network to predict the tuning parameters (the parameters required by each filter in the image enhancement module). The to-be-processed image of the second size and the tuning parameters output by the parameter prediction network are input into the image enhancement module; the image enhancement module performs image enhancement processing on the to-be-processed image of the second size step by step through each filter according to the received tuning parameters, so as to eliminate influences such as severe weather while retaining more key information, obtaining an optimized image; and the optimized image is input into the target detection network for target detection to obtain a target detection result.
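The data flow of fig. 3 can be sketched with placeholder components (the stubs stand in for the real neural networks and only illustrate the wiring between modules 301-303):

```python
def run_pipeline(original, resize, predict_params, enhance, detect):
    """End-to-end flow of fig. 3: resize once for the parameter
    prediction network, once for the image enhancement module, then
    run detection on the optimized image."""
    small = resize(original, 256)          # first size, for prediction
    large = resize(original, 416)          # second size, for enhancement
    params = predict_params(small)         # parameter prediction network (301)
    optimized = enhance(large, params)     # image enhancement module (302)
    return detect(optimized)               # target detection network (303)
```

Any concrete resize, prediction, enhancement, and detection implementations can be dropped into these slots without changing the overall flow.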
In an optional embodiment of the present invention, before the step 101 of inputting the image to be processed into the parameter prediction network, the method may further include:
step S21, training data is obtained;
step S22, labeling the targets contained in the training data to obtain labeling results;
step S23, inputting the training data into an initial parameter prediction network, and outputting tuning parameters through the initial parameter prediction network;
step S24, performing image enhancement processing on the training data according to the tuning parameters output by the initial parameter prediction network, to obtain an optimized image corresponding to the training data;
step S25, inputting the optimized image corresponding to the training data into an initial target detection network for target detection, and outputting a target detection result through the initial target detection network;
step S26, calculating a joint loss value according to the difference between the target detection result output by the initial target detection network and the labeling result, and performing iterative optimization on the parameters of the initial parameter prediction network and the parameters of the initial target detection network until the joint loss value meets an iteration stop condition to obtain a trained parameter prediction network and a trained target detection network.
The embodiment of the invention utilizes training data to jointly train the parameter prediction network and the target detection network in advance. Specifically, firstly, training data is obtained, and targets included in the training data are labeled to obtain a labeling result. The training data may include images under normal conditions (clear, high-quality photographing conditions) and images under the preset conditions. The annotation result may include whether the image includes the target and a position of the included target.
Then, the training data is input into the initial parameter prediction network in sequence for iterative training. For example, after initializing the parameter prediction network and the target detection network, a first image in the training data is input into the initial parameter prediction network, which outputs the tuning parameters of the first image; the initial parameter prediction network may input the tuning parameters of the first image to the image enhancement module to set the parameters of each filter in the image enhancement module. The image enhancement module performs image enhancement processing on the first image according to the tuning parameters output by the initial parameter prediction network, to obtain an optimized image corresponding to the first image. The image enhancement module then inputs the optimized image corresponding to the first image into the initial target detection network for target detection, and the target detection result of the first image is output through the initial target detection network. Finally, a joint loss value is calculated according to the difference between the target detection result of the first image output by the initial target detection network and the labeling result of the first image, and the parameters of the initial parameter prediction network and the parameters of the initial target detection network are optimized. If the joint loss value does not meet the iteration stop condition, the next round of training is entered: a second image in the training data is input into the initial parameter prediction network (whose parameters, together with those of the initial target detection network, have now been optimized once), and a second round of optimization is executed, and so on until the joint loss value meets the iteration stop condition, obtaining the trained parameter prediction network and the trained target detection network. The joint loss value satisfying the iteration stop condition may include: the joint loss value is less than a preset threshold.
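The iterative loop of steps S23-S26 can be sketched as follows; forward, loss_fn, and update are placeholders for the real prediction-enhancement-detection pipeline, the joint loss, and the optimizer:

```python
def joint_train(images, labels, forward, loss_fn, update,
                threshold=0.05, max_iters=1000):
    """Cycle through the training data, compute the joint loss per
    image, update both networks' parameters, and stop once the joint
    loss drops below the threshold (the iteration stop condition)."""
    loss = float('inf')
    for it in range(max_iters):
        image = images[it % len(images)]
        label = labels[it % len(images)]
        detection = forward(image)        # S23-S25: predict, enhance, detect
        loss = loss_fn(detection, label)  # S26: joint loss vs. labeling result
        if loss < threshold:              # iteration stop condition
            return loss, it
        update(loss)                      # optimize both networks' parameters
    return loss, max_iters
```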
In an optional embodiment of the present invention, the joint loss value may be obtained by performing weighted calculation on the loss value of the initial parameter prediction network and the loss value of the initial target detection network.
In a specific implementation, the loss value of the initial target detection network may be directly used as the joint loss value, or the loss value of the initial parameter prediction network may be further calculated, and the joint loss value is obtained by weighted calculation according to the loss value of the initial parameter prediction network and the loss value of the initial target detection network.
When training data is constructed for jointly training the parameter prediction network and the target detection network, the tuning parameters corresponding to the training data may also be labeled. Then, in the iterative training process, the loss value of the initial parameter prediction network may be calculated according to the difference between the tuning parameters output by the initial parameter prediction network and the labeled tuning parameters, and the joint loss value is obtained by weighted calculation from the loss value of the initial parameter prediction network and the loss value of the initial target detection network.
In one example, assuming that in a certain iteration the loss value of the initial parameter prediction network is a1 with weight b1, and the loss value of the initial target detection network is a2 with weight b2, the joint loss value may be: a1 × b1 + a2 × b2. It should be noted that, in the embodiment of the present invention, the specific values of the weights b1 and b2 are not limited; by setting b1 and b2, the degree of influence of the parameter prediction network and the target detection network on the whole end-to-end system during joint training can be adjusted. Illustratively, the weight b1 is set to 0.1 and the weight b2 is set to 1.
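The weighted joint loss in this example can be written directly as:

```python
def joint_loss(pred_loss, det_loss, b1=0.1, b2=1.0):
    """Joint loss a1 * b1 + a2 * b2, with the example weights
    b1 = 0.1 (parameter prediction network) and b2 = 1 (target
    detection network) as defaults."""
    return pred_loss * b1 + det_loss * b2
```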
The parameter prediction network and the target detection network can be obtained by performing supervised training on an existing neural network according to a large amount of training data and a machine learning method. It should be noted that the structures and training methods of the parameter prediction network and the target detection network are not limited in the embodiments of the present disclosure. The parameter prediction network and the target detection network can incorporate various neural networks, including, but not limited to, at least one of the following, or a combination, superposition, or nesting of at least two of them: CNN (Convolutional Neural Network), LSTM (Long Short-Term Memory) network, RNN (Recurrent Neural Network), attention neural network, and the like.
In an optional embodiment of the present invention, the parameter prediction network may include a first number of convolutional layers for performing convolution operation on an input image to be processed to output a feature map, and a second number of fully-connected layers for performing fully-connected operation on the feature map output by the convolutional layers to output tuning parameters.
The specific numerical values of the first number and the second number are not limited in the embodiment of the present invention. Illustratively, the first number is 5 and the second number is 2, and the parameter prediction network may include 5 convolutional layers and 2 fully-connected layers. Further, in a specific implementation, the number of channels of each convolutional layer and the size of each convolution kernel may also be set according to actual needs. Illustratively, the number of channels of the first convolutional layer may be set to 16, the number of channels of each of the second to fifth convolutional layers may be set to 32, the size of each convolution kernel may be set to 3 × 3, and the convolution stride may be set to 2. The input image to be processed is convolved through the 5 convolutional layers to obtain a feature map, and the feature map is passed through the two fully-connected layers to obtain the tuning parameters to be input into the image enhancement module.
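Under the example configuration above (kernel 3 × 3, stride 2; padding 1 is an assumption, since the patent does not state it), the feature-map sizes through the five convolutional layers can be computed as:

```python
def conv_out(size, kernel=3, stride=2, pad=1):
    """Spatial output size of one convolution layer."""
    return (size + 2 * pad - kernel) // stride + 1

def prediction_network_shapes(size=256, layers=5):
    """Feature-map sizes through the 5 stride-2 convolutions of the
    parameter prediction network (padding 1 assumed)."""
    sizes = [size]
    for _ in range(layers):
        sizes.append(conv_out(sizes[-1]))
    return sizes
```

With a 256 × 256 input, each stride-2 layer halves the spatial size, so the fully-connected layers operate on an 8 × 8 feature map.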
The parameter prediction network predicts the parameters required for optimizing the image to be processed according to the global information of the image to be processed and outputs the tuning parameters; after the image to be processed undergoes image enhancement processing using these tuning parameters, the target detection result is improved. Compared with manually adjusting camera parameters to improve the quality of the captured image, the embodiment of the invention can automatically predict the parameters required for optimizing image quality through the trained parameter prediction network, places no high professional requirements on the photographer, and can reduce the operation cost for the user. In addition, the parameter prediction network is a neural network trained on a large amount of training data, and the training data includes images under the preset conditions (such as severe weather); therefore, for images under the preset conditions, the parameter prediction network can accurately predict the tuning parameters under which the optimized image achieves the best target detection effect, and the accuracy of the tuning parameters can be improved relative to manual parameter adjustment.
The embodiment of the invention does not limit the type of the target detection network. In an alternative embodiment of the invention, the target detection network may comprise a YOLOX network.
The embodiment of the invention may adopt a YOLOX-based target detection network. YOLOX is an anchor-free target detection network, which can reduce the amount of post-processing computation, and a target detection network based on YOLOX-L, YOLOX-M or YOLOX-T may be selected according to the actually deployed device. Of course, the target detection networks listed above are only exemplary; in a specific implementation, the parameter prediction network and the image enhancement module of the embodiment of the present invention may be combined with various types of target detection networks for end-to-end training and detection.
Further, since the number of images under the preset conditions (such as images shot in severe weather) is usually small, and the amount of training data has an important influence on the accuracy of a neural network model, in the embodiment of the present invention, when the training data is constructed, images under the preset conditions are synthesized from images under normal conditions, so as to obtain more images under the preset conditions. The normal conditions refer to non-preset conditions, such as shooting conditions under which clear, high-quality images can be captured. The embodiment of the invention processes images shot under normal conditions to obtain images under the preset conditions, for example, modifying images shot under normal daytime illumination into images shot in dim light at night, modifying images shot on a sunny day into foggy images, and the like.
In an optional embodiment of the present invention, before the step 101 of inputting the image to be processed into the parameter prediction network, the method may further include:
step S31, acquiring a non-foggy day image;
step S32, processing the non-foggy day images through a first control parameter and a second control parameter to generate foggy day images with different illumination intensities and different foggy day grades, wherein the first control parameter is used for controlling the illumination intensity of the generated foggy day images, and the second control parameter is used for controlling the foggy day grades of the generated foggy day images;
and step S33, constructing training data by using the foggy day images.
Fog is a common natural phenomenon; visibility is low in foggy weather, which can greatly impair the detection and recognition of road vehicle information. Furthermore, because different illumination intensities and different foggy day grades affect image clarity to different degrees, foggy day images with different illumination intensities and different foggy day grades are generated when making foggy day images, so as to improve the accuracy of the jointly trained parameter prediction network and target detection network.
The embodiment of the invention can generate foggy day images with different illumination intensities and different foggy day grades through the following formula:
I(x) = (J(x)·t(x) + A·(1 − t(x)))^γ    (10)
In the above formula (10), I(x) is the synthesized foggy day image, J(x) is an image shot in normal weather (a non-foggy image), A is the brightness of J(x), γ is a first control parameter for controlling the illumination intensity of the generated foggy day image, and t(x) is the medium transmission map, defined as follows:
t(x) = e^(−β·d(x))    (11)
in the above formula (11), β is an atmospheric scattering coefficient, and d (x) is defined as follows:
d(x) = −0.04 · ρ + √(max(row, col))    (12)
In the above equation (12), ρ represents the Euclidean distance from the current pixel to the center pixel, and row and col represent the number of rows and columns of the image, respectively, in units of pixels. Illustratively, in the embodiment of the invention, A is 0.5 and β = 0.01 × i + 0.05, where i is the second control parameter taking values from 0 to 9, corresponding to ten different foggy day grades.
In the above equation (10), γ may be used to adjust the brightness of a generated foggy day image of a given foggy day grade so as to simulate foggy day images under different illumination intensities: if γ is greater than 1, the generated foggy day image becomes brighter, and if γ is less than 1, it becomes darker.
When generating foggy day images, the values of γ (the first control parameter) and i (the second control parameter) may be randomly selected to process the original (non-foggy) image, thereby generating foggy day images with different illumination intensities and different foggy day grades, where i controls the foggy day grade and γ controls the illumination intensity.
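Equations (10)-(12) can be combined into a small numpy routine for synthesizing fog (the −0.04 coefficient in d(x) follows the common image-adaptive formulation and is an assumption here; J is treated as a single-channel image in [0, 1] for brevity):

```python
import numpy as np

def make_foggy(J, i, gamma, A=0.5):
    """Synthesize a foggy day image per equations (10)-(12):
    beta = 0.01 * i + 0.05 selects one of ten fog levels (i in 0..9),
    gamma adjusts the illumination intensity."""
    rows, cols = J.shape
    beta = 0.01 * i + 0.05
    yy, xx = np.mgrid[0:rows, 0:cols]
    rho = np.sqrt((yy - rows / 2) ** 2 + (xx - cols / 2) ** 2)  # distance to center
    d = -0.04 * rho + np.sqrt(max(rows, cols))                  # equation (12), assumed coefficient
    t = np.exp(-beta * d)                                       # equation (11)
    return (J * t + A * (1.0 - t)) ** gamma                     # equation (10)
```

Sampling i and γ at random, as described above, yields foggy images spanning the ten fog grades and a range of illumination intensities.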
In summary, in the embodiments of the present invention, training data is used in advance for joint training to obtain the parameter prediction network and the target detection network; the image to be processed is input into the trained parameter prediction network, tuning parameters can be output through the parameter prediction network, image enhancement processing is performed on the image to be processed using the tuning parameters to obtain an optimized image corresponding to the image to be processed, and the optimized image is input into the trained target detection network so that a target detection result can be output. The embodiment of the invention can automatically predict the tuning parameters required by the image to be processed through the parameter prediction network, places no high professional requirements on the photographer when shooting the image, and can reduce the operation cost for the user. In addition, the parameter prediction network is a neural network trained on a large amount of training data, and the training data includes images under preset conditions (such as severe weather); therefore, the parameter prediction network of the embodiment of the invention can accurately predict the tuning parameters required by images under the preset conditions (such as severe weather), and its adaptability to such images can be enhanced. Compared with manual parameter setting, the embodiment of the invention can improve the accuracy of the tuning parameters, and further improve the accuracy of target detection. Moreover, the parameter prediction network and the target detection network of the embodiment of the invention can be obtained through end-to-end training and testing, which can reduce the cost of manual debugging.
It should be noted that, for simplicity of description, the method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the embodiments are not limited by the order of acts, as some steps may occur in other orders or concurrently depending on the embodiments. Further, those skilled in the art will appreciate that the embodiments described in the specification are presently preferred and that no particular act is required to implement the invention.
Referring to fig. 4, a block diagram of an embodiment of an object detection apparatus of the present invention is shown, and the apparatus may include:
the parameter prediction module 401 is configured to input an image to be processed into a parameter prediction network, and output a tuning parameter through the parameter prediction network, where the tuning parameter includes at least one of a defogging parameter, a white balance parameter, a contrast parameter, a hue parameter, a sharpening parameter, and a correction parameter;
an image enhancement module 402, configured to perform image enhancement processing on the image to be processed according to the tuning parameters output by the parameter prediction network, so as to obtain an optimized image corresponding to the image to be processed;
the target detection module 403 is configured to input the optimized image corresponding to the image to be processed into a target detection network for target detection, and output a target detection result through the target detection network, where the target detection network and the parameter prediction network are neural networks obtained by performing joint training in advance using training data, and the training data includes images meeting preset conditions.
Optionally, the apparatus further comprises:
the data acquisition module is used for acquiring training data;
the data labeling module is used for labeling the targets contained in the training data to obtain labeling results;
the initial prediction module is used for inputting the training data into an initial parameter prediction network and outputting tuning parameters through the initial parameter prediction network;
the initial tuning module is used for performing image enhancement processing on the training data according to the tuning parameters output by the initial parameter prediction network, to obtain an optimized image corresponding to the training data;
the initial detection module is used for inputting the optimized image corresponding to the training data into an initial target detection network for target detection and outputting a target detection result through the initial target detection network;
and the iterative training module is used for calculating a joint loss value according to the difference between the target detection result output by the initial target detection network and the labeling result, and performing iterative optimization on the parameters of the initial parameter prediction network and the parameters of the initial target detection network until the joint loss value meets an iteration stop condition to obtain a trained parameter prediction network and a trained target detection network.
Optionally, the joint loss value is obtained by performing weighted calculation according to the loss value of the initial parameter prediction network and the loss value of the initial target detection network.
Optionally, the apparatus further comprises:
the image acquisition module is used for acquiring non-foggy day images;
the image making module is used for processing the non-foggy day images through a first control parameter and a second control parameter to generate foggy day images with different illumination intensities and different foggy day grades, the first control parameter is used for controlling the illumination intensity of the generated foggy day images, and the second control parameter is used for controlling the foggy day grades of the generated foggy day images;
and the data construction module is used for constructing training data by utilizing the foggy day images.
Optionally, the apparatus further comprises:
the size adjusting module is used for adjusting the original image to respectively obtain a to-be-processed image with a first size and a to-be-processed image with a second size;
the parameter prediction module is specifically configured to input the image to be processed of the first size into the parameter prediction network;
the image enhancement module is specifically configured to perform image enhancement processing on the image to be processed of the second size according to the tuning parameters output by the parameter prediction network.
Optionally, the image enhancement module comprises:
the input submodule is used for inputting the image to be processed and the tuning parameters output by the parameter prediction network into an image enhancement module, and the image enhancement module comprises filters which correspond to the tuning parameters output by the parameter prediction network one by one;
and the processing submodule is used for sequentially carrying out image enhancement processing on the image to be processed by using the corresponding tuning parameters through each filter.
Optionally, the image enhancement module comprises at least one of a defogging filter, a white balance filter, a contrast filter, a tone filter, a sharpening filter, and a correction filter, an output of a previous filter being an input of a next filter.
Optionally, the parameter prediction network includes a first number of convolutional layers and a second number of fully-connected layers, where the convolutional layers are configured to perform convolution operation on an input image to be processed to output a feature map, and the fully-connected layers are configured to perform fully-connected operation on the feature map output by the convolutional layers to output tuning parameters.
Optionally, the target detection network comprises a YOLOX network.
The embodiment of the invention obtains a parameter prediction network and a target detection network by training data in a combined manner in advance, inputs the image to be processed into the trained parameter prediction network, can output tuning parameters through the parameter prediction network, performs image enhancement processing on the image to be processed by using the tuning parameters, can obtain an optimized image corresponding to the image to be processed, and inputs the optimized image into the trained target detection network to output a target detection result. The embodiment of the invention can automatically predict the tuning parameters required by the image to be processed through the parameter prediction network, does not need to have higher professional requirements on shooting personnel when shooting the image, and can also reduce the operation cost of a user. In addition, the parameter prediction network is a neural network obtained through training of a large amount of training data, and the training data comprise images under preset conditions (such as severe weather), so that the parameter prediction network provided by the embodiment of the invention can accurately predict tuning parameters required by the images under the preset conditions (such as severe weather), the adaptability of the parameter prediction network to the images under the preset conditions (such as severe weather) can be enhanced, and compared with manual parameter setting, the embodiment of the invention can improve the accuracy of the tuning parameters, and further improve the accuracy of target detection. Moreover, the parameter prediction network and the target detection network of the embodiment of the invention can be obtained through end-to-end training and testing, and the cost of manual debugging can be reduced.
For the device embodiment, since it is basically similar to the method embodiment, the description is simple, and for the relevant points, refer to the partial description of the method embodiment.
The embodiments in the present specification are described in a progressive manner: each embodiment focuses on its differences from the others, and identical or similar parts may be cross-referenced among the embodiments.
With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.
An embodiment of the present invention further provides a non-transitory computer-readable storage medium. When a processor of a device (a server or a terminal) executes the instructions in the storage medium, the device is enabled to perform the target detection method described in the embodiment corresponding to fig. 1, which will therefore not be repeated here. The beneficial effects, being the same as those of the method, are likewise not described again. For technical details not disclosed in the embodiments of the computer program product or computer program referred to in the present application, reference is made to the description of the method embodiments of the present application.
Other embodiments of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. The invention is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the invention being indicated by the following claims.
It will be understood that the invention is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the invention is limited only by the appended claims.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and should not be taken as limiting the scope of the present invention, which is intended to cover any modifications, equivalents, improvements, etc. within the spirit and scope of the present invention.
The target detection method, target detection device and machine-readable storage medium provided by the present invention have been described in detail above. Specific examples have been applied herein to explain the principles and embodiments of the invention, and the above description of the embodiments is intended only to aid understanding of the method and its core idea. Meanwhile, persons skilled in the art may, following the idea of the present invention, make changes to the specific embodiments and the application scope. In summary, the content of this specification should not be construed as limiting the present invention.

Claims (11)

1. A method of object detection, the method comprising:
inputting an image to be processed into a parameter prediction network, and outputting tuning parameters through the parameter prediction network, wherein the tuning parameters comprise at least one of defogging parameters, white balance parameters, contrast parameters, hue parameters, sharpening parameters and correction parameters;
performing image enhancement processing on the image to be processed according to the tuning parameters output by the parameter prediction network to obtain an optimized image corresponding to the image to be processed;
inputting the optimized image corresponding to the image to be processed into a target detection network for target detection, and outputting a target detection result through the target detection network, wherein the target detection network and the parameter prediction network are neural networks obtained by joint training in advance by using training data, and the training data comprises images meeting preset conditions.
2. The method of claim 1, wherein before inputting the image to be processed into the parameter prediction network, the method further comprises:
acquiring training data;
labeling targets contained in the training data to obtain a labeling result;
inputting the training data into an initial parameter prediction network, and outputting tuning parameters through the initial parameter prediction network;
performing image enhancement processing on the training data according to the tuning parameters output by the initial parameter prediction network to obtain an optimized image corresponding to the training data;
inputting the optimized image corresponding to the training data into an initial target detection network for target detection, and outputting a target detection result through the initial target detection network;
and calculating a joint loss value according to the difference between the target detection result output by the initial target detection network and the labeling result, and performing iterative optimization on the parameters of the initial parameter prediction network and the parameters of the initial target detection network until the joint loss value meets an iteration stop condition to obtain a trained parameter prediction network and a trained target detection network.
3. The method of claim 2, wherein the joint loss value is calculated by weighting the initial parameter prediction network loss value and the initial target detection network loss value.
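The weighted combination of claim 3 can be sketched as follows; the specific weight values are illustrative assumptions, since the claim states only that the two loss values are weighted.

```python
# Joint loss as a weighted combination of the two networks' losses (claim 3).
# The default weights are hypothetical; the claim does not fix them.

def joint_loss(pred_net_loss, det_net_loss, w_pred=0.5, w_det=0.5):
    # Weighted sum of the parameter prediction network loss and the
    # target detection network loss.
    return w_pred * pred_net_loss + w_det * det_net_loss

loss = joint_loss(0.4, 0.8)  # -> 0.6 with equal weights
```

During training, both networks' parameters would be updated against this single scalar, which is what makes the optimization joint and end-to-end.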
4. The method according to any one of claims 1 to 3, wherein before inputting the image to be processed into the parameter prediction network, the method further comprises:
acquiring a non-foggy day image;
processing the non-foggy day image through a first control parameter and a second control parameter to generate foggy day images with different illumination intensities and different foggy day grades, wherein the first control parameter is used for controlling the illumination intensity of the generated foggy day images, and the second control parameter is used for controlling the foggy day grades of the generated foggy day images;
and constructing training data by using the foggy day image.
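One common way to realize the fog synthesis of claim 4 is the atmospheric scattering model I = J·t + A·(1 − t) with t = exp(−β·d). The claim does not name a specific model, so treating the first control parameter as the atmospheric light A (illumination intensity) and the second as the scattering coefficient β (fog grade) is an assumption for illustration.

```python
import math

# Plausible realization of claim 4's fog synthesis via the standard
# atmospheric scattering model. A = illumination control parameter,
# beta = fog-grade control parameter; this exact model is an assumption.

def synthesize_fog(clear_pixels, depths, illumination_a, fog_beta):
    foggy = []
    for j, d in zip(clear_pixels, depths):
        t = math.exp(-fog_beta * d)          # transmission falls with depth
        foggy.append(j * t + illumination_a * (1.0 - t))
    return foggy

clear = [0.1, 0.5, 0.9]                      # toy non-foggy pixel values
depths = [1.0, 2.0, 3.0]                     # toy scene depths
foggy = synthesize_fog(clear, depths, illumination_a=0.8, fog_beta=0.6)
```

Sweeping the two parameters over grids yields foggy images of different illumination intensities and fog grades from a single clear image, which is how the synthetic training data of claim 4 could be constructed.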
5. The method of claim 1, wherein before inputting the image to be processed into the parameter prediction network, the method further comprises:
adjusting the original image to respectively obtain a to-be-processed image with a first size and a to-be-processed image with a second size;
the inputting of the image to be processed into the parameter prediction network comprises:
inputting the image to be processed with the first size into the parameter prediction network;
the performing image enhancement processing on the image to be processed according to the tuning parameters output by the parameter prediction network comprises:
performing image enhancement processing on the image to be processed with the second size according to the tuning parameters output by the parameter prediction network.
6. The method according to claim 1, wherein the performing image enhancement processing on the image to be processed according to the tuning parameters output by the parameter prediction network comprises:
inputting the image to be processed and the tuning parameters output by the parameter prediction network into an image enhancement module, wherein the image enhancement module comprises filters in one-to-one correspondence with the tuning parameters output by the parameter prediction network;
and performing image enhancement processing on the image to be processed through each filter in sequence, using the corresponding tuning parameters.
7. The method of claim 6, wherein the image enhancement module comprises at least one of a defogging filter, a white balance filter, a contrast filter, a hue filter, a sharpening filter, and a correction filter, wherein an output of a previous filter is used as an input to a next filter.
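The chained filters of claims 6 and 7, where each filter's output serves as the next filter's input, can be sketched as follows; the filter bodies are simplified placeholders rather than the patent's actual operators.

```python
# Sketch of the image enhancement module's filter chain (claims 6-7):
# each filter consumes the previous filter's output together with its own
# tuning parameter. Filter implementations are illustrative placeholders.

def white_balance_filter(image, gain):
    return [min(1.0, p * gain) for p in image]

def contrast_filter(image, factor):
    mean = sum(image) / len(image)
    return [max(0.0, min(1.0, (p - mean) * factor + mean)) for p in image]

def apply_filter_chain(image, chain):
    # chain: ordered list of (filter_fn, tuning_parameter) pairs.
    for filter_fn, param in chain:
        image = filter_fn(image, param)   # previous output feeds the next filter
    return image

result = apply_filter_chain([0.2, 0.4, 0.9],
                            [(white_balance_filter, 1.1),
                             (contrast_filter, 1.3)])
```

A full chain would run the defogging, white balance, contrast, hue, sharpening and correction filters in order, each driven by its predicted tuning parameter.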
8. The method of claim 1, wherein the parameter prediction network comprises a first number of convolutional layers and a second number of fully-connected layers, the convolutional layers being used for performing convolution operations on the input image to be processed to output a feature map, and the fully-connected layers being used for mapping the feature map output by the convolutional layers to the tuning parameters.
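A minimal one-dimensional sketch of the claim 8 architecture: a convolution produces a feature map, and a fully connected mapping turns it into a tuning parameter. The kernel and weight values are arbitrary illustrations, not trained parameters.

```python
# Toy 1-D analogue of claim 8: convolutional layer -> feature map ->
# fully connected layer -> tuning parameter. All values are illustrative.

def conv1d(signal, kernel):
    # Valid (no-padding) 1-D convolution producing the "feature map".
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

def fully_connected(features, weights, bias):
    # Dense layer: weighted sum of the feature map plus a bias.
    return sum(f * w for f, w in zip(features, weights)) + bias

signal = [0.1, 0.4, 0.3, 0.8, 0.6]               # stand-in for image pixels
feature_map = conv1d(signal, kernel=[0.5, 0.5])  # "convolutional layer"
tuning_param = fully_connected(feature_map, [0.25] * len(feature_map), 0.1)
```

A real implementation would stack several 2-D convolutional layers followed by fully-connected layers emitting one value per tuning parameter.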
9. The method of claim 1, wherein the target detection network comprises a YOLOX network.
10. An object detection apparatus, characterized in that the apparatus comprises:
the parameter prediction module is used for inputting the image to be processed into a parameter prediction network and outputting tuning parameters through the parameter prediction network, wherein the tuning parameters comprise at least one of defogging parameters, white balance parameters, contrast parameters, hue parameters, sharpening parameters and correction parameters;
the image enhancement module is used for carrying out image enhancement processing on the image to be processed according to the tuning parameters output by the parameter prediction network to obtain an optimized image corresponding to the image to be processed;
and the target detection module is used for inputting the optimized image corresponding to the image to be processed into a target detection network for target detection, and outputting a target detection result through the target detection network, wherein the target detection network and the parameter prediction network are neural networks obtained by joint training by utilizing training data in advance, and the training data comprise images meeting preset conditions.
11. A machine-readable storage medium having stored thereon instructions which, when executed by one or more processors of an apparatus, cause the apparatus to perform the object detection method of any one of claims 1 to 9.
CN202210684927.6A 2022-06-14 2022-06-14 Target detection method and device and readable storage medium Pending CN115100500A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210684927.6A CN115100500A (en) 2022-06-14 2022-06-14 Target detection method and device and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210684927.6A CN115100500A (en) 2022-06-14 2022-06-14 Target detection method and device and readable storage medium

Publications (1)

Publication Number Publication Date
CN115100500A true CN115100500A (en) 2022-09-23

Family

ID=83290702

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210684927.6A Pending CN115100500A (en) 2022-06-14 2022-06-14 Target detection method and device and readable storage medium

Country Status (1)

Country Link
CN (1) CN115100500A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116843583A (en) * 2023-09-01 2023-10-03 荣耀终端有限公司 Image processing method, device, electronic equipment and storage medium
CN116843583B (en) * 2023-09-01 2024-05-14 荣耀终端有限公司 Image processing method, device, electronic equipment and storage medium


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination