WO2022156763A1 - Target object detection method and device thereof - Google Patents

Target object detection method and device thereof

Info

Publication number
WO2022156763A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
target object
target
evaluation
evaluation value
Prior art date
Application number
PCT/CN2022/073151
Other languages
French (fr)
Chinese (zh)
Inventor
孔令广
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司 filed Critical 华为技术有限公司
Publication of WO2022156763A1 publication Critical patent/WO2022156763A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30168Image quality inspection

Definitions

  • the present application relates to the field of monitoring, and in particular, to a target object detection method and device thereof.
  • a target object detection method and device thereof are proposed, which can reduce the data amount of the encoded image.
  • an embodiment of the present application provides a target object detection method, the method including: acquiring a first image; detecting whether the first image includes a target object; when the first image does not include the target object, not performing encoding on the first image; acquiring a second image; detecting whether the second image includes the target object; and when the second image includes the target object, performing encoding on the second image and sending the encoded second image to the monitoring platform server.
  • as cameras become more and more intelligent, a camera can perform relevant image processing on each image in a video stream after acquiring the video stream. Therefore, if the smart camera can identify valid images and transmit only those to the monitoring platform server, this not only reduces the data transmission pressure, but also reduces the storage pressure and data processing pressure of the monitoring platform server.
  • the target detection method performs target object detection on the images captured by the camera and encodes only the images that contain a target object; images without a target object need not be encoded, thus reducing the number of encoded images.
  • images without a target object may not be sent to the monitoring platform server, thereby reducing the subsequent transmission amount and the storage cost of the monitoring platform server.
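As a non-authoritative illustration of this first-aspect flow, the sketch below encodes and uploads an image only when a target object is detected; the names detect_target_objects, encode_for_transmission and send_to_platform are hypothetical placeholders rather than functions defined by this application.

```python
# Illustrative sketch only: detect_target_objects, encode_for_transmission and
# send_to_platform are hypothetical placeholders for the camera's detection,
# encoding and transmission functions described above.
def process_image(image, detect_target_objects, encode_for_transmission, send_to_platform):
    """Encode and upload an image only when it contains a target object."""
    detections = detect_target_objects(image)    # e.g. pedestrians, vehicles
    if not detections:
        return None                              # no target object: do not encode; image may be discarded
    encoded = encode_for_transmission(image)     # encode only images that contain a target object
    send_to_platform(encoded)                    # send the encoded image to the monitoring platform server
    return encoded
```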
  • acquiring the first image includes: acquiring at least one image within a preset interval, where the preset interval includes a preset time interval and a preset number interval; and selecting a first image from the at least one image in an image selection manner; the method further includes: determining not to perform encoding on the at least one image.
  • the processing applied to the representative image is taken as the processing applied to the multiple images as a whole, where the processing includes: encoding/not encoding, and sending/not sending to the monitoring platform server.
  • the method further includes: using an evaluation method to determine an evaluation value of the second image, where the evaluation value is used to describe the image quality of the second image.
  • an evaluation method may be used to determine the evaluation value of the second image that already has the target object. That is, evaluating the second image by a quantized value (the evaluation value) enables the user to evaluate the second image more intuitively.
  • the evaluation method includes: when the second image includes multiple target objects, the evaluation value of the second image is correlated with the sub-evaluation values of the multiple target objects in the second image.
  • the evaluation of the second image involves each target object in the second image.
  • this method takes into account the correlation between the multiple target objects, so that the value of the second image can be more accurately measured.
  • the evaluation value is related to one or more of the following: the sharpness of the target object area where the target object is located; the number of pixels in the target object area; the shooting angle at which the target object area is captured; and the number of key features possessed by the target object.
  • the second image may be evaluated from one or more of the above four aspects, so that the second image can be more accurately evaluated.
  • the method further includes: judging whether the evaluation value satisfies a preset threshold; and if the preset threshold is satisfied, sending the second image to the monitoring platform server.
  • the method can separately send images with high image quality to the monitoring platform server, so that the monitoring platform server can perform separate or focused analysis on these images, thereby reducing the processing pressure of the monitoring platform server and improving processing efficiency.
  • the method further includes discarding the first image.
  • the first image can be discarded, thereby saving storage space.
  • embodiments of the present application provide a camera, where the camera includes: a lens for receiving light used to generate an image; and a camera body for performing the target object detection method of the first aspect or of one or more of the possible implementations of the first aspect.
  • the embodiments provide a target object detection device, the device includes: an image acquisition unit for acquiring at least one image; a target object detection unit for performing target object detection on the at least one image, A target image with a target object is determined, and images without a target object are discarded; an encoding unit is used to perform encoding on the target image to generate an encoded image; and a sending unit is used to send the encoded image to the monitoring platform server.
  • the device further includes: an image quality evaluation unit, configured to perform evaluation on the image quality of the target image, and determine an evaluation value of the target image .
  • the image quality evaluation unit is specifically configured to, when the target image includes multiple target objects, determine the evaluation value by using the sub-evaluation values of the multiple target objects.
  • the image quality evaluation unit is further configured to determine whether the evaluation value satisfies a preset threshold, and if the preset threshold is satisfied, send the target image to the cache unit.
  • the device further includes: a cache unit, configured to store a target image that meets the preset threshold.
  • the sending unit is further configured to send the image in the cache unit to the monitoring platform server.
  • the evaluation value is related to one or more of the following: the sharpness of the target object area where the target object is located; the number of pixels in the target object area; the shooting angle at which the target object area is captured; and the number of key features possessed by the target object.
  • the target object detection unit is specifically configured to select a representative image from the at least one image and perform target object detection on the representative image; if the representative image contains a target object, it is determined that each of the at least one image is a target image with the target object.
  • an embodiment of the present application provides a camera, including: a lens for collecting light; a sensor for generating an image by performing photoelectric conversion on the light collected by the lens; and a processor or processor cluster for executing the target object detection method of the first aspect or of one or more of the possible implementations of the first aspect.
  • embodiments of the present application provide a non-volatile computer-readable storage medium on which computer program instructions are stored, where the computer program instructions, when executed by a processor, implement the target object detection method of the first aspect or of one or more of the possible implementations of the first aspect.
  • FIG. 1 shows a schematic diagram of an application scenario according to an embodiment of the present application
  • FIG. 2 shows a diagram of data processing of a smart camera according to an embodiment of the present application
  • FIG. 3 shows a schematic structural diagram of a target detection system according to an embodiment of the present application
  • FIG. 4 shows a flow chart of steps of a target object detection method according to an embodiment of the present application
  • FIG. 5 shows a block diagram of a target object detection apparatus according to an embodiment of the present application.
  • “/” may indicate an “or” relationship between the associated objects; for example, A/B may indicate A or B. “and/or” may be used to describe three possible relationships between associated objects; for example, A and/or B can mean that A exists alone, A and B exist at the same time, or B exists alone, where A and B can be singular or plural.
  • words such as “first” and “second” may be used to distinguish technical features with the same or similar functions. The words “first”, “second” and the like do not limit the quantity or execution order, nor do they require that the features be different.
  • words such as “exemplary” or “for example” are used to represent examples, illustrations or explanations, and any embodiment or design solution described as “exemplary” or “for example” should not be construed as preferred or advantageous over other embodiments or designs.
  • the use of words such as “exemplary” or “such as” is intended to present the relevant concepts in a specific manner to facilitate understanding.
  • the technical solution of the present application is applicable to the field of video surveillance, and video surveillance is an important part of a security protection system.
  • the application scenario of the technical solution will be briefly described below with reference to FIG. 1 .
  • FIG. 1 is a schematic diagram of an application scenario to which the technical solution provided by the present application is applicable.
  • a video surveillance system can be used to monitor road conditions. Before performing video surveillance, it is necessary to set the target object of video surveillance.
  • target objects include pedestrians, non-motor vehicles, and motor vehicles.
  • the video surveillance system may include devices with audio/video capture functions and a surveillance platform server that performs data communication with these devices.
  • the video surveillance system shown includes only four video capture devices, but in practice the video surveillance system may include more or fewer video/audio capture devices as needed.
  • the video capture device may be a camera
  • the camera may include a common camera and a smart camera
  • an ordinary camera refers to a device that converts the captured video data into a suitable bit rate and uploads it to the monitoring platform server; that is, ordinary cameras rely on the monitoring platform server to process the captured video data (data processing such as object recognition), while smart cameras can use an embedded intelligent processing module to first perform image processing on the video data and then upload the processed video data to the monitoring platform server, where the intelligent processing module may include modules such as a face recognition module and a license plate recognition module.
  • the camera of the present application includes a lens and a camera body.
  • the lens is used to receive the light used to generate the image.
  • the function of the lens is to present the light image of the observed target on the sensor of the camera, also known as optical imaging.
  • the lens combines optical parts of various shapes (reflectors, transmission mirrors, prisms) and different media (plastic, glass or crystal) in a certain way, so that after being transmitted or reflected by these optical parts, the light changes its transmission direction as required and is received by the receiving device, completing the optical imaging of the object.
  • each lens is composed of multiple groups of lens elements with different curvatures combined at different spacings.
  • the focal length of the lens is determined by the selection of indicators such as spacing, lens curvature, and light transmittance.
  • the main parameters of the lens include: effective focal length, aperture, maximum image plane, field of view, distortion, relative illumination, etc. The value of each index determines the overall performance of the lens.
  • the camera body may include a sensor and a processor.
  • the sensor, also known as an image sensor, may be, for example, a charge-coupled device (CCD) or a complementary metal-oxide-semiconductor (CMOS) sensor.
  • Both CCD and CMOS sensors have a large number (e.g., tens of millions) of photodiodes; each photodiode is called a photosensitive cell, and each photosensitive cell corresponds to a pixel.
  • the photodiode converts the light signal into an electrical signal containing brightness (or brightness and color) after receiving light, and the image is reconstructed accordingly.
  • Bayer array is a common image sensor technology that can be used in CCD and CMOS.
  • a Bayer array uses a Bayer color filter so that each pixel is sensitive to only one of the three primary colors (red, green and blue); these interleaved pixel values are then interpolated by demosaicing to restore the original image.
  • Bayer arrays can be applied to CCD or CMOS, and sensors using Bayer arrays are also called Bayer sensors.
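To make the demosaicing idea concrete, the following sketch shows a deliberately naive interpolation that averages each 2x2 RGGB block into a single RGB pixel; real demosaicing algorithms (and the layout of any particular sensor) are more sophisticated, so this is an assumption-laden illustration only.

```python
import numpy as np

def naive_demosaic_rggb(raw):
    """Collapse each 2x2 RGGB block of a raw Bayer image into one RGB pixel.
    Deliberately naive, for illustration only; assumes an RGGB layout and
    even image dimensions."""
    raw = raw.astype(np.float32)
    r = raw[0::2, 0::2]                              # red samples
    g = (raw[0::2, 1::2] + raw[1::2, 0::2]) / 2.0    # average of the two green samples
    b = raw[1::2, 1::2]                              # blue samples
    return np.stack([r, g, b], axis=-1)              # (H/2, W/2, 3) color image
```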
  • other sensor technologies also exist, such as the X3 technology developed by Foveon.
  • X3 technology uses three layers of photosensitive elements, each layer recording one of the RGB color channels, so the image sensor can capture all colors at a single pixel location.
  • the processor (also called an image processor), such as a system-on-chip (SoC), is used to convert the image produced by the sensor into a three-channel format (such as YUV), improve the image quality, detect whether there is a target object in the image, and also encode the image.
  • the above-mentioned smart processing module may be included in the processor.
  • there may be only one processor (e.g., a multi-function integrated SoC), or a cluster composed of multiple processors (e.g., multiple processors including an ISP and an encoder).
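As background for the YUV conversion mentioned above, the sketch below applies the widely used BT.601 full-range RGB-to-YCbCr formulas to a single pixel; the actual conversion performed by the SoC/ISP is not specified here, so this is a generic example.

```python
def rgb_to_yuv(r, g, b):
    """Convert one RGB pixel (0-255 per channel) to YUV/YCbCr using the common
    BT.601 full-range coefficients; shown only to illustrate what a
    'three-channel YUV format' means, not the camera's actual ISP pipeline."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = -0.169 * r - 0.331 * g + 0.500 * b + 128.0   # Cb, offset so values stay non-negative
    v = 0.500 * r - 0.419 * g - 0.081 * b + 128.0    # Cr
    return y, u, v
```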
  • the smart camera 201 can transmit the data to the monitoring platform server by using the communication network.
  • the data can be transmitted to the storage unit 202 (such as a hard disk) of the monitoring platform server and stored in the storage unit 202.
  • the monitoring platform server refers to a device that can receive the data sent by the camera, perform related processing on the data, and store the data.
  • the monitoring platform server may be a single computing device or multiple computing devices, such as a server, server cluster, public cloud/private cloud.
  • the video monitoring system can preset the data sent to the monitoring platform server.
  • the data sent to the monitoring platform server may include a video stream collected by the smart camera, a target image determined by an intelligent processing module, and a region of interest (ROI) in the target image.
  • the region of interest is often the region where the target object is located, so it can be called the target object region.
  • the smart camera can also send the set content to the storage unit 202 through the server, for example, the captured video stream and the recognized face image.
  • the smart camera may send to the monitoring platform server an encoded video stream that includes only target images containing the target object, as well as target images that meet a preset image quality.
  • Figure 3 shows a schematic structural diagram of the target detection system.
  • the target detection system 300 includes a plurality of cameras 301 to 305 and a monitoring platform server 310, wherein each camera in the plurality of cameras 301 to 305 may be an ordinary camera or a smart camera, In the case where the cameras 301 to 305 are smart cameras, the smart processing modules embedded in each smart camera may be the same or different.
  • the cameras 301 to 305 can transmit the acquired video data to the monitoring platform server, and the interface connecting the monitoring platform server 310 and the cameras 301 to 305 can be wired or wireless communication.
  • the wired mode may include transmission control protocol/internet protocol (TCP/IP) communication, user datagram protocol (UDP) communication, or standard ports such as a universal serial bus (USB) port, a COM interface and other similar standard ports.
  • the wireless communication method may include technologies such as WiFi, Bluetooth, ZigBee or ultra wideband (UWB). The corresponding connection method can be selected according to the actual application scenario and the hardware form of the camera.
  • FIG. 4 shows a flow chart of steps of a target object detection method according to an embodiment of the present application.
  • the target object detection method shown in FIG. 4 can be executed by a smart camera in a target detection system.
  • in step S410, a first image is acquired, where the first image is an image acquired by the smart camera described above, or an image acquired by an ordinary camera and sent to a corresponding smart camera.
  • the method may acquire at least one image within a preset interval, and select the first image from the at least one image in an image selection manner. That is to say, considering that real-time detection is very challenging for the processing capability of the smart camera, it is possible to acquire multiple images within a preset interval, and then select an image from the multiple images as a representative image.
  • the preset interval mentioned here may be a time interval, such as multiple frames of images captured within five seconds, or may be a number interval, such as 10 frames of images captured continuously.
  • the image selection manner refers to the way in which a representative image is selected from the multiple images; for example, the intermediate image may be selected as the representative image, or the first frame may be selected as the representative image, which is not limited in this application.
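A rough sketch of this batch-plus-representative logic is given below; the middle-frame choice and the has_target_object callback are illustrative assumptions, since the application leaves the selection manner open.

```python
def select_representative(frames):
    """One possible image selection manner: take the middle frame of the batch."""
    return frames[len(frames) // 2]

def process_batch(frames, has_target_object):
    """Apply the representative frame's detection result to the whole batch:
    if it has no target object, none of the frames go on to encoding."""
    representative = select_representative(frames)
    if has_target_object(representative):
        return list(frames)   # whole batch proceeds to encoding / sending
    return []                 # whole batch skipped (not encoded, may be discarded)
```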
  • step S420 it is detected whether a target object is included in the first image, wherein the target object is preset, and in implementation, the type of the target object may be a pedestrian, a motor vehicle and/or a non-motor vehicle.
  • the number of target objects may be a single target object or multiple target objects. In the case of multiple target objects, as long as one target object is detected, it can be determined that the first image includes the target object. For example, if the image includes multiple non-motor vehicles, or if the same image includes pedestrians, motor vehicles and non-motor vehicles at the same time, then as long as it is detected that the first image includes a non-motor vehicle, it is determined that the first image includes the target object.
  • the set target object corresponds to the intelligent processing module embedded in the smart camera, that is, the smart camera has an intelligent processing module for detecting the target object, so that the intelligent processing module can be used to determine whether the first image includes the target object.
  • the intelligent processing module can be implemented by an SoC.
  • the intelligent processing module may be an artificial intelligence (AI) module corresponding to the target, and may also be a machine learning module or a deep learning module, etc., where an AI module refers to loading a large amount of data into a computing device and choosing a model to “fit” the data so that the computing device produces predictions/inferences.
  • the models used by computing devices range from simple equations (such as the equation of a straight line) to very complex logical/mathematical systems; once the model to be used is selected and adjusted (that is, the model is improved through adjustment), the computing device uses the model to learn patterns in the data, and finally the model can be used to process the input data.
  • the AI module may be a module with corresponding target detection capability, and the model used by it may be determined by actual users or technicians, for example, models corresponding to face recognition, pedestrian recognition, and license plate recognition.
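Step S420 can be sketched as below, assuming a hypothetical AI-module interface that returns (class, confidence, box) detections; the class names and the confidence threshold are placeholders, not values given in this application.

```python
# Hypothetical detector interface: detector(image) yields (class_label, score, box) tuples.
TARGET_CLASSES = {"pedestrian", "motor_vehicle", "non_motor_vehicle"}  # preset target types

def image_has_target(image, detector, score_threshold=0.5):
    """Return True as soon as one sufficiently confident detection of a preset class is found."""
    for class_label, score, box in detector(image):
        if class_label in TARGET_CLASSES and score >= score_threshold:
            return True
    return False
```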
  • step S430 when the target object is not included in the first image, encoding is not performed on the first image. That is, in a case where it is determined that the first image does not include the target object, encoding is not performed on the first image.
  • the first image may be deleted (discarded), thereby reducing the subsequent transmission amount and the storage cost of the monitoring platform server.
  • encoding is not performed on any of the multiple images within the preset interval represented by the first image, and all of them may be deleted.
  • encoding refers to encoding the three-channel image (such as a YUV-format image) for ease of transmission and viewing by users, for example, producing an image encoded in JPG format or a video encoded in H.264/H.265.
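For a single still image, the JPG encoding mentioned above might look like the following OpenCV-based sketch; the library choice and quality setting are assumptions for illustration, and a video stream would instead be passed to an H.264/H.265 encoder.

```python
import cv2

def encode_jpg(image_bgr, quality=90):
    """Encode a single BGR image as JPG bytes; an illustrative OpenCV-based example,
    not the encoder actually used by the camera."""
    ok, buffer = cv2.imencode(".jpg", image_bgr, [int(cv2.IMWRITE_JPEG_QUALITY), quality])
    if not ok:
        raise RuntimeError("JPG encoding failed")
    return buffer.tobytes()
```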
  • the method may further perform step S440 to acquire a second image. If, after performing step S420, it is determined that the second image includes the target object, step S450 is performed: the second image is encoded and the encoded second image is sent to the monitoring platform server.
  • the second image may be encoded in H.264/H.265 format and transmitted to the monitoring platform server.
  • the second image with the target object may also be described as the target image.
  • steps S410 and S440 are executed in parallel. In other embodiments, the two may also be performed sequentially, that is, steps S410, S420 and S430 are performed first, and then steps S440, S420 and S450 are performed.
  • the method further includes performing an evaluation on the image quality of the second image if it is determined that the second image includes the target object.
  • one evaluation method is to evaluate the image based on a single target object in it; for example, for an image including only pedestrian A, or for an image including pedestrian A and pedestrian B, the evaluation of pedestrian A (or pedestrian B) is used to represent the evaluation of the entire image. This approach ignores the association between multiple target objects. As an example, pedestrian A and pedestrian B walk together, and after a period of time pedestrian A, pedestrian B and pedestrian C walk together and cross the road. During this process, if only the images including pedestrian A, or only the relevant area in the image (that is, the area including pedestrian A), are evaluated, the correlation between the multiple pedestrians in the image is ignored, and the importance of the image is thus underestimated.
  • the method may utilize an evaluation method to determine an evaluation value of the second image, wherein the evaluation value may be used to describe the image quality of the second image.
  • the evaluation value of the second image may be determined by using an evaluation method.
  • the method may first determine the number of target objects included in the second image, and then determine a specific evaluation manner according to the number of target objects.
  • the evaluation method may include a single target object evaluation method in which only a single target object is included in the second image or a multi-target object evaluation method in which at least two target objects are included in the second image.
  • the above-mentioned intelligent processing module can be used to detect target objects, and then determine the number of detected target objects.
  • when the second image includes only a single target object, a single-target object evaluation method may be adopted.
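The selection between the two evaluation manners can be sketched as follows; evaluate_single and evaluate_multi stand in for the single-target and multi-target evaluation methods described below.

```python
def evaluate_image(detections, evaluate_single, evaluate_multi):
    """Dispatch to the single-target or multi-target evaluation method
    according to how many target objects were detected in the second image."""
    if not detections:
        return None                       # evaluation only applies to images with targets
    if len(detections) == 1:
        return evaluate_single(detections[0])
    return evaluate_multi(detections)
```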
  • the target object can be evaluated from different dimensions according to the evaluation index of image quality, and the evaluation value of the second image can be determined.
  • the evaluation value is related to the evaluation index.
  • the evaluation indices include the sharpness of the target object area where the target object is located, the number of key features possessed by the target object, the number of pixels in the target object area, and the shooting angle at which the target object area is captured.
  • the sharpness of the target object area refers to how clearly each fine detail and its boundary are rendered in the image; in implementation, sharpness can be used to measure each detail in the image.
  • the AI module needs to use the feature values of the key features of the target object in the process of recognizing the target object; therefore, the number of key features possessed by the target object determines the recognition rate of the AI module. The more pixels in the target object area, the larger the area occupied by the target object in the image, which is more conducive to various kinds of image processing. The shooting angle reflects the angle at which the target object area is captured, for example whether the front or the profile of the target object is photographed.
  • the second image may be evaluated by using one or more of the above-mentioned evaluation indicators to determine the evaluation value of the second image.
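One way to turn these indicators into a sub-evaluation value is sketched below; the Laplacian-variance sharpness metric, the normalization constants and the equal weighting are assumptions made for illustration only.

```python
import cv2
import numpy as np

def evaluate_target_region(region_bgr, angle_score, num_key_features,
                           max_features=10, max_pixels=128 * 128):
    """Combine the four indicators into a single score in [0, 1].
    angle_score (in [0, 1]) and num_key_features are assumed to come from
    upstream modules; the constants here are illustrative placeholders."""
    gray = cv2.cvtColor(region_bgr, cv2.COLOR_BGR2GRAY)
    sharpness_score = min(cv2.Laplacian(gray, cv2.CV_64F).var() / 100.0, 1.0)  # region sharpness
    pixel_score = min(gray.size / max_pixels, 1.0)                             # more pixels, larger region
    feature_score = min(num_key_features / max_features, 1.0)                  # key features aid recognition
    return float(np.mean([sharpness_score, pixel_score, feature_score, angle_score]))
```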
  • the method determines whether the evaluation value is higher than a predetermined threshold, if higher than the predetermined threshold, the image quality of the second image is high, and if it is lower than the predetermined threshold, the image quality of the second image is not high. If the evaluation value is higher than the predetermined threshold, the second image is sent to the buffer unit, and if the evaluation value is lower than the predetermined threshold, the second image may be discarded. As an example, the method may send the image stored in the buffer unit together with the video stream generated by encoding to the monitoring platform server.
  • different grades may be divided according to the evaluation value, each grade corresponds to a range of evaluation values, and if the evaluation value of the second image falls within the range of evaluation values, the second image corresponds to this grade.
  • the evaluation value may be divided into five grades, which may include excellent, good, moderate, pass, and fail.
  • the method may send an image with an evaluation value at or above a certain grade to the cache unit, and discard the image if it is below that grade; as an example, the certain grade may be the “good” grade. Finally, the images stored in the cache unit are separately sent to the monitoring platform server.
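The grading and caching step can be sketched as follows; the grade boundaries and the "good" cutoff are placeholders for whatever thresholds are actually configured, not values from this application.

```python
# Illustrative grade boundaries for an evaluation value in [0, 1].
GRADES = [(0.9, "excellent"), (0.75, "good"), (0.6, "moderate"), (0.4, "pass"), (0.0, "fail")]

def grade_of(evaluation_value):
    """Map an evaluation value to one of the five grades."""
    for lower_bound, name in GRADES:
        if evaluation_value >= lower_bound:
            return name
    return "fail"

def maybe_cache(image, evaluation_value, cache, cutoff_grade="good"):
    """Keep the image in the cache unit for separate upload if its grade is at or above the cutoff."""
    order = [name for _, name in GRADES]             # best to worst
    if order.index(grade_of(evaluation_value)) <= order.index(cutoff_grade):
        cache.append(image)
```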
  • the method may adopt a multi-target object evaluation method to determine the evaluation value of the second image, where the multi-target object evaluation method is a method in which the evaluation value of the second image is determined after each target object included in the second image is evaluated separately. That is to say, in the multi-target evaluation method, the evaluation value is related to the sub-evaluation values of the multiple target objects in the second image.
  • the evaluation value of the second image may be determined by using the sub-evaluation value obtained for each target object and the corresponding sub-evaluation weight.
  • a corresponding sub-evaluation value may be calculated for each target object in the plurality of target objects, that is, the target object area corresponding to each target object may be evaluated from different dimensions by using the evaluation indicators described above.
  • in this way, the sub-evaluation value of each target object is obtained. For example, in the case where it is determined that the second image includes a first target object, a second target object and a third target object, the second image contains a first target object area corresponding to the first target object, a second target object area corresponding to the second target object, and a third target object area corresponding to the third target object.
  • the sub-evaluation values of the first target object, the second target object and the third target object are calculated respectively.
  • the index value corresponding to each evaluation index can be calculated, and the sub-evaluation value of the target object can finally be calculated by using these index values.
  • the evaluation methods for each target object may be the same or different. As an example, different evaluation methods may be determined according to the category of the target object. For example, the evaluation method for pedestrians is different from the evaluation method for motor vehicles.
  • for example, the evaluation value of the second image may be calculated as a weighted sum of the sub-evaluation values, S = Σ_i a_i · b_i, where S indicates the evaluation value of the second image, a_i indicates the sub-evaluation value of the i-th target object in the second image, and b_i indicates the sub-evaluation weight of the i-th target object.
  • the sub-evaluation weight may be a weight preset by a user. It may also be a weight determined according to the characteristic value of the target object. For example, different sub-evaluation weights may be assigned according to the category of the target object, and different sub-evaluation weights may also be assigned according to the size of each target object area, and so on.
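A minimal sketch of the multi-target combination is given below, using the weighted-sum reading of S, a_i and b_i; the area-proportional weights are an assumption chosen for illustration, since weights could equally be preset or assigned per object category.

```python
def multi_object_evaluation(sub_values, region_areas):
    """Compute S = sum_i a_i * b_i, here with weights b_i proportional to each
    target object region's area (one possible weighting option)."""
    total_area = float(sum(region_areas))
    weights = [area / total_area for area in region_areas]   # b_i, normalized to sum to 1
    return sum(a * b for a, b in zip(sub_values, weights))   # S
```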
  • it is then determined whether the evaluation value is higher than a predetermined threshold. If the evaluation value is higher than the predetermined threshold, the image quality of the second image is high; if it is lower than the predetermined threshold, the image quality of the second image is not high. If the evaluation value is higher than the predetermined threshold, the second image is sent to the buffer unit, and if the evaluation value is lower than the predetermined threshold, the second image may be discarded.
  • alternatively, different grades may be divided according to the evaluation value; each grade corresponds to a range of evaluation values, and if the evaluation value of the second image falls within a given range, the second image corresponds to that grade.
  • It can be divided into five grades according to the evaluation value, and the five grades can include excellent, good, medium, pass and fail.
  • an image whose evaluation value is above a certain level may be sent to the buffer unit, and if it is lower than a certain level, the image may be discarded.
  • the certain level may be a good level.
  • the method may send an image that meets the preset requirements together with a video stream generated by encoding to the monitoring platform server, that is, send the image stored in the cache unit to the monitoring platform server together with the video stream.
  • the above-mentioned terminal and the like include corresponding hardware structures and/or software modules for executing each function.
  • the embodiments of the present application can be implemented in hardware or a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends on the specific application and design constraints of the technical solution. Those skilled in the art may use different methods to implement the described functions for each specific application, but such implementations should not be considered to be beyond the scope of the embodiments of the present application.
  • each functional module may be divided corresponding to each function, or two or at least two functions may be integrated into one processing module.
  • the above-mentioned integrated modules can be implemented in the form of hardware, and can also be implemented in the form of software function modules. It should be noted that the division of modules in the embodiments of the present application is schematic, and is only a logical function division, and other division methods may be used in actual implementation.
  • FIG. 5 shows a block diagram of a target object detection device according to an embodiment of the present application.
  • the target object detection device 500 includes an image acquisition unit 510, a target object detection unit 520, an encoding unit 530 and a sending unit 540. The image acquisition unit 510 is used for acquiring at least one image; the target object detection unit 520 is used for performing target object detection on the at least one image, determining a target image with a target object, and discarding images without a target object; the encoding unit 530 is configured to perform encoding on the target image to generate an encoded image; and the sending unit 540 is used for sending the encoded image to the monitoring platform server.
  • the target object detection device 500 further includes an image quality evaluation unit, configured to perform evaluation on the image quality of the target image, and determine the evaluation value of the target image.
  • an image quality evaluation unit configured to perform evaluation on the image quality of the target image, and determine the evaluation value of the target image.
  • the image quality evaluation unit is specifically configured to use the sub-evaluation values of the multiple target objects to determine the evaluation value when the target image includes multiple target objects.
  • the image quality evaluation unit is further configured to determine whether the evaluation value satisfies a preset threshold; if the preset threshold is met, the target image is sent to a cache unit.
  • the target object detection device 500 further includes: a cache unit, configured to store the target image satisfying the preset threshold.
  • the sending unit 540 is further configured to send the image in the cache unit to the monitoring platform server.
  • the evaluation value is related to one or more of the following: the sharpness of the target object area where the target object is located; the number of pixels in the target object area; the shooting angle at which the target object area is captured; and the number of key features the target object possesses.
  • the target object detection unit 520 is specifically configured to select a representative image from the at least one image and perform target object detection on the representative image; if there is a target object in the representative image, it is determined that each of the at least one image is a target image with a target object.
  • An embodiment of the present application provides a camera, including: a lens for collecting light; a sensor for generating an image by performing photoelectric conversion on the light collected by the lens; and a processor or processor cluster for executing the above-mentioned method.
  • Embodiments of the present application provide a non-volatile computer-readable storage medium on which computer program instructions are stored, and when the computer program instructions are executed by a processor, implement the above method.
  • Computer program instructions can be executed by a video camera or by a general purpose computer.
  • Embodiments of the present application provide a computer program product, including computer-readable code, or a non-volatile computer-readable storage medium carrying computer-readable code, where when the computer-readable code runs in a processor of an electronic device, the processor in the electronic device executes the above method.
  • a computer-readable storage medium may be a tangible device that can hold and store instructions for use by the instruction execution device.
  • the computer-readable storage medium may be, for example, but not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing.
  • Computer-readable storage media include: portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random access memory (SRAM), portable compact disc read-only memory (CD-ROM), digital video discs (DVD), memory sticks, floppy disks, mechanically encoded devices such as punch cards or raised structures in grooves on which instructions are stored, and any suitable combination of the foregoing.
  • the computer readable program instructions or code described herein may be downloaded to various computing/processing devices from a computer readable storage medium, or to an external computer or external storage device over a network such as the Internet, a local area network, a wide area network and/or a wireless network.
  • the network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers.
  • a network adapter card or network interface in each computing/processing device receives computer-readable program instructions from a network and forwards the computer-readable program instructions for storage in a computer-readable storage medium in each computing/processing device .
  • the computer program instructions used to perform the operations of the present application may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-related instructions, microcode, firmware instructions, state-setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk and C++, and conventional procedural programming languages such as the “C” language or similar programming languages.
  • the computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server.
  • the remote computer can be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or can be connected to an external computer (for example, through the Internet using an Internet service provider).
  • in some embodiments, electronic circuits, such as programmable logic circuits, field-programmable gate arrays (FPGA), or programmable logic arrays (PLA), can be personalized by utilizing state information of the computer-readable program instructions, and the electronic circuits can execute the computer-readable program instructions to implement various aspects of the present application.
  • These computer-readable program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, when executed by the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
  • These computer-readable program instructions may also be stored in a computer-readable storage medium; the instructions cause a computer, programmable data processing apparatus and/or other device to operate in a specific manner, so that the computer-readable medium storing the instructions constitutes an article of manufacture that includes instructions for implementing various aspects of the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
  • Computer-readable program instructions can also be loaded onto a computer, other programmable data processing apparatus, or other device to cause a series of operational steps to be performed on the computer, other programmable apparatus, or other device to produce a computer-implemented process, so that the instructions executed on the computer, other programmable apparatus, or other device implement the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
  • each block in the flowcharts or block diagrams may represent a module, a segment, or a portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s).
  • the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved.
  • each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented in hardware (e.g., circuits or application-specific integrated circuits (ASICs) that perform the corresponding functions or actions), or can be implemented by a combination of hardware and software, such as firmware.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Quality & Reliability (AREA)
  • Studio Devices (AREA)
  • Train Traffic Observation, Control, And Security (AREA)

Abstract

A target object detection method and a device thereof. The method comprises performing target object detection on images photographed by a camera, wherein only an image having a target object is encoded and transmitted to a monitoring platform server, and an image having no target object is not encoded and is not transmitted to the monitoring platform server.

Description

Target object detection method and device thereof
This application claims priority to Chinese patent application No. 202110099193.0, entitled “Target object detection method and device thereof” and filed with the Chinese Patent Office on January 25, 2021, the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the field of monitoring, and in particular, to a target object detection method and a device thereof.
Background
With the development of video surveillance technology, video surveillance has evolved from simply recording video toward intelligence. This intelligent trend also deeply affects the development of cameras, and current cameras show an increasingly intelligent trend. At the same time, thanks to the rapid development of computer vision, deep learning and other technologies, the automatic recognition accuracy of smart cameras has surpassed that of humans.
As cameras become more and more intelligent, monitoring efficiency is improved and labor costs are saved, which plays an important role in promoting the modernization of safe cities and smart cities as well as the automation of industrial production. One of the difficulties in the related art is how to use the camera to reduce the analysis pressure on the background analysis system and to improve its accuracy.
Summary of the Invention
In view of this, a target object detection method and a device thereof are proposed, which can reduce the amount of encoded image data.
In a first aspect, an embodiment of the present application provides a target object detection method, the method including: acquiring a first image; detecting whether the first image includes a target object; when the first image does not include the target object, not performing encoding on the first image; acquiring a second image; detecting whether the second image includes the target object; and when the second image includes the target object, performing encoding on the second image and sending the encoded second image to the monitoring platform server.
As cameras become more and more intelligent, a camera can perform relevant image processing on each image in a video stream after acquiring the video stream. Therefore, if the smart camera can identify valid images and transmit only those to the monitoring platform server, this not only reduces the data transmission pressure, but also reduces the storage pressure and data processing pressure of the monitoring platform server. Based on this, the target detection method performs target object detection on the images captured by the camera and encodes only the images that contain a target object; images without a target object need not be encoded, thus reducing the number of encoded images. In addition, images without a target object may not be sent to the monitoring platform server, thereby reducing the subsequent transmission amount and the storage cost of the monitoring platform server.
According to the first aspect, in a first possible implementation of the first aspect, acquiring the first image includes: acquiring at least one image within a preset interval, where the preset interval includes a preset time interval and a preset number interval; and selecting a first image from the at least one image in an image selection manner; the method further includes: determining not to perform encoding on the at least one image.
In implementation, considering that real-time detection is very demanding on the processing capability of the smart camera, multiple images within a preset interval can be acquired, and then one of these images is selected as a representative image, which can reduce the processing pressure of the smart camera. If the first image is not encoded, none of these images are encoded, which can further reduce the processing pressure of the smart camera. In other words, the processing applied to the representative image is taken as the processing applied to the multiple images as a whole, where the processing includes: encoding/not encoding, and sending/not sending to the monitoring platform server.
According to the first aspect, in a second possible implementation of the first aspect, the method further includes: determining, by using an evaluation method, an evaluation value of the second image, where the evaluation value is used to describe the image quality of the second image.
In the related art, usually only images including a single target object are evaluated, which ignores the correlation among multiple target objects and thus underestimates the importance of the image. In order to better measure the value of the second image, an evaluation method may be used to determine the evaluation value of the second image that already contains the target object. That is, evaluating the second image by a quantized value (the evaluation value) enables the user to evaluate the second image more intuitively.
According to the first aspect, in a third possible implementation of the first aspect, the evaluation method includes: when the second image includes multiple target objects, the evaluation value of the second image is correlated with the sub-evaluation values of the multiple target objects in the second image.
That is to say, when the second image includes multiple target objects, the evaluation of the second image involves each target object in the second image, so the value of the second image can be determined more accurately; in particular, this approach takes the correlation between the multiple target objects into account, so that the value of the second image can be measured more accurately.
According to the first aspect, in a fourth possible implementation of the first aspect, the evaluation value is related to one or more of the following: the sharpness of the target object area where the target object is located; the number of pixels in the target object area; the shooting angle at which the target object area is captured; and the number of key features possessed by the target object.
In implementation, the second image may be evaluated from one or more of the above four aspects, so that the second image can be evaluated more accurately.
According to the first aspect, in a fifth possible implementation of the first aspect, the method further includes: judging whether the evaluation value satisfies a preset threshold; and if the preset threshold is satisfied, sending the second image to the monitoring platform server.
That is, the method can separately send images with high image quality to the monitoring platform server, so that the monitoring platform server can perform separate or focused analysis on these images, thereby reducing the processing pressure of the monitoring platform server and improving processing efficiency.
According to the first aspect, in a sixth possible implementation of the first aspect, the method further includes discarding the first image.
That is, after deciding not to encode the first image that does not include the target object, the method can discard the first image, thereby saving storage space.
In a second aspect, an embodiment of the present application provides a camera, the camera including: a lens for receiving light used to generate an image; and a camera body for performing the target object detection method of the first aspect or of one or more of the possible implementations of the first aspect.
In a third aspect, an embodiment provides a target object detection device, the device including: an image acquisition unit configured to acquire at least one image; a target object detection unit configured to perform target object detection on the at least one image, determine a target image with a target object, and discard images without a target object; an encoding unit configured to perform encoding on the target image to generate an encoded image; and a sending unit configured to send the encoded image to the monitoring platform server.
According to the third aspect, in a first possible implementation of the third aspect, the device further includes: an image quality evaluation unit configured to evaluate the image quality of the target image and determine an evaluation value of the target image.
According to the third aspect, in a second possible implementation of the third aspect, the image quality evaluation unit is specifically configured to, when the target image includes multiple target objects, determine the evaluation value by using the sub-evaluation values of the multiple target objects.
According to the third aspect, in a third possible implementation of the third aspect, the image quality evaluation unit is further configured to determine whether the evaluation value satisfies a preset threshold, and if the preset threshold is satisfied, send the target image to a cache unit.
According to the third aspect, in a fourth possible implementation of the third aspect, the device further includes: a cache unit configured to store target images that satisfy the preset threshold.
According to the third aspect, in a fifth possible implementation of the third aspect, the sending unit is further configured to send the images in the cache unit to the monitoring platform server.
According to the third aspect, in a sixth possible implementation of the third aspect, the evaluation value is related to one or more of the following: the sharpness of the target object area where the target object is located; the number of pixels in the target object area; the shooting angle at which the target object area is captured; and the number of key features possessed by the target object.
According to the third aspect, in a seventh possible implementation of the third aspect, the target object detection unit is specifically configured to select a representative image from the at least one image and perform target object detection on the representative image; if the representative image contains a target object, it is determined that each of the at least one image is a target image with a target object.
In a fourth aspect, an embodiment of the present application provides a camera, including: a lens for collecting light; a sensor for generating an image by performing photoelectric conversion on the light collected by the lens; and a processor or processor cluster for executing the target object detection method of the first aspect or of one or more of the possible implementations of the first aspect.
In a fifth aspect, embodiments of the present application provide a non-volatile computer-readable storage medium on which computer program instructions are stored, where the computer program instructions, when executed by a processor, implement the target object detection method of the first aspect or of one or more of the possible implementations of the first aspect.
These and other aspects of the present application will be more clearly understood in the following description of the embodiment(s).
Description of the Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate exemplary embodiments, features and aspects of the present application and, together with the description, serve to explain the principles of the present application.
FIG. 1 is a schematic diagram of an application scenario according to an embodiment of the present application;
FIG. 2 is a diagram of the data processing of a smart camera according to an embodiment of the present application;
FIG. 3 is a schematic structural diagram of a target detection system according to an embodiment of the present application;
FIG. 4 is a flowchart of the steps of a target object detection method according to an embodiment of the present application;
FIG. 5 is a block diagram of a target object detection device according to an embodiment of the present application.
Detailed Description of Embodiments
Various exemplary embodiments, features and aspects of the present application are described in detail below with reference to the accompanying drawings. The same reference numerals in the drawings denote elements with the same or similar functions. Although various aspects of the embodiments are shown in the drawings, the drawings are not necessarily drawn to scale unless otherwise indicated.
In the embodiments of the present application, "/" may indicate an "or" relationship between the associated objects; for example, A/B may mean A or B. "And/or" may describe three possible relationships between associated objects; for example, "A and/or B" may mean: A alone, both A and B, or B alone, where A and B may be singular or plural. To facilitate the description of the technical solutions of the embodiments, words such as "first" and "second" may be used to distinguish technical features with the same or similar functions. The words "first", "second" and the like do not limit quantity or order of execution, nor do they require the features to be different. In the embodiments of the present application, words such as "exemplary" or "for example" are used to present an example, illustration or explanation; any embodiment or design described as "exemplary" or "for example" should not be construed as preferred or more advantageous than other embodiments or designs. Rather, such words are intended to present the relevant concepts in a concrete manner to aid understanding.
The word "exemplary" is used here exclusively to mean "serving as an example, embodiment or illustration". Any embodiment described here as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments.
In addition, numerous specific details are given in the following detailed description in order to better illustrate the present application. Those skilled in the art will understand that the present application can be practiced without certain of these details. In some instances, methods, means, elements and circuits well known to those skilled in the art are not described in detail, so as to highlight the subject matter of the present application.
The technical solution of the present application is applicable to the field of video surveillance, which is an important part of security protection systems. For ease of understanding, the application scenario of the technical solution is briefly described below with reference to FIG. 1.
FIG. 1 is a schematic diagram of an application scenario to which the technical solution provided by the present application is applicable. As shown in FIG. 1, a video surveillance system can be used to monitor road conditions. Before video surveillance is performed, the target objects of the surveillance need to be set. As an example, the target objects include pedestrians, non-motor vehicles and motor vehicles.
The video surveillance system may include devices with audio/video capture functions and a monitoring platform server in data communication with these devices. In FIG. 1 the video surveillance system includes only four video capture devices, but in practice it may include more or fewer video/audio capture devices as needed.
As an example, the video capture device may be a camera, and cameras include ordinary cameras and smart cameras. An ordinary camera converts the captured video data to a suitable bit rate and uploads it to the monitoring platform server; that is, it relies on the monitoring platform server to process the captured video data (data processing such as object recognition). A smart camera, by contrast, can use its embedded intelligent processing module to perform image processing on the video data first and then upload the processed video data to the monitoring platform server. The intelligent processing module may include modules such as a face recognition module and a license plate recognition module.
Whether an ordinary camera or a smart camera, the camera of the present application includes a lens and a camera body. The lens receives the light used to generate an image. Specifically, the function of the lens is to project the optical image of the observed target onto the sensor of the camera, a process also known as optical imaging. A lens combines optical components of various shapes and media (plastic, glass or crystal), such as reflectors, transmission mirrors and prisms, in a certain arrangement, so that after transmission or reflection through these components the light changes its direction of propagation as required and is received by the receiving device, completing the optical imaging of the object. In general, each lens consists of several groups of lens elements with different surface curvatures combined at different spacings. The choice of spacing, element curvature, light transmittance and other parameters determines the focal length of the lens. The main parameters of a lens include effective focal length, aperture, maximum image plane, field of view, distortion and relative illumination; the values of these parameters determine the overall performance of the lens.
The camera body may include a sensor and a processor. A sensor (also called an image sensor) is a device that converts an optical image into an electronic signal, and is widely used in digital cameras and other electro-optical devices. Common sensors include the charge-coupled device (CCD) and the complementary metal-oxide-semiconductor (CMOS) sensor. Both CCD and CMOS sensors contain a large number (for example, tens of millions) of photodiodes; each photodiode is called a photosensitive cell and corresponds to one pixel. During exposure, each photodiode converts the received light into an electrical signal containing brightness (or brightness and color), from which the image is reconstructed. The Bayer array is a common image sensor technology that can be applied to both CCD and CMOS sensors. It uses Bayer color filters so that each pixel is sensitive to only one of the three primary colors (red, green or blue); these pixels are interleaved, and the original image is then recovered by demosaicing interpolation. A sensor that uses a Bayer array is also called a Bayer sensor. Besides Bayer sensors, there are other sensor technologies such as X3 (developed by Foveon), which uses three layers of photosensitive elements, each layer recording one of the RGB color channels, so that all colors can be captured at a single pixel.
The processor (also called an image processor), for example a system on chip (SoC), converts the image produced by the sensor into a three-channel format (for example YUV), improves the image quality, detects whether the image contains a target object, and encodes the image. In the case of a smart camera, the processor may include the intelligent processing module described above. In embodiments of the present invention there may be a single processor (for example a multi-function integrated SoC) or a cluster of multiple processors (for example an ISP, an encoder and other processors).
As shown in FIG. 2, the smart camera 201 can transmit data to the monitoring platform server over a communication network. As an example, the data can be transmitted to a storage unit 202 (for example a hard disk) of the monitoring platform server and stored there. The monitoring platform server is a device that can receive the data sent by the cameras, perform related processing on the data, and store the data. In practice, the monitoring platform server may be a single computing device or multiple computing devices, such as a server, a server cluster, or a public/private cloud.
Since a smart camera can perform image processing on the captured images and send the processed images to the monitoring platform server, the video surveillance system can configure in advance the data to be sent to the monitoring platform server. As an example, the data sent to the monitoring platform server may include the video stream collected by the smart camera, the target images determined by the intelligent processing module, and the regions of interest (ROI) in those target images. The region of interest is usually the region in which the target object is located, so it may also be called the target object region. In addition, the smart camera can also send configured content, for example the captured video stream and recognized face images, to the storage unit 202 through the server. In an exemplary embodiment of the present application, the smart camera may send to the monitoring platform a video stream encoded only from target images containing the target object, together with target images that meet a preset image quality.
As an example, FIG. 3 is a schematic structural diagram of a target detection system. As shown in FIG. 3, the target detection system 300 includes a plurality of cameras 301 to 305 and a monitoring platform server 310. Each of the cameras 301 to 305 may be an ordinary camera or a smart camera; when the cameras 301 to 305 are smart cameras, the intelligent processing modules embedded in the individual cameras may be the same or different.
The cameras 301 to 305 can transmit the acquired video data to the monitoring platform server. The interface connecting the monitoring platform server 310 and the cameras 301 to 305 may be wired or wireless. Wired options include transmission control protocol/internet protocol (TCP/IP) over Ethernet, user datagram protocol (UDP), standard universal serial bus (USB) ports, COM interfaces and other similar standard ports. Wireless options include technologies such as WiFi, Bluetooth, ZigBee or ultra wideband (UWB). The connection can be chosen according to the actual application scenario and the hardware form of the camera.
FIG. 4 is a flowchart of the steps of a target object detection method according to an embodiment of the present application. In practice, the target object detection method shown in FIG. 4 may be performed by a smart camera in the target detection system.
In step S410, a first image is acquired. The first image may be an image acquired by the smart camera described above, or an image acquired by an ordinary camera and then sent to the corresponding smart camera.
As an example, the method may acquire at least one image within a preset interval and select the first image from the at least one image according to an image selection scheme. That is, since detecting every frame in real time places heavy demands on the processing capability of the smart camera, multiple images within a preset interval can be acquired and one of them selected as a representative image.
The preset interval mentioned here may be a time interval, for example, the frames captured within five seconds, or a count interval, for example, 10 consecutively captured frames. The image selection scheme indicates a user-preset way of choosing a representative image from the multiple images; for example, the middle image or the first frame may be selected as the representative image, which is not limited in the present application.
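As a non-limiting illustration, the following Python sketch shows one way such frame sampling and representative-image selection could be implemented. The batch size, the selection strategies and the function names are assumptions made here for illustration only and are not prescribed by the described method.

```python
from typing import List, Sequence

def select_representative(frames: Sequence, strategy: str = "middle"):
    """Pick one representative frame from the frames gathered in a preset interval.

    frames:   the images captured within the preset time/count interval
    strategy: "middle" picks the middle frame, "first" picks the first frame
    """
    if not frames:
        raise ValueError("the preset interval produced no frames")
    if strategy == "first":
        return frames[0]
    # default: the middle frame of the interval
    return frames[len(frames) // 2]

def batch_frames(stream: Sequence, count_interval: int = 10) -> List[List]:
    """Group a frame stream into consecutive batches of `count_interval` frames."""
    return [list(stream[i:i + count_interval])
            for i in range(0, len(stream), count_interval)]
```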
In step S420, it is detected whether the first image includes a target object. The target object is set in advance; in practice, the type of the target object may be a pedestrian, a motor vehicle and/or a non-motor vehicle. There may be a single target object or multiple target objects. In the case of multiple target objects, it is sufficient to detect one of them to determine that the first image includes a target object. For example, when the image contains several non-motor vehicles, or when the same image contains pedestrians, motor vehicles and non-motor vehicles at the same time, if it is detected that the first image contains a non-motor vehicle, it can be determined that the first image includes a target object.
The set target object corresponds to the intelligent processing module embedded in the smart camera; that is, the smart camera contains an intelligent processing module capable of detecting the target object, and this module can be used to determine whether the first image includes the target object. The intelligent processing module may be implemented by the SoC.
As an example, the intelligent processing module may refer to an artificial intelligence (AI) module corresponding to the target, or to a machine learning or deep learning module. An AI module here means loading a large amount of data into a computing device and selecting a model to "fit" the data so that the computing device can make predictions/inferences. The models used by computing devices range from simple equations (such as the equation of a straight line) to very complex logical/mathematical systems. Once a model is selected and tuned (that is, improved through adjustment), the computing device uses it to learn the patterns in the data. Finally, the model can be used to process the input data.
In practice, the AI module may be a module with the corresponding target detection capability, and the model it uses may be determined by the actual user or by technicians, for example, models for face recognition, pedestrian recognition or license plate recognition.
In step S430, when the first image does not include the target object, the first image is not encoded. That is, when it is determined that the first image does not include the target object, no encoding is performed on the first image. Optionally, the first image may be deleted (discarded), thereby reducing the subsequent transmission volume and the storage cost of the monitoring platform server. In practice, if the first image serves as a representative image, none of the images in the preset interval it represents are encoded, and all of them are deleted.
Encoding, sometimes also called compression, converts three-channel images (for example YUV images) into an encoded form convenient for transmission and viewing, for example a JPG image or an H.264/H.265 video.
As shown in FIG. 4, the method may further perform step S440 to acquire a second image. If, after step S420 is performed, it is determined that the second image includes the target object, step S450 is performed: the second image is encoded and the encoded second image is sent to the monitoring platform server. In practice, the second image may be encoded in the H.264/H.265 format and transmitted to the monitoring platform server. In the present application, a second image containing the target object may also be described as a target image. It should be noted that in FIG. 4 steps S410 and S440 are performed in parallel. In other embodiments the two may also be performed sequentially, that is, steps S410, S420 and S430 are performed first, and then steps S440, S420 and S450.
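As a rough illustration of the per-image decision in steps S410 to S450, the sketch below assumes hypothetical `detector`, `encoder` and `uploader` callables; the actual detection model, codec and transport used by the smart camera are not specified here.

```python
def process_image(image, detector, encoder, uploader) -> bool:
    """Encode and upload an image only if a target object is detected in it.

    Returns True if the image was sent to the monitoring platform server.
    """
    detections = detector(image)          # e.g. detected pedestrians / vehicles
    if not detections:
        # No target object: skip encoding entirely and discard the image.
        return False
    encoded = encoder(image)              # e.g. H.264/H.265 or JPG encoding
    uploader(encoded)                     # transmit to the monitoring platform server
    return True
```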
In addition, the method further includes evaluating the image quality of the second image when it is determined that the second image includes the target object. One evaluation approach is to evaluate the image using a single target object in it, for example, evaluating an image that contains only pedestrian A, or, for an image containing pedestrian A and pedestrian B, using the evaluation of pedestrian A or pedestrian B to represent the evaluation of the whole image. This approach ignores the association between multiple target objects. As an example, pedestrian A, pedestrian B and pedestrian C cross the road together: pedestrian A first walks with pedestrian B, and some time later pedestrians A, B and C walk together. In this process, if only the image containing pedestrian A, or only the relevant region of the image (that is, the region containing pedestrian A), is evaluated, the association between the multiple pedestrians in the image is ignored and the importance of the image is underestimated.
Therefore, the method may use an evaluation scheme to determine an evaluation value of the second image, where the evaluation value can be used to describe the image quality of the second image. In particular, when the second image includes multiple target objects, the evaluation value of the second image can be determined by the evaluation scheme.
As an example, the method may first determine the number of target objects included in the second image and then determine the specific evaluation scheme according to that number. The evaluation scheme may be a single-target-object evaluation scheme, used when the second image includes only a single target object, or a multi-target-object evaluation scheme, used when the second image includes at least two target objects.
In practice, the intelligent processing module mentioned above can be used to detect the target objects and then determine their number. When it is determined that the second image includes only a single target object, the single-target-object evaluation scheme can be used. Specifically, the target object can be evaluated in different dimensions according to image quality evaluation indicators to determine the evaluation value of the second image. In other words, the evaluation value is related to the evaluation indicators. The evaluation indicators include the sharpness of the target object region in which the target object is located, the number of key features possessed by the target object, the number of pixels in the target object region, and the shooting angle of the captured target object region.
The sharpness of the target object region refers to how clearly the fine details and their boundaries are rendered in the image; in practice, sharpness can be used to measure the details in the image. As mentioned above, the AI module needs the feature values of the key features of the target object when recognizing it, so the number of key features possessed by the target object determines the recognition rate of the AI module. The more pixels the target object region has, the larger the area the target object occupies in the image, which is more favorable for various kinds of image processing. The shooting angle indicates the angle at which the target object is captured, for example, the frontal face or the profile of the captured target object. As an example, the second image may be evaluated using one or more of the above evaluation indicators to determine its evaluation value.
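The sketch below shows one possible way to combine the four indicators into a single evaluation value. The individual metric functions, the normalization of the pixel count and the weights are illustrative assumptions (the embodiment does not prescribe a specific formula for the single-target case), and the region is assumed to be a NumPy-style H×W array.

```python
def single_target_score(region,
                        sharpness_fn, key_feature_fn, angle_fn,
                        weights=(0.4, 0.3, 0.2, 0.1)) -> float:
    """Combine the four evaluation indicators into one value in [0, 1].

    region:         the target object region (assumed to be an H x W image patch)
    sharpness_fn:   returns the sharpness of the region, normalized to [0, 1]
    key_feature_fn: returns the fraction of expected key features that are visible
    angle_fn:       returns how favorable the shooting angle is, in [0, 1]
    """
    w_sharp, w_feat, w_pixels, w_angle = weights
    # Normalize the pixel count against an assumed reference size of 128 x 128.
    pixel_score = min(1.0, region.shape[0] * region.shape[1] / (128 * 128))
    return (w_sharp * sharpness_fn(region)
            + w_feat * key_feature_fn(region)
            + w_pixels * pixel_score
            + w_angle * angle_fn(region))
```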
The method then determines whether the evaluation value is above a predetermined threshold. If it is above the threshold, the image quality of the second image is high; if it is below the threshold, the image quality of the second image is not high. When the evaluation value is above the predetermined threshold, the second image is sent to a cache unit; when it is below the threshold, the second image may be discarded. As an example, the method may send the images stored in the cache unit to the monitoring platform server together with the encoded video stream.
In practice, the evaluation values may be divided into different grades, each corresponding to a range of evaluation values; if the evaluation value of the second image falls within a range, the second image corresponds to that grade. As an example, the evaluation values may be divided into five grades: excellent, good, medium, pass and fail. The method may then send images whose evaluation value is at or above a particular grade to the cache unit and discard images below that grade; as an example, the particular grade may be the "good" grade. Finally, the images stored in the cache unit are sent separately to the monitoring platform server.
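A minimal sketch of this grading step follows. The grade boundaries are hypothetical; the embodiment only requires that each grade correspond to a range of evaluation values and that images below a chosen grade be discarded.

```python
GRADE_BOUNDARIES = [          # hypothetical lower bounds for the five grades
    ("excellent", 0.9),
    ("good", 0.75),
    ("medium", 0.6),
    ("pass", 0.5),
    ("fail", 0.0),
]

def grade(score: float) -> str:
    """Map an evaluation value to one of the five grades."""
    for name, lower in GRADE_BOUNDARIES:
        if score >= lower:
            return name
    return "fail"

def keep_for_cache(score: float, min_grade: str = "good") -> bool:
    """Keep the image only if its grade is at or above `min_grade`."""
    order = [name for name, _ in GRADE_BOUNDARIES]
    return order.index(grade(score)) <= order.index(min_grade)
```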
When it is determined that the second image includes at least two target objects, that is, the second image includes multiple target objects, the method may use the multi-target-object evaluation scheme to determine the evaluation value of the second image. The multi-target-object evaluation scheme determines the evaluation value of the second image after evaluating each target object included in the second image separately. In other words, the multi-target-object evaluation scheme is related to the sub-evaluation values of the multiple target objects in the second image.
Specifically, the evaluation value of the second image can be determined from the sub-evaluation value obtained for each target object and the corresponding sub-evaluation weight. In practice, a sub-evaluation value can be calculated for each of the multiple target objects; that is, the evaluation indicators described above are used to evaluate, in different dimensions, the target object region corresponding to each target object and obtain its sub-evaluation value. For example, when it is determined that the second image includes a first target object, a second target object and a third target object, a first target object region corresponding to the first target object, a second target object region corresponding to the second target object and a third target object region corresponding to the third target object can be determined in the second image.
Using the first, second and third target object regions, the sub-evaluation values of the first, second and third target objects are calculated respectively. In practice, for each target object, the indicator values corresponding to the individual evaluation indicators can be calculated and then used to compute the sub-evaluation value of that target object. It should be noted that the evaluation scheme may be the same or different for each target object. As an example, different evaluation schemes may be chosen according to the category of the target object; for example, the evaluation scheme for pedestrians differs from that for motor vehicles.
After the sub-evaluation values of the first, second and third target objects have been calculated, the evaluation value of the second image can be calculated using the following formula (1):
S = ∑ᵢ aᵢbᵢ    (1)
where S denotes the evaluation value of the second image, aᵢ denotes the sub-evaluation value of the i-th target object in the second image, and bᵢ denotes the sub-evaluation weight of the i-th target object. In practice, the sub-evaluation weight may be preset by the user, or it may be determined from characteristics of the target object; for example, different sub-evaluation weights may be assigned according to the category of the target object, or according to the size of each target object region, and so on.
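Formula (1) translates directly into code. In the sketch below, the per-category table used to pick the sub-evaluation weights bᵢ is an assumption shown for illustration only; the embodiment equally allows user-preset weights or weights based on region size.

```python
from typing import Dict, Sequence, Tuple

# Hypothetical per-category sub-evaluation weights (b_i); the values are illustrative.
CATEGORY_WEIGHTS: Dict[str, float] = {"pedestrian": 1.0, "motor": 0.8, "non_motor": 0.9}

def image_evaluation(targets: Sequence[Tuple[str, float]]) -> float:
    """Formula (1): S = sum_i a_i * b_i.

    targets: (category, sub_evaluation_value) for every target object in the image.
    """
    return sum(CATEGORY_WEIGHTS.get(category, 1.0) * sub_value
               for category, sub_value in targets)
```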
It is then determined whether the evaluation value is above the predetermined threshold. If the evaluation value is above the threshold, the image quality of the second image is high; if it is below the threshold, the image quality of the second image is not high. When the evaluation value is above the predetermined threshold, the second image is sent to the cache unit; when it is below the threshold, the second image may be discarded.
Similarly, the evaluation values may be divided into different grades, each corresponding to a range of evaluation values; if the evaluation value of the second image falls within a range, the second image corresponds to that grade. The values may be divided into five grades: excellent, good, medium, pass and fail. Images whose evaluation value is at or above a particular grade, for example the "good" grade, can then be sent to the cache unit, and images below that grade are discarded. As an example, the method may send the images that meet the preset requirement to the monitoring platform server together with the encoded video stream, that is, the images stored in the cache unit are sent to the monitoring platform server together with the video stream.
It can be understood that, in order to implement the above functions, the terminal and the like described above include corresponding hardware structures and/or software modules for performing each function. Those skilled in the art will readily appreciate that, in combination with the units and algorithm steps of the examples described in the embodiments disclosed here, the embodiments of the present application can be implemented in hardware or in a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends on the specific application and the design constraints of the technical solution. A skilled person may use different methods to implement the described functions for each specific application, but such implementation should not be considered to go beyond the scope of the embodiments of the present application.
In the embodiments of the present application, the terminal and the like may be divided into functional modules according to the above method examples; for example, each functional module may correspond to one function, or two or more functions may be integrated into one processing module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. It should be noted that the division of modules in the embodiments of the present application is schematic and is only a logical functional division; other divisions are possible in actual implementation.
In the case where each functional module is divided according to each function, FIG. 5 is a block diagram of a target object detection device according to an embodiment of the present application. The target object detection device 500 includes an image acquisition unit 510, a target object detection unit 520, an encoding unit 530 and a sending unit 540. The image acquisition unit 510 is configured to acquire at least one image; the target object detection unit 520 is configured to perform target object detection on the at least one image, determine target images containing a target object, and discard images without a target object; the encoding unit 530 is configured to encode the target images to generate encoded images; and the sending unit 540 is configured to send the encoded images to the monitoring platform server.
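Purely as an illustration of this unit decomposition, device 500 could be mirrored in code roughly as follows; the constructor arguments stand in for the concrete detector, encoder and transport, which the embodiment does not fix.

```python
class TargetObjectDetectionDevice:
    """Schematic counterpart of device 500 and its units 510 to 540."""

    def __init__(self, acquire, detect, encode, send):
        self.acquire = acquire   # image acquisition unit 510
        self.detect = detect     # target object detection unit 520
        self.encode = encode     # encoding unit 530
        self.send = send         # sending unit 540

    def run_once(self):
        for image in self.acquire():
            if self.detect(image):             # keep only target images
                self.send(self.encode(image))  # encode and upload
            # images without a target object are simply discarded
```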
Optionally, the target object detection device 500 further includes an image quality evaluation unit, configured to evaluate the image quality of the target image and determine an evaluation value of the target image.
Optionally, the image quality evaluation unit is specifically configured to, when the target image includes multiple target objects, determine the evaluation value using the sub-evaluation values of the multiple target objects.
Optionally, the image quality evaluation unit is further configured to determine whether the evaluation value of the target image satisfies a preset threshold and, when the preset threshold is satisfied, send the target image to a cache unit.
Optionally, the target object detection device 500 further includes a cache unit, configured to store target images that satisfy the preset threshold.
Optionally, the sending unit 540 is further configured to send the images in the cache unit to the monitoring platform server.
Optionally, the evaluation value is related to one or more of the following: the sharpness of the target object region in which the target object is located; the number of pixels in the target object region; the shooting angle at which the target object region is captured; and the number of key features possessed by the target object.
Optionally, the target object detection unit 520 is specifically configured to select a representative image from the at least one image, perform target object detection on the representative image, and, if the representative image contains a target object, determine that all of the at least one image are target images containing the target object.
An embodiment of the present application provides a camera, including: a lens, configured to collect light; a sensor, configured to generate an image by photoelectric conversion of the light collected by the lens; and a processor or processor cluster, configured to perform the method described above.
An embodiment of the present application provides a non-volatile computer-readable storage medium on which computer program instructions are stored, the computer program instructions implementing the above method when executed by a processor. The computer program instructions may be executed by a camera or by a general-purpose computer.
An embodiment of the present application provides a computer program product, including computer-readable code, or a non-volatile computer-readable storage medium carrying computer-readable code; when the computer-readable code runs in a processor of an electronic device, the processor in the electronic device performs the above method.
The computer-readable storage medium may be a tangible device that can hold and store instructions used by an instruction execution device. The computer-readable storage medium may be, for example, but is not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of computer-readable storage media include: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disc (DVD), a memory stick, a floppy disk, a mechanically encoded device such as a punch card or a raised structure in a groove with instructions recorded on it, and any suitable combination of the foregoing.
The computer-readable program instructions or code described here can be downloaded from a computer-readable storage medium to the respective computing/processing device, or to an external computer or external storage device via a network, for example the Internet, a local area network, a wide area network and/or a wireless network. The network may include copper transmission cables, optical fiber transmission, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. A network adapter card or network interface in each computing/processing device receives the computer-readable program instructions from the network and forwards them for storage in a computer-readable storage medium in the respective computing/processing device.
The computer program instructions used to carry out the operations of the present application may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or source or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk and C++, and conventional procedural programming languages such as the "C" language or similar languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it may be connected to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, electronic circuits, such as programmable logic circuits, field-programmable gate arrays (FPGA) or programmable logic arrays (PLA), are personalized using the state information of the computer-readable program instructions, and these electronic circuits can execute the computer-readable program instructions to implement various aspects of the present application.
Aspects of the present application are described here with reference to flowcharts and/or block diagrams of methods, apparatuses (systems) and computer program products according to embodiments of the present application. It should be understood that each block of the flowcharts and/or block diagrams, and combinations of blocks in the flowcharts and/or block diagrams, can be implemented by computer-readable program instructions.
These computer-readable program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer or another programmable data processing apparatus to produce a machine, such that the instructions, when executed by the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams. These computer-readable program instructions may also be stored in a computer-readable storage medium; these instructions cause a computer, a programmable data processing apparatus and/or other devices to operate in a particular manner, so that the computer-readable medium storing the instructions comprises an article of manufacture including instructions that implement aspects of the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
The computer-readable program instructions may also be loaded onto a computer, another programmable data processing apparatus or another device, so that a series of operational steps is performed on the computer, other programmable apparatus or other device to produce a computer-implemented process, such that the instructions executed on the computer, other programmable apparatus or other device implement the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality and operation of possible implementations of apparatuses, systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowcharts or block diagrams may represent a module, a program segment or a portion of instructions that contains one or more executable instructions for implementing the specified logical function. In some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the figures. For example, two consecutive blocks may in fact be executed substantially in parallel, and they may sometimes be executed in the reverse order, depending on the functions involved.
It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented with hardware that performs the corresponding functions or acts (for example, circuits or an application-specific integrated circuit (ASIC)), or with a combination of hardware and software, such as firmware.
Although the present invention has been described here in connection with various embodiments, those skilled in the art, in practicing the claimed invention, can understand and effect other variations of the disclosed embodiments by studying the drawings, the disclosure and the appended claims. In the claims, the word "comprising" does not exclude other components or steps, and "a" or "an" does not exclude a plurality. A single processor or other unit may fulfil the functions of several items recited in the claims. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used to advantage.
The embodiments of the present application have been described above. The foregoing description is exemplary, not exhaustive, and is not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope of the described embodiments. The terminology used here was chosen to best explain the principles of the embodiments, their practical application or their improvement over technologies in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed here.

Claims (18)

  1. A target object detection method, characterized in that it comprises:
    acquiring a first image;
    detecting whether the first image includes a target object;
    when the first image does not include the target object, not encoding the first image;
    acquiring a second image;
    detecting whether the second image includes the target object;
    when the second image includes the target object, encoding the second image and sending the encoded second image to a monitoring platform server.
  2. The method according to claim 1, characterized in that acquiring the first image comprises:
    acquiring at least one image within a preset interval, wherein the preset interval comprises a preset time interval and a preset count interval;
    selecting the first image from the at least one image according to an image selection scheme; the method further comprising:
    determining not to encode the at least one image.
  3. The method according to claim 1, characterized by further comprising:
    determining an evaluation value of the second image by using an evaluation scheme, the evaluation value being used to describe the image quality of the second image.
  4. The method according to claim 3, characterized in that the evaluation scheme comprises:
    when the second image includes multiple target objects, the evaluation value of the second image is related to the sub-evaluation values of the multiple target objects.
  5. The method according to claim 3 or 4, characterized in that the evaluation value is related to one or more of the following:
    the sharpness of the target object region in which the target object is located;
    the number of pixels in the target object region;
    the shooting angle at which the target object region is captured; and
    the number of key features possessed by the target object.
  6. The method according to any one of claims 3 to 5, characterized by further comprising:
    determining whether the evaluation value satisfies a preset threshold;
    if the preset threshold is satisfied, sending the second image to the monitoring platform server.
  7. The method according to any one of claims 1 to 6, characterized by further comprising:
    discarding the first image.
  8. A camera, characterized in that the camera comprises:
    a lens, configured to receive light used to generate an image;
    a camera body, configured to perform the method according to any one of claims 1 to 7.
  9. A target object detection device, characterized by comprising:
    an image acquisition unit, configured to acquire at least one image;
    a target object detection unit, configured to determine a target image containing a target object by performing target object detection on the at least one image, and to discard images that do not contain a target object;
    an encoding unit, configured to encode the target image to generate an encoded image;
    a sending unit, configured to send the encoded image to a monitoring platform server.
  10. The device according to claim 9, characterized by further comprising:
    an image quality evaluation unit, configured to evaluate the image quality of the target image and determine an evaluation value of the target image.
  11. The device according to claim 10, characterized in that the image quality evaluation unit is specifically configured to, when the target image includes multiple target objects, determine the evaluation value using the sub-evaluation values of the multiple target objects.
  12. The device according to claim 10 or 11, characterized in that the image quality evaluation unit is further configured to determine whether the evaluation value satisfies a preset threshold and, when the preset threshold is satisfied, send the target image to a cache unit.
  13. The device according to claim 12, characterized by further comprising:
    a cache unit, configured to store target images that satisfy the preset threshold.
  14. The device according to claim 13, characterized in that the sending unit is further configured to send the images in the cache unit to the monitoring platform server.
  15. The device according to any one of claims 11 to 14, wherein the evaluation value is related to one or more of the following:
    the clarity of the target object region where the target object is located;
    the number of pixels in the target object region;
    the shooting angle at which the target object region is photographed; and
    the number of key features that the target object has.
  16. The device according to any one of claims 9 to 15, wherein the target object detection unit is specifically configured to: select a representative image from the at least one image; perform target object detection on the representative image; and, if the representative image contains a target object, determine that each of the at least one image is a target image having the target object.
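A sketch of the representative-image shortcut in claim 16; choosing the middle image of the group as the representative is an assumption, since the selection rule is not fixed by the claim.

```python
def label_group(images: list, detector) -> list:
    """Run detection once on a representative image and apply the result to the group."""
    if not images:
        return []
    representative = images[len(images) // 2]  # assumed selection strategy
    if detector.has_target(representative):
        return list(images)  # every image in the group is treated as a target image
    return []                # no target in the representative: the group yields no target images
```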
  17. A camera, comprising:
    a lens, configured to collect light;
    a sensor, configured to generate an image by performing photoelectric conversion on the light collected by the lens; and
    a processor and a processor cluster, configured to perform the method according to any one of claims 1 to 7.
  18. A non-volatile computer-readable storage medium storing computer program instructions, wherein, when the computer program instructions are executed by a processor, the method according to any one of claims 1 to 7 is implemented.
PCT/CN2022/073151 2021-01-25 2022-01-21 Target object detection method and device thereof WO2022156763A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110099193.0 2021-01-25
CN202110099193.0A CN114898239A (en) 2021-01-25 2021-01-25 Target object detection method and device thereof

Publications (1)

Publication Number Publication Date
WO2022156763A1 (en)

Family

ID=82548515

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/073151 WO2022156763A1 (en) 2021-01-25 2022-01-21 Target object detection method and device thereof

Country Status (2)

Country Link
CN (1) CN114898239A (en)
WO (1) WO2022156763A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060078217A1 (en) * 2004-05-20 2006-04-13 Seiko Epson Corporation Out-of-focus detection method and imaging device control method
CN107291810A (en) * 2017-05-18 2017-10-24 深圳云天励飞技术有限公司 Data processing method, device and storage medium
CN110868600A (en) * 2019-11-11 2020-03-06 腾讯云计算(北京)有限责任公司 Target tracking video plug-flow method, display method, device and storage medium
CN111340140A (en) * 2020-03-30 2020-06-26 北京金山云网络技术有限公司 Image data set acquisition method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN114898239A (en) 2022-08-12

Similar Documents

Publication Publication Date Title
US10986338B2 (en) Thermal-image based video compression systems and methods
JP5882455B2 (en) High resolution multispectral image capture
JP6732902B2 (en) Imaging device and imaging system
TWI713794B (en) Method for identifying events in a motion video
EP3579145A1 (en) Method and device for image processing, computer readable storage medium, and electronic device
WO2019196539A1 (en) Image fusion method and apparatus
US8798369B2 (en) Apparatus and method for estimating the number of objects included in an image
WO2020094088A1 (en) Image capturing method, monitoring camera, and monitoring system
TWI522967B (en) Method and apparatus for moving object detection based on cerebellar model articulation controller network
JP2016129347A (en) Method for automatically determining probability of image capture with terminal using contextual data
CN107704798B (en) Image blurring method and device, computer readable storage medium and computer device
CN103905727A (en) Object area tracking apparatus, control method, and program of the same
WO2022237591A1 (en) Moving object identification method and apparatus, electronic device, and readable storage medium
JP7024736B2 (en) Image processing equipment, image processing method, and program
CN107613216A (en) Focusing method, device, computer-readable recording medium and electronic equipment
US20230127009A1 (en) Joint objects image signal processing in temporal domain
TWI521473B (en) Device, method for image analysis and computer-readable medium
US20120033854A1 (en) Image processing apparatus
CN110929615B (en) Image processing method, image processing apparatus, storage medium, and terminal device
US9189863B2 (en) Method and system for detecting motion capable of removing shadow by heat
WO2022156763A1 (en) Target object detection method and device thereof
CN107959840A (en) Image processing method, device, computer-readable recording medium and computer equipment
CN111800605A (en) Gun-ball linkage based vehicle shape and license plate transmission method, system and equipment
Law et al. Performance enhancement of PRNU-based source identification for smart video surveillance
US12035033B2 (en) DNN assisted object detection and image optimization

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 22742242; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 22742242; Country of ref document: EP; Kind code of ref document: A1)