WO2020151084A1 - Method, device and system for monitoring a target object - Google Patents
Method, device and system for monitoring a target object
- Publication number
- WO2020151084A1 (PCT/CN2019/080747)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- target
- video
- image
- server
- video frame
- Prior art date
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
Definitions
- This application relates to the computer field, and in particular to a method, device and system for monitoring a target object.
- Currently, a target object is usually monitored by identifying it in the captured video, but this approach is often inefficient.
- The embodiments of the present application provide a method, device, and system for monitoring a target object, so as to at least solve the problem of low efficiency in monitoring the target object in the related art.
- A method for monitoring a target object, including: a first server receives an image sent by a video surveillance device when a moving object is detected in a target area, wherein the image is obtained from a target video, that is, the part of the video captured by the video surveillance device of the target area in which the object appears; the first server determines, according to the image, whether the object is a target object.
- The method further includes: in a case where the object is determined to be the target object, the first server obtains the target video.
- The first server acquiring the target video includes: the first server acquiring the target video from the video surveillance device; or the first server acquiring the target video from a second server, wherein the target video is sent to the second server by the video surveillance device when a moving object is detected in the target area.
- The method further includes: in a case where it is determined that the object is not the target object, the first server sends instruction information to the second server, where the instruction information instructs the second server to delete the target video.
- The method further includes: the first server determines, in the target video, a movement track of the target object in the target area.
- The method further includes: the first server generates prompt information according to the movement track, wherein the prompt information prompts a way to eliminate the target object.
- The method further includes: the first server generates alarm information corresponding to the target object, wherein the alarm information indicates that the target object appears in the target area and includes at least one of the following: the target video, the movement track, and the prompt information; the first server sends the alarm information to a client.
- The method further includes: when the video surveillance device detects that a moving object appears in the target area, it intercepts a video image at predetermined intervals from the video obtained by shooting the target area, starting when the object appears in the target area and continuing until the object no longer appears there, wherein the image includes the video image; the video surveillance device either sends each intercepted video image to the first server in real time, or obtains an image set including all the intercepted video images and sends the image set to the first server.
- The first server determining whether the object is the target object according to the image includes: the first server recognizes whether the object in each received video image is the target object, obtaining a recognition result for each video image; the first server merges the recognition results of all received video images into a target result; the first server determines, according to the target result, whether the object is the target object.
- The first server recognizing whether the object in each received video image is the target object includes: the first server determines whether the object appears in each received video image; the first server then recognizes whether the object in each video image in which it appears is the target object.
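The two-stage recognition and the merging of per-image results can be sketched as follows. The text does not say how the recognition results are merged, so the ratio-based vote below, the `FrameResult` structure, and the `min_ratio` parameter are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class FrameResult:
    """Recognition result for one received video image (hypothetical structure)."""
    object_present: bool  # stage 1: does an object appear in this image?
    is_target: bool       # stage 2: is that object the target object?


def merge_results(results: Iterable[FrameResult], min_ratio: float = 0.5) -> bool:
    """Merge per-image recognition results into a single target result.

    Assumed rule: only images in which an object appears are counted, and the
    object is declared the target object when at least `min_ratio` of those
    images were recognized as showing the target object.
    """
    with_object = [r for r in results if r.object_present]
    if not with_object:
        return False
    hits = sum(1 for r in with_object if r.is_target)
    return hits / len(with_object) >= min_ratio
```

A stricter or looser `min_ratio` trades false alarms against missed detections; any other fusion rule could be substituted without changing the surrounding flow.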
- the first server determining whether the object is a target object according to the image includes:
- the first server performs target object detection on each target video frame image to obtain the image feature of each target video frame image, wherein the image includes multiple target video frame images obtained from the target video, each target video frame image indicates the object in the target area, and the image feature indicates the target image area where an object whose similarity to the target object is greater than a first threshold is located;
- the first server determines the motion feature according to the image feature of each of the target video frame images, where the motion feature indicates the motion speed and the motion direction of the object in the multiple target video frame images;
- the first server determines whether the target object appears in the multiple target video frame images according to the motion feature and the image feature of each target video frame image.
- the first server determining the motion feature according to the image feature of each target video frame image includes:
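- obtaining a target vector corresponding to the target image area indicated by the image feature of each target video frame image, yielding multiple target vectors, wherein each target vector represents the moving speed and moving direction of the object in the corresponding target video frame image when passing through the target image area; arranging the multiple target vectors according to the time sequence of the target video frame images in the video file to form a first target vector, wherein the motion feature includes the first target vector; or
- obtaining a two-dimensional optical flow diagram corresponding to the target image area indicated by the image feature of each target video frame image, yielding multiple two-dimensional optical flow diagrams, wherein each two-dimensional optical flow diagram includes the moving speed and moving direction of the object in the corresponding target video frame image when passing through the target image area; arranging the multiple two-dimensional optical flow diagrams according to the time sequence of the target video frame images in the video file to compose a three-dimensional second target vector, wherein the motion feature includes the three-dimensional second target vector.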
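The first option (per-frame target vectors arranged in time order) can be illustrated with a small sketch. The text does not state how speed and direction are measured; the version below assumes that the centre of the target image area is known for every frame and derives speed and direction from the displacement between consecutive frames. The function names and the centre-based estimate are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def target_vectors(centers: List[Point], timestamps: List[float]) -> List[Tuple[float, float]]:
    """Return one (speed, direction) pair per consecutive pair of frames.

    centers[i] is the assumed centre (x, y), in pixels, of the target image
    area in frame i; timestamps[i] is the time of frame i in seconds.
    Speed is in pixels per second, direction is an angle in radians.
    """
    if len(centers) != len(timestamps):
        raise ValueError("centers and timestamps must have the same length")
    # Process frames in the time order they have in the video file.
    order = sorted(range(len(centers)), key=lambda i: timestamps[i])
    vectors = []
    for a, b in zip(order, order[1:]):
        dt = timestamps[b] - timestamps[a]
        if dt <= 0:
            continue  # skip frames with identical timestamps
        dx = centers[b][0] - centers[a][0]
        dy = centers[b][1] - centers[a][1]
        vectors.append((math.hypot(dx, dy) / dt, math.atan2(dy, dx)))
    return vectors


def first_target_vector(vectors: List[Tuple[float, float]]) -> List[float]:
    """Flatten the time-ordered target vectors into one motion feature vector."""
    return [value for pair in vectors for value in pair]
```

The second option would replace each (speed, direction) pair with a dense two-dimensional optical flow diagram and stack the diagrams along a time axis in the same order.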
- the first server determining whether the target object appears in the multiple target video frame images according to the motion feature and the image feature of each target video frame image includes:
- the motion feature and the image feature of each target video frame image are input into a pre-trained neural network model to obtain an object recognition result, where the object recognition result indicates whether the target object appears in the multiple target video frame images.
- inputting the motion feature and the image feature of each target video frame image into a pre-trained neural network model to obtain an object recognition result includes:
- passing each image feature through a neural network layer structure including a convolution layer, a regularization layer, and an activation function layer to obtain multiple first feature vectors; fusing the multiple first feature vectors with the motion feature to obtain a second feature vector; inputting the second feature vector to a fully connected layer for classification to obtain a first classification result, wherein the neural network model includes the neural network layer structure and the fully connected layer, the object recognition result includes the first classification result, and the first classification result indicates whether the target object appears in the multiple target video frame images; or
- passing each image feature through a first neural network layer structure including a convolutional layer, a regularization layer, and an activation function layer to obtain multiple first feature vectors; passing the motion feature through a convolutional layer, a regularization layer, [...] wherein the object recognition result includes the second classification result, and the second classification result indicates whether the target object appears in the multiple target video frame images.
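The fusion and classification step of the first option can be sketched in plain Python. This is only an illustration under stated assumptions: fusion is taken to be simple concatenation, the fully connected layer has two outputs (index 0 for "no target", index 1 for "target appears"), and the weights are supplied by the caller rather than trained. None of these details is given in the text.

```python
import math
from typing import List, Sequence


def fuse(first_feature_vectors: Sequence[Sequence[float]],
         motion_feature: Sequence[float]) -> List[float]:
    """Concatenate the first feature vectors and the motion feature (assumed fusion)."""
    fused = [x for vec in first_feature_vectors for x in vec]
    fused.extend(motion_feature)
    return fused


def fully_connected(x: Sequence[float],
                    weights: Sequence[Sequence[float]],
                    bias: Sequence[float]) -> List[float]:
    """Compute W @ x + b for a single fully connected layer."""
    if len(weights) != len(bias) or any(len(row) != len(x) for row in weights):
        raise ValueError("weight, bias and input dimensions do not match")
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(first_feature_vectors, motion_feature, weights, bias) -> bool:
    """Return True when the 'target appears' class has the higher probability."""
    second_feature_vector = fuse(first_feature_vectors, motion_feature)
    probs = softmax(fully_connected(second_feature_vector, weights, bias))
    return probs[1] > probs[0]
```

In a real model the first feature vectors would come from the convolution, regularization, and activation layers described above, and the weights would come from training.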
- the receiving, by the first server, the image sent by the video surveillance device when a moving object is detected in the target area includes:
- the first server receives the multiple target video frame images sent by the video surveillance device, where the video surveillance device samples the target video to obtain a group of video frame images and determines the multiple target video frame images in that group according to the pixel values of the pixels in the group of video frame images; or,
- the first server receives a group of video frame images sent by the video surveillance device, where the group of video frame images is obtained by the video surveillance device sampling the target video; the first server determines the multiple target video frame images in the group according to the pixel values of the pixels in the group of video frame images.
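Selecting target video frame images from a sampled group "according to the pixel values" can be sketched as below. The exact criterion is not given in this passage, so the sketch assumes a simple background comparison: the per-pixel median over the group serves as the background, and a frame is kept when enough of its pixels deviate from that background. The thresholds and function name are hypothetical.

```python
from statistics import median
from typing import List

Frame = List[List[float]]  # grayscale frame as rows of pixel values


def select_target_frames(frames: List[Frame],
                         diff_threshold: float = 30.0,
                         min_changed_pixels: int = 1) -> List[int]:
    """Return the indices of frames that differ noticeably from the background.

    Assumption: the background is the per-pixel median over the sampled group,
    and a pixel counts as changed when it deviates from the background by more
    than `diff_threshold`.
    """
    if not frames:
        return []
    height, width = len(frames[0]), len(frames[0][0])
    background = [[median(f[y][x] for f in frames) for x in range(width)]
                  for y in range(height)]
    selected = []
    for index, frame in enumerate(frames):
        changed = sum(
            1
            for row, bg_row in zip(frame, background)
            for pixel, bg in zip(row, bg_row)
            if abs(pixel - bg) > diff_threshold
        )
        if changed >= min_changed_pixels:
            selected.append(index)
    return selected
```

The same function could run on either side: on the video surveillance device before sending (first option) or on the first server after receiving the whole group (second option).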
- the first server includes: a first cloud server.
- the second server includes: a second cloud server.
- A method for monitoring a target object, including: when a video monitoring device detects that a moving object appears in a target area, the video monitoring device obtains an image from a target video, that is, the part of the video obtained by shooting the target area in which the object appears; the video monitoring device sends the image to a first server, where the image is used by the first server to determine whether the object is a target object.
- The method further includes: the video monitoring device sends the target video to a second server, where the second server is configured to, upon receiving a first request sent by the first server, send the target video to the first server in response to the first request.
- The method further includes: the video surveillance device receives a second request sent by the first server; in response to the second request, the video surveillance device sends the target video to the first server.
- Acquiring an image from the target video in which the object appears includes: when the video surveillance device detects that a moving object appears in the target area, it intercepts a video image at predetermined intervals from the video obtained by shooting the target area, starting when the object appears in the target area and continuing until the object no longer appears there, wherein the image includes the video image. Sending the image to the first server includes: the video monitoring device sends each intercepted video image to the first server in real time; or the video monitoring device obtains an image set including all the intercepted video images and sends the image set to the first server.
- The method further includes: the video monitoring device obtains, from the video obtained by shooting the target area, a first video that starts when the object appears in the target area and ends when the object no longer appears in the target area; the video monitoring device obtains a second video covering a first target time period before the object appears in the target area and a third video covering a second target time period after the object no longer appears in the target area; the video monitoring device determines the second video, the first video, and the third video as the target video.
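How the target video is assembled from the second, first, and third videos can be shown with time ranges. The sketch assumes times are expressed in seconds from the start of the recording and that the clips are clamped to the available recording; the `Clip` type and the parameter names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Clip:
    start: float  # seconds from the start of the recording
    end: float


def target_video_clips(appear: float, disappear: float,
                       first_period: float, second_period: float,
                       recording_end: Optional[float] = None) -> Tuple[Clip, Clip, Clip]:
    """Return the (second, first, third) videos in playback order.

    second: the first target time period before the object appears
    first:  from the object appearing until it no longer appears
    third:  the second target time period after the object disappears
    """
    if disappear < appear:
        raise ValueError("disappear must not be earlier than appear")
    second = Clip(max(0.0, appear - first_period), appear)
    first = Clip(appear, disappear)
    third_end = disappear + second_period
    if recording_end is not None:
        third_end = min(third_end, recording_end)
    third = Clip(disappear, third_end)
    return second, first, third


def target_video(clips: Tuple[Clip, Clip, Clip]) -> Clip:
    """The three clips are contiguous, so the target video is one time range."""
    return Clip(clips[0].start, clips[-1].end)
```

Keeping a short period before and after the object is visible gives a reviewer context on where the object came from and where it went.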
- A monitoring system for a target object, including: a video monitoring device and a first server, wherein the video monitoring device is connected to the first server; the video monitoring device is configured to, when a moving object is detected in a target area, obtain an image from a target video in which the object appears, the target video being part of the video obtained by shooting the target area, and send the image to the first server; the first server is configured to determine whether the object is a target object according to the image.
- The video surveillance device is configured to: when a moving object is detected in the target area, intercept a video image at predetermined intervals from the video obtained by shooting the target area, starting when the object appears in the target area and continuing until the object no longer appears there, wherein the image includes the video image; and either send each intercepted video image to the first server in real time, or acquire an image set including all the intercepted video images and send the image set to the first server.
- The first server is configured to: identify whether the object in each received video image is the target object, obtaining a recognition result for each video image; merge the recognition results of all received video images into a target result; and determine, according to the target result, whether the object is the target object.
- The first server is further configured to: in a case where it is determined that the object is the target object, obtain the target video; determine, in the target video, the movement track of the target object in the target area; generate prompt information according to the movement track, wherein the prompt information prompts a way to eliminate the target object; and generate alarm information corresponding to the target object, wherein the alarm information indicates that the target object appears in the target area and includes at least one of the following: the target video, the movement track, and the prompt information.
- The system further includes: a client, wherein the first server is connected to the client; the first server is configured to send the alarm information to the client; the client is configured to display the alarm information on a display interface.
- The system further includes: a second server, wherein the second server is connected to the video monitoring device and the first server; the video monitoring device is further configured to send the target video to the second server; the second server is configured to store the target video; the first server is configured to obtain the target video from the second server.
- The first server is further configured to send indication information to the second server in a case where it is determined that the object is not the target object; the second server is configured to delete the target video in response to the indication information.
- The video monitoring device is further configured to: acquire, from the video obtained by shooting the target area, a first video that starts when the object appears in the target area and ends when the object no longer appears in the target area; acquire a second video covering a first target time period before the object appears in the target area and a third video covering a second target time period after the object no longer appears in the target area; and determine the second video, the first video, and the third video as the target video.
- A monitoring device for a target object, applied to a first server, including: a receiving module configured to receive an image sent by a video monitoring device when a moving object is detected in a target area, where the image is obtained from a target video in which the object appears, the target video being part of the video captured by the video monitoring device of the target area; and a determining module configured to determine, according to the image, whether the object is a target object.
- A monitoring device for a target object, applied to a video monitoring device, including: an acquisition module configured to, when a moving object is detected in a target area, acquire an image from a target video in which the object appears, the target video being part of the video obtained by the video monitoring device shooting the target area; and a sending module configured to send the image to a first server, where the image is used by the first server to determine whether the object is a target object.
- A storage medium in which a computer program is stored, wherein the computer program is configured to execute, when run, the steps in any one of the foregoing method embodiments.
- An electronic device including a memory and a processor, wherein the memory stores a computer program and the processor is configured to run the computer program to execute the steps in any one of the foregoing method embodiments.
- In the present application, the first server receives an image sent by the video surveillance device when a moving object is detected in the target area, where the image is obtained from the target video in which the object appears, and determines according to the image whether the object is the target object. The video surveillance device therefore only needs to send images that may contain the target object to the first server when a moving object appears in the target area.
- The first server can determine whether the object appearing in the target area is the target object based on the received image. Compared with monitoring the target object based on video, this greatly reduces the amount of data transmitted, which increases the transmission speed, reduces the transmission time, and improves the monitoring efficiency. The problem of low efficiency in monitoring the target object in the related art can therefore be solved, and the effect of improving the efficiency of monitoring the target object can be achieved.
- FIG. 1 is a block diagram of the hardware structure of a mobile terminal of a method for monitoring a target object according to an embodiment of the present application
- Fig. 2 is a first flowchart of a method for monitoring a target object according to an embodiment of the present application
- Fig. 3 is a schematic diagram of a data connection of each module according to an embodiment of the present application.
- Fig. 4 is a schematic diagram of the principle of a rat infestation detection system according to an embodiment of the present application.
- FIG. 5 is a schematic diagram of a Faster-RCNN network model according to an embodiment of the present application.
- Fig. 6 is a second flowchart of a method for monitoring a target object according to an embodiment of the present application.
- Fig. 7 is a first structural block diagram of a monitoring device for a target object according to an embodiment of the present application.
- Fig. 8 is a second structural block diagram of a monitoring device for a target object according to an embodiment of the present application.
- Fig. 9 is a structural block diagram of a target object monitoring system according to an embodiment of the present application.
- Fig. 10 is a schematic diagram of a target object monitoring architecture according to an optional embodiment of the present application.
- FIG. 1 is a hardware structure block diagram of a mobile terminal of a method for monitoring a target object in an embodiment of the present application.
- The mobile terminal 10 may include one or more processors 102 (only one is shown in FIG. 1; the processor 102 may include, but is not limited to, a processing device such as a microprocessor (MCU) or a programmable logic device (FPGA)) and a memory 104 for storing data.
- The above-mentioned mobile terminal may also include a transmission device 106 for communication functions and an input/output device 108.
- FIG. 1 is merely illustrative, and does not limit the structure of the above-mentioned mobile terminal.
- The mobile terminal 10 may also include more or fewer components than those shown in FIG. 1, or have a configuration different from that shown in FIG. 1.
- the memory 104 may be used to store computer programs, for example, software programs and modules of application software, such as the computer programs corresponding to the monitoring method of the target object in the embodiment of the present application.
- The processor 102 runs the computer programs stored in the memory 104, thereby performing various functional applications and data processing, that is, implementing the above-mentioned method.
- the memory 104 may include a high-speed random access memory, and may also include a non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory.
- the memory 104 may further include a memory remotely provided with respect to the processor 102, and these remote memories may be connected to the mobile terminal 10 via a network. Examples of the aforementioned networks include but are not limited to the Internet, corporate intranets, local area networks, mobile communication networks, and combinations thereof.
- the transmission device 106 is configured to receive or transmit data via a network.
- the aforementioned optional network examples may include a wireless network provided by a communication provider of the mobile terminal 10.
- the transmission device 106 includes a network adapter (Network Interface Controller, NIC for short), which can be connected to other network devices through a base station so as to communicate with the Internet.
- the transmission device 106 may be a radio frequency (RF) module, which is configured to communicate with the Internet in a wireless manner.
- a method for monitoring a target object is provided.
- Fig. 2 is a first flowchart of the method for monitoring a target object according to an embodiment of the present application. As shown in Fig. 2, the process includes the following steps:
- Step S202: the first server receives an image sent by the video surveillance device when a moving object is detected in the target area, where the image is obtained from the target video in which the object appears, the target video being part of the video obtained by the video surveillance device shooting the target area;
- Step S204: the first server determines whether the object is a target object according to the image.
- the target object may include, but is not limited to: rats, pests and other harmful organisms.
- the target area may include, but is not limited to, a kitchen, a warehouse, a factory building, and so on.
- the video monitoring device may include, but is not limited to, a camera, a monitor, and so on.
- The aforementioned camera may include, but is not limited to, a camera with an infrared lighting function, for example, an infrared low-light night vision camera. Further, the camera may also provide, but is not limited to, a motion detection function, a storage function, a networking function (such as Wireless Fidelity (WiFi) networking), and a high-definition (such as greater than 1080p) configuration.
- the video surveillance device may include, but is not limited to, one or more video surveillance devices.
- the first server may include, but is not limited to: a first cloud server.
- For example, the first cloud server may be Ziyouyun.
- The first server determines whether the object appearing in the target area is the target object according to the image obtained from the video surveillance device. The image is obtained by the video surveillance device, when it detects a moving object in the target area, from the target video in which the object appears, the target video being part of the video obtained by shooting the target area. The video surveillance device therefore only needs to send images that may contain the target object to the first server when a moving object is detected in the target area, and the first server can determine from them whether the object appearing in the target area is the target object. Compared with monitoring the target object based on video, this greatly reduces the amount of data transmitted, which increases the transmission speed, reduces the transmission time, and improves the monitoring efficiency. The problem of low efficiency in monitoring the target object in the related art can therefore be solved, and the effect of improving the efficiency of monitoring the target object can be achieved.
- The first server may obtain the target video only after determining that the object appearing in the target area is the target object; if the object is not the target object, the target video is not obtained, which saves resources. For example: after the above step S204, in a case where the object is determined to be the target object, the first server obtains the target video.
- the storage location of the target video may include, but is not limited to, a video surveillance device or a second server.
- the first server may, but is not limited to, obtain the target video in one of the following ways:
- Method 1: the first server obtains the target video from the video surveillance device.
- Method 2: the first server obtains the target video from the second server, where the target video is sent to the second server by the video surveillance device when a moving object is detected in the target area.
- the second server may include but is not limited to: a second cloud server.
- For example, the second cloud server may be fluorite cloud.
- The video surveillance device may send the target video to the second server. If the first server determines from the image that the object in the target area is not the target object, it may send indication information to the second server to instruct the second server to delete the target video, which saves storage space. For example: after the above step S204, in a case where it is determined that the object is not the target object, the first server sends instruction information to the second server, where the instruction information instructs the second server to delete the target video.
- the first server may analyze the movement track of the target object in the target area from the target video. For example: after the first server obtains the target video, the first server determines the movement track of the target object in the target area in the target video.
- the first server may generate a suggestion for eliminating the target object according to the analyzed movement track of the target object, and provide it to the user. For example: after the first server determines the movement track of the target object in the target area in the target video, the first server generates prompt information according to the movement track, where the prompt information is used to prompt a way to eliminate the target object.
- the first server may send alarm information carrying the target video, the movement track, and the prompt information to the client, so that the user receives an alarm about the target object, a suggestion for eliminating the target object based on its movement track, and a playback of the target object's movement for reference.
- after the first server generates the prompt information according to the movement track, the first server generates alarm information corresponding to the target object, where the alarm information is used to indicate that the target object appears in the target area and includes at least one of the following: the target video, the movement track, and the prompt information; the first server then sends the alarm information to the client.
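The server-side handling described above can be sketched as follows. This is only an illustration under stated assumptions: `AlarmInfo`, `handle_recognition`, `simple_prompt`, the action strings, and the rule used to build the prompt are hypothetical names introduced here and do not come from the source.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

Point = Tuple[int, int]


@dataclass
class AlarmInfo:
    """Alarm information sent to the client (field names are illustrative)."""
    target_video: Optional[str] = None          # reference to the stored target video
    movement_track: List[Point] = field(default_factory=list)
    prompt: Optional[str] = None                # suggestion for eliminating the target


def handle_recognition(is_target: bool, video_ref: str, track: List[Point],
                       make_prompt: Callable[[List[Point]], str]):
    """Return (action, alarm) for one recognition outcome.

    "delete_video" stands in for the indication sent to the second server;
    "alarm" means alarm information should be sent to the client.
    """
    if not is_target:
        return "delete_video", None
    alarm = AlarmInfo(target_video=video_ref,
                      movement_track=list(track),
                      prompt=make_prompt(track))
    return "alarm", alarm


def simple_prompt(track: List[Point]) -> str:
    # Hypothetical rule: suggest acting near the last observed position.
    return f"place control measures near {track[-1]}" if track else "no track available"


action, alarm = handle_recognition(True, "video-001", [(0, 0), (3, 4)], simple_prompt)
```

Keeping the decision in one function makes both branches explicit: the target video is only attached to an alarm when the object is the target object, and is otherwise marked for deletion.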
- the video surveillance device may obtain the image sent to the first server in, but is not limited to, the following manner: when the video surveillance device detects a moving object in the target area, it starts to capture video images at predetermined intervals from the video obtained by shooting the target area, beginning when the object appears in the video and continuing until the object no longer appears in the target area, where the image includes the video images; the video surveillance device either sends each captured video image to the first server in real time, or obtains an image set including all the captured video images and sends the image set to the first server.
- the images sent by the video surveillance device to the first server may be multiple images, and the first server may recognize each image to obtain recognition results, and then merge these recognition results to obtain the final target result.
- the first server recognizes whether the object in each received video image is the target object, and obtains a recognition result corresponding to each video image; the first server fuses the recognition results corresponding to all the received video images into a target result; the first server determines whether the object is the target object according to the target result.
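The source does not specify how the per-image recognition results are fused. One simple possibility, shown here purely as an assumption, is a vote over per-image scores; `fuse_results`, `score_threshold`, and `min_fraction` are hypothetical names and values.

```python
from typing import Sequence


def fuse_results(scores: Sequence[float], score_threshold: float = 0.5,
                 min_fraction: float = 0.5) -> bool:
    """Fuse per-image recognition scores into one target result.

    An image votes "target object" when its score exceeds score_threshold;
    the fused result is positive when at least min_fraction of the images
    vote that way. Both thresholds are illustrative assumptions.
    """
    if not scores:
        return False
    votes = sum(1 for s in scores if s > score_threshold)
    return votes / len(scores) >= min_fraction
```

A vote over several images is less sensitive to a single misrecognized image than a decision based on one image alone.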
- the first server may recognize whether the object in a video image is the target object in, but is not limited to, the following manner:
- the first server determines whether an object appears in each received video image;
- the first server recognizes whether the object in each video image in which an object appears is the target object.
- the target object may be recognized in, but is not limited to, the following manner:
- the first server performs target object detection on each target video frame image to obtain the image feature of each target video frame image, where the image includes multiple target video frame images obtained from the target video, each target video frame image is used to indicate an object moving in the target area, and the image feature is used to indicate, among the moving objects, the target image area in which an object whose similarity to the target object is greater than the first threshold is located;
- the first server determines the motion feature according to the image feature of each target video frame image, where the motion feature is used to represent the motion speed and motion direction of the object in the multiple target video frame images;
- the first server determines whether the target object appears in the multiple target video frame images according to the motion feature and the image feature of each target video frame image.
- a method for determining a target object is also provided. Assume that the video surveillance device is a camera device and that the acquired image is an image frame extracted from the target video. The method includes the following steps:
- Step S1: Obtain a video file captured by the camera device shooting the target area.
- the camera device may be a surveillance camera; for example, the camera device is an infrared low-light night vision camera that shoots and monitors the target area to obtain the video file.
- the target area is the spatial area under detection in the target building, that is, the area in which the presence of the target object is to be detected.
- the target object can be a larger-sized disease vector that needs to be controlled; for example, the target object is a mouse.
- the video file of this embodiment includes original video data obtained by shooting a target area, and may include a surveillance video sequence of the target area, which is also an image video sequence.
- the original video data of the target area is acquired through the ARM board at the video data collection layer to generate the above-mentioned video file, thereby achieving the purpose of collecting the video of the target area.
- Step S2: Perform frame sampling on the video file to obtain a group of video frame images.
- in step S2 of this application, after the video file captured by the camera device in the target area is obtained, the video file is preprocessed; frame sampling can be performed on the video file at the video data processing layer to obtain a group of video frame images.
- the video file can be sampled at equal intervals to obtain a group of video frame images of the video file.
- for example, a video file includes a sequence of 100 video frames, and 10 video frames are obtained after frame sampling.
- these 10 video frames are used as the above-mentioned group of video frame images, which reduces the amount of computation of the algorithm for determining the target object.
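The equal-interval sampling in the 100-to-10 example above can be sketched as follows. `sample_frames` is a hypothetical helper, and a list of integers stands in for decoded video frames.

```python
def sample_frames(frames, num_samples):
    """Return up to num_samples frames taken at equal intervals from frames."""
    if num_samples <= 0 or not frames:
        return []
    # Integer step between sampled frames; at least 1 so short videos still work.
    step = max(len(frames) // num_samples, 1)
    return frames[::step][:num_samples]


video = list(range(100))          # stand-in for a sequence of 100 video frames
sampled = sample_frames(video, 10)
```

With 100 frames and 10 samples the step is 10, so every tenth frame is kept and the later steps process one tenth of the data.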
- Step S3: Determine multiple target video frame images in the group of video frame images according to the pixel values of the pixels in the group of video frame images.
- in step S3 of this application, after the video file is sampled to obtain the group of video frame images, multiple target video frame images are determined in the group of video frame images according to the pixel values of the pixels in the group, where each target video frame image is used to indicate an object moving in the target area.
- preprocessing the video file also includes performing dynamic detection on the video file to determine, from the group of video frame images, the target video frame images that indicate an object moving in the target area, that is, the video frame images in which a moving object is present.
- the target video frame image may be a video clip of a moving object, where the moving object may or may not be the target object.
- the target video frame images can be determined by a dynamic detection algorithm: multiple target video frame images are determined in the group of video frame images according to the pixel values of the pixels in the group, and then step S4 is performed.
- video frame images other than the multiple target video frame images do not indicate a moving object in the target area and need not undergo subsequent detection.
- Step S4: Perform target object detection on each target video frame image to obtain the image feature of each target video frame image.
- after the multiple target video frame images are determined in the group of video frame images according to the pixel values of the pixels in the group, target object detection is performed on each target video frame image to obtain the image feature of each target video frame image. For each target video frame image, the image feature is used to indicate, among the moving objects, the target image area in which an object whose similarity to the target object is greater than the first threshold is located.
- performing target object detection on each target video frame image means detecting the moving objects present in the target video frame image.
- the target detection system can use a dynamic target detection method and a neural-network-based target detection method to detect the moving objects in the target video frame images and obtain the image feature of each target video frame image.
- the dynamic target detection method is fast and has low requirements for machine configuration, while the neural-network-based target detection method is more accurate and robust.
- the image feature can be the visual information within a rectangular frame that represents the target image area.
- the rectangular frame can be a detection frame that indicates, among the moving objects, the target image area in which an object whose similarity to the target object is greater than the first threshold is located.
- that is, the above image features indicate the possible locations of the target object identified by coarse screening.
- Step S5: Determine the motion feature according to the image feature of each target video frame image.
- in step S5 of the present application, after target object detection is performed on each target video frame image and the image feature of each target video frame image is obtained, the motion feature is determined according to the image feature of each target video frame image, where the motion feature is used to represent the motion speed and motion direction of the objects moving in the multiple target video frame images.
- the image feature of each target video frame image can be input to the motion feature extraction module.
- the motion feature extraction module determines the motion feature according to the image feature of each target video frame image. For the multiple target video frame images, the motion feature represents the motion speed and motion direction of the moving objects in those images, and at the same time further filters out interference images caused by the movement of non-target objects, for example, interference such as the movement of mosquitoes.
- the motion feature extraction algorithm of the motion feature extraction module may first compute, from the image feature of each target video frame image, the correlation of the image features between the multiple target video frame images; objects corresponding to highly correlated image features are determined to be the same object, and the image features of the target video frame images are matched to obtain a series of motion pictures of the object.
- a three-dimensional (3D) feature extraction network can then be used to extract features from the motion sequence to obtain the motion feature.
- for example, based on the detection frame of each target video frame image, the correlation of the detection frames between the multiple target video frame images is calculated; objects corresponding to highly correlated detection frames are determined to be the same object, and the detection frames of the target video frame images are matched to obtain a series of motion pictures of the object; finally, the 3D feature extraction network is used to extract features from the motion sequence to obtain the motion feature, from which the motion speed and motion direction of the moving objects in the multiple target video frame images are determined.
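The matching of detection frames across target video frame images can be sketched as follows. The source does not say how the correlation between detection frames is computed; this sketch assumes intersection over union (IoU) as the correlation measure and a greedy link between consecutive frames. `iou`, `link_boxes`, and `min_iou` are hypothetical names and values.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(ix2 - ix1, 0) * max(iy2 - iy1, 0)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def link_boxes(frames, min_iou=0.3):
    """Greedily link detection frames of consecutive images into tracks.

    frames: list, in time order, of lists of boxes.
    Returns a list of tracks; each track is a list of (frame_index, box).
    """
    tracks = []
    for t, boxes in enumerate(frames):
        unmatched = list(boxes)
        for track in tracks:
            last_t, last_box = track[-1]
            # Only extend tracks that were seen in the previous frame.
            if last_t != t - 1 or not unmatched:
                continue
            best = max(unmatched, key=lambda b: iou(last_box, b))
            if iou(last_box, best) >= min_iou:
                track.append((t, best))
                unmatched.remove(best)
        # Boxes that matched no existing track start new tracks.
        tracks.extend([[(t, b)] for b in unmatched])
    return tracks


detections = [[(0, 0, 10, 10)],
              [(2, 0, 12, 10)],
              [(4, 0, 14, 10), (50, 50, 60, 60)]]
tracks = link_boxes(detections)
```

Each resulting track is the series of pictures of one object; a 3D feature extraction network, not shown here, would then take such a sequence as input.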
- the image features of the multiple target video frame images can also be fused before feature extraction, which prevents misjudgment by a single-frame target detector and achieves fine screening of the target video frame images, so that whether the target object appears can be determined accurately.
- Step S6: Determine whether the target object appears in the multiple target video frame images according to the motion feature and the image feature of each target video frame image.
- the classification network is a pre-designed classification network model used to determine whether the target object appears in the multiple target video frame images; it makes this determination according to the motion feature and the image feature of each target video frame image, for example, determining whether rats appear in the multiple target video frame images.
- this embodiment can input the image features of those images, among the multiple target video frame images, in which the target object appears to the front-end display interface, which can then display the detection frame and movement track of the target object.
- the classification network model of this embodiment can be used to filter out picture sequences of non-target objects while retaining picture sequences of the target object, which reduces the false alarm rate and ensures the accuracy of the target object prompt information.
- each target video frame image is used to indicate an object moving in the target area; target object detection is performed on each target video frame image to obtain the image feature of each target video frame image, where the image feature is used to indicate, among the moving objects, the target image area in which an object whose similarity to the target object is greater than the first threshold is located; the motion feature is determined according to the image feature of each target video frame image, where the motion feature is used to represent the motion speed and motion direction of the moving objects in the multiple target video frame images; and whether the target object appears in the multiple target video frame images is determined according to the motion feature and the image feature of each target video frame image.
- that is, the video file of the target area is sampled to obtain a group of video frame images; multiple target video frame images indicating an object moving in the target area are determined in the group of video frame images; the motion feature is then determined according to the image feature of each target video frame image; and whether the target object appears in the multiple target video frame images is determined automatically according to the motion feature and the image feature of each target video frame image. This greatly reduces the labor cost of determining the target object, improves the accuracy of determining the target object, solves the problem of low efficiency in determining the target object, and thereby improves the accuracy of rat infestation detection.
- in step S3, determining multiple target video frame images in the group of video frame images according to the pixel values of the pixels in the group of video frame images includes: acquiring the average pixel value of each pixel in the group of video frame images; obtaining the difference between the pixel value of each pixel in each video frame image in the group and the corresponding average pixel value; and determining the video frame images in the group whose difference meets a predetermined condition as the target video frame images.
- the pixel value of each pixel in the group of video frame images can be obtained, the average pixel value calculated from the pixel values of each pixel, and then the difference between the pixel value of each pixel in each video frame image and the corresponding average pixel value obtained.
- this embodiment may also obtain the difference between the pixel value of each pixel in each video frame image in the group and the corresponding pixel value in the background or in the previous frame of that video frame image.
- the video frame images in the group whose difference meets the predetermined condition are determined as target video frame images, thereby obtaining the multiple target video frame images in the group of video frame images.
- each video frame image is regarded as the current video frame image, and each pixel is regarded as the current pixel.
- (x, y) can be used to indicate the coordinates of the current pixel in the current video frame image, for example, the coordinates of the pixel in a coordinate system whose origin is the upper left corner of the current video frame image, whose X axis is the width direction, and whose Y axis is the height direction.
- the pixel value of the current pixel is represented by f(x,y), and the average pixel value of the current pixel is represented by b(x,y).
- with each video frame image regarded as the current video frame image and each pixel regarded as the current pixel, M(x,y) represents the current video frame image, D(x,y) represents the difference between the pixel value of the current pixel and the corresponding average pixel value, and T represents the first preset threshold.
- the multiple target video frame images in the group of video frame images form the moving-target video frame images, and all moving objects can be obtained as the output result by merging pixels through morphological operations.
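The formulas relating f(x,y), b(x,y), D(x,y), M(x,y), and T are not present in this text. A plausible reading, stated here as an assumption, is D(x,y) = |f(x,y) - b(x,y)|, with M(x,y) = 1 where D(x,y) > T and 0 otherwise. The sketch below follows that reading on small grayscale frames stored as nested lists; the predetermined condition is assumed to be a minimum count of moving pixels, and all function names are hypothetical. The morphological merging step is not shown.

```python
def mean_image(frames):
    """Per-pixel average b(x, y) over a group of equally sized grayscale frames."""
    n = len(frames)
    height, width = len(frames[0]), len(frames[0][0])
    return [[sum(f[y][x] for f in frames) / n for x in range(width)]
            for y in range(height)]


def motion_mask(frame, background, threshold):
    """Assumed rule: 1 where |f(x, y) - b(x, y)| exceeds the threshold T, else 0."""
    return [[1 if abs(p - b) > threshold else 0 for p, b in zip(row, brow)]
            for row, brow in zip(frame, background)]


def select_target_frames(frames, threshold, min_moving_pixels=1):
    """Indices of frames whose mask has at least min_moving_pixels pixels set."""
    background = mean_image(frames)
    selected = []
    for i, frame in enumerate(frames):
        mask = motion_mask(frame, background, threshold)
        if sum(map(sum, mask)) >= min_moving_pixels:
            selected.append(i)
    return selected


# Four 2x2 frames; only frame 2 contains a bright pixel.
still = [[0, 0], [0, 0]]
moving = [[200, 0], [0, 0]]
group = [still, still, moving, still]
target_indices = select_target_frames(group, threshold=100)
```

In this example the average at the top-left pixel is 50, so only the frame with value 200 differs from the average by more than 100 and is kept as a target video frame image.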
- the detection of moving objects in the target video frame images in this embodiment is neural-network-based target detection.
- a group of video frame images can be input to a pre-trained network model to obtain all moving objects and their confidence levels, and the image features whose confidence is greater than a certain confidence threshold are used as the output of the network module.
- the network model used can include, but is not limited to, the Single Shot MultiBox Detector (SSD), the Faster Region-based Convolutional Neural Network (Faster R-CNN), the Feature Pyramid Network (FPN), and so on; no restriction is imposed here.
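The confidence-threshold step can be sketched independently of the detector. The `Detection` type, `filter_detections`, and the default threshold of 0.5 are assumptions for illustration; a real detector such as SSD or Faster R-CNN would supply the boxes and confidences.

```python
from typing import List, NamedTuple, Tuple


class Detection(NamedTuple):
    box: Tuple[int, int, int, int]   # detection frame as (x1, y1, x2, y2)
    confidence: float                # detector confidence in [0, 1]


def filter_detections(detections: List[Detection],
                      min_confidence: float = 0.5) -> List[Detection]:
    """Keep only detections whose confidence is greater than min_confidence."""
    return [d for d in detections if d.confidence > min_confidence]


raw = [Detection((0, 0, 10, 10), 0.92),
       Detection((5, 5, 8, 8), 0.31),
       Detection((20, 20, 40, 35), 0.77)]
kept = filter_detections(raw)
```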
- the multiple target vectors are composed into a first target vector according to the time sequence of each target video frame image in the video file, where the motion feature includes the first target vector; or the two-dimensional optical flow diagram corresponding to the target image area represented by the image feature of each target video frame image is obtained, to obtain multiple two-dimensional optical flow diagrams, where each two-dimensional optical flow diagram includes the motion speed and motion direction of the moving object in a corresponding target video frame image when passing through the target image area, and the multiple two-dimensional optical flow diagrams are composed into a three-dimensional second target vector according to the time sequence of each target video frame image in the video file, where the motion feature includes the three-dimensional second target vector.
- the target vector corresponding to the target image area represented by the image feature of each target video frame image can be obtained, so as to obtain multiple target vectors in one-to-one correspondence with the multiple target video frame images, where each target vector is used to represent the motion speed and motion direction of the moving object in the corresponding target video frame image when passing through the target image area; that is, the motion speed and motion direction of the moving object in each target video frame image when passing through the target image area serve as the image feature of each target video frame image.
- the multiple target vectors are then composed into the first target vector according to the time sequence of each target video frame image in the video file, where this time sequence can be expressed by a time axis: the multiple target vectors are spliced along the time axis to obtain the first target vector, which is a one-dimensional vector, and this one-dimensional vector is output as the motion feature.
- the image feature of each target video frame image represents a target image area, and the optical flow (optical flow or optic flow) of each target image area can be calculated to obtain the two-dimensional optical flow diagram corresponding to that target image area, and in turn multiple two-dimensional optical flow diagrams in one-to-one correspondence with the multiple target video frame images, where optical flow describes the movement of an observed target, surface, or edge caused by movement relative to the observer.
- each two-dimensional optical flow diagram of this embodiment includes the motion speed and motion direction of the moving object in a corresponding target video frame image when passing through the target image area; that is, the motion speed and motion direction of the moving object when passing through the target image area can be represented by a two-dimensional optical flow diagram.
- the multiple two-dimensional optical flow diagrams are composed into a three-dimensional second target vector according to the time sequence of each target video frame image in the video file, where this time sequence can be represented by a time axis.
- the multiple two-dimensional optical flow diagrams can be spliced along the time axis to obtain the second target vector, which is a three-dimensional vector.
- this embodiment determines the motion feature either from the target vectors that represent the motion speed and motion direction of the moving object in a corresponding target video frame image when passing through the target image area, or from the two-dimensional optical flow diagrams corresponding to the target image areas represented by the image feature of each target video frame image.
- the motion feature can be a one-dimensional vector or a three-dimensional vector, which achieves the purpose of determining the motion feature according to the image feature of each target video frame image; whether the target object appears in the multiple target video frame images is then determined according to the motion feature and the image feature of each target video frame image, which achieves automatic determination of whether the target object appears and improves the accuracy of determining the target object.
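Splicing along the time axis can be sketched for both forms of the motion feature. The sketch assumes that each item carries a timestamp, that a per-frame target vector is a (speed, direction) pair, and that each cell of a two-dimensional optical flow diagram holds a (dx, dy) displacement; these representations and the function names are hypothetical. Computing the optical flow itself is not shown.

```python
def first_target_vector(per_frame_vectors):
    """Splice (timestamp, (speed, direction)) items along the time axis.

    Returns a flat, one-dimensional list ordered by timestamp.
    """
    ordered = sorted(per_frame_vectors, key=lambda item: item[0])
    return [value for _, vector in ordered for value in vector]


def second_target_vector(per_frame_flow_maps):
    """Stack (timestamp, flow_map) items along the time axis.

    Each flow_map is a 2-D grid of cells; the result is indexed
    [time][row][column], which gives the three axes of the second target vector.
    """
    ordered = sorted(per_frame_flow_maps, key=lambda item: item[0])
    return [flow_map for _, flow_map in ordered]


vectors = [(2, (1.5, 90.0)), (1, (2.0, 45.0))]
flows = [(1, [[(0, 0), (1, 0)]]), (0, [[(0, 1), (0, 0)]])]
one_d = first_target_vector(vectors)
three_d = second_target_vector(flows)
```

Sorting by timestamp before splicing keeps the result consistent with the order of the target video frame images in the video file, even if the per-frame items arrive out of order.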
- a feature map can be output by a network that combines the detection of the above-mentioned moving objects (target detection) with motion feature extraction.
- the feature map fuses visual and motion features into a four-dimensional vector, where the four-dimensional vector may include, but is not limited to, a time dimension, a channel dimension, a length dimension, and a height dimension.
- in step S6, determining whether the target object appears in the multiple target video frame images according to the motion feature and the image feature of each target video frame image includes: inputting the motion feature and the image feature of each target video frame image into a pre-trained neural network model to obtain an object recognition result, where the object recognition result is used to indicate whether the target object appears in the multiple target video frame images.
- the motion feature and the image feature of each target video frame image can be combined and input into the pre-trained neural network model.
- the neural network model is also the classification network model; it can be obtained by training an initial neural network model on image feature samples of the moving target object, motion feature samples, and data indicating the target object, and is used to determine whether the target object appears in video frame images.
- the object recognition result, that is, the classification result or discrimination result, is used to indicate whether the target object appears in the multiple target video frame images.
- inputting the motion feature and the image feature of each target video frame image into a pre-trained neural network model to obtain the object recognition result includes: passing each image feature through a neural network layer structure including a convolutional layer, a regularization layer, and an activation function layer to obtain multiple first feature vectors; fusing the multiple first feature vectors with the motion feature to obtain a second feature vector; and inputting the second feature vector to a fully connected layer for classification to obtain a first classification result, where the neural network model includes the neural network layer structure and the fully connected layer, the object recognition result includes the first classification result, and the first classification result is used to indicate whether the target object appears in the multiple target video frame images;
- or passing each image feature through a first neural network layer structure including a convolutional layer, a regularization layer, and an activation function layer to obtain multiple first feature vectors; passing the motion feature through a second neural network layer structure including a convolutional layer, a regularization layer, and an activation function layer to obtain a second feature vector; fusing the multiple first feature vectors with the second feature vector to obtain a third feature vector; and inputting the third feature vector to the fully connected layer for classification to obtain a second classification result, where the neural network model includes the first neural network layer structure, the second neural network layer structure, and the fully connected layer, the object recognition result includes the second classification result, and the second classification result is used to indicate whether the target object appears in the multiple target video frame images.
- the overall structure of the neural network model can be divided into convolutional layers, regularization layers, activation function layers, and a fully connected layer.
- the convolutional layer is composed of several convolutional units, and the parameters of each convolutional unit are optimized through the back-propagation algorithm; the regularization layer can be used to prevent overfitting during training of the neural network model; the activation function layer introduces nonlinearity into the network; and the fully connected layer acts as the classifier in the entire convolutional neural network.
- each image feature can be passed through the neural network layer structure of a convolutional layer, a regularization layer, and an activation function layer to obtain multiple first feature vectors, and the multiple first feature vectors are fused with the aforementioned motion feature to obtain the second feature vector, where the motion feature is a one-dimensional motion feature.
- the multiple first feature vectors and the motion feature can be spliced (that is, concatenated) to obtain the second feature vector.
- the second feature vector is input to the fully connected layer for classification, that is, classified through the fully connected layer to obtain the first classification result, where the neural network model of this embodiment includes the above-mentioned neural network layer structure and the above-mentioned fully connected layer.
- the first classification result is an object recognition result used to indicate whether the target object appears in the multiple target video frame images, for example, a classification result of whether mice appear in the multiple target video frame images.
- the above method, in which each image feature is passed through a neural network layer structure including a convolutional layer, a regularization layer, and an activation function layer to obtain multiple first feature vectors, the multiple first feature vectors are fused with the motion feature to obtain the second feature vector, and the second feature vector is input to the fully connected layer for classification to obtain the first classification result, can be executed after the target vector corresponding to the target image area represented by the image feature of each target video frame image is obtained, to obtain multiple target vectors, and the multiple target vectors are composed into the first target vector according to the time sequence of each target video frame image in the video file.
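The single-branch variant above (layer structure, splicing with a one-dimensional motion feature, fully connected classification) can be sketched in plain Python. This is a toy forward pass under several assumptions that are not in the source: image features are flattened to one-dimensional lists, the regularization layer is modeled as zero-mean, unit-variance normalization, the activation is ReLU, and the fully connected layer has a single sigmoid output. All names, kernels, and weights are hypothetical, and no training is shown.

```python
import math


def conv1d(signal, kernel):
    """Valid 1-D convolution in the cross-correlation form used by CNN libraries."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]


def normalize(values, eps=1e-5):
    """Zero-mean, unit-variance scaling; stands in for the regularization layer."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + eps) for v in values]


def relu(values):
    """Activation function layer (ReLU assumed)."""
    return [max(v, 0.0) for v in values]


def layer_structure(image_feature, kernel):
    """Convolution, then normalization, then activation: one first feature vector."""
    return relu(normalize(conv1d(image_feature, kernel)))


def classify(image_features, motion_feature, kernel, fc_weights, fc_bias):
    """Return the probability that the target object appears."""
    first_vectors = [layer_structure(f, kernel) for f in image_features]
    # Splice all first feature vectors with the 1-D motion feature.
    second_vector = [v for vec in first_vectors for v in vec] + list(motion_feature)
    if len(fc_weights) != len(second_vector):
        raise ValueError("fc_weights must match the spliced feature length")
    logit = sum(w * v for w, v in zip(fc_weights, second_vector)) + fc_bias
    return 1.0 / (1.0 + math.exp(-logit))


features = [[0.0, 1.0, 4.0, 9.0], [9.0, 4.0, 1.0, 0.0]]
motion = [2.0, 45.0, 1.5, 90.0]
probability = classify(features, motion, [1.0, -1.0], [0.01] * 10, 0.0)
```

The two-branch variant differs only in that the motion feature would first pass through its own layer structure before splicing.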
- alternatively, each image feature is passed through the first neural network layer structure including a convolutional layer, a regularization layer, and an activation function layer to obtain multiple first feature vectors, and the above motion feature is passed through the second neural network layer structure including a convolutional layer, a regularization layer, and an activation function layer to obtain the second feature vector.
- the multiple first feature vectors and the second feature vector are then fused to obtain the third feature vector.
- the multiple first feature vectors and the second feature vector can be spliced (that is, concatenated) to obtain the third feature vector.
- the neural network model of this embodiment includes the first neural network layer structure, the second neural network layer structure, and the fully connected layer; the object recognition result includes the second classification result, which is used to indicate whether the target object appears in the multiple target video frame images, for example, a classification result of whether rats appear in the multiple target video frame images.
- the above method, in which each image feature is passed through the first neural network layer structure including a convolutional layer, a regularization layer, and an activation function layer to obtain multiple first feature vectors, the motion feature is passed through the second neural network layer structure including a convolutional layer, a regularization layer, and an activation function layer to obtain the second feature vector, the multiple first feature vectors are fused with the second feature vector to obtain the third feature vector, and the third feature vector is input to the fully connected layer for classification to obtain the second classification result, can be executed after the two-dimensional optical flow diagram corresponding to the target image area represented by the image feature of each target video frame image is obtained, to obtain multiple two-dimensional optical flow diagrams, and the multiple two-dimensional optical flow diagrams are composed into the three-dimensional second target vector according to the time sequence of each target video frame image in the video file.
- inputting the motion feature and the image feature of each target video frame image into a pre-trained neural network model to obtain the object recognition result includes: passing each image feature through multiple blocks in turn, Obtain a plurality of first feature vectors, where in each block, the input of the block is sequentially performed on the convolution operation on the convolution layer, the regularization operation on the regularization layer, and the activation operation on the activation function layer; The first feature vector is spliced with the motion feature to obtain the second feature vector; the second feature vector is input to the fully connected layer, and the first classification result is obtained through the output of the fully connected layer.
- Here the neural network model includes the multiple blocks and the fully connected layer, and the object recognition result includes the first classification result.
- The first classification result is used to indicate whether the target object appears in the multiple target video frame images. Alternatively, each image feature passes through multiple first blocks in turn to obtain multiple first feature vectors.
- The motion feature passes through multiple second blocks in turn to obtain a second feature vector. In each second block, the convolution operation on the convolution layer, the regularization operation on the regularization layer, and the activation operation on the activation function layer are performed on the input of the second block in sequence.
- In this case the neural network model includes multiple first blocks, multiple second blocks, and a fully connected layer; the object recognition result includes a second classification result, which is used to indicate whether the target object appears in the multiple target video frame images.
- Each image feature can also be processed by blocks.
- Each image feature can be passed through multiple blocks in turn to obtain multiple first feature vectors.
- In each block, the convolution operation on the convolution layer, the regularization operation on the regularization layer, and the activation operation on the activation function layer are performed on the input of the block in sequence.
- After the multiple first feature vectors are obtained, they are spliced with the motion feature to obtain the second feature vector.
- After the second feature vector is obtained, it is input to the fully connected layer for classification, and the first classification result is obtained through the output of the fully connected layer.
- The neural network model of this embodiment includes multiple blocks and a fully connected layer. The object recognition result includes a first classification result, which is used to indicate whether the target object appears in the multiple target video frame images, for example, whether a mouse appears in the multiple target video frame images.
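The splice-then-classify step described above can be illustrated with a minimal sketch. It assumes each first feature vector and the motion feature are already flat lists of numbers, and it stands in a single sigmoid unit for the fully connected layer; the names `splice` and `fully_connected_score` and the weights are hypothetical and do not come from the application.

```python
import math
from typing import List


def splice(first_vectors: List[List[float]], motion_feature: List[float]) -> List[float]:
    """Concatenate all first feature vectors, then append the motion feature."""
    spliced: List[float] = []
    for vector in first_vectors:
        spliced.extend(vector)
    spliced.extend(motion_feature)
    return spliced


def fully_connected_score(features: List[float], weights: List[float], bias: float) -> float:
    """One fully connected unit with a sigmoid output (illustrative assumption)."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))


second_vector = splice([[0.2, 0.4], [0.1, 0.3]], [1.5, -0.5])
score = fully_connected_score(second_vector, [0.0] * len(second_vector), 0.0)
target_present = score >= 0.5  # hypothetical decision threshold
```

In a trained model the weights would be learned; the zero weights here only show the data flow from the spliced vector to a single classification score.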
- This embodiment processes each image feature through first blocks: each image feature is passed through multiple first blocks in turn to obtain multiple first feature vectors.
- In each first block, the convolution operation on the convolution layer, the regularization operation on the regularization layer, and the activation operation on the activation function layer are performed on the input of the first block in sequence.
- The motion feature can also be processed through second blocks: the motion feature is passed through multiple second blocks in turn to obtain the second feature vector.
- In each second block, the convolution operation on the convolution layer, the regularization operation on the regularization layer, and the activation operation on the activation function layer are performed on the input of the second block in sequence.
- The neural network model of this embodiment includes a plurality of first blocks, a plurality of second blocks, and a fully connected layer.
- The object recognition result includes the second classification result, which is used to indicate whether the target object appears in the multiple target video frame images, for example, whether rats appear in the multiple target video frame images.
- performing frame sampling on a video file to obtain a group of video frame images includes: sampling a video sequence in the video file at equal intervals to obtain a group of video frame images.
- the video file includes a video sequence.
- The video sequence in the video file is sampled at equal intervals to obtain a set of video frame images. This reduces the calculation amount of the algorithm for determining the target object, so that whether the target object appears in the multiple target video frame images can be determined quickly, which improves the efficiency of determining the target object.
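Equal-interval sampling amounts to keeping every n-th frame. The sketch below shows only the index selection; the function name and the interval value are illustrative assumptions, since the application does not state a specific interval.

```python
from typing import List


def sample_frame_indices(total_frames: int, interval: int) -> List[int]:
    """Return the indices of the frames kept when taking one frame every `interval` frames."""
    if interval < 1:
        raise ValueError("interval must be at least 1")
    return list(range(0, total_frames, interval))


# Keep one frame out of every 25 from a 100-frame sequence.
kept = sample_frame_indices(total_frames=100, interval=25)
```

With an interval of 25, the later detection stages process 4 frames instead of 100, which is where the reduction in computation comes from.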
- Acquiring a video file captured by a camera device on a target area includes: acquiring a video file captured by an infrared low-light night vision camera on the target area, where the video frame images in the video file are images captured by the infrared low-light night vision camera.
- the imaging device may be a camera, for example, an infrared low-light night vision camera, and the infrared low-light night vision camera has an infrared illumination function.
- the target area is photographed by an infrared low-light night vision camera to obtain a video file, and the video frame image in the video file is an image taken by the infrared low-light night vision camera.
- the camera device of this embodiment also includes but is not limited to: motion detection function, networking function (such as WIFI networking) and high-definition (such as greater than 1080p) configuration.
- The method further includes: in the case where it is determined that the target object appears in the multiple target video frame images, determining the position of the target object in the multiple target video frame images; and displaying the position in the multiple target video frame images.
- After determining whether the target object appears in the multiple target video frame images, and in the case where it does, the position of the target object in the multiple target video frame images can be further determined, for example, the position of the mouse in the multiple target video frame images. The position is then displayed in the multiple target video frame images, for example, by displaying icons, text, or other information that indicates the position.
- This embodiment can also obtain information such as the time when the target object appears and its active area in the target area, including the location and time of the target object, its specific active area in the target area, and its frequency of activity in the target area.
- The movement track and other information are output to the front end, which is the display component.
- Information such as the appearance time and active area of the target object can be displayed on the display interface, thereby avoiding the low efficiency caused by determining the target object manually.
- An alarm message can be sent to the front end.
- The alarm information is used to indicate that the target object appears in the target area, so that the relevant prevention and control personnel can take preventive measures, which improves the efficiency of prevention and control of the target object.
- the method for determining the target object is executed by a server set locally.
- The method for determining the target object in this embodiment can be executed by a server set up locally. Without connecting to a cloud server, the above calculation and visualization can be realized internally, which avoids the computing resource and transmission problems that arise when the computation runs on a cloud server.
- This embodiment aims to apply image recognition technology, integrate image features and motion features, automatically detect whether there is a target object in the surveillance video, locate and track the target object, and generate the movement trajectory of the target object and its activity frequency in each target area. The whole process is realized by algorithms, without additional labor cost.
- This embodiment does not need a target capture device to be placed in the target area to determine the target object, and does not need manpower for observation. This not only greatly reduces the labor cost of monitoring the target object but also improves the efficiency of determining the target object, which further facilitates the work of preventing and controlling the target object.
- The following takes a mouse as an example of the target object.
- An embodiment of the present application provides another method for determining a target object, which includes:
- Step S1: Obtain a video file captured by an infrared low-light night vision camera.
- Step S2: Determine whether there are moving objects in the video file.
- Step S3: If there is a moving object, extract a video clip with the moving object.
- Step S4: Perform image feature and dynamic feature extraction on the video clip with the moving object.
- Step S5: Judge whether the moving object is a mouse based on the extracted image features and dynamic features.
- Step S6: If the judgment result is yes, send a prompt message.
- In this method, the video file captured by the infrared low-light night vision camera is acquired; it is determined whether there are moving objects in the video file; if there are, the video clips with moving objects are extracted; image features and dynamic features are extracted from those video clips; whether the moving object is a mouse is judged according to the extracted image features and dynamic features; and if the judgment result is yes, a prompt message is issued. This solves the problem of low efficiency in determining the target object and achieves the effect of improving rodent detection accuracy.
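The control flow of steps S2 to S6 can be sketched as follows. This is only an outline under stated assumptions: the frame type, the three callables (`has_moving_object`, `extract_features`, `is_mouse`), and the prompt text are hypothetical placeholders for the detection, feature extraction, and classification stages, not the application's implementation.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Frame = Sequence[Sequence[int]]  # hypothetical frame type: rows of grayscale pixels


def detect_rodent(
    frames: List[Frame],
    has_moving_object: Callable[[Frame], bool],
    extract_features: Callable[[List[Frame]], Tuple[list, list]],
    is_mouse: Callable[[list, list], bool],
) -> Optional[str]:
    """Run steps S2 to S6 on frames already obtained in step S1."""
    # S2 and S3: keep only the frames in which a moving object is found.
    clip = [frame for frame in frames if has_moving_object(frame)]
    if not clip:
        return None  # no moving object, so the later steps are skipped
    # S4: extract image features and dynamic features from the clip.
    image_features, dynamic_features = extract_features(clip)
    # S5 and S6: classify, and return a prompt message on a positive result.
    if is_mouse(image_features, dynamic_features):
        return "Mouse detected in monitored area"
    return None
```

Returning early when no frame contains motion mirrors the point that feature extraction and classification run only on clips with moving objects.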
- The technical solutions of the embodiments of the present application can serve as a rat infestation video monitoring method that integrates visual features and trajectory features, and can be used in a variety of scenes to detect whether there are mice in the captured video. An infrared low-light night vision camera captures a video file of the current environment, and it is then determined whether there is a moving object. If there is, feature recognition is performed on the extracted video clip of the moving object to further determine whether the moving object is a mouse. If it is determined to be a mouse, a prompt message is issued. The prompt message can be text displayed on a screen, a sound, or another type of prompt such as a light turning on or flashing.
- the surveillance camera adopts an infrared low-light night vision camera.
- the judgment, extraction and other processing processes are performed in the local server, and there is no need to send data to the remote server. It can reduce the amount of data transmission and improve the efficiency of monitoring.
- the position of the moving object in each frame of the picture in the video file is determined; the preset mark is superimposed on the position corresponding to each frame of picture and displayed on the front-end interface.
- the preset mark can be a green or red rectangular frame. Mark the position of the mouse in each frame of the picture with a rectangular frame, so that the user can check the position of the mouse and the area frequently seen in time.
- Judging whether there are moving objects in the video file includes: sampling the video sequence in the video file at equal intervals to obtain sampled video frame images; and judging, through a dynamic target detection algorithm or a neural network-based target detection algorithm, whether there are moving objects in the sampled video frame images.
- If M(x, y) is 1, there is a moving target. All such pixels X(x, y) form the moving target video frame image, and all moving targets can be obtained by merging these pixels through morphological operations.
- Judging whether the moving object is a mouse based on the extracted image features and dynamic features includes: inputting the extracted image features and dynamic features into a pre-trained neural network model, performing model discrimination, and obtaining the model output result; and determining whether the moving object is a mouse according to the model output result.
- the extracted image features and dynamic features can be distinguished by the pre-trained neural network model.
- The model is trained in advance on a large number of samples. Each sample includes a picture and a label indicating whether there is a mouse in the picture. The samples can also include a label giving the number of rats in the picture, which can make the model more accurate.
- the technical solutions of the embodiments of this application can be used in kitchens, restaurants and other application scenarios that need to be monitored for rat infestation, and can also be used in hospitality schools, laboratories, hospitals and other indoor and outdoor places that require environmental hygiene.
- The image recognition technology of the embodiments of this application is used to detect and track rodents.
- An independent device monitors rodent infestations locally through a surveillance camera, in support of rodent control work.
- the embodiments of this application aim to apply image recognition technology, integrate visual and image sequence features, automatically detect whether there is a mouse in the surveillance video, locate and track the mouse, and generate the movement trajectory route of the mouse and the activity frequency of each area.
- The whole process is implemented by algorithms, without additional labor costs, on an independent device that needs no connection to a cloud server; all calculations and visualizations can be implemented internally.
- A rat infestation video monitoring device can include several components: an infrared low-light night vision camera, a data processing module, and a front-end display component.
- The device works as follows: the infrared low-light night vision camera collects the scene video sequence, and the data processing module receives the video sequence and detects whether there is a mouse in the video. If a mouse is detected, information such as the position of the mouse is output to the front-end display interface.
- The front-end display interface displays the mouse's position, appearance time, and activity area, and can immediately raise a rat infestation alarm.
- FIG. 3 is a schematic diagram of a data connection of each module according to an embodiment of the present application.
- As shown in FIG. 3, the video capture module 302 collects video data using an ARM (Advanced RISC Machines) microprocessor, which is a reduced instruction set computer (RISC) processor, and preprocesses the data through the video preprocessing module 3024. The video processing module 304 reads the trained model on an embedded graphics processing unit (GPU) and processes the video according to the deep learning algorithm. If the deep learning network model detects a mouse within a given time segment, the segment and the corresponding detection result are stored in the storage module 306, and the storage module 306 outputs this information to the front end.
- Fig. 4 is a schematic diagram of the principle of a rat infestation detection system according to an embodiment of the present application.
- the algorithm includes the following modules: preprocessing, target detection, motion feature extraction and classification network.
- the input of the system is the original video sequence.
- Preprocessing consists of two steps: frame extraction and dynamic detection.
- The original video sequence is first sampled at equal intervals to reduce the computational cost of the algorithm, and then the target detection algorithm is used to determine whether there are moving objects in the image. If there is no moving object, no subsequent detection is performed. If there is a moving object, the video clips containing the moving object are input to the subsequent modules.
- Each frame of the pre-processed video sequence then undergoes target detection, and image features (such as the visual information in the detection frame at the corresponding location) are acquired at locations where rats may exist. The motion feature extraction module fuses the information across video image frames and extracts features from it, to prevent misjudgment by the single-frame target detector. The extracted motion features and image features are then input into the classification network, which determines whether the object is a mouse. If it is, the rectangular detection frame of the mouse in each frame is transmitted to the front-end display interface.
- The above target detection process provides two algorithms, chosen according to the computing resources of the specific machine: a dynamic target detection algorithm and a neural network-based target detection algorithm.
- The former is fast and has low machine configuration requirements; the latter is accurate and robust.
- The dynamic target detection algorithm includes background difference and frame difference methods, which use formula (1) to calculate the difference between the current frame and the background or the previous frame, where:
- (x, y) is the coordinate of a pixel in the coordinate system whose origin is the upper left corner of the image, whose X axis is the width direction, and whose Y axis is the height direction;
- k is the index of the current frame;
- f represents the current frame;
- b represents the background or the previous frame;
- M(x, y) is the moving image;
- T is a threshold. If M(x, y) is 1, there is a moving target. All such pixels X(x, y) form the moving target video frame image, and merging these pixels through morphological operations yields all moving targets as the output of this module.
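Formula (1) itself does not survive in this text. A common form that is consistent with the definitions above sets M(x, y) to 1 when |f_k(x, y) - b_k(x, y)| > T and to 0 otherwise; the sketch below assumes that form (including the strict inequality) and represents frames as nested lists of grayscale values. The morphological merging step is omitted, and the function name is illustrative.

```python
from typing import List

Image = List[List[int]]  # rows of grayscale pixel values


def motion_mask(current: Image, reference: Image, threshold: int) -> Image:
    """Mark a pixel 1 where the current frame differs from the reference by more than the threshold.

    `reference` is either the background or the previous frame.
    """
    return [
        [1 if abs(c - r) > threshold else 0 for c, r in zip(cur_row, ref_row)]
        for cur_row, ref_row in zip(current, reference)
    ]


background = [[10, 10], [10, 10]]
frame = [[10, 200], [12, 11]]
mask = motion_mask(frame, background, threshold=20)
has_moving_target = any(any(row) for row in mask)
```

Only the pixel that changed from 10 to 200 exceeds the threshold, so small fluctuations such as sensor noise are ignored.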
- Fig. 5 is a schematic diagram of a Faster-RCNN network model according to an embodiment of the present application. As shown in Fig. 5, conv is the convolutional layer: the convolution kernel (which is a matrix) slides a window over the input, each window position of the input is multiplied by the matrix according to formula (3), and the result F is output as the feature of that window position.
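Formula (3) is not reproduced in this text. As an illustration of the sliding-window operation described, the sketch below assumes the usual convolutional-layer form: at each window position, the window and the kernel are multiplied element by element and summed, with no padding, a stride of 1, and no kernel flip. The function name is hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def convolve_valid(image: Matrix, kernel: Matrix) -> Matrix:
    """Slide the kernel over the image and return one feature value per window position."""
    kernel_h, kernel_w = len(kernel), len(kernel[0])
    out_h = len(image) - kernel_h + 1
    out_w = len(image[0]) - kernel_w + 1
    features: Matrix = []
    for top in range(out_h):
        row = []
        for left in range(out_w):
            total = 0.0
            for i in range(kernel_h):
                for j in range(kernel_w):
                    total += image[top + i][left + j] * kernel[i][j]
            row.append(total)
        features.append(row)
    return features


feature_map = convolve_valid(
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    [[1, 0], [0, 1]],
)
```

A 3x3 input with a 2x2 kernel gives a 2x2 feature map, one value F for each window position.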
- RPN is a region proposal network, which proposes a series of candidate frames.
- The region of interest pooling layer takes the region of the feature map extracted by the convolutional layers, at the coordinates output by the RPN, and maps it to a fixed size (w, h).
- The result is input to a classifier and a bounding box regressor composed of fully connected layers, and the bounding box regressor outputs the possible coordinate position of the mouse.
- The output of the classifier is the confidence level that a mouse is at that position.
- The motion feature extraction algorithm first calculates the correlation between the detection frames of successive frames, based on the detection frames obtained in each frame; detection frames with a large correlation are considered to be the same object. The detection frames of each frame are matched to obtain a series of moving pictures of the object, and finally a 3D feature extraction network extracts the features of the motion sequence.
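The text does not say how the correlation between detection frames is computed. One common choice, used here purely as an assumption, is intersection over union (IoU) with greedy matching; the box format `(x1, y1, x2, y2)`, the function names, and the 0.3 threshold are all illustrative.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), with x2 > x1 and y2 > y1


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def match_boxes(prev: List[Box], curr: List[Box], min_iou: float = 0.3) -> List[Tuple[int, int]]:
    """Greedily pair each previous-frame box with its best unmatched current-frame box."""
    matches: List[Tuple[int, int]] = []
    used = set()
    for i, prev_box in enumerate(prev):
        best_j, best_score = None, min_iou
        for j, curr_box in enumerate(curr):
            if j in used:
                continue
            score = iou(prev_box, curr_box)
            if score >= best_score:
                best_j, best_score = j, score
        if best_j is not None:
            used.add(best_j)
            matches.append((i, best_j))
    return matches


pairs = match_boxes(
    [(0, 0, 10, 10), (50, 50, 60, 60)],
    [(51, 51, 61, 61), (1, 1, 11, 11)],
)
```

Chaining such pairs across consecutive frames gives the per-object picture sequence that a 3D feature extraction network would then consume.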
- The classification network fuses the visual information in the target detection frame with the motion features and inputs them into the designed classification network model, which screens out picture sequences that do not contain rats and so reduces the false alarm rate. The results are input into the front-end display interface, which displays the mouse's detection frame and track.
- For the overall framework, detection and recognition can also be achieved, without limitation, through target detection and a classification network, so as to save framework deployment cost.
- The embodiments of this application propose using image recognition algorithms to automatically identify mice in surveillance videos, with no need to place mouse traps or mouse cages and no need to spend manpower on observation. This turns rodent monitoring into an efficient, fully automated process that not only greatly reduces the labor cost of monitoring rodents but also achieves high accuracy, which facilitates supervision of rodent hygiene in back kitchens.
- It can also provide the trajectory of rat movement, which helps personnel choose where to place rodent control tools and facilitates further pest elimination work.
- FIG. 6 is a flowchart of the target object monitoring method according to an embodiment of the present application. As shown in FIG. 6, the process includes the following steps:
- Step S602: When the video surveillance device detects that a moving object appears in the target area, it acquires an image from the target video in which the object appears, where the target video is part of the video obtained by the video surveillance device shooting the target area;
- Step S604: The video surveillance device sends the image to the first server, where the image is used to instruct the first server to determine whether the object is the target object according to the image.
- the target object may include, but is not limited to: rats, pests and other harmful organisms.
- the target area may include, but is not limited to, a kitchen, a warehouse, a factory building, and so on.
- the video monitoring device may include, but is not limited to, a camera, a monitor, and so on.
- the video surveillance device may include, but is not limited to, one or more video surveillance devices.
- the first server may include, but is not limited to, a first cloud server, for example, Ziyouyun.
- the first server determines whether the object appearing in the target area is the target object according to the image obtained from the video surveillance device.
- Here the image is acquired by the video surveillance device, when it detects a moving object in the target area, from the target video in which the object appears, within the video obtained by the video surveillance device shooting the target area.
- The video surveillance device therefore only needs to send an image of the possible object to the first server when a moving object is detected in the target area.
- The first server can then determine whether the object appearing in the target area is the target object. Compared with monitoring the target object based on video, the amount of data transmitted is greatly reduced, which increases the transmission speed, reduces the transmission time, and improves monitoring efficiency. This solves the problem of low efficiency in monitoring the target object in related technologies and achieves the effect of improving the efficiency of monitoring the target object.
- The video surveillance device sends the target video to the second server, where the second server is configured to receive the first request sent by the first server and to send the target video to the first server in response to the first request.
- the video surveillance device receives the second request sent by the first server, and the video surveillance device sends the target video to the first server in response to the second request.
- In the case that the video surveillance device detects that a moving object appears in the target area, it intercepts a video image from the video obtained by shooting the target area at every predetermined time interval, starting from when the object appears in the target area and continuing until the object no longer appears in the target area; the image includes these video images.
- The video surveillance device sending the image to the first server includes: the video surveillance device sends each intercepted video image to the first server in real time; or the video surveillance device acquires an image set including all the intercepted video images and sends the image set to the first server.
- The video surveillance device obtains, from the video obtained by shooting the target area, a first video that runs from the time the object appears in the target area until the object no longer appears in the target area; the video surveillance device acquires a second video covering a first target time period before the object appears in the target area and a third video covering a second target time period after the object no longer appears in the target area; and the video surveillance device determines the second video, the first video, and the third video as the target video.
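The target video is thus the span in which the object is visible, padded on both sides. A minimal sketch of the time window follows, assuming timestamps in seconds and clamping to the bounds of the recording; the function and parameter names are hypothetical.

```python
from typing import Tuple


def target_clip_window(
    appear_s: float,
    disappear_s: float,
    pre_s: float,
    post_s: float,
    video_len_s: float,
) -> Tuple[float, float]:
    """Return (start, end) of the target video.

    The window covers the second video (pre_s before the object appears),
    the first video (object visible), and the third video (post_s after it leaves).
    """
    start = max(0.0, appear_s - pre_s)
    end = min(video_len_s, disappear_s + post_s)
    return start, end


window = target_clip_window(appear_s=12.0, disappear_s=20.0, pre_s=5.0, post_s=5.0, video_len_s=60.0)
```

Clamping is an added safeguard for objects that appear near the start or end of the recording; the application does not discuss that case.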
- The method according to the above embodiments can be implemented by means of software plus the necessary general hardware platform; it can of course also be implemented by hardware, but in many cases the former is the better implementation.
- The technical solution of this application, in essence or in the part that contributes over the existing technology, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions that enable a terminal device (which can be a mobile phone, a computer, a server, a network device, etc.) to execute the methods of the various embodiments of the present application.
- a device for monitoring a target object is also provided, which is applied to the first server.
- the device is used to implement the above-mentioned embodiments and optional implementation manners, and those that have been explained will not be repeated.
- As used below, the term "module" may refer to a combination of software and/or hardware that implements predetermined functions.
- Although the devices described in the following embodiments are preferably implemented in software, implementation in hardware or in a combination of software and hardware is also possible and contemplated.
- Fig. 7 is a first structural block diagram of a device for monitoring a target object according to an embodiment of the present application. As shown in Fig. 7, the device includes:
- the receiving module 72 is configured to receive an image sent by the video surveillance device when a moving object is detected in the target area, where the image is captured from the target video in which the object appears, within the video obtained by the video surveillance device shooting the target area;
- the determining module 74 is configured to determine whether the object is the target object according to the image.
- the target object may include, but is not limited to: rats, pests and other harmful organisms.
- the target area may include, but is not limited to, a kitchen, a warehouse, a factory building, and so on.
- the video monitoring device may include, but is not limited to, a camera, a monitor, and so on.
- the aforementioned camera may include, but is not limited to, a camera with an infrared lighting function, for example, an infrared low-light night vision camera. Further, the camera may also include but is not limited to: motion detection function, storage function, networking function (such as wifi networking) and high-definition (such as greater than 1080p) configuration.
- the video surveillance device may include, but is not limited to, one or more video surveillance devices.
- the first server may include, but is not limited to, a first cloud server, for example, Ziyouyun.
- the above-mentioned apparatus is further configured to obtain the target video in a case where the object is determined to be the target object.
- the above-mentioned apparatus is further configured to: obtain the target video from the video surveillance device; or obtain the target video from a second server, where the target video is sent to the second server by the video surveillance device when it detects a moving object in the target area.
- the above-mentioned apparatus is further configured to send instruction information to the second server when it is determined that the object is not the target object, where the instruction information is used to instruct the second server to delete the target video.
- the above-mentioned device is further configured to determine the movement track of the target object in the target area in the target video.
- the above-mentioned device is further configured to generate prompt information according to the movement track, wherein the prompt information is used to prompt a way to eliminate the target object.
- the above device is further configured to: generate alarm information corresponding to the target object, where the alarm information is used to indicate that the target object appears in the target area and includes at least one of the following: the target video, the movement track, and the prompt information; and send the alarm information to the client.
- the determining module is configured to: identify whether the object in each received video image is the target object to obtain the recognition result corresponding to each video image; merge the recognition results corresponding to all the received video images into a target result; and determine whether the object is the target object according to the target result.
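The text does not specify how the per-image recognition results are merged into the target result. One simple possibility, shown here only as an assumption, is to treat the object as the target object when a minimum fraction of the images are recognized as such; the function name and the 0.5 default are illustrative.

```python
from typing import List


def merge_recognition_results(per_image_hits: List[bool], min_fraction: float = 0.5) -> bool:
    """Merge per-image results: True when at least `min_fraction` of images show the target."""
    if not per_image_hits:
        return False  # no images received, so nothing can be confirmed
    positive = sum(1 for hit in per_image_hits if hit)
    return positive / len(per_image_hits) >= min_fraction


is_target = merge_recognition_results([True, True, False])
```

Voting across images makes the final decision less sensitive to a single misclassified frame than relying on any one image.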
- the determining module is further configured to: determine whether an object appears in each video image received; and identify whether the object in the video image where the object appears is the target object.
- the determining module is configured to: perform target object detection on each target video frame image to obtain the image feature of each target video frame image, where the image includes multiple target video frame images obtained from the target video, each target video frame image is used to indicate the object in the target area, and the image feature is used to indicate the target image area in which the object whose similarity with the target object is greater than the first threshold is located; determine the motion feature according to the image feature of each target video frame image, where the motion feature is used to indicate the motion speed and direction of the object in the multiple target video frame images; and determine, according to the motion feature and the image feature of each target video frame image, whether the target object appears in the multiple target video frame images.
- the determining module is configured to: obtain a target vector corresponding to the target image area represented by the image feature of each target video frame image to obtain multiple target vectors, where each target vector is used to represent the motion speed and direction of the object in the corresponding target video frame image when it passes through the target image area; and compose the multiple target vectors into the first target vector according to the time sequence of each target video frame image in the video file, where the motion feature includes the first target vector; or, obtain a two-dimensional optical flow diagram corresponding to the target image area represented by the image feature of each target video frame image to obtain multiple two-dimensional optical flow diagrams, where each two-dimensional optical flow diagram includes the motion speed and direction of the object in the corresponding target video frame image when it passes through the target image area; and compose the multiple two-dimensional optical flow diagrams into a three-dimensional second target vector according to the time sequence of each target video frame image in the video file, where the motion feature includes the three-dimensional second target vector.
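As an illustration of the first option, the sketch below derives one (speed, direction) pair per pair of consecutive frames from the center of the target image area and keeps the pairs in time order. Using box centers and a fixed frame interval are assumptions made for this example; the application does not state how each target vector is computed.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def motion_vectors(centers: List[Point], frame_interval_s: float) -> List[Tuple[float, float]]:
    """Return time-ordered (speed, direction) pairs between consecutive target-area centers.

    Speed is in pixels per second; direction is the angle in radians from atan2(dy, dx).
    """
    vectors = []
    for (x0, y0), (x1, y1) in zip(centers, centers[1:]):
        dx, dy = x1 - x0, y1 - y0
        speed = math.hypot(dx, dy) / frame_interval_s
        direction = math.atan2(dy, dx)
        vectors.append((speed, direction))
    return vectors


first_target_vector = motion_vectors([(0.0, 0.0), (3.0, 4.0), (3.0, 4.0)], frame_interval_s=1.0)
```

The resulting list plays the role of the first target vector: the per-frame motion descriptions arranged by the time order of the frames.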
- The determining module is configured to: input the motion feature and the image feature of each target video frame image into a pre-trained neural network model to obtain an object recognition result, where the object recognition result indicates whether the target object appears in the multiple target video frame images.
- The determining module is configured to: pass each image feature through a neural network layer structure including a convolution layer, a regularization layer, and an activation function layer to obtain multiple first feature vectors; fuse the multiple first feature vectors with the motion feature to obtain a second feature vector; and input the second feature vector into a fully connected layer for classification to obtain a first classification result, where the neural network model includes the neural network layer structure and the fully connected layer, the object recognition result includes the first classification result, and the first classification result indicates whether the target object appears in the multiple target video frame images; or, pass each image feature through a first neural network layer structure including a convolution layer, a regularization layer, and an activation function layer to obtain multiple first feature vectors; pass the motion feature through a second neural network layer structure including a convolution layer, a regularization layer, and an activation function layer to obtain a second feature vector; fuse the multiple first feature vectors with the second feature vector to obtain a third feature vector; and input the third feature vector into a fully connected layer for classification to obtain a second classification result, where the neural network model includes the first neural network layer structure, the second neural network layer structure, and the fully connected layer, the object recognition result includes the second classification result, and the second classification result indicates whether the target object appears in the multiple target video frame images.
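- The first alternative (per-frame feature vectors fused with the motion feature, then classified by a fully connected layer) can be illustrated with a toy, dependency-free sketch. Several things here are assumptions: fusion is shown as plain concatenation, the "regularization layer" is read as a normalization step, the learned convolution is omitted entirely, and the weights are placeholders rather than trained values. It shows only the data flow, not a usable model.

```python
import math
from typing import List, Sequence

Vector = List[float]


def normalize_and_activate(feature: Sequence[float]) -> Vector:
    """Stand-in for the conv -> normalization -> activation stack.

    Only normalization (zero mean, unit variance) and ReLU are shown;
    a real model would apply learned convolutions first.
    """
    mean = sum(feature) / len(feature)
    var = sum((x - mean) ** 2 for x in feature) / len(feature)
    scale = math.sqrt(var + 1e-5)
    return [max(0.0, (x - mean) / scale) for x in feature]


def fuse(first_vectors: Sequence[Vector], motion_feature: Sequence[float]) -> Vector:
    """Fuse by concatenation: all per-frame vectors, then the motion feature."""
    fused: Vector = []
    for vec in first_vectors:
        fused.extend(vec)
    fused.extend(motion_feature)
    return fused


def fully_connected_classify(
    fused: Sequence[float], weights: Sequence[float], bias: float
) -> float:
    """Single-output fully connected layer followed by a sigmoid."""
    if len(fused) != len(weights):
        raise ValueError("weights must match the fused vector length")
    logit = sum(w * x for w, x in zip(weights, fused)) + bias
    return 1.0 / (1.0 + math.exp(-logit))
```

- The returned value can be read as the probability that the target object appears in the frames; thresholding it gives the yes/no classification result.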
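- The receiving module is configured to receive multiple target video frame images sent by the video surveillance device, where the video surveillance device obtains a set of video frame images by frame sampling the target video and determines the multiple target video frame images within that set according to the pixel values of the pixels in the set; or, to receive a set of video frame images sent by the video surveillance device, where the set is obtained by the video surveillance device frame sampling the target video, and the multiple target video frame images are then determined within the set according to the pixel values of the pixels in the set.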
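- The text does not say how pixel values are used to pick target frames from the sampled set. One common choice, assumed here purely for illustration, is frame differencing: keep a frame when its mean absolute pixel difference from the previous sampled frame exceeds a threshold. The function names, the grayscale nested-list frame format, and the threshold are all hypothetical.

```python
from typing import List, Sequence

Frame = Sequence[Sequence[int]]  # grayscale image as rows of 0-255 values


def sample_frames(video: Sequence[Frame], step: int) -> List[Frame]:
    """Frame sampling: keep every `step`-th frame of the target video."""
    if step < 1:
        raise ValueError("step must be >= 1")
    return list(video[::step])


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Mean absolute per-pixel difference between two equally sized frames."""
    total = 0
    count = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            total += abs(pa - pb)
            count += 1
    return total / count if count else 0.0


def select_target_frames(frames: Sequence[Frame], threshold: float) -> List[Frame]:
    """Keep frames that differ noticeably from the preceding sampled frame."""
    selected: List[Frame] = []
    for prev, cur in zip(frames, frames[1:]):
        if mean_abs_diff(prev, cur) > threshold:
            selected.append(cur)
    return selected
```
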
- Another target object monitoring device is also provided, which is applied to video monitoring equipment.
- The device is used to implement the above embodiments and optional implementations; what has already been described is not repeated.
- As used below, the term "module" may refer to a combination of software and/or hardware that implements a predetermined function.
- Although the devices described in the following embodiments are preferably implemented in software, implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.
- Fig. 8 is a second structural block diagram of a device for monitoring a target object according to an embodiment of the present application. As shown in Fig. 8, the device includes:
- The acquiring module 82 is configured to, when a moving object is detected in the target area, acquire an image from the target video, that is, the portion of the video shot by the video surveillance device of the target area in which the object appears;
- The sending module 84 is configured to send the image to the first server, where the image is used to instruct the first server to determine, according to the image, whether the object is the target object.
- The above device is further configured to send the target video to a second server when a moving object is detected in the target area, where the second server is configured to, upon receiving a first request sent by the first server, send the target video to the first server in response to the first request.
- The above device is further configured to: receive a second request sent by the first server; and send the target video to the first server in response to the second request.
- The acquiring module is configured to: when the video surveillance device detects a moving object in the target area, capture a video image at predetermined intervals from the video that the video surveillance device shoots of the target area, starting when the object appears in the target area and continuing until the object no longer appears in the target area, where the image includes the video images;
- The sending module is configured to: send each captured video image to the first server in real time; or, obtain an image set containing all of the captured video images and send the image set to the first server.
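- Interval capture and the two delivery modes can be sketched as follows. This is an illustrative sketch only: timestamps stand in for captured images, the `send` callback is a hypothetical transport, and treating the disappearance instant as exclusive is an assumption.

```python
from typing import Callable, List


def snapshot_times(appear_ms: int, disappear_ms: int, interval_ms: int) -> List[int]:
    """Capture instants from object appearance until it is no longer present."""
    if interval_ms <= 0:
        raise ValueError("interval_ms must be positive")
    return list(range(appear_ms, disappear_ms, interval_ms))


def send_snapshots(
    times: List[int], send: Callable[[List[int]], None], realtime: bool
) -> None:
    """Deliver snapshots one by one (real time) or as a single image set."""
    if realtime:
        for t in times:
            send([t])  # one message per captured image
    else:
        send(list(times))  # one message carrying the whole set
```
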
- The above device is further configured to: when a moving object is detected in the target area, obtain, from the video shot of the target area, a first video that runs from the moment the object appears in the target area until the object no longer appears in the target area; obtain a second video covering a first target time period before the object appears in the target area and a third video covering a second target time period after the object no longer appears in the target area; and determine the second video, the first video, and the third video as the target video.
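- The composition of the target video (second video, first video, third video) amounts to widening the event interval on both sides. A minimal sketch follows; the `ClipWindow` type and function name are hypothetical, and the 5-second defaults are an assumption that simply mirrors the pre-record/delay example given in the description of the data collection part.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClipWindow:
    start_s: float
    end_s: float


def target_video_window(
    appear_s: float,
    disappear_s: float,
    pre_roll_s: float = 5.0,   # first target time period (assumed value)
    post_roll_s: float = 5.0,  # second target time period (assumed value)
) -> ClipWindow:
    """Return the time span covering second + first + third video."""
    if disappear_s < appear_s:
        raise ValueError("object cannot disappear before it appears")
    start = max(0.0, appear_s - pre_roll_s)  # clamp at start of recording
    return ClipWindow(start, disappear_s + post_roll_s)
```
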
- Each of the above modules can be implemented in software or hardware.
- In the hardware case, this can be achieved in, but is not limited to, the following ways: the above modules are all located in the same processor; or, the above modules are distributed, in any combination, across different processors.
- FIG. 9 is a structural block diagram of the target object monitoring system according to an embodiment of the present application. As shown in FIG. 9, the system includes a video monitoring device 92 and a first server 94, where:
- the video monitoring device 92 is connected to the first server 94;
- the video monitoring device 92 is configured to, when a moving object is detected in the target area, obtain an image from the target video, that is, the portion of the video shot of the target area in which the object appears, and send the image to the first server 94;
- the first server 94 is configured to determine whether the object is a target object based on the image.
- The video surveillance device is configured to: when a moving object is detected in the target area, capture a video image at predetermined intervals from the video that the video surveillance device shoots of the target area, starting when the object appears in the target area and continuing until the object no longer appears in the target area, where the image includes the video images; and send each captured video image to the first server in real time, or obtain an image set containing all of the captured video images and send the image set to the first server.
- The first server is configured to: identify whether the object in each received video image is the target object, obtaining a recognition result corresponding to each video image; fuse the recognition results corresponding to all of the received video images into a target result; and determine, according to the target result, whether the object is the target object.
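- The fusion rule itself is not specified in the text. As one plausible reading, the sketch below fuses per-image boolean results with a ratio vote; the function name and the `min_positive_ratio` parameter are assumptions made for illustration.

```python
from typing import Sequence


def fuse_recognition_results(
    per_image: Sequence[bool], min_positive_ratio: float = 0.5
) -> bool:
    """Fuse per-image results into one target result by ratio voting.

    Returns True when the share of images recognized as the target
    object reaches `min_positive_ratio` (an assumed, tunable rule).
    """
    if not per_image:
        return False  # no evidence, so no positive decision
    positives = sum(1 for result in per_image if result)
    return positives / len(per_image) >= min_positive_ratio
```

- A stricter deployment might require consecutive positives instead of a ratio; the point is only that a single misclassified frame need not decide the outcome.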
- The first server is further configured to: when the object is determined to be the target object, obtain the target video; determine, in the target video, the movement track of the target object in the target area; generate prompt information according to the movement track, where the prompt information indicates a way to eliminate the target object; and generate alarm information corresponding to the target object, where the alarm information indicates that the target object has appeared in the target area and includes at least one of the following: the target video, the movement track, and the prompt information.
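- The "at least one of" constraint on the alarm information can be captured in a small record type. This is a hypothetical structure for illustration; the field names, the `area_id` field, and the use of a string reference for the video are assumptions, not part of the application.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class AlarmInfo:
    """Alarm record; must carry at least one of video, track, or prompt."""

    area_id: str
    target_video_ref: Optional[str] = None
    moving_track: Optional[List[Tuple[float, float]]] = None
    prompt: Optional[str] = None

    def __post_init__(self) -> None:
        payload = (self.target_video_ref, self.moving_track, self.prompt)
        if all(item is None for item in payload):
            raise ValueError(
                "alarm must include the target video, the track, or a prompt"
            )
```
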
- The above system further includes a client, where the first server is connected to the client; the first server is configured to send the alarm information to the client; and the client is configured to display the alarm information on a display interface.
- The above system further includes a second server, where the second server is connected to the video monitoring device and the first server; the video monitoring device is further configured to send the video to the second server; the second server is configured to store the target video; and the first server is configured to obtain the target video from the second server.
- The first server is further configured to send instruction information to the second server when it determines that the object is not the target object; the second server is configured to delete the target video in response to the instruction information.
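- The interaction between the two servers (fetch the stored video on a positive decision, delete it on a negative one) can be sketched with an in-memory stand-in. The class, method names, and `video_id` key are hypothetical; a real second server would be a remote storage service.

```python
from typing import Dict, Optional


class SecondServer:
    """In-memory stand-in for the video storage server."""

    def __init__(self) -> None:
        self._videos: Dict[str, bytes] = {}

    def store(self, video_id: str, data: bytes) -> None:
        self._videos[video_id] = data

    def fetch(self, video_id: str) -> Optional[bytes]:
        return self._videos.get(video_id)

    def delete(self, video_id: str) -> None:
        # Deleting an unknown id is treated as a no-op.
        self._videos.pop(video_id, None)


def handle_decision(
    is_target: bool, video_id: str, storage: SecondServer
) -> Optional[bytes]:
    """First-server logic: fetch on a positive result, delete otherwise."""
    if is_target:
        return storage.fetch(video_id)
    storage.delete(video_id)
    return None
```
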
- The video monitoring device is further configured to: obtain, from the video shot of the target area, a first video that runs from the moment the object appears in the target area until the object no longer appears in the target area; obtain a second video covering a first target time period before the object appears in the target area and a third video covering a second target time period after the object no longer appears in the target area; and determine the second video, the first video, and the third video as the target video.
- FIG. 10 is a schematic diagram of a monitoring architecture of a target object according to an optional embodiment of the present application.
- A system architecture is proposed for collecting information on the external environment and on pest activity. The system can be deployed rapidly: no server needs to be installed at the customer site, only video surveillance equipment to collect data and a wireless network environment to upload it. All subsequent computation and analysis is completed in the cloud, which greatly reduces the hardware cost of the system and the complexity of deployment, while still providing functions such as real-time pest warnings, video playback, path analysis, and rodent and pest control recommendations. The system also combines pest monitoring with pest control, forming a beneficial closed loop that supports the actual pest control work as a whole.
- The system includes the following parts: a data collection part, a data analysis part, an instant alarm part, a video playback part, a path analysis part, and an application (APP) display part.
- The data collection part collects videos and picture sets.
- Multiple sets of monitoring equipment can be deployed in one indoor environment. Because rats tend to appear at night, the video surveillance equipment needs an infrared night vision function.
- The video surveillance equipment uses motion detection. When the content of the captured picture changes in any way (for example, when a mouse or a cockroach appears, or a foreign object flies in), the video for that period is written to the SD card (usually with 5 seconds of pre-recording and delayed recording, so that the video captures the complete action), and the video data is uploaded to the video cloud server (i.e., Fluorite Cloud or another public cloud).
- The video surveillance equipment supports resumable transmission: when the network environment is unstable, an interrupted upload can be resumed, ensuring that the video still reaches the video cloud server later.
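- Resumable transmission is not detailed in the text. A common approach, assumed here for illustration, is chunked upload that retries from the last acknowledged byte offset. The `send_chunk` callback, chunk size, and retry limit are all hypothetical.

```python
from typing import Callable


def upload_with_resume(
    data: bytes,
    send_chunk: Callable[[int, bytes], bool],
    chunk_size: int = 4,
    max_failures: int = 10,
) -> int:
    """Upload `data` in chunks, resuming from the last acknowledged offset.

    `send_chunk(offset, chunk)` returns True when the chunk was accepted.
    Returns the number of bytes acknowledged by the receiver.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be >= 1")
    offset = 0
    failures = 0
    while offset < len(data) and failures < max_failures:
        chunk = data[offset:offset + chunk_size]
        if send_chunk(offset, chunk):
            offset += len(chunk)  # advance only on acknowledgement
        else:
            failures += 1  # retry the same offset on the next pass
    return offset
```
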
- The video cloud server temporarily stores the video data so that, once image recognition on the pictures confirms the presence of pests or rodents, the video can be retrieved for playback and further analysis.
- While saving and uploading the video, the video surveillance equipment also saves a picture every 500 milliseconds (ms) and uploads each picture to the self-owned cloud server in real time for image recognition.
- After receiving a picture, the self-owned cloud server immediately performs image recognition on it, using artificial intelligence (AI) technology to determine whether the image contains target pests such as mice or cockroaches, or merely shows a non-pest event such as a foreign object flying in. Processing then moves to the data analysis part.
- The data analysis part performs image recognition in the self-owned cloud, applying image recognition algorithms to the images returned by the video surveillance equipment in order to recognize rats, cockroaches, and other pests.
- When the image recognition result is true, rodents or pests are considered to have been found at that moment, and a request is sent to the video cloud server to retrieve and download the video data for that time period for further analysis (once the server has received the continuous picture set and judged that a pest intrusion has occurred, it requests the video of the entire time period in real time). When the recognition result is false, the motion detected at that moment is considered unrelated to pests and no further processing is performed.
- The instant alarm part can be used for emergency rodent control.
- The cloud server sends an alarm message to the user terminal to prompt restaurant operators and pest control personnel to take measures. It also provides image playback in which the identified pests, such as rats and cockroaches, are marked, so that the operator can make a preliminary judgment about the animals' location and the hazard they pose and take timely control measures.
- The emergency rodent control scenario suits the monitoring of staffed places where rodent infestation cannot be tolerated, such as computer rooms and hospitals. Relevant personnel are instructed to take measures as soon as rodent activity is discovered, and the system provides pictures and video playback promptly as a reference for rodent control.
- The alarm information can also be sent via SMS, push notification, and so on.
- Video playback part: once the video cloud server has returned the requested video data and it has been downloaded to the self-owned cloud, the user terminal can access the video playback data.
- The speed of the video download depends on network conditions and is slightly slower than the real-time picture display. Generally, the video playback data is available within a few minutes after the rodent activity occurs.
- The path analysis part further analyzes the video data to extract the movement paths of pests such as mice and cockroaches, and marks information such as the intrusion point, hiding point, travel route, activity duration, and skin color observed during the infestation. This information is displayed on the user terminal and is used to formulate further rodent and insect control plans.
- The path can be displayed with marked points, with a sequence of increasing numbers along the line segments indicating the direction in which the mouse or cockroach moved.
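- The numbered-point path display and the duration summary can be sketched from a list of tracked positions. The `(timestamp, x, y)` layout, the function name, and the dictionary keys are assumptions; in particular, treating the first and last observed positions as candidate intrusion and hiding points is a simplification, not something the text prescribes.

```python
from typing import Dict, List, Tuple

TrackPoint = Tuple[float, float, float]  # (timestamp_s, x, y), hypothetical layout


def summarize_track(points: List[TrackPoint]) -> Dict[str, object]:
    """Order track points by time and label them 1..N to show direction."""
    if not points:
        raise ValueError("track must contain at least one point")
    ordered = sorted(points, key=lambda p: p[0])
    labels = [(i + 1, (x, y)) for i, (_, x, y) in enumerate(ordered)]
    return {
        "first_seen_point": labels[0][1],   # candidate intrusion point
        "last_seen_point": labels[-1][1],   # candidate hiding point
        "duration_s": ordered[-1][0] - ordered[0][0],
        "numbered_points": labels,
    }
```
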
- The APP display part can display rodent and insect control recommendations for routine pest control. It summarizes the pest information collected at each contact point and visualizes the historical paths of pests and rodents. Based on these locations, it suggests where to place equipment such as sticky boards and cockroach houses.
- The data dimensions used for display can also include how long pests and rodents were active during the previous day or night, the types of pests, and the number caught.
- An embodiment of the present application also provides a storage medium in which a computer program is stored, where the computer program is configured to execute, when run, the steps in any of the foregoing method embodiments.
- The foregoing storage medium may be configured to store a computer program for executing the following steps:
- S1: The first server receives an image sent by the video surveillance device when a moving object is detected in the target area, where the image is obtained from the target video, that is, the portion of the video shot by the video surveillance device of the target area in which the object appears;
- S2: The first server determines whether the object is the target object according to the image.
- The foregoing storage medium may include, but is not limited to, various media that can store computer programs, such as a USB flash drive, read-only memory (ROM), random access memory (RAM), a removable hard disk, a magnetic disk, or an optical disk.
- An embodiment of the present application also provides an electronic device, including a memory and a processor, the memory stores a computer program, and the processor is configured to run the computer program to execute the steps in any one of the foregoing method embodiments.
- The aforementioned electronic device may further include a transmission device and an input/output device, where the transmission device is connected to the processor and the input/output device is connected to the processor.
- The foregoing processor may be configured to execute the following steps through a computer program:
- S1: The first server receives an image sent by the video surveillance device when a moving object is detected in the target area, where the image is obtained from the target video, that is, the portion of the video shot by the video surveillance device of the target area in which the object appears;
- S2: The first server determines whether the object is the target object according to the image.
- The modules or steps of the present application can be implemented by a general-purpose computing device, and they can be concentrated on a single computing device or distributed across a network of multiple computing devices.
- They can be implemented as program code executable by the computing device, so that they can be stored in a storage device and executed by the computing device, and in some cases the steps can be executed in an order different from the one described here.
- In this application, the first server receives an image sent by the video surveillance device when a moving object is detected in the target area, where the image is obtained from the target video, that is, the portion of the video shot by the video surveillance device of the target area in which the object appears, and the first server determines whether the object is the target object according to the image. In other words, the first server determines whether the object appearing in the target area is the target object from an image that the video surveillance device obtains when it detects a moving object in the target area. The video surveillance device therefore only needs to send an image of the candidate object to the first server when it detects a moving object in the target area, and the first server can determine from the received image whether the object in the target area is the target object.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Computing Systems (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Life Sciences & Earth Sciences (AREA)
- Molecular Biology (AREA)
- Artificial Intelligence (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Alarm Systems (AREA)
- Closed-Circuit Television Systems (AREA)
- Image Analysis (AREA)
Claims (34)
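- A method for monitoring a target object, comprising: receiving, by a first server, an image sent by a video surveillance device when a moving object is detected in a target area, wherein the image is obtained from a target video, the target video being the portion, in which the object appears, of a video shot by the video surveillance device of the target area; and determining, by the first server according to the image, whether the object is a target object.
- The method according to claim 1, wherein after the first server determines, according to the image, whether the object is the target object, the method further comprises: when the object is determined to be the target object, obtaining, by the first server, the target video.
- The method according to claim 2, wherein obtaining the target video by the first server comprises: obtaining, by the first server, the target video from the video surveillance device; or obtaining, by the first server, the target video from a second server, wherein the target video is sent to the second server by the video surveillance device when a moving object is detected in the target area.
- The method according to claim 3, wherein after the first server determines, according to the image, whether the object is the target object, the method further comprises: when the object is determined not to be the target object, sending, by the first server, instruction information to the second server, wherein the instruction information instructs the second server to delete the target video.
- The method according to claim 2, wherein after the first server obtains the target video, the method further comprises: determining, by the first server, in the target video, a movement track of the target object in the target area.
- The method according to claim 5, wherein after the first server determines, in the target video, the movement track of the target object in the target area, the method further comprises: generating, by the first server, prompt information according to the movement track, wherein the prompt information indicates a way to eliminate the target object.
- The method according to claim 6, wherein after the first server generates the prompt information according to the movement track, the method further comprises: generating, by the first server, alarm information corresponding to the target object, wherein the alarm information indicates that the target object has appeared in the target area, and the alarm information includes at least one of the following: the target video, the movement track, and the prompt information; and sending, by the first server, the alarm information to a client.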
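- The method according to claim 1, wherein before the first server receives the image sent by the video surveillance device when a moving object is detected in the target area, the method further comprises: when a moving object is detected in the target area, capturing, by the video surveillance device, a video image at predetermined intervals from the video shot by the video surveillance device of the target area, starting when the object appears in the target area and continuing until the object no longer appears in the target area, wherein the image includes the video images; and sending, by the video surveillance device, the captured video images to the first server in real time; or obtaining, by the video surveillance device, an image set including all of the captured video images, and sending the image set to the first server.
- The method according to claim 8, wherein determining, by the first server according to the image, whether the object is the target object comprises: identifying, by the first server, whether the object in each received video image is the target object, to obtain a recognition result corresponding to each video image; fusing, by the first server, the recognition results corresponding to all of the received video images into a target result; and determining, by the first server according to the target result, whether the object is the target object.
- The method according to claim 9, wherein identifying, by the first server, whether the object in each received video image is the target object comprises: determining, by the first server, whether the object appears in each received video image; and identifying, by the first server, whether the object in each video image in which the object appears is the target object.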
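- The method according to claim 1, wherein determining, by the first server according to the image, whether the object is the target object comprises: performing, by the first server, target object detection on each target video frame image to obtain an image feature of each target video frame image, wherein the image includes multiple target video frame images obtained from the target video, each target video frame image indicates the object in the target area, and the image feature represents the target image area in which an object whose similarity to the target object is greater than a first threshold is located; determining, by the first server, a motion feature according to the image feature of each target video frame image, wherein the motion feature represents the motion speed and motion direction of the object in the multiple target video frame images; and determining, by the first server according to the motion feature and the image feature of each target video frame image, whether the target object appears in the multiple target video frame images.
- The method according to claim 11, wherein determining, by the first server, the motion feature according to the image feature of each target video frame image comprises: obtaining a target vector corresponding to the target image area represented by the image feature of each target video frame image, to obtain multiple target vectors, wherein each target vector represents the motion speed and motion direction of the object in the corresponding target video frame image as it passes through the target image area; and arranging the multiple target vectors, in the time order of the target video frame images in the video file, into a first target vector, wherein the motion feature includes the first target vector; or obtaining a two-dimensional optical flow map corresponding to the target image area represented by the image feature of each target video frame image, to obtain multiple two-dimensional optical flow maps, wherein each two-dimensional optical flow map contains the motion speed and motion direction of the object in the corresponding target video frame image as it passes through the target image area; and arranging the multiple two-dimensional optical flow maps, in the time order of the target video frame images in the video file, into a three-dimensional second target vector, wherein the motion feature includes the three-dimensional second target vector.
- The method according to claim 11, wherein determining, by the first server according to the motion feature and the image feature of each target video frame image, whether the target object appears in the multiple target video frame images comprises: inputting the motion feature and the image feature of each target video frame image into a pre-trained neural network model to obtain an object recognition result, wherein the object recognition result indicates whether the target object appears in the multiple target video frame images.
- The method according to claim 13, wherein inputting the motion feature and the image feature of each target video frame image into the pre-trained neural network model to obtain the object recognition result comprises: passing each image feature through a neural network layer structure including a convolution layer, a regularization layer, and an activation function layer to obtain multiple first feature vectors; fusing the multiple first feature vectors with the motion feature to obtain a second feature vector; and inputting the second feature vector into a fully connected layer for classification to obtain a first classification result, wherein the neural network model includes the neural network layer structure and the fully connected layer, the object recognition result includes the first classification result, and the first classification result indicates whether the target object appears in the multiple target video frame images; or passing each image feature through a first neural network layer structure including a convolution layer, a regularization layer, and an activation function layer to obtain multiple first feature vectors; passing the motion feature through a second neural network layer structure including a convolution layer, a regularization layer, and an activation function layer to obtain a second feature vector; fusing the multiple first feature vectors with the second feature vector to obtain a third feature vector; and inputting the third feature vector into a fully connected layer for classification to obtain a second classification result, wherein the neural network model includes the first neural network layer structure, the second neural network layer structure, and the fully connected layer, the object recognition result includes the second classification result, and the second classification result indicates whether the target object appears in the multiple target video frame images.
- The method according to claim 11, wherein receiving, by the first server, the image sent by the video surveillance device when a moving object is detected in the target area comprises: receiving, by the first server, the multiple target video frame images sent by the video surveillance device, wherein the multiple target video frame images are obtained by the video surveillance device frame sampling the target video to obtain a set of video frame images and determining the multiple target video frame images in the set according to the pixel values of the pixels in the set of video frame images; or receiving, by the first server, a set of video frame images sent by the video surveillance device, wherein the set of video frame images is obtained by the video surveillance device frame sampling the target video; and determining, by the first server, the multiple target video frame images in the set of video frame images according to the pixel values of the pixels in the set of video frame images.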
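- The method according to any one of claims 1 to 15, wherein the first server comprises a first cloud server.
- The method according to claim 3, wherein the second server comprises a second cloud server.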
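- A method for monitoring a target object, comprising: when a moving object is detected in a target area, obtaining, by a video surveillance device, an image from a target video, the target video being the portion, in which the object appears, of a video shot by the video surveillance device of the target area; and sending, by the video surveillance device, the image to a first server, wherein the image is used to instruct the first server to determine, according to the image, whether the object is a target object.
- The method according to claim 18, wherein when a moving object is detected in the target area, the method further comprises: sending, by the video surveillance device, the target video to a second server, wherein the second server is configured to, upon receiving a first request sent by the first server, send the target video to the first server in response to the first request.
- The method according to claim 18, wherein after the video surveillance device sends the image to the first server, the method further comprises: receiving, by the video surveillance device, a second request sent by the first server; and sending, by the video surveillance device, the target video to the first server in response to the second request.
- The method according to claim 18, wherein obtaining the image from the target video in which the object appears in the video shot by the video surveillance device of the target area comprises: when a moving object is detected in the target area, capturing, by the video surveillance device, a video image at predetermined intervals from the video shot by the video surveillance device of the target area, starting when the object appears in the target area and continuing until the object no longer appears in the target area, wherein the image includes the video images; and sending, by the video surveillance device, the image to the first server comprises: sending, by the video surveillance device, the captured video images to the first server in real time; or obtaining, by the video surveillance device, an image set including all of the captured video images, and sending the image set to the first server.
- The method according to claim 18, wherein when a moving object is detected in the target area, the method further comprises: obtaining, by the video surveillance device, from the video shot of the target area, a first video that runs from the moment the object appears in the target area until the object no longer appears in the target area; obtaining, by the video surveillance device, a second video covering a first target time period before the object appears in the target area and a third video covering a second target time period after the object no longer appears in the target area; and determining, by the video surveillance device, the second video, the first video, and the third video as the target video.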
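- A system for monitoring a target object, comprising a video surveillance device and a first server, wherein the video surveillance device is connected to the first server; the video surveillance device is configured to, when a moving object is detected in a target area, obtain an image from a target video, the target video being the portion, in which the object appears, of a video shot of the target area, and send the image to the first server; and the first server is configured to determine, according to the image, whether the object is a target object.
- The system according to claim 23, wherein the video surveillance device is configured to: when a moving object is detected in the target area, capture a video image at predetermined intervals from the video shot by the video surveillance device of the target area, starting when the object appears in the target area and continuing until the object no longer appears in the target area, wherein the image includes the video images; and send the captured video images to the first server in real time; or obtain an image set including all of the captured video images, and send the image set to the first server.
- The system according to claim 24, wherein the first server is configured to: identify whether the object in each received video image is the target object, to obtain a recognition result corresponding to each video image; fuse the recognition results corresponding to all of the received video images into a target result; and determine, according to the target result, whether the object is the target object.
- The system according to claim 23, wherein the first server is further configured to: when the object is determined to be the target object, obtain the target video; determine, in the target video, a movement track of the target object in the target area; generate prompt information according to the movement track, wherein the prompt information indicates a way to eliminate the target object; and generate alarm information corresponding to the target object, wherein the alarm information indicates that the target object has appeared in the target area, and the alarm information includes at least one of the following: the target video, the movement track, and the prompt information.
- The system according to claim 26, wherein the system further comprises a client, wherein the first server is connected to the client; the first server is configured to send the alarm information to the client; and the client is configured to display the alarm information on a display interface.
- The system according to claim 26, wherein the system further comprises a second server, wherein the second server is connected to the video surveillance device and the first server; the video surveillance device is further configured to send the video to the second server; the second server is configured to store the target video; and the first server is configured to obtain the target video from the second server.
- The system according to claim 28, wherein the first server is further configured to: when the object is determined not to be the target object, send instruction information to the second server; and the second server is configured to delete the target video in response to the instruction information.
- The system according to claim 26, wherein the video surveillance device is further configured to: obtain, from the video shot of the target area, a first video that runs from the moment the object appears in the target area until the object no longer appears in the target area; obtain a second video covering a first target time period before the object appears in the target area and a third video covering a second target time period after the object no longer appears in the target area; and determine the second video, the first video, and the third video as the target video.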
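- A device for monitoring a target object, applied to a first server, comprising: a receiving module, configured to receive an image sent by a video surveillance device when a moving object is detected in a target area, wherein the image is obtained from a target video, the target video being the portion, in which the object appears, of a video shot by the video surveillance device of the target area; and a determining module, configured to determine, according to the image, whether the object is a target object.
- A device for monitoring a target object, applied to a video surveillance device, comprising: an acquiring module, configured to, when a moving object is detected in a target area, obtain an image from a target video, the target video being the portion, in which the object appears, of a video shot by the video surveillance device of the target area; and a sending module, configured to send the image to a first server, wherein the image is used to instruct the first server to determine, according to the image, whether the object is a target object.
- A storage medium in which a computer program is stored, wherein the computer program is configured to execute, when run, the method according to any one of claims 1 to 22.
- An electronic device, comprising a memory and a processor, wherein a computer program is stored in the memory, and the processor is configured to run the computer program to execute the method according to any one of claims 1 to 22.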
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2019570566A JP7018462B2 (ja) | 2019-01-24 | 2019-04-01 | Method, device and system for monitoring target object |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910068774.0 | 2019-01-24 | | |
CN201910068774.0A CN109919009A (zh) | 2019-01-24 | 2019-01-24 | Method, device and system for monitoring target object |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2020151084A1 (zh) | 2020-07-30 |
Family
ID=66960691
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2019/080747 WO2020151084A1 (zh) | Method, device and system for monitoring target object | 2019-01-24 | 2019-04-01 |
Country Status (3)
Country | Link |
---|---|
JP (1) | JP7018462B2 (zh) |
CN (1) | CN109919009A (zh) |
WO (1) | WO2020151084A1 (zh) |
Cited By (20)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
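CN112101344A (zh) * | 2020-08-25 | 2020-12-18 | 腾讯科技(深圳)有限公司 | Video text tracking method and device |
CN112199993A (zh) * | 2020-09-01 | 2021-01-08 | 广西大学 | Method for identifying an infrared image detection model of substation insulators in arbitrary orientations based on artificial intelligence |
CN112437274A (zh) * | 2020-11-17 | 2021-03-02 | 浙江大华技术股份有限公司 | Snapshot picture transmission method and snapshot camera |
CN112565863A (zh) * | 2020-11-26 | 2021-03-26 | 深圳Tcl新技术有限公司 | Video playing method, device, terminal equipment and computer-readable storage medium |
CN112633131A (zh) * | 2020-12-18 | 2021-04-09 | 宁波长壁流体动力科技有限公司 | Underground automatic machine-following method based on deep-learning video recognition |
CN112784738A (zh) * | 2021-01-21 | 2021-05-11 | 上海云从汇临人工智能科技有限公司 | Moving target detection and alarm method, device and computer-readable storage medium |
CN112836089A (zh) * | 2021-01-28 | 2021-05-25 | 浙江大华技术股份有限公司 | Method and device for confirming motion track, storage medium and electronic device |
CN113055654A (zh) * | 2021-03-26 | 2021-06-29 | 太原师范学院 | Lossy compression method for video streams in edge devices |
CN113221800A (zh) * | 2021-05-24 | 2021-08-06 | 珠海大横琴科技发展有限公司 | Monitoring and judging method and system for target to be detected |
CN113435368A (zh) * | 2021-06-30 | 2021-09-24 | 青岛海尔科技有限公司 | Method and device for identifying monitoring data, storage medium and electronic device |
CN113609317A (zh) * | 2021-09-16 | 2021-11-05 | 杭州海康威视数字技术股份有限公司 | Image library construction method, device and electronic equipment |
CN114241420A (zh) * | 2021-12-20 | 2022-03-25 | 国能(泉州)热电有限公司 | Hot work detection method and device |
CN114403047A (zh) * | 2022-02-09 | 2022-04-29 | 上海依蕴宠物用品有限公司 | Health intervention method and system for aged animals based on image analysis technology |
CN115150371A (zh) * | 2022-08-31 | 2022-10-04 | 深圳市万佳安物联科技股份有限公司 | Cloud-platform-based service processing method, system and storage medium |
CN115187916A (zh) * | 2022-09-13 | 2022-10-14 | 太极计算机股份有限公司 | Method, device, equipment and medium for epidemic prevention and control in buildings based on spatio-temporal correlation |
CN115457447A (zh) * | 2022-11-07 | 2022-12-09 | 浙江莲荷科技有限公司 | Moving object recognition method, device, system, electronic equipment and storage medium |
CN116684626A (zh) * | 2023-08-04 | 2023-09-01 | 广东星云开物科技股份有限公司 | Video compression method and shared vending cabinet |
CN116890668A (zh) * | 2023-09-07 | 2023-10-17 | 国网浙江省电力有限公司台州供电公司 | Safe charging method and charging device with synchronized, interconnected information |
CN117392596A (zh) * | 2023-09-07 | 2024-01-12 | 中关村科学城城市大脑股份有限公司 | Data processing method, device, electronic equipment and computer-readable medium |
CN117671597A (zh) * | 2023-12-25 | 2024-03-08 | 北京大学长沙计算与数字经济研究院 | Method for constructing mouse detection model, and mouse detection method and device |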
Families Citing this family (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
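CN110472492A (zh) * | 2019-07-05 | 2019-11-19 | 平安国际智慧城市科技股份有限公司 | Target organism detection method, device, computer equipment and storage medium |
CN110516535A (zh) * | 2019-07-12 | 2019-11-29 | 杭州电子科技大学 | Deep-learning-based mouse activity detection method and system, and hygiene assessment method |
CN111753609B (zh) * | 2019-08-02 | 2023-12-26 | 杭州海康威视数字技术股份有限公司 | Target recognition method, device and camera |
CN110674793A (zh) * | 2019-10-22 | 2020-01-10 | 上海秒针网络科技有限公司 | Method and system for monitoring capping of condiment containers |
CN111126317B (zh) * | 2019-12-26 | 2023-06-23 | 腾讯科技(深圳)有限公司 | Image processing method, device, server and storage medium |
CN111553238A (zh) * | 2020-04-23 | 2020-08-18 | 北京大学深圳研究生院 | Regression-classification module and method for temporal localization of actions |
CN111611938B (zh) * | 2020-05-22 | 2023-08-29 | 浙江大华技术股份有限公司 | Method and device for determining retrograde direction |
EP3929801A1 (en) * | 2020-06-25 | 2021-12-29 | Axis AB | Training of an object recognition neural network |
CN112001457A (zh) * | 2020-07-14 | 2020-11-27 | 浙江大华技术股份有限公司 | Image preprocessing method, device, system and computer-readable storage medium |
CN111898581B (zh) * | 2020-08-12 | 2024-05-17 | 成都佳华物链云科技有限公司 | Animal detection method, device, electronic equipment and readable storage medium |
CN112311966A (zh) * | 2020-11-13 | 2021-02-02 | 深圳市前海手绘科技文化有限公司 | Method and device for producing dynamic shots in short videos |
CN112861826B (zh) * | 2021-04-08 | 2021-12-14 | 重庆工程职业技术学院 | Coal mine supervision method, system, equipment and storage medium based on video images |
CN113487821A (zh) * | 2021-07-30 | 2021-10-08 | 重庆予胜远升网络科技有限公司 | Machine-vision-based system and method for identifying foreign object intrusion into power equipment |
CN114051124B (zh) * | 2022-01-17 | 2022-05-20 | 深圳市华付信息技术有限公司 | Video surveillance method, device, equipment and storage medium supporting multi-area monitoring |
CN115091472B (zh) * | 2022-08-26 | 2022-11-22 | 珠海市南特金属科技股份有限公司 | Artificial-intelligence-based target positioning method and clamping manipulator control system |
TWI826129B (zh) * | 2022-11-18 | 2023-12-11 | 英業達股份有限公司 | Cycle time detection and correction system and method |
CN117221391B (zh) * | 2023-11-09 | 2024-02-23 | 天津华来科技股份有限公司 | Smart camera push method, device and equipment based on large visual-semantic model |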
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20160366346A1 (en) * | 2015-06-12 | 2016-12-15 | Google Inc. | Using infrared images of a monitored scene to identify windows |
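CN106559645A (zh) * | 2015-09-25 | 2017-04-05 | 杭州海康威视数字技术股份有限公司 | Camera-based monitoring method, system and device |
CN106878666A (zh) * | 2015-12-10 | 2017-06-20 | 杭州海康威视数字技术股份有限公司 | Method, device and system for finding target object based on surveillance camera |
CN107358160A (zh) * | 2017-06-08 | 2017-11-17 | 小草数语(北京)科技有限公司 | Terminal surveillance video processing method, monitoring terminal and server |
CN108259830A (zh) * | 2018-01-25 | 2018-07-06 | 深圳冠思大数据服务有限公司 | Cloud-server-based intelligent rodent infestation monitoring system and method |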
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2004266735A (ja) | 2003-03-04 | 2004-09-24 | Ecore Kk | Mouse monitoring system |
US7746378B2 (en) * | 2004-10-12 | 2010-06-29 | International Business Machines Corporation | Video analysis, archiving and alerting methods and apparatus for a distributed, modular and extensible video surveillance system |
CN101854516B (zh) * | 2009-04-02 | 2014-03-05 | 北京中星微电子有限公司 | Video surveillance system, video surveillance server, and video surveillance method |
JP2011197365A (ja) | 2010-03-19 | 2011-10-06 | Panasonic Corp | Video display device and video display method |
WO2017208356A1 (ja) | 2016-05-31 | 2017-12-07 | 株式会社オプティム | IoT control system, IoT control method, and program |
WO2019043855A1 (ja) | 2017-08-31 | 2019-03-07 | 三菱電機株式会社 | Data transmission device, data processing system, and data transmission method |
2019
- 2019-01-24: CN application CN201910068774.0A, published as CN109919009A (zh); status: Pending
- 2019-04-01: WO application PCT/CN2019/080747, published as WO2020151084A1 (zh); status: Application Filing
- 2019-04-01: JP application JP2019570566A, published as JP7018462B2 (ja); status: Active
Cited By (30)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
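CN112101344A (zh) * | 2020-08-25 | 2020-12-18 | 腾讯科技(深圳)有限公司 | Video text tracking method and apparatus |
CN112199993B (zh) * | 2020-09-01 | 2022-08-09 | 广西大学 | Artificial-intelligence-based method for a detection model that identifies arbitrarily oriented substation insulators in infrared images |
CN112199993A (zh) * | 2020-09-01 | 2021-01-08 | 广西大学 | Artificial-intelligence-based method for a detection model that identifies arbitrarily oriented substation insulators in infrared images |
CN112437274A (zh) * | 2020-11-17 | 2021-03-02 | 浙江大华技术股份有限公司 | Method for transmitting snapshot pictures and snapshot camera |
CN112565863A (zh) * | 2020-11-26 | 2021-03-26 | 深圳Tcl新技术有限公司 | Video playback method, apparatus, terminal device, and computer-readable storage medium |
CN112633131A (zh) * | 2020-12-18 | 2021-04-09 | 宁波长壁流体动力科技有限公司 | Underground automatic machine-following method based on deep-learning video recognition |
CN112633131B (zh) * | 2020-12-18 | 2022-09-13 | 宁波长壁流体动力科技有限公司 | Underground automatic machine-following method based on deep-learning video recognition |
CN112784738B (zh) * | 2021-01-21 | 2023-09-19 | 上海云从汇临人工智能科技有限公司 | Moving target detection and alarm method, apparatus, and computer-readable storage medium |
CN112784738A (zh) * | 2021-01-21 | 2021-05-11 | 上海云从汇临人工智能科技有限公司 | Moving target detection and alarm method, apparatus, and computer-readable storage medium |
CN112836089A (zh) * | 2021-01-28 | 2021-05-25 | 浙江大华技术股份有限公司 | Method and apparatus for confirming a motion trajectory, storage medium, and electronic device |
CN112836089B (zh) * | 2021-01-28 | 2023-08-22 | 浙江大华技术股份有限公司 | Method and apparatus for confirming a motion trajectory, storage medium, and electronic device |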
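CN113055654A (zh) * | 2021-03-26 | 2021-06-29 | 太原师范学院 | Lossy compression method for video streams in edge devices |
CN113221800A (zh) * | 2021-05-24 | 2021-08-06 | 珠海大横琴科技发展有限公司 | Monitoring and judgment method and system for a target to be detected |
CN113435368A (zh) * | 2021-06-30 | 2021-09-24 | 青岛海尔科技有限公司 | Method and apparatus for identifying monitoring data, storage medium, and electronic device |
CN113435368B (zh) * | 2021-06-30 | 2024-03-22 | 青岛海尔科技有限公司 | Method and apparatus for identifying monitoring data, storage medium, and electronic device |
CN113609317A (zh) * | 2021-09-16 | 2021-11-05 | 杭州海康威视数字技术股份有限公司 | Image library construction method, apparatus, and electronic device |
CN113609317B (zh) * | 2021-09-16 | 2024-04-02 | 杭州海康威视数字技术股份有限公司 | Image library construction method, apparatus, and electronic device |
CN114241420A (zh) * | 2021-12-20 | 2022-03-25 | 国能(泉州)热电有限公司 | Hot work detection method and apparatus |
CN114403047B (zh) * | 2022-02-09 | 2023-01-06 | 上海依蕴宠物用品有限公司 | Health intervention method and system for aged animals based on image analysis technology |
CN114403047A (zh) * | 2022-02-09 | 2022-04-29 | 上海依蕴宠物用品有限公司 | Health intervention method and system for aged animals based on image analysis technology |
CN115150371A (zh) * | 2022-08-31 | 2022-10-04 | 深圳市万佳安物联科技股份有限公司 | Cloud-platform-based service processing method, system, and storage medium |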
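CN115187916A (zh) * | 2022-09-13 | 2022-10-14 | 太极计算机股份有限公司 | In-building epidemic prevention and control method, apparatus, device, and medium based on spatiotemporal association |
CN115457447A (zh) * | 2022-11-07 | 2022-12-09 | 浙江莲荷科技有限公司 | Moving object recognition method, apparatus, system, electronic device, and storage medium |
CN116684626A (zh) * | 2023-08-04 | 2023-09-01 | 广东星云开物科技股份有限公司 | Video compression method and shared vending cabinet |
CN116684626B (zh) * | 2023-08-04 | 2023-11-24 | 广东星云开物科技股份有限公司 | Video compression method and shared vending cabinet |
CN116890668A (zh) * | 2023-09-07 | 2023-10-17 | 国网浙江省电力有限公司台州供电公司 | Safe charging method and charging device with synchronized information interconnection |
CN116890668B (zh) * | 2023-09-07 | 2023-11-28 | 国网浙江省电力有限公司杭州供电公司 | Safe charging method and charging device with synchronized information interconnection |
CN117392596A (zh) * | 2023-09-07 | 2024-01-12 | 中关村科学城城市大脑股份有限公司 | Data processing method, apparatus, electronic device, and computer-readable medium |
CN117392596B (zh) * | 2023-09-07 | 2024-04-30 | 中关村科学城城市大脑股份有限公司 | Data processing method, electronic device, and computer-readable medium |
CN117671597A (zh) * | 2023-12-25 | 2024-03-08 | 北京大学长沙计算与数字经济研究院 | Method for constructing a mouse detection model, and mouse detection method and apparatus |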
Also Published As
Publication number | Publication date |
---|---|
JP2021514548A (ja) | 2021-06-10 |
JP7018462B2 (ja) | 2022-02-10 |
CN109919009A (zh) | 2019-06-21 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
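WO2020151084A1 (zh) | | Method, apparatus, and system for monitoring a target object |
CN109922310B (zh) | | Method, apparatus, and system for monitoring a target object |
CN109886130B (zh) | | Method and apparatus for determining a target object, storage medium, and processor |
WO2020151083A1 (zh) | | Region determination method, apparatus, storage medium, and processor |
US20220301300A1 (en) | | Processing method for augmented reality scene, terminal device, system, and computer storage medium |
CN109886999B (zh) | | Position determination method, apparatus, storage medium, and processor |
JP7229662B2 (ja) | | Method for issuing alerts in a video surveillance system |
CN101918989B (zh) | | Video surveillance system with object tracking and retrieval |
KR102296088B1 (ko) | | Pedestrian tracking method and electronic device |
CN109886129B (zh) | | Method and apparatus for generating prompt information, storage medium, and electronic device |
CN104303193B (zh) | | Clustering-based object classification |
WO2021139049A1 (zh) | | Detection method, detection apparatus, monitoring device, and computer-readable storage medium |
CN106559645B (zh) | | Camera-based monitoring method, system, and apparatus |
AU2012340862A1 (en) | | Geographic map based control |
CN112733690A (zh) | | Method, apparatus, and electronic device for detecting objects thrown from height |
US11134221B1 (en) | | Automated system and method for detecting, identifying and tracking wildlife |
WO2021063046A1 (zh) | | Distributed target monitoring system and method |
JP6787831B2 (ja) | | Object detection device capable of learning from search results, detection model generation device, program, and method |
CN108288017A (zh) | | Method and apparatus for obtaining object density |
CN109831634A (zh) | | Method and apparatus for determining density information of a target object |
CN111291646A (zh) | | People flow counting method, apparatus, device, and storage medium |
KR101944374B1 (ko) | | Apparatus and method for detecting abnormal objects, and imaging device including the same |
KR102424098B1 (ko) | | Apparatus and method for detecting drones using deep learning |
KR102171384B1 (ko) | | Object recognition system and method using an image correction filter |
CN111681269B (zh) | | Multi-camera cooperative person tracking system based on spatial consistency, and training method |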
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | ENP | Entry into the national phase | Ref document number: 2019570566; Country of ref document: JP; Kind code of ref document: A |
| | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 19911023; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 32PN | EP: public notification in the EP bulletin as address of the addressee cannot be established | Free format text: NOTING OF LOSS OF RIGHTS (EPO FORM 1205A DATED 15.11.2021) |
| | 122 | EP: PCT application non-entry in European phase | Ref document number: 19911023; Country of ref document: EP; Kind code of ref document: A1 |