CN112652013A - Camera object finding method based on deep learning - Google Patents

Publication number
CN112652013A
Authority
CN
China
Prior art keywords
deep learning
video
camera
target detection
article
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110082166.2A
Other languages
Chinese (zh)
Inventor
段强
李锐
王建华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jinan Inspur Hi Tech Investment and Development Co Ltd
Original Assignee
Jinan Inspur Hi Tech Investment and Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jinan Inspur Hi Tech Investment and Development Co Ltd
Priority to CN202110082166.2A
Publication of CN112652013A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/46 Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/48 Matching video sequences
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07 Target detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a camera object finding method based on deep learning, belonging to the technical field of deep learning and image processing. By detecting articles with deep learning target detection and feature extraction, the method can greatly improve the article detection rate, the detection accuracy, and the number of article categories that can be detected.

Description

Camera object finding method based on deep learning
Technical Field
The invention relates to the technical field of deep learning and image processing, and in particular to a camera object finding method based on deep learning.
Background
The Internet of Things is now widespread, and a huge number of surveillance cameras are installed in public places and homes, covering most of the time and space in which people live. This ubiquitous video data can be used for surveillance and can also be extended to other applications, such as finding lost articles in video. Some camera-based object finding algorithms already exist, but most of them rely on traditional image comparison, so their article detection rate and detection accuracy are low.
Disclosure of Invention
The technical task of the invention is to address the above defects by providing a camera object finding method based on deep learning that can greatly improve the article detection rate, the detection accuracy, and the number of article categories that can be detected.
The technical scheme adopted by the invention to solve the technical problem is as follows:
A camera object finding method based on deep learning performs video analysis on a real-time video or a historical video through target detection or local feature extraction, and locates the current position or the last position of appearance of the article to be found.
A target detection algorithm and a feature extraction algorithm are deployed, data from a surveillance camera is accessed, and a category or a sample image of the article to be found is obtained from the user. Video analysis is then performed to detect or match the article, and the current position of the article, or the position where it last appeared in the history record, is given.
Preferably, the target detection performs video analysis and positioning of a specified article in a real-time surveillance video or a stored surveillance video based on a general target detection algorithm.
Further, the target detection algorithm includes EfficientDet, YOLO, and/or SSD.
Preferably, the target detection algorithm may be fine-tuned if necessary using the user's own data, which is derived from the user's labelling of articles that are easily lost.
Preferably, the local feature extraction is based on deep learning: features are extracted from a sample image of the given article and compared against the video for positioning.
Furthermore, the feature extraction network used for local feature extraction comprises GeoDesc and/or HardNet. Only an image sample of the article to be found needs to be given to generate one feature point set; a surveillance image is then given to generate a second feature point set, and the two point sets are matched using the FLANN or brute-force method. No supervision information is required.
Preferably, when real-time positioning fails, the video clip in which the article last appeared is given.
Preferably, the method is implemented as follows:
1) deploying a deep learning framework and the target detection and feature extraction algorithms on an edge server or a cloud server, and accessing the camera data;
2) converting all frames of the video into images, and applying uniform preprocessing to all the images;
3) the user gives a category name or a sample image of the article to be found, and a video analysis mode is selected according to the information given;
4) when the article category is given, first checking whether the category is among those supported by target detection, and if it is, performing the target detection task on the video;
5) when target detection fails or a sample image of the article is given, extracting and matching image features;
6) when either of the two preceding steps detects the article to be found, returning the time and position at which the article was detected.
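The mode selection in steps 3) to 6) can be sketched as follows. This is a minimal illustration, not the patented implementation: `find_article`, `Hit`, and the `detector` and `matcher` callables are hypothetical names standing in for the deployed target detection and feature matching models.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


@dataclass
class Hit:
    timestamp: float  # seconds from the start of the video
    box: Box          # where the article was seen in that frame


# Hypothetical callables standing in for the deployed models:
#   detector(frame, category) -> Optional[Box]
#   matcher(frame, sample)    -> Optional[Box]
Detector = Callable[[object, str], Optional[Box]]
Matcher = Callable[[object, object], Optional[Box]]


def _scan(frames, locate) -> Optional[Hit]:
    # Walk backwards so the first hit is the latest appearance (step 6).
    for timestamp, frame in reversed(frames):
        box = locate(frame)
        if box is not None:
            return Hit(timestamp, box)
    return None


def find_article(frames: Sequence[Tuple[float, object]],
                 supported: set,
                 detector: Detector,
                 matcher: Matcher,
                 category: Optional[str] = None,
                 sample: Optional[object] = None) -> Optional[Hit]:
    """Return the most recent hit, trying detection first, then matching."""
    # Step 4: use target detection only if the category is supported.
    if category is not None and category in supported:
        hit = _scan(frames, lambda f: detector(f, category))
        if hit is not None:
            return hit
    # Step 5: fall back to feature matching when a sample image exists.
    if sample is not None:
        return _scan(frames, lambda f: matcher(f, sample))
    return None
```

Scanning from the newest frame backwards means the same routine serves both cases described in the text: a hit in the latest frame is the current position, and an older hit is the last position of appearance.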
The invention also claims a camera object finding device based on deep learning, which comprises: at least one memory and at least one processor;
the at least one memory is configured to store a machine-readable program;
the at least one processor is configured to invoke the machine-readable program and execute the above method.
The invention also claims a computer-readable medium having stored thereon computer instructions which, when executed by a processor, cause the processor to perform the above-described method.
Compared with the prior art, the camera object finding method based on deep learning has the following beneficial effects:
By using deep learning target detection and feature extraction, the method can greatly improve article detection, including both the detection accuracy and the number of article categories that can be detected. As system redundancy, the method also uses local feature extraction based on deep learning; compared with traditional methods such as SIFT and SURF, a learned feature extractor is more robust and yields more feature points and more discriminative feature descriptors.
Drawings
Fig. 1 is a flowchart of a camera object finding method based on deep learning according to an embodiment of the present invention.
Detailed Description
The present invention will be further described with reference to the following specific examples.
When people lose something, they can usually recall only faint traces of where it went, and a well-timed reminder is often enough for them to remember at once where the article is. Video analysis using deep learning target detection or feature extraction locates the current or last position of appearance of the article, providing clues for the search.
The embodiment of the invention provides a camera object finding method based on deep learning,
which performs video analysis on a real-time video or a historical video through target detection or local feature extraction, and locates the current position or the last position of appearance of the article to be found.
The camera object finding method based on deep learning has two modes:
target detection, which performs video analysis and positioning of a specified article in a real-time surveillance video or a stored surveillance video based on a general target detection algorithm;
local feature extraction, which is based on deep learning and extracts features from a sample image of the given article and compares them against the video for positioning.
When real-time positioning fails, the video clip in which the article last appeared is given.
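One way to derive the clip of the last appearance from per-frame results is sketched below. The function name `last_appearance_clip` and the `max_gap` and `padding` defaults are illustrative assumptions, not values from the source.

```python
from typing import List, Optional, Tuple


def last_appearance_clip(seen: List[Tuple[float, bool]],
                         max_gap: float = 1.0,
                         padding: float = 2.0) -> Optional[Tuple[float, float]]:
    """Return (start, end) in seconds of the last clip showing the article.

    `seen` holds (timestamp, article_visible) pairs in ascending time order.
    `max_gap` and `padding` are illustrative defaults, not values from the source.
    """
    times = [t for t, visible in seen if visible]
    if not times:
        return None  # the article never appeared in the analysed footage
    end = start = times[-1]
    # Extend the clip backwards while consecutive sightings are close in time.
    for t in reversed(times[:-1]):
        if start - t > max_gap:
            break
        start = t
    # Pad both ends so the viewer sees some context around the sighting.
    return (max(0.0, start - padding), end + padding)
```

The returned interval can then be used to cut the corresponding segment out of the stored surveillance video.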
Target detection can use advanced target detection algorithms such as EfficientDet, YOLO, and SSD. When necessary, the algorithm can be fine-tuned with the user's own data, which is derived from the user's labelling of articles that are easily lost.
EfficientDet is a family of target detection algorithms published by Google in November 2019, comprising eight models from D0 to D7. It can deliver state-of-the-art (SOTA) results under different device limitations and consistently achieves better efficiency than prior work across a wide range of resource constraints. In particular, with a single model and a single scale, EfficientDet-D7 achieves a state-of-the-art 52.2 AP on COCO test-dev with 52M parameters and 325B FLOPs; compared with previous detectors, it has 4 to 9 times fewer parameters and uses 13 to 42 times fewer FLOPs.
YOLO formulates object detection as a regression problem over bounding boxes and classification confidences. The whole image is used as input and divided into an S×S grid; each cell predicts B bounding boxes (x, y, w, h) and the corresponding class-specific confidence scores. A class-specific confidence score is the product of the conditional class probability, the probability that the bounding box contains an object, and the IoU between the predicted box and the ground-truth box.
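The quantities in the YOLO description can be made concrete with a short sketch. The function names are hypothetical; the corner-coordinate box format (x1, y1, x2, y2) is an assumption chosen to keep the IoU computation simple.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Clamp at zero so disjoint boxes have no intersection area.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def class_confidence(p_class_given_object, p_object, box_iou):
    """Class-specific confidence: Pr(class | object) * Pr(object) * IoU."""
    return p_class_given_object * p_object * box_iou


def responsible_cell(cx, cy, img_w, img_h, s):
    """(row, col) of the S x S grid cell that contains a box centre."""
    col = min(s - 1, int(cx / img_w * s))
    row = min(s - 1, int(cy / img_h * s))
    return row, col
```

For example, a box centred in a 640×480 image with S = 7 falls in cell (3, 3), and that cell is the one expected to predict the box.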
SSD discretizes the solution space of the object detection problem into a set of default bounding boxes with preset scales and aspect ratios. For each default box it predicts the class label and the box offsets needed to frame the object more tightly, and it combines the predictions from several feature maps of different sizes for one image so that objects of different sizes can be handled.
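The preset boxes described above can be generated as in the following sketch. The linear scale rule and the defaults `s_min=0.2` and `s_max=0.9` follow the SSD paper; the function names are hypothetical, and the extra aspect-ratio-1 box that SSD adds per location is omitted for brevity.

```python
import math


def default_box_scales(num_maps, s_min=0.2, s_max=0.9):
    """Linearly spaced box scales, one per feature map."""
    if num_maps == 1:
        return [s_min]
    return [s_min + (s_max - s_min) * k / (num_maps - 1)
            for k in range(num_maps)]


def default_boxes(fmap_size, scale, aspect_ratios=(1.0, 2.0, 0.5)):
    """Default boxes (cx, cy, w, h) in relative [0, 1] image coordinates."""
    boxes = []
    for i in range(fmap_size):
        for j in range(fmap_size):
            # Centre each box on its feature-map cell.
            cx = (j + 0.5) / fmap_size
            cy = (i + 0.5) / fmap_size
            for ar in aspect_ratios:
                # Width grows and height shrinks with the aspect ratio.
                boxes.append((cx, cy,
                              scale * math.sqrt(ar),
                              scale / math.sqrt(ar)))
    return boxes
```

Coarse feature maps receive the large scales and fine feature maps the small ones, which is how a single forward pass covers articles of very different sizes.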
Local feature extraction based on deep learning can use feature extraction networks such as GeoDesc and HardNet. It requires no supervision information: an image sample of the article to be found is given to generate one feature point set, a surveillance image is then given to generate a second feature point set, and the two point sets are matched using the FLANN or brute-force method.
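The brute-force matching between the two point sets can be sketched in plain Python as below. The descriptors are assumed to be fixed-length float vectors already produced by a network such as GeoDesc or HardNet; the ratio test and its 0.8 threshold are an illustrative addition, not something the source specifies. In practice a library matcher such as OpenCV's `BFMatcher` or `FlannBasedMatcher` would likely be used instead.

```python
import math
from typing import List, Sequence, Tuple


def brute_force_match(query: Sequence[Sequence[float]],
                      train: Sequence[Sequence[float]],
                      ratio: float = 0.8) -> List[Tuple[int, int, float]]:
    """Match each query descriptor to its nearest train descriptor.

    Returns (query_index, train_index, distance) for matches that pass the
    ratio test. The 0.8 threshold is an illustrative assumption.
    """
    matches = []
    for qi, q in enumerate(query):
        # Exhaustively compute L2 distances to every train descriptor.
        dists = sorted((math.dist(q, t), ti) for ti, t in enumerate(train))
        if not dists:
            continue
        best = dists[0]
        # Keep the match only if it is clearly better than the runner-up.
        if len(dists) == 1 or best[0] < ratio * dists[1][0]:
            matches.append((qi, best[1], best[0]))
    return matches
```

Here `query` would hold the descriptors of the sample image and `train` those of a surveillance frame; a frame with enough surviving matches is a candidate location for the article.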
A target detection algorithm and a feature extraction algorithm are deployed, data from a surveillance camera is accessed, and a category or a sample image of the article to be found is obtained from the user. Video analysis is then performed to detect or match the article, and the current position of the article, or the position where it last appeared in the history record, is given.
The embodiment of the invention provides a camera object finding method based on deep learning, which is implemented as follows:
1) deploying a deep learning framework and the target detection and feature extraction algorithms on an edge server or a cloud server, and accessing the camera data;
2) converting all frames of the video into images, and applying uniform preprocessing to all the images;
3) the user gives a category name or a sample image of the article to be found, and a video analysis mode is selected according to the information given;
4) when the article category is given, first checking whether the category is among those supported by target detection, and if it is, performing the target detection task on the video;
5) when target detection fails or a sample image of the article is given, extracting and matching image features;
6) when either of the two preceding steps detects the article to be found, returning the time and position at which the article was detected.
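The source does not say what the uniform preprocessing in step 2) consists of. A common choice, assumed here purely for illustration, is to scale every frame into a fixed square input while preserving its aspect ratio, and to keep enough bookkeeping to report the time and position from step 6) in terms of the original video. All function names and the 640-pixel target are hypothetical.

```python
def frame_timestamp(index, fps):
    """Time in seconds of a frame, given its index and the video frame rate."""
    return index / fps


def letterbox_params(width, height, target=640):
    """Scale and padding that fit a frame into a target x target square."""
    scale = target / max(width, height)
    new_w, new_h = round(width * scale), round(height * scale)
    # Split the leftover space evenly so the frame is centred.
    pad_x, pad_y = (target - new_w) // 2, (target - new_h) // 2
    return scale, (new_w, new_h), (pad_x, pad_y)


def to_original(box, scale, pad):
    """Map a box from the preprocessed image back to source-frame pixels."""
    x1, y1, x2, y2 = box
    return ((x1 - pad[0]) / scale, (y1 - pad[1]) / scale,
            (x2 - pad[0]) / scale, (y2 - pad[1]) / scale)
```

For a 1920×1080 frame and a 640-pixel target, the frame is resized to 640×360 with 140 pixels of padding above and below, and `to_original` undoes that mapping for any box the model returns.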
The embodiment of the invention also provides a camera object finding device based on deep learning, which comprises: at least one memory and at least one processor;
the at least one memory is configured to store a machine-readable program;
the at least one processor is configured to invoke the machine-readable program to execute the camera object finding method based on deep learning described in the above embodiments.
An embodiment of the present invention further provides a computer-readable medium having stored thereon computer instructions which, when executed by a processor, cause the processor to execute the camera object finding method based on deep learning described in the above embodiments. Specifically, a system or an apparatus may be provided with a storage medium storing software program code that realizes the functions of any of the above-described embodiments, and a computer (or a CPU or MPU) of the system or apparatus reads out and executes the program code stored in the storage medium.
In this case, the program code read from the storage medium itself realizes the functions of any of the above-described embodiments, and thus the program code and the storage medium storing it constitute a part of the present invention.
Examples of storage media for supplying the program code include a floppy disk, a hard disk, a magneto-optical disk, an optical disk (e.g., CD-ROM, CD-R, CD-RW, DVD-ROM, DVD-RAM, DVD-RW, DVD+RW), a magnetic tape, a nonvolatile memory card, and a ROM. Alternatively, the program code may be downloaded from a server computer via a communications network.
Further, it should be clear that the functions of any of the above-described embodiments may be implemented not only by executing the program code read out by the computer, but also by causing an operating system or the like running on the computer to perform part or all of the actual operations based on instructions of the program code.
Further, it is to be understood that the program code read out from the storage medium may be written to a memory provided in an expansion board inserted into the computer or in an expansion unit connected to the computer, after which a CPU or the like mounted on the expansion board or expansion unit performs part or all of the actual operations based on instructions of the program code, thereby realizing the functions of any of the above-described embodiments.
While the invention has been shown and described in detail in the drawings and the preferred embodiments, the invention is not limited to the embodiments disclosed. It will be apparent to those skilled in the art that the technical means of the various embodiments described above may be combined to obtain further embodiments of the invention, which also fall within the scope of protection of the invention.

Claims (10)

1. A camera object finding method based on deep learning, characterized in that video analysis is performed on a real-time video or a historical video through target detection or local feature extraction, so as to locate the current position or the last position of appearance of an article to be found.
2. The camera object finding method based on deep learning according to claim 1, characterized in that the target detection performs video analysis and positioning of a specified article in a real-time surveillance video or a stored surveillance video based on a general target detection algorithm.
3. The camera object finding method based on deep learning according to claim 2, characterized in that the target detection algorithm comprises EfficientDet, YOLO and/or SSD.
4. The camera object finding method based on deep learning according to claim 3, characterized in that the target detection algorithm can be fine-tuned using the user's own data, the data being derived from the user's labelling of articles that are easily lost.
5. The camera object finding method based on deep learning according to claim 1, characterized in that the local feature extraction is based on deep learning, and features are extracted from a sample image of a given article and compared against the video for positioning.
6. The camera object finding method based on deep learning according to claim 5, characterized in that the feature extraction network used for local feature extraction comprises GeoDesc and/or HardNet; an image sample of the article to be found is given to generate one feature point set, a surveillance image is given to generate a second feature point set, and the two point sets are matched using the FLANN or brute-force method.
7. The camera object finding method based on deep learning according to claim 1, characterized in that the video clip in which the article last appeared is given when real-time positioning fails.
8. The camera object finding method based on deep learning according to any one of claims 1 to 7, characterized in that the method is implemented as follows:
1) deploying a deep learning framework and the target detection and feature extraction algorithms on an edge server or a cloud server, and accessing the camera data;
2) converting all frames of the video into images, and applying uniform preprocessing to all the images;
3) the user gives a category name or a sample image of the article to be found, and a video analysis mode is selected according to the information given;
4) when the article category is given, first checking whether the category is among those supported by target detection, and if it is, performing the target detection task on the video;
5) when target detection fails or a sample image of the article is given, extracting and matching image features;
6) when either of the two preceding steps detects the article to be found, returning the time and position at which the article was detected.
9. A camera object finding device based on deep learning, characterized by comprising: at least one memory and at least one processor;
the at least one memory is configured to store a machine-readable program;
the at least one processor is configured to invoke the machine-readable program to perform the method of any one of claims 1 to 8.
10. A computer-readable medium, characterized in that it has stored thereon computer instructions which, when executed by a processor, cause the processor to carry out the method of any one of claims 1 to 8.
CN202110082166.2A 2021-01-21 2021-01-21 Camera object finding method based on deep learning Pending CN112652013A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110082166.2A CN112652013A (en) 2021-01-21 2021-01-21 Camera object finding method based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110082166.2A CN112652013A (en) 2021-01-21 2021-01-21 Camera object finding method based on deep learning

Publications (1)

Publication Number Publication Date
CN112652013A (en) 2021-04-13

Family

ID=75371101

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110082166.2A Pending CN112652013A (en) 2021-01-21 2021-01-21 Camera object finding method based on deep learning

Country Status (1)

Country Link
CN (1) CN112652013A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106926247A (en) * 2017-01-16 2017-07-07 深圳前海勇艺达机器人有限公司 With the robot looked for something in automatic family
CN109034094A (en) * 2018-08-10 2018-12-18 佛山市泽胜科技有限公司 A kind of articles seeking method and apparatus
CN109993045A (en) * 2017-12-29 2019-07-09 杭州海康威视系统技术有限公司 Articles seeking method and lookup device search system and machine readable storage medium
CN111353436A (en) * 2020-02-28 2020-06-30 罗普特科技集团股份有限公司 Super store operation analysis method and device based on image deep learning algorithm
CN111383270A (en) * 2018-12-27 2020-07-07 深圳市优必选科技有限公司 Object positioning method and device, computer equipment and storage medium
CN112100430A (en) * 2020-11-06 2020-12-18 北京沃东天骏信息技术有限公司 Article tracing method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210413