CN110826455A - Target identification method and image processing equipment - Google Patents
- Publication number
- CN110826455A (application CN201911047606.XA)
- Authority
- CN
- China
- Prior art keywords
- target
- image
- video frame
- identity
- feature
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/161—Detection; Localisation; Normalisation
- G06V40/166—Detection; Localisation; Normalisation using acquisition arrangements
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/58—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/583—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/168—Feature extraction; Face representation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/172—Classification, e.g. identification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/07—Target detection
Abstract
The invention discloses a target identification method and an image processing device in the technical field of machine vision. The method detects a target in a video frame image, optimizes the image, obtains the features of the target's face with a deep feature network, searches a pre-stored correspondence between features and identities for an identity matching those features, and determines the found identity as the recognition result. Through image optimization, the target detected in the video frame image undergoes image enhancement, face capture, and image scaling to obtain an image of the target's face; the facial features of the target are then extracted with the deep feature network, and the target's identity is identified from the pre-stored feature-identity correspondence. Identification accuracy is high, and the method solves the problem of inaccurate target identification caused by blurred target information when the scene is dimly lit or it is night.
Description
Technical Field
The invention belongs to the technical field of machine vision, and particularly relates to a target identification method and image processing equipment.
Background
Monitoring equipment is now installed in many settings; for example, shopping malls, schools, and office buildings are usually fitted with multiple cameras. An image processing device connected to a camera analyzes the images the camera acquires and can identify targets in those images from the analysis results.
However, when the scene is dimly lit or it is night, the camera struggles to capture the target's face, the collected target information is blurred, and target identification becomes inaccurate.
Disclosure of Invention
The invention aims to provide a target identification method and an image processing device that use image optimization to solve the problem that, when the scene is dimly lit or it is night, the camera struggles to capture the target's face and the blurred target information it collects makes identification inaccurate.
In order to solve the technical problems, the invention is realized by the following technical scheme:
The invention relates to a target identification method comprising the following steps:
SS01: detect the target in the video frame image;
SS02, image optimization: apply image enhancement, face capture, and image scaling to the target detected in the video frame image to obtain an image of the target's face;
SS03: input the image region of the target's face into a deep feature network to obtain the features of the target's face, where the deep feature network is obtained by inputting a plurality of facial images into a deep convolutional neural network for training;
SS04: search the pre-stored correspondence between features and identities for an identity matching the features;
SS05: determine the found identity as the target recognition result.
Further, detecting the target in the video frame image includes:
detecting the target in the video frame image with an image detection algorithm;
or matching the video frame image against a preset target model and taking the successfully matched image region as the region where the target is located.
Further, the training process of the deep feature network includes:
acquiring a plurality of facial images, the plurality of facial images containing the same target at multiple angles;
inputting the regions where the targets in the facial images are located into a deep convolutional neural network;
performing classification training on the same target at multiple angles with a stochastic gradient descent algorithm, and computing the parameters of each layer of the deep convolutional neural network through backpropagation;
and constructing the deep feature network from the parameters of each layer.
Further, the video frame images include images acquired by at least two image acquisition devices, and after the step of determining the found identity as the target identification result, the method further includes:
tracking the target in the images acquired by each image acquisition device using the target identification result corresponding to that device.
An image processing device comprising a processor and a memory, the memory storing executable program code and the processor, by reading the executable program code stored in the memory, running a program corresponding to that code so as to perform the following steps:
SS01: detect the target in the video frame image;
SS02, image optimization: apply image enhancement, face capture, and image scaling to the target detected in the video frame image to obtain an image of the target's face;
SS03: input the image region of the target's face into a deep feature network to obtain the features of the target's face, where the deep feature network is obtained by inputting a plurality of images into a deep convolutional neural network for training;
SS04: search the pre-stored correspondence between features and identities for an identity matching the features;
SS05: determine the found identity as the target recognition result.
Further, the processor is configured to perform the following steps:
detecting the target in the video frame image with an image detection algorithm;
or matching the video frame image against a preset target model and taking the successfully matched image region as the region where the target is located.
Further, the processor is configured to perform the following steps:
acquiring a plurality of facial images, the plurality of facial images containing the same target at multiple angles;
inputting the regions where the targets in the facial images are located into a deep convolutional neural network;
performing classification training on the same target at multiple angles with a stochastic gradient descent algorithm, and computing the parameters of each layer of the deep convolutional neural network through backpropagation;
and constructing the deep feature network from the parameters of each layer.
Further, the video frame images include images acquired by at least two image acquisition devices, and after the step of determining the found identity as the target identification result, the method further includes:
tracking the target in the images acquired by each image acquisition device using the target identification result corresponding to that device.
The invention has the following beneficial effects. Through image optimization, the target detected in the video frame image undergoes image enhancement, face capture, and image scaling to obtain an image of the target's face; the facial features of the target are extracted with a deep feature network, and the target's identity is identified from the pre-stored correspondence between features and identities. Identification accuracy is high, and the method solves the problem that, when the scene is dimly lit or it is night, the camera struggles to capture the target's face and the blurred target information it collects makes target identification inaccurate.
Of course, it is not necessary for any product in which the invention is practiced to achieve all of the above-described advantages at the same time.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings used in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art that other drawings can be obtained according to the drawings without creative efforts.
FIG. 1 is a schematic flow chart of a method of object recognition;
fig. 2 is a schematic view of an application scenario in the embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to figs. 1-2, the invention is a target identification method comprising:
SS01: detect the target in the video frame image;
SS02, image optimization: apply image enhancement, face capture, and image scaling to the target detected in the video frame image to obtain an image of the target's face;
SS03: input the image region of the target's face into a deep feature network to obtain the features of the target's face, where the deep feature network is obtained by inputting a plurality of facial images into a deep convolutional neural network for training;
SS04: search the pre-stored correspondence between features and identities for an identity matching the features;
SS05: determine the found identity as the target recognition result.
Image enhancement adds information to, or transforms the data of, the original image by some means, selectively highlighting features of interest in the image or suppressing unwanted ones so that the image matches the visual response characteristics. Specifically, a spatial-domain method is adopted, which operates directly on the pixels of the image and is described by the following formula:
g(x,y)=f(x,y)*h(x,y)
where f(x, y) is the original image, h(x, y) is a spatial transfer function, and g(x, y) is the processed image.
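As an illustrative sketch (not part of the patent), the spatial-domain formula above can be realized as a discrete 2-D convolution of the image with the transfer function's kernel. The 3×3 sharpening kernel below is a common enhancement choice, assumed here for illustration only:

```python
def convolve2d(f, h):
    """Spatial-domain filtering g(x, y) = f(x, y) * h(x, y).

    f: grayscale image as a list of rows; h: odd-sized kernel.
    Border pixels are handled by clamping coordinates (an assumption;
    the patent does not specify border handling).
    """
    rows, cols = len(f), len(f[0])
    kr, kc = len(h), len(h[0])
    ro, co = kr // 2, kc // 2
    g = [[0.0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            acc = 0.0
            for j in range(kr):
                for i in range(kc):
                    # Clamp neighbour coordinates to the image bounds.
                    yy = min(max(y + j - ro, 0), rows - 1)
                    xx = min(max(x + i - co, 0), cols - 1)
                    acc += f[yy][xx] * h[j][i]
            g[y][x] = acc
    return g

# A common sharpening kernel (illustrative choice, not from the patent).
SHARPEN = [[0, -1, 0],
           [-1, 5, -1],
           [0, -1, 0]]
```

With the identity kernel the filter leaves the image unchanged, which is a quick way to sanity-check the implementation.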
Face capture adopts the existing OptiTrack facial expression capture technology, which can accurately capture the target's face in the detected video frame image.
Image scaling first segments the original low-resolution image into different regions, then maps each interpolation point onto the low-resolution image and determines which region the point belongs to, and finally computes the value of each interpolation point with an interpolation formula designed from its neighboring pixels, yielding a high-resolution image of the target's facial features.
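The patent does not give the exact interpolation formulas. Bilinear interpolation, sketched below in plain Python as one standard possibility, maps each interpolation point back onto the low-resolution image and blends its four neighboring pixels by distance:

```python
def bilinear_resize(img, new_h, new_w):
    """Resize a grayscale image with bilinear interpolation.

    Each output (interpolation) point is mapped back into source
    coordinates; its four neighbouring pixels are blended with
    weights given by the point's fractional position.
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * new_w for _ in range(new_h)]
    for y in range(new_h):
        for x in range(new_w):
            # Map the output point into source coordinates.
            sy = y * (h - 1) / (new_h - 1) if new_h > 1 else 0.0
            sx = x * (w - 1) / (new_w - 1) if new_w > 1 else 0.0
            y0, x0 = int(sy), int(sx)
            y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
            dy, dx = sy - y0, sx - x0
            top = img[y0][x0] * (1 - dx) + img[y0][x1] * dx
            bot = img[y1][x0] * (1 - dx) + img[y1][x1] * dx
            out[y][x] = top * (1 - dy) + bot * dy
    return out
```

Upscaling a 2×2 image to 3×3 with this routine keeps the corners exact and places the average of all four pixels at the centre.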
Detecting the target in the video frame image includes:
detecting the target in the video frame image with an image detection algorithm;
or matching the video frame image against a preset target model and taking the successfully matched image region as the region where the target is located.
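Matching against a preset target model can be sketched as sliding a template over the image and keeping the best-scoring placement. The sum-of-squared-differences (SSD) criterion below is an illustrative assumption, not necessarily the matching criterion the patent intends:

```python
def match_template(image, template):
    """Locate the template in the image by minimising the sum of
    squared differences (SSD) over all placements.

    Returns the (row, col) of the best match's top-left corner.
    """
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    best, best_pos = float("inf"), (0, 0)
    for y in range(ih - th + 1):
        for x in range(iw - tw + 1):
            ssd = sum(
                (image[y + j][x + i] - template[j][i]) ** 2
                for j in range(th) for i in range(tw)
            )
            if ssd < best:
                best, best_pos = ssd, (y, x)
    return best_pos
```

A region that matches the model exactly scores an SSD of zero, so it wins over every other placement.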
The training process of the deep feature network includes:
acquiring a plurality of facial images, the plurality of facial images containing the same target at multiple angles;
inputting the regions where the targets in the facial images are located into a deep convolutional neural network;
performing classification training on the same target at multiple angles with a stochastic gradient descent algorithm, and computing the parameters of each layer of the deep convolutional neural network through backpropagation;
and constructing the deep feature network from the parameters of each layer.
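A full deep convolutional network is beyond a short sketch, but the training loop described above — stochastic gradient descent with gradients obtained by backpropagation — has the same shape for a one-layer logistic model, used here as a hedged stand-in. The data, learning rate, and epoch count are all illustrative:

```python
import math
import random

def train_sgd(samples, labels, lr=0.5, epochs=500, seed=0):
    """Stochastic gradient descent on a one-layer logistic model.

    For each randomly drawn sample, the prediction error is
    back-propagated to update the parameters; a deep convolutional
    network repeats the same loop with many more layers.
    """
    rng = random.Random(seed)
    n = len(samples[0])
    w, b = [0.0] * n, 0.0
    for _ in range(epochs):
        i = rng.randrange(len(samples))        # stochastic sample draw
        x, t = samples[i], labels[i]
        z = sum(wj * xj for wj, xj in zip(w, x)) + b
        p = 1.0 / (1.0 + math.exp(-z))         # forward pass (sigmoid)
        grad = p - t                           # dLoss/dz via backprop
        w = [wj - lr * grad * xj for wj, xj in zip(w, x)]
        b -= lr * grad
    return w, b

def predict(w, b, x):
    """Classify with the trained parameters."""
    z = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if z > 0 else 0
```

On a toy set that is separable by the first feature, the loop learns to classify by that feature.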
The video frame images include images acquired by at least two image acquisition devices, and after the found identity is determined as the target identification result, the method further includes:
tracking the target in the images acquired by each image acquisition device using the target identification result corresponding to that device.
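Per-device tracking by recognition result can be sketched as a table that records, for each image acquisition device, the observations of each recognized identity. The data layout (`tracks`, `bbox`, frame numbers) is an illustrative assumption, not the patent's:

```python
def update_tracks(tracks, device_id, identity, bbox, frame_no):
    """Record the latest detection of `identity` on `device_id`.

    tracks maps device_id -> identity -> list of (frame_no, bbox)
    observations, so each camera's targets can be followed by the
    identity recognised from that camera's frames.
    """
    tracks.setdefault(device_id, {}).setdefault(identity, []).append(
        (frame_no, bbox)
    )
    return tracks
```

Each recognition result simply appends a new observation to the matching device/identity track.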
An image processing device comprising a processor and a memory, the memory storing executable program code and the processor, by reading the executable program code stored in the memory, running a program corresponding to that code so as to perform the following steps:
SS01: detect the target in the video frame image;
The image processing device is communicatively connected to the image acquisition equipment; it can obtain the video frame images the acquisition equipment collects and perform target identification with this scheme on each video frame image obtained.
SS02, image optimization: apply image enhancement, face capture, and image scaling to the target detected in the video frame image to obtain an image of the target's face;
SS03: input the image region of the target's face into a deep feature network to obtain the features of the target's face, where the deep feature network is obtained by inputting a plurality of images into a deep convolutional neural network for training;
SS04: search the pre-stored correspondence between features and identities for an identity matching the features;
The correspondence may be stored as a data table in which the identity is the key and the feature is the value; the table may be stored on the device itself or on another device connected to it, and this is not specifically limited.
SS05: determine the found identity as the target recognition result.
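The lookup in SS04 can be sketched as a nearest-neighbor search over the key-value table described above, with identity as key and stored feature vector as value. The cosine-similarity measure and the 0.8 acceptance threshold are illustrative assumptions; the patent does not specify a matching criterion:

```python
import math

def cosine(a, b):
    """Cosine similarity of two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def lookup_identity(feature, table, threshold=0.8):
    """Search the pre-stored feature-identity correspondence.

    `table` maps identity (key) to stored feature vector (value).
    Returns the best-matching identity, or None when no stored
    feature clears the similarity threshold.
    """
    best_id, best_sim = None, threshold
    for identity, stored in table.items():
        sim = cosine(feature, stored)
        if sim > best_sim:
            best_id, best_sim = identity, sim
    return best_id
```

Returning None on a failed lookup lets the caller distinguish "no match" from a confident identification.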
The processor is further configured to perform the following steps:
detecting the target in the video frame image with an image detection algorithm;
or matching the video frame image against a preset target model and taking the successfully matched image region as the region where the target is located.
The processor is further configured to perform the following steps:
acquiring a plurality of facial images, the plurality of facial images containing the same target at multiple angles;
inputting the regions where the targets in the facial images are located into a deep convolutional neural network;
performing classification training on the same target at multiple angles with a stochastic gradient descent algorithm, and computing the parameters of each layer of the deep convolutional neural network through backpropagation;
and constructing the deep feature network from the parameters of each layer.
The video frame images include images acquired by at least two image acquisition devices, and after the found identity is determined as the target identification result, the method further includes:
tracking the target in the images acquired by each image acquisition device using the target identification result corresponding to that device.
According to the method, through image optimization, the target detected in the video frame image undergoes image enhancement, face capture, and image scaling to obtain an image of the target's face; the facial features of the target are extracted with a deep feature network, and the target's identity is identified from the pre-stored correspondence between features and identities. Identification accuracy is high, and the method solves the problem that, when the scene is dimly lit or it is night, the camera struggles to capture the target's face and the blurred target information it collects makes target identification inaccurate.
In the description herein, references to the description of "one embodiment," "an example," "a specific example" or the like are intended to mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
The preferred embodiments of the invention disclosed above are intended to be illustrative only. The preferred embodiments are not intended to be exhaustive or to limit the invention to the precise embodiments disclosed. Obviously, many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles of the invention and the practical application, to thereby enable others skilled in the art to best utilize the invention. The invention is limited only by the claims and their full scope and equivalents.
Claims (8)
1. A target identification method, comprising:
SS01: detecting the target in the video frame image;
SS02, image optimization: applying image enhancement, face capture, and image scaling to the target detected in the video frame image to obtain an image of the target's face;
SS03: inputting the image region of the target's face into a deep feature network to obtain the features of the target's face, wherein the deep feature network is obtained by inputting a plurality of facial images into a deep convolutional neural network for training;
SS04: searching the pre-stored correspondence between features and identities for an identity matching the features;
SS05: determining the found identity as the target recognition result.
2. The method of claim 1, wherein detecting the target in the video frame image comprises:
detecting the target in the video frame image with an image detection algorithm;
or matching the video frame image against a preset target model and taking the successfully matched image region as the region where the target is located.
3. The method of claim 1, wherein the training process of the deep feature network comprises:
acquiring a plurality of facial images, the plurality of facial images containing the same target at multiple angles;
inputting the regions where the targets in the facial images are located into a deep convolutional neural network;
performing classification training on the same target at multiple angles with a stochastic gradient descent algorithm, and computing the parameters of each layer of the deep convolutional neural network through backpropagation;
and constructing the deep feature network from the parameters of each layer.
4. The method of claim 1, wherein the video frame images include images captured by at least two image acquisition devices, and after the step of determining the found identity as the target recognition result, the method further comprises:
tracking the target in the images acquired by each image acquisition device using the target identification result corresponding to that device.
5. An image processing device, comprising a processor and a memory, the memory storing executable program code and the processor, by reading the executable program code stored in the memory, running a program corresponding to that code so as to perform the following steps:
SS01: detecting the target in the video frame image;
SS02, image optimization: applying image enhancement, face capture, and image scaling to the target detected in the video frame image to obtain an image of the target's face;
SS03: inputting the image region of the target's face into a deep feature network to obtain the features of the target's face, wherein the deep feature network is obtained by inputting a plurality of images into a deep convolutional neural network for training;
SS04: searching the pre-stored correspondence between features and identities for an identity matching the features;
SS05: determining the found identity as the target recognition result.
6. The image processing device of claim 5, wherein the processor is further configured to perform the steps of:
detecting the target in the video frame image with an image detection algorithm;
or matching the video frame image against a preset target model and taking the successfully matched image region as the region where the target is located.
7. The image processing device of claim 5, wherein the processor is further configured to perform the steps of:
acquiring a plurality of facial images, the plurality of facial images containing the same target at multiple angles;
inputting the regions where the targets in the facial images are located into a deep convolutional neural network;
performing classification training on the same target at multiple angles with a stochastic gradient descent algorithm, and computing the parameters of each layer of the deep convolutional neural network through backpropagation;
and constructing the deep feature network from the parameters of each layer.
8. The image processing device of claim 5, wherein the video frame images include images captured by at least two image acquisition devices, and after determining the found identity as the target recognition result, the steps further comprise:
tracking the target in the images acquired by each image acquisition device using the target identification result corresponding to that device.
Priority Applications (1)

| Application Number | Priority Date | Filing Date | Title |
| --- | --- | --- | --- |
| CN201911047606.XA | 2019-10-31 | 2019-10-31 | Target identification method and image processing equipment |
Publications (1)

| Publication Number | Publication Date |
| --- | --- |
| CN110826455A | 2020-02-21 |
Family
ID=69551605
Family Applications (1)

| Application Number | Title | Priority Date | Filing Date |
| --- | --- | --- | --- |
| CN201911047606.XA | Target identification method and image processing equipment | 2019-10-31 | 2019-10-31 |

Country Status (1)

| Country | Link |
| --- | --- |
| CN | CN110826455A (en) |

2019-10-31: application CN201911047606.XA filed (CN), published as CN110826455A; status: withdrawn.
Legal Events

| Code | Title | Description |
| --- | --- | --- |
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| WW01 | Invention patent application withdrawn after publication | Application publication date: 20200221 |