CN110826455A - Target identification method and image processing equipment - Google Patents

Target identification method and image processing equipment

Info

Publication number
CN110826455A
Authority
CN
China
Prior art keywords
target
image
video frame
identity
feature
Prior art date
Legal status
Withdrawn
Application number
CN201911047606.XA
Other languages
Chinese (zh)
Inventor
韩贵金
侯铁双
周有
Current Assignee
Xi'an University of Posts and Telecommunications
Original Assignee
Xi'an University of Posts and Telecommunications
Priority date: 2019-10-31
Filing date: 2019-10-31
Publication date: 2020-02-21
Application filed by Xi'an University of Posts and Telecommunications
Priority to CN201911047606.XA
Publication of CN110826455A

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161Detection; Localisation; Normalisation
    • G06V40/166Detection; Localisation; Normalisation using acquisition arrangements
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168Feature extraction; Face representation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172Classification, e.g. identification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07Target detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Multimedia (AREA)
  • Data Mining & Analysis (AREA)
  • Human Computer Interaction (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Library & Information Science (AREA)
  • Databases & Information Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a target identification method and image processing equipment, and relates to the technical field of machine vision. The method detects a target in a video frame image, optimizes the image, obtains the features of the target's face using a deep feature network, searches a pre-stored correspondence between features and identities for an identity matching those features, and determines the found identity as the target recognition result. Through image optimization, the target detected in the video frame image undergoes image enhancement, face capture, and image scaling to obtain an image of the target's face; the facial features of the target are then extracted by the deep feature network, and the target's identity is recognized by combining the pre-stored correspondence between features and identities. Recognition accuracy is high, and the method solves the problem of inaccurate target identification caused by blurred target information acquired when scene lighting is dark or at night.

Description

Target identification method and image processing equipment
Technical Field
The invention belongs to the technical field of machine vision, and particularly relates to a target identification method and image processing equipment.
Background
At present, monitoring equipment is deployed in many scenes; for example, shopping malls, schools, office buildings, and similar scenes are usually fitted with multiple cameras for monitoring. An image processing device connected to a camera analyzes the images acquired by the camera and can identify a target in the images from the analysis results.
However, when scene lighting is dark or at night, it is difficult for the camera to capture the target's face, and the acquired target information is blurred, making target identification inaccurate.
Disclosure of Invention
The invention aims to provide a target identification method and image processing equipment that, through image optimization, solve the problem that when scene lighting is dark or at night the camera has difficulty capturing the target's face and the acquired target information is blurred, making target identification inaccurate.
In order to solve the above technical problems, the invention is realized by the following technical solutions:
the invention relates to a target identification method, which comprises the following steps:
SS01: detecting a target in the video frame image;
SS02, image optimization: performing image enhancement, face capture, and image scaling on the detected target in the video frame image to obtain an image of the target's face;
SS03: inputting the image area of the target's face into a deep feature network to obtain the features of the target's face, wherein the deep feature network is obtained by training a deep convolutional neural network on a plurality of facial images;
SS04: searching a pre-stored correspondence between features and identities for an identity matching the features;
SS05: determining the found identity as the target recognition result.
Further, detecting a target in the video frame image comprises:
detecting the target in the video frame image using an image detection algorithm;
or matching the video frame image against a preset target model and determining the successfully matched image area as the image area where the target is located.
Further, the training process of the deep feature network comprises:
acquiring a plurality of facial images, wherein the plurality of facial images contain the same target at multiple angles;
inputting the region where the target is located in each facial image into a deep convolutional neural network;
performing classification training on the same target at multiple angles using a stochastic gradient descent algorithm, and computing the parameters of each layer of the deep convolutional neural network through back propagation;
and constructing the deep feature network from the parameters of each layer.
Further, the video frame images comprise images acquired by at least two image acquisition devices, and after determining the found identity as the target recognition result, the method further comprises:
tracking the target in the images acquired by each image acquisition device using the target recognition result corresponding to that image acquisition device.
An image processing apparatus, comprising a processor and a memory, wherein the memory is configured to store executable program code, and the processor, by reading the executable program code stored in the memory, runs the program corresponding to the executable program code so as to perform the following steps:
SS01: detecting a target in the video frame image;
SS02, image optimization: performing image enhancement, face capture, and image scaling on the detected target in the video frame image to obtain an image of the target's face;
SS03: inputting the image area of the target's face into a deep feature network to obtain the features of the target's face, wherein the deep feature network is obtained by training a deep convolutional neural network on a plurality of facial images;
SS04: searching a pre-stored correspondence between features and identities for an identity matching the features;
SS05: determining the found identity as the target recognition result.
Further, the processor is further configured to perform the steps of:
detecting the target in the video frame image using an image detection algorithm;
or matching the video frame image against a preset target model and determining the successfully matched image area as the image area where the target is located.
Further, the processor is further configured to perform the steps of:
acquiring a plurality of facial images, wherein the plurality of facial images contain the same target at multiple angles;
inputting the region where the target is located in each facial image into a deep convolutional neural network;
performing classification training on the same target at multiple angles using a stochastic gradient descent algorithm, and computing the parameters of each layer of the deep convolutional neural network through back propagation;
and constructing the deep feature network from the parameters of each layer.
Further, the video frame images comprise images acquired by at least two image acquisition devices, and after determining the found identity as the target recognition result, the processor is further configured to perform:
tracking the target in the images acquired by each image acquisition device using the target recognition result corresponding to that image acquisition device.
The invention has the following beneficial effects: through image optimization, the target detected in the video frame image undergoes image enhancement, face capture, and image scaling to obtain an image of the target's face; the facial features of the target in the video frame image are then extracted by the deep feature network, and the target's identity is recognized by combining the pre-stored correspondence between features and identities. Recognition accuracy is high, and the method solves the problem that, when scene lighting is dark or at night, the camera has difficulty capturing the target's face and the acquired target information is blurred, making target identification inaccurate.
Of course, it is not necessary for any product in which the invention is practiced to achieve all of the above-described advantages at the same time.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below are obviously only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic flowchart of the target identification method;
FIG. 2 is a schematic view of an application scenario in an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to FIGS. 1-2, the present invention provides a target identification method, comprising:
SS01: detecting a target in the video frame image;
SS02, image optimization: performing image enhancement, face capture, and image scaling on the detected target in the video frame image to obtain an image of the target's face;
SS03: inputting the image area of the target's face into a deep feature network to obtain the features of the target's face, wherein the deep feature network is obtained by training a deep convolutional neural network on a plurality of facial images;
SS04: searching a pre-stored correspondence between features and identities for an identity matching the features;
SS05: determining the found identity as the target recognition result.
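The following minimal sketch (Python) wires steps SS01 to SS05 together. It assumes the helper functions defined in the sketches later in this section (detect_by_template, enhance, capture_face, scale_to_input, lookup_identity) plus an extract_feature function wrapping the trained deep feature network; all names are illustrative, not part of the patent.

```python
def recognize(frame, template, feature_db, extract_feature):
    """Hedged end-to-end sketch of SS01-SS05; the helper functions are
    hypothetical and are defined in the sketches that follow."""
    box = detect_by_template(frame, template)      # SS01: detect the target
    if box is None:
        return None
    x, y, w, h = box
    region = enhance(frame[y:y + h, x:x + w])      # SS02: image enhancement
    face = capture_face(region)                    # SS02: face capture
    if face is None:
        return None
    face = scale_to_input(face)                    # SS02: image scaling
    feature = extract_feature(face)                # SS03: deep feature network
    return lookup_identity(feature, feature_db)    # SS04/SS05: identity result
```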
Image enhancement adds information to, or transforms, the original image by some means, selectively highlighting features of interest in the image or suppressing unwanted features, so that the image better matches visual response characteristics. Specifically, a spatial-domain method is adopted, which operates directly on the pixels of the image and is described by the following formula:
g(x, y) = f(x, y) * h(x, y)
wherein f(x, y) is the original image, h(x, y) is a spatial transfer function, g(x, y) is the processed image, and * denotes spatial convolution.
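As one concrete reading of the formula, the sketch below treats h(x, y) as a 3x3 sharpening kernel and additionally equalizes luminance to compensate for dark scenes; both choices are assumptions, since the patent does not fix a particular h(x, y).

```python
import cv2
import numpy as np

def enhance(image_bgr):
    """Spatial-domain enhancement g = f * h: equalize luminance for dark
    scenes, then convolve the frame with a sharpening kernel h."""
    # Lift brightness in dark/night scenes by equalizing the luminance
    # channel (an assumed pre-step, not prescribed by the patent).
    ycrcb = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2YCrCb)
    ycrcb[:, :, 0] = cv2.equalizeHist(ycrcb[:, :, 0])
    f = cv2.cvtColor(ycrcb, cv2.COLOR_YCrCb2BGR)
    # One possible spatial transfer function h: a 3x3 sharpening kernel.
    h = np.array([[0, -1, 0],
                  [-1, 5, -1],
                  [0, -1, 0]], dtype=np.float32)
    return cv2.filter2D(f, -1, h)  # g(x, y) = f(x, y) * h(x, y)
```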
Face capture adopts the existing OptiTrack facial capture technology, which can accurately capture the target's face in the detected video frame image.
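Because the OptiTrack capture interface is proprietary and not reproduced in the patent, the sketch below substitutes a standard OpenCV Haar-cascade face detector as a stand-in for the face-capture step:

```python
import cv2

# OpenCV's bundled Haar cascade, used here only as a stand-in detector.
_face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def capture_face(region_bgr):
    """Return the largest detected face crop from a target region, or None."""
    gray = cv2.cvtColor(region_bgr, cv2.COLOR_BGR2GRAY)
    faces = _face_cascade.detectMultiScale(gray, scaleFactor=1.1,
                                           minNeighbors=5)
    if len(faces) == 0:
        return None
    x, y, w, h = max(faces, key=lambda f: f[2] * f[3])  # largest box
    return region_bgr[y:y + h, x:x + w]
```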
Image scaling first segments the original low-resolution image into different regions, then maps each interpolation point onto the low-resolution image and determines the region to which it belongs, and finally computes the value of each interpolation point using an interpolation formula designed for its neighborhood pixels, thereby obtaining a high-resolution image of the target's face.
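A simplified sketch of the scaling step, using ordinary bicubic interpolation in place of the region-adaptive interpolation described above; the 224x224 network input size is an assumption:

```python
import cv2

def scale_to_input(face_bgr, size=(224, 224)):
    # Bicubic interpolation as a simplified stand-in for the
    # region-adaptive interpolation scheme described in the text.
    return cv2.resize(face_bgr, size, interpolation=cv2.INTER_CUBIC)
```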
Wherein detecting a target in the video frame image comprises:
detecting the target in the video frame image using an image detection algorithm;
or matching the video frame image against a preset target model and determining the successfully matched image area as the image area where the target is located.
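The second branch, matching against a preset target model, can be sketched with OpenCV template matching; the 0.7 acceptance threshold is an illustrative assumption:

```python
import cv2

def detect_by_template(frame_bgr, template_bgr, threshold=0.7):
    """Match a preset target model (template) against the frame; a region
    whose normalized correlation exceeds the threshold is taken as the
    image area where the target is located."""
    result = cv2.matchTemplate(frame_bgr, template_bgr, cv2.TM_CCOEFF_NORMED)
    _, max_val, _, max_loc = cv2.minMaxLoc(result)
    if max_val < threshold:
        return None  # no successful match in this frame
    h, w = template_bgr.shape[:2]
    x, y = max_loc
    return (x, y, w, h)  # matched image area
```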
Wherein the training process of the deep feature network comprises:
acquiring a plurality of facial images, wherein the plurality of facial images contain the same target at multiple angles;
inputting the region where the target is located in each facial image into a deep convolutional neural network;
performing classification training on the same target at multiple angles using a stochastic gradient descent algorithm, and computing the parameters of each layer of the deep convolutional neural network through back propagation;
and constructing the deep feature network from the parameters of each layer.
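A minimal PyTorch sketch of this training process. The network architecture, feature dimension, and hyperparameters are illustrative assumptions, and the loader is assumed to yield (face image, person id) pairs of the same targets at multiple angles; at inference only the backbone is used, as the feature extractor of step SS03.

```python
import torch
import torch.nn as nn

class DeepFeatureNet(nn.Module):
    def __init__(self, num_identities, feature_dim=128):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feature_dim),
        )
        self.classifier = nn.Linear(feature_dim, num_identities)

    def forward(self, x):
        feature = self.backbone(x)        # the face feature used in SS03
        return self.classifier(feature)   # classification head, training only

def train(model, loader, epochs=10):
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
    criterion = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for images, person_ids in loader:
            optimizer.zero_grad()
            loss = criterion(model(images), person_ids)
            loss.backward()   # back-propagation computes layer gradients
            optimizer.step()  # stochastic gradient descent update
```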
Wherein the video frame images comprise images acquired by at least two image acquisition devices, and after determining the found identity as the target recognition result, the method further comprises:
tracking the target in the images acquired by each image acquisition device using the target recognition result corresponding to that image acquisition device.
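A minimal sketch of using per-camera recognition results for tracking, assuming recognition yields an identity-to-bounding-box mapping per frame; the data layout is an assumption:

```python
from collections import defaultdict

def update_tracks(tracks, camera_id, frame_index, recognized):
    """Per-camera tracking keyed by recognized identity. `recognized`
    maps identity -> bounding box for one frame; `tracks` accumulates
    each identity's trajectory per camera."""
    for identity, box in recognized.items():
        tracks[(camera_id, identity)].append((frame_index, box))
    return tracks

tracks = defaultdict(list)  # (camera_id, identity) -> [(frame, box), ...]
```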
An image processing apparatus, comprising a processor and a memory, wherein the memory is configured to store executable program code, and the processor, by reading the executable program code stored in the memory, runs the program corresponding to the executable program code so as to perform the following steps:
SS01: detecting a target in the video frame image;
the image processing equipment is in communication connection with the image acquisition equipment, the equipment can acquire video frame images acquired by the image acquisition equipment, and target identification can be performed by using the scheme aiming at each acquired video frame image.
SS02, image optimization: performing image enhancement, face capture, and image scaling on the detected target in the video frame image to obtain an image of the target's face;
SS03: inputting the image area of the target's face into a deep feature network to obtain the features of the target's face, wherein the deep feature network is obtained by training a deep convolutional neural network on a plurality of facial images;
SS04: searching a pre-stored correspondence between features and identities for an identity matching the features;
the corresponding relationship may be stored in a form of a data table, where the identity is represented by a key and the feature is represented by a value, and the data table may be stored in the device, or may be stored in another device connected to the device, which is not limited specifically.
SS05: determining the found identity as the target recognition result.
Wherein the processor is further configured to perform the steps of:
detecting the target in the video frame image using an image detection algorithm;
or matching the video frame image against a preset target model and determining the successfully matched image area as the image area where the target is located.
Wherein the processor is further configured to perform the steps of:
acquiring a plurality of facial images, wherein the plurality of facial images contain the same target at multiple angles;
inputting the region where the target is located in each facial image into a deep convolutional neural network;
performing classification training on the same target at multiple angles using a stochastic gradient descent algorithm, and computing the parameters of each layer of the deep convolutional neural network through back propagation;
and constructing the deep feature network from the parameters of each layer.
Wherein the video frame images comprise images acquired by at least two image acquisition devices, and after determining the found identity as the target recognition result, the processor is further configured to perform:
tracking the target in the images acquired by each image acquisition device using the target recognition result corresponding to that image acquisition device.
In the description herein, references to the description of "one embodiment," "an example," "a specific example" or the like are intended to mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
The preferred embodiments of the invention disclosed above are intended to be illustrative only. The preferred embodiments are not intended to be exhaustive or to limit the invention to the precise embodiments disclosed. Obviously, many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles of the invention and the practical application, to thereby enable others skilled in the art to best utilize the invention. The invention is limited only by the claims and their full scope and equivalents.

Claims (8)

1. A target identification method, comprising:
SS01: detecting a target in the video frame image;
SS02, image optimization: performing image enhancement, face capture, and image scaling on the detected target in the video frame image to obtain an image of the target's face;
SS03: inputting the image area of the target's face into a deep feature network to obtain the features of the target's face, wherein the deep feature network is obtained by training a deep convolutional neural network on a plurality of facial images;
SS04: searching a pre-stored correspondence between features and identities for an identity matching the features;
SS05: determining the found identity as the target recognition result.
2. The method of claim 1, wherein detecting the target in the video frame image comprises:
detecting the target in the video frame image using an image detection algorithm;
or matching the video frame image against a preset target model and determining the successfully matched image area as the image area where the target is located.
3. The method of claim 1, wherein the training process of the deep feature network comprises:
acquiring a plurality of facial images, wherein the plurality of facial images contain the same target at multiple angles;
inputting the region where the target is located in each facial image into a deep convolutional neural network;
performing classification training on the same target at multiple angles using a stochastic gradient descent algorithm, and computing the parameters of each layer of the deep convolutional neural network through back propagation;
and constructing the deep feature network from the parameters of each layer.
4. The method of claim 1, wherein the video frame images comprise images acquired by at least two image acquisition devices, and after determining the found identity as the target recognition result, the method further comprises:
tracking the target in the images acquired by each image acquisition device using the target recognition result corresponding to that image acquisition device.
5. An image processing apparatus, comprising a processor and a memory, wherein the memory is configured to store executable program code, and the processor, by reading the executable program code stored in the memory, runs the program corresponding to the executable program code so as to perform the following steps:
SS01: detecting a target in the video frame image;
SS02, image optimization: performing image enhancement, face capture, and image scaling on the detected target in the video frame image to obtain an image of the target's face;
SS03: inputting the image area of the target's face into a deep feature network to obtain the features of the target's face, wherein the deep feature network is obtained by training a deep convolutional neural network on a plurality of facial images;
SS04: searching a pre-stored correspondence between features and identities for an identity matching the features;
SS05: determining the found identity as the target recognition result.
6. The image processing apparatus according to claim 5, wherein the processor is further configured to perform the steps of:
detecting the target in the video frame image using an image detection algorithm;
or matching the video frame image against a preset target model and determining the successfully matched image area as the image area where the target is located.
7. The image processing apparatus according to claim 5, wherein the processor is further configured to perform the steps of:
acquiring a plurality of facial images, wherein the plurality of facial images contain the same target at multiple angles;
inputting the region where the target is located in each facial image into a deep convolutional neural network;
performing classification training on the same target at multiple angles using a stochastic gradient descent algorithm, and computing the parameters of each layer of the deep convolutional neural network through back propagation;
and constructing the deep feature network from the parameters of each layer.
8. The image processing apparatus according to claim 5, wherein the video frame images comprise images acquired by at least two image acquisition devices, and after determining the found identity as the target recognition result, the processor is further configured to perform:
tracking the target in the images acquired by each image acquisition device using the target recognition result corresponding to that image acquisition device.
CN201911047606.XA 2019-10-31 2019-10-31 Target identification method and image processing equipment Withdrawn CN110826455A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911047606.XA CN110826455A (en) 2019-10-31 2019-10-31 Target identification method and image processing equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911047606.XA CN110826455A (en) 2019-10-31 2019-10-31 Target identification method and image processing equipment

Publications (1)

Publication Number Publication Date
CN110826455A true CN110826455A (en) 2020-02-21

Family

ID=69551605

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911047606.XA Withdrawn CN110826455A (en) 2019-10-31 2019-10-31 Target identification method and image processing equipment

Country Status (1)

Country Link
CN (1) CN110826455A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication (application publication date: 20200221)