Disclosure of Invention
The main object of the invention is to provide an image recognition method, an image recognition device, and a computer-readable storage medium that solve the technical problem of the large amount of computation required when image recognition is performed through deep learning.
In order to achieve the above object, the present invention provides an image recognition method, including:
acquiring an image to be recognized, and determining a non-region of interest in the image to be recognized;
acquiring the initial resolution of the non-region of interest, and performing resolution reduction processing on the initial resolution of the non-region of interest according to a preset parameter;
and performing image recognition calculation on the image to be recognized after the resolution of the non-region of interest has been reduced, to recognize a target object.
Optionally, the step of acquiring an image to be recognized and determining a non-region of interest in the image to be recognized includes:
acquiring the image to be recognized, and dividing the image to be recognized into a region of interest and a non-region of interest according to a preset rule.
Optionally, the step of performing resolution reduction processing on the initial resolution of the non-region of interest according to a preset parameter includes:
performing, with a preset tool, resolution reduction processing on the initial resolution of the non-region of interest according to the preset parameter.
Optionally, the step of performing image recognition calculation on the image to be recognized after the resolution of the non-region of interest has been reduced, to recognize the target object, includes:
performing, by using a deep-learning-based convolutional neural network, image recognition calculation on the image to be recognized after the resolution of the non-region of interest has been reduced, to recognize the target object, wherein the deep-learning-based convolutional neural network is a pre-trained model for recognizing the target object.
Optionally, the step of performing, by using a deep-learning-based convolutional neural network, image recognition calculation on the image to be recognized after the resolution of the non-region of interest has been reduced, to recognize the target object, includes:
performing feature extraction separately on the region of interest and the non-region of interest in the image to be recognized after the resolution of the non-region of interest has been reduced, to obtain a first feature vector corresponding to the region of interest and a second feature vector corresponding to the non-region of interest;
and inputting the first feature vector corresponding to the region of interest and the second feature vector corresponding to the non-region of interest separately into the deep-learning-based convolutional neural network, and performing target object recognition for the region of interest and target object recognition for the non-region of interest, to recognize the target object.
Further, to achieve the above object, the present invention also provides an image recognition apparatus comprising: a memory, a processor, and an image recognition program stored on the memory and executable on the processor, the image recognition program when executed by the processor implementing the steps of:
acquiring an image to be recognized, and determining a non-region of interest in the image to be recognized;
acquiring the initial resolution of the non-region of interest, and performing resolution reduction processing on the initial resolution of the non-region of interest according to a preset parameter;
and performing image recognition calculation on the image to be recognized after the resolution of the non-region of interest has been reduced, to recognize a target object.
Optionally, the image recognition program when executed by the processor further implements the steps of:
acquiring the image to be recognized, and dividing the image to be recognized into a region of interest and a non-region of interest according to a preset rule.
Optionally, the image recognition program when executed by the processor further implements the steps of:
performing, with a preset tool, resolution reduction processing on the initial resolution of the non-region of interest according to the preset parameter.
Optionally, the image recognition program when executed by the processor further implements the steps of:
performing, by using a deep-learning-based convolutional neural network, image recognition calculation on the image to be recognized after the resolution of the non-region of interest has been reduced, to recognize the target object, wherein the deep-learning-based convolutional neural network is a pre-trained model for recognizing the target object.
Further, to achieve the above object, the present invention also provides a computer-readable storage medium having stored thereon an image recognition program which, when executed by a processor, realizes the steps of:
acquiring an image to be recognized, and determining a non-region of interest in the image to be recognized;
acquiring the initial resolution of the non-region of interest, and performing resolution reduction processing on the initial resolution of the non-region of interest according to a preset parameter;
and performing image recognition calculation on the image to be recognized after the resolution of the non-region of interest has been reduced, to recognize a target object.
The invention provides an image recognition method that first acquires an image to be recognized, then determines a non-region of interest in the image, acquires the initial resolution of the non-region of interest, performs resolution reduction processing on that initial resolution according to a preset parameter, and finally performs image recognition calculation on the image to be recognized after the resolution of the non-region of interest has been reduced, so as to recognize a target object. When an image is recognized through deep learning, a lower-resolution image contains fewer pixels and therefore requires less computation. Consequently, when the image to be recognized is processed after the resolution of its non-region of interest has been reduced, the computation spent on the non-region of interest decreases, which reduces the total computation required to recognize the image and improves recognition efficiency.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
The main solution of the embodiments of the invention is as follows: acquiring an image to be recognized, and determining a non-region of interest in the image to be recognized; acquiring the initial resolution of the non-region of interest, and performing resolution reduction processing on the initial resolution of the non-region of interest according to a preset parameter; and performing image recognition calculation on the image to be recognized after the resolution of the non-region of interest has been reduced, to recognize a target object.
Fig. 1 is a schematic structural diagram of a terminal in the hardware operating environment according to an embodiment of the present invention.
As shown in fig. 1, the terminal may include: a processor 1001 (such as a CPU), a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005. The communication bus 1002 enables communication between these components. The user interface 1003 may include a display screen (Display) and an input unit such as a keyboard (Keyboard); optionally, the user interface 1003 may also include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a Wi-Fi interface). The memory 1005 may be a high-speed RAM memory or a non-volatile memory (e.g., a magnetic disk memory). The memory 1005 may alternatively be a storage device separate from the processor 1001.
Optionally, the terminal may further include a camera, a radio frequency (RF) circuit, sensors, an audio circuit, a Wi-Fi module, and the like. The sensors include, for example, light sensors, motion sensors, and other sensors. Specifically, the light sensors may include an ambient light sensor, which can adjust the brightness of the display screen according to the brightness of ambient light, and a proximity sensor, which can turn off the display screen and/or the backlight when the mobile terminal is moved to the ear. As one kind of motion sensor, a gravity acceleration sensor can detect the magnitude of acceleration in each direction (generally three axes) and the magnitude and direction of gravity when the mobile terminal is stationary; it can be used in applications that recognize the attitude of the mobile terminal (such as switching between landscape and portrait, related games, and magnetometer attitude calibration) and in vibration-recognition functions (such as a pedometer and tap detection). The mobile terminal may of course also be equipped with other sensors such as a gyroscope, a barometer, a hygrometer, a thermometer, and an infrared sensor, which are not described here.
Those skilled in the art will appreciate that the terminal structure shown in fig. 1 does not limit the terminal, which may include more or fewer components than those shown, combine some components, or arrange the components differently.
As shown in fig. 1, the memory 1005, as a computer storage medium, may include an operating system, a network communication module, a user interface module, and an image recognition program.
In the terminal shown in fig. 1, the network interface 1004 is mainly used for connecting to a backend server and performing data communication with the backend server; the user interface 1003 is mainly used for connecting a client (user side) and performing data communication with the client; and the processor 1001 may be configured to call the image recognition program stored in the memory 1005 and perform the following operations:
acquiring an image to be recognized, and determining a non-region of interest in the image to be recognized;
acquiring the initial resolution of the non-region of interest, and performing resolution reduction processing on the initial resolution of the non-region of interest according to a preset parameter;
and performing image recognition calculation on the image to be recognized after the resolution of the non-region of interest has been reduced, to recognize a target object.
Further, the processor 1001 may call the image recognition program stored in the memory 1005, and also perform the following operations:
acquiring the image to be recognized, and dividing the image to be recognized into a region of interest and a non-region of interest according to a preset rule.
Further, the processor 1001 may call the image recognition program stored in the memory 1005, and also perform the following operations:
performing, with a preset tool, resolution reduction processing on the initial resolution of the non-region of interest according to the preset parameter.
Further, the processor 1001 may call the image recognition program stored in the memory 1005, and also perform the following operations:
performing, by using a deep-learning-based convolutional neural network, image recognition calculation on the image to be recognized after the resolution of the non-region of interest has been reduced, to recognize the target object, wherein the deep-learning-based convolutional neural network is a pre-trained model for recognizing the target object.
Further, the processor 1001 may call the image recognition program stored in the memory 1005, and also perform the following operations:
performing feature extraction separately on the region of interest and the non-region of interest in the image to be recognized after the resolution of the non-region of interest has been reduced, to obtain a first feature vector corresponding to the region of interest and a second feature vector corresponding to the non-region of interest;
and inputting the first feature vector corresponding to the region of interest and the second feature vector corresponding to the non-region of interest separately into the deep-learning-based convolutional neural network, and performing target object recognition for the region of interest and target object recognition for the non-region of interest, to recognize the target object.
Based on the hardware structure, various embodiments of the image recognition method of the invention are provided.
Referring to fig. 2, a first embodiment of an image recognition method according to the present invention provides an image recognition method including:
Step S10, acquiring an image to be recognized, and determining a non-region of interest in the image to be recognized;
The scheme of this embodiment can be applied to the field of autonomous driving. To drive autonomously, an unmanned vehicle needs environment perception capability, which includes recognizing and detecting targets such as vehicles and pedestrians; this can be achieved by using deep learning to process the image information captured by a camera module on the unmanned vehicle. At present, recognizing an image through deep learning means running the algorithm of a deep neural network on the image. To reduce the amount of computation during image recognition and improve recognition efficiency, this embodiment provides an image recognition method.
Deep learning methods can be divided into unsupervised learning and supervised learning. Unsupervised learning includes Deep Belief Networks (DBNs), and supervised learning includes Convolutional Neural Networks (CNNs). A CNN is a type of artificial neural network and a variant of the multi-layer perceptron (MLP), and has become an efficient recognition method in the field of image recognition. A CNN algorithm can be used to perform recognition calculation on images containing the surroundings of the unmanned vehicle.
In this embodiment, the sensors with which the unmanned vehicle perceives its surroundings include a camera, which can capture images of the environment directly ahead on the road being traveled and on both sides of that road. Since the lens parameters of the camera (including focal length, field of view, aperture, etc.) are fixed, the size and resolution of the images it captures are known.
First, the original image captured by the camera is acquired; this image is the image to be recognized, and a non-region of interest in it is determined. The non-region of interest is defined relative to the region of interest (ROI). In this embodiment, the region of the image to be recognized that shows the environment directly ahead on the road being traveled (i.e., the region that is decisive for the driving decisions of the unmanned vehicle) is taken as the region of interest, and the regions that show the environment on both sides of the road (i.e., the edge regions that have little influence on the driving decisions of the unmanned vehicle) are taken as the non-region of interest.
Step S20, acquiring the initial resolution of the non-region of interest, and performing resolution reduction processing on the initial resolution of the non-region of interest according to a preset parameter;
After the non-region of interest in the image to be recognized is determined, its initial resolution is acquired, and resolution reduction processing is performed on that initial resolution according to a preset parameter. Resolution refers to the number of pixels per inch of image: the higher the pixel density, the higher the resolution. Performing resolution reduction processing on the initial resolution of the non-region of interest therefore means modifying the pixel count of the non-region of interest so that it contains fewer pixels. The preset parameter is set based on two values: the resolution threshold at which the image to be recognized can still be recognized (determined by the properties of the convolutional neural network used in this embodiment, and a fixed value) and the initial resolution of the image to be recognized (which is higher than that threshold). The preset parameter may be a scale parameter or a specific resolution parameter, and can be set flexibly.
If the preset parameter is a scale parameter, it is smaller than 1, but its lower limit must be determined from the initial resolution of the image to be recognized together with the recognizable resolution threshold, so that the reduced image can still be recognized. For example, if a scale parameter of 1/3 satisfies this lower limit, then performing resolution reduction processing on the initial resolution of the non-region of interest according to the preset parameter means adjusting the resolution of the non-region of interest to 1/3 of its initial resolution. If the preset parameter is a specific resolution parameter, it may be set to a specific resolution between the recognizable resolution threshold and the initial resolution of the image to be recognized. For example, if the recognizable resolution threshold is 100×100 and the initial resolution of the image to be recognized is 1600×1600, the preset parameter may be set to 400×400, in which case the resolution reduction processing reduces the initial resolution of the non-region of interest to 400×400. In this way, reducing the resolution of the non-region of interest does not prevent the target object in that region from being recognized.
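The two forms of preset parameter described above can be illustrated with a minimal Python sketch. The function name `target_resolution` and the choice to apply a scale parameter to each dimension are assumptions made for illustration; the description does not specify how the scale is applied. The sketch only checks that the result lies between the recognizable threshold and the initial resolution.

```python
from typing import Tuple, Union

Resolution = Tuple[int, int]  # (width, height) in pixels


def target_resolution(initial: Resolution,
                      threshold: Resolution,
                      preset: Union[float, Resolution]) -> Resolution:
    """Return the reduced resolution for the non-region of interest.

    preset is either a scale parameter (a float below 1, applied here to
    each dimension, which is an assumption) or a specific resolution.
    """
    if isinstance(preset, tuple):
        target = preset
    else:
        if not 0.0 < preset < 1.0:
            raise ValueError("a scale parameter must be between 0 and 1")
        target = (round(initial[0] * preset), round(initial[1] * preset))
    # The result must stay recognizable and must actually be a reduction.
    for value, low, high in zip(target, threshold, initial):
        if not low <= value < high:
            raise ValueError("target resolution is outside the allowed range")
    return target


# Values taken from the example in the text: threshold 100x100, initial 1600x1600.
print(target_resolution((1600, 1600), (100, 100), (400, 400)))
print(target_resolution((1600, 1600), (100, 100), 1 / 3))
```

A scale parameter that would push the result below the threshold (for instance 0.05 with the values above) raises an error in this sketch, which corresponds to the lower limit on the scale parameter.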
Step S21, performing, with a preset tool, resolution reduction processing on the initial resolution of the non-region of interest according to the preset parameter.
In this embodiment, a tool capable of reducing image resolution may be preset and used to perform resolution reduction processing on the initial resolution of the non-region of interest according to the preset parameter. For example, this may be implemented with OpenCV, a cross-platform computer vision library released under the BSD license (open source) that supports image processing operations including contour extraction, morphological processing, image segmentation, and feature description. For the specific procedure, refer to existing OpenCV methods for reducing resolution, which are not described here.
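To make the effect of resolution reduction concrete, the following is a minimal pure-Python sketch that averages each block of pixels into one output pixel. It is a stand-in for the preset tool, not the procedure of the embodiment; in practice a library routine such as OpenCV's `cv2.resize` would typically be used. The function name `downscale` and the integer `factor` parameter are assumptions for illustration.

```python
from typing import List

Image = List[List[int]]  # rows of grayscale pixel values


def downscale(image: Image, factor: int) -> Image:
    """Reduce resolution by averaging each factor x factor block of pixels.

    Rows and columns that do not fill a complete block are dropped.
    """
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    out_height = len(image) // factor
    out_width = len(image[0]) // factor
    result = []
    for by in range(out_height):
        row = []
        for bx in range(out_width):
            block = [image[by * factor + dy][bx * factor + dx]
                     for dy in range(factor)
                     for dx in range(factor)]
            row.append(sum(block) // len(block))  # integer mean of the block
        result.append(row)
    return result


strip = [[0, 2, 4, 6],
         [2, 4, 6, 8],
         [10, 10, 20, 20],
         [10, 10, 20, 20]]
small = downscale(strip, 2)
print(small)  # [[2, 6], [10, 20]]
```

With a factor of 2, the 4×4 strip becomes 2×2, so the pixel count of the region drops to one quarter.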
Step S30, performing image recognition calculation on the image to be recognized after the resolution of the non-region of interest has been reduced, to recognize the target object.
In this embodiment, image recognition calculation is performed on the image to be recognized, after the resolution of the non-region of interest has been reduced, through a deep learning convolutional neural network so as to recognize the target object. Target objects include vehicles, pedestrians, traffic signs, and the like. The deep learning convolutional neural network is a model trained in advance to recognize target objects. Training the convolutional neural network includes computing a convolution feature value and a loss function value. Specifically, a sample image is input into the convolutional layer of the convolutional neural network to be trained and processed to obtain a convolution feature value; the convolution feature value is then input into the loss function layer of the convolutional neural network to be trained to compute a loss function value. When the loss function value is smaller than the convolution feature value, the convolutional layer is adjusted, and the convolution feature value and the loss function value are computed iteratively until the number of iterations reaches a set number (for example, 3 million), which yields the convolutional neural network. The image to be recognized, after the resolution of the non-region of interest has been reduced, is then input into the convolutional neural network; that is, the non-region of interest at reduced resolution and the region of interest at its initial resolution are each input into the convolutional neural network to obtain the recognition result and recognize target objects such as vehicles, pedestrians, and traffic signs.
In this embodiment, an image to be recognized is first acquired, a non-region of interest in the image is then determined, the initial resolution of the non-region of interest is acquired, resolution reduction processing is performed on that initial resolution according to a preset parameter, and image recognition calculation is then performed on the image to be recognized after the resolution of the non-region of interest has been reduced, so as to recognize a target object. When an image is recognized through deep learning, a lower-resolution image contains fewer pixels and therefore requires less computation. Consequently, when the image to be recognized is processed after the resolution of its non-region of interest has been reduced, the computation spent on the non-region of interest decreases, which reduces the total computation required to recognize the image and improves recognition efficiency.
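The saving described above can be illustrated by counting pixels before and after the reduction. The sketch below uses pixel count only as a rough proxy for the amount of computation; the region sizes (a central region of interest of 800×1600 and two side strips of 400×1600 within a 1600×1600 image) and the reduction factor of 4 per dimension are hypothetical values chosen for illustration.

```python
from typing import List, Tuple

Size = Tuple[int, int]  # (width, height) in pixels


def pixels(size: Size) -> int:
    """Number of pixels in a region of the given size."""
    return size[0] * size[1]


def pixels_after_reduction(roi: Size, non_roi_strips: List[Size], factor: int) -> int:
    """Total pixel count when only the non-ROI strips are shrunk.

    Each dimension of every non-ROI strip is divided by factor; the
    region of interest keeps its initial resolution.
    """
    reduced = sum(pixels((w // factor, h // factor)) for w, h in non_roi_strips)
    return pixels(roi) + reduced


roi = (800, 1600)                      # hypothetical central region of interest
strips = [(400, 1600), (400, 1600)]    # hypothetical left and right strips
before = pixels(roi) + sum(pixels(s) for s in strips)
after = pixels_after_reduction(roi, strips, factor=4)
print(before, after, after / before)   # 2560000 1360000 0.53125
```

Under these assumed sizes the network processes roughly 53% of the original pixels, while the region of interest is left untouched.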
Further, referring to fig. 2, a second embodiment of the image recognition method of the present invention provides an image recognition method in which, based on the embodiment shown in fig. 2 above, step S10 may include:
Step S11, acquiring an image to be recognized, and dividing the image to be recognized into a region of interest and a non-region of interest according to a preset rule.
After the image to be recognized captured by the camera is acquired, it is divided into a region of interest and a non-region of interest according to a preset rule. In this embodiment, the division rule is set in advance: the region of the image to be recognized that lies within the driving lane of the unmanned vehicle (i.e., the region that is decisive for the driving decisions of the unmanned vehicle) is taken as the region of interest, and the regions on both sides of the driving lane (i.e., the edge regions that have little influence on the driving decisions of the unmanned vehicle) are taken as the non-region of interest. The division can be implemented with OpenCV, a cross-platform computer vision library released under the BSD license (open source) that supports image processing operations including contour extraction, morphological processing, image segmentation, and feature description. For the specific procedure of dividing the image to be recognized into the region of interest and the non-region of interest with OpenCV, refer to the prior art; it is not described here.
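A division rule of this kind can be sketched in a few lines of Python. The sketch assumes, purely for illustration, that the driving lane occupies a fixed central band of image columns; the function name `split_regions` and the `roi_fraction` parameter are hypothetical, and a real rule would depend on the camera mounting and lane geometry.

```python
from typing import List, Tuple

Image = List[List[int]]  # rows of grayscale pixel values


def split_regions(image: Image, roi_fraction: float = 0.5) -> Tuple[Image, Image, Image]:
    """Split an image by columns into (left strip, central ROI, right strip).

    roi_fraction is a hypothetical preset rule: the share of the image
    width, centered horizontally, that is treated as the region of interest.
    """
    if not 0.0 < roi_fraction <= 1.0:
        raise ValueError("roi_fraction must be in (0, 1]")
    width = len(image[0])
    roi_width = max(1, round(width * roi_fraction))
    left_edge = (width - roi_width) // 2
    right_edge = left_edge + roi_width
    left = [row[:left_edge] for row in image]
    roi = [row[left_edge:right_edge] for row in image]
    right = [row[right_edge:] for row in image]
    return left, roi, right


image = [[x + 10 * y for x in range(8)] for y in range(4)]
left, roi, right = split_regions(image, roi_fraction=0.5)
print(len(left[0]), len(roi[0]), len(right[0]))  # 2 4 2
```

The two side strips together form the non-region of interest; each can then be reduced in resolution independently of the central band.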
Further, a third embodiment of the image recognition method of the present invention provides an image recognition method, and based on the first embodiment shown in fig. 2, the step S30 may include:
Step S31, performing, by using a deep-learning-based convolutional neural network, image recognition calculation on the image to be recognized after the resolution of the non-region of interest has been reduced, to recognize the target object, wherein the deep-learning-based convolutional neural network is a pre-trained model for recognizing the target object.
In this embodiment, image recognition calculation is performed on the image to be recognized, after the resolution of the non-region of interest has been reduced, through a deep learning convolutional neural network so as to recognize the target object. Specifically, referring to fig. 3, step S31 may include:
Step S310, performing feature extraction separately on the region of interest and the non-region of interest in the image to be recognized after the resolution of the non-region of interest has been reduced, to obtain a first feature vector corresponding to the region of interest and a second feature vector corresponding to the non-region of interest;
Step S311, inputting the first feature vector corresponding to the region of interest and the second feature vector corresponding to the non-region of interest separately into the deep-learning-based convolutional neural network, and performing target object recognition for the region of interest and target object recognition for the non-region of interest, to recognize the target object.
In a specific implementation, when image recognition calculation is performed through the deep learning convolutional neural network on the image to be recognized after the resolution of the non-region of interest has been reduced, feature extraction is performed separately on the region of interest and the non-region of interest, yielding a feature vector corresponding to the region of interest (defined as the first feature vector) and a feature vector corresponding to the non-region of interest (defined as the second feature vector). The first feature vector and the second feature vector are then input separately into the pre-trained convolutional neural network, which performs target object recognition for the region of interest and target object recognition for the non-region of interest, so that target objects such as vehicles, pedestrians, and traffic signs can be recognized.
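The control flow of steps S310 and S311 can be sketched as follows. The sketch shows only the per-region structure: the feature extractor `mean_feature` and the recognizer `threshold_classifier` are hypothetical toy stand-ins, not a convolutional neural network, and the function name `recognize_regions` is likewise an assumption for illustration.

```python
from typing import Callable, Dict, List

Image = List[List[int]]
FeatureVector = List[float]


def recognize_regions(regions: Dict[str, Image],
                      extract: Callable[[Image], FeatureVector],
                      classify: Callable[[FeatureVector], List[str]]) -> List[str]:
    """Extract features per region, classify each, and merge the labels."""
    found: List[str] = []
    for region in regions.values():
        features = extract(region)          # first / second feature vector
        for label in classify(features):    # recognition for this region
            if label not in found:
                found.append(label)
    return found


def mean_feature(region: Image) -> FeatureVector:
    """Toy feature extractor: the mean pixel value of the region."""
    values = [pixel for row in region for pixel in row]
    return [sum(values) / len(values)]


def threshold_classifier(features: FeatureVector) -> List[str]:
    """Toy recognizer standing in for the pre-trained network."""
    return ["vehicle"] if features[0] > 100 else []


regions = {
    "region_of_interest": [[200, 220], [210, 230]],  # initial resolution
    "non_region_of_interest": [[10, 20]],            # reduced resolution
}
print(recognize_regions(regions, mean_feature, threshold_classifier))  # ['vehicle']
```

Passing the extractor and recognizer as callables keeps the per-region flow separate from the model, so a trained network could be substituted without changing the loop.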
An application scenario of this embodiment may be as follows. While the unmanned vehicle is driving, an image of the environment in front of the vehicle, captured by its camera, is acquired in real time. The non-region of interest in the image, which has little influence on the driving decisions of the unmanned vehicle, is determined, and its initial resolution is acquired. Resolution reduction processing is performed on the initial resolution of the non-region of interest according to a preset parameter; that is, the number of pixels in the non-region of interest is reduced. Image recognition is then performed on the image to be recognized after the resolution of the non-region of interest has been reduced. Because the computation spent on the reduced-resolution non-region of interest is lower, the computation required to recognize target objects such as vehicles, pedestrians, and traffic signs in the image is reduced, and recognition efficiency is improved. Once such target objects have been recognized from the image, correct environmental information can be provided to the driving decision module, which effectively reduces accidents such as scrapes and collisions and helps ensure that unmanned driving is completed normally.
In addition, the embodiment of the invention also provides an image recognition device.
The image recognition apparatus of the present invention includes: a memory, a processor, and an image recognition program stored on the memory and executable on the processor, the image recognition program when executed by the processor implementing the steps of:
acquiring an image to be recognized, and determining a non-region of interest in the image to be recognized;
acquiring the initial resolution of the non-region of interest, and performing resolution reduction processing on the initial resolution of the non-region of interest according to a preset parameter;
and performing image recognition calculation on the image to be recognized after the resolution of the non-region of interest has been reduced, to recognize a target object.
Further, the image recognition program when executed by the processor further implements the steps of:
acquiring the image to be recognized, and dividing the image to be recognized into a region of interest and a non-region of interest according to a preset rule.
Further, the image recognition program when executed by the processor further implements the steps of:
performing, with a preset tool, resolution reduction processing on the initial resolution of the non-region of interest according to the preset parameter.
Further, the image recognition program when executed by the processor further implements the steps of:
performing, by using a deep-learning-based convolutional neural network, image recognition calculation on the image to be recognized after the resolution of the non-region of interest has been reduced, to recognize the target object, wherein the deep-learning-based convolutional neural network is a pre-trained model for recognizing the target object.
The specific embodiment of the image recognition program stored in the image recognition apparatus of the present invention and executed by the processor is basically the same as the embodiments of the image recognition method described above, and will not be described herein again.
The invention also provides a computer readable storage medium.
The computer-readable storage medium of the present invention has stored thereon an image recognition program which, when executed by the processor, realizes the steps of:
acquiring an image to be recognized, and determining a non-region of interest in the image to be recognized;
acquiring the initial resolution of the non-region of interest, and performing resolution reduction processing on the initial resolution of the non-region of interest according to a preset parameter;
and performing image recognition calculation on the image to be recognized after the resolution of the non-region of interest has been reduced, to recognize a target object.
The specific embodiment of the image recognition program stored in the computer-readable storage medium of the present invention executed by the processor is basically the same as the embodiments of the image recognition method described above, and is not described herein again.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) as described above and includes instructions for enabling a terminal device (e.g., a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present invention.
The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.