CN113111735A - Rapid scene recognition method and device under complex environment - Google Patents
- Publication number
- CN113111735A (application CN202110317587.9A)
- Authority
- CN
- China
- Prior art keywords
- scene recognition
- recognition model
- scene
- model
- converged
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06V20/00 — Image or video recognition or understanding: scenes; scene-specific elements
- G06F18/214 — Pattern recognition: design or setup of recognition systems; generating training patterns; bootstrap methods, e.g. bagging or boosting
- G06F18/253 — Pattern recognition: fusion techniques of extracted features
- G06N3/04 — Computing arrangements based on biological models: neural networks; architecture, e.g. interconnection topology
- G06N3/08 — Computing arrangements based on biological models: neural networks; learning methods
Abstract
The invention discloses a rapid scene recognition method and device for complex environments. The method is applied to a rapid scene recognition system for complex environments that includes an image acquisition device, and comprises the following steps: obtaining first image data through the image acquisition device; constructing a first scene recognition model through a GoogleNet network; training the network parameters of the first scene recognition model on a self-built five-class terrain data set to obtain the converged first scene recognition model; porting and deploying the converged first scene recognition model to the embedded end through TensorRT; and inputting the first image data into the deployed converged first scene recognition model and performing forward inference to obtain a recognition result. This solves the technical problems of the prior art that sensor-based recognition is prone to device error, that the recognition rate for complex scenes is low, that recognition is slow, and that the approach is unsuitable for real-time applications.
Description
Technical Field
The invention relates to the field of artificial intelligence, and in particular to a method and device for rapid scene recognition in complex environments.
Background
Scene recognition is an image processing task: it judges, within an image, the type of place the scene depicts, together with accurate geographic position coordinates, so the recognition result can be used for subsequent positioning. However, in situations that place high demands on maneuvering capability, such as high-altitude jumps, the terrain changes quickly during the jump and must be assessed in real time; the traditional approach of stacking multiple sensors and fusing their observations to complete scene recognition has difficulty evaluating complex terrain and feeding the result back in real time.
However, in implementing the technical solution of the invention in the embodiments of the present application, the inventors found that the above technology has at least the following technical problems:

sensor-based recognition in the prior art is prone to device error, its recognition rate for complex scenes is low, its recognition speed is slow, and it is unsuitable for real-time applications.
Disclosure of Invention
The embodiments of the present application provide a method and device for rapid scene recognition in complex environments. They solve the prior-art technical problems that sensor-based recognition is prone to device error, that the recognition rate for complex scenes is low, that recognition is slow, and that the approach is unsuitable for real-time applications. By using an efficient deep neural network to deploy a high-accuracy scene recognition model on the embedded end, they achieve the technical effects of real-time, accurate, and rapid scene recognition.
In view of the foregoing problems, the embodiments of the present application provide a method and an apparatus for fast scene recognition in a complex environment.
In a first aspect, an embodiment of the present application provides a method for rapid scene recognition in a complex environment, where the method is applied to a rapid scene recognition system for complex environments that includes an image acquisition device. The method comprises: obtaining first image data through the image acquisition device; constructing a first scene recognition model through a GoogleNet network; training the network parameters of the first scene recognition model on a self-built five-class terrain data set to obtain the converged first scene recognition model; porting and deploying the converged first scene recognition model to the embedded end through TensorRT; and inputting the first image data into the deployed converged first scene recognition model and performing forward inference to obtain a recognition result.
In another aspect, the present application further provides a device for rapid scene recognition in a complex environment, the device comprising: a first obtaining unit for obtaining first image data through the image acquisition device; a first construction unit for constructing a first scene recognition model through a GoogleNet network; a second obtaining unit for training the network parameters of the first scene recognition model on a self-built five-class terrain data set to obtain the converged first scene recognition model; a first operation unit for porting and deploying the converged first scene recognition model to the embedded end through TensorRT; and a third obtaining unit for inputting the first image data into the deployed converged first scene recognition model and performing forward inference to obtain a recognition result.
In a third aspect, the present invention provides a fast scene recognition apparatus in a complex environment, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of the method of the first aspect when executing the program.
One or more technical solutions provided in the embodiments of the present application have at least the following technical effects or advantages:
The camera acquires image data through photoelectric imaging, and the lidar acquires distance information by receiving reflected laser light and encodes it into point cloud data packets, which are transmitted to the core processor through the USB interface, the Ethernet interface, and the data bus for processing. A sparse network structure with high computational performance is built on the GoogleNet network architecture, the first scene recognition model is trained on a self-built five-class terrain data set until it converges, the converged model is then ported and deployed to the embedded end, and forward inference on the deployed model yields the recognition result. The efficient deep neural network thus enables embedded deployment of a high-accuracy scene recognition model and satisfies the real-time, accuracy, and speed requirements of scene recognition.
The foregoing description is only an overview of the technical solutions of the present application, and the present application can be implemented according to the content of the description in order to make the technical means of the present application more clearly understood, and the following detailed description of the present application is given in order to make the above and other objects, features, and advantages of the present application more clearly understandable.
Drawings
Fig. 1 is a schematic flowchart of a method for fast scene recognition in a complex environment according to an embodiment of the present application;
fig. 2 is a schematic structural diagram of a fast scene recognition apparatus in a complex environment according to an embodiment of the present application;
fig. 3 is a schematic structural diagram of an exemplary electronic device according to an embodiment of the present application.
Description of reference numerals: a first obtaining unit 11, a first constructing unit 12, a second obtaining unit 13, a first operating unit 14, a third obtaining unit 15, a bus 300, a receiver 301, a processor 302, a transmitter 303, a memory 304, a bus interface 305.
Detailed Description
The embodiments of the present application provide a method and device for rapid scene recognition in complex environments. They solve the prior-art technical problems that sensor-based recognition is prone to device error, that the recognition rate for complex scenes is low, that recognition is slow, and that the approach is unsuitable for real-time applications, and they achieve the technical effects of deploying a high-accuracy scene recognition model on the embedded end with an efficient deep neural network, satisfying the real-time, accuracy, and speed requirements of scene recognition. Hereinafter, example embodiments according to the present application will be described in detail with reference to the accompanying drawings. It should be apparent that the described embodiments are merely some, not all, embodiments of the present application, and it should be understood that the present application is not limited to the example embodiments described herein.
Summary of the application
Scene recognition is an image processing task: it judges, within an image, the type of place the scene depicts, together with accurate geographic position coordinates, so the recognition result can be used for subsequent positioning. However, in situations that place high demands on maneuvering capability, such as high-altitude jumps, the terrain changes quickly during the jump; the traditional approach of stacking multiple sensors and fusing their observations to complete scene recognition has difficulty evaluating complex terrain and feeding the result back in real time. Moreover, in the prior art, sensor-based recognition is prone to device error, its recognition rate for complex scenes is low, its recognition speed is slow, and it is unsuitable for real-time applications.
In view of the above technical problems, the technical solution provided by the present application has the following general idea:
the embodiment of the application provides a rapid scene identification method under a complex environment, wherein the method is applied to a rapid scene identification system under the complex environment, the system comprises an image acquisition device, and the method comprises the following steps of; obtaining first image data by the image acquisition device; constructing a first scene recognition model through a GoogleNet network; network parameter training is carried out on the first scene recognition model on a self-built five-type terrain data set, and the first scene recognition model after convergence is obtained; carrying out transplantation deployment of the converged first scene recognition model at an embedded end through TensorRT; inputting the first image data into the converged first scene recognition model after deployment, and performing forward reasoning to obtain a recognition result. The embedded end deployment of the scene recognition model with high accuracy is realized by utilizing the efficient deep neural network, and the technical effects of real-time performance, accuracy and rapidity of scene recognition are met.
Having thus described the general principles of the present application, various non-limiting embodiments thereof will now be described in detail with reference to the accompanying drawings.
Example one
As shown in fig. 1, an embodiment of the present application provides a method for rapid scene recognition in a complex environment, where the method is applied to a rapid scene recognition system for complex environments that includes an image acquisition device, and the method includes:
step S100: obtaining first image data by the image acquisition device;
specifically, the image acquisition device comprises a camera module and a laser radar, wherein after the camera module forms an image through a camera, an optical signal captured from the outside is converted into an electric signal to be processed; the laser radar collects outside distance information by receiving reflected laser, encodes the collected information into a point cloud data packet and processes the point cloud data packet. The first image data is the integration of the camera module and the data collected by the laser radar. The camera module and the laser radar need to initialize and set related parameters through a control bus and a USB interface Ethernet interface according to a processor after acquiring related information. Furthermore, the image data can be subjected to advanced relevant preprocessing and then to next data processing, the corona data packet can be analyzed and then to relevant calculation, and in addition, all data information obtained by the image acquisition device can be integrated and then transmitted.
Step S200: constructing a first scene recognition model through a GoogleNet network;
specifically, the first scene recognition model is constructed and realized based on the GoogleNet network, wherein the construction based on the GoogleNet network is based on obtaining a high-quality scene model so as to increase the depth or the width of the model, and in detail, the GoogleNet network constructs a 'basic neuron' inclusion structure to construct a sparse and high-computing-performance network structure. Generally speaking, since general convolution is converted into sparse connection, the calculation efficiency is not high, and therefore, the main idea of the inclusion structure is to find out an approximate optimal local sparse structure, so that optimization operation should be completed by classification, and therefore the technical effects of accurately constructing the first scene identification model and improving the accuracy of processing data by the first scene identification model are achieved.
Step S300: network parameter training is carried out on the first scene recognition model on a self-built five-type terrain data set, and the first scene recognition model after convergence is obtained;
specifically, the detail level generated by the terrain data set for improving efficiency is generated by a point reduction or point refinement process based on a terrain pyramid due to the complexity of a flight scene, so that the number of measurement values required for representing the surface of a given area is reduced for research, not only the basic features of the terrain are maintained, but also the diversity accuracy of training data is ensured, and since the purpose of the network is to classify, the loss function optimized in the network parameter training process is a cross entropy function of a predicted value and a true value:
wherein p isiRepresenting the probability of the sample belonging to the ith class; y isiOnehot representation representing a sample label, y when a sample belongs to the category ii1, otherwise yi0; and c represents a sample label. And further. By training the network parameters of the first scene recognition model, the corresponding optimal scene recognition accuracy can be obtained, and the technical effect of ensuring high-precision recognition of the first scene recognition model is further achieved.
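As a concrete reading of the loss above, here is a minimal C++ sketch, assuming the softmax output p and the one-hot label y are already available as equal-length vectors:

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Cross entropy between a softmax output p and a one-hot label y;
// the sum runs over the classes, and only the true class contributes.
double crossEntropy(const std::vector<double>& p, const std::vector<double>& y) {
    double loss = 0.0;
    for (std::size_t i = 0; i < p.size(); ++i) {
        loss -= y[i] * std::log(p[i] + 1e-12);  // epsilon guards log(0)
    }
    return loss;  // equals -log(p_c) when y is one-hot with true class c
}
```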
Step S400: carrying out transplantation deployment of the converged first scene recognition model at an embedded end through TensorRT;
specifically, the TensorRT is a high-performance deep learning inference optimizer, and can provide low-delay and high-throughput deployment inference for deep learning applications. After the server end completes model training, the model obtained through training needs to be transplanted to the embedded end. The TensorRT is adopted to carry out transplantation and deployment of the network at the embedded end, and the reasoning speed of the scene recognition model can be accelerated. The designed embedded device needs to have the functions of acquiring image data, acquiring laser radar data, transmitting, carrying and the like. In detail, after the scene model is trained, the information of the first scene model is saved, and the weight of the model is saved in a binary form. And the wts file with the model structure information and the binary weight parameters is moved to the embedded end, and the next processing is completed based on the embedded end. The method mainly comprises the steps of reconstructing a model structure, combining some operations together, wherein the operation combination comprises vertical combination and horizontal combination, the vertical combination is to combine Conv, BN and Relu three layers of the current mainstream neural network structure into one layer, and the horizontal combination is to combine the layers which are input into the same tensor and execute the same operation together. Therefore, the technical effect of accelerating the reasoning speed of the scene recognition model is achieved.
Step S500: inputting the first image data into the deployed converged first scene recognition model, and performing forward inference to obtain a recognition result.
Specifically, model inference is then completed on the embedded end. Before inference, the first scene recognition model is converted into a serialized format and saved; when the model needs to run, it can be loaded directly through deserialization, which greatly shortens program initialization time. Because the first scene recognition model is ported and deployed in this way, recognition latency stays low during a flight scenario, making the method suitable for highly real-time applications.
Further, where the first scene recognition model is constructed through a GoogleNet network, step S200 in the embodiment of the present application further includes:
step S210: obtaining an inclusion structure according to the GoogleNet network;
step S220: extracting features by utilizing convolution kernels with different sizes through a plurality of branches according to the inclusion structure to obtain feature maps with different sizes;
step S230: and performing feature fusion according to the feature graphs of different sizes to obtain the first scene recognition model.
Specifically, since the main idea of the Inception structure is to find an approximately optimal local sparse structure, features are extracted by multiple branches using convolution kernels of different sizes, yielding feature maps with receptive fields of different sizes, and the feature maps are finally spliced together to fuse features of different scales. The Inception structure thus increases the width of the network on one hand and its adaptability to scale on the other: extracting features through convolution kernels of different sizes implies receptive fields of different sizes, and the final splicing implies the fusion of features at different scales. By setting the convolution kernels appropriately, features of matching dimensions are obtained and can be spliced directly together, producing the approximately optimal local sparse structure from which the first scene recognition model is obtained. To avoid the feature maps having too many channels, 1x1 convolution kernels are applied for dimensionality reduction. The convolutional layers can extract information while reducing overfitting, and the added nonlinearity increases the network's expressive power, achieving the technical effect of higher operating speed while processing more and richer spatial features; a sketch of such a block follows.
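Here is a minimal sketch of one such multi-branch block, written against the TensorRT C++ network-definition API that the deployment steps below rely on; the branch widths, weight-map keys, and name prefix are illustrative assumptions, not values from the patent:

```cpp
#include <NvInfer.h>
#include <map>
#include <string>

using namespace nvinfer1;

// Inception-style block: parallel 1x1 / 3x3 / pool+1x1 branches whose outputs
// keep the same spatial size and are concatenated along the channel axis.
ITensor* inceptionBlock(INetworkDefinition* net,
                        std::map<std::string, Weights>& w,
                        ITensor& input, const std::string& p) {
    // Branch 1: 1x1 convolution (small receptive field).
    auto* c1 = net->addConvolutionNd(input, 64, DimsHW{1, 1},
                                     w[p + ".b1.weight"], w[p + ".b1.bias"]);
    auto* r1 = net->addActivation(*c1->getOutput(0), ActivationType::kRELU);

    // Branch 2: 3x3 convolution, padded so its output matches branch 1.
    auto* c2 = net->addConvolutionNd(input, 128, DimsHW{3, 3},
                                     w[p + ".b2.weight"], w[p + ".b2.bias"]);
    c2->setPaddingNd(DimsHW{1, 1});
    auto* r2 = net->addActivation(*c2->getOutput(0), ActivationType::kRELU);

    // Branch 3: 3x3 max pooling followed by a 1x1 convolution.
    auto* pool = net->addPoolingNd(input, PoolingType::kMAX, DimsHW{3, 3});
    pool->setStrideNd(DimsHW{1, 1});
    pool->setPaddingNd(DimsHW{1, 1});
    auto* c3 = net->addConvolutionNd(*pool->getOutput(0), 32, DimsHW{1, 1},
                                     w[p + ".b3.weight"], w[p + ".b3.bias"]);
    auto* r3 = net->addActivation(*c3->getOutput(0), ActivationType::kRELU);

    // Splice the branches: same spatial size, fused receptive-field scales.
    ITensor* branches[] = {r1->getOutput(0), r2->getOutput(0), r3->getOutput(0)};
    return net->addConcatenation(branches, 3)->getOutput(0);
}
```

The concatenation is what requires the padding above: all branches must agree in height and width so that only the channel count differs.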
Further, for the porting and deployment of the converged first scene recognition model on the embedded end through TensorRT, step S400 in this embodiment of the present application further includes:
step S410: generating wts files of the converged first scene recognition model at a server side;
step S420: moving the wts file to an embedded end;
step S430: defining, by the embedded end, the first scene recognition model definition layer and/or structure information;
step S440: loading and analyzing the wts file to obtain model structure information and weight parameters;
step S450: and deploying and creating the first scene recognition model according to the model structure information and the weight parameters.
Specifically, after the scene model is trained on the server side, the model information is saved and the model weights are stored in binary form. The wts file holding the model structure information and binary weight parameters is then moved to the embedded end, where the model can be rewritten: since TensorRT is a C++ library providing a C++ API, the classic layers such as convolution, deconvolution, full connection, and softmax all have corresponding implementations in TensorRT. Specifically, the relevant layers or structures are defined using the TensorRT API, and the wts file holding the model structure information and weight parameters is loaded and parsed to obtain the model's structure information and weight parameters, which are stored in a map container. The predefined layers or structures are then called according to the model's structure information, the model weight parameters are loaded, and the model is created, realizing the embedded deployment of the rapid scene recognition device and its high-accuracy scene recognition model; a sketch of the weight-file parsing follows.
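As one concrete possibility, the parser could target the wts layout popularized by common PyTorch-to-TensorRT migration examples (first line: entry count; each entry: name, value count, hex-encoded 32-bit floats). The patent does not fix the exact file format, so this layout is an assumption:

```cpp
#include <NvInfer.h>
#include <cassert>
#include <cstdint>
#include <fstream>
#include <map>
#include <string>

using namespace nvinfer1;

// Parse the .wts file into a name -> Weights map; the raw buffers must stay
// alive for as long as the engine is being built, so they are not freed here.
std::map<std::string, Weights> loadWeights(const std::string& file) {
    std::map<std::string, Weights> weightMap;
    std::ifstream input(file);
    assert(input.is_open() && "unable to open .wts file");

    std::int32_t count = 0;
    input >> count;                        // number of weight tensors
    while (count-- > 0) {
        Weights wt{DataType::kFLOAT, nullptr, 0};
        std::string name;                  // e.g. "inception3a.b1.weight"
        std::uint32_t size = 0;
        input >> name >> std::dec >> size;

        auto* values = new std::uint32_t[size];  // hex-encoded IEEE-754 floats
        for (std::uint32_t i = 0; i < size; ++i) input >> std::hex >> values[i];

        wt.values = values;
        wt.count = size;
        weightMap[name] = wt;
    }
    return weightMap;
}
```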
Further, before inputting the first image data into the deployed converged first scene recognition model and performing forward inference to obtain a recognition result, step S500 in this embodiment of the present application further includes:
step S510: converting the first scene recognition model into an Engine file;
step S520: and storing the Engine file after serialization.
In particular, the process of building an Engine with TensorRT is often time-consuming, especially on embedded devices. Therefore, the first scene recognition model is converted into an Engine file from which the model can be loaded, and the generated Engine file is serialized and stored. When the model needs to run, it can be loaded directly through deserialization, which greatly shortens program initialization time, modularizes the program, and speeds up model loading, so the model responds faster when processing data; this achieves the technical effect of meeting the accuracy and speed requirements of scene recognition. A sketch of the build-and-serialize step follows.
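Below is a sketch of that step, again assuming the TensorRT 7-era C++ API; the commented gap is where the input, the Inception blocks from the sketch above, and the softmax output would be defined from the parsed weight map:

```cpp
#include <NvInfer.h>
#include <fstream>
#include <iostream>
#include <string>

using namespace nvinfer1;

// Minimal logger required by the TensorRT builder and runtime.
class Logger : public ILogger {
    void log(Severity s, const char* msg) noexcept override {
        if (s <= Severity::kWARNING) std::cout << msg << std::endl;
    }
} gLogger;

void buildAndSaveEngine(const std::string& enginePath) {
    IBuilder* builder = createInferBuilder(gLogger);
    INetworkDefinition* network = builder->createNetworkV2(0U);

    // ... addInput("data", ...), the Inception blocks, the softmax output,
    //     and markOutput(...) go here, created from the loaded weight map ...

    IBuilderConfig* config = builder->createBuilderConfig();
    config->setMaxWorkspaceSize(1 << 28);  // 256 MiB of build scratch space
    builder->setMaxBatchSize(1);           // one frame per inference
    ICudaEngine* engine = builder->buildEngineWithConfig(*network, *config);

    // Serialize once and persist, so later runs skip the costly build.
    IHostMemory* blob = engine->serialize();
    std::ofstream out(enginePath, std::ios::binary);
    out.write(static_cast<const char*>(blob->data()),
              static_cast<std::streamsize>(blob->size()));

    blob->destroy();
    engine->destroy();
    config->destroy();
    network->destroy();
    builder->destroy();
}
```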
Further, in an embodiment of the present invention, when inputting the first image data into the deployed converged first scene recognition model and performing forward inference to obtain a recognition result, step S500 further includes:
step S530: judging whether a device needs to operate the first scene recognition model or not;
step S540: and if the device needs to operate the first scene recognition model, loading the first scene recognition model through deserialization, and performing forward reasoning to obtain a recognition result.
In particular, since the modules are independent of one another, the program loads faster, and a module is loaded only when its corresponding function is requested, so updates can be applied to individual modules without affecting other parts of the program. Further, the core of deserialization is the saving and reconstruction of object state, and forward inference is a search that starts from the initial data and works toward the goal, aiming to find a path through the problem space. The forward inference process can be described as the inference engine exploring the knowledge base with the information provided, matching the constraint priorities against the given current state, and then producing an accurate recognition result. This achieves the technical effects of satisfying highly real-time scenarios, very high scene recognition accuracy, and fast, accurate recognition of complex descent scenes; the on-demand load-and-infer path is sketched below.
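Under the same TensorRT 7-era API assumptions as above, the on-demand path might look like the following; the buffer sizes and binding order must match the deployed model:

```cpp
#include <NvInfer.h>
#include <cuda_runtime_api.h>
#include <cstddef>
#include <cstdio>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

using namespace nvinfer1;

namespace {
struct Logger : ILogger {
    void log(Severity s, const char* msg) noexcept override {
        if (s <= Severity::kWARNING) std::fprintf(stderr, "%s\n", msg);
    }
} gLogger;
}

// Deserialize the stored engine only when recognition is requested, run one
// synchronous forward pass, and return the index of the most probable class.
int recognize(const std::string& enginePath,
              const float* hostInput, std::size_t inputBytes,
              float* hostOutput, std::size_t outputBytes) {
    std::ifstream in(enginePath, std::ios::binary);
    std::vector<char> blob((std::istreambuf_iterator<char>(in)),
                           std::istreambuf_iterator<char>());

    IRuntime* runtime = createInferRuntime(gLogger);
    ICudaEngine* engine = runtime->deserializeCudaEngine(blob.data(), blob.size());
    IExecutionContext* ctx = engine->createExecutionContext();

    // Binding 0: input image tensor; binding 1: class-probability output.
    void* buffers[2];
    cudaMalloc(&buffers[0], inputBytes);
    cudaMalloc(&buffers[1], outputBytes);
    cudaMemcpy(buffers[0], hostInput, inputBytes, cudaMemcpyHostToDevice);
    ctx->executeV2(buffers);  // forward inference
    cudaMemcpy(hostOutput, buffers[1], outputBytes, cudaMemcpyDeviceToHost);
    cudaFree(buffers[0]);
    cudaFree(buffers[1]);
    ctx->destroy();
    engine->destroy();
    runtime->destroy();

    // Recognition result: the terrain class with the highest probability.
    std::size_t n = outputBytes / sizeof(float);
    int best = 0;
    for (std::size_t i = 1; i < n; ++i)
        if (hostOutput[i] > hostOutput[best]) best = static_cast<int>(i);
    return best;
}
```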
To sum up, the method and the device for fast scene recognition in a complex environment provided by the embodiment of the present application have the following technical effects:
1. The camera acquires image data through photoelectric imaging, and the lidar acquires distance information by receiving reflected laser light and encodes it into point cloud data packets, which are transmitted to the core processor through the USB interface, the Ethernet interface, and the data bus for processing. The first scene recognition model is built on the GoogleNet network architecture and trained on a self-built five-class terrain data set until it converges; the converged model is then ported and deployed to the embedded end, and forward inference on the deployed model yields the recognition result. The efficient deep neural network thus enables embedded deployment of a high-accuracy scene recognition model and satisfies the real-time, accuracy, and speed requirements of scene recognition.
2. Because a sparse network structure with high computational performance is built on the GoogleNet network architecture and features are fused according to the Inception structure, the first scene recognition model has higher precision, achieving the technical effect of improving recognition speed.
3. Because the trained model is ported to the embedded end and the network is ported and deployed there based on TensorRT, the embedded hardware device designed for deep neural network deployment accelerates the inference speed of the scene recognition model with its powerful computing performance, so recognition is fast and the requirements of highly real-time applications can be met.
Example two
Based on the same inventive concept as the method for rapid scene recognition in a complex environment in the foregoing embodiment, the present invention further provides a device for rapid scene recognition in a complex environment, as shown in fig. 2, the device comprising:
a first obtaining unit 11, wherein the first obtaining unit 11 is used for obtaining first image data through the image acquisition device;
a first construction unit 12, wherein the first construction unit 12 is configured to construct a first scene recognition model through a GoogleNet network;
a second obtaining unit 13, where the second obtaining unit 13 is configured to perform network parameter training on the first scene recognition model on a self-established five-class terrain data set, and obtain the converged first scene recognition model;
a first operation unit 14, wherein the first operation unit 14 is configured to port and deploy the converged first scene recognition model to the embedded end through TensorRT;
a third obtaining unit 15, wherein the third obtaining unit 15 is configured to input the first image data into the deployed converged first scene recognition model and perform forward inference to obtain a recognition result.
Further, the apparatus further comprises:
a fourth obtaining unit, configured to obtain an Inception structure according to the GoogleNet network;

a fifth obtaining unit, configured to extract features through multiple branches using convolution kernels of different sizes according to the Inception structure, obtaining feature maps of different sizes;
a sixth obtaining unit, configured to perform feature fusion on the feature maps of different sizes to obtain the first scene recognition model.
Further, the apparatus further comprises:
a first generating unit, configured to generate, at a server side, an wts file of the converged first scene recognition model;
a first moving unit for moving the wts file to an embedded end;
a first defining unit for defining the layers and/or structure information of the first scene recognition model on the embedded end;
a seventh obtaining unit, configured to load and parse the wts file, and obtain model structure information and weight parameters;
a first creating unit, configured to deploy and create the first scene recognition model according to the model structure information and the weight parameter.
Further, the apparatus further comprises:
a first conversion unit for converting the first scene recognition model into an Engine file;
and the first storage unit is used for storing the Engine file after serialization.
Further, the apparatus further comprises:
a first judging unit, configured to judge whether a device needs to run the first scene recognition model;
and the eighth obtaining unit is configured to load the first scene recognition model through deserialization and perform forward inference to obtain a recognition result if the device needs to operate the first scene recognition model.
The variations and specific examples of the method for rapid scene recognition in a complex environment in the first embodiment of fig. 1 also apply to the device for rapid scene recognition in a complex environment of this embodiment. From the foregoing detailed description of the method, those skilled in the art can clearly understand how the device of this embodiment is implemented, so the details are omitted here for brevity of the description.
Exemplary electronic device
The electronic device of the embodiment of the present application is described below with reference to fig. 3.
Fig. 3 illustrates a schematic structural diagram of an electronic device according to an embodiment of the present application.
Based on the inventive concept of the method for fast scene recognition in a complex environment in the foregoing embodiments, the present invention further provides a device for fast scene recognition in a complex environment, in which a computer program is stored, and when the computer program is executed by a processor, the steps of any one of the methods for fast scene recognition in a complex environment are implemented.
In fig. 3, a bus architecture (represented by bus 300) is shown. Bus 300 may include any number of interconnected buses and bridges, linking together various circuits including one or more processors, represented by processor 302, and memory, represented by memory 304. Bus 300 may also link various other circuits, such as peripherals, voltage regulators, and power management circuits, which are well known in the art and therefore not described further herein. A bus interface 305 provides an interface between bus 300 and the receiver 301 and transmitter 303. The receiver 301 and the transmitter 303 may be the same element, i.e., a transceiver, providing a means for communicating with various other apparatus over a transmission medium.
The processor 302 is responsible for managing the bus 300 and general processing, and the memory 304 may be used for storing data used by the processor 302 in performing operations.
The embodiment of the invention provides a rapid scene recognition method for complex environments. The method is applied to a rapid scene recognition system for complex environments that includes an image acquisition device, and comprises the following steps: obtaining first image data through the image acquisition device; constructing a first scene recognition model through a GoogleNet network; training the network parameters of the first scene recognition model on a self-built five-class terrain data set to obtain the converged first scene recognition model; porting and deploying the converged first scene recognition model to the embedded end through TensorRT; and inputting the first image data into the deployed converged first scene recognition model and performing forward inference to obtain a recognition result. This solves the prior-art technical problems that sensor-based recognition is prone to device error, that the recognition rate for complex scenes is low, that recognition is slow, and that the approach is unsuitable for real-time applications; using the efficient deep neural network to deploy the high-accuracy scene recognition model on the embedded end achieves the technical effects of real-time, accurate, and rapid scene recognition.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (devices), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the invention.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.
Claims (8)
1. A rapid scene recognition method in a complex environment, applied to a rapid scene recognition system in a complex environment, the system comprising an image acquisition device, the method comprising the following steps:
obtaining first image data by the image acquisition device;
constructing a first scene recognition model through a GoogleNet network;
training the network parameters of the first scene recognition model on a self-built five-class terrain data set to obtain the converged first scene recognition model;

porting and deploying the converged first scene recognition model to an embedded end through TensorRT;

inputting the first image data into the deployed converged first scene recognition model, and performing forward inference to obtain a recognition result.
2. The method of claim 1, wherein said building a first scene recognition model through a GoogleNet network comprises:
obtaining an Inception structure according to the GoogleNet network;

extracting features through multiple branches using convolution kernels of different sizes according to the Inception structure, to obtain feature maps of different sizes;

and performing feature fusion on the feature maps of different sizes to obtain the first scene recognition model.
3. The method of claim 1, wherein the loss function optimized during training of the first scene recognition model is a cross entropy function of the predicted values and the true values:

L = -Σ_i y_i · log(p_i) = -log(p_c)

wherein p_i represents the probability that the sample belongs to the i-th class; y_i is the one-hot representation of the sample label, with y_i = 1 when the sample belongs to class i and y_i = 0 otherwise; and c represents the sample label.
4. The method of claim 1, wherein the porting and deployment of the converged first scene recognition model to the embedded end through TensorRT comprises:
generating wts files of the converged first scene recognition model at a server side;
moving the wts file to an embedded end;
defining the layers and/or structure information of the first scene recognition model on the embedded end;
loading and analyzing the wts file to obtain model structure information and weight parameters;
and deploying and creating the first scene recognition model according to the model structure information and the weight parameters.
5. The method of claim 1, wherein before inputting the first image data into the deployed converged first scene recognition model and performing forward inference to obtain a recognition result, the method comprises:
converting the first scene recognition model into an Engine file;
and storing the Engine file after serialization.
6. The method of claim 1, wherein inputting the first image data into the deployed converged first scene recognition model and performing forward inference to obtain a recognition result comprises:
judging whether a device needs to operate the first scene recognition model or not;
and if the device needs to operate the first scene recognition model, loading the first scene recognition model through deserialization and performing forward inference to obtain a recognition result.
7. An apparatus for fast scene recognition in a complex environment, wherein the apparatus comprises:
a first obtaining unit for obtaining first image data by the image acquisition device;
the first construction unit is used for constructing a first scene recognition model through a GoogleNet network;
a second obtaining unit, configured to perform network parameter training on the first scene recognition model on a self-established five-class terrain data set, and obtain the converged first scene recognition model;
a first operation unit, configured to perform migration deployment of the converged first scene recognition model at an embedding end through a TensorRT;
a third obtaining unit, configured to input the first image data into the deployed converged first scene recognition model and perform forward inference to obtain a recognition result.
8. A fast scene recognition apparatus in a complex environment, comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the steps of the method according to any one of claims 1 to 6 when executing the program.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110317587.9A CN113111735A (en) | 2021-03-25 | 2021-03-25 | Rapid scene recognition method and device under complex environment |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110317587.9A CN113111735A (en) | 2021-03-25 | 2021-03-25 | Rapid scene recognition method and device under complex environment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113111735A true CN113111735A (en) | 2021-07-13 |
Family
ID=76710421
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110317587.9A Pending CN113111735A (en) | 2021-03-25 | 2021-03-25 | Rapid scene recognition method and device under complex environment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113111735A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113850134A (en) * | 2021-08-24 | 2021-12-28 | 中国船舶重工集团公司第七0九研究所 | Safety helmet wearing detection method and system integrating attention mechanism |
- 2021-03-25: application CN202110317587.9A filed in China; published as CN113111735A (status: pending)
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106778682A (en) * | 2017-01-11 | 2017-05-31 | 厦门中控生物识别信息技术有限公司 | A kind of training method and its equipment of convolutional neural networks model |
CN110852167A (en) * | 2019-10-10 | 2020-02-28 | 中国人民解放军军事科学院国防科技创新研究院 | Remote sensing image classification method based on optimal activation model |
CN111012261A (en) * | 2019-11-18 | 2020-04-17 | 深圳市杉川机器人有限公司 | Sweeping method and system based on scene recognition, sweeping equipment and storage medium |
CN111832546A (en) * | 2020-06-23 | 2020-10-27 | 南京航空航天大学 | A Lightweight Natural Scene Text Recognition Method |
CN112070070A (en) * | 2020-11-10 | 2020-12-11 | 南京信息工程大学 | LW-CNN method and system for urban remote sensing scene recognition |
CN112348003A (en) * | 2021-01-11 | 2021-02-09 | 航天神舟智慧系统技术有限公司 | Airplane refueling scene recognition method and system based on deep convolutional neural network |
Non-Patent Citations (3)

Title |
---|
"Deep Learning Inference Using TensorRT", Applied Optics (应用光学) * |
Wang Wei, "Research on Real-Time Vehicle Detection and Tracking Algorithms from a UAV Perspective", China Master's Theses Full-text Database (Engineering Science & Technology II) * |
Cai Qingqing, "Research on Scene Recognition Based on GoogLeNet", China New Technologies and Products * |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication |
| SE01 | Entry into force of request for substantive examination |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20210713