CN113822212B - Embedded object recognition method and device - Google Patents

Embedded object recognition method and device

Info

Publication number
CN113822212B
CN113822212B
Authority
CN
China
Prior art keywords
layer
combine
neural network
full
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111138968.7A
Other languages
Chinese (zh)
Other versions
CN113822212A (en)
Inventor
Zhang Hongliang (张红良)
Li Guangming (李广明)
Yu Chenhui (余晨晖)
Zhang Hong (张红)
Luo Jiaqi (罗嘉琦)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dongguan University of Technology
Original Assignee
Dongguan University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dongguan University of Technology filed Critical Dongguan University of Technology
Priority claimed from CN202111138968.7A
Publication of CN113822212A
Application granted
Publication of CN113822212B
Legal status: Active

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an embedded object recognition method and device, relating to the technical field of embedded artificial intelligence. The method comprises the following steps: collecting and processing color images of objects to obtain processed image data; training and testing the constructed Combine-MobileNet neural network with the processed image data to obtain a trained Combine-MobileNet neural network; storing and loading the trained Combine-MobileNet neural network onto an embedded platform; and inputting the image data of an object to be identified into the embedded platform and reasoning about its category in real time to obtain an identification result. The Combine-MobileNet neural network constructed by the invention has a simple structure, low computational cost and high accuracy; loaded on an embedded platform, it can accurately identify objects in low-resource, low-cost environments.

Description

Embedded object recognition method and device
Technical Field
The invention relates to the technical field of embedded artificial intelligence, in particular to an embedded object identification method and device.
Background
Embedded artificial intelligence refers to applying artificial intelligence algorithms on terminal devices so that they can perform environment sensing, human-machine interaction and similar functions without a network connection. Embedded systems are an important carrier platform for artificial intelligence technology and appear in many new applications combining artificial intelligence with embedded hardware: automatic sorting robots and automatic delivery vehicles in logistics services; face recognition, fingerprint recognition and intelligent cameras in the security field; automatic parking, automatic vehicle recognition and intelligent parking lots in urban traffic; and case diagnosis and intelligent disinfection robots in medical services. However, deep neural networks place very high demands on the computing capacity and resources of an embedded system, which increases system power consumption, and processors supporting deep neural network acceleration are typically complex SoCs integrating multiple architectures, whose usage cost is very high. Existing embedded object recognition therefore either adopts chips of extremely high complexity and cost, which hinders learning and use, is difficult to realize on low-resource, low-cost chips, is unsuitable for single tasks and easily wastes resources; or, where it can be realized on a low-resource chip, the model is simple and the training and evaluation strategies are single, so the recognition accuracy is low.
Chinese patent application CN113138789A, published on July 20, 2021, provides an embedded object recognition system comprising: a program update module, a camera module, a display screen module, a tri-color lamp module and a main control chip. The main control chip is connected to the program update module, the camera module, the display screen module and the tri-color lamp module respectively; it updates its program according to the input of the program update module, receives the image data acquired by the camera module, performs image compression, input standardization and image recognition, and displays the result on the display screen module. However, that system can only recognize the digits 0-9, which is a major limitation; recognizing other, more complex objects would require a high-storage environment and high computational cost, and accurate recognition cannot otherwise be achieved.
Disclosure of Invention
The invention provides an embedded object recognition method and device to overcome the defect that existing embedded object recognition technology cannot accurately recognize objects in low-resource, low-cost environments; the method and device can accurately recognize objects in such environments.
In order to solve the technical problems, the technical scheme of the invention is as follows:
the invention provides an embedded object identification method, which comprises the following steps:
s1: collecting a color image of an object;
s2: processing the color image to obtain processed image data;
s3: training and testing the constructed Combine-MobileNet neural network by using the processed image data to obtain a trained Combine-MobileNet neural network;
s4: storing and loading the trained Combine-MobileNet neural network on an embedded platform;
s5: inputting the image data of the object to be identified into the embedded platform, and carrying out real-time reasoning on the category of the object to be identified to obtain an identification result.
Preferably, in the step S2, the specific steps of processing the color image are:
s2.1: converting the color image into a gray scale image;
s2.2: randomly dividing the gray level image into a training image and a test image;
s2.3: performing data enhancement operation on the training image to obtain an enhanced training image;
s2.4: downsampling the enhanced training image to obtain a downsampled training image.
Preferably, in the step S2.3, the data enhancement operation performed on the training image includes: rotation, clipping, translation, and gaussian noise.
Rotation, clipping, translation and Gaussian-noise operations are performed on each training image, so that one training image is augmented into several enhanced training images; this increases the scale and complexity of the training data and improves the accuracy of the network.
Preferably, in step S2.4, the downsampling of the enhanced training image comprises sequentially performing one average pooling operation and one max pooling operation on the enhanced training image.
Performing one average pooling operation and one max pooling operation on each enhanced training image in sequence reduces the size of the training data and thus the computational cost.
Preferably, in the step S3, training and testing the constructed Combine-MobileNet neural network by using the processed image data, and the specific method for obtaining the trained Combine-MobileNet neural network is as follows:
s3.1: setting a loss function, an optimal loss function value replacement frequency threshold and a maximum training frequency of a Combine-MobileNet neural network;
s3.2: inputting the downsampled training image into a Combine-MobileNet neural network, and calculating a loss function value loss of the downsampled training image by using cross entropy;
s3.3: setting an early-stop strategy, namely comparing the loss function value of the downsampled training image with the optimal loss function value, replacing the optimal loss function value with the loss function value when the loss function value is larger than the optimal loss function value, and recording the replacement times;
s3.4: comparing the replacement times with the optimal loss function value replacement times threshold, and performing the next training when the replacement times are smaller than the optimal loss function value replacement times threshold; otherwise, completing the training of the Combine-MobileNet neural network;
s3.5: and inputting the test image into the trained Combine-MobileNet neural network for testing, and obtaining the trained Combine-MobileNet neural network.
Preferably, in the step S3, the Combine-MobileNet neural network includes a first standard convolution layer, a second standard convolution layer, a first depth separable convolution layer, a second depth separable convolution layer, a first full-connection layer, a second full-connection layer, a feature fusion layer, an average pooling layer, and a third full-connection layer;
the output end of the first standard convolution layer is connected with the input end of the second standard convolution layer, and the output end of the second standard convolution layer is respectively connected with the input ends of the first depth separable convolution layer and the second depth separable convolution layer;
the output end of the first depth separable convolution layer is connected with the input end of the first full-connection layer, and the output end of the first full-connection layer is connected with the input end of the feature fusion layer; the output end of the second depth separable convolution layer is connected with the input end of the second full-connection layer, and the output end of the second full-connection layer is connected with the input end of the feature fusion layer;
the output end of the characteristic fusion layer is connected with the input end of the average pooling layer, and the output end of the average pooling layer is connected with the input end of the third full-connection layer.
Inputting the downsampled training image into the Combine-MobileNet neural network, a first feature v1 is obtained through the 3x3 first standard convolution layer; the first feature v1 passes through the 1x1 second standard convolution layer to obtain a second feature v2; the second feature v2 passes through the 3x3 first depth separable convolution layer with a step size of 2 to obtain a third feature v3, and at the same time passes through the 3x3 second depth separable convolution layer with a step size of 1 to obtain a fourth feature v4; the third feature v3 is input to the feature fusion layer through the first full-connection layer and the fourth feature v4 is input to the feature fusion layer through the second full-connection layer, where they are merged into a fifth feature v5; finally, the fifth feature v5 passes through the 3x3 average pooling layer and the third full-connection layer to output a sixth feature v6, whose dimension is the number of categories of objects to be recognized.
Preferably, in the step S4, the specific method for storing and loading the trained Combine-MobileNet neural network onto the embedded platform is as follows:
s4.1: saving the trained Combine-MobileNet neural network as an H5 file;
s4.2: analyzing the H5 file to obtain matrixing network parameters of the Combine-MobileNet neural network;
s4.3: creating two c-language files of a model_init.c and a model_init.h, and writing matrixing network parameters into the model_init.h file according to a data stream form;
s4.4: the corresponding old file in the embedded platform engineering file is replaced with "model_init.c" and "model_init.h".
Image processing and network training are carried out on the PC side, but the embedded platform cannot process the network parameters of the Combine-MobileNet neural network directly, so the parameters must be converted into a matrix form that the embedded platform can process. The data in the H5 file are organized as a tree structure and divided into weights and biases; the convolution kernel element a in row h and column w of the n-th dimension of the layer1 network is expressed as a = layer1(n, h, w), and the k-th bias term b of the layer1 network is expressed as b = layer1(bias, k). After "model_init.c" and "model_init.h" replace the corresponding old files in the embedded platform project files, the object category names defined in the project files are changed to the category names of the currently trained objects.
Preferably, in the step S5, real-time reasoning is performed on the object to be identified based on a selection strategy to obtain the identification result; each time the category of an object is inferred, image data of the object at different times are acquired for multiple inferences, and the inference result with the largest number of occurrences or the highest probability is selected as the object identification result.
The selection strategy trades time for accuracy: by comprehensively evaluating the results of several inferences, the influence of human factors and equipment on inference accuracy is reduced.
Preferably, the embedded platform is an STM 32-based embedded platform.
The invention also provides an embedded object recognition device, which comprises:
the data acquisition module is used for acquiring color images of the object;
the data processing module is used for processing the color image to obtain processed image data;
the network training test module is used for training and testing the constructed Combine-MobileNet neural network by using the processed image data to obtain a trained Combine-MobileNet neural network;
the network loading module is used for storing and loading the trained Combine-MobileNet neural network onto the embedded platform;
the reasoning and identifying module is used for inputting the image data of the object to be identified into the embedded platform, and carrying out real-time reasoning on the category of the object to be identified to obtain an identification result.
Compared with the prior art, the technical scheme of the invention has the beneficial effects that:
the invention processes the color image, uses the processed image data as training data, increases the scale and complexity of the data, and is helpful for improving the accuracy of the Combine-MobileNet neural network during training; the constructed Combine-MobileNet neural network has the advantages of simple structure, low calculation cost and high accuracy; the trained Combine-MobileNet neural network is stored and loaded on the embedded platform, so that the object can be accurately identified in a low-resource and low-cost environment.
Drawings
Fig. 1 is a flowchart of an embedded object recognition method according to embodiment 1.
FIG. 2 is a block diagram of a Combine-MobileNet neural network as described in example 1.
Fig. 3 is a structural diagram of an embedded object recognition device according to embodiment 2.
Detailed Description
The drawings are for illustrative purposes only and are not to be construed as limiting the present patent;
for the purpose of better illustrating the embodiments, certain elements of the drawings may be omitted, enlarged or reduced and do not represent the actual product dimensions;
it will be appreciated by those skilled in the art that certain well-known structures in the drawings and descriptions thereof may be omitted.
The technical scheme of the invention is further described below with reference to the accompanying drawings and examples.
Example 1
The invention provides an embedded object identification method, as shown in figure 1, comprising the following steps:
s1: collecting a color image of an object;
the collected color image of the object is a 320×240 color image in RGB565 format; each pixel occupies 16 bits of storage, with red, green and blue occupying bits 0-4, 5-10 and 11-15 respectively;
s2: processing the color image to obtain processed image data;
the specific steps of processing the color image are as follows
S2.1: converting the color image into a gray scale image;
the three-channel color image is converted into a single-channel gray image according to Y = 0.3R + 0.59G + 0.11B, where Y denotes the gray value and R, G and B denote red, green and blue; "start" marks the beginning of a gray picture and "end" marks its end;
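As an illustrative sketch of this conversion step (not part of the patent), the RGB565 decoding and weighted gray mapping could be done on the PC side roughly as follows; the function name, the use of NumPy and the rescaling of each channel to 0-255 are assumptions.

```python
import numpy as np

def rgb565_to_gray(frame_u16: np.ndarray) -> np.ndarray:
    """Convert a 320x240 array of 16-bit pixels to an 8-bit gray image.

    Bit layout follows the text above: red in bits 0-4, green in bits 5-10,
    blue in bits 11-15 of each 16-bit pixel value.
    """
    r = (frame_u16 & 0x001F).astype(np.float32) * (255.0 / 31.0)
    g = ((frame_u16 >> 5) & 0x003F).astype(np.float32) * (255.0 / 63.0)
    b = ((frame_u16 >> 11) & 0x001F).astype(np.float32) * (255.0 / 31.0)
    gray = 0.3 * r + 0.59 * g + 0.11 * b   # Y = 0.3R + 0.59G + 0.11B
    return gray.astype(np.uint8)
```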
s2.2: randomly dividing the gray level image into a training image and a test image;
in this embodiment, the gray images are randomly divided into training images and test images at a ratio of 8:2;
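A minimal sketch of such a random 8:2 split (the helper name and the fixed seed are assumptions for illustration):

```python
import random

def split_dataset(images, labels, train_ratio=0.8, seed=0):
    """Randomly split the gray images and their labels into training and test sets."""
    indices = list(range(len(images)))
    random.Random(seed).shuffle(indices)
    cut = int(train_ratio * len(indices))
    train_idx, test_idx = indices[:cut], indices[cut:]
    return ([images[i] for i in train_idx], [labels[i] for i in train_idx],
            [images[i] for i in test_idx], [labels[i] for i in test_idx])
```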
s2.3: performing data enhancement operation on the training image to obtain an enhanced training image;
the data enhancement operations include rotation, clipping, translation and Gaussian noise; several enhanced training images are generated from each training image, which increases the scale and complexity of the training data and improves the accuracy of the subsequently trained network;
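One possible realisation of these augmentations with torchvision is sketched below; the rotation angle, crop size, translation range and noise level are illustrative assumptions, and each training image can be passed through the pipeline several times to produce several enhanced copies.

```python
import torch
import torchvision.transforms as T

def make_augmentation(out_size: int = 224) -> T.Compose:
    """One augmentation pass over a PIL grayscale image: rotation, translation,
    clipping/cropping, then additive Gaussian noise on the resulting tensor."""
    add_gaussian_noise = T.Lambda(
        lambda x: torch.clamp(x + 0.05 * torch.randn_like(x), 0.0, 1.0))
    return T.Compose([
        T.RandomRotation(degrees=15),                     # rotation
        T.RandomAffine(degrees=0, translate=(0.1, 0.1)),  # translation
        T.RandomCrop(out_size, pad_if_needed=True),       # clipping/cropping
        T.ToTensor(),
        add_gaussian_noise,                               # Gaussian noise
    ])
```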
s2.4: downsampling the enhanced training image to obtain a downsampled training image;
and carrying out one tie pooling operation and one maximum pooling operation on the enhanced training image in sequence, so that the size of training data is reduced, and the calculation cost is reduced.
S3: training and testing the constructed Combine-MobileNet neural network by using the processed image data to obtain a trained Combine-MobileNet neural network;
as shown in fig. 2, the Combine-MobileNet neural network includes a first standard convolution layer, a second standard convolution layer, a first depth separable convolution layer, a second depth separable convolution layer, a first full-connection layer, a second full-connection layer, a feature fusion layer, an average pooling layer, and a third full-connection layer;
the output end of the first standard convolution layer is connected with the input end of the second standard convolution layer, and the output end of the second standard convolution layer is respectively connected with the input ends of the first depth separable convolution layer and the second depth separable convolution layer;
the output end of the first depth separable convolution layer is connected with the input end of the first full-connection layer, and the output end of the first full-connection layer is connected with the input end of the feature fusion layer; the output end of the second depth separable convolution layer is connected with the input end of the second full-connection layer, and the output end of the second full-connection layer is connected with the input end of the feature fusion layer;
the output end of the characteristic fusion layer is connected with the input end of the average pooling layer, and the output end of the average pooling layer is connected with the input end of the third full-connection layer;
the specific method for obtaining the trained Combine-MobileNet neural network comprises the following steps:
s3.1: setting a loss function, an optimal loss function value replacement-count threshold and a maximum number of training iterations for the Combine-MobileNet neural network; in this embodiment, the replacement-count threshold is 10, the maximum number of training iterations is 10000, and the optimal loss function value is set as needed;
s3.2: inputting the downsampled training image into a Combine-MobileNet neural network, and calculating a loss function value loss of the downsampled training image by using cross entropy;
s3.3: setting an early-stop strategy, namely comparing the loss function value of the downsampled training image with the optimal loss function value, replacing the optimal loss function value with the loss function value when the loss function value is larger than the optimal loss function value, and recording the replacement times;
s3.4: comparing the replacement times with the optimal loss function value replacement times threshold, and performing the next training when the replacement times are smaller than the optimal loss function value replacement times threshold; otherwise, completing the training of the Combine-MobileNet neural network; in this embodiment, when the loss function value does not decrease further after 10 times, training is completed;
s3.5: inputting the test image into the trained Combine-MobileNet neural network for testing to obtain a trained Combine-MobileNet neural network;
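As a rough sketch (not part of the patent) of the training procedure in S3.1-S3.5: cross-entropy loss, a record of the best loss value, and a counter that ends training once it reaches the threshold (10 in this embodiment). The counter below tracks epochs in which the loss does not decrease further, matching the note above that training stops after 10 such rounds; the Adam optimiser, the learning rate and the use of a `CombineMobileNet` module like the one sketched after the architecture description below are assumptions.

```python
import torch
import torch.nn as nn

def train_with_early_stopping(model, train_loader, max_epochs=10000, patience=10, lr=1e-3):
    """Train with cross-entropy loss; stop once the loss has not improved for
    `patience` consecutive epochs or `max_epochs` is reached."""
    optimiser = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = nn.CrossEntropyLoss()
    best_loss, stale_rounds = float("inf"), 0
    for epoch in range(max_epochs):
        epoch_loss = 0.0
        for images, labels in train_loader:
            optimiser.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimiser.step()
            epoch_loss += loss.item()
        epoch_loss /= len(train_loader)
        if epoch_loss < best_loss:      # loss improved: replace the recorded best value
            best_loss, stale_rounds = epoch_loss, 0
        else:                           # no further decrease this epoch
            stale_rounds += 1
        if stale_rounds >= patience:    # threshold reached: training is complete
            break
    return model
```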
The Combine-MobileNet model is an improvement on MobileNet-V2 and is built with the PyTorch framework. The model is constructed from depth separable convolutions; the features extracted by the depth separable convolutions with step size 1 and step size 2 in MobileNet-V2 are fused so that both features are fully utilized, which improves recognition accuracy, and the 7x7 average pooling layer of MobileNet-V2 is replaced by 3x3 average pooling to reduce the computational cost.
Inputting the downsampled training image into the Combine-MobileNet neural network, a first feature v1 is obtained through the 3x3 first standard convolution layer; the first feature v1 passes through the 1x1 second standard convolution layer to obtain a second feature v2; the second feature v2 passes through the 3x3 first depth separable convolution layer with a step size of 2 to obtain a third feature v3, and at the same time passes through the 3x3 second depth separable convolution layer with a step size of 1 to obtain a fourth feature v4; the third feature v3 is input to the feature fusion layer through the first full-connection layer and the fourth feature v4 is input to the feature fusion layer through the second full-connection layer, where they are merged into a fifth feature v5; finally, the fifth feature v5 passes through the 3x3 average pooling layer and the third full-connection layer to output a sixth feature v6, whose dimension is the number of categories of objects to be recognized.
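The following PyTorch sketch illustrates one way to realise the two-branch structure described above. The channel counts, fully-connected widths, activation functions, fusion by concatenation and the rendering of the pooling over the fused vector as a 1-D average pool are assumptions; the patent only fixes the layer types, kernel sizes, step sizes and connections.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CombineMobileNet(nn.Module):
    """Minimal sketch of the Combine-MobileNet structure described above."""

    def __init__(self, num_classes: int, in_channels: int = 1):
        super().__init__()
        self.conv1 = nn.Conv2d(in_channels, 16, kernel_size=3, padding=1)  # 3x3 first standard conv -> v1
        self.conv2 = nn.Conv2d(16, 32, kernel_size=1)                      # 1x1 second standard conv -> v2
        # depth separable convolution = depthwise 3x3 followed by pointwise 1x1
        self.dw_stride2 = nn.Sequential(
            nn.Conv2d(32, 32, 3, stride=2, padding=1, groups=32),
            nn.Conv2d(32, 64, 1))                                          # step-size-2 branch -> v3
        self.dw_stride1 = nn.Sequential(
            nn.Conv2d(32, 32, 3, stride=1, padding=1, groups=32),
            nn.Conv2d(32, 64, 1))                                          # step-size-1 branch -> v4
        self.fc1 = nn.LazyLinear(96)            # first full-connection layer (width assumed)
        self.fc2 = nn.LazyLinear(96)            # second full-connection layer (width assumed)
        self.fc3 = nn.Linear(64, num_classes)   # third full-connection layer -> v6

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        v1 = F.relu(self.conv1(x))
        v2 = F.relu(self.conv2(v1))
        v3 = F.relu(self.dw_stride2(v2)).flatten(1)
        v4 = F.relu(self.dw_stride1(v2)).flatten(1)
        v5 = torch.cat([self.fc1(v3), self.fc2(v4)], dim=1)   # feature fusion layer (concatenation assumed)
        # stand-in for the 3x3 average pooling layer, applied here to the fused vector
        v5 = F.avg_pool1d(v5.unsqueeze(1), kernel_size=3, stride=3).squeeze(1)
        return self.fc3(v5)                                   # one score per object category
```

For instance, `CombineMobileNet(num_classes=5)` applied to a batch of downsampled grayscale images produces a five-way score vector per image.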
S4: storing and loading the trained Combine-MobileNet neural network on an embedded platform;
the loading step comprises the following steps:
s4.1: saving the trained Combine-MobileNet neural network as an H5 file;
s4.2: analyzing the H5 file to obtain matrixing network parameters of the Combine-MobileNet neural network;
s4.3: creating two c-language files of a model_init.c and a model_init.h, and writing matrixing network parameters into the model_init.h file according to a data stream form;
s4.4: the corresponding old file in the embedded platform engineering file is replaced with "model_init.c" and "model_init.h".
Image processing and network training are carried out on the PC side, but the embedded platform cannot process the network parameters of the Combine-MobileNet neural network directly, so the parameters must be converted into a matrix form that the embedded platform can process. The data in the H5 file are organized as a tree structure and divided into weights and biases, and the matrixed network parameters are obtained after parsing: the convolution kernel element a in row h and column w of the n-th dimension of the layer1 network is expressed as a = layer1(n, h, w), and the k-th bias term b of the layer1 network is expressed as b = layer1(bias, k). After "model_init.c" and "model_init.h" replace the corresponding old files in the embedded platform project files, the object category names defined in the project files are changed to the category names of the currently trained objects.
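A rough sketch of how the saved H5 weights could be flattened into C arrays for the embedded project is given below. The dataset layout inside the H5 file, the generated identifier names and the decision to emit the arrays into the header are assumptions; only the file names "model_init.c" and "model_init.h" are taken from the text above.

```python
import h5py

def export_to_c(h5_path: str, c_path: str = "model_init.c", h_path: str = "model_init.h") -> None:
    """Read every weight/bias dataset from the H5 file and write it out as a
    flattened C float array so the embedded project can rebuild each layer."""
    arrays = {}
    with h5py.File(h5_path, "r") as f:
        def collect(name, obj):
            if isinstance(obj, h5py.Dataset):
                arrays[name.replace("/", "_")] = obj[()]   # numpy array of parameters
        f.visititems(collect)

    with open(h_path, "w") as header:
        header.write("/* auto-generated Combine-MobileNet parameters */\n")
        for name, arr in arrays.items():
            flat = arr.flatten()
            header.write(f"static const float {name}[{flat.size}] = {{\n    ")
            header.write(", ".join(f"{v:.6f}f" for v in flat))
            header.write("\n};\n")

    with open(c_path, "w") as source:                       # model_init.c references the arrays
        source.write('#include "model_init.h"\n')
```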
S5: inputting the image data of the object to be identified into the embedded platform, and carrying out real-time reasoning on the category of the object to be identified to obtain an identification result.
Real-time reasoning is performed on the object to be identified based on the selection strategy to obtain the identification result. The selection strategy trades time for accuracy: by comprehensively evaluating the results of several inferences, the influence of human factors and equipment on inference accuracy is reduced. Specifically, each time the category of an object is inferred, image data of the object at different times are acquired and several inferences are made, and the inference result with the largest number of occurrences or the highest probability is selected as the recognition result of the object. In this embodiment, image data of the object at three different times are acquired for inference, and the result that occurs most often or has the highest probability among the three inferences is taken as the final recognition result, which makes the object recognition more accurate.
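A minimal sketch of this majority-vote selection strategy (the helper names are assumptions; on the real device the loop would run on the embedded platform rather than in Python):

```python
from collections import Counter

def classify_with_selection(infer_fn, frames):
    """Run the network on several frames of the same object captured at different
    times (three in this embodiment) and return the most frequent predicted class;
    `infer_fn` maps one frame to a predicted class index."""
    predictions = [infer_fn(frame) for frame in frames]
    return Counter(predictions).most_common(1)[0][0]
```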
In actual operation, Visual Studio 2019 is installed and the image acquisition system is built; AHL-GEC-IDE (4.08) is installed to construct the embedded engineering development platform; a TT-USB serial port (CH340) driver is installed to enable communication between the embedded platform and the PC; and JetBrains PyCharm 2019.1.1 x64 is installed to implement the object recognition system functions. First, the modified project file is imported into the compiler; the project file is compiled; the embedded platform is connected to the PC through a port; the compiled project file is loaded onto the embedded platform through the port; and inference is performed on the acquired image data of the object to be recognized, with the recognition result displayed on the display screen of the embedded platform.
Example 2
The present embodiment provides an embedded object recognition apparatus, as shown in fig. 3, including:
the data acquisition module is used for acquiring color images of the object;
the data processing module is used for processing the color image to obtain processed image data;
the network training test module is used for training and testing the constructed Combine-MobileNet neural network by using the processed image data to obtain a trained Combine-MobileNet neural network;
the network loading module is used for storing and loading the trained Combine-MobileNet neural network onto the embedded platform;
the reasoning and identifying module is used for inputting the image data of the object to be identified into the embedded platform, and carrying out real-time reasoning on the category of the object to be identified to obtain an identification result.
It should be understood that the above examples of the present invention are provided by way of illustration only and are not intended to limit the embodiments of the present invention. Other variations or modifications may be made by those of ordinary skill in the art on the basis of the above description; it is neither necessary nor possible to exhaust all embodiments here. Any modification, equivalent replacement or improvement made within the spirit and principles of the invention shall fall within the protection scope of the claims of the invention.

Claims (9)

1. An embedded object recognition method, comprising:
s1: collecting a color image of an object;
s2: processing the color image to obtain processed image data;
s3: training and testing the constructed Combine-MobileNet neural network by using the processed image data to obtain a trained Combine-MobileNet neural network;
the Combine-MobileNet neural network comprises a first standard convolution layer, a second standard convolution layer, a first depth separable convolution layer, a second depth separable convolution layer, a first full-connection layer, a second full-connection layer, a feature fusion layer, an average pooling layer and a third full-connection layer;
the output end of the first standard convolution layer is connected with the input end of the second standard convolution layer, and the output end of the second standard convolution layer is respectively connected with the input ends of the first depth separable convolution layer and the second depth separable convolution layer;
the output end of the first depth separable convolution layer is connected with the input end of the first full-connection layer, and the output end of the first full-connection layer is connected with the input end of the feature fusion layer; the output end of the second depth separable convolution layer is connected with the input end of the second full-connection layer, and the output end of the second full-connection layer is connected with the input end of the feature fusion layer;
the output end of the characteristic fusion layer is connected with the input end of the average pooling layer, and the output end of the average pooling layer is connected with the input end of the third full-connection layer;
s4: storing and loading the trained Combine-MobileNet neural network on an embedded platform;
s5: inputting the image data of the object to be identified into the embedded platform, and carrying out real-time reasoning on the category of the object to be identified to obtain an identification result.
2. The embedded object recognition method according to claim 1, wherein in the step S2, the specific steps of processing the color image are:
s2.1: converting the color image into a gray scale image;
s2.2: randomly dividing the gray level image into a training image and a test image;
s2.3: performing data enhancement operation on the training image to obtain an enhanced training image;
s2.4: downsampling the enhanced training image to obtain a downsampled training image.
3. The embedded object recognition method according to claim 2, wherein in the step S2.3, the data enhancement operation performed on the training image includes: rotation, clipping, translation, and gaussian noise.
4. The embedded object recognition method according to claim 3, wherein in the step S2.4, downsampling the enhanced training image comprises sequentially performing an average pooling operation and a max pooling operation on the enhanced training image.
5. The embedded object recognition method according to claim 4, wherein in the step S3, the method for training and testing the constructed Combine-MobileNet neural network by using the processed image data comprises the following steps:
s3.1: setting a loss function, an optimal loss function value replacement frequency threshold and a maximum training frequency of a Combine-MobileNet neural network;
s3.2: inputting the downsampled training image into a Combine-MobileNet neural network, and calculating a loss function value loss of the downsampled training image by using cross entropy;
s3.3: setting an early-stop strategy, namely comparing the loss function value of the downsampled training image with the optimal loss function value, replacing the optimal loss function value with the loss function value when the loss function value is larger than the optimal loss function value, and recording the replacement times;
s3.4: comparing the replacement times with the optimal loss function value replacement times threshold, and performing the next training when the replacement times are smaller than the optimal loss function value replacement times threshold; otherwise, completing the training of the Combine-MobileNet neural network;
s3.5: and inputting the test image into the trained Combine-MobileNet neural network for testing, and obtaining the trained Combine-MobileNet neural network.
6. The embedded object recognition method according to claim 1, wherein in the step S4, the specific method for saving and loading the trained Combine-MobileNet neural network onto the embedded platform is as follows:
s4.1: saving the trained Combine-MobileNet neural network as an H5 file;
s4.2: analyzing the H5 file to obtain matrixing network parameters of the Combine-MobileNet neural network;
s4.3: creating two c-language files of a model_init.c and a model_init.h, and writing matrixing network parameters into the model_init.h file according to a data stream form;
s4.4: the corresponding old file in the embedded platform engineering file is replaced with "model_init.c" and "model_init.h".
7. The embedded object recognition method according to claim 1, wherein in the step S5, real-time reasoning is performed on the category of the object to be recognized based on the selection strategy, so as to obtain a recognition result; when the category of an object is inferred, image data of the object at different times are acquired for multiple inferences, and an inference result with the largest occurrence number or probability is selected as an object identification result.
8. The embedded object recognition method of claim 1, wherein the embedded platform is an STM 32-based embedded platform.
9. An embedded object recognition device, comprising:
the data acquisition module is used for acquiring color images of the object;
the data processing module is used for processing the color image to obtain processed image data;
the network training test module is used for training and testing the constructed Combine-MobileNet neural network by using the processed image data to obtain a trained Combine-MobileNet neural network;
the Combine-MobileNet neural network comprises a first standard convolution layer, a second standard convolution layer, a first depth separable convolution layer, a second depth separable convolution layer, a first full-connection layer, a second full-connection layer, a feature fusion layer, an average pooling layer and a third full-connection layer;
the output end of the first standard convolution layer is connected with the input end of the second standard convolution layer, and the output end of the second standard convolution layer is respectively connected with the input ends of the first depth separable convolution layer and the second depth separable convolution layer;
the output end of the first depth separable convolution layer is connected with the input end of the first full-connection layer, and the output end of the first full-connection layer is connected with the input end of the feature fusion layer; the output end of the second depth separable convolution layer is connected with the input end of the second full-connection layer, and the output end of the second full-connection layer is connected with the input end of the feature fusion layer;
the output end of the characteristic fusion layer is connected with the input end of the average pooling layer, and the output end of the average pooling layer is connected with the input end of the third full-connection layer;
the network loading module is used for storing and loading the trained Combine-MobileNet neural network onto the embedded platform;
the reasoning and identifying module is used for inputting the image data of the object to be identified into the embedded platform, and carrying out real-time reasoning on the category of the object to be identified to obtain the identification result of the object to be identified.
CN202111138968.7A 2021-09-27 2021-09-27 Embedded object recognition method and device Active CN113822212B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111138968.7A CN113822212B (en) 2021-09-27 2021-09-27 Embedded object recognition method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111138968.7A CN113822212B (en) 2021-09-27 2021-09-27 Embedded object recognition method and device

Publications (2)

Publication Number Publication Date
CN113822212A CN113822212A (en) 2021-12-21
CN113822212B 2024-01-05

Family

ID=78915717

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111138968.7A Active CN113822212B (en) 2021-09-27 2021-09-27 Embedded object recognition method and device

Country Status (1)

Country Link
CN (1) CN113822212B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111695469A (en) * 2020-06-01 2020-09-22 Xidian University Hyperspectral image classification method of lightweight depth separable convolution feature fusion network
CN112528899A (en) * 2020-12-17 2021-03-19 Nankai University Image salient object detection method and system based on implicit depth information recovery
CN112818893A (en) * 2021-02-10 2021-05-18 Beijing University of Technology Lightweight open-set landmark identification method facing mobile terminal

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109784325A (en) * 2017-11-10 2019-05-21 Fujitsu Limited Open-set recognition method and apparatus and computer readable storage medium

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111695469A (en) * 2020-06-01 2020-09-22 Xidian University Hyperspectral image classification method of lightweight depth separable convolution feature fusion network
CN112528899A (en) * 2020-12-17 2021-03-19 Nankai University Image salient object detection method and system based on implicit depth information recovery
CN112818893A (en) * 2021-02-10 2021-05-18 Beijing University of Technology Lightweight open-set landmark identification method facing mobile terminal

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Multi-class target recognition based on neural networks; Zhao Jing; Wang Xian; Wang Ben; Jiang Guoping; Xie Fei; Xu Fengyu; Control and Decision (Issue 08); full text *

Also Published As

Publication number Publication date
CN113822212A (en) 2021-12-21

Similar Documents

Publication Publication Date Title
CN112465748B (en) Crack identification method, device, equipment and storage medium based on neural network
CN112132156B (en) Image saliency target detection method and system based on multi-depth feature fusion
CN108197326A (en) A kind of vehicle retrieval method and device, electronic equipment, storage medium
CN111275660B (en) Flat panel display defect detection method and device
CN113971660B (en) Computer vision method for bridge health diagnosis and intelligent camera system
CN110222604A (en) Target identification method and device based on shared convolutional neural networks
CN115909006B (en) Mammary tissue image classification method and system based on convolution transducer
CN113052295B (en) Training method of neural network, object detection method, device and equipment
CN113888514A (en) Method and device for detecting defects of ground wire, edge computing equipment and storage medium
CN109740553B (en) Image semantic segmentation data screening method and system based on recognition
CN110909657A (en) Method for identifying apparent tunnel disease image
CN114463637A (en) Winter wheat remote sensing identification analysis method and system based on deep learning
CN113205107A (en) Vehicle type recognition method based on improved high-efficiency network
CN115620190A (en) Joint identification platform based on data analysis
CN113362277A (en) Workpiece surface defect detection and segmentation method based on deep learning
CN114445356A (en) Multi-resolution-based full-field pathological section image tumor rapid positioning method
CN117197763A (en) Road crack detection method and system based on cross attention guide feature alignment network
CN114495060B (en) Road traffic marking recognition method and device
CN111310837A (en) Vehicle refitting recognition method, device, system, medium and equipment
CN111210398A (en) White blood cell recognition system based on multi-scale pooling
CN113822212B (en) Embedded object recognition method and device
CN116580232A (en) Automatic image labeling method and system and electronic equipment
CN114612669B (en) Method and device for calculating ratio of inflammation to necrosis of medical image
CN116259021A (en) Lane line detection method, storage medium and electronic equipment
CN116977249A (en) Defect detection method, model training method and device

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
CB03 Change of inventor or designer information

Inventor after: Zhang Hongliang

Inventor after: Li Guangming

Inventor after: Yu Chenhui

Inventor after: Zhang Hong

Inventor after: Luo Jiaqi

Inventor before: Li Guangming

Inventor before: Zhang Hongliang

Inventor before: Yu Chenhui

Inventor before: Zhang Hong

Inventor before: Luo Jiaqi

CB03 Change of inventor or designer information
GR01 Patent grant
GR01 Patent grant