CN113516090B - Factory building scene recognition method and device, electronic equipment and storage medium - Google Patents
Factory building scene recognition method and device, electronic equipment and storage medium
- Publication number
- CN113516090B (application number CN202110850496.1A)
- Authority
- CN
- China
- Prior art keywords
- type
- factory building
- scene
- plant
- image sample
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G06F18/214—Pattern recognition; generating training patterns; bootstrap methods, e.g. bagging or boosting
- G06F18/2415—Classification techniques based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
- G06N3/04—Neural networks; architecture, e.g. interconnection topology
- G06N3/08—Neural networks; learning methods
Abstract
The invention provides a factory building scene recognition method and device, electronic equipment, and a storage medium. Without changing the model structure of the scene recognition model, the method enables the model to reach high accuracy with only a small number of labelled training samples, thereby improving the robustness and accuracy of the scene recognition model and reducing the cost of large-scale data labelling, while also avoiding the waste of a large amount of unlabelled data. The scene recognition model thus solves the prior-art problem that applying a general image recognition model to factory building scene recognition requires a large, expensively labelled data set.
Description
Technical Field
The present invention relates to the field of target recognition technologies, and in particular, to a method and apparatus for recognizing a factory building scene, an electronic device, and a storage medium.
Background
Production field management is an important component of enterprise management, and management of the factory building working environment is a core link of production field management. At its core it touches on many aspects, such as work awareness, system construction, and behavioural habits. Improving the factory building working environment creates a good working atmosphere and raises employees' production efficiency; a clean and tidy working environment improves product quality; an orderly workplace ensures safe production; and it fosters an autonomous, conscientious working attitude among employees and raises their enthusiasm for work.
Factory building working-environment recognition based on machine vision is still at an exploratory stage, taking the latest image recognition models as its reference point. The performance of common image recognition algorithms depends strongly on the size of the training set. In reality, however, factory building scenes are extremely varied: the scene can differ greatly between different factory buildings, and even within the same factory building across different time periods. A general image recognition model therefore needs a large amount of manual annotation during training, which leads to problems such as large data scale and high annotation cost in factory building scene recognition.
For this reason, there is an urgent need for a factory building scene recognition method.
Disclosure of Invention
The invention provides a factory building scene recognition method, a device, electronic equipment and a storage medium, which are used for solving the defects in the prior art.
The invention provides a factory building scene recognition method, which comprises the following steps:
acquiring a factory building image to be identified;
Inputting the plant image to be identified into a scene identification model to obtain a plant scene category corresponding to the plant image to be identified, which is output by the scene identification model;
The scene recognition model is obtained by training on a first preset number of first-type factory building image samples and a second preset number of second-type factory building image samples, wherein the first-type factory building image samples do not carry factory building scene category labels and the second-type factory building image samples carry factory building scene category labels; the factory building scene category label of each first-type factory building image sample is determined based on the factory building scene category labels carried by the second-type factory building image samples.
According to the plant scene recognition method provided by the invention, the scene recognition model is obtained by training based on the following method:
inputting each first type factory building image sample and each second type factory building image sample into a feature extraction module of a scene recognition model to be trained respectively to obtain first type coding features of each first type factory building image sample and second type coding features of each second type factory building image sample output by the feature extraction module;
determining a factory building scene category label of each first type factory building image sample based on the distance between the first type coding feature of each first type factory building image sample and the second type coding feature of each second type factory building image sample;
Inputting each first type of coding feature into a feature classification module of the scene recognition model to be trained to obtain a first type of feature probability vector of each first type of factory building image sample output by the feature classification module;
And training the scene recognition model to be trained based on the plant scene class labels of each first-type plant image sample and the first-type feature probability vectors of each first-type plant image sample.
According to the plant scene recognition method provided by the invention, the training of the to-be-trained scene recognition model based on the plant scene class labels of each first plant image sample and the first type feature probability vectors of each first plant image sample specifically comprises the following steps:
And calculating a loss function of the scene recognition model to be trained based on the factory building scene category label of each first-type factory building image sample and the first-type feature probability vector of each first-type factory building image sample, backpropagating the gradients through the scene recognition model to be trained, and recalculating the loss function of the scene recognition model to be trained until it converges, so as to obtain the scene recognition model.
According to the plant scene recognition method provided by the invention, the distance between the first type coding features of each first type plant image sample and the second type coding features of each second type plant image sample is cosine distance;
Correspondingly, determining the factory building scene category label of each first type factory building image sample based on the distance between the first type coding feature of each first type factory building image sample and the second type coding feature of each second type factory building image sample specifically comprises:
For any first-type coding feature, determining, based on the cosine distance between that first-type coding feature and each second-type coding feature, the second-type factory building image samples corresponding to the third preset number of second-type coding features nearest in cosine distance (i.e., with the greatest cosine similarity);
and determining the factory building scene category labels of the first type factory building image samples corresponding to any one of the first type coding features based on the factory building scene category labels carried by the third preset number of second type factory building image samples.
According to the plant scene recognition method provided by the invention, the determining of the plant scene category label of the first plant image sample corresponding to any one of the first type coding features based on the plant scene category label of the third preset number of the second type plant image samples specifically comprises the following steps:
comparing the factory building scene category labels carried by the third preset number of second-type factory building image samples, and taking the label with the highest proportion among them as the factory building scene category label of the first-type factory building image sample corresponding to that first-type coding feature.
According to the plant scene recognition method provided by the invention, the scene recognition model to be trained is a pre-trained model obtained by preliminary training on the ImageNet database.
The invention also provides a plant scene recognition device, which comprises:
The acquisition module is used for acquiring the image of the factory building to be identified;
The recognition module is used for inputting the plant image to be recognized into a scene recognition model to obtain a plant scene category corresponding to the plant image to be recognized, which is output by the scene recognition model;
The scene recognition model is trained based on a first preset number of first-class factory building image samples and a second preset number of second-class factory building image samples, wherein the first-class factory building image samples do not carry factory building scene type labels, and the second-class factory building image samples carry factory building scene type labels; and determining the factory building scene category label of the first type factory building image sample based on the factory building scene category label carried by the second type factory building image sample.
The invention provides a workshop scene recognition device, which further comprises a training module for:
inputting each first type factory building image sample and each second type factory building image sample into a feature extraction module of a scene recognition model to be trained respectively to obtain first type coding features of each first type factory building image sample and second type coding features of each second type factory building image sample output by the feature extraction module;
determining a factory building scene category label of each first type factory building image sample based on the distance between the first type coding feature of each first type factory building image sample and the second type coding feature of each second type factory building image sample;
Inputting each first type of coding feature into a feature classification module of the scene recognition model to be trained to obtain a first type of feature probability vector of each first type of factory building image sample output by the feature classification module;
And training the scene recognition model to be trained based on the plant scene class labels of each first-type plant image sample and the first-type feature probability vectors of each first-type plant image sample.
The invention also provides an electronic device comprising a memory, a processor and a computer program stored on the memory and operable on the processor, wherein the processor implements the steps of any one of the plant scene recognition methods described above when executing the program.
The invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the plant scene recognition method as described in any of the above.
According to the factory building scene recognition method and device, electronic equipment, and storage medium provided by the invention, the model structure of the scene recognition model is left unchanged while a small number of labelled training samples suffice for the model to reach high accuracy, which improves the robustness and accuracy of the scene recognition model and reduces the cost of large-scale data labelling, while also avoiding the waste of a large amount of unlabelled data. The scene recognition model thus addresses the prior-art problem that applying a general image recognition model to factory building scene recognition requires a large, expensively labelled data set.
Drawings
In order to more clearly illustrate the invention or the technical solutions of the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described, and it is obvious that the drawings in the description below are some embodiments of the invention, and other drawings can be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic flow chart of a plant scene recognition method provided by the invention;
FIG. 2 is a second flow chart of the plant scene recognition method according to the present invention;
Fig. 3 is a schematic structural diagram of a plant scene recognition device provided by the invention;
fig. 4 is a schematic structural diagram of an electronic device provided by the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is apparent that the described embodiments are some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Fig. 1 is a flow chart of a plant scene recognition method provided in an embodiment of the present invention, as shown in fig. 1, the method includes:
S1, acquiring a factory building image to be identified;
s2, inputting the plant image to be identified into a scene identification model to obtain a plant scene category corresponding to the plant image to be identified, which is output by the scene identification model;
The scene recognition model is obtained by training on a first preset number of first-type factory building image samples and a second preset number of second-type factory building image samples, wherein the first-type factory building image samples do not carry factory building scene category labels and the second-type factory building image samples carry factory building scene category labels; the factory building scene category label of each first-type factory building image sample is determined based on the factory building scene category labels carried by the second-type factory building image samples.
Specifically, in the plant scene recognition method provided by the embodiment of the invention, the execution subject is a plant scene recognition device for recognizing a plant scene. The factory building refers to a house without a partition wall for staff or work machines to work or for parking work machines. The factory building scene refers to a working environment in the factory building, and can comprise a neat scene, a dirty and dirty scene and the like, and the invention is not particularly limited in the embodiment.
Wherein the work machine may include: at least one of a drilling machine, an excavating machine, a loading machine, a carrying machine, a municipal machine, a crusher, and a driver-driven vehicle. The excavating machine is a working machine for excavating a mine. A loading machine is a work machine for loading cargo into a carrier machine. The loading machine includes at least one of a hydraulic excavator, an electric excavator, and a wheel loader. The carrier machine is a work machine for carrying goods. Municipal machines are work machines used for urban road cleaning and beautification, such as motor sweeper, watering cart and dust collector. The crusher is a working machine that crushes earth and stones thrown in from a carrying machine.
The factory building scene recognition device can be deployed on a local server or a cloud server. The local server may be, for example, a computer; the embodiment of the present invention does not specifically limit this.
Step S1 is executed first, and a factory building image to be identified is obtained. The factory building image to be identified refers to a factory building image of which scene category needs to be determined. The factory building image to be identified can be obtained by shooting through image acquisition equipment, the image acquisition equipment can be a camera, the camera can be installed at a fixed position in the factory building, and the camera located at the fixed position can shoot a scene in the factory building.
And then executing step S2, inputting the plant image to be identified into a scene identification model, and processing the plant image to be identified through the scene identification model to obtain the plant scene category corresponding to the plant image to be identified, which is output by the scene identification model. The factory scene categories may include categories of clean scenes, dirty scenes, and the like.
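As a concrete illustration of step S2, the sketch below maps an image's feature vector to a scene-category probability vector and returns the most probable category. The function name, the category names, and the linear weights are all hypothetical stand-ins for the trained scene recognition model, not the patent's actual network:

```python
import numpy as np

CLASSES = ["tidy", "dirty"]  # hypothetical scene-category names

def recognize_scene(image_feature, weight, bias):
    """Toy stand-in for the scene recognition model's forward pass:
    linear scores -> softmax probability vector -> arg-max category."""
    logits = weight @ image_feature + bias
    probs = np.exp(logits - logits.max())   # numerically stable softmax
    probs /= probs.sum()
    return CLASSES[int(np.argmax(probs))], probs
```

In the real system the feature vector would come from a camera image processed by the model's feature extraction module.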
In the embodiment of the invention, the scene recognition model may be a machine learning model, such as a neural network model or a visual-algorithm network model. The scene recognition model can be obtained by training on a first preset number of first-type factory building image samples that do not carry factory building scene category labels and a second preset number of second-type factory building image samples that carry factory building scene category labels. The first preset number may be larger than the second preset number; that is, among the training samples, the unlabelled first-type factory building image samples are the large sample set and the labelled second-type factory building image samples are the small sample set. Together, the first-type factory building image samples form a large-sample data set, and the second-type factory building image samples form a small-sample data set.
The factory building scene category label of the first type factory building image sample can be assigned through the factory building scene category label carried by the second type factory building image sample, namely, the factory building scene category label of the first type factory building image sample can be obtained through the factory building scene category label carried by the second type factory building image sample. And then training the scene recognition model according to the first type factory building image sample with the determined factory building scene type label and the second type factory building image sample with the factory building scene type label.
The factory building scene recognition method provided by the embodiment of the invention first acquires a factory building image to be recognized, and then inputs it into a scene recognition model to obtain the factory building scene category output by the model. The scene recognition model is obtained by training on a first preset number of first-type factory building image samples that do not carry factory building scene category labels and a second preset number of second-type factory building image samples that carry such labels; the factory building scene category label of each first-type sample is determined from the labels carried by the second-type samples. In this way, without changing the model structure, a small number of labelled training samples suffice for the scene recognition model to reach high accuracy, which improves its robustness and accuracy and reduces the cost of large-scale data labelling, while also avoiding the waste of a large amount of unlabelled data. This addresses the prior-art problem that applying a general image recognition model to factory building scene recognition requires a large, expensively labelled data set.
On the basis of the embodiment, according to the plant scene recognition method provided by the embodiment of the invention, the camera can be movably arranged in the plant, so that the plant image to be recognized can be ensured to contain more information for identifying the plant scene, and the accuracy of the plant scene recognition result is further improved.
On the basis of the above embodiment, the plant scene recognition method provided in the embodiment of the present invention, wherein the scene recognition model is trained based on the following method:
inputting each first type factory building image sample and each second type factory building image sample into a feature extraction module of a scene recognition model to be trained respectively to obtain first type coding features of each first type factory building image sample and second type coding features of each second type factory building image sample output by the feature extraction module;
determining a factory building scene category label of each first type factory building image sample based on the distance between the first type coding feature of each first type factory building image sample and the second type coding feature of each second type factory building image sample;
Inputting each first type of coding feature into a feature classification module of the scene recognition model to be trained to obtain a first type of feature probability vector of each first type of factory building image sample output by the feature classification module;
And training the scene recognition model to be trained based on the plant scene class labels of each first-type plant image sample and the first-type feature probability vectors of each first-type plant image sample.
Specifically, in the embodiment of the present invention, the scene recognition model may include a feature extraction module, a feature classification module, and an output module, where the feature extraction module is configured to perform feature extraction on an image to obtain an encoded feature; the feature classification module is used for carrying out feature classification on the coding features to obtain feature probability vectors, and the scene category of the image is represented by the feature probability vectors; the output module is used for outputting the scene category corresponding to the feature probability vector.
The scene recognition model can be obtained by training a scene recognition model to be trained, and the scene recognition model to be trained is consistent with the scene recognition model in structure. The scene recognition model to be trained is a basic model of the scene recognition model, can be a neural network model with undetermined model parameters, and can also be a pre-training model obtained through preliminary training of the existing image library, and the method is not particularly limited in the embodiment of the invention.
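The three-module structure described above (feature extraction, feature classification, output) can be sketched as follows. All class, method, and weight names are hypothetical, and simple linear maps stand in for the real network layers:

```python
import numpy as np

class SceneRecognitionModel:
    """Illustrative sketch of the three modules: feature extraction,
    feature classification, and output. Not the patent's actual network."""

    def __init__(self, w_feat, w_cls, classes):
        self.w_feat = w_feat      # feature-extraction parameters
        self.w_cls = w_cls        # feature-classification parameters
        self.classes = classes    # e.g. ["tidy", "dirty"]

    def extract(self, image):
        """Feature extraction module: image -> coding feature."""
        return np.tanh(self.w_feat @ image)

    def classify(self, feature):
        """Feature classification module: coding feature -> probability vector."""
        logits = self.w_cls @ feature
        p = np.exp(logits - logits.max())
        return p / p.sum()

    def output(self, probs):
        """Output module: probability vector -> scene category."""
        return self.classes[int(np.argmax(probs))]
```

Running an image through `extract`, `classify`, and `output` in sequence mirrors the module pipeline described above.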
When the scene recognition model to be trained is trained, each first type of factory building image sample and each second type of factory building image sample can be input to the feature extraction module of the scene recognition model to be trained, and the first type of coding feature of each first type of factory building image sample and the second type of coding feature of each second type of factory building image sample output by the feature extraction module are obtained. The first type of coding features as well as the second type of coding features are typically represented in the form of feature vectors. And then determining the factory building scene category label of each first type factory building image sample according to the distance between the first type coding features of each first type factory building image sample and the second type coding features of each second type factory building image sample. The distance between the first type of coding feature and the second type of coding feature may be a cosine distance, an euclidean distance, a manhattan distance, or the like, which is not particularly limited in the embodiment of the present invention. And finally, inputting each first type coding feature into a feature classification module to obtain a first type feature probability vector of each first type factory building image sample output by the feature classification module.
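The three candidate distances mentioned above can be written out directly (a sketch; the helper names are illustrative, and cosine distance is taken as one minus cosine similarity):

```python
import numpy as np

def cosine_distance(a, b):
    """1 - cosine similarity: small when vectors point the same way."""
    return 1.0 - float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def euclidean_distance(a, b):
    """Straight-line (L2) distance between two feature vectors."""
    return float(np.linalg.norm(a - b))

def manhattan_distance(a, b):
    """Sum of absolute coordinate differences (L1 distance)."""
    return float(np.abs(a - b).sum())
```

Any of the three could play the role of the distance between first-type and second-type coding features; the embodiments below use the cosine distance.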
The plant scene type labels of each first plant image sample are known, and the first type feature probability vector of each first plant image sample can be obtained through the feature classification module in the to-be-trained scene recognition model, so that training of the to-be-trained scene recognition model can be achieved according to the plant scene type labels of each first plant image sample and the first type feature probability vector of each first plant image sample.
In the embodiment of the invention, the specific structure of the scene recognition model to be trained is provided, the actions of each structure and the training process of the scene recognition model to be trained are also provided, and the accuracy of the scene recognition model is improved under the condition that a large number of samples with labels are not needed.
On the basis of the above embodiment, the plant scene recognition method provided in the embodiment of the present invention trains the to-be-trained scene recognition model based on the plant scene class label of each first type plant image sample and the first type feature probability vector of each first type plant image sample, specifically includes:
And calculating a loss function of the scene recognition model to be trained based on the factory building scene category label of each first-type factory building image sample and the first-type feature probability vector of each first-type factory building image sample, backpropagating the gradients through the scene recognition model to be trained, and recalculating the loss function of the scene recognition model to be trained until it converges, so as to obtain the scene recognition model.
Specifically, in the training process of the scene recognition model to be trained, each first type factory building image sample and each second type factory building image sample need to be input into the scene recognition model to be trained one by one, a feature extraction module obtains first type coding features of each first type factory building image sample and second type coding features of each second type factory building image sample, and each first type coding feature is input into a feature classification module to obtain first type feature probability vectors of each first type factory building image sample.
And then calculating a loss function of the scene recognition model to be trained according to the factory building scene category labels of each first-type factory building image sample and the first-type feature probability vector of each first-type factory building image sample, where the loss function may include a mean squared error (MSE) loss, a mean absolute error (MAE) loss, a Huber loss, a quantile loss, a cross-entropy loss, and the like.
And finally, performing gradient back-propagation on the scene recognition model to be trained, iteratively executing the above process, and recalculating the loss function of the scene recognition model to be trained until the scene recognition model to be trained converges, so as to obtain the scene recognition model. Convergence of the scene recognition model to be trained means that the value of the loss function reaches its minimum, so the training process of the scene recognition model to be trained can be understood as a search for the model parameters that minimize the loss function.
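The loss-and-update cycle described above can be sketched in a few lines of Python. The snippet below is an illustrative toy using cross-entropy, one of the loss options the text lists, not the patented implementation; the probability vectors, pseudo-labels, and class names are all hypothetical values chosen for the example.

```python
import numpy as np

def cross_entropy_loss(probs, labels):
    """Mean cross-entropy between predicted probability vectors and integer labels."""
    n = probs.shape[0]
    return -np.mean(np.log(probs[np.arange(n), labels] + 1e-12))

# Toy iteration: first-type feature probability vectors for 3 unlabeled samples
# over 2 scene classes (0 = clean, 1 = messy), with pseudo-labels assigned by
# the nearest-neighbor step described in the text.
probs = np.array([[0.9, 0.1],
                  [0.3, 0.7],
                  [0.6, 0.4]])
pseudo_labels = np.array([0, 1, 1])

loss = cross_entropy_loss(probs, pseudo_labels)
print(round(loss, 4))  # → 0.4594
```

In training, this scalar is back-propagated through the model and the loss is recomputed each iteration until it stops decreasing.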
In the embodiment of the invention, an iterative training procedure for the scene recognition model to be trained is given through loss calculation and gradient back-propagation, so that the training process of the scene recognition model to be trained is smoother.
On the basis of the above embodiment, in the plant scene recognition method provided by the embodiment of the present invention, the distance between the first type coding feature of each first type plant image sample and the second type coding feature of each second type plant image sample is a cosine distance;
Correspondingly, determining the factory building scene category label of each first type factory building image sample based on the distance between the first type coding feature of each first type factory building image sample and the second type coding feature of each second type factory building image sample specifically comprises:
For any first type of coding feature, determining second type factory building image samples corresponding to a third preset number of second type of coding features with the largest cosine distance based on the cosine distance between any first type of coding feature and each second type of coding feature;
and determining the factory building scene category labels of the first type factory building image samples corresponding to any one of the first type coding features based on the factory building scene category labels carried by the third preset number of second type factory building image samples.
Specifically, in the embodiment of the invention, the distance between a first type coding feature and a second type coding feature is represented by a cosine distance. Therefore, when the factory building scene category label of each first type factory building image sample is determined according to this distance, for any first type coding feature, the cosine distance between that first type coding feature and each second type coding feature can first be calculated; these cosine distances are then arranged in descending order, and the second type factory building image samples corresponding to the first (a third preset number of) second type coding features in the ranking are selected. The third preset number may be set as needed, for example to 1, 3, or 5.
And finally, determining the factory building scene category label of the first type factory building image sample corresponding to that first type coding feature according to the factory building scene category labels of the third preset number of second type factory building image samples. That is, one second type factory building image sample may be selected arbitrarily from the third preset number of second type factory building image samples, and the factory building scene category label it carries is used as the factory building scene category label of the first type factory building image sample corresponding to that first type coding feature. When the third preset number is 1, the factory building scene category label carried by the single second type factory building image sample can be used directly as the factory building scene category label of the first type factory building image sample corresponding to that first type coding feature.
In the embodiment of the invention, a method for determining the factory building scene category label of each first type factory building image sample by means of the cosine distance is provided, so that the obtained factory building scene category label of each first type factory building image sample is more accurate.
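The nearest-neighbor retrieval above can be sketched as follows. This is an illustrative reading of the text's "largest cosine distance" as greatest cosine similarity (smallest angular distance); the function name, vectors, and `k` value are hypothetical.

```python
import numpy as np

def top_k_by_cosine(query, bank, k=3):
    """Return indices of the k bank vectors most similar to query by cosine."""
    q = query / np.linalg.norm(query)
    b = bank / np.linalg.norm(bank, axis=1, keepdims=True)
    sims = b @ q                       # cosine similarity to each bank entry
    return np.argsort(sims)[::-1][:k]  # descending order, keep the top k

# One unlabeled sample's encoding c and a bank of 4 labeled encodings.
c = np.array([1.0, 0.0])
bank = np.array([[0.9, 0.1],    # very similar direction
                 [0.0, 1.0],    # orthogonal
                 [0.7, 0.3],
                 [-1.0, 0.0]])  # opposite direction
print(top_k_by_cosine(c, bank, k=2).tolist())  # → [0, 2]
```

The retrieved indices point into the labeled second type sample set, whose carried labels then determine the pseudo-label.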
On the basis of the foregoing embodiment, the plant scene identification method provided in the embodiment of the present invention, where the determining, based on the plant scene class labels of the third preset number of second-class plant image samples, the plant scene class label of the first-class plant image sample corresponding to any one of the first-class coding features specifically includes:
comparing the plant scene type labels of the third preset number of the second-class plant image samples, and determining that the plant scene type label with the highest proportion in the third preset number of the second-class plant image samples is the plant scene type label of the first-class plant image sample corresponding to any one of the first-class coding features.
Specifically, when the third preset number is at least 2, in determining the plant scene category label of the first type plant image sample corresponding to any first type coding feature, the plant scene category labels of the retrieved second type plant image samples can be compared, and the label with the highest proportion among them is taken as the plant scene category label of that first type plant image sample. For example, if the third preset number is 5 and the plant scene category labels of the 5 second type plant image samples are 2 clean-scene labels and 3 messy-scene labels, the label with the highest proportion is the messy-scene label, so the plant scene category label of the first type plant image sample corresponding to that first type coding feature is determined to be the messy-scene label.
In the embodiment of the invention, a method for determining the factory building scene category label of the first type factory building image sample corresponding to a first type coding feature is provided for the case where the third preset number is greater than one, so that the obtained factory building scene category label of the first type factory building image sample is more accurate.
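The majority-vote rule just described is a standard one; a minimal sketch, with hypothetical label strings matching the text's 2-clean/3-messy example:

```python
from collections import Counter

def majority_label(labels):
    """Pick the label with the highest share among the k retrieved samples."""
    return Counter(labels).most_common(1)[0][0]

# k = 5 neighbors: 2 "clean" labels and 3 "messy" labels, as in the example.
neighbor_labels = ["clean", "messy", "clean", "messy", "messy"]
print(majority_label(neighbor_labels))  # → messy
```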
On the basis of the above embodiment, in the plant scene recognition method provided by the embodiment of the present invention, the scene recognition model to be trained is a pre-training model obtained by training on the ImageNet database.
Specifically, in the embodiment of the present invention, the scene recognition model to be trained may be a pre-training model obtained by training according to an ImageNet database. The ImageNet database is a large visual database for visual object recognition software research, contains a large number of images for training, and can make the obtained pre-training model more universal.
In the embodiment of the invention, the universal pre-training model is used as the scene recognition model to be trained, so that the training time of the scene recognition model to be trained can be shortened, and the recognition accuracy of the scene recognition model can be improved.
Fig. 2 is a complete flow diagram of a plant scene recognition method provided in an embodiment of the present invention, as shown in fig. 2, where the method includes:
Step 1: constructing a labeled small sample data set Q.
Step 2: loading a pre-training model M obtained by training on the ImageNet database.
Step 3: inputting all the second type factory building image samples in the small sample data set Q of Step 1 into the pre-training model M of Step 2, and extracting the features of each second type factory building image sample by using the feature extraction module of the pre-training model M to obtain a coding library K.
Step 4: acquiring first type factory building image samples frame by frame from the camera video stream.
Step 5: inputting the first type factory building image sample obtained in Step 4 into the pre-training model M of Step 2, extracting a first type feature code c of the first type factory building image sample by using the feature extraction module of the pre-training model M, and obtaining a first type probability vector v output by the feature classification module of the pre-training model M.
Step 6: respectively calculating the cosine distances between the first type feature code c of Step 5 and each second type feature code in the code library K of Step 3, and selecting the second type factory building image samples corresponding to the first 5 second type feature codes with the largest cosine distances.
Step 7: comparing the plant scene category labels carried by the 5 second type plant image samples of Step 6; the plant scene category label of the first type plant image sample of Step 4 is the label with the highest proportion among the plant scene category labels carried by the 5 second type plant image samples.
Step 8: calculating a loss function from the plant scene category label obtained in Step 7 and the first type probability vector v obtained in Step 5, and performing gradient back-propagation on the pre-training model M of Step 2.
Step 9: iteratively executing Steps 3 to 8 until the pre-training model converges, so as to obtain the scene recognition model.
Step 10: inputting the scene image to be identified into the scene recognition model obtained in Step 9, and outputting the plant scene category corresponding to the plant image to be identified.
According to the factory building scene recognition method provided by the embodiment of the invention, starting from a determined small-sample scenario, the cosine distance between the features extracted by the feature extractor and the determined small sample set is calculated in each training iteration, so that labels are dynamically assigned to the unlabeled data. This solves the problem that, for common image recognition algorithms in messy scenes, the cost of manually labeling a large-scale data set is too high. Compared with common image recognition algorithms in the prior art, the method is better suited to recognizing dirty and messy factory building scenes and has better robustness.
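The iterative flow of Fig. 2 (Steps 3 to 8) can be sketched as a toy NumPy loop. A linear softmax head and 2-D synthetic encodings stand in for the pre-trained model M and real factory building images; every name, dimension, and value here is illustrative, not from the patent.

```python
import numpy as np
from collections import Counter

rng = np.random.default_rng(0)

# Labeled small sample set: 2-D "encodings" for two classes (0 = clean, 1 = messy).
bank = np.vstack([rng.normal([2, 0], 0.2, (10, 2)),
                  rng.normal([0, 2], 0.2, (10, 2))])
bank_labels = np.array([0] * 10 + [1] * 10)

# Unlabeled first-type samples drawn near the same clusters.
X = np.vstack([rng.normal([2, 0], 0.3, (15, 2)),
               rng.normal([0, 2], 0.3, (15, 2))])

def pseudo_labels(X, bank, bank_labels, k=5):
    """Steps 6-7: cosine top-k retrieval plus majority vote."""
    q = X / np.linalg.norm(X, axis=1, keepdims=True)
    b = bank / np.linalg.norm(bank, axis=1, keepdims=True)
    sims = q @ b.T
    out = []
    for row in sims:
        top = np.argsort(row)[::-1][:k]
        out.append(Counter(bank_labels[top]).most_common(1)[0][0])
    return np.array(out)

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

W = np.zeros((2, 2))  # linear classifier head standing in for the model
losses = []
for step in range(50):                               # Steps 3-8, iterated
    y = pseudo_labels(X, bank, bank_labels)          # dynamic label assignment
    p = softmax(X @ W)                               # Step 5: probability vectors
    n = len(X)
    losses.append(-np.mean(np.log(p[np.arange(n), y] + 1e-12)))
    grad = X.T @ (p - np.eye(2)[y]) / n              # Step 8: gradient update
    W -= 0.5 * grad

print(losses[0] > losses[-1])  # the loss decreases over the iterations
```

Because the labeled bank is fixed, the pseudo-labels stabilize and the loop reduces to ordinary gradient descent on the assigned labels, which is why convergence (Step 9) is reached.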
As shown in fig. 3, on the basis of the above embodiment, an apparatus for identifying a plant scene is provided in an embodiment of the present invention, including: an acquisition module 31 and an identification module 32.
An acquisition module 31, configured to acquire a plant image to be identified;
The recognition module 32 is configured to input the plant image to be recognized into a scene recognition model, and obtain a plant scene category corresponding to the plant image to be recognized output by the scene recognition model;
The scene recognition model is obtained by training based on a first preset number of first type factory building image samples and a second preset number of second type factory building image samples, wherein the first type factory building image samples do not carry factory building scene category labels and the second type factory building image samples carry factory building scene category labels; the factory building scene category label of each first type factory building image sample is determined based on the factory building scene category labels carried by the second type factory building image samples.
On the basis of the above embodiment, the plant scene recognition device provided in the embodiment of the present invention further includes a training module, configured to:
inputting each first type factory building image sample and each second type factory building image sample into a feature extraction module of a scene recognition model to be trained respectively to obtain first type coding features of each first type factory building image sample and second type coding features of each second type factory building image sample output by the feature extraction module;
determining a factory building scene category label of each first type factory building image sample based on the distance between the first type coding feature of each first type factory building image sample and the second type coding feature of each second type factory building image sample;
Inputting each first type of coding feature into a feature classification module of the scene recognition model to be trained to obtain a first type of feature probability vector of each first type of factory building image sample output by the feature classification module;
And training the scene recognition model to be trained based on the plant scene class labels of each first-type plant image sample and the first-type feature probability vectors of each first-type plant image sample.
On the basis of the foregoing embodiments, the plant scene recognition device provided in the embodiment of the present invention, the training module is specifically configured to:
And calculating a loss function of the scene recognition model to be trained based on the plant scene category label of each first type plant image sample and the first type feature probability vector of each first type plant image sample, performing gradient back-propagation on the scene recognition model to be trained, and recalculating the loss function of the scene recognition model to be trained until the scene recognition model to be trained converges, so as to obtain the scene recognition model.
On the basis of the above embodiment, in the plant scene recognition device provided in the embodiment of the present invention, a distance between the first type coding feature of each first type plant image sample and the second type coding feature of each second type plant image sample is a cosine distance;
Correspondingly, the training module is further specifically configured to:
For any first type of coding feature, determining second type factory building image samples corresponding to a third preset number of second type of coding features with the largest cosine distance based on the cosine distance between any first type of coding feature and each second type of coding feature;
and determining the factory building scene category labels of the first type factory building image samples corresponding to any one of the first type coding features based on the factory building scene category labels carried by the third preset number of second type factory building image samples.
On the basis of the foregoing embodiments, the plant scene recognition device provided in the embodiment of the present invention, the training module is further specifically configured to:
comparing the plant scene type labels of the third preset number of the second-class plant image samples, and determining that the plant scene type label with the highest proportion in the third preset number of the second-class plant image samples is the plant scene type label of the first-class plant image sample corresponding to any one of the first-class coding features.
On the basis of the above embodiment, the plant scene recognition device provided in the embodiment of the present invention, wherein the to-be-trained scene recognition model is based on a pre-training model obtained by training an ImageNet database.
Specifically, the functions of each module in the plant scene recognition device provided in the embodiment of the present invention are in one-to-one correspondence with the operation flows of each step in the above method embodiment, and the achieved effects are consistent.
Fig. 4 illustrates a physical schematic diagram of an electronic device, as shown in fig. 4, which may include: processor 410, communication interface (Communications Interface) 420, memory 430, and communication bus 440, wherein processor 410, communication interface 420, and memory 430 communicate with each other via communication bus 440. The processor 410 may call logic instructions in the memory 430 to perform the factory building scenario recognition method provided in the above embodiments, the method comprising: acquiring a factory building image to be identified; inputting the plant image to be identified into a scene identification model to obtain a plant scene category corresponding to the plant image to be identified, which is output by the scene identification model; the scene recognition model is obtained by training a first type of factory building image samples with a first preset number and a second type of factory building image samples with a second preset number, wherein the first type of factory building image samples do not carry the factory building scene type labels; and determining the factory building scene category label of the first type factory building image sample based on the factory building scene category label carried by the second type factory building image sample.
Further, the logic instructions in the memory 430 described above may be implemented in the form of software functional units and may be stored in a computer-readable storage medium when sold or used as a stand-alone product. Based on this understanding, the technical solution of the present invention may be embodied essentially, or in a part contributing to the prior art, in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
In another aspect, the present invention also provides a computer program product comprising a computer program stored on a non-transitory computer readable storage medium, the computer program comprising program instructions which, when executed by a computer, enable the computer to perform the plant scene recognition method provided by the above embodiments, the method comprising: acquiring a factory building image to be identified; inputting the plant image to be identified into a scene identification model to obtain a plant scene category corresponding to the plant image to be identified, which is output by the scene identification model; the scene recognition model is obtained by training a first type of factory building image samples with a first preset number and a second type of factory building image samples with a second preset number, wherein the first type of factory building image samples do not carry the factory building scene type labels; and determining the factory building scene category label of the first type factory building image sample based on the factory building scene category label carried by the second type factory building image sample.
In yet another aspect, the present invention further provides a non-transitory computer readable storage medium having stored thereon a computer program, which when executed by a processor is implemented to perform the plant scene recognition method provided in the above embodiments, the method comprising: acquiring a factory building image to be identified; inputting the plant image to be identified into a scene identification model to obtain a plant scene category corresponding to the plant image to be identified, which is output by the scene identification model; the scene recognition model is obtained by training a first type of factory building image samples with a first preset number and a second type of factory building image samples with a second preset number, wherein the first type of factory building image samples do not carry the factory building scene type labels; and determining the factory building scene category label of the first type factory building image sample based on the factory building scene category label carried by the second type factory building image sample.
The apparatus embodiments described above are merely illustrative; the units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units, i.e., they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art will understand and implement the present invention without undue burden.
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus necessary general hardware platforms, or of course may be implemented by means of hardware. Based on this understanding, the foregoing technical solution may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a computer readable storage medium, such as ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the respective embodiments or some parts of the embodiments.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.
Claims (9)
1. The factory building scene recognition method is characterized by comprising the following steps of:
acquiring a factory building image to be identified;
Inputting the plant image to be identified into a scene identification model to obtain a plant scene category corresponding to the plant image to be identified, which is output by the scene identification model;
The scene recognition model is obtained by training a scene recognition model to be trained based on a first preset number of first-class factory building image samples and a second preset number of second-class factory building image samples, wherein the first-class factory building image samples do not carry factory building scene category labels, and the second-class factory building image samples carry factory building scene category labels; the factory building scene category labels of the first type factory building image samples are determined based on the factory building scene category labels carried by the second type factory building image samples;
the factory building scene category labels of the first type factory building image samples are determined based on the following steps:
inputting each first type factory building image sample and each second type factory building image sample into a feature extraction module of a scene recognition model to be trained respectively to obtain first type coding features of each first type factory building image sample and second type coding features of each second type factory building image sample output by the feature extraction module;
determining a factory building scene category label of each first type factory building image sample based on the distance between the first type coding feature of each first type factory building image sample and the second type coding feature of each second type factory building image sample;
the distance between the first type coding features of each first type factory building image sample and the second type coding features of each second type factory building image sample is cosine distance;
Correspondingly, determining the factory building scene category label of each first type factory building image sample based on the distance between the first type coding feature of each first type factory building image sample and the second type coding feature of each second type factory building image sample specifically comprises:
For any first type of coding feature, determining second type factory building image samples corresponding to a third preset number of second type of coding features with the largest cosine distance based on the cosine distance between any first type of coding feature and each second type of coding feature; the third preset number is one or more;
and determining the factory building scene category labels of the first type factory building image samples corresponding to any one of the first type coding features based on the factory building scene category labels carried by the third preset number of second type factory building image samples.
2. The plant scene recognition method according to claim 1, wherein the scene recognition model is trained based on the following method:
Inputting each first type of coding feature into a feature classification module of the scene recognition model to be trained to obtain a first type of feature probability vector of each first type of factory building image sample output by the feature classification module;
And training the scene recognition model to be trained based on the plant scene class labels of each first-type plant image sample and the first-type feature probability vectors of each first-type plant image sample.
3. The plant scene recognition method according to claim 2, wherein training the to-be-trained scene recognition model based on the plant scene class label of each first-class plant image sample and the first-class feature probability vector of each first-class plant image sample specifically comprises:
And calculating a loss function of the scene recognition model to be trained based on the plant scene category label of each first type plant image sample and the first type feature probability vector of each first type plant image sample, performing gradient back-propagation on the scene recognition model to be trained, and recalculating the loss function of the scene recognition model to be trained until the scene recognition model to be trained converges, so as to obtain the scene recognition model.
4. The plant scene recognition method according to claim 1, wherein determining the plant scene category label of the first plant image sample corresponding to the any one of the first type coding features based on the plant scene category label of the third preset number of second type plant image samples specifically comprises:
comparing the plant scene type labels of the third preset number of the second-class plant image samples, and determining that the plant scene type label with the highest proportion in the third preset number of the second-class plant image samples is the plant scene type label of the first-class plant image sample corresponding to any one of the first-class coding features.
5. The plant scene recognition method according to claim 1, wherein the scene recognition model to be trained is based on a pre-trained model obtained by training an ImageNet database.
6. A factory building scene recognition device, characterized by comprising:
The acquisition module is used for acquiring the image of the factory building to be identified;
The recognition module is used for inputting the plant image to be recognized into a scene recognition model to obtain a plant scene category corresponding to the plant image to be recognized, which is output by the scene recognition model;
The scene recognition model is obtained by training a scene recognition model to be trained based on a first preset number of first-class factory building image samples and a second preset number of second-class factory building image samples, wherein the first-class factory building image samples do not carry factory building scene category labels, and the second-class factory building image samples carry factory building scene category labels; the factory building scene category labels of the first type factory building image samples are determined based on the factory building scene category labels carried by the second type factory building image samples;
the factory building scene category labels of the first type factory building image samples are determined based on the following steps:
inputting each first type factory building image sample and each second type factory building image sample into a feature extraction module of a scene recognition model to be trained respectively to obtain first type coding features of each first type factory building image sample and second type coding features of each second type factory building image sample output by the feature extraction module;
determining a factory building scene category label of each first type factory building image sample based on the distance between the first type coding feature of each first type factory building image sample and the second type coding feature of each second type factory building image sample;
the distance between the first type coding features of each first type factory building image sample and the second type coding features of each second type factory building image sample is cosine distance;
Correspondingly, determining the factory building scene category label of each first type factory building image sample based on the distance between the first type coding feature of each first type factory building image sample and the second type coding feature of each second type factory building image sample specifically comprises:
For any first type of coding feature, determining second type factory building image samples corresponding to a third preset number of second type of coding features with the largest cosine distance based on the cosine distance between any first type of coding feature and each second type of coding feature; the third preset number is one or more;
and determining the factory building scene category labels of the first type factory building image samples corresponding to any one of the first type coding features based on the factory building scene category labels carried by the third preset number of second type factory building image samples.
7. The plant scene recognition device of claim 6, further comprising a training module configured to:
Inputting each first type of coding feature into a feature classification module of the scene recognition model to be trained to obtain a first type of feature probability vector of each first type of factory building image sample output by the feature classification module;
And training the scene recognition model to be trained based on the plant scene class labels of each first-type plant image sample and the first-type feature probability vectors of each first-type plant image sample.
8. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the steps of the plant scene recognition method according to any one of claims 1 to 5 when the program is executed.
9. A non-transitory computer readable storage medium having stored thereon a computer program, characterized in that the computer program when executed by a processor implements the steps of the plant scene recognition method according to any of claims 1 to 5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110850496.1A CN113516090B (en) | 2021-07-27 | 2021-07-27 | Factory building scene recognition method and device, electronic equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113516090A CN113516090A (en) | 2021-10-19 |
CN113516090B true CN113516090B (en) | 2024-05-14 |
Family
ID=78068597
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110850496.1A Active CN113516090B (en) | 2021-07-27 | 2021-07-27 | Factory building scene recognition method and device, electronic equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113516090B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108710847A (en) * | 2018-05-15 | 2018-10-26 | 北京旷视科技有限公司 | Scene recognition method, device and electronic equipment |
CN112488218A (en) * | 2020-12-04 | 2021-03-12 | 北京金山云网络技术有限公司 | Image classification method, and training method and device of image classification model |
CN112633246A (en) * | 2020-12-30 | 2021-04-09 | 携程计算机技术(上海)有限公司 | Multi-scene recognition method, system, device and storage medium in open scene |
CN112990378A (en) * | 2021-05-08 | 2021-06-18 | 腾讯科技(深圳)有限公司 | Scene recognition method and device based on artificial intelligence and electronic equipment |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10390082B2 (en) * | 2016-04-01 | 2019-08-20 | Oath Inc. | Computerized system and method for automatically detecting and rendering highlights from streaming videos |
Non-Patent Citations (1)
Title |
---|
Chen Min. Artificial Intelligence Communication Theory and Methods [人工智能通信理论与方法]. Wuhan: Huazhong University of Science and Technology Press, 2020, pp. 122-127. * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109272509B (en) | Target detection method, device and equipment for continuous images and storage medium | |
CN113111716B (en) | Remote sensing image semiautomatic labeling method and device based on deep learning | |
CN109816780B (en) | Power transmission line three-dimensional point cloud generation method and device of binocular sequence image | |
JP2024513596A (en) | Image processing method and apparatus and computer readable storage medium | |
CN110689535A (en) | Workpiece identification method and device, electronic equipment and storage medium | |
CN114170516B (en) | Vehicle weight recognition method and device based on roadside perception and electronic equipment | |
CN110781960A (en) | Training method, classification method, device and equipment of video classification model | |
CN115546116B (en) | Full-coverage type rock mass discontinuous surface extraction and interval calculation method and system | |
CN117376632B (en) | Data recovery method and system based on intelligent depth synthesis | |
CN111353062A (en) | Image retrieval method, device and equipment | |
CN115098717A (en) | Three-dimensional model retrieval method and device, electronic equipment and storage medium | |
CN107274425A (en) | A kind of color image segmentation method and device based on Pulse Coupled Neural Network | |
CN113989504A (en) | Semantic segmentation method for three-dimensional point cloud data | |
CN113516090B (en) | Factory building scene recognition method and device, electronic equipment and storage medium | |
CN111860287A (en) | Target detection method and device and storage medium | |
CN112348011A (en) | Vehicle damage assessment method and device and storage medium | |
CN115225731B (en) | Online protocol identification method based on hybrid neural network | |
CN115937492A (en) | Transformer equipment infrared image identification method based on feature identification | |
CN115272284A (en) | Power transmission line defect identification method based on image quality evaluation | |
CN112905832B (en) | Complex background fine-grained image retrieval system and method | |
CN109522196A (en) | A kind of method and device of fault log processing | |
CN114913330A (en) | Point cloud component segmentation method and device, electronic equipment and storage medium | |
Sayed et al. | Point clouds reduction model based on 3D feature extraction | |
CN113470048A (en) | Scene segmentation method, device, equipment and computer readable storage medium | |
CN110737652B (en) | Data cleaning method and system for three-dimensional digital model of surface mine and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||