CN117333813A - Head direction recognition method and device, electronic equipment and readable storage medium

Info

Publication number
CN117333813A
Authority
CN
China
Prior art keywords
category
embedding
identified
feature vector
vector
Prior art date
Legal status
Pending
Application number
CN202311221147.9A
Other languages
Chinese (zh)
Inventor
元方
李华美
李宝政
Current Assignee
Bocom Smart Information Technology Co ltd
Original Assignee
Bocom Smart Information Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Bocom Smart Information Technology Co ltd
Priority to CN202311221147.9A
Publication of CN117333813A

Classifications

    • G06V20/52 Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G06T7/70 Determining position or orientation of objects or cameras
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G06V10/806 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G06T2207/20081 Training; Learning
    • G06T2207/20084 Artificial neural networks [ANN]
    • G06T2207/30232 Surveillance
    • Y02T10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Multimedia (AREA)
  • General Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a vehicle head direction recognition method and device, electronic equipment and a readable storage medium. An Embedding feature vector to be identified is first extracted from the picture to be detected by an Embedding model; the category of the picture to be detected is then identified by comparing the distances between that feature vector and a plurality of pre-extracted Embedding feature vectors. Recognition therefore does not depend on the license plate, high-accuracy recognition of the vehicle head direction is achieved without a large amount of sample data, and whether coal-stealing behavior has occurred can further be judged from the vehicle head direction.

Description

Head direction recognition method and device, electronic equipment and readable storage medium
Technical Field
The invention belongs to the technical field of computer vision, and particularly relates to a vehicle head direction recognition method and device, electronic equipment and a readable storage medium.
Background
Trucks stealing coal are a common problem at coal mines. The basis for judgment is the direction of the truck head observed at the mine gate: if the truck head faces the reverse direction, coal-stealing behavior can be determined.
The traditional approach is to observe the truck head direction by manual inspection, but manual inspection cannot cover 24 hours without blind spots, and misjudgment is likely at night.
If intelligent recognition is performed with a license plate camera combined with vision technology, it is difficult to distinguish the head from the tail when both carry license plates; moreover, when a truck engages in coal-stealing behavior, the driver may deliberately cover the license plate, or avoid or block the camera by other means.
If recognition does not rely on the license plate, the reverse-facing truck head is a rare scenario and sample data are relatively scarce, so the model cannot be trained with a large number of sample pictures and its recognition accuracy is low.
Disclosure of Invention
Based on the above technical problems, a vehicle head direction recognition method and device, electronic equipment and a readable storage medium are provided.
The technical scheme adopted by the invention is as follows:
as a first aspect of the present invention, there is provided a vehicle head direction recognition method including:
S101, inputting a picture to be detected into an Embedding model trained with vehicle head reverse pictures and vehicle head non-reverse pictures to obtain an Embedding feature vector to be identified, wherein the Embedding model is formed by replacing the last layer of the classification network of a classification neural network model with a fully connected layer;
S102, respectively calculating the distances between the Embedding feature vector to be identified and a plurality of pre-extracted Embedding feature vectors to obtain a plurality of vector distances, wherein the plurality of pre-extracted Embedding feature vectors are obtained by respectively inputting equal numbers of vehicle head reverse pictures and vehicle head non-reverse pictures into the Embedding model;
and S103, sorting the plurality of vector distances from small to large, and identifying the category of the Embedding feature vector to be identified through a heuristic nearest neighbor algorithm.
As a second aspect of the present invention, there is provided a vehicle head direction recognition apparatus comprising:
an Embedding feature vector extraction module, used for inputting a picture to be detected into an Embedding model trained with vehicle head reverse pictures and vehicle head non-reverse pictures to obtain an Embedding feature vector to be identified, wherein the Embedding model is formed by replacing the last layer of the classification network of a classification neural network model with a fully connected layer;
a vector distance calculation module, used for respectively calculating the distances between the Embedding feature vector to be identified and a plurality of pre-extracted Embedding feature vectors to obtain a plurality of vector distances, wherein the plurality of pre-extracted Embedding feature vectors are obtained by respectively inputting equal numbers of vehicle head reverse pictures and vehicle head non-reverse pictures into the Embedding model;
an identification module, used for sorting the plurality of vector distances from small to large and identifying the category of the Embedding feature vector to be identified through a heuristic nearest neighbor algorithm.
As a third aspect of the present invention, there is provided an electronic device comprising a memory module including instructions loaded and executed by a processor which, when executed, cause the processor to perform the vehicle head direction recognition method of the first aspect.
As a fourth aspect of the present invention, there is provided a computer-readable storage medium storing one or more programs which, when executed by a processor, implement the vehicle head direction recognition method of the first aspect.
According to the method, the Embedding feature vector to be identified is extracted from the picture to be detected by the Embedding model, and the category of the picture to be detected is then identified by comparing the distances between that feature vector and a plurality of pre-extracted Embedding feature vectors. Recognition does not depend on the license plate, high-accuracy recognition of the vehicle head direction is achieved without a large amount of sample data, and whether coal-stealing behavior has occurred can further be judged from the vehicle head direction.
Drawings
The invention is described in detail below with reference to the accompanying drawings and specific embodiments:
fig. 1 is a flowchart of a method for identifying a direction of a vehicle head according to an embodiment of the present invention;
fig. 2 is a schematic diagram of a vehicle head direction recognition device according to an embodiment of the present invention;
fig. 3 is a schematic diagram of an electronic device according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of the construction of an Embedding model according to an embodiment of the present invention.
Detailed Description
Embodiments of the present invention are described below with reference to the drawings. The embodiments described in this specification are not intended to be exhaustive or to represent the only embodiments of the present invention. The following examples are presented to illustrate the invention clearly and are not intended to limit its embodiments. It will be apparent to those skilled in the art that various changes and modifications can be made to the described embodiments, and all obvious changes or modifications within the spirit and scope of the invention are deemed to be within its scope.
The embodiment of the application provides a vehicle head direction recognition method applied at a mine site to recognize the head direction of trucks, so that whether coal-stealing behavior occurs can be judged from the head direction. As shown in fig. 1, the specific flow of the method is as follows:
s101, inputting the picture to be detected into an Embedding model trained by the reverse picture of the headstock and the non-reverse picture of the headstock, and obtaining an Embedding feature vector to be identified.
In this embodiment, the picture to be detected may be a picture acquired by a camera disposed at the gate of the mine. The reverse direction of the head refers to the opposite direction of travel of the truck head and the vehicle specified by the mine site.
The image feature vector may be understood as a feature of representing an image with a vector, and since in the embodiment, the model is not required to identify the type of the image to be detected, but only the image feature vector is extracted, a large number of sample images are not required to train the model.
A classification neural network model (such as the yolov8 classification neural network model) can only output a probability for each category. As shown in fig. 4, the last layer c1 of its classification network is replaced with a fully connected layer c1' to form the Embedding model, so that an Embedding feature vector is output instead.
To facilitate training, the number of nodes of the fully connected layer c1' is the same as that of the preceding layer c2 (the penultimate layer of the classification network). If the number of nodes of c1' were smaller than that of c2, the features would be compressed and the training effect affected; if it were larger, the features would be redundant and training convergence would be slow.
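For illustration only, the following is a minimal sketch of this layer replacement in PyTorch, assuming a generic pretrained classifier whose final classification layer is an nn.Linear stored under the attribute fc (the attribute and function names are assumptions, not taken from the patent):

    import torch.nn as nn

    def to_embedding_model(classifier: nn.Module) -> nn.Module:
        # Assumes the final classification layer c1 is an nn.Linear named `fc`.
        in_features = classifier.fc.in_features
        # The new fully connected layer c1' keeps the node count of the
        # preceding layer c2: fewer nodes would compress features, more
        # would add redundancy and slow convergence.
        classifier.fc = nn.Linear(in_features, in_features)
        return classifier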
In this embodiment, a pre-trained classification neural network model is used, such as a yolov8 classification model pre-trained on ImageNet.
A pre-trained model has been trained on massive data and has learned the general features in the data well; compared with training parameters from scratch on one's own data set, pre-trained parameters generally generalize better. On this basis, only the fully connected layer needs to be trained to optimize the model, so fewer sample data are required. The specific training process is as follows:
(a) Three pictures are selected as a group each time: a picture x_a is randomly selected from the vehicle head reverse picture set or the vehicle head non-reverse picture set, and then one picture is selected from each of the two sets, denoted x_p and x_n respectively, where a denotes the category of picture x_a, x_p denotes a picture of the same category as a, and x_n denotes a picture of a different category from a.
(b) The group of pictures is input into the Embedding model in turn to obtain three results f(x_a), f(x_p) and f(x_n);
(c) The three results are substituted into the triplet loss function:
L = max(||f(x_a) - f(x_p)||^2 - ||f(x_a) - f(x_n)||^2 + α, 0)
where α is a preset fault tolerance (margin) value, which may be set to 0.01.
(d) The parameters of the fully connected layer are optimized by back propagation.
After about 10 rounds of training, the model reaches a well-converged state, and the optimized Embedding model is obtained; it can then be used directly for inference.
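As a non-authoritative sketch of this training loop in PyTorch (the model layout from the previous sketch, the picture tensors, and all names here are illustrative assumptions, not the patent's implementation):

    import random
    import torch
    import torch.nn.functional as F

    def train_embedding(model, reverse_set, non_reverse_set, alpha=0.01, rounds=10):
        # Only the new fully connected layer is optimized; the pretrained
        # backbone parameters are left untouched.
        optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
        for _ in range(rounds):
            # (a) anchor x_a from either set; x_p from the same set, x_n from the other
            if random.random() < 0.5:
                same, other = reverse_set, non_reverse_set
            else:
                same, other = non_reverse_set, reverse_set
            x_a, x_p, x_n = random.choice(same), random.choice(same), random.choice(other)
            # (b) three forward passes through the Embedding model
            f_a, f_p, f_n = model(x_a), model(x_p), model(x_n)
            # (c) triplet loss with fault tolerance (margin) alpha
            loss = F.relu((f_a - f_p).pow(2).sum() - (f_a - f_n).pow(2).sum() + alpha)
            # (d) back propagation updates only the fully connected layer
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
        return model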
S102, respectively calculating the distances between the to-be-identified Embedding feature vector and the plurality of pre-extracted Embedding feature vectors to obtain a plurality of vector distances.
The plurality of pre-extracted Embedding feature vectors are obtained by respectively inputting equal numbers of vehicle head reverse pictures and vehicle head non-reverse pictures into the Embedding model.
For example, 10 representative, non-repeated vehicle head reverse pictures and 10 representative, non-repeated vehicle head non-reverse pictures may be input into the Embedding model, resulting in 20 pre-extracted Embedding feature vectors.
It should be noted that a feature vector can be represented as a point in the feature space, so the distance between the Embedding feature vector to be identified and a pre-extracted Embedding feature vector can be understood as the distance between two points. For example, the Euclidean distance between the two points may be calculated, which has good universality; of course, the distance between coordinate points in space may also be calculated in other ways.
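A minimal sketch of this distance computation with NumPy (variable and function names are illustrative):

    import numpy as np

    def vector_distances(query: np.ndarray, references: np.ndarray) -> np.ndarray:
        # Euclidean distance between the Embedding feature vector to be
        # identified (query) and each pre-extracted vector (one per row).
        return np.linalg.norm(references - query, axis=1)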
And S103, sorting a plurality of vector distances from small to large, and identifying the category of the to-be-identified Embedding feature vector through a heuristic nearest neighbor algorithm.
Since points of the same category lie closer together in the feature space, while points of different categories are more dispersed, the category of the picture to be detected can be identified by comparing the distances between points.
The heuristic nearest neighbor algorithm selects a value K, performs heuristic decision voting on the K points closest to the test sample, and takes the category with the most votes as the category of the test sample.
In this embodiment, K is set to 2: heuristic decision voting is performed on the two points closest to the point to be identified (the point corresponding to the Embedding feature vector to be identified) to determine its category, i.e. to identify the category of the Embedding feature vector to be identified. The specific process is as follows:
(a) The plurality of vector distances, sorted from small to large, are denoted d1, d2, d3, d4, ..., d_i, where i is the number of vector distances.
The category of each picture input into the Embedding model is taken as the category of the corresponding pre-extracted Embedding feature vector output by the Embedding model, and the category of each pre-extracted Embedding feature vector is taken as the category of the corresponding vector distance.
(b) If |d1 - d2| is greater than the threshold, i.e. d2 is much greater than d1, the category of d1 is taken as the category of the Embedding feature vector to be identified.
(c) If |d1 - d2| is less than or equal to the threshold, i.e. d1 and d2 do not differ much, and d1 and d2 belong to the same category, the category of d1 is taken as the category of the Embedding feature vector to be identified.
(d) If |d1 - d2| is less than or equal to the threshold and d1 and d2 do not belong to the same category, it is judged whether d3 and d4 belong to the same category. If they do, the vector distance of that same category is found among d1 and d2, and its category is taken as the category of the Embedding feature vector to be identified. If they do not, it is further judged which of d3 and d4 belongs to the same category as d1: if d3 and d1 belong to the same category, the category of d1 is taken; if d4 and d1 belong to the same category, the sum of d1 and d4 is compared with the sum of d2 and d3: if sum(d1, d4) < sum(d2, d3), the category of d1 is taken as the category of the Embedding feature vector to be identified, and if sum(d1, d4) > sum(d2, d3), the category of d2 is taken.
The category of the Embedding feature vector to be identified is the category of the picture to be detected; if that category is vehicle head reverse, coal-stealing behavior can be determined.
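A sketch of this K = 2 heuristic decision in plain Python, assuming each pre-extracted vector carries a category label (function and variable names are illustrative):

    def heuristic_knn(distances, categories, threshold):
        # Sort distances ascending and carry each distance's category along.
        order = sorted(range(len(distances)), key=lambda k: distances[k])
        d = [distances[k] for k in order]
        c = [categories[k] for k in order]
        # (b) d2 much greater than d1: trust the nearest neighbor
        if abs(d[0] - d[1]) > threshold:
            return c[0]
        # (c) d1 and d2 close together and of the same category
        if c[0] == c[1]:
            return c[0]
        # (d) d1 and d2 close but of different categories: consult d3 and d4
        if c[2] == c[3]:
            return c[0] if c[0] == c[2] else c[1]
        if c[2] == c[0]:  # d3 has the same category as d1
            return c[0]
        # d4 has the same category as d1: compare the distance sums
        return c[0] if d[0] + d[3] < d[1] + d[2] else c[1]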
It can be seen that step S103 classifies by comparing the distances between points; unlike classification by a neural network model, this requires no sample data.
In the following, 10 vehicle head reverse pictures and 10 vehicle head non-reverse pictures are taken as an example to illustrate how the above threshold is determined.
The 10 vehicle head reverse pictures are input into the Embedding model to obtain 10 pre-extracted Embedding feature vectors, defined as set 1; the 10 vehicle head non-reverse pictures are input into the Embedding model to obtain another 10 pre-extracted Embedding feature vectors, defined as set 2.
The elements of set 1 are paired in all combinations and the distance of each pair is calculated; these distances are averaged to obtain a first average. The same operation on set 2 gives a second average. Finally, the first and second averages are averaged to obtain the threshold.
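A sketch of this threshold computation (reusing the illustrative conventions of the earlier sketches; all names are assumptions):

    from itertools import combinations
    import numpy as np

    def intra_set_mean_distance(vectors) -> float:
        # Mean Euclidean distance over all pairwise combinations in one set.
        return float(np.mean([np.linalg.norm(a - b)
                              for a, b in combinations(vectors, 2)]))

    def decision_threshold(set1, set2) -> float:
        # Average of the two intra-set means, as described above.
        return (intra_set_mean_distance(set1) + intra_set_mean_distance(set2)) / 2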
As can be seen from the above, the method of this embodiment extracts the Embedding feature vector of the picture to be detected through the Embedding model and then compares its distances to a plurality of pre-extracted Embedding feature vectors to identify the category of the picture to be detected. It does not depend on license plate recognition, and achieves high-accuracy recognition of the vehicle head direction without a large amount of sample data.
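Putting the sketches together, a hypothetical end-to-end run with random stand-in embeddings (in practice these would come from the Embedding model; the data here is fabricated purely to exercise the illustrative helpers above):

    import numpy as np

    rng = np.random.default_rng(0)
    set1 = rng.normal(0.0, 1.0, size=(10, 128))   # stand-ins for 10 head-reverse vectors
    set2 = rng.normal(3.0, 1.0, size=(10, 128))   # stand-ins for 10 non-reverse vectors
    references = np.concatenate([set1, set2])
    categories = ["reverse"] * 10 + ["non-reverse"] * 10
    threshold = decision_threshold(set1, set2)

    query = rng.normal(0.0, 1.0, size=128)        # stand-in for the picture to detect
    distances = vector_distances(query, references)
    print(heuristic_knn(distances, categories, threshold))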
The vehicle head direction recognition device of one or more embodiments of the present invention is described in detail below. Those skilled in the art will appreciate that the recognition device can be configured from commercially available hardware components following the steps taught by the present solution. Fig. 2 shows a vehicle head direction recognition device provided by an embodiment of the present invention; as shown in fig. 2, the recognition device includes an Embedding feature vector extraction module 11, a vector distance calculation module 12 and an identification module 13.
The Embedding feature vector extraction module 11 is configured to input the picture to be detected into the Embedding model trained with vehicle head reverse pictures and vehicle head non-reverse pictures to obtain the Embedding feature vector to be identified.
In this embodiment, the picture to be detected, the construction of the Embedding model shown in fig. 4, the node count of the fully connected layer c1', and the triplet training process of the fully connected layer are the same as described for step S101 of the method embodiment above.
The vector distance calculation module 12 is configured to respectively calculate the distances between the Embedding feature vector to be identified and a plurality of pre-extracted Embedding feature vectors to obtain a plurality of vector distances.
The pre-extracted Embedding feature vectors and the calculation of the vector distances (for example, Euclidean distances) are the same as described for step S102 above.
The identification module 13 is configured to sort the plurality of vector distances from small to large and to identify the category of the Embedding feature vector to be identified through the heuristic nearest neighbor algorithm.
The heuristic decision voting with K = 2 and the determination of the threshold are the same as described for step S103 above.
In summary, the head direction recognition device provided in the foregoing embodiments may perform the head direction recognition method provided in the foregoing embodiments.
Following the same concept, the vehicle head direction recognition device shown in fig. 2 may be implemented as an electronic device; fig. 3 is a schematic block diagram of the structure of the electronic device according to an embodiment of the present invention.
Illustratively, the electronic device includes a memory module 21 and a processor 22. The memory module 21 includes instructions loaded and executed by the processor 22 which, when executed, cause the processor 22 to perform the steps of the vehicle head direction recognition method described in the exemplary embodiments above.
It should be appreciated that the processor 22 may be a central processing unit (Central Processing Unit, CPU); the processor 22 may also be another general purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general purpose processor may be a microprocessor, or any conventional processor.
Embodiments of the present invention also provide a computer-readable storage medium storing one or more programs that, when executed by a processor, implement the steps described in the foregoing description of various exemplary embodiments of a method for identifying a direction of a vehicle head.
Those of ordinary skill in the art will appreciate that all or some of the steps, systems, functional modules/units in the apparatus, and methods disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof. In a hardware implementation, the division between the functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be performed cooperatively by several physical components. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor, or microprocessor, or as hardware, or as an integrated circuit, such as an application specific integrated circuit. Such software may be distributed on computer-readable storage media, which may include computer-readable storage media (or non-transitory media) and communication media (or transitory media).
The term computer-readable storage medium includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information, such as computer-readable instructions, data structures, program modules or other data, as known to those skilled in the art. Computer-readable storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital Versatile Disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer. Furthermore, as is well known to those of ordinary skill in the art, communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media.
By way of example, the computer readable storage medium may be an internal storage module of the electronic device of the foregoing embodiments, such as a hard disk or a memory of the electronic device. The computer readable storage medium may also be an external storage device of the electronic device, such as a plug-in hard disk provided on the electronic device, a smart media card (Smart Media Card, SMC), a secure digital (Secure Digital, SD) card, a flash card (Flash Card), or the like.
The electronic device and the computer-readable storage medium provided in the foregoing embodiments extract the Embedding feature vector of the picture to be detected through the Embedding model, and then compare the distances between that feature vector and a plurality of pre-extracted Embedding feature vectors to identify the category of the picture to be detected. They do not depend on license plate recognition, and achieve high-accuracy recognition of the vehicle head direction without a large amount of sample data.
It will be apparent to those skilled in the art that various modifications and variations can be made in the present application without departing from the scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims and the equivalents thereof, the present application is intended to cover such modifications and variations.

Claims (10)

1. A method for identifying a direction of a vehicle head, comprising:
S101, inputting a picture to be detected into an Embedding model trained with vehicle head reverse pictures and vehicle head non-reverse pictures to obtain an Embedding feature vector to be identified, wherein the Embedding model is formed by replacing the last layer of the classification network of a classification neural network model with a fully connected layer;
S102, respectively calculating the distances between the Embedding feature vector to be identified and a plurality of pre-extracted Embedding feature vectors to obtain a plurality of vector distances, wherein the plurality of pre-extracted Embedding feature vectors are obtained by respectively inputting equal numbers of vehicle head reverse pictures and vehicle head non-reverse pictures into the Embedding model;
and S103, sorting the plurality of vector distances from small to large, and identifying the category of the Embedding feature vector to be identified through a heuristic nearest neighbor algorithm.
2. The method of claim 1, wherein the classification neural network model is a pre-trained classification neural network model.
3. The method for recognizing a vehicle head direction according to claim 2, wherein the training of the Embedding model comprises:
three pictures are selected as a group each time: randomly selecting a picture x from a headstock reverse picture set or a headstock non-reverse picture set a Selecting one picture from the headstock reverse picture set and the headstock non-reverse picture set respectively, wherein the pictures are respectively expressed as x p And x n Wherein a represents a picture x a Class x of (2) p Representing pictures of the same kind as a, x n Representing pictures of different classes from a;
inputting a group of pictures into an Embedding model in turn to obtain three results f (x) a )、f(x p ) And x n
Substituting the three results into a loss function:
wherein alpha is a preset fault tolerance value;
the parameters of the fully connected layer are optimized by back propagation.
4. The method for recognizing a vehicle head direction according to claim 3, wherein the pre-trained classification neural network model is a yolov8 classification neural network model pre-trained on ImageNet.
5. The method of claim 1, wherein the number of nodes in the fully connected layer is the same as the number of nodes in the immediately preceding layer.
6. The method for recognizing a vehicle head direction according to claim 1, wherein the calculating the distances between the Embedding feature vector to be identified and a plurality of pre-extracted Embedding feature vectors further comprises:
respectively calculating Euclidean distances between the Embedding feature vector to be identified and the plurality of pre-extracted Embedding feature vectors.
7. The method for recognizing a vehicle head direction according to claim 1, wherein said S103 further comprises:
ordering the plurality of vector distances from small to large, denoted d1, d2, d3, d4, ..., d_i, wherein i is the number of vector distances; taking the category of each picture input into the Embedding model as the category of the corresponding pre-extracted Embedding feature vector output by the Embedding model, and taking the category of each pre-extracted Embedding feature vector as the category of the corresponding vector distance;
if |d1 - d2| is greater than the threshold, taking the category of d1 as the category of the Embedding feature vector to be identified;
if |d1 - d2| is less than or equal to the threshold and d1 and d2 belong to the same category, taking the category of d1 as the category of the Embedding feature vector to be identified;
if |d1 - d2| is less than or equal to the threshold and d1 and d2 do not belong to the same category, judging whether d3 and d4 belong to the same category; if so, finding among d1 and d2 the vector distance of that same category and taking its category as the category of the Embedding feature vector to be identified; if not, further judging which of d3 and d4 belongs to the same category as d1: if d3 and d1 belong to the same category, taking the category of d1 as the category of the Embedding feature vector to be identified; if d4 and d1 belong to the same category, comparing the sum of d1 and d4 with the sum of d2 and d3, and if sum(d1, d4) < sum(d2, d3), taking the category of d1 as the category of the Embedding feature vector to be identified, and if sum(d1, d4) > sum(d2, d3), taking the category of d2 as the category of the Embedding feature vector to be identified.
8. A vehicle head direction recognition device, comprising:
an Embedding feature vector extraction module, used for inputting a picture to be detected into an Embedding model trained with vehicle head reverse pictures and vehicle head non-reverse pictures to obtain an Embedding feature vector to be identified, wherein the Embedding model is formed by replacing the last layer of the classification network of a classification neural network model with a fully connected layer;
a vector distance calculation module, used for respectively calculating the distances between the Embedding feature vector to be identified and a plurality of pre-extracted Embedding feature vectors to obtain a plurality of vector distances, wherein the plurality of pre-extracted Embedding feature vectors are obtained by respectively inputting equal numbers of vehicle head reverse pictures and vehicle head non-reverse pictures into the Embedding model;
an identification module, used for sorting the plurality of vector distances from small to large and identifying the category of the Embedding feature vector to be identified through a heuristic nearest neighbor algorithm.
9. An electronic device comprising a memory module including instructions loaded and executed by a processor, which when executed, cause the processor to perform a method of identifying a direction of a vehicle head according to any one of claims 1-7.
10. A computer readable storage medium storing one or more programs, which when executed by a processor, implement a method of head direction identification as claimed in any one of claims 1 to 7.
CN202311221147.9A, filed 2023-09-21 (priority 2023-09-21): Head direction recognition method and device, electronic equipment and readable storage medium. Status: Pending. Publication: CN117333813A.

Priority Applications (1)

Application Number: CN202311221147.9A
Priority Date: 2023-09-21
Filing Date: 2023-09-21
Title: Head direction recognition method and device, electronic equipment and readable storage medium

Applications Claiming Priority (1)

Application Number: CN202311221147.9A
Priority Date: 2023-09-21
Filing Date: 2023-09-21
Title: Head direction recognition method and device, electronic equipment and readable storage medium

Publications (1)

Publication Number: CN117333813A
Publication Date: 2024-01-02

Family

ID=89289466

Family Applications (1)

Application Number: CN202311221147.9A
Title: Head direction recognition method and device, electronic equipment and readable storage medium
Priority Date: 2023-09-21
Filing Date: 2023-09-21
Status: Pending

Country Status (1)

Country: CN
Publication: CN117333813A


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination