CN117333813A - Head direction recognition method and device, electronic equipment and readable storage medium

Info

Publication number
CN117333813A
Authority
CN
China
Prior art keywords
category
embedding
identified
feature vector
vector
Prior art date
Legal status
Pending
Application number
CN202311221147.9A
Other languages
Chinese (zh)
Inventor
元方
李华美
李宝政
Current Assignee
Bocom Smart Information Technology Co ltd
Original Assignee
Bocom Smart Information Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Bocom Smart Information Technology Co ltd
Priority to CN202311221147.9A
Publication of CN117333813A

Classifications

    • G06V20/52 Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G06T7/70 Determining position or orientation of objects or cameras
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G06V10/806 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G06T2207/20081 Training; Learning
    • G06T2207/20084 Artificial neural networks [ANN]
    • G06T2207/30232 Surveillance
    • Y02T10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Multimedia (AREA)
  • General Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a vehicle head direction recognition method and device, electronic equipment and a readable storage medium. An Embedding feature vector to be identified is first extracted from the picture to be detected by an Embedding model; the category of the picture to be detected is then identified by comparing the distances between that feature vector and a plurality of pre-extracted Embedding feature vectors. Recognition therefore does not depend on the license plate, high-accuracy recognition of the vehicle head direction is achieved without a large amount of sample data, and whether coal-stealing behavior has occurred can further be judged from the vehicle head direction.

Description

Head direction recognition method and device, electronic equipment and readable storage medium
Technical Field
The invention belongs to the technical field of computer vision, and particularly relates to a vehicle head direction recognition method and device, electronic equipment and a readable storage medium.
Background
Trucks stealing coal are a common problem at coal mines. The basis for judgment is the direction of the truck head observed at the mine gate: if the truck head faces the reverse direction, coal-stealing behavior can be determined.
The traditional approach is to observe the truck head direction by manual inspection, but manual inspection cannot cover 24 hours without blind spots, and misjudgment is likely at night.
If intelligent recognition is performed with a license plate camera combined with vision technology, it is difficult to distinguish the head from the tail when both carry license plates; moreover, when a truck engages in coal-stealing behavior, the driver may deliberately cover the license plate, or avoid or block the camera by other means.
If recognition does not rely on the license plate, the reverse-facing truck head is a rare scenario and sample data are relatively scarce, so the model cannot be trained with a large number of sample pictures and its recognition accuracy is low.
Disclosure of Invention
Based on the above technical problems, a vehicle head direction recognition method and device, electronic equipment and a readable storage medium are provided.
The technical scheme adopted by the invention is as follows:
as a first aspect of the present invention, there is provided a vehicle head direction recognition method including:
S101, inputting a picture to be detected into an Embedding model trained with vehicle head reverse pictures and vehicle head non-reverse pictures to obtain an Embedding feature vector to be identified, wherein the Embedding model is formed by replacing the last layer of the classification network of a classification neural network model with a fully connected layer;
S102, respectively calculating the distances between the Embedding feature vector to be identified and a plurality of pre-extracted Embedding feature vectors to obtain a plurality of vector distances, wherein the plurality of pre-extracted Embedding feature vectors are obtained by respectively inputting equal numbers of vehicle head reverse pictures and vehicle head non-reverse pictures into the Embedding model;
and S103, sorting the plurality of vector distances from small to large, and identifying the category of the Embedding feature vector to be identified through a heuristic nearest neighbor algorithm.
As a second aspect of the present invention, there is provided a vehicle head direction recognition apparatus comprising:
an Embedding feature vector extraction module, used for inputting a picture to be detected into an Embedding model trained with vehicle head reverse pictures and vehicle head non-reverse pictures to obtain an Embedding feature vector to be identified, wherein the Embedding model is formed by replacing the last layer of the classification network of a classification neural network model with a fully connected layer;
a vector distance calculation module, used for respectively calculating the distances between the Embedding feature vector to be identified and a plurality of pre-extracted Embedding feature vectors to obtain a plurality of vector distances, wherein the plurality of pre-extracted Embedding feature vectors are obtained by respectively inputting equal numbers of vehicle head reverse pictures and vehicle head non-reverse pictures into the Embedding model;
an identification module, used for sorting the plurality of vector distances from small to large and identifying the category of the Embedding feature vector to be identified through a heuristic nearest neighbor algorithm.
As a third aspect of the present invention, there is provided an electronic device comprising a memory module including instructions loaded and executed by a processor which, when executed, cause the processor to perform the vehicle head direction recognition method of the first aspect.
As a fourth aspect of the present invention, there is provided a computer-readable storage medium storing one or more programs which, when executed by a processor, implement the vehicle head direction recognition method of the first aspect.
According to the method, the Embedding feature vector to be identified is extracted from the picture to be detected by the Embedding model, and the category of the picture to be detected is then identified by comparing the distances between that feature vector and a plurality of pre-extracted Embedding feature vectors. Recognition does not depend on the license plate, high-accuracy recognition of the vehicle head direction is achieved without a large amount of sample data, and whether coal-stealing behavior has occurred can further be judged from the vehicle head direction.
Drawings
The invention is described in detail below with reference to the accompanying drawings and specific embodiments:
fig. 1 is a flowchart of a method for identifying a direction of a vehicle head according to an embodiment of the present invention;
fig. 2 is a schematic diagram of a vehicle head direction recognition device according to an embodiment of the present invention;
fig. 3 is a schematic diagram of an electronic device according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of the construction of an Embedding model according to an embodiment of the present invention.
Detailed Description
Embodiments of the present invention are described below with reference to the drawings. The embodiments described in this specification are not intended to be exhaustive or to represent the only embodiments of the present invention. The following examples are presented to illustrate the invention clearly and are not intended to limit its embodiments. It will be apparent to those skilled in the art that various changes and modifications can be made to the described embodiments, and all obvious changes or modifications within the spirit and scope of the invention are deemed to be within its scope.
The embodiment of the application provides a vehicle head direction recognition method applied at a mine site to recognize the head direction of trucks, so that whether coal-stealing behavior occurs can be judged from the head direction. As shown in fig. 1, the specific flow of the method is as follows:
s101, inputting the picture to be detected into an Embedding model trained by the reverse picture of the headstock and the non-reverse picture of the headstock, and obtaining an Embedding feature vector to be identified.
In this embodiment, the picture to be detected may be a picture acquired by a camera disposed at the gate of the mine. The reverse direction of the head refers to the opposite direction of travel of the truck head and the vehicle specified by the mine site.
The image feature vector may be understood as a feature of representing an image with a vector, and since in the embodiment, the model is not required to identify the type of the image to be detected, but only the image feature vector is extracted, a large number of sample images are not required to train the model.
A classification neural network model (such as the yolov8 classification neural network model) can only output a probability for each category. As shown in fig. 4, the last layer c1 of its classification network is replaced with a fully connected layer c1' to form the Embedding model, so that an Embedding feature vector is output instead.
To facilitate training, the number of nodes of the fully connected layer c1' is the same as that of the preceding layer c2 (the penultimate layer of the classification network). If the number of nodes of c1' were smaller than that of c2, the features would be compressed and the training effect affected; if it were larger, the features would be redundant and training convergence would be slow.
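For illustration only, the following is a minimal sketch of this layer replacement in PyTorch, assuming a generic pretrained classifier whose final classification layer is an nn.Linear stored under the attribute fc (the attribute and function names are assumptions, not taken from the patent):

    import torch.nn as nn

    def to_embedding_model(classifier: nn.Module) -> nn.Module:
        # Assumes the final classification layer c1 is an nn.Linear named `fc`.
        in_features = classifier.fc.in_features
        # The new fully connected layer c1' keeps the node count of the
        # preceding layer c2: fewer nodes would compress features, more
        # would add redundancy and slow convergence.
        classifier.fc = nn.Linear(in_features, in_features)
        return classifier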
In this embodiment, a pre-trained classification neural network model is used, such as a yolov8 classification model pre-trained on ImageNet.
A pre-trained model has been trained on massive data and has learned the general features in the data well; compared with training parameters from scratch on one's own data set, pre-trained parameters generally generalize better. On this basis, only the fully connected layer needs to be trained to optimize the model, so fewer sample data are required. The specific training process is as follows:
(a) Three pictures are selected as a group each time: a picture x_a is randomly selected from the vehicle head reverse picture set or the vehicle head non-reverse picture set, and then one picture is selected from each of the two sets, denoted x_p and x_n respectively, where a denotes the category of picture x_a, x_p denotes a picture of the same category as a, and x_n denotes a picture of a different category from a.
(b) The group of pictures is input into the Embedding model in turn to obtain three results f(x_a), f(x_p) and f(x_n);
(c) The three results are substituted into the triplet loss function:
L = max(||f(x_a) - f(x_p)||^2 - ||f(x_a) - f(x_n)||^2 + α, 0)
where α is a preset fault tolerance (margin) value, which may be set to 0.01.
(d) The parameters of the fully connected layer are optimized by back propagation.
After about 10 rounds of training, the model reaches a well-converged state, and the optimized Embedding model is obtained; it can then be used directly for inference.
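As a non-authoritative sketch of this training loop in PyTorch (the model layout from the previous sketch, the picture tensors, and all names here are illustrative assumptions, not the patent's implementation):

    import random
    import torch
    import torch.nn.functional as F

    def train_embedding(model, reverse_set, non_reverse_set, alpha=0.01, rounds=10):
        # Only the new fully connected layer is optimized; the pretrained
        # backbone parameters are left untouched.
        optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
        for _ in range(rounds):
            # (a) anchor x_a from either set; x_p from the same set, x_n from the other
            if random.random() < 0.5:
                same, other = reverse_set, non_reverse_set
            else:
                same, other = non_reverse_set, reverse_set
            x_a, x_p, x_n = random.choice(same), random.choice(same), random.choice(other)
            # (b) three forward passes through the Embedding model
            f_a, f_p, f_n = model(x_a), model(x_p), model(x_n)
            # (c) triplet loss with fault tolerance (margin) alpha
            loss = F.relu((f_a - f_p).pow(2).sum() - (f_a - f_n).pow(2).sum() + alpha)
            # (d) back propagation updates only the fully connected layer
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
        return model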
S102, respectively calculating the distances between the to-be-identified Embedding feature vector and the plurality of pre-extracted Embedding feature vectors to obtain a plurality of vector distances.
The plurality of pre-extracted Embedding feature vectors are obtained by respectively inputting equal numbers of vehicle head reverse pictures and vehicle head non-reverse pictures into the Embedding model.
For example, 10 representative, non-repeated vehicle head reverse pictures and 10 representative, non-repeated vehicle head non-reverse pictures may be input into the Embedding model, resulting in 20 pre-extracted Embedding feature vectors.
It should be noted that a feature vector can be represented as a point in the feature space, so the distance between the Embedding feature vector to be identified and a pre-extracted Embedding feature vector can be understood as the distance between two points. For example, the Euclidean distance between the two points may be calculated, which has good universality; of course, the distance between coordinate points in space may also be calculated in other ways.
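A minimal sketch of this distance computation with NumPy (variable and function names are illustrative):

    import numpy as np

    def vector_distances(query: np.ndarray, references: np.ndarray) -> np.ndarray:
        # Euclidean distance between the Embedding feature vector to be
        # identified (query) and each pre-extracted vector (one per row).
        return np.linalg.norm(references - query, axis=1)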
And S103, sorting a plurality of vector distances from small to large, and identifying the category of the to-be-identified Embedding feature vector through a heuristic nearest neighbor algorithm.
Since points of the same category lie closer together in the feature space, while points of different categories are more dispersed, the category of the picture to be detected can be identified by comparing the distances between points.
The heuristic nearest neighbor algorithm selects a value K, performs heuristic decision voting on the K points closest to the test sample, and takes the category with the most votes as the category of the test sample.
In this embodiment, K is set to 2: heuristic decision voting is performed on the two points closest to the point to be identified (the point corresponding to the Embedding feature vector to be identified) to determine its category, i.e. to identify the category of the Embedding feature vector to be identified. The specific process is as follows:
(a) The plurality of vector distances, sorted from small to large, are denoted d1, d2, d3, d4, ..., d_i, where i is the number of vector distances.
The category of each picture input into the Embedding model is taken as the category of the corresponding pre-extracted Embedding feature vector output by the Embedding model, and the category of each pre-extracted Embedding feature vector is taken as the category of the corresponding vector distance.
(b) If |d1 - d2| is greater than the threshold, i.e. d2 is much greater than d1, the category of d1 is taken as the category of the Embedding feature vector to be identified.
(c) If |d1 - d2| is less than or equal to the threshold, i.e. d1 and d2 do not differ much, and d1 and d2 belong to the same category, the category of d1 is taken as the category of the Embedding feature vector to be identified.
(d) If |d1 - d2| is less than or equal to the threshold and d1 and d2 do not belong to the same category, it is judged whether d3 and d4 belong to the same category. If they do, the vector distance of that same category is found among d1 and d2, and its category is taken as the category of the Embedding feature vector to be identified. If they do not, it is further judged which of d3 and d4 belongs to the same category as d1: if d3 and d1 belong to the same category, the category of d1 is taken; if d4 and d1 belong to the same category, the sum of d1 and d4 is compared with the sum of d2 and d3: if sum(d1, d4) < sum(d2, d3), the category of d1 is taken as the category of the Embedding feature vector to be identified, and if sum(d1, d4) > sum(d2, d3), the category of d2 is taken.
The category of the Embedding feature vector to be identified is the category of the picture to be detected; if that category is vehicle head reverse, coal-stealing behavior can be determined.
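A sketch of this K = 2 heuristic decision in plain Python, assuming each pre-extracted vector carries a category label (function and variable names are illustrative):

    def heuristic_knn(distances, categories, threshold):
        # Sort distances ascending and carry each distance's category along.
        order = sorted(range(len(distances)), key=lambda k: distances[k])
        d = [distances[k] for k in order]
        c = [categories[k] for k in order]
        # (b) d2 much greater than d1: trust the nearest neighbor
        if abs(d[0] - d[1]) > threshold:
            return c[0]
        # (c) d1 and d2 close together and of the same category
        if c[0] == c[1]:
            return c[0]
        # (d) d1 and d2 close but of different categories: consult d3 and d4
        if c[2] == c[3]:
            return c[0] if c[0] == c[2] else c[1]
        if c[2] == c[0]:  # d3 has the same category as d1
            return c[0]
        # d4 has the same category as d1: compare the distance sums
        return c[0] if d[0] + d[3] < d[1] + d[2] else c[1]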
It can be seen that step S103 classifies by comparing the distances between points; unlike classification by a neural network model, this requires no sample data.
In the following, 10 vehicle head reverse pictures and 10 vehicle head non-reverse pictures are taken as an example to illustrate how the above threshold is determined.
The 10 vehicle head reverse pictures are input into the Embedding model to obtain 10 pre-extracted Embedding feature vectors, defined as set 1; the 10 vehicle head non-reverse pictures are input into the Embedding model to obtain another 10 pre-extracted Embedding feature vectors, defined as set 2.
The elements of set 1 are paired in all combinations and the distance of each pair is calculated; these distances are averaged to obtain a first average. The same operation on set 2 gives a second average. Finally, the first and second averages are averaged to obtain the threshold.
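A sketch of this threshold computation (reusing the illustrative conventions of the earlier sketches; all names are assumptions):

    from itertools import combinations
    import numpy as np

    def intra_set_mean_distance(vectors) -> float:
        # Mean Euclidean distance over all pairwise combinations in one set.
        return float(np.mean([np.linalg.norm(a - b)
                              for a, b in combinations(vectors, 2)]))

    def decision_threshold(set1, set2) -> float:
        # Average of the two intra-set means, as described above.
        return (intra_set_mean_distance(set1) + intra_set_mean_distance(set2)) / 2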
As can be seen from the above, the method of this embodiment extracts the Embedding feature vector of the picture to be detected through the Embedding model and then compares its distances to a plurality of pre-extracted Embedding feature vectors to identify the category of the picture to be detected. It does not depend on license plate recognition, and achieves high-accuracy recognition of the vehicle head direction without a large amount of sample data.
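Putting the sketches together, a hypothetical end-to-end run with random stand-in embeddings (in practice these would come from the Embedding model; the data here is fabricated purely to exercise the illustrative helpers above):

    import numpy as np

    rng = np.random.default_rng(0)
    set1 = rng.normal(0.0, 1.0, size=(10, 128))   # stand-ins for 10 head-reverse vectors
    set2 = rng.normal(3.0, 1.0, size=(10, 128))   # stand-ins for 10 non-reverse vectors
    references = np.concatenate([set1, set2])
    categories = ["reverse"] * 10 + ["non-reverse"] * 10
    threshold = decision_threshold(set1, set2)

    query = rng.normal(0.0, 1.0, size=128)        # stand-in for the picture to detect
    distances = vector_distances(query, references)
    print(heuristic_knn(distances, categories, threshold))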
The vehicle head direction recognition device of one or more embodiments of the present invention is described in detail below. Those skilled in the art will appreciate that the recognition device can be configured from commercially available hardware components following the steps taught by the present solution. Fig. 2 shows a vehicle head direction recognition device provided by an embodiment of the present invention; as shown in fig. 2, the recognition device includes an Embedding feature vector extraction module 11, a vector distance calculation module 12 and an identification module 13.
The Embedding feature vector extraction module 11 is configured to input the picture to be detected into the Embedding model trained with vehicle head reverse pictures and vehicle head non-reverse pictures to obtain the Embedding feature vector to be identified.
In this embodiment, the picture to be detected, the construction of the Embedding model shown in fig. 4, the node count of the fully connected layer c1', and the triplet training process of the fully connected layer are the same as described for step S101 of the method embodiment above.
The vector distance calculation module 12 is configured to respectively calculate the distances between the Embedding feature vector to be identified and a plurality of pre-extracted Embedding feature vectors to obtain a plurality of vector distances.
The pre-extracted Embedding feature vectors and the calculation of the vector distances (for example, Euclidean distances) are the same as described for step S102 above.
The identification module 13 is configured to sort the plurality of vector distances from small to large and to identify the category of the Embedding feature vector to be identified through the heuristic nearest neighbor algorithm.
The heuristic decision voting with K = 2 and the determination of the threshold are the same as described for step S103 above.
In summary, the head direction recognition device provided in the foregoing embodiments may perform the head direction recognition method provided in the foregoing embodiments.
Following the same concept, the vehicle head direction recognition device shown in fig. 2 may be implemented as an electronic device; fig. 3 is a schematic block diagram of the structure of the electronic device according to an embodiment of the present invention.
Illustratively, the electronic device includes a memory module 21 and a processor 22. The memory module 21 includes instructions loaded and executed by the processor 22 which, when executed, cause the processor 22 to perform the steps of the vehicle head direction recognition method described in the exemplary embodiments above.
It should be appreciated that the processor 22 may be a central processing unit (Central Processing Unit, CPU); the processor 22 may also be another general purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general purpose processor may be a microprocessor, or any conventional processor.
Embodiments of the present invention also provide a computer-readable storage medium storing one or more programs that, when executed by a processor, implement the steps described in the foregoing description of various exemplary embodiments of a method for identifying a direction of a vehicle head.
Those of ordinary skill in the art will appreciate that all or some of the steps, systems, functional modules/units in the apparatus, and methods disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof. In a hardware implementation, the division between the functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be performed cooperatively by several physical components. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor, or microprocessor, or as hardware, or as an integrated circuit, such as an application specific integrated circuit. Such software may be distributed on computer-readable storage media, which may include computer-readable storage media (or non-transitory media) and communication media (or transitory media).
The term computer-readable storage medium includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information, such as computer-readable instructions, data structures, program modules or other data, as known to those skilled in the art. Computer-readable storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital Versatile Disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer. Furthermore, as is well known to those of ordinary skill in the art, communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media.
By way of example, the computer readable storage medium may be an internal storage module of the electronic device of the foregoing embodiments, such as a hard disk or a memory of the electronic device. The computer readable storage medium may also be an external storage device of the electronic device, such as a plug-in hard disk provided on the electronic device, a smart media card (Smart Media Card, SMC), a secure digital (Secure Digital, SD) card, a flash card (Flash Card), or the like.
The electronic device and the computer-readable storage medium provided in the foregoing embodiments extract the Embedding feature vector of the picture to be detected through the Embedding model, and then compare the distances between that feature vector and a plurality of pre-extracted Embedding feature vectors to identify the category of the picture to be detected. They do not depend on license plate recognition, and achieve high-accuracy recognition of the vehicle head direction without a large amount of sample data.
It will be apparent to those skilled in the art that various modifications and variations can be made in the present application without departing from the scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims and the equivalents thereof, the present application is intended to cover such modifications and variations.

Claims (10)

1. A method for identifying a direction of a vehicle head, comprising:
S101, inputting a picture to be detected into an Embedding model trained with vehicle head reverse pictures and vehicle head non-reverse pictures to obtain an Embedding feature vector to be identified, wherein the Embedding model is formed by replacing the last layer of the classification network of a classification neural network model with a fully connected layer;
S102, respectively calculating the distances between the Embedding feature vector to be identified and a plurality of pre-extracted Embedding feature vectors to obtain a plurality of vector distances, wherein the plurality of pre-extracted Embedding feature vectors are obtained by respectively inputting equal numbers of vehicle head reverse pictures and vehicle head non-reverse pictures into the Embedding model;
and S103, sorting the plurality of vector distances from small to large, and identifying the category of the Embedding feature vector to be identified through a heuristic nearest neighbor algorithm.
2. The method of claim 1, wherein the classification neural network model is a pre-trained classification neural network model.
3. The method for recognizing a vehicle head direction according to claim 2, wherein the training of the Embedding model comprises:
three pictures are selected as a group each time: randomly selecting a picture x from a headstock reverse picture set or a headstock non-reverse picture set a Selecting one picture from the headstock reverse picture set and the headstock non-reverse picture set respectively, wherein the pictures are respectively expressed as x p And x n Wherein a represents a picture x a Class x of (2) p Representing pictures of the same kind as a, x n Representing pictures of different classes from a;
inputting a group of pictures into an Embedding model in turn to obtain three results f (x) a )、f(x p ) And x n
Substituting the three results into a loss function:
wherein alpha is a preset fault tolerance value;
the parameters of the fully connected layer are optimized by back propagation.
4. The method for recognizing a vehicle head direction according to claim 3, wherein the pre-trained classification neural network model is a yolov8 classification neural network model pre-trained on ImageNet.
5. The method of claim 1, wherein the number of nodes in the fully connected layer is the same as the number of nodes in the immediately preceding layer.
6. The method for recognizing a vehicle head direction according to claim 1, wherein the calculating the distances between the Embedding feature vector to be identified and a plurality of pre-extracted Embedding feature vectors further comprises:
respectively calculating Euclidean distances between the Embedding feature vector to be identified and the plurality of pre-extracted Embedding feature vectors.
7. The method for recognizing a vehicle head direction according to claim 1, wherein said S103 further comprises:
ordering the plurality of vector distances from small to large, denoted d1, d2, d3, d4, ..., d_i, wherein i is the number of vector distances; taking the category of each picture input into the Embedding model as the category of the corresponding pre-extracted Embedding feature vector output by the Embedding model, and taking the category of each pre-extracted Embedding feature vector as the category of the corresponding vector distance;
if |d1 - d2| is greater than the threshold, taking the category of d1 as the category of the Embedding feature vector to be identified;
if |d1 - d2| is less than or equal to the threshold and d1 and d2 belong to the same category, taking the category of d1 as the category of the Embedding feature vector to be identified;
if |d1 - d2| is less than or equal to the threshold and d1 and d2 do not belong to the same category, judging whether d3 and d4 belong to the same category; if so, finding among d1 and d2 the vector distance of that same category and taking its category as the category of the Embedding feature vector to be identified; if not, further judging which of d3 and d4 belongs to the same category as d1: if d3 and d1 belong to the same category, taking the category of d1 as the category of the Embedding feature vector to be identified; if d4 and d1 belong to the same category, comparing the sum of d1 and d4 with the sum of d2 and d3, and if sum(d1, d4) < sum(d2, d3), taking the category of d1 as the category of the Embedding feature vector to be identified, and if sum(d1, d4) > sum(d2, d3), taking the category of d2 as the category of the Embedding feature vector to be identified.
8. A vehicle head direction recognition device, comprising:
an Embedding feature vector extraction module, used for inputting a picture to be detected into an Embedding model trained with vehicle head reverse pictures and vehicle head non-reverse pictures to obtain an Embedding feature vector to be identified, wherein the Embedding model is formed by replacing the last layer of the classification network of a classification neural network model with a fully connected layer;
a vector distance calculation module, used for respectively calculating the distances between the Embedding feature vector to be identified and a plurality of pre-extracted Embedding feature vectors to obtain a plurality of vector distances, wherein the plurality of pre-extracted Embedding feature vectors are obtained by respectively inputting equal numbers of vehicle head reverse pictures and vehicle head non-reverse pictures into the Embedding model;
an identification module, used for sorting the plurality of vector distances from small to large and identifying the category of the Embedding feature vector to be identified through a heuristic nearest neighbor algorithm.
9. An electronic device comprising a memory module including instructions loaded and executed by a processor, which when executed, cause the processor to perform a method of identifying a direction of a vehicle head according to any one of claims 1-7.
10. A computer readable storage medium storing one or more programs, which when executed by a processor, implement a method of head direction identification as claimed in any one of claims 1 to 7.
CN202311221147.9A, filed 2023-09-21 (priority 2023-09-21): Head direction recognition method and device, electronic equipment and readable storage medium. Status: Pending. Publication: CN117333813A.

Priority Applications (1)

Application Number: CN202311221147.9A
Priority Date: 2023-09-21
Filing Date: 2023-09-21
Title: Head direction recognition method and device, electronic equipment and readable storage medium

Applications Claiming Priority (1)

Application Number: CN202311221147.9A
Priority Date: 2023-09-21
Filing Date: 2023-09-21
Title: Head direction recognition method and device, electronic equipment and readable storage medium

Publications (1)

Publication Number: CN117333813A
Publication Date: 2024-01-02

Family

ID=89289466

Family Applications (1)

Application Number: CN202311221147.9A
Title: Head direction recognition method and device, electronic equipment and readable storage medium
Priority Date: 2023-09-21
Filing Date: 2023-09-21
Status: Pending

Country Status (1)

Country: CN
Publication: CN117333813A


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination