CN111400522B

CN111400522B - Traffic sign recognition method, training method and equipment

Info

Publication number: CN111400522B
Application number: CN202010356150.1A
Authority: CN
Inventors: 王建强; 顾友良; 周培新
Original assignee: Guangzhou Ziweiyun Technology Co ltd
Current assignee: Guangzhou Ziweiyun Technology Co ltd
Priority date: 2020-04-29
Filing date: 2020-04-29
Publication date: 2021-06-11
Anticipated expiration: 2040-04-29
Also published as: CN111400522A

Abstract

The invention discloses a training method for traffic sign recognition. The method uses a feature index summary table, which comprises one or more categories, a first feature extractor and the category features extracted by the first feature extractor, together with one or more feature index sub-tables. Each feature index sub-table corresponds to a category in the feature index summary table and comprises one or more sub-categories, a second feature extractor and the category features extracted by the second feature extractor. A first feature of an image is extracted using the first feature extractor; if the similarity between the first feature and the categories in the feature index summary table is smaller than a preset value, the category corresponding to the image and the feature of the image are added to the feature index summary table. The whole feature extractor therefore does not need to be retrained, and the overall training time is greatly reduced.

Description

Traffic sign recognition method, training method and equipment

Technical Field

The invention relates to the field of artificial intelligence, in particular to a traffic sign identification method, a training method and equipment.

Background

At present, traffic signs are mainly recognized by means of deep learning. However, it is difficult for a deep-learning-based traffic sign recognition method to label all categories in advance, so during development the model has to be updated and optimized continuously as the required categories increase. Existing deep-learning-based methods have a fixed number of output classes; to increase the number of recognized classes, the network structure must be modified and the model must be completely retrained and optimized, which is costly in both time and training resources.

Therefore, the training and optimization cost of existing deep-learning-based traffic sign recognition methods is not economical.

The background description provided herein is for the purpose of generally presenting the context of the disclosure. Unless otherwise indicated herein, the material described in this section is not prior art to the claims in this application and is not admitted to be prior art by inclusion in this section.

Disclosure of Invention

In view of the above technical problems in the related art, the present invention provides a training method for traffic sign recognition. The method uses a feature index summary table, which includes one or more categories, a first feature extractor and the category features extracted by the first feature extractor; and one or more feature index sub-tables, each of which corresponds to a category in the feature index summary table and includes one or more sub-categories, a second feature extractor and the category features extracted by the second feature extractor. The method comprises the following steps:

S1, acquiring an image including a traffic sign of a category to be added;

S2, extracting features of the image using the first feature extractor corresponding to the feature index summary table to obtain a first feature of the image;

S3, calculating the similarity between the first feature and the categories in the feature index summary table;

S4, if the similarity is less than a preset value, adding the category corresponding to the image, together with the features extracted from the image by the first feature extractor, to the feature index summary table.

When a new recognition category needs to be added, particularly a traffic sign category with a large class spacing from the existing categories, the features of the picture of the traffic sign to be added are extracted once using the feature extractor of the summary table, and the resulting features are added to the feature index summary table. The whole feature extractor therefore does not need to be retrained, and the overall training time is greatly reduced.
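Steps S1 to S4 can be sketched as follows. This is a minimal illustration and not the patented implementation: the function names, the dictionary layout of the summary table and the default preset value of 0.8 are assumptions, and the first feature extractor is passed in as an arbitrary callable.

```python
import math
from typing import Callable, Dict, List, Sequence

Feature = List[float]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def add_category_if_new(
    summary_table: Dict[str, Feature],
    first_extractor: Callable[[object], Feature],
    image: object,
    label: str,
    preset: float = 0.8,  # assumed value; the source only says "a preset value"
) -> bool:
    """Add (label, feature) to the summary table when no stored category is similar."""
    feature = first_extractor(image)  # S2: extract the first feature
    # S3: highest similarity to any stored category (-1.0 for an empty table)
    best = max(
        (cosine_similarity(feature, ref) for ref in summary_table.values()),
        default=-1.0,
    )
    if best < preset:  # S4: far from every known category, so store it
        summary_table[label] = feature
        return True
    return False
```

Note that the extractor itself is never modified here; only the table grows, which is the source of the training-time saving described above.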

Further, in order to support the recognition of sub-categories, step S3 further includes:

S31, selecting a category whose similarity is larger than the preset value as a similar category;

S32, judging whether the similar category has sub-categories;

S33, if not, creating a feature index sub-table and a feature extractor corresponding to the similar category;

S34, if so, rebuilding a feature extractor for the image together with all categories in the feature index sub-table, building a new feature index sub-table from the image, all categories in the existing feature index sub-table and the rebuilt feature extractor, and replacing the existing feature index sub-table with the new feature index sub-table.

By providing the feature index summary table and the feature index sub-tables, when recognition categories need to be added later, the trained model does not need to be replaced; only the feature index summary table or a feature index sub-table needs to be updated, which greatly reduces the optimization cost and the training time.
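The branch in steps S32 to S34 might look like the sketch below. The `SubTable` class, the `images` dictionary holding one reference image per category, and the caller-supplied `train_extractor` function are all hypothetical; the source does not specify how the sub-table extractor is trained or how reference images are kept.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Feature = List[float]
Extractor = Callable[[object], Feature]


@dataclass
class SubTable:
    extractor: Extractor  # the "second feature extractor"
    features: Dict[str, Feature] = field(default_factory=dict)


def update_sub_table(
    sub_tables: Dict[str, SubTable],   # keyed by summary-table category
    images: Dict[str, object],         # assumed: one reference image per category
    similar_category: str,             # result of S31
    new_label: str,
    new_image: object,
    train_extractor: Callable[[Dict[str, object]], Extractor],  # hypothetical trainer
) -> SubTable:
    existing = sub_tables.get(similar_category)  # S32
    if existing is None:
        # S33: first sub-category, so start from the representative category
        labelled = {similar_category: images[similar_category]}
    else:
        # S34: keep every category already in the sub-table
        labelled = {label: images[label] for label in existing.features}
    labelled[new_label] = new_image
    extractor = train_extractor(labelled)  # only this small extractor is trained
    table = SubTable(extractor, {k: extractor(v) for k, v in labelled.items()})
    sub_tables[similar_category] = table   # replaces the old sub-table, if any
    images[new_label] = new_image
    return table
```

Only the extractor of the affected sub-table is rebuilt; the summary table and every other sub-table are left untouched.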

In addition, the invention provides a traffic sign recognition method. The method uses a feature index summary table, which includes one or more categories, a first feature extractor and the category features extracted by the first feature extractor; and one or more feature index sub-tables, each of which corresponds to a category in the feature index summary table and includes one or more sub-categories, a second feature extractor and the category features extracted by the second feature extractor. The method comprises the following steps:

S10, acquiring an image including a traffic sign;

S20, extracting features of the image using the first feature extractor corresponding to the feature index summary table, and obtaining the corresponding category in the feature index summary table;

S30, judging whether the category obtained from the feature index summary table has a feature index sub-table; if so, going to step S40, and if not, taking the category of the feature index summary table as the target category;

S40, extracting features of the image using the second feature extractor corresponding to the category, obtaining the corresponding category in the feature index sub-table, and taking the category of the feature index sub-table as the target category.

In this method, a feature index summary table and feature index sub-tables are established in advance and used to recognize traffic signs. The class spacing in the feature index summary table is large, there is no strong interference and the complexity of the traffic sign features is low, so a simple classification network can achieve high classification performance. In a feature index sub-table the class spacing is smaller, but there are few categories, and the differences between them are simple and have clear classification boundaries, such as height limit 3 m versus height limit 3.5 m, so a simple classification network can also achieve good classification performance.
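The two-level lookup of steps S10 to S40 can be sketched as follows. The names `best_match` and `recognize`, the tuple layout of `sub_tables` and the default thresholds are assumptions made for illustration; both extractors are passed in as plain callables.

```python
import math
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Feature = List[float]
Extractor = Callable[[object], Feature]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def best_match(feature: Feature, table: Dict[str, Feature], preset: float) -> Optional[str]:
    """Label with the largest similarity above `preset`, or None if there is none."""
    best_label, best_score = None, preset
    for label, ref in table.items():
        score = cosine_similarity(feature, ref)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def recognize(
    image: object,
    first_extractor: Extractor,
    summary_table: Dict[str, Feature],
    sub_tables: Dict[str, Tuple[Extractor, Dict[str, Feature]]],
    thresh1: float = 0.8,  # assumed values; the source only names the thresholds
    thresh2: float = 0.8,
) -> Optional[str]:
    # S20: coarse category from the summary table
    category = best_match(first_extractor(image), summary_table, thresh1)
    # S30: no match, or no sub-table, means the coarse result is final
    if category is None or category not in sub_tables:
        return category
    # S40: refine with the second extractor of that category's sub-table
    second_extractor, sub_table = sub_tables[category]
    return best_match(second_extractor(image), sub_table, thresh2)
```

A return value of `None` corresponds to the case in which the sign cannot be recognized at the level that was reached.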

Further, the step S20 specifically includes:

S210, extracting features of the image using the first feature extractor corresponding to the feature index summary table to obtain a first feature of the image;

S220, calculating the cosine distance between the first feature and the feature of each category in the feature index summary table to obtain a similarity;

S230, selecting the category whose similarity is larger than a preset value as the category of the feature index summary table.
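The similarity in S220 is used so that a larger value means a closer match, which suggests the usual cosine similarity. One way to write S220 and S230 down, where the symbols f (the first feature), g_k (the stored feature of category k), N (the number of categories) and t_1 (the preset value) are notation introduced here and not taken from the source:

```latex
\mathrm{sim}(f, g_k) = \frac{f \cdot g_k}{\lVert f \rVert \, \lVert g_k \rVert},
\qquad k = 1, \dots, N

k^{*} = \arg\max_{k} \, \mathrm{sim}(f, g_k),
\qquad \text{category } k^{*} \text{ is selected only if } \mathrm{sim}(f, g_{k^{*}}) > t_1
```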

Further, after the step S230, the method further includes:

S240, if the similarity is smaller than the preset value, adding the category corresponding to the image and the features extracted by the first feature extractor to the feature index summary table.

Further, step S40 specifically includes:

S410, extracting features of the image using the second feature extractor corresponding to the category to obtain a second feature of the image;

S420, calculating the cosine distance between the second feature and the feature of each category in the feature index sub-table to obtain a similarity;

S430, selecting the category whose similarity is larger than a preset value as the category of the feature index sub-table.

Further, after step S430, the method further includes:

S440, if the similarity is smaller than the preset value, rebuilding a feature extractor for the image together with all categories in the feature index sub-table, and building a new feature index sub-table from the image, all categories in the feature index sub-table and the rebuilt feature extractor.

If a traffic sign cannot be recognized during recognition, the corresponding traffic sign feature can be added to the summary table, or the feature extractor of the sub-table can be retrained. This avoids retraining the whole recognition model, reduces the resources required and effectively lowers the cost.

The invention also provides a computer storage medium for storing computer readable instructions, wherein the instructions, when executed, perform the traffic sign recognition training method or the operation of the traffic sign recognition method.

The invention also provides an electronic device, which comprises a processor and a memory, wherein executable instructions are stored in the memory, and the processor is used for realizing the execution of the traffic sign recognition training method or the operation of the traffic sign recognition method when executing the executable instructions stored in the memory.

According to the invention, by providing the feature index summary table and the feature index sub-tables, the whole model does not need to be retrained when the model cannot recognize a traffic sign, so the time for training the model is shortened and the training cost is effectively reduced. Further, a traffic sign recognition method is provided in which, because the feature index summary table and the feature index sub-tables are used, a simple classification model is sufficient, which reduces the complexity of the model and the recognition time.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings without creative efforts.

FIG. 1 is a flow chart of a traffic sign recognition training method according to an embodiment of the invention;

FIG. 2 is a flow chart of a traffic sign recognition method according to an embodiment of the invention;

FIG. 3 is a schematic structural diagram of a traffic sign recognition training device according to an embodiment of the present invention;

FIG. 4 is a schematic structural diagram of a traffic sign recognition device according to an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments that can be derived by one of ordinary skill in the art from the embodiments given herein are intended to be within the scope of the present invention.

Example one

Fig. 1 is a flowchart of a training method for traffic sign recognition according to the present embodiment. The method of the embodiment shown in fig. 1 includes:

a feature index summary table, wherein the feature index summary table includes one or more categories, and the feature index summary table includes a first feature extractor and category features extracted by the first feature extractor; one or more feature index sub-tables, wherein each of the feature index sub-tables corresponds to a category in the feature index summary table, and the feature index sub-tables comprise one or more sub-categories, wherein each of the feature index sub-tables comprises a second feature extractor and category features extracted by the second feature extractor;

In this embodiment, the traffic signs are divided into two levels. The first-level feature index summary table includes categories with sufficiently large class spacing, such as height limit, speed limit and pedestrian courtesy; the second-level feature index sub-tables include categories with small class spacing, such as height limit 4 m and height limit 4.5 m, or speed limit 60 and speed limit 30.

The characteristic index general table and the characteristic index sub-table are established as follows:

First, reasonably divide the categories to be contained in the feature index summary table. The principle of establishing the feature index summary table is to keep the distance between classes as large as possible: for several classes with high inter-class similarity, only one class is selected as a representative and added to the feature index summary table. For example, when the traffic signs include pedestrian courtesy, no parking, height limit 4 m and height limit 4.5 m, the feature index summary table only needs to contain three categories: pedestrian courtesy, no parking, and height limit 4 m (or height limit 4.5 m).

Second, build the traffic sign feature extractor. A traffic sign classification network is trained on the classes contained in the feature index summary table; the last classification output layer is then removed from the trained model, and the output of the last fully connected layer is taken as the traffic sign feature information. This completes the traffic sign feature extractor. In this embodiment, the output dimension of the feature extractor is 128.
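The idea of turning a trained classifier into a feature extractor by dropping its output layer can be illustrated with a toy network. This sketch uses plain Python lists instead of a deep learning framework; the `DenseLayer` class, the absence of activation functions and the tiny layer sizes are simplifying assumptions (the embodiment uses a 128-dimensional feature).

```python
from typing import Callable, List, Sequence

Vector = List[float]


class DenseLayer:
    """Fully connected layer y = W x + b with already trained weights."""

    def __init__(self, weights: List[List[float]], bias: List[float]) -> None:
        self.weights = weights
        self.bias = bias

    def __call__(self, x: Sequence[float]) -> Vector:
        return [
            sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(self.weights, self.bias)
        ]


def forward(layers: Sequence[DenseLayer], x: Sequence[float]) -> Vector:
    """Run the input through the given layers in order."""
    out = list(x)
    for layer in layers:
        out = layer(out)
    return out


def make_feature_extractor(trained_layers: Sequence[DenseLayer]) -> Callable[[Sequence[float]], Vector]:
    """Drop the final classification layer and expose the previous layer's output."""
    body = list(trained_layers[:-1])
    return lambda x: forward(body, x)
```

Because the extractor's output no longer depends on the number of classes, new categories can be added to an index table without changing the network.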

Third, establish the feature index summary table. Traffic sign pictures of the different categories (the categories contained in the feature index summary table) are fed to the feature extractor in turn to extract features, and the resulting features are associated with the corresponding traffic signs and stored together in the index table.
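Building the table amounts to one extraction pass over the labelled pictures. The sketch below assumes a single reference picture per category and, purely for illustration, serialises the table as JSON; neither the storage format nor the function names come from the source.

```python
import json
from typing import Callable, Dict, Iterable, List, Sequence, Tuple

Feature = List[float]


def build_index_table(
    extractor: Callable[[object], Sequence[float]],
    labelled_pictures: Iterable[Tuple[str, object]],
) -> Dict[str, Feature]:
    """Associate each category label with the feature extracted from its picture."""
    table: Dict[str, Feature] = {}
    for label, picture in labelled_pictures:
        table[label] = [float(v) for v in extractor(picture)]
    return table


def dump_index_table(table: Dict[str, Feature]) -> str:
    """Serialise the table (assumed format: a JSON object of label -> list of floats)."""
    return json.dumps(table, sort_keys=True)


def load_index_table(text: str) -> Dict[str, Feature]:
    """Inverse of dump_index_table."""
    return {label: [float(v) for v in feat] for label, feat in json.loads(text).items()}
```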

Fourth, establish the feature index sub-tables. The principle of establishing a feature index sub-table is to keep the distance between classes as small as possible, which is the opposite of the feature index summary table. For example, assume that the training set includes minimum speed limit 50, minimum speed limit 60, height limit 4 m and height limit 4.5 m. The feature index summary table then only needs to contain two categories: minimum speed limit 50 (or minimum speed limit 60) and height limit 4 m (or height limit 4.5 m). Two feature index sub-tables are needed, however: one containing minimum speed limit 50 and minimum speed limit 60, and the other containing height limit 4 m and height limit 4.5 m. There is therefore only one feature index summary table, but there may be several feature index sub-tables. Apart from the different establishing principle, the steps for building a feature index sub-table are the same as those for the feature index summary table and are not repeated in this embodiment.
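The division of categories between the summary table and the sub-tables can be sketched as below. The grouping of similar categories is taken as given input, and choosing the first member of each group as its representative is an assumption; the source only says that one class of each group is selected.

```python
from typing import Dict, List, Sequence, Tuple


def plan_tables(groups: Sequence[Sequence[str]]) -> Tuple[List[str], Dict[str, List[str]]]:
    """Split groups of mutually similar categories into summary and sub-table contents.

    Returns the summary-table categories (one representative per group) and,
    for every group with more than one member, the categories of its sub-table
    keyed by the representative.
    """
    summary: List[str] = []
    sub_tables: Dict[str, List[str]] = {}
    for members in groups:
        representative = members[0]  # assumed choice of representative
        summary.append(representative)
        if len(members) > 1:
            sub_tables[representative] = list(members)
    return summary, sub_tables


# The example from the text: two groups of two similar signs each.
summary, sub_tables = plan_tables([
    ["min_speed_50", "min_speed_60"],
    ["height_limit_4m", "height_limit_4_5m"],
])
```

For this input the summary table holds two categories and there are two sub-tables, matching the example in the text.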

S1, acquiring an image including a traffic sign of a category to be added;

Image features are extracted using the feature extractor corresponding to the feature index summary table and compared with all features in the feature index summary table (assuming N classes). The cosine distance is calculated to obtain N similarities, a similarity threshold thresh1 is set, and the class whose similarity is the largest of the N similarities and exceeds the threshold is selected as the corresponding class.

When a new recognition category needs to be added, particularly a traffic sign category with a large class spacing, the features of the image of the traffic sign to be added are extracted once using the feature extractor of the summary table, and the resulting features are added to the feature index summary table. The whole feature extractor therefore does not need to be retrained, the overall training time is greatly reduced, and the training cost is effectively lowered.

S32, judging whether the similar category has sub-categories;

When a similar category is identified in the feature index summary table, it is judged whether the similar category has a corresponding sub-table. If it does not, a sub-table and a feature extractor are constructed using the image. If it does, the sub-table is deleted, and the sub-table and its feature extractor are rebuilt from the category of the image and all categories in the original sub-table.

In this embodiment, by providing the feature index summary table and the feature index sub-tables, when recognition categories need to be added later, the trained model does not need to be replaced; only the feature index summary table or a feature index sub-table needs to be updated, which greatly reduces the optimization cost and the training time.

Example two

This embodiment provides a traffic sign recognition method. The method uses a feature index summary table, which includes one or more categories, a first feature extractor and the category features extracted by the first feature extractor; and one or more feature index sub-tables, each of which corresponds to a category in the feature index summary table and includes one or more sub-categories, a second feature extractor and the category features extracted by the second feature extractor. The method comprises the following steps:

S10, acquiring an image including a traffic sign;

The image including the traffic sign may be acquired by a vehicle-mounted camera or in other ways, which is not particularly limited in this embodiment.

Specifically, the feature index summary table is established as follows. First, the categories to be contained in the feature index summary table are reasonably divided. The principle of establishing the feature index summary table is to keep the distance between classes as large as possible: for several classes with high inter-class similarity, only one class is selected as a representative and added to the feature index summary table. For example, when the traffic signs include pedestrian courtesy, no parking, height limit 4 m and height limit 4.5 m, the feature index summary table only needs to contain three categories: pedestrian courtesy, no parking, and height limit 4 m (or height limit 4.5 m). Second, the traffic sign feature extractor is built. A traffic sign classification network is trained on the classes contained in the feature index summary table; the last classification output layer is then removed from the trained model, and the output of the last fully connected layer is taken as the traffic sign feature information, which completes the traffic sign feature extractor. In this embodiment, the output dimension of the feature extractor is 128. Third, the feature index summary table is established. Traffic sign pictures of the different categories (the categories contained in the feature index summary table) are fed to the feature extractor in turn to extract features, and the resulting features are associated with the corresponding traffic signs and stored together in the index table.

In this step, the first feature extractor established in advance for the feature index summary table is used to extract the features of the acquired image, and the features are compared with all features in the feature index summary table (assuming N classes). The cosine distance is calculated to obtain N similarities, a first similarity threshold thresh1 is set, and the class whose similarity is the largest of the N similarities and exceeds the first similarity threshold is selected as the category of the feature index summary table.

If the category of the feature index summary table has no corresponding sub-table, that category is taken as the target category. The absence of a corresponding sub-table indicates that the target category has already been identified, that is, the target category has no sub-categories with a small class spacing, as is the case for a "stop" sign.

If there is a corresponding sub-table, the sub-table is entered: the picture features are extracted again using the feature extractor corresponding to the sub-table and compared, the cosine distance is calculated to obtain M similarities, a second similarity threshold thresh2 is set, and the class whose similarity is the largest of the M similarities and exceeds the second similarity threshold is selected as the category of the feature index sub-table, that is, the target category.

In this embodiment, a feature index summary table and feature index sub-tables are established in advance and used to recognize traffic signs. The class spacing in the feature index summary table is large, there is no strong interference and the complexity of the traffic sign features is low, so a simple classification network can achieve high classification performance. In a feature index sub-table the class spacing is smaller, but there are few categories, and the differences between them are simple and have clear classification boundaries, such as height limit 3 m versus height limit 3.5 m, so a simple classification network can also achieve good classification performance. With the method of this embodiment, different feature extractors can be produced simply by designing one simple traffic sign recognition model and training it on different data, so the operational complexity is low. Second, regarding running speed and model size: because a simple classification network can achieve high performance, the time cost is small even with two-level indexing. In practical application, the method of this embodiment recognizes quickly and the model is small; details of the equipment and running performance are shown in the following table.

Table 1

Further, the step S20 specifically includes:

When the summary table cannot identify the corresponding category, the present embodiment further includes, after step S230:

S240, if the similarity is smaller than the preset value, adding the category corresponding to the image and the features extracted by the first feature extractor to the feature index summary table.

If the similarity between the features of the image and every feature in the summary table is less than thresh1, the new recognition category can be added simply by adding the category and its features to the summary table, without retraining the recognition model.

Further, when the category recognized in the feature index summary table has a sub-table, the embodiment further includes the following steps:

When the sub-table cannot identify the corresponding category, the embodiment further includes the following after step S430:

If the similarity between the features of the image and every feature in the sub-table is less than thresh2, only that sub-table and its corresponding feature extractor need to be rebuilt, and the whole recognition model does not need to be retrained.

Example three

This embodiment provides a schematic structural diagram of a training device 20 for traffic sign recognition. The training device 20 for traffic sign recognition of this embodiment comprises a processor 21, a memory 22 and a computer program stored in the memory 22 and executable on the processor 21. The processor 21, when executing the computer program, implements the steps in the above embodiment of the training method for traffic sign recognition, such as step S1 shown in fig. 1.

Illustratively, the computer program may be divided into one or more modules/units, which are stored in the memory 22 and executed by the processor 21 to accomplish the present invention. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions for describing the execution of the computer program in the training device 20 for traffic sign recognition.

The training device 20 for traffic sign recognition may include, but is not limited to, a processor 21 and a memory 22. It will be appreciated by those skilled in the art that the schematic diagram is merely an example of the training device 20 for traffic sign recognition and does not constitute a limitation of the training device 20 for traffic sign recognition, and may include more or less components than those shown, or some components in combination, or different components, for example, the training device 20 for traffic sign recognition may also include input-output devices, network access devices, buses, etc.

The processor 21 may be a Central Processing Unit (CPU), another general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. The general purpose processor may be a microprocessor or any conventional processor. The processor 21 is the control center of the training device 20 for traffic sign recognition and connects the various parts of the entire training device 20 for traffic sign recognition through various interfaces and lines.

The memory 22 may be used for storing the computer programs and/or modules, and the processor 21 may implement various functions of the training device 20 for traffic sign recognition by running or executing the computer programs and/or modules stored in the memory 22 and invoking data stored in the memory 22. The memory 22 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like; the storage data area may store data (such as audio data, a phonebook, etc.) created according to the use of the cellular phone, and the like. In addition, the memory 22 may include high speed random access memory, and may also include non-volatile memory, such as a hard disk, a memory, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), at least one magnetic disk storage device, a Flash memory device, or other volatile solid state storage device.

Wherein, the integrated module/unit of the training device 20 for traffic sign recognition can be stored in a computer readable storage medium if it is implemented in the form of software functional unit and sold or used as a stand-alone product. Based on such understanding, all or part of the flow of the method according to the above embodiments may be implemented by a computer program, which may be stored in a computer readable storage medium and used by the processor 21 to implement the steps of the above embodiments of the method. Wherein the computer program comprises computer program code, which may be in the form of source code, object code, an executable file or some intermediate form, etc. The computer-readable medium may include: any entity or device capable of carrying the computer program code, recording medium, usb disk, removable hard disk, magnetic disk, optical disk, computer Memory, Read-Only Memory (ROM), Random Access Memory (RAM), electrical carrier wave signals, telecommunications signals, software distribution medium, and the like. It should be noted that the computer readable medium may contain content that is subject to appropriate increase or decrease as required by legislation and patent practice in jurisdictions, for example, in some jurisdictions, computer readable media does not include electrical carrier signals and telecommunications signals as is required by legislation and patent practice.

It should be noted that the above-described device embodiments are merely illustrative, where the units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. In addition, in the drawings of the embodiment of the apparatus provided by the present invention, the connection relationship between the modules indicates that there is a communication connection between them, and may be specifically implemented as one or more communication buses or signal lines. One of ordinary skill in the art can understand and implement it without inventive effort.

Example four

The present embodiment provides a schematic structural view of a traffic sign recognition device 30. The traffic sign recognition device 30 of this embodiment comprises a processor 31, a memory 32 and a computer program stored in said memory 32 and executable on said processor 31. The processor 31, when executing the computer program, implements the steps in the above described embodiments of the traffic sign identification method. Alternatively, the processor 31 implements the functions of the modules/units in the above device embodiments when executing the computer program.

Illustratively, the computer program may be divided into one or more modules/units, which are stored in the memory 32 and executed by the processor 31 to accomplish the present invention. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions, which are used to describe the execution of the computer program in the traffic sign recognition device 30.

Wherein, the integrated module/unit of the traffic sign recognition device 30 can be stored in a computer readable storage medium if it is implemented in the form of software functional unit and sold or used as a stand-alone product. Based on such understanding, all or part of the flow of the method according to the embodiments of the present invention may also be implemented by a computer program, which may be stored in a computer-readable storage medium, and when the computer program is executed by the processor 31, the steps of the method embodiments may be implemented. Wherein the computer program comprises computer program code, which may be in the form of source code, object code, an executable file or some intermediate form, etc. The computer-readable medium may include: any entity or device capable of carrying the computer program code, recording medium, usb disk, removable hard disk, magnetic disk, optical disk, computer Memory, Read-Only Memory (ROM), Random Access Memory (RAM), electrical carrier wave signals, telecommunications signals, software distribution medium, and the like. It should be noted that the computer readable medium may contain content that is subject to appropriate increase or decrease as required by legislation and patent practice in jurisdictions, for example, in some jurisdictions, computer readable media does not include electrical carrier signals and telecommunications signals as is required by legislation and patent practice.

The above description only illustrates the preferred embodiments of the present invention and is not to be construed as limiting it; any modifications, equivalent substitutions, and improvements made within the spirit and principle of the present invention are intended to be included within its scope of protection.

Claims

1. A training method for traffic sign recognition, characterized by comprising a feature index general table, wherein the feature index general table comprises one or more categories, a first feature extractor, and category features extracted by the first feature extractor; and one or more feature index sub-tables, wherein each feature index sub-table corresponds to a category in the feature index general table and comprises one or more sub-categories, a second feature extractor, and category features extracted by the second feature extractor; the method comprises the following steps:

S1, acquiring an image including a traffic sign of a category to be added;

S32, determining whether the similar category has sub-categories;

S33, if not, creating a feature index sub-table and a feature extractor corresponding to the similar category;

S34, if so, rebuilding a feature extractor for the category of the image together with all categories in the feature index sub-table, assembling the image's category, all categories in the feature index sub-table, and the rebuilt feature extractor into a new feature index sub-table, and replacing the feature index sub-table with the new feature index sub-table;
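
As an illustration only, steps S32 to S34 can be sketched in Python as follows. This is a minimal sketch and not the claimed implementation: the names `SubTable`, `build_sub_table`, `add_sub_category`, and the `train` callback are hypothetical, images are stood in for by plain lists of floats, and it is assumed that a newly created sub-table (S33) initially holds only the sub-category being added.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Image = List[float]                                  # stand-in for real image data
Feature = List[float]
Extractor = Callable[[Image], Feature]
Trainer = Callable[[Dict[str, Image]], Extractor]    # hypothetical training routine


@dataclass
class SubTable:
    extractor: Extractor            # second feature extractor
    samples: Dict[str, Image]       # sub-category -> representative image
    features: Dict[str, Feature]    # sub-category -> extracted category feature


def build_sub_table(samples: Dict[str, Image], train: Trainer) -> SubTable:
    """Train an extractor on the given samples and index their features."""
    extractor = train(samples)
    features = {name: extractor(img) for name, img in samples.items()}
    return SubTable(extractor, dict(samples), features)


def add_sub_category(sub_tables: Dict[str, SubTable], similar_category: str,
                     new_name: str, image: Image, train: Trainer) -> SubTable:
    existing = sub_tables.get(similar_category)      # S32: any sub-categories yet?
    if existing is None:
        # S33: no sub-table yet, so create one (assumed to start with the new entry).
        samples = {new_name: image}
    else:
        # S34: rebuild the extractor over the old sub-categories plus the new image.
        samples = {**existing.samples, new_name: image}
    # Only this one sub-table is (re)built; other sub-tables are left untouched.
    sub_tables[similar_category] = build_sub_table(samples, train)
    return sub_tables[similar_category]
```

In this sketch only the sub-table of the similar category is retrained, which mirrors the stated aim of avoiding a full retraining of the whole feature extractor.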

2. A traffic sign recognition method using the traffic sign recognition training method according to claim 1, characterized in that the method comprises the following steps:

s10, acquiring an image comprising a traffic sign;

and S40, extracting a feature of the image using the second feature extractor corresponding to the category, obtaining the matching category in the feature index sub-table, and taking that category of the feature index sub-table as the target category.
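
As an illustration only, the two-level lookup that ends in step S40 can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the coarse lookup against the feature index general table is inferred from the abstract, cosine similarity is an assumed similarity measure (the text only speaks of "similarity"), and the names `cosine`, `best_match`, and `recognize` are hypothetical. Returning the coarse category when no sub-table exists is also an assumption.

```python
import math
from typing import Callable, Dict, List, Tuple

Image = List[float]                       # stand-in for real image data
Feature = List[float]
Extractor = Callable[[Image], Feature]


def cosine(a: Feature, b: Feature) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def best_match(query: Feature, table: Dict[str, Feature]) -> str:
    """Return the table entry whose stored feature is most similar to the query."""
    return max(table, key=lambda name: cosine(query, table[name]))


def recognize(image: Image,
              first_extractor: Extractor,
              general_table: Dict[str, Feature],
              sub_tables: Dict[str, Tuple[Extractor, Dict[str, Feature]]]) -> str:
    # Coarse stage (assumed): match the first feature against the general table.
    category = best_match(first_extractor(image), general_table)
    sub = sub_tables.get(category)
    if sub is None:
        return category                   # assumed fallback: no sub-table to refine with
    # S40: re-extract with the category's second extractor and match in its sub-table.
    second_extractor, sub_features = sub
    return best_match(second_extractor(image), sub_features)
```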

3. The method according to claim 2, wherein the step S20 specifically includes:

4. The method of claim 3, further comprising after the step S230:

5. The method according to claim 2, wherein step S40 specifically includes:

6. The method of claim 5, further comprising after step S430

7. A computer storage medium storing computer readable instructions that, when executed, perform the operations of the traffic sign recognition training method of claim 1 or the traffic sign recognition method of any of claims 2-6.

8. An electronic device comprising a processor and a memory having executable instructions stored thereon, wherein the processor, when executing the executable instructions in the memory, is configured to perform the operations of the traffic sign recognition training method of claim 1 or the traffic sign recognition method of any of claims 2-6.