CN117197765A - Image recognition method, device, electronic equipment and medium - Google Patents


Info

Publication number
CN117197765A
Authority
CN
China
Prior art keywords
image recognition
recognition model
model
auxiliary
head
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311236132.XA
Other languages
Chinese (zh)
Inventor
周天宝
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xiaomi Automobile Technology Co Ltd
Original Assignee
Xiaomi Automobile Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xiaomi Automobile Technology Co Ltd filed Critical Xiaomi Automobile Technology Co Ltd
Priority to CN202311236132.XA priority Critical patent/CN117197765A/en
Publication of CN117197765A publication Critical patent/CN117197765A/en
Pending legal-status Critical Current

Classifications

    • Y — GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 — TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T — CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T 10/00 — Road transport of goods or passengers
    • Y02T 10/10 — Internal combustion engine [ICE] based vehicles
    • Y02T 10/40 — Engine management systems

Landscapes

  • Image Analysis (AREA)

Abstract

The disclosure provides an image recognition method, an image recognition device, electronic equipment, and a medium, and relates to the technical field of automatic driving. The method includes: obtaining image information acquired by a vehicle while driving, and recognizing the image information through a target image recognition model to obtain an image recognition result. The target image recognition model is obtained as follows: when the primary head of an image recognition model in use outputs an erroneous result, an auxiliary head is added to the image recognition model; the auxiliary head learns the sample data corresponding to the erroneous result, and the features learned by the auxiliary head are then migrated to the primary head, yielding a trained image recognition model. The auxiliary head is then deleted from the trained image recognition model to obtain the target image recognition model. Compared with the original image recognition model, no new module is added to the target image recognition model, so its structure remains simple and its parameters relatively few, and image recognition is accordingly fast.

Description

Image recognition method, device, electronic equipment and medium
Technical Field
The disclosure relates to the technical field of automatic driving, and in particular relates to an image recognition method, an image recognition device, electronic equipment and a medium.
Background
As an important application of artificial intelligence, autonomous driving technology has developed rapidly in recent years. The perception system is the most critical part of autonomous driving, and driving safety depends on the accuracy and robustness of the perception system. The perception system usually relies on models to recognize vehicles, pedestrians, traffic lights, and the like. When a model produces a bad case (an erroneous or poor result), a new module may be added to the model to correct it. However, adding a new module increases the complexity of the model; when the model is then used to recognize image information, its computation slows down with the added complexity, and image recognition slows down accordingly.
Disclosure of Invention
In order to overcome the problems in the related art, the present disclosure provides an image recognition method, an image recognition device, an electronic apparatus, and a medium.
According to a first aspect of embodiments of the present disclosure, there is provided an image recognition method, the method including: acquiring image information collected by a vehicle while driving; and inputting the image information into a target image recognition model to obtain an image recognition result output by a primary head in the target image recognition model. The target image recognition model is obtained by deleting an auxiliary head from a trained image recognition model, the trained image recognition model includes the auxiliary head and the primary head, and the trained image recognition model is obtained by, when the image recognition model outputs an erroneous result, learning sample data corresponding to the erroneous result through the auxiliary head and then migrating the features learned by the auxiliary head to the primary head.
Optionally, the trained image recognition model is obtained by: acquiring first basic sample data and sample data corresponding to the erroneous result, where the first basic sample data is sample data for which the image recognition model does not output an erroneous result; training the auxiliary head with the first basic sample data and the sample data corresponding to the erroneous result, and training the primary head with the first basic sample data; and migrating the features learned by the auxiliary head to the primary head to obtain the trained image recognition model.
Optionally, migrating the features learned by the auxiliary head to the primary head includes: acquiring a first primary loss function corresponding to the primary head, a first auxiliary loss function corresponding to the auxiliary head, and a distillation loss function corresponding to the image recognition model; and migrating the features learned by the auxiliary head to the primary head through the first primary loss function, the first auxiliary loss function, and the distillation loss function.
Optionally, the first auxiliary loss function is the product of the first primary loss function and a first weight coefficient.
Optionally, the method further comprises: acquiring second basic sample data; and training an initial training model through the second basic sample data to obtain the image recognition model.
Optionally, training the initial training model with the second basic sample data to obtain the image recognition model includes: training the initial training model with the second basic sample data to obtain a trained initial training model; for the trained initial training model, acquiring a second primary loss function corresponding to the primary head in the recognition layer and a second auxiliary loss function corresponding to the auxiliary head in the recognition layer; and updating the trained initial training model through the second primary loss function and the second auxiliary loss function to obtain the image recognition model.
Optionally, the second auxiliary loss function is the product of the second primary loss function and a second weight coefficient.
Optionally, the target image recognition model is a multi-task model for autonomous driving.
According to a second aspect of embodiments of the present disclosure, there is provided an image recognition apparatus, the apparatus comprising: an acquisition module configured to acquire image information collected by a vehicle while driving; and a recognition module configured to input the image information into a target image recognition model to obtain an image recognition result output by a primary head in the target image recognition model, where the target image recognition model is obtained by deleting an auxiliary head from a trained image recognition model, the trained image recognition model includes the auxiliary head and the primary head, and the trained image recognition model is obtained by, when the image recognition model outputs an erroneous result, learning sample data corresponding to the erroneous result through the auxiliary head and then migrating the features learned by the auxiliary head to the primary head.
According to a third aspect of embodiments of the present disclosure, there is provided an electronic device, comprising: a processor; a memory for storing processor-executable instructions; wherein the processor is configured to implement the steps of the method of the first aspect when executing the instructions.
According to a fourth aspect of embodiments of the present disclosure, there is provided a computer readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement the steps of the method provided by the first aspect of the present disclosure.
The image recognition method, apparatus, electronic device, and medium provided by the disclosure acquire image information collected by a vehicle while driving, input the image information into a target image recognition model, and obtain an image recognition result output by the primary head of the target image recognition model. The target image recognition model is obtained as follows: when the primary head of an image recognition model in use outputs an erroneous result, an auxiliary head is added to the image recognition model; the auxiliary head learns the sample data corresponding to the erroneous result, and the features learned by the auxiliary head are migrated to the primary head to obtain a trained image recognition model; the auxiliary head is then deleted from the trained image recognition model to obtain the target image recognition model. In this method, the image recognition result is output by the primary head of the target image recognition model. Compared with the original image recognition model, no new module is added to the target image recognition model, so solving the bad-case problem does not complicate the model: its structure remains simple, the model is easy to debug and deploy, and subsequent image recognition is convenient. Because the model structure is simple and the model parameters are relatively few, image recognition through the model is also faster.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the disclosure and together with the description, serve to explain the principles of the disclosure.
FIG. 1 is a flowchart illustrating an image recognition method according to an exemplary embodiment;
FIG. 2 is a schematic diagram of a target image recognition model according to an exemplary embodiment;
FIG. 3 is a schematic diagram of a target image recognition model according to an exemplary embodiment;
FIG. 4 is a flowchart illustrating another image recognition method according to an exemplary embodiment;
FIG. 5 is a schematic diagram of an image recognition model according to an exemplary embodiment;
FIG. 6 is a schematic diagram of an image recognition model after adding an auxiliary head, according to an exemplary embodiment;
FIG. 7 is a schematic diagram of training the primary head and auxiliary head, according to an exemplary embodiment;
FIG. 8 is a flowchart illustrating another image recognition method according to an exemplary embodiment;
FIG. 9 is a block diagram of an image recognition apparatus, according to an exemplary embodiment;
FIG. 10 is a functional block diagram of an electronic device, according to an exemplary embodiment;
FIG. 11 is a block diagram of a server, according to an exemplary embodiment.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. Where the following description refers to the drawings, the same numbers in different drawings denote the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with some aspects of the present disclosure as recited in the appended claims.
As an important application of artificial intelligence, autonomous driving technology has developed rapidly in recent years. The perception system is the most critical link in autonomous driving, and driving safety depends on the correctness of the perception system. The perception system usually relies on models to recognize vehicles, pedestrians, traffic lights, and the like. When a model produces a bad case, a new module may be added to the model to improve the situation, where a bad case refers to a prediction in which the model errs or performs poorly. However, adding a new module increases the complexity of the model; when the model is used to recognize image information, its computation slows down with the added complexity, image recognition slows down accordingly, and the increased complexity also makes subsequent debugging and deployment of the model more difficult.
To solve the above problems, the present disclosure provides an image recognition method, an apparatus, an electronic device, and a medium. When the primary head of a model outputs an erroneous result, an auxiliary head is added to the model, sample data corresponding to the erroneous result is learned by the auxiliary head, and the features learned by the auxiliary head are then migrated to the primary head. After the migration succeeds, the auxiliary head is deleted from the model, and the model is used to recognize image information. Because no new module is added to the model that performs recognition, the model keeps a simple structure, is easy to debug and deploy, and is convenient for subsequent image recognition; and because the structure is simple and the parameters relatively few, image recognition through the model is faster. The image recognition method is described in the following embodiments.
Referring to fig. 1, fig. 1 is a flowchart illustrating an image recognition method according to an exemplary embodiment. The method may be applied to the image recognition apparatus 200 shown in fig. 9, to an electronic device, or to a computer-readable storage medium. This embodiment takes application to an electronic device as an example, where the electronic device may be an on-board unit such as the vehicle's central processing unit or another processing device, or a server communicatively connected to the vehicle. The flowchart shown in fig. 1 is described in detail below; the image recognition method may specifically include the following steps:
step S110, acquiring image information acquired by the vehicle in the running process.
The perception system of the vehicle includes a camera mounted on the vehicle, and the camera captures image information while the vehicle is driving.
Optionally, to give the vehicle a wider field of view while driving, multiple cameras may be mounted at different positions on the vehicle, with cameras at different positions collecting image information from different viewing angles.
Step S120: inputting the image information into a target image recognition model to obtain an image recognition result output by a primary head in the target image recognition model, where the target image recognition model is obtained by deleting an auxiliary head from a trained image recognition model, the trained image recognition model includes the auxiliary head and the primary head, and the trained image recognition model is obtained by, when the image recognition model outputs an erroneous result, learning sample data corresponding to the erroneous result through the auxiliary head and then migrating the features learned by the auxiliary head to the primary head.
The image information is input into the target image recognition model, the image is recognized by the target image recognition model, and the primary head in the target image recognition model outputs the image recognition result.
In one embodiment, referring to FIG. 2, the target image recognition model 110 is a single-task model for autonomous driving. The target image recognition model 110 includes a backbone network (Backbone), a bottleneck layer (Neck), and a primary head (Head). The backbone network extracts features. The bottleneck layer is a connecting layer between the backbone network and the primary head; it compresses and reduces the dimensionality of the features extracted by the backbone network, which reduces the number of parameters and the computational complexity and speeds up recognition by the subsequent primary head. The image information is input into the target image recognition model 110; the backbone network extracts features of the image information and sends them to the bottleneck layer; the bottleneck layer compresses and reduces the dimensionality of the image features; the compressed, reduced-dimension image features are input into the primary head; and the primary head recognizes the image features and outputs the image recognition result. For the single-task model shown in fig. 2, the model performs one recognition task, and the image recognition result includes the recognition result for one category of object. For example, if the recognition task is to recognize vehicles, the target image recognition model is a single-task model for recognizing vehicles, and the image recognition result includes the recognition result of the photographed vehicle. If the captured image information contains multiple vehicles, the image recognition result includes recognition results for the multiple vehicles.
In another embodiment, referring to FIG. 3, the target image recognition model 110 may be a multi-task model for autonomous driving. The target image recognition model 110 includes a backbone network, a bottleneck layer, and multiple primary heads, where each primary head corresponds to one recognition task. The image information is input into the target image recognition model 110; the backbone network extracts features of the image information and sends them to the bottleneck layer; the bottleneck layer compresses and reduces the dimensionality of the image features; the compressed, reduced-dimension image features are input into each of the primary heads; and each primary head recognizes the image features and outputs its own image recognition result. For the multi-task model shown in fig. 3, the image recognition result includes recognition results for multiple categories of objects. For example, if the target image recognition model is a multi-task model for recognizing vehicles, zebra crossings, traffic lights, and pedestrians, the image recognition result may include recognition results for the photographed vehicles, zebra crossings, traffic lights, and pedestrians. The image recognition result does not necessarily include recognition results for all categories of objects; which categories appear in the result depends on the image information actually captured.
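As a minimal sketch of the backbone-neck-heads data flow described above (not the patent's implementation: all class, function, and task names here are hypothetical, and a real model would use a deep-learning framework rather than plain Python):

```python
# Minimal sketch of the backbone -> neck -> heads data flow. All names
# are hypothetical illustrations of the structure described above.

class MultiTaskModel:
    def __init__(self, heads):
        # heads: mapping from task name to a per-task head function
        self.heads = heads

    def backbone(self, image):
        # Stand-in for feature extraction (e.g. a CNN backbone).
        return [float(p) for p in image]

    def neck(self, features):
        # Stand-in for the bottleneck layer: compress / reduce dimension
        # so the heads work with fewer parameters.
        return features[: len(features) // 2]

    def forward(self, image):
        feats = self.neck(self.backbone(image))
        # Every primary head receives the same compressed features and
        # outputs its own recognition result.
        return {task: head(feats) for task, head in self.heads.items()}

model = MultiTaskModel({
    "vehicle": lambda f: sum(f) > 0,            # hypothetical detectors
    "traffic_light": lambda f: len(f) % 2 == 0,
})
result = model.forward([1, -2, 3, 4])
```

A single-task model as in fig. 2 is the same structure with one entry in `heads`.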
The image recognition method provided by this embodiment obtains image information collected by a vehicle while driving, inputs the image information into a target image recognition model, and obtains an image recognition result output by the primary head of the target image recognition model. The target image recognition model is obtained as follows: when the primary head of an image recognition model in use outputs an erroneous result, an auxiliary head is added to the image recognition model; the auxiliary head learns the sample data corresponding to the erroneous result, and the features learned by the auxiliary head are migrated to the primary head to obtain a trained image recognition model; the auxiliary head is then deleted from the trained image recognition model to obtain the target image recognition model. In this embodiment, the image recognition result is output by the primary head of the target image recognition model. Compared with the original image recognition model, no new module is added to the target image recognition model, so solving the bad-case problem does not complicate the model: its structure remains simple, it is easy to debug and deploy, and subsequent image recognition is convenient. Because the model structure is simple and the parameters relatively few, image recognition through the model is also faster.
Optionally, the image information collected while the vehicle is driving may be stored in a sample database to expand the training samples, and may serve as training samples when the model needs to be trained later.
Referring to fig. 4, the trained image recognition model is obtained by:
step S210, acquiring first basic sample data and sample data corresponding to the error result, where the first basic sample data is sample data of which the image recognition model does not output the error result.
In one case, the image recognition model is already deployed in an electronic device and in use for image recognition. Referring to fig. 5, the image recognition model 120 includes a backbone network, a bottleneck layer, and a primary head. Certain image data is input into the image recognition model 120 for recognition; when the primary head of the image recognition model 120 outputs an erroneous result, the image recognition model 120 needs to be adjusted, and that image data is taken as the sample data corresponding to the erroneous result. First basic sample data is also acquired, where the first basic sample data is sample data for which the image recognition model does not output an erroneous result. It is understood that the first basic sample data may be sample data that has been input into the image recognition model without producing an erroneous result, or sample data that has not yet been input into and recognized by the image recognition model.
It should be noted that the image recognition model is not limited to the single-task model of fig. 5; it may also be a multi-task model. When the image recognition model is a multi-task model, the model needs to be adjusted if any primary head in the model outputs an erroneous result.
In this case, with the image recognition model deployed in the electronic device and in use for image recognition, image data collected while the vehicle is driving is input into the image recognition model; if the primary head outputs an erroneous result for that image data, the image data is taken as the sample data corresponding to the erroneous result, and the first basic sample data is obtained from the sample database. Optionally, the first basic sample data in the sample database may be data collected during the same drive for which the image recognition model did not output an erroneous result.
In another case, the image recognition model has been trained from an initial training model but is not yet deployed in an electronic device. Referring to fig. 6, the image recognition model 120 includes a backbone network, a bottleneck layer, an auxiliary head, and a primary head. Certain image data is input into the image recognition model 120 for recognition; when the primary head of the image recognition model 120 outputs an erroneous result, the image recognition model 120 needs to be adjusted, and that image data is taken as the sample data corresponding to the erroneous result. First basic sample data is also acquired.
In this case, with the image recognition model trained from the initial training model but not yet deployed, the sample database stores multiple pieces of sample data, all of which are image data. After any sample data in the database is input into the image recognition model, if the primary head of the image recognition model outputs an erroneous result, that sample data is determined to be sample data corresponding to the erroneous result. The remaining sample data in the sample database, other than the sample data corresponding to the erroneous result, can be used as the first basic sample data.
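A minimal sketch of this partitioning step, where `toy_model` and the sample records are hypothetical stand-ins for the recognition model and the sample database:

```python
# Partition a sample database into bad-case samples (the model's output
# disagrees with the label) and first basic sample data (everything else).
# The model and sample records are hypothetical stand-ins.

def partition_samples(model, samples):
    bad_cases, basic = [], []
    for image, label in samples:
        prediction = model(image)
        # An erroneous result puts the sample in the bad-case set.
        (bad_cases if prediction != label else basic).append((image, label))
    return bad_cases, basic

# Toy "model" that labels an image by the sign of its pixel sum.
toy_model = lambda image: "vehicle" if sum(image) > 0 else "background"
db = [([1, 2], "vehicle"), ([-1, -2], "vehicle"), ([-3, 0], "background")]
bad, basic = partition_samples(toy_model, db)
```

Here `bad` holds the sample data corresponding to erroneous results and `basic` the first basic sample data.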
Step S220: training the auxiliary head with the first basic sample data and the sample data corresponding to the erroneous result, and training the primary head with the first basic sample data.
In this embodiment, the features of the sample data are learned by the auxiliary head. Referring to fig. 6, the auxiliary head is added to the image recognition model already deployed in the electronic device as shown in fig. 5. For an image recognition model not yet deployed in an electronic device, the model already carries an auxiliary head, as also shown in fig. 6. Optionally, the auxiliary head may be placed at the same intermediate network layer as the primary head.
For convenience of description, referring to fig. 7, denote the first basic sample data as sample data d1 and the sample data corresponding to the erroneous result as sample data d2. Sample data d1 and d2 are input into the image recognition model of fig. 7; after passing in turn through the backbone network and the bottleneck layer, both d1 and d2 are fed into the auxiliary head, and the auxiliary head is trained on sample data d1 and d2. Only sample data d1 is fed into the primary head, and the primary head is trained on sample data d1.
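A minimal sketch of this routing rule, assuming one training step per head per batch (the function and variable names are hypothetical, not the patent's implementation):

```python
# Route samples per the rule above: the auxiliary head trains on both the
# basic data (d1) and the bad-case data (d2); the primary head trains on
# d1 only, so it keeps its existing behavior. All names are hypothetical.

def train_epoch(shared_features, primary_step, auxiliary_step, d1, d2):
    seen_by_primary, seen_by_auxiliary = 0, 0
    for batch in d1 + d2:
        feats = shared_features(batch)   # backbone + bottleneck layer
        auxiliary_step(feats)            # auxiliary head sees d1 and d2
        seen_by_auxiliary += 1
        if batch in d1:
            primary_step(feats)          # primary head sees d1 only
            seen_by_primary += 1
    return seen_by_primary, seen_by_auxiliary

d1 = [[0.1, 0.2], [0.3, 0.4]]            # basic samples
d2 = [[9.0, 9.0]]                        # bad-case samples
counts = train_epoch(lambda b: b, lambda f: None, lambda f: None, d1, d2)
```

The counts confirm the routing: the primary head only ever sees the basic samples, while the auxiliary head sees every sample.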
It should be noted that the image recognition model is not limited to the single-task model shown in fig. 7; it may also be a multi-task model. Optionally, when the image recognition model is a multi-task model, an auxiliary head may be set for each primary head, with the auxiliary head corresponding to each primary head handling the bad cases of its own task. Alternatively, auxiliary heads may be set only for the primary heads that have bad cases, which reduces the number of auxiliary heads in the model, reduces the model's parameters, and speeds up subsequent training.
Step S230: migrating the features learned by the auxiliary head to the primary head to obtain the trained image recognition model.
In one embodiment, a first primary loss function corresponding to the primary head is obtained, along with a first auxiliary loss function corresponding to the auxiliary head and a distillation loss function corresponding to the image recognition model. The features learned by the auxiliary head are migrated to the primary head through the first primary loss function, the first auxiliary loss function, and the distillation loss function. In this way, the feature representation of the primary head is drawn closer to the feature representation of the auxiliary head, thereby migrating the auxiliary head's learned features to the primary head. Illustratively, the first primary loss function, the first auxiliary loss function, and the distillation loss function are summed to obtain a first total loss function, and the features learned by the auxiliary head are migrated to the primary head through the first total loss function.
Optionally, the first auxiliary loss function is the product of the first primary loss function and a first weight coefficient. The first weight coefficient may be set according to training requirements, for example 0.1 or 0.15.
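A minimal numeric sketch of this loss combination. The mean-squared-error loss form, the distillation loss as a penalty on the gap between the two heads' outputs, and the weight value 0.1 are all illustrative assumptions; the patent does not fix these choices.

```python
# Combine the three losses described above into the first total loss.
# The MSE loss form and the weight w1 = 0.1 are illustrative assumptions.

def mse(pred, target):
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def first_total_loss(primary_out, auxiliary_out, target, w1=0.1):
    primary_loss = mse(primary_out, target)
    # Per the claims, the auxiliary loss is the primary loss form scaled
    # by the first weight coefficient w1.
    auxiliary_loss = w1 * mse(auxiliary_out, target)
    # The distillation loss pulls the primary head's representation
    # toward the auxiliary head's, performing the feature migration.
    distill_loss = mse(primary_out, auxiliary_out)
    return primary_loss + auxiliary_loss + distill_loss

loss = first_total_loss([1.0, 0.0], [0.8, 0.2], [1.0, 0.0])
```

Minimizing this total loss trains the primary head on its own task while drawing its outputs toward the auxiliary head's.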
In this embodiment, the primary head is trained with the first basic sample data so that the model retains its previous function through the primary head, and the auxiliary head is trained with the first basic sample data and the sample data corresponding to the erroneous result so that the model can solve the bad-case problem through the auxiliary head. The features learned by the auxiliary head are then migrated to the primary head to obtain the trained image recognition model, whose primary head both retains the previous function and solves the bad-case problem.
Optionally, after the trained image recognition model is obtained, the auxiliary head in the trained image recognition model is deleted to obtain the target image recognition model. Deleting the auxiliary head reduces the parameters of the target image recognition model, which improves its image recognition speed.
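A minimal sketch of the deletion step. The dict-of-modules representation is a hypothetical stand-in; in a framework such as PyTorch this would correspond to removing the auxiliary-head submodule before exporting the model.

```python
# Delete the auxiliary head(s) from a trained model to obtain the target
# model. The dict-of-modules representation is a hypothetical stand-in.

def delete_auxiliary_heads(model):
    target = dict(model)  # copy: keep backbone, neck, and primary heads
    for name in [n for n in target if n.startswith("auxiliary")]:
        del target[name]  # auxiliary parameters no longer ship
    return target

trained = {
    "backbone": "...",
    "neck": "...",
    "primary_head": "...",
    "auxiliary_head": "...",
}
target_model = delete_auxiliary_heads(trained)
```

The deployed target model thus carries only the modules the original model had, which is why its parameter count does not grow.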
Optionally, when the amount of sample data available for training is small, different sample data may be spliced together. Illustratively, the first basic sample data in the sample database is spliced with the sample data corresponding to the erroneous result, and the spliced data serves as new sample data, expanding the number of samples in the sample database and enriching the samples.
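One possible reading of "splicing" is concatenating two images to form a new sample; the patent does not specify the operation, so the row-list representation and side-by-side axis below are assumptions.

```python
# Splice two images (as lists of pixel rows) side by side to create a
# new training sample. The row-list representation and splicing axis
# are illustrative assumptions.

def splice(img_a, img_b):
    assert len(img_a) == len(img_b), "images must share the same height"
    return [row_a + row_b for row_a, row_b in zip(img_a, img_b)]

base = [[1, 2], [3, 4]]        # a basic sample
bad_case = [[5, 6], [7, 8]]    # a bad-case sample
new_sample = splice(base, bad_case)
```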
Optionally, the method further includes: acquiring second basic sample data, where the second basic sample data is any sample data in the sample database; and training an initial training model with the second basic sample data to obtain the image recognition model. Referring to fig. 8, the image recognition model is trained by:
Step S310: train the initial training model with the second basic sample data to obtain a trained initial training model.
The second basic sample data is input into the initial training model, and the initial training model is trained to obtain the trained initial training model.
Optionally, the initial training model includes a main network, a bottleneck layer, an auxiliary head, and a main head. The second basic sample data passes through the main network and the bottleneck layer and is then fed into the main head and the auxiliary head respectively, and the main head and the auxiliary head are each trained to obtain the trained initial training model.
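The data flow just described (input, then main network, then bottleneck layer, then the two heads in parallel over shared features) can be sketched functionally. Every stage below is a placeholder function, not the real network:

```python
# Placeholder stages for the initial training model's forward pass:
# input -> main network -> bottleneck layer -> (main head, auxiliary head).

def main_network(x):
    return x * 2          # stand-in for the shared feature extractor

def bottleneck(x):
    return x + 1          # stand-in for the bottleneck layer

def main_head(feats):
    return feats * 10     # stand-in for the main recognition head

def auxiliary_head(feats):
    return feats * 100    # stand-in for the auxiliary head

def forward(x):
    feats = bottleneck(main_network(x))            # shared trunk computed once
    return main_head(feats), auxiliary_head(feats)  # both heads see the same features
```

The key structural point is that both heads consume the same trunk features, which is what later makes it possible to distill the auxiliary head into the main head and then delete it.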
Step S320: for the trained initial training model, obtain a second main loss function corresponding to the main head in the recognition layer and a second auxiliary loss function corresponding to the auxiliary head in the recognition layer.
The second auxiliary loss function is the product of the second main loss function and a second weight coefficient. Optionally, the second weight coefficient may be set according to training requirements and may be, for example, 0.1 or 0.15.
Alternatively, the second weight coefficient may be the same as the first weight coefficient.
Step S330: update the trained initial training model through the second main loss function and the second auxiliary loss function to obtain the image recognition model.
The second main loss function and the second auxiliary loss function are summed to obtain a second total loss function, and the parameters of the trained initial training model are updated through the second total loss function to obtain the image recognition model. Illustratively, updating the intermediate network layers through the second total loss function can be understood as updating both the auxiliary head and the main head.
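A minimal parameter update with the second total loss can be sketched as follows. The squared-error main loss, the scalar model, the 0.1 weight coefficient, and the learning rate are all assumptions made for illustration:

```python
# Second total loss = main loss + weight * main loss, with a squared-error
# main loss assumed; one analytic gradient-descent step on a scalar parameter.

def second_total_loss(prediction, target, weight=0.1):
    """Main loss is squared error here (an assumption); the auxiliary loss
    is the main loss scaled by the second weight coefficient."""
    main_loss = (prediction - target) ** 2
    return main_loss + weight * main_loss

def sgd_step(param, x, target, weight=0.1, lr=0.1):
    """One descent step for the scalar model prediction = param * x,
    using the analytic gradient of the second total loss."""
    prediction = param * x
    grad = (1 + weight) * 2 * (prediction - target) * x
    return param - lr * grad
```

Because both loss terms depend on the same shared parameters, one step on the total loss updates the trunk feeding both the main head and the auxiliary head, matching the "updating both heads" reading above.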
Based on the same inventive concept, the present disclosure provides an image recognition apparatus 200, referring to fig. 9, the image recognition apparatus 200 includes:
the acquisition module 210 is configured to acquire image information acquired by the vehicle during a driving process;
the recognition module 220 is configured to input the image information into a target image recognition model and obtain an image recognition result output by a main head in the target image recognition model, where the target image recognition model is obtained by deleting an auxiliary head from a trained image recognition model, the trained image recognition model includes the auxiliary head and the main head, and the trained image recognition model is obtained by, when the image recognition model outputs an erroneous result, learning the sample data corresponding to the erroneous result through the auxiliary head and then transferring the features learned by the auxiliary head to the main head.
Optionally, the image recognition apparatus 200 further includes:
the first data acquisition module is used for acquiring first basic sample data and sample data corresponding to the error result, wherein the first basic sample data is sample data of which the image recognition model does not output the error result;
the first training module is used for training the auxiliary head through the first basic sample data and the sample data corresponding to the error result, and training the main head through the first basic sample data;
and the migration module is used for transferring the features learned by the auxiliary head to the main head to obtain the trained image recognition model.
Optionally, the migration module includes:
the first loss function acquisition module is used for acquiring a first main loss function corresponding to the main head, a first auxiliary loss function corresponding to the auxiliary head, and a distillation loss function corresponding to the image recognition model;
and the feature migration module is used for transferring the features learned by the auxiliary head to the main head through the first main loss function, the first auxiliary loss function, and the distillation loss function.
Optionally, the first auxiliary loss function is the product of the first main loss function and a first weight coefficient.
Optionally, the image recognition apparatus 200 further includes:
the second data acquisition module is used for acquiring second basic sample data;
and the second training module is used for training the initial training model through the second basic sample data to obtain the image recognition model.
Optionally, the second training module includes:
the initial training module is used for training the initial training model through the second basic sample data to obtain a trained initial training model;
the second loss function acquisition module is used for acquiring, for the trained initial training model, a second main loss function corresponding to the main head in the recognition layer and a second auxiliary loss function corresponding to the auxiliary head in the recognition layer;
and the updating module is used for updating the trained initial training model through the second main loss function and the second auxiliary loss function to obtain the image recognition model.
Optionally, the second auxiliary loss function is the product of the second main loss function and a second weight coefficient.
Optionally, the target image recognition model is a multi-task model for autonomous driving.
With respect to the image recognition apparatus 200 in the above-described embodiment, the specific manner in which the respective modules perform the operations has been described in detail in the embodiment regarding the method, and will not be described in detail here.
The present disclosure also provides a computer-readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement the steps of the image recognition method provided by the present disclosure.
Fig. 10 is a block diagram of an electronic device 600, according to an example embodiment. For example, the electronic device 600 may be a device on a vehicle, such as a hybrid vehicle, a non-hybrid vehicle, an electric vehicle, a fuel cell vehicle, or other type of vehicle. The vehicle may be an autonomous vehicle, a semi-autonomous vehicle, or a non-autonomous vehicle.
Referring to fig. 10, an electronic device 600 may include various subsystems, such as an infotainment system 610, a perception system 620, a decision control system 630, a drive system 640, and a computing platform 650. The electronic device 600 may include more or fewer subsystems, and each subsystem may include multiple components. In addition, the interconnections between the subsystems and between the components of the electronic device 600 may be implemented by wired or wireless means.
In some embodiments, the infotainment system 610 may include a communication system, an entertainment system, a navigation system, and the like.
The perception system 620 may include several types of sensors for sensing information of the environment surrounding the electronic device 600. For example, the sensing system 620 may include a global positioning system (which may be a GPS system, a beidou system, or other positioning system), an inertial measurement unit (inertial measurement unit, IMU), a lidar, millimeter wave radar, an ultrasonic radar, and a camera device.
Decision control system 630 may include a computing system, a vehicle controller, a steering system, a throttle, and a braking system.
The drive system 640 may include components that provide powered movement of the electronic device 600. In one embodiment, the drive system 640 may include an engine, an energy source, a transmission, and wheels. The engine may be one or a combination of an internal combustion engine, an electric motor, an air compression engine. The engine is capable of converting energy provided by the energy source into mechanical energy.
Some or all of the functionality of the electronic device 600 is controlled by the computing platform 650. The computing platform 650 may include at least one processor 651 and memory 652, the processor 651 may execute instructions 653 stored in the memory 652.
The processor 651 may be any conventional processor, such as a commercially available CPU. The processor may also include, for example, a graphics processing unit (GPU), a field-programmable gate array (FPGA), a system on chip (SoC), an application-specific integrated circuit (ASIC), or a combination thereof.
The memory 652 may be implemented by any type or combination of volatile or nonvolatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disk.
In addition to instructions 653, memory 652 may store data such as road maps, route information, vehicle location, direction, speed, and the like. The data stored by memory 652 may be used by computing platform 650.
In an embodiment of the present disclosure, the processor 651 may execute the instructions 653 to perform all or part of the steps of the image recognition method described above.
Fig. 11 shows a server connected to a vehicle, which receives image information photographed by the vehicle during traveling. With reference to FIG. 11, the server 1900 includes a processing component 1922 that further includes one or more processors and memory resources represented by memory 1932 for storing instructions, such as application programs, that can be executed by the processing component 1922. The application programs stored in memory 1932 may include one or more modules each corresponding to a set of instructions. Further, processing component 1922 is configured to execute instructions to perform the methods described above.
The server 1900 may also include a power component 1926 configured to perform power management of the server 1900, a wired or wireless network interface 1950 configured to connect the server 1900 to a network, and an input/output interface 1958. The server 1900 may operate based on an operating system stored in memory 1932.
In another exemplary embodiment, a computer program product is also provided, which comprises a computer program executable by a programmable apparatus, the computer program having code portions for performing the above-described image recognition method when executed by the programmable apparatus.
In summary, the image recognition method, apparatus, electronic device, and medium provided by the present disclosure acquire image information collected during the driving of a vehicle, input the image information into a target image recognition model, and obtain an image recognition result output by the main head of the target image recognition model. The target image recognition model is obtained as follows: when the main head of the image recognition model outputs an erroneous result during use, an auxiliary head is added to the image recognition model, the sample data corresponding to the erroneous result is learned through the auxiliary head, and the auxiliary head transfers the learned features to the main head to obtain a trained image recognition model; the auxiliary head is then deleted from the trained image recognition model to obtain the target image recognition model. In this method, the image recognition result is output by the main head of the target image recognition model. Compared with the original image recognition model, no new module is added to the target image recognition model, so solving the bad-case problem does not complicate the model: its structure remains simple, the difficulty of debugging and deploying the model is low, and subsequent image recognition is convenient. Moreover, because the model structure is simple, the model has relatively few parameters, and image recognition through the model is faster.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure. This application is intended to cover any adaptations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It is to be understood that the present disclosure is not limited to the precise arrangements and instrumentalities shown in the drawings, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (11)

1. An image recognition method, the method comprising:
acquiring image information acquired by a vehicle in the running process;
inputting the image information into a target image recognition model to obtain an image recognition result output by a main head in the target image recognition model, wherein the target image recognition model is obtained by deleting an auxiliary head from a trained image recognition model, the trained image recognition model comprises the auxiliary head and the main head, and the trained image recognition model is obtained by, in the case that the image recognition model outputs an erroneous result, learning sample data corresponding to the erroneous result through the auxiliary head and then transferring the features learned by the auxiliary head to the main head.
2. The method of claim 1, wherein the trained image recognition model is obtained by:
acquiring first basic sample data and sample data corresponding to the error result, wherein the first basic sample data is sample data of which the image recognition model does not output the error result;
training the auxiliary head through the first basic sample data and the sample data corresponding to the error result, and training the main head through the first basic sample data;
and transferring the features learned by the auxiliary head to the main head to obtain the trained image recognition model.
3. The method of claim 2, wherein the transferring the features learned by the auxiliary head to the main head comprises:
acquiring a first main loss function corresponding to the main head, a first auxiliary loss function corresponding to the auxiliary head, and a distillation loss function corresponding to the image recognition model;
and transferring the features learned by the auxiliary head to the main head through the first main loss function, the first auxiliary loss function, and the distillation loss function.
4. The method according to claim 3, wherein the first auxiliary loss function is the product of the first main loss function and a first weight coefficient.
5. A method according to any one of claims 1 to 3, further comprising:
acquiring second basic sample data;
and training an initial training model through the second basic sample data to obtain the image recognition model.
6. The method of claim 5, wherein training an initial training model with the second base sample data to obtain the image recognition model comprises:
training the initial training model through the second basic sample data to obtain a trained initial training model;
for the trained initial training model, acquiring a second main loss function corresponding to the main head in the recognition layer and a second auxiliary loss function corresponding to the auxiliary head in the recognition layer;
and updating the trained initial training model through the second main loss function and the second auxiliary loss function to obtain the image recognition model.
7. The method of claim 6, wherein the second auxiliary loss function is the product of the second main loss function and a second weight coefficient.
8. The method according to any one of claims 1-3, wherein the target image recognition model is a multi-task model for autonomous driving.
9. An image recognition apparatus, the apparatus comprising:
the acquisition module is used for acquiring image information acquired by the vehicle in the running process;
the recognition module is used for inputting the image information into a target image recognition model to obtain an image recognition result output by a main head in the target image recognition model, wherein the target image recognition model is obtained by deleting an auxiliary head from a trained image recognition model, the trained image recognition model comprises the auxiliary head and the main head, and the trained image recognition model is obtained by, in the case that the image recognition model outputs an erroneous result, learning the sample data corresponding to the erroneous result through the auxiliary head and then transferring the features learned by the auxiliary head to the main head.
10. An electronic device, comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to implement the steps of the method of any one of claims 1 to 8 when executing the instructions.
11. A computer readable storage medium having stored thereon computer program instructions, which when executed by a processor, implement the steps of the method of any of claims 1 to 8.
CN202311236132.XA 2023-09-22 2023-09-22 Image recognition method, device, electronic equipment and medium Pending CN117197765A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311236132.XA CN117197765A (en) 2023-09-22 2023-09-22 Image recognition method, device, electronic equipment and medium

Publications (1)

Publication Number Publication Date
CN117197765A true CN117197765A (en) 2023-12-08

Family

ID=88997792

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311236132.XA Pending CN117197765A (en) 2023-09-22 2023-09-22 Image recognition method, device, electronic equipment and medium


Similar Documents

Publication Publication Date Title
CN113642633B (en) Method, device, equipment and medium for classifying driving scene data
CN111723724B (en) Road surface obstacle recognition method and related device
CN110843768B (en) Automatic parking control method, device and equipment for automobile and storage medium
US20220388532A1 (en) Method for predicting a trajectory of an agent in a vicinity of a self-driving vehicle based on ranking
CN116626670B (en) Automatic driving model generation method and device, vehicle and storage medium
CN112712608B (en) System and method for collecting performance data by a vehicle
CN114595738A (en) Method for generating training data for recognition model and method for generating recognition model
US20210323565A1 (en) Vehicle control device, automated driving vehicle development system, vehicle control method, and program
CN117197765A (en) Image recognition method, device, electronic equipment and medium
CN116343174A (en) Target detection method, device, vehicle and storage medium
US11745766B2 (en) Unseen environment classification
CN112698578B (en) Training method of automatic driving model and related equipment
CN116385825B (en) Model joint training method and device and vehicle
CN114274980A (en) Trajectory control method, trajectory control device, vehicle and storage medium
CN115346288A (en) Simulation driving record acquisition method and system, electronic equipment and storage medium
CN113066124A (en) Neural network training method and related equipment
CN116108041B (en) Method and device for determining vehicle test data, vehicle and storage medium
CN117128976B (en) Method and device for acquiring road center line, vehicle and storage medium
CN113947774B (en) Lightweight vehicle target detection system
CN116070780B (en) Evaluation method and device of track prediction algorithm, medium and vehicle
CN116503482B (en) Vehicle position acquisition method and device and electronic equipment
RU2800694C2 (en) Method for predicting the trajectory of an agent near an unmanned vehicle based on the ranking
CN116563812B (en) Target detection method, target detection device, storage medium and vehicle
CN115471513B (en) Point cloud segmentation method and device
CN116659529B (en) Data detection method, device, vehicle and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination