CN115050001A - Image recognition method and device, electronic equipment and readable storage medium

Info

Publication number: CN115050001A
Application number: CN202210787001.XA
Authority: CN
Inventors: 杨赫; 胡佳高; 王飞; 徐梦龙; 汪真
Original assignee: Beijing Xiaomi Pinecone Electronic Co Ltd; Xiaomi Automobile Technology Co Ltd
Current assignee: Beijing Xiaomi Pinecone Electronic Co Ltd; Xiaomi Automobile Technology Co Ltd
Priority date: 2022-07-04
Filing date: 2022-07-04
Publication date: 2022-09-13

Abstract

The present disclosure relates to an image recognition method, an image recognition apparatus, an electronic device, and a readable storage medium. The method comprises: acquiring an image to be recognized, and inputting the image to be recognized into a pre-trained image recognition model to obtain an object in the image to be recognized output by the image recognition model. The image recognition model is obtained by training a preset training model according to sample images; the sample images are obtained by performing image transformation on an original sample image and a label of the original sample image a predetermined number of times, each image transformation yielding one sample image. Because the sample images obtained by transforming a single-frame original sample image resemble continuous image frames, an image recognition model trained on them has better detection stability than an image recognition model trained on randomly selected single-frame images.

Description

Image recognition method and device, electronic equipment and readable storage medium

Technical Field

The present disclosure relates to the field of automatic driving technologies, and in particular, to an image recognition method and apparatus, an electronic device, and a readable storage medium.

Background

An image perception algorithm is trained on a large number of sample images to optimize the performance of the model. During training, the sample images are randomly cropped and scaled, which greatly increases the diversity of the sample data and gives the model good recognition performance on single-frame images. However, because the sample images are randomly selected, the same sample image does not appear several times in succession, and the model's judgment of similar scenes fluctuates, so the model cannot output stable results for identical or similar scenes. The model is therefore unstable when detecting continuous images; in particular, when the automobile is parked, the model's detection results are inconsistent even though the image is essentially still.

In the related art, a common approach is to label consecutive frames of a video and train a video detection model that makes full use of the information in preceding and subsequent image frames. However, this approach requires labeling a large number of consecutive video frames, many of which show repeated scenes, which wastes labeling resources. In addition, because manual labeling standards vary, the labels on continuous images are themselves prone to instability, which aggravates the fluctuation of the model's detection results on continuous images.

Disclosure of Invention

To overcome the problems in the related art, the present disclosure provides an image recognition method, apparatus, electronic device, and readable storage medium.

According to a first aspect of the embodiments of the present disclosure, an image recognition method is provided, which includes acquiring an image to be recognized; inputting the image to be recognized into a pre-trained image recognition model to obtain an object in the image to be recognized output by the image recognition model; the image recognition model is obtained by training a preset training model according to a sample image; the sample image is obtained by performing image transformation on an original sample image and a label of the original sample image for a predetermined number of times, and one sample image is obtained by performing image transformation once.

Optionally, the image transformation comprises at least one of flipping, rotating, cropping, warping, scaling, noise, blurring, color transformation, erasing, and padding.

Optionally, the method further includes: repeatedly executing a training process of training the preset training model according to the sample images until a preset training model meeting a specified condition is obtained and used as the image recognition model; the training process comprises: acquiring the original sample image and the label of the original sample image; performing the image transformation on the original sample image and the label of the original sample image the predetermined number of times to obtain the predetermined number of sample images; inputting the predetermined number of sample images into the preset training model to obtain the predetermined number of recognition results; comparing the recognition results with the label of the original sample image to obtain the predetermined number of recognition errors; and adjusting the parameters of the preset training model according to the recognition errors.

Optionally, the step of acquiring the original sample image and the label of the original sample image includes: randomly selecting a single-frame image from a database as the original sample image.

Optionally, performing the image transformation on the original sample image and the label of the original sample image the predetermined number of times to obtain the predetermined number of sample images includes: performing the image transformation the predetermined number of times on the original sample image and the label of the original sample image through an image transformation model to obtain the predetermined number of sample images; the label of each sample image is either the label of the original sample image or the label of the original sample image after a displacement.

Optionally, the step of adjusting the parameters of the preset training model according to the recognition errors includes: obtaining an average value of the predetermined number of recognition errors; obtaining an adjustment gradient of the preset training model according to the average value; and adjusting the parameters of the preset training model according to the adjustment gradient.

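According to a second aspect of the embodiments of the present disclosure, there is provided an image recognition apparatus including: an acquisition module configured to acquire an image to be recognized; and a processing module configured to input the image to be recognized into a pre-trained image recognition model to obtain an object in the image to be recognized output by the image recognition model.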

According to a third aspect of the embodiments of the present disclosure, there is provided an electronic device including: a processor; and a memory for storing processor-executable instructions; wherein the processor is configured to execute the executable instructions to implement the steps of the image recognition method described previously.

According to a fourth aspect of embodiments of the present disclosure, there is provided a computer-readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement the steps of the image recognition method provided by the first aspect of the present disclosure.

The technical solutions provided by the embodiments of the present disclosure can have the following beneficial effects. An image to be recognized is acquired and input into a pre-trained image recognition model to obtain an object in the image to be recognized output by the image recognition model. The image recognition model is obtained by training a preset training model according to sample images; the sample images are obtained by performing image transformation on an original sample image and a label of the original sample image a predetermined number of times, each image transformation yielding one sample image. Because the sample images obtained by transforming a single-frame original sample image resemble continuous image frames, an image recognition model trained on them has better detection stability than an image recognition model trained on randomly selected single-frame images, and the waste of labeling resources caused by manually labeling a large number of continuous video frames is avoided.

It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure.

Fig. 1 is a schematic structural diagram of a computer system shown in an exemplary embodiment of the present disclosure.

Fig. 2 is a flowchart illustrating an image recognition method according to an exemplary embodiment of the present disclosure.

Fig. 3 is a flowchart illustrating another image recognition method according to an exemplary embodiment of the present disclosure.

Fig. 4 is a flowchart illustrating a training method of an image recognition model according to an exemplary embodiment of the present disclosure.

Fig. 5 is a block diagram illustrating an image recognition apparatus according to an exemplary embodiment.

Fig. 6 is a block diagram illustrating an apparatus for image recognition according to an exemplary embodiment.

Fig. 7 is a block diagram illustrating an apparatus for image recognition according to an exemplary embodiment.

Detailed Description

Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.

It is understood that "a plurality" in this disclosure means two or more, and other quantifiers are analogous. "And/or" describes the association relationship of the associated objects and means that three relationships are possible; for example, "A and/or B" may mean: A exists alone, A and B exist simultaneously, or B exists alone. The character "/" generally indicates that the former and latter associated objects are in an "or" relationship. The singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.

It is further to be understood that while operations are depicted in the drawings in a particular order, this is not to be understood as requiring that such operations be performed in the particular order shown or in serial order, or that all illustrated operations be performed, to achieve desirable results. In certain environments, multitasking and parallel processing may be advantageous.

It should be noted that all actions of acquiring signals, information or data in the present disclosure are performed under the premise of complying with the corresponding data protection regulation policy of the country of the location and obtaining the authorization given by the owner of the corresponding device.

Fig. 1 shows a schematic structural diagram of a computer system provided by an exemplary embodiment of the present disclosure, which includes a terminal 120 and a server 140.

The terminal 120 and the server 140 are connected to each other through a wired or wireless network.

The terminal 120 may include at least one of a smartphone, a laptop, a desktop, a tablet, a smart speaker, and a smart robot.

The terminal 120 includes a display; the display is used for displaying the image recognition result. The terminal 120 stores therein a first program; the first program is called and executed by the first processor to realize the image recognition method provided by the present disclosure.

The server 140 stores therein a second program that is called by the second processor to implement the image recognition method provided by the present disclosure.

The server may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as a cloud service, a cloud database, cloud computing, a cloud function, cloud storage, a network service, cloud communication, a middleware service, a domain name service, a security service, a CDN (Content Delivery Network), and a big data and artificial intelligence platform. The terminal may be, but is not limited to, a smart phone, a tablet computer, a laptop computer, a desktop computer, a smart speaker, a smart watch, and the like. The terminal and the server may be directly or indirectly connected through wired or wireless communication, and the disclosure is not limited thereto.

Fig. 2 is a flowchart illustrating an image recognition method according to an exemplary embodiment. The method is performed by a computer device, for example, a terminal or a server in the computer system shown in Fig. 1. As shown in Fig. 2, the image recognition method comprises the following steps:

In step S11, an image to be recognized is acquired.

For example, an image to be recognized in a related scene, such as an image in an automobile driving scene, which includes objects such as vehicles, pedestrians, traffic lights, and the like, may be acquired through an intelligent terminal such as a smart phone, a camera, or a tablet computer; the image to be recognized is used as an input image of a pre-trained image recognition model.

In step S12, the image to be recognized is input to the image recognition model trained in advance to obtain the object in the image to be recognized output by the image recognition model.

The image to be recognized acquired in the preceding step is input into the pre-trained image recognition model, which recognizes the objects in the image to be recognized, such as vehicles, pedestrians, and traffic lights.

The image recognition model is obtained by training a preset training model according to a sample image, the sample image is obtained by performing image transformation on an original sample image and a label of the original sample image for a predetermined number of times, and one sample image is obtained by image transformation each time; the label is a manual annotation or a machine annotation of the original sample image and is used for indicating the object category in the original sample image.

The sample images obtained by transforming a single-frame original sample image resemble continuous image frames. Compared with an image recognition model trained on randomly selected single-frame images, an image recognition model trained on such continuous-like frames has better detection stability, and the waste of labeling resources caused by manually labeling a large number of continuous video frames is avoided.

Referring to fig. 3, fig. 3 is a flowchart illustrating another image recognition method according to an exemplary embodiment of the disclosure. The method is performed by a computer device, for example, a terminal or a server in the computer system shown in fig. 1.

It should be noted that the image recognition method shown in fig. 3 is consistent with the embodiment of the image recognition method shown in fig. 2, and the parts that are not mentioned in fig. 3 may refer to the description of fig. 2, and are not described again here.

The image recognition method shown in fig. 3 includes the steps of:

In step S11, an image to be recognized is acquired.

For example, an image to be recognized in a related scene, such as an image in a driving scene of an automobile, which includes objects such as vehicles, pedestrians, and traffic lights, may be acquired through an intelligent terminal such as a smart phone, a camera, or a tablet computer; the image to be recognized is used as an input image of a pre-trained image recognition model.

The image to be recognized acquired in step S11 is input into the pre-trained image recognition model, which recognizes the objects in the image to be recognized, such as vehicles, pedestrians, and traffic lights.

The image recognition model is obtained by training a preset training model according to a sample image, the sample image is obtained by performing image transformation on an original sample image and a label of the original sample image for a predetermined number of times, and one sample image is obtained by image transformation each time; the label is a manual annotation or a machine annotation of the original sample image and is used for indicating the object type in the original sample image; the predetermined number of times may be 60, 80, or 100, etc., which is not limited by the present disclosure, and for example, when the predetermined number of times is 60, 60 sample images may be acquired.

The image transformation includes at least one of flipping, rotating, cropping, warping, scaling, noise, blurring, color transformation, erasing, and padding; typically, each sample image is produced by a single image transformation. When the original sample image is transformed, the original sample image and its label can be sent to the image transformation module together, so that a sample image is obtained. Because the image transformation is random, the same transformation must be applied to the original sample image and its label so that the two remain consistent. The sample image may share the same label as the original sample image, or the label of the sample image may be the label of the original sample image after a simple displacement.

A training process of training a preset training model according to the sample images is repeatedly executed until a preset training model meeting a specified condition is obtained and used as the image recognition model. The training process comprises the following steps:

In step S21, the original sample image and the label of the original sample image are acquired.

Illustratively, n original sample images can be randomly extracted from the database, where the value of n can be, but is not limited to, 32, 64, 256, or 512, and the value of n can be adjusted as the computing resources change, where each frame of original sample image corresponds to one label. In one embodiment, n single-frame images may be randomly selected from the database as the original sample image. The images in the database are obtained by acquiring images in related scenes through intelligent terminals such as a smart phone, a camera or a tablet personal computer, for example, images in a driving scene of an automobile, which comprise objects such as vehicles, pedestrians and traffic lights; the label is a manual annotation or a machine annotation of the original sample image and is used for indicating the object category in the original sample image.

In step S22, the original sample image and the label of the original sample image are subjected to image transformation a predetermined number of times, resulting in a predetermined number of sample images.

Because the n original sample images are randomly extracted, there is no guarantee that they were captured consecutively on the time axis; the original sample images are therefore subjected to image transformation. Taking n randomly extracted original sample images as an example, each of the n original sample images is subjected to image transformation the predetermined number of times, yielding n times the predetermined number of sample images in total. The predetermined number of times may be 60, 80, 100, etc., which is not limited by the present disclosure; for example, when the predetermined number of times is 60, 60 sample images are obtained from each original sample image.
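
The sample-count arithmetic in step S22 can be illustrated with a short sketch. It assumes a hypothetical in-memory `database` of (image, label) pairs and a placeholder `transform` function; neither name comes from the disclosure, and the values n = 4 and a predetermined number of 60 are chosen for illustration only.

```python
import random

def transform(image, label, rng):
    """Placeholder for one random image transformation (hypothetical).

    The "image" here is just an identifier, and the transformation only
    attaches a random brightness factor; the label is passed through unchanged.
    """
    brightness = rng.uniform(0.8, 1.2)
    return (image, brightness), label

def build_samples(database, n, k, seed=0):
    """Randomly draw n original sample images and transform each one k times."""
    rng = random.Random(seed)
    originals = rng.sample(database, n)  # random single frames, not consecutive in time
    groups = []
    for image, label in originals:
        # k transformations of the same original give k sample images
        groups.append([transform(image, label, rng) for _ in range(k)])
    return groups

# Hypothetical database of 100 labelled single-frame images.
database = [(f"frame_{i}", "vehicle") for i in range(100)]
groups = build_samples(database, n=4, k=60)
total = sum(len(group) for group in groups)
print(len(groups), total)  # 4 groups, 4 * 60 = 240 sample images
```

Keeping the samples grouped by their original image makes it straightforward to feed all transformed versions of one scene to the model together in the later steps.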

The image transformation includes at least one of flipping, rotating, cropping, warping, scaling, noise, blurring, color transformation, erasing, and padding; typically, each sample image is produced by a single image transformation. When the original sample image is transformed, the original sample image and its label can be sent to the image transformation module together, so that a sample image is obtained. Because the image transformation is random, the same transformation must be applied to the original sample image and its label so that the two remain consistent. The sample image may share the same label as the original sample image, or the label of the sample image may be the label of the original sample image after a simple displacement. Because the sample images obtained by transforming a single-frame original sample image resemble continuous image frames, the randomness of the original sample images is preserved, the continuity of the sample images is ensured, and the generalization performance of the image recognition model is maintained.
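
As a minimal sketch of keeping an image and its label consistent, the following example applies a horizontal flip and a small displacement to both a toy image and its box label. The box format `(x_min, y_min, x_max, y_max)` with exclusive upper bounds, and the function names, are assumptions made for illustration; the disclosure does not specify a label format or an implementation.

```python
def horizontal_flip(image, box):
    """Flip an image (list of pixel rows) left-to-right and mirror its box label."""
    width = len(image[0])
    flipped = [row[::-1] for row in image]
    x_min, y_min, x_max, y_max = box
    # Mirroring swaps the roles of the left and right box edges.
    return flipped, (width - x_max, y_min, width - x_min, y_max)

def shift_right(image, box, dx, fill=0):
    """Shift an image dx pixels to the right (dx >= 0), padding with `fill`,
    and displace the box label by the same amount, clipped to the image width."""
    width = len(image[0])
    shifted = [[fill] * dx + row[: width - dx] for row in image]
    x_min, y_min, x_max, y_max = box
    return shifted, (min(x_min + dx, width), y_min, min(x_max + dx, width), y_max)

# A 4x6 toy image whose "object" (value 1) occupies columns 1-2 of rows 1-2.
image = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
box = (1, 1, 3, 3)

flipped_image, flipped_box = horizontal_flip(image, box)
shifted_image, shifted_box = shift_right(image, box, dx=1)
print(flipped_box)  # (3, 1, 5, 3)
print(shifted_box)  # (2, 1, 4, 3)
```

In both cases the transformed box still encloses exactly the object pixels, which is the consistency requirement described above. Photometric transformations such as noise, blurring, or color changes would leave the label unchanged.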

In step S23, a predetermined number of sample images are input into a preset training model, and a predetermined number of recognition results are obtained.

The predetermined number of sample images obtained from a single-frame original sample image are input into the preset training model to obtain the predetermined number of recognition results; recognizing the predetermined number of sample images is equivalent to continuously sampling the scene of the original sample image the predetermined number of times. Each recognition result contains a label of the sample image that indicates the object category in the original sample image, such as a rectangular box identifying the vehicle area.

In the foregoing steps, a plurality of original sample images may be acquired at one time and each subjected to image transformation. In this step, however, the sample images input into the preset training model at one time are the sample images obtained from the same original sample image; alternatively, the sample images obtained from the same original sample image are recognized consecutively, and the sample images obtained by transforming other original sample images are recognized afterwards.

In step S24, the recognition result is compared with the label of the original sample image, resulting in a predetermined number of recognition errors.

The recognition result corresponds to a predicted value and the label of the original sample image corresponds to a reference value; taking the difference between the recognition result of each sample image and the label of the corresponding original sample image yields the predetermined number of recognition errors.
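
One way to picture the comparison in step S24 is sketched below. The error measure (mean absolute coordinate difference between a predicted box and the labelled box) and the sample values are assumptions chosen for illustration; the disclosure only states that each recognition result is differenced against the corresponding label.

```python
def recognition_error(predicted_box, label_box):
    """Mean absolute coordinate difference between a predicted and a labelled box."""
    return sum(abs(p - t) for p, t in zip(predicted_box, label_box)) / len(label_box)

# Label of the original sample image (hypothetical box coordinates).
label_box = (10.0, 20.0, 50.0, 60.0)

# Hypothetical recognition results for three transformed sample images.
predictions = [
    (10.0, 20.0, 50.0, 60.0),
    (12.0, 20.0, 52.0, 60.0),
    (9.0, 21.0, 49.0, 61.0),
]

errors = [recognition_error(p, label_box) for p in predictions]
print(errors)  # [0.0, 1.0, 1.0]
```

The list holds one error per sample image, so its length equals the predetermined number of transformations.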

In step S25, parameters of the preset training model are adjusted according to the recognition error.

Illustratively, the predetermined number of recognition errors may be accumulated and averaged, and the adjustment gradient of the preset training model is obtained from the average value; in one embodiment, the average value itself may be used as the adjustment gradient of the preset training model. The parameters of the preset training model are then adjusted according to the adjustment gradient.
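
The averaging step can be sketched with a deliberately tiny model. This example assumes a hypothetical one-parameter predictor `y = w * x`, a squared-difference error, and conventional gradient descent on the averaged error; that is only one possible reading of "obtaining the adjustment gradient according to the average value" and is not taken from the disclosure.

```python
def train_step(w, inputs, target, lr=0.1):
    """One parameter update from K transformed samples of one original image.

    Each recognition error is the squared difference between the prediction
    w * x and the shared label `target` (hypothetical model and error).
    """
    k = len(inputs)
    errors = [(w * x - target) ** 2 for x in inputs]
    mean_error = sum(errors) / k
    # Gradient of the mean error with respect to w, derived analytically.
    grad = sum(2.0 * (w * x - target) * x for x in inputs) / k
    return w - lr * grad, mean_error

# Three slightly different "views" of the same scene share one label.
inputs = [0.9, 1.0, 1.1]
w = 0.0
history = []
for _ in range(50):
    w, mean_error = train_step(w, inputs, target=2.0)
    history.append(mean_error)

print(history[0])  # 4.0, the mean error before any update
```

Because a single update is driven by the mean error over all transformed versions of one scene, the parameter is pushed toward a value that fits all of them at once rather than any one view.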

Each adjustment of the parameters brings the recognition results of the preset training model closer to the true values. The training process is repeated until the recognition error of the preset training model meets the specified condition, at which point the preset training model is used as the image recognition model. The specified condition may be that the recognition error is smaller than a specified threshold, or that the recognition error has stabilized and no longer decreases; at this point, the performance of the image recognition model is considered to have reached its optimum.
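
The two stopping conditions described above can be expressed as a small check. The function name, the default threshold, and the `patience` and `min_delta` parameters are hypothetical; the disclosure does not give concrete values or a specific plateau test.

```python
def meets_specified_condition(error_history, threshold=0.01, patience=3, min_delta=1e-4):
    """Return True when training may stop.

    Condition 1: the latest recognition error is below `threshold`.
    Condition 2: the best error of the last `patience` iterations improves on
    the best earlier error by no more than `min_delta` (the error has plateaued).
    """
    if not error_history:
        return False
    if error_history[-1] < threshold:
        return True
    if len(error_history) > patience:
        best_before = min(error_history[:-patience])
        best_recent = min(error_history[-patience:])
        return best_before - best_recent <= min_delta
    return False

print(meets_specified_condition([0.5, 0.2, 0.005]))          # True: below threshold
print(meets_specified_condition([0.5, 0.4, 0.3, 0.2]))       # False: still improving
print(meets_specified_condition([0.5, 0.3, 0.3, 0.3, 0.3]))  # True: plateaued
```

A training loop would call this check after each parameter adjustment and keep the current preset training model as the image recognition model once it returns True.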

In summary, the present disclosure provides an image recognition method, including: acquiring an image to be recognized, and inputting the image to be recognized into a pre-trained image recognition model to obtain an object in the image to be recognized output by the image recognition model. The image recognition model is obtained by training a preset training model according to sample images; the sample images are obtained by performing image transformation on an original sample image and a label of the original sample image a predetermined number of times, each image transformation yielding one sample image. Because the sample images obtained by transforming a single-frame original sample image resemble continuous image frames, an image recognition model trained on them has better detection stability than an image recognition model trained on randomly selected single-frame images, and the waste of labeling resources caused by manually labeling a large number of continuous video frames is avoided.

Fig. 4 is a block diagram illustrating an image recognition apparatus according to an exemplary embodiment. Referring to fig. 4, the apparatus 20 includes an acquisition module 201 and a processing module 203.

The acquisition module 201 is configured to acquire an image to be recognized;

the processing module 203 is configured to input the image to be recognized into a pre-trained image recognition model to obtain an object in the image to be recognized output by the image recognition model;

the image recognition model is obtained by training a preset training model according to a sample image; the sample image is obtained by performing image transformation on an original sample image and a label of the original sample image for a predetermined number of times, and one sample image is obtained by performing image transformation once.

Optionally, the processing module 203 is further configured to repeatedly execute a training process of training the preset training model according to the sample image to obtain the image recognition model, until the preset training model meeting a specified condition is obtained as the image recognition model;

the training process comprises:

acquiring the original sample image and a label of the original sample image;

performing the image transformation on the original sample image and the label of the original sample image the predetermined number of times to obtain the predetermined number of sample images;

inputting the predetermined number of sample images into the preset training model to obtain the predetermined number of recognition results;

comparing the recognition results with the label of the original sample image to obtain the predetermined number of recognition errors;

and adjusting the parameters of the preset training model according to the recognition error.

Optionally, the processing module 203 is further configured to randomly select a single frame image from a database as the original sample image.

Optionally, the processing module 203 is further configured to perform image transformation on the original sample image and the label of the original sample image for the predetermined number of times through an image transformation model to obtain a predetermined number of sample images; the label of the sample image is the label of the original sample image or the label of the original sample image after displacement change.

Optionally, the processing module 203 is further configured to obtain an average value of the predetermined number of identification errors;

obtaining the adjustment gradient of the preset training model according to the average value;

and adjusting the parameters of the preset training model according to the adjustment gradient.

With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.

The present disclosure also provides a computer readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement the steps of the image recognition method provided by the present disclosure.

Fig. 5 is a block diagram illustrating an apparatus for image recognition according to an example embodiment. For example, the apparatus 800 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, an exercise device, a personal digital assistant, and the like.

Referring to fig. 5, the apparatus 800 may include one or more of the following components: a first processing component 802, a first memory 804, a first power component 806, a multimedia component 808, an audio component 810, a first input/output interface 812, a sensor component 814, and a communication component 816.

The first processing component 802 generally controls overall operation of the device 800, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The first processing component 802 may include one or more processors 820 to execute instructions to perform all or a portion of the steps of the image recognition method described above. Further, the first processing component 802 can include one or more modules that facilitate interaction between the first processing component 802 and other components. For example, the first processing component 802 may include a multimedia module to facilitate interaction between the multimedia component 808 and the first processing component 802.

The first memory 804 is configured to store various types of data to support operations at the apparatus 800. Examples of such data include instructions for any application or method operating on device 800, contact data, phonebook data, messages, pictures, videos, and so forth. The first memory 804 may be implemented by any type or combination of volatile or non-volatile memory devices, such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks.

A first power supply component 806 provides power to the various components of the device 800. The first power component 806 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the device 800.

The multimedia component 808 includes a screen that provides an output interface between the device 800 and a user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive an input signal from a user. The touch panel includes one or more touch sensors to sense touch, slide, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 808 includes a front facing camera and/or a rear facing camera. The front camera and/or the rear camera may receive external multimedia data when the device 800 is in an operating mode, such as a shooting mode or a video mode. Each front camera and rear camera may be a fixed optical lens system or have a focal length and optical zoom capability.

The audio component 810 is configured to output and/or input audio signals. For example, the audio component 810 includes a Microphone (MIC) configured to receive external audio signals when the apparatus 800 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signal may further be stored in the first memory 804 or transmitted via the communication component 816. In some embodiments, audio component 810 also includes a speaker for outputting audio signals.

The first input/output interface 812 provides an interface between the first processing component 802 and a peripheral interface module, which may be a keyboard, click wheel, button, etc. These buttons may include, but are not limited to: a home button, a volume button, a start button, and a lock button.

The sensor assembly 814 includes one or more sensors for providing various aspects of state assessment for the device 800. For example, the sensor assembly 814 may detect the open/closed status of the device 800 and the relative positioning of components, such as a display and keypad of the device 800. The sensor assembly 814 may also detect a change in the position of the device 800 or a component of the device 800, the presence or absence of user contact with the device 800, the orientation or acceleration/deceleration of the device 800, and a change in the temperature of the device 800. The sensor assembly 814 may include a proximity sensor configured to detect the presence of a nearby object without any physical contact. The sensor assembly 814 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor assembly 814 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.

The communication component 816 is configured to facilitate communications between the apparatus 800 and other devices in a wired or wireless manner. The device 800 may access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof. In an exemplary embodiment, the communication component 816 receives a broadcast signal or broadcast-related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 816 further includes a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, Infrared Data Association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.

In an exemplary embodiment, the apparatus 800 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, micro-controllers, microprocessors or other electronic components for performing the image recognition methods described above.

In an exemplary embodiment, a non-transitory computer-readable storage medium comprising instructions is also provided, such as the first memory 804 comprising instructions, the instructions being executable by the processor 820 of the apparatus 800 to perform the image recognition method described above. For example, the non-transitory computer-readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.

In another exemplary embodiment, a computer program product is also provided, which comprises a computer program executable by a programmable apparatus, the computer program having code portions for performing the image recognition method described above when executed by the programmable apparatus.

Fig. 6 is a block diagram illustrating an apparatus 1900 for image recognition according to an example embodiment. For example, the apparatus 1900 may be provided as a server. Referring to FIG. 6, the apparatus 1900 includes a second processing component 1922, which further includes one or more processors, and memory resources represented by a second memory 1932 for storing instructions, e.g., application programs, executable by the second processing component 1922. The application programs stored in the second memory 1932 may include one or more modules that each correspond to a set of instructions. Further, the second processing component 1922 is configured to execute the instructions to perform the image recognition method described above.

The device 1900 may also include a second power component 1926 configured to perform power management of the device 1900, a wired or wireless network interface 1950 configured to connect the device 1900 to a network, and a second input/output interface 1958. The device 1900 may operate based on an operating system stored in the second memory 1932, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like.

Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.

It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims

1. An image recognition method, comprising:

acquiring an image to be recognized;

inputting the image to be recognized into a pre-trained image recognition model to obtain an object in the image to be recognized output by the image recognition model;

2. The method of claim 1, wherein the image transformation comprises at least one of flipping, rotating, cropping, morphing, scaling, noise, blurring, color transformation, erasing, and padding.

3. The method of claim 1 or 2, wherein the method further comprises:

repeatedly executing a training process that trains the preset training model according to the sample image, until a preset training model meeting a specified condition is obtained and taken as the image recognition model;

the training process comprises:

acquiring the original sample image and a label of the original sample image;

4. The method of claim 3, wherein the step of acquiring the original sample image and the label of the original sample image comprises:

randomly selecting a single-frame image from a database as the original sample image.

5. The method of claim 3, wherein performing the image transformation on the original sample image and the label of the original sample image the preset number of times to obtain the preset number of sample images comprises:

performing, through an image transformation model, the image transformation on the original sample image and the label of the original sample image the preset number of times to obtain the preset number of sample images, wherein the label of each sample image is either the label of the original sample image or the label of the original sample image after a displacement change.

6. The method of claim 3, wherein the step of adjusting the parameters of the preset training model according to the recognition error comprises:

obtaining an average value of the preset number of recognition errors;

7. An image recognition apparatus, comprising:

an acquisition module configured to acquire an image to be recognized;

a processing module configured to input the image to be recognized into a pre-trained image recognition model so as to obtain an object in the image to be recognized output by the image recognition model;

8. The apparatus of claim 7, wherein the image transformation comprises at least one of flipping, rotating, cropping, morphing, scaling, noise, blurring, color transformation, erasing, and padding.

9. An electronic device, comprising:

a processor;

a memory for storing processor-executable instructions;

wherein the processor is configured to execute the executable instructions to implement the steps of the method of any of claims 1-6.

10. A computer-readable storage medium, on which computer program instructions are stored, which program instructions, when executed by a processor, carry out the steps of the method according to any one of claims 1 to 6.
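
By way of non-limiting illustration only, the sample generation recited in claims 2 and 5 and the error averaging recited in claim 6 may be sketched as follows. The sketch is not the disclosed implementation. The bounding-box label format, the three transformation functions, the `box_error` metric, and the `model` callable are all hypothetical names introduced for this example. It also assumes that each sample image is derived directly from the original sample image rather than from the previous sample image. How the averaged error is then used to adjust the parameters of the preset training model is not shown.

```python
import random

# Assumed label format (hypothetical): an axis-aligned box
# (x_min, y_min, x_max, y_max) in pixel coordinates, x_max/y_max exclusive.
# Images are plain nested lists of grayscale values in [0, 255].

def flip_horizontal(image, box):
    """Mirror the image left-right; the box label is displaced with it."""
    width = len(image[0])
    x_min, y_min, x_max, y_max = box
    flipped = [row[::-1] for row in image]
    return flipped, (width - x_max, y_min, width - x_min, y_max)

def shift_right(image, box, pixels=1):
    """Translate content to the right with zero padding; the box shifts too."""
    width = len(image[0])
    x_min, y_min, x_max, y_max = box
    shifted = [[0] * pixels + row[:width - pixels] for row in image]
    return shifted, (min(x_min + pixels, width), y_min,
                     min(x_max + pixels, width), y_max)

def brighten(image, box, delta=10):
    """Color transformation: pixel values change, the label is unchanged."""
    return [[min(255, p + delta) for p in row] for row in image], box

TRANSFORMS = [flip_horizontal, shift_right, brighten]

def make_samples(image, box, preset_number, rng):
    """Transform the original image and label once per sample image."""
    return [rng.choice(TRANSFORMS)(image, box) for _ in range(preset_number)]

def box_error(predicted, target):
    """Toy recognition error: mean absolute difference of box coordinates."""
    return sum(abs(p - t) for p, t in zip(predicted, target)) / len(target)

def average_error(model, samples):
    """Average the per-sample recognition errors over all sample images."""
    errors = [box_error(model(img), label) for img, label in samples]
    return sum(errors) / len(errors)
```

In this sketch a geometric transformation such as flipping or shifting yields a displaced label, while a color transformation keeps the original label, matching the two label cases of claim 5.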