CN115984643A - Model training method, related device and storage medium - Google Patents

Model training method, related device and storage medium

Info

Publication number
CN115984643A
Authority
CN
China
Prior art keywords
candidate
confidence
image
sample
images
Prior art date
Legal status
Pending
Application number
CN202211603537.8A
Other languages
Chinese (zh)
Inventor
Name withheld at the inventor's request
Current Assignee
Beijing Real AI Technology Co Ltd
Original Assignee
Beijing Real AI Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Real AI Technology Co Ltd filed Critical Beijing Real AI Technology Co Ltd
Priority to CN202211603537.8A
Publication of CN115984643A
Legal status: Pending

Landscapes

  • Image Analysis (AREA)

Abstract

The embodiment of the application discloses a model training method, related equipment and a storage medium. The method comprises the following steps: acquiring a plurality of images to be identified that were historically collected by an image acquisition device in a preset place; identifying each image to be identified according to a preset security identification model to obtain a candidate confidence corresponding to each image to be identified; determining the images to be recognized corresponding to target confidences as candidate sample images, wherein a target confidence is a candidate confidence that is greater than a second confidence threshold and smaller than a first confidence threshold; acquiring real sample label information corresponding to each candidate sample image to obtain a plurality of target sample images, each carrying its corresponding real sample label information; and training the security identification model according to the plurality of target sample images to obtain the trained security identification model. By implementing the method of the embodiment of the application, the generalization capability of the model can be improved.

Description

Model training method, related device and storage medium
Technical Field
The present application relates to the field of artificial intelligence technologies, and in particular, to a model training method, a related device, and a storage medium.
Background
Driven by both policy and technology, security monitoring with Artificial Intelligence (AI) has developed rapidly. In recent years in particular, demand has grown steadily from both enterprise (B-end) security and the consumer (C-end) personal security market in the civil security field, and monitoring devices in more and more places deploy AI security models, for example, automatic monitoring of personnel or equipment violations in factories, ports and parks, and monitoring of fire, intrusion into forbidden areas, compliance with standards and other conditions in special scenes.
Algorithms in security scenes are mainly based on target detection and target classification. Data in real monitored scenes are highly varied, while a training sample set generally covers only a limited range of scenes. Moreover, most scene data are confidential: developers find it difficult to obtain full-category data during early algorithm model development, and much scene data resides on intranets and is hard to export. This lack of training data makes AI security models inaccurate.
Disclosure of Invention
The embodiments of the application provide a model training method, related equipment and a storage medium, which can improve the accuracy of a model and improve its generalization capability online.
In a first aspect, an embodiment of the present application provides a model training method, which includes:
acquiring a plurality of images to be identified which are historically acquired by an image acquisition device in a preset place;
respectively identifying each image to be identified according to a preset security identification model to obtain a candidate confidence corresponding to each image to be identified;
determining an image to be recognized corresponding to a target confidence as a candidate sample image, wherein the target confidence is a candidate confidence that is greater than a second confidence threshold and smaller than a first confidence threshold, the first confidence threshold is the confidence threshold currently set by the security recognition model, and the second confidence threshold is smaller than the first confidence threshold;
acquiring real sample label information corresponding to each candidate sample image to obtain a plurality of target sample images, wherein the target sample images carry the corresponding real sample label information;
and training the security identification model according to the plurality of target sample images to obtain the trained security identification model.
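For illustration only, the following is a minimal sketch of the claimed flow in Python. The model is assumed to expose a predict_confidence(image) method returning a float; fetch_images, request_labels and retrain are hypothetical callables supplied by the caller, and the default thresholds 0.4 and 0.5 are example values; none of these names or numbers are prescribed by the application.

```python
# A minimal sketch of the claimed flow, under the assumptions stated above.
def online_update(model, fetch_images, request_labels, retrain,
                  t_low=0.4, t_high=0.5):
    images = fetch_images()                                             # acquire
    scored = [(img, model.predict_confidence(img)) for img in images]   # identify
    candidates = [img for img, c in scored if t_low < c < t_high]       # screen
    targets = request_labels(candidates)    # a human supplies true sample labels
    return retrain(model, targets)          # the trained security model
```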
In a second aspect, an embodiment of the present application further provides a model training apparatus, which includes:
the receiving and sending module is used for acquiring a plurality of images to be identified which are historically acquired by the image acquisition device in a preset place;
the processing module is used for respectively recognizing each image to be recognized according to a preset security recognition model to obtain a candidate confidence corresponding to each image to be recognized; and determining an image to be recognized corresponding to a target confidence as a candidate sample image, wherein the target confidence is a candidate confidence that is greater than a second confidence threshold and smaller than a first confidence threshold, the first confidence threshold is the confidence threshold currently set by the security recognition model, and the second confidence threshold is smaller than the first confidence threshold;
the receiving and sending module is further configured to obtain real sample label information corresponding to each candidate sample image, so as to obtain a plurality of target sample images, where the target sample images carry the corresponding real sample label information;
the processing module is further configured to train the security recognition model according to the plurality of target sample images to obtain the trained security recognition model.
In some embodiments, before the transceiver module performs the step of acquiring a plurality of images to be identified historically acquired by the image acquisition device in a preset location, the processing module is further configured to:
determining a target frame extraction frequency according to the current scene of the image acquisition device and a preset frame extraction strategy;
dividing a preset computing power resource into a first computing power resource and a second computing power resource according to a preset computing power distribution rule and the target frame extraction frequency, wherein the first computing power resource is used for respectively identifying the images to be identified, and the second computing power resource is used for training the security identification model;
at this time, when the step of obtaining a plurality of images to be identified historically collected by the image collection device in a preset place is executed, the transceiver module is specifically configured to:
and acquiring a plurality of images to be identified historically acquired by the image acquisition device in the preset place according to the target frame extraction frequency.
In some embodiments, when the step of training the security recognition model according to the plurality of target sample images to obtain the trained security recognition model is executed, the processing module is specifically configured to:
respectively dividing the target sample images into a training sample set and a verification sample set according to a preset sample distribution proportion;
performing model parameter training on the security identification model according to the training sample set to obtain an intermediate security identification model;
and adjusting the first confidence threshold according to the verification sample set based on the intermediate security identification model to obtain the trained security identification model.
In some embodiments, when executing the step of adjusting the first confidence threshold according to the verification sample set based on the intermediate security identification model, the processing module is specifically configured to:
based on the intermediate security identification model, determining a recall rate and a false alarm rate corresponding to each of a plurality of candidate confidence thresholds according to the verification sample set;
selecting a target confidence threshold from the candidate confidence thresholds meeting a preset standard according to the recall rate and the false alarm rate, and replacing the first confidence threshold with the target confidence threshold, wherein the preset standard is that the recall rate is higher than a preset recall rate threshold and the false alarm rate is lower than a preset false alarm rate threshold.
In some embodiments, before executing the step of determining, based on the intermediate security identification model, a recall rate and a false alarm rate corresponding to each of the plurality of candidate confidence thresholds according to the verification sample set, the processing module is further configured to:
receiving, by the transceiver module, the candidate confidence threshold input by a user.
In some embodiments, before the transceiver module performs the step of obtaining the real sample label information corresponding to each candidate sample image to obtain a plurality of target sample images, the processing module is further configured to:
clustering the candidate sample images to obtain a plurality of clustering groups;
respectively filtering the candidate sample images in each clustering group according to a preset screening rule to obtain filtered candidate sample images;
when the step of obtaining the real sample label information corresponding to each candidate sample image to obtain a plurality of target sample images is executed, the transceiver module is specifically configured to:
and acquiring real sample label information corresponding to each filtered candidate sample image to obtain a plurality of target sample images.
In some embodiments, after the step of performing recognition processing on each image to be recognized according to a preset security recognition model to obtain candidate confidence degrees corresponding to each image to be recognized, the processing module is further configured to:
determining the image to be recognized whose candidate confidence is greater than or equal to the first confidence as an abnormal image;
and outputting the abnormal image.
In some embodiments, when the step of obtaining the real sample label information corresponding to each candidate sample image to obtain the plurality of target sample images is executed by the transceiver module, the transceiver module is specifically configured to:
displaying, by the processing module, each of the candidate sample images in a display interface;
receiving real sample label information sent by a user for each candidate sample image through the display interface;
and adding corresponding real sample label information to each candidate sample image through the processing module to obtain a plurality of target sample images.
In a third aspect, an embodiment of the present application further provides a computer device, which includes a memory and a processor, where the memory stores a computer program, and the processor implements the above method when executing the computer program.
In a fourth aspect, the present application also provides a computer-readable storage medium, in which a computer program is stored, the computer program including program instructions, which when executed by a processor, implement the above method.
Compared with the prior art, in the scheme provided by the embodiments of the application, on one hand, monitoring images (images to be recognized) whose confidence is greater than the second confidence threshold and smaller than the first confidence threshold (the confidence threshold currently set by the model) can be screened out in the real monitoring scene (the preset place) of the image acquisition device as candidate sample images. Because the confidence of a candidate sample image is close to the first confidence threshold, the obtained candidate sample images are likely to include abnormal images that were not filtered out by the first confidence threshold. Therefore, when training the model, real sample label information (such as an abnormal label or a normal label) can be added to these candidate sample images to obtain target sample images, and the model is further trained with the target sample images. The application can thus obtain training samples from a real monitoring scene and train the model accordingly, improving the accuracy of the model. On the other hand, the target sample images in this embodiment are obtained while the model is working, and the model can be trained while it is working, so the scheme can improve the generalization capability of the model online.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings needed in the description of the embodiments are briefly introduced below. The drawings in the following description show some embodiments of the present application; for those skilled in the art, other drawings can be obtained from these drawings without creative effort.
Fig. 1 is a schematic view of an application scenario of a model training method provided in an embodiment of the present application;
FIG. 2 is a schematic flow chart illustrating a model training method according to an embodiment of the present application;
FIG. 3 is a schematic block diagram of a model training apparatus provided in an embodiment of the present application;
FIG. 4 is a schematic diagram of a server according to an embodiment of the present application;
fig. 5 is a schematic structural diagram of a terminal in an embodiment of the present application;
fig. 6 is a schematic structural diagram of a server in an embodiment of the present application.
Detailed Description
The terms "first," "second," and the like in the description and claims of the embodiments of the application and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It will be appreciated that the data so used may be interchanged under appropriate circumstances such that the embodiments described herein may be practiced otherwise than as specifically illustrated or described herein. Furthermore, the terms "include" and "have", and any variations thereof, are intended to cover non-exclusive inclusions, such that a process, method, system, article, or apparatus that includes a list of steps or modules is not necessarily limited to those explicitly listed, but may include other steps or modules not explicitly listed or inherent to such process, method, article, or apparatus, such that partitioning of the modules as presented in an embodiment of the present application is merely a logical partitioning, and may be implemented in practice in other ways, such that multiple modules may be combined or integrated into another system, or some features may be omitted, or not implemented, and such that shown or discussed couplings or direct couplings or communicative connections between modules may be through interfaces, and such that indirect couplings or communicative connections between modules may be electrical or other similar forms, none of which are limiting in the present embodiment. Moreover, the modules or sub-modules described as separate components may or may not be physically separated, may or may not be physical modules, or may be distributed in a plurality of circuit modules, and some or all of the modules may be selected according to actual needs to implement the purpose of the embodiments of the present application.
An execution subject of the model training method may be the model training device provided in the embodiment of the present application, or a computer device integrated with the model training device, or a model training system including the model training device and an image acquisition device, where the model training device may be implemented in a hardware or software manner, and the computer device may be a terminal or a server.
When the computer device is a server, the server may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud service, a cloud database, cloud computing, a cloud function, cloud storage, network service, cloud communication, middleware service, domain name service, security service, CDN, and a big data and artificial intelligence platform.
When the computer device is a terminal, the terminal may include: smart terminals carrying multimedia data processing functions (e.g., video data playing and music data playing), such as smart phones, tablet computers, notebook computers, desktop computers, smart televisions, smart speakers, Personal Digital Assistants (PDAs), smart watches, etc., but is not limited thereto.
The scheme of the embodiment of the application can be realized based on an artificial intelligence technology, and particularly relates to the technical field of computer vision in the artificial intelligence technology and the fields of cloud computing, cloud storage, databases and the like in the cloud technology, which are respectively introduced below.
Artificial Intelligence (AI) is a theory, method, technique and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge and use knowledge to obtain optimal results. In other words, artificial intelligence is a comprehensive branch of computer science that attempts to understand the essence of intelligence and produce a new kind of intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that machines have the functions of perception, reasoning and decision-making.
Artificial intelligence technology is a comprehensive discipline covering a wide range of fields, including both hardware-level and software-level technology. Artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems and mechatronics. Artificial intelligence software technology mainly comprises computer vision technology, speech processing technology, natural language processing technology and machine learning/deep learning.
Computer Vision (CV) is a science that studies how to make a machine "see": it uses cameras and computers instead of human eyes to identify, track and measure targets, and further performs graphics processing so that the processed image is more suitable for human observation or for transmission to an instrument for detection. As a scientific discipline, computer vision studies related theories and techniques in an attempt to build artificial intelligence systems that can capture information from images or multidimensional data. Computer vision technology generally includes image processing, face recognition, image semantic understanding, image retrieval, OCR, video processing, video semantic understanding, video content/behavior recognition, three-dimensional object reconstruction, 3D technology, virtual reality, augmented reality, and simultaneous localization and mapping, and also includes common biometric technologies such as face recognition and fingerprint recognition.
With the research and progress of artificial intelligence technology, the artificial intelligence technology is developed and applied in a plurality of fields, such as common smart homes, smart wearable devices, virtual assistants, smart speakers, smart marketing, unmanned driving, automatic driving, unmanned aerial vehicles, robots, smart medical care, smart customer service, and the like.
The scheme of the embodiment of the application can be realized based on a cloud technology, particularly relates to the technical fields of cloud computing, cloud storage, databases and the like in the cloud technology, and is respectively introduced below.
Cloud technology refers to a hosting technology that unifies a series of resources such as hardware, software and network in a wide area network or a local area network to realize computation, storage, processing and sharing of data. It is a general term for the network technology, information technology, integration technology, management platform technology, application technology and the like applied in the cloud computing business model; it can form a resource pool that is used on demand, flexibly and conveniently. Cloud computing technology will become an important support. Background services of technical network systems require large amounts of computing and storage resources, for example video websites, image websites and many portal websites. With the development of the internet industry, each article may carry its own identification mark that needs to be transmitted to a background system for logical processing; data of different levels are processed separately, and all kinds of industry data need strong system background support, which can only be realized through cloud computing. In the embodiment of the application, the identification result can be stored through cloud technology.
A distributed cloud storage system (hereinafter, the storage system) is a storage system that aggregates a large number of storage devices of various types (storage devices are also referred to as storage nodes) in a network, through application software or application interfaces, to work cooperatively by means of cluster applications, grid technology, distributed storage file systems and similar functions, and that provides data storage and service access functions externally. In the embodiment of the application, information such as network configuration can be stored in the storage system so that the server can conveniently retrieve it.
At present, the storage method of such a storage system is as follows: logical volumes are created, and when a logical volume is created it is allocated physical storage space, which may be composed of the disks of one or several storage devices. A client stores data on a logical volume, that is, on a file system; the file system divides the data into a plurality of parts, each part being an object that contains not only the data but also additional information such as a data identifier (ID). The file system writes each object into a physical storage space of the logical volume and records the storage location information of each object, so that when the client requests access to the data, the file system can let the client access it according to the storage location information of each object.
The process by which the storage system allocates physical storage space to the logical volume is specifically as follows: physical storage space is divided in advance into stripes according to a set of capacity estimates for the objects to be stored in the logical volume (the estimates often leave a large margin relative to the capacity of the actual objects to be stored) and according to the Redundant Array of Independent Disks (RAID) configuration; one logical volume can be understood as one stripe, whereby physical storage space is allocated to the logical volume.
The Database (Database), which can be regarded as an electronic file cabinet in short, is a place for storing electronic files, and a user can add, query, update, delete, etc. data in the files. A "database" is a collection of data that is stored together in a manner that can be shared by multiple users, has as little redundancy as possible, and is independent of the application.
A Database Management System (DBMS) is computer software designed for managing databases and generally has basic functions such as storage, retrieval, security assurance and backup. Database management systems may be classified according to the database model they support, such as relational or XML (Extensible Markup Language); according to the type of computer supported, e.g., server cluster or mobile phone; according to the query language used, such as SQL (Structured Query Language) or XQuery; according to performance emphasis, e.g., maximum size or maximum operating speed; or by other classification schemes. Regardless of the classification used, some DBMSs span categories, for example supporting multiple query languages simultaneously. In the embodiment of the application, the identification result can be stored in the database management system so that the server can conveniently retrieve it.
It should be noted that the terminal according to the embodiments of the present application may be a device providing voice and/or data connectivity to a user, a handheld device having a wireless connection function, or another processing device connected to a wireless modem, such as a mobile telephone (or "cellular" telephone) or a computer with a mobile terminal, for example a portable, pocket, handheld, built-in or vehicle-mounted mobile device that exchanges voice and/or data with a radio access network. Examples of such devices include Personal Communication Service (PCS) phones, cordless phones, Session Initiation Protocol (SIP) phones, Wireless Local Loop (WLL) stations and Personal Digital Assistants (PDAs).
Referring to fig. 1, fig. 1 is a schematic view of an application scenario of a model training method according to an embodiment of the present application. The model training method is applied to the model training system in fig. 1, which comprises model training devices and image acquisition devices; each model training device can communicate with one or more image acquisition devices. Each image acquisition device is arranged in a preset place, where the preset place is a place with high confidentiality requirements, for example a scene for monitoring personnel violations and/or equipment violations in a specific place (a factory, a port or a park). Each image acquisition device is responsible for acquiring images in the same or different preset places. The model training device and the image acquisition devices can be deployed together or separately; the embodiment of the application is not limited in this respect and takes separate deployment as an example.
Hereinafter, the technical solutions of the embodiments of the present application will be described in detail.
Referring to fig. 2, a model training method provided in the embodiment of the present application is described below, and as shown in fig. 2, the method includes the following steps S110 to S150.
S110, the model training device obtains a plurality of images to be recognized which are historically collected by the image collecting device in a preset place.
In this embodiment, the image acquisition device is installed in a preset place and is responsible for security monitoring of that place, where the preset place may be a place with high confidentiality requirements, for example, a scene for monitoring personnel and/or equipment violations in a specific place (a factory, a port or a park). In this embodiment, the model training device needs to extract images to be recognized from the image acquisition device and use the extracted images for subsequent model training and security recognition.
In some embodiments, to use computing resources reasonably, the frequency at which the model training apparatus extracts images to be recognized needs to be adjusted dynamically. In this case, before extracting images to be recognized, the current frame extraction frequency must be determined, and the method further includes: determining a target frame extraction frequency according to the current scene of the image acquisition device and a preset frame extraction strategy. Step S110 then includes: acquiring, according to the target frame extraction frequency, a plurality of images to be recognized historically collected by the image acquisition device in the preset place.
That is, the scene condition of the image capturing apparatus's current scene in the preset location is determined first. For example, suppose the preset frame extraction policy defines an idle scene from midnight to 5 a.m. (scene changes at night are small) and treats other time periods as busy scenes, with the idle scene corresponding to a first frame extraction frequency and the busy scene to a second frame extraction frequency, the second being greater than the first. If the current scene is idle, the first frame extraction frequency is determined as the target frame extraction frequency according to the frame extraction policy; if the current scene is busy, the second frame extraction frequency is determined as the target frame extraction frequency.
Further, if no monitoring target (e.g., a person) has been present in the current scene for longer than a preset duration (e.g., 10 minutes), frame extraction for the current scene is stopped according to the frame extraction strategy; when a monitoring target reappears in the scene, the frame extraction function is woken up and frames are extracted at the frequency corresponding to the current scene.
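A minimal sketch of this frame extraction policy follows: a low rate in the idle scene, a high rate in the busy scene, and a pause once no monitoring target has been seen for the preset duration. The frame rates are illustrative assumptions; the idle hours and the 10-minute window follow the examples above but are not fixed values of the application.

```python
# Sketch of the frame-extraction policy described above, under the stated assumptions.
from datetime import datetime, timedelta

IDLE_FPS, BUSY_FPS = 0.2, 2.0            # assumed frame-extraction frequencies
IDLE_HOURS = set(range(0, 5))            # midnight-5 a.m., per the example above
SLEEP_AFTER = timedelta(minutes=10)      # stop extracting after 10 idle minutes

def target_fps(now: datetime, last_target_seen: datetime) -> float:
    """Return 0.0 to pause extraction, otherwise the current scene's rate."""
    if now - last_target_seen > SLEEP_AFTER:
        return 0.0                       # sleep until a target reappears
    return IDLE_FPS if now.hour in IDLE_HOURS else BUSY_FPS
```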
In this embodiment, computing power resources are fixed. To allocate them reasonably between image recognition and model training, after the current target frame extraction frequency is determined the method further includes: dividing the preset computing power resources into first computing power resources and second computing power resources according to a preset computing power allocation rule and the target frame extraction frequency, where the first computing power resources are used for recognizing the images to be recognized and the second computing power resources are used for training the security recognition model. The smaller the target frame extraction frequency, the smaller the volume of images to recognize and the more computing power can be allocated to model training; that is, the smaller the target frame extraction frequency, the fewer the first computing power resources and the more the second computing power resources, while the larger the target frame extraction frequency, the more the first computing power resources and the fewer the second. If the second computing power resources are 0, training of the security recognition model is stopped.
It should be noted that, to avoid wasting computing power resources, the first and second computing power resources are divided from the preset computing power resources only when a model training instruction is received. In this case, dividing the preset computing power resources into first and second computing power resources according to the preset computing power allocation rule and the target frame extraction frequency includes: when a model training instruction is received, dividing the preset computing power resources into first computing power resources and second computing power resources according to the preset computing power allocation rule and the target frame extraction frequency.
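A minimal sketch of the computing power split follows, assuming a simple linear allocation rule and a MAX_FPS constant; the application only fixes the direction of the relationship (more extraction, more recognition compute, less training compute), so the exact rule here is an assumption.

```python
# Sketch of the compute split between recognition and training, assumed linear.
MAX_FPS = 2.0   # frequency at which recognition takes the whole budget (assumed)

def split_compute(total: float, fps: float) -> tuple[float, float]:
    """Return (first_resource, second_resource) shares of the compute budget."""
    first = total * min(fps / MAX_FPS, 1.0)   # recognition share grows with fps
    second = total - first                    # 0.0 here means: suspend training
    return first, second
```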
And S120, respectively identifying each image to be identified by the model training device according to a preset security identification model to obtain a candidate confidence corresponding to each image to be identified.
The security recognition model in this embodiment is arranged in the model training device. The model training device can recognize the acquired images to be recognized through the built-in security recognition model and train the built-in model according to the recognition results, so as to improve the generalization capability of the model.
In this embodiment, after the images to be recognized are obtained from the image acquisition device, each is recognized according to the preset security recognition model to obtain its candidate confidence, where the candidate confidence indicates the likelihood that the corresponding image contains a violation: the higher the confidence value, the higher the likelihood that a violation exists.
In some embodiments, after the candidate confidence of each image to be recognized is obtained, images whose candidate confidence is greater than or equal to the first confidence are determined as abnormal images, and the abnormal images are output. The first confidence is the confidence corresponding to abnormal images, where an abnormal image is an image that includes at least one abnormal behavior (for example, a person performing an illegal operation, a fire occurring in the place, and the like).
For example, when the first confidence is 0.5, images to be recognized whose candidate confidence is greater than or equal to 0.5 are determined as abnormal images, and the corresponding abnormal images are output and can further be displayed on a large screen, where each abnormal image includes an abnormal-position identifier (e.g., the abnormal position framed by a box) and the corresponding confidence, so that monitoring personnel can conveniently read the abnormal information in the abnormal image.
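A minimal sketch of this alarm path follows, assuming detections arrive as (image, box, score) tuples; the application only requires that abnormal images be output together with the abnormal-position identifier and confidence, so this input format is an assumption.

```python
# Sketch of the alarm path for images at or above the first confidence.
FIRST_CONFIDENCE = 0.5   # example value from the text

def flag_abnormal(detections, threshold=FIRST_CONFIDENCE):
    """Yield (image, box, score) for every detection at or above the threshold."""
    for image, box, score in detections:
        if score >= threshold:
            yield image, box, score   # e.g., draw `box` and overlay `score` on screen
```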
In this embodiment, the candidate sample images obtained online are used for subsequent training, so the model can be trained while it is working; through this scheme, the generalization capability of the model can therefore be improved online.
S130, the model training device determines the image to be recognized corresponding to the target confidence degree as a candidate sample image.
The target confidence is a confidence which is greater than a second confidence threshold and smaller than a first confidence threshold in the candidate confidences, the first confidence threshold is a confidence threshold which is currently set by the security identification model, the second confidence threshold is smaller than the first confidence threshold, and specifically, the first confidence threshold and the second confidence threshold are both confidence thresholds corresponding to abnormal images.
In this embodiment, specifically, an image to be recognized whose candidate confidence lies between the second confidence threshold and the first confidence threshold is taken as a candidate sample image. For example, when the second confidence threshold is 0.4 and the first confidence threshold is 0.5, images to be recognized with a confidence between 0.4 and 0.5 are screened out as candidate sample images.
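A minimal sketch of this screening step with the example thresholds above (second threshold 0.4, first threshold 0.5), assuming confidences are floats in [0, 1]:

```python
# Sketch of the candidate-sample screening band described above.
def select_candidates(scored_images, t_second=0.4, t_first=0.5):
    """Keep images whose confidence lies strictly between the two thresholds."""
    return [img for img, conf in scored_images if t_second < conf < t_first]

# select_candidates([("a.jpg", 0.45), ("b.jpg", 0.80), ("c.jpg", 0.10)])
# returns ["a.jpg"]: only the near-threshold image becomes a candidate sample.
```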
It should be noted that, in some embodiments, to save computing effort the model training device does not need to train the security recognition model in real time; it trains the security model only after receiving a model training instruction, which is input by a user or triggered automatically and periodically. In this case, step S130 includes: when a model training instruction is received, determining the images to be recognized corresponding to the target confidences as candidate sample images, where the images to be recognized are images acquired by the model training device within a preset historical time period whose length may be set according to actual needs (for example, the previous week) and is not limited here.
S140, the model training device obtains real sample label information corresponding to each candidate sample image to obtain a plurality of target sample images.
The target sample images carry their corresponding real sample label information, which indicates whether a target sample image was acquired under an abnormal condition or in an abnormal behavior scene, so that the target sample images carrying real sample label information can be used to further train the model.
In some embodiments, in particular, step S140 comprises: displaying each candidate sample image in a display interface; receiving real sample label information sent by a user for each candidate sample image through the display interface; and respectively adding corresponding real sample label information to each candidate sample image to obtain a plurality of target sample images.
The candidate sample image displayed in the display interface includes a plurality of candidate identification results and a confidence corresponding to each candidate identification result. If the candidate identification results include the result corresponding to the real sample label information, the user directly selects that candidate identification result in the interface as the real identification result of the candidate sample image, thereby obtaining a target sample image including the real sample label information; if they do not, the user inputs the real sample label information of the candidate sample image in the interface to obtain the target sample image. In this case, obtaining the real sample label information corresponding to each candidate sample image includes: obtaining it through a semi-automatic labeling method. In this embodiment, the user can therefore determine the real sample label information from the candidate identification results displayed on the interface, which realizes semi-automatic labeling of samples and improves labeling efficiency.
Specifically, when the model training device is deployed on a server (a cloud server or a physical server), the server provides a display interface for the candidate sample images through a user terminal. The user determines the real sample label information corresponding to a candidate sample image through the display interface, the user terminal obtains this real sample label information and sends it to the server, and the server thereby obtains the real sample label information of the candidate sample image. In other embodiments, when the model training apparatus is deployed on the user terminal side, the candidate sample images are displayed directly in a display interface provided by the user terminal, and the user inputs or selects the real sample label information of each candidate sample image in that interface.
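A hypothetical sketch of this semi-automatic labeling flow follows: the interface shows the model's candidate identification results, and the user either picks one of them or types a new label. The callables show and ask_user stand in for the display interface and are not names from the application.

```python
# Sketch of semi-automatic labeling, under the stated assumptions.
def label_sample(image, candidate_results, show, ask_user):
    show(image, candidate_results)          # display image with model guesses
    choice = ask_user()                     # an index into the guesses, or free text
    if isinstance(choice, int):
        label = candidate_results[choice]   # user picked a candidate result
    else:
        label = choice                      # user typed the true label
    return image, label                     # one target sample image with its label
```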
In some embodiments, the number of screened candidate sample images may be large and include many similar images. To improve training efficiency and reduce the number of samples to be labeled, this embodiment screens a small amount of representative valid data out of the large number of similar samples; in this case, the method further includes: clustering the candidate sample images to obtain a plurality of cluster groups; and filtering the candidate sample images in each cluster group according to a preset screening rule to obtain filtered candidate sample images.
The preset screening rule is to select target candidate sample images (i.e., candidate sample images that do not need to be filtered out) from each cluster group according to a specific proportion or number, and then filter out the non-target candidate sample images.
At this time, the candidate sample images in step S140 are the filtered candidate sample images; that is, obtaining the real sample label information corresponding to each candidate sample image to obtain a plurality of target sample images includes: obtaining the real sample label information corresponding to each filtered candidate sample image to obtain the plurality of target sample images.
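A minimal sketch of the clustering-and-filtering step follows, using k-means over image feature vectors as one possible clustering method and an assumed keep-ratio per cluster group; the application only requires clustering plus per-group filtering by a specific proportion or number, so these choices are assumptions.

```python
# Sketch of de-duplicating candidate samples by cluster group (scikit-learn >= 1.2).
import numpy as np
from sklearn.cluster import KMeans

def dedupe(features: np.ndarray, images: list, k: int = 10, keep_ratio: float = 0.2):
    groups = KMeans(n_clusters=k, n_init="auto").fit_predict(features)
    kept = []
    for g in range(k):
        members = [img for img, grp in zip(images, groups) if grp == g]
        kept.extend(members[: max(1, int(len(members) * keep_ratio))])
    return kept   # a small set of representative candidate samples per group
```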
S150, the model training device trains the security identification model according to the target sample images to obtain the trained security identification model.
Specifically, step S150 includes step a, step B, and step C as follows:
and step A, dividing the target sample images into a training sample set and a verification sample set respectively according to a preset sample distribution proportion.
The sample distribution proportion of the training sample set and the verification sample set may be 7 to 3, or may be other values, and the proportion may be adjusted according to the actual needs of the user, and is not limited herein.
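A minimal sketch of the split at the 7:3 example proportion; shuffling before splitting is an assumption, not a requirement of the text.

```python
# Sketch of the train/verification split.
import random

def split_samples(samples, train_ratio=0.7, seed=0):
    samples = list(samples)
    random.Random(seed).shuffle(samples)
    cut = int(len(samples) * train_ratio)
    return samples[:cut], samples[cut:]   # (training set, verification set)
```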
Step B: performing model parameter training on the security identification model according to the training sample set to obtain an intermediate security identification model.
Specifically, in this embodiment, based on the training sample set, a gradient descent algorithm may be used to perform model parameter training on the security identification model, and an intermediate security identification model subjected to parameter optimization is output.
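As one possible illustration of such model parameter training, a minimal fine-tuning loop in PyTorch follows; the application does not name a framework, loss function or optimizer, so all of these are assumptions.

```python
# Sketch of gradient-descent parameter training, under the stated assumptions.
import torch

def finetune(model, train_loader, epochs=5, lr=1e-4):
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = torch.nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for images, labels in train_loader:
            optimizer.zero_grad()
            loss = loss_fn(model(images), labels)
            loss.backward()           # gradient descent on the training batch
            optimizer.step()
    return model                      # the intermediate security recognition model
```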
Step C: adjusting the first confidence threshold according to the verification sample set based on the intermediate security identification model to obtain the trained security identification model.
Specifically, in this embodiment, based on the intermediate security identification model, a recall rate and a false alarm rate corresponding to each of a plurality of candidate confidence thresholds are determined according to the verification sample set; a target confidence threshold is then selected, according to the recall rate and the false alarm rate, from the candidate confidence thresholds meeting a preset standard, and the first confidence threshold is replaced with the target confidence threshold, wherein the preset standard is that the recall rate is higher than a preset recall rate threshold and the false alarm rate is lower than a preset false alarm rate threshold.
In some embodiments, the candidate confidence thresholds are confidence thresholds preset in the model training device. In this case, the model training device may select any candidate confidence threshold meeting the preset standard as the target confidence threshold, or select the candidate confidence threshold with the highest recall rate or the lowest false alarm rate among those meeting the preset standard.
In other embodiments, to make the confidence setting more flexible and allow it to be personalized to user requirements, before the recall rate and false alarm rate corresponding to each of the plurality of candidate confidence thresholds are determined according to the verification sample set, the method further includes: receiving the candidate confidence thresholds input by a user. Specifically, an input interface for candidate confidence thresholds is provided to the user; the user inputs candidate confidence thresholds through the interface, the model training device then displays the recall rate and false alarm rate corresponding to each candidate confidence threshold, and the user selects a target confidence threshold from the candidate confidence thresholds meeting the preset standard according to requirements.
In some embodiments, when the model training apparatus is deployed on a server (a cloud server or a physical server), the server provides the input interface for candidate confidence thresholds through a user terminal, and the user terminal sends the candidate confidence thresholds to the server after acquiring them through the input interface. In other embodiments, the model training device is deployed on the user terminal, and the user directly inputs the candidate confidence thresholds in an input interface provided by the user terminal, which thereby acquires them.
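A minimal sketch of the threshold adjustment in step C follows: each candidate threshold is evaluated on the verification set, thresholds meeting the criteria are kept, and the highest-recall survivor replaces the first confidence threshold (one of the selection options described above). The (confidence, is_abnormal) input format and the 0.9 / 0.05 criteria are assumptions.

```python
# Sketch of selecting the target confidence threshold on the verification set.
def pick_threshold(val_scores, candidates, min_recall=0.9, max_far=0.05):
    best = None
    for t in candidates:
        tp = sum(1 for c, y in val_scores if y and c >= t)
        fn = sum(1 for c, y in val_scores if y and c < t)
        fp = sum(1 for c, y in val_scores if not y and c >= t)
        tn = sum(1 for c, y in val_scores if not y and c < t)
        recall = tp / (tp + fn) if tp + fn else 0.0
        far = fp / (fp + tn) if fp + tn else 0.0
        if recall > min_recall and far < max_far:
            if best is None or recall > best[1]:
                best = (t, recall)
    return best[0] if best else None   # None: no candidate meets the standard
```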
In summary, in the scheme provided by the embodiments of the application, on one hand, monitoring images (images to be recognized) whose confidence is greater than the second confidence threshold and smaller than the first confidence threshold (the confidence threshold currently set by the model) can be screened out in the real monitoring scene (the preset place) of the image acquisition device as candidate sample images. Because the confidence of a candidate sample image is close to the first confidence threshold, the obtained candidate sample images are likely to include abnormal images that were not filtered out by the first confidence threshold, so a user can add real sample label information (such as an abnormal label or a normal label) to the candidate sample images to obtain target sample images and further train the model with them. The application can thus obtain training samples from the real monitoring scene and train the model accordingly, improving the accuracy of the model. On the other hand, the target sample images in this embodiment are obtained while the model is working, and the model can be trained while it is working, so the scheme can improve the generalization capability of the model online.
Fig. 3 is a schematic block diagram of a model training apparatus according to an embodiment of the present application. As shown in fig. 3, the present application also provides a model training apparatus corresponding to the above model training method. The model training apparatus, which includes means for performing the above-described model training method, may be configured in a server or a terminal.
For example, the device is configured on a computer of a user (the user may be a client using the model training device, or a person specially training a model), and the computer is in communication connection with a monitoring camera (an image acquisition device) to acquire an image to be recognized acquired by the monitoring camera, and perform image recognition and model online update according to the image to be recognized.
Specifically, referring to fig. 3, the model training apparatus 300 includes a transceiver module 301 and a processing module 302, wherein:
the receiving and sending module 301 is configured to obtain a plurality of images to be identified, which are historically collected by the image collection device in a preset place;
the processing module 302 is configured to perform recognition processing on each to-be-recognized image according to a preset security recognition model, so as to obtain candidate confidence levels corresponding to the to-be-recognized images respectively; determining an image to be recognized corresponding to a target confidence as a candidate sample image, wherein the target confidence is a confidence which is greater than a second confidence threshold and smaller than a first confidence threshold in the candidate confidence, the first confidence threshold is a confidence threshold currently set by the security recognition model, and the second confidence threshold is smaller than the first confidence threshold;
the transceiver module 301 is further configured to obtain real sample label information corresponding to each candidate sample image, to obtain a plurality of target sample images, where the target sample images carry the corresponding real sample label information;
the processing module 302 is further configured to train the security recognition model according to the plurality of target sample images to obtain the trained security recognition model.
In some embodiments, before the transceiver module 301 performs the step of acquiring a plurality of images to be identified historically acquired by the image acquisition apparatus in a preset location, the processing module 302 is further configured to:
determining a target frame extraction frequency according to the current scene of the image acquisition device and a preset frame extraction strategy;
dividing a preset computing power resource into a first computing power resource and a second computing power resource according to a preset computing power distribution rule and the target frame extraction frequency, wherein the first computing power resource is used for respectively identifying the images to be identified, and the second computing power resource is used for training the security identification model;
at this time, when the step of acquiring a plurality of images to be identified historically acquired by the image acquisition apparatus in a preset place is executed, the transceiver module 301 is specifically configured to:
and acquiring a plurality of images to be identified historically acquired by the image acquisition device in the preset place according to the target frame extraction frequency.
In some embodiments, when the step of training the security recognition model according to the plurality of target sample images to obtain the trained security recognition model is executed, the processing module 302 is specifically configured to:
respectively dividing the target sample images into a training sample set and a verification sample set according to a preset sample distribution proportion;
performing model parameter training on the security identification model according to the training sample set to obtain an intermediate security identification model;
and adjusting the first confidence threshold according to the verification sample set based on the intermediate security identification model to obtain the trained security identification model.
In some embodiments, when executing the step of adjusting the first confidence threshold according to the verification sample set based on the intermediate security identification model, the processing module 302 is specifically configured to:
based on the intermediate security identification model, determining a recall rate and a false alarm rate corresponding to each of a plurality of candidate confidence thresholds according to the verification sample set;
selecting a target confidence threshold from the candidate confidence thresholds meeting a preset standard according to the recall rate and the false alarm rate, and replacing the first confidence threshold with the target confidence threshold, wherein the preset standard is that the recall rate is higher than a preset recall rate threshold and the false alarm rate is lower than a preset false alarm rate threshold.
In some embodiments, before executing the step of determining, based on the intermediate security identification model, a recall rate and a false alarm rate corresponding to each of the plurality of candidate confidence thresholds according to the verification sample set, the processing module 302 is further configured to:
receiving, by the transceiver module 301, the candidate confidence threshold input by a user.
In some embodiments, before the transceiver module 301 performs the step of obtaining the real sample label information corresponding to each candidate sample image to obtain a plurality of target sample images, the processing module 302 is further configured to:
clustering the candidate sample images to obtain a plurality of clustering groups;
respectively filtering the candidate sample images in each clustering group according to a preset screening rule to obtain filtered candidate sample images;
when the step of obtaining the real sample label information corresponding to each candidate sample image to obtain a plurality of target sample images is executed, the transceiver module 301 is specifically configured to:
and acquiring real sample label information corresponding to each filtered candidate sample image to obtain a plurality of target sample images.
In some embodiments, after the step of performing recognition processing on each image to be recognized according to a preset security recognition model to obtain a candidate confidence corresponding to each image to be recognized is executed, the processing module 302 is further configured to:
determining the image to be recognized whose candidate confidence is greater than or equal to the first confidence threshold as an abnormal image;
and outputting the abnormal image.
In some embodiments, when the step of obtaining the real sample label information corresponding to each candidate sample image to obtain a plurality of target sample images is executed, the transceiver module 301 is specifically configured to:
presenting, by the processing module 302, each of the candidate sample images in a display interface;
receiving real sample label information sent by a user for each candidate sample image through the display interface;
adding corresponding real sample label information to each candidate sample image through the processing module 302 to obtain a plurality of target sample images.
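A minimal sketch of this labeling loop, assuming `show` and `ask_user` are callables abstracting the display interface (which this description does not fix):

def collect_labels(candidate_images, show, ask_user):
    """Attach real sample label information to each candidate sample image."""
    target_samples = []
    for image in candidate_images:
        show(image)              # present the image in the display interface
        label = ask_user(image)  # real sample label returned by the user
        target_samples.append((image, label))
    return target_samples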
In this embodiment, on one hand, the transceiver module 301 may acquire images to be identified (monitoring images) collected by an image acquisition device, and the processing module 302 may screen out, as candidate sample images, the images to be identified whose confidence is greater than the second confidence threshold and smaller than the first confidence threshold. Because the confidence of these candidate sample images is close to the first confidence threshold, the candidate sample images are likely to include abnormal images that were not caught by the first confidence threshold. Therefore, when training the model, target sample images can be obtained by adding real sample label information to these candidate sample images, and the model can then be trained with the target sample images. In this way, training samples are obtained from a real monitoring scene and the model is trained on them, which improves the accuracy of the model. On the other hand, the target sample images in this embodiment are obtained while the model is in service, and the model can be trained during its operation, so this scheme can improve the generalization capability of the model online.
The model training system in the embodiment of the present application is described above from the perspective of modular functional entities, and the model training apparatus in the embodiment of the present application is described below from the perspective of hardware processing.
It should be noted that, in the embodiments of the present application (including the embodiment shown in fig. 3), the entity device corresponding to the transceiver module may be a transceiver, and the entity device corresponding to the processing module may be a processor. The apparatus shown in fig. 3 may have the structure shown in fig. 4; in that case, the processor in fig. 4 implements the same or similar functions as the processing module provided in the apparatus embodiment, the transceiver in fig. 4 implements the same or similar functions as the transceiver module, and the memory in fig. 4 stores the computer program that the processor calls when executing the above model training method. Alternatively, in the embodiment shown in fig. 3 of this application, the entity device corresponding to the transceiver module may be an input/output interface, and the entity device corresponding to the processing module may be a processor.
As shown in fig. 5, for convenience of description, only the portions related to the embodiments of the present application are shown; for undisclosed technical details, please refer to the method portion of the embodiments of the present application. The terminal may be any terminal, such as a mobile phone, a tablet computer, a Personal Digital Assistant (PDA), a Point of Sale (POS) terminal, or a vehicle-mounted computer. The following takes a mobile phone as an example:
fig. 5 is a block diagram illustrating a partial structure of a mobile phone related to a terminal provided in an embodiment of the present application. Referring to fig. 5, the mobile phone includes: a Radio Frequency (RF) circuit 55, a memory 520, an input unit 530, a display unit 540, a sensor 550, an audio circuit 560, a wireless fidelity (Wi-Fi) module 570, a processor 580, and a power supply 590. Those skilled in the art will appreciate that the mobile phone structure shown in fig. 5 is not limiting; the phone may include more or fewer components than those shown, combine some components, or arrange the components differently.
The following describes each component of the mobile phone in detail with reference to fig. 5:
the RF circuit 55 may be used for receiving and transmitting signals during information transmission and reception or during a call. In particular, it receives downlink information from a base station and forwards it to the processor 580 for processing, and it transmits uplink data to the base station. In general, the RF circuit 55 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a Low Noise Amplifier (LNA), a duplexer, and the like. In addition, the RF circuit 55 may also communicate with networks and other devices via wireless communication. The wireless communication may use any communication standard or protocol, including but not limited to Global System for Mobile communications (GSM), General Packet Radio Service (GPRS), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), Long Term Evolution (LTE), e-mail, Short Message Service (SMS), and the like.
The memory 520 may be used to store software programs and modules, and the processor 580 executes the various functional applications and data processing of the mobile phone by running the software programs and modules stored in the memory 520. The memory 520 may mainly include a program storage area and a data storage area: the program storage area may store an operating system, an application program required by at least one function (such as a sound playing function or an image playing function), and the like; the data storage area may store data (such as audio data and a phonebook) created according to the use of the mobile phone. Further, the memory 520 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or other non-volatile solid-state storage devices.
The input unit 530 may be used to receive input numeric or character information and to generate key signal inputs related to user settings and function control of the mobile phone. Specifically, the input unit 530 may include a touch panel 531 and other input devices 532. The touch panel 531, also called a touch screen, can collect touch operations of the user on or near it (for example, operations performed on or near the touch panel 531 with a finger, a stylus, or any other suitable object or accessory) and drive the corresponding connection device according to a preset program. Optionally, the touch panel 531 may include two parts: a touch detection device and a touch controller. The touch detection device detects the touch position of the user, detects the signal generated by the touch operation, and transmits the signal to the touch controller; the touch controller receives the touch information from the touch detection device, converts it into touch point coordinates, and sends them to the processor 580, and it can also receive and execute commands sent by the processor 580. The touch panel 531 may be implemented as a resistive, capacitive, infrared, or surface acoustic wave panel. Besides the touch panel 531, the input unit 530 may include other input devices 532, which may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control keys and a power switch key), a trackball, a mouse, a joystick, and the like.
The display unit 540 may be used to display information input by the user or provided to the user, as well as the various menus of the mobile phone. The display unit 540 may include a display panel 541, which may optionally be configured in the form of a Liquid Crystal Display (LCD), an Organic Light-Emitting Diode (OLED) display, or the like. Further, the touch panel 531 may cover the display panel 541; when the touch panel 531 detects a touch operation on or near it, the operation is transmitted to the processor 580 to determine the type of the touch event, and the processor 580 then provides a corresponding visual output on the display panel 541 according to that type. Although in fig. 5 the touch panel 531 and the display panel 541 are two independent components implementing the input and output functions of the mobile phone, in some embodiments they may be integrated to implement both functions.
The mobile phone may also include at least one sensor 550, such as a light sensor, a motion sensor, and other sensors. Specifically, the light sensor may include an ambient light sensor and a proximity sensor; the ambient light sensor may adjust the brightness of the display panel 541 according to the brightness of ambient light, and the proximity sensor may turn off the display panel 541 and/or the backlight when the phone is moved to the ear. As one kind of motion sensor, an accelerometer can detect the magnitude of acceleration in each direction (generally, three axes), can detect the magnitude and direction of gravity when stationary, and can be used for applications that recognize the attitude of the phone (such as switching between landscape and portrait, related games, and magnetometer attitude calibration) and for vibration-recognition functions (such as a pedometer and tap detection). Other sensors that may be configured on the mobile phone, such as a gyroscope, a barometer, a hygrometer, a thermometer, and an infrared sensor, are not described here.
The audio circuit 560, the speaker 561, and the microphone 562 may provide an audio interface between the user and the mobile phone. The audio circuit 560 may transmit an electrical signal converted from received audio data to the speaker 561, which converts it into a sound signal for output; conversely, the microphone 562 converts collected sound signals into electrical signals, which the audio circuit 560 receives and converts into audio data. The audio data is then processed by the processor 580 and either sent via the RF circuit 55 to, for example, another mobile phone, or output to the memory 520 for further processing.
Wi-Fi is a short-range wireless transmission technology. Through the Wi-Fi module 570, the mobile phone can help the user receive and send e-mails, browse web pages, access streaming media, and the like, providing wireless broadband Internet access. Although fig. 5 shows the Wi-Fi module 570, it is understood that it is not an essential component of the mobile phone and may be omitted as needed without changing the essence of the application.
The processor 580 is the control center of the mobile phone. It connects the various parts of the entire phone using various interfaces and lines, and performs the phone's functions and processes data by running or executing the software programs and/or modules stored in the memory 520 and calling the data stored in the memory 520, thereby monitoring the mobile phone as a whole. Optionally, the processor 580 may include one or more processing units; preferably, it may integrate an application processor, which mainly handles the operating system, user interfaces, and application programs, and a modem processor, which mainly handles wireless communication. It will be appreciated that the modem processor may also not be integrated into the processor 580.
The mobile phone also includes a power supply 590 (e.g., a battery) for powering the various components; it may be logically coupled to the processor 580 via a power management system, thereby managing charging, discharging, and power consumption.
Although not shown, the mobile phone may further include a camera, a Bluetooth module, etc., which are not described herein.
In the embodiment of the present application, the processor 580 included in the mobile phone also controls execution of the model training method whose flowchart is shown in fig. 2.
Fig. 6 is a schematic structural diagram of a server 620 according to an embodiment of the present application. The server 620 may vary considerably in configuration or performance and may include one or more Central Processing Units (CPUs) 622 (e.g., one or more processors), a memory 632, and one or more storage media 630 (e.g., one or more mass storage devices) storing applications 642 or data 644. The memory 632 and the storage medium 630 may be transient or persistent storage. The program stored in the storage medium 630 may include one or more modules (not shown), each of which may include a series of instruction operations for the server. Still further, the central processor 622 may be configured to communicate with the storage medium 630 and execute, on the server 620, the series of instruction operations in the storage medium 630.
The server 620 may also include one or more power supplies 626, one or more wired or wireless network interfaces 650, one or more input/output interfaces 658, and/or one or more operating systems 641, such as Windows Server, Mac OS X, Unix, Linux, FreeBSD, etc.
The steps performed by the server in the above embodiments, for example the steps of the server shown in fig. 2, may be based on the structure of the server 620 shown in fig. 6. For example, the processor 622, by calling instructions in the memory 632, performs the following operations:
acquiring a plurality of images to be identified which are historically acquired by an image acquisition device in a preset place;
respectively identifying each image to be identified according to a preset security identification model to obtain a candidate confidence corresponding to each image to be identified;
determining an image to be recognized corresponding to a target confidence as a candidate sample image, wherein the target confidence is a confidence, among the candidate confidences, that is greater than a second confidence threshold and smaller than a first confidence threshold, the first confidence threshold is the confidence threshold currently set by the security recognition model, and the second confidence threshold is smaller than the first confidence threshold;
acquiring real sample label information corresponding to each candidate sample image to obtain a plurality of target sample images, wherein the target sample images carry the corresponding real sample label information;
and training the security identification model according to the plurality of target sample images to obtain the trained security identification model.
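Putting the core selection step into code makes the boundary behavior explicit. The sketch below, with illustrative names and a model assumed to return a confidence in [0, 1] per image, keeps exactly the images whose confidence falls strictly between the second and first confidence thresholds:

def mine_candidates(images, model, first_threshold, second_threshold):
    """Score to-be-identified images with the security identification model
    and keep those in the open interval (second_threshold, first_threshold);
    assumes second_threshold < first_threshold."""
    candidates = []
    for image in images:
        confidence = model(image)
        if second_threshold < confidence < first_threshold:
            candidates.append(image)  # likely near-miss abnormal samples
    return candidates

These near-threshold images are then labeled and fed back for training, as recited in the remaining operations.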
In the foregoing embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
It can be clearly understood by those skilled in the art that, for convenience and simplicity of description, the specific working processes of the system, the apparatus, and the module described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative: the division into modules is only a logical division, and in actual implementation there may be other divisions; for example, multiple modules or components may be combined or integrated into another system, or some features may be omitted or not implemented. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or modules, and may be electrical, mechanical, or in another form.
The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical modules, may be located in one place, or may be distributed on a plurality of network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, functional modules in the embodiments of the present application may be integrated into one processing module, or each module may exist alone physically, or two or more modules are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The integrated module, if implemented in the form of a software functional module and sold or used as a stand-alone product, may be stored in a computer readable storage medium.
In the above embodiments, the implementation may be wholly or partially realized by software, hardware, firmware, or any combination thereof. When implemented in software, it may be implemented in whole or in part in the form of a computer program product.
The computer program product includes one or more computer instructions. When the computer program is loaded and executed on a computer, the procedures or functions described in accordance with the embodiments of the present application are generated in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, they may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center via wired (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or wireless (e.g., infrared, radio, microwave) means. The computer-readable storage medium may be any available medium that a computer can access, or a data storage device such as a server or a data center that integrates one or more available media. The available medium may be a magnetic medium (e.g., a floppy disk, a hard disk, or a magnetic tape), an optical medium (e.g., a DVD), or a semiconductor medium (e.g., a Solid State Disk (SSD)), among others.
The technical solutions provided in the embodiments of the present application are described in detail above. Specific examples are used herein to explain the principles and implementations of the embodiments, and the descriptions of the embodiments are only intended to help in understanding the methods and core ideas of the embodiments of the present application. Meanwhile, for a person skilled in the art, the specific implementation and application scope may vary according to the ideas of the embodiments of the present application. In conclusion, the content of this specification should not be construed as limiting the embodiments of the present application.

Claims (10)

1. A method of model training, comprising:
acquiring a plurality of images to be identified which are historically acquired by an image acquisition device in a preset place;
respectively identifying each image to be identified according to a preset security identification model to obtain candidate confidence degrees corresponding to each image to be identified;
determining an image to be recognized corresponding to a target confidence as a candidate sample image, wherein the target confidence is a confidence, among the candidate confidences, that is greater than a second confidence threshold and smaller than a first confidence threshold, the first confidence threshold is the confidence threshold currently set by the security recognition model, and the second confidence threshold is smaller than the first confidence threshold;
acquiring real sample label information corresponding to each candidate sample image to obtain a plurality of target sample images, wherein the target sample images carry the corresponding real sample label information;
and training the security identification model according to the plurality of target sample images to obtain the trained security identification model.
2. The method according to claim 1, wherein before the acquiring of the plurality of images to be identified historically collected by the image acquisition device in the preset place, the method further comprises:
determining a target frame extraction frequency according to the current scene of the image acquisition device and a preset frame extraction strategy;
dividing a preset computing power resource into a first computing power resource and a second computing power resource according to a preset computing power distribution rule and the target frame extraction frequency, wherein the first computing power resource is used for respectively identifying the images to be identified, and the second computing power resource is used for training the security identification model;
the acquisition image acquisition device is in presetting a plurality of images of treating discernment of historical collection in the place, includes:
and acquiring a plurality of images to be identified historically acquired by the image acquisition device in the preset place according to the target frame extraction frequency.
3. The method of claim 1, wherein the training the security recognition model according to the plurality of target sample images to obtain the trained security recognition model comprises:
dividing the target sample images into a training sample set and a verification sample set according to a preset sample distribution proportion;
performing model parameter training on the security identification model according to the training sample set to obtain an intermediate security identification model;
and adjusting the first confidence threshold according to the verification sample set based on the intermediate security identification model to obtain the trained security identification model.
4. The method of claim 3, wherein the adjusting the first confidence threshold according to the verification sample set based on the intermediate security identification model comprises:
determining, based on the intermediate security identification model and according to the verification sample set, a recall rate and a false alarm rate corresponding to each candidate confidence threshold in a plurality of candidate confidence thresholds;
selecting, according to the recall rate and the false alarm rate, a target confidence threshold from the candidate confidence thresholds that meet a preset criterion, and replacing the first confidence threshold with the target confidence threshold, wherein the preset criterion is that the recall rate is higher than a preset recall rate threshold and the false alarm rate is lower than a preset false alarm rate threshold.
5. The method of claim 4, wherein before the determining, based on the intermediate security identification model, of a recall rate and a false alarm rate corresponding to each of the plurality of candidate confidence thresholds according to the verification sample set, the method further comprises:
receiving the candidate confidence threshold input by a user.
6. The method according to any one of claims 1 to 5, wherein before obtaining the real sample label information corresponding to each of the candidate sample images, and obtaining a plurality of target sample images, the method further comprises:
clustering the candidate sample images to obtain a plurality of clustering groups;
respectively filtering the candidate sample images in each clustering group according to a preset screening rule to obtain filtered candidate sample images;
the obtaining of the real sample label information corresponding to each candidate sample image to obtain a plurality of target sample images includes:
and acquiring real sample label information corresponding to each filtered candidate sample image to obtain a plurality of target sample images.
7. The method according to any one of claims 1 to 5, wherein after the images to be recognized are respectively recognized according to a preset security recognition model to obtain candidate confidence degrees corresponding to the images to be recognized, the method further comprises:
determining the image to be recognized whose candidate confidence is greater than or equal to the first confidence threshold as an abnormal image;
and outputting the abnormal image.
8. The method according to any one of claims 1 to 5, wherein the obtaining of the real sample label information corresponding to each candidate sample image to obtain a plurality of target sample images comprises:
displaying each candidate sample image in a display interface;
receiving real sample label information sent by a user for each candidate sample image through the display interface;
and adding corresponding real sample label information for each candidate sample image to obtain a plurality of target sample images.
9. A computer device, characterized in that the computer device comprises a memory having a computer program stored thereon and a processor that implements the method according to any one of claims 1-8 when executing the computer program.
10. A computer-readable storage medium, characterized in that the storage medium stores a computer program comprising program instructions which, when executed by a processor, implement the method according to any one of claims 1-8.
CN202211603537.8A 2022-12-13 2022-12-13 Model training method, related device and storage medium Pending CN115984643A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211603537.8A CN115984643A (en) 2022-12-13 2022-12-13 Model training method, related device and storage medium


Publications (1)

Publication Number Publication Date
CN115984643A (en) 2023-04-18

Family

ID=85957308


Country Status (1)

Country Link
CN (1) CN115984643A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination