CN112580684A - Target detection method and device based on semi-supervised learning and storage medium

Target detection method and device based on semi-supervised learning and storage medium

Info

Publication number
CN112580684A
CN112580684A
Authority
CN
China
Prior art keywords
data
target detection
semi
tag data
label
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011288652.1A
Other languages
Chinese (zh)
Other versions
CN112580684B (en)
Inventor
唐子豪
刘莉红
刘玉宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN202011288652.1A priority Critical patent/CN112580684B/en
Publication of CN112580684A publication Critical patent/CN112580684A/en
Application granted granted Critical
Publication of CN112580684B publication Critical patent/CN112580684B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/24 - Classification techniques
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 - Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 - Protecting data
    • G06F21/602 - Providing cryptographic facilities or services
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 - Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 - Protecting data
    • G06F21/64 - Protecting data integrity, e.g. using checksums, certificates or signatures
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks

Abstract

The invention relates to the technical field of target detection and discloses a target detection method based on semi-supervised learning, which comprises the following steps: determining label data corresponding to the training data based on the acquired training data; performing data cleaning on the label data to obtain the cleaned new label data; performing data enhancement on the new label data to obtain enhanced data corresponding to the new label data; training a deep learning model based on the enhanced data and preset manually labeled image information until the loss function of the deep learning model converges within a preset range, to form a target detection model; and obtaining a target detection result for the data to be detected based on the target detection model. The invention also relates to blockchain technology: the new label data can be stored in a blockchain. The invention can improve the efficiency and accuracy of target detection based on semi-supervised learning.

Description

Target detection method and device based on semi-supervised learning and storage medium
Technical Field
The present invention relates to the field of target detection technologies, and in particular, to a method and an apparatus for target detection based on semi-supervised learning, an electronic device, and a computer-readable storage medium.
Background
The manual work behind artificial intelligence mainly refers to the large amount of human labor needed to label data before a model can be trained. Although public target detection datasets such as COCO exist, a target detection deep model applied to an actual project still needs to be retrained on a labeled business dataset to adapt to the business data. At present, most artificial intelligence enterprises must invest heavily in the manual annotation of business data. Meanwhile, the labeled data must be manually inspected, cleaned, and corrected to guarantee the image annotation quality, a requirement that stems from the sensitivity of neural networks to their data; a multi-level annotation and review structure therefore has to be built for the labeled data, and for large batches of data only sampling inspection can show that the labels are statistically usable.
At present, although semi-supervised learning methods for classification tasks have achieved certain results, semi-supervised learning methods for target detection are not yet mature, and problems such as limited accuracy and large required data volume still exist.
Disclosure of Invention
The invention provides a target detection method and apparatus based on semi-supervised learning, an electronic device, and a computer-readable storage medium, mainly aiming to improve the efficiency and accuracy of target detection based on semi-supervised learning.
In order to achieve the above object, the present invention provides a target detection method based on semi-supervised learning, comprising: determining label data corresponding to the training data based on the acquired training data;
performing data cleaning on the label data to obtain the cleaned new label data;
performing data enhancement on the new label data to obtain enhanced data corresponding to the new label data;
training a deep learning model based on the enhanced data and preset manually labeled image information until the loss function of the deep learning model converges within a preset range, to form a target detection model;
and obtaining a target detection result for the data to be detected based on the target detection model.
Optionally, the step of determining label data corresponding to the training data includes:
performing horizontal mirror flipping on the training data to obtain the processed training data;
training an open-source model on the processed training data until the open-source model converges within a specified range, to form a label acquisition model;
and obtaining label data for the unlabeled training data from the label acquisition model.
Optionally, the training data comprises unlabeled image information;
the label data includes the category of an object located in the image information and the top-left abscissa, top-left ordinate, bottom-right abscissa, and bottom-right ordinate of a bounding box enclosing the object.
Optionally, the step of performing data cleaning on the label data and acquiring the cleaned new label data includes:
determining the width, height, and center-point coordinate information of the bounding box based on its top-left abscissa, top-left ordinate, bottom-right abscissa, and bottom-right ordinate;
determining converted coordinates for the bounding box from its width and the width of the corresponding image information, and from its height and center coordinates and the height of the corresponding image information;
and performing data cleaning on the converted coordinates based on the open-source framework CLEANLAB to obtain the cleaned new label data.
Optionally, the new label data is stored in a blockchain, and the step of performing data enhancement on the new label data includes:
randomly jittering the color variables of the new label data; and/or geometrically deforming the objects within the bounding boxes in the new label data; and/or geometrically deforming the new label data and transforming the bounding boxes accordingly; wherein:
the color variables include brightness, saturation, contrast, and transparency;
the geometric deformations include translation, flipping, shearing, and rotation.
Optionally, the loss function comprises the sum of a supervised loss and an unsupervised loss; wherein:
the expression of the supervised loss is as follows:
L_s(x, p, t) = (1/N_cls) Σ_i Σ_b L_cls(p_i, p_{i,b}) + (1/N_reg) Σ_i Σ_b p_{i,b} · L_reg(t_i, t_b)
wherein x denotes an image, p and t denote vector information, b denotes the index of a bounding box in the manually labeled image information, i denotes the index of a prior box, p_i denotes the predicted probability that prior box i belongs to a positive sample, t_i denotes the coordinates of prior box i (the top-left and bottom-right corner coordinates), p_{i,b} = 1 when prior box i belongs to bounding box b and p_{i,b} = 0 otherwise, t_b denotes the coordinates of the manually labeled bounding box, L_s denotes the supervised loss, L_cls denotes the classification loss function, L_reg denotes the regression loss function, N_reg denotes the normalization coefficient of the regression term, and N_cls denotes the normalization coefficient of the classification term;
the expression of the unsupervised loss is as follows:
L_u(x, q) = ω(x) · [ (1/N_cls) Σ_i Σ_b L_cls(p_i, q_{i,b}) + (1/N_reg) Σ_i Σ_b q_{i,b} · L_reg(t_i, s_b) ]
wherein x denotes an image, q denotes the label data of the image x, b denotes the index of a bounding box in the label data, i denotes the index of a prior box, p_i denotes the predicted probability that prior box i belongs to a positive sample, t_i denotes the coordinates of prior box i (the top-left and bottom-right corner coordinates), q_{i,b} = 1 when prior box i belongs to bounding box b and q_{i,b} = 0 otherwise, s_b denotes the coordinates of the labeled bounding box, L_u denotes the unsupervised loss, L_cls denotes the classification loss function, L_reg denotes the regression loss function, N_reg denotes the normalization coefficient of the regression term, and N_cls denotes the normalization coefficient of the classification term; wherein:
ω(x) = 1 if max(p(x; θ)) ≥ τ, else 0
q(x) = ONE_HOT(argmax(p(x; θ)))
where θ denotes the trainable parameters of the deep learning model and τ denotes the confidence threshold for the new label data.
Optionally, the quantity of the enhanced data is 10 to 15 times the quantity of the manually labeled image information.
In order to solve the above problem, the present invention further provides a target detection apparatus based on semi-supervised learning, the apparatus comprising:
a label data determination unit configured to determine label data corresponding to the training data based on the acquired training data;
a new tag data acquisition unit, configured to perform data cleaning processing on the tag data, and acquire new tag data after cleaning;
the enhanced data acquisition unit is used for performing data enhancement processing on the new tag data to acquire enhanced data corresponding to the new tag data;
the target detection model forming unit is used for training a deep learning model based on the enhanced data and preset artificially labeled image information until the loss function of the deep learning model is converged in a preset range so as to form a target detection model;
and the detection result acquisition unit is used for acquiring a target detection result of the data to be detected based on the target detection model.
In order to solve the above problem, the present invention also provides an electronic device, including:
a memory storing at least one instruction; and
a processor that executes the instructions stored in the memory to implement the above semi-supervised learning based target detection method.
In order to solve the above problem, the present invention further provides a computer-readable storage medium, which stores at least one instruction, where the at least one instruction is executed by a processor in an electronic device to implement the semi-supervised learning based target detection method described above.
According to the method, corresponding label data are determined based on the acquired training data; data cleaning and data enhancement are then performed on the label data to obtain new label data and enhanced data; and a deep learning model is trained based on the enhanced data and preset manually labeled image information until the loss function of the deep learning model converges within a preset range, forming a target detection model. Through these features, the automatic labeling of the embodiments of the invention can approach, in quality, the level reached after multi-stage inspection and correction, while greatly reducing cost; it can also be combined with manual quality inspection to check and correct existing data, simplifying the data quality-inspection process and saving management and time costs in addition to labor costs.
Drawings
Fig. 1 is a flowchart of a target detection method based on semi-supervised learning according to an embodiment of the present invention;
FIG. 2 is a block diagram of an apparatus for detecting targets based on semi-supervised learning according to an embodiment of the present invention;
fig. 3 is a schematic internal structural diagram of an electronic device implementing a target detection method based on semi-supervised learning according to an embodiment of the present invention;
the implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
The invention provides a target detection method based on semi-supervised learning. Fig. 1 is a schematic flow chart of a target detection method based on semi-supervised learning according to an embodiment of the present invention. The method may be performed by an apparatus, which may be implemented by software and/or hardware.
In this embodiment, the target detection method based on semi-supervised learning includes:
s110: based on the acquired training data, tag data corresponding to the training data is determined.
The training data may be unlabeled image information, and the corresponding label data is obtained based on this unlabeled image information; the label data further comprises the category of the object in the image information and the top-left abscissa, top-left ordinate, bottom-right abscissa, and bottom-right ordinate of a bounding box enclosing the object.
Wherein the step of determining label data corresponding to the training data comprises:
1. performing horizontal mirror flipping on the training data to obtain the processed training data;
2. training the DetectoRS open-source model on the processed training data until the DetectoRS open-source model converges within a specified range, to form a label acquisition model;
3. obtaining label data for the unlabeled training data from the label acquisition model.
It should be noted that the image information in the training data consists of the original images, not reduced-size images, and that the training data includes multi-scale copies of the images at scale factors such as 0.25, 0.5, 0.75, 1, 1.25, 1.5, 1.75, 2, 2.25, 2.5, 2.75, and 3; using image information at multiple scales improves the detection accuracy of the later model.
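As an illustration of this step, the sketch below generates label data for one unlabeled image by running a trained label-acquisition model over multi-scale copies of it. The detector interface (a predict method returning objects with a category and corner coordinates) is a hypothetical stand-in for whatever trained model, such as DetectoRS, actually fills this role:

```python
# A minimal sketch of the pseudo-labeling step (S110); the detector API
# used here (detector.predict) is an assumption, not a fixed interface.
from PIL import Image

SCALES = [0.25, 0.5, 0.75, 1.0, 1.25, 1.5, 1.75, 2.0, 2.25, 2.5, 2.75, 3.0]

def pseudo_label(image_path, detector):
    """Run the label-acquisition model over multi-scale copies of one image."""
    image = Image.open(image_path).convert("RGB")
    width, height = image.size
    labels = []
    for scale in SCALES:
        resized = image.resize((int(width * scale), int(height * scale)))
        for box in detector.predict(resized):
            # Map the detected corners back to original-image coordinates.
            labels.append({
                "category": box.category,
                "x1": box.x1 / scale, "y1": box.y1 / scale,
                "x2": box.x2 / scale, "y2": box.y2 / scale,
            })
    return labels
```

Detections gathered across all scales can then be merged (e.g., by non-maximum suppression) before the cleaning step below; the merge strategy is left open here.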
S120: performing data cleaning on the label data to obtain the cleaned new label data.
The specific process of performing data cleaning on the label data and acquiring the cleaned new label data may include:
1. determining the width, height, and center-point coordinate information of the bounding box based on its top-left abscissa, top-left ordinate, bottom-right abscissa, and bottom-right ordinate;
2. determining converted coordinates for the bounding box from its width and the width of the corresponding image information, and from its height and center coordinates and the height of the corresponding image information;
specifically, the width of the bounding box may be divided by the width of the corresponding image information, while the height and center coordinates of the bounding box are divided by the height of the corresponding image information, to obtain the converted coordinates of the bounding box;
3. performing data cleaning on the converted coordinates based on the open-source framework CLEANLAB to obtain the cleaned new label data.
It should be noted that the existing open-source framework CLEANLAB can clean noisy data and train a model using the cleaned data. However, it can only be used for classification tasks and cannot be applied directly to a target detection task; a target detection task can, however, be regarded as the superposition of a classification task and a regression task.
Since CLEANLAB can handle classification problems with up to 1000 classes, the regression target output values can be normalized to between 0 and 1 and discretized with a step of 0.001, giving 1000 classes 1 × 1e-3, 2 × 1e-3, ..., 1000 × 1e-3; a value falling in [(n-1) × 1e-3, n × 1e-3) belongs to the n-th class.
In the above step, the process of obtaining the converted coordinates of the bounding box normalizes all four coordinates of the bounding box to between 0 and 1, which further transforms the target detection task into a 1000-class classification problem. For example, in a neural network whose last layer has m neurons, m scalars are output, represented by a vector v. For a regression problem, the m neurons are connected to 1 neuron, yielding a single scalar w·v + b and a continuous output value. For a classification problem, the m neurons are connected to n neurons, yielding n scalars w·v + b that are then normalized into probabilities over n categories by an activation function such as softmax. In this way the regression problem can be converted into a classification problem.
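To make the cleaning step concrete, the sketch below converts corner coordinates to normalized center/width/height values, bins each value into one of the 1000 classes, and asks CLEANLAB to flag suspicious labels. It assumes cleanlab >= 2.0 and uses random stand-in arrays where real cross-validated model predictions (pred_probs) would go:

```python
# A sketch of the cleaning step (S120), assuming cleanlab >= 2.0; the
# random arrays below are stand-ins for real cross-validated predictions.
import numpy as np
from cleanlab.filter import find_label_issues

def to_normalized(box, img_w, img_h):
    """Corner coordinates -> normalized center-x, center-y, width, height."""
    cx = (box["x1"] + box["x2"]) / 2.0 / img_w
    cy = (box["y1"] + box["y2"]) / 2.0 / img_h
    w = (box["x2"] - box["x1"]) / img_w
    h = (box["y2"] - box["y1"]) / img_h
    return cx, cy, w, h

def to_bin(value, n_bins=1000):
    """A value in [(n-1)*1e-3, n*1e-3) belongs to the n-th class (0-indexed)."""
    return min(int(value * n_bins), n_bins - 1)

rng = np.random.default_rng(0)
normalized_values = rng.random(100)                  # stand-in regression targets
labels = np.array([to_bin(v) for v in normalized_values])
pred_probs = rng.dirichlet(np.ones(1000), size=100)  # stand-in model predictions

# Treat each discretized regression target as a classification label and
# keep only the values CLEANLAB does not flag as label issues.
issue_mask = find_label_issues(labels=labels, pred_probs=pred_probs)
clean_values = normalized_values[~issue_mask]
```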
S130: performing data enhancement on the new label data to obtain enhanced data corresponding to the new label data.
The new label data can be stored in a blockchain, and the step of performing data enhancement on the new label data comprises: randomly jittering the color variables of the new label data; and/or geometrically deforming the objects within the bounding boxes in the new label data; and/or geometrically deforming the new label data and transforming the bounding boxes accordingly; wherein the color variables include brightness, saturation, contrast, and transparency, and the geometric deformations include translation, flipping, shearing, and rotation.
It is emphasized that, to further ensure the privacy and security of the new label data, the new label data may also be stored in a node of a blockchain.
Specifically, the cleaned label data can also be understood as machine-labeled image data after cleaning, and the data enhancement can be performed in three ways. First: randomly jittering the color variables of the cleaned label data. Second: geometrically deforming the bounding boxes in the cleaned label data. Third: geometrically deforming the cleaned label data as a whole and transforming the bounding boxes accordingly. The color variables include brightness, saturation, contrast, transparency, and the like; the geometric deformations include translation, flipping, shearing, rotation, and the like.
In addition, after the enhanced data are obtained, positions can be randomly selected on the images of the enhanced data and rectangles filled with random color noise added there, simulating the occlusion found in real scenes; such rectangles can be added selectively according to the specific application scenario or requirement.
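A minimal sketch of these enhancement operations, using only PIL and NumPy, is given below. The jitter range and the occlusion-rectangle size limit are illustrative assumptions rather than values fixed by this embodiment, and transparency jitter would additionally require an RGBA image:

```python
# Illustrative enhancement operations for S130; parameter ranges are
# assumptions, not values taken from this embodiment.
import random
import numpy as np
from PIL import Image, ImageEnhance

def jitter_colors(image):
    """Randomly jitter brightness, saturation (Color), and contrast."""
    for enhancer in (ImageEnhance.Brightness, ImageEnhance.Color,
                     ImageEnhance.Contrast):
        image = enhancer(image).enhance(random.uniform(0.7, 1.3))
    return image

def hflip_with_boxes(image, boxes):
    """Flip the image horizontally and transform the bounding boxes with it."""
    width = image.size[0]
    flipped = image.transpose(Image.FLIP_LEFT_RIGHT)
    new_boxes = [{**b, "x1": width - b["x2"], "x2": width - b["x1"]}
                 for b in boxes]
    return flipped, new_boxes

def add_occlusion(image, max_frac=0.2):
    """Paste a random color-noise rectangle to simulate real-scene occlusion."""
    arr = np.asarray(image).copy()
    h, w = arr.shape[:2]
    rh = random.randint(1, max(1, int(h * max_frac)))
    rw = random.randint(1, max(1, int(w * max_frac)))
    y, x = random.randint(0, h - rh), random.randint(0, w - rw)
    arr[y:y + rh, x:x + rw] = np.random.randint(
        0, 256, (rh, rw, arr.shape[2]), dtype=np.uint8)
    return Image.fromarray(arr)
```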
S140: training the deep learning model based on the enhanced data and preset manually labeled image information until the loss function of the deep learning model converges within a preset range, to form the target detection model.
Optionally, the loss function comprises the sum of a supervised loss and an unsupervised loss; wherein:
the expression of the supervised loss is as follows:
L_s(x, p, t) = (1/N_cls) Σ_i Σ_b L_cls(p_i, p_{i,b}) + (1/N_reg) Σ_i Σ_b p_{i,b} · L_reg(t_i, t_b)
wherein x denotes an image, p and t denote vector information, b denotes the index of a bounding box in the manually labeled image information, i denotes the index of a prior box, p_i denotes the predicted probability that prior box i belongs to a positive sample, t_i denotes the coordinates of prior box i (the top-left and bottom-right corner coordinates), p_{i,b} = 1 when prior box i belongs to bounding box b and p_{i,b} = 0 otherwise, t_b denotes the coordinates of the manually labeled bounding box, L_s denotes the supervised loss, L_cls denotes the classification loss function, L_reg denotes the regression loss function, N_reg denotes the normalization coefficient of the regression term, and N_cls denotes the normalization coefficient of the classification term;
the expression for unsupervised loss is:
Figure RE-GDA0002956358740000072
wherein x represents an image, q represents label data of the image x, b represents a sequence number of a bounding box in the label data, i represents a sequence number of a prior box, pi represents the predicted probability that the prior box belongs to a positive sample, ti represents coordinates of the prior box (including a horizontal and vertical coordinate at the upper left corner and a horizontal and vertical coordinate at the lower right corner), and q represents the number of the prior box when the prior box i belongs to the bounding box bi,b1, otherwise qi,bIs 0, sbRepresenting the bounding box and the labeled coordinates thereof, Lu representing unsupervised loss, Lcs representing the loss function of classification, Lreg representing the loss function of regression, Nreg representing the normalization coefficient of regression term, and Ncs representing the normalization coefficient of classification term; wherein the content of the first and second substances,
ω(x) = 1 if max(p(x; θ)) ≥ τ, else 0
q(x) = ONE_HOT(argmax(p(x; θ)))
In the above formulas, θ denotes the trainable parameters of the deep learning model, and τ denotes the confidence threshold for the new label data.
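The combined loss can be sketched as follows. Binary cross-entropy stands in for L_cls and smooth L1 for L_reg; these choices, like the normalization constants passed in, are implementation assumptions rather than requirements of this embodiment:

```python
# A NumPy sketch of the loss in S140: supervised term plus a
# confidence-gated unsupervised term, following the formulas above.
import numpy as np

def cross_entropy(p, target):
    """L_cls stand-in: binary cross-entropy against 0/1 assignments."""
    p = np.clip(p, 1e-7, 1 - 1e-7)
    return -(target * np.log(p) + (1 - target) * np.log(1 - p))

def smooth_l1(pred, target):
    """L_reg stand-in: smooth L1 summed over the 4 box coordinates."""
    d = np.abs(pred - target)
    return np.where(d < 1, 0.5 * d ** 2, d - 0.5).sum(axis=-1)

def detection_loss(p, t, assign, gt_boxes, n_cls, n_reg):
    """assign[i, b] = 1 iff prior box i belongs to bounding box b."""
    cls_term = cross_entropy(p[:, None], assign).sum() / n_cls
    reg_term = (assign * smooth_l1(t[:, None, :], gt_boxes[None, :, :])).sum() / n_reg
    return cls_term + reg_term

def total_loss(sup, unsup, tau=0.9):
    """L = L_s + omega(x) * L_u(x), with omega(x) = 1[max p(x; theta) >= tau]."""
    p_u = unsup[0]
    omega = 1.0 if p_u.max() >= tau else 0.0  # confidence gate on pseudo labels
    return detection_loss(*sup) + omega * detection_loss(*unsup)

# Toy usage with 4 prior boxes and 2 ground-truth boxes.
rng = np.random.default_rng(0)
p, t = rng.random(4), rng.random((4, 4))
assign = (rng.random((4, 2)) > 0.5).astype(float)
boxes = rng.random((2, 4))
args = (p, t, assign, boxes, 4, max(assign.sum(), 1.0))
print(total_loss(args, args))
```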
It should be noted that, in order to increase the data volume and prevent the manually annotated image information from accounting for too small a proportion, the quantity of the enhanced data (i.e., the number of automatically labeled images) is 10 to 15 times the quantity of the manually annotated image information; this ratio may also be set according to the specific application scenario and requirements and is not limited to these particular values.
S150: obtaining a target detection result for the data to be detected based on the target detection model.
Compared with traditional methods, the target detection method based on semi-supervised learning can greatly reduce the labeling cost while improving the detection precision: the automatic labeling can approach, in quality, the level reached after multi-stage inspection and correction, at a greatly reduced cost. It can also be combined with manual quality inspection to check and correct existing data, simplifying the data quality-inspection process; beyond labor costs, management and time costs are also saved. In addition, with modification, the existing data cleaning method can be applied to more complex scenes, providing more options for subsequently expanding the application field.
Corresponding to the target detection method based on semi-supervised learning, the invention also provides a target detection device based on semi-supervised learning.
Specifically, FIG. 2 shows the functional modules of a target detection apparatus based on semi-supervised learning according to an embodiment of the present invention.
As shown in FIG. 2, the target detection apparatus 100 based on semi-supervised learning according to the present invention can be installed in an electronic device. According to the implemented functions, the semi-supervised learning based target detection apparatus 100 may include: a tag data determination unit 101, a new tag data acquisition unit 102, an enhanced data acquisition unit 103, an object detection model formation unit 104, and a detection result acquisition unit 105. A module according to the present invention, which may also be referred to as a unit, refers to a series of computer program segments that can be executed by a processor of an electronic device, can perform a fixed function, and are stored in a memory of the electronic device.
In the present embodiment, the functions regarding the respective modules/units are as follows:
a tag data determination unit 101 configured to determine, based on the acquired training data, tag data corresponding to the training data;
a new tag data obtaining unit 102, configured to perform data cleaning processing on the tag data, and obtain new tag data after cleaning. It is emphasized that, to further ensure the privacy and security of the new tag data, the new tag data may also be stored in a node of a blockchain.
An enhanced data obtaining unit 103, configured to perform data enhancement processing on the new tag data, and obtain enhanced data corresponding to the new tag data;
a target detection model forming unit 104, which trains a deep learning model based on the enhanced data and preset artificially labeled image information until a loss function of the deep learning model converges within a preset range to form a target detection model;
a detection result obtaining unit 105, configured to obtain a target detection result of the to-be-detected data based on the target detection model.
Specifically, in the tag data determination unit 101, the step of determining the tag data corresponding to the training data includes:
performing horizontal mirror flipping on the training data to obtain the processed training data;
training an open-source model on the processed training data until the open-source model converges within a specified range, to form a label acquisition model;
and obtaining label data for the unlabeled training data from the label acquisition model.
Furthermore, the training data comprises unlabeled image information;
the label data includes the category of an object located in the image information and the top-left abscissa, top-left ordinate, bottom-right abscissa, and bottom-right ordinate of a bounding box enclosing the object.
In the new tag data obtaining unit 102, data cleaning is performed on the label data, and the step of obtaining the cleaned new label data includes:
determining the width, height, and center-point coordinate information of the bounding box based on its top-left abscissa, top-left ordinate, bottom-right abscissa, and bottom-right ordinate;
determining converted coordinates for the bounding box from its width and the width of the corresponding image information, and from its height and center coordinates and the height of the corresponding image information;
and performing data cleaning on the converted coordinates based on the open-source framework CLEANLAB to obtain the cleaned new label data.
In the enhanced data obtaining unit 103, the new label data is stored in a blockchain, and the step of performing data enhancement on the new label data includes:
randomly jittering the color variables of the new label data; and/or geometrically deforming the objects within the bounding boxes in the new label data; and/or geometrically deforming the new label data and transforming the bounding boxes accordingly; wherein:
the color variables include brightness, saturation, contrast, and transparency;
the geometric deformation includes translation, flipping, shearing, and rotation.
Further, in the target detection model forming unit 104, the loss function includes the sum of a supervised loss and an unsupervised loss; wherein:
the expression of the supervised loss is as follows:
L_s(x, p, t) = (1/N_cls) Σ_i Σ_b L_cls(p_i, p_{i,b}) + (1/N_reg) Σ_i Σ_b p_{i,b} · L_reg(t_i, t_b)
wherein x denotes an image, p and t denote vector information, b denotes the index of a bounding box in the manually labeled image information, i denotes the index of a prior box, p_i denotes the predicted probability that prior box i belongs to a positive sample, t_i denotes the coordinates of prior box i (the top-left and bottom-right corner coordinates), p_{i,b} = 1 when prior box i belongs to bounding box b and p_{i,b} = 0 otherwise, t_b denotes the coordinates of the manually labeled bounding box, L_s denotes the supervised loss, L_cls denotes the classification loss function, L_reg denotes the regression loss function, N_reg denotes the normalization coefficient of the regression term, and N_cls denotes the normalization coefficient of the classification term;
the expression of the unsupervised loss is as follows:
L_u(x, q) = ω(x) · [ (1/N_cls) Σ_i Σ_b L_cls(p_i, q_{i,b}) + (1/N_reg) Σ_i Σ_b q_{i,b} · L_reg(t_i, s_b) ]
wherein x denotes an image, q denotes the label data of the image x, b denotes the index of a bounding box in the label data, i denotes the index of a prior box, p_i denotes the predicted probability that prior box i belongs to a positive sample, t_i denotes the coordinates of prior box i (the top-left and bottom-right corner coordinates), q_{i,b} = 1 when prior box i belongs to bounding box b and q_{i,b} = 0 otherwise, s_b denotes the coordinates of the labeled bounding box, L_u denotes the unsupervised loss, L_cls denotes the classification loss function, L_reg denotes the regression loss function, N_reg denotes the normalization coefficient of the regression term, and N_cls denotes the normalization coefficient of the classification term; wherein:
ω(x) = 1 if max(p(x; θ)) ≥ τ, else 0
q(x) = ONE_HOT(argmax(p(x; θ)))
where θ represents a trainable parameter of the deep learning model and τ represents a confidence threshold for the new tag data.
Fig. 3 is a schematic structural diagram of an electronic device for implementing a target detection method based on semi-supervised learning according to the present invention.
The electronic device 1 may comprise a processor 10, a memory 11 and a bus, and may further comprise a computer program, such as a semi-supervised learning based object detection program 12, stored in the memory 11 and executable on the processor 10.
The memory 11 includes at least one type of readable storage medium, including flash memory, removable hard disks, multimedia cards, card-type memory (e.g., SD or DX memory), magnetic memory, magnetic disks, optical disks, and the like. The memory 11 may in some embodiments be an internal storage unit of the electronic device 1, such as a hard disk of the electronic device 1. In other embodiments, the memory 11 may also be an external storage device of the electronic device 1, such as a plug-in mobile hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, or a Flash memory Card (Flash Card) provided on the electronic device 1. Further, the memory 11 may also include both an internal storage unit and an external storage device of the electronic device 1. The memory 11 may be used not only to store application software installed in the electronic device 1 and various types of data, such as the code of the object detection program based on semi-supervised learning, but also to temporarily store data that has been or will be output.
The processor 10 may in some embodiments be composed of integrated circuits, for example a single packaged integrated circuit, or of multiple integrated circuits packaged with the same or different functions, including one or more Central Processing Units (CPUs), microprocessors, digital processing chips, graphics processors, and combinations of various control chips. The processor 10 is the control unit of the electronic device; it connects the various components of the whole electronic device using various interfaces and lines, and executes the various functions and processes the data of the electronic device 1 by running or executing the programs or modules stored in the memory 11 (e.g., the object detection program based on semi-supervised learning) and calling the data stored in the memory 11.
The bus may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on. The bus is arranged to enable communication between the memory 11, the at least one processor 10, and other components.
Fig. 3 only shows an electronic device with components, and it will be understood by a person skilled in the art that the structure shown in fig. 3 does not constitute a limitation of the electronic device 1, and may comprise fewer or more components than shown, or a combination of certain components, or a different arrangement of components.
For example, although not shown, the electronic device 1 may further include a power supply (such as a battery) for supplying power to each component, and preferably, the power supply may be logically connected to the at least one processor 10 through a power management device, so as to implement functions of charge management, discharge management, power consumption management, and the like through the power management device. The power supply may also include any component of one or more dc or ac power sources, recharging devices, power failure detection circuitry, power converters or inverters, power status indicators, and the like. The electronic device 1 may further include various sensors, a bluetooth module, a Wi-Fi module, and the like, which are not described herein again.
Further, the electronic device 1 may further include a network interface, and optionally, the network interface may include a wired interface and/or a wireless interface (such as a WI-FI interface, a bluetooth interface, etc.), which are generally used for establishing a communication connection between the electronic device 1 and other electronic devices.
Optionally, the electronic device 1 may further comprise a user interface, which may be a Display (Display), an input unit (such as a Keyboard), and optionally a standard wired interface, a wireless interface. Alternatively, in some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display, which may also be referred to as a display screen or display unit, is suitable for displaying information processed in the electronic device 1 and for displaying a visualized user interface, among other things.
It is to be understood that the described embodiments are for purposes of illustration only and that the scope of the appended claims is not limited to such structures.
The semi-supervised learning based object detection program 12 stored by the memory 11 in the electronic device 1 is a combination of instructions that, when executed in the processor 10, may implement:
determining label data corresponding to the training data based on the acquired training data;
carrying out data cleaning processing on the tag data to obtain new cleaned tag data;
performing data enhancement processing on the new tag data to acquire enhanced data corresponding to the new tag data;
training a deep learning model based on the enhanced data and preset artificially labeled image information until a loss function of the deep learning model converges in a preset range to form a target detection model;
and acquiring a target detection result of the data to be detected based on the target detection model.
Optionally, the step of determining label data corresponding to the training data includes:
performing horizontal mirror flipping on the training data to obtain the processed training data;
training an open-source model on the processed training data until the open-source model converges within a specified range, to form a label acquisition model;
and obtaining label data for the unlabeled training data from the label acquisition model.
Optionally, the training data comprises unlabeled image information;
the label data includes the category of an object located in the image information and the top-left abscissa, top-left ordinate, bottom-right abscissa, and bottom-right ordinate of a bounding box enclosing the object. Optionally, performing data cleaning on the label data and acquiring the cleaned new label data includes:
determining the width, height, and center-point coordinate information of the bounding box based on its top-left abscissa, top-left ordinate, bottom-right abscissa, and bottom-right ordinate;
dividing the width of the bounding box by the width of the corresponding image information, and simultaneously dividing the height and center coordinates of the bounding box by the height of the corresponding image information, to obtain the converted coordinate information of the bounding box;
and performing data cleaning on the converted coordinates based on the open-source framework CLEANLAB to obtain the cleaned new label data.
Optionally, the new label data is stored in a blockchain, and the step of performing data enhancement on the new label data includes:
randomly jittering the color variables of the new label data; and/or geometrically deforming the objects within the bounding boxes in the new label data; and/or geometrically deforming the new label data and transforming the bounding boxes accordingly; wherein:
the color variables include brightness, saturation, contrast, and transparency;
the geometric deformation includes translation, flipping, shearing, and rotation.
Optionally, the loss function comprises the sum of a supervised loss and an unsupervised loss; wherein:
the expression of the supervised loss is as follows:
L_s(x, p, t) = (1/N_cls) Σ_i Σ_b L_cls(p_i, p_{i,b}) + (1/N_reg) Σ_i Σ_b p_{i,b} · L_reg(t_i, t_b)
wherein x denotes an image, p and t denote vector information, b denotes the index of a bounding box in the manually labeled image information, i denotes the index of a prior box, p_i denotes the predicted probability that prior box i belongs to a positive sample, t_i denotes the coordinates of prior box i (the top-left and bottom-right corner coordinates), p_{i,b} = 1 when prior box i belongs to bounding box b and p_{i,b} = 0 otherwise, t_b denotes the coordinates of the manually labeled bounding box, L_s denotes the supervised loss, L_cls denotes the classification loss function, L_reg denotes the regression loss function, N_reg denotes the normalization coefficient of the regression term, and N_cls denotes the normalization coefficient of the classification term;
the expression of the unsupervised loss is as follows:
L_u(x, q) = ω(x) · [ (1/N_cls) Σ_i Σ_b L_cls(p_i, q_{i,b}) + (1/N_reg) Σ_i Σ_b q_{i,b} · L_reg(t_i, s_b) ]
wherein x denotes an image, q denotes the label data of the image x, b denotes the index of a bounding box in the label data, i denotes the index of a prior box, p_i denotes the predicted probability that prior box i belongs to a positive sample, t_i denotes the coordinates of prior box i (the top-left and bottom-right corner coordinates), q_{i,b} = 1 when prior box i belongs to bounding box b and q_{i,b} = 0 otherwise, s_b denotes the coordinates of the labeled bounding box, L_u denotes the unsupervised loss, L_cls denotes the classification loss function, L_reg denotes the regression loss function, N_reg denotes the normalization coefficient of the regression term, and N_cls denotes the normalization coefficient of the classification term; wherein:
ω(x) = 1 if max(p(x; θ)) ≥ τ, else 0
q(x) = ONE_HOT(argmax(p(x; θ)))
where θ represents a trainable parameter of the deep learning model and τ represents a confidence threshold for the new tag data.
Optionally, the quantity of the enhanced data is 10 to 15 times the quantity of the manually labeled image information.
Further, if the integrated modules/units of the electronic device 1 are implemented in the form of software functional units and sold or used as separate products, they may be stored in a computer-readable storage medium. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, or a Read-Only Memory (ROM).
In the embodiments provided in the present invention, it should be understood that the disclosed apparatus, device and method can be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is only one logical functional division, and other divisions may be realized in practice.
The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.
In addition, functional modules in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional module.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof.
The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference signs in the claims shall not be construed as limiting the claim concerned.
The block chain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, a consensus mechanism, an encryption algorithm and the like. A block chain (Blockchain), which is essentially a decentralized database, is a series of data blocks associated by using a cryptographic method, and each data block contains information of a batch of network transactions, so as to verify the validity (anti-counterfeiting) of the information and generate a next block. The blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.
Furthermore, it is obvious that the word "comprising" does not exclude other elements or steps, and the singular does not exclude the plural. A plurality of units or means recited in the system claims may also be implemented by one unit or means through software or hardware. Terms such as first and second are used to denote names and do not indicate any particular order.
Finally, it should be noted that the above embodiments are only for illustrating the technical solutions of the present invention and not for limiting, and although the present invention is described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications or equivalent substitutions may be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention.

Claims (10)

1. A target detection method based on semi-supervised learning is characterized by comprising the following steps:
determining label data corresponding to the training data based on the acquired training data;
carrying out data cleaning processing on the tag data to obtain new cleaned tag data;
performing data enhancement processing on the new tag data to acquire enhanced data corresponding to the new tag data;
training a deep learning model based on the enhanced data and preset artificially labeled image information until a loss function of the deep learning model converges in a preset range to form a target detection model;
and acquiring a target detection result of the data to be detected based on the target detection model.
2. The semi-supervised learning based object detection method of claim 1, wherein the step of determining label data corresponding to the training data comprises:
performing horizontal mirror flipping on the training data to obtain the flipped training data;
training an open-source model on the flipped training data until the open-source model converges within a specified range, to form a label acquisition model;
and obtaining label data for the unlabeled training data from the label acquisition model.
3. The semi-supervised learning based object detection method of claim 2,
the training data comprises image information without labels;
the label data includes the category of an object located in the image information and the top-left abscissa, top-left ordinate, bottom-right abscissa, and bottom-right ordinate of a bounding box enclosing the object.
4. The semi-supervised learning based object detection method as recited in claim 3, wherein the label data is subjected to data cleaning processing, and the step of acquiring the cleaned new label data comprises the following steps:
determining the width, height, and center-point coordinate information of the bounding box based on its top-left abscissa, top-left ordinate, bottom-right abscissa, and bottom-right ordinate;
determining converted coordinates for the bounding box from its width and the width of the corresponding image information, and from its height and center coordinates and the height of the corresponding image information;
and performing data cleaning on the converted coordinates based on the open-source framework CLEANLAB to obtain the cleaned new label data.
5. The semi-supervised learning based object detection method according to claim 3, wherein the new label data is stored in a blockchain, and the step of performing data enhancement on the new label data comprises:
randomly jittering the color variables of the new label data; and/or geometrically deforming the objects within the bounding boxes in the new label data; and/or geometrically deforming the new label data and transforming the bounding boxes accordingly; wherein:
the color variables include brightness, saturation, contrast, and transparency;
the geometric deformation includes translation, flipping, shearing, and rotation.
6. The semi-supervised learning based object detection method of claim 1, wherein the loss function comprises the sum of a supervised loss and an unsupervised loss; wherein:
the expression of the supervised loss is as follows:
L_s(x, p, t) = (1/N_cls) Σ_i Σ_b L_cls(p_i, p_{i,b}) + (1/N_reg) Σ_i Σ_b p_{i,b} · L_reg(t_i, t_b)
wherein x denotes an image, p and t denote vector information, b denotes the index of a bounding box in the manually labeled image information, i denotes the index of a prior box, p_i denotes the predicted probability that prior box i belongs to a positive sample, t_i denotes the coordinates of prior box i (the top-left and bottom-right corner coordinates), p_{i,b} = 1 when prior box i belongs to bounding box b and p_{i,b} = 0 otherwise, t_b denotes the coordinates of the manually labeled bounding box, L_s denotes the supervised loss, L_cls denotes the classification loss function, L_reg denotes the regression loss function, N_reg denotes the normalization coefficient of the regression term, and N_cls denotes the normalization coefficient of the classification term;
the expression of the unsupervised loss is as follows:
L_u(x, q) = ω(x) · [ (1/N_cls) Σ_i Σ_b L_cls(p_i, q_{i,b}) + (1/N_reg) Σ_i Σ_b q_{i,b} · L_reg(t_i, s_b) ]
wherein x denotes an image, q denotes the label data of the image x, b denotes the index of a bounding box in the label data, i denotes the index of a prior box, p_i denotes the predicted probability that prior box i belongs to a positive sample, t_i denotes the coordinates of prior box i (the top-left and bottom-right corner coordinates), q_{i,b} = 1 when prior box i belongs to bounding box b and q_{i,b} = 0 otherwise, s_b denotes the coordinates of the labeled bounding box, L_u denotes the unsupervised loss, L_cls denotes the classification loss function, L_reg denotes the regression loss function, N_reg denotes the normalization coefficient of the regression term, and N_cls denotes the normalization coefficient of the classification term; wherein:
ω(x) = 1 if max(p(x; θ)) ≥ τ, else 0
q(x) = ONE_HOT(argmax(p(x; θ)))
where θ represents a trainable parameter of the deep learning model and τ represents a confidence threshold for the new tag data.
7. The semi-supervised learning based object detection method of claim 1,
the quantity of the enhanced data is 10 to 15 times the quantity of the manually labeled image information.
8. An object detection apparatus based on semi-supervised learning, the apparatus comprising:
a label data determination unit configured to determine label data corresponding to the training data based on the acquired training data;
a new tag data acquisition unit, configured to perform data cleaning processing on the tag data, and acquire new tag data after cleaning;
the enhanced data acquisition unit is used for performing data enhancement processing on the new tag data to acquire enhanced data corresponding to the new tag data;
the target detection model forming unit is used for training a deep learning model based on the enhanced data and preset artificially labeled image information until the loss function of the deep learning model is converged in a preset range so as to form a target detection model;
and the detection result acquisition unit is used for acquiring a target detection result of the data to be detected based on the target detection model.
9. An electronic device, characterized in that the electronic device comprises:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the steps in the semi-supervised learning based object detection method as claimed in any one of claims 1 to 7.
10. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the steps in the semi-supervised learning based object detection method as recited in any one of claims 1 to 7.
CN202011288652.1A 2020-11-17 2020-11-17 Target detection method, device and storage medium based on semi-supervised learning Active CN112580684B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011288652.1A CN112580684B (en) 2020-11-17 2020-11-17 Target detection method, device and storage medium based on semi-supervised learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011288652.1A CN112580684B (en) 2020-11-17 2020-11-17 Target detection method, device and storage medium based on semi-supervised learning

Publications (2)

Publication Number Publication Date
CN112580684A true CN112580684A (en) 2021-03-30
CN112580684B CN112580684B (en) 2024-04-09

Family

ID=75122779

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011288652.1A Active CN112580684B (en) 2020-11-17 2020-11-17 Target detection method, device and storage medium based on semi-supervised learning

Country Status (1)

Country Link
CN (1) CN112580684B (en)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190205794A1 (en) * 2017-12-29 2019-07-04 Oath Inc. Method and system for detecting anomalies in data labels
CN110910375A (en) * 2019-11-26 2020-03-24 北京明略软件系统有限公司 Detection model training method, device, equipment and medium based on semi-supervised learning

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113139594A (en) * 2021-04-19 2021-07-20 北京理工大学 Airborne image unmanned aerial vehicle target self-adaptive detection method
CN113191409A (en) * 2021-04-20 2021-07-30 国网江苏省电力有限公司营销服务中心 Method for detecting abnormal electricity consumption behaviors of residents through tag data expansion and deep learning
CN112990374A (en) * 2021-04-28 2021-06-18 平安科技(深圳)有限公司 Image classification method, device, electronic equipment and medium
WO2022227192A1 (en) * 2021-04-28 2022-11-03 平安科技(深圳)有限公司 Image classification method and apparatus, and electronic device and medium
CN112990374B (en) * 2021-04-28 2023-09-15 平安科技(深圳)有限公司 Image classification method, device, electronic equipment and medium
CN113379322A (en) * 2021-07-06 2021-09-10 国网江苏省电力有限公司营销服务中心 Electricity stealing user distinguishing method based on tag augmentation

Also Published As

Publication number Publication date
CN112580684B (en) 2024-04-09

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant