WO2022262141A1 - Human-in-the-loop method, apparatus, system, electronic device, and storage medium - Google Patents

Human-in-the-loop method, apparatus, system, electronic device, and storage medium

Info

Publication number
WO2022262141A1
Authority
WO
WIPO (PCT)
Prior art keywords
target object
target
type
neural network
ratio
Prior art date
Application number
PCT/CN2021/119968
Other languages
English (en)
French (fr)
Inventor
林成龙
崔磊
Original Assignee
深圳市商汤科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳市商汤科技有限公司
Publication of WO2022262141A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/04 Inference or reasoning models

Definitions

  • The present application relates to the human-in-the-loop technical field, and in particular to a human-in-the-loop method, apparatus, system, electronic device, and storage medium.
  • Embodiments of the present application provide a human-in-the-loop method and apparatus, an electronic device, and a storage medium.
  • An embodiment of the present application provides a human-in-the-loop method, including: acquiring an image to be processed; performing inference on each target object in the image to be processed through at least one neural network corresponding to the target task, to obtain a prediction result for each target object; in response to there being a target object whose prediction confidence is less than a preset threshold, mining that target object using standard feature vectors, determining the mining result of the target object, and constructing the first part of a training data set based on the mining result; and retraining the at least one neural network using the training data set.
  • The step of performing inference on each target object in the image to be processed through the at least one neural network corresponding to the target task, to obtain the prediction result of each target object, includes: performing feature extraction on each target object in the image to be processed through the at least one neural network, determining the initial type of each target object, and obtaining the prediction result. The step of mining the target object using the standard feature vectors and determining its mining result, in response to there being a target object whose prediction confidence is less than the preset threshold, includes: in response to there being target objects whose initial-type confidence is less than the preset threshold, sorting the target objects of each type in descending order of confidence to obtain a sorted sequence for each type; taking the first set number of target objects from each sorted sequence in turn and determining them as the target objects to be mined; and mining those target objects using the standard feature vectors of each type, to determine their types and obtain the mining result.
  • Before the first set number of target objects is taken from each sorted sequence in turn and determined as the target objects to be mined, the method further includes: obtaining the demand ratio of the types of target objects; and, based on the demand ratio and the number of target objects of each type whose confidence is not less than the preset threshold, determining the set number of target objects of each type to be mined from the target objects whose confidence is less than the preset threshold.
  • After the step of mining the target object using the standard feature vectors and determining its mining result, the method further includes: judging whether each target object has been mined; if a target object has not been mined, receiving a manual annotation of the type of that undetermined target object, so that its mining result is determined manually; and constructing the second part of the training data set based on the target objects whose mining results were determined manually.
  • The step of mining the target objects using the standard feature vectors and determining their mining results includes: mining each target object with the standard feature vectors of each type through a clustering method.
  • The step of performing inference on each target object in the image to be processed through the at least one neural network corresponding to the target task, to obtain the prediction result of each target object, includes: performing feature extraction on each target object in the image to be processed through the at least one neural network, and determining the confidence that each target object belongs to each type; and determining the type with the highest confidence for each target object as its initial type, to obtain the prediction result.
  • The method further includes: in response to there being a target object whose initial-type confidence is not less than the preset threshold, determining the initial type as the type of that target object; and constructing the third part of the training data set based on the target objects whose type is determined by the initial type.
  • The step of retraining the at least one neural network using the training data set includes: performing target object detection on the training data set, and dividing the training data set, based on the detection result, into positive sample pictures in which a target object exists and negative sample pictures in which no target object exists; judging whether the first quantity ratio between the positive sample pictures and the negative sample pictures is the first set ratio; if it is not, adjusting the number of positive sample pictures or negative sample pictures by resampling and/or partial random sampling, so that the first quantity ratio becomes the first set ratio; and retraining the at least one neural network based on the positive and negative sample pictures at the first set ratio.
  • The method further includes: judging whether the second quantity ratio among the types of the target objects whose types have been determined is the second set ratio; if it is not, adjusting the number of target objects of the different types by resampling and/or partial random sampling, so that the second quantity ratio becomes the second set ratio; and retraining the at least one neural network with the different types of target objects at the second set ratio.
  • Before the step of acquiring the image to be processed, the method includes: obtaining original training samples, which are samples labeled with target object types; judging whether the third quantity ratio among the different types of target objects is the third set ratio; if it is not, adjusting the number of target objects of the different types by resampling and/or partial random sampling, so that the third quantity ratio becomes the third set ratio; and training an initial network with the different types of target objects at the third set ratio, to obtain the at least one neural network.
  • An embodiment of the present application also provides a human-in-the-loop apparatus.
  • The human-in-the-loop apparatus includes: an acquisition module configured to acquire an image to be processed; an inference module configured to perform inference on each target object in the image to be processed through at least one neural network corresponding to the target task, to obtain the prediction result of each target object; a mining module configured to, in response to there being a target object whose prediction confidence is less than a preset threshold, mine that target object using standard feature vectors, determine the mining result of the target object, and construct the first part of the training data set based on the mining result; and a training module configured to retrain the at least one neural network using the training data set.
  • An embodiment of the present application also provides a human-in-the-loop system, including: an inference platform configured to acquire an image to be processed and perform inference on each target object in the image to be processed through at least one neural network corresponding to the target task, to obtain the prediction result of each target object; a labeling platform configured to, in response to there being a target object whose prediction confidence is less than a preset threshold, mine that target object using standard feature vectors, determine the mining result of the target object, and construct the first part of the training data set based on the mining result; and a training platform configured to retrain the at least one neural network using the training data set.
  • An embodiment of the present application also provides an electronic device, including a memory and a processor coupled to each other, the processor being configured to execute program instructions stored in the memory so as to implement the above human-in-the-loop method.
  • An embodiment of the present application also provides a computer-readable storage medium on which program instructions are stored; when the program instructions are executed by a processor, the above human-in-the-loop method is implemented.
  • In the above solution, at least one neural network corresponding to the target task first performs inference on each target object in the image to be processed, to obtain the prediction result of each target object; then, in response to there being a target object whose prediction confidence is less than the preset threshold, the standard feature vectors are used to mine that target object, its mining result is determined, and the first part of the training data set is constructed based on the mining result; finally, the training data set is used to retrain the at least one neural network, so that the mined target objects improve the detection accuracy and the reliability of the at least one neural network.
  • Because the at least one neural network is retrained with the training data set, target objects obtained during its own inference are used for retraining, which realizes iterative upgrading of the at least one neural network and further improves its performance and detection efficiency.
  • Fig. 1 is a schematic flowchart of an embodiment of the human-in-the-loop method of the present application;
  • Fig. 2 is a schematic flowchart of another embodiment of the human-in-the-loop method of the present application;
  • Fig. 3 is a schematic flowchart of an embodiment of the method for obtaining the at least one neural network in the embodiment of Fig. 2;
  • Fig. 4 is a schematic flowchart of an embodiment of the human-in-the-loop method in the embodiment of Fig. 2;
  • Fig. 5 is a schematic block diagram of an embodiment of the human-in-the-loop apparatus of the present application;
  • Fig. 6 is a schematic block diagram of an embodiment of the human-in-the-loop system of the present application;
  • Fig. 7 is a schematic block diagram of another embodiment of the human-in-the-loop system of the present application;
  • Fig. 8 is a schematic block diagram of an embodiment of the electronic device of the present application;
  • Fig. 9 is a schematic diagram of an embodiment of the computer-readable storage medium of the present application.
  • The terms “system” and “network” are often used interchangeably herein.
  • The term “and/or” herein merely describes an association between associated objects and covers three cases. For example, “A and/or B” can mean: A exists alone, A and B exist simultaneously, or B exists alone.
  • The character “/” herein generally indicates an “or” relationship between the objects before and after it.
  • “A plurality” herein means two or more.
  • FIG. 1 is a schematic flowchart of an embodiment of the human-in-the-loop method of the present application. Specifically, the method may include the following steps:
  • Step S11 Acquiring images to be processed.
  • The image to be processed in this embodiment corresponds to the target task.
  • For example, the image to be processed may be a garbage image or an image of urban fireworks; when the target task is to detect floating objects in a river, the image to be processed is a river image.
  • The specific objects and fields to which the at least one neural network is applied are not limited here.
  • The image to be processed is obtained first; it is an image captured in the application environment after the at least one neural network has been deployed.
  • For example, when the at least one neural network is a river floating object detection model, the images to be processed are pictures taken in the application environment after that model has been deployed.
  • Step S12 Perform inference on each target object in the image to be processed through at least one neural network corresponding to the target task, and obtain a prediction result of each target object.
  • The at least one neural network in this embodiment can be applied to various fields; it performs inference and prediction on the image to be processed to obtain a prediction result. In other words, at least one neural network is used to execute the target task in this embodiment.
  • For example, when the target task is a face recognition task, more than one neural network may be used, including a neural network for recognizing faces; when the target task is a license plate recognition task, at least three neural networks may be used to perform the task, for example a neural network for vehicle recognition, a neural network for license plate recognition, a neural network for text recognition, and so on.
  • The type and number of neural networks in this embodiment are set according to the specific target task and are not limited in this embodiment.
  • The target object is the object of interest in the image to be processed and is determined by what the at least one neural network is applied to.
  • The inference process of this embodiment depends on the specific target task. For example, when the target task is detection, the inference process can be detecting the target object; when the target task is detection and identification, the inference process can be detecting and identifying the target object.
  • Step S13 In response to the presence of a target object whose confidence level of the prediction result is less than a preset threshold, the standard feature vector is used to mine the target object, the mining result of the target object is determined, and the first part of the training data set is constructed based on the mining result.
  • the preset threshold is the threshold of confidence, and the specific value can be set based on actual application, and is not limited here.
  • A target object whose prediction confidence is less than the preset threshold is one whose prediction result cannot be settled from the confidence alone. The standard feature vectors are therefore used to mine the target objects whose confidence is less than the preset threshold, so as to determine their mining results.
  • the first part of the training data set in this embodiment is at least a part of the training data set.
  • the first part of the training data set may be part or all of the training data set.
  • the training data set may only include the mining results of this step.
  • the training data set may include the mining result of this step and the training data used before obtaining at least one neural network.
  • the entire composition of the training data set is not limited here.
  • Step S14 retrain at least one neural network using the training data set.
  • At least one neural network is retrained by using the training data set, so that the mined target object can be used to iteratively upgrade the at least one neural network, so that the application effect of the at least one neural network can be further improved.
  • The human-in-the-loop method of this embodiment first uses at least one neural network corresponding to the target task to perform inference on each target object in the image to be processed, to obtain the prediction result of each target object; then, in response to there being a target object whose prediction confidence is less than the preset threshold, it mines that target object using the standard feature vectors, determines the mining result of the target object, and constructs the first part of the training data set based on the mining result; finally, it retrains the at least one neural network using the training data set, so that the mined target objects improve the detection accuracy and the reliability of the at least one neural network.
  • the at least one neural network is further retrained by using the training data set, so that at least one neural network is retrained by using the target object obtained during the inference process of the at least one neural network, thereby realizing the iterative upgrade of the at least one neural network, Further improving the performance and detection efficiency of at least one neural network.
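  • As an illustration of the branching in step S13, the following minimal Python sketch separates predictions by a confidence threshold. The `Prediction` record, the `object_id` field, and the threshold value of 0.70 are assumptions made for the example and are not specified by the application.

```python
from typing import NamedTuple


class Prediction(NamedTuple):
    object_id: str       # hypothetical identifier of a detected target object
    initial_type: str    # type predicted by the neural network
    confidence: float    # confidence of that prediction, in [0, 1]


def split_by_confidence(predictions, threshold):
    """Separate predictions that can be kept from those that need mining."""
    # "Not less than the preset threshold" is implemented as >=.
    confident = [p for p in predictions if p.confidence >= threshold]
    to_mine = [p for p in predictions if p.confidence < threshold]
    return confident, to_mine


predictions = [
    Prediction("obj-1", "bottle", 0.92),
    Prediction("obj-2", "box", 0.41),
    Prediction("obj-3", "paper", 0.70),
]
confident, to_mine = split_by_confidence(predictions, threshold=0.70)
```

  • In this sketch only `obj-2` falls below the threshold and would be passed on to mining with the standard feature vectors.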
  • FIG. 2 is a schematic flowchart of another embodiment of the human-in-the-loop method of the present application. Specifically, the method may include the following steps:
  • Step S21 Acquiring images to be processed.
  • Step S21 is the same as step S11 in the above embodiment; please refer to the description above, which is not repeated here.
  • At least one neural network is deployed and applied, and images to be processed are obtained during the inference process of the at least one neural network.
  • When the number of acquired images to be processed reaches a set value, the upgrade iteration of the at least one neural network is started; when the number has not reached the set value, the at least one neural network in this embodiment only outputs its inference results for the images to be processed.
  • the setting value may be determined based on actual application, and is not limited here.
  • Step S22 performing feature extraction on each target object in the image to be processed through at least one neural network, and determining the initial type of each target object.
  • The at least one neural network in this embodiment can be a deep learning network model, for example a composite network framework built from network structures such as a convolutional neural network and a recursive upgrade network, or a network model built with a single deep learning network as a template; this is not limited here.
  • At least one neural network is used to detect and infer the target object on the image to be processed, so as to determine whether there is a target object on the image to be processed; if there is no target object on the image to be processed, no subsequent detection is performed on this type of image to be processed; if If there is a target object on the image to be processed, at least one neural network is further used to perform feature extraction for each target object in the image to be processed where the target object exists.
  • The above determination of the initial type of each target object includes: performing feature extraction on each target object in the image to be processed through the at least one neural network to obtain a feature vector for each target object; determining, from the feature vector, the confidence that each target object belongs to each type; and determining the type with the highest confidence for each target object as its initial type.
  • The types are the categories into which the at least one neural network needs to classify the detected target objects.
  • the sum of the probabilities of each type is 1, that is, the probability that the type of the target object belongs to each type is determined in this step. In an application scenario, when the detection scenario is garbage detection and there are 5 types of garbage, 5 probabilities that the target object belongs to the 5 types of garbage will be obtained, and the sum of these 5 probabilities is 1.
  • the type with the highest confidence of each target object is determined as the initial type of each target object.
  • For example, if the probabilities of a target object belonging to each type are bottle: 30%, paper: 5%, plastic board: 20%, box: 41%, and stick: 4%, then the box type, which has the maximum probability of 41%, is determined as the initial type of the target object.
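  • The selection of the initial type can be sketched in a few lines of Python using the probabilities from the example above. The function name `initial_type` and the dictionary representation of the per-type confidences are assumptions for illustration only.

```python
def initial_type(confidences):
    """Return the (type, confidence) pair with the highest confidence."""
    best = max(confidences, key=confidences.get)
    return best, confidences[best]


# Per-type probabilities from the garbage detection example; they sum to 1.
probs = {
    "bottle": 0.30,
    "paper": 0.05,
    "plastic board": 0.20,
    "box": 0.41,
    "stick": 0.04,
}
label, conf = initial_type(probs)
```

  • Here `label` is `"box"`. Whether 41% is enough to accept that type is decided separately by the preset threshold.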
  • the initial type is each division category of each target object, which is determined based on division standards in practical applications, and is not limited here.
  • For example, when the target objects in the image to be processed are river floating objects and the division standard for river floating objects is solid versus liquid, the initial type of each target object can be liquid or solid.
  • the initial type of the target object may not be the final determined type of the target object.
  • The at least one neural network can be used to determine the confidence that the target object belongs to each type, where the confidences over all types sum to 1, so that the type with the highest confidence is used as the initial type of the target object.
  • the characteristics of the target object can also be compared with the characteristics of various types, and the type most similar to the characteristics of the target object can be used as the initial type of the target object. It is not limited here.
  • the reliability of the initial type of each target object is improved by determining the type with the highest confidence of each target object as the initial type of each target object.
  • At least one neural network may be used to extract features of each target object in the image to be processed, and use the extracted features to determine the confidence between the target object and each type.
  • the type confidence obtained first may be liquid: 30%, solid: 70%.
  • Step S23 In response to the presence of target objects whose initial type confidence is less than the preset threshold, respectively sort the confidence levels of each type of target objects in descending order to obtain a sorting sequence of each type of target objects.
  • Before the target objects to be mined are taken from the sorted sequences, the method further includes: acquiring the demand ratio of the types of target objects; and, based on the demand ratio and the number of target objects of each type whose confidence is not less than the preset threshold, determining the set number of target objects of each type to be mined from the target objects whose confidence is less than the preset threshold.
  • the demand ratio can be set according to the actual application demand. For example, when there are three types of target objects, the demand ratio of each type may be 1:1:1.
  • the demand ratio in this embodiment is not limited here.
  • The number of target objects of each type whose confidence is not less than the confidence threshold varies from one application to the next. The number of target objects of each type to be mined from those whose confidence is less than the confidence threshold is therefore determined jointly by the demand ratio and the number of target objects of each type whose confidence is not less than the confidence threshold.
  • For example, when the demand ratio is 1:1 and the numbers of the two types of target objects whose confidence is not less than the confidence threshold are 200 and 100, then 100 and 200 target objects of the two types, respectively, are mined from the target objects whose confidence is less than the confidence threshold, so that the final quantity ratio between the two types equals the demand ratio.
  • Once the number of target objects of each type to be mined from those below the confidence threshold has been determined, that number is used as the set number for the type.
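  • The quota arithmetic in the example above can be sketched as follows. The application does not state how the overall size of the final set is chosen, so the `final_total` parameter (600 here, which reproduces the 100 and 200 of the example) is an assumption, as are the function name and the type names.

```python
def mining_quotas(demand_ratio, confident_counts, final_total):
    """Number of low-confidence objects to mine per type.

    demand_ratio:      desired proportion per type, e.g. {"type_a": 1, "type_b": 1}
    confident_counts:  objects per type already above the confidence threshold
    final_total:       assumed overall size of the final set (not from the source)
    """
    ratio_sum = sum(demand_ratio.values())
    quotas = {}
    for obj_type, share in demand_ratio.items():
        target = final_total * share // ratio_sum        # per-type target count
        quotas[obj_type] = max(0, target - confident_counts.get(obj_type, 0))
    return quotas


quotas = mining_quotas(
    demand_ratio={"type_a": 1, "type_b": 1},
    confident_counts={"type_a": 200, "type_b": 100},
    final_total=600,
)
```

  • With these inputs each type ends at 300 objects, so 100 and 200 objects are mined for the two types, matching the 1:1 demand ratio.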
  • Step S24 Take the first set number of target objects from each sorted sequence in turn, and determine them as the target objects to be mined.
  • Sorting by confidence and then taking the first set number of target objects in order guarantees a certain amount of mining while favoring the more reliable target objects of each type, which speeds up training of the at least one neural network and improves the reliability of the sample data.
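  • Steps S23 and S24 together amount to a per-type top-N selection, sketched below. The tuple layout `(object_id, initial_type, confidence)` and the function name are hypothetical choices for the example.

```python
from collections import defaultdict


def select_for_mining(low_confidence, set_numbers):
    """Pick the first N objects of each type after sorting by confidence.

    low_confidence: iterable of (object_id, initial_type, confidence) tuples
    set_numbers:    set number of objects to mine for each type
    """
    by_type = defaultdict(list)
    for obj_id, obj_type, conf in low_confidence:
        by_type[obj_type].append((obj_id, conf))

    selected = {}
    for obj_type, items in by_type.items():
        # Descending order of confidence, as in step S23.
        items.sort(key=lambda item: item[1], reverse=True)
        limit = set_numbers.get(obj_type, 0)
        selected[obj_type] = [obj_id for obj_id, _ in items[:limit]]
    return selected


low = [
    ("a", "bottle", 0.55),
    ("b", "bottle", 0.62),
    ("c", "box", 0.41),
    ("d", "bottle", 0.30),
    ("e", "box", 0.66),
]
picked = select_for_mining(low, {"bottle": 2, "box": 1})
```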
  • Step S25 Mining the target object to be mined by using standard feature vectors of various types, determining the type of the target object to be mined, and obtaining the mining result.
  • the clustering method may be, for example, k-means algorithm, k-medoids algorithm, k-median algorithm, k-center algorithm or other clustering algorithms, etc., which are not limited here.
  • various types of standard feature vectors can be used to mine each target object through a clustering method, thereby improving mining efficiency and reliability.
  • The type of each target object can be judged by comparison with the standard feature vector of each type: the similarity between the feature vector of the target object and the standard feature vector of each type is computed, and the type whose standard feature vector is most similar to the feature vector of the target object is taken as the type of the target object, which completes the mining of that target object. This process is repeated until the set number of target objects of each type has been mined or all target objects have been exhausted, which yields the mining results.
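  • The similarity comparison described above can be sketched as a nearest-standard-vector lookup. The application does not name a similarity measure, so cosine similarity is an assumption here, as is the optional `min_similarity` cutoff used to leave an object unmined so that it can go to manual annotation.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def mine_type(feature, standard_vectors, min_similarity=0.0):
    """Return (type, score) of the most similar standard feature vector.

    Returns (None, score) when the best score is below min_similarity,
    i.e. the object is treated as not mined (assumed criterion).
    """
    best_type, best_score = None, -1.0
    for obj_type, standard in standard_vectors.items():
        score = cosine_similarity(feature, standard)
        if score > best_score:
            best_type, best_score = obj_type, score
    if best_score < min_similarity:
        return None, best_score
    return best_type, best_score


# Toy two-dimensional standard feature vectors for the solid/liquid example.
standards = {"liquid": [1.0, 0.0], "solid": [0.0, 1.0]}
mined, score = mine_type([0.2, 0.9], standards, min_similarity=0.5)
unmined, _ = mine_type([-1.0, -1.0], standards, min_similarity=0.5)
```

  • In this sketch the first feature vector is assigned to `"solid"`, while the second matches neither standard vector well enough and is returned as `None`.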
  • After the type of the target object in the image to be processed has been determined by the above method, the at least one neural network outputs the type of the target object, which completes the actual application of the at least one neural network.
  • Step S26 Determine whether each target object has been mined.
  • It is judged whether each target object whose confidence is less than the preset threshold has been mined, that is, whether its type was determined through step S25. If a target object has not been mined, step S27 is executed; if it has been mined, step S28 is executed.
  • Step S27 Receive a manual annotation of the type of each undetermined target object, so that its mining result is determined manually, and construct the second part of the training data set based on the target objects whose mining results were determined manually.
  • If a target object has not been mined, a manual annotation of the type of that undetermined target object is received, its mining result is thereby determined manually, and the second part of the training data set is constructed from the target objects whose mining results were determined manually.
  • the mining result of the target object is manually determined by manually marking the type of the undetermined target object, thereby further expanding the richness of the training data set and improving the training effect of at least one neural network.
  • Step S28 Construct the first part of the training data set based on the mining results.
  • the first part of the training data set is constructed based on the mining result of the target object.
  • The above method may further include step S29: in response to there being a target object whose initial-type confidence is not less than the preset threshold, determine the initial type as the type of that target object, and construct the third part of the training data set based on the target objects whose type is determined by the initial type.
  • this step can be executed between step S22-step S30, and this embodiment does not limit the specific execution sequence thereof.
  • the confidence threshold can be set according to the actual situation, for example: 70%, 80% or 65%, etc., which is not limited here.
  • Step S30 retrain at least one neural network using the training data set.
  • The training data set is used to retrain the at least one neural network. The training data set in this embodiment includes at least the first part, constructed from the mining results; the second part, constructed from the target objects whose mining results were determined manually; and the third part, constructed from the target objects whose type is determined by the initial type.
  • The training data set may also include the original training data of the at least one neural network in addition to the first part, the second part, and the third part described above.
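  • The composition of the training data set can be sketched as a simple concatenation of its parts. The `(image, type)` pair layout, the `source` tag, and the file names are assumptions added for illustration; the application does not prescribe a data format.

```python
def build_training_set(mined, manual, confident, original=()):
    """Concatenate the parts of the training data set, tagging their origin.

    Each part is an iterable of (image, type) pairs:
      mined     -> first part (mining results)
      manual    -> second part (manually annotated objects)
      confident -> third part (initial type accepted as the type)
      original  -> optional original training data of the network
    """
    parts = (
        ("mined", mined),
        ("manual", manual),
        ("confident", confident),
        ("original", original),
    )
    dataset = []
    for source, samples in parts:
        for image, obj_type in samples:
            dataset.append({"image": image, "type": obj_type, "source": source})
    return dataset


dataset = build_training_set(
    mined=[("img_1.jpg", "box")],
    manual=[("img_2.jpg", "stick")],
    confident=[("img_3.jpg", "bottle"), ("img_4.jpg", "paper")],
)
```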
  • Step S30 may include: performing target object detection on the training data set, and dividing the training data set, based on the detection result, into positive sample pictures in which a target object exists and negative sample pictures in which no target object exists; judging whether the first quantity ratio between the positive sample pictures and the negative sample pictures is the first set ratio; if it is not, adjusting the number of positive sample pictures or negative sample pictures by resampling and/or partial random sampling, so that the first quantity ratio becomes the first set ratio; and retraining the at least one neural network based on the positive and negative sample pictures at the first set ratio.
  • The positive sample pictures can be resampled, for example sampled 10 times each, so that the quantity ratio between positive and negative sample pictures becomes one to one.
  • Alternatively, the negative sample pictures can be partially randomly sampled, for example at a sampling ratio of 10 to 1 so that only some of the negative sample pictures are kept, which likewise brings the ratio of positive to negative sample pictures to one to one.
  • Adjusting the first quantity ratio between the positive and negative sample pictures to the first set ratio mitigates the imbalance in the training data.
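  • The two adjustment options can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function name `balance_pos_neg`, the `mode` switch, and the use of sampling with replacement for resampling are illustrative choices, not details given in the application.

```python
import random


def balance_pos_neg(positives, negatives, set_ratio=1.0, mode="resample", seed=0):
    """Move the positive:negative count ratio toward set_ratio.

    mode="resample"  : repeat positives (sampling with replacement)
    mode="subsample" : keep only a random subset of the negatives
    Both modes assume the positives are the minority class.
    """
    if not positives or not negatives:
        raise ValueError("both sample lists must be non-empty")
    rng = random.Random(seed)
    positives, negatives = list(positives), list(negatives)
    if mode == "resample":
        wanted = round(len(negatives) * set_ratio)
        if wanted > len(positives):
            # Add randomly repeated positives until the target count is met.
            positives += rng.choices(positives, k=wanted - len(positives))
    elif mode == "subsample":
        wanted = max(1, round(len(positives) / set_ratio))
        if wanted < len(negatives):
            # Keep a random subset of negatives without replacement.
            negatives = rng.sample(negatives, k=wanted)
    else:
        raise ValueError(f"unknown mode: {mode}")
    return positives, negatives
```

  • With 10 positive and 100 negative pictures and a set ratio of one to one, `mode="resample"` yields 100 of each, while `mode="subsample"` yields 10 of each.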
  • Resampling or partial random sampling can also be applied among the various types of pictures within the positive sample pictures, so that the positive sample pictures are balanced internally.
  • Target objects of different types at the second set ratio are then used to train the initial network to obtain the at least one neural network.
  • The resampling and partial random sampling methods are the same as those used between the positive and negative sample pictures described above and are not repeated here.
  • FIG. 3 is a schematic flowchart of an embodiment of how the at least one neural network is obtained in the embodiment of FIG. 2.
  • Step S31: Acquire original training samples, where the original training samples are samples labeled with target object types.
  • The original training samples are obtained first; they are samples that have already been labeled with the type of each target object.
  • The samples may be labeled manually, yielding original training samples in which the target objects and the type of each target object are marked.
  • The samples may also be labeled by other classification models, likewise yielding original training samples in which the target objects and the type of each target object are marked.
  • The labeling manner of the original training samples is not limited here.
  • Step S32: Determine whether the third quantity ratio of different types of target objects is the third set ratio.
  • The third set ratio can be set according to actual training requirements, for example equal ratios among all types, or a higher ratio for certain types according to how difficult each type is to train, which is not limited here.
  • Step S33: If the third quantity ratio of different types of target objects is not the third set ratio, adjust the number of different types of target objects by resampling and/or partial random sampling, so that the third quantity ratio of the different types of target objects becomes the third set ratio.
  • That is, when the check in step S32 fails, the per-type counts are adjusted by resampling and/or partial random sampling until the third quantity ratio equals the third set ratio.
  • The ratio between the positive and negative sample pictures in the original training samples can first be adjusted by resampling and/or partial random sampling, and the third quantity ratio among the types of the positive samples can then be adjusted in the same way until it equals the third set ratio, so as to meet the training requirements.
  • Data sampled according to these two steps alleviates the imbalance both between positive and negative sample pictures and among the various types.
  • Deep learning network training can then be performed to obtain the at least one neural network of FIG. 2.
  • If the third quantity ratio is already the third set ratio, step S34 is executed directly.
  • Step S34: Train the initial network with the different types of target objects at the third set ratio to obtain the at least one neural network.
  • The at least one neural network is obtained by training the initial network with the different types of target objects that satisfy the third set ratio.
  • The initial network may be a deep learning network model, for example a composite network framework constructed using network structures such as a convolutional neural network and a recursive upgrade network. It can also be a network model constructed with a single deep learning network as a template, which is not limited here.
  • Adjusting the third quantity ratio of different types of target objects in the original training samples to the third set ratio balances the training samples of each type, so that the training of the at least one neural network is more comprehensive.
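  • Steps S32 and S33 can be illustrated with a short Python sketch. It assumes, purely for illustration, that the set ratio is given as a mapping from type to relative weight and that balance is reached by resampling only (no type is shrunk); the function name `rebalance_types` is hypothetical.

```python
import random


def rebalance_types(samples_by_type, set_ratio, seed=0):
    """Oversample each type so that the per-type counts follow set_ratio.

    samples_by_type : dict mapping type name -> list of samples
    set_ratio       : dict mapping type name -> positive relative weight
    """
    rng = random.Random(seed)
    # Pick the scale so that no type has to lose samples.
    scale = max(len(samples_by_type[t]) / set_ratio[t] for t in set_ratio)
    balanced = {}
    for type_name, ratio in set_ratio.items():
        items = list(samples_by_type[type_name])
        if not items:
            raise ValueError(f"no samples for type {type_name!r}")
        extra = max(0, round(scale * ratio) - len(items))
        # Resampling: append randomly repeated samples of this type.
        balanced[type_name] = items + rng.choices(items, k=extra)
    return balanced
```

  • For example, counts of 30, 10, and 5 with an equal set ratio become 30, 30, and 30; with a 2:1:1 set ratio they become 30, 15, and 15.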
  • The man-machine loop method of this embodiment first uses the original training samples to train the initial network to obtain the at least one neural network and deploys it, so that a set number of images to be processed is obtained during its normal application. The at least one neural network detects target objects in these images and determines the type of each target object, so that active learning can be used to complete data mining, filter data effectively, reduce labeling cost, and improve data quality.
  • Target objects whose type has been determined are manually reviewed to further improve the accuracy of the target object categories.
  • Resampling or partial random sampling can be used during training to alleviate the data imbalance problem.
  • The man-machine loop labeling step connects the inference process and the training process of the at least one neural network, so that the whole process of model production and iteration is more efficient and the performance of the at least one neural network is improved.
  • Each time the at least one neural network has obtained the set number of images to be processed, it can be iteratively updated again, so that its performance and detection accuracy improve cyclically to a certain extent.
  • FIG. 4 is a schematic flowchart of an implementation of the man-machine loop method in the embodiment of FIG. 2 .
  • This embodiment takes the case in which the at least one neural network is a river floating object detection model as an example. Referring to FIG. 4, the method may include the following steps:
  • Step S41: Use the original training samples to train the initial network to obtain a river floating object detection model.
  • The initial network is trained using the original training samples.
  • The original training samples can be labeled manually to obtain original training samples with accurately labeled categories, which are then used to train the initial network to obtain the river floating object detection model.
  • Step S42: Acquire images to be processed.
  • The river floating object detection model obtained in step S41 is applied to a river floating object detection scene and used to detect the acquired images to be processed, so as to determine the categories of the river floating objects in those images.
  • Step S43: Perform feature extraction on the image to be processed through the river floating object detection model to determine the type of each target object.
  • The river floating object detection model first performs detection inference on the image to be processed to determine whether it contains a target object, that is, a river floating object. If the image contains no target object, no subsequent detection is performed on that image.
  • If target objects are present, the river floating object detection model extracts features of each target object in the image to obtain a feature vector for each target object, and the probability that each target object belongs to each type is determined from the feature vector. The type with the highest probability is then taken as the initial type of that target object. If the confidence of the initial type is not less than the confidence threshold, the initial type is determined as the final type of the target object.
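  • The selection of the initial type and the threshold check can be written compactly. The sketch below assumes the model's per-type probabilities are available as a dictionary; the function name `pick_initial_type` and the default threshold of 0.8 are illustrative only.

```python
def pick_initial_type(probabilities, threshold=0.8):
    """Return (initial_type, confidence, is_final) for one target object.

    probabilities : dict mapping type name -> probability from the model
    is_final is True when the confidence is not less than the threshold,
    in which case the initial type is accepted as the final type.
    """
    initial_type = max(probabilities, key=probabilities.get)
    confidence = probabilities[initial_type]
    return initial_type, confidence, confidence >= threshold
```

  • An object for which `is_final` is False keeps its initial type only provisionally and becomes a candidate for mining in step S46.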
  • Step S44: Output the type of each target object.
  • Step S45: Determine whether the number of images to be processed satisfies a set value.
  • The set value can be determined according to the actual application, for example 1000 images or 2000 images, which is not limited here.
  • If the set value is not satisfied, step S42 is executed again to continue acquiring images to be processed; if the set value is satisfied, step S46 is executed.
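  • The control flow of steps S42 to S45 amounts to accumulating inference results until the set value is reached. A minimal Python sketch follows; `image_stream` and `infer` are hypothetical placeholders for the image source and the deployed detection model.

```python
def collect_batch(image_stream, infer, set_value):
    """Run inference on incoming images until set_value images are collected.

    Returns a list of (image, prediction) pairs, or None if the stream
    ends before the set value is reached.
    """
    batch = []
    for image in image_stream:
        batch.append((image, infer(image)))  # steps S42 to S44
        if len(batch) >= set_value:          # step S45
            return batch                     # ready for mining in step S46
    return None
```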
  • Step S46: Mine the images to be processed to determine the types of the target objects.
  • The demand ratio of each type of target object is obtained; based on the demand ratio and on the number of target objects of each type whose confidence is not less than the confidence threshold, the number of target objects of each type that needs to be mined from the target objects whose confidence is less than the confidence threshold is determined.
  • The set number of mined target objects and the target objects whose confidence is not less than the confidence threshold are combined to obtain training data.
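  • The application does not give a formula for the number to mine, so the following Python sketch is only one plausible reading: it scales the demand ratio so that the best-represented type needs no mining and computes the shortfall of every other type. The function name `mining_quota` and this scaling rule are assumptions.

```python
def mining_quota(confident_counts, demand_ratio):
    """Number of low-confidence objects to mine per type (assumed rule).

    confident_counts : dict type -> count with confidence >= threshold
    demand_ratio     : dict type -> positive relative demand weight
    """
    # Scale so that the best-represented type already meets its demand.
    scale = max(confident_counts.get(t, 0) / demand_ratio[t] for t in demand_ratio)
    return {
        t: max(0, round(scale * ratio) - confident_counts.get(t, 0))
        for t, ratio in demand_ratio.items()
    }
```

  • For example, with 60, 20, and 10 confident objects of three types and an equal demand ratio, the quotas are 0, 40, and 50.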
  • The type of each target object whose confidence is less than the confidence threshold and which has not been mined is labeled manually, which guarantees the accuracy of each target object's type to a certain extent.
  • Determining the types of target objects through the river floating object detection model improves the comprehensiveness of target object detection, and the results are fed back to improve the detection accuracy of the model, which realizes the man-machine loop for training and upgrading the river floating object detection model.
  • Step S47: Sample the target objects whose type has been determined to obtain training data.
  • The number of positive or negative sample pictures is adjusted so that the first quantity ratio between the positive sample pictures and the negative sample pictures is the first set ratio; the river floating object detection model is then retrained based on the positive and negative sample pictures at the first set ratio.
  • Resampling or random sampling can also be applied among the various types of pictures within the positive sample pictures, so that the positive sample pictures are balanced internally. For example, it can be judged whether the second quantity ratio among the types of the target objects whose type has been determined is the second set ratio; if it is not, the number of target objects of different types is adjusted by resampling and/or partial random sampling so that the second quantity ratio becomes the second set ratio; finally, the network is trained with the target objects of different types at the second set ratio to obtain the river floating object detection model.
  • The resampling and partial random sampling of the various types of pictures within the positive sample pictures is the same as the sampling between the positive and negative sample pictures described above and is not repeated here.
  • Step S48: Train the river floating object detection model with the sampled training data.
  • The river floating object detection model is trained with the target objects of different types that satisfy the set ratios, so as to upgrade and iterate the model.
  • In this embodiment, the initial network is first trained with the original training samples to obtain the river floating object detection model, which is then deployed, so that a set number of images to be processed is obtained during its normal application. The model detects target objects and determines the type of each one, so that active learning can be used to complete data mining, filter data effectively, reduce labeling cost, and improve data quality.
  • The river floating object detection model can be updated iteratively, so that its performance and detection accuracy improve to a certain extent.
  • FIG. 5 is a schematic diagram of the framework of an embodiment of the man-machine loop device of the present application.
  • The man-machine loop device 50 includes an acquisition module 51, a reasoning module 52, a mining module 53, and a training module 54. The acquisition module 51 is configured to acquire images to be processed; the reasoning module 52 is configured to perform inference on each target object in the image to be processed through at least one neural network corresponding to the target task, to obtain the prediction result of each target object; the mining module 53 is configured to, in response to the existence of a target object whose prediction result has a confidence less than a preset threshold, mine the target object using the standard feature vector, determine the mining result of the target object, and construct the first part of the training data set based on the mining result; the training module 54 is configured to retrain the at least one neural network using the training data set.
  • In this embodiment, at least one neural network corresponding to the target task is first used to perform inference on each target object in the image to be processed, to obtain the prediction result of each target object; then, in response to the existence of a target object whose prediction result has a confidence less than the preset threshold, the standard feature vector is used to mine the target object and determine its mining result, so that the first part of the training data set is constructed based on the mining result; finally, the training data set is used to retrain the at least one neural network, so that the mined target objects improve the detection accuracy and the reliability of the at least one neural network.
  • Because the training data set is built from target objects obtained during the inference process of the at least one neural network, retraining with it realizes an iterative upgrade of the at least one neural network and further improves its performance and detection efficiency.
  • The reasoning module 52 is configured to extract features of each target object in the image to be processed through the at least one neural network, determine the initial type of each target object, and obtain the prediction result.
  • The mining module 53 is configured to, in response to the existence of a target object whose initial type has a confidence less than the preset threshold, sort the confidences of the target objects of each type in descending order to obtain a sorted sequence for each type; take, in turn, the first set number of target objects from each sorted sequence and determine them as the target objects to be mined; and mine the target objects to be mined using the standard feature vectors of each type, to determine their types and obtain the mining results.
  • Sorting by confidence and taking the first set number of target objects from each sequence guarantees a certain amount of mining while improving the reliability of the target objects of each type, which speeds up the training of the at least one neural network and improves the reliability of the sample data.
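  • The per-type sorting and top-N selection can be sketched in Python as below. The record keys and the `quota` mapping (type to set number) are hypothetical names used only for this illustration.

```python
from collections import defaultdict


def select_for_mining(objects, threshold, quota):
    """Pick, per initial type, the highest-confidence objects below threshold.

    objects : iterable of dicts with "id", "initial_type", "confidence"
    quota   : dict mapping type name -> set number of objects to mine
    """
    by_type = defaultdict(list)
    for obj in objects:
        if obj["confidence"] < threshold:
            by_type[obj["initial_type"]].append(obj)
    selected = {}
    for type_name, group in by_type.items():
        # Descending sort, then keep the first `quota` entries of the sequence.
        group.sort(key=lambda o: o["confidence"], reverse=True)
        selected[type_name] = group[: quota.get(type_name, 0)]
    return selected
```

  • Objects that fall outside the quota are not mined and would go on to manual labeling.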
  • The mining module 53 is configured to obtain the demand ratio of each type of target object and, based on the demand ratio and the number of target objects of each type whose confidence is not less than the preset threshold, determine the set number of target objects of each type that need to be mined from the target objects whose confidence is less than the preset threshold.
  • In this way, the number of target objects of each type to be mined from the target objects whose confidence is less than the confidence threshold is determined from the demand ratio, and those target objects are mined using the standard feature vectors of each type, which ensures the richness of the sample data to a certain extent and improves the efficiency of model retraining.
  • The mining module 53 is configured to, after mining the target objects using the standard feature vectors and determining their mining results, judge whether each target object has been mined; if a target object has not been mined, receive a manual annotation of the type of the target object whose type has not been determined, so that its mining result is determined manually; and construct the second part of the training data set based on the target objects whose mining results were determined manually.
  • Manually labeling the types of the undetermined target objects further enriches the training data set and improves the training effect of the at least one neural network.
  • the mining module 53 is configured to use various types of standard feature vectors to mine each target object through a clustering method.
  • various types of standard feature vectors are used to mine each target object through a clustering method, thereby improving mining efficiency and reliability.
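  • The application names a clustering method without specifying it. As a stand-in, the Python sketch below assigns each object to the type whose standard feature vector is most similar, and leaves the object unmined when no type is similar enough. The use of cosine similarity, the `min_similarity` cutoff, and the function names are all assumptions made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mine_type(feature, standard_vectors, min_similarity=0.9):
    """Return the best-matching type, or None if the object stays unmined.

    standard_vectors : dict mapping type name -> standard feature vector
    """
    best_type, best_sim = None, -1.0
    for type_name, vector in standard_vectors.items():
        sim = cosine(feature, vector)
        if sim > best_sim:
            best_type, best_sim = type_name, sim
    return best_type if best_sim >= min_similarity else None
```

  • A return value of None corresponds to a target object that has not been mined and is therefore passed on for manual labeling.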
  • The reasoning module 52 is configured to extract features of each target object in the image to be processed through the at least one neural network and determine the confidence of each target object for each type; the type with the highest confidence for each target object is determined as the initial type of that target object, and the prediction result is obtained.
  • The mining module 53 is further configured to, in response to the existence of a target object whose initial type has a confidence not less than the preset threshold, determine the initial type as the type of the target object, and construct the third part of the training data set based on the target objects whose type was determined from the initial type.
  • Determining the type with the highest probability as the initial type of each target object improves the reliability of the initial type of each target object.
  • The training module 54 is configured to perform target object detection on the training data set and, based on the detection results, divide the training data set into positive sample pictures in which a target object exists and negative sample pictures in which no target object exists; determine whether the first quantity ratio between the positive sample pictures and the negative sample pictures is the first set ratio; if it is not, adjust the number of positive or negative sample pictures by resampling and/or partial random sampling so that the first quantity ratio becomes the first set ratio; and retrain the at least one neural network based on the positive and negative sample pictures at the first set ratio.
  • This reduces the imbalance between the positive sample pictures and the negative sample pictures.
  • The training module 54 is further configured to determine whether the second quantity ratio among the types of the target objects whose type has been determined is the second set ratio; if it is not, adjust the number of target objects of different types by resampling and/or partial random sampling so that the second quantity ratio becomes the second set ratio; and train the initial network with the target objects of different types at the second set ratio to obtain the at least one neural network.
  • The acquisition module 51 is also configured to obtain original training samples before obtaining the image to be processed, where the original training samples are samples labeled with target object types; determine whether the third quantity ratio of different types of target objects is the third set ratio; if it is not, adjust the number of target objects of different types by resampling and/or partial random sampling so that the third quantity ratio becomes the third set ratio; and train the initial network with the target objects of different types at the third set ratio to obtain the at least one neural network.
  • This makes the training of the at least one neural network more comprehensive.
  • FIG. 6 is a schematic diagram of an embodiment of the man-machine loop system of the present application.
  • the man-machine loop system 60 includes an inference platform 63 , a labeling platform 61 and a training platform 62 connected by communication.
  • The inference platform 63 is configured to acquire the image to be processed and perform inference on each target object in it through at least one neural network corresponding to the target task, to obtain the prediction result of each target object.
  • The labeling platform 61 is configured to, in response to the existence of a target object whose prediction result has a confidence less than the preset threshold, mine the target object using the standard feature vector, determine the mining result of the target object, and construct the first part of the training data set based on the mining result.
  • The training platform 62 is configured to retrain the at least one neural network using the training data set.
  • In this way, while the at least one neural network is being applied to the images to be processed, those images can be used as training data to train the at least one neural network again, thereby improving its application effect.
  • FIG. 7 is a schematic diagram of another embodiment of the man-machine loop system of the present application.
  • the man-machine loop system 70 includes a graphical user interface (GUI, Graphical User Interface) 71, a business layer 72, a platform layer 73, a scheduling layer 74, and a hardware layer 75.
  • GUI 71 refers to a computer-operated user interface displayed in a graphical manner for receiving user operations.
  • the business layer 72 includes a resource center, a user center, and an authority center.
  • the resource center is used to manage system resources
  • the user center is used to manage user information
  • the authority center is used to manage authority.
  • the platform layer 73 includes a labeling platform, a training platform and an inference platform.
  • the platform layer 73 is used to implement the man-machine loop method in any of the above-mentioned embodiments.
  • The inference platform can be used to obtain the image to be processed and perform inference on each target object in it through at least one neural network corresponding to the target task, to obtain the prediction result of each target object.
  • The labeling platform can be used to, in response to the existence of a target object whose prediction result has a confidence less than the preset threshold, mine the target object using the standard feature vector and determine its mining result, so as to construct the first part of the training data set based on the mining result; the training platform can use the training data set to retrain the at least one neural network.
  • The scheduling layer 74 is used to schedule the man-machine loop system 70.
  • The Kubernetes scheduling mechanism can be used for scheduling; Kubernetes, abbreviated as K8s, is an open source container orchestration engine.
  • the hardware layer 75 may include a central processing unit (CPU, Central Processing Unit), a graphics processing unit (GPU, Graphics Processing Unit), and a network attached storage (NAS, Network Attached Storage). The application of the man-machine loop system 70 is realized through the above-mentioned hardware.
  • In this way, while the at least one neural network is being applied to the images to be processed, those images can be used as training data to train the at least one neural network again, thereby improving its application effect.
  • FIG. 8 is a schematic frame diagram of an embodiment of an electronic device of the present application.
  • the electronic device 80 includes a memory 81 and a processor 82 coupled to each other.
  • The processor 82 is configured to execute the program instructions stored in the memory 81 to implement the steps in any of the above embodiments of the man-machine loop method.
  • the electronic device 80 may include, but is not limited to: a microcomputer, a server.
  • the electronic device 80 may also include mobile devices such as notebook computers and tablet computers, which are not limited here.
  • The processor 82 is used to control itself and the memory 81 to implement the steps of any one of the above embodiments of the man-machine loop method.
  • Processor 82 may also be referred to as a CPU.
  • the processor 82 may be an integrated circuit chip with signal processing capability.
  • The processor 82 can also be a general-purpose processor, a digital signal processor (DSP, Digital Signal Processor), an application-specific integrated circuit (ASIC, Application Specific Integrated Circuit), a field-programmable gate array (FPGA, Field-Programmable Gate Array) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • a general-purpose processor may be a microprocessor, or the processor may be any conventional processor, or the like.
  • the processor 82 may be jointly realized by an integrated circuit chip.
  • the above solutions can improve the performance and accuracy of at least one neural network.
  • FIG. 9 is a schematic frame diagram of an embodiment of a computer-readable storage medium of the present application.
  • the computer-readable storage medium 90 stores program instructions 901 that can be executed by the processor, and the program instructions 901 are used to implement the steps in any of the above embodiments of the man-machine loop method.
  • the above solutions can improve the performance and accuracy of at least one neural network.
  • the disclosed methods and devices may be implemented in other ways.
  • the device implementations described above are only illustrative.
  • the division of modules or units is only a logical function division. In actual implementation, there may be other division methods.
  • For example, units or components can be combined or integrated into another system, or some features may be ignored or not implemented.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be through some interfaces, and the indirect coupling or communication connection of devices or units may be in electrical, mechanical or other forms.
  • a unit described as a separate component may or may not be physically separated, and a component shown as a unit may or may not be a physical unit, that is, it may be located in one place, or may also be distributed to network units. Part or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated units can be implemented in the form of hardware or in the form of software functional units.
  • If the integrated unit is realized in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium.
  • Based on this understanding, the technical solution of the present application, in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions to make a computer device (which may be a personal computer, a server, a network device, etc.) or a processor execute all or part of the steps of the methods in the various embodiments of the present application.
  • The aforementioned storage media include media that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM, Read-Only Memory), random access memory (RAM, Random Access Memory), a magnetic disk, or an optical disc.

Abstract

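A man-machine loop method, device, system, electronic device, and storage medium. The man-machine loop method includes: acquiring an image to be processed (S11); performing inference on each target object in the image to be processed through at least one neural network corresponding to a target task, to obtain a prediction result for each target object (S12); in response to the existence of a target object whose prediction result has a confidence less than a preset threshold, mining the target object using standard feature vectors, determining the mining result of the target object, and constructing the first part of a training data set based on the mining result (S13); and retraining the at least one neural network using the training data set (S14). In this way, while the at least one neural network is being applied to the images to be processed, those images can be used as training data to train the at least one neural network again, thereby improving the application effect of the at least one neural network.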

Description

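Man-machine loop method, device, system, electronic device, and storage medium
Cross-reference to related applications
This application is based on, and claims priority to, the Chinese patent application with application number 202110667457.8 filed on June 16, 2021, the entire content of which is incorporated herein by reference.
Technical field
This application relates to the technical field of man-machine loops, and in particular to a man-machine loop method, device, system, electronic device, and storage medium.
Background
Artificial intelligence technology represented by deep learning depends heavily on data. This is why technologies such as face recognition, speech recognition, and natural language understanding have been able to achieve rapid breakthroughs: large amounts of high-quality data contributed by academia and industry already exist in these fields.
In many more practical application fields, such as garbage detection, urban smoke and fire warning, and river floating object detection, the lack of sufficient data has slowed the progress of artificial intelligence technology.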
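Summary
Embodiments of the present application provide a man-machine loop method and device, an electronic device, and a storage medium.
An embodiment of the present application provides a man-machine loop method, including: acquiring an image to be processed; performing inference on each target object in the image to be processed through at least one neural network corresponding to a target task, to obtain a prediction result for each target object; in response to the existence of a target object whose prediction result has a confidence less than a preset threshold, mining the target object using standard feature vectors, determining the mining result of the target object, and constructing the first part of a training data set based on the mining result; and retraining the at least one neural network using the training data set.
In some embodiments of the present application, the step of performing inference on each target object in the image to be processed through the at least one neural network corresponding to the target task, to obtain a prediction result for each target object, includes: performing feature extraction on each target object in the image to be processed through the at least one neural network, determining the initial type of each target object, and obtaining the prediction result. The step of, in response to the existence of a target object whose prediction result has a confidence less than the preset threshold, mining the target object using standard feature vectors and determining the mining result of the target object includes: in response to the existence of a target object whose initial type has a confidence less than the preset threshold, sorting the confidences of the target objects of each type in descending order to obtain a sorted sequence of target objects for each type; taking, in turn, the first set number of target objects from each sorted sequence and determining them as the target objects to be mined; and mining the target objects to be mined using the standard feature vectors of each type, to determine the types of the target objects to be mined and obtain the mining result.
In some embodiments of the present application, before taking, in turn, the first set number of target objects from each sorted sequence and determining them as the target objects to be mined, the method further includes: obtaining the demand ratio of each type of target object; and, based on the demand ratio and the number of target objects of each type whose confidence is not less than the preset threshold, determining the set number of target objects of each type that need to be mined from the target objects whose confidence is less than the preset threshold.
In some embodiments of the present application, after the step of mining the target object using standard feature vectors and determining the mining result of the target object, the method further includes: judging whether each target object has been mined; if a target object has not been mined, receiving a manual annotation of the type of the target object whose type has not been determined, so that the mining result of the target object is determined manually; and constructing the second part of the training data set based on the target objects whose mining results were determined manually.
In some embodiments of the present application, the step of mining the target object using standard feature vectors and determining the mining result of the target object includes: mining each target object through a clustering method using the standard feature vectors of each type.
In some embodiments of the present application, the step of performing inference on each target object in the image to be processed through the at least one neural network corresponding to the target task, to obtain a prediction result for each target object, includes: performing feature extraction on each target object in the image to be processed through the at least one neural network, and determining the confidence of each target object for each type; and determining the type with the highest confidence for each target object as the initial type of that target object, to obtain the prediction result.
In some embodiments of the present application, the method further includes: in response to the existence of a target object whose initial type has a confidence not less than the preset threshold, determining the initial type as the type of the target object; and constructing the third part of the training data set based on the target objects whose type was determined from the initial type.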
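In some embodiments of the present application, the step of retraining the at least one neural network using the training data set includes: performing target object detection on the training data set and, based on the detection result, dividing the training data set into positive sample pictures in which a target object exists and negative sample pictures in which no target object exists; judging whether the first quantity ratio between the positive sample pictures and the negative sample pictures is the first set ratio; if the first quantity ratio is not the first set ratio, adjusting the number of positive or negative sample pictures by resampling and/or partial random sampling, so that the first quantity ratio between the positive sample pictures and the negative sample pictures is the first set ratio; and retraining the at least one neural network based on the positive and negative sample pictures at the first set ratio.
In some embodiments of the present application, after the first quantity ratio between the positive sample pictures and the negative sample pictures is the first set ratio, the method further includes: judging whether the second quantity ratio among the types of the target objects whose type has been determined is the second set ratio; if the second quantity ratio is not the second set ratio, adjusting the number of target objects of different types by resampling and/or partial random sampling, so that the second quantity ratio of the target objects of different types is the second set ratio; and retraining the at least one neural network with the target objects of different types at the second set ratio.
In some embodiments of the present application, before the step of acquiring the image to be processed, the method includes: acquiring original training samples, where the original training samples are samples labeled with target object types; judging whether the third quantity ratio of target objects of different types is the third set ratio; if the third quantity ratio is not the third set ratio, adjusting the number of target objects of different types by resampling and/or partial random sampling, so that the third quantity ratio of the target objects of different types is the third set ratio; and training the initial network with the target objects of different types at the third set ratio to obtain the at least one neural network.
An embodiment of the present application further provides a man-machine loop device, including: an acquisition module configured to acquire an image to be processed; a reasoning module configured to perform inference on each target object in the image to be processed through at least one neural network corresponding to a target task, to obtain a prediction result for each target object; a mining module configured to, in response to the existence of a target object whose prediction result has a confidence less than a preset threshold, mine the target object using standard feature vectors, determine the mining result of the target object, and construct the first part of a training data set based on the mining result; and a training module configured to retrain the at least one neural network using the training data set.
An embodiment of the present application further provides a man-machine loop system, including: an inference platform configured to acquire an image to be processed and perform inference on each target object in the image to be processed through at least one neural network corresponding to a target task, to obtain a prediction result for each target object; a labeling platform configured to, in response to the existence of a target object whose prediction result has a confidence less than a preset threshold, mine the target object using standard feature vectors, determine the mining result of the target object, and construct the first part of a training data set based on the mining result; and a training platform configured to retrain the at least one neural network using the training data set.
An embodiment of the present application further provides an electronic device, including a memory and a processor coupled to each other, the processor being configured to execute program instructions stored in the memory to implement the above man-machine loop method.
An embodiment of the present application further provides a computer-readable storage medium on which program instructions are stored, the program instructions implementing the above man-machine loop method when executed by a processor.
In the embodiments of the present application, at least one neural network corresponding to the target task is first used to perform inference on each target object in the image to be processed, to obtain the prediction result of each target object; then, in response to the existence of a target object whose prediction result has a confidence less than the preset threshold, standard feature vectors are used to mine the target object and determine its mining result, so that the first part of the training data set is constructed based on the mining result; finally, the training data set is used to retrain the at least one neural network, so that the mined target objects improve the detection accuracy and the reliability of the at least one neural network. Because the retraining uses target objects obtained during the inference process of the at least one neural network, it also realizes an iterative upgrade of the at least one neural network and further improves its performance and detection efficiency.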
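Brief Description of the Drawings
FIG. 1 is a schematic flowchart of an embodiment of the human-in-the-loop method of the present application;
FIG. 2 is a schematic flowchart of another embodiment of the human-in-the-loop method of the present application;
FIG. 3 is a schematic flowchart of an embodiment of the way the at least one neural network in the embodiment of FIG. 2 is obtained;
FIG. 4 is a schematic flowchart of an implementation of the human-in-the-loop method of the embodiment of FIG. 2;
FIG. 5 is a schematic framework diagram of an embodiment of the human-in-the-loop device of the present application;
FIG. 6 is a schematic framework diagram of an embodiment of the human-in-the-loop system of the present application;
FIG. 7 is a schematic framework diagram of another embodiment of the human-in-the-loop system of the present application;
FIG. 8 is a schematic framework diagram of an embodiment of the electronic device of the present application;
FIG. 9 is a schematic framework diagram of an embodiment of the computer-readable storage medium of the present application.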
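Detailed Description
The solutions of the embodiments of the present application are described in detail below with reference to the accompanying drawings.
In the following description, specific details such as particular system structures, interfaces, and techniques are set forth for the purpose of illustration rather than limitation, so that the present application can be thoroughly understood.
The terms "system" and "network" are often used interchangeably herein. The term "and/or" herein merely describes an association between associated objects and indicates that three relationships may exist; for example, "A and/or B" may mean: A alone, both A and B, or B alone. In addition, the character "/" herein generally indicates an "or" relationship between the associated objects before and after it. Furthermore, "multiple" herein means two or more.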
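Please refer to FIG. 1, which is a schematic flowchart of an embodiment of the human-in-the-loop method of the present application. Specifically, the method may include the following steps:
Step S11: acquire an image to be processed.
The image to be processed in this embodiment corresponds to the target task. In one application scenario, when the target task is garbage classification detection, the image to be processed is a garbage image. In another application scenario, when the target task is urban smoke and fire detection, the image to be processed is an urban smoke and fire image. In yet another application scenario, when the target task is river floating object detection, the image to be processed is a river image. The specific objects and fields to which the at least one neural network is applied are not limited here.
In this embodiment, the image to be processed is acquired first, where the image to be processed is an image acquired in the application environment after the at least one neural network has been put into use. In one application scenario, when the at least one neural network is a river floating object detection model and the model has been applied to environmental monitoring of a certain river, the images to be processed are pictures of that river captured after the model was deployed.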
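Step S12: perform inference on each target object in the image to be processed through at least one neural network corresponding to the target task, to obtain a prediction result for each target object.
The at least one neural network of this embodiment may be applied in many fields; inference and prediction are performed on the image to be processed through the at least one neural network to obtain the prediction result. That is, at least one neural network corresponds to the target task in this embodiment. For example, when the target task is a face recognition task, at least two neural networks may be used to perform it, for example at least a neural network for detecting faces and a neural network for recognizing faces; or, when the target task is a license plate recognition task, at least three neural networks may be used, for example at least a neural network for vehicle recognition, a neural network for license plate recognition, and a neural network for character recognition; and so on. The type and number of neural networks in this embodiment are set according to the specific type of the target task and are not limited here.
After the image to be processed is acquired, inference is performed on each target object in it through the at least one neural network corresponding to the target task, to obtain the prediction result for each target object. A target object is an object of interest in the image to be processed and is determined by what the at least one neural network is applied to. The inference process of this embodiment is set according to the specific target task; for example, when the target task is detection, the inference process may be detecting the target objects; when the target task is detection and recognition, the inference process may be detecting and recognizing the target objects.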
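Step S13: in response to the existence of a target object whose prediction-result confidence is less than a preset threshold, mine the target object using standard feature vectors, determine a mining result of the target object, and construct a first part of the training data set based on the mining result.
In this embodiment, after the prediction result of each target object is obtained, in response to the existence, among the target objects, of a target object whose prediction-result confidence is less than the preset threshold, the target objects whose confidence is less than the preset threshold are mined using the standard feature vectors to determine their mining results, and the first part of the training data set is constructed based on the mining results. The preset threshold is a threshold on the confidence; its value may be set according to the actual application and is not limited here.
A target object whose prediction-result confidence is less than the preset threshold is one whose prediction result cannot be decided from the confidence alone; such target objects are therefore mined using the standard feature vectors to determine their mining results.
The first part of the training data set in this embodiment is at least a part of the training data set; by way of example, it may be part or all of the training data set. In one application scenario, the training data set may include only the mining results of this step. In another application scenario, the training data set may include the mining results of this step together with the training data used before the at least one neural network was obtained. The full composition of the training data set is not limited here.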
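Step S14: retrain the at least one neural network using the training data set.
The at least one neural network is retrained using the training data set, so that the mined target objects can be used to iteratively upgrade the at least one neural network and further improve its effectiveness in use.
Through the above method, the human-in-the-loop method of this embodiment first performs inference on each target object in the image to be processed through the at least one neural network corresponding to the target task, to obtain a prediction result for each target object; then, in response to the existence of a target object whose prediction-result confidence is less than the preset threshold, mines the target object using standard feature vectors, determines its mining result, and constructs the first part of the training data set based on the mining result; and finally retrains the at least one neural network using the training data set, so that the mined target objects improve the detection accuracy and the reliability of the at least one neural network. Because the retraining uses target objects obtained during the network's own inference, the at least one neural network is iteratively upgraded, which further improves its performance and detection efficiency.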
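Please refer to FIG. 2, which is a schematic flowchart of another embodiment of the human-in-the-loop method of the present application. Specifically, the method may include the following steps:
Step S21: acquire an image to be processed.
This step is the same as step S11 of the above embodiment; please refer to the description above, which is not repeated here.
In this embodiment, the at least one neural network is deployed and put into use, and images to be processed are acquired during its inference. When the number of acquired images to be processed reaches a set value, the upgrade-iteration steps for the at least one neural network are started; when the number has not reached the set value, the at least one neural network only outputs the detection results for the images to be processed. The set value may be determined by the actual application and is not limited here.
This embodiment is described using a scenario in which the at least one neural network is applied to determining the types of target objects; when the at least one neural network is applied to other purposes, the steps are similar to those of this embodiment.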
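Step S22: perform feature extraction on each target object in the image to be processed through the at least one neural network, and determine an initial type of each target object.
The at least one neural network of this embodiment may be a deep learning network model, for example a composite network framework built from network structures such as convolutional neural networks and recurrent neural networks, or a network model built with a single deep learning network as a template; this is not limited here.
First, detection inference for target objects is performed on the images to be processed through the at least one neural network, to judge whether target objects exist in them. If no target object exists in an image to be processed, no subsequent detection is performed on that image; if a target object exists, feature extraction is further performed, through the at least one neural network, on each target object in the image. When target objects exist in an image to be processed, there may be more than one. For example, when the image to be processed is a garbage picture, there may be multiple pieces of garbage in it.
In some embodiments, determining the initial type of each target object includes: performing feature extraction on each target object in the image to be processed through the at least one neural network to obtain a feature vector of each target object, determining from the feature vector the confidence that each target object belongs to each type, and determining, for each target object, the type with the highest confidence as its initial type. The types are the categories into which the at least one neural network needs to divide the detected target objects. The confidences over all types sum to 1; that is, this step determines the probability that the target object belongs to each type. In one application scenario, when the detection scenario is garbage detection and there are five garbage types, five probabilities are obtained for the target object, one per garbage type, and these five probabilities sum to 1.
After the confidence that each target object belongs to each type is obtained, the type with the highest confidence is determined as the initial type of that target object. In one application scenario, in garbage detection, the probabilities that a certain target object belongs to each type are: bottle 30%, paper 5%, plastic board 20%, box 41%, and stick 4%; the box type, which has the highest probability of 41%, is then determined as the initial type of the target object.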
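The selection of the initial type described above can be illustrated with a minimal Python sketch. The function name `initial_type` and the dictionary layout are assumptions made for illustration; the application does not prescribe any data structure. The confidences are the ones from the garbage detection example.

```python
def initial_type(confidences):
    """Return (type, confidence) for the type with the highest confidence.

    `confidences` maps each type name to the confidence that the target
    object belongs to that type (hypothetical layout, for illustration).
    """
    best = max(confidences, key=confidences.get)
    return best, confidences[best]


# Confidences from the garbage detection example in the text; they sum to 1.
scores = {"bottle": 0.30, "paper": 0.05, "plastic board": 0.20,
          "box": 0.41, "stick": 0.04}
label, confidence = initial_type(scores)
```

Here `label` is `"box"`, the type with the highest confidence. Whether that initial type is kept is decided separately by comparing `confidence` with the preset threshold.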
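The initial types are the categories into which target objects are divided; they depend on the classification standard of the actual application and are not limited here. In one application scenario, when the at least one neural network is a river floating object detection model, the target objects in the image to be processed are river floating objects. When the classification standard for river floating objects is solid versus liquid, the initial type of each target object may be liquid or solid.
The initial type of a target object is not necessarily the type finally determined for it. In one application scenario, the at least one neural network may produce the confidence of the target object for each type, with the confidences over all types summing to 1, and the type with the highest confidence is taken as the initial type of the target object. In another application scenario, the features of the target object may instead be compared with the features of each type, and the type whose features are most similar to those of the target object is taken as its initial type. This is not limited here.
In this embodiment, determining the type with the highest confidence as the initial type of each target object improves the reliability of the initial types.
In some optional embodiments, feature extraction may be performed on each target object in the image to be processed through the at least one neural network, and the extracted features may be used to determine the confidence of the target object for each type. In one application scenario, when the initial type of a river floating object is being judged, the type confidences obtained first may be liquid 30% and solid 70%.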
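Step S23: in response to the existence of a target object whose initial-type confidence is less than the preset threshold, sort the confidences of the target objects of each type in descending order to obtain a sorted sequence of target objects for each type.
In some optional embodiments, the method further includes: obtaining a required ratio of the target objects of each type, and determining, based on the required ratio and the number of target objects of each type whose confidence is not less than the preset threshold, the set number of target objects of each type to be mined from the target objects whose confidence is less than the preset threshold.
Whether the confidence corresponding to the initial type of each target object is less than the confidence threshold is judged. If there are target objects whose initial-type confidence is less than the preset threshold, the required ratio of the target objects of each type is obtained first, and the number of target objects of each type to be mined from the target objects below the confidence threshold is determined based on the required ratio and the number of target objects of each type whose confidence is not less than the confidence threshold.
The required ratio may be set according to the needs of the actual application. For example, when there are three types of target objects, the required ratio of the types may be 1:1:1. The required ratio in this embodiment is not limited here.
Because the number of target objects of each type whose confidence is not less than the confidence threshold varies from one application to the next, the number of target objects of each type to be mined from among those below the confidence threshold is determined jointly by the required ratio and those numbers. In one application scenario, when there are two types, the required ratio is 1:1, and the numbers of the two types among the target objects at or above the confidence threshold are 200 and 100, then 100 and 200 target objects of the two types, respectively, may be mined from the target objects below the confidence threshold, so that the final quantity ratio between the two types equals the required ratio.
After the number of target objects of each type to be mined from the target objects below the confidence threshold is determined, that number is taken as the set number for each type.
In this embodiment, the number of target objects of each type to be mined from the target objects below the confidence threshold is determined based on the required ratio and the number of target objects of each type whose confidence is not less than the confidence threshold, and the target objects are mined with the standard feature vector of each type according to those numbers. This ensures the richness of the sample data to a certain extent and improves the efficiency of this round of model retraining.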
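One way to turn the required ratio into per-type set numbers is sketched below in Python. The application gives only the worked example (ratio 1:1, confident counts 200 and 100, mined counts 100 and 200) and does not state a formula, so the `mining_budget` parameter (the total number of low-confidence objects to mine) and the function name are assumptions introduced for this illustration.

```python
def mining_quotas(required_ratio, confident_counts, mining_budget):
    """Compute how many low-confidence objects of each type to mine.

    required_ratio   -- desired final ratio between types, e.g. [1, 1]
    confident_counts -- objects per type with confidence >= threshold
    mining_budget    -- total objects to mine (hypothetical parameter)
    """
    final_total = sum(confident_counts) + mining_budget
    ratio_sum = sum(required_ratio)
    quotas = []
    for share, already_have in zip(required_ratio, confident_counts):
        target = final_total * share / ratio_sum
        # A type that already exceeds its target needs no mining.
        quotas.append(max(0, round(target - already_have)))
    return quotas


# Worked example from the text: ratio 1:1, confident counts 200 and 100.
quotas = mining_quotas([1, 1], [200, 100], mining_budget=300)
```

With a budget of 300 the result is `[100, 200]`, so the final counts are 300 and 300, which matches the 1:1 required ratio in the example.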
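Step S24: take the first set number of target objects from each sorted sequence in turn, and determine them as the target objects to be mined.
After the set number for each type is obtained, the confidences and/or information entropies of the target objects of each type are sorted in descending order to obtain a sorted sequence of target objects for each type; the first set number of target objects are then taken from each sorted sequence in turn and determined as the target objects to be mined.
In this embodiment, sorting by confidence and taking the first set number of target objects in turn ensures a certain mining quantity while improving the reliability of the types of the target objects, speeding up the training of the at least one neural network, and improving the reliability of the sample data.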
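Steps S23 and S24 can be sketched together in Python as follows. The tuple layout `(object_id, initial_type, confidence)` and the names `select_for_mining` and `quotas` are assumptions for illustration only; sorting here uses the confidence alone, not the information entropy that the text mentions as an alternative.

```python
from collections import defaultdict


def select_for_mining(objects, threshold, quotas):
    """Pick, per type, the top low-confidence objects to be mined.

    objects -- list of (object_id, initial_type, confidence) tuples
    quotas  -- set number of objects to mine for each type
    """
    by_type = defaultdict(list)
    for object_id, initial_type, confidence in objects:
        if confidence < threshold:  # only low-confidence objects are mined
            by_type[initial_type].append((object_id, confidence))

    selected = {}
    for initial_type, items in by_type.items():
        # Descending order of confidence, then keep the first set number.
        items.sort(key=lambda item: item[1], reverse=True)
        limit = quotas.get(initial_type, 0)
        selected[initial_type] = [object_id for object_id, _ in items[:limit]]
    return selected


detections = [("a", "box", 0.6), ("b", "box", 0.7), ("c", "box", 0.9),
              ("d", "bottle", 0.5), ("e", "box", 0.3)]
to_mine = select_for_mining(detections, threshold=0.8,
                            quotas={"box": 2, "bottle": 1})
```

Object `c` is left out because its confidence is not below the threshold, and object `e` falls outside the first two entries of the `box` sequence, so in the method above it would go to manual annotation.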
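Step S25: mine the target objects to be mined using the standard feature vector of each type, determine the types of the target objects to be mined, and obtain the mining result.
Once the target objects requiring feature mining are determined, they may be mined by a clustering method using the standard feature vector of each type. The clustering method may be, for example, the k-means algorithm, the k-medoids algorithm, the k-median algorithm, the k-center algorithm, or another clustering algorithm, and is not limited here.
In this embodiment, mining the target objects by a clustering method using the standard feature vector of each type improves mining efficiency and reliability.
In some optional embodiments, the type of each target object may be decided by comparison against the standard feature vectors: the type is determined from the similarity between the feature vector of the target object and the standard feature vector of each type, and the type whose standard feature vector is most similar to the feature vector of the target object is taken as the type of that target object, which completes the mining of the target object. This process is repeated until the set number of target objects of each type has been mined or all target objects are exhausted, yielding the mining result.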
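The similarity comparison described above can be sketched in Python as a nearest-standard-vector assignment. The application does not say which similarity measure is used; cosine similarity is an assumption chosen for this illustration, and the names `cosine_similarity` and `mine_type` are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mine_type(feature_vector, standard_vectors):
    """Return the type whose standard feature vector is most similar."""
    return max(standard_vectors,
               key=lambda t: cosine_similarity(feature_vector,
                                               standard_vectors[t]))


# Toy two-dimensional standard vectors for the solid/liquid example.
standards = {"liquid": [1.0, 0.0], "solid": [0.0, 1.0]}
mined = mine_type([0.2, 0.9], standards)
```

Assigning each object to its most similar fixed reference vector resembles the assignment step of k-means with fixed centers, which is one plausible reading of the clustering-based mining mentioned in the text.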
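In some optional embodiments, after the at least one neural network has determined the types of the target objects in the image to be processed in the above manner, it outputs those types, which completes the practical application task of the at least one neural network.
Step S26: judge whether each target object has been mined.
Whether each target object whose confidence is less than the preset threshold has been mined is judged, that is, whether it was mined in step S25. If a target object has not been mined, step S27 is performed; if it has been mined, step S28 is performed.
Step S27: receive a manual annotation of the type of the target object whose type has not been determined, so that the mining result of the target object is determined manually, and construct a second part of the training data set based on the target objects whose mining results are determined manually.
If, among the target objects whose confidence is less than the preset threshold, there are target objects that have not been mined, that is, target objects of each type that fall outside the first set number of the sorted sequence, manual annotations of the types of these undetermined target objects are received, their mining results are determined manually, and the second part of the training data set is constructed based on the target objects whose mining results are determined manually.
In this embodiment, manually annotating the types of the target objects whose types have not been determined, so that their mining results are determined manually, further enriches the training data set and improves the training effect of the at least one neural network.
Step S28: construct the first part of the training data set based on the mining result.
If, among the target objects whose confidence is less than the preset threshold, there are target objects that have been mined and whose mining results have been determined, the first part of the training data set is constructed based on the mining results of those target objects.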
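In some optional embodiments, the above method may further include step S29: in response to the existence of a target object whose initial-type confidence is not less than the preset threshold, determine the initial type as the type of the target object, and construct a third part of the training data set based on the target objects whose types are determined from their initial types.
After feature extraction is performed on each target object in the image to be processed through the at least one neural network, in response to the existence of a target object whose initial-type confidence is not less than the preset threshold, the initial type is determined as the type of the target object, and the third part of the training data set is constructed based on the target objects whose types are determined from their initial types. This step may be performed at any point between step S22 and step S30; this embodiment does not limit its order.
The type with the highest confidence for a target object is determined as its initial type, and whether the confidence of the initial type is less than the confidence threshold is judged. In a specific application scenario, when the highest confidence of a target object is 95% for some type, whether 95% is less than the confidence threshold, for example 80%, is judged; since it is not, the initial type is determined as the type of the target object. The confidence threshold may be set according to actual conditions, for example 70%, 80%, or 65%, and is not limited here.
Step S30: retrain the at least one neural network using the training data set.
The at least one neural network is retrained using the training data set. The training data set of this embodiment includes at least the first part constructed from the mining results, the second part constructed from the target objects whose mining results are determined manually, and the third part constructed from the target objects whose types are determined from their initial types. In other embodiments, the training data set may further include the original data of the at least one neural network in addition to these three parts.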
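How the three parts of the training data set relate to steps S26 to S29 can be shown with a small Python sketch. The function name, the tuple layout, and the two lookup dictionaries (`mined_types` for results of step S25 and `manual_labels` for annotations received in step S27) are assumptions for illustration.

```python
def build_training_parts(objects, threshold, mined_types, manual_labels):
    """Split labelled objects into the three parts of the training data set.

    objects       -- list of (object_id, initial_type, confidence) tuples
    mined_types   -- {object_id: type} decided by standard feature vectors
    manual_labels -- {object_id: type} supplied by human annotators
    """
    first_part, second_part, third_part = [], [], []
    for object_id, initial_type, confidence in objects:
        if confidence >= threshold:
            # Confident prediction: the initial type is kept as the type.
            third_part.append((object_id, initial_type))
        elif object_id in mined_types:
            # Low confidence, mined automatically.
            first_part.append((object_id, mined_types[object_id]))
        elif object_id in manual_labels:
            # Low confidence, not mined: labelled by a person.
            second_part.append((object_id, manual_labels[object_id]))
    return first_part, second_part, third_part


parts = build_training_parts(
    [("a", "box", 0.95), ("b", "box", 0.60), ("c", "bottle", 0.40)],
    threshold=0.80,
    mined_types={"b": "bottle"},
    manual_labels={"c": "stick"},
)
```

A low-confidence object that has neither a mined type nor a manual label yet is simply left out of all three parts in this sketch.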
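In some optional embodiments, step S30 may include: performing target object detection on the training data set, and dividing the training data set, based on the detection result, into positive sample pictures in which a target object exists and negative sample pictures in which no target object exists; judging whether a first quantity ratio between the positive sample pictures and the negative sample pictures equals a first set ratio; if the first quantity ratio does not equal the first set ratio, adjusting the number of positive sample pictures or negative sample pictures by resampling and/or partial random sampling, so that the first quantity ratio equals the first set ratio; and retraining the at least one neural network based on the positive sample pictures and negative sample pictures at the first set ratio.
In one application scenario, when the first set ratio between positive and negative sample pictures is 1:1, the number of positive sample pictures is 10, and the number of negative sample pictures is 100, the positive sample pictures may be resampled, for example each sampled 10 times, so that the quantity ratio between positive and negative sample pictures is 1:1. Alternatively, the negative sample pictures may be randomly sampled, for example at a 10-to-1 random sampling ratio so that only part of the negative sample pictures are used, so that the quantity ratio between positive and negative sample pictures is 1:1.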
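The two balancing options in this example can be sketched in Python with the standard `random` module. The function names `oversample` and `subsample` are assumptions for illustration, and the picture lists are placeholder strings; the counts (10 positives, 100 negatives, target ratio 1:1) come from the example above.

```python
import random


def oversample(samples, target_size, rng):
    """Resampling: repeat the samples until target_size items are reached."""
    repeats, remainder = divmod(target_size, len(samples))
    return samples * repeats + rng.sample(samples, remainder)


def subsample(samples, target_size, rng):
    """Partial random sampling: keep a random subset of target_size items."""
    return rng.sample(samples, target_size)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
positives = [f"pos_{i}" for i in range(10)]
negatives = [f"neg_{i}" for i in range(100)]

# Option 1: use every positive picture 10 times to match 100 negatives.
positives_up = oversample(positives, len(negatives), rng)
# Option 2: keep 1 in 10 negative pictures to match 10 positives.
negatives_down = subsample(negatives, len(positives), rng)
```

Oversampling keeps every negative picture but repeats positives, while subsampling discards data; which one suits a given task is a design choice that the text leaves open by allowing either or both.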
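By adjusting the first quantity ratio between the positive sample pictures and the negative sample pictures to the first set ratio, this embodiment reduces the imbalance in the training data.
In some optional embodiments, after the first quantity ratio between the positive and negative sample pictures equals the first set ratio, the pictures of each type among the positive sample pictures may further be resampled or partially randomly sampled so that the positive sample pictures are balanced internally. Optionally, whether a second quantity ratio among the types of the target objects whose types have been determined equals a second set ratio may be judged; if the second quantity ratio does not equal the second set ratio, the number of target objects of different types is adjusted by resampling and/or partial random sampling so that the second quantity ratio equals the second set ratio; finally, the at least one neural network is retrained with the target objects of different types at the second set ratio. The resampling and partial random sampling are performed in the same way as the sampling between positive and negative sample pictures in the application scenario above; please refer to the description above, which is not repeated here.
By adjusting the second quantity ratio among the types of the target objects whose types have been determined to the second set ratio, this embodiment reduces the imbalance among target objects of different types.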
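Please refer to FIG. 3, which is a schematic flowchart of an embodiment of the way the at least one neural network in the embodiment of FIG. 2 is obtained.
Step S31: acquire original training samples, where the original training samples are samples in which the types of the target objects have been annotated.
Original training samples are acquired first, where the original training samples are samples in which the types of the target objects have already been annotated. In one application scenario, the samples may be annotated manually, yielding original training samples in which the target objects and the type of each target object are annotated. In another application scenario, the samples may be annotated by another classification model, likewise yielding original training samples in which the target objects and the type of each target object are annotated. The way the original training samples are annotated is not limited in this embodiment.
Step S32: judge whether a third quantity ratio of the target objects of different types equals a third set ratio.
Whether the third quantity ratio among the target objects of different types in the original training samples equals the third set ratio is judged. The third set ratio may be set according to actual training needs, for example equal proportions for all types, or increased proportions for certain types according to how difficult the different types are to train; this is not limited here.
Step S33: if the third quantity ratio of the target objects of different types does not equal the third set ratio, adjust the number of target objects of different types by resampling and/or partial random sampling, so that the third quantity ratio equals the third set ratio.
When it is judged that the third quantity ratio of the target objects of different types does not equal the third set ratio, the number of target objects of different types is adjusted by resampling and/or partial random sampling, so that the third quantity ratio equals the third set ratio.
The ratio between the positive and negative sample pictures in the original training samples may first be adjusted by resampling and/or partial random sampling, and the third quantity ratio among the types within the positive samples may then be adjusted by resampling and/or partial random sampling so that it equals the third set ratio, thereby meeting the training requirements. Data sampled in these two steps alleviates the imbalance between positive and negative sample pictures and among the types. After the training data is obtained by sampling, deep learning network training can be performed to obtain the at least one neural network in FIG. 2.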
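Adjusting several types to a set ratio can be sketched in Python by combining resampling and partial random sampling per type. The application does not specify how the absolute sample counts are chosen, so the `unit` parameter (number of samples per ratio unit) and the function names are assumptions for illustration.

```python
import random


def resample_to(samples, target_size, rng):
    """Return target_size items from a non-empty list of samples.

    Too many samples: keep a random subset (partial random sampling).
    Too few samples: repeat them and add a random remainder (resampling).
    """
    if target_size <= len(samples):
        return rng.sample(samples, target_size)
    repeats, remainder = divmod(target_size, len(samples))
    return samples * repeats + rng.sample(samples, remainder)


def balance_types(samples_by_type, set_ratio, unit, rng):
    """Adjust every type to unit * set_ratio[type] samples.

    `unit` is a hypothetical parameter: the sample count per ratio unit.
    """
    return {
        type_name: resample_to(samples, unit * set_ratio[type_name], rng)
        for type_name, samples in samples_by_type.items()
    }


rng = random.Random(0)
raw = {"bottle": list(range(5)), "box": list(range(50))}
balanced = balance_types(raw, set_ratio={"bottle": 1, "box": 2},
                         unit=10, rng=rng)
```

In this toy run the five `bottle` samples are repeated to reach 10 and the fifty `box` samples are cut down to 20, giving the 1:2 ratio requested by `set_ratio`.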
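When the quantity ratio of the target objects of different types already equals the third set ratio, step S34 is performed directly.
Step S34: train the initial network with the target objects of different types at the third set ratio to obtain the at least one neural network.
The initial network is trained with the target objects of different types that satisfy the third set ratio, to obtain the at least one neural network. The initial network may be a deep learning network model, for example a composite network framework built from network structures such as convolutional neural networks and recurrent neural networks, or a network model built with a single deep learning network as a template; this is not limited here.
By adjusting the third quantity ratio of the target objects of different types in the original training samples to the third set ratio, this embodiment balances the training samples of each type seen by the at least one neural network, making its training more comprehensive.
Through the above method, the human-in-the-loop method of this embodiment first trains the initial network with the original training samples to obtain the at least one neural network and deploys it, so that a set number of images to be processed are acquired during normal use of the network. The network detects target objects in these images and determines their types; active learning is thus used to carry out data mining, which screens data effectively, reduces annotation cost, and improves data quality. Target objects whose types have been determined are then reviewed manually, which further improves the accuracy of the target object categories. Finally, the target objects are resampled or partially randomly sampled; this sampling-based training technique alleviates the imbalance between positive and negative samples and among the types of target objects. The sampled target objects are used to train the at least one neural network, completing its iterative upgrade. Through human-in-the-loop annotation, this embodiment connects the inference process and the training process of the at least one neural network, making the whole flow of model production and iteration more efficient and in turn improving the performance of the at least one neural network. Whenever a set number of images to be processed is obtained during the use of the at least one neural network, the network can continue to be iteratively updated in a cycle, so that its performance and detection accuracy improve cyclically to a certain extent.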
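FIG. 4 is a schematic flowchart of an implementation of the human-in-the-loop method of the embodiment of FIG. 2. This implementation is described taking the case where the at least one neural network is a river floating object detection model as an example. Referring to FIG. 4, it may include the following steps:
Step S41: train the initial network with original training samples to obtain the river floating object detection model.
The initial network is trained with original training samples. In some embodiments, the original training samples may be annotated manually to obtain original training samples with accurately annotated categories, and the initial network is trained with these samples to obtain the river floating object detection model.
Step S42: acquire images to be processed.
The river floating object detection model obtained in step S41 is applied to a river floating object detection scenario, and the acquired images to be processed are detected with the model to determine the categories of the river floating objects in them.
Step S43: perform feature extraction on the images to be processed through the river floating object detection model, and determine the type of each target object.
In some embodiments, detection inference for target objects is performed on the images to be processed through the river floating object detection model, to judge whether target objects, that is, river floating objects, exist in them. If no target object exists in an image to be processed, no subsequent detection is performed on that image.
Feature extraction is performed on each target object in the images to be processed through the river floating object detection model to obtain the feature vector of each target object, and the probability that each target object belongs to each type is determined from the feature vector; after these probabilities are obtained, the type with the highest probability is determined as the initial type of each target object. If the confidence of the initial type of a target object is not less than the confidence threshold, the initial type is determined as the final type of that target object.
Step S44: output the type of each target object.
The final type of each target object is output, which completes the detection of river floating objects by the river floating object detection model.
Step S45: judge whether the number of images to be processed meets the set value.
Whether the number of images to be processed newly acquired after the river floating object detection model was put into use meets the set value is judged. The set value may be determined by the actual application, for example 1000 or 2000 images, and is not limited here.
If the set value is not met, step S42 is performed again to continue acquiring images to be processed; if the set value is met, step S46 is performed.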
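The loop between steps S42 and S45 amounts to buffering newly acquired images until the set value is reached. A minimal Python sketch follows; the class name `PendingImages` and its methods are assumptions made for illustration and are not part of the application.

```python
class PendingImages:
    """Collect images acquired during inference until a set value is met."""

    def __init__(self, set_value):
        self.set_value = set_value
        self.images = []

    def add(self, image):
        """Store one image; return True once the set value is reached."""
        self.images.append(image)
        return len(self.images) >= self.set_value

    def take_all(self):
        """Hand over the buffered images for mining and reset the buffer."""
        batch, self.images = self.images, []
        return batch


buffer = PendingImages(set_value=3)
flags = [buffer.add(name) for name in ("img_1", "img_2", "img_3")]
batch = buffer.take_all() if flags[-1] else []
```

Until `add` returns `True`, the model only outputs detection results (steps S42 to S44); once it does, the buffered batch is passed on to mining in step S46.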
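Step S46: mine the images to be processed, and determine the types of the target objects.
Whether the confidence corresponding to the initial type of each target object is less than the confidence threshold is judged. If the confidence of the initial type of a target object is less than the confidence threshold, the required ratio of the target objects of each type is obtained, and the number of target objects of each type to be mined from the target objects below the confidence threshold is determined based on the required ratio and the number of target objects of each type whose confidence is not less than the confidence threshold.
After the types of the set number of target objects are determined, these target objects are merged with the target objects whose confidence is not less than the confidence threshold to obtain training data.
The types of the unmined target objects among those below the confidence threshold are determined by manual annotation, which ensures the accuracy of the types of the target objects to a certain extent. At the same time, determining the types of target objects through the river floating object detection model improves the comprehensiveness of target object detection and gives feedback on the detection accuracy of the model, thereby realizing a human-in-the-loop cycle for training and upgrading the river floating object detection model.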
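Step S47: sample the target objects whose types have been determined, to obtain training data.
Target object detection is performed on the images to be processed, so as to divide them into positive sample pictures in which a target object exists and negative sample pictures in which no target object exists; whether a first quantity ratio between the positive sample pictures and the negative sample pictures equals a first set ratio is judged; if the first quantity ratio does not equal the first set ratio, the number of positive sample pictures or negative sample pictures is adjusted by resampling and/or partial random sampling, so that the first quantity ratio equals the first set ratio; and the river floating object detection model is retrained based on the positive sample pictures and negative sample pictures at the first set ratio.
In one application scenario, after the first quantity ratio between the positive and negative sample pictures equals the first set ratio, the pictures of each type among the positive sample pictures may further be resampled or randomly sampled so that the positive sample pictures are balanced internally. For example, whether a second quantity ratio among the types of the target objects whose types have been determined equals a second set ratio may be judged; if the second quantity ratio does not equal the second set ratio, the number of target objects of different types is adjusted by resampling and/or partial random sampling so that the second quantity ratio equals the second set ratio; finally, the river floating object detection model is retrained with the target objects of different types at the second set ratio. The resampling and partial random sampling of the pictures of each type among the positive sample pictures are performed in the same way as the sampling between positive and negative sample pictures in the application scenario above; please refer to the description above, which is not repeated here.
Step S48: train the river floating object detection model with the sampled training data.
The river floating object detection model is trained with the target objects of different types that satisfy the set ratios, so as to upgrade and iterate the river floating object detection model.
In the above manner, this embodiment first trains the initial network with original training samples to obtain the river floating object detection model and deploys it, so that a set number of images to be processed are acquired during normal use of the model. The model detects target objects in these images and determines their types; active learning is thus used to carry out data mining, which screens data effectively, reduces annotation cost, and improves data quality. The types of the unmined target objects among those below the confidence threshold are then determined by manual annotation, which further improves the accuracy of the target object categories. Finally, the target objects are resampled or partially randomly sampled; this sampling-based training technique alleviates the imbalance between positive and negative samples and among the types of target objects. The sampled target objects are used to train the river floating object detection model, completing its iterative upgrade. Through human-in-the-loop annotation, this embodiment connects the inference process and the training process of the river floating object detection model, making the whole flow of model production and iteration more efficient and in turn improving the performance of the model. Whenever a set number of images to be processed is obtained during the use of the river floating object detection model, the model can continue to be iteratively updated in a cycle, so that its performance and detection accuracy improve cyclically to a certain extent.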
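Please refer to FIG. 5, which is a schematic framework diagram of an embodiment of the human-in-the-loop device of the present application. The human-in-the-loop device 50 includes an acquisition module 51, an inference module 52, a mining module 53, and a training module 54. The acquisition module 51 is configured to acquire an image to be processed; the inference module 52 is configured to perform inference on each target object in the image to be processed through at least one neural network corresponding to a target task, to obtain a prediction result for each target object; the mining module 53 is configured to, in response to the existence of a target object whose prediction-result confidence is less than a preset threshold, mine the target object using standard feature vectors, determine a mining result of the target object, and construct a first part of a training data set based on the mining result; and the training module 54 is configured to retrain the at least one neural network using the training data set.
This embodiment first performs inference on each target object in the image to be processed through the at least one neural network corresponding to the target task, to obtain a prediction result for each target object; then, in response to the existence of a target object whose prediction-result confidence is less than the preset threshold, mines the target object using standard feature vectors and determines its mining result, so as to construct the first part of the training data set based on the mining result; and finally retrains the at least one neural network using the training data set, so that the mined target objects improve the detection accuracy and the reliability of the at least one neural network. Because the retraining uses target objects obtained during the network's own inference, the at least one neural network is iteratively upgraded, which further improves its performance and detection efficiency.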
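In some embodiments, the inference module 52 is configured to perform feature extraction on each target object in the image to be processed through the at least one neural network, determine an initial type of each target object, and obtain the prediction result; the mining module 53 is configured to, in response to the existence of a target object whose initial-type confidence is less than the preset threshold, sort the confidences of the target objects of each type in descending order to obtain a sorted sequence of target objects for each type; take the first set number of target objects from each sorted sequence in turn and determine them as the target objects to be mined; and mine the target objects to be mined using the standard feature vector of each type, determine the types of the target objects to be mined, and obtain the mining result.
Unlike the foregoing embodiments, sorting by confidence and taking the first set number of target objects in turn ensures a certain mining quantity while improving the reliability of the types of the target objects, speeding up the training of the at least one neural network, and improving the reliability of the sample data.
In some embodiments, the mining module 53 is configured to obtain a required ratio of the target objects of each type, and to determine, based on the required ratio and the number of target objects of each type whose confidence is not less than the preset threshold, the set number of target objects of each type to be mined from the target objects whose confidence is less than the preset threshold.
Unlike the foregoing embodiments, the number of target objects of each type to be mined from the target objects below the confidence threshold is determined based on the required ratio and the number of target objects of each type whose confidence is not less than the confidence threshold, and the target objects are mined with the standard feature vector of each type according to those numbers, which ensures the richness of the sample data to a certain extent and improves the efficiency of this round of model retraining.
In some embodiments, the mining module 53 is configured to, after mining the target object using standard feature vectors and determining the mining result of the target object, judge whether each target object has been mined; if a target object has not been mined, receive a manual annotation of the type of the target object whose type has not been determined, so that the mining result of the target object is determined manually; and construct a second part of the training data set based on the target objects whose mining results are determined manually.
Unlike the foregoing embodiments, manually annotating the types of the target objects whose types have not been determined, so that their mining results are determined manually, further enriches the training data set and improves the training effect of the at least one neural network.
In some embodiments, the mining module 53 is configured to mine each target object by a clustering method using the standard feature vector of each type.
Unlike the foregoing embodiments, mining the target objects by a clustering method using the standard feature vector of each type improves mining efficiency and reliability.
In some embodiments, the inference module 52 is configured to perform feature extraction on each target object in the image to be processed through the at least one neural network, and determine the confidence that each target object belongs to each type; and to determine, for each target object, the type with the highest confidence as its initial type, to obtain the prediction result.
In some embodiments, the mining module 53 is further configured to, in response to the existence of a target object whose initial-type confidence is not less than the preset threshold, determine the initial type as the type of the target object, and construct a third part of the training data set based on the target objects whose types are determined from their initial types.
Unlike the foregoing embodiments, determining the type with the highest probability as the initial type of each target object improves the reliability of the initial types.
In some embodiments, the training module 54 is configured to perform target object detection on the training data set, and divide the training data set, based on the detection result, into positive sample pictures in which a target object exists and negative sample pictures in which no target object exists; judge whether a first quantity ratio between the positive sample pictures and the negative sample pictures equals a first set ratio; if the first quantity ratio does not equal the first set ratio, adjust the number of positive sample pictures or negative sample pictures by resampling and/or partial random sampling, so that the first quantity ratio equals the first set ratio; and retrain the at least one neural network based on the positive sample pictures and negative sample pictures at the first set ratio.
Unlike the foregoing embodiments, adjusting the first quantity ratio between the positive sample pictures and the negative sample pictures to the first set ratio reduces the imbalance between positive and negative sample pictures.
In some embodiments, the training module 54 is further configured to judge whether a second quantity ratio among the types of the target objects whose types have been determined equals a second set ratio; if the second quantity ratio does not equal the second set ratio, adjust the number of target objects of different types by resampling and/or partial random sampling, so that the second quantity ratio equals the second set ratio; and retrain the at least one neural network with the target objects of different types at the second set ratio.
Unlike the foregoing embodiments, adjusting the second quantity ratio among the types of the target objects whose types have been determined to the second set ratio reduces the imbalance among the types.
In some embodiments, the acquisition module 51 is further configured to, before acquiring the image to be processed, acquire original training samples, where the original training samples are samples in which the types of the target objects have been annotated; judge whether a third quantity ratio of the target objects of different types equals a third set ratio; if the third quantity ratio does not equal the third set ratio, adjust the number of target objects of different types by resampling and/or partial random sampling, so that the third quantity ratio equals the third set ratio; and train an initial network with the target objects of different types at the third set ratio to obtain the at least one neural network.
Unlike the foregoing embodiments, adjusting the third quantity ratio of the target objects of different types in the original training samples to the third set ratio balances the training samples of each type seen by the at least one neural network, making its training more comprehensive.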
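Please refer to FIG. 6, which is a schematic framework diagram of an embodiment of the human-in-the-loop system of the present application.
The human-in-the-loop system 60 includes an inference platform 63, an annotation platform 61, and a training platform 62 that are communicatively connected to one another. The inference platform 63 is configured to acquire an image to be processed and to perform inference on each target object in the image to be processed through at least one neural network corresponding to a target task, to obtain a prediction result for each target object; the annotation platform 61 is configured to, in response to the existence of a target object whose prediction-result confidence is less than a preset threshold, mine the target object using standard feature vectors, determine a mining result of the target object, and construct a first part of a training data set based on the mining result; and the training platform 62 is configured to retrain the at least one neural network using the training data set.
With the above solution, while the at least one neural network is applied to images to be processed, those images can be used as training data to train the at least one neural network again, thereby improving its effectiveness in use.
Please refer to FIG. 7, which is a schematic framework diagram of another embodiment of the human-in-the-loop system of the present application.
The human-in-the-loop system 70 includes a graphical user interface (GUI) 71, a business layer 72, a platform layer 73, a scheduling layer 74, and a hardware layer 75. The GUI 71 is a graphically displayed computer user interface for receiving user operations. The business layer 72 includes a resource center, a user center, and a permission center; the resource center manages system resources, the user center manages user information, and the permission center manages permissions.
The platform layer 73 includes an annotation platform, a training platform, and an inference platform. The platform layer 73 is used to implement the human-in-the-loop method of any of the above embodiments. In one application scenario, the inference platform may be used to acquire an image to be processed and to perform inference on each target object in the image to be processed through at least one neural network corresponding to a target task, to obtain a prediction result for each target object; the annotation platform may be used to, in response to the existence of a target object whose prediction-result confidence is less than a preset threshold, mine the target object using standard feature vectors and determine a mining result of the target object, so as to construct a first part of a training data set based on the mining result; and the training platform may retrain the at least one neural network using the training data set.
The scheduling layer 74 is used to schedule the human-in-the-loop system 70. In some embodiments, the Kubernetes scheduling mechanism may be used; Kubernetes, abbreviated K8s, is an open-source container orchestration engine. The hardware layer 75 may include a central processing unit (CPU), a graphics processing unit (GPU), and network attached storage (NAS). The human-in-the-loop system 70 runs on this hardware.
With the above solution, while the at least one neural network is applied to images to be processed, those images can be used as training data to train the at least one neural network again, thereby improving its effectiveness in use.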
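Please refer to FIG. 8, which is a schematic framework diagram of an embodiment of the electronic device of the present application. The electronic device 80 includes a memory 81 and a processor 82 coupled to each other, and the processor 82 is configured to execute program instructions stored in the memory 81 to implement the steps of any of the above human-in-the-loop method embodiments. In one implementation scenario, the electronic device 80 may include, but is not limited to, a microcomputer or a server; in addition, the electronic device 80 may also include mobile devices such as notebook computers and tablet computers, which is not limited here.
The processor 82 is configured to control itself and the memory 81 to implement the steps of any of the above human-in-the-loop method embodiments. The processor 82 may also be referred to as a CPU. The processor 82 may be an integrated circuit chip with signal processing capability. The processor 82 may also be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The general-purpose processor may be a microprocessor or any conventional processor. In addition, the processor 82 may be implemented jointly by multiple integrated circuit chips.
The above solution can improve the performance and accuracy of the at least one neural network.
Please refer to FIG. 9, which is a schematic framework diagram of an embodiment of the computer-readable storage medium of the present application. The computer-readable storage medium 90 stores program instructions 901 that can be run by a processor, and the program instructions 901 are used to implement the steps of any of the above human-in-the-loop method embodiments.
The above solution can improve the performance and accuracy of the at least one neural network.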
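In the several embodiments provided in the present application, it should be understood that the disclosed method and device may be implemented in other ways. For example, the device implementations described above are merely illustrative; the division into modules or units is only a division by logical function, and other divisions are possible in actual implementation; for example, units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
Units described as separate components may or may not be physically separate, and components displayed as units may or may not be physical units; that is, they may be located in one place or distributed over network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this implementation.
In addition, the functional units in the embodiments of the present application may be integrated in one processing unit, or each unit may exist physically on its own, or two or more units may be integrated in one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to perform all or part of the steps of the methods of the implementations of the present application. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.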

Claims (14)

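  1. A human-in-the-loop method, comprising:
    acquiring an image to be processed;
    performing inference on each target object in the image to be processed through at least one neural network corresponding to a target task, to obtain a prediction result for each target object;
    in response to the existence of a target object whose prediction-result confidence is less than a preset threshold, mining the target object using standard feature vectors, determining a mining result of the target object, and constructing a first part of a training data set based on the mining result; and
    retraining the at least one neural network using the training data set.
  2. The human-in-the-loop method according to claim 1, wherein the step of performing inference on each target object in the image to be processed through at least one neural network corresponding to a target task, to obtain a prediction result for each target object, comprises:
    performing feature extraction on each target object in the image to be processed through the at least one neural network, determining an initial type of each target object, and obtaining the prediction result;
    and wherein the step of, in response to the existence of a target object whose prediction-result confidence is less than a preset threshold, mining the target object using standard feature vectors and determining a mining result of the target object comprises:
    in response to the existence of a target object whose initial-type confidence is less than the preset threshold, sorting the confidences of the target objects of each type in descending order to obtain a sorted sequence of target objects for each type;
    taking the first set number of target objects from each sorted sequence in turn, and determining them as the target objects to be mined; and
    mining the target objects to be mined using the standard feature vector of each type, determining the types of the target objects to be mined, and obtaining the mining result.
  3. The human-in-the-loop method according to claim 2, wherein, before taking the first set number of target objects from each sorted sequence in turn and determining them as the target objects to be mined, the method further comprises:
    obtaining a required ratio of the target objects of each type; and
    determining, based on the required ratio and the number of target objects of each type whose confidence is not less than the preset threshold, the set number of target objects of each type to be mined from the target objects whose confidence is less than the preset threshold.
  4. The human-in-the-loop method according to claim 1, wherein, after the step of mining the target object using standard feature vectors and determining the mining result of the target object, the method further comprises:
    judging whether each target object has been mined;
    if the target object has not been mined, receiving a manual annotation of the type of the target object whose type has not been determined, so that the mining result of the target object is determined manually; and
    constructing a second part of the training data set based on the target objects whose mining results are determined manually.
  5. The human-in-the-loop method according to claim 1, wherein the step of mining the target object using standard feature vectors and determining the mining result of the target object comprises:
    mining each target object by a clustering method using the standard feature vector of each type.
  6. The human-in-the-loop method according to claim 1, wherein the step of performing inference on each target object in the image to be processed through at least one neural network corresponding to a target task, to obtain a prediction result for each target object, comprises:
    performing feature extraction on each target object in the image to be processed through the at least one neural network, and determining the confidence that each target object belongs to each type; and determining, for each target object, the type with the highest confidence as the initial type of that target object, to obtain the prediction result.
  7. The human-in-the-loop method according to claim 1, wherein the method further comprises:
    in response to the existence of a target object whose initial-type confidence is not less than the preset threshold, determining the initial type as the type of the target object; and
    constructing a third part of the training data set based on the target objects whose types are determined from their initial types.
  8. The human-in-the-loop method according to any one of claims 1 to 7, wherein the step of retraining the at least one neural network using the training data set comprises:
    performing target object detection on the training data set, and dividing the training data set, based on the detection result, into positive sample pictures in which the target object exists and negative sample pictures in which the target object does not exist;
    judging whether a first quantity ratio between the positive sample pictures and the negative sample pictures equals a first set ratio;
    if the first quantity ratio between the positive sample pictures and the negative sample pictures does not equal the first set ratio, adjusting the number of the positive sample pictures or the negative sample pictures by resampling and/or partial random sampling, so that the first quantity ratio between the positive sample pictures and the negative sample pictures equals the first set ratio; and
    retraining the at least one neural network based on the positive sample pictures and the negative sample pictures at the first set ratio.
  9. The human-in-the-loop method according to claim 8, wherein, after the first quantity ratio between the positive sample pictures and the negative sample pictures equals the first set ratio, the method further comprises:
    judging whether a second quantity ratio among the types of the target objects whose types have been determined equals a second set ratio;
    if the second quantity ratio of the types of the target objects does not equal the second set ratio, adjusting the number of target objects of different types by resampling and/or partial random sampling, so that the second quantity ratio of the target objects of different types equals the second set ratio; and
    retraining the at least one neural network with the target objects of different types at the second set ratio.
  10. The human-in-the-loop method according to any one of claims 1 to 7, wherein, before the step of acquiring the image to be processed, the method comprises:
    acquiring original training samples, wherein the original training samples are samples in which the types of the target objects have been annotated;
    judging whether a third quantity ratio of the target objects of different types equals a third set ratio;
    if the third quantity ratio of the target objects of different types does not equal the third set ratio, adjusting the number of the target objects of different types by resampling and/or partial random sampling, so that the third quantity ratio of the target objects of different types equals the third set ratio; and
    training an initial network with the target objects of different types at the third set ratio to obtain the at least one neural network.
  11. A human-in-the-loop device, comprising:
    an acquisition module configured to acquire an image to be processed;
    an inference module configured to perform inference on each target object in the image to be processed through at least one neural network corresponding to a target task, to obtain a prediction result for each target object;
    a mining module configured to, in response to the existence of a target object whose prediction-result confidence is less than a preset threshold, mine the target object using standard feature vectors, determine a mining result of the target object, and construct a first part of a training data set based on the mining result; and
    a training module configured to retrain the at least one neural network using the training data set.
  12. A human-in-the-loop system, comprising:
    an inference platform configured to acquire an image to be processed and to perform inference on each target object in the image to be processed through at least one neural network corresponding to a target task, to obtain a prediction result for each target object;
    an annotation platform configured to, in response to the existence of a target object whose prediction-result confidence is less than a preset threshold, mine the target object using standard feature vectors, determine a mining result of the target object, and construct a first part of a training data set based on the mining result; and
    a training platform configured to retrain the at least one neural network using the training data set.
  13. An electronic device, comprising a memory and a processor coupled to each other, the processor being configured to execute program instructions stored in the memory to implement the human-in-the-loop method according to any one of claims 1 to 10.
  14. A computer-readable storage medium storing program instructions which, when executed by a processor, implement the human-in-the-loop method according to any one of claims 1 to 10.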
PCT/CN2021/119968 2021-06-16 2021-09-23 人机回圈方法、装置、系统、电子设备和存储介质 WO2022262141A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110667457.8 2021-06-16
CN202110667457.8A CN113344086B (zh) 2021-06-16 2021-06-16 人机回圈方法、装置、系统、电子设备和存储介质

Publications (1)

Publication Number Publication Date
WO2022262141A1 true WO2022262141A1 (zh) 2022-12-22

Family

ID=77476070

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/119968 WO2022262141A1 (zh) 2021-06-16 2021-09-23 人机回圈方法、装置、系统、电子设备和存储介质

Country Status (2)

Country Link
CN (1) CN113344086B (zh)
WO (1) WO2022262141A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113344086B (zh) * 2021-06-16 2022-07-01 深圳市商汤科技有限公司 人机回圈方法、装置、系统、电子设备和存储介质

Also Published As

Publication number Publication date
CN113344086B (zh) 2022-07-01
CN113344086A (zh) 2021-09-03

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21945721

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE