WO2023273570A1 - Target detection model training method and target detection method, and related device


Publication number
WO2023273570A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
target
sample image
target detection
detection model
Prior art date
Application number
PCT/CN2022/089194
Other languages
English (en)
Chinese (zh)
Inventor
江毅
杨朔
孙培泽
袁泽寰
王长虎
Original Assignee
北京有竹居网络技术有限公司
Priority date
Filing date
Publication date
Application filed by 北京有竹居网络技术有限公司
Publication of WO2023273570A1


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/22 Matching criteria, e.g. proximity measures
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T 10/00 Road transport of goods or passengers
    • Y02T 10/10 Internal combustion engine [ICE] based vehicles
    • Y02T 10/40 Engine management systems

Definitions

  • the present application relates to the technical field of image processing, and in particular to a target detection model training method, a target detection method and related equipment.
  • Target detection, also known as target extraction, is an image segmentation technique based on the geometric and statistical features of targets; it has a wide range of applications (for example, target detection can be applied to robotics, automatic driving, and other fields).
  • the present application provides a target detection model training method, a target detection method and related equipment, which can effectively improve the accuracy of target detection.
  • An embodiment of the present application provides a method for training a target detection model, the method comprising:
  • In one implementation, the performing of text feature extraction on the actual target text identifier of the sample image to obtain the target text feature of the sample image includes:
  • the method further includes:
  • After acquiring an added image, the actual target text identifier of the added image, and the actual target position of the added image, performing text feature extraction on the actual target text identifier of the added image to obtain the target text feature of the added image, where the actual target text identifier of the added image is different from the actual target text identifier of the sample image;
  • updating the target detection model according to the predicted target position of the historical sample image, the actual target position of the historical sample image, the similarity between the image feature of the historical sample image and the target text feature of the historical sample image, the predicted target position of the added image, the actual target position of the added image, and the similarity between the image feature of the added image and the target text feature of the added image, and continuing to execute the step of inputting the historical sample image and the added image into the target detection model until a second stop condition is reached.
  • the process of determining the historical sample image includes:
  • determining, according to the actual target text identifiers of the training used images, the training used images belonging to each historical target category from among the training used images corresponding to the target detection model;
  • extracting the historical sample images corresponding to each historical target category from the training used images belonging to that historical target category.
  • In one implementation, the updating of the target detection model according to the predicted target position of the historical sample image, the actual target position of the historical sample image, the similarity between the image feature of the historical sample image and the target text feature of the historical sample image, the predicted target position of the added image, the actual target position of the added image, and the similarity between the image feature of the added image and the target text feature of the added image includes:
  • determining a historical image loss value and an added image loss value, and performing a weighted summation of the two to obtain a detection loss value of the target detection model, where the weighting weight corresponding to the historical image loss value is higher than the weighting weight corresponding to the added image loss value;
  • updating the target detection model according to the detection loss value of the target detection model.
  • In one implementation, the inputting of the sample image into a target detection model further obtains the predicted target text identifier of the sample image output by the target detection model; correspondingly, the target detection model is updated according to the predicted target text identifier of the sample image, the actual target text identifier of the sample image, the predicted target position of the sample image, the actual target position of the sample image, and the similarity between the image features of the sample image and the target text features of the sample image.
  • the embodiment of the present application also provides a target detection method, the method comprising:
  • The target detection model is trained using any implementation of the target detection model training method provided in the embodiments of the present application.
  • the embodiment of the present application also provides a target detection model training device, the device comprising:
  • a first acquiring unit configured to acquire a sample image, an actual target text identifier of the sample image, and an actual target position of the sample image
  • the first extraction unit is used to extract the text features of the actual target text identifier of the sample image to obtain the target text features of the sample image;
  • a first prediction unit configured to input the sample image into a target detection model, and obtain the image features of the sample image output by the target detection model and the predicted target position of the sample image;
  • a first update unit configured to update the target detection model according to the predicted target position of the sample image, the actual target position of the sample image, and the similarity between the image feature of the sample image and the target text feature of the sample image, and to return to the first prediction unit to execute the inputting of the sample image into the target detection model until a first stop condition is reached.
  • the embodiment of the present application also provides a target detection device, the device comprising:
  • a second acquiring unit configured to acquire an image to be detected
  • a target detection unit configured to input the image to be detected into a pre-trained target detection model, and obtain the target detection result of the image to be detected output by the target detection model, where the target detection model is trained using any implementation of the target detection model training method provided in the embodiments of the present application.
  • the embodiment of the present application also provides a device, the device includes a processor and a memory:
  • the memory is used to store computer programs
  • the processor is configured to execute any implementation of the target detection model training method provided in the embodiments of the present application according to the computer program, or execute any implementation of the target detection method provided in the embodiments of the present application.
  • An embodiment of the present application also provides a computer-readable storage medium for storing a computer program, where the computer program is used to execute any implementation of the target detection model training method provided in the embodiments of the present application, or any implementation of the target detection method provided in the embodiments of the present application.
  • An embodiment of the present application also provides a computer program product which, when run on a terminal device, causes the terminal device to execute any implementation of the target detection model training method provided in the embodiments of the present application, or any implementation of the target detection method provided in the embodiments of the present application.
  • the embodiment of the present application has at least the following advantages:
  • In the embodiments of the present application, text feature extraction is first performed on the actual target text identifier of the sample image to obtain the target text feature of the sample image; the sample image, the target text feature of the sample image, and the actual target position of the sample image are then used to train the target detection model, so that the target detection model performs target detection learning under the constraints of the target text features and the actual target positions of the sample images. The trained target detection model therefore has better target detection performance, and can be used to perform more accurate target detection on an image to be detected and to obtain and output the target detection result of that image, so that the target detection result is more accurate, which is conducive to improving the accuracy of target detection.
  • FIG. 1 is a flow chart of a method for training a target detection model provided in an embodiment of the present application
  • FIG. 2 is a schematic structural diagram of a target detection model provided by an embodiment of the present application.
  • FIG. 3 is a flowchart of a target detection method provided in an embodiment of the present application.
  • FIG. 4 is a schematic structural diagram of a target detection model training device provided in an embodiment of the present application.
  • FIG. 5 is a schematic structural diagram of a target detection device provided by an embodiment of the present application.
  • the following first introduces the training process of the target detection model (that is, the target detection model training method), and then introduces the application process of the target detection model (that is, the target detection method).
  • this figure is a flow chart of a method for training a target detection model provided by an embodiment of the present application.
  • the target detection model training method provided in the embodiment of the present application includes S101-S105:
  • S101 Acquire a sample image, an actual target text identifier of the sample image, and an actual target position of the sample image.
  • the sample image refers to the image used for training the target detection model.
  • the embodiment of the present application does not limit the number of sample images, for example, the number of sample images may be N (that is, use N sample images to train the target detection model).
  • the actual target text identifier of the sample image is used to uniquely represent the target object in the sample image.
  • this embodiment of the present application does not limit the actual target text identifier of the sample image, for example, the actual target text identifier of the sample image may be an object category (or object name, etc.). For example, if the sample image includes a cat, the actual target text identifier of the sample image may be a cat.
  • the actual target position of the sample image is used to represent the area actually occupied by the target object in the sample image in the sample image.
  • the present application does not limit the representation of the actual target position of the sample image, and any existing or future representation that can represent the area occupied by an object in the image can be used for implementation.
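  • For illustration only (the representation below is an assumption by way of example, not part of the application), one common representation of the actual target position is an axis-aligned bounding box, sketched here in Python:

    from dataclasses import dataclass

    @dataclass
    class BoundingBox:
        """One possible actual-target-position representation: an axis-aligned
        box given by its top-left and bottom-right pixel coordinates."""
        x1: float
        y1: float
        x2: float
        y2: float

    # e.g., the area occupied by a cat in a sample image
    cat_position = BoundingBox(x1=34.0, y1=50.0, x2=180.0, y2=210.0)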
  • S102 Perform text feature extraction on the actual target text identifier of the sample image to obtain the target text feature of the sample image.
  • the target text feature of the sample image is used to describe the text information (such as semantic information, etc.) carried by the actual target text mark of the sample image, so that the target text feature of the sample image can represent the target object in the sample image The features actually present in this sample image.
  • It should be noted that the embodiment of the present application does not limit the method of extracting the target text features of the sample image (that is, the implementation of S102); any existing or future method that can perform feature extraction on a text can be used for implementation.
  • the following description will be given in combination with examples.
  • S102 may specifically include: inputting the actual target text identifier of the sample image into a pre-trained language model, and obtaining the target text feature of the sample image output by the language model.
  • the language model is used for text feature extraction; and the embodiment of the present application does not limit the language model, and any existing or future language model can be used for implementation.
  • the language model can be trained in advance according to the sample text and the actual text features of the sample text.
  • the sample text refers to the text required for training the language model; and the actual text features of the sample text are used to describe the text information actually carried by the sample text (such as semantic information, etc.).
  • the embodiment of the present application does not limit the training process of the language model, and any existing or future method that can train the language model according to the sample text and the actual text features of the sample text can be used for implementation.
  • In this way, the pre-trained language model can be used to perform text feature extraction on the actual target text identifier of the i-th sample image, obtaining and outputting the target text feature of the i-th sample image, so that the target text feature of the i-th sample image accurately represents the text information carried by the actual target text identifier of the i-th sample image and can be used to constrain the training update process of the target detection model.
  • Because the pre-trained language model can accurately extract the text information (especially semantic information) carried by a text, and the number of texts it can describe is unlimited, the text features it outputs for different texts are highly separable. This effectively ensures that the text features of any two texts (for example, any two of the target text features of the N sample images) do not overlap, which can effectively improve the detection accuracy of the target detection model.
  • In addition, the language model can learn the semantic correlation between different texts during its training process (for example, the semantic correlation between "cat" and "tiger" is higher than that between "cat" and "car"), so that the trained language model can better extract text features, which can effectively improve the detection accuracy of the target detection model.
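  • As a minimal sketch of such text feature extraction (the model name "bert-base-uncased" and the mean-pooling choice are illustrative assumptions; the application does not limit the language model):

    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    encoder = AutoModel.from_pretrained("bert-base-uncased")

    def text_feature(label: str) -> torch.Tensor:
        # Encode the actual target text identifier and mean-pool token states
        inputs = tokenizer(label, return_tensors="pt")
        with torch.no_grad():
            hidden = encoder(**inputs).last_hidden_state  # (1, seq_len, 768)
        return hidden.mean(dim=1).squeeze(0)              # (768,)

    cat_feature = text_feature("cat")  # target text feature for a "cat" sample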
  • S103 Input the sample image into the target detection model, and obtain the image features of the sample image and the predicted target position of the sample image output by the target detection model.
  • the image feature of the sample image is used to represent the feature that the target object in the sample image is predicted to appear in the sample image.
  • the predicted target position of the sample image is used to represent the predicted area occupied by the target object in the sample image in the sample image.
  • the target detection model is used for target detection (for example, to detect the category of the target object and the image position of the target object).
  • The embodiment of the present application does not limit the target detection model. For ease of understanding, the target detection model 200 shown in FIG. 2 is taken as an example below; it includes an image feature extraction layer 201, a target category prediction layer 202, and a target position prediction layer 203.
  • the input data of the target category prediction layer 202 includes the output data of the image feature extraction layer 201
  • the input data of the target position prediction layer 203 includes the output data of the image feature extraction layer 201 .
  • the working process of the target detection model 200 may include step 11-step 13:
  • Step 11 Input the sample image into the image feature extraction layer 201, and obtain the image features of the sample image output by the image feature extraction layer 201.
  • the image feature extraction layer 201 is used for performing image feature extraction on the input data of the image feature extraction layer 201 .
  • the embodiment of the present application does not limit the implementation manner of the image feature extraction layer 201, and any existing or future solution capable of image feature extraction can be used for implementation.
  • Step 12 Input the image features of the sample image into the target category prediction layer 202 to obtain the predicted target text identifier of the sample image output by the target category prediction layer 202 .
  • the object type prediction layer 202 is used for performing object type prediction on the input data of the object type prediction layer 202 .
  • the embodiment of the present application does not limit the implementation manner of the object category prediction layer 202, and any existing or future solution capable of performing object category prediction can be used for implementation.
  • the predicted target text identifier of the sample image is used to represent the predicted identifier (eg, predicted category) of the target object in the sample image.
  • Step 13 Input the image features of the sample image into the target position prediction layer 203 to obtain the predicted target position of the sample image output by the target position prediction layer 203 .
  • the target position prediction layer 203 is used for performing object position prediction on the input data of the target position prediction layer 203 .
  • the embodiment of the present application does not limit the implementation of the object position prediction layer 203, and any existing or future solution capable of predicting object positions can be used for implementation.
  • Based on the relevant content of the above step 11 to step 13, it can be seen that for the target detection model 200 shown in FIG. 2, the image feature extraction layer 201, the target category prediction layer 202, and the target position prediction layer 203 respectively generate and output the image features of the sample image, the predicted target text identifier of the sample image, and the predicted target position of the sample image, so that the target detection performance of the target detection model 200 can subsequently be determined based on this prediction information.
  • It should be noted that the data dimension of the image feature of the sample image output by the image feature extraction layer 201 may be inconsistent with the data dimension of the target text feature of the sample image. To ensure that the similarity between the image features of the sample image and the target text features of the sample image can be successfully calculated, a data dimension transformation layer can be added to the target detection model 200 shown in FIG. 2.
  • The input data of the data dimension transformation layer includes the output data of the image feature extraction layer 201, so that the data dimension transformation layer can perform data dimension transformation on the output data of the image feature extraction layer 201 (such as the image features of the sample image). The output data of the data dimension transformation layer is then consistent with the data dimension of the target text feature of the sample image, which is beneficial to improving the accuracy of the similarity calculation between the image feature of the sample image and the target text feature of the sample image.
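  • As a minimal PyTorch sketch of the structure described above (the placeholder backbone, dimensions, and class count are illustrative assumptions; the application does not limit the implementations of layers 201-203):

    import torch
    import torch.nn as nn

    class TargetDetectionModel(nn.Module):
        """Sketch of model 200: image feature extraction layer 201, target
        category prediction layer 202, target position prediction layer 203,
        plus a data dimension transformation layer for the text-feature space."""
        def __init__(self, feat_dim=512, text_dim=768, num_classes=80):
            super().__init__()
            self.image_feature_layer = nn.Sequential(  # layer 201 (placeholder)
                nn.Conv2d(3, feat_dim, kernel_size=7, stride=16, padding=3),
                nn.ReLU(),
                nn.AdaptiveAvgPool2d(1),
                nn.Flatten(),
            )
            self.category_layer = nn.Linear(feat_dim, num_classes)  # layer 202
            self.position_layer = nn.Linear(feat_dim, 4)  # layer 203: (x1, y1, x2, y2)
            self.dim_transform = nn.Linear(feat_dim, text_dim)  # match text-feature dim

        def forward(self, images):
            feats = self.image_feature_layer(images)
            return {
                "image_feature": self.dim_transform(feats),   # comparable with text features
                "pred_category": self.category_layer(feats),  # predicted target text identifier (logits)
                "pred_position": self.position_layer(feats),  # predicted target position
            }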
  • In this way, the i-th sample image can be input into the target detection model, so that the target detection model performs target detection processing on the i-th sample image and obtains and outputs the image features of the i-th sample image and the predicted target position of the i-th sample image. The target detection performance of the target detection model can subsequently be determined based on the image features and predicted target position of each sample image.
  • S104 Determine whether the first stop condition is met, if yes, perform a preset action; if not, perform S105.
  • the first stop condition may be preset, and the embodiment of the present application does not limit the first stop condition, for example, the first stop condition may be that the predicted loss value of the target detection model is lower than the first preset loss threshold, or The rate of change of the predicted loss value of the target detection model is lower than the first rate of change threshold, or the number of updates of the target detection model reaches the first threshold.
  • the predicted loss value of the target detection model is used to represent the target detection performance of the target detection model for the above N sample images; and the embodiment of the present application does not limit the calculation method of the predicted loss value of the target detection model, which can be Use any existing or future model prediction loss value calculation method for implementation.
  • Preset actions can be preset.
  • the preset action may be to end the training process of the target detection model (that is, to end the target detection learning process of the target detection model for N sample images).
  • the preset actions may include the following S106-S109.
  • That is, after obtaining the prediction information output by the current round of the target detection model, it can be judged whether the current round of the target detection model meets the first stop condition. If the first stop condition is met, the current round of the target detection model has better target detection performance for the above N sample images, so it can be saved for subsequent work (for example, performing target detection or adding a new object detection function to the target detection model). If the first stop condition is not met, the target detection performance of the current round of the target detection model for the above N sample images is still relatively poor, so the target detection model can be updated according to the label information corresponding to the N sample images and the prediction information output by the current round of the target detection model for the N sample images.
  • S105 Update the target detection model according to the predicted target position of the sample image, the actual target position of the sample image, and the similarity between the image feature of the sample image and the target text feature of the sample image, and return to execute S103.
  • The similarity between the image feature of the sample image and the target text feature of the sample image is used to represent the degree of similarity between these two features.
  • the embodiment of the present application does not limit the calculation method of the similarity between the image feature of the sample image and the target text feature of the sample image, for example, the Euclidean distance may be used for calculation.
  • In the embodiment of the present application, the training objectives of the target detection model may include making the predicted target position of the sample image as close as possible to the actual target position of the sample image, and making the image features of the sample image as close as possible to the target text features of the sample image (that is, making the similarity between the image feature of the sample image and the target text feature of the sample image as large as possible).
  • That is, the current round of the target detection model can be updated based on the gap between the predicted target position of the i-th sample image and the actual target position of that sample image, and on the similarity between the image features of the i-th sample image and the target text features of the i-th sample image, so that the updated target detection model has better target detection performance, and the above S103 and its subsequent steps can then continue to be performed. Here, i is a positive integer, i ≤ N, and N is a positive integer.
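  • To make the objective concrete, here is a minimal sketch of one S103-S105 round under the assumptions of the model sketch above (smooth L1 for the position gap and a cosine-similarity term for feature alignment are illustrative choices; the application fixes neither the loss nor the similarity measure):

    import torch.nn.functional as F

    def training_step(model, optimizer, images, target_positions, target_text_feats):
        """Predict (S103), compute the loss, and update the model (S105)."""
        out = model(images)
        # Bring the predicted target position close to the actual target position.
        position_loss = F.smooth_l1_loss(out["pred_position"], target_positions)
        # Bring the image feature close to the target text feature
        # (maximize similarity, i.e. minimize 1 - cosine similarity).
        sim = F.cosine_similarity(out["image_feature"], target_text_feats, dim=-1)
        similarity_loss = (1.0 - sim).mean()
        loss = position_loss + similarity_loss
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return loss.item()  # e.g. compared against a threshold for the first stop condition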
  • To sum up, text feature extraction can first be performed on the actual target text identifier of the sample image to obtain the target text feature of the sample image; the sample image, the target text feature of the sample image, and the actual target position of the sample image are then used to train the target detection model to obtain a trained target detection model.
  • Because the target text feature of the sample image can more accurately represent the actual target text identifier of the sample image, the target detection model trained under the constraints of the target text feature of the sample image has a better target detection function, which is beneficial to improving the target detection performance.
  • In addition, the trained target detection model has better target detection performance only for the target objects it has already learned, so to further improve the prediction performance of the target detection model, the trained target detection model can be made to learn new target objects (that is, category incremental learning can be performed for the target detection model).
  • the embodiment of the present application also provides a possible implementation of the target detection model training method.
  • the target detection model training method includes S106-S109 in addition to the above S101-S105:
  • the newly-added image refers to the image required for category incremental learning for the trained target detection model.
  • the embodiment of the present application does not limit the number of added images, for example, the number of added images is M; wherein, M is a positive integer.
  • S106-S109 can be used to realize that the target detection model further learns how to perform target detection on the M new images under the premise of keeping the learned target objects.
  • For the relevant content of the actual target text identifier of the added image, the actual target position of the added image, and the target text feature of the added image, please refer to the relevant content of the actual target text identifier of the sample image and the actual target position of the sample image in S101 above, and of the target text feature of the sample image in S102 above; it is only necessary to replace "sample image" with "added image" in that content.
  • The trained target detection model can be, for example, a target detection model trained using the training process shown in S101-S105 above, or a target detection model obtained by performing category incremental learning at least once using the training process shown in S106-S109 after the training process shown in S101-S105 is completed.
  • After the newly added image, the actual target text identifier of the newly added image, and the actual target position of the newly added image are acquired, it can be determined that category incremental learning is needed for the trained target detection model. Text feature extraction can therefore be performed on the actual target text identifier of the newly added image to obtain the target text feature of the newly added image, so that the target text feature of the newly added image can be used to constrain the incremental learning process of the target detection model, and the retrained target detection model can further learn how to perform target detection on these added images while maintaining the target objects it has already learned.
  • S107 Input the historical sample image and the newly added image into the target detection model, and obtain the image features of the historical sample image output by the target detection model, the predicted target position of the historical sample image, the image features of the newly added image and The predicted target location for this added image.
  • the historical sample images may include all or part of the images used in the historical training process of the target detection model.
  • the historical training process of the target detection model refers to the category learning process that the target detection model has experienced before the current sub-category incremental learning process for the target detection model. For example, if the trained target detection model has only experienced the category learning process shown in S101-S105 above, the historical training process of the target detection model refers to the training process shown in S101-S105 above. As another example, if the trained target detection model has gone through the category learning process shown in S101-S105 above and Q times the category incremental learning process shown in S106-S109, then the historical training process of the target detection model It may include the training process shown in S101-S105 above, the training process shown in the first time S106-S109 to the training process shown in the Qth time S106-S109.
  • the determination process of the historical sample image may include Step 21-Step 24:
  • Step 21 According to the sample image, determine the image used for training corresponding to the target detection model.
  • the training used images corresponding to the target detection model refer to images that have been used in the historical training process of the target detection model.
  • two examples are used for description below.
  • Example 1: If the historical training process of the target detection model includes the training process shown in S101-S105 above, the training used images corresponding to the target detection model may include the above N sample images.
  • Example 2: If the historical training process of the target detection model includes the training process shown in S101-S105 above and the training processes shown in the 1st to the Qth instances of S106-S109, and the qth training process shown in S106-S109 uses Gq newly added images for category incremental learning (q being a positive integer, q ≤ Q), then the training used images corresponding to the target detection model may include the above N sample images, the G1 newly added images, the G2 newly added images, ..., and the GQ newly added images.
  • In this way, when the trained target detection model needs category incremental learning, the training used images can first be determined based on the images involved in the historical training process of the target detection model, so that the training used images accurately represent the images that have been used in the historical learning process of the target detection model.
  • Step 22 Determine at least one historical target category according to the actual target text identifiers of the images used for training.
  • the historical target category refers to the object category that the target detection model has learned during the historical training process of the target detection model.
  • two examples are used for description below.
  • Example 1: If the historical training process of the target detection model includes the training process shown in S101-S105 above, and the N sample images in that training process correspond to R0 object categories, then the R0 object categories are all determined as historical object categories.
  • Example 2: If the historical training process of the target detection model includes the training process shown in S101-S105 above and the training processes shown in the 1st to the Qth instances of S106-S109, the N sample images in S101-S105 correspond to R0 object categories, and the Gq newly added images in the qth training process shown in S106-S109 correspond to Rq object categories (q being a positive integer, q ≤ Q), then the R0 object categories, R1 object categories, R2 object categories, ..., RQ object categories can all be determined as historical object categories.
  • It should be noted that there are no repeated object categories among the R0 object categories, R1 object categories, R2 object categories, ..., RQ object categories; that is, any two of these object categories are different.
  • In this way, the actual target text identifiers of the training used images can be used to determine the historical object categories corresponding to the target detection model, so that the historical object categories accurately represent the object categories that have been learned during the historical learning process of the target detection model.
  • Step 23 According to the actual target text identification of the training used images, determine the training used images belonging to each historical target category from the training used images corresponding to the target detection model.
  • As an example, step 23 may specifically include: determining the Y1 images belonging to the 1st historical target category among the training used images corresponding to the target detection model as the training used images belonging to the 1st historical target category; determining the Y2 images belonging to the 2nd historical target category as the training used images belonging to the 2nd historical target category; and so on, until the YM images belonging to the Mth historical target category are determined as the training used images belonging to the Mth historical target category.
  • Step 24 Extract historical sample images corresponding to each historical object category from training images that belong to each historical object category.
  • the extraction may be performed with reference to a preset extraction ratio (or number of extractions, etc.).
  • As an example, with an extraction ratio of 10%, step 24 may specifically include: randomly extracting, at a ratio of 10%, from the training used images belonging to the 1st historical target category to obtain the historical sample images corresponding to the 1st historical target category, so that the actual target text identifiers of these historical sample images are all the 1st historical target category; randomly extracting, at a ratio of 10%, from the training used images belonging to the 2nd historical target category to obtain the historical sample images corresponding to the 2nd historical target category, so that the actual target text identifiers of these historical sample images are all the 2nd historical target category; and so on, until the training used images belonging to the Mth historical target category are randomly sampled at a ratio of 10% to obtain the historical sample images corresponding to the Mth historical target category, whose actual target text identifiers are all the Mth historical target category.
  • In this way, some historical sample images can be extracted from the images involved in the historical training process of the target detection model, so that these historical sample images can represent the object categories that have been learned during the historical learning process of the target detection model, as sketched below.
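  • A minimal sketch of steps 21-24 (the 10% ratio comes from the example above; representing the training used images as (image_path, label) pairs is an illustrative assumption):

    import random
    from collections import defaultdict

    def sample_historical_images(training_used, ratio=0.1, seed=0):
        """training_used: list of (image_path, actual_target_text_identifier).
        Groups the images by historical target category (steps 22-23), then
        randomly extracts a fixed ratio per category (step 24)."""
        by_category = defaultdict(list)
        for path, label in training_used:
            by_category[label].append(path)
        rng = random.Random(seed)
        historical = []
        for label, paths in by_category.items():
            k = max(1, int(len(paths) * ratio))
            historical += [(p, label) for p in rng.sample(paths, k)]
        return historical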
  • For the relevant content of the image features of the historical sample images and the predicted target positions of the historical sample images, please refer to the relevant content of the "image features of the sample image" and "predicted target position of the sample image" in S103 above; it is only necessary to replace "sample image" with "historical sample image" in that content.
  • For the relevant content of the image features of the newly added image and the predicted target position of the newly added image, please likewise refer to the relevant content of the "image features of the sample image" and "predicted target position of the sample image" in S103 above, replacing "sample image" with "newly added image".
  • In this way, the historical sample images and the newly added images can be respectively input into the target detection model, so that the target detection model performs target detection on the historical sample images and the newly added images and obtains and outputs the image features and predicted target positions of the historical sample images and of the newly added images, so that the target detection performance of the target detection model can be determined based on this prediction information.
  • S108: Determine whether the second stop condition is met; if yes, perform a preset step; if not, perform S109.
  • The second stop condition may be preset, and the embodiment of the present application does not limit the second stop condition. For example, the second stop condition may be that the detection loss value of the target detection model is lower than a second preset loss threshold, that the rate of change of the detection loss value of the target detection model is lower than a second rate-of-change threshold, or that the number of updates of the target detection model reaches a second threshold.
  • the detection loss value of the target detection model is used to represent the target detection performance of the target detection model for historical sample images and newly added images; and the embodiment of the present application does not limit the calculation method of the detection loss value of the target detection model , which can be implemented by using any existing or future model detection loss value calculation method.
  • In some cases, the embodiment of the present application also provides a calculation method for the detection loss value of the target detection model, which may specifically include step 31 to step 33:
  • Step 31: Determine the historical image loss value according to the predicted target position of the historical sample image, the actual target position of the historical sample image, and the similarity between the image feature of the historical sample image and the target text feature of the historical sample image.
  • the historical image loss value refers to the loss value generated when the target detection model performs target detection on the historical sample images, so that the historical image loss value is used to represent the target detection performance of the target detection model on the historical sample images.
  • the embodiment of the present application does not limit the calculation method of the historical image loss value, and any existing or future prediction loss value calculation method may be used for implementation.
  • Step 32 According to the predicted target position of the added image, the actual target position of the added image, and the similarity between the image feature of the added image and the target text feature of the added image, determine the loss value of the added image .
  • the newly added image loss value refers to the loss value generated when the target detection model performs target detection for the newly added image, so that the newly added image loss value is used to represent the target detection performance of the target detection model for the newly added image.
  • the embodiment of the present application does not limit the calculation method of the newly added image loss value, and any existing or future prediction loss value calculation method may be used for implementation.
  • Step 33 Perform weighted summation of the historical image loss value and the newly added image loss value to obtain the detection loss value of the target detection model. Wherein, the weighting weight corresponding to the historical image loss value is higher than the weighting weight corresponding to the newly added image loss value.
  • the weighted weight corresponding to the historical image loss value refers to the weight value to be multiplied by the historical image loss value in the "weighted summation" in step 33 .
  • the weighting weights corresponding to the historical image loss values may be preset.
  • the weighted weight corresponding to the newly added image loss value refers to the weight value to be multiplied by the newly added image loss value in the "weighted sum" in step 33 .
  • the weighting weights corresponding to the newly added image loss values may be preset.
  • In this way, because the weighting weight corresponding to the historical image loss value is higher, the target detection model trained on this detection loss value can not only perform accurate target detection on the newly added images but also still perform accurate target detection on the training used images corresponding to the target detection model, which is conducive to improving the accuracy of category incremental learning.
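  • A minimal sketch of steps 31-33, reusing the loss form of the earlier training-step sketch (the concrete weights 0.7/0.3 are illustrative; the application only requires the historical weight to exceed the newly added weight):

    import torch.nn.functional as F

    def batch_loss(model, images, target_positions, target_text_feats):
        # Position gap plus (1 - cosine similarity) feature term, as in S105.
        out = model(images)
        pos = F.smooth_l1_loss(out["pred_position"], target_positions)
        sim = F.cosine_similarity(out["image_feature"], target_text_feats, dim=-1)
        return pos + (1.0 - sim).mean()

    def incremental_step(model, optimizer, hist_batch, added_batch,
                         w_hist=0.7, w_added=0.3):
        assert w_hist > w_added  # required weighting relationship
        hist_loss = batch_loss(model, *hist_batch)    # step 31: historical image loss
        added_loss = batch_loss(model, *added_batch)  # step 32: added image loss
        loss = w_hist * hist_loss + w_added * added_loss  # step 33: weighted sum
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return float(loss)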
  • the preset steps can be preset.
  • the preset step may be to end the current category incremental learning process of the target detection model.
  • the preset steps may include the above S106-S109 .
  • That is, after obtaining the prediction information output by the current round of the target detection model, it can be judged whether the current round of the target detection model meets the second stop condition. If the second stop condition is met, the current round of the target detection model has better target detection performance for both the historical sample images and the newly added images, so it can be saved for follow-up work (such as performing target detection or adding new object detection functions to the target detection model). If the second stop condition is not reached, the target detection performance of the current round of the target detection model for the historical sample images and the newly added images is still relatively poor, so the target detection model can be updated based on the label information corresponding to the historical sample images, the label information corresponding to the newly added images, and the prediction information output by the current round of the target detection model for the historical sample images and the newly added images.
  • S109: Update the target detection model according to the predicted target position of the historical sample image, the actual target position of the historical sample image, the similarity between the image feature of the historical sample image and the target text feature of the historical sample image, the predicted target position of the added image, the actual target position of the added image, and the similarity between the image feature of the added image and the target text feature of the added image, and return to execute S107.
  • In the embodiment of the present application, the training targets of the target detection model may include: making the predicted target position of the historical sample image as close as possible to the actual target position of the historical sample image; making the image features of the historical sample image as close as possible to the target text features of the historical sample image (that is, making the similarity between them as large as possible); making the predicted target position of the added image as close as possible to the actual target position of the added image; and making the image features of the added image as close as possible to the target text features of the added image (that is, making the similarity between them as large as possible).
  • S109 may specifically include S1091-S1094:
  • S1091 Determine the historical image according to the predicted target position of the historical sample image, the actual target position of the historical sample image, and the similarity between the image feature of the historical sample image and the target text feature of the historical sample image loss value.
  • S1092 Determine the added image loss value according to the predicted target position of the added image, the actual target position of the added image, and the similarity between the image feature of the added image and the target text feature of the added image.
  • S1093: Perform a weighted summation of the historical image loss value and the newly added image loss value to obtain a detection loss value of the target detection model, where the weighting weight corresponding to the historical image loss value is higher than the weighting weight corresponding to the newly added image loss value.
  • S1094: Update the target detection model according to the detection loss value of the target detection model.
  • Based on the above, in the target detection model training method provided in the embodiment of the present application, for a trained target detection model, if a new object detection function needs to be added to the target detection model, the new images and their label information can be used to carry out category incremental learning for the target detection model, so that the learned target detection model adds a target detection function for the new images while maintaining its original target detection function, which is conducive to continuously improving the target detection performance of the target detection model.
  • the embodiment of the present application also provides a possible implementation of the target detection model training method, which specifically includes steps 41-45:
  • Step 41 Obtain a sample image, an actual target text identifier of the sample image, and an actual target position of the sample image.
  • Step 42 Perform text feature extraction on the actual target text identifier of the sample image to obtain the target text feature of the sample image.
  • For the relevant content of step 41 and step 42, please refer to S101 and S102 above, respectively.
  • Step 43: Input the sample image into the target detection model, and obtain the image features of the sample image, the predicted target text identifier of the sample image, and the predicted target position of the sample image output by the target detection model.
  • the predicted target text identifier of the sample image is used to represent the predicted identifier (eg, predicted category) of the target object in the sample image.
  • It should be noted that step 43 can be implemented using any of the implementations of S103 above; it is only necessary to replace the output data of the target detection model in S103 from "the image features of the sample image and the predicted target position of the sample image" with "the image features of the sample image, the predicted target text identifier of the sample image, and the predicted target position of the sample image".
  • Step 44 Judging whether the first stop condition is met, if yes, execute a preset action; if not, execute step 45.
  • For the relevant content of step 44, please refer to the relevant content of S104 above.
  • the "predicted loss value of the target detection model" in step 44 is based on the predicted target text identifier of the sample image, the actual target text identifier of the sample image, the predicted target position of the sample image, the actual target position of the sample image, and The similarity between the image feature of the sample image and the target text feature of the sample image is calculated.
  • Step 45: Update the target detection model according to the predicted target text identifier of the sample image, the actual target text identifier of the sample image, the predicted target position of the sample image, the actual target position of the sample image, and the similarity between the image features of the sample image and the target text features of the sample image, and return to step 43.
  • It should be noted that step 45 can be implemented using any of the implementations of S105 above; it is only necessary to replace "the predicted target position of the sample image, the actual target position of the sample image, and the similarity between the image features of the sample image and the target text features of the sample image" in any implementation of S105 with "the predicted target text identifier of the sample image, the actual target text identifier of the sample image, the predicted target position of the sample image, the actual target position of the sample image, and the similarity between the image features of the sample image and the target text features of the sample image".
  • That is, the update process of the target detection model in step 45 is based on the predicted target text identifier of the sample image, the actual target text identifier of the sample image, the predicted target position of the sample image, the actual target position of the sample image, and the similarity between the image features of the sample image and the target text features of the sample image.
  • Based on the above, text feature extraction can be performed on the actual target text identifier of the sample image to obtain the target text feature of the sample image; the sample image, the target text feature of the sample image, the actual target text identifier of the sample image, and the actual target position of the sample image are then used to train the target detection model to obtain a trained target detection model.
  • Because the target detection model is trained under the constraints of the target text features, the actual target text identifiers, and the actual target positions of the sample images, the trained target detection model has a better target detection function, which is beneficial to improving the target detection performance; a sketch of such an update step follows.
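  • In this variant the predicted target text identifier adds a classification term to the update. A minimal sketch extending the earlier training-step sketch (cross-entropy over the category logits is an illustrative choice, not mandated by the application):

    import torch.nn.functional as F

    def training_step_with_labels(model, optimizer, images, target_positions,
                                  target_text_feats, target_class_ids):
        """Step 45: position gap + feature-text similarity + category term."""
        out = model(images)
        position_loss = F.smooth_l1_loss(out["pred_position"], target_positions)
        sim = F.cosine_similarity(out["image_feature"], target_text_feats, dim=-1)
        similarity_loss = (1.0 - sim).mean()
        # Compare the predicted target text identifier (category logits)
        # against the class id of the actual target text identifier.
        class_loss = F.cross_entropy(out["pred_category"], target_class_ids)
        loss = position_loss + similarity_loss + class_loss
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return float(loss)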
  • the embodiment of the present application also provides a possible implementation of the target detection model training method.
  • In this implementation, the target detection model training method includes not only the above steps 41-45 but also the following step 46 to step 49:
  • Step 46 After acquiring the added image, the actual target text identifier of the added image, and the actual target position of the added image, perform text feature extraction on the actual target text identifier of the added image to obtain the added image target text features.
  • For the relevant content of step 46, please refer to the relevant content of S106 above.
  • Step 47 Input the historical sample image and the newly added image into the target detection model, and obtain the image features of the historical sample image output by the target detection model, the predicted target text identifier of the historical sample image, and the The predicted target position, the image feature of the added image, the predicted target text identifier of the added image, and the predicted target position of the added image.
  • the predicted target text identifier of the historical sample image is used to represent the predicted identifier (eg, predicted category) of the target object in the historical sample image.
  • the predicted target text identifier of the added image is used to represent the predicted identifier (eg, predicted category) of the target object in the added image.
  • It should be noted that step 47 can be implemented using any of the implementations of S107 above; it is only necessary to replace the output data of the target detection model in S107 from "the image features of the historical sample image, the predicted target position of the historical sample image, the image features of the newly added image, and the predicted target position of the newly added image" with "the image features of the historical sample image, the predicted target text identifier of the historical sample image, the predicted target position of the historical sample image, the image features of the newly added image, the predicted target text identifier of the newly added image, and the predicted target position of the newly added image".
  • Step 48 Judging whether the second stop condition is met, if yes, execute the preset step; if not, execute step 49.
  • the "detection loss value of the target detection model" in step 48 is based on the predicted target text identifier of the historical sample image, the actual target text identifier of the historical sample image, the predicted target position of the historical sample image, the historical sample image The actual target position of the example image, the predicted target text mark of the newly added image, the actual target text mark of the added image, the predicted target position of the added image, the actual target position of the added image, the historical sample image The similarity between the image feature and the target text feature of the historical sample image, and the similarity between the image feature of the added image and the target text feature of the added image are calculated.
  • Step 49: Update the target detection model according to the predicted target text identifier of the historical sample image, the actual target text identifier of the historical sample image, the predicted target position of the historical sample image, the actual target position of the historical sample image, the predicted target text identifier of the newly added image, the actual target text identifier of the newly added image, the predicted target position of the newly added image, the actual target position of the newly added image, the similarity between the image features of the historical sample image and the target text features of the historical sample image, and the similarity between the image features of the newly added image and the target text features of the newly added image, and return to execute step 47.
  • It should be noted that step 49 can be implemented using any of the implementations of S109 above; it is only necessary to replace the update basis "the predicted target position of the historical sample image, the actual target position of the historical sample image, the predicted target position of the newly added image, the actual target position of the newly added image, the similarity between the image features of the historical sample image and the target text features of the historical sample image, and the similarity between the image features of the newly added image and the target text features of the newly added image" in any implementation of S109 with the update basis listed in step 49 above.
  • the target detection model training method provided in the embodiment of the present application, for the trained target detection model, if it is necessary to add a new object detection function to the target detection model , then the target detection model can be incrementally learned by using the newly added image and its three label information (that is, target text features, actual target text identification, and actual target position), so that the learned target detection model can On the premise of maintaining the original target detection function, the target detection function for new images is added, which is conducive to continuously improving the target detection performance of the target detection model.
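To make the incremental step concrete, the following is a minimal sketch of one update over a batch of historical sample images and a batch of newly added images, assuming a PyTorch-style model whose forward pass returns image features, class logits (the predicted target text identifiers), and predicted boxes. The loss form, the weights, and all names here are illustrative assumptions, not the implementation of the present application.

    import torch
    import torch.nn.functional as F

    def incremental_step(model, optimizer,
                         hist_images, hist_text_ids, hist_text_feats, hist_boxes,
                         new_images, new_text_ids, new_text_feats, new_boxes,
                         hist_weight=0.7, new_weight=0.3):
        # Assumed model signature: images -> (image features, class logits, boxes).
        def batch_loss(images, text_ids, text_feats, boxes):
            img_feats, logits, pred_boxes = model(images)
            cls_loss = F.cross_entropy(logits, text_ids)              # predicted vs. actual target text identifier
            box_loss = F.smooth_l1_loss(pred_boxes, boxes)            # predicted vs. actual target position
            sim = F.cosine_similarity(img_feats, text_feats, dim=-1)  # image features vs. target text features
            return cls_loss + box_loss + (1.0 - sim).mean()

        hist_loss = batch_loss(hist_images, hist_text_ids, hist_text_feats, hist_boxes)
        new_loss = batch_loss(new_images, new_text_ids, new_text_feats, new_boxes)
        # The historical loss is weighted more heavily so that the original
        # detection functions are preserved while the new ones are learned.
        loss = hist_weight * hist_loss + new_weight * new_loss
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return loss.item()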
  • after the target detection model is trained, it can be used for target detection. Based on this, an embodiment of the present application further provides a target detection method, which is described below with reference to the accompanying drawings.
  • this figure is a flow chart of a target detection method provided by an embodiment of the present application.
  • the target detection method provided in the embodiment of this application includes S301-S302:
  • S301 Acquire an image to be detected.
  • the image to be detected refers to an image that needs to be subjected to target detection processing.
  • S302 Input the image to be detected into a pre-trained target detection model, and obtain a target detection result of the image to be detected output by the target detection model.
  • the target detection model is trained by using any implementation of the target detection model training method provided in the embodiment of the present application.
  • the target detection result of the image to be detected is obtained by the target detection model performing target detection on the image to be detected.
  • this embodiment of the present application does not limit the target detection result of the image to be detected.
  • the target detection result of the image to be detected may include the predicted target text identifier of the target object in the image to be detected (for example, the predicted target category) and/or the area occupied by the target object in the image to be detected.
  • the trained target detection model can be used to perform target detection on the image to be detected, and the target detection result of the image to be detected can be obtained and output, so that the target detection result can accurately represent the relevant information of the target object in the image to be detected (e.g., target category information and target position information).
  • in addition, since the trained target detection model has better target detection performance, the target detection result of the image to be detected determined by using the target detection model is more accurate, which helps to improve the accuracy of target detection; a minimal inference sketch is given below.
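As an illustration of S301-S302, the following is a minimal inference sketch; the stand-in model TinyDetector, its heads, and the input size are hypothetical and only show the shape of the procedure, not the model of the present application.

    import torch
    import torch.nn as nn

    class TinyDetector(nn.Module):
        # Stand-in for a trained target detection model (illustrative only).
        def __init__(self, num_classes=10):
            super().__init__()
            self.backbone = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1),
                                          nn.AdaptiveAvgPool2d(1), nn.Flatten())
            self.cls_head = nn.Linear(8, num_classes)  # predicted target text identifier
            self.box_head = nn.Linear(8, 4)            # predicted target position (x1, y1, x2, y2)

        def forward(self, x):
            feats = self.backbone(x)
            return feats, self.cls_head(feats), self.box_head(feats)

    model = TinyDetector()   # in practice, load the pre-trained weights here
    model.eval()
    image_to_detect = torch.rand(1, 3, 224, 224)  # S301: acquire the image to be detected
    with torch.no_grad():                         # S302: run the target detection model
        _, logits, boxes = model(image_to_detect)
    result = {"target_text_id": logits.argmax(dim=-1).item(),  # e.g. the predicted target category
              "target_position": boxes.squeeze(0).tolist()}    # area occupied by the target object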
  • the embodiment of the present application also provides a target detection model training device, which will be explained and described below with reference to the accompanying drawings.
  • this figure is a schematic structural diagram of a target detection model training device provided by an embodiment of the present application.
  • the target detection model training device 400 provided in the embodiment of the present application includes:
  • a first acquiring unit 401 configured to acquire a sample image, an actual target text identifier of the sample image, and an actual target position of the sample image;
  • the first extraction unit 402 is configured to perform text feature extraction on the actual target text identifier of the sample image to obtain the target text feature of the sample image;
  • the first prediction unit 403 is configured to input the sample image into the target detection model, and obtain the image features of the sample image output by the target detection model and the predicted target position of the sample image;
  • the first updating unit 404 is configured to update the target detection model according to the predicted target position of the sample image, the actual target position of the sample image, and the similarity between the image features of the sample image and the target text features of the sample image, and to return to the first prediction unit 403 to execute the inputting of the sample image into the target detection model until the first stop condition is reached; a minimal sketch of this update is given below.
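A minimal sketch of the cooperation between units 403 and 404 follows, assuming the model returns image features and predicted boxes; the smooth L1 loss plus a cosine-similarity term is one plausible instantiation, not the loss of the present application.

    import torch.nn.functional as F

    def base_training_step(model, optimizer, sample_images, target_text_feats, actual_boxes):
        img_feats, pred_boxes = model(sample_images)              # first prediction unit 403
        box_loss = F.smooth_l1_loss(pred_boxes, actual_boxes)     # predicted vs. actual target position
        sim = F.cosine_similarity(img_feats, target_text_feats, dim=-1)
        loss = box_loss + (1.0 - sim).mean()                      # pull image features toward the target text features
        optimizer.zero_grad()                                     # first updating unit 404
        loss.backward()
        optimizer.step()
        return loss.item()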
  • the first extraction unit 402 is specifically configured to:
  • the target detection model training device 400 further includes:
  • the second extraction unit is configured to, after the first stop condition is reached and the newly added image, the actual target text identifier of the newly added image, and the actual target position of the newly added image are acquired, perform text feature extraction on the actual target text identifier of the newly added image to obtain the target text features of the newly added image;
  • the second prediction unit is configured to input the historical sample image and the newly added image into the target detection model, and obtain the image features of the historical sample image, the predicted target position of the historical sample image, the image features of the newly added image, and the predicted target position of the newly added image output by the target detection model; wherein the historical sample image is determined according to the sample image;
  • the second updating unit is configured to update the target detection model according to the predicted target position of the historical sample image, the actual target position of the historical sample image, the similarity between the image features of the historical sample image and the target text features of the historical sample image, the predicted target position of the newly added image, the actual target position of the newly added image, and the similarity between the image features of the newly added image and the target text features of the newly added image, and to return to the second prediction unit to execute the inputting of the historical sample image and the newly added image into the target detection model until the second stop condition is reached.
  • the process of determining the historical sample image includes:
  • determining, from the used training images corresponding to the target detection model, the used training images belonging to each historical target category;
  • extracting the historical sample images corresponding to the respective historical target categories from the used training images belonging to those categories; a minimal sampling sketch is given below.
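The following is a minimal sketch of this determination, assuming the used training images are stored together with their historical target categories; the per-category sample count is an illustrative choice, not a value from the present application.

    import random
    from collections import defaultdict

    def pick_historical_samples(used_training_images, per_category=10):
        by_category = defaultdict(list)
        for image, category in used_training_images:   # group by historical target category
            by_category[category].append(image)
        samples = []
        for images in by_category.values():            # extract from each category separately
            samples.extend(random.sample(images, min(per_category, len(images))))
        return samples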
  • the second updating unit includes:
  • the first determining subunit is configured to determine the historical image loss value according to the predicted target position of the historical sample image, the actual target position of the historical sample image, and the similarity between the image features of the historical sample image and the target text features of the historical sample image;
  • the second determining subunit is configured to determine the newly added image loss value according to the predicted target position of the newly added image, the actual target position of the newly added image, and the similarity between the image features of the newly added image and the target text features of the newly added image;
  • the third determining subunit is configured to perform a weighted summation of the historical image loss value and the newly added image loss value to obtain the detection loss value of the target detection model, wherein the weighting coefficient corresponding to the historical image loss value is higher than the weighting coefficient corresponding to the newly added image loss value (see the sketch after this list);
  • the model update subunit is configured to update the target detection model according to the detection loss value of the target detection model.
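The weighted summation of the third determining subunit can be sketched as follows; the concrete weights are illustrative, chosen only so that the historical weight is the higher one.

    def detection_loss(historical_image_loss: float, new_image_loss: float,
                       hist_weight: float = 0.7, new_weight: float = 0.3) -> float:
        # The historical weight exceeds the newly added image weight so that the
        # previously learned detection functions are preserved during incremental learning.
        assert hist_weight > new_weight
        return hist_weight * historical_image_loss + new_weight * new_image_loss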
  • the first prediction unit 403 is specifically configured to:
  • the first updating unit 404 is specifically configured to: update the target detection model according to the predicted target text identifier of the sample image, the actual target text identifier of the sample image, the predicted target position of the sample image, the actual target position of the sample image, and the similarity between the image features of the sample image and the target text features of the sample image, and return to the first prediction unit 403 to execute the inputting of the sample image into the target detection model until the first stop condition is reached.
  • in the target detection model training device 400, text feature extraction is first performed on the actual target text identifier of the sample image to obtain the target text features of the sample image; then the target detection model is trained by using the sample image, the target text features of the sample image, and the actual target position of the sample image to obtain a trained target detection model.
  • because the target text features of the sample image can more accurately represent the actual target text identifier of the sample image, the target detection model trained based on the target text features of the sample image has a better target detection function, which helps to improve target detection performance; a minimal sketch of the text feature extraction is given below.
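As a minimal sketch of the text feature extraction, a simple learned embedding can stand in for the text encoder; the class below is one possible instantiation and purely illustrative, since the present application does not bind the extraction to any particular encoder.

    import torch
    import torch.nn as nn

    class TextFeatureExtractor(nn.Module):
        def __init__(self, vocab_size=1000, feat_dim=8):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, feat_dim)  # maps a text identifier to a feature vector

        def forward(self, text_ids):
            return self.embed(text_ids)

    extractor = TextFeatureExtractor()
    actual_target_text_id = torch.tensor([3])               # hypothetical category id, e.g. for "dog"
    target_text_feature = extractor(actual_target_text_id)  # used as a constraint during training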
  • the embodiment of the present application also provides a target detection device, which will be explained and described below with reference to the accompanying drawings.
  • this figure is a schematic structural diagram of a target detection device provided by an embodiment of the present application.
  • the target detection device 500 provided in the embodiment of the present application includes:
  • a second acquiring unit 501 configured to acquire an image to be detected;
  • the target detection unit 502 is configured to input the image to be detected into a pre-trained target detection model, and obtain the target detection result of the image to be detected output by the target detection model; wherein the target detection model is trained by using any implementation of the target detection model training method provided in the embodiments of the present application.
  • after acquiring the image to be detected, the target detection device 500 can use the trained target detection model to perform target detection on the image to be detected, and obtain and output the target detection result of the image to be detected, so that the target detection result can accurately represent the relevant information of the target object in the image to be detected (e.g., target category information and target position information). Since the trained target detection model has better target detection performance, the target detection result of the image to be detected determined by using the target detection model is more accurate, which helps to improve the accuracy of target detection.
  • the embodiment of the present application also provides a device, the device includes a processor and a memory:
  • the memory is used to store a computer program;
  • the processor is configured to execute any implementation of the target detection model training method provided in the embodiments of the present application according to the computer program, or execute any implementation of the target detection method provided in the embodiments of the present application.
  • the embodiment of the present application also provides a computer-readable storage medium, where the computer-readable storage medium is used to store a computer program, and the computer program is used to execute any implementation of the target detection model training method provided in the embodiments of the present application, or to execute any implementation of the target detection method provided in the embodiments of the present application.
  • the embodiment of the present application also provides a computer program product which, when run on a terminal device, enables the terminal device to execute any implementation of the target detection model training method provided in the embodiments of the present application, or to execute any implementation of the target detection method provided in the embodiments of the present application.
  • “At least one (item)” means one or more, and “multiple” means two or more.
  • “And/or” is used to describe the association relationship between associated objects, and indicates that three types of relationship can exist; for example, “A and/or B” can mean: only A exists, only B exists, or both A and B exist, where A and B can be singular or plural.
  • the character “/” generally indicates that the associated objects before and after it are in an “or” relationship.
  • “At least one of the following” or similar expressions refers to any combination of these items, including any combination of a single item or plural items.
  • “At least one item (piece) of a, b, or c” can mean: a, b, c, “a and b”, “a and c”, “b and c”, or “a and b and c”, where a, b, and c can each be single or multiple.

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

Disclosed in the present application are a target detection model training method, a target detection method, and a related device. First, text feature extraction is performed on an actual target text identifier of a sample image to obtain a target text feature of the sample image; then, a target detection model is trained by using the sample image, the target text feature of the sample image, and an actual target position of the sample image, so that the target detection model performs target detection learning under the constraints of the target text feature of the sample image and the actual target position of the sample image. The trained target detection model thus has better target detection performance; more accurate target detection can then be performed on an image under test by using the trained target detection model, so as to obtain and output a target detection result of the image under test, and the target detection result of the image under test is more accurate, which facilitates an improvement in target detection accuracy.
PCT/CN2022/089194 2021-06-28 2022-04-26 Procédé d'apprentissage de modèle de détection de cible et procédé de détection de cible, et dispositif associé WO2023273570A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110723057.4 2021-06-28
CN202110723057.4A CN113469176B (zh) 2021-06-28 2021-06-28 一种目标检测模型训练方法、目标检测方法及其相关设备

Publications (1)

Publication Number Publication Date
WO2023273570A1 true WO2023273570A1 (fr) 2023-01-05

Family

ID=77873458

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/089194 WO2023273570A1 (fr) 2021-06-28 2022-04-26 Procédé d'apprentissage de modèle de détection de cible et procédé de détection de cible, et dispositif associé

Country Status (2)

Country Link
CN (1) CN113469176B (fr)
WO (1) WO2023273570A1 (fr)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113469176B (zh) * 2021-06-28 2023-06-02 北京有竹居网络技术有限公司 一种目标检测模型训练方法、目标检测方法及其相关设备

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109191453A (zh) * 2018-09-14 2019-01-11 北京字节跳动网络技术有限公司 用于生成图像类别检测模型的方法和装置
CN111860573A (zh) * 2020-06-04 2020-10-30 北京迈格威科技有限公司 模型训练方法、图像类别检测方法、装置和电子设备
CN112560999A (zh) * 2021-02-18 2021-03-26 成都睿沿科技有限公司 一种目标检测模型训练方法、装置、电子设备及存储介质
CN112926654A (zh) * 2021-02-25 2021-06-08 平安银行股份有限公司 预标注模型训练、证件预标注方法、装置、设备及介质
US20210192180A1 (en) * 2018-12-05 2021-06-24 Tencent Technology (Shenzhen) Company Limited Method for training object detection model and target object detection method
CN113469176A (zh) * 2021-06-28 2021-10-01 北京有竹居网络技术有限公司 一种目标检测模型训练方法、目标检测方法及其相关设备

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110837856B (zh) * 2019-10-31 2023-05-30 深圳市商汤科技有限公司 神经网络训练及目标检测方法、装置、设备和存储介质
CN112861917B (zh) * 2021-01-14 2021-12-28 西北工业大学 基于图像属性学习的弱监督目标检测方法
CN113033660B (zh) * 2021-03-24 2022-08-02 支付宝(杭州)信息技术有限公司 一种通用小语种检测方法、装置以及设备

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109191453A (zh) * 2018-09-14 2019-01-11 北京字节跳动网络技术有限公司 用于生成图像类别检测模型的方法和装置
US20210192180A1 (en) * 2018-12-05 2021-06-24 Tencent Technology (Shenzhen) Company Limited Method for training object detection model and target object detection method
CN111860573A (zh) * 2020-06-04 2020-10-30 北京迈格威科技有限公司 模型训练方法、图像类别检测方法、装置和电子设备
CN112560999A (zh) * 2021-02-18 2021-03-26 成都睿沿科技有限公司 一种目标检测模型训练方法、装置、电子设备及存储介质
CN112926654A (zh) * 2021-02-25 2021-06-08 平安银行股份有限公司 预标注模型训练、证件预标注方法、装置、设备及介质
CN113469176A (zh) * 2021-06-28 2021-10-01 北京有竹居网络技术有限公司 一种目标检测模型训练方法、目标检测方法及其相关设备

Also Published As

Publication number Publication date
CN113469176A (zh) 2021-10-01
CN113469176B (zh) 2023-06-02

Similar Documents

Publication Publication Date Title
CN109582793B (zh) 模型训练方法、客服系统及数据标注系统、可读存储介质
TWI752455B (zh) 圖像分類模型訓練方法、影像處理方法、資料分類模型訓練方法、資料處理方法、電腦設備、儲存媒介
CN109189767B (zh) 数据处理方法、装置、电子设备及存储介质
CN110046706B (zh) 模型生成方法、装置及服务器
CN111079847B (zh) 一种基于深度学习的遥感影像自动标注方法
WO2023115761A1 (fr) Procédé et appareil de détection d'événement basés sur un graphe de connaissances temporelles
CN109165309B (zh) 负例训练样本采集方法、装置及模型训练方法、装置
WO2022048194A1 (fr) Procédé, appareil et dispositif d'optimisation d'un modèle d'identification d'un thème d'un événement et support de stockage lisible
JP6892606B2 (ja) 位置特定装置、位置特定方法及びコンピュータプログラム
CN112149420A (zh) 实体识别模型训练方法、威胁情报实体提取方法及装置
CN111160959B (zh) 一种用户点击转化预估方法及装置
CN110458022B (zh) 一种基于域适应的可自主学习目标检测方法
CN110909784A (zh) 一种图像识别模型的训练方法、装置及电子设备
WO2023273570A1 (fr) Procédé d'apprentissage de modèle de détection de cible et procédé de détection de cible, et dispositif associé
WO2023273572A1 (fr) Procédé de construction de modèle d'extraction de caractéristiques et procédé de détection de cible, et dispositif associé
WO2020135054A1 (fr) Procédé, dispositif et appareil de recommandation de vidéos et support de stockage
JP2019067299A (ja) ラベル推定装置及びラベル推定プログラム
CN111539456A (zh) 一种目标识别方法及设备
CN108428234B (zh) 基于图像分割结果评价的交互式分割性能优化方法
CN110929013A (zh) 一种基于bottom-up attention和定位信息融合的图片问答实现方法
CN111368792B (zh) 特征点标注模型训练方法、装置、电子设备及存储介质
CN115063858A (zh) 视频人脸表情识别模型训练方法、装置、设备及存储介质
CN114021658A (zh) 一种命名实体识别模型的训练方法、应用方法及其系统
CN112990145B (zh) 一种基于组稀疏年龄估计方法及电子设备
CN112069800A (zh) 基于依存句法的句子时态识别方法、设备和可读存储介质

Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 22831399

Country of ref document: EP

Kind code of ref document: A1