WO2022188886A1 - Image matting model training method and apparatus, and image matting method and apparatus - Google Patents

Image matting model training method and apparatus, and image matting method and apparatus Download PDF

Info

Publication number
WO2022188886A1
WO2022188886A1 · PCT/CN2022/080531 · CN2022080531W
Authority
WO
WIPO (PCT)
Prior art keywords
model
image
transparency
cutout
transparency mask
Prior art date
Application number
PCT/CN2022/080531
Other languages
French (fr)
Chinese (zh)
Inventor
王闯闯
钱贝贝
杨飞宇
胡正
Original Assignee
奥比中光科技集团股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 奥比中光科技集团股份有限公司 filed Critical 奥比中光科技集团股份有限公司
Publication of WO2022188886A1 publication Critical patent/WO2022188886A1/en

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/194Segmentation; Edge detection involving foreground-background segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]

Definitions

  • the present application belongs to the technical field of image processing, and in particular, relates to a method and device for matting model training and image matting.
  • In the field of image processing, foreground matting is a common processing method. Foreground matting refers to extracting the region of interest (the foreground) from an image to obtain a fine transparency mask, and then using the transparency mask to extract the matting object from the image or video, so that the matting object can be applied to photo editing, film re-creation, and the like.
  • a matting model is often used to obtain a transparency mask, and then a matting object is extracted from an image or video according to the transparency mask.
  • the traditional matting model is often large, resulting in a long processing time, and therefore cannot be applied to real-time matting scenarios.
  • the embodiments of the present application provide a method for matting model training, a method for image matting, an apparatus for matting model training, an apparatus for image matting, a first terminal device, a second terminal device and a computer-readable storage medium, which can solve the technical problem that the traditional matting model is often large, resulting in a long processing time and an inability to support real-time matting scenarios.
  • a first aspect of the embodiments of the present application provides a method for training a matting model, the method comprising:
  • each training sample includes an input sample and an output sample;
  • the input sample includes an image to be cutout, a background image, and a depth image of the image to be cutout, and the output sample includes a standard transparency mask corresponding to the image to be cutout;
  • the initial teacher model is trained to obtain a target teacher model and a first transparency mask output by the target teacher model;
  • the transitional student model is trained to obtain a matting model.
  • a second aspect of the embodiments of the present application provides a method for image matting, the method comprising:
  • the image to be cutout and the background image are images collected at the same viewing position; the image to be cutout includes a matting object, and the background image does not include the matting object;
  • the target transparency mask output by the cutout model is obtained;
  • the cutout model is obtained by training the transitional student model, and the transitional student model is obtained by migrating the first weight parameter of the target teacher model to the initial student model;
  • the network structure complexity of the cutout model is lower than the network structure complexity of the target teacher model;
  • according to the target transparency mask, the cutout image corresponding to the cutout object is intercepted from the image to be cutout.
  • a third aspect of the embodiments of the present application provides an apparatus for training a cutout model, the apparatus comprising:
  • the first obtaining unit is used to obtain a training sample set, an initial teacher model and an initial student model; wherein the network structure complexity of the initial student model is lower than the network structure complexity of the initial teacher model; each training sample includes an input sample and an output sample; the input sample includes an image to be cut out, a background image and a depth image of the image to be cut out, and the output sample includes a standard transparency mask corresponding to the image to be cut out;
  • a first training unit configured to train the initial teacher model to obtain a target teacher model and a first transparency mask output by the target teacher model through the training sample set;
  • a migration unit configured to respectively migrate the first weight parameter in the target teacher model to each sub-network in the initial student model to obtain a transitional student model
  • the second training unit is configured to train the transitional student model to obtain the matting model according to the first transparency mask and the training sample set.
  • a fourth aspect of the embodiments of the present application provides an apparatus for image matting, the apparatus comprising:
  • the second acquiring unit is configured to acquire the image to be cut out, the background image and the depth image corresponding to the image to be cut out; wherein the image to be cut out and the background image are images collected at the same viewing position, the image to be cut out includes a cutout object, and the background image does not include the cutout object;
  • a processing unit configured to input the image to be cutout, the background image and the depth image into a pretrained cutout model to obtain a target transparency mask output by the cutout model;
  • wherein the network structure complexity of the cutout model is lower than the network structure complexity of the target teacher model;
  • An intercepting unit configured to intercept a cutout image corresponding to the cutout object in the to-be-cutout image according to the target transparency mask.
  • a fifth aspect of the embodiments of the present application provides a first terminal device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor executes the computer
  • the program implements the steps of the method described in the first aspect above.
  • a sixth aspect of the embodiments of the present application provides a second terminal device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor executes the computer
  • the program implements the steps of the method described in the second aspect above.
  • a seventh aspect of the embodiments of the present application provides a computer-readable storage medium, where a computer program is stored in the computer-readable storage medium, and when the computer program is executed by a processor, the steps of the method described in the first aspect or the second aspect above are implemented.
  • the embodiments of the present application have the following beneficial effects: the present application obtains the target teacher model by training the initial teacher model. Since the network structure of the target teacher model provides high processing accuracy, the first weight parameter in the target teacher model is transferred to the initial student model to obtain the transitional student model, and the transitional student model is then trained according to the first transparency mask output by the target teacher model and the training sample set to obtain the matting model.
  • since the cutout model not only carries the first weight parameter of the target teacher model but also continuously learns from the first transparency mask output by the target teacher model, it attains a processing accuracy close to that of the target teacher model while keeping a relatively simple network structure; therefore, on the premise of ensuring the processing accuracy, the volume of the model and the processing time are effectively reduced.
  • FIG. 1 shows a schematic flowchart of a method for training a matting model provided by the present application
  • Figure 2 shows a schematic diagram of a student model and a teacher model
  • FIG. 3 shows a specific schematic flow chart of step 103 in a method for matting model training provided by the present application
  • FIG. 4 shows a specific schematic flow chart of step 104 in a method for matting model training provided by the present application
  • FIG. 5 shows a specific schematic flowchart of step 1043 in a method for training a matting model provided by the present application
  • FIG. 6 shows a specific schematic flow chart of step A4 in a method for matting model training provided by the present application
  • FIG. 7 shows a schematic flowchart of a method for image matting provided by the present application.
  • FIG. 8 shows a schematic diagram of a device for training a cutout model provided by the present application
  • FIG. 9 shows a schematic diagram of a device for image matting provided by the present application.
  • FIG. 10 is a schematic diagram of a first terminal device according to an embodiment of the present invention.
  • FIG. 11 is a schematic diagram of a second terminal device according to an embodiment of the present invention.
  • FIG. 1 shows a schematic flowchart of a method for training a matting model provided by the present application. As shown in FIG. 1, the training method is applied to the first terminal device, and includes the following steps:
  • Step 101: Obtain a training sample set, an initial teacher model and an initial student model; wherein the network structure complexity of the initial student model is lower than the network structure complexity of the initial teacher model; each training sample includes an input sample and an output sample; the input sample includes an image to be cutout, a background image, and a depth image of the image to be cutout, and the output sample includes a standard transparency mask corresponding to the image to be cutout.
  • the training sample set includes different training samples, and each training sample includes an image to be cut, a background image, a depth image of the image to be cut, and a standard transparency mask corresponding to the image to be cut.
  • the image to be cutout and the background image are images collected at the same framing position; the difference between the two is that the image to be cutout includes a cutout object while the background image does not (that is, the image to be cutout includes both the foreground and the background, while the background image includes only the background).
  • the training sample set is used to train the initial teacher model as well as the initial student model.
  • the initial teacher model and the initial student model are used to obtain transparency masks.
  • the initial teacher model is a model with high network structure complexity, which can extract rich feature information, and then obtain a high-precision transparency mask.
  • the initial teacher model can adopt a network structure such as a Resnet152 network.
  • Resnet152 is a highly complex super network model with 152 convolutional layers.
  • the more convolutional layers a model has, the richer and more complete the extracted features are, and thus a higher-precision transparency mask can be obtained.
  • however, Resnet152 trains slowly, requires high-performance computing, and is only suitable for running on high-performance, high-memory devices; its long running time cannot meet the needs of real-time matting.
  • the present application uses a student model with a simple network structure to learn the output results of the teacher model, so as to replace the high-complexity teacher model with a low-complexity student model under the premise of ensuring the processing effect. That is, the teacher model is only used in the training phase, and the student model is used to process the images in the application phase.
  • this embodiment describes the process steps of the training phase; for the process steps of the application phase, refer to steps 701 to 703 in the embodiment shown in FIG. 7.
  • Step 102 Train the initial teacher model through the training sample set to obtain a target teacher model and a first transparency mask output by the target teacher model.
  • for each training sample in the training sample set, the following process is performed: input the image to be cutout, the background image, and the depth image of the image to be cutout into the initial teacher model to obtain the initial transparency mask output by the initial teacher model; calculate the loss function based on the initial transparency mask and the standard transparency mask; and update the network parameters in the initial teacher model according to the loss function.
  • the target teacher model and the first transparency mask output by the target teacher model are obtained.
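The per-sample training process above can be sketched as follows. This is a minimal illustration only: the real teacher is a deep network such as Resnet152, whereas here a toy one-parameter model stands in for it, and the squared-error loss and learning rate are assumptions, not specified by the application.

```python
import numpy as np

def train_teacher_step(weight, sample, lr=0.1):
    """One illustrative training step of the teacher model.

    The 'teacher' here is a toy model alpha = weight * depth standing in
    for the real network; the step mirrors the described loop: predict an
    initial transparency mask, compute a loss against the standard mask,
    and update the network parameter from the loss gradient.
    """
    image, background, depth, standard_mask = sample
    initial_mask = weight * depth                       # initial transparency mask
    loss = np.mean((initial_mask - standard_mask) ** 2)
    grad = np.mean(2.0 * (initial_mask - standard_mask) * depth)
    return weight - lr * grad, loss                     # updated parameter, loss
```

Iterating this step over every training sample in the training sample set yields the target teacher model; its outputs on the training set are then the first transparency masks.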
  • Step 103 Migrate the first weight parameter in the target teacher model to each sub-network in the initial student model to obtain a transitional student model.
  • FIG. 2 shows a schematic diagram of the student model and the teacher model.
  • the dashed box M represents the teacher model
  • the dashed box N represents the student model
  • the box I represents the image to be matted
  • the box S represents the depth image of the image to be matted
  • the box B represents the background image.
  • the target teacher model and the transparency mask ⁇ output by the target teacher model are obtained.
  • the first weight parameters in the target teacher model are respectively transferred to each sub-network of the initial student model N (ie the Stage 1 module to the Stage n module) to obtain the transitional student model N.
  • the network architectures adopted by different stage modules include, but are not limited to, a combination of one or more network architectures such as RefineNet network architecture or MobileNet network architecture.
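Treating each model as a dictionary of parameter arrays, the migration of step 103 can be sketched as below; the dictionary layout and stage names are illustrative assumptions, not the application's actual data structures.

```python
import numpy as np

def migrate_weights(teacher_weights, student_stages):
    """Copy the teacher's first weight parameters into each sub-network
    (Stage 1 .. Stage n) of the initial student model, yielding the
    transitional student model. Parameters the teacher does not provide
    keep their initial values."""
    transitional = {}
    for stage_name, stage_params in student_stages.items():
        transitional[stage_name] = {
            name: teacher_weights[name].copy() if name in teacher_weights else value
            for name, value in stage_params.items()
        }
    return transitional
```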
  • in step 103, the first weight parameter can be directly transferred to the initial student model.
  • however, the first weight parameter in the target teacher model is floating-point data, and computation on floating-point data is relatively expensive.
  • therefore, step 103 may alternatively include the following steps 1031 to 1032.
  • FIG. 3 shows a specific schematic flowchart of step 103 in a method for training a matting model provided by the present application.
  • Step 1031 Quantize the floating-point first weight parameter into integer data to obtain a second weight parameter.
  • the quantization process for the first weight parameter is as follows: obtain the original model file of the target teacher model (e.g. a TensorFlow, PyTorch or ONNX model), convert the original model file into intermediate files in ".json" and ".data" formats, and quantize the data in the intermediate files to obtain a quantized ".quant" file.
  • the ".quant" file includes the quantized integer weights of each layer of the target teacher model.
  • the quantization method can adopt the existing quantization method or the following optional embodiments:
  • each first weight parameter is sequentially substituted into the first formula group to obtain a second weight parameter corresponding to each first weight parameter.
  • the first formula group is as follows:

    A = (J_max − J_min) / β
    B = round(−J_min / A)
    C = round(N / A) + B

  • wherein A represents the first quantization parameter, i.e. the minimum scale factor mapping between the floating-point first weight parameters and their integer counterparts; J_max represents the maximum weight parameter among all the first weight parameters (the maximum weight parameter is floating-point data); J_min represents the minimum weight parameter among all the first weight parameters (the minimum weight parameter is floating-point data); β represents the maximum value of the preset integer data range (the preset integer data range refers to the upper and lower limits of the integer data, for example 0–255, which can be preset according to different calculation precision requirements); round(·) represents rounding to the nearest integer; B represents the first preset integer value, i.e. the integer value corresponding to a floating-point weight parameter of zero; N represents each first weight parameter; and C represents the second weight parameter corresponding to each first weight parameter.
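Assuming the min-max affine scheme described by these definitions (the exact expressions and rounding conventions are inferred from the definitions, since the formula images are not reproduced in the text), the quantization of steps 1031–1032 and the inverse float/integer mapping mentioned later for OPS-style protocols can be sketched as:

```python
import numpy as np

def quantize_weights(weights, beta=255):
    """Min-max affine quantization of floating-point weights to integers.

    A is the scale factor (first quantization parameter), B the integer
    mapped to a floating-point value of zero, and C the integer (second)
    weight parameters, clipped to the preset range [0, beta].
    """
    j_max, j_min = float(weights.max()), float(weights.min())
    A = (j_max - j_min) / beta
    B = int(round(-j_min / A))
    C = np.clip(np.rint(weights / A) + B, 0, beta).astype(np.int64)
    return C, A, B

def dequantize_weights(C, A, B):
    """Inverse mapping back to floating point, as needed when integer
    model outputs must be consumed as floats."""
    return (C - B) * A
```

Round-tripping through the two functions recovers each weight to within one quantization step A.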
  • Step 1032 Migrate the second weight parameter to each sub-network in the initial student model to obtain a transitional student model.
  • since the migrated weight parameters are integer data, the result output by the matting model is also integer data.
  • for some protocols or hardware (e.g. OPS, the Open Pluggable Specification), a mapping relationship between floating-point data and integer data can be established in advance, so that the integer data can be converted back to floating-point data when the output of the cutout model is obtained.
  • Step 104 Train the transitional student model to obtain the matting model according to the first transparency mask and the training sample set.
  • Method 1: Input the image to be cutout, the background image, and the depth image of the image to be cutout of each training sample in the training sample set into the transitional student model to obtain the transition transparency mask output by the transitional student model.
  • the loss function is calculated from the transition transparency mask and the first transparency mask. Update the network parameters in the transitional student model according to the loss function.
  • step 104 includes the following steps 1041 to 1044.
  • FIG. 4 shows a specific schematic flowchart of step 104 in a method for training a matting model provided by the present application.
  • Step 1041 Quantize the floating-point transparency of each pixel in the first transparency mask into integer data to obtain a second transparency mask.
  • the floating-point transparency of each pixel in the first transparency mask is substituted into the second formula group to obtain the second transparency mask.
  • the second formula group is as follows:

    D = (K_max − K_min) / β
    E = round(−K_min / D)
    F = round(M / D) + E

  • wherein D represents the second quantization parameter, i.e. the minimum scale factor mapping between the floating-point transparency and the integer transparency; K_max represents the maximum transparency in the first transparency mask (the maximum transparency is floating-point data); K_min represents the minimum transparency in the first transparency mask (the minimum transparency is floating-point data); β represents the maximum value of the preset integer data range; round(·) represents rounding to the nearest integer; E represents the second preset integer transparency, i.e. the integer value corresponding to a floating-point transparency of zero; M represents the floating-point transparency of each pixel; and F represents the integer transparency corresponding to the floating-point transparency of each pixel.
  • Step 1042 Input the training samples into the transitional student model to obtain a third transparency mask output by the transitional student model.
  • Step 1043 Adjust a third weight parameter in the transition student model according to the second transparency mask and the third transparency mask.
  • in step 1043, the loss function between the second transparency mask and the third transparency mask can be directly calculated, and the third weight parameter in the transitional student model adjusted according to the loss function.
  • Step 1043 can also be implemented by the following optional embodiments:
  • step 1043 includes the following steps A1 to A4. Please refer to FIG. 5.
  • FIG. 5 shows a specific schematic flowchart of step 1043 in a method for training a matting model provided by the present application.
  • Step A1 Calculate the first loss function through the first formula.
  • the first formula is as follows:

    L_1 = (1 / (H × M)) × Σ_{i=1..H} Σ_{j=1..M} |a_{i,j} − a*_{i,j}|

  • wherein H represents the preset length of the composite image, M represents the preset width of the composite image, a_{i,j} represents the first transparency of the pixel in the i-th row and the j-th column of the second transparency mask, and a*_{i,j} represents the corresponding transparency in the third transparency mask.
  • Step A2: Calculate the second loss function through the second formula. The second formula is as follows:

    L_2 = 1 − ((2μμ* + c_1)(2σσ* + c_2)) / ((μ² + μ*² + c_1)(σ² + σ*² + c_2))

  • wherein μ represents the first average transparency of the pixels in the second transparency mask and μ² represents its square; μ* represents the second average transparency of the pixels in the third transparency mask and μ*² represents its square; σ represents the first transparency variance of the pixels in the second transparency mask and σ² represents its square; σ* represents the second transparency variance of the pixels in the third transparency mask and σ*² represents its square; c_1 represents the first constant; and c_2 represents the second constant.
  • Step A3 Calculate the third loss function through the third formula.
  • the third formula is as follows, defined over the difficult pixels:
  • wherein the third constant is a preset value; δ_{i,j} represents the index of a difficult pixel in the third transparency mask (a difficult pixel refers to a pixel that the transitional student model cannot process well); mn represents the range of m × n pixels adjacent to the difficult pixel; and a_{i,j} represents a pixel adjacent to the difficult pixel.
  • Step A4 Adjust the third weight parameter in the transitional student model according to the first loss function, the second loss function and the third loss function.
  • in step A4, the first loss function, the second loss function and the third loss function can be directly combined into a joint loss function, and the third weight parameter in the transitional student model adjusted accordingly.
  • Step A4 can also be implemented by the following optional embodiments:
  • step A4 includes the following steps A41 to A42.
  • FIG. 6 shows a specific schematic flowchart of step A4 in a method for training a matting model provided by the present application.
  • Step A41 Multiply the first loss function, the second loss function, and the third loss function by their corresponding preset weights to obtain a joint loss function.
  • Step A42 Adjust the third weight parameter in the transitional student model according to the joint loss function.
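The joint loss of steps A41–A42 can be sketched under explicit assumptions: since the exact formulas are not reproduced in the text, the first loss is taken here as a mean absolute pixel difference between the second and third transparency masks, the second as an SSIM-style loss built from the means, variances and constants c1, c2 defined above, and the third as an average error over the hardest ("difficult") pixels; the preset weights w1–w3 are illustrative.

```python
import numpy as np

def joint_loss(mask2, mask3, w1=0.5, w2=0.3, w3=0.2, c1=1e-4, c2=1e-4):
    """Weighted combination of three losses between the second transparency
    mask (quantized teacher output) and the third (student output)."""
    # First loss: mean absolute difference over all pixels.
    l1 = np.abs(mask2 - mask3).mean()
    # Second loss: SSIM-style comparison of means and (co)variances.
    mu, mu_s = mask2.mean(), mask3.mean()
    var, var_s = mask2.var(), mask3.var()
    cov = ((mask2 - mu) * (mask3 - mu_s)).mean()
    ssim = ((2 * mu * mu_s + c1) * (2 * cov + c2)) / (
        (mu ** 2 + mu_s ** 2 + c1) * (var + var_s + c2))
    l2 = 1.0 - ssim
    # Third loss: average error over the hardest 10% of pixels,
    # standing in for the 'difficult pixel' term.
    err = np.abs(mask2 - mask3).ravel()
    k = max(1, err.size // 10)
    l3 = np.sort(err)[-k:].mean()
    # Step A41: multiply each loss by its preset weight and sum.
    return w1 * l1 + w2 * l2 + w3 * l3
```

Identical masks yield a joint loss of zero, and the loss grows as the student's mask diverges from the teacher's.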
  • Step 1044: The steps of inputting the training samples into the transitional student model to obtain the third transparency mask output by the transitional student model, and the subsequent steps, are performed in sequence on each of the training samples in the training sample set to obtain the matting model.
  • the target teacher model is obtained by training the initial teacher model. Since the network structure of the target teacher model provides high processing accuracy, the first weight parameter in the target teacher model is transferred to the initial student model to obtain the transitional student model, and the transitional student model is then trained according to the first transparency mask output by the target teacher model and the training sample set to obtain the matting model. Since the cutout model not only carries the first weight parameter of the target teacher model but also continuously learns from the first transparency mask output by the target teacher model, it attains a processing accuracy close to that of the target teacher model while keeping a relatively simple network structure; therefore, on the premise of ensuring the processing accuracy, the volume of the model and the processing time are effectively reduced.
  • FIG. 7 shows a schematic flowchart of a method for applying the above-mentioned matting model to image matting provided by the present application. As shown in FIG. 7 , the method is applied to the second terminal device, and the method includes the following steps:
  • Step 701: Obtain an image to be cutout, a background image, and a depth image corresponding to the image to be cutout; wherein the image to be cutout and the background image are images collected at the same viewing position, the image to be cutout includes a matting object, and the background image does not include the matting object.
  • Step 702 Input the image to be cutout, the background image and the depth image into a pretrained cutout model to obtain a target transparency mask output by the cutout model;
  • the cutout model is obtained by training the transitional student model, and the transitional student model is obtained by migrating the first weight parameter of the target teacher model to the initial student model; the network structure complexity of the matting model is lower than that of the target teacher model.
  • the present application simultaneously uses the image to be cutout, the background image and the depth image as the input data of the cutout model, so as to accurately extract depth features and thereby obtain a high-accuracy target transparency mask.
  • Step 703: According to the target transparency mask, intercept the cutout image corresponding to the cutout object in the image to be cutout.
  • optionally, the matting image and the image to be composited may be synthesized to obtain a target composite image.
  • the synthesis process is shown in the following formula:

    I = αF + (1 − α)B

  • wherein α represents the target transparency mask, I represents the target composite image, F represents the image to be matted, and B represents the image to be composited.
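Assuming the standard alpha-compositing operation I = α·F + (1 − α)·B, which matches the symbol definitions above, the synthesis step can be sketched as (array shapes are illustrative):

```python
import numpy as np

def composite(alpha, foreground, background):
    """Synthesize the target composite image I from the target transparency
    mask alpha, the image to be matted F, and the image to be composited B:
    I = alpha * F + (1 - alpha) * B."""
    return alpha * foreground + (1.0 - alpha) * background
```

Where the mask is fully opaque the foreground is kept, where it is fully transparent the new background shows through, and intermediate values blend the two.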
  • the transitional student model adopts the first weight parameter of the target teacher model, and the matting model is obtained by training the transitional student model.
  • the network structure complexity of the matting model is lower than that of the target teacher model. Therefore, the matting model improves the image processing efficiency on the premise of ensuring the processing accuracy.
  • FIG. 8 shows a schematic diagram of an apparatus for training a cutout model provided by the present application.
  • a matting model training apparatus, including:
  • the first obtaining unit 81 is used to obtain a training sample set, an initial teacher model and an initial student model; wherein, the network structure complexity of the initial student model is lower than the network structure complexity of the initial teacher model;
  • the first training unit 82 is used to train the initial teacher model through the training sample set to obtain a target teacher model and a first transparency mask output by the target teacher model; each training sample includes an input sample and an output sample;
  • the input sample includes an image to be cutout, a background image and a depth image of the image to be cutout, and the output sample includes a standard transparency mask corresponding to the image to be cutout;
  • the migration unit 83 is used to respectively migrate the first weight parameter in the target teacher model to each sub-network in the initial student model to obtain a transitional student model;
  • the second training unit 84 is configured to train the transitional student model to obtain a matting model according to the first transparency mask and the training sample set.
  • an apparatus for training a cutout model, which obtains a target teacher model by training an initial teacher model. Since the network structure of the target teacher model provides high processing accuracy, the first weight parameter in the target teacher model is transferred to the initial student model to obtain the transitional student model, and the transitional student model is then trained according to the first transparency mask output by the target teacher model and the training sample set to obtain the matting model.
  • since the cutout model not only carries the first weight parameter of the target teacher model but also continuously learns from the first transparency mask output by the target teacher model, it attains a processing accuracy close to that of the target teacher model while keeping a relatively simple network structure; therefore, on the premise of ensuring the processing accuracy, the volume of the model and the processing time are effectively reduced.
  • FIG. 9 shows a schematic diagram of an image matting apparatus provided by the present application.
  • an image matting apparatus, including:
  • the second acquiring unit 91 is configured to acquire an image to be cut out, a background image, and a depth image corresponding to the image to be cut out; wherein, the image to be cut out and the background image are images collected at the same viewing position, The image to be cutout includes a cutout object, and the background image does not include the cutout object;
  • the processing unit 92 is configured to input the image to be cutout, the background image and the depth image into a pretrained cutout model to obtain a target transparency mask output by the cutout model; the cutout model is obtained by training the transitional student model, and the transitional student model is obtained by migrating the first weight parameter of the target teacher model to the initial student model; the network structure complexity of the cutout model is lower than that of the target teacher model;
  • the intercepting unit 93 is configured to intercept the cutout image corresponding to the cutout object in the image to be cutout according to the target transparency mask.
  • the transitional student model adopts the first weight parameter of the target teacher model, and the cutout model is obtained by training the transitional student model.
  • the network structure complexity of the matting model is lower than that of the target teacher model. Therefore, the matting model improves the image processing efficiency on the premise of ensuring the processing accuracy.
  • FIG. 10 is a schematic diagram of a first terminal device according to an embodiment of the present invention.
  • a first terminal device 100 in this embodiment includes: a processor 1001, a memory 1002, and a computer program 1003 stored in the memory 1002 and executable on the processor 1001, such as a program for matting model training.
  • the processor 1001 executes the computer program 1003
  • the steps in each of the foregoing method embodiments for training a cutout model are implemented, for example, steps 101 to 104 shown in FIG. 1 .
  • alternatively, when the processor 1001 executes the computer program 1003, the functions of the units in the foregoing apparatus embodiments, such as the functions of units 81 to 84 shown in FIG. 8 , are implemented.
  • the computer program 1003 may be divided into one or more units, and the one or more units are stored in the memory 1002 and executed by the processor 1001 to complete the present invention.
  • the one or more units may be a series of computer program instruction segments capable of performing specific functions, and the instruction segments are used to describe the execution process of the computer program 1003 in the first terminal device 100 .
  • the computer program 1003 can be divided into units with the following specific functions:
  • a first obtaining unit used for obtaining a training sample set, an initial teacher model and an initial student model; wherein, the network structure complexity of the initial student model is lower than the network structure complexity of the initial teacher model;
  • a first training unit configured to train the initial teacher model to obtain a target teacher model and a first transparency mask output by the target teacher model through the training sample set;
  • a migration unit, configured to respectively migrate the first weight parameters in the target teacher model to each sub-network in the initial student model to obtain a transitional student model;
  • the second training unit is configured to train the transitional student model to obtain the matting model according to the first transparency mask and the training sample set.
  • the first terminal device includes but is not limited to a processor 1001 and a memory 1002 .
  • FIG. 10 is only an example of the first terminal device 100 and does not constitute a limitation on the first terminal device 100, which may include more or fewer components than those shown in the figure, or a combination of certain components.
  • for example, the first terminal device may also include input and output devices, network access devices, buses, and the like.
  • the processor 1001 may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like.
  • a general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
  • the memory 1002 may be an internal storage unit of the first terminal device 100, such as a hard disk or a memory of the first terminal device 100.
  • the memory 1002 may also be an external storage device of the first terminal device 100, such as a plug-in hard disk equipped on the first terminal device 100, a smart memory card (Smart Media Card, SMC), Secure Digital (SD) card, flash memory card (Flash Card), etc.
  • the memory 1002 may also include both an internal storage unit of the first terminal device 100 and an external storage device.
  • the memory 1002 is used for storing the computer program and other programs and data required by the first terminal device.
  • the memory 1002 may also be used to temporarily store data that has been output or will be output.
  • FIG. 11 is a schematic diagram of a second terminal device according to an embodiment of the present invention.
  • the second terminal device 11 in this embodiment includes: a processor 111, a memory 112, and a computer program 113 stored in the memory 112 and executable on the processor 111, such as a program for image matting.
  • when the processor 111 executes the computer program 113, the steps in each of the foregoing embodiments of the image matting method are implemented, for example, steps 701 to 703 shown in FIG. 7 .
  • alternatively, when the processor 111 executes the computer program 113, the functions of the units in the foregoing apparatus embodiments, such as the functions of units 91 to 93 shown in FIG. 9 , are implemented.
  • the computer program 113 may be divided into one or more units, and the one or more units are stored in the memory 112 and executed by the processor 111 to complete the present invention.
  • the one or more units may be a series of computer program instruction segments capable of performing specific functions, and the instruction segments are used to describe the execution process of the computer program 113 in the second terminal device 11 .
  • the computer program 113 can be divided into units with specific functions as follows:
  • the second acquiring unit is configured to acquire the image to be cut out, the background image and the depth image corresponding to the image to be cut out; wherein the image to be cut out and the background image are images collected at the same viewing position, the image to be cut out includes a cutout object, and the background image does not include the cutout object;
  • a processing unit, configured to input the image to be cut out, the background image and the depth image into a pre-trained cutout model to obtain a target transparency mask output by the cutout model; the cutout model is obtained by training a transitional student model, and the transitional student model is obtained by migrating the first weight parameters of a target teacher model to an initial student model; the network structure complexity of the cutout model is lower than that of the target teacher model;
  • An intercepting unit configured to intercept a cutout image corresponding to the cutout object in the to-be-cutout image according to the target transparency mask.
  • the second terminal device includes but is not limited to the processor 111 and the memory 112 .
  • FIG. 11 is only an example of the second terminal device 11 and does not constitute a limitation on the second terminal device 11, which may include more or fewer components than those shown in the figure, or a combination of certain components.
  • for example, the second terminal device may also include input and output devices, network access devices, buses, and the like.
  • the processor 111 may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like.
  • a general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
  • the memory 112 may be an internal storage unit of the second terminal device 11 , such as a hard disk or a memory of the second terminal device 11 .
  • the memory 112 may also be an external storage device of the second terminal device 11, such as a plug-in hard disk equipped on the second terminal device 11, a smart memory card (Smart Media Card, SMC), Secure Digital (SD) card, flash memory card (Flash Card), etc.
  • the memory 112 may also include both an internal storage unit of the second terminal device 11 and an external storage device.
  • the memory 112 is used for storing the computer program and other programs and data required by the second terminal device.
  • the memory 112 may also be used to temporarily store data that has been output or will be output.
  • Embodiments of the present application further provide a computer-readable storage medium, where a computer program is stored in the computer-readable storage medium, and when the computer program is executed by a processor, the steps in the foregoing method embodiments can be implemented.
  • the embodiments of the present application further provide a computer program product; when the computer program product runs on a mobile terminal, the mobile terminal is caused to implement the steps in the foregoing method embodiments.
  • the integrated unit, if implemented in the form of a software functional unit and sold or used as an independent product, may be stored in a computer-readable storage medium.
  • all or part of the processes in the methods of the above embodiments of the present application may be implemented by a computer program instructing the relevant hardware, and the computer program may be stored in a computer-readable storage medium.
  • the computer program includes computer program code, and the computer program code may be in the form of source code, object code, an executable file, some intermediate form, or the like.
  • the computer-readable medium may include at least: any entity or device capable of carrying the computer program code to the photographing device/living detection device, a recording medium, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunication signal, and a software distribution medium.
  • note that computer-readable media may not include electrical carrier signals and telecommunication signals.
  • the disclosed apparatus/network device and method may be implemented in other manners.
  • the apparatus/network device embodiments described above are only illustrative.
  • the division of the modules or units is only a logical function division; in actual implementation there may be other division manners, for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not implemented.
  • the shown or discussed mutual coupling, direct coupling or communication connection may be implemented through some interfaces, or through indirect coupling or communication connections of devices or units, and may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units.
  • the term "if" may be contextually interpreted as "when", "once", "in response to determining" or "in response to detecting".
  • similarly, the phrases "if it is determined" or "if the [described condition or event] is detected" may be interpreted, depending on the context, to mean "once it is determined", "in response to determining", "once the [described condition or event] is detected" or "in response to detecting the [described condition or event]".
  • references in this specification to "one embodiment” or “some embodiments” and the like mean that a particular feature, structure or characteristic described in connection with the embodiment is included in one or more embodiments of the present application.
  • appearances of the phrases "in one embodiment," "in some embodiments," "in other embodiments," etc. in various places in this specification do not necessarily all refer to the same embodiment, but mean "one or more but not all embodiments" unless specifically emphasized otherwise.
  • the terms "comprising", "including", "having" and their variants mean "including but not limited to" unless specifically emphasized otherwise.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Processing (AREA)

Abstract

The present invention is applicable to the technical field of image processing, and provides an image matting model training method and apparatus, and an image matting method and apparatus. The image matting model training method comprises: obtaining a training sample set, an initial teacher model, and an initial student model; training the initial teacher model by means of the training sample set to obtain a target teacher model and a first transparency mask output by the target teacher model; respectively migrating first weight parameters in the target teacher model into the sub-networks of the initial student model to obtain a transitional student model; and training the transitional student model according to the first transparency mask and the training sample set to obtain an image matting model. Because the image matting model has the first weight parameters of the target teacher model and continuously learns the first transparency mask output by the target teacher model, the model volume and processing duration are effectively reduced while the processing accuracy is ensured.

Description

A Method and Apparatus for Matting Model Training and Image Matting
This application claims priority to Chinese patent application No. 202110264893.0, filed with the China Patent Office on March 11, 2021 and entitled "A Method and Apparatus for Matting Model Training and Image Matting", the entire contents of which are incorporated herein by reference.
Technical Field
The present application belongs to the technical field of image processing, and in particular relates to a method and apparatus for matting model training and image matting.
Background
In the field of image processing, foreground matting is a common processing technique. Foreground matting refers to extracting a region of interest (the foreground) from an image to obtain a fine transparency mask, and using the transparency mask to extract a matting object from an image or video, so that the matting object can be applied to photo editing and film re-creation.
Traditional matting techniques often use a matting model to obtain a transparency mask, and then extract the matting object from an image or video according to the transparency mask. However, in order to further improve processing accuracy, traditional matting models are often large, resulting in long processing times, and therefore cannot be applied to real-time matting scenarios.
Summary of the Invention
In view of this, the embodiments of the present application provide a method for matting model training, a method for image matting, an apparatus for matting model training, an apparatus for image matting, a first terminal device, a second terminal device, and a computer-readable storage medium, which can solve the technical problem that traditional matting models are often large, resulting in long processing times, and cannot be applied to real-time matting scenarios.
A first aspect of the embodiments of the present application provides a method for training a matting model, the method comprising:
obtaining a training sample set, an initial teacher model and an initial student model, wherein the network structure complexity of the initial student model is lower than that of the initial teacher model; each training sample includes an input sample and an output sample; the input sample includes an image to be matted, a background image and a depth image of the image to be matted, and the output sample includes a standard transparency mask corresponding to the image to be matted;
training the initial teacher model through the training sample set to obtain a target teacher model and a first transparency mask output by the target teacher model;
respectively migrating the first weight parameters in the target teacher model to each sub-network in the initial student model to obtain a transitional student model; and
training the transitional student model according to the first transparency mask and the training sample set to obtain a matting model.
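The last step above trains the transitional student model against both the standard transparency mask and the teacher's first transparency mask. A minimal plain-Python sketch of such a combined training signal follows; the L1 form and the weight `w` are illustrative assumptions, not values specified in the method itself.

```python
def l1(pred, target):
    """Mean absolute difference between two transparency masks,
    given as flat lists of per-pixel alpha values."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)

def student_loss(student_alpha, standard_alpha, teacher_alpha, w=0.5):
    """Combined loss for the transitional student model: a supervised
    term against the standard (ground-truth) transparency mask plus a
    distillation term against the first transparency mask output by
    the target teacher model. `w` is an assumed hyperparameter."""
    return ((1 - w) * l1(student_alpha, standard_alpha)
            + w * l1(student_alpha, teacher_alpha))
```

Blending the two terms is what lets the small student track the teacher's output while still being anchored to the ground-truth masks in the training sample set.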
A second aspect of the embodiments of the present application provides a method for image matting, the method comprising:
obtaining an image to be matted, a background image and a depth image corresponding to the image to be matted, wherein the image to be matted and the background image are images collected at the same viewing position, the image to be matted includes a matting object, and the background image does not include the matting object;
inputting the image to be matted, the background image and the depth image into a pre-trained matting model to obtain a target transparency mask output by the matting model, wherein the matting model is obtained by training a transitional student model, the transitional student model is obtained by migrating the first weight parameters of a target teacher model to an initial student model, and the network structure complexity of the matting model is lower than that of the target teacher model; and
intercepting, according to the target transparency mask, the matted image corresponding to the matting object in the image to be matted.
A third aspect of the embodiments of the present application provides an apparatus for matting model training, the apparatus comprising:
a first obtaining unit, configured to obtain a training sample set, an initial teacher model and an initial student model, wherein the network structure complexity of the initial student model is lower than that of the initial teacher model; each training sample includes an input sample and an output sample; the input sample includes an image to be matted, a background image and a depth image of the image to be matted, and the output sample includes a standard transparency mask corresponding to the image to be matted;
a first training unit, configured to train the initial teacher model through the training sample set to obtain a target teacher model and a first transparency mask output by the target teacher model;
a migration unit, configured to respectively migrate the first weight parameters in the target teacher model to each sub-network in the initial student model to obtain a transitional student model; and
a second training unit, configured to train the transitional student model according to the first transparency mask and the training sample set to obtain the matting model.
A fourth aspect of the embodiments of the present application provides an apparatus for image matting, the apparatus comprising:
a second obtaining unit, configured to obtain an image to be matted, a background image and a depth image corresponding to the image to be matted, wherein the image to be matted and the background image are images collected at the same viewing position, the image to be matted includes a matting object, and the background image does not include the matting object;
a processing unit, configured to input the image to be matted, the background image and the depth image into a pre-trained matting model to obtain a target transparency mask output by the matting model, wherein the matting model is obtained by training a transitional student model, the transitional student model is obtained by migrating the first weight parameters of a target teacher model to an initial student model, and the network structure complexity of the matting model is lower than that of the target teacher model; and
an intercepting unit, configured to intercept, according to the target transparency mask, the matted image corresponding to the matting object in the image to be matted.
A fifth aspect of the embodiments of the present application provides a first terminal device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the method described in the first aspect above.
A sixth aspect of the embodiments of the present application provides a second terminal device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the method described in the second aspect above.
A seventh aspect of the embodiments of the present application provides a computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the steps of the method described in the first aspect or the second aspect above.
Compared with the prior art, the embodiments of the present application have the following beneficial effects: the present application obtains a target teacher model by training an initial teacher model. Since the network structure of the target teacher model has high processing accuracy, the first weight parameters in the target teacher model are migrated into an initial student model to obtain a transitional student model. The transitional student model is then trained according to the first transparency mask output by the target teacher model and the training sample set to obtain a matting model. Since the matting model not only possesses the first weight parameters of the target teacher model but also continuously learns the first transparency mask output by the target teacher model, the matting model achieves processing accuracy close to that of the target teacher model; and since its network structure is relatively simple, the model volume and processing duration are reduced while the processing accuracy is ensured.
Description of Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the following briefly introduces the drawings required in the description of the embodiments or the related art. Obviously, the drawings in the following description are only some embodiments of the present application; for those of ordinary skill in the art, other drawings can be obtained from these drawings without creative effort.
FIG. 1 shows a schematic flowchart of a method for matting model training provided by the present application;
FIG. 2 shows a schematic diagram of a student model and a teacher model;
FIG. 3 shows a specific schematic flowchart of step 103 in a method for matting model training provided by the present application;
FIG. 4 shows a specific schematic flowchart of step 103 in a method for matting model training provided by the present application;
FIG. 5 shows a specific schematic flowchart of step 1043 in a method for matting model training provided by the present application;
FIG. 6 shows a specific schematic flowchart of step A4 in a method for matting model training provided by the present application;
FIG. 7 shows a schematic flowchart of a method for image matting provided by the present application;
FIG. 8 shows a schematic diagram of an apparatus for matting model training provided by the present application;
FIG. 9 shows a schematic diagram of an apparatus for image matting provided by the present application;
FIG. 10 is a schematic diagram of a first terminal device provided by an embodiment of the present invention;
FIG. 11 is a schematic diagram of a second terminal device provided by an embodiment of the present invention.
Detailed Description
In the following description, for the purpose of illustration rather than limitation, specific details such as particular system structures and technologies are set forth in order to provide a thorough understanding of the embodiments of the present application. However, it will be apparent to those skilled in the art that the present application may also be practiced in other embodiments without these specific details. In other instances, detailed descriptions of well-known systems, apparatuses, circuits and methods are omitted so that unnecessary details do not obscure the description of the present application.
Please refer to FIG. 1, which shows a schematic flowchart of a method for matting model training provided by the present application. As shown in FIG. 1, the method is applied to a first terminal device and includes the following steps:
Step 101: obtain a training sample set, an initial teacher model and an initial student model, wherein the network structure complexity of the initial student model is lower than that of the initial teacher model; each training sample includes an input sample and an output sample; the input sample includes an image to be matted, a background image and a depth image of the image to be matted, and the output sample includes a standard transparency mask corresponding to the image to be matted.
The training sample set includes different training samples, and each training sample includes an image to be matted, a background image, a depth image of the image to be matted, and a standard transparency mask corresponding to the image to be matted. The image to be matted and the background image are images collected at the same viewing position; the difference between the two is that the image to be matted includes the matting object while the background image does not (that is, the image to be matted includes the complete foreground and background, while the background image includes only the background).
The training sample set is used to train the initial teacher model and the initial student model, both of which are used to obtain transparency masks. The initial teacher model is a model with high network structure complexity that can extract rich feature information and thus produce a high-precision transparency mask. Preferably, the initial teacher model may adopt a network structure such as the Resnet152 network.
Illustratively, taking the Resnet152 network as an example, Resnet152 is a highly complex, very large network model with 152 convolutional layers. The more convolutional layers a model has, the richer and more complete the extracted features, so a high-precision transparency mask can be obtained. However, Resnet152 trains slowly, requires high-performance computing, is only suitable for running on devices with high performance and large memory, and its long processing time cannot meet the demands of real-time matting. Based on the above considerations, the present application uses a student model with a simple network structure to learn the output of the teacher model, so that a low-complexity student model can replace the high-complexity teacher model while the processing effect is ensured. That is, the teacher model is used only in the training phase, and the student model is used to process images in the application phase.
It is worth noting that this embodiment describes the process steps of the training phase; for the process steps of the application phase, please refer to steps 701 to 703 in the embodiment shown in FIG. 7.
Step 102: train the initial teacher model through the training sample set to obtain a target teacher model and a first transparency mask output by the target teacher model.
The following process is performed for each training sample in the training sample set: the image to be matted, the background image and the depth image of the image to be matted are input into the initial teacher model to obtain an initial transparency mask output by the initial teacher model; a loss function is calculated according to the initial transparency mask and the standard transparency mask; and the network parameters in the initial teacher model are updated according to the loss function.
When all training samples in the training sample set have been trained, or the preset number of training iterations is reached, or the model convergence condition is met, the target teacher model and the first transparency mask output by the target teacher model are obtained.
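The loop just described (forward pass, loss against the standard transparency mask, parameter update, stop on completion or convergence) can be sketched with a deliberately tiny one-parameter stand-in model. The squared-error loss, learning rate, and single-bias "model" are illustrative assumptions, not the actual teacher network.

```python
def train_teacher(samples, lr=0.1, epochs=50, tol=1e-4):
    """Toy version of the teacher training procedure: for each sample,
    predict a transparency mask, compute a loss against the standard
    mask, and update the model parameter; stop when all epochs finish
    or the epoch loss falls below the convergence tolerance.
    The 'model' here is a single bias b predicting a constant mask."""
    b = 0.0
    for _ in range(epochs):
        epoch_loss = 0.0
        for _inputs, standard_mask in samples:
            pred = [b] * len(standard_mask)          # forward pass
            epoch_loss += sum((p - t) ** 2
                              for p, t in zip(pred, standard_mask))
            grad = sum(2 * (p - t)                   # d(loss)/db
                       for p, t in zip(pred, standard_mask))
            b -= lr * grad                           # parameter update
        if epoch_loss < tol:                         # convergence check
            break
    return b
```

With a single sample whose standard mask is all 0.5, the parameter `b` converges to roughly 0.5, mirroring how repeated loss-driven updates pull the teacher's output toward the standard transparency masks.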
Step 103: migrate the first weight parameters in the target teacher model to the respective sub-networks of the initial student model to obtain a transitional student model.
Referring to FIG. 2, FIG. 2 shows a schematic diagram of the student model and the teacher model. The dashed box M denotes the teacher model, the dashed box N denotes the student model, the box I denotes the image to be matted, the box S denotes the depth image of the image to be matted, and the box B denotes the background image. After the initial teacher model is trained on the image to be matted, the depth image, and the background image, the target teacher model and the transparency mask α output by the target teacher model are obtained. The first weight parameters in the target teacher model are migrated to the respective sub-networks of the initial student model N (i.e., the Stage 1 module through the Stage n module) to obtain the transitional student model N. The network architectures adopted by the different stage modules include, but are not limited to, one of, or a combination of, architectures such as the RefineNet or MobileNet network architectures.
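The migration of the first weight parameters into the Stage 1 through Stage n modules can be sketched as copying each teacher parameter into the student sub-network that shares its name. A minimal sketch follows; the dictionary-based state layout and the parameter names are illustrative assumptions, not part of the disclosure.

```python
# Sketch of migrating teacher weights into the student's stage modules.
# The state-dict layout and parameter names are illustrative assumptions.

def migrate_weights(teacher_state, student_state):
    """Copy every teacher parameter whose name exists in the student.

    teacher_state / student_state: dict mapping parameter name -> value.
    Returns a new student state holding the migrated parameters.
    """
    migrated = dict(student_state)
    for name, value in teacher_state.items():
        if name in migrated:          # only stages shared with the student
            migrated[name] = value    # first weight parameter transferred
    return migrated

teacher = {"stage1.conv.weight": 0.5, "stage2.conv.weight": -1.2,
           "extra_head.weight": 3.0}                  # teacher-only layer
student = {"stage1.conv.weight": 0.0, "stage2.conv.weight": 0.0}

transitional = migrate_weights(teacher, student)
```

A teacher-only layer (here "extra_head.weight") has no counterpart in the student and is simply skipped.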
When step 103 is performed, the first weight parameters may be migrated directly into the initial student model. However, the first weight parameters in the target teacher model are floating-point data, and computation on floating-point data is relatively expensive. To reduce the amount of computation, the following steps may also be performed in step 103:
As an optional embodiment of the present application, step 103 includes the following steps 1031 to 1032. Referring to FIG. 3, FIG. 3 shows a schematic flowchart of step 103 in a matting model training method provided by the present application.
Step 1031: quantize the floating-point first weight parameters into integer data to obtain second weight parameters.
The quantization process for the first weight parameters is as follows: obtain the original model file of the target teacher model (for example, a TensorFlow, PyTorch, or ONNX model), and convert the original model file into intermediate files in the ".json" and ".data" formats. Quantize the data in the intermediate files to obtain a quantized ".quant" file, which contains the quantized integer weights of every layer of the target teacher model.
The quantization may use an existing quantization method or the following optional embodiment:
As an optional embodiment of the present application, each first weight parameter is substituted in turn into the first formula group to obtain the second weight parameter corresponding to that first weight parameter.
The first formula group is as follows:
A = (J_max - J_min) / α
B = round(-J_min / A)
C = round(N / A) + B
where A denotes the first quantization parameter, i.e., the minimal scale factor of the scaling between the floating-point first weight parameters and the integer first weight parameters; J_max denotes the largest of all first weight parameters (a floating-point value); J_min denotes the smallest of all first weight parameters (a floating-point value); α denotes the maximum value of the preset integer data range (the preset integer data range refers to the upper and lower bounds of the integer data, for example 0-255, and can be preset according to the required computation precision); round(·) denotes rounding to the nearest integer; B denotes the first preset integer value, i.e., the integer value corresponding to a floating-point first weight parameter of zero; N denotes each first weight parameter; and C denotes the second weight parameter corresponding to that first weight parameter.
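The variables A, B, and C described above admit a standard min-max affine quantization reading, sketched below on a toy set of weights; the weight values and the 0-255 integer range are illustrative assumptions.

```python
# Min-max affine quantization sketch for the first formula group.
# The weight values and the 0-255 integer range are illustrative.

weights = [-0.5, 0.0, 0.5, 1.5]   # first weight parameters N (floating point)
alpha = 255                        # maximum of the preset integer data range

j_max, j_min = max(weights), min(weights)
A = (j_max - j_min) / alpha        # first quantization parameter (scale)
B = round(-j_min / A)              # integer that a floating-point 0 maps to

quantized = [round(n / A) + B for n in weights]   # second weight parameters C
```

Here A = 2/255 and B = 64, and the weights map to [0, 64, 128, 255], spanning the whole preset range.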
Step 1032: migrate the second weight parameters to the respective sub-networks of the initial student model to obtain the transitional student model.
As an embodiment of the present application, since the second weight parameters are integer data, the results output by the matting model are likewise integer data. However, some protocols or hardware (for example, OPS, the Open Pluggable Specification) do not support integer data, so a mapping between floating-point data and integer data can be established in advance, so that the integer data are converted back to floating-point data when the output of the matting model is obtained.
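The mapping back from integer outputs to floating point can be sketched as the inverse of the affine quantization; the scale A and zero point B below are illustrative values rather than values taken from the disclosure.

```python
# Dequantization sketch: invert the affine quantization to recover
# floating-point values. A and B are illustrative values.

A = 2 / 255          # scale factor used during quantization
B = 64               # integer that a floating-point 0 was mapped to

def dequantize(c):
    """Recover an approximate floating-point value from integer c."""
    return (c - B) * A

restored = [dequantize(c) for c in [0, 64, 255]]
```

The integer B maps back to exactly 0.0, while other values are recovered up to the quantization step A.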
Step 104: train the transitional student model according to the first transparency mask and the training sample set to obtain the matting model.
The transitional student model can be trained in the following two ways:
Way 1: input the image to be matted, the background image, and the depth image of the image to be matted of each training sample in the training sample set into the transitional student model to obtain a transitional transparency mask output by the transitional student model; compute a loss function from the transitional transparency mask and the first transparency mask; and update the network parameters of the transitional student model according to the loss function. When all training samples in the training sample set have been used, or a preset number of training iterations is reached, or the model convergence condition is met, the matting model is obtained.
Way 2: as an optional embodiment of the present application, step 104 includes the following steps 1041 to 1044. Referring to FIG. 4, FIG. 4 shows a schematic flowchart of step 104 in a matting model training method provided by the present application.
Step 1041: quantize the floating-point transparency of each pixel in the first transparency mask into integer data to obtain a second transparency mask.
The quantization may use an existing quantization method or the following optional embodiment:
As an optional embodiment of the present application, the floating-point transparency of each pixel in the first transparency mask is substituted into the second formula group to obtain the second transparency mask.
The second formula group is as follows:
D = (K_max - K_min) / α
E = round(-K_min / D)
F = round(M / D) + E
where D denotes the second quantization parameter, i.e., the minimal scale factor of the scaling between the floating-point transparency and the integer transparency; K_max denotes the maximum transparency in the first transparency mask (a floating-point value); K_min denotes the minimum transparency in the first transparency mask (a floating-point value); α denotes the maximum value of the preset integer data range; round(·) denotes rounding to the nearest integer; E denotes the second preset integer transparency, i.e., the integer value corresponding to a floating-point transparency of zero; M denotes the floating-point transparency of each pixel; and F denotes the integer transparency corresponding to the floating-point transparency of that pixel.
Step 1042: input the training samples into the transitional student model to obtain a third transparency mask output by the transitional student model.
Step 1043: adjust the third weight parameters in the transitional student model according to the second transparency mask and the third transparency mask.
In step 1043, a loss function between the second transparency mask and the third transparency mask may be computed directly, and the third weight parameters in the transitional student model adjusted according to that loss function.
Step 1043 may also be implemented by the following optional embodiment:
As an optional embodiment of the present application, step 1043 includes the following steps A1 to A4. Referring to FIG. 5, FIG. 5 shows a schematic flowchart of step 1043 in a matting model training method provided by the present application.
Step A1: compute the first loss function using the first formula.
The first formula is as follows:
L_1 = (1 / (H · M)) · Σ_{i=1..H} Σ_{j=1..M} | a_{i,j} - a*_{i,j} |
where H denotes the preset length of the composite image, M denotes the preset width of the composite image, a_{i,j} denotes the first transparency of the pixel in row i, column j of the second transparency mask, and a*_{i,j} denotes the second transparency of the pixel in row i, column j of the third transparency mask.
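A minimal sketch of the first loss, assuming it is the mean absolute difference over all H × M pixels, follows; the 2 × 2 masks are illustrative values.

```python
# Sketch of the first loss: mean absolute difference between the
# second and third transparency masks. The 2x2 masks are illustrative.

mask2 = [[0.0, 0.5],
         [1.0, 0.25]]   # a_{i,j}: second transparency mask
mask3 = [[0.1, 0.5],
         [0.8, 0.25]]   # a*_{i,j}: third transparency mask

H, M = len(mask2), len(mask2[0])
loss1 = sum(abs(mask2[i][j] - mask3[i][j])
            for i in range(H) for j in range(M)) / (H * M)
```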
Step A2: compute the second loss function using the second formula.
The second formula is as follows:
L_2 = 1 - [ (2·μ·μ* + c_1) · (2·σ·σ* + c_2) ] / [ (μ² + μ*² + c_1) · (σ² + σ*² + c_2) ]
where μ denotes the first transparency mean of the pixels in the second transparency mask, μ² denotes the square of the first transparency mean, μ* denotes the second transparency mean of the pixels in the third transparency mask, μ*² denotes the square of the second transparency mean, σ denotes the first transparency variance of the pixels in the second transparency mask, σ² denotes the square of the first transparency variance, σ* denotes the second transparency variance of the pixels in the third transparency mask, σ*² denotes the square of the second transparency variance, c_1 denotes a first constant, and c_2 denotes a second constant.
Step A3: compute the third loss function using the third formula.
The third formula is as follows:
Figure PCTCN2022080531-appb-000014
where γ denotes a third constant, and θ_{i,j} denotes the index of a difficult pixel in the third transparency mask, a difficult pixel being a pixel that the transitional student model cannot handle. The index is given by:
Figure PCTCN2022080531-appb-000015
where m×n denotes the range of m×n pixels adjacent to the difficult pixel, and A_{i,j} denotes the pixels adjacent to the unprocessable pixel.
Step A4: adjust the third weight parameters in the transitional student model according to the first loss function, the second loss function, and the third loss function.
In step A4, the first, second, and third loss functions may be combined directly into a joint loss function and used to adjust the third weight parameters in the transitional student model.
Step A4 may also be implemented by the following optional embodiment:
As an optional embodiment of the present application, step A4 includes the following steps A41 to A42. Referring to FIG. 6, FIG. 6 shows a schematic flowchart of step A4 in a matting model training method provided by the present application.
Step A41: multiply the first loss function, the second loss function, and the third loss function by their respective preset weights to obtain a joint loss function.
Step A42: adjust the third weight parameters in the transitional student model according to the joint loss function.
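Step A41 amounts to a weighted sum of the three losses; the loss values and preset weights in the sketch below are illustrative assumptions.

```python
# Weighted joint loss sketch for step A41.
# The loss values and preset weights are illustrative assumptions.

losses = {"l1": 0.075, "ssim": 0.02, "hard": 0.4}    # L_1, L_2, L_3
weights = {"l1": 1.0, "ssim": 0.5, "hard": 0.25}     # preset weights

joint_loss = sum(weights[k] * losses[k] for k in losses)
```

The joint loss then drives the update of the third weight parameters in the transitional student model.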
Step 1044: perform, for each training sample in the training sample set in turn, the step of inputting the training sample into the transitional student model to obtain the third transparency mask output by the transitional student model, together with the subsequent steps, to obtain the matting model.
In this embodiment, the target teacher model is obtained by training the initial teacher model. Since the network structure of the target teacher model offers high processing precision, the first weight parameters in the target teacher model are migrated into the initial student model to obtain the transitional student model, and the transitional student model is trained according to the first transparency mask output by the target teacher model and the training sample set to obtain the matting model. Because the matting model not only carries the first weight parameters of the target teacher model but also continually learns from the first transparency mask output by the target teacher model, it achieves processing precision close to that of the target teacher model while having a much simpler network structure; on the premise of guaranteed precision, the model size and processing time are therefore substantially reduced.
Referring to FIG. 7, FIG. 7 shows a schematic flowchart of a method, provided by the present application, for applying the above matting model to image matting. As shown in FIG. 7, the method is applied to a second terminal device and includes the following steps:
Step 701: obtain an image to be matted, a background image, and a depth image corresponding to the image to be matted; the image to be matted and the background image are captured at the same framing position, the image to be matted contains the matting object, and the background image does not contain the matting object.
Step 702: input the image to be matted, the background image, and the depth image into a pre-trained matting model to obtain a target transparency mask output by the matting model; the matting model is obtained by training a transitional student model, which in turn is obtained by migrating the first weight parameters of a target teacher model to an initial student model; the network structure complexity of the matting model is lower than that of the target teacher model.
Because depth data may be unavailable for some pixels during processing, and in order to improve the recall of the foreground, the present application feeds the image to be matted, the background image, and the depth image into the matting model together, so that depth features are extracted accurately and a high-precision target transparency mask is obtained.
Step 703: according to the target transparency mask, extract from the image to be matted the matting image corresponding to the matting object.
After the matting image is obtained, it is composited with the image to be composited to obtain a target composite image. The compositing process is given by the following formula:
I = αF + (1 - α)B
where α denotes the target transparency mask, I denotes the target composite image, F denotes the image to be matted, and B denotes the image to be composited.
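The compositing formula can be sketched per pixel as follows; the single-channel 2 × 2 images and the mask values are illustrative.

```python
# Per-pixel alpha compositing sketch: I = alpha * F + (1 - alpha) * B.
# The single-channel 2x2 images and mask values are illustrative.

alpha = [[1.0, 0.5],
         [0.0, 0.25]]    # target transparency mask
F = [[200, 200],
     [200, 200]]         # image to be matted (foreground intensities)
B = [[0, 0],
     [0, 100]]           # image to be composited (new background)

I = [[alpha[i][j] * F[i][j] + (1 - alpha[i][j]) * B[i][j]
      for j in range(2)] for i in range(2)]
```

Pixels with alpha = 1 keep the foreground, pixels with alpha = 0 take the new background, and fractional alpha blends the two.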
In this embodiment, the transitional student model adopts the first weight parameters of the target teacher model, and these first weight parameters are further trained to obtain the matting model. Since the network structure complexity of the matting model is lower than that of the target teacher model, the matting model improves image processing efficiency while guaranteeing processing precision.
As shown in FIG. 8, the present application provides an apparatus 8 for matting model training. Referring to FIG. 8, FIG. 8 shows a schematic diagram of the matting model training apparatus provided by the present application. As shown in FIG. 8, the apparatus includes:
a first obtaining unit 81, configured to obtain a training sample set, an initial teacher model, and an initial student model, where the network structure complexity of the initial student model is lower than that of the initial teacher model;
a first training unit 82, configured to train the initial teacher model on the training sample set to obtain a target teacher model and a first transparency mask output by the target teacher model, where each training sample includes an input sample and an output sample, the input sample includes an image to be matted, a background image, and a depth image of the image to be matted, and the output sample includes a standard transparency mask corresponding to the image to be matted;
a migration unit 83, configured to migrate the first weight parameters in the target teacher model to the respective sub-networks of the initial student model to obtain a transitional student model; and
a second training unit 84, configured to train the transitional student model according to the first transparency mask and the training sample set to obtain a matting model.
In the matting model training apparatus provided by the present application, the target teacher model is obtained by training the initial teacher model. Since the network structure of the target teacher model offers high processing precision, the first weight parameters in the target teacher model are migrated into the initial student model to obtain a transitional student model, which is trained according to the first transparency mask output by the target teacher model and the training sample set to obtain the matting model. Because the matting model not only carries the first weight parameters of the target teacher model but also continually learns from the first transparency mask output by the target teacher model, it achieves processing precision close to that of the target teacher model with a much simpler network structure; on the premise of guaranteed precision, the model size and processing time are therefore substantially reduced.
As shown in FIG. 9, the present application provides an image matting apparatus 9. Referring to FIG. 9, FIG. 9 shows a schematic diagram of the image matting apparatus provided by the present application. As shown in FIG. 9, the apparatus includes:
a second obtaining unit 91, configured to obtain an image to be matted, a background image, and a depth image corresponding to the image to be matted, where the image to be matted and the background image are captured at the same framing position, the image to be matted contains a matting object, and the background image does not contain the matting object;
a processing unit 92, configured to input the image to be matted, the background image, and the depth image into a pre-trained matting model to obtain a target transparency mask output by the matting model, where the matting model is obtained by training a transitional student model, the transitional student model is obtained by migrating the first weight parameters of a target teacher model to an initial student model, and the network structure complexity of the matting model is lower than that of the target teacher model; and
an extraction unit 93, configured to extract, according to the target transparency mask, the matting image corresponding to the matting object from the image to be matted.
In the image matting apparatus provided by the present application, the transitional student model adopts the first weight parameters of the target teacher model, and these first weight parameters are further trained to obtain the matting model. Since the network structure complexity of the matting model is lower than that of the target teacher model, the matting model improves image processing efficiency while guaranteeing processing precision.
FIG. 10 is a schematic diagram of a first terminal device according to an embodiment of the present invention. As shown in FIG. 10, the first terminal device 100 of this embodiment includes a processor 1001, a memory 1002, and a computer program 1003 stored in the memory 1002 and executable on the processor 1001, for example a matting model training program. When executing the computer program 1003, the processor 1001 implements the steps of the matting model training method embodiments described above, for example steps 101 to 104 shown in FIG. 1; alternatively, when executing the computer program 1003, the processor 1001 implements the functions of the units in the apparatus embodiments described above, for example the functions of units 81 to 84 shown in FIG. 8.
Exemplarily, the computer program 1003 may be divided into one or more units, which are stored in the memory 1002 and executed by the processor 1001 to carry out the present invention. The one or more units may be a series of computer program instruction segments capable of performing specific functions, and the instruction segments are used to describe the execution of the computer program 1003 in the first terminal device 100. For example, the computer program 1003 may be divided into units with the following specific functions:
a first obtaining unit, configured to obtain a training sample set, an initial teacher model, and an initial student model, where the network structure complexity of the initial student model is lower than that of the initial teacher model;
a first training unit, configured to train the initial teacher model on the training sample set to obtain a target teacher model and a first transparency mask output by the target teacher model;
a migration unit, configured to migrate the first weight parameters in the target teacher model to the respective sub-networks of the initial student model to obtain a transitional student model; and
a second training unit, configured to train the transitional student model according to the first transparency mask and the training sample set to obtain the matting model.
The first terminal device includes, but is not limited to, the processor 1001 and the memory 1002. Those skilled in the art will understand that FIG. 10 is merely an example of the first terminal device 100 and does not constitute a limitation of it; the device may include more or fewer components than shown, or a combination of certain components, or different components. For example, the device may also include input and output devices, network access devices, buses, and the like.
The processor 1001 may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
The memory 1002 may be an internal storage unit of the first terminal device 100, for example a hard disk or internal memory of the first terminal device 100. The memory 1002 may also be an external storage device of the first terminal device 100, for example a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card provided on the first terminal device 100. Further, the memory 1002 may include both an internal storage unit and an external storage device of the first terminal device 100. The memory 1002 is used to store the computer program and other programs and data required by the device; it may also be used to temporarily store data that has been output or is to be output.
FIG. 11 is a schematic diagram of a second terminal device according to an embodiment of the present invention. As shown in FIG. 11, the second terminal device 11 of this embodiment includes a processor 111, a memory 112, and a computer program 113 stored in the memory 112 and executable on the processor 111, for example an image matting program. When executing the computer program 113, the processor 111 implements the steps of the image matting method embodiments described above, for example steps 701 to 703 shown in FIG. 7; alternatively, when executing the computer program 113, the processor 111 implements the functions of the units in the apparatus embodiments described above, for example the functions of units 91 to 93 shown in FIG. 9.
Exemplarily, the computer program 113 may be divided into one or more units, which are stored in the memory 112 and executed by the processor 111 to carry out the present invention. The one or more units may be a series of computer program instruction segments capable of performing specific functions, and the instruction segments are used to describe the execution of the computer program 113 in the second terminal device 11. For example, the computer program 113 may be divided into units with the following specific functions:
A second acquisition unit, configured to acquire an image to be matted, a background image, and a depth image corresponding to the image to be matted; wherein the image to be matted and the background image are images captured at the same viewing position, the image to be matted includes a matting object, and the background image does not include the matting object;

A processing unit, configured to input the image to be matted, the background image, and the depth image into a pre-trained matting model to obtain a target transparency mask output by the matting model; the matting model is obtained by training a transitional student model, the transitional student model is obtained by migrating the first weight parameters of a target teacher model to an initial student model, and the network structure complexity of the matting model is lower than that of the target teacher model;

An interception unit, configured to intercept, according to the target transparency mask, a matted image corresponding to the matting object in the image to be matted.
The second terminal device includes, but is not limited to, the processor 111 and the memory 112. Those skilled in the art will understand that FIG. 11 is merely an example of the second terminal device 11 and does not constitute a limitation on it; the device may include more or fewer components than shown, combine certain components, or use different components. For example, the second terminal device 11 may also include input/output devices, network access devices, a bus, and the like.
The processor 111 may be a Central Processing Unit (CPU), or another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
The memory 112 may be an internal storage unit of the second terminal device 11, such as a hard disk or memory of the second terminal device 11. The memory 112 may also be an external storage device of the second terminal device 11, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a Flash Card provided on the second terminal device 11. Further, the memory 112 may include both the internal storage unit and an external storage device of the second terminal device 11. The memory 112 is used to store the computer program and the other programs and data required by the second terminal device 11. The memory 112 may also be used to temporarily store data that has been output or is to be output.
It should be understood that the sequence numbers of the steps in the above embodiments do not imply an order of execution; the execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiments of the present application.
It should be noted that, since the information exchange and execution processes between the above apparatuses/units are based on the same concept as the method embodiments of the present application, their specific functions and technical effects can be found in the method embodiment section and are not repeated here.
Those skilled in the art can clearly understand that, for convenience and brevity of description, only the above division of functional units and modules is used as an example. In practical applications, the above functions may be allocated to different functional units and modules as required; that is, the internal structure of the apparatus may be divided into different functional units or modules to complete all or part of the functions described above. The functional units and modules in the embodiments may be integrated into one processing unit, each unit may exist physically alone, or two or more units may be integrated into one unit; the integrated unit may be implemented in the form of hardware or in the form of a software functional unit. In addition, the specific names of the functional units and modules are only for the convenience of distinguishing them from each other and are not used to limit the protection scope of the present application. For the specific working processes of the units and modules in the above system, reference may be made to the corresponding processes in the foregoing method embodiments, which are not repeated here.
Embodiments of the present application further provide a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps in each of the foregoing method embodiments.
Embodiments of the present application provide a computer program product which, when run on a mobile terminal, causes the mobile terminal to implement the steps in each of the foregoing method embodiments.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, all or part of the processes in the methods of the above embodiments of the present application may be completed by instructing relevant hardware through a computer program. The computer program may be stored in a computer-readable storage medium and, when executed by a processor, implements the steps of each of the above method embodiments. The computer program includes computer program code, which may be in the form of source code, object code, an executable file, some intermediate form, or the like. The computer-readable medium may include at least: any entity or apparatus capable of carrying the computer program code to the photographing apparatus/living-body detection device, a recording medium, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunication signal, and a software distribution medium, for example a USB flash drive, a removable hard disk, a magnetic disk, or an optical disc. In some jurisdictions, according to legislation and patent practice, computer-readable media may not include electrical carrier signals and telecommunication signals.
In the above embodiments, the description of each embodiment has its own emphasis. For parts not described in detail in a certain embodiment, reference may be made to the relevant descriptions of other embodiments.
Those of ordinary skill in the art will appreciate that the units and algorithm steps of each example described in connection with the embodiments disclosed herein can be implemented in electronic hardware, or a combination of computer software and electronic hardware. Whether these functions are performed in hardware or software depends on the specific application and design constraints of the technical solution. Skilled artisans may use different methods to implement the described functions for each particular application, but such implementations should not be considered beyond the scope of this application.
In the embodiments provided in this application, it should be understood that the disclosed apparatus/network device and method may be implemented in other ways. For example, the apparatus/network device embodiments described above are merely illustrative; the division of the modules or units is only a division by logical function, and there may be other division methods in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not implemented. Furthermore, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separated, and components displayed as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units.
It should be understood that, when used in this specification and the appended claims, the term "comprising" indicates the presence of the described features, integers, steps, operations, elements, and/or components, but does not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or collections thereof.
It should also be understood that the term "and/or" as used in this specification and the appended claims refers to any and all possible combinations of one or more of the associated listed items, and includes such combinations.
As used in this specification and the appended claims, the term "if" may be interpreted, depending on the context, as "when", "once", "in response to determining", or "in response to detecting". Similarly, the phrase "if it is determined" or "if [the described condition or event] is detected" may be interpreted, depending on the context, to mean "once it is determined", "in response to determining", "once [the described condition or event] is detected", or "in response to detecting [the described condition or event]".
In addition, in the description of this specification and the appended claims, the terms "first", "second", "third", etc. are used only to distinguish between descriptions and should not be construed as indicating or implying relative importance.
References in this specification to "one embodiment", "some embodiments", and the like mean that a particular feature, structure, or characteristic described in connection with the embodiment is included in one or more embodiments of the present application. Thus, appearances of the phrases "in one embodiment", "in some embodiments", "in some other embodiments", "in still other embodiments", etc. in various places in this specification do not necessarily all refer to the same embodiment, but rather mean "one or more but not all embodiments", unless specifically emphasized otherwise. The terms "including", "comprising", "having", and their variants all mean "including but not limited to", unless specifically emphasized otherwise.
The above embodiments are only used to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be equivalently replaced; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application, and shall all be included within the protection scope of the present application.

Claims (10)

  1. A method for training a matting model, wherein the method comprises:
    acquiring a training sample set, an initial teacher model, and an initial student model; wherein the network structure complexity of the initial student model is lower than that of the initial teacher model; each training sample includes an input sample and an output sample; the input sample includes an image to be matted, a background image, and a depth image of the image to be matted, and the output sample includes a standard transparency mask corresponding to the image to be matted;
    training the initial teacher model with the training sample set to obtain a target teacher model and a first transparency mask output by the target teacher model;
    migrating the first weight parameters in the target teacher model to each sub-network in the initial student model to obtain a transitional student model;
    training the transitional student model according to the first transparency mask and the training sample set to obtain the matting model.
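For illustration only (not part of the claims): the weight migration of claim 1 can be sketched as copying teacher parameters into the student's sub-networks wherever a parameter exists under the same name with the same shape. The dictionary-of-arrays representation and the helper name `transfer_weights` are assumptions introduced here, not elements of the claimed method.

```python
import numpy as np

def transfer_weights(teacher_state, student_state):
    """Copy teacher parameters into a smaller student model.

    Both models are represented as name -> ndarray dictionaries (an
    assumed, framework-agnostic stand-in for real checkpoints). Only
    parameters whose name and shape match are migrated; the rest of
    the student keeps its initial values.
    """
    migrated = dict(student_state)
    for name, w in teacher_state.items():
        if name in migrated and migrated[name].shape == w.shape:
            migrated[name] = w.copy()
    return migrated

# Toy example: one matching sub-network, one mismatched head.
teacher = {"conv1.w": np.zeros((3, 3)), "head.w": np.ones((8, 8))}
student = {"conv1.w": np.ones((3, 3)), "head.w": np.ones((2, 2))}
transitional = transfer_weights(teacher, student)
```

Here `conv1.w` is taken from the teacher, while the shape-mismatched `head.w` keeps its student initialization and is left to be trained in the subsequent distillation step.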
  2. The method of claim 1, wherein migrating the first weight parameters in the target teacher model to each sub-network in the initial student model to obtain the transitional student model comprises:
    quantizing the floating-point first weight parameters into integer data to obtain second weight parameters;
    migrating the second weight parameters to each sub-network in the initial student model to obtain the transitional student model.
  3. The method of claim 2, wherein quantizing the floating-point first weight parameters into integer data to obtain the second weight parameters comprises:
    substituting each first weight parameter into a first formula group in turn to obtain the second weight parameter corresponding to each first weight parameter;
    the first formula group is as follows:
    A = (J_max − J_min) / α
    B = round(−J_min / A)
    C = round(N / A) + B
    where A denotes the first quantization parameter, J_max denotes the maximum weight parameter among all first weight parameters, J_min denotes the minimum weight parameter among all first weight parameters, α denotes the maximum value of the preset integer data range, round(·) denotes rounding to the nearest integer, B denotes the first preset integer value, that is, the integer value corresponding to a floating-point first weight parameter of zero, N denotes each first weight parameter, and C denotes the second weight parameter corresponding to each first weight parameter.
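For illustration only: the first formula group corresponds to a standard affine quantization of float weights into a preset integer range [0, α]. The NumPy sketch below follows that reading; the final clipping to the valid range is an added safeguard for boundary values, not something stated in the claim.

```python
import numpy as np

def quantize_weights(weights, alpha=255):
    """Quantize float weights N into integers C = round(N / A) + B.

    A is the quantization step derived from the weight range, and B is
    the integer that represents a float weight of exactly zero.
    """
    j_max, j_min = float(weights.max()), float(weights.min())
    A = (j_max - j_min) / alpha            # first quantization parameter
    B = int(np.round(-j_min / A))          # integer value for float zero
    C = np.round(weights / A).astype(np.int64) + B
    return np.clip(C, 0, alpha), A, B

w = np.array([-0.5, 0.0, 0.25, 0.5])
C, A, B = quantize_weights(w)              # C -> [0, 128, 192, 255]
```

Note that a float weight of 0.0 maps exactly to B, which is what makes the integer representation lossless at zero and cheap to dequantize as N ≈ A · (C − B).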
  4. The method of claim 1, wherein training the transitional student model according to the first transparency mask and the training sample set to obtain the matting model comprises:
    quantizing the floating-point transparency of each pixel in the first transparency mask into integer data to obtain a second transparency mask;
    inputting a training sample into the transitional student model to obtain a third transparency mask output by the transitional student model;
    adjusting third weight parameters in the transitional student model according to the second transparency mask and the third transparency mask;
    performing, for each training sample in the training sample set in turn, the step of inputting the training sample into the transitional student model to obtain the third transparency mask output by the transitional student model and the subsequent steps, to obtain the matting model.
  5. The method of claim 4, wherein quantizing the floating-point transparency of each pixel in the first transparency mask into integer data to obtain the second transparency mask comprises:
    substituting the floating-point transparency of each pixel in the first transparency mask into a second formula group to obtain the second transparency mask;
    the second formula group is as follows:
    D = (K_max − K_min) / α
    E = round(−K_min / D)
    F = round(M / D) + E
    where D denotes the second quantization parameter, K_max denotes the maximum transparency in the first transparency mask, K_min denotes the minimum transparency in the first transparency mask, α denotes the maximum value of the preset integer data range, round(·) denotes rounding to the nearest integer, E denotes the second preset integer transparency, that is, the second integer value corresponding to a floating-point transparency of zero, M denotes the floating-point transparency of each pixel, and F denotes the integer transparency corresponding to the floating-point transparency of each pixel.
  6. The method of claim 4, wherein adjusting the third weight parameters in the transitional student model according to the second transparency mask and the third transparency mask comprises:
    calculating a first loss function by a first formula;
    the first formula is as follows:
    L_1 = (1 / (H × M)) Σ_{i=1..H} Σ_{j=1..M} |a_{i,j} − a*_{i,j}|
    where H denotes the preset length of the composite image, M denotes the preset width of the composite image, a_{i,j} denotes the first transparency of the pixel in row i, column j of the second transparency mask, and a*_{i,j} denotes the second transparency of the pixel in row i, column j of the third transparency mask;
    calculating a second loss function by a second formula;
    the second formula is as follows:
    L_2 = 1 − [(2μμ* + c_1)(2σσ* + c_2)] / [(μ² + μ*² + c_1)(σ² + σ*² + c_2)]
    where μ denotes the mean of the first transparencies of the pixels in the second transparency mask, μ² denotes the square of that mean, μ* denotes the mean of the second transparencies of the pixels in the third transparency mask, μ*² denotes the square of that mean, σ denotes the first transparency variance of the pixels in the second transparency mask, σ² denotes the square of the first transparency variance, σ* denotes the second transparency variance of the pixels in the third transparency mask, σ*² denotes the square of the second transparency variance, c_1 denotes a first constant, and c_2 denotes a second constant;
    calculating a third loss function by a third formula;
    the third formula is as follows:
    L_3 = γ Σ_{i,j} θ_{i,j} |a_{i,j} − a*_{i,j}|
    where γ denotes a third constant and θ_{i,j} denotes the index of difficult pixels in the third transparency mask, a difficult pixel being a pixel that the transitional student model cannot handle; the index is as follows:
    θ_{i,j} = (1 / (m × n)) Σ_{m×n} A_{i,j}
    where m×n denotes the range of m×n pixels adjacent to the difficult pixel, and A_{i,j} denotes the pixels adjacent to the unprocessable pixel;
    adjusting the third weight parameters in the transitional student model according to the first loss function, the second loss function, and the third loss function.
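For illustration only: the three losses above combine a pixel-wise error term, a structural-similarity (SSIM-style) term over the masks' statistics, and a term that re-weights difficult pixels. The sketch below follows that reading; the difficult-pixel mask `theta` is taken as a given input rather than computed, since the claim defines it through the m×n neighborhood rule whose exact form is left open, and the constants mirror common SSIM defaults rather than values from the source.

```python
import numpy as np

def l1_loss(a, a_s):
    # Mean absolute difference between the quantized teacher mask `a`
    # and the student mask `a_s` over the H x M image.
    return np.abs(a - a_s).mean()

def ssim_loss(a, a_s, c1=0.01 ** 2, c2=0.03 ** 2):
    # SSIM-style term computed from the masks' means and (co)variances.
    mu, mu_s = a.mean(), a_s.mean()
    var, var_s = a.var(), a_s.var()
    cov = ((a - mu) * (a_s - mu_s)).mean()
    ssim = ((2 * mu * mu_s + c1) * (2 * cov + c2)) / (
        (mu ** 2 + mu_s ** 2 + c1) * (var + var_s + c2))
    return 1.0 - ssim

def joint_loss(a, a_s, theta, gamma=1.0, weights=(1.0, 1.0, 1.0)):
    # Third term: difficult pixels (theta == 1) are penalized again,
    # scaled by gamma; the three losses are then combined with preset
    # weights as in claim 7.
    hard = gamma * (theta * np.abs(a - a_s)).mean()
    w1, w2, w3 = weights
    return w1 * l1_loss(a, a_s) + w2 * ssim_loss(a, a_s) + w3 * hard
```

When the student mask equals the teacher mask, all three terms vanish, which is the sanity check one would run before wiring such a loss into training.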
  7. The method of claim 6, wherein adjusting the third weight parameters in the transitional student model according to the first loss function, the second loss function, and the third loss function comprises:
    multiplying the first loss function, the second loss function, and the third loss function by their respective preset weights to obtain a joint loss function;
    adjusting the third weight parameters in the transitional student model according to the joint loss function.
  8. A method for image matting, wherein the method comprises:
    acquiring an image to be matted, a background image, and a depth image corresponding to the image to be matted; wherein the image to be matted and the background image are images captured at the same viewing position, the image to be matted includes a matting object, and the background image does not include the matting object;
    inputting the image to be matted, the background image, and the depth image into a pre-trained matting model to obtain a target transparency mask output by the matting model; the matting model is obtained by training a transitional student model, the transitional student model is obtained by migrating first weight parameters of a target teacher model to an initial student model, and the network structure complexity of the matting model is lower than that of the target teacher model;
    intercepting, according to the target transparency mask, a matted image corresponding to the matting object in the image to be matted.
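For illustration only: the inference flow of claim 8 can be sketched as stacking the three inputs along the channel axis, running the trained model, and masking the image with the predicted alpha. `matting_model` is a placeholder callable standing in for the trained student network; its interface is an assumption made here.

```python
import numpy as np

def matte(image, background, depth, matting_model):
    """Run a trained matting model and cut out the foreground.

    image, background: H x W x 3 arrays; depth: H x W array.
    matting_model: any callable mapping the stacked H x W x 7 input to
    an H x W alpha mask in [0, 1] (assumed interface).
    """
    x = np.concatenate([image, background, depth[..., None]], axis=-1)
    alpha = matting_model(x)                 # target transparency mask
    cutout = image * alpha[..., None]        # keep only foreground pixels
    return alpha, cutout

# Dummy model that marks every pixel as fully foreground:
img = np.random.rand(4, 4, 3)
alpha, cut = matte(img, np.zeros_like(img), np.zeros((4, 4)),
                   lambda x: np.ones(x.shape[:2]))
```

With the dummy all-ones model, the cutout is identical to the input image; a real model would instead produce fractional alpha at hair and object boundaries.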
  9. An apparatus for training a matting model, wherein the apparatus comprises:
    a first acquisition unit, configured to acquire a training sample set, an initial teacher model, and an initial student model; wherein the network structure complexity of the initial student model is lower than that of the initial teacher model; each training sample includes an input sample and an output sample; the input sample includes an image to be matted, a background image, and a depth image of the image to be matted, and the output sample includes a standard transparency mask corresponding to the image to be matted;
    a first training unit, configured to train the initial teacher model with the training sample set to obtain a target teacher model and a first transparency mask output by the target teacher model;
    a migration unit, configured to migrate the first weight parameters in the target teacher model to each sub-network in the initial student model to obtain a transitional student model;
    a second training unit, configured to train the transitional student model according to the first transparency mask and the training sample set to obtain the matting model.
  10. An apparatus for image matting, wherein the apparatus comprises:
    a second acquisition unit, configured to acquire an image to be matted, a background image, and a depth image corresponding to the image to be matted; wherein the image to be matted and the background image are images captured at the same viewing position, the image to be matted includes a matting object, and the background image does not include the matting object;
    a processing unit, configured to input the image to be matted, the background image, and the depth image into a pre-trained matting model to obtain a target transparency mask output by the matting model; the matting model is obtained by training a transitional student model, the transitional student model is obtained by migrating first weight parameters of a target teacher model to an initial student model, and the network structure complexity of the matting model is lower than that of the target teacher model;
    an interception unit, configured to intercept, according to the target transparency mask, a matted image corresponding to the matting object in the image to be matted.
PCT/CN2022/080531 2021-03-11 2022-03-13 Image matting model training method and apparatus, and image matting method and apparatus WO2022188886A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110264893.0 2021-03-11
CN202110264893.0A CN113052868B (en) 2021-03-11 2021-03-11 Method and device for training matting model and image matting

Publications (1)

Publication Number Publication Date
WO2022188886A1 true WO2022188886A1 (en) 2022-09-15

Family

ID=76511337

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/080531 WO2022188886A1 (en) 2021-03-11 2022-03-13 Image matting model training method and apparatus, and image matting method and apparatus

Country Status (2)

Country Link
CN (1) CN113052868B (en)
WO (1) WO2022188886A1 (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113052868B (en) * 2021-03-11 2023-07-04 奥比中光科技集团股份有限公司 Method and device for training matting model and image matting
CN114038006A (en) * 2021-08-09 2022-02-11 奥比中光科技集团股份有限公司 Matting network training method and matting method
CN114004772A (en) * 2021-09-30 2022-02-01 阿里巴巴(中国)有限公司 Image processing method, image synthesis model determining method, system and equipment
CN114140547B (en) * 2021-12-07 2023-03-14 北京百度网讯科技有限公司 Image generation method and device
CN114650453B (en) * 2022-04-02 2023-08-15 北京中庆现代技术股份有限公司 Target tracking method, device, equipment and medium applied to classroom recording and broadcasting
CN114937050A (en) * 2022-06-28 2022-08-23 北京字跳网络技术有限公司 Green curtain matting method and device and electronic equipment

Citations (6)

Publication number Priority date Publication date Assignee Title
CN110728658A (en) * 2019-09-16 2020-01-24 武汉大学 High-resolution remote sensing image weak target detection method based on deep learning
CN111339302A (en) * 2020-03-06 2020-06-26 支付宝(杭州)信息技术有限公司 Method and device for training element classification model
CN111724867A (en) * 2020-06-24 2020-09-29 中国科学技术大学 Molecular property measurement method, molecular property measurement device, electronic apparatus, and storage medium
US20200311540A1 (en) * 2019-03-28 2020-10-01 International Business Machines Corporation Layer-Wise Distillation for Protecting Pre-Trained Neural Network Models
CN112257815A (en) * 2020-12-03 2021-01-22 北京沃东天骏信息技术有限公司 Model generation method, target detection method, device, electronic device, and medium
CN113052868A (en) * 2021-03-11 2021-06-29 奥比中光科技集团股份有限公司 Matting model training and image matting method and apparatus

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108830288A (en) * 2018-04-25 2018-11-16 北京市商汤科技开发有限公司 Image processing method, neural network training method, apparatus, device, and medium
CN110309842B (en) * 2018-12-28 2023-01-06 中国科学院微电子研究所 Object detection method and device based on convolutional neural network
CN109902745A (en) * 2019-03-01 2019-06-18 成都康乔电子有限责任公司 CNN-based low-precision training and 8-bit integer quantization inference method
CN109978893B (en) * 2019-03-26 2023-06-20 腾讯科技(深圳)有限公司 Training method, device, equipment and storage medium of image semantic segmentation network


Also Published As

Publication number Publication date
CN113052868A (en) 2021-06-29
CN113052868B (en) 2023-07-04

Similar Documents

Publication Publication Date Title
WO2022188886A1 (en) Image matting model training method and apparatus, and image matting method and apparatus
Lv et al. Attention guided low-light image enhancement with a large scale low-light simulation dataset
WO2020125495A1 (en) Panoramic segmentation method, apparatus and device
WO2020253127A1 (en) Facial feature extraction model training method and apparatus, facial feature extraction method and apparatus, device, and storage medium
WO2020125498A1 (en) Cardiac magnetic resonance image segmentation method and apparatus, terminal device and storage medium
WO2022160980A1 (en) Super-resolution method and apparatus, terminal device, and storage medium
WO2021164269A1 (en) Attention mechanism-based disparity map acquisition method and apparatus
WO2021169126A1 (en) Lesion classification model training method and apparatus, computer device, and storage medium
CN110991287A (en) Real-time video stream face detection tracking method and detection tracking system
WO2023179095A1 (en) Image segmentation method and apparatus, terminal device, and storage medium
JP2021531571A (en) Certificate image extraction method and terminal equipment
WO2018035794A1 (en) System and method for measuring image resolution value
CN110298829A (en) Tongue diagnosis method, apparatus, system, computer device, and storage medium
WO2022247568A1 (en) Image restoration method and apparatus, and device
CN110738235A (en) Pulmonary tuberculosis determination method, pulmonary tuberculosis determination device, computer device, and storage medium
WO2021175040A1 (en) Video processing method and related device
CN108632641A (en) Video processing method and apparatus
CN111382647B (en) Picture processing method, device, equipment and storage medium
WO2022262660A1 (en) Pruning and quantization compression method and system for super-resolution network, and medium
CN116208586B (en) Low-delay medical image data transmission method and system
CN113781468A (en) Tongue image segmentation method based on lightweight convolutional neural network
WO2019120025A1 (en) Photograph adjustment method and apparatus, storage medium and electronic device
TWI817896B (en) Machine learning method and device
EP4184388A1 (en) White balance correction method and apparatus, device, and storage medium
CN111723934B (en) Image processing method and system, electronic device and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 22766408; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 22766408; Country of ref document: EP; Kind code of ref document: A1)