WO2021218293A1 - Image processing method and apparatus, electronic device and storage medium - Google Patents


Info

Publication number
WO2021218293A1
WO2021218293A1 (PCT/CN2021/076504)
Authority
WO
WIPO (PCT)
Prior art keywords
limb
image
target object
key point
point information
Prior art date
Application number
PCT/CN2021/076504
Other languages
French (fr)
Chinese (zh)
Inventor
李通
金晟
刘文韬
钱晨
Original Assignee
北京市商汤科技开发有限公司 (Beijing SenseTime Technology Development Co., Ltd.)
Priority date
Filing date
Publication date
Application filed by 北京市商汤科技开发有限公司 (Beijing SenseTime Technology Development Co., Ltd.)
Priority to JP2021565760A (published as JP2022534666A)
Publication of WO2021218293A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00: Image analysis
    • G06T7/20: Analysis of motion
    • G06T7/246: Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06T7/248: Analysis of motion using feature-based methods, e.g. the tracking of corners or segments, involving reference images or patches
    • G06T2207/00: Indexing scheme for image analysis or image enhancement
    • G06T2207/10: Image acquisition modality
    • G06T2207/10016: Video; Image sequence
    • G06T2207/30: Subject of image; Context of image processing
    • G06T2207/30196: Human being; Person

Definitions

  • The present disclosure relates to the field of computer vision technology, and in particular to an image processing method, apparatus, electronic device, and storage medium.
  • Target tracking technology is usually built on a limb detection algorithm and a limb key point detection algorithm: the human body detected by the limb detection algorithm and the human body key points detected by the limb key point detection algorithm are used together to track the target.
  • However, current limb detection algorithms and limb key point detection algorithms cannot adapt to scenes in which only the upper body is visible, so targets showing only upper-body limbs cannot be tracked.
  • To address this, the embodiments of the present disclosure provide an image processing method, apparatus, electronic device, and storage medium.
  • The embodiment of the present disclosure provides an image processing method. The method includes: obtaining a multi-frame image; performing limb key point detection processing on a target object in a first image of the multi-frame image to obtain first key point information corresponding to a part of the limb of the target object; and determining, based on the first key point information, second key point information corresponding to the part of the limb of the target object in a second image, where, in the multi-frame image, the second image is an image after the first image.
  • Performing the limb key point detection processing on the target object in the first image of the multi-frame image to obtain the first key point information corresponding to the part of the limb of the target object includes: performing limb detection processing on the target object in the first image to determine a first area of the target object, the first area including the area where the part of the limb of the target object is located; and performing limb key point detection processing on the pixels corresponding to the first area to obtain the first key point information corresponding to the part of the limb of the target object.
  • Determining the second key point information corresponding to the part of the limb of the target object in the second image based on the first key point information includes: determining a second area in the first image based on the first key point information, the second area being larger than the first area of the target object, and the first area including the area where the part of the limb of the target object is located; determining, according to the second area, a third area in the second image corresponding to the position range of the second area; and performing limb key point detection processing on the pixels in the third area in the second image to obtain the second key point information corresponding to the part of the limb.
  • Determining the second key point information corresponding to the part of the limb of the target object in the second image based on the first key point information may also include: determining, according to the position range of the first key point information in the first image, a third area in the second image corresponding to that position range; and performing limb key point detection processing on the pixels in the third area in the second image to obtain the second key point information corresponding to the part of the limb.
  • Performing the limb detection processing on the target object in the first image includes: using a limb detection network to perform limb detection processing on the target object in the first image, where the limb detection network is trained using a first type of sample image; the first type of sample image is marked with a detection frame of the target object; the marking range of the detection frame includes the area where the part of the limb of the target object is located.
  • Performing the limb key point detection processing on the pixels corresponding to the first area includes: using a limb key point detection network to perform limb key point detection processing on the pixels corresponding to the first area, where the limb key point detection network is trained using a second type of sample image; the second type of sample image is marked with key points of the part of the limb of the target object.
  • The part of the limb of the target object includes at least one of the following: head, neck, shoulders, chest, waist, hips, arms, and hands; the first key point information and the second key point information include contour key point information and/or bone key point information of at least one of the head, neck, shoulders, chest, waist, hips, arms, and hands.
  • The method further includes: in response to obtaining the first key point information corresponding to the part of the limb of the target object, assigning a tracking identifier to the target object; and determining the number of target objects in the multi-frame image based on the number of tracking identifiers allocated during the processing of the multi-frame image.
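The identifier-allocation-and-counting idea above can be sketched as a minimal registry. The class and method names (`TrackingRegistry`, `register`, `object_count`) are illustrative and not taken from the patent; the only claimed behavior modeled here is that one tracking identifier is issued per newly detected target and the object count equals the number of identifiers allocated.

```python
class TrackingRegistry:
    """Hypothetical helper: issues one tracking ID per newly detected target."""

    def __init__(self):
        self.next_id = 0
        self.active = {}  # tracking id -> latest key point information

    def register(self, keypoints):
        """Called when first key point information is obtained for a new target."""
        tid = self.next_id
        self.next_id += 1
        self.active[tid] = keypoints
        return tid

    def object_count(self):
        # Number of target objects = number of tracking IDs allocated so far.
        return self.next_id

registry = TrackingRegistry()
id_a = registry.register([(10, 20), (30, 40)])   # first target detected
id_b = registry.register([(50, 60), (70, 80)])   # second target detected
assert (id_a, id_b) == (0, 1)
assert registry.object_count() == 2
```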
  • The method further includes: determining the posture of the target object based on the second key point information; and determining the interaction instruction corresponding to the target object based on the posture of the target object.
  • The embodiment of the present disclosure also provides an image processing device. The device includes: an acquisition unit, a detection unit, and a tracking determination unit; where the acquisition unit is configured to obtain a multi-frame image; the detection unit is configured to perform limb key point detection processing on the target object in the first image of the multi-frame image to obtain first key point information corresponding to a part of the limb of the target object; and the tracking determination unit is configured to determine, based on the first key point information, second key point information corresponding to the part of the limb of the target object in the second image, where, in the multi-frame image, the second image is a frame image after the first image.
  • The detection unit includes a limb detection module and a limb key point detection module; where the limb detection module is configured to perform limb detection processing on the target object in the first image to determine the first area of the target object, the first area including the area where the part of the limb of the target object is located; and the limb key point detection module is configured to perform limb key point detection processing on the pixels corresponding to the first area to obtain the first key point information corresponding to the part of the limb of the target object.
  • The tracking determination unit is configured to: determine a second area in the first image based on the first key point information, the second area being larger than the first area of the target object, and the first area including the area where the part of the limb of the target object is located; determine, according to the second area, a third area in the second image corresponding to the position range of the second area; and perform limb key point detection processing on the pixels in the third area in the second image to obtain the second key point information corresponding to the part of the limb.
  • Alternatively, the tracking determination unit is configured to: determine, according to the position range of the first key point information in the first image, a third area in the second image corresponding to that position range; and perform limb key point detection processing on the pixels in the third area in the second image to obtain the second key point information corresponding to the part of the limb.
  • The limb detection module is configured to use a limb detection network to perform limb detection processing on the target object in the first image, where the limb detection network is trained using the first type of sample image; the first type of sample image is marked with the detection frame of the target object; the marking range of the detection frame includes the area where the part of the limb of the target object is located.
  • The limb key point detection module is configured to use a limb key point detection network to perform limb key point detection processing on the pixels corresponding to the first area, where the limb key point detection network is trained using the second type of sample image; the second type of sample image is marked with key points of the part of the limb of the target object.
  • The part of the limb of the target object includes at least one of the following: head, neck, shoulders, chest, waist, hips, arms, and hands; the first key point information and the second key point information include contour key point information and/or bone key point information of at least one of the head, neck, shoulders, chest, waist, hips, arms, and hands.
  • The device further includes an allocation unit and a statistics unit; where the allocation unit is configured to assign a tracking identifier to the target object in response to the detection unit obtaining the first key point information corresponding to the part of the limb of the target object; and the statistics unit is configured to determine the number of target objects in the multi-frame image based on the number of tracking identifiers allocated during the processing of the multi-frame image.
  • The device further includes a determining unit configured to determine the posture of the target object based on the second key point information, and to determine the interaction instruction corresponding to the target object based on the posture of the target object.
  • The embodiment of the present disclosure also provides a computer-readable storage medium on which a computer program is stored; when the program is executed by a processor, the steps of the image processing method described in the embodiments of the present disclosure are implemented.
  • The embodiment of the present disclosure also provides an electronic device including a memory, a processor, and a computer program stored in the memory and executable on the processor; the processor executes the program to implement the steps of the image processing method described in the embodiments of the present disclosure.
  • The embodiment of the present disclosure also provides a computer program that causes a computer to execute the image processing method described in the embodiments of the present disclosure.
  • The image processing method, apparatus, electronic device, and storage medium provided by the embodiments of the present disclosure recognize the key points of a part of the limb of the target object in the first image of the multi-frame image to be processed, and determine, based on the recognized key points of the part of the limb, the key points of that part of the limb of the target object in the subsequent second image, thereby realizing target tracking in scenes where only a part of the target object's limbs (for example, the upper body) appears in the image.
  • FIG. 1 is a first schematic diagram of the flow of an image processing method according to an embodiment of the disclosure
  • FIG. 2 is a schematic flowchart of a method for detecting and processing limb key points in an image processing method according to an embodiment of the disclosure
  • FIG. 3 is a schematic flowchart of a method for tracking key points of limbs in an image processing method according to an embodiment of the present disclosure
  • FIG. 4 is a second schematic flowchart of an image processing method according to an embodiment of the disclosure.
  • FIG. 5 is a first schematic diagram of the composition structure of the image processing apparatus according to an embodiment of the disclosure.
  • FIG. 6 is a second schematic diagram of the composition structure of the image processing apparatus according to an embodiment of the disclosure.
  • FIG. 7 is a third schematic diagram of the composition structure of the image processing device according to an embodiment of the disclosure.
  • FIG. 8 is a fourth schematic diagram of the composition structure of the image processing apparatus according to an embodiment of the disclosure.
  • FIG. 9 is a schematic diagram of the hardware composition structure of an electronic device according to an embodiment of the disclosure.
  • FIG. 1 is a first schematic flowchart of an image processing method according to an embodiment of the present disclosure; as shown in FIG. 1, the method includes:
  • Step 101: Obtain a multi-frame image;
  • Step 102: Perform limb key point detection processing on the target object in the first image of the multi-frame image to obtain first key point information corresponding to a part of the limb of the target object;
  • Step 103: Determine, based on the first key point information, second key point information corresponding to the part of the limb of the target object in the second image; where, in the multi-frame image, the second image is an image after the first image.
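Steps 101 to 103 can be sketched end to end with stub detection functions. Everything here is illustrative: the function names, the dictionary-based "frames", and the bounding-box-plus-margin prediction are stand-ins for the limb key point detection network and the region-based search described below, not the patent's actual implementation.

```python
def detect_keypoints_full(image):
    """Stub for limb key point detection on a whole frame (step 102)."""
    return image["keypoints"]

def detect_keypoints_in_region(image, region):
    """Stub for key point detection restricted to a region of a later frame (step 103)."""
    x0, y0, x1, y1 = region
    return [(x, y) for (x, y) in image["keypoints"] if x0 <= x <= x1 and y0 <= y <= y1]

def bounding_region(keypoints, margin):
    """Smallest box containing all key points, enlarged by a margin on each side."""
    xs = [x for x, _ in keypoints]
    ys = [y for _, y in keypoints]
    return (min(xs) - margin, min(ys) - margin, max(xs) + margin, max(ys) + margin)

frames = [                                              # step 101: obtain images
    {"keypoints": [(100, 50), (120, 80)]},              # first image
    {"keypoints": [(105, 55), (125, 85), (400, 400)]},  # second image (one far-away point)
]
first_kps = detect_keypoints_full(frames[0])            # step 102
region = bounding_region(first_kps, margin=30)          # predicted search area
second_kps = detect_keypoints_in_region(frames[1], region)  # step 103
assert region == (70, 20, 150, 110)
assert second_kps == [(105, 55), (125, 85)]             # far-away point is excluded
```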
  • the image processing method of this embodiment can be applied to an image processing device.
  • The image processing device may be provided in an electronic device with processing capability, such as a personal computer or a server, or may be implemented by a processor executing a computer program.
  • the above-mentioned multi-frame images may be continuous videos collected by a camera device built-in or externally connected to the electronic device, or may also be received videos transmitted by other electronic devices.
  • the above-mentioned multi-frame images may be surveillance videos collected by a surveillance camera to track various target objects in the surveillance video.
  • the above-mentioned multi-frame images may also be videos stored locally or in other video libraries to track each target object in the video.
  • The image processing method of this embodiment can be applied to application scenarios such as virtual reality (VR), augmented reality (AR), or somatosensory games. The above-mentioned multi-frame image may also be images of an operator collected in a virtual reality or augmented reality scene, used to control the actions of virtual objects in the virtual reality or augmented reality scene by recognizing the operator's posture in the images; or it may be images, collected in a somatosensory game, of the target objects (such as multiple users) participating in the game.
  • the image processing device may establish a communication connection with one or more surveillance cameras, and obtain real-time surveillance videos collected by the surveillance cameras as multi-frame images to be processed.
  • The image processing device can also obtain a video from its own stored videos as the multi-frame image to be processed, or obtain a video stored in another electronic device as the multi-frame image to be processed, and so on.
  • The image processing device can also be placed in a game device, with the processor of the game device executing a computer program so that, while the game is being operated, the displayed output images serve as the multi-frame image to be processed and the target object in the images (the target object corresponding to the game operator) is tracked.
  • The multi-frame image to be processed may include one or more target objects; in some application scenarios, the target object may be a real person; in other application scenarios, the target object may also be another object determined according to actual tracking needs, such as a virtual character or other virtual object.
  • Each image in the multi-frame image may be called a frame image, which is the smallest unit constituting a video (i.e., the image to be processed). The frame images, ordered by acquisition time, form the above-mentioned multi-frame image, and the time parameters corresponding to the frame images are continuous.
  • One or more target objects may be present throughout the time range corresponding to the above-mentioned multi-frame image, or only in a part of that time range, which is not limited in this embodiment.
  • The above-mentioned first image is any one of the multi-frame image, and the second image is an image after the first image; in other words, the first image is, among the multi-frame image, any frame before the second image.
  • The second image may be the next frame image temporally continuous with the first image.
  • For example, if the multi-frame image includes 10 frame images and the first image is the second of the 10 frame images, the second image is the third frame image.
  • The second image may also be an image after the first image that is separated from the first image by a preset number of frame images.
  • For example, if the multi-frame image includes 20 frame images, the first image is the second of the 20 frame images, and the preset number is 3 frame images, the second image may be the sixth of the 20 frame images. The preset number can be set in advance according to actual conditions, for example according to the moving speed of the target object. This approach can effectively reduce the amount of data to be processed, thereby reducing the consumption of the image processing device.
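The frame-skipping example above can be written as a small helper. The function name `frames_to_process` is illustrative; the only behavior modeled is that after the first image, every (preset number + 1)-th frame is processed, so with 20 frames, a first image at frame 2, and a preset number of 3, frame 2 is followed by frame 6 as in the text.

```python
def frames_to_process(num_frames, start_index, skip):
    """Return the 1-based indices of frames to process: the first image,
    then every (skip + 1)-th frame after it."""
    return list(range(start_index, num_frames + 1, skip + 1))

# 20 frames, first image is frame 2, preset number = 3 intervening frames.
assert frames_to_process(20, 2, 3) == [2, 6, 10, 14, 18]
# With skip = 0 every consecutive frame is processed.
assert frames_to_process(5, 1, 0) == [1, 2, 3, 4, 5]
```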
  • the image processing device may perform limb key point detection processing on the target object in the first image through the limb key point detection network to obtain first key point information corresponding to part of the limb of the target object.
  • part of the limbs of the target object includes at least one of the following: head, neck, shoulders, chest, waist, hips, arms, and hands.
  • The first key point information and the second key point information corresponding to the part of the limb of the target object include contour key point information and/or bone key point information of at least one of the head, neck, shoulders, chest, waist, hips, arms, and hands of the target object.
  • In this embodiment, the part of the limb of the target object is the upper-body limbs of the target object, so that a target object showing its upper body can be identified in the multi-frame image, realizing tracking of target objects with only the upper body visible as well as those with the whole body visible.
  • The key points corresponding to the first key point information and the second key point information may include: at least one key point of the head, at least one key point of the shoulders, at least one key point of the arms, at least one key point of the chest, at least one key point of the hips, and at least one key point of the waist; optionally, they may also include at least one key point of the hands. Whether the image processing device can obtain hand key points depends on whether hand key points are marked in the sample images used to train the limb key point detection network; when hand key points are marked in the sample images, they can be detected through the limb key point detection network.
  • The first key point information and the second key point information may include key point information of at least one organ; the key point information of the organ may include at least one of the following: nose key point information, eyebrow key point information, and mouth key point information.
  • the first key point information and the second key point information may include elbow key point information.
  • the first key point information and the second key point information may include wrist key point information.
  • the first key point information and the second key point information may further include contour key point information of the hand.
  • the first key point information and the second key point information may include left hip key point information and right hip key point information.
  • the first key point information and the second key point information may also include the key point information of the spine root.
  • The above-mentioned first key point information may specifically include the coordinates of the key points.
  • The aforementioned first key point information may include the coordinates of the contour key points and/or the coordinates of the bone key points. It can be understood that the contour edge of the corresponding part of the limb can be formed from the coordinates of the contour key points, and the skeleton of the corresponding part of the limb can be formed from the coordinates of the bone key points.
  • FIG. 2 is a schematic flowchart of a method for detecting and processing limb key points in an image processing method according to an embodiment of the present disclosure; in some optional embodiments, step 102 may refer to FIG. 2 and includes:
  • Step 1021: Perform limb detection processing on the target object in the first image to determine the first area of the target object; the first area includes the area where the part of the limb of the target object is located;
  • Step 1022: Perform limb key point detection processing on the pixels corresponding to the first area to obtain the first key point information corresponding to the part of the limb of the target object.
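Steps 1021 and 1022 can be sketched as a two-stage pipeline: detect a first area, crop its pixels, run key point detection on the crop, and map the results back to image coordinates. The functions are stubs standing in for the limb detection network and limb key point detection network; all names and values are illustrative only.

```python
def detect_first_area(image):
    """Stub limb detector: returns the first area as (x, y, w, h)."""
    return (8, 4, 16, 24)

def detect_limb_keypoints(patch):
    """Stub key point detector: returns key points in patch coordinates."""
    h, w = len(patch), len(patch[0])
    return [(w // 2, h // 4)]  # e.g. a single "head" key point

# A 64x64 single-channel image as a nested list of pixel values.
image = [[0] * 64 for _ in range(64)]
x, y, w, h = detect_first_area(image)                    # step 1021
patch = [row[x:x + w] for row in image[y:y + h]]         # pixels of the first area
# Step 1022: detect on the patch, then shift back into full-image coordinates.
kps = [(px + x, py + y) for px, py in detect_limb_keypoints(patch)]
assert len(patch) == 24 and len(patch[0]) == 16
assert kps == [(16, 10)]
```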
  • Through the limb detection processing, the first area corresponding to the upper body of each target object, or the first area corresponding to the whole body of each target object, can be determined. A detection frame (for example, a rectangular frame) identifying the target object may be used to indicate the first area corresponding to the part of the limb; for example, the upper body of each person in the first image may be identified by a rectangular frame.
  • The above-mentioned performing limb detection processing on the target object in the first image includes: using a limb detection network to perform limb detection processing on the target object in the first image, where the limb detection network is trained using a first type of sample image; the first type of sample image is marked with the detection frame of the target object; the marking range of the detection frame includes the area where the part of the limb of the target object is located; the part of the limb of the target object may be the upper-body limbs of the target object.
  • limb detection can be performed on the first image through a pre-trained limb detection network to determine the first area of the target object, that is, to obtain the detection frame of each target object in the first image.
  • the above detection frame can identify part or all of the limbs of the target object, that is, all the limbs or upper body limbs of the target object can be detected through the limb detection network.
  • the aforementioned limb detection network may adopt any network structure capable of detecting the limb of the target object, which is not limited in this embodiment.
  • Feature extraction can be performed on the first image through the limb detection network, and, based on the extracted features, the center point of the part of the limb of each target object in the first image and the height and width of the detection frame of that part of the limb can be determined. Based on the center point of the part of the limb of each target object and the corresponding height and width, the detection frame of the part of the limb of each target object can be determined.
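Converting a predicted center point plus height and width into a detection frame is a simple coordinate transform. The function name `box_from_center` and the (x0, y0, x1, y1) corner format are illustrative choices, not specified by the patent.

```python
def box_from_center(cx, cy, height, width):
    """Return corner coordinates (x0, y0, x1, y1) for a detection frame of the
    given height and width centered at (cx, cy)."""
    return (cx - width / 2, cy - height / 2, cx + width / 2, cy + height / 2)

# A box of height 20 and width 10 centered at (50, 40).
assert box_from_center(50, 40, height=20, width=10) == (45.0, 30.0, 55.0, 50.0)
```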
  • The limb detection network can be obtained by training with the first type of sample image marked with the detection frame of the target object, where the marking range of the detection frame includes the part of the limb of the target object. The first type of sample image may be marked with a detection frame of only a part of the limb of the target object (for example, the upper-body limbs of the target object), or may be marked with a detection frame of the complete limbs of the target object.
  • During training, the limb detection network extracts the feature data of the first type of sample image and, based on the feature data, determines the predicted center point of the part of the limb of each target object in the first type of sample image and the height and width of the predicted detection frame of the corresponding part of the limb; the predicted detection frame corresponding to each part of the limb is determined based on the predicted center point and the corresponding height and width. A loss is then determined from the predicted detection frame and the marked detection frame, and the network parameters of the limb detection network are adjusted based on the loss.
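The patent does not specify a loss function for the detection frame; as one simple possibility, an L1 loss over the predicted versus annotated center point and box size could be used. The function name and the (cx, cy, h, w) parameterization are assumptions for illustration.

```python
def detection_loss(pred, target):
    """L1 loss over (cx, cy, height, width) of the predicted detection frame
    versus the annotated detection frame."""
    return sum(abs(p - t) for p, t in zip(pred, target))

pred = (52.0, 40.0, 100.0, 60.0)    # predicted center (cx, cy), height, width
target = (50.0, 42.0, 98.0, 64.0)   # annotated detection frame
loss = detection_loss(pred, target)
assert loss == 10.0
# In training, this loss would be used (e.g. via backpropagation) to adjust
# the limb detection network's parameters.
```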
  • The above-mentioned performing limb key point detection processing on the pixels corresponding to the first area includes: using a limb key point detection network to perform limb key point detection processing on the pixels corresponding to the first area, where the limb key point detection network is trained using the second type of sample image; the second type of sample image is marked with the key points of the target object; the marking range of the key points includes the part of the limb of the target object.
  • a pre-trained limb key point detection network may be used to perform limb key point detection on pixels corresponding to the first region to determine the first key point information of a part of the limb of each target object.
  • the above-mentioned first area may include part of the limbs of the target object, and the pixel points corresponding to the detection frame of each target object may be input to the limb key point detection network to obtain the first key point information corresponding to the part of the limb of each target object .
  • the aforementioned limb key point detection network may adopt any network structure capable of detecting limb key points, which is not limited in this embodiment.
  • The limb key point detection network can be obtained by training with the second type of sample image marked with the key points of the target object, where the marking range of the key points includes the part of the limb of the target object. It is understandable that the second type of sample image may be marked with the key points of only a part of the limb of the target object (for example, the upper-body limbs of the target object), or with the key points of the complete limbs of the target object.
  • During training, the feature data of the second type of sample image can be extracted using the limb key point detection network, and the predicted key points of the part of the limb of each target object in the second type of sample image are determined based on the feature data; a loss is determined from the predicted key points and the marked key points, and the network parameters of the limb key point detection network are adjusted based on the loss.
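The key point training step compares predicted key points with marked key points to produce a loss. The patent does not fix a particular loss form; mean Euclidean distance between corresponding key points is used here purely as an example, and the function name is illustrative.

```python
import math

def keypoint_loss(predicted, marked):
    """Mean Euclidean distance between corresponding predicted and marked
    key points (one illustrative choice of loss)."""
    dists = [math.dist(p, m) for p, m in zip(predicted, marked)]
    return sum(dists) / len(dists)

predicted = [(10.0, 10.0), (20.0, 20.0)]   # network output for two key points
marked = [(13.0, 14.0), (20.0, 20.0)]      # annotated key points
assert keypoint_loss(predicted, marked) == 2.5   # (5.0 + 0.0) / 2
```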
  • FIG. 3 is a schematic flowchart of a method for tracking body key points in an image processing method according to an embodiment of the present disclosure; in some optional embodiments, step 103 may refer to FIG. 3, and the method includes:
  • Step 1031: Determine a second area in the first image based on the first key point information; the second area is larger than the first area of the target object; the first area includes the area where the part of the limb of the target object is located;
  • Step 1032: Determine, according to the second area, a third area in the second image corresponding to the position range of the second area;
  • Step 1033: Perform limb key point detection processing on the pixels in the third area in the second image to obtain the second key point information corresponding to the part of the limb.
  • An area is first determined based on the first key point information of the part of the limb of the target object; this area may be the smallest area containing all the key points of the part of the limb of the target object. For example, when the area is a rectangular area, the rectangular area is the smallest rectangle containing all the key points of the part of the limb of the target object.
  • The above-mentioned second area is an area obtained by enlarging this area in the first image: with the center point of the area as the center, the four sides of the area extend away from the center point. For example, if the area is a rectangular area of height H and width W, the second area may be represented as the rectangular area in the first image that has the same center point, a height of 3H/2, and a width of 3W/2.
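The enlargement described above (same center, height and width scaled by 3/2) can be written as a small helper. The function name `enlarge_region` and the corner-based (x0, y0, x1, y1) box format are illustrative; the default factor of 1.5 matches the 3H/2 and 3W/2 example in the text.

```python
def enlarge_region(x0, y0, x1, y1, factor=1.5):
    """Scale a rectangle's height and width by `factor`, keeping its center fixed."""
    cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
    w, h = (x1 - x0) * factor, (y1 - y0) * factor
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)

# A W=40 by H=80 area centered at (30, 60) becomes a 60 by 120 area
# with the same center.
assert enlarge_region(10, 20, 50, 100) == (0.0, 0.0, 60.0, 120.0)
```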
  • the third area in the second image corresponding to the above-mentioned position range can be determined according to the position range of the second area in the first image.
  • determining the third area in the second image corresponding to the position range of the second area according to the second area may further include: performing limb key point detection processing on pixels corresponding to the second area, Obtain third key point information; determine a position range of the third key point information in the first image, and determine a third area in the second image corresponding to the position range based on the position range.
  • the limb key point detection network is still used to perform limb key point detection processing on the pixels corresponding to the second area; the pixels corresponding to the expanded second area in the first image may be used as input data to the limb key point detection network, which outputs the third key point information.
  • the third key point information is used as the prediction key point information of the target object in the second image.
  • the area where the target object is located is expanded (for example, the area where the part of the limb of the target object in the previous frame image is located is expanded), limb key point detection is performed on the expanded area, and the obtained key points are used as the predicted key points of the target object (for example, of the part of the limb of the target object) in the frame following the current frame image (i.e., the first image), namely the second image. Further, based on the predicted position range, limb key point detection processing is performed on the pixel points corresponding to the third area in the second image, and the detected key point information is the second key point information corresponding to the part of the limb of the target object.
  • the above step 103 may further include: determining, according to the position range of the first key point information in the first image, the third area in the second image corresponding to that position range; and performing limb key point detection processing on pixels in the third area in the second image to obtain second key point information corresponding to the part of the limb.
  • the third area in the second image corresponding to the above-mentioned position range can be determined according to the position range of the first key point in the first image.
  • the pixel points corresponding to the third area in the second image are further subjected to limb key point detection processing, and the detected key point information is the second key point information corresponding to the part of the limb of the target object.
  • step 103 may further include: determining the predicted area of the target object in the second image based on the first image, the first area of the target object, and a target tracking network; and performing limb key point detection processing on the pixel points in the predicted area to obtain the second key point information corresponding to the part of the limb of the target object. Here, the target tracking network is trained using multi-frame sample images; the multi-frame sample images include at least a first sample image and a second sample image, the second sample image being an image after the first sample image; the position of the target object is marked in the first sample image, and the position of the target object is marked in the second sample image.
  • the detection frame of the target object is marked in each of the multi-frame sample images, and the position of the target object in a sample image is represented by its detection frame; the marking range of the detection frame includes the area where the part of the limb of the target object is located; the part of the limb of the target object may be the upper-body limbs of the target object.
  • through a pre-trained target tracking network, the predicted position of the target object in the next frame of image (i.e., the second image) can be determined based on the previous frame of image (i.e., the first image) and the position of the target object in that image.
  • for example, the first image containing the detection frame of the target object can be input to the target tracking network to obtain the predicted position of the target object in the second image;
  • limb key point detection processing on that predicted position then obtains the second key point information of the part of the limb of the target object in the second image.
  • the above-mentioned target tracking network may adopt any network structure capable of realizing target tracking, which is not limited in this embodiment.
  • the target tracking network can be obtained by training with multi-frame sample images marked with the position of the target object (for example, a detection frame containing the target object, or a detection frame containing a part of the limb of the target object).
  • the target tracking network can be used to process the first sample image, and the position of the target object is marked in the first sample image.
  • the result is the predicted position of the target object in the second sample image; the loss can be determined according to the predicted position and the labeled position of the target object in the second sample image, and the network parameters of the target tracking network can be adjusted based on the loss.
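The disclosure does not fix a particular loss between the predicted position and the labeled position; as one illustrative choice only, a mean absolute error over the detection-frame coordinates could be computed as:

```python
def box_regression_loss(pred_box, label_box):
    """Mean absolute error between a predicted and a labeled detection
    frame, each given as (x, y, w, h). One possible loss; the actual
    network may use a different formulation."""
    return sum(abs(p, ) if False else abs(p - t) for p, t in zip(pred_box, label_box)) / len(pred_box)
```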
  • the second key point information corresponding to the part of the limb of the target object in the second image may be determined, and based on that second key point information, the key point information corresponding to the part of the limb of the target object in subsequent images is further determined, and so on, until the key point information corresponding to the part of the limb of the target object can no longer be detected in a following frame of image.
  • the above-mentioned target object is no longer included in the processed multi-frame image, that is, the target object has moved out of the field of view of the multi-frame image to be processed.
  • the image processing device may also perform limb detection for the target object in each frame of image to obtain the area where the target object in each frame of image is located.
  • the detected target object is used as the tracking object, and it is determined whether a new target object appears in the current frame image; when a new target object appears in the current frame image, the new target object is used as a tracking object, and limb key point detection processing is performed on the pixel points in the first area corresponding to the new target object, that is, the processing of step 103 in the embodiment of the present disclosure is executed for the new target object.
  • the image processing device may execute the limb detection processing of the target object in the image every preset time or every preset number of image frames, so as to detect whether a new target object appears in the image at regular intervals. Track new target objects.
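The alternation described above — track from frame to frame, and re-run full limb detection every preset number of frames to pick up new target objects — can be sketched as follows. `detect` and `track` are hypothetical stand-ins for the limb detection network and the key-point tracking step, not functions from the disclosure:

```python
def process_video(frames, detect, track, detection_interval=10):
    """Alternate full limb detection (every detection_interval frames)
    with key-point-based tracking in between.

    detect(frame) -> list of key-point sets for detected objects
    track(frame, prev_keypoints) -> updated key-point set, or None if
    the target object can no longer be tracked.
    """
    tracked = []  # key-point sets of the objects currently being tracked
    for i, frame in enumerate(frames):
        # propagate every tracked object into the current frame,
        # dropping objects that were lost
        tracked = [kp for kp in (track(frame, kp) for kp in tracked) if kp]
        if i % detection_interval == 0:
            # periodic re-detection: start tracking any new target object
            for kp in detect(frame):
                if kp not in tracked:
                    tracked.append(kp)
        yield list(tracked)
```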
  • the foregoing method further includes: in response to obtaining first key point information corresponding to a part of the limb of the target object, assigning a tracking identifier to the target object; and determining the number of target objects in the multi-frame images based on the number of tracking identifiers assigned during processing of the multi-frame images.
  • when the image processing device detects the target object in the first frame of the multi-frame images to be processed, that is, when the first key point information corresponding to the part of the limb of the target object is obtained, a tracking identifier is assigned to the target object; the tracking identifier remains associated with the target object until the target object can no longer be tracked.
  • the image processing device may also perform limb detection for the target object in each frame of image to obtain the area corresponding to the part of the limb of the target object in each frame of image, and use the detected target object as the tracking object. Based on this, the image processing device detects the first frame of the images to be processed and assigns a tracking identifier to the detected target object. After that, the tracking identifier keeps following the target object until the target object can no longer be tracked. If a new target object is detected in a certain frame of image, a tracking identifier is assigned to the new target object, and the above scheme is repeated.
  • each target object detected at the same time corresponds to different tracking identifiers; target objects tracked in a continuous time range correspond to the same tracking identifier; target objects detected separately in a non-continuous time range Correspond to different tracking identifiers.
  • for example, if three target objects are detected, each target object is assigned its own tracking identifier; the identifiers can be recorded as identifier 1, identifier 2, and identifier 3.
  • suppose the first of the above three target objects then disappears from the image; within the current minute there are only two target objects, whose tracking identifiers are identifier 2 and identifier 3.
  • if the above-mentioned first target object later appears in the image again, then compared with the previous image a new target object is detected; even though it is the same object that appeared earlier (i.e., the first target object), it is still assigned identifier 4 as its tracking identifier, and so on.
  • the technical solution of this embodiment can determine the number of target objects that have appeared in the multi-frame images based on the number of tracking identifiers assigned during processing of the multi-frame images.
  • the number of target objects that have appeared in multiple frames of images refers to the number of target objects that have appeared in a time range corresponding to the multiple frames of images.
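The identifier scheme above — a fresh identifier for every newly detected target object, including one that re-appears after being lost, so the count of identifiers ever assigned equals the count of objects that have appeared — can be sketched as follows. `TrackingIdAllocator` and its method names are illustrative, not taken from the disclosure:

```python
class TrackingIdAllocator:
    """Assign monotonically increasing tracking identifiers.

    A re-appearing target object is treated as new and receives a fresh
    identifier (identifier 4 in the example above), so total_seen counts
    every target object that appeared within the multi-frame images.
    """

    def __init__(self):
        self._next_id = 1
        self.active = {}  # tracking identifier -> target object state

    def new_target(self, state):
        tid = self._next_id
        self._next_id += 1
        self.active[tid] = state
        return tid

    def lose_target(self, tid):
        # the identifier is retired, never reused
        self.active.pop(tid, None)

    @property
    def total_seen(self):
        return self._next_id - 1
```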
  • the key points of the part of the limb of the target object in the first image among the multi-frame images to be processed are recognized, and the key points of the part of the limb of the target object in the subsequent second image are determined based on the recognized key points, thereby realizing target tracking in a scene where only part of the limb of the target object (for example, the upper body) appears in the image; that is, the technical solution of the embodiment of the present disclosure can adapt both to complete-limb scenes and to partial-limb (such as upper-body) scenes, realizing target tracking in the image.
  • FIG. 4 is a second schematic diagram of the flow of the image processing method according to an embodiment of the disclosure; as shown in FIG. 4, the method includes:
  • Step 201 Obtain multiple frames of images
  • Step 202 Perform limb key point detection processing on the target object in the first image in the multi-frame image to obtain first key point information corresponding to part of the limb of the target object;
  • Step 203 Determine the second key point information corresponding to the part of the limb of the target object in the second image based on the first key point information; wherein, in the multiple frames of images, the second image is one frame after the first image;
  • Step 204 Determine the posture of the target object based on the second key point information; determine the interactive instruction corresponding to the target object based on the posture of the target object.
  • for step 201 to step 203 in this embodiment, reference may be made to the description of step 101 to step 103, which will not be repeated here.
  • based on the tracked target object, the posture of the target object can be determined from the second key point information of the target object, the interactive instruction corresponding to each posture can be determined based on that posture, and the interactive instruction corresponding to each posture is then responded to.
  • the image processing device can determine the corresponding interactive instruction based on each posture and respond to the interactive instruction; in responding to the interactive instruction, for example, the image processing device itself, or the electronic device where the image processing device is located, can turn on or turn off some of its own functions; alternatively, the interactive instruction can be sent to another electronic device, which receives the interactive instruction and turns certain functions on or off based on it. In other words, the interactive instruction can also be used to turn on or turn off corresponding functions of other electronic devices.
  • the image processing device can perform corresponding processing based on various interactive instructions, including but not limited to: controlling a virtual reality or augmented reality scene so that a virtual object performs a corresponding action; or controlling a somatosensory game scene so that the virtual character corresponding to the target object performs a corresponding action.
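The posture-to-instruction correspondence described above can be as simple as a lookup table. The posture labels and instruction names below are hypothetical placeholders for illustration; the disclosure does not enumerate specific postures:

```python
# Hypothetical posture labels and instructions, for illustration only.
POSTURE_TO_INSTRUCTION = {
    "hands_raised": "jump",
    "lean_left": "move_left",
    "lean_right": "move_right",
}

def interactive_instruction(posture, default="idle"):
    """Map a recognized posture of the target object to an interactive
    instruction, e.g. an action for the virtual character in a
    somatosensory game scene."""
    return POSTURE_TO_INSTRUCTION.get(posture, default)
```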
  • the corresponding processing performed by the image processing device based on the interactive instruction may include controlling the virtual target object to perform an action corresponding to the interactive instruction in a real scene or a virtual scene.
  • target tracking is realized in a scene where only part of the limbs (such as the upper body) of the target object appears in the image; that is, the technical solution of the embodiments of the present disclosure can adapt both to complete-limb scenes and to partial-limb (such as upper-body) scenes, realizing target tracking in the image. On the other hand, the key point information of the tracked target object is detected during the target tracking process, the posture of the tracked target object is determined based on that key point information, and the corresponding interactive instruction is determined based on the posture, which realizes human-computer interaction in specific application scenarios (such as virtual reality scenes, augmented reality scenes, somatosensory game scenes, and other interactive scenes) and enhances the user's interactive experience.
  • FIG. 5 is a schematic diagram 1 of the composition structure of an image processing device according to an embodiment of the disclosure; as shown in FIG. 5, the device includes: an acquisition unit 31, a detection unit 32, and a tracking determination unit 33; wherein,
  • the aforementioned acquiring unit 31 is configured to acquire multiple frames of images
  • the detection unit 32 is configured to perform limb key point detection processing on the target object in the first image in the multi-frame image, to obtain first key point information corresponding to part of the limb of the target object;
  • the tracking determination unit 33 is configured to determine, based on the first key point information, the second key point information corresponding to the part of the limb of the target object in the second image; wherein, in the multi-frame image, the second image is An image after the first image.
  • the detection unit 32 includes: a limb detection module 321 and a limb key point detection module 322; wherein,
  • the limb detection module 321 is configured to perform limb detection processing on the target object in the first image to determine the first area of the target object; the first area includes the area where part of the limb of the target object is located;
  • the limb key point detection module 322 is configured to perform limb key point detection processing on the pixel points corresponding to the first region to obtain first key point information corresponding to the part of the limb of the target object.
  • the tracking determining unit 33 is configured to determine a second area in the first image based on the first key point information; the second area is larger than the first area of the target object; One area includes the area where part of the limb of the target object is located; according to the second area, determine the third area in the second image corresponding to the position range of the second area; perform limb keying on the pixels in the third area in the second image Point detection processing to obtain the second key point information corresponding to the part of the limb.
  • the above-mentioned tracking determination unit 33 is configured to determine, according to the position range of the first key point information in the first image, the third area in the second image corresponding to that position range, and to perform limb key point detection processing on the pixels in the third area in the second image to obtain the second key point information corresponding to the part of the limb.
  • the limb detection module 321 is configured to use a limb detection network to perform limb detection processing on the target object in the first image; wherein, the limb detection network uses the first type of sample image Obtained by training; the detection frame of the target object is marked in the above-mentioned first-type sample image; the marking range of the detection frame includes the area where part of the limb of the target object is located.
  • the aforementioned limb key point detection module 322 is configured to use a limb key point detection network to perform limb key point detection processing on pixels corresponding to the first region; wherein, the aforementioned limb key point detection The network is trained by using the second type of sample image; the above-mentioned second type of sample image is marked with key points that include part of the body of the target object.
  • the part of the limbs of the target object includes at least one of the following: head, neck, shoulders, chest, waist, hips, arms, hands; the above-mentioned first key point information And the above-mentioned second key point information includes contour key point information and/or bone key point information of at least one of the head, neck, shoulders, chest, waist, hips, arms, and hands.
  • the above-mentioned apparatus further includes: an allocation unit 34 and a statistics unit 35; wherein,
  • the allocation unit 34 is configured to allocate a tracking identifier to the target object in response to the detection unit 32 obtaining the first key point information corresponding to a part of the limb of the target object;
  • the aforementioned statistical unit 35 is configured to determine the number of target objects in the multi-frame image based on the number of tracking identifiers allocated during the processing of the multi-frame image.
  • the above-mentioned apparatus further includes a determining unit 36 configured to determine the posture of the target object based on the second key point information, and to determine the interactive instruction corresponding to the target object based on the posture of the target object.
  • the acquisition unit 31, the detection unit 32 (including the limb detection module 321 and the limb key point detection module 322), the tracking determination unit 33, the allocation unit 34, the statistics unit 35, and the determination unit 36 in the above-mentioned image processing device can each be implemented by a Central Processing Unit (CPU), a Digital Signal Processor (DSP), a Microcontroller Unit (MCU), or a Field-Programmable Gate Array (FPGA).
  • when the image processing device provided in the foregoing embodiment performs image processing, the division into the foregoing program modules is used only as an example for illustration. In practical applications, the foregoing processing can be allocated to different program modules as needed; that is, the internal structure of the device can be divided into different program modules to complete all or part of the processing described above.
  • the image processing device provided in the foregoing embodiment and the image processing method embodiment belong to the same concept, and the specific implementation process is detailed in the method embodiment, which will not be repeated here.
  • FIG. 9 is a schematic diagram of the hardware composition structure of the electronic device of the embodiment of the disclosure. As shown in FIG. 9, the electronic device 40 may include a memory 42, a processor 41, and a computer program stored on the memory 42 and runnable on the processor 41; when the processor 41 executes the program, the steps of the image processing method of the embodiments of the present disclosure are realized.
  • the various components in the electronic device 40 may be coupled together through a bus system 43. It can be understood that the bus system 43 is used to implement connection and communication between these components. In addition to a data bus, the bus system 43 also includes a power bus, a control bus, and a status signal bus. However, for the sake of clear description, the various buses are all marked as the bus system 43 in FIG. 9.
  • the memory 42 may be a volatile memory or a non-volatile memory, and may also include both volatile and non-volatile memory.
  • the non-volatile memory can be a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), a Ferromagnetic Random Access Memory (FRAM), a Flash Memory, a magnetic surface memory, an optical disc, or a Compact Disc Read-Only Memory (CD-ROM); the magnetic surface memory can be a magnetic disk memory or a magnetic tape memory.
  • the volatile memory may be a random access memory (RAM, Random Access Memory), which is used as an external cache.
  • by way of example but not limitation, many forms of RAM are available, such as Static Random Access Memory (SRAM), Synchronous Static Random Access Memory (SSRAM), Dynamic Random Access Memory (DRAM), Synchronous Dynamic Random Access Memory (SDRAM), Double Data Rate Synchronous Dynamic Random Access Memory (DDR SDRAM), Enhanced Synchronous Dynamic Random Access Memory (ESDRAM), SyncLink Dynamic Random Access Memory (SLDRAM), and Direct Rambus Random Access Memory (DRRAM).
  • the memory 42 described in the embodiments of the present disclosure is intended to include, but is not limited to, these and any other suitable types of memory.
  • the methods disclosed in the foregoing embodiments of the present disclosure may be applied to the processor 41 or implemented by the processor 41.
  • the processor 41 may be an integrated circuit chip with signal processing capabilities. In the implementation process, the steps of the foregoing method can be completed by an integrated logic circuit of hardware in the processor 41 or instructions in the form of software.
  • the aforementioned processor 41 may be a general-purpose processor, a DSP, or other programmable logic devices, discrete gates or transistor logic devices, discrete hardware components, and the like.
  • the processor 41 may implement or execute various methods, steps, and logical block diagrams disclosed in the embodiments of the present disclosure.
  • the general-purpose processor may be a microprocessor or any conventional processor or the like.
  • the steps of the method disclosed in the embodiments of the present disclosure may be directly embodied as being executed and completed by a hardware decoding processor, or executed and completed by a combination of hardware and software modules in the decoding processor.
  • the software module may be located in a storage medium, and the storage medium is located in the memory 42.
  • the processor 41 reads the information in the memory 42 and completes the steps of the foregoing method in combination with its hardware.
  • the electronic device 40 may be implemented by one or more Application Specific Integrated Circuits (ASICs), DSPs, Programmable Logic Devices (PLDs), Complex Programmable Logic Devices (CPLDs), FPGAs, general-purpose processors, controllers, MCUs, microprocessors, or other electronic components, for executing the foregoing method.
  • the embodiment of the present disclosure also provides a computer-readable storage medium, such as a memory 42 including a computer program, which can be executed by the processor 41 of the electronic device 40 to complete the steps described in the foregoing method.
  • the computer-readable storage medium can be FRAM, ROM, PROM, EPROM, EEPROM, Flash Memory, magnetic surface memory, optical disc, or CD-ROM, etc.; it can also be any of various devices including one of, or any combination of, the above-mentioned memories, such as mobile phones, computers, tablet devices, personal digital assistants, etc.
  • the embodiment of the present disclosure also provides a computer-readable storage medium on which a computer program is stored, and when the program is executed by a processor, the steps of the image processing method according to the embodiment of the present disclosure are realized.
  • the embodiment of the present disclosure also provides a computer program that enables a computer to execute the steps of the image processing method described in the embodiment of the present disclosure.
  • the disclosed device and method may be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • the division of the units is only a logical function division, and there may be other divisions in actual implementation; for example, multiple units or components can be combined or integrated into another system, or some features can be ignored or not implemented.
  • the coupling, direct coupling, or communication connection between the components shown or discussed may be indirect coupling or communication connection through some interfaces, devices, or units, and may be in electrical, mechanical, or other forms.
  • the units described above as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • the functional units in the embodiments of the present disclosure can be all integrated into one processing unit, or each unit can be individually used as a unit, or two or more units can be integrated into one unit;
  • the unit can be implemented in the form of hardware, or in the form of hardware plus software functional units.
  • the foregoing program can be stored in a computer-readable storage medium; when the program is executed, the steps of the foregoing method embodiments are performed. The foregoing storage medium includes various media that can store program codes, such as a mobile storage device, ROM, RAM, magnetic disk, or optical disc.
  • if the aforementioned integrated unit of the present disclosure is implemented in the form of a software function module and sold or used as an independent product, it may also be stored in a computer-readable storage medium.
  • the computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute all or part of the methods described in the various embodiments of the present disclosure.
  • the aforementioned storage media include: removable storage devices, ROM, RAM, magnetic disks, or optical disks and other media that can store program codes.

Abstract

The embodiments of the present disclosure disclose an image processing method and apparatus, an electronic device, and a storage medium. Said method comprises: obtaining a plurality of frames of images; performing limb key point detection processing on a target object in a first image among the plurality of frames of images, so as to obtain first key point information corresponding to part of the limbs of the target object; and determining second key point information corresponding to the part of the limbs of the target object in a second image on the basis of the first key point information, wherein in the plurality of frames of images, the second image is a frame of image following the first image.

Description

图像处理方法、装置、电子设备和存储介质Image processing method, device, electronic equipment and storage medium
相关申请的交叉引用Cross-references to related applications
本公开基于申请号为202010357593.2、申请日为2020年04月29日的中国专利申请提出,并要求该中国专利申请的优先权,该中国专利申请的全部内容在此以引入方式并入本公开。The present disclosure is filed based on a Chinese patent application with an application number of 202010357593.2 and an application date of April 29, 2020, and claims the priority of the Chinese patent application. The entire content of the Chinese patent application is hereby incorporated into the present disclosure by way of introduction.
技术领域Technical field
本公开涉及计算机视觉技术领域,具体涉及一种图像处理方法、装置、电子设备和存储介质。The present disclosure relates to the field of computer vision technology, and in particular to an image processing method, device, electronic equipment, and storage medium.
背景技术Background technique
目标跟踪技术通常基于肢体检测算法和肢体关键点检测算法,利用肢体检测算法检测出的人体,以及肢体关键点检测算法检测出的人体关键点,实现目标跟踪。但是目前的肢体检测算法和肢体关键点检测算法无法适应只有上半身肢体的场景,从而导致只有上半身肢体的目标无法进行跟踪。Target tracking technology is usually based on a limb detection algorithm and a limb key point detection algorithm, using the human body detected by the limb detection algorithm and the human body key points detected by the limb key point detection algorithm to achieve target tracking. However, current limb detection algorithms and limb key point detection algorithms cannot adapt to scenes with only upper body limbs, which leads to the inability to track targets with only upper body limbs.
发明内容Summary of the invention
本公开实施例提供一种图像处理方法、装置、电子设备和存储介质。The embodiments of the present disclosure provide an image processing method, device, electronic equipment, and storage medium.
本公开实施例提供了一种图像处理方法,所述方法包括:获得多帧图像;对所述多帧图像中的第一图像中的目标对象进行肢体关键点检测处理,获得所述目标对象的部分肢体对应的第一关键点信息;基于所述第一关键点信息确定第二图像中的所述目标对象的所述部分肢体对应的第二关键点信息;其中,在所述多帧图像中,所述第二图像为所述第一图像后的一帧图像。The embodiment of the present disclosure provides an image processing method, the method includes: obtaining a multi-frame image; performing limb key point detection processing on a target object in a first image in the multi-frame image, to obtain an image of the target object The first key point information corresponding to the part of the limb; the second key point information corresponding to the part of the limb of the target object in the second image is determined based on the first key point information; wherein, in the multi-frame image , The second image is an image after the first image.
在本公开的一些可选实施例中,所述对所述多帧图像中的第一图像中的目标对象进行肢体关键点检测处理,获得所述目标对象的部分肢体对应的第一关键点信息,包括:对所述第一图像中的所述目标对象进行肢体检测处理,确定所述目标对象的第一区域;所述第一区域包括所述目标对象的部分肢体所在区域;对所述第一区域对应的像素点进行肢体关键点检测处理,获得所述目标对象的所述部分肢体对应的第一关键点信息。In some optional embodiments of the present disclosure, the limb key point detection processing is performed on the target object in the first image in the multi-frame image to obtain first key point information corresponding to part of the limb of the target object , Including: performing limb detection processing on the target object in the first image to determine a first area of the target object; the first area includes the area where part of the limb of the target object is located; The pixel points corresponding to a region are subjected to limb key point detection processing, and the first key point information corresponding to the part of the limb of the target object is obtained.
In some optional embodiments of the present disclosure, determining, based on the first key point information, the second key point information corresponding to the partial limb of the target object in the second image includes: determining a second region in the first image based on the first key point information, the second region being larger than the first region of the target object, the first region including the region where the partial limb of the target object is located; determining, according to the second region, a third region in the second image corresponding to the position range of the second region; and performing limb key point detection on pixels within the third region of the second image to obtain the second key point information corresponding to the partial limb.
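The region handling in this embodiment can be illustrated with a minimal sketch. The expansion margin and the clamping to the image bounds are illustrative assumptions, not values fixed by the disclosure, and the function names are hypothetical:

```python
def bounding_region(keypoints):
    """First region: tight box (x0, y0, x1, y1) around the partial-limb
    key point coordinates (x, y)."""
    xs = [x for x, y in keypoints]
    ys = [y for x, y in keypoints]
    return min(xs), min(ys), max(xs), max(ys)

def expand_region(region, margin, img_w, img_h):
    """Second region: enlarge the first region so the limb is likely still
    inside it in the next frame; the same coordinate range then defines the
    third region in the second image."""
    x0, y0, x1, y1 = region
    return (max(0, x0 - margin), max(0, y0 - margin),
            min(img_w, x1 + margin), min(img_h, y1 + margin))
```

Limb key point detection is then run only on the pixels inside the third region of the second image, rather than on the whole frame.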
In some optional embodiments of the present disclosure, determining, based on the first key point information, the second key point information corresponding to the partial limb of the target object in the second image includes: determining, according to the position range of the first key point information in the first image, a third region in the second image corresponding to that position range; and performing limb key point detection on pixels within the third region of the second image to obtain the second key point information corresponding to the partial limb.
In some optional embodiments of the present disclosure, performing limb detection on the target object in the first image includes: performing limb detection on the target object in the first image using a limb detection network, where the limb detection network is trained on a first type of sample images, each annotated with a detection box of a target object, the annotated range of the detection box including the region where the partial limb of the target object is located.
In some optional embodiments of the present disclosure, performing limb key point detection on the pixels corresponding to the first region includes: performing limb key point detection on the pixels corresponding to the first region using a limb key point detection network, where the limb key point detection network is trained on a second type of sample images annotated with key points that include the partial limb of the target object.
In some optional embodiments of the present disclosure, the partial limb of the target object includes at least one of: the head, neck, shoulders, chest, waist, hips, arms, or hands; the first key point information and the second key point information include contour key point information and/or skeleton key point information of at least one of the head, neck, shoulders, chest, waist, hips, arms, and hands.
In some optional embodiments of the present disclosure, the method further includes: in response to obtaining the first key point information corresponding to the partial limb of the target object, assigning a tracking identifier to the target object; and determining the number of target objects in the multiple frames of images based on the number of tracking identifiers assigned while processing the multiple frames.
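The identifier-based counting described above can be sketched as follows. This is a minimal illustration; the criterion that decides whether a detection belongs to a new target object (and so triggers a new identifier) is an assumption left to the matching logic:

```python
import itertools

class TrackerRegistry:
    """Assign a tracking identifier whenever first key point information is
    obtained for a new target object, and count target objects by the number
    of identifiers issued while processing the multiple frames."""
    def __init__(self):
        self._next_id = itertools.count(1)
        self.assigned = []

    def assign(self):
        # Called when first key point information is obtained for a new object.
        tid = next(self._next_id)
        self.assigned.append(tid)
        return tid

    def num_target_objects(self):
        return len(self.assigned)
```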
In some optional embodiments of the present disclosure, the method further includes: determining a posture of the target object based on the second key point information; and determining an interaction instruction corresponding to the target object based on the posture of the target object.
An embodiment of the present disclosure further provides an image processing apparatus. The apparatus includes an acquisition unit, a detection unit, and a tracking determination unit. The acquisition unit is configured to obtain multiple frames of images; the detection unit is configured to perform limb key point detection on a target object in a first image of the multiple frames to obtain first key point information corresponding to a partial limb of the target object; and the tracking determination unit is configured to determine, based on the first key point information, second key point information corresponding to the partial limb of the target object in a second image, where, among the multiple frames, the second image is an image following the first image.
In some optional embodiments of the present disclosure, the detection unit includes a limb detection module and a limb key point detection module. The limb detection module is configured to perform limb detection on the target object in the first image to determine a first region of the target object, the first region including the region where the partial limb of the target object is located; the limb key point detection module is configured to perform limb key point detection on pixels corresponding to the first region to obtain the first key point information corresponding to the partial limb of the target object.
In some optional embodiments of the present disclosure, the tracking determination unit is configured to: determine a second region in the first image based on the first key point information, the second region being larger than the first region of the target object, the first region including the region where the partial limb of the target object is located; determine, according to the second region, a third region in the second image corresponding to the position range of the second region; and perform limb key point detection on pixels within the third region of the second image to obtain the second key point information corresponding to the partial limb.
In some optional embodiments of the present disclosure, the tracking determination unit is configured to: determine, according to the position range of the first key point information in the first image, a third region in the second image corresponding to that position range; and perform limb key point detection on pixels within the third region of the second image to obtain the second key point information corresponding to the partial limb.
In some optional embodiments of the present disclosure, the limb detection module is configured to perform limb detection on the target object in the first image using a limb detection network, where the limb detection network is trained on a first type of sample images, each annotated with a detection box of a target object, the annotated range of the detection box including the region where the partial limb of the target object is located.
In some optional embodiments of the present disclosure, the limb key point detection module is configured to perform limb key point detection on the pixels corresponding to the first region using a limb key point detection network, where the limb key point detection network is trained on a second type of sample images annotated with key points that include the partial limb of the target object.
In some optional embodiments of the present disclosure, the partial limb of the target object includes at least one of: the head, neck, shoulders, chest, waist, hips, arms, or hands; the first key point information and the second key point information include contour key point information and/or skeleton key point information of at least one of the head, neck, shoulders, chest, waist, hips, arms, and hands.
In some optional embodiments of the present disclosure, the apparatus further includes an allocation unit and a statistics unit. The allocation unit is configured to assign a tracking identifier to the target object in response to the detection unit obtaining the first key point information corresponding to the partial limb of the target object; the statistics unit is configured to determine the number of target objects in the multiple frames of images based on the number of tracking identifiers assigned while processing the multiple frames.
In some optional embodiments of the present disclosure, the apparatus further includes a determination unit configured to determine a posture of the target object based on the second key point information, and to determine an interaction instruction corresponding to the target object based on the posture of the target object.
An embodiment of the present disclosure further provides a computer-readable storage medium on which a computer program is stored; when the program is executed by a processor, the steps of the image processing method described in the embodiments of the present disclosure are implemented.
An embodiment of the present disclosure further provides an electronic device including a memory, a processor, and a computer program stored in the memory and executable on the processor; when executing the program, the processor implements the steps of the image processing method described in the embodiments of the present disclosure.
An embodiment of the present disclosure further provides a computer program that causes a computer to execute the image processing method described in the embodiments of the present disclosure.
With the image processing method and apparatus, electronic device, and storage medium provided by the embodiments of the present disclosure, the key points of a partial limb of a target object are recognized in a first image of the multiple frames to be processed, and the partial-limb key points of the target object in a subsequent second image are determined based on the recognized key points, thereby enabling target tracking in scenes where only a partial limb (for example, the upper body) of the target object appears in the image.
Description of the Drawings
FIG. 1 is a first schematic flowchart of an image processing method according to an embodiment of the present disclosure;
FIG. 2 is a schematic flowchart of a limb key point detection method in an image processing method according to an embodiment of the present disclosure;
FIG. 3 is a schematic flowchart of a limb key point tracking method in an image processing method according to an embodiment of the present disclosure;
FIG. 4 is a second schematic flowchart of an image processing method according to an embodiment of the present disclosure;
FIG. 5 is a first schematic structural diagram of an image processing apparatus according to an embodiment of the present disclosure;
FIG. 6 is a second schematic structural diagram of an image processing apparatus according to an embodiment of the present disclosure;
FIG. 7 is a third schematic structural diagram of an image processing apparatus according to an embodiment of the present disclosure;
FIG. 8 is a fourth schematic structural diagram of an image processing apparatus according to an embodiment of the present disclosure;
FIG. 9 is a schematic diagram of the hardware structure of an electronic device according to an embodiment of the present disclosure.
Detailed Description
The present disclosure is described in further detail below with reference to the drawings and specific embodiments.
In the following description, specific details such as particular system structures, interfaces, and techniques are set forth for purposes of illustration rather than limitation, in order to provide a thorough understanding of the present application.
As used herein, the term "and/or" merely describes an association between associated objects and indicates that three relationships are possible; for example, "A and/or B" can mean: A alone, both A and B, or B alone. In addition, the character "/" herein generally indicates an "or" relationship between the objects it connects. Furthermore, "multiple" herein means two or more.
An embodiment of the present disclosure provides an image processing method. FIG. 1 is a first schematic flowchart of the image processing method according to an embodiment of the present disclosure; as shown in FIG. 1, the method includes:
Step 101: obtain multiple frames of images;
Step 102: perform limb key point detection on a target object in a first image of the multiple frames to obtain first key point information corresponding to a partial limb of the target object;
Step 103: determine, based on the first key point information, second key point information corresponding to the partial limb of the target object in a second image, where, among the multiple frames, the second image is an image following the first image.
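The three steps above can be sketched as a minimal tracking loop. The functions `detect_keypoints` and `track_keypoints`, standing in for the limb key point detection and tracking described in this embodiment, are hypothetical placeholders:

```python
def process_frames(frames, detect_keypoints, track_keypoints):
    """Detect partial-limb key points in the first frame, then propagate
    them to each subsequent frame instead of re-detecting from scratch."""
    keypoints = detect_keypoints(frames[0])           # first key point information
    results = [keypoints]
    for frame in frames[1:]:                          # each "second image"
        keypoints = track_keypoints(frame, keypoints) # second key point information
        results.append(keypoints)
    return results
```

The second key point information of one frame serves as the first key point information for the next, which is what makes this a tracking procedure rather than independent per-frame detection.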
The image processing method of this embodiment may be applied to an image processing apparatus. The image processing apparatus may be provided in an electronic device with processing capability, such as a personal computer or a server, or may be implemented by a processor executing a computer program.
In this embodiment, the multiple frames of images may be continuous video captured by a camera built into or connected to the electronic device, or video received from another electronic device. In some application scenarios, the multiple frames may be surveillance video captured by a surveillance camera, so that each target object in the surveillance video can be tracked. In other application scenarios, the multiple frames may be video stored locally or in another video library, so that each target object in the video can be tracked. In still other application scenarios, the image processing method of this embodiment may be applied to virtual reality (VR), augmented reality (AR), somatosensory games, and the like; the multiple frames may then be images of an operator captured in a VR or AR scene, where recognizing the operator's posture in the images allows the actions of virtual objects in the VR or AR scene to be controlled, or they may be captured images of the target objects participating in a somatosensory game (such as multiple users).
In some application scenarios, the image processing apparatus may establish communication connections with one or more surveillance cameras and obtain, in real time, the surveillance video captured by those cameras as the multiple frames of images to be processed. In other application scenarios, the image processing apparatus may obtain video from its own storage, or from video stored on other electronic devices, as the multiple frames to be processed. In still other application scenarios, the image processing apparatus may be placed in a game device; while the processor of the game device executes a computer program that realizes the game operator's interaction, the displayed output images serve as the multiple frames to be processed, and the target object in the images (corresponding to the game operator) is tracked.
In this embodiment, the multiple frames of images to be processed may include one or more target objects. In some application scenarios, a target object may be a real person; in other application scenarios, a target object may be another object determined according to actual tracking needs, such as a virtual character or another virtual object.
In this embodiment, each of the multiple frames may be called a frame image, the smallest unit that makes up the video (that is, the images to be processed). It can be understood that the multiple frames are a set of temporally continuous frame images, arranged according to their capture times, so that the time parameters of successive frame images are continuous.
For example, taking a real person as the target object, when the multiple frames include a target object, one or more target objects may appear throughout the time range covered by the multiple frames, or only within part of that time range; this is not limited in this embodiment.
In this embodiment, the first image is any one of the multiple frames, and the second image is an image following the first image; in other words, the first image is any frame among the multiple frames that precedes the second image. In some optional embodiments, the second image may be the frame immediately following the first image. For example, if the multiple frames include 10 frames and the first image is the 2nd frame, the second image is the 3rd frame. In other optional embodiments, the second image may be a frame separated from the first image by a preset number of frames. For example, if the multiple frames include 20 frames, the first image is the 2nd frame, and the preset number is 3 frames, the second image may be the 6th frame. The preset number may be set in advance according to the actual situation, for example according to the moving speed of the target object. This implementation effectively reduces the amount of data to be processed and thus the load on the image processing apparatus.
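As a concrete illustration of the frame-gap option above, the index of the second image can be computed from the index of the first image and the preset number of skipped frames (the function name is a hypothetical helper, not part of the disclosure):

```python
def next_tracked_frame(current_index, gap):
    """Index of the second image when `gap` frames lie between the first
    and the second image; gap=0 gives the immediately following frame."""
    return current_index + gap + 1
```

With the numbers from the text: frame 2 with a gap of 3 gives frame 6, and frame 2 with a gap of 0 gives frame 3.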
In this embodiment, the image processing apparatus may perform limb key point detection on the target object in the first image through a limb key point detection network to obtain the first key point information corresponding to the partial limb of the target object. In this embodiment, the partial limb of the target object includes at least one of the following: the head, neck, shoulders, chest, waist, hips, arms, and hands. Correspondingly, the first key point information and the second key point information corresponding to the partial limb include contour key point information and/or skeleton key point information of at least one of the head, neck, shoulders, chest, waist, hips, arms, and hands of the target object.
Illustratively, in this embodiment the partial limb of the target object is the upper body of the target object, so that target objects whose upper bodies appear in the multiple frames can be recognized, enabling the tracking of target objects that show only the upper body as well as those that show the whole body.
Illustratively, the key points corresponding to the first key point information and the second key point information may include: at least one key point of the head, at least one key point of the shoulders, at least one key point of the arms, at least one key point of the chest, at least one key point of the hips, and at least one key point of the waist; optionally, they may also include at least one key point of the hands. Whether the image processing apparatus can obtain hand key points depends on whether hand key points were annotated in the sample images used to train the limb key point detection network; if they were, the network can detect the hand key points.
In some optional embodiments, when the partial limb of the target object includes the head, the first key point information and the second key point information may include key point information of at least one organ, which may include at least one of the following: nose key point information, eyebrow-center key point information, and mouth key point information.
In some optional embodiments, when the partial limb of the target object includes an arm, the first key point information and the second key point information may include elbow key point information.
In some optional embodiments, when the partial limb of the target object includes a hand, the first key point information and the second key point information may include wrist key point information. Optionally, they may also include contour key point information of the hand.
In some optional embodiments, when the partial limb of the target object includes the hips, the first key point information and the second key point information may include left-hip key point information and right-hip key point information. Optionally, they may also include spine-root key point information.
The first key point information may specifically include the coordinates of key points, that is, the coordinates of contour key points and/or the coordinates of skeleton key points. It can be understood that the coordinates of the contour key points trace the contour edge of the corresponding partial limb, and the coordinates of the skeleton key points form the skeleton of the corresponding partial limb.
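Forming a skeleton from skeleton key point coordinates can be sketched as pairing up connected joints. The joint names and the bone list below are illustrative assumptions; the disclosure does not fix a particular skeleton topology:

```python
def skeleton_segments(keypoints, bones):
    """Build the partial-limb skeleton as line segments between key point
    coordinates; `keypoints` maps joint name -> (x, y), and `bones` lists
    (joint_a, joint_b) name pairs to connect."""
    return [(keypoints[a], keypoints[b]) for a, b in bones]
```

For an upper-body skeleton, `bones` might connect the neck to each shoulder and each shoulder to the corresponding elbow, with the resulting segments drawn or used for posture analysis.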
FIG. 2 is a schematic flowchart of the limb key point detection method in the image processing method according to an embodiment of the present disclosure. In some optional embodiments, as shown in FIG. 2, step 102 includes:
Step 1021: perform limb detection on the target object in the first image to determine a first region of the target object, the first region including the region where the partial limb of the target object is located;
Step 1022: perform limb key point detection on pixels corresponding to the first region to obtain the first key point information corresponding to the partial limb of the target object.
In this embodiment, limb detection is first performed on each target object in the first image to determine the first region of each target object; for example, the first region corresponding to each target object's upper body, or to each target object's whole body, may be determined. In practice, the first region corresponding to the partial limb may be represented by a detection box (for example, a rectangular box) that marks the target object; for example, the upper body of each person in the first image may be marked by a rectangular box.
In some optional embodiments, performing limb detection on the target object in the first image includes: performing limb detection on the target object in the first image using a limb detection network, where the limb detection network is trained on a first type of sample images annotated with detection boxes of target objects, the annotated range of each detection box including the region where the partial limb of the target object is located; the partial limb may be the upper body of the target object.
In this embodiment, limb detection may be performed on the first image through a pre-trained limb detection network to determine the first region of the target object, that is, to obtain the detection box of each target object in the first image. A detection box may mark part or all of a target object's limbs; in other words, the limb detection network can detect either the whole body or the upper body of a target object. The limb detection network may adopt any network structure capable of detecting the limbs of a target object, which is not limited in this embodiment.
Illustratively, taking the detection of partial-limb detection boxes through the limb detection network as an example, features may be extracted from the first image through the limb detection network; based on the extracted features, the center point of each target object's partial limb and the height and width of the corresponding detection box are determined, and the detection box of each target object's partial limb is then determined from that center point and the corresponding height and width.
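Deriving the detection box from the predicted center point and box size is a simple coordinate conversion; the sketch below assumes a corner-coordinate box convention (x0, y0, x1, y1), which the disclosure does not prescribe:

```python
def box_from_center(cx, cy, w, h):
    """Detection box (x0, y0, x1, y1) from the predicted center point
    (cx, cy) of the partial limb and the predicted box width and height."""
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)
```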
In this embodiment, the limb detection network may be trained on a first type of sample images annotated with detection boxes of target objects, the annotated range of each detection box including a partial limb of the target object. It can be understood that the first type of sample images may be annotated with detection boxes of only partial limbs (for example, the upper bodies) of target objects, or with detection boxes of complete limbs. Illustratively, taking annotated partial-limb detection boxes as an example, the limb detection network extracts feature data from a first-type sample image, determines from the feature data the predicted center point of each target object's partial limb and the height and width of the corresponding predicted detection box, and determines each partial limb's predicted detection box from the predicted center point and the corresponding height and width; a loss is then computed from the predicted detection boxes and the annotated detection boxes, and the network parameters of the limb detection network are adjusted based on the loss.
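The loss-driven parameter adjustment can be illustrated with a toy sketch. Here the "network parameters" are simply the four box coordinates, nudged toward the annotated box under an L1-style loss; real training would instead backpropagate the loss through the detection network's weights, and the disclosure does not specify the loss function:

```python
def train_box_regressor(gt_box, lr=0.1, steps=200):
    """Toy illustration of adjusting parameters to reduce the loss between a
    predicted and an annotated detection box (sign-of-gradient descent on an
    L1 loss). Returns the fitted (x0, y0, x1, y1) parameters."""
    params = [0.0, 0.0, 1.0, 1.0]  # initial "predicted" box
    for _ in range(steps):
        # Gradient of the L1 loss w.r.t. each coordinate is its error sign.
        grads = [1.0 if p > g else -1.0 if p < g else 0.0
                 for p, g in zip(params, gt_box)]
        params = [p - lr * g for p, g in zip(params, grads)]
    return params
```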
In some optional embodiments, performing limb key point detection processing on the pixel points corresponding to the first region includes: performing limb key point detection processing on the pixel points corresponding to the first region by using a limb key point detection network, where the limb key point detection network is obtained by training with second-type sample images; the second-type sample images are annotated with key points of target objects; and the annotation range of the key points includes a part of the limb of the target object.

In this embodiment, limb key point detection may be performed on the pixel points corresponding to the first region through a pre-trained limb key point detection network, to determine the first key point information of the part of the limb of each target object. Exemplarily, the first region may include a part of the limbs of the target object; the pixel points corresponding to the detection frame of each target object may be input to the limb key point detection network, to obtain the first key point information corresponding to the part of the limb of each target object. The limb key point detection network may adopt any network structure capable of detecting limb key points, which is not limited in this embodiment.

In this embodiment, the limb key point detection network may be obtained by training with second-type sample images annotated with the key points of target objects, where the annotation range of the key points includes a part of the limb of the target object. It can be understood that a second-type sample image may be annotated with the key points of only a part of the limb of the target object (for example, the upper-body limbs of the target object), or may be annotated with the key points of the complete limbs of the target object. Exemplarily, taking the case where a second-type sample image is annotated with the key points of a part of the limb of the target object as an example, the limb key point detection network may be used to extract feature data of the second-type sample images, and the predicted key points of the part of the limb of each target object in the second-type sample images are determined based on the feature data; a loss is determined based on the predicted key points and the annotated key points, and the network parameters of the limb key point detection network are adjusted based on the loss.
FIG. 3 is a schematic flowchart of a limb key point tracking method in the image processing method according to an embodiment of the present disclosure. In some optional embodiments, step 103 may be implemented as shown in FIG. 3; the method includes:

Step 1031: determining a second region in the first image based on the first key point information, where the second region is larger than the first region of the target object, and the first region includes the region where the part of the limb of the target object is located;

Step 1032: determining, according to the second region, a third region in the second image corresponding to the position range of the second region;

Step 1033: performing limb key point detection processing on the pixel points in the third region in the second image, to obtain second key point information corresponding to the part of the limb.
In this embodiment, for one target object in the first image, a region is determined based on the first key point information of the part of the limb of that target object; the region may be the smallest region containing all the key points of the part of the limb of the target object. Exemplarily, if the region is a rectangular region, the rectangular region is the smallest region containing all the key points of the part of the limb of the target object. The second region is then a region obtained by enlarging the first region within the first image.

Exemplarily, taking the case where the first region is a rectangle as an example, and assuming that the height of the first region is H and the width is W, the four sides of the region may be extended away from its center point: in the height direction, each side extends by H/4 away from the center point, and in the width direction, each side extends by W/4 away from the center point. The second region can then be represented by a rectangular region in the first image that is centered on the above center point, with a height of 3H/2 and a width of 3W/2.

In this embodiment, the third region in the second image corresponding to the position range of the second region in the first image can then be determined according to that position range.
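The enlargement described above can be sketched as follows: each edge of the first region is pushed away from the center by a quarter of the region's height or width, which yields the 3H/2 by 3W/2 rectangle in the example. The corner-based rectangle representation is an illustrative assumption:

```python
def expand_region(x_min: float, y_min: float, x_max: float, y_max: float):
    """Enlarge a rectangle about its center: each edge moves away from
    the center by H/4 (vertically) or W/4 (horizontally), so the result
    has height 3H/2 and width 3W/2."""
    h = y_max - y_min
    w = x_max - x_min
    return (x_min - w / 4, y_min - h / 4, x_max + w / 4, y_max + h / 4)

# A first region with W = 40 and H = 60:
print(expand_region(10, 20, 50, 80))
# → (0.0, 5.0, 60.0, 95.0), i.e. width 60 = 3W/2 and height 90 = 3H/2
```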
In some optional embodiments, determining, according to the second region, the third region in the second image corresponding to the position range of the second region may further include: performing limb key point detection processing on the pixel points corresponding to the second region, to obtain third key point information; and determining the position range of the third key point information in the first image, and determining, based on the position range, the third region in the second image corresponding to the position range.

Exemplarily, in this embodiment, the limb key point detection network is still used to perform limb key point detection processing on the pixel points corresponding to the second region. The pixel points corresponding to the enlarged second region in the first image may be used as input data of the limb key point detection network, and the third key point information is output; the third key point information serves as predicted key point information of the target object in the second image. That is, in the embodiments of the present application, the region where the target object is located in the previous frame of image is enlarged (for example, the region where the part of the limb of the target object is located in the previous frame of image is enlarged), limb key point detection is performed on the enlarged region, and the obtained key points are used as the predicted key points, corresponding to the target object (for example, the part of the limb of the target object), in the frame of image (i.e., the second image) following the current frame of image (i.e., the first image). Further, based on the predicted position range, limb key point detection processing is performed on the pixel points corresponding to the third region in the second image, and the detected key point information is the second key point information corresponding to the part of the limb of the target object.
In some optional embodiments, step 103 may further include: determining, according to the position range of the first key point information in the first image, a third region in the second image corresponding to the position range; and performing limb key point detection processing on the pixel points in the third region in the second image, to obtain the second key point information corresponding to the part of the limb.

In this embodiment, the third region in the second image corresponding to the position range of the first key points in the first image can be determined according to that position range. Limb key point detection processing is further performed on the pixel points corresponding to the third region in the second image, and the detected key point information is the second key point information corresponding to the part of the limb of the target object.
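Determining the position range of key point information amounts to taking the smallest axis-aligned rectangle that contains all detected key points. A minimal sketch, in which the list-of-coordinate-pairs representation is an assumption:

```python
def keypoint_bbox(keypoints):
    """Smallest axis-aligned rectangle (x_min, y_min, x_max, y_max)
    containing all key points, each given as an (x, y) pair."""
    xs = [x for x, _ in keypoints]
    ys = [y for _, y in keypoints]
    return (min(xs), min(ys), max(xs), max(ys))

# Three detected limb key points:
print(keypoint_bbox([(12, 40), (30, 22), (25, 55)]))
# → (12, 22, 30, 55)
```

The resulting rectangle defines the position range that is carried over to the second image as the third region.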
In some other optional embodiments, step 103 may further include: determining a predicted region of the target object in the second image based on the first image, the first region of the target object, and a target tracking network; and performing limb key point detection processing based on the pixel points in the predicted region in the second image, to obtain the second key point information corresponding to the part of the limb of the target object. The target tracking network is obtained by training with multiple frames of sample images; the multiple frames of sample images include at least a first sample image and a second sample image, where the second sample image is a frame of image following the first sample image; the position of the target object is annotated in the first sample image, and the position of the target object is annotated in the second sample image. Exemplarily, the detection frame of the target object is annotated in each of the multiple frames of sample images, and the position of the target object in a sample image is represented by the detection frame; the annotation range of the detection frame includes the region where the part of the limb of the target object is located; the part of the limb of the target object may be the upper-body limbs of the target object.

In this embodiment, the predicted position of the target object in the next frame of image (i.e., the second image) can be determined through a pre-trained target tracking network, using the previous frame of image (i.e., the first image) and the position of the target object in that image. Exemplarily, the first image containing the detection frame of the target object may be input to the target tracking network, to obtain the predicted position of the target object in the second image; limb key point detection processing is then performed on the pixel points at the predicted position in the second image, to obtain the second key point information of the part of the limb of the target object in the second image. The target tracking network may adopt any network structure capable of realizing target tracking, which is not limited in this embodiment.

In this embodiment, the target tracking network may be obtained by training with multiple frames of sample images annotated with the position of the target object (for example, a detection frame containing the target object, or a detection frame containing a part of the limb of the target object). Exemplarily, taking the case where the multiple frames of sample images include at least the first sample image and the second sample image as an example, the target tracking network may be used to process the first sample image, in which the position of the target object is annotated; the processing result is the predicted position of the target object in the second sample image; a loss can then be determined according to the predicted position and the annotated position of the target object in the second sample image, and the network parameters of the target tracking network are adjusted based on the loss.
It should be noted that, after the second key point information corresponding to the part of the limb of the target object in the second image is determined based on the first key point information, the key point information corresponding to the part of the limb of the target object in a subsequent image may be further determined based on the second key point information corresponding to the part of the limb of the target object in the second image, and so on, until the key point information corresponding to the part of the limb of the target object cannot be detected in the next frame of image. At this point, it can be concluded that the multiple frames of images to be processed no longer include the target object, that is, the target object has moved out of the field of view of the multiple frames of images to be processed.

In some optional embodiments, the image processing apparatus may also perform limb detection for the target object in each frame of image, to obtain the region where the target object in each frame of image is located, and take the detected target object as a tracked object, so that it can be determined whether a new target object appears in the current frame of image. In the case where a new target object appears in the current frame of image, the new target object is taken as a tracked object, and limb key point detection processing is performed on the pixel points in the first region corresponding to the new target object, that is, the processing of step 103 in the embodiments of the present disclosure is executed for the new target object. Exemplarily, the image processing apparatus may execute the limb detection processing for the target object in the image at preset time intervals or every preset number of image frames, so as to detect at regular intervals whether a new target object appears in the image and to track the new target object.
In some optional embodiments of the present disclosure, the method further includes: in response to obtaining the first key point information corresponding to the part of the limb of the target object, assigning a tracking identifier to the target object; and determining the number of target objects in the multiple frames of images based on the number of tracking identifiers assigned during the processing of the multiple frames of images.

In this embodiment, when the image processing apparatus detects the target object in the first frame of the multiple frames of images to be processed, that is, when the first key point information corresponding to the part of the limb of the target object is obtained, a tracking identifier is assigned to the target object; the tracking identifier is associated with the target object until the target object can no longer be tracked during the process of tracking the target object.

In some optional embodiments, the image processing apparatus may also perform limb detection for the target object in each frame of image, obtain the region corresponding to the part of the limb of the target object in each frame of image, and take the detected target object as a tracked object. On this basis, the image processing apparatus detects the first frame of the images to be processed and assigns a tracking identifier to the detected target object. Thereafter, the tracking identifier keeps following the target object until the target object can no longer be tracked. If a new target object is detected in a certain frame of image, a tracking identifier is assigned to the new target object, and the above scheme is repeated. It can be understood that target objects detected at the same moment correspond to different tracking identifiers; a target object tracked over a continuous time range corresponds to the same tracking identifier; and target objects detected separately in non-continuous time ranges correspond to different tracking identifiers.
For example, if three target objects are detected in a certain frame of image, a tracking identifier is assigned to each of the three target objects, with each target object corresponding to one tracking identifier.

For another example, for 5 minutes of multi-frame images, three target objects are detected within the first minute, and a tracking identifier is assigned to each of the three target objects, which may be recorded as, for example, identifier 1, identifier 2, and identifier 3. Within the second minute, the first of the three target objects disappears; during that minute there are only two target objects, whose corresponding tracking identifiers are identifier 2 and identifier 3. Within the third minute, the first target object appears in the image again, that is, a new target object is detected compared with the preceding images; although this target object is one that already appeared within the first minute (i.e., the first target object), it is still assigned identifier 4 as its tracking identifier, and so on.

On this basis, the technical solution of this embodiment can determine the number of target objects that have appeared in the multiple frames of images based on the number of tracking identifiers assigned during the processing of the multiple frames of images. Exemplarily, the number of target objects that have appeared in the multiple frames of images refers to the number of times target objects have appeared within the time range corresponding to the multiple frames of images.
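The identifier-assignment scheme described above, where every newly detected or re-appearing target receives a fresh identifier and the total count of identifiers gives the number of appearances, can be sketched as follows; the class and method names are illustrative assumptions:

```python
import itertools

class TrackingIdAllocator:
    """Assigns a fresh tracking identifier whenever a target is first
    detected or re-appears after being lost, per the scheme above."""

    def __init__(self):
        self._next_id = itertools.count(1)
        self.active = {}  # currently tracked target -> tracking identifier
        self.total = 0    # number of identifiers ever assigned

    def on_detected(self, target):
        """A target newly appears in the frame: assign a new identifier."""
        tid = next(self._next_id)
        self.active[target] = tid
        self.total += 1
        return tid

    def on_lost(self, target):
        """The target can no longer be tracked: retire its identifier."""
        self.active.pop(target, None)

# Reproducing the 5-minute example from the text:
tracker = TrackingIdAllocator()
first_minute = [tracker.on_detected(t) for t in ("A", "B", "C")]  # ids 1, 2, 3
tracker.on_lost("A")                   # second minute: target A disappears
reassigned = tracker.on_detected("A")  # third minute: A re-appears -> new id 4
print(first_minute, reassigned, tracker.total)  # [1, 2, 3] 4 4
```

Note that `total` counts appearances rather than distinct physical targets, matching the counting semantics stated above.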
With the technical solutions of the embodiments of the present disclosure, the key points of the part of the limb of the target object in the first image among the multiple frames of images to be processed are recognized, and the key points of the part of the limb of the target object in the subsequent second image are determined based on the recognized key points of the part of the limb, thereby realizing target tracking in scenes where only a part of the limb of the target object (for example, the upper body) is present in the image. That is, the technical solutions of the embodiments of the present disclosure can adapt to both complete-limb scenes and partial-limb (for example, upper-body) scenes, realizing target tracking in images.
An embodiment of the present disclosure further provides an image processing method. FIG. 4 is a second schematic flowchart of the image processing method according to an embodiment of the present disclosure; as shown in FIG. 4, the method includes:

Step 201: obtaining multiple frames of images;

Step 202: performing limb key point detection processing on a target object in a first image among the multiple frames of images, to obtain first key point information corresponding to a part of the limb of the target object;

Step 203: determining, based on the first key point information, second key point information corresponding to the part of the limb of the target object in a second image, where, among the multiple frames of images, the second image is a frame of image following the first image;

Step 204: determining the posture of the target object based on the second key point information, and determining an interaction instruction corresponding to the target object based on the posture of the target object.
For a detailed description of steps 201 to 203 of this embodiment, reference may be made to the description of steps 101 to 103, which will not be repeated here.

In this embodiment, the posture of the target object can be determined through the tracked target object and further based on the second key point information of the target object, and the interaction instruction corresponding to each posture can be determined based on the posture of the target object. Afterwards, the interaction instruction corresponding to each posture is responded to.

This embodiment is applicable to action interaction scenarios. The image processing apparatus can determine the corresponding interaction instruction based on each posture and respond to the interaction instruction. Responding to the interaction instruction may be, for example, enabling or disabling certain functions of the image processing apparatus itself or of the electronic device in which the image processing apparatus is located; alternatively, responding to the interaction instruction may also be sending the interaction instruction to another electronic device, which receives the interaction instruction and enables or disables certain functions based on it. In other words, the interaction instruction may also be used to enable or disable corresponding functions of other electronic devices.

This embodiment is also applicable to various application scenarios such as virtual reality, augmented reality, or somatosensory games. The image processing apparatus can execute corresponding processing based on various interaction instructions, including but not limited to: in a virtual reality or augmented reality scenario, controlling a virtual object to execute a corresponding action; and in a somatosensory game scenario, controlling the virtual character corresponding to the target object to execute a corresponding action. In some examples, if the method is applied to scenarios such as augmented reality or virtual reality, the corresponding processing executed by the image processing apparatus based on the interaction instruction may include controlling a virtual target object to execute, in a real scene or a virtual scene, an action corresponding to the interaction instruction.

With the technical solutions of the embodiments of the present disclosure, on the one hand, target tracking is realized in scenes where only a part of the limb of the target object (for example, the upper body) is present in the image; that is, the technical solutions of the embodiments of the present disclosure can adapt to both complete-limb scenes and partial-limb (for example, upper-body) scenes, realizing target tracking in images. On the other hand, the key point information of the tracked target object is detected during the target tracking process, the posture of the tracked target object is determined based on the key point information of the target object, and the corresponding interaction instruction is determined based on the posture of the target object, thereby realizing human-computer interaction in specific application scenarios (for example, interactive scenarios such as virtual reality scenarios, augmented reality scenarios, and somatosensory game scenarios) and improving the user's interaction experience.
An embodiment of the present disclosure further provides an image processing apparatus. FIG. 5 is a first schematic structural diagram of the image processing apparatus according to an embodiment of the present disclosure; as shown in FIG. 5, the apparatus includes: an acquisition unit 31, a detection unit 32, and a tracking determination unit 33, where:

the acquisition unit 31 is configured to obtain multiple frames of images;

the detection unit 32 is configured to perform limb key point detection processing on a target object in a first image among the multiple frames of images, to obtain first key point information corresponding to a part of the limb of the target object; and

the tracking determination unit 33 is configured to determine, based on the first key point information, second key point information corresponding to the part of the limb of the target object in a second image, where, among the multiple frames of images, the second image is a frame of image following the first image.
In some optional embodiments of the present disclosure, as shown in FIG. 6, the detection unit 32 includes: a limb detection module 321 and a limb key point detection module 322, where:

the limb detection module 321 is configured to perform limb detection processing on the target object in the first image, to determine a first region of the target object, where the first region includes the region where the part of the limb of the target object is located; and

the limb key point detection module 322 is configured to perform limb key point detection processing on the pixel points corresponding to the first region, to obtain the first key point information corresponding to the part of the limb of the target object.

In some optional embodiments of the present disclosure, the tracking determination unit 33 is configured to: determine a second region in the first image based on the first key point information, where the second region is larger than the first region of the target object, and the first region includes the region where the part of the limb of the target object is located; determine, according to the second region, a third region in the second image corresponding to the position range of the second region; and perform limb key point detection processing on the pixel points in the third region in the second image, to obtain the second key point information corresponding to the part of the limb.

In some optional embodiments of the present disclosure, the tracking determination unit 33 is configured to: determine, according to the position range of the first key point information in the first image, a third region in the second image corresponding to the position range; and perform limb key point detection processing on the pixel points in the third region in the second image, to obtain the second key point information corresponding to the part of the limb.

In some optional embodiments of the present disclosure, the limb detection module 321 is configured to perform limb detection processing on the target object in the first image by using a limb detection network, where the limb detection network is obtained by training with first-type sample images; the first-type sample images are annotated with detection frames of target objects; and the annotation range of a detection frame includes the region where the part of the limb of the target object is located.

In some optional embodiments of the present disclosure, the limb key point detection module 322 is configured to perform limb key point detection processing on the pixel points corresponding to the first region by using a limb key point detection network, where the limb key point detection network is obtained by training with second-type sample images; and the second-type sample images are annotated with key points that include the part of the limb of the target object.
在本公开的一些可选实施例中,上述目标对象的部分肢体包括以下至少之一:头部、颈部、肩部、胸部、腰部、髋部、手臂、手部;上述第一关键点信息和上述第二关键点信息包括头部、颈部、肩部、胸部、腰部、髋部、手臂和手部中的至少一个肢体的轮廓关键点信息和/骨骼关键点信息。In some optional embodiments of the present disclosure, the part of the limbs of the target object includes at least one of the following: head, neck, shoulders, chest, waist, hips, arms, hands; the above-mentioned first key point information And the above-mentioned second key point information includes contour key point information and/or bone key point information of at least one of the head, neck, shoulders, chest, waist, hips, arms, and hands.
在本公开的一些可选实施例中,如图7所示,上述装置还包括:分配单元34和统计单元35;其中,In some optional embodiments of the present disclosure, as shown in FIG. 7, the above-mentioned apparatus further includes: an allocation unit 34 and a statistics unit 35; wherein,
上述分配单元34,配置为响应于上述检测单元32获得目标对象的部分肢体对应的第一关键点信息的情况,为目标对象分配跟踪标识;The allocation unit 34 is configured to allocate a tracking identifier to the target object in response to the detection unit 32 obtaining the first key point information corresponding to a part of the limb of the target object;
上述统计单元35,配置为基于对多帧图像的处理过程中分配的跟踪标识的数量,确定多帧图像中的目标对象的数量。The aforementioned statistical unit 35 is configured to determine the number of target objects in the multi-frame image based on the number of tracking identifiers allocated during the processing of the multi-frame image.
在本公开的一些可选实施例中，如图8所示，上述装置还包括确定单元36，配置为基于第二关键点信息确定目标对象的姿态；基于目标对象的姿态确定对应于目标对象的交互指令。In some optional embodiments of the present disclosure, as shown in FIG. 8, the above apparatus further includes a determining unit 36, configured to determine the posture of the target object based on the second key point information, and determine an interaction instruction corresponding to the target object based on the posture of the target object.
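To make the posture-to-instruction mapping of the determining unit concrete, here is a deliberately simplified rule-based sketch. Image coordinates grow downward, so a hand key point above the head has a smaller y value; the pose names and instruction table are invented for illustration:

```python
def classify_pose(keypoints):
    """Toy posture rule: a hand key point above the head -> 'hands_up'."""
    head_y = keypoints["head"][1]
    hand_y = keypoints["hand"][1]
    return "hands_up" if hand_y < head_y else "neutral"

# Hypothetical mapping from a recognized posture to an interaction instruction.
INSTRUCTIONS = {"hands_up": "start_interaction", "neutral": "idle"}

pose = classify_pose({"head": (120, 60), "hand": (150, 40)})
print(INSTRUCTIONS[pose])  # → start_interaction
```

A deployed system would replace the single rule with a posture classifier over the full second key point information.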
本公开实施例中，上述图像处理装置中的获取单元31、检测单元32（包括肢体检测模块321和肢体关键点检测模块322）、跟踪确定单元33、分配单元34、统计单元35和确定单元36，在实际应用中均可由中央处理器（CPU，Central Processing Unit）、数字信号处理器（DSP，Digital Signal Processor）、微控制单元（MCU，Microcontroller Unit）或可编程门阵列（FPGA，Field-Programmable Gate Array）实现。In the embodiments of the present disclosure, the acquisition unit 31, the detection unit 32 (including the limb detection module 321 and the limb key point detection module 322), the tracking determination unit 33, the allocation unit 34, the statistics unit 35, and the determination unit 36 in the above image processing device may, in practical applications, be implemented by a Central Processing Unit (CPU), a Digital Signal Processor (DSP), a Microcontroller Unit (MCU), or a Field-Programmable Gate Array (FPGA).
需要说明的是:上述实施例提供的图像处理装置在进行图像处理时,仅以上述各程序模块的划分进行举例说明,实际应用中,可以根据需要而将上述处理分配由不同的程序模块完成,即将装置的内部结构划分成不同的程序模块,以完成以上描述的全部或者部分处理。另外,上述实施例提供的图像处理装置与图像处理方法实施例属于同一构思,其具体实现过程详见方法实施例,这里不再赘述。It should be noted that when the image processing device provided in the foregoing embodiment performs image processing, only the division of the foregoing program modules is used as an example for illustration. In actual applications, the foregoing processing can be allocated to different program modules as needed. That is, the internal structure of the device is divided into different program modules to complete all or part of the processing described above. In addition, the image processing device provided in the foregoing embodiment and the image processing method embodiment belong to the same concept, and the specific implementation process is detailed in the method embodiment, which will not be repeated here.
本公开实施例还提供了一种电子设备。图9为本公开实施例的电子设备的硬件组成结构示意图；如图9所示，电子设备40可包括存储器42、处理器41及存储在存储器42上并可在处理器41上运行的计算机程序，上述处理器41执行上述程序时实现本公开实施例上述图像处理方法的步骤。The embodiments of the present disclosure further provide an electronic device. FIG. 9 is a schematic diagram of the hardware composition structure of an electronic device according to an embodiment of the present disclosure; as shown in FIG. 9, the electronic device 40 may include a memory 42, a processor 41, and a computer program stored on the memory 42 and executable on the processor 41. When the processor 41 executes the program, the steps of the image processing method of the embodiments of the present disclosure are implemented.
可以理解,电子设备40中的各个组件可通过总线系统43耦合在一起。可理解,总线系统43用于实现这些组件之间的连接通信。总线系统43除包括数据总线之外,还包括电源总线、控制总线和状态信号总线。但是为了清楚说明起见,在图9中将各种总线都标为总线系统43。It can be understood that various components in the electronic device 40 may be coupled together through the bus system 43. It can be understood that the bus system 43 is used to implement connection and communication between these components. In addition to the data bus, the bus system 43 also includes a power bus, a control bus, and a status signal bus. However, for the sake of clear description, various buses are marked as the bus system 43 in FIG. 9.
可以理解，存储器42可以是易失性存储器或非易失性存储器，也可包括易失性和非易失性存储器两者。其中，非易失性存储器可以是只读存储器（ROM，Read Only Memory）、可编程只读存储器（PROM，Programmable Read-Only Memory）、可擦除可编程只读存储器（EPROM，Erasable Programmable Read-Only Memory）、电可擦除可编程只读存储器（EEPROM，Electrically Erasable Programmable Read-Only Memory）、磁性随机存取存储器（FRAM，ferromagnetic random access memory）、快闪存储器（Flash Memory）、磁表面存储器、光盘、或只读光盘（CD-ROM，Compact Disc Read-Only Memory）；磁表面存储器可以是磁盘存储器或磁带存储器。易失性存储器可以是随机存取存储器（RAM，Random Access Memory），其用作外部高速缓存。通过示例性但不是限制性说明，许多形式的RAM可用，例如静态随机存取存储器（SRAM，Static Random Access Memory）、同步静态随机存取存储器（SSRAM，Synchronous Static Random Access Memory）、动态随机存取存储器（DRAM，Dynamic Random Access Memory）、同步动态随机存取存储器（SDRAM，Synchronous Dynamic Random Access Memory）、双倍数据速率同步动态随机存取存储器（DDRSDRAM，Double Data Rate Synchronous Dynamic Random Access Memory）、增强型同步动态随机存取存储器（ESDRAM，Enhanced Synchronous Dynamic Random Access Memory）、同步连接动态随机存取存储器（SLDRAM，SyncLink Dynamic Random Access Memory）、直接内存总线随机存取存储器（DRRAM，Direct Rambus Random Access Memory）。本公开实施例描述的存储器42旨在包括但不限于这些和任意其它适合类型的存储器。It can be understood that the memory 42 may be a volatile memory or a non-volatile memory, and may also include both volatile and non-volatile memories. The non-volatile memory may be a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), a ferromagnetic random access memory (FRAM), a Flash Memory, a magnetic surface memory, an optical disc, or a Compact Disc Read-Only Memory (CD-ROM); the magnetic surface memory may be a magnetic disk memory or a magnetic tape memory. The volatile memory may be a Random Access Memory (RAM), which serves as an external cache. By way of example but not limitation, many forms of RAM are available, such as Static Random Access Memory (SRAM), Synchronous Static Random Access Memory (SSRAM), Dynamic Random Access Memory (DRAM), Synchronous Dynamic Random Access Memory (SDRAM), Double Data Rate Synchronous Dynamic Random Access Memory (DDRSDRAM), Enhanced Synchronous Dynamic Random Access Memory (ESDRAM), SyncLink Dynamic Random Access Memory (SLDRAM), and Direct Rambus Random Access Memory (DRRAM). The memory 42 described in the embodiments of the present disclosure is intended to include, but is not limited to, these and any other suitable types of memory.
上述本公开实施例揭示的方法可以应用于处理器41中,或者由处理器41实现。处理器41可能是一种集成电路芯片,具有信号的处理能力。在实现过程中,上述方法的各步骤可以通过处理器41中的硬件的集成逻辑电路或者软件形式的指令完成。上述的处理器41可以是通用处理器、DSP,或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件等。处理器41可以实现或者执行本公开实施例中的公开的各方法、步骤及逻辑框图。通用处理器可以是微处理器或者任何常规的处理器等。结合本公开实施例所公开的方法的步骤,可以直接体现为硬件译码处理器执行完成,或者用译码处理器中的硬件及软件模块组合执行完成。软件模块可以位于存储介质中,该存储介质位于存储器42,处理器41读取存储器42中的信息,结合其硬件完成前述方法的步骤。The methods disclosed in the foregoing embodiments of the present disclosure may be applied to the processor 41 or implemented by the processor 41. The processor 41 may be an integrated circuit chip with signal processing capabilities. In the implementation process, the steps of the foregoing method can be completed by an integrated logic circuit of hardware in the processor 41 or instructions in the form of software. The aforementioned processor 41 may be a general-purpose processor, a DSP, or other programmable logic devices, discrete gates or transistor logic devices, discrete hardware components, and the like. The processor 41 may implement or execute various methods, steps, and logical block diagrams disclosed in the embodiments of the present disclosure. The general-purpose processor may be a microprocessor or any conventional processor or the like. The steps of the method disclosed in the embodiments of the present disclosure may be directly embodied as being executed and completed by a hardware decoding processor, or executed and completed by a combination of hardware and software modules in the decoding processor. The software module may be located in a storage medium, and the storage medium is located in the memory 42. The processor 41 reads the information in the memory 42 and completes the steps of the foregoing method in combination with its hardware.
在示例性实施例中，电子设备40可以被一个或多个应用专用集成电路（ASIC，Application Specific Integrated Circuit）、DSP、可编程逻辑器件（PLD，Programmable Logic Device）、复杂可编程逻辑器件（CPLD，Complex Programmable Logic Device）、FPGA、通用处理器、控制器、MCU、微处理器（Microprocessor）、或其他电子元件实现，用于执行前述方法。In an exemplary embodiment, the electronic device 40 may be implemented by one or more Application Specific Integrated Circuits (ASICs), DSPs, Programmable Logic Devices (PLDs), Complex Programmable Logic Devices (CPLDs), FPGAs, general-purpose processors, controllers, MCUs, microprocessors, or other electronic components, for executing the foregoing method.
在示例性实施例中，本公开实施例还提供了一种计算机可读存储介质，例如包括计算机程序的存储器42，上述计算机程序可由电子设备40的处理器41执行，以完成前述方法所述步骤。计算机可读存储介质可以是FRAM、ROM、PROM、EPROM、EEPROM、Flash Memory、磁表面存储器、光盘、或CD-ROM等存储器；也可以是包括上述存储器之一或任意组合的各种设备，如移动电话、计算机、平板设备、个人数字助理等。In an exemplary embodiment, the embodiments of the present disclosure further provide a computer-readable storage medium, for example, a memory 42 including a computer program, which can be executed by the processor 41 of the electronic device 40 to complete the steps of the foregoing method. The computer-readable storage medium may be a memory such as FRAM, ROM, PROM, EPROM, EEPROM, Flash Memory, magnetic surface memory, optical disc, or CD-ROM; it may also be any device including one or any combination of the above memories, such as a mobile phone, a computer, a tablet device, or a personal digital assistant.
本公开实施例还提供了一种计算机可读存储介质,其上存储有计算机程序,该程序被处理器执行时实现本公开实施例所述图像处理方法的步骤。The embodiment of the present disclosure also provides a computer-readable storage medium on which a computer program is stored, and when the program is executed by a processor, the steps of the image processing method according to the embodiment of the present disclosure are realized.
本公开实施例还提供了一种计算机程序,所述计算机程序使得计算机执行本公开实施例所述的图像处理方法的步骤。The embodiment of the present disclosure also provides a computer program that enables a computer to execute the steps of the image processing method described in the embodiment of the present disclosure.
本申请所提供的几个方法实施例中所揭露的方法,在不冲突的情况下可以任意组合,得到新的方法实施例。The methods disclosed in the several method embodiments provided in this application can be combined arbitrarily without conflict to obtain new method embodiments.
本申请所提供的几个产品实施例中所揭露的特征,在不冲突的情况下可以任意组合,得到新的产品实施例。The features disclosed in the several product embodiments provided in this application can be combined arbitrarily without conflict to obtain new product embodiments.
本申请所提供的几个方法或设备实施例中所揭露的特征,在不冲突的情况下可以任意组合,得到新的方法实施例或设备实施例。The features disclosed in the several method or device embodiments provided in this application can be combined arbitrarily without conflict to obtain a new method embodiment or device embodiment.
在本申请所提供的几个实施例中，应该理解到，所揭露的设备和方法，可以通过其它的方式实现。以上所描述的设备实施例仅仅是示意性的，例如，所述单元的划分，仅仅为一种逻辑功能划分，实际实现时可以有另外的划分方式，如：多个单元或组件可以结合，或可以集成到另一个系统，或一些特征可以忽略，或不执行。另外，所显示或讨论的各组成部分相互之间的耦合、或直接耦合、或通信连接可以是通过一些接口，设备或单元的间接耦合或通信连接，可以是电性的、机械的或其它形式的。In the several embodiments provided in this application, it should be understood that the disclosed device and method may be implemented in other ways. The device embodiments described above are only illustrative; for example, the division of the units is only a logical function division, and there may be other divisions in actual implementation, e.g., multiple units or components may be combined, or may be integrated into another system, or some features may be omitted or not executed. In addition, the coupling, direct coupling, or communication connection between the components shown or discussed may be indirect coupling or communication connection through some interfaces, devices or units, and may be electrical, mechanical, or in other forms.
上述作为分离部件说明的单元可以是、或也可以不是物理上分开的，作为单元显示的部件可以是、或也可以不是物理单元，即可以位于一个地方，也可以分布到多个网络单元上；可以根据实际的需要选择其中的部分或全部单元来实现本实施例方案的目的。The units described above as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units, that is, they may be located in one place or distributed over multiple network units; some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
另外,在本公开各实施例中的各功能单元可以全部集成在一个处理单元中,也可以是各单元分别单独作为一个单元,也可以两个或两个以上单元集成在一个单元中;上述集成的单元既可以采用硬件的形式实现,也可以采用硬件加软件功能单元的形式实现。In addition, the functional units in the embodiments of the present disclosure can be all integrated into one processing unit, or each unit can be individually used as a unit, or two or more units can be integrated into one unit; The unit can be implemented in the form of hardware, or in the form of hardware plus software functional units.
本领域普通技术人员可以理解：实现上述方法实施例的全部或部分步骤可以通过程序指令相关的硬件来完成，前述的程序可以存储于一计算机可读取存储介质中，该程序在执行时，执行包括上述方法实施例的步骤；而前述的存储介质包括：移动存储设备、ROM、RAM、磁碟或者光盘等各种可以存储程序代码的介质。A person of ordinary skill in the art can understand that all or part of the steps of the above method embodiments can be implemented by hardware related to program instructions. The foregoing program can be stored in a computer-readable storage medium; when the program is executed, the steps of the foregoing method embodiments are performed. The foregoing storage medium includes various media capable of storing program code, such as a removable storage device, a ROM, a RAM, a magnetic disk, or an optical disc.
或者,本公开上述集成的单元如果以软件功能模块的形式实现并作为独立的产品销售或使用时,也可以存储在一个计算机可读取存储介质中。基于这样的理解,本公开实施例的技术方案本质上或者说对现有技术做出贡献的部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可以是个人计算机、服务器、或者网络设备等)执行本公开各个实施例所述方法的全部或部分。而前述的存储介质包括:移动存储设备、ROM、RAM、磁碟或者光盘等各种可以存储程序代码的介质。Alternatively, if the aforementioned integrated unit of the present disclosure is implemented in the form of a software function module and sold or used as an independent product, it may also be stored in a computer readable storage medium. Based on this understanding, the technical solutions of the embodiments of the present disclosure can be embodied in the form of a software product in essence or a part that contributes to the prior art. The computer software product is stored in a storage medium and includes several instructions for A computer device (which may be a personal computer, a server, or a network device, etc.) executes all or part of the methods described in the various embodiments of the present disclosure. The aforementioned storage media include: removable storage devices, ROM, RAM, magnetic disks, or optical disks and other media that can store program codes.
以上所述，仅为本公开的具体实施方式，但本公开的保护范围并不局限于此，任何熟悉本技术领域的技术人员在本公开揭露的技术范围内，可轻易想到变化或替换，都应涵盖在本公开的保护范围之内。因此，本公开的保护范围应以所述权利要求的保护范围为准。The above are only specific implementations of the present disclosure, but the protection scope of the present disclosure is not limited thereto. Any person skilled in the art can readily conceive of changes or substitutions within the technical scope disclosed in the present disclosure, and such changes or substitutions shall all fall within the protection scope of the present disclosure. Therefore, the protection scope of the present disclosure shall be subject to the protection scope of the claims.

Claims (21)

  1. 一种图像处理方法,所述方法包括:An image processing method, the method includes:
    获得多帧图像;Obtain multiple frames of images;
    对所述多帧图像中的第一图像中的目标对象进行肢体关键点检测处理,获得所述目标对象的部分肢体对应的第一关键点信息;Performing limb key point detection processing on the target object in the first image in the multi-frame image to obtain first key point information corresponding to part of the limb of the target object;
    基于所述第一关键点信息确定第二图像中的所述目标对象的所述部分肢体对应的第二关键点信息；其中，在所述多帧图像中，所述第二图像为所述第一图像后的一帧图像。Determining second key point information corresponding to the part of the limb of the target object in a second image based on the first key point information; wherein, in the multi-frame images, the second image is a frame of image after the first image.
  2. 根据权利要求1所述的方法，其中，所述对所述多帧图像中的第一图像中的目标对象进行肢体关键点检测处理，获得所述目标对象的部分肢体对应的第一关键点信息，包括：The method according to claim 1, wherein the performing limb key point detection processing on the target object in the first image of the multi-frame images to obtain the first key point information corresponding to the part of the limb of the target object comprises:
    对所述第一图像中的所述目标对象进行肢体检测处理,确定所述目标对象的第一区域;所述第一区域包括所述目标对象的部分肢体所在区域;Performing limb detection processing on the target object in the first image to determine a first area of the target object; the first area includes the area where part of the limb of the target object is located;
    对所述第一区域对应的像素点进行肢体关键点检测处理,获得所述目标对象的所述部分肢体对应的第一关键点信息。Perform limb key point detection processing on the pixel points corresponding to the first region to obtain first key point information corresponding to the part of the limb of the target object.
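The two-stage flow of claim 2 — detect a first area containing the partial limb, then run key point detection only on that area's pixels — can be sketched with stubbed networks. Real implementations would be trained models; every function and value below is an illustrative placeholder:

```python
def detect_first_region(image):
    """Stub limb-detection network: returns one detection box (x0, y0, x1, y1)
    covering the partial limb of the target. A trained detector would run here."""
    return (50, 20, 200, 260)

def detect_limb_keypoints(image, region):
    """Stub key point network: runs only on pixels inside `region` and maps
    crop-local coordinates back into full-image coordinates."""
    x0, y0, _, _ = region
    crop_local = [(30, 40), (90, 45)]           # e.g. left/right shoulder
    return [(x0 + u, y0 + v) for u, v in crop_local]

image = object()                                # placeholder for frame data
first_region = detect_first_region(image)
first_keypoints = detect_limb_keypoints(image, first_region)
print(first_keypoints)  # → [(80, 60), (140, 65)]
```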
  3. 根据权利要求1所述的方法,其中,所述基于所述第一关键点信息确定第二图像中的所述目标对象的所述部分肢体对应的第二关键点信息,包括:The method according to claim 1, wherein the determining the second key point information corresponding to the part of the limb of the target object in the second image based on the first key point information comprises:
    基于所述第一关键点信息在所述第一图像中确定第二区域；所述第二区域大于所述目标对象的第一区域；所述第一区域包括所述目标对象的部分肢体所在区域；Determining a second area in the first image based on the first key point information; wherein the second area is larger than a first area of the target object, and the first area comprises the area where the part of the limb of the target object is located;
    根据所述第二区域,确定所述第二图像中与所述第二区域的位置范围对应的第三区域;Determine, according to the second area, a third area in the second image corresponding to the position range of the second area;
    对所述第二图像中的所述第三区域内的像素点进行肢体关键点检测处理,获得所述部分肢体对应的第二关键点信息。Performing limb key point detection processing on pixels in the third area in the second image to obtain second key point information corresponding to the part of the limb.
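The region propagation in claim 3 amounts to simple box geometry: take the bounding box of the first key points (first area), enlarge it (second area), and reuse that coordinate range in the next frame (third area) for key point detection. A sketch under those assumptions, with an invented expansion ratio:

```python
def keypoint_bbox(keypoints):
    """Axis-aligned bounding box (x0, y0, x1, y1) of a list of (x, y) points."""
    xs = [x for x, _ in keypoints]
    ys = [y for _, y in keypoints]
    return min(xs), min(ys), max(xs), max(ys)

def expand_region(box, ratio, img_w, img_h):
    """Second area: enlarge the first area by `ratio`, clipped to the image."""
    x0, y0, x1, y1 = box
    dx = (x1 - x0) * ratio / 2
    dy = (y1 - y0) * ratio / 2
    return (max(0, x0 - dx), max(0, y0 - dy),
            min(img_w, x1 + dx), min(img_h, y1 + dy))

kps = [(100, 80), (140, 90), (120, 160)]      # first key points in frame t
first = keypoint_bbox(kps)                    # first area
second = expand_region(first, 0.5, 640, 480)  # enlarged second area
# The third area is this same coordinate range applied to frame t+1, where a
# key point network would then run on the cropped pixels.
```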
  4. 根据权利要求1所述的方法,其中,所述基于所述第一关键点信息确定第二图像中的所述目标对象的所述部分肢体对应的第二关键点信息,包括:The method according to claim 1, wherein the determining the second key point information corresponding to the part of the limb of the target object in the second image based on the first key point information comprises:
    根据所述第一关键点信息在所述第一图像中的位置范围,确定所述第二图像中、与所述位置范围对应的第三区域;Determine a third area in the second image corresponding to the position range according to the position range of the first key point information in the first image;
    对所述第二图像中的所述第三区域内的像素点进行肢体关键点检测处理,获得所述部分肢体对应的第二关键点信息。Performing limb key point detection processing on pixels in the third area in the second image to obtain second key point information corresponding to the part of the limb.
  5. 根据权利要求2所述的方法,其中,所述对所述第一图像中的所述目标对象进行肢体检测处理,包括:The method according to claim 2, wherein the performing limb detection processing on the target object in the first image comprises:
    利用肢体检测网络对所述第一图像中的所述目标对象进行肢体检测处理;Performing a limb detection process on the target object in the first image by using a limb detection network;
    其中,所述肢体检测网络采用第一类样本图像训练得到;所述第一类样本图像中标 注有目标对象的检测框;所述检测框的标注范围包括所述目标对象的部分肢体所在区域。Wherein, the limb detection network is obtained by training using a first type of sample image; the first type of sample image is marked with a detection frame of the target object; the label range of the detection frame includes the area where part of the limb of the target object is located.
  6. 根据权利要求2所述的方法,其中,所述对所述第一区域对应的像素点进行肢体关键点检测处理,包括:The method according to claim 2, wherein said performing limb key point detection processing on pixels corresponding to said first area comprises:
    利用肢体关键点检测网络对所述第一区域对应的像素点进行肢体关键点检测处理;Using a limb key point detection network to perform limb key point detection processing on pixels corresponding to the first region;
    其中，所述肢体关键点检测网络采用第二类样本图像训练得到；所述第二类样本图像中标注有包括所述目标对象的部分肢体的关键点。Wherein, the limb key point detection network is trained with a second type of sample images; the second type of sample images are annotated with key points of the part of the limb of the target object.
  7. 根据权利要求1至6任一项所述的方法,其中,所述目标对象的部分肢体包括以下至少之一:头部、颈部、肩部、胸部、腰部、髋部、手臂、手部;The method according to any one of claims 1 to 6, wherein part of the limbs of the target object includes at least one of the following: head, neck, shoulders, chest, waist, hips, arms, hands;
    所述第一关键点信息和所述第二关键点信息包括头部、颈部、肩部、胸部、腰部、髋部、手臂和手部中的至少一个肢体的轮廓关键点信息和/或骨骼关键点信息。The first key point information and the second key point information comprise contour key point information and/or bone key point information of at least one limb among the head, neck, shoulders, chest, waist, hips, arms, and hands.
  8. 根据权利要求1至7任一项所述的方法,其中,所述方法还包括:The method according to any one of claims 1 to 7, wherein the method further comprises:
    响应于获得所述目标对象的部分肢体对应的第一关键点信息的情况,为所述目标对象分配跟踪标识;In response to obtaining the first key point information corresponding to a part of the limb of the target object, assign a tracking identifier to the target object;
    基于对所述多帧图像的处理过程中分配的所述跟踪标识的数量,确定所述多帧图像中的目标对象的数量。Determine the number of target objects in the multi-frame image based on the number of the tracking identifiers allocated during the processing of the multi-frame image.
  9. 根据权利要求1至8任一项所述的方法,其中,所述方法还包括:The method according to any one of claims 1 to 8, wherein the method further comprises:
    基于所述第二关键点信息确定所述目标对象的姿态;Determining the posture of the target object based on the second key point information;
    基于所述目标对象的姿态确定对应于所述目标对象的交互指令。An interaction instruction corresponding to the target object is determined based on the posture of the target object.
  10. 一种图像处理装置,所述装置包括:获取单元、检测单元和跟踪确定单元;其中,An image processing device, the device comprising: an acquisition unit, a detection unit, and a tracking determination unit; wherein,
    所述获取单元,配置为获得多帧图像;The acquiring unit is configured to acquire multiple frames of images;
    所述检测单元,配置为对所述多帧图像中的第一图像中的目标对象进行肢体关键点检测处理,获得所述目标对象的部分肢体对应的第一关键点信息;The detection unit is configured to perform limb key point detection processing on the target object in the first image in the multi-frame image to obtain first key point information corresponding to a part of the limb of the target object;
    所述跟踪确定单元,配置为基于所述第一关键点信息确定第二图像中的所述目标对象的所述部分肢体对应的第二关键点信息;其中,在所述多帧图像中,所述第二图像为所述第一图像后的一帧图像。The tracking determination unit is configured to determine second key point information corresponding to the part of the limb of the target object in the second image based on the first key point information; wherein, in the multi-frame image, The second image is an image after the first image.
  11. 根据权利要求10所述的装置,其中,所述检测单元包括:肢体检测模块和肢体关键点检测模块;其中,The device according to claim 10, wherein the detection unit comprises: a limb detection module and a limb key point detection module; wherein,
    所述肢体检测模块,配置为对所述第一图像中的所述目标对象进行肢体检测处理,确定所述目标对象的第一区域;所述第一区域包括所述目标对象的部分肢体所在区域;The limb detection module is configured to perform limb detection processing on the target object in the first image to determine a first area of the target object; the first area includes the area where part of the limb of the target object is located ;
    所述肢体关键点检测模块,配置为对所述第一区域对应的像素点进行肢体关键点检测处理,获得所述目标对象的所述部分肢体对应的第一关键点信息。The limb key point detection module is configured to perform limb key point detection processing on the pixel points corresponding to the first area to obtain first key point information corresponding to the part of the limb of the target object.
  12. 根据权利要求10所述的装置，其中，所述跟踪确定单元，配置为基于所述第一关键点信息在所述第一图像中确定第二区域；所述第二区域大于所述目标对象的第一区域；所述第一区域包括所述目标对象的部分肢体所在区域；根据所述第二区域，确定所述第二图像中与所述第二区域的位置范围对应的第三区域；对所述第二图像中的所述第三区域内的像素点进行肢体关键点检测处理，获得所述部分肢体对应的第二关键点信息。The device according to claim 10, wherein the tracking determination unit is configured to: determine a second area in the first image based on the first key point information, the second area being larger than a first area of the target object, and the first area comprising the area where the part of the limb of the target object is located; determine, according to the second area, a third area in the second image corresponding to the position range of the second area; and perform limb key point detection processing on pixels in the third area in the second image to obtain the second key point information corresponding to the part of the limb.
  13. 根据权利要求10所述的装置，其中，所述跟踪确定单元，配置为根据所述第一关键点信息在所述第一图像中的位置范围，确定所述第二图像中、与所述位置范围对应的第三区域；对所述第二图像中的所述第三区域内的像素点进行肢体关键点检测处理，获得所述部分肢体对应的第二关键点信息。The device according to claim 10, wherein the tracking determination unit is configured to: determine, according to the position range of the first key point information in the first image, a third area in the second image corresponding to the position range; and perform limb key point detection processing on pixels in the third area in the second image to obtain the second key point information corresponding to the part of the limb.
  14. 根据权利要求11所述的装置,其中,所述肢体检测模块,配置为利用肢体检测网络对所述第一图像中的所述目标对象进行肢体检测处理;The device according to claim 11, wherein the limb detection module is configured to perform limb detection processing on the target object in the first image by using a limb detection network;
    其中,所述肢体检测网络采用第一类样本图像训练得到;所述第一类样本图像中标注有目标对象的检测框;所述检测框的标注范围包括所述目标对象的部分肢体所在区域。Wherein, the limb detection network is trained using a first type of sample image; the first type of sample image is marked with a detection frame of the target object; the marking range of the detection frame includes the area where part of the limb of the target object is located.
  15. 根据权利要求11所述的装置,其中,所述肢体关键点检测模块,配置为利用肢体关键点检测网络对所述第一区域对应的像素点进行肢体关键点检测处理;The device according to claim 11, wherein the limb key point detection module is configured to use a limb key point detection network to perform limb key point detection processing on pixels corresponding to the first region;
    其中，所述肢体关键点检测网络采用第二类样本图像训练得到；所述第二类样本图像中标注有包括所述目标对象的部分肢体的关键点。Wherein, the limb key point detection network is trained with a second type of sample images; the second type of sample images are annotated with key points of the part of the limb of the target object.
  16. 根据权利要求10至15任一项所述的装置,其中,所述目标对象的部分肢体包括以下至少之一:头部、颈部、肩部、胸部、腰部、髋部、手臂、手部;The device according to any one of claims 10 to 15, wherein part of the limbs of the target object includes at least one of the following: head, neck, shoulders, chest, waist, hips, arms, hands;
    所述第一关键点信息和所述第二关键点信息包括头部、颈部、肩部、胸部、腰部、髋部、手臂和手部中的至少一个肢体的轮廓关键点信息和/或骨骼关键点信息。The first key point information and the second key point information comprise contour key point information and/or bone key point information of at least one limb among the head, neck, shoulders, chest, waist, hips, arms, and hands.
  17. 根据权利要求10至16任一项所述的装置,其中,所述装置还包括分配单元和统计单元;其中,The device according to any one of claims 10 to 16, wherein the device further comprises an allocation unit and a statistics unit; wherein,
    所述分配单元,配置为响应于所述检测单元获得所述目标对象的部分肢体对应的第一关键点信息的情况,为所述目标对象分配跟踪标识;The allocation unit is configured to allocate a tracking identifier to the target object in response to the detection unit obtaining the first key point information corresponding to a part of the limb of the target object;
    所述统计单元,配置为基于对所述多帧图像的处理过程中分配的所述跟踪标识的数量,确定所述多帧图像中的目标对象的数量。The statistical unit is configured to determine the number of target objects in the multi-frame image based on the number of the tracking identifiers allocated during the processing of the multi-frame image.
  18. 根据权利要求10至17任一项所述的装置，其中，所述装置还包括确定单元，配置为基于所述第二关键点信息确定所述目标对象的姿态；基于所述目标对象的姿态确定对应于所述目标对象的交互指令。The device according to any one of claims 10 to 17, wherein the device further comprises a determining unit configured to determine the posture of the target object based on the second key point information, and determine an interaction instruction corresponding to the target object based on the posture of the target object.
  19. 一种计算机可读存储介质,其上存储有计算机程序,该程序被处理器执行时实现权利要求1至9任一项所述方法的步骤。A computer-readable storage medium with a computer program stored thereon, which, when executed by a processor, implements the steps of the method described in any one of claims 1 to 9.
  20. 一种电子设备,包括存储器、处理器及存储在存储器上并可在处理器上运行的 计算机程序,所述处理器执行所述程序时实现权利要求1至9任一项所述方法的步骤。An electronic device comprising a memory, a processor, and a computer program stored on the memory and capable of running on the processor. The processor implements the steps of the method according to any one of claims 1 to 9 when the processor executes the program.
  21. 一种计算机程序,所述计算机程序使得计算机执行如权利要求1至9任一项所述的图像处理方法。A computer program that causes a computer to execute the image processing method according to any one of claims 1 to 9.
PCT/CN2021/076504 2020-04-29 2021-02-10 Image processing method and apparatus, electronic device and storage medium WO2021218293A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2021565760A JP2022534666A (en) 2020-04-29 2021-02-10 Image processing method, device, electronic device and storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010357593.2 2020-04-29
CN202010357593.2A CN111539992A (en) 2020-04-29 2020-04-29 Image processing method, image processing device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
WO2021218293A1 true WO2021218293A1 (en) 2021-11-04

Family

ID=71975386

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/076504 WO2021218293A1 (en) 2020-04-29 2021-02-10 Image processing method and apparatus, electronic device and storage medium

Country Status (4)

Country Link
JP (1) JP2022534666A (en)
CN (1) CN111539992A (en)
TW (1) TW202141340A (en)
WO (1) WO2021218293A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115337607A (en) * 2022-10-14 2022-11-15 佛山科学技术学院 Upper limb movement rehabilitation training method based on computer vision

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111539992A (en) * 2020-04-29 2020-08-14 北京市商汤科技开发有限公司 Image processing method, image processing device, electronic equipment and storage medium
CN112016514A (en) * 2020-09-09 2020-12-01 平安科技(深圳)有限公司 Traffic sign identification method, device, equipment and storage medium
CN112785573A (en) * 2021-01-22 2021-05-11 上海商汤智能科技有限公司 Image processing method and related device and equipment
CN112818908A (en) * 2021-02-22 2021-05-18 Oppo广东移动通信有限公司 Key point detection method, device, terminal and storage medium
CN113192127B (en) * 2021-05-12 2024-01-02 北京市商汤科技开发有限公司 Image processing method, device, electronic equipment and storage medium
CN113469017A (en) * 2021-06-29 2021-10-01 北京市商汤科技开发有限公司 Image processing method and device and electronic equipment

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170148179A1 (en) * 2014-10-31 2017-05-25 Fyusion, Inc. System and method for infinite smoothing of image sequences
CN108062526A (en) * 2017-12-15 2018-05-22 厦门美图之家科技有限公司 A kind of estimation method of human posture and mobile terminal
CN108230357A (en) * 2017-10-25 2018-06-29 北京市商汤科技开发有限公司 Critical point detection method, apparatus, storage medium, computer program and electronic equipment
CN108986137A (en) * 2017-11-30 2018-12-11 成都通甲优博科技有限责任公司 Human body tracing method, device and equipment
CN109685797A (en) * 2018-12-25 2019-04-26 北京旷视科技有限公司 Bone point detecting method, device, processing equipment and storage medium
CN110139115A (en) * 2019-04-30 2019-08-16 广州虎牙信息科技有限公司 Virtual image attitude control method, device and electronic equipment based on key point
CN111539992A (en) * 2020-04-29 2020-08-14 北京市商汤科技开发有限公司 Image processing method, image processing device, electronic equipment and storage medium

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5525407B2 (en) * 2010-10-12 2014-06-18 日本電信電話株式会社 Behavior model learning device, three-dimensional posture estimation device, behavior model learning method, three-dimensional posture estimation method, and program
CN109918975B (en) * 2017-12-13 2022-10-21 腾讯科技(深圳)有限公司 Augmented reality processing method, object identification method and terminal
CN108062536B (en) * 2017-12-29 2020-07-24 纳恩博(北京)科技有限公司 Detection method and device and computer storage medium

Cited By (1)

Publication number Priority date Publication date Assignee Title
CN115337607A (en) * 2022-10-14 2022-11-15 佛山科学技术学院 Upper limb movement rehabilitation training method based on computer vision

Also Published As

Publication number Publication date
CN111539992A (en) 2020-08-14
JP2022534666A (en) 2022-08-03
TW202141340A (en) 2021-11-01

Similar Documents

Publication Publication Date Title
WO2021218293A1 (en) Image processing method and apparatus, electronic device and storage medium
US10832039B2 (en) Facial expression detection method, device and system, facial expression driving method, device and system, and storage medium
WO2021129064A1 (en) Posture acquisition method and device, and key point coordinate positioning model training method and device
JP7229174B2 (en) Person identification system and method
CN110874594B (en) Human body appearance damage detection method and related equipment based on semantic segmentation network
WO2017190646A1 (en) Facial image processing method and apparatus and storage medium
CN109952594B (en) Image processing method, device, terminal and storage medium
EP3210164B1 (en) Facial skin mask generation
CN105612533B (en) Living body detection method, living body detection system, and computer program product
Gorodnichy et al. Nouse 'use your nose as a mouse' perceptual vision technology for hands-free games and interfaces
US20200394392A1 (en) Method and apparatus for detecting face image
WO2019019828A1 (en) Target object occlusion detection method and apparatus, electronic device and storage medium
US11176355B2 (en) Facial image processing method and apparatus, electronic device and computer readable storage medium
CN109299658B (en) Face detection method, face image rendering device and storage medium
CN104240277A (en) Augmented reality interaction method and system based on human face detection
CN110781770B (en) Living body detection method, device and equipment based on face recognition
CN109035415B (en) Virtual model processing method, device, equipment and computer readable storage medium
WO2022174594A1 (en) Multi-camera-based bare hand tracking and display method and system, and apparatus
WO2022267653A1 (en) Image processing method, electronic device, and computer readable storage medium
CN112949418A (en) Method and device for determining speaking object, electronic equipment and storage medium
CN113284041B (en) Image processing method, device and equipment and computer storage medium
US20230351615A1 (en) Object identifications in images or videos
Teng et al. Facial expressions recognition based on convolutional neural networks for mobile virtual reality
CN109241942B (en) Image processing method and device, face recognition equipment and storage medium
CN113192127B (en) Image processing method, device, electronic equipment and storage medium

Legal Events

Date Code Title Description
ENP Entry into the national phase

Ref document number: 2021565760

Country of ref document: JP

Kind code of ref document: A

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21795741

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21795741

Country of ref document: EP

Kind code of ref document: A1