WO2021218293A1 - Image processing method and apparatus, electronic device and storage medium - Google Patents


Info

Publication number
WO2021218293A1
WO2021218293A1 (PCT/CN2021/076504)
Authority
WO
WIPO (PCT)
Prior art keywords
limb
image
target object
key point
point information
Prior art date
Application number
PCT/CN2021/076504
Other languages
French (fr)
Chinese (zh)
Inventor
李通
金晟
刘文韬
钱晨
Original Assignee
北京市商汤科技开发有限公司 (Beijing SenseTime Technology Development Co., Ltd.)
Priority date
Filing date
Publication date
Application filed by 北京市商汤科技开发有限公司 (Beijing SenseTime Technology Development Co., Ltd.)
Priority to JP2021565760A (published as JP2022534666A)
Publication of WO2021218293A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00: Image analysis
    • G06T7/20: Analysis of motion
    • G06T7/246: Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06T7/248: Analysis of motion using feature-based methods, e.g. the tracking of corners or segments, involving reference images or patches
    • G06T2207/00: Indexing scheme for image analysis or image enhancement
    • G06T2207/10: Image acquisition modality
    • G06T2207/10016: Video; Image sequence
    • G06T2207/30: Subject of image; Context of image processing
    • G06T2207/30196: Human being; Person

Definitions

  • The present disclosure relates to the field of computer vision technology, and in particular to an image processing method, apparatus, electronic device, and storage medium.
  • Target tracking technology is usually built on a limb detection algorithm and a limb key point detection algorithm: the human body detected by the limb detection algorithm and the human body key points detected by the limb key point detection algorithm are used together to track the target.
  • However, current limb detection algorithms and limb key point detection algorithms cannot adapt to scenes in which only the upper body is visible, so targets showing only upper-body limbs cannot be tracked.
  • To address this, the embodiments of the present disclosure provide an image processing method, apparatus, electronic device, and storage medium.
  • The embodiment of the present disclosure provides an image processing method. The method includes: obtaining a multi-frame image; performing limb key point detection processing on a target object in a first image of the multi-frame image to obtain first key point information corresponding to a part of the limb of the target object; and determining, based on the first key point information, second key point information corresponding to the part of the limb of the target object in a second image, where, in the multi-frame image, the second image is an image after the first image.
  • Performing the limb key point detection processing on the target object in the first image of the multi-frame image to obtain the first key point information corresponding to the part of the limb of the target object includes: performing limb detection processing on the target object in the first image to determine a first area of the target object, the first area including the area where the part of the limb of the target object is located; and performing limb key point detection processing on the pixels corresponding to the first area to obtain the first key point information corresponding to the part of the limb of the target object.
  • Determining the second key point information corresponding to the part of the limb of the target object in the second image based on the first key point information includes: determining a second area in the first image based on the first key point information, the second area being larger than the first area of the target object, and the first area including the area where the part of the limb of the target object is located; determining, according to the second area, a third area in the second image corresponding to the position range of the second area; and performing limb key point detection processing on the pixels in the third area in the second image to obtain the second key point information corresponding to the part of the limb.
  • Determining the second key point information corresponding to the part of the limb of the target object in the second image based on the first key point information may also include: determining, according to the position range of the first key point information in the first image, a third area in the second image corresponding to that position range; and performing limb key point detection processing on the pixels in the third area in the second image to obtain the second key point information corresponding to the part of the limb.
  • Performing the limb detection processing on the target object in the first image includes: using a limb detection network to perform limb detection processing on the target object in the first image, where the limb detection network is trained using a first type of sample image; the first type of sample image is marked with a detection frame of the target object; the marking range of the detection frame includes the area where the part of the limb of the target object is located.
  • Performing the limb key point detection processing on the pixels corresponding to the first area includes: using a limb key point detection network to perform limb key point detection processing on the pixels corresponding to the first area, where the limb key point detection network is trained using a second type of sample image; the second type of sample image is marked with key points of the part of the limb of the target object.
  • The part of the limb of the target object includes at least one of the following: head, neck, shoulders, chest, waist, hips, arms, and hands; the first key point information and the second key point information include contour key point information and/or bone key point information of at least one of the head, neck, shoulders, chest, waist, hips, arms, and hands.
  • The method further includes: in response to obtaining the first key point information corresponding to the part of the limb of the target object, assigning a tracking identifier to the target object; and determining the number of target objects in the multi-frame image based on the number of tracking identifiers allocated during the processing of the multi-frame image.
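The identifier-allocation-and-counting idea above can be sketched as a minimal registry. The class and method names (`TrackingRegistry`, `register`, `object_count`) are illustrative and not taken from the patent; the only claimed behavior modeled here is that one tracking identifier is issued per newly detected target and the object count equals the number of identifiers allocated.

```python
class TrackingRegistry:
    """Hypothetical helper: issues one tracking ID per newly detected target."""

    def __init__(self):
        self.next_id = 0
        self.active = {}  # tracking id -> latest key point information

    def register(self, keypoints):
        """Called when first key point information is obtained for a new target."""
        tid = self.next_id
        self.next_id += 1
        self.active[tid] = keypoints
        return tid

    def object_count(self):
        # Number of target objects = number of tracking IDs allocated so far.
        return self.next_id

registry = TrackingRegistry()
id_a = registry.register([(10, 20), (30, 40)])   # first target detected
id_b = registry.register([(50, 60), (70, 80)])   # second target detected
assert (id_a, id_b) == (0, 1)
assert registry.object_count() == 2
```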
  • The method further includes: determining the posture of the target object based on the second key point information; and determining the interaction instruction corresponding to the target object based on the posture of the target object.
  • The embodiment of the present disclosure also provides an image processing device. The device includes: an acquisition unit, a detection unit, and a tracking determination unit; where the acquisition unit is configured to obtain a multi-frame image; the detection unit is configured to perform limb key point detection processing on the target object in the first image of the multi-frame image to obtain first key point information corresponding to a part of the limb of the target object; and the tracking determination unit is configured to determine, based on the first key point information, second key point information corresponding to the part of the limb of the target object in the second image, where, in the multi-frame image, the second image is a frame image after the first image.
  • The detection unit includes a limb detection module and a limb key point detection module; where the limb detection module is configured to perform limb detection processing on the target object in the first image to determine the first area of the target object, the first area including the area where the part of the limb of the target object is located; and the limb key point detection module is configured to perform limb key point detection processing on the pixels corresponding to the first area to obtain the first key point information corresponding to the part of the limb of the target object.
  • The tracking determination unit is configured to: determine a second area in the first image based on the first key point information, the second area being larger than the first area of the target object, and the first area including the area where the part of the limb of the target object is located; determine, according to the second area, a third area in the second image corresponding to the position range of the second area; and perform limb key point detection processing on the pixels in the third area in the second image to obtain the second key point information corresponding to the part of the limb.
  • Alternatively, the tracking determination unit is configured to: determine, according to the position range of the first key point information in the first image, a third area in the second image corresponding to that position range; and perform limb key point detection processing on the pixels in the third area in the second image to obtain the second key point information corresponding to the part of the limb.
  • The limb detection module is configured to use a limb detection network to perform limb detection processing on the target object in the first image, where the limb detection network is trained using the first type of sample image; the first type of sample image is marked with the detection frame of the target object; the marking range of the detection frame includes the area where the part of the limb of the target object is located.
  • The limb key point detection module is configured to use a limb key point detection network to perform limb key point detection processing on the pixels corresponding to the first area, where the limb key point detection network is trained using the second type of sample image; the second type of sample image is marked with key points of the part of the limb of the target object.
  • The part of the limb of the target object includes at least one of the following: head, neck, shoulders, chest, waist, hips, arms, and hands; the first key point information and the second key point information include contour key point information and/or bone key point information of at least one of the head, neck, shoulders, chest, waist, hips, arms, and hands.
  • The device further includes an allocation unit and a statistics unit; where the allocation unit is configured to assign a tracking identifier to the target object in response to the detection unit obtaining the first key point information corresponding to the part of the limb of the target object; and the statistics unit is configured to determine the number of target objects in the multi-frame image based on the number of tracking identifiers allocated during the processing of the multi-frame image.
  • The device further includes a determining unit configured to determine the posture of the target object based on the second key point information, and to determine the interaction instruction corresponding to the target object based on the posture of the target object.
  • The embodiment of the present disclosure also provides a computer-readable storage medium on which a computer program is stored; when the program is executed by a processor, the steps of the image processing method described in the embodiments of the present disclosure are implemented.
  • The embodiment of the present disclosure also provides an electronic device including a memory, a processor, and a computer program stored in the memory and executable on the processor; the processor executes the program to implement the steps of the image processing method described in the embodiments of the present disclosure.
  • The embodiment of the present disclosure also provides a computer program that causes a computer to execute the image processing method described in the embodiments of the present disclosure.
  • The image processing method, apparatus, electronic device, and storage medium provided by the embodiments of the present disclosure recognize the key points of a part of the limb of the target object in the first image of the multi-frame image to be processed, and determine, based on the recognized key points of the part of the limb, the key points of that part of the limb of the target object in the subsequent second image, thereby realizing target tracking in scenes where only a part of the target object's limbs (for example, the upper body) appears in the image.
  • FIG. 1 is a first schematic diagram of the flow of an image processing method according to an embodiment of the disclosure
  • FIG. 2 is a schematic flowchart of a method for detecting and processing limb key points in an image processing method according to an embodiment of the disclosure
  • FIG. 3 is a schematic flowchart of a method for tracking key points of limbs in an image processing method according to an embodiment of the present disclosure
  • FIG. 4 is a second schematic flowchart of an image processing method according to an embodiment of the disclosure.
  • FIG. 5 is a first schematic diagram of the composition structure of the image processing apparatus according to an embodiment of the disclosure.
  • FIG. 6 is a second schematic diagram of the composition structure of the image processing apparatus according to an embodiment of the disclosure.
  • FIG. 7 is a third schematic diagram of the composition structure of the image processing device according to an embodiment of the disclosure.
  • FIG. 8 is a fourth schematic diagram of the composition structure of the image processing apparatus according to an embodiment of the disclosure.
  • FIG. 9 is a schematic diagram of the hardware composition structure of an electronic device according to an embodiment of the disclosure.
  • FIG. 1 is a first schematic flowchart of an image processing method according to an embodiment of the present disclosure; as shown in FIG. 1, the method includes:
  • Step 101: Obtain a multi-frame image;
  • Step 102: Perform limb key point detection processing on the target object in the first image of the multi-frame image to obtain first key point information corresponding to a part of the limb of the target object;
  • Step 103: Determine, based on the first key point information, second key point information corresponding to the part of the limb of the target object in the second image; where, in the multi-frame image, the second image is an image after the first image.
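Steps 101 to 103 can be sketched end to end with stub detection functions. Everything here is illustrative: the function names, the dictionary-based "frames", and the bounding-box-plus-margin prediction are stand-ins for the limb key point detection network and the region-based search described below, not the patent's actual implementation.

```python
def detect_keypoints_full(image):
    """Stub for limb key point detection on a whole frame (step 102)."""
    return image["keypoints"]

def detect_keypoints_in_region(image, region):
    """Stub for key point detection restricted to a region of a later frame (step 103)."""
    x0, y0, x1, y1 = region
    return [(x, y) for (x, y) in image["keypoints"] if x0 <= x <= x1 and y0 <= y <= y1]

def bounding_region(keypoints, margin):
    """Smallest box containing all key points, enlarged by a margin on each side."""
    xs = [x for x, _ in keypoints]
    ys = [y for _, y in keypoints]
    return (min(xs) - margin, min(ys) - margin, max(xs) + margin, max(ys) + margin)

frames = [                                              # step 101: obtain images
    {"keypoints": [(100, 50), (120, 80)]},              # first image
    {"keypoints": [(105, 55), (125, 85), (400, 400)]},  # second image (one far-away point)
]
first_kps = detect_keypoints_full(frames[0])            # step 102
region = bounding_region(first_kps, margin=30)          # predicted search area
second_kps = detect_keypoints_in_region(frames[1], region)  # step 103
assert region == (70, 20, 150, 110)
assert second_kps == [(105, 55), (125, 85)]             # far-away point is excluded
```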
  • the image processing method of this embodiment can be applied to an image processing device.
  • The image processing device may be provided in an electronic device with processing capability, such as a personal computer or a server, or may be implemented by a processor executing a computer program.
  • the above-mentioned multi-frame images may be continuous videos collected by a camera device built-in or externally connected to the electronic device, or may also be received videos transmitted by other electronic devices.
  • the above-mentioned multi-frame images may be surveillance videos collected by a surveillance camera to track various target objects in the surveillance video.
  • the above-mentioned multi-frame images may also be videos stored locally or in other video libraries to track each target object in the video.
  • The image processing method of this embodiment can be applied to application scenarios such as virtual reality (VR), augmented reality (AR), or somatosensory games. The above-mentioned multi-frame image may also be images of an operator collected in a virtual reality or augmented reality scene, used to control the actions of virtual objects in the virtual reality or augmented reality scene by recognizing the operator's posture in the images; or it may be images, collected in a somatosensory game, of the target objects (such as multiple users) participating in the game.
  • the image processing device may establish a communication connection with one or more surveillance cameras, and obtain real-time surveillance videos collected by the surveillance cameras as multi-frame images to be processed.
  • The image processing device can also obtain a video from its own stored videos as the multi-frame image to be processed, or obtain a video stored in another electronic device as the multi-frame image to be processed, and so on.
  • The image processing device can also be placed in a game device, with the processor of the game device executing a computer program so that, while the game is being operated, the displayed output images serve as the multi-frame image to be processed and the target object in the images (the target object corresponding to the game operator) is tracked.
  • The multi-frame image to be processed may include one or more target objects; in some application scenarios, the target object may be a real person; in other application scenarios, the target object may also be another object determined according to actual tracking needs, such as a virtual character or other virtual object.
  • Each image in the multi-frame image may be called a frame image, which is the smallest unit constituting a video (i.e., the image to be processed). The frame images, ordered by acquisition time, form the above-mentioned multi-frame image, and the time parameters corresponding to the frame images are continuous.
  • One or more target objects may be present throughout the time range corresponding to the above-mentioned multi-frame image, or only in a part of that time range, which is not limited in this embodiment.
  • The above-mentioned first image is any one of the multi-frame image, and the second image is an image after the first image; in other words, the first image is, among the multi-frame image, any frame before the second image.
  • The second image may be the next frame image temporally continuous with the first image.
  • For example, if the multi-frame image includes 10 frame images and the first image is the second of the 10 frame images, the second image is the third frame image.
  • The second image may also be an image after the first image that is separated from the first image by a preset number of frame images.
  • For example, if the multi-frame image includes 20 frame images, the first image is the second of the 20 frame images, and the preset number is 3 frame images, the second image may be the sixth of the 20 frame images. The preset number can be set in advance according to actual conditions, for example according to the moving speed of the target object. This approach can effectively reduce the amount of data to be processed, thereby reducing the consumption of the image processing device.
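The frame-skipping example above can be written as a small helper. The function name `frames_to_process` is illustrative; the only behavior modeled is that after the first image, every (preset number + 1)-th frame is processed, so with 20 frames, a first image at frame 2, and a preset number of 3, frame 2 is followed by frame 6 as in the text.

```python
def frames_to_process(num_frames, start_index, skip):
    """Return the 1-based indices of frames to process: the first image,
    then every (skip + 1)-th frame after it."""
    return list(range(start_index, num_frames + 1, skip + 1))

# 20 frames, first image is frame 2, preset number = 3 intervening frames.
assert frames_to_process(20, 2, 3) == [2, 6, 10, 14, 18]
# With skip = 0 every consecutive frame is processed.
assert frames_to_process(5, 1, 0) == [1, 2, 3, 4, 5]
```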
  • the image processing device may perform limb key point detection processing on the target object in the first image through the limb key point detection network to obtain first key point information corresponding to part of the limb of the target object.
  • part of the limbs of the target object includes at least one of the following: head, neck, shoulders, chest, waist, hips, arms, and hands.
  • The first key point information and the second key point information corresponding to the part of the limb of the target object include contour key point information and/or bone key point information of at least one of the head, neck, shoulders, chest, waist, hips, arms, and hands of the target object.
  • In this embodiment, the part of the limb of the target object is the upper-body limbs of the target object, so that a target object showing its upper body can be identified in the multi-frame image, realizing tracking of target objects with only the upper body visible as well as those with the whole body visible.
  • The key points corresponding to the first key point information and the second key point information may include: at least one key point of the head, at least one key point of the shoulders, at least one key point of the arms, at least one key point of the chest, at least one key point of the hips, and at least one key point of the waist; optionally, they may also include at least one key point of the hands. Whether the image processing device can obtain hand key points depends on whether hand key points are marked in the sample images used to train the limb key point detection network; when hand key points are marked in the sample images, they can be detected through the limb key point detection network.
  • The first key point information and the second key point information may include key point information of at least one organ; the key point information of the organ may include at least one of the following: nose key point information, eyebrow key point information, and mouth key point information.
  • the first key point information and the second key point information may include elbow key point information.
  • the first key point information and the second key point information may include wrist key point information.
  • the first key point information and the second key point information may further include contour key point information of the hand.
  • the first key point information and the second key point information may include left hip key point information and right hip key point information.
  • the first key point information and the second key point information may also include the key point information of the spine root.
  • The above-mentioned first key point information may specifically include the coordinates of the key points.
  • The aforementioned first key point information may include the coordinates of the contour key points and/or the coordinates of the bone key points. It can be understood that the contour edge of the corresponding part of the limb can be formed from the coordinates of the contour key points, and the skeleton of the corresponding part of the limb can be formed from the coordinates of the bone key points.
  • FIG. 2 is a schematic flowchart of a method for detecting and processing limb key points in an image processing method according to an embodiment of the present disclosure; in some optional embodiments, step 102 may refer to FIG. 2 and includes:
  • Step 1021: Perform limb detection processing on the target object in the first image to determine the first area of the target object; the first area includes the area where the part of the limb of the target object is located;
  • Step 1022: Perform limb key point detection processing on the pixels corresponding to the first area to obtain the first key point information corresponding to the part of the limb of the target object.
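Steps 1021 and 1022 can be sketched as a two-stage pipeline: detect a first area, crop its pixels, run key point detection on the crop, and map the results back to image coordinates. The functions are stubs standing in for the limb detection network and limb key point detection network; all names and values are illustrative only.

```python
def detect_first_area(image):
    """Stub limb detector: returns the first area as (x, y, w, h)."""
    return (8, 4, 16, 24)

def detect_limb_keypoints(patch):
    """Stub key point detector: returns key points in patch coordinates."""
    h, w = len(patch), len(patch[0])
    return [(w // 2, h // 4)]  # e.g. a single "head" key point

# A 64x64 single-channel image as a nested list of pixel values.
image = [[0] * 64 for _ in range(64)]
x, y, w, h = detect_first_area(image)                    # step 1021
patch = [row[x:x + w] for row in image[y:y + h]]         # pixels of the first area
# Step 1022: detect on the patch, then shift back into full-image coordinates.
kps = [(px + x, py + y) for px, py in detect_limb_keypoints(patch)]
assert len(patch) == 24 and len(patch[0]) == 16
assert kps == [(16, 10)]
```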
  • Through the limb detection processing, the first area corresponding to the upper body of each target object, or the first area corresponding to the whole body of each target object, can be determined. A detection frame (for example, a rectangular frame) identifying the target object may be used to indicate the first area corresponding to the part of the limb; for example, the upper body of each person in the first image may be identified by a rectangular frame.
  • The above-mentioned performing limb detection processing on the target object in the first image includes: using a limb detection network to perform limb detection processing on the target object in the first image, where the limb detection network is trained using a first type of sample image; the first type of sample image is marked with the detection frame of the target object; the marking range of the detection frame includes the area where the part of the limb of the target object is located; the part of the limb of the target object may be the upper-body limbs of the target object.
  • limb detection can be performed on the first image through a pre-trained limb detection network to determine the first area of the target object, that is, to obtain the detection frame of each target object in the first image.
  • the above detection frame can identify part or all of the limbs of the target object, that is, all the limbs or upper body limbs of the target object can be detected through the limb detection network.
  • the aforementioned limb detection network may adopt any network structure capable of detecting the limb of the target object, which is not limited in this embodiment.
  • Feature extraction can be performed on the first image through the limb detection network, and, based on the extracted features, the center point of the part of the limb of each target object in the first image and the height and width of the detection frame of that part of the limb can be determined. Based on the center point of the part of the limb of each target object and the corresponding height and width, the detection frame of the part of the limb of each target object can be determined.
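Converting a predicted center point plus height and width into a detection frame is a simple coordinate transform. The function name `box_from_center` and the (x0, y0, x1, y1) corner format are illustrative choices, not specified by the patent.

```python
def box_from_center(cx, cy, height, width):
    """Return corner coordinates (x0, y0, x1, y1) for a detection frame of the
    given height and width centered at (cx, cy)."""
    return (cx - width / 2, cy - height / 2, cx + width / 2, cy + height / 2)

# A box of height 20 and width 10 centered at (50, 40).
assert box_from_center(50, 40, height=20, width=10) == (45.0, 30.0, 55.0, 50.0)
```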
  • The limb detection network can be obtained by training with the first type of sample image marked with the detection frame of the target object, where the marking range of the detection frame includes the part of the limb of the target object. The first type of sample image may be marked with a detection frame of only a part of the limb of the target object (for example, the upper-body limbs of the target object), or may be marked with a detection frame of the complete limbs of the target object.
  • During training, the limb detection network extracts the feature data of the first type of sample image and, based on the feature data, determines the predicted center point of the part of the limb of each target object in the first type of sample image and the height and width of the predicted detection frame of the corresponding part of the limb; the predicted detection frame corresponding to each part of the limb is determined based on the predicted center point and the corresponding height and width. A loss is then determined from the predicted detection frame and the marked detection frame, and the network parameters of the limb detection network are adjusted based on the loss.
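The patent does not specify a loss function for the detection frame; as one simple possibility, an L1 loss over the predicted versus annotated center point and box size could be used. The function name and the (cx, cy, h, w) parameterization are assumptions for illustration.

```python
def detection_loss(pred, target):
    """L1 loss over (cx, cy, height, width) of the predicted detection frame
    versus the annotated detection frame."""
    return sum(abs(p - t) for p, t in zip(pred, target))

pred = (52.0, 40.0, 100.0, 60.0)    # predicted center (cx, cy), height, width
target = (50.0, 42.0, 98.0, 64.0)   # annotated detection frame
loss = detection_loss(pred, target)
assert loss == 10.0
# In training, this loss would be used (e.g. via backpropagation) to adjust
# the limb detection network's parameters.
```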
  • The above-mentioned performing limb key point detection processing on the pixels corresponding to the first area includes: using a limb key point detection network to perform limb key point detection processing on the pixels corresponding to the first area, where the limb key point detection network is trained using the second type of sample image; the second type of sample image is marked with the key points of the target object; the marking range of the key points includes the part of the limb of the target object.
  • a pre-trained limb key point detection network may be used to perform limb key point detection on pixels corresponding to the first region to determine the first key point information of a part of the limb of each target object.
  • the above-mentioned first area may include part of the limbs of the target object, and the pixel points corresponding to the detection frame of each target object may be input to the limb key point detection network to obtain the first key point information corresponding to the part of the limb of each target object .
  • the aforementioned limb key point detection network may adopt any network structure capable of detecting limb key points, which is not limited in this embodiment.
  • The limb key point detection network can be obtained by training with the second type of sample image marked with the key points of the target object, where the marking range of the key points includes the part of the limb of the target object. It is understandable that the second type of sample image may be marked with the key points of only a part of the limb of the target object (for example, the upper-body limbs of the target object), or with the key points of the complete limbs of the target object.
  • During training, the feature data of the second type of sample image can be extracted using the limb key point detection network, and the predicted key points of the part of the limb of each target object in the second type of sample image are determined based on the feature data; a loss is determined from the predicted key points and the marked key points, and the network parameters of the limb key point detection network are adjusted based on the loss.
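The key point training step compares predicted key points with marked key points to produce a loss. The patent does not fix a particular loss form; mean Euclidean distance between corresponding key points is used here purely as an example, and the function name is illustrative.

```python
import math

def keypoint_loss(predicted, marked):
    """Mean Euclidean distance between corresponding predicted and marked
    key points (one illustrative choice of loss)."""
    dists = [math.dist(p, m) for p, m in zip(predicted, marked)]
    return sum(dists) / len(dists)

predicted = [(10.0, 10.0), (20.0, 20.0)]   # network output for two key points
marked = [(13.0, 14.0), (20.0, 20.0)]      # annotated key points
assert keypoint_loss(predicted, marked) == 2.5   # (5.0 + 0.0) / 2
```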
  • FIG. 3 is a schematic flowchart of a method for tracking body key points in an image processing method according to an embodiment of the present disclosure; in some optional embodiments, step 103 may refer to FIG. 3, and the method includes:
  • Step 1031: Determine a second area in the first image based on the first key point information; the second area is larger than the first area of the target object; the first area includes the area where the part of the limb of the target object is located;
  • Step 1032: Determine, according to the second area, a third area in the second image corresponding to the position range of the second area;
  • Step 1033: Perform limb key point detection processing on the pixels in the third area in the second image to obtain the second key point information corresponding to the part of the limb.
  • An area is first determined based on the first key point information of the part of the limb of the target object; this area may be the smallest area containing all the key points of the part of the limb of the target object. For example, when the area is a rectangular area, the rectangular area is the smallest rectangle containing all the key points of the part of the limb of the target object.
  • The above-mentioned second area is an area obtained by enlarging this area in the first image: with the center point of the area as the center, the four sides of the area extend away from the center point. For example, if the area is a rectangular area of height H and width W, the second area may be represented as the rectangular area in the first image that has the same center point, a height of 3H/2, and a width of 3W/2.
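The enlargement described above (same center, height and width scaled by 3/2) can be written as a small helper. The function name `enlarge_region` and the corner-based (x0, y0, x1, y1) box format are illustrative; the default factor of 1.5 matches the 3H/2 and 3W/2 example in the text.

```python
def enlarge_region(x0, y0, x1, y1, factor=1.5):
    """Scale a rectangle's height and width by `factor`, keeping its center fixed."""
    cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
    w, h = (x1 - x0) * factor, (y1 - y0) * factor
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)

# A W=40 by H=80 area centered at (30, 60) becomes a 60 by 120 area
# with the same center.
assert enlarge_region(10, 20, 50, 100) == (0.0, 0.0, 60.0, 120.0)
```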
  • the third area in the second image corresponding to the above-mentioned position range can be determined according to the position range of the second area in the first image.
  • determining the third area in the second image corresponding to the position range of the second area according to the second area may further include: performing limb key point detection processing on pixels corresponding to the second area, Obtain third key point information; determine a position range of the third key point information in the first image, and determine a third area in the second image corresponding to the position range based on the position range.
  • the limb key point detection network is still used to perform limb key point detection processing on the pixels corresponding to the second area; the pixels corresponding to the expanded second area in the first image may be used as input data to the limb key point detection network, which outputs the third key point information.
  • the third key point information is used as the prediction key point information of the target object in the second image.
  • the area where the target object is located is expanded (for example, the area where the part of the limb of the target object in the previous frame image is located is expanded), limb key point detection is performed on the expanded area, and the obtained key points are used as the predicted key points of the target object (for example, of the part of the limb of the target object) in the frame following the current frame image (i.e., the first image), namely the second image. Further, based on the predicted position range, limb key point detection processing is performed on the pixel points corresponding to the third area in the second image, and the detected key point information is the second key point information corresponding to the part of the limb of the target object.
  • the above step 103 may further include: determining, according to the position range of the first key point information in the first image, the third area in the second image corresponding to that position range; and performing limb key point detection processing on pixels in the third area in the second image to obtain second key point information corresponding to the part of the limb.
  • the third area in the second image corresponding to the above-mentioned position range can be determined according to the position range of the first key point in the first image.
  • the pixel points corresponding to the third area in the second image are further subjected to limb key point detection processing, and the detected key point information is the second key point information corresponding to the part of the limb of the target object.
  • step 103 may further include: determining the predicted area of the target object in the second image based on the first image, the first area of the target object, and a target tracking network; and performing limb key point detection processing on the pixel points in the predicted area to obtain the second key point information corresponding to the part of the limb of the target object. Here, the target tracking network is trained using multi-frame sample images; the multi-frame sample images include at least a first sample image and a second sample image, the second sample image being an image after the first sample image; the position of the target object is marked in the first sample image, and the position of the target object is marked in the second sample image.
  • the detection frame of the target object is marked in each of the multi-frame sample images, and the position of the target object in a sample image is represented by its detection frame; the marking range of the detection frame includes the area where the part of the limb of the target object is located; the part of the limb of the target object may be the upper-body limbs of the target object.
  • through a pre-trained target tracking network, the predicted position of the target object in the next frame of image (i.e., the second image) can be determined based on the previous frame of image (i.e., the first image) and the position of the target object in that image.
  • for example, the first image containing the detection frame of the target object can be input to the target tracking network to obtain the predicted position of the target object in the second image;
  • limb key point detection processing on that predicted position then obtains the second key point information of the part of the limb of the target object in the second image.
  • the above-mentioned target tracking network may adopt any network structure capable of realizing target tracking, which is not limited in this embodiment.
  • the target tracking network can be obtained by training with multi-frame sample images marked with the position of the target object (for example, a detection frame containing the target object, or a detection frame containing a part of the limb of the target object).
  • the target tracking network can be used to process the first sample image, and the position of the target object is marked in the first sample image.
  • the result is the predicted position of the target object in the second sample image; the loss can be determined according to the predicted position and the labeled position of the target object in the second sample image, and the network parameters of the target tracking network can be adjusted based on the loss.
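The disclosure does not fix a particular loss between the predicted position and the labeled position; as one illustrative choice only, a mean absolute error over the detection-frame coordinates could be computed as:

```python
def box_regression_loss(pred_box, label_box):
    """Mean absolute error between a predicted and a labeled detection
    frame, each given as (x, y, w, h). One possible loss; the actual
    network may use a different formulation."""
    return sum(abs(p, ) if False else abs(p - t) for p, t in zip(pred_box, label_box)) / len(pred_box)
```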
  • the second key point information corresponding to the part of the limb of the target object in the second image may be determined, and based on that second key point information, the key point information corresponding to the part of the limb of the target object in subsequent images is further determined, and so on, until the key point information corresponding to the part of the limb of the target object can no longer be detected in a following frame of image.
  • the above-mentioned target object is no longer included in the processed multi-frame image, that is, the target object has moved out of the field of view of the multi-frame image to be processed.
  • the image processing device may also perform limb detection for the target object in each frame of image to obtain the area where the target object in each frame of image is located.
  • the detected target object is used as the tracking object, and it is determined whether a new target object appears in the current frame image; when a new target object appears in the current frame image, the new target object is used as a tracking object, and limb key point detection processing is performed on the pixel points in the first area corresponding to the new target object, that is, the processing of step 103 in the embodiment of the present disclosure is executed for the new target object.
  • the image processing device may execute the limb detection processing of the target object in the image every preset time or every preset number of image frames, so as to detect whether a new target object appears in the image at regular intervals. Track new target objects.
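The alternation described above — track from frame to frame, and re-run full limb detection every preset number of frames to pick up new target objects — can be sketched as follows. `detect` and `track` are hypothetical stand-ins for the limb detection network and the key-point tracking step, not functions from the disclosure:

```python
def process_video(frames, detect, track, detection_interval=10):
    """Alternate full limb detection (every detection_interval frames)
    with key-point-based tracking in between.

    detect(frame) -> list of key-point sets for detected objects
    track(frame, prev_keypoints) -> updated key-point set, or None if
    the target object can no longer be tracked.
    """
    tracked = []  # key-point sets of the objects currently being tracked
    for i, frame in enumerate(frames):
        # propagate every tracked object into the current frame,
        # dropping objects that were lost
        tracked = [kp for kp in (track(frame, kp) for kp in tracked) if kp]
        if i % detection_interval == 0:
            # periodic re-detection: start tracking any new target object
            for kp in detect(frame):
                if kp not in tracked:
                    tracked.append(kp)
        yield list(tracked)
```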
  • the foregoing method further includes: in response to obtaining first key point information corresponding to a part of the limb of the target object, assigning a tracking identifier to the target object; and determining the number of target objects in the multi-frame images based on the number of tracking identifiers assigned during processing of the multi-frame images.
  • when the image processing device detects the target object in the first frame of the multi-frame images to be processed, that is, when the first key point information corresponding to the part of the limb of the target object is obtained, a tracking identifier is assigned to the target object; the tracking identifier remains associated with the target object until the target object can no longer be tracked.
  • the image processing device may also perform limb detection for the target object in each frame of image to obtain the area corresponding to the part of the limb of the target object in each frame of image, and use the detected target object as the tracking object. Based on this, the image processing device detects the first frame of the images to be processed and assigns a tracking identifier to the detected target object. After that, the tracking identifier keeps following the target object until the target object can no longer be tracked. If a new target object is detected in a certain frame of image, a tracking identifier is assigned to the new target object, and the above scheme is repeated.
  • each target object detected at the same time corresponds to different tracking identifiers; target objects tracked in a continuous time range correspond to the same tracking identifier; target objects detected separately in a non-continuous time range Correspond to different tracking identifiers.
  • for example, if three target objects are detected, each target object is assigned its own tracking identifier; the identifiers can be recorded as identifier 1, identifier 2, and identifier 3.
  • suppose the first of the above three target objects then disappears from the image; within the current minute there are only two target objects, whose tracking identifiers are identifier 2 and identifier 3.
  • if the above-mentioned first target object later appears in the image again, then compared with the previous image a new target object is detected; even though it is the same object that appeared earlier (i.e., the first target object), it is still assigned identifier 4 as its tracking identifier, and so on.
  • the technical solution of this embodiment can determine the number of target objects that have appeared in the multi-frame images based on the number of tracking identifiers assigned during processing of the multi-frame images.
  • the number of target objects that have appeared in multiple frames of images refers to the number of target objects that have appeared in a time range corresponding to the multiple frames of images.
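The identifier scheme above — a fresh identifier for every newly detected target object, including one that re-appears after being lost, so the count of identifiers ever assigned equals the count of objects that have appeared — can be sketched as follows. `TrackingIdAllocator` and its method names are illustrative, not taken from the disclosure:

```python
class TrackingIdAllocator:
    """Assign monotonically increasing tracking identifiers.

    A re-appearing target object is treated as new and receives a fresh
    identifier (identifier 4 in the example above), so total_seen counts
    every target object that appeared within the multi-frame images.
    """

    def __init__(self):
        self._next_id = 1
        self.active = {}  # tracking identifier -> target object state

    def new_target(self, state):
        tid = self._next_id
        self._next_id += 1
        self.active[tid] = state
        return tid

    def lose_target(self, tid):
        # the identifier is retired, never reused
        self.active.pop(tid, None)

    @property
    def total_seen(self):
        return self._next_id - 1
```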
  • the key points of the part of the limb of the target object in the first image among the multi-frame images to be processed are recognized, and the key points of the part of the limb of the target object in the subsequent second image are determined based on the recognized key points, thereby realizing target tracking in a scene where only part of the limb of the target object (for example, the upper body) appears in the image; that is, the technical solution of the embodiment of the present disclosure can adapt both to complete-limb scenes and to partial-limb (such as upper-body) scenes, realizing target tracking in the image.
  • FIG. 4 is a second schematic diagram of the flow of the image processing method according to an embodiment of the disclosure; as shown in FIG. 4, the method includes:
  • Step 201 Obtain multiple frames of images
  • Step 202 Perform limb key point detection processing on the target object in the first image in the multi-frame image to obtain first key point information corresponding to part of the limb of the target object;
  • Step 203 Determine the second key point information corresponding to the part of the limb of the target object in the second image based on the first key point information; wherein, in the multiple frames of images, the second image is one frame after the first image;
  • Step 204 Determine the posture of the target object based on the second key point information; determine the interactive instruction corresponding to the target object based on the posture of the target object.
  • for step 201 to step 203 in this embodiment, reference may be made to the description of step 101 to step 103, which will not be repeated here.
  • based on the tracked target object, the posture of the target object can be determined from the second key point information of the target object, the interactive instruction corresponding to each posture can be determined based on that posture, and the interactive instruction corresponding to each posture is then responded to.
  • the image processing device can determine the corresponding interactive instruction based on each posture and respond to the interactive instruction; in responding to the interactive instruction, for example, the image processing device itself, or the electronic device where the image processing device is located, can turn on or turn off some of its own functions; alternatively, the interactive instruction can be sent to another electronic device, which receives the interactive instruction and turns certain functions on or off based on it. In other words, the interactive instruction can also be used to turn on or turn off corresponding functions of other electronic devices.
  • the image processing device can perform corresponding processing based on various interactive instructions, including but not limited to: controlling a virtual reality or augmented reality scene so that a virtual object performs a corresponding action; or controlling a somatosensory game scene so that the virtual character corresponding to the target object performs a corresponding action.
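The posture-to-instruction correspondence described above can be as simple as a lookup table. The posture labels and instruction names below are hypothetical placeholders for illustration; the disclosure does not enumerate specific postures:

```python
# Hypothetical posture labels and instructions, for illustration only.
POSTURE_TO_INSTRUCTION = {
    "hands_raised": "jump",
    "lean_left": "move_left",
    "lean_right": "move_right",
}

def interactive_instruction(posture, default="idle"):
    """Map a recognized posture of the target object to an interactive
    instruction, e.g. an action for the virtual character in a
    somatosensory game scene."""
    return POSTURE_TO_INSTRUCTION.get(posture, default)
```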
  • the corresponding processing performed by the image processing device based on the interactive instruction may include controlling the virtual target object to perform an action corresponding to the interactive instruction in a real scene or a virtual scene.
  • target tracking is realized in a scene where only part of the limbs (such as the upper body) of the target object appears in the image; that is, the technical solution of the embodiments of the present disclosure can adapt both to complete-limb scenes and to partial-limb (such as upper-body) scenes, realizing target tracking in the image. On the other hand, the key point information of the tracked target object is detected during the target tracking process, the posture of the tracked target object is determined based on that key point information, and the corresponding interactive instruction is determined based on the posture, which realizes human-computer interaction in specific application scenarios (such as virtual reality scenes, augmented reality scenes, somatosensory game scenes, and other interactive scenes) and enhances the user's interactive experience.
  • FIG. 5 is a schematic diagram 1 of the composition structure of an image processing device according to an embodiment of the disclosure; as shown in FIG. 5, the device includes: an acquisition unit 31, a detection unit 32, and a tracking determination unit 33; wherein,
  • the aforementioned acquiring unit 31 is configured to acquire multiple frames of images
  • the detection unit 32 is configured to perform limb key point detection processing on the target object in the first image in the multi-frame image, to obtain first key point information corresponding to part of the limb of the target object;
  • the tracking determination unit 33 is configured to determine, based on the first key point information, the second key point information corresponding to the part of the limb of the target object in the second image; wherein, in the multi-frame image, the second image is An image after the first image.
  • the detection unit 32 includes: a limb detection module 321 and a limb key point detection module 322; wherein,
  • the limb detection module 321 is configured to perform limb detection processing on the target object in the first image to determine the first area of the target object; the first area includes the area where part of the limb of the target object is located;
  • the limb key point detection module 322 is configured to perform limb key point detection processing on the pixel points corresponding to the first region to obtain first key point information corresponding to the part of the limb of the target object.
  • the tracking determining unit 33 is configured to determine a second area in the first image based on the first key point information; the second area is larger than the first area of the target object; One area includes the area where part of the limb of the target object is located; according to the second area, determine the third area in the second image corresponding to the position range of the second area; perform limb keying on the pixels in the third area in the second image Point detection processing to obtain the second key point information corresponding to the part of the limb.
  • the above-mentioned tracking determination unit 33 is configured to determine, according to the position range of the first key point information in the first image, the third area in the second image corresponding to that position range, and to perform limb key point detection processing on the pixels in the third area in the second image to obtain the second key point information corresponding to the part of the limb.
  • the limb detection module 321 is configured to use a limb detection network to perform limb detection processing on the target object in the first image; wherein, the limb detection network uses the first type of sample image Obtained by training; the detection frame of the target object is marked in the above-mentioned first-type sample image; the marking range of the detection frame includes the area where part of the limb of the target object is located.
  • the aforementioned limb key point detection module 322 is configured to use a limb key point detection network to perform limb key point detection processing on pixels corresponding to the first region; wherein, the aforementioned limb key point detection The network is trained by using the second type of sample image; the above-mentioned second type of sample image is marked with key points that include part of the body of the target object.
  • the part of the limbs of the target object includes at least one of the following: head, neck, shoulders, chest, waist, hips, arms, hands; the above-mentioned first key point information And the above-mentioned second key point information includes contour key point information and/or bone key point information of at least one of the head, neck, shoulders, chest, waist, hips, arms, and hands.
  • the above-mentioned apparatus further includes: an allocation unit 34 and a statistics unit 35; wherein,
  • the allocation unit 34 is configured to allocate a tracking identifier to the target object in response to the detection unit 32 obtaining the first key point information corresponding to a part of the limb of the target object;
  • the aforementioned statistical unit 35 is configured to determine the number of target objects in the multi-frame image based on the number of tracking identifiers allocated during the processing of the multi-frame image.
  • the above-mentioned apparatus further includes a determining unit 36 configured to determine the posture of the target object based on the second key point information, and to determine the interactive instruction corresponding to the target object based on the posture of the target object.
  • the acquisition unit 31, the detection unit 32 (including the limb detection module 321 and the limb key point detection module 322), the tracking determination unit 33, the allocation unit 34, the statistics unit 35, and the determination unit 36 in the above-mentioned image processing device can each be implemented by a Central Processing Unit (CPU), a Digital Signal Processor (DSP), a Microcontroller Unit (MCU), or a Field-Programmable Gate Array (FPGA).
  • when the image processing device provided in the foregoing embodiment performs image processing, the division into the foregoing program modules is used only as an example for illustration. In practical applications, the foregoing processing can be allocated to different program modules as needed; that is, the internal structure of the device can be divided into different program modules to complete all or part of the processing described above.
  • the image processing device provided in the foregoing embodiment and the image processing method embodiment belong to the same concept, and the specific implementation process is detailed in the method embodiment, which will not be repeated here.
  • FIG. 9 is a schematic diagram of the hardware composition structure of the electronic device of the embodiment of the disclosure. As shown in FIG. 9, the electronic device 40 may include a memory 42, a processor 41, and a computer program stored on the memory 42 and runnable on the processor 41; when the processor 41 executes the program, the steps of the image processing method of the embodiments of the present disclosure are realized.
  • the various components in the electronic device 40 may be coupled together through a bus system 43. It can be understood that the bus system 43 is used to implement connection and communication between these components. In addition to a data bus, the bus system 43 also includes a power bus, a control bus, and a status signal bus. However, for the sake of clear description, the various buses are all marked as the bus system 43 in FIG. 9.
  • the memory 42 may be a volatile memory or a non-volatile memory, and may also include both volatile and non-volatile memory.
  • the non-volatile memory can be a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), a Ferromagnetic Random Access Memory (FRAM), a Flash Memory, a magnetic surface memory, an optical disc, or a Compact Disc Read-Only Memory (CD-ROM); the magnetic surface memory can be a magnetic disk memory or a magnetic tape memory.
  • the volatile memory may be a random access memory (RAM, Random Access Memory), which is used as an external cache.
  • by way of example but not limitation, many forms of RAM are available, such as Static Random Access Memory (SRAM), Synchronous Static Random Access Memory (SSRAM), Dynamic Random Access Memory (DRAM), Synchronous Dynamic Random Access Memory (SDRAM), Double Data Rate Synchronous Dynamic Random Access Memory (DDR SDRAM), Enhanced Synchronous Dynamic Random Access Memory (ESDRAM), SyncLink Dynamic Random Access Memory (SLDRAM), and Direct Rambus Random Access Memory (DRRAM).
  • the memory 42 described in the embodiments of the present disclosure is intended to include, but is not limited to, these and any other suitable types of memory.
  • the methods disclosed in the foregoing embodiments of the present disclosure may be applied to the processor 41 or implemented by the processor 41.
  • the processor 41 may be an integrated circuit chip with signal processing capabilities. In the implementation process, the steps of the foregoing method can be completed by an integrated logic circuit of hardware in the processor 41 or instructions in the form of software.
  • the aforementioned processor 41 may be a general-purpose processor, a DSP, or other programmable logic devices, discrete gates or transistor logic devices, discrete hardware components, and the like.
  • the processor 41 may implement or execute various methods, steps, and logical block diagrams disclosed in the embodiments of the present disclosure.
  • the general-purpose processor may be a microprocessor or any conventional processor or the like.
  • the steps of the method disclosed in the embodiments of the present disclosure may be directly embodied as being executed and completed by a hardware decoding processor, or executed and completed by a combination of hardware and software modules in the decoding processor.
  • the software module may be located in a storage medium, and the storage medium is located in the memory 42.
  • the processor 41 reads the information in the memory 42 and completes the steps of the foregoing method in combination with its hardware.
  • the electronic device 40 may be implemented by one or more Application Specific Integrated Circuits (ASICs), DSPs, Programmable Logic Devices (PLDs), Complex Programmable Logic Devices (CPLDs), FPGAs, general-purpose processors, controllers, MCUs, microprocessors, or other electronic components, for executing the foregoing method.
  • the embodiment of the present disclosure also provides a computer-readable storage medium, such as a memory 42 including a computer program, which can be executed by the processor 41 of the electronic device 40 to complete the steps described in the foregoing method.
  • the computer-readable storage medium can be FRAM, ROM, PROM, EPROM, EEPROM, Flash Memory, magnetic surface memory, optical disc, or CD-ROM, etc.; it can also be any of various devices including one of, or any combination of, the above-mentioned memories, such as mobile phones, computers, tablet devices, personal digital assistants, etc.
  • the embodiment of the present disclosure also provides a computer-readable storage medium on which a computer program is stored, and when the program is executed by a processor, the steps of the image processing method according to the embodiment of the present disclosure are realized.
  • the embodiment of the present disclosure also provides a computer program that enables a computer to execute the steps of the image processing method described in the embodiment of the present disclosure.
  • the disclosed device and method may be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • the division of the units is only a logical function division, and there may be other divisions in actual implementation; for example, multiple units or components can be combined or integrated into another system, or some features can be ignored or not implemented.
  • the coupling, direct coupling, or communication connection between the components shown or discussed may be indirect coupling or communication connection through some interfaces, devices, or units, and may be in electrical, mechanical, or other forms.
  • the units described above as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • the functional units in the embodiments of the present disclosure can be all integrated into one processing unit, or each unit can be individually used as a unit, or two or more units can be integrated into one unit;
  • the unit can be implemented in the form of hardware, or in the form of hardware plus software functional units.
  • the foregoing program can be stored in a computer-readable storage medium; when the program is executed, the steps of the foregoing method embodiments are performed. The foregoing storage medium includes various media that can store program codes, such as a mobile storage device, ROM, RAM, magnetic disk, or optical disc.
  • if the aforementioned integrated unit of the present disclosure is implemented in the form of a software function module and sold or used as an independent product, it may also be stored in a computer-readable storage medium.
  • the computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute all or part of the methods described in the various embodiments of the present disclosure.
  • the aforementioned storage media include: removable storage devices, ROM, RAM, magnetic disks, or optical disks and other media that can store program codes.

Abstract

The embodiments of the present disclosure disclose an image processing method and apparatus, an electronic device, and a storage medium. Said method comprises: obtaining a plurality of frames of images; performing limb key point detection processing on a target object in a first image among the plurality of frames of images, so as to obtain first key point information corresponding to part of the limbs of the target object; and determining second key point information corresponding to the part of the limbs of the target object in a second image on the basis of the first key point information, wherein in the plurality of frames of images, the second image is a frame of image following the first image.

Description

图像处理方法、装置、电子设备和存储介质Image processing method, device, electronic equipment and storage medium
相关申请的交叉引用Cross-references to related applications
本公开基于申请号为202010357593.2、申请日为2020年04月29日的中国专利申请提出,并要求该中国专利申请的优先权,该中国专利申请的全部内容在此以引入方式并入本公开。The present disclosure is filed based on a Chinese patent application with an application number of 202010357593.2 and an application date of April 29, 2020, and claims the priority of the Chinese patent application. The entire content of the Chinese patent application is hereby incorporated into the present disclosure by way of introduction.
技术领域Technical field
本公开涉及计算机视觉技术领域,具体涉及一种图像处理方法、装置、电子设备和存储介质。The present disclosure relates to the field of computer vision technology, and in particular to an image processing method, device, electronic equipment, and storage medium.
背景技术Background technique
目标跟踪技术通常基于肢体检测算法和肢体关键点检测算法,利用肢体检测算法检测出的人体,以及肢体关键点检测算法检测出的人体关键点,实现目标跟踪。但是目前的肢体检测算法和肢体关键点检测算法无法适应只有上半身肢体的场景,从而导致只有上半身肢体的目标无法进行跟踪。Target tracking technology is usually based on a limb detection algorithm and a limb key point detection algorithm, using the human body detected by the limb detection algorithm and the human body key points detected by the limb key point detection algorithm to achieve target tracking. However, current limb detection algorithms and limb key point detection algorithms cannot adapt to scenes with only upper body limbs, which leads to the inability to track targets with only upper body limbs.
发明内容Summary of the invention
本公开实施例提供一种图像处理方法、装置、电子设备和存储介质。The embodiments of the present disclosure provide an image processing method, device, electronic equipment, and storage medium.
本公开实施例提供了一种图像处理方法,所述方法包括:获得多帧图像;对所述多帧图像中的第一图像中的目标对象进行肢体关键点检测处理,获得所述目标对象的部分肢体对应的第一关键点信息;基于所述第一关键点信息确定第二图像中的所述目标对象的所述部分肢体对应的第二关键点信息;其中,在所述多帧图像中,所述第二图像为所述第一图像后的一帧图像。The embodiment of the present disclosure provides an image processing method, the method includes: obtaining a multi-frame image; performing limb key point detection processing on a target object in a first image in the multi-frame image, to obtain an image of the target object The first key point information corresponding to the part of the limb; the second key point information corresponding to the part of the limb of the target object in the second image is determined based on the first key point information; wherein, in the multi-frame image , The second image is an image after the first image.
在本公开的一些可选实施例中,所述对所述多帧图像中的第一图像中的目标对象进行肢体关键点检测处理,获得所述目标对象的部分肢体对应的第一关键点信息,包括:对所述第一图像中的所述目标对象进行肢体检测处理,确定所述目标对象的第一区域;所述第一区域包括所述目标对象的部分肢体所在区域;对所述第一区域对应的像素点进行肢体关键点检测处理,获得所述目标对象的所述部分肢体对应的第一关键点信息。In some optional embodiments of the present disclosure, the limb key point detection processing is performed on the target object in the first image in the multi-frame image to obtain first key point information corresponding to part of the limb of the target object , Including: performing limb detection processing on the target object in the first image to determine a first area of the target object; the first area includes the area where part of the limb of the target object is located; The pixel points corresponding to a region are subjected to limb key point detection processing, and the first key point information corresponding to the part of the limb of the target object is obtained.
In some optional embodiments of the present disclosure, determining, based on the first key point information, the second key point information corresponding to the partial limb of the target object in the second image includes: determining a second region in the first image based on the first key point information, the second region being larger than the first region of the target object, the first region including the region where the partial limb of the target object is located; determining, according to the second region, a third region in the second image corresponding to the position range of the second region; and performing limb key point detection on pixels within the third region of the second image to obtain the second key point information corresponding to the partial limb.
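The region handling in this embodiment can be illustrated with a minimal sketch. The expansion margin and the clamping to the image bounds are illustrative assumptions, not values fixed by the disclosure, and the function names are hypothetical:

```python
def bounding_region(keypoints):
    """First region: tight box (x0, y0, x1, y1) around the partial-limb
    key point coordinates (x, y)."""
    xs = [x for x, y in keypoints]
    ys = [y for x, y in keypoints]
    return min(xs), min(ys), max(xs), max(ys)

def expand_region(region, margin, img_w, img_h):
    """Second region: enlarge the first region so the limb is likely still
    inside it in the next frame; the same coordinate range then defines the
    third region in the second image."""
    x0, y0, x1, y1 = region
    return (max(0, x0 - margin), max(0, y0 - margin),
            min(img_w, x1 + margin), min(img_h, y1 + margin))
```

Limb key point detection is then run only on the pixels inside the third region of the second image, rather than on the whole frame.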
In some optional embodiments of the present disclosure, determining, based on the first key point information, the second key point information corresponding to the partial limb of the target object in the second image includes: determining, according to the position range of the first key point information in the first image, a third region in the second image corresponding to that position range; and performing limb key point detection on pixels within the third region of the second image to obtain the second key point information corresponding to the partial limb.
In some optional embodiments of the present disclosure, performing limb detection on the target object in the first image includes: performing limb detection on the target object in the first image using a limb detection network, where the limb detection network is trained on a first type of sample images, each annotated with a detection box of a target object, the annotated range of the detection box including the region where the partial limb of the target object is located.
In some optional embodiments of the present disclosure, performing limb key point detection on the pixels corresponding to the first region includes: performing limb key point detection on the pixels corresponding to the first region using a limb key point detection network, where the limb key point detection network is trained on a second type of sample images annotated with key points that include the partial limb of the target object.
In some optional embodiments of the present disclosure, the partial limb of the target object includes at least one of: the head, neck, shoulders, chest, waist, hips, arms, or hands; the first key point information and the second key point information include contour key point information and/or skeleton key point information of at least one of the head, neck, shoulders, chest, waist, hips, arms, and hands.
In some optional embodiments of the present disclosure, the method further includes: in response to obtaining the first key point information corresponding to the partial limb of the target object, assigning a tracking identifier to the target object; and determining the number of target objects in the multiple frames of images based on the number of tracking identifiers assigned while processing the multiple frames.
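The identifier-based counting described above can be sketched as follows. This is a minimal illustration; the criterion that decides whether a detection belongs to a new target object (and so triggers a new identifier) is an assumption left to the matching logic:

```python
import itertools

class TrackerRegistry:
    """Assign a tracking identifier whenever first key point information is
    obtained for a new target object, and count target objects by the number
    of identifiers issued while processing the multiple frames."""
    def __init__(self):
        self._next_id = itertools.count(1)
        self.assigned = []

    def assign(self):
        # Called when first key point information is obtained for a new object.
        tid = next(self._next_id)
        self.assigned.append(tid)
        return tid

    def num_target_objects(self):
        return len(self.assigned)
```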
In some optional embodiments of the present disclosure, the method further includes: determining a posture of the target object based on the second key point information; and determining an interaction instruction corresponding to the target object based on the posture of the target object.
An embodiment of the present disclosure further provides an image processing apparatus. The apparatus includes an acquisition unit, a detection unit, and a tracking determination unit. The acquisition unit is configured to obtain multiple frames of images; the detection unit is configured to perform limb key point detection on a target object in a first image of the multiple frames to obtain first key point information corresponding to a partial limb of the target object; and the tracking determination unit is configured to determine, based on the first key point information, second key point information corresponding to the partial limb of the target object in a second image, where, among the multiple frames, the second image is an image following the first image.
In some optional embodiments of the present disclosure, the detection unit includes a limb detection module and a limb key point detection module. The limb detection module is configured to perform limb detection on the target object in the first image to determine a first region of the target object, the first region including the region where the partial limb of the target object is located; the limb key point detection module is configured to perform limb key point detection on pixels corresponding to the first region to obtain the first key point information corresponding to the partial limb of the target object.
In some optional embodiments of the present disclosure, the tracking determination unit is configured to: determine a second region in the first image based on the first key point information, the second region being larger than the first region of the target object, the first region including the region where the partial limb of the target object is located; determine, according to the second region, a third region in the second image corresponding to the position range of the second region; and perform limb key point detection on pixels within the third region of the second image to obtain the second key point information corresponding to the partial limb.
In some optional embodiments of the present disclosure, the tracking determination unit is configured to: determine, according to the position range of the first key point information in the first image, a third region in the second image corresponding to that position range; and perform limb key point detection on pixels within the third region of the second image to obtain the second key point information corresponding to the partial limb.
In some optional embodiments of the present disclosure, the limb detection module is configured to perform limb detection on the target object in the first image using a limb detection network, where the limb detection network is trained on a first type of sample images, each annotated with a detection box of a target object, the annotated range of the detection box including the region where the partial limb of the target object is located.
In some optional embodiments of the present disclosure, the limb key point detection module is configured to perform limb key point detection on the pixels corresponding to the first region using a limb key point detection network, where the limb key point detection network is trained on a second type of sample images annotated with key points that include the partial limb of the target object.
In some optional embodiments of the present disclosure, the partial limb of the target object includes at least one of: the head, neck, shoulders, chest, waist, hips, arms, or hands; the first key point information and the second key point information include contour key point information and/or skeleton key point information of at least one of the head, neck, shoulders, chest, waist, hips, arms, and hands.
In some optional embodiments of the present disclosure, the apparatus further includes an allocation unit and a statistics unit. The allocation unit is configured to assign a tracking identifier to the target object in response to the detection unit obtaining the first key point information corresponding to the partial limb of the target object; the statistics unit is configured to determine the number of target objects in the multiple frames of images based on the number of tracking identifiers assigned while processing the multiple frames.
In some optional embodiments of the present disclosure, the apparatus further includes a determination unit configured to determine a posture of the target object based on the second key point information, and to determine an interaction instruction corresponding to the target object based on the posture of the target object.
An embodiment of the present disclosure further provides a computer-readable storage medium on which a computer program is stored; when the program is executed by a processor, the steps of the image processing method described in the embodiments of the present disclosure are implemented.
An embodiment of the present disclosure further provides an electronic device including a memory, a processor, and a computer program stored in the memory and executable on the processor; when executing the program, the processor implements the steps of the image processing method described in the embodiments of the present disclosure.
An embodiment of the present disclosure further provides a computer program that causes a computer to execute the image processing method described in the embodiments of the present disclosure.
With the image processing method and apparatus, electronic device, and storage medium provided by the embodiments of the present disclosure, the key points of a partial limb of a target object are recognized in a first image of the multiple frames to be processed, and the partial-limb key points of the target object in a subsequent second image are determined based on the recognized key points, thereby enabling target tracking in scenes where only a partial limb (for example, the upper body) of the target object appears in the image.
Description of the Drawings
FIG. 1 is a first schematic flowchart of an image processing method according to an embodiment of the present disclosure;
FIG. 2 is a schematic flowchart of a limb key point detection method in an image processing method according to an embodiment of the present disclosure;
FIG. 3 is a schematic flowchart of a limb key point tracking method in an image processing method according to an embodiment of the present disclosure;
FIG. 4 is a second schematic flowchart of an image processing method according to an embodiment of the present disclosure;
FIG. 5 is a first schematic structural diagram of an image processing apparatus according to an embodiment of the present disclosure;
FIG. 6 is a second schematic structural diagram of an image processing apparatus according to an embodiment of the present disclosure;
FIG. 7 is a third schematic structural diagram of an image processing apparatus according to an embodiment of the present disclosure;
FIG. 8 is a fourth schematic structural diagram of an image processing apparatus according to an embodiment of the present disclosure;
FIG. 9 is a schematic diagram of the hardware structure of an electronic device according to an embodiment of the present disclosure.
Detailed Description
The present disclosure is described in further detail below with reference to the drawings and specific embodiments.
In the following description, specific details such as particular system structures, interfaces, and techniques are set forth for purposes of illustration rather than limitation, in order to provide a thorough understanding of the present application.
As used herein, the term "and/or" merely describes an association between associated objects and indicates that three relationships are possible; for example, "A and/or B" can mean: A alone, both A and B, or B alone. In addition, the character "/" herein generally indicates an "or" relationship between the objects it connects. Furthermore, "multiple" herein means two or more.
An embodiment of the present disclosure provides an image processing method. FIG. 1 is a first schematic flowchart of the image processing method according to an embodiment of the present disclosure; as shown in FIG. 1, the method includes:
Step 101: obtain multiple frames of images;
Step 102: perform limb key point detection on a target object in a first image of the multiple frames to obtain first key point information corresponding to a partial limb of the target object;
Step 103: determine, based on the first key point information, second key point information corresponding to the partial limb of the target object in a second image, where, among the multiple frames, the second image is an image following the first image.
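The three steps above can be sketched as a minimal tracking loop. The functions `detect_keypoints` and `track_keypoints`, standing in for the limb key point detection and tracking described in this embodiment, are hypothetical placeholders:

```python
def process_frames(frames, detect_keypoints, track_keypoints):
    """Detect partial-limb key points in the first frame, then propagate
    them to each subsequent frame instead of re-detecting from scratch."""
    keypoints = detect_keypoints(frames[0])           # first key point information
    results = [keypoints]
    for frame in frames[1:]:                          # each "second image"
        keypoints = track_keypoints(frame, keypoints) # second key point information
        results.append(keypoints)
    return results
```

The second key point information of one frame serves as the first key point information for the next, which is what makes this a tracking procedure rather than independent per-frame detection.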
The image processing method of this embodiment may be applied to an image processing apparatus. The image processing apparatus may be provided in an electronic device with processing capability, such as a personal computer or a server, or may be implemented by a processor executing a computer program.
In this embodiment, the multiple frames of images may be continuous video captured by a camera built into or connected to the electronic device, or video received from another electronic device. In some application scenarios, the multiple frames may be surveillance video captured by a surveillance camera, so that each target object in the surveillance video can be tracked. In other application scenarios, the multiple frames may be video stored locally or in another video library, so that each target object in the video can be tracked. In still other application scenarios, the image processing method of this embodiment may be applied to virtual reality (VR), augmented reality (AR), somatosensory games, and the like; the multiple frames may then be images of an operator captured in a VR or AR scene, where recognizing the operator's posture in the images allows the actions of virtual objects in the VR or AR scene to be controlled, or they may be captured images of the target objects participating in a somatosensory game (such as multiple users).
In some application scenarios, the image processing apparatus may establish communication connections with one or more surveillance cameras and obtain, in real time, the surveillance video captured by those cameras as the multiple frames of images to be processed. In other application scenarios, the image processing apparatus may obtain video from its own storage, or from video stored on other electronic devices, as the multiple frames to be processed. In still other application scenarios, the image processing apparatus may be placed in a game device; while the processor of the game device executes a computer program that realizes the game operator's interaction, the displayed output images serve as the multiple frames to be processed, and the target object in the images (corresponding to the game operator) is tracked.
In this embodiment, the multiple frames of images to be processed may include one or more target objects. In some application scenarios, a target object may be a real person; in other application scenarios, a target object may be another object determined according to actual tracking needs, such as a virtual character or another virtual object.
In this embodiment, each of the multiple frames may be called a frame image, the smallest unit that makes up the video (that is, the images to be processed). It can be understood that the multiple frames are a set of temporally continuous frame images, arranged according to their capture times, so that the time parameters of successive frame images are continuous.
For example, taking a real person as the target object, when the multiple frames include a target object, one or more target objects may appear throughout the time range covered by the multiple frames, or only within part of that time range; this is not limited in this embodiment.
In this embodiment, the first image is any one of the multiple frames, and the second image is an image following the first image; in other words, the first image is any frame among the multiple frames that precedes the second image. In some optional embodiments, the second image may be the frame immediately following the first image. For example, if the multiple frames include 10 frames and the first image is the 2nd frame, the second image is the 3rd frame. In other optional embodiments, the second image may be a frame separated from the first image by a preset number of frames. For example, if the multiple frames include 20 frames, the first image is the 2nd frame, and the preset number is 3 frames, the second image may be the 6th frame. The preset number may be set in advance according to the actual situation, for example according to the moving speed of the target object. This implementation effectively reduces the amount of data to be processed and thus the load on the image processing apparatus.
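As a concrete illustration of the frame-gap option above, the index of the second image can be computed from the index of the first image and the preset number of skipped frames (the function name is a hypothetical helper, not part of the disclosure):

```python
def next_tracked_frame(current_index, gap):
    """Index of the second image when `gap` frames lie between the first
    and the second image; gap=0 gives the immediately following frame."""
    return current_index + gap + 1
```

With the numbers from the text: frame 2 with a gap of 3 gives frame 6, and frame 2 with a gap of 0 gives frame 3.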
In this embodiment, the image processing apparatus may perform limb key point detection on the target object in the first image through a limb key point detection network to obtain the first key point information corresponding to the partial limb of the target object. In this embodiment, the partial limb of the target object includes at least one of the following: the head, neck, shoulders, chest, waist, hips, arms, and hands. Correspondingly, the first key point information and the second key point information corresponding to the partial limb include contour key point information and/or skeleton key point information of at least one of the head, neck, shoulders, chest, waist, hips, arms, and hands of the target object.
Illustratively, in this embodiment the partial limb of the target object is the upper body of the target object, so that target objects whose upper bodies appear in the multiple frames can be recognized, enabling the tracking of target objects that show only the upper body as well as those that show the whole body.
Illustratively, the key points corresponding to the first key point information and the second key point information may include: at least one key point of the head, at least one key point of the shoulders, at least one key point of the arms, at least one key point of the chest, at least one key point of the hips, and at least one key point of the waist; optionally, they may also include at least one key point of the hands. Whether the image processing apparatus can obtain hand key points depends on whether hand key points were annotated in the sample images used to train the limb key point detection network; if they were, the network can detect the hand key points.
In some optional embodiments, when the partial limb of the target object includes the head, the first key point information and the second key point information may include key point information of at least one organ, which may include at least one of the following: nose key point information, eyebrow-center key point information, and mouth key point information.
In some optional embodiments, when the partial limb of the target object includes an arm, the first key point information and the second key point information may include elbow key point information.
In some optional embodiments, when the partial limb of the target object includes a hand, the first key point information and the second key point information may include wrist key point information. Optionally, they may also include contour key point information of the hand.
In some optional embodiments, when the partial limb of the target object includes the hips, the first key point information and the second key point information may include left-hip key point information and right-hip key point information. Optionally, they may also include spine-root key point information.
The first key point information may specifically include the coordinates of key points, that is, the coordinates of contour key points and/or the coordinates of skeleton key points. It can be understood that the coordinates of the contour key points trace the contour edge of the corresponding partial limb, and the coordinates of the skeleton key points form the skeleton of the corresponding partial limb.
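Forming a skeleton from skeleton key point coordinates can be sketched as pairing up connected joints. The joint names and the bone list below are illustrative assumptions; the disclosure does not fix a particular skeleton topology:

```python
def skeleton_segments(keypoints, bones):
    """Build the partial-limb skeleton as line segments between key point
    coordinates; `keypoints` maps joint name -> (x, y), and `bones` lists
    (joint_a, joint_b) name pairs to connect."""
    return [(keypoints[a], keypoints[b]) for a, b in bones]
```

For an upper-body skeleton, `bones` might connect the neck to each shoulder and each shoulder to the corresponding elbow, with the resulting segments drawn or used for posture analysis.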
FIG. 2 is a schematic flowchart of the limb key point detection method in the image processing method according to an embodiment of the present disclosure. In some optional embodiments, as shown in FIG. 2, step 102 includes:
Step 1021: perform limb detection on the target object in the first image to determine a first region of the target object, the first region including the region where the partial limb of the target object is located;
Step 1022: perform limb key point detection on pixels corresponding to the first region to obtain the first key point information corresponding to the partial limb of the target object.
In this embodiment, limb detection is first performed on each target object in the first image to determine the first region of each target object; for example, the first region corresponding to each target object's upper body, or to each target object's whole body, may be determined. In practice, the first region corresponding to the partial limb may be represented by a detection box (for example, a rectangular box) that marks the target object; for example, the upper body of each person in the first image may be marked by a rectangular box.
In some optional embodiments, performing limb detection on the target object in the first image includes: performing limb detection on the target object in the first image using a limb detection network, where the limb detection network is trained on a first type of sample images annotated with detection boxes of target objects, the annotated range of each detection box including the region where the partial limb of the target object is located; the partial limb may be the upper body of the target object.
In this embodiment, limb detection may be performed on the first image through a pre-trained limb detection network to determine the first region of the target object, that is, to obtain the detection box of each target object in the first image. A detection box may mark part or all of a target object's limbs; in other words, the limb detection network can detect either the whole body or the upper body of a target object. The limb detection network may adopt any network structure capable of detecting the limbs of a target object, which is not limited in this embodiment.
Illustratively, taking the detection of partial-limb detection boxes through the limb detection network as an example, features may be extracted from the first image through the limb detection network; based on the extracted features, the center point of each target object's partial limb and the height and width of the corresponding detection box are determined, and the detection box of each target object's partial limb is then determined from that center point and the corresponding height and width.
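Deriving the detection box from the predicted center point and box size is a simple coordinate conversion; the sketch below assumes a corner-coordinate box convention (x0, y0, x1, y1), which the disclosure does not prescribe:

```python
def box_from_center(cx, cy, w, h):
    """Detection box (x0, y0, x1, y1) from the predicted center point
    (cx, cy) of the partial limb and the predicted box width and height."""
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)
```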
In this embodiment, the limb detection network may be trained on a first type of sample images annotated with detection boxes of target objects, the annotated range of each detection box including a partial limb of the target object. It can be understood that the first type of sample images may be annotated with detection boxes of only partial limbs (for example, the upper bodies) of target objects, or with detection boxes of complete limbs. Illustratively, taking annotated partial-limb detection boxes as an example, the limb detection network extracts feature data from a first-type sample image, determines from the feature data the predicted center point of each target object's partial limb and the height and width of the corresponding predicted detection box, and determines each partial limb's predicted detection box from the predicted center point and the corresponding height and width; a loss is then computed from the predicted detection boxes and the annotated detection boxes, and the network parameters of the limb detection network are adjusted based on the loss.
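The loss-driven parameter adjustment can be illustrated with a toy sketch. Here the "network parameters" are simply the four box coordinates, nudged toward the annotated box under an L1-style loss; real training would instead backpropagate the loss through the detection network's weights, and the disclosure does not specify the loss function:

```python
def train_box_regressor(gt_box, lr=0.1, steps=200):
    """Toy illustration of adjusting parameters to reduce the loss between a
    predicted and an annotated detection box (sign-of-gradient descent on an
    L1 loss). Returns the fitted (x0, y0, x1, y1) parameters."""
    params = [0.0, 0.0, 1.0, 1.0]  # initial "predicted" box
    for _ in range(steps):
        # Gradient of the L1 loss w.r.t. each coordinate is its error sign.
        grads = [1.0 if p > g else -1.0 if p < g else 0.0
                 for p, g in zip(params, gt_box)]
        params = [p - lr * g for p, g in zip(params, grads)]
    return params
```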
In some optional embodiments, performing limb key point detection processing on the pixel points corresponding to the first region includes: performing limb key point detection processing on the pixel points corresponding to the first region by using a limb key point detection network, where the limb key point detection network is obtained by training with second-type sample images; the second-type sample images are annotated with key points of target objects; and the annotation range of the key points includes a part of the limb of the target object.

In this embodiment, limb key point detection may be performed on the pixel points corresponding to the first region through a pre-trained limb key point detection network, to determine the first key point information of the part of the limb of each target object. Exemplarily, the first region may include a part of the limbs of the target object; the pixel points corresponding to the detection frame of each target object may be input to the limb key point detection network, to obtain the first key point information corresponding to the part of the limb of each target object. The limb key point detection network may adopt any network structure capable of detecting limb key points, which is not limited in this embodiment.

In this embodiment, the limb key point detection network may be obtained by training with second-type sample images annotated with the key points of target objects, where the annotation range of the key points includes a part of the limb of the target object. It can be understood that a second-type sample image may be annotated with the key points of only a part of the limb of the target object (for example, the upper-body limbs of the target object), or may be annotated with the key points of the complete limbs of the target object. Exemplarily, taking the case where a second-type sample image is annotated with the key points of a part of the limb of the target object as an example, the limb key point detection network may be used to extract feature data of the second-type sample images, and the predicted key points of the part of the limb of each target object in the second-type sample images are determined based on the feature data; a loss is determined based on the predicted key points and the annotated key points, and the network parameters of the limb key point detection network are adjusted based on the loss.
FIG. 3 is a schematic flowchart of a limb key point tracking method in the image processing method according to an embodiment of the present disclosure. In some optional embodiments, step 103 may be implemented as shown in FIG. 3; the method includes:

Step 1031: determining a second region in the first image based on the first key point information, where the second region is larger than the first region of the target object, and the first region includes the region where the part of the limb of the target object is located;

Step 1032: determining, according to the second region, a third region in the second image corresponding to the position range of the second region;

Step 1033: performing limb key point detection processing on the pixel points in the third region in the second image, to obtain second key point information corresponding to the part of the limb.
In this embodiment, for one target object in the first image, a region is determined based on the first key point information of the part of the limb of that target object; the region may be the smallest region containing all the key points of the part of the limb of the target object. Exemplarily, if the region is a rectangular region, the rectangular region is the smallest region containing all the key points of the part of the limb of the target object. The second region is then a region obtained by enlarging the first region within the first image.

Exemplarily, taking the case where the first region is a rectangle as an example, and assuming that the height of the first region is H and the width is W, the four sides of the region may be extended away from its center point: in the height direction, each side extends by H/4 away from the center point, and in the width direction, each side extends by W/4 away from the center point. The second region can then be represented by a rectangular region in the first image that is centered on the above center point, with a height of 3H/2 and a width of 3W/2.

In this embodiment, the third region in the second image corresponding to the position range of the second region in the first image can then be determined according to that position range.
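The enlargement described above can be sketched as follows: each edge of the first region is pushed away from the center by a quarter of the region's height or width, which yields the 3H/2 by 3W/2 rectangle in the example. The corner-based rectangle representation is an illustrative assumption:

```python
def expand_region(x_min: float, y_min: float, x_max: float, y_max: float):
    """Enlarge a rectangle about its center: each edge moves away from
    the center by H/4 (vertically) or W/4 (horizontally), so the result
    has height 3H/2 and width 3W/2."""
    h = y_max - y_min
    w = x_max - x_min
    return (x_min - w / 4, y_min - h / 4, x_max + w / 4, y_max + h / 4)

# A first region with W = 40 and H = 60:
print(expand_region(10, 20, 50, 80))
# → (0.0, 5.0, 60.0, 95.0), i.e. width 60 = 3W/2 and height 90 = 3H/2
```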
In some optional embodiments, determining, according to the second region, the third region in the second image corresponding to the position range of the second region may further include: performing limb key point detection processing on the pixel points corresponding to the second region, to obtain third key point information; and determining the position range of the third key point information in the first image, and determining, based on the position range, the third region in the second image corresponding to the position range.

Exemplarily, in this embodiment, the limb key point detection network is still used to perform limb key point detection processing on the pixel points corresponding to the second region. The pixel points corresponding to the enlarged second region in the first image may be used as input data of the limb key point detection network, and the third key point information is output; the third key point information serves as predicted key point information of the target object in the second image. That is, in the embodiments of the present application, the region where the target object is located in the previous frame of image is enlarged (for example, the region where the part of the limb of the target object is located in the previous frame of image is enlarged), limb key point detection is performed on the enlarged region, and the obtained key points are used as the predicted key points, corresponding to the target object (for example, the part of the limb of the target object), in the frame of image (i.e., the second image) following the current frame of image (i.e., the first image). Further, based on the predicted position range, limb key point detection processing is performed on the pixel points corresponding to the third region in the second image, and the detected key point information is the second key point information corresponding to the part of the limb of the target object.
In some optional embodiments, step 103 may further include: determining, according to the position range of the first key point information in the first image, a third region in the second image corresponding to the position range; and performing limb key point detection processing on the pixel points in the third region in the second image, to obtain the second key point information corresponding to the part of the limb.

In this embodiment, the third region in the second image corresponding to the position range of the first key points in the first image can be determined according to that position range. Limb key point detection processing is further performed on the pixel points corresponding to the third region in the second image, and the detected key point information is the second key point information corresponding to the part of the limb of the target object.
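Determining the position range of key point information amounts to taking the smallest axis-aligned rectangle that contains all detected key points. A minimal sketch, in which the list-of-coordinate-pairs representation is an assumption:

```python
def keypoint_bbox(keypoints):
    """Smallest axis-aligned rectangle (x_min, y_min, x_max, y_max)
    containing all key points, each given as an (x, y) pair."""
    xs = [x for x, _ in keypoints]
    ys = [y for _, y in keypoints]
    return (min(xs), min(ys), max(xs), max(ys))

# Three detected limb key points:
print(keypoint_bbox([(12, 40), (30, 22), (25, 55)]))
# → (12, 22, 30, 55)
```

The resulting rectangle defines the position range that is carried over to the second image as the third region.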
In some other optional embodiments, step 103 may further include: determining a predicted region of the target object in the second image based on the first image, the first region of the target object, and a target tracking network; and performing limb key point detection processing based on the pixel points in the predicted region in the second image, to obtain the second key point information corresponding to the part of the limb of the target object. The target tracking network is obtained by training with multiple frames of sample images; the multiple frames of sample images include at least a first sample image and a second sample image, where the second sample image is a frame of image following the first sample image; the position of the target object is annotated in the first sample image, and the position of the target object is annotated in the second sample image. Exemplarily, the detection frame of the target object is annotated in each of the multiple frames of sample images, and the position of the target object in a sample image is represented by the detection frame; the annotation range of the detection frame includes the region where the part of the limb of the target object is located; the part of the limb of the target object may be the upper-body limbs of the target object.

In this embodiment, the predicted position of the target object in the next frame of image (i.e., the second image) can be determined through a pre-trained target tracking network, using the previous frame of image (i.e., the first image) and the position of the target object in that image. Exemplarily, the first image containing the detection frame of the target object may be input to the target tracking network, to obtain the predicted position of the target object in the second image; limb key point detection processing is then performed on the pixel points at the predicted position in the second image, to obtain the second key point information of the part of the limb of the target object in the second image. The target tracking network may adopt any network structure capable of realizing target tracking, which is not limited in this embodiment.

In this embodiment, the target tracking network may be obtained by training with multiple frames of sample images annotated with the position of the target object (for example, a detection frame containing the target object, or a detection frame containing a part of the limb of the target object). Exemplarily, taking the case where the multiple frames of sample images include at least the first sample image and the second sample image as an example, the target tracking network may be used to process the first sample image, in which the position of the target object is annotated; the processing result is the predicted position of the target object in the second sample image; a loss can then be determined according to the predicted position and the annotated position of the target object in the second sample image, and the network parameters of the target tracking network are adjusted based on the loss.
It should be noted that, after the second key point information corresponding to the part of the limb of the target object in the second image is determined based on the first key point information, the key point information corresponding to the part of the limb of the target object in a subsequent image may be further determined based on the second key point information corresponding to the part of the limb of the target object in the second image, and so on, until the key point information corresponding to the part of the limb of the target object cannot be detected in the next frame of image. At this point, it can be concluded that the multiple frames of images to be processed no longer include the target object, that is, the target object has moved out of the field of view of the multiple frames of images to be processed.

In some optional embodiments, the image processing apparatus may also perform limb detection for the target object in each frame of image, to obtain the region where the target object in each frame of image is located, and take the detected target object as a tracked object, so that it can be determined whether a new target object appears in the current frame of image. In the case where a new target object appears in the current frame of image, the new target object is taken as a tracked object, and limb key point detection processing is performed on the pixel points in the first region corresponding to the new target object, that is, the processing of step 103 in the embodiments of the present disclosure is executed for the new target object. Exemplarily, the image processing apparatus may execute the limb detection processing for the target object in the image at preset time intervals or every preset number of image frames, so as to detect at regular intervals whether a new target object appears in the image and to track the new target object.
In some optional embodiments of the present disclosure, the method further includes: in response to obtaining the first key point information corresponding to the part of the limb of the target object, assigning a tracking identifier to the target object; and determining the number of target objects in the multiple frames of images based on the number of tracking identifiers assigned during the processing of the multiple frames of images.

In this embodiment, when the image processing apparatus detects the target object in the first frame of the multiple frames of images to be processed, that is, when the first key point information corresponding to the part of the limb of the target object is obtained, a tracking identifier is assigned to the target object; the tracking identifier is associated with the target object until the target object can no longer be tracked during the process of tracking the target object.

In some optional embodiments, the image processing apparatus may also perform limb detection for the target object in each frame of image, obtain the region corresponding to the part of the limb of the target object in each frame of image, and take the detected target object as a tracked object. On this basis, the image processing apparatus detects the first frame of the images to be processed and assigns a tracking identifier to the detected target object. Thereafter, the tracking identifier keeps following the target object until the target object can no longer be tracked. If a new target object is detected in a certain frame of image, a tracking identifier is assigned to the new target object, and the above scheme is repeated. It can be understood that target objects detected at the same moment correspond to different tracking identifiers; a target object tracked over a continuous time range corresponds to the same tracking identifier; and target objects detected separately in non-continuous time ranges correspond to different tracking identifiers.
For example, if three target objects are detected in a certain frame of image, a tracking identifier is assigned to each of the three target objects, with each target object corresponding to one tracking identifier.

For another example, for 5 minutes of multi-frame images, three target objects are detected within the first minute, and a tracking identifier is assigned to each of the three target objects, which may be recorded as, for example, identifier 1, identifier 2, and identifier 3. Within the second minute, the first of the three target objects disappears; during that minute there are only two target objects, whose corresponding tracking identifiers are identifier 2 and identifier 3. Within the third minute, the first target object appears in the image again, that is, a new target object is detected compared with the preceding images; although this target object is one that already appeared within the first minute (i.e., the first target object), it is still assigned identifier 4 as its tracking identifier, and so on.

On this basis, the technical solution of this embodiment can determine the number of target objects that have appeared in the multiple frames of images based on the number of tracking identifiers assigned during the processing of the multiple frames of images. Exemplarily, the number of target objects that have appeared in the multiple frames of images refers to the number of times target objects have appeared within the time range corresponding to the multiple frames of images.
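The identifier-assignment scheme described above, where every newly detected or re-appearing target receives a fresh identifier and the total count of identifiers gives the number of appearances, can be sketched as follows; the class and method names are illustrative assumptions:

```python
import itertools

class TrackingIdAllocator:
    """Assigns a fresh tracking identifier whenever a target is first
    detected or re-appears after being lost, per the scheme above."""

    def __init__(self):
        self._next_id = itertools.count(1)
        self.active = {}  # currently tracked target -> tracking identifier
        self.total = 0    # number of identifiers ever assigned

    def on_detected(self, target):
        """A target newly appears in the frame: assign a new identifier."""
        tid = next(self._next_id)
        self.active[target] = tid
        self.total += 1
        return tid

    def on_lost(self, target):
        """The target can no longer be tracked: retire its identifier."""
        self.active.pop(target, None)

# Reproducing the 5-minute example from the text:
tracker = TrackingIdAllocator()
first_minute = [tracker.on_detected(t) for t in ("A", "B", "C")]  # ids 1, 2, 3
tracker.on_lost("A")                   # second minute: target A disappears
reassigned = tracker.on_detected("A")  # third minute: A re-appears -> new id 4
print(first_minute, reassigned, tracker.total)  # [1, 2, 3] 4 4
```

Note that `total` counts appearances rather than distinct physical targets, matching the counting semantics stated above.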
With the technical solutions of the embodiments of the present disclosure, the key points of the part of the limb of the target object in the first image among the multiple frames of images to be processed are recognized, and the key points of the part of the limb of the target object in the subsequent second image are determined based on the recognized key points of the part of the limb, thereby realizing target tracking in scenes where only a part of the limb of the target object (for example, the upper body) is present in the image. That is, the technical solutions of the embodiments of the present disclosure can adapt to both complete-limb scenes and partial-limb (for example, upper-body) scenes, realizing target tracking in images.
An embodiment of the present disclosure further provides an image processing method. FIG. 4 is a second schematic flowchart of the image processing method according to an embodiment of the present disclosure; as shown in FIG. 4, the method includes:

Step 201: obtaining multiple frames of images;

Step 202: performing limb key point detection processing on a target object in a first image among the multiple frames of images, to obtain first key point information corresponding to a part of the limb of the target object;

Step 203: determining, based on the first key point information, second key point information corresponding to the part of the limb of the target object in a second image, where, among the multiple frames of images, the second image is a frame of image following the first image;

Step 204: determining the posture of the target object based on the second key point information, and determining an interaction instruction corresponding to the target object based on the posture of the target object.
For a detailed description of steps 201 to 203 of this embodiment, reference may be made to the description of steps 101 to 103, which will not be repeated here.

In this embodiment, the posture of the target object can be determined through the tracked target object and further based on the second key point information of the target object, and the interaction instruction corresponding to each posture can be determined based on the posture of the target object. Afterwards, the interaction instruction corresponding to each posture is responded to.

This embodiment is applicable to action interaction scenarios. The image processing apparatus can determine the corresponding interaction instruction based on each posture and respond to the interaction instruction. Responding to the interaction instruction may be, for example, enabling or disabling certain functions of the image processing apparatus itself or of the electronic device in which the image processing apparatus is located; alternatively, responding to the interaction instruction may also be sending the interaction instruction to another electronic device, which receives the interaction instruction and enables or disables certain functions based on it. In other words, the interaction instruction may also be used to enable or disable corresponding functions of other electronic devices.

This embodiment is also applicable to various application scenarios such as virtual reality, augmented reality, or somatosensory games. The image processing apparatus can execute corresponding processing based on various interaction instructions, including but not limited to: in a virtual reality or augmented reality scenario, controlling a virtual object to execute a corresponding action; and in a somatosensory game scenario, controlling the virtual character corresponding to the target object to execute a corresponding action. In some examples, if the method is applied to scenarios such as augmented reality or virtual reality, the corresponding processing executed by the image processing apparatus based on the interaction instruction may include controlling a virtual target object to execute, in a real scene or a virtual scene, an action corresponding to the interaction instruction.

With the technical solutions of the embodiments of the present disclosure, on the one hand, target tracking is realized in scenes where only a part of the limb of the target object (for example, the upper body) is present in the image; that is, the technical solutions of the embodiments of the present disclosure can adapt to both complete-limb scenes and partial-limb (for example, upper-body) scenes, realizing target tracking in images. On the other hand, the key point information of the tracked target object is detected during the target tracking process, the posture of the tracked target object is determined based on the key point information of the target object, and the corresponding interaction instruction is determined based on the posture of the target object, thereby realizing human-computer interaction in specific application scenarios (for example, interactive scenarios such as virtual reality scenarios, augmented reality scenarios, and somatosensory game scenarios) and improving the user's interaction experience.
An embodiment of the present disclosure further provides an image processing apparatus. FIG. 5 is a first schematic structural diagram of the image processing apparatus according to an embodiment of the present disclosure; as shown in FIG. 5, the apparatus includes: an acquisition unit 31, a detection unit 32, and a tracking determination unit 33, where:

the acquisition unit 31 is configured to obtain multiple frames of images;

the detection unit 32 is configured to perform limb key point detection processing on a target object in a first image among the multiple frames of images, to obtain first key point information corresponding to a part of the limb of the target object; and

the tracking determination unit 33 is configured to determine, based on the first key point information, second key point information corresponding to the part of the limb of the target object in a second image, where, among the multiple frames of images, the second image is a frame of image following the first image.
In some optional embodiments of the present disclosure, as shown in FIG. 6, the detection unit 32 includes: a limb detection module 321 and a limb key point detection module 322, where:

the limb detection module 321 is configured to perform limb detection processing on the target object in the first image, to determine a first region of the target object, where the first region includes the region where the part of the limb of the target object is located; and

the limb key point detection module 322 is configured to perform limb key point detection processing on the pixel points corresponding to the first region, to obtain the first key point information corresponding to the part of the limb of the target object.

In some optional embodiments of the present disclosure, the tracking determination unit 33 is configured to: determine a second region in the first image based on the first key point information, where the second region is larger than the first region of the target object, and the first region includes the region where the part of the limb of the target object is located; determine, according to the second region, a third region in the second image corresponding to the position range of the second region; and perform limb key point detection processing on the pixel points in the third region in the second image, to obtain the second key point information corresponding to the part of the limb.

In some optional embodiments of the present disclosure, the tracking determination unit 33 is configured to: determine, according to the position range of the first key point information in the first image, a third region in the second image corresponding to the position range; and perform limb key point detection processing on the pixel points in the third region in the second image, to obtain the second key point information corresponding to the part of the limb.

In some optional embodiments of the present disclosure, the limb detection module 321 is configured to perform limb detection processing on the target object in the first image by using a limb detection network, where the limb detection network is obtained by training with first-type sample images; the first-type sample images are annotated with detection frames of target objects; and the annotation range of a detection frame includes the region where the part of the limb of the target object is located.

In some optional embodiments of the present disclosure, the limb key point detection module 322 is configured to perform limb key point detection processing on the pixel points corresponding to the first region by using a limb key point detection network, where the limb key point detection network is obtained by training with second-type sample images; and the second-type sample images are annotated with key points that include the part of the limb of the target object.
在本公开的一些可选实施例中,上述目标对象的部分肢体包括以下至少之一:头部、颈部、肩部、胸部、腰部、髋部、手臂、手部;上述第一关键点信息和上述第二关键点信息包括头部、颈部、肩部、胸部、腰部、髋部、手臂和手部中的至少一个肢体的轮廓关键点信息和/骨骼关键点信息。In some optional embodiments of the present disclosure, the part of the limbs of the target object includes at least one of the following: head, neck, shoulders, chest, waist, hips, arms, hands; the above-mentioned first key point information And the above-mentioned second key point information includes contour key point information and/or bone key point information of at least one of the head, neck, shoulders, chest, waist, hips, arms, and hands.
在本公开的一些可选实施例中,如图7所示,上述装置还包括:分配单元34和统计单元35;其中,In some optional embodiments of the present disclosure, as shown in FIG. 7, the above-mentioned apparatus further includes: an allocation unit 34 and a statistics unit 35; wherein,
上述分配单元34,配置为响应于上述检测单元32获得目标对象的部分肢体对应的第一关键点信息的情况,为目标对象分配跟踪标识;The allocation unit 34 is configured to allocate a tracking identifier to the target object in response to the detection unit 32 obtaining the first key point information corresponding to a part of the limb of the target object;
上述统计单元35,配置为基于对多帧图像的处理过程中分配的跟踪标识的数量,确定多帧图像中的目标对象的数量。The aforementioned statistical unit 35 is configured to determine the number of target objects in the multi-frame image based on the number of tracking identifiers allocated during the processing of the multi-frame image.
在本公开的一些可选实施例中，如图8所示，上述装置还包括确定单元36，配置为基于第二关键点信息确定目标对象的姿态；基于目标对象的姿态确定对应于目标对象的交互指令。In some optional embodiments of the present disclosure, as shown in FIG. 8, the above apparatus further includes a determining unit 36, configured to determine the posture of the target object based on the second key point information, and determine an interaction instruction corresponding to the target object based on the posture of the target object.
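To make the posture-to-instruction mapping of the determining unit concrete, here is a deliberately simplified rule-based sketch. Image coordinates grow downward, so a hand key point above the head has a smaller y value; the pose names and instruction table are invented for illustration:

```python
def classify_pose(keypoints):
    """Toy posture rule: a hand key point above the head -> 'hands_up'."""
    head_y = keypoints["head"][1]
    hand_y = keypoints["hand"][1]
    return "hands_up" if hand_y < head_y else "neutral"

# Hypothetical mapping from a recognized posture to an interaction instruction.
INSTRUCTIONS = {"hands_up": "start_interaction", "neutral": "idle"}

pose = classify_pose({"head": (120, 60), "hand": (150, 40)})
print(INSTRUCTIONS[pose])  # → start_interaction
```

A deployed system would replace the single rule with a posture classifier over the full second key point information.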
本公开实施例中，上述图像处理装置中的获取单元31、检测单元32（包括肢体检测模块321和肢体关键点检测模块322）、跟踪确定单元33、分配单元34、统计单元35和确定单元36，在实际应用中均可由中央处理器（CPU，Central Processing Unit）、数字信号处理器（DSP，Digital Signal Processor）、微控制单元（MCU，Microcontroller Unit）或可编程门阵列（FPGA，Field-Programmable Gate Array）实现。In the embodiments of the present disclosure, the acquisition unit 31, the detection unit 32 (including the limb detection module 321 and the limb key point detection module 322), the tracking determination unit 33, the allocation unit 34, the statistics unit 35, and the determination unit 36 in the above image processing device may, in practical applications, be implemented by a Central Processing Unit (CPU), a Digital Signal Processor (DSP), a Microcontroller Unit (MCU), or a Field-Programmable Gate Array (FPGA).
需要说明的是:上述实施例提供的图像处理装置在进行图像处理时,仅以上述各程序模块的划分进行举例说明,实际应用中,可以根据需要而将上述处理分配由不同的程序模块完成,即将装置的内部结构划分成不同的程序模块,以完成以上描述的全部或者部分处理。另外,上述实施例提供的图像处理装置与图像处理方法实施例属于同一构思,其具体实现过程详见方法实施例,这里不再赘述。It should be noted that when the image processing device provided in the foregoing embodiment performs image processing, only the division of the foregoing program modules is used as an example for illustration. In actual applications, the foregoing processing can be allocated to different program modules as needed. That is, the internal structure of the device is divided into different program modules to complete all or part of the processing described above. In addition, the image processing device provided in the foregoing embodiment and the image processing method embodiment belong to the same concept, and the specific implementation process is detailed in the method embodiment, which will not be repeated here.
本公开实施例还提供了一种电子设备。图9为本公开实施例的电子设备的硬件组成结构示意图；如图9所示，电子设备40可包括存储器42、处理器41及存储在存储器42上并可在处理器41上运行的计算机程序，上述处理器41执行上述程序时实现本公开实施例上述图像处理方法的步骤。The embodiments of the present disclosure further provide an electronic device. FIG. 9 is a schematic diagram of the hardware composition structure of an electronic device according to an embodiment of the present disclosure; as shown in FIG. 9, the electronic device 40 may include a memory 42, a processor 41, and a computer program stored on the memory 42 and executable on the processor 41. When the processor 41 executes the program, the steps of the image processing method of the embodiments of the present disclosure are implemented.
可以理解,电子设备40中的各个组件可通过总线系统43耦合在一起。可理解,总线系统43用于实现这些组件之间的连接通信。总线系统43除包括数据总线之外,还包括电源总线、控制总线和状态信号总线。但是为了清楚说明起见,在图9中将各种总线都标为总线系统43。It can be understood that various components in the electronic device 40 may be coupled together through the bus system 43. It can be understood that the bus system 43 is used to implement connection and communication between these components. In addition to the data bus, the bus system 43 also includes a power bus, a control bus, and a status signal bus. However, for the sake of clear description, various buses are marked as the bus system 43 in FIG. 9.
可以理解，存储器42可以是易失性存储器或非易失性存储器，也可包括易失性和非易失性存储器两者。其中，非易失性存储器可以是只读存储器（ROM，Read Only Memory）、可编程只读存储器（PROM，Programmable Read-Only Memory）、可擦除可编程只读存储器（EPROM，Erasable Programmable Read-Only Memory）、电可擦除可编程只读存储器（EEPROM，Electrically Erasable Programmable Read-Only Memory）、磁性随机存取存储器（FRAM，ferromagnetic random access memory）、快闪存储器（Flash Memory）、磁表面存储器、光盘、或只读光盘（CD-ROM，Compact Disc Read-Only Memory）；磁表面存储器可以是磁盘存储器或磁带存储器。易失性存储器可以是随机存取存储器（RAM，Random Access Memory），其用作外部高速缓存。通过示例性但不是限制性说明，许多形式的RAM可用，例如静态随机存取存储器（SRAM，Static Random Access Memory）、同步静态随机存取存储器（SSRAM，Synchronous Static Random Access Memory）、动态随机存取存储器（DRAM，Dynamic Random Access Memory）、同步动态随机存取存储器（SDRAM，Synchronous Dynamic Random Access Memory）、双倍数据速率同步动态随机存取存储器（DDRSDRAM，Double Data Rate Synchronous Dynamic Random Access Memory）、增强型同步动态随机存取存储器（ESDRAM，Enhanced Synchronous Dynamic Random Access Memory）、同步连接动态随机存取存储器（SLDRAM，SyncLink Dynamic Random Access Memory）、直接内存总线随机存取存储器（DRRAM，Direct Rambus Random Access Memory）。本公开实施例描述的存储器42旨在包括但不限于这些和任意其它适合类型的存储器。It can be understood that the memory 42 may be a volatile memory or a non-volatile memory, and may also include both volatile and non-volatile memories. The non-volatile memory may be a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), a ferromagnetic random access memory (FRAM), a Flash Memory, a magnetic surface memory, an optical disc, or a Compact Disc Read-Only Memory (CD-ROM); the magnetic surface memory may be a magnetic disk memory or a magnetic tape memory. The volatile memory may be a Random Access Memory (RAM), which serves as an external cache. By way of example but not limitation, many forms of RAM are available, such as Static Random Access Memory (SRAM), Synchronous Static Random Access Memory (SSRAM), Dynamic Random Access Memory (DRAM), Synchronous Dynamic Random Access Memory (SDRAM), Double Data Rate Synchronous Dynamic Random Access Memory (DDRSDRAM), Enhanced Synchronous Dynamic Random Access Memory (ESDRAM), SyncLink Dynamic Random Access Memory (SLDRAM), and Direct Rambus Random Access Memory (DRRAM). The memory 42 described in the embodiments of the present disclosure is intended to include, but is not limited to, these and any other suitable types of memory.
上述本公开实施例揭示的方法可以应用于处理器41中,或者由处理器41实现。处理器41可能是一种集成电路芯片,具有信号的处理能力。在实现过程中,上述方法的各步骤可以通过处理器41中的硬件的集成逻辑电路或者软件形式的指令完成。上述的处理器41可以是通用处理器、DSP,或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件等。处理器41可以实现或者执行本公开实施例中的公开的各方法、步骤及逻辑框图。通用处理器可以是微处理器或者任何常规的处理器等。结合本公开实施例所公开的方法的步骤,可以直接体现为硬件译码处理器执行完成,或者用译码处理器中的硬件及软件模块组合执行完成。软件模块可以位于存储介质中,该存储介质位于存储器42,处理器41读取存储器42中的信息,结合其硬件完成前述方法的步骤。The methods disclosed in the foregoing embodiments of the present disclosure may be applied to the processor 41 or implemented by the processor 41. The processor 41 may be an integrated circuit chip with signal processing capabilities. In the implementation process, the steps of the foregoing method can be completed by an integrated logic circuit of hardware in the processor 41 or instructions in the form of software. The aforementioned processor 41 may be a general-purpose processor, a DSP, or other programmable logic devices, discrete gates or transistor logic devices, discrete hardware components, and the like. The processor 41 may implement or execute various methods, steps, and logical block diagrams disclosed in the embodiments of the present disclosure. The general-purpose processor may be a microprocessor or any conventional processor or the like. The steps of the method disclosed in the embodiments of the present disclosure may be directly embodied as being executed and completed by a hardware decoding processor, or executed and completed by a combination of hardware and software modules in the decoding processor. The software module may be located in a storage medium, and the storage medium is located in the memory 42. The processor 41 reads the information in the memory 42 and completes the steps of the foregoing method in combination with its hardware.
在示例性实施例中，电子设备40可以被一个或多个应用专用集成电路（ASIC，Application Specific Integrated Circuit）、DSP、可编程逻辑器件（PLD，Programmable Logic Device）、复杂可编程逻辑器件（CPLD，Complex Programmable Logic Device）、FPGA、通用处理器、控制器、MCU、微处理器（Microprocessor）、或其他电子元件实现，用于执行前述方法。In an exemplary embodiment, the electronic device 40 may be implemented by one or more Application Specific Integrated Circuits (ASICs), DSPs, Programmable Logic Devices (PLDs), Complex Programmable Logic Devices (CPLDs), FPGAs, general-purpose processors, controllers, MCUs, microprocessors, or other electronic components, for executing the foregoing method.
在示例性实施例中，本公开实施例还提供了一种计算机可读存储介质，例如包括计算机程序的存储器42，上述计算机程序可由电子设备40的处理器41执行，以完成前述方法所述步骤。计算机可读存储介质可以是FRAM、ROM、PROM、EPROM、EEPROM、Flash Memory、磁表面存储器、光盘、或CD-ROM等存储器；也可以是包括上述存储器之一或任意组合的各种设备，如移动电话、计算机、平板设备、个人数字助理等。In an exemplary embodiment, the embodiments of the present disclosure further provide a computer-readable storage medium, for example, a memory 42 including a computer program, which can be executed by the processor 41 of the electronic device 40 to complete the steps of the foregoing method. The computer-readable storage medium may be a memory such as FRAM, ROM, PROM, EPROM, EEPROM, Flash Memory, magnetic surface memory, optical disc, or CD-ROM; it may also be any device including one or any combination of the above memories, such as a mobile phone, a computer, a tablet device, or a personal digital assistant.
本公开实施例还提供了一种计算机可读存储介质,其上存储有计算机程序,该程序被处理器执行时实现本公开实施例所述图像处理方法的步骤。The embodiment of the present disclosure also provides a computer-readable storage medium on which a computer program is stored, and when the program is executed by a processor, the steps of the image processing method according to the embodiment of the present disclosure are realized.
本公开实施例还提供了一种计算机程序,所述计算机程序使得计算机执行本公开实施例所述的图像处理方法的步骤。The embodiment of the present disclosure also provides a computer program that enables a computer to execute the steps of the image processing method described in the embodiment of the present disclosure.
本申请所提供的几个方法实施例中所揭露的方法,在不冲突的情况下可以任意组合,得到新的方法实施例。The methods disclosed in the several method embodiments provided in this application can be combined arbitrarily without conflict to obtain new method embodiments.
本申请所提供的几个产品实施例中所揭露的特征,在不冲突的情况下可以任意组合,得到新的产品实施例。The features disclosed in the several product embodiments provided in this application can be combined arbitrarily without conflict to obtain new product embodiments.
本申请所提供的几个方法或设备实施例中所揭露的特征,在不冲突的情况下可以任意组合,得到新的方法实施例或设备实施例。The features disclosed in the several method or device embodiments provided in this application can be combined arbitrarily without conflict to obtain a new method embodiment or device embodiment.
在本申请所提供的几个实施例中，应该理解到，所揭露的设备和方法，可以通过其它的方式实现。以上所描述的设备实施例仅仅是示意性的，例如，所述单元的划分，仅仅为一种逻辑功能划分，实际实现时可以有另外的划分方式，如：多个单元或组件可以结合，或可以集成到另一个系统，或一些特征可以忽略，或不执行。另外，所显示或讨论的各组成部分相互之间的耦合、或直接耦合、或通信连接可以是通过一些接口，设备或单元的间接耦合或通信连接，可以是电性的、机械的或其它形式的。In the several embodiments provided in this application, it should be understood that the disclosed device and method may be implemented in other ways. The device embodiments described above are only illustrative; for example, the division of the units is only a logical function division, and there may be other divisions in actual implementation, e.g., multiple units or components may be combined, or may be integrated into another system, or some features may be omitted or not executed. In addition, the coupling, direct coupling, or communication connection between the components shown or discussed may be indirect coupling or communication connection through some interfaces, devices or units, and may be electrical, mechanical, or in other forms.
上述作为分离部件说明的单元可以是、或也可以不是物理上分开的，作为单元显示的部件可以是、或也可以不是物理单元，即可以位于一个地方，也可以分布到多个网络单元上；可以根据实际的需要选择其中的部分或全部单元来实现本实施例方案的目的。The units described above as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units, that is, they may be located in one place or distributed over multiple network units; some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
另外,在本公开各实施例中的各功能单元可以全部集成在一个处理单元中,也可以是各单元分别单独作为一个单元,也可以两个或两个以上单元集成在一个单元中;上述集成的单元既可以采用硬件的形式实现,也可以采用硬件加软件功能单元的形式实现。In addition, the functional units in the embodiments of the present disclosure can be all integrated into one processing unit, or each unit can be individually used as a unit, or two or more units can be integrated into one unit; The unit can be implemented in the form of hardware, or in the form of hardware plus software functional units.
本领域普通技术人员可以理解：实现上述方法实施例的全部或部分步骤可以通过程序指令相关的硬件来完成，前述的程序可以存储于一计算机可读取存储介质中，该程序在执行时，执行包括上述方法实施例的步骤；而前述的存储介质包括：移动存储设备、ROM、RAM、磁碟或者光盘等各种可以存储程序代码的介质。A person of ordinary skill in the art can understand that all or part of the steps of the above method embodiments can be implemented by hardware related to program instructions. The foregoing program can be stored in a computer-readable storage medium; when the program is executed, the steps of the foregoing method embodiments are performed. The foregoing storage medium includes various media capable of storing program code, such as a removable storage device, a ROM, a RAM, a magnetic disk, or an optical disc.
或者,本公开上述集成的单元如果以软件功能模块的形式实现并作为独立的产品销售或使用时,也可以存储在一个计算机可读取存储介质中。基于这样的理解,本公开实施例的技术方案本质上或者说对现有技术做出贡献的部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可以是个人计算机、服务器、或者网络设备等)执行本公开各个实施例所述方法的全部或部分。而前述的存储介质包括:移动存储设备、ROM、RAM、磁碟或者光盘等各种可以存储程序代码的介质。Alternatively, if the aforementioned integrated unit of the present disclosure is implemented in the form of a software function module and sold or used as an independent product, it may also be stored in a computer readable storage medium. Based on this understanding, the technical solutions of the embodiments of the present disclosure can be embodied in the form of a software product in essence or a part that contributes to the prior art. The computer software product is stored in a storage medium and includes several instructions for A computer device (which may be a personal computer, a server, or a network device, etc.) executes all or part of the methods described in the various embodiments of the present disclosure. The aforementioned storage media include: removable storage devices, ROM, RAM, magnetic disks, or optical disks and other media that can store program codes.
以上所述，仅为本公开的具体实施方式，但本公开的保护范围并不局限于此，任何熟悉本技术领域的技术人员在本公开揭露的技术范围内，可轻易想到变化或替换，都应涵盖在本公开的保护范围之内。因此，本公开的保护范围应以所述权利要求的保护范围为准。The above are only specific implementations of the present disclosure, but the protection scope of the present disclosure is not limited thereto. Any person skilled in the art can readily conceive of changes or substitutions within the technical scope disclosed in the present disclosure, and such changes or substitutions shall all fall within the protection scope of the present disclosure. Therefore, the protection scope of the present disclosure shall be subject to the protection scope of the claims.

Claims (21)

  1. 一种图像处理方法,所述方法包括:An image processing method, the method includes:
    获得多帧图像;Obtain multiple frames of images;
    对所述多帧图像中的第一图像中的目标对象进行肢体关键点检测处理,获得所述目标对象的部分肢体对应的第一关键点信息;Performing limb key point detection processing on the target object in the first image in the multi-frame image to obtain first key point information corresponding to part of the limb of the target object;
    基于所述第一关键点信息确定第二图像中的所述目标对象的所述部分肢体对应的第二关键点信息；其中，在所述多帧图像中，所述第二图像为所述第一图像后的一帧图像。Determining second key point information corresponding to the part of the limb of the target object in a second image based on the first key point information; wherein, in the multi-frame images, the second image is a frame of image after the first image.
  2. 根据权利要求1所述的方法，其中，所述对所述多帧图像中的第一图像中的目标对象进行肢体关键点检测处理，获得所述目标对象的部分肢体对应的第一关键点信息，包括：The method according to claim 1, wherein the performing limb key point detection processing on the target object in the first image of the multi-frame images to obtain the first key point information corresponding to the part of the limb of the target object comprises:
    对所述第一图像中的所述目标对象进行肢体检测处理,确定所述目标对象的第一区域;所述第一区域包括所述目标对象的部分肢体所在区域;Performing limb detection processing on the target object in the first image to determine a first area of the target object; the first area includes the area where part of the limb of the target object is located;
    对所述第一区域对应的像素点进行肢体关键点检测处理,获得所述目标对象的所述部分肢体对应的第一关键点信息。Perform limb key point detection processing on the pixel points corresponding to the first region to obtain first key point information corresponding to the part of the limb of the target object.
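The two-stage flow of claim 2 — detect a first area containing the partial limb, then run key point detection only on that area's pixels — can be sketched with stubbed networks. Real implementations would be trained models; every function and value below is an illustrative placeholder:

```python
def detect_first_region(image):
    """Stub limb-detection network: returns one detection box (x0, y0, x1, y1)
    covering the partial limb of the target. A trained detector would run here."""
    return (50, 20, 200, 260)

def detect_limb_keypoints(image, region):
    """Stub key point network: runs only on pixels inside `region` and maps
    crop-local coordinates back into full-image coordinates."""
    x0, y0, _, _ = region
    crop_local = [(30, 40), (90, 45)]           # e.g. left/right shoulder
    return [(x0 + u, y0 + v) for u, v in crop_local]

image = object()                                # placeholder for frame data
first_region = detect_first_region(image)
first_keypoints = detect_limb_keypoints(image, first_region)
print(first_keypoints)  # → [(80, 60), (140, 65)]
```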
  3. 根据权利要求1所述的方法,其中,所述基于所述第一关键点信息确定第二图像中的所述目标对象的所述部分肢体对应的第二关键点信息,包括:The method according to claim 1, wherein the determining the second key point information corresponding to the part of the limb of the target object in the second image based on the first key point information comprises:
    基于所述第一关键点信息在所述第一图像中确定第二区域；所述第二区域大于所述目标对象的第一区域；所述第一区域包括所述目标对象的部分肢体所在区域；Determining a second area in the first image based on the first key point information; wherein the second area is larger than a first area of the target object, and the first area comprises the area where the part of the limb of the target object is located;
    根据所述第二区域,确定所述第二图像中与所述第二区域的位置范围对应的第三区域;Determine, according to the second area, a third area in the second image corresponding to the position range of the second area;
    对所述第二图像中的所述第三区域内的像素点进行肢体关键点检测处理,获得所述部分肢体对应的第二关键点信息。Performing limb key point detection processing on pixels in the third area in the second image to obtain second key point information corresponding to the part of the limb.
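The region propagation in claim 3 amounts to simple box geometry: take the bounding box of the first key points (first area), enlarge it (second area), and reuse that coordinate range in the next frame (third area) for key point detection. A sketch under those assumptions, with an invented expansion ratio:

```python
def keypoint_bbox(keypoints):
    """Axis-aligned bounding box (x0, y0, x1, y1) of a list of (x, y) points."""
    xs = [x for x, _ in keypoints]
    ys = [y for _, y in keypoints]
    return min(xs), min(ys), max(xs), max(ys)

def expand_region(box, ratio, img_w, img_h):
    """Second area: enlarge the first area by `ratio`, clipped to the image."""
    x0, y0, x1, y1 = box
    dx = (x1 - x0) * ratio / 2
    dy = (y1 - y0) * ratio / 2
    return (max(0, x0 - dx), max(0, y0 - dy),
            min(img_w, x1 + dx), min(img_h, y1 + dy))

kps = [(100, 80), (140, 90), (120, 160)]      # first key points in frame t
first = keypoint_bbox(kps)                    # first area
second = expand_region(first, 0.5, 640, 480)  # enlarged second area
# The third area is this same coordinate range applied to frame t+1, where a
# key point network would then run on the cropped pixels.
```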
  4. 根据权利要求1所述的方法,其中,所述基于所述第一关键点信息确定第二图像中的所述目标对象的所述部分肢体对应的第二关键点信息,包括:The method according to claim 1, wherein the determining the second key point information corresponding to the part of the limb of the target object in the second image based on the first key point information comprises:
    根据所述第一关键点信息在所述第一图像中的位置范围,确定所述第二图像中、与所述位置范围对应的第三区域;Determine a third area in the second image corresponding to the position range according to the position range of the first key point information in the first image;
    对所述第二图像中的所述第三区域内的像素点进行肢体关键点检测处理,获得所述部分肢体对应的第二关键点信息。Performing limb key point detection processing on pixels in the third area in the second image to obtain second key point information corresponding to the part of the limb.
  5. 根据权利要求2所述的方法,其中,所述对所述第一图像中的所述目标对象进行肢体检测处理,包括:The method according to claim 2, wherein the performing limb detection processing on the target object in the first image comprises:
    利用肢体检测网络对所述第一图像中的所述目标对象进行肢体检测处理;Performing a limb detection process on the target object in the first image by using a limb detection network;
    其中,所述肢体检测网络采用第一类样本图像训练得到;所述第一类样本图像中标 注有目标对象的检测框;所述检测框的标注范围包括所述目标对象的部分肢体所在区域。Wherein, the limb detection network is obtained by training using a first type of sample image; the first type of sample image is marked with a detection frame of the target object; the label range of the detection frame includes the area where part of the limb of the target object is located.
  6. 根据权利要求2所述的方法,其中,所述对所述第一区域对应的像素点进行肢体关键点检测处理,包括:The method according to claim 2, wherein said performing limb key point detection processing on pixels corresponding to said first area comprises:
    利用肢体关键点检测网络对所述第一区域对应的像素点进行肢体关键点检测处理;Using a limb key point detection network to perform limb key point detection processing on pixels corresponding to the first region;
    其中，所述肢体关键点检测网络采用第二类样本图像训练得到；所述第二类样本图像中标注有包括所述目标对象的部分肢体的关键点。Wherein, the limb key point detection network is trained with a second type of sample images; the second type of sample images are annotated with key points of the part of the limb of the target object.
  7. 根据权利要求1至6任一项所述的方法,其中,所述目标对象的部分肢体包括以下至少之一:头部、颈部、肩部、胸部、腰部、髋部、手臂、手部;The method according to any one of claims 1 to 6, wherein part of the limbs of the target object includes at least one of the following: head, neck, shoulders, chest, waist, hips, arms, hands;
    所述第一关键点信息和所述第二关键点信息包括头部、颈部、肩部、胸部、腰部、髋部、手臂和手部中的至少一个肢体的轮廓关键点信息和/或骨骼关键点信息。The first key point information and the second key point information comprise contour key point information and/or bone key point information of at least one limb among the head, neck, shoulders, chest, waist, hips, arms, and hands.
  8. 根据权利要求1至7任一项所述的方法,其中,所述方法还包括:The method according to any one of claims 1 to 7, wherein the method further comprises:
    响应于获得所述目标对象的部分肢体对应的第一关键点信息的情况,为所述目标对象分配跟踪标识;In response to obtaining the first key point information corresponding to a part of the limb of the target object, assign a tracking identifier to the target object;
    基于对所述多帧图像的处理过程中分配的所述跟踪标识的数量,确定所述多帧图像中的目标对象的数量。Determine the number of target objects in the multi-frame image based on the number of the tracking identifiers allocated during the processing of the multi-frame image.
  9. 根据权利要求1至8任一项所述的方法,其中,所述方法还包括:The method according to any one of claims 1 to 8, wherein the method further comprises:
    基于所述第二关键点信息确定所述目标对象的姿态;Determining the posture of the target object based on the second key point information;
    基于所述目标对象的姿态确定对应于所述目标对象的交互指令。An interaction instruction corresponding to the target object is determined based on the posture of the target object.
  10. 一种图像处理装置,所述装置包括:获取单元、检测单元和跟踪确定单元;其中,An image processing device, the device comprising: an acquisition unit, a detection unit, and a tracking determination unit; wherein,
    所述获取单元,配置为获得多帧图像;The acquiring unit is configured to acquire multiple frames of images;
    所述检测单元,配置为对所述多帧图像中的第一图像中的目标对象进行肢体关键点检测处理,获得所述目标对象的部分肢体对应的第一关键点信息;The detection unit is configured to perform limb key point detection processing on the target object in the first image in the multi-frame image to obtain first key point information corresponding to a part of the limb of the target object;
    所述跟踪确定单元,配置为基于所述第一关键点信息确定第二图像中的所述目标对象的所述部分肢体对应的第二关键点信息;其中,在所述多帧图像中,所述第二图像为所述第一图像后的一帧图像。The tracking determination unit is configured to determine second key point information corresponding to the part of the limb of the target object in the second image based on the first key point information; wherein, in the multi-frame image, The second image is an image after the first image.
  11. 根据权利要求10所述的装置,其中,所述检测单元包括:肢体检测模块和肢体关键点检测模块;其中,The device according to claim 10, wherein the detection unit comprises: a limb detection module and a limb key point detection module; wherein,
    所述肢体检测模块,配置为对所述第一图像中的所述目标对象进行肢体检测处理,确定所述目标对象的第一区域;所述第一区域包括所述目标对象的部分肢体所在区域;The limb detection module is configured to perform limb detection processing on the target object in the first image to determine a first area of the target object; the first area includes the area where part of the limb of the target object is located ;
    所述肢体关键点检测模块,配置为对所述第一区域对应的像素点进行肢体关键点检测处理,获得所述目标对象的所述部分肢体对应的第一关键点信息。The limb key point detection module is configured to perform limb key point detection processing on the pixel points corresponding to the first area to obtain first key point information corresponding to the part of the limb of the target object.
  12. 根据权利要求10所述的装置，其中，所述跟踪确定单元，配置为基于所述第一关键点信息在所述第一图像中确定第二区域；所述第二区域大于所述目标对象的第一区域；所述第一区域包括所述目标对象的部分肢体所在区域；根据所述第二区域，确定所述第二图像中与所述第二区域的位置范围对应的第三区域；对所述第二图像中的所述第三区域内的像素点进行肢体关键点检测处理，获得所述部分肢体对应的第二关键点信息。The device according to claim 10, wherein the tracking determination unit is configured to: determine a second area in the first image based on the first key point information, the second area being larger than a first area of the target object, and the first area comprising the area where the part of the limb of the target object is located; determine, according to the second area, a third area in the second image corresponding to the position range of the second area; and perform limb key point detection processing on pixels in the third area in the second image to obtain the second key point information corresponding to the part of the limb.
  13. 根据权利要求10所述的装置，其中，所述跟踪确定单元，配置为根据所述第一关键点信息在所述第一图像中的位置范围，确定所述第二图像中、与所述位置范围对应的第三区域；对所述第二图像中的所述第三区域内的像素点进行肢体关键点检测处理，获得所述部分肢体对应的第二关键点信息。The device according to claim 10, wherein the tracking determination unit is configured to: determine, according to the position range of the first key point information in the first image, a third area in the second image corresponding to the position range; and perform limb key point detection processing on pixels in the third area in the second image to obtain the second key point information corresponding to the part of the limb.
  14. 根据权利要求11所述的装置,其中,所述肢体检测模块,配置为利用肢体检测网络对所述第一图像中的所述目标对象进行肢体检测处理;The device according to claim 11, wherein the limb detection module is configured to perform limb detection processing on the target object in the first image by using a limb detection network;
    其中,所述肢体检测网络采用第一类样本图像训练得到;所述第一类样本图像中标注有目标对象的检测框;所述检测框的标注范围包括所述目标对象的部分肢体所在区域。Wherein, the limb detection network is trained using a first type of sample image; the first type of sample image is marked with a detection frame of the target object; the marking range of the detection frame includes the area where part of the limb of the target object is located.
  15. 根据权利要求11所述的装置,其中,所述肢体关键点检测模块,配置为利用肢体关键点检测网络对所述第一区域对应的像素点进行肢体关键点检测处理;The device according to claim 11, wherein the limb key point detection module is configured to use a limb key point detection network to perform limb key point detection processing on pixels corresponding to the first region;
    其中，所述肢体关键点检测网络采用第二类样本图像训练得到；所述第二类样本图像中标注有包括所述目标对象的部分肢体的关键点。Wherein, the limb key point detection network is trained with a second type of sample images; the second type of sample images are annotated with key points of the part of the limb of the target object.
  16. 根据权利要求10至15任一项所述的装置,其中,所述目标对象的部分肢体包括以下至少之一:头部、颈部、肩部、胸部、腰部、髋部、手臂、手部;The device according to any one of claims 10 to 15, wherein part of the limbs of the target object includes at least one of the following: head, neck, shoulders, chest, waist, hips, arms, hands;
    所述第一关键点信息和所述第二关键点信息包括头部、颈部、肩部、胸部、腰部、髋部、手臂和手部中的至少一个肢体的轮廓关键点信息和/或骨骼关键点信息。The first key point information and the second key point information comprise contour key point information and/or bone key point information of at least one limb among the head, neck, shoulders, chest, waist, hips, arms, and hands.
  17. 根据权利要求10至16任一项所述的装置,其中,所述装置还包括分配单元和统计单元;其中,The device according to any one of claims 10 to 16, wherein the device further comprises an allocation unit and a statistics unit; wherein,
    所述分配单元,配置为响应于所述检测单元获得所述目标对象的部分肢体对应的第一关键点信息的情况,为所述目标对象分配跟踪标识;The allocation unit is configured to allocate a tracking identifier to the target object in response to the detection unit obtaining the first key point information corresponding to a part of the limb of the target object;
    所述统计单元,配置为基于对所述多帧图像的处理过程中分配的所述跟踪标识的数量,确定所述多帧图像中的目标对象的数量。The statistical unit is configured to determine the number of target objects in the multi-frame image based on the number of the tracking identifiers allocated during the processing of the multi-frame image.
  18. 根据权利要求10至17任一项所述的装置，其中，所述装置还包括确定单元，配置为基于所述第二关键点信息确定所述目标对象的姿态；基于所述目标对象的姿态确定对应于所述目标对象的交互指令。The device according to any one of claims 10 to 17, wherein the device further comprises a determining unit configured to determine the posture of the target object based on the second key point information, and determine an interaction instruction corresponding to the target object based on the posture of the target object.
  19. 一种计算机可读存储介质,其上存储有计算机程序,该程序被处理器执行时实现权利要求1至9任一项所述方法的步骤。A computer-readable storage medium with a computer program stored thereon, which, when executed by a processor, implements the steps of the method described in any one of claims 1 to 9.
  20. 一种电子设备,包括存储器、处理器及存储在存储器上并可在处理器上运行的 计算机程序,所述处理器执行所述程序时实现权利要求1至9任一项所述方法的步骤。An electronic device comprising a memory, a processor, and a computer program stored on the memory and capable of running on the processor. The processor implements the steps of the method according to any one of claims 1 to 9 when the processor executes the program.
  21. 一种计算机程序,所述计算机程序使得计算机执行如权利要求1至9任一项所述的图像处理方法。A computer program that causes a computer to execute the image processing method according to any one of claims 1 to 9.
PCT/CN2021/076504 2020-04-29 2021-02-10 Image processing method and apparatus, electronic device and storage medium WO2021218293A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2021565760A JP2022534666A (en) 2020-04-29 2021-02-10 Image processing method, device, electronic device and storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010357593.2 2020-04-29
CN202010357593.2A CN111539992A (en) 2020-04-29 2020-04-29 Image processing method, image processing device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
WO2021218293A1 true WO2021218293A1 (en) 2021-11-04

Family

ID=71975386

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/076504 WO2021218293A1 (en) 2020-04-29 2021-02-10 Image processing method and apparatus, electronic device and storage medium

Country Status (4)

Country Link
JP (1) JP2022534666A (en)
CN (1) CN111539992A (en)
TW (1) TW202141340A (en)
WO (1) WO2021218293A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115337607A (en) * 2022-10-14 2022-11-15 佛山科学技术学院 Upper limb movement rehabilitation training method based on computer vision

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111539992A (en) * 2020-04-29 2020-08-14 北京市商汤科技开发有限公司 Image processing method, image processing device, electronic equipment and storage medium
CN112016514A (en) * 2020-09-09 2020-12-01 平安科技(深圳)有限公司 Traffic sign identification method, device, equipment and storage medium
CN112785573A (en) * 2021-01-22 2021-05-11 上海商汤智能科技有限公司 Image processing method and related device and equipment
CN112818908A (en) * 2021-02-22 2021-05-18 Oppo广东移动通信有限公司 Key point detection method, device, terminal and storage medium
CN113192127B (en) * 2021-05-12 2024-01-02 北京市商汤科技开发有限公司 Image processing method, device, electronic equipment and storage medium
CN113469017A (en) * 2021-06-29 2021-10-01 北京市商汤科技开发有限公司 Image processing method and device and electronic equipment

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170148179A1 (en) * 2014-10-31 2017-05-25 Fyusion, Inc. System and method for infinite smoothing of image sequences
CN108062526A (en) * 2017-12-15 2018-05-22 厦门美图之家科技有限公司 A kind of estimation method of human posture and mobile terminal
CN108230357A (en) * 2017-10-25 2018-06-29 北京市商汤科技开发有限公司 Critical point detection method, apparatus, storage medium, computer program and electronic equipment
CN108986137A (en) * 2017-11-30 2018-12-11 成都通甲优博科技有限责任公司 Human body tracing method, device and equipment
CN109685797A (en) * 2018-12-25 2019-04-26 北京旷视科技有限公司 Bone point detecting method, device, processing equipment and storage medium
CN110139115A (en) * 2019-04-30 2019-08-16 广州虎牙信息科技有限公司 Virtual image attitude control method, device and electronic equipment based on key point
CN111539992A (en) * 2020-04-29 2020-08-14 北京市商汤科技开发有限公司 Image processing method, image processing device, electronic equipment and storage medium

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5525407B2 (en) * 2010-10-12 2014-06-18 日本電信電話株式会社 Behavior model learning device, three-dimensional posture estimation device, behavior model learning method, three-dimensional posture estimation method, and program
CN109918975B (en) * 2017-12-13 2022-10-21 腾讯科技(深圳)有限公司 Augmented reality processing method, object identification method and terminal
CN108062536B (en) * 2017-12-29 2020-07-24 纳恩博(北京)科技有限公司 Detection method and device and computer storage medium

Cited By (1)

Publication number Priority date Publication date Assignee Title
CN115337607A (en) * 2022-10-14 2022-11-15 佛山科学技术学院 Upper limb movement rehabilitation training method based on computer vision

Also Published As

Publication number Publication date
CN111539992A (en) 2020-08-14
JP2022534666A (en) 2022-08-03
TW202141340A (en) 2021-11-01

Similar Documents

Publication Publication Date Title
WO2021218293A1 (en) Image processing method and apparatus, electronic device and storage medium
US10832039B2 (en) Facial expression detection method, device and system, facial expression driving method, device and system, and storage medium
WO2021129064A1 (en) Posture acquisition method and device, and key point coordinate positioning model training method and device
JP7229174B2 (en) Person identification system and method
CN110874594B (en) Human body appearance damage detection method and related equipment based on semantic segmentation network
WO2017190646A1 (en) Facial image processing method and apparatus and storage medium
CN109952594B (en) Image processing method, device, terminal and storage medium
EP3210164B1 (en) Facial skin mask generation
CN105612533B (en) Living body detection method, living body detection system, and computer program product
Gorodnichy et al. Nouse 'use your nose as a mouse' perceptual vision technology for hands-free games and interfaces
US20200394392A1 (en) Method and apparatus for detecting face image
WO2019019828A1 (en) Target object occlusion detection method and apparatus, electronic device and storage medium
US11176355B2 (en) Facial image processing method and apparatus, electronic device and computer readable storage medium
CN109299658B (en) Face detection method, face image rendering device and storage medium
CN104240277A (en) Augmented reality interaction method and system based on human face detection
CN110781770B (en) Living body detection method, device and equipment based on face recognition
CN109035415B (en) Virtual model processing method, device, equipment and computer readable storage medium
WO2022174594A1 (en) Multi-camera-based bare hand tracking and display method and system, and apparatus
WO2022267653A1 (en) Image processing method, electronic device, and computer readable storage medium
CN112949418A (en) Method and device for determining speaking object, electronic equipment and storage medium
CN113284041B (en) Image processing method, device and equipment and computer storage medium
US20230351615A1 (en) Object identifications in images or videos
Teng et al. Facial expressions recognition based on convolutional neural networks for mobile virtual reality
CN109241942B (en) Image processing method and device, face recognition equipment and storage medium
CN113192127B (en) Image processing method, device, electronic equipment and storage medium

Legal Events

Date Code Title Description
ENP Entry into the national phase

Ref document number: 2021565760

Country of ref document: JP

Kind code of ref document: A

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21795741

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21795741

Country of ref document: EP

Kind code of ref document: A1