WO2019192205A1 - 图像中肢体表示信息的识别方法、装置、设备以及计算机可读存储介质 - Google Patents


Info

Publication number
WO2019192205A1
WO2019192205A1 · PCT/CN2018/119083 · CN2018119083W
Authority
WO
WIPO (PCT)
Prior art keywords
limb
points
contour
line
image
Prior art date
Application number
PCT/CN2018/119083
Other languages
English (en)
French (fr)
Inventor
闫桂新
张浩
陈丽莉
楚明磊
孙剑
苗京花
田文红
董瑞君
赵斌
郭子强
Original Assignee
京东方科技集团股份有限公司
北京京东方光电科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 京东方科技集团股份有限公司, 北京京东方光电科技有限公司 filed Critical 京东方科技集团股份有限公司
Priority to US16/473,509 priority Critical patent/US11354925B2/en
Publication of WO2019192205A1 publication Critical patent/WO2019192205A1/zh

Classifications

    • G — PHYSICS
      • G06 — COMPUTING; CALCULATING OR COUNTING
        • G06V — IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
          • G06V10/00 — Arrangements for image or video recognition or understanding
            • G06V10/40 — Extraction of image or video features
              • G06V10/44 — Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
                • G06V10/443 — Local feature extraction by matching or filtering
              • G06V10/56 — Extraction of image or video features relating to colour
          • G06V40/00 — Recognition of biometric, human-related or animal-related patterns in image or video data
            • G06V40/10 — Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
              • G06V40/103 — Static body considered as a whole, e.g. static pedestrian or occupant recognition
              • G06V40/107 — Static hand or arm
                • G06V40/113 — Recognition of static hand signs

Definitions

  • Embodiments of the present disclosure relate to a method, apparatus, device, and computer-readable storage medium for identifying limb representation information in an image.
  • Each device has its own unique characteristics, either focusing on immersion or focusing on interactivity.
  • a method of identifying limb representation information in an image, comprising: determining a skeleton-like line of a limb in the image; and identifying the limb representation information according to the skeleton-like line.
  • the limb representation information includes a posture state of one of a trunk, limbs, head and neck, hand, or foot, or a combined posture state.
  • the determining a skeletal line of a limb in the image includes determining a midline of the limb in the image, and determining a skeletal line of the limb based on the midline.
  • the determining a skeleton-like line of a limb in the image comprises: acquiring a contour of a binary map of the limb in the image; and determining the skeleton-like line of the limb in the image according to the directional gradient and the contour of the binary map of the limb.
  • the acquiring the contour of the binary map of the limb in the image comprises: selecting a corresponding chrominance component according to the color feature of the limb to perform segmentation and determine the binary map of the limb in the image; and extracting the contour of the binary map of the limb from the binary map of the limb.
  • before the segmentation, the image is converted from the RGB color representation to the YCrCb color representation.
  • the limb includes a hand
  • the acquiring of the contour of the binary map of the limb in the image further includes: performing noise reduction processing on the image; and determining the presence of a hand in the image by palm recognition.
  • the determining the skeleton-like line of the limb in the image based on the directional gradient and the contour of the binary map of the limb comprises: determining points in the skeleton-like line based on each contour point (x, y) in the contour of the binary map of the limb; and determining the skeleton-like line based on the points in the skeleton-like line.
  • determining points in the skeleton-like line based on each contour point (x, y) in the contour of the binary map of the limb includes: determining whether two of the contour points in the contour of the binary map of the limb are both boundary points; where both contour points are boundary points, determining the midpoint of the two boundary points; determining whether the midpoint is within the contour of the binary map of the limb; and, where the midpoint is within the contour, determining that the midpoint is a point in the skeleton-like line.
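The row-wise midpoint rule above can be sketched as follows. This is a minimal illustration rather than the patent's implementation: the directional-gradient boundary test is simplified to taking the two endpoints of each foreground run in a row (where the horizontal gradient changes sign), and the function name is chosen here.

```python
import numpy as np

def skeleton_points(mask):
    """For each row, pair up boundary points, take their midpoints,
    and keep midpoints that fall inside the shape (a simplified
    stand-in for the patent's directional-gradient test)."""
    pts = []
    h, w = mask.shape
    for y in range(h):
        xs = np.flatnonzero(mask[y])          # foreground columns in this row
        if xs.size < 2:
            continue                          # rows with fewer than 2 points are dropped
        # split into contiguous foreground runs; run endpoints act as boundary points
        runs = np.split(xs, np.flatnonzero(np.diff(xs) > 1) + 1)
        for run in runs:
            x1, x2 = run[0], run[-1]
            xm = (x1 + x2) // 2               # midpoint per x_med = (x1 + x2) / 2
            if mask[y, xm]:                   # keep only midpoints inside the contour
                pts.append((int(xm), y))
    return pts

# a 5-wide vertical bar: the skeleton should run down its middle column
mask = np.zeros((4, 7), dtype=bool)
mask[:, 1:6] = True
print(skeleton_points(mask))   # [(3, 0), (3, 1), (3, 2), (3, 3)]
```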
  • determining whether two of the contour points in the contour of the binary map of the limb are both boundary points comprises: classifying each contour point (x, y) in the contour according to its y value, and grouping the points with the same y value into the sequence seq_y(x_1, x_2, ...), which yields: S(y) = {seq_yi(x_yi,1, x_yi,2, ...) | yi ∈ (1, ..., h), x_i ∈ (1, ..., w)}
  • determining whether the two points are boundary points includes: taking the first two points x_yi,1', x_yi,2' of the sequence corresponding to the same y value and judging, according to the directional gradients at x_yi,1' and x_yi,2', whether these two points are boundary points; if neither point is a boundary point, removing both from the sequence and taking new points; if one point is not a boundary point, removing that point from the sequence and taking a new point; until both are judged to be boundary points.
  • determining whether two of the contour points in the contour of the binary map of the limb are both boundary points further comprises: among the sequences corresponding to the same y value, deleting any sequence with fewer than two points.
  • the identifying the limb representation information according to the skeleton-like line includes: removing points in the skeleton-like line that do not meet a preset requirement to obtain the limb represented by the skeleton-like lines; and identifying the limb representation information according to the limb so represented.
  • the culling of points in the skeleton-like line that do not meet the preset requirement, to obtain the limb represented by the skeleton-like lines, includes: determining the number of pixels in each class skeleton line; and removing class skeleton lines whose pixel count is smaller than a set threshold, obtaining the limb represented by the remaining skeleton lines.
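A minimal sketch of the pixel-count culling step; the value of `min_pixels` is an assumption, since the disclosure leaves the set threshold open:

```python
def cull_short_lines(ske_lines, min_pixels=5):
    """Remove class skeleton lines whose pixel count is below the
    threshold; what remains is the limb represented by skeleton lines."""
    return {name: pts for name, pts in ske_lines.items()
            if len(pts) >= min_pixels}

ske_lines = {
    "L1": [(3, y) for y in range(40)],   # a finger-length line: kept
    "L2": [(9, 0), (9, 1)],              # a 2-pixel noise fragment: removed
}
print(sorted(cull_short_lines(ske_lines)))   # ['L1']
```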
  • determining the number of pixels in each class skeleton line includes: the collection of all class skeleton lines is represented as skeLines(L) = {L_1: {(x_y1,1, y1)}, L_2: {(x_y1,2, y1)}, ...},
  • where L_1, L_2, ... each denote a class skeleton line, and (x_y1,1, y1), (x_y1,2, y1), ... denote the pixel points constituting that class skeleton line;
  • the surviving midpoints of each row are recorded as ske(y) = {lines_seq(x_yi,1, x_yi,2, ...)}
  • the culling of points in the skeleton-like line that do not meet the preset requirement, to obtain the limb represented by skeleton-like lines, further includes: taking the first sequence of ske(y) and using its points as starting points of skeleton lines, the number of starting points being equal to the number of elements in the sequence; all sequences are then traversed starting from the second sequence of ske(y).
  • the traversal of a sequence includes: traversing the current sequence starting from its first element; for the current element (x*, y*), finding among L_1(p_1), L_2(p_2), ..., L_N(p_N) the point closest to it, whose class skeleton line is denoted L*(p*); when the distance between (x*, y*) and L*(p*) is less than the set value, (x*, y*) is appended to the end of the class skeleton line L*; when the distance is not less than the set value, (x*, y*) is taken as the starting point of a new class skeleton line, which is added to skeLines(L).
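The traversal above can be sketched as a greedy nearest-tail clustering. `max_dist` stands in for the unspecified "set value", and the `rows` layout (one (y, x-list) pair per ske(y) sequence) is an assumption made for illustration:

```python
import math

def cluster_skeleton_lines(rows, max_dist=2.0):
    """rows: list of (y, [x1, x2, ...]) with y ascending — the ske(y) sequences.
    The first row's points seed the class skeleton lines; each later point
    joins the line whose tail point is nearest, if within max_dist,
    otherwise it starts a new class skeleton line."""
    y0, xs0 = rows[0]
    lines = [[(x, y0)] for x in xs0]           # one starting point per element
    for y, xs in rows[1:]:
        for x in xs:
            # distance from (x, y) to each line's last point
            dists = [math.dist((x, y), line[-1]) for line in lines]
            k = min(range(len(lines)), key=dists.__getitem__)
            if dists[k] < max_dist:
                lines[k].append((x, y))        # append to the nearest line L*
            else:
                lines.append([(x, y)])         # start a new class skeleton line
    return lines

rows = [(0, [2, 10]), (1, [2, 10]), (2, [2, 10, 20])]
lines = cluster_skeleton_lines(rows)
print(len(lines))   # 3: two tracked lines plus a new line started at x=20
```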
  • an apparatus for identifying limb representation information in an image, comprising: a determining unit configured to determine a skeleton-like line of a limb in the image; and an identifying unit configured to identify the limb representation information according to the skeleton-like line.
  • the limb representation information includes a posture state of one of a trunk, limbs, head and neck, hand, or foot, or a combined posture state.
  • an identification device for limb representation information in an image, comprising a processor and a memory, wherein the memory includes instructions executable by the processor, and the processor performs the above method when executing the instructions.
  • a computer readable storage medium having stored thereon computer program instructions that, when executed by a processor, implement the aforementioned methods.
  • FIG. 1 is a flowchart of a method for identifying a limb representation information in an image according to an embodiment of the present disclosure
  • FIG. 2 is a schematic diagram of an image to be identified according to an embodiment of the present disclosure
  • FIG. 3 is a schematic diagram of a gesture binary diagram according to an embodiment of the present disclosure.
  • FIG. 4 is a schematic diagram of a contour of a gesture binary image according to an embodiment of the present disclosure
  • FIG. 5 is a schematic diagram of a skeleton-like line provided by an embodiment of the present disclosure.
  • FIG. 6 is a schematic diagram of a gesture diagram of a skeleton-like line representation according to an embodiment of the present disclosure
  • FIG. 7 is a schematic structural diagram of an apparatus for identifying a limb representation information in an image according to an embodiment of the present disclosure
  • FIG. 8 is a schematic structural diagram of an apparatus for identifying a limb representation information in an image according to an embodiment of the present disclosure.
  • the inventors of the present disclosure have realized that, in terms of portability, a conventional mouse, keyboard, and the like can hardly meet real-time use requirements, and their operation is inconvenient for virtual reality.
  • a limb representation is essentially a representation of a feature. Taking the gesture as an example, a gesture is represented with 26 degrees of freedom in binocular-camera recognition, and the hardware and software development costs are high.
  • the identification of limb representation information is mostly performed by extracting the outer contour from the video frame and using the perimeter and area of the contour as the basis for discrimination.
  • the error of this method is large, and the recognition rate is also low when the limb moves back and forth in front of the lens.
  • a method for identifying a limb representation information in an image includes:
  • Step S101: determining a skeleton-like line of a limb in the image;
  • Step S102: identifying the limb representation information according to the skeleton-like line.
  • the limb representation information includes a posture state of one of the trunk, the limb, the head and neck, the hand, and the foot, or a posture state of any combination.
  • the method identifies the limb representation information by recognizing the skeleton-like line, without relying on the perimeter and area of the contour; the recognition error is small, the recognition rate is high, the identification can be realized with a single camera, and less equipment is required.
  • the user's body language expression information can be better recognized, so that the command can be further executed or translated into other languages according to the information expressed by the body language.
  • the method is used to identify a limb in an image of a video frame.
  • the change in the limb is identified by a plurality of video frames, thereby identifying the information represented by the limb from the limb change.
  • each video frame can be identified, only video frames with clearly visible limbs can be identified, or one out of every several video frames can be identified; identifying every frame gives higher accuracy but a larger amount of calculation.
  • the skeletal line in the embodiment of the present disclosure may be a line simulating the bone of the body or body part.
  • the skeleton-like line is a simple single line; for example, each finger, each arm, and the torso each correspond to only one class skeleton line.
  • the skeletal line is determined by determining the midline of the limb.
  • step S101 the skeletal line of the limb in the image is determined, including:
  • a midline of the limb in the image is determined, and a skeletal line of the limb is determined based on the midline.
  • step S101 the skeleton-like line of the limb in the image is determined, including:
  • acquiring the contour of the binary map of the limb in the image, and determining the skeleton-like line of the limb according to the directional gradient and the contour.
  • when extracting the contour of the binary map of the limb, the corresponding chrominance component may be selected according to the color of the limb in the image to determine the binary map; selecting the chrominance component closest to the limb's color makes the extraction of the binary map contour more accurate.
  • step S101 the contour of the binary image of the limb in the image is obtained, including:
  • the corresponding chrominance component is selected for segmentation, and the binary image of the limb in the image is determined;
  • the contour of the limb's binary map is extracted from the limb's binary map.
  • Otsu segmentation may be performed on the Cr channel of the image to determine the gesture binary map, from which the contour of the gesture binary map is extracted.
  • the Cr channel is suitable for representing the skin color of the human body
  • the gesture binary map is determined by performing Otsu segmentation on the Cr channel.
  • the gesture binary map and its contour are thus extracted with high accuracy.
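Otsu's method itself is standard; a self-contained sketch on a single channel (the function name and the sample array are chosen here for illustration):

```python
import numpy as np

def otsu_threshold(channel):
    """Otsu's method on a single (e.g. Cr) channel: pick the threshold
    that maximizes the between-class variance of the binary split."""
    hist = np.bincount(channel.ravel(), minlength=256).astype(float)
    p = hist / hist.sum()                    # gray-level probabilities
    bins = np.arange(256)
    best_t, best_var = 0, 0.0
    for t in range(1, 256):
        w0, w1 = p[:t].sum(), p[t:].sum()    # class weights
        if w0 == 0 or w1 == 0:
            continue
        mu0 = (bins[:t] * p[:t]).sum() / w0  # class means
        mu1 = (bins[t:] * p[t:]).sum() / w1
        var = w0 * w1 * (mu0 - mu1) ** 2     # between-class variance
        if var > best_var:
            best_t, best_var = t, var
    return best_t

# a toy Cr channel with a dark background cluster and a skin-tone cluster
cr = np.array([[50, 52, 200],
               [51, 201, 199],
               [50, 200, 202]], dtype=np.uint8)
t = otsu_threshold(cr)
print(t)   # a threshold in (52, 199] separating the two clusters
```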
  • the image format of the video frame is in the RGB format.
  • the YCrCb space is selected as the mapping space for the skin-color distribution statistics; this space has the advantage of being less affected by brightness changes and of being a two-dimensional independent distribution, which better confines the skin-color distribution area. In this case, before the corresponding chrominance component is selected for segmentation according to the color feature of the limb, the image needs to be color-converted.
  • the color conversion can adopt a standard RGB-to-YCrCb conversion formula, in which:
  • Y represents brightness (Luminance or Luma), which is the grayscale value
  • Cr reflects the difference between the red portion of the RGB input signal and the luminance value of the RGB signal.
  • Cb reflects the difference between the blue portion of the RGB input signal and the luminance value of the RGB signal.
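The conversion can be sketched with the common ITU-R BT.601 full-range coefficients (the choice used, e.g., by OpenCV); the disclosure itself does not fix the coefficients, so they are an assumption here:

```python
def rgb_to_ycrcb(r, g, b):
    """BT.601 full-range RGB -> YCrCb: Y is the luminance (grayscale)
    value; Cr and Cb are the red and blue chrominance differences,
    offset to the [0, 255] range."""
    y  = 0.299 * r + 0.587 * g + 0.114 * b
    cr = (r - y) * 0.713 + 128      # red-difference chrominance
    cb = (b - y) * 0.564 + 128      # blue-difference chrominance
    return y, cr, cb

print(rgb_to_ycrcb(255, 255, 255))   # ≈ (255.0, 128.0, 128.0): white is pure luma
```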
  • the gesture is represented as a white foreground on a black background; segmenting the image of FIG. 2 yields the binary map shown in FIG. 3.
  • the Otsu algorithm is used because it binarizes the image well.
  • the contours of the obtained binary map are searched, and the largest contour at the position of the palm is selected as the gesture contour, as shown in FIG. 4.
  • when the limb is a hand, acquiring the contour of the binary map of the limb in the image also includes confirming by palm recognition that a hand is present; this improves recognition accuracy, avoids identifying images in which no hand is present, and reduces the system load.
  • when the limb is an arm, a lower limb, or the body, the limb can likewise be pre-detected in the image by a corresponding recognition step before further recognition, thereby reducing the amount of calculation.
  • filtering out the noise is advantageous for limb recognition and improves the recognition accuracy.
  • the filtering method can be selected according to the type of noise in the image; for example, the salt-and-pepper noise in a typical image can be removed by median filtering.
  • the image after noise reduction is: f(x, y) = med{ g(x − k, y − l) | (k, l) ∈ W }, where f(x, y) is the processed image, g is the image before filtering, and W is typically a 3*3 or 5*5 two-dimensional template.
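A direct (unoptimized) sketch of median filtering with a k×k template W, as described above:

```python
import numpy as np

def median_filter(img, k=3):
    """Median filtering with a k*k template W (k = 3 or 5 in the text):
    each pixel of f is the median of its k*k neighborhood in g, which
    removes salt-and-pepper outliers."""
    pad = k // 2
    g = np.pad(img, pad, mode="edge")        # replicate borders
    f = np.empty_like(img)
    h, w = img.shape
    for y in range(h):
        for x in range(w):
            f[y, x] = np.median(g[y:y + k, x:x + k])
    return f

img = np.full((5, 5), 100, dtype=np.uint8)
img[2, 2] = 255                      # one salt-noise pixel
print(median_filter(img)[2, 2])      # 100 — the outlier is replaced
```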
  • determining the skeleton-like line of the limb in the image may include: determining points in the skeleton-like line based on each contour point (x, y) in the contour of the binary map of the limb; and determining the skeleton-like line based on those points.
  • determining points in the skeleton-like line based on each contour point (x, y) in the contour of the binary map of the limb includes: determining whether two of the contour points in the contour of the binary map of the limb are both boundary points; when both contour points are boundary points, determining the midpoint of the two boundary points; determining whether the midpoint is within the contour of the binary map of the limb; and, where it is, determining that the midpoint is a point in the skeleton-like line.
  • determining whether two of the contour points in the contour of the binary map of the limb are both boundary points comprises: classifying each contour point (x, y) in the contour according to its y value, and grouping the points with the same y value into the sequence seq_y(x_1, x_2, ...), which yields: S(y) = {seq_yi(x_yi,1, x_yi,2, ...) | yi ∈ (1, ..., h), x_i ∈ (1, ..., w)}
  • determining whether the two points are boundary points includes: taking the first two points x_yi,1', x_yi,2' of the sequence corresponding to the same y value and judging, according to the directional gradients at x_yi,1' and x_yi,2', whether these two points are boundary points; if neither point is a boundary point, removing both from the sequence and taking new points; if one point is not a boundary point, removing that point from the sequence and taking a new point; until both are judged to be boundary points.
  • determining whether two of the contour points in the contour of the binary map of the limb are both boundary points further includes: among the sequences corresponding to the same y value, deleting any sequence with fewer than two points.
  • determining the skeleton-like line of the limb in the image based on the directional gradient and the contour of the binary map of the limb includes: classifying each contour point (x, y) in the contour according to its y value and grouping the points with the same y value into the sequence seq_y(x_1, x_2, ...), obtaining S(y) = {seq_yi(x_yi,1, x_yi,2, ...) | yi ∈ (1, ..., h), x_i ∈ (1, ..., w)}; sorting each sequence by x value; applying the boundary-point and midpoint tests; and recording the surviving midpoints of each row into new sequences, giving ske(y) = {lines_seq(x_yi,1, x_yi,2, ...)}
  • in step S102, before the limb representation information is identified, points in the skeleton-like line that do not meet preset requirements may be eliminated, avoiding misjudgments caused by such points and making the identification of the limb representation information more accurate.
  • in step S102, the limb representation information is identified according to the skeleton-like line by: removing points in the skeleton-like line that do not meet the preset requirement to obtain the limb represented by skeleton-like lines; and identifying the limb representation information from the limb so represented.
  • culling points in the skeleton-like line that do not meet the preset requirement, to obtain the limb represented by skeleton-like lines, includes: determining the number of pixels in each class skeleton line; and removing class skeleton lines whose pixel count is smaller than the set threshold, obtaining the limb represented by the remaining skeleton lines.
  • determining the number of pixels in each class skeleton line can be accomplished as follows: the collection of all class skeleton lines is represented as skeLines(L) = {L_1: {(x_y1,1, y1)}, L_2: {(x_y1,2, y1)}, ...},
  • where L_1, L_2, ... each denote a class skeleton line, and (x_y1,1, y1), (x_y1,2, y1), ... denote the pixel points constituting that class skeleton line;
  • the points of the skeleton-like line that do not meet the preset requirements are removed and the limb represented by skeleton-like lines is obtained as follows: the first sequence of ske(y) is taken and its points are used as skeleton-line starting points, the number of starting points being equal to the number of elements in the sequence; all sequences are then traversed starting from the second sequence of ske(y).
  • traversing a sequence may include: traversing the current sequence starting from its first element; for the current element (x*, y*), finding among L_1(p_1), L_2(p_2), ..., L_N(p_N) the point closest to it, whose class skeleton line is denoted L*(p*); when the distance between (x*, y*) and L*(p*) is less than the set value, (x*, y*) is appended to the end of the class skeleton line L*; otherwise, (x*, y*) is taken as the starting point of a new class skeleton line, which is added to skeLines(L).
  • the points in the skeleton-like line that do not meet the preset requirements are eliminated, and the limb represented by skeleton-like lines is obtained, where L_1, L_2, ... each denote a class skeleton line, and (x_y1,1, y1), (x_y1,2, y1), ... denote the pixel points composing that skeleton line;
  • the gesture diagram is as shown in Fig. 6.
  • the embodiment of the present disclosure recognizes the skeleton-like lines in the image; the skeleton-like lines express the various limbs more clearly, the features are richer, the recognition rate is greatly improved, and a reliable basis is provided for further identifying the limb representation information.
  • the embodiment of the present disclosure further provides an apparatus for identifying a limb representation information in an image.
  • the apparatus corresponds to the identification method in the foregoing embodiment.
  • the identification device includes:
  • a determining unit 701 configured to determine a skeletal line of a limb in the image
  • the identification unit 702 is configured to perform identification of the limb representation information according to the skeleton-like line.
  • the above determining unit 701 and the identifying unit 702 are functional entities, which may be implemented by software, hardware or firmware, for example by a processor executing program code or a programmable logic circuit designed to perform corresponding functions.
  • the limb representation information includes a posture state of one of the trunk, the limb, the head and neck, the hand, and the foot, or a posture state of any combination.
  • the determining unit 701 is specifically configured to:
  • a midline of the limb in the image is determined, and a skeletal line of the limb is determined based on the midline.
  • the determining unit 701 is specifically configured to:
  • acquire the contour of the binary map of the limb in the image, and determine the skeleton-like line of the limb according to the directional gradient and the contour.
  • the determining unit 701 acquires the outline of the binary map of the limb in the image, including:
  • the corresponding chrominance component is selected for segmentation, and the binary image of the limb in the image is determined;
  • the contour of the limb's binary map is extracted from the binary map of the limb.
  • the determining unit 701 is further configured to:
  • the determining unit 701 determines the skeleton-like line of the limb in the image according to the directional gradient and the contour of the binary map of the limb, including: classifying each contour point (x, y) in the contour according to its y value and grouping the points with the same y value into the sequence seq_y(x_1, x_2, ...), obtaining S(y); sorting each sequence by x value; and recording the surviving midpoints of each row into ske(y) = {lines_seq(x_yi,1, x_yi,2, ...)}
  • the identification unit 702 is specifically configured to:
  • the identification of the limb representation information is performed based on the limb represented by the skeleton-like line in each image frame.
  • the recognition unit 702 culls the points in the skeleton-like line that do not meet the preset requirement and obtains the limb represented by skeleton-like lines,
  • where L_1, L_2, ... each denote a class skeleton line, (x_y1,1, y1), (x_y1,2, y1), ... denote the pixel points composing that skeleton line,
  • and x_i ∈ (1, ..., w), yi ∈ (1, ..., h).
  • the device may be implemented in a browser or other security application of the electronic device in advance, or may be loaded into a browser of the electronic device or a secure application thereof by downloading or the like.
  • Corresponding units in the device can cooperate with units in the electronic device to implement the solution of embodiments of the present disclosure.
  • an identification device for limb representation information in an image comprising a processor and a memory; the memory including instructions executable by the processor, the processor executing the instruction The method of any of the preceding embodiments is performed.
  • FIG. 8 a block diagram of a computer system suitable for use in implementing a terminal device or server of an embodiment of the present disclosure is shown.
  • the computer system includes a processor 801 that can perform various appropriate actions and processing according to a program stored in a read-only memory (ROM) 802 or a program loaded from the storage portion 808 into the random access memory (RAM) 803.
  • in the RAM 803, various programs and data required for system operation are also stored.
  • the processor 801, the ROM 802, and the RAM 803 are connected to each other through a bus 804.
  • An input/output (I/O) interface 805 is also coupled to bus 804.
  • the following components are connected to the I/O interface 805: an input portion 806 including a keyboard, a mouse, etc.; an output portion 807 including, for example, a cathode-ray tube (CRT), a liquid-crystal display (LCD), and the like; a storage portion 808 including a hard disk or the like; and a communication portion 809 including a network interface card such as a LAN card, a modem, or the like. The communication portion 809 performs communication processing via a network such as the Internet.
  • A drive 810 is also coupled to the I/O interface 805 as needed.
  • a removable medium 811 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory or the like, is mounted on the drive 810 as needed so that a computer program read therefrom is installed into the storage portion 808 as needed.
  • an embodiment of the present disclosure includes a computer program product comprising a computer program tangibly embodied on a machine-readable medium, the computer program comprising program code for performing the method shown in FIG. 1.
  • the computer program can be downloaded and installed from the network via communication portion 809, and/or installed from removable media 811.
  • each block of the flowcharts or block diagrams can represent a module, a program segment, or a portion of code that includes one or more executable instructions for implementing the specified logical functions. It should also be noted that the functions noted in the blocks may occur out of the order noted in the drawings: two blocks shown in succession may in fact be executed substantially in parallel, or sometimes in the reverse order, depending upon the functionality involved.
  • each block of the block diagrams and/or flowcharts, and combinations of such blocks, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
  • the units or modules described in the embodiments of the present disclosure may be implemented by software or by hardware.
  • the described unit or module may also be provided in the processor, for example, as a processor including an XX unit, a YY unit, and a ZZ unit.
  • the names of these units or modules do not in some cases constitute a limitation on the unit or module itself.
  • the XX unit may also be described as "a unit for XX".
  • the present disclosure also provides a computer readable storage medium that implements the methods of the foregoing embodiments when instructions in the storage medium are executed.
  • the computer readable storage medium may be a computer readable storage medium included in the apparatus described in the above embodiments; or may be a computer readable storage medium that is separately present and not incorporated in the apparatus.
  • the computer readable storage medium stores one or more programs that are used by one or more processors to perform the methods described in the present disclosure.
  • the processor may be a central processing unit (CPU), a field-programmable gate array (FPGA), a microcontroller unit (MCU), a digital signal processor (DSP), or an application-specific integrated circuit (ASIC).

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

A method for identifying limb representation information in an image, including: for an image containing a limb, determining a skeleton-like line of the limb in the image, and then identifying the limb representation information according to the skeleton-like line. An apparatus, a device, and a storage medium for identifying limb representation information in an image are also provided.

Description

METHOD, APPARATUS AND DEVICE FOR IDENTIFYING BODY REPRESENTATION INFORMATION IN AN IMAGE, AND COMPUTER READABLE STORAGE MEDIUM

Technical Field
Embodiments of the present disclosure relate to a method, an apparatus, and a device for identifying body representation information in an image, and to a computer readable storage medium.
Background Art
With the rapid development of virtual reality technology, more and more virtual reality devices appear in daily life. Each kind of device has its own characteristics, emphasizing, for example, immersion or interactivity.
As for human-computer interaction, people keep exploring interaction modes that better match human communication habits. Common human-computer interaction devices include the mouse, keyboard, printer, and sketchpad, all of which rely on hardware.
Summary
According to at least one embodiment of the present disclosure, a method for identifying body representation information in an image is provided, comprising: determining a skeleton-like line of a body in an image; and identifying body representation information according to the skeleton-like line.
For example, the body representation information includes a posture state of one of a torso, a limb, a head and neck, a hand, and a foot, or a posture state of a combination thereof.
For example, determining the skeleton-like line of the body in the image includes: determining a midline of the body in the image, and determining the skeleton-like line of the body according to the midline.
For example, determining the skeleton-like line of the body in the image includes: acquiring a contour of a binary image of the body in the image; and determining the skeleton-like line of the body in the image according to directional gradients and the contour of the binary image of the body.
For example, acquiring the contour of the binary image of the body in the image includes: selecting a corresponding chrominance component for segmentation according to color characteristics of the body, to determine the binary image of the body in the image; and extracting the contour of the binary image of the body from the binary image of the body.
For example, before selecting the corresponding chrominance component for segmentation according to the color characteristics of the body, the image is converted from an RGB color representation to a YCrCb color representation.
For example, the body includes a hand, and acquiring the contour of the binary image of the body in the image further includes: performing noise reduction on the image; and confirming, through palm detection, that a hand exists in the image.
For example, determining the skeleton-like line of the body in the image according to the directional gradients and the contour of the binary image of the body includes: determining points of the skeleton-like line based on each contour point (x, y) of the contour of the binary image of the body; and determining the skeleton-like line based on the points of the skeleton-like line.
For example, determining points of the skeleton-like line based on each contour point (x, y) of the contour of the binary image of the body includes: determining whether two contour points among all the contour points of the contour of the binary image of the body are both boundary points; in a case where the two contour points are both boundary points, determining a midpoint of the two boundary points; determining whether the midpoint lies within the contour of the binary image of the body; and in a case where the midpoint lies within the contour of the binary image of the body, determining the midpoint to be a point of the skeleton-like line.
For example, determining whether two contour points among all the contour points of the contour of the binary image of the body are both boundary points includes: classifying each contour point (x, y) of the contour of the binary image of the body by its y value, grouping contour points with the same y value into a sequence seq_y(x_1, x_2, ...), giving:
S(y) = {seq_yi(x_{yi,1}, x_{yi,2}, ...) | yi ∈ (1, ..., h), x_i ∈ (1, ..., w)};
sorting each sequence by the magnitude of the x values, giving:
S'(y) = {seq_yi(x_{yi,1}', x_{yi,2}', ...) | yi ∈ (1, ..., h), x_i' ∈ (1, ..., w)};
and, in the sequence corresponding to the same y value, judging whether two points of the sequence are boundary points based on the directional gradients of the two points.
For example, judging, in the sequence corresponding to the same y value, whether two points of the sequence are boundary points based on their directional gradients includes: taking the first two points x_{yi,1}' and x_{yi,2}' of the sequence and judging, according to the directional gradients of the two points, whether both are boundary points; if neither point is a boundary point, removing both from the sequence and taking new points; if one point is not a boundary point, removing that point from the sequence and taking a new point; and repeating until both points are judged to be boundary points.
For example, determining whether two contour points among all the contour points of the contour of the binary image of the body are both boundary points further includes: among the sequences corresponding to the same y value, deleting any sequence containing fewer than two points.
For example, when the two contour points are both boundary points, determining the midpoint of the two boundary points includes: determining the midpoint of the two points based on the formula x_{yi,med1} = (x_{yi,1}' + x_{yi,2}')/2. Determining whether the midpoint lies within the contour of the binary image of the body includes: if the midpoint lies within the contour of the binary image of the body, recording this point into a new sequence lines_seq and deleting x_{yi,1}' and x_{yi,2}'; if the midpoint does not lie within the contour of the binary image of the body, taking new points.
For example, identifying the body representation information according to the skeleton-like line includes: removing points of the skeleton-like lines that do not meet preset requirements, to obtain the body represented by skeleton-like lines; and identifying the body representation information according to the body represented by skeleton-like lines in the image.
For example, removing the points of the skeleton-like lines that do not meet preset requirements to obtain the body represented by skeleton-like lines includes: determining the number of pixels in each skeleton-like line; and removing the skeleton-like lines whose pixel count is below a set threshold, to obtain the body represented by skeleton-like lines.
For example, determining the number of pixels in each skeleton-like line includes: representing the set of all skeleton-like lines as:
skeLines(L) = {L_1: {(x_{y1,1}, y1)}, L_2: {(x_{y1,2}, y1)}, ...},
where L_1, L_2, ... each denote one skeleton-like line, and (x_{y1,1}, y1), (x_{y1,2}, y1), ... denote the pixels composing that skeleton-like line;
representing all points of the skeleton-like lines as:
ske(y) = {lines_seq(x_{yi,1}, x_{yi,2}, ...) | yi ∈ (1, ..., h), x_i ∈ (1, ..., w)};
and, for each sequence in ske(y), counting the number N of skeleton-like lines in skeLines(L), counting the number P of pixels of each skeleton-like line, and determining the pixel counts p_1, p_2, ..., p_N of the skeleton-like lines, the last pixel of each skeleton-like line being denoted L_1(p_1), L_2(p_2), ..., L_N(p_N).
For example, removing the points of the skeleton-like lines that do not meet preset requirements to obtain the body represented by skeleton-like lines further includes: taking the first sequence of ske(y), using its points as starting points of skeleton-like lines, the number of starting points being equal to the number of elements in the sequence, and traversing all sequences starting from the second sequence of ske(y).
For example, traversing a sequence includes: starting from the first element of the current sequence, traversing the current sequence and finding, among L_1(p_1), L_2(p_2), ..., L_N(p_N), the point (x*, y*) nearest to the current element, the corresponding skeleton-like line being denoted L*(p*); when the distance between (x*, y*) and L*(p*) is smaller than a set value, appending (x*, y*) to the end of the skeleton-like line L*; and when the distance between (x*, y*) and L*(p*) is not smaller than the set value, taking (x*, y*) as the starting point of a new skeleton-like line and adding the new skeleton-like line to skeLines(L).
According to at least one embodiment of the present disclosure, an apparatus for identifying body representation information in an image is provided, comprising: a determination unit configured to determine a skeleton-like line of a body in an image; and an identification unit configured to identify body representation information according to the skeleton-like line.
For example, the body representation information includes a posture state of one of a torso, a limb, a head and neck, a hand, and a foot, or a posture state of a combination thereof.
According to at least one embodiment of the present disclosure, a device for identifying body representation information in an image is provided, comprising a processor and a memory, wherein the memory contains instructions executable by the processor, and the processor performs the foregoing method when executing the instructions.
According to at least one embodiment of the present disclosure, a computer readable storage medium is provided, on which computer program instructions are stored, the computer program instructions implementing the foregoing method when executed by a processor.
Brief Description of the Drawings
Other features, objects, and advantages of the present disclosure will become more apparent upon reading the following detailed description of non-limiting embodiments made with reference to the accompanying drawings:
Fig. 1 is a flowchart of a method for identifying body representation information in an image according to an embodiment of the present disclosure;
Fig. 2 is a schematic diagram of an image to be identified according to an embodiment of the present disclosure;
Fig. 3 is a schematic diagram of a gesture binary image according to an embodiment of the present disclosure;
Fig. 4 is a schematic diagram of the contour of a gesture binary image according to an embodiment of the present disclosure;
Fig. 5 is a schematic diagram of skeleton-like lines according to an embodiment of the present disclosure;
Fig. 6 is a schematic diagram of a gesture represented by skeleton-like lines according to an embodiment of the present disclosure;
Fig. 7 is a schematic structural diagram of an apparatus for identifying body representation information in an image according to an embodiment of the present disclosure;
Fig. 8 is a schematic structural diagram of a device for identifying body representation information in an image according to an embodiment of the present disclosure.
Detailed Description
The present disclosure is further described in detail below with reference to the accompanying drawings and embodiments. It will be understood that the specific embodiments described herein serve only to explain the related invention, not to limit it. It should also be noted that, for ease of description, only the parts related to the invention are shown in the drawings.
It should be noted that, where no conflict arises, the embodiments of the present disclosure and the features in the embodiments may be combined with one another. The present disclosure is described in detail below with reference to the drawings and in conjunction with embodiments.
The inventors of the present disclosure have realized that, in terms of portability, traditional devices such as the mouse and keyboard can hardly meet real-time usage requirements; their mode of operation is also inconvenient for virtual reality.
By contrast, vision-based identification of body representation information effectively overcomes the above drawbacks. Mounting a camera on a virtual reality device enables interaction based on body recognition, reducing both hardware and software costs.
Characterizing the body is important for identifying body representation information; body representation is essentially a representation of features. Taking gestures as an example, in binocular-camera recognition a gesture is characterized with 26 degrees of freedom, which carries high hardware and software development costs.
In the techniques known to the inventors, body representation information is mostly identified by extracting the outer contour of video frames and using the perimeter and area of the contour as discrimination criteria. This method has large errors, and the recognition rate is low when the body moves sideways or back and forth in front of the camera.
Referring to Fig. 1, a method for identifying body representation information in an image according to an embodiment of the present disclosure includes:
Step S101: determining a skeleton-like line of a body in the image;
Step S102: identifying body representation information according to the skeleton-like line.
For example, the body representation information includes a posture state of one of a torso, a limb, a head and neck, a hand, and a foot, or a posture state of any combination thereof.
This method identifies body representation information by identifying skeleton-like lines, without relying on the perimeter and area of a contour. Its identification error is small and its recognition rate high, and the identification can be achieved with a single camera, imposing low requirements on the device.
Through this method, the information expressed by a user's body language can be identified well, so that commands can further be executed, or the information translated into other languages, according to the body language.
In some embodiments, the method is used to identify the body in the images of video frames. Changes of the body are identified across multiple video frames, and the information represented by the body is identified from those changes.
For a video, every video frame may be identified; alternatively, only the frames that clearly contain a body may be identified, or identification may be performed once every set number of frames. Of course, identifying every frame yields higher accuracy at a higher computational cost.
A skeleton-like line in the embodiments of the present disclosure may be a line simulating the skeleton inside a body or body part.
In some embodiments, the skeleton-like line is a simple single line; for example, each finger, each arm, and each torso corresponds to only one skeleton-like line.
In some embodiments, the skeleton-like line is determined by determining the midline of the body.
For example, in step S101, determining the skeleton-like line of the body in the image includes:
determining a midline of the body in the image, and determining the skeleton-like line of the body according to the midline.
In some embodiments, in step S101, determining the skeleton-like line of the body in the image includes:
acquiring the contour of the binary image of the body in the image;
determining the skeleton-like line of the body in the image according to directional gradients and the contour of the binary image of the body.
For example, when extracting the contour of the binary image of the body, the chrominance component corresponding to the color of the body in the image may be selected for segmentation to determine the binary image of the body; selecting the chrominance component of a color close to the body's color makes the extraction of the binary-image contour more accurate.
In some embodiments, in step S101, acquiring the contour of the binary image of the body in the image includes:
selecting a corresponding chrominance component for segmentation according to the color characteristics of the body, to determine the binary image of the body in the image;
extracting the contour of the binary image of the body from the binary image of the body.
For example, for gestures, Otsu segmentation may be performed on the Cr channel of the image to determine the gesture binary image, and the contour of the gesture binary image is then extracted. Since the Cr channel is well suited to representing human skin color, performing Otsu segmentation on the Cr channel to determine the gesture binary image and then extracting its contour gives high accuracy.
Usually, video frames are in RGB format. To segment out the hand region, it is preferable to use a reliable skin-color model suitable for different skin colors and different lighting conditions, and the commonly used RGB representation is not suitable for a skin model.
In some embodiments, the YCrCb space is selected as the mapping space for skin-color distribution statistics. Its advantages are that it is less affected by luminance changes and that its two chrominance dimensions are independently distributed, which constrains the skin-color region well. In this case, before the corresponding chrominance component is selected for segmentation according to the color characteristics of the body, the image also needs a color conversion, which may use the standard conversion formulas:
Y = 0.299 R + 0.587 G + 0.114 B
Cr = 0.5 R − 0.4187 G − 0.0813 B + 128
Cb = −0.1687 R − 0.3313 G + 0.5 B + 128
Splitting the color channels of the YCrCb image yields the images of the three channels Y, Cr, and Cb. "Y" denotes luminance (luma), i.e., the grayscale value; Cr reflects the difference between the red part of the RGB input signal and the luminance of the RGB signal; and Cb reflects the difference between the blue part of the RGB input signal and the luminance of the RGB signal. Since the texture of human skin color is closer to red, the Cr component is selected for segmentation: the Otsu algorithm (the maximum between-class variance method) is applied to the Cr channel to obtain a binary image, in which the gesture, as foreground, is shown in white and the background in black. For the image shown in Fig. 2, the binary image shown in Fig. 3 is obtained. The Otsu algorithm is used in the embodiments of the present disclosure because it binarizes images well.
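The segmentation described above (color conversion followed by Otsu thresholding of the Cr channel) can be sketched as follows. This is an illustrative NumPy implementation, not code from the patent; the function names, the BT.601 full-range coefficients, and the synthetic test image are assumptions made for demonstration:

```python
import numpy as np

def rgb_to_ycrcb(img):
    """Split an HxWx3 uint8 RGB image into Y, Cr, Cb planes (BT.601, full range)."""
    r = img[..., 0].astype(np.float64)
    g = img[..., 1].astype(np.float64)
    b = img[..., 2].astype(np.float64)
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cr = 0.5 * r - 0.4187 * g - 0.0813 * b + 128.0
    cb = -0.1687 * r - 0.3313 * g + 0.5 * b + 128.0
    return y, cr, cb

def otsu_threshold(channel):
    """Otsu (maximum between-class variance) threshold of a uint8 channel."""
    hist, _ = np.histogram(channel.ravel(), bins=256, range=(0, 256))
    total = channel.size
    sum_all = float(np.dot(np.arange(256), hist))
    w0 = 0.0        # pixel count of the low class
    sum0 = 0.0      # intensity sum of the low class
    best_t, best_var = 0, -1.0
    for t in range(256):
        w0 += hist[t]
        if w0 == 0 or w0 == total:
            continue
        sum0 += t * hist[t]
        m0 = sum0 / w0                        # mean of class "<= t"
        m1 = (sum_all - sum0) / (total - w0)  # mean of class "> t"
        var = w0 * (total - w0) * (m0 - m1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t

def segment_hand(img):
    """Binary skin mask: Otsu threshold applied to the Cr chrominance channel."""
    _, cr, _ = rgb_to_ycrcb(img)
    cr8 = np.clip(np.round(cr), 0, 255).astype(np.uint8)
    t = otsu_threshold(cr8)
    return np.where(cr8 > t, 255, 0).astype(np.uint8)  # foreground white, background black
```

On a synthetic frame with a reddish (skin-like) patch on a blue background, the reddish region comes out white (255) and the background black (0), mirroring the foreground/background convention of Fig. 3.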
Contour search is then performed on the resulting binary image, and the largest contour is selected as the gesture contour according to the palm position, as shown in Fig. 4.
In some embodiments, the body is a hand. To improve identification accuracy, to avoid identifying images in which no hand exists, and to reduce the system workload, before the contour of the binary image of the body is acquired, the method further includes:
performing noise reduction on the image;
confirming, through palm detection, that a hand exists in the image.
Similarly, when the body is an arm, a lower limb, or a torso, corresponding detection may likewise be used to confirm in advance that the body part exists in the image before further identification, thereby reducing computation.
A noise-reduced image, with the noise filtered out, is more conducive to identifying the body and improves identification accuracy. For noise reduction, a suitable method may be chosen according to the type of noise in the image. For example, the salt-and-pepper noise common in images can be removed by median filtering:
the salt-and-pepper noise is removed with a median filter, given by the formula
f(x, y) = med{I(x − k, y − l), (k, l) ∈ W}
where, for example, f(x, y) denotes the processed image and W is generally a 3×3 or 5×5 two-dimensional template.
Palm detection is then used to judge whether a palm exists in the image f(x, y). If a palm exists, it is determined that a hand exists in the image and, for example, the contour of the gesture binary image is acquired; otherwise it is determined that no hand exists in the image, and the next frame is judged.
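The median-filter formula above can be implemented directly. A minimal NumPy sketch follows (illustrative, not code from the patent; the edge-replication padding at the image border is an assumed boundary handling):

```python
import numpy as np

def median_filter(img, k=3):
    """f(x, y) = med{ I(x - k, y - l), (k, l) in W }: median over a k x k template W."""
    pad = k // 2
    padded = np.pad(img, pad, mode='edge')  # assumed boundary handling
    out = np.empty_like(img)
    h, w = img.shape
    for y in range(h):
        for x in range(w):
            # median of the k x k neighborhood centered at (y, x)
            out[y, x] = np.median(padded[y:y + k, x:x + k])
    return out
```

A single salt pixel in an otherwise constant image is replaced by the neighborhood median, which is exactly the salt-and-pepper removal the text describes.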
For example, determining the skeleton-like line of the body in the image according to directional gradients and the contour of the binary image of the body may include: determining points of the skeleton-like line based on each contour point (x, y) of the contour of the binary image of the body; and determining the skeleton-like line based on the points of the skeleton-like line.
For example, determining points of the skeleton-like line based on each contour point (x, y) of the contour of the binary image of the body includes: determining whether two contour points among all the contour points of the contour of the binary image of the body are both boundary points; when the two contour points are both boundary points, determining the midpoint of the two boundary points; determining whether the midpoint lies within the contour of the binary image of the body; and when the midpoint lies within the contour of the binary image of the body, determining the midpoint to be a point of the skeleton-like line.
For example, determining whether two contour points among all the contour points of the contour of the binary image of the body are both boundary points includes: classifying each contour point (x, y) of the contour by its y value, grouping contour points with the same y value into a sequence seq_y(x_1, x_2, ...), giving:
S(y) = {seq_yi(x_{yi,1}, x_{yi,2}, ...) | yi ∈ (1, ..., h), x_i ∈ (1, ..., w)};
sorting each sequence by the magnitude of the x values, giving:
S'(y) = {seq_yi(x_{yi,1}', x_{yi,2}', ...) | yi ∈ (1, ..., h), x_i' ∈ (1, ..., w)};
and, in the sequence corresponding to the same y value, judging whether two points of the sequence are boundary points based on the directional gradients of the two points.
For example, judging, in the sequence corresponding to the same y value, whether two points are boundary points based on their directional gradients includes: taking the first two points x_{yi,1}' and x_{yi,2}' and judging, according to the directional gradients of the two points, whether both are boundary points; if neither is a boundary point, removing both from the sequence and taking new points; if one is not a boundary point, removing that point from the sequence and taking a new point; and repeating until both points are judged to be boundary points.
For example, determining whether two contour points are both boundary points further includes: in the sequence corresponding to the same y value, deleting the sequence if it contains fewer than two points.
For example, when the two contour points are both boundary points, determining the midpoint of the two boundary points includes: determining the midpoint of the two points based on the formula x_{yi,med1} = (x_{yi,1}' + x_{yi,2}')/2. Determining whether the midpoint lies within the contour of the binary image of the body includes: if the midpoint lies within the contour of the binary image of the body, recording this point into the new sequence lines_seq and deleting x_{yi,1}' and x_{yi,2}'; if the midpoint does not lie within the contour, taking new points.
Based on the above, in some embodiments, determining the skeleton-like line of the body in the image according to directional gradients and the contour of the binary image of the body includes:
classifying each contour point (x, y) of the contour of the binary image of the body by its y value, grouping contour points with the same y value into a sequence seq_y(x_1, x_2, ...), giving:
S(y) = {seq_yi(x_{yi,1}, x_{yi,2}, ...) | yi ∈ (1, ..., h), x_i ∈ (1, ..., w)};
sorting each sequence by the magnitude of the x values, giving:
S'(y) = {seq_yi(x_{yi,1}', x_{yi,2}', ...) | yi ∈ (1, ..., h), x_i' ∈ (1, ..., w)};
in the sequence corresponding to any value y_i, deleting the sequence if it contains fewer than two points; otherwise, taking the first two points x_{yi,1}', x_{yi,2}' and judging, according to the directional gradients of the two points, whether both are boundary points; if neither is a boundary point, removing both from the sequence and taking new points; if one is not a boundary point, removing that point and taking a new point; repeating until both are judged to be boundary points; then determining the midpoint of the two points, x_{yi,med1} = (x_{yi,1}' + x_{yi,2}')/2; if the midpoint lies within the contour of the binary image of the body, recording this point into the new sequence lines_seq and deleting x_{yi,1}' and x_{yi,2}'; if the midpoint does not lie within the contour, taking new points;
if lines_seq contains elements, the sequence corresponding to this y_i contains points of skeleton-like lines and this y_i is kept; if it contains no elements, this y_i is deleted;
traversing all values of y_i then yields all points of the skeleton-like lines:
ske(y) = {lines_seq(x_{yi,1}, x_{yi,2}, ...) | yi ∈ (1, ..., h), x_i ∈ (1, ..., w)}.
For the image shown in Fig. 2, the skeleton-like lines are shown in Fig. 5.
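The row-wise midpoint construction of ske(y) described above can be illustrated with a simplified sketch. It is not the patent's implementation: it operates on a filled binary mask rather than an explicit contour-point list, and it replaces the directional-gradient boundary test with the left/right ends of each horizontal run, an assumption made for brevity:

```python
import numpy as np

def skeleton_points(mask):
    """For every image row, pair the boundary points of each horizontal run of the
    shape and keep the midpoint of each pair when it falls inside the shape.
    Returns a dict y -> list of midline x coordinates (the lines_seq of the text)."""
    ske = {}
    for y in range(mask.shape[0]):
        xs = np.flatnonzero(mask[y])          # occupied columns in this row
        if xs.size < 2:
            continue                          # rows with fewer than two points are dropped
        # split into contiguous runs; each run's endpoints play the boundary-point role
        runs = np.split(xs, np.flatnonzero(np.diff(xs) > 1) + 1)
        mids = []
        for run in runs:
            x1, x2 = run[0], run[-1]          # the two boundary points of this run
            mid = (x1 + x2) // 2              # x_{yi,med1} = (x_{yi,1}' + x_{yi,2}')/2
            if mask[y, mid]:                  # keep only midpoints inside the contour
                mids.append(int(mid))
        if mids:
            ske[y] = mids
    return ske
```

For a vertical bar of width 5, every row contributes a single midline point at the bar's center column, which is the single-line-per-finger behavior the text aims at.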
For example, before the body representation information is identified in step S102, points of the skeleton-like lines that do not meet preset requirements may be removed, so that such points do not interfere and cause misjudgment, making the identification of the body representation information more accurate.
For example, in step S102, identifying the body representation information according to the skeleton-like line includes: removing the points of the skeleton-like lines that do not meet preset requirements, to obtain the body represented by skeleton-like lines; and identifying the body representation information according to the body represented by skeleton-like lines in each image frame.
In some embodiments, removing the points of the skeleton-like lines that do not meet preset requirements to obtain the body represented by skeleton-like lines includes: determining the number of pixels in each skeleton-like line; and removing the skeleton-like lines whose pixel count is below a set threshold, to obtain the body represented by skeleton-like lines.
For example, determining the number of pixels in each skeleton-like line may be implemented as follows. The set of all skeleton-like lines is represented as:
skeLines(L) = {L_1: {(x_{y1,1}, y1)}, L_2: {(x_{y1,2}, y1)}, ...},
where L_1, L_2, ... each denote one skeleton-like line, and (x_{y1,1}, y1), (x_{y1,2}, y1), ... denote the pixels composing that skeleton-like line;
for each sequence, the number N of skeleton-like lines in skeLines(L) is counted, the number P of pixels of each skeleton-like line is counted, and the pixel counts p_1, p_2, ..., p_N of the skeleton-like lines are determined, the last pixel of each skeleton-like line being denoted L_1(p_1), L_2(p_2), ..., L_N(p_N).
In some embodiments, removing the points of the skeleton-like lines that do not meet preset requirements to obtain the body represented by skeleton-like lines further includes: taking the first sequence of ske(y), using its points as starting points of skeleton-like lines, the number of starting points being equal to the number of elements in the sequence, and traversing all sequences starting from the second sequence of ske(y).
In some embodiments, traversing a sequence may include: starting from the first element of the current sequence, traversing the current sequence and finding, among L_1(p_1), L_2(p_2), ..., L_N(p_N), the point (x*, y*) nearest to the current element, the corresponding skeleton-like line being denoted L*(p*); when the distance between (x*, y*) and L*(p*) is smaller than a set value, appending (x*, y*) to the end of the skeleton-like line L*; otherwise, taking (x*, y*) as the starting point of a new skeleton-like line and adding the new skeleton-like line to skeLines(L).
In some embodiments, removing the points of the skeleton-like lines that do not meet preset requirements to obtain the body represented by skeleton-like lines includes:
taking the first sequence of ske(y) and using its points as starting points of skeleton-like lines, the number of starting points being equal to the number of elements in the sequence; each skeleton-like line being a set of pixels, the set of all skeleton-like lines is represented as:
skeLines(L) = {L_1: {(x_{y1,1}, y1)}, L_2: {(x_{y1,2}, y1)}, ...},
where L_1, L_2, ... each denote one skeleton-like line, and (x_{y1,1}, y1), (x_{y1,2}, y1), ... denote the pixels composing that skeleton-like line;
traversing all sequences starting from the second sequence of ske(y); for each sequence, counting the number N of skeleton-like lines in skeLines(L), counting the number P of pixels of each skeleton-like line, and determining the pixel counts p_1, p_2, ..., p_N of the skeleton-like lines, the last pixel of each skeleton-like line being denoted L_1(p_1), L_2(p_2), ..., L_N(p_N);
starting from the first element of the current sequence, traversing the current sequence and finding, among L_1(p_1), L_2(p_2), ..., L_N(p_N), the point (x*, y*) nearest to the current element, the corresponding skeleton-like line being denoted L*(p*); when the distance between (x*, y*) and L*(p*) is smaller than a set value, appending (x*, y*) to the end of the skeleton-like line L*; otherwise, taking (x*, y*) as the starting point of a new skeleton-like line and adding the new skeleton-like line to skeLines(L);
removing from skeLines(L) the skeleton-like lines whose pixel count is below the set threshold, to obtain the body represented by skeleton-like lines:
[formula image in the original publication]
with x_{ik} ∈ (1, ..., w), j ∈ (1, ..., h).
For the image shown in Fig. 2, the gesture diagram is shown in Fig. 6.
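The linking-and-pruning step above (appending each new point to the nearest line ending, starting a new line when the nearest ending is too far away, then discarding short lines) can be sketched as follows. This is an illustrative sketch, not the patent's code; `max_gap` and `min_len` stand in for the unspecified "set value" and "set threshold":

```python
import math

def link_skeleton_lines(ske, max_gap=1.5, min_len=3):
    """ske: dict y -> list of midline x coordinates, as produced row by row.
    Returns the pruned skeLines(L): a list of lines, each a list of (x, y) pixels."""
    rows = sorted(ske)
    if not rows:
        return []
    # starting points: every point of the first sequence starts its own line
    lines = [[(x, rows[0])] for x in ske[rows[0]]]
    for y in rows[1:]:
        for x in ske[y]:
            # nearest last pixel L_1(p_1) ... L_N(p_N) to the current element
            ends = [line[-1] for line in lines]
            dists = [math.hypot(x - ex, y - ey) for ex, ey in ends]
            i = min(range(len(dists)), key=dists.__getitem__)
            if dists[i] < max_gap:
                lines[i].append((x, y))   # append to the end of line L*
            else:
                lines.append([(x, y)])    # start a new skeleton-like line
    # prune lines whose pixel count is below the threshold
    return [line for line in lines if len(line) >= min_len]
```

A stray single-row point far from the main midline is started as its own line and then pruned away, which is exactly the de-noising effect the text attributes to this step.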
It can be seen that the embodiments of the present disclosure identify the skeleton-like lines in the image and can represent various body parts clearly through skeleton-like lines; the features are richer and the recognition rate is greatly improved, providing a reliable basis for further identifying body representation information.
It should be noted that although the operations of the disclosed method are described in a particular order in the drawings, this does not require or imply that the operations must be performed in that particular order, or that all of the illustrated operations must be performed, to achieve the desired results. On the contrary, the steps depicted in the flowchart may change their order of execution. Additionally or alternatively, certain steps may be omitted, multiple steps may be combined into one step, and/or one step may be decomposed into multiple steps.
An embodiment of the present disclosure further provides an apparatus for identifying body representation information in an image, corresponding to the identification method of the foregoing embodiments. For brevity of the description, only a brief account is given below; see the foregoing embodiments for the specific implementation. As shown in Fig. 7, the identification apparatus includes:
a determination unit 701, configured to determine the skeleton-like line of a body in an image;
an identification unit 702, configured to identify body representation information according to the skeleton-like line.
The determination unit 701 and the identification unit 702 are functional entities that may be implemented in software, hardware, or firmware, for example by a processor executing program code, or by a programmable logic circuit designed to perform the corresponding function.
For example, the body representation information includes a posture state of one of a torso, a limb, a head and neck, a hand, and a foot, or a posture state of any combination thereof.
For example, the determination unit 701 is specifically configured to:
determine a midline of the body in the image, and determine the skeleton-like line of the body according to the midline.
For example, the determination unit 701 is specifically configured to:
acquire the contour of the binary image of the body in the image;
determine the skeleton-like line of the body in the image according to directional gradients and the contour of the binary image of the body.
For example, the determination unit 701 acquiring the contour of the binary image of the body in the image includes:
selecting a corresponding chrominance component for segmentation according to the color characteristics of the body, to determine the binary image of the body in the image;
extracting the contour of the binary image of the body from the binary image of the body.
For example, when the body is specifically a hand, the determination unit 701 is further configured to:
perform noise reduction on the image before acquiring the contour of the binary image of the body in the image; and
confirm, through palm detection, that a hand exists in the image.
For example, the determination unit 701 determining the skeleton-like line of the body in the image according to directional gradients and the contour of the binary image of the body includes:
classifying each contour point (x, y) of the contour of the binary image of the body by its y value, grouping contour points with the same y value into a sequence seq_y(x_1, x_2, ...), giving:
S(y) = {seq_yi(x_{yi,1}, x_{yi,2}, ...) | yi ∈ (1, ..., h), x_i ∈ (1, ..., w)};
sorting each sequence by the magnitude of the x values, giving:
S'(y) = {seq_yi(x_{yi,1}', x_{yi,2}', ...) | yi ∈ (1, ..., h), x_i' ∈ (1, ..., w)};
in the sequence corresponding to any value y_i, deleting the sequence if it contains fewer than two points; otherwise, taking the first two points x_{yi,1}', x_{yi,2}' and judging, according to the directional gradients of the two points, whether both are boundary points; if neither is a boundary point, removing both from the sequence and taking new points; if one is not a boundary point, removing that point and taking a new point; repeating until both are judged to be boundary points; then determining the midpoint of the two points, x_{yi,med1} = (x_{yi,1}' + x_{yi,2}')/2; if the midpoint lies within the contour of the binary image of the body, recording this point into the new sequence lines_seq and deleting x_{yi,1}' and x_{yi,2}'; if the midpoint does not lie within the contour, taking new points;
if lines_seq contains elements, the sequence corresponding to this y_i contains points of skeleton-like lines and this y_i is kept; if it contains no elements, this y_i is deleted;
traversing all values of y_i then yields all points of the skeleton-like lines:
ske(y) = {lines_seq(x_{yi,1}, x_{yi,2}, ...) | yi ∈ (1, ..., h), x_i ∈ (1, ..., w)}.
For example, the identification unit 702 is specifically configured to:
remove the points of the skeleton-like lines that do not meet preset requirements, to obtain the body represented by skeleton-like lines;
identify the body representation information according to the body represented by skeleton-like lines in each image frame.
For example, the identification unit 702 removing the points of the skeleton-like lines that do not meet preset requirements to obtain the body represented by skeleton-like lines includes:
taking the first sequence of ske(y) and using its points as starting points of skeleton-like lines, the number of starting points being equal to the number of elements in the sequence, the set of all skeleton-like lines being represented as:
skeLines(L) = {L_1: {(x_{y1,1}, y1)}, L_2: {(x_{y1,2}, y1)}, ...},
where L_1, L_2, ... each denote one skeleton-like line, and (x_{y1,1}, y1), (x_{y1,2}, y1), ... denote the pixels composing that skeleton-like line;
traversing all sequences starting from the second sequence of ske(y); for each sequence, counting the number N of skeleton-like lines in skeLines(L), counting the number P of pixels of each skeleton-like line, and determining the pixel counts p_1, p_2, ..., p_N of the skeleton-like lines, the last pixel of each skeleton-like line being denoted L_1(p_1), L_2(p_2), ..., L_N(p_N);
starting from the first element of the current sequence, traversing the current sequence and finding, among L_1(p_1), L_2(p_2), ..., L_N(p_N), the point (x*, y*) nearest to the current element, the corresponding skeleton-like line being denoted L*(p*); when the distance between (x*, y*) and L*(p*) is smaller than a set value, appending (x*, y*) to the end of the skeleton-like line L*; otherwise, taking (x*, y*) as the starting point of a new skeleton-like line and adding the new skeleton-like line to skeLines(L);
removing from skeLines(L) the skeleton-like lines whose pixel count is below the set threshold, to obtain the body represented by skeleton-like lines:
[formula image in the original publication]
where, for example, x_{ik} ∈ (1, ..., w), j ∈ (1, ..., h).
It should be understood that the units or modules described in the apparatus correspond to the respective steps of the method described with reference to Fig. 1. Thus, the operations and features described above for the method also apply to the apparatus and the units contained therein and are not repeated here. The apparatus may be implemented in advance in a browser or other security application of an electronic device, or may be loaded into the browser or security application of the electronic device by downloading or other means. The corresponding units in the apparatus may cooperate with units in the electronic device to implement the solutions of the embodiments of the present disclosure.
In addition, an embodiment of the present disclosure further provides a device for identifying body representation information in an image, comprising a processor and a memory; the memory contains instructions executable by the processor, and the processor, when executing the instructions, performs the method of any one of the foregoing embodiments.
Referring now to Fig. 8, which shows a schematic structural diagram of a computer system suitable for implementing a terminal device or server of an embodiment of the present disclosure.
As shown in Fig. 8, the computer system includes a processor 801, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 802 or a program loaded from a storage section 808 into a random access memory (RAM) 803. The RAM 803 also stores the various programs and data required for system operation. The processor 801, the ROM 802, and the RAM 803 are connected to one another via a bus 804. An input/output (I/O) interface 805 is also connected to the bus 804.
The following components are connected to the I/O interface 805: an input section 806 including a keyboard, a mouse, and the like; an output section 807 including a cathode ray tube (CRT), a liquid crystal display (LCD), a speaker, and the like; a storage section 808 including a hard disk and the like; and a communication section 809 including a network interface card such as a LAN card or a modem. The communication section 809 performs communication processing via a network such as the Internet. A drive 810 is also connected to the I/O interface 805 as needed. A removable medium 811, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 810 as needed, so that a computer program read from it can be installed into the storage section 808 as needed.
In particular, according to an embodiment of the present disclosure, the process described above with reference to Fig. 1 may be implemented as a computer software program. For example, an embodiment of the present disclosure includes a computer program product comprising a computer program tangibly embodied on a machine-readable medium, the computer program containing program code for performing the method of Fig. 1. In such an embodiment, the computer program may be downloaded and installed from a network via the communication section 809, and/or installed from the removable medium 811.
The flowcharts and block diagrams in the drawings illustrate the possible architectures, functions, and operations of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowcharts or block diagrams may represent a module, a program segment, or a part of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or sometimes in the reverse order, depending on the functions involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified function or operation, or by a combination of dedicated hardware and computer instructions.
The units or modules described in the embodiments of the present disclosure may be implemented in software or in hardware. The described units or modules may also be provided in a processor; for example, a processor may be described as including an XX unit, a YY unit, and a ZZ unit. The names of these units or modules do not, in some cases, constitute a limitation on the units or modules themselves; for example, the XX unit may also be described as "a unit for XX".
As another aspect, the present disclosure further provides a computer readable storage medium that implements the methods of the foregoing embodiments when the instructions in the storage medium are executed. The computer readable storage medium may be the computer readable storage medium included in the apparatus of the above embodiments, or a computer readable storage medium that exists separately and is not assembled into a device. The computer readable storage medium stores one or more programs that are used by one or more processors to perform the methods described in the present disclosure.
In the embodiments of the present disclosure, the processor may be a logic computing device having data processing capability and/or program execution capability, such as a central processing unit (CPU), a field programmable gate array (FPGA), a microcontroller unit (MCU), a digital signal processor (DSP), or an application-specific integrated circuit (ASIC).
The above description is merely preferred embodiments of the present disclosure and an explanation of the technical principles applied. Those skilled in the art should understand that the scope of the invention involved in the present disclosure is not limited to technical solutions formed by the specific combination of the above technical features, and should also cover other technical solutions formed by any combination of the above technical features or their equivalent features without departing from the inventive concept, for example, technical solutions formed by replacing the above features with technical features of similar functions disclosed in (but not limited to) the present disclosure.
This application claims priority to Chinese Patent Application No. 201810283309.4 filed on April 2, 2018, the disclosure of which is incorporated herein by reference in its entirety as part of the present disclosure.

Claims (22)

  1. A method for identifying body representation information in an image, comprising:
    determining a skeleton-like line of a body in the image;
    identifying body representation information according to the skeleton-like line.
  2. The method according to claim 1, wherein the body representation information comprises: a posture state of one of a torso, a limb, a head and neck, a hand, and a foot, or a posture state of a combination thereof.
  3. The method according to claim 1 or 2, wherein determining the skeleton-like line of the body in the image comprises:
    determining a midline of the body in the image, and determining the skeleton-like line of the body according to the midline.
  4. The method according to any one of claims 1-2, wherein determining the skeleton-like line of the body in the image comprises:
    acquiring a contour of a binary image of the body in the image;
    determining the skeleton-like line of the body in the image according to directional gradients and the contour of the binary image of the body.
  5. The method according to claim 4, wherein acquiring the contour of the binary image of the body in the image comprises:
    selecting a corresponding chrominance component for segmentation according to color characteristics of the body, to determine the binary image of the body in the image;
    extracting the contour of the binary image of the body from the binary image of the body.
  6. The method according to claim 5, wherein, before selecting the corresponding chrominance component for segmentation according to the color characteristics of the body, the image is converted from an RGB color representation to a YCrCb color representation.
  7. The method according to claim 5 or 6, wherein the body comprises a hand, and acquiring the contour of the binary image of the body in the image further comprises:
    performing noise reduction on the image;
    confirming, through palm detection, that a hand exists in the image.
  8. The method according to any one of claims 4-7, wherein determining the skeleton-like line of the body in the image according to the directional gradients and the contour of the binary image of the body comprises:
    determining points of the skeleton-like line based on each contour point (x, y) of the contour of the binary image of the body;
    determining the skeleton-like line based on the points of the skeleton-like line.
  9. The method according to claim 8, wherein determining points of the skeleton-like line based on each contour point (x, y) of the contour of the binary image of the body comprises:
    determining whether two contour points among all the contour points of the contour of the binary image of the body are both boundary points;
    in a case where the two contour points are both boundary points, determining a midpoint of the two boundary points;
    determining whether the midpoint lies within the contour of the binary image of the body;
    in a case where the midpoint lies within the contour of the binary image of the body, determining the midpoint to be a point of the skeleton-like line.
  10. The method according to claim 9, wherein determining whether two contour points among all the contour points of the contour of the binary image of the body are both boundary points comprises:
    classifying each contour point (x, y) of the contour of the binary image of the body by its y value, grouping contour points with the same y value into a sequence seq_y(x_1, x_2, ...), to give:
    S(y) = {seq_yi(x_{yi,1}, x_{yi,2}, ...) | yi ∈ (1, ..., h), x_i ∈ (1, ..., w)};
    sorting each sequence by the magnitude of the x values, to give:
    S'(y) = {seq_yi(x_{yi,1}', x_{yi,2}', ...) | yi ∈ (1, ..., h), x_i' ∈ (1, ..., w)};
    in the sequence corresponding to the same y value, judging whether two points of the sequence are boundary points based on directional gradients of the two points.
  11. The method according to claim 10, wherein judging, in the sequence corresponding to the same y value, whether two points of the sequence are boundary points based on the directional gradients of the two points comprises:
    in the sequence corresponding to the same y value, taking the first two points x_{yi,1}', x_{yi,2}' and judging, according to the directional gradients of the two points x_{yi,1}' and x_{yi,2}', whether the two points are boundary points;
    if neither point is a boundary point, removing the two points from the sequence and taking new points;
    if one point is not a boundary point, removing that point from the sequence and taking a new point;
    until both points are judged to be boundary points.
  12. The method according to claim 10 or 11, wherein determining whether two contour points among all the contour points of the contour of the binary image of the body are both boundary points further comprises:
    among the sequences corresponding to the same y value, deleting any sequence containing fewer than two points.
  13. The method according to any one of claims 9-12, wherein, when the two contour points are both boundary points, determining the midpoint of the two boundary points comprises:
    determining the midpoint of the two points based on the formula x_{yi,med1} = (x_{yi,1}' + x_{yi,2}')/2;
    and determining whether the midpoint lies within the contour of the binary image of the body comprises:
    if the midpoint lies within the contour of the binary image of the body, recording this point into a new sequence lines_seq and deleting x_{yi,1}' and x_{yi,2}'; if the midpoint does not lie within the contour of the binary image of the body, taking new points.
  14. The method according to any one of claims 8-13, wherein identifying the body representation information according to the skeleton-like line comprises:
    removing points of the skeleton-like lines that do not meet preset requirements, to obtain the body represented by skeleton-like lines;
    identifying the body representation information according to the body represented by skeleton-like lines in the image.
  15. The method according to claim 14, wherein removing the points of the skeleton-like lines that do not meet preset requirements to obtain the body represented by skeleton-like lines comprises:
    determining the number of pixels in each skeleton-like line;
    removing the skeleton-like lines whose pixel count is below a set threshold, to obtain the body represented by skeleton-like lines.
  16. The method according to claim 15, wherein determining the number of pixels in each skeleton-like line comprises: representing the set of all skeleton-like lines as:
    skeLines(L) = {L_1: {(x_{y1,1}, y1)}, L_2: {(x_{y1,2}, y1)}, ...},
    where L_1, L_2, ... each denote one skeleton-like line, and (x_{y1,1}, y1), (x_{y1,2}, y1), ... denote the pixels composing that skeleton-like line;
    representing all points of the skeleton-like lines as:
    ske(y) = {lines_seq(x_{yi,1}, x_{yi,2}, ...) | yi ∈ (1, ..., h), x_i ∈ (1, ..., w)};
    for each sequence in ske(y), counting the number N of skeleton-like lines in skeLines(L), counting the number P of pixels of each skeleton-like line, and determining the pixel counts p_1, p_2, ..., p_N of the skeleton-like lines, the last pixel of each skeleton-like line being denoted L_1(p_1), L_2(p_2), ..., L_N(p_N).
  17. The method according to claim 16, wherein removing the points of the skeleton-like lines that do not meet preset requirements to obtain the body represented by skeleton-like lines further comprises:
    taking the first sequence of ske(y), using its points as starting points of skeleton-like lines, the number of starting points being equal to the number of elements in the sequence, and traversing all sequences starting from the second sequence of ske(y).
  18. The method according to claim 17, wherein traversing a sequence comprises:
    starting from the first element of the current sequence, traversing the current sequence and finding, among L_1(p_1), L_2(p_2), ..., L_N(p_N), the point (x*, y*) nearest to the current element, the corresponding skeleton-like line being denoted L*(p*);
    when the distance between (x*, y*) and L*(p*) is smaller than a set value, appending (x*, y*) to the end of the skeleton-like line L*;
    when the distance between (x*, y*) and L*(p*) is not smaller than the set value, taking (x*, y*) as the starting point of a new skeleton-like line and adding the new skeleton-like line to skeLines(L).
  19. An apparatus for identifying body representation information in an image, comprising:
    a determination unit, configured to determine a skeleton-like line of a body in an image;
    an identification unit, configured to identify body representation information according to the skeleton-like line.
  20. The apparatus according to claim 19, wherein the body representation information comprises: a posture state of one of a torso, a limb, a head and neck, a hand, and a foot, or a posture state of a combination thereof.
  21. A device for identifying body representation information in an image, comprising a processor and a memory, wherein:
    the memory contains instructions executable by the processor, and the processor performs the method according to any one of claims 1-18 when executing the instructions.
  22. A computer readable storage medium, on which computer program instructions are stored, the computer program instructions implementing the method according to any one of claims 1-18 when executed by a processor.
PCT/CN2018/119083 2018-04-02 2018-12-04 Method, apparatus and device for identifying body representation information in an image, and computer readable storage medium WO2019192205A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US16/473,509 US11354925B2 (en) 2018-04-02 2018-12-04 Method, apparatus and device for identifying body representation information in image, and computer readable storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810283309.4 2018-04-02
CN201810283309.4A CN108491820B (zh) 2018-04-02 2018-04-02 Method, apparatus and device for identifying body representation information in an image, and storage medium

Publications (1)

Publication Number Publication Date
WO2019192205A1 true WO2019192205A1 (zh) 2019-10-10

Family

ID=63318047

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/119083 WO2019192205A1 (zh) 2018-04-02 2018-12-04 Method, apparatus and device for identifying body representation information in an image, and computer readable storage medium

Country Status (3)

Country Link
US (1) US11354925B2 (zh)
CN (1) CN108491820B (zh)
WO (1) WO2019192205A1 (zh)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108491820B (zh) 2018-04-02 2022-04-12 京东方科技集团股份有限公司 Method, apparatus and device for identifying body representation information in an image, and storage medium
CN114708519B (zh) * 2022-05-25 2022-09-27 中国科学院精密测量科学与技术创新研究院 Method for identifying elk and extracting morphological contour parameters based on UAV remote sensing

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102122350A (zh) * 2011-02-24 2011-07-13 浙江工业大学 Traffic police gesture recognition method based on skeletonization and template matching
US20120069168A1 (en) * 2010-09-17 2012-03-22 Sony Corporation Gesture recognition system for tv control
CN104077774A (zh) * 2014-06-28 2014-10-01 中国科学院光电技术研究所 Extended target tracking method and apparatus combining skeleton and generalized Hough transform
CN106815855A (zh) * 2015-12-02 2017-06-09 山东科技职业学院 Human motion tracking method based on the combination of generative and discriminative models
CN108491820A (zh) * 2018-04-02 2018-09-04 京东方科技集团股份有限公司 Method, apparatus and device for identifying body representation information in an image, and storage medium

Family Cites Families (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2028605A1 (de) * 2007-08-20 2009-02-25 Delphi Technologies, Inc. Detection method for symmetrical patterns
US7983450B2 (en) * 2009-03-16 2011-07-19 The Boeing Company Method, apparatus and computer program product for recognizing a gesture
US8913809B2 (en) * 2012-06-13 2014-12-16 Microsoft Corporation Monitoring physical body changes via image sensor
CN103208002B (zh) 2013-04-10 2016-04-27 桂林电子科技大学 Gesture recognition control method and system based on hand contour features
CN103559505A (zh) * 2013-11-18 2014-02-05 庄浩洋 A 3D skeleton modeling and hand detection method
US20160360165A1 (en) * 2014-02-06 2016-12-08 The General Hospital Corporation Systems and methods for care monitoring
US9710708B1 (en) * 2014-03-24 2017-07-18 Vecna Technologies, Inc. Method and apparatus for autonomously recognizing at least one object in an image
CN104680127A (zh) * 2014-12-18 2015-06-03 闻泰通讯股份有限公司 Gesture recognition method and system
KR101706864B1 (ko) * 2015-10-14 2017-02-17 세종대학교산학협력단 Real-time finger and hand gesture recognition using a motion-sensing input device
CN106056053B (zh) * 2016-05-23 2019-04-23 西安电子科技大学 Human posture recognition method based on skeleton feature point extraction
US10147184B2 (en) * 2016-12-30 2018-12-04 Cerner Innovation, Inc. Seizure detection
CN107330354B (zh) * 2017-03-20 2020-12-08 长沙理工大学 A natural gesture recognition method
DE102017216000A1 (de) * 2017-09-11 2019-03-14 Conti Temic Microelectronic Gmbh Gesture control for communication with an autonomous vehicle based on a simple 2D camera
US10607079B2 (en) * 2017-09-26 2020-03-31 Toyota Research Institute, Inc. Systems and methods for generating three dimensional skeleton representations
DE102018212655A1 (de) * 2018-07-30 2020-01-30 Conti Temic Microelectronic Gmbh Detection of the movement intention of a pedestrian from camera images

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120069168A1 (en) * 2010-09-17 2012-03-22 Sony Corporation Gesture recognition system for tv control
CN102122350A (zh) * 2011-02-24 2011-07-13 浙江工业大学 Traffic police gesture recognition method based on skeletonization and template matching
CN104077774A (zh) * 2014-06-28 2014-10-01 中国科学院光电技术研究所 Extended target tracking method and apparatus combining skeleton and generalized Hough transform
CN106815855A (zh) * 2015-12-02 2017-06-09 山东科技职业学院 Human motion tracking method based on the combination of generative and discriminative models
CN108491820A (zh) * 2018-04-02 2018-09-04 京东方科技集团股份有限公司 Method, apparatus and device for identifying body representation information in an image, and storage medium

Also Published As

Publication number Publication date
US20210365675A1 (en) 2021-11-25
CN108491820B (zh) 2022-04-12
CN108491820A (zh) 2018-09-04
US11354925B2 (en) 2022-06-07

Similar Documents

Publication Publication Date Title
CN108765278B (zh) Image processing method, mobile terminal, and computer readable storage medium
CN110751655B (zh) Automatic image matting method based on semantic segmentation and saliency analysis
US10372226B2 (en) Visual language for human computer interfaces
US7627146B2 (en) Method and apparatus for effecting automatic red eye reduction
JP4574249B2 (ja) Image processing apparatus and method, program, and imaging apparatus
WO2020206850A1 (zh) Image annotation method and apparatus based on high-dimensional images
CN109961016B (zh) Accurate multi-gesture segmentation method for smart home scenarios
WO2020038312A1 (zh) Multi-channel tongue edge detection apparatus and method, and storage medium
WO2019015477A1 (zh) Image correction method, computer readable storage medium, and computer device
US7460705B2 (en) Head-top detecting method, head-top detecting system and a head-top detecting program for a human face
JP2007272435A (ja) Facial feature extraction apparatus and facial feature extraction method
CN111080670A (zh) Image extraction method, apparatus, device, and storage medium
CN112712054B (zh) Facial wrinkle detection method
CN110335216A (zh) Image processing method, image processing apparatus, terminal device, and readable storage medium
WO2019192205A1 (zh) Method, apparatus and device for identifying body representation information in an image, and computer readable storage medium
WO2022160586A1 (zh) Depth detection method and apparatus, computer device, and storage medium
WO2024078399A1 (zh) Migration method and apparatus
CN110473176B (zh) Image processing method and apparatus, fundus image processing method, and electronic device
Huang et al. M2-Net: multi-stages specular highlight detection and removal in multi-scenes
CN110648336A (zh) Tongue body and tongue coating segmentation method and apparatus
JP6785181B2 (ja) Object recognition apparatus, object recognition system, and object recognition method
Youlian et al. Face detection method using template feature and skin color feature in rgb color space
JP2004246424A (ja) Skin color region extraction method
JP6343998B2 (ja) Image processing apparatus, image processing method, and program
US20060010582A1 (en) Chin detecting method, chin detecting system and chin detecting program for a chin of a human face

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18913584

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18913584

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 08.04.2021)

122 Ep: pct application non-entry in european phase

Ref document number: 18913584

Country of ref document: EP

Kind code of ref document: A1