CN107704819A - Action recognition method, system and terminal device - Google Patents

Action recognition method, system and terminal device

Info

Publication number
CN107704819A
CN107704819A (application CN201710901427.2A); granted as CN107704819B
Authority
CN
China
Prior art keywords
depth
pixel
sub
cubical
cube
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710901427.2A
Other languages
Chinese (zh)
Other versions
CN107704819B (en)
Inventor
李懿
程俊
姬晓鹏
方璡
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Shenzhen Institute of Advanced Technology of CAS
Shenzhen Tencent Computer Systems Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Shenzhen Institute of Advanced Technology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd and Shenzhen Institute of Advanced Technology of CAS
Priority to CN201710901427.2A
Publication of CN107704819A
Application granted
Publication of CN107704819B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 Movements or behaviour, e.g. gesture recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/46 Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462 Salient features, e.g. scale invariant feature transforms [SIFT]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/46 Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/467 Encoded features or binary features, e.g. local binary patterns [LBP]

Abstract

The present invention, applicable to the field of human-computer interaction technology, provides an action recognition method, system and terminal device. A depth image sequence of a target object is divided along the time dimension and the spatial dimension to obtain a plurality of sub-depth cubes; a time motion response map is obtained for each sub-depth cube; preset image features are extracted from the time motion response maps; all the preset image features are then concatenated to obtain a feature descriptor of the depth image sequence, so that the action type of the target object can be identified from the feature descriptor. The method requires no pre-segmentation of the target object, and has the notable advantages of a fast algorithm and high recognition accuracy and efficiency.

Description

Action recognition method, system and terminal device
Technical field
The invention belongs to the field of human-computer interaction technology, and more particularly relates to an action recognition method, system and terminal device.
Background art
With the continuous development of computer vision technology, action recognition has been widely applied in numerous fields such as human-computer interaction, intelligent surveillance and virtual reality, and has very important research value. Because a depth camera can perceive the depth (distance) information of a target object within its imaging range, it is widely used in the three-dimensional reconstruction of target objects, in motion description and in human action recognition.
However, existing action recognition methods based on depth images usually require pre-processing operations such as detecting and segmenting the target object, in order to extract the region of interest from the depth image sequence, reduce the interference caused by complex backgrounds and thereby improve the accuracy of subsequent action recognition. This, however, also increases the computational complexity of the algorithm, and the recognition accuracy becomes overly dependent on the accuracy of the pre-processing. In addition, the features extracted by existing depth-image-based action recognition methods often have high dimensionality, so a substantial amount of time is spent on feature detection, which reduces recognition efficiency.
Summary of the invention
In view of this, embodiments of the present invention provide an action recognition method, system and terminal device, to solve the problems in the prior art that action recognition methods based on depth images are computationally complex, spend a long time on feature detection, and have low recognition efficiency.
A first aspect of the embodiments of the present invention provides an action recognition method, which includes:
dividing a depth image sequence of a target object along the time dimension according to a preset time step, to obtain a depth cube in each time domain;
dividing the depth cube along the spatial dimension according to a preset planar grid, to obtain a plurality of sub-depth cubes of identical dimensions, the number of image frames of each sub-depth cube being the same as the number of image frames of the depth cube;
obtaining a time motion response map of each sub-depth cube;
extracting preset image features from the time motion response maps;
concatenating the preset image features of all the sub-depth cubes corresponding to the depth cube to obtain a feature vector of the depth cube, and concatenating the feature vectors of all the depth cubes corresponding to the depth image sequence to obtain a feature descriptor of the depth image sequence;
classifying the feature descriptor by a linear support vector machine, to identify the action type of the target object.
In one embodiment, obtaining the time motion response map of a sub-depth cube includes:
obtaining the pixel values of the non-noise pixels in the sub-depth cube;
according to the pixel values, obtaining the pixels among the non-noise pixels that characterize target node motion, to obtain the time motion response map of the sub-depth cube.
In one embodiment, obtaining the pixel values of the non-noise pixels in the sub-depth cube includes:
performing zero-mean processing on the pixel signal of each pixel in the sub-depth cube;
applying a sign-function transform to the zero-mean pixel signals, to obtain a sign-function pixel signal for each pixel in the sub-depth cube;
performing a convolution operation on the sign-function pixel signals, to obtain the pixel values of the non-noise pixels in the sub-depth cube.
In one embodiment, obtaining, according to the pixel values, the pixels among the non-noise pixels that characterize target node motion, to obtain the time motion response map of the sub-depth cube, includes:
screening out, according to the temporal behavior of the pixel values, the pixels among the non-noise pixels that characterize target node motion;
performing visualization processing on the pixels characterizing target node motion, to obtain the time motion response map of the sub-depth cube.
In one embodiment, the preset time step is a fixed time step that is non-overlapping or partially overlapping, adjacent time domains do not overlap or partially overlap, and the image features are histogram of oriented gradients features.
A second aspect of the embodiments of the present invention provides an action recognition system, which includes:
a first division module, configured to divide the depth image sequence of a target object along the time dimension according to a preset time step, to obtain a depth cube in each time domain;
a second division module, configured to divide the depth cube along the spatial dimension according to a preset planar grid, to obtain a plurality of sub-depth cubes of identical dimensions, the number of image frames of each sub-depth cube being the same as the number of image frames of the depth cube;
an acquisition module, configured to obtain a time motion response map of each sub-depth cube;
an extraction module, configured to extract preset image features from the time motion response maps;
a concatenation module, configured to concatenate the preset image features of all the sub-depth cubes corresponding to the depth cube to obtain the feature vector of the depth cube, and to concatenate the feature vectors of all the depth cubes corresponding to the depth image sequence to obtain the feature descriptor of the depth image sequence;
an action recognition module, configured to classify the feature descriptor by a linear support vector machine, to identify the action type of the target object.
In one embodiment, the acquisition module includes:
a pixel value acquisition unit, configured to obtain the pixel values of the non-noise pixels in the sub-depth cube;
a response map acquisition unit, configured to obtain, according to the pixel values, the pixels among the non-noise pixels that characterize target node motion, to obtain the time motion response map of the sub-depth cube.
In one embodiment, the pixel value acquisition unit includes:
an averaging subunit, configured to perform zero-mean processing on the pixel signal of each pixel in the sub-depth cube;
a function transform subunit, configured to apply a sign-function transform to the zero-mean pixel signals, to obtain the sign-function pixel signal of each pixel in the sub-depth cube;
a pixel value acquisition subunit, configured to perform a convolution operation on the sign-function pixel signals, to obtain the pixel values of the non-noise pixels in the sub-depth cube.
A third aspect of the embodiments of the present invention provides a terminal device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the above method.
A fourth aspect of the embodiments of the present invention provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the above method.
In the embodiments of the present invention, the depth image sequence of a target object is divided along the time dimension and the spatial dimension to obtain a plurality of sub-depth cubes; a time motion response map is obtained for each sub-depth cube and preset image features are extracted from the time motion response maps; all the preset image features are then concatenated to obtain a feature descriptor of the depth image sequence, so that the action type of the target object can be identified from the feature descriptor. The method requires no pre-segmentation of the target object, and has the notable advantages of a fast algorithm and high recognition accuracy and efficiency.
Brief description of the drawings
In order to explain the technical solutions in the embodiments of the present invention more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative labor.
Fig. 1 is a schematic flowchart of the action recognition method provided by an embodiment of the present invention;
Fig. 2 is a schematic flowchart of step S30 provided by an embodiment of the present invention;
Fig. 3 is an example of the time motion response map provided by an embodiment of the present invention;
Fig. 4 is a schematic flowchart of step S31 provided by an embodiment of the present invention;
Fig. 5 is a schematic diagram of the action recognition system provided by an embodiment of the present invention;
Fig. 6 is a schematic diagram of the acquisition module provided by an embodiment of the present invention;
Fig. 7 is a schematic diagram of the pixel value acquisition unit provided by an embodiment of the present invention;
Fig. 8 is a schematic diagram of the terminal device provided by an embodiment of the present invention.
Detailed description of embodiments
In the following description, specific details such as particular system structures and techniques are set forth for the purpose of illustration rather than limitation, so as to provide a thorough understanding of the embodiments of the present invention. However, it will be apparent to those skilled in the art that the present invention can also be implemented in other embodiments without these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits and methods are omitted, so that unnecessary detail does not obscure the description of the invention.
In order to explain the technical solutions of the present invention, specific embodiments are described below.
As shown in Fig. 1, an embodiment of the present invention provides an action recognition method, which includes:
Step S10: dividing the depth image sequence of the target object along the time dimension according to the preset time step, to obtain the depth cube in each time domain.
In a particular application, the preset time step is a fixed time step that can be set as actually needed; fixed time steps may be non-overlapping or partially overlapping, and correspondingly adjacent time domains may be non-overlapping or partially overlapping. For example, assume the total duration of all the image frames of the depth image sequence is 10 seconds and the fixed time step is 2 seconds. Dividing with a non-overlapping fixed time step, the depth image sequence is divided into the depth cubes of five time domains: 0-2 s, 2-4 s, 4-6 s, 6-8 s and 8-10 s. Dividing with a fixed time step whose overlap is 1 second, the depth image sequence is divided into the depth cubes of nine time domains: 0-2 s, 1-3 s, 2-4 s, 3-5 s, 4-6 s, 5-7 s, 6-8 s, 7-9 s and 8-10 s.
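The windowing described above can be sketched in a few lines of Python (the function and parameter names are illustrative, not taken from the patent; frame counts stand in for seconds):

```python
import numpy as np

def split_time_domains(frames, step, overlap=0):
    """Split a depth image sequence (T, H, W) into depth cubes along time.

    `step` is the fixed window length in frames; `overlap` is how many
    frames consecutive windows share (0 gives non-overlapping windows).
    """
    stride = step - overlap
    return [frames[start:start + step]
            for start in range(0, len(frames) - step + 1, stride)]

# 10 frames standing in for the 10-second example, 2-frame windows:
frames = np.zeros((10, 64, 48))
assert len(split_time_domains(frames, step=2, overlap=0)) == 5  # 0-2, ..., 8-10
assert len(split_time_domains(frames, step=2, overlap=1)) == 9  # 0-2, 1-3, ..., 8-10
```

With these parameters the sketch reproduces the five non-overlapping and nine partially overlapping time domains of the example above.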
In a particular application, the target object generally refers to a human body or any of various living animals in motion, or to a non-living robot, bionic animal, machine, electronic device, etc.
In a particular application, the depth image sequence can be acquired by a depth camera. A depth camera can perceive the distance information of the target area and output depth images of the target object in real time; the depth images are unaffected by illumination changes and insensitive to the color and texture of objects, and therefore have good robustness.
Step S20: dividing the depth cube along the spatial dimension according to the preset planar grid, to obtain a plurality of sub-depth cubes of identical dimensions, the number of image frames of each sub-depth cube being the same as the number of image frames of the depth cube.
In a particular application, step S20 specifically means dividing each depth cube along the spatial dimension, to obtain the plurality of sub-depth cubes corresponding to each depth cube.
In a particular application, the preset planar grid is a two-dimensional grid parallel to the plane of the depth images in the depth image sequence, and its density can be configured as actually needed: the denser the grid, the smaller the area of each sub-depth cube in the plane parallel to the depth images. That the number of image frames of a sub-depth cube is the same as that of the depth cube specifically means that, along the time axis perpendicular to the image plane, the sub-depth cubes contain the same number of frames as the depth cube of their time domain.
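Under the same illustrative naming, the spatial division of step S20 can be sketched as slicing each depth cube with a fixed planar grid, every sub-cube keeping all the frames of its cube:

```python
import numpy as np

def split_spatial_grid(cube, rows, cols):
    """Split a depth cube (T, H, W) into rows*cols sub-depth cubes of equal
    spatial size; every sub-cube keeps all T frames of the cube."""
    t, h, w = cube.shape
    sub_h, sub_w = h // rows, w // cols
    return [cube[:, r * sub_h:(r + 1) * sub_h, c * sub_w:(c + 1) * sub_w]
            for r in range(rows) for c in range(cols)]

cube = np.zeros((8, 64, 48))       # one depth cube: 8 frames of 64x48 pixels
subs = split_spatial_grid(cube, rows=4, cols=4)
assert len(subs) == 16
assert all(s.shape == (8, 16, 12) for s in subs)   # same frame count, smaller area
```

A denser grid (larger `rows`/`cols`) gives sub-cubes with a smaller area in the image plane, matching the remark above.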
Step S30: obtaining the time motion response map of each sub-depth cube.
In a particular application, step S30 specifically means obtaining a time motion response map for each of the sub-depth cubes corresponding to each depth cube.
Step S40: extracting the preset image features of the time motion response maps.
In a particular application, the preset image feature specifically refers to the histogram of oriented gradients (HOG) feature, but can also be the local binary pattern (LBP) feature or Haar features.
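As a rough, non-authoritative illustration of the HOG-type feature named here, the sketch below computes a single gradient-orientation histogram over a whole image; a real HOG implementation additionally divides the image into cells and blocks with block normalization:

```python
import numpy as np

def hog_like_feature(img, n_bins=9):
    """A single gradient-orientation histogram over a whole image, as a
    simplified stand-in for the HOG feature; real HOG adds cells, blocks
    and block normalization."""
    gy, gx = np.gradient(img.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), np.pi)              # unsigned orientation
    bins = np.minimum((ang / np.pi * n_bins).astype(int), n_bins - 1)
    hist = np.bincount(bins.ravel(), weights=mag.ravel(), minlength=n_bins)
    norm = np.linalg.norm(hist)
    return hist / norm if norm > 0 else hist

img = np.tile(np.arange(16.0), (16, 1))   # purely horizontal intensity ramp
feat = hog_like_feature(img)
assert feat.shape == (9,)
assert np.isclose(np.linalg.norm(feat), 1.0)
```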
Step S50: concatenating the preset image features of all the sub-depth cubes corresponding to the depth cube to obtain the feature vector of the depth cube, and concatenating the feature vectors of all the depth cubes corresponding to the depth image sequence to obtain the feature descriptor of the depth image sequence.
In a particular application, step S50 specifically means concatenating the preset image features of the sub-depth cubes one by one in the order in which the sub-depth cubes are arranged, and then concatenating the feature vectors of the depth cubes one by one in the order in which the depth cubes are arranged.
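The fixed concatenation order of step S50 can be sketched as follows (names illustrative):

```python
import numpy as np

def sequence_descriptor(cubes_features):
    """Concatenate in a fixed order: each depth cube's feature vector is its
    sub-cube features joined in grid order, and the sequence descriptor is
    the cube feature vectors joined in temporal order."""
    cube_vectors = [np.concatenate(sub_feats) for sub_feats in cubes_features]
    return np.concatenate(cube_vectors)

# 5 depth cubes, each split into 16 sub-cubes with a 9-dim feature apiece:
feats = [[np.ones(9) for _ in range(16)] for _ in range(5)]
desc = sequence_descriptor(feats)
assert desc.shape == (5 * 16 * 9,)
```

Because the order is fixed, descriptors of different sequences divided with the same parameters have the same length and are directly comparable by a classifier.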
It should be appreciated that a depth cube or sub-depth cube in this embodiment is not necessarily a cube with equal side lengths; its side lengths may partly or entirely differ.
Step S60: classifying the feature descriptor by a linear support vector machine (SVM), to identify the action type of the target object.
In this embodiment, the depth image sequence of the target object is divided along the time dimension and the spatial dimension to obtain a plurality of sub-depth cubes; a time motion response map is obtained for each sub-depth cube and preset image features are extracted from the time motion response maps; all the preset image features are then concatenated to obtain a feature descriptor of the depth image sequence, so that the action of the target object can be identified from the feature descriptor. The method requires no pre-segmentation of the target object, and has the notable advantages of a fast algorithm and high recognition accuracy and efficiency.
As shown in Fig. 2, in one embodiment of the present invention, step S30 specifically includes:
Step S31: obtaining the pixel values of the non-noise pixels in the sub-depth cube.
In a particular application, a convolution operation can be performed at each pixel position of every sub-depth cube to screen out the non-noise pixels in the sub-depth cube, and the pixel values of the non-noise pixels are then calculated; each such pixel value is the time motion response of the corresponding non-noise pixel.
Step S32: according to the pixel values, obtaining the pixels among the non-noise pixels that characterize target node motion, to obtain the time motion response map of the sub-depth cube.
In a particular application, a target node specifically refers to a moving node on the target object; when the target object is a human body, the target nodes can be the limbs or joints of the human body.
In one embodiment, step S32 specifically includes:
screening out, according to the temporal behavior of the pixel values, the pixels among the non-noise pixels that characterize target node motion;
performing visualization processing on the pixels characterizing target node motion, to obtain the time motion response map of the sub-depth cube.
In a particular application, the temporal behavior of the pixel values differs for different motion types. For example, when a hand is raised, the pixel values of the non-noise pixels at the hand's position first increase over time and then decrease to 0, while the pixel values of the non-noise pixels above the hand's position increase; pixels whose values change in this way are the pixels that characterize target node motion.
In a particular application, visualization processing specifically means converting the pixel values, which are represented numerically, into an image that can be viewed by the human eye.
As shown in Fig. 3, visualization processing is performed on the pixels characterizing target node motion in all the sub-cubes included in a depth image sequence sample of a human body, yielding the time motion response map of the sample. The numerical scale on the right of Fig. 3 represents gray values. As can be seen from Fig. 3, both the human silhouette and the motion information of the limbs in the depth image sequence sample are well characterized, while noise and background are well filtered out.
As shown in Fig. 4, in one embodiment of the present invention, step S31 specifically includes:
Step S311: performing zero-mean processing on the pixel signal of each pixel in the sub-depth cube.
In a particular application, in order to screen out the pixels in the depth image sequence that characterize target node motion, each pixel can be regarded as a one-dimensional time signal and denoted (i, j). Correspondingly, in one embodiment, step S311 can be realized by the following equation:
f*(i, j)[n] = f(i, j)[n] - (1/N) * Σ_{m=1}^{N} f(i, j)[m];
where N is the number of frames of the depth image sequence, f(i, j)[n] is the pixel value of the n-th frame image of the depth image sequence at pixel (i, j), and f*(i, j) is the pixel signal after zero-mean processing.
Step S312: applying a sign-function transform to the zero-mean pixel signals, to obtain the sign-function pixel signal of each pixel in the sub-depth cube.
In one embodiment, step S312 can be realized by the following equation:
sf(i, j) = sign(f*(i, j));
where sf(i, j) is the sign-function pixel signal obtained from f*(i, j) by the sign-function transform, and sign(·) is the sign function.
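Steps S311 and S312 together amount to subtracting each pixel's temporal mean and taking the sign, which can be sketched as follows (a minimal sketch under the equations above; names illustrative):

```python
import numpy as np

def sign_transformed_signals(cube):
    """Steps S311-S312: treat each pixel location of a sub-depth cube
    (N, H, W) as a 1-D signal over the N frames, subtract its temporal
    mean (zero-mean processing), then take the sign."""
    f = cube.astype(float)
    f_star = f - f.mean(axis=0, keepdims=True)   # f*(i, j)[n]
    return np.sign(f_star)                       # sf(i, j): +1 / 0 / -1

cube = np.stack([np.full((2, 2), v) for v in (1.0, 2.0, 6.0)])
sf = sign_transformed_signals(cube)
# temporal mean is 3, so the signs along time are -1, -1, +1 at every pixel
assert (sf[:, 0, 0] == np.array([-1.0, -1.0, 1.0])).all()
```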
Step S313: performing a convolution operation on the sign-function pixel signals, to obtain the pixel values of the non-noise pixels in the sub-depth cube.
In a particular application, for a given pixel (i, j) of a sub-depth cube, the number of pixels with sf(i, j) > 0 is called the positive sample number and denoted P(i, j); the number of pixels with sf(i, j) < 0 is called the negative sample number and denoted Q(i, j); the number of pixels whose time motion response is non-zero is called the non-zero response sample number and denoted NZ(i, j); and the template [-1, 0, -1] is taken as the convolution kernel. Correspondingly, in one embodiment, step S313 is realized by the following equation:
where M(i, j) represents the pixel value of the time motion response at pixel (i, j).
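The quantities defined above can be sketched as follows. Note this is only a partial sketch: it counts P and Q, and, assuming the template [-1, 0, -1] is applied as a convolution along the time axis, the non-zero responses NZ; the final combination into M(i, j) appears only as a figure in the original and is not reproduced here:

```python
import numpy as np

def response_statistics(sf):
    """Count, per pixel, the quantities defined in the text: P (frames with
    sf > 0), Q (frames with sf < 0), and NZ, the number of non-zero
    responses of the temporal convolution of sf with [-1, 0, -1]."""
    kernel = np.array([-1.0, 0.0, -1.0])
    p = (sf > 0).sum(axis=0)
    q = (sf < 0).sum(axis=0)
    conv = np.apply_along_axis(
        lambda s: np.convolve(s, kernel, mode='valid'), 0, sf)
    nz = (conv != 0).sum(axis=0)
    return p, q, nz

sf = np.array([[[-1.0]], [[-1.0]], [[1.0]], [[1.0]]])   # 4 frames, 1x1 pixel
p, q, nz = response_statistics(sf)
assert p[0, 0] == 2 and q[0, 0] == 2
```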
In this embodiment, human actions are identified by computing the time motion responses of the sub-depth cubes obtained; the algorithm is simple and the processing speed is fast.
It should be understood that the sequence numbers of the steps above do not imply an order of execution; the execution order of the processes should be determined by their functions and internal logic, and does not constitute any limitation on the implementation of the embodiments of the present invention.
As shown in Fig. 5, an embodiment of the present invention provides an action recognition system 100 for performing the method steps in the embodiment corresponding to Fig. 1, which includes:
a first division module 10, configured to divide the depth image sequence of a target object along the time dimension according to a preset time step, to obtain a depth cube in each time domain;
a second division module 20, configured to divide the depth cube along the spatial dimension according to a preset planar grid, to obtain a plurality of sub-depth cubes of identical dimensions, the number of image frames of each sub-depth cube being the same as the number of image frames of the depth cube;
an acquisition module 30, configured to obtain a time motion response map of each sub-depth cube;
an extraction module 40, configured to extract preset image features from the time motion response maps;
a concatenation module 50, configured to concatenate the preset image features of all the sub-depth cubes corresponding to the depth cube to obtain the feature vector of the depth cube, and to concatenate the feature vectors of all the depth cubes corresponding to the depth image sequence to obtain the feature descriptor of the depth image sequence;
a classification module 60, configured to classify the feature descriptor by a linear support vector machine (SVM), to identify the action type of the target object.
In this embodiment, the depth image sequence of the target object is divided along the time dimension and the spatial dimension to obtain a plurality of sub-depth cubes; a time motion response map is obtained for each sub-depth cube and preset image features are extracted from the time motion response maps; all the preset image features are then concatenated to obtain a feature descriptor of the depth image sequence, so that the action of the target object can be identified from the feature descriptor. The method requires no pre-segmentation of the target object, and has the notable advantages of a fast algorithm and high recognition accuracy and efficiency.
As shown in Fig. 6, in one embodiment of the present invention, the acquisition module 30 includes structures for performing the method steps in the embodiment corresponding to Fig. 2, namely:
a pixel value acquisition unit 31, configured to obtain the pixel values of the non-noise pixels in the sub-depth cube;
a response map acquisition unit 32, configured to obtain, according to the pixel values, the pixels among the non-noise pixels that characterize target node motion, to obtain the time motion response map of the sub-depth cube.
In one embodiment, the response map acquisition unit 32 specifically includes:
a screening subunit, configured to screen out, according to the temporal behavior of the pixel values, the pixels among the non-noise pixels that characterize target node motion;
a response map acquisition subunit, configured to perform visualization processing on the pixels characterizing target node motion, to obtain the time motion response map of the sub-depth cube.
As shown in Fig. 7, in one embodiment of the present invention, the pixel value acquisition unit 31 includes structures for performing the method steps in the embodiment corresponding to Fig. 4, namely:
an averaging subunit 311, configured to perform zero-mean processing on the pixel signal of each pixel in the sub-depth cube;
a function transform subunit 312, configured to apply a sign-function transform to the zero-mean pixel signals, to obtain the sign-function pixel signal of each pixel in the sub-depth cube;
a pixel value acquisition subunit 313, configured to perform a convolution operation on the sign-function pixel signals, to obtain the pixel values of the non-noise pixels in the sub-depth cube.
In one embodiment, the averaging subunit 311 is specifically configured to perform zero-mean processing on the pixel signal of each pixel in the sub-depth cube according to the following equation:
f*(i, j)[n] = f(i, j)[n] - (1/N) * Σ_{m=1}^{N} f(i, j)[m];
where N is the number of frames of the depth image sequence, f(i, j)[n] is the pixel value of the n-th frame image of the depth image sequence at pixel (i, j), and f*(i, j) is the pixel signal after zero-mean processing.
In one embodiment, the function transform subunit 312 is specifically configured to calculate the sign-function pixel signals according to the following equation:
sf(i, j) = sign(f*(i, j));
where sf(i, j) is the sign-function pixel signal obtained from f*(i, j) by the sign-function transform, and sign(·) is the sign function.
In one embodiment, the pixel value acquisition subunit 313 is specifically configured to calculate the pixel values of the non-noise pixels according to the following equation:
where M(i, j) represents the pixel value of the time motion response at pixel (i, j); P(i, j) is the number of pixels in the sub-depth cube with sf(i, j) > 0, called the positive sample number; Q(i, j) is the number of pixels in the sub-depth cube with sf(i, j) < 0, called the negative sample number; and NZ(i, j) is the number of pixels in the sub-depth cube whose time motion response is non-zero, called the non-zero response sample number.
In this embodiment, human actions are identified by computing the time motion responses of the sub-depth cubes obtained; the algorithm is simple and the processing speed is fast.
Fig. 8 is a schematic diagram of the terminal device provided by an embodiment of the invention. As shown in Fig. 8, the terminal device 7 of this embodiment includes a processor 70, a memory 71, and a computer program 72 stored in the memory 71 and executable on the processor 70. When executing the computer program 72, the processor 70 implements the steps in each of the method embodiments above, for example steps S10 to S60 shown in Fig. 1; alternatively, when executing the computer program 72, the processor 70 implements the functions of the modules/units in each of the apparatus embodiments above, for example the functions of modules 10 to 60 shown in Fig. 5.
Exemplarily, the computer program 72 can be divided into one or more modules/units, which are stored in the memory 71 and executed by the processor 70 to carry out the present invention. The one or more modules/units can be a series of computer program instruction segments capable of completing specific functions, the instruction segments being used to describe the execution of the computer program 72 in the terminal device 7. For example, the computer program 72 can be divided into a first division module, a second division module, an acquisition module, an extraction module, a concatenation module and an action recognition module, whose specific functions are as follows:
The first division module is configured to divide the depth image sequence of a target object along the time dimension according to a preset time step, obtaining a depth cube for each time domain.
The second division module is configured to divide the depth cube along the spatial dimensions according to a preset planar grid, obtaining multiple sub-depth cubes of identical dimensions, each sub-depth cube having the same number of image frames as the depth cube.
The acquisition module is configured to obtain the time motion response map of each sub-depth cube.
The extraction module is configured to extract a preset image feature of the time motion response map.
The concatenation module is configured to concatenate the preset image features of all sub-depth cubes corresponding to the depth cube to obtain the feature vector of the depth cube, and to concatenate the feature vectors of all depth cubes corresponding to the depth image sequence to obtain the feature descriptor of the depth image sequence.
The action recognition module is configured to classify the feature descriptor with a linear support vector machine to identify the action type of the target object.
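Under stated assumptions, the six modules above can be sketched end to end. In this sketch the response map is a plain temporal mean standing in for the patent's time motion response map, and `simple_hog` is a toy orientation histogram standing in for a full histogram-of-oriented-gradients implementation; the final linear-SVM classification (e.g. scikit-learn's `LinearSVC`) is noted in a comment but omitted.

```python
import numpy as np

def simple_hog(img, bins=8):
    """Tiny stand-in for the HOG feature the patent names; a real
    implementation would add cell/block normalization."""
    gy, gx = np.gradient(img.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.arctan2(gy, gx) % np.pi
    hist, _ = np.histogram(ang, bins=bins, range=(0, np.pi), weights=mag)
    return hist

def feature_descriptor(seq, t_step, grid):
    """Sketch of the descriptor pipeline: split the depth sequence
    (frames, h, w) into depth cubes of t_step frames, split each cube
    into a (rows, cols) grid of sub-depth cubes, collapse each sub-cube
    into a response map (a temporal mean here, as a placeholder for the
    time motion response map), extract a feature per map, concatenate."""
    frames, h, w = seq.shape
    rows, cols = grid
    feats = []
    for t0 in range(0, frames - t_step + 1, t_step):       # temporal division
        cube = seq[t0:t0 + t_step]
        for r in range(rows):                               # spatial grid division
            for c in range(cols):
                sub = cube[:, r*h//rows:(r+1)*h//rows,
                              c*w//cols:(c+1)*w//cols]
                response = sub.mean(axis=0)                 # placeholder response map
                feats.append(simple_hog(response))
    return np.concatenate(feats)  # descriptor would then feed a linear SVM
```

With a 4-frame 8x8 sequence, t_step=2 and a 2x2 grid, the descriptor concatenates 2 cubes x 4 sub-cubes x 8 bins = 64 values.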
The terminal device 7 may be a computing device such as a desktop computer, a notebook, a palmtop computer, or a cloud server. The terminal device may include, but is not limited to, the processor 70 and the memory 71. Those skilled in the art will understand that Fig. 8 is merely an example of the terminal device 7 and does not limit it; the device may include more or fewer components than illustrated, combine certain components, or use different components. For example, the terminal device may also include input/output devices, network access devices, buses, and the like.
The processor 70 may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, discrete hardware components, and the like. The general-purpose processor may be a microprocessor, or any conventional processor.
The memory 71 may be an internal storage unit of the terminal device 7, such as a hard disk or internal memory of the terminal device 7. The memory 71 may also be an external storage device of the terminal device 7, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card equipped on the terminal device 7. Further, the memory 71 may include both the internal storage unit of the terminal device 7 and an external storage device. The memory 71 is used to store the computer program and other programs and data required by the terminal device. The memory 71 may also be used to temporarily store data that has been output or is to be output.
It will be apparent to those skilled in the art that, for convenience and brevity of description, the division into the above functional units and modules is merely illustrative. In practical applications, the above functions may be allocated to different functional units or modules as needed; that is, the internal structure of the apparatus may be divided into different functional units or modules to complete all or part of the functions described above. The functional units and modules in the embodiments may be integrated into one processing unit, may exist separately as physical units, or two or more units may be integrated into one unit; the integrated unit may be implemented in the form of hardware or in the form of a software functional unit. In addition, the specific names of the functional units and modules are only for ease of mutual distinction and do not limit the protection scope of the present application. For the specific working process of the units and modules in the above system, reference may be made to the corresponding process in the foregoing method embodiments, which will not be repeated here.
In the above embodiments, each embodiment is described with its own emphasis. For parts not detailed or recorded in one embodiment, reference may be made to the related descriptions of other embodiments.
Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented by electronic hardware, or by a combination of computer software and electronic hardware. Whether these functions are performed in hardware or software depends on the specific application and design constraints of the technical solution. Skilled artisans may use different methods to implement the described functions for each particular application, but such implementations should not be considered beyond the scope of the present invention.
In the embodiments provided by the present invention, it should be understood that the disclosed apparatus/terminal device and method may be implemented in other ways. For example, the apparatus/terminal device embodiments described above are merely illustrative: the division of the modules or units is only a logical functional division, and other divisions are possible in actual implementation. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the mutual couplings or direct couplings or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and components displayed as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network elements. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in each embodiment of the present invention may be integrated into one processing unit, may exist separately as physical units, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated module/unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the present invention implements all or part of the flow of the above embodiment methods, which may also be accomplished by a computer program instructing the relevant hardware. The computer program may be stored in a computer-readable storage medium, and when executed by a processor, the computer program implements the steps of each of the above method embodiments. The computer program includes computer program code, which may be in source code form, object code form, an executable file, some intermediate form, and the like. The computer-readable medium may include: any entity or apparatus capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunication signal, a software distribution medium, and the like. It should be noted that the content contained in the computer-readable medium may be appropriately increased or decreased according to the requirements of legislation and patent practice in the relevant jurisdiction; for example, in some jurisdictions, according to legislation and patent practice, computer-readable media do not include electrical carrier signals and telecommunication signals.
The above embodiments are only intended to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions described in the foregoing embodiments, or make equivalent substitutions for some of the technical features therein; and these modifications or substitutions do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and should all fall within the protection scope of the present invention.

Claims (10)

  1. An action recognition method, characterized by comprising:
    dividing the depth image sequence of a target object along the time dimension according to a preset time step, obtaining a depth cube for each time domain;
    dividing the depth cube along the spatial dimensions according to a preset planar grid, obtaining multiple sub-depth cubes of identical dimensions, each sub-depth cube having the same number of image frames as the depth cube;
    obtaining the time motion response map of each sub-depth cube;
    extracting a preset image feature of the time motion response map;
    concatenating the preset image features of all sub-depth cubes corresponding to the depth cube to obtain the feature vector of the depth cube, and concatenating the feature vectors of all depth cubes corresponding to the depth image sequence to obtain the feature descriptor of the depth image sequence;
    classifying the feature descriptor with a linear support vector machine to identify the action type of the target object.
  2. The action recognition method of claim 1, characterized in that obtaining the time motion response map of the sub-depth cube comprises:
    obtaining the pixel values of the non-noise pixels in the sub-depth cube;
    obtaining, according to the pixel values, the pixels among the non-noise pixels that characterize target node motion, thereby obtaining the time motion response map of the sub-depth cube.
  3. The action recognition method of claim 2, characterized in that obtaining the pixel values of the non-noise pixels in the sub-depth cube comprises:
    performing zero-mean processing on the pixel signal of each pixel in the sub-depth cube;
    applying a sign-function transform to the zero-meaned pixel signals, obtaining the sign-function pixel signal of each pixel in the sub-depth cube;
    performing a convolution operation on the sign-function pixel signals, obtaining the pixel values of the non-noise pixels in the sub-depth cube.
  4. The action recognition method of claim 2, characterized in that obtaining, according to the pixel values, the pixels among the non-noise pixels that characterize target node motion, thereby obtaining the time motion response map of the sub-depth cube, comprises:
    filtering out, according to the temporal behavior of the pixel values, the pixels among the non-noise pixels that characterize target node motion;
    performing visualization processing on the pixels that characterize target node motion, obtaining the time motion response map of the sub-depth cube.
  5. The action recognition method of claim 1, characterized in that the preset time step is a non-overlapping or partially overlapping fixed time step, adjacent time domains are non-overlapping or partially overlapping, and the image feature is a histogram of oriented gradients feature.
  6. An action recognition system, characterized by comprising:
    a first division module configured to divide the depth image sequence of a target object along the time dimension according to a preset time step, obtaining a depth cube for each time domain;
    a second division module configured to divide the depth cube along the spatial dimensions according to a preset planar grid, obtaining multiple sub-depth cubes of identical dimensions, each sub-depth cube having the same number of image frames as the depth cube;
    an acquisition module configured to obtain the time motion response map of each sub-depth cube;
    an extraction module configured to extract a preset image feature of the time motion response map;
    a concatenation module configured to concatenate the preset image features of all sub-depth cubes corresponding to the depth cube to obtain the feature vector of the depth cube, and to concatenate the feature vectors of all depth cubes corresponding to the depth image sequence to obtain the feature descriptor of the depth image sequence;
    an action recognition module configured to classify the feature descriptor with a linear support vector machine to identify the action type of the target object.
  7. The action recognition system of claim 6, characterized in that the acquisition module comprises:
    a pixel value acquisition unit configured to obtain the pixel values of the non-noise pixels in the sub-depth cube;
    a response map acquisition unit configured to obtain, according to the pixel values, the pixels among the non-noise pixels that characterize target node motion, thereby obtaining the time motion response map of the sub-depth cube.
  8. The action recognition system of claim 7, characterized in that the pixel value acquisition unit comprises:
    an equalization processing sub-unit configured to perform zero-mean processing on the pixel signal of each pixel in the sub-depth cube;
    a function transform sub-unit configured to apply a sign-function transform to the zero-meaned pixel signals, obtaining the sign-function pixel signal of each pixel in the sub-depth cube;
    a pixel value acquisition sub-unit configured to perform a convolution operation on the sign-function pixel signals, obtaining the pixel values of the non-noise pixels in the sub-depth cube.
  9. A terminal device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the steps of the method of any one of claims 1 to 5.
  10. A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the method of any one of claims 1 to 5.
CN201710901427.2A 2017-09-28 2017-09-28 Action identification method and system and terminal equipment Active CN107704819B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710901427.2A CN107704819B (en) 2017-09-28 2017-09-28 Action identification method and system and terminal equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710901427.2A CN107704819B (en) 2017-09-28 2017-09-28 Action identification method and system and terminal equipment

Publications (2)

Publication Number Publication Date
CN107704819A true CN107704819A (en) 2018-02-16
CN107704819B CN107704819B (en) 2020-01-24

Family

ID=61175218

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710901427.2A Active CN107704819B (en) 2017-09-28 2017-09-28 Action identification method and system and terminal equipment

Country Status (1)

Country Link
CN (1) CN107704819B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105608421A (en) * 2015-12-18 2016-05-25 中国科学院深圳先进技术研究院 Human movement recognition method and device
CN106709461A (en) * 2016-12-28 2017-05-24 中国科学院深圳先进技术研究院 Video based behavior recognition method and device


Also Published As

Publication number Publication date
CN107704819B (en) 2020-01-24

Similar Documents

Publication Publication Date Title
Wu et al. 3d shapenets for 2.5 d object recognition and next-best-view prediction
Wu et al. 3d shapenets: A deep representation for volumetric shapes
CN107748890A (en) A kind of visual grasping method, apparatus and its readable storage medium storing program for executing based on depth image
CN108319957A (en) A kind of large-scale point cloud semantic segmentation method based on overtrick figure
CN109711410A (en) Three-dimensional object rapid segmentation and identification method, device and system
CN107506754A (en) Iris identification method, device and terminal device
CN110334762B (en) Feature matching method based on quad tree combined with ORB and SIFT
CN108044627A (en) Detection method, device and the mechanical arm of crawl position
CN107953329A (en) Object identification and Attitude estimation method, apparatus and mechanical arm grasping system
CN106408037A (en) Image recognition method and apparatus
CN109117773A (en) A kind of characteristics of image point detecting method, terminal device and storage medium
CN103310233A (en) Similarity mining method of similar behaviors between multiple views and behavior recognition method
CN110414571A (en) A kind of website based on Fusion Features reports an error screenshot classification method
CN110751097A (en) Semi-supervised three-dimensional point cloud gesture key point detection method
CN108734773A (en) A kind of three-dimensional rebuilding method and system for mixing picture
CN114509785A (en) Three-dimensional object detection method, device, storage medium, processor and system
CN110930503A (en) Method and system for establishing three-dimensional model of clothing, storage medium and electronic equipment
CN111145196A (en) Image segmentation method and device and server
CN110793437A (en) Positioning method and device of manual operator, storage medium and electronic equipment
Shrestha et al. A real world dataset for multi-view 3d reconstruction
CN108876776A (en) A kind of method of generating classification model, eye fundus image classification method and device
CN107633506A (en) A kind of image symmetrical characteristic detection method, device and terminal device
CN111401184A (en) Machine vision processing method and device, storage medium and electronic equipment
CN108229498B (en) Zipper piece identification method, device and equipment
CN107704819A (en) A kind of action identification method, system and terminal device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant