CN108345387A

CN108345387A - Method and apparatus for output information

Info

Publication number: CN108345387A
Application number: CN201810209048.1A
Authority: CN
Inventors: 杨振中
Original assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Current assignee: Baidu Online Network Technology Beijing Co Ltd; Beijing Baidu Netcom Science and Technology Co Ltd
Priority date: 2018-03-14
Filing date: 2018-03-14
Publication date: 2018-07-31

Abstract

This application discloses the method and apparatus for output information.One specific implementation mode of the above method includes：Obtain the target image for including gesture；Extract the characteristic point of above-mentioned gesture；According to the characteristic point extracted and the gesture identification model pre-established, determine that the gesture-type of above-mentioned gesture, above-mentioned gesture identification model are used for the correspondence of characteristic feature point and gesture-type；Correspondence list in response to the preset gesture-type of determination and control command includes the gesture-type of above-mentioned gesture, determines and export control command corresponding with the gesture-type of above-mentioned gesture.The embodiment enhances the sense of participation of people, enriches man-machine interaction mode.

Description

Method and apparatus for output information

Technical field

This application involves field of computer technology, and in particular to Internet technical field, more particularly, to output information Method and apparatus.

Background technology

With the development of Internet technology, the recreation based on terminal is on the increase.With augmented reality The appearance of (Augmented Reality, AR), can merge the things and virtual environment in reality, man-machine mutual to increase Dynamic interest.Rely on fixed things in environment in current fusion scene, the sense of participation of people is weaker more.

Invention content

The embodiment of the present application proposes the method and apparatus for output information.

In a first aspect, the embodiment of the present application provides a kind of method for output information, including：It obtains comprising gesture Target image；Extract the characteristic point of above-mentioned gesture；According to the characteristic point extracted and the gesture identification model pre-established, really The gesture-type of fixed above-mentioned gesture, above-mentioned gesture identification model are used for the correspondence of characteristic feature point and gesture-type；Response It is determining and defeated in determining that the correspondence list of preset gesture-type and control command includes the gesture-type of above-mentioned gesture Go out control command corresponding with the gesture-type of above-mentioned gesture.

In some embodiments, above-mentioned acquisition includes the target image of gesture, including：Obtain the video flowing for including gesture； Using the multiple image in above-mentioned video flowing as above-mentioned target image.

In some embodiments, the above method further includes：It is corresponding with control command in response to the above-mentioned gesture-type of determination Relation list does not include the gesture-type of above-mentioned gesture, detects in above-mentioned target image and is included with the presence or absence of at least two field pictures Gesture it is consistent；It is consistent in response to there is the gesture that at least two field pictures are included in the above-mentioned target image of determination, it determines above-mentioned Position of the gesture in above-mentioned at least two field pictures；According to position of the above-mentioned at least two field pictures in above-mentioned video flowing and on Position of the gesture in above-mentioned at least two field pictures is stated, determines that above-mentioned gesture is formed by action；According to preset action and control The correspondence of order is made, determines and exports and be formed by the corresponding control command of action with above-mentioned gesture.

In some embodiments, above-mentioned acquisition includes the target image of gesture, including：A frame is chosen from above-mentioned video flowing Image is as image to be detected；Based on image to be detected, executes and step is determined with hypograph：Detect the contrast of image to be detected Whether preset condition is met；In response to determining that the contrast of image to be detected meets preset condition, using image to be detected as mesh Logo image；In response to determining that the contrast of image to be detected is unsatisfactory for preset condition, chosen from above-mentioned video flowing unselected A frame image as new image to be detected, continue to execute above-mentioned image and determine step.

In some embodiments, above-mentioned acquisition includes the target image of gesture, including：In response to detecting the control of user Request operation sends gesture identification model to server-side and obtains request；In response to receiving above-mentioned gesture identification model, packet is obtained Target image containing gesture.

In some embodiments, above-mentioned gesture identification model is obtained by deep learning model training；And the above method is also Include the training step of gesture identification model：Preset training sample set is obtained, above-mentioned training sample set includes at least one Marked gesture-type, the different shooting angles image of kind；Extract the characteristic point of gesture in above-mentioned training sample；By above-mentioned instruction The characteristic point for practicing gesture in sample inputs deep learning model to be trained, and the characteristic point for obtaining gesture in training sample is corresponding The prediction result of gesture-type；Prediction result based on above-mentioned gesture-type and the difference iteration between the gesture-type of label are more The parameter of new deep learning model, so that the difference between the prediction result of above-mentioned gesture-type and the gesture-type of label meets The preset condition of convergence.

In some embodiments, the above-mentioned gesture identification model pre-established includes the characteristic point mould of a variety of default gestures Type, features described above point model is using deep learning algorithm to including the sample graphs of a variety of default gestures, different shooting angles As being generated after carrying out feature point extraction；And it is above-mentioned according to the characteristic point extracted and the gesture identification mould pre-established Type determines the gesture-type of above-mentioned gesture, including：Determine the characteristic point mould of extracted characteristic point and above-mentioned a variety of default gestures Matching degree between type；Using the gesture-type of the default gesture belonging to the highest characteristic point model of matching degree as above-mentioned target figure As the gesture-type for the gesture for being included.

Second aspect, the embodiment of the present application provide a kind of device for output information, including：Target image obtains single Member, for obtaining the target image for including gesture；Feature point extraction unit, the characteristic point for extracting above-mentioned gesture；Gesture class Type determination unit, for according to the characteristic point extracted and the gesture identification model pre-established, determining the hand of above-mentioned gesture Gesture type, above-mentioned gesture identification model are used for the correspondence of characteristic feature point and gesture-type；Control command output unit is used Include the gesture-type of above-mentioned gesture in the correspondence list in response to the preset gesture-type of determination and control command, determines And output control command corresponding with the gesture-type of above-mentioned gesture.

In some embodiments, above-mentioned target image acquiring unit is further used for：Obtain the video flowing for including gesture；It will Multiple image in above-mentioned video flowing is as above-mentioned target image.

In some embodiments, above-mentioned apparatus further includes：Gestures detection unit, in response to the above-mentioned gesture-type of determination Correspondence list with control command does not include the gesture-type of above-mentioned gesture, detects in above-mentioned target image with the presence or absence of extremely The gesture that few two field pictures are included is consistent；Position determination unit, in response to existing at least in the above-mentioned target image of determination The gesture that two field pictures are included is consistent, determines position of the above-mentioned gesture in above-mentioned at least two field pictures；Determination unit is acted, For according to position of the above-mentioned at least two field pictures in above-mentioned video flowing and above-mentioned gesture in above-mentioned at least two field pictures Position, determine that above-mentioned gesture is formed by action；Second control command output unit, for according to preset action and control The correspondence of order, determines and exports and be formed by the corresponding control command of action with above-mentioned gesture.

In some embodiments, above-mentioned target image acquiring unit is further used for：A frame is chosen from above-mentioned video flowing Image is as image to be detected；Based on image to be detected, executes and step is determined with hypograph：Detect the contrast of image to be detected Whether preset condition is met；In response to determining that the contrast of image to be detected meets preset condition, using image to be detected as mesh Logo image；In response to determining that the contrast of image to be detected is unsatisfactory for preset condition, chosen from above-mentioned video flowing unselected A frame image as new image to be detected, continue to execute above-mentioned image and determine step.

In some embodiments, above-mentioned target image acquiring unit is further used for：In response to detecting the control of user Request operation sends gesture identification model to server-side and obtains request；In response to receiving above-mentioned gesture identification model, packet is obtained Target image containing gesture.

In some embodiments, above-mentioned gesture identification model is obtained by deep learning model training；And above-mentioned apparatus is also Including model training unit, it is used for：Preset training sample set is obtained, above-mentioned training sample set includes that at least one has been marked Remember gesture-type, different shooting angles images；Extract the characteristic point of gesture in above-mentioned training sample；By above-mentioned training sample The characteristic point of middle gesture inputs deep learning model to be trained, and obtains the corresponding gesture class of characteristic point of gesture in training sample The prediction result of type；Prediction result based on above-mentioned gesture-type and the difference iteration between the gesture-type of label update depth The parameter of learning model so that the difference between the prediction result of above-mentioned gesture-type and the gesture-type of label meet it is preset The condition of convergence.

In some embodiments, the above-mentioned gesture identification model pre-established includes the characteristic point mould of a variety of default gestures Type, features described above point model is using deep learning algorithm to including the sample graphs of a variety of default gestures, different shooting angles As being generated after carrying out feature point extraction；And above-mentioned gesture-type determination unit is further used for：Determine extracted feature Matching degree between point and the feature point model of above-mentioned a variety of default gestures；It will be pre- belonging to the highest characteristic point model of matching degree If the gesture-type for the gesture that the gesture-type of gesture is included as above-mentioned target image.

The third aspect, the embodiment of the present application provide a kind of terminal, including：One or more processors；Storage device is used In the one or more programs of storage, when said one or multiple programs are executed by said one or multiple processors so that above-mentioned One or more processors realize method described in any of the above-described embodiment.

Fourth aspect, the embodiment of the present application provide a kind of computer-readable medium, are stored thereon with computer program, should Method described in any of the above-described embodiment is realized when program is executed by processor.

The method and apparatus for output information that above-described embodiment of the application provides obtain the mesh for including gesture first Then logo image extracts the characteristic point of above-mentioned gesture, then according to the characteristic point extracted and the gesture identification pre-established Model determines the gesture-type of gesture, in conjunction with the correspondence of preset gesture-type and control command, determine and output with The corresponding control command of gesture-type of above-mentioned gesture.The method of the present embodiment, can be by identifying that the gesture of people determines control Order, enhances the sense of participation of people, enriches man-machine interaction mode.

Description of the drawings

By reading a detailed description of non-restrictive embodiments in the light of the attached drawings below, the application's is other Feature, objects and advantages will become more apparent upon：

Fig. 1 is that this application can be applied to exemplary system architecture figures therein；

Fig. 2 is the flow chart according to one embodiment of the method for output information of the application；

Fig. 3 is the schematic diagram according to an application scenarios of the method for output information of the application；

Fig. 4 is the flow chart according to another embodiment of the method for output information of the application；

Fig. 5 is that training sample set in gesture identification model is trained in the method for output information according to the application Schematic diagram；

Fig. 6 is the structural schematic diagram according to one embodiment of the device for output information of the application；

Fig. 7 is adapted for the structural schematic diagram of the computer system of the terminal device for realizing the embodiment of the present application.

Specific implementation mode

The application is described in further detail with reference to the accompanying drawings and examples.It is understood that this place is retouched The specific embodiment stated is used only for explaining related invention, rather than the restriction to the invention.It also should be noted that in order to Convenient for description, is illustrated only in attached drawing and invent relevant part with related.

It should be noted that in the absence of conflict, the features in the embodiments and the embodiments of the present application can phase Mutually combination.The application is described in detail below with reference to the accompanying drawings and in conjunction with the embodiments.

Fig. 1 shows the implementation of the method for output information or the device for output information that can apply the application The exemplary system architecture 100 of example.

As shown in Figure 1, system architecture 100 may include terminal device 101,102,103, network 104 and server 105. Network 104 between terminal device 101,102,103 and server 105 provide communication link medium.Network 104 can be with Including various connection types, such as wired, wireless communication link or fiber optic cables etc..

User can be interacted by network 104 with server 105 with using terminal equipment 101,102,103, to receive or send out Send message etc..Various telecommunication customer end applications, such as camera-type application, trip can be installed on terminal device 101,102,103 Class of playing application, web browser applications, the application of shopping class, searching class application, instant messaging tools, mailbox client, social activity are flat Platform software etc..

Terminal device 101,102,103 can be hardware, can also be software.When terminal device 101,102,103 is hard Can be the various electronic equipments with display screen and supported web page browsing when part, including but not limited to smart mobile phone, flat (Moving Picture Experts Group Audio Layer III are moved for plate computer, E-book reader, MP3 player State image expert's compression standard audio level 3), MP4 (Moving Picture Experts Group Audio Layer IV, Dynamic image expert's compression standard audio level 4) player, pocket computer on knee and desktop computer etc..Work as terminal When equipment 101,102,103 is software, it may be mounted in above-mentioned cited electronic equipment.It may be implemented into multiple soft Part or software module can also be implemented as single software or software module, and the present embodiment does not limit this not to be done specifically herein It limits.

Server 105 can be to provide the server of various services, such as to being handled on terminal device 101,102,103 Image provides the background server supported.Background server can send processing model to terminal device 101,102,103, so that Terminal device 101,102,103 handles image.

It should be noted that the method for output information that is provided of the embodiment of the present application generally by terminal device 101, 102, it 103 executes, correspondingly, the device for output information is generally positioned in terminal device 101,102,103.

It should be noted that server can be hardware, can also be software.When server is hardware, may be implemented At the distributed server cluster that multiple servers form, individual server can also be implemented as.It, can when server is software To be implemented as multiple softwares or software module (such as providing Distributed Services), single software or software can also be implemented as Module.It is not specifically limited herein.

It should be understood that the number of the terminal device, network and server in Fig. 1 is only schematical.According to realization need It wants, can have any number of terminal device, network and server.

With continued reference to Fig. 2, the flow of one embodiment of the method for output information according to the application is shown 200.The method for output information of the present embodiment, includes the following steps：

Step 201, the target image for including gesture is obtained.

In the present embodiment, the executive agent (such as terminal device shown in FIG. 1) for being used for the method for output information can be with Target image is obtained by various modes.Such as can be by communicating with connection image taking software in obtain, can also Acquisition is stored in local target image, can also obtain target image by the social software wherein installed, can also from regarding Frequency obtains target image in flowing.Above-mentioned target image is the image comprising images of gestures.In the present embodiment, above-mentioned executive agent exists When obtaining target image, a target image can be obtained, multiple target images can also be obtained.

Step 202, the characteristic point of above-mentioned gesture is extracted.

Executive agent can extract the characteristic point of gesture in target image after getting target image.What characteristic point referred to Figure can be reflected by being the point of gray value of image generation acute variation or the larger point of curvature on image border, image characteristic point As substantive characteristics, target object in image can be identified for that.In the present embodiment, different hands is distinguished by the characteristic point of gesture Gesture.When extracting the characteristic point of gesture, any of method extraction may be used, the present embodiment is not listed.

Step 203, according to the characteristic point extracted and the gesture identification model pre-established, the hand of above-mentioned gesture is determined Gesture type.

In the present embodiment, after being extracted the characteristic point of gesture, the characteristic point of extraction can be inputted the hand pre-established Gesture identification model obtains the gesture-type of above-mentioned gesture.Above-mentioned gesture-type can be used for characterizing different gestures, such as gesture Palm that type, which may include that palm turned upwards, to be divided, the palm that palm turned downwards divides, index finger and thumb contact its excess-three, which refer to, to be separated " OK " shape etc. formed.Above-mentioned gesture identification model can be used for the correspondence of characteristic feature point and gesture-type, can It is realized, can also be realized by the correspondence list of preset characteristic point and gesture-type with being trained by deep learning algorithm.

Step 204, the correspondence list in response to the preset gesture-type of determination and control command includes above-mentioned gesture Gesture-type, determine and corresponding with the gesture-type of the above-mentioned gesture control command of output.

In the present embodiment, the correspondence of different gestures and different control commands can be preset in lists, In target image is determined after the gesture-type of gesture, corresponding control command can be determined according to above-mentioned list, then will Above-mentioned control command output.For example, in AR scenes, when user spread out palm it is upward when, corresponding control command can be with For the expression for changing in scene.

It is a signal according to the application scenarios of the method for output information of the present embodiment with continued reference to Fig. 3, Fig. 3 Figure.In the application scenarios of Fig. 3, the camera software installed in terminal detects the target image for including gesture " than the heart " 30 Afterwards, symbol corresponding with the gesture 30 31 can be shown in AR scenes.

The method for output information that above-described embodiment of the application provides obtains the target figure for including gesture first Then picture extracts the characteristic point of above-mentioned gesture, then according to the characteristic point extracted and the gesture identification model pre-established, The gesture-type for determining gesture, in conjunction with the correspondence list of preset gesture-type and control command, determine and output with The corresponding control command of gesture-type of above-mentioned gesture.The method of the present embodiment, can be by identifying that the gesture of people determines control Order, enhances the sense of participation of people, enriches man-machine interaction mode.

In some optional realization methods of the present embodiment, target image may include multiple, then step 201 specifically may be used To be realized by unshowned following steps in Fig. 2：Obtain the video flowing for including gesture；By the multiple image in above-mentioned video flowing As target image.

In this realization method, executive agent can obtain the video flowing comprising gesture first, then in above-mentioned video flowing Multiple image is chosen as target image.

In some optional realization methods of the present embodiment, the executive agent of the method for output information is terminal, It can detect user and be operated to the control data of terminal, and control data operation is needed for characterizing user through gesture next life At control command.Then step 201 can also include unshowned following steps in Fig. 2：In response to detecting that the control of user is asked Operation is asked, sending gesture identification model to server-side obtains request；In response to receiving gesture identification model, it includes gesture to obtain Target image.

In this realization method, the possible occupied memory size of gesture identification model is larger, in order to not influence the fortune of terminal Scanning frequency degree, above-mentioned model can be stored in server-side.Only terminal, just can be to clothes in the control data operation for detecting user Business end sends gesture identification model and obtains request.Server-side can incite somebody to action after receiving above-mentioned gesture identification model and obtaining request Gesture identification model is sent to terminal.Terminal is after receiving gesture identification model, then the camera by being installed in terminal is answered With or obtain include the target image of gesture otherwise.By the realization method, terminal can be effectively improved The speed of service.

In some optional realization methods of the present embodiment, in order to further increase the processing speed of terminal, obtaining When target image, the resolution ratio of target image can also be handled, the resolution ratio of target image is controlled in preset range It is interior.Such as 100*100 can be less than with the resolution ratio of control targe image.

In some optional realization methods of the present embodiment, the contrast that can set target object meets default item Part, to facilitate the characteristic point of extraction gesture, then step 201 can also specifically be realized by unshowned following steps in Fig. 2：From A frame image is chosen in above-mentioned video flowing as image to be detected；Based on image to be detected, executes and step is determined with hypograph：Inspection Whether the contrast for surveying image to be detected meets preset condition；Item is preset in response to determining that the contrast of image to be detected meets Part, using image to be detected as target image；In response to determining that the contrast of image to be detected is unsatisfactory for preset condition, from above-mentioned A unselected frame image is chosen in video flowing as new image to be detected, is continued to execute image and is determined step.

In this realization method, a frame image can be chosen from above-mentioned video flowing first as image to be detected, then base In above-mentioned image to be detected, executes image and determine step.Above-mentioned image determines that step may include：Detect pair of image to be detected Whether meet preset condition than degree, if the contrast of image to be detected meets preset condition, using image to be detected as mesh Logo image；If the contrast of image to be detected is unsatisfactory for preset condition, unselected one is chosen from above-mentioned video flowing Frame image returns to execution image and determines step as new image to be detected.It can ensure target image by this realization method Middle gesture is clear, to be easy to extract its characteristic point.

In some optional realization methods of the present embodiment, above-mentioned gesture identification model is obtained by deep learning model training It arrives, then the above method can also include the training step of gesture identification model：Obtain preset training sample set；Extraction training The gesture-type of the characteristic point of gesture and determining gesture in sample；The characteristic point of gesture in the training sample is inputted and waits instructing Experienced deep learning model obtains the prediction result of the corresponding gesture-type of characteristic point of gesture in training sample；Based on above-mentioned The parameter of difference iteration update deep learning model between the prediction result of gesture-type and the gesture-type of label, so that on It states the difference between the prediction result of gesture-type and the gesture-type of label and meets the preset condition of convergence.

Wherein, training sample set includes at least one marked gesture-type, different shooting angles image, that is, is instructed It may include multiple training sample groups to practice sample set, and the image in each training sample group includes same gesture in different shootings The image of angle.That is, it includes a kind of a kind of image of shooting angle of gesture that each training sample, which is,.In order to enable gesture Identification model can more accurately identify gesture, and the angle for the same gesture that a training sample group includes can be more as possible. Fig. 4 shows the schematic diagram of training sample, including " palm " this gesture nine kinds of different shooting angles image, it is above-mentioned Nine kinds of images include gesture the image of conplane different angle and above-mentioned gesture the different angle in space image.

After obtaining above-mentioned training sample set, the characteristic point of gesture in each training sample can be extracted, then by hand Input of the characteristic point of gesture as deep learning model obtains deep learning model to the gesture-type corresponding to features described above point Prediction result, the then difference between the marked gesture-type corresponding to above-mentioned prediction result and this feature point, The parameter of above-mentioned deep learning model is adjusted, so that the deep learning model after adjustment is to the hand corresponding to features described above point Difference between the prediction result of gesture type and marked gesture-type meets the preset condition of convergence.Above-mentioned condition of convergence example Such as may include that error is less than threshold value.

In this realization method, above-mentioned deep learning model can be based on convolutional neural networks algorithm, unsupervised learning algorithm, Any one structure in Recognition with Recurrent Neural Network algorithm, Boltzmann machine algorithm obtains.

In some optional realization methods of the present embodiment, the above-mentioned gesture identification model pre-established includes multiple pre- If the feature point model of gesture, features described above point model is shot by deep learning algorithm to comprising a variety of default gestures, different The sample image of angle generated after characteristic point extracts (image shown in Fig. 4 can also be applied to this realization method In).That is, for a kind of default gesture, deep learning algorithm can extract first the gesture different shooting angles each sample The characteristic point of gesture in image, then the characteristic point of the different angle to being extracted be reconstructed, obtain including this kind of gesture Feature point model.It is understood that the feature point model of this kind of gesture includes characteristic point of this kind of gesture in different angle.This It can be one in the feature point model of realization method, can also be multiple.In addition, the feature point model of this realization method can be with A kind of default gesture including same gesture-type or a variety of default gestures.

Further step 203 can specifically be realized by unshowned following steps in Fig. 2：Determine extracted spy Matching degree between sign point and the feature point model of each default gesture；By the default hand belonging to the highest characteristic point model of matching degree The gesture-type for the gesture that the gesture-type of gesture is included as target image.

It, can be by gesture in target image in extracting target image after the characteristic point of gesture in this realization method Characteristic point be compared with features described above point model, with the matching degree both calculated, and assert the highest characteristic point of matching degree The gesture-type for the default gesture that model includes is identical as the gesture-type of gesture in target image, therefore by matching degree highest Characteristic point model belonging to gesture the gesture-type gesture that is included as target image gesture-type.In other realities In existing mode, each matching degree between the characteristic point extracted and the characteristic point of preset gesture can also be exported, for Gesture in the selection target image of family belonging to gesture.

With continued reference to Fig. 5, it illustrates controlled based on action output in the method for output information according to the application The flow 500 of the embodiment of order.As shown in figure 5, the method for output information of the present embodiment is being determined from video After the gesture-type for the gesture for including in each target image of stream, it can also include the following steps：

Step 501, the correspondence list in response to the above-mentioned gesture-type of determination and control command does not include above-mentioned gesture Gesture-type, detect consistent with the presence or absence of the gesture that at least two field pictures are included in target image.

The executive agent of method for output information is in the correspondence list that gesture-type and control command is determined In do not include each gesture that target image includes after, can detect in target image whether including at least two field pictures included Gesture it is consistent.

Step 502, consistent in the presence of the gesture that at least two field pictures are included in target image in response to determining, it determines above-mentioned Position of the gesture in above-mentioned at least two field pictures.

It is consistent to there is the gesture that at least two field pictures are included in target image is determined, then illustrating may in video flowing In the presence of the action formed by gesture.Position of the above-mentioned gesture in above-mentioned at least two field pictures can be determined first, then determined The action that above-mentioned gesture is formed.In the present embodiment, above-mentioned position can include but is not limited to the size of gesture, gesture central point Straight line where position, gesture and arm and image angle formed by the symmetry axis of symmetry axis/horizontal direction of vertical direction.

Step 503, according to above-mentioned at least two field pictures position in video streaming and above-mentioned gesture above-mentioned at least two Position in frame image determines that above-mentioned gesture is formed by action.

Include the position of the image of the gesture of same gesture-type in video streaming according to above-mentioned at least two frames, it may be determined that The shooting time of above-mentioned at least two field pictures.According to position of the gesture in above-mentioned at least two field pictures, it may be determined that gesture Situation of change.The two combination can obtain the action that above-mentioned gesture is formed.For example, it is equal that there are two field pictures in target image Including gesture-palm is divided upwards, one of them is located at the top of image, another is located at the lower section of image, then the two institute shape At action can be by top be moved to lower section, can also be by lower section be moved to top.In conjunction with above-mentioned two field pictures regarding Frequency flow in position, it may be determined that the position of shooting time image earlier in two field pictures, the gesture in the image is dynamic The initial position of work, so as to accurately determine that gesture is formed by action.

Step 504, it according to the correspondence of preset action and control command, determines and exports and above-mentioned gesture institute shape At the corresponding control command of action.

In the present embodiment, the correspondence of action and control command can be preset, above-mentioned gesture institute shape is being determined At action after, can determine the control command corresponding to above-mentioned action, then order above-mentioned control in conjunction with above-mentioned correspondence Enable output.

The method for output information that above-described embodiment of the application provides, can detect the hand for including in target image Gesture is formed by action, and then can determine the corresponding control command of above-mentioned action, further enriches man-machine interaction mode.

With further reference to Fig. 6, as the realization to method shown in above-mentioned each figure, this application provides one kind for exporting letter One embodiment of the device of breath, the device embodiment is corresponding with embodiment of the method shown in Fig. 2, which can specifically answer For in various electronic equipments.

As shown in fig. 6, the device 600 for output information of the present embodiment includes：Target image acquiring unit 601, spy Sign point extraction unit 602, gesture-type determination unit 603 and control command output unit 604.

Wherein, target image acquiring unit 601, for obtaining the target image for including gesture.

Feature point extraction unit 602, the characteristic point for extracting above-mentioned gesture.

Gesture-type determination unit 603, for according to the characteristic point extracted and the gesture identification model pre-established, Determine the gesture-type of above-mentioned gesture.Above-mentioned gesture identification model is used for the correspondence of characteristic feature point and gesture-type.

Control command output unit 604, for the correspondence in response to determination preset gesture-type and control command List includes the gesture-type of above-mentioned gesture, determines and export control command corresponding with the gesture-type of above-mentioned gesture.

In some optional realization methods of the present embodiment, above-mentioned target image acquiring unit 601 is further used for：It obtains Take the video flowing for including gesture；Using the multiple image in above-mentioned video flowing as target image.

In some optional realization methods of the present embodiment, above-mentioned apparatus 600 can also include unshowned hand in Fig. 6 Gesture detection unit, position determination unit, action determination unit and the second control command output unit.

Wherein, gestures detection unit, in response to determining that the correspondence list of gesture-type and control command is not wrapped The gesture-type of above-mentioned gesture is included, is detected consistent with the presence or absence of the gesture that at least two field pictures are included in target image.

Position determination unit, for there is the gesture one that at least two field pictures are included in target image in response to determining It causes, determines position of the above-mentioned gesture in above-mentioned at least two field pictures.

Determination unit is acted, for existing according to above-mentioned at least two field pictures position in video streaming and above-mentioned gesture The position at least two field pictures is stated, determines that above-mentioned gesture is formed by action.

Second control command output unit, for according to preset action and the correspondence of control command, determining and Output is formed by the corresponding control command of action with above-mentioned gesture.

In some optional realization methods of the present embodiment, above-mentioned target image acquiring unit 601 can also be further For：A frame image is chosen from video flowing as image to be detected；Based on image to be detected, executes and step is determined with hypograph Suddenly：Whether the contrast of detection image to be detected meets preset condition；In response to determining that it is pre- that the contrast of image to be detected meets If condition, using image to be detected as target image；In response to determining that the contrast of image to be detected is unsatisfactory for preset condition, from A unselected frame image is chosen in above-mentioned video flowing as new image to be detected, is continued to execute image and is determined step.

In some optional realization methods of the present embodiment, above-mentioned target image acquiring unit 601 can also be further For：In response to detecting that the control data of user operates, sends gesture identification model to server-side and obtain request；In response to connecing Gesture identification model is received, the target image for including gesture is obtained.

In some optional realization methods of the present embodiment, gesture identification model is obtained by deep learning model training, Then above-mentioned apparatus 600 can also include unshowned model training unit in Fig. 6, be used for：Preset training sample set is obtained, The training sample set includes at least one marked gesture-type, different shooting angles image；Extract the training The characteristic point of gesture in sample；The characteristic point of gesture in the training sample is inputted to deep learning model to be trained, is obtained The prediction result of the corresponding gesture-type of the characteristic point of gesture in training sample；Prediction result based on the gesture-type and mark Between the gesture-type of note difference iteration update deep learning model parameter so that the prediction result of the gesture-type with Difference between the gesture-type of label meets the preset condition of convergence.

In some optional realization methods of the present embodiment, the above-mentioned gesture identification model pre-established includes multiple pre- If the feature point model of gesture, the feature point model is using deep learning algorithm to including a variety of default gestures, different It is generated after the sample image progress feature point extraction of shooting angle, then above-mentioned gesture-type determination unit 603 can be further For：Determine the matching degree between extracted characteristic point and the feature point model of each default gesture；By the highest spy of matching degree The gesture-type for the gesture that the gesture-type of default gesture belonging to sign point model is included as target image.

The device for output information that above-described embodiment of the application provides, first target image acquiring unit obtain packet Target image containing gesture, then feature point extraction unit extract the characteristic point of above-mentioned gesture, then gesture-type determination unit According to the characteristic point extracted and the gesture identification model pre-established, the gesture-type of gesture, control command output are determined Unit combines the correspondence list of preset gesture-type and control command, determines and export the gesture class with above-mentioned gesture The corresponding control command of type.The device of the present embodiment can enhance the ginseng of people by identifying that the gesture of people determines control command With sense, man-machine interaction mode is enriched.

It should be appreciated that for output information device 600 described in unit 601 to unit 604 respectively with reference in figure 2 Each step in the method for description is corresponding.As a result, above with respect to for synthesizing the operation and feature that the method for song describes It is equally applicable to device 600 and unit wherein included, details are not described herein.The corresponding units of device 600 can be with server In unit cooperate to realize the scheme of the embodiment of the present application.

Below with reference to Fig. 7, it illustrates the computer systems 700 suitable for the terminal device for realizing the embodiment of the present application Structural schematic diagram.Terminal device shown in Fig. 7 is only an example, to the function of the embodiment of the present application and should not use model Shroud carrys out any restrictions.

As shown in fig. 7, computer system 700 includes central processing unit (CPU) 701, it can be read-only according to being stored in Program in memory (ROM) 702 or be loaded into the program in random access storage device (RAM) 703 from storage section 708 and Execute various actions appropriate and processing.In RAM 703, also it is stored with system 700 and operates required various programs and data. CPU 701, ROM 702 and RAM 703 are connected with each other by bus 704.Input/output (I/O) interface 705 is also connected to always Line 704.

It is connected to I/O interfaces 705 with lower component：Importation 706 including keyboard, mouse etc.；It is penetrated including such as cathode The output par, c 707 of spool (CRT), liquid crystal display (LCD) etc. and loud speaker etc.；Storage section 708 including hard disk etc.； And the communications portion 709 of the network interface card including LAN card, modem etc..Communications portion 709 via such as because The network of spy's net executes communication process.Driver 710 is also according to needing to be connected to I/O interfaces 705.Detachable media 711, such as Disk, CD, magneto-optic disk, semiconductor memory etc. are mounted on driver 710, as needed in order to be read from thereon Computer program be mounted into storage section 708 as needed.

Particularly, in accordance with an embodiment of the present disclosure, it may be implemented as computer above with reference to the process of flow chart description Software program.For example, embodiment of the disclosure includes a kind of computer program product comprising carrying is on a machine-readable medium Computer program, which includes the program code for method shown in execution flow chart.In such implementation In example, which can be downloaded and installed by communications portion 709 from network, and/or from detachable media 711 It is mounted.When the computer program is executed by central processing unit (CPU) 701, limited in execution the present processes upper State function.

It should be noted that computer-readable medium described herein can be computer-readable signal media or Computer readable storage medium either the two arbitrarily combines.Computer readable storage medium for example can be --- but Be not limited to --- electricity, magnetic, optical, electromagnetic, infrared ray or semiconductor system, device or device, or arbitrary above combination. The more specific example of computer readable storage medium can include but is not limited to：Electrical connection with one or more conducting wires, Portable computer diskette, hard disk, random access storage device (RAM), read-only memory (ROM), erasable type may be programmed read-only deposit Reservoir (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), light storage device, magnetic memory Part or above-mentioned any appropriate combination.In this application, computer readable storage medium can any be included or store The tangible medium of program, the program can be commanded the either device use or in connection of execution system, device.And In the application, computer-readable signal media may include the data letter propagated in a base band or as a carrier wave part Number, wherein carrying computer-readable program code.Diversified forms may be used in the data-signal of this propagation, including but not It is limited to electromagnetic signal, optical signal or above-mentioned any appropriate combination.Computer-readable signal media can also be computer Any computer-readable medium other than readable storage medium storing program for executing, the computer-readable medium can send, propagate or transmit use In by instruction execution system, device either device use or program in connection.Include on computer-readable medium Program code can transmit with any suitable medium, including but not limited to：Wirelessly, electric wire, optical cable, RF etc., Huo Zheshang Any appropriate combination stated.

Flow chart in attached drawing and block diagram, it is illustrated that according to the system of the various embodiments of the application, method and computer journey The architecture, function and operation in the cards of sequence product.In this regard, each box in flowchart or block diagram can generation A part for a part for one module, program segment, or code of table, the module, program segment, or code includes one or more uses The executable instruction of the logic function as defined in realization.It should also be noted that in some implementations as replacements, being marked in box The function of note can also occur in a different order than that indicated in the drawings.For example, two boxes succeedingly indicated are actually It can be basically executed in parallel, they can also be executed in the opposite order sometimes, this is depended on the functions involved.Also it to note Meaning, the combination of each box in block diagram and or flow chart and the box in block diagram and or flow chart can be with holding The dedicated hardware based system of functions or operations as defined in row is realized, or can use specialized hardware and computer instruction Combination realize.

Being described in unit involved in the embodiment of the present application can be realized by way of software, can also be by hard The mode of part is realized.Described unit can also be arranged in the processor, for example, can be described as：A kind of processor packet Include target image acquiring unit, feature point extraction unit, gesture-type determination unit and control command output unit.Wherein, this The title of a little units does not constitute the restriction to the unit itself under certain conditions, for example, target image acquiring unit may be used also To be described as " obtaining the unit of the target image comprising gesture ".

As on the other hand, present invention also provides a kind of computer-readable medium, which can be Included in device described in above-described embodiment；Can also be individualism, and without be incorporated the device in.Above-mentioned calculating Machine readable medium carries one or more program, when said one or multiple programs are executed by the device so that should Device：Obtain the target image for including gesture；Extract the characteristic point of above-mentioned gesture；It builds according to the characteristic point extracted and in advance Vertical gesture identification model determines that the gesture-type of above-mentioned gesture, above-mentioned gesture identification model are used for characteristic feature point and gesture The correspondence of type；Correspondence list in response to the preset gesture-type of determination and control command includes above-mentioned gesture Gesture-type determines and exports control command corresponding with the gesture-type of above-mentioned gesture.

Above description is only the preferred embodiment of the application and the explanation to institute's application technology principle.People in the art Member should be appreciated that invention scope involved in the application, however it is not limited to technology made of the specific combination of above-mentioned technical characteristic Scheme, while should also cover in the case where not departing from foregoing invention design, it is carried out by above-mentioned technical characteristic or its equivalent feature Other technical solutions of arbitrary combination and formation.Such as features described above has similar work(with (but not limited to) disclosed herein Can technical characteristic replaced mutually and the technical solution that is formed.

Claims

1. a kind of method for output information, including：

Obtain the target image for including gesture；

Extract the characteristic point of the gesture；

According to the characteristic point extracted and the gesture identification model pre-established, the gesture-type of the gesture is determined, it is described Gesture identification model is used for the correspondence of characteristic feature point and gesture-type；

Correspondence list in response to the preset gesture-type of determination and control command includes the gesture-type of the gesture, really Fixed and output control command corresponding with the gesture-type of the gesture.

It is described to obtain the target image for including gesture 2. according to the method described in claim 1, wherein, including：

Obtain the video flowing for including gesture；

Using the multiple image in the video flowing as the target image.

3. according to the method described in claim 2, wherein, the method further includes：

Do not include the gesture-type of the gesture in response to the correspondence list of the determination gesture-type and control command, examines It surveys consistent with the presence or absence of the gesture that at least two field pictures are included in the target image；

It is consistent in response to there is the gesture that at least two field pictures are included in the determination target image, determine the gesture in institute State the position at least two field pictures；

According to position of at least two field pictures in the video flowing and the gesture in at least two field pictures Position, determine that the gesture is formed by action；

According to preset action and the correspondence of control command, determine and export that be formed by action with the gesture corresponding Control command.

It is described to obtain the target image for including gesture 4. according to the method described in claim 2, wherein, including：

A frame image is chosen from the video flowing as image to be detected；

Based on image to be detected, executes and step is determined with hypograph：Whether the contrast of detection image to be detected meets default item Part；In response to determining that the contrast of image to be detected meets preset condition, using image to be detected as target image；

In response to determining that the contrast of image to be detected is unsatisfactory for preset condition, unselected one is chosen from the video flowing Frame image continues to execute described image and determines step as new image to be detected.

It is described to obtain the target image for including gesture 5. according to the method described in claim 1, wherein, including：

In response to detecting that the control data of user operates, sends gesture identification model to server-side and obtain request；

In response to receiving the gesture identification model, the target image for including gesture is obtained.

6. according to claim 1-5 any one of them methods, wherein the gesture identification model is by deep learning model training It obtains；And

The method further includes the training step of gesture identification model：

Preset training sample set is obtained, the training sample set includes at least one marked gesture-type, different The image of shooting angle；

Extract the characteristic point of gesture in the training sample；

The characteristic point of gesture in the training sample is inputted to deep learning model to be trained, obtains gesture in training sample The prediction result of the corresponding gesture-type of characteristic point；

Prediction result based on the gesture-type and the difference iteration between the gesture-type of label update deep learning model Parameter so that the difference between the prediction result of the gesture-type and the gesture-type of label meets preset convergence item Part.

7. according to claim 1-5 any one of them methods, wherein the gesture identification model pre-established includes multiple The feature point model of default gesture, the feature point model using deep learning algorithm to comprising a variety of default gestures, no With what is generated after the sample image progress feature point extraction of shooting angle；And

It is described to determine the gesture-type of the gesture according to the characteristic point extracted and the gesture identification model pre-established, Including：

Determine the matching degree between extracted characteristic point and the feature point model of each default gesture；

Gesture-type using the default gesture belonging to the highest characteristic point model of matching degree included as the target image The gesture-type of gesture.

8. a kind of device for output information, including：

Target image acquiring unit, for obtaining the target image for including gesture；

Feature point extraction unit, the characteristic point for extracting the gesture；

Gesture-type determination unit, for according to the characteristic point extracted and the gesture identification model pre-established, determining institute The gesture-type of gesture is stated, the gesture identification model is used for the correspondence of characteristic feature point and gesture-type；

Control command output unit, for including in response to the correspondence list of the preset gesture-type of determination and control command The gesture-type of the gesture determines and exports control command corresponding with the gesture-type of the gesture.

9. device according to claim 8, wherein the target image acquiring unit is further used for：

Obtain the video flowing for including gesture；

Using the multiple image in the video flowing as the target image.

10. device according to claim 9, wherein described device further includes：

Gestures detection unit does not include described for the correspondence list in response to the determination gesture-type and control command The gesture-type of gesture detects consistent with the presence or absence of the gesture that at least two field pictures are included in the target image；

Position determination unit, in response to there is the gesture one that at least two field pictures are included in the determination target image It causes, determines position of the gesture in at least two field pictures；

Determination unit is acted, at least position and the gesture of the two field pictures in the video flowing according in institute The position at least two field pictures is stated, determines that the gesture is formed by action；

Second control command output unit is determined and is exported for the correspondence according to preset action and control command It is formed by the corresponding control command of action with the gesture.

11. device according to claim 9, wherein the target image acquiring unit is further used for：

A frame image is chosen from the video flowing as image to be detected；

12. device according to claim 8, wherein the target image acquiring unit is further used for：

13. according to claim 8-12 any one of them devices, wherein the gesture identification model is instructed by deep learning model It gets；And

Described device further includes model training unit, is used for：

Extract the characteristic point of gesture in the training sample；

Using the characteristic point of gesture in the training sample as deep learning model to be trained is inputted, hand in training sample is obtained The prediction result of the corresponding gesture-type of characteristic point of gesture；

14. according to claim 8-12 any one of them devices, wherein the gesture identification model pre-established includes more The feature point model of kind default gesture, the feature point model using deep learning algorithm to comprising a variety of default gestures, It is generated after the sample image progress feature point extraction of different shooting angles；And

The gesture-type determination unit is further used for：

Determine the matching degree between extracted characteristic point and the feature point model of a variety of default gestures；

15. a kind of terminal, including：

One or more processors；

Storage device, for storing one or more programs,

When one or more of programs are executed by one or more of processors so that one or more of processors are real The now method as described in any in claim 1-7.

16. a kind of computer-readable medium, is stored thereon with computer program, wherein the program is realized when being executed by processor Method as described in any in claim 1-7.