CN109063653A - Image processing method and device - Google Patents

Image processing method and device

Info

Publication number: CN109063653A (application CN201810888877.7A)
Authority: CN (China)
Prior art keywords: gesture, fingertip, local image, image, frame
Legal status: Pending
Application number: CN201810888877.7A
Other languages: Chinese (zh)
Inventor: 吴兴龙
Current assignee: Beijing ByteDance Network Technology Co Ltd
Original assignee: Beijing ByteDance Network Technology Co Ltd
Application filed by Beijing ByteDance Network Technology Co Ltd
Priority: CN201810888877.7A (published as CN109063653A)
Priority: PCT/CN2018/116340 (published as WO2020029466A1)


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/20: Movements or behaviour, e.g. gesture recognition
    • G06V 40/28: Recognition of hand or arm movements, e.g. recognition of deaf sign language
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/40: Extraction of image or video features
    • G06V 10/44: Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/40: Extraction of image or video features
    • G06V 10/46: Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V 10/462: Salient features, e.g. scale invariant feature transforms [SIFT]

Abstract

Embodiments of the present application disclose an image processing method and apparatus. One specific embodiment of the method includes: obtaining a gesture box of an image containing a gesture, where the gesture box indicates the position of the gesture in the image; extracting, from the image, the local image containing the gesture indicated by the gesture box; and inputting the local image into a key point detection model to determine key point information of the fingertips of the gesture in the image, where the key point information includes fingertip position information. Embodiments of the present application detect the key points of the important fingertips, avoiding invalid detection of gesture key points that are rarely used, and thereby improving detection efficiency.

Description

Image processing method and device
Technical field
Embodiments of the present application relate to the field of computer technology, specifically to the field of Internet technology, and more particularly to an image processing method and apparatus.
Background technique
Gesture interaction is an important form of human-computer interaction and is widely used. Detecting the key points of a gesture usually means detecting dozens of key points on the hand.
In the prior art, every detection pass detects all of these key points. However, many of the key points so obtained have a low utilization rate and are largely unused in subsequent processing.
Summary of the invention
Embodiments of the present application propose an image processing method and apparatus.
In a first aspect, an embodiment of the present application provides an image processing method, comprising: obtaining a gesture box of an image containing a gesture, where the gesture box indicates the position of the gesture in the image; extracting, from the image, the local image containing the gesture indicated by the gesture box; and inputting the local image into a key point detection model to determine key point information of the fingertips of the gesture in the image, where the key point information includes fingertip position information.
In some embodiments, the key point information further includes, for each fingertip of the gesture, fingertip visibility information indicating whether that fingertip is visible.
In some embodiments, obtaining the gesture box of the image containing the gesture comprises: performing gesture detection on the image to determine the gesture box indicating the position of the gesture in the image.
In some embodiments, the key point detection model is a convolutional neural network.
In some embodiments, the key point detection model is trained by: obtaining a training sample set, where each sample in the set includes a local image containing a gesture, together with the fingertip position information and fingertip visibility information of the gesture contained in that local image; and training an initial key point detection model with the local image as input and with the fingertip position information and fingertip visibility information of the contained gesture as output, to obtain the key point detection model.
In a second aspect, an embodiment of the present application provides an image processing apparatus, comprising: an obtaining unit configured to obtain a gesture box of an image containing a gesture, where the gesture box indicates the position of the gesture in the image; an extraction unit configured to extract, from the image, the local image containing the gesture indicated by the gesture box; and a determination unit configured to input the local image into a key point detection model to determine key point information of the fingertips of the gesture in the image, where the key point information includes fingertip position information.
In some embodiments, the key point information further includes, for each fingertip of the gesture, fingertip visibility information indicating whether that fingertip is visible.
In some embodiments, the obtaining unit is further configured to perform gesture detection on the image to determine the gesture box indicating the position of the gesture in the image.
In some embodiments, the key point detection model is a convolutional neural network.
In some embodiments, the key point detection model is trained by: obtaining a training sample set, where each sample in the set includes a local image containing a gesture, together with the fingertip position information and fingertip visibility information of the gesture contained in that local image; and training an initial key point detection model with the local image as input and with the fingertip position information and fingertip visibility information of the contained gesture as output, to obtain the key point detection model.
In a third aspect, an embodiment of the present application provides an electronic device, comprising: one or more processors; and a storage apparatus storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method of any embodiment of the image processing method.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the method of any embodiment of the image processing method.
In the image processing scheme provided by embodiments of the present application, a gesture box of an image containing a gesture is first obtained, where the gesture box indicates the position of the gesture in the image. The local image containing the gesture indicated by the gesture box is then extracted from the image. Finally, the local image is input into a key point detection model to determine key point information of the fingertips of the gesture in the image, where the key point information includes fingertip position information. Embodiments of the present application detect the key points of the important fingertips, avoiding invalid detection of rarely used gesture key points and improving detection efficiency.
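The three-step scheme just summarized (obtain a gesture box, extract the local image, run key point detection) can be sketched end to end as follows. Every function body here is a toy stand-in for the patent's components, not an actual detector or trained model:

```python
import numpy as np

def get_gesture_box(image):
    # Hypothetical detector stand-in: treat the non-zero region of the
    # image as the hand and return its bounding box (x1, y1, x2, y2).
    ys, xs = np.nonzero(image)
    return (int(xs.min()), int(ys.min()), int(xs.max()) + 1, int(ys.max()) + 1)

def crop_local_image(image, box):
    # The edges of the gesture box coincide with the edges of the crop.
    x1, y1, x2, y2 = box
    return image[y1:y2, x1:x2]

def detect_fingertips(local_image):
    # Stand-in for the key point detection model: five (x, y, visible)
    # triples, one per fingertip, all placed at the crop centre.
    h, w = local_image.shape[:2]
    return [(w // 2, h // 2, True)] * 5

image = np.zeros((100, 100), dtype=np.uint8)
image[20:60, 30:80] = 1                      # pretend hand region
box = get_gesture_box(image)                 # (30, 20, 80, 60)
local = crop_local_image(image, box)         # shape (40, 50)
keypoints = detect_fingertips(local)
print(box, local.shape, len(keypoints))
```

The key point here is the data flow: only the cropped local image, not the whole frame, reaches the key point model, and only five fingertip key points come back.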
Detailed description of the invention
Other features, objects, and advantages of the present application will become more apparent from the following detailed description of non-restrictive embodiments, read with reference to the accompanying drawings:
Fig. 1 is an exemplary system architecture diagram to which the present application can be applied;
Fig. 2 is a flowchart of one embodiment of the image processing method according to the present application;
Fig. 3 is a schematic diagram of an application scenario of the image processing method according to the present application;
Fig. 4 is a structural schematic diagram of one embodiment of the image processing apparatus according to the present application;
Fig. 5 is a structural schematic diagram of a computer system adapted to implement the electronic device of embodiments of the present application.
Specific embodiment
The present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here only explain the related invention and do not limit it. It should also be noted that, for ease of description, only the parts relevant to the related invention are shown in the drawings.
It should be noted that, in the absence of conflict, the embodiments of the present application and the features in the embodiments may be combined with each other. The present application is described in detail below with reference to the accompanying drawings and in conjunction with the embodiments.
Fig. 1 shows an exemplary system architecture 100 to which embodiments of the image processing method or image processing apparatus of the present application can be applied.
As shown in Fig. 1, the system architecture 100 may include terminal devices 101, 102, and 103, a network 104, and a server 105. The network 104 serves as the medium providing communication links between the terminal devices 101, 102, and 103 and the server 105. The network 104 may include various connection types, such as wired or wireless communication links or fiber optic cables.
A user may use the terminal devices 101, 102, and 103 to interact with the server 105 through the network 104, to receive or send messages and the like. Various communication client applications may be installed on the terminal devices 101, 102, and 103, such as image recognition applications, short video applications, search applications, instant messaging tools, email clients, and social platform software.
The terminal devices 101, 102, and 103 may be hardware or software. When they are hardware, they may be various electronic devices with a display screen, including but not limited to smartphones, tablet computers, e-book readers, laptop portable computers, and desktop computers. When they are software, they may be installed in the electronic devices listed above, and may be implemented as multiple pieces of software or software modules (for example, to provide distributed services) or as a single piece of software or software module. No specific limitation is made here.
The server 105 may be a server providing various services, such as a background server providing support to the terminal devices 101, 102, and 103. The background server may analyze and otherwise process received data such as images, and feed the processing result (for example, the key point information of the fingertips of the gesture contained in an image) back to the terminal devices.
It should be noted that the image processing method provided by embodiments of the present application may be executed by the server 105 or by the terminal devices 101, 102, and 103; correspondingly, the image processing apparatus may be arranged in the server 105 or in the terminal devices 101, 102, and 103.
It should be understood that the numbers of terminal devices, networks, and servers in Fig. 1 are only illustrative. Any number of terminal devices, networks, and servers may be provided according to implementation needs.
Continuing to refer to Fig. 2, a process 200 of one embodiment of the image processing method according to the present application is shown. The image processing method comprises the following steps:
Step 201: obtain the gesture box of an image containing a gesture, where the gesture box indicates the position of the gesture in the image.
In this embodiment, the executing body on which the image processing method runs (for example, the server or terminal device shown in Fig. 1) may obtain the gesture box of the image containing the gesture locally or from other electronic devices. The gesture box delimits the gesture to indicate its position. Specifically, the gesture box may be a rectangle. The executing body may use the gesture box only to store the position information of the box without displaying it in the image; alternatively, the gesture box may also be visibly presented in the image. The executing body may obtain the gesture box in various ways. For example, a predetermined gesture box may be obtained locally or from other electronic devices. In addition, if the image is a frame in a video, the outline of the region covered by the gesture box determined for the previous frame may be used as the gesture box of this frame.
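For the video case just described, where the previous frame's gesture box is reused for the current frame, one simple illustrative policy is to expand the previous box by a margin to tolerate hand motion between frames. The 20% margin and the clamping behaviour are assumptions for illustration, not specified by the patent:

```python
def propagate_box(prev_box, image_w, image_h, margin=0.2):
    # Expand the previous frame's box by a relative margin and clamp it
    # to the image bounds; the 20% margin is an illustrative choice.
    x1, y1, x2, y2 = prev_box
    dw = int((x2 - x1) * margin)
    dh = int((y2 - y1) * margin)
    return (max(0, x1 - dw), max(0, y1 - dh),
            min(image_w, x2 + dw), min(image_h, y2 + dh))

print(propagate_box((30, 20, 80, 60), 100, 100))  # (20, 12, 90, 68)
```

Reusing the previous box this way avoids running a full detector on every frame, trading a slightly looser crop for speed.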
In some optional implementations of this embodiment, step 201 may include:
performing gesture detection on the image to determine the gesture box indicating the position of the gesture in the image.
In these optional implementations, the executing body may perform gesture detection on the image and use the bounding box generated during detection as the gesture box. A more accurate gesture box can be obtained through detection. Specifically, the image may be input into a convolutional neural network capable of gesture detection, and the gesture box obtained from the network's output.
Step 202: extract, from the image, the local image containing the gesture indicated by the gesture box.
In this embodiment, the executing body may extract from the image the local image containing the gesture indicated by the gesture box. Specifically, the executing body may crop out the local image within the gesture box, so that the edges of the gesture box coincide with the edges of the local image.
Step 203: input the local image into the key point detection model to determine the key point information of the fingertips of the gesture in the image, where the key point information includes fingertip position information.
In this embodiment, the executing body may input the extracted local image into the key point detection model and obtain the key point information output by the model. This is the key point information of the fingertips of the gesture in the image, and may include fingertip position information, that is, the coordinates of the fingertips in the local image (or in the full image).
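Since the fingertip position information may be expressed in the coordinates of the local image, a caller working in the original image would offset the result by the gesture box's origin. A small illustrative helper (names are assumptions):

```python
def local_to_image_coords(tip_xy, box):
    # A fingertip at (x, y) in the cropped local image sits at
    # (x + x1, y + y1) in the original image, where (x1, y1) is the
    # top-left corner of the gesture box used for the crop.
    x, y = tip_xy
    x1, y1, _, _ = box
    return (x + x1, y + y1)

print(local_to_image_coords((5, 7), (30, 20, 80, 60)))  # (35, 27)
```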
In practice, the key point detection model may be obtained by training a classifier such as a Support Vector Machine (SVM), a Naive Bayesian Model (NBM), or a Convolutional Neural Network (CNN). In addition, the key point detection model may also be obtained by training certain classification functions (such as the softmax function) in advance.
In some optional implementations of this embodiment, the key point detection model is a convolutional neural network.
In these optional implementations, the convolutional neural network contains multiple convolutional layers and has strong processing capability. It can not only accurately classify whether each fingertip is visible, but also accurately determine the fingertip position information.
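The patent does not specify how the convolutional network encodes its output. One common encoding, shown here purely as an illustration, is one heatmap per fingertip: the peak location gives the position, and a peak value below a threshold marks the fingertip as not visible. All names and the threshold are assumptions:

```python
import numpy as np

def decode_heatmaps(heatmaps, threshold=0.5):
    # heatmaps: array of shape (5, H, W), one map per fingertip.
    # The peak of each map gives the (x, y) position; a peak value
    # below `threshold` marks that fingertip as not visible.
    results = []
    for hm in heatmaps:
        y, x = np.unravel_index(np.argmax(hm), hm.shape)
        results.append((int(x), int(y), bool(hm[y, x] >= threshold)))
    return results

hms = np.zeros((5, 8, 8), dtype=np.float32)
hms[0, 2, 3] = 0.9   # fingertip 1 (thumb) clearly detected at (3, 2)
hms[4, 6, 1] = 0.8   # fingertip 5 (little finger) detected at (1, 6)
print(decode_heatmaps(hms))
```

With these toy heatmaps the decoder reports fingertips 1 and 5 as visible with their positions, and fingertips 2, 3, and 4 as invisible, which matches the shape of the example result discussed later in the description.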
In some optional implementations of this embodiment, the key point detection model is trained by the following steps:
obtaining a training sample set, where each sample in the set includes a local image containing a gesture, together with the fingertip position information and fingertip visibility information of the gesture contained in that local image;
training an initial key point detection model with the local image as input and with the fingertip position information and fingertip visibility information of the gesture contained in the local image as output, to obtain the key point detection model.
In these optional implementations, the initial key point detection model is the key point detection model still to be trained. After training with the samples, inputting a local image into the key point detection model yields the fingertip position information and fingertip visibility information, output by the model, of the gesture contained in that local image. By training with a large number of samples, an accurate key point detection model can be obtained.
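A minimal sketch of this supervised training procedure, with a zero-initialised linear model standing in for the initial key point detection model and random vectors standing in for the local images and their labels. Everything here is illustrative; the patent's model would be a convolutional network:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "training sample set": each row of X stands in for a (flattened)
# local image containing a gesture; each row of Y packs the labels,
# i.e. five fingertip (x, y) positions plus five visibility flags.
X = rng.normal(size=(64, 16))
true_w = rng.normal(size=(16, 15))      # 15 = 5 * (x, y) + 5 visibility
Y = X @ true_w

# "Initial key point detection model": a zero-initialised linear map,
# trained by plain gradient descent on squared error.
w = np.zeros((16, 15))
for _ in range(2000):
    grad = X.T @ (X @ w - Y) / len(X)
    w -= 0.1 * grad

mse = float(np.mean((X @ w - Y) ** 2))
print(mse)
```

The loop mirrors the step in the text: local images as input, fingertip position and visibility labels as target output, and the initial model's parameters adjusted until its predictions match.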
In some optional implementations of this embodiment, the key point information further includes, for each fingertip of the gesture, fingertip visibility information indicating whether that fingertip is visible.
In these optional implementations, one gesture corresponds to five fingertips, but some gestures may not present all fingertips in the image; that is, different gestures may show different numbers of fingertips. Through the fingertip visibility information, it can be determined which fingertips are visible in the image and which are not. Specifically, the executing body determines, for each fingertip of each gesture, whether that fingertip is visible.
In this way, in this embodiment, the executing body can determine the fingertip position information and fingertip visibility information of the gesture contained in the local image. For example, suppose there is a gesture in the local image, and the fingertips of the five fingers from thumb to little finger are numbered 1, 2, 3, 4, and 5 respectively. The key point detection model may output a detection result indicating that fingertips 1 and 5 are visible and fingertips 2, 3, and 4 are invisible, together with the position information of fingertips 1 and 5.
The fingertip key points are the important, frequently used key points. These implementations can accurately indicate which fingertip key points are visible and which are not, so that subsequent processing can handle specified fingertips in the image according to their visibility, broadening the applications of the detection result. For example, a pattern may be added in the image at the position of the thumb fingertip, by way of covering or superposition.
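As a sketch of the "covering" mode mentioned above, assuming images are plain pixel arrays; the function name and the pattern are illustrative, not from the patent:

```python
import numpy as np

def overlay_pattern(image, tip_xy, pattern):
    # "Covering" mode: replace the pixels at the fingertip position
    # with the pattern, leaving the input image unmodified.
    x, y = tip_xy
    h, w = pattern.shape[:2]
    out = image.copy()
    out[y:y + h, x:x + w] = pattern
    return out

img = np.zeros((10, 10), dtype=np.uint8)
pat = np.full((2, 2), 255, dtype=np.uint8)   # tiny white sticker
result = overlay_pattern(img, (3, 4), pat)   # place at thumb fingertip
print(int(result[4, 3]), int(result.sum()))  # 255 1020
```

A "superposition" mode would instead blend the pattern with the underlying pixels (for example, an alpha-weighted sum) rather than replacing them.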
Continuing to refer to Fig. 3, Fig. 3 is a schematic diagram of an application scenario of the image processing method according to this embodiment. In the application scenario of Fig. 3, the executing body 301 may obtain, locally or from other electronic devices, the gesture box 302 of an image containing a gesture, where the gesture box indicates the position of the gesture in the image. The executing body 301 extracts from the image the local image 303 containing the gesture indicated by the gesture box, and inputs the local image 303 into the key point detection model to determine the key point information 304 of the fingertips of the gesture in the image, where the key point information includes fingertip position information.
The method provided by the above embodiments of the present application detects the key points of the important fingertips, avoiding invalid detection of gesture key points with a very low frequency of use, and improving detection efficiency.
With further reference to Fig. 4, as an implementation of the methods shown in the above figures, the present application provides one embodiment of an image processing apparatus. This apparatus embodiment corresponds to the method embodiment shown in Fig. 2, and the apparatus can be applied to various electronic devices.
As shown in Fig. 4, the image processing apparatus 400 of this embodiment includes an obtaining unit 401, an extraction unit 402, and a determination unit 403. The obtaining unit 401 is configured to obtain the gesture box of an image containing a gesture, where the gesture box indicates the position of the gesture in the image; the extraction unit 402 is configured to extract, from the image, the local image containing the gesture indicated by the gesture box; and the determination unit 403 is configured to input the local image into the key point detection model to determine the key point information of the fingertips of the gesture in the image, where the key point information includes fingertip position information.
In some embodiments, the obtaining unit 401 of the image processing apparatus 400 may obtain the gesture box of the image containing the gesture locally or from other electronic devices. The gesture box delimits the gesture to indicate its position. Specifically, the gesture box may be a rectangle. The executing body may use the gesture box only to store the position information of the box without displaying it in the image; alternatively, the gesture box may also be visibly presented in the image.
In some embodiments, the extraction unit 402 extracts from the image the local image containing the gesture indicated by the gesture box. Specifically, the executing body may crop out the local image within the gesture box, so that the edges of the gesture box coincide with the edges of the local image.
In some embodiments, the determination unit 403 may input the extracted local image into the key point detection model and obtain the key point information output by the model. This is the key point information of the fingertips of the gesture in the image, and may include fingertip position information, that is, the coordinates of the fingertips in the local image (or in the full image).
In some optional implementations of this embodiment, the key point information further includes, for each fingertip of the gesture, fingertip visibility information indicating whether that fingertip is visible.
In some optional implementations of this embodiment, the obtaining unit is further configured to: perform gesture detection on the image to determine the gesture box indicating the position of the gesture in the image.
In some optional implementations of this embodiment, the key point detection model is a convolutional neural network.
In some optional implementations of this embodiment, the key point detection model is trained by the following steps: obtaining a training sample set, where each sample in the set includes a local image containing a gesture, together with the fingertip position information and fingertip visibility information of the gesture contained in that local image; and training an initial key point detection model with the local image as input and with the fingertip position information and fingertip visibility information of the contained gesture as output, to obtain the key point detection model.
Referring now to Fig. 5, it shows a structural schematic diagram of a computer system 500 of an electronic device suitable for implementing embodiments of the present application. The electronic device shown in Fig. 5 is only an example and should not impose any limitation on the functions and scope of use of embodiments of the present application.
As shown in Fig. 5, the computer system 500 includes a central processing unit (CPU and/or GPU) 501, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 502 or a program loaded from a storage section 508 into a random access memory (RAM) 503. The RAM 503 also stores various programs and data required for the operation of the system 500. The central processing unit 501, the ROM 502, and the RAM 503 are connected to each other through a bus 504. An input/output (I/O) interface 505 is also connected to the bus 504.
The following components are connected to the I/O interface 505: a storage section 506 including a hard disk and the like; and a communications section 507 including a network interface card such as a LAN card or a modem. The communications section 507 performs communication processing via a network such as the Internet. A driver 508 is also connected to the I/O interface 505 as needed. A removable medium 509, such as a magnetic disk, an optical disc, a magneto-optical disc, or a semiconductor memory, is mounted on the driver 508 as needed, so that a computer program read from it can be installed into the storage section 506 as needed.
In particular, according to embodiments of the present disclosure, the process described above with reference to the flowchart may be implemented as a computer software program. For example, an embodiment of the present disclosure includes a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for executing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network through the communications section 507, and/or installed from the removable medium 509. When the computer program is executed by the central processing unit (CPU and/or GPU) 501, the above-described functions defined in the method of the present application are executed. It should be noted that the computer-readable medium of the present application may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example, but is not limited to, an electric, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection with one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present application, a computer-readable storage medium may be any tangible medium that contains or stores a program, which may be used by or in connection with an instruction execution system, apparatus, or device.
In the present application, a computer-readable signal medium may include a data signal propagated in a baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated data signal may take various forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, which can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device. The program code contained on the computer-readable medium may be transmitted with any suitable medium, including but not limited to: wireless, wire, optical cable, RF, or any suitable combination of the above.
The flowcharts and block diagrams in the accompanying drawings illustrate the possible architectures, functions, and operations of the systems, methods, and computer program products according to various embodiments of the present application. In this regard, each box in a flowchart or block diagram may represent a module, a program segment, or a part of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that in some alternative implementations, the functions noted in the boxes may occur in an order different from that shown in the drawings. For example, two boxes shown in succession may in fact be executed substantially in parallel, and they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each box in the block diagrams and/or flowcharts, and combinations of boxes therein, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of special-purpose hardware and computer instructions.
The units described in embodiments of the present application may be implemented by means of software or by means of hardware. The described units may also be arranged in a processor; for example, it may be described as: a processor including an obtaining unit, an extraction unit, and a determination unit. The names of these units do not, under certain conditions, constitute a limitation on the units themselves; for example, the obtaining unit may also be described as "a unit for obtaining the gesture box of an image containing a gesture".
As another aspect, the present application also provides a computer-readable medium, which may be included in the apparatus described in the above embodiments, or may exist alone without being assembled into the apparatus. The computer-readable medium carries one or more programs which, when executed by the apparatus, cause the apparatus to: obtain the gesture box of an image containing a gesture, where the gesture box indicates the position of the gesture in the image; extract, from the image, the local image containing the gesture indicated by the gesture box; and input the local image into the key point detection model to determine the key point information of the fingertips of the gesture in the image, where the key point information includes fingertip position information.
The above description is merely a preferred embodiment of the present application and an explanation of the technical principles applied. Those skilled in the art should understand that the scope of the invention involved in the present application is not limited to technical solutions formed by the specific combination of the above technical features, and should also cover, without departing from the above inventive concept, other technical solutions formed by any combination of the above technical features or their equivalents, for example, technical solutions formed by replacing the above features with (but not limited to) technical features disclosed herein that have similar functions.
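The processing flow described above (obtain a gesture box, crop the local image it indicates, run a keypoint detection model on the crop) can be sketched as follows. This is a minimal illustrative sketch, not the patent's implementation: the detector and keypoint model are stand-in stubs, the toy image format is invented for the example, and all function names are hypothetical.

```python
# Illustrative sketch of the three-step pipeline: (1) detect a gesture box,
# (2) extract the local image, (3) run a keypoint model that returns fingertip
# locations and visibility. Detector and model are stubs (assumptions), not
# the patent's actual components.
from typing import Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height)
Image = List[List[int]]          # toy grayscale image as nested lists


def detect_gesture_box(image: Image) -> Box:
    """Stub gesture detector: returns a fixed box.
    A real system would run an object detector here."""
    return (1, 1, 3, 2)


def crop(image: Image, box: Box) -> Image:
    """Extract the local image indicated by the gesture box."""
    x, y, w, h = box
    return [row[x:x + w] for row in image[y:y + h]]


def keypoint_model(local_image: Image) -> List[Dict]:
    """Stub keypoint detection model: one fingertip entry, each with a
    location (relative to the crop) and a visibility flag."""
    h = len(local_image)
    w = len(local_image[0]) if h else 0
    return [{"fingertip": (w // 2, 0), "visible": True}]


def process(image: Image) -> List[Dict]:
    box = detect_gesture_box(image)      # step 1: gesture box
    local_image = crop(image, box)       # step 2: local image
    return keypoint_model(local_image)   # step 3: fingertip keypoints


if __name__ == "__main__":
    img = [[0] * 6 for _ in range(5)]
    print(process(img))  # -> [{'fingertip': (1, 0), 'visible': True}]
```

The division into three functions mirrors the acquisition/extraction/determination units described above; in practice each stub would be replaced by a trained detector and a convolutional keypoint network.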

Claims (12)

1. An image processing method, comprising:
obtaining a gesture box of an image containing a gesture, wherein the gesture box is used to indicate the position of the gesture contained in the image;
extracting, from the image, a local image containing the gesture as indicated by the gesture box;
inputting the local image into a keypoint detection model to determine keypoint information of the fingertips of the gesture contained in the image, wherein the keypoint information includes fingertip location information.
2. The method according to claim 1, wherein the keypoint information further includes: for each fingertip of the gesture, fingertip visibility information indicating whether the fingertip is visible.
3. The method according to claim 1, wherein obtaining the gesture box of the image containing the gesture comprises:
performing gesture detection on the image to determine the gesture box indicating the position of the gesture contained in the image.
4. The method according to claim 1, wherein the keypoint detection model is a convolutional neural network.
5. The method according to claim 2, wherein the keypoint detection model is trained by the following steps:
obtaining a training sample set, wherein each sample in the training sample set includes a local image containing a gesture, together with the fingertip location information and fingertip visibility information of the gesture contained in that local image;
training an initial keypoint detection model, using the local images as input and the fingertip location information and fingertip visibility information of the gestures contained in the local images as output, to obtain the keypoint detection model.
6. An image processing apparatus, comprising:
an acquisition unit, configured to obtain a gesture box of an image containing a gesture, wherein the gesture box is used to indicate the position of the gesture contained in the image;
an extraction unit, configured to extract, from the image, a local image containing the gesture as indicated by the gesture box;
a determination unit, configured to input the local image into a keypoint detection model to determine keypoint information of the fingertips of the gesture contained in the image, wherein the keypoint information includes fingertip location information.
7. The apparatus according to claim 6, wherein the keypoint information further includes: for each fingertip of the gesture, fingertip visibility information indicating whether the fingertip is visible.
8. The apparatus according to claim 6, wherein the acquisition unit is further configured to:
perform gesture detection on the image to determine the gesture box indicating the position of the gesture contained in the image.
9. The apparatus according to claim 6, wherein the keypoint detection model is a convolutional neural network.
10. The apparatus according to claim 7, wherein the keypoint detection model is trained by the following steps:
obtaining a training sample set, wherein each sample in the training sample set includes a local image containing a gesture, together with the fingertip location information and fingertip visibility information of the gesture contained in that local image;
training an initial keypoint detection model, using the local images as input and the fingertip location information and fingertip visibility information of the gestures contained in the local images as output, to obtain the keypoint detection model.
11. An electronic device, comprising:
one or more processors; and
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method according to any one of claims 1 to 5.
12. A computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the method according to any one of claims 1 to 5.
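The training procedure recited in claims 5 and 10 (assemble a sample set of local images paired with fingertip location and visibility labels, then fit an initial keypoint detection model on input/output pairs) can be sketched as follows. This is a toy illustration under stated assumptions: the "model" is a one-feature linear regressor fitted by gradient descent, whereas the claims recite a convolutional neural network, and the sample generator and all names are invented for the example.

```python
# Toy sketch of supervised training on (local image, fingertip label) pairs.
# NOT the claimed CNN: a synthetic 1-D feature stands in for the local image,
# and a linear model stands in for the keypoint detection model.
import random


def make_sample_set(n=50, seed=0):
    """Synthetic samples: a scalar 'local image' feature (mean intensity)
    whose fingertip x-coordinate follows a known linear relation, plus a
    visibility bit (ignored by this toy regressor)."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        intensity = rng.uniform(0.0, 1.0)
        fingertip_x = 2.0 * intensity + 0.5  # ground-truth relation
        visible = intensity > 0.1
        samples.append((intensity, fingertip_x, visible))
    return samples


def train(samples, epochs=500, lr=0.1):
    """Fit fingertip_x ~ w * intensity + b by gradient descent on MSE,
    starting from an 'initial model' (w, b) = (0, 0)."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        gw = gb = 0.0
        for intensity, target, _visible in samples:
            err = (w * intensity + b) - target
            gw += 2 * err * intensity
            gb += 2 * err
        w -= lr * gw / len(samples)
        b -= lr * gb / len(samples)
    return w, b


def mse(samples, w, b):
    return sum(((w * x + b) - t) ** 2 for x, t, _ in samples) / len(samples)


if __name__ == "__main__":
    data = make_sample_set()
    w, b = train(data)
    print(round(w, 2), round(b, 2))  # approaches the true values 2.0 and 0.5
```

In a real implementation the visibility bit would be a second output head of the network (e.g. trained with a classification loss alongside the location regression), rather than being ignored as it is here.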
CN201810888877.7A 2018-08-07 2018-08-07 Image processing method and device Pending CN109063653A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201810888877.7A CN109063653A (en) 2018-08-07 2018-08-07 Image processing method and device
PCT/CN2018/116340 WO2020029466A1 (en) 2018-08-07 2018-11-20 Image processing method and apparatus


Publications (1)

Publication Number Publication Date
CN109063653A true CN109063653A (en) 2018-12-21

Family

ID=64832058

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810888877.7A Pending CN109063653A (en) 2018-08-07 2018-08-07 Image processing method and device

Country Status (2)

Country Link
CN (1) CN109063653A (en)
WO (1) WO2020029466A1 (en)


Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112183388A (en) * 2020-09-30 2021-01-05 北京字节跳动网络技术有限公司 Image processing method, apparatus, device and medium
CN112949437A (en) * 2021-02-21 2021-06-11 深圳市优必选科技股份有限公司 Gesture recognition method, gesture recognition device and intelligent equipment
CN115134506A (en) * 2021-03-26 2022-09-30 阿里巴巴新加坡控股有限公司 Camera shooting picture adjusting method, video picture processing method, device and system

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105718879A (en) * 2016-01-19 2016-06-29 华南理工大学 Free-scene egocentric-vision finger key point detection method based on depth convolution nerve network
CN108230383A (en) * 2017-03-29 2018-06-29 北京市商汤科技开发有限公司 Hand three-dimensional data determines method, apparatus and electronic equipment

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2014009561A2 (en) * 2012-07-13 2014-01-16 Softkinetic Software Method and system for human-to-computer gesture based simultaneous interactions using singular points of interest on a hand


Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111382624A (en) * 2018-12-28 2020-07-07 杭州海康威视数字技术股份有限公司 Action recognition method, device, equipment and readable storage medium
CN111382624B (en) * 2018-12-28 2023-08-11 杭州海康威视数字技术股份有限公司 Action recognition method, device, equipment and readable storage medium
CN111460858A (en) * 2019-01-21 2020-07-28 杭州易现先进科技有限公司 Method and device for determining pointed point in image, storage medium and electronic equipment
CN111460858B (en) * 2019-01-21 2024-04-12 杭州易现先进科技有限公司 Method and device for determining finger tip point in image, storage medium and electronic equipment
CN110070063A (en) * 2019-04-29 2019-07-30 北京字节跳动网络技术有限公司 Action identification method, device and the electronic equipment of target object
WO2020220809A1 (en) * 2019-04-29 2020-11-05 北京字节跳动网络技术有限公司 Action recognition method and device for target object, and electronic apparatus
GB2598015A (en) * 2019-04-29 2022-02-16 Beijing Bytedance Network Tech Co Ltd Action recognition method and device for target object, and electronic apparatus
GB2598015B (en) * 2019-04-29 2023-07-05 Beijing Bytedance Network Tech Co Ltd Action recognition method and device for target object, and electronic apparatus
CN110287891A (en) * 2019-06-26 2019-09-27 北京字节跳动网络技术有限公司 Gestural control method, device and electronic equipment based on human body key point
CN112262393A (en) * 2019-12-23 2021-01-22 商汤国际私人有限公司 Gesture recognition method and device, electronic equipment and storage medium
CN113792651A (en) * 2021-09-13 2021-12-14 广州广电运通金融电子股份有限公司 Gesture interaction method, device and medium integrating gesture recognition and fingertip positioning
CN113792651B (en) * 2021-09-13 2024-04-05 广州广电运通金融电子股份有限公司 Gesture interaction method, device and medium integrating gesture recognition and fingertip positioning

Also Published As

Publication number Publication date
WO2020029466A1 (en) 2020-02-13


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20181221