CN114995652A - Screen control method and user terminal - Google Patents

Screen control method and user terminal

Info

Publication number
CN114995652A
CN114995652A (application CN202210748642.4A)
Authority
CN
China
Prior art keywords
screen
movement
manipulation
mouse pointer
area
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210748642.4A
Other languages
Chinese (zh)
Inventor
霍飞龙
杭云
郭宁
施唯佳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianyi Digital Life Technology Co Ltd
Original Assignee
Tianyi Digital Life Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianyi Digital Life Technology Co Ltd filed Critical Tianyi Digital Life Technology Co Ltd
Priority to CN202210748642.4A priority Critical patent/CN114995652A/en
Publication of CN114995652A publication Critical patent/CN114995652A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/03Arrangements for converting the position or the displacement of a member into a coded form
    • G06F3/033Pointing devices displaced or positioned by the user, e.g. mice, trackballs, pens or joysticks; Accessories therefor
    • G06F3/0354Pointing devices displaced or positioned by the user, e.g. mice, trackballs, pens or joysticks; Accessories therefor with detection of 2D relative movements between the device, or an operating part thereof, and a plane or surface, e.g. 2D mice, trackballs, pens or pucks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/13Edge detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/70Determining position or orientation of objects or cameras
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10004Still image; Photographic image
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30196Human being; Person
    • G06T2207/30201Face

Abstract

The invention provides a screen control method and a user terminal. The method comprises the following steps: displaying a manipulation area on the screen in response to an instruction to start touchless operation; capturing an image through a camera and displaying the captured image on the screen; detecting whether a feature object is present in the image within the manipulation area; when the feature object is detected to be present in the manipulation area for longer than a first time threshold, determining the feature object to be the touchless operation object; and controlling the movement of a mouse pointer on the screen according to the movement of the touchless operation object within the manipulation area.

Description

Screen control method and user terminal
Technical Field
The invention relates to the technical field of video networking, and in particular to a method and system for controlling a user terminal screen based on image recognition in place of a mouse pointer.
Background
One of the main motivations for technological development is to better serve people. Electronic devices have long been deeply embedded in everyday life, touching everything from clothing and food to housing and travel. People use these devices by controlling them to access the corresponding functions. Existing methods of controlling electronic devices generally rely on external peripherals or a touch screen. Another way of controlling a device is by recognizing specific gestures in the air, which relies on machine learning to recognize particular, predefined gestures.
These approaches not only add cost, but also require the user to operate by hand: both external peripherals and touchless gesture control depend on hand movements. If the user has limited mobility or cannot use the hands because of a physical impairment, the existing control means become entirely ineffective.
Disclosure of Invention
This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the detailed description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.
In order to solve the above problems, the present invention provides a method for controlling a device screen based on image recognition in place of a mouse pointer, and a corresponding user terminal. The invention obtains a real-time picture through a camera and displays it on the screen of the device. A circle or other figure is then drawn on the screen to delimit a region; within this region, the user may present any object with distinct features to act as a pointer in place of the mouse. When the user moves the object, the mouse pointer moves synchronously. The object can take many forms, for example the user's nose or mouth, a mole at some position on the face, a table-tennis ball, or a glass bead. This solves the problem that users with certain injuries or physical impairments cannot control a terminal device with their hands, and removes the limitation of existing touchless operation schemes that can only be driven by specific gestures and actions.
According to an aspect of the present invention, there is provided a screen control method comprising:
displaying a manipulation area on the screen in response to an instruction to start touchless operation;
capturing an image through a camera and displaying the captured image on the screen;
detecting whether a feature object is present in the image within the manipulation area;
when the feature object is detected to be present in the manipulation area for longer than a first time threshold, determining the feature object to be the touchless operation object; and
controlling the movement of a mouse pointer on the screen according to the movement of the touchless operation object within the manipulation area.
According to another embodiment of the present invention, the method further comprises:
when more than one feature object is detected to be present in the manipulation area for longer than the first time threshold, determining which feature object is the touchless operation object according to a user selection or a preset rule.
According to a further embodiment of the present invention, controlling the movement of the mouse pointer on the screen according to the movement of the touchless operation object within the manipulation area comprises:
identifying the center point coordinates of the touchless operation object in the manipulation area; and
moving the position of the mouse pointer on the screen in proportion to the movement distance of the center point coordinates within the manipulation area.
According to a further embodiment of the present invention, controlling the movement of the mouse pointer on the screen according to the movement of the touchless operation object within the manipulation area further comprises:
when the touchless operation object is detected to move out of the manipulation area and then re-enter it, continuing to control the movement of the mouse pointer on the screen according to the movement of the touchless operation object after it re-enters the manipulation area.
According to another embodiment of the present invention, the method further comprises:
when the touchless operation object is detected to remain stationary in the manipulation area for a second time threshold, performing a mouse double-click operation at the current position of the mouse pointer on the screen.
According to still another embodiment of the present invention, the method further comprises:
clearing information associated with the touchless operation object when the touchless operation object is detected to have moved out of the manipulation area for a third time threshold.
According to another aspect of the present invention, there is provided a user terminal comprising:
a camera for capturing an image;
a display unit for displaying a screen; and
a control unit configured to:
displaying a manipulation area on the display unit in response to an instruction to start touchless operation;
capturing the image through the camera and displaying the captured image on the screen;
detecting whether a feature object is present in the image within the manipulation area;
when the feature object is detected to be present in the manipulation area for longer than a first time threshold, determining the feature object to be the touchless operation object; and
controlling the movement of a mouse pointer on the screen according to the movement of the touchless operation object within the manipulation area.
According to a further embodiment of the present invention, controlling the movement of the mouse pointer on the screen according to the movement of the touchless operation object within the manipulation area comprises:
identifying the center point coordinates of the touchless operation object in the manipulation area; and
moving the position of the mouse pointer on the screen in proportion to the movement distance of the center point coordinates within the manipulation area.
According to a further embodiment of the present invention, controlling the movement of the mouse pointer on the screen according to the movement of the touchless operation object within the manipulation area further comprises:
when the touchless operation object is detected to move out of the manipulation area and then re-enter it, continuing to control the movement of the mouse pointer on the screen according to the movement of the touchless operation object after it re-enters the manipulation area.
According to a further embodiment of the invention, the control unit is further configured to:
when the touchless operation object is detected to remain stationary in the manipulation area for a second time threshold, perform a mouse double-click operation at the current position of the mouse pointer on the screen; or
clear information associated with the touchless operation object when the touchless operation object is detected to have moved out of the manipulation area for a third time threshold.
These and other features and advantages will become apparent upon reading the following detailed description and upon reference to the associated drawings. It is to be understood that both the foregoing general description and the following detailed description are explanatory only and are not restrictive of aspects as claimed.
Drawings
So that the manner in which the above recited features of the present invention can be understood in detail, a more particular description of the invention, briefly summarized above, may be had by reference to embodiments, some of which are illustrated in the appended drawings. It is to be noted, however, that the appended drawings illustrate only some typical aspects of this invention and are therefore not to be considered limiting of its scope, for the invention may admit to other equally effective aspects.
Fig. 1 is a schematic diagram of a user terminal 100 for screen control according to an embodiment of the present invention.
Fig. 2 is a flow diagram of a screen control method 200 according to one embodiment of the invention.
FIG. 3 is a flow diagram 300 of touchless operation object determination according to one embodiment of the present invention.
FIG. 4 is a flow diagram 400 of controlling a mouse pointer based on the touchless operation object according to an embodiment of the present invention.
In the drawings, the figures are not drawn to scale.
Detailed Description
The present invention will be described in detail below with reference to the attached drawings, and the features of the present invention will be further apparent from the following detailed description. The following detailed description of the embodiments and the accompanying drawings are provided to illustrate the principles of the invention and are not intended to limit the scope of the invention, i.e., the invention is not limited to the described embodiments.
In the description of the present invention, it is to be noted that, unless otherwise specified, "a plurality" means two or more; the terms "upper," "lower," "left," "right," "inner," "outer," and the like, as used herein, refer to an orientation or positional relationship indicated for convenience in describing the invention and to simplify description, but do not indicate or imply that the referenced device or element must have a particular orientation, be constructed and operated in a particular orientation, and thus should not be construed as limiting the invention. Furthermore, the terms "first," "second," "third," and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. "vertical" is not strictly vertical, but is within the tolerance of the error. "parallel" is not strictly parallel but within the tolerance of the error.
The following description uses directional terms as viewed in the drawings and is not intended to limit the invention to the specific structures shown therein. In the description of the present invention, it should also be noted that, unless otherwise explicitly stated or limited, the terms "mounted" and "connected" are to be construed broadly, e.g., as a fixed, detachable, or integral connection, and a connection may be direct or indirect through an intermediary. The specific meanings of the above terms in the present invention can be understood as appropriate by those of ordinary skill in the art.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the invention. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.
In the description of the embodiments of the present invention, the term "and/or" merely describes an association between objects and indicates that three relationships are possible; for example, A and/or B may mean that A exists alone, that A and B exist simultaneously, or that B exists alone. In addition, the character "/" herein generally indicates that the related objects before and after it are in an "or" relationship.
Fig. 1 is a schematic diagram of a user terminal 100 for screen control according to an embodiment of the present invention. In fig. 1, the user terminal 100 is shown as a cell phone, but it will be appreciated that the user terminal may be any device such as a desktop computer, a notebook computer, a tablet, etc. As shown in fig. 1, the user terminal 100 may include a camera 102, a display unit 104, and a control unit (not shown). The camera 102 of the user terminal 100 may be used to capture images, and the display unit 104 may be used to display a screen.
The control unit may be configured to: in response to an instruction to start touchless operation, display the manipulation area 106 on the display unit 104. The manipulation area 106 may be, for example, a circle, a square, or another graphic of a certain size. A real-time image 108 is then captured by the camera 102 and presented on the screen. As an example, as shown in fig. 1, the display unit 104 may display the real-time image 108 captured by the camera 102 and outline the manipulation area 106 on top of it. Next, it is detected whether the real-time image 108 contains a feature object within the manipulation area 106. When the feature object is detected to be present in the manipulation area 106 for longer than a first time threshold, the feature object is determined to be the touchless operation object 112. Specifically, as one example, the feature object may be detected by detecting its contour; for instance, a contour parameter of the feature object and the corresponding time may be recorded, where the contour parameter may include the set of coordinate points of the contour. It will be appreciated that other distinctive or easily tracked parameters of the feature object may also be used to detect it. In other words, the feature object is not restricted, and detection does not depend on presetting a specific identification object or a specific action; the object merely needs to keep a distinct appearance feature for a certain period of time so that the relevant parameters (for example, a set of contour coordinate points, a center point, etc.) can be extracted.
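As a concrete illustration of this contour-and-dwell detection, a minimal sketch is given below, assuming OpenCV 4.x. The function names, the 0.2 shape-matching tolerance, and the minimum-area filter are illustrative choices rather than anything prescribed by the invention; only the 5-second dwell mirrors the embodiment described later.

    import time
    import cv2

    DWELL_SECONDS = 5.0          # first time threshold (the later embodiment uses 5 s)
    MIN_CONTOUR_AREA = 200.0     # ignore tiny contours caused by noise (illustrative value)

    def largest_contour_in_region(frame, region):
        """Return the largest contour found inside the manipulation area, or None."""
        x, y, w, h = region                       # region given as (x, y, width, height)
        roi = frame[y:y + h, x:x + w]
        gray = cv2.cvtColor(roi, cv2.COLOR_BGR2GRAY)
        edges = cv2.Canny(cv2.GaussianBlur(gray, (5, 5), 0), 50, 150)
        contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        contours = [c for c in contours if cv2.contourArea(c) >= MIN_CONTOUR_AREA]
        return max(contours, key=cv2.contourArea) if contours else None

    def wait_for_operation_object(cap, region):
        """Confirm a feature object once its contour has been present for DWELL_SECONDS."""
        first_seen = None
        template = None
        while True:
            ok, frame = cap.read()
            if not ok:
                continue
            contour = largest_contour_in_region(frame, region)
            if contour is None:
                first_seen, template = None, None      # object left: restart the timer
                continue
            if template is None or cv2.matchShapes(template, contour,
                                                   cv2.CONTOURS_MATCH_I1, 0) > 0.2:
                first_seen, template = time.monotonic(), contour   # a different shape appeared
                continue
            if time.monotonic() - first_seen >= DWELL_SECONDS:
                return template                        # this contour becomes the operation object

Here cap could be a cv2.VideoCapture(0) handle and region the (x, y, width, height) of the rectangle drawn on screen.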
Finally, the movement of the mouse pointer 110 on the screen is controlled according to the movement of the touchless operation object 112 within the manipulation area 106. As an example, as shown in fig. 1, the position and size of the manipulation area 106 on the screen are variable and may be adjusted according to the user's needs during operation. For example, before the touchless operation object is determined, the manipulation area may be overlaid on the real-time captured image displayed on the screen without affecting its normal display; after the touchless operation object is determined, the manipulation area may be enlarged so that the user can operate within it more easily, and since the enlarged area may occlude the displayed image, it may also be moved to a corner of the screen.
In one embodiment, controlling the movement of the mouse pointer 110 on the screen according to the movement of the touchless operation object 112 within the manipulation area 106 may include: identifying the center point coordinates of the touchless operation object 112 in the manipulation area 106; and moving the position of the mouse pointer 110 on the screen in proportion to the movement distance of the center point coordinates within the manipulation area 106.
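The center-point calculation and proportional mapping can be sketched as follows; cv2.moments provides the contour centroid, and pyautogui is used here only as one convenient, illustrative way to move the system pointer (any platform pointer API could be substituted). The scale factor of 10 matches the later embodiment.

    import cv2
    import pyautogui   # illustrative: one convenient way to move the system pointer

    SCALE = 10         # screen pixels moved per unit of centroid movement (the embodiment uses 10)

    def centroid(contour):
        """Center point of the operation object's contour inside the manipulation area."""
        m = cv2.moments(contour)
        if m["m00"] == 0:
            return None
        return (m["m10"] / m["m00"], m["m01"] / m["m00"])

    def move_pointer(prev_center, curr_center):
        """Move the mouse pointer in proportion to the centroid's movement between frames."""
        if prev_center is None or curr_center is None:
            return curr_center                      # first frame, re-entry, or object absent: no movement
        dx = int(round((curr_center[0] - prev_center[0]) * SCALE))
        dy = int(round((curr_center[1] - prev_center[1]) * SCALE))
        pyautogui.moveRel(dx, dy)                   # relative pointer movement on the screen
        return curr_center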
In another embodiment, controlling the movement of the mouse pointer 110 on the screen according to the movement of the touchless operation object 112 within the manipulation area 106 may further include: when the touchless operation object 112 is detected to move out of the manipulation area 106 and then re-enter it, continuing to control the movement of the mouse pointer 110 on the screen according to the movement of the touchless operation object 112 after it re-enters the manipulation area 106.
In yet another embodiment, when the touchless operation object 112 is detected to remain stationary in the manipulation area 106 for a second time threshold, a mouse double-click operation may be performed at the current position of the mouse pointer 110 on the screen. It can be appreciated that various operations can be configured according to actual needs. For example, it may be set that when the touchless operation object 112 is detected to remain stationary in the manipulation area 106 for the second time threshold, a mouse click, a right-click, or another operation is performed at the current position of the mouse pointer 110; it may also be set that different dwell thresholds trigger a click, a double click, a right-click, and so on, respectively; or it may be set that when the touchless operation object 112 performs a specific action in the manipulation area 106 (e.g., drawing a check mark or a cross), a corresponding specific operation is performed at the current position of the mouse pointer 110.
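A dwell-to-click helper along these lines is sketched below; the 2-second dwell, the movement tolerance, and the use of pyautogui.doubleClick are all illustrative assumptions rather than values fixed by the invention.

    import time
    import pyautogui

    DOUBLE_CLICK_DWELL = 2.0   # second time threshold in seconds (illustrative, configurable)

    class DwellClicker:
        """Trigger a mouse action when the operation object stays still long enough."""

        def __init__(self, tolerance=3.0):
            self.anchor = None          # centroid at which the object stopped
            self.since = None
            self.tolerance = tolerance  # movement (in region units) still counted as "stationary"

        def update(self, center):
            if center is None:
                self.anchor = self.since = None        # object absent: nothing to time
                return
            if (self.anchor is None
                    or abs(center[0] - self.anchor[0]) > self.tolerance
                    or abs(center[1] - self.anchor[1]) > self.tolerance):
                self.anchor, self.since = center, time.monotonic()   # object moved: restart the timer
                return
            if time.monotonic() - self.since >= DOUBLE_CLICK_DWELL:
                pyautogui.doubleClick()                 # act at the pointer's current position
                self.since = time.monotonic()           # avoid retriggering on every frame

Other dwell durations, or small shapes drawn with the operation object, could be mapped in the same way to a single click or a right-click (for example pyautogui.click() or pyautogui.click(button='right')), matching the alternatives listed above.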
In yet another embodiment, the information associated with the touchless operation object 112 is cleared when the touchless operation object 112 is detected to have moved out of the manipulation area 106 for a third time threshold. In a further example, the control unit may also hide the manipulation area 106 (e.g., fade it on the screen until transparent) and enter a sleep mode.
It can be appreciated that, as an example, the first, second, and third time thresholds may be equal or different, and may be preset according to actual requirements.
Further, as an example, to improve the user experience, a corresponding prompt may be displayed on the screen whenever the user is required to perform an operation.
FIG. 2 is a flow diagram of a screen control method 200 according to one embodiment of the invention. As shown in fig. 2, the method 200 begins at step 202, where a mobile terminal (e.g., 100 in fig. 1) may display a manipulation area on the screen in response to an instruction to start touchless operation. Subsequently, at step 204, the mobile terminal may capture an image through a camera and present the captured image on the screen. Then, at step 206, it is detected whether a feature object is present in the image within the manipulation area. As an example, the real-time image within the manipulation area may be acquired, and feature object vectors (including contours, RGB colors, center points, etc.) extracted and recorded. Next, at step 208, when the feature object is detected to be present in the manipulation area for longer than the first time threshold, the feature object is determined to be the touchless operation object (otherwise, timing restarts). Finally, at step 210, the movement of the mouse pointer on the screen is controlled according to the movement of the touchless operation object within the manipulation area. It will be appreciated that other on-screen icons may also be controlled by the method described above.
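Putting steps 202-210 together, a minimal main loop might look like the sketch below. It reuses the illustrative helpers sketched earlier (largest_contour_in_region, wait_for_operation_object, centroid, move_pointer); none of these names, nor the default region coordinates, comes from the patent itself.

    import cv2

    def run_touchless_control(region=(200, 120, 240, 180)):
        """Steps 202-210 of method 200: show the region, confirm an object, then track it."""
        cap = cv2.VideoCapture(0)                          # step 204: capture images from the camera
        template = wait_for_operation_object(cap, region)  # steps 206-208: confirm after the dwell time
        prev = None
        while True:                                        # step 210: follow the object frame by frame
            ok, frame = cap.read()
            if not ok:
                continue
            contour = largest_contour_in_region(frame, region)
            if contour is not None and cv2.matchShapes(
                    template, contour, cv2.CONTOURS_MATCH_I1, 0) > 0.2:
                contour = None                             # a different shape: treat the object as absent
            prev = move_pointer(prev, centroid(contour) if contour is not None else None)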
Further, in one embodiment, when more than one feature object is detected to be present in the manipulation area for longer than the first time threshold, the feature object to serve as the touchless operation object may be determined according to a user selection or a preset rule. In particular, when the picture taken by the camera contains many elements, several feature objects may be found. The feature objects in the manipulation area may be filtered and assigned (for example, selecting a feature object with a specific contour, or selecting the feature object with the largest or smallest contour), or other approaches may be used, such as displaying the recognized object contours one by one so that the user can choose among them.
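A preset filtering rule of the kind described above could be sketched as follows; the rule names ("largest", "smallest", "match") are illustrative, and the shape-matching rule assumes a reference contour chosen earlier.

    import cv2

    def pick_operation_object(contours, rule="largest", reference=None):
        """Choose one contour among several candidates according to a preset rule."""
        if not contours:
            return None
        if rule == "largest":
            return max(contours, key=cv2.contourArea)
        if rule == "smallest":
            return min(contours, key=cv2.contourArea)
        if rule == "match" and reference is not None:
            # closest in shape to a reference contour picked earlier by the user
            return min(contours, key=lambda c: cv2.matchShapes(
                reference, c, cv2.CONTOURS_MATCH_I1, 0))
        return contours[0]

Alternatively, the recognized contours could be highlighted one at a time on the screen so that the user confirms the desired object, which corresponds to the user-selection path mentioned above.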
In another embodiment, controlling the movement of the mouse pointer on the screen according to the movement of the touchless operation object within the manipulation area may include: identifying the center point coordinates of the touchless operation object in the manipulation area; and moving the position of the mouse pointer on the screen in proportion to the movement distance of the center point coordinates within the manipulation area.
In a further embodiment, controlling the movement of the mouse pointer on the screen according to the movement of the touchless operation object within the manipulation area may further include: when the touchless operation object is detected to move out of the manipulation area and then re-enter it, continuing to control the movement of the mouse pointer on the screen according to the movement of the touchless operation object after it re-enters the manipulation area.
In yet another embodiment, when the touchless operation object is detected to remain stationary in the manipulation area for a second time threshold, a mouse double-click operation may be performed at the current position of the mouse pointer on the screen. It will be appreciated that various operations may be configured according to actual needs.
In yet another embodiment, information associated with the touchless operation object may be cleared when the touchless operation object is detected to have moved out of the manipulation area for a third time threshold. Further, the manipulation area may be hidden, thereby entering a sleep mode.
It can be appreciated that, as an example, the first, second, and third time thresholds may be equal or different, and may be preset according to actual requirements.
Further, as an example, to improve the user experience, a corresponding prompt may be displayed on the screen whenever the user needs to perform an operation.
FIG. 3 is a flow diagram 300 of touchless operation object determination according to one embodiment of the present invention. FIG. 4 is a flow diagram 400 of controlling a mouse pointer based on the touchless operation object according to one embodiment of the present invention. In one embodiment, a user who, because of an injury, can only turn the head uses the nose as the feature object: by turning the head, the nose acts as the mouse pointer to control a notebook computer to play a movie.
Specifically, the notebook computer 320 (i.e., a user terminal) displays its desktop, on which there are a movie file icon 306 and a default mouse pointer 308. The mouse pointer 308 is located at desktop coordinates (horizontal 1300, vertical 700), and the notebook 320 has a front camera 302.
In response to the instruction to start touchless operation, the notebook 320 may control the camera 302 to start capturing pictures and display the image captured by the camera 302 on the device screen in real time, as shown at 310 in fig. 3. At the same time, a hollow figure is generated on the device screen as the manipulation area; the manipulation area is overlaid on the picture displayed on the screen and does not affect the normal display of the real-time picture. As shown at 312 in fig. 3, a rectangular frame is used as the manipulation area.
Then, the picture content in the manipulation area may be extracted and feature object contour extraction performed (in this embodiment, the OpenCV open-source software library may be used), with the relevant data recorded for each frame. Assume that a "triangle" feature object is identified in this embodiment, as shown at 314 in fig. 3. A "triangle" parameter record is generated, as shown in block 322 of fig. 3: at time t1 it has existed for 0 seconds (the default for a first occurrence); at time t2 it has existed for (t2-t1) seconds; at time t3 it has existed for (t3-t1) seconds, and so on. When the "triangle" feature object is detected to have existed for more than 5 seconds, the "triangle" is judged to be the touchless operation object.
In the above embodiment, a prompt such as "please move the object that will replace the mouse fully into the rectangle and keep it there for 5 seconds" may also be displayed on the screen to guide the user. Following the prompt, the user may turn the head to bring the nose into the "red rectangle" and hold it there for more than 5 seconds, as shown at 316 in fig. 3. At this point the user terminal finds that the "triangle" feature object is no longer present in the data transmitted in real time, recognizes that it is not the touchless operation object selected by the user, and restarts its timer. Meanwhile, parameters such as the contour coordinate sets of a "nose shape" and a "nostril shape" (as indicated at 318 in fig. 3) are passed to the feature object judgment module. As shown in block 322 of fig. 3, the "nose shape" and "nostril shape" parameters have existed for 0 seconds at time t1 (the default for a first occurrence), for (t2-t1) seconds at time t2, for (t3-t1) seconds at time t3, and so on. When the intact nose contour of the user has appeared continuously in the "red rectangle" for more than 5 seconds, the object represented by the "nose shape" contour is set as the touchless operation object. In particular, the "nostril shape" contour is enclosed by the "nose shape" contour and consists of two similar, roughly circular feature elements; filtering rules may therefore be set, for example selecting the largest contour, or excluding contours made up of several similar feature elements. Other filtering methods may also be applied according to the user's requirements. For example, when a nose, nostrils, and a mole on the nose are present in the manipulation area at the same time, the smallest feature, the "mole", may be chosen as the selected object, as long as the subsequent recognition calculations in the manipulation area can still identify it.
As described above, the "nose shape" feature object is selected as the user's touchless operation object, and this selection is recorded as the unique matching item for subsequent processing.
Continuing, as shown at 406 in FIG. 4, the notebook screen returns to the desktop page while the camera continues to operate. The image is cropped according to the "rectangle", and the cropped picture (the partial region of the picture taken by the camera) is likewise shown on the notebook screen in real time. In this way, the user can see whether the nose is inside the manipulation area and where within it.
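Cropping and outlining the manipulation area can be sketched in a few lines; the region coordinates below are illustrative placeholders for wherever the rectangle is drawn on screen.

    import cv2

    REGION = (200, 120, 240, 180)   # illustrative (x, y, width, height) of the on-screen rectangle

    def crop_region(frame, region=REGION):
        """Cut the manipulation area out of the camera frame for display and analysis."""
        x, y, w, h = region
        return frame[y:y + h, x:x + w]

    def draw_region(frame, region=REGION):
        """Draw the hollow rectangle over the live picture so the user can see the region."""
        x, y, w, h = region
        return cv2.rectangle(frame.copy(), (x, y), (x + w, y + h), (0, 0, 255), 2)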
Then, the image content contained in the manipulation area is extracted in real time and matched against the touchless operation object. As an example, the center point coordinates of the current touchless operation object may be calculated through OpenCV (the object must appear completely within the area for the center point to remain valid). It will be appreciated that the center point coordinates of the touchless operation object may also be identified by other methods. As shown at 402 in FIG. 4, the center point of the "nose" feature object at 406 in FIG. 4 is identified as [50, 50]; since this is the first time the center point is calculated, the system mouse position is kept unchanged and no displacement is performed.
The "nose" is slowly moved in the manipulation area by twisting the head, and at this time, the next frame image is continuously recognized (408 in fig. 4), as shown in 402 in fig. 4, the coordinates of the center point become [25,50], and the difference (-25,0) between the front and back of the abscissa and the ordinate is calculated. The mouse pointer is controlled to move according to the calculated difference (the moving size can be operated according to the screen pixel equal ratio, in the embodiment, the equal ratio is 10, and it is to be appreciated that other ratios can also be used). Accordingly, the mouse pointer abscissa moves 250 pixels to the right, and the mouse pointer is at the (abscissa 1050, ordinate 700) coordinate point.
Continuing, the user moves the nose within the manipulation area by turning the head. As shown at 402 in fig. 4, the center point of the "nose" identified and calculated in the next frame is [24, 35], so relative to the previous frame [25, 50] the mouse pointer is controlled to move 10 pixels to the left and 150 pixels up.
When the user keeps moving the nose within the manipulation area, the system by default calculates the difference between adjacent frames. In one example, because the manipulation area has a limited size, when the user's nose briefly leaves the area and re-enters, the system resumes its calculations with the newly entered frame image. Specifically, when no designated touchless operation object is present within the manipulation area, the recognition result may be recorded as [None, None] to indicate that the user's nose is not currently within the area, as shown at 410 and 402 in fig. 4. In one example, when the user's nose re-enters from the rightmost side of the manipulation area, the recognition result may be [100, 20] (as shown at 412 and 402 in fig. 4); for this first frame after re-entry the mouse pointer does not move. The next frame is recognized as [24, 60] (as shown at 414 and 402 in fig. 4), and the mouse pointer control module moves the mouse pointer 760 pixels to the left and 400 pixels down, exactly reaching the "movie file" icon.
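The numeric walkthrough above can be replayed with a small sketch; the function below simply applies the frame-to-frame difference scaled by the ratio of 10, skipping movement on the first frame and after re-entry, and is an illustration rather than the patent's implementation.

    SCALE = 10   # region-to-screen ratio used in this walkthrough

    def track(centers, start=(1300, 700), scale=SCALE):
        """Replay a stream of per-frame centroids and return the pointer positions."""
        x, y = start
        prev = None
        path = [(x, y)]
        for c in centers:
            if c == (None, None):
                prev = None                 # object left the region: forget the last centroid
            elif prev is None:
                prev = c                    # first frame (or re-entry): no pointer movement
            else:
                x += (c[0] - prev[0]) * scale
                y += (c[1] - prev[1]) * scale
                prev = c
            path.append((x, y))
        return path

    # Frames from the walkthrough: [50,50] -> [25,50] -> [24,35], then the nose leaves the
    # region ([None, None]) and re-enters at [100,20] before moving to [24,60].
    frames = [(50, 50), (25, 50), (24, 35), (None, None), (100, 20), (24, 60)]
    print(track(frames)[-1])   # final pointer position after replaying all frames

Replaying these frames leaves the pointer at (1300, 700) for the first frame, moves it to (1050, 700) and then (1040, 550), holds it while the nose is outside the region, and finally moves it 760 pixels left and 400 pixels down after re-entry.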
As one example, the user may trigger a mouse double-click operation at the pointer's current position by keeping the nose still within the manipulation area for a period of time.
The movie starts playing, as shown at 416 in fig. 4. The user does not need to manipulate the mouse for a while, so the nose is moved out of the manipulation area and kept out for a certain time to clear the data recorded above; further, the manipulation area may enter a sleep state, for example by being hidden from the screen. The user can then turn the head back to a comfortable position for watching the video.
The screen control method and user terminal of the invention have been described above. The operation of controlling the mouse pointer with a feature object is achieved by acquiring the feature object in the image and then controlling the mouse pointer to react accordingly by calculating the displacement of the feature object across each frame of the image. Compared with prior-art schemes, the invention has at least the following advantages:
1. Wider applicable scene coverage. Most existing ways of controlling a device touchlessly rely on specific gestures, which is quite limiting: first, the number of recognizable gestures is small, and second, owing to the limitations of the model, only gestures that closely match the standard form can be recognized. The present method does not require the user to control the device by hand with specific gestures, and a wide variety of feature objects can serve as the basis for recognition, which greatly enlarges the range of application.
2. Greatly reduced cost. Most existing touchless operation schemes build a model based on machine learning and decide whether to execute a designed instruction by judging whether the current operating conditions match certain features. This approach requires collecting a large amount of data to support the creation and testing of models, which is time-consuming and labor-intensive. Because the present invention works on a different principle, no data needs to be collected in advance, which reduces cost.
3. More user-friendly. The invention solves the problem that some people cannot complete operations with gestures because of physical injury or impairment, and lowers the threshold for use. The user decides which object is used to control the terminal device, such as an eye, the nose, a mole on the face, a button, a hand, or the tip of a pencil; terminal control is possible as long as the shape of the object in the image remains unchanged for a period of time.
What has been described above includes examples of aspects of the claimed subject matter. It is, of course, not possible to describe every conceivable combination of components or methodologies for purposes of describing the claimed subject matter, but one of ordinary skill in the art may recognize that many further combinations and permutations of the claimed subject matter are possible. Accordingly, the disclosed subject matter is intended to embrace all such alterations, modifications and variations that fall within the spirit and scope of the appended claims.

Claims (10)

1. A screen control method, comprising:
displaying a manipulation area on the screen in response to an instruction to start touchless operation;
capturing an image through a camera and displaying the captured image on the screen;
detecting whether a feature object is present in the image within the manipulation area;
when the feature object is detected to be present in the manipulation area for longer than a first time threshold, determining the feature object to be a touchless operation object; and
controlling the movement of a mouse pointer on the screen according to the movement of the touchless operation object within the manipulation area.
2. The method of claim 1, further comprising:
when more than one feature object is detected to be present in the manipulation area for longer than the first time threshold, determining which feature object is the touchless operation object according to a user selection or a preset rule.
3. The method of claim 1, wherein controlling the movement of a mouse pointer on the screen according to the movement of the touchless operation object within the manipulation area comprises:
identifying center point coordinates of the touchless operation object in the manipulation area; and
moving the position of the mouse pointer on the screen in proportion to the movement distance of the center point coordinates within the manipulation area.
4. The method of claim 3, wherein controlling the movement of a mouse pointer on the screen according to the movement of the touchless operation object within the manipulation area further comprises:
when the touchless operation object is detected to move out of the manipulation area and then re-enter it, continuing to control the movement of the mouse pointer on the screen according to the movement of the touchless operation object after it re-enters the manipulation area.
5. The method of claim 1, further comprising:
when the touchless operation object is detected to remain stationary in the manipulation area for a second time threshold, performing a mouse double-click operation at the current position of the mouse pointer on the screen.
6. The method of claim 1, further comprising:
clearing information associated with the touchless operation object when the touchless operation object is detected to have moved out of the manipulation area for a third time threshold.
7. A user terminal, comprising:
a camera for capturing an image;
a display unit for displaying a screen; and
a control unit configured to:
displaying a manipulation area on the display unit in response to an instruction to start touchless operation;
capturing the image through the camera and displaying the captured image on the screen;
detecting whether a feature object is present in the image within the manipulation area;
when the feature object is detected to be present in the manipulation area for longer than a first time threshold, determining the feature object to be a touchless operation object; and
controlling the movement of a mouse pointer on the screen according to the movement of the touchless operation object within the manipulation area.
8. The terminal of claim 7, wherein controlling the movement of a mouse pointer on the screen according to the movement of the touchless operation object within the manipulation area comprises:
identifying center point coordinates of the touchless operation object in the manipulation area; and
moving the position of the mouse pointer on the screen in proportion to the movement distance of the center point coordinates within the manipulation area.
9. The terminal of claim 8, wherein controlling the movement of a mouse pointer on the screen according to the movement of the touchless operation object within the manipulation area further comprises:
when the touchless operation object is detected to move out of the manipulation area and then re-enter it, continuing to control the movement of the mouse pointer on the screen according to the movement of the touchless operation object after it re-enters the manipulation area.
10. The terminal of claim 7, wherein the control unit is further configured to:
when the touchless operation object is detected to remain stationary in the manipulation area for a second time threshold, performing a mouse double-click operation at the current position of the mouse pointer on the screen; or
clearing information associated with the touchless operation object when the touchless operation object is detected to have moved out of the manipulation area for a third time threshold.
CN202210748642.4A 2022-06-28 2022-06-28 Screen control method and user terminal Pending CN114995652A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210748642.4A CN114995652A (en) 2022-06-28 2022-06-28 Screen control method and user terminal

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210748642.4A CN114995652A (en) 2022-06-28 2022-06-28 Screen control method and user terminal

Publications (1)

Publication Number Publication Date
CN114995652A true CN114995652A (en) 2022-09-02

Family

ID=83036363

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210748642.4A Pending CN114995652A (en) 2022-06-28 2022-06-28 Screen control method and user terminal

Country Status (1)

Country Link
CN (1) CN114995652A (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101621369B1 (en) * 2014-12-08 2016-05-27 현대자동차주식회사 Apparatus for gesture recognition, vehicle comprising the same, and mothod for gesture recognition
CN110955377A (en) * 2019-11-27 2020-04-03 腾讯科技(深圳)有限公司 Control method of virtual object and related device


Similar Documents

Publication Publication Date Title
US9600078B2 (en) Method and system enabling natural user interface gestures with an electronic system
TWI534661B (en) Image recognition device and operation determination method and computer program
JP6398987B2 (en) Information processing apparatus, information processing method, and program
EP2659336B1 (en) User interface, apparatus and method for gesture recognition
US8970490B2 (en) Electronic device and a method for controlling the functions of the electronic device as well as program product for implementing the method
KR101733246B1 (en) Apparatus and method for composition of picture for video call using face pose
US9706108B2 (en) Information processing apparatus and associated methodology for determining imaging modes
US9317171B2 (en) Systems and methods for implementing and using gesture based user interface widgets with camera input
KR20160088620A (en) Virtual input apparatus and method for receiving user input using thereof
CN112135041B (en) Method and device for processing special effect of human face and storage medium
CN114546212B (en) Method, device and equipment for adjusting interface display state and storage medium
CN111527468A (en) Air-to-air interaction method, device and equipment
KR20080040614A (en) Method of moving/enlarging/reducing a virtual screen by movement of display device and hand helded information equipment using the same
US20150121270A1 (en) Information processing method and electronic device
CN107357515A (en) The method and its system that multiple utility program picture is presented simultaneously
CN107239222A (en) The control method and terminal device of a kind of touch-screen
CN106605188A (en) Information processing device, information processing method, and program
CN112148193A (en) Navigation gesture setting method and device and electronic equipment
CN114995652A (en) Screen control method and user terminal
CN111857474B (en) Application program control method and device and electronic equipment
US20120245741A1 (en) Information processing apparatus, information processing method, recording medium, and program
CN114610155A (en) Gesture control method and device, display terminal and storage medium
JP6092818B2 (en) Image processing apparatus, image processing method, image processing program, and print order receiving apparatus
CN113485590A (en) Touch operation method and device
CN112363787A (en) Image processing method and device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination