CN112817557A - Volume adjusting method based on multi-person gesture recognition and display device - Google Patents
Volume adjusting method based on multi-person gesture recognition and display device
- Publication number
- CN112817557A (application CN202110184152.1A)
- Authority
- CN
- China
- Prior art keywords
- gesture
- user
- volume
- volume adjustment
- image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/16—Sound input; Sound output
- G06F3/165—Management of the audio stream, e.g. setting of volume, audio stream path
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/048—Interaction techniques based on graphical user interfaces [GUI]
- G06F3/0487—Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser
- G06F3/0488—Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser using a touch-screen or digitiser, e.g. input of commands through traced gestures
- G06F3/04883—Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser using a touch-screen or digitiser, e.g. input of commands through traced gestures for inputting data by handwriting, e.g. gesture or text
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/107—Static hand or arm
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- General Health & Medical Sciences (AREA)
- User Interface Of Digital Computer (AREA)
Abstract
The application discloses a volume adjustment method and a display device based on multi-person gesture recognition. A controller acquires a user image comprising at least one user gesture and determines a designated gesture ID. The controller calculates the recognition success rate of the volume adjustment gesture corresponding to the designated gesture ID and, if the recognition success rate exceeds a first threshold, displays a volume bar in a user interface. When the user corresponding to the designated gesture ID performs a designated action based on the volume adjustment gesture, a corresponding volume adjustment instruction is generated, and the controller responds to the instruction by adjusting the volume value corresponding to the volume bar. Thus, when multiple user gestures are recognized in the user image, the method and the display device single out one user by determining the designated gesture ID and adjust the volume with reference to that user's gesture. This addresses problems such as lost or confused gestures during recognition, allows the volume of the display device to be adjusted smoothly through a recognized sliding gesture, and improves the user experience.
Description
Technical Field
The application relates to the technical field of smart television interaction, and in particular to a volume adjustment method and a display device based on multi-person gesture recognition.
Background
With the rapid development of display devices, their functions have become richer and their performance more powerful. Current display devices include smart televisions, smart set-top boxes, smart boxes, and other products with smart display screens. Taking the smart television as an example, it provides the traditional television function and can play various television programs.
While using a display device, a user may adjust its output volume as needed. At present, this is usually done with a remote controller provided with the display device, which is not convenient enough. Adjusting the volume through gesture recognition has therefore been proposed. However, when gestures of several different users are recognized, it cannot be accurately determined whose gesture should drive the volume adjustment, which degrades the user experience.
Disclosure of Invention
The application provides a volume adjustment method based on multi-person gesture recognition and a display device, aiming to solve the problem that the volume cannot be adjusted accurately when gestures of multiple users are present.
In a first aspect, the present application provides a display device comprising:
a display configured to present a user interface;
an image collector or a user input interface connectable to an image collector, the image collector configured to collect a user image;
a controller connected to the display and the image collector, respectively, the controller configured to:
acquiring a user image that is collected by the image collector and includes at least one user gesture, and determining a designated gesture ID of the first user gesture in the user image that matches a volume adjustment gesture;
calculating the recognition success rate of the volume adjustment gesture corresponding to the designated gesture ID over the frames of user images collected within a first duration;
displaying a volume bar in the user interface if the recognition success rate of the volume adjustment gesture exceeds a first threshold;
and, in response to a volume adjustment instruction generated when the user corresponding to the designated gesture ID performs a designated action based on the volume adjustment gesture, adjusting the volume value corresponding to the volume bar.
In some embodiments of the present application, the controller, in determining the designated gesture ID of the first user gesture in the user image that matches the volume adjustment gesture, is further configured to:
identifying at least one user gesture in the user image;
judging whether each user gesture is matched with a volume adjusting gesture;
if one or more user gestures match the volume adjustment gesture, determining the first user gesture that produced a match as the designated user gesture;
and acquiring the gesture ID of the designated user gesture, and determining the gesture ID as the designated gesture ID.
In some embodiments of the present application, the controller, in performing the determining whether each of the user gestures matches a volume adjustment gesture, is further configured to:
calculating a gesture confidence between the user gesture and the volume adjustment gesture;
if the gesture confidence exceeds a second threshold, determining that the user gesture matches the volume adjustment gesture;
if the gesture confidence does not exceed a second threshold, determining that the user gesture does not match the volume adjustment gesture.
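The matching and selection steps above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the `DetectedGesture` type, its field names, and the assumption that gestures arrive in detection order are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class DetectedGesture:
    gesture_id: int    # hypothetical tracker-assigned ID for one hand
    confidence: float  # similarity to the volume adjustment gesture, 0..1


def pick_designated_gesture_id(
    gestures: Sequence[DetectedGesture], second_threshold: float
) -> Optional[int]:
    """Return the ID of the first gesture whose confidence exceeds the threshold.

    Assumes `gestures` is ordered by detection time, so "first" means the
    first user gesture that produced a match.
    """
    for gesture in gestures:
        if gesture.confidence > second_threshold:
            return gesture.gesture_id
    return None  # no gesture matched; no designated gesture ID yet
```

With three hands in the image and a threshold of 0.8, the second hand would be chosen if it is the first to exceed 0.8, even when a later hand scores higher.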
In some embodiments of the present application, the controller, in calculating the recognition success rate of the volume adjustment gesture corresponding to the designated gesture ID over the frames of user images collected within the first duration, is further configured to:
after the designated gesture ID is determined, acquiring a plurality of frames of user images collected within the first duration, and determining each user image that contains a volume adjustment gesture as a designated user image;
comparing the designated gesture ID with the gesture ID of the volume adjustment gesture in each frame of designated user image, and determining each designated user image in which the two IDs are identical as a gesture recognition success frame;
counting the total number of gesture recognition success frames and the total number of frames of user images collected within the first duration;
and calculating the ratio of the total number of gesture recognition success frames to the total number of collected frames, and determining the ratio as the recognition success rate of the volume adjustment gesture corresponding to the designated gesture ID.
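The ratio described above can be written down directly. The sketch below assumes each frame has already been reduced to the set of gesture IDs whose gesture matched the volume adjustment gesture in that frame; that representation and the function name are illustrative assumptions.

```python
from typing import Iterable, Set


def recognition_success_rate(
    frames: Iterable[Set[int]], designated_id: int
) -> float:
    """Fraction of collected frames in which the designated gesture ID
    appears among the gestures matching the volume adjustment gesture.

    Each element of `frames` is the (possibly empty) set of matching
    gesture IDs found in one user image from the first duration.
    """
    total_frames = 0
    success_frames = 0
    for matching_ids in frames:
        total_frames += 1
        if designated_id in matching_ids:
            success_frames += 1
    # Treat an empty window as a failed recognition rather than dividing by zero.
    return success_frames / total_frames if total_frames else 0.0
```

The volume bar would then be displayed only when the returned value exceeds the first threshold.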
In some embodiments of the present application, the controller, in performing the displaying the volume bar in the user interface if the recognition success rate of the volume adjustment gesture exceeds a first threshold, is further configured to:
if the recognition success rate of the volume adjusting gesture exceeds a first threshold, presenting a volume adjusting gesture prompting interface in the user interface, and presenting gesture recognition success prompting information and a volume adjusting gesture pattern in the volume adjusting gesture prompting interface;
canceling the display of the volume adjustment gesture prompt interface when the display duration of the volume adjustment gesture prompt interface exceeds a second duration, and displaying the volume adjustment interface in the user interface, wherein the volume adjustment interface comprises a volume bar and volume adjustment operation prompt information.
In some embodiments of the present application, the controller, in adjusting the volume value corresponding to the volume bar in response to the volume adjustment instruction generated when the user corresponding to the designated gesture ID performs the designated action based on the volume adjustment gesture, is further configured to:
receiving the volume adjustment instruction generated when the user corresponding to the designated gesture ID performs the designated action based on the volume adjustment gesture, wherein the designated action is an action the user performs according to the volume adjustment operation prompt information;
in response to the volume adjustment instruction, acquiring the starting coordinate value and the ending coordinate value of the designated action as presented in the user images within a third duration;
based on the starting coordinate value and the ending coordinate value, calculating the abscissa variation generated when the user executes the specified action based on the volume adjusting gesture;
determining a volume adjustment value and a volume adjustment direction of the volume bar based on the abscissa variation amount;
and adjusting the volume value corresponding to the volume bar based on the volume adjustment value and the volume adjustment direction of the volume bar.
In some embodiments of the present application, the controller, in performing the determining the volume adjustment value and the volume adjustment direction of the volume bar based on the amount of change of the abscissa, is further configured to:
if the abscissa variation is greater than a third threshold, determining that the volume adjustment value of the volume bar is a specified adjustment amount and that the adjustment direction is volume increase;
and if the abscissa variation is less than a fourth threshold, determining that the volume adjustment value of the volume bar is a specified adjustment amount and that the adjustment direction is volume decrease.
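As a rough illustration of the threshold rule above, the following sketch maps a horizontal displacement to a volume step. Several details are assumptions not stated in the text: the variation is taken as ending abscissa minus starting abscissa, the fourth threshold is expected to be negative, and the result is clamped to a 0 to 100 range.

```python
def adjust_volume(
    volume: int,
    start_x: float,
    end_x: float,
    third_threshold: float,
    fourth_threshold: float,
    step: int,
    min_volume: int = 0,
    max_volume: int = 100,
) -> int:
    """Apply one volume step based on the abscissa variation of the gesture.

    Assumption: variation = end_x - start_x, so a rightward slide is positive.
    """
    delta_x = end_x - start_x
    if delta_x > third_threshold:
        volume += step  # direction: volume increase
    elif delta_x < fourth_threshold:
        volume -= step  # direction: volume decrease
    # Variations between the two thresholds leave the volume unchanged.
    return max(min_volume, min(max_volume, volume))
```

For example, with thresholds of 50 and -50 pixels and a step of 3, a slide from x = 100 to x = 180 raises a volume of 50 to 53, while a 20-pixel movement changes nothing.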
In some embodiments of the present application, the controller is further configured to:
if the abscissa variation is zero within a third duration, extending the gesture detection duration by a further third duration;
and acquiring, over the total gesture detection duration, the starting coordinate value and the ending coordinate value of the designated action performed by the user corresponding to the designated gesture ID based on the volume adjustment gesture, wherein the total gesture detection duration is the sum of a plurality of third durations, the starting coordinate value is the starting coordinate value of the first third duration, and the ending coordinate value is the ending coordinate value of the last third duration.
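The window-extension rule above can be sketched as a loop over consecutive third-duration windows. The `Window` tuple layout and the `max_windows` cap are illustrative assumptions; the text itself does not state an upper bound for the extension.

```python
from typing import Iterable, Optional, Tuple

# (start_x, end_x) of the designated gesture within one third-duration window
Window = Tuple[float, float]


def collect_until_motion(
    windows: Iterable[Window], max_windows: int
) -> Optional[Window]:
    """Extend detection window by window while the abscissa does not change.

    Returns (start of the first window, end of the last window consumed),
    or None if no window was observed. `max_windows` is a hypothetical cap.
    """
    first_start: Optional[float] = None
    last_end: Optional[float] = None
    for count, (start_x, end_x) in enumerate(windows, start=1):
        if first_start is None:
            first_start = start_x
        last_end = end_x
        # Stop extending once the hand has moved or the cap is reached.
        if end_x != start_x or count >= max_windows:
            break
    if first_start is None or last_end is None:
        return None
    return (first_start, last_end)
```

The returned pair can then be used to compute the abscissa variation over the whole extended detection duration.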
In some embodiments of the present application, the controller is further configured to: in response to the volume adjustment instruction, switch the volume adjustment operation prompt information presented in the user interface to volume adjustment state prompt information.
In some embodiments of the present application, the controller is further configured to: if the next collected frame of user image does not include the volume adjustment gesture, or if the abscissa variation produced by the designated action of the user corresponding to the designated gesture ID is zero within a fourth duration, cancel the display of the volume bar and present a volume adjustment completion interface in the user interface, wherein the volume adjustment completion interface comprises a volume adjustment completion pattern and volume adjustment completion prompt information.
In some embodiments of the present application, the controller is further configured to: if no user gesture is included in the user images collected within the first duration, acquire the next frame of user image collected by the image collector after a fifth duration.
In some embodiments of the present application, the controller is further configured to:
when the volume bar is displayed in the user interface, executing the volume adjustment process without starting the gesture detection process for volume adjustment again;
and when the volume bar is not displayed in the user interface, starting the gesture detection process for volume adjustment so that the volume adjustment process can then be performed.
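One way to picture this guard is a small session object that chooses between the two processes based on whether the volume bar is visible. The class, attribute, and return-value names below are hypothetical and only illustrate the branching.

```python
class VolumeGestureSession:
    """Tracks whether the volume bar is shown and routes incoming frames."""

    def __init__(self) -> None:
        self.volume_bar_visible = False
        self.detections_started = 0  # how many times detection was (re)started

    def on_gesture_frame(self) -> str:
        """Decide which process handles a frame containing a gesture."""
        if self.volume_bar_visible:
            # Bar already shown: keep adjusting, do not restart detection.
            return "adjust"
        # Bar not shown: start the gesture detection process first.
        self.detections_started += 1
        return "detect"
```

Keeping the decision in one place avoids launching a second detection pass while an adjustment is already in progress.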
In a second aspect, the present application further provides a volume adjustment method based on multi-person gesture recognition, where the method includes:
acquiring a user image that is collected by an image collector and includes at least one user gesture, and determining a designated gesture ID of the first user gesture in the user image that matches a volume adjustment gesture;
calculating the recognition success rate of the volume adjustment gesture corresponding to the designated gesture ID over the frames of user images collected within a first duration;
displaying a volume bar in the user interface if the recognition success rate of the volume adjustment gesture exceeds a first threshold;
and, in response to a volume adjustment instruction generated when the user corresponding to the designated gesture ID performs a designated action based on the volume adjustment gesture, adjusting the volume value corresponding to the volume bar.
In a third aspect, the present application further provides a computer storage medium that may store a program which, when executed, implements some or all of the steps in the embodiments of the volume adjustment method based on multi-person gesture recognition provided in the present application.
As can be seen from the foregoing technical solutions, in the volume adjustment method based on multi-person gesture recognition and the display device provided in the embodiments of the present invention, an image collector collects user images in real time, and a controller acquires a user image that is collected by the image collector and includes at least one user gesture, and determines a designated gesture ID of the first user gesture in the user image that matches a volume adjustment gesture. The controller calculates the recognition success rate of the volume adjustment gesture corresponding to the designated gesture ID over the frames of user images collected within a first duration and, if the recognition success rate exceeds a first threshold, displays a volume bar in a user interface. When the user corresponding to the designated gesture ID performs a designated action based on the volume adjustment gesture, the resulting position change generates a corresponding volume adjustment instruction, and the controller responds to the instruction by starting to adjust the volume value corresponding to the volume bar. Thus, when multiple user gestures are recognized in the user image, the method and the display device single out one user by determining the designated gesture ID and adjust the volume with reference to that user's gesture. This addresses problems such as lost or confused gestures during recognition, allows the volume of the display device to be adjusted smoothly through a recognized sliding gesture, and improves the user experience.
Drawings
In order to more clearly explain the technical solution of the present application, the drawings needed to be used in the embodiments will be briefly described below, and it is obvious to those skilled in the art that other drawings can be obtained according to the drawings without any creative effort.
FIG. 1 illustrates a usage scenario of a display device according to some embodiments;
FIG. 2 illustrates a hardware configuration block diagram of the control apparatus 100 according to some embodiments;
FIG. 3 illustrates a hardware configuration block diagram of the display device 200 according to some embodiments;
FIG. 4 illustrates a software configuration diagram in the display device 200 according to some embodiments;
FIG. 5 illustrates an icon control interface display of an application in display device 200, in accordance with some embodiments;
FIG. 6 illustrates a flow diagram of a method of volume adjustment based on multi-person gesture recognition, in accordance with some embodiments;
FIG. 7 illustrates a data flow diagram of a volume adjustment method based on multi-person gesture recognition in accordance with some embodiments;
FIG. 8 illustrates an interface diagram showing a global gesture switch in a user interface in accordance with some embodiments;
FIG. 9 illustrates a schematic diagram of the presence of multiple user gestures in a user image, in accordance with some embodiments;
FIG. 10 illustrates a flow diagram of a method of calculating an identification success rate, according to some embodiments;
FIG. 11 illustrates a schematic diagram of a user interface displaying a volume adjustment gesture prompt interface, in accordance with some embodiments;
FIG. 12 illustrates a schematic diagram of a user interface displaying a volume adjustment interface, in accordance with some embodiments;
FIG. 13 illustrates a flow chart of a method of adjusting a volume corresponding to a volume bar according to some embodiments;
FIG. 14 illustrates a schematic diagram of calculating an amount of abscissa variation, according to some embodiments;
FIG. 15 illustrates a schematic diagram of displaying volume adjustment status prompt information in a user interface, in accordance with some embodiments;
FIG. 16 illustrates a schematic diagram of displaying a volume adjustment completion interface in a user interface, according to some embodiments.
Detailed Description
To make the purpose and embodiments of the present application clearer, the following will clearly and completely describe the exemplary embodiments of the present application with reference to the attached drawings in the exemplary embodiments of the present application, and it is obvious that the described exemplary embodiments are only a part of the embodiments of the present application, and not all of the embodiments.
It should be noted that the brief descriptions of the terms in the present application are only for the convenience of understanding the embodiments described below, and are not intended to limit the embodiments of the present application. These terms should be understood in their ordinary and customary meaning unless otherwise indicated.
The terms "first," "second," "third," and the like in the description and claims of this application and in the above-described drawings are used for distinguishing between similar or analogous objects or entities and not necessarily for describing a particular sequential or chronological order, unless otherwise indicated. It is to be understood that the terms so used are interchangeable under appropriate circumstances.
The terms "comprises" and "comprising," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a product or apparatus that comprises a list of elements is not necessarily limited to all elements expressly listed, but may include other elements not expressly listed or inherent to such product or apparatus.
The term "module" refers to any known or later developed hardware, software, firmware, artificial intelligence, fuzzy logic, or combination of hardware and/or software code that is capable of performing the functionality associated with that element.
FIG. 1 illustrates a usage scenario of a display device according to some embodiments. As shown in FIG. 1, the display device 200 is in data communication with a server 400, and a user can operate the display device 200 through the smart device 300 or the control apparatus 100.
In some embodiments, the control apparatus 100 may be a remote controller. Communication between the remote controller and the display device includes at least one of infrared protocol communication, Bluetooth protocol communication, and other short-distance communication methods, and the remote controller controls the display device 200 in a wireless or wired manner. The user may control the display device 200 by inputting a user instruction through at least one of a key on the remote controller, voice input, control panel input, and the like.
In some embodiments, the smart device 300 may include any of a mobile terminal, a tablet, a computer, a laptop, an AR/VR device, and the like.
In some embodiments, the smart device 300 may also be used to control the display device 200. For example, the display device 200 is controlled using an application program running on the smart device.
In some embodiments, the smart device 300 and the display device may also be used for communication of data.
In some embodiments, the display device 200 may also be controlled by means other than the control apparatus 100 and the smart device 300. For example, a voice instruction from the user may be received directly by a voice instruction acquisition module configured inside the display device 200, or by a voice control apparatus provided outside the display device 200.
In some embodiments, the display device 200 is also in data communication with a server 400. The display device 200 may be communicatively connected through a Local Area Network (LAN), a Wireless Local Area Network (WLAN), or other networks. The server 400 may provide various content and interactions to the display device 200. The server 400 may be one cluster or a plurality of clusters, and may include one or more types of servers.
In some embodiments, software steps executed by one executing entity may be migrated on demand to another executing entity in data communication with it. For example, software steps performed by the server may be migrated to a display device in data communication with the server, and vice versa, as needed.
FIG. 2 illustrates a block diagram of a hardware configuration of the control apparatus 100 according to some embodiments. As shown in FIG. 2, the control apparatus 100 includes a controller 110, a communication interface 130, a user input/output interface 140, a memory, and a power supply. The control apparatus 100 may receive an operation instruction input by a user and convert it into an instruction that the display device 200 can recognize and respond to, serving as an intermediary for interaction between the user and the display device 200.
In some embodiments, the communication interface 130 is used for external communication, and includes at least one of a Wi-Fi chip, a Bluetooth module, NFC, or an alternative module.
In some embodiments, the user input/output interface 140 includes at least one of a microphone, a touchpad, a sensor, a key, or an alternative module.
Fig. 3 illustrates a hardware configuration block diagram of a display device 200 according to some embodiments. Referring to fig. 3, in some embodiments, the display apparatus 200 includes at least one of a tuner demodulator 210, a communicator 220, a detector 230, an external device interface 240, a controller 250, a display 260, an audio output interface 270, a memory, a power supply, and a user interface.
In some embodiments, the controller comprises a central processor, a video processor, an audio processor, a graphics processor, a RAM, a ROM, and first to nth interfaces for input/output.
In some embodiments, the display 260 includes a display screen component for presenting pictures and a driving component for driving image display. It receives image signals output by the controller and displays video content, image content, menu manipulation interfaces, user manipulation UI interfaces, and the like.
In some embodiments, the display 260 may be at least one of a liquid crystal display, an OLED display, and a projection display, and may also be a projection device and a projection screen.
In some embodiments, the tuner demodulator 210 receives broadcast television signals in a wired or wireless manner, and demodulates audio/video signals and data signals, such as EPG data signals, from a plurality of wireless or wired broadcast television signals.
In some embodiments, the communicator 220 is a component for communicating with external devices or servers according to various communication protocol types. For example, the communicator may include at least one of a Wi-Fi module, a Bluetooth module, a wired Ethernet module, other network communication protocol chips or near field communication protocol chips, and an infrared receiver. The display device 200 may exchange control signals and data signals with the control apparatus 100 or the server 400 through the communicator 220.
In some embodiments, the detector 230 is used to collect signals of the external environment or interaction with the outside. For example, detector 230 includes a light receiver, a sensor for collecting ambient light intensity; alternatively, the detector 230 includes an image collector, such as a camera, which may be used to collect external environment scenes, attributes of the user, or user interaction gestures, or the detector 230 includes a sound collector, such as a microphone, which is used to receive external sounds.
In some embodiments, the external device interface 240 may include, but is not limited to, the following: high Definition Multimedia Interface (HDMI), analog or data high definition component input interface (component), composite video input interface (CVBS), USB input interface (USB), RGB port, and the like. The interface may be a composite input/output interface formed by the plurality of interfaces.
In some embodiments, the controller 250 and the tuner demodulator 210 may be located in separate devices; that is, the tuner demodulator 210 may be located in a device external to the main device where the controller 250 is located, such as an external set-top box.
In some embodiments, the controller 250 controls the operation of the display device and responds to user operations through various software control programs stored in memory. The controller 250 controls the overall operation of the display apparatus 200. For example: in response to receiving a user command for selecting a UI object to be displayed on the display 260, the controller 250 may perform an operation related to the object selected by the user command.
In some embodiments, the object may be any selectable object, such as a hyperlink, an icon, or another actionable control. The operation related to the selected object is, for example, displaying the page, document or image linked to by a hyperlink, or running the program corresponding to the icon.
In some embodiments, the controller comprises at least one of a Central Processing Unit (CPU), a video processor, an audio processor, a Graphics Processing Unit (GPU), a Random Access Memory (RAM), a Read-Only Memory (ROM), first to nth interfaces for input/output, a communication bus (Bus), and the like.
The CPU executes the operating system and application program instructions stored in the memory, and runs various application programs, data and content according to the interaction instructions received from external input, so as to finally display and play various audio and video content. The CPU may include a plurality of processors, for example a main processor and one or more sub-processors.
In some embodiments, the graphics processor is used to generate various graphics objects, such as at least one of an icon, an operation menu, and a graphic displayed in response to a user input instruction. The graphics processor includes an arithmetic unit, which performs operations on the interactive instructions input by the user and displays various objects according to their display attributes, and a renderer, which renders the objects produced by the arithmetic unit for presentation on the display.
In some embodiments, the video processor is configured to receive an external video signal and perform at least one of video processing such as decompression, decoding, scaling, noise reduction, frame rate conversion, resolution conversion, and image synthesis according to the standard codec protocol of the input signal, so as to obtain a signal that can be displayed or played directly on the display device 200.
In some embodiments, the video processor includes at least one of a demultiplexing module, a video decoding module, an image synthesis module, a frame rate conversion module, a display formatting module, and the like. The demultiplexing module demultiplexes the input audio and video data stream. The video decoding module processes the demultiplexed video signal, including decoding and scaling. The image synthesis module superimposes and mixes the GUI signal, which the graphics generator produces either in response to user input or by itself, with the scaled video image to generate an image signal for display. The frame rate conversion module converts the frame rate of the input video. The display formatting module converts the frame-rate-converted video output signal into a signal that conforms to the display format, such as an RGB data signal.
In some embodiments, the audio processor is configured to receive an external audio signal, decompress and decode the received audio signal according to a standard codec protocol of the input signal, and perform at least one of noise reduction, digital-to-analog conversion, and amplification processing to obtain a sound signal that can be played in the speaker.
In some embodiments, a user may enter user commands on a Graphical User Interface (GUI) displayed on display 260, and the user input interface receives the user input commands through the Graphical User Interface (GUI). Alternatively, the user may input the user command by inputting a specific sound or gesture, and the user input interface receives the user input command by recognizing the sound or gesture through the sensor.
In some embodiments, a "user interface" is a medium interface for interaction and information exchange between an application or operating system and a user, which converts between the internal form of information and a form that is acceptable to the user. A commonly used presentation form of the user interface is a Graphical User Interface (GUI), which refers to a user interface related to computer operations and displayed in a graphical manner. It may be an interface element such as an icon, a window, or a control displayed in the display screen of the electronic device, where the control may include at least one visual interface element such as an icon, a button, a menu, a tab, a text box, a dialog box, a status bar, a navigation bar, or a Widget.
In some embodiments, user interface 280 is an interface that may be used to receive control inputs (e.g., physical buttons on the body of the display device, or the like).
In some embodiments, the system of the display device may include a kernel (Kernel), a command parser (Shell), a file system, and application programs. The kernel, shell, and file system together make up the basic operating system structure that allows users to manage files, run programs, and use the system. After power-on, the kernel starts, activates kernel space, abstracts the hardware, initializes hardware parameters, and runs and maintains virtual memory, the scheduler, signals, and inter-process communication (IPC). After the kernel has started, the shell and the user application programs are loaded. An application program is compiled into machine code after being started, forming a process.
Fig. 4 illustrates a software configuration diagram in the display device 200 according to some embodiments. Referring to fig. 4, in some embodiments, the system is divided into four layers, which are an Application (Applications) layer (abbreviated as "Application layer"), an Application Framework (Application Framework) layer (abbreviated as "Framework layer"), an Android runtime (Android runtime) and system library layer (abbreviated as "system runtime library layer"), and a kernel layer from top to bottom.
In some embodiments, at least one application program runs in the application program layer. The application program may be a window (Window) program shipped with the operating system, a system setting program, a clock program, or the like, or an application developed by a third-party developer. In particular implementations, the application packages in the application layer are not limited to the above examples.
The framework layer provides an Application Programming Interface (API) and a programming framework for the applications. The application framework layer includes a number of predefined functions and acts as a processing center that decides which actions the applications in the application layer take. Through the API, an application program can access the resources of the system and obtain the services of the system during execution.
As shown in fig. 4, in the embodiment of the present application, the application framework layer includes Managers (Managers), providers (Content providers), a network management system, and the like, where the Managers include at least one of the following modules: an Activity Manager (Activity Manager) is used for interacting with all activities running in the system; the Location Manager (Location Manager) is used for providing the system service or application with the access of the system Location service; a Package Manager (Package Manager) for retrieving various information related to an application Package currently installed on the device; a Notification Manager (Notification Manager) for controlling display and clearing of Notification messages; a Window Manager (Window Manager) is used to manage the icons, windows, toolbars, wallpapers, and desktop components on a user interface.
In some embodiments, the activity manager is used to manage the lifecycle of the various applications as well as the usual navigation and back functions, such as controlling the exit, opening and back operations of the applications. The window manager is used for managing all window programs, such as obtaining the size of the display screen, judging whether a status bar exists, locking the screen, capturing the screen, and controlling changes of the display window (for example, shrinking the display window, or displaying it with a shake or distortion effect).
In some embodiments, the system runtime library layer provides support for the layer above it, namely the framework layer. When the framework layer is used, the Android operating system runs the C/C++ libraries included in the system runtime library layer to implement the functions required by the framework layer.
In some embodiments, the kernel layer is a layer between hardware and software. As shown in fig. 4, the kernel layer includes at least one of the following drivers: an audio driver, a display driver, a Bluetooth driver, a camera driver, a Wi-Fi driver, a USB driver, an HDMI driver, sensor drivers (such as a fingerprint sensor, a temperature sensor, a pressure sensor, etc.), a power driver, and the like.
FIG. 5 illustrates an icon control interface display of an application in display device 200, according to some embodiments. In some embodiments, the display device may directly enter the interface of a preset video-on-demand program after being started. As shown in fig. 5, the interface of the video-on-demand program may include at least a navigation bar 510 and a content display area located below the navigation bar 510, where the content displayed in the content display area changes as the selected control in the navigation bar changes. The programs in the application program layer can be integrated in the video-on-demand program and displayed through one control of the navigation bar, or can be displayed after the application control in the navigation bar is selected.
In some embodiments, the display device may directly enter a display interface of a signal source selected last time after being started, or a signal source selection interface, where the signal source may be a preset video-on-demand program, or may be at least one of an HDMI interface, a live tv interface, and the like, and after a user selects different signal sources, the display may display contents obtained from different signal sources.
In some embodiments, when the display device is used to implement a smart tv function or a video playing function, different tv programs or different video files, audio files, etc. may be played in the display device. During use of the display device, the user can adjust the output volume of the display device based on the usage requirements so that the user can be immersed in a television program or video.
The volume of a display device is usually adjusted with the remote controller supplied with the device, which is not convenient enough and affects user experience. Therefore, to make adjusting the output volume of the display device more efficient, the embodiment of the invention provides a set of intelligent algorithms for adjusting the volume through gesture recognition: AI image detection technology is used to recognize a person's gesture and position in the image, and the volume is adjusted according to how the position of the gesture changes in the image. Many current smart televisions support external cameras or have built-in cameras, which provides the basis for detecting gestures in images captured by a camera.
The intelligent algorithm for volume adjustment based on gesture recognition provided by the embodiment of the invention can adjust the volume while avoiding the unsmooth adjustment caused by frame loss, stuttering and slow detection during gesture detection, even when the system CPU load is excessive, so that the volume of the display device can be adjusted smoothly by recognizing a sliding gesture.
However, several users may watch television programs on the display device at the same time. If the gestures of several users are collected while the volume of the display device is being controlled by gesture recognition, it cannot be determined which user's gesture should serve as the control gesture. Alternatively, while the volume is being adjusted with one person's gesture, another person raising a gesture can cause the volume to jitter, so the volume adjustment becomes chaotic.
Therefore, to avoid such confusion in multi-person usage scenarios, the embodiment of the invention builds on the intelligent algorithm for volume adjustment through gesture recognition and provides a camera-based gesture recognition algorithm for adjusting the volume of the display device that resists multi-person interference. When multiple gestures appear, the volume is adjusted with the gesture of only one user as the reference, which effectively addresses problems such as loss and disorder in the gesture recognition process, so that the volume of the display device can be adjusted smoothly by recognizing a sliding gesture.
FIG. 6 illustrates a flow diagram of a method of volume adjustment based on multi-person gesture recognition, in accordance with some embodiments; fig. 7 illustrates a data flow diagram of a method of volume adjustment based on multi-person gesture recognition, in accordance with some embodiments. To perform gesture interaction recognition on images captured by the television's camera, and to adjust the volume of the television intelligently by recognizing a sliding gesture in scenes where the gestures of multiple persons appear, the display device provided by the embodiment of the invention comprises: a display configured to present a user interface; an image collector connected with the user input interface 140 and configured to collect user images, where the user input interface 140 implements a user instruction input function through actions such as voice, touch, gesture and pressing, converts the received analog signals into digital signals, and converts the digital signals into corresponding instruction signals that are sent to the image collector; and a controller connected to the display and the image collector, respectively, and configured to perform the following steps when performing the volume adjusting method based on multi-person gesture recognition shown in fig. 6 and 7:
S1, acquiring a user image which is collected by the image collector and includes at least one user gesture, and acquiring the designated gesture ID of the first user gesture in the user image that matches the volume adjustment gesture.
When the volume of the display device is adjusted based on gesture recognition, a global gesture detection function needs to be configured in the display device; this function can be controlled through a global gesture switch of the display device. If the global gesture switch is turned on, the global gesture detection function of the display device is turned on, so that the display device can be controlled according to the recognized user gesture; if the global gesture switch is turned off, the global gesture detection function of the display device is turned off.
FIG. 8 illustrates an interface diagram showing a global gesture switch in a user interface according to some embodiments. Referring to fig. 8 (a), if the global gesture detection function is configured in the display device, an AI setting button is displayed in the user interface. Triggering the AI setting button displays an AI setting interface in the user interface, in which a global gesture switch control button is presented. Clicking the global gesture switch control button displays the global gesture setting interface shown in fig. 8 (b), in which a gesture control switch, namely the global gesture switch, is presented. Clicking the gesture control switch turns on the global gesture switch of the display device. To enable the volume of the display device to be adjusted based on gesture recognition, the global gesture switch may be turned on.
After the global gesture switch is turned on, the display device may perform the global gesture detection function. However, the global gesture detection function needs to call the image collector (camera) to collect user images including user gestures in real time and to identify the type of the user gesture in each user image in order to adjust the volume of the display device. Therefore, the image collector must be in an unoccupied state when the global gesture detection function is implemented.
An application capable of calling the image collector is a camera application, that is, an application which needs the camera to realize its functions while running, such as a mirror application. If the image collector is occupied by a camera application, the image collector cannot acquire user images in real time; therefore, the global gesture detection function can be realized only when the image collector is not occupied by a camera application.
In order to accurately judge whether the display device can realize the global gesture detection function after the global gesture switch is turned on, whether the image collector is occupied by the camera application needs to be judged first. Specifically, in some embodiments, the controller, in performing detecting whether the image collector is occupied by the camera application, is further configured to perform the following steps: acquiring an attribute state value of an image collector; if the attribute state value is a first numerical value, determining that the image collector is occupied by the camera application; and if the attribute state value is the second numerical value, determining that the image collector is not occupied by the camera application.
Whether the image collector is occupied by the camera application or not can be judged according to the attribute state value, and the attribute state value of the image collector is obtained under the condition that the global gesture switch is in the on state. The attribute state value may include two values, a first value and a second value, depending on whether it is occupied. In some embodiments, the first value may be 1 and the second value may be 0.
If the attribute state value is the first value, namely 1, it is determined that the image collector is occupied by the camera application. If the attribute state value is the second value, namely 0, it is determined that the image collector is not occupied by the camera application.
If the image collector is occupied by the camera application, the global gesture detection function is not started. If the image collector is not occupied by the camera application, the global gesture detection function is started; this function detects user images that include user gestures and adjusts the volume of the display device based on the user gesture.
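By way of a non-limiting illustration, the occupancy check and the resulting decision on whether to start the global gesture detection function may be sketched as follows. The names `CameraState` and `should_start_global_gesture_detection` are hypothetical and do not appear in the embodiment, and treating any value other than 0 as occupied is an assumption.

```python
from enum import IntEnum


class CameraState(IntEnum):
    """Attribute state value of the image collector (values from the embodiment)."""
    OCCUPIED = 1  # first value: occupied by a camera application
    FREE = 0      # second value: not occupied


def should_start_global_gesture_detection(switch_on: bool, attribute_state_value: int) -> bool:
    """Return True only if the global gesture switch is on and the camera is free."""
    if not switch_on:
        # The attribute state value is only consulted while the switch is on.
        return False
    # Assumption: any value other than FREE is treated as occupied.
    return attribute_state_value == CameraState.FREE


print(should_start_global_gesture_detection(True, 0))  # True
print(should_start_global_gesture_detection(True, 1))  # False
```

In this sketch the detection function stays off whenever the switch is off, regardless of the attribute state value.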
While using the display device, a user who needs to adjust its volume can stand in the detection area of the image collector (camera) and make a gesture with the fingers. With the global gesture detection function started, the image collector collects user images in the detection area, the user images include the user gesture, and the controller performs gesture recognition on these user images to judge whether the user gesture is a volume adjustment gesture.
The volume adjustment gesture is used to enable intelligent adjustment of the volume of the display device, and in some embodiments, the volume adjustment gesture may be set to an OK gesture. The volume adjustment gesture may also be customized according to the usage habit of the user, for example, the volume adjustment gesture may also be set to be a palm gesture, a finger bending gesture, or the like, and this embodiment is not particularly limited.
The image collector collects user images in the detection area in real time according to a preset frequency and sequentially sends the user images to the controller. The controller performs gesture recognition on each frame of user image, if the success rate of the recognized volume adjusting gesture within the first duration is greater than a first threshold, the gesture recognition is regarded as successful, and intelligent adjustment of the volume of the display device can be achieved according to the recognized volume adjusting gesture.
In some embodiments, the preset frequency of the image collector is 30-40 ms/frame, i.e. the time taken for the image collector to collect one frame of user image is 30-40 ms.
In some embodiments, the device that recognizes the user gesture in the user image may be an image collector, and when detecting that the user gesture in the user image is a volume adjustment gesture, the image collector may send a detection result to the controller, and the controller adjusts the volume value of the display device based on the volume adjustment gesture.
In some embodiments, the device for recognizing the user gesture in the user image may also be a controller, and the controller performs gesture recognition on the user image after receiving the user image sent by the image collector. If the user gesture in the user image is detected to be a volume adjustment gesture, the volume value of the display device can be adjusted based on the volume adjustment gesture.
In some embodiments, if multiple people use the display device at the same time, a user image including multiple user gestures is collected within the detection area of the image collector. In this case, to avoid confusion in volume adjustment, the adjustment is performed with one of the user gestures as the reference user gesture; therefore, a designated user gesture needs to be determined among the plurality of user gestures in the user image.
In order to facilitate identification of each user gesture, a corresponding gesture ID may be configured for each user gesture, and the gesture ID is used to characterize a serial number of each user gesture in the user image and is randomly allocated to distinguish different user gestures. For example, if there are 3 user gestures in the same frame of user image, the gesture IDs of the three user gestures are 1, 2, and 3, respectively.
When the current frame of user image collected by the image collector is recognized, if a plurality of user gestures are identified, the first gesture recognized as a volume adjustment gesture prevails; that is, the gesture ID of the first recognized volume adjustment gesture is used as the designated gesture ID. Subsequently, only the volume adjustment instruction corresponding to the designated gesture ID is responded to, so that the volume of the display device is adjusted smoothly and interference from the gestures of multiple users is prevented.
In some embodiments, the controller, in obtaining the designated gesture ID of the first user gesture in the user image that matches the volume adjustment gesture, is further configured to perform the following steps:
Step 11, identifying at least one user gesture in the user image.
Step 12, judging whether each user gesture matches the volume adjustment gesture.
Step 13, if a user gesture matches the volume adjustment gesture, determining the first user gesture found to match as the designated user gesture.
Step 14, acquiring the gesture ID of the designated user gesture, and determining that gesture ID as the designated gesture ID.
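Steps 11 to 14 may be sketched as follows, purely as an illustration. The `UserGesture` type, its `kind` label and the `matches` predicate are hypothetical stand-ins for the output of the recognition stage and for the matching test of step 12; they are not part of the embodiment.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class UserGesture:
    gesture_id: int  # serial number of the gesture within the user image
    kind: str        # hypothetical label produced by the recognizer


def designated_gesture_id(
    gestures: Iterable[UserGesture],
    matches: Callable[[UserGesture], bool],
) -> Optional[int]:
    """Return the ID of the first gesture that matches, or None if none does."""
    for gesture in gestures:           # step 11: gestures identified in the image
        if matches(gesture):           # steps 12 and 13: first matching gesture wins
            return gesture.gesture_id  # step 14: its ID becomes the designated ID
    return None  # no volume adjustment gesture in this frame


gestures = [UserGesture(1, "five_fingers"), UserGesture(2, "ok"), UserGesture(3, "ok")]
print(designated_gesture_id(gestures, lambda g: g.kind == "ok"))  # 2
```

Returning `None` when nothing matches corresponds to the case in which the gesture detection process for volume adjustment is not started.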
Although the image collector collects a user image including a user gesture in the detection area, the gesture made by the user may not be a gesture for volume adjustment but some other gesture made inadvertently, such as a five-finger gesture or a one-finger "number 1" gesture. In that case the gesture detection process for volume adjustment does not need to be started, that is, the volume bar does not need to be opened for volume adjustment.
Therefore, an AI intelligent detection function is required to be called to perform gesture recognition on the user image, and if a plurality of user gestures exist, whether each user gesture matches with the volume adjustment gesture is sequentially determined.
In addition, the gesture made by the user may be intended for volume adjustment but be poorly formed. For example, when the volume adjustment gesture is an OK gesture, the middle finger, ring finger and little finger should be straight; if the user bends these three fingers, the gesture no longer resembles an OK gesture. Such an ambiguous user gesture makes it impossible to determine accurately whether it should be recognized as a volume adjustment gesture, and misrecognition prevents the volume from being adjusted in time.
Therefore, when the controller performs gesture recognition on each frame of user image, it first needs to judge whether the user gesture in the user image is a volume adjustment gesture; that is, the user gesture is determined to be a volume adjustment gesture only when its similarity to the volume adjustment gesture exceeds a threshold.
In some embodiments, the controller, in determining whether each user gesture matches the volume adjustment gesture, is further configured to perform the following steps:
Step 121, calculating the gesture confidence between the user gesture and the volume adjustment gesture;
Step 122, if the gesture confidence exceeds a second threshold, determining that the user gesture matches the volume adjustment gesture;
Step 123, if the gesture confidence does not exceed the second threshold, determining that the user gesture does not match the volume adjustment gesture.
When the controller recognizes the user gestures in each frame of user image, each user gesture is compared with the volume adjustment gesture to calculate its gesture confidence, that is, its similarity to the volume adjustment gesture.
Each gesture confidence is then compared with the second threshold, which may be set to 99% in some embodiments. If a gesture confidence does not exceed 99%, the corresponding user gesture is determined not to match the volume adjustment gesture and is not treated as a volume adjustment gesture. If a gesture confidence exceeds 99%, the corresponding user gesture is determined to match the volume adjustment gesture and is treated as a volume adjustment gesture. This avoids false recognition.
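The threshold test of steps 121 to 123 may be sketched as follows. How the gesture confidence itself is computed is not specified here, so the sketch takes the confidences as given inputs; the function names and the sample confidence values are hypothetical.

```python
from typing import Dict, List

SECOND_THRESHOLD = 0.99  # the 99% setting mentioned for some embodiments


def matches_volume_gesture(confidence: float, threshold: float = SECOND_THRESHOLD) -> bool:
    """Steps 122 and 123: match only if the confidence strictly exceeds the threshold."""
    return confidence > threshold


def matching_gesture_ids(confidences: Dict[int, float],
                         threshold: float = SECOND_THRESHOLD) -> List[int]:
    """Return the gesture IDs whose confidence exceeds the threshold."""
    return [gid for gid, conf in confidences.items()
            if matches_volume_gesture(conf, threshold)]


# Hypothetical confidences for three gestures in one frame (gesture ID -> confidence).
frame = {1: 0.42, 2: 0.99, 3: 0.997}
print(matching_gesture_ids(frame))  # [3]
```

A confidence exactly equal to the threshold does not count as a match, in line with the wording "exceeds" in step 122.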
The intelligent algorithm for volume adjustment through gesture recognition based on AI image detection technology comprises a byte algorithm and a gesture algorithm. The byte algorithm detects the user gestures in the user image; if the same frame of user image is found to include a plurality of user gestures, the recognized user gestures are passed to the gesture algorithm in no particular order.
The gesture algorithm matches each user gesture against the volume adjustment gesture. As soon as one of the user gestures in the user image is found to be the volume adjustment gesture, its gesture ID is recorded. The order in which the gesture algorithm matches the user gestures is random: if the user gesture checked first is not a volume adjustment gesture, the next user gesture is polled, and so on until one user gesture is detected to be the volume adjustment gesture, at which point the remaining gestures in that frame of user image are no longer polled.
For example, if 3 user gestures exist in the same user image, the gesture algorithm randomly selects one user gesture to match against the volume adjustment gesture. If it does not match, a second user gesture is selected; if that one matches, the third user gesture is not checked, regardless of whether it is a volume adjustment gesture.
Based on this intelligent algorithm, if a user gesture in the same frame of user image matches the volume adjustment gesture, the first user gesture found to match is determined as the designated user gesture, and its gesture ID is acquired and determined as the designated gesture ID.
FIG. 9 illustrates a schematic diagram of the presence of multiple user gestures in a user image, in accordance with some embodiments. For example, referring to fig. 9, if there are 3 user gestures in the same user image, the user gesture 1 is a single-hand five-finger gesture, and the corresponding gesture ID is 1; the user gesture 2 is a finger-to-number 1 gesture, and the corresponding gesture ID is 2; user gesture 3 is an OK gesture and the corresponding gesture ID is 3.
The gesture algorithm randomly selects the first matching user gesture to be user gesture 2, and user gesture 2 is a finger-to-number 1 gesture which is not matched with the volume adjustment gesture. At this time, the second user gesture is selected to be matched with the volume adjusting gesture, if the second user gesture 3 is matched and is consistent with the volume adjusting gesture, the third user gesture (user gesture 1) and the volume adjusting gesture are not matched any more, no matter whether the third user gesture is the volume adjusting gesture or not.
When the second user gesture is matched with the volume adjusting gesture in a matching manner, at this time, if the gesture ID of the second user gesture (user gesture 3) is 3, the designated gesture ID corresponding to the current volume adjusting process is 3.
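The polling in the Fig. 9 example can be traced with the short sketch below. The polling order is passed in explicitly so that the trace is reproducible, whereas in the embodiment the order is random; the string labels and the function name `poll_in_order` are hypothetical.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# Gestures of the Fig. 9 example: gesture ID -> recognized gesture type.
FIG9_GESTURES: Dict[int, str] = {1: "five_fingers", 2: "number_one", 3: "ok"}


def poll_in_order(
    order: Sequence[int],
    gestures: Dict[int, str],
    volume_gesture: str = "ok",
) -> Tuple[Optional[int], List[int]]:
    """Poll gestures in the given order and stop at the first volume adjustment gesture.

    Returns the designated gesture ID (or None) and the list of IDs actually checked.
    """
    checked: List[int] = []
    for gesture_id in order:
        checked.append(gesture_id)
        if gestures[gesture_id] == volume_gesture:
            return gesture_id, checked  # remaining gestures are not polled
    return None, checked


designated_id, checked = poll_in_order([2, 3, 1], FIG9_GESTURES)
print(designated_id, checked)  # 3 [2, 3]
```

With the order 2, 3, 1 described above, user gesture 1 is never checked. Since only one gesture in Fig. 9 is an OK gesture, every polling order yields the designated gesture ID 3; the order would matter only if more than one user made the volume adjustment gesture in the same frame.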
The designated gesture ID identifies the first user gesture that matched the volume adjustment gesture and can be used to track the gesture of that same user, so that subsequently only the volume adjustment instruction generated by the user corresponding to the designated gesture ID adjusts the volume of the display device. This guarantees that, even if gestures of other users interfere during the adjustment process, those gestures are not responded to, which keeps the volume adjustment smooth, avoids confusion, and allows accurate volume adjustment.
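The selection of the designated gesture ID described above can be sketched as follows. This is only an illustrative sketch under stated assumptions: the `Gesture` type, the string labels such as "ok", and the function name `pick_designated_gesture_id` are hypothetical and do not appear in the embodiments; the OK gesture is assumed to be the volume adjustment gesture, as in the example of fig. 9.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Gesture:
    gesture_id: int  # ID assigned to this user's gesture by the detector
    kind: str        # hypothetical label, e.g. "five_fingers", "one", "ok"


def pick_designated_gesture_id(gestures: List[Gesture],
                               volume_gesture: str = "ok",
                               rng: Optional[random.Random] = None) -> Optional[int]:
    """Return the ID of the first gesture that matches the volume adjustment gesture.

    Gestures are tried in random order and matching stops at the first hit,
    so any remaining gestures in the frame are never examined.
    """
    rng = rng or random.Random()
    order = list(gestures)
    rng.shuffle(order)
    for gesture in order:
        if gesture.kind == volume_gesture:
            return gesture.gesture_id
    return None  # no volume adjustment gesture in this frame


# The three gestures of fig. 9: only user gesture 3 is an OK gesture.
frame = [Gesture(1, "five_fingers"), Gesture(2, "one"), Gesture(3, "ok")]
designated_id = pick_designated_gesture_id(frame)
```

With the gestures of fig. 9 the result is always gesture ID 3, because only one gesture matches. If two users made the OK gesture in the same frame, whichever one the random order reached first would become the designated gesture.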
S2, calculating the recognition success rate of the volume adjustment gesture corresponding to the specified gesture ID in the frames of user images collected in the first time length.
When a volume adjustment gesture is recognized in the current frame of user image, the designated gesture ID of the gesture is recorded and a timer is started. The duration of the timer is the first duration, timed from the moment the designated gesture ID is determined. In some embodiments, the first duration is used for the gesture detection process and may be set to 1 second. The first duration may also be set to other values according to the practical application, and is not specifically limited herein.
After the designated gesture ID is determined, the image collector collects a plurality of frames of user images within the first duration and sends them to the controller, so that gesture detection and recognition can be performed on each frame of user image. To adjust the volume of the display device accurately based on the gesture, the volume is not adjusted as soon as the volume adjustment gesture is recognized. Instead, the volume adjustment gesture corresponding to the designated gesture ID must be held for a period of time before the volume is adjusted. This avoids the situation in which a user makes the volume adjustment gesture, holds it only briefly, and then drops it, leaving the controller unable to determine whether volume adjustment is required.
Therefore, the recognition success rate of the volume adjustment gesture corresponding to the designated gesture ID needs to be calculated. Gesture recognition is deemed successful only when the number of frames in which the volume adjustment gesture corresponding to the designated gesture ID is successfully recognized within the first duration meets the condition; the volume of the display device can then be adjusted according to the volume adjustment gesture.
FIG. 10 illustrates a flow diagram of a method of calculating a recognition success rate according to some embodiments. Referring to fig. 10, in some embodiments, when calculating the recognition success rate of the volume adjustment gesture corresponding to the designated gesture ID in the frames of user images collected within the first duration, the controller is further configured to:
S21, after the designated gesture ID is determined, acquiring a plurality of frames of user images collected within the first duration, and determining each user image in which a volume adjustment gesture exists as a designated user image.
S22, comparing the designated gesture ID with the gesture ID of the volume adjustment gesture in each frame of designated user image, and determining each designated user image whose volume adjustment gesture has a matching gesture ID as a gesture recognition success frame.
S23, counting the total number of gesture recognition success frames within the first duration and the total number of recognition frames of the collected user images.
S24, calculating the ratio of the total number of successful gesture recognition frames to the total number of recognition frames, and determining the ratio as the recognition success rate of the volume adjustment gesture corresponding to the designated gesture ID.
After the frames of user images collected within the first duration are acquired, the gesture information of each user in each frame of user image is compared with the designated gesture ID. If the gesture ID is the same and the gesture is a volume adjustment gesture, the detected gesture belongs to the same user.
Therefore, each user gesture in each frame of user image is matched against the volume adjustment gesture to judge whether a volume adjustment gesture exists in that frame, and each user image in which a volume adjustment gesture exists is determined as a designated user image. For the specific matching and determination, reference may be made to the foregoing embodiments; details are not repeated here.
For example, 10 frames of user images are acquired within a first duration, at least one user gesture exists in each frame of user image, each user gesture is matched with a volume adjustment gesture, and if it is judged that the volume adjustment gesture exists in 8 frames of user images, the 8 frames of user images are determined to be the designated user images.
Since even if a volume adjustment gesture exists in the user image, the gesture may be a gesture made by another user, in order to ensure that the volume adjustment of the display device is realized by only one user's gesture, it is necessary to determine again whether the gesture ID of each user's gesture is consistent with the designated gesture ID.
The gesture ID of the volume adjustment gesture in each frame of designated user image is acquired and compared with the designated gesture ID as the reference. When the gesture IDs are the same, the designated user image to which that volume adjustment gesture belongs is determined as a gesture recognition success frame. Even if a plurality of volume adjustment gestures exist in the same frame of designated user image, the gesture IDs of different users differ, so the volume adjustment gesture that matches the designated gesture ID can be identified and the gesture recognition success frame determined.
All gesture recognition success frames detected within the first duration are collected and their total number is counted. At the same time, the total number of recognition frames, that is, of all user images collected within the first duration, is counted. The ratio of the total number of gesture recognition success frames to the total number of recognition frames is calculated and determined as the recognition success rate of the volume adjustment gesture corresponding to the designated gesture ID.
For example, when the AI detection method detects that the user gesture in a certain frame of user image is a volume adjustment gesture, a 1 s timer (the first duration) is started, and each frame of user image within that 1 s is examined. If a volume adjustment gesture is detected and its gesture ID is the same as the designated gesture ID, that is, the gesture is made by the same hand, the total number of gesture recognition success frames (DetectedFrames) and the total number of recognition frames (TotalFrames) are each incremented by one. If the detected user gesture is not a volume adjustment gesture, or its gesture ID differs from the designated gesture ID, only the total number of recognition frames (TotalFrames) is incremented by one. This process repeats until the 1 s (first duration) elapses, at which point the ratio of the total number of gesture recognition success frames to the total number of recognition frames is calculated (success rate = DetectedFrames / TotalFrames) and determined as the recognition success rate of the volume adjustment gesture corresponding to the designated gesture ID.
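The counting of DetectedFrames and TotalFrames described above can be sketched as follows. This is a minimal sketch, assuming each frame has already been reduced to a list of (gesture ID, is-volume-gesture) pairs; that representation and the function name `recognition_success_rate` are illustrative assumptions rather than part of the embodiments.

```python
from typing import Iterable, List, Tuple

# One detected gesture in a frame: (gesture_id, is_volume_adjustment_gesture).
Detection = Tuple[int, bool]


def recognition_success_rate(frames: Iterable[List[Detection]],
                             designated_id: int) -> float:
    """Ratio of frames containing the designated volume gesture to all frames."""
    detected_frames = 0  # DetectedFrames in the text
    total_frames = 0     # TotalFrames in the text
    for detections in frames:
        total_frames += 1
        # A frame is a success frame only if some gesture in it is a volume
        # adjustment gesture AND carries the designated gesture ID.
        if any(gid == designated_id and is_volume
               for gid, is_volume in detections):
            detected_frames += 1
    return detected_frames / total_frames if total_frames else 0.0


# Ten frames collected within the first duration; the designated gesture ID is 3.
frames = (
    [[(3, True)]] * 8      # designated user holds the volume gesture
    + [[(5, True)]]        # volume gesture, but made by another user
    + [[(3, False)]]       # designated user shows a different gesture
)
rate = recognition_success_rate(frames, designated_id=3)
```

In this example 8 of the 10 frames count as gesture recognition success frames. The frame with another user's volume adjustment gesture increments only the total, which is what keeps a second user from contributing to the success rate.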
And S3, if the recognition success rate of the volume adjusting gesture exceeds a first threshold value, displaying a volume bar in the user interface.
In some embodiments, if a user makes the volume adjustment gesture, holds it only briefly, and then drops it, the controller cannot determine whether the volume bar needs to be called up for volume adjustment. Therefore, so that the controller can call up the volume bar at the right time, a first threshold may be set: among the frames of user images collected by the image collector within the first duration, gesture recognition is regarded as successful only if the recognition success rate of the volume adjustment gesture corresponding to the designated gesture ID exceeds the first threshold, and only then is the volume bar control called up for the subsequent volume adjustment process.
That is, during the gesture detection process (within the first duration), the proportion of user images containing a user gesture recognized as the volume adjustment gesture, out of all user images, must exceed the threshold before it can be concluded that the user wants to turn on the volume adjustment function; otherwise, the volume adjustment process is not performed.
Therefore, after the recognition success rate of the volume adjustment gesture within the first duration is determined, it is judged whether the recognition success rate exceeds the first threshold. In some embodiments, the first threshold may be set to 90%. If the recognition success rate exceeds the first threshold of 90%, gesture recognition has succeeded in the current gesture detection process (within the first duration), and a volume bar may be displayed in the user interface. The first threshold may also be set to other values according to the practical application, and is not specifically limited herein.
If the recognition success rate of the volume adjustment gesture corresponding to the designated gesture ID is smaller than the first threshold, gesture recognition has failed, and the currently detected volume adjustment gesture of that user may have been lost; detection is then restarted immediately.
In some embodiments, because gesture recognition must succeed before the controller calls up the volume bar control, the controller may display a volume adjustment gesture prompt interface in the user interface so that the user clearly knows that the current gesture recognition has succeeded.
In some embodiments, when displaying a volume bar in the user interface if the recognition success rate of the volume adjustment gesture exceeds the first threshold, the controller is further configured to:
Step 31, if the recognition success rate of the volume adjustment gesture exceeds the first threshold, presenting a volume adjustment gesture prompt interface in the user interface, and presenting gesture recognition success prompt information and a volume adjustment gesture pattern in the volume adjustment gesture prompt interface.
Step 32, when the display duration of the volume adjustment gesture prompt interface exceeds a second duration, canceling the display of the volume adjustment gesture prompt interface, and displaying a volume adjustment interface in the user interface, wherein the volume adjustment interface comprises a volume bar and volume adjustment operation prompt information.
When the recognition success rate of the volume adjustment gesture corresponding to the designated gesture ID exceeds the first threshold, the current gesture has been successfully recognized, and the subsequent process of triggering volume adjustment by a sliding operation based on the volume adjustment gesture can proceed. To give the user feedback on the volume adjustment gesture, a volume adjustment gesture prompt interface may be generated and displayed in the user interface.
FIG. 11 illustrates a schematic diagram of displaying a volume adjustment gesture prompt interface in a user interface, in accordance with some embodiments. Referring to fig. 11, when gesture recognition succeeds, gesture recognition success prompt information and a volume adjustment gesture pattern are presented in the volume adjustment gesture prompt interface. The gesture recognition success prompt information informs the user that the designated gesture that can adjust the volume has been successfully recognized; its content may be "… in gesture recognition". After seeing the prompt information, the user knows that performing a sliding operation based on the volume adjustment gesture will trigger the subsequent volume adjustment process.
The volume adjustment gesture pattern is the UI representation of the volume adjustment gesture, and typically has the same form as the gesture the user makes with the fingers. For example, if the volume adjustment gesture is an OK gesture, the volume adjustment gesture pattern is an "OK" pattern.
After seeing the gesture recognition success prompt information and the volume adjustment gesture pattern, the user knows that the detection process for the volume adjustment gesture has ended in successful recognition. While presenting the volume adjustment gesture prompt interface in the user interface, the controller also notifies the system, by sending a broadcast, to call up the volume bar control. The volume bar control is displayed in the user interface to indicate to the user that the volume adjustment process is about to begin.
Because the volume adjustment gesture prompt interface is displayed in the user interface when gesture recognition succeeds, a display duration may be set for it so that the system automatically dismisses the volume adjustment gesture prompt interface once the volume bar is called up. When the display duration of the volume adjustment gesture prompt interface reaches the threshold, its display is cancelled and the volume bar is displayed in the user interface. From the user's point of view, the content presented in the user interface switches from the volume adjustment gesture prompt interface to the volume bar.
In some embodiments, the display duration of the volume adjustment gesture prompting interface is set to be a second duration, which may be 500ms, and after the display duration of the volume adjustment gesture prompting interface exceeds 500ms, the display of the volume adjustment gesture prompting interface may be cancelled, and the volume adjustment interface is displayed in the user interface, where the volume adjustment interface includes a volume bar and volume adjustment operation prompting information.
FIG. 12 illustrates a schematic diagram of displaying a volume adjustment interface in a user interface, according to some embodiments. Referring to fig. 12, when the display duration of the volume adjustment gesture prompt interface reaches the threshold, the volume adjustment interface is switched and displayed in the user interface, where the volume adjustment interface includes a volume bar and volume adjustment operation prompt information, the volume bar is used to present the output volume of the current display device, and the volume adjustment operation prompt information is used to prompt the user to perform an operation step of adjusting the volume of the display device by using the volume adjustment gesture, for example, the prompt information may be "move left and right to adjust the volume".
After the user sees the volume adjustment operation prompt information in the user interface, the corresponding gesture operation can be executed according to the prompt content of the user interface, so that the volume adjustment process is started.
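The sequence of interfaces described above (no interface while detection has not succeeded, then the gesture prompt interface, then the volume adjustment interface) can be sketched as a small decision function. This is a sketch under stated assumptions: the function name `interface_to_show` and the returned labels are hypothetical, the constants use the example values of 90% and 500 ms, and a success rate exactly equal to the first threshold is treated as a failure because the text requires the rate to exceed the threshold.

```python
FIRST_THRESHOLD = 0.90      # example first threshold from the text
SECOND_DURATION_MS = 500    # example second duration from the text


def interface_to_show(success_rate: float, ms_since_recognition: int) -> str:
    """Decide which interface the user interface should present.

    success_rate:          recognition success rate over the first duration
    ms_since_recognition:  how long the gesture prompt interface has been shown
    """
    if success_rate <= FIRST_THRESHOLD:
        # Recognition failed (assumption: equality also counts as failure);
        # nothing is shown and detection starts again.
        return "none"
    if ms_since_recognition <= SECOND_DURATION_MS:
        # Feedback that the gesture was recognized, with the gesture pattern.
        return "gesture_prompt"
    # Prompt display time exceeded: switch to the volume bar and the
    # "move left and right to adjust the volume" operation prompt.
    return "volume_adjustment"
```

For instance, a success rate of 0.95 yields the gesture prompt interface for the first 500 ms and the volume adjustment interface afterwards, while a success rate of 0.8 yields no interface at all.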
S4, in response to a volume adjustment instruction generated when the user corresponding to the designated gesture ID performs a designated action based on the volume adjustment gesture, adjusting the volume value corresponding to the volume bar.
When the volume adjustment gesture made by the user meets the requirement for starting the volume adjustment process, the user can perform the corresponding operation according to the volume adjustment operation prompt information presented in the user interface. That is, after gesture recognition succeeds, the user corresponding to the designated gesture ID holds the volume adjustment gesture and adjusts the volume by performing the designated action, which in some embodiments may be sliding the gesture left and right.
For example, the user can adjust the volume by holding the OK gesture and sliding it in the horizontal direction in front of the display device (within the detection area of the image collector).
The controller responds only to the instruction generated by the user corresponding to the designated gesture ID. If other users perform the designated action based on the volume adjustment gesture, their gesture IDs differ from the previously determined designated gesture ID, so the controller does not respond to their volume adjustment instructions. This prevents the volume from jittering with the volume adjustment gestures of other users while one user is sliding to adjust the volume, which would degrade the user experience.
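The filtering by designated gesture ID can be sketched as follows. The `SlideEvent` type and the function name `accepted_events` are hypothetical and serve only to illustrate the rule that instructions from other users are ignored.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SlideEvent:
    gesture_id: int          # ID of the gesture that produced the slide
    is_volume_gesture: bool  # whether the hand is holding the volume gesture
    dx: int                  # horizontal movement in pixels


def accepted_events(events: List[SlideEvent], designated_id: int) -> List[SlideEvent]:
    """Keep only slides made with the volume gesture by the designated user."""
    return [e for e in events
            if e.is_volume_gesture and e.gesture_id == designated_id]


events = [
    SlideEvent(3, True, 12),    # designated user slides right
    SlideEvent(5, True, -20),   # another user slides left: ignored
    SlideEvent(3, False, 9),    # designated user dropped the gesture: ignored
]
kept = accepted_events(events, designated_id=3)
```

Only the first event survives the filter, so the second user's slide to the left cannot pull the volume down while the designated user is raising it.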
When the user corresponding to the designated gesture ID holds the volume adjustment gesture and performs the designated action, for example sliding left and right in the horizontal direction, the position of the user's hand changes, which appears in the user image as a change in the abscissa of the gesture. When the position changes, a volume adjustment instruction is generated, and the controller responds to it by adjusting the volume of the volume bar in real time according to the change in abscissa. The volume value displayed on the volume bar then changes, that is, increases or decreases.
Fig. 13 illustrates a flow chart of a method of adjusting the volume corresponding to a volume bar according to some embodiments. Referring to fig. 13, in some embodiments, when adjusting the volume value corresponding to the volume bar in response to the volume adjustment instruction generated when the user corresponding to the designated gesture ID performs the designated action based on the volume adjustment gesture, the controller is further configured to:
S41, receiving a volume adjustment instruction generated when the user corresponding to the designated gesture ID performs the designated action based on the volume adjustment gesture, wherein the designated action is the action performed by the user according to the volume adjustment operation prompt information.
S42, in response to the volume adjustment instruction, acquiring the start coordinate value and the end coordinate value, as presented in the user images, of the user performing the designated action within a third duration.
S43, calculating, based on the start coordinate value and the end coordinate value, the abscissa change amount generated when the user performs the designated action based on the volume adjustment gesture.
S44, determining the volume adjustment value and the volume adjustment direction of the volume bar based on the abscissa change amount.
S45, adjusting the volume value corresponding to the volume bar based on the volume adjustment value and the volume adjustment direction of the volume bar.
The user performs the designated action according to the volume adjustment operation prompt information presented in the user interface. For example, the user holds the OK gesture and slides it left and right in the horizontal direction in front of the display device (within the detection area of the image collector); the resulting position change generates a volume adjustment instruction.
Because the image collector normally takes 30-40 ms to collect one frame of user image, to keep volume adjustment responsive, the gesture detection algorithm calculates, every 100 ms, the position change produced by the user holding the volume adjustment gesture while performing the designated action, and the volume is adjusted linearly according to that position change.
Therefore, after responding to the volume adjustment instruction, the controller can acquire the start coordinate value and the end coordinate value, as presented in the user images, of the user performing the designated action within the third duration, so as to calculate the abscissa change amount generated when the user performs the designated action based on the volume adjustment gesture. In some embodiments, the third duration may be set to 100 ms, and the position change produced by the user's sliding gesture is equivalent to the abscissa change amount produced in the user images.
Within the third duration, the image collector collects, in real time, the user image at the initial time at which the user starts the designated action and the user image at the termination time. A rectangular coordinate system is established with the upper left corner of the user image as the origin, the positive X axis running from left to right, and the positive Y axis running from top to bottom.
In this rectangular coordinate system, the pixel coordinate value of the volume adjustment gesture in the user image collected at the initial time is the start coordinate value, the pixel coordinate value of the volume adjustment gesture in the user image collected at the termination time is the end coordinate value, and the abscissa change amount is calculated from the abscissa of the start coordinate value and the abscissa of the end coordinate value. All coordinate values are expressed in pixel coordinates.
The termination time refers to the time of each image collection within the third duration. For example, 30-40 ms after the initial time, the image collector collects a frame of user image, and the time at which that collection completes is a termination time; after another 30-40 ms, the image collector collects another frame of user image, and the time at which the second collection completes is another termination time. Because a termination time is generated each time a frame of user image is collected, the final termination time is the time at which the abscissa change amount meets the threshold condition.
FIG. 14 illustrates a schematic diagram of calculating an abscissa change amount, according to some embodiments. Referring to fig. 14, at the initial time when the user corresponding to the designated gesture ID performs the designated action, the position of the volume adjustment gesture corresponding to the designated gesture ID is point A, and the start coordinate value is A(x0, y0). If two image collections are performed within the third duration, the positions reached by the user's sliding gesture are points B1 and B2, and the end coordinate values are B1(x1, y1) and B2(x2, y2), respectively.
When B1 is the end coordinate value, the abscissa change amount generated when the user performs the designated action based on the volume adjustment gesture is L1 = x1 - x0. When B2 is the end coordinate value, the abscissa change amount is L2 = x2 - x0.
In some embodiments, to reduce the effect of frame loss and stuttering while the image collector is collecting images, a threshold condition may be set when determining the volume adjustment value and the volume adjustment direction of the volume bar based on the abscissa change amount.
Specifically, when determining the volume adjustment value and the volume adjustment direction of the volume bar based on the abscissa change amount, the controller is further configured to perform the following steps:
Step 441, if the abscissa change amount is greater than a third threshold, determining that the volume adjustment value of the volume bar is a designated adjustment amount and that the adjustment direction is to increase the volume.
Step 442, if the abscissa change amount is smaller than a fourth threshold, determining that the volume adjustment value of the volume bar is the designated adjustment amount and that the adjustment direction is to decrease the volume.
In some embodiments, the third threshold and the fourth threshold may be set according to the desired effect of increasing and decreasing the volume; the third threshold may be 8, and the fourth threshold may be -8. A threshold of 8 represents 8 pixels.
If the abscissa change amount is greater than the third threshold of 8, the user, holding the volume adjustment gesture, has slid to the right, and the adjustment direction of the volume bar is determined to be increasing the volume. If the abscissa change amount is less than the fourth threshold of -8, the user, holding the volume adjustment gesture, has slid to the left, and the adjustment direction of the volume bar is determined to be decreasing the volume.
To achieve linear adjustment of the volume, the volume adjustment value applied at each adjustment may be set to 3 volume units. For example, if the abscissa change amount is greater than the third threshold, the volume of the volume bar is increased by 3 from its current value; if the abscissa change amount is smaller than the fourth threshold, the volume of the volume bar is decreased by 3 from its current value.
Taking the state shown in fig. 14 as an example, the user corresponding to the designated gesture ID slides to the left based on the volume adjustment gesture, and the image collector collects two frames of user images within the third duration. The abscissa change amount for the user image collected at the first termination time (gesture at point B1) is L1 = x1 - x0, and the abscissa change amount for the user image collected at the second termination time (gesture at point B2) is L2 = x2 - x0.
If L1 > -8 (the fourth threshold), the position change produced by the user sliding the held volume adjustment gesture from point A to point B1 is insufficient to trigger the volume adjustment process, and the volume of the volume bar is not adjusted. If L2 < -8 (the fourth threshold), the position change produced by sliding from point A to point B2 is sufficient to trigger the volume adjustment process, and the volume value of the volume bar may be adjusted.
After the volume adjustment value and the volume adjustment direction of the volume bar are determined, the volume corresponding to the volume bar can be adjusted. For example, when the volume adjustment value is 3 and the volume adjustment direction is to increase the volume, the current volume value of the volume bar is increased by 3. When the volume adjustment value is 3 and the volume adjustment direction is to decrease the volume, the current volume value of the volume bar is decreased by 3.
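Steps S43 to S45 with the example values above (thresholds of 8 and -8 pixels, an adjustment amount of 3) can be sketched as follows. The function names, the sample pixel coordinates, and the clamping of the volume to the range 0 to 100 are assumptions introduced for illustration; the embodiments do not specify a volume range.

```python
from typing import Tuple

THIRD_THRESHOLD = 8      # pixels; sliding right beyond this increases volume
FOURTH_THRESHOLD = -8    # pixels; sliding left beyond this decreases volume
ADJUSTMENT_AMOUNT = 3    # designated adjustment amount per trigger

Point = Tuple[int, int]  # (x, y) pixel coordinates, origin at top-left


def abscissa_change(start: Point, end: Point) -> int:
    """S43: horizontal displacement between start and end coordinates."""
    return end[0] - start[0]


def volume_delta(dx: int) -> int:
    """S44: signed volume change for a given abscissa change amount."""
    if dx > THIRD_THRESHOLD:
        return ADJUSTMENT_AMOUNT
    if dx < FOURTH_THRESHOLD:
        return -ADJUSTMENT_AMOUNT
    return 0  # movement too small: no adjustment


def apply_adjustment(current: int, delta: int, lo: int = 0, hi: int = 100) -> int:
    """S45: apply the change; the 0..100 clamp is an assumed volume range."""
    return max(lo, min(hi, current + delta))


# Hypothetical coordinates for fig. 14: the user slides left from A to B1 to B2.
a, b1, b2 = (200, 120), (195, 121), (188, 119)
l1 = abscissa_change(a, b1)   # -5: above the fourth threshold, no trigger
l2 = abscissa_change(a, b2)   # -12: below the fourth threshold, volume down
new_volume = apply_adjustment(20, volume_delta(l2))
```

Because the image Y axis is not used, vertical jitter of the hand (y changing from 120 to 121 to 119 here) has no effect on the result.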
In some embodiments, the controller adjusts the volume of the volume bar linearly according to the abscissa change amount within the third duration: each time the abscissa change amount produced by the user's sliding gesture is detected to reach 8 pixels within 100 ms, the controller performs one volume adjustment, changing the volume by 3 volume values.
The adjustment direction of the volume is determined by the sign of the detected 8-pixel change: a negative value indicates that the volume value needs to be reduced, and a positive value indicates that the volume value needs to be increased.
It should be noted that the gesture detection duration (third duration), the position change thresholds of 8 pixels (third threshold and fourth threshold), and the adjustment value of 3 volumes (designated adjustment amount) may also be set to other values according to the practical application; they are given here only as examples and are not specifically limited.
In some embodiments, during volume adjustment, excessive CPU load on the display device often causes frame loss and stuttering in gesture detection while the gesture is sliding. As a result, the user's gesture moves within the third duration (100 ms) but no corresponding abscissa change appears in the user images, so the volume is not adjusted, which affects the accuracy of volume adjustment.
Therefore, to ensure accurate volume adjustment even when frame loss and stuttering occur, the controller is further configured to perform the following steps when determining the start coordinate value and the end coordinate value:
Step 421, if the abscissa change amount is zero within the third duration, extending the gesture detection duration by a further third duration.
Step 422, based on the total gesture detection duration, acquiring the start coordinate value and the end coordinate value when the user corresponding to the designated gesture ID performs the designated action based on the volume adjustment gesture, where the total gesture detection duration is the sum of a plurality of third durations, the start coordinate value is the start coordinate value of the first third duration, and the end coordinate value is the end coordinate value of the last third duration.
When the volume adjustment process starts, the controller performs gesture detection in real time on the user images collected by the image collector to determine the abscissa variation produced as the gesture slides. If no position change is detected within the third duration (100 ms) during gesture detection, that is, the abscissa variation is zero, detection is extended by a further third duration (100 ms), and this repeats until the volume adjustment threshold is reached (the abscissa variation satisfies the third threshold or the fourth threshold).
The total gesture detection duration is then the extended duration. The start coordinate value produced when the user corresponding to the designated gesture ID performs the designated action based on the volume adjustment gesture is still the coordinate value at the initial moment, which is the moment the gesture starts sliding within the first third duration. The end coordinate value is the coordinate value at the end moment within the last third duration, which is the moment at which the abscissa variation meets the threshold condition.
For example, if no abscissa variation is detected within the first 100 ms of gesture detection, detection is extended by another 100 ms. If the abscissa variation produced at the 150th ms (within the second 100 ms) meets the threshold condition, then the pixel coordinate of the gesture position detected at the initial moment of the slide within the first 100 ms is the start coordinate value, and the pixel coordinate of the gesture position detected at 150 ms (the end moment) is the end coordinate value.
Therefore, if no abscissa variation is detected within the gesture detection duration (the third duration), detection keeps being extended by the third duration until the abscissa variation reaches the volume adjustment threshold (the third threshold or the fourth threshold).
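The window-extension logic of steps 421 and 422 can be sketched as follows. This is a minimal illustration rather than the implementation described in the embodiments: the function name `detect_slide`, the sample format, and the threshold values used in the usage note are assumptions, and the sketch extends the window whenever no threshold has been met, which also covers the zero-variation case.

```python
from typing import Iterable, Optional, Tuple

THIRD_DURATION_MS = 100  # the "third duration" named in the text


def detect_slide(
    samples: Iterable[Tuple[int, int]],
    third_threshold: int,
    fourth_threshold: int,
    window_ms: int = THIRD_DURATION_MS,
) -> Optional[Tuple[int, int, int]]:
    """Return (start_x, end_x, total_detection_ms) once a threshold is met.

    samples: (timestamp_ms, x_pixel) pairs in time order, with timestamps
    measured from the start of detection (hypothetical format). Dropped
    frames simply show up as missing samples.
    """
    samples = list(samples)
    if not samples:
        return None
    start_x = samples[0][1]   # start coordinate: fixed in the first window
    deadline_ms = window_ms   # end of the current detection window
    for t_ms, x in samples[1:]:
        # No threshold was met in the earlier windows, so keep extending
        # the detection duration by one more third duration.
        while t_ms >= deadline_ms:
            deadline_ms += window_ms
        delta = x - start_x   # abscissa variation
        if delta > third_threshold or delta < fourth_threshold:
            return start_x, x, deadline_ms
    return None
```

With hypothetical thresholds of +50 and -50 pixels, `detect_slide([(0, 300), (150, 360)], 50, -50)` reproduces the 150 ms example above: the start coordinate comes from the first window, the end coordinate from the second, and the total detection duration is two third durations.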
In some embodiments, when the user corresponding to the designated gesture ID performs a sliding motion based on the volume adjustment gesture to adjust the volume of the display device, volume adjustment state prompt information may be displayed in the user interface to tell the user that volume adjustment is in progress.
Fig. 15 illustrates a schematic diagram of displaying volume adjustment state prompt information in a user interface, according to some embodiments. Referring to fig. 15, in the volume adjustment process, the controller is further configured to: in response to the volume adjustment instruction, switch the volume adjustment operation prompt information presented in the user interface to volume adjustment state prompt information.
Throughout the volume adjustment process, the volume adjustment interface, specifically the volume bar, remains displayed in the user interface, and the volume value shown by the volume bar changes (increases or decreases) as the user's volume adjustment gesture slides.
To tell the user that volume adjustment is in progress, the volume adjustment operation prompt information in the volume adjustment interface may be removed and replaced with the volume adjustment state prompt information, whose content may be, for example, "Adjusting volume...".
In some embodiments, after adjusting the volume of the display device to the desired level, the user may put down the volume adjustment gesture. The next frame of user image collected by the image collector then no longer includes the volume adjustment gesture, which indicates that the user has completed the volume adjustment process.
Because the gesture is no longer sliding, the volume bar no longer moves left or right with the gesture, that is, the volume is not adjusted. Therefore, after the user completes the volume adjustment process, the volume adjustment interface (volume bar) displayed in the user interface is dismissed. After the controller receives the broadcast indicating that the volume bar has disappeared, it generates a volume adjustment completion interface and displays it in the user interface to prompt the user that volume adjustment is complete.
Fig. 16 illustrates a schematic diagram of displaying a volume adjustment completion interface in a user interface, according to some embodiments. Referring to fig. 16, to prompt the user that volume adjustment is complete, a volume adjustment completion pattern and volume adjustment completion prompt information are displayed in the volume adjustment completion interface. The volume adjustment completion pattern may take the form of a check mark UI, and the content of the volume adjustment completion prompt information may be "Volume adjusted successfully" or the like.
In some embodiments, while the user corresponding to the designated gesture ID is adjusting the volume by sliding the volume adjustment gesture, the user may hold the gesture without sliding. The abscissa variation is then zero, which means the volume shown by the volume bar is neither increased nor decreased. Because no volume adjustment is taking place in this state, the controller can treat it as a volume adjustment completion state, so that a volume adjustment interface left permanently on screen does not interfere with normal use of the display device.
To ensure that the controller can accurately detect whether the volume adjustment process is complete, the duration of the state in which the abscissa variation is zero may be bounded. Specifically, the controller is further configured to: when the abscissa variation produced when the user corresponding to the designated gesture ID performs the designated action based on the volume adjustment gesture is zero throughout a fourth duration, dismiss the volume bar and present a volume adjustment completion interface in the user interface, where the volume adjustment completion interface includes a volume adjustment completion pattern and volume adjustment completion prompt information.
In some embodiments, the duration of the zero-variation state is set to a fourth duration. The fourth duration runs from the moment the volume shown by the volume bar last changed to the current moment; equivalently, it runs from the moment the user stops sliding while still holding the volume adjustment gesture to the current moment. The fourth duration may be set to 2 seconds or to another value according to the actual application, and is not specifically limited here.
If, after the user has performed one volume adjustment, the abscissa variation remains zero for the whole fourth duration, the user's gesture has not slid within 2 seconds and the volume has been neither increased nor decreased. The volume adjustment interface (volume bar) displayed in the user interface is therefore dismissed, and after the controller receives the broadcast indicating that the volume bar has disappeared, it generates a volume adjustment completion interface and displays it in the user interface to prompt the user that volume adjustment is complete. The volume adjustment completion interface is shown in fig. 16.
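The two completion conditions described above (the gesture disappears from the next frame, or the abscissa variation stays zero for the fourth duration) can be sketched as a small state holder. The class name `CompletionDetector` and its method signature are assumptions made for illustration; only the 2-second default comes from the text.

```python
from typing import Optional

FOURTH_DURATION_S = 2.0  # the example value given for the fourth duration


class CompletionDetector:
    """Decides when one volume adjustment process should be treated as done."""

    def __init__(self, idle_limit_s: float = FOURTH_DURATION_S) -> None:
        self.idle_limit_s = idle_limit_s
        self.last_change_s: Optional[float] = None  # time of the last volume change

    def update(self, now_s: float, gesture_present: bool, abscissa_change: int) -> bool:
        """Return True when the volume bar should be dismissed."""
        if not gesture_present:
            # The user put the gesture down: adjustment is complete.
            return True
        if self.last_change_s is None or abscissa_change != 0:
            # First observation, or the gesture is still sliding: restart the idle timer.
            self.last_change_s = now_s
            return False
        # Gesture held but not sliding: complete once idle for the fourth duration.
        return now_s - self.last_change_s >= self.idle_limit_s
```

A caller would invoke `update` once per processed frame and, on a `True` result, dismiss the volume bar and show the completion interface.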
In some embodiments, after the user has completed one full volume adjustment process, that is, after the volume adjustment completion interface has been displayed in the user interface, the user may make the volume adjustment gesture again. The controller can then respond immediately: it detects the user image including the gesture, re-determines the designated gesture ID, and adjusts the volume of the display device based on the sliding operation of the user corresponding to the designated gesture ID. One complete volume adjustment process consists of collecting the user image, recognizing the gesture (determining the designated gesture ID), successfully recognizing the volume adjustment gesture, starting the volume adjustment function via the sliding gesture, adjusting the volume by sliding the gesture, and completing the volume adjustment.
Specifically, after completing one complete volume adjustment process, the controller is further configured to perform the following steps: after the current volume adjustment process is finished, acquiring the next frame of user image, collected by the image collector, that includes a user gesture; and executing the next volume adjustment process when the user gesture in that frame is the volume adjustment gesture.
After the current volume adjustment process is fully complete, if the user again makes a gesture within the detection area of the image collector, the image collector collects the next frame of user image including the gesture and sends it to the controller. On receiving the new frame, the controller again checks for the volume adjustment gesture and re-determines the designated gesture ID. When it determines that the user gesture in the new user image is the volume adjustment gesture, it responds immediately to the volume adjustment instruction generated when the user corresponding to the new designated gesture ID performs the designated action based on the volume adjustment gesture, and executes the next volume adjustment process. For the specific volume adjustment process, refer to the foregoing embodiments; details are not repeated here.
In some embodiments, during the gesture recognition stage the user may make a gesture and put it down immediately, so that the gesture is held for less than the first duration. If this happens repeatedly, the controller ends up recognizing user gestures in every frame of user image, which affects normal operation of the display device. Therefore, when the process of starting volume adjustment is terminated prematurely and the user is repeatedly trying to turn on the volume adjustment function, the controller does not perform gesture recognition on the user image immediately, but only after a period of time.
Specifically, when the user makes gestures frequently within a short time and the volume adjustment process stops midway, the controller is further configured to: if the user gesture is not included in the user images acquired within the first duration, acquire the next frame of user image collected by the image collector after a fifth duration.
When the user corresponding to the designated gesture ID makes a gesture and puts it down immediately, the gesture is held for less than the first duration. After the user makes the gesture the first time, the controller has therefore not yet determined whether it is a volume adjustment gesture, and the recognition success rate of the volume adjustment gesture may not exceed the first threshold; that is, the volume adjustment process has not yet started.
To prevent the user from starting the volume adjustment process repeatedly within a short time, an interval of the fifth duration may be imposed after the moment the user puts the gesture down: either the image collector collects the next frame of user image only after the fifth duration, or, when the image collector collects user images in real time, the controller acquires the corresponding user image only after the fifth duration.
After the next frame of user image is acquired, gesture recognition needs to be performed on the frame of user image again, the designated gesture ID is determined again, a plurality of frames of user images within the next first duration are acquired again, and the next volume adjustment process is executed.
In some embodiments, the fifth duration is the time interval between the moment the image collector collects a user image including the gesture made by the user and the moment the next frame of user image including a gesture is collected. The fifth duration may be set to 3 seconds or to another value according to the actual application, and is not specifically limited here.
For example, the user makes a gesture at time 8:05, and the image collector collects a user image including the gesture at time 8:05. If the user immediately puts the gesture down and then raises it again, then with an interval of 3 seconds the image collector may collect a user image including the gesture after time 8:08, or the controller may acquire the user image including the gesture collected by the image collector after time 8:08.
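The fifth-duration interval amounts to a cooldown on gesture recognition after an aborted attempt. The sketch below shows the controller-side variant, in which frames keep arriving but are skipped until the interval has elapsed. The class and method names are hypothetical; the 3-second default is the example value from the text.

```python
FIFTH_DURATION_S = 3.0  # the example value given for the fifth duration


class RecognitionCooldown:
    """Skips gesture recognition for a fixed interval after an aborted attempt."""

    def __init__(self, cooldown_s: float = FIFTH_DURATION_S) -> None:
        self.cooldown_s = cooldown_s
        self.resume_at_s = 0.0  # earliest time at which frames are processed again

    def on_attempt_aborted(self, now_s: float) -> None:
        # The gesture was dropped before the first duration elapsed.
        self.resume_at_s = now_s + self.cooldown_s

    def should_process(self, frame_time_s: float) -> bool:
        # Frames captured during the cooldown are ignored by the controller.
        return frame_time_s >= self.resume_at_s
```

If an attempt is aborted at t = 5 s, frames stamped before t = 8 s are skipped and recognition resumes with the first frame at or after t = 8 s.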
In some embodiments, to ensure that the volume adjustment process runs properly, the volume adjustment process is executed while the volume bar is displayed in the user interface, and the gesture detection process that starts volume adjustment is not repeated. The controller performs gesture recognition on the user image collected by the image collector and, once gesture recognition succeeds, starts the process of adjusting the volume through a sliding gesture. During this process the volume bar is called up and remains displayed in the user interface, and subsequently collected user images that include the volume adjustment gesture do not go through the gesture detection process again; that is, the process of adjusting the volume through a sliding gesture is not started a second time.
In some embodiments, once one complete volume adjustment process has finished and the volume bar is no longer displayed in the user interface, the gesture detection process for volume adjustment is started so that volume adjustment can be performed. One complete volume adjustment process consists of collecting the user image, recognizing the gesture, successfully recognizing the volume adjustment gesture, starting the volume adjustment function via the sliding gesture, adjusting the volume by sliding the gesture, and completing the volume adjustment.
After one complete volume adjustment process has finished, the volume bar is dismissed from the user interface, which indicates that this volume adjustment is complete. The next volume adjustment process can then start, that is, user image collection and the subsequent detection process begin again.
The intelligent algorithm executed by the display device provided by the embodiment of the invention also applies when, at the beginning, only one person's gesture is detected as the volume adjustment gesture: the gesture ID of that user gesture is recorded as the designated gesture ID, and a 1 s timer is started. If several gestures then appear in subsequent user images, the algorithm detects and compares every gesture in every frame, checking whether it is a volume adjustment gesture and whether its gesture ID is the same as the designated gesture ID. If a gesture is the volume adjustment gesture and the gesture IDs are the same, it is judged to be made by the same user; if it is the volume adjustment gesture but the gesture IDs differ, it is judged to be made by another user. The display device then responds only to the volume control instruction generated by the user corresponding to the designated gesture ID and ignores volume control instructions generated by users corresponding to other gesture IDs. This prevents the volume from jittering in response to other people's volume adjustment gestures while one person is sliding to adjust the volume, effectively resolves problems such as lost or confused gestures during gesture recognition, and ensures that the volume of the display device is adjusted smoothly through recognition of the sliding gesture.
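The per-frame comparison described above (is it a volume adjustment gesture, and does its gesture ID equal the designated gesture ID) can be sketched as a simple filter. The `Gesture` record and its fields are assumptions introduced for this example; the source does not specify how detections are represented.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Gesture:
    gesture_id: int           # ID assigned by the (hypothetical) gesture tracker
    is_volume_gesture: bool   # whether it matched the volume adjustment gesture
    x: int                    # abscissa of the gesture in the user image, in pixels


def select_designated(gestures: List[Gesture], designated_id: int) -> Optional[Gesture]:
    """Return the designated user's volume gesture in this frame, if present.

    Volume adjustment gestures made by other users (different gesture IDs)
    are ignored, so they cannot move the volume bar.
    """
    for g in gestures:
        if g.is_volume_gesture and g.gesture_id == designated_id:
            return g
    return None
```

Only the abscissa of the returned gesture would feed the sliding calculation; a frame in which the function returns `None` contributes no volume change.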
As can be seen, in the display device provided in the embodiment of the present invention, the image collector collects user images in real time. The controller acquires a user image collected by the image collector that includes at least one user gesture, and determines the designated gesture ID of the first user gesture in the user image that matches the volume adjustment gesture. It then calculates the recognition success rate of the volume adjustment gesture corresponding to the designated gesture ID over the several frames of user images collected within the first duration, and displays a volume bar in the user interface if that recognition success rate exceeds a first threshold. When the user corresponding to the designated gesture ID performs the designated action based on the volume adjustment gesture, the resulting position change generates a corresponding volume adjustment instruction, and the controller responds to the instruction by adjusting the volume value corresponding to the volume bar. In this way, when several user gestures are recognized in the user image, the display device identifies the controlling user by determining the designated gesture ID and adjusts the volume with reference to the user gesture corresponding to that ID. This effectively resolves problems such as lost or confused gestures during gesture recognition, ensures that the volume value of the display device is adjusted smoothly through recognition of the sliding gesture, and improves the user experience.
Fig. 6 illustrates a flow diagram of a method of volume adjustment based on multi-person gesture recognition, in accordance with some embodiments. Referring to fig. 6, an embodiment of the present invention provides a volume adjustment method based on multi-person gesture recognition, which is executed by a controller in a display device provided in the foregoing embodiment, and the method includes:
S1, acquiring a user image which is collected by an image collector and includes at least one user gesture, and acquiring a designated gesture ID of a first user gesture in the user image that matches a volume adjustment gesture;
S2, calculating the recognition success rate of the volume adjustment gesture corresponding to the designated gesture ID in the several frames of user images collected within a first duration;
S3, if the recognition success rate of the volume adjustment gesture exceeds a first threshold, displaying a volume bar in the user interface;
S4, in response to a volume adjustment instruction generated when the user corresponding to the designated gesture ID performs a designated action based on the volume adjustment gesture, adjusting the volume value corresponding to the volume bar.
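Steps S2 and S3 reduce to a ratio and a comparison: the number of frames in which the designated gesture ID was recognized as making the volume adjustment gesture, divided by the total number of frames collected within the first duration. The sketch below assumes each frame is summarized as the set of gesture IDs that matched the volume adjustment gesture; that representation, the function names, and the threshold value in the usage note are illustrative assumptions.

```python
from typing import Sequence, Set


def recognition_success_rate(frames: Sequence[Set[int]], designated_id: int) -> float:
    """Fraction of frames in which the designated ID made the volume gesture.

    frames[i] is the set of gesture IDs whose gesture matched the volume
    adjustment gesture in frame i (hypothetical representation).
    """
    if not frames:
        return 0.0
    success_frames = sum(1 for ids in frames if designated_id in ids)
    return success_frames / len(frames)


def should_show_volume_bar(
    frames: Sequence[Set[int]], designated_id: int, first_threshold: float
) -> bool:
    # S3: display the volume bar only if the success rate exceeds the first threshold.
    return recognition_success_rate(frames, designated_id) > first_threshold
```

For instance, with four frames `[{1}, {1, 2}, {2}, {1}]` and a hypothetical first threshold of 0.7, gesture ID 1 is recognized in three of four frames and the volume bar would be shown, whereas gesture ID 2 is recognized in only two of four and would not qualify.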
In a specific implementation, the present invention further provides a computer storage medium. The computer storage medium may store a program which, when executed, may perform some or all of the steps in each embodiment of the volume adjustment method based on multi-person gesture recognition provided by the present invention. The storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), or a random access memory (RAM).
Those skilled in the art will readily appreciate that the techniques of the embodiments of the present invention may be implemented as software plus a required general purpose hardware platform. Based on such understanding, the technical solutions in the embodiments of the present invention may be essentially or partially implemented in the form of a software product, which may be stored in a storage medium, such as ROM/RAM, magnetic disk, optical disk, etc., and includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method according to the embodiments or some parts of the embodiments.
The same and similar parts in the various embodiments in this specification may be referred to each other. Especially, for the embodiment of the volume adjustment method based on multi-person gesture recognition, since it is substantially similar to the embodiment of the display device, the description is relatively simple, and the relevant points can be referred to the description in the embodiment of the display device.
Finally, it should be noted that: the above embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present application.
The foregoing description, for purposes of explanation, has been presented in conjunction with specific embodiments. However, the illustrative discussions above are not intended to be exhaustive or to limit the embodiments to the precise forms disclosed above. Many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles and the practical application, to thereby enable others skilled in the art to best utilize the embodiments and various embodiments with various modifications as are suited to the particular use contemplated.
Claims (13)
1. A display device, comprising:
a display configured to present a user interface;
an image collector or a user input interface connectable to an image collector, the image collector configured to collect a user image;
a controller connected to the display and the image collector, respectively, the controller configured to:
acquiring a user image which is collected by the image collector and comprises at least one user gesture, and a designated gesture ID of a first user gesture in the user image that matches a volume adjustment gesture;
calculating the recognition success rate of the volume adjustment gesture corresponding to the designated gesture ID in the several frames of user images collected within a first duration;
displaying a volume bar in the user interface if the recognition success rate of the volume adjustment gesture exceeds a first threshold;
and in response to a volume adjustment instruction generated when the user corresponding to the designated gesture ID performs a designated action based on the volume adjustment gesture, adjusting the volume value corresponding to the volume bar.
2. The display device of claim 1, wherein the controller, in acquiring the designated gesture ID of the first user gesture in the collected user image that matches the volume adjustment gesture, is further configured to:
identifying at least one user gesture in the user image;
judging whether each user gesture matches the volume adjustment gesture;
if there is a user gesture that matches the volume adjustment gesture, determining the first matching user gesture to be made as the designated user gesture;
and acquiring the gesture ID of the designated user gesture, and determining that gesture ID as the designated gesture ID.
3. The display device of claim 2, wherein the controller, in performing the judging whether each user gesture matches the volume adjustment gesture, is further configured to:
calculating a gesture confidence between the user gesture and the volume adjustment gesture;
if the gesture confidence exceeds a second threshold, determining that the user gesture matches the volume adjustment gesture;
if the gesture confidence does not exceed the second threshold, determining that the user gesture does not match the volume adjustment gesture.
4. The display device according to claim 1, wherein the controller, in performing the calculating of the recognition success rate of the volume adjustment gesture corresponding to the designated gesture ID in the several frames of user images collected within the first duration, is further configured to:
after the designated gesture ID is determined, acquiring several frames of user images collected within the first duration, and determining each user image in which the volume adjustment gesture is present as a designated user image;
comparing the designated gesture ID with the gesture ID of the volume adjustment gesture in each frame of designated user image, and determining each designated user image in which the gesture IDs are the same as a gesture recognition success frame;
counting the total number of gesture recognition success frames and the total number of recognition frames of user images collected within the first duration;
and calculating the ratio of the total number of gesture recognition success frames to the total number of recognition frames, and determining the ratio as the recognition success rate of the volume adjustment gesture corresponding to the designated gesture ID.
5. The display device of claim 1, wherein the controller, in performing the displaying a volume bar in the user interface if the recognition success rate of the volume adjustment gesture exceeds a first threshold, is further configured to:
if the recognition success rate of the volume adjusting gesture exceeds a first threshold, presenting a volume adjusting gesture prompting interface in the user interface, and presenting gesture recognition success prompting information and a volume adjusting gesture pattern in the volume adjusting gesture prompting interface;
canceling the display of the volume adjustment gesture prompt interface when the display duration of the volume adjustment gesture prompt interface exceeds a second duration, and displaying the volume adjustment interface in the user interface, wherein the volume adjustment interface comprises a volume bar and volume adjustment operation prompt information.
6. The display device according to claim 1, wherein the controller, in performing the adjusting of the volume value corresponding to the volume bar in response to the volume adjustment instruction generated when the user corresponding to the designated gesture ID performs the designated action based on the volume adjustment gesture, is further configured to:
receiving a volume adjustment instruction generated when the user corresponding to the designated gesture ID performs the designated action based on the volume adjustment gesture, wherein the designated action is an action made by the user based on volume adjustment operation prompt information;
in response to the volume adjustment instruction, acquiring a start coordinate value and an end coordinate value, presented in the user image within a third duration, of the user performing the designated action;
based on the start coordinate value and the end coordinate value, calculating the abscissa variation produced when the user performs the designated action based on the volume adjustment gesture;
determining a volume adjustment value and a volume adjustment direction of the volume bar based on the abscissa variation;
and adjusting the volume value corresponding to the volume bar based on the volume adjustment value and the volume adjustment direction of the volume bar.
7. The display device of claim 6, wherein the controller, in performing the determining the volume adjustment value and the volume adjustment direction for the volume bar based on the abscissa change amount, is further configured to:
if the abscissa variation is larger than a third threshold, determining that the volume adjustment value of the volume bar is a specified adjustment amount, and the adjustment direction is volume increase;
and if the abscissa variation is smaller than a fourth threshold, determining that the volume adjustment value of the volume bar is a specified adjustment amount, and the adjustment direction is to reduce the volume.
8. The display device of claim 6, wherein the controller is further configured to:
if the abscissa variation is zero within the third duration, extending the gesture detection duration by the third duration;
and based on the total gesture detection duration, acquiring the start coordinate value and the end coordinate value produced when the user corresponding to the designated gesture ID performs the designated action based on the volume adjustment gesture, wherein the total gesture detection duration is the sum of the several third durations, the start coordinate value is the start coordinate value of the first third duration, and the end coordinate value is the end coordinate value of the last third duration.
9. The display device of claim 1, wherein the controller is further configured to:
in response to the volume adjustment instruction, switching the volume adjustment operation prompt information presented in the user interface to volume adjustment state prompt information.
10. The display device of claim 1, wherein the controller is further configured to:
if the volume adjustment gesture is not included in the next frame of collected user image, or if the abscissa variation produced when the user corresponding to the designated gesture ID performs the designated action based on the volume adjustment gesture is zero throughout a fourth duration, canceling the display of the volume bar and presenting a volume adjustment completion interface in the user interface, wherein the volume adjustment completion interface comprises a volume adjustment completion pattern and volume adjustment completion prompt information.
11. The display device of claim 1, wherein the controller is further configured to:
if the user gesture is not included in the user images acquired within the first duration, acquiring the next frame of user image collected by the image collector after a fifth duration.
12. The display device of claim 1, wherein the controller is further configured to:
executing a volume adjusting process when the volume bar is displayed in the user interface, and not repeatedly starting a gesture detection process of volume adjustment;
and when the volume bar is not displayed in the user interface, starting a gesture detection process of volume adjustment so as to perform the volume adjustment process.
13. A volume adjusting method based on multi-person gesture recognition is characterized by comprising the following steps:
acquiring a user image which is collected by an image collector and comprises at least one user gesture, and a designated gesture ID of a first user gesture in the user image that matches a volume adjustment gesture;
calculating the recognition success rate of the volume adjustment gesture corresponding to the designated gesture ID in the several frames of user images collected within a first duration;
displaying a volume bar in the user interface if the recognition success rate of the volume adjustment gesture exceeds a first threshold;
and in response to a volume adjustment instruction generated when the user corresponding to the designated gesture ID performs a designated action based on the volume adjustment gesture, adjusting the volume value corresponding to the volume bar.
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110184152.1A CN112817557A (en) | 2021-02-08 | 2021-02-08 | Volume adjusting method based on multi-person gesture recognition and display device |
CN202180018828.8A CN115244503A (en) | 2021-02-08 | 2021-11-27 | Display device |
PCT/CN2021/133773 WO2022166338A1 (en) | 2021-02-08 | 2021-11-27 | Display device |
US18/366,017 US20230384868A1 (en) | 2021-02-08 | 2023-08-07 | Display apparatus |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110184152.1A CN112817557A (en) | 2021-02-08 | 2021-02-08 | Volume adjusting method based on multi-person gesture recognition and display device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112817557A (en) | 2021-05-18 |
Family
ID=75865238
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110184152.1A Pending CN112817557A (en) | 2021-02-08 | 2021-02-08 | Volume adjusting method based on multi-person gesture recognition and display device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112817557A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2022166338A1 (en) * | 2021-02-08 | 2022-08-11 | 海信视像科技股份有限公司 | Display device |
WO2024125478A1 (en) * | 2022-12-12 | 2024-06-20 | 索尼(中国)有限公司 | Audio presentation method and device |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104750252A (en) * | 2015-03-09 | 2015-07-01 | 联想(北京)有限公司 | Information processing method and electronic equipment |
CN106056085A (en) * | 2016-05-31 | 2016-10-26 | 广东美的制冷设备有限公司 | Gesture recognition method, gesture recognition device and equipment |
CN106325507A (en) * | 2016-08-18 | 2017-01-11 | 青岛海信医疗设备股份有限公司 | Cursor movement method and device for medical display and medical equipment |
CN107765853A (en) * | 2017-10-13 | 2018-03-06 | 广东欧珀移动通信有限公司 | Using method for closing, device, storage medium and electronic equipment |
CN110611788A (en) * | 2019-09-26 | 2019-12-24 | 上海赛连信息科技有限公司 | Method and device for controlling video conference terminal through gestures |
CN110764616A (en) * | 2019-10-22 | 2020-02-07 | 深圳市商汤科技有限公司 | Gesture control method and device |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN114302190A (en) | Display device and image quality adjusting method | |
CN114327034A (en) | Display device and screen recording interaction method | |
CN113794917A (en) | Display device and display control method | |
CN112862859B (en) | Face characteristic value creation method, character locking tracking method and display device | |
CN112601117B (en) | Display device and content presentation method | |
CN112835506B (en) | Display device and control method thereof | |
CN112817557A (en) | Volume adjusting method based on multi-person gesture recognition and display device | |
CN113918010A (en) | Display apparatus and control method of display apparatus | |
CN112698905A (en) | Screen protection display method, display device, terminal device and server | |
CN112866773A (en) | Display device and camera tracking method in multi-person scene | |
CN113778217A (en) | Display apparatus and display apparatus control method | |
CN112860212A (en) | Volume adjusting method and display device | |
CN111669662A (en) | Display device, video call method and server | |
CN113453057B (en) | Display device and playing progress control method | |
CN113051435B (en) | Server and medium resource dotting method | |
US20230384868A1 (en) | Display apparatus | |
CN112261289B (en) | Display device and AI algorithm result acquisition method | |
CN114296841A (en) | Display device and AI enhanced display method | |
CN114302203A (en) | Image display method and display device | |
CN114302199A (en) | Display apparatus and data sharing method | |
CN114302101A (en) | Display apparatus and data sharing method | |
CN112367550A (en) | Method for realizing multi-title dynamic display of media asset list and display equipment | |
CN114071056B (en) | Video data display method and display device | |
CN114302206B (en) | Content display method, display equipment and server | |
CN114302131A (en) | Display device and black screen detection method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||