CN115604513A - System mode switching method, electronic equipment and computer readable storage medium - Google Patents

System mode switching method, electronic equipment and computer readable storage medium

Info

Publication number
CN115604513A
CN115604513A (application CN202110767081.8A)
Authority
CN
China
Prior art keywords
user, system mode, image, instruction, mode
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110767081.8A
Other languages
Chinese (zh)
Inventor
李乐
高晓强
Current Assignee
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Priority to CN202110767081.8A priority Critical patent/CN115604513A/en
Priority to PCT/CN2022/101983 priority patent/WO2023280020A1/en
Publication of CN115604513A publication Critical patent/CN115604513A/en
Pending legal-status Critical Current

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/41 Structure of client; Structure of client peripherals
    • H04N21/422 Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS]
    • H04N21/42203 Input-only peripherals: sound input device, e.g. microphone
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/441 Acquiring end-user identification, e.g. using personal code sent by the remote control or by inserting a card
    • H04N21/4415 Acquiring end-user identification using biometric characteristics of the user, e.g. by voice recognition or fingerprint scanning

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Theoretical Computer Science (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The application relates to the field of electronic devices, and discloses a system mode switching method, an electronic device, and a computer-readable storage medium. The method comprises the following steps: running a first system mode; acquiring the voice of a user of the electronic device, and parsing an instruction from the voice; determining that the instruction contains a preset wake-up word for switching out of the first system mode; acquiring an image of the user in front of the electronic device, and determining from the user image that the user includes a user of a preset user type; and, according to the instruction, the user image, and the determined user of the preset user type, switching the electronic device from the first system mode to a second system mode corresponding to the preset user type. This helps improve the convenience and accuracy of switching the system mode of the electronic device.

Description

System mode switching method, electronic equipment and computer readable storage medium
Technical Field
The present disclosure relates to the field of electronic devices, and in particular, to a system mode switching method, an electronic device, and a computer-readable storage medium.
Background
With the popularization of smart home devices, smart televisions have entered thousands of households. The system mode of a smart television generally includes a normal mode and a child mode.
When a child uses the smart television at home, its system mode can be switched from the normal mode to the child mode, that is, the smart television exits the normal mode (starts the child mode), so that the range of content, viewing duration, sitting posture, and so on of the child watching television can be controlled. When an adult uses the smart television, the system mode can be switched from the child mode back to the normal mode, that is, the smart television exits the child mode (starts the normal mode).
Exiting the child mode on a smart television suffers from various problems: for example, children can easily exit it by imitating adults, elderly users find the operation difficult, and voiceprint recognition has a high error rate.
Disclosure of Invention
The embodiments of the present application provide a system mode switching method for improving the convenience and accuracy of switching the system mode of an electronic device.
In order to achieve the above purpose, the embodiments of the present application adopt the following technical solutions:
In a first aspect, an embodiment of the present application discloses a system mode switching method applied to an electronic device (for example, a smart television). The method includes: running a first system mode; acquiring the voice of a user of the electronic device, and parsing an instruction from the voice; determining that the instruction contains a preset wake-up word for switching out of the first system mode; acquiring an image of the user in front of the electronic device, and determining from the user image that the user includes a user of a preset user type; and, according to the instruction, the user image, and the determined user of the preset user type, switching the electronic device from the first system mode to a second system mode corresponding to the preset user type.
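The steps of the first aspect can be sketched as a short control flow. This is a minimal illustration only; `Device`, its fields, and the mode names are hypothetical stand-ins for the claim's "first system mode", "preset wake-up word", and "preset user type", not anything named in the patent:

```python
from dataclasses import dataclass

SWITCH_WAKE_WORD = "exit child mode"    # preset wake-up word for switching (assumed value)
PRESET_USER_TYPE = "adult"              # preset user type from the claim
MODE_FOR_TYPE = {"adult": "normal"}     # second system mode per preset user type

@dataclass
class Device:
    mode: str                # current system mode ("child" = first system mode)
    heard_voice: str         # stand-in for audio from the sound pickup
    users_in_front: tuple    # stand-in for user types classified from the camera image

def try_switch_mode(dev: Device) -> str:
    """Sketch of the claimed method: voice instruction plus user image gate the switch."""
    if dev.mode != "child":                          # step 1: first system mode running
        return dev.mode
    instruction = dev.heard_voice.lower()            # steps 2-3: parse instruction from voice
    if SWITCH_WAKE_WORD not in instruction:          # wake-up word absent: do nothing
        return dev.mode
    if PRESET_USER_TYPE in dev.users_in_front:       # step 4: preset user type in the image
        dev.mode = MODE_FOR_TYPE[PRESET_USER_TYPE]   # step 5: switch to second system mode
    return dev.mode
```

Note that the image check alone never triggers the switch; both the parsed instruction and the presence of the preset user type are required, which is what makes the scheme hard for a child to bypass by voice imitation alone.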
The first system mode described above is exemplified by the child mode, and the second system mode by the normal mode. In the present application, the instruction to exit the child mode is generated by recognizing the user's voice, and the presence of an adult is determined by recognizing the user's image, so that the exit from the child mode can be completed based on both. The exit operation path is therefore short and easy to use, and user identification is accurate and hard for children to bypass.
In some possible embodiments, the system mode switching method is suitable for exiting the guest mode of the mobile phone.
In one possible implementation of the first aspect described above, the user image comprises a first user image and a second user image;
switching the electronic device from the first system mode to the second system mode corresponding to the preset user type, according to the instruction, the user image, and the determined user of the preset user type, includes the following steps:
generating a confirmation window according to the instruction and the user of the preset user type determined from the first user image;
in response to the generation of the confirmation window, acquiring an image of the user in front of the electronic device as the second user image, and determining an action instruction from the second user image;
and switching the electronic device from the first system mode to the second system mode corresponding to the preset user type according to the action instruction and the user of the preset user type determined from the second user image.
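The two-stage implementation above can be sketched as follows. The function name, the `"adult"` label, and the wave gesture used as the action instruction are illustrative assumptions (the waving gesture appears later as one possible limb action):

```python
def exit_child_mode_with_confirmation(first_users, wave_detected, second_users):
    """Two-stage sketch: the first user image gates a confirmation window;
    the second user image supplies the action instruction (e.g., a hand wave).
    Returns (resulting mode, whether the confirmation window was shown)."""
    if "adult" not in first_users:          # stage 1: preset user type in first image?
        return ("child", False)             # stay in child mode, no window generated
    window_shown = True                     # confirmation window generated
    # stage 2: second image must yield the action instruction AND contain an adult
    if wave_detected and "adult" in second_users:
        return ("normal", window_shown)
    return ("child", window_shown)
```

The second image check guards against the adult leaving between the wake-up and the confirmation: the action instruction only counts if an adult is still in frame.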
In a possible implementation of the first aspect described above, the action instruction is based on a static action.
In one possible implementation of the first aspect, the action instruction is based on a dynamic action, the dynamic action including a limb action of the user.
In a possible implementation of the first aspect, the first user image includes a static image, and the user is determined, from the static image, to include a user of the preset user type.
In one possible implementation of the first aspect, the user of the preset user type is a speaker of speech.
In a possible implementation of the first aspect, switching the electronic device from the first system mode to the second system mode corresponding to the preset user type, according to the instruction, the user image, and the user of the preset user type, includes:
determining an action instruction from the user image;
and switching the electronic device from the first system mode to the second system mode corresponding to the preset user type according to the instruction, the user of the preset user type, and the action instruction.
In a possible implementation of the first aspect, the user image includes a static image, and the user is determined, from the static image, to include a user of the preset user type.
In one possible implementation of the first aspect, the user image comprises a dynamic image, the dynamic image comprising limb movements of the user.
In one possible implementation of the first aspect described above, the limb action comprises a hand waving gesture.
In a possible implementation of the first aspect, the user of the predetermined user type is an adult, the first system mode is a child mode, and the second system mode is a normal mode.
In a second aspect, an embodiment of the present application provides an electronic device, including: a sound pickup for collecting the user's voice; a camera for collecting an image of the user; a processor; and a memory including instructions that, when executed by the processor, cause the electronic device to perform the system mode switching method provided in any implementation of the first aspect.
In a third aspect, an embodiment of the present application provides a computer-readable storage medium, where instructions are stored on the computer-readable storage medium, and when the instructions are executed on a computer, the instructions cause the computer to perform the system mode switching method provided in any implementation manner of the first aspect.
Drawings
Fig. 1 illustrates a first scene diagram of a smart home, according to some embodiments of the present application;
fig. 2 illustrates a flow diagram of a smart tv exiting a child mode, in accordance with some embodiments of the present application;
FIG. 3 is a schematic diagram of a configuration of an electronic device according to an embodiment of the present application;
fig. 4 is a flowchart illustrating a system mode switching method according to an embodiment of the present application;
fig. 5 illustrates a second scene diagram of a smart home, according to some embodiments of the present application;
fig. 6 illustrates a third scene diagram of a smart home, according to some embodiments of the present application;
fig. 7 illustrates a fourth scene diagram of a smart home, according to some embodiments of the present application;
FIG. 8 illustrates a block diagram of an electronic device provided by an embodiment of the present application;
fig. 9 illustrates a block diagram of a system on chip (SoC) according to an embodiment of the present application.
Detailed Description
Hereinafter, specific embodiments of the present application will be described in detail with reference to the accompanying drawings.
The application provides a system mode switching method for electronic devices such as televisions, which helps improve the convenience and accuracy of switching the system mode of the electronic device.
Fig. 1 shows an example of a smart home scenario involving a plurality of electronic devices. In the scenario shown in fig. 1, specific electronic devices are: smart tv 100 and sound box 200.
According to some embodiments of the present application, the smart tv 100 is capable of interacting with a cloud or other devices (e.g., interacting with the sound box 200) via a network. In addition, the smart tv 100 may be equipped with an operating system, so that the user can install and uninstall various application software by himself while enjoying the contents of the common tv, and continuously expand and upgrade the functions. For example, the smart television 100 may have interactive applications in various manners, such as human-computer interaction manner, multi-screen interaction, content sharing, and the like. It should be noted that the television in the embodiments of the present application may be the smart television 100 described above, or may be an intelligent screen with a larger screen.
According to some embodiments of the present application, the sound box 200 may be a smart sound box. Illustratively, the smart sound box 200 has a sound pickup function and can collect a user's voice.
According to some embodiments of the present application, the system mode of the smart TV 100 includes a normal mode and a child mode. The smart TV 100 starts the system mode corresponding to the type of user (adult or child) currently using it. When a child uses the smart TV 100, the smart TV 100 switches to the child mode. As shown in fig. 1, the smart TV 100 is in the child mode, and a child at home sits on the sofa watching a video of children playing football.
When the adult uses the smart tv 100, the smart tv 100 exits the child mode, that is, the system mode of the smart tv 100 is switched from the child mode to the normal mode. Referring to fig. 2, according to a scheme of exiting the child mode by the smart television 100, a flow mainly includes the following three parts:
(1) Wake-up trigger
The user (e.g., an adult) exits the current video playback via the remote control of the smart TV 100 and selects the exit-child-mode icon, or triggers exiting the child mode by voice wake-up; a confirmation pop-up for exiting the child mode is then displayed on the screen of the smart TV 100.
(2) Verification of
The user reads the confirmation information on the pop-up for exiting the child mode and performs user identity authentication. User identity authentication currently takes three main forms: (1) an identity password preset by the user; (2) a simple arithmetic question; (3) verification against a recorded voiceprint.
(3) Response to
If the user identity authentication passes verification, the smart TV 100 exits the child mode; otherwise, the smart TV 100 does not perform the exit process.
As can be seen from the above, this scheme for exiting the child mode on the smart TV 100 has the following problems: (1) the operation path for exiting the child mode is long; (2) the first and second identity authentication forms are easily bypassed by children; (3) the success rate of voiceprint recognition (the third form) is low; (4) the operation depends on a remote controller; (5) the operation is relatively difficult for elderly users.
In summary, because of the above problems, exiting the child mode of the smart TV 100 is inconvenient and not sufficiently accurate.
Therefore, in the system mode switching method of the present application, an instruction to exit the child mode is generated and triggered by recognizing the user's voice, and the presence of an adult is determined by recognizing the user's image, so that the exit from the child mode can be completed based on both. The exit operation path is therefore short and easy to use, and user identification is accurate and hard for children to bypass.
In the scenario shown in fig. 1, the smart TV 100 is used as an example of the electronic device. The present application is not limited thereto; the electronic device may be any device with a camera 101 and a sound pickup, for example various smart home devices (e.g., the sound box 200 or a smart alarm clock shown in fig. 1), a mobile phone, a tablet computer, a desktop computer, a laptop computer, a vehicle-mounted terminal, an artificial intelligence (AI) voice terminal, a wearable device, an augmented reality (AR) device, a virtual reality (VR) device, an ultra-mobile personal computer (UMPC), a handheld computer, a netbook, or a personal digital assistant (PDA). Exemplary embodiments include, but are not limited to, devices running iOS, Android, Microsoft Windows, HarmonyOS, or other operating systems.
Fig. 3 shows a schematic structural diagram of an electronic device 001 according to an embodiment of the present application. The structure of the smart tv 100 in the above embodiment may be the same as the electronic device 001. Specifically, the electronic device 001 may include:
the display screen 102: has a display interface. The display interface is used for playing videos or displaying screen projection pictures of a computer or a mobile phone and the like. For example, the display screen 102 displays an icon of a child mode and plays a video of a child playing a soccer ball as described in the above embodiment. Illustratively, the display screen 102 may be a capacitive touch screen.
The processor 110: may include one or more processing units, for example a processing module or processing circuit including a central processing unit (CPU), a graphics processing unit (GPU), a digital signal processor (DSP), a microcontroller unit (MCU), an artificial intelligence (AI) processor, or a field-programmable gate array (FPGA), among others. Different processing units may be independent devices or may be integrated in one or more processors. The processor can complete the switching of the system mode according to the recognized voice and user image.
A memory 180 or buffer may be provided in the processor 110 for storing instructions and data. In one implementation, the memory 180 or buffer stores a correspondence between user images and user types, or a correspondence between the user's voice, the user type, and the system mode. The correspondence may be pre-stored in the memory 180; after recognizing the user's voice and image, the processor 110 searches the memory 180 for a corresponding system mode and, if one is found, switches to it.
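The pre-stored correspondence described above could be as simple as a lookup table keyed on recognition results. The keys and values here are illustrative assumptions; the patent does not specify the storage format:

```python
# Hypothetical pre-stored correspondence: (command word, user type) -> target system mode.
# A real implementation might key on richer recognition results.
MODE_TABLE = {
    ("exit child mode", "adult"): "normal",
    ("enter child mode", "adult"): "child",
}

def lookup_target_mode(command_word, user_type):
    """Search memory for a system mode matching the recognized voice and user image.
    Returns None when no correspondence exists, in which case no switch occurs."""
    return MODE_TABLE.get((command_word, user_type))
```

Keying on both the command word and the user type is what encodes the supervision rule: a child uttering "exit child mode" finds no entry and triggers no switch.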
The power module 140 may include a power supply, power management components, and the like. The power source may be a battery. The power management component is used for managing the charging of the power supply and the power supply of the power supply to other modules. In some embodiments, the power management component includes a charge management module and a power management module. The charging management module is used for receiving charging input from the charger; the power management module is used to connect a power source, the charging management module and the processor 110. The power management module receives power and/or charge management module input and provides power to the processor 110, the display 102, the camera 170, and the wireless communication module 120.
The wireless communication module 120 may include an antenna, and transmits and receives electromagnetic waves via the antenna. The wireless communication module 120 may provide wireless communication solutions applied to the electronic device 001, including wireless local area networks (WLAN) (e.g., wireless fidelity (Wi-Fi) networks), Bluetooth (BT), global navigation satellite systems (GNSS), frequency modulation (FM), near field communication (NFC), infrared (IR), and the like. The electronic device 001 may communicate with a network and other devices (e.g., a mobile phone or the sound box 200) via wireless communication technologies.
The audio module 150 is used to convert digital audio information into an analog audio signal output or convert an analog audio input into a digital audio signal. The audio module 150 may also be used to encode and decode audio signals. In some embodiments, the audio module 150 may be disposed in the processor 110, or some functional modules of the audio module 150 may be disposed in the processor 110. In some embodiments, audio module 150 may include speakers, earphones, microphones, and headphone interfaces.
The camera 170 is used to capture still or moving images. An object generates an optical image through the lens, which is projected onto the photosensitive element. The photosensitive element converts the optical signal into an electrical signal, which is then passed to an image signal processor (ISP) for conversion into a digital image signal. The electronic device 001 may implement a shooting function through the ISP, the camera 170, a video codec, the GPU, the display screen 102, an application processor, and the like.
The interface module 160 includes an external memory interface, a Universal Serial Bus (USB) interface, and the like. The external memory interface can be used for connecting an external memory card, such as a Micro SD card, to extend the storage capability of the electronic device 001. The external memory card communicates with the processor 110 through an external memory interface to implement a data storage function. The universal serial bus interface is used for the electronic device 001 to communicate with other electronic devices.
In some embodiments, the electronic device 001 further comprises a key 101. The keys 101 may include a volume key, an on/off key, and the like.
In some embodiments, the electronic device 001 further comprises a touch detection device 190. The touch detection device 190 may detect a position of a touch point of a user, and identify a corresponding touch gesture according to the position of the touch point of the user.
It is to be understood that the illustrated structure of the embodiment of the present invention does not specifically limit the electronic device 001. In other embodiments of the present application, the electronic device 001 may include more or fewer components than shown, or combine certain components, or split certain components, or a different arrangement of components. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.
A system mode switching method provided by the embodiment of the present application will be described in detail below with reference to the drawings, taking the smart television 100 as an example of an electronic device. According to the system mode switching method, the exit operation of the child mode of the intelligent television can be completed by recognizing the voice and the image of the user. Therefore, the child mode exit operation path is short, and the usability is high. Moreover, the user identity identification is accurate and is not easy to be bypassed by children.
The system mode switching method of the smart television 100 is described in detail below with reference to the flowchart shown in fig. 4.
Specifically, as shown in fig. 4, the method provided in this embodiment includes the following steps:
s100: the smart television runs a child mode.
As shown in fig. 5, the current system mode of the smart TV 100 is the child mode (i.e., the first system mode). When the smart TV 100 runs in the child mode, the content range, viewing duration, sitting posture, and so on of the child watching the smart TV 100 are controlled. For example, suppose the child must sit to watch, the content is limited to children's videos, and the viewing duration is capped at 30 minutes. Then a child sitting on the sofa can normally watch a children's football video for up to 30 minutes. If the child lies down, the camera 101 of the smart TV 100 detects that the child is no longer seated and the smart TV 100 stops playback. Likewise, when the viewing duration of the current video reaches 30 minutes, the smart TV 100 stops playing it.
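The child-mode policy in the example above (sitting posture, children's content, 30-minute cap) can be sketched as a single playback gate. The function and parameter names are illustrative, not from the patent:

```python
MAX_MINUTES = 30  # viewing-duration limit used in the example above

def should_keep_playing(posture, minutes_watched, content_rating):
    """Child-mode policy sketch: playback continues only while every
    constraint from the example holds; violating any one stops the video."""
    if content_rating != "children":    # content range limited to children's videos
        return False
    if posture != "sitting":            # camera detects the child lay down / left the seat
        return False
    if minutes_watched >= MAX_MINUTES:  # duration cap reached
        return False
    return True
```

A real device would re-evaluate this gate periodically from fresh camera frames and the playback clock rather than once per video.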
S200: the intelligent television obtains the voice of the user and analyzes the instruction from the voice.
When an adult wants to watch videos on the smart TV 100, or the child has finished watching and the adult switches the system mode of the smart TV 100, the adult exits the child mode by voice wake-up. For example, as shown in fig. 5, the adult says "Xiaoyi, exit child mode". This is the wake-up voice (wake word + command word) uttered by the user: the wake word is "Xiaoyi" and the command word is "exit child mode".
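Splitting a wake-up voice into its "wake word + command word" parts, as described above, can be sketched on the recognized transcript. "xiaoyi" as a lowercase romanized string is an assumption for illustration:

```python
WAKE_WORD = "xiaoyi"  # assumed romanization of the assistant's wake word

def split_wake_voice(utterance):
    """Split a recognized wake-up voice into (wake word, command word).
    Returns (None, None) when the utterance does not begin with the wake word."""
    text = utterance.lower().strip()
    if not text.startswith(WAKE_WORD):
        return (None, None)
    command = text[len(WAKE_WORD):].lstrip(" ,")  # drop separator, e.g. "exit child mode"
    return (WAKE_WORD, command)
```

In practice this split would run on the output of a speech recognizer, on either the sound box or the TV, matching the two parsing paths described next.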
Illustratively, as shown in fig. 5, the wake-up voice is captured by another electronic device communicatively connected to the smart TV 100, such as the sound box 200 (path (1) in fig. 5). The sound box 200 may transmit the wake-up voice to the smart TV 100, which parses it; or the sound box 200 may parse the wake-up voice itself and send the resulting instruction to the smart TV 100. Either way, the instruction finally parsed from the wake-up voice is: the smart TV 100 exits the child mode.
In addition, the wake-up voice may be captured by other electronic devices communicatively connected to the smart TV 100, such as a remote controller connected via Bluetooth or a wearable device; the wake-up voice is parsed in the same manner as described above for the sound box 200. The wake-up voice may also be captured directly by a sound pickup built into the smart TV 100.
The wake-up voice may be preset in the electronic device by the user as needed, or set before the device leaves the factory.
S300: the smart television determination instruction comprises a preset awakening word for switching the child mode.
Illustratively, "exit child mode" in the above instruction is the wake-up word for switching the child mode.
S400: the intelligent television acquires a user image and determines that the user comprises an adult according to the user image.
Illustratively, the smart tv 100 confirms that the user corresponding to the acquired user image and the user who uttered the wake-up voice are the same person.
After the user utters the wake-up voice, the camera 101 of the smart TV 100 captures an image of the user by locating the portrait at the sound source; that is, the camera 101 captures an image of the user who uttered the wake-up voice (path (2) in fig. 5). The camera 101 obtains an image of the user in front of the smart TV 100, where "in front of the smart TV 100" refers to the area within the shooting range of the camera 101. The processor of the smart TV 100 analyzes the user image obtained by the camera 101 and determines that the user who uttered the wake-up voice is an adult, not a child. The smart TV 100 then satisfies the wake-up condition for triggering exit from the child mode.
In other words, the processor of the smart TV 100 combines the received wake-up voice with the collected portrait data to determine whether the condition for triggering exit from the child mode is satisfied, namely that the wake-up voice was uttered by an adult.
The above-mentioned "the camera 101 captures an image of the user who uttered the wake-up voice" may be implemented as follows: after acquiring the wake-up voice, the smart TV 100 performs sound source localization on it to determine the user's position, specifically the azimuth angle, i.e., whether the user is to the left of, to the right of, or facing the sound collector (e.g., the sound box 200 described above). The shooting angle of the camera 101 is then adjusted according to that position, so that the user who uttered the wake-up voice is captured accurately.
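The patent does not say how the azimuth angle is computed; one common approach, sketched here under assumed constants, is a far-field two-microphone model that maps the inter-microphone arrival-time difference to a direction:

```python
import math

SPEED_OF_SOUND = 343.0  # m/s at room temperature
MIC_SPACING = 0.2       # meters between the two microphones (assumed geometry)

def azimuth_from_delay(delay_s):
    """Estimate direction of arrival from the inter-microphone time delay.
    0 rad = directly in front of the array; positive angles lean toward the
    microphone that hears the sound later. Far-field two-microphone model."""
    ratio = delay_s * SPEED_OF_SOUND / MIC_SPACING
    ratio = max(-1.0, min(1.0, ratio))  # clamp measurement noise into asin's domain
    return math.asin(ratio)
```

The resulting angle would then drive the camera's pan so the speaker falls inside the shooting range; real devices typically use larger microphone arrays and cross-correlation to estimate the delay.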
In some possible embodiments, the above-mentioned "the camera 101 captures an image of the user who uttered the wake-up voice" may instead be implemented as follows: the camera 101 captures image information in real time, the processor of the smart TV 100 performs portrait recognition on it using face recognition, then performs open-mouth detection on any detected portrait, and captures the portrait whose mouth is open (an open mouth indicates that user is uttering the wake-up voice), thereby completing image capture of the speaker.
Illustratively, the image collected by the camera 101 of the user who utters the wake-up voice is a human face, and face detection technology can be used to distinguish whether that user is an adult or a child. Specifically, the processor performs AI face recognition on the image. For example, AI face recognition may automatically extract face feature values covering aspects such as skin texture, skin color, brightness, and wrinkle texture, and confirm from these feature values whether the portrait is an adult or a child. Illustratively, the user image (face) is a static image, and the processor determines from the static image that the user includes an adult (a user of a preset user type).
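A minimal sketch of how extracted feature values might be turned into an adult/child decision follows. The feature set, the weights, and the threshold are invented for illustration; the application alludes to AI face recognition, for which a real system would use a trained classifier:

```python
from dataclasses import dataclass

@dataclass
class FaceFeatures:
    """Hypothetical feature values, normalised to [0, 1], of the kind the
    description mentions (wrinkle texture, skin texture, etc.)."""
    wrinkle_texture: float
    skin_texture: float

ADULT_SCORE_THRESHOLD = 0.5  # illustrative cut-off, not a tuned value

def adult_score(features: FaceFeatures) -> float:
    """Combine feature values into an adult-likelihood score; the fixed
    weights stand in for a trained model."""
    return 0.7 * features.wrinkle_texture + 0.3 * features.skin_texture

def contains_adult(faces: list[FaceFeatures]) -> bool:
    """True if at least one detected face is classified as an adult,
    matching the 'any adult present' check in the embodiments above."""
    return any(adult_score(f) >= ADULT_SCORE_THRESHOLD for f in faces)
```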
In addition, the user corresponding to the acquired user image need not be the same person as the user who uttered the wake-up voice.
In some possible embodiments, after the user utters the wake-up voice, the camera 101 of the smart television 100 acquires a user image of the user in front of the smart television, and the processor determines whether the image contains an adult. This adult may or may not be the person who uttered the wake-up voice; what matters is whether any adult is present. For example, the processor may confirm through the AI face recognition described above whether the user corresponding to the user image is an adult or a child. Alternatively, in some possible embodiments, the user is in a static standing posture, the camera 101 of the smart television 100 acquires an image of the user in that posture, and the processor determines whether the user corresponding to the image is an adult or a child according to the standing height derived from the image.
When the processor of the smart television 100 determines both that the user uttered the wake-up voice and that the users in front of the smart television 100 include an adult, that is, an adult is present to supervise the operation of exiting the child mode, the condition for triggering the exit from the child mode is satisfied.
In addition, in some possible embodiments, after the user utters the wake-up voice, the processor of the smart television 100 confirms through voiceprint recognition whether the user who uttered it is an adult; meanwhile, the camera 101 of the smart television 100 acquires a user image of the user in front of the smart television, and the processor confirms whether the user image acquired by the camera 101 includes an adult. The process of confirming whether the user image contains an adult is the same as that described in any of the above embodiments.
The theoretical basis of voiceprint recognition is that each voice has unique characteristics by which different human voices can be effectively distinguished. Such characteristics arise, for example, from the dimensions of the vocal cavities, in particular the throat, nasal cavity, and oral cavity, whose shape, size, and position determine the vocal cord tension and the range of voice frequencies. That is, adults and children differ in vocal cord tension and voice frequency range. Therefore, voiceprint recognition applied to the collected voice can identify the user's vocal cord tension and voice frequency and thus confirm whether the user who uttered the wake-up voice is an adult.
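As a rough, assumption-laden sketch of the voice-frequency side of this idea: adult fundamental frequencies typically sit lower than children's, so even a crude autocorrelation pitch estimate with an illustrative 250 Hz cut-off separates the two cases. A production voiceprint system would model far richer features than pitch alone:

```python
import math

SAMPLE_RATE = 16000  # Hz, assumed capture rate

def estimate_pitch_hz(samples, sample_rate=SAMPLE_RATE,
                      fmin=60.0, fmax=600.0):
    """Crude fundamental-frequency estimate: pick the autocorrelation
    lag (within the human pitch range) with the strongest self-similarity."""
    lo = int(sample_rate / fmax)
    hi = min(int(sample_rate / fmin), len(samples) - 1)
    best_lag, best_corr = lo, float("-inf")
    for lag in range(lo, hi):
        corr = sum(samples[i] * samples[i + lag]
                   for i in range(len(samples) - lag))
        if corr > best_corr:
            best_corr, best_lag = corr, lag
    return sample_rate / best_lag

CHILD_PITCH_HZ = 250.0  # illustrative boundary; children's voices sit higher

def likely_adult_voice(samples) -> bool:
    """Heuristic stand-in for the voiceprint check described above."""
    return estimate_pitch_hz(samples) < CHILD_PITCH_HZ
```

Feeding it a synthesized 120 Hz tone (adult-like) versus a 350 Hz tone (child-like) shows the split.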
S500: the processor of the smart television switches the smart television 100 from the child mode to the normal mode according to the instruction, the user image, and the determination that the user is an adult.
Illustratively, the processor of the smart television 100 successfully triggers the exit-from-child-mode scenario according to the "exit child mode" instruction and the determination that the instruction was issued by an adult. In this scenario, the smart television 100 is switched from the child mode to the normal mode (i.e., the second system mode) according to the user image. That is, the operation of exiting the child mode is completed.
Alternatively, in some possible implementations, the processor of the smart television 100 confirms that an adult issued the above "exit child mode" instruction and determines, as described in the above embodiments, that an adult is present in front of the smart television 100 (not necessarily the adult who issued the instruction); the smart television 100 is then switched from the child mode to the normal mode.
Alternatively, in some possible embodiments, the processor of the smart television 100 confirms that a user (whether adult or child) issued the "exit child mode" instruction and determines, as described in the above embodiments, that an adult is present in front of the smart television 100 (not necessarily an adult who issued the instruction), and switches the smart television 100 from the child mode to the normal mode.
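The trigger variants described above reduce to a small predicate. The function and parameter names are invented for this sketch; the strict variant requires the speaker themselves to be an adult (e.g., per voiceprint), while the relaxed variant only requires an adult somewhere in front of the television:

```python
from typing import Optional

WAKE_PHRASE = "exit child mode"  # stand-in for the configured wake-up word

def wakeup_trigger_met(instruction: str,
                       adult_in_front: bool,
                       speaker_is_adult: Optional[bool] = None,
                       require_adult_speaker: bool = False) -> bool:
    """Decide whether the exit-from-child-mode scene is triggered.

    instruction           -- text parsed from the wake-up voice
    adult_in_front        -- camera result: any adult in the user image
    speaker_is_adult      -- optional voiceprint result for the speaker
    require_adult_speaker -- strict variant: the speaker must be an adult
    """
    if WAKE_PHRASE not in instruction.lower():
        return False  # no valid wake-up instruction, nothing to trigger
    if require_adult_speaker and speaker_is_adult is not True:
        return False  # strict variant rejects unknown or child speakers
    # In every variant an adult must be present to supervise the exit.
    return adult_in_front
```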
Since exiting the child mode requires both an adult portrait and a wake-up voice, it can be ensured that an adult is present to supervise the operation. The system mode switching is therefore both safe and accurate, and remains convenient for users. In addition, the physical components involved are all existing components of the smart television 100; no new hardware is required.
The user image is obtained through the camera 101 of the smart television 100, but the application is not limited to this; in some possible embodiments, the user image may be obtained through the camera of another electronic device communicatively connected to the smart television 100.
As described above, when the adult judgment is made using a static user image, a static adult image may be imitated, for example by a cardboard cutout or a model, which makes the above solution easy to bypass. To prevent a still imitation of a human from successfully triggering the exit from the child mode, in some possible embodiments, with reference to fig. 6, a confirmation action by the real user is additionally required before the exit succeeds.
For example, after the smart television 100 meets the wake-up condition triggering the exit from the child mode, referring to fig. 6, a confirmation window is generated on the display screen of the smart television 100 (e.g., in the middle of the screen). That is, as described in S400 above, after the adult utters the wake-up voice and the user image acquired by the camera 101 is determined to be an adult, the condition for triggering the exit from the child mode is satisfied, and a confirmation window is displayed on the smart television 100. As shown in fig. 6, the confirmation window displays "please wave your hand toward the screen to confirm exiting the child mode" together with a gesture pattern. In effect, the smart television 100 prompts the user through the confirmation window to confirm the operation of exiting the child mode.
The user then makes a hand-waving gesture according to the prompt of the confirmation window; that is, the user's limb action is a hand-waving gesture. In response to the generation of the confirmation window, the smart television 100 acquires a user image of the user in front of the smart television 100 (shown by path (3) in fig. 6), and the processor of the smart television 100 confirms that the user image includes a hand-waving gesture (shown by path (4) in fig. 6). From the user image, the processor determines that the user's action instruction is: confirm the exit from the child mode. The processor of the smart television 100 also determines whether the user image containing the hand-waving gesture corresponds to an adult. That is, the smart television 100 collects the portrait and the gesture operation through the camera 101 and jointly detects whether an adult performed the exit gesture.
The hand-waving gesture may be static (a single frame suffices for analysis) or dynamic (multiple consecutive frames must be analyzed).
The processor thus determines, from the "confirm exit from child mode" action instruction and from the user image, that the user is an adult. If the processor determines that the exit gesture was performed by an adult, the condition for confirming the exit from the child mode is satisfied, and the smart television 100 is switched from the child mode to the normal mode. If the condition is not satisfied, the smart television 100 does not exit the child mode and, for example, continues the previous playback. This identity recognition is accurate and not easily bypassed by children.
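The two-step flow (wake-up trigger, then gesture confirmation) can be sketched as a tiny state machine. The class and method names are assumptions made for the example; a real implementation would sit behind the television's voice and vision pipelines:

```python
from enum import Enum, auto

class Mode(Enum):
    CHILD = auto()
    NORMAL = auto()

class ExitFlow:
    """Minimal sketch: a wake-up trigger opens the confirmation window,
    then an adult's waving gesture completes the switch; anything else
    leaves the child mode running."""

    def __init__(self):
        self.mode = Mode.CHILD
        self.awaiting_confirmation = False

    def on_wakeup(self, adult_present: bool):
        # Wake word already parsed; an adult portrait must also be seen.
        if self.mode is Mode.CHILD and adult_present:
            self.awaiting_confirmation = True  # show confirmation window

    def on_gesture(self, gesture: str, performed_by_adult: bool):
        # Only an adult's waving gesture, after a wake-up, exits the mode.
        if (self.awaiting_confirmation and gesture == "wave"
                and performed_by_adult):
            self.mode = Mode.NORMAL
        self.awaiting_confirmation = False
```

A gesture with no preceding wake-up, or a wave performed by a child, leaves the mode unchanged, which is exactly the behavior the paragraph above describes.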
In other words, in this embodiment, while the user wakes up the exit from the child mode with a specific wake-up word (for example, a phrase such as "exit the child mode"), the camera 101 of the smart television 100 completes the user image capture described in the above embodiments, and the smart television 100 combines voice detection and portrait detection to determine whether the wake-up condition for exiting the child mode is satisfied.
If the wake-up condition is met, the user makes a specific gesture (such as a goodbye-style hand-waving gesture) according to the prompt of the smart television 100, and the smart television 100 jointly evaluates the portrait acquired by the camera 101 and the associated limb gesture to determine whether the exit condition is satisfied. If it is, the exit is completed; otherwise, the playback that preceded the wake-up continues.
That is, the wake-up for exiting the child mode is completed through joint recognition of the adult portrait and the voice wake-up of the smart television 100, and the exit itself is completed through joint recognition of the adult portrait and the associated exit gesture. The smart television 100 thereby switches from the child mode to the normal mode, that is, the system mode of the smart television 100 switches from the first system mode to the second system mode.
As shown in fig. 7, the smart television 100 exits the child mode, and its system mode is switched from the child mode to the normal mode. In the normal mode, the range of film sources available to the adult is not limited; as shown in fig. 7, the adult sits on a sofa watching a video of a person crossing a cliff.
It should be noted that the limb action described above for confirming the exit from the child mode is a hand-waving gesture. However, the present application is not limited thereto; in some possible embodiments, the gesture corresponding to the limb action is, for example, a two-finger closing gesture, a single-finger sliding gesture, a page-turning gesture, a two-finger pinching motion, a left-to-right gesture, a right-to-left gesture, an upward gesture, a downward gesture, a pressing gesture, an opening gesture from a clenched fist, a tapping gesture, a clapping gesture, a reverse clapping gesture, a fist-making gesture, a pinching gesture, a reverse pinching gesture, a finger-spreading gesture, a reverse finger-spreading gesture, or the like.
The gestures described above are all dynamic actions; that is, the generated action instructions are based on dynamic actions. However, the present application is not limited thereto, and in some possible embodiments, the action instruction is based on a static action. For example, the user performs a static action: holding a thumbs-up gesture and remaining motionless. The camera 101 of the smart television 100 collects the static action, and the processor determines that the action instruction is: confirm the exit from the child mode.
As described above, after the smart television 100 meets the wake-up operation condition for triggering the exit from the child mode, a confirmation window is generated on the display screen of the smart television 100 (see fig. 6), and the user then makes the gesture for confirming the exit according to the window's prompt. However, the present application is not limited thereto; in some possible embodiments, no confirmation window is generated after the wake-up operation condition is met. Instead, after uttering the wake-up voice, the user directly makes the above-mentioned gesture (e.g., a hand-waving gesture) for confirming the exit from the child mode.
The camera 101 acquires a user image of the user in front of the smart television 100 and confirms that the user image includes a hand-waving gesture (indicated by path (4) in fig. 6). Likewise, the processor of the smart television 100 determines from the user image that the user's action instruction is: confirm the exit from the child mode. It also determines whether the user image containing the hand-waving gesture corresponds to an adult. That is, the smart television 100 collects the portrait and the gesture operation through the camera 101 and jointly detects whether the gesture operation was performed by an adult.
Thus, even though the display screen of the smart television 100 does not generate a confirmation window, the processor of the smart television 100 determines, from the "confirm exit from child mode" action instruction and the determination from the user image that the user is an adult, that the condition for confirming the exit is satisfied, and switches the smart television 100 from the child mode to the normal mode.
In conclusion, when the smart television 100 of the present application is in the child mode, the wake-up for exiting the child mode is completed by joint dual recognition of voice and an adult portrait, and the confirmation of the exit is completed by joint recognition of the adult portrait and a specific limb gesture. Because only existing physical components are used, a user who passes the identity verification can exit the child mode accurately and conveniently without additional hardware cost.
It should be noted that the first system mode is not limited to the child mode. For example, in some possible embodiments, the first system mode is a guest mode, such as the guest mode of a mobile phone. The system mode switching method described above applies equally to exiting the guest mode. Illustratively, exiting the guest mode is triggered by dual recognition of portrait and voice, and confirmed by joint detection of the adult portrait and the associated limb gesture.
It should be noted that the above embodiments describe switching from the first system mode exemplified by the child mode to the second system mode exemplified by the normal mode. In some possible embodiments, it may be that the first system mode exemplified by the normal mode is switched to the second system mode exemplified by the child mode.
Illustratively, the system mode switching method is applicable to any scenario involving a switch from a low-authority system mode to a high-authority system mode. Because the second system mode requires higher authority, the first system mode cannot be switched to it casually; the switch can take place only after a certain authority verification is passed. The switch from the first system mode to the second system mode is then triggered by dual recognition of portrait and voice, and confirmed by joint detection of the adult portrait and the associated limb gesture. The mode switching process is convenient and not easily bypassed.
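Generalized beyond the child mode, the rule is that raising authority demands the full verification chain while lowering it does not. A sketch with invented names and an integer stand-in for authority levels:

```python
def may_switch_mode(current_authority: int, target_authority: int,
                    wake_word_ok: bool, adult_portrait_ok: bool,
                    confirm_gesture_ok: bool) -> bool:
    """Allow a system mode switch only if either the target mode needs no
    more authority than the current one, or the dual voice-plus-portrait
    trigger and the gesture confirmation have all succeeded."""
    if target_authority <= current_authority:
        return True  # e.g. normal mode -> child mode needs no verification
    return wake_word_ok and adult_portrait_ok and confirm_gesture_ok
```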
Referring now to FIG. 8, shown is a block diagram of an electronic device 400 in accordance with one embodiment of the present application. The electronic device 400 may include one or more processors 401 coupled to a controller hub 403. For at least one embodiment, the controller hub 403 communicates with the processor 401 via a multi-drop Bus such as a Front Side Bus (FSB), a point-to-point interface such as a QuickPath Interconnect (QPI), or similar connection 406. Processor 401 executes instructions that control general types of data processing operations. In one embodiment, the Controller Hub 403 includes, but is not limited to, a Graphics Memory Controller Hub (GMCH) (not shown) and an Input/Output Hub (IOH) (which may be on separate chips) (not shown), where the GMCH includes a Memory and a Graphics Controller and is coupled to the IOH.
The electronic device 400 may also include a coprocessor 402 and a memory 404 coupled to the controller hub 403. Alternatively, one or both of the memory and the GMCH may be integrated within the processor (as described herein), with the memory 404 and the coprocessor 402 coupled directly to the processor 401, and the controller hub 403 being in a single chip with the IOH.
The Memory 404 may be, for example, a Dynamic Random Access Memory (DRAM), a Phase Change Memory (PCM), or a combination of the two. Memory 404 may include one or more tangible, non-transitory computer-readable media for storing data and/or instructions therein. The computer readable storage medium has stored therein instructions, in particular, temporary and permanent copies of the instructions. The instructions may include: instructions that, when executed by at least one of the processors, cause the electronic device 400 to perform the method illustrated in fig. 4. The instructions, when executed on a computer, cause the computer to perform the methods disclosed in any one or combination of the embodiments described above.
In one embodiment, the coprocessor 402 is a special-purpose processor, such as, for example, a high-throughput MIC (Many Integrated Core) processor, a network or communication processor, a compression engine, a graphics processor, a GPGPU (General-Purpose computing on Graphics Processing Units) processor, an embedded processor, or the like. The optional nature of the coprocessor 402 is represented in fig. 8 by dashed lines.
In one embodiment, the electronic device 400 may further include a Network Interface Controller (NIC) 406. Network interface 406 may include a transceiver to provide a radio interface for electronic device 400 to communicate with any other suitable device (e.g., front end module, antenna, etc.). In various embodiments, the network interface 406 may be integrated with other components of the electronic device 400. The network interface 406 may implement the functions of the communication unit in the above-described embodiments.
The electronic device 400 may further include an Input/Output (I/O) device 405. The I/O device 405 may include: a user interface designed to enable a user to interact with the electronic device 400; a peripheral component interface designed to enable peripheral components to interact with the electronic device 400; and/or sensors designed to determine environmental conditions and/or location information associated with the electronic device 400.
It is noted that fig. 8 is merely exemplary. That is, although fig. 8 shows the electronic device 400 including multiple components, such as the processor 401, the controller hub 403, and the memory 404, in practical applications a device using the methods of the present application may include only some of these components, for example, only the processor 401 and the network interface 406. Optional components in fig. 8 are shown with dashed lines.
Referring now to fig. 9, shown is a block diagram of a SoC (System on Chip) 500 in accordance with an embodiment of the present application. In fig. 9, like parts bear the same reference numerals, and dashed boxes are optional features of more advanced SoCs. In fig. 9, the SoC 500 includes: an interconnect unit 550 coupled to the processor 510; a system agent unit 580; a bus controller unit 590; an integrated memory controller unit 540; a set of one or more coprocessors 520, which may include integrated graphics logic, an image processor, an audio processor, and a video processor; a Static Random-Access Memory (SRAM) unit 530; and a Direct Memory Access (DMA) unit 560. In one embodiment, the coprocessor 520 includes a special-purpose processor, such as, for example, a network or communication processor, a compression engine, a GPGPU, a high-throughput MIC processor, an embedded processor, or the like.
Static Random Access Memory (SRAM) unit 530 may include one or more tangible, non-transitory computer-readable media for storing data and/or instructions. The computer readable storage medium has stored therein instructions, in particular, temporary and permanent copies of the instructions. The instructions may include: instructions that when executed by at least one of the processors cause the SoC to implement the method shown in fig. 4. The instructions, when executed on a computer, cause the computer to perform the methods disclosed in the embodiments described above.
The method embodiments of the present application may be implemented in software, hardware, firmware, or the like.
Program code may be applied to input instructions to perform the functions described herein and generate output information. The output information may be applied to one or more output devices in a known manner. For purposes of this application, a processing system includes any system having a Processor such as, for example, a Digital Signal Processor (DSP), a microcontroller, an Application Specific Integrated Circuit (ASIC), or a microprocessor.
The program code may be implemented in a high level procedural or object oriented programming language to communicate with a processing system. Program code may also be implemented in assembly or machine language, if desired. Indeed, the mechanisms described herein are not limited in scope to any particular programming language. In any case, the language may be a compiled or interpreted language.
One or more aspects of at least one embodiment may be implemented by representative instructions stored on a computer-readable storage medium, which represent various logic in a processor, which when read by a machine causes the machine to fabricate logic to perform the techniques herein. These representations, known as "IP (Intellectual Property) cores," may be stored on a tangible computer-readable storage medium and provided to a number of customers or production facilities to load into the manufacturing machines that actually manufacture the logic or processors.
In some cases, an instruction converter may be used to convert instructions from a source instruction set to a target instruction set. For example, the instruction converter may transform (e.g., using a static binary transform, a dynamic binary transform including dynamic compilation), morph, emulate, or otherwise convert the instruction into one or more other instructions to be processed by the core. The instruction converter may be implemented in software, hardware, firmware, or a combination thereof. The instruction converter may be on the processor, external to the processor, or partially on and partially off the processor.

Claims (13)

1. A system mode switching method applied to electronic equipment is characterized by comprising the following steps:
operating a first system mode;
acquiring voice of a user of the electronic equipment, and analyzing an instruction from the voice;
determining that the instruction comprises a preset wake-up word for switching the first system mode;
acquiring a user image of the user in front of the electronic equipment, and determining that the user comprises a user of a preset user type according to the user image;
and switching the electronic equipment from a first system mode to a second system mode corresponding to the preset user type according to the instruction, the user image and the determined user of the preset user type.
2. The system mode switching method according to claim 1, wherein the user image includes a first user image and a second user image;
the switching the electronic device from a first system mode to a second system mode corresponding to the preset user type according to the instruction, the user image and the determined user of the preset user type includes:
generating a confirmation window according to the instruction and a user of a preset user type determined from the first user image;
in response to the generation of the confirmation window, acquiring a user image of the user in front of the electronic equipment as a second user image, and determining an action instruction according to the second user image;
and switching the electronic equipment from a first system mode to a second system mode corresponding to the preset user type according to the action instruction and the user of the preset user type determined from the second user image.
3. The system mode switching method according to claim 2, wherein the action instruction is based on a static action.
4. The system mode switching method according to claim 2, wherein the action instruction is based on a dynamic action, the dynamic action including a limb action of the user.
5. The system mode switching method according to claim 2, wherein the first user image comprises a static image, and it is determined from the static image that the user comprises a user of a preset user type.
6. The system mode switching method according to claim 1, wherein the user of the preset user type is a speaker of the voice.
7. The system mode switching method according to claim 1, wherein the switching the electronic device from a first system mode to a second system mode corresponding to the preset user type according to the instruction, the user image and the determined user of the preset user type comprises:
determining an action instruction according to the user image;
and switching the electronic equipment from a first system mode to a second system mode corresponding to the preset user type according to the instruction, the determined user of the preset user type and the action instruction.
8. The system mode switching method according to claim 7, wherein the user image includes a static image, and it is determined from the static image that the user includes a user of a preset user type.
9. The system mode switching method according to claim 7, wherein the user image includes a dynamic image including a limb movement of the user.
10. The system mode switching method according to claim 4 or 9, wherein the limb action comprises a hand-waving gesture.
11. The system mode switching method according to any one of claims 1 to 10, wherein the user of the preset user type is an adult, the first system mode is a child mode, and the second system mode is a normal mode.
12. An electronic device, comprising:
a sound pickup for collecting a voice of a user;
a camera for collecting a user image of the user;
a processor;
a memory comprising instructions that, when executed by the processor, cause the electronic device to perform the system mode switching method of any of claims 1 to 11.
13. A computer-readable storage medium having instructions stored thereon, which when executed on a computer, cause the computer to perform the system mode switching method of any one of claims 1 to 11.
CN202110767081.8A 2021-07-07 2021-07-07 System mode switching method, electronic equipment and computer readable storage medium Pending CN115604513A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202110767081.8A CN115604513A (en) 2021-07-07 2021-07-07 System mode switching method, electronic equipment and computer readable storage medium
PCT/CN2022/101983 WO2023280020A1 (en) 2021-07-07 2022-06-28 System mode switching method, electronic device and computer-readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110767081.8A CN115604513A (en) 2021-07-07 2021-07-07 System mode switching method, electronic equipment and computer readable storage medium

Publications (1)

Publication Number Publication Date
CN115604513A true CN115604513A (en) 2023-01-13

Family

ID=84800364

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110767081.8A Pending CN115604513A (en) 2021-07-07 2021-07-07 System mode switching method, electronic equipment and computer readable storage medium

Country Status (2)

Country Link
CN (1) CN115604513A (en)
WO (1) WO2023280020A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117037790A (en) * 2023-10-10 2023-11-10 朗朗教育科技股份有限公司 AI interaction intelligent screen control system and method

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104750252B (en) * 2015-03-09 2018-02-27 联想(北京)有限公司 A kind of information processing method and electronic equipment
CN106201663A (en) * 2015-05-06 2016-12-07 中兴通讯股份有限公司 The child mode changing method of mobile terminal and device
CN106921780A (en) * 2017-03-09 2017-07-04 广东小天才科技有限公司 Intelligent terminal operation mode switching method and device and intelligent terminal
CN108600796B (en) * 2018-03-09 2019-11-26 百度在线网络技术(北京)有限公司 Control mode switch method, equipment and the computer-readable medium of smart television
CN112530419B (en) * 2019-09-19 2024-05-24 百度在线网络技术(北京)有限公司 Speech recognition control method, device, electronic equipment and readable storage medium
CN112770186A (en) * 2020-12-17 2021-05-07 深圳Tcl新技术有限公司 Method for determining television viewing mode, television and storage medium

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117037790A (en) * 2023-10-10 2023-11-10 朗朗教育科技股份有限公司 AI interaction intelligent screen control system and method
CN117037790B (en) * 2023-10-10 2024-01-09 朗朗教育科技股份有限公司 AI interaction intelligent screen control system and method

Also Published As

Publication number Publication date
WO2023280020A1 (en) 2023-01-12


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination