CN111243583B - System awakening method and device - Google Patents


Info

Publication number
CN111243583B
CN111243583B (application CN201911414792.6A)
Authority
CN
China
Prior art keywords
user
coordinate system
coordinates
artificial intelligence
dimensional coordinate
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911414792.6A
Other languages
Chinese (zh)
Other versions
CN111243583A (en)
Inventor
马磊 (Ma Lei)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Ruixun Cloud Technology Co ltd
Original Assignee
Shenzhen Ruixun Cloud Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Ruixun Cloud Technology Co ltd filed Critical Shenzhen Ruixun Cloud Technology Co ltd
Priority to CN201911414792.6A priority Critical patent/CN111243583B/en
Publication of CN111243583A publication Critical patent/CN111243583A/en
Application granted granted Critical
Publication of CN111243583B publication Critical patent/CN111243583B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 - Speech recognition
    • G10L15/22 - Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 - Speech recognition
    • G10L15/08 - Speech classification or search
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 - Speaker identification or verification
    • G10L17/22 - Interactive procedures; Man-machine interfaces
    • G10L17/24 - Interactive procedures; Man-machine interfaces, the user being prompted to utter a password or a predefined phrase
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 - Speech recognition
    • G10L15/08 - Speech classification or search
    • G10L2015/088 - Word spotting
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 - Speech recognition
    • G10L15/22 - Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 - Execution procedure of a spoken command
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 - Speech recognition
    • G10L15/22 - Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/226 - Procedures used during a speech recognition process, e.g. man-machine dialogue, using non-speech characteristics

Abstract

The embodiment of the invention provides a system awakening method and a system awakening device. The method is applied to an artificial intelligence system provided with a detection assembly, and comprises the following steps: when the artificial intelligence system acquires voice data, recognizing the voice word corresponding to the voice data; judging whether the voice word is the same as a preset awakening word; if so, controlling the detection assembly to detect the current action of the user and generate a detection result; and executing the awakening operation according to the detection result. The method and the device are simple and convenient to operate: besides detecting the voice information of the user, they judge the current action of the user and use it to determine whether the user wants to awaken the artificial intelligence system. This avoids repeated awakening and the resulting waste of resources; moreover, the amount of calculation in the judging process is small, so system power consumption is further reduced and the use experience of the user is improved.

Description

System awakening method and device
Technical Field
The present invention relates to the field of internet technologies, and in particular, to a system wake-up method and a system wake-up apparatus.
Background
With the continuous popularization of the internet, artificial intelligence systems have gradually become part of people's lives and bring convenience to them.
An artificial intelligence system can recognize the voice data of a user and execute the operation corresponding to that voice data, thereby providing convenience in the user's daily life.
When such a system is awakened, however, the voice data of the user may merely resemble the awakening information of the artificial intelligence system, so that the system is awakened by mistake and its energy consumption is increased.
Disclosure of Invention
In view of the above, embodiments of the present invention are proposed to provide a system wake-up method and a system wake-up apparatus that overcome or at least partially solve the above problems.
In order to solve the above problem, an embodiment of the present invention discloses a system wake-up method, which is applied to an artificial intelligence system, wherein the artificial intelligence system is provided with a detection component, and the method includes:
when the artificial intelligence system acquires voice data, recognizing voice words corresponding to the voice data;
judging whether the voice words are the same as preset awakening words or not;
if the voice words are the same as preset awakening words, controlling the detection assembly to detect the current action of the user and generate a detection result;
and executing the awakening operation according to the detection result.
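As an illustrative sketch only, the four steps above can be expressed in Python; the helper names (recognize_word, detect_user_action, wake) and the placeholder wake-up word are assumptions for illustration and do not appear in the disclosure:

```python
# Hypothetical sketch of the claimed wake-up flow. The helpers passed in
# (recognize_word, detect_user_action, wake) stand for the recognition
# model, the detection assembly, and the wake-up routine respectively.

WAKE_WORD = "hello system"  # placeholder for the preset awakening word

def try_wake(voice_data, recognize_word, detect_user_action, wake):
    """Return True if the system was woken up, False otherwise."""
    word = recognize_word(voice_data)       # step 1: recognize the voice word
    if word != WAKE_WORD:                   # step 2: compare with preset word
        return False
    detection = detect_user_action()        # step 3: detect the current action
    if detection.get("facing_system"):      # step 4: wake only if intended
        wake()
        return True
    return False
```

Here the detection result is reduced to a single boolean flag; in the patent's optional embodiments it is a gaze direction vector compared against a preset vector.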
Optionally, the detection assembly comprises: a rotating table on which a plurality of cameras are provided; the detection result comprises a direction vector;
wherein, if the voice word is the same as a preset awakening word, the controlling of the detection component to detect the current action of the user and generate a detection result comprises the following steps:
if the voice words are the same as the preset awakening words, establishing a three-dimensional coordinate system by taking the rotation center of the rotating table as the origin of coordinates;
acquiring a plurality of user images acquired by the plurality of cameras to determine coordinates of human eyes of the user in the three-dimensional coordinate system;
and calculating to obtain the direction vector of the human eye sight of the user in the three-dimensional coordinate system according to the coordinate origin and the human eye coordinate.
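A minimal sketch of this calculation, assuming the direction vector runs from the human eye coordinates toward the coordinate origin and is normalized to unit length (the patent fixes neither convention):

```python
import math

def gaze_direction_vector(eye_coord, origin=(0.0, 0.0, 0.0)):
    """Unit vector from the user's eye coordinates toward the origin of the
    three-dimensional coordinate system (the rotation center of the table).
    The eye-to-origin direction is an assumed interpretation."""
    v = tuple(o - e for o, e in zip(origin, eye_coord))
    norm = math.sqrt(sum(c * c for c in v))
    return tuple(c / norm for c in v)
```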
Optionally, the acquiring a plurality of images of the user captured by the plurality of cameras to determine coordinates of human eyes of the user in the three-dimensional coordinate system includes:
acquiring coordinates of a plurality of human eye images in a pixel coordinate system in the plurality of user images;
converting the coordinates of the human eye images in a pixel coordinate system into the coordinates of the human eye images in the three-dimensional coordinate system;
calculating target coordinates closest to the coordinates of the plurality of human eye images in the three-dimensional coordinate system;
and taking the target coordinates as the coordinates of the human eyes of the user in the three-dimensional coordinate system.
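The patent does not state the distance criterion used for the target coordinates; one natural reading, assumed in this sketch, is the point minimizing the sum of squared distances to all per-image eye coordinates, i.e. their centroid:

```python
def target_coordinate(points):
    """Centroid of the per-image eye coordinate estimates: the point that
    minimizes the sum of squared distances to all of them (an assumption;
    the text only says 'closest to the coordinates of the plurality of
    human eye images')."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))
```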
Optionally, the executing the wake-up operation according to the detection result includes:
judging whether the direction vector is the same as a preset direction vector or not;
and if the direction vector is the same as the preset direction vector, executing wakeup operation.
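Testing real-valued vectors for exact equality is brittle in practice; a tolerant variant, assumed here rather than stated in the patent, compares the detected and preset direction vectors by cosine similarity against a threshold:

```python
import math

def vectors_match(v, preset, cos_threshold=0.98):
    """True if v and preset point in (nearly) the same direction.
    The threshold value is an illustrative assumption."""
    dot = sum(a * b for a, b in zip(v, preset))
    norm_v = math.sqrt(sum(a * a for a in v))
    norm_p = math.sqrt(sum(b * b for b in preset))
    return dot / (norm_v * norm_p) >= cos_threshold
```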
The embodiment of the invention also discloses a system awakening device, which is applied to an artificial intelligence system, wherein the artificial intelligence system is provided with a detection assembly, and the device comprises:
the recognition module is used for recognizing the voice words corresponding to the voice data when the artificial intelligence system acquires the voice data;
the judging module is used for judging whether the voice words are the same as preset awakening words or not;
the detection module is used for controlling the detection assembly to detect the current action of the user and generate a detection result if the voice word is the same as a preset awakening word;
and the execution module is used for executing the awakening operation according to the detection result.
Optionally, the detection assembly comprises: a rotating table on which a plurality of cameras are provided; the detection result comprises a direction vector;
the detection module comprises:
the establishing module is used for establishing a three-dimensional coordinate system by taking the rotation center of the rotating table as an origin of coordinates if the voice words are the same as the preset awakening words;
the determining module is used for acquiring a plurality of user images acquired by the plurality of cameras so as to determine the coordinates of human eyes of the user in the three-dimensional coordinate system;
and the calculation module is used for calculating to obtain a direction vector of the human eye sight of the user in the three-dimensional coordinate system according to the coordinate origin and the human eye coordinates.
Optionally, the determining module includes:
the acquisition module is used for acquiring the coordinates of a plurality of human eye images in the plurality of user images in a pixel coordinate system;
the transformation module is used for transforming the coordinates of the human eye images in the pixel coordinate system into the coordinates of the human eye images in the three-dimensional coordinate system;
a proximity coordinate module for calculating a target coordinate closest to the coordinates of the plurality of human eye images in the three-dimensional coordinate system;
and the target coordinate module is used for taking the target coordinate as the coordinate of the human eye of the user in the three-dimensional coordinate system.
Optionally, the execution module includes:
the direction vector judging module is used for judging whether the direction vector is the same as a preset direction vector or not;
and the same module is used for executing the awakening operation if the direction vector is the same as the preset direction vector.
The embodiment of the invention also discloses a device, which comprises:
one or more processors; and
one or more machine-readable media having instructions stored thereon, which when executed by the one or more processors, cause the apparatus to perform one or more methods as described in the embodiments above.
The embodiment of the invention also discloses a computer readable storage medium, which stores a computer program for causing a processor to execute any one of the methods described in the above embodiments.
The embodiment of the invention has the following advantages: when the artificial intelligence system acquires voice data, the voice word corresponding to the voice data is recognized; whether the voice word is the same as a preset awakening word is judged; if so, the detection assembly is controlled to detect the current action of the user and generate a detection result; and finally the awakening operation is executed according to the detection result. The system awakening method provided by this embodiment is simple and convenient to operate. Because it detects both the voice information and the current action of the user, and determines from that action whether the user wants to awaken the artificial intelligence system, it improves the accuracy of awakening identification, avoids the problem of repeated awakening and the resulting waste of resources, and reduces the probability of mistaken awakening. The amount of calculation in the judging process is small, so system power consumption is further reduced and the use experience of the user is improved.
Drawings
FIG. 1 is a flowchart illustrating a first embodiment of a system wake-up method according to the present invention;
FIG. 2 is a flowchart illustrating the steps of a second embodiment of the wake-up method of the present invention;
fig. 3 is a schematic structural diagram of a system wake-up apparatus according to a first embodiment of the invention.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.
Referring to fig. 1, a flowchart of the steps of a first embodiment of the system wake-up method of the present invention is shown. The method may be applied to an artificial intelligence system, which may be an application system developed using artificial intelligence or knowledge engineering technology, a knowledge-based software engineering auxiliary system, an intelligent operating system that integrates an operating system with artificial intelligence and cognitive science, or a mobile terminal, a computer terminal, or a similar computing device. In a particular implementation, the artificial intelligence system may be a voice intelligence system. The voice intelligence system may include a voice receiving device for receiving voice data, a recognition device for recognizing voice data, an infrared sensor, a heat source detector, one or more processors (which may include, but are not limited to, a processing device such as a microprocessor (MCU) or a programmable logic device (FPGA)), and a memory for storing data.
The memory may be used to store a computer program, for example, a software program and a module of application software, such as a computer program corresponding to the system wake-up method in the embodiment of the present invention; the processor executes various functional applications and data processing by running the computer program stored in the memory, that is, implements the method described above. The memory may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory may further include memory located remotely from the processor, and these remote memories may be connected to the artificial intelligence system through a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
Specifically, the artificial intelligence system is provided with a detection assembly. The detection assembly may be a face detection assembly, an image detection assembly, or the like, and can be used to detect the current action of the user, such as eye rotation, gaze coordinates, or limb movement. From this action the system can further determine whether the user, in uttering the voice data, wished to awaken the artificial intelligence system, and execute the awakening action only in that case, so that false awakening of the artificial intelligence system is avoided and its energy consumption reduced.
In this embodiment, the method may include:
step 101, when the artificial intelligence system obtains voice data, identifying a voice word corresponding to the voice data.
In this embodiment, the artificial intelligence system may be in an off state, or the artificial intelligence system may be in a standby state, or the artificial intelligence system may be in a sleep state, and then receive the voice data, so that whether to perform the wakeup operation may be determined according to the voice data. The voice data may be voice information of the user, such as audio.
In an alternative embodiment, the artificial intelligence system may be provided with a speech receiving device, which may be employed to receive speech information input by a user. Specifically, the voice receiving device may be a microphone, and the microphone may be used to receive voice information input by a user.
In one optional embodiment, the artificial intelligence system may also be connected to an external device, where the external device may be an intelligent terminal, an intelligent device, a server, or the like. The voice information of the user can be received through the intelligent terminal or the intelligent equipment, and then the voice information is sent to the artificial intelligence system through the intelligent terminal or the intelligent equipment.
In the implementation of this embodiment, after the artificial intelligence system obtains the speech data, it may identify the corresponding speech word according to the speech data.
Specifically, the artificial intelligence system may set a speech recognition model, and the speech recognition model may obtain phonemes from the speech information, and may obtain corresponding characters by using the phonemes.
For example, in practical operation, a speech recognition model may be used to obtain acoustic features of speech data, convert the acoustic features into a phoneme array, and then further convert the phoneme array into a text sequence, thereby completing the recognition of the speech data.
It should be noted that the acoustic features may be waveform features, such as amplitude, period, wavelength, decibel level, acoustic power, sound intensity, fundamental frequency, formants, and so on. A phoneme is the smallest unit of speech; phonemes are analyzed according to the pronunciation actions within a syllable, where one action constitutes one phoneme, and are divided into the two categories of vowels and consonants. For example, the Chinese syllable ā (啊) has only one phoneme, ài (爱, love) has two phonemes, and dāi (呆, dull) has three phonemes.
In actual operation, the phoneme array can be fed into the trained recognition function, which computes the corresponding character sequence.
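As a toy illustration of the phoneme-array-to-text step only (the patent's recognizer would use trained acoustic and language models, not a table), a lookup sketch with an assumed mapping:

```python
# Assumed toy phoneme table; a real recognizer learns this mapping.
PHONEME_TO_CHAR = {
    ("n", "i"): "ni",
    ("h", "a", "o"): "hao",
}

def phonemes_to_text(phoneme_groups):
    """Map each group of phonemes to a syllable, '?' if unknown."""
    return " ".join(PHONEME_TO_CHAR.get(tuple(g), "?") for g in phoneme_groups)
```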
And 102, judging whether the voice words are the same as preset awakening words or not.
In this embodiment, after obtaining the speech word, it may be determined whether the speech word is the same as a preset wake-up word, and it may be determined whether the user wishes to wake up the artificial intelligence system.
Specifically, the wake-up word may be "minuscule", "hi, sin", or the like.
And 103, if the voice word is the same as a preset awakening word, controlling the detection component to detect the current action of the user and generate a detection result.
In this embodiment, when the voice word is the same as the preset awakening word, two situations are possible: either the user is merely chatting and an utterance in the chat happens to match the awakening word, or the user actually wants to awaken the artificial intelligence system. In the former case the artificial intelligence system would be awakened without executing any operation, which increases its energy consumption and wastes resources. To avoid this situation, the current action of the user can be detected by the detection component and a detection result generated; whether the system needs to be awakened is then determined according to the detection result, so that the energy consumption of the artificial intelligence system is reduced and waste of resources is avoided.
In one alternative embodiment, the user's limb movement may be detected, for example, if the user's limb movement is large in magnitude, it may be determined that the user wishes to wake up the artificial intelligence system; if the limb amplitude of the user is small, it may be determined that the user does not wish to wake up the artificial intelligence system.
Or, it may be detected whether the face of the user is facing the artificial intelligence system, and if the face of the user is facing the artificial intelligence system, it may be determined that the user wishes to wake up the artificial intelligence system; if the user's face is not oriented toward the artificial intelligence system, it may be determined that the user does not wish to wake the artificial intelligence system.
In another optional embodiment, the detection assembly comprises: a rotating table on which a plurality of cameras are provided; the detection result includes a direction vector.
The rotating table may be a servo-driven pan-tilt platform. The plurality of cameras may be multi-lens cameras.
In this embodiment, the eyeball gazing direction of the user may be detected, and if the eyeball gazing direction of the current user is toward the artificial intelligence system, it may be determined that the user wishes to wake up the artificial intelligence system; if the eye gaze direction of the current user is not toward the artificial intelligence system, it may be determined that the user does not wish to wake up the artificial intelligence system.
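One way to decide "toward the artificial intelligence system", assumed here with an illustrative 10-degree tolerance, is to check the angle between the gaze direction and the vector from the eye to the device:

```python
import math

def gazing_at_system(gaze_vec, eye_pos, tolerance_deg=10.0):
    """True if gaze_vec points from eye_pos at the device (the coordinate
    origin) to within tolerance_deg. The tolerance is an assumption."""
    to_device = tuple(-c for c in eye_pos)  # vector from the eye to the origin
    dot = sum(a * b for a, b in zip(gaze_vec, to_device))
    norm_g = math.sqrt(sum(a * a for a in gaze_vec))
    norm_d = math.sqrt(sum(b * b for b in to_device))
    cos_a = max(-1.0, min(1.0, dot / (norm_g * norm_d)))
    return math.degrees(math.acos(cos_a)) <= tolerance_deg
```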
And 104, executing a wakeup operation according to the detection result.
In this embodiment, if the detection result determines that the user wishes to wake up the artificial intelligence system, a wake-up operation may be performed; if the detection result determines that the user does not want to wake up the artificial intelligence system, the artificial intelligence system can continue to sleep, so that energy consumption can be reduced.
In an optional embodiment of the present invention, a system wake-up method is provided, where when the artificial intelligence system obtains speech data, a speech word corresponding to the speech data is identified; judging whether the voice words are the same as preset awakening words or not; if the voice words are the same as preset awakening words, controlling the detection assembly to detect the current actions of the user and generate a detection result; and finally executing the awakening operation according to the detection result. The system awakening method provided by the embodiment is simple and convenient to operate, can detect the voice information of the user and judge the current action of the user, determines whether the user wants to awaken the artificial intelligence system according to the current action of the user, can improve the accuracy of awakening identification, can avoid the problem of repeated awakening, avoids resource waste, and meanwhile has small calculated amount in the judging process, can further reduce the power consumption of the system, can reduce the probability of mistaken awakening and improve the use experience of the user.
Referring to fig. 2, a flowchart of the steps of the second embodiment of the system wake-up method of the present invention is shown, in this embodiment, the method may be applied to an artificial intelligence system, which may be an application system developed by using artificial intelligence technology or knowledge engineering technology, or a knowledge-based software engineering auxiliary system, or an intelligent operating system researched by integrating an operating system with artificial intelligence and cognitive science, or a mobile terminal, a computer terminal, or a similar computing device, etc. In a particular implementation, the artificial intelligence system may be a voice intelligence system. The voice intelligence system may include a voice receiving device for receiving voice data, a recognition device for recognizing voice data, an infrared sensor, a heat source detector, one or more processors (which may include, but are not limited to, a processing device such as a microprocessor MCU or a programmable logic device FPGA), and a memory for storing data.
The memory may be used to store a computer program, for example, a software program and a module of application software, such as a computer program corresponding to the system wake-up method in the embodiment of the present invention; the processor executes various functional applications and data processing by running the computer program stored in the memory, that is, implements the method described above. The memory may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory may further include memory located remotely from the processor, and these remote memories may be connected to the artificial intelligence system through a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
Specifically, the artificial intelligence system is provided with a detection assembly. The detection assembly may be a face detection assembly, an image detection assembly, or the like, and can be used to detect the current action of the user, such as eye rotation, gaze coordinates, or limb movement. From this action the system can further determine whether the user, in uttering the voice data, wished to awaken the artificial intelligence system, and execute the awakening action only in that case, so that false awakening of the artificial intelligence system is avoided and its energy consumption reduced.
In this embodiment, the method may include:
step 201, when the artificial intelligence system obtains the voice data, recognizing the voice word corresponding to the voice data.
In this embodiment, the artificial intelligence system may be in an off state, or the artificial intelligence system may be in a standby state, or the artificial intelligence system may be in a sleep state, and then receive the voice data, so that whether to perform the wakeup operation may be determined according to the voice data. The voice data may be voice information of the user, such as audio.
In an alternative embodiment, the artificial intelligence system may be provided with a speech receiving device, which may be employed to receive speech information input by a user. Specifically, the voice receiving device may be a microphone, and the microphone may be used to receive voice information input by a user.
In an alternative embodiment, the artificial intelligence system may also be connected to an external device, where the external device may be an intelligent terminal or an intelligent device or a server. The voice information of the user can be received through the intelligent terminal or the intelligent equipment, and then the voice information is sent to the artificial intelligence system through the intelligent terminal or the intelligent equipment.
In the implementation of this embodiment, after the artificial intelligence system obtains the speech data, it may identify the corresponding speech word according to the speech data.
Specifically, the artificial intelligence system may set a speech recognition model, and the speech recognition model may obtain phonemes from the speech information, and may obtain corresponding characters by using the phonemes.
For example, in practical operation, a speech recognition model may be used to obtain acoustic features of speech data, convert the acoustic features into a phoneme array, and then further convert the phoneme array into a text sequence, thereby completing the recognition of the speech data.
It should be noted that the acoustic features may be waveform features, such as amplitude, period, wavelength, decibel level, acoustic power, sound intensity, fundamental frequency, formants, and so on. A phoneme is the smallest unit of speech; phonemes are analyzed according to the pronunciation actions within a syllable, where one action constitutes one phoneme, and are divided into the two categories of vowels and consonants. For example, the Chinese syllable ā (啊) has only one phoneme, ài (爱, love) has two phonemes, and dāi (呆, dull) has three phonemes.
In actual operation, the phoneme array can be fed into the trained recognition function, which computes the corresponding character sequence.
Step 202, judging whether the voice word is the same as a preset awakening word.
In this embodiment, after obtaining the speech word, it may be determined whether the speech word is the same as a preset wake-up word, and it may be determined whether the user wishes to wake up the artificial intelligence system.
Specifically, the wake-up word may be "minuscule", "hi, sin", or the like.
And 203, if the voice word is the same as a preset awakening word, controlling the detection component to detect the current action of the user and generate a detection result.
In this embodiment, when the voice word is the same as the preset awakening word, two situations are possible: either the user is merely chatting and an utterance in the chat happens to match the awakening word, or the user actually wants to awaken the artificial intelligence system. In the former case the artificial intelligence system would be awakened without executing any operation, which increases its energy consumption and wastes resources. To avoid this situation, the current action of the user can be detected by the detection component and a detection result generated; whether the system needs to be awakened is then determined according to the detection result, so that the energy consumption of the artificial intelligence system is reduced and waste of resources is avoided.
In one alternative embodiment, the user's limb movement may be detected, for example, if the user's limb movement is large in magnitude, it may be determined that the user wishes to wake up the artificial intelligence system; if the user's limb amplitude is small, it may be determined that the user does not wish to wake up the artificial intelligence system.
Or, it may be detected whether the face of the user is facing the artificial intelligence system, and if the face of the user is facing the artificial intelligence system, it may be determined that the user wishes to wake up the artificial intelligence system; if the user's face is not oriented toward the artificial intelligence system, it may be determined that the user does not wish to wake the artificial intelligence system.
In another optional embodiment, the detection assembly comprises: a rotating table on which a plurality of cameras are provided; the detection result includes a direction vector.
The rotating table may be a servo-driven pan-tilt platform (steering-engine gimbal), and the plurality of cameras may be a multi-lens camera array.
In this embodiment, the eyeball gazing direction of the user may be detected, and if the eyeball gazing direction of the current user is toward the artificial intelligence system, it may be determined that the user wishes to wake up the artificial intelligence system; if the eye gaze direction of the current user is not toward the artificial intelligence system, it may be determined that the user does not wish to wake up the artificial intelligence system.
Optionally, step 203 may comprise the sub-steps of:
Sub-step 2031: if the speech word is the same as the preset wake-up word, establishing a three-dimensional coordinate system with the rotation center of the turntable as the origin of coordinates.
In this embodiment, the turntable may rotate around a point, which is the rotation center of the turntable. In practice, a laser may be provided in the rotary table, and the laser may rotate together with the rotary table, so that the laser may be emitted at a plurality of angles. In the embodiment of the present invention, a three-dimensional coordinate system may be established with the rotation center of the turntable as the origin of coordinates.
The laser may emit a spot, infrared light, etc., and measure the distance between the spot or infrared light and the laser.
The coordinates of the spot or the infrared ray in the three-dimensional coordinate system can be calculated from the distance measured by the laser. It should be noted that, for convenience of calculation, the laser may be disposed at the rotation center of the turntable; in this case the laser can be considered to measure the distance between the light spot (or infrared ray) and the rotation center of the turntable, so that this distance can be acquired directly and the coordinates of the light spot (or infrared ray) in the three-dimensional coordinate system calculated from it.
In a preferred embodiment of the present invention, the rotating platform may include two steering engines whose rotation planes are perpendicular to each other: for example, a horizontally rotating steering engine and a vertically rotating steering engine.
In the embodiment of the invention, the laser can be arranged on the two steering engines, and when the two steering engines rotate, the laser can be driven to move, so that the projection position of light spots or infrared rays emitted by the laser is changed.
Specifically, two rotation angles θ and φ can be randomly generated. Let N_a and N_b be the step lengths of the two steering engines corresponding to a rotation angle of 90 degrees. Then, for rotation angles θ and φ, the step values V_1 and V_2 of the steering engines satisfy the following formulas:

V_1 = θ ÷ 90 × N_a

V_2 = φ ÷ 90 × N_b

When the two steering engines are controlled, the step values V_1 and V_2 can be encoded into the command statements of the steering engines to generate rotation commands, and the rotation angles θ and φ are recorded at the same time.
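The step-value formula above can be expressed as a small helper; the function name and the steps-per-90-degrees value are illustrative assumptions, not values from the patent:

```python
def angle_to_steps(angle_deg, steps_per_90):
    """Convert a rotation angle in degrees to a servo (steering engine)
    step value, given the number of steps that corresponds to a
    90-degree rotation: V = angle / 90 * N."""
    return angle_deg / 90.0 * steps_per_90

# Example: a servo with 500 steps per 90 degrees, commanded to 45 degrees
v1 = angle_to_steps(45.0, 500)   # -> 250.0
```

The same helper is applied once per axis (θ with N_a, φ with N_b), and the resulting V_1, V_2 are what get encoded into the rotation commands.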
after the two steering engines complete rotation at random angles, a ranging command can be sent to the laser, the laser emits laser (namely, emits light spots or infrared rays) after receiving the ranging command, and the laser can form light spots on an obstacle after encountering the obstacle. The laser may further measure the distance L between the spot or infrared and the rotation center of the rotary table.
In the embodiment of the present invention, the distance L between the light spot (or infrared ray) and the rotation center of the rotary table, together with the rotation angles θ and φ, may be used to calculate the coordinates of the light spot (or infrared ray) in the three-dimensional coordinate system.

Specifically, taking θ as the horizontal rotation angle and φ as the vertical rotation angle, the coordinates of the light spot (or infrared ray) in the three-dimensional coordinate system satisfy the following formula:

P = (L·cos φ·cos θ, L·cos φ·sin θ, L·sin φ)

wherein θ and φ denote the rotation angles of the rotary table, L is the distance between the light spot (or infrared ray) and the rotation center of the rotary table, and P is the coordinate of the light spot (or infrared ray) in the three-dimensional coordinate system.

Substituting the measured distance L and the recorded angles θ and φ into the above formula yields the coordinates of the light spot (or infrared ray) in the three-dimensional coordinate system.
Sub-step 2032 of acquiring a plurality of user images captured by the plurality of cameras to determine coordinates of the human eyes of the user in the three-dimensional coordinate system.
In this embodiment, after the three-dimensional coordinate system is established, the plurality of cameras may each capture a user image. When the user images are collected, the captured subject gazes at the light spot projected on the obstacle in front, while each camera simultaneously captures a user image to complete the shooting task.

In order to ensure that the eyes of the captured subject are gazing at the light spot when the user images are collected, the spot can be set to flash a random number of times, and the subject is asked to state how many times the spot flashed, thereby ensuring that the gazing action actually occurs.
In an embodiment of the present invention, a human eye image of the plurality of user images may be employed to determine coordinates of a human eye in a three-dimensional coordinate system. The human eye image may be an image of a user including a human eye part in the acquired user image.
In a preferred embodiment, sub-step 2032 may comprise the following sub-steps:
Sub-step 20321: obtaining the coordinates, in the pixel coordinate system, of a plurality of human eye images among the plurality of user images.
In the embodiment of the present invention, to determine the coordinates of the human eye in the three-dimensional coordinate system, the coordinates of the plurality of human eye images in the plurality of user images in the pixel coordinate system may be determined first.
The origin of coordinates of the pixel coordinate system is the upper left corner of the user image, the positive direction of the X axis is from the upper left corner to the right, and the positive direction of the Y axis is from the upper left corner to the lower. The position P of the eye in the image of the human eye can be manually marked so as to obtain the coordinates of P in the pixel coordinate system.
In practical operations, the position P of the eye in the human eye image may also be automatically marked according to a certain rule, for example, the position P of the eye in the human eye image may be automatically marked as the center position of the human eye image, which is not limited in the embodiment of the present invention.
Sub-step 20322, converting the coordinates of the plurality of human eye images in the pixel coordinate system into the coordinates of the plurality of human eye images in the three-dimensional coordinate system.
In this embodiment, after determining the coordinates of the plurality of human eye images in the pixel coordinate system of the plurality of user images, the coordinates of the plurality of human eye images in the pixel coordinate system may be further converted into the coordinates of the plurality of human eye images in the three-dimensional coordinate system according to a preset conversion relationship.
Optionally, a preset conversion relationship between the pixel coordinate system and the three-dimensional coordinate system may be adopted to convert the coordinates of the plurality of human eye images in the pixel coordinate system into their coordinates in the three-dimensional coordinate system.
In a particular implementation, the pixel coordinate system and the image coordinate system both lie on the imaging plane; they differ only in their origins and measurement units. The origin of the image coordinate system is usually the midpoint of the imaging plane and its unit is the millimetre, a physical unit, while the unit of the pixel coordinate system is the pixel: we usually describe a pixel position as being in a certain row and column. The conversion between the two is x = u·dx and y = v·dy, where dx and dy denote how many millimetres each column and each row respectively represent, i.e. 1 pixel corresponds to dx mm horizontally and dy mm vertically.
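The pixel-to-image conversion can be sketched as follows; (u0, v0) is the principal point in pixels and dx, dy the millimetre size of one pixel — all illustrative parameters that in practice come from camera calibration:

```python
def pixel_to_image(u, v, u0, v0, dx, dy):
    """Convert pixel coordinates (u, v) into image-plane coordinates
    (x, y) in millimetres, relative to the principal point (u0, v0).
    dx and dy are the physical pixel sizes (mm per pixel) in the
    column and row directions respectively."""
    x = (u - u0) * dx
    y = (v - v0) * dy
    return x, y
```

Going further from the image plane to the three-dimensional coordinate system additionally requires the camera's extrinsic parameters, which is why the text determines the full conversion relationship per camera via calibration.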
The conversion relationship between the preset pixel coordinate system and the three-dimensional coordinate system is a conversion relationship between coordinate systems preset for each camera. Specifically, the parameters of each camera can be determined through camera calibration, and the conversion relation between the coordinate systems can be determined by using the parameters of each camera.
In the embodiment of the invention, the coordinates of the plurality of human eye images in the pixel coordinate system can thus be converted into their coordinates in the three-dimensional coordinate system by adopting the preset conversion relationship between the pixel coordinate system and the three-dimensional coordinate system.
It should be noted that, since the imaging parameters of each camera differ, the conversion relationship between the pixel coordinate system and the three-dimensional coordinate system must be determined separately for each camera.
Sub-step 20323 of calculating target coordinates closest to the coordinates of the plurality of human eye images in the three-dimensional coordinate system.
In the embodiment of the invention, the coordinate closest to all of the sight lines can be calculated from the origin coordinate of each camera and the coordinate of its corresponding human eye image in the three-dimensional coordinate system, and this closest coordinate is taken as the target coordinate.

Specifically, a straight line may be determined from the origin coordinate of each camera and the coordinate of its eye image in the three-dimensional coordinate system. A line perpendicularly intersecting each pair of these straight lines is then calculated, the intersection points of the perpendiculars are obtained, and the coordinate of the intersection point is taken as the closest coordinate, which serves as the target coordinate.
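Because the sight lines from different cameras generally do not intersect exactly, the "intersection of the perpendiculars" can be computed as the least-squares point closest to all lines. A sketch under that interpretation (pure Python, Cramer's rule for the 3×3 solve; the formulation is one common way to realise the step, not necessarily the patent's exact procedure):

```python
import math

def closest_point_to_lines(origins, directions):
    """Least-squares point minimising the summed squared distance to a
    set of 3-D lines, each given by a camera origin and a direction
    (camera origin -> eye-image coordinate). Solves A x = b with
    A = sum(I - d d^T), b = sum((I - d d^T) o) over all lines."""
    A = [[0.0] * 3 for _ in range(3)]
    b = [0.0] * 3
    for o, d in zip(origins, directions):
        n = math.sqrt(sum(c * c for c in d))
        d = [c / n for c in d]                      # unit direction
        for i in range(3):
            for j in range(3):
                p = (1.0 if i == j else 0.0) - d[i] * d[j]
                A[i][j] += p                        # projector off the line
                b[i] += p * o[j]
    def det3(m):
        return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
              - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
              + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))
    D = det3(A)   # non-zero when the lines are not all parallel
    x = []
    for k in range(3):
        Ak = [row[:] for row in A]
        for i in range(3):
            Ak[i][k] = b[i]
        x.append(det3(Ak) / D)
    return x
```

When the lines do intersect exactly, this returns the true intersection; otherwise it returns the point the perpendicular-midpoint construction approximates.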
Sub-step 20324: taking the target coordinates as the coordinates of the human eye of the user in the three-dimensional coordinate system.
In this embodiment, after the target coordinates are acquired, the target coordinates may be determined to be coordinates of the human eyes of the user in the three-dimensional coordinate system. The coordinates are the current coordinates of the user in the three-dimensional coordinate system.
Sub-step 2033: calculating a direction vector of the eye sight of the user in the three-dimensional coordinate system according to the coordinate origin and the eye coordinates.
In the embodiment of the invention, the coordinates of the light spot in the three-dimensional coordinate system and the coordinates of the human eye in the three-dimensional coordinate system (i.e., the origin coordinates and the target coordinates) can be used to determine the direction vector of the human eye's line of sight in the three-dimensional coordinate system.

Specifically, the direction vector of the line of sight of the human eye in the three-dimensional coordinate system can be obtained by subtracting the coordinates of the light spot in the three-dimensional coordinate system from the coordinates of the human eye in the three-dimensional coordinate system.
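A minimal sketch of the direction-vector computation; the sign convention used here (eye toward spot, i.e. spot minus eye) is an assumption — the text's subtraction yields the same line with the opposite orientation:

```python
import math

def gaze_direction(eye, spot):
    """Unit direction vector of the line of sight, taken from the
    user's eye coordinate toward the light-spot coordinate in the
    three-dimensional coordinate system. Sign convention (eye -> spot)
    is an assumption."""
    v = [s - e for e, s in zip(eye, spot)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]
```

Normalising makes the later comparison against the preset direction vector depend only on orientation, not on how far the eye is from the spot.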
Step 204: executing a wake-up operation according to the detection result.
In this embodiment, if the detection result determines that the user wishes to wake up the artificial intelligence system, a wake-up operation may be performed; if the detection result determines that the user does not want to wake up the artificial intelligence system, the artificial intelligence system can continue to sleep, so that energy consumption can be reduced.
Optionally, step 204 may include the following sub-steps:
Sub-step 2041: determining whether the direction vector is the same as a preset direction vector.
In this embodiment, the preset direction vector may be any direction toward the artificial intelligence system; specifically, it may be a direction vector falling within a range of 0-180 degrees in the horizontal direction and 0-180 degrees in the longitudinal direction relative to the artificial intelligence system.
It should be noted that, in a specific implementation, the preset direction vector may be adjusted according to actual needs, and the present invention is not limited herein.
Sub-step 2042, if the direction vector is the same as the preset direction vector, perform a wake-up operation.
In the embodiment of the invention, if the direction vector falls within the range of 0-180 degrees in the horizontal direction and 0-180 degrees in the longitudinal direction relative to the artificial intelligence system, the direction vector is considered the same as the preset direction vector, and the wake-up operation is performed.
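One way to implement the range check; the mapping from a direction vector to horizontal and vertical angles is an assumption, since the patent specifies only the 0-180 degree ranges:

```python
import math

def within_wake_range(v, h_range=(0.0, 180.0), v_range=(0.0, 180.0)):
    """Check whether a gaze direction vector (x, y, z) falls inside the
    preset horizontal/vertical angular ranges, in degrees. Horizontal
    angle is taken in the x-y plane; vertical angle is elevation shifted
    to a 0-180 scale. Both conventions are illustrative assumptions."""
    x, y, z = v
    horizontal = math.degrees(math.atan2(y, x)) % 360.0
    vertical = math.degrees(math.atan2(z, math.hypot(x, y))) + 90.0
    return (h_range[0] <= horizontal <= h_range[1]
            and v_range[0] <= vertical <= v_range[1])
```

With these conventions, a vector pointing into the front half-space of the system passes the check, and one pointing behind it fails.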
Step 205, generating an operation result and sending the operation result to a user.
In this embodiment, the artificial intelligence system may generate an operation result, such as a sound prompt, a video prompt, or an action prompt, after performing the wake-up operation. After the operation result is generated, it may be sent to the user; for example, a sound prompt may be issued, and the user may then perform a corresponding operation according to the prompt. This can improve the user's experience.
In the preferred embodiment of the present invention, a system wake-up method is provided. When the artificial intelligence system acquires voice data, it can identify the speech word corresponding to the voice data and judge whether the speech word is the same as a preset wake-up word. If so, the detection assembly is controlled to detect the current action of the user and generate a detection result; the wake-up operation is then executed according to the detection result, and finally an operation result is generated and sent to the user. The system wake-up method provided by this embodiment is simple and convenient to operate: it detects the user's voice information, judges the user's current action, and determines from that action whether the user wants to wake the artificial intelligence system. This improves wake-up recognition accuracy and avoids the problem of repeated wake-ups and the resulting waste of resources; meanwhile, the amount of calculation in the judging process is small, which further reduces system power consumption, lowers the probability of false wake-ups, and improves the user's experience.
It should be noted that, for simplicity of description, the method embodiments are described as a series of acts or combinations of acts, but those skilled in the art will recognize that the present invention is not limited by the illustrated order of acts, as some steps may occur in other orders or concurrently in accordance with the embodiments of the present invention. Furthermore, those skilled in the art will also appreciate that the embodiments described in the specification are preferred embodiments, and that the acts involved are not necessarily required by the present invention.
Referring to fig. 3, a schematic structural diagram of one embodiment of the system wake-up apparatus of the present invention is shown. In this embodiment, the apparatus may be applied to an artificial intelligence system, where the artificial intelligence system is provided with a detection component, and the apparatus includes:
the recognition module 301 is configured to recognize a speech word corresponding to the speech data when the artificial intelligence system obtains the speech data;
a determining module 302, configured to determine whether the speech word is the same as a preset wake-up word;
the detection module 303 is configured to control the detection component to detect a current action of the user and generate a detection result if the voice word is the same as a preset wake-up word;
an executing module 304, configured to execute a wakeup operation according to the detection result.
Optionally, the detection assembly comprises: a rotating table on which a plurality of cameras are provided; the detection result comprises a direction vector;
the detection module comprises:
the establishing module is used for establishing a three-dimensional coordinate system by taking the rotation center of the rotating table as the origin of coordinates if the voice words are the same as the preset awakening words;
the determining module is used for acquiring a plurality of user images acquired by the plurality of cameras so as to determine the coordinates of human eyes of the user in the three-dimensional coordinate system;
and the calculation module is used for calculating to obtain the direction vector of the human eye sight of the user in the three-dimensional coordinate system according to the coordinate origin and the human eye coordinate.
Optionally, the determining module includes:
the acquisition module is used for acquiring the coordinates of a plurality of human eye images in the plurality of user images in a pixel coordinate system;
the transformation module is used for transforming the coordinates of the human eye images in the pixel coordinate system into the coordinates of the human eye images in the three-dimensional coordinate system;
a proximity coordinate module for calculating a target coordinate closest to the coordinates of the plurality of human eye images in the three-dimensional coordinate system;
and the target coordinate module is used for taking the target coordinate as the coordinate of the human eye of the user in the three-dimensional coordinate system.
Optionally, the execution module includes:
the direction vector judging module is used for judging whether the direction vector is the same as a preset direction vector or not;
and the same module is used for executing the awakening operation if the direction vector is the same as the preset direction vector.
Optionally, the apparatus further comprises:
and the generating module is used for generating an operation result and sending the operation result to a user.
In one embodiment of the present invention, a system wake-up apparatus is provided, which may include: the recognition module 301 is configured to recognize a speech word corresponding to speech data when the artificial intelligence system obtains the speech data; a judging module 302, configured to judge whether the speech word is the same as a preset wake-up word; the detection module 303 is configured to control the detection component to detect a current action of the user and generate a detection result if the voice word is the same as a preset wake-up word; an executing module 304, configured to execute a wakeup operation according to the detection result. The system awakening device provided by the embodiment can detect the voice information of the user and judge the current action of the user, and determines whether the user wants to awaken the artificial intelligence system according to the current action of the user, so that the accuracy of awakening identification can be improved, the problem of repeated awakening can be avoided, the waste of resources is avoided, meanwhile, the calculated amount in the judging process is small, the power consumption of the system can be further reduced, the probability of mistaken awakening can be reduced, and the use experience of the user is improved.
The embodiments in the present specification are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other.
An embodiment of the present invention further provides an apparatus, including:
the method comprises one or more processors, a memory and a machine readable medium stored in the memory and capable of running on the processor, wherein the machine readable medium is implemented by the processor to realize the processes of the method embodiments, and can achieve the same technical effects, and in order to avoid repetition, the machine readable medium is not described herein again.
The embodiment of the present invention further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when being executed by a processor, the computer program implements the processes of the foregoing method embodiments, and can achieve the same technical effects, and is not described herein again to avoid repetition.
As will be appreciated by one of skill in the art, embodiments of the present invention may be provided as a method, apparatus, or computer program product. Accordingly, embodiments of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
Embodiments of the present invention are described with reference to flowchart illustrations and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing terminal to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing terminal to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing terminal to cause a series of operational steps to be performed on the computer or other programmable terminal to produce a computer implemented process such that the instructions which execute on the computer or other programmable terminal provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications of these embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the embodiments of the invention.
Finally, it should also be noted that, in this document, relational terms such as first and second are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or terminal. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional like elements in a process, method, article, or terminal device that comprises the element.
The above detailed description is made on a system wake-up method and a system wake-up apparatus provided by the present invention, and a specific example is applied in the present document to explain the principle and the implementation of the present invention, and the description of the above embodiment is only used to help understanding the method and the core idea of the present invention; meanwhile, for a person skilled in the art, according to the idea of the present invention, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present specification should not be construed as a limitation to the present invention.

Claims (8)

1. A system wake-up method is applied to an artificial intelligence system, the artificial intelligence system is provided with a detection component, and the method comprises the following steps:
when the artificial intelligence system acquires voice data, recognizing voice words corresponding to the voice data;
judging whether the voice words are the same as preset awakening words or not;
if the voice words are the same as preset awakening words, controlling the detection assembly to detect the current action of the user and generate a detection result;
executing a wakeup operation according to the detection result;
generating an operation result and sending the operation result to a user, wherein the operation result comprises a sound prompt, a video prompt and/or an action prompt;
wherein, if the voice word is the same as a preset awakening word, the detection component is controlled to detect the current action of the user and generate a detection result, and the method comprises the following steps:
if the voice words are the same as the preset awakening words, establishing a three-dimensional coordinate system by taking the rotation center of the rotation table in the detection assembly as the origin of coordinates;
acquiring a plurality of user images acquired by a plurality of cameras of the rotating platform to determine coordinates of human eyes of a user in the three-dimensional coordinate system;
calculating to obtain a direction vector of the eye sight of the user in the three-dimensional coordinate system according to the coordinate origin and the eye coordinate;
wherein the acquiring a plurality of user images captured by a plurality of cameras of the rotating table to determine coordinates of the human eye of the user in the three-dimensional coordinate system comprises:
acquiring coordinates of a plurality of human eye images in a pixel coordinate system in the plurality of user images;
converting the coordinates of the plurality of human eye images in the pixel coordinate system into the coordinates of the plurality of human eye images in the three-dimensional coordinate system by adopting a conversion relation between a preset pixel coordinate system and the three-dimensional coordinate system;
calculating target coordinates closest to the coordinates of the human eye images in the three-dimensional coordinate system according to the origin coordinates of each camera and the coordinates of the corresponding human eye images in the three-dimensional coordinate system;
and taking the target coordinates as the coordinates of the human eyes of the user in the three-dimensional coordinate system.
2. The method of claim 1, wherein the detection component comprises: a rotating table on which a plurality of cameras are provided; the detection result includes a direction vector.
3. The method of claim 1, wherein performing the wake-up operation according to the detection result comprises:
judging whether the direction vector is the same as a preset direction vector or not;
and if the direction vector is the same as the preset direction vector, executing the awakening operation.
4. A system wake-up apparatus, applied to an artificial intelligence system, wherein the artificial intelligence system is provided with a detection component, and the apparatus comprises:
the recognition module is used for recognizing the voice words corresponding to the voice data when the artificial intelligence system acquires the voice data;
the judging module is used for judging whether the voice words are the same as preset awakening words or not;
the detection module is used for controlling the detection assembly to detect the current action of the user and generate a detection result if the voice word is the same as a preset awakening word;
the execution module is used for executing the awakening operation according to the detection result;
the generating module is used for generating an operation result and sending the operation result to a user, wherein the operation result comprises a sound prompt, a video prompt and/or an action prompt;
wherein the detection module comprises:
the establishing module is used for establishing a three-dimensional coordinate system by taking the rotation center of the rotation table in the detection assembly as a coordinate origin if the voice word is the same as a preset awakening word;
the determining module is used for acquiring a plurality of user images acquired by a plurality of cameras of the rotating platform so as to determine the coordinates of human eyes of a user in the three-dimensional coordinate system;
the calculation module is used for calculating to obtain a direction vector of the human eye sight of the user in the three-dimensional coordinate system according to the coordinate origin and the human eye coordinate;
wherein the determining module comprises:
the acquisition module is used for acquiring the coordinates of a plurality of human eye images in the plurality of user images in a pixel coordinate system;
the conversion module is used for converting the coordinates of the human eye images in the pixel coordinate system into the coordinates of the human eye images in the three-dimensional coordinate system by adopting a conversion relation between a preset pixel coordinate system and the three-dimensional coordinate system;
the proximity coordinate module is used for calculating a target coordinate closest to the coordinates of the human eye images in the three-dimensional coordinate system according to the origin coordinate of each camera and the coordinates of the corresponding human eye images in the three-dimensional coordinate system;
and the target coordinate module is used for taking the target coordinate as the coordinate of the human eye of the user in the three-dimensional coordinate system.
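The conversion and proximity-coordinate modules above describe a multi-view triangulation: each camera's eye-image pixel is back-projected into the three-dimensional coordinate system, and the target coordinate is the single point nearest all of the resulting rays. A minimal sketch, assuming a pinhole camera model with known intrinsic matrix `K` and rotation `R` per camera (the claim's "conversion relation" between the pixel and three-dimensional coordinate systems is not specified, so these are illustrative stand-ins):

```python
import numpy as np

def backproject(pixel, K, R):
    """Back-project a pixel into a world-space ray direction (pinhole model).

    K is the 3x3 intrinsic matrix and R the camera's rotation in the
    turntable's three-dimensional coordinate system; the ray starts at the
    camera's origin coordinate, which is handled separately below.
    """
    uv1 = np.array([pixel[0], pixel[1], 1.0])
    d_cam = np.linalg.inv(K) @ uv1          # ray direction in camera coordinates
    d_world = R.T @ d_cam                   # rotate into the turntable frame
    return d_world / np.linalg.norm(d_world)

def closest_point_to_rays(origins, directions):
    """Least-squares point minimizing squared distance to all rays
    origin_i + s * direction_i -- a stand-in for the proximity coordinate
    module's 'target coordinate closest to the eye-image coordinates'."""
    A = np.zeros((3, 3))
    b = np.zeros(3)
    for o, d in zip(origins, directions):
        P = np.eye(3) - np.outer(d, d)      # projector orthogonal to the ray
        A += P
        b += P @ o
    return np.linalg.solve(A, b)
```

With two cameras whose rays both pass through the coordinate origin, `closest_point_to_rays` recovers the origin exactly; with noisy real rays it returns the least-squares compromise point.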
5. The apparatus of claim 4, wherein the detection assembly comprises: a rotating table on which a plurality of cameras are provided; and the detection result comprises the direction vector.
6. The apparatus of claim 4, wherein the execution module comprises:
the direction vector judging module is used for judging whether the direction vector is the same as a preset direction vector or not;
and the same module is used for executing the awakening operation if the direction vector is the same as the preset direction vector.
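Claim 6 gates the wake-up operation on the measured gaze direction vector being "the same" as a preset one. An exact floating-point comparison would almost never fire in practice, so a sketch of the judging module might test agreement within an angular tolerance (the 5° threshold and function names below are illustrative assumptions, not from the patent):

```python
import numpy as np

def gaze_matches(v, v_ref, max_angle_deg=5.0):
    """Return True if measured gaze direction v lies within max_angle_deg
    of the preset direction v_ref (both 3-vectors, any magnitude)."""
    v = v / np.linalg.norm(v)
    v_ref = v_ref / np.linalg.norm(v_ref)
    cos_angle = np.clip(np.dot(v, v_ref), -1.0, 1.0)
    return bool(np.degrees(np.arccos(cos_angle)) <= max_angle_deg)

def maybe_wake(v, v_ref):
    """Execute the wake-up operation only when the gaze check passes."""
    return "wake" if gaze_matches(v, v_ref) else "stay-asleep"
```

A gaze vector 2° off the preset direction would still wake the system under this tolerance, while a perpendicular one would not.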
7. A voice wake-up apparatus, comprising:
one or more processors; and
one or more machine-readable media having instructions stored thereon that, when executed by the one or more processors, cause the apparatus to perform the method of any of claims 1-3.
8. A computer-readable storage medium storing a computer program for causing a processor to perform the method according to any one of claims 1 to 3.
CN201911414792.6A 2019-12-31 2019-12-31 System awakening method and device Active CN111243583B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911414792.6A CN111243583B (en) 2019-12-31 2019-12-31 System awakening method and device


Publications (2)

Publication Number Publication Date
CN111243583A CN111243583A (en) 2020-06-05
CN111243583B true CN111243583B (en) 2023-03-10

Family

ID=70876046

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911414792.6A Active CN111243583B (en) 2019-12-31 2019-12-31 System awakening method and device

Country Status (1)

Country Link
CN (1) CN111243583B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104820556A (en) * 2015-05-06 2015-08-05 广州视源电子科技股份有限公司 Method and device for waking up voice assistant
CN105501121A (en) * 2016-01-08 2016-04-20 北京乐驾科技有限公司 Intelligent awakening method and system
CN105700363A (en) * 2016-01-19 2016-06-22 深圳创维-Rgb电子有限公司 Method and system for waking up smart home equipment voice control device
CN107346661A (en) * 2017-06-01 2017-11-14 李昕 A kind of distant range iris tracking and acquisition method based on microphone array
CN108766438A (en) * 2018-06-21 2018-11-06 Oppo广东移动通信有限公司 Man-machine interaction method, device, storage medium and intelligent terminal
CN109542213A (en) * 2017-09-21 2019-03-29 托比股份公司 The system and method interacted using information is watched attentively with calculating equipment
CN110136714A (en) * 2019-05-14 2019-08-16 北京探境科技有限公司 Natural interaction sound control method and device


Also Published As

Publication number Publication date
CN111243583A (en) 2020-06-05

Similar Documents

Publication Publication Date Title
US11289100B2 (en) Selective enrollment with an automated assistant
EP3665676B1 (en) Speaking classification using audio-visual data
US8700392B1 (en) Speech-inclusive device interfaces
CN111492426B (en) Gaze-initiated voice control
US11704940B2 (en) Enrollment with an automated assistant
WO2021135685A1 (en) Identity authentication method and device
CN110737335B (en) Interaction method and device of robot, electronic equipment and storage medium
US9870521B1 (en) Systems and methods for identifying objects
WO2020125038A1 (en) Voice control method and device
US20230068798A1 (en) Active speaker detection using image data
KR20200085696A (en) Method of processing video for determining emotion of a person
Li et al. Sensing and navigation of wearable assistance cognitive systems for the visually impaired
EP4160363A1 (en) Expanding physical motion gesture lexicon for an automated assistant
CN111243583B (en) System awakening method and device
CN111370004A (en) Man-machine interaction method, voice processing method and equipment
US11437024B2 (en) Information processing method and apparatus therefor
CN112700568A (en) Identity authentication method, equipment and computer readable storage medium
US20200090663A1 (en) Information processing apparatus and electronic device
CN116301381A (en) Interaction method, related equipment and system
CN113093907B (en) Man-machine interaction method, system, equipment and storage medium
Roy et al. ORCap: Object Recognition Cap (A navigation system for the blind)
KR20230023520A (en) Long-distance motion gesture recognition apparatus
CN115691498A (en) Voice interaction method, electronic device and medium
Lee et al. Deep Learning Based Mobile Assistive Device for Visually Impaired People
CN116013289A (en) Acoustic control management system based on artificial intelligence

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant