CN109767763B

CN109767763B - Method and device for determining user-defined awakening words

Info

Publication number: CN109767763B
Application number: CN201811593641.7A
Authority: CN
Inventors: 胡明国; 徐俊峰
Original assignee: AI Speech Ltd
Current assignee: Sipic Technology Co Ltd
Priority date: 2018-12-25
Filing date: 2018-12-25
Publication date: 2021-01-26
Anticipated expiration: 2038-12-25
Also published as: CN109767763A

Abstract

The invention discloses a method for determining a custom wake-up word, comprising the following steps: receiving a first user instruction; determining custom content according to the first user instruction; performing wake-up word evaluation on the custom content; and determining the custom wake-up word according to the evaluation result. The invention also discloses a device for determining a custom wake-up word. The method and the device allow wake-up words to be customized, yield custom wake-up words with higher accuracy, a higher wake-up rate and a lower false wake-up rate, and make the process of generating the wake-up word faster and more efficient. Because the quality of user-defined wake-up words can be well guaranteed, the playability of the voice product is improved and the user experience of the whole product is greatly improved.

Description

Method and device for determining user-defined awakening words

Technical Field

The invention relates to the technical field of voice interaction, in particular to a method for determining a user-defined awakening word and a device for determining the user-defined awakening word.

Background

With the continued development of voice interaction technology, two main types of voice wake-up model are currently in use: one based on a language model, and one without a language model. A wake-up model based on a language model comprises an acoustic model and a language model, and both models must be used for verification. Although its verification accuracy is high and the resulting wake-up words have high utilization and usability, this kind of model requires a large amount of computation, so processing is slow and inefficient. A wake-up model without a language model performs only acoustic verification, so the amount of computation is small and processing is fast; however, when a user defines a custom wake-up word with such a model, the operation is cumbersome, and accuracy and usability are reduced.

Disclosure of Invention

To solve these problems, enable custom wake-up words to be set, and ensure that custom wake-up words are more usable and more accurate, the inventors designed a method for evaluating voice wake-up words on the basis of existing standards in the prior art. The method evaluates and scores a wake-up word whose content has been determined, performs sensitive word detection, repeated/overlapping word detection, spoken word detection and insufficient-pronunciation word detection on it, and establishes a wake-up threshold for it, so that the wake-up rate for a user using the custom wake-up word is increased and the false wake-up rate is reduced. Because the quality of user-defined wake-up words can be well guaranteed, the playability of the voice product is improved and the user experience of the whole product is greatly improved.

In a first aspect, an embodiment of the present invention provides a method for determining a custom wake-up word, including:

receiving a first user instruction; determining custom content according to a first user instruction; performing awakening word evaluation on the user-defined content; and determining the self-defined awakening words according to the evaluation result.

In a second aspect, an embodiment of the present invention provides an apparatus for determining a custom wake-up word, including: the first receiving module is used for receiving a first user instruction; the user-defined content acquisition module is used for determining user-defined content according to the first user instruction; the awakening word evaluation module is used for carrying out awakening word evaluation on the user-defined content; and the awakening threshold generation module is used for generating an awakening threshold for the self-defined content determined as the self-defined awakening word according to the evaluation result.

In a third aspect, an embodiment of the present invention provides an electronic device, including: at least one processor, and a memory communicatively coupled to the at least one processor, wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the steps of the method of determining a custom wake word.

In a fourth aspect, an embodiment of the present invention provides a storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the steps of the method for determining a custom wake-up word.

The beneficial effects of the embodiments of the invention are as follows: the method and the device for determining a custom wake-up word yield custom wake-up words with a low false wake-up rate, the process of obtaining the wake-up word is faster and more efficient, the generated wake-up word can be evaluated against the wake-up threshold, and the utilization of the wake-up word is greatly improved.

Drawings

In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings in the following description show some embodiments of the present invention, and those skilled in the art can obtain other drawings based on these drawings without creative effort.

FIG. 1 is a flowchart of a method for determining a custom wake-up word according to an embodiment of the present invention;

FIG. 2 is a flowchart of a method for determining a custom wake-up word according to another embodiment of the present invention;

FIG. 3 is a block diagram of an apparatus for determining a custom wake-up word according to an embodiment of the present invention;

FIG. 4 is a schematic structural diagram of an electronic device according to an embodiment of the invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict.

The invention may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The invention may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.

As used in this disclosure, "module," "device," "system," and the like are intended to refer to a computer-related entity, either hardware, a combination of hardware and software, or software in execution. In particular, for example, an element may be, but is not limited to being, a process running on a processor, an object, an executable, a thread of execution, a program, and/or a computer. Also, an application or script running on a server, or a server, may be an element. One or more elements may be in a process and/or thread of execution and an element may be localized on one computer and/or distributed between two or more computers and may be operated by various computer-readable media. The elements may also communicate by way of local and/or remote processes based on a signal having one or more data packets, e.g., from a data packet interacting with another element in a local system, distributed system, and/or across a network in the internet with other systems by way of the signal.

Finally, it should also be noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.

The method and the device for determining a custom wake-up word in the embodiments of the invention are applied to a terminal device. The terminal device is configured with a display screen, or can project a display interface, for the user to perform interactive operations. Examples include any smart hardware such as a smart television, a smartphone, a tablet computer, a PC, a smart home device, or a projector; the invention is not limited thereto.

Fig. 1 schematically shows a flow chart of a method for determining a custom wake-up word according to the present invention. As shown in fig. 1, the present embodiment includes the following steps:

Step S101: receive a first user instruction. Specifically, the user inputs the wake-up word to be set, and a first user instruction is generated from that content; that is, the first user instruction includes the content of the wake-up word set by the user, and that content consists of Chinese characters.

Step S102: determine the custom content according to the first user instruction.

As for the implementation method of this step, this embodiment exemplarily implements it as:

A Chinese character pinyin lexicon is configured in advance. The lexicon incorporates authoritative Chinese character dictionaries such as the Xinhua Dictionary, so that each Chinese character has corresponding pinyin. Many Chinese characters are polyphonic, and some pronunciations of even well-known characters are rarely used. In a preferred implementation, the pinyin lexicon can therefore be optimized: rarely used pronunciations of polyphonic characters are screened out and the commonly used pronunciations are retained. For example, the character meaning "wild" has the pronunciations "ye" and "ya"; since "ya" is not commonly used, it is screened out during optimization, which improves the efficiency of subsequent processing. The wake-up word input by the user may be a single word, a phrase or a sentence; for example, the user may wish to use the phrase "I am" as the wake-up word.

In a specific application, when a first user instruction is received, that is, when a custom wake-up word manually entered and submitted by the user is received, the custom Chinese characters are first obtained, pinyin conversion is performed on each of them, and the optional pronunciation sequences are determined. Concretely, the custom Chinese characters are first split into single characters; for example, a three-character phrase is split into its three individual characters. The pronunciation of each single character is then looked up in the Chinese character pinyin lexicon or the preferred Chinese character pinyin lexicon, in which each Chinese character is stored together with its pronunciations. A single character may have more than one pronunciation, and the optional pronunciation sequences are generated from the pronunciations of the single characters. Preferably, to avoid generating too many optional pronunciation sequences, semantic analysis can be performed on the custom Chinese characters and the optional pronunciation sequences screened according to their semantics, so that the output better matches actual usage. When only one optional pronunciation sequence is determined, that is, when a unique pronunciation sequence is obtained directly, the custom content can be determined directly as that unique pronunciation sequence. When more than one optional pronunciation sequence is determined, a unique pronunciation sequence can be determined by semantic analysis, or by the method shown in fig. 2.

The Chinese character pinyin lexicon is stored in an external database rather than in the processing area, so it does not occupy local storage, and processing is faster and more efficient.
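The splitting and lookup described in step S102 can be illustrated with a minimal Python sketch. It is not the patented implementation: the lexicon contents, the placeholder keys "A", "B" and "C" standing in for Chinese characters, the readings other than "ye" and "ya", and the function names are all assumptions made for illustration.

```python
from itertools import product

# Hypothetical full lexicon: keys stand in for Chinese characters,
# values list every pinyin reading of that character.
FULL_LEXICON = {
    "A": ["ye", "ya"],    # polyphonic; "ya" is the rarely used reading
    "B": ["lai"],         # placeholder reading
    "C": ["le", "liao"],  # placeholder readings
}

# Hypothetical preferred lexicon: rarely used readings are screened out.
PREFERRED_LEXICON = {
    "A": ["ye"],
    "B": ["lai"],
    "C": ["le", "liao"],
}


def candidate_sequences(text, lexicon):
    """Split the text into single characters and return every
    candidate pronunciation sequence built from their readings."""
    readings = []
    for ch in text:
        if ch not in lexicon:
            raise KeyError(f"no pinyin entry for character {ch!r}")
        readings.append(lexicon[ch])
    return [list(seq) for seq in product(*readings)]


def resolve(text, lexicon):
    """Return the unique pronunciation sequence if exactly one exists,
    otherwise None to signal that disambiguation is still needed."""
    sequences = candidate_sequences(text, lexicon)
    return sequences[0] if len(sequences) == 1 else None
```

In this sketch, using the preferred lexicon halves the number of candidates for the three-character input, which is the efficiency gain the text attributes to screening out rare readings. A `None` result corresponds to the case that is handed to semantic analysis or to the user selection of fig. 2.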

Step S103: perform wake-up word evaluation on the custom content. In a specific implementation, after accurate custom content is obtained, the wake-up word needs to be evaluated in order to reduce the false wake-up rate. The evaluation includes sensitive word detection (for example, words involving national leaders or political matters), repeated/overlapping word detection, spoken word detection, and insufficient-pronunciation word detection. The methods for evaluating the content of the wake-up word can be implemented with reference to the prior art.
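The text leaves the concrete checks to the prior art, so the following Python sketch only illustrates how the four evaluation items might be combined into one result. Every rule here is an assumption: the placeholder term lists, the reading of "spoken word" as an everyday phrase, and the reading of "insufficient pronunciation" as a minimum syllable count of three.

```python
# Hypothetical rule data; real deployments would load these from a database.
SENSITIVE_TERMS = {"LEADER"}  # placeholder for sensitive words
COLLOQUIAL_TERMS = {"HELLO"}  # placeholder for everyday spoken phrases
MIN_SYLLABLES = 3             # assumed lower bound on syllable count


def evaluate_wake_word(text, pinyin_seq):
    """Return a list of problem labels; an empty list means the
    custom content is considered suitable as a wake-up word."""
    problems = []
    # Sensitive word detection: any listed term occurring in the text.
    if any(term in text for term in SENSITIVE_TERMS):
        problems.append("sensitive")
    # Repeated/overlapping word detection: two identical adjacent syllables.
    if any(a == b for a, b in zip(pinyin_seq, pinyin_seq[1:])):
        problems.append("repeated")
    # Spoken word detection: the whole text is a common everyday phrase.
    if text in COLLOQUIAL_TERMS:
        problems.append("spoken")
    # Insufficient-pronunciation detection: too few syllables.
    if len(pinyin_seq) < MIN_SYLLABLES:
        problems.append("too_short")
    return problems
```

Returning labels rather than a single boolean lets the caller tell the user which item failed, which fits the reminder to modify or re-enter the content described in the next step.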

Step S104: determine the custom wake-up word according to the evaluation result. In a specific implementation, an evaluation result is obtained after the custom content is checked against the evaluation items. If the custom content contains words that fail the evaluation, the user can be reminded to modify it or to input suitable custom content again. When the custom content passes the evaluation, that is, when it is suitable as a wake-up word, an evaluation result is generated. To improve the utilization of the wake-up word and reduce the false wake-up rate, a wake-up threshold needs to be determined for the custom content, and the custom content together with the wake-up threshold is determined as the final custom wake-up word. At least two Chinese character threshold lexicons are configured in the database. Each lexicon assigns a threshold to each character based on machine experience; for example, based on statistics over previously used wake-up words, one character with a high occurrence rate is scored 0.6, another with a high occurrence rate is scored 0.7, and a character with a lower occurrence rate is scored 0.3. The scores of the characters are added to obtain a joint score, which is used as the wake-up threshold of the wake-up word.
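The joint score in this example is a plain sum of per-character scores. A minimal Python sketch, assuming a hypothetical threshold lexicon keyed by the placeholder characters "A", "B" and "C" and using the three scores from the text:

```python
# Hypothetical threshold lexicon using the scores given in the example.
THRESHOLD_LEXICON = {"A": 0.6, "B": 0.7, "C": 0.3}


def wake_threshold(text, lexicon):
    """Add the per-character scores to obtain the joint score."""
    return sum(lexicon[ch] for ch in text)


def finalize_wake_word(text, pinyin_seq, lexicon):
    """Bundle the custom content with its wake-up threshold."""
    return {
        "text": text,
        "pinyin": list(pinyin_seq),
        "threshold": wake_threshold(text, lexicon),
    }
```

For the three characters above the joint score is 0.6 + 0.7 + 0.3 = 1.6, up to floating-point rounding. How that value is consumed by the wake-up engine is not specified in the text and is not modeled here.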

With the method for determining a custom wake-up word of this embodiment, whether the user's custom content is suitable as a wake-up word can be determined efficiently, and a threshold is configured for it, which greatly improves the utilization of the user-defined wake-up word and reduces the false wake-up rate.

Fig. 2 is a flow chart schematically illustrating a method for determining a custom wake-up word according to a further embodiment of the present invention. As shown in fig. 2, the present embodiment includes the following steps:

Step S201: receive a first user instruction. For a specific implementation, refer to step S101.

Step S202: obtain the input custom Chinese characters according to the first user instruction, perform pinyin conversion on them, determine the optional pronunciation sequences, and present these to the user. The specific implementation is substantially the same as that of step S102, except that the words and sentences input by the user may contain several polyphonic characters. For example, in one four-character phrase, the character meaning "who" has two pronunciations, "shui" and "shei"; the character glossed as "dry" has two pronunciations of "gan", in the first and fourth tones; and the final character has two pronunciations of "ma", in the neutral tone and the second tone. These three polyphonic characters thus contribute six candidate pronunciations, and the phrase has several possible pronunciation combinations. When there are several pronunciation combinations, the custom content cannot be determined directly; the combinations need to be sorted according to the preferred lexicon mentioned in the above embodiment, that is, with the most common pronunciation combination placed first, and then presented to the user as a list.

Step S203: receive a second user instruction issued by the user according to the presented content, and determine the specified pronunciation sequence of the input custom Chinese characters according to the second user instruction. In a specific implementation, when the possible pronunciations are presented to the user as a list, the user selects from the list the pronunciation that matches the intended wake-up word; once the pronunciation is confirmed, a second user instruction is issued, and the selected pronunciation is used for the final wake-up word.
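Steps S202 and S203 can be sketched in Python as enumerating the pronunciation combinations, ordering them with the most common first, and applying the user's selection. The sketch makes several assumptions that are not in the text: readings are listed most common first in the order shown, the second character has the single reading "zai", tones are written as trailing digits with 0 for the neutral tone, and "commonness" of a combination is approximated by summing the positions of its readings.

```python
from itertools import product

# Readings per character position, assumed to be ordered most common first.
READINGS = [
    ["shui", "shei"],
    ["zai"],           # assumed single reading for the second character
    ["gan1", "gan4"],  # first and fourth tone
    ["ma0", "ma2"],    # neutral tone (written 0) and second tone
]


def ranked_sequences(readings):
    """Enumerate every pronunciation combination, most common first.
    A lower sum of reading positions is treated as more common."""
    scored = []
    for combo in product(*(list(enumerate(r)) for r in readings)):
        rank = sum(index for index, _ in combo)
        scored.append((rank, [pinyin for _, pinyin in combo]))
    scored.sort(key=lambda item: item[0])  # stable sort keeps ties in order
    return [seq for _, seq in scored]


def apply_second_instruction(sequences, selected_index):
    """Return the pronunciation sequence the user picked from the list."""
    if not 0 <= selected_index < len(sequences):
        raise IndexError("selection is outside the presented list")
    return sequences[selected_index]
```

A real system would more likely rank combinations by corpus frequency of the whole phrase; the position-sum heuristic is used here only to keep the sketch self-contained.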

For steps S204 and S205, refer to steps S103 and S104; they are not described again here.

This embodiment solves the problem that recognition is inaccurate because the Chinese characters involve polyphonic characters.

The method of setting a custom wake-up word in this embodiment is applicable both to models based on a language model and to models without a language model. When applied to a model based on a language model, it further ensures the usability of the custom wake-up word, allows custom wake-up words with a higher wake-up rate to be set, and improves the user experience. When applied to a model without a language model, the correct pronunciation of the input custom Chinese characters is determined by checking the pronunciation and word order of the received custom wake-up word, so that the custom wake-up word is more usable and more accurate under such a model. In addition, because the embodiment stores the dictionaries used for checking pronunciation and word order in a database and optimizes the lexicon, the occupancy of internal resources is greatly reduced and processing is faster.

In a preferred embodiment, the database is configured with two Chinese character threshold lexicons: a high-threshold lexicon and a low-threshold lexicon. The threshold of each Chinese character in the high-threshold lexicon is set higher, for cases in which a lower threshold would make the device too easy to wake in practical use; the threshold of each Chinese character in the low-threshold lexicon is set lower, for cases in which a higher threshold would make the device hard to wake in practical use. In this way the wake-up rate can be ensured.
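A minimal Python sketch of choosing between the two threshold lexicons follows. The text does not say how the choice is made, so the boolean flag `too_easy_to_wake`, the placeholder characters and all numeric values below are assumptions for illustration only.

```python
# Hypothetical per-character thresholds; the numbers are illustrative only.
HIGH_THRESHOLDS = {"A": 0.8, "B": 0.9, "C": 0.5}
LOW_THRESHOLDS = {"A": 0.4, "B": 0.5, "C": 0.1}


def wake_threshold(text, too_easy_to_wake):
    """Sum per-character thresholds from the lexicon that suits the case.

    too_easy_to_wake=True selects the high-threshold lexicon (to curb
    false wake-ups); False selects the low-threshold lexicon (to keep
    the wake-up rate up when waking is otherwise difficult).
    """
    bank = HIGH_THRESHOLDS if too_easy_to_wake else LOW_THRESHOLDS
    return sum(bank[ch] for ch in text)
```

In practice the flag might be derived from field statistics for a given wake-up word, but that derivation is outside what the text describes.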

Fig. 3 schematically shows a block diagram of an apparatus for determining a custom wake-up word according to an embodiment of the present invention. As shown in fig. 3, the device 1 for determining a custom wake-up word comprises a first receiving module 2, a custom content obtaining module 3, a wake-up word evaluation module 4 and a wake-up threshold generation module 5.

The first receiving module 2 is configured to receive a first user instruction. It may receive user input through a user interface: when the user manually enters custom content and clicks to confirm, the user is considered to have issued the first user instruction, which the first receiving module 2 receives.

The custom content obtaining module 3 is configured to determine the custom content according to the first user instruction, and includes a pronunciation sequence obtaining unit 301 and a custom content determining unit 302. The pronunciation sequence obtaining unit 301 is configured to determine the optional pronunciation sequences of the input custom Chinese characters according to the first user instruction and present them to the user. For its implementation, refer to the embodiment of fig. 1: when the custom Chinese characters input by the user have only one pronunciation, they can be presented directly without confirmation by the user. The custom content determining unit 302 is configured to receive a second user instruction issued by the user according to the presented content, and to determine the specified pronunciation sequence of the input custom Chinese characters according to the second user instruction. For its implementation, refer to the embodiment of fig. 2: when the custom Chinese characters input by the user have several pronunciation combinations, the combinations are presented to the user as a list, ordered by priority according to the lexicon built into the device, and the user makes a second selection to confirm, which generates the second user instruction.

In addition, the device further comprises built-in lexicons, specifically a first Chinese character pinyin lexicon 6 and a preferred Chinese character pinyin lexicon 7. The first Chinese character pinyin lexicon 6 stores Chinese characters and the pinyin of all pronunciations of each character; the preferred Chinese character pinyin lexicon 7 stores Chinese characters and the pinyin of the common pronunciations of each character. The pronunciation sequence obtaining unit 301 determines the optional pronunciation sequences of the input custom Chinese characters according to the first Chinese character pinyin lexicon 6 or the preferred Chinese character pinyin lexicon 7. Using the preferred Chinese character pinyin lexicon 7 greatly reduces the processing workload, since uncommon pronunciations are screened out quickly. In addition, the two lexicons do not occupy the resource space of the device: they are stored in an external database and can be updated in real time, which increases the processing speed of the device and improves efficiency.

The wake-up word evaluation module 4 is configured to perform wake-up word evaluation on the custom content. It evaluates the custom Chinese characters according to the specified pronunciation sequence; the evaluation includes sensitive word detection (for example, words involving national leaders or political matters), repeated/overlapping word detection, spoken word detection, and insufficient-pronunciation word detection. For the evaluation method, refer to the method embodiments above; it is not described again here.

The wake-up threshold generation module 5 is configured to generate, according to the evaluation result, a wake-up threshold for custom content determined to be a custom wake-up word, and to output or store the custom content together with the wake-up threshold as the custom wake-up word. As described in the method embodiments, the wake-up threshold can be determined by configuring Chinese character threshold lexicons. In this way the utilization of the wake-up word is improved and the false wake-up rate is reduced.
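How the four modules of fig. 3 might cooperate can be illustrated with a short Python sketch. The class names, the callable-based wiring and the `handle` method are hypothetical; the text describes the modules functionally and does not prescribe any interface.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class WakeWord:
    text: str
    pinyin: List[str]
    threshold: float


@dataclass
class WakeWordDevice:
    # Unit 301: text -> candidate pronunciation sequences.
    get_pronunciations: Callable[[str], List[List[str]]]
    # Unit 302: stands in for the second user instruction (list selection).
    choose: Callable[[List[List[str]]], List[str]]
    # Module 4: returns a list of problems; empty means suitable.
    evaluate: Callable[[str, List[str]], List[str]]
    # Module 5: text -> wake-up threshold.
    threshold_for: Callable[[str], float]

    def handle(self, text: str) -> Optional[WakeWord]:
        """Module 2 entry point: process one first user instruction."""
        candidates = self.get_pronunciations(text)
        # A unique sequence needs no confirmation; otherwise ask the user.
        if len(candidates) == 1:
            pinyin = candidates[0]
        else:
            pinyin = self.choose(candidates)
        if self.evaluate(text, pinyin):
            return None  # unsuitable: the caller should prompt for new input
        return WakeWord(text, pinyin, self.threshold_for(text))
```

Passing the modules in as callables keeps the sketch independent of where the lexicons live, which matches the description of the lexicons as residing in an external database.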

The device not only enables custom wake-up words to be set, but also provides higher usability and a higher wake-up rate. Because the device can well guarantee the quality of user-defined wake-up words, products that use it to define wake-up words benefit: the playability of the voice product is improved, and the user experience of the whole product is greatly improved.

In some embodiments, the present invention provides a non-transitory computer readable storage medium, in which one or more programs including executable instructions are stored, and the executable instructions can be read and executed by an electronic device (including but not limited to a computer, a server, or a network device, etc.) to perform any one of the above methods for determining a custom wake word of the present invention.

In some embodiments, the present invention further provides a computer program product comprising a computer program stored on a non-volatile computer-readable storage medium, the computer program comprising program instructions that, when executed by a computer, cause the computer to perform any of the above methods for determining a custom wake-up word.

In some embodiments, an embodiment of the present invention further provides an electronic device, which includes: at least one processor, and a memory communicatively coupled to the at least one processor, wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform a method of determining a custom wake word.

In some embodiments, an embodiment of the present invention further provides a storage medium having a computer program stored thereon, where the computer program, when executed by a processor, implements the method for determining a custom wake-up word.

The apparatus for determining a custom wake-up word according to the embodiment of the present invention may be used to execute the method for determining a custom wake-up word according to the embodiment of the present invention, and accordingly achieves the technical effects of that method, which are not described again here. In the embodiment of the present invention, the relevant functional modules may be implemented by a hardware processor.

Fig. 4 is a schematic hardware structure diagram of an electronic device that executes a method for determining a custom wake-up word according to another embodiment of the present application, and as shown in fig. 4, the electronic device includes:

one or more processors 410 and a memory 420, with one processor 410 being an example in fig. 4.

The apparatus for performing the method of determining a custom wake-up word may further include: an input device 430 and an output device 440.

The processor 410, the memory 420, the input device 430, and the output device 440 may be connected by a bus or other means, such as the bus connection in fig. 4.

The memory 420 is a non-volatile computer-readable storage medium, and can be used to store non-volatile software programs, non-volatile computer-executable programs, and modules, such as program instructions/modules corresponding to the method for determining the customized wake word in the embodiment of the present application. The processor 410 executes various functional applications and data processing of the server by executing nonvolatile software programs, instructions and modules stored in the memory 420, that is, the method for determining the customized wake word of the above-described method embodiment is implemented.

The memory 420 may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created according to use of the voice control apparatus, and the like. Further, the memory 420 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some embodiments, memory 420 may optionally include memory located remotely from processor 410, which may be connected to the voice control device via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.

The input device 430 may receive entered numeric or character information and generate signals related to user settings and functional control of the custom wake-up word determining device. The output device 440 may include a display device such as a display screen.

The one or more modules are stored in the memory 420 and when executed by the one or more processors 410 perform the method of determining a custom wake-up word in any of the method embodiments described above.

The product can execute the method provided by the embodiment of the application, and has the corresponding functional modules and beneficial effects of the execution method. For technical details that are not described in detail in this embodiment, reference may be made to the methods provided in the embodiments of the present application.

The electronic device of the embodiments of the present application exists in various forms, including but not limited to:

(1) Mobile communication devices. These devices are characterized by mobile communication capabilities and primarily aim to provide voice and data communication. Such terminals include smartphones (e.g., iPhone), multimedia phones, feature phones, and low-end phones.

(2) Ultra-mobile personal computer devices. These devices belong to the category of personal computers, have computing and processing functions, and generally also support mobile internet access. Such terminals include PDA, MID, and UMPC devices, such as the iPad.

(3) Portable entertainment devices. These devices can display and play multimedia content. They include audio and video players (e.g., iPod), handheld game consoles, e-book readers, smart toys, and portable car navigation devices.

(4) Servers. A server has an architecture similar to that of a general-purpose computer, but has higher requirements for processing capability, stability, reliability, security, scalability, manageability and the like, because it must provide highly reliable services.

(5) Other electronic devices with data interaction functions.

The apparatus embodiments described above are merely illustrative. The units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units; they may be located in one place or distributed across a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.

Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a general hardware platform, and can of course also be implemented by hardware. Based on this understanding, the above technical solutions, in essence or in the part that contributes over the related art, may be embodied in the form of a software product. The software product may be stored in a computer-readable storage medium, such as ROM/RAM, a magnetic disk, or an optical disk, and includes instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods described in the embodiments or in some parts of the embodiments.

Finally, it should be noted that: the above embodiments are only used to illustrate the technical solutions of the present application, and not to limit the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions in the embodiments of the present application.

Claims

1. A method for determining a custom wake-up word, comprising the following steps:

receiving a first user instruction;

determining custom content according to the first user instruction;

performing wake-up word evaluation on the custom content;

determining a custom wake-up word according to the evaluation result, which comprises: when the evaluation result is that the custom content is suitable for use as a wake-up word, determining a wake-up threshold for the custom content, and determining the custom content together with the wake-up threshold as the custom wake-up word.
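
By way of a non-limiting illustration, the flow of claim 1 can be sketched as follows. The names `CustomWakeWord`, `determine_custom_wake_word`, `evaluate`, and `threshold_for` are hypothetical and do not appear in the original text; the sketch also assumes that the first user instruction carries the custom content as plain text.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class CustomWakeWord:
    """Hypothetical container: custom content paired with its wake-up threshold."""
    content: str
    threshold: float


def determine_custom_wake_word(
    instruction: str,
    evaluate: Callable[[str], bool],
    threshold_for: Callable[[str], float],
) -> Optional[CustomWakeWord]:
    """Illustrative flow only; the evaluation and threshold logic are supplied by the caller."""
    # Receive the instruction and derive the custom content.
    # Assumption: the instruction is the content itself, possibly padded with whitespace.
    content = instruction.strip()
    # Evaluate whether the content is suitable for use as a wake-up word.
    if not content or not evaluate(content):
        return None
    # Only content that passed evaluation is given a wake-up threshold.
    return CustomWakeWord(content, threshold_for(content))
```

In this sketch, content that fails evaluation yields `None`, so that a threshold is never computed for an unsuitable wake-up word.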

2. The method of claim 1, wherein the determining custom content according to the first user instruction comprises:

acquiring input custom Chinese characters according to the first user instruction;

performing pinyin conversion on the custom Chinese characters, determining a selectable pronunciation sequence, and presenting the selectable pronunciation sequence to a user;

receiving a second user instruction sent by the user according to the presented content;

and determining the specified pronunciation sequence of the input custom Chinese characters according to the second user instruction.
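
By way of a non-limiting illustration, the pinyin conversion and user selection of claim 2 can be sketched as follows. The dictionary `TOY_LEXICON` and the functions `candidate_pronunciations` and `choose_pronunciation` are hypothetical, and the sketch assumes that the second user instruction reduces to the index of the presented candidate.

```python
from itertools import product

# Hypothetical toy lexicon: each character maps to its possible pinyin readings
# (tone written as a trailing digit).
TOY_LEXICON = {
    "长": ["chang2", "zhang3"],
    "乐": ["le4", "yue4"],
    "你": ["ni3"],
    "好": ["hao3", "hao4"],
}


def candidate_pronunciations(text, lexicon=TOY_LEXICON):
    """Return every combination of per-character readings, to be presented to the user."""
    readings = []
    for ch in text:
        if ch not in lexicon:
            raise KeyError(f"no pinyin entry for {ch!r}")
        readings.append(lexicon[ch])
    return [list(seq) for seq in product(*readings)]


def choose_pronunciation(candidates, selected_index):
    """Apply the user's selection (assumed to be an index into the presented list)."""
    if not 0 <= selected_index < len(candidates):
        raise IndexError("selection outside the presented candidates")
    return candidates[selected_index]
```

For the two-character input "长乐", each character has two readings in the toy lexicon, so four candidate sequences would be presented and the user's selection fixes one of them.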

3. The method of claim 2, further comprising:

configuring a Chinese character pinyin word bank;

optimizing the configured Chinese character pinyin word bank to generate a preferred Chinese character pinyin word bank;

wherein performing pinyin conversion on the custom Chinese characters and determining the selectable pronunciation sequence comprises:

splitting the custom Chinese characters into single Chinese characters;

determining the pronunciation of each single Chinese character according to the preferred Chinese character pinyin word bank;

and generating the selectable pronunciation sequence according to the semantics of the custom Chinese characters and the pronunciation of each single Chinese character.
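
By way of a non-limiting illustration, the word bank optimization and sequence generation of claim 3 can be sketched as follows. All names and all numeric values are hypothetical and invented for illustration. In particular, the sketch assumes that "optimizing" means discarding rare readings, and that "semantics" can be approximated by a table of known words whose reading is fixed; the original text prescribes neither.

```python
from itertools import product

# Hypothetical full word bank: character -> {reading: illustrative share of usage}.
FULL_LEXICON = {
    "长": {"chang2": 0.6, "zhang3": 0.4},
    "大": {"da4": 0.97, "dai4": 0.03},
    "城": {"cheng2": 1.0},
}

# Hypothetical stand-in for semantics: known words with a fixed reading.
KNOWN_WORDS = {"长城": ["chang2", "cheng2"]}


def build_preferred_lexicon(full, min_share=0.1):
    """Keep only readings whose share reaches min_share (assumed optimization rule)."""
    return {
        ch: [reading for reading, share in readings.items() if share >= min_share]
        for ch, readings in full.items()
    }


def candidate_sequences(text, preferred, known_words=KNOWN_WORDS):
    """Split the text into characters and combine their preferred readings."""
    # A known word pins the reading, which narrows the candidates to one.
    if text in known_words:
        return [list(known_words[text])]
    per_char = [preferred[ch] for ch in text]
    return [list(seq) for seq in product(*per_char)]
```

Under these assumptions, the rare reading of "大" is dropped from the preferred word bank, which reduces the number of candidates the user has to choose between.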

4. The method of any one of claims 1 to 3, wherein the wake-up word evaluation of the custom content includes sensitive word detection, reduplicated word detection, colloquial word detection, and low-vocalization word detection.
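
By way of a non-limiting illustration, the four checks named in claim 4 can be sketched as simple predicates. The word lists, the check labels, and the detection rules below are all hypothetical placeholders: the original text does not state how each detection is performed, and the fourth check in particular is modelled here, purely as an assumption, as the presence of neutral-tone syllables.

```python
# Hypothetical placeholder data; a real system would use its own resources.
SENSITIVE = {"坏词"}                     # blocklist of sensitive words
COLLOQUIAL = {"你好", "谢谢"}            # everyday phrases likely to occur in normal speech
WEAK_SYLLABLES = {"de5", "le5", "ma5"}   # neutral-tone syllables assumed to be weakly voiced


def evaluate_wake_word(text, pinyin):
    """Return the labels of the checks that the candidate fails (empty list means suitable)."""
    failures = []
    # Sensitive word detection: any blocklisted word contained in the text.
    if any(word in text for word in SENSITIVE):
        failures.append("sensitive")
    # Reduplicated word detection: two identical adjacent characters.
    if any(a == b for a, b in zip(text, text[1:])):
        failures.append("reduplicated")
    # Colloquial word detection: the text is itself an everyday phrase.
    if text in COLLOQUIAL:
        failures.append("colloquial")
    # Low-vocalization detection (assumed rule): any weakly voiced syllable.
    if any(syllable in WEAK_SYLLABLES for syllable in pinyin):
        failures.append("weak_voiced")
    return failures
```

Returning the list of failed checks, rather than a single boolean, would allow the reason for rejection to be reported back to the user.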

5. The method of claim 1, further comprising:

configuring at least two Chinese character threshold word banks;

wherein the determining a wake-up threshold for the custom content comprises:

generating the wake-up threshold of the custom content according to the custom content and one of the Chinese character threshold word banks.
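
By way of a non-limiting illustration, the threshold generation of claim 5 can be sketched as follows. The bank names, the per-character values, the fallback value, and the averaging rule are all hypothetical; the original text states only that the threshold is generated from the custom content and one of at least two threshold word banks.

```python
from statistics import mean

# Hypothetical threshold word banks: character -> illustrative per-character threshold.
THRESHOLD_BANKS = {
    "strict": {"你": 0.8, "好": 0.75, "小": 0.7, "驰": 0.75},
    "lenient": {"你": 0.5, "好": 0.5, "小": 0.5, "驰": 0.5},
}


def wake_threshold(content, bank_name, banks=THRESHOLD_BANKS, default=0.6):
    """Average the per-character thresholds of the selected bank (assumed combination rule)."""
    if not content:
        raise ValueError("custom content must not be empty")
    bank = banks[bank_name]
    # Characters missing from the bank fall back to a default value (an assumption).
    return mean(bank.get(ch, default) for ch in content)
```

Selecting a different bank for the same content yields a different threshold, which is one way in which several word banks could serve different sensitivity requirements.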

6. An apparatus for determining a custom wake-up word, comprising:

a first receiving module, configured to receive a first user instruction;

a custom content acquisition module, configured to determine custom content according to the first user instruction;

a wake-up word evaluation module, configured to perform wake-up word evaluation on the custom content; and

a wake-up threshold generation module, configured to determine a custom wake-up word according to the evaluation result, which comprises: when the evaluation result is that the custom content is suitable for use as a wake-up word, determining a wake-up threshold for the custom content, and determining the custom content together with the wake-up threshold as the custom wake-up word to be output or stored.

7. The apparatus of claim 6, wherein the custom content acquisition module comprises:

a pronunciation sequence acquisition unit, configured to determine a selectable pronunciation sequence of the input custom Chinese characters according to the first user instruction and to present the selectable pronunciation sequence to a user; and

a custom content determining unit, configured to receive a second user instruction sent by the user according to the presented content and to determine the specified pronunciation sequence of the input custom Chinese characters according to the second user instruction;

wherein the wake-up word evaluation module performs wake-up word evaluation on the custom Chinese characters according to the specified pronunciation sequence.

8. The apparatus of claim 7, further comprising:

a first Chinese character pinyin word bank, used for storing Chinese characters and the pinyin of all pronunciations matched with each Chinese character; and

a preferred Chinese character pinyin word bank, used for storing Chinese characters and the pinyin of the common pronunciations matched with each Chinese character;

wherein the pronunciation sequence acquisition unit determines the selectable pronunciation sequence of the input custom Chinese characters according to the first Chinese character pinyin word bank or the preferred Chinese character pinyin word bank.

9. An electronic device, comprising: at least one processor, and a memory communicatively coupled to the at least one processor, wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the steps of the method of any one of claims 1-5.

10. A storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, carries out the steps of the method of any one of claims 1 to 5.