CN115841815A - Display device and voice repair reporting method

Info

Publication number: CN115841815A
Application number: CN202211194210.XA
Authority: CN
Inventors: 王娜; 董逸晨
Original assignee: Hisense Visual Technology Co Ltd
Current assignee: Hisense Visual Technology Co Ltd
Priority date: 2022-09-28
Filing date: 2022-09-28
Publication date: 2023-03-24

Abstract

The disclosure relates to a display device and a voice repair reporting method, and relates to the technical field of voice recognition. The display device includes: a user input interface configured to receive user voice; and a controller configured to: perform text conversion on the user voice to obtain a voice text corresponding to the user voice, and detect whether the voice text includes a repair keyword; detect the attribute of a negative word in the case where the voice text does not include the repair keyword and the negative word exists; in the case where the attribute of the negative word is a target attribute, determine that the voice text has a repair intention, and determine whether a first repair instruction corresponding to the user voice exists; and execute the first repair instruction in the case where it is determined that the first repair instruction corresponding to the user voice exists. The embodiments of the disclosure are used to solve the problems that existing voice repair reporting is inflexible and that feedback is not timely.

Description

Display device and voice repair reporting method

Technical Field

The disclosure relates to the technical field of voice recognition, in particular to a display device and a voice repair method.

Background

As voice applications become more widespread, voice control offers a more convenient and faster alternative to other input modes, and voice repair reporting is widely used. In the related art, the display device receives a repair-reporting voice input by a user and calls up a repair work order interface; after the work order is filled in by the user's voice, it is sent to the operation and maintenance platform, which analyzes the fault type according to the work order, determines a corresponding solution, and returns the solution to the display device, thereby meeting the user's repair demand. However, this approach does not operate in a natural language context: the user must input repair voice in a fixed format, for example, "I need to repair … …". In addition, the operation and maintenance platform handles a very large volume of work orders and has difficulty feeding back solutions promptly, so faults of the display device are hard to resolve in time and the user experience is poor.

Disclosure of Invention

In order to solve the technical problem or at least partially solve the technical problem, the present disclosure provides a display device and a voice repair method, which can accurately identify the repair intention of a user in a natural context and timely feed back the repair intention for maintenance, and are simple and convenient to operate, and the use experience of the user is improved.

In order to achieve the above purpose, the technical solutions provided by the embodiments of the present disclosure are as follows:

in a first aspect, the present disclosure provides a display device comprising:

a user input interface configured to: receiving user voice;

a controller configured to: performing text conversion on the user voice to obtain a voice text corresponding to the user voice, and detecting whether the voice text comprises a repair keyword;

detecting the attribute of a negative word under the condition that the voice text does not include the repair keyword and the negative word exists;

under the condition that the attribute of the negative word is a target attribute, determining that the voice text has a repair intention, and determining whether a first repair instruction corresponding to the voice of the user exists;

executing the first repair instruction under the condition that the first repair instruction corresponding to the user voice is determined to exist.

In a second aspect, the present disclosure provides a method for reporting a repair by voice, the method including:

receiving user voice;

performing text conversion on the user voice to obtain a voice text corresponding to the user voice, and detecting whether the voice text comprises a repair keyword;

In a third aspect, the present disclosure provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the voice repair method according to the second aspect.

In a fourth aspect, the present disclosure provides a computer program product comprising a computer program which, when run on a computer, causes the computer to implement the voice repair method according to the second aspect.

The embodiments of the present disclosure provide a display device and a voice repair method. A controller in the display device performs text conversion on the user voice received by a user input interface to obtain a voice text corresponding to the user voice, detects whether the voice text includes a repair keyword, and detects the attribute of a negative word when the voice text does not include the repair keyword and the negative word exists. When the attribute of the negative word is a target attribute, the controller determines that a repair intention exists in the voice text, further determines whether a first repair instruction corresponding to the user voice exists, and executes the first repair instruction when it exists. In this way, in the user's normal natural-language context, the display device can accurately recognize and distinguish a negative intention from a repair intention in the user voice and, when a repair intention exists, determine and execute the corresponding first repair instruction. This meets the user's voice repair needs, resolves faults of the display device in a timely manner, is simple and convenient to operate, and improves the user experience.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure.

In order to more clearly illustrate the embodiments or technical solutions in the prior art of the present disclosure, the drawings used in the description of the embodiments or prior art will be briefly described below, and it is obvious for those skilled in the art that other drawings can be obtained according to the drawings without inventive exercise.

Fig. 1 is a schematic view of a scenario in some embodiments provided by embodiments of the present disclosure;

Fig. 2 is a block diagram of a configuration of the control device 100 according to an embodiment of the present disclosure;

Fig. 3 is a block diagram of a hardware configuration of a display device 200 according to an embodiment of the present disclosure;

Fig. 4 is a schematic diagram of a software configuration in a display device 200 according to one or more embodiments of the present disclosure;

Fig. 5 is a schematic system architecture diagram of a display device according to an embodiment of the present disclosure;

Fig. 6 is a schematic diagram of a voice interaction network architecture according to an embodiment of the present disclosure;

Fig. 7 is a schematic flow chart illustrating a voice repair method according to an embodiment of the present disclosure;

Fig. 8 is a first schematic view of a user interface of a display device according to an embodiment of the present disclosure;

Fig. 9 is a second schematic view of a user interface of a display device according to an embodiment of the present disclosure;

Fig. 10 is a schematic structural diagram of a display device according to an embodiment of the present disclosure.

Detailed Description

In order that the above objects, features and advantages of the present disclosure may be more clearly understood, aspects of the present disclosure will be further described below. It should be noted that the embodiments and features of the embodiments of the present disclosure may be combined with each other without conflict.

In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present disclosure, but the present disclosure may be practiced in other ways than those described herein; it is to be understood that the embodiments disclosed in the specification are only a few embodiments of the present disclosure, and not all embodiments.

In the related voice repair reporting technology, the display device receives a repair-reporting voice input by a user and calls up a repair work order interface; after the work order is filled in by the user's voice, it is sent to the operation and maintenance platform, which analyzes the fault type according to the work order, determines a corresponding solution, and returns the solution to the display device, thereby meeting the user's repair demand. However, this approach does not operate in a natural language context: the user must input repair voice in a fixed format, for example, "I need to repair … …". In addition, the operation and maintenance platform handles a very large volume of work orders and has difficulty feeding back solutions promptly, so faults of the display device are hard to resolve in time and the user experience is poor.

In order to solve the foregoing technical problem, an embodiment of the present disclosure provides a display device and a voice repair method. A controller in the display device performs text conversion on the user voice received by a user input interface to obtain a voice text corresponding to the user voice, detects whether the voice text includes a repair keyword, and detects the attribute of a negative word when the voice text does not include the repair keyword and the negative word exists. When the attribute of the negative word is a target attribute, the controller determines that a repair intention exists in the voice text, further determines whether a first repair instruction corresponding to the user voice exists, and executes the first repair instruction when it exists. In this way, in the user's normal natural-language context, the display device can accurately recognize and distinguish a negative intention from a repair intention in the user voice and, when a repair intention exists, determine and execute the corresponding first repair instruction. This meets the user's voice repair needs, resolves faults of the display device in a timely manner, is simple and convenient to operate, and improves the user experience.

Fig. 1 is a schematic view of a scenario in some embodiments provided by embodiments of the present disclosure. As shown in fig. 1, a control apparatus 100, a display device 200, a smart device 300, and a server 400 are shown, and a user may operate the display device 200 through the smart device 300 or the control apparatus 100 to play audio and video resources on the display device 200.

Taking as an example the case where the user operates the display device 200 through the control device 100: the user presses a voice input button on the control device 100, the control device 100 transmits to the display device 200 an instruction to receive the user voice, and the display device 200 receives the user voice through the user input interface in response to the instruction.

In a scenario of normal spoken communication, the user input interface of the display device 200 receives a user voice, and the controller of the display device 200 then performs text conversion on the user voice to obtain a voice text corresponding to the user voice and detects whether the voice text includes a repair keyword. When the voice text includes a repair keyword, for example, "repair", it is determined that the user voice contains an obvious repair intention; for a voice with an obvious repair intention, the display device may subsequently perform the corresponding repair processing with reference to the prior art.

When the voice text does not include a repair keyword, the controller detects whether a negative word exists in the voice text, so that the attribute of the negative word can be used to distinguish a negative sentence from a repair sentence and to identify a repair intention that is hidden in the user voice rather than stated explicitly. Specifically, when the voice text does not include a repair keyword and a negative word exists, the attribute of the negative word is detected; when the attribute of the negative word is a target attribute, it is determined that a repair intention exists in the voice text, that is, a repair intention is hidden in the user voice. It is then determined whether a first repair instruction corresponding to the user voice exists, and the first repair instruction is executed when it exists. This meets the user's voice repair needs and resolves faults of the display device in a timely manner; the method suits the user's natural language scenario, has a low learning cost, is simple and convenient to operate, and improves the user experience.
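The branching described above can be sketched as follows. This is only an illustrative sketch under stated assumptions: the keyword lists, the attribute labels, the instruction table, and the helper `toy_attribute` are hypothetical placeholders, since the disclosure does not enumerate them or say how the attribute of a negative word is computed.

```python
from typing import Callable, Mapping, Sequence

# Every word list and label below is a placeholder chosen for illustration;
# the disclosure does not enumerate them.
REPAIR_KEYWORDS = frozenset({"repair"})
NEGATIVE_WORDS = frozenset({"not", "cannot"})
TARGET_ATTRIBUTES = frozenset({"complement", "control_word"})


def toy_attribute(tokens: Sequence[str], index: int) -> str:
    """Hypothetical stand-in for the attribute detector.

    Labels a negative word that is directly followed by an operation verb
    as "complement"; anything else is labelled "other".
    """
    operation_verbs = {"play", "open", "connect"}
    following = tokens[index + 1] if index + 1 < len(tokens) else ""
    return "complement" if following in operation_verbs else "other"


def classify(tokens: Sequence[str],
             attribute_of: Callable[[Sequence[str], int], str],
             first_instructions: Mapping[str, str]) -> str:
    """Return a label naming the branch taken for one voice text."""
    # Branch 1: an explicit repair keyword means an obvious repair intention.
    if any(token in REPAIR_KEYWORDS for token in tokens):
        return "explicit_repair"
    # Branch 2: without a negative word there is nothing further to inspect.
    negatives = [i for i, token in enumerate(tokens) if token in NEGATIVE_WORDS]
    if not negatives:
        return "no_negative_word"
    # Branch 3: a negative word without a target attribute is plain negation.
    if not any(attribute_of(tokens, i) in TARGET_ATTRIBUTES for i in negatives):
        return "negative_intent"
    # Branch 4: hidden repair intention; look up a first repair instruction.
    for token in tokens:
        if token in first_instructions:
            return "execute:" + first_instructions[token]
    return "repair_intent_without_instruction"
```

With the toy detector, `classify("the sound cannot play".split(), toy_attribute, {"sound": "reset_audio"})` takes the fourth branch, while "i do not like this film" is treated as ordinary negation.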

In some embodiments, the control device 100 may be a remote controller, and the communication between the remote controller and the display device includes infrared protocol communication, Bluetooth protocol communication, or other wireless or wired methods to control the display device 200. The user may input a user command through a key on the remote controller, voice input, control panel input, etc. to control the display device 200. In some embodiments, mobile terminals, tablets, computers, laptops, and other smart devices may also be used to control the display device 200.

In some embodiments, a software application may be installed on both the smart device 300 and the display device 200 to implement connection and communication through a network communication protocol for the purposes of one-to-one control operation and data communication. Audio and video content displayed on the smart device 300 can also be transmitted to the display device 200 to implement a synchronous display function. The display device 200 also performs data communication with the server 400 through multiple communication modes, and may be communicatively connected through a Local Area Network (LAN), a Wireless Local Area Network (WLAN), and other networks. The server 400 may provide various contents and interactions to the display device 200. The display device 200 may be a liquid crystal display, an OLED display, or a projection display device. In addition to the broadcast receiving television function, the display device 200 may provide a smart network television function with computer support functions.

Fig. 2 is a block diagram of a configuration of the control device 100 according to the embodiment of the present disclosure. As shown in fig. 2, the control device 100 includes a controller 110, a communication interface 130, a user input/output interface 140, a memory, and a power supply. The control device 100 may receive an operation instruction input by a user and convert it into an instruction that the display device 200 can recognize and respond to, serving as an interaction intermediary between the user and the display device 200. The communication interface 130 is used for communicating with the outside, and includes at least one of a WIFI chip, a Bluetooth module, NFC, or an alternative module. The user input/output interface 140 includes at least one of a microphone, a touch pad, a sensor, a key, or an alternative module.

Fig. 3 is a block diagram of a hardware configuration of a display device 200 according to an embodiment of the present disclosure. The display device 200 as shown in fig. 3 includes at least one of a tuner demodulator 210, a communicator 220, a detector 230, an external device interface 240, a controller 250, a display 260, an audio output interface 270, a memory, a power supply, a user input interface 280, etc. The controller 250 includes at least one of a Central Processing Unit (CPU), a video processor, an audio processor, a Random Access Memory (RAM), a Read-Only Memory (ROM), a first interface to an nth interface for input/output, a communication bus, and the like. The display 260 may be at least one of a liquid crystal display, an OLED display, a touch display, and a projection display, and may also be a projection device and a projection screen. The tuner demodulator 210 receives broadcast television signals in a wired or wireless manner and demodulates audio/video signals and data signals such as EPG data signals from a plurality of wireless or wired broadcast television signals. The controller 250 and the tuner demodulator 210 may be located in different separate devices; that is, the tuner demodulator 210 may also be located in a device external to the main device where the controller 250 is located, such as an external set-top box.

The detector 230 is used to collect signals of the external environment or interaction with the outside. For example, detector 230 includes a light receiver, a sensor for collecting the intensity of ambient light; alternatively, the detector 230 includes an image collector, such as a camera, which may be used to collect external environment scenes, attributes of the user, or user interaction gestures, or the detector 230 includes a sound collector, such as a microphone, which is used to receive external sounds.

The sound collector may be a microphone, which receives the sound of a user and converts the sound signal into an electrical signal. The display device 200 may be provided with at least one microphone. In other embodiments, the display device 200 may be provided with two microphones to achieve a noise reduction function in addition to collecting sound signals. In other embodiments, the display device 200 may further include three, four, or more microphones to collect sound signals and reduce noise, and may further identify sound sources and perform directional recording.

In addition, the microphone may be built into the display device 200, or may be connected to the display device 200 by wire or wirelessly. The position of the microphone on the display device 200 is not limited by the embodiments of the present disclosure. Alternatively, the display device 200 may not include a microphone, i.e., no microphone is provided in the display device 200. In that case, the display device 200 may be connected to an external microphone via an interface (e.g., the USB interface 130). The external microphone may be fixed to the display device 200 by an external fixing member (e.g., a camera holder with a clip).

In some embodiments, the display device is a terminal device with a display function, such as a television, a mobile phone, a computer, a learning machine, and the like.

In some embodiments, the controller 250 controls the operation of the display device and responds to user operations through various software control programs stored in memory. The controller 250 controls the overall operation of the display apparatus 200. A user may input a user command on a Graphical User Interface (GUI) displayed on the display 260, and the user input interface receives the user input command through the Graphical User Interface (GUI). Alternatively, the user may input the user command by inputting a specific sound or gesture, and the user input interface receives the user input command by recognizing the sound or gesture through the sensor.

An output interface (display 260, and/or audio output interface 270) configured to output user interaction information;

a communicator 220 for communicating with the server 400 or other devices.

An embodiment of the present disclosure provides a display device, including:

a user input interface 280 configured to: receiving user voice;

a controller 250 configured to: performing text conversion on the user voice to obtain a voice text corresponding to the user voice, and detecting whether the voice text includes a repair keyword; detecting the attribute of a negative word under the condition that the voice text does not include the repair keyword and the negative word exists;

under the condition that the attribute of the negative word is the target attribute, determining that the voice text has a repair intention, and determining whether a first repair instruction corresponding to the user voice exists;

and executing the first repair instruction under the condition that the first repair instruction corresponding to the user voice is determined to exist.

With the display device described above, whether a repair intention exists in the user voice is accurately determined in a natural language scenario, and a repair intention can be distinguished from a negative intention. When a repair intention hidden in the user voice is identified, the first repair instruction corresponding to the user voice is determined and executed, so that faults of the display device are resolved in a timely manner, operation is simple and convenient, and the user experience is effectively improved.

In some embodiments, after text-converting the user speech to obtain a speech text corresponding to the user speech, and before detecting whether the speech text includes the repair keyword, the controller 250 is further configured to:

under the condition that the voice text is detected to include a device keyword and a skill keyword, determining whether the device indicated by the device keyword is the display device;

if the device indicated by the device keyword is the display device, determining whether the display device supports the skill indicated by the skill keyword;

the controller 250, detecting whether the speech text includes the repair keyword, is specifically configured to: in the case where the display device supports the skill indicated by the skill keyword, it is detected whether the repair keyword is included in the voice text.

In some embodiments, after determining whether the device indicated by the device keyword is the display device in the case where the voice text includes the device keyword and the skill keyword, the controller is further configured to: if the device indicated by the device keyword is another device, that is, a device other than the display device, searching based on the device keyword and the skill keyword to obtain a search result; and controlling the display to display the search result.

In some embodiments, the controller 250, determining whether the display device supports the skill indicated by the skill keyword, is specifically configured to: according to the skill keywords, inquiring whether matched target skills exist in a database; in the case where there is a matching target skill, it is determined that the display device supports the skill indicated by the skill keyword.
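As a rough illustration of this gating step, the sketch below stores supported skills in an in-memory SQLite table and routes a (device keyword, skill keyword) pair accordingly. The table layout, the skill names, the set of names that count as "the display device", and the `unsupported_skill` outcome are all assumptions made for the example; the text above does not specify them.

```python
import sqlite3
from typing import Iterable


def build_skill_db(skills: Iterable[str]) -> sqlite3.Connection:
    """Create an in-memory database holding the skills the device supports."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE skills (name TEXT PRIMARY KEY)")
    conn.executemany("INSERT INTO skills (name) VALUES (?)",
                     [(skill,) for skill in skills])
    return conn


def supports_skill(conn: sqlite3.Connection, skill_keyword: str) -> bool:
    """Query the database for a target skill matching the skill keyword."""
    row = conn.execute("SELECT 1 FROM skills WHERE name = ? LIMIT 1",
                       (skill_keyword,)).fetchone()
    return row is not None


def route(conn: sqlite3.Connection, device_keyword: str, skill_keyword: str,
          display_names=frozenset({"tv", "television"})) -> str:
    """Decide the next step for a voice text containing both keywords."""
    if device_keyword not in display_names:
        # Another device: search with both keywords and show the result.
        return "search"
    if supports_skill(conn, skill_keyword):
        # Supported skill on this device: go on to repair-keyword detection.
        return "detect_repair_keyword"
    # What happens here is not stated in the text; this label is a placeholder.
    return "unsupported_skill"
```

A real implementation would probably need fuzzy or synonym matching rather than exact string equality; exact matching is used here only to keep the sketch short.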

In some embodiments, after performing text conversion on the user voice to obtain a voice text corresponding to the user voice and detecting whether the voice text includes the repair keyword, the controller 250 is further configured to: under the condition that the voice text includes the repair keyword and a skill keyword exists, querying whether a second repair instruction exists based on the repair keyword and the skill keyword, wherein the second repair instruction is a pre-stored repair instruction corresponding to the repair keyword and/or the skill keyword; and executing the second repair instruction under the condition that the query finds that the second repair instruction exists.
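One way to read "corresponding to the repair keyword and/or the skill keyword" is a pre-stored table keyed by the keyword pair, with fallbacks to entries keyed by only one of the two. The sketch below follows that reading; the key layout, the lookup order, and the instruction names are assumptions, not details from the disclosure.

```python
from typing import Mapping, Optional, Tuple

# A key is (repair keyword, skill keyword); None marks "any" for that slot.
Key = Tuple[Optional[str], Optional[str]]


def find_second_instruction(table: Mapping[Key, str],
                            repair_keyword: str,
                            skill_keyword: str) -> Optional[str]:
    """Return the pre-stored instruction for the keywords, or None if absent."""
    # Assumed preference: exact pair first, then skill only, then repair only.
    candidates = (
        (repair_keyword, skill_keyword),
        (None, skill_keyword),
        (repair_keyword, None),
    )
    for key in candidates:
        if key in table:
            return table[key]
    return None
```

If the function returns `None`, no second repair instruction exists for the utterance and the caller has to fall back to some other handling.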

In some embodiments, after determining that a repair intention exists in the voice text in the case where the attribute of the negative word is the target attribute, and determining whether a first repair instruction corresponding to the user voice exists, the controller 250 is further configured to: under the condition that no first repair instruction corresponding to the user voice exists, matching, by a question-answering system based on information indexing, a solution and an associated video corresponding to the user voice, wherein the associated video is a video having a mapping relation with the solution; and controlling the display to display the solution and the associated video.
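The matching step could, for instance, be built on a small inverted index over stored questions, each mapped to a solution and an associated video. The sketch below is one such toy arrangement; the sample entries, the video identifiers, and the token-overlap scoring are all invented for illustration and are not taken from the disclosure.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Optional, Set, Tuple

# (question, solution, associated video id); all entries are made-up samples.
Entry = Tuple[str, str, str]
FAQ: List[Entry] = [
    ("screen has no picture",
     "Check the HDMI cable and the selected input source.",
     "video_no_picture"),
    ("remote control does not respond",
     "Replace the batteries and pair the remote again.",
     "video_remote_pairing"),
]


def build_index(entries: List[Entry]) -> Dict[str, Set[int]]:
    """Map each question token to the ids of the entries containing it."""
    index: Dict[str, Set[int]] = defaultdict(set)
    for entry_id, (question, _solution, _video) in enumerate(entries):
        for token in question.lower().split():
            index[token].add(entry_id)
    return index


def match(query: str, entries: List[Entry],
          index: Dict[str, Set[int]]) -> Optional[Tuple[str, str]]:
    """Return (solution, video) of the entry sharing the most tokens, if any."""
    hits: Counter = Counter()
    for token in set(query.lower().split()):
        for entry_id in index.get(token, ()):
            hits[entry_id] += 1
    if not hits:
        return None
    best_id, _count = hits.most_common(1)[0]
    _question, solution, video = entries[best_id]
    return solution, video
```

Because each solution is stored together with its video, the mapping relation between the two is kept by construction; a production system would likely use a proper retrieval engine instead of raw token overlap.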

In some embodiments, the controller 250, in the case that the attribute of the negative word is the target attribute, determines that there is a repair intention in the speech text, and after determining whether there is a first repair instruction corresponding to the speech of the user, is further configured to: under the condition that the first repair instruction corresponding to the voice of the user does not exist, a repair work order filling interface is called;

a user input interface 280 further configured to: receiving voice input of a user aiming at a repair work order filling interface; the controller is used for filling in a repair work order according to the voice input; and uploading the repair work order to the operation and maintenance platform after the repair work order is completely filled.
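The work order path can be pictured as a small form object that is filled field by field from voice input and uploaded only once every required field has a value. The field names and the `upload` callback below are hypothetical; the disclosure does not list the contents of the repair work order or the upload interface of the operation and maintenance platform.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple

# Assumed set of required fields; the real work order layout is not given.
REQUIRED_FIELDS: Tuple[str, ...] = ("fault_description", "contact_phone")


@dataclass
class RepairWorkOrder:
    values: Dict[str, str] = field(default_factory=dict)

    def fill(self, name: str, spoken_text: str) -> None:
        """Store the recognized text of one voice input in the named field."""
        if name not in REQUIRED_FIELDS:
            raise KeyError(name)
        self.values[name] = spoken_text.strip()

    def is_complete(self) -> bool:
        """True once every required field holds a non-empty value."""
        return all(self.values.get(name) for name in REQUIRED_FIELDS)


def upload_when_complete(order: RepairWorkOrder,
                         upload: Callable[[Dict[str, str]], None]) -> bool:
    """Call upload(order values) only if the order is complete."""
    if not order.is_complete():
        return False
    upload(dict(order.values))
    return True
```

In use, `upload` would wrap whatever transport reaches the operation and maintenance platform; passing `some_list.append` is enough to exercise the flow locally.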

In some embodiments, the target attributes of the negative word include a complement attribute and a control word attribute.

Fig. 4 is a schematic diagram illustrating a software configuration in a display device 200 according to one or more embodiments of the present disclosure. As shown in fig. 4, the system is divided into four layers, which are, from top to bottom, an Applications layer (referred to as the "application layer"), an Application Framework layer (referred to as the "framework layer"), an Android runtime and system library layer (referred to as the "system runtime library layer"), and a kernel layer. The kernel layer includes at least one of the following drivers: an audio driver, a display driver, a Bluetooth driver, a camera driver, a WIFI driver, a USB driver, an HDMI driver, a sensor driver (such as a fingerprint sensor, a temperature sensor, or a pressure sensor), a power driver, and the like.

In some examples, the operating system of the smart device is taken to be an Android system. Fig. 5 is a schematic system architecture diagram of a display device according to an embodiment of the present disclosure; as shown in fig. 5, the display device 200 may be logically divided into an Applications layer (referred to as the "application layer") 21, a kernel layer 22, and a hardware layer 23.

As shown in fig. 5, the hardware layer may include the controller 250, the communicator 220, the detector 230, and the like shown in fig. 3. The application layer 21 includes one or more applications. The application may be a system application or a third party application. For example, the application layer 21 includes a voice recognition application that can provide voice interactive interfaces and services for the connection of the display device 200 with the server 400.

The kernel layer 22 acts as software middleware between the hardware layer and the application layer 21 for managing and controlling hardware and software resources.

In some examples, the kernel layer 22 includes a detector driver that sends voice data collected by the detector 230 to a voice recognition application. Illustratively, when the voice recognition application in the display device 200 is started and the display device 200 establishes a communication connection with the server 400, the detector driver transmits the voice data input by the user and collected by the detector 230 to the voice recognition application. The voice recognition application then sends query information containing the voice data to the intention recognition module 202 in the server. The intention recognition module 202 inputs the voice data transmitted by the display device 200 to the intention recognition model.

For clarity of explanation of the embodiments of the present disclosure, a speech recognition network architecture provided by the embodiments of the present disclosure is described below with reference to fig. 6.

Referring to fig. 6, fig. 6 is a schematic diagram of a voice interaction network architecture according to an embodiment of the present disclosure. In fig. 6, the display device is used to receive input information and output a processing result of the information. The voice recognition module is deployed with a voice recognition service and is used for recognizing audio as text; the semantic understanding module is deployed with a semantic understanding service and is used for performing semantic analysis on the text; the business management module is deployed with a business instruction management service and is used for providing business instructions; the language generation module is deployed with a language generation (NLG) service and is used for converting instructions that instruct the display device to execute an operation into a text language; and the voice synthesis module is deployed with a voice synthesis (TTS) service and is used for processing the text language corresponding to the instruction and then sending it to a loudspeaker for broadcasting. In one embodiment, there may be multiple entity service devices deployed with different business services in the architecture shown in fig. 6, and one or more function services may also be aggregated in one or more entity service devices.
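The chain of modules in fig. 6 can be viewed as a simple composition of five services. The sketch below wires them together as plain functions; the stub implementations, the intent names, and the instruction strings are hypothetical stand-ins for the real services and only show the order in which data moves between modules.

```python
from typing import Callable, Dict


def handle_voice(audio: bytes,
                 recognize: Callable[[bytes], str],
                 understand: Callable[[str], Dict[str, str]],
                 manage: Callable[[Dict[str, str]], str],
                 generate: Callable[[str], str],
                 synthesize: Callable[[str], bytes]) -> bytes:
    """Pass one utterance through the five modules in order."""
    text = recognize(audio)            # voice recognition: audio to text
    semantics = understand(text)       # semantic understanding: text to intent
    instruction = manage(semantics)    # business management: intent to instruction
    reply_text = generate(instruction) # language generation: instruction to text
    return synthesize(reply_text)      # voice synthesis: text to audio


# Hypothetical stubs, used only to make the sketch runnable end to end.
def stub_recognize(audio: bytes) -> str:
    return audio.decode("utf-8")  # pretends the audio bytes already hold text


def stub_understand(text: str) -> Dict[str, str]:
    return {"intent": "volume_up" if "louder" in text else "unknown"}


def stub_manage(semantics: Dict[str, str]) -> str:
    return {"volume_up": "SET_VOLUME +1"}.get(semantics["intent"], "NOOP")


def stub_generate(instruction: str) -> str:
    return f"Executing {instruction}"


def stub_synthesize(reply_text: str) -> bytes:
    return reply_text.encode("utf-8")  # pretends encoded text is audio
```

Keeping each module behind a callable mirrors the note that the services may be deployed on separate entity service devices or aggregated on one.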

In some embodiments, the following describes an example of a process for processing information input to a display device based on the architecture shown in fig. 6, taking the information input to the display device as a voice instruction input by voice as an example:

Speech recognition: after receiving a voice instruction input by voice, the display device may perform noise reduction processing and feature extraction on the audio of the voice instruction, where the noise reduction processing may include removing echo and ambient noise.

Semantic understanding: natural language understanding is performed on the recognized candidate texts and the associated context information by using an acoustic model and a language model, and the text is parsed into structured, machine-readable information, such as business field, intent, and word slots, to express its semantics. An intent confidence score is determined for each derived actionable intent, and the semantic understanding module selects one or more candidate actionable intents based on the determined intent confidence scores.

The semantic understanding module issues an execution instruction to the corresponding business management module according to the semantic parsing result of the text of the voice instruction, so as to execute the operation corresponding to the voice instruction, complete the operation requested by the user, and feed back the execution result of that operation.
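The selection of candidate actionable intents by confidence score can be illustrated with a minimal sketch. The `CandidateIntent` type, the `threshold` and `top_k` parameters, and the example intent names are assumptions introduced for illustration only; the disclosure does not specify how the score is computed or which cutoff is used.

```python
from dataclasses import dataclass


@dataclass
class CandidateIntent:
    domain: str        # business field (hypothetical label)
    intent: str        # actionable intent (hypothetical label)
    slots: dict        # word slots parsed from the text
    confidence: float  # intent confidence score, assumed to lie in [0, 1]


def select_intents(candidates, threshold=0.5, top_k=1):
    """Keep candidates scoring at least `threshold`, best first, at most `top_k`."""
    kept = [c for c in candidates if c.confidence >= threshold]
    kept.sort(key=lambda c: c.confidence, reverse=True)
    return kept[:top_k]
```

With `top_k=1` the module acts on a single intent; a larger `top_k` keeps several candidates for later disambiguation.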

To explain the present solution in more detail, the following description is given by way of example with reference to fig. 7. It is understood that, in actual implementation, the steps involved in fig. 7 may include more or fewer steps, and the order of the steps may differ, provided that the voice repair method provided in the embodiment of the present disclosure can be implemented.

As shown in fig. 7, fig. 7 is a schematic flow chart of a voice repair method according to an embodiment of the present disclosure, where the voice repair method includes the following steps S701 to S705:

S701, receiving user voice.

S702, text conversion is carried out on the user voice to obtain a voice text corresponding to the user voice, and whether the voice text comprises the repair keyword is detected.

In some embodiments, after the user speech is received by the user input interface of the display device, the user speech is preprocessed, including but not limited to at least one of: denoising, human voice extraction, which the present disclosure does not limit.

In some embodiments, after text conversion is performed on the user voice to obtain the corresponding voice text, it is first detected whether the voice text includes a device keyword and a skill keyword. Whether the device operated by the user voice is the display device is judged according to the device keyword, and whether that device supports the skill corresponding to the skill keyword is judged according to the skill keyword. The embodiment of the disclosure provides an implementation in which, when a device keyword and a skill keyword are included in the voice text, it is judged whether the device indicated by the device keyword is the display device.

In the case that the device indicated by the device keyword is the display device, it is determined whether the display device supports the skill indicated by the skill keyword. In an embodiment provided in the present disclosure, a locally stored database is queried according to the skill keyword for a target skill matching the skill keyword, and in the case that such a target skill exists, it is determined that the display device supports the skill indicated by the skill keyword.

It should be noted that there is a corresponding relationship between the skill keyword and the skill in the database stored locally, so that it is possible to query whether the corresponding skill is the skill supported by the device according to the skill keyword, and further determine whether the user voice includes the repair keyword under the condition that the display device itself supports the skill indicated by the skill keyword.

As shown in table 1, table 1 shows the correspondence between skill keywords and skills in the locally stored database:

TABLE 1

Skill keyword	Skill
Network	Network settings
Sound	Sound setting

It is emphasized that the skill keywords include, but are not limited to, synonyms and abbreviated forms; for example, the skill keywords "network", "net", and "wifi" all correspond to the skill "network settings". Table 1 is intended to be illustrative only and does not limit the present disclosure.
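The correspondence in Table 1, including synonyms, can be sketched as a simple lookup. This is a minimal sketch that assumes an in-memory dictionary stands in for the locally stored database; the names `SKILL_KEYWORDS`, `find_target_skill`, and `supports_skill` are hypothetical.

```python
# Hypothetical stand-in for the locally stored database of Table 1.
# Several keywords (synonyms, abbreviated forms) may map to one skill.
SKILL_KEYWORDS = {
    "network": "network settings",
    "net": "network settings",
    "wifi": "network settings",
    "sound": "sound setting",
}


def find_target_skill(skill_keyword):
    """Return the target skill matching the keyword, or None if there is none."""
    return SKILL_KEYWORDS.get(skill_keyword.strip().lower())


def supports_skill(skill_keyword):
    """The display device supports the skill when a matching target skill exists."""
    return find_target_skill(skill_keyword) is not None
```

For example, `find_target_skill("wifi")` and `find_target_skill("network")` resolve to the same skill, while a keyword absent from the table yields `None`.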

Exemplarily, after receiving the user voice, the smart television performs text conversion on the user voice to obtain the corresponding voice text "the television sound is small". The smart television then detects the voice text, finds that it includes the device keyword "television" and the skill keyword "sound", and determines according to the device keyword "television" that the indicated device is the smart television itself. Further, when it is determined according to the skill keyword "sound" that the smart television supports the indicated skill "sound setting", it is determined whether the repair keyword is included in the user voice. In this way, when the device keyword and the skill keyword are included in the voice text, the repair intention and the basic control intention are effectively distinguished. For example, because the aforementioned voice text "the television sound is small" does not include the repair keyword, the display device, after performing semantic understanding on the voice text, determines that the voice text includes a control intention that the user desires to adjust the sound, and the display device can directly increase its volume.

In this embodiment, when the device indicated by the device keyword is the display device and the skill indicated by the skill keyword is a skill supported by the display device, it indicates that the user desires to perform a corresponding operation on that skill of the display device: specifically, either to execute a control instruction corresponding to the skill or to execute a maintenance instruction corresponding to the skill. It is therefore necessary to further identify whether the voice text contains a repair intention by detecting the repair keyword in the voice text.

In some embodiments, in a case that it is determined that the display device does not support the skill indicated by the skill keyword, the display device determines other devices capable of supporting that skill and displays prompt information, where the prompt information indicates the identifiers of the devices supporting the skill indicated by the skill keyword. The display device sends a control instruction including the skill keyword to these devices, so that the devices respond to the control instruction. This adapts to smart home scenes in which multiple smart devices exist, so that the voice instruction of the user can still be answered.

For example, if the skill keyword "cooling" is detected in the voice text and the smart television does not support the cooling skill, the smart television may query whether a smart device supporting the cooling skill exists in the smart home scene. If it is determined that the refrigerator supports the cooling skill, then, as shown in fig. 8, which is a schematic view of a user interface of a display device provided in an embodiment of the present disclosure, the user interface of the smart television displays "The refrigerator in your home supports cooling. Do you want to turn on cooling of the refrigerator?" to ask the user whether to turn on the cooling of the refrigerator. In a case where the user operates the display device to confirm that cooling of the refrigerator is to be turned on, a control instruction is transmitted to the refrigerator so that the refrigerator turns on cooling in response to the control instruction.
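The handling of a skill that the display device does not support can be sketched as follows. This is only an illustration under stated assumptions: the registry `HOME_DEVICE_SKILLS`, the device identifiers, the prompt wording, and the shape of the control instruction are all hypothetical, since the disclosure specifies only that the control instruction includes the skill keyword.

```python
# Hypothetical registry of smart home devices and the skills they support.
HOME_DEVICE_SKILLS = {
    "smart_tv": {"network settings", "sound setting"},
    "refrigerator": {"cooling"},
}


def devices_supporting(skill, exclude="smart_tv"):
    """Return identifiers of other devices that support the given skill."""
    return sorted(
        device
        for device, skills in HOME_DEVICE_SKILLS.items()
        if device != exclude and skill in skills
    )


def build_prompt(skill, device):
    """Prompt shown on the display before any control instruction is sent."""
    return f"The {device} in your home supports {skill}. Turn on {skill}?"


def build_control_instruction(device, skill_keyword):
    """Control instruction carrying the skill keyword, sent after confirmation."""
    return {"target": device, "skill_keyword": skill_keyword}
```

The prompt is built first and the instruction is sent only after the user confirms, which mirrors the confirmation step in the refrigerator example.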

In some embodiments, in a case that no device keyword is detected in the voice text, the device for which the user desires to report a repair is taken to be the display device. That is, the display device is the receiving object of the user voice input, and by default the user is assumed to desire voice control of the display device.

According to the above embodiment, when the voice text is detected to include a device keyword and the device indicated by the device keyword is determined to be the display device, whether the display device supports the skill corresponding to the skill keyword is determined according to the skill keyword in the voice text. When the display device is determined to support that skill, the repair can be performed for the skill corresponding to the skill keyword once the repair intention has been accurately identified in the subsequent steps. This improves the efficiency of removing faults of the display device, allows the voice repair of the user to be answered in time, and effectively improves the user experience.

It can be understood that, in the intelligent home scene, the user can report the fault problem existing in other intelligent devices through the intelligent television, and therefore when the user reports the fault problem through the intelligent television in a voice mode, the target intelligent device which the user really wants to report the fault problem needs to be determined according to the device keyword.
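The device and skill checks described so far can be combined into a single routing decision, sketched below. The keyword tables, the device identifiers, and the three route labels are hypothetical, and whole-word matching on English text is an assumption made for illustration; the disclosure does not prescribe how keywords are detected. Treating a voice text without a skill keyword as going straight to repair keyword detection is likewise an assumption.

```python
import re

SELF_ID = "smart_tv"  # hypothetical identifier of the display device itself
DEVICE_KEYWORDS = {"television": "smart_tv", "tv": "smart_tv",
                   "washing machine": "washing_machine"}
SKILL_KEYWORDS = {"sound": "sound setting", "network": "network settings",
                  "cooling": "cooling", "f11": "F11"}
LOCAL_SKILLS = {"sound setting", "network settings"}  # skills the display supports


def find_keyword(text, table):
    """Return the value of the first table keyword found as a whole word, else None."""
    lowered = text.lower()
    for keyword, value in table.items():
        if re.search(r"\b" + re.escape(keyword) + r"\b", lowered):
            return value
    return None


def route(text):
    """Decide the next step for a voice text."""
    device = find_keyword(text, DEVICE_KEYWORDS)
    skill = find_keyword(text, SKILL_KEYWORDS)
    target = device if device is not None else SELF_ID  # default: display device
    if target != SELF_ID:
        return "search_other_device"
    if skill is not None and skill not in LOCAL_SKILLS:
        return "find_supporting_device"
    return "check_repair_keyword"
```

Here a voice text naming another device is routed to search, a skill the display device lacks is routed to the lookup of other supporting devices, and everything else proceeds to repair keyword detection.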

In the case that the device indicated by the device keyword detected in the voice text is a device other than the display device, a search is performed based on the device keyword and the skill keyword to obtain a search result. The embodiment of the present disclosure provides an implementation in which the device keyword and the skill keyword are searched in a locally stored database of the display device, where the locally stored database includes the device keywords of a plurality of smart devices in the smart home and the skill keywords corresponding to the skills supported by each smart device. The search result includes the device keyword and the skill keyword of the other device and the content associated with the skill keyword, which may be text content or video content. It should be noted that the text content and the video content may be configured in advance and stored in the display device, so that the search content can still be obtained and fed back to the user in time when the display device is offline.

Exemplarily, the voice text is "what does F11 on the washing machine mean". The device indicated by the device keyword "washing machine" is determined to be a device other than the smart television, which indicates that the user is asking the smart television about a problem related to the washing machine. The device keyword "washing machine" and the skill keyword "F11" are then searched in the local storage of the smart television, and the search returns the text "F11 on the washing machine is an alarm prompt (abnormal water inflow): please connect the water pipe, open the water faucet, and press the start key; or clean the water inlet valve. If the problem persists, please dial: XXX-XXX-XXX." and/or a video associated with abnormal water inflow.

As shown in fig. 9, fig. 9 is a schematic diagram of a user interface of a display device provided by an embodiment of the present disclosure. The interface shows the voice text "what does F11 on the washing machine mean" corresponding to the user voice, together with the search result obtained by searching based on the device keyword "washing machine" and the skill keyword "F11" in the voice text. The search result is the solution corresponding to the skill keyword and includes text content and video content associated with the skill keyword; that is, the solution is presented in both text form and video form.

The embodiment of the disclosure provides another implementation, in which a search is performed on the internet based on the device keyword and the skill keyword. Optionally, a search instruction is generated based on the device keyword and the skill keyword and sent to the server, a search result returned by the server is received, and the display device is controlled to display the search result.
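A local search keyed on the device keyword and the skill keyword can be sketched as a dictionary lookup. The structure of `LOCAL_KB`, the abbreviated solution text, and the video path are hypothetical placeholders; the disclosure states only that pre-configured text and video content are stored on the display device.

```python
# Hypothetical local store: (device keyword, skill keyword) -> associated content.
LOCAL_KB = {
    ("washing machine", "f11"): {
        "text": "F11: abnormal water inflow. Check the water pipe and faucet, "
                "or clean the water inlet valve.",
        "video": "videos/water_inflow.mp4",  # placeholder path, not from the source
    },
}


def search_local(device_keyword, skill_keyword):
    """Return a search result with keywords and associated content, or None."""
    key = (device_keyword.strip().lower(), skill_keyword.strip().lower())
    entry = LOCAL_KB.get(key)
    if entry is None:
        return None
    return {"device_keyword": key[0], "skill_keyword": key[1], **entry}
```

Because the lookup touches only local data, it still works when the display device is offline; a `None` result is the natural point at which to fall back to a server-side search.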

In some embodiments, in the case that the device keyword and the skill keyword are included in the speech text, and the device indicated by the device keyword is the other device, and the skill indicated by the skill keyword is a skill supported by the other device, whether the repair keyword is included in the speech text is detected to determine whether the repair intention of the user speech for the skill supported by the other device exists. The repair keywords include, but are not limited to, "repair", "bad", "failure", and the like. The present disclosure is not so limited.

S703, detecting the attribute of the negative word under the condition that the voice text does not include the repair keyword and the negative word exists.

In some embodiments, in a case that the voice text includes a repair keyword, it indicates that the user voice has an obvious repair intention. For user voice with an obvious repair intention, the embodiments of the present disclosure provide an implementation in which semantic understanding is performed on the user voice to obtain a semantic understanding result, a repair instruction corresponding to the semantic understanding result is determined, and the repair instruction is executed.

In some embodiments, in the case where the repair keyword and the skill keyword are both included in the voice text, it is understood that the user has a repair intention for the skill indicated by the skill keyword; that is, the skill of the display device may have a fault and need to be repaired. The embodiment of the disclosure provides an implementation in which whether a second repair instruction exists is queried based on the repair keyword and the skill keyword, where the second repair instruction is a pre-stored repair instruction corresponding to the repair keyword and/or the skill keyword; and the second repair instruction is executed in the case that the query finds that the second repair instruction exists.

Illustratively, the voice text is "I want to report a repair, the television has no sound". It is detected that the voice text includes the repair keyword "repair", the device keyword "television", and the skill keyword "sound". The television determines that a repair intention exists according to the repair keyword "repair", obtains the second repair instruction "cancel mute" by querying according to the skill keyword "sound", and then executes the second repair instruction "cancel mute" to fix the problem that the television has no sound.

The present disclosure mainly analyzes the user voice, or the voice text corresponding to the user voice, in the case that the voice text does not include the repair keyword, so as to accurately identify whether the voice text has a repair intention.
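The query for a pre-stored second repair instruction can be sketched as below. The table `SECOND_REPAIR_INSTRUCTIONS` and the substring-based keyword detection are assumptions for illustration; the repair keywords listed are the examples given in this disclosure, and the single mapping shown is the "cancel mute" example.

```python
REPAIR_KEYWORDS = ("repair", "bad", "failure")  # examples named in the disclosure

# Hypothetical pre-stored table: skill keyword -> second repair instruction.
SECOND_REPAIR_INSTRUCTIONS = {"sound": "cancel mute"}


def find_repair_keyword(text):
    """Return the first repair keyword contained in the text, or None."""
    lowered = text.lower()
    for keyword in REPAIR_KEYWORDS:
        if keyword in lowered:
            return keyword
    return None


def second_repair_instruction(text, skill_keyword):
    """Look up a pre-stored instruction only when a repair keyword is present."""
    if find_repair_keyword(text) is None:
        return None
    return SECOND_REPAIR_INSTRUCTIONS.get(skill_keyword.lower())
```

A `None` result covers both the case without a repair keyword, which is handled by the negative word analysis, and the case where no instruction is stored for the skill.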

Optionally, in a case that the repair keyword does not exist in the voice text, a negative word existing in the voice text is detected, and the attribute of the negative word is then detected, where the attribute of the negative word includes, but is not limited to: adverbial, complement, and control word. In the embodiment of the present disclosure, the specific process for identifying the attribute of a negative word may refer to existing parsing technology, which is not described herein again.

In the case that the attribute of the negative word is adverbial, the negative word serves as an adverbial modifying a verb in the voice text, and it can be determined that the voice text has no repair intention but has a negative intention.

S704, under the condition that the attribute of the negative word is the target attribute, determining that the voice text has a repair intention, and determining whether a first repair instruction corresponding to the voice of the user exists.

Wherein, the target attribute is a complement or a control word.

When the attribute of the negative word is a complement, the negative word serves as the complement of a verb in the voice text, and it is determined that the voice text has a repair intention.

In the case that the attribute of the negative word is the control word, the negative word governs a word in the voice text. For example, in "no sound", the negative word "no" governs a word in the voice text, and it is determined that the voice text has a repair intention.

Optionally, when determining whether the attribute of the negative word is the control word, the embodiment of the present disclosure uses the pattern "negative word + person name/other media attribute" to exclude negative words that do not belong to the control word. For example, for the voice text "movies without Liu Dehua", the display device identifies that the voice text has no repair intention.
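The decision in S703 and S704 can be sketched as a small rule, assuming the attribute of the negative word has already been produced by an existing parser. The `NegAttr` enumeration, the `governed` argument, and the `MEDIA_ENTITIES` set are hypothetical names; the set stands in for whatever list of person names and media attributes an implementation would use.

```python
from enum import Enum


class NegAttr(Enum):
    ADVERBIAL = "adverbial"        # negative word modifies a verb: negative intention
    COMPLEMENT = "complement"      # negative word is the complement of a verb
    CONTROL_WORD = "control word"  # negative word governs a following word


TARGET_ATTRS = {NegAttr.COMPLEMENT, NegAttr.CONTROL_WORD}

# Hypothetical list of person names / media attributes used for exclusion.
MEDIA_ENTITIES = {"liu dehua"}


def has_repair_intention(neg_attr, governed=""):
    """Return True when the negative word's attribute indicates a repair intention."""
    if neg_attr not in TARGET_ATTRS:
        return False  # no negative word, or adverbial use
    if neg_attr is NegAttr.CONTROL_WORD and governed.strip().lower() in MEDIA_ENTITIES:
        return False  # "negative word + person name/media attribute" is excluded
    return True
```

Under these assumptions, "no sound" is passed as `has_repair_intention(NegAttr.CONTROL_WORD, "sound")` and yields a repair intention, while the same attribute with "Liu Dehua" as the governed phrase does not.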

In a case where it is determined that there is a repair intention for the voice text, it is determined whether there is a first repair instruction corresponding to the voice text. Optionally, semantic understanding is performed on the voice text to obtain a semantic understanding result, and then matching is performed on the semantic understanding result in a repair instruction library to determine whether a first repair instruction corresponding to the voice text (that is, the voice of the user) exists.

In some embodiments, when the repair keyword does not exist in the voice text but a device keyword is detected, and the device indicated by the device keyword is not the display device that receives the user voice, determining according to the attribute of the negative word that the voice text has a repair intention indicates that the display device is being used as a search device for solving fault problems of other devices: the user inputs the user voice to the display device and expects to obtain an answer for solving the fault problem of the other device. The display device searches according to the device keyword and the skill keyword included in the user voice, and obtains and displays the search result, so that the user can consult the search result on the large screen of the display device to find the answer. The operation is convenient and the visibility is strong.

S705, when the first repair instruction corresponding to the voice of the user is determined to exist, the first repair instruction is executed.

In some embodiments, in the absence of the first repair instruction corresponding to the user voice, an information retrieval-based question answering (IRQA) system matches, in a question and answer matching manner, a solution corresponding to the user voice and an associated video, where the associated video is a video having a mapping relationship with the solution. Further, the display is controlled to display the solution and the associated video.

In some embodiments, if there is no first repair instruction corresponding to the user voice, a repair work order filling interface is called. The user input interface of the display device receives voice input of the user for the repair work order filling interface, the repair work order is filled in on the interface in response to the voice input, and the repair work order is uploaded to the operation and maintenance platform after it is completely filled in. Operation and maintenance personnel of the platform can then analyze the fault problem according to the reported repair work order and contact the user of the display device to guide maintenance or arrange home maintenance.
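The handling after a repair intention is identified can be sketched as a chain of fallbacks. Chaining the question and answer match before the work order is an assumption made here for illustration: the disclosure presents these as separate embodiments and does not fix an order between them. The dictionaries standing in for the repair instruction library and the question and answer index are hypothetical.

```python
def handle_repair_intent(semantic_result, instruction_library, qa_index):
    """Return (action, payload) for a voice text that has a repair intention."""
    instruction = instruction_library.get(semantic_result)
    if instruction is not None:
        return ("execute", instruction)          # first repair instruction exists
    answer = qa_index.get(semantic_result)
    if answer is not None:
        return ("show_solution", answer)         # solution and associated video
    return ("open_work_order", None)             # repair work order filling interface


# Hypothetical data for illustration only.
library = {"tv_no_sound": "cancel mute"}
qa = {"tv_no_picture": {"solution": "Check the signal source.",
                        "video": "videos/no_picture.mp4"}}
```

With this data, `handle_repair_intent("tv_no_sound", library, qa)` executes the stored instruction, `"tv_no_picture"` falls through to the matched solution, and any other result opens the work order interface.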

In some embodiments, the display device records the content of the voice repair of the user and the processing result of the voice repair for counting the fault type of the device, which is not described herein again.

In summary, the embodiment of the present disclosure provides a voice repair method. Text conversion is performed on the received user voice to obtain the corresponding voice text, and whether the voice text includes a repair keyword is detected. When the voice text does not include the repair keyword and a negative word exists, the attribute of the negative word is detected. When the attribute of the negative word is the target attribute, it is determined that the voice text has a repair intention, and it is further determined whether a first repair instruction corresponding to the user voice exists; when the first repair instruction exists, it is executed. Accordingly, in the normal natural language context of the user, the display device can accurately recognize and distinguish the negative intention and the repair intention of the user voice, and when the user voice has a repair intention, the corresponding first repair instruction is determined and executed. This meets the voice repair requirement of the user, allows problems of the display device that need repair to be solved in time, keeps the operation simple and convenient, and improves the user experience.

As shown in fig. 10, fig. 10 is a schematic structural diagram of a display device according to an embodiment of the present disclosure, where the display device includes: a processor 1001, a memory 1002, and a computer program stored on the memory 1002 and operable on the processor 1001. When executed by the processor 1001, the computer program implements the processes of the voice repair method in the above method embodiments and can achieve the same technical effect; to avoid repetition, details are not described herein again.

An embodiment of the present disclosure provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the computer program implements each process executed by the foregoing voice repair method, and can achieve the same technical effect, and in order to avoid repetition, the computer program is not described herein again.

The computer-readable storage medium may be a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk.

The present disclosure provides a computer program product including a computer program, which when run on a computer, causes the computer to implement the above-mentioned voice repair method.

As will be appreciated by one skilled in the art, embodiments of the present disclosure may be provided as a method, system, or computer program product. Accordingly, the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present disclosure may take the form of a computer program product embodied on one or more computer-usable storage media having computer-usable program code embodied in the medium.

In the several embodiments provided in the present disclosure, it should be understood that the disclosed apparatus and method may be implemented in other manners. The apparatus embodiments described above are merely illustrative and, for example, the flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

In the present disclosure, the processor may be a Central Processing Unit (CPU), and may also be another general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and the like. A general purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.

In the present disclosure, the memory may include volatile memory in a computer readable medium, Random Access Memory (RAM), and/or nonvolatile memory, such as Read Only Memory (ROM) or flash memory (flash RAM). The memory is an example of a computer-readable medium.

In the present disclosure, computer-readable media include both permanent and non-permanent, removable and non-removable storage media. Storage media may implement information storage by any method or technology, and the information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, Compact Disc Read Only Memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, computer readable media do not include transitory computer readable media, such as modulated data signals and carrier waves.

It is noted that, in this document, relational terms such as "first" and "second," and the like, are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a … …" does not exclude the presence of another identical element in a process, method, article, or apparatus that comprises the element.

The foregoing description, for purposes of explanation, has been presented in conjunction with specific embodiments. However, the foregoing discussion in some embodiments is not intended to be exhaustive or to limit the implementations to the precise forms disclosed above. Many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles and the practical application, to thereby enable others skilled in the art to best utilize the embodiments and various embodiments with various modifications as are suited to the particular use contemplated.

Claims

1. A display device, comprising:

a user input interface configured to: receiving user voice;

2. The display device according to claim 1, wherein the controller, after text-converting the user speech to obtain a speech text corresponding to the user speech, before detecting whether the speech text includes a repair keyword, is further configured to:

under the condition that the voice text is detected to include the device keywords and the skill keywords, judging whether the device indicated by the device keywords is the display device;

if the device indicated by the device keyword is the display device, determining whether the display device supports the skill indicated by the skill keyword;

the controller, detecting whether the speech text includes a repair keyword, is configured to:

and detecting whether a repair keyword is included in the voice text or not under the condition that the display device supports the skill indicated by the skill keyword.

3. The display device according to claim 2, wherein the controller, in a case where a device keyword and a skill keyword are included in the speech text, is further configured to, after determining whether the device indicated by the device keyword is the display device:

if the device indicated by the device keyword is another device, the other device being a device other than the display device, searching based on the device keyword and the skill keyword to obtain a search result;

and controlling a display to display the search result.

4. The display device according to claim 2, wherein the controller, determining whether the skill indicated by the skill keyword is supported by the display device, is specifically configured to:

according to the skill keywords, inquiring whether matched target skills exist in a database;

determining that the display device supports the skill indicated by the skill keyword in the case that there is the matching target skill.

5. The display device according to claim 1, wherein the controller, after performing text conversion on the user speech to obtain a speech text corresponding to the user speech and detecting whether the speech text includes a repair keyword, is further configured to:

under the condition that the voice text comprises repair keywords and skill keywords exist, whether a second repair instruction exists is inquired based on the repair keywords and the skill keywords, wherein the second repair instruction is a pre-stored repair instruction corresponding to the repair keywords and/or the skill keywords;

and executing the second repair instruction under the condition that the second repair instruction exists in the inquiry.

6. The display device according to claim 1, wherein the controller, in a case where the attribute of the negative word is a target attribute, determines that there is a repair intention in the voice text, and after determining whether there is a first repair instruction corresponding to the user voice, is further configured to:

under the condition that it is determined that a first repair instruction corresponding to the user voice does not exist, matching a solution corresponding to the user voice and an associated video based on an information index question-answering system, wherein the associated video is a video having a mapping relation with the solution;

controlling a display to display the solution and the associated video.

7. The display device according to claim 1, wherein the controller, in a case where the attribute of the negative word is a target attribute, determines that there is a repair intention in the voice text, and after determining whether there is a first repair instruction corresponding to the user voice, is further configured to:

under the condition that the first repair instruction corresponding to the user voice does not exist, a repair work order filling interface is called;

the user input interface further configured to: receiving voice input of a user aiming at the repair work order filling interface;

the controller is used for filling a repair work order according to the voice input;

and uploading the repair work order to an operation and maintenance platform after the repair work order is completely filled.

8. The display device according to claim 1, wherein the target attribute of the negative word includes a complement attribute, a control word attribute.

9. A voice repair method is characterized by comprising the following steps:

receiving user voice;

10. The method of claim 9, wherein after the text-converting the user speech to obtain the speech text corresponding to the user speech, and before the detecting whether the speech text includes the repair keyword, the method further comprises:

under the condition that the voice text is detected to include the device keywords and the skill keywords, judging whether the device indicated by the device keywords is a display device or not;

the detecting whether the voice text includes the repair keyword includes: and detecting whether a repair keyword is included in the voice text or not under the condition that the display device supports the skill indicated by the skill keyword.