CN113593545A

CN113593545A - Linkage scene execution method and device, storage medium and electronic equipment

Info

Publication number: CN113593545A
Application number: CN202110707562.XA
Authority: CN
Inventors: 李阅苗; 刘建国
Original assignee: Qingdao Haier Technology Co Ltd; Haier Smart Home Co Ltd
Current assignee: Qingdao Haier Technology Co Ltd; Haier Smart Home Co Ltd
Priority date: 2021-06-24
Filing date: 2021-06-24
Publication date: 2021-11-02

Abstract

The invention discloses a linkage scene execution method and device, a storage medium, and electronic equipment. The method comprises: when audio data is received, converting the audio data into text data, wherein the audio data is voice data collected within a target range by a voice device, and the voice device is located in a target network; when a scene text is recognized from the text data, acquiring from the text data the instruction text that remains after the scene text is removed; when an instruction text set contains a target text whose similarity to the instruction text is greater than a preset threshold, determining the target linkage scene corresponding to the target text as the linkage scene matched with the instruction text, wherein the instruction text set comprises execution texts matched with the linkage scenes in the target network; and executing the target linkage scene. The invention solves the technical problem that a linkage scene cannot be executed through a voice instruction.

Description

Linkage scene execution method and device, storage medium and electronic equipment

Technical Field

The invention relates to the field of Internet of things, in particular to a linkage scene execution method and device, a storage medium and electronic equipment.

Background

At present, many households use Internet of Things (IoT) home appliances. An IoT home appliance is generally an appliance that can connect to a network and, through network operation, perform functions such as adjusting device state and linking multiple devices. Examples include an IoT television, an IoT air conditioner, an IoT refrigerator, and an IoT speaker.

When a speaker in the IoT network is used to turn on other devices in that network, a start voice instruction must be issued to the speaker, and, because of how the speaker's recognition function works, a separate start instruction must be issued for each device. Even if several devices need to be started within a short time, the start voice instructions must be issued one by one, so the devices start one by one and the operation is cumbersome. To make it easier to turn on several devices at once, a corresponding linkage scene is usually set. For example, setting a "home scene" turns on a light, opens a curtain, turns on an air conditioner, and turns on a television at the same time.

However, the voice platform corresponding to the sound box, that is, the platform for voice collection and recognition in the internet of things, cannot correctly recognize the execution instruction of the linkage scene, and therefore, the linkage scene cannot be executed by using the voice instruction.

In view of the above problems, no effective solution has been proposed.

Disclosure of Invention

The embodiment of the invention provides a linkage scene execution method and device, a storage medium and electronic equipment, and at least solves the technical problem that a linkage scene cannot be executed through a voice instruction.

According to an aspect of an embodiment of the present invention, there is provided a linkage scene execution method including: when audio data is received, converting the audio data into text data, wherein the audio data is voice data collected within a target range by a voice device, and the voice device is located in a target network; when a scene text is recognized from the text data, acquiring from the text data the instruction text that remains after the scene text is removed; when an instruction text set contains a target text whose similarity to the instruction text is greater than a preset threshold, determining the target linkage scene corresponding to the target text as the linkage scene matched with the instruction text, wherein the instruction text set comprises execution texts matched with the linkage scenes in the target network; and executing the target linkage scene.

According to another aspect of the embodiments of the present invention, there is also provided a linkage scene executing apparatus, including: the conversion unit is used for converting the audio data into text data under the condition of receiving the audio data, wherein the audio data is voice data collected by voice equipment in a target range, and the voice equipment is positioned in a target network; an acquisition unit configured to acquire, from the text data, an instruction text from which the scene text is removed, in a case where the scene text is recognized from the text data; the determining unit is used for determining a target linkage scene corresponding to the target text as a linkage scene matched with the instruction text under the condition that the instruction text set contains the target text with the similarity larger than a preset threshold with the instruction text, wherein the instruction text set comprises an execution text matched with the linkage scene in the target network; and the execution unit is used for executing the target linkage scene.

According to still another aspect of the embodiments of the present invention, there is also provided a computer-readable storage medium having a computer program stored therein, wherein the computer program is configured to execute the linkage scenario execution method when running.

According to another aspect of the embodiments of the present invention, there is also provided an electronic device, including a memory and a processor, where the memory stores a computer program, and the processor is configured to execute the linkage scene execution method through the computer program.

In the embodiment of the invention, when audio data is received, scene text recognition is performed on the text data converted from the audio data. When the scene text is recognized, the instruction text in the text data is acquired, a target text whose similarity to the instruction text is greater than a preset threshold is determined, and the target linkage scene corresponding to the target text is executed. By performing scene text recognition on the converted audio data, the method determines the matching target linkage scene when the data contains the scene text for linkage scene execution, and then executes that scene. This achieves the purpose of executing a linkage scene through a voice instruction, achieves the technical effect of identifying the execution instruction of a linkage scene in audio data in order to execute the linkage scene, and solves the technical problem that a linkage scene cannot be executed through a voice instruction.

Drawings

The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention without limiting the invention. In the drawings:

FIG. 1 is a schematic diagram of an application environment of an alternative linkage scenario execution method according to an embodiment of the present invention;

FIG. 2 is a flow chart illustrating an alternative linkage scenario execution method according to an embodiment of the present invention;

FIG. 3 is a flow chart illustrating an alternative linkage scenario execution method according to an embodiment of the present invention;

FIG. 4 is a flow chart illustrating an alternative linkage scenario execution method according to an embodiment of the present invention;

FIG. 5 is a flow chart illustrating an alternative linkage scenario execution method according to an embodiment of the present invention;

FIG. 6 is a flow chart illustrating an alternative linkage scenario execution method according to an embodiment of the present invention;

FIG. 7 is a schematic structural diagram of an alternative linkage scenario execution apparatus according to an embodiment of the present invention;

FIG. 8 is a schematic structural diagram of an alternative electronic device according to an embodiment of the invention.

Detailed Description

In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.

According to an aspect of the embodiments of the present invention, there is provided a linkage scene execution method, which may be applied to the environment shown in FIG. 1. At least two terminal devices 104 form the target device network 102, and a linkage scene is configured for at least two terminal devices 104 in the target device network 102. The at least two terminal devices 104 include a device having a voice capture function. The target device network 102 interacts with the server 112 through the network 110. The server 112 runs a database 114 and a processing engine 116; the database 114 is used for receiving and storing data, and the processing engine 116 is used for executing S102 to S108 in sequence on the data in the database 114.

Audio data transmitted by the target device network 102 is received through the network 110 and converted into text data. The audio data is voice data collected within a target range by a voice device in the target device network 102. Text recognition is performed on the text data, and when the scene text is recognized from the text data, the instruction text is acquired from the text data. The instruction text is the text that remains in the text data after the scene text is removed. Once the instruction text is acquired, it is compared with the execution texts contained in the instruction text set, where the instruction text set includes the execution texts matched with the linkage scenes in the target device network. When the instruction text set contains a target text whose similarity to the instruction text is greater than a preset threshold, the target linkage scene corresponding to the target text is determined as the linkage scene matched with the instruction text, and the target linkage scene is executed. The server 112 transmits an execution instruction for the target linkage scene to the target device network 102 through the network 110, so that the target device network 102 executes the target linkage scene.

Optionally, in this embodiment, the terminal device may be a terminal device configured with a voice capture function, and may include but is not limited to at least one of the following: mobile phones (such as Android phones, iOS phones, etc.), notebook computers, tablet computers, palm computers, MID (Mobile Internet Devices), PAD, desktop computers, smart televisions, etc. The network may include, but is not limited to, a wired network and a wireless network, wherein the wired network comprises a local area network, a metropolitan area network, and a wide area network, and the wireless network comprises Bluetooth, Wi-Fi, and other networks that enable wireless communication. The server may be a single server, a server cluster composed of a plurality of servers, or a cloud server. The above is merely an example, and this is not limited in this embodiment.

As an optional implementation manner, as shown in fig. 2, the linkage scenario execution method includes:

S202, when audio data is received, converting the audio data into text data, wherein the audio data is voice data collected within a target range by a voice device, and the voice device is located in a target network;

S204, when a scene text is recognized from the text data, acquiring from the text data the instruction text that remains after the scene text is removed;

S206, when the instruction text set contains a target text whose similarity to the instruction text is greater than a preset threshold, determining the target linkage scene corresponding to the target text as the linkage scene matched with the instruction text, wherein the instruction text set comprises execution texts matched with the linkage scenes in the target network;

S208, executing the target linkage scene.

Optionally, the target network may be, but is not limited to, a device network that includes a voice device and consists of at least two devices. In the device network, the voice device may be a dedicated voice capture device that is independent of the other devices in the target network, or may be any device in the target network that has a voice capture function. For example, the voice device may be a smart speaker included in the target network, or may be a smart television with a voice capture function.

Optionally, a linkage scene indicates a scene that at least two terminal devices execute together. Multiple linkage scenes may be, but are not limited to being, configured in the target network. For example, in a device network made up of lighting equipment, a smart speaker, a smart television, and a smart air conditioner, a first linkage scene can be configured to turn on the smart television and the smart air conditioner while turning on the lighting equipment. A second linkage scene can be configured to play music through the smart speaker while turning off the lighting equipment. In the first linkage scene, the smart television with the voice capture function serves as the voice device that collects audio data; in the second linkage scene, the smart speaker serves as the voice device that collects audio data.

Optionally, when the target network includes a plurality of terminal devices that can serve as voice devices, the audio data collected by each of these terminal devices is received and processed separately to obtain more accurate text data, thereby avoiding omission of audio data.

Optionally, the audio data is collected when the terminal device serving as the voice device has been woken up. The terminal device may, but is not limited to, switch to the awakened state when it acquires the wake-up word in the standby state, so as to collect the audio data. The wake-up word may be a wake-up word that applies to the target network or one that applies to the voice device, and is usually a fixed word preset for the target network or the voice device.

Optionally, when the text data contains the scene text, it is determined that the scene text is recognized from the text data, and the text that remains in the text data after the scene text is removed is taken as the instruction text. The scene text is an identification text that distinguishes an instruction for executing a linkage scene from other instructions. It is configured in advance to identify the execution of a linkage scene.

Optionally, the instruction text set stores the execution text corresponding to each of the linkage scenes of the target network. By comparing the instruction text with the execution texts, the linkage scene matched with the instruction text is determined, and the linkage scene corresponding to the instruction text is executed.

Optionally, when each linkage scene is configured, a corresponding execution text is configured for that linkage scene and added to the instruction text set.

Optionally, in a case where the scene text is not recognized from the text data, it is determined that the voice command determined by the audio data is not an execution command of the linkage scene, and the command recognition processing is performed on the text data to execute the voice command corresponding to the audio data.

Optionally, when the target text is not included in the instruction text set, it is determined that there is no linkage scene matched with the execution text, and instruction recognition processing is performed on the text data to execute the voice instruction corresponding to the audio data.

In the embodiment of the application, when audio data is received, scene text recognition is performed on the text data converted from the audio data. When the scene text is recognized, the instruction text in the text data is acquired, a target text whose similarity to the instruction text is greater than a preset threshold is determined, and the target linkage scene corresponding to the target text is executed. By performing scene text recognition on the converted audio data, the method determines the matching target linkage scene when the data contains the scene text for linkage scene execution, and then executes that scene. This achieves the purpose of executing a linkage scene through a voice instruction, achieves the technical effect of identifying the execution instruction of a linkage scene in audio data in order to execute the linkage scene, and solves the technical problem that a linkage scene cannot be executed through a voice instruction.

As an alternative embodiment, as shown in fig. 3, after converting the audio data into text data, the method further includes:

S302, performing text recognition on the text data;

S304, when the scene text is recognized from the text data, determining that the scene text is included in the text data.

Optionally, text recognition is performed on the text data to determine whether it contains the scene text, and when the text data contains the scene text, it is determined that the scene text is recognized from the text data. The scene text is text configured for the execution of linkage scenes. For example, if the scene text is set as "execution scene", then when the text data converted from the audio data includes "execution scene", it is determined that the scene text is recognized.

Optionally, performing text recognition on the text data to identify the scene text may include, but is not limited to, recognizing the scene text with an error correction mechanism, so as to ensure the recognition accuracy and efficiency for the scene text.

In the embodiment of the application, a scene text is configured for the execution of linkage scenes, so that when the text data contains the scene text it is determined that the audio data contains an execution instruction for a linkage scene, and the linkage scene is executed according to that instruction. This achieves the purpose and effect of executing a linkage scene through a voice instruction.
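The source does not specify how the error correction mechanism works. As one possible illustration only, the sketch below tolerates small transcription errors by sliding a window over the text data and accepting the window most similar to the configured scene text. The function names, the `tolerance` value, and the use of `difflib.SequenceMatcher` are all assumptions made for this example, not part of the disclosed method.

```python
from difflib import SequenceMatcher


def find_scene_text(text: str, scene_text: str, tolerance: float = 0.85):
    """Return (start, end) of the window most similar to scene_text, or None.

    Hypothetical error-tolerant lookup: a window is accepted only when its
    similarity ratio reaches `tolerance` (an assumed value).
    """
    n = len(scene_text)
    best_span, best_score = None, 0.0
    for start in range(0, max(len(text) - n, 0) + 1):
        window = text[start:start + n]
        score = SequenceMatcher(None, window, scene_text).ratio()
        if score >= tolerance and score > best_score:
            best_span, best_score = (start, start + n), score
    return best_span


def extract_instruction_text(text: str, scene_text: str):
    """Remove the (possibly misrecognized) scene text and return the remainder."""
    span = find_scene_text(text, scene_text)
    if span is None:
        return None  # scene text not recognized
    start, end = span
    return (text[:start] + text[end:]).strip()
```

With the scene text "execution scene" from the example above, a transcription such as "execution scane home mode" would still yield the instruction text "home mode" under these assumptions, while an unrelated sentence yields `None`.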

As an alternative implementation, as shown in FIG. 4, after acquiring from the text data the instruction text that remains after the scene text is removed, the method further includes:

S402, comparing the instruction text in sequence with the execution texts contained in the instruction text set;

S404, when a target execution text consistent with the instruction text exists in the instruction text set, taking the target execution text as the target text.

Optionally, when it is determined that the text data contains the scene text, the data that remains after the scene text is removed from the text data is taken as the instruction text. Once the instruction text is acquired, the similarity between the instruction text and each execution text contained in the instruction text set is calculated. When an execution text in the instruction text set has a similarity of one hundred percent with the instruction text, it is determined that a target execution text consistent with the instruction text exists in the instruction text set. That target execution text is taken as the target text, and the target linkage scene matched with the target text is determined, which in turn determines the target linkage scene matched with the instruction text.

Optionally, the similarity parameter may be, but is not limited to, at least one of: common substring length, common subsequence length, edit distance, Hamming distance, and character cosine similarity. When the similarity parameter is a composite parameter formed from two or more of these parameters, a weight is preset for each constituent similarity parameter.

As an optional implementation manner, when the instruction text set does not include the target execution text, the similarity parameters between the instruction text and each execution text included in the instruction text set are acquired in sequence, and the execution text corresponding to the maximum value among the similarity parameters is taken as the target text.

Optionally, when the instruction text set does not include a target execution text whose similarity to the instruction text is one hundred percent, candidate execution texts whose similarity to the instruction text is greater than a preset threshold are determined in the instruction text set. When there is more than one candidate execution text, the candidate execution text with the maximum similarity parameter value is taken as the target text. When there is exactly one candidate execution text, that candidate is taken as the target text. When there is no candidate execution text, that is, no execution text has a similarity parameter greater than the preset threshold, it is determined that no target text is identified in the instruction text set.
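A composite similarity parameter of this kind can be sketched as follows. This is a minimal illustration, assuming three of the listed parameters (edit distance, common subsequence length, common substring length), a normalization by the longer string's length, and the weights shown in `WEIGHTS`. The source names the parameters but does not give normalizations or weight values, so those are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling one-row table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def common_subsequence_length(a: str, b: str) -> int:
    """Length of the longest common subsequence."""
    prev = [0] * (len(b) + 1)
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if ca == cb else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def common_substring_length(a: str, b: str) -> int:
    """Length of the longest common contiguous substring."""
    best = 0
    prev = [0] * (len(b) + 1)
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            run = prev[j - 1] + 1 if ca == cb else 0
            curr.append(run)
            best = max(best, run)
        prev = curr
    return best


# Hypothetical preset weights for each similarity parameter (they sum to 1).
WEIGHTS = {"edit": 0.5, "subsequence": 0.3, "substring": 0.2}


def composite_similarity(a: str, b: str) -> float:
    """Weighted combination of normalized parameters, in the range [0, 1]."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty strings are treated as identical
    scores = {
        "edit": 1 - edit_distance(a, b) / longest,
        "subsequence": common_subsequence_length(a, b) / longest,
        "substring": common_substring_length(a, b) / longest,
    }
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)
```

Normalizing every parameter to the same range before weighting keeps one parameter from dominating purely because of its scale; other normalizations would be equally valid.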
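The selection rule described above (exact match first, otherwise the best candidate above the threshold, otherwise no target text) might be sketched as follows. The function names, the default threshold of 0.8, and the stand-in similarity based on `difflib.SequenceMatcher` are assumptions for illustration; any similarity parameter with values in [0, 1] could be passed in instead.

```python
from difflib import SequenceMatcher
from typing import Callable, Iterable, Optional


def default_similarity(a: str, b: str) -> float:
    """Stand-in similarity in [0, 1]; the source does not prescribe this measure."""
    return SequenceMatcher(None, a, b).ratio()


def select_target_text(
    instruction_text: str,
    instruction_text_set: Iterable[str],
    threshold: float = 0.8,  # hypothetical preset threshold
    similarity: Callable[[str, str], float] = default_similarity,
) -> Optional[str]:
    """Pick the target text for an instruction text, or None if there is none."""
    execution_texts = list(instruction_text_set)
    # A consistent execution text (similarity of one hundred percent) wins outright.
    if instruction_text in execution_texts:
        return instruction_text
    # Otherwise keep only candidates whose similarity exceeds the threshold.
    scored = [(similarity(instruction_text, text), text) for text in execution_texts]
    candidates = [(score, text) for score, text in scored if score > threshold]
    if not candidates:
        return None  # no target text identified in the instruction text set
    # With one or more candidates, the highest similarity is chosen.
    return max(candidates, key=lambda pair: pair[0])[1]
```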

In the embodiment of the application, the instruction text set comprises an execution text matched with the linkage scene, and whether the execution text corresponding to the configured linkage scene exists is determined by comparing the similarity between the instruction text and the execution text, so as to determine whether the audio data is an execution instruction of the linkage scene. By comparing the similarity between the execution text and the instruction text, the identification process of the instruction text is simplified, the identification difficulty is reduced, and the identification accuracy is improved.

As an alternative implementation, as shown in fig. 5, determining the target linkage scene corresponding to the target text as the linkage scene matching the instruction text includes:

S502, determining a scene identifier corresponding to the target text, wherein the scene identifier is used for identifying a linkage scene in the target network;

S504, determining the linkage scene corresponding to the scene identifier as the target linkage scene.

Optionally, when linkage scenes are configured for the target network, a scene identifier is configured for each linkage scene. The scene identifier is associated with the linkage scene and is used to identify it. Once the target text is determined, the scene identifier corresponding to the target text is determined, and the target linkage scene is then determined through the scene identifier.

The flow of executing the target linkage scene based on the audio data may be, but is not limited to, the flow shown in FIG. 6. S602 is executed to receive audio data. When the audio data is received, S604 is executed to convert the audio data into text data. Once the text data converted from the audio data is obtained, S606 is executed to identify whether the text data contains the scene text. For example, if the scene text configured for executing linkage scenes is "execution scene", the text data is searched for the text "execution scene". If it is determined in S606 that the text data does not contain the scene text "execution scene", S616 is executed to execute the voice instruction corresponding to the text data.

If it is determined in S606 that the text data contains the scene text "execution scene", S608 is executed to extract the instruction text from the text data. The text that remains after the scene text is removed from the text data is taken as the instruction text. Once the instruction text is acquired, S610 is executed to judge whether the instruction text set contains a target text whose similarity to the instruction text is higher than a preset threshold. The similarity between the instruction text and each execution text contained in the instruction text set is calculated to obtain a target text whose similarity parameter is greater than the preset threshold. If the judgment in S610 is no, that is, the instruction text set does not contain a target text, S616 is executed to execute the voice instruction corresponding to the text data.

If the judgment in S610 is yes, that is, a target text is determined in the instruction text set, S612 is executed to determine the scene identifier corresponding to the target text. Once the scene identifier is determined, S614 is executed to determine the target linkage scene corresponding to the scene identifier and execute the target linkage scene.
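The branching in S606 to S616 can be summarized in a short sketch. It assumes the audio has already been converted to text (S602 and S604), uses `difflib.SequenceMatcher` as a stand-in similarity, and returns a description of what would be executed rather than controlling real devices. The scene identifiers, execution texts, action strings, and threshold are hypothetical values chosen for the example.

```python
from difflib import SequenceMatcher

SCENE_TEXT = "execution scene"  # scene text from the example in the description
THRESHOLD = 0.8                 # hypothetical preset threshold

# Hypothetical instruction text set: execution text -> scene identifier.
EXECUTION_TEXT_TO_SCENE_ID = {"home mode": "scene-001", "sleep mode": "scene-002"}

# Hypothetical linkage scenes keyed by scene identifier.
SCENE_ID_TO_ACTIONS = {
    "scene-001": ["light:on", "curtain:open", "air_conditioner:on", "tv:on"],
    "scene-002": ["speaker:play_music", "light:off"],
}


def handle_text_data(text_data: str) -> dict:
    """Walk S606 to S616 for text data that has already been converted from audio."""
    if SCENE_TEXT not in text_data:                                   # S606
        return {"kind": "voice_command", "text": text_data}           # S616
    instruction_text = text_data.replace(SCENE_TEXT, "", 1).strip()   # S608
    scored = [
        (SequenceMatcher(None, instruction_text, text).ratio(), text)
        for text in EXECUTION_TEXT_TO_SCENE_ID
    ]
    score, target_text = max(scored)                                  # S610
    if score <= THRESHOLD:
        return {"kind": "voice_command", "text": text_data}           # S616
    scene_id = EXECUTION_TEXT_TO_SCENE_ID[target_text]                # S612
    return {                                                          # S614
        "kind": "linkage_scene",
        "scene_id": scene_id,
        "actions": SCENE_ID_TO_ACTIONS[scene_id],
    }
```

Both negative branches fall through to the same ordinary voice-instruction path, which mirrors how S616 is reached from S606 and from S610 in the described flow.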

As an optional implementation, before converting the audio data into text data, the method further includes: configuring a target linkage scene for a target device group operating in the target network, wherein the target device group comprises at least two devices, and the target linkage scene indicates that the working state of the target device group is adjusted when a target execution condition is met.

Optionally, at least two devices included in the target network may constitute a target device group, for which a target linkage scene is configured. Each device in the target network may form different device groups with different devices, and a linkage scene may be configured for each such group. For example, the lighting equipment in the target device network of the above example belongs to both the first linkage scene and the second linkage scene.

Optionally, configuring a linkage scene may include, but is not limited to, configuring a scene identifier, configuring an execution text, configuring an execution condition, and configuring an execution action. The scene identifier and the execution text are an identifier and a text that are configured for the linkage scene to distinguish it from other linkage scenes, and an association is established between them.

Optionally, configuring the execution condition of the linkage scene means configuring a trigger condition for executing the linkage scene. The trigger conditions may include, but are not limited to, instruction triggering and automatic triggering. Instruction triggering means that, when the audio data includes the scene text and an execution text, execution of the linkage scene matching that execution text is triggered. Automatic triggering means that execution of the linkage scene is triggered automatically when the trigger condition is reached. For example, a trigger condition is set for the second linkage scene described above: automatic triggering at 22:00. Then, at 22:00, the smart speaker is automatically turned on and the lighting equipment is turned off.

Optionally, configuring the execution action of the linkage scene means configuring an execution action for each device indicated by the linkage scene. The execution action may be an action that adjusts a device state. For example, the smart speaker plays music and the lighting equipment is adjusted to an operating state.

As an optional implementation manner, the at least two devices report their device attributes to a control system of the target network, so that the control system monitors the target execution condition; and when the device attributes of the at least two devices satisfy the target execution condition, the execution instruction is responded to so as to execute the target linkage scene.

Optionally, the control system of the target network may be, but is not limited to, a server of the target network, and the target execution condition may be, but is not limited to, that the scene text and the execution text are recognized or that the execution condition of the linkage scene is satisfied.

In the embodiment of the application, the execution conditions and execution actions of the linkage scene are configured, so that a device group formed by at least two devices can execute in linkage. In addition, by configuring the scene identifier and the execution text, the linkage scene can be executed through a voice instruction.
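The four configuration items (scene identifier, execution text, execution condition, execution action) could be held in a small record like the one sketched below. The class and field names are hypothetical, and the automatic trigger is assumed to be a time of day (22:00 here), compared at minute granularity.

```python
from dataclasses import dataclass
from datetime import time
from typing import Optional


@dataclass
class LinkageScene:
    """Hypothetical record for one configured linkage scene."""
    scene_id: str                            # scene identifier
    execution_text: str                      # text compared with the instruction text
    actions: dict                            # device name -> execution action
    auto_trigger_at: Optional[time] = None   # None: instruction triggering only


def should_auto_trigger(scene: LinkageScene, now: time) -> bool:
    """True when the scene has a time trigger and `now` falls in that minute."""
    if scene.auto_trigger_at is None:
        return False
    trigger = scene.auto_trigger_at
    return (now.hour, now.minute) == (trigger.hour, trigger.minute)


# Example configuration for the second linkage scene (values are illustrative).
second_scene = LinkageScene(
    scene_id="scene-002",
    execution_text="sleep mode",
    actions={"smart_speaker": "play_music", "lighting": "off"},
    auto_trigger_at=time(22, 0),
)
```

A scene configured without `auto_trigger_at` would then run only through instruction triggering, that is, when its execution text is matched in recognized speech.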
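One way to picture the reporting and monitoring described here is a control system that stores the latest attributes reported by each device and checks them against the target execution condition before acting on an execution instruction. The class below is a toy sketch: the attribute format, the method names, and the idea of expressing the condition as required attribute values are all assumptions, since the source does not define them.

```python
class ControlSystem:
    """Toy control system: stores reported attributes and checks a condition."""

    def __init__(self, required: dict):
        # required: device name -> {attribute: value} that must hold (assumed format)
        self.required = required
        self.reported = {}

    def report(self, device: str, attributes: dict) -> None:
        """Called when a device reports its current attributes."""
        self.reported.setdefault(device, {}).update(attributes)

    def condition_met(self) -> bool:
        """True when every required attribute was reported with the required value."""
        return all(
            self.reported.get(device, {}).get(key) == value
            for device, attrs in self.required.items()
            for key, value in attrs.items()
        )

    def handle_execution_instruction(self, scene_id: str):
        """Return the scene to execute, or None if the condition is not met."""
        return scene_id if self.condition_met() else None
```

For instance, a condition requiring both the lighting equipment and the smart speaker to report `online: True` would block execution until both reports have arrived.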

It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present invention is not limited by the order of acts, as some steps may occur in other orders or concurrently in accordance with the invention. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required by the invention.

According to another aspect of the embodiment of the invention, a linkage scene executing device for implementing the linkage scene executing method is also provided. As shown in fig. 7, the apparatus includes:

a conversion unit 702, configured to convert, in a case that audio data is received, the audio data into text data, where the audio data is voice data collected by a voice device in a target range, and the voice device is located in a target network;

an obtaining unit 704, configured to obtain, in a case where a scene text is recognized from the text data, an instruction text by removing the scene text from the text data;

a determining unit 706, configured to determine, when a target text with a similarity greater than a preset threshold to the instruction text is included in the instruction text set, a target linkage scene corresponding to the target text as a linkage scene matched with the instruction text, where the instruction text set includes an execution text matched with a linkage scene in a target network;

and the execution unit 708 is used for executing the target linkage scene.
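The cooperation of units 702 to 708 can be sketched as follows. This is an illustrative Python outline under stated assumptions, not the apparatus itself: the speech recognizer is injected as a callable, the scene text is assumed to be a fixed keyword, and similarity is computed with the standard-library `difflib.SequenceMatcher`, whereas the application does not name a particular similarity measure. All class, method and parameter names are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Callable, Dict, Optional


class LinkageSceneExecutor:
    """Hypothetical wiring of units 702-708; all names are illustrative."""

    def __init__(self, transcribe: Callable[[bytes], str], scene_text: str,
                 instruction_set: Dict[str, str], threshold: float,
                 run_scene: Callable[[str], None]) -> None:
        self.transcribe = transcribe            # stands in for a speech recognizer
        self.scene_text = scene_text            # assumed keyword, e.g. "scene"
        self.instruction_set = instruction_set  # execution text -> scene identifier
        self.threshold = threshold              # preset similarity threshold
        self.run_scene = run_scene

    def convert(self, audio: bytes) -> str:                  # conversion unit 702
        return self.transcribe(audio)

    def obtain(self, text: str) -> Optional[str]:            # obtaining unit 704
        if self.scene_text not in text:
            return None
        # Remove the scene text and normalize the remaining whitespace.
        return " ".join(text.replace(self.scene_text, " ").split())

    def determine(self, instruction: str) -> Optional[str]:  # determining unit 706
        best_id, best_score = None, 0.0
        for execution_text, scene_id in self.instruction_set.items():
            score = SequenceMatcher(None, instruction, execution_text).ratio()
            if score > best_score:
                best_id, best_score = scene_id, score
        return best_id if best_score > self.threshold else None

    def execute(self, audio: bytes) -> Optional[str]:        # execution unit 708
        instruction = self.obtain(self.convert(audio))
        scene_id = self.determine(instruction) if instruction else None
        if scene_id is not None:
            self.run_scene(scene_id)
        return scene_id
```

Injecting the recognizer and the scene runner keeps the sketch independent of any particular speech service or device protocol, neither of which is specified here.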

Optionally, the linkage scene executing apparatus further includes a recognition unit, which operates after the audio data is converted into text data and includes:

the recognition module is used for performing text recognition on the text data;

the first determining module is used for determining that the scene text is contained in the text data under the condition that the scene text is identified from the text data.

Optionally, the linkage scene executing apparatus further includes a comparing unit, which operates after the instruction text is obtained by removing the scene text from the text data and includes:

the comparison module is used for sequentially comparing the instruction text with the execution text contained in the instruction text set;

and the second determining module is used for taking the target execution text as the target text under the condition that the target execution text consistent with the instruction text exists in the instruction text set.

Optionally, the comparing unit further includes: a first obtaining module, configured to sequentially obtain, in a case where the instruction text set does not include the target execution text, similarity parameters between the instruction text and each execution text contained in the instruction text set;

and the third determining module is used for taking the execution text corresponding to the maximum numerical value in the similarity parameters as the target text.
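The two-stage selection performed by the comparing unit (a consistent execution text first, otherwise the execution text with the largest similarity parameter) can be sketched as below. The function name, the default threshold of 0.6 and the use of `difflib.SequenceMatcher` as the similarity parameter are assumptions for illustration; the application leaves the similarity measure open.

```python
from difflib import SequenceMatcher
from typing import Iterable, Optional


def pick_target_text(instruction: str, execution_texts: Iterable[str],
                     threshold: float = 0.6) -> Optional[str]:
    texts = list(execution_texts)
    # Stage 1: sequential comparison for an execution text consistent
    # with (identical to) the instruction text.
    for text in texts:
        if text == instruction:
            return text
    # Stage 2: compute a similarity parameter for each execution text
    # and keep the one with the maximum value.
    best_text, best_score = None, 0.0
    for text in texts:
        score = SequenceMatcher(None, instruction, text).ratio()
        if score > best_score:
            best_text, best_score = text, score
    # The determining unit additionally requires the similarity to exceed
    # the preset threshold before a linkage scene is matched.
    return best_text if best_score > threshold else None
```

Checking for an identical text first avoids computing similarity parameters at all in the common case where the spoken instruction matches a configured execution text exactly.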

Optionally, the determining unit 706 includes:

the fourth determining module is used for determining a scene identifier corresponding to the target text, wherein the scene identifier is used for identifying a linkage scene in the target network;

and the fifth determining module is used for determining the linkage scene corresponding to the scene identifier as the target linkage scene.
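The lookup from target text to scene identifier, and from scene identifier to linkage scene, can be illustrated with two tables. The identifiers, texts and actions below are invented for the example, and the possibility that several execution texts share one scene identifier is an assumption rather than something stated in the application.

```python
from typing import Dict, List, NamedTuple, Optional


class LinkageScene(NamedTuple):
    name: str
    actions: List[str]  # illustrative device actions


# Hypothetical tables; every identifier and text is made up for illustration.
TEXT_TO_SCENE_ID: Dict[str, str] = {
    "go home": "scene-001",
    "i am home": "scene-001",   # assumption: texts may share a scene identifier
    "good night": "scene-002",
}
SCENES: Dict[str, LinkageScene] = {
    "scene-001": LinkageScene("home", ["light:on", "air_conditioner:on"]),
    "scene-002": LinkageScene("sleep", ["light:off", "curtain:close"]),
}


def resolve_scene(target_text: str) -> Optional[LinkageScene]:
    """Map the target text to a scene identifier, then to the linkage scene."""
    scene_id = TEXT_TO_SCENE_ID.get(target_text)
    return SCENES.get(scene_id) if scene_id is not None else None
```

Keeping the identifier as an intermediate key means the scene definition can change without touching the configured execution texts.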

Optionally, the linkage scene executing apparatus further includes a configuration unit, configured to configure a target linkage scene for a target device group operating in a target network before converting the audio data into text data, where the target device group includes at least two devices, and the target linkage scene is used to indicate that an operating state of the target device group is adjusted when a target executing condition is satisfied.

Optionally, the configuration unit is further configured such that the at least two devices report their device attributes to a control system of the target network, so that the control system monitors the target execution condition, and the target linkage scene is executed in response to an execution instruction in a case where the device attributes of the at least two devices satisfy the target execution condition.

According to another aspect of the embodiment of the present invention, there is also provided an electronic device for implementing the linkage scenario execution method, where the electronic device may be the terminal device or the server shown in fig. 1. The present embodiment takes the electronic device as a server as an example for explanation. As shown in fig. 8, the electronic device comprises a memory 802 and a processor 804, the memory 802 having a computer program stored therein, the processor 804 being arranged to perform the steps of any of the above-described method embodiments by means of the computer program.

Optionally, in this embodiment, the electronic device may be located in at least one network device of a plurality of network devices of a computer network.

Optionally, in this embodiment, the processor may be configured to execute the following steps by a computer program:

S1, in a case where audio data is received, converting the audio data into text data, wherein the audio data is voice data collected by a voice device in a target range, and the voice device is located in a target network;

S2, in a case where a scene text is identified from the text data, acquiring an instruction text by removing the scene text from the text data;

S3, in a case where the instruction text set contains a target text whose similarity to the instruction text is greater than a preset threshold, determining a target linkage scene corresponding to the target text as a linkage scene matched with the instruction text, wherein the instruction text set comprises an execution text matched with the linkage scene in the target network;

and S4, executing the target linkage scene.

Optionally, those skilled in the art will understand that the structure shown in fig. 8 is only illustrative, and the electronic device may also be a terminal device such as a smartphone (e.g., an Android phone or an iOS phone), a tablet computer, a palmtop computer, a Mobile Internet Device (MID), or a PAD. Fig. 8 does not limit the structure of the electronic device. For example, the electronic device may also include more or fewer components (e.g., network interfaces) than shown in fig. 8, or have a different configuration from that shown in fig. 8.

The memory 802 may be used to store software programs and modules, such as the program instructions/modules corresponding to the linkage scene execution method and apparatus in the embodiments of the present invention. The processor 804 executes various functional applications and data processing by running the software programs and modules stored in the memory 802, thereby implementing the linkage scene execution method described above. The memory 802 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 802 may further include memory located remotely from the processor 804, which can be connected to the terminal over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof. The memory 802 may be used, but is not limited to being used, to store information such as audio data, text data, linkage scenes, and scene texts. As an example, as shown in fig. 8, the memory 802 may include, but is not limited to, the conversion unit 702, the obtaining unit 704, the determining unit 706, and the execution unit 708 of the linkage scene execution apparatus. In addition, the memory 802 may further include, but is not limited to, other module units of the linkage scene execution apparatus, which are not described in detail in this example.

Optionally, the transmission device 806 is configured to receive or transmit data via a network. Examples of the network may include a wired network and a wireless network. In one example, the transmission device 806 includes a network interface controller (NIC), which can be connected to a router and other network devices via a network cable so as to communicate with the internet or a local area network. In another example, the transmission device 806 is a radio frequency (RF) module, which communicates with the internet wirelessly.

In addition, the electronic device further includes: a display 808 for displaying the text data and the linkage scene; and a connection bus 810 for connecting the respective module parts in the above-described electronic apparatus.

In other embodiments, the terminal device or the server may be a node in a distributed system, where the distributed system may be a blockchain system formed by a plurality of nodes connected through network communication. The nodes can form a peer-to-peer (P2P) network, and any type of computing device, such as a server, a terminal, or another electronic device, can become a node in the blockchain system by joining the peer-to-peer network.

According to an aspect of the application, a computer program product or computer program is provided, comprising computer instructions, the computer instructions being stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions to cause the computer device to perform the methods provided in the various alternative implementations of the linkage scenario execution aspect described above. Wherein the computer program is arranged to perform the steps of any of the above method embodiments when executed.

Optionally, in the present embodiment, the above-mentioned computer-readable storage medium may be configured to store a computer program for executing steps S1 to S4 described above.

Optionally, in this embodiment, a person skilled in the art will understand that all or part of the steps in the methods of the foregoing embodiments may be implemented by a program instructing hardware associated with the terminal device, where the program may be stored in a computer-readable storage medium, and the storage medium may include: a flash disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and the like.

The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.

The integrated unit in the above embodiments, if implemented in the form of a software functional unit and sold or used as a separate product, may be stored in the above computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing one or more computer devices (which may be personal computers, servers, network devices, etc.) to execute all or part of the steps of the method according to the embodiments of the present invention.

In the above embodiments of the present invention, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.

In the several embodiments provided in the present application, it should be understood that the disclosed client may be implemented in other manners. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one type of division of logical functions, and there may be other divisions when actually implemented, for example, a plurality of units or components may be combined or may be integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, units or modules, and may be in an electrical or other form.

The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.

In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.

The foregoing is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make various improvements and refinements without departing from the principle of the present invention, and these improvements and refinements should also be regarded as falling within the protection scope of the present invention.

Claims

1. A linkage scene execution method, comprising:

under the condition of receiving audio data, converting the audio data into text data, wherein the audio data is voice data collected by a voice device in a target range, and the voice device is positioned in a target network;

under the condition that a scene text is identified from the text data, acquiring an instruction text by removing the scene text from the text data;

under the condition that an instruction text set contains a target text with the similarity to the instruction text being larger than a preset threshold, determining a target linkage scene corresponding to the target text as a linkage scene matched with the instruction text, wherein the instruction text set comprises an execution text matched with the linkage scene in the target network;

and executing the target linkage scene.

2. The method of claim 1, after converting the audio data to text data, further comprising:

performing text recognition on the text data;

determining that the scene text is included in the text data if the scene text is identified from the text data.

3. The method according to claim 1, further comprising, after obtaining the instruction text for removing the scene text from the text data:

sequentially comparing the instruction text with the execution texts contained in the instruction text set;

and in the case that a target execution text which is consistent with the instruction text exists in the instruction text set, taking the target execution text as the target text.

4. The method of claim 3, wherein:

under the condition that the target execution text is not included in the instruction text set, sequentially acquiring similarity parameters between the instruction text and each execution text contained in the instruction text set;

and taking the execution text corresponding to the maximum numerical value in the similarity parameters as the target text.

5. The method according to claim 1, wherein the determining the target linkage scene corresponding to the target text as the linkage scene matching the instruction text comprises:

determining a scene identifier corresponding to the target text, wherein the scene identifier is used for identifying a linkage scene in the target network;

and determining the linkage scene corresponding to the scene identifier as the target linkage scene.

6. The method according to any one of claims 1 to 5, further comprising, before converting the audio data into text data:

configuring the target linkage scene for a target equipment group operating in the target network, wherein the target equipment group comprises at least two pieces of equipment, and the target linkage scene is used for indicating that the working state of the target equipment group is adjusted under the condition that a target execution condition is met.

7. The method of claim 6, wherein:

the at least two devices report their device attributes to a control system of the target network, so that the control system monitors the target execution condition;

and responding to an execution instruction to execute the target linkage scene under the condition that the device attributes of the at least two devices meet the target execution condition.

8. A linkage scene execution apparatus, comprising:

the conversion unit is used for converting the audio data into text data under the condition of receiving the audio data, wherein the audio data is voice data collected by a voice device in a target range, and the voice device is positioned in a target network;

an acquisition unit configured to acquire, from the text data, an instruction text from which the scene text is removed, in a case where the scene text is recognized from the text data;

the determining unit is used for determining a target linkage scene corresponding to the target text as a linkage scene matched with the instruction text under the condition that the instruction text set contains the target text whose similarity to the instruction text is larger than a preset threshold, wherein the instruction text set comprises an execution text matched with the linkage scene in the target network;

and the execution unit is used for executing the target linkage scene.

9. A computer-readable storage medium, characterized in that the computer-readable storage medium comprises a stored program which when executed performs the method of any of claims 1 to 7.

10. An electronic device comprising a memory and a processor, characterized in that the memory has stored therein a computer program, the processor being arranged to execute the method of any of claims 1 to 7 by means of the computer program.