CN113573132A - Multi-application screen splicing method and device based on voice realization and storage medium - Google Patents
- Publication number
- CN113573132A CN202110841048.5A
- Authority
- CN
- China
- Prior art keywords
- application
- voice
- fuzzy
- determining
- application field
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/431—Generation of visual interfaces for content selection or interaction; Content or additional data rendering
- H04N21/4312—Generation of visual interfaces for content selection or interaction; Content or additional data rendering involving specific graphical features, e.g. screen layout, special fonts or colors, blinking icons, highlights or animations
- H04N21/4316—Generation of visual interfaces for content selection or interaction; Content or additional data rendering involving specific graphical features, e.g. screen layout, special fonts or colors, blinking icons, highlights or animations for displaying supplemental content in a region of the screen, e.g. an advertisement in a separate window
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/443—OS processes, e.g. booting an STB, implementing a Java virtual machine in an STB or power management in an STB
- H04N21/4438—Window management, e.g. event handling following interaction with the user interface
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L2015/223—Execution procedure of a spoken command
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Theoretical Computer Science (AREA)
- Signal Processing (AREA)
- Data Mining & Analysis (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Software Systems (AREA)
- Business, Economics & Management (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Marketing (AREA)
- User Interface Of Digital Computer (AREA)
Abstract
The invention discloses a multi-application screen splicing method, device, and storage medium implemented based on voice, where the method comprises the following steps: acquiring voice text information, and determining a plurality of application field categories corresponding to the voice text information; determining a plurality of target application programs according to the voice text information and the plurality of application field categories; and performing screen splicing display on the plurality of target application programs. The invention can identify, from the user's voice, the plurality of application programs that need to be opened and display the identified application programs in a spliced screen, thereby effectively solving the problem that existing voice interaction technology is difficult to apply to the screen splicing scene of a smart screen.
Description
Technical Field
The invention relates to the field of voice interaction, and in particular to a multi-application screen splicing method and device implemented based on voice, and a storage medium.
Background
With hardware and system upgrades, the traditional smart television has evolved into the smart screen. Unlike the traditional mode, in which a single application occupies the whole system, the smart screen can open multiple applications in the system simultaneously and present and interact with them in separate screen windows. However, voice interaction technology usually has only one dimension when opening an application: it can only open a specified application or jump to a specified page of an application. Existing voice interaction technology is therefore difficult to apply to the screen splicing scene of the smart screen, and further optimization and improvement are needed.
Thus, there is still a need for improvement and development of the prior art.
Disclosure of Invention
The technical problem to be solved by the present invention is to provide a multi-application screen splicing method, device, and storage medium implemented based on voice, aiming at the problem that existing voice interaction technology is difficult to apply to the screen splicing scene of the smart screen.
The technical solution adopted by the invention to solve this problem is as follows:
in a first aspect, an embodiment of the present invention provides a multi-application screen splicing method implemented based on voice, where the method includes:
acquiring voice text information, and determining a plurality of application field categories corresponding to the voice text information;
determining a plurality of target application programs according to the voice text information and the plurality of application field categories;
and performing screen splicing display on the plurality of target application programs.
In one embodiment, the acquiring of the voice text information includes:
acquiring a voice instruction;
and performing text conversion on the voice instruction to obtain the voice text information.
In one embodiment, the determining a plurality of target application programs according to the voice text information and the plurality of application field categories includes:
matching each application field category of the plurality of application field categories against the voice text information to obtain the application name corresponding to each application field category;
and determining the plurality of target application programs according to the application names corresponding to the application field categories.
In one embodiment, the determining a plurality of target application programs according to the voice text information and the plurality of application field categories includes:
taking any application field category that is not successfully matched among the plurality of application field categories as a fuzzy application field;
acquiring the number of fuzzy application fields and historical operation data;
and determining the target application program corresponding to each fuzzy application field according to the number of fuzzy application fields and the historical operation data.
In one embodiment, the determining, according to the number of fuzzy application fields and the historical operation data, a target application program corresponding to the fuzzy application field includes:
when the number of fuzzy application fields is 1, determining, according to the historical operation data, the application program opened the most times in the application field corresponding to the fuzzy application field, and taking it as the target application program corresponding to the fuzzy application field.
In one embodiment, the determining, according to the number of the fuzzy application fields and the historical operation data, a target application program corresponding to the fuzzy application field includes:
when the number of fuzzy application fields is greater than 1, determining the category priority corresponding to each fuzzy application field;
and determining a target application program corresponding to each fuzzy application field according to the category priority and the historical operation data.
In one embodiment, the determining a target application program corresponding to each of the fuzzy application fields according to the category priority and the historical operation data includes:
when the category priority corresponding to the fuzzy application field is the highest priority, taking the fuzzy application field as first data;
determining the application program with the largest opening times in the application field corresponding to the first data according to the historical operation data to obtain a target application program corresponding to the first data;
when the category priority corresponding to the fuzzy application field is not the highest priority, taking the fuzzy application field as second data;
and determining, according to the historical operation data, the application program in the application field corresponding to the second data that has been opened the most times in combination with the target application program corresponding to the first data, to obtain the target application program corresponding to the second data.
In one embodiment, the performing screen splicing display on the plurality of target application programs includes:
acquiring a screen splicing template, and determining a plurality of windows according to the screen splicing template, wherein the windows correspond one-to-one to the plurality of target application programs;
and opening the target application programs according to the windows.
In a second aspect, an embodiment of the present invention further provides a multi-application screen splicing device implemented based on voice, where the device includes:
the acquisition module is used for acquiring voice text information and determining a plurality of application field categories corresponding to the voice text information;
the determining module is used for determining a plurality of target application programs according to the voice text information and the plurality of application field categories;
and the screen splicing module is used for performing screen splicing display on the plurality of target application programs.
In a third aspect, an embodiment of the present invention further provides a computer-readable storage medium storing a plurality of instructions, where the instructions are adapted to be loaded and executed by a processor to implement the steps of any of the above multi-application screen splicing methods implemented based on voice.
The invention has the following beneficial effects: voice text information is acquired, and a plurality of application field categories corresponding to the voice text information are determined; a plurality of target application programs are determined according to the voice text information and the plurality of application field categories; and the plurality of target application programs are displayed in a spliced screen. The invention can identify, from the user's voice, the plurality of application programs that need to be opened and display them in a spliced screen, thereby effectively solving the problem that existing voice interaction technology is difficult to apply to the screen splicing scene of a smart screen.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly described below. It is obvious that the drawings in the following description show only some embodiments of the present invention; those skilled in the art can obtain other drawings from these drawings without creative effort.
Fig. 1 is a simple flow chart of a multi-application screen splicing method based on voice implementation according to an embodiment of the present invention.
Fig. 2 is a detailed flowchart of a multi-application screen splicing method based on voice implementation according to an embodiment of the present invention.
Fig. 3 is a connection diagram of internal modules of a multi-application screen splicing device based on voice implementation according to an embodiment of the present invention.
Fig. 4 is a schematic block diagram of a terminal according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions, and advantages of the present invention clearer, the present invention is described in further detail below with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit it.
It should be noted that, where directional indications (such as up, down, left, right, front, and back) are involved in the embodiments of the present invention, they are only used to explain the relative positional relationship and movement of components in a specific posture (as shown in the drawings); if the specific posture changes, the directional indications change accordingly.
To address these defects in the prior art, the invention provides a multi-application screen splicing method implemented based on voice, which comprises the following steps: acquiring voice text information, and determining a plurality of application field categories corresponding to the voice text information; determining a plurality of target application programs according to the voice text information and the plurality of application field categories; and performing screen splicing display on the plurality of target application programs. The invention can identify, from the user's voice, the plurality of application programs that need to be opened and display them in a spliced screen, effectively solving the problem that existing voice interaction technology is difficult to apply to the screen splicing scene of a smart screen.
As shown in fig. 1, the method comprises the steps of:
step S100, voice text information is obtained, and a plurality of application field types corresponding to the voice text information are determined.
Specifically, since this embodiment aims to implement multi-application screen splicing based on voice, the user's voice may indicate several application programs that need to be opened. When the smart screen acquires voice text information generated from the user's voice, it can determine the plurality of application field categories contained in that information. The application programs in different application field categories do not overlap; that is, one application program belongs to only one application field category.
For example, when the voice text information is "watch a movie and play a game," the smart screen can determine two application field categories, the "video application field" and the "game application field," from the voice text information. The video application field may include application programs such as Tencent Video, Mango TV, and iQIYI; the game application field may include application programs such as Peace Elite, Happy Xiaole, and Honor of Kings.
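The domain-classification step above can be sketched as a simple keyword lookup. The patent does not disclose the actual classifier, so the keyword table and category names below are illustrative assumptions:

```python
# Hypothetical keyword table mapping trigger words to application field
# categories; the real classifier used by the smart screen is not specified.
DOMAIN_KEYWORDS = {
    "video application field": ["movie", "video", "watch"],
    "game application field": ["game", "play"],
    "social application field": ["chat", "message"],
}

def classify_domains(voice_text: str) -> list[str]:
    """Return the application field categories mentioned in the voice text."""
    text = voice_text.lower()
    return [domain for domain, words in DOMAIN_KEYWORDS.items()
            if any(w in text for w in words)]
```

A production system would likely use an NLU intent classifier here; keyword matching only illustrates the one-utterance-to-many-categories mapping.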
In one implementation, the acquiring of the voice text information includes the following steps:
s101, acquiring a voice instruction;
and step S102, performing text conversion on the voice instruction to obtain the voice text information.
Specifically, in this embodiment, one or more voice sensors are arranged on the smart screen in advance to detect in real time whether a voice instruction from the user exists. When the smart screen is turned on and receives a voice instruction from the user, it immediately performs text conversion on the instruction to obtain the corresponding voice text information.
As shown in fig. 1, the method further comprises the steps of:
and S200, determining a plurality of target application programs according to the voice text information and the plurality of application field categories.
Specifically, since the application field categories reflect the application fields to which the application programs that the user wants the smart screen to open belong, this embodiment determines one target application program in each application field category to obtain a plurality of target application programs. The target application programs are the application programs that the user's voice instruction requires the smart screen to open. Because the application names of the target application programs corresponding to some application field categories may appear directly in the voice text information, this embodiment also needs to determine the target application program corresponding to each application field category in combination with the voice text information.
For example, when the voice text information is "open Peace Elite and Tencent Video," it may be determined that the application field categories corresponding to the voice text information are the game application field category and the video application field category. According to the voice text information, the target application program corresponding to the game application field category is "Peace Elite," and the target application program corresponding to the video application field category is "Tencent Video."
In one implementation, the step S200 specifically includes the following steps:
step S201, according to the voice text information, matching each application field type in the plurality of application field types to obtain an application name corresponding to each application field type;
step S202, determining the target application programs according to the application names corresponding to the application field categories.
Specifically, the voice text information may have definite directivity; that is, it directly contains the application names of the application programs to be opened. This embodiment therefore provides a method for determining the target application programs for such voice text information: the application name corresponding to each application field category is matched from the voice text information, and the target application program corresponding to each application field category is obtained directly from the application name.
For example, if the user's voice text information is "open Tencent Video and Happy Xiaole," the two application field categories contained in it, namely the video application field category and the game application field category, may be determined from the voice text information. The target application programs may also be directly matched from the voice text information: "Tencent Video" for the video application field category and "Happy Xiaole" for the game application field category.
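A minimal sketch of this direct name-matching step, assuming hypothetical per-category app registries (the app names follow the examples in the text; the real registry format is not disclosed):

```python
# Hypothetical per-domain app registries.
DOMAIN_APPS = {
    "video application field": ["Tencent Video", "Mango TV", "iQIYI"],
    "game application field": ["Peace Elite", "Happy Xiaole", "Honor of Kings"],
}

def match_app_names(voice_text, domains):
    """For each domain, return the app name found in the voice text, or None
    if the domain could not be matched (i.e. it becomes a fuzzy field)."""
    text = voice_text.lower()
    matched = {}
    for domain in domains:
        matched[domain] = next(
            (app for app in DOMAIN_APPS.get(domain, []) if app.lower() in text),
            None,
        )
    return matched
```

Domains whose value comes back `None` are exactly the "fuzzy application fields" handled in the next implementation.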
In another implementation, the determining a plurality of target application programs according to the voice text information and the plurality of application field categories includes:
step S203, the application field category which is not successfully matched in the application field categories is used as a fuzzy application field;
step S204, acquiring the number of fuzzy application fields and historical operation data;
and S205, determining a target application program corresponding to the fuzzy application field according to the quantity of the fuzzy application fields and the historical operation data.
Specifically, the voice text information may also be directionally ambiguous; that is, it may only reflect which application field categories the user wants to open application programs in, without explicitly specifying the application names. This embodiment therefore provides another method for determining the target application programs for such voice text information. The application field categories that are not successfully matched are taken as fuzzy application fields, and the user's historical operation data is acquired. Because the historical operation data reflects the application programs the user commonly uses in each application field category, the target application program corresponding to each fuzzy application field can be determined according to the historical operation data and the number of fuzzy application fields.
In an implementation manner, the step S205 specifically includes the following steps:
step S2051, when the number of fuzzy application fields is 1, determining, according to the historical operation data, the application program opened the most times in the application field corresponding to the fuzzy application field, and taking it as the target application program corresponding to the fuzzy application field.
Specifically, if only one of the plurality of application field categories has no corresponding application name, that application field category is the unique fuzzy application field. The most commonly used application program in the application field corresponding to the fuzzy application field is then determined from the acquired historical operation data and taken as the target application program corresponding to the fuzzy application field.
For example, if the voice text information is "open Tencent Video and a game," the target application program corresponding to the video application field category can be directly matched as "Tencent Video," but the target application program of the game application field category cannot be directly matched, so the game application field category is taken as the fuzzy application field. The user's historical operation data is then acquired; since the game application field category is the only fuzzy application field, the game the user plays most often in that category, "Peace Elite," is determined from the historical operation data and taken as the target application program corresponding to the game application field category.
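The single-fuzzy-field case reduces to a frequency count over the user's open history. A sketch, assuming the history is available as a list of (category, app) open events; this log format is an assumption, as the patent does not specify how historical operation data is stored:

```python
from collections import Counter

def resolve_single_fuzzy(history, fuzzy_domain):
    """history: list of (domain, app) open events (assumed log format).
    Return the app opened the most times within the fuzzy domain, or None
    if the domain has no history."""
    counts = Counter(app for domain, app in history if domain == fuzzy_domain)
    if not counts:
        return None
    return counts.most_common(1)[0][0]
```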
In another implementation manner, the step S205 specifically includes the following steps:
step S2052, when the number of fuzzy application fields is greater than 1, determining the category priority corresponding to each fuzzy application field;
and step S2053, determining a target application program corresponding to each fuzzy application field according to the category priority and the historical operation data.
If more than one of the plurality of application field categories has no corresponding application name, a plurality of fuzzy application fields exist. In this embodiment, priority information is preset, including a category priority for each application field category. Specifically, when there are multiple fuzzy application fields, the category priority of each is determined from the preset priority information. Because the category priority reflects the importance of a fuzzy application field, the target application program corresponding to the most important fuzzy application field is determined first based on the category priorities, and the target application programs corresponding to the other fuzzy application fields are then determined in turn, so that the target application programs corresponding to all fuzzy application fields are obtained sequentially.
In an implementation manner, the determining, according to the category priority and the historical operation data, a target application program corresponding to each of the fuzzy application fields specifically includes:
when the category priority corresponding to the fuzzy application field is the highest priority, taking the fuzzy application field as first data;
determining the application program with the largest opening times in the application field corresponding to the first data according to the historical operation data to obtain a target application program corresponding to the first data;
when the category priority corresponding to the fuzzy application field is not the highest priority, taking the fuzzy application field as second data;
and determining, according to the historical operation data, the application program in the application field corresponding to the second data that has been opened the most times in combination with the target application program corresponding to the first data, to obtain the target application program corresponding to the second data.
Specifically, when there are multiple fuzzy application fields, this embodiment divides them into two types according to their category priorities: first data and second data. The first data is the fuzzy application field with the highest priority; the remaining fuzzy application fields are the second data. For the first data, this embodiment determines, from the historical operation data, the most commonly used application program in the corresponding application field and takes it as the target application program corresponding to the first data. For the second data, the fuzzy application fields are processed in order of their category priorities; the application program in the application field corresponding to the second data that has been opened the most times in combination with the target application program corresponding to the first data is determined from the historical operation data and taken as the target application program corresponding to the second data.
In one implementation, when the number of second data is greater than 1, for each second data, the application program in its application field that has been opened the most times in combination with the target application program corresponding to the immediately higher category priority is determined from the historical operation data and taken as the target application program corresponding to that second data.
For example, when the user's voice instruction is "play a game, chat, and watch a video," multiple fuzzy application fields can be determined: the game category, the social category, and the video category. From the preset priority information, the category priority of the game category is a, the social category is b, and the video category is c. The most commonly used application program in the game application field is determined from the historical operation data to be "Peace Elite," which becomes the target application program corresponding to the game category. Then, in the social application field, the application program opened the most times together with "Peace Elite" is determined from the historical operation data to be "WeChat," which becomes the target application program corresponding to the social category. Finally, in the video application field, the application program opened the most times together with "WeChat" is determined from the historical operation data to be "Tencent Video."
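The priority-ordered resolution described above can be sketched as follows. The priority encoding (lower value means higher priority) and the combination-log format are assumptions, since the patent does not specify how co-open combinations are recorded:

```python
from collections import Counter

def resolve_fuzzy_domains(fuzzy_domains, priorities, open_history, combo_history):
    """
    fuzzy_domains: list of fuzzy application field names.
    priorities: {domain: priority}, lower value = higher priority (assumed).
    open_history: list of (domain, app) single-open events.
    combo_history: list of (prev_app, domain, app) events recording which app
        in `domain` was opened together with `prev_app` (assumed log format).
    Resolve the highest-priority domain ("first data") from plain open counts,
    then each lower-priority domain ("second data") from the app most often
    opened together with the app chosen for the previous domain.
    """
    ordered = sorted(fuzzy_domains, key=lambda d: priorities[d])
    result = {}
    prev_app = None
    for i, domain in enumerate(ordered):
        if i == 0:  # first data: most-opened app in the domain
            counts = Counter(a for d, a in open_history if d == domain)
        else:       # second data: most-opened app combined with prev choice
            counts = Counter(a for p, d, a in combo_history
                             if d == domain and p == prev_app)
        prev_app = counts.most_common(1)[0][0]
        result[domain] = prev_app
    return result
```

This mirrors the worked example: game resolves first, social resolves against the game choice, and video resolves against the social choice.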
In one implementation, as shown in fig. 2, before any application program is determined as a target application program, a prompt voice may be generated from the application program to confirm with the user whether the application program should be the target application program. For example, the smart screen may ask the user by voice: "Opening Douyin for you; if not, please say it again." The user may reply "do not open Douyin, open Mango TV instead"; if no reply is received from the user, "Douyin" is automatically selected as the target application program. In short, this embodiment can improve the accuracy of voice instruction recognition through multiple rounds of queries.
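The confirmation round described above can be sketched as follows. This is illustrative only: the `ask` and `listen` hooks stand in for the smart screen's actual text-to-speech and speech-recognition interfaces, whose names and behavior are assumptions here.

```python
def confirm_target(candidate, ask, listen):
    """Ask the user to confirm a candidate application before it becomes
    the target application. `ask(text)` speaks a prompt; `listen()` returns
    the user's reply as an application name, or None if no reply arrives."""
    ask(f"Opening {candidate} for you; if not, please say it again.")
    reply = listen()
    if reply is None:
        # No reply within the listening window: the candidate is selected
        # automatically, as described in the embodiment above.
        return candidate
    # Otherwise the reply names the replacement application.
    return reply
```

In a real system `listen` would time out after a few seconds of silence; here that is modeled simply by returning `None`.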
In one implementation, if the target application program is not installed, the target application program is automatically downloaded from the application store and installed.
As shown in fig. 1, the method further comprises the steps of:
S300, performing screen splicing display on the plurality of target application programs.
Specifically, in order to implement the screen-splicing scene of the smart screen, in this embodiment, after the plurality of target application programs corresponding to the user's voice instruction are determined, the target application programs are displayed in a screen-splicing manner, so that a plurality of application programs can be opened simultaneously on the smart screen and the user's needs for different application programs are met.
In one implementation, the step S300 specifically includes the following steps:
step S301, acquiring a screen splicing template, and determining a plurality of windows according to the screen splicing template, wherein the windows correspond to the target application programs in a one-to-one manner;
and step S302, opening the target application programs according to the windows.
Specifically, the screen-splicing template in this embodiment may be a preset screen-splicing template, or may be a suitable screen-splicing template determined according to the number of target application programs. The screen-splicing template generally comprises a number of windows equal to the number of target application programs; one target application program is then opened in each window, thereby realizing screen-splicing display of the plurality of target application programs.
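As a sketch of how a screen-splicing template with one window per target application might be derived when no preset template applies: the simple grid layout policy below is an assumption for illustration, not the patented template format.

```python
import math

def split_windows(n_apps, screen_w, screen_h):
    """Return one (x, y, width, height) window per target application,
    laid out as a near-square grid covering the screen."""
    cols = math.ceil(math.sqrt(n_apps))   # e.g. 3 apps -> 2 columns
    rows = math.ceil(n_apps / cols)       #              -> 2 rows
    w, h = screen_w // cols, screen_h // rows
    # Fill the grid row by row and keep exactly one window per application.
    return [(c * w, r * h, w, h)
            for r in range(rows) for c in range(cols)][:n_apps]
```

For a 1920x1080 smart screen and three target applications, this yields two 960x540 windows in the top row and one in the bottom row; each window then opens one target application.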
In one implementation, when the smart screen is already in a screen-splicing scene and a new voice instruction of the user is received, a prompt voice is automatically generated to ask the user whether to add a new window to the original screen-splicing scene or to reestablish a new screen-splicing scene. If the user chooses not to add a new window, the original screen-splicing scene is closed, and a new screen-splicing scene is reestablished according to the received new voice instruction.
In one implementation, in the already opened screen-splicing scene, the user may replace one of the target application programs by voice, for example by directly saying "change XX application to XXX application" or "do not watch XX, watch XXX".
In one implementation, the user may also exit the currently established screen-splicing scene by voice, or return to any one of the applications or to a default window from the currently established screen-splicing scene by voice.
In one implementation, the method further comprises:
S400, when the number of application field categories determined according to the voice instruction is 1, determining the target application program corresponding to that application field category, and opening the target application program in a single window.
Specifically, when it is recognized that the user's voice data involves only one application field category, the number of application programs that the user currently needs to open is 1. The target application program therefore does not need to be opened in a screen-splicing scene; it only needs to be opened in a single window. That is, a conventional single-window scene is constructed instead of a screen-splicing scene.
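The dispatch between step S400 and the screen-splicing path can be sketched as follows. The two callbacks stand in for the smart screen's actual window manager; their names are assumptions for illustration.

```python
def open_by_field_count(targets, open_single, open_spliced):
    """One application field category -> open the single target in a
    conventional single window; several categories -> screen-splicing
    display of all targets. Returns which scene was constructed."""
    if len(targets) == 1:
        open_single(targets[0])   # step S400: single-window scene
        return "single"
    open_spliced(targets)         # step S300: screen-splicing scene
    return "spliced"
```

This keeps the screen-splicing machinery out of the common one-application case, matching the embodiment above.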
Based on the above embodiment, the present invention further provides a multi-application screen splicing device implemented based on voice, as shown in fig. 3, the device includes:
the acquisition module 01 is used for acquiring voice text information and determining a plurality of application field categories corresponding to the voice text information;
the determining module 02 is used for determining a plurality of target application programs according to the voice text information and the plurality of application field categories;
and the screen splicing module 03 is used for performing screen splicing display on the plurality of target application programs.
Based on the above embodiments, the present invention further provides a terminal, and a schematic block diagram thereof may be as shown in fig. 4. The terminal comprises a processor, a memory, a network interface and a display screen which are connected through a system bus. Wherein the processor of the terminal is configured to provide computing and control capabilities. The memory of the terminal comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The network interface of the terminal is used for connecting and communicating with an external terminal through a network. The computer program is executed by a processor to implement a multi-application screen-splicing method based on a voice implementation. The display screen of the terminal can be a liquid crystal display screen or an electronic ink display screen.
It will be understood by those skilled in the art that the block diagram of fig. 4 shows only a portion of the structure associated with the inventive arrangements and is not intended to limit the terminals to which the inventive arrangements may be applied; a particular terminal may include more or fewer components than those shown, may combine certain components, or may have a different arrangement of components.
In one implementation, one or more programs are stored in a memory of the terminal and are configured to be executed by one or more processors, the one or more programs including instructions for performing the voice-based multi-application screen-splicing method.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing relevant hardware; the computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, databases, or other media used in embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchronous link DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
In summary, the present invention discloses a multi-application screen-splicing method, apparatus and storage medium implemented based on voice, wherein the method comprises: acquiring voice text information, and determining a plurality of application field categories corresponding to the voice text information; determining a plurality of target application programs according to the voice text information and the plurality of application field categories; and performing screen-splicing display on the plurality of target application programs. Since the present invention can identify the plurality of application programs that the user's voice asks to open and display the identified application programs in a screen-splicing manner, it effectively solves the problem that the existing voice interaction technology is difficult to apply to the screen-splicing scene of a smart screen.
It is to be understood that the invention is not limited to the examples described above, but that modifications and variations may be effected thereto by those of ordinary skill in the art in light of the foregoing description, and that all such modifications and variations are intended to be within the scope of the invention as defined by the appended claims.
Claims (10)
1. A multi-application screen splicing method based on voice implementation is characterized by comprising the following steps:
acquiring voice text information, and determining a plurality of application field categories corresponding to the voice text information;
determining a plurality of target application programs according to the voice text information and the plurality of application field categories;
and performing screen splicing display on the plurality of target application programs.
2. The multi-application screen splicing method based on voice implementation according to claim 1, wherein the acquiring voice text information comprises:
acquiring a voice instruction;
and performing text conversion on the voice instruction to obtain the voice text information.
3. The method for multi-application screen splicing based on voice implementation according to claim 1, wherein the determining a plurality of target application programs according to the voice text information and the plurality of application field categories comprises:
matching each application field category in the plurality of application field categories according to the voice text information to obtain an application name corresponding to each application field category;
and determining the target application programs according to the application names corresponding to the application field categories.
4. The method for multi-application screen splicing based on voice implementation according to claim 3, wherein the determining a plurality of target application programs according to the voice text information and the plurality of application field categories comprises:
taking the application field category which is not successfully matched in the plurality of application field categories as a fuzzy application field;
acquiring the quantity and historical operation data of the fuzzy application fields;
and determining a target application program corresponding to the fuzzy application field according to the quantity of the fuzzy application field and the historical operation data.
5. The multi-application screen splicing method based on voice implementation according to claim 4, wherein the determining a target application program corresponding to the fuzzy application field according to the number of the fuzzy application fields and the historical operation data includes:
and when the number of the fuzzy application fields is 1, determining, according to the historical operation data, the application program opened the most times in the application field corresponding to the fuzzy application field, to obtain the target application program corresponding to the fuzzy application field.
6. The multi-application screen splicing method based on voice implementation according to claim 4, wherein the determining a target application program corresponding to the fuzzy application field according to the number of the fuzzy application fields and the historical operation data includes:
when the number of the fuzzy application fields is larger than 1, determining the category priority corresponding to each fuzzy application field;
and determining a target application program corresponding to each fuzzy application field according to the category priority and the historical operation data.
7. The method for multi-application screen splicing based on voice implementation according to claim 6, wherein the determining the target application program corresponding to each of the fuzzy application fields according to the category priority and the historical operation data comprises:
when the category priority corresponding to the fuzzy application field is the highest priority, taking the fuzzy application field as first data;
determining the application program with the largest opening times in the application field corresponding to the first data according to the historical operation data to obtain a target application program corresponding to the first data;
when the category priority corresponding to the fuzzy application field is not the highest priority, taking the fuzzy application field as second data;
and determining, according to the historical operation data, the application program in the application field corresponding to the second data that is opened the most times in combination with the target application program corresponding to the first data, to obtain the target application program corresponding to the second data.
8. The multi-application screen splicing method based on voice implementation according to claim 1, wherein the performing screen splicing display on the plurality of target application programs comprises:
acquiring a screen splicing template, and determining a plurality of windows according to the screen splicing template, wherein the windows correspond to the target application programs in a one-to-one manner;
and opening the target application programs according to the windows.
9. A multi-application screen splicing device based on voice implementation, characterized in that the device comprises:
the acquisition module is used for acquiring voice text information and determining a plurality of application field categories corresponding to the voice text information;
the determining module is used for determining a plurality of target application programs according to the voice text information and the plurality of application field categories;
and the screen splicing module is used for performing screen splicing display on the plurality of target application programs.
10. A computer-readable storage medium having stored thereon instructions adapted to be loaded and executed by a processor to perform the steps of the multi-application screen splicing method based on voice implementation according to any one of claims 1 to 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110841048.5A CN113573132B (en) | 2021-07-23 | 2021-07-23 | Multi-application screen spelling method and device based on voice realization and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110841048.5A CN113573132B (en) | 2021-07-23 | 2021-07-23 | Multi-application screen spelling method and device based on voice realization and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113573132A true CN113573132A (en) | 2021-10-29 |
CN113573132B CN113573132B (en) | 2023-08-11 |
Family
ID=78167114
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110841048.5A Active CN113573132B (en) | 2021-07-23 | 2021-07-23 | Multi-application screen spelling method and device based on voice realization and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113573132B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114968012A (en) * | 2022-05-10 | 2022-08-30 | 深圳康佳电子科技有限公司 | Control method and related equipment for infinite screen splicing window combination |
Citations (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20020184023A1 (en) * | 2001-05-30 | 2002-12-05 | Senis Busayapongchai | Multi-context conversational environment system and method |
CN103218133A (en) * | 2013-03-28 | 2013-07-24 | 东莞宇龙通信科技有限公司 | Startup method of associated application program and terminal |
CN104168515A (en) * | 2014-08-21 | 2014-11-26 | 三星电子(中国)研发中心 | Intelligent television terminal and screen control method thereof |
CN104202649A (en) * | 2014-08-27 | 2014-12-10 | 四川长虹电器股份有限公司 | Method for operating multiple applications of intelligent television synchronously |
CN104572001A (en) * | 2015-01-27 | 2015-04-29 | 深圳市中兴移动通信有限公司 | Split screen starting method and mobile terminal |
CN106201427A (en) * | 2016-07-15 | 2016-12-07 | 东莞酷派软件技术有限公司 | A kind of application program launching method and terminal unit |
CN107346182A (en) * | 2016-05-05 | 2017-11-14 | 北京搜狗科技发展有限公司 | A kind of method for building user thesaurus and the device for building user thesaurus |
CN107396154A (en) * | 2011-08-05 | 2017-11-24 | 三星电子株式会社 | Electronic equipment and the method that its user interface is provided |
CN107423063A (en) * | 2017-07-25 | 2017-12-01 | 北京小米移动软件有限公司 | Multiwindow processing method, device and equipment |
CN107643870A (en) * | 2017-09-27 | 2018-01-30 | 努比亚技术有限公司 | Multi-screen display method, mobile terminal and computer-readable recording medium |
CN107783705A (en) * | 2017-10-20 | 2018-03-09 | 珠海市魅族科技有限公司 | Show method, apparatus, computer installation and the storage medium of application program |
CN108735212A (en) * | 2018-05-28 | 2018-11-02 | 北京小米移动软件有限公司 | Sound control method and device |
CN108984258A (en) * | 2018-07-09 | 2018-12-11 | Oppo广东移动通信有限公司 | Using multi-screen display method, device, storage medium and electronic equipment |
CN109918040A (en) * | 2019-03-15 | 2019-06-21 | 百度在线网络技术(北京)有限公司 | Phonetic order distribution method and device, electronic equipment and computer-readable medium |
CN110060679A (en) * | 2019-04-23 | 2019-07-26 | 诚迈科技(南京)股份有限公司 | A kind of exchange method and system of whole process voice control |
CN113110911A (en) * | 2021-05-13 | 2021-07-13 | 北京字节跳动网络技术有限公司 | Information display method and device, electronic equipment and storage medium |
WO2021139701A1 (en) * | 2020-01-06 | 2021-07-15 | 宇龙计算机通信科技(深圳)有限公司 | Application recommendation method and apparatus, storage medium and electronic device |
CN113129887A (en) * | 2019-12-31 | 2021-07-16 | 华为技术有限公司 | Voice control method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||