CN113035189A - Document demonstration control method, device and equipment - Google Patents

Document demonstration control method, device and equipment

Info

Publication number
CN113035189A
CN113035189A (application CN202110209998.6A)
Authority
CN
China
Prior art keywords
document
voice data
instruction
control instruction
recognition result
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110209998.6A
Other languages
Chinese (zh)
Inventor
王强
芦钢
蒋斯琪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Xiaomi Mobile Software Co Ltd
Original Assignee
Beijing Xiaomi Mobile Software Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Xiaomi Mobile Software Co Ltd filed Critical Beijing Xiaomi Mobile Software Co Ltd
Priority to CN202110209998.6A priority Critical patent/CN113035189A/en
Publication of CN113035189A publication Critical patent/CN113035189A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 - Speech recognition
    • G10L 15/08 - Speech classification or search
    • G10L 15/16 - Speech classification or search using artificial neural networks
    • G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L 2015/223 - Execution procedure of a spoken command

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The present disclosure provides a document presentation control method, apparatus, and device. The method comprises: receiving voice data of a user through a woken target application after receiving an opening instruction instructing to open a document; acquiring a recognition result corresponding to the voice data; and, when the recognition result is a document control instruction, controlling the presentation process of the currently opened document based on the document control instruction, wherein the document is opened based on the opening instruction. The method, apparatus, and device can control the presentation process of a document by voice, which simplifies the user's operations and improves the user experience.

Description

Document demonstration control method, device and equipment
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a method, an apparatus, and a device for controlling document presentation.
Background
With the steady rise of PowerPoint (PPT) applications, PPT is now widely used in the field of multimedia presentation. In this field, a user can deliver work reports, corporate publicity, product recommendations, education and training, and the like by means of a PPT file created with a PPT application.
When a user gives a presentation or report with a PPT file, the user often controls the presentation process with equipment such as a page turner, which makes the operation cumbersome.
Disclosure of Invention
The present disclosure provides a method, an apparatus, and a device for controlling document presentation to solve the deficiencies in the related art.
According to a first aspect of the embodiments of the present disclosure, a method for controlling a document presentation is provided, where the method includes:
receiving voice data of a user through a woken target application after receiving an opening instruction instructing to open a document;
acquiring a recognition result corresponding to the voice data; and
when the recognition result is a document control instruction, controlling a presentation process of the currently opened document based on the document control instruction, wherein the document is opened based on the opening instruction.
According to a second aspect of the embodiments of the present disclosure, a control device for document presentation is provided, the device comprising a receiving module, an obtaining module and a processing module, wherein,
the receiving module is configured to receive voice data of a user through a woken target application after receiving an opening instruction instructing to open a document;
the obtaining module is configured to obtain a recognition result corresponding to the voice data; and
the processing module is configured to, when the recognition result is a document control instruction, control a presentation process of the currently opened document based on the document control instruction, wherein the document is opened based on the opening instruction.
According to a third aspect of embodiments of the present disclosure, a computer-readable storage medium is proposed, on which a computer program is stored, which when executed by a processor implements any of the methods provided by the first aspect of the present disclosure.
According to a fourth aspect of an embodiment of the present disclosure, there is provided a control apparatus of document presentation, including:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to implement any of the methods provided by the first aspect of the present disclosure.
The technical scheme provided by the embodiment of the disclosure can have the following beneficial effects:
according to the embodiment, after receiving an opening instruction for instructing to open a document, the document demonstration control method, the document demonstration control device and the document demonstration control equipment provided by the disclosure receive voice data of a user based on a woken target application, and further acquire an identification result corresponding to the voice data, so that when the identification result is a document control instruction, a demonstration process of the currently opened document is controlled based on the document control instruction; wherein the document is opened based on the open instruction. Therefore, the demonstration process of the document can be controlled based on the voice, the operation process of the user is simplified, and the user experience is improved.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure.
FIG. 1 is a flowchart illustrating a method of controlling a document presentation according to an exemplary embodiment of the present disclosure;
FIG. 2 is a flow chart illustrating another method of controlling a document presentation according to an exemplary embodiment of the present disclosure;
FIG. 3 is a flowchart illustrating yet another method of controlling a document presentation according to an exemplary embodiment of the present disclosure;
FIG. 4 is a schematic diagram illustrating the structure of a document presentation control apparatus according to an exemplary embodiment of the present disclosure;
FIG. 5 is a schematic diagram illustrating the structure of another document presentation control apparatus according to an exemplary embodiment of the present disclosure;
FIG. 6 is a schematic diagram illustrating the structure of yet another document presentation control apparatus according to an exemplary embodiment of the present disclosure;
fig. 7 is a schematic structural diagram of a control device for document presentation according to an exemplary embodiment of the present disclosure.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.
The present disclosure provides a document presentation control method, apparatus, and device, so that the presentation process of a document can be controlled by voice, simplifying the user's operations and improving the user experience.
The method and apparatus provided by the present disclosure can be applied to a document presentation control device, which may be a terminal such as a computer or a laptop. Note that the terminal has a target application installed, i.e., an application program for voice control (for example, a voice assistant), and the target application can receive a user's voice data after being woken.
FIG. 1 is a flowchart illustrating a method of controlling a document presentation according to an exemplary embodiment of the present disclosure. Referring to fig. 1, the method provided in this embodiment may include:
s101, after receiving an opening instruction for instructing to open a document, receiving voice data of a user based on a woken target application.
It should be noted that the target application may be an application program for performing voice control. For example, a voice assistant may be used.
It should be noted that the wake-up process of the target application may include activating the target application when a preset wake-up condition is detected, wherein the preset wake-up condition comprises: detecting the designated wake word of the target application, detecting that an icon of the target application is triggered, or detecting that a dedicated key integrated on the device is triggered, the dedicated key being a switch for controlling the target application.
The designated wake word is set according to actual needs, and its specific content is not limited by the present disclosure. For example, in one embodiment, the designated wake word may be "love classmates".
For example, in an embodiment, multiple voice recordings containing the designated wake word may be collected in advance and used to train a wake-word detection model, so that the user's speech can be monitored in real time by the model; when the speech is determined to contain the designated wake word, the voice assistant is woken.
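The wake-up step above can be sketched minimally. All names and the wake phrase below are illustrative assumptions, not taken from the patent; a real detection model would be trained on recordings of the designated wake word and score audio frames directly, whereas this sketch simply scans recognized text fragments:

```python
WAKE_WORD = "hello assistant"  # hypothetical designated wake word


def detect_wake_word(fragment: str, wake_word: str = WAKE_WORD) -> bool:
    """Return True when a recognized speech fragment contains the wake word."""
    return wake_word in fragment.lower()


def monitor(fragments) -> str:
    """Scan a stream of recognized fragments; wake the assistant on a hit."""
    for fragment in fragments:
        if detect_wake_word(fragment):
            return "woken"
    return "idle"
```

For instance, `monitor(["play music", "Hello assistant, open my slides"])` returns `"woken"`, while a stream with no wake word leaves the assistant `"idle"`.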
Specifically, after the voice assistant wakes up, it can receive voice data sent by the user.
It should be noted that the document may be a PPT document, a Word document, or the like. The following description takes a PPT document as an example.
In addition, the opening instruction may be triggered by clicking on the document or by voice. For example, a user may trigger an opening instruction for a document by clicking on the document's icon.
As another example, when the target application is awake, the user may open the document by voice. For instance, when a user wants to open a PPT file about neural networks, the user may say "open the PPT file about neural networks"; based on this, the device may search for and present the PPT files matching "neural network", and then open a target PPT file in response to an opening instruction triggered by the user from the presented files.
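The search-and-present step just described can be sketched as a simple case-insensitive filter over known document names (the helper and the file names are hypothetical):

```python
def find_matching_files(keyword: str, files: list[str]) -> list[str]:
    """Return the file names containing the spoken keyword, ignoring case.
    The matched list would then be presented to the user, who triggers
    the final opening instruction for one target file."""
    kw = keyword.lower()
    return [name for name in files if kw in name.lower()]
```

For example, `find_matching_files("neural", ["neural_network_intro.pptx", "budget_2021.pptx"])` returns only the first file name.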
And S102, acquiring a recognition result corresponding to the voice data.
It should be noted that, after receiving the voice data, the voice assistant recognizes it to obtain a corresponding recognition result.
The voice assistant may recognize the voice data directly on the device, or it may send the voice data to a cloud server and request the server to obtain and return the corresponding recognition result. The principle by which the cloud server obtains the recognition result is similar to that of on-device recognition; refer to the description of on-device recognition below (i.e., the embodiments shown in fig. 2 and fig. 3), which is not repeated here. For example, in an embodiment, the cloud server may obtain the recognition result corresponding to the voice data based on natural language processing (NLP) technology.
For example, in one embodiment, the user utters a voice "enter slide show mode", and after the voice data is recognized, the recognition result is "enter slide show mode".
For another example, if the user utters a voice "create contact", the voice data is recognized, and the recognition result is "create contact".
S103: when the recognition result is a document control instruction, control the presentation process of the currently opened document based on the document control instruction, wherein the document is opened based on the opening instruction.
In one embodiment, the document control instruction includes at least any one of: an instruction for closing the document, an instruction for entering or exiting the slide show mode, an instruction for turning pages, an instruction for jumping to a specified page, and an instruction for jumping to a specified position.
It should be noted that the specified position may be a specified paragraph, a specified line, or the like; it is not limited in this embodiment.
In an embodiment, the instruction for instructing to jump to the specified page may be an instruction carrying page number information for explicitly indicating the specified page, or the instruction for instructing to jump to the specified page may be an instruction carrying document content information for implicitly indicating the specified page.
For example, the user may issue a voice "open page 5" to trigger a document control instruction carrying page number information for explicitly indicating a specified page; for another example, the user may issue a voice "open chapter two" to trigger a document control instruction carrying document content information implicitly indicating the specified page.
In an embodiment, when the control instruction is an instruction for jumping to a specified page and carries document content information implicitly indicating the specified page, controlling the presentation process of the currently opened document based on the document control instruction comprises the following steps:
searching a target page matched with the document content information from the document;
and displaying the target page.
For example, in an embodiment, a user utters the voice "open chapter two"; after recognition, the recognition result corresponding to the voice data is determined to be an instruction for jumping to a specified page, and the control instruction carries the document content information "chapter two" implicitly indicating the specified page. The target slide corresponding to "chapter two" can then be searched for in the currently opened PPT document and displayed.
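The two steps above (search the target page, then display it) can be sketched as follows; the slide titles and the substring-matching rule are illustrative assumptions, since the patent does not specify how matching is performed:

```python
def find_target_page(content_info: str, slide_titles: list[str]):
    """Return the 0-based index of the first slide whose title contains
    the document content information carried by the control instruction,
    or None when no slide matches."""
    for index, title in enumerate(slide_titles):
        if content_info.lower() in title.lower():
            return index
    return None
```

With titles `["Overview", "Chapter Two: Methods", "Results"]`, the content information "chapter two" resolves to index 1, and the presentation layer would then display that slide.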
For another example, when the user wants to turn a page, the user may speak "previous page", "next page", "jump to page 5", or the like as follows.
In this way, after the document is opened, the presentation process can be controlled directly through document control instructions, without carrying the wake word each time an instruction is issued. For example, to jump to page 5, the user need only say "jump to page 5" rather than "[wake word], jump to page 5", which simplifies the user's operations and improves efficiency.
In the method provided by this embodiment, after an opening instruction instructing to open a document is received, voice data of a user is received through a woken target application, and a recognition result corresponding to the voice data is acquired, so that when the recognition result is a document control instruction, the presentation process of the currently opened document is controlled based on that instruction, the document having been opened based on the opening instruction. The presentation process can therefore be controlled by voice, which simplifies the user's operations and improves the user experience.
FIG. 2 is a flow chart illustrating another method of controlling a document presentation according to an exemplary embodiment of the present disclosure. Referring to fig. 2, on the basis of the foregoing embodiment, the method provided in this embodiment, step S102, may include:
s201, converting the voice data into text information.
Specifically, automatic speech recognition (ASR) may be performed on the voice data to convert it into text information. For the specific implementation principle and process of ASR, refer to descriptions in the related art, which are not repeated here.
S202: recognize the text information based on a preset database or natural language processing (NLP) to obtain a recognition result corresponding to the voice data.
It should be noted that the preset database stores associations between text content and control instructions. These associations are set according to actual needs and are not limited in this embodiment. For example, table 1 shows the associations between text content and control instructions in an exemplary embodiment:
table 1 Preset contents of database
In one embodiment, the text information may be matched against the text content stored in the preset database; when the text information successfully matches a target text content stored in the database, the control instruction corresponding to that target text content is determined as the recognition result corresponding to the voice data.
When the text information fails to match any text content stored in the database, natural language processing is performed on the text information based on NLP technology to obtain the recognition result corresponding to the voice data.
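The two-stage recognition just described — exact lookup in the preset database first, NLP only on a miss — can be sketched as follows. The command table and the fallback stub are assumptions in the spirit of Table 1 (whose actual contents are not reproduced in this text); a real system would call an intent model or a cloud NLP service in the fallback:

```python
# Hypothetical associations between text content and control instructions.
COMMAND_DB = {
    "next page": "PAGE_DOWN",
    "previous page": "PAGE_UP",
    "enter slide show mode": "START_SHOW",
    "close the document": "CLOSE_DOC",
}


def nlp_fallback(text: str) -> str:
    """Stand-in for full NLP processing; a real system would run a model."""
    return "OTHER"


def recognize(text: str) -> str:
    """A database hit wins; otherwise fall back to NLP."""
    normalized = text.strip().lower()
    if normalized in COMMAND_DB:
        return COMMAND_DB[normalized]
    return nlp_fallback(normalized)
```

For example, "Next page" hits the database and yields `PAGE_DOWN`, while an unrelated utterance falls through to the NLP stub.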
In another embodiment, the text information may be directly processed in natural language based on the NLP technology, so as to obtain a recognition result corresponding to the speech data.
It should be noted that, for the specific implementation principle and implementation process of the NLP technology, reference may be made to descriptions in the related art, and details are not described here.
For example, in one embodiment, the natural language processing may proceed as follows: voice data containing document control instructions is collected in advance and used to train a classification model (for example, the preset categories of the model include the various document control instructions plus an "other" category); the probability that a piece of voice data belongs to each preset category is then computed with the model, and the preset category with the highest probability is determined as the recognition result corresponding to the voice data.
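As a toy stand-in for the trained classification model (the categories and the keyword-overlap scoring below are illustrative assumptions, not the patent's actual model), one can score each preset category and take the argmax, falling back to "other" when nothing matches:

```python
# Keyword-overlap scores substitute for the trained model's class
# probabilities; the preset categories are illustrative.
CATEGORY_KEYWORDS = {
    "PAGE_DOWN": {"next", "page"},
    "PAGE_UP": {"previous", "page"},
    "START_SHOW": {"enter", "slide", "show"},
}


def classify(utterance: str) -> str:
    """Return the preset category with the highest score, or OTHER
    when no category matches at all."""
    words = set(utterance.lower().split())
    scores = {cat: len(words & kws) for cat, kws in CATEGORY_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "OTHER"
```

Here "next page" scores 2 for PAGE_DOWN against 1 for PAGE_UP, so the argmax picks PAGE_DOWN, mirroring the highest-probability rule in the text.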
It should be noted that, for the specific implementation process and implementation principle of model training, reference may be made to the description in the related art, and details are not described here.
This embodiment provides a method for recognizing voice data accurately, so that document presentation can be controlled based on the recognition result, meeting user needs and improving the user experience.
FIG. 3 is a flowchart illustrating yet another method of controlling a document presentation according to an exemplary embodiment of the present disclosure. Referring to fig. 3, on the basis of the foregoing embodiment, the method provided in this embodiment, step S102, may include:
s301, detecting the voice data in real time through a pre-trained document control instruction detection model or a pre-stored document control instruction voice library, and determining whether the voice data contains a document control instruction.
Specifically, voice data containing document control instructions can be collected in advance and used to construct a document control instruction voice library (composed of multiple pieces of voice data, each containing a document control instruction), so that incoming voice data can be checked against the library in real time to determine whether it contains a document control instruction.
In a specific implementation, when the voice data matches a piece of target voice data in the document control instruction voice library, the document control instruction corresponding to that target voice data (i.e., the instruction it contains) is determined as the recognition result of the voice data.
Alternatively, voice data containing document control instructions can be collected in advance and used to train a document control instruction detection model, so that incoming voice data can be checked in real time by the model to determine whether it contains a document control instruction.
It should be noted that the document control instruction detection model may be a classification model, and after the voice data is input into the classification model, the probability that the voice data belongs to each preset category may be output, and the preset category with the highest probability is determined as the category to which the voice data belongs.
Further, when the category to which the voice data belongs is a document control instruction, determining that the voice data contains the document control instruction, otherwise, determining that the voice data does not contain the document control instruction.
It should be noted that the preset category may include various document control instructions and other categories. For example, in one embodiment, the preset categories include: previous page, next page, enter PPT projection mode, exit PPT projection mode, jump to first page, jump to last page, open page 5, close PPT, and others. For another example, when a piece of voice data is input to the model and the class to which the voice data belongs is output as another class, it may be determined that the voice data does not include a document control instruction.
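The S301–S303 flow — a fast detection pass first, with full NLP only when the voice data contains no document control instruction — can be sketched as follows (the command library, the helper names, and the NLP stub are hypothetical):

```python
KNOWN_COMMANDS = {  # hypothetical pre-stored command library
    "previous page", "next page", "jump to the first page",
    "jump to the last page", "close ppt",
}


def detect_command(utterance: str):
    """Fast path (S301): match against the stored command library."""
    normalized = utterance.strip().lower()
    return normalized if normalized in KNOWN_COMMANDS else None


def full_nlp(utterance: str) -> str:
    """Stub for the slower NLP path (S303); a real system runs a model."""
    return "general_query"


def recognize_utterance(utterance: str):
    """Return ('command', cmd) on a detection hit (S302),
    else ('nlp', result) from the fallback (S303)."""
    command = detect_command(utterance)
    if command is not None:
        return ("command", command)
    return ("nlp", full_nlp(utterance))
```

Because most in-presentation utterances hit the fast path, the expensive NLP path runs only for the remainder, which is the resource saving the text claims.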
S302, when the voice data is determined to contain the document control instruction, determining the document control instruction as the recognition result corresponding to the voice data.
And S303, when the voice data do not contain the document control instruction, performing natural language processing on the voice data based on an NLP technology to obtain a recognition result corresponding to the voice data.
In specific implementation, the voice data can be converted into text information, and then NLP is performed on the text information to obtain a corresponding recognition result.
In the method provided by this embodiment, the voice data is checked in real time against a pre-trained document control instruction detection model or a pre-stored document control instruction voice library to determine whether it contains a document control instruction. When it does, the document control instruction is determined as the recognition result corresponding to the voice data; when it does not, natural language processing is performed on the voice data based on NLP technology to obtain the recognition result. In this way, the presence of a document control instruction can be detected quickly, which improves response speed and saves NLP processing resources.
Corresponding to the foregoing embodiments of the document demonstration control method, the present disclosure also provides embodiments of a document demonstration control apparatus.
Fig. 4 is a schematic structural diagram of a control device for document presentation according to an exemplary embodiment of the present disclosure. Referring to fig. 4, the apparatus provided in this embodiment includes a receiving module 410, an obtaining module 420, and a processing module 430, wherein,
the receiving module 410 is configured to receive voice data of a user through a woken target application after receiving an opening instruction instructing to open a document;
The obtaining module 420 is configured to obtain a recognition result corresponding to the voice data;
the processing module 430 is configured to, when the recognition result is a document control instruction, control the presentation process of the currently opened document based on the document control instruction, wherein the document is opened based on the opening instruction.
The apparatus of this embodiment may be used to implement the technical solution of the method embodiment shown in fig. 1, and the implementation principle and the technical effect are similar, which are not described herein again.
Further, the obtaining module 420 is further configured to send the voice data to a cloud server, so that the cloud server obtains and returns a recognition result corresponding to the voice data.
Fig. 5 is a schematic structural diagram of another document demonstration control device according to an exemplary embodiment of the present disclosure. Referring to fig. 5, in the apparatus provided in this embodiment, on the basis of the above embodiment, the obtaining module 420 includes a converting unit 421 and a recognizing unit 422, wherein,
the conversion unit 421 is configured to convert the voice data into text information;
the recognition unit 422 is configured to recognize the text information based on a preset database or natural language processing (NLP) to obtain a recognition result corresponding to the voice data.
Further, the recognition unit 422 is configured to perform natural language processing on the text information based on NLP technology to obtain a recognition result corresponding to the voice data.
Further, the recognition unit 422 is specifically configured to:
matching the text information with text contents stored in the database;
when the text information successfully matches a target text content stored in the database, determine the control instruction corresponding to that target text content as the recognition result corresponding to the voice data; and
when the text information fails to match any text content stored in the database, perform natural language processing on the text information based on NLP technology to obtain the recognition result corresponding to the voice data.
Fig. 6 is a schematic structural diagram of a control device for document presentation according to another exemplary embodiment of the present disclosure. Referring to fig. 6, in the apparatus provided in this embodiment, on the basis of the above embodiments, the obtaining module 420 includes a detecting unit 423 and a processing unit 424, wherein,
the detecting unit 423 is configured to detect the voice data in real time through a pre-trained document control instruction detection model or a pre-stored document control instruction voice library, and determine whether the voice data includes a document control instruction;
the processing unit 424 is configured to, when it is determined that the voice data includes a document control instruction, determine the document control instruction as a recognition result corresponding to the voice data;
the processing unit 424 is further configured to, when it is determined that the voice data does not include a document control instruction, perform natural language processing on the voice data based on an NLP technique to obtain a recognition result corresponding to the voice data.
Further, the document control instructions include at least any one of the following instructions: an instruction for instructing to close the document, an instruction for instructing to enter the slide show mode or to exit the slide show mode, an instruction for instructing to turn pages, an instruction for instructing to jump to a specified page, and an instruction for instructing to jump to a specified position.
Further, when the document control instruction is an instruction for jumping to a specified page and carries document content information implicitly indicating the specified page, the processing module 430 is specifically configured to search the document for a target page matching the document content information and display the target page.
With regard to the apparatus in the above embodiments, the specific manner in which each module performs operations has been described in detail in the embodiments of the related method, and will not be described in detail here.
For the device embodiments, since they substantially correspond to the method embodiments, reference may be made to the partial description of the method embodiments for relevant points. The above-described embodiments of the apparatus are merely illustrative, wherein the modules described as separate parts may or may not be physically separate, and the parts displayed as modules may or may not be physical modules, may be located in one place, or may be distributed on a plurality of network modules. Some or all of the modules can be selected according to actual needs to achieve the purpose of the disclosed solution. One of ordinary skill in the art can understand and implement it without inventive effort.
An embodiment of the present disclosure also provides a control apparatus for document presentation, including:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to implement the method of any of the above embodiments.
Embodiments of the present disclosure also provide a computer-readable storage medium on which a computer program is stored, which when executed by a processor implements the method of any of the above embodiments.
Fig. 7 is a schematic structural diagram of a control device for document presentation according to an exemplary embodiment of the present disclosure. For example, the device 700 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, an exercise device, a personal digital assistant, and the like.
Referring to fig. 7, device 700 may include one or more of the following components: a processing component 702, a memory 704, a power component 706, a multimedia component 708, an audio component 710, an input/output (I/O) interface 712, a sensor component 714, and a communication component 716.
The processing component 702 generally controls the overall operation of the device 700, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing component 702 may include one or more processors 720 to execute instructions to perform all or a portion of the steps of the methods described above. Further, the processing component 702 may include one or more modules that facilitate interaction between the processing component 702 and other components. For example, the processing component 702 may include a multimedia module to facilitate interaction between the multimedia component 708 and the processing component 702.
The memory 704 is configured to store various types of data to support operation at the device 700. Examples of such data include instructions for any application or method operating on device 700, contact data, phonebook data, messages, pictures, videos, and so forth. The memory 704 may be implemented by any type or combination of volatile or non-volatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks.
The power supply component 706 provides power to the various components of the device 700. The power components 706 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the device 700.
The multimedia component 708 includes a screen that provides an output interface between the device 700 and a user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive an input signal from a user. The touch panel includes one or more touch sensors to sense touch, slide, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 708 includes a front facing camera and/or a rear facing camera. The front camera and/or the rear camera may receive external multimedia data when the device 700 is in an operating mode, such as a shooting mode or a video mode. Each front camera and rear camera may be a fixed optical lens system or have a focal length and optical zoom capability.
The audio component 710 is configured to output and/or input audio signals. For example, the audio component 710 includes a Microphone (MIC) configured to receive external audio signals when the device 700 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signal may further be stored in the memory 704 or transmitted via the communication component 716. In some embodiments, audio component 710 also includes a speaker for outputting audio signals.
The I/O interface 712 provides an interface between the processing component 702 and peripheral interface modules, which may be keyboards, click wheels, buttons, etc. These buttons may include, but are not limited to: a home button, a volume button, a start button, and a lock button.
The sensor assembly 714 includes one or more sensors for providing status assessments of various aspects of the device 700. For example, the sensor assembly 714 may detect an open/closed state of the device 700 and the relative positioning of components, such as the display and keypad of the device 700. The sensor assembly 714 may also detect a change in the position of the device 700 or of a component of the device 700, the presence or absence of user contact with the device 700, the orientation or acceleration/deceleration of the device 700, and a change in the temperature of the device 700. The sensor assembly 714 may include a proximity sensor configured to detect the presence of a nearby object without any physical contact. The sensor assembly 714 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor assembly 714 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 716 is configured to facilitate wired or wireless communication between the device 700 and other devices. The device 700 may access a wireless network based on a communication standard, such as WiFi, 2G or 3G, 4G LTE, 5G NR, or a combination thereof. In an exemplary embodiment, the communication component 716 receives a broadcast signal or broadcast related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 716 further includes a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, infrared data association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the device 700 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, micro-controllers, microprocessors or other electronic components for performing the methods described in any of the above embodiments.
In an exemplary embodiment, a non-transitory computer readable storage medium comprising instructions, such as the memory 704 comprising instructions, executable by the processor 720 of the device 700 to perform the above-described method is also provided. For example, the non-transitory computer readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This disclosure is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (10)

1. A method of controlling a document presentation, the method comprising:
receiving, based on an awakened target application, voice data of a user after receiving an opening instruction for instructing to open a document;
acquiring a recognition result corresponding to the voice data;
when the recognition result is a document control instruction, controlling the presentation process of the currently opened document based on the document control instruction; wherein the document is opened based on the opening instruction.
2. The method according to claim 1, wherein the obtaining of the recognition result corresponding to the voice data comprises:
and sending the voice data to a cloud server, so that the cloud server obtains and returns a recognition result corresponding to the voice data.
3. The method according to claim 1, wherein the obtaining of the recognition result corresponding to the voice data comprises:
converting the voice data into text information;
and recognizing the text information based on a preset database or a natural language processing (NLP) technique to obtain a recognition result corresponding to the voice data.
4. The method according to claim 3, wherein the recognizing the text information to obtain a recognition result corresponding to the voice data comprises:
matching the text information with text contents stored in the database;
when the text information is successfully matched with target text content stored in the database, determining a control instruction corresponding to the target text content in the database as a recognition result corresponding to the voice data;
when the text information fails to match any text content stored in the database, performing natural language processing on the text information based on the NLP technique to obtain a recognition result corresponding to the voice data;
alternatively,
performing, based on the NLP technique, natural language processing on the text information to obtain a recognition result corresponding to the voice data.
5. The method according to claim 1, wherein the obtaining of the recognition result corresponding to the voice data comprises:
detecting the voice data in real time through a pre-trained document control instruction detection model or a pre-stored document control instruction voice library, and determining whether the voice data contains a document control instruction;
when the voice data is determined to contain a document control instruction, determining the document control instruction as a recognition result corresponding to the voice data;
and when the voice data is determined not to contain the document control instruction, natural language processing is carried out on the voice data based on the NLP technology, and a recognition result corresponding to the voice data is obtained.
6. The method of claim 1, wherein the document control instruction comprises at least one of: an instruction for instructing to close the document, an instruction for instructing to enter the slide show mode or to exit the slide show mode, an instruction for instructing to turn pages, an instruction for instructing to jump to a specified page, and an instruction for instructing to jump to a specified position.
7. The method according to claim 1, characterized in that, when the document control instruction is an instruction for instructing to jump to a specified page and the document control instruction carries document content information implicitly indicating the specified page, the controlling the presentation process of the currently opened document based on the document control instruction comprises:
searching a target page matched with the document content information from the document;
and displaying the target page.
8. A control device for document demonstration is characterized in that the device comprises a receiving module, an acquisition module and a processing module,
the receiving module is used for receiving voice data of a user based on the awakened target application after receiving an opening instruction for indicating to open a document;
the acquisition module is used for acquiring the recognition result corresponding to the voice data;
the processing module is used for controlling the presentation process of the currently opened document based on the document control instruction when the recognition result is the document control instruction; wherein the document is opened based on the opening instruction.
9. A control apparatus for document presentation, comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to implement the method of any one of claims 1 to 7.
10. A computer-readable storage medium, on which a computer program is stored, which program, when being executed by a processor, is adapted to carry out the method of any one of claims 1 to 7.
CN202110209998.6A 2021-02-24 2021-02-24 Document demonstration control method, device and equipment Pending CN113035189A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110209998.6A CN113035189A (en) 2021-02-24 2021-02-24 Document demonstration control method, device and equipment

Publications (1)

Publication Number Publication Date
CN113035189A true CN113035189A (en) 2021-06-25

Family

ID=76461567

Country Status (1)

Country Link
CN (1) CN113035189A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103902193A (en) * 2012-12-30 2014-07-02 青岛海尔软件有限公司 System and method for operating computers to change slides by aid of voice
CN109065040A (en) * 2018-08-03 2018-12-21 北京奔流网络信息技术有限公司 A kind of voice information processing method and intelligent electric appliance
US20190318010A1 (en) * 2018-04-11 2019-10-17 Microsoft Technology Licensing, Llc Automated presentation control
CN110992960A (en) * 2019-12-18 2020-04-10 Oppo广东移动通信有限公司 Control method, control device, electronic equipment and storage medium
CN111554000A (en) * 2020-04-29 2020-08-18 深圳供电局有限公司 Method, device and system for carrying out conference and computer equipment
CN112382285A (en) * 2020-11-03 2021-02-19 北京百度网讯科技有限公司 Voice control method, device, electronic equipment and storage medium

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113918114A (en) * 2021-10-15 2022-01-11 腾讯科技(深圳)有限公司 Document control method and device, computer equipment and storage medium
CN113918114B (en) * 2021-10-15 2024-02-13 腾讯科技(深圳)有限公司 Document control method, device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
