WO2019080873A1 - Method for generating annotations and related apparatus - Google Patents

Method for generating annotations and related apparatus

Info

Publication number
WO2019080873A1
Authority
WO
WIPO (PCT)
Prior art keywords
annotation
terminal device
instruction
data stream
target document
Prior art date
Application number
PCT/CN2018/111660
Other languages
English (en)
French (fr)
Inventor
熊飞
任旻
Original Assignee
腾讯科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 腾讯科技(深圳)有限公司 (Tencent Technology (Shenzhen) Co., Ltd.)
Publication of WO2019080873A1 publication Critical patent/WO2019080873A1/zh

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 Handling natural language data
    • G06F 40/10 Text processing
    • G06F 40/166 Editing, e.g. inserting or deleting
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 Speech recognition
    • G10L 15/26 Speech to text systems

Definitions

  • the present application relates to the field of Internet technologies, and in particular, to annotation generation techniques.
  • the embodiment of the present application provides a method for generating annotations and related devices. On one hand, multiple places in a document can be annotated directly, without taking screenshots or modifying the document, which improves the execution efficiency of the solution; on the other hand, the document can be annotated and discussed within the instant messaging application at the same time, which makes the solution more flexible.
  • the first aspect of the present application provides a method for generating an annotation, the method being applied to an instant messaging application, the method comprising:
  • the first terminal device receives, through the instant messaging application, an annotation input instruction set, wherein the annotation input instruction set includes at least one instruction for annotating a target document, each instruction corresponding to a moment in time;
  • the first terminal device determines, according to the instructions in the annotation input instruction set, annotation information corresponding to the target document;
  • the first terminal device synthesizes an annotation video according to the annotation information and the moment corresponding to each instruction.
  • the second aspect of the present application provides a terminal device, where the terminal device is installed with an instant messaging application, including:
  • a receiving module configured to receive, by the instant messaging application, an annotation input instruction set, where the annotation input instruction set includes at least one instruction for annotating a target document, each instruction corresponding to a moment;
  • a determining module configured to determine annotation information corresponding to the target document according to the annotation input instruction set received by the receiving module;
  • a synthesizing module configured to synthesize the annotation video according to the annotation information determined by the determining module and the moment corresponding to each instruction.
  • a third aspect of the present application provides a terminal device, where the terminal device is installed with an instant messaging application, including: a memory, a transceiver, a processor, and a bus system;
  • the memory is used to store a program
  • the processor is configured to execute a program in the memory, including the following steps:
  • receiving, through the instant messaging application, an annotation input instruction set, wherein the annotation input instruction set includes at least one instruction for annotating a target document, each instruction corresponding to a moment in time;
  • the bus system is configured to connect the memory and the processor to cause the memory and the processor to communicate.
  • a fourth aspect of the present application provides a computer readable storage medium having stored therein a computer program; the computer program for performing the method of the first aspect described above.
  • a fifth aspect of the present application provides a computer program product comprising instructions which, when run on a computer, cause the computer to perform the method of the first aspect described above.
  • the embodiments of the present application have the following advantages:
  • a method for generating an annotation is provided.
  • the method is applied to an instant messaging application.
  • the first terminal device receives, through an instant messaging application, an annotation input instruction set, where the annotation input instruction set includes at least one instruction for annotating a target document, each instruction corresponding to a moment in time; the first terminal device then determines, according to the instructions in the annotation input instruction set, the annotation information corresponding to the target document, and finally synthesizes the annotation video according to the annotation information and the moment corresponding to each instruction.
  • on one hand, the document can be annotated directly in multiple places without taking screenshots or modifying the document, thereby improving the execution efficiency of the solution; on the other hand, the document can be annotated and discussed in the instant messaging application at the same time, which makes the solution more flexible.
  • FIG. 1 is a schematic diagram of a relationship between a hierarchical relationship and a display level in an embodiment of the present application
  • FIG. 2 is a schematic diagram of another relationship between a hierarchical relationship and a display level in the embodiment of the present application;
  • FIG. 3 is a schematic diagram of an embodiment of a method for generating annotations in an embodiment of the present application
  • FIG. 4 is a schematic diagram of an interface for enabling a voice annotation function in an embodiment of the present application
  • FIG. 5 is a schematic diagram of an interface for confirming voice annotation in the embodiment of the present application.
  • FIG. 6 is a schematic diagram of an interface of a target document annotation in an embodiment of the present application.
  • FIG. 7 is a schematic diagram of an interface for synthesizing and transmitting an annotation video in an embodiment of the present application.
  • FIG. 8 is a schematic diagram of an interface for displaying subtitles in an annotation video according to an embodiment of the present application.
  • FIG. 9 is a schematic diagram of an interface for confirming voice annotation and video annotation in the embodiment of the present application.
  • FIG. 10 is a schematic diagram of an interface for previewing a target document by using a system plug-in in an application scenario of the present application
  • FIG. 11 is a schematic diagram of an interface for viewing a target document by using a cloud preview in an application scenario of the present application
  • FIG. 12 is a schematic diagram of an embodiment of a terminal device according to an embodiment of the present application.
  • FIG. 12b is a schematic diagram of another embodiment of a terminal device according to an embodiment of the present application.
  • FIG. 13 is a schematic diagram of another embodiment of a terminal device according to an embodiment of the present application.
  • FIG. 14 is a schematic diagram of another embodiment of a terminal device according to an embodiment of the present application.
  • FIG. 15 is a schematic structural diagram of a terminal device according to an embodiment of the present application.
  • the embodiment of the present application provides a method for generating annotations and related devices. On one hand, multiple places in a document can be annotated directly, without taking screenshots or modifying the document, which improves the execution efficiency of the solution; on the other hand, the document can be annotated and discussed within the instant messaging application at the same time, which makes the solution more flexible.
  • instant messaging (IM)
  • application (APP)
  • IM apps commonly used on the Internet include Tencent QQ, WeChat, YiXin, DingTalk, Baidu Hi, Fetion, Ali Wangwang, JD Dongdong, Feiyu, YY, Skype, Google Talk, ICQ, FastMsg, and parox.
  • some instant messaging services provide presence information: displaying a contact list, whether a contact is online, and whether the contact is available to chat.
  • the IM service sends a message to the user when someone on the user's contact list (similar to a phone book) connects to the IM service, and the user can then start communicating with that person in real time over the Internet.
  • this application uses the IM function to open a document preview directly in the IM APP to display the document content, annotate the document, and then record the annotation process.
  • the size of the recording frame cannot be changed during recording; only the pages of the document can be turned. The recording can include page turns, annotation actions, and mouse actions. If the user chooses to turn on the microphone, the audio track captures the microphone input during recording.
  • FIG. 1 is a schematic diagram of a relationship between a hierarchical relationship and a display hierarchy in the embodiment of the present application.
  • an annotation is superimposed on the document preview view.
  • all annotation content corresponds one-to-one with the document; the document can be scrolled in the ScrollView container, and annotation actions can be undone or deleted on the annotation view. All page turns, annotation actions, and mouse actions are recorded.
  • the microphone track, the document operation video, and the annotation operation video are merged into one video, displayed on the preview window, and finally the synthesized video is shared to other users on the IM APP.
  • FIG. 2 is another schematic diagram of the relationship between the hierarchical relationship and the display level in the embodiment of the present application.
  • the Preview Window contains a preview view of the document, where the Document Preview view is used to display the content of the document.
  • the toolbar is used to add annotation elements such as rectangles, circles, arrows, text, labels, and handwriting. You can also undo the previous step, control the microphone switch, and display the recording time.
  • the annotation view is used to display the annotation content.
  • the inside of the ScrollView container contains a document preview view and an annotation view.
  • the ScrollView displays a scroll bar.
  • the added annotations remain fixed relative to the document content.
  • the annotation view is the same size as the document preview view and is a child view of the ScrollView.
  • the annotation view and the document preview view move simultaneously and remain in the same position. This will ensure that the annotations and document content are not misplaced.
  • the added annotations are fixed relative to the document content.
  • when the document preview view is zoomed, its size changes. In this case, the size of the annotation view is adjusted so that it always matches the size of the document preview view, and the relative position is unchanged.
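The size-matching behavior described above can be sketched as follows; the function names are illustrative assumptions, not identifiers from this application:

```python
def resize_annotation_view(doc_width, doc_height, zoom):
    """Return the annotation view size matching the zoomed document
    preview view, so annotations stay aligned with the document content."""
    return (doc_width * zoom, doc_height * zoom)

def scale_annotation_point(x, y, old_zoom, new_zoom):
    """Move an annotation point when the zoom changes so that its
    position relative to the document content is unchanged."""
    factor = new_zoom / old_zoom
    return (x * factor, y * factor)
```

Because both views are resized by the same factor and remain co-located, the annotation layer never drifts relative to the page content.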
  • an embodiment of the method for generating annotation in the embodiment of the present application includes:
  • the first terminal device receives the annotation input instruction set by the instant messaging application, wherein the annotation input instruction set includes at least one instruction for annotating the target document, and each instruction corresponds to one moment.
  • the first terminal device receives the user-triggered annotation input instruction set through the IM APP
  • the annotation input instruction set includes at least one instruction for annotating the target document, for example, an instruction to add a rectangular frame, an instruction to add a circle, an instruction to add an arrow, or an instruction to add text.
  • the instruction for annotating the target document may further include an undo instruction, a delete instruction, a display recording time instruction, and a video and audio recording instruction.
  • the target document may be any document type supported by the IM APP, such as a picture, a Microsoft Office Word document, or a portable document format (PDF) document, which is not limited herein.
  • each instruction corresponds to a time, for example, 10 minutes and 25 seconds to start inputting text, 12 minutes and 37 seconds to start adding a rectangular frame.
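The timestamped instruction set described above can be sketched as a simple data structure; the names below (AnnotationInstruction, replay_order) are illustrative assumptions, not identifiers from this application:

```python
from dataclasses import dataclass

@dataclass
class AnnotationInstruction:
    kind: str      # e.g. "text", "rectangle", "circle", "arrow"
    moment: float  # seconds from the start of recording

def replay_order(instructions):
    """Sort instructions by their moment: the order in which they are
    redrawn when the annotation video is synthesized."""
    return sorted(instructions, key=lambda ins: ins.moment)

instruction_set = [
    AnnotationInstruction("rectangle", 12 * 60 + 37),  # 12 min 37 s
    AnnotationInstruction("text", 10 * 60 + 25),       # 10 min 25 s
]
ordered = replay_order(instruction_set)
```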
  • the first terminal device determines, according to the instruction in the annotation input instruction set, the annotation information corresponding to the target document.
  • the first terminal device may determine the annotation information included in the target document according to the received instruction in the annotation input instruction set, and the annotation information of the target document is as shown in Table 1 below.
  • the first terminal device synthesizes the annotation video according to the annotation information and the time corresponding to each instruction.
  • the first terminal device can synthesize an annotation video according to the annotation information and the time corresponding to each instruction, and the annotation video is a video of the recording annotation process.
  • the first terminal device can transmit the annotation video to the second terminal device, wherein the second terminal device is configured to receive and display the annotation video through the instant messaging application.
  • the first terminal device may send the annotation video to the at least one second terminal device through the IM APP.
  • step 101 to step 103, as well as sending the annotation video to the second terminal device, are all performed in the same IM APP; during this process, the user does not need to exit the IM APP to record the annotation video. That is, after receiving the target document in the IM APP, the user can start annotating directly and record the process into the corresponding annotation video.
  • the process of the entire annotation can be seen by directly opening the annotation video through the IM APP.
  • a method for generating an annotation is provided.
  • the method is applied to an instant messaging application.
  • the first terminal device receives, through an instant messaging application, an annotation input instruction set, where the annotation input instruction set includes at least one instruction for annotating a target document, each instruction corresponding to a moment in time; the first terminal device then determines, according to the instructions in the annotation input instruction set, the annotation information corresponding to the target document, and finally synthesizes the annotation video according to the annotation information and the moment corresponding to each instruction.
  • on one hand, the document can be annotated directly in multiple places without taking screenshots or modifying the document, thereby improving the execution efficiency of the solution; on the other hand, the document can be annotated and discussed in the instant messaging application at the same time, which makes the solution more flexible.
  • optionally, before the first terminal device synthesizes the annotation video according to the annotation information and the moment corresponding to each instruction, the method may further include: the first terminal device receiving an audio data stream;
  • the first terminal device synthesizes the annotation video according to the annotation information and the time corresponding to each instruction, and may include:
  • the first terminal device synthesizes the annotation video according to the annotation information, the time corresponding to each instruction, and the audio data stream, wherein the time corresponding to each instruction has a corresponding relationship with the time identifier carried in the audio data stream.
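The correspondence between instruction moments and the time identifiers carried in the audio data stream can be sketched as a merge into one ordered timeline; the function name and tuple shapes here are illustrative assumptions:

```python
def build_timeline(annotation_events, audio_chunks):
    """Merge annotation events and audio chunks into one timeline keyed
    by their time identifiers, so drawing and sound stay in sync when
    the annotation video is synthesized.
    annotation_events: list of (moment_seconds, description)
    audio_chunks: list of (moment_seconds, chunk)
    """
    timeline = [(t, "annotation", e) for t, e in annotation_events]
    timeline += [(t, "audio", a) for t, a in audio_chunks]
    return sorted(timeline, key=lambda item: item[0])
```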
  • FIG. 4 is a schematic diagram of an interface for opening a voice annotation function according to an embodiment of the present application.
  • a user sends a target document in an IM APP; if the target document is a Word document, a "voice annotation" entry can be added next to the target document bubble. Clicking "Voice Annotation" opens the target document for browsing and provides an entry for starting the annotation.
  • FIG. 5 is a schematic diagram of an interface for confirming voice annotation in the embodiment of the present application.
  • the user can click to turn on the microphone, then click "Start Voice Annotation" to enter the voice annotation stage.
  • FIG. 6 is a schematic diagram of an interface of the target document annotation in the embodiment of the present application. As shown in the figure, the user can use the tools to annotate the target document while explaining by voice, which helps the listener better understand the annotation.
  • FIG. 7 is a schematic diagram of an interface for synthesizing and sending an annotation video in the embodiment of the present application. As shown in the figure, the user may choose to save the video locally or share it with other users.
  • the first terminal device can receive an audio data stream in addition to the annotation input instruction set; that is, the user can record audio while annotating, and the final synthesized annotation video includes the audio data stream.
  • in this way, the annotation experience of the document can be improved, and combining annotation with voice helps increase the efficiency of annotation and expression.
  • optionally, after the first terminal device receives the audio data stream, the method may further include:
  • the first terminal device processes the audio data stream by using a voice recognition model, and acquires subtitle information corresponding to the audio data stream;
  • the first terminal device synthesizes the annotation video according to the annotation information, the time corresponding to each instruction, and the audio data stream, and may include:
  • the first terminal device synthesizes the annotation video according to the annotation information, the time corresponding to each instruction, the audio data stream, and the subtitle information.
  • the first terminal device may further process the audio data stream by using a speech recognition model and obtain the subtitle information corresponding to the audio data stream, so that when the second terminal device displays the annotation video, subtitles corresponding to the audio data stream can also be displayed.
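One possible sketch of turning recognized speech into subtitle information is shown below; the segment format and function name are assumptions, and the speech recognizer producing the segments is hypothetical:

```python
def to_subtitle_cues(recognized_segments):
    """Convert (start_s, end_s, text) segments from a hypothetical speech
    recognizer into subtitle cues to overlay on the annotation video."""
    return [
        {"start": start, "end": end, "text": text.strip()}
        for start, end, text in recognized_segments
        if text.strip()  # drop silent or empty segments
    ]
```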
  • FIG. 8 is a schematic diagram of an interface for displaying subtitles in an annotation video according to an embodiment of the present application.
  • subtitles corresponding to the audio data stream may also be displayed. It should be noted that the subtitle position at the bottom of FIG. 8 is only schematic; in practical applications, the subtitle position can be adjusted according to user habits.
  • speech recognition models include, but are not limited to, acoustic models and language models.
  • the language model represents the probability of occurrence of a sequence of words.
  • the chain rule is used to decompose the probability of a sentence into the product of the conditional probabilities of the words in it.
  • the task of an acoustic model is to give the probability of the observed speech given the text.
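The chain-rule decomposition used by the language model can be written as a short sketch; `cond_log_prob` stands in for a hypothetical language-model lookup and is an assumption, not part of this application:

```python
import math

def sentence_log_prob(words, cond_log_prob):
    """Chain rule: log P(w1..wn) = sum over i of log P(wi | w1..w(i-1)).
    cond_log_prob(word, history) is a hypothetical language-model callback
    returning the conditional log-probability of a word given its prefix."""
    total = 0.0
    for i, word in enumerate(words):
        total += cond_log_prob(word, words[:i])
    return total
```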
  • subtitle information may be displayed below the annotation video, or may be displayed above the annotation video, or set according to user requirements, which is not limited herein.
  • the terminal device processes the audio data stream by using a speech recognition model, obtains the subtitle information corresponding to the audio data stream, and then synthesizes the annotation video by combining the annotation information, the moment corresponding to each instruction, the audio data stream, and the subtitle information.
  • optionally, on the basis of the first or second embodiment corresponding to FIG. 3, before the first terminal device synthesizes the annotation video according to the annotation information, the moment corresponding to each instruction, and the audio data stream, the method may further include: the first terminal device receiving a video data stream;
  • the first terminal device synthesizes the annotation video according to the annotation information, the time corresponding to each instruction, and the audio data stream, and may include:
  • the first terminal device synthesizes the annotation video according to the annotation information, the moment corresponding to each instruction, the audio data stream, and the video data stream, where the moment corresponding to each instruction, the time identifier carried in the audio data stream, and the time identifier carried in the video data stream have a corresponding relationship.
  • the first terminal device may receive the video data stream in addition to the audio data stream before synthesizing the annotation video according to the annotation information and the time corresponding to each instruction.
  • the video data stream is captured by the camera. For example, if the user turns on video recording at the same time, the user's expressions and actions during annotation can be recorded as a video and synthesized into the annotation video together with the annotation information and the audio data stream.
  • the time corresponding to each instruction, the time identifier of the audio data stream, and the time identifier of the video data stream are important reference values for synthesizing the annotation video, so that the problem that the audio and video are not synchronized can be prevented.
  • FIG. 9 is a schematic diagram of an interface for confirming voice annotation and video annotation in the embodiment of the present application.
  • a “camera” can also be selected, so that video recording can be performed.
  • the video display position at the upper right of FIG. 9 is only schematic. In practical applications, the video display position can be adjusted according to user habits.
  • the first terminal device can receive an audio data stream and a video data stream in addition to the annotation input instruction set; that is, the user can record audio and video while annotating, and the final synthesized annotation video includes the audio data stream and the video data stream.
  • the annotation experience of the document can be better improved, and the annotation method combining voice and video is adopted, which is beneficial to increase the efficiency of annotation and expression.
  • optionally, before the first terminal device receives the annotation input instruction set through the instant messaging application, the method may further include:
  • the first terminal device acquires the document type of the target document;
  • the first terminal device determines whether the document type of the target document belongs to a preset document type;
  • if so, the first terminal device displays the target document on the display interface of the instant messaging application;
  • if not, the first terminal device displays the target document by calling a system plug-in.
  • the document type of the target document needs to be acquired first. If the document type belongs to the preset document type, the content of the target document can be displayed directly in the document preview view of the IM APP.
  • the preset document type can be a text file or a picture file. If the target document is not of a preset document type, the system plug-in needs to be called to display the target document.
  • a system plug-in is a program written in accordance with a certain specification of the application program interface.
  • the system plug-in runs under the system platform specified by the program (it may support multiple platforms at the same time) and cannot run apart from the specified platform. Because a system plug-in needs to call function libraries or data provided by the underlying system, many IM APPs include system plug-ins.
  • the first terminal device may display the target document by calling a system plug-in in the IM APP, or may display the target document by calling a system plug-in in the operating system.
  • the terminal device may also obtain the document type of the target document. If the document type of the target document belongs to the preset document type, the terminal device displays the target document directly in the instant messaging application; otherwise, the terminal device needs to call the system plug-in and display the target document through it. In this way, even if the instant messaging application does not support a certain document type, the system plug-in can be called to display the target document of that type, which improves the feasibility and operability of the solution and makes it applicable to various types of target documents.
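The document-type dispatch described above can be sketched as follows; the preset type set and the callback names are illustrative assumptions, not part of this application:

```python
PRESET_TYPES = {"txt", "png", "jpg"}  # assumed preset document types

def display_target_document(doc_name, show_native, show_via_plugin):
    """Dispatch sketch: render directly in the IM APP for preset types,
    otherwise fall back to the system plug-in."""
    ext = doc_name.rsplit(".", 1)[-1].lower()
    if ext in PRESET_TYPES:
        return show_native(doc_name)
    return show_via_plugin(doc_name)
```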
  • the first terminal device sends a document browsing instruction to the server, so that the server generates a preview image corresponding to the target document according to the document browsing instruction, where the document browsing instruction carries the identifier of the target document;
  • the first terminal device displays the target document by calling the system plug-in, which may include:
  • the first terminal device displays the preview images corresponding to the target document in sequence by calling the system plug-in.
  • the document browsing instruction may be further sent to the server, that is, the “cloud preview” function is started.
  • the server retrieves the target document from storage according to the identifier carried in the document browsing instruction and sends the target document to the first terminal device in the form of preview images.
  • the first terminal device displays each preview image corresponding to the target document in order from front to back or from back to front.
  • the user can annotate each preview image.
  • for example, if the target document has a total of ten preview images, the synthesized annotation video also includes the annotations made on those ten preview images.
  • the process by which the server retrieves the target document in the background is indexed by the identifier of the target document; each target document corresponds to one identifier, so the identifier is unique.
  • the identifier of the target document may be a message digest algorithm 5 (MD5) value or a secure hash algorithm (SHA) value, or another type of identifier, which is not limited herein.
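A minimal sketch of deriving such an identifier, assuming the MD5 variant mentioned in the text; the function name is an illustrative assumption:

```python
import hashlib

def document_identifier(content: bytes) -> str:
    """Derive the document identifier from the file content; MD5 is used
    here per the example in the text, though a SHA digest works the same
    way. Identical content always maps to the same identifier."""
    return hashlib.md5(content).hexdigest()
```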
  • the foregoing describes how to display a target document by calling a system plug-in; the target document may be displayed as preview images in a certain order.
  • in this way, when recording the annotation video, the user can annotate the target document in a reasonable order, thereby improving the rationality and feasibility of the solution.
  • the first terminal device receiving the annotation input instruction set through the instant messaging application may include:
  • the first terminal device receives the first annotation input instruction subset corresponding to the first preview image by using the instant messaging application, wherein the first preview image is a preview image corresponding to the target document, and the first annotation input instruction subset belongs to the annotation input instruction set;
  • the first terminal device receives the second annotation input instruction subset corresponding to the second preview image by using the instant messaging application, wherein the second preview image is a preview image corresponding to the target document, and the second annotation input instruction subset belongs to the annotation input instruction set;
  • the first terminal device creates an annotation data array according to the first preview image, the first annotation input instruction subset, the second preview image, and the second annotation input instruction subset, wherein the annotation data array includes the correspondence between preview images and annotation input instruction subsets;
  • the first terminal device determines, according to the annotation input instruction set, the annotation information corresponding to the target document, which may include:
  • the first terminal device determines the annotation information corresponding to the target document according to the annotation input instruction set, the preview image corresponding to the target document, and the annotation data array.
  • the target document includes a two-page preview, namely a first preview image and a second preview image. The user annotates the first preview image, that is, the first preview image corresponds to the first annotation input instruction subset; the user then annotates the second preview image, that is, the second preview image corresponds to the second annotation input instruction subset.
  • the first terminal device will maintain an array of annotation data, as shown in Table 2.
  • Preview image | Annotation input instruction subset
    First preview image | First annotation input instruction subset
    Second preview image | Second annotation input instruction subset
  • the correspondence between the preview image and the annotation input instruction sub-set may also be included in the annotation data array.
  • Table 2 is only a schematic and should not be construed as limiting the application.
  • the number of elements in the annotation data array is the same as the number of pages in the target document.
  • the annotation input instruction subsets are stored in the array. The user can switch pages with the page-turn button or by selecting a preview image. When the user turns a page, the annotation view is cleared; after the page turn, the corresponding annotation input instruction subset is retrieved from the annotation data array according to the current page number and drawn on the annotation view.
  • the user can annotate each page; each page is one preview image, and the annotations made on a preview image form an annotation input instruction subset.
  • the terminal device stores the correspondence between the preview picture and the annotation input instruction subset in the form of an annotation data array.
  • when synthesizing the annotation video, the terminal device can obtain the correspondence between annotations and pages from the annotation data array, so that the accuracy of the synthesized annotation video is effectively improved for multi-page documents and misalignment between annotations and pages is avoided.
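The annotation data array described above can be sketched as a per-page mapping; the class and method names are illustrative assumptions, not identifiers from this application:

```python
class AnnotationDataArray:
    """Per-page store mapping each preview image (by page number) to its
    annotation input instruction subset, as in Table 2."""

    def __init__(self):
        self._pages = {}

    def record(self, page, instruction):
        # Append an instruction to the subset for the given page.
        self._pages.setdefault(page, []).append(instruction)

    def on_page_turn(self, page):
        # The annotation view is cleared on a page turn, then redrawn
        # from the subset stored for the newly shown page.
        return self._pages.get(page, [])
```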
  • in one application scenario, the IM APP is QQ, developed by Tencent.
  • User A wants to open a presentation (PowerPoint, PPT) on QQ, but QQ cannot directly open a PPT. Therefore, QQ can call the system plug-in to display the content of the PPT, as shown in FIG. 10, which is a schematic diagram of an interface for previewing a target document by using a system plug-in in the application scenario of the present application.
  • The server is queried as to whether it supports cloud preview for this type of file. If the file can be previewed in the cloud, a "Cloud Preview" button is displayed in the preview view.
  • In the cloud preview mode for PPT files, the server has software installed that supports opening the PPT format, such as Microsoft Office.
  • The server uses Microsoft Office to open the PPT file and stores each page of the PPT as an image file, then sends all the image files to the client, in the order of the pages in the PPT, for viewing.
  • FIG. 11 is a schematic diagram of an interface for viewing a target document by using a cloud preview in the application scenario of the present application.
  • The server uses the MD5 value of the PPT file as an index to manage and cache the generated preview images.
  • The preview window first asks the cloud preview server whether the PPT file needs to be uploaded.
  • The cloud preview server checks whether a cache of the image files for the file's preview content already exists in the cloud. If a user previewed the file a while ago, the server has a cache; in this case, the server can notify the client that the PPT file does not need to be uploaded and that the preview images are ready.
  • If the server has not cached the image files, it checks, via the MD5 index, whether the cloud has a cached copy of the PPT file itself. If a user has previously stored the file on a cloud disk or performed a QQ offline file transfer with it, the cloud has a cached copy of the file; the server then opens the file, generates the preview images, and notifies the client that the PPT file does not need to be uploaded and that the preview images are ready. Otherwise, the server needs to notify the client to upload the PPT file.
  • After the client uploads the PPT file, the server opens the PPT file, generates the preview images, and notifies the client that the preview images are ready. After receiving this notification, the client requests the preview images from the server; the server tells the client the total number of preview images, and the client fetches each preview image and displays it in the preview window.
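The two-level cache decision just described (cached preview images first, then a cached copy of the file itself, then upload) can be sketched as below. This is a hedged illustration: the dicts stand in for server-side stores, `render_pages` is a placeholder for "open the PPT and store each page as an image", and all names are assumptions made for this example.

```python
# Hedged sketch of the MD5-indexed preview cache decision flow.
import hashlib

preview_cache = {}   # md5 -> list of rendered page images
file_cache = {}      # md5 -> raw document bytes (cloud disk / offline transfer)

def render_pages(document_bytes):
    # placeholder for "open the PPT and store each page as an image"
    return [b"page-image-%d" % i for i in range(3)]

def needs_upload(document_md5):
    """Return True if the client must upload the file before previewing."""
    if document_md5 in preview_cache:
        return False                       # preview images already cached
    if document_md5 in file_cache:
        # the cloud already holds the file itself: render now, no upload
        preview_cache[document_md5] = render_pages(file_cache[document_md5])
        return False
    return True                            # nothing cached: ask for upload

doc = b"fake ppt bytes"
md5 = hashlib.md5(doc).hexdigest()
print(needs_upload(md5))                   # True: nothing cached yet
file_cache[md5] = doc                      # simulate a prior cloud-disk upload
print(needs_upload(md5))                   # False: rendered from the file cache
```

Keying both caches by the file's MD5 means two users sharing an identical file hit the same cache entry, which is why a prior cloud-disk upload or offline transfer by anyone lets later users skip the upload.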
  • FIG. 12a is a schematic diagram of an embodiment of a terminal device according to an embodiment of the present disclosure.
  • the terminal device 20 includes:
  • the receiving module 201 is configured to receive, by the instant messaging application, an annotation input instruction set, where the annotation input instruction set includes at least one instruction for annotating a target document, where each instruction corresponds to one moment;
  • the determining module 202 is configured to determine the annotation information corresponding to the target document according to the instruction in the annotation input instruction set received by the receiving module 201;
  • the synthesizing module 203 is configured to synthesize the annotation video according to the annotation information determined by the determining module 202 and the time corresponding to each instruction.
  • In the embodiment of the present application, the receiving module 201 receives the annotation input instruction set through the instant messaging application, where the annotation input instruction set includes at least one instruction for annotating the target document and each instruction corresponds to a moment; the determining module 202 determines the annotation information corresponding to the target document according to the instructions in the annotation input instruction set received by the receiving module 201; and the synthesizing module 203 synthesizes the annotation video according to the annotation information determined by the determining module 202 and the moment corresponding to each instruction.
  • A terminal device receives an annotation input instruction set through the instant messaging application, where the annotation input instruction set includes at least one instruction for annotating the target document and each instruction corresponds to a moment; the annotation information corresponding to the target document can then be determined according to the instructions in the annotation input instruction set.
  • The first terminal device synthesizes the annotation video according to the annotation information and the moment corresponding to each instruction.
  • In this way, on the one hand, multiple places in the document can be annotated directly without taking screenshots of or modifying the document, which improves the execution efficiency of the solution; on the other hand, the document can be annotated and discussed within the instant messaging application at the same time, making the solution more flexible.
  • the terminal device 20 further includes:
  • the sending module 204 is configured to send the annotation video synthesized by the synthesizing module 203 to the second terminal device, where the second terminal device is configured to receive and display the annotation video by using the instant messaging application.
  • the terminal device 20 further includes:
  • The receiving module 201 is further configured to receive an audio data stream before the synthesizing module 203 synthesizes the annotation video according to the annotation information and the moment corresponding to each instruction, where the audio data stream carries time identifiers;
  • The synthesizing module 203 is specifically configured to synthesize the annotation video according to the annotation information, the moment corresponding to each instruction, and the audio data stream, where the moment corresponding to each instruction corresponds to the time identifiers carried in the audio data stream.
  • In addition to receiving the annotation input instruction set, the first terminal device can also receive an audio data stream; that is, the user can record audio while annotating, and the final synthesized annotation video includes the audio data stream.
  • In this way, the annotation experience for documents can be improved, and combining voice with annotation helps increase the efficiency of annotation and expression.
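The correspondence between instruction moments and the audio stream's time identifiers can be illustrated with a small alignment routine. This is an assumed sketch, not the patent's implementation: the `align` function and the "one identifier per audio chunk" model are invented here to show how a drawing action can be matched to the audio that was playing at its moment.

```python
# Illustrative sketch of matching each annotation instruction's moment
# against the time identifiers carried in the audio data stream.
from bisect import bisect_right

def align(instructions, audio_chunks):
    """instructions: [(moment_seconds, action)], sorted by moment.
    audio_chunks: sorted list of chunk start times (the time identifiers).
    Returns, for each instruction, the index of the audio chunk that was
    playing at that moment, so drawing and sound stay in sync."""
    aligned = []
    for moment, action in instructions:
        chunk_index = bisect_right(audio_chunks, moment) - 1
        aligned.append((action, max(chunk_index, 0)))
    return aligned

instructions = [(0.4, "handwrite NO"), (2.1, "draw arrow")]
audio_chunks = [0.0, 1.0, 2.0, 3.0]       # one identifier per second of audio
print(align(instructions, audio_chunks))  # [('handwrite NO', 0), ('draw arrow', 2)]
```

Binary search over the sorted time identifiers keeps the lookup O(log n) per instruction, which matters little here but mirrors how a muxer would seek into a timestamped stream.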
  • the terminal device 20 further includes an obtaining module 205;
  • The obtaining module 205 is configured to process the audio data stream by using a speech recognition model after the receiving module 201 receives the audio data stream, and acquire the subtitle information corresponding to the audio data stream;
  • The synthesizing module 203 is specifically configured to synthesize the annotation video according to the annotation information, the moment corresponding to each instruction, the audio data stream, and the subtitle information.
  • The terminal device processes the audio data stream by using a speech recognition model, acquires the subtitle information corresponding to the audio data stream, and then synthesizes the annotation video by combining the annotation information, the moment corresponding to each instruction, the audio data stream, and the subtitle information.
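Turning recognition output into overlayable subtitles can be sketched as below. This is a minimal, assumed illustration: the recognizer itself is mocked as a list of timed fragments, and the merge-by-gap heuristic is an invention of this example rather than anything specified in the patent.

```python
# Minimal sketch of turning speech-recognition results into timed subtitle
# cues that can be overlaid on the annotation video.

def make_subtitle_cues(recognized, gap=0.2):
    """recognized: [(start_s, end_s, text)] from a speech recognition model.
    Merges fragments separated by less than `gap` seconds into one cue."""
    cues = []
    for start, end, text in recognized:
        if cues and start - cues[-1][1] < gap:
            prev_start, _, prev_text = cues[-1]
            cues[-1] = (prev_start, end, prev_text + " " + text)
        else:
            cues.append((start, end, text))
    return cues

recognized = [(0.0, 1.1, "this chart"), (1.2, 2.0, "is wrong"), (4.0, 4.8, "fix it")]
print(make_subtitle_cues(recognized))
# [(0.0, 2.0, 'this chart is wrong'), (4.0, 4.8, 'fix it')]
```

Each cue keeps its start and end times, so the synthesizer can show the cue text only during the frames whose timestamps fall inside that interval.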
  • the terminal device 20 further includes:
  • The receiving module 201 is further configured to receive a video data stream before the synthesizing module 203 synthesizes the annotation video according to the annotation information, the moment corresponding to each instruction, and the audio data stream,
  • where the video data stream carries time identifiers;
  • The synthesizing module 203 is specifically configured to synthesize the annotation video according to the annotation information, the moment corresponding to each instruction, the audio data stream, and the video data stream, where the moment corresponding to each instruction, the time identifiers carried in the audio data stream, and the time identifiers carried in the video data stream have a corresponding relationship.
  • In addition to receiving the annotation input instruction set, the first terminal device can receive an audio data stream and a video data stream; that is, the user can record audio and video while annotating, and the final synthesized annotation video includes the audio data stream and the video data stream.
  • In this way, the annotation experience for documents can be further improved, and combining voice and video with annotation helps increase the efficiency of annotation and expression.
  • the terminal device 20 further includes a determining module 206 and a display module 207;
  • The obtaining module 205 is further configured to acquire the document type of the target document before the receiving module 201 receives the annotation input instruction set through the instant messaging application;
  • The determining module 206 is configured to determine whether the document type of the target document acquired by the obtaining module 205 belongs to a preset document type;
  • The display module 207 is configured to display the target document on a display interface of the instant messaging application if the determining module 206 determines that the document type of the target document belongs to the preset document type;
  • The display module 207 is configured to display the target document by calling a system plug-in if the determining module 206 determines that the document type of the target document does not belong to the preset document type.
  • The terminal device may also acquire the document type of the target document. If the document type of the target document belongs to the preset document type, the terminal device displays the target document directly in the instant messaging application; otherwise, the terminal device needs to call a system plug-in to display the document through the system plug-in. In this way, even if the instant messaging application does not support a certain document type, the system plug-in can be called to display the target document of that document type, which improves the feasibility and operability of the solution and makes it applicable to various types of target documents.
  • the terminal device 20 further includes:
  • The sending module 204 is configured to send a document browsing instruction to the server before the target document is displayed by calling the system plug-in, so that the server generates the preview pictures of the target document according to the document browsing instruction, where the document browsing instruction carries an identifier of the target document;
  • The receiving module 201 is configured to receive the preview pictures sent by the server;
  • The display module 207 is specifically configured to display the preview pictures corresponding to the target document in sequence by calling the system plug-in.
  • The above introduces how to display the target document by calling a system plug-in: the target document can be displayed in the form of pictures in a certain order.
  • When recording the annotation video, the user can annotate the target document in a reasonable order, which improves the rationality and feasibility of the solution.
  • the terminal device 20 further includes:
  • The receiving module 201 is configured to receive, through the instant messaging application, a first annotation input instruction subset corresponding to a first preview picture, where the first preview picture is a preview picture corresponding to the target document, and
  • the first annotation input instruction subset belongs to the annotation input instruction set;
  • the annotation data array includes the correspondence between preview pictures and annotation input instruction subsets;
  • The determining module 202 is configured to determine the annotation information corresponding to the target document according to the annotation input instruction set, the preview pictures corresponding to the target document, and the annotation data array.
  • The user can annotate each page; each page is a preview picture, and the annotations made on a preview picture form an annotation input instruction subset.
  • The terminal device stores the correspondence between preview pictures and annotation input instruction subsets in the form of a data array. In this way, when synthesizing the annotation video, the terminal device can obtain the correspondence between annotations and pages from the data array, so that the accuracy of the synthesized annotation video can be effectively improved in the case of a multi-page document, and misalignment between annotations and pages is avoided.
  • the embodiment of the present application further provides another terminal device.
  • The terminal may be any terminal device including a mobile phone, a tablet computer, a personal digital assistant (PDA), a point of sale (POS) terminal, an in-vehicle computer, and the like; the following takes a mobile phone as an example:
  • FIG. 15 is a block diagram showing a partial structure of a mobile phone related to a terminal provided by an embodiment of the present application.
  • the mobile phone includes: a radio frequency (RF) circuit 310 , a memory 320 , an input unit 330 , a display unit 340 , a sensor 350 , an audio circuit 360 , a wireless fidelity (WiFi) module 370 , and a processor 380 .
  • The RF circuit 310 can be used to receive and send signals during information transmission and reception or during a call. Specifically, after receiving downlink information from the base station, the RF circuit 310 passes it to the processor 380 for processing; in addition, it sends uplink data to the base station.
  • Generally, the RF circuit 310 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a low noise amplifier (LNA), a duplexer, and the like. In addition, the RF circuit 310 can also communicate with networks and other devices via wireless communication.
  • The above wireless communication may use any communication standard or protocol, including but not limited to global system for mobile communications (GSM), general packet radio service (GPRS), code division multiple access (CDMA), wideband code division multiple access (WCDMA), long term evolution (LTE), e-mail, short messaging service (SMS), and the like.
  • The memory 320 can be used to store software programs and modules, and the processor 380 executes various functional applications and data processing of the mobile phone by running the software programs and modules stored in the memory 320.
  • The memory 320 may mainly include a program storage area and a data storage area, where the program storage area may store an operating system, an application required by at least one function (such as a sound playing function or an image playing function), and the like, and the data storage area may store data created according to the use of the mobile phone (such as audio data or a phone book).
  • In addition, the memory 320 can include high-speed random access memory, and can also include non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or another non-volatile solid-state storage device.
  • the input unit 330 can be configured to receive input numeric or character information and to generate key signal inputs related to user settings and function controls of the handset.
  • the input unit 330 may include a touch panel 331 and other input devices 332.
  • The touch panel 331, also referred to as a touch screen, can collect touch operations by the user on or near it (such as operations performed by the user on or near the touch panel 331 with a finger, a stylus, or any other suitable object or accessory) and drive the corresponding connected devices according to a preset program.
  • Optionally, the touch panel 331 can include two parts: a touch detection device and a touch controller.
  • The touch detection device detects the user's touch position, detects the signal produced by the touch operation, and transmits the signal to the touch controller; the touch controller receives the touch information from the touch detection device, converts it into contact coordinates, and sends them to the processor 380; it can also receive commands from the processor 380 and execute them.
  • In addition, the touch panel 331 can be implemented in various types, such as resistive, capacitive, infrared, and surface acoustic wave.
  • the input unit 330 may also include other input devices 332.
  • other input devices 332 may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control buttons, switch buttons, etc.), trackballs, mice, joysticks, and the like.
  • the display unit 340 can be used to display information input by the user or information provided to the user as well as various menus of the mobile phone.
  • the display unit 340 can include a display panel 341.
  • the display panel 341 can be configured in the form of a liquid crystal display (LCD), an organic light-emitting diode (OLED), or the like.
  • Further, the touch panel 331 can cover the display panel 341. When the touch panel 331 detects a touch operation on or near it, it transmits the operation to the processor 380 to determine the type of the touch event, and then the processor 380 provides a corresponding visual output on the display panel 341 according to the type of the touch event.
  • Although in FIG. 15 the touch panel 331 and the display panel 341 are used as two independent components to implement the input and output functions of the mobile phone, in some embodiments the touch panel 331 and the display panel 341 may be integrated to implement the input and output functions of the mobile phone.
  • the handset can also include at least one type of sensor 350, such as a light sensor, motion sensor, and other sensors.
  • The light sensor may include an ambient light sensor and a proximity sensor, where the ambient light sensor may adjust the brightness of the display panel 341 according to the brightness of the ambient light, and the proximity sensor may turn off the display panel 341 and/or the backlight when the mobile phone moves close to the ear.
  • As one type of motion sensor, the accelerometer sensor can detect the magnitude of acceleration in all directions (usually three axes) and can detect the magnitude and direction of gravity when stationary.
  • It can be used in applications for recognizing the posture of the mobile phone (such as landscape/portrait switching, related games, and magnetometer attitude calibration), vibration-recognition related functions (such as a pedometer and tapping), and the like. Other sensors, such as a gyroscope, a barometer, a hygrometer, a thermometer, and an infrared sensor, may also be configured on the mobile phone; details are not described herein.
  • the audio circuit 360, the speaker 361, and the microphone 362 provide an audio interface between the user and the handset.
  • The audio circuit 360 can convert received audio data into an electrical signal and transmit it to the speaker 361, which converts it into a sound signal for output; on the other hand, the microphone 362 converts collected sound signals into electrical signals, which the audio circuit 360 receives and converts into audio data. After the audio data is processed by the processor 380, it is sent via the RF circuit 310 to, for example, another mobile phone, or output to the memory 320 for further processing.
  • WiFi is a short-range wireless transmission technology.
  • the mobile phone can help users to send and receive emails, browse web pages and access streaming media through the WiFi module 370, which provides users with wireless broadband Internet access.
  • Although FIG. 15 shows the WiFi module 370, it can be understood that it is not an essential part of the mobile phone and can be omitted as needed without changing the essence of the invention.
  • The processor 380 is the control center of the mobile phone. It connects the various parts of the entire mobile phone by using various interfaces and lines, and performs the various functions of the mobile phone and processes data by running or executing the software programs and/or modules stored in the memory 320 and invoking the data stored in the memory 320, thereby monitoring the mobile phone as a whole.
  • Optionally, the processor 380 may include one or more processing units; optionally, the processor 380 may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, user interfaces, applications, and the like,
  • and the modem processor mainly handles wireless communication. It can be understood that the above modem processor may also not be integrated into the processor 380.
  • The mobile phone also includes a power source 390 (such as a battery) that supplies power to the various components.
  • Optionally, the power source can be logically coupled to the processor 380 through a power management system, thereby implementing functions such as managing charging, discharging, and power consumption through the power management system.
  • the mobile phone may further include a camera, a Bluetooth module, and the like, and details are not described herein again.
  • The processor 380 included in the terminal further has the following functions:
  • receiving, through the instant messaging application, an annotation input instruction set, where the annotation input instruction set includes at least one instruction for annotating the target document, and each instruction corresponds to a moment;
  • determining, according to the instructions in the annotation input instruction set, the annotation information corresponding to the target document; and
  • synthesizing the annotation video according to the annotation information and the moment corresponding to each instruction.
  • processor 380 is further configured to perform the following steps:
  • processor 380 is further configured to perform the following steps:
  • the processor 380 is specifically configured to perform the following steps:
  • processor 380 is specifically configured to perform the following steps:
  • the annotation video is synthesized according to the annotation information, the time corresponding to each instruction, the audio data stream, and the subtitle information.
  • processor 380 is further configured to perform the following steps:
  • the processor 380 is specifically configured to perform the following steps:
  • the moment corresponding to each instruction, the time identifiers carried in the audio data stream, and the time identifiers carried in the video data stream have a corresponding relationship.
  • processor 380 is further configured to perform the following steps:
  • if the document type of the target document belongs to the preset document type, displaying the target document on a display interface of the instant messaging application;
  • if the document type of the target document does not belong to the preset document type, displaying the target document by calling a system plug-in.
  • processor 380 is further configured to perform the following steps:
  • the processor 380 is specifically configured to perform the following steps:
  • the preview image corresponding to the target document is displayed in order by calling the system plug-in.
  • processor 380 is specifically configured to perform the following steps:
  • the annotation data array includes the correspondence between preview pictures and annotation input instruction subsets;
  • In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus, and method may be implemented in other manners.
  • The apparatus embodiments described above are merely illustrative.
  • For example, the division into units is only a division by logical function; in actual implementation there may be other division manners. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
  • In addition, the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, apparatuses, or units, and may be in electrical, mechanical, or other forms.
  • The units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • The above integrated unit can be implemented in the form of hardware or in the form of a software functional unit.
  • The integrated unit, if implemented in the form of a software functional unit and sold or used as a standalone product, may be stored in a computer readable storage medium.
  • Based on such an understanding, the technical solution of the present application may be stored in a computer readable storage medium, including a number of instructions to cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods described in the embodiments of the present application.
  • The foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.


Abstract

The present application discloses a method for generating annotations. The method is applied to an instant messaging application and includes: a first terminal device receives an annotation input instruction set through the instant messaging application, where the annotation input instruction set includes at least one instruction for annotating a target document and each instruction corresponds to a moment; the first terminal device determines, according to the instructions in the annotation input instruction set, annotation information corresponding to the target document; and the first terminal device synthesizes an annotation video according to the annotation information and the moment corresponding to each instruction. The present application further provides a terminal device. With the present application, multiple places in a document can be annotated directly, which improves the execution efficiency of the solution, and the document can be annotated and discussed within the instant messaging application at the same time, making the solution more flexible.

Description

Method for generating annotations and related apparatus

This application claims priority to Chinese Patent Application No. 201711022730.1, filed with the China Patent Office on October 27, 2017 and entitled "Method for displaying annotations and related apparatus", which is incorporated herein by reference in its entirety.
Technical Field

This application relates to the field of Internet technologies, and in particular to annotation generation techniques.

Background

With the continuous development of Internet technologies, more and more people rely on instant messaging applications to communicate. In daily work and life, to facilitate communication, a user often needs to send a document to other users so that everyone can discuss the content of the same document.

At present, when discussing issues in a document, one usually first takes a screenshot of the content or modifies the document directly, then sends the screenshot or the modified content to other users, and then discusses the document content with them.

However, when there are many changes, modifying the document directly takes considerable time, which is detrimental to the practicality of the solution. In addition, if the document is long, taking screenshots of it also requires a lot of time and effort, reducing the feasibility of the solution.
Summary

The embodiments of the present application provide a method for generating annotations and related apparatus. On the one hand, multiple places in a document can be annotated directly without taking screenshots of or modifying the document, which improves the execution efficiency of the solution; on the other hand, the document can be annotated and discussed within the instant messaging application at the same time, making the solution more flexible.

In view of this, a first aspect of the present application provides a method for generating annotations, applied to an instant messaging application, the method including:

receiving, by a first terminal device through the instant messaging application, an annotation input instruction set, where the annotation input instruction set includes at least one instruction for annotating a target document, and each instruction corresponds to a moment;

determining, by the first terminal device according to the annotation input instruction set, annotation information corresponding to the target document; and

synthesizing, by the first terminal device, an annotation video according to the annotation information and the moment corresponding to each instruction.

A second aspect of the present application provides a terminal device on which an instant messaging application is installed, including:

a receiving module, configured to receive an annotation input instruction set through the instant messaging application, where the annotation input instruction set includes at least one instruction for annotating a target document, and each instruction corresponds to a moment;

a determining module, configured to determine, according to the annotation input instruction set received by the receiving module, annotation information corresponding to the target document; and

a synthesizing module, configured to synthesize an annotation video according to the annotation information determined by the determining module and the moment corresponding to each instruction.

A third aspect of the present application provides a terminal device on which an instant messaging application is installed, including a memory, a transceiver, a processor, and a bus system;

where the memory is configured to store a program; and

the processor is configured to execute the program in the memory, including the following steps:

receiving an annotation input instruction set through the instant messaging application, where the annotation input instruction set includes at least one instruction for annotating a target document, and each instruction corresponds to a moment;

determining, according to the annotation input instruction set, annotation information corresponding to the target document; and

synthesizing an annotation video according to the annotation information and the moment corresponding to each instruction;

where the bus system is configured to connect the memory and the processor so that the memory and the processor communicate with each other.

A fourth aspect of the present application provides a computer readable storage medium storing a computer program, the computer program being used to perform the method according to the first aspect.

A fifth aspect of the present application provides a computer program product including instructions which, when run on a computer, cause the computer to perform the method according to the first aspect.

It can be seen from the above technical solutions that the embodiments of the present application have the following advantages:

The embodiments of the present application provide a method for generating annotations, applied to an instant messaging application. First, a first terminal device receives an annotation input instruction set through the instant messaging application, where the annotation input instruction set includes at least one instruction for annotating a target document and each instruction corresponds to a moment; annotation information corresponding to the target document can then be determined according to the instructions in the annotation input instruction set; next, the first terminal device synthesizes an annotation video according to the annotation information and the moment corresponding to each instruction. In this way, on the one hand, multiple places in the document can be annotated directly without taking screenshots of or modifying the document, which improves the execution efficiency of the solution; on the other hand, the document can be annotated and discussed within the instant messaging application at the same time, making the solution more flexible.
Brief Description of the Drawings

FIG. 1 is a schematic diagram of a relationship between the hierarchy and the display layers in an embodiment of the present application;

FIG. 2 is another schematic diagram of a relationship between the hierarchy and the display layers in an embodiment of the present application;

FIG. 3 is a schematic diagram of an embodiment of the method for generating annotations in an embodiment of the present application;

FIG. 4 is a schematic diagram of an interface for enabling the voice annotation function in an embodiment of the present application;

FIG. 5 is a schematic diagram of an interface for confirming a voice annotation in an embodiment of the present application;

FIG. 6 is a schematic diagram of an interface for annotating the target document in an embodiment of the present application;

FIG. 7 is a schematic diagram of an interface for synthesizing and sending the annotation video in an embodiment of the present application;

FIG. 8 is a schematic diagram of an interface for displaying subtitles in the annotation video in an embodiment of the present application;

FIG. 9 is a schematic diagram of an interface for confirming voice and video annotations in an embodiment of the present application;

FIG. 10 is a schematic diagram of an interface for previewing the target document by using a system plug-in in an application scenario of the present application;

FIG. 11 is a schematic diagram of an interface for viewing the target document by using cloud preview in an application scenario of the present application;

FIG. 12a is a schematic diagram of an embodiment of a terminal device in an embodiment of the present application;

FIG. 12b is a schematic diagram of another embodiment of a terminal device in an embodiment of the present application;

FIG. 13 is a schematic diagram of another embodiment of a terminal device in an embodiment of the present application;

FIG. 14 is a schematic diagram of another embodiment of a terminal device in an embodiment of the present application;

FIG. 15 is a schematic structural diagram of a terminal device in an embodiment of the present application.
Detailed Description

The embodiments of the present application provide a method for generating annotations and related apparatus. On the one hand, multiple places in a document can be annotated directly without taking screenshots of or modifying the document, which improves the execution efficiency of the solution; on the other hand, the document can be annotated and discussed within the instant messaging application at the same time, making the solution more flexible.

The terms "first", "second", "third", "fourth", and the like (if any) in the specification, the claims, and the above drawings of the present application are used to distinguish similar objects and are not necessarily used to describe a particular order or sequence. It should be understood that data used in this way are interchangeable where appropriate, so that the embodiments of the present application described herein can, for example, be implemented in orders other than those illustrated or described herein. Moreover, the terms "include" and "have" and any variations thereof are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or device that includes a series of steps or units is not necessarily limited to those steps or units clearly listed, but may include other steps or units not clearly listed or inherent to the process, method, product, or device.

It should be understood that the present application is mainly applied to instant messaging (IM) applications (APPs). IM APPs commonly used on the Internet at present include Tencent QQ, WeChat, Yixin, DingTalk, Baidu HI, Fetion, Ali Wangwang, JD Dongdong, Feiyu, YY, Skype, Google Talk, ICQ, FastMsg, Parox, and the like. Most instant messaging services provide presence features: displaying the contact list, whether a contact is online, and whether the contact can be talked to. Usually, an IM service sends a notification to a user when someone on the user's contact list (similar to a phone book) comes online, and the user can then start communicating with that person in real time over the Internet. In addition to text, most IM services in fact also provide video communication capability, provided the bandwidth is sufficient. The biggest difference between real-time messaging and e-mail is that there is no waiting: as long as both parties are online at the same time, text, files, audio, and video can be sent to each other like a multimedia phone call; as long as there is a network connection, there is no distance no matter how far apart the two parties are.

The present application can use the IM function to open a document preview directly in the IM APP to display the document content, annotate the document, and record the annotation process. During recording, the size of the recording frame cannot be changed; only page turning of the document is allowed. The recording can include page-turning actions, annotation actions, and mouse actions. If the user chooses to turn on the microphone, the audio track retains the microphone content during recording.

For ease of understanding, refer to FIG. 1, which is a schematic diagram of a relationship between the hierarchy and the display layers in an embodiment of the present application. As shown in the figure, if the user needs to use an annotation tool, an annotation view is overlaid on the document preview view, and all annotation content corresponds one-to-one with the document. The document can be scrolled in a ScrollView container, and annotation actions can be undone and deleted in the annotation view. All page-turning actions, annotation actions, and mouse actions are recorded. After the annotation is completed, the microphone audio track, the document operation video, and the annotation operation video are merged into one video, which is displayed in the preview window, and finally the synthesized video is shared with other users in the IM APP.

Continue to refer to FIG. 2, which is another schematic diagram of a relationship between the hierarchy and the display layers in an embodiment of the present application. As shown in the figure, after the user clicks the "Voice Annotation" button, the "Preview Window" is opened. The "Preview Window" contains the document preview view, which is used to display the document content. The toolbar is used to add annotation elements such as rectangles, circles, arrows, text, labels, and handwriting, and can also undo the previous operation, control the microphone switch, display the recording time, and so on. The annotation view is used to display the annotation content.

The ScrollView container contains the document preview view and the annotation view. When the view size is larger than the preview window size, the ScrollView displays a scroll bar. When the user drags the scroll bar, the added annotations must keep their positions fixed relative to the document content. The annotation view has the same size as the document preview view, and both are child views of the ScrollView. When the user drags the ScrollView's scroll bar, the annotation view and the document preview view move together and keep their relative positions unchanged. This guarantees that the annotations and the document content are not misaligned. When the user zooms the preview view, the added annotations must remain fixed relative to the document content. When the document preview view is zoomed, its size changes; at this time, the size of the annotation view is adjusted accordingly so that it always has the same size as the document preview view and their relative positions remain unchanged.
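The layering rule above (the annotation view always mirrors the document preview view's frame) can be sketched as follows. This is a hypothetical illustration: `View` and `sync_annotation_view` are invented names, and real UI frameworks would express the same invariant through layout constraints rather than manual assignment.

```python
# Hypothetical sketch: the annotation view mirrors the preview view's size
# and origin, so annotations stay fixed relative to the document content
# during scroll and zoom.

class View:
    def __init__(self, x=0.0, y=0.0, w=100.0, h=100.0):
        self.x, self.y, self.w, self.h = x, y, w, h

def sync_annotation_view(preview, annotation, scroll_dy=0.0, zoom=1.0):
    """Apply a scroll offset and zoom factor to the preview view, then make
    the annotation view match it exactly (same size, same origin)."""
    preview.y -= scroll_dy
    preview.w *= zoom
    preview.h *= zoom
    annotation.x, annotation.y = preview.x, preview.y
    annotation.w, annotation.h = preview.w, preview.h

preview, annotation = View(), View()
sync_annotation_view(preview, annotation, scroll_dy=30.0, zoom=2.0)
print((annotation.y, annotation.w, annotation.h))  # (-30.0, 200.0, 200.0)
```

Because both views share one frame, an annotation stored in view coordinates needs no extra transform when the user scrolls or zooms: moving the preview moves the annotations with it.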
The method for generating annotations in the present application is described below. The method is applied to an instant messaging application. Referring to FIG. 3, an embodiment of the method for generating annotations in an embodiment of the present application includes:

101. The first terminal device receives an annotation input instruction set through the instant messaging application, where the annotation input instruction set includes at least one instruction for annotating the target document, and each instruction corresponds to a moment.

In this embodiment, the first terminal device first receives, through the IM APP, an annotation input instruction set triggered by the user. The annotation input instruction set includes at least one instruction for annotating the target document, for example, an add-rectangle instruction, an add-circle instruction, an add-arrow instruction, an add-text instruction, an add-label instruction, an add-handwriting instruction, and the like; of course, the instructions for annotating the target document may further include an undo instruction, a delete instruction, a display-recording-time instruction, an audio/video recording instruction, and the like.

It can be understood that the target document may be any document supported by the IM APP, such as a picture, a Microsoft Office Word document, or a portable document format (PDF) document, which is not limited here.

In addition, each instruction corresponds to a moment; for example, text input starts at 10 minutes 25 seconds, a rectangle is added at 12 minutes 37 seconds, and so on.

102. The first terminal device determines, according to the instructions in the annotation input instruction set, the annotation information corresponding to the target document.

In this embodiment, the first terminal device can determine, according to the instructions in the received annotation input instruction set, the annotation information contained in the target document; the annotation information of the target document is shown in Table 1 below.
Table 1

Moment | Instruction | Annotation information
0 min 01 s | Add-handwriting instruction | Handwrite "NO"
0 min 16 s | Add-arrow instruction | Draw a rightward arrow under "培训" (training)
0 min 55 s | Add-handwriting instruction | Handwrite "GOOD"
1 min 03 s | Add-text instruction | Enter the characters "样本" (sample)
1 min 17 s | Add-circle instruction | Circle the characters "微信" (WeChat)
1 min 44 s | Add-label instruction | Add the label "第一稿" (first draft)
2 min 00 s | Undo instruction | Undo the added "第一稿" (first draft) label

The annotation information in Table 1 is merely illustrative and should not be construed as limiting the present application.
103. The first terminal device synthesizes the annotation video according to the annotation information and the moment corresponding to each instruction.

In this embodiment, the first terminal device can synthesize an annotation video according to the annotation information and the moment corresponding to each instruction; the annotation video is a recording of the annotation process.

After that, the first terminal device may send the annotation video to a second terminal device, where the second terminal device is configured to receive and display the annotation video through the instant messaging application.

In this embodiment, after synthesizing the annotation video, the first terminal device may send the annotation video to at least one second terminal device through the IM APP. It should be noted that steps 101 to 103 and the sending of the annotation video to the second terminal device are all performed within the same IM APP; the user does not need to exit the IM APP to record the annotation video. That is, right after receiving the target document in the IM APP, the user can start annotating and record the corresponding annotation video.

After receiving, through the IM APP, the annotation video sent by the first terminal device, the second terminal device can view the entire annotation process simply by opening the annotation video in the IM APP.

The embodiment of the present application provides a method for generating annotations, applied to an instant messaging application. First, the first terminal device receives an annotation input instruction set through the instant messaging application, where the annotation input instruction set includes at least one instruction for annotating the target document and each instruction corresponds to a moment; the annotation information corresponding to the target document can then be determined according to the instructions in the annotation input instruction set; next, the first terminal device synthesizes the annotation video according to the annotation information and the moment corresponding to each instruction. In this way, on the one hand, multiple places in the document can be annotated directly without taking screenshots of or modifying the document, which improves the execution efficiency of the solution; on the other hand, the document can be annotated and discussed within the instant messaging application at the same time, making the solution more flexible.
可选地,在上述图3对应的实施例的基础上,本申请实施例提供的批注生成的方法第一个可选实施例中,第一终端设备根据批注信息以及每个指令对应的时刻,合成批注视频之前,还可以包括:
第一终端设备接收音频数据流,其中,音频数据流中携带时刻标识;
第一终端设备根据批注信息以及每个指令对应的时刻,合成批注视频,可以包括:
第一终端设备根据批注信息、每个指令对应的时刻以及音频数据流,合成批注视频,其中,每个指令对应的时刻与音频数据流中携带的时刻标识具有对应关系。
本实施例中,将具体介绍如何在批注过程中加入语音解释。具体地,请参阅图4,图4为本申请实施例中开启语音批注功能的一个界面示意图,首先,用户在IM APP上发送目标文档,假设该目标文档是WORD文档,那么在目标文档气泡旁边的“语音批注”即可以加入语音批注功能。点击“语音批注”后打开目标文档进行浏览,并提供开始批注的入口。
接下来,请参阅图5,图5为本申请实施例中确认语音批注的一个界面示意图,如图所示,用户可以点击选择开启麦克风。然后点击“开始语音批注”,此时将会进入语音批注阶段,请参阅图6,图6为本申请实施例中目标文档批注的一个界面示意图,如图所示,用户可以一边使用工具批注目标文档,一边通过语音进行解释,帮助听者更好地理解批注。
录制完成后,将以视频的形式存储整个批注的过程,由于是视频录制,所以每个指令所对应的时刻以及音频数据流的时刻标识都作为合成批注视频的重要参考值,这样可以防止音画不同步的问题。合成完批注视频后,请参阅图7,图7为本申请实施例中合成并发送批注视频的一个界面示意图,如图所示,用户可以选择保存到本地,或者用小视频的模式分享给其他用户。
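合成批注视频时,以时刻为参考对齐批注指令与音频数据流中的时刻标识,才能防止音画不同步。其对齐思路可以示意为(以下为假设性代码,以毫秒数表示时刻标识):

```python
def events_up_to(instructions, t_ms):
    # 返回播放进度为t_ms时,画面上应当已经出现的批注指令(按时刻排序)
    return [item for item in sorted(instructions) if item[0] <= t_ms]

# 对应表1前三条批注的时刻
timeline = [(1000, "手写NO"), (16000, "画向右箭头"), (55000, "手写GOOD")]
# 音频播放到第20秒时,前两条批注应当已经绘制在画面上
print([kind for _, kind in events_up_to(timeline, 20000)])  # ['手写NO', '画向右箭头']
```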
本申请实施例中,第一终端设备除了接收批注输入指令集合,还可以接收音频数据流,也就是用户可以在一边批注的时候一边录音,最后合成的批注视频中包含音频数据流。通过上述方式,能够提升文档的批注体验,并且采用语音结合批注的方式有利于增加批注和表达的效率。
可选地,在上述图3对应的第一个实施例的基础上,本申请实施例提供的批注生成的方法第二个可选实施例中,第一终端设备接收音频数据流之后,还可以包括:
第一终端设备通过语音识别模型对音频数据流进行处理,获取音频数据流所对应的字幕信息;
第一终端设备根据批注信息、每个指令对应的时刻以及音频数据流,合成批注视频,可以包括:
第一终端设备根据批注信息、每个指令对应的时刻、音频数据流以及字幕信息,合成批注视频。
本实施例中,第一终端设备还可以通过语音识别模型对音频数据流进行处理,获取音频数据流所对应的字幕信息,以便在第二终端显示批注视频时,可以显示音频数据流所对应的字幕。
请参阅图8,图8为本申请实施例中批注视频中显示字幕的一个界面示意图,如图所示,在播放批注视频时,除了可以显示当前播放的进度,还可以显示音频数据流所对应的字幕,需要说明的是,图8下方的字幕位置仅为一个示意,在实际应用中,该字幕位置可以根据用户习惯进行调整。
可以理解的是,语音识别模型包含但不仅限于声学模型和语言模型。语言模型表示某一字序列发生的概率,一般采用链式法则,把一个句子的概率拆解成其中每个词的概率之积。声学模型的任务是计算给定文字之后发出这段语音的概率。
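语言模型的链式法则即把句子概率拆成逐词条件概率的连乘,工程上通常在对数域计算以避免下溢。可以用如下示意性代码说明(条件概率取值为假设的例子):

```python
import math

def sentence_logprob(cond_probs):
    # cond_probs: 每个词在给定前文下的条件概率 P(w_i | w_1 ... w_{i-1})
    # 连乘在对数域变为连加,返回整句的对数概率
    return sum(math.log(p) for p in cond_probs)

# 假设三个词的条件概率分别为0.2、0.5、0.1,整句概率为三者之积0.01
logp = sentence_logprob([0.2, 0.5, 0.1])
print(round(math.exp(logp), 6))  # 0.01
```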
需要说明的是,字幕信息可以显示于批注视频的下方,也可以显示在批注视频的上方,或者根据用户需求进行设置,此处不做限定。
本申请实施例中,终端设备通过语音识别模型对音频数据流进行处理,获取音频数据流所对应的字幕信息,然后结合批注信息、每个指令对应的时刻、音频数据流以及字幕信息合成批注视频。通过上述方式,可以帮助听力较弱或者无法在当前环境下听声音的用户理解批注视频中的内容。此外,由于很多字词同音,只有通过字幕文字和音频结合来观看,才能更加清楚批注视频中的内容,从而提升方案的实用性和可行性。
可选地,在上述图3、图3对应的第一个或第二个实施例的基础上,本申请实施例提供的批注生成的方法第三个可选实施例中,第一终端设备根据所述批注信息、所述每个指令对应的时刻以及所述音频数据流,合成所述批注视频之前,还可以包括:
第一终端设备接收视频数据流,其中,视频数据流中携带时刻标识;
第一终端设备根据所述批注信息、所述每个指令对应的时刻以及所述音频数据流,合成所述批注视频,可以包括:
第一终端设备根据批注信息、每个指令对应的时刻、音频数据流以及视频数据流,合成批注视频,其中,所述每个指令对应的时刻、音频数据流中携带的时刻标识与视频数据流中携带的时刻标识均具有对应关系。
本实施例中,第一终端设备根据批注信息以及每个指令对应的时刻,合成批注视频之前,除了可以接收音频数据流以外,还可以接收视频数据流。视频数据流是通过摄像头采集的。比如,用户在录音的同时,还开启了视频录制,那么就可以记录用户在批注时的表情和动作,然后制成一路视频,与批注信息以及音频数据流共同合成批注视频。
因此,每个指令所对应的时刻、音频数据流的时刻标识以及视频数据流的时刻标识均作为合成批注视频的重要参考值,这样可以防止音画不同步的问题。
请参阅图9,图9为本申请实施例中确认语音批注及视频批注的一个界面示意图,如图所示,当需要录制视频时还可以选择“摄像头”,这样即可进行视频录制,需要说明的是,图9右上方的视频显示位置仅为一个示意,在实际应用中,该视频显示位置可以根据用户习惯进行调整。
本申请实施例中,第一终端设备除了接收批注输入指令集合,还可以接收音频数据流,以及接收视频数据流,也就是用户可以在批注的时候一边录音一边录像,最后合成的批注视频中包含音频数据流和视频数据流。通过上述方式,能够更好地提升文档的批注体验,并且采用语音和视频相结合的批注方式,有利于增加批注和表达的效率。
可选地,在上述图3对应的实施例的基础上,本申请实施例提供的批注生成的方法第四个可选实施例中,第一终端设备通过即时通信应用程序接收批注输入指令集合之前,还可以包括:
第一终端设备获取目标文档的文档类型;
第一终端设备判断目标文档的文档类型是否属于预设文档类型;
若目标文档的文档类型属于预设文档类型,则第一终端设备在即时通信应用程序的显示界面上展示目标文档;
若目标文档的文档类型不属于预设文档类型,则第一终端设备通过调用系统插件展示目标文档。
本实施例中,在第一终端设备通过即时通信应用程序接收批注输入指令集合之前,需要先获取目标文档的文档类型,如果文档类型是属于预设文档类型的,那么就可以直接通过IM APP在文档预览视图中展示目标文档内容。预设文档类型可以是文本文件或者图片文件等。如果不属于预设文档类型,需要调用系统插件来显示目标文档。
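按文档类型决定由IM APP直接展示还是调用系统插件展示的判断,可以示意为(预设文档类型集合为举例假设,并非本申请限定的类型):

```python
PRESET_TYPES = {"txt", "png", "jpg", "bmp"}  # 预设文档类型,仅为举例

def choose_viewer(filename):
    # 属于预设文档类型则在IM APP显示界面上直接展示,否则调用系统插件
    ext = filename.rsplit(".", 1)[-1].lower()
    return "im_app" if ext in PRESET_TYPES else "system_plugin"

print(choose_viewer("photo.PNG"))   # im_app
print(choose_viewer("slides.ppt"))  # system_plugin
```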
系统插件是一种遵循一定规范的应用程序接口编写出来的程序。系统插件运行在程序规定的系统平台下(可能同时支持多个平台),而不能脱离指定的平台单独运行。因为系统插件需要调用原纯净系统提供的函数库或者数据,很多IM APP都有系统插件。本申请中,第一终端设备可以通过调用IM APP中的系统插件来展示目标文档,也可以通过调用操作系统中的系统插件来展示目标文档。
本申请实施例中,终端设备还可以获取目标文档的类型,如果该目标文档的文档类型属于预设文档类型,那么终端设备直接在即时通讯应用程序上显示该目标文档,否则,终端设备就需要调用系统插件,通过系统插件展示目标文档。通过上述方式,即便即时通信应用程序不支持某个文档类型,也可以调用系统插件来显示该文档类型所对应的目标文档,从而提升了方案的可行性和可操作性,适用于各种不同类型的目标文档。
可选地,在上述图3对应的第四个实施例的基础上,本申请实施例提供的批注生成的方法第五个可选实施例中,第一终端设备通过调用系统插件展示目标文档之后,还可以包括:
第一终端设备向服务器发送文档浏览指令,以使服务器根据文档浏览指令生成目标文档所对应的预览图片,其中,文档浏览指令中携带目标文档的标识;
第一终端设备接收服务器发送的预览图片;
第一终端设备通过调用系统插件展示目标文档,可以包括:
第一终端设备通过调用系统插件按照顺序展示目标文档所对应的预览图片。
本实施例中,第一终端设备通过调用系统插件展示目标文档之后,还可以进而向服务器发送文档浏览指令,也就是启动“云端预览”功能。服务器根据该文档浏览指令中携带的标识,调用存储器中的目标文档,并将目标文档以预览图片的形式发送至第一终端设备。第一终端设备按照从前往后或者从后往前的顺序,展示目标文档所对应的每个预览图片。用户可以在每个预览图片上进行批注,比如,目标文档一共有十张预览图片,合成的批注视频也包括了对这十张预览图片的批注。
可以理解的是,服务器在后台调用目标文档的过程具体为,通过目标文档的标识进行索引,每个目标文档对应一个标识,因此,该标识具有唯一性。目标文档的标识可以是消息摘要算法第五版(message digest algorithm 5,MD5)的值或者安全哈希算法(secure hash algorithm,SHA)的值,还可以是其他类型的标识,此处不做限定。
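以MD5值作为目标文档标识进行索引,可以示意为(实际系统也可以换用SHA等其他算法):

```python
import hashlib

def document_id(data: bytes) -> str:
    # 对文档内容计算MD5,作为该目标文档的唯一标识
    return hashlib.md5(data).hexdigest()

# 内容相同的文档得到相同标识,内容不同则标识不同
assert document_id(b"hello") == document_id(b"hello")
assert document_id(b"hello") != document_id(b"world")
```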
本申请实施例中,介绍了如何通过调用系统插件来展示目标文档,可以按照一定的次序以预览图片的形式来展示目标文档。通过上述方式,用户在录制批注视频的时候,能够按照合理的顺序对目标文档进行批注,从而提升方案的合理性和可行性。
可选地,在上述图3对应的第五个实施例的基础上,本申请实施例提供的批注生成的方法第六个可选实施例中,第一终端设备通过即时通信应用程序接收批注输入指令集合,可以包括:
第一终端设备通过即时通信应用程序接收第一预览图片对应的第一批注输入指令子集合,其中,第一预览图片为目标文档所对应的预览图片,第一批注输入指令子集合属于批注输入指令集合;
第一终端设备通过即时通信应用程序接收第二预览图片对应的第二批注输入指令子集合,其中,第二预览图片为目标文档所对应的预览图片,第二批注输入指令子集合属于批注输入指令集合;
第一终端设备根据第一预览图片、第一批注输入指令子集合、第二预览图片和第二批注输入指令子集合,建立批注数据数组,其中,批注数据数组中包含预览图片与批注输入指令子集合之间的对应关系;
第一终端设备根据批注输入指令集合确定目标文档所对应的批注信息,可以包括:
第一终端设备根据批注输入指令集合、目标文档所对应的预览图片以及批注数据数组,确定目标文档所对应的批注信息。
本实施例中,对于包含多页预览图片的目标文档,当用户翻页时,添加的批注内容需要和预览图片相对应。具体地,目标文档包含两页预览图片,分别为第一预览图片和第二预览图片,用户对第一预览图片进行批注,即第一预览图片对应于第一批注输入指令子集合,接着用户对第二预览图片进行批注,即第二预览图片对应于第二批注输入指令子集合。第一终端设备将维护一个批注数据数组,该批注数据数组如表2所示。
表2
预览图片 批注输入指令子集合
第一预览图片 第一批注输入指令子集合
第二预览图片 第二批注输入指令子集合
需要说明的是,批注数据数组中还可以包含更多预览图片与批注输入指令子集合之间的对应关系,表2仅为一个示意,并不应理解为对本申请的限定。批注数据数组中的元素数量与目标文档的分页数相同。用户添加批注时,用当前预览图片的页数作为索引,将批注输入指令子集合存储在数组中。用户可以通过翻页按钮或者预览图片切换页面。翻页开始时,清空批注视图。翻页结束后,根据当前的页数,从批注数据数组中取出对应批注输入指令子集合,绘制在批注视图上。
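以页数为索引维护批注数据数组、翻页后取出对应批注输入指令子集合重绘的做法,可以示意为(类名与方法名均为假设):

```python
class AnnotationStore:
    def __init__(self, page_count):
        # 数组元素数量与目标文档的分页数相同,每页对应一个指令子集合
        self.pages = [[] for _ in range(page_count)]

    def add(self, page_index, instruction):
        # 添加批注时,以当前预览图片的页数作为索引存入数组
        self.pages[page_index].append(instruction)

    def instructions_for(self, page_index):
        # 翻页结束后,取出当前页对应的批注输入指令子集合,绘制在批注视图上
        return list(self.pages[page_index])

store = AnnotationStore(page_count=2)
store.add(0, "添加矩形框")
store.add(1, "手写GOOD")
print(store.instructions_for(1))  # ['手写GOOD']
```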
本申请实施例中,当目标文档包含多个页面时,用户能够针对每个页面进行批注,每个页面即为一个预览图片,在预览图片上所做的批注即为批注输入指令子集合,终端设备以批注数据数组的形式存储预览图片和批注输入指令子集合之间的对应关系。通过上述方式,终端设备在合成批注视频的时候可以获取批注数据数组中批注和页面的对应关系,这样能够在多页文档的情况下,有效地提升合成批注视频的准确性,避免出现批注与页面不对齐的情况。
为了便于理解,下面以一个具体应用场景对本申请中使用系统插件预览的方法进行详细描述,具体地:
假设IM APP为腾讯公司开发的QQ,用户A想在QQ上打开一个演示文稿(PowerPoint,PPT),但是QQ并不能直接开启PPT,于是,QQ可以调用系统插件来显示PPT的内容,即如图10所示,图10为本申请应用场景中使用系统插件预览目标文档的一个界面示意图。
由于系统插件不一定能够完美显示文件内容,所以此时会向服务器查询是否能够支持该类型文件的云端预览。如果该文件支持云端预览,则在预览视图中显示"云端预览"按钮。
PPT文件的云端预览方式为,服务器安装有支持打开PPT格式的软件,例如微软Office。该服务器利用微软Office打开PPT文件,然后将PPT的每一页存储为图片文件,再将所有图片文件按照PPT中的页面顺序发送给客户端进行查看。请参阅图11,图11为本申请应用场景中使用云端预览查看目标文档的一个界面示意图,如图所示,服务器会以该PPT文件的MD5值为索引,对生成的预览图片进行管理和缓存。
如果用户对系统插件显示的PPT结果不满意,比如,发现PPT中字体不正确或者内容错位,则可以点击“云端预览”按钮。预览窗口首先询问云端预览服务器是否需要上传该PPT文件。接着,用于云端预览的服务器会检查云端是否已经存在该文件预览内容的图片文件缓存。如果在不久前曾经有用户预览过该文件,则服务器端有缓存,此时服务器可以通知客户端无需上传PPT文件,且通知客户端预览图片就绪。
如果服务器没有缓存图片文件,则检查云端是否有该PPT文件的缓存,通过MD5进行索引。如果曾经有用户对该文件进行过云盘存储或者QQ离线传文件等操作,则云端有该文件的缓存。服务器打开该文件,并生成预览图片。然后通知客户端无需上传PPT文件,并通知客户端预览图片就绪。否则,服务器就需要通知客户端上传该PPT文件。
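服务器端"先查预览图片缓存、再按MD5查文件缓存、最后才通知客户端上传"的应答流程,可以示意为(返回值为假设的应答标记,仅用于说明流程):

```python
def handle_preview_request(doc_md5, preview_cache, file_cache):
    # 返回服务器对客户端的应答:预览就绪/先生成预览/需要上传文件
    if doc_md5 in preview_cache:
        return "preview_ready"        # 不久前有人预览过,直接复用预览图片缓存
    if doc_md5 in file_cache:
        return "generate_then_ready"  # 云端已有该文件缓存,打开文件并生成预览图
    return "upload_required"          # 两级缓存均未命中,通知客户端上传文件

print(handle_preview_request("a1", {"a1"}, set()))  # preview_ready
print(handle_preview_request("b2", set(), {"b2"}))  # generate_then_ready
print(handle_preview_request("c3", set(), set()))   # upload_required
```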
客户端上传PPT文件之后,服务器打开该PPT文件,并生成预览图片。服务器通知客户端预览图片就绪。客户端收到预览图片就绪的通知后,向服务器申请预览图。服务器告知客户端预览图片的总数。客户端依次下载每一张预览图,并在预览窗口显示。
下面对本申请中的终端设备进行详细描述,请参阅图12a,图12a为本申请实施例中终端设备一个实施例示意图,终端设备20包括:
接收模块201,用于通过所述即时通信应用程序接收批注输入指令集合,其中,所述批注输入指令集合包含至少一个用于对目标文档进行批注的指令,每个指令对应一个时刻;
确定模块202,用于根据所述接收模块201接收的所述批注输入指令集合中的指令确定所述目标文档所对应的批注信息;
合成模块203,用于根据所述确定模块202确定的所述批注信息以及所述每个指令对应的时刻,合成批注视频。
本实施例中,接收模块201通过所述即时通信应用程序接收批注输入指令集合,其中,所述批注输入指令集合包含至少一个用于对目标文档进行批注的指令,每个指令对应一个时刻,确定模块202根据所述接收模块201接收的所述批注输入指令集合中的指令确定所述目标文档所对应的批注信息,合成模块203根据所述确定模块202确定的所述批注信息以及所述每个指令对应的时刻,合成批注视频。
本申请实施例中,提供了一种终端设备,首先第一终端设备通过即时通信应用程序接收批注输入指令集合,其中,批注输入指令集合包含至少一个用于对目标文档进行批注的指令,每个指令对应一个时刻,然后可以根据批注输入指令集合中的指令确定目标文档所对应的批注信息,接下来,第一终端设备根据批注信息以及每个指令对应的时刻,合成批注视频。通过上述方式,一方面可以直接对文档的多个地方进行批注,无需对文档进行截图或者修改,从而提升了方案的执行效率,另一方面,能够同时在即时通信应用程序中对文档进行批注和沟通,使得方案具有更强的灵活性。
可选地,在上述图12a对应的实施例的基础上,本申请实施例提供的终端设备另一实施例中,请参阅图12b,所述终端设备20还包括:
发送模块204,用于将所述合成模块203合成的所述批注视频发送至第二终端设备,其中,所述第二终端设备用于通过所述即时通信应用程序接收并展示所述批注视频。
可选地,在上述图12a对应的实施例的基础上,本申请实施例提供的终端设备另一实施例中,所述终端设备20还包括:
所述接收模块201,还用于在所述合成模块203根据所述批注信息以及所述每个指令对应的时刻,合成批注视频之前,接收音频数据流,其中,所述音频数据流中携带时刻标识;
所述合成模块203,具体用于根据所述批注信息、所述每个指令对应的时刻以及所述音频数据流,合成所述批注视频,其中,所述每个指令对应的时刻与所述音频数据流中携带的时刻标识具有对应关系。
本申请实施例中,第一终端设备除了接收批注输入指令集合,还可以接收音频数据流,也就是用户可以在一边批注的时候一边录音,最后合成的批注视频中包含音频数据流。通过上述方式,能够提升文档的批注体验,并且采用语音结合批注的方式有利于增加批注和表达的效率。
可选地,在上述图12a对应的实施例的基础上,请参阅图13,本申请实施例提供的终端设备另一实施例中,所述终端设备20还包括获取模块205;
所述获取模块205,用于在所述接收模块201接收音频数据流之后,通过语音识别模型对所述音频数据流进行处理,获取所述音频数据流所对应的字幕信息;
所述合成模块203,具体用于根据所述批注信息、所述每个指令对应的时刻、所述音频数据流以及所述字幕信息,合成所述批注视频。
本申请实施例中,终端设备通过语音识别模型对音频数据流进行处理,获取音频数据流所对应的字幕信息,然后结合批注信息、每个指令对应的时刻、音频数据流以及字幕信息合成批注视频。通过上述方式,可以帮助听力较弱或者无法在当前环境下听声音的用户理解批注视频中的内容。此外,由于很多字词同音,只有通过字幕文字和音频结合来观看,才能更加清楚批注视频中的内容,从而提升方案的实用性和可行性。
可选地,在上述图12a或图13对应的实施例的基础上,本申请实施例提供的终端设备另一实施例中,所述终端设备20还包括:
所述接收模块201,还用于在所述合成模块203用于根据所述批注信息、所述每个指令对应的时刻以及所述音频数据流,合成所述批注视频之前,接收视频数据流,其中,所述视频数据流中携带时刻标识;
所述合成模块203,具体用于根据所述批注信息、所述每个指令对应的时刻、所述音频数据流以及所述视频数据流,合成所述批注视频,其中,所述每个指令对应的时刻、所述音频数据流中携带的时刻标识与所述视频数据流中携带的时刻标识均具有对应关系。
本申请实施例中,第一终端设备除了接收批注输入指令集合,还可以接收音频数据流,以及接收视频数据流,也就是用户可以在批注的时候一边录音一边录像,最后合成的批注视频中包含音频数据流和视频数据流。通过上述方式,能够更好地提升文档的批注体验,并且采用语音和视频相结合的批注方式,有利于增加批注和表达的效率。
可选地,在上述图12a对应的实施例的基础上,请参阅图14,本申请实施例提供的终端设备另一实施例中,所述终端设备20还包括判断模块206和展示模块207;
所述获取模块205,还用于所述接收模块201通过所述即时通信应用程序接收批注输入指令集合之前,获取所述目标文档的文档类型;
所述判断模块206,用于判断所述获取模块205获取的所述目标文档的文档类型是否属于预设文档类型;
所述展示模块207,用于若所述判断模块206判断得到所述目标文档的文档类型属于所述预设文档类型,则所述第一终端设备在所述即时通信应用程序的显示界面上展示所述目标文档;
所述展示模块207,用于若所述判断模块206判断得到所述目标文档的文档类型不属于所述预设文档类型,则所述第一终端设备通过调用系统插件展示所述目标文档。
本申请实施例中,终端设备还可以获取目标文档的文档类型,如果该目标文档的文档类型属于预设文档类型,那么终端设备直接在即时通讯应用程序上显示该目标文档,否则,终端设备就需要调用系统插件,通过系统插件展示文档。通过上述方式,即便即时通信应用程序不支持某个文档类型,也可以调用系统插件来显示该文档类型所对应的目标文档,从而提升了方案的可行性和可操作性,适用于各种不同类型的目标文档。
可选地,在上述图14对应的实施例的基础上,本申请实施例提供的终端设备另一实施例中,所述终端设备20还包括:
所述发送模块204,用于所述展示模块207通过调用系统插件展示所述目标文档之后,向服务器发送文档浏览指令,以使所述服务器根据所述文档浏览指令生成所述目标文档所对应的预览图片,其中,所述文档浏览指令中携带所述目标文档的标识;
所述接收模块201,用于接收所述服务器发送的所述预览图片;
所述展示模块207,具体用于通过调用系统插件按照顺序展示所述目标文档所对应的预览图片。
本申请实施例中,介绍了如何通过调用系统插件来展示目标文档,可以按照一定的次序以图片的形式来展示目标文档。通过上述方式,用户在录制批注视频的时候,能够按照合理的顺序对目标文档进行批注,从而提升方案的合理性和可行性。
可选地,在上述图14对应的实施例的基础上,本申请实施例提供的终端设备另一实施例中,所述终端设备20还包括:
所述接收模块201,具体用于通过所述即时通信应用程序接收第一预览图片对应的第一批注输入指令子集合,其中,所述第一预览图片为所述目标文档所对应的预览图片,所述第一批注输入指令子集合属于所述批注输入指令集合;
通过所述即时通信应用程序接收第二预览图片对应的第二批注输入指令子集合,其中,所述第二预览图片为所述目标文档所对应的预览图片,所述第二批注输入指令子集合属于所述批注输入指令集合;
根据所述第一预览图片、所述第一批注输入指令子集合、所述第二预览图片和所述第二批注输入指令子集合,建立批注数据数组,其中,所述批注数据数组中包含预览图片与批注输入指令子集合之间的对应关系;
所述确定模块202,具体用于根据所述批注输入指令集合、所述目标文档所对应的预览图片以及所述批注数据数组,确定所述目标文档所对应的所述批注信息。
本申请实施例中,当目标文档包含多个页面时,用户能够针对每个页面进行批注,每个页面即为一个预览图片,在预览图片上所做的批注即为批注输入指令子集合,终端设备以数据数组的形式存储预览图片和批注输入指令子集合之间的对应关系。通过上述方式,终端设备在合成批注视频的时候可以获取数据数组中批注和页面的对应关系,这样能够在多页文档的情况下,有效地提升合成批注视频的准确性,避免出现批注与页面不对齐的情况。
本申请实施例还提供了另一种终端设备,如图15所示,为了便于说明,仅示出了与本申请实施例相关的部分,具体技术细节未揭示的,请参照本申请实施例方法部分。该终端可以为包括手机、平板电脑、个人数字助理(personal digital assistant,PDA)、销售终端(point of sales,POS)、车载电脑等任意终端设备,以终端为手机为例:
图15示出的是与本申请实施例提供的终端相关的手机的部分结构的框图。参考图15,手机包括:射频(radio frequency,RF)电路310、存储器320、输入单元330、显示单元340、传感器350、音频电路360、无线保真(wireless fidelity,WiFi)模块370、处理器380、以及电源390等部件。本领域技术人员可以理解,图15中示出的手机结构并不构成对手机的限定,可以包括比图示更多或更少的部件,或者组合某些部件,或者不同的部件布置。
下面结合图15对手机的各个构成部件进行具体的介绍:
RF电路310可用于收发信息或通话过程中,信号的接收和发送,特别地,将基站的下行信息接收后,给处理器380处理;另外,将涉及上行的数据发送给基站。通常,RF电路310包括但不限于天线、至少一个放大器、收发信机、耦合器、低噪声放大器(low noise amplifier,LNA)、双工器等。此外,RF电路310还可以通过无线通信与网络和其他设备通信。上述无线通信可以使用任一通信标准或协议,包括但不限于全球移动通讯系统(global system of mobile communication,GSM)、通用分组无线服务(general packet radio service,GPRS)、码分多址(code division multiple access,CDMA)、宽带码分多址(wideband code division multiple access,WCDMA)、长期演进(long term evolution,LTE)、电子邮件、短消息服务(short messaging service,SMS)等。
存储器320可用于存储软件程序以及模块,处理器380通过运行存储在存储器320的软件程序以及模块,从而执行手机的各种功能应用以及数据处理。存储器320可主要包括存储程序区和存储数据区,其中,存储程序区可存储操作系统、至少一个功能所需的应用程序(比如声音播放功能、图像播放功能等)等;存储数据区可存储根据手机的使用所创建的数据(比如音频数据、电话本等)等。此外,存储器320可以包括高速随机存取存储器,还可以包括非易失性存储器,例如至少一个磁盘存储器件、闪存器件、或其他非易失性固态存储器件。
输入单元330可用于接收输入的数字或字符信息,以及产生与手机的用户设置以及功能控制有关的键信号输入。具体地,输入单元330可包括触控面板331以及其他输入设备332。触控面板331,也称为触摸屏,可收集用户在其上或附近的触摸操作(比如用户使用手指、触笔等任何适合的物体或附件在触控面板331上或在触控面板331附近的操作),并根据预先设定的程式驱动相应的连接装置。可选的,触控面板331可包括触摸检测装置和触摸控制器两个部分。其中,触摸检测装置检测用户的触摸方位,并检测触摸操作带来的信号,将信号传送给触摸控制器;触摸控制器从触摸检测装置上接收触摸信息,并将它转换成触点坐标,再送给处理器380,并能接收处理器380发来的命令并加以执行。此外,可以采用电阻式、电容式、红外线以及表面声波等多种类型实现触控面板331。除了触控面板331,输入单元330还可以包括其他输入设备332。具体地,其他输入设备332可以包括但不限于物理键盘、功能键(比如音量控制按键、开关按键等)、轨迹球、鼠标、操作杆等中的一种或多种。
显示单元340可用于显示由用户输入的信息或提供给用户的信息以及手机的各种菜单。显示单元340可包括显示面板341,可选的,可以采用液晶显示器(liquid crystal display,LCD)、有机发光二极管(organic light-emitting diode,OLED)等形式来配置显示面板341。进一步的,触控面板331可覆盖显示面板341,当触控面板331检测到在其上或附近的触摸操作后,传送给处理器380以确定触摸事件的类型,随后处理器380根据触摸事件的类型在显示面板341上提供相应的视觉输出。虽然在图15中,触控面板331与显示面板341是作为两个独立的部件来实现手机的输入和输出功能,但是在某些实施例中,可以将触控面板331与显示面板341集成而实现手机的输入和输出功能。
手机还可包括至少一种传感器350,比如光传感器、运动传感器以及其他传感器。具体地,光传感器可包括环境光传感器及接近传感器,其中,环境光传感器可根据环境光线的明暗来调节显示面板341的亮度,接近传感器可在手机移动到耳边时,关闭显示面板341和/或背光。作为运动传感器的一种,加速计传感器可检测各个方向上(一般为三轴)加速度的大小,静止时可检测出重力的大小及方向,可用于识别手机姿态的应用(比如横竖屏切换、相关游戏、磁力计姿态校准)、振动识别相关功能(比如计步器、敲击)等;至于手机还可配置的陀螺仪、气压计、湿度计、温度计、红外线传感器等其他传感器,在此不再赘述。
音频电路360、扬声器361、传声器362可提供用户与手机之间的音频接口。音频电路360可将接收到的音频数据转换后的电信号,传输到扬声器361,由扬声器361转换为声音信号输出;另一方面,传声器362将收集的声音信号转换为电信号,由音频电路360接收后转换为音频数据,再将音频数据输出至处理器380处理后,经RF电路310以发送给比如另一手机,或者将音频数据输出至存储器320以便进一步处理。
WiFi属于短距离无线传输技术,手机通过WiFi模块370可以帮助用户收发电子邮件、浏览网页和访问流式媒体等,它为用户提供了无线的宽带互联网访问。虽然图15示出了WiFi模块370,但是可以理解的是,其并不属于手机的必须构成,完全可以根据需要在不改变发明的本质的范围内而省略。
处理器380是手机的控制中心,利用各种接口和线路连接整个手机的各个部分,通过运行或执行存储在存储器320内的软件程序和/或模块,以及调用存储在存储器320内的数据,执行手机的各种功能和处理数据,从而对手机进行整体监控。可选的,处理器380可包括一个或多个处理单元;可选的,处理器380可集成应用处理器和调制解调处理器,其中,应用处理器主要处理操作系统、用户界面和应用程序等,调制解调处理器主要处理无线通信。可以理解的是,上述调制解调处理器也可以不集成到处理器380中。
手机还包括给各个部件供电的电源390(比如电池),可选的,电源可以通过电源管理系统与处理器380逻辑相连,从而通过电源管理系统实现管理充电、放电、以及功耗管理等功能。
尽管未示出,手机还可以包括摄像头、蓝牙模块等,在此不再赘述。
在本申请实施例中,该终端所包括的处理器380还具有以下功能:
通过所述即时通信应用程序接收批注输入指令集合,其中,所述批注输入指令集合包含至少一个用于对目标文档进行批注的指令,每个指令对应一个时刻;
根据所述批注输入指令集合确定所述目标文档所对应的批注信息;
根据所述批注信息以及所述每个指令对应的时刻,合成批注视频。
可选地,处理器380还用于执行如下步骤:
将所述批注视频发送至第二终端设备,其中,所述第二终端设备用于通过所述即时通信应用程序接收并展示所述批注视频。
可选地,处理器380还用于执行如下步骤:
接收音频数据流,其中,所述音频数据流中携带时刻标识;
处理器380具体用于执行如下步骤:
根据所述批注信息、所述每个指令对应的时刻以及所述音频数据流,合成所述批注视频,其中,所述每个指令对应的时刻与所述音频数据流中携带的时刻标识具有对应关系。
可选地,处理器380具体用于执行如下步骤:
通过语音识别模型对所述音频数据流进行处理,获取所述音频数据流所对应的字幕信息;
根据所述批注信息、所述每个指令对应的时刻、所述音频数据流以及所述字幕信息,合成所述批注视频。
可选地,处理器380还用于执行如下步骤:
接收视频数据流,其中,所述视频数据流中携带时刻标识;
处理器380具体用于执行如下步骤:
根据所述批注信息、所述每个指令对应的时刻、所述音频数据流以及所述视频数据流,合成所述批注视频,其中,所述每个指令对应的时刻、所述音频数据流中携带的时刻标识与所述视频数据流中携带的时刻标识均具有对应关系。
可选地,处理器380还用于执行如下步骤:
获取所述目标文档的文档类型;
判断所述目标文档的文档类型是否属于预设文档类型;
若所述目标文档的文档类型属于所述预设文档类型,则在所述即时通信应用程序的显示界面上展示所述目标文档;
若所述目标文档的文档类型不属于所述预设文档类型,则通过调用系统插件展示所述目标文档。
可选地,处理器380还用于执行如下步骤:
向服务器发送文档浏览指令,以使所述服务器根据所述文档浏览指令生成所述目标文档所对应的预览图片,其中,所述文档浏览指令中携带所述目标文档的标识;
接收所述服务器发送的所述预览图片;
处理器380具体用于执行如下步骤:
通过调用系统插件按照顺序展示所述目标文档所对应的预览图片。
可选地,处理器380具体用于执行如下步骤:
通过所述即时通信应用程序接收第一预览图片对应的第一批注输入指令子集合,其中,所述第一预览图片为所述目标文档所对应的预览图片,所述第一批注输入指令子集合属于所述批注输入指令集合;
通过所述即时通信应用程序接收第二预览图片对应的第二批注输入指令子集合,其中,所述第二预览图片为所述目标文档所对应的预览图片,所述第二批注输入指令子集合属于所述批注输入指令集合;
根据所述第一预览图片、所述第一批注输入指令子集合、所述第二预览图片和所述第二批注输入指令子集合,建立批注数据数组,其中,所述批注数据数组中包含预览图片与批注输入指令子集合之间的对应关系;
根据所述批注输入指令集合、所述目标文档所对应的预览图片以及所述批注数据数组,确定所述目标文档所对应的所述批注信息。
所属领域的技术人员可以清楚地了解到,为描述的方便和简洁,上述描述的系统,装置和单元的具体工作过程,可以参考前述方法实施例中的对应过程,在此不再赘述。
在本申请所提供的几个实施例中,应该理解到,所揭露的系统,装置和方法,可以通过其它的方式实现。例如,以上所描述的装置实施例仅仅是示意性的,例如,所述单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个单元或组件可以结合或者可以集成到另一个系统,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口,装置或单元的间接耦合或通信连接,可以是电性,机械或其它的形式。
所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。
另外,在本申请各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。上述集成的单元既可以采用硬件的形式实现,也可以采用软件功能单元的形式实现。
所述集成的单元如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在一个计算机可读取存储介质中。基于这样的理解,本申请的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的全部或部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可以是个人计算机,服务器,或者网络设备等)执行本申请各个实施例所述方法的全部或部分步骤。而前述的存储介质包括:U盘、移动硬盘、只读存储器(read-only memory,ROM)、随机存取存储器(random access memory,RAM)、磁碟或者光盘等各种可以存储程序代码的介质。
以上所述,以上实施例仅用以说明本申请的技术方案,而非对其限制;尽管参照前述实施例对本申请进行了详细的说明,本领域的普通技术人员应当理解:其依然可以对前述各实施例所记载的技术方案进行修改,或者对其中部分技术特征进行等同替换;而这些修改或者替换,并不使相应技术方案的本质脱离本申请各实施例技术方案的精神和范围。

Claims (16)

  1. 一种批注生成的方法,所述方法应用于即时通信应用程序,所述方法包括:
    第一终端设备通过所述即时通信应用程序接收批注输入指令集合,其中,所述批注输入指令集合包含至少一个用于对目标文档进行批注的指令,每个指令对应一个时刻;
    所述第一终端设备根据所述批注输入指令集合中的指令确定所述目标文档所对应的批注信息;
    所述第一终端设备根据所述批注信息以及所述每个指令对应的时刻,合成批注视频。
  2. 根据权利要求1所述的方法,所述方法还包括:
    所述第一终端设备将所述批注视频发送至第二终端设备,其中,所述第二终端设备用于通过所述即时通信应用程序接收并展示所述批注视频。
  3. 根据权利要求1所述的方法,所述第一终端设备根据所述批注信息以及所述每个指令对应的时刻,合成批注视频之前,所述方法还包括:
    所述第一终端设备接收音频数据流,其中,所述音频数据流中携带时刻标识;
    所述第一终端设备根据所述批注信息以及所述每个指令对应的时刻,合成批注视频,包括:
    所述第一终端设备根据所述批注信息、所述每个指令对应的时刻以及所述音频数据流,合成所述批注视频,其中,所述每个指令对应的时刻与所述音频数据流中携带的时刻标识具有对应关系。
  4. 根据权利要求3所述的方法,所述第一终端设备接收音频数据流之后,所述方法还包括:
    所述第一终端设备通过语音识别模型对所述音频数据流进行处理,获取所述音频数据流所对应的字幕信息;
    所述第一终端设备根据所述批注信息、所述每个指令对应的时刻以及所述音频数据流,合成所述批注视频,包括:
    所述第一终端设备根据所述批注信息、所述每个指令对应的时刻、所述音频数据流以及所述字幕信息,合成所述批注视频。
  5. 根据权利要求3或4所述的方法,所述第一终端设备根据所述批注信息、所述每个指令对应的时刻以及所述音频数据流,合成所述批注视频之前,所述方法还包括:
    所述第一终端设备接收视频数据流,其中,所述视频数据流中携带时刻标识;
    所述第一终端设备根据所述批注信息、所述每个指令对应的时刻以及所述音频数据流,合成所述批注视频,包括:
    所述第一终端设备根据所述批注信息、所述每个指令对应的时刻、所述音频数据流以及所述视频数据流,合成所述批注视频,其中,所述每个指令对应的时刻、所述音频数据流中携带的时刻标识与所述视频数据流中携带的时刻标识均具有对应关系。
  6. 根据权利要求1所述的方法,所述第一终端设备通过所述即时通信应用程序接收批注输入指令集合之前,所述方法还包括:
    所述第一终端设备获取所述目标文档的文档类型;
    所述第一终端设备判断所述目标文档的文档类型是否属于预设文档类型;
    若所述目标文档的文档类型属于所述预设文档类型,则所述第一终端设备在所述即时通信应用程序的显示界面上展示所述目标文档;
    若所述目标文档的文档类型不属于所述预设文档类型,则所述第一终端设备通过调用系统插件展示所述目标文档。
  7. 根据权利要求6所述的方法,所述第一终端设备通过调用系统插件展示所述目标文档之后,所述方法还包括:
    所述第一终端设备向服务器发送文档浏览指令,以使所述服务器根据所述文档浏览指令生成所述目标文档所对应的预览图片,其中,所述文档浏览指令中携带所述目标文档的标识;
    所述第一终端设备接收所述服务器发送的所述预览图片;
    所述第一终端设备通过调用系统插件展示所述目标文档,包括:
    所述第一终端设备通过调用系统插件按照顺序展示所述目标文档所对应的预览图片。
  8. 根据权利要求7所述的方法,所述第一终端设备通过所述即时通信应用程序接收批注输入指令集合,包括:
    所述第一终端设备通过所述即时通信应用程序接收第一预览图片对应的第一批注输入指令子集合,其中,所述第一预览图片为所述目标文档所对应的预览图片,所述第一批注输入指令子集合属于所述批注输入指令集合;
    所述第一终端设备通过所述即时通信应用程序接收第二预览图片对应的第二批注输入指令子集合,其中,所述第二预览图片为所述目标文档所对应的预览图片,所述第二批注输入指令子集合属于所述批注输入指令集合;
    所述第一终端设备根据所述第一预览图片、所述第一批注输入指令子集合、所述第二预览图片和所述第二批注输入指令子集合,建立批注数据数组,其中,所述批注数据数组中包含预览图片与批注输入指令子集合之间的对应关系;
    所述第一终端设备根据所述批注输入指令集合确定所述目标文档所对应的批注信息,包括:
    所述第一终端设备根据所述批注输入指令集合、所述目标文档所对应的预览图片以及所述批注数据数组,确定所述目标文档所对应的所述批注信息。
  9. 一种终端设备,所述终端设备安装有即时通信应用程序,包括:
    接收模块,用于通过所述即时通信应用程序接收批注输入指令集合,其中,所述批注输入指令集合包含至少一个用于对目标文档进行批注的指令,每个指令对应一个时刻;
    确定模块,用于根据所述接收模块接收的所述批注输入指令集合确定所述目标文档所对应的批注信息;
    合成模块,用于根据所述确定模块确定的所述批注信息以及所述每个指令对应的时刻,合成批注视频。
  10. 根据权利要求9所述的终端设备,还包括:
    发送模块,用于将所述合成模块合成的所述批注视频发送至第二终端设备,其中,所述第二终端设备用于通过所述即时通信应用程序接收并展示所述批注视频。
  11. 根据权利要求9所述的终端设备,
    所述接收模块,还用于在所述合成模块根据所述批注信息以及所述每个指令对应的时刻,合成批注视频之前,接收音频数据流,其中,所述音频数据流中携带时刻标识;
    所述合成模块,具体用于根据所述批注信息、所述每个指令对应的时刻以及所述音频数据流,合成所述批注视频,其中,所述每个指令对应的时刻与所述音频数据流中携带的时刻标识具有对应关系。
  12. 根据权利要求11所述的终端设备,还包括获取模块;
    所述获取模块,用于在所述接收模块接收音频数据流之后,通过语音识别模型对所述音频数据流进行处理,获取所述音频数据流所对应的字幕信息;
    所述合成模块,具体用于根据所述批注信息、所述每个指令对应的时刻、所述音频数据流以及所述字幕信息,合成所述批注视频。
  13. 根据权利要求11或12所述的终端设备,
    所述接收模块,还用于在所述合成模块用于根据所述批注信息、所述每个指令对应的时刻以及所述音频数据流,合成所述批注视频之前,接收视频数据流,其中,所述视频数据流中携带时刻标识;
    所述合成模块,具体用于根据所述批注信息、所述每个指令对应的时刻、所述音频数据流以及所述视频数据流,合成所述批注视频,其中,所述每个指令对应的时刻、所述音频数据流中携带的时刻标识与所述视频数据流中携带的时刻标识均具有对应关系。
  14. 一种终端设备,所述终端设备安装有即时通信应用程序,包括:存储器、收发器、处理器以及总线系统;
    其中,所述存储器用于存储程序;
    所述处理器用于执行所述存储器中的程序,包括如下步骤:
    通过所述即时通信应用程序接收批注输入指令集合,其中,所述批注输入指令集合包含至少一个用于对目标文档进行批注的指令,每个指令对应一个时刻;
    根据所述批注输入指令集合确定所述目标文档所对应的批注信息;
    根据所述批注信息以及所述每个指令对应的时刻,合成批注视频;
    所述总线系统用于连接所述存储器以及所述处理器,以使所述存储器以及所述处理器进行通信。
  15. 一种计算机可读存储介质,所述计算机可读存储介质存储有计算机程序;所述计算机程序用于执行如权利要求1-8任一项所述的方法。
  16. 一种计算机程序产品,包括指令,当其在计算机上运行时,使得计算机执行如权利要求1-8任一项所述的方法。
PCT/CN2018/111660 2017-10-27 2018-10-24 一种批注生成的方法及相关装置 WO2019080873A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201711022730.1 2017-10-27
CN201711022730.1A CN109726367B (zh) 2017-10-27 2017-10-27 一种批注展示的方法及相关装置

Publications (1)

Publication Number Publication Date
WO2019080873A1 true WO2019080873A1 (zh) 2019-05-02

Family

ID=66247176

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/111660 WO2019080873A1 (zh) 2017-10-27 2018-10-24 一种批注生成的方法及相关装置

Country Status (2)

Country Link
CN (1) CN109726367B (zh)
WO (1) WO2019080873A1 (zh)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111428453A (zh) * 2019-12-31 2020-07-17 杭州海康威视数字技术股份有限公司 批注同步过程中的处理方法、装置以及系统
CN111666451A (zh) * 2020-05-21 2020-09-15 北京梧桐车联科技有限责任公司 路书展示方法、装置、设备及存储介质
CN112685997A (zh) * 2020-12-31 2021-04-20 安徽鸿程光电有限公司 批注信息的显示方法、装置、设备及计算机可读存储介质
CN113542332A (zh) * 2020-04-22 2021-10-22 中移智行网络科技有限公司 基于定位标注的客服视频交互方法和设备
CN113807071A (zh) * 2021-08-31 2021-12-17 浙江浙大中控信息技术有限公司 一种基于ocr的文档生成方法

Families Citing this family (8)

Publication number Priority date Publication date Assignee Title
CN112580313B (zh) * 2019-09-30 2023-06-20 广州视源电子科技股份有限公司 一种课件的批注方法、系统、设备和存储介质
CN111428452B (zh) * 2019-11-27 2023-09-05 杭州海康威视数字技术股份有限公司 一种批注数据保存方法及装置
CN111325004B (zh) 2020-02-21 2021-08-31 腾讯科技(深圳)有限公司 一种文件评论、查看方法
CN111382561B (zh) * 2020-03-13 2022-11-01 北大方正集团有限公司 文件校验方法、装置、设备及存储介质
CN111785098B (zh) * 2020-06-30 2021-08-13 南京百家云科技有限公司 一种课程文件的生成方法、装置、电子设备及存储介质
CN112084756B (zh) * 2020-09-08 2023-10-10 远光软件股份有限公司 会议文件生成方法、装置及电子设备
CN116719459A (zh) * 2022-09-26 2023-09-08 荣耀终端有限公司 批注框的显示方法、电子设备及可读存储介质
CN116362697A (zh) * 2023-06-01 2023-06-30 北京尽微致广信息技术有限公司 一种协作处理方法、系统、存储介质及电子设备

Citations (3)

Publication number Priority date Publication date Assignee Title
CN1858786A (zh) * 2006-06-09 2006-11-08 宋丽娟 一种电子文档格式化批注系统与方法
CN105701078A (zh) * 2014-11-25 2016-06-22 珠海金山办公软件有限公司 一种文档批注分类方法及装置
CN105743973A (zh) * 2016-01-22 2016-07-06 上海科牛信息科技有限公司 一种多人多设备实时同步云协作方法及系统

Family Cites Families (26)

Publication number Priority date Publication date Assignee Title
KR100512138B1 (ko) * 2000-03-08 2005-09-02 엘지전자 주식회사 합성 키프레임을 이용한 비디오 브라우징 시스템
US7577901B1 (en) * 2000-03-15 2009-08-18 Ricoh Co., Ltd. Multimedia document annotation
US7647555B1 (en) * 2000-04-13 2010-01-12 Fuji Xerox Co., Ltd. System and method for video access from notes or summaries
US7346841B2 (en) * 2000-12-19 2008-03-18 Xerox Corporation Method and apparatus for collaborative annotation of a document
US7222300B2 (en) * 2002-06-19 2007-05-22 Microsoft Corporation System and method for automatically authoring video compositions using video cliplets
US7394969B2 (en) * 2002-12-11 2008-07-01 Eastman Kodak Company System and method to compose a slide show
CN1285045C (zh) * 2005-01-31 2006-11-15 王小元 一种电子页面手写批注方法
CN101499977A (zh) * 2008-01-28 2009-08-05 万德洪 一种即时通讯系统及实现方法
US8612469B2 (en) * 2008-02-21 2013-12-17 Globalenglish Corporation Network-accessible collaborative annotation tool
US8892553B2 (en) * 2008-06-18 2014-11-18 Microsoft Corporation Auto-generation of events with annotation and indexing
US20110249954A1 (en) * 2010-04-09 2011-10-13 Microsoft Corporation Capturing presentations in online conferences
CN101930779B (zh) * 2010-07-29 2012-02-29 华为终端有限公司 一种视频批注方法及视频播放器
US8924884B2 (en) * 2010-12-06 2014-12-30 International Business Machines Corporation Automatically capturing and annotating content
WO2012170913A1 (en) * 2011-06-08 2012-12-13 Vidyo, Inc. Systems and methods for improved interactive content sharing in video communication systems
US8380040B2 (en) * 2011-07-18 2013-02-19 Fuji Xerox Co., Ltd. Systems and methods of capturing and organizing annotated content on a mobile device
CN103024602B (zh) * 2011-09-23 2016-10-05 华为技术有限公司 一种针对视频添加批注的方法及装置
US20140122991A1 (en) * 2012-03-25 2014-05-01 Imc Technologies Sa Fast annotation of electronic content and mapping of same
KR101984823B1 (ko) * 2012-04-26 2019-05-31 삼성전자주식회사 웹 페이지에 주석을 부가하는 방법 및 그 디바이스
CN103517158B (zh) * 2012-06-25 2017-02-22 华为技术有限公司 一种生成可展示视频批注的视频的方法、装置及系统
US20140006921A1 (en) * 2012-06-29 2014-01-02 Infosys Limited Annotating digital documents using temporal and positional modes
US9542377B2 (en) * 2013-05-06 2017-01-10 Dropbox, Inc. Note browser
CN103500158A (zh) * 2013-10-08 2014-01-08 北京百度网讯科技有限公司 批注电子文档的方法和装置
CN103514297B (zh) * 2013-10-16 2022-02-08 上海合合信息科技股份有限公司 文本增加批注数据的方法及装置,查询方法及装置
CN105741622B (zh) * 2016-05-13 2018-12-18 福州新锐同创电子科技有限公司 基于触控操作的数字化教材制作系统
CN106021216A (zh) * 2016-05-24 2016-10-12 杭州圆本科技有限公司 电子批注方法
CN105844987B (zh) * 2016-05-30 2019-10-08 深圳科润视讯技术有限公司 多媒体教学互动操作方法及装置

Patent Citations (3)

Publication number Priority date Publication date Assignee Title
CN1858786A (zh) * 2006-06-09 2006-11-08 宋丽娟 一种电子文档格式化批注系统与方法
CN105701078A (zh) * 2014-11-25 2016-06-22 珠海金山办公软件有限公司 一种文档批注分类方法及装置
CN105743973A (zh) * 2016-01-22 2016-07-06 上海科牛信息科技有限公司 一种多人多设备实时同步云协作方法及系统

Cited By (8)

Publication number Priority date Publication date Assignee Title
CN111428453A (zh) * 2019-12-31 2020-07-17 杭州海康威视数字技术股份有限公司 批注同步过程中的处理方法、装置以及系统
CN111428453B (zh) * 2019-12-31 2023-09-05 杭州海康威视数字技术股份有限公司 批注同步过程中的处理方法、装置以及系统
CN113542332A (zh) * 2020-04-22 2021-10-22 中移智行网络科技有限公司 基于定位标注的客服视频交互方法和设备
CN113542332B (zh) * 2020-04-22 2023-04-07 中移智行网络科技有限公司 基于定位标注的客服视频交互方法和设备
CN111666451A (zh) * 2020-05-21 2020-09-15 北京梧桐车联科技有限责任公司 路书展示方法、装置、设备及存储介质
CN111666451B (zh) * 2020-05-21 2023-06-23 北京梧桐车联科技有限责任公司 路书展示方法、装置、服务器、终端及存储介质
CN112685997A (zh) * 2020-12-31 2021-04-20 安徽鸿程光电有限公司 批注信息的显示方法、装置、设备及计算机可读存储介质
CN113807071A (zh) * 2021-08-31 2021-12-17 浙江浙大中控信息技术有限公司 一种基于ocr的文档生成方法

Also Published As

Publication number Publication date
CN109726367A (zh) 2019-05-07
CN109726367B (zh) 2022-06-10

Similar Documents

Publication Publication Date Title
WO2019080873A1 (zh) 一种批注生成的方法及相关装置
WO2019120191A1 (zh) 多段文本复制方法及移动终端
WO2021104365A1 (zh) 对象分享方法及电子设备
WO2022017107A1 (zh) 信息处理方法、装置、计算机设备及存储介质
WO2021077897A1 (zh) 文件发送方法、装置和电子设备
WO2018072459A1 (zh) 一种屏幕截图和读取的方法及终端
US20200183499A1 (en) Apparatus, system, and method for transferring data from a terminal to an electromyography (emg) device
WO2016045226A1 (zh) 信息处理方法和装置
WO2018141144A1 (zh) 一种文本和语音信息的处理方法以及终端
WO2016184302A1 (zh) 消息转发方法及电子设备
WO2018196588A1 (zh) 一种信息分享方法、装置和系统
US20140365918A1 (en) Incorporating external dynamic content into a whiteboard
CN111432265B (zh) 一种处理视频画面的方法、相关装置及存储介质
WO2020238938A1 (zh) 信息输入方法及移动终端
WO2021147785A1 (zh) 思维导图显示方法及电子设备
WO2019214072A1 (zh) 一种显示输入法虚拟键盘的方法及终端
JP6910300B2 (ja) チャット履歴記録を表示するための方法およびチャット履歴記録を表示するための装置
US20230118214A1 (en) Method and apparatus for editing electronic document, device, and storage medium
CN109948102B (zh) 页面内容编辑方法及终端
CN110278141B (zh) 一种即时通讯信息的处理方法、装置及存储介质
CN107071512B (zh) 一种配音方法、装置及系统
US20150025882A1 (en) Method for operating conversation service based on messenger, user interface and electronic device using the same
WO2021104160A1 (zh) 编辑方法及电子设备
WO2022156606A1 (zh) 信息处理方法、装置及电子设备
WO2021169954A1 (zh) 搜索方法及电子设备

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18871278

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18871278

Country of ref document: EP

Kind code of ref document: A1