WO2020047719A1 - Shorthand method and apparatus, terminal, and storage medium - Google Patents

Shorthand method and apparatus, terminal, and storage medium

Info

Publication number
WO2020047719A1
WO2020047719A1 PCT/CN2018/103832 CN2018103832W WO2020047719A1 WO 2020047719 A1 WO2020047719 A1 WO 2020047719A1 CN 2018103832 W CN2018103832 W CN 2018103832W WO 2020047719 A1 WO2020047719 A1 WO 2020047719A1
Authority
WO
WIPO (PCT)
Prior art keywords
shorthand
language type
terminal
recording
recorded data
Prior art date
Application number
PCT/CN2018/103832
Other languages
English (en)
French (fr)
Inventor
谢琴
Original Assignee
深圳市欢太科技有限公司
Oppo广东移动通信有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳市欢太科技有限公司, Oppo广东移动通信有限公司
Priority to CN201880096704.XA (CN112585562A)
Priority to PCT/CN2018/103832 (WO2020047719A1)
Publication of WO2020047719A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01: Input arrangements or combined input and output arrangements for interaction between user and computer

Definitions

  • the embodiments of the present invention relate to electronic technology, and relate to, but are not limited to, a shorthand method and device, a terminal, and a storage medium.
  • embodiments of the present invention provide a shorthand method, device, terminal, and storage medium in order to solve at least one problem in the related art.
  • an embodiment of the present invention provides a shorthand method, which is applied to a terminal.
  • the method includes: determining a language type of recorded data to be shorthanded; using a speech recognition engine corresponding to the language type to recognize the recorded data and obtain the text content of the recorded data; and saving the text content as a shorthand entry.
  • an embodiment of the present invention provides a shorthand device, the device includes: a language type determination module configured to determine a language type of recorded data to be shorthanded; a speech recognition module configured to use a speech recognition engine corresponding to the language type to recognize the recorded data and obtain the text content of the recorded data; and a storage module configured to save the text content as a shorthand entry.
  • an embodiment of the present invention provides a terminal, including a memory and a processor.
  • the memory stores a computer program that can run on the processor, and the processor implements the steps in the shorthand method described above when the program is executed.
  • an embodiment of the present invention provides a computer-readable storage medium on which a computer program is stored.
  • when the computer program is executed by a processor, the steps in the shorthand method described above are implemented.
  • the speech recognition engine matching the language type is used to perform text conversion on the recorded data, which can improve the accuracy of speech recognition.
  • FIG. 1A is a schematic flowchart of a shorthand method according to an embodiment of the present invention.
  • FIG. 1B is a schematic flowchart of converting recorded data to text content according to an embodiment of the present invention.
  • FIG. 2A is a schematic flowchart of another shorthand method according to an embodiment of the present invention.
  • FIG. 2B is a schematic diagram of a sliding operation according to an embodiment of the present invention.
  • FIG. 2C is an interface diagram showing a shorthand application icon according to an embodiment of the present invention.
  • FIG. 2D is an interface diagram showing a shorthand mode according to an embodiment of the present invention.
  • FIG. 2E is an interface diagram showing a recording button according to an embodiment of the present invention.
  • FIG. 2F is another interface diagram showing a recording button according to an embodiment of the present invention.
  • FIG. 2G is a schematic flowchart of determining a language type according to an embodiment of the present invention.
  • FIG. 3A is a schematic flowchart of yet another shorthand method according to an embodiment of the present invention.
  • FIG. 3B is an interface diagram showing language type options according to an embodiment of the present invention.
  • FIG. 3C is another interface diagram showing language type options according to an embodiment of the present invention.
  • FIG. 4 is a schematic flowchart of still another shorthand method according to an embodiment of the present invention.
  • FIG. 5A is a schematic flowchart of responding to an operation on a shorthand card according to an embodiment of the present invention.
  • FIG. 5B is a schematic diagram showing a first shorthand interface according to an embodiment of the present invention.
  • FIG. 6A is a schematic structural diagram of a shorthand device according to an embodiment of the present invention.
  • FIG. 6B is a schematic structural diagram of another shorthand device according to an embodiment of the present invention.
  • FIG. 6C is a schematic structural diagram of yet another shorthand device according to an embodiment of the present invention.
  • FIG. 7 is a schematic diagram of a hardware entity of a terminal according to an embodiment of the present invention.
  • FIG. 1A is a schematic flowchart of a shorthand method according to an embodiment of the present invention. As shown in FIG. 1A, the method includes the following steps:
  • For example, the language type may be Mandarin (Putonghua), Henan dialect, Sichuan dialect, Cantonese, Hong Kong dialect, and so on; that is, languages can be divided into different types according to their linguistic features.
  • S12 Use a speech recognition engine corresponding to the language type to identify the recorded data to obtain the text content of the recorded data.
  • Different users may speak different languages. For example, user A speaks Mandarin, user B speaks Henan dialect, and user C speaks Sichuan dialect. When these three users take voice shorthand through the shorthand application, if the application uses the same speech recognition engine (for example, the engine corresponding to Mandarin) to recognize each user's recorded data, the recognition results for users B and C are less accurate than the result for user A.
  • Based on this, the language type of the recorded data to be shorthanded can be determined first, and the recorded data is then recognized using the speech recognition engine corresponding to that language type. Matching the engine to the language type improves the recognition accuracy for the recorded data and improves the user experience.
  • In other embodiments, to guard against large discrepancies between the text content and the actual content of the recorded data, the recorded data is also saved in the shorthand entry, so that when the user does not understand the meaning of the text content, the user can play back the recorded data to clarify the matter or idea to be handled.
  • the shorthand entries are saved in a shorthand application. Of course, the shorthand entry can also be saved in other applications.
  • the speech recognition engine matching the language type is used to perform text conversion on the recorded data, which can improve the accuracy of speech recognition.
  • In other embodiments, step S12, in which the speech recognition engine corresponding to the language type is used to recognize the recorded data and obtain its text content, may include the following steps, as shown in FIG. 1B:
  • FIG. 2A is a schematic flowchart of another shorthand method according to an embodiment of the present invention. As shown in FIG. 2A, the method may include the following steps:
  • the terminal when the terminal is in the lock screen state, if it receives a touch instruction that satisfies a preset condition, for example, as shown in FIG. 2B, in the lock screen state, the terminal detects the touch operation edge.
  • a preset time for example, 20 milliseconds
  • it is determined that the received instruction is a startup instruction for the shorthand application.
  • When the terminal runs a non-shorthand application in the foreground, the user can long-press the non-shorthand application interface to call out a function button for starting the shorthand application.
  • a function button for shorthand is displayed on the non-shorthand application interface.
  • When the function button receives an operation instruction, it is determined that a startup instruction for the shorthand application has been received.
  • the icon 21 of the shorthand application is displayed on the terminal screen 20.
  • When the icon 21 receives an operation instruction, it is determined that the startup instruction of the shorthand application is received, and the shorthand mode is displayed at this time.
  • the user can wake up the terminal and invoke the shorthand application by voice, for example by saying "Xiaoou, open shorthand"; that is, when the terminal detects through speech recognition technology that the received voice content relates to starting the shorthand application, the received voice instruction is determined to be a startup instruction of the shorthand application.
  • the user can use the control center key to invoke the shorthand application with one key.
  • the shorthand mode further includes one of the following: a shooting shorthand mode and a text shorthand mode.
  • FIG. 2D is an interface diagram showing a shorthand mode according to an embodiment of the present invention.
  • the display interface 22 of the shorthand mode includes a function button 221 for invoking the text shorthand mode and a function button 222 for invoking the voice shorthand mode.
  • S23 Receive a first selection instruction, where the first selection instruction is used to instruct selection of a voice shorthand mode
  • After the first selection instruction is received, as shown in FIG. 2E, the display interface of the terminal jumps from the display interface 22 of the shorthand mode to the interface 23 including the recording button 231.
  • The user can press and hold the recording button 231 to record; that is, when the recording button 231 receives a touch operation, it is determined that the recording instruction is received.
  • The terminal then starts the recording function and begins recording the user's speech.
  • When the touch operation on the recording button 231 is released, it is determined that the recording has ended.
  • After the first selection instruction is received, as shown in FIG. 2F, the terminal display interface jumps from the display interface 22 of the shorthand mode to the interface 24, which displays a button 241 for starting recording and a button 242 for ending recording. The user can click the button 241 to start the recording function and, when finished, click the button 242 to end the recording.
  • S26 Use a speech recognition engine corresponding to the language type to identify the recorded data to obtain the text content of the recorded data.
  • A method for calling up the voice shorthand mode is provided: after a startup instruction is received, one or more shorthand modes including the voice shorthand mode are displayed for the user to select, and when an operation instruction on the voice shorthand mode is received, the voice shorthand function is started.
  • determining the language type of the recording data to be shorthanded may include the following steps:
  • the accuracy of the current location is not limited.
  • the current location may be the country, province, autonomous region, municipality, or city where the terminal is currently located.
  • the resident location may be the country, province, autonomous region, municipality, city, etc. where the terminal resides.
  • S254. Determine the language type of the recorded data as the mother tongue of the country to which the current location belongs.
  • the resident location is usually the hometown location of the user of the terminal.
  • the user generally speaks the hometown dialect when in the hometown.
  • elsewhere, the user may speak the national language (such as Mandarin).
  • Therefore, when the current location is the resident location, the language type of the recorded data is determined according to the resident location; for example, if the user's resident location is Chengdu, the language type of the recorded data is Sichuan dialect. When the current location is not the resident location, the language type of the recorded data is determined to be the mother tongue of the country to which the current location belongs; for example, if the user's current location is Xi'an and the resident location is Zhengzhou, the user probably speaks Mandarin in Xi'an.
  • a method for determining a language type is provided, that is, the language type of the recorded data is determined according to whether the current location of the terminal is a resident location of the terminal.
  • FIG. 3A is a schematic flowchart of another shorthand method according to an embodiment of the present invention. As shown in FIG. 3A, the method may include the following steps:
  • S301 Receive a startup instruction, where the startup instruction is used to instruct to start a shorthand application;
  • receiving the first selection instruction "for the first time" means the first time the terminal ever receives the first selection instruction, not the first time it is received after each start of the shorthand application.
  • the language types that the user may speak can be determined according to the resident location and current location of the terminal and then displayed as options. For example, if the resident location is Zhengzhou and the current location is Xi'an, it is inferred that the user may speak Henan dialect, Shaanxi dialect, or Mandarin.
  • the interface 30 displaying the recording button 31 displays an option 32 for the user to select a language type.
  • the option 32 includes "Henan dialect", "Shaanxi dialect", and "Mandarin". Alternatively, all possible language types are provided in the options. For example, as shown in FIG. 3C, the interface 30 displaying the recording button 31 displays an option box 33 for selecting a language type, and the user can click the drop-down button 331 in the option box to browse all language types. Clicking the drop-down button 331 displays options such as "Henan dialect", "Sichuan dialect", "Mandarin", "Cantonese", "Min Nan", and "Hong Kong dialect", and the user can swipe up and down to view all language types.
  • S305 Receive a recording instruction, start recording, and obtain recording data to be shorthanded after the recording ends;
  • the language type of the recording data to be shorthanded is determined by displaying one or more language type options for the user to select.
  • FIG. 4 is a schematic flowchart of another shorthand method according to an embodiment of the present invention. As shown in FIG. 4, the method may include the following steps:
  • S402. Start the shorthand application according to the startup instruction, and display one or more shorthand modes of the shorthand application, where the one or more shorthand modes include a voice shorthand mode;
  • S404. Detect whether the current position of the terminal is the position of the terminal at the previous time; if so, perform step S405; otherwise, perform step S406;
  • S405. Determine the language type of the recorded data to be the language type indicated by the second selection instruction, and then proceed to step S410;
  • The advantage is that limiting when the one or more language type options are displayed reduces unnecessary interaction between the terminal and the user. When the current location of the terminal has not changed, for example, the current position and the position at the previous time are both in Chengdu, the language currently spoken by the user can be assumed to be the language type the user selected last time, that is, the language type indicated by the second selection instruction.
  • When the current location has changed, the language type that the user may speak may also have changed, so the one or more language type options can pop up for the user to reselect the language type; that is, the third selection instruction is received, and the language type of the recorded data is determined to be the language type indicated by the third selection instruction.
  • S408 Receive a recording instruction, start recording, and obtain recording data to be shorthanded after the recording ends;
  • S409. Determine the language type of the recorded data to be the language type indicated by the third selection instruction, and then proceed to step S410;
  • S410 Use a speech recognition engine corresponding to the language type to identify the recorded data to obtain the text content of the recorded data.
  • if the current position of the terminal has not changed, the language type selected by the user last time is determined as the language type of the recorded data this time;
  • if the current position of the terminal has changed, one or more language type options are displayed for the user to select again. Limiting when the language type options are displayed in this way avoids frequent interaction with the user.
  • the saving the text content as a shorthand entry may include: recording the text content and the saving time in a shorthand card of the shorthand application, and saving the shorthand card.
  • the method further includes the following steps:
  • the first shorthand interface can be called up and displayed in any working state of the terminal screen. For example, when the terminal is in the lock screen state and receives a touch instruction that meets a preset condition, it can directly call up and display the first shorthand interface. As another example, if the terminal detects that the user's touch time on a non-shorthand application interface exceeds a preset time threshold, a function button for shorthand is displayed on the non-shorthand application interface; when the function button receives an operation instruction, the first shorthand interface is called up directly.
  • the manner of triggering the display of the first shorthand interface here is the same as the above-mentioned manner of invoking the shorthand application, so no further details are given here.
  • the operation instruction includes at least one of the following:
  • Expanding or collapsing the shorthand card, changing the card-level color identification, deleting the shorthand card, forwarding the shorthand card, adding attachments (e.g., pictures, web page links, etc.), and recording instructions. For example, users can add shorthand content by long-pressing the record button on the shorthand card.
  • the shorthand card 501 displays: the last save time 5011 of the shorthand card 501, a function button 5012 for expanding or retracting the shorthand card 501, the text content 5013, and a recording button 5014 for adding recording data
  • the color mark 5015 used to characterize the card level; for example, red means that the card's content is first-level processing content and yellow means that the card's content is second-level processing content, the red label having a higher priority than the yellow label
  • a function button 5016 for changing the color identification 5015, a function button 5017 for deleting the shorthand card 501, a function button 5018 for adding an attachment, and a function button 5019 for forwarding the shorthand card 501;
  • the contents of historical shorthand cards 502 to 50N are displayed in a collapsed state.
  • the user can expand and view all the contents recorded in a historical shorthand card by touching the area where that card (one of the historical shorthand cards 502 to 50N) is located. That is, when a historical shorthand card receives an operation instruction, the entire contents of that card are expanded;
  • the function button 506 includes at least one of the following: a function button 5061 for invoking the text shorthand mode, a function button 5062 for invoking the voice shorthand mode, and a function button 5063 for invoking the shooting shorthand mode.
  • an embodiment of the present invention provides a shorthand device.
  • The modules included in the device, and the units included in each module, can be implemented by a processor in a terminal; they can also be implemented by specific logic circuits. In implementation, the processor may be a central processing unit (CPU), a microprocessor (MPU), a digital signal processor (DSP), or a field programmable gate array (FPGA).
  • FIG. 6A is a schematic structural diagram of a shorthand device according to an embodiment of the present invention.
  • the device 600 includes: a language type determination module 601 configured to determine the language type of the recorded data to be shorthanded; a speech recognition module 602 configured to use a speech recognition engine corresponding to the language type to recognize the recorded data and obtain the text content of the recorded data; and a saving module 603 configured to save the text content as a shorthand entry.
  • the speech recognition module 602 may be configured as:
  • the corresponding speech recognition engine is called according to the identification of the speech recognition engine, and the recorded data is identified to obtain the text content of the recorded data.
  • the apparatus 600 further includes:
  • the receiving module 604 is configured to receive a startup instruction, where the startup instruction is used to instruct to start a shorthand application;
  • a startup module 605 configured to start the shorthand application according to the startup instruction
  • a display module 606 configured to display one or more shorthand modes of the shorthand application, the one or more shorthand modes including a voice shorthand mode;
  • the receiving module 604 is configured to receive a first selection instruction, where the first selection instruction is used to instruct selection of the voice shorthand mode;
  • the first acquisition module 607 is configured to start recording when a recording instruction is received, and after recording ends, acquire recording data to be shorthanded, and trigger a language type determination module 601.
  • the language type determination module 601 may be configured as:
  • if the current location of the terminal is the resident location of the terminal, determine the language type of the recorded data according to the resident location of the terminal;
  • if the current location of the terminal is not the resident location of the terminal, determine that the language type of the recorded data is the native language of the country to which the current location belongs.
  • the apparatus 600 further includes:
  • a display module 606 configured to display one or more language type options when the first selection instruction is received for the first time
  • the receiving module 604 is configured to receive a second selection instruction, where the second selection instruction is used to instruct to select one of the one or more language type options, and trigger the first obtaining module 607.
  • the language type determination module 601 is configured to determine that a language type of the recorded data is a language type indicated by the second selection instruction.
  • the apparatus 600 further includes:
  • a second obtaining module 608, configured to obtain a current position of the terminal when the first selection instruction is received next time
  • a display module 606 configured to display the one or more language type options if the current position of the terminal is not the position of the terminal at a previous time;
  • the receiving module 604 is configured to receive a third selection instruction, where the third selection instruction is used to instruct to select one of the one or more language type options, and trigger the first obtaining module 607;
  • the language type determination module 601 is configured to determine that a language type of the recorded data is a language type indicated by the third selection instruction.
  • the language type determination module 601 is configured to determine, if the current position of the terminal is the position of the terminal at a previous time, that the language type of the recorded data is the language type indicated by the second selection instruction.
  • the technical solutions of the embodiments of the present invention, in essence or in the part that contributes to the related art, can be embodied in the form of a software product.
  • the computer software product is stored in a storage medium and includes several instructions for enabling a terminal (which may be a mobile phone, a tablet computer, a desktop computer, a personal digital assistant, a navigator, a digital phone, a video phone, a television, a sensing device, etc.) to execute all or part of the method described in each embodiment of the present invention.
  • the foregoing storage medium includes: a U disk, a mobile hard disk, a read-only memory (Read Only Memory, ROM), a magnetic disk, or an optical disk, which can store program codes.
  • FIG. 7 is a schematic diagram of a hardware entity of a terminal according to an embodiment of the present invention.
  • the terminal 700 includes a memory 701 and a processor 702.
  • the memory 701 stores a computer program that can be run on the processor 702, and the processor 702 executes the program to implement the steps in the shorthand method provided in the foregoing embodiment.
  • the memory 701 is configured to store instructions and applications executable by the processor 702, and may also buffer data to be processed or already processed by the processor 702 and by each module in the terminal 700 (for example, image data, audio data, voice communication data, and video communication data); it can be implemented by flash memory (FLASH) or random access memory (RAM).
  • an embodiment of the present invention provides a computer-readable storage medium on which a computer program is stored.
  • when the computer program is executed by a processor, the steps in the shorthand method provided in the foregoing embodiment are implemented.
  • "one embodiment" or "an embodiment" mentioned throughout the specification means that a particular feature, structure, or characteristic related to the embodiment is included in at least one embodiment of the present invention.
  • the appearances of "in one embodiment” or “in an embodiment” appearing throughout the specification are not necessarily referring to the same embodiment.
  • the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
  • the magnitude of the sequence numbers of the above processes does not imply an order of execution; the execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiments of the present invention.
  • the sequence numbers of the foregoing embodiments of the present invention are only for description, and do not represent the superiority or inferiority of the embodiments.
  • the disclosed device and method may be implemented in other ways.
  • the device embodiments described above are only schematic.
  • the division of the units is only a logical function division.
  • in actual implementation there may be other division manners; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the mutual coupling, direct coupling, or communication connection between the displayed or discussed components may be through some interfaces; the indirect coupling or communication connection between devices or units may be electrical, mechanical, or in other forms.
  • the units described above as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units; they may be located in one place or distributed across multiple network units; some or all of the units may be selected according to actual needs to achieve the objective of the solution of this embodiment.
  • each functional unit in each embodiment of the present invention may be integrated into one processing unit, or each unit may serve as a separate unit, or two or more units may be integrated into one unit; the integrated unit can be implemented in the form of hardware, or in the form of hardware plus software functional units.
  • the foregoing program may be stored in a computer-readable storage medium; when executed, the program performs the steps of the above method embodiments.
  • the foregoing storage medium includes various types of media that can store program codes, such as a mobile storage device, a read-only memory (ROM), a magnetic disk, or an optical disc.
  • if the above-mentioned integrated unit of the present invention is implemented in the form of a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium.
  • the computer software product is stored in a storage medium and includes several instructions for enabling a terminal (which may be a mobile phone, a tablet computer, a desktop computer, a personal digital assistant, a navigator, a digital phone, a video phone, a television, a sensing device, etc.) to perform all or part of the method described in each embodiment of the present invention.
  • the foregoing storage media include: various types of media that can store program codes, such as a mobile storage device, a ROM, a magnetic disk, or an optical disc.
  • the speech recognition engine matching the language type is used to perform text conversion on the recorded data, which can improve the accuracy of speech recognition.
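
The location-based rule described in the list above (use the dialect of the resident location when the terminal is there, otherwise the mother tongue of the country of the current location) can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the lookup tables, the function names `national_language` and `determine_language_type`, and the use of plain city names as locations are all assumptions; only the city and dialect examples come from the text.

```python
from typing import Dict

# Hypothetical lookup tables. The place names follow the examples in the
# text; the tables themselves are illustrative assumptions.
CITY_DIALECT: Dict[str, str] = {
    "Chengdu": "Sichuan dialect",
    "Zhengzhou": "Henan dialect",
}
CITY_COUNTRY: Dict[str, str] = {
    "Chengdu": "China",
    "Zhengzhou": "China",
    "Xi'an": "China",
}
COUNTRY_LANGUAGE: Dict[str, str] = {"China": "Mandarin"}


def national_language(city: str) -> str:
    """Return the mother tongue of the country that the city belongs to."""
    return COUNTRY_LANGUAGE[CITY_COUNTRY[city]]


def determine_language_type(current_location: str, resident_location: str) -> str:
    """Choose a language type from the terminal's current and resident locations."""
    if current_location == resident_location and resident_location in CITY_DIALECT:
        # At the resident location, assume the user speaks the local dialect.
        return CITY_DIALECT[resident_location]
    # Away from the resident location, fall back to the national language.
    return national_language(current_location)
```

With these tables, a terminal whose resident and current location are both Chengdu maps to Sichuan dialect, while a Zhengzhou resident currently in Xi'an maps to Mandarin, matching the examples given in the list.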

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

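Embodiments of the present invention provide a shorthand method and apparatus, a terminal, and a storage medium. The method includes: determining the language type of the recorded data to be shorthanded; recognizing the recorded data with a speech recognition engine corresponding to the language type to obtain the text content of the recorded data; and saving the text content as a shorthand entry.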

Description

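Shorthand method and apparatus, terminal, and storage medium
Technical Field
Embodiments of the present invention relate to electronic technology, and in particular, but not exclusively, to a shorthand method and apparatus, a terminal, and a storage medium.
Background
As the pace of life accelerates, people have to handle a great many tasks in daily life and work. Multitaskers, forgetful people, and mobile workers in particular often download a shorthand application to their mobile phones to quickly record to-do items, memos, ideas, and the like. Compared with text shorthand, voice shorthand is faster and more convenient, and is therefore favored by many users.
However, when a user takes voice shorthand through a shorthand application, speech recognition accuracy may be low; that is, when the recorded data to be shorthanded is converted to text, the converted text content contains errors.
Summary
In view of this, embodiments of the present invention provide a shorthand method and apparatus, a terminal, and a storage medium to solve at least one problem in the related art.
The technical solutions of the embodiments of the present invention are implemented as follows:
In a first aspect, an embodiment of the present invention provides a shorthand method applied to a terminal. The method includes: determining the language type of the recorded data to be shorthanded; using a speech recognition engine corresponding to the language type to recognize the recorded data and obtain the text content of the recorded data; and saving the text content as a shorthand entry.
In a second aspect, an embodiment of the present invention provides a shorthand device. The device includes: a language type determination module configured to determine the language type of the recorded data to be shorthanded; a speech recognition module configured to use a speech recognition engine corresponding to the language type to recognize the recorded data and obtain the text content of the recorded data; and a saving module configured to save the text content as a shorthand entry.
In a third aspect, an embodiment of the present invention provides a terminal including a memory and a processor. The memory stores a computer program that can run on the processor, and the processor implements the steps in the above shorthand method when executing the program.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium on which a computer program is stored. When the computer program is executed by a processor, the steps in the above shorthand method are implemented.
In the embodiments of the present invention, after the language type of the recorded data to be shorthanded is determined, the recorded data is converted to text using a speech recognition engine matching that language type, which can improve speech recognition accuracy.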
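Brief Description of the Drawings
Fig. 1A is a schematic flowchart of an implementation of a shorthand method according to an embodiment of the present invention;
Fig. 1B is a schematic flowchart of an implementation of converting recorded data into text content according to an embodiment of the present invention;
Fig. 2A is a schematic flowchart of an implementation of another shorthand method according to an embodiment of the present invention;
Fig. 2B is a schematic diagram of a sliding operation according to an embodiment of the present invention;
Fig. 2C is a diagram of an interface displaying the shorthand application icon according to an embodiment of the present invention;
Fig. 2D is a diagram of an interface displaying the shorthand modes according to an embodiment of the present invention;
Fig. 2E is a diagram of an interface displaying a record button according to an embodiment of the present invention;
Fig. 2F is a diagram of another interface displaying record buttons according to an embodiment of the present invention;
Fig. 2G is a schematic flowchart of an implementation of determining the language type according to an embodiment of the present invention;
Fig. 3A is a schematic flowchart of an implementation of yet another shorthand method according to an embodiment of the present invention;
Fig. 3B is a diagram of an interface displaying language type options according to an embodiment of the present invention;
Fig. 3C is a diagram of another interface displaying language type options according to an embodiment of the present invention;
Fig. 4 is a schematic flowchart of an implementation of still another shorthand method according to an embodiment of the present invention;
Fig. 5A is a schematic flowchart of an implementation of responding to an operation on a shorthand card according to an embodiment of the present invention;
Fig. 5B is a schematic diagram of displaying a first shorthand interface according to an embodiment of the present invention;
Fig. 6A is a schematic structural diagram of a shorthand device according to an embodiment of the present invention;
Fig. 6B is a schematic structural diagram of another shorthand device according to an embodiment of the present invention;
Fig. 6C is a schematic structural diagram of yet another shorthand device according to an embodiment of the present invention;
Fig. 7 is a schematic diagram of a hardware entity of a terminal according to an embodiment of the present invention.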
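Detailed Description
The technical solutions of the present invention are described in further detail below with reference to the accompanying drawings and embodiments.
An embodiment of the present invention provides a shorthand method applied to a terminal. Fig. 1A is a schematic flowchart of an implementation of the shorthand method according to an embodiment of the present invention. As shown in Fig. 1A, the method includes the following steps:
S11. Determine the language type of the recorded data to be transcribed as shorthand;
For example, the language type may be Mandarin, Henan dialect, Sichuan dialect, Guangdong dialect, Hong Kong dialect, and so on. In other words, languages can be divided into different types according to their linguistic features.
S12. Recognize the recorded data by using a speech recognition engine corresponding to the language type, to obtain the text content of the recorded data;
It can be understood that different users may speak different languages. For example, user A speaks Mandarin, user B speaks Henan dialect, and user C speaks Sichuan dialect. When these three users take voice shorthand through a shorthand application, if the application uses the same speech recognition engine (for example, the engine corresponding to Mandarin) to recognize the recorded data of all three, the recognition results for users B and C will be less accurate than the result for user A. For this reason, the language type of the recorded data to be transcribed can be determined first, and the recorded data can then be recognized by the speech recognition engine corresponding to that language type. Matching the speech recognition engine to the language type in this way improves the recognition accuracy for the recorded data and improves the user experience.
S13. Save the text content as a shorthand entry.
In other embodiments, to guard against a large discrepancy between the text content and the actual content of the recorded data, the recorded data is also saved in the shorthand entry, so that when the user cannot understand what the text content means, the user can play back the recorded data to clarify the matter to be handled, the idea, and so on. Generally, the shorthand entry is saved in the shorthand application. Of course, the shorthand entry may also be saved to another application.
In this embodiment of the present invention, after the language type of the recorded data to be transcribed as shorthand is determined, a speech recognition engine matching that language type is used to convert the recorded data into text, which can improve the accuracy of speech recognition.
In other embodiments, step S12, namely recognizing the recorded data by using a speech recognition engine corresponding to the language type to obtain the text content of the recorded data, may include the following steps, as shown in Fig. 1B:
S121. Determine the identifier of the speech recognition engine corresponding to the language type;
S122. Invoke the corresponding speech recognition engine according to its identifier, and recognize the recorded data to obtain the text content of the recorded data.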
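Steps S11 to S13, together with the identifier lookup of S121 and S122, can be sketched as a small dispatch table. This is only an illustrative sketch: the engine identifiers, the function names, and the placeholder recognizers are assumptions made for the example and do not come from the application, which does not specify any particular engine or API.

```python
from typing import Callable, Dict, List

# Hypothetical stand-ins for real speech recognition engines.
def recognize_mandarin(audio: bytes) -> str:
    return f"[Mandarin transcript, {len(audio)} bytes]"

def recognize_sichuan(audio: bytes) -> str:
    return f"[Sichuan dialect transcript, {len(audio)} bytes]"

# S121: language type -> engine identifier (identifiers are made up).
ENGINE_ID_BY_LANGUAGE: Dict[str, str] = {
    "Mandarin": "engine-mandarin",
    "Sichuan dialect": "engine-sichuan",
}

# Engine identifier -> callable engine.
ENGINES: Dict[str, Callable[[bytes], str]] = {
    "engine-mandarin": recognize_mandarin,
    "engine-sichuan": recognize_sichuan,
}

def transcribe(audio: bytes, language_type: str) -> str:
    engine_id = ENGINE_ID_BY_LANGUAGE[language_type]  # S121: find the identifier
    engine = ENGINES[engine_id]                       # S122: invoke by identifier
    return engine(audio)

def shorthand(audio: bytes, language_type: str, entries: List[str]) -> str:
    text = transcribe(audio, language_type)  # S12
    entries.append(text)                     # S13: save as one shorthand entry
    return text
```

Keeping the identifier as an intermediate key means a new engine can be registered without touching the calling code; an unknown language type raises `KeyError` here, and a real application would need a fallback policy that the description leaves open.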
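An embodiment of the present invention provides another shorthand method. Fig. 2A is a schematic flowchart of an implementation of this method. As shown in Fig. 2A, the method may include the following steps:
S21. Receive a launch instruction, where the launch instruction indicates that the shorthand application is to be launched;
It should be noted here that the way the launch instruction is input is not restricted. In other words, the shorthand application can be invoked globally: the user can call it up anytime and anywhere, which makes it easy to capture ideas quickly. For example, when the terminal is in the lock-screen state and receives a touch instruction that meets a preset condition, such as the case shown in Fig. 2B, where the terminal detects a touch operation sliding in the direction indicated by the inline figure (Figure PCTCN2018103832-appb-000001), it can determine that the received operation instruction is a launch instruction for invoking the shorthand application; the shorthand mode is then started and displayed. Alternatively, in the lock-screen state the user can invoke the shorthand application by pressing the volume-up key and the power key at the same time; that is, when the terminal receives an operation instruction from the volume-up key and an operation instruction from the power key within a preset time (for example, 20 milliseconds), it can determine that a launch instruction for the shorthand application has been received. As another example, while the terminal is running a non-shorthand application in the foreground, the user can long-press the interface of that application to call up a function button for launching the shorthand application; that is, if the terminal detects that the user's touch on the non-shorthand application interface lasts longer than a preset time threshold, it displays a shorthand function button on that interface, and when this function button receives an operation instruction, the terminal determines that a launch instruction for the shorthand application has been received. As a further example, as shown in Fig. 2C, an icon 21 of the shorthand application is displayed on the terminal screen 20; when the icon 21 receives an operation instruction, the terminal determines that a launch instruction for the shorthand application has been received and displays the shorthand modes. As yet another example, the user can wake the terminal and invoke the shorthand application with the voice command "Xiao Ou, open shorthand" ("小欧,打开速记"); that is, when the terminal detects through speech recognition that the received speech relates to launching the shorthand application, it determines that the received voice instruction is a launch instruction for the shorthand application. As a final example, the user can invoke the shorthand application with a single press of the control center key.
S22. Launch the shorthand application according to the launch instruction, and display one or more shorthand modes of the shorthand application, where the one or more shorthand modes include a voice shorthand mode;
In other embodiments, the shorthand modes further include one of the following: a photo shorthand mode and a text shorthand mode. Fig. 2D is a diagram of an interface displaying the shorthand modes according to an embodiment of the present invention. As shown in Fig. 2D, the shorthand mode display interface 22 includes: a function button 221 for invoking the text shorthand mode, a function button 222 for invoking the voice shorthand mode, and a function button 223 for invoking the photo shorthand mode.
S23. Receive a first selection instruction, where the first selection instruction indicates that the voice shorthand mode is selected;
When the function button 222 receives an operation instruction, it is determined that the first selection instruction has been received.
S24. Receive a recording instruction and start recording; after the recording ends, obtain the recorded data to be transcribed as shorthand;
Generally, after the first selection instruction is received, as shown in Fig. 2E, the terminal switches from the shorthand mode display interface 22 to an interface 23 that includes a record button 231. The user can record by pressing and holding the record button 231. That is, when the record button 231 receives a touch operation, it is determined that a recording instruction has been received; the terminal then starts its recording function and begins recording what the user says. When the touch operation on the record button 231 is released, the recording is determined to have ended. Alternatively, after the first selection instruction is received, as shown in Fig. 2F, the terminal switches from the shorthand mode display interface 22 to an interface 24 that includes a start-recording button 241 and a stop-recording button 242. The user can tap the button 241 to start recording and tap the button 242 to end it.
S25. Determine the language type of the recorded data to be transcribed as shorthand;
S26. Recognize the recorded data by using a speech recognition engine corresponding to the language type, to obtain the text content of the recorded data;
S27. Save the text content as a shorthand entry.
This embodiment of the present invention provides a way to call up the voice shorthand mode: after the launch instruction is received, one or more shorthand modes, including the voice shorthand mode, are displayed for the user to choose from, and when the voice shorthand mode receives an operation instruction, the voice shorthand function is started.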
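Two of the launch triggers described for step S21, the volume-up plus power key combination and the long press, reduce to simple timing checks. The sketch below is an assumption-laden illustration: the function names and the millisecond timestamps are hypothetical, the 20 ms window is the example value given in the description, and the long-press threshold is left to the caller because the description does not give one.

```python
PRESET_WINDOW_MS = 20  # example window from the description

def is_launch_combo(volume_up_ms: int, power_ms: int,
                    window_ms: int = PRESET_WINDOW_MS) -> bool:
    """True if both key events arrive within the preset window of each other."""
    return abs(volume_up_ms - power_ms) <= window_ms

def is_long_press(touch_duration_ms: int, threshold_ms: int) -> bool:
    """True if the touch lasted longer than the preset time threshold."""
    return touch_duration_ms > threshold_ms
```

For instance, `is_launch_combo(1000, 1015)` treats two key events 15 ms apart as a launch instruction, while events 30 ms apart would be handled as two ordinary key presses.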
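In other embodiments, determining the language type of the recorded data to be transcribed as shorthand, as described in step S11 or step S25, may include the following steps, as shown in Fig. 2G:
S251. Obtain the current location of the terminal;
S252. Detect whether the current location of the terminal is the resident location of the terminal; if so, perform step S253; otherwise, perform step S254;
The precision of the current location is not limited here. The current location may be the country, province, autonomous region, municipality, city, or the like where the terminal is currently located; correspondingly, the resident location may be the terminal's resident country, province, autonomous region, municipality, city, or the like.
S253. Determine the language type of the recorded data according to the resident location of the terminal;
S254. Determine that the language type of the recorded data is the native language of the country to which the current location belongs.
It can be understood that the resident location is usually the hometown of the terminal's user. Users generally speak their hometown dialect at home and may speak the native language (for example, Mandarin) once they leave it. The language type of the recorded data can therefore be determined by detecting whether the current location of the terminal is its resident location. When the current location is the resident location, the language type of the recorded data is determined according to the resident location; for example, if the user's resident location is Chengdu, the language type of the recorded data is Sichuan dialect. When the current location is not the resident location, the language type of the recorded data is determined to be the native language of the country to which the current location belongs; for example, if the user's current location is Xi'an and the resident location is Zhengzhou, the user is likely to be speaking Mandarin in Xi'an.
This embodiment of the present invention provides a method for determining the language type: the language type of the recorded data is determined according to whether the current location of the terminal is its resident location.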
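The decision in steps S251 to S254 can be sketched as follows. The lookup tables, the `Position` type, and the city-level granularity are assumptions chosen for illustration; the description leaves the location precision open. Falling back to the national language when a resident city has no dialect entry is also an assumption of this sketch, not something the application states.

```python
from dataclasses import dataclass

# Hypothetical lookup tables; a real system would need far richer data.
DIALECT_BY_CITY = {"Chengdu": "Sichuan dialect", "Zhengzhou": "Henan dialect"}
NATIVE_LANGUAGE_BY_COUNTRY = {"China": "Mandarin"}

@dataclass(frozen=True)
class Position:
    country: str
    city: str

def determine_language_type(current: Position, resident: Position) -> str:
    if current == resident:  # S252: is the terminal at its resident location?
        # S253: derive the language type from the resident location.
        return DIALECT_BY_CITY.get(
            resident.city, NATIVE_LANGUAGE_BY_COUNTRY[resident.country]
        )
    # S254: away from home, use the native language of the current country.
    return NATIVE_LANGUAGE_BY_COUNTRY[current.country]
```

With these tables, a terminal resident in Chengdu and currently in Chengdu yields "Sichuan dialect", while a terminal resident in Zhengzhou and currently in Xi'an yields "Mandarin", matching the two examples in the description.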
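An embodiment of the present invention provides yet another shorthand method. Fig. 3A is a schematic flowchart of an implementation of this method. As shown in Fig. 3A, the method may include the following steps:
S301. Receive a launch instruction, where the launch instruction indicates that the shorthand application is to be launched;
S302. Launch the shorthand application according to the launch instruction, and display one or more shorthand modes of the shorthand application, where the one or more shorthand modes include a voice shorthand mode;
S303. When the first selection instruction is received for the first time, display options for one or more language types;
Here, receiving the first selection instruction for the first time means the first time the terminal ever receives the first selection instruction, not the first time it is received after the shorthand application is launched. As for displaying the options for one or more language types, in general the language types the user is likely to speak can be determined from the terminal's resident location and current location and then displayed as options. For example, if the resident location is Zhengzhou and the current location is Xi'an, the user is presumed likely to speak Henan dialect, Shaanxi dialect, or Mandarin. In that case, as shown in Fig. 3B, the interface 30 that displays the record button 31 also displays options 32 for the user to select a language type, and the options 32 include "Henan dialect", "Shaanxi dialect", and "Mandarin". Alternatively, all possible language types can be provided as options. For example, as shown in Fig. 3C, the interface 30 that displays the record button 31 also displays an option box 33 for selecting a language type. The user can tap the drop-down button 331 in the option box to browse all language types; for example, tapping the drop-down button 331 displays options such as "Henan dialect", "Sichuan dialect", "Mandarin", "Guangdong dialect", "Minnan", and "Hong Kong dialect", and sliding up and down lets the user browse all of them.
S304. Receive a second selection instruction, where the second selection instruction indicates the selection of one of the options for the one or more language types, and then proceed to step S305;
S305. Receive a recording instruction and start recording; after the recording ends, obtain the recorded data to be transcribed as shorthand;
S306. Determine that the language type of the recorded data is the language type indicated by the second selection instruction;
S307. Recognize the recorded data by using a speech recognition engine corresponding to the language type, to obtain the text content of the recorded data;
S308. Save the text content as a shorthand entry.
In this embodiment of the present invention, options for one or more language types are displayed for the user to choose from, and the language type of the recorded data to be transcribed as shorthand is determined from that choice.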
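The first way of building the option list described for step S303, deriving candidates from the resident and current locations, can be sketched like this. The table contents and the function name are illustrative assumptions; the application does not say how the candidate list is actually computed or ordered.

```python
from typing import List

# Hypothetical city-to-dialect table, used only for this example.
DIALECT_BY_CITY = {
    "Zhengzhou": "Henan dialect",
    "Xi'an": "Shaanxi dialect",
    "Chengdu": "Sichuan dialect",
}

def language_options(resident_city: str, current_city: str,
                     national_language: str = "Mandarin") -> List[str]:
    """Candidate language types: home dialect, local dialect, national language."""
    options: List[str] = []
    for city in (resident_city, current_city):
        dialect = DIALECT_BY_CITY.get(city)
        if dialect and dialect not in options:  # skip unknown cities and repeats
            options.append(dialect)
    if national_language not in options:
        options.append(national_language)
    return options
```

Calling `language_options("Zhengzhou", "Xi'an")` reproduces the three options of the Fig. 3B example; when the two cities coincide, the duplicate dialect is listed only once.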
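An embodiment of the present invention provides still another shorthand method. Fig. 4 is a schematic flowchart of an implementation of this method. As shown in Fig. 4, the method may include the following steps:
S401. Receive a launch instruction, where the launch instruction indicates that the shorthand application is to be launched;
S402. Launch the shorthand application according to the launch instruction, and display one or more shorthand modes of the shorthand application, where the one or more shorthand modes include a voice shorthand mode;
S403. The next time the first selection instruction is received, obtain the current location of the terminal;
S404. Detect whether the current location of the terminal is the location of the terminal at the previous moment; if so, perform step S405; otherwise, perform step S406;
S405. Determine that the language type of the recorded data is the language type indicated by the second selection instruction, and then proceed to step S410;
S406. Display the options for the one or more language types;
The benefit of this is that limiting when the options for the one or more language types are displayed reduces unnecessary interaction between the terminal and the user. When the current location of the terminal has not changed, for example when both the current location and the location at the previous moment are Chengdu, the language the user is speaking now can be assumed to be the language type the user selected last time, that is, the language type indicated by the second selection instruction. When the current location of the terminal has changed, for example when the user has travelled from Chengdu to Xi'an, the language the user speaks may change, so the options for the one or more language types can be presented again for the user to reselect a language type; that is, a third selection instruction is received, and the language type of the recorded data is determined to be the language type indicated by the third selection instruction.
S407. Receive a third selection instruction, where the third selection instruction indicates the selection of one of the options for the one or more language types, and then proceed to step S408;
S408. Receive a recording instruction and start recording; after the recording ends, obtain the recorded data to be transcribed as shorthand;
S409. Determine that the language type of the recorded data is the language type indicated by the third selection instruction, and then proceed to step S410;
S410. Recognize the recorded data by using a speech recognition engine corresponding to the language type, to obtain the text content of the recorded data;
S411. Save the text content as a shorthand entry.
In this embodiment of the present invention, the next time the first selection instruction is received, if the current location of the terminal has not changed, the language type the user selected last time is taken as the language type of the current recorded data; if the current location of the terminal has changed, the options for one or more language types are displayed for the user to reselect. Limiting when the options are displayed in this way avoids frequent interaction with the user.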
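The location check of steps S404 to S407 amounts to caching the last selection together with the location where it was made. The class below is a minimal sketch under that reading: the class name, the `ask_user` callback, and the use of plain strings for locations are all assumptions made for the example.

```python
from typing import Callable, Optional

class LanguageSelector:
    """Remembers the last chosen language type and where it was chosen."""

    def __init__(self) -> None:
        self.last_position: Optional[str] = None
        self.last_language: Optional[str] = None

    def language_for(self, current_position: str,
                     ask_user: Callable[[], str]) -> str:
        # S404/S405: same location as before, so reuse the previous selection.
        if self.last_language is not None and current_position == self.last_position:
            return self.last_language
        # S406/S407: first use or changed location, so show options and ask.
        chosen = ask_user()
        self.last_position = current_position
        self.last_language = chosen
        return chosen
```

The callback stands in for displaying the option list and receiving the second or third selection instruction; it is invoked only when the location differs from the one stored with the previous choice, which is the reduction in interaction the embodiment aims for.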
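In other embodiments, saving the text content as a shorthand entry may include: recording the text content and the save time in a shorthand card of the shorthand application, and saving the shorthand card.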
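A shorthand card that records the text content and the save time might be modeled as below. The field names, the in-memory list used as the store, and the optional `audio` field (reflecting the embodiment that also keeps the recorded data) are assumptions for illustration; the application does not define a storage format.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional

@dataclass
class ShorthandCard:
    text: str                      # recognized text content
    saved_at: datetime             # save time recorded on the card
    audio: Optional[bytes] = None  # optional original recording for playback

def save_card(store: List[ShorthandCard], text: str,
              audio: Optional[bytes] = None,
              now: Optional[datetime] = None) -> ShorthandCard:
    """Record the text and save time in a card, then save the card."""
    card = ShorthandCard(text=text, saved_at=now or datetime.now(), audio=audio)
    store.append(card)
    return card
```

Passing `now` explicitly is only a convenience for testing; in normal use the save time defaults to the moment the card is created.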
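In other embodiments, as shown in Fig. 5A, the method further includes the following steps:
S501. Display the shorthand card on a first shorthand interface;
In practical applications, the first shorthand interface can be called up and displayed in any working state of the terminal screen. For example, when the terminal is in the lock-screen state and receives a touch instruction that meets a preset condition, the first shorthand interface is called up directly. As another example, if the terminal detects that the user's touch on a non-shorthand application interface lasts longer than a preset time threshold, it displays a shorthand function button on that interface, and when this function button receives an operation instruction, the first shorthand interface is called up directly. In fact, the ways of triggering the display of the first shorthand interface are the same as the ways of invoking the shorthand application described above, so no further examples are given here.
S502. When an operation instruction is received on the shorthand card, respond to the operation instruction;
The operation instruction includes at least one of the following:
expanding or collapsing the shorthand card, changing the color mark of the card level, deleting the shorthand card, forwarding the shorthand card, adding an attachment (for example, a picture or a web link), and a recording instruction. For example, the user can add shorthand content by pressing and holding the record button on the shorthand card.
In practical applications, as shown in Fig. 5B, in addition to the shorthand card 501, the first shorthand interface 50 also displays the content of historical shorthand cards 502 to 50N, a function button 503 for scrolling up and down through shorthand cards 501 to 50N, a function button 504 for searching shorthand cards 501 to 50N, a function button 505 for editing shorthand cards 501 to 50N (for example, deleting one of the shorthand cards or adding a new one), and a function button 506 for the user to select a shorthand mode; where:
the shorthand card 501 displays: the last save time 5011 of the shorthand card 501; a function button 5012 for expanding or collapsing the shorthand card 501; the text content 5013; a record button 5014 for adding recorded data; a color mark 5015 representing the card level (for example, red indicates that the card's content is first-level content to be handled, and yellow indicates that it is second-level content, with the yellow mark having lower priority than the red mark); a function button 5016 for changing the color mark 5015; a function button 5017 for deleting the shorthand card 501; a function button 5018 for adding an attachment; and a function button 5019 for forwarding the shorthand card 501;
the content of historical shorthand cards 502 to 50N is displayed in a collapsed state. The user can touch the area of any of the historical shorthand cards 502 to 50N to expand it and view everything recorded on the touched card; that is, when a historical shorthand card receives an operation instruction, the full content of that card is expanded;
the function button 506 includes at least one of the following: a function button 5061 for invoking the text shorthand mode, a function button 5062 for invoking the voice shorthand mode, and a function button 5063 for invoking the photo shorthand mode.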
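Based on the foregoing embodiments, an embodiment of the present invention provides a shorthand device. The modules included in the device, and the units included in each module, can be implemented by a processor in the terminal, or of course by specific logic circuits. In implementation, the processor may be a central processing unit (CPU), a microprocessor (MPU), a digital signal processor (DSP), a field-programmable gate array (FPGA), or the like.
Fig. 6A is a schematic structural diagram of a shorthand device according to an embodiment of the present invention. As shown in Fig. 6A, the device 600 includes: a language type determination module 601, configured to determine the language type of recorded data to be transcribed as shorthand; a speech recognition module 602, configured to recognize the recorded data by using a speech recognition engine corresponding to the language type, to obtain the text content of the recorded data; and a saving module 603, configured to save the text content as a shorthand entry.
In other embodiments, the speech recognition module 602 may be configured to:
determine the identifier of the speech recognition engine corresponding to the language type;
invoke the corresponding speech recognition engine according to its identifier, and recognize the recorded data to obtain the text content of the recorded data.
In other embodiments, as shown in Fig. 6B, the device 600 further includes:
a receiving module 604, configured to receive a launch instruction, where the launch instruction indicates that the shorthand application is to be launched;
a launch module 605, configured to launch the shorthand application according to the launch instruction;
a display module 606, configured to display one or more shorthand modes of the shorthand application, where the one or more shorthand modes include a voice shorthand mode;
the receiving module 604, configured to receive a first selection instruction, where the first selection instruction indicates that the voice shorthand mode is selected;
a first obtaining module 607, configured to start recording when a recording instruction is received, obtain the recorded data to be transcribed as shorthand after the recording ends, and trigger the language type determination module 601.
In other embodiments, the language type determination module 601 may be configured to:
obtain the current location of the terminal;
if the current location of the terminal is the resident location of the terminal, determine the language type of the recorded data according to the resident location of the terminal;
if the current location of the terminal is not the resident location of the terminal, determine that the language type of the recorded data is the native language of the country to which the current location belongs.
In other embodiments, the device 600 further includes:
the display module 606, configured to display options for one or more language types when the first selection instruction is received for the first time;
the receiving module 604, configured to receive a second selection instruction, where the second selection instruction indicates the selection of one of the options for the one or more language types, and to trigger the first obtaining module 607;
the language type determination module 601, configured to determine that the language type of the recorded data is the language type indicated by the second selection instruction.
In other embodiments, as shown in Fig. 6C, the device 600 further includes:
a second obtaining module 608, configured to obtain the current location of the terminal the next time the first selection instruction is received;
the display module 606, configured to display the options for the one or more language types if the current location of the terminal is not the location of the terminal at the previous moment;
the receiving module 604, configured to receive a third selection instruction, where the third selection instruction indicates the selection of one of the options for the one or more language types, and to trigger the first obtaining module 607;
the language type determination module 601, configured to determine that the language type of the recorded data is the language type indicated by the third selection instruction.
In other embodiments, the language type determination module 601 is configured to determine, if the current location of the terminal is the location of the terminal at the previous moment, that the language type of the recorded data is the language type indicated by the second selection instruction.
The description of the above device embodiments is similar to that of the above method embodiments, and the device embodiments have beneficial effects similar to those of the method embodiments. For technical details not disclosed in the device embodiments of the present invention, refer to the description of the method embodiments of the present invention.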
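It should be noted that, in the embodiments of the present invention, if the above shorthand method is implemented in the form of a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium. Based on this understanding, the technical solutions of the embodiments of the present invention, in essence or in the part that contributes to the related art, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a terminal (which may be a mobile phone, a tablet computer, a desktop computer, a personal digital assistant, a navigator, a digital phone, a video phone, a television, a sensing device, etc.) to perform all or part of the methods described in the embodiments of the present invention. The foregoing storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (Read Only Memory, ROM), a magnetic disk, or an optical disc. Thus, the embodiments of the present invention are not limited to any specific combination of hardware and software.
Correspondingly, an embodiment of the present invention provides a terminal. Fig. 7 is a schematic diagram of a hardware entity of a terminal according to an embodiment of the present invention. As shown in Fig. 7, the terminal 700 includes a memory 701 and a processor 702. The memory 701 stores a computer program that can run on the processor 702, and the processor 702 implements the steps of the shorthand method provided in the above embodiments when executing the program.
It should be noted that the memory 701 is configured to store instructions and applications executable by the processor 702, and may also cache data to be processed or already processed by the processor 702 and by the modules in the terminal 700 (for example, image data, audio data, voice communication data, and video communication data). It may be implemented with flash memory (FLASH) or random access memory (Random Access Memory, RAM).
Correspondingly, an embodiment of the present invention provides a computer-readable storage medium on which a computer program is stored. When executed by a processor, the computer program implements the steps of the shorthand method provided in the above embodiments.
It should be pointed out here that the description of the above storage medium and device embodiments is similar to that of the above method embodiments, and these embodiments have beneficial effects similar to those of the method embodiments. For technical details not disclosed in the storage medium and device embodiments of the present invention, refer to the description of the method embodiments of the present invention.
It should be understood that "one embodiment" or "an embodiment" mentioned throughout the specification means that a particular feature, structure, or characteristic related to the embodiment is included in at least one embodiment of the present invention. Therefore, "in one embodiment" or "in an embodiment" appearing in various places throughout the specification does not necessarily refer to the same embodiment. In addition, these particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. It should be understood that, in the various embodiments of the present invention, the sequence numbers of the above processes do not imply an order of execution; the order of execution of the processes should be determined by their functions and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present invention. The sequence numbers of the above embodiments of the present invention are for description only and do not represent the relative merits of the embodiments.
It should be noted that, in this document, the terms "include", "comprise", and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element qualified by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes the element.
In the several embodiments provided in this application, it should be understood that the disclosed devices and methods may be implemented in other ways. The device embodiments described above are merely illustrative. For example, the division into units is only a division by logical function, and other divisions are possible in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the coupling, direct coupling, or communication connection between the components shown or discussed may be indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
The units described above as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may all be integrated into one processing unit, each unit may serve separately as one unit, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or in the form of hardware plus software functional units.
A person of ordinary skill in the art can understand that all or some of the steps of the above method embodiments may be completed by hardware related to program instructions. The foregoing program may be stored in a computer-readable storage medium, and when executed, the program performs the steps of the above method embodiments. The foregoing storage medium includes various media that can store program code, such as a removable storage device, a read-only memory (Read Only Memory, ROM), a magnetic disk, or an optical disc.
Alternatively, if the above integrated unit of the present invention is implemented in the form of a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium. Based on this understanding, the technical solutions of the embodiments of the present invention, in essence or in the part that contributes to the related art, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a terminal (which may be a mobile phone, a tablet computer, a desktop computer, a personal digital assistant, a navigator, a digital phone, a video phone, a television, a sensing device, etc.) to perform all or part of the methods described in the embodiments of the present invention. The foregoing storage medium includes various media that can store program code, such as a removable storage device, a ROM, a magnetic disk, or an optical disc.
The above are merely implementations of the present invention, but the protection scope of the present invention is not limited thereto. Any change or replacement that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
Industrial Applicability
In the embodiments of the present application, after the language type of the recorded data to be transcribed as shorthand is determined, a speech recognition engine matching that language type is used to convert the recorded data into text, which can improve the accuracy of speech recognition.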

Claims (13)

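  1. A shorthand method, characterized in that the method is applied to a terminal and includes:
    determining the language type of recorded data to be transcribed as shorthand;
    recognizing the recorded data by using a speech recognition engine corresponding to the language type, to obtain the text content of the recorded data;
    saving the text content as a shorthand entry.
  2. The method according to claim 1, characterized in that recognizing the recorded data by using a speech recognition engine corresponding to the language type, to obtain the text content of the recorded data, includes:
    determining the identifier of the speech recognition engine corresponding to the language type;
    invoking the corresponding speech recognition engine according to its identifier, and recognizing the recorded data to obtain the text content of the recorded data.
  3. The method according to claim 1, characterized in that, before determining the language type of the recorded data to be transcribed as shorthand, the method further includes:
    receiving a launch instruction, where the launch instruction indicates that a shorthand application is to be launched;
    launching the shorthand application according to the launch instruction, and displaying one or more shorthand modes of the shorthand application, where the one or more shorthand modes include a voice shorthand mode;
    receiving a first selection instruction, where the first selection instruction indicates that the voice shorthand mode is selected;
    receiving a recording instruction, starting recording, obtaining the recorded data to be transcribed as shorthand after the recording ends, and triggering the step of determining the language type of the recorded data to be transcribed as shorthand.
  4. The method according to any one of claims 1 to 3, characterized in that determining the language type of the recorded data to be transcribed as shorthand includes:
    obtaining the current location of the terminal;
    if the current location of the terminal is the resident location of the terminal, determining the language type of the recorded data according to the resident location of the terminal.
  5. The method according to claim 4, characterized in that the method further includes:
    if the current location of the terminal is not the resident location of the terminal, determining that the language type of the recorded data is the native language of the country to which the current location belongs.
  6. The method according to claim 3, characterized in that the method further includes:
    when the first selection instruction is received for the first time, displaying options for one or more language types;
    receiving a second selection instruction, where the second selection instruction indicates the selection of one of the options for the one or more language types, and triggering the step of receiving a recording instruction, starting recording, and obtaining the recorded data to be transcribed as shorthand after the recording ends.
  7. The method according to claim 6, characterized in that determining the language type of the recorded data to be transcribed as shorthand includes:
    determining that the language type of the recorded data is the language type indicated by the second selection instruction.
  8. The method according to claim 6, characterized in that the method further includes:
    the next time the first selection instruction is received, obtaining the current location of the terminal;
    if the current location of the terminal is not the location of the terminal at the previous moment, displaying the options for the one or more language types;
    receiving a third selection instruction, where the third selection instruction indicates the selection of one of the options for the one or more language types, and triggering the step of receiving a recording instruction, starting recording, and obtaining the recorded data to be transcribed as shorthand after the recording ends.
  9. The method according to claim 8, wherein determining the language type of the recorded data to be transcribed as shorthand includes:
    determining that the language type of the recorded data is the language type indicated by the third selection instruction.
  10. The method according to claim 8, characterized in that determining the language type of the recorded data to be transcribed as shorthand includes:
    if the current location of the terminal is the location of the terminal at the previous moment, determining that the language type of the recorded data is the language type indicated by the second selection instruction.
  11. A shorthand device, characterized in that the device includes:
    a language type determination module, configured to determine the language type of recorded data to be transcribed as shorthand;
    a speech recognition module, configured to recognize the recorded data by using a speech recognition engine corresponding to the language type, to obtain the text content of the recorded data;
    a saving module, configured to save the text content as a shorthand entry.
  12. A terminal including a memory and a processor, the memory storing a computer program that can run on the processor, characterized in that the processor implements the steps of the shorthand method according to any one of claims 1 to 10 when executing the program.
  13. A computer-readable storage medium on which a computer program is stored, characterized in that, when executed by a processor, the computer program implements the steps of the shorthand method according to any one of claims 1 to 10.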
PCT/CN2018/103832 2018-09-03 2018-09-03 Shorthand method and device, terminal, and storage medium WO2020047719A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201880096704.XA CN112585562A (zh) 2018-09-03 2018-09-03 Shorthand method and device, terminal, and storage medium
PCT/CN2018/103832 WO2020047719A1 (zh) 2018-09-03 2018-09-03 Shorthand method and device, terminal, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2018/103832 WO2020047719A1 (zh) 2018-09-03 2018-09-03 Shorthand method and device, terminal, and storage medium

Publications (1)

Publication Number Publication Date
WO2020047719A1 true WO2020047719A1 (zh) 2020-03-12

Family

ID=69721446

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/103832 WO2020047719A1 (zh) 2018-09-03 2018-09-03 Shorthand method and device, terminal, and storage medium

Country Status (2)

Country Link
CN (1) CN112585562A (zh)
WO (1) WO2020047719A1 (zh)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050108017A1 (en) * 2003-10-27 2005-05-19 John-Alexander Esser Determining language for word recognition event
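CN101697581A (zh) * 2009-10-26 2010-04-21 深圳华为通信技术有限公司 Method, device, and system for supporting simultaneous interpretation in video conferences
CN102663143A (zh) * 2012-05-18 2012-09-12 徐信 System and method for audio/video speech processing and retrieval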

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
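JP4407494B2 (ja) * 2004-11-26 2010-02-03 株式会社日立製作所 System for supporting shorthand transcription work using a digital pen
CN103903611B (zh) * 2012-12-24 2018-07-03 联想(北京)有限公司 Method and device for recognizing voice information
CN104702791A (zh) * 2015-03-13 2015-06-10 安徽声讯信息技术有限公司 Smartphone for long-duration recording with synchronous text transcription, and information processing method thereof
CN106302980A (zh) * 2015-06-29 2017-01-04 上海卓易科技股份有限公司 Method and terminal device for proactively recording events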

Also Published As

Publication number Publication date
CN112585562A (zh) 2021-03-30

Similar Documents

Publication Publication Date Title
RU2718154C1 (ru) Method and device for displaying a candidate word, and graphical user interface
US9111538B2 (en) Genius button secondary commands
US9904906B2 (en) Mobile terminal and data provision method thereof
US9083848B2 (en) Speaker displaying method and videophone terminal therefor
EP2440988B1 (en) Touch anywhere to speak
US8413050B2 (en) Information entry mechanism for small keypads
EP2680110B1 (en) Method and apparatus for processing multiple inputs
US9836448B2 (en) Text editing
US20150262583A1 (en) Information terminal and voice operation method
US9857966B2 (en) Electronic device and method for converting image format object to text format object
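JP2000250694A (ja) Communication terminal having a predictive editor application
WO2018082657A1 (zh) Method and terminal for finding an icon
KR20160060110A (ko) Quick tasks for on-screen keyboards
WO2019095682A1 (zh) Candidate association method and device
WO2015000429A1 (zh) Method and device for intelligent word selection
WO2021169954A1 (zh) Search method and electronic device
WO2015000430A1 (zh) Method and device for intelligent word selection
WO2023078414A1 (zh) Related-article search method and device, electronic device, and storage medium
US20140288916A1 (en) Method and apparatus for function control based on speech recognition
CN109308126B (zh) Candidate word display method and device
CN113241097A (zh) Recording method and device, electronic device, and readable storage medium
CN106339160A (zh) Browsing interaction processing method and device
CN108628461B (zh) Input method and device, and method and device for updating a lexicon
WO2020047719A1 (zh) Shorthand method and device, terminal, and storage medium
JP4796131B2 (ja) Method, electronic device, and computer-readable recording medium for data management in an electronic device in response to written and/or audible user instructions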

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18932645

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 08/07/2021)

122 Ep: pct application non-entry in european phase

Ref document number: 18932645

Country of ref document: EP

Kind code of ref document: A1