CN112181164B - Intelligent voice typing method based on cursor focus coordinate positioning - Google Patents

Publication number
CN112181164B
Authority
CN
China
Prior art keywords
cursor
text box
focus
text
voice typing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011043308.6A
Other languages
Chinese (zh)
Other versions
CN112181164A (en)
Inventor
虞焰兴
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anhui Semxum Information Technology Co ltd
Original Assignee
Anhui Semxum Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2020-09-28
Filing date
2020-09-28
Publication date
2024-03-12
Application filed by Anhui Semxum Information Technology Co ltd
Priority to CN202011043308.6A
Publication of CN112181164A
Application granted
Publication of CN112181164B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/02 Input arrangements using manually operated switches, e.g. using keyboards or dials
    • G06F3/023 Arrangements for converting discrete items of information into a coded form, e.g. arrangements for interpreting keyboard generated codes as alphanumeric codes, operand codes or instruction codes
    • G06F3/0233 Character input methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048 Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0481 Interaction techniques based on graphical user interfaces [GUI] based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance
    • G06F3/04812 Interaction techniques based on cursor appearance or behaviour, e.g. being affected by the presence of displayed objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16 Sound input; Sound output
    • G06F3/167 Audio in a user interface, e.g. using voice commands for navigating, audio feedback

Abstract

The invention discloses an intelligent voice typing method based on cursor focus coordinate positioning. During voice typing, the cursor focus coordinates of the text box are monitored in real time. When voice typing is interrupted because other threads start, the cursor focus is kept bound to the current text box, and the starting cursor focus coordinate for subsequent text input is determined from the last cursor focus coordinate of the text box monitored before the other threads started. Because the coordinates are monitored in real time, the cursor focus can be rebound to the text box directly after an interruption, and subsequent text is entered immediately after the text already typed. Compared with manually returning to the text box and reselecting the focus, the method is more convenient, faster and more accurate, and it greatly reduces the chance of losing text recognized during the interruption.

Description

Intelligent voice typing method based on cursor focus coordinate positioning
Technical Field
The invention relates to the technical field of voice typing, and in particular to an intelligent voice typing method based on cursor focus coordinate positioning.
Background
With the continuous development of speech recognition technology, editing text by voice typing has come into use. In voice typing, a microphone connected to a computer collects the voice, the computer records it and transmits it over a network to a voice recognition server, and the server converts the voice into text and returns the text to the computer.
Text is entered at the position of the cursor focus of the text box on the current page. However, when a multithreaded computer starts other threads during voice typing, the text box cursor focus is lost or changed and voice typing is interrupted. Subsequent text cannot be displayed on the intended page, the user must manually return to the voice typing interface to continue, and the text lost during the interruption cannot be recovered automatically.
Disclosure of Invention
To address this technical defect of existing voice typing, the invention provides an intelligent voice typing method based on cursor focus coordinate positioning, which solves the problem of voice typing being disturbed when the computer runs multiple threads.
In the intelligent voice typing method based on cursor focus coordinate positioning, a computer monitors the cursor focus coordinates of a text box in real time during voice typing. When voice typing is interrupted because other threads start, the cursor focus is kept bound to the current text box, and the starting cursor focus coordinate for subsequent text input is determined from the last cursor focus coordinate of the text box monitored before the other threads started.
Further, if the recording thread for voice typing is running when the other threads start, voice typing is considered to have been interrupted by those threads.
Further, the intelligent voice typing method comprises the following steps:
1. a microphone connected to the computer collects the voice;
2. the computer sends the audio stream to the voice recognition server over the network and, in parallel, stores the recording on the hard disk;
3. the voice recognition server converts the audio into text and returns the text to the computer;
4. the computer judges whether the cursor focus coordinate of the text box is normal; if it is normal, the method proceeds directly to step 5; if it is abnormal, the cursor focus is kept bound to the current text box, and the starting cursor focus coordinate for subsequent text input is determined from the last cursor focus coordinate of the text box monitored before the other threads started;
5. the text is displayed in sequence after the cursor focus coordinate.
Whether the cursor focus coordinate of the text box in step 4 is normal is judged from the blink interval delta of the text box cursor and from the cursor focus coordinate:
if the blink interval delta of the text box cursor is greater than 1s, the text box cursor focus is judged abnormal directly;
if the blink interval delta of the text box cursor is at most 1s, but the cursor focus coordinate has changed compared with the last cursor focus coordinate monitored before the other threads started, the text box cursor focus is also judged abnormal.
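The two-part judgment above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `focus_is_abnormal`, the `Coord` type and the use of `None` for a missing cursor are hypothetical, and how the blink interval and coordinates are actually obtained from the operating system is left out.

```python
from typing import Optional, Tuple

Coord = Tuple[int, int]  # hypothetical (x, y) position of the cursor focus

BLINK_LIMIT_S = 1.0  # the 1s threshold stated in the text


def focus_is_abnormal(blink_interval_s: float,
                      current: Optional[Coord],
                      last_monitored: Coord) -> bool:
    """Return True when the text box cursor focus is judged abnormal."""
    # Rule 1: a blink interval above 1s is abnormal regardless of position.
    if blink_interval_s > BLINK_LIMIT_S:
        return True
    # Rule 2: the cursor still blinks, but not where it was last monitored.
    return current != last_monitored
```

The first rule corresponds to a lost focus, the second to a focus that has moved to a different position or text box.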
Further, after voice typing is completed and the file has been saved or output, the cached monitoring records of the text box cursor focus coordinates are deleted.
By monitoring the cursor focus coordinates of the text box in real time during voice typing, the invention can rebind the cursor focus to the text box directly after voice typing is interrupted by other threads, so that subsequent text is entered immediately after the text already typed. Compared with manually returning to the text box and reselecting the focus, this is more convenient, faster and more accurate, and it greatly reduces the chance of losing text recognized during the interruption.
Drawings
FIG. 1 is a flow chart of an intelligent voice typing method.
Detailed Description
The invention will be described in further detail with reference to the drawings and the detailed description. The embodiments of the invention have been presented for purposes of illustration and description, and are not intended to be exhaustive or limited to the invention in the form disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art. The embodiments were chosen and described in order to best explain the principles of the invention and the practical application, and to enable others of ordinary skill in the art to understand the invention for various embodiments with various modifications as are suited to the particular use contemplated.
Example 1
As everyday experience shows, during text input a text box always displays a continuously blinking cursor at the input position, and the cursor moves forward as text is entered. The cursor is the focus that locates the text input position. When other processes start, the text box cursor focus may change. For example, while text is being entered in a Word document, an advertisement popup may suddenly appear, so that the cursor focus of the Word document is lost. If instead a page with a text box pops up and the cursor focus is placed in that text box by default, a cursor focus still exists, but its position differs from that of the cursor focus in the Word document. The invention builds on exactly this property of the cursor focus to realize the intelligent voice typing method based on cursor focus coordinate positioning, and solves the problem of voice typing being disturbed when the computer runs multiple threads.
The computer monitors the cursor focus coordinates of the text box in real time during voice typing. When voice typing is interrupted because other threads start, the cursor focus is kept bound to the current text box, and the starting cursor focus coordinate for subsequent text input is determined from the last cursor focus coordinate of the text box monitored before the other threads started.
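One way to picture the real-time monitoring is a small log of timestamped coordinates that can be queried for the last sample taken before an interruption. The sketch below is an assumption-laden illustration, not the patented implementation: the class `FocusMonitor`, its method names and the use of plain float timestamps are all hypothetical, and samples are assumed to be recorded in time order.

```python
from bisect import bisect_left
from typing import List, Optional, Tuple

Coord = Tuple[int, int]  # hypothetical (x, y) position of the cursor focus


class FocusMonitor:
    """Keeps timestamped cursor focus coordinates, recorded in time order."""

    def __init__(self) -> None:
        self._times: List[float] = []
        self._coords: List[Coord] = []

    def record(self, timestamp: float, coord: Coord) -> None:
        # Called on every monitoring tick during voice typing.
        self._times.append(timestamp)
        self._coords.append(coord)

    def last_before(self, interrupt_time: float) -> Optional[Coord]:
        # bisect_left counts the samples taken strictly before the interruption.
        i = bisect_left(self._times, interrupt_time)
        return self._coords[i - 1] if i > 0 else None
```

For example, with samples recorded at 0.0, 0.5 and 1.0 seconds, a call to `last_before(0.9)` returns the coordinate recorded at 0.5, which would then serve as the starting coordinate for subsequent text.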
Voice typing is judged to be interrupted by the start of other threads as follows: if the recording thread for voice typing is running when the other threads start, voice typing is considered to have been interrupted by those threads. Here, the recording thread for voice typing carries a specific mark so that it is not confused with other recording threads.
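The marked recording thread can be illustrated with Python's standard `threading` module, using the thread name as the mark. This is a minimal sketch under assumptions: the mark value `RECORDING_MARK` and both function names are hypothetical, and the text does not say how the mark is actually implemented.

```python
import threading

RECORDING_MARK = "voice-typing-recorder"  # hypothetical mark for the recording thread


def start_recording(work) -> threading.Thread:
    """Start the voice typing recording thread under the dedicated mark."""
    t = threading.Thread(target=work, name=RECORDING_MARK, daemon=True)
    t.start()
    return t


def typing_interrupted_by_new_thread() -> bool:
    """Call when another thread starts: True if the marked recorder is still running."""
    return any(t.name == RECORDING_MARK and t.is_alive()
               for t in threading.enumerate())
```

Because only the marked thread is checked, an unrelated recording thread that happens to be running does not count as voice typing in progress.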
The intelligent voice typing method based on cursor focus coordinate positioning, as shown in fig. 1, specifically comprises the following steps:
1. A microphone connected to the computer collects the voice.
2. The computer sends the audio stream to the voice recognition server over the network and, in parallel, stores the recording on the hard disk.
3. The voice recognition server converts the audio into text and returns the text to the computer.
4. The computer judges whether the cursor focus coordinate of the text box is normal. If it is normal, the method proceeds directly to step 5. If it is abnormal, the cursor focus is kept bound to the current text box, and the starting cursor focus coordinate for subsequent text input is determined from the last cursor focus coordinate of the text box monitored before the other threads started.
Whether the cursor focus coordinate of the text box is normal is judged from the blink interval delta of the text box cursor and from the cursor focus coordinate. If the blink interval delta is greater than 1s, the text box cursor focus is judged abnormal directly. If the blink interval delta is at most 1s, but the cursor focus coordinate has changed compared with the last cursor focus coordinate monitored before the other threads started, the text box cursor focus is also judged abnormal.
5. The text is displayed in sequence after the cursor focus coordinate.
To reduce use of the computer's cache, once voice typing has finished and the file has been saved, the cached start and stop time records, cursor focus coordinate records and character length records are all cleared.
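The cleanup described above can be sketched as a small container holding the three kinds of cached records. The names `TypingSessionCache` and `finish_session`, the field types and the `save_file` callback are hypothetical choices made for this illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class TypingSessionCache:
    """Hypothetical holder for the three cached records named in the text."""
    start_stop_times: List[Tuple[float, float]] = field(default_factory=list)
    cursor_coords: List[Tuple[int, int]] = field(default_factory=list)
    text_lengths: List[int] = field(default_factory=list)

    def clear(self) -> None:
        self.start_stop_times.clear()
        self.cursor_coords.clear()
        self.text_lengths.clear()


def finish_session(cache: TypingSessionCache, save_file: Callable[[], None]) -> None:
    # Save first; if saving raises, the cached records are kept for recovery.
    save_file()
    cache.clear()
```

Clearing only after a successful save is a design choice of this sketch: it keeps the monitoring records available if the save fails.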
To describe the present application more clearly, the following example is given.
First, text is edited in Word on the computer. The voice input "global outbreak of a large-scale epidemic this year" is captured through the microphone and sent to the voice recognition server as an audio stream. While converting the voice into text, the voice recognition server takes semantics into account and returns the text to the computer as words or phrases. For this input, for example, the server may return the text "this year", "global", "outbreaks" and "large-scale epidemic" in sequence.
After the text "this year global" has been entered in the Word text box, an advertisement popup appears and text input is interrupted. Because an advertisement popup typically has no text box, the blink interval of the text box cursor satisfies delta > 1s, that is, the text box cursor focus is abnormal, so the cursor focus is rebound to the Word text box. However, the cursor focus does not return directly to the position after "this year global"; by default it lands at the leftmost initial position at the top of the Word text box. The actual cursor focus coordinate must therefore be reset from the monitored cursor focus coordinate, so that the cursor focus is again located after "this year global".
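The example can be replayed with a toy text box model. Everything here is a simplification for illustration: the `TextBox` class and `type_phrases` function are hypothetical, a character offset stands in for the cursor focus coordinate, and the popup is simulated by resetting the caret to the start of the box.

```python
from typing import List


class TextBox:
    """Toy text box; the caret offset stands in for the cursor focus coordinate."""

    def __init__(self) -> None:
        self.text = ""
        self.caret = 0

    def insert(self, phrase: str) -> None:
        self.text = self.text[:self.caret] + phrase + self.text[self.caret:]
        self.caret += len(phrase)


def type_phrases(box: TextBox, phrases: List[str],
                 popup_after: int, restore: bool) -> str:
    monitored = box.caret  # last monitored caret position
    for i, phrase in enumerate(phrases):
        if i == popup_after:
            box.caret = 0  # rebinding alone leaves the caret at the start of the box
            if restore:
                box.caret = monitored  # reset from the monitored position
        box.insert(phrase)
        monitored = box.caret
    return box.text


phrases = ["this year ", "global ", "outbreaks ", "large-scale epidemic"]
fixed = type_phrases(TextBox(), phrases, popup_after=2, restore=True)
broken = type_phrases(TextBox(), phrases, popup_after=2, restore=False)
```

With `restore=True` the phrases end up in spoken order; with `restore=False` the phrases returned after the popup are inserted at the start of the box, ahead of "this year global", which is the failure the coordinate reset is meant to prevent.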
The described embodiments are evidently only some, not all, of the embodiments of the invention. All other embodiments obtained by those skilled in the art on the basis of these embodiments without inventive effort fall within the scope of protection of the present invention.

Claims (2)

1. An intelligent voice typing method based on cursor focus coordinate positioning, characterized in that a computer monitors the cursor focus coordinates of a text box in real time during voice typing; when voice typing is interrupted because other threads start, the cursor focus is kept bound to the current text box, and the starting cursor focus coordinate for subsequent text input is determined from the last cursor focus coordinate of the text box monitored before the other threads started; if the recording thread for voice typing is running when the other threads start, voice typing is considered to be interrupted by the other threads;
the method specifically comprises the following steps:
step 1, a microphone connected to the computer collects the voice;
step 2, the computer sends the audio stream to the voice recognition server over the network and, in parallel, stores the recording on the hard disk;
step 3, the voice recognition server converts the audio into text and returns the text to the computer;
step 4, the computer judges whether the cursor focus coordinate of the text box is normal; if it is normal, the method proceeds directly to step 5; if it is abnormal, the cursor focus is kept bound to the current text box, and the starting cursor focus coordinate for subsequent text input is determined from the last cursor focus coordinate of the text box monitored before the other threads started;
whether the cursor focus coordinate of the text box is normal is judged from the blink interval delta of the text box cursor and from the cursor focus coordinate; if the blink interval delta of the text box cursor is greater than 1s, the text box cursor focus is judged abnormal directly; if the blink interval delta of the text box cursor is at most 1s, but the cursor focus coordinate has changed compared with the last cursor focus coordinate monitored before the other threads started, the text box cursor focus is also judged abnormal;
step 5, the text is displayed in sequence after the cursor focus coordinate.
2. The intelligent voice typing method of claim 1, wherein the cached monitoring records of the text box cursor focus coordinates are deleted after voice typing is completed and the file is saved or output.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011043308.6A CN112181164B (en) 2020-09-28 2020-09-28 Intelligent voice typing method based on cursor focus coordinate positioning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011043308.6A CN112181164B (en) 2020-09-28 2020-09-28 Intelligent voice typing method based on cursor focus coordinate positioning

Publications (2)

Publication Number Publication Date
CN112181164A (en) 2021-01-05
CN112181164B (en) 2024-03-12

Family

ID=73946629

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011043308.6A Active CN112181164B (en) 2020-09-28 2020-09-28 Intelligent voice typing method based on cursor focus coordinate positioning

Country Status (1)

Country Link
CN (1) CN112181164B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113301416B (en) * 2021-04-30 2023-06-27 当趣网络科技(杭州)有限公司 Voice frame display method

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103092456A (en) * 2011-10-31 2013-05-08 国际商业机器公司 Method and system for inputting textboxes
WO2016107147A1 (en) * 2014-12-31 2016-07-07 中兴通讯股份有限公司 Character entry method and device
CN106990957A (en) * 2017-03-16 2017-07-28 北京云知声信息技术有限公司 Window switching method and device
CN109637541A (en) * 2018-12-29 2019-04-16 联想(北京)有限公司 Method and electronic device for converting voice to text

Also Published As

Publication number Publication date
CN112181164A (en) 2021-01-05

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant