CN113221990A - Information input method and device and related equipment - Google Patents

Information input method and device and related equipment

Info

Publication number
CN113221990A
CN113221990A (application CN202110485532.9A)
Authority
CN
China
Prior art keywords
voice
target
template
preset
recorded
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110485532.9A
Other languages
Chinese (zh)
Other versions
CN113221990B (en)
Inventor
张远
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN202110485532.9A priority Critical patent/CN113221990B/en
Publication of CN113221990A publication Critical patent/CN113221990A/en
Application granted granted Critical
Publication of CN113221990B publication Critical patent/CN113221990B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F 16/33 Querying
    • G06F 16/3331 Query processing
    • G06F 16/334 Query execution
    • G06F 16/3343 Query execution using phonetics
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/23 Clustering techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 Speech recognition
    • G10L 15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Acoustics & Sound (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Machine Translation (AREA)

Abstract

The application relates to data processing technology and provides an information entry method, an information entry apparatus, a computer device, and a storage medium. The information entry method includes the following steps: acquiring a target entry item in a preset page, and acquiring historical voice data corresponding to the target entry item; analyzing the historical voice data to obtain standard voice templates; when a monitoring buried point preset in the target entry item is triggered, collecting a first voice to be recorded and matching the target standard voice template corresponding to the first voice to be recorded; splitting the first voice to be recorded according to the target standard voice template to obtain a second voice to be recorded; processing the second voice to be recorded with a pre-trained intention prediction model to predict the collector's input intention; screening from a preset database, and displaying, the text templates whose matching degree with the input intention falls within a preset matching-degree threshold range; and entering the text template into the target entry item. The application can improve the rate of information entry and promote the rapid development of smart cities.

Description

Information input method and device and related equipment
Technical Field
The present application relates to the field of data processing technologies, and in particular, to an information entry method and apparatus, a computer device, and a medium.
Background
With the rapid development of the information age, mobile terminals have become increasingly intelligent. It is now common to fill in application forms of all kinds through a mobile terminal, or to complete everyday operations such as paying utility bills. For example, a user can fill in resume information through a mobile terminal to apply for a position at an organization; likewise, when a user handles card opening or other business at a bank, the user can fill in the application form through the mobile terminal, which saves paper and shortens the time needed to transact the business.
When filling in a resume or a banking application form on a mobile terminal, the user first touches the text box of the corresponding information field, the terminal pops up a virtual keyboard, and the user selects an input method to enter the content. In this character-based method of filling in forms on a mobile terminal, if the content to be filled in mixes Chinese characters, English letters, and numbers, the user must repeatedly switch input methods; the operation is cumbersome and the information-filling efficiency is low, especially for users who cannot skillfully use the terminal's virtual keyboard.
Therefore, it is necessary to provide an information entry method capable of increasing the rate of information entry.
Disclosure of Invention
In view of the above, it is necessary to provide an information entry method, an information entry apparatus, a computer device, and a medium, which can improve the rate of information entry.
A first aspect of an embodiment of the present application provides an information entry method, where the information entry method includes:
acquiring a target entry item in a preset page, and acquiring historical voice data corresponding to the target entry item;
analyzing the historical voice data to obtain a standard voice template;
when a monitoring buried point preset in the target entry item is triggered, acquiring a first voice to be entered, and matching a target standard voice template corresponding to the first voice to be entered;
splitting the first voice to be recorded according to the target standard voice template to obtain a second voice to be recorded;
processing the second voice to be recorded according to a pre-trained intention prediction model, and predicting the input intention of the collector;
screening from a preset database, and displaying, a text template whose matching degree with the input intention falls within a preset matching-degree threshold range;
and inputting the text template into the target input item.
Further, in the information entry method provided in the embodiment of the present application, the acquiring a target entry item in a preset page includes:
splitting a preset page according to the function information to obtain a plurality of target acquisition areas;
detecting whether a preset mark exists in the target acquisition area;
and when the detection result indicates that the preset mark exists in the target acquisition area, determining the position of the preset mark as a target entry item.
Further, in the above information entry method provided in the embodiment of the present application, the acquiring the historical speech data corresponding to the target entry includes:
acquiring the content attribute and the format attribute of the target entry item;
structuring the content attribute and the format attribute to obtain a target attribute;
determining reference voice data corresponding to the target attribute;
traversing a preset database, and detecting whether the voice data to be detected in the preset database contains the reference voice data;
and when the detection result is that the voice data to be detected in the preset database contains the reference voice data, determining the voice data to be detected as historical voice data meeting the target attribute.
Further, in the information entry method provided in the embodiment of the present application, the analyzing the historical speech data to obtain a standard speech template includes:
performing speech-to-text processing on the historical voice data to obtain corresponding historical text data;
performing cluster analysis on the historical text data to obtain a plurality of clusters, and acquiring the quantity of historical text data in each cluster;
determining, as target historical text data, the data in clusters whose quantity exceeds a preset number threshold;
obtaining target historical voice data corresponding to the target historical text data according to the corresponding relation between the text data and the voice data;
and acquiring key voice in the target historical voice data as a standard voice template.
Further, in the above information entry method provided in the embodiment of the present application, the method further includes:
when monitoring that a monitoring buried point is triggered, determining the area position of the monitoring buried point;
acquiring a plurality of target entry items in the region position;
and acquiring a first to-be-recorded voice of each target recording item according to a preset sequence.
Further, in the information entry method provided in the embodiment of the present application, the acquiring a first voice to be entered and matching a target standard voice template corresponding to the first voice to be entered includes:
analyzing the first voice to be recorded to obtain a key voice;
calculating the content similarity degree of the key voice and the standard voice template;
and determining the standard voice template with the content similarity exceeding a preset similarity threshold as a target standard voice template.
Further, in the above information entry method provided in the embodiment of the present application, after the collecting the first voice to be entered, the method further includes:
inputting the first voice to be recorded into a pre-trained Mandarin proficiency calculation model to obtain the collector's Mandarin proficiency;
querying a preset mapping between Mandarin proficiency and speech rate according to the Mandarin proficiency to obtain a target speech rate;
outputting a prompt regarding the target speech rate.
A second aspect of the embodiments of the present application further provides an information entry apparatus, where the information entry apparatus includes:
the historical voice acquisition module is used for acquiring a target entry item in a preset page and acquiring historical voice data corresponding to the target entry item;
the voice template analysis module is used for analyzing the historical voice data to obtain a standard voice template;
the voice template matching module is used for acquiring a first voice to be recorded when a preset monitoring buried point in the target entry item is triggered, and matching a target standard voice template corresponding to the first voice to be recorded;
the recorded voice splitting module is used for splitting the first voice to be recorded according to the target standard voice template to obtain a second voice to be recorded;
the input intention prediction module is used for processing the second voice to be recorded according to a pre-trained intention prediction model and predicting the input intention of the collector;
the text template screening module is used for screening and displaying a text template with the matching degree with the input intention within a preset matching degree threshold range from a preset database;
and the text template entry module is used for entering the text template into the target entry item.
A third aspect of embodiments of the present application further provides a computer device, where the computer device includes a processor, and the processor is configured to implement the information entry method according to any one of the above items when executing a computer program stored in a memory.
The fourth aspect of the embodiments of the present application further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the computer program implements any one of the information entry methods described above.
According to the information entry method, the information entry apparatus, the computer device, and the computer-readable storage medium, matching the standard voice templates against the first voices to be recorded of different collectors makes it possible to determine a collector's voice to be recorded quickly and accurately, improving both the speed and the accuracy of information entry. In addition, the text templates whose matching degree with the input intention falls within the preset matching-degree threshold range are screened from the preset database and displayed, and the text template is entered into the target entry item, so the collector does not need to speak the content of every entry item in full, which further raises the rate of information entry. The application can be applied to the functional modules of smart cities, such as smart government affairs and smart transportation (for example, the information entry module of smart government affairs), and can promote the rapid development of smart cities.
Drawings
Fig. 1 is a flowchart of an information entry method provided in an embodiment of the present application.
Fig. 2 is a structural diagram of an information recording apparatus according to a second embodiment of the present application.
Fig. 3 is a schematic structural diagram of a computer device provided in the third embodiment of the present application.
The following detailed description will further illustrate the present application in conjunction with the above-described figures.
Detailed Description
In order that the above objects, features and advantages of the present application can be more clearly understood, a detailed description of the present application will be given below with reference to the accompanying drawings and specific embodiments. It should be noted that the embodiments and features of the embodiments of the present application may be combined with each other without conflict.
In the following description, numerous specific details are set forth to provide a thorough understanding of the present application, and the described embodiments are a part, but not all, of the present application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein in the description of the present application is for the purpose of describing particular embodiments only and is not intended to be limiting of the application.
The information entry method provided by the embodiments of the present application is executed by a computer device, and correspondingly, the information entry apparatus runs in the computer device.
Fig. 1 is a flowchart of an information entry method according to a first embodiment of the present application. As shown in fig. 1, the information entry method may include the following steps, and the order of the steps in the flowchart may be changed and some may be omitted according to different requirements:
s11, acquiring a target entry item in a preset page, and acquiring historical voice data corresponding to the target entry item.
In at least one embodiment of the present application, the preset page refers to a page that converts collected voice information into text information and enters it into the corresponding entry items. In an embodiment, the preset page may be divided by functional area into a basic information acquisition area, a policy information acquisition area, or other information acquisition areas. For example, the basic information acquisition area contains a plurality of target entry items, which may be name entry items, age entry items, identity card entry items, address entry items, and the like; the policy information acquisition area may contain insurance-type entry items, policy amount entry items, insurance duration entry items, and the like.
Optionally, the acquiring a target entry in the preset page includes:
splitting a preset page according to the function information to obtain a plurality of target acquisition areas;
detecting whether a preset mark exists in the target acquisition area;
and when the detection result indicates that the preset mark exists in the target acquisition area, determining the position of the preset mark as a target entry item.
The target acquisition area can be a basic information acquisition area, a policy information acquisition area or other information acquisition areas. The preset mark is a preset mark for determining an entry item, and the preset mark may be a digital mark, a letter mark, or a color mark, which is not limited herein.
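The mark-detection steps above can be sketched as follows. This is a hypothetical illustration, not the patent's concrete implementation: the page structure, field names, and the set of preset marks are all assumptions.

```python
# Hypothetical sketch of S11's mark detection: split a page description into
# acquisition areas by function, then treat any field carrying a preset mark
# as a target entry item. Area/field names and mark values are illustrative.
PRESET_MARKS = {"#entry", "*entry"}  # assumed digit/letter/colour marks

def find_target_entry_items(page):
    """page: {area_name: [{"label": ..., "mark": ...}, ...]}"""
    targets = []
    for area, fields in page.items():             # one pass per target acquisition area
        for field in fields:
            if field.get("mark") in PRESET_MARKS:  # preset mark detected
                targets.append((area, field["label"]))
    return targets

page = {
    "basic_info": [{"label": "name", "mark": "#entry"},
                   {"label": "notes", "mark": None}],
    "policy_info": [{"label": "policy_amount", "mark": "#entry"}],
}
print(find_target_entry_items(page))
```

Only the fields carrying a preset mark are returned as target entry items; unmarked fields in the same area are ignored.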
In at least one embodiment of the present application, a large amount of historical voice data is stored in a preset database; to ensure the privacy and security of the historical voice data, the preset database may be a target node in a blockchain. For example, when the target entry item is a name entry item, the corresponding historical voice data refers to voice data related to name entry; when the target entry item is an address entry item, the corresponding historical voice data refers to voice data related to address entry.
Optionally, the acquiring the historical voice data corresponding to the target entry includes:
acquiring the content attribute and the format attribute of the target entry item;
structuring the content attribute and the format attribute to obtain a target attribute;
determining reference voice data corresponding to the target attribute;
traversing a preset database, and detecting whether the voice data to be detected in the preset database contains the reference voice data;
and when the detection result is that the voice data to be detected in the preset database contains the reference voice data, determining the voice data to be detected as historical voice data meeting the target attribute.
For the name entry item, the content attribute refers to content keywords such as "name", and the format attribute refers to whether the keyword takes a Chinese or an English form, which is not limited herein. The reference voice data refers to the voice corresponding to the target attribute; for example, it may be the spoken form of keywords such as "name". Whether voice data to be detected is related to the name entry item can be determined by detecting whether the voice data to be detected in the preset database contains reference voice data such as "name"; the voice data to be detected that is related to the name entry item is then used as the historical voice data corresponding to the name entry item.
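The retrieval step can be sketched on transcripts as follows. This is a hypothetical simplification: records are modelled as (transcript, audio_id) pairs and matched on text, whereas the description speaks of matching reference voice data directly; the keyword set is an assumption.

```python
# Hypothetical sketch of selecting historical voice data for a target entry
# item: derive reference keywords from the item's content/format attributes,
# then keep every stored record whose transcript contains one of them.
def select_historical_data(records, reference_keywords):
    """records: iterable of (transcript, audio_id); returns matching audio ids."""
    selected = []
    for transcript, audio_id in records:
        if any(kw in transcript.lower() for kw in reference_keywords):
            selected.append(audio_id)     # record is relevant to this entry item
    return selected

records = [
    ("my name is zhang san", "a1"),
    ("the address is longhua district", "a2"),
    ("name: li si", "a3"),
]
print(select_historical_data(records, {"name"}))
```

For a name entry item only the name-related records survive; an address entry item would instead be queried with address keywords.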
And S12, analyzing the historical voice data to obtain a standard voice template.
In at least one embodiment of the present application, a standard voice template refers to voice information whose number of occurrences in the historical voice data for the corresponding target entry item exceeds a preset number threshold, where the preset number threshold is set in advance. There may be one or more standard voice templates. Regularly collecting and analyzing the historical voice data yields regularly updated standard voice templates, which ensures that the standard voice templates keep meeting the information entry requirements.
Optionally, the analyzing the historical speech data to obtain a standard speech template includes:
performing speech-to-text processing on the historical voice data to obtain corresponding historical text data;
performing cluster analysis on the historical text data to obtain a plurality of clusters, and acquiring the quantity of historical text data in each cluster;
determining, as target historical text data, the data in clusters whose quantity exceeds a preset number threshold;
obtaining target historical voice data corresponding to the target historical text data according to the corresponding relation between the text data and the voice data;
and acquiring key voice in the target historical voice data as a standard voice template.
When the historical voice data is processed by speech-to-text, the correspondence between the historical voice data and the historical text data is established; speech-to-text technology is prior art and is not described again here. For example, when the target entry item is a name entry item, the corresponding key voice refers to the voice data left after removing the name itself, that is, the voice containing keywords such as "name"; when the target entry item is an address entry item, the corresponding key voice refers to the voice data left after removing the address itself, that is, the voice containing keywords such as "address". Performing cluster analysis on the historical text data to obtain a plurality of clusters means clustering the historical text data by keyword. For example, the keyword may be "name", and the historical text data corresponding to the "name" keyword is clustered into one cluster.
Processing the historical voice data by speech-to-text to obtain the corresponding historical text data, and then analyzing quantities over the text data, improves the speed of cluster analysis and hence the speed at which standard voice templates are acquired.
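The template-derivation steps above can be sketched as follows. This is a deliberately minimal stand-in: a keyword match plays the role of the cluster-analysis step, and the keyword list and threshold are assumptions, not values from the patent.

```python
# Hypothetical sketch of S12: "cluster" historical transcripts by keyword,
# count each cluster, and keep the clusters whose size exceeds a preset
# number threshold as the source of standard voice templates.
from collections import defaultdict

def standard_template_keys(transcripts, keywords, threshold):
    clusters = defaultdict(list)
    for t in transcripts:
        for kw in keywords:
            if kw in t:                   # crude keyword "clustering"
                clusters[kw].append(t)
                break
    # keep clusters larger than the preset number threshold
    return {kw for kw, members in clusters.items() if len(members) > threshold}

transcripts = ["my name is a", "name: b", "i am called c by name", "address: d"]
print(standard_template_keys(transcripts, ["name", "address"], threshold=2))
```

Here only the "name" cluster is frequent enough to yield a standard template; in the patent the templates themselves would be the key voices of the matching target historical voice data.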
And S13, when the preset monitoring buried point in the target entry item is triggered, acquiring a first voice to be recorded, and matching a target standard voice template corresponding to the first voice to be recorded.
In at least one embodiment of the present application, the monitoring buried point refers to an event preset in the target entry item for monitoring whether the target entry item is triggered; it may be set as a visual buried point, a code buried point, or in another buried-point manner, which is not limited herein. In an embodiment, the preset page may be divided by functional area into a basic information acquisition area, a policy information acquisition area, or other information acquisition areas, with a monitoring buried point set for each area. The method further comprises the following steps:
when monitoring that a monitoring buried point is triggered, determining the area position of the monitoring buried point;
acquiring a plurality of target entry items in the region position;
and acquiring a first to-be-recorded voice of each target recording item according to a preset sequence.
The first voices to be recorded may be collected in the positional order of the multiple target entry items: once the first voice for the entry item in the earlier position has been collected, collection for the next entry item begins after a preset time interval, and so on.
In at least one embodiment of the present application, different collectors may differ in speaking habits and accent. For example, when speaking habits differ, for a name entry item the first voice to be recorded of collector A may be "My name is Zhang San", that of collector B "I am called Li Si", and that of collector C "I am Wang Er". Matching the standard voice templates against the first voices to be recorded of different collectors makes it possible to determine the collector's voice to be recorded quickly, improving the rate of information entry. Further, when accents differ, the first voice to be recorded of collector A for a name entry item may come out as accented speech along the lines of "me di name is zhang san". Matching the standard voice templates against such first voices makes it possible to determine the voice to be recorded accurately, improving the accuracy of information entry.
Optionally, the acquiring a first voice to be recorded and matching a target standard voice template corresponding to the first voice to be recorded includes:
analyzing the first voice to be recorded to obtain a key voice;
calculating the content similarity degree of the key voice and the standard voice template;
and determining the standard voice template with the content similarity exceeding a preset similarity threshold as a target standard voice template.
When the target entry item is a name entry item, the corresponding key voice refers to the voice data left after removing the name itself, that is, the voice containing keywords such as "name"; when the target entry item is an address entry item, the corresponding key voice refers to the voice data left after removing the address itself, that is, the voice containing keywords such as "address". There may be multiple standard voice templates. The content similarity between the key voice and a standard voice template can be calculated by a pre-trained content similarity calculation model, which may be a neural network model; the training of such a model is prior art and is not described again here. The preset similarity threshold is a preset threshold for judging how similar two voices are.
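The matching step can be sketched on transcripts as follows. This is a hypothetical stand-in: `difflib`'s sequence ratio replaces the pre-trained neural similarity model the description actually calls for, and the threshold value is an assumption.

```python
# Hypothetical sketch of S13's template matching: score each standard voice
# template against the key voice (here, their transcripts) and keep every
# template whose similarity exceeds the preset similarity threshold.
import difflib

def match_target_templates(key_voice_text, templates, threshold=0.8):
    matches = []
    for template in templates:
        score = difflib.SequenceMatcher(None, key_voice_text, template).ratio()
        if score > threshold:             # similarity exceeds preset threshold
            matches.append(template)
    return matches

templates = ["my name is", "the address is"]
print(match_target_templates("my name is", templates))
```

A real implementation would compare audio-derived features with the neural model; the thresholding logic is the same either way.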
And S14, splitting the first voice to be recorded according to the target standard voice template to obtain a second voice to be recorded.
In at least one embodiment of the present application, the first voice to be recorded comprises a key voice and a second voice to be recorded, where the second voice to be recorded is the voice to be entered into the target entry item. For example, when the target entry item is a name entry item, the corresponding key voice refers to the voice data left after removing the name itself, and the corresponding second voice to be recorded refers to the name voice; when the target entry item is an address entry item, the corresponding key voice refers to the voice data left after removing the address itself, and the corresponding second voice to be recorded refers to the address voice.
Optionally, the splitting the first voice to be recorded according to the target standard voice template to obtain a second voice to be recorded includes:
positioning the position of the target standard voice template in the first voice to be recorded;
and determining the voice content outside the position as second voice to be recorded.
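The two splitting steps above can be sketched on transcripts as follows. This is a hypothetical simplification: it locates the template in text and keeps the remainder, whereas a real implementation would locate the template's span on the audio timeline.

```python
# Hypothetical sketch of S14: locate the target standard voice template
# inside the first voice to be recorded and treat everything outside that
# position as the second voice to be recorded.
def split_second_voice(first_voice, template):
    pos = first_voice.find(template)
    if pos < 0:
        return first_voice                       # template not found: keep all
    before = first_voice[:pos]
    after = first_voice[pos + len(template):]
    return (before + after).strip()              # content outside the template span

print(split_second_voice("my name is zhang san", "my name is"))
```

For a name entry item this leaves just the name itself, which is what gets entered after intention prediction.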
In at least one embodiment of the present application, different collectors may differ in Mandarin proficiency: some collectors may have high proficiency, some medium proficiency, and some poor proficiency (for example, dialect words mixed into their Mandarin). Understandably, the speech of a collector with higher Mandarin proficiency can be recognized accurately even at a higher speech rate, while a collector with lower Mandarin proficiency must slow down appropriately to ensure accurate recognition.
Optionally, after the acquiring the first voice to be recorded, the method further includes:
inputting the first to-be-recorded voice into a pre-trained mandarin proficiency calculation model to obtain the mandarin proficiency of the acquirer;
inquiring a preset mapping relation between the proficiency level of the Mandarin and the speech rate according to the proficiency level of the Mandarin to obtain a target speech rate;
outputting a prompt regarding the target speech rate.
The prompt about the target speech rate may be a standard male or female voice output at the target speech rate, with the collector then speaking at the rate of that standard voice. In other embodiments, the prompt about the target speech rate may be given while the collector is speaking, as a prompt tone (e.g., a beep) output at the target speech rate.
By calculating the Mandarin proficiency of different collectors and recommending a corresponding collection speech rate to collectors of different proficiency levels, the method and apparatus can ensure both the speed and the accuracy of collection.
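The proficiency-to-rate lookup described above can be sketched as follows. The proficiency levels, the words-per-minute values, and the prompt wording are all illustrative assumptions; the patent only specifies that a preset mapping exists.

```python
# Hypothetical mapping from Mandarin proficiency level to a recommended
# speech rate, queried to produce the prompt described in the method.
RATE_BY_PROFICIENCY = {"high": 180, "medium": 140, "low": 100}  # assumed values

def speech_rate_prompt(proficiency):
    # unknown levels fall back to the slowest, safest rate
    rate = RATE_BY_PROFICIENCY.get(proficiency, RATE_BY_PROFICIENCY["low"])
    return f"Please speak at about {rate} words per minute."

print(speech_rate_prompt("medium"))
```

In the described system the proficiency value would come from the pre-trained Mandarin proficiency calculation model rather than being passed in directly.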
And S15, processing the second voice to be recorded according to the pre-trained intention prediction model, and predicting the input intention of the collector.
In at least one embodiment of the present application, for some target entry items, the input intention may be predicted while the second voice to be entered is being collected, that is, the content of the second half of the second voice to be entered is predicted from the content of the first half that the acquirer is currently speaking. A feature vector of the second voice to be entered is obtained through a pre-trained intention prediction model, and the input intention of the acquirer is recognized from the feature vector. Illustratively, when the target entry item is an address entry item, the second voice to be entered refers to an address containing province and city information; for example, it may be address voice of the form "Guangdong Province, Shenzhen City, Longhua District, XX Street, XX Community". When the acquirer has spoken "Guangdong Province, Shenzhen City, Long...", the pre-trained intention prediction model can quickly predict the likely street and community information within Longhua District.
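The prediction of the second half of an input from its first half can be illustrated with a deliberately simplified sketch; a real implementation would use the pre-trained intention prediction model, whereas this stand-in only does prefix matching over a handful of made-up address strings:

```python
# Illustrative sketch only: predict likely completions of a partially
# entered address by prefix matching against known addresses. The patent's
# intent prediction is a trained model; this substitutes a simple prefix
# lookup to show the idea. Address strings are made-up examples.

KNOWN_ADDRESSES = [
    "Guangdong Shenzhen Longhua District Minzhi Street",
    "Guangdong Shenzhen Longhua District Longhua Street",
    "Guangdong Shenzhen Nanshan District Yuehai Street",
]

def predict_completions(partial: str, limit: int = 5) -> list[str]:
    """Return candidate full addresses starting with the partial input."""
    return [a for a in KNOWN_ADDRESSES if a.startswith(partial)][:limit]
```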
Optionally, the processing the second speech to be recorded according to a pre-trained intent prediction model, and predicting the input intent of the acquirer includes:
denoising the second voice to be recorded to obtain second denoised voice to be recorded;
performing feature extraction processing on the second to-be-recorded denoising voice to obtain a feature vector;
and inputting the feature vector into a pre-trained intention prediction model to predict the input intention of the collector.
Denoising the second voice to be entered means removing the silence at the head and tail ends so as to reduce interference with subsequent steps; this silence-removal operation is generally called Voice Activity Detection (VAD). The feature extraction processing on the denoised second voice to be entered starts with framing, i.e., cutting the sound into small segments, each called a frame; framing is implemented with a moving window function, and adjacent frames overlap. The main algorithms of the feature extraction process include Linear Prediction Cepstral Coefficients (LPCC) and Mel-Frequency Cepstral Coefficients (MFCC), which transform each frame's waveform into a multi-dimensional vector containing the sound information.
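The preprocessing steps above (silence removal and framing with overlapping windows) can be sketched as follows; the amplitude threshold is a crude stand-in for real VAD, and the subsequent LPCC/MFCC feature computation is omitted:

```python
# Minimal sketch of the described preprocessing: trim leading/trailing
# silence (a crude stand-in for VAD), then cut the signal into overlapping
# frames with a moving window. Real systems would compute LPCC/MFCC
# features per frame afterwards.

def trim_silence(samples, threshold=0.01):
    """Drop leading/trailing samples whose magnitude is below threshold."""
    start, end = 0, len(samples)
    while start < end and abs(samples[start]) < threshold:
        start += 1
    while end > start and abs(samples[end - 1]) < threshold:
        end -= 1
    return samples[start:end]

def frame_signal(samples, frame_len=4, hop=2):
    """Cut the signal into overlapping frames (frame_len samples, hop step)."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]
```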
And S16, screening from a preset database and displaying text templates whose matching degree with the input intention is within a preset matching degree threshold range.
In at least one embodiment of the present application, the preset database may refer to a target node of a blockchain, and the preset database further stores a large number of text templates related to target entry items, where a text template refers to text information to be entered into a target entry item. The preset matching degree threshold is a preset value.
Optionally, the screening and displaying the text template, the matching degree of which with the input intention is within a preset matching degree threshold range, from a preset database includes:
acquiring the input intention;
calculating the matching degree of the input intention and text information in a preset database;
and selecting the text information with the matching degree exceeding a preset matching degree threshold value as a text template.
Calculating the matching degree between the input intention and the text information in the preset database refers to calculating the degree of similarity between the input intention and the text information. The preset matching degree threshold is a threshold set in advance.
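A hedged sketch of the screening step follows; since the application does not specify the matching-degree calculation, a simple token-overlap ratio is used here purely as an illustrative stand-in:

```python
# Illustrative sketch: score each stored text template against the
# predicted input intention and keep those above a preset matching-degree
# threshold. The token-overlap (Jaccard) score is an assumption, not the
# patent's actual calculation.

def matching_degree(intent: str, text: str) -> float:
    """Token-overlap ratio between the intent and a candidate text."""
    a, b = set(intent.split()), set(text.split())
    return len(a & b) / len(a | b) if a | b else 0.0

def screen_templates(intent: str, database: list[str], threshold: float = 0.5):
    """Return templates whose matching degree exceeds the preset threshold."""
    return [t for t in database if matching_degree(intent, t) > threshold]
```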
S17, inputting the text template into the target input item.
In at least one embodiment of the present application, before the text template is entered into the target entry item, the format attribute of the target entry item may first be determined; after the text template is adjusted according to the format attribute, the adjusted text template is entered into the target entry item. In addition to the Chinese or English format, the format attribute may include information such as the font, the font size, and the spacing of the entered information. By acquiring the format attribute of the target entry item and entering the text template according to it, the accuracy of information entry can be improved.
Optionally, the entering the text template into the target entry includes:
analyzing the target entry to obtain a format attribute;
adjusting the text template according to the format attribute to obtain a target text template;
and inputting the target text template into the target input item.
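The adjustment of a text template by format attribute can be sketched as below; the attribute keys and rules ("case", "title", "upper") are assumptions for demonstration only:

```python
# Hypothetical sketch: adjust a text template according to the target
# entry item's format attribute before entering it. The attribute names
# and rules are invented for illustration.

def adjust_template(template: str, fmt: dict) -> str:
    """Apply simple format rules to produce the target text template."""
    text = template.strip()
    if fmt.get("case") == "upper":     # e.g. entry stored in uppercase
        text = text.upper()
    elif fmt.get("case") == "title":   # e.g. capitalized name entry
        text = text.title()
    return text
```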
According to the information entry method provided by the embodiments of the application, the standard voice template is matched against the first voice to be entered of different acquirers, so that the acquirer's voice to be entered can be determined quickly and accurately, improving both the speed and the accuracy of information entry. In addition, text templates whose matching degree with the input intention is within the preset matching degree threshold range are screened from the preset database and displayed, and the text template is entered into the target entry item, so the acquirer does not need to fully speak the content of every entry item, which further improves the information entry rate. The application can be applied to functional modules of smart cities, such as smart government affairs and smart transportation (for example, an information entry module for smart government affairs), and can promote the rapid development of smart cities.
Fig. 2 is a structural diagram of an information recording apparatus according to a second embodiment of the present application.
In some embodiments, the information entry device 20 may include a plurality of functional modules composed of computer program segments. The computer programs of the program segments in the information entry device 20 may be stored in a memory of a computer device and executed by at least one processor to perform the functions of information entry (see the detailed description of fig. 1).
In this embodiment, the information entry device 20 may be divided into a plurality of functional modules according to the functions performed by the device. The functional module may include: a historical speech acquisition module 201, a speech template parsing module 202, a speech template matching module 203, an input speech splitting module 204, an input intention prediction module 205, a text template screening module 206, and a text template input module 207. A module as referred to herein is a series of computer program segments capable of being executed by at least one processor and capable of performing a fixed function and is stored in a memory. In the present embodiment, the functions of the modules will be described in detail in the following embodiments.
The historical voice acquisition module 201 is configured to acquire a target entry item in a preset page, and acquire historical voice data corresponding to the target entry item.
In at least one embodiment of the present application, the preset page refers to a page that converts collected voice information into text information and enters it into the corresponding entry items. In an embodiment, the preset page may be divided by functional area into a basic information acquisition area, a policy information acquisition area, or other information acquisition areas. For example, the basic information acquisition area contains a plurality of target entry items, which may be a name entry item, an age entry item, an identity card entry item, an address entry item, and the like; the policy information acquisition area may contain an insurance type entry item, a policy amount entry item, an insurance duration entry item, and the like.
Optionally, the acquiring a target entry in the preset page includes:
splitting a preset page according to the function information to obtain a plurality of target acquisition areas;
detecting whether a preset mark exists in the target acquisition area;
and when the detection result indicates that the preset mark exists in the target acquisition area, determining the position of the preset mark as a target entry item.
The target acquisition area may be a basic information acquisition area, a policy information acquisition area, or another information acquisition area. The preset mark is a mark set in advance for determining an entry item; the preset mark may be a digital mark, a letter mark, or a color mark, which is not limited herein.
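The splitting-and-detection steps above can be sketched as follows, assuming (purely for illustration) that a page is modeled as a mapping of acquisition areas to fields and that the preset mark is a string tag:

```python
# Minimal sketch of locating target entry items: split a page model into
# acquisition areas, then treat any field carrying the preset mark as a
# target entry item. The page structure and mark value are assumptions.

PRESET_MARK = "#entry"  # assumed marker value

def find_target_entries(page: dict) -> list[str]:
    """Return field names in any area whose mark equals the preset mark."""
    targets = []
    for area_fields in page.values():            # each acquisition area
        for field, mark in area_fields.items():
            if mark == PRESET_MARK:
                targets.append(field)
    return targets
```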
In at least one embodiment of the present application, a large amount of historical voice data is stored in a preset database, and in order to ensure the privacy and security of the historical voice data, the preset database may be a target node in a blockchain. Illustratively, when the target entry item is a name entry item, the corresponding historical voice data refers to voice data related to name entry; when the target entry item is an address entry item, the corresponding historical voice data refers to voice data related to address entry.
Optionally, the acquiring the historical voice data corresponding to the target entry includes:
acquiring the content attribute and the format attribute of the target entry item;
structuring the content attribute and the format attribute to obtain a target attribute;
determining reference voice data corresponding to the target attribute;
traversing a preset database, and detecting whether the voice data to be detected in the preset database contains the reference voice data;
and when the detection result is that the voice data to be detected in the preset database contains the reference voice data, determining the voice data to be detected as historical voice data meeting the target attribute.

For a name entry item, the content attribute refers to content keywords such as "name", and the format attribute refers to keywords in a Chinese or English format; for example, the Chinese-format keyword may be the Chinese word for "name", and the English-format keyword may be "Name", which is not limited herein. The reference voice data refers to the voice corresponding to the target attribute; for example, it may be the voice-form data of keywords such as "name" and "Name". By detecting whether the voice data to be detected in the preset database contains reference voice data such as "name" and "Name", it can be determined whether that voice data is related to a name entry item, and the voice data to be detected that is related to the name entry item is used as the historical voice data corresponding to the name entry item.
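A hedged sketch of this retrieval, operating on text transcripts instead of raw audio for brevity: reference keywords are built from the entry item's attributes, and any database record containing one of them is kept. The keyword construction is a simplification of the structuring step described above:

```python
# Illustrative sketch of the retrieval step, using transcripts rather than
# audio. The keyword construction is a simplified stand-in for structuring
# the content and format attributes into a target attribute.

def build_reference_keywords(content_attr: str, format_attr: str) -> list[str]:
    """Structure the attributes into reference keywords (simplified)."""
    return [content_attr.lower(), format_attr.lower()]

def fetch_historical_records(keywords, database):
    """Return database records containing any reference keyword."""
    return [rec for rec in database
            if any(k in rec.lower() for k in keywords)]
```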
The voice template analysis module 202 is configured to analyze the historical voice data to obtain a standard voice template.
In at least one embodiment of the present application, the standard voice template refers to voice information whose number of occurrences for the corresponding target entry item in the historical voice data is higher than a preset number threshold, where the preset number threshold is set in advance. There may be one or more standard voice templates. By periodically collecting and analyzing historical voice data, a regularly updated standard voice template is obtained, ensuring that the standard voice template meets the information entry requirements.
Optionally, the analyzing the historical speech data to obtain a standard speech template includes:
converting voice into word to process the historical voice data to obtain corresponding historical text data;
clustering, analyzing and processing the historical text data to obtain a plurality of clustering clusters, and acquiring the quantity of the historical text data in each clustering cluster;
determining target historical text data exceeding a preset number threshold in the number;
obtaining target historical voice data corresponding to the target historical text data according to the corresponding relation between the text data and the voice data;
and acquiring key voice in the target historical voice data as a standard voice template.
When the historical voice data is processed by speech-to-text conversion, a correspondence between the historical voice data and the historical text data is established; speech-to-text processing is prior art and is not described herein again. Illustratively, when the target entry item is a name entry item, the corresponding key voice refers to the voice data with the name content removed, that is, the voice containing keywords such as "name"; when the target entry item is an address entry item, the corresponding key voice refers to the voice data with the address content removed, that is, the voice containing keywords such as "address". Clustering the historical text data into a plurality of clusters means clustering the historical text data by keyword. For example, the keyword may be "name", and the historical text data corresponding to the "name" keyword may be clustered into one cluster.
By converting the historical voice data into historical text data through speech-to-text processing and performing the quantity analysis on the text data, the speed of the cluster analysis is improved, which in turn improves the acquisition speed of the standard voice template.
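The keyword clustering and quantity analysis described above can be sketched as follows; the keywords and the number threshold are illustrative assumptions:

```python
# Illustrative sketch of the clustering step: group historical transcripts
# by which keyword they contain, count each cluster, and keep clusters
# exceeding a preset number threshold. Keywords/threshold are assumptions.

def cluster_by_keyword(texts, keywords):
    """Map each keyword to the list of texts containing it."""
    return {k: [t for t in texts if k in t] for k in keywords}

def frequent_clusters(clusters, threshold):
    """Keep clusters whose size exceeds the preset number threshold."""
    return {k: v for k, v in clusters.items() if len(v) > threshold}
```

The retained clusters correspond to the target historical text data from which the key voice (and hence the standard voice template) is then derived.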
The voice template matching module 203 is configured to collect a first voice to be entered when a preset monitoring embedded point in the target entry item is triggered, and to match a target standard voice template corresponding to the first voice to be entered.
In at least one embodiment of the present application, the monitoring embedded point refers to an event preset in the target entry item for monitoring whether the target entry item is triggered; the monitoring embedded point may be set visually, by code, or in another manner, which is not limited herein. In an embodiment, the preset page may be divided by functional area into a basic information acquisition area, a policy information acquisition area, or other information acquisition areas, and a monitoring embedded point is set for each area. The method further includes:
when monitoring that a monitoring buried point is triggered, determining the area position of the monitoring buried point;
acquiring a plurality of target entry items in the region position;
and acquiring a first to-be-recorded voice of each target recording item according to a preset sequence.
The first voice to be entered may be collected in the order of the positions of the plurality of target entry items: after the target entry item at the front position has collected its first voice to be entered, the next target entry item at the following position is collected after a preset time interval, and so on.
In at least one embodiment of the present application, different acquirers may differ in spoken habits and accents. For example, when spoken habits differ and the target entry item is a name entry item, the first voice to be entered of acquirer A may be "My name is Zhang San", that of acquirer B may be "I am called Li Si", and that of acquirer C may be "I am Wang Er". By matching the standard voice template against the first voice to be entered of different acquirers, the voice to be entered can be determined quickly, improving the information entry rate. Further, when accents differ, the first voice to be entered of acquirer A may be a dialect-accented variant of "My name is Zhang San". By matching the standard voice template against the first voice to be entered of different acquirers, the voice to be entered can be determined accurately, improving the accuracy of information entry.
Optionally, the acquiring a first voice to be recorded and matching a target standard voice template corresponding to the first voice to be recorded includes:
analyzing the first voice to be recorded to obtain a key voice;
calculating the content similarity degree of the key voice and the standard voice template;
and determining the standard voice template with the content similarity exceeding a preset similarity threshold as a target standard voice template.
When the target entry item is a name entry item, the corresponding key voice refers to the voice data with the name content removed, that is, the voice containing keywords such as "name"; when the target entry item is an address entry item, the corresponding key voice refers to the voice data with the address content removed, that is, the voice containing keywords such as "address". There may be multiple standard voice templates. The degree of content similarity between the key voice and a standard voice template may be calculated by a pre-trained content similarity calculation model; the model may be a neural network model, and its training process is prior art and is not repeated herein. The preset similarity threshold may be a threshold set in advance for evaluating voice similarity.
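Using text transcripts as a stand-in for audio, the template-matching step can be sketched with a character-level similarity ratio; the application itself uses a pre-trained content similarity model, so `difflib` here is only an illustrative substitute:

```python
# Hedged sketch of template matching on transcripts: score the key phrase
# against each standard template with difflib's similarity ratio and keep
# templates above a preset similarity threshold. The real system uses a
# trained content similarity model; this ratio is only a stand-in.

from difflib import SequenceMatcher

def match_standard_templates(key_phrase, templates, threshold=0.6):
    """Return templates whose similarity to the key phrase exceeds threshold."""
    return [t for t in templates
            if SequenceMatcher(None, key_phrase, t).ratio() > threshold]
```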
The recorded voice splitting module 204 is configured to split the first voice to be recorded according to the target standard voice template to obtain a second voice to be recorded.
In at least one embodiment of the present application, the first voice to be entered includes the key voice and the second voice to be entered, where the second voice to be entered is the voice to be entered into the target entry item. Illustratively, when the target entry item is a name entry item, the corresponding key voice refers to the voice data with the name content removed, and the corresponding second voice to be entered is the name voice; when the target entry item is an address entry item, the corresponding key voice refers to the voice data with the address content removed, and the corresponding second voice to be entered is the address voice.
Optionally, the splitting the first voice to be recorded according to the target standard voice template to obtain a second voice to be recorded includes:
positioning the position of the target standard voice template in the first voice to be recorded;
and determining the voice content outside the position as second voice to be recorded.
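The splitting step can be sketched on a transcript proxy: locate the matched standard template in the first input and keep the content outside its position as the second input. String search stands in for locating the template's position in the audio:

```python
# Sketch of the splitting step on a transcript proxy. find() locates the
# template's position; everything outside that position is treated as the
# second input (the content to be entered).

def split_by_template(first_input: str, template: str) -> str:
    """Return the content outside the template's position, stripped."""
    pos = first_input.find(template)
    if pos == -1:
        return first_input.strip()   # template absent: keep everything
    before = first_input[:pos]
    after = first_input[pos + len(template):]
    return (before + after).strip()
```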
In at least one embodiment of the present application, different acquirers may differ in Mandarin proficiency: some acquirers may have high proficiency, some medium proficiency, and some poor proficiency (for example, Mandarin mixed with dialect). It can be understood that the speech of an acquirer with higher Mandarin proficiency can be accurately recognized even at a normal speech rate, whereas an acquirer with lower Mandarin proficiency should appropriately reduce the speech rate to ensure accurate recognition.
Optionally, after the acquiring the first voice to be recorded, the method further includes:
inputting the first to-be-recorded voice into a pre-trained mandarin proficiency calculation model to obtain the mandarin proficiency of the acquirer;
inquiring a preset mapping relation between the proficiency level of the Mandarin and the speech rate according to the proficiency level of the Mandarin to obtain a target speech rate;
outputting a prompt regarding the target speech rate.
The prompt about the target speech rate may be a standard male or female voice output at the target speech rate, with the acquirer then speaking at the speech rate of that standard voice. In other embodiments, the prompt about the target speech rate may be given by a prompt tone (e.g., a "beep" tone) played at the target speech rate while the acquirer is speaking.
By calculating the Mandarin proficiency of different acquirers and recommending a corresponding collection speech rate to acquirers at each proficiency level, the method and the device can ensure both the speed and the accuracy of collection.
The input intention prediction module 205 is configured to process the second speech to be recorded according to a pre-trained intention prediction model, and predict an input intention of the acquirer.
In at least one embodiment of the present application, for some target entry items, the input intention may be predicted while the second voice to be entered is being collected, that is, the content of the second half of the second voice to be entered is predicted from the content of the first half that the acquirer is currently speaking. A feature vector of the second voice to be entered is obtained through a pre-trained intention prediction model, and the input intention of the acquirer is recognized from the feature vector. Illustratively, when the target entry item is an address entry item, the second voice to be entered refers to an address containing province and city information; for example, it may be address voice of the form "Guangdong Province, Shenzhen City, Longhua District, XX Street, XX Community". When the acquirer has spoken "Guangdong Province, Shenzhen City, Long...", the pre-trained intention prediction model can quickly predict the likely street and community information within Longhua District.
Optionally, the processing the second speech to be recorded according to a pre-trained intent prediction model, and predicting the input intent of the acquirer includes:
denoising the second voice to be recorded to obtain second denoised voice to be recorded;
performing feature extraction processing on the second to-be-recorded denoising voice to obtain a feature vector;
and inputting the feature vector into a pre-trained intention prediction model to predict the input intention of the collector.
Denoising the second voice to be entered means removing the silence at the head and tail ends so as to reduce interference with subsequent steps; this silence-removal operation is generally called Voice Activity Detection (VAD). The feature extraction processing on the denoised second voice to be entered starts with framing, i.e., cutting the sound into small segments, each called a frame; framing is implemented with a moving window function, and adjacent frames overlap. The main algorithms of the feature extraction process include Linear Prediction Cepstral Coefficients (LPCC) and Mel-Frequency Cepstral Coefficients (MFCC), which transform each frame's waveform into a multi-dimensional vector containing the sound information.
The text template screening module 206 is configured to screen from a preset database and display text templates whose matching degree with the input intention is within a preset matching degree threshold range.
In at least one embodiment of the present application, the preset database may refer to a target node of a blockchain, and a large number of text templates related to target entry items are stored in the preset database, where a text template refers to text information to be entered into a target entry item. The preset matching degree threshold is a preset value.
Optionally, the screening and displaying the text template, the matching degree of which with the input intention is within a preset matching degree threshold range, from a preset database includes:
acquiring the input intention;
calculating the matching degree of the input intention and text information in a preset database;
and selecting the text information with the matching degree exceeding a preset matching degree threshold value as a text template.
Calculating the matching degree between the input intention and the text information in the preset database refers to calculating the degree of similarity between the input intention and the text information. The preset matching degree threshold is a threshold set in advance.
The text template entry module 207 is configured to enter the text template into the target entry.
In at least one embodiment of the present application, before the text template is entered into the target entry item, the format attribute of the target entry item may first be determined; after the text template is adjusted according to the format attribute, the adjusted text template is entered into the target entry item. In addition to the Chinese or English format, the format attribute may include information such as the font, the font size, and the spacing of the entered information. By acquiring the format attribute of the target entry item and entering the text template according to it, the accuracy of information entry can be improved.
Optionally, the entering the text template into the target entry includes:
analyzing the target entry to obtain a format attribute;
adjusting the text template according to the format attribute to obtain a target text template;
and inputting the target text template into the target input item.
Fig. 3 is a schematic structural diagram of a computer device according to a third embodiment of the present application. In the preferred embodiment of the present application, the computer device 3 includes a memory 31, at least one processor 32, at least one communication bus 33, and a transceiver 34.
It will be appreciated by those skilled in the art that the configuration of the computer device shown in fig. 3 does not limit the embodiments of the present application; it may be a bus-type or star-type configuration, and the computer device 3 may include more or fewer hardware or software components than those shown, or a different arrangement of components.
In some embodiments, the computer device 3 is a device capable of automatically performing numerical calculation and/or information processing according to instructions set or stored in advance, and the hardware includes but is not limited to a microprocessor, an application specific integrated circuit, a programmable gate array, a digital processor, an embedded device, and the like. The computer device 3 may also include a client device, which includes, but is not limited to, any electronic product capable of interacting with a client through a keyboard, a mouse, a remote controller, a touch pad, or a voice control device, for example, a personal computer, a tablet computer, a smart phone, a digital camera, etc.
It should be noted that the computer device 3 is only an example; other existing or future electronic products that can be adapted to the present application are also included within the scope of protection of the present application and are incorporated herein by reference.
In some embodiments, the memory 31 stores a computer program which, when executed by the at least one processor 32, carries out all or part of the steps of the information entry method described above. The memory 31 includes Read-Only Memory (ROM), Programmable Read-Only Memory (PROM), Erasable Programmable Read-Only Memory (EPROM), One-Time Programmable Read-Only Memory (OTPROM), Electrically-Erasable Programmable Read-Only Memory (EEPROM), Compact Disc Read-Only Memory (CD-ROM) or other optical disk storage, magnetic disk storage, magnetic tape storage, or any other computer-readable medium capable of carrying or storing data.
Further, the computer-readable storage medium may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function, and the like; the storage data area may store data created according to the use of the blockchain node, and the like.
The blockchain referred to in the present application is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a series of data blocks associated by cryptographic methods, each data block containing information of a batch of network transactions, used to verify the validity (anti-counterfeiting) of the information and to generate the next block. The blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.
In some embodiments, the at least one processor 32 is the control unit of the computer device 3, connecting the various components of the entire computer device 3 through various interfaces and lines, and executing various functions and processing data of the computer device 3 by running or executing the programs or modules stored in the memory 31 and calling the data stored in the memory 31. For example, when executing the computer program stored in the memory, the at least one processor 32 implements all or part of the steps of the information entry method described in the embodiments of the present application, or implements all or part of the functions of the information entry device. The at least one processor 32 may be composed of a single packaged integrated circuit, or of a plurality of integrated circuits with the same or different functions packaged together, including one or more Central Processing Units (CPUs), microprocessors, digital processing chips, graphics processors, and combinations of various control chips.
In some embodiments, the at least one communication bus 33 is arranged to enable connection communication between the memory 31 and the at least one processor 32 or the like.
Although not shown, the computer device 3 may further include a power supply (such as a battery) for supplying power to each component, and preferably, the power supply may be logically connected to the at least one processor 32 through a power management device, so as to implement functions of managing charging, discharging, and power consumption through the power management device. The power supply may also include any component of one or more dc or ac power sources, recharging devices, power failure detection circuitry, power converters or inverters, power status indicators, and the like. The computer device 3 may further include various sensors, a bluetooth module, a Wi-Fi module, and the like, which are not described herein again.
The integrated unit implemented in the form of a software functional module may be stored in a computer-readable storage medium. The software functional module is stored in a storage medium and includes several instructions to enable a computer device (which may be a personal computer, a computer device, or a network device) or a processor (processor) to execute parts of the methods according to the embodiments of the present application.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative: the division into modules is only one kind of logical functional division, and other divisions may be used in actual implementations.
The modules described as separate parts may or may not be physically separate, and the parts displayed as modules may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional modules in the embodiments of the present application may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or in the form of hardware plus software functional modules.
It will be evident to those skilled in the art that the present application is not limited to the details of the foregoing illustrative embodiments, and that it may be embodied in other specific forms without departing from its spirit or essential attributes. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the application being indicated by the appended claims rather than by the foregoing description; all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned. Furthermore, the word "comprising" does not exclude other elements or steps, and the singular does not exclude the plural. A plurality of units or devices recited in the specification may also be implemented by a single unit or device through software or hardware. The terms "first", "second", and the like are used to denote names and do not imply any particular order.
Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present application and not to limit them. Although the present application is described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications or equivalent substitutions can be made to the technical solutions of the present application without departing from their spirit and scope.

Claims (10)

1. An information entry method, characterized in that the information entry method comprises:
acquiring a target entry item in a preset page, and collecting historical voice data corresponding to the target entry item;
analyzing the historical voice data to obtain a standard voice template;
when a monitoring buried point preset in the target entry item is triggered, acquiring a first voice to be recorded, and matching a target standard voice template corresponding to the first voice to be recorded;
splitting the first voice to be recorded according to the target standard voice template to obtain a second voice to be recorded;
processing the second voice to be recorded according to a pre-trained intention prediction model, and predicting the input intention of the collector;
screening a text template whose matching degree with the input intention is within a preset matching degree threshold range from a preset database, and displaying the text template;
and entering the text template into the target entry item.
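As a rough illustration only, the claimed flow can be sketched as follows; every identifier below (the template stores, `enter_information`, the toy intent rule) is a hypothetical stand-in rather than part of the claim, and voice is represented as already-transcribed text:

```python
# Hypothetical sketch of the entry pipeline of claim 1. Voice is represented
# as already-transcribed text; the template store, the text-template database,
# and the stand-in intent "model" are all illustrative assumptions.

TEMPLATE_DB = {
    "name": "the customer's name is",        # standard voice templates parsed
    "address": "the customer's address is",  # from historical voice data
}

TEXT_TEMPLATE_DB = {
    "enter_name": "Name: {}",                # preset text templates keyed by
    "enter_address": "Address: {}",          # predicted input intention
}

def match_template(first_voice: str) -> str:
    """Match the target standard voice template contained in the utterance."""
    for template in TEMPLATE_DB.values():
        if template in first_voice:
            return template
    raise ValueError("no standard voice template matched")

def split_voice(first_voice: str, template: str) -> str:
    """Split off the template part, keeping the payload (the second voice)."""
    return first_voice.replace(template, "", 1).strip()

def predict_intent(second_voice: str) -> str:
    """Toy stand-in for the pre-trained intention prediction model."""
    has_digit = any(ch.isdigit() for ch in second_voice)
    return "enter_address" if has_digit else "enter_name"

def enter_information(first_voice: str) -> str:
    """Run the full sketched pipeline and return the filled-in entry text."""
    template = match_template(first_voice)
    payload = split_voice(first_voice, template)
    intent = predict_intent(payload)
    return TEXT_TEMPLATE_DB[intent].format(payload)
```

In a real system, `predict_intent` would be the pre-trained intention prediction model of the claim and the two databases would be populated from the historical voice data.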
2. The information entry method of claim 1, wherein the acquiring a target entry item in a preset page comprises:
splitting the preset page according to function information to obtain a plurality of target acquisition areas;
detecting whether a preset mark exists in the target acquisition area;
and when the detection result indicates that the preset mark exists in the target acquisition area, determining the position of the preset mark as a target entry item.
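A minimal sketch of this claim, assuming a page is represented as a mapping from function information to marked fields; the `PRESET_MARK` value and all names are illustrative assumptions:

```python
# Hypothetical sketch of claim 2: split a page into acquisition areas by
# function information, then treat fields carrying a preset mark as target
# entry items.

PRESET_MARK = "*"  # assumed marker attached to fields that accept voice entry

def find_target_entries(page: dict) -> list:
    """page maps a function name to a list of (field_name, mark) pairs."""
    targets = []
    for function, fields in page.items():   # split by function information
        for field_name, mark in fields:     # each field is a candidate area
            if mark == PRESET_MARK:         # detect the preset mark
                targets.append(field_name)  # its position is a target entry
    return targets
```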
3. The information entry method of claim 1, wherein the collecting historical voice data corresponding to the target entry item comprises:
acquiring the content attribute and the format attribute of the target entry item;
structuring the content attribute and the format attribute to obtain a target attribute;
determining reference voice data corresponding to the target attribute;
traversing a preset database, and detecting whether the voice data to be detected in the preset database contains the reference voice data;
and when the detection result is that the voice data to be detected in the preset database contains the reference voice data, determining the voice data to be detected as historical voice data meeting the target attribute.
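The containment test of this claim can be sketched as follows, again treating stored voice data as transcribed text; the attribute-to-reference mapping and all names are invented examples:

```python
# Hypothetical sketch of claim 3: structure the entry item's content and
# format attributes into a target attribute, determine the corresponding
# reference voice data, then keep only the stored voice data (transcribed
# here as text) that contains that reference.

REFERENCE_PHRASES = {  # assumed mapping: target attribute -> reference phrase
    ("date", "YYYY-MM-DD"): "date of birth",
}

def collect_history(content_attr: str, format_attr: str, database: list) -> list:
    target_attr = (content_attr, format_attr)   # structured target attribute
    reference = REFERENCE_PHRASES[target_attr]  # reference voice data
    # traverse the preset database and test containment
    return [item for item in database if reference in item]
```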
4. The information entry method of claim 1, wherein the parsing the historical voice data to obtain a standard voice template comprises:
performing voice-to-text processing on the historical voice data to obtain corresponding historical text data;
performing cluster analysis on the historical text data to obtain a plurality of clusters, and acquiring the quantity of historical text data in each cluster;
determining, as target historical text data, the historical text data whose quantity exceeds a preset quantity threshold;
obtaining target historical voice data corresponding to the target historical text data according to the corresponding relation between the text data and the voice data;
and acquiring key voice in the target historical voice data as a standard voice template.
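The claim does not fix a clustering algorithm; the sketch below substitutes a trivial exact-text grouping with `collections.Counter` just to show the counting and thresholding steps:

```python
# Hypothetical sketch of claim 4's counting step, with exact-text grouping
# standing in for the unspecified cluster analysis.
from collections import Counter

def build_templates(history_texts: list, count_threshold: int = 2) -> list:
    clusters = Counter(history_texts)  # "cluster" by identical text
    # keep the texts whose cluster size exceeds the preset quantity threshold;
    # a real system would then map these back to voice data and extract the
    # key voice as the standard voice template
    return [text for text, n in clusters.items() if n > count_threshold]
```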
5. The information entry method of claim 1, wherein the method further comprises:
when monitoring that a monitoring buried point is triggered, determining the area position of the monitoring buried point;
acquiring a plurality of target entry items in the region position;
and acquiring a first to-be-recorded voice of each target recording item according to a preset sequence.
6. The information entry method of claim 1, wherein the acquiring a first voice to be recorded and matching a target standard voice template corresponding to the first voice to be recorded comprises:
analyzing the first voice to be recorded to obtain a key voice;
calculating the content similarity between the key voice and each standard voice template;
and determining the standard voice template with the content similarity exceeding a preset similarity threshold as a target standard voice template.
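One possible reading of this similarity step, using `difflib.SequenceMatcher` as a stand-in for the unspecified content-similarity measure; the 0.6 threshold and all names are illustrative:

```python
# Hypothetical sketch of claim 6: score each standard template against the
# key speech (transcribed text) and keep templates whose similarity exceeds
# a preset threshold.
from difflib import SequenceMatcher

def match_templates(key_speech: str, templates: list, threshold: float = 0.6) -> list:
    matched = []
    for template in templates:
        similarity = SequenceMatcher(None, key_speech, template).ratio()
        if similarity > threshold:  # preset similarity threshold
            matched.append(template)
    return matched
```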
7. The information entry method of claim 1, wherein after the acquiring of the first voice to be recorded, the method further comprises:
inputting the first voice to be recorded into a pre-trained Mandarin proficiency calculation model to obtain the Mandarin proficiency of the collector;
querying a preset mapping relation between Mandarin proficiency and speech rate according to the obtained Mandarin proficiency to obtain a target speech rate;
outputting a prompt regarding the target speech rate.
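The proficiency-to-speech-rate lookup can be sketched as follows; the proficiency levels, rates, and prompt wording are assumptions, and the pre-trained proficiency model is replaced by an already-determined level:

```python
# Hypothetical sketch of claim 7's lookup and prompt steps.

RATE_BY_PROFICIENCY = {  # assumed preset mapping (words per minute)
    "high": 180,
    "medium": 140,
    "low": 100,
}

def prompt_speech_rate(proficiency_level: str) -> str:
    """Query the preset mapping and output a prompt on the target rate."""
    target_rate = RATE_BY_PROFICIENCY[proficiency_level]
    return f"Please speak at about {target_rate} words per minute."
```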
8. An information entry device, characterized in that the information entry device comprises:
the historical voice acquisition module is used for acquiring a target entry item in a preset page and acquiring historical voice data corresponding to the target entry item;
the voice template analysis module is used for analyzing the historical voice data to obtain a standard voice template;
the voice template matching module is used for acquiring a first voice to be recorded when a preset monitoring buried point in the target entry item is triggered, and matching a target standard voice template corresponding to the first voice to be recorded;
the recorded voice splitting module is used for splitting the first voice to be recorded according to the target standard voice template to obtain a second voice to be recorded;
the input intention prediction module is used for processing the second voice to be recorded according to a pre-trained intention prediction model and predicting the input intention of the collector;
the text template screening module is used for screening a text template whose matching degree with the input intention is within a preset matching degree threshold range from a preset database, and displaying the text template;
and the text template entry module is used for entering the text template into the target entry item.
9. A computer device, characterized in that the computer device comprises a processor, wherein the processor, when executing a computer program stored in a memory, implements the information entry method according to any one of claims 1 to 7.
10. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the information entry method according to any one of claims 1 to 7.
CN202110485532.9A 2021-04-30 2021-04-30 Information input method and device and related equipment Active CN113221990B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110485532.9A CN113221990B (en) 2021-04-30 2021-04-30 Information input method and device and related equipment


Publications (2)

Publication Number Publication Date
CN113221990A true CN113221990A (en) 2021-08-06
CN113221990B CN113221990B (en) 2024-02-23

Family

ID=77090789

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110485532.9A Active CN113221990B (en) 2021-04-30 2021-04-30 Information input method and device and related equipment

Country Status (1)

Country Link
CN (1) CN113221990B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116882591A (en) * 2023-09-05 2023-10-13 北京国网信通埃森哲信息技术有限公司 Information generation method, apparatus, electronic device and computer readable medium

Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020178004A1 (en) * 2001-05-23 2002-11-28 Chienchung Chang Method and apparatus for voice recognition
JP2011053563A (en) * 2009-09-03 2011-03-17 Neikusu:Kk Collation system of voice keyword in voice data, method thereof, and collation program of voice keyword in voice data
CN107783953A (en) * 2017-09-22 2018-03-09 平安普惠企业管理有限公司 Information input method and terminal device
CN108287815A (en) * 2017-12-29 2018-07-17 重庆小雨点小额贷款有限公司 Information input method, device, terminal and computer readable storage medium
CN109147792A (en) * 2018-08-10 2019-01-04 安徽网才信息技术股份有限公司 A kind of voice resume system
CN111243596A (en) * 2020-01-08 2020-06-05 中保车服科技服务股份有限公司 Insurance information acquisition method, device and equipment based on voice recognition and storage medium
CN111552833A (en) * 2020-03-30 2020-08-18 深圳壹账通智能科技有限公司 Intelligent double recording method, device and storage medium
CN111613220A (en) * 2020-05-19 2020-09-01 浙江省人民医院 Pathological information registration and input device and method based on voice recognition interaction
CN111640417A (en) * 2020-05-13 2020-09-08 广州国音智能科技有限公司 Information input method, device, equipment and computer readable storage medium
CN111833842A (en) * 2020-06-30 2020-10-27 讯飞智元信息科技有限公司 Synthetic sound template discovery method, device and equipment
CN111883137A (en) * 2020-07-31 2020-11-03 龙马智芯(珠海横琴)科技有限公司 Text processing method and device based on voice recognition
CN112201245A (en) * 2020-09-30 2021-01-08 中国银行股份有限公司 Information processing method, device, equipment and storage medium
CN112214997A (en) * 2020-10-09 2021-01-12 深圳壹账通智能科技有限公司 Voice information recording method and device, electronic equipment and storage medium
CN112232042A (en) * 2020-09-08 2021-01-15 广州金域医学检验中心有限公司 Material taking information input method and device, computer equipment and storage medium



Also Published As

Publication number Publication date
CN113221990B (en) 2024-02-23

Similar Documents

Publication Publication Date Title
WO2022116420A1 (en) Speech event detection method and apparatus, electronic device, and computer storage medium
CN107992596B (en) Text clustering method, text clustering device, server and storage medium
US11315366B2 (en) Conference recording method and data processing device employing the same
CN114007131B (en) Video monitoring method and device and related equipment
CN112001175A (en) Process automation method, device, electronic equipment and storage medium
CN109978619B (en) Method, system, equipment and medium for screening air ticket pricing strategy
CN112527994A (en) Emotion analysis method, emotion analysis device, emotion analysis equipment and readable storage medium
CN112634889B (en) Electronic case input method, device, terminal and medium based on artificial intelligence
CN113903363B (en) Violation behavior detection method, device, equipment and medium based on artificial intelligence
CN113707173B (en) Voice separation method, device, equipment and storage medium based on audio segmentation
CN111462761A (en) Voiceprint data generation method and device, computer device and storage medium
CN112863529A (en) Speaker voice conversion method based on counterstudy and related equipment
WO2022178933A1 (en) Context-based voice sentiment detection method and apparatus, device and storage medium
CN110246496A (en) Speech recognition method, system, computer device and storage medium
CN113591489A (en) Voice interaction method and device and related equipment
CN113077821A (en) Audio quality detection method and device, electronic equipment and storage medium
CN113420556A (en) Multi-mode signal based emotion recognition method, device, equipment and storage medium
CN111988294A (en) User identity recognition method, device, terminal and medium based on artificial intelligence
CN116956896A (en) Text analysis method, system, electronic equipment and medium based on artificial intelligence
CN113436617B (en) Voice sentence breaking method, device, computer equipment and storage medium
CN111488501A (en) E-commerce statistical system based on cloud platform
CN114372082A (en) Data query method and device based on artificial intelligence, electronic equipment and medium
CN113221990B (en) Information input method and device and related equipment
CN112542172A (en) Communication auxiliary method, device, equipment and medium based on online conference
CN113254814A (en) Network course video labeling method and device, electronic equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant