CN109101484A - Recording file processing method, device, computer equipment and storage medium - Google Patents

Info

Publication number
CN109101484A
Authority
CN
China
Prior art keywords
recording
recording file
text
file
cleaning
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810735639.2A
Other languages
Chinese (zh)
Other versions
CN109101484B (en
Inventor
岳鹏昱
闫冬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd
Priority to CN201810735639.2A (patent CN109101484B)
Priority to PCT/CN2018/106259 (patent WO2020006879A1)
Publication of CN109101484A
Application granted
Publication of CN109101484B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses a recording file processing method that addresses the low efficiency of cleaning recording files and the tendency for cleaned files to be organized inconsistently. The method includes: obtaining an uploaded recording file and the original text corresponding to the recording file; calling a speech recognition interface to perform speech recognition on the recording file to obtain a recognized text; judging whether the recognized text is consistent with the original text; if they are consistent, storing the recording file into a preset model training set; if they are inconsistent, recording the recording file in a to-be-cleaned directory, where the recording files recorded in the to-be-cleaned directory are listened to by processing personnel, who feed back the correct recording text; obtaining the cleaned recording file in the to-be-cleaned directory and its corresponding recording text; and storing the cleaned recording file in association with its corresponding recording text into the model training set. The invention also provides a recording file processing device, computer equipment and a storage medium.

Description

Recording file processing method, device, computer equipment and storage medium
Technical field
The present invention relates to the technical field of speech recognition, and in particular to a recording file processing method, device, computer equipment and storage medium.
Background
Speech recognition technology is now very widely applied. Many platforms have trained speech recognition models in advance and provide speech recognition services externally. When training a speech recognition model, a platform needs to collect a large number of voice files as samples for the model to learn from. Many platforms therefore release their own mobile APP clients and encourage users to upload recordings through the APP as training samples. This approach collects highly diverse training samples quickly and efficiently, but it also brings the problem of non-standard recording files. Although the platform provides an original text and the user produces the recording by reading that original text aloud, users differ and are not constrained by the platform, so what some users read out is inconsistent with the original text. If recording files obtained in this way are supplied to a speech recognition model for training, the expected training effect is not achieved; training may instead be slowed, and in serious cases the model may even be damaged. For this reason, recording files obtained in this way need to be cleaned before they are used as training samples.
At present, recording files are cleaned manually, one by one, by processing personnel. This is inefficient, and it also easily leads to problems such as non-standard organization of the recording files after cleaning and irregular file directories.
Summary of the invention
In view of the above technical problems, it is necessary to provide a recording file processing method, device, computer equipment and storage medium that can improve the efficiency with which processing personnel clean recording files and that facilitate the management and use of recording files.
A recording file processing method, comprising:
obtaining an uploaded recording file and the original text corresponding to the recording file;
calling a speech recognition interface to perform speech recognition on the recording file to obtain a recognized text;
judging whether the recognized text is consistent with the original text;
if the recognized text is consistent with the original text, storing the recording file into a preset model training set;
if the recognized text is inconsistent with the original text, recording the recording file in a to-be-cleaned directory, where the recording files recorded in the to-be-cleaned directory are listened to by processing personnel, who feed back the correct recording text;
obtaining the cleaned recording file in the to-be-cleaned directory and its corresponding recording text;
storing the cleaned recording file in association with its corresponding recording text into the model training set.
A recording file processing device, comprising:
a recording file obtaining module, configured to obtain an uploaded recording file and the original text corresponding to the recording file;
a speech recognition module, configured to call a speech recognition interface to perform speech recognition on the recording file to obtain a recognized text;
a text judgment module, configured to judge whether the recognized text is consistent with the original text;
a first storage module, configured to store the recording file into a preset model training set if the judgment result of the text judgment module is yes;
a file recording module, configured to record the recording file in a to-be-cleaned directory if the judgment result of the text judgment module is no, where the recording files recorded in the to-be-cleaned directory are listened to by processing personnel, who feed back the correct recording text;
a cleaned file obtaining module, configured to obtain the cleaned recording file in the to-be-cleaned directory and its corresponding recording text;
a second storage module, configured to store the cleaned recording file in association with its corresponding recording text into the model training set.
A computer equipment, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the steps of the above recording file processing method when executing the computer program.
A computer-readable storage medium storing a computer program, where the computer program implements the steps of the above recording file processing method when executed by a processor.
In the above recording file processing method, device, computer equipment and storage medium, an uploaded recording file and the original text corresponding to it are first obtained; a speech recognition interface is then called to perform speech recognition on the recording file to obtain a recognized text; whether the recognized text is consistent with the original text is judged; if they are consistent, the recording file is stored into a preset model training set; if they are inconsistent, the recording file is recorded in a to-be-cleaned directory, where the recording files are listened to by processing personnel, who feed back the correct recording text; the cleaned recording file in the to-be-cleaned directory and its corresponding recording text are then obtained; finally, the cleaned recording file is stored in association with its corresponding recording text into the model training set. Thus, by performing speech recognition on uploaded recording files, the method can check whether the recognized text is consistent with the original text before processing personnel clean the files. Recording files whose recognized text is consistent with the original text are stored directly into the model training set without being handed to processing personnel; only recording files whose recognized text is inconsistent with the original text are recorded in the to-be-cleaned directory for cleaning by processing personnel. This spares processing personnel part of the cleaning work and improves the efficiency with which they clean recording files. Moreover, both the recording files that need no cleaning and the cleaned recording files are ultimately stored in the model training set, which facilitates the management and use of recording files and makes it easy for a subsequent speech recognition model to use them as training samples.
Brief description of the drawings
In order to explain the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings described below are only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic diagram of an application environment of the recording file processing method in an embodiment of the present invention;
Fig. 2 is a flowchart of the recording file processing method in an embodiment of the present invention;
Fig. 3 is a schematic flowchart of step S103 of the recording file processing method in one application scenario in an embodiment of the present invention;
Fig. 4 is a schematic flowchart of marking, in one application scenario, the content of the original text that needs attention during cleaning, in the recording file processing method in an embodiment of the present invention;
Fig. 5 is a schematic flowchart of step S105 of the recording file processing method in one application scenario in an embodiment of the present invention;
Fig. 6 is a schematic flowchart of screening high-quality recording accounts in one application scenario in the recording file processing method in an embodiment of the present invention;
Fig. 7 is a schematic structural diagram of the recording file processing device in an embodiment of the present invention;
Fig. 8 is a schematic diagram of the computer equipment in an embodiment of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments. The described embodiments are some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art on the basis of these embodiments without creative effort fall within the protection scope of the present invention.
The recording file processing method provided by the present application can be applied in the application environment shown in Fig. 1, in which a client communicates with a server over a network. The client can be, but is not limited to, various personal computers, laptops, smartphones, tablet computers and portable wearable devices. The server can be implemented as an independent server or as a server cluster composed of multiple servers.
In one embodiment, as shown in Fig. 2, a recording file processing method is provided. Taking the application of the method to the server in Fig. 1 as an example, the method includes the following steps:
S101, obtaining an uploaded recording file and the original text corresponding to the recording file;
In this embodiment, users can record recording files with an APP client. The APP client connects over the network to the server of a voice cleaning platform, and each recording file is automatically uploaded to the voice cleaning platform once recording is complete, so that the voice cleaning platform can obtain these recording files.
In addition, users record these recording files by reading aloud the original text provided by the APP client, so the original text can be regarded as the standard text of the recording file. When the voice cleaning platform obtains a recording file, it should also obtain the original text corresponding to that recording file.
S102, calling a speech recognition interface to perform speech recognition on the recording file to obtain a recognized text;
After the uploaded recording file is obtained, the speech recognition interface of the voice cleaning platform can be called to perform speech recognition on the recording file and obtain a recognized text. The voice cleaning platform in this embodiment may provide the speech recognition function itself, or it may perform speech recognition on the recording files by calling the speech recognition interfaces of other platforms, converting the recorded speech to text and obtaining the recognized text of each recording file. Because the recognized text is produced by speech recognition technology, it may be identical to the original text or different from it. When the recognized text differs from the original text, there are at least three possible causes: 1, the user did not read according to the original text when recording; 2, the user's pronunciation was inaccurate; 3, the speech recognition made an error. This likewise shows that it is necessary to clean recording files before using them as samples to train a speech recognition model.
S103, judging whether the recognized text is consistent with the original text; if so, executing step S104; if not, executing step S105;
In this embodiment, in order to reduce the workload of processing personnel and improve cleaning efficiency, it is first determined, before cleaning, which recording files have text content consistent with the original text. It is therefore judged whether the recognized text obtained by speech recognition is consistent with the original text. If the two are consistent, the recording was read accurately and can be used directly as a training sample without cleaning by processing personnel. Conversely, if the two are inconsistent, the recording is inaccurate and cannot be used directly as a training sample; it needs to be cleaned by processing personnel, so step S105 is executed to complete the cleaning operation.
Further, while judging whether a recording file needs to be cleaned, this embodiment can also calculate the word error rate of the recognized text corresponding to the recording file for use in subsequent analysis. Specifically, as shown in Fig. 3, step S103 includes:
S201, calculating the word error rate of the recognized text relative to the original text;
S202, judging whether the word error rate is 0; if so, executing step S203; if not, executing step S204;
S203, determining that the recognized text is consistent with the original text;
S204, determining that the recognized text is inconsistent with the original text.
For step S201, the recognized text can be compared with the original text to calculate the word error rate of the recognized text relative to the original text. The higher the word error rate, the greater the difference between the recognized text and the original text; the lower the word error rate, the smaller the difference. A word error rate of 0 indicates that the recognized text is consistent with the original text.
For steps S202 to S204, if the word error rate is 0, the recognized text is consistent with the original text and step S104 is executed; conversely, if the word error rate is not 0, the recognized text is inconsistent with the original text and step S105 is executed.
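Steps S201 to S204 can be sketched as follows. The source does not say how the word error rate is computed; this sketch assumes the common definition (edit distance divided by the length of the reference text) and compares character by character, which is an assumption suited to Chinese text. The function names `word_error_rate` and `is_consistent` are hypothetical.

```python
def word_error_rate(original: str, recognized: str) -> float:
    """Edit distance between the two texts divided by the original's length.

    Assumption: tokens are single characters; a word-level variant would
    split both texts into words before comparing.
    """
    ref = list(original)
    hyp = list(recognized)
    if not ref:
        # No reference text: treat any recognized output as fully wrong.
        return 0.0 if not hyp else 1.0

    # Classic dynamic-programming Levenshtein distance, one row at a time.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if r == h else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def is_consistent(original: str, recognized: str) -> bool:
    """S202 to S204: the texts are consistent only when the rate is 0."""
    return word_error_rate(original, recognized) == 0
```

Because the rate is stored alongside the consistency decision, the same value can later be reused for per-account statistics without recognizing the file again.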
S104, storing the recording file into a preset model training set;
For step S104, once the recognized text has been determined to be consistent with the original text, the recording file can be considered accurate and can be used directly for training a speech recognition model, so it can be stored into the preset model training set. Specifically, the recording file can be stored in a specified database, or it can be recorded under a specified training file directory; when training samples are needed, the recording files to be used as training samples can be obtained by searching that training file directory.
S105, recording the recording file in a to-be-cleaned directory, where the recording files recorded in the to-be-cleaned directory are listened to by processing personnel, who feed back the correct recording text;
For step S105, once the recognized text has been determined to be inconsistent with the original text, the recording file needs to be cleaned, so it is recorded in the to-be-cleaned directory. In this embodiment, the recording files recorded in the to-be-cleaned directory are listened to by processing personnel, who feed back the correct recording text. That is, when cleaning recording files, the processing personnel first query the to-be-cleaned directory to learn which recording files need cleaning and then obtain those files. The processing personnel can play the recording files directly in the player built into the voice cleaning platform, using operations such as "speed up", "slow down", "fast forward", "rewind" and "pause"; these functions make it much easier to listen to the recording files. After listening to the audio of a recording file, the processing personnel enter the words they heard into the voice cleaning platform. The entered text is regarded as the correct text corresponding to the recording file, namely the recording text.
Further, as shown in Fig. 4, before step S106 below is executed, the recording file processing method in this embodiment may also include:
S301, obtaining a recording file to be cleaned and its corresponding original text;
S302, sending the recording file to be cleaned and its corresponding original text to the speech recognition service interfaces of several different platforms for speech recognition, and obtaining the platform recognized text fed back by each platform;
S303, comparing the original text with each platform recognized text, and determining the part of the text content in the original text that is consistent with every platform recognized text;
S304, marking the text content in the original text other than that part of the text content;
S305, sending the marked original text to the designated terminal of the processing personnel for cleaning.
It should be noted that cleaning recording files relies mainly on manual work by processing personnel. However, the text content of recording files is highly varied, the number of recording files collected through the APP client is extremely large, and the overwhelming majority of them must be cleaned before they can be used as training samples. This makes the workload of processing personnel excessive, and there are often not enough hands. To relieve this situation and improve cleaning efficiency, this embodiment uses steps S301 to S305 to mark, before manual processing, the content that needs attention in the original text corresponding to the recording file to be cleaned. This helps processing personnel listen to and check the recording file against the original text in a targeted way, and effectively improves the efficiency with which they clean the recording file.
For step S301, the to-be-cleaned directory can first be queried, and then the recording file to be cleaned is obtained together with its corresponding original text.
For step S302, the speech recognition models used by different platforms usually differ from one another, so the recognition results that different platforms produce for the same recording file also differ. This embodiment has different platforms perform speech recognition on the same recording file to be cleaned, obtains the platform recognized text fed back by each platform, and then compares these platform recognized texts with the original text to determine which text content corresponding to the recording file does not need attention.
For step S303, after the platform recognized text fed back by each platform is obtained, the original text can be compared with each platform recognized text to determine the part of the text content in the original text that is consistent with every platform recognized text. If a part of the text content in the original text appears in every platform recognized text, it can be concluded that the audio in the recording file corresponding to that part is accurate, so processing personnel need not pay attention to it. Conversely, the audio corresponding to the remaining text content in the original text is likely to contain parts that need cleaning and correction, and processing personnel should be reminded to pay attention to it.
For steps S304 and S305, as described above, the audio corresponding to that part of the text content in the original text can be considered not to need cleaning, that is, processing personnel need not pay attention to it. The remaining text content is therefore marked for emphasis, for example with highlighting effects such as bold, underlining or a background color. When cleaning the recording file, processing personnel can view the corresponding original text while playing the file and focus on the marked text content. Concentrating attention on these key places improves cleaning efficiency, makes it easier to locate and correct erroneous words, and makes it more likely that a recording file can be cleaned after listening to it only once.
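A minimal sketch of steps S303 and S304 follows. The source does not specify how the consistent part is determined; this sketch assumes a character-level alignment using `difflib.SequenceMatcher` from the Python standard library, and it uses square brackets as a stand-in for bold, underline or background color. The function names and the bracket markers are hypothetical.

```python
from difflib import SequenceMatcher


def agreed_mask(original: str, platform_texts: list[str]) -> list[bool]:
    """mask[i] is True when original[i] is matched in every platform text."""
    mask = [True] * len(original)
    for text in platform_texts:
        matched = [False] * len(original)
        matcher = SequenceMatcher(None, original, text, autojunk=False)
        for block in matcher.get_matching_blocks():
            for i in range(block.a, block.a + block.size):
                matched[i] = True
        # Keep a character only if all platforms seen so far agree on it.
        mask = [kept and ok for kept, ok in zip(mask, matched)]
    return mask


def mark_for_attention(original: str, platform_texts: list[str],
                       open_tag: str = "[", close_tag: str = "]") -> str:
    """S304: wrap the parts of the original text that need attention."""
    mask = agreed_mask(original, platform_texts)
    out = []
    inside = False
    for ch, ok in zip(original, mask):
        if not ok and not inside:
            out.append(open_tag)
            inside = True
        elif ok and inside:
            out.append(close_tag)
            inside = False
        out.append(ch)
    if inside:
        out.append(close_tag)
    return "".join(out)
```

For example, with the original text "the cat sat" and two platform results "the cat sat" and "the bat sat", only the character on which the platforms disagree is marked, giving "the [c]at sat". A real cleaning interface would render the marked spans with a highlight instead of brackets.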
S106, obtaining the cleaned recording file in the to-be-cleaned directory and its corresponding recording text;
Here, the recording text is the text determined by the processing personnel through cleaning, and it can be considered consistent with the audio content of the recording file. After listening to a recording file, the processing personnel can produce the recording text on the voice cleaning platform by modifying the original text or the recognized text. Once it is submitted, the voice cleaning platform obtains the recording text corresponding to the cleaned recording file.
S107, storing the cleaned recording file in association with its corresponding recording text into the model training set.
The cleaned recording file can be used for training a speech recognition model, so the cleaned recording file can be stored in association with its corresponding recording text into the model training set.
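The overall flow of steps S101 to S107 can be sketched as below. This is an illustrative outline only: the `recognize` callable stands in for whatever speech recognition interface the platform calls, the in-memory lists stand in for the model training set and the to-be-cleaned directory, and storing the original text next to a consistent recording and removing a file from the to-be-cleaned list once cleaned are assumptions not stated in the source.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class CleaningPlatform:
    # Hypothetical stand-in for the speech recognition interface (S102).
    recognize: Callable[[str], str]
    # (recording file, text) pairs usable as training samples.
    training_set: list = field(default_factory=list)
    # (recording file, original text) pairs waiting for manual cleaning.
    to_clean: list = field(default_factory=list)

    def handle_upload(self, recording: str, original_text: str) -> None:
        """S101 to S105: route an uploaded recording file."""
        recognized = self.recognize(recording)                     # S102
        if recognized == original_text:                            # S103
            self.training_set.append((recording, original_text))   # S104
        else:
            self.to_clean.append((recording, original_text))       # S105

    def submit_cleaned(self, recording: str, recording_text: str) -> None:
        """S106 and S107: store a cleaned file with its recording text."""
        self.to_clean = [(r, t) for r, t in self.to_clean if r != recording]
        self.training_set.append((recording, recording_text))
```

With this shape, only the files that fail the automatic check ever reach the processing personnel, and every file ends up in the same training set regardless of which path it took.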
In addition, it should be noted that the to-be-cleaned directory can classify recording files when recording them. Since different speech recognition models are usually distinguished by application field, such as finance, news, sports or film dialogue, the to-be-cleaned directory can also record the recording files to be cleaned according to their application field. Further, as shown in Fig. 5, step S105 may include:
S401, obtaining the initially determined application field of the recording file;
S402, recording the recording file at the position in the to-be-cleaned directory that belongs to the initially determined application field;
On the basis of steps S401 and S402, step S106 is specifically: obtaining the cleaned recording file in the to-be-cleaned directory, the recording text corresponding to the cleaned recording file, and a first application field, where the first application field is the application field determined by the processing personnel when cleaning the recording file. Step S107 may specifically include the following steps S501 to S503:
S501, judging whether the first application field of the recording file is consistent with the initially determined application field;
S502, if the first application field of the recording file is consistent with the initially determined application field, storing the cleaned recording file in association with its corresponding recording text at the position in the model training set that belongs to the initially determined application field;
S503, if the first application field of the recording file is inconsistent with the initially determined application field, storing the cleaned recording file in association with its corresponding recording text at the position in the model training set that belongs to the first application field.
For steps S401 and S402, the application field of a recording file can be determined from its corresponding original text. Because the original text is provided by the voice cleaning platform, its application field can be determined in advance and obtained. Note in particular that if the user did not follow the original text when recording, the recording file necessarily needs cleaning; when cleaning it, the processing personnel can listen to the recording file and then redetermine its application field. As for the to-be-cleaned directory, it can be divided into multiple positions, each recording information about the recording files of a different application field. This makes recording files easier to manage and use: when processing personnel need to clean the recording files of a particular application field in one batch, they can quickly retrieve the files of that field from the corresponding position in the to-be-cleaned directory.
For this purpose, on the basis of steps S401 and S402, step S106 is specifically: obtaining the cleaned recording file in the to-be-cleaned directory, the recording text corresponding to the cleaned recording file, and the first application field, where the first application field is the application field determined by the processing personnel when cleaning the recording file. That is, after listening to the recording file, the processing personnel not only feed back the recording text they heard but also manually determine a new application field, namely the first application field. Step S107 then stores the cleaned recording file in association with its corresponding recording text into the model training set. So that cleaned recording files are classified more clearly when they are collected into the model training set, steps S501 to S503 can be added to the method. It is first judged whether the first application field of the recording file is consistent with the initially determined application field. If they are consistent, the application field determined before cleaning was accurate, so the cleaned recording file and its corresponding recording text can be stored in association at the position in the model training set that belongs to the initially determined application field. If they are inconsistent, the application field determined before cleaning was inaccurate, and the new first application field determined by the processing personnel during cleaning prevails, so the recording file and its corresponding recording text are stored in association at the position in the model training set that belongs to the first application field. Dividing the cleaned recording files by application field in this way means that, when they are used as samples to train a speech recognition model, the recording files of the target application field can be retrieved quickly and the model can be trained in a targeted way. For example, to train a speech recognition model for the "automobile" field, the recording files at the position belonging to the "automobile" field in the model training set can be obtained as training samples.
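The field-based storage in steps S501 to S503 can be sketched as follows. The dictionary keyed by application field is only an assumed stand-in for the "positions" in the model training set, and the function names are hypothetical.

```python
from collections import defaultdict


def storage_field(initial_field: str, first_field: str) -> str:
    """S501 to S503: pick the field under which a cleaned file is stored.

    When the two fields agree, the initially determined field is used;
    otherwise the field chosen by the processing personnel prevails.
    """
    if first_field == initial_field:      # S501
        return initial_field              # S502
    return first_field                    # S503


def store_cleaned(training_set: dict, recording: str, recording_text: str,
                  initial_field: str, first_field: str) -> str:
    """Store the (file, text) pair under the resolved application field."""
    target = storage_field(initial_field, first_field)
    training_set[target].append((recording, recording_text))
    return target


# Example: model training set partitioned by application field.
training_set = defaultdict(list)
store_cleaned(training_set, "a.wav", "text a", "finance", "finance")
store_cleaned(training_set, "b.wav", "text b", "finance", "automobile")
```

After the two calls in the example, `training_set["automobile"]` holds the file that the processing personnel reclassified, so samples for an "automobile" model can be fetched with a single lookup.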
Further, in this embodiment, because the server collects the recording files recorded by users through the client, the recording files recorded by different users can differ significantly in quality. Some users record high-quality files, for example with a very low wrong word rate or no wrong words at all, while the files recorded by other users are of poorer quality. Considering cost, the server prefers users whose recordings are of high quality. Therefore, to screen out users with stable, high recording quality from the user population as a whole, as shown in Fig. 6, the method may further include:
S501: counting the wrong word rate of each recording file historically uploaded by a target account;
S502: calculating, from the counted wrong word rates of the historically uploaded recording files, the average wrong word rate of the recording files recorded by the target account;
S503: judging whether the average wrong word rate of the target account is less than a preset threshold; if so, executing step S504; if not, executing step S505;
S504: determining the target account as a high-quality recording account, where a high-quality recording account is rewarded under a preset incentive mechanism when it uploads a recording file;
S505: processing according to a preset flow.
For step S501, as can be seen from step S201 above, the server calculates the wrong word rate of the recognition text corresponding to a recording file at the same time as it judges whether the recording file needs cleaning. When needed, the server can therefore easily obtain the recording files that each user account has uploaded in the past, together with their wrong word rates.
For step S502, it can be understood that the average wrong word rate of the target account is the mean of the wrong word rates of the recording files the target account has uploaded in the past. For example, suppose target account A has uploaded 3 recording files whose wrong word rates are 0.2, 0.1 and 0.3; the average wrong word rate of target account A is then (0.2+0.1+0.3)/3=0.2.
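The averaging in step S502 can be written as a short sketch. The function name `average_wrong_word_rate` and the choice to reject an account with no uploads are assumptions made for illustration; the text does not say how an empty history is handled.

```python
from statistics import fmean


def average_wrong_word_rate(rates):
    """Mean of the wrong word rates of an account's historical uploads."""
    if not rates:
        # Assumption: an account without uploads has no defined average.
        raise ValueError("the account has no uploaded recording files")
    return fmean(rates)


# Target account A from the example: three uploads.
average_a = average_wrong_word_rate([0.2, 0.1, 0.3])
print(round(average_a, 4))
```

The result is rounded only for display, because binary floating point cannot represent most decimal fractions exactly.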
For steps S503-S505, in this embodiment, if the average wrong word rate of a user account is below a preset threshold, the recording files recorded by that user are considered to be of good quality, and the user is one the server prefers. The preset threshold can be set according to actual usage, for example 0.1, i.e. 10%. When the average wrong word rate of the target account is less than 10%, the target account is considered to be the recording account of a high-quality user and can be determined as a high-quality recording account. To encourage the user of a high-quality recording account to keep uploading recording files, the server can reward the account under the preset incentive mechanism when or after it uploads a recording file, for example by raising the account's permissions, granting account system points free of charge, or sending small gifts. Conversely, when the average wrong word rate of the target account is greater than or equal to 10%, the target account is not considered to be the recording account of a high-quality user worth attention, and is handled according to the preset flow. Here the preset flow may specifically be not determining the target account as a high-quality recording account.
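The threshold judgment of steps S503-S505 reduces to a strict comparison. The sketch below uses the example threshold of 0.1 from the text; the names `PRESET_THRESHOLD`, `is_high_quality_account` and `handle_upload`, and the returned status strings, are hypothetical.

```python
PRESET_THRESHOLD = 0.1  # example value from the text, i.e. 10%


def is_high_quality_account(average_rate, threshold=PRESET_THRESHOLD):
    """S503: an account qualifies only if its average is strictly below the threshold."""
    return average_rate < threshold


def handle_upload(account_id, average_rate):
    """Return a label for the branch taken (S504 or S505)."""
    if is_high_quality_account(average_rate):
        # S504: high-quality recording account, the incentive mechanism applies.
        return f"{account_id}: high-quality recording account, reward applies"
    # S505: preset flow, the account is not marked as high-quality.
    return f"{account_id}: preset flow"
```

An average of exactly 0.1 falls into the preset flow, since the text places "greater than or equal to 10%" in the S505 branch.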
In addition, the server in this embodiment can also view the cleaning status of each processing person, such as the number of recording files already cleaned and the number of recording files still to be cleaned.
In this embodiment, speech recognition is performed on each uploaded recording file so that, before the processing personnel clean the recording file, it can first be determined whether the recognition text is consistent with the original text. Recording files whose recognition text is consistent with the original text are stored directly in the model training set without being passed to the processing personnel for cleaning; only recording files whose recognition text is inconsistent with the original text are recorded in the to-be-cleaned directory for cleaning by the processing personnel. This saves the processing personnel part of the cleaning work and improves the efficiency with which they clean recording files. Moreover, both the recording files that need no cleaning and the cleaned recording files are ultimately stored in the model training set, which simplifies managing and using the recording files and lets subsequent speech recognition models be trained on them as samples.
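The overall routing summarized above can be sketched in a few lines. Everything named here is an assumption for illustration: `recognize` stands in for whatever speech recognition interface the server calls, the training set is a plain list, the to-be-cleaned directory is a dictionary, and storing the text alongside a file that needed no cleaning is a choice of this sketch rather than something the text prescribes.

```python
def route_recording(path, original_text, recognize, training_set, to_be_cleaned):
    """Send a recording either straight to the training set or to cleaning."""
    recognized = recognize(path)  # hypothetical speech recognition call
    if recognized == original_text:
        # Consistent: no manual cleaning needed.
        training_set.append((path, original_text))
        return "training_set"
    # Inconsistent: record it for the processing personnel.
    to_be_cleaned[path] = original_text
    return "to_be_cleaned"


def collect_cleaned(path, corrected_text, training_set, to_be_cleaned):
    """Move a cleaned recording, with its corrected text, into the training set."""
    to_be_cleaned.pop(path)
    training_set.append((path, corrected_text))
```

Only the second branch consumes human effort, which is where the saving described in this embodiment comes from.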
It should be understood that the sequence numbers of the steps in the above embodiments do not imply an order of execution. The execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present invention.
In one embodiment, a recording file processing device is provided, which corresponds one-to-one to the recording file processing method in the above embodiments. As shown in Fig. 7, the recording file processing device includes a recording file acquisition module 601, a speech recognition module 602, a text judgment module 603, a first storage module 604, a file recording module 605, a cleaned-file acquisition module 606 and a second storage module 607. The functional modules are described in detail as follows:
Recording file acquisition module 601, for obtaining an uploaded recording file and the original text corresponding to the recording file;
Speech recognition module 602, for calling a speech recognition interface to perform speech recognition on the recording file to obtain a recognition text;
Text judgment module 603, for judging whether the recognition text is consistent with the original text;
First storage module 604, for storing the recording file in a preset model training set if the judgment result of the text judgment module is yes;
File recording module 605, for recording the recording file to a to-be-cleaned directory if the judgment result of the text judgment module is no, where the recording files recorded in the to-be-cleaned directory are listened to by the processing personnel, who feed back the correct recording text;
Cleaned-file acquisition module 606, for obtaining the cleaned recording file and the corresponding recording text from the to-be-cleaned directory;
Second storage module 607, for storing the cleaned recording file in association with the corresponding recording text in the model training set.
Further, the text judgment module may include:
Wrong word rate calculation unit, for calculating the wrong word rate of the recognition text relative to the original text;
Wrong word rate judgment unit, for judging whether the wrong word rate is 0;
First determination unit, for determining that the recognition text is consistent with the original text if the judgment result of the wrong word rate judgment unit is yes;
Second determination unit, for determining that the recognition text is inconsistent with the original text if the judgment result of the wrong word rate judgment unit is no.
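This section does not spell out how the wrong word rate is computed. One common choice, assumed here purely for illustration, is the character-level edit (Levenshtein) distance between the recognition text and the original text divided by the length of the original text. The function names below are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wrong_word_rate(recognized: str, original: str) -> float:
    """Assumed definition: edit distance divided by the original length."""
    if not original:
        raise ValueError("original text must not be empty")
    return edit_distance(recognized, original) / len(original)


def texts_consistent(recognized: str, original: str) -> bool:
    """The two texts are consistent exactly when the wrong word rate is 0."""
    return wrong_word_rate(recognized, original) == 0
```

Under this definition a rate of 0 is equivalent to the two strings being identical, so the judgment unit could compare the strings directly; computing the rate is still useful because the per-account statistics reuse it.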
Further, the file recording module may include:
Initial field acquisition unit, for obtaining the initially determined application field of the recording file;
First recording unit, for recording the recording file at the position in the to-be-cleaned directory that belongs to the initially determined application field;
The cleaned-file acquisition module is specifically used for: obtaining, from the to-be-cleaned directory, the cleaned recording file, the recording text corresponding to the cleaned recording file, and a first application field, where the first application field is the application field determined by the processing personnel when cleaning the recording file;
The second storage module may specifically include:
Field judgment unit, for judging whether the first application field of the recording file is consistent with the initially determined application field;
First storage unit, for storing, if the judgment result of the field judgment unit is yes, the cleaned recording file in association with the corresponding recording text at the position in the model training set that belongs to the initially determined application field;
Second storage unit, for storing, if the judgment result of the field judgment unit is no, the cleaned recording file in association with the corresponding recording text at the position in the model training set that belongs to the first application field.
Further, the recording file processing device may also include:
Text acquisition module, for obtaining a recording file to be cleaned and the corresponding original text;
Platform recognition module, for sending the recording file to be cleaned and the corresponding original text to the speech recognition service interfaces of different platforms for speech recognition, and obtaining the platform recognition text fed back by each platform;
Platform recognition text comparison module, for comparing the original text with each platform recognition text and determining the part of the text content in the original text that is consistent with each platform recognition text;
Text marking module, for marking the text content in the original text other than that part of the text content;
Cleaning sending module, for sending the marked original text to a designated terminal so that the processing personnel can clean it.
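The comparison and marking modules can be illustrated with Python's standard `difflib`. This is a sketch under assumptions the text leaves open: the texts are compared as token sequences, a token counts as confirmed only if every platform's recognition text matches it, and unconfirmed tokens are wrapped in square brackets. The function names and the bracket convention are hypothetical.

```python
from difflib import SequenceMatcher


def confirmed_mask(original_tokens, platform_token_lists):
    """True for each original token that every platform's text matches."""
    mask = [True] * len(original_tokens)
    for tokens in platform_token_lists:
        matched = [False] * len(original_tokens)
        matcher = SequenceMatcher(None, original_tokens, tokens, autojunk=False)
        for block in matcher.get_matching_blocks():
            for i in range(block.a, block.a + block.size):
                matched[i] = True
        # Keep a token confirmed only if this platform agrees as well.
        mask = [m and ok for m, ok in zip(mask, matched)]
    return mask


def mark_unconfirmed(original_tokens, platform_token_lists):
    """Wrap tokens that at least one platform did not match in brackets."""
    mask = confirmed_mask(original_tokens, platform_token_lists)
    return " ".join(tok if ok else f"[{tok}]"
                    for tok, ok in zip(original_tokens, mask))


original = "turn on the radio".split()
platforms = ["turn on the radio".split(), "turn on the video".split()]
marked = mark_unconfirmed(original, platforms)
```

For Chinese text the same functions can be applied to lists of characters instead of space-separated words. The marked text tells the processing personnel where to concentrate when listening to the recording.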
Further, the recording file processing device may also include:
Statistics module, for counting the wrong word rate of each recording file historically uploaded by a target account;
Average wrong word rate calculation module, for calculating, from the counted wrong word rates of the historically uploaded recording files, the average wrong word rate of the recording files recorded by the target account;
Average wrong word rate judgment module, for judging whether the average wrong word rate of the target account is less than a preset threshold;
High-quality account determination module, for determining the target account as a high-quality recording account if the judgment result of the average wrong word rate judgment module is yes, where a high-quality recording account is rewarded under a preset incentive mechanism when it uploads a recording file.
For specific limitations of the recording file processing device, refer to the limitations of the recording file processing method above; they are not repeated here. Each module in the recording file processing device can be implemented wholly or partly by software, hardware, or a combination of the two. The modules can be embedded in, or independent of, a processor in the computer equipment in hardware form, or stored in a memory in the computer equipment in software form, so that the processor can call them and perform the operations corresponding to each module.
In one embodiment, a computer equipment is provided. The computer equipment may be a server, and its internal structure may be as shown in Fig. 8. The computer equipment includes a processor, a memory, a network interface and a database connected by a system bus. The processor provides computing and control capabilities. The memory includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program and a database. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The database stores the data involved in the recording file processing method. The network interface communicates with external terminals over a network connection. When executed by the processor, the computer program implements a recording file processing method.
In one embodiment, a computer equipment is provided, including a memory, a processor, and a computer program that is stored in the memory and can run on the processor. When executing the computer program, the processor implements the steps of the recording file processing method in the above embodiments, such as steps S101 to S107 shown in Fig. 2. Alternatively, when executing the computer program, the processor implements the functions of the modules/units of the recording file processing device in the above embodiments, such as the functions of modules 601 to 607 shown in Fig. 7. To avoid repetition, details are not described here again.
In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored. When executed by a processor, the computer program implements the steps of the recording file processing method in the above embodiments, such as steps S101 to S107 shown in Fig. 2. Alternatively, when executed by a processor, the computer program implements the functions of the modules/units of the recording file processing device in the above embodiments, such as the functions of modules 601 to 607 shown in Fig. 7. To avoid repetition, details are not described here again.
Those of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments can be completed by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the embodiments of the above methods. Any reference to memory, storage, a database or another medium used in the embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM) or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM) and Rambus dynamic RAM (RDRAM).
Those skilled in the art will clearly understand that, for convenience and brevity of description, the division into the functional units and modules above is only an example. In practical applications, the above functions can be assigned to different functional units and modules as required; that is, the internal structure of the device can be divided into different functional units or modules to complete all or part of the functions described above.
The embodiments described above are only intended to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions recorded in the foregoing embodiments can still be modified, or some of their technical features can be replaced by equivalents. Such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and they should all fall within the protection scope of the present invention.

Claims (10)

1. A recording file processing method, characterized by comprising:
Obtaining an uploaded recording file and an original text corresponding to the recording file;
Calling a speech recognition interface to perform speech recognition on the recording file to obtain a recognition text;
Judging whether the recognition text is consistent with the original text;
If the recognition text is consistent with the original text, storing the recording file in a preset model training set;
If the recognition text is inconsistent with the original text, recording the recording file to a to-be-cleaned directory, where the recording files recorded in the to-be-cleaned directory are listened to by processing personnel, who feed back the correct recording text;
Obtaining the cleaned recording file and the corresponding recording text from the to-be-cleaned directory;
Storing the cleaned recording file in association with the corresponding recording text in the model training set.
2. The recording file processing method according to claim 1, characterized in that judging whether the recognition text is consistent with the original text comprises:
Calculating the wrong word rate of the recognition text relative to the original text;
Judging whether the wrong word rate is 0;
If the wrong word rate is 0, determining that the recognition text is consistent with the original text;
If the wrong word rate is not 0, determining that the recognition text is inconsistent with the original text.
3. The recording file processing method according to claim 1, characterized in that recording the recording file to the to-be-cleaned directory comprises:
Obtaining the initially determined application field of the recording file;
Recording the recording file at the position in the to-be-cleaned directory that belongs to the initially determined application field;
Obtaining the cleaned recording file and the corresponding recording text from the to-be-cleaned directory is specifically: obtaining, from the to-be-cleaned directory, the cleaned recording file, the recording text corresponding to the cleaned recording file, and a first application field, where the first application field is the application field determined by the processing personnel when cleaning the recording file;
Storing the cleaned recording file in association with the corresponding recording text in the model training set specifically comprises:
Judging whether the first application field of the recording file is consistent with the initially determined application field;
If the first application field of the recording file is consistent with the initially determined application field, storing the cleaned recording file in association with the corresponding recording text at the position in the model training set that belongs to the initially determined application field;
If the first application field of the recording file is inconsistent with the initially determined application field, storing the cleaned recording file in association with the corresponding recording text at the position in the model training set that belongs to the first application field.
4. The recording file processing method according to claim 1, characterized in that, before obtaining the cleaned recording file and the corresponding recording text from the to-be-cleaned directory, the method further comprises:
Obtaining a recording file to be cleaned and the corresponding original text;
Sending the recording file to be cleaned and the corresponding original text to the speech recognition service interfaces of different platforms for speech recognition, and obtaining the platform recognition text fed back by each platform;
Comparing the original text with each platform recognition text, and determining the part of the text content in the original text that is consistent with each platform recognition text;
Marking the text content in the original text other than that part of the text content;
Sending the marked original text to a designated terminal so that the processing personnel can clean it.
5. The recording file processing method according to any one of claims 1 to 4, characterized in that the recording file processing method further comprises:
Counting the wrong word rate of each recording file historically uploaded by a target account;
Calculating, from the counted wrong word rates of the historically uploaded recording files, the average wrong word rate of the recording files recorded by the target account;
Judging whether the average wrong word rate of the target account is less than a preset threshold;
If the average wrong word rate of the target account is less than the preset threshold, determining the target account as a high-quality recording account, where a high-quality recording account is rewarded under a preset incentive mechanism when it uploads a recording file.
6. A recording file processing device, characterized by comprising:
A recording file acquisition module, for obtaining an uploaded recording file and an original text corresponding to the recording file;
A speech recognition module, for calling a speech recognition interface to perform speech recognition on the recording file to obtain a recognition text;
A text judgment module, for judging whether the recognition text is consistent with the original text;
A first storage module, for storing the recording file in a preset model training set if the judgment result of the text judgment module is yes;
A file recording module, for recording the recording file to a to-be-cleaned directory if the judgment result of the text judgment module is no, where the recording files recorded in the to-be-cleaned directory are listened to by processing personnel, who feed back the correct recording text;
A cleaned-file acquisition module, for obtaining the cleaned recording file and the corresponding recording text from the to-be-cleaned directory;
A second storage module, for storing the cleaned recording file in association with the corresponding recording text in the model training set.
7. The recording file processing device according to claim 6, characterized in that the text judgment module comprises:
A wrong word rate calculation unit, for calculating the wrong word rate of the recognition text relative to the original text;
A wrong word rate judgment unit, for judging whether the wrong word rate is 0;
A first determination unit, for determining that the recognition text is consistent with the original text if the judgment result of the wrong word rate judgment unit is yes;
A second determination unit, for determining that the recognition text is inconsistent with the original text if the judgment result of the wrong word rate judgment unit is no.
8. The recording file processing device according to claim 6 or 7, characterized in that the file recording module comprises:
An initial field acquisition unit, for obtaining the initially determined application field of the recording file;
A first recording unit, for recording the recording file at the position in the to-be-cleaned directory that belongs to the initially determined application field;
The cleaned-file acquisition module is specifically used for: obtaining, from the to-be-cleaned directory, the cleaned recording file, the recording text corresponding to the cleaned recording file, and a first application field, where the first application field is the application field determined by the processing personnel when cleaning the recording file;
The second storage module specifically comprises:
A field judgment unit, for judging whether the first application field of the recording file is consistent with the initially determined application field;
A first storage unit, for storing, if the judgment result of the field judgment unit is yes, the cleaned recording file in association with the corresponding recording text at the position in the model training set that belongs to the initially determined application field;
A second storage unit, for storing, if the judgment result of the field judgment unit is no, the cleaned recording file in association with the corresponding recording text at the position in the model training set that belongs to the first application field.
9. A computer equipment, comprising a memory, a processor, and a computer program that is stored in the memory and can run on the processor, characterized in that the processor, when executing the computer program, implements the steps of the recording file processing method according to any one of claims 1 to 5.
10. A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the recording file processing method according to any one of claims 1 to 5.
CN201810735639.2A 2018-07-06 2018-07-06 Recording file processing method and device, computer equipment and storage medium Active CN109101484B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201810735639.2A CN109101484B (en) 2018-07-06 2018-07-06 Recording file processing method and device, computer equipment and storage medium
PCT/CN2018/106259 WO2020006879A1 (en) 2018-07-06 2018-09-18 Recording file processing method and apparatus, and computer device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810735639.2A CN109101484B (en) 2018-07-06 2018-07-06 Recording file processing method and device, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN109101484A true CN109101484A (en) 2018-12-28
CN109101484B CN109101484B (en) 2023-04-18

Family

ID=64845566

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810735639.2A Active CN109101484B (en) 2018-07-06 2018-07-06 Recording file processing method and device, computer equipment and storage medium

Country Status (2)

Country Link
CN (1) CN109101484B (en)
WO (1) WO2020006879A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110310626A (en) * 2019-05-23 2019-10-08 平安科技(深圳)有限公司 Voice training data creation method, device, equipment and readable storage medium storing program for executing
CN112509608A (en) * 2020-11-25 2021-03-16 广州朗国电子科技有限公司 Method and device for recording sound along with channel of USB (Universal Serial bus) equipment and storage medium

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101154379A (en) * 2006-09-27 2008-04-02 夏普株式会社 Method and device for locating keywords in voice and voice recognition system
CN102522084A (en) * 2011-12-22 2012-06-27 广东威创视讯科技股份有限公司 Method and system for converting voice data into text files
CN102737634A (en) * 2012-05-29 2012-10-17 百度在线网络技术(北京)有限公司 Authentication method and device based on voice
CN102867511A (en) * 2011-07-04 2013-01-09 余喆 Method and device for recognizing natural speech
CN104636475A (en) * 2015-02-13 2015-05-20 小米科技有限责任公司 Method and device for optimizing multimedia file storage space
CN106228980A (en) * 2016-07-21 2016-12-14 百度在线网络技术(北京)有限公司 Data processing method and device
CN106875464A (en) * 2017-01-11 2017-06-20 深圳云创享网络有限公司 Threedimensional model document handling method, method for uploading and client
CN207380829U (en) * 2017-10-26 2018-05-18 福建星联科技有限公司 A kind of intelligent medicine storage device based on PLC and speech recognition

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8380510B2 (en) * 2005-05-20 2013-02-19 Nuance Communications, Inc. System and method for multi level transcript quality checking
CN107578769B (en) * 2016-07-04 2021-03-23 科大讯飞股份有限公司 Voice data labeling method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
谢锦辉: "采用上下文相关音素HMM的连续语音识别" *

Also Published As

Publication number Publication date
CN109101484B (en) 2023-04-18
WO2020006879A1 (en) 2020-01-09

Similar Documents

Publication Publication Date Title
JP6799574B2 (en) Method and device for determining satisfaction with voice dialogue
WO2018201964A1 (en) Processing method for session information, server, and computer readable storage medium
CN112492111B (en) Intelligent voice outbound method, device, computer equipment and storage medium
CN109360550A (en) Test method, device, equipment and the storage medium of voice interactive system
CN109934433A (en) A kind of personnel ability's appraisal procedure, device and cloud service platform
US20210034660A1 (en) Audio File Quality and Accuracy Assessment
CN115665325B (en) Intelligent outbound method, device, electronic equipment and storage medium
US20210256534A1 (en) Supporting automation of customer service
CN109743589B (en) Article generation method and device
CN109241334A (en) Audio keyword quality detecting method, device, computer equipment and storage medium
CN110298379A (en) Assessment models selection method, device, computer equipment and storage medium
CN109101484A (en) Recording file processing method, device, computer equipment and storage medium
US20170180219A1 (en) System and method of analyzing user skill and optimizing problem determination steps with helpdesk representatives
CN110211590A (en) A kind of processing method, device, terminal device and the storage medium of meeting hot spot
CN113297365B (en) User intention judging method, device, equipment and storage medium
CN110209768A (en) The problem of automatic question answering treating method and apparatus
CN110933504B (en) Video recommendation method, device, server and storage medium
US10341491B1 (en) Identifying unreported issues through customer service interactions and website analytics
CN112101046B (en) Conversation analysis method, device and system based on conversation behavior
CN110516056A (en) Interactive autonomous learning method, autonomous learning systems and storage medium
CN112837688B (en) Voice transcription method, device, related system and equipment
CN109213971A (en) The generation method and device of court's trial notes
CN113240345A (en) Customer service satisfaction management method and device, storage medium and electronic equipment
CN113868445A (en) Continuous playing position determining method and continuous playing system
CN112562688A (en) Voice transcription method, device, recording pen and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant