CN110599028A - Text positioning method, device, equipment and storage medium - Google Patents

Publication number
CN110599028A
Authority
CN
China
Prior art keywords
text
character
acquiring
array
context model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910851802.6A
Other languages
Chinese (zh)
Other versions
CN110599028B (en)
Inventor
张超
汤耀华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
WeBank Co Ltd
Original Assignee
WeBank Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by WeBank Co Ltd
Priority to CN201910851802.6A (patent CN110599028B)
Priority to PCT/CN2019/116470 (patent WO2021047003A1)
Publication of CN110599028A
Application granted
Publication of CN110599028B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06398 Performance of employee with respect to a job function
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/01 Customer relationship services
    • G06Q30/015 Providing customer assistance, e.g. assisting a customer within a business location or via helpdesk
    • G06Q30/016 After-sales
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0281 Customer communication at a business location, e.g. providing product or service information, consulting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Strategic Management (AREA)
  • Human Resources & Organizations (AREA)
  • Physics & Mathematics (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Tourism & Hospitality (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Health & Medical Sciences (AREA)
  • Game Theory and Decision Science (AREA)
  • Educational Administration (AREA)
  • Primary Health Care (AREA)
  • General Health & Medical Sciences (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Computational Linguistics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Machine Translation (AREA)

Abstract

The invention relates to the field of financial science and technology, and discloses a text positioning method, a device, equipment and a storage medium, wherein the text positioning method comprises the following steps: acquiring recording content, and performing voice recognition processing on the recording content to obtain a text to be positioned; acquiring a standard speech text, and constructing a distance context model according to the standard speech text; and obtaining a candidate text segment according to the distance context model and the text to be positioned. The method and the device solve the technical problem that the text quality inspection content to be evaluated cannot be quickly positioned in the prior art.

Description

Text positioning method, device, equipment and storage medium
Technical Field
The invention relates to the technical field of financial science and technology, in particular to a text positioning method, a text positioning device, text positioning equipment and a text positioning storage medium.
Background
With the development of computer technology, more and more technologies (big data, distributed computing, blockchain, artificial intelligence, etc.) are being applied in the financial field, and the traditional financial industry is gradually shifting toward financial technology (Fintech). However, the security and real-time requirements of the financial industry also place higher demands on these technologies.
At present, quality inspection and evaluation in the customer service industry generally relies on manual spot checks of customer service recordings. Manual operation is often subjective and limited, so the quality of customer service cannot be evaluated comprehensively and objectively. Manual spot checks may also keep selecting recordings with poor service quality, which makes the inspection unbalanced and the sampling inaccurate. Moreover, manual spot checks require quality inspection personnel to evaluate the text word by word, and a voice recording may contain a large amount of irrelevant information, so the personnel cannot quickly locate the text content to be evaluated. In other words, text positioning in the prior art has low accuracy and low efficiency, which indirectly reduces the quality and efficiency of quality inspection work.
Therefore, how to realize high-precision text positioning and improve text positioning efficiency is a technical problem to be solved urgently at present.
Disclosure of Invention
The invention mainly aims to provide a text positioning method, a text positioning device, text positioning equipment and a storage medium, and aims to solve the technical problem that text quality inspection content to be evaluated cannot be quickly positioned.
In order to achieve the above object, an embodiment of the present invention provides a text positioning method, where the text positioning method includes:
acquiring recording content, and performing voice recognition processing on the recording content to obtain a text to be positioned;
acquiring a standard speech text, and constructing a distance context model according to the standard speech text;
and obtaining a candidate text segment according to the distance context model and the text to be positioned.
Optionally, the step of constructing a distance context model according to the standard speech text comprises:
acquiring text characters of the standard speech text;
performing step processing on the text characters to obtain a first step array of each text character;
and carrying out list sorting according to the first step array to construct a distance context model.
Optionally, the step of obtaining a candidate text segment according to the distance context model and the text to be positioned includes:
acquiring text characters which accord with a preset distance range in the text to be positioned according to the distance context model, and determining a target positioning range of the text characters;
and determining the target characters in the target positioning range as candidate text segments.
Optionally, the step of obtaining text characters in the text to be positioned according to the distance context model includes:
performing character traversal in the text to be positioned according to the first step array so as to acquire a character sequence which accords with a preset distance range from the text to be positioned;
acquiring a target character sequence with the longest sequence length in the character sequences;
confirming the text to be positioned corresponding to the target character sequence as a text character.
Optionally, the step of performing character traversal in the text to be positioned according to the first step array to obtain a character sequence conforming to a preset distance range from the text to be positioned further includes:
acquiring the upper text of the text to be positioned;
performing step calculation according to the upper text to obtain a second step array;
and performing character traversal in the text to be positioned according to the first step array and the second step array so as to acquire a character sequence which accords with a preset distance range from the text to be positioned.
Optionally, the step of performing character traversal in the text to be positioned according to the first step array to obtain a character sequence conforming to a preset distance range from the text to be positioned further includes:
acquiring the lower text of the text to be positioned;
performing step calculation according to the lower text to obtain a third step array;
and performing character traversal in the text to be positioned according to the first step array and the third step array so as to acquire a character sequence which accords with a preset distance range from the text to be positioned.
Optionally, the step of performing character traversal in the text to be positioned according to the first step array to obtain a character sequence conforming to a preset distance range from the text to be positioned further includes:
acquiring the upper text of the text to be positioned, and acquiring the lower text of the text to be positioned;
performing step calculation according to the upper text to obtain a second step array, and performing step calculation according to the lower text to obtain a third step array;
and performing character traversal in the text to be positioned according to the first step array, the second step array and the third step array so as to acquire a character sequence which accords with a preset distance range from the text to be positioned.
The present invention also provides a text positioning apparatus, comprising:
the identification module is used for acquiring recording contents and carrying out voice identification processing on the recording contents so as to obtain texts to be positioned;
the construction module is used for acquiring a standard speech text and constructing the distance context model according to the standard speech text;
and the acquisition module is used for acquiring a candidate text segment according to the distance context model and the text to be positioned.
Optionally, the construction module comprises:
the obtaining submodule is used for obtaining text characters of the standard speech text;
the step submodule is used for carrying out step processing on the text characters to obtain a first step array of each text character;
and the constructing submodule is used for carrying out list sorting according to the first step array so as to construct a distance context model.
Optionally, the acquisition module includes:
the first determining submodule is used for acquiring text characters which accord with a preset distance range in the text to be positioned according to the distance context model and determining a target positioning range of the text characters;
and the second determining submodule is used for determining the target characters in the target positioning range as candidate text segments.
Optionally, the first determining sub-module includes:
the traversal unit is used for performing character traversal in the text to be positioned according to the first step array so as to acquire a character sequence which accords with a preset distance range from the text to be positioned;
the acquisition unit is used for acquiring a target character sequence with the longest sequence length in the character sequences;
and the confirming unit is used for confirming the text to be positioned corresponding to the target character sequence as a text character.
Optionally, the traversal unit further includes:
the first acquisition subunit is used for acquiring the upper text of the text to be positioned;
the first step subunit is used for performing step calculation according to the upper text to obtain a second step array;
and the first traversal subunit is used for performing character traversal in the text to be positioned according to the first step array and the second step array so as to acquire a character sequence which accords with a preset distance range from the text to be positioned.
Optionally, the traversal unit further includes:
the second acquisition subunit is used for acquiring the lower text of the text to be positioned;
the second step subunit is used for performing step calculation according to the lower text to obtain a third step array;
and the second traversal subunit is used for performing character traversal in the text to be positioned according to the first step array and the third step array so as to acquire a character sequence which accords with a preset distance range from the text to be positioned.
Optionally, the traversal unit further includes:
the third acquisition subunit is used for acquiring the upper text of the text to be positioned and acquiring the lower text of the text to be positioned;
the third step subunit is used for performing step calculation according to the upper text to obtain a second step array, and performing step calculation according to the lower text to obtain a third step array;
and the third traversal subunit is used for performing character traversal in the text to be positioned according to the first step array, the second step array and the third step array so as to acquire a character sequence which accords with a preset distance range from the text to be positioned.
Further, to achieve the above object, the present invention also provides an apparatus comprising: a memory, a processor, and a text positioning program stored on the memory and executable on the processor, wherein:
the text positioning program when executed by the processor implements the steps of the text positioning method as described above.
In addition, to achieve the above object, the present invention also provides a computer storage medium;
the computer storage medium has stored thereon a text positioning program which, when executed by a processor, implements the steps of the text positioning method as described above.
The method comprises the steps of obtaining recording contents, and carrying out voice recognition processing on the recording contents to obtain texts to be positioned; acquiring a standard speech text, and constructing a distance context model according to the standard speech text; and obtaining a candidate text segment according to the distance context model and the text to be positioned. Through the scheme, high-precision voice text positioning is realized, the text positioning efficiency is improved, the technical problem that the text positioning content to be evaluated cannot be quickly positioned in the prior art is solved, and the quality inspection working quality and the quality inspection efficiency are indirectly improved.
Drawings
FIG. 1 is a schematic diagram of an apparatus architecture of a hardware operating environment according to an embodiment of the present invention;
FIG. 2 is a flowchart illustrating a text positioning method according to an embodiment of the present invention.
The objects, features and advantages of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
As shown in fig. 1, fig. 1 is a schematic device structure diagram of a hardware operating environment according to an embodiment of the present invention.
The device of the embodiment of the invention can be a PC or a server device.
As shown in fig. 1, the apparatus may include: a processor 1001 (such as a CPU), a network interface 1004, a user interface 1003, a memory 1005, and a communication bus 1002. The communication bus 1002 is used to enable connection and communication between these components. The user interface 1003 may include a display screen (Display) and an input unit such as a keyboard (Keyboard); optionally, the user interface 1003 may also include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a WI-FI interface). The memory 1005 may be a high-speed RAM memory or a non-volatile memory (e.g., a magnetic disk memory). The memory 1005 may alternatively be a storage device separate from the processor 1001.
Those skilled in the art will appreciate that the configuration shown in fig. 1 does not limit the apparatus, which may include more or fewer components than those shown, combine some components, or use a different arrangement of components.
As shown in fig. 1, a memory 1005, which is one type of computer storage medium, may include an operating system, a network communication module, a user interface module, and a text positioning program.
In the device shown in fig. 1, the network interface 1004 is mainly used for connecting to a backend server and performing data communication with the backend server; the user interface 1003 is mainly used for connecting a client (user side) and performing data communication with the client; and processor 1001 may be configured to invoke the text-location program stored in memory 1005 and perform the operations in the various embodiments of the text-location method described below.
The main idea of the embodiment scheme of the invention is as follows: the method comprises the steps of obtaining recording contents, and carrying out voice recognition processing on the recording contents to obtain texts to be positioned; acquiring a standard speech text, and constructing a distance context model according to the standard speech text; and obtaining a candidate text segment according to the distance context model and the text to be positioned. Through the scheme, high-precision voice text positioning is realized, the text positioning efficiency is improved, the technical problem that the text quality inspection content to be evaluated cannot be quickly positioned in the prior art is solved, and the quality inspection working quality and the quality inspection efficiency are indirectly improved.
In the embodiment of the invention, it is considered that manual spot checks of customer service recordings in the prior art are often subjective and limited, so the quality of customer service cannot be evaluated comprehensively and objectively. Manual spot checks may also keep selecting recordings with poor service quality, which makes the inspection unbalanced and the sampling inaccurate. Moreover, manual spot checks require quality inspection personnel to evaluate the text word by word, and a voice recording may contain a large amount of irrelevant information, so the personnel cannot quickly locate the text content to be evaluated. In other words, text positioning in the prior art has low accuracy and low efficiency, which indirectly reduces the quality and efficiency of quality inspection work.
The invention provides a solution, which realizes high-precision voice text positioning, improves the text positioning efficiency, solves the technical problem that the text quality inspection content to be evaluated cannot be quickly positioned in the prior art, and indirectly improves the quality inspection working quality and the quality inspection efficiency.
Based on the hardware structure, the embodiment of the text positioning method is provided.
The invention belongs to the field of financial science and technology (Fintech), and provides a text positioning method which is mainly applied to equipment, wherein in one embodiment of the text positioning method, referring to FIG. 2, the text positioning method comprises the following steps:
step S10, acquiring recording content, and performing voice recognition processing on the recording content to obtain a text to be positioned;
step S20, acquiring a standard speech text, and constructing a distance context model according to the standard speech text;
and step S30, obtaining candidate text segments according to the distance context model and the text to be positioned.
The specific contents are as follows:
step S10, acquiring recording content, and performing voice recognition processing on the recording content to obtain a text to be positioned;
The recording content in this embodiment comprises customer service recordings. Usually, a customer service recording is kept whenever a customer service agent communicates with a customer, and the device performs voice recognition on the recording to convert it into the text to be positioned. It should be noted that, since the object of quality inspection is the customer service agent, the text to be positioned refers to the text converted from the agent's voice in a call recording, not the text converted from the user's voice.
Step S20, acquiring a standard speech text, and constructing a distance context model according to the standard speech text;
The standard speech text refers to a predetermined script that the text to be detected is required to contain during quality inspection. The standard speech text is therefore the reference against which the device checks whether the predetermined script appears in the text to be detected. An example is "thank you for your application and matching". Once the standard speech text is obtained, text positioning can be performed on the text to be detected.
Specifically, the step of constructing a distance context model according to the standard language text comprises:
step A1, acquiring text characters of the standard speech text;
In this embodiment, the standard speech text is used as a reference sample, so it needs to be split into characters to obtain its text characters. For example, the text characters of the standard speech text "thank you for your application and matching" are "thank", "you", "for", "apply", "please", "and", "match" and ".".
Step A2, performing step processing on the text characters to obtain a first step array of each text character;
and step A3, performing list sorting according to the first step array to construct a distance context model.
Assume that the standard speech text is str1, where str1 = "thank you for your application and matching", and obtain the character length len of str1. A model matrix[len][len] is constructed by the following matrix formula:
matrix[i][j]=[min distance,max distance]
min distance=min(sgn(j-i)*1,j-i)if i<=j
max distance=max(sgn(j-i)*1,j-i)if i<=j
Here sgn(x) is a step function: if x is greater than 0, sgn(x) returns 1; if x is equal to 0, it returns 0; if x is less than 0, it returns -1. Step coordinates of the standard speech text relative to each character are obtained by traversing the character interval features of the standard speech text, which gives the first step array of each text character. The first step array is an array consisting of the minimum distance value and the maximum distance value between one character number and another character number. For example, for character numbers i and j, the step array is [min distance, max distance], where the minimum distance value min distance is obtained by min(sgn(j-i)*1, j-i) and the maximum distance value max distance is obtained by max(sgn(j-i)*1, j-i). That is, the step function is evaluated on the character numbers, its return value is used as an input parameter of the min function and the max function, and the return values of the min function and the max function form the first step array of the corresponding text character.
According to the above matrix formula, the first step array of each text character is obtained, and the step arrays are arranged in a list, one per text character. For example, the character numbers of the horizontal "thank you for your application and matching" are denoted by j, giving the array character numbers a[j], namely a[0], a[1], a[2], a[3], a[4], a[5], a[6], a[7] and a[8]; the character numbers of the vertical "thank you for your application and matching" are denoted by i, giving the corresponding array character numbers b[i]. From these character array numbers and the matrix formula, the matrix model shown in table 1 below is obtained:
TABLE 1
Thus, the matrix model shown in table 1 above is a distance context model.
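As a minimal sketch of steps A1 to A3, the matrix formula can be written in Python as follows. The function names sgn, step_array and build_model, and the five-letter stand-in text "ABCDE", are illustrative assumptions and do not come from the embodiment. The formula is stated only for i <= j; applying the same expression to every pair of character numbers is likewise an assumption.

```python
from typing import List


def sgn(x: int) -> int:
    """Return 1, 0 or -1 for positive, zero or negative x."""
    return (x > 0) - (x < 0)


def step_array(i: int, j: int) -> List[int]:
    """[min distance, max distance] for character numbers i and j."""
    d = j - i
    return [min(sgn(d) * 1, d), max(sgn(d) * 1, d)]


def build_model(standard_text: str) -> List[List[List[int]]]:
    """matrix[i][j] holds the step array for characters i and j.

    Assumption: the source states the formula for i <= j only; this
    sketch applies the same expression to all pairs.
    """
    n = len(standard_text)
    return [[step_array(i, j) for j in range(n)] for i in range(n)]


matrix = build_model("ABCDE")  # hypothetical stand-in for the standard speech text
print(matrix[0][0])  # [0, 0]
print(matrix[0][3])  # [1, 3]
```

For i <= j each entry reduces to [sgn(j-i), j-i], so a character paired with itself gets [0, 0] and a character three positions later gets [1, 3].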
And step S30, obtaining candidate text segments according to the distance context model and the text to be positioned.
The distance context model is a standard matrix of a standard speech text and can be used as a reference model of a text to be positioned.
Further, the step of obtaining a candidate text segment according to the distance context model and the text to be positioned includes:
step B1, acquiring text characters which accord with a preset distance range in the text to be positioned according to the distance context model, and determining a target positioning range of the text characters;
The text to be positioned is input into the distance context model, and text characters are retrieved from it according to the model. Assume that the text to be positioned is "casting you apply for matching we will process you as soon as possible", and denote it as text. The device screens the characters in text to obtain the text characters that conform to the preset distance range. The text characters are the characters in text that conform to the semantics of the standard speech text, that is, the character samples whose word order is closest to that of the standard speech text.
After the text characters are acquired, the device determines the target positioning range of the text characters based on the starting position and the ending position of the text characters.
Specifically, the step of obtaining text characters in the text to be positioned according to the distance context model includes:
step a, performing character traversal in the text to be positioned according to the first step array to acquire a character sequence which accords with a preset distance range from the text to be positioned;
the first step array in the distance context model can provide character matching retrieval processing for the text to be positioned so as to obtain the co-occurrence relation between characters of the text to be positioned, and therefore character features which accord with a preset distance range in the text to be positioned are retrieved so as to obtain a character sequence.
Traverse each character char_i of the text to be positioned, where the index of char_i is i:
    if char_i is in the first column of matrix:
        traverse each character char_j of the text, with index j:
            if char_j is in the first row of matrix:
                if dis(i, j) >= matrix[i][j][0] and dis(i, j) <= matrix[i][j][1]:
If the condition holds, it can be understood that text[j] and text[i] in the text to be positioned have a co-occurrence relationship that conforms to the distance range. Collecting all characters that have such a relationship with text[i] gives the character sequence seq = text[j] … text[k]; the text segment within the range [j:k] of seq is the text segment whose starting position and ending position are thereby determined.
It will be appreciated that each character text[i] yields a text segment within the range [j:k] of its character sequence seq, and another character text[i'] likewise yields a text segment within the range [j':k'] of its own character sequence seq.
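The traversal above might be sketched in Python as follows. This is one possible reading, not the embodiment's implementation: dis(i, j) is assumed to be j - i measured in the text to be positioned, each character is assumed to be looked up at its first occurrence in the standard speech text, and the step array is computed directly from the matrix formula for both i <= j and i > j. The names step_array and cooccurring_sequences and the sample strings are hypothetical.

```python
from typing import Dict, List


def step_array(i: int, j: int) -> List[int]:
    """[min distance, max distance] from the matrix formula."""
    d = j - i
    s = (d > 0) - (d < 0)  # sgn(j - i)
    return [min(s * 1, d), max(s * 1, d)]


def cooccurring_sequences(standard: str, text: str) -> Dict[int, List[int]]:
    """Map each anchor index i in `text` to the indices j that co-occur with it."""
    first_index: Dict[str, int] = {}
    for k, ch in enumerate(standard):
        first_index.setdefault(ch, k)  # assumption: first occurrence only

    sequences: Dict[int, List[int]] = {}
    for i, char_i in enumerate(text):
        if char_i not in first_index:   # char_i is not a row of the model
            continue
        a = first_index[char_i]
        seq = []
        for j, char_j in enumerate(text):
            if char_j not in first_index:  # char_j is not a column of the model
                continue
            b = first_index[char_j]
            lo, hi = step_array(a, b)
            if lo <= j - i <= hi:       # assumed dis(i, j) = j - i
                seq.append(j)
        sequences[i] = seq
    return sequences


sequences = cooccurring_sequences("ABCDE", "xxABDExx")
print(sequences[2])  # [2, 3, 4, 5]
```

In this example the anchor "A" at index 2 co-occurs with "B", "D" and "E" even though "C" is missing, because each observed distance still falls inside the range allowed by the model.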
Further, the step of performing character traversal in the text to be positioned according to the first step array to obtain a character sequence conforming to a preset distance range from the text to be positioned further includes:
step a1, acquiring the upper text of the text to be positioned;
It will be appreciated that the distance context model above is built from the standard speech text itself. Assuming that M is the standard speech text, the distance context model of M itself is matrix(M, M). Besides using M alone, the model can also take context into account: assuming that M is the standard speech text and L is the upper text of M, the distance context model of M and LM is matrix(M, LM). Note that the upper text L appears in the model as the text to the left of the standard speech text.
For example, if the upper text is "thank you for your call", then the standard speech text and the upper text together construct matrix(M, LM).
Step a2, performing step calculation according to the upper text to obtain a second step array;
Similarly, the device computes a step array for the upper text, which gives the second step array.
Step a3, performing character traversal in the text to be positioned according to the first step array and the second step array so as to acquire a character sequence which accords with a preset distance range from the text to be positioned.
Due to the fact that the second step array is added to serve as a reference comparison array, the text to be positioned can be subjected to character traversal processing according to the first step array and the second step array, and through array traversal of more reference samples, a character sequence which meets a preset distance range is obtained from the text to be positioned.
Step b, obtaining a target character sequence with the longest sequence length in the character sequences;
The character sequences represent character samples that have a co-occurrence relation. Because the character sequences are screened by whether they conform to a preset distance range, their sequence lengths may differ. The device selects the target character sequence with the longest sequence length from all the character sequences; specifically, the sequences are sorted by length in descending order (from longest to shortest), and the first one is the target character sequence.
For example, the character sequence of the character "match" in "for you to process you as soon as possible in text to be located" match for you to you application "is" match for you to apply for you ", which is the longest target character sequence.
Step c, confirming the part of the text to be positioned that corresponds to the target character sequence as the text characters.
The target character sequence corresponds to a part of characters in the text to be positioned, and the part of the text to be positioned can be determined as text characters.
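Steps b and c can be sketched as a descending sort by length followed by a slice. This is a minimal sketch under the assumption that each character sequence is represented as a half-open (start, end) span in the text to be positioned; the function name `select_target` is hypothetical.

```python
def select_target(text, spans):
    """Return the part of `text` covered by the longest span (empty if none)."""
    if not spans:
        return ""
    # Sort by span length in descending order; sorted() is stable, so the
    # earliest span wins when several spans share the maximum length.
    ordered = sorted(spans, key=lambda span: span[1] - span[0], reverse=True)
    start, end = ordered[0]
    return text[start:end]


target = select_target("hello world", [(0, 2), (6, 11), (3, 5)])
```

A full sort mirrors the description above; `max(spans, key=...)` would give the same result without ordering the remaining spans.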
Step B2, determining the target characters in the target positioning range as candidate text segments.
After the target positioning range is determined, the device acquires the target characters in the target positioning range from the text to be positioned, and determines the target characters as candidate text segments for later use.
In this embodiment, recording content is acquired and subjected to speech recognition processing to obtain the text to be positioned; a standard speech text is acquired and a distance context model is constructed from it; and a candidate text segment is obtained according to the distance context model and the text to be positioned. This scheme achieves high-precision positioning of speech text and improves text positioning efficiency, solves the technical problem in the prior art that the text content to be evaluated cannot be located quickly, and indirectly improves the quality and efficiency of quality inspection work.
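The three steps summarized above can be wired together as follows. This is only a structural sketch: the class name `TextPositioner`, its field names, and the stub functions in the usage example are hypothetical stand-ins, not a real speech recognizer or the actual distance context model.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class TextPositioner:
    recognize: Callable[[bytes], str]    # recording content -> text to be positioned
    build_model: Callable[[str], Any]    # standard speech text -> distance context model
    locate: Callable[[Any, str], str]    # (model, text) -> candidate text segment

    def run(self, recording: bytes, standard_text: str) -> str:
        text = self.recognize(recording)
        model = self.build_model(standard_text)
        return self.locate(model, text)


# Stubs for illustration only: "recognition" decodes bytes, the "model" is the
# set of characters, and "locating" keeps the characters found in that set.
positioner = TextPositioner(
    recognize=lambda data: data.decode("utf-8"),
    build_model=lambda standard: set(standard),
    locate=lambda model, text: "".join(c for c in text if c in model),
)
segment = positioner.run(b"xxabcxx", "abc")
```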
Further, based on the first embodiment, a second embodiment of the text positioning method of the present invention is provided, in which the step of performing character traversal in the text to be positioned according to the first step array to obtain a character sequence conforming to a preset distance range from the text to be positioned further includes:
acquiring a text below the text to be positioned;
performing step calculation according to the text below to obtain a third step array;
It will be appreciated that the distance context model described so far is built from the standard speech text alone: if M is a standard speech text, the distance context model of M itself is matrix(M, M). Besides using M alone, the model can also include context: if R is the below text of M, the distance context model of M and MR is matrix(M, MR). Note that the below text R appears in the model as the text to the right of the standard speech text.
For example, if the below text is "we will handle it for you as soon as possible", then matrix(M, MR) is constructed from the standard speech text and this below text.
Similarly, the device will calculate the step array of the text below, resulting in a third step array.
And performing character traversal in the text to be positioned according to the first step array and the third step array so as to acquire a character sequence which accords with a preset distance range from the text to be positioned.
Because the third step array is added as the reference comparison array, the embodiment can perform character traversal processing on the text to be positioned according to the first step array and the third step array, and acquire the character sequence which accords with the preset distance range from the text to be positioned through array traversal of more reference samples.
Further, the step of performing character traversal in the text to be positioned according to the first step array to obtain a character sequence conforming to a preset distance range from the text to be positioned further includes:
acquiring an upper text of the text to be positioned, and acquiring a lower text of the text to be positioned;
performing step calculation according to the upper text to obtain a second step array, and performing step calculation according to the lower text to obtain a third step array;
and performing character traversal in the text to be positioned according to the first step array, the second step array and the third step array so as to acquire a character sequence which accords with a preset distance range from the text to be positioned.
The distance context model in the present embodiment extends the model of the first embodiment with additional step calculation samples. The model of the first embodiment is constructed from the standard speech text alone: M is the standard speech text, and the distance context model of M itself is matrix(M, M). To further improve the text positioning efficiency of the model, the present embodiment adds the context, that is, the above text and the below text, to the model. The device acquires the above text and performs step calculation to obtain a second step array, and acquires the below text and performs step calculation to obtain a third step array; the principle is the same as for the first step array.
Assume that M is the standard speech text, L is the above text of M, and R is the below text of M; the distance context model of M and LMR is then matrix(M, LMR). Note that the above text L appears in the model as the text to the left of the standard speech text, while the below text R appears as the text to its right.
For example, the standard speech text M is "thank you for application and cooperation", its above text L is "thank you for your call", and its below text R is "we will handle it for you as soon as possible"; matrix(M, LMR) is constructed from them.
Similar to the square table matrix(M, M) of the first embodiment, matrix(M, LMR) in this embodiment is a rectangular table that encodes M against LMR.
It can be understood that, in the present embodiment, matrix(M, LMR) is constructed on exactly the same principle as matrix(M, M): the second step array is obtained by performing step calculation on the above text, and the third step array is obtained by performing step calculation on the below text. The characters in the text to be positioned are then traversed, retrieved and mapped based on the first, second and third step arrays to determine the positional relation between each character and the other characters, and from that the character sequence that conforms to the preset distance range.
Table 2 below is an example matrix [ M, LMR ] of the present embodiment:
Based on matrix(M, LMR), in the text to be positioned "you will be processed as soon as possible for the cooperation of casting you application", we locate ["", "cooperation of casting you application", "we will process as soon as possible for you"].
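A rectangular model of this kind can be sketched as follows. This is a minimal illustration under assumptions: rows are the characters of M, columns are the characters of the concatenation LMR, each cell holds a (low, high) range of signed offsets, and the `tolerance` parameter and the function name `build_rect_model` are hypothetical.

```python
from collections import defaultdict


def build_rect_model(standard, above="", below="", tolerance=1):
    """Build a (row char, column char) -> (low, high) offset table for M against LMR."""
    context = above + standard + below
    start = len(above)  # position of M inside LMR
    offsets = defaultdict(list)
    for p, row_char in enumerate(standard):
        for q, col_char in enumerate(context):
            # Offset from the row character (at start + p in LMR) to the column character.
            offsets[(row_char, col_char)].append(q - (start + p))
    return {pair: (min(ds) - tolerance, max(ds) + tolerance)
            for pair, ds in offsets.items()}


rect = build_rect_model("bc", above="a", below="d", tolerance=0)
```

With empty `above` and `below` the table reduces to the square case matrix(M, M); characters of L get negative offsets (left of M) and characters of R get positive offsets (right of M).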
The advantages of matrix (M, LMR) over matrix (M, M) are:
(1) Standard speech text is generally short, and because of speech recognition problems the candidate segments of the text to be positioned tend to contain many errors, which makes matching short text against short text unreliable. Adding context expands the standard speech text, so that short-text matching becomes long-text matching, which improves both the matching effect and the text positioning accuracy.
(2) After speech recognition, customer service speech has various problems, and the problematic text is often randomly distributed. Adding context reduces the proportion of problematic text and improves the matching effect: the error proportion in a short text is higher than in a long text, so matching with long text works better.
In this way, the present embodiment builds on the first embodiment by adding step calculation for the above text and the below text, which enlarges the text positioning reference sample of the distance context model and improves text positioning efficiency.
Further, based on the first embodiment, a third embodiment of the text positioning method of the present invention is provided, in which the step of performing speech recognition processing on the recording content to obtain the text to be positioned includes:
Step e, carrying out voice recognition processing on the recording content to obtain a first text;
In this embodiment, the recording content may contain grammatical errors or semantic ambiguity, so standardized speech recognition processing needs to be performed on the recording content to obtain the first text.
Step f, performing text word segmentation on the first text to obtain a second text;
English words are naturally separated by spaces, so words can easily be divided at the spaces, although sometimes several words need to be treated as one, for example nouns such as "New York". Since Chinese has no spaces, the first text needs to be segmented into words. The first text is composed of a plurality of phrases, and the device separates the phrases in the first text to obtain meaningful phrases.
With a bag-of-words (BoW) model, the words of each text sample and their word frequencies are vectorized and clustered by word features to form phrase clusters. Alternatively, a set-of-words (SoW) model can be used; unlike the bag-of-words model, it only considers whether a word appears in the text, not its word frequency.
And performing text word segmentation processing based on the model to obtain a second text.
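The difference between the two models can be shown with a short vectorization example. This is a minimal sketch: the function name `to_vector`, the sample tokens and the fixed vocabulary are illustrative assumptions, and the clustering step mentioned above is not shown.

```python
from collections import Counter


def to_vector(tokens, vocabulary, binary=False):
    """Vectorize `tokens` over `vocabulary`.

    binary=False gives bag-of-words counts; binary=True gives the
    set-of-words form, which records presence only.
    """
    counts = Counter(tokens)
    if binary:
        return [1 if counts[word] > 0 else 0 for word in vocabulary]
    return [counts[word] for word in vocabulary]


tokens = ["thank", "you", "thank", "call"]
vocabulary = ["call", "thank", "you", "apply"]
bow = to_vector(tokens, vocabulary)               # word frequencies
sow = to_vector(tokens, vocabulary, binary=True)  # presence or absence
```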
Step g, performing text error correction processing on the second text to obtain a third text;
Common text errors mainly include wrong characters, pure pinyin, fuzzy sounds, pinyin mixed with Chinese characters, and pinyin mixed with other symbols. One or more of these problems may exist in the second text, so text error correction processing is required.
The text error correction process is divided into two steps: error detection and error correction. (1) The error detection part first segments the second text with a Chinese word segmenter. Because the second text may contain wrong characters, the segmentation result is often wrong as well, so errors are detected at both character granularity and word granularity, and the suspected errors from the two granularities are merged into a candidate set of suspected error positions. (2) The error correction part traverses all suspected error positions, replaces the words at those positions using dictionaries of similar words, computes the sentence perplexity with a language model, and compares and ranks the results of all candidates to obtain the best corrected words.
Through the above text error correction processing, the third text can be obtained.
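The correction step (replace, score by perplexity, keep the best candidate) can be sketched with a toy language model. Everything here is an assumption made for illustration: the character-bigram model with add-one smoothing, the tiny English corpus, the single-character confusion dictionary, and the simplification that any character with a confusion entry counts as a suspected error position. A real system would use a proper detector and a much stronger language model.

```python
import math
from collections import Counter


def train_bigram(corpus):
    """Count character bigrams and their left contexts; '^' marks sentence start."""
    contexts, bigrams, vocab = Counter(), Counter(), set()
    for sentence in corpus:
        padded = "^" + sentence
        vocab.update(padded)
        contexts.update(padded[:-1])
        bigrams.update(zip(padded, padded[1:]))
    return contexts, bigrams, len(vocab) + 1  # +1 leaves room for unseen characters


def perplexity(sentence, model):
    """Per-character perplexity under the add-one smoothed bigram model."""
    contexts, bigrams, vocab_size = model
    padded = "^" + sentence
    log_prob = 0.0
    for prev, cur in zip(padded, padded[1:]):
        log_prob += math.log((bigrams[(prev, cur)] + 1) / (contexts[prev] + vocab_size))
    return math.exp(-log_prob / max(len(sentence), 1))


def correct(sentence, confusion, model):
    """Try each single-character replacement; keep it if perplexity drops."""
    best = sentence
    for pos, ch in enumerate(sentence):
        for candidate in confusion.get(ch, ()):
            trial = best[:pos] + candidate + best[pos + 1:]
            if perplexity(trial, model) < perplexity(best, model):
                best = trial
    return best


lm = train_bigram(["thank you for your call", "thank you for your application"])
fixed = correct("thenk you", {"e": ["a"]}, lm)
```

Lower perplexity means the language model finds the sentence more plausible, so a replacement is accepted only when it makes the whole sentence more likely.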
Step h, carrying out text rewriting processing on the third text to obtain a text to be positioned.
The text rewriting process clears up disordered text by modifying the vocabulary attributes in the third text. For example, the word order, grammar and wording of the third text are corrected so that its meaning is expressed clearly and smoothly; the corrected text obtained after rewriting is the text to be positioned.
Through the voice recognition processing of customer service voice, the recognition accuracy of the text to be positioned is greatly improved, and effective data support is provided for the application of the subsequent text to be positioned. The ultimate goal of the present embodiment is to serve efficient positioning of text data.
In addition, an embodiment of the present invention further provides a text positioning apparatus, where the text positioning apparatus includes:
the identification module is used for acquiring recording contents and carrying out voice identification processing on the recording contents so as to obtain texts to be positioned;
the construction module is used for acquiring a standard speech text and constructing a distance context model according to the standard speech text;
and the acquisition module is used for acquiring a candidate text segment according to the distance context model and the text to be positioned.
Optionally, the building module comprises:
the obtaining submodule is used for obtaining text characters of the standard language text;
the step submodule is used for carrying out step processing on the text characters to obtain a first step array of each text character;
and the constructing submodule is used for carrying out list sorting according to the first step array so as to construct a distance context model.
Optionally, the obtaining module includes:
the first determining submodule is used for acquiring text characters which accord with a preset distance range in the text to be positioned according to the distance context model and determining a target positioning range of the text characters;
and the second determining submodule is used for determining the target characters in the target positioning range as candidate text segments.
Optionally, the first determining sub-module includes:
the traversal unit is used for performing character traversal in the text to be positioned according to the first step array so as to acquire a character sequence which accords with a preset distance range from the text to be positioned;
the acquisition unit is used for acquiring a target character sequence with the longest sequence length in the character sequences;
and the confirming unit is used for confirming the text to be positioned corresponding to the target character sequence as a text character.
Optionally, the traversal unit further includes:
the first acquisition subunit is used for acquiring the above text of the text to be positioned;
the first step subunit is used for performing step calculation according to the above text to obtain a second step array;
and the first traversal subunit is used for performing character traversal in the text to be positioned according to the first step array and the second step array so as to acquire a character sequence which accords with a preset distance range from the text to be positioned.
Optionally, the traversal unit further includes:
the second acquisition subunit is used for acquiring a text below the text to be positioned;
the second step subunit is used for performing step calculation according to the text below to obtain a third step array;
and the second traversal subunit is used for performing character traversal in the text to be positioned according to the first step array and the third step array so as to acquire a character sequence which accords with a preset distance range from the text to be positioned.
Optionally, the traversal unit further includes:
the third acquisition subunit is used for acquiring an upper text of the text to be positioned and acquiring a lower text of the text to be positioned;
the third step subunit is used for performing step calculation according to the upper text to obtain a second step array, and performing step calculation according to the lower text to obtain a third step array;
and the third traversal subunit is used for performing character traversal in the text to be positioned according to the first step array, the second step array and the third step array so as to acquire a character sequence which accords with a preset distance range from the text to be positioned.
In addition, an embodiment of the present invention further provides an apparatus, where the apparatus includes: a memory 109, a processor 110, and a text positioning program stored on the memory 109 and executable on the processor 110, the text positioning program implementing the steps of the embodiments of the text positioning method described above when executed by the processor 110.
Furthermore, the present invention also provides a computer storage medium storing one or more programs, which can be further executed by one or more processors for implementing the steps of the embodiments of the text positioning method described above.
The specific implementation of the device and the storage medium (i.e., the computer storage medium) of the present invention is basically the same as the embodiments of the text positioning method, and will not be described herein again.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) as described above, and includes instructions for enabling a device (such as a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present invention.
While the present invention has been described with reference to the embodiments shown in the drawings, the present invention is not limited to the embodiments, which are illustrative and not restrictive, and it will be apparent to those skilled in the art that various changes and modifications can be made therein without departing from the spirit and scope of the invention as defined in the appended claims.

Claims (10)

1. A text positioning method is characterized by comprising the following steps:
acquiring recording content, and performing voice recognition processing on the recording content to obtain a text to be positioned;
acquiring a standard speech text, and constructing a distance context model according to the standard speech text;
and obtaining a candidate text segment according to the distance context model and the text to be positioned.
2. The text localization method of claim 1, wherein the step of constructing a distance context model from the standard linguistic text comprises:
acquiring text characters of the standard language text;
performing step processing on the text characters to obtain a first step array of each text character;
and carrying out list sorting according to the first step array to construct a distance context model.
3. The method as claimed in claim 2, wherein said step of obtaining candidate text segments according to the distance context model and the text to be located comprises:
acquiring text characters which accord with a preset distance range in the text to be positioned according to the distance context model, and determining a target positioning range of the text characters;
and determining the target characters in the target positioning range as candidate text segments.
4. The method as claimed in claim 3, wherein the step of obtaining text characters in the text to be located according to the distance context model, which meet a preset distance range, comprises:
performing character traversal in the text to be positioned according to the first step array so as to acquire a character sequence which accords with a preset distance range from the text to be positioned;
acquiring a target character sequence with the longest sequence length in the character sequences;
confirming the text to be positioned corresponding to the target character sequence as a text character.
5. The method as claimed in claim 4, wherein said step of performing character traversal in said text to be positioned according to said first step array to obtain a character sequence conforming to a preset distance range from said text to be positioned further comprises:
acquiring an upper text of the text to be positioned;
performing step calculation according to the upper text to obtain a second step array;
and performing character traversal in the text to be positioned according to the first step array and the second step array so as to acquire a character sequence which accords with a preset distance range from the text to be positioned.
6. The method as claimed in claim 4, wherein said step of performing character traversal in said text to be positioned according to said first step array to obtain a character sequence conforming to a preset distance range from said text to be positioned further comprises:
acquiring a text below the text to be positioned;
performing step calculation according to the text below to obtain a third step array;
and performing character traversal in the text to be positioned according to the first step array and the third step array so as to acquire a character sequence which accords with a preset distance range from the text to be positioned.
7. The method as claimed in claim 4, wherein said step of performing character traversal in said text to be positioned according to said first step array to obtain a character sequence conforming to a preset distance range from said text to be positioned further comprises:
acquiring an upper text of the text to be positioned, and acquiring a lower text of the text to be positioned;
performing step calculation according to the upper text to obtain a second step array, and performing step calculation according to the lower text to obtain a third step array;
and performing character traversal in the text to be positioned according to the first step array, the second step array and the third step array so as to acquire a character sequence which accords with a preset distance range from the text to be positioned.
8. A text-locating device, comprising:
the identification module is used for acquiring recording contents and carrying out voice identification processing on the recording contents so as to obtain texts to be positioned;
the construction module is used for acquiring a standard speech text and constructing a distance context model according to the standard speech text;
and the acquisition module is used for acquiring a candidate text segment according to the distance context model and the text to be positioned.
9. An apparatus, characterized in that the apparatus comprises: memory, processor and a text positioning program stored on the memory and executable on the processor, the text positioning program when executed by the processor implementing the steps of the text positioning method according to any one of claims 1 to 7.
10. A storage medium having stored thereon a text localization program, which when executed by a processor implements the steps of the text localization method according to any one of claims 1 to 7.
CN201910851802.6A 2019-09-09 2019-09-09 Text positioning method, device, equipment and storage medium Active CN110599028B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201910851802.6A CN110599028B (en) 2019-09-09 2019-09-09 Text positioning method, device, equipment and storage medium
PCT/CN2019/116470 WO2021047003A1 (en) 2019-09-09 2019-11-08 Text positioning method, apparatus, device, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910851802.6A CN110599028B (en) 2019-09-09 2019-09-09 Text positioning method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN110599028A true CN110599028A (en) 2019-12-20
CN110599028B CN110599028B (en) 2022-05-17

Family

ID=68858597

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910851802.6A Active CN110599028B (en) 2019-09-09 2019-09-09 Text positioning method, device, equipment and storage medium

Country Status (2)

Country Link
CN (1) CN110599028B (en)
WO (1) WO2021047003A1 (en)

Also Published As

Publication number Publication date
CN110599028B (en) 2022-05-17
WO2021047003A1 (en) 2021-03-18

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant