WO2022134798A1 - 基于自然语言的断句方法、装置、设备及存储介质 - Google Patents

基于自然语言的断句方法、装置、设备及存储介质 Download PDF

Info

Publication number
WO2022134798A1
WO2022134798A1 (PCT/CN2021/124954)
Authority
WO
WIPO (PCT)
Prior art keywords
text
data
processed
segmentation
sequence
Prior art date
Application number
PCT/CN2021/124954
Other languages
English (en)
French (fr)
Inventor
赵焕丽
徐国强
Original Assignee
深圳壹账通智能科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳壹账通智能科技有限公司 filed Critical 深圳壹账通智能科技有限公司
Publication of WO2022134798A1 publication Critical patent/WO2022134798A1/zh

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/205Parsing
    • G06F40/211Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/22Procedures used during a speech recognition process, e.g. man-machine dialogue

Definitions

  • the present application relates to the technical field of natural language processing, and in particular, to a method, apparatus, device and storage medium for sentence segmentation based on natural language.
  • the user communicates with the customer service through voice signals or text, and segmenting the user's voice signals and text is a necessary process.
  • most of the customer service robot sentence segmentation modules on the market use voice endpoint detection for sentence segmentation: speech features such as frequency-domain characteristics, spectral entropy and fundamental frequency are combined to detect the start point and end point of each actual speech segment in the continuous audio signal, and the sentence is then broken at the end point, or the break is determined according to the length of the pause between the end point and the next start point.
  • the inventor realized that segmentation based on voice endpoint detection can misidentify the start point and end point of a sentence, thereby reducing the accuracy of sentence segmentation.
  • the present application provides a natural language-based sentence segmentation method, apparatus, device, and storage medium, which perform sentence segmentation with a natural language processing algorithm, thereby improving the flexibility and accuracy of sentence segmentation.
  • a first aspect of the present application provides a natural language-based sentence segmentation method, including: acquiring to-be-processed speech data from a first business scenario, or acquiring to-be-processed text data from a second business scenario; when the to-be-processed speech data is acquired from the first business scenario, inputting the to-be-processed speech data into a preset speech recognition model to generate a to-be-recognized text sequence, and performing feature screening and sentence segmentation on the to-be-recognized text sequence in combination with a natural language processing algorithm to generate target text segmentation data, the to-be-recognized text sequence including a plurality of text characters; when the to-be-processed text data is acquired from the second business scenario, inputting the to-be-processed text data into a pre-trained text segmentation model, and performing feature screening and sentence segmentation on the to-be-processed text data in combination with the natural language processing algorithm to generate target text segmentation data; and generating target response data according to the target text segmentation data and a corresponding scene configuration, and transmitting the target response data to a target terminal, the scene configuration being set in advance.
  • a second aspect of the present application provides a natural language-based sentence segmentation device, comprising a memory, a processor, and computer-readable instructions stored on the memory and executable on the processor, wherein the processor, when executing the computer-readable instructions, implements the following steps: acquiring to-be-processed speech data from a first business scenario, or acquiring to-be-processed text data from a second business scenario; when the to-be-processed speech data is acquired from the first business scenario, inputting the to-be-processed speech data into a preset speech recognition model to generate a to-be-recognized text sequence, and performing feature screening and sentence segmentation on the to-be-recognized text sequence in combination with a natural language processing algorithm to generate target text segmentation data, the to-be-recognized text sequence including a plurality of text characters; when the to-be-processed text data is acquired from the second business scenario, inputting the to-be-processed text data into a pre-trained text segmentation model, and performing feature screening and sentence segmentation on the to-be-processed text data in combination with the natural language processing algorithm to generate target text segmentation data; and generating target response data according to the target text segmentation data and a corresponding scene configuration, and transmitting the target response data to a target terminal, the scene configuration being set in advance.
  • a third aspect of the present application provides a computer-readable storage medium storing computer instructions which, when run on a computer, cause the computer to perform the following steps: acquiring to-be-processed speech data from a first business scenario, or acquiring to-be-processed text data from a second business scenario; when the to-be-processed speech data is acquired from the first business scenario, inputting the to-be-processed speech data into a preset speech recognition model to generate a to-be-recognized text sequence, and performing feature screening and sentence segmentation on the to-be-recognized text sequence in combination with a natural language processing algorithm to generate target text segmentation data, the to-be-recognized text sequence including a plurality of text characters; when the to-be-processed text data is acquired from the second business scenario, inputting the to-be-processed text data into a pre-trained text segmentation model, and performing feature screening and sentence segmentation on the to-be-processed text data in combination with the natural language processing algorithm to generate target text segmentation data; and generating target response data according to the target text segmentation data and a corresponding scene configuration, and transmitting the target response data to a target terminal, the scene configuration being set in advance.
  • a fourth aspect of the present application provides a natural language-based sentence segmentation apparatus, including: an acquisition module configured to acquire to-be-processed speech data from a first business scenario, or to acquire to-be-processed text data from a second business scenario; a first sentence segmentation module configured to, when the to-be-processed speech data is acquired from the first business scenario, input the to-be-processed speech data into a preset speech recognition model to generate a to-be-recognized text sequence, and to perform feature screening and sentence segmentation on the to-be-recognized text sequence in combination with a natural language processing algorithm to generate target text segmentation data, the to-be-recognized text sequence including a plurality of text characters; a second sentence segmentation module configured to, when the to-be-processed text data is acquired from the second business scenario, input the to-be-processed text data into a pre-trained text segmentation model, and to perform feature screening and sentence segmentation on the to-be-processed text data in combination with the natural language processing algorithm to generate target text segmentation data; and a response data generation module configured to generate target response data according to the target text segmentation data and a corresponding scene configuration, and to transmit the target response data to a target terminal, the scene configuration being set in advance.
  • in the technical solution provided by the present application, to-be-processed speech data is acquired from the first business scenario, or to-be-processed text data is acquired from the second business scenario; when the to-be-processed speech data is acquired, it is input into a preset speech recognition model to generate a to-be-recognized text sequence, and feature screening and sentence segmentation are performed on the to-be-recognized text sequence in combination with a natural language processing algorithm to generate target text segmentation data, the to-be-recognized text sequence including a plurality of text characters; when the to-be-processed text data is acquired, it is input into a pre-trained text segmentation model, and feature screening and sentence segmentation are performed on it in combination with the natural language processing algorithm to generate target text segmentation data; target response data is then generated according to the target text segmentation data and the corresponding scene configuration and transmitted to the target terminal.
  • in the embodiments of the present application, the to-be-processed speech data acquired from the first business scenario is input into the speech recognition model to generate the to-be-recognized text sequence, which is segmented in combination with the natural language processing algorithm, or the to-be-processed text data acquired from the second business scenario is input into the trained text segmentation model and segmented in combination with the natural language processing algorithm; segmenting both the to-be-processed speech data of the first business scenario and the to-be-processed text data of the second business scenario with a natural language processing algorithm improves the flexibility and accuracy of sentence segmentation.
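As a rough illustration of the two branches summarized above, the following Python sketch shows how a server-side handler might route speech data through a recognition step and text data directly into a segmentation model. All function names and the hard-coded return values are hypothetical placeholders for illustration; the disclosure does not prescribe concrete APIs.

```python
def recognize_speech(audio):
    # Placeholder for the preset speech recognition model; a real system would
    # turn the audio into a text sequence to be recognized.
    return "请问是哪位有什么事情"

def segment_sentence(text):
    # Placeholder for the NLP-based sentence segmentation (a BiLSTM/CRF-style
    # model is sketched later in this description).
    return text

def segment_request(data, data_type):
    """Route speech (first business scenario) or text (second business scenario)
    through the corresponding branch and return target text segmentation data."""
    if data_type == "speech":
        text_sequence = recognize_speech(data)   # speech -> to-be-recognized text sequence
        return segment_sentence(text_sequence)
    return segment_sentence(data)                # text goes straight to the segmentation model
```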
  • FIG. 1 is a schematic diagram of an embodiment of a natural language-based sentence segmentation method in an embodiment of the present application
  • FIG. 2 is a schematic diagram of another embodiment of the natural language-based sentence segmentation method in the embodiment of the present application.
  • FIG. 3 is a schematic diagram of an embodiment of a natural language-based sentence segmentation device in an embodiment of the present application
  • FIG. 4 is a schematic diagram of another embodiment of a natural language-based sentence segmentation device in an embodiment of the present application.
  • FIG. 5 is a schematic diagram of an embodiment of a device for segmenting sentences based on natural language in an embodiment of the present application.
  • Embodiments of the present application provide a natural language-based sentence segmentation method, apparatus, device, and storage medium, which are used for sentence segmentation by using a natural language processing algorithm, thereby improving the flexibility and accuracy of sentence segmentation.
  • an embodiment of the method for segmenting sentences based on natural language in the embodiment of the present application includes:
  • the server obtains the voice data from the first service scenario to obtain the to-be-processed voice data, or the server obtains the text data from the second service scenario to obtain the to-be-processed text data. It should be emphasized that, in order to further ensure the privacy and security of the above-mentioned data fields, the above-mentioned to-be-processed voice data and to-be-processed text data can also be stored in a node of a blockchain.
  • the first business scenario is a telemarketing scenario
  • the second business scenario is a customer service scenario
  • the data acquired by the server from the telemarketing scenario is voice-type data, that is, the to-be-processed voice data, and the data acquired by the server from the second business scenario is text-type data, that is, the to-be-processed text data; the to-be-processed voice data may be, for example, "May I ask who this is, what is the matter", "Yes, I am Mr. Zhang", or "What is the matter", and the to-be-processed text data may be, for example, "Hello, I'd like to ask a question" or "Got it, thank you".
  • the execution body of the present application may be a sentence segmentation device based on natural language, and may also be a terminal or a server, which is not specifically limited here.
  • the embodiments of the present application take the server as an execution subject as an example for description.
  • when the to-be-processed speech data is acquired from the first business scenario, the to-be-processed speech data is input into a preset speech recognition model to generate a to-be-recognized text sequence, and feature screening and sentence segmentation are performed on the to-be-recognized text sequence in combination with a natural language processing algorithm to generate target text segmentation data, the to-be-recognized text sequence including multiple text characters;
  • when acquiring the to-be-processed speech data from the first business scenario, the server inputs the to-be-processed speech data into the speech recognition model for processing, first generates the to-be-recognized text sequence, and then segments the to-be-recognized text sequence in combination with the natural language processing algorithm, thereby generating the target text segmentation data.
  • for example, when the server obtains the to-be-processed voice data "May I ask who this is, what is the matter" from the telemarketing scenario, the server inputs the to-be-processed voice data into the speech recognition model to generate a text sequence to be recognized; within the speech recognition model, noise removal, channel processing, feature extraction and other processing are performed on this voice data, generating the text sequence to be recognized [May I ask who this is what is the matter], which is then segmented in combination with the natural language processing algorithm to generate the target text segmentation data "May I ask who this is, what is the matter".
  • when acquiring the to-be-processed text data from the second business scenario, the server inputs the to-be-processed text data into the trained text segmentation model for data processing, and then segments the to-be-processed text data in combination with the natural language processing algorithm to generate the target text segmentation data.
  • for example, when obtaining the to-be-processed text data "Hello ask a question" from a customer service scenario, the server inputs "Hello ask a question" into the trained text segmentation model for data processing and segments it in combination with the natural language processing algorithm, generating the target text segmentation data "Hello, ask a question".
  • the server generates target response data according to the target text segmentation data and the corresponding scene configuration, and transmits the target response data to the target terminal; for example, if the target text segmentation data is "Hello, ask a question", the server generates the target response data "Please explain the problem" based on that target text segmentation data and the corresponding scene configuration, and finally transmits the target response data "Please explain the problem" to the target terminal.
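A minimal sketch of this lookup step, assuming the scene configuration is simply a mapping from segmented utterances to canned replies (the disclosure only states that responses are generated from the target text segmentation data and a preset scene configuration, so the configuration format below is an assumption):

```python
# Hypothetical scene configuration for the customer service scenario.
CUSTOMER_SERVICE_CONFIG = {
    "你好，咨询个问题": "请您说明问题",
}

def generate_response(segmented_text, scene_config, default="请您说明问题"):
    """Map target text segmentation data to target response data via the scene configuration."""
    return scene_config.get(segmented_text, default)

response = generate_response("你好，咨询个问题", CUSTOMER_SERVICE_CONFIG)
# `response` would then be transmitted to the target terminal.
```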
  • in the embodiments of the present application, the to-be-processed speech data obtained from the first business scenario is input into the speech recognition model to generate the to-be-recognized text sequence, which is segmented in combination with the natural language processing algorithm, or the to-be-processed text data obtained from the second business scenario is input into the trained text segmentation model and segmented in combination with the natural language processing algorithm; segmenting both the speech data to be processed in the first business scenario and the text data to be processed in the second business scenario improves the flexibility and accuracy of sentence segmentation.
  • another embodiment of the natural language-based sentence segmentation method in the embodiment of the present application includes:
  • the server obtains the voice data from the first service scenario to obtain the to-be-processed voice data, or the server obtains the text data from the second service scenario to obtain the to-be-processed text data. It should be emphasized that, in order to further ensure the privacy and security of the above-mentioned data fields, the above-mentioned to-be-processed voice data and to-be-processed text data can also be stored in a node of a blockchain.
  • the first business scenario is a telemarketing scenario
  • the second business scenario is a customer service scenario
  • the data acquired by the server from the telemarketing scenario is voice-type data, that is, the to-be-processed voice data, and the data acquired by the server from the second business scenario is text-type data, that is, the to-be-processed text data; the to-be-processed voice data may be, for example, "May I ask who this is, what is the matter", "Yes, I am Mr. Zhang", or "What is the matter", and the to-be-processed text data may be, for example, "Hello, I'd like to ask a question" or "Got it, thank you".
  • for example, when the server obtains the to-be-processed voice data "May I ask who this is, what is the matter" from a telemarketing scenario, the server inputs this voice data into the speech recognition module for feature extraction, generating the corresponding speech signal features.
  • specifically, when acquiring the to-be-processed voice data from the first business scenario, the server inputs the to-be-processed voice data into the preset voice recognition model and performs noise removal processing to generate noise-removed voice data; the server then performs signal enhancement processing on the noise-removed voice data to generate enhanced voice data; finally, the server performs feature extraction on the enhanced voice data to generate the voice signal features.
  • when the server obtains the to-be-processed voice data "May I ask who this is, what is the matter" from the telemarketing scenario, the server inputs it into the speech recognition model and first performs noise processing on it; the noise is the interfering data in the data, that is, data that is described inaccurately. In this embodiment a clustering algorithm is used for the noise processing: the clustering algorithm first groups similar sample points in the to-be-processed speech data into a cluster, the sample points that fall outside the cluster are then determined to be noise points, and these noise points are filtered out to generate the noise-removed voice data.
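A minimal sketch of this idea, assuming DBSCAN as the clustering algorithm (the disclosure does not name a specific one) and illustrative parameter values:

```python
import numpy as np
from sklearn.cluster import DBSCAN

def remove_noise_samples(samples, eps=0.5, min_samples=5):
    """Drop sample points that fall outside every cluster (treated as noise).

    `samples` is an (N, D) array of frame-level feature vectors; eps and
    min_samples are illustrative defaults, not values from the disclosure."""
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(samples)
    # DBSCAN assigns the label -1 to points that belong to no cluster.
    return samples[labels != -1]

# Example: keep only the points that land inside some cluster.
clean = remove_noise_samples(np.random.randn(200, 2))
```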
  • the server then performs signal enhancement processing on the noise-removed voice data: it first performs pre-emphasis processing on the noise-removed voice data to amplify the high-frequency signal, obtaining voice data with the high-frequency signal amplified; it then splits this voice data into short-time frame signals to obtain the split voice data, applies a window function to the split voice data to generate windowed voice data, and computes and normalizes the windowed voice data in combination with a Fourier transform to generate the enhanced voice data; finally, features are extracted from the enhanced voice data to generate the voice signal features.
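A minimal sketch of this enhancement chain, assuming a NumPy array of samples, a Hamming window, and illustrative frame sizes and pre-emphasis coefficient (none of these values come from the disclosure):

```python
import numpy as np

def enhance_and_extract(signal, frame_len=400, hop=160, alpha=0.97):
    """Pre-emphasis, short-time framing, windowing, FFT and per-frame
    normalization, roughly following the steps described above."""
    signal = np.asarray(signal, dtype=float)
    assert len(signal) >= frame_len, "expects at least one full frame"
    # 1. Pre-emphasis: amplify the high-frequency part of the signal.
    emphasized = np.append(signal[0], signal[1:] - alpha * signal[:-1])
    # 2. Split into short-time frames.
    n_frames = 1 + (len(emphasized) - frame_len) // hop
    frames = np.stack([emphasized[i * hop:i * hop + frame_len] for i in range(n_frames)])
    # 3. Apply a window function (Hamming) to each frame.
    frames *= np.hamming(frame_len)
    # 4. Fourier transform and per-frame normalization.
    spectrum = np.abs(np.fft.rfft(frames, axis=1))
    spectrum /= (spectrum.max(axis=1, keepdims=True) + 1e-8)
    return spectrum  # one feature vector per frame
```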
  • after generating the voice signal features, the server processes the voice signal features in step 202 to generate the text sequence to be recognized [May I ask who this is what is the matter], and finally segments it in combination with the natural language processing algorithm to generate the target text segmentation data "May I ask who this is, what is the matter".
  • specifically, the server inputs the speech signal features into the acoustic model of the speech recognition model for scoring, generating multiple acoustic model scores; the server inputs the speech signal features into the language model of the speech recognition model for scoring, generating multiple language model scores, where the language model may be an n-gram model, an RNN model, or the like; the server then uses a decoder to search the multiple acoustic model scores and multiple language model scores for the highest-scoring target acoustic model score and target language model score, determines the text characters corresponding to the target acoustic model score and target language model score as the target text characters, and thereby generates a to-be-recognized text sequence containing multiple target text characters; finally, the to-be-recognized text sequence is segmented in combination with the natural language processing algorithm to generate the target text segmentation data.
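The following toy sketch illustrates the selection step only: every candidate character sequence is given an acoustic score and a language score, and the hypothesis with the best combined score is kept. The scoring functions, the additive combination and the weighting are assumptions for illustration, not the decoder search described in the disclosure.

```python
def pick_best_hypothesis(hypotheses, acoustic_score, language_score, lm_weight=1.0):
    """Return the candidate text with the highest combined acoustic + language score."""
    best_text, best_score = None, float("-inf")
    for text in hypotheses:
        score = acoustic_score(text) + lm_weight * language_score(text)
        if score > best_score:
            best_text, best_score = text, score
    return best_text

# Usage with trivial stand-in scorers; an n-gram or RNN language model would
# replace language_score in a real system, as the text above suggests.
candidates = ["请问是哪位有什么事情", "请问是那位有什么事情"]
best = pick_best_hypothesis(candidates,
                            acoustic_score=lambda t: 0.0,
                            language_score=lambda t: -0.1 * len(t))
```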
  • when acquiring the to-be-processed text data from the second business scenario, the server inputs the to-be-processed text data into the trained text segmentation model for data processing, and then segments the to-be-processed text data in combination with the natural language processing algorithm to generate the target text segmentation data.
  • for example, when obtaining the to-be-processed text data "Hello ask a question" from a customer service scenario, the server inputs "Hello ask a question" into the trained text segmentation model for data processing and segments it in combination with the natural language processing algorithm, generating the target text segmentation data "Hello, ask a question".
  • specifically, when acquiring the to-be-processed text data from the second business scenario, the server inputs the to-be-processed text data into the pre-trained text segmentation model and, in combination with the natural language processing algorithm, performs feature screening on the to-be-processed text data to generate a text observation sequence and an observation label sequence; the server then performs sentence segmentation based on the text observation sequence and the observation label sequence to generate the target text segmentation data.
  • when the server obtains the to-be-processed text data "Hello ask a question" from the customer service scenario, the server inputs it into the trained text segmentation model, first generating, in combination with the natural language processing algorithm, the text observation sequence [Hello ask a question] and the observation label sequence "0100001"; the server then segments the text observation sequence [Hello ask a question] based on the observation label sequence "0100001", generating the target text segmentation data "Hello, ask a question".
  • when acquiring the to-be-processed text data from the second business scenario, inputting the to-be-processed text data into the pre-trained text segmentation model and performing feature screening on it in combination with the natural language processing algorithm to generate the text observation sequence and the observation label sequence includes the following:
  • when acquiring the to-be-processed text data from the second business scenario, the server inputs the to-be-processed text data into the embedding layer of the pre-trained text segmentation model for vector mapping, generating a vector sequence that does not include spaces; the server then inputs the vector sequence into a bidirectional long short-term memory recurrent neural network for feature screening, generating a feature-screened vector sequence; finally, the server inputs the feature-screened vector sequence into a conditional random field to generate a text observation sequence and an observation label sequence, where the text observation sequence includes multiple characters, the observation label sequence includes multiple observation labels, and the multiple characters correspond one-to-one to the multiple observation labels.
  • when the server obtains the to-be-processed text data "Hello ask a question" from the customer service scenario, the server inputs the to-be-processed text data into the embedding (Embedding) layer for vector mapping; it should be noted that the to-be-processed text data consists of multiple character data. Through the Embedding layer, the server maps the multiple character data in the to-be-processed text data into word vectors in a low-dimensional space, thereby generating an initial vector sequence, and filters the spaces out of the initial vector sequence based on a preset rule to generate the vector sequence. The server then inputs the vector sequence into the bidirectional long short-term memory recurrent neural network, that is, the BiLSTM neural network, which is used to delete useless features in the vector sequence; the specific process is to call a matrix identification parameter, multiply the matrix identification parameter by the vector sequence, perform the calculation with an activation function to obtain the useless features, filter the useless features out, and complete the screening of the vector sequence, thereby generating the feature-screened vector sequence. The feature-screened vector sequence is then input into the conditional random field, that is, the CRF layer, for label calculation, thereby generating the text observation sequence [Hello ask a question] and the observation label sequence "0100001".
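A rough PyTorch sketch of this embedding → BiLSTM → label-scoring stack follows. For brevity the CRF layer is approximated here by a per-character linear classifier over two labels (0 = continue, 1 = break); a real implementation would place a conditional random field on top of the BiLSTM outputs. All layer sizes and the vocabulary size are illustrative assumptions.

```python
import torch
import torch.nn as nn

class BiLSTMSegmenter(nn.Module):
    """Character embedding, bidirectional LSTM feature screening, and
    per-character label scores (a stand-in for the CRF layer)."""

    def __init__(self, vocab_size, embed_dim=64, hidden_dim=64, num_labels=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)        # Embedding layer
        self.bilstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True,
                              bidirectional=True)                   # feature screening
        self.classifier = nn.Linear(2 * hidden_dim, num_labels)     # stand-in for the CRF layer

    def forward(self, char_ids):
        # char_ids: (batch, seq_len) integer ids of the space-filtered characters.
        vectors = self.embedding(char_ids)
        features, _ = self.bilstm(vectors)
        return self.classifier(features)         # (batch, seq_len, num_labels) label scores

# Predicting an observation label sequence such as "0100001" for one utterance:
model = BiLSTMSegmenter(vocab_size=5000)
scores = model(torch.randint(0, 5000, (1, 7)))
labels = scores.argmax(dim=-1).squeeze(0).tolist()
```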
  • the server performing sentence segmentation based on the text observation sequence and the observation label sequence to generate the target text segmentation data includes: the server determines whether each observation label in the observation label sequence is a preset sentence segmentation label; if a target observation label is a segmentation label, the server determines the character corresponding to the target observation label as a target segmentation character, and adds a preset segmentation separator after the target segmentation character to perform sentence segmentation, generating the target text segmentation data.
  • the server performs label judgment on the observation label sequence "0100001"; assuming the segmentation label is "1" and the non-segmentation label is "0", the server checks, starting from the first observation label in the observation label sequence, whether each observation label is the preset segmentation label "1". The second observation label and the last observation label are determined to be the segmentation label "1", so the server determines the character corresponding to the second observation label as a target segmentation character and adds the segmentation separator "," after it. It should be noted that in this embodiment no segmentation separator is inserted at the end of the sentence, so the generated target text segmentation data is "Hello, ask a question".
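A minimal sketch of this label-driven splitting step, mirroring the example above ("你好咨询个问题" with labels "0100001" becomes "你好，咨询个问题"); the separator character is taken from the example and can be configured:

```python
def insert_separators(chars, labels, separator="，"):
    """Insert the separator after every character whose observation label is "1",
    except at the very end of the utterance (no separator at sentence end)."""
    pieces = []
    for i, (ch, lab) in enumerate(zip(chars, labels)):
        pieces.append(ch)
        if lab == "1" and i < len(chars) - 1:   # skip the sentence-final label
            pieces.append(separator)
    return "".join(pieces)

print(insert_separators("你好咨询个问题", "0100001"))  # -> 你好，咨询个问题
```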
  • the server generates target response data according to the target text segmentation data and the corresponding scene configuration, and transmits the target response data to the target terminal; for example, if the target text segmentation data is "Hello, ask a question", the server generates the target response data "Please explain the problem" based on that target text segmentation data and the corresponding scene configuration, and finally transmits the target response data "Please explain the problem" to the target terminal.
  • in the embodiments of the present application, the to-be-processed speech data obtained from the first business scenario is input into the speech recognition model to generate the to-be-recognized text sequence, which is segmented in combination with the natural language processing algorithm, or the to-be-processed text data obtained from the second business scenario is input into the trained text segmentation model and segmented in combination with the natural language processing algorithm; segmenting both the speech data to be processed in the first business scenario and the text data to be processed in the second business scenario improves the flexibility and accuracy of sentence segmentation.
  • an embodiment of the natural language-based sentence segmentation device in the embodiments of the present application includes:
  • an obtaining module 301 configured to obtain the to-be-processed voice data from the first business scenario, or obtain the to-be-processed text data from the second business scenario;
  • the first sentence segmentation module 302 is configured to, when the to-be-processed speech data is acquired from the first business scenario, input the to-be-processed speech data into a preset speech recognition model to generate a to-be-recognized text sequence, and to perform feature screening and sentence segmentation on the to-be-recognized text sequence in combination with a natural language processing algorithm to generate target text segmentation data, the to-be-recognized text sequence including a plurality of text characters;
  • the second sentence segmentation module 303 is configured to, when the to-be-processed text data is acquired from the second business scenario, input the to-be-processed text data into the pre-trained text segmentation model, and to perform feature screening and sentence segmentation on the to-be-processed text data in combination with the natural language processing algorithm to generate target text segmentation data;
  • a response data generation module 304 is configured to generate target response data according to the target text segmentation data and the corresponding scene configuration, and to transmit the target response data to the target terminal, the scene configuration being a scene configuration set in advance.
  • in the embodiments of the present application, the to-be-processed speech data obtained from the first business scenario is input into the speech recognition model to generate the to-be-recognized text sequence, which is segmented in combination with the natural language processing algorithm, or the to-be-processed text data obtained from the second business scenario is input into the trained text segmentation model and segmented in combination with the natural language processing algorithm; segmenting both the to-be-processed voice data of the first business scenario and the to-be-processed text data of the second business scenario improves the flexibility and accuracy of sentence segmentation.
  • another embodiment of the device for segmenting sentences based on natural language in this embodiment of the present application includes:
  • an obtaining module 301 configured to obtain the to-be-processed voice data from the first business scenario, or obtain the to-be-processed text data from the second business scenario;
  • the first sentence segmentation module 302 is configured to, when the to-be-processed speech data is acquired from the first business scenario, input the to-be-processed speech data into a preset speech recognition model to generate a to-be-recognized text sequence, and to perform feature screening and sentence segmentation on the to-be-recognized text sequence in combination with a natural language processing algorithm to generate target text segmentation data, the to-be-recognized text sequence including a plurality of text characters;
  • the second sentence segmentation module 303 is configured to, when the to-be-processed text data is acquired from the second business scenario, input the to-be-processed text data into the pre-trained text segmentation model, and to perform feature screening and sentence segmentation on the to-be-processed text data in combination with the natural language processing algorithm to generate target text segmentation data;
  • the response data generation module 304 is configured to generate target response data according to the target text segmentation data and the corresponding scene configuration, and transmit the target response data to the target terminal, where the scene configuration is a preset scene configuration.
  • the first sentence segmentation module 302 includes:
  • a feature extraction unit 3021, configured to, when acquiring the to-be-processed voice data from the first business scenario, input the to-be-processed voice data into a preset voice recognition model and perform feature extraction to generate voice signal features;
  • a first sentence segmentation unit 3022, configured to process the speech signal features to generate a text sequence to be recognized, and to perform feature screening and sentence segmentation on the to-be-recognized text sequence in combination with a natural language processing algorithm to generate target text segmentation data, the to-be-recognized text sequence including multiple target text characters.
  • the feature extraction unit 3021 may also be specifically configured to: when acquiring the to-be-processed voice data from the first business scenario, input the to-be-processed voice data into the preset voice recognition model and perform noise removal processing to generate noise-removed voice data; perform signal enhancement processing on the noise-removed voice data to generate enhanced voice data; and perform feature extraction on the enhanced voice data to generate voice signal features.
  • the first sentence segmentation unit 3022 may also be specifically configured to: input the speech signal features into the acoustic model of the speech recognition model for scoring to generate multiple acoustic model scores; input the speech signal features into the language model of the speech recognition model for scoring to generate multiple language model scores; search the multiple acoustic model scores and multiple language model scores for the highest-scoring target acoustic model score and target language model score, and determine the to-be-recognized text sequence based on the target acoustic model score and target language model score, the to-be-recognized text sequence including multiple target text characters; and segment the to-be-recognized text sequence in combination with a natural language processing algorithm to generate target text segmentation data.
  • the second sentence segmentation module 303 includes:
  • a feature screening module 3031, configured to, when acquiring the to-be-processed text data from the second business scenario, input the to-be-processed text data into the pre-trained text segmentation model and perform feature screening on the to-be-processed text data in combination with the natural language processing algorithm to generate a text observation sequence and an observation label sequence;
  • the second sentence segmentation module 3032 is configured to perform sentence segmentation based on the text observation sequence and the observation label sequence to generate target text sentence segmentation data.
  • the feature screening module 3031 may also be specifically configured to: when acquiring the to-be-processed text data from the second business scenario, input the to-be-processed text data into the embedding layer of the pre-trained text segmentation model for vector mapping to generate a vector sequence, the vector sequence not including spaces; input the vector sequence into a bidirectional long short-term memory recurrent neural network for feature screening to generate a feature-screened vector sequence; and input the feature-screened vector sequence into a conditional random field to generate a text observation sequence and an observation label sequence, where the text observation sequence includes multiple characters, the observation label sequence includes multiple observation labels, and the multiple characters correspond one-to-one to the multiple observation labels.
  • the second sentence segmentation module 3032 may also be configured to: determine whether each observation label in the observation label sequence is a preset sentence segmentation label; and if a target observation label is a segmentation label, determine the character corresponding to the target observation label as a target segmentation character, and add a preset segmentation separator after the target segmentation character to perform sentence segmentation, generating target text segmentation data.
  • in the embodiments of the present application, the to-be-processed speech data obtained from the first business scenario is input into the speech recognition model to generate the to-be-recognized text sequence, which is segmented in combination with the natural language processing algorithm, or the to-be-processed text data obtained from the second business scenario is input into the trained text segmentation model and segmented in combination with the natural language processing algorithm; segmenting both the to-be-processed voice data of the first business scenario and the to-be-processed text data of the second business scenario improves the flexibility and accuracy of sentence segmentation.
  • Figures 3 and 4 above describe the natural language-based sentence segmentation device in the embodiment of the present application in detail from the perspective of modular functional entities, and the following describes the natural language-based sentence segmentation device in the embodiment of the present application from the perspective of hardware processing.
  • the natural language-based sentence segmentation device 500 may vary greatly due to different configurations or performance, and may include one or more processors (central processing units, CPU) 510 (eg, one or more processors) and memory 520, one or more storage media 530 (eg, one or more mass storage devices) that store application programs 533 or data 532.
  • the memory 520 and the storage medium 530 may be short-term storage or persistent storage.
  • the program stored in the storage medium 530 may include one or more modules (not shown in the figure), and each module may include a series of instructions to operate on the natural language-based sentence segmentation device 500 .
  • the processor 510 may be configured to communicate with the storage medium 530 to execute a series of instruction operations in the storage medium 530 on the natural language-based sentence segmentation device 500 .
  • the natural language-based sentence segmentation device 500 may also include one or more power supplies 540, one or more wired or wireless network interfaces 550, one or more input and output interfaces 560, and/or, one or more operating systems 531, such as Windows Server, Mac OS X, Unix, Linux, FreeBSD, and more.
  • the present application also provides a natural language-based sentence segmentation device; the computer device includes a memory and a processor, the memory stores computer-readable instructions, and the computer-readable instructions, when executed by the processor, cause the processor to perform the steps of the natural language-based sentence segmentation method in the above embodiments.
  • the present application also provides a computer-readable storage medium.
  • the computer-readable storage medium may be a non-volatile computer-readable storage medium.
  • the computer-readable storage medium may also be a volatile computer-readable storage medium.
  • the computer-readable storage medium stores instructions that, when run on a computer, cause the computer to perform the steps of the natural language-based sentence segmentation method.
  • the blockchain referred to in this application is a new application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanism, and encryption algorithm.
  • a blockchain is essentially a decentralized database, a chain of data blocks generated and linked using cryptographic methods; each data block contains a batch of network transaction information, which is used to verify the validity of the information (anti-counterfeiting) and to generate the next block.
  • the blockchain can include the underlying platform of the blockchain, the platform product service layer, and the application service layer.
  • the integrated unit if implemented in the form of a software functional unit and sold or used as an independent product, may be stored in a computer-readable storage medium.
  • based on such an understanding, the technical solution of the present application, in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods described in the embodiments of the present application.
  • the aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Machine Translation (AREA)

Abstract

A natural language-based sentence segmentation method, apparatus, device and storage medium, relating to the field of big data processing and used to perform sentence segmentation with a natural language processing algorithm, thereby improving the flexibility and accuracy of sentence segmentation. The method includes: acquiring to-be-processed speech data from a first business scenario, or acquiring to-be-processed text data from a second business scenario; when the to-be-processed speech data is acquired, inputting it into a preset speech recognition model and performing feature screening and sentence segmentation in combination with a natural language processing algorithm to generate target text segmentation data; when the to-be-processed text data is acquired, inputting it into a pre-trained text segmentation model for feature screening and sentence segmentation to generate target text segmentation data; and generating target response data according to the target text segmentation data and a scene configuration. Blockchain technology is also involved: the to-be-processed speech data may be stored in a blockchain.

Description

基于自然语言的断句方法、装置、设备及存储介质
本申请要求于2020年12月23日提交中国专利局、申请号为202011538883.3、发明名称为“基于自然语言的断句方法、装置、设备及存储介质”的中国专利申请的优先权,其全部内容通过引用结合在申请中。
技术领域
本申请涉及自然语言处理技术领域,尤其涉及一种基于自然语言的断句方法、装置、设备及存储介质。
背景技术
随着人工智能技术的不断发展与应用,出现了越来越多的机器人服务场景应用,人机交互将成为新时代的常用技术。对于客服行业,无论是电话销售还是客服中心,智能客服机器人不仅可以帮助企业节约成本、降低人力成本,还能大幅度地提升工作效率,是客服人员的最佳助手。对用户话语进行断句,是客服机器人文本处理的第一步,断句会影响后续所有交互模块的准确率,因而影响客服机器人的性能。
在电话销售场景和在客服中心场景中,用户与客服通过语音信号或者文字进行交流,对用户的语音信号和文字进行断句是必备过程。目前市面上的客服机器人断句模块大多使用语音端点检测技术进行断句,即结合频域、谱熵、基频等语音特征,从连续音频信号中检测出实际语音片段的起始点和终止点,然后从终止点处断句,或者根据终止点与下一个起始点间的停顿时间长短确定断句,发明人意识到语音端点检测技术的断句方法会出现断句起始点和断句终止点识别错误的问题,从而降低断句的准确性。
发明内容
本申请提供了一种基于自然语言的断句方法、装置、设备及存储介质,用于采用自然语言处理算法进行断句,从而提高断句的灵活性和准确性。
本申请第一方面提供了一种基于自然语言的断句方法,包括:从第一业务场景中获取待处理语音数据,或者从第二业务场景中获取待处理文本数据;当从第一业务场景中获取待处理语音数据时,将所述待处理语音数据输入预置的语音识别模型中,生成待识别文本序列,并结合自然语言处理算法对所述待识别文本序列进行特征筛选与断句,生成目标文本断句数据,所述待识别文本序列包括多个文本字符;当从第二业务场景中获取待处理文本数据时,将所述待处理文本数据输入预先训练好的文本断句模型中,并结合自然语言处理算法对所述待处理文本数据进行特征筛选与断句,生成目标文本断句数据;根据所述目标文本断句数据和对应的场景配置生成目标应答数据,并将所述目标应答数据传输至目标终端,场景配置为提前设置好的场景配置。
本申请第二方面提供了一种基于自然语言的断句设备,包括存储器、处理器及存储在所述存储器上并可在所述处理器上运行的计算机可读指令,所述处理器执行所述计算机可读指令时实现如下步骤:从第一业务场景中获取待处理语音数据,或者从第二业务场景中获取待处理文本数据;当从第一业务场景中获取待处理语音数据时,将所述待处理语音数据输入预置的语音识别模型中,生成待识别文本序列,并结合自然语言处理算法对所述待识别文本序列进行特征筛选与断句,生成目标文本断句数据,所述待识别文本序列包括多个文本字符;当从第二业务场景中获取待处理文本数据时,将所述待处理文本数据输入预先训练好的文本断句模型中,并结合自然语言处理算法对所述待处理文本数据进行特征筛选与断句,生成目标文本断句数据;根据所述目标文本断句数据和对应的场景配置生成目标应答数据,并将所述目标应答数据传输至目标终端,场景配置为提前设置好的场景配置。
本申请的第三方面提供了一种计算机可读存储介质,所述计算机可读存储介质中存储计算机指令,当所述计算机指令在计算机上运行时,使得计算机执行如下步骤:从第一业 务场景中获取待处理语音数据,或者从第二业务场景中获取待处理文本数据;当从第一业务场景中获取待处理语音数据时,将所述待处理语音数据输入预置的语音识别模型中,生成待识别文本序列,并结合自然语言处理算法对所述待识别文本序列进行特征筛选与断句,生成目标文本断句数据,所述待识别文本序列包括多个文本字符;当从第二业务场景中获取待处理文本数据时,将所述待处理文本数据输入预先训练好的文本断句模型中,并结合自然语言处理算法对所述待处理文本数据进行特征筛选与断句,生成目标文本断句数据;根据所述目标文本断句数据和对应的场景配置生成目标应答数据,并将所述目标应答数据传输至目标终端,场景配置为提前设置好的场景配置。
本申请第四方面提供了一种基于自然语言的断句装置,其中,所述基于自然语言的断句装置包括:获取模块,用于从第一业务场景中获取待处理语音数据,或者从第二业务场景中获取待处理文本数据;第一断句模块,当从第一业务场景中获取待处理语音数据时,用于将所述待处理语音数据输入预置的语音识别模型中,生成待识别文本序列,并结合自然语言处理算法对所述待识别文本序列进行特征筛选与断句,生成目标文本断句数据,所述待识别文本序列包括多个文本字符;第二断句模块,当从第二业务场景中获取待处理文本数据时,用于将所述待处理文本数据输入预先训练好的文本断句模型中,并结合自然语言处理算法对所述待处理文本数据进行特征筛选与断句,生成目标文本断句数据;应答数据生成模块,用于根据所述目标文本断句数据和对应的场景配置生成目标应答数据,并将所述目标应答数据传输至目标终端,场景配置为提前设置好的场景配置。
本申请提供的技术方案中,从第一业务场景中获取待处理语音数据,或者从第二业务场景中获取待处理文本数据;当从第一业务场景中获取待处理语音数据时,将所述待处理语音数据输入预置的语音识别模型中,生成待识别文本序列,并结合自然语言处理算法对所述待识别文本序列进行特征筛选与断句,生成目标文本断句数据,所述待识别文本序列包括多个文本字符;当从第二业务场景中获取待处理文本数据时,将所述待处理文本数据输入预先训练好的文本断句模型中,并结合自然语言处理算法对所述待处理文本数据进行特征筛选与断句,生成目标文本断句数据;根据所述目标文本断句数据和对应的场景配置生成目标应答数据,并将所述目标应答数据传输至目标终端,场景配置为提前设置好的场景配置。本申请实施例中,将从第一业务场景(电话销售场景)获取的待处理语音数据输入语音识别模型中,生成待识别文本序列,并结合自然语言处理算法对待识别文本序列进行断句,或者将从第二业务场景(客户服务场景)获取的待处理文本数据输入训练好的文本断句模型中,并结合自然语言处理算法对所述待处理文本数据进行断句;通过使用自然语言处理算法对第一业务场景的待处理语音数据和对第二业务场景的待处理文本数据进行断句,提高了断句的灵活性和准确性。
附图说明
图1为本申请实施例中基于自然语言的断句方法的一个实施例示意图;
图2为本申请实施例中基于自然语言的断句方法的另一个实施例示意图;
图3为本申请实施例中基于自然语言的断句装置的一个实施例示意图;
图4为本申请实施例中基于自然语言的断句装置的另一个实施例示意图;
图5为本申请实施例中基于自然语言的断句设备的一个实施例示意图。
具体实施方式
本申请实施例提供了一种基于自然语言的断句方法、装置、设备及存储介质,用于采用自然语言处理算法进行断句,从而提高断句的灵活性和准确性。
本申请的说明书和权利要求书及上述附图中的术语“第一”、“第二”、“第三”、“第四”等(如果存在)是用于区别类似的对象,而不必用于描述特定的顺序或先后次序。应该理 解这样使用的数据在适当情况下可以互换,以便这里描述的实施例能够以除了在这里图示或描述的内容以外的顺序实施。此外,术语“包括”或“具有”及其任何变形,意图在于覆盖不排他的包含,例如,包含了一系列步骤或单元的过程、方法、系统、产品或设备不必限于清楚地列出的那些步骤或单元,而是可包括没有清楚地列出的或对于这些过程、方法、产品或设备固有的其它步骤或单元。
为便于理解,下面对本申请实施例的具体流程进行描述,请参阅图1,本申请实施例中基于自然语言的断句方法的一个实施例包括:
101、从第一业务场景中获取待处理语音数据,或者从第二业务场景中获取待处理文本数据;
服务器从第一业务场景中获取语音数据,得到待处理语音数据,或者服务器从第二业务场景中获取文本数据,得到待处理文本数据。需要强调的是,为进一步保证上述数据字段的私密和安全性,上述待处理语音数据和待处理文本数据还可以存储于一区块链的节点中。
在本实施例中,第一业务场景为电话销售场景,第二业务场景为客户服务场景,服务器从电话销售场景获取的数据为语音类型数据,即待处理语音数据,服务器从第二业务场景中获取的数据为文本类型数据,即待处理文本数据,其中,待处理语音数据可以为“请问是哪位有什么事情”、“是的我是张先生”、“有什么事情”等,待处理文本数据可以为“你好咨询个问题”、以及“知道了谢谢”等。
可以理解的是,本申请的执行主体可以为基于自然语言的断句装置,还可以是终端或者服务器,具体此处不做限定。本申请实施例以服务器为执行主体为例进行说明。
102、当从第一业务场景中获取待处理语音数据时,将待处理语音数据输入预置的语音识别模型中,生成待识别文本序列,并结合自然语言处理算法对待识别文本序列进行特征筛选与断句,生成目标文本断句数据,待识别文本序列包括多个文本字符;
当从第一业务场景中获取待处理语音数据时,服务器将待处理语音数据输入语音识别模型中进行处理,首先生成待识别文本序列,然后再结合自然语音处理算法对待识别文本序列进行断句,从而生成目标文本断句数据。
例如,当服务器从电话销售场景中获取“请问是哪位有什么事情”的待处理语音数据时,服务器将待处理语音数据输入语音识别模型中,生成待识别文本序列,其中在语音识别模型中对“请问是哪位有什么事情”的待处理语音数据进行消除噪声、信道处理、特征提取等处理,从而生成待识别文本序列[请问是哪位有什么事情],然后结合自然语言处理算法对待识别文本序列进行断句,生成目标文本断句数据为“请问是哪位,有什么事情”。
103、当从第二业务场景中获取待处理文本数据时,将待处理文本数据输入预先训练好的文本断句模型中,并结合自然语言处理算法对待处理文本数据进行特征筛选与断句,生成目标文本断句数据;
当从第二业务场景中获取待处理文本数据时,服务器将待处理文本数据输入训练好的文本断句模型中进行数据处理,然后结合自然语言处理算法对待处理文本数据进行断句处理,从而生成目标文本断句数据。
例如,当从客户服务场景中获取“你好咨询个问题”的待处理文本数据时,服务器将“你好咨询个问题”输入训练好的文本断句模型中进行数据处理,并结合自然语言处理算法对待处理文本数据进行断句,生成目标文本断句数据为“你好,咨询个问题”。
104、根据目标文本断句数据和对应的场景配置生成目标应答数据,并将目标应答数据传输至目标终端,场景配置为提前设置好的场景配置。
服务器根据目标文本断句数据生成目标应答数据,并将目标应答数据传输至目标终端, 例如,目标文本断句数据为“你好,咨询个问题”,服务器则基于“你好,咨询个问题”的目标文本断句数据和对应的场景配置生成目标应答数据“请您说明问题”,最后将“请您说明问题”的目标应答数据传输至目标终端。
本申请实施例中,将从第一业务场景(电话销售场景)获取的待处理语音数据输入语音识别模型中,生成待识别文本序列,并结合自然语言处理算法对待识别文本序列进行断句,或者将从第二业务场景(客户服务场景)获取的待处理文本数据输入训练好的文本断句模型中,并结合自然语言处理算法对待处理文本数据进行断句;通过使用自然语言处理算法对第一业务场景的待处理语音数据和对第二业务场景的待处理文本数据进行断句,提高了断句的灵活性和准确性。
请参阅图2,本申请实施例中基于自然语言的断句方法的另一个实施例包括:
201、从第一业务场景中获取待处理语音数据,或者从第二业务场景中获取待处理文本数据;
服务器从第一业务场景中获取语音数据,得到待处理语音数据,或者服务器从第二业务场景中获取文本数据,得到待处理文本数据。需要强调的是,为进一步保证上述数据字段的私密和安全性,上述待处理语音数据和待处理文本数据还可以存储于一区块链的节点中。
在本实施例中,第一业务场景为电话销售场景,第二业务场景为客户服务场景,服务器从电话销售场景获取的数据为语音类型数据,即待处理语音数据,服务器从第二业务场景中获取的数据为文本类型数据,即待处理文本数据,其中,待处理语音数据可以为“请问是哪位有什么事情”、“是的我是张先生”、“有什么事情”等,待处理文本数据可以为“你好咨询个问题”、以及“知道了谢谢”等。
202、当从第一业务场景中获取待处理语音数据时,将待处理语音数据输入预置的语音识别模型中,进行特征提取,生成语音信号特征;
例如,当服务器从电话销售场景中获取“请问是哪位有什么事情”的待处理语音数据时,服务器将“请问是哪位有什么事情”的待处理语音数据输入语音识别模块中进行特征提取,生成语音信号特征为:
Figure PCTCN2021124954-appb-000001
具体的,当从第一业务场景中获取待处理语音数据时,服务器将待处理语音数据输入预置的语音识别模型,进行噪声消除处理,生成消除噪声后的语音数据;然后服务器对消除噪声后的语音数据进行信号增强处理,生成增强处理后的语音数据;最后服务器对增强处理后的语音数据进行特征提取,生成语音信号特征。
当服务器从电话销售场景中获取“请问是哪位有什么事情”的待处理语音数据时,服务器将“请问是哪位有什么事情”输入语音识别模型中,首先对“请问是哪位有什么事情”进行噪声处理,噪声为数据中的干扰数据,即描述不准确的数据。在本实施例中采用聚类算法进行噪声处理,首先采用聚类算法将相似的待处理语音数据中的相似的样本点归为一个类簇,然后将落在该类簇之外的样本点确定为噪声点,并过滤掉这些噪声点,生成消除噪声后的语音数据;然后服务器对消除噪声后的语音数据进行信号增强处理,首先对消除噪声后的语音数据进行预加重处理,从而放大高频信号,得到放大高频信号后的语音数据,然后将放大高频信号后的语音数据拆分为短时帧信号的数据,得到拆分后的语音数据,对拆分后的语音数据添加窗口函数,生成添加窗口后的语音数据,结合傅里叶变换对添加窗口后的语音数据进行计算并进行归一化,生成增强处理后的语音数据;最后从增强处理后 的语音数据中提取特征,生成语音信号特征。
203、对语音信号特征进行处理,生成待识别文本序列,并结合自然语言处理算法对待识别文本序列进行特征筛选与断句,生成目标文本断句数据,待识别文本序列包括多个目标文本字符;
在生成语音信号特征之后,服务器对步骤202中的语音信号特征进行处理,生成待识别文本序列[请问是哪位有什么事情],最后结合自然语言处理算法对[请问是哪位有什么事情]进行断句,生成目标文本断句数据“请问是哪位,有什么事情”。
具体的,服务器将语音信号特征输入语音识别模型的声学模型中进行评分,生成多个声学模型评分;服务器将语音信号特征输入语音识别模型的语言模型中进行评分,生成多个语言模型评分,其中语言模型可以为n-gram模型、RNN模型等;然后服务器采用解码器在多个声学模型评分和多个语言模型得分中搜寻评分最高的目标声学模型评分和目标语言模型评分,将目标声学模型评分和目标语言模型评分对应的文本字符确定为目标文本字符,从而生成包括多个目标文本字符的待识别文本序列,最后结合自然语言处理算法对待识别文本序列进行断句,生成目标文本断句数据。
204、当从第二业务场景中获取待处理文本数据时,将待处理文本数据输入预先训练好的文本断句模型中,并结合自然语言处理算法对待处理文本数据进行特征筛选与断句,生成目标文本断句数据;
当从第二业务场景中获取待处理文本数据时,服务器将待处理文本数据输入训练好的文本断句模型中进行数据处理,然后结合自然语言处理算法对待处理文本数据进行断句处理,从而生成目标文本断句数据。
例如,当从客户服务场景中获取“你好咨询个问题”的待处理文本数据时,服务器将“你好咨询个问题”输入训练好的文本断句模型中进行数据处理,并结合自然语言处理算法对待处理文本数据进行断句,生成目标文本断句数据为“你好,咨询个问题”。
具体的,当从第二业务场景中获取待处理文本数据时,服务器将待处理文本数据输入预先训练好的文本断句模型中,结合自然语言处理算法对待处理文本数据进行特征筛选,生成文本观测序列和观测标签序列;服务器基于文本观测序列和观测标签序列进行断句,生成目标文本断句数据。
当服务器从客户服务场景中获取“你好咨询个问题”的待处理文本数据时,服务器将待处理文本数据输入训练好的文本断句模型中,首先结合自然语言处理算法生成[你好咨询个问题]的文本观测序列和“0100001”的观测标签序列,服务器基于“0100001”的观测标签序列对[你好咨询个问题]的文本观测序列进行断句,生成目标文本断句数据为“你好,咨询个问题”。
当从第二业务场景中获取待处理文本数据时,服务器将待处理文本数据输入预先训练好的文本断句模型中,结合自然语言处理算法对待处理文本数据进行特征筛选,生成文本观测序列和观测标签序列包括:
当从第二业务场景中获取待处理文本数据时,服务器将待处理文本数据输入预先训练好的文本断句模型的嵌入层进行向量映射,生成向量序列,向量序列不包括空格;然后服务器将向量序列输入双向长短时记忆循环神经网络,进行特征筛选,生成筛选特征后的向量序列;最后服务器将筛选特征后的向量序列输入条件随机场,生成文本观测序列和观测标签序列,文本观测序列包括多个字符,观测标签序列包括多个观测标签,多个字符与多个观测标签一一对应。
当服务器从客户服务场景中获取“你好咨询个问题”的待处理文本数据时,服务器将待处理文本数据输入嵌入层,即Embedding层进行向量映射,需要说明的是,待处理文本 数据有多个字符数据组成,经过Embedding层,服务器将待处理文本数据中的多个字符数据映射为低维空间的上的字向量,从而生成初始向量序列,并基于预置的规则,将初始向量序列中的空格过滤掉,从而生成向量序列;服务器将向量序列输入双向长短时记忆循环神经网络,即BiLSTM神经网络,该神经网络用于删除向量序列中的无用特征,具体过程为:调用矩阵标识参数,并将矩阵标识参数与向量序列进行相乘,并结合激活函数进行计算,从而得到无用特征,并将无用特征过滤掉,完成向量序列的筛选,从而生成筛选特征后的向量序列,然后将筛选特征后的向量序列输入条件随机场,即CRF层进行标签计算,从而生成[你好咨询个问题]的文本观测序列和“0100001”的观测标签序列。
服务器基于文本观测序列和观测标签序列进行断句,生成目标文本断句数据包括:服务器判断观测标签序列中的每个观测标签是否为预置的断句标签;若目标观测标签为断句标签,服务器则确定目标观测标签对应的字符为目标断句字符,并在目标断句字符后添加预置的断句分隔符进行断句,生成目标文本断句数据。
服务器对“0100001”的观测标签序列进行标签判断,假设断句标签为“1”,不断句标签为“0”,服务器则从观测标签序列中的第一位观测标签开始进行标签判断,判断每个观测标签是否为预置的断句标签“1”,判定第二位观测标签和最后一位观测标签为断句标签“1”,服务器则将第二位观测标签确定为目标断句字符,并在该目标断句字符添加断句分割符“,”,需要说明的是,在本实施例中,不在句末插入断句分隔符,因此生成目标文本断句数据为“你好,咨询个问题”。
205、根据目标文本断句数据和对应的场景配置生成目标应答数据,并将目标应答数据传输至目标终端,场景配置为提前设置好的场景配置。
服务器根据目标文本断句数据生成目标应答数据,并将目标应答数据传输至目标终端,例如,目标文本断句数据为“你好,咨询个问题”,服务器则基于“你好,咨询个问题”的目标文本断句数据和对应的场景配置生成目标应答数据“请您说明问题”,最后将“请您说明问题”的目标应答数据传输至目标终端。
本申请实施例中,将从第一业务场景(电话销售场景)获取的待处理语音数据输入语音识别模型中,生成待识别文本序列,并结合自然语言处理算法对待识别文本序列进行断句,或者将从第二业务场景(客户服务场景)获取的待处理文本数据输入训练好的文本断句模型中,并结合自然语言处理算法对待处理文本数据进行断句;通过使用自然语言处理算法对第一业务场景的待处理语音数据和对第二业务场景的待处理文本数据进行断句,提高了断句的灵活性和准确性。
上面对本申请实施例中基于自然语言的断句方法进行了描述,下面对本申请实施例中基于自然语言的断句装置进行描述,请参阅图3,本申请实施例中基于自然语言的断句装置一个实施例包括:
获取模块301,用于从第一业务场景中获取待处理语音数据,或者从第二业务场景中获取待处理文本数据;
第一断句模块302,当从第一业务场景中获取待处理语音数据时,用于将所述待处理语音数据输入预置的语音识别模型中,生成待识别文本序列,并结合自然语言处理算法对所述待识别文本序列进行特征筛选与断句,生成目标文本断句数据,所述待识别文本序列包括多个文本字符;
第二断句模块303,当从第二业务场景中获取待处理文本数据时,用于将所述待处理文本数据输入预先训练好的文本断句模型中,并结合自然语言处理算法对所述待处理文本数据进行特征筛选与断句,生成目标文本断句数据;
应答数据生成模块304,用于根据所述目标文本断句数据和对应的场景配置生成目标 应答数据,并将所述目标应答数据传输至目标终端,场景配置为提前设置好的场景配置。
本申请实施例中,将从第一业务场景(电话销售场景)获取的待处理语音数据输入语音识别模型中,生成待识别文本序列,并结合自然语言处理算法对待识别文本序列进行断句,或者将从第二业务场景(客户服务场景)获取的待处理文本数据输入训练好的文本断句模型中,并结合自然语言处理算法对所述待处理文本数据进行断句;通过使用自然语言处理算法对第一业务场景的待处理语音数据和对第二业务场景的待处理文本数据进行断句,提高了断句的灵活性和准确性。
请参阅图4,本申请实施例中基于自然语言的断句装置的另一个实施例包括:
获取模块301,用于从第一业务场景中获取待处理语音数据,或者从第二业务场景中获取待处理文本数据;
第一断句模块302,当从第一业务场景中获取待处理语音数据时,用于将所述待处理语音数据输入预置的语音识别模型中,生成待识别文本序列,并结合自然语言处理算法对所述待识别文本序列进行特征筛选与断句,生成目标文本断句数据,所述待识别文本序列包括多个文本字符;
第二断句模块303,当从第二业务场景中获取待处理文本数据时,用于将所述待处理文本数据输入预先训练好的文本断句模型中,并结合自然语言处理算法对所述待处理文本数据进行特征筛选与断句,生成目标文本断句数据;
应答数据生成模块304,用于根据所述目标文本断句数据和对应的场景配置生成目标应答数据,并将所述目标应答数据传输至目标终端,场景配置为提前设置好的场景配置。
可选的,第一断句模块302包括:
特征提取单元3021,当从第一业务场景中获取待处理语音数据时,用于将所述待处理语音数据输入预置的语音识别模型中,进行特征提取,生成语音信号特征;
第一断句单元3022,用于对所述语音信号特征进行处理,生成待识别文本序列,并结合自然语言处理算法对所述待识别文本序列进行特征筛选与断句,生成目标文本断句数据,所述待识别文本序列包括多个目标文本字符。
可选的,特征提取单元3021还可以具体用于:
当从第一业务场景中获取待处理语音数据时,将所述待处理语音数据输入预置的语音识别模型,进行噪声消除处理,生成消除噪声后的语音数据;
对所述消除噪声后的语音数据进行信号增强处理,生成增强处理后的语音数据;
对所述增强处理后的语音数据进行特征提取,生成语音信号特征。
可选的,第一断句单元3022还可以具体用于:
将所述语音信号特征输入所述语音识别模型的声学模型中进行评分,生成多个声学模型评分;
将所述语音信号特征输入所述语音识别模型的语言模型中进行评分,生成多个语言模型评分;
在所述多个声学模型评分和多个语言模型得分中搜寻评分最高的目标声学模型评分和目标语言模型评分,并基于所述目标声学模型评分和目标语言模型评分确定待识别文本序列,所述待识别文本序列包括多个目标文本字符;
结合自然语言处理算法对所述待识别文本序列进行断句,生成目标文本断句数据。
可选的,第二断句模块303包括:
特征筛选模块3031,当从第二业务场景中获取待处理文本数据时,用于将所述待处理文本数据输入预先训练好的文本断句模型中,结合自然语言处理算法对所述待处理文本数据进行特征筛选,生成文本观测序列和观测标签序列;
第二断句模块3032,用于基于所述文本观测序列和所述观测标签序列进行断句,生成目标文本断句数据。
可选的,特征筛选模块3031还可以具体用于:
当从第二业务场景中获取待处理文本数据时,将所述待处理文本数据输入预先训练好的文本断句模型的嵌入层进行向量映射,生成向量序列,所述向量序列不包括空格;
将所述向量序列输入双向长短时记忆循环神经网络,进行特征筛选,生成筛选特征后的向量序列;
将所述筛选特征后的向量序列输入条件随机场,生成文本观测序列和观测标签序列,所述文本观测序列包括多个字符,所述观测标签序列包括多个观测标签,所述多个字符与所述多个观测标签一一对应。
可选的,第二断句模块3032还可以用于:
判断所述观测标签序列中的每个观测标签是否为预置的断句标签;
若目标观测标签为断句标签,则确定所述目标观测标签对应的字符为目标断句字符,并在所述目标断句字符后添加预置的断句分隔符进行断句,生成目标文本断句数据。
本申请实施例中,将从第一业务场景(电话销售场景)获取的待处理语音数据输入语音识别模型中,生成待识别文本序列,并结合自然语言处理算法对待识别文本序列进行断句,或者将从第二业务场景(客户服务场景)获取的待处理文本数据输入训练好的文本断句模型中,并结合自然语言处理算法对所述待处理文本数据进行断句;通过使用自然语言处理算法对第一业务场景的待处理语音数据和对第二业务场景的待处理文本数据进行断句,提高了断句的灵活性和准确性。
上面图3和图4从模块化功能实体的角度对本申请实施例中的基于自然语言的断句装置进行详细描述,下面从硬件处理的角度对本申请实施例中基于自然语言的断句设备进行详细描述。
图5是本申请实施例提供的一种基于自然语言的断句设备的结构示意图,该基于自然语言的断句设备500可因配置或性能不同而产生比较大的差异,可以包括一个或一个以上处理器(central processing units,CPU)510(例如,一个或一个以上处理器)和存储器520,一个或一个以上存储应用程序533或数据532的存储介质530(例如一个或一个以上海量存储设备)。其中,存储器520和存储介质530可以是短暂存储或持久存储。存储在存储介质530的程序可以包括一个或一个以上模块(图示没标出),每个模块可以包括对基于自然语言的断句设备500中的一系列指令操作。更进一步地,处理器510可以设置为与存储介质530通信,在基于自然语言的断句设备500上执行存储介质530中的一系列指令操作。
基于自然语言的断句设备500还可以包括一个或一个以上电源540,一个或一个以上有线或无线网络接口550,一个或一个以上输入输出接口560,和/或,一个或一个以上操作系统531,例如Windows Serve,Mac OS X,Unix,Linux,FreeBSD等等。本领域技术人员可以理解,图5示出的基于自然语言的断句设备结构并不构成对基于自然语言的断句设备的限定,可以包括比图示更多或更少的部件,或者组合某些部件,或者不同的部件布置。
本申请还提供一种基于自然语言的断句设备,所述计算机设备包括存储器和处理器,存储器中存储有计算机可读指令,计算机可读指令被处理器执行时,使得处理器执行上述各实施例中的所述基于自然语言的断句方法的步骤。
本申请还提供一种计算机可读存储介质,该计算机可读存储介质可以为非易失性计算机可读存储介质,该计算机可读存储介质也可以为易失性计算机可读存储介质,所述计算 机可读存储介质中存储有指令,当所述指令在计算机上运行时,使得计算机执行所述基于自然语言的断句方法的步骤。
所属领域的技术人员可以清楚地了解到,为描述的方便和简洁,上述描述的系统,装置和单元的具体工作过程,可以参考前述方法实施例中的对应过程,在此不再赘述。
本申请所指区块链是分布式数据存储、点对点传输、共识机制、加密算法等计算机技术的新型应用模式。区块链(Blockchain),本质上是一个去中心化的数据库,是一串使用密码学方法相关联产生的数据块,每一个数据块中包含了一批次网络交易的信息,用于验证其信息的有效性(防伪)和生成下一个区块。区块链可以包括区块链底层平台、平台产品服务层以及应用服务层等。
所述集成的单元如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在一个计算机可读取存储介质中。基于这样的理解,本申请的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的全部或部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可以是个人计算机,服务器,或者网络设备等)执行本申请各个实施例所述方法的全部或部分步骤。而前述的存储介质包括:U盘、移动硬盘、只读存储器(read-only memory,ROM)、随机存取存储器(random access memory,RAM)、磁碟或者光盘等各种可以存储程序代码的介质。
以上所述,以上实施例仅用以说明本申请的技术方案,而非对其限制;尽管参照前述实施例对本申请进行了详细的说明,本领域的普通技术人员应当理解:其依然可以对前述各实施例所记载的技术方案进行修改,或者对其中部分技术特征进行等同替换;而这些修改或者替换,并不使相应技术方案的本质脱离本申请各实施例技术方案的精神和范围。

Claims (20)

  1. 一种基于自然语言的断句方法,其中,所述基于自然语言的断句方法包括:
    从第一业务场景中获取待处理语音数据,或者从第二业务场景中获取待处理文本数据;
    当从第一业务场景中获取待处理语音数据时,将所述待处理语音数据输入预置的语音识别模型中,生成待识别文本序列,并结合自然语言处理算法对所述待识别文本序列进行特征筛选与断句,生成目标文本断句数据,所述待识别文本序列包括多个文本字符;
    当从第二业务场景中获取待处理文本数据时,将所述待处理文本数据输入预先训练好的文本断句模型中,并结合自然语言处理算法对所述待处理文本数据进行特征筛选与断句,生成目标文本断句数据;
    根据所述目标文本断句数据和对应的场景配置生成目标应答数据,并将所述目标应答数据传输至目标终端,场景配置为提前设置好的场景配置。
  2. 根据权利要求1所述的基于自然语言的断句方法,其中,所述当从第一业务场景中获取待处理语音数据时,将所述待处理语音数据输入预置的语音识别模型中,生成待识别文本序列,并结合自然语言处理算法对所述待识别文本序列进行特征筛选与断句,生成目标文本断句数据,所述待识别文本序列包括多个文本字符包括:
    当从第一业务场景中获取待处理语音数据时,将所述待处理语音数据输入预置的语音识别模型中,进行特征提取,生成语音信号特征;
    对所述语音信号特征进行处理,生成待识别文本序列,并结合自然语言处理算法对所述待识别文本序列进行特征筛选与断句,生成目标文本断句数据,所述待识别文本序列包括多个目标文本字符。
  3. 根据权利要求2所述的基于自然语言的断句方法,其中,所述当从第一业务场景中获取待处理语音数据时,将所述待处理语音数据输入预置的语音识别模型中,进行特征提取,生成语音信号特征包括:
    当从第一业务场景中获取待处理语音数据时,将所述待处理语音数据输入预置的语音识别模型,进行噪声消除处理,生成消除噪声后的语音数据;
    对所述消除噪声后的语音数据进行信号增强处理,生成增强处理后的语音数据;
    对所述增强处理后的语音数据进行特征提取,生成语音信号特征。
  4. 根据权利要求2所述的基于自然语言的断句方法,其中,所述对所述语音信号特征进行处理,生成待识别文本序列,并结合自然语言处理算法对所述待识别文本序列进行特征筛选与断句,生成目标文本断句数据,所述待识别文本序列包括多个目标文本字符包括:
    将所述语音信号特征输入所述语音识别模型的声学模型中进行评分,生成多个声学模型评分;
    将所述语音信号特征输入所述语音识别模型的语言模型中进行评分,生成多个语言模型评分;
    在所述多个声学模型评分和多个语言模型得分中搜寻评分最高的目标声学模型评分和目标语言模型评分,并基于所述目标声学模型评分和目标语言模型评分确定待识别文本序列,所述待识别文本序列包括多个目标文本字符;
    结合自然语言处理算法对所述待识别文本序列进行断句,生成目标文本断句数据。
  5. 根据权利要求1所述的基于自然语言的断句方法,其中,所述当从第二业务场景中获取待处理文本数据时,将所述待处理文本数据输入预先训练好的文本断句模型中,并结合自然语言处理算法对所述待处理文本数据进行特征筛选与断句,生成目标文本断句数据包括:
    当从第二业务场景中获取待处理文本数据时,将所述待处理文本数据输入预先训练好 的文本断句模型中,结合自然语言处理算法对所述待处理文本数据进行特征筛选,生成文本观测序列和观测标签序列;
    基于所述文本观测序列和所述观测标签序列进行断句,生成目标文本断句数据。
  6. 根据权利要求5所述的基于自然语言的断句方法,其中,所述当从第二业务场景中获取待处理文本数据时,将所述待处理文本数据输入预先训练好的文本断句模型中,结合自然语言处理算法对所述待处理文本数据进行特征筛选,生成文本观测序列和观测标签序列包括:
    当从第二业务场景中获取待处理文本数据时,将所述待处理文本数据输入预先训练好的文本断句模型的嵌入层进行向量映射,生成向量序列,所述向量序列不包括空格;
    将所述向量序列输入双向长短时记忆循环神经网络,进行特征筛选,生成筛选特征后的向量序列;
    将所述筛选特征后的向量序列输入条件随机场,生成文本观测序列和观测标签序列,所述文本观测序列包括多个字符,所述观测标签序列包括多个观测标签,所述多个字符与所述多个观测标签一一对应。
  7. 根据权利要求5所述的基于自然语言的断句方法,其中,所述基于所述文本观测序列和所述观测标签序列进行断句,生成目标文本断句数据包括:
    判断所述观测标签序列中的每个观测标签是否为预置的断句标签;
    若目标观测标签为断句标签,则确定所述目标观测标签对应的字符为目标断句字符,并在所述目标断句字符后添加预置的断句分隔符进行断句,生成目标文本断句数据。
  8. A sentence segmentation device based on natural language, comprising a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, wherein the processor implements the following steps when executing the computer-readable instructions:
    acquiring voice data to be processed from a first business scenario, or acquiring text data to be processed from a second business scenario;
    when the voice data to be processed is acquired from the first business scenario, inputting the voice data to be processed into a preset speech recognition model to generate a text sequence to be recognized, and performing feature screening and sentence segmentation on the text sequence to be recognized in combination with a natural language processing algorithm to generate target text segmentation data, the text sequence to be recognized comprising a plurality of text characters;
    when the text data to be processed is acquired from the second business scenario, inputting the text data to be processed into a pre-trained text segmentation model, and performing feature screening and sentence segmentation on the text data to be processed in combination with a natural language processing algorithm to generate target text segmentation data;
    generating target response data according to the target text segmentation data and a corresponding scenario configuration, and transmitting the target response data to a target terminal, the scenario configuration being a scenario configuration set in advance.
  9. The sentence segmentation device based on natural language according to claim 8, wherein the processor further implements the following steps when executing the computer program:
    when the voice data to be processed is acquired from the first business scenario, inputting the voice data to be processed into the preset speech recognition model for feature extraction to generate speech signal features;
    processing the speech signal features to generate the text sequence to be recognized, and performing feature screening and sentence segmentation on the text sequence to be recognized in combination with the natural language processing algorithm to generate the target text segmentation data, the text sequence to be recognized comprising a plurality of target text characters.
  10. The sentence segmentation device based on natural language according to claim 9, wherein the processor further implements the following steps when executing the computer program:
    when the voice data to be processed is acquired from the first business scenario, inputting the voice data to be processed into the preset speech recognition model for noise elimination processing to generate noise-eliminated voice data;
    performing signal enhancement processing on the noise-eliminated voice data to generate enhanced voice data;
    performing feature extraction on the enhanced voice data to generate the speech signal features.
  11. The sentence segmentation device based on natural language according to claim 9, wherein the processor further implements the following steps when executing the computer program:
    inputting the speech signal features into an acoustic model of the speech recognition model for scoring to generate a plurality of acoustic model scores;
    inputting the speech signal features into a language model of the speech recognition model for scoring to generate a plurality of language model scores;
    searching the plurality of acoustic model scores and the plurality of language model scores for a target acoustic model score and a target language model score with the highest scores, and determining the text sequence to be recognized based on the target acoustic model score and the target language model score, the text sequence to be recognized comprising a plurality of target text characters;
    performing sentence segmentation on the text sequence to be recognized in combination with the natural language processing algorithm to generate the target text segmentation data.
  12. The sentence segmentation device based on natural language according to claim 9, wherein the processor further implements the following steps when executing the computer program:
    when the text data to be processed is acquired from the second business scenario, inputting the text data to be processed into a pre-trained text segmentation model, and performing feature screening on the text data to be processed in combination with a natural language processing algorithm to generate a text observation sequence and an observation label sequence;
    performing sentence segmentation based on the text observation sequence and the observation label sequence to generate target text segmentation data.
  13. The sentence segmentation device based on natural language according to claim 12, wherein the processor further implements the following steps when executing the computer program:
    when the text data to be processed is acquired from the second business scenario, inputting the text data to be processed into an embedding layer of the pre-trained text segmentation model for vector mapping to generate a vector sequence, the vector sequence not including spaces;
    inputting the vector sequence into a bidirectional long short-term memory recurrent neural network for feature screening to generate a feature-screened vector sequence;
    inputting the feature-screened vector sequence into a conditional random field to generate the text observation sequence and the observation label sequence, the text observation sequence comprising a plurality of characters, the observation label sequence comprising a plurality of observation labels, and the plurality of characters corresponding one-to-one to the plurality of observation labels.
  14. The sentence segmentation device based on natural language according to claim 12, wherein the processor further implements the following steps when executing the computer program:
    judging whether each observation label in the observation label sequence is a preset segmentation label;
    if a target observation label is a segmentation label, determining that a character corresponding to the target observation label is a target segmentation character, and adding a preset segmentation separator after the target segmentation character for sentence segmentation to generate the target text segmentation data.
  15. A computer-readable storage medium storing computer instructions, wherein when the computer instructions are run on a computer, the computer is caused to perform the following steps:
    acquiring voice data to be processed from a first business scenario, or acquiring text data to be processed from a second business scenario;
    when the voice data to be processed is acquired from the first business scenario, inputting the voice data to be processed into a preset speech recognition model to generate a text sequence to be recognized, and performing feature screening and sentence segmentation on the text sequence to be recognized in combination with a natural language processing algorithm to generate target text segmentation data, the text sequence to be recognized comprising a plurality of text characters;
    when the text data to be processed is acquired from the second business scenario, inputting the text data to be processed into a pre-trained text segmentation model, and performing feature screening and sentence segmentation on the text data to be processed in combination with a natural language processing algorithm to generate target text segmentation data;
    generating target response data according to the target text segmentation data and a corresponding scenario configuration, and transmitting the target response data to a target terminal, the scenario configuration being a scenario configuration set in advance.
  16. The computer-readable storage medium according to claim 15, wherein the computer-readable storage medium stores computer instructions, and when the computer instructions are run on a computer, the computer is caused to perform the following steps:
    when the voice data to be processed is acquired from the first business scenario, inputting the voice data to be processed into the preset speech recognition model for feature extraction to generate speech signal features;
    processing the speech signal features to generate the text sequence to be recognized, and performing feature screening and sentence segmentation on the text sequence to be recognized in combination with the natural language processing algorithm to generate the target text segmentation data, the text sequence to be recognized comprising a plurality of target text characters.
  17. The computer-readable storage medium according to claim 16, wherein the computer-readable storage medium stores computer instructions, and when the computer instructions are run on a computer, the computer is caused to perform the following steps:
    when the voice data to be processed is acquired from the first business scenario, inputting the voice data to be processed into the preset speech recognition model for noise elimination processing to generate noise-eliminated voice data;
    performing signal enhancement processing on the noise-eliminated voice data to generate enhanced voice data;
    performing feature extraction on the enhanced voice data to generate the speech signal features.
  18. The computer-readable storage medium according to claim 16, wherein the computer-readable storage medium stores computer instructions, and when the computer instructions are run on a computer, the computer is caused to perform the following steps:
    inputting the speech signal features into an acoustic model of the speech recognition model for scoring to generate a plurality of acoustic model scores;
    inputting the speech signal features into a language model of the speech recognition model for scoring to generate a plurality of language model scores;
    searching the plurality of acoustic model scores and the plurality of language model scores for a target acoustic model score and a target language model score with the highest scores, and determining the text sequence to be recognized based on the target acoustic model score and the target language model score, the text sequence to be recognized comprising a plurality of target text characters;
    performing sentence segmentation on the text sequence to be recognized in combination with the natural language processing algorithm to generate the target text segmentation data.
  19. The computer-readable storage medium according to claim 15, wherein the computer-readable storage medium stores computer instructions, and when the computer instructions are run on a computer, the computer is caused to perform the following steps:
    when the text data to be processed is acquired from the second business scenario, inputting the text data to be processed into a pre-trained text segmentation model, and performing feature screening on the text data to be processed in combination with a natural language processing algorithm to generate a text observation sequence and an observation label sequence;
    performing sentence segmentation based on the text observation sequence and the observation label sequence to generate target text segmentation data.
  20. A sentence segmentation apparatus based on natural language, wherein the sentence segmentation apparatus based on natural language comprises:
    an acquisition module configured to acquire voice data to be processed from a first business scenario, or acquire text data to be processed from a second business scenario;
    a first segmentation module configured to, when the voice data to be processed is acquired from the first business scenario, input the voice data to be processed into a preset speech recognition model to generate a text sequence to be recognized, and perform feature screening and sentence segmentation on the text sequence to be recognized in combination with a natural language processing algorithm to generate target text segmentation data, the text sequence to be recognized comprising a plurality of text characters;
    a second segmentation module configured to, when the text data to be processed is acquired from the second business scenario, input the text data to be processed into a pre-trained text segmentation model, and perform feature screening and sentence segmentation on the text data to be processed in combination with a natural language processing algorithm to generate target text segmentation data;
    a response data generation module configured to generate target response data according to the target text segmentation data and a corresponding scenario configuration, and transmit the target response data to a target terminal, the scenario configuration being a scenario configuration set in advance.
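
The sketches below are illustrative only and do not form part of the claims. As a minimal picture of the claim-1 flow, the Python sketch routes voice and text input and builds a response; `asr_model`, `segmentation_model`, `build_response` and the scenario names are assumed placeholder interfaces, not anything defined by the present application.

```python
def segment_and_reply(payload, scenario, asr_model, segmentation_model, build_response, scenario_config):
    """Illustrative sketch of the claim-1 flow; every model object here is an assumed interface."""
    if scenario == "first_business_scenario":
        # Voice data to be processed: transcribe it first, then segment the text sequence.
        text_sequence = asr_model.transcribe(payload)
        target_segmentation = segmentation_model.segment(text_sequence)
    else:
        # Second business scenario: text data to be processed is segmented directly.
        target_segmentation = segmentation_model.segment(payload)
    # Build the target response data from the segmentation result and the preset scenario configuration.
    return build_response(target_segmentation, scenario_config)
```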
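A possible reading of the claim-3 pipeline (noise elimination, signal enhancement, feature extraction) is sketched below with librosa. The spectral-subtraction denoising, pre-emphasis enhancement and MFCC features are assumptions chosen for illustration; the claim does not prescribe these particular techniques.

```python
import numpy as np
import librosa

def extract_speech_features(wav_path, sr=16000, n_mfcc=13):
    """Sketch of claim 3: denoise, enhance, then extract speech signal features."""
    y, sr = librosa.load(wav_path, sr=sr)

    # Noise elimination (illustrative): spectral subtraction using the first ~0.2 s
    # of the recording as a noise-floor estimate.
    stft = librosa.stft(y)
    magnitude, phase = np.abs(stft), np.angle(stft)
    noise_frames = max(1, int(0.2 * sr / 512))            # 512 = default hop length
    noise_floor = magnitude[:, :noise_frames].mean(axis=1, keepdims=True)
    magnitude_denoised = np.maximum(magnitude - noise_floor, 0.0)
    y_denoised = librosa.istft(magnitude_denoised * np.exp(1j * phase))

    # Signal enhancement (illustrative): simple pre-emphasis filter.
    y_enhanced = np.append(y_denoised[0], y_denoised[1:] - 0.97 * y_denoised[:-1])

    # Feature extraction: MFCCs stand in for the "speech signal features".
    return librosa.feature.mfcc(y=y_enhanced, sr=sr, n_mfcc=n_mfcc)
```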
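The scoring step of claim 4 can be thought of as rescoring candidate transcriptions with an acoustic model and a language model and keeping the best-scoring one. In the sketch below, `acoustic_model.score` and `language_model.score` are hypothetical interfaces, and the log-domain weighted sum is a common convention rather than something the claim specifies.

```python
def pick_best_transcription(features, candidates, acoustic_model, language_model, lm_weight=0.8):
    """Return the candidate text with the highest combined acoustic + language model score."""
    best_text, best_score = None, float("-inf")
    for text in candidates:
        am_score = acoustic_model.score(features, text)   # log P(features | text), assumed interface
        lm_score = language_model.score(text)             # log P(text), assumed interface
        combined = am_score + lm_weight * lm_score
        if combined > best_score:
            best_text, best_score = text, combined
    return best_text
```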
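The embedding layer, bidirectional long short-term memory network and conditional random field of claims 6 and 13 correspond to a standard BiLSTM-CRF tagger. The PyTorch sketch below is one plausible realisation; it assumes the third-party `pytorch-crf` package for the CRF layer, and the layer sizes and label inventory are illustrative choices, not values taken from the application.

```python
import torch.nn as nn
from torchcrf import CRF  # third-party `pytorch-crf` package, assumed available

class BiLstmCrfSegmenter(nn.Module):
    """Embedding -> BiLSTM feature screening -> CRF decoding, one label per character."""

    def __init__(self, vocab_size, num_labels, embed_dim=128, hidden_dim=256):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.bilstm = nn.LSTM(embed_dim, hidden_dim // 2, batch_first=True, bidirectional=True)
        self.emission = nn.Linear(hidden_dim, num_labels)
        self.crf = CRF(num_labels, batch_first=True)

    def forward(self, char_ids, mask):
        vectors = self.embedding(char_ids)      # vector mapping of the character sequence
        screened, _ = self.bilstm(vectors)      # bidirectional feature screening
        emissions = self.emission(screened)
        # CRF decoding yields the observation label sequence, one label per character.
        return self.crf.decode(emissions, mask=mask)
```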
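Claims 7 and 14 amount to walking the character/label pairs and inserting the preset separator wherever a label equals the preset segmentation label. A minimal sketch, assuming "SEG" as the segmentation label and an ideographic comma as the separator (both arbitrary illustrative choices):

```python
def apply_segmentation(characters, labels, segmentation_label="SEG", separator="，"):
    """Insert the preset separator after each character whose observation label is the segmentation label."""
    pieces = []
    for character, label in zip(characters, labels):
        pieces.append(character)
        if label == segmentation_label:
            pieces.append(separator)
    return "".join(pieces)

# Illustrative call:
# apply_segmentation(list("今天天气不错我们出去走走"), ["O"] * 5 + ["SEG"] + ["O"] * 5 + ["SEG"])
# -> "今天天气不错，我们出去走走，"
```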
PCT/CN2021/124954 2020-12-23 2021-12-22 基于自然语言的断句方法、装置、设备及存储介质 WO2022134798A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011538883.3A CN112711939A (zh) 2020-12-23 2020-12-23 基于自然语言的断句方法、装置、设备及存储介质
CN202011538883.3 2020-12-23

Publications (1)

Publication Number Publication Date
WO2022134798A1 true WO2022134798A1 (zh) 2022-06-30

Family

ID=75543784

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/124954 WO2022134798A1 (zh) 2020-12-23 2021-12-22 基于自然语言的断句方法、装置、设备及存储介质

Country Status (2)

Country Link
CN (1) CN112711939A (zh)
WO (1) WO2022134798A1 (zh)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112711939A (zh) * 2020-12-23 2021-04-27 深圳壹账通智能科技有限公司 基于自然语言的断句方法、装置、设备及存储介质
CN113825082B (zh) * 2021-09-19 2024-06-11 武汉左点科技有限公司 一种用于缓解助听延迟的方法及装置
CN114265918B (zh) * 2021-12-01 2024-08-23 北京捷通华声科技股份有限公司 文本切分方法、装置及电子设备
CN114420102B (zh) * 2022-01-04 2022-10-14 广州小鹏汽车科技有限公司 语音断句方法、装置、电子设备及存储介质

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10635751B1 (en) * 2019-05-23 2020-04-28 Capital One Services, Llc Training systems for pseudo labeling natural language
CN111737991A (zh) * 2020-07-01 2020-10-02 携程计算机技术(上海)有限公司 文本断句位置的识别方法及系统、电子设备及存储介质
CN111753524A (zh) * 2020-07-01 2020-10-09 携程计算机技术(上海)有限公司 文本断句位置的识别方法及系统、电子设备及存储介质
CN111816165A (zh) * 2020-07-07 2020-10-23 北京声智科技有限公司 语音识别方法、装置及电子设备
CN112711939A (zh) * 2020-12-23 2021-04-27 深圳壹账通智能科技有限公司 基于自然语言的断句方法、装置、设备及存储介质

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117473047A (zh) * 2023-12-26 2024-01-30 深圳市明源云客电子商务有限公司 业务文本生成方法、装置、电子设备及可读存储介质
CN117473047B (zh) * 2023-12-26 2024-04-12 深圳市明源云客电子商务有限公司 业务文本生成方法、装置、电子设备及可读存储介质

Also Published As

Publication number Publication date
CN112711939A (zh) 2021-04-27

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21908799

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 30.10.2023)

122 Ep: pct application non-entry in european phase

Ref document number: 21908799

Country of ref document: EP

Kind code of ref document: A1