CN110162176A - Method and apparatus for mining voice instructions, terminal, and computer-readable medium - Google Patents

Method and apparatus for mining voice instructions, terminal, and computer-readable medium

Info

Publication number
CN110162176A
Authority
CN
China
Prior art keywords
voice instruction
voice fragment
voice
fragment
attribute
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910419367.XA
Other languages
Chinese (zh)
Other versions
CN110162176B (en)
Inventor
孙俊岭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201910419367.XA
Publication of CN110162176A
Application granted
Publication of CN110162176B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3343 Query execution using phonetics
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011 Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/451 Execution arrangements for user interfaces
    • G06F9/453 Help systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Machine Translation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present disclosure provides a method for mining voice instructions. The method comprises: cutting each voice instruction, according to its intention, into a preset quantity of voice fragments by attribute; selecting, from the voice fragments, voice fragments to be spliced that satisfy a preset condition, wherein the preset condition comprises that the selected fragments come from at least two voice instructions, have different attributes, and are equal in number to the preset quantity; and splicing the voice fragments to be spliced into a generalized voice instruction. The disclosure further provides an apparatus for mining voice instructions, a terminal, and a computer-readable medium.

Description

Method and apparatus for mining voice instructions, terminal, and computer-readable medium
Technical field
Embodiments of the present disclosure relate to the field of database technology, and in particular to a method and apparatus for mining voice instructions, a terminal, and a computer-readable medium.
Background
With the continuous development of Internet and Internet of Things technologies, human-computer interaction has seen new breakthroughs, and the related technologies are now widely used in many fields.
Human-computer interaction techniques are techniques that enable an efficient dialogue between a person and a server through the server's input and output devices. The machine uses output or display devices to provide the person with relevant information and prompts, and the person uses input devices to supply relevant information to the machine, answer questions, and respond to prompts. The server includes terminals, computers, and the like.
The core of human-computer interaction is that the server can recognize a voice instruction issued by a user and either feed back a corresponding voice response or perform a corresponding action. In the prior art, this is mainly achieved by building a voice instruction database in advance. For example, voice instructions issued by a large number of users are collected, and a response voice or a corresponding operation instruction is constructed for each voice instruction, so that the server can execute the user's voice instruction based on the operation instruction.
Summary of the invention
Embodiments of the present disclosure provide a method and apparatus for mining voice instructions, a terminal, and a computer-readable medium.
In a first aspect, an embodiment of the present disclosure provides a method for mining voice instructions, comprising:
cutting each voice instruction, according to the intention of that voice instruction, into a preset quantity of voice fragments by attribute;
selecting, from the voice fragments, voice fragments to be spliced that satisfy a preset condition, wherein the preset condition comprises: the fragments come from at least two voice instructions, their attributes are different, and their number is equal to the preset quantity;
splicing the voice fragments to be spliced into a generalized voice instruction.
In some embodiments, after cutting each voice instruction, according to its intention, into a preset quantity of voice fragments by attribute, the method further comprises:
training on at least one voice fragment obtained by cutting, according to a preset deep learning model, to obtain voice fragments semantically related to it;
generating a candidate voice fragment set from the voice fragments obtained by cutting each voice instruction and the voice fragments obtained by training;
and selecting, from the voice fragments, the voice fragments to be spliced that satisfy the preset condition comprises:
selecting, from the candidate voice fragment set, the voice fragments to be spliced that satisfy the preset condition.
In some embodiments, after generating the candidate voice fragment set, the method further comprises:
deduplicating the candidate voice fragment set;
and selecting, from the voice fragments, the voice fragments to be spliced that satisfy the preset condition comprises:
selecting, from the deduplicated candidate voice fragment set, the voice fragments to be spliced that satisfy the preset condition.
In some embodiments, deduplicating the candidate voice fragment set comprises:
counting the number of times each voice fragment occurs in the candidate voice fragment set;
sorting the voice fragments in the candidate voice fragment set by the number of times each voice fragment occurs;
deduplicating the sorted candidate voice fragment set.
In some embodiments, cutting each voice instruction, according to its intention, into a preset quantity of voice fragments by attribute comprises:
determining the type of each voice instruction according to the intention of that voice instruction;
cutting each voice instruction, according to its type, into a preset quantity of voice fragments by attribute.
In some embodiments, if the voice instruction is a map voice instruction of the navigation type, the attributes include travel mode, behavior, and point of interest;
if the voice instruction is a map voice instruction of the function type, the attributes include behavior and point of interest.
In a second aspect, an embodiment of the present disclosure provides an apparatus for mining voice instructions, comprising:
a cutting module, configured to cut each voice instruction, according to the intention of that voice instruction, into a preset quantity of voice fragments by attribute;
a selection module, configured to select, from the voice fragments, voice fragments to be spliced that satisfy a preset condition, wherein the preset condition comprises: the fragments come from at least two voice instructions, their attributes are different, and their number is equal to the preset quantity;
a splicing module, configured to splice the voice fragments to be spliced into a generalized voice instruction.
In some embodiments, the apparatus further comprises: a training module, configured to train on at least one voice fragment obtained by cutting, according to a preset deep learning model, to obtain voice fragments semantically related to it;
a generation module, configured to generate a candidate voice fragment set from the voice fragments obtained by cutting each voice instruction and the voice fragments obtained by training;
the selection module being specifically configured to select, from the candidate voice fragment set, the voice fragments to be spliced that satisfy the preset condition.
In some embodiments, the apparatus further comprises:
a deduplication module, configured to deduplicate the candidate voice fragment set;
the selection module being specifically configured to select, from the deduplicated candidate voice fragment set, the voice fragments to be spliced that satisfy the preset condition.
In some embodiments, the deduplication module is specifically configured to:
count the number of times each voice fragment occurs in the candidate voice fragment set;
sort the voice fragments in the candidate voice fragment set by the number of times each voice fragment occurs;
deduplicate the sorted candidate voice fragment set.
In some embodiments, the cutting module is specifically configured to:
determine the type of each voice instruction according to the intention of that voice instruction;
cut each voice instruction, according to its type, into a preset quantity of voice fragments by attribute.
In some embodiments, if the voice instruction is a map voice instruction of the navigation type, the attributes include travel mode, behavior, and point of interest;
if the voice instruction is a map voice instruction of the function type, the attributes include behavior and point of interest.
In a third aspect, an embodiment of the present disclosure provides a terminal, comprising:
one or more processors;
a storage device storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method of any of the above embodiments.
In a fourth aspect, an embodiment of the present disclosure provides a computer-readable medium storing a computer program, wherein the program, when executed by a processor, implements the method of any of the above embodiments.
In the method for mining voice instructions provided by the embodiments of the present disclosure, each voice instruction is cut, according to its intention, into a preset quantity of voice fragments by attribute; voice fragments to be spliced that satisfy a preset condition are selected from the voice fragments, wherein the preset condition comprises that the fragments come from at least two voice instructions, have different attributes, and are equal in number to the preset quantity; and the voice fragments to be spliced are spliced into a generalized voice instruction. With this technical solution, a small sample of voice instructions is split and recombined to obtain far more voice instructions than the sample contains. This generalizes and diversifies the voice instructions, which helps meet user needs and improves user experience.
Brief description of the drawings
The accompanying drawings provide a further understanding of the embodiments of the present disclosure and constitute a part of the specification. Together with the embodiments, they serve to explain the disclosure and do not limit it. The above and other features and advantages will become more apparent to those skilled in the art from the detailed description of example embodiments with reference to the drawings, in which:
Fig. 1 is a schematic diagram of a method for mining voice instructions according to an embodiment of the present disclosure;
Fig. 2 is a schematic diagram of a method for mining voice instructions according to another embodiment of the present disclosure;
Fig. 3 is a schematic diagram of deduplicating the candidate voice fragment set according to an embodiment of the present disclosure;
Fig. 4 is a schematic diagram of a method for cutting voice instructions according to an embodiment of the present disclosure;
Fig. 5 is a schematic diagram of an apparatus for mining voice instructions according to an embodiment of the present disclosure;
Fig. 6 is a schematic diagram of an apparatus for mining voice instructions according to another embodiment of the present disclosure;
Fig. 7 is a schematic diagram of an apparatus for mining voice instructions according to another embodiment of the present disclosure;
Reference numerals:
1, cutting module; 2, selection module; 3, splicing module; 4, training module; 5, generation module; 6, deduplication module.
Detailed description
To help those skilled in the art better understand the technical solution of the present invention, the method and apparatus for mining voice instructions, the terminal, and the computer-readable medium provided by the present invention are described in detail below with reference to the accompanying drawings.
Example embodiments are described more fully below with reference to the accompanying drawings, but they may be embodied in different forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that the disclosure is thorough and complete and fully conveys its scope to those skilled in the art.
As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
The terminology used herein is for describing particular embodiments only and is not intended to limit the disclosure. As used herein, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "made of", when used in this specification, specify the presence of the stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
The embodiments described herein may be described with reference to plan views and/or cross-sectional views that are idealized schematic illustrations of the disclosure. Accordingly, the example illustrations may be modified according to manufacturing techniques and/or tolerances. The embodiments are therefore not limited to those shown in the drawings but include modifications of configurations formed based on manufacturing processes. The regions illustrated in the drawings are schematic in nature, and their shapes illustrate specific shapes of regions of elements but are not intended to be limiting.
Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art. It will be further understood that terms such as those defined in commonly used dictionaries should be interpreted as having a meaning consistent with their meaning in the context of the related art and the present disclosure, and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
According to one aspect of the embodiments of the present disclosure, a method for mining voice instructions is provided.
Referring to Fig. 1, Fig. 1 is a schematic diagram of a method for mining voice instructions according to an embodiment of the present disclosure.
As shown in Fig. 1, the method comprises:
S1: according to the intention of each voice instruction, cutting each voice instruction into a preset quantity of voice fragments by attribute.
Here, the intention reflects the user's need.
It can be understood that one voice instruction corresponds to one intention. Voice instructions come in many kinds, such as human-computer interaction voice instructions, query voice instructions, and map voice instructions. These are only a few common kinds listed as examples and should not be understood as limiting the scope of the embodiments of the present disclosure.
For example, parsing the human-computer interaction voice instruction "play Jay Chou's Dong Feng Po" shows that the user wants to listen to Jay Chou's (Zhou Jielun's) version of the song "Dong Feng Po". Here "play" has the behavior attribute, "Jay Chou" has the person attribute, and "Dong Feng Po" has the song attribute.
For example, parsing the query voice instruction "Gorky's biography" shows that the user wants information about Gorky. Here "Gorky" has the person attribute and "biography" has the information attribute.
For example, parsing the map voice instruction "navigate and go to place A" shows that the user wants to reach place A by navigation. Here "navigate" has the behavior attribute, "go to" has the behavior attribute, and "place A" has the destination attribute. Of course, "go to place A" may also be treated as a single attribute.
In this step, each voice instruction is cut according to its intention and attributes, so that multiple voice fragments are obtained for each voice instruction.
Illustratively, there are 1,000 voice instructions in total (obtained online or from a preset memory). Parsing the 1,000 voice instructions yields 1,000 intentions, one per voice instruction.
The voice instructions are cut according to their intentions. If each voice instruction is cut into three voice fragments, cutting the 1,000 voice instructions yields 3,000 voice fragments.
S2: selecting, from the voice fragments, voice fragments to be spliced that satisfy a preset condition.
The preset condition comprises: the fragments come from at least two voice instructions, their attributes are different, and their number is equal to the preset quantity.
In this step, several voice fragments are selected from all the voice fragments as the voice fragments to be spliced, so that they can subsequently be spliced to obtain a generalized voice instruction.
Illustratively, voice instruction A is cut into voice fragments A1, A2, and A3, voice instruction B is cut into voice fragments B1, B2, and B3, and voice instruction C is cut into voice fragments C1, C2, and C3. A1, A2, and C3 may be selected as the voice fragments to be spliced, or A1, B2, and C3 may be selected.
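The preset condition of S2 can be sketched as a small predicate. This is a minimal illustration rather than the patented implementation: the `Fragment` record, its field names, and the assumption that the fragment index (1, 2, 3) identifies the attribute are all hypothetical and do not appear in the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    text: str       # fragment label, e.g. "A1"
    attribute: str  # attribute the fragment was cut by (hypothetical naming)
    source: str     # voice instruction the fragment was cut from


def meets_preset_condition(selection, preset_quantity):
    """Check the three parts of the preset condition described for S2."""
    sources = {f.source for f in selection}
    attributes = [f.attribute for f in selection]
    return (
        len(sources) >= 2                            # from at least two instructions
        and len(set(attributes)) == len(attributes)  # attributes all different
        and len(selection) == preset_quantity        # count equals preset quantity
    )


# Assumed mapping: fragment index n of an instruction carries attribute "attr{n}".
A1 = Fragment("A1", "attr1", "A")
A2 = Fragment("A2", "attr2", "A")
A3 = Fragment("A3", "attr3", "A")
B1 = Fragment("B1", "attr1", "B")
B2 = Fragment("B2", "attr2", "B")
C3 = Fragment("C3", "attr3", "C")

print(meets_preset_condition([A1, A2, C3], 3))  # selection from the example above
print(meets_preset_condition([A1, B2, C3], 3))  # second selection from the example
```

Under these assumptions, a selection drawn entirely from one instruction (A1, A2, A3) or one that repeats an attribute (A1, B1, C3) would be rejected.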
S3: splicing the voice fragments to be spliced into a generalized voice instruction.
After the voice fragments to be spliced are spliced, the three original voice instructions, namely voice instruction A (composed of A1, A2, and A3), voice instruction B (composed of B1, B2, and B3), and voice instruction C (composed of C1, C2, and C3), yield far more than three voice instructions (for example, the voice instruction composed of A1, A2, and B3).
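To make "far more than three" concrete, the following sketch enumerates every splice for the A, B, C example. It assumes, as an illustration only, that the n-th fragment of every instruction fills the same attribute slot, so a splice takes one fragment per slot; the function name `generalize` is hypothetical.

```python
from itertools import product

instructions = {
    "A": ("A1", "A2", "A3"),
    "B": ("B1", "B2", "B3"),
    "C": ("C1", "C2", "C3"),
}


def generalize(instructions):
    """Return every one-fragment-per-slot splice that is not an original instruction."""
    # Regroup fragments by attribute slot: ("A1", "B1", "C1"), ("A2", "B2", "C2"), ...
    slots = list(zip(*instructions.values()))
    originals = set(instructions.values())
    # A splice that differs from every original necessarily mixes
    # fragments from at least two instructions.
    return [combo for combo in product(*slots) if combo not in originals]


generalized = generalize(instructions)
print(len(generalized))  # 24
```

With three slots of three candidates each there are 3 × 3 × 3 = 27 possible splices; removing the 3 originals leaves 24 generalized instructions, one of which is (A1, A2, B3).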
As noted above, the prior art mainly relies on collecting voice instructions, for example by obtaining massive numbers of voice instructions from different terminals (such as mobile terminals and vehicle-mounted terminals), in order to meet the needs of a large number of users.
In this embodiment, by contrast, voice instructions are cut into multiple voice fragments, voice fragments to be spliced are determined from those fragments, and the voice fragments to be spliced are then spliced. A small sample of voice instructions is thus split and recombined to obtain far more voice instructions than the sample contains. This generalizes and diversifies the voice instructions, which helps meet user needs and improves user experience.
In some embodiments, the method includes a step of obtaining voice instructions before S1.
Specifically, the voice instructions may be obtained online or from a memory (or a database, etc.).
After the voice instructions are obtained, the method further includes a step of deduplicating them, which reduces the computation needed for subsequently parsing their intentions, saves resources, and improves efficiency.
Illustratively, 1,000 voice instructions are obtained in total. The number of occurrences of each voice instruction is counted, and voice instructions whose count is greater than two are deduplicated.
As shown in Fig. 2, in some embodiments, after S1 the method further comprises:
S1': training on at least one voice fragment obtained by cutting, according to a preset deep learning model, to obtain voice fragments semantically related to it.
The deep learning model may be implemented with a neural network model from the prior art, which is not limited here.
This step extends the voice fragments: training on a voice fragment with the deep learning model yields voice fragments that are semantically related to it.
Illustratively, training on voice fragment A1 with the deep learning model yields voice fragments A1-1 and A1-2, which are semantically related to A1.
For how the deep learning model is built, reference may be made to the prior art; details are not repeated here.
S2': generating a candidate voice fragment set from the voice fragments obtained by cutting each voice instruction and the voice fragments obtained by training.
That is, the candidate voice fragment set includes both the voice fragments obtained by cutting each voice instruction and the voice fragments obtained by training.
Illustratively, if cutting the voice instructions yields 1,000 voice fragments, and training on 50 of them with the deep learning model yields 200 related voice fragments, the candidate voice fragment set contains 1,200 voice fragments in total.
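The construction of the candidate set in S2' can be sketched as follows. The source does not specify the deep learning model, so the sketch replaces it with a hypothetical `find_related` callable backed by a lookup table; only the way the two groups of fragments are merged is illustrated.

```python
def build_candidate_set(cut_fragments, find_related):
    """Merge fragments obtained by cutting with their semantically related fragments."""
    candidates = list(cut_fragments)
    for fragment in cut_fragments:
        # find_related stands in for the deep learning model (an assumption).
        candidates.extend(find_related(fragment))
    return candidates


# Stub for the model: only A1 has related fragments, as in the A1 example above.
related_table = {"A1": ["A1-1", "A1-2"]}
candidates = build_candidate_set(
    ["A1", "A2", "A3"], lambda f: related_table.get(f, [])
)
print(candidates)  # ['A1', 'A2', 'A3', 'A1-1', 'A1-2']
```

The 1,000 + 200 = 1,200 count in the example follows the same pattern: if 50 of the 1,000 cut fragments each yield 4 related fragments (one way to reach 200), the merged set holds 1,200 entries before deduplication.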
As can be seen from the above, in the embodiments of the present disclosure, extending the voice fragments extends the voice instructions. That is, combining the voice fragments with semantically related voice fragments makes the generalization of the voice instructions richer and the resulting voice instructions more diverse.
S2 then comprises: selecting, from the candidate voice fragment set, the voice fragments to be spliced that satisfy the preset condition.
In some embodiments, after S2' the method further comprises:
S3': deduplicating the candidate voice fragment set.
The candidate voice fragment set may be deduplicated using a deduplication method from the prior art; details are not repeated here.
This deduplication reduces the computation needed for subsequently parsing the intentions of voice instructions, saves resources, and improves efficiency.
S2 then specifically comprises: selecting, from the deduplicated candidate voice fragment set, the voice fragments to be spliced that satisfy the preset condition.
As shown in Fig. 3, in some embodiments, S3' comprises:
S3'-1: counting the number of times each voice fragment occurs in the candidate voice fragment set.
S3'-2: sorting the voice fragments in the candidate voice fragment set by the number of times each voice fragment occurs.
S3'-3: deduplicating the sorted candidate voice fragment set.
In the embodiments of the present disclosure, the voice fragments are sorted by the number of times each one occurs, and deduplication is then performed on the sorted candidate voice fragment set. On the one hand, removing duplicate voice fragments from a sorted set makes deduplication efficient. On the other hand, counting every voice fragment beforehand avoids missing some voice fragments, which makes deduplication accurate and complete.
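Steps S3'-1 to S3'-3 can be sketched with the standard library. The source does not state the sort direction or how ties are handled, so sorting by descending count with the fragment text as a tie-breaker is an assumption of this sketch.

```python
from collections import Counter


def deduplicate(fragments):
    """Count, sort, then drop repeated fragments from a candidate set."""
    counts = Counter(fragments)  # S3'-1: occurrences per fragment
    # S3'-2: most frequent first (assumed order); ties broken by text so that
    # equal fragments always end up next to each other.
    ordered = sorted(fragments, key=lambda f: (-counts[f], f))
    unique = []
    for fragment in ordered:     # S3'-3: keep the first of each adjacent run
        if not unique or unique[-1] != fragment:
            unique.append(fragment)
    return unique


print(deduplicate(["go to", "by car", "go to", "place A", "go to", "by car"]))
# ['go to', 'by car', 'place A']
```

Because the sort places identical fragments side by side, the final pass only needs to compare each fragment with the last one kept.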
As shown in Fig. 4, in some embodiments, S1 comprises:
S1-1: determining the type of each voice instruction according to the intention of that voice instruction.
For example, the types of human-computer interaction voice instructions include an application type (e.g., "turn on the air conditioner") and an action type (e.g., "play Jay Chou's Dong Feng Po").
The types of map voice instructions include a navigation type (e.g., "go to place A by car") and a function type (e.g., "open the trip assistant").
S1-2: according to the type of each voice instruction, cutting each voice instruction into a preset quantity of voice fragments by attribute.
For example, when the voice instruction is a human-computer interaction voice instruction of the action type, specifically "play Jay Chou's Dong Feng Po", the voice fragments after cutting are "play" with the behavior attribute, "Jay Chou" with the person attribute, and "Dong Feng Po" with the song attribute.
Map voice instructions are now described in detail. For other voice instructions (such as human-computer interaction voice instructions and query voice instructions), reference may be made to the description of map voice instructions; they are not enumerated here.
If the map voice instruction is of the navigation type, the attributes include travel mode, behavior, and point of interest.
That is, a map voice instruction of the navigation type is cut into three voice fragments according to travel mode, behavior, and point of interest.
A point of interest (POI) is an information point, such as a scenic spot, government agency, company, shopping mall, or restaurant on an electronic map.
Specifically, travel modes include but are not limited to driving, taking a car, public transit, cycling, and walking. Behaviors include but are not limited to go to, head toward, pass through, and open. Points of interest include but are not limited to food and the trip assistant.
Illustratively, the map voice instruction "go to place A by car" is parsed to obtain its intention. From the intention, the travel mode of the map voice instruction is driving, the behavior is going, and the point of interest is place A.
Based on this example, cutting the map voice instruction "go to place A by car" yields three voice fragments: "by car" (travel mode), "go to" (behavior), and "place A" (point of interest).
In some embodiments, if the map voice instruction is of the function type, the attributes include behavior and point of interest.
A map voice instruction of the function type is then cut into two voice fragments according to behavior and point of interest.
Illustratively, the map voice instruction "open the trip assistant" is parsed to obtain its intention. From the intention, the behavior of the map voice instruction is opening and the point of interest is the trip assistant.
Based on this example, cutting the map voice instruction "open the trip assistant" yields two voice fragments: "open" (behavior) and "trip assistant" (point of interest).
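The type-dependent cutting of S1-2 can be sketched as a lookup from instruction type to attribute list. The source does not describe how the intention parser segments an instruction, so this sketch takes the already segmented parts as input; `ATTRIBUTES_BY_TYPE`, `cut`, and the attribute keys are hypothetical names.

```python
# Attributes per map voice instruction type, following the description above.
ATTRIBUTES_BY_TYPE = {
    "navigation": ("travel_mode", "behavior", "poi"),
    "function": ("behavior", "poi"),
}


def cut(instruction_type, parts):
    """Label pre-segmented parts of an instruction with the attributes of its type."""
    attributes = ATTRIBUTES_BY_TYPE[instruction_type]
    if len(parts) != len(attributes):
        # The preset quantity depends on the type: three fragments or two.
        raise ValueError(
            f"{instruction_type} instructions need {len(attributes)} parts, "
            f"got {len(parts)}"
        )
    return dict(zip(attributes, parts))


print(cut("navigation", ["by car", "go to", "place A"]))
print(cut("function", ["open", "trip assistant"]))
```

Keeping the attribute list in a table means a new instruction type only needs a new table entry, with its own preset quantity implied by the number of attributes.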
In some embodiments, a list may be preset for each attribute, for example a travel mode list, a behavior list and a point-of-interest list. When a map voice instruction is cut into two or three voice fragments, each voice fragment is assigned to the list for its attribute.
For example, the map voice instruction "drive to Place A" is cut into three voice fragments: "drive", "to" and "Place A". The voice fragment "drive" is stored in the travel mode list, the voice fragment "to" in the behavior list, and the voice fragment "Place A" in the point-of-interest list.
Similarly, the map voice instruction "cycle to Place A" is cut into three voice fragments: "cycle", "to" and "Place A". The voice fragment "cycle" is stored in the travel mode list, the voice fragment "to" in the behavior list, and the voice fragment "Place A" in the point-of-interest list.
Similarly, the map voice instruction "bus toward Place B" is cut into three voice fragments: "bus", "toward" and "Place B". The voice fragment "bus" is stored in the travel mode list, the voice fragment "toward" in the behavior list, and the voice fragment "Place B" in the point-of-interest list.
Similarly, the map voice instruction "open trip assistant" is cut into two voice fragments: "open" and "trip assistant". The voice fragment "open" is stored in the behavior list, and the voice fragment "trip assistant" in the point-of-interest list.
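The four examples above can be collected into per-attribute lists as in the sketch below. The attribute keys and the English glosses of the fragments are assumptions made for illustration.

```python
from collections import defaultdict

# Cut results for the four example instructions, as (attribute, fragment) pairs.
cut_results = [
    [("travel_mode", "drive"), ("behavior", "to"), ("point_of_interest", "Place A")],
    [("travel_mode", "cycle"), ("behavior", "to"), ("point_of_interest", "Place A")],
    [("travel_mode", "bus"), ("behavior", "toward"), ("point_of_interest", "Place B")],
    [("behavior", "open"), ("point_of_interest", "trip assistant")],
]

# One list per attribute; each fragment is appended to the list for its attribute.
attribute_lists = defaultdict(list)
for fragments in cut_results:
    for attribute, text in fragments:
        attribute_lists[attribute].append(text)

for attribute, texts in attribute_lists.items():
    print(attribute, texts)
```

At this stage the lists still contain repeated fragments such as "to" and "Place A"; removing them is a separate step.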
Voice fragments from different lists are spliced to obtain a complete map voice instruction. Specifically:
"bus" from the travel mode list, "to" from the behavior list and "Place A" from the point-of-interest list are spliced to obtain the complete map voice instruction "bus to Place A".
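A minimal sketch of splicing is shown below, assuming each fragment is stored together with the index of the instruction it was cut from. Taking one fragment from each list guarantees that the attributes differ and that the fragment count equals the preset number; the extra check keeps only combinations drawn from at least two source instructions. The function name `splice` and the space separator are illustrative choices.

```python
from itertools import product

# Each entry is (fragment text, index of the source instruction).
travel_modes = [("drive", 0), ("cycle", 1), ("bus", 2)]
behaviors = [("to", 0), ("to", 1), ("toward", 2)]
points_of_interest = [("Place A", 0), ("Place A", 1), ("Place B", 2)]


def splice(lists, separator=" "):
    """Return instructions built from one fragment per attribute list,
    keeping only combinations drawn from at least two source instructions."""
    results = set()
    for combination in product(*lists):
        sources = {source for _, source in combination}
        if len(sources) >= 2:
            results.add(separator.join(text for text, _ in combination))
    return results


generalized = splice([travel_modes, behaviors, points_of_interest])
print(sorted(generalized))
```

With these inputs, "bus toward Place B" is not produced because all three of its fragments come from the same source instruction, whereas "bus to Place A" is.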
It can be understood that when many map voice instructions are acquired, some of them may be identical. Therefore, before the map voice instructions are parsed and cut, the method includes a step of deduplicating the acquired map voice instructions, which reduces energy consumption and processing load.
The step of deduplicating the map voice instructions includes: counting the number of times each map voice instruction occurs, sorting the map voice instructions by that number, and deduplicating the sorted map voice instructions.
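The count, sort and deduplicate sequence can be sketched with `collections.Counter`. The text does not state the sort direction; descending order of occurrence is an assumption here, and `deduplicate` is a hypothetical name.

```python
from collections import Counter


def deduplicate(instructions):
    """Count occurrences, sort by count (most frequent first), keep one copy each."""
    counts = Counter(instructions)
    # most_common() returns (item, count) pairs ordered by descending count.
    return [text for text, _ in counts.most_common()]


acquired = [
    "drive to Place A",
    "open trip assistant",
    "drive to Place A",
    "bus toward Place B",
    "open trip assistant",
    "drive to Place A",
]
print(deduplicate(acquired))
```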
It should be noted that the voice fragments of different map voice instructions may be partly identical; in the example above, two of the map voice instructions both contain the voice fragment "to". Therefore, the embodiments of the present disclosure further include a step of deduplicating each of the three lists. Details are not repeated here.
Based on the above example, since map voice instructions can be divided into a navigation type and a function type, the behavior list can be divided into a travel behavior sublist and a function behavior sublist.
Similarly, the point-of-interest list can be divided into a travel point-of-interest sublist and a function point-of-interest sublist.
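One way to realize the sublists is to key each list by both instruction type and attribute, as sketched below. The key layout and names are assumptions; the text only states that the lists can be divided by type.

```python
from collections import defaultdict

# (instruction type, fragments by attribute) for two example instructions.
cut_results = [
    ("navigation", {"travel_mode": "drive", "behavior": "to", "point_of_interest": "Place A"}),
    ("function", {"behavior": "open", "point_of_interest": "trip assistant"}),
]

# Sublists keyed by (instruction type, attribute).
sublists = defaultdict(list)
for instruction_type, fragments in cut_results:
    for attribute, text in fragments.items():
        sublists[(instruction_type, attribute)].append(text)

print(sublists[("navigation", "behavior")])
print(sublists[("function", "behavior")])
```

Keeping the sublists apart means a function behavior such as "open" is never spliced with a travel point of interest such as "Place A".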
According to another aspect of the embodiments of the present disclosure, a mining device for voice instructions is provided.
Referring to Fig. 5, Fig. 5 is a schematic diagram of a mining device for voice instructions according to an embodiment of the present disclosure.
As shown in Fig. 5, the device includes:
A cutting module 1, configured to cut each voice instruction into a preset number of voice fragments according to attributes, based on the intent of each voice instruction.
A selection module 2, configured to select, from the voice fragments, voice fragments to be spliced that satisfy a preset condition, wherein the preset condition includes: the fragments originate from at least two voice instructions, their attributes differ, and their number equals the preset number.
A splicing module 3, configured to splice the voice fragments to be spliced into a generalized voice instruction.
Referring to Fig. 6, in some embodiments the device further includes:
A training module 4, configured to train on at least one voice fragment obtained by cutting, using a preset deep learning model, to obtain voice fragments semantically related to it.
A generation module 5, configured to generate a candidate set of voice fragments from the voice fragments cut from each voice instruction and the voice fragments obtained by training.
The selection module 2 is specifically configured to select, from the candidate set of voice fragments, the voice fragments to be spliced that satisfy the preset condition.
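The generation of the candidate set can be sketched as below. The deep learning model is not specified in the text, so it is replaced here by a hypothetical callable backed by a fixed lookup table; the assumption that a related fragment inherits the attribute of the fragment it was derived from is also illustrative.

```python
def build_candidate_set(cut_fragments, related_fragments):
    """Merge cut fragments with model-suggested related fragments.

    `related_fragments` stands in for the trained model: any callable that
    maps a fragment's text to a list of semantically related texts.
    Related fragments are assumed to inherit the attribute of their source.
    """
    candidates = list(cut_fragments)
    for attribute, text in cut_fragments:
        candidates.extend((attribute, related) for related in related_fragments(text))
    return candidates


# Fixed lookup table used only to make the sketch runnable; not a real model.
RELATED = {"drive": ["by car"], "to": ["go to"]}

candidates = build_candidate_set(
    [("travel_mode", "drive"), ("behavior", "to"), ("point_of_interest", "Place A")],
    lambda text: RELATED.get(text, []),
)
print(candidates)
```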
Referring to Fig. 7, in some embodiments the device further includes:
A deduplication module 6, configured to deduplicate the candidate set of voice fragments.
The selection module 2 is specifically configured to select, from the deduplicated candidate set of voice fragments, the voice fragments to be spliced that satisfy the preset condition.
In some embodiments, the deduplication module 6 is specifically configured to:
Count the number of times each voice fragment in the candidate set of voice fragments occurs.
Sort the voice fragments in the candidate set of voice fragments by the number of times each voice fragment occurs.
Deduplicate the sorted candidate set of voice fragments.
In some embodiments, the cutting module 1 is specifically configured to:
Determine the type of each voice instruction according to the intent of each voice instruction.
Cut each voice instruction into a preset number of voice fragments according to attributes, based on the type of each voice instruction.
In some embodiments, if the voice instruction is a navigation-type map voice instruction, the attributes include a travel mode, a behavior and a point of interest;
If the voice instruction is a function-type map voice instruction, the attributes include a behavior and a point of interest.
According to another aspect of the embodiments of the present disclosure, a terminal is provided, comprising:
One or more processors;
A storage device storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method described in any one of the above embodiments.
According to another aspect of the embodiments of the present disclosure, a computer-readable medium is provided, storing a computer program, wherein the program, when executed by a processor, implements the method described in any one of the above embodiments.
Those of ordinary skill in the art will appreciate that all or some of the steps of the methods disclosed above, and the functional modules/units of the systems and devices, may be implemented as software, firmware, hardware, or any suitable combination thereof. In a hardware implementation, the division between the functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be performed by several physical components in cooperation. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor or microprocessor, or as hardware, or as an integrated circuit, such as an application-specific integrated circuit. Such software may be distributed on computer-readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). As is known to those of ordinary skill in the art, the term computer storage media includes volatile and non-volatile, removable and non-removable media implemented in any method or technology for storing information (such as computer-readable instructions, data structures, program modules or other data). Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technologies, CD-ROM, digital versatile discs (DVD) or other optical disc storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and that can be accessed by a computer. In addition, as is known to those of ordinary skill in the art, communication media typically contain computer-readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery media.
Example embodiments have been disclosed herein, and although specific terms are employed, they are used and should be interpreted in a generic and descriptive sense only, and not for purposes of limitation. In some instances, as will be apparent to those skilled in the art, features, characteristics and/or elements described in connection with a particular embodiment may be used alone or in combination with features, characteristics and/or elements described in connection with other embodiments, unless expressly stated otherwise. Accordingly, those skilled in the art will understand that various changes in form and detail may be made without departing from the scope of the present disclosure as set forth in the appended claims.

Claims (14)

1. A method for mining voice instructions, comprising:
cutting each voice instruction into a preset number of voice fragments according to attributes, based on the intent of each voice instruction;
selecting, from the voice fragments, voice fragments to be spliced that satisfy a preset condition, wherein the preset condition includes: the fragments originate from at least two voice instructions, their attributes differ, and their number equals the preset number;
splicing the voice fragments to be spliced into a generalized voice instruction.
2. The method according to claim 1, wherein, after cutting each voice instruction into a preset number of voice fragments according to attributes based on the intent of each voice instruction, the method further comprises:
training on at least one voice fragment obtained by cutting, using a preset deep learning model, to obtain voice fragments semantically related to it;
generating a candidate set of voice fragments from the voice fragments cut from each voice instruction and the voice fragments obtained by training;
and wherein selecting, from the voice fragments, the voice fragments to be spliced that satisfy the preset condition comprises:
selecting, from the candidate set of voice fragments, the voice fragments to be spliced that satisfy the preset condition.
3. The method according to claim 2, wherein, after generating the candidate set of voice fragments, the method further comprises:
deduplicating the candidate set of voice fragments;
and wherein selecting, from the voice fragments, the voice fragments to be spliced that satisfy the preset condition comprises:
selecting, from the deduplicated candidate set of voice fragments, the voice fragments to be spliced that satisfy the preset condition.
4. The method according to claim 3, wherein deduplicating the candidate set of voice fragments comprises:
counting the number of times each voice fragment in the candidate set of voice fragments occurs;
sorting the voice fragments in the candidate set of voice fragments by the number of times each voice fragment occurs;
deduplicating the sorted candidate set of voice fragments.
5. The method according to any one of claims 1 to 4, wherein cutting each voice instruction into a preset number of voice fragments according to attributes, based on the intent of each voice instruction, comprises:
determining the type of each voice instruction according to the intent of each voice instruction;
cutting each voice instruction into a preset number of voice fragments according to attributes, based on the type of each voice instruction.
6. The method according to claim 5, wherein:
if the voice instruction is a navigation-type map voice instruction, the attributes include a travel mode, a behavior and a point of interest;
if the voice instruction is a function-type map voice instruction, the attributes include a behavior and a point of interest.
7. A device for mining voice instructions, comprising:
a cutting module, configured to cut each voice instruction into a preset number of voice fragments according to attributes, based on the intent of each voice instruction;
a selection module, configured to select, from the voice fragments, voice fragments to be spliced that satisfy a preset condition, wherein the preset condition includes: the fragments originate from at least two voice instructions, their attributes differ, and their number equals the preset number;
a splicing module, configured to splice the voice fragments to be spliced into a generalized voice instruction.
8. The device according to claim 7, further comprising:
a training module, configured to train on at least one voice fragment obtained by cutting, using a preset deep learning model, to obtain voice fragments semantically related to it;
a generation module, configured to generate a candidate set of voice fragments from the voice fragments cut from each voice instruction and the voice fragments obtained by training;
wherein the selection module is specifically configured to select, from the candidate set of voice fragments, the voice fragments to be spliced that satisfy the preset condition.
9. The device according to claim 8, further comprising:
a deduplication module, configured to deduplicate the candidate set of voice fragments;
wherein the selection module is specifically configured to select, from the deduplicated candidate set of voice fragments, the voice fragments to be spliced that satisfy the preset condition.
10. The device according to claim 9, wherein the deduplication module is specifically configured to:
count the number of times each voice fragment in the candidate set of voice fragments occurs;
sort the voice fragments in the candidate set of voice fragments by the number of times each voice fragment occurs;
deduplicate the sorted candidate set of voice fragments.
11. The device according to any one of claims 7 to 10, wherein the cutting module is specifically configured to:
determine the type of each voice instruction according to the intent of each voice instruction;
cut each voice instruction into a preset number of voice fragments according to attributes, based on the type of each voice instruction.
12. The device according to claim 11, wherein:
if the voice instruction is a navigation-type map voice instruction, the attributes include a travel mode, a behavior and a point of interest;
if the voice instruction is a function-type map voice instruction, the attributes include a behavior and a point of interest.
13. A terminal, comprising:
one or more processors;
a storage device storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method according to any one of claims 1 to 6.
14. A computer-readable medium storing a computer program, wherein the program, when executed by a processor, implements the method according to any one of claims 1 to 6.
CN201910419367.XA 2019-05-20 2019-05-20 Voice instruction mining method and device, terminal and computer readable medium Active CN110162176B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910419367.XA CN110162176B (en) 2019-05-20 2019-05-20 Voice instruction mining method and device, terminal and computer readable medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910419367.XA CN110162176B (en) 2019-05-20 2019-05-20 Voice instruction mining method and device, terminal and computer readable medium

Publications (2)

Publication Number Publication Date
CN110162176A true CN110162176A (en) 2019-08-23
CN110162176B CN110162176B (en) 2022-04-26

Family

ID=67631591

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910419367.XA Active CN110162176B (en) 2019-05-20 2019-05-20 Voice instruction mining method and device, terminal and computer readable medium

Country Status (1)

Country Link
CN (1) CN110162176B (en)

Also Published As

Publication number Publication date
CN110162176B (en) 2022-04-26

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant