CN107894972A - Session marking method and apparatus, aggregate server, and storage medium - Google Patents
- Publication number: CN107894972A
- Application number: CN201711130201.3A
- Authority: CN (China)
- Prior art keywords: session, information, tokens, instruction, session information
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/103—Formatting, i.e. changing of presentation of documents
- G06F40/117—Tagging; Marking up; Designating a block; Setting of attributes
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
Abstract
The present invention discloses a session marking method and apparatus, an aggregate server, and a storage medium. The method includes: obtaining, according to a predetermined session identifier, first session information and second session information corresponding to a session to be marked, where the first session information includes input voice information and the text information formed by performing speech recognition on the input voice information, and the second session information includes that recognized text information and output text information; aggregating the first session information and the second session information according to the session identifier; receiving a session marking instruction fed back by a user, where the session marking instruction includes a first session marking instruction or a second session marking instruction; and marking the aggregated first session information and second session information according to the first session marking instruction and the second session marking instruction, respectively. The method reduces the labor required for back-end data marking in a voice dialogue system and improves data-processing efficiency.
Description
Technical field
The present invention relates to the technical field of Internet applications, and in particular to a session marking method and apparatus, an aggregate server, and a storage medium.
Background art
With the rapid development of artificial intelligence, intelligent robots based on voice dialogue systems are being applied in many fields. Through natural-language dialogue they can provide functions such as audio-visual entertainment, information lookup, daily-life services, and travel and traffic information.
At present, limited by existing speech and semantic technologies, machines' speech recognition and semantic understanding capabilities still need continual improvement. Regularly collecting the interaction data between users and machines, and verifying and marking that data, is therefore a prerequisite for improving a voice dialogue system. In the prior art, a user cannot report an erroneous interaction result after using the voice dialogue system, so the system's back-end staff must periodically verify users' interaction data by hand: they repeatedly check and screen users' input speech item by item and mark the erroneous interaction data, which is then used to retrain and improve the voice dialogue system.
However, because voice dialogue systems are widely applied and have many users, the volume of user interaction data is enormous. The prior-art approach, in which back-end staff manually and repeatedly screen and mark massive amounts of data, therefore has high labor costs and low processing efficiency.
Summary of the invention
Embodiments of the present invention provide a session marking method and apparatus, an aggregate server, and a storage medium that can mark interaction data according to user feedback, reducing the labor required for back-end data marking in a voice dialogue system and improving data-processing efficiency.
In a first aspect, an embodiment of the present invention provides a session marking method applied to an aggregate server, the method including:
obtaining, according to a predetermined session identifier, first session information and second session information corresponding to a session to be marked, where the first session information includes input voice information and text information formed by performing speech recognition on the input voice information, and the second session information includes the text information formed by performing speech recognition on the input voice information and output text information;
aggregating the first session information and the second session information according to the session identifier;
receiving a session marking instruction fed back by a user, where the session marking instruction includes a first session marking instruction or a second session marking instruction; and
marking the aggregated first session information and second session information according to the first session marking instruction and the second session marking instruction, respectively.
In a second aspect, an embodiment of the present invention provides a session marking apparatus, the apparatus including an acquisition module, an aggregation module, a receiving module, and a marking module, where:
the acquisition module is configured to obtain, according to a predetermined session identifier, first session information and second session information corresponding to a session to be marked, where the first session information includes input voice information and text information formed by performing speech recognition on the input voice information, and the second session information includes the text information formed by performing speech recognition on the input voice information and output text information;
the aggregation module is configured to aggregate the first session information and the second session information according to the session identifier;
the receiving module is configured to receive a session marking instruction fed back by a user, where the session marking instruction includes a first session marking instruction or a second session marking instruction; and
the marking module is configured to mark the aggregated first session information and second session information according to the first session marking instruction and the second session marking instruction, respectively.
In a third aspect, an embodiment of the present invention provides an aggregate server, including:
one or more processors; and
a memory for storing one or more programs,
where the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the session marking method described in any embodiment of the present invention.
In a fourth aspect, an embodiment of the present invention provides a storage medium storing a computer program that, when executed by a processor, implements the session marking method described in any embodiment of the present invention.
The embodiments of the present invention propose a session marking method and apparatus, an aggregate server, and a storage medium. The aggregate server can receive a session marking instruction fed back by a user and then mark the first session information and the second session information according to that instruction. Existing session marking methods, by contrast, mark the session information in the voice server and the session information in the semantic server manually, one item at a time. Compared with the prior art, the proposed method, apparatus, aggregate server, and storage medium can therefore mark session information according to user feedback, which reduces labor input and improves marking efficiency. In addition, the technical solution of the embodiments is simple and convenient to implement, easy to popularize, and widely applicable.
Brief description of the drawings
Fig. 1 is a flowchart of a session marking method provided by Embodiment one of the present invention;
Fig. 2 is a flowchart of a session marking method provided by Embodiment two of the present invention;
Fig. 3 is a schematic structural diagram of a session marking apparatus provided by Embodiment three of the present invention;
Fig. 4 is a schematic structural diagram of an aggregate server provided by Embodiment four of the present invention.
Detailed description
The present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the present invention and do not limit it. It should also be noted that, for ease of description, the drawings show only the parts related to the present invention rather than the entire structure.
Embodiment one
Fig. 1 is a flowchart of a session marking method provided by Embodiment one of the present invention. This embodiment is applicable to collecting human-machine interaction data in a voice dialogue system. The method may be performed by a session marking apparatus, which may be implemented in software and/or hardware. Referring to Fig. 1, the method includes the following steps:
S110: Obtain, according to a predetermined session identifier, the first session information and the second session information corresponding to the session to be marked. The first session information includes the input voice information and the text information formed by performing speech recognition on the input voice information; the second session information includes the text information formed by performing speech recognition on the input voice information and the output text information.
In a specific embodiment of the present invention, the session identifier is identifying information that the voice server generates when it receives the user's original dialogue speech (the input voice information) and that corresponds uniquely to that input voice information. The first session information is the session information received and processed by the voice server: the input voice information and the text information that the voice server forms by performing speech recognition on it. The recognized text is assigned the same unique session identifier as the corresponding input voice information, so that the session identifier associates the input voice information with the text formed by speech recognition.
The second session information is the session information received and processed by the semantic server: the output text information and the text information formed by the voice server's speech recognition. The output text information is the text that the semantic server forms after performing semantic parsing and result fulfillment on the recognized text. Result fulfillment here means that, after receiving the recognized text, the semantic server can either query a local resource repository for a result corresponding to the content of that text, or perform a web search based on that content to obtain a corresponding search result in real time, and then form the output text information from the query result or search result that satisfies the recognized text. The output text information is assigned the same unique session identifier as the corresponding recognized text, so that the session identifier associates the output text information with the text formed by speech recognition.
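The result-fulfillment step described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the names `LOCAL_STORE`, `fulfil`, and `fake_web_search` are hypothetical, and trying the local repository first with a web-search fallback is an assumption taken from the weather example later in this embodiment.

```python
from typing import Callable, Optional

# Hypothetical local resource repository: recognized text -> stored answer.
LOCAL_STORE = {
    "what is the capital of China": "The capital of China is Beijing.",
}


def fulfil(recognized_text: str,
           web_search: Callable[[str], Optional[str]]) -> Optional[str]:
    """Return output text for the recognized text, or None if nothing matches."""
    answer = LOCAL_STORE.get(recognized_text)
    if answer is None:
        # Not available locally (for example real-time data): use web search.
        answer = web_search(recognized_text)
    return answer


def fake_web_search(query: str) -> Optional[str]:
    # Stand-in for a real search backend; it only knows the weather example.
    if query == "how is the weather in Beijing":
        return "Sunny, 10 degrees C, northwest wind force 3-4"
    return None


weather = fulfil("how is the weather in Beijing", fake_web_search)
capital = fulfil("what is the capital of China", fake_web_search)
```

Passing the search function in as a parameter keeps the sketch self-contained; a real semantic server would call its own search service here.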
During a session interaction, the aggregate server uses the session identifier corresponding to the voice to be marked to actively obtain, in real time, the first session information and the second session information of that session from the voice server and the semantic server respectively, for later organization and marking.
For example, when operating a robot that integrates the voice dialogue system of this embodiment, the user can speak the content they want to query or search for into the robot's attached microphone, for instance the input speech "How is the weather in Beijing". The voice server receives this input voice information and first generates a session identifier that corresponds uniquely to it, such as 001. It then performs speech recognition on the input voice information to form the text "How is the weather in Beijing", and assigns that recognized text the same unique session identifier as the corresponding input voice information, namely 001. The semantic server receives the recognized text and the session identifier and performs semantic understanding on the text. Once it has correctly recognized that the user wants the current weather in Beijing, the semantic server cannot obtain the real-time weather by querying its local resource repository, so it performs a web search and obtains the corresponding result. Suppose the search returns the current Beijing weather as sunny, 10 °C, northwest wind force 3-4. The semantic server then forms the output text "Sunny, 10 °C, northwest wind force 3-4", which satisfies the recognized text, and assigns it the same session identifier, 001, as the corresponding recognized text. At this point all interaction data from the voice session carries the same session identifier, 001. Using the session identifier 001 of the voice to be marked, the aggregate server therefore actively obtains, in real time, the interaction data with session identifier 001 from the voice server and the semantic server respectively: the input voice information "How is the weather in Beijing", the recognized text "How is the weather in Beijing", and the output text "Sunny, 10 °C, northwest wind force 3-4", for use in later organization and marking.
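The way a single session identifier ties the voice server's data to the semantic server's data can be sketched as follows. This is an illustrative sketch under stated assumptions: the in-memory dictionaries, the function names, and the zero-padded counter used to produce identifiers such as 001 are all hypothetical, and the recognizer and fulfillment step are passed in as stand-in functions.

```python
import itertools

_counter = itertools.count(1)

voice_store = {}     # session identifier -> first session information
semantic_store = {}  # session identifier -> second session information


def new_session_id() -> str:
    # Hypothetical scheme: zero-padded counter, matching the "001" example.
    return f"{next(_counter):03d}"


def voice_server_handle(audio: bytes, recognize) -> str:
    """Generate the identifier, recognize the speech, and store both under it."""
    sid = new_session_id()
    voice_store[sid] = {"input_voice": audio, "recognized_text": recognize(audio)}
    return sid


def semantic_server_handle(sid: str, recognized_text: str, fulfil) -> None:
    """Store the output text under the same identifier as the recognized text."""
    semantic_store[sid] = {
        "recognized_text": recognized_text,
        "output_text": fulfil(recognized_text),
    }


def fetch_by_session_id(sid: str):
    """What the aggregate server pulls for one session to be marked."""
    return voice_store[sid], semantic_store[sid]


sid = voice_server_handle(b"<audio bytes>", lambda a: "how is the weather in Beijing")
semantic_server_handle(sid, voice_store[sid]["recognized_text"],
                       lambda t: "Sunny, 10 degrees C, northwest wind force 3-4")
first_info, second_info = fetch_by_session_id(sid)
```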
S120: Aggregate the first session information and the second session information according to the session identifier.
In a specific embodiment of the present invention, after actively obtaining the first session information and the second session information of the session from the voice server and the semantic server respectively, the aggregate server obtains from the voice server the session identifier of the session currently to be marked. Based on the session identifier attached to each piece of information, it aggregates the first session information and the second session information whose identifier matches that of the session to be marked: the session identifier, the input voice information, the text formed by speech recognition, and the output text information are assembled into a four-tuple, namely (session identifier, input voice information, text formed by speech recognition, output text information). The user's interaction data from one session is thereby linked together for use in later marking and display.
Continuing the example above, the input voice information "How is the weather in Beijing", the recognized text "How is the weather in Beijing", and the output text "Sunny, 10 °C, northwest wind force 3-4" all carry the session identifier 001, so they are aggregated into the four-tuple (001, voice "How is the weather in Beijing", text "How is the weather in Beijing", text "Sunny, 10 °C, northwest wind force 3-4").
Aggregating the interaction data of one session not only makes the data easier to organize; in later data marking and screening it also makes the data easier to query and errors easier to locate, which improves data-processing efficiency.
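The four-tuple built in S120 can be represented directly as a named tuple. The sketch below is illustrative only: the type name `SessionRecord`, the dictionary field names, and the decision to raise an error on an identifier mismatch are assumptions, not details given in the patent.

```python
from typing import NamedTuple


class SessionRecord(NamedTuple):
    """(session identifier, input voice, recognized text, output text)."""
    session_id: str
    input_voice: bytes
    recognized_text: str
    output_text: str


def aggregate(session_id: str, first_info: dict, second_info: dict) -> SessionRecord:
    # Only information carrying the identifier of the session to be marked
    # may be merged; a mismatch is treated as an error in this sketch.
    if first_info["session_id"] != session_id or second_info["session_id"] != session_id:
        raise ValueError("session identifier mismatch")
    return SessionRecord(
        session_id,
        first_info["input_voice"],
        first_info["recognized_text"],
        second_info["output_text"],
    )


record = aggregate(
    "001",
    {"session_id": "001", "input_voice": b"<audio bytes>",
     "recognized_text": "How is the weather in Beijing"},
    {"session_id": "001", "recognized_text": "How is the weather in Beijing",
     "output_text": "Sunny, 10 degrees C, northwest wind force 3-4"},
)
```

Because the recognized text appears in both the first and the second session information, the four-tuple stores it once.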
S130: Receive the session marking instruction fed back by the user, where the session marking instruction includes a first session marking instruction or a second session marking instruction.
In a specific embodiment of the present invention, the output text information produced by the semantic server is passed back to the voice server, which converts it into corresponding speech using speech synthesis and plays it to the user through a device such as the robot's attached loudspeaker, so that the user receives a spoken response to their input. Based on that spoken response, the user can then tell the robot whether they are satisfied with it; that is, the aggregate server receives the session marking instruction fed back by the user.
In the example above, the voice server converts the output text "Sunny, 10 °C, northwest wind force 3-4" into the corresponding spoken response using speech synthesis. After the robot plays it, the user hears the response and learns the current weather in Beijing, so the user is satisfied with the robot's response. The user may either take no action or perform the operation indicating that the response met their need, and the aggregate server then obtains the corresponding marking instruction fed back by the user.
The session marking instruction includes a first session marking instruction or a second session marking instruction. The first session marking instruction is the user's feedback on the first session information, and the second session marking instruction is the user's feedback on the second session information. A system error occurs when the voice server recognizes the input voice information as incorrect text and/or the semantic server interprets or converts the input text into incorrect output text, so that the user receives an incorrect spoken response. When the user gives feedback through a device without a display screen, such as a microphone, the user responds to the incorrect spoken response by giving feedback on the four-tuple of that session, so the aggregate server receives a first session marking instruction and a second session marking instruction that agree with each other. When the user gives feedback through a device with a display screen, such as a mobile phone, tablet, or other mobile terminal, the user can give feedback on the data from each processing stage separately, based on the four-tuple shown on the screen, so the aggregate server receives the first session marking instruction and/or the second session marking instruction fed back by the user.
For example, suppose the input voice information is "How is the weather in Beijing", the recognized text is "How is the weather in Beijing", the output text is "Your current location is Beijing Quanjude restaurant (Qianmen branch)", and the session identifier is 002. After hearing the robot's spoken response about Quanjude, the user can press the button on the microphone to feed back a session marking instruction indicating that this session was wrong. Alternatively, on a mobile phone with a first button for the first session information and a second button for the second session information, the user can see from the four-tuple shown on the screen, namely (002, voice "How is the weather in Beijing", text "How is the weather in Beijing", text "Your current location is Beijing Quanjude restaurant (Qianmen branch)"), that the second session information is wrong, and can press the second button to feed back a second session marking instruction indicating that the second session information of this session is wrong. Similarly, when the first session information in the four-tuple is wrong, the user can press the first button on the phone to feed back a first session marking instruction indicating that the first session information of this session is wrong. All of these session marking instructions are received by the aggregate server.
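The difference between the two kinds of feedback can be sketched as follows. This is an illustration under stated assumptions: the class `MarkingInstructions`, the two constructor functions, and the use of `None` to mean "no feedback given on this stage" are hypothetical and are not specified in the patent.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MarkingInstructions:
    session_id: str
    first_satisfied: Optional[bool]   # feedback on the first session information
    second_satisfied: Optional[bool]  # feedback on the second session information


def from_button_device(session_id: str, satisfied: bool) -> MarkingInstructions:
    # No display screen: one verdict on the whole four-tuple, so the first and
    # second session marking instructions agree with each other.
    return MarkingInstructions(session_id, satisfied, satisfied)


def from_screen_device(session_id: str,
                       first_satisfied: Optional[bool] = None,
                       second_satisfied: Optional[bool] = None) -> MarkingInstructions:
    # Display screen: each processing stage can be judged separately.
    return MarkingInstructions(session_id, first_satisfied, second_satisfied)


microphone_feedback = from_button_device("002", satisfied=False)
phone_feedback = from_screen_device("002", second_satisfied=False)
```

In the session 002 example, the microphone button marks both stages as wrong, while the phone's second button marks only the second session information as wrong.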
S140: Mark the aggregated first session information and second session information according to the first session marking instruction and the second session marking instruction, respectively.
In a specific embodiment of the present invention, after receiving the session marking instruction fed back by the user, the aggregate server marks the session information to which the instruction corresponds. In the example above, the user feeds back a second session marking instruction based on the four-tuple shown on the screen, (002, voice "How is the weather in Beijing", text "How is the weather in Beijing", text "Your current location is Beijing Quanjude restaurant (Qianmen branch)"). According to that instruction, the aggregate server marks the semantic-understanding text "Your current location is Beijing Quanjude restaurant (Qianmen branch)" in the interaction data for session identifier 002, for later use in improving the voice dialogue system.
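Applying a received instruction to the aggregated data for one session might look like the sketch below. The record layout, the `first_mark` and `second_mark` keys, and the label strings are hypothetical choices made for illustration; the patent does not prescribe a storage format.

```python
from typing import Optional


def apply_marks(record: dict,
                first_satisfied: Optional[bool] = None,
                second_satisfied: Optional[bool] = None) -> dict:
    """Return a copy of the record with marks added for the stages that got feedback."""
    marked = dict(record)  # leave the original record untouched
    if first_satisfied is not None:
        marked["first_mark"] = "satisfied" if first_satisfied else "unsatisfied"
    if second_satisfied is not None:
        marked["second_mark"] = "satisfied" if second_satisfied else "unsatisfied"
    return marked


record_002 = {
    "session_id": "002",
    "input_voice": b"<audio bytes>",
    "recognized_text": "How is the weather in Beijing",
    "output_text": "Your current location is Beijing Quanjude restaurant (Qianmen branch)",
}

# The user pressed the second button: only the second session information is marked.
marked_002 = apply_marks(record_002, second_satisfied=False)
```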
In the technical solution of this embodiment, the aggregate server uses the session identifier of the voice to be marked to obtain and aggregate the first session information and the second session information from the voice session, receives the session marking instruction fed back by the user, and marks the first session information and the second session information according to that instruction. Compared with the prior art, session information can therefore be marked according to user feedback, which reduces labor input and improves marking efficiency. In addition, the technical solution of the embodiments is simple and convenient to implement, easy to popularize, and widely applicable.
Embodiment two
Building on Embodiment one, this embodiment provides a preferred implementation of the session marking method, in which interaction data is aggregated according to a unique session identifier and the interaction text from each processing stage is marked. Fig. 2 is a flowchart of a session marking method provided by Embodiment two of the present invention. As shown in Fig. 2, the method includes the following steps:
S210: Obtain, from the voice server and according to the session identifier, the first session information corresponding to the session to be marked.
In a specific embodiment of the present invention, the session identifier is identifying information that the voice server generates when it receives the user's original dialogue speech (the input voice information) and that corresponds uniquely to that input voice information. The first session information is the session information received and processed by the voice server: the input voice information and the text formed by speech recognition, that is, the text the voice server forms by performing speech recognition on the input voice information. The recognized text is assigned the same unique session identifier as the corresponding input voice information, so that the session identifier associates the input voice information with the recognized text. The first session information corresponding to the voice to be marked can therefore be obtained from the session identifier corresponding to that voice.
S220: Obtain, from the semantic server and according to the session identifier, the second session information corresponding to the session to be marked, where the output text information is the text that the semantic server forms after performing semantic parsing and result fulfillment on the text formed by speech recognition.
In a specific embodiment of the present invention, the second session information is the session information received and processed by the semantic server: the output text information and the text information formed by the voice server's speech recognition. The output text information is the text that the semantic server forms after performing semantic parsing and result fulfillment on the recognized text. Result fulfillment here means that, after receiving the recognized text, the semantic server can either query a local resource repository for a result corresponding to the content of that text, or perform a web search based on that content to obtain a corresponding search result in real time, and then form the output text information from the query result or search result that satisfies the recognized text. The output text information is assigned the same unique session identifier as the corresponding recognized text, so that the session identifier associates the output text information with the recognized text. The second session information corresponding to the voice to be marked can therefore be obtained from the session identifier corresponding to that voice.
S230: Aggregate the session identifier, the input voice information, the text formed by speech recognition, and the output text information into a four-tuple corresponding to the session to be marked.
In a specific embodiment of the present invention, based on the session identifier attached to each piece of information, the aggregate server obtains the first session information and the second session information corresponding to the voice to be marked, that is, the corresponding session identifier, input voice information, recognized text, and output text information, and assembles these four items into a four-tuple, namely (session identifier, input voice information, text formed by speech recognition, output text information). All of the user's interaction data from one session is thereby linked together for use in later marking and display.
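When many sessions are pending, the same aggregation can be done as a join on the session identifier. The sketch below is one possible approach, not the patent's method: the function name, the dictionary layout, and the choice to skip sessions that are missing from either server are assumptions.

```python
def join_by_session_id(voice_records: dict, semantic_records: dict) -> list:
    """Build one four-tuple per session identifier present on both servers."""
    quads = []
    # Only identifiers known to both servers can form a complete four-tuple.
    for sid in sorted(voice_records.keys() & semantic_records.keys()):
        first, second = voice_records[sid], semantic_records[sid]
        quads.append((sid, first["input_voice"],
                      first["recognized_text"], second["output_text"]))
    return quads


voice_records = {
    "001": {"input_voice": b"<audio 1>", "recognized_text": "How is the weather in Beijing"},
    "002": {"input_voice": b"<audio 2>", "recognized_text": "How is the weather in Beijing"},
    "003": {"input_voice": b"<audio 3>", "recognized_text": "Play some music"},
}
semantic_records = {
    "001": {"output_text": "Sunny, 10 degrees C, northwest wind force 3-4"},
    "002": {"output_text": "Your current location is Beijing Quanjude restaurant (Qianmen branch)"},
}

quads = join_by_session_id(voice_records, semantic_records)
```

Here session 003 has no second session information yet, so it is left out of the result until the semantic server's data is available.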
S240: Receive the session marking instruction that the user feeds back through a first feedback device or a second feedback device, where the first feedback device and the second feedback device are two different feedback devices.
In a specific embodiment of the present invention, the output text information produced by the semantic server is passed back to the voice server, which converts it into corresponding speech using speech synthesis and plays it to the user through a device such as the robot's attached loudspeaker, so that the user receives a spoken response to their input. Based on that spoken response, the user can then tell the robot whether they are satisfied with it; that is, the aggregate server receives the session marking instruction fed back by the user. The session marking instruction includes a first session marking instruction or a second session marking instruction. The first session marking instruction is the user's feedback on the first session information, and the second session marking instruction is the user's feedback on the second session information.
The first feedback device is a device that has no touch display but has operation buttons, such as a microphone; the user feeds back the result of the voice session by pressing a button on the first feedback device. The second feedback device is a device with a touch display, such as a mobile phone, tablet, or other mobile terminal; based on the interaction-data four-tuple shown on the display, the user taps different buttons on the touch display to give feedback separately on the interaction data from each processing stage.
For example, a system error occurs when the voice server recognizes the input voice information as an incorrect text message, and/or the semantic server interprets or converts the text message formed after speech recognition into an incorrect output text message, so that the user receives an incorrect response voice. When the user gives feedback with the first feedback device, the user can respond to the incorrect response voice by pressing the button on the first feedback device, which feeds back on the four-tuple corresponding to this conversation as a whole; the aggregate server then receives a first session tokens instruction and a second session tokens instruction that are identical. When the user gives feedback with the second feedback device, the user learns from the incorrect response voice that this interaction failed and, wishing to report it to the voice conversation system, can further determine from the four-tuple shown on the display screen which stage of the interaction went wrong. The user then feeds back on the interaction data of each processing stage separately by pressing the corresponding buttons on the touch display screen, and the aggregate server thereby receives the first session tokens instruction and/or the second session tokens instruction fed back by the user.
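The difference between the two feedback devices can be illustrated with a minimal Python sketch. It is not taken from the patent: the type MarkInstructions, the two function names, and the use of True/False/None to encode "satisfied", "unsatisfied" and "not given" are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MarkInstructions:
    """Hypothetical container for the two instructions fed back for one session."""
    first: Optional[bool]   # True = satisfied, False = unsatisfied, None = not given
    second: Optional[bool]


def from_first_device(satisfied: bool) -> MarkInstructions:
    # A device with only a button cannot tell the processing stages apart,
    # so both instructions carry the same verdict.
    return MarkInstructions(first=satisfied, second=satisfied)


def from_second_device(first_ok: Optional[bool],
                       second_ok: Optional[bool]) -> MarkInstructions:
    # A touch screen lets the user rate each stage separately; the text says
    # "and/or", so one of the two instructions may be absent.
    if first_ok is None and second_ok is None:
        raise ValueError("at least one instruction must be given")
    return MarkInstructions(first=first_ok, second=second_ok)
```

In this sketch a button press on the first feedback device always yields two equal instructions, while the second feedback device may yield different verdicts for the two kinds of session information, or only one of them.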
S250: Mark the converged first session information and second session information according to the first session tokens instruction and the second session tokens instruction, respectively.
Preferably, when the first session tokens instruction and the second session tokens instruction are both user-satisfied mark instructions, the converged first session information and second session information are each marked as user-satisfied session information; or, when the first session tokens instruction and the second session tokens instruction are both user-unsatisfied mark instructions, the converged first session information and second session information are each marked as user-unsatisfied session information; or, when the first session tokens instruction is a user-satisfied mark instruction and the second session tokens instruction is a user-unsatisfied mark instruction, the converged first session information is marked as user-satisfied session information and the converged second session information is marked as user-unsatisfied session information; or, when the first session tokens instruction is a user-unsatisfied mark instruction and the second session tokens instruction is a user-satisfied mark instruction, the converged first session information is marked as user-unsatisfied session information and the converged second session information is marked as user-satisfied session information.
In a specific embodiment of the present invention, a user-satisfied mark instruction indicates that the response voice obtained for the voice to be marked through the voice conversation system is correct, and that the user has obtained a result that meets the user's needs. Conversely, a user-unsatisfied mark instruction indicates that the response voice obtained for the voice to be marked through the voice conversation system is wrong, and that the user has obtained a result that does not meet the user's needs. The different session information is therefore marked according to the session tokens instructions fed back by the user.
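The four marking cases described for step S250 reduce to marking each piece of session information according to its own instruction. A minimal Python sketch follows; the names Label and mark_session and the boolean encoding of the instructions are assumptions for illustration, not part of the patent.

```python
from enum import Enum
from typing import Tuple


class Label(Enum):
    SATISFIED = "user-satisfied"
    UNSATISFIED = "user-unsatisfied"


def mark_session(first_satisfied: bool,
                 second_satisfied: bool) -> Tuple[Label, Label]:
    """Return (label for first session information, label for second)."""
    def to_label(satisfied: bool) -> Label:
        return Label.SATISFIED if satisfied else Label.UNSATISFIED

    # Each piece of session information follows its own instruction,
    # which covers all four combinations listed in the text.
    return to_label(first_satisfied), to_label(second_satisfied)
```

For instance, mark_session(True, False) marks the first session information as user-satisfied and the second as user-unsatisfied, which corresponds to the third case in the text.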
In the technical solution of this embodiment, the aggregate server uses the session identifier of the session to be marked to converge the first session information and the second session information produced during a voice conversation, and can receive the user's session tokens instructions from different types of feedback device, so that the first session information and the second session information are marked according to the session tokens instructions fed back by the user. Compared with the prior art, session information can therefore be marked according to the user's feedback, which reduces the manpower required and improves marking efficiency. In addition, the technical solution of the embodiments of the present invention is simple and convenient to implement, easy to popularize, and widely applicable.
Embodiment three
Fig. 3 is a schematic structural diagram of a session tokens device provided by Embodiment three of the present invention. This embodiment is applicable to the collection of human-computer interaction data for voice conversations, and the device can implement the session tokens method described in any embodiment of the present invention. The device specifically includes:
An acquisition module 310, configured to obtain, according to a predetermined session identifier, the first session information and the second session information corresponding to the session to be marked; wherein the first session information includes the input voice information and the text message formed after speech recognition is performed on the input voice information, and the second session information includes the text message formed after speech recognition is performed on the input voice information and the output text message;
A convergence module 320, configured to converge the first session information and the second session information according to the session identifier;
A receiving module 330, configured to receive a session tokens instruction fed back by the user; wherein the session tokens instruction includes a first session tokens instruction or a second session tokens instruction;
A mark module 340, configured to mark the converged first session information and second session information according to the first session tokens instruction and the second session tokens instruction, respectively.
Further, the acquisition module 310 includes:
A first acquisition unit 3101, configured to obtain, from the voice server and according to the session identifier, the first session information corresponding to the session to be marked;
A second acquisition unit 3102, configured to obtain, from the semantic server and according to the session identifier, the second session information corresponding to the session to be marked; wherein the output text message is the text message formed after the semantic server performs semantic parsing and result fulfilment on the text message formed after speech recognition.
Further, the convergence module 320 is specifically configured to converge the session identifier, the input voice information, the text message formed after speech recognition, and the output text message into one four-tuple corresponding to the session to be marked.
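The convergence into a four-tuple can be sketched in Python as follows. This is only an illustration under stated assumptions: the records held by the voice server and the semantic server are modelled as in-memory dictionaries keyed by session identifier, and the names SessionFourTuple and converge are hypothetical.

```python
from typing import Dict, NamedTuple, Tuple


class SessionFourTuple(NamedTuple):
    session_id: str
    input_voice: bytes
    recognized_text: str
    output_text: str


def converge(session_id: str,
             voice_records: Dict[str, Tuple[bytes, str]],
             semantic_records: Dict[str, Tuple[str, str]]) -> SessionFourTuple:
    # First session information: (input voice, text formed by speech recognition).
    input_voice, recognized_text = voice_records[session_id]
    # Second session information: (text formed by speech recognition, output text).
    # The recognized text appears in both records; this sketch keeps the first copy.
    _, output_text = semantic_records[session_id]
    return SessionFourTuple(session_id, input_voice, recognized_text, output_text)
```

Because both records are looked up with the same session identifier, the recognized text shared by the first and second session information is stored only once in the resulting four-tuple.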
Further, the receiving module 330 is specifically configured to receive the session tokens instruction fed back by the user through the first feedback device or the second feedback device; wherein the first feedback device and the second feedback device are two different feedback devices.
Further, the mark module 340 is specifically configured to: when the first session tokens instruction and the second session tokens instruction are both user-satisfied mark instructions, mark the converged first session information and second session information each as user-satisfied session information; or, when the first session tokens instruction and the second session tokens instruction are both user-unsatisfied mark instructions, mark the converged first session information and second session information each as user-unsatisfied session information; or, when the first session tokens instruction is the user-satisfied mark instruction and the second session tokens instruction is the user-unsatisfied mark instruction, mark the converged first session information as the user-satisfied session information and the converged second session information as the user-unsatisfied session information; or, when the first session tokens instruction is the user-unsatisfied mark instruction and the second session tokens instruction is the user-satisfied mark instruction, mark the converged first session information as the user-unsatisfied session information and the converged second session information as the user-satisfied session information.
In the technical solution of this embodiment, the modules cooperate to obtain the session information, converge the session information, receive the session tokens instruction fed back by the user, and mark the session information according to the session tokens instruction. Compared with the prior art, session information can be marked according to the user's feedback, which reduces the manpower required and improves marking efficiency. In addition, the technical solution of the embodiments of the present invention is simple and convenient to implement, easy to popularize, and widely applicable.
Embodiment four
Fig. 4 is a schematic structural diagram of an aggregate server provided by Embodiment four of the present invention. Fig. 4 shows a block diagram of an exemplary aggregate server 12 suitable for implementing embodiments of the present invention. The aggregate server 12 shown in Fig. 4 is only an example and should not impose any limitation on the function or scope of use of the embodiments of the present invention.
As shown in Fig. 4, the aggregate server 12 takes the form of a general-purpose computing device. The components of the aggregate server 12 may include, but are not limited to: one or more processors or processing units 16, a system memory 28, and a bus 18 that connects the different system components (including the system memory 28 and the processing unit 16).
The bus 18 represents one or more of several types of bus structure, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus structures. By way of example, these architectures include, but are not limited to, the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the Enhanced ISA (EISA) bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus.
The aggregate server 12 typically includes a variety of computer-system-readable media. These media can be any available media that can be accessed by the aggregate server 12, including volatile and non-volatile media, and removable and non-removable media.
The system memory 28 may include computer-system-readable media in the form of volatile memory, such as a random access memory (RAM) 30 and/or a cache memory 32. The aggregate server 12 may further include other removable/non-removable, volatile/non-volatile computer system storage media. By way of example only, a storage system 34 may be used to read from and write to a non-removable, non-volatile magnetic medium (not shown in Fig. 4, commonly called a "hard disk drive"). Although not shown in Fig. 4, a magnetic disk drive for reading from and writing to a removable non-volatile magnetic disk (for example, a "floppy disk") and an optical disk drive for reading from and writing to a removable non-volatile optical disk (for example, a CD-ROM, DVD-ROM, or other optical medium) may also be provided. In these cases, each drive may be connected to the bus 18 through one or more data media interfaces. The memory 28 may include at least one program product having a set of (for example, at least one) program modules that are configured to perform the functions of the embodiments of the present invention.
A program/utility 40 having a set of (at least one) program modules 42 may be stored, for example, in the memory 28. Such program modules 42 include, but are not limited to, an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination of them, may include an implementation of a network environment. The program modules 42 generally perform the functions and/or methods of the embodiments described in the present invention.
The aggregate server 12 may also communicate with one or more external devices 14 (such as a keyboard, a pointing device, a display 24, and so on), with one or more devices that enable a user to interact with the aggregate server 12, and/or with any device (such as a network card, a modem, and so on) that enables the aggregate server 12 to communicate with one or more other computing devices. Such communication may take place through an input/output (I/O) interface 22. The aggregate server 12 may also communicate with one or more networks (such as a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through a network adapter 20. As shown in the figure, the network adapter 20 communicates with the other modules of the aggregate server 12 through the bus 18. It should be understood that, although not shown in the figure, other hardware and/or software modules may be used in combination with the aggregate server 12, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems.
The processing unit 16 runs the programs stored in the system memory 28 to perform various functional applications and data processing, for example to implement the session tokens method provided by the embodiments of the present invention.
Embodiment five
Embodiment five of the present invention also provides a computer-readable storage medium on which a computer program (also called computer-executable instructions) is stored. When executed by a processor, the program performs a session tokens method, the method including:
Obtaining, according to a predetermined session identifier, the first session information and the second session information corresponding to the session to be marked; wherein the first session information includes the input voice information and the text message formed after speech recognition is performed on the input voice information, and the second session information includes the text message formed after speech recognition is performed on the input voice information and the output text message;
Converging the first session information and the second session information according to the session identifier;
Receiving a session tokens instruction fed back by the user; wherein the session tokens instruction includes a first session tokens instruction or a second session tokens instruction;
Marking the converged first session information and second session information according to the first session tokens instruction and the second session tokens instruction, respectively.
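The four steps above can be strung together in a short Python sketch. It is an illustration under stated assumptions rather than the claimed implementation: the two stores are plain dictionaries standing in for the voice server and the semantic server, the instructions arrive as two booleans, and the function name run_marking is hypothetical.

```python
from typing import Dict, Tuple

VoiceRecord = Tuple[bytes, str]     # (input voice, text formed by speech recognition)
SemanticRecord = Tuple[str, str]    # (text formed by speech recognition, output text)


def run_marking(session_id: str,
                voice_store: Dict[str, VoiceRecord],
                semantic_store: Dict[str, SemanticRecord],
                first_satisfied: bool,
                second_satisfied: bool) -> Dict[str, object]:
    # Step 1: obtain both pieces of session information by the session identifier.
    first_info = voice_store[session_id]
    second_info = semantic_store[session_id]

    # Step 2: converge them under the shared session identifier.
    converged = {"session_id": session_id, "first": first_info, "second": second_info}

    # Step 3: the instructions fed back by the user are the two boolean arguments.
    def label(satisfied: bool) -> str:
        return "user-satisfied" if satisfied else "user-unsatisfied"

    # Step 4: mark each piece of session information with its own instruction.
    return {
        "session_id": converged["session_id"],
        "first": {"info": converged["first"], "label": label(first_satisfied)},
        "second": {"info": converged["second"], "label": label(second_satisfied)},
    }
```

Calling run_marking with first_satisfied=True and second_satisfied=False models a conversation in which speech recognition was correct but the output text was not what the user wanted.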
The computer storage medium of the embodiments of the present invention may use any combination of one or more computer-readable media. A computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of computer-readable storage media include: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In this document, a computer-readable storage medium may be any tangible medium that contains or stores a program, where the program can be used by, or in connection with, an instruction execution system, apparatus, or device.
A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, in which computer-readable program code is carried. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, and that computer-readable medium can send, propagate, or transmit a program for use by, or in connection with, an instruction execution system, apparatus, or device.
The program code contained on a computer-readable medium may be transmitted over any suitable medium, including but not limited to wireless, wire, optical cable, RF, and so on, or any suitable combination of the above.
Computer program code for performing the operations of the present invention may be written in one or more programming languages or a combination of them. The programming languages include object-oriented programming languages such as Java, Smalltalk, and C++, and also conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. Where a remote computer is involved, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
Note that the above are only preferred embodiments of the present invention and the technical principles applied. Those skilled in the art will understand that the present invention is not limited to the specific embodiments described here, and that various obvious changes, readjustments, and substitutions can be made by those skilled in the art without departing from the protection scope of the present invention. Therefore, although the present invention has been described in some detail through the above embodiments, it is not limited to those embodiments; it may include other equivalent embodiments without departing from the inventive concept, and the scope of the present invention is determined by the scope of the appended claims.
Claims (12)
- 1. A session tokens method, characterized in that it is applied to an aggregate server, the method comprising: obtaining, according to a predetermined session identifier, first session information and second session information corresponding to a session to be marked, wherein the first session information includes input voice information and the text message formed after speech recognition is performed on the input voice information, and the second session information includes the text message formed after speech recognition is performed on the input voice information and an output text message; converging the first session information and the second session information according to the session identifier; receiving a session tokens instruction fed back by a user, wherein the session tokens instruction includes a first session tokens instruction or a second session tokens instruction; and marking the converged first session information and second session information according to the first session tokens instruction and the second session tokens instruction, respectively.
- 2. The method according to claim 1, characterized in that obtaining, according to the predetermined session identifier, the first session information and the second session information corresponding to the session to be marked comprises: obtaining, from a voice server and according to the session identifier, the first session information corresponding to the session to be marked; and obtaining, from a semantic server and according to the session identifier, the second session information corresponding to the session to be marked; wherein the output text message is the text message formed after the semantic server performs semantic parsing and result fulfilment on the text message formed after the speech recognition.
- 3. The method according to claim 2, characterized in that converging the first session information and the second session information according to the session identifier comprises: converging the session identifier, the input voice information, the text message formed after the speech recognition, and the output text message into one four-tuple corresponding to the session to be marked.
- 4. The method according to claim 1, characterized in that receiving the session tokens instruction fed back by the user comprises: receiving the session tokens instruction fed back by the user through a first feedback device or a second feedback device; wherein the first feedback device and the second feedback device are two different feedback devices.
- 5. The method according to claim 1, characterized in that marking the converged first session information and second session information according to the first session tokens instruction and the second session tokens instruction, respectively, comprises: when the first session tokens instruction and the second session tokens instruction are both user-satisfied mark instructions, marking the converged first session information and second session information each as user-satisfied session information; or, when the first session tokens instruction and the second session tokens instruction are both user-unsatisfied mark instructions, marking the converged first session information and second session information each as user-unsatisfied session information; or, when the first session tokens instruction is the user-satisfied mark instruction and the second session tokens instruction is the user-unsatisfied mark instruction, marking the converged first session information as the user-satisfied session information and the converged second session information as the user-unsatisfied session information; or, when the first session tokens instruction is the user-unsatisfied mark instruction and the second session tokens instruction is the user-satisfied mark instruction, marking the converged first session information as the user-unsatisfied session information and the converged second session information as the user-satisfied session information.
- 6. A session tokens device, characterized in that the device comprises an acquisition module, a convergence module, a receiving module, and a mark module; wherein the acquisition module is configured to obtain, according to a predetermined session identifier, first session information and second session information corresponding to a session to be marked, wherein the first session information includes input voice information and the text message formed after speech recognition is performed on the input voice information, and the second session information includes the text message formed after speech recognition is performed on the input voice information and an output text message; the convergence module is configured to converge the first session information and the second session information according to the session identifier; the receiving module is configured to receive a session tokens instruction fed back by a user, wherein the session tokens instruction includes a first session tokens instruction or a second session tokens instruction; and the mark module is configured to mark the converged first session information and second session information according to the first session tokens instruction and the second session tokens instruction, respectively.
- 7. The device according to claim 6, characterized in that the acquisition module comprises a first acquisition unit and a second acquisition unit; wherein the first acquisition unit is configured to obtain, from a voice server and according to the session identifier, the first session information corresponding to the session to be marked; and the second acquisition unit is configured to obtain, from a semantic server and according to the session identifier, the second session information corresponding to the session to be marked; wherein the output text message is the text message formed after the semantic server performs semantic parsing and result fulfilment on the text message formed after speech recognition.
- 8. The device according to claim 7, characterized in that the convergence module is specifically configured to converge the session identifier, the input voice information, the text message formed after the speech recognition, and the output text message into one four-tuple corresponding to the session to be marked.
- 9. The device according to claim 6, characterized in that the receiving module is specifically configured to receive the session tokens instruction fed back by the user through a first feedback device or a second feedback device; wherein the first feedback device and the second feedback device are two different feedback devices.
- 10. The device according to claim 6, characterized in that the mark module is specifically configured to: when the first session tokens instruction and the second session tokens instruction are both user-satisfied mark instructions, mark the converged first session information and second session information each as user-satisfied session information; or, when the first session tokens instruction and the second session tokens instruction are both user-unsatisfied mark instructions, mark the converged first session information and second session information each as user-unsatisfied session information; or, when the first session tokens instruction is the user-satisfied mark instruction and the second session tokens instruction is the user-unsatisfied mark instruction, mark the converged first session information as the user-satisfied session information and the converged second session information as the user-unsatisfied session information; or, when the first session tokens instruction is the user-unsatisfied mark instruction and the second session tokens instruction is the user-satisfied mark instruction, mark the converged first session information as the user-unsatisfied session information and the converged second session information as the user-satisfied session information.
- 11. An aggregate server, characterized in that it comprises: one or more processors; and a memory for storing one or more programs; wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the session tokens method according to any one of claims 1 to 5.
- 12. A computer-readable storage medium on which a computer program is stored, characterized in that the program, when executed by a processor, implements the session tokens method according to any one of claims 1 to 5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711130201.3A CN107894972A (en) | 2017-11-15 | 2017-11-15 | A kind of session tokens method, apparatus, aggregate server and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107894972A true CN107894972A (en) | 2018-04-10 |
Family
ID=61804209
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711130201.3A Pending CN107894972A (en) | 2017-11-15 | 2017-11-15 | A kind of session tokens method, apparatus, aggregate server and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107894972A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110708441A (en) * | 2018-07-25 | 2020-01-17 | 南阳理工学院 | Word-prompting device |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101923857A (en) * | 2009-06-17 | 2010-12-22 | 复旦大学 | Extensible audio recognition method based on man-machine interaction |
CN103020047A (en) * | 2012-12-31 | 2013-04-03 | 威盛电子股份有限公司 | Method for revising voice response and natural language dialogue system |
CN104519040A (en) * | 2013-09-29 | 2015-04-15 | 中兴通讯股份有限公司 | Method, device and server for processing online interaction |
CN105206269A (en) * | 2015-08-14 | 2015-12-30 | 百度在线网络技术(北京)有限公司 | Voice processing method and device |
US9257115B2 (en) * | 2012-03-08 | 2016-02-09 | Facebook, Inc. | Device for extracting information from a dialog |
US20170004133A1 (en) * | 2015-06-30 | 2017-01-05 | International Business Machines Corporation | Natural language interpretation of hierarchical data |
CN107068144A (en) * | 2016-01-08 | 2017-08-18 | 王道平 | It is easy to the method for manual amendment's word in a kind of speech recognition |
Non-Patent Citations (5)
Title |
---|
ASHWINI JAYA KUMAR, ET AL: "A knowledge graph based speech interface for question answering systems", 《SPEECH COMUNICATION》 * |
王玉,等: "口语对话系统中对话管理方法研究综述", 《计算机科学》 * |
虞欣: "《贝叶斯网络在影像翻译中的应用》", 30 June 2011, 测绘出版社 * |
贾熹滨,等: "智能对话系统研究综述", 《北京工业大学学报》 * |
陈力为 等: "《语言工程》", 31 August 1997, 清华大学出版社 * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20180410 |